SAINT PETERSBURG STATE UNIVERSITY THE INTERNATIONAL SOCIETY OF DYNAMIC GAMES (Russian Chapter)

CONTRIBUTIONS TO GAME THEORY AND MANAGEMENT Volume X

Collected papers Edited by Leon A. Petrosyan and Nikolay A. Zenkevich

Saint Petersburg State University Saint Petersburg 2017 518.9, 517.9, 681.3.07

Contributions to game theory and management, vol. X. Collected papers presented on the Tenth International Conference Game Theory and Management / Editors Leon A. Petrosyan, Nikolay A. Zenkevich. – SPb.: Saint Petersburg State University, 2017. – 404 p.

The collection contains papers accepted for the Tenth International Conference Game Theory and Management (July 7-9, 2016, St. Petersburg State University, St. Petersburg, Russia). The presented papers belong to the field of game theory and its applications to management. The volume may be recommended for researchers and post-graduate students of management, economic and applied mathematics departments. Cited and reviewed in: Math-Net.Ru and RSCI. Abstracted and indexed in: Mathematical Reviews, Zentralblatt MATH and VINITI.

© Copyright of the authors, 2017
© Saint Petersburg State University, 2017

ISSN 2310-2608









Contents

Preface ...... 5

Cost Optimization for the Transport Network of Yakutia ...... 7 Galina I. Bubyakina, Taisia M. Plekhanova, Ekaterina V. Gromova

A Signaling Advertising Model Between an Intelligent Consumer and Two E-tailers ...... 17 M. Esmaeili, M. Masoumirad

Information Pooling Game in Multi-Portfolio Optimization ...... 27 Jing Fu

Cooperation in Dynamic Network Games ...... 42 Hongwei Gao, Yaroslavna Pankratova

Games with Incomplete Information on the Both Sides and with Public Signal on the State of the Game ...... 68 Misha Gavrilovich, Victoria Kreps

Static Game Theoretic Models of Coordination of Private and Public Interests in Economic Systems ...... 79 Olga I. Gorbaneva, Guennady A. Ougolnitsky

On the Conditions on the Integral Payoff Function in the Games with Random Duration ...... 94 Ekaterina V. Gromova, Anastasiya P. Malakhova, Anna V. Tur

Modelling of Information Spreading in the Population of Taxpayers: Evolutionary Approach ...... 100 Suriya Sh. Kumacheva, Elena A. Gubar, Ekaterina M. Zhitkova, Zlata Kurnosykh, Tatiana Skovorodina

A Search Game with Incomplete Information on Detective Capability of Searcher ...... 129 Ryusuke Hohzaki

Application of Game Theory in the Analysis of Economic and Political Interaction at the International Level ...... 143 Pavel V. Konyukhovskiy, Victoria V. Holodkova

Game-Theoretic Approach for Modeling of Selfish and Group Routing ...... 162 Alexander Yu. Krylatov, Victor V. Zakharov

Stationary Nash Equilibria for Two-Player Average Stochastic Games with Finite State and Action Spaces ...... 175 Dmitrii Lozovanu, Stefan Pickl

Integrative Approach to Supply Chain Collaboration in Distribution Networks: Impact on Firm Performance ...... 185 Natalia Nikolchenko, Anastasia Lebedeva

Blotto Games with Costly Winnings ...... 226 Irit Nowik, Tahl Nowik

Social Welfare under Oligopoly: Does the Strengthening of Competition in Production Increase Consumers’ Well-Being? ...... 233 Mathieu Parenti, Alexander V. Sidorov, Jacques-François Thisse

Cooperation in Bioresource Management Problems ...... 245 Anna N. Rettieva

Types of Equilibrium Points in Antagonistic Games with Ordered Outcomes ...... 287 Victor V. Rozen

Design and Simulation of as Lead Generating Mechanism ...... 299 Maxim Shlegel, Nikolay Zenkevich

On a Dynamic Traveling Salesman Problem ...... 326 Svetlana Tarashnina, Yaroslavna Pankratova, Aleksandra Purtyan

Constructive and Blocking Powers in Some Applications ...... 339 Svetlana Tarashnina, Nadezhda Smirnova

Coordination in Multilevel Supply Chain ...... 350 Ekaterina N. Zenkevich, Yulia E. Lonyagina, Maria V. Fattakhova

Strategic Alliances Stability Factors ...... 375 Nikolay Zenkevich, Anastasiia Reusova

10 Years Game Theory and Management (GTM) ...... 396 Maria Bulgakova

Contributions to Game Theory and Management, X, 5–6

Preface

This edited volume contains a selection of papers that are an outgrowth of the Tenth International Conference on Game Theory and Management, with a few additional contributed papers. These papers present an outlook on the current development of the theory of games and its applications to management and various domains, in particular finance, environment and economics.

The International Conference on Game Theory and Management, a three-day conference, was held in Saint Petersburg, Russia, on July 7-9, 2016. The conference was organized by St. Petersburg State University in collaboration with The International Society of Dynamic Games (Russian Chapter). 86 participants from 22 countries had an opportunity to hear state-of-the-art presentations on a wide range of game-theoretic models, covering both theory and management applications.

Plenary lectures covered different areas of games and management applications. They were delivered by Professor Jean-Jacques Herings, School of Business and Economics, Maastricht University (The Netherlands); Professor (Nobel Prize in Economics), Department of Economics, Harvard University (USA); Professor Eilon Solan, School of Mathematical Sciences, Tel Aviv University (Israel); Professor Alexander Tarasyev, Department of Dynamic Systems, Institute of Mathematics and Mechanics, RAS, Ekaterinburg (Russia).

The importance of strategic behavior in the human and social world is increasingly recognized in theory and practice. As a result, game theory has emerged as a fundamental instrument in pure and applied research. The discipline of game theory studies decision making in an interactive environment. It draws on mathematics, statistics, operations research, engineering, biology, economics, political science and other subjects.

In canonical form, a game takes place when an individual pursues an objective (or objectives) in a situation in which other individuals concurrently pursue other (possibly conflicting, possibly overlapping) objectives, and at the same time the objectives cannot be reached by the individual actions of one decision maker. The problem is then to determine each individual’s optimal decision, how these decisions interact to produce equilibrium, and the properties of such outcomes. The foundations of game theory were laid more than seventy years ago by von Neumann and Morgenstern (1944).

Theoretical research and applications in games are proceeding apace, in areas ranging from aircraft and missile control to inventory management, market development, natural resources extraction, competition policy, negotiation techniques, macroeconomic and environmental planning, capital accumulation and investment. In all these areas, game theory is perhaps the most sophisticated and fertile paradigm applied mathematics can offer to study and analyze decision making under real-world conditions.

The papers presented at this Tenth International Conference on Game Theory and Management certainly reflect both the maturity and the vitality of modern-day game theory and management science in general, and of dynamic games in particular. The maturity can be seen from the sophistication of the theorems, proofs, methods and numerical algorithms contained in most of the papers in these contributions. The vitality is manifested by the range of new ideas, new applications, the growing number of young researchers and the expanding worldwide coverage of research centers and institutes from which the contributions originated.

The contributions demonstrate that GTM2016 offers an interactive program on a wide range of the latest developments in game theory and management. It includes recent advances in topics with high future potential and exciting developments in classical fields.

We thank Anna Tur from the Faculty of Applied Mathematics (SPbSU) for displaying extreme patience typesetting the manuscript.

Editors, Leon A. Petrosyan and Nikolay A. Zenkevich

Contributions to Game Theory and Management, X, 7–16

Cost Optimization for the Transport Network of Yakutia

Galina I. Bubyakina¹, Taisia M. Plekhanova² and Ekaterina V. Gromova³
¹ St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
² St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
³ St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia

Abstract The paper studies a game-theoretic approach to the problem of reducing the costs of transporting agricultural products on the road network of the Sakha (Yakutia) Republic. A mathematical formulation of the problem of players' cooperation is proposed as the problem of reducing costs for the transport network, in the form of a cooperative game with a characteristic function as well as a cooperative game with a coalition structure. The solution of such cooperative games, i.e. the optimal cost distribution between players, is found in the form of the Shapley value.

Keywords: cooperative game, characteristic function, Shapley value.

1. Introduction

The economic development of the Russian Far East and Siberia depends on the transport network. The transport network of the Sakha (Yakutia) Republic has special features: hard-surface roads connect only 7.3% of all settlements of the republic, and only 30% of all roads are hard-surface roads, while the remaining 70% are ice roads.

This paper considers a game-theoretic approach to the problem of cost optimization of agricultural products transportation on the transport network of the Sakha (Yakutia) Republic. We also propose a mathematical formulation of the problem of players' cooperation as the problem of reducing costs for the transport network, in the form of a cooperative game with a characteristic function as well as a cooperative game with a coalition structure. The application of cooperative game theory to similar problems was reviewed in (Ergun et al., 2007; Krajewska et al., 2008; Khmelnitskaya and Yanovskaya, 2007).

A transport network can be represented as a finite graph, defined by a finite set of vertices and edges. The vertices of this graph are settlements, and the edges are roads with different types of road surface. The goal of each player is to transport products to Yakutsk, and the player’s strategy is to choose the way with minimal transportation costs. The transport network is considered in summer and winter periods.

Cooperative game theory helps to construct a mathematical model of economic problems of transporting products (Von Neumann and Morgenstern, 1944). Such problems require minimizing the total costs of transporting products, for example by creating various coalitions of participants using this road network.

Games with a coalition structure were studied in papers that described cooperative solutions in the form of the Owen value and the Aumann-Dreze value (Owen, 1971; Aumann and Dreze, 1974). In this paper we propose a mathematical formulation of the problem of cooperation between producers of agricultural products in transporting them, and construct a cooperative game in characteristic function form for the problem of transporting manufactured products. Cooperative game theory makes it possible to investigate how producers of agricultural products can reduce the costs of transportation. As the optimality principle of the cooperative game we take the Shapley value (Shapley, 1953). The Shapley value is an n-dimensional vector in which component i is the cost of player i. Thus, the Shapley value determines the optimal distribution of the total costs incurred by the maximum coalition (the coalition of all players) as a result of the cooperative effect.

A mathematical model in the form of cooperative games in similar problems was investigated in (Zakharov and Shchegryaev, 2012; Shchegryaev and Zakharov, 2014). The classical cooperative model assumes that it is advantageous for players to unite into the maximal coalition in order to minimize total costs, due to the subadditivity of the characteristic function. Therefore, the problem is to find a sharing of the minimum costs.

So, it is assumed that there are several producers of agricultural products located in the ulus centers, which are connected to the city of Yakutsk by a road network. Each manufacturer has a certain resource of its own, designed to transport the products from the ulus center to Yakutsk. In order to reduce the cost of transporting agricultural products, local producers can combine into different coalitions.
Therefore, in each coalition, there may be a redistribution of costs between the participants of this coalition, and as a consequence, a change in the way (strategy) of transporting the products of each producer in comparison with the ways planned without taking into account possible cooperation. Then, firstly, it is required to find the optimal way for each player of the coalition and determine its total costs, and secondly, to calculate the values of characteristic function of the corresponding cooperative game. On the basis of the foregoing, the task is to study this optimization problem in the form of a cooperative game in order to effectively divide the total costs between producers of products.

Let $S \subseteq N$ be some coalition of players. The minimum aggregate costs $v_h(S)$ of the coalition $S$ are the sum of the minimum total costs of each player $i$ of the coalition $S$. The cost of travelling the common path is included in this amount once, see (Karpov and Petrosyan, 2012). This can be interpreted as the coalition $S$ bearing costs for the common path that are proportional to the length of this path. The cost of the ferry crossing for the coalition $S$ is determined by the carrying capacity of the vehicle. For example, the player with the longest minimum path among all participants in the game offers to transport the products of other players along the common path, assuming that transportation costs are divided among the players of the coalition $S$ in accordance with the Shapley value. The following is a review of the literature on which the research in this paper is based.

In (Karpov and Petrosyan, 2012) cooperative solutions in communication networks are studied. The algorithm constructed in that work for the optimal allocation of costs among players allows us to find rational solutions of cooperative games on communication networks. Studying the optimization problem in the form of a cooperative game makes it possible to divide costs effectively among players in the form of the Shapley value.

In (Zakharov and Shchegryaev, 2012) the problem of optimizing transportation costs under carriers’ cooperation on networks is considered. The authors develop a coalition induction algorithm for constructing the characteristic function in the cooperation of transport companies; it ensures subadditivity of the characteristic function and thus allows the optimal distribution of costs to be found. The Shapley value is considered as a solution. The proposed algorithm constructs effective solutions of the corresponding cooperative game, that is, effective routes for cargo transportation, in order to obtain the characteristic function. The algorithm then determines the optimal distribution of costs between players in the form of the Shapley value.

2. The game-theoretic approach to the problem of cost reduction for the transport network of Yakutia

2.1. Mathematical model of the problem of cooperation of players with the purpose of reducing transportation costs

Let $G = (X, R)$ be a graph, where $X$ is the set of vertices $x_i$, $i = 1, 2, \dots, n$, representing settlements of the republic, and $R$ is the set of edges $r_{ij} = (x_i, x_j)$, $j = 1, 2, \dots, n$, representing the roads. The edges carry numerical weights determined by the type of road surface. The weights $p_{ij}$ satisfy the equality $p_{ij} = p_{ji}$, so they form a symmetric matrix $P = (p_{ij})$. Since every edge of the graph $G$ has a weight, this graph is a network. A conventional graph in this sense is a network in which the weight $p_{ij}$ of each edge $(x_i, x_j)$ is equal to one.

A non-negative symmetric real function $s_{ij} = s(r_{ij})$ is defined on the set of edges $R$. The value of this function determines the costs associated with passing from vertex $x_i$ to vertex $x_j$ along the edge $r_{ij}$. It is clear that $s_{ij} = s_{ji}$ and $s_{ii} = 0$. In addition to the function $s_{ij} = s(r_{ij})$, we also introduce the function $l_{ij} = l(r_{ij})$ as follows:

$$l_{ij} = p_{ij} s_{ij},$$

where $p_{ij}$ is the road surface coefficient, i.e. the weight of the edge.

Now let us give some theoretical basis for the formulation and solution of the problem in the form of a cooperative game. A state of the network $G = (X, R)$ is an n-dimensional vector $z = (z^1, z^2, \dots, z^n)$ whose i-th component is the vertex of the graph $G$ at which player $i$ is located (Karpov and Petrosyan, 2012). We denote the set of states of the network $G$ by $\Omega$. A path $L(r_{ij})$ from vertex $x_i$ to vertex $x_j$ of the network $G$ is any finite sequence of edges from the set $R$ that connects the vertex $x_i$ to the vertex $x_j$.

Now consider an n-person game on the network $G = (X, R)$, where $N = \{1, 2, \dots, n\}$ is the set of players, and $z_0$ and $z_l$ are the initial and final states of the network $G$. We denote this game by $\Gamma = (G, N, z_0, z_l)$.

A strategy $h^i$ of player $i$ is any path connecting its initial position $z_0^i$ with its final position $z_l^i$, $i = 1, 2, \dots, n$. We denote by $H(\{i\})$ the set of all strategies of player $i$, and by $H(N)$ the set of all possible situations. Next, we denote by $U(h)$ the set of edges in the situation $h = (h^1, h^2, \dots, h^n)$. A feature of this set is that it contains only distinct edges. We also introduce the total costs in the game $\Gamma$ corresponding to the situation $h$, as follows (Karpov and Petrosyan, 2012):

$$l(h) = \sum_{r_{ij} \in U(h)} l(r_{ij}). \qquad (1)$$
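To make definition (1) concrete, the following Python sketch computes the total cost of a situation by summing each distinct edge once, so an edge shared by several players is not double-counted. The function names, the toy network and its edge costs are illustrative assumptions, not data from the paper.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[int, int]


def undirected(i: int, j: int) -> Edge:
    """Store edge r_ij with sorted endpoints so (i, j) and (j, i) coincide."""
    return (i, j) if i <= j else (j, i)


def total_cost(situation: Iterable[List[Edge]], edge_cost: Dict[Edge, float]) -> float:
    """Total cost l(h): sum of l(r_ij) over the set U(h) of distinct edges."""
    used = {undirected(i, j) for path in situation for (i, j) in path}
    return sum(edge_cost[e] for e in used)


# Hypothetical toy network: both players share the edge (3, 4).
edge_cost = {undirected(1, 3): 5.0, undirected(2, 3): 7.0, undirected(3, 4): 10.0}
h1 = [(1, 3), (3, 4)]  # strategy of player 1
h2 = [(2, 3), (3, 4)]  # strategy of player 2

joint = total_cost([h1, h2], edge_cost)
separate = total_cost([h1], edge_cost) + total_cost([h2], edge_cost)
print(joint, separate)
```

In this toy case the joint cost is lower than the sum of the individual costs because the shared edge is counted once, which is the effect cooperation relies on.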

A trajectory in the network $G$ is defined as a sequence of network states $(z_1, z_2, \dots, z_m)$, where $z_k = (z_k^1, z_k^2, \dots, z_k^n)$, $k = 1, 2, \dots, m$. Each network state $z_k$ defines the $n$ vertices in which the players are located, i.e. each trajectory corresponds to $n$ player strategies and a certain situation $h$. We denote by $P(z_1, z_m)$ a trajectory passing from the state $z_1$ to the state $z_m$. The optimal trajectory in the network $G$ is the trajectory $P^*(z_1, z_m)$ that corresponds to the situation $h^*$ which minimizes the total costs in the game $\Gamma$, that is:

$$l(h^*) = \min_{h \in H(N)} l(h). \qquad (2)$$

Consider the following problem: find the optimal trajectory in the n-person game $\Gamma = (G, N, z_0, z_l)$ on the network with the initial state $z_0$ and the final state $z_l$. The solution of this problem can be found using dynamic optimization, specifically on the basis of Bellman’s optimality principle (Bellman, 1960). This optimality principle allows us to derive the recurrent functional Bellman equation. Now define the characteristic function $v(S)$ recursively (Zakharov and Shchegryaev, 2012; Shchegryaev and Zakharov, 2014):

$$v(S) = \min\Bigl\{ \min_{Q \subset S} \bigl( v(Q) + v(S \setminus Q) \bigr);\ v_h(S) \Bigr\}. \qquad (3)$$

The representation of the characteristic function in the form (3) can be interpreted as the minimum guaranteed cost of the coalition $S$. The cost $v_h(S)$ of the coalition $S$ is composed of two components: the direct costs $l_h(S)$ for transporting agricultural products along the path $U(h_S) \setminus U_0(h_S)$, where $U_0(h_S)$ is the set of common edges of the paths $h^i$, $i \in S$, and the cost $a_h(S)$ of the coalition $S$ to overcome the path $U_0(h_S)$. Hence the equality

$$v_h(S) = l_h(S) + a_h(S),$$

where

$$l_h(S) = \alpha \sum_{r_{ij} \in U(h_S) \setminus U_0(h_S)} l(r_{ij}), \qquad a_h(S) = \alpha \sum_{r_{ij} \in U_0(h_S)} l(r_{ij}),$$

and $\alpha$ is the cost of transporting agricultural products per unit of path. We take $\alpha = 10$, based on a gasoline price of 50 rubles per liter, so that covering 1 kilometer of the path costs 10 rubles.

Maximum costs for the ferry crossing by car Nizhny Bestyakh–Yakutsk depend on the weight of the car: up to 1 ton (transportation of agricultural products by one manufacturer), 340 rubles; up to 2 tons (transportation by two manufacturers), 610 rubles; up to 3 tons (transportation by three manufacturers), 935 rubles; up to 4 tons (transportation by four manufacturers), 1275 rubles. The costs of the ferry crossing for the coalition $S$ are determined by the carrying capacity of the vehicle and are presented in Table 1.

Table 1. Costs of the coalition to overcome the ferry crossing

Coalition size s    Costs of coalition S (in rubles)
1                   340
2                   610
3                   935
4                   1275

For the four-person game $\Gamma$ we have the following algorithm for calculating the values of the characteristic function:

Step 1. Find the value of the characteristic function for singleton coalitions:

$$v(\{i\}) = v_h(\{i\}),$$

where $h \in H(\{i\})$.

Step 2. Compute the value of the characteristic function for two-element coalitions:

$$v(\{i, j\}) = \min\bigl\{ v(\{i\}) + v(\{j\});\ v_h(\{i, j\}) \bigr\},$$

where $h \in H(\{i, j\})$, $i \neq j$, $i, j = 1, 2, 3, 4$.

Step 3. Find the value of the characteristic function for three-element coalitions:

$$v(\{i, j, k\}) = \min\bigl\{ v(\{i, j\}) + v(\{k\});\ v_h(\{i, j, k\}) \bigr\},$$

where $h \in H(\{i, j, k\})$, $i \neq j$, $j \neq k$, $k \neq i$, $i, j, k = 1, 2, 3, 4$.

Step 4. Calculate the value of the characteristic function for the maximum coalition:

$$v(N) = \min\bigl\{ v(\{i, j, k\}) + v(\{l\});\ v(\{i, j\}) + v(\{k, l\});\ v_h(\{i, j, k, l\}) \bigr\},$$

where $h \in H(\{i, j, k, l\})$, $i \neq j$, $j \neq k$, $k \neq l$, $l \neq i$, $i, j, k, l = 1, 2, 3, 4$.
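The four steps above are instances of the recursion (3) applied in order of increasing coalition size. A minimal Python sketch of that recursion is given below; it assumes the joint-route costs $v_h(S)$ are already known for every coalition. The function name `characteristic_function` and the three-player cost values are hypothetical and serve only to illustrate the mechanics.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable

Coalition = FrozenSet[int]


def characteristic_function(
    players: Iterable[int], v_h: Dict[Coalition, float]
) -> Dict[Coalition, float]:
    """Build v(S) = min{ min over proper splits (v(Q) + v(S\\Q)); v_h(S) }."""
    ordered = sorted(players)
    v: Dict[Coalition, float] = {}
    # Process coalitions by increasing size so sub-coalition values exist.
    for size in range(1, len(ordered) + 1):
        for members in combinations(ordered, size):
            s = frozenset(members)
            best = v_h[s]
            # Every proper non-empty subset Q of S defines a split (Q, S \ Q).
            for q_size in range(1, size):
                for q in combinations(members, q_size):
                    q_set = frozenset(q)
                    best = min(best, v[q_set] + v[s - q_set])
            v[s] = best
    return v


def c(*members: int) -> Coalition:
    return frozenset(members)


# Hypothetical joint-route costs v_h(S) for three players.
v_h = {
    c(1): 10.0, c(2): 12.0, c(3): 8.0,
    c(1, 2): 18.0, c(1, 3): 20.0, c(2, 3): 15.0,
    c(1, 2, 3): 30.0,
}
v = characteristic_function([1, 2, 3], v_h)
print(v[c(1, 3)], v[c(1, 2, 3)])
```

Taking the minimum over all splits is what makes the resulting function subadditive by construction: in the toy data the joint route of players 1 and 3 is more expensive than travelling separately, so the recursion falls back to the sum of their individual costs.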

2.2. The problem of reducing cost for transport network of Yakutia as a cooperative game in the form of a characteristic function

Consider the cost-reducing optimization problem of transporting agricultural products on the transport network of the Sakha (Yakutia) Republic. There are several producers of agricultural products located in the ulus centers: village Amga (Amginskiy ulus), village Churapcha (Churapchinsky ulus), village Borogontsy (Ust-Aldansky ulus) and village Pavlovsk (Megino-Kangalassky ulus). The ulus centers are connected with the city of Yakutsk by the transport network, which is depicted as a graph. Taking into account the coefficient of road surface, which is equal to 0.1 for roads with a smooth surface, 0.25 for roads with a hard surface, 0.5 for ice roads, 0.75 for dirt roads and 1 for the ferry line (Evtyukov and Evtyukov, 2013), the graph of the transport network is presented in Fig. 1. This graph represents the summer state of the transport network. Each manufacturer has a certain resource for transporting its products from an ulus center to Yakutsk. The goal of each player is to transport agricultural products with minimal costs, i.e., to find the way in which transportation costs will be minimal.
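A single player's minimal-cost route on such a weighted network can be sketched with a standard shortest-path search. The Python example below uses Dijkstra's algorithm, which is a swapped-in textbook technique rather than the algorithm of (Karpov and Petrosyan, 2012), with edge weights $l_{ij} = p_{ij} s_{ij}$. The surface coefficients are taken from the text, while the four-vertex graph, the base costs $s_{ij}$ and the function name `cheapest_route` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

# Surface coefficients p_ij quoted in the text.
HARD, ICE, DIRT, FERRY = 0.25, 0.5, 0.75, 1.0


def cheapest_route(
    edges: Dict[Tuple[int, int], Tuple[float, float]], source: int, target: int
) -> Tuple[float, List[int]]:
    """Dijkstra search on an undirected network with weights p_ij * s_ij."""
    adjacency: Dict[int, List[Tuple[int, float]]] = {}
    for (i, j), (p, s) in edges.items():
        weight = p * s  # l_ij = p_ij * s_ij
        adjacency.setdefault(i, []).append((j, weight))
        adjacency.setdefault(j, []).append((i, weight))

    best = {source: 0.0}
    previous: Dict[int, int] = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbour, weight in adjacency.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(heap, (new_cost, neighbour))

    if target not in best:
        raise ValueError("target is not reachable from source")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical network: (p_ij, s_ij) per edge; vertex 4 plays the role of the destination.
toy_edges = {
    (1, 2): (HARD, 100.0),
    (2, 4): (FERRY, 10.0),
    (1, 3): (DIRT, 40.0),
    (3, 4): (ICE, 20.0),
}
cost, path = cheapest_route(toy_edges, source=1, target=4)
print(cost, path)
```

Changing the coefficients of individual edges, for instance replacing a ferry edge by an ice road, is one way to mimic the switch between the summer and winter states of the network in this sketch.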

Fig. 1. The graph of summer state of highway network

The initial state of the network is the state $z_0 = (1, 2, 3, 4)$, and the final state is $z_l = (11, 11, 11, 11)$; that is, for example, the first producer must travel from vertex 1 to vertex 11. Based on the algorithm described in (Karpov and Petrosyan, 2012), we find the optimal routes for the producers:

$h^{*1} = \{(1, 7), (7, 8), (8, 11)\}$;
$h^{*2} = \{(2, 5), (5, 7), (7, 8), (8, 11)\}$;
$h^{*3} = \{(3, 6), (6, 5), (5, 7), (8, 11)\}$;
$h^{*4} = \{(4, 8), (8, 11)\}$.

Taking into account the ferry transportation tariffs, we find the following minimal costs for each manufacturer:

$$\bigl(l(h^{*1}),\ l(h^{*2}),\ l(h^{*3}),\ l(h^{*4})\bigr) = (714.5;\ 688;\ 860.5;\ 402.5)$$

and, additionally,

$$l(h^*) = \sum_{i \in N} l(h^{*i}) = 2665.5.$$

Now let us consider a cost-reducing optimization problem which differs from the previous one. This time the graph corresponds to the road network in winter. The transition from the road network in the summer period to the winter period corresponds, in the context of the work (Butenko, 2015), to external influences, or "shocks". We will find out how the optimal ways of the producers change when winter roads form. Taking into account the coefficient of road surface, the graph of the transport network is presented in Fig. 2.

Fig. 2. The graph of winter state of highway network

Thus, each of the manufacturers bears the following optimal costs:

$$\bigl(l(h^{*1}),\ l(h^{*2}),\ l(h^{*3}),\ l(h^{*4})\bigr) = (464.5;\ 438;\ 261.5;\ 89)$$

and, additionally,

$$l(h^*) = \sum_{i \in N} l(h^{*i}) = 1253.$$

Comparing the optimal cost vectors of the two optimization problems, we can see that the change of the graph caused by the formation of ice roads implies a decrease in the optimal costs of each agricultural manufacturer.

The initial state of the network is the state $z_0 = (1, 2, 3, 4)$, and the final state is $z_l = (11, 11, 11, 11)$; that is, for example, the first producer must follow a path from vertex 1 to vertex 11. Using the algorithm described in (Karpov and Petrosyan, 2012), the following optimal ways for the transportation of the products were found:

$h^{*1} = \{(1, 7), (7, 8), (8, 11)\}$;
$h^{*2} = \{(2, 5), (5, 7), (7, 8), (8, 11)\}$;
$h^{*3} = \{(3, 9), (9, 12), (12, 11)\}$;
$h^{*4} = \{(4, 10), (10, 11)\}$.

A cooperative game of $n$ persons is the game $\Gamma = (N, v)$, where $N = \{1, 2, \dots, n\}$ is the set of players and $v$ is the characteristic function defined on the set of coalitions. Let us now look at a 4-person cooperative game $\Gamma_1^* = (N, v)$ in the form of the characteristic function on the network $G$. The need for coalition formation can be justified as follows. Let $S \subseteq N$ be some coalition of players. The minimum cumulative costs $v_h(S)$ of the coalition $S$ are the sum of the minimum total costs of each player $i$ of the coalition $S$. The cost of travelling the common path is included in this amount only once, regardless of the number of players using this path (Karpov and Petrosyan, 2012). For example, the player with the longest minimum path among all participants in the game offers services to other players to cross the common path, assuming that transport costs are divided among the players of the coalition $S$ in accordance with the Shapley value. We find the value of the characteristic function for all possible coalitions and write its values in Table 2.

Table 2. Values of the characteristic function for the cooperative game $\Gamma_1^*$

S          v(S)        S          v(S)
{1}        714.5       {2,4}      1020.5
{2}        688         {3,4}      1193
{3}        860.5       {1,2,3}    2053
{4}        402.5       {1,2,4}    1688
{1,2}      1300.5      {1,3,4}    1860.5
{1,3}      1473        {2,3,4}    1773
{1,4}      1047        N          2432.5
{2,3}      1385.5

Now let us consider the distribution of costs between players. As the optimality principle, we choose the Shapley value, each component of which is the spending of the corresponding player:

$$Sh_i(v) = \sum_{S:\, i \in S} \frac{(s-1)!\,(n-s)!}{n!} \bigl( v(S) - v(S \setminus \{i\}) \bigr), \quad s = |S|, \quad i = 1, 2, 3, 4. \qquad (4)$$

Computing the components of the Shapley value according to this rule, we obtain

$$Sh^*(v) = (666.167;\ 609.167;\ 781.667;\ 375.5),$$

which satisfies the efficiency property

$$\sum_{i \in N} Sh_i^*(v) = 2432.5.$$

The results for the two non-cooperative games $\Gamma_1$ and $\Gamma_2$, corresponding to the summer and winter seasons, along with the cooperative variant of the first game, referred to as $\Gamma_1^*$, are presented in Tables 3 and 4.
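As a cross-check of formula (4), the following Python sketch evaluates the Shapley value on the characteristic-function values listed in Table 2. The function name `shapley_value` and the dictionary layout are illustrative assumptions, and the empty coalition is assumed to have zero cost.

```python
from itertools import combinations
from math import factorial
from typing import Dict, FrozenSet, List

Coalition = FrozenSet[int]


def c(*members: int) -> Coalition:
    return frozenset(members)


def shapley_value(players: List[int], v: Dict[Coalition, float]) -> Dict[int, float]:
    """Shapley value by formula (4): weighted marginal costs over coalitions containing i."""
    n = len(players)
    result: Dict[int, float] = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # size of S \ {i}
            for rest in combinations(others, size):
                without_i = frozenset(rest)
                with_i = without_i | {i}
                s = size + 1  # s = |S|
                weight = factorial(s - 1) * factorial(n - s) / factorial(n)
                total += weight * (v[with_i] - v[without_i])
        result[i] = total
    return result


# Characteristic function values from Table 2 (v of the empty set assumed to be 0).
v_table2 = {
    c(): 0.0,
    c(1): 714.5, c(2): 688.0, c(3): 860.5, c(4): 402.5,
    c(1, 2): 1300.5, c(1, 3): 1473.0, c(1, 4): 1047.0,
    c(2, 3): 1385.5, c(2, 4): 1020.5, c(3, 4): 1193.0,
    c(1, 2, 3): 2053.0, c(1, 2, 4): 1688.0,
    c(1, 3, 4): 1860.5, c(2, 3, 4): 1773.0,
    c(1, 2, 3, 4): 2432.5,
}
sh = shapley_value([1, 2, 3, 4], v_table2)
print({i: round(x, 3) for i, x in sh.items()})
# {1: 666.167, 2: 609.167, 3: 781.667, 4: 375.5}
```

The rounded components agree with the vector $Sh^*(v)$ stated in the text, and their sum equals $v(N) = 2432.5$.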

Table 3. Non-cooperative games $\Gamma_1$ and $\Gamma_2$

Player   Optimal path in Γ1      Costs    Optimal path in Γ2       Costs
1        (1,4),(4,8),(8,11)      714.5    (1,4),(4,8),(8,11)       464.5
2        (2,5),(5,8),(8,11)      737.5    (2,5),(5,8),(8,11)       487.5
3        (3,9),(9,8),(8,11)      930      (3,9),(9,12),(12,11)     680
4        (7,8),(8,11)            402.5    (7,10),(10,11)           152.5

Table 4. Cooperative game $\Gamma_1^*$

Player   Optimal path in Γ1*     Components of the Shapley value
1        (1,4),(4,8),(8,11)      573.25
2        (2,5),(5,8),(8,11)      596.25
3        (3,9),(9,8),(8,11)      788.75
4        (7,8),(8,11)            261.25

We can observe that in the cooperative version of the game the players bear smaller costs when the game takes place in the summer period. In the winter period, however, the transportation costs turn out to be smaller than in the summer period and there is no opportunity for cooperation.

3. Conclusion

This paper studied a game-theoretic approach to the problem of reducing the costs of transporting agricultural products on the transport network of the Sakha (Yakutia) Republic. The game-theoretic approach allows producers of agricultural products, acting together, to bear the minimum costs for transportation. The mathematical model of the problem of reducing transportation costs studied in this work can be used to solve similar economic problems associated with transport costs. The practical application of this kind of mathematical model to problems of transporting products will help to avoid unnecessary spending on transportation.

Acknowledgments

Ekaterina Gromova acknowledges the grant number 17-51-53030 of the Russian Foundation for Basic Research.

References

Aumann, R. J. and J. Dreze (1974). Cooperative games with coalitional structures. International Journal of Game Theory, 217–237.
Bellman, R. E. (1960). Dynamic Programming, Translation From English By Andreeva, I. M. Inostrannaya Literatura: Moscow (in Russian).
Butenko, M. S. (2015). A two-stage optimality principle on a network game with a shock of a special kind. In: Control Processes and Stability: Proceedings of the 46th International Scientific Conference of Post-Graduate Students and Students, Publishing House of St. Petersburg State University: Saint Petersburg, 573–578 (in Russian).
Ergun, O., G. Kuyzu and M. W. P. Savelsbergh (2007). Shipper Collaboration. Computers & Operations Research, 34, 1551–1560.
Evtyukov, S. A. and S. S. Evtyukov (2013). Parameters affecting the coupling properties of road surfaces. Technical and Physical and mathematical sciences, 3, 75–82 (in Russian).
Karpov, M. I. and L. A. Petrosyan (2012). Cooperative solutions in communication networks. Bulletin of St. Petersburg University, 10(4), 37–45 (in Russian).
Khmelnitskaya, A. B. and E. B. Yanovskaya (2007). Owen coalitional value without additivity axiom. Mathematical Methods of Operations Research, 66(2), 255–261.
Krajewska, M. A., H. Kopfer, G. Laporte, S. Ropke and G. Zaccour (2008). Horizontal cooperation among freight carriers: request allocation and profit sharing. Journal of the Operational Research Society, 59, 1483–1491.
Mazalov, V. V. (2010). Mathematical Game Theory and Its Applications, Lan’: Saint Petersburg (in Russian).
Owen, G. (1971). Game Theory, Mir: Moscow (in Russian).
Petrosyan, L. A., N. A. Zenkevich and E. V. Shevkoplyas (2017). Game theory, BHV: Saint Petersburg (in Russian).
Route map of the Republic of Sakha (Yakutia) (2009). The First Publishing polygraphic Holding: Saint Petersburg (in Russian).
Shchegryaev, A. N. and V. V. Zakharov (2014). Multi-period cooperative vehicle routing games. Contributions to Game Theory and Management, 7, 349–359.
Shapley, L. S. (1953). A value for n-person games. In: Contributions to the Theory of Games II, eds. Luce R.D. and Tucker A.W. Princeton, N.J.: Princeton University Press, 307–317.
Von Neumann, J. and O. Morgenstern (1944). The Theory of Games and Economic Behavior. Princeton: Princeton University Press.
Zakharov, V. V. and A. N. Shchegryaev (2012). Stable cooperation in dynamic problems of transport routing. Mathematical Game Theory and Its Applications, 4(2), 39–56 (in Russian).
Zenkevich, N. A. and A. V. Zyatchin (2016). Strong coalitional equilibrium in transport game. Mathematical Game Theory and Its Applications, 8(1), 63–79 (in Russian).

Contributions to Game Theory and Management, X, 17–26

A Signaling Advertising Model Between an Intelligent Consumer and Two E-tailers

M. Esmaeili and M. Masoumirad
Alzahra University, Faculty of Industrial Engineering, Tehran, Iran
E-mail: esmaeili [email protected]

Abstract Nowadays, as the cost of sales forces rises, trade is shifting increasingly to e-business. In this paper, we present a signaling internet business model as a multi stage game. We consider two e-tailers and one intelligent consumer as players. The e-tailers can advertise or not, while the consumer can search or not. The communication between one of the e-tailers and the consumer and the interaction between the two e-tailers are considered at stages zero and one, respectively. We obtain the Nash Equilibrium at stage zero and the Separating Equilibrium at stage one. Finally, the Nash Equilibrium is obtained at stage two based on the historical impact of the players’ actions. We show that in signaling models where the game is not single shot, a good reputation is as important as advertising for signaling product quality. In addition, searching and advertising costs have a great impact on the consumer’s and the e-tailers’ decisions.
Keywords: Advertising, B2C internet business models, Multi stage games, Reputation.

1. Introduction
With the rapid development of the Internet, many retailers have become e-tailers, using online technology to sell their products in e-business. No physical store is needed, which eliminates sales costs such as the sales force and the location. Business that is independent of time and location lets customers shop at their own convenience, without traveling and without time limits. However, e-business also poses risks for both the e-tailers and the consumers. First, from the consumer’s point of view, there is no reliable tool to assess the products’ attributes and qualities, since the consumer cannot touch and feel the products. Second, the e-tailers have to pay more attention to effective selling tools such as advertisements; since they cannot use their sales skills, the consumers can easily switch to another e-tailer without buying the product. Accounting for the effect of the return policy on the consumer’s purchase intention can be regarded as one such sales skill (Pei et al., 2014).

One of the most enduring theories in the field of advertising concerns the signaling role of advertising and is based on Nelson’s theory (Milgrom and Roberts, 1986; Nelson, 1974; Horstmann and MacDonald, 2003). The e-tailers should find a more effective way of signaling to their consumers. These signals can be used to show the quality of the product and are transmitted to the consumers through either the price or the advertising messages (Kirmani and Rao, 2000; Linnemer, 2001). In addition, Mitra and Fay (2010) have shown that the reaction to the price signal is merely a behavioral concept and cannot, on its own, convey the quality of the product. Nevertheless, the consumers’ behavior changes as the price and the advertising cost increase or decrease (Esmaeili et al., 2009a; Esmaeili et al., 2009b).
Sahuguet (2011) presents a direct relationship between the number of advertisements and the number of potential consumers. Moreover, Internet advertising campaigns have been studied with a chance constrained optimization model that accounts for the uncertainty in the supply of Internet viewers (Deza et al., 2015). A significant shortcoming of all these models is that they assume the customers play a static role. However, today’s customers can find out the product’s attributes, qualities and prices through the Internet. Therefore, they are neither static nor naive anymore and cannot be easily deceived by advertisements.

Regardless of the advertising budget and the marketing strategy, a purchase prompted by advertising leaves the consumer either satisfied or dissatisfied (Beltran et al., 2013; Muzellec et al., 2015). One of the main reasons is whether the advertisements are honest and build a good reputation for the e-tailers, or whether they are deceptive and contain wrong information, which is called the noisiness of advertisements (Anand and Shachar, 2009). In other words, selling and buying create a history in the minds of the e-tailer and the consumer.

However, most signaling models of advertising are presented for only one stage. A company decides whether to advertise, the customer chooses whether to buy the product, and then the game is finished (Linnemer, 2001; Fluet and Garella, 2002; Horstmann and MacDonald, 2003; Anand and Shachar, 2007; Anand and Shachar, 2009). Neglecting the impact of the history of advertising and purchasing on the consumer’s and e-tailers’ actions is another salient concern with signaling models of advertising. To our knowledge, trading has not been presented as a multi stage game in the literature. Although e-business is an evolving part of the business world, most signaling models of advertising ignore it and represent the traditional form of trading.
Therefore, in this paper we present a signaling internet business model as a multi stage game between two e-tailers and one intelligent consumer. The e-tailers offer a product in two qualities, high and low, where the quality difference is not obvious. The e-tailers choose whether to advertise when selling durable products in order to maximize their profit. The consumer, on the other hand, is assumed to be intelligent and curious about the advertising signal, which triggers him/her to search for information about the product’s quality. Therefore, the consumer cannot be deceived or misled by the noisiness of advertisements. The consumer’s utility is maximized by choosing the appropriate quality of the product. At stage zero, the communication between one of the e-tailers (the dynamic one) and the consumer is considered in order to obtain the Nash Equilibrium, while the other e-tailer (the static one) takes no action. At stage one, the Separating Equilibrium is obtained for the interaction between the two e-tailers, in line with the nature of signaling games. Since our consumer is intelligent, the history of the e-tailers’ activities carries extra weight for the consumer at stage two. Therefore, based on this history, the Nash Equilibrium is obtained at stage two by considering the interaction between the intelligent consumer and the two e-tailers. It is shown that in the multi stage game, the history of a business’s actions (its reputation) is as important as advertising for signaling product quality. In addition, we show that the optimal policy of the e-tailers and the consumer depends on the advertising and searching costs.

The rest of the paper proceeds as follows. Section 2 describes the notation and problem formulation. The e-tailers’ and the consumer’s models are presented in Section 3. In Section 4 we discuss the multi stage game. Conclusions and some suggestions for further research are given in Section 5.

2. Notation and Problem Formulation
This section introduces the notation and formulation of our model. Here, we state the decision variables, parameters and assumptions underlying the model.

2.1. Parameters of the Model
$P_i$: the price at which the e-tailer sells the product,
$Q_i$: the quality type of the product in the market, high or low ($i = H, L$; $Q_H > Q_L$),
$C_A$: the cost of advertising,
$C_{Bi}$: the purchase cost based on the quality of the product ($i = H, L$),
$C_S$: the cost of the consumer’s searching,
$\Pi$: the e-tailer’s profit,
$U$: the consumer’s utility,
$F$: the consumer’s fund.

2.2. Decision variables
$B_{iE}$: binary variable taking value 1 if e-tailer $i$ advertises and 0 otherwise,
$B_C$: binary variable taking value 1 if the consumer searches and 0 otherwise.

2.3. Assumptions
The proposed model is based on the following assumptions:
1. The e-tailers and the consumer in the market are risk neutral.
2. There is only one kind of product, offered in high and low qualities, and the quality difference is not obvious.
3. The consumer is intelligent, and advertisements trigger him/her to search for information about the product quality.
4. The consumer can afford to buy the product, and the remaining fund cannot be saved.
5. The transaction is repeated periodically, and the histories of previous stages affect the e-tailers’ and the consumer’s behavior in the following stages.
6. The product is durable and the consumer buys only one unit of it in each stage.

2.4. Signaling games
Signaling games are two-player games of incomplete information in which one player is informed and the other is not. The informed player’s strategy is a type-contingent message and the uninformed player’s strategy is a message-contingent action. The type of equilibrium depends on the strategies chosen by the players. If they both choose the same strategy, the equilibrium is called a Pooling Equilibrium. Otherwise, a Separating Equilibrium is achieved.
Since the quality difference is not obvious to the consumer (Assumption 2), the presented model is a signaling game between the e-tailers and the consumer. In other words, the e-tailers are aware of the quality of the product while the consumer is not. Therefore, the e-tailers try to send signals to the consumer to help him/her guess the quality. Like any game, the signaling game has three specifications: the players, their strategies and their payoffs. The e-tailers and the consumer are the players of the game. Their strategies and payoffs are explained in Section 4.

3. The e-tailers’ and the consumer’s model
In this section, the e-tailers’ and the consumer’s models are presented.

3.1. The e-tailers’ model
Assume a market with two e-tailers, one dynamic and one static. The static e-tailer is a potential competitor who takes no action at stage zero other than watching the dynamic e-tailer’s actions. The dynamic e-tailer buys a product of high or low quality at purchase cost $C_{BH}$ or $C_{BL}$, respectively ($C_{BH} > C_{BL}$), and prefers to sell both qualities at the same price $P$ for two reasons. The first is the monopolistic nature of the market (the static e-tailer takes no action), and the second is that the quality difference is not obvious. To maximize profit, the e-tailer decides whether or not to advertise. Therefore,
E-tailer’s Profit = Selling price - Purchasing cost - Advertising cost * E-tailer’s binary variable

\[ \max \Pi(B_{iE}) = P - C_{Bi} - C_A B_{iE} \tag{1} \]
Please note that if the dynamic e-tailer does not advertise, $B_{iE} = 0$ and the third term vanishes.

3.2. The consumer’s model
The consumer, being intelligent, is able to search for information about the products’ qualities. Since the consumer’s fund is enough to afford the product and the remaining fund cannot be saved (Assumption 4), we include a lost profit in the consumer’s model. The lost profit is the difference between the fund and the price, $(F - P)$. It arises once the consumer has bought the product, because the remaining fund cannot be spent on another purchase. Therefore, the consumer’s utility is:
Consumer’s Utility = Quality of the consumed product - Purchasing cost - Lost profit - Searching cost * Consumer’s binary variable

\[ \max U(B_C) = Q_i - P - (F - P) - C_S B_C \tag{2} \]
\[ \max U(B_C) = Q_i - F - C_S B_C \tag{3} \]
Please note that the consumer can search through the Internet, which reduces the cost of searching. In addition, if the consumer does not search, $B_C = 0$ and the search-cost term vanishes.
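As a minimal sketch of Eqs. (1)–(3), the two objective functions can be written out directly. The function names and the numerical values below are illustrative assumptions and do not come from the paper; the only point is that Eq. (2) and Eq. (3) give the same utility because the price cancels.

```python
def etailer_profit(price, purchase_cost, ad_cost, advertises):
    """Eq. (1): P - C_Bi - C_A * B_iE, with B_iE = 1 if the e-tailer advertises."""
    return price - purchase_cost - ad_cost * int(advertises)


def consumer_utility_full(quality, price, fund, search_cost, searches):
    """Eq. (2): Q_i - P - (F - P) - C_S * B_C, keeping the lost profit (F - P) explicit."""
    return quality - price - (fund - price) - search_cost * int(searches)


def consumer_utility(quality, fund, search_cost, searches):
    """Eq. (3): Q_i - F - C_S * B_C, i.e. Eq. (2) after the price cancels."""
    return quality - fund - search_cost * int(searches)


# Hypothetical numbers, chosen only for illustration.
profit_with_ads = etailer_profit(price=4, purchase_cost=2, ad_cost=0.5, advertises=True)
utility_eq2 = consumer_utility_full(quality=10, price=4, fund=5, search_cost=1, searches=True)
utility_eq3 = consumer_utility(quality=10, fund=5, search_cost=1, searches=True)
```

With these inputs the price drops out of the consumer's utility, so the consumer's choice in the later stages depends only on the quality, the fund and the search cost.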

4. Multi stage game
In this section, we present the interaction between the e-tailers and the consumer. At stage zero, the communication between the dynamic e-tailer and the consumer is considered. The interaction between the two e-tailers is presented at stage one, and the interaction between the two e-tailers and the consumer at stage two.

4.1. The stage zero
At stage zero, the interaction between the dynamic e-tailer (DE) and the consumer (C) is investigated. The static e-tailer takes no action. The dynamic e-tailer’s strategies ($S_R$) are to advertise (A) or not to advertise (NA) when selling durable products in two qualities. The consumer’s strategies ($S_C$) are to search (S) or not to search (NS). Therefore, the game models the situations $(S_C, S_R) \in \{(S,A), (NS,A), (S,NA), (NS,NA)\}$, which are explained in turn below:
1. (S,A): The dynamic e-tailer advertises and the consumer searches for information about the product. The dynamic e-tailer sells the high quality product because, by searching, the consumer is aware of the differences in quality. Taking the advertising cost into account, the e-tailer’s profit is $\Pi(B_{DE}) = P - C_{BH} - C_A$ and the consumer’s utility is $U(B_C) = Q_H - F - C_S$.
2. (NS,A): The dynamic e-tailer advertises and the consumer does not search. Since the consumer does not search, the seller can sell either the high or the low quality product to the consumer with the same probability, and therefore $\Pi(B_{DE}) = \frac{1}{2}(P - C_{BH} - C_A) + \frac{1}{2}(P - C_{BL} - C_A)$. As the consumer does not search, he/she does not understand the differences in product quality. Therefore, the consumer may make a wrong purchasing decision and gains the utility $U(B_C) = \frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F)$.
3. (S,NA): The dynamic e-tailer does not advertise and the consumer searches. As mentioned before, the advertising signal makes the consumer aware of the product and triggers him/her to search for information about it. Therefore, when there is no advertising signal, there is no decision to search.
4. (NS,NA): The dynamic e-tailer does not advertise and the consumer does not search. In this situation no trade takes place, and the e-tailer’s profit and the consumer’s utility are zero.
The players’ payoffs for each situation are summarized in Table 1.

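Table 1. The payoffs of the zero stage model
Each cell lists (consumer’s utility, dynamic e-tailer’s profit).

|       | DE: A | DE: NA |
|-------|-------|--------|
| C: S  | $(Q_H - F - C_S,\ P - C_{BH} - C_A)$ | - |
| C: NS | $(\frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F),\ \frac{1}{2}(P - C_{BH} - C_A) + \frac{1}{2}(P - C_{BL} - C_A))$ | $(0, 0)$ |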
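The payoffs in Table 1 can be checked numerically with a small enumeration of pure-strategy profiles. This is only a sketch under stated assumptions: the function names and parameter values are hypothetical, and the undefined (S,NA) cell is treated as infeasible, so that neither player can deviate into it. The paper does not specify how that cell should be handled.

```python
from itertools import product


def stage_zero_payoffs(Q_H, Q_L, F, P, C_BH, C_BL, C_A, C_S):
    """Table 1 as {(consumer move, e-tailer move): (utility, profit)}."""
    return {
        ("S", "A"): (Q_H - F - C_S, P - C_BH - C_A),
        ("NS", "A"): (0.5 * (Q_H - F) + 0.5 * (Q_L - F),
                      0.5 * (P - C_BH - C_A) + 0.5 * (P - C_BL - C_A)),
        ("S", "NA"): None,  # no payoff given in Table 1; assumed infeasible here
        ("NS", "NA"): (0.0, 0.0),
    }


def pure_nash(payoffs):
    """Profiles from which no unilateral deviation to a feasible cell is profitable."""
    consumer_moves, etailer_moves = ("S", "NS"), ("A", "NA")
    equilibria = []
    for c, e in product(consumer_moves, etailer_moves):
        cell = payoffs[(c, e)]
        if cell is None:
            continue
        utility, profit = cell
        consumer_ok = all(payoffs[(c2, e)] is None or payoffs[(c2, e)][0] <= utility
                          for c2 in consumer_moves)
        etailer_ok = all(payoffs[(c, e2)] is None or payoffs[(c, e2)][1] <= profit
                         for e2 in etailer_moves)
        if consumer_ok and etailer_ok:
            equilibria.append((c, e))
    return equilibria


# Hypothetical parameters: a low search cost, then a high one.
base = dict(Q_H=10, Q_L=4, F=5, P=4, C_BH=2, C_BL=1, C_A=0.5)
low_cost = pure_nash(stage_zero_payoffs(C_S=1, **base))
high_cost = pure_nash(stage_zero_payoffs(C_S=4, **base))
```

In this example the low search cost satisfies $Q_H - 2C_S > Q_L$ and the enumeration returns (S,A), while the high search cost violates it and the result switches to (NS,A).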

For the dynamic e-tailer, the strategy that maximizes profit is to advertise. Comparing the consumer’s two strategies yields two possible optimal solutions: if $Q_H - 2C_S > Q_L$, the consumer chooses to search; otherwise he/she does not. This condition follows from comparing the utility $Q_H - F - C_S$ of searching with the utility $\frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F)$ of not searching. The consumer’s best strategy thus depends on the searching cost $C_S$. If $C_S$ is too high, it is better for the consumer to buy the product without searching. However, as our model concerns e-business, the search cost is negligible thanks to fast and easy access to the Internet, and the consumer should search to increase his/her utility. Therefore, situation (S,A) is the Nash equilibrium at stage zero: the dynamic e-tailer advertises and sells the high quality product, and the consumer searches.

4.2. The stage one
The game is not over yet and continues to stage one. To enrich our model, we let the static e-tailer break the monopoly of the first e-tailer (the dynamic e-tailer), so that a signaling competition takes place between the two e-tailers. Because the consumer is intelligent, the history of the activities at stage zero carries extra weight, which makes our proposed model more realistic. To our knowledge, multi stage games with observed actions are ignored by most signaling models. By the definition of multi stage games with observed actions, at stage one the players know the actions chosen at stage zero. With the entry of the second e-tailer (the static e-tailer) as a competitor, the e-tailers do not want to destroy the consumer’s trust by selling low quality products at the high price. Therefore, they need to sell the high and low quality products at $P_H$ and $P_L$, respectively. In addition, the second e-tailer (SE) has received the signal that the first e-tailer (FE) sold the high quality product to the consumer and is therefore well known to him/her.

Each strategy is a pair whose first component is the quality of the product and whose second component is the advertising decision. Both e-tailers have the same strategy set, $\{(L,A), (L,NA), (H,A), (H,NA)\}$. In other words, both e-tailers can provide the product in either low quality (L) or high quality (H), and they either advertise (A) or do not advertise (NA). The payoff of each player is shown in Table 2.

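Table 2. The payoffs of the one stage model
Each cell lists (second e-tailer’s profit, first e-tailer’s profit); ♣ and z mark the zones discussed in the text.

|            | FE: (L,A) | FE: (L,NA) | FE: (H,A) | FE: (H,NA) |
|------------|-----------|------------|-----------|------------|
| SE: (L,A)  | ♣ | ♣ | $(P_L - C_{BL} - C_A,\ P_H - C_{BH} - C_A)$ | $(P_L - C_{BL} - C_A,\ P_H - C_{BH})$ |
| SE: (L,NA) | ♣ | ♣ | $(\frac{1}{2}(P_L - C_{BL}),\ P_H - C_{BH} - C_A)$ | $(\frac{1}{2}(P_L - C_{BL}),\ P_H - C_{BH})$ |
| SE: (H,A)  | ♣ | ♣ | z | z |
| SE: (H,NA) | ♣ | ♣ | z | z |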

It is obvious that the ♣ zones will never be chosen by the first e-tailer because of his/her positive history. In fact, he/she does not offer the low quality product because he/she does not want to destroy the consumer’s trust and lose his/her loyal consumer. In addition, the z zones will never be chosen by the second e-tailer. The reason is that if the consumer is willing to pay the high price for the high quality product, he/she will prefer to buy it from the first e-tailer (because of the first e-tailer’s positive history). In other words, the second e-tailer cannot compete with the first e-tailer when they both behave the same way. Therefore, in Table 2 we discuss only the zone that shows the payoffs:

1. $\{(H,A),(L,A)\}$: The first and second e-tailers sell high and low quality, respectively, and both advertise.
2. $\{(H,NA),(L,A)\}$: The first e-tailer sells high quality without advertising, while the second e-tailer sells low quality with advertising.
3. $\{(H,A),(L,NA)\}$: The first e-tailer sells high quality with advertising, while the second e-tailer sells low quality without advertising.
4. $\{(H,NA),(L,NA)\}$: The first and second e-tailers sell high and low quality, respectively, and neither advertises.

Since the first e-tailer already has a loyal consumer, he/she never chooses to advertise, because advertising is costly. Therefore, the first e-tailer offers the high quality product without advertising, (H,NA), to maximize profit.
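Given that the first e-tailer plays (H,NA), the second e-tailer's advertising decision can be read off the last column of Table 2. The sketch below compares the two low quality rows and labels the outcome with the paper's own terminology (separating if the two e-tailers send different advertising messages, pooling if they send the same one). The function names, the example numbers and the tie-breaking rule are assumptions made for illustration only.

```python
def second_etailer_profit(P_L, C_BL, C_A, advertises):
    """Second e-tailer's profit when selling low quality (rows (L,A) and (L,NA) of Table 2)."""
    if advertises:
        return P_L - C_BL - C_A
    # Without advertising the consumer buys from the second e-tailer with probability 1/2.
    return 0.5 * (P_L - C_BL)


def stage_one_outcome(P_L, C_BL, C_A):
    """Return (first e-tailer's strategy, second e-tailer's strategy, equilibrium label)."""
    first = ("H", "NA")  # taken from the text: the first e-tailer relies on reputation
    with_ads = second_etailer_profit(P_L, C_BL, C_A, True)
    without_ads = second_etailer_profit(P_L, C_BL, C_A, False)
    # Ties are broken toward not advertising (an assumption; the paper is silent on ties).
    second = ("L", "A") if with_ads > without_ads else ("L", "NA")
    label = "separating" if second[1] != first[1] else "pooling"
    return first, second, label


# Hypothetical numbers: cheap advertising, then expensive advertising.
cheap = stage_one_outcome(P_L=3, C_BL=1, C_A=0.5)
expensive = stage_one_outcome(P_L=3, C_BL=1, C_A=1.5)
```

Advertising is preferred exactly when $P_L - C_{BL} - C_A > \frac{1}{2}(P_L - C_{BL})$, which rearranges to $P_L - C_{BL} > 2C_A$; the two example calls fall on either side of that threshold.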

Moreover, the second e-tailer offers the low quality product to obtain a market share among consumers with a low fund. Therefore, he/she needs to make the consumer aware of his/her product. On the one hand, if the second e-tailer does not advertise, the consumer buys or does not buy from him/her with equal probability. On the other hand, he/she faces the advertising cost of using informative tools. If the advertising cost is sufficiently low ($P_L - C_{BL} > 2C_A$), there is a Separating equilibrium in which the first e-tailer sells the high quality product, does not advertise and benefits from the positive history of stage zero, while the second e-tailer sells the low quality product and advertises. Therefore, the game’s Separating equilibrium is $\{(H,NA),(L,A)\}$. In contrast, if $P_L - C_{BL} < 2C_A$, the strategy set $\{(H,NA),(L,NA)\}$ is the Pooling equilibrium. In other words, if the advertising cost is so high that advertising is not profitable for the second e-tailer, there is a Pooling equilibrium in which both e-tailers choose the same message and do not advertise. However, as our model reflects e-business trade, the advertising cost is negligible thanks to inexpensive internet marketing tools such as e-mail marketing. Therefore, the strategy set $\{(H,NA),(L,A)\}$ is the Separating equilibrium at stage one.

4.3. The stage two
In this section, we present the signaling game between the e-tailers and the consumer. At stage zero, the first e-tailer sold the high quality product to the intelligent consumer, which gives him/her a positive reputation for the following stages. At stage one, the first and the second e-tailer decided to sell the high and the low quality product, respectively. As the consumer does not know which quality the second e-tailer is going to sell, the second e-tailer prefers to advertise. Therefore, given this history, the first and second e-tailers’ strategies are $\{(H,NA),(L,A)\}$. Moreover, the consumer has two strategies, search (S) or not search (NS). The players’ payoffs for each situation are summarized in Table 3.

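Table 3. The payoffs of the two stage model
Each cell lists (consumer’s utility, profit of the e-tailer in that column).

|       | FE: (H,NA) | SE: (L,A) |
|-------|------------|-----------|
| C: S  | $(Q_H - F - C_S,\ P_H - C_{BH})$ | $(Q_L - F - C_S,\ P_L - C_{BL} - C_A)$ |
| C: NS | $(\frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F),\ \frac{1}{2}(P_H - C_{BH}))$ | $(\frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F),\ \frac{1}{2}(P_L - C_{BL} - C_A))$ |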

1. $\{NS,(H,NA)\}$ and $\{NS,(L,A)\}$: If the consumer does not search, the situations $\{(H,NA)\}$ and $\{(L,A)\}$ for the first and second e-tailers should be considered simultaneously. In this situation, the first e-tailer sells high quality without advertising and the second e-tailer sells low quality and advertises. As the consumer does not search, there is a probability that he/she makes a wrong purchasing decision, chooses the wrong e-tailer and gains $U(B_C) = \frac{1}{2}(Q_H - F) + \frac{1}{2}(Q_L - F)$. Therefore, the probability that the first e-tailer sells the high quality product is cut in half and he/she gains $\Pi(B_{FE}) = \frac{1}{2}(P_H - C_{BH})$; likewise, the second e-tailer sells the low quality product and gains $\Pi(B_{SE}) = \frac{1}{2}(P_L - C_{BL} - C_A)$.
2. $\{S,(H,NA)\}$ and $\{S,(L,A)\}$: If the consumer searches, the situations $\{(H,NA)\}$ and $\{(L,A)\}$ for the first and second e-tailers should again be considered simultaneously. As the consumer searches, he/she chooses the high or the low quality product with complete information. If the consumer buys the high quality product from the first e-tailer, the first e-tailer gains $\Pi(B_{FE}) = P_H - C_{BH}$ and the consumer’s utility is $U(B_C) = Q_H - F - C_S$, while the second e-tailer bears the advertising cost in addition to the purchasing cost. Otherwise, the consumer buys the low quality product from the second e-tailer, the second e-tailer’s profit is $\Pi(B_{SE}) = P_L - C_{BL} - C_A$, the consumer gains $U(B_C) = Q_L - F - C_S$, and the first e-tailer bears only the purchasing cost.

Overall, the significant decrease in the e-tailers’ profit and the consumer’s utility caused by the consumer’s not searching shows that it is better for both sides of the trade that the consumer searches. Therefore, the Nash equilibrium depends on the consumer’s taste in product quality. If the consumer needs a low quality product, the second e-tailer is the best choice and the strategy set $\{S,(L,A)\}$ is the Nash equilibrium. Otherwise, the consumer buys the product from the first e-tailer and the Nash equilibrium is the strategy set $\{S,(H,NA)\}$. According to Sections 4.1 and 4.2, if the searching or advertising cost is too high, the equilibrium breaks down, and the consumer and the second e-tailer choose not to search and not to advertise, respectively.
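The stage-two payoffs in Table 3 can be tabulated in the same way. The following is a minimal sketch with hypothetical names and numbers; it only reproduces the four cells of Table 3 and does not attempt to model the consumer's taste for quality, which the text treats informally.

```python
def stage_two_payoffs(Q_H, Q_L, F, P_H, P_L, C_BH, C_BL, C_A, C_S):
    """Table 3 as {(consumer move, e-tailer): (consumer's utility, that e-tailer's profit)}."""
    # Without searching the consumer ends up with either quality with probability 1/2.
    utility_no_search = 0.5 * (Q_H - F) + 0.5 * (Q_L - F)
    return {
        ("S", "FE"): (Q_H - F - C_S, P_H - C_BH),
        ("S", "SE"): (Q_L - F - C_S, P_L - C_BL - C_A),
        ("NS", "FE"): (utility_no_search, 0.5 * (P_H - C_BH)),
        ("NS", "SE"): (utility_no_search, 0.5 * (P_L - C_BL - C_A)),
    }


# Hypothetical parameters, for illustration only.
table3 = stage_two_payoffs(Q_H=10, Q_L=4, F=6, P_H=6, P_L=3,
                           C_BH=3, C_BL=1, C_A=0.5, C_S=1)

# Profit of each e-tailer when it makes the sale to a searching consumer,
# compared with its profit when the consumer does not search.
first_etailer = (table3[("S", "FE")][1], table3[("NS", "FE")][1])
second_etailer = (table3[("S", "SE")][1], table3[("NS", "SE")][1])
```

In each pair the second entry is half of the first, which is the halving of the selling e-tailer's profit described in item 1 of the list above.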

5. Conclusion
A number of industrial and government statistical reports show that business on the Internet is accelerating. Advertising and e-business are concepts that have changed today’s world. In this paper we present a signaling internet business model as a multi stage game between two e-tailers and one intelligent consumer. The intelligent consumer has two strategies, to search or not to search, while the e-tailers choose whether or not to advertise when selling durable products in two qualities. At stage zero, the communication between one of the e-tailers and the consumer is considered in order to obtain the Nash Equilibrium. At stage one, the Separating Equilibrium is obtained for the relation between the two e-tailers, in line with the nature of signaling games. At stage two, the interaction between the intelligent consumer and the e-tailers is investigated to obtain the Nash Equilibrium. It is shown that in the multi stage game, the history of a business’s actions (its reputation) is as important as advertising for signaling quality. In addition, searching and advertising costs play a key role in the consumer’s and the e-tailers’ decisions and should be taken into account in the business world. The presented model sheds light on the periodic nature of business, consumer awareness and positive reputation, which makes the model more realistic.

There are several ways of extending this model, such as increasing the number of products and consumers. In addition, the noisiness of advertising is not investigated completely in the model, even though there are new evolutionary trends in advertising such as hidden advertising in search engines and other sources of customer information. Other kinds of signaling games, such as those with cooperation between the players, could also be studied.

References
Adriani, F. and L. G. Deidda (2011). Competition and the signaling role of prices. International journal of industrial organization, 29(4), 412–425.
Adriani, F. and L. G. Deidda (2009). Price signaling and the strategic benefits of price rigidities. Games and economic behavior, 67, 335–350.
Anand, B. and R. Shachar (2007). (Noisy) Communications. Quantitative marketing and economics, 5(3), 211–232.
Anand, B. and R. Shachar (2009). Targeted advertising as a signal. Quantitative marketing and economics, 7, 237–266.
Beltran, C., Zhang, H., Blanco, L. A. and J. Almagro (2013). Multistage multiproduct advertising budgeting. European Journal of Operational Research, 225, 179–188.
Bester, H. and K. Ritzberger (2001). Strategic pricing, signaling, and costly information acquisition. International journal of industrial organization, 19, 1347–1361.
Christou, C. and N. Vettas (2006). On informative advertising and product differentiation. International journal of industrial organization, 19, 1347–1361.
Clark, R. and I. Horstmann (2005). Advertising and coordination in markets with consumption scale effects. Journal of economics and management strategy, 14(2), 377–401.
Daughety, A. F. and J. F. Reinganum (2007). Competition and confidentiality: signaling quality in a duopoly when there is universal private information. Games and economic behavior, 58, 94–120.
Deza, A., Huang, K. and R. Metel (2015). Chance Constrained Optimization for Targeted Internet Advertising. Omega, 53, 90–96.
Esmaeili, M., Aryanezhad, M. and P. Zeephongsekul (2009a). A game theory approach in seller-buyer supply chain. European journal of operational research, 195, 442–448.
Esmaeili, M., Abad, P. and M. Aryanezhad (2009b). Seller-buyer relationship when end demand is sensitive to price and promotion. Asia-Pacific journal of operational research, 26, 1–17.
Fluet, C. and P. Garella (2002). Advertising and prices as signals of quality in a regime of price rivalry. International journal of industrial organization, 20, 907–930.
Horstmann, I. and G. MacDonald (2003). Is advertising a signal of product quality? Evidence from the compact disk player, 1983-1992. International journal of industrial organization, 21, 317–345.
Kaya, A. (2009). Repeated signaling games. Games and economic behavior, 66, 841–854.
Kim, J. (2000). Product compatibility as a signal of quality in a market with network externalities. International journal of industrial organization, 20, 949–964.
Kirmani, A. and A. R. Rao (2000). No Pain, No Gain: A Critical review of the literature on signaling unobservable product quality. Journal of marketing, 64(2), 66–79.
Lee, B., Ang, L. and C. Dubelaar (2005). Lemons on web: A signaling approach to the problem of trust in Internet commerce. Journal of economic psychology, 26, 607–623.
Linnemer, L. (2001). Price and advertising as signals of quality when some consumers are informed. International journal of industrial organization, 20, 931–947.
Milgrom, P. and J. Roberts (1986). Price and advertising signals of product quality. Journal of political economy, 94(4), 796–821.
Mitra, D. and S. Fay (2010). Managing service expectation in online markets: A signaling theory of e-tailer pricing and empirical tests. Journal of retailing, 86, 184–199.
Muzellec, L., Ronteau, S. and M. Lambkin (2015). Two-sided Internet platforms: A business model lifecycle perspective. Industrial Marketing Management, 45, 139–150.
Nelson, P. (1974). Advertising as information. Journal of political economy, 82(4), 729–754.
Pei, Z., Paswan, A. and R. Yan (2014). E-tailer’s return policy, consumer’s perception of return policy fairness and purchase intention. Journal of Retailing and Consumer Services, 21, 249–257.
Roddie, C. (2010). Theory of signaling games. Oxford: Nuffield College.
Sahuguet, N. (2011). A model of repeat advertising. Economic letters, 111, 20–22.
Schmalensee, R. (1978). A model of advertising and product quality. Journal of political economy, 86(3), 485–503.
Sobel, J. (2007). Pricing strategy, quality signaling, and entry deterrence. University of California: Department of economics.
Thomas, L., Shane, S. and K. Weigelt (1998). An empirical examination of advertising as a signal of product quality. Journal of economic behavior and organization, 37, 415–430.
Utaka, A. (2008). Multistage multiproduct advertising budgeting. International journal of industrial organization, 26, 878–888.

Contributions to Game Theory and Management, X, 27–41

Information Pooling Game in Multi-Portfolio Optimization⋆

Jing Fu Fukuoka Institute of Technology, Department of System Management, 3-30-1 Wajiro-higashi, Higashi-ku, Fukuoka, 811-0295 Japan E-mail: [email protected]

Abstract In this paper, an information pooling game is proposed and studied for multi-portfolio optimization. Our approach differs from the classical multi-portfolio optimization in several aspects, with a key distinction of allowing the clients to decide whether and to what extent their private trading information is shared with others, which directly affects the market impact cost split ratio. We introduce a built-in factor related to the clients’ vertical fairness regarding the outcomes, which is termed as “dissatisfaction indicator”. With balanced horizontal dissatisfactions across all accounts, the main formulation guarantees that no client is systematically advantaged or disadvantaged by the information pooling process. This is a novel mechanism to incorporate both the horizontal and vertical fairness in the optimization process. We show that information pooling solution outperforms the pro-rata collusive solution from fairness aspect, and the Cournot-Nash equilibrium solution for its Pareto optimality. Moreover, the empirical results suggest that within our framework, information pooling has non-negative impact on all participants’ perceived fairness, although it may hurt some account’s realized benefit compared to null information pool.
Keywords: information pooling, multi-portfolio optimization, horizontal fairness, vertical fairness

1. Introduction
Since the introduction of modern portfolio theory (Markowitz, 1952), financial models incorporating various new factors and findings have been constantly reinvented. In the portfolio optimization process, almost all portfolios need to be adjusted during their lifetimes, so incurring periodic transaction costs is inevitable. In October 2000, the Texas Permanent School Fund rebalanced its portfolio of 2,200 securities worth about $17.5 billion. Leaving aside the administrative costs, the transaction cost alone was $120 million (PlexusGroup, 2002). Managers cannot afford to ignore transaction costs, a large portion of which is attributed to the market impact.

In practice, financial advisers usually provide their services to multiple clients simultaneously. In order to serve a large number of clients efficiently, the Securities and Exchange Commission (SEC) allows the manager to “bunch orders on behalf of two or more client accounts, so long as the bunching is done for the purpose of achieving best execution, and no client is systematically advantaged or disadvantaged” (Securities and Exchange Commission, 2011). In this case, a problematic interaction arises between the multiple portfolios because the transaction cost for a given client may depend on the overall level of trading and not just on that client’s trading requirements (O’Cinneide et al., 2006). The rebalancing price tends to be underestimated, largely due to the market impact of bunched trades: benefits sought for individual accounts through trading are lost due to increased overall transaction costs.

⋆ This work was supported by start-up grants of Fukuoka Institute of Technology.

To model the market impact cost more accurately, several multi-portfolio optimization approaches have been developed. In the first paper recognizing this problematic interaction (O’Cinneide et al., 2006), the authors propose a pro-rata collusive solution in which the objective function is the sum of the objective functions of the individual accounts. They assert that this is fair since the solution obtained is the same as if each account competed directly in an open market for liquidity. Showing that certain accounts may be better off acting alone instead of participating in the collusive solution, practitioners propose to solve the problem by identifying the set of portfolios that form a Cournot-Nash equilibrium (Savelsbergh et al., 2010). In their model, each account optimizes its own objective by assuming that it is “made aware” of the trades of the other accounts that are being pooled together for execution, and gives its best response. An attractive property of this approach is that the actual market impact cost for each account is exactly what it has anticipated. However, the Cournot-Nash solution is not necessarily Pareto optimal, which means that it may violate the SEC best execution rules. Moreover, a heuristic approach has to be applied to bypass the intractability of solving for the overall equilibrium (Fabozzi et al., 2010). A recent publication (Iancu and Trichakis, 2014) documents the solutions available for multi-portfolio optimization problems in detail.

Let us organize the fairness issue from the investors’ perspective. A client’s contract renewal decision is affected not only by the final gain or loss of the investment, but also by the performance difference between her own account and others’ (horizontal fairness), and by the difference between her expected and realized net return (vertical fairness). Investors care about fairness because it plays a crucial role in establishing and maintaining relationships (Kahneman et al., 1986). However, it is frequently sacrificed in the efficient approaches, and the perfectly fair Cournot-Nash equilibrium solution is not Pareto optimal. Hence, a natural question arises: how can we reasonably implement both horizontal and vertical fairness in an efficient multi-portfolio optimization solution?

Vertical fairness is responsive to a wide range of factors, e.g., needs, wants, beliefs, and prior expectations. An empirical study on procedural fairness (Bies et al., 1993) suggests that without the involvement (voice) of investors, the portfolio optimization process is unlikely to be regarded as fair. This indicates that as an investor participates more in the optimization process, more information is acquired and vertical fairness might improve. We propose an information pooling game as a potential answer to the above-mentioned question.

The remainder of this paper is organized as follows. In Section 2, three related prominent solutions in the literature are reviewed and discussed. In Section 3, the key modeling choices of the information pooling game are first elaborated in detail, and the main formulation is then presented. In Section 4, two numerical studies are conducted and discussed to compare the different solution concepts and verify the effect of information pooling. Finally, Section 5 summarizes the main contribution of this work and briefly comments on future research.

2. Multi-portfolio Optimization

In this section, three prominent approaches proposed in the literature to solve the multi-portfolio optimization problem are reviewed and discussed. Suppose a financial adviser is managing $n$ distinct portfolios (accounts), indexed by $P = \{1, ..., n\}$. To improve operational efficiency, managers prefer to invest in assets from the same pool of available investments, reflecting a particular investment style. Thus, for simplicity, the pool of investments available to all clients is assumed to be $A = \{1, ..., m\}$. In this paper, each account is assumed to have an all-cash initial position to simplify the discussion. In other words, in a single rebalancing period, no client is allowed to short assets. Let $w = (w_1, ..., w_n) \in \mathbb{R}^n$ denote the initial cash positions of all accounts. Let $x_j \in \mathbb{R}^m$ represent the rebalancing trades of account $j \in P$ in units of currency, and $x = (x_1, ..., x_n) \in \mathbb{R}^{mn}$ represent the vector of all trades. The natural constraints on $x_j$ in a single rebalancing period are $x_j \geq 0$ and $\sum_{i \in A} x_{ij} \leq w_j$. Let $\xi_j \subseteq \mathbb{R}^m$ denote the set of feasible trades of account $j$ satisfying the two constraints above.

Market impact cost is, broadly speaking, the price an investor has to pay for obtaining liquidity in the market. It is the deviation of the transaction price from the market (mid) price that would have prevailed had the trade not occurred. In general, liquidity providers experience negative costs while liquidity demanders face positive costs. One of the common approaches in both the literature and practice is to model market impact through a nonlinear and strictly convex function of the amounts traded in the form $x^T c(x)$, where $c(x) = (c_1(x), ..., c_m(x))$ is a vector function giving the cost per unit traded for each asset.
The vector function $c(x)$ is assumed to be independent for each asset (ignoring the cross-asset price impact) and expressed in the polynomial form $c_i(x) = (\sum_{j \in P} x_{ij})^p$, $\forall i \in A$, where $p$ is a rational number between 0.5 and 1 (Almgren et al., 2005). Then, the market impact cost of executing the trades of account $j \in P$ is given as $x_j^T c(x)$.

2.1. Independent Optimization Solution

First, consider the simplest setting with a single account $j \in P$, where the objective for account $j$ is to maximize its net utility. The independent optimization problem to determine the portfolio for account $j$ can be represented as

$$\max_{x_j \in \xi_j} \; u_j(x_j) - x_j^T c(x_j) \tag{1}$$

The net utility for portfolio $j$ is the expected return $u_j(x_j)$ derived from its rebalancing trades, minus its market impact cost $x_j^T c(x_j)$. Notice that $u_j(x_j)$ is assumed to be concave, $x_j^T c(x_j)$ is convex, and $\xi_j$ is a convex subset of $\mathbb{R}^m$. Via convex optimization techniques, the problem can be solved efficiently in the variables $x_j$. The independent optimization problem has been extensively studied in the literature.

A direct interpretation of the problem above is that each account acts independently and is unaware of the market impact of the trades of the other accounts. Let us denote the optimal solution by $(x_j^{ind})_{j \in P}$. If all accounts are optimized following this model, the true market impact cost to each account is $(x_j^{ind})^T c(x_1^{ind}, ..., x_n^{ind})$, which is greater than the prior expectation $(x_j^{ind})^T c(x_j^{ind})$ and reduces the vertical fairness of the clients considerably. A numerical example has shown that the true market impact cost to each account might be 9900% higher than anticipated in a 100-portfolio setting (Savelsbergh et al., 2010). Neither horizontal nor vertical fairness can be implemented in the independent optimization solution.

2.2. Collusive Solution

The basic idea of the collusive solution is to optimize all trades simultaneously by aggregating the utility functions of all accounts (O’Cinneide et al., 2006). The problem can be written as

$$\max_{x_j \in \xi_j, \, \forall j \in P} \; \sum_{j \in P} u_j(x_j) - \Big(\sum_{j \in P} x_{ij}\Big)_{i \in A}^T c(x) \tag{2}$$

As with the independent optimization problem, the collusive optimization problem can be solved efficiently. The resulting market impact cost is allocated in proportion to the trading amount of each account, hence we also refer to the original collusive solution as the pro-rata collusive solution in this paper. The authors argue that this approach is horizontally fair, as the solution $(x_j^{col})_{j \in P}$ is the same as the one that would have been obtained if the clients were competing in an open market for liquidity. However, in certain situations, some accounts may have to sacrifice their own benefits for the good of others in order to maximize the total welfare (Savelsbergh et al., 2010), and horizontal fairness cannot be justified if those accounts deviate from the collusive solution.

2.3. Cournot-Nash Equilibrium Solution

Motivated by the significant underestimation of market impact in independent optimization and the unfairness of the collusive solution, an equilibrium solution is developed by optimizing each account’s objective under the assumption that the trade decisions of all other accounts participating in the pooled trading have been made and fixed (Savelsbergh et al., 2010). More precisely, for account $j \in P$, the trades of all other accounts $x_{-j} = (x_k : k \neq j \in P)$ are fixed and known. The optimization problem for $j$ can be modeled as

$$\max_{x_j \in \xi_j} \; u_j(x_j) - x_j^T c(x_j, x_{-j}) \tag{3}$$

In microeconomics, the equilibrium solution $(x_j^{cn})_{j \in P}$ is referred to as the Cournot-Nash equilibrium (Mas-Colell, 1984). It has the property that the expected market impact cost exactly corresponds to the realized market impact cost for each account, and no client has an incentive to deviate unilaterally from her Cournot-Nash portfolio, which makes it superior to the previous two approaches in terms of vertical fairness. However, the Cournot-Nash solution is not necessarily Pareto optimal. It is possible to have at least one account improved without negatively impacting any other account with the independent optimization approach, violating the best execution rules.
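The three solution concepts reviewed in this section can be compared in closed form in a deliberately simplified setting. The sketch below assumes $n$ identical accounts, a single asset, a linear expected return $\varpi x$, the unit cost $c(x) = (\text{total traded})^p$, and cash constraints that never bind. The function names and the parameter values ($\varpi = 0.4$, $p = 0.6$, $n = 2$) are illustrative choices, not part of the original study.

```python
# Sketch under stated assumptions: n identical accounts, one asset,
# linear expected return varpi * x, unit cost (total traded) ** p,
# cash constraints assumed non-binding. Names are hypothetical.


def independent_trade(varpi, p):
    """Maximizer of varpi*x - x**(1+p); first-order condition varpi = (1+p) * x**p."""
    return (varpi / (1.0 + p)) ** (1.0 / p)


def cournot_nash_trade(varpi, p, n):
    """Symmetric solution of varpi = S**p + p * x * S**(p-1) with S = n * x."""
    total = (varpi / (1.0 + p / n)) ** (1.0 / p)
    return total / n


def collusive_total(varpi, p):
    """Total trade maximizing varpi*S - S**(1+p) (aggregate objective, identical accounts)."""
    return (varpi / (1.0 + p)) ** (1.0 / p)


def net_utility(varpi, p, own, total):
    """Net utility of an account trading `own` when `total` is traded overall."""
    return varpi * own - own * total ** p


varpi, p, n = 0.4, 0.6, 2

x_ind = independent_trade(varpi, p)
anticipated_ind = net_utility(varpi, p, x_ind, x_ind)    # cost the account expects
realized_ind = net_utility(varpi, p, x_ind, n * x_ind)   # cost once all accounts trade

x_cn = cournot_nash_trade(varpi, p, n)
utility_cn = net_utility(varpi, p, x_cn, n * x_cn)

s_col = collusive_total(varpi, p)
utility_col = net_utility(varpi, p, s_col / n, s_col)    # equal trades, pro-rata cost

print(f"independent: anticipated {anticipated_ind:.4f}, realized {realized_ind:.4f}")
print(f"Cournot-Nash: {utility_cn:.4f}")
print(f"collusive, equal split: {utility_col:.4f}")
```

In this toy setting the independent solution overestimates its own net utility, and the Cournot-Nash utility lies below the equal-split collusive utility for every account, which illustrates the lack of Pareto optimality discussed above.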

3. Information Pooling Game

In this section, a multi-portfolio optimization approach incorporating both horizontal and vertical fairness from the clients’ perspective is proposed and justified, which is termed the information pooling game. First, three key modeling choices will be explained, namely

1) information pooling: it allows the clients to decide whether and to what extent their private trading information is shared with others in the same bunched trade;
2) vertical fairness: for each account, it is reflected by the difference between the expected and realized net utility, which is termed the dissatisfaction indicator;
3) horizontal fairness: the information pooling process does not inflict particularly high or low dissatisfaction on any account.

3.1. Information Pooling

The Cournot-Nash equilibrium has been criticized as unfair in certain respects, because it coerces the clients to participate in an “artificial game” and share their complete information with others (Iancu and Trichakis, 2014). Motivated by the shortcomings of the Cournot-Nash equilibrium, our approach invites all clients to decide whether and to what extent to share their private trading information. Moreover, as discussed in the introduction, initiative and active participation in the portfolio optimization process improve information transparency, and ultimately the perceived fairness of clients.

For a more detailed explanation, let $\tau = (\tau_j)_{j \in P} \in \mathbb{R}^{mn}$ denote an information pool from all clients, where $\tau_j = (\tau_{ij})_{i \in A} \in \mathbb{R}^m$ is a binary vector and $\tau_{ij} \in \{0, 1\}$ is a binary indicator of client $j$’s willingness to share her trading information on asset $i$. More precisely, $\tau_{ij} = 0$ indicates that client $j$ declines to pool her trading information on asset $i$, while $\tau_{ij} = 1$ indicates that $j$ is willing to share her trading information on asset $i$ with the others who contribute to the information pool of $i$. This is a natural and fair assumption that avoids the free-rider phenomenon in information pooling. We define the vector function of the cost per unit traded for each asset as follows.

Definition 1. For account $k \in P$, its estimate of the vector function of the cost per unit traded for asset $i \in A$ with information pool $\tau$ is defined as $c^k(x|\tau) = (c_i^k(x|\tau))_{i \in A}$, where

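$$c_i^k(x|\tau) = \begin{cases} (x_{ik})^p & \text{if } \tau_{ik} = 0 \\ \big(\sum_{j \in P} \tau_{ij} x_{ij}\big)^p & \text{if } \tau_{ik} = 1 \end{cases} \tag{4}$$

Let $U_j^{\varepsilon}(x_j^{ip}|\tau)$ denote the expected net utility for account $j \in P$ with information pool $\tau$. It can be derived from the information pooling problem as follows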

$$U_j^{\varepsilon}(x_j^{ip}|\tau) = \max_{x_j \in \xi_j} \{ u_j(x_j) - x_j^T c^j(x|\tau) \} \tag{5}$$
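To make Definition 1 and the objective inside problem (5) concrete, the following sketch evaluates the perceived unit costs of equation (4) for a given trade profile and information pool. The nested-list layout `x[j][i]` (account first, asset second), the function names, and the linear return `varpi[i] * x[k][i]` are assumptions made for illustration only.

```python
# Sketch of equation (4) and of the objective in problem (5).
# Layout assumption: x[j][i] and tau[j][i] are indexed by account j, then asset i.


def perceived_unit_costs(x, tau, k, p):
    """Unit costs c_i^k(x|tau) as estimated by account k for every asset i."""
    n_accounts = len(x)
    costs = []
    for i in range(len(x[k])):
        if tau[k][i] == 0:
            # Account k does not pool asset i: it only sees its own trade.
            base = x[k][i]
        else:
            # Account k pools asset i: it sees the trades of all contributors.
            base = sum(tau[j][i] * x[j][i] for j in range(n_accounts))
        costs.append(base ** p)
    return costs


def expected_net_utility(x, tau, k, p, varpi):
    """Value of u_k(x_k) - x_k^T c^k(x|tau) with linear return varpi[i] * x[k][i]."""
    costs = perceived_unit_costs(x, tau, k, p)
    return sum((varpi[i] - costs[i]) * x[k][i] for i in range(len(costs)))


# Two accounts, two assets; account 0 pools only asset 0, account 1 pools both.
x = [[0.04, 0.09], [0.16, 0.01]]
tau = [[1, 0], [1, 1]]
costs_account_0 = perceived_unit_costs(x, tau, 0, 0.5)
costs_account_1 = perceived_unit_costs(x, tau, 1, 0.5)
```

Note that account 1, although it pools asset 1, still sees only its own trade there because account 0 does not contribute to that pool, which is the free-rider safeguard described above.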

Conditional on a fixed information pool $\tau$, the set of equilibrium solutions to problem (5) is the same as the set of simultaneous solutions of the first-order optimality conditions for all accounts. For account $j \in P$, let $U_j^{\varepsilon}(x_j|\tau) = u_j(x_j) - x_j^T c^j(x|\tau)$; the conditions can then be written as

$$\begin{aligned}
-\frac{\partial U_j^{\varepsilon}(x_j|\tau)}{\partial x_{ij}} = -\frac{\partial u_j(x_j)}{\partial x_{ij}} + c_i^j(x|\tau) + x_{ij} \frac{\partial c_i^j(x|\tau)}{\partial x_{ij}} &\geq 0, \quad \forall i \in A \\
x_{ij} \Big( -\frac{\partial U_j^{\varepsilon}(x_j|\tau)}{\partial x_{ij}} \Big) &= 0, \quad \forall i \in A \\
x_{ij} &\geq 0, \quad \forall i \in A \\
\sum_{i \in A} x_{ij} &\leq w_j
\end{aligned} \tag{6}$$

This is a nonlinear complementarity problem $NCP(\mathbb{R}^m_+, -\nabla U_j^{\varepsilon}(x_j|\tau))$ such that

$$x_j \geq 0, \quad -\nabla U_j^{\varepsilon}(x_j|\tau) \geq 0, \quad (x_j)^T (-\nabla U_j^{\varepsilon}(x_j|\tau)) = 0 \tag{7}$$

with the extra constraint $\sum_{i \in A} x_{ij} \leq w_j$.

Lemma 1. $(x_j^{ip})_{j \in P}$ is an equilibrium of the information pooling problem if and only if $(x_j^{ip})_{j \in P}$ is a solution to $\{NCP(\mathbb{R}^m_+, -\nabla U_j^{\varepsilon}(x_j|\tau))\}_{j \in P}$ constrained by the upper bounds on trades $\sum_{i \in A} x_{ij} \leq w_j$, $\forall j \in P$.

In portfolio selection theory, the expected return for account $j \in P$ in a single rebalancing period is generally modeled as $u_j(x_j) = \varpi^T x_j$, where $\varpi = (\varpi_i)_{i \in A}$ is a random return rate vector. We prove that

Theorem 1 (Existence and Uniqueness of Equilibrium Solution). Assume $u_j(x_j) = \varpi^T x_j$, $\forall j \in P$; then there exists a unique equilibrium solution $(x_j^{ip})_{j \in P}$ to the information pooling problem.

Remark 1. With a null information pool, where all accounts decline to share their trading information, $(x_j^{ip})_{j \in P}$ corresponds to the independent optimization solution $(x_j^{ind})_{j \in P}$. With a complete information pool, where all accounts reach a consensus on sharing their trading information, $(x_j^{ip})_{j \in P}$ is consistent with the Cournot-Nash equilibrium solution $(x_j^{cn})_{j \in P}$.

3.2. Vertical Fairness

The clients are invited to express their preferences on pooling trading information; however, this approach does not affect the aggregate optimization carried out by the manager for efficiency and best execution. In other words, the bunched trading decisions are still made in accordance with the collusive solution, and this paper concerns the split mechanism of the resulting market impact cost. Let $r_j^{\tau}$ denote the market impact cost split ratio of account $j \in P$ with information pool $\tau$.
Let $\rho_j^{\tau} \subseteq (0, 1)$ denote the feasible set of $j$’s split ratio satisfying $r_j^{\tau} \geq 0$ and $\sum_{j \in P} r_j^{\tau} = 1$. Then the realized net utility of account $j$ with information pool $\tau$ can be written as

$$U_j(x_j^{col}|\tau) = u_j(x_j^{col}) - r_j^{\tau} \Big(\sum_{j \in P} x_{ij}^{col}\Big)_{i \in A}^T c(x^{col}) \tag{8}$$

where $(x_j^{col})_{j \in P}$ is the solution to the optimization problem (2). Following the argument on perceived fairness that actions which make some party worse off than its prior expectations are generally viewed as unfair (Kahneman et al., 1986), we have

Definition 2. A dissatisfaction indicator of account $j \in P$ with information pool $\tau$ is defined as

$$ds_j^{\tau} = \frac{U_j^{\varepsilon}(x_j^{ip}|\tau) - U_j(x_j^{col}|\tau)}{U_j^{\varepsilon}(x_j^{ip}|\tau)} \tag{9}$$

The dissatisfaction indicator reflects the vertical fairness for a client through the relative difference between her expected and realized net utility; higher dissatisfaction implies worse vertical fairness.

3.3. Horizontal Fairness

From the clients’ perspective, it is desirable that any trade $x$ executed with information pool $\tau$ generates non-positive dissatisfaction. However, this is very difficult to implement both theoretically and in practice. Hence, the following optimization problem is introduced to decide the market impact cost split ratio.

$$\min_{r_j^{\tau} \in \rho_j^{\tau}, \, \forall j \in P} \; Var(ds_1^{\tau}, ds_2^{\tau}, ..., ds_n^{\tau}) \tag{10}$$

The optimal solution $(r_j^{\tau *})_{j \in P}$ guarantees horizontal fairness in splitting the resulting market impact cost by minimizing the variance of the dissatisfaction indicators across all accounts. Although 100% envy-freeness is not implemented in our mechanism, the clients’ dissatisfactions (or satisfactions) at least do not spread out too much from a certain level. For example, an investor considers it unfair if she suffers a 10% dissatisfaction while others in the same bunched trade only suffer 1%; however, it will be judged fair if the others are dissatisfied at 9.99%.

3.4. Main Formulation: Information Pooling Game

Next, the main formulation based on the modeling choices elaborated above is summarized. The manager would proceed as follows

1) Determine the trades $x^{col}$ by solving the collusive optimization problem (2), and execute them.

2) Invite each client $j \in P$ to determine her information pooling strategy $\tau_j = (\tau_{ij})_{i \in A}$, which forms an information pool $\tau$.
3) Authorize client $j$ to access her corresponding information pool. Both the manager and client $j$ may estimate $j$’s expected net utility by solving the information pooling problem (5) with SLCP (Sequential Linearly Constrained Programming).
4) Determine the split ratio $(r_j^{\tau})_{j \in P}$ of the resulting market impact cost by solving problem (10), where the dissatisfaction indicator for account $j \in P$ is defined by equation (9). Then the realized net utility for $j$ can be determined by equation (8).

From the clients’ perspective, the process above can be viewed as an information pooling game. They have to decide their own information pooling strategies, and the resulting information pool directly affects the allocation of the market impact cost caused by the bunched trades. Here is a very simple example illustrating our mechanism.

Example 1 (Information Pooling Game). Suppose that there are only two accounts, account 1 with $100 and account 2 with $10 initially available for investment. Furthermore, suppose there is only one risky asset available for investment, with an expected return rate of 40%. Assume $p = 0.6$ ($p = 0.6 \pm 0.038$ with 67% probability; Almgren et al., 2005). Then

1) The collusive solution is $x^{col} = (0.0583, 0.0408)$, and the total resulting market impact cost is 0.0248.
2) The potential information pools are $\tau^1 = (0, 0)$, $\tau^2 = (0, 1)$, $\tau^3 = (1, 0)$, $\tau^4 = (1, 1)$.
3) For each information pool, the expected net utility $U_j^{\varepsilon}(x_j^{ip}|\tau)$ of account $j \in \{1, 2\}$ can be summarized in a payoff matrix.

Table 1. Expected net utilities of the information pooling problems

                     Account 2
                     τ2 = 0              τ2 = 1
Account 1   τ1 = 0   (0.0149, 0.0149)    (0.0149, 0.0149)
            τ1 = 1   (0.0149, 0.0149)    (0.0065, 0.0065)

4) Because the expected net utilities are the same for both accounts under every information pool, the optimal market impact cost split ratio vector is $r^{\tau^i} = (0.6414, 0.3586)$, $\forall i \in \{1, 2, 3, 4\}$, and the realized net utility vector is $U(x^{col}|\tau^i) = (0.0074401, 0.0074417)$. Moreover, the dissatisfaction indicators are

$$ds^{\tau^1} = ds^{\tau^2} = ds^{\tau^3} = (0.50066, 0.50055), \qquad ds^{\tau^4} = (-0.14921, -0.14946) \tag{11}$$

Remark 2. In this example, although the realized net utilities of both clients do not vary with the information pool, their vertical fairness is improved considerably, by approximately 65%. This helps establish a stable manager-client relationship in practice. If the resulting market impact cost is simply allocated in a pro-rata fashion (O’Cinneide et al., 2006), the split ratio is $r^{pr} = (0.5883, 0.4117)$, and the realized net utilities are $U^{pr}(x^{col}) = (0.0088, 0.0061)$. In this case, there is no information sharing between the clients, and the expected net utilities correspond to those with $\tau^1$. Hence the dissatisfactions are $ds^{pr} = (0.4124, 0.5888)$, and client 2, with the small account, suffers almost 20% higher dissatisfaction than client 1. Moreover, our Pareto optimal information pooling solution (0.0074401, 0.0074417) outperforms the Cournot-Nash equilibrium solution (0.00647414, 0.00647414) for both accounts by sacrificing less than 0.03% horizontal fairness.
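The split ratios and dissatisfaction indicators of Example 1 can be reproduced approximately with a short calculation. The sketch below is not the solver used in the paper: it relies on the observation that the variance in problem (10) reaches zero whenever all dissatisfaction indicators can be made equal. Setting $ds_j^{\tau} = d$ for all $j$ in equations (8) and (9) and summing over accounts gives $1 - d = (\sum_j u_j - C) / \sum_j U_j^{\varepsilon}$, where $C$ is the total market impact cost. The function names are hypothetical, and the shortcut only applies when the resulting ratios are non-negative.

```python
# Sketch: equalize dissatisfaction indicators (zero variance in problem (10)).
# Inputs are taken from Example 1; expected utilities are the rounded Table 1 values.


def equal_dissatisfaction_split(expected, gross, total_cost):
    """Split ratios r_j making every dissatisfaction indicator of equation (9) equal.

    expected[j] : expected net utility of account j from problem (5)
    gross[j]    : expected return u_j of the collusive trades, before cost
    total_cost  : market impact cost of the bunched trade
    """
    scale = (sum(gross) - total_cost) / sum(expected)  # common value of 1 - ds_j
    ratios = [(g - e * scale) / total_cost for g, e in zip(gross, expected)]
    if any(r < 0 for r in ratios):
        raise ValueError("equalizing split is infeasible; a constrained solver is needed")
    return ratios


def dissatisfaction(expected, gross, ratios, total_cost):
    """Dissatisfaction indicators of equation (9) for given split ratios."""
    realized = [g - r * total_cost for g, r in zip(gross, ratios)]
    return [(e - u) / e for e, u in zip(expected, realized)]


varpi, p = 0.4, 0.6
trades = [0.0583, 0.0408]                 # collusive trades from Example 1
total_cost = sum(trades) ** (1 + p)       # single asset: S * S**p
gross = [varpi * t for t in trades]

expected_null = [0.0149, 0.0149]          # Table 1, pools tau^1 to tau^3
expected_complete = [0.0065, 0.0065]      # Table 1, pool tau^4

ratios = equal_dissatisfaction_split(expected_null, gross, total_cost)
ds_null = dissatisfaction(expected_null, gross, ratios, total_cost)
ratios_complete = equal_dissatisfaction_split(expected_complete, gross, total_cost)
ds_complete = dissatisfaction(expected_complete, gross, ratios_complete, total_cost)
```

With these rounded inputs the ratios come out close to (0.6414, 0.3586) and the null-pool dissatisfaction close to 0.50; small differences from equation (11) are expected because the inputs are rounded to four decimals.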

3.5. Discussion

This mechanism allows the manager to jointly optimize all clients’ trading and split the market impact cost in a fair way. More precisely, the resulting market impact cost is allocated by minimizing the variance of dissatisfactions (vertical fairness) across all accounts (horizontal fairness). It produces Pareto optimal utilities while also keeping the satisfaction of all accounts at a similar level, complying with the SEC best execution rules.

From the clients’ perspective, the information pooling game improves their perceived fairness in two respects. First, it allows the clients to decide whether and to what extent to share their trading information, and their information pooling strategies directly affect the market impact cost split ratio. A mechanism with involvement (voice) of investors is more likely to be regarded as fair. Second, it outperforms the pro-rata collusive solution in horizontal fairness, and overcomes the pitfall of the Cournot-Nash equilibrium solution with a more tractable approach by introducing the expected net utility function.

4. Numerical Studies

In this section, two numerical studies are conducted to (1) compare the solutions in our mechanism to the pro-rata collusive solution and the Cournot-Nash equilibrium solution; (2) verify the effect of information pooling on the realized net utilities and dissatisfaction indicators. In our numerical studies, the return rate $\varpi$ is randomly selected between −20% and 40%, and $p$ is assumed to be 0.6 as in the example above.

4.1. Numerical Study on Three Solution Concepts

Suppose a manager is in charge of 2 portfolios $P = \{1, 2\}$, and there are 50 assets $A = \{1, 2, ..., 50\}$ available for investment. Account 1 has $w_1$ = $1M and account 2 has $w_2$ = $100M initially. Assume that the two clients reach a consensus on sharing their trading information regarding all assets, that is, $\tau_{ij} = 1$, $\forall i \in A$, $\forall j \in P$. This numerical study compares the performance of both accounts under the collusive solution, the Cournot-Nash equilibrium solution, and the information pooling solution proposed in this paper. The statistical results are summarized in Table 2, which reports the average expected and realized net utilities and the dissatisfaction indicators.

Table 2. Account performances with collusive solution, Cournot-Nash equilibrium solution and information pooling solution

                                 Collusive            Cournot-Nash         Information Pooling
Property (account)               1         2          1         2          1         2
Avg. Expected Net Utility (%)    1.9487    1.2375     1.2814    1.1023     1.2814    1.1023
Avg. Realized Net Utility (%)    1.3495    1.1951     1.2814    1.1023     1.3857    1.1947
Dissatisfaction Indicator (%)    30.7487   3.4263     0.0000    0.0000     -8.1395   -8.3825

Remark 3. The statistical results above provide further evidence that

1) Managers and investors cannot afford to ignore the market impact cost. The mean of the return rate is set to 10%; however, the net return after cost is only around 1%.
2) In the collusive solution, the account with the lower initial cash position is hurt more. The actual market impact cost is significantly underestimated for clients without any information (Remark 1), which reduces their vertical perceived fairness considerably. As shown in Table 2, the realized net utility is approximately 30% lower than the prior expectation for account 1.
3) The Cournot-Nash equilibrium solution is not Pareto optimal and violates the best execution rules, although it implements perfect horizontal fairness. For all accounts, both the collusive and information pooling solutions bring about higher realized net utilities (Figure 1).

4) Note that the expected net utility in our information pooling approach is consistent with that in the Cournot-Nash equilibrium solution, as all clients agree to disclose their trading information (Remark 1). Unlike the collusive solution, it rewards (or hurts) both accounts by approximately the same ratio. Compared to the Cournot-Nash equilibrium solution, although our approach sacrifices approx. 0.2% horizontal fairness, the Pareto optimal solution strictly improves the net utilities of both accounts by approx. 8%.
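The dissatisfaction row of Table 2 and the improvement figures quoted in Remark 3 can be checked directly from the reported averages. The sketch below applies equation (9) to the averaged utilities; the dictionary layout and function names are illustrative, and whether the original study computed the indicators from averages or averaged per-run indicators is not stated.

```python
# Sketch: recompute dissatisfaction (%) and improvement over Cournot-Nash (%)
# from the averages reported in Table 2. Tuples are (account 1, account 2).

table2 = {
    "Collusive": {"expected": (1.9487, 1.2375), "realized": (1.3495, 1.1951)},
    "Cournot-Nash": {"expected": (1.2814, 1.1023), "realized": (1.2814, 1.1023)},
    "Information Pooling": {"expected": (1.2814, 1.1023), "realized": (1.3857, 1.1947)},
}


def dissatisfaction_pct(expected, realized):
    """Equation (9) in percent: relative shortfall of realized vs expected utility."""
    return tuple(100.0 * (e - r) / e for e, r in zip(expected, realized))


def improvement_pct(baseline, realized):
    """Relative improvement of realized utility over a baseline, in percent."""
    return tuple(100.0 * (r - b) / b for b, r in zip(baseline, realized))


ds = {name: dissatisfaction_pct(row["expected"], row["realized"])
      for name, row in table2.items()}
baseline = table2["Cournot-Nash"]["realized"]
gain = {name: improvement_pct(baseline, row["realized"])
        for name, row in table2.items() if name != "Cournot-Nash"}

for name, values in ds.items():
    print(name, "dissatisfaction (%):", [round(v, 4) for v in values])
for name, values in gain.items():
    print(name, "improvement over Cournot-Nash (%):", [round(v, 4) for v in values])
```

The gap between the two information pooling indicators is the horizontal fairness sacrificed relative to the Cournot-Nash solution, and the improvement of roughly 8% for both accounts corresponds to the right-hand bars of Figure 1.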

[Figure 1: bar chart. Vertical axis: Relative Improvement (%), from 0 to 9. Horizontal axis: Collusive, Information Pooling. Series: Account 1, Account 2.]

Fig. 1. Relative improvement of the realized net utilities with collusive and information pooling solutions compared to the Cournot-Nash equilibrium solution

4.2. Numerical Study on Information Pooling Game

Within our framework, this numerical study focuses on the effect of the clients’ information pooling strategies on the realized net utilities and perceived fairness. The setup is similar to the previous numerical study, but the manager is supposed to be in charge of 3 portfolios $P = \{1, 2, 3\}$ with $w_1 = w_2$ = $1M and $w_3$ = $100M in initial cash positions. With 50 available assets, there are $2^{150}$ potential information pools, which exceeds the upper iteration limit of our program. Hence, four typical information pools will be compared, namely

1) Null information pool ($\tau_{ij} = 0$, $\forall i \in A$, $\forall j \in P$): all accounts decline to pool their trading information;

2) Partial information pool (1-2) ($\tau_{i1} = \tau_{i2} = 1$, $\tau_{i3} = 0$, $\forall i \in A$): accounts 1 and 2, with lower initial cash positions, agree to pool their trading information;

3) Partial information pool (1-3) ($\tau_{i1} = \tau_{i3} = 1$, $\tau_{i2} = 0$, $\forall i \in A$): a small account and a large account consent to share their information, which is the same as partial information pool (2-3);

4) Complete information pool ($\tau_{ij} = 1$, $\forall i \in A$, $\forall j \in P$): all accounts reach a consensus to disclose their trading information.

The average realized net utilities are summarized in Tables 3 and 4; note that there are two pure strategy Nash equilibria, $(\tau_1, \tau_2, \tau_3) = (0, 0, 0)$ or $(1, 1, 0)$. If we take the opposite of the dissatisfaction indicators (satisfaction) as the payoff, the information pooling game can be represented by Tables 5 and 6, and the pure strategy Nash equilibria become $(\tau_1, \tau_2, \tau_3) = (0, 0, 0)$ or $(1, 1, 1)$. This indicates that although account 3, with the higher initial cash position, tends not to join the information pool if the others do, it does perceive higher vertical satisfaction by acquiring more trading information.

Table 3. Avg. realized net utilities (%) if τ3 = 0

                     Account 2
                     τ2 = 0                      τ2 = 1
Account 1   τ1 = 0   (1.3037, 1.3040, 1.1922)    (1.3037, 1.3040, 1.1922)
            τ1 = 1   (1.3037, 1.3040, 1.1922)    (1.3086, 1.3088, 1.1921)

Table 4. Avg. realized net utilities (%) if τ3 = 1

                     Account 2
                     τ2 = 0                      τ2 = 1
Account 1   τ1 = 0   (1.3037, 1.3040, 1.1922)    (1.2896, 1.3113, 1.1923)
            τ1 = 1   (1.3113, 1.2896, 1.1923)    (1.3114, 1.3115, 1.1920)

Table 5. Opposite of avg. dissatisfaction indicators (%) if τ3 = 0

                     Account 2
                     τ2 = 0                         τ2 = 1
Account 1   τ1 = 0   (−33.1573, −33.1316, −2.0217)  (−33.1573, −33.1316, −2.0217)
            τ1 = 1   (−33.1573, −33.1316, −2.0217)  (−30.2415, −30.2457, −1.9977)

Table 6. Opposite of avg. dissatisfaction indicators (%) if τ3 = 1

                     Account 2
                     τ2 = 0                         τ2 = 1
Account 1   τ1 = 0   (−33.1573, −33.1316, −2.0217)  (−33.8768, 1.1962, 0.7095)
            τ1 = 1   (1.1962, −33.8768, 0.7095)     (3.3331, 3.3247, 2.3527)
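The pure strategy Nash equilibria reported for Tables 3 to 6 can be verified by brute-force enumeration of unilateral deviations. The sketch below transcribes the payoffs from the tables, keyed by $(\tau_1, \tau_2, \tau_3)$, and treats a profile as an equilibrium when no single account can strictly increase its own payoff by switching its strategy; ties therefore count as stable. The function and variable names are illustrative.

```python
from itertools import product

# Payoffs transcribed from Tables 3-6, keyed by (tau1, tau2, tau3).
realized = {
    (0, 0, 0): (1.3037, 1.3040, 1.1922),
    (0, 1, 0): (1.3037, 1.3040, 1.1922),
    (1, 0, 0): (1.3037, 1.3040, 1.1922),
    (1, 1, 0): (1.3086, 1.3088, 1.1921),
    (0, 0, 1): (1.3037, 1.3040, 1.1922),
    (0, 1, 1): (1.2896, 1.3113, 1.1923),
    (1, 0, 1): (1.3113, 1.2896, 1.1923),
    (1, 1, 1): (1.3114, 1.3115, 1.1920),
}
satisfaction = {
    (0, 0, 0): (-33.1573, -33.1316, -2.0217),
    (0, 1, 0): (-33.1573, -33.1316, -2.0217),
    (1, 0, 0): (-33.1573, -33.1316, -2.0217),
    (1, 1, 0): (-30.2415, -30.2457, -1.9977),
    (0, 0, 1): (-33.1573, -33.1316, -2.0217),
    (0, 1, 1): (-33.8768, 1.1962, 0.7095),
    (1, 0, 1): (1.1962, -33.8768, 0.7095),
    (1, 1, 1): (3.3331, 3.3247, 2.3527),
}


def pure_nash_equilibria(payoffs, n_players=3):
    """Profiles from which no player gains strictly by a unilateral switch."""
    equilibria = []
    for profile in product((0, 1), repeat=n_players):
        stable = True
        for player in range(n_players):
            deviation = list(profile)
            deviation[player] = 1 - deviation[player]
            if payoffs[tuple(deviation)][player] > payoffs[profile][player]:
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria


print(pure_nash_equilibria(realized))      # [(0, 0, 0), (1, 1, 0)]
print(pure_nash_equilibria(satisfaction))  # [(0, 0, 0), (1, 1, 1)]
```

The null profile survives in both games only because every unilateral deviation from it leaves all payoffs unchanged, so it is an equilibrium in the weak sense.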

Remark 4. The comparative results shown in Figures 2 and 3 also suggest that

1) From the perspective of realized net utility, account 3, with the higher initial cash position, has less incentive to pool its trading information than the other two accounts, and the complete information pool actually hurts its benefit compared to the null information pool.
2) Even though some account chooses not to disclose its information, its realized net utility is made worse off by the existence of a partial information pool, and a small account is hurt more.
3) From the perspective of fairness, the information pooling process improves all participants’ vertical fairness compared to the null situation, and has more impact on the small accounts as well.

[Figure 2: bar chart. Vertical axis: Relative Improvement (%), from -1.2 to 0.6. Horizontal axis: Partial (1-2), Partial (1-3), Complete. Series: Account 1, Account 2, Account 3.]

Fig. 2. Relative improvement of the realized net utilities with partial and complete information pools to that with null information pool

[Figure 3: bar chart. Vertical axis: Increase in Dissatisfaction Indicators (%), from -40 to 5. Horizontal axis: Partial (1-2), Partial (1-3), Complete. Series: Account 1, Account 2, Account 3.]

Fig. 3. Increase of dissatisfaction indicators with partial and complete information pools from that with null information pool

5. Conclusion

In this paper, an information pooling game for multi-portfolio optimization is introduced, incorporating both horizontal and vertical perceived fairness from the clients’ perspective. This novel mechanism invites the clients to decide whether and to what extent their trading information is shared with others, which directly affects the split ratio of the resulting market impact cost in the collusive solution. It also allows the manager to jointly optimize multiple portfolios and split the market impact cost in a fair way by keeping the satisfaction of all accounts at a similar level.

The numerical study verifies that our approach outperforms the pro-rata collusive solution in fairness, and the Cournot-Nash equilibrium in Pareto optimality. It also suggests that small accounts are more sensitive to the information pooling process. For more robust evidence, the existence of separate equilibria in a multi-period information pooling game, as well as extensions with cross-asset effects, remain as future work.

Acknowledgments. The author expresses her gratitude to S. Muto for useful discussions on the subject.

Appendix

1. Proof of Lemma 1

Proof. With Definition 1, for account $k \in P$, the first derivative of $c_i^k(x|\tau)$ with respect to $x_{ik}$, $\forall i \in A$, can be derived as

$$\frac{\partial c_i^k(x|\tau)}{\partial x_{ik}} = \begin{cases} p (x_{ik})^{p-1} & \text{if } \tau_{ik} = 0 \\ p \big(\sum_{j \in P} \tau_{ij} x_{ij}\big)^{p-1} & \text{if } \tau_{ik} = 1 \end{cases} \tag{12}$$

The second derivative with respect to $x_{ik}$ can be derived as

$$\frac{\partial^2 c_i^k(x|\tau)}{\partial x_{ik}^2} = \begin{cases} p(p-1)(x_{ik})^{p-2} & \text{if } \tau_{ik} = 0 \\ p(p-1)\big(\sum_{j \in P} \tau_{ij} x_{ij}\big)^{p-2} & \text{if } \tau_{ik} = 1 \end{cases} \tag{13}$$

We have ignored the cross-asset market impact, thus the second derivative with respect to $x_{hk}$, $\forall h \neq i \in A$, is 0 regardless of the value of $\tau_{ik}$. As $p$ is a rational number between 0.5 and 1, we have

$$\nabla c_i^k(x|\tau) > 0, \quad \nabla^2 c_i^k(x|\tau) \leq 0 \tag{14}$$

Then for account $k$, the second derivative of $U_k^{\varepsilon}(x_k|\tau)$ with respect to $x_{ik}$, $\forall i \in A$, can be represented as

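$$\begin{aligned}
\frac{\partial^2 U_k^{\varepsilon}(x_k|\tau)}{\partial x_{ik}^2}
&= \frac{\partial^2 u_k(x_k)}{\partial x_{ik}^2} - 2 \frac{\partial c_i^k(x|\tau)}{\partial x_{ik}} - x_{ik} \frac{\partial^2 c_i^k(x|\tau)}{\partial x_{ik}^2} \\
&= \begin{cases}
\frac{\partial^2 u_k(x_k)}{\partial x_{ik}^2} - (p^2 + p)(x_{ik})^{p-1} < 0 & \text{if } \tau_{ik} = 0 \\
\frac{\partial^2 u_k(x_k)}{\partial x_{ik}^2} - p \big(\sum_{j \in P} \tau_{ij} x_{ij}\big)^{p-2} \big(2 \sum_{j \in P} \tau_{ij} x_{ij} + (p-1) x_{ik}\big) < 0 & \text{if } \tau_{ik} = 1
\end{cases}
\end{aligned} \tag{15}$$

And the second derivative with respect to $x_{hk}$, $\forall h \neq i \in A$, is

$$\frac{\partial^2 U_k^{\varepsilon}(x_k|\tau)}{\partial x_{ik} \partial x_{hk}} = \frac{\partial^2 u_k(x_k)}{\partial x_{ik} \partial x_{hk}} \leq 0 \tag{16}$$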

For any account $j \in P$, $U_j^{\varepsilon}(x_j|\tau)$ is twice continuously differentiable and concave with respect to its own trade $x_j$. Hence, $(x_j^{ip})_{j \in P}$ is an equilibrium of the information pooling problem if and only if $(x_j^{ip})_{j \in P}$ is a solution to the set of nonlinear complementarity problems $\{NCP(\mathbb{R}^m_+, -\nabla U_j^{\varepsilon}(x_j|\tau))\}_{j \in P}$ constrained by the upper bounds on trades $\sum_{i \in A} x_{ij} \leq w_j$, $\forall j \in P$.

2. Proof of Theorem 1

Proof. For account $k \in P$, assume the Hessian matrix of $U_k^{\varepsilon}(x_k|\tau)$ to be

$$H = \begin{pmatrix}
a_{11} & \cdots & a_{1i} & \cdots & a_{1m} \\
\vdots & \ddots & \vdots & & \vdots \\
a_{i1} & \cdots & a_{ii} & \cdots & a_{im} \\
\vdots & & \vdots & \ddots & \vdots \\
a_{m1} & \cdots & a_{mi} & \cdots & a_{mm}
\end{pmatrix}$$

Following the proof of Lemma 1, $a_{ii}$, $\forall i \in A$, can be represented by equation (15), while $a_{ih}$ and $a_{hi}$, $\forall h \neq i \in A$, can be derived from equation (16). With $u_k(x_k) = \varpi^T x_k$, it is straightforward to show that

$$|a_{ii}| > \sum_{h \neq i} |a_{ih}|, \quad |a_{ii}| > \sum_{h \neq i} |a_{hi}| \tag{17}$$

is satisfied for all $i \in A$. Hence $H$ is negatively strictly diagonally dominant. For any account $j \in P$, the three conditions below are all satisfied

1) The expected net utility function $U_j^{\varepsilon}(x_j|\tau)$ is twice continuously differentiable and concave with respect to $x_j$;

2) The trades $x_j$ are bounded;
3) $\nabla^2 U_j^{\varepsilon}(x_j|\tau)$ has a negative strictly dominant diagonal for all $x_j \geq 0$.

Based on the K&M Theorem (Kolstad and Mathiesen, 1991), there exists a unique solution to $\{NCP(\mathbb{R}^m_+, -\nabla U_j^{\varepsilon}(x_j|\tau))\}_{j \in P}$.

References

Almgren, R., C. Thum, E. Hauptmann and H. Li (2005). Equity market impact. Risk, 57–62.
Bies, R. J., T. M. Tripp and M. A. Neale (1993). Procedural fairness and profit seeking: the perceived legitimacy of market exploitation. Journal of Behavioral Decision Making, 6, 243–256.
Fabozzi, F., S. Focardi and P. Kolm (2010). Quantitative Equity Investing: Techniques and Strategies. John Wiley & Sons: Hoboken, NJ.
Iancu, D. A. and N. Trichakis (2014). Fairness and efficiency in multiportfolio optimization. Operations Research, 62(6), 1283–1301.
Kahneman, D., J. L. Knetsch and R. H. Thaler (1986). Fairness and the assumptions of economics. Journal of Business, 59(4), 285–300.
Kolstad, C. D. and L. Mathiesen (1991). Computing Cournot-Nash Equilibria. Operations Research, 39, 739–748.
Markowitz, H. (1952). Portfolio selection. J. Finance, 7(1), 77–91.
Mas-Colell, A. (1984). On a theorem of Schmeidler. J. Math. Econom., 13, 201–206.

O’Cinneide, C., B. Scherer and X. Xu (2006). Pooling trades in a quantitative investment process. J. Portfolio Management, 32(4), 33–43.
PlexusGroup (2002). Sneaking an elephant across a putting green: a transaction case study. Commentary 70.
Savelsbergh, M. W. P., R. A. Stubbs and D. Vandenbussche (2010). Multiportfolio optimization: a natural next step. Handbook of Portfolio Construction, 565–581.
Securities and Exchange Commission (2011). General information on the regulation of investment advisers. Retrieved on February 15, 2017, https://www.sec.gov/divisions/investment/iaregulation/memoia.htm.

Contributions to Game Theory and Management, X, 42–67

Cooperation in Dynamic Network Games⋆

Hongwei Gao1 and Yaroslavna Pankratova2 1 College of Mathematics and Statistics, Qingdao University Qingdao 266071, China E-mail: [email protected] 2 Saint Petersburg State University 7/9 Universitetskaya nab., Saint Petersburg 199034, Russia E-mail: [email protected]

Abstract This paper reviews research on dynamic network games that has been carried out at Saint Petersburg State University since 2009. We focus on the problem of cooperation in dynamic network models, noting the time and subgame inconsistency of cooperative solutions. The problem of stable cooperation is also covered.
Keywords: dynamic games, cooperation, network, pairwise interactions, time-consistency.

1. Two-stage games

In this section we introduce basic definitions and analyze how mutual links connecting players can influence players’ behavior. Such links define a network. A two-stage game is considered as a basic model in which players form a network at the first stage, and at the second stage players choose their controls. The network game is given in a strategic setting, and following (Kuhn, 1953) a strategy of a player is a rule that uniquely defines his behavior at both stages of the game (the player’s behavior at the second stage depends on the network formed at the first stage). It is supposed that the payoff to each player depends on his behavior at the second stage and the behavior of his “neighbors” in the network formed at the first stage of the game.

A similar setting, modeled as a two-stage network game, is considered in (Goyal and Vega-Redondo, 2005; Jackson and Watts, 2002). In these papers, the authors consider a model in which players form a network at the first stage, and at the second stage players are involved in a $2 \times 2$ coordination game which is the same for all players. Another two-stage model on a network, a location–price game, is considered in (Lu et al., 2010).

This model is based on papers studying processes of network formation and network evolution during the game, as well as research on allocation rules and their properties for a fixed network (Bala and Goyal, 2000; Dutta et al., 1998; Goyal and Vega-Redondo, 2005; Feng et al., 2014; Igarashi and Yamamoto, 2013). In (Bala and Goyal, 2000) a Nash network is considered as a solution for the strategic setting, and the network evolution is modeled as a convergent stochastic process. In (Petrosyan and Sedakov, 2009) the network evolution is constructed as the result of players’ actions, and the solution is considered in the sense of subgame perfectness. Other solution concepts for games on networks are studied regardless of network formation mechanisms (Dutta et al., 1998; Jackson and Wolinsky, 1996).

⋆ This research was supported by the Russian Foundation for Basic Research (grant No 17-51-53030).

In the papers mentioned above, the problem of time consistency is not studied. This problem was first raised by Petrosyan (Petrosyan, 1977) for cooperative differential games, and later a special mechanism of stage payments, an imputation distribution procedure, was designed to overcome time inconsistency of cooperative solution concepts (Petrosyan and Danilov, 1979). Time-consistent solutions for differential games under deterministic and stochastic dynamics can be found in (Yeung and Petrosyan, 2006; Petrosjan, 2006; Yeung and Petrosyan, 2012). It has been shown that the time consistency problem arises not only in cooperative differential games but also in other classes of cooperative dynamic games. In (Petrosyan, Sedakov and Bochkarev, 2013) it is shown that the time inconsistency problem also arises in cooperative two-stage network games, where time inconsistency of the Shapley value is proved. A stricter property of cooperative solution concepts, the property of strong time consistency (Petrosyan, 2005), is also studied.

1.1. The model

The following model was proposed in (Petrosyan, Sedakov and Bochkarev, 2013). Consider the model in detail. Let $N = \{1,\ldots,n\}$ be a finite set of players who can interact with each other. The interaction between two players means the existence of a link connecting them and, therefore, communication between them. On the contrary, the absence of a link connecting players means the absence of any communication between the players. Under these assumptions cooperation of players is said to be restricted by a communication structure (or a network). A pair $(N, g)$ is called a network, where $N$ is a set of nodes (it coincides with the set of players) and $g \subseteq N \times N$ is a set of links. If a pair $(i, j) \in g$, there is a link connecting players $i$ and $j$ and, therefore, generating communication between the players in the network.
Below, to simplify notation, the network is identified with the set of its links and denoted by $g$, and a link $(i, j)$ in the network is denoted by $ij$. It is supposed that all links are undirected, so $ij = ji$.

Consider a two-stage problem. At the first stage each player chooses his partners, i.e., the players with whom he wants to form links. By choosing partners and establishing links, players form a network. Having formed the network, each player chooses a control influencing his payoff at the second stage. Consider the problem in detail.

First stage: network formation. Given the player set $N$, define the link formation rule in a standard way: links, and therefore a network, are formed as a result of players’ simultaneous choices. Let $M_i \subseteq N \setminus \{i\}$ be the set of players to whom player $i \in N$ can offer a mutual link, and let $a_i \in \{0,\ldots,n-1\}$ be the maximal number of links which player $i$ can maintain (and, therefore, can offer). The behavior of player $i \in N$ at the first stage is an $n$-dimensional profile $g_i = (g_{i1},\ldots,g_{in})$ whose entries are defined as

$$g_{ij} = \begin{cases} 1, & \text{if player } i \text{ offers a link to } j \in M_i, \\ 0, & \text{otherwise,} \end{cases} \tag{1}$$

subject to the constraint

$$\sum_{j \in N} g_{ij} \le a_i. \tag{2}$$

The condition $g_{ii} = 0$, $i \in N$, excludes loops from the network, whereas (2) shows that the number of possible links is limited. If $M_i = N \setminus \{i\}$, player $i$ can offer a link to any player, whereas if $a_i = n - 1$, he can maintain any number of links.

The set of all possible behaviors of player $i \in N$ at the first stage satisfying (1)–(2) is denoted by $G_i$. The Cartesian product $\prod_{i \in N} G_i$ is the set of behavior profiles at the first stage. It is supposed that players choose their behaviors at the first stage simultaneously and independently of each other. In particular, player $i \in N$ chooses $g_i \in G_i$, and as a result the behavior profile $(g_1,\ldots,g_n)$ is formed. Under these assumptions, an undirected link $ij = ji$ is established in network $g$ if and only if $g_{ij} = g_{ji} = 1$, i.e., $g$ consists only of mutual links, those offered by both players.

Second stage: choosing control. Having formed the network, players choose their behaviors at the second stage. Define the neighbors of player $i$ in network $g$ as the elements of the set $N_i(g) = \{j \in N \setminus \{i\} : ij \in g\}$. Players are allowed to reconsider their first-stage decisions: they are given the opportunity to break previously selected links. Define the components of an $n$-dimensional profile $d_i(g)$ as follows:

$$d_{ij}(g) = \begin{cases} 1, & \text{if player } i \text{ does not break the link formed at the first stage with player } j \in N_i(g) \text{ in network } g, \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$
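As an illustration of the first-stage rule, the following minimal Python sketch builds the network from the players’ offers and lists neighbors. It is not taken from the cited papers: the function names, the 0-based player numbering, and the three-player data are assumptions made only for this example.

```python
from itertools import combinations

def form_network(offers, allowed, capacity):
    """Return the set of undirected links formed from first-stage offers.

    offers[i][j] is 1 if player i offers a link to player j (rule (1)),
    allowed[i] plays the role of M_i and capacity[i] the role of a_i.
    """
    n = len(offers)
    for i in range(n):
        # Offers may only go to players in M_i, which never contains i.
        for j in range(n):
            if offers[i][j] == 1 and j not in allowed[i]:
                raise ValueError(f"player {i} may not offer a link to {j}")
        # Constraint (2): at most a_i offers.
        if sum(offers[i]) > capacity[i]:
            raise ValueError(f"player {i} exceeds the link limit")
    # A link ij is established only if both players offered it.
    return {frozenset((i, j)) for i, j in combinations(range(n), 2)
            if offers[i][j] == 1 and offers[j][i] == 1}

def neighbors(i, links, n):
    """N_i(g): the players connected to i in the network."""
    return {j for j in range(n) if j != i and frozenset((i, j)) in links}

# Hypothetical three-player example (players are numbered 0, 1, 2).
offers = [[0, 1, 1],
          [1, 0, 0],
          [0, 1, 0]]
allowed = [{1, 2}, {0, 2}, {0, 1}]
capacity = [2, 2, 2]
g = form_network(offers, allowed, capacity)
```

In this example only players 0 and 1 offer a link to each other, so the one-sided offers from 0 to 2 and from 2 to 1 do not create links.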

Elements di(g) satisfying (23) are denoted by Di(g), i N. It is obvious that profile ∈ (d1(g),...,dn(g)) affects network g formed at the first stage by removing some links: profile (d1(g),...,dn(g)) applied to network g changes its structure and forms a new network, denoted by gd. Network gd is obtained from g by removing links ij such that either dij (g)=0 or dji(g) = 0. Moreover, at the second stage player i N chooses control u from a finite set ∈ i Ui. For example, in (Goyal and Vega-Redondo, 2005; Jackson and Watts, 2002) Ui is a set of strategies of player i in a 2 2 symmetric coordination game in which i N plays with neighbors; in (Corbae× and Duffy, 2008) U is a set of strategies of ∈ i player i in a 2 2 stag-hunt game; in (Xie et al., 2013) Ui is a set of strategies in a prisoner’s dilemma× game (“cooperate” and “defect”) in which i also plays with neighbors in the network. The sets U1,...,Un are not specified, thus they could be of a general structure. We only claim the sets are finite. Then, behavior of player i N at the second stage is a pair (d (g),u ): it defines, ∈ i i on the one hand, links to be removed di(g), and, on the other hand, control ui. A payoff function K of player i N depends on both new network gd and i ∈ controls ui, i N. Specifically, it depends on player i’s behavior at the second ∈ d d stage as well as behavior of his neighbors in network g , i.e., Ki(ui,uNi(g )) is a d d nonnegative real-valued function defined on Ui j Ni(g ) Uj. Here uNi(g ) denotes × ∈ a profile of controls u chosen by all neighbors j N (gd) of player i in network gd. j Q i Assume that functions K , i N, satisfy the following∈ property: i ∈ (P): For any two networks g and g′ s.t. g′ g, controls (u ,u ) U ⊆ i Ni(g) ∈ i × > ′ j Ni(g) Uj , and player i, the inequality Ki(ui,uNi(g)) Ki(ui,uNi(g )) holds.∈ Q 1.2. Cooperation in two-stage network games Now we describe the cooperation in two-stage network game. 
We will answer three main questions: What is a cooperative solution in the game? Can it be realized? Is it strong time consistent? To answer all these questions, first we start analyzing an additional case which results will be used below. Cooperation in Dynamic Network Games 45

Two-stage Network Game: Cooperation at the Second Stage

In this section it is supposed that the players’ behavior profile $(g_1,\ldots,g_n)$, $g_i \in G_i$, $i \in N$, which is chosen at the first stage, is fixed, and it forms network $g$. At the second stage players jointly choose $n$ pairs $(d_i^*(g), u_i^*) \in D_i(g) \times U_i$, $i \in N$, maximizing the sum of players’ payoffs.

Proposition 1 (Petrosyan, Sedakov and Bochkarev, 2013). The maximal sum of players’ payoffs can be calculated by the formula:

$$\sum_{i \in N} K_i(u_i^*, u^*_{N_i(g)}) = \max_{u_i \in U_i,\ i \in N} \sum_{i \in N} K_i(u_i, u_{N_i(g)}). \tag{4}$$

The next problem is to allocate the maximal sum of players’ payoffs among the players. After the allocation procedure, the game ends. To allocate the maximal sum of players’ payoffs, a cooperative TU-game $(N, v(g))$ is constructed. The characteristic function $v(g)$ in this game is defined for any subset $S \subseteq N$ (a coalition) as follows:

$$v(g,N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g)}),$$

$$v(g,S) = \max_{u_i \in U_i,\ i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}),$$

$$v(g,\emptyset) = 0,$$

provided that network $g$ is fixed.

In (Petrosyan, Sedakov and Bochkarev, 2013; Gao et al., 2017) the characteristic function was defined as in (Von Neumann and Morgenstern, 1944). Following this idea, the value $v(g,S)$ is the maximal payoff that coalition $S$ can guarantee for itself (the maxmin value) in a zero-sum game between two players: coalition $S$ maximizing its payoff, and its complement $N \setminus S$ minimizing the payoff to $S$, provided that network $g$ is fixed. However, in (Petrosyan, Sedakov and Bochkarev, 2013) a simplified form of such a characteristic function was proposed for the first time.

Proposition 2. If payoff functions $K_i$, $i \in N$, are nonnegative and satisfy property (P), the maximal payoff that coalition $S$ can guarantee for itself is calculated by the formula:

$$v(g,S) = \max_{u_i \in U_i,\ i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}). \tag{5}$$

Note that under these assumptions the value $v(g,S)$, $S \subset N$, can be calculated as a solution of the maximization problem (5). Finding this solution is simpler than solving the maxmin problem in the general case. For a singleton $\{i\}$, its value is defined in the following way:

$$v(g,\{i\}) = \max_{u_i \in U_i} K_i(u_i), \tag{6}$$

and it does not depend on the network.

An imputation is an $n$-dimensional profile $\xi(g) = (\xi_1(g),\ldots,\xi_n(g))$ satisfying both the efficiency condition and the individual rationality condition:

$$\sum_{i \in N} \xi_i(g) = v(g,N),$$

$$\xi_i(g) \ge v(g,\{i\}), \quad i \in N.$$
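To make formula (5) and the two imputation conditions concrete, here is a small brute-force Python sketch. It rests on an assumption that is not in the source: each payoff $K_i$ is taken to be the sum, over the neighbors of $i$, of a common nonnegative pairwise payoff, which is one simple form satisfying property (P). The function names and the numerical data are hypothetical.

```python
from itertools import product

def coalition_value(S, links, controls, pair_payoff):
    """v(g, S) by formula (5): maximize the coalition's total payoff,
    counting only those neighbors that belong to S."""
    members = sorted(S)
    best = 0  # payoffs are nonnegative and v(g, empty set) = 0
    for profile in product(controls, repeat=len(members)):
        u = dict(zip(members, profile))
        # Assumed additive form: K_i sums pair_payoff over neighbors in S.
        total = sum(pair_payoff[u[i]][u[j]]
                    for i in members for j in members
                    if i != j and frozenset((i, j)) in links)
        best = max(best, total)
    return best

def is_imputation(xi, links, controls, pair_payoff, tol=1e-9):
    """Check efficiency and individual rationality of xi in (N, v(g))."""
    players = range(len(xi))
    total = coalition_value(players, links, controls, pair_payoff)
    efficient = abs(sum(xi) - total) <= tol
    rational = all(
        xi[i] >= coalition_value({i}, links, controls, pair_payoff) - tol
        for i in players)
    return efficient and rational

# Hypothetical data: the path 0 - 1 - 2 and a 2 x 2 coordination game.
links = {frozenset((0, 1)), frozenset((1, 2))}
controls = [0, 1]
pair_payoff = [[2, 0],
               [0, 1]]
v_N = coalition_value({0, 1, 2}, links, controls, pair_payoff)
v_01 = coalition_value({0, 1}, links, controls, pair_payoff)
v_02 = coalition_value({0, 2}, links, controls, pair_payoff)
```

Under the additive assumption a singleton has no neighbors inside its own coalition, so its value is zero regardless of the network, in line with the remark after (6).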

Let the set of imputations in game $(N, v(g))$ be denoted by $I(v(g))$. A cooperative solution concept in TU-game $(N, v(g))$ with fixed network $g$ is a rule that uniquely assigns a subset $CSC(v(g)) \subseteq I(v(g))$ to game $(N, v(g))$. For example, if the cooperative solution concept is the core $C(v(g))$, then

$$CSC(v(g)) = C(v(g)) = \Big\{ \xi(g) \in I(v(g)) : \sum_{i \in S} \xi_i(g) \ge v(g,S),\ S \subset N \Big\}.$$

Two-stage Network Game: Cooperation at Both Stages

Suppose now that players jointly choose their behaviors at both stages of the game. Acting as one player and choosing $g_i \in G_i$, $u_i \in U_i$, $i \in N$, the grand coalition $N$ maximizes the value:

$$\sum_{i \in N} K_i(u_i, u_{N_i(g)}). \tag{7}$$

Let the maximum be attained when the players’ behavior profiles $g_i^*$, $u_i^*$, $i \in N$, are chosen, where profile $(g_1^*,\ldots,g_n^*)$ forms network $g^*$. Here, as in (4), to maximize the sum of players’ payoffs, players should not remove links from the network. Therefore, any profile $d_i(g)$ coincides with $g_i$ for any player $i \in N$ and any network $g$. Let

$$\sum_{i \in N} K_i(u_i^*, u^*_{N_i(g^*)}) = \max_{g_i \in G_i,\ i \in N}\ \max_{u_i \in U_i,\ i \in N} \sum_{i \in N} K_i(u_i, u_{N_i(g)}).$$

Again, to allocate the maximal sum of players’ payoffs according to some imputation, a cooperative TU-game $(N, V)$ is constructed. The characteristic function $V$ is defined similarly to function $v(g)$ considered in Subsection 1.2.

Proposition 3. In the cooperative two-stage network game the superadditive characteristic function $V(\cdot)$ in the sense of von Neumann and Morgenstern is defined as:

$$V(N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(g^*)}),$$

$$V(S) = \max_{g_i \in G_i,\ i \in S}\ \max_{u_i \in U_i,\ i \in S} \sum_{i \in S} K_i(u_i, u_{N_i(g) \cap S}),$$

$$V(\emptyset) = 0.$$

For a singleton $\{i\}$, its value is defined in the following way:

$$V(\{i\}) = \max_{u_i \in U_i} K_i(u_i). \tag{8}$$

An imputation in the cooperative two-stage network game is an $n$-dimensional profile $\xi = (\xi_1,\ldots,\xi_n)$ satisfying $\sum_{i \in N} \xi_i = V(N)$ and $\xi_i \ge V(\{i\})$ for all $i \in N$. Let the set of imputations in game $(N, V)$ be denoted by $I(V)$.

A cooperative solution concept in cooperative TU-game $(N, V)$ is a rule that uniquely assigns a subset $CSC(V) \subseteq I(V)$ to game $(N, V)$. For example, if the cooperative solution concept is the core $C(V)$, then

$$CSC(V) = C(V) = \Big\{ \xi \in I(V) : \sum_{i \in S} \xi_i \ge V(S),\ S \subset N \Big\}.$$

1.3. Time-consistent and strong time-consistent cooperative solutions

The problem of time consistency and strong time consistency was systematically developed in (Petrosyan and Sedakov, 2014; Gao et al., 2017). Suppose that at the beginning of the game players jointly decide to choose behavior profiles $g_i^*$, $u_i^*$, $i \in N$, to maximize the sum in (7), and then allocate it according to a specified cooperative solution concept $CSC(V)$ which realizes an imputation $\xi = (\xi_1,\ldots,\xi_n)$. It means that in the cooperative two-stage network game player $i \in N$ should receive the amount $\xi_i$ as his payoff. What will happen if after the first stage (after choosing the profiles $g_1^*,\ldots,g_n^*$) player $i \in N$ recalculates the imputation according to the same cooperative solution concept? The behavior profile $(g_1^*,\ldots,g_n^*)$ at the first stage forms network $g^*$; therefore, after recalculation of the imputation (according to the same cooperative solution concept as $\xi$), player $i$’s payoff will be $\xi_i(g^*)$, based on the values of characteristic function $v(g^*,S)$ for all $S \subseteq N$.

The definition of a time-consistent imputation was adapted to two-stage network games in (Petrosyan, Sedakov and Bochkarev, 2013). In the aforementioned paper it was shown for the first time that the Shapley value, the τ-value, and the core are time-inconsistent cooperative solutions in this class of games.

Definition 1. An imputation $\xi \in CSC(V)$ is time consistent if there exists an imputation $\xi(g^*) \in CSC(v(g^*))$ such that the following equality holds for all players:

$$\xi_i = \xi_i(g^*), \quad i \in N. \tag{9}$$

A cooperative solution concept $CSC(V)$ is time consistent if any imputation $\xi \in CSC(V)$ is time consistent.

Equality (9) means that if we choose a cooperative solution concept $CSC(V)$ at the first stage and according to it calculate the imputation $\xi$, defining players’ payoffs, and then at the second stage recalculate players’ payoffs according to the same cooperative solution concept $CSC(v(g^*))$, i.e., calculate a new imputation $\xi(g^*)$ subject to the formed network $g^*$, players’ payoffs will not change.

Proposition 4. Any cooperative solution concept based only on values $V(N)$, $V(\{i\})$, $i \in N$, is time consistent.

Remark 1. Using the previous proposition, note that the CIS-value $(CIS_1,\ldots,CIS_n)$ (Driessen and Funaki, 1991) calculated by the formula

$$CIS_i = V(\{i\}) + \frac{V(N) - \sum_{j \in N} V(\{j\})}{|N|}, \quad i \in N,$$

is a time-consistent cooperative solution concept.
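The CIS-value formula of Remark 1 can be evaluated directly. The short Python sketch below is only an illustration; the function name and the numbers are hypothetical.

```python
def cis_value(V_N, V_single):
    """CIS-value: each player gets V({i}) plus an equal share of the surplus
    V(N) minus the sum of all singleton values."""
    n = len(V_single)
    surplus = V_N - sum(V_single)
    return [v_i + surplus / n for v_i in V_single]

# Hypothetical values: V(N) = 12 and V({1}), V({2}), V({3}) = 1, 2, 3.
cis = cis_value(12.0, [1.0, 2.0, 3.0])
```

The computation uses only $V(N)$ and the singleton values $V(\{i\})$, which is exactly the situation covered by Proposition 4.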

Since in most games condition (9) is not satisfied, the time consistency problem arises: player $i \in N$, who initially expected his payoff to be equal to $\xi_i$, can receive a different payoff $\xi_i(g^*)$. To avoid such a situation in the game, a stage payments mechanism, an imputation distribution procedure (Petrosyan and Danilov, 1979) for $\xi$, is proposed. The definition of the imputation distribution procedure was also adapted to two-stage network games.

Definition 2. An imputation distribution procedure for $\xi$ in the cooperative two-stage network game is a matrix

$$\beta = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \vdots & \vdots \\ \beta_{n1} & \beta_{n2} \end{pmatrix},$$

where $\xi_i = \beta_{i1} + \beta_{i2}$, $i \in N$.

The value $\beta_{ik}$ is a payment to player $i$ at stage $k = 1, 2$. Therefore, the following payment scheme is applied: player $i \in N$ receives payment $\beta_{i1}$ at the first stage of the game and payment $\beta_{i2}$ at the second stage, so that his total payment over both stages, $\beta_{i1} + \beta_{i2}$, equals the component $\xi_i$ of the allocation, which he initially expected to get in the game as his payoff.

Definition 3. Imputation distribution procedure $\beta$ for $\xi$ is time consistent if

$$\xi_i - \beta_{i1} = \xi_i(g^*), \quad \text{for all } i \in N.$$

It is obvious that the time-consistent imputation distribution procedure for $\xi = (\xi_1,\ldots,\xi_n)$ in the cooperative two-stage network game can be defined as follows:

$$\beta_{i1} = \xi_i - \xi_i(g^*), \tag{10}$$

$$\beta_{i2} = \xi_i(g^*), \quad i \in N.$$

If the cooperative solution concept $CSC(V)$ assigns multiple allocations (for example, the core), a stricter property can be used: strong time consistency. In (Gao et al., 2017) the definition of a strongly time-consistent solution (the core) was adapted, and the corresponding definitions and propositions were presented.

Definition 4. An imputation $\xi \in CSC(V)$ is strong time consistent if the following inclusion is satisfied:

$$CSC(v(g^*)) \subseteq CSC(V). \tag{11}$$

A cooperative solution concept $CSC(V)$ is strong time consistent if any imputation $\xi \in CSC(V)$ is strong time consistent.

Therefore, the core $C(V)$ is strong time consistent if $C(v(g^*)) \subseteq C(V)$. The next result directly follows from Proposition 4.

Proposition 5. Any cooperative solution concept based only on values $V(N)$, $V(\{i\})$, $i \in N$, is strong time consistent.

The proof of the statement is very similar to the proof of Proposition 4, replacing equality (9) with inclusion (11). For cooperative solution concepts which are not strong time consistent, one can also introduce an imputation distribution procedure.

Definition 5. Imputation distribution procedure $\beta$ for $\xi$ is strong time consistent if

$$(\beta_{11},\ldots,\beta_{n1}) \oplus CSC(v(g^*)) \subseteq CSC(V), \tag{12}$$

where $a \oplus A = \{a + a' : a' \in A\}$, $a \in \mathbb{R}^n$, $A \subset \mathbb{R}^n$.
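Formula (10) for the time-consistent imputation distribution procedure translates into a few lines of Python. The sketch below is an illustration only; the function name and the two imputations are hypothetical numbers chosen so that both sum to the same total.

```python
def time_consistent_idp(xi, xi_g_star):
    """Formula (10): one row per player, columns are stage payments
    (beta_i1, beta_i2) = (xi_i - xi_i(g*), xi_i(g*))."""
    return [(x - y, y) for x, y in zip(xi, xi_g_star)]

xi = [5.0, 3.0, 4.0]          # hypothetical imputation in game (N, V)
xi_g_star = [4.0, 4.5, 3.5]   # hypothetical imputation in game (N, v(g*))
beta = time_consistent_idp(xi, xi_g_star)
```

Because both imputations here distribute the same total, the first-stage payments sum to zero: the first stage only redistributes payoff among the players, and some players (player 2 in this example) make a payment rather than receive one.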

Unfortunately, for strong time-consistent imputation distribution procedures it is impossible even to derive formulas similar to (10) in general. However, for the core, one can provide conditions for the existence of strong time-consistent imputation distribution procedures. Note that inclusion (12) for the core can be rewritten as

$$(\beta_{11},\ldots,\beta_{n1}) \oplus C(v(g^*)) \subseteq C(V). \tag{13}$$

Proposition 6 (Gao et al., 2017). Let a set $C(W)$ be an analog of the core in the game with characteristic function $W(S) = V(S) - v(g^*,S)$, $S \subseteq N$, i.e.,

$$C(W) = \Big\{ (\xi_1,\ldots,\xi_n) : \sum_{i \in S} \xi_i \ge W(S),\ S \subset N;\ \sum_{i \in N} \xi_i = W(N) = 0 \Big\},$$

and let this set be non-empty. Then an imputation distribution procedure $\beta$ for an imputation from the core $C(V)$ satisfying the conditions

$$(\beta_{11},\ldots,\beta_{n1}) \in C(W), \tag{14}$$

$$(\beta_{12},\ldots,\beta_{n2}) \in C(v(g^*)),$$

is strong time consistent.
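Conditions (14) can be checked numerically once the two characteristic functions are tabulated. The following Python sketch is a hedged illustration: the helper names, the dictionary representation of the characteristic functions (keyed by tuples of 0-based players), and all numbers are assumptions made for this example.

```python
from itertools import combinations

def proper_coalitions(n):
    """All non-empty coalitions S that are strict subsets of N."""
    return [S for size in range(1, n) for S in combinations(range(n), size)]

def in_core(x, value, total, tol=1e-9):
    """Efficiency plus coalition rationality for every proper coalition."""
    if abs(sum(x) - total) > tol:
        return False
    return all(sum(x[i] for i in S) >= value[S] - tol
               for S in proper_coalitions(len(x)))

def satisfies_conditions_14(beta1, beta2, V, v_star, total):
    """Conditions (14): beta1 in C(W), W = V - v(g*), and beta2 in C(v(g*))."""
    W = {S: V[S] - v_star[S] for S in v_star}
    return in_core(beta1, W, 0.0) and in_core(beta2, v_star, total)

# Hypothetical three-player data with V(N) = v(g*, N) = 8.
v_star = {(0,): 0, (1,): 0, (2,): 0, (0, 1): 4, (0, 2): 0, (1, 2): 4}
V_same = dict(v_star)          # V coincides with v(g*) on every coalition
V_larger = dict(v_star)
V_larger[(0, 2)] = 4           # coalition {0, 2} is worth more under V
beta1, beta2 = [0.0, 0.0, 0.0], [2.0, 4.0, 2.0]
ok = satisfies_conditions_14(beta1, beta2, V_same, v_star, 8.0)
not_ok = satisfies_conditions_14(beta1, beta2, V_larger, v_star, 8.0)
```

In the second case $W(\{0,2\}) = 4 > 0$ while the first-stage payments to players 0 and 2 sum to zero, so the first condition in (14) fails for this choice of $\beta$.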

1.4. Two-stage games on directed networks

Now we consider the case of directed networks (Petrosyan and Sedakov, 2014). Since the network can be directed, the characteristic function has to be redefined. Let the resulting network $g$ consist of directed links $(i, j)$ s.t. $g_{ij} = 1$. Define the closure of network $g$ as an undirected network $\bar{g}$ where $\bar{g}_{ij} = \max\{g_{ij}, g_{ji}\}$. Similarly to the previous case, the payoff function $K_i$ of player $i$ depends on network $g^d$, his control $u_i$ and the controls $u_j$, $j \in N_i(\bar{g}^d)$, of his neighbors in the closure $\bar{g}^d$:

$$K_i(u_i, u_{N_i(\bar{g}^d)}) : U_i \times \prod_{j \in N_i(\bar{g}^d)} U_j \to \mathbb{R}, \quad i \in N.$$

When players act cooperatively, they should choose $g_i \in G_i$ and $(d_i(g), u_i) \in D_i(g) \times U_i$, $i \in N$, to maximize the joint payoff:

$$\sum_{i \in N} K_i(u_i, u_{N_i(\bar{g}^d)}). \tag{15}$$

Again, to allocate the maximal sum of players’ payoffs according to some solution concept, one needs to construct a cooperative TU-game $(N, V)$. Note that

$$V(N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(\bar{g}^*)}).$$

Consider a non-empty coalition $S \subset N$. Denote by $g_S$ the network formed by profiles $g_i$, $i \in N$, s.t. $g_j = (0,\ldots,0)$ for all $j \in N \setminus S$. Let $\bar{g}_S$ be the closure of $g_S$. For any controls $u_i$, $i \in S$, let the controls $\tilde{u}_j(u_S)$, $j \in N \setminus S$, where $u_S = \{u_i\}$, $i \in S$, solve the following optimization problem:

$$\sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S) \cap S}, \tilde{u}_{(N \setminus S) \cap N_i(\bar{g}_S)}(u_S)\big) = \min_{u_j,\ j \in (N \setminus S) \cap N_i(\bar{g}_S)} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S) \cap S}, u_{(N \setminus S) \cap N_i(\bar{g}_S)}\big).$$

Here $u_{N_i(\bar{g}_S) \cap S}$ is the profile of controls chosen by all neighbors of player $i$ from coalition $S$ in the network $\bar{g}_S$, and $\tilde{u}_{(N \setminus S) \cap N_i(\bar{g}_S)}(u_S)$ is a profile of controls chosen by all players from coalition $N \setminus S$ who are neighbors of player $i$ in the network $\bar{g}_S$.

The next proposition is the analog of Proposition 3.
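The closure operation and the restricted network $g_S$ used in this subsection can be sketched in Python as follows. The adjacency-matrix representation, the function names, and the three-player network are assumptions made for illustration.

```python
def closure(g):
    """Undirected closure: g_bar[i][j] = max(g[i][j], g[j][i])."""
    n = len(g)
    return [[max(g[i][j], g[j][i]) for j in range(n)] for i in range(n)]

def restrict_to_coalition(g, S):
    """g_S: players outside S offer no links, so their rows are zero."""
    return [row[:] if i in S else [0] * len(row) for i, row in enumerate(g)]

# Hypothetical directed network on players 0, 1, 2: links 0 -> 1 and 1 -> 2.
g = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
g_bar = closure(g)
g_S_bar = closure(restrict_to_coalition(g, {0}))
```

For the coalition consisting of player 0 alone, the closure still connects player 0 with player 1, who is outside the coalition. This is why the controls of outside neighbors enter the minimization in the definition of $\tilde{u}_j(u_S)$.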

Proposition 7 (Petrosyan and Sedakov, 2014). Suppose that functions $K_i$, $i \in N$, are non-negative and satisfy the property (P). Then for all $S \subset N$ we have

$$V(S) = \max_{(g_i, u_i) \in G_i \times U_i,\ i \in S} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S) \cap S}, \tilde{u}_{(N \setminus S) \cap N_i(\bar{g}_S)}(u_S)\big).$$

In a similar way one can determine the characteristic function $v(g^*,S)$ for $S \subseteq N$. Note that

$$v(g^*,N) = \sum_{i \in N} K_i(u_i^*, u^*_{N_i(\bar{g}^*)}) = V(N).$$

The following result is the analog of Proposition 2 for the case of directed networks.

Proposition 8. If functions $K_i$, $i \in N$, are non-negative and satisfy the property (P), the value $v(g^*,S)$ can be calculated by the formula

$$v(g^*,S) = \max_{u_i \in U_i,\ i \in S} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S^*) \cap S}, \tilde{\tilde{u}}_{(N \setminus S) \cap N_i(\bar{g}_S^*)}(u_S)\big),$$

where $\tilde{\tilde{u}}_j(u_S)$, $j \in N \setminus S$, solve the following optimization problem:

$$\sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S^*) \cap S}, \tilde{\tilde{u}}_{(N \setminus S) \cap N_i(\bar{g}_S^*)}(u_S)\big) = \min_{u_j,\ j \in (N \setminus S) \cap N_i(\bar{g}_S^*)} \sum_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}_S^*) \cap S}, u_{(N \setminus S) \cap N_i(\bar{g}_S^*)}\big)$$

and $\bar{g}_S^*$ is the closure of network $g_S^*$, formed by profiles $g_i^*$, $i \in N$, s.t. $g_j^* = (0,\ldots,0)$ for all $j \in N \setminus S$.

1.5. Two-stage games with pairwise interactions

In (Bulgakova and Petrosyan, 2015; Bulgakova and Petrosyan, 2016) two-stage cooperative network games with pairwise interactions were proposed. The first stage is a network formation stage. At the second stage players play bimatrix games with their partners according to the network realized at the first stage.

Description of the model. The model under consideration was introduced in (Bulgakova and Petrosyan, 2015). Let $N$ be a finite set of players, $|N| = n \ge 2$. At the first stage $z_1$ each player $i \in N$ chooses his behavior $b_i^1$, an $n$-dimensional vector of offers to connect with other players. The result of the first stage is a network $g(b_1^1,\ldots,b_n^1)$. At the second stage $z_2(g)$, which depends on the network chosen at the first stage, neighbors in the network play pairwise simultaneous bimatrix games; after that they get their payoffs and the game ends.

Consider the first stage of the game. As mentioned, players at the first stage choose behaviors $b_i^1 = (b_{i1}^1,\ldots,b_{in}^1)$ with components

$$b_{ij}^1 = \begin{cases} 1, & \text{if } j \in M_i, \\ 0, & \text{otherwise.} \end{cases} \tag{16}$$

A link $ij$ is established if and only if $b_{ij}^1 = b_{ji}^1 = 1$. For brevity, denote $N_i(g)$ by $N_i$. After the network is formed, players pass to the second stage $z_2(g)$. At the second stage an $n$-person game between all participants of the network takes place. This game is a family of simultaneous pairwise bimatrix games $\gamma_{ij}$ between neighbors. Namely, let $i \in N$, $j \in N$, $i \ne j$, $j \in N_i$; then $i$ plays with $j$ a bimatrix game $\gamma_{ij}$ with payoff matrices $A_{ij}$ and $B_{ij}$ for players $i$ and $j$, respectively,

$$A_{ij} = \begin{pmatrix} a^{ij}_{11} & a^{ij}_{12} & \cdots & a^{ij}_{1k} \\ a^{ij}_{21} & a^{ij}_{22} & \cdots & a^{ij}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a^{ij}_{m1} & a^{ij}_{m2} & \cdots & a^{ij}_{mk} \end{pmatrix}, \qquad B_{ij} = \begin{pmatrix} b^{ji}_{11} & b^{ji}_{12} & \cdots & b^{ji}_{1k} \\ b^{ji}_{21} & b^{ji}_{22} & \cdots & b^{ji}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ b^{ji}_{m1} & b^{ji}_{m2} & \cdots & b^{ji}_{mk} \end{pmatrix},$$

$$a^{ij}_{pl} \ge 0,\quad b^{ji}_{pl} \ge 0,\quad p = 1,\ldots,m;\ l = 1,\ldots,k.$$

Characteristic function in two-stage game. The characteristic function in the two-stage game with pairwise interactions under consideration is defined using the ideas from (Petrosyan, Sedakov and Bochkarev, 2013). Moreover, due to the structure of interactions, the expression of the characteristic function is found in a closed form.

Game $\Gamma_{z_2}$ can be considered in cooperative form. The characteristic function is defined in a straightforward way. Denote the maximal guaranteed gain (maxmin) of player $i$ ($j$) against neighbor $j$ ($i$) as:

$$w_{ij} = \max_{p} \min_{l} a^{ij}_{pl}, \qquad w_{ji} = \max_{l} \min_{p} b^{ji}_{pl}, \qquad p = 1,\ldots,m,\ l = 1,\ldots,k. \tag{17}$$
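The maxmin values in (17) are easy to compute for small matrices. The Python sketch below is illustrative only: the function names and the $2 \times 2$ payoff matrices are hypothetical, with rows indexed by the strategy $p$ of player $i$ and columns by the strategy $l$ of player $j$.

```python
def maxmin_values(A, B):
    """Return (w_ij, w_ji) from (17) for payoff matrices A (player i)
    and B (player j)."""
    w_ij = max(min(row) for row in A)        # i picks a row, j a column
    w_ji = max(min(col) for col in zip(*B))  # j picks a column, i a row
    return w_ij, w_ji

def best_joint_payoff(A, B):
    """max over (p, l) of a_pl + b_pl, the pair's cooperative payoff."""
    return max(a + b for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

# Hypothetical prisoner's-dilemma-like game with nonnegative payoffs.
A = [[3, 0],
     [5, 1]]
B = [[3, 5],
     [0, 1]]
w_ij, w_ji = maxmin_values(A, B)
joint = best_joint_payoff(A, B)
```

In this example each player can guarantee only 1 on his own, while the pair can jointly obtain 6, which is the gap that the characteristic function of the pairwise game captures.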

The values of the characteristic function $v(z_2; S)$ are equal to:

$$v(z_2; \{ij\}) = \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big) + \sum_{r \in N_i \setminus \{j\}} w_{ir} + \sum_{q \in N_j \setminus \{i\}} w_{jq}, \quad j \in N_i,$$

$$v(z_2; \{ij\}) = v(z_2; \{ji\}) = 0, \quad j \in N \setminus N_i,$$

$$v(z_2; \{i\}) = \sum_{j \in N_i} w_{ij},$$

$$v(z_2; S) = \frac{1}{2} \sum_{i,j \in S,\ j \in N_i} v(z_2; \{ij\}) + \sum_{i \in S,\ k \in N_i \setminus S} w_{ik}, \quad S \subset N,$$

$$v(z_2; N) = \frac{1}{2} \sum_{i=1,\ i \ne j,\ j \in N_i}^{n} \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big).$$

In this case we define the value of the characteristic function for coalition $S \subset N$ as the lower value of the zero-sum game between $S$ and $N \setminus S$ in game $\Gamma_{z_2}$, and superadditivity follows from this.

As before, for $S \subset N$ define the characteristic function $v(\bar{z}_1; S)$ as the lower value of the zero-sum game between coalition $S$, acting as player I (maximizing), and coalition $N \setminus S$, acting as player II (minimizing), where the payoff of player $S$ is the sum of payoffs of the players in $S$, and a strategy of player $S$ is an element of the Cartesian product of the strategy sets of the players from $S$. For the minimizing player the best behavior is to eliminate all connections with the maximizing player (because the payoffs for each connection are nonnegative). Hence we get:

$$v(\bar{z}_1; \{i\}) = 0,$$

$$v(\bar{z}_1; \emptyset) = 0,$$

$$v(\bar{z}_1; \{ij\}) = \begin{cases} \max_{p,l}\big(a^{ij}_{pl} + b^{ji}_{pl}\big), & j \in N_i, \\ 0, & j \in N \setminus N_i, \end{cases}$$

$$v(\bar{z}_1; S) = \sum_{i \in S,\ j \in N_i \cap S} \frac{v(\bar{z}_1; \{ij\})}{2}, \quad S \subset N,$$

$$v(\bar{z}_1; N) = v(\bar{z}_2; N).$$

1.6. The core in two-stage three-person game

With the use of the characteristic function considered in (Bulgakova and Petrosyan, 2015), we examine the core as a solution in a three-person game. Define also the core $C(\bar{z}) \subset I(v)$ in game $\Gamma$ and suppose that $C(\bar{z}) \ne \emptyset$ for $\bar{z} = \bar{z}_1, \bar{z}_2$. For the second stage $z_2$ we have the following values of the characteristic function:

$$v(\bar{z}_2; \emptyset) = 0, \quad v(\bar{z}_2; \{1\}) = w_{13} + w_{12},$$

$$v(\bar{z}_2; \{2\}) = w_{21} + w_{23}, \quad v(\bar{z}_2; \{3\}) = w_{31} + w_{32},$$

$$v(\bar{z}_2; \{12\}) = \max_{p,l}\big(a^{12}_{pl} + b^{21}_{pl}\big) + w_{13} + w_{23},$$

$$v(\bar{z}_2; \{13\}) = \max_{p,l}\big(a^{13}_{pl} + b^{31}_{pl}\big) + w_{12} + w_{32},$$

$$v(\bar{z}_2; \{23\}) = \max_{p,l}\big(a^{23}_{pl} + b^{32}_{pl}\big) + w_{21} + w_{31},$$

$$v(\bar{z}_2; N) = \max_{p,l}\big(a^{12}_{pl} + b^{21}_{pl}\big) + \max_{p,l}\big(a^{13}_{pl} + b^{31}_{pl}\big) + \max_{p,l}\big(a^{23}_{pl} + b^{32}_{pl}\big).$$

Introduce the notation: $A_{12} = \max_{p,l}(a^{12}_{pl} + b^{21}_{pl})$, $D_1 = w_{23} + w_{13}$, $A_{13} = \max_{p,l}(a^{13}_{pl} + b^{31}_{pl})$, $D_2 = w_{12} + w_{32}$, $A_{23} = \max_{p,l}(a^{23}_{pl} + b^{32}_{pl})$, $D_3 = w_{21} + w_{31}$. An imputation $x = (x_1, x_2, x_3)$ belongs to the core $C(\bar{z}_2)$ when the following conditions are satisfied:

$$\begin{cases} x_1 + x_2 \ge v(\bar{z}_2; \{12\}) \\ x_1 + x_3 \ge v(\bar{z}_2; \{13\}) \\ x_2 + x_3 \ge v(\bar{z}_2; \{23\}) \\ x_1 \ge v(\bar{z}_2; \{1\}) \\ x_2 \ge v(\bar{z}_2; \{2\}) \\ x_3 \ge v(\bar{z}_2; \{3\}) \\ x_1 + x_2 + x_3 = v(\bar{z}_2; N) \end{cases}$$

This system, which defines the structure of the core $C(\bar{z}_2)$, can be rewritten in the form:

$$\begin{cases} x_1 + x_2 \ge A_{12} + D_1 \\ x_1 + x_3 \ge A_{13} + D_2 \\ x_2 + x_3 \ge A_{23} + D_3 \\ x_1 + x_2 + x_3 = v(\bar{z}_2; N) \end{cases}$$

Consider the core $C(\bar{z}_1)$ of the two-stage game $\Gamma$ and rewrite it in accordance with the new notation:

$$\begin{cases} x_1' + x_2' \ge v(\bar{z}_1; \{12\}) \\ x_1' + x_3' \ge v(\bar{z}_1; \{13\}) \\ x_2' + x_3' \ge v(\bar{z}_1; \{23\}) \\ x_1' + x_2' + x_3' = v(\bar{z}_1; N) \end{cases}$$

$$\begin{cases} x_1' + x_2' \ge A_{12} \\ x_1' + x_3' \ge A_{13} \\ x_2' + x_3' \ge A_{23} \\ x_1' + x_2' + x_3' = v(\bar{z}_1; N) \end{cases}$$

Strong time consistency of the core. Using an IDP $\beta$, where $\beta_k^i$ denotes the payment to player $i$ at stage $k$, we get:

$$\begin{cases} \beta_1^1 + \beta_2^1 + \beta_1^2 + \beta_2^2 \ge A_{12} \\ \beta_1^1 + \beta_2^1 + \beta_1^3 + \beta_2^3 \ge A_{13} \\ \beta_1^2 + \beta_2^2 + \beta_1^3 + \beta_2^3 \ge A_{23} \end{cases} \tag{18}$$

For strong time consistency these inequalities must hold under the following additional conditions:

$$\begin{cases} \beta_2^1 + \beta_2^2 \ge A_{12} + D_1 \\ \beta_2^1 + \beta_2^3 \ge A_{13} + D_2 \\ \beta_2^2 + \beta_2^3 \ge A_{23} + D_3 \end{cases} \tag{19}$$

Fix $\beta_1$; then for strong time consistency (18) must hold for all $\beta_2$ satisfying conditions (19). Also, since $v(\bar{z}_2; N) = v(\bar{z}_1; N)$, we get $\beta_1^1 + \beta_1^2 + \beta_1^3 = 0$. If (18) holds for the minimal values of $\beta_2^1, \beta_2^2, \beta_2^3$ allowed by condition (19), then it holds for all other values as well. We get:

$$\begin{cases} -\beta_1^3 + A_{12} + D_1 \ge A_{12} \\ -\beta_1^2 + A_{13} + D_2 \ge A_{13} \\ -\beta_1^1 + A_{23} + D_3 \ge A_{23} \end{cases} \tag{20}$$

Hence we get conditions for strong time consistency of the core $C(\bar{z}_1)$ in game $\Gamma$.

Proposition 9. Suppose that the following conditions are satisfied

$$\begin{cases} \beta_1^3 \le D_1 \\ \beta_1^2 \le D_2 \\ \beta_1^1 \le D_3 \end{cases} \tag{21}$$

(there exists $\beta_1$ which satisfies (21)); then the core $C(\bar{z}_1)$ is strongly time-consistent.
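Condition (21), together with the requirement that the first-stage payments sum to zero, can be checked mechanically. The following Python sketch is illustrative only; the function name and the maxmin values $w_{ij}$ are hypothetical.

```python
def satisfies_condition_21(beta1, D, tol=1e-9):
    """Check (21) for beta1 = (beta_1^1, beta_1^2, beta_1^3) and
    D = (D_1, D_2, D_3), plus the zero-sum requirement on beta1."""
    b1, b2, b3 = beta1
    D1, D2, D3 = D
    sums_to_zero = abs(b1 + b2 + b3) <= tol
    return (sums_to_zero and b3 <= D1 + tol
            and b2 <= D2 + tol and b1 <= D3 + tol)

# Hypothetical maxmin values w_ij for the three pairwise games.
w = {(1, 2): 1, (1, 3): 0, (2, 1): 1, (2, 3): 2, (3, 1): 0, (3, 2): 1}
D = (w[2, 3] + w[1, 3],   # D_1 = w_23 + w_13
     w[1, 2] + w[3, 2],   # D_2 = w_12 + w_32
     w[2, 1] + w[3, 1])   # D_3 = w_21 + w_31
```

Since all $w_{ij}$ are nonnegative, the zero vector always passes this check, so the conditions of Proposition 9 can be met with no first-stage payments.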

2. Dynamic games with shock

The papers (Gao et al., 2017; Petrosyan and Sedakov, 2016) are devoted to repeated games with a finite number of rounds. In this framework, the first round is the network formation stage, where players form a network by choosing their neighbors. All subsequent rounds have a similar structure: observing the network, each player may reconsider his set of neighbors (he can only make the set smaller), and after that the player selects an admissible control. Players’ decisions made in the current round do not influence the structure of the game in any of the subsequent rounds. What does influence it is a so-called “shock”, an external factor of a stochastic nature. There are different types of shocks. For instance, in (Corbae and Duffy, 2008) the shock changes the sets of players’ actions. In the setting considered here, it is supposed that the shock makes a particular player inactive in the game. Moreover, it is assumed that the shock may appear in each round after the network formation stage, but once the shock appears, it will never appear in subsequent rounds of the game.

One-shot network games are studied in (Bala and Goyal, 2000; Galeotti et al., 2006; Haller, 2012), where Nash equilibrium is considered as a solution. The model is based on a cooperative two-stage network formation game (Petrosyan, Sedakov and Bochkarev, 2013). It is worth noting that for static games (or two-stage games), similar settings involving network formation as well as a strategic component are well studied for coordination games and prisoner’s dilemma games (Goyal and Vega-Redondo, 2005; Jackson and Wolinsky, 1996; Xie et al., 2013). Dynamic aspects of network formation, including stochastic elements or cooperative behavior, are considered, for instance, in (Feri, 2007; Fosco and Mengel, 2011; Feri and Meléndez-Jiménez, 2013; Jackson and Watts, 2002).
The papers (Gao et al., 2017; Petrosyan and Sedakov, 2016) also cover the problem of subgame consistency of a cooperative solution in repeated network games, namely, subgame consistency of the dynamic Shapley value (Shapley, 1953). It is known that the Shapley value is an efficient cooperative solution. The Shapley value is subgame consistent if, for any player, his entry of the Shapley value equals the sum of his cumulative individual stage payoffs up to an arbitrary round and his entry of the Shapley value in the subgame starting from this round, provided that all the players follow the cooperative agreement. The notion of subgame consistency was introduced in (Petrosjan, 2006) for cooperative stochastic games. Inconsistency of the cooperative solution may break the cooperative agreement, but by means of a specially designed imputation distribution procedure (Petrosyan and Danilov, 1979), the cooperative agreement can be kept throughout the game.

As an application of the proposed theory, one can imagine a wireless network in which a pair of wireless agents (players) can transmit data to each other. Data transmission is successful if the transmit power of the players is greater than a threshold, i.e. if they are “connected”. Thus the first stage can be interpreted as a stage at which players choose their transmit power. Then, observing the network, agents can reduce their transmit power (if it makes sense) and select the transmission capacity according to demand, while the shock can make a particular agent inactive in the network. In a cooperative scenario one may focus on finding a policy that maximizes the expected total profit of the network according to its topology and the transmission capacities chosen by the agents.

2.1. The model

Due to the importance of (Gao et al., 2017) from our perspective, we describe the model in detail. We consider a dynamic game with more than two stages. Let $\ell + 1$ be the length of the game. The game consists of one network formation stage and $\ell$ rounds. When the game is deterministic, one can easily extend the theory of two-stage games to the $(\ell + 1)$-stage game. For this reason a stochastic element called “shock”, which influences the network structure, is introduced. The shock, which may appear between rounds with a given probability $p$, is characterized by a discrete random variable $\omega$ that takes only $\ell + 1$ values. If $\omega = 0$, the shock does not appear in the game, whereas $\omega = t$ specifies the game round before which the shock appears. It is supposed that the probability of the shock in round $t$ equals

$$\Pr(\omega = t) = (1 - p)^{t-1} p \quad \text{for } t \in \{1, \dots, \ell\},$$

and the probability that the shock does not appear in the game is $\Pr(\omega = 0) = (1 - p)^{\ell}$.

Consider a tuple $(\tau_1, \dots, \tau_\ell)$ where $\tau_t \in \{0, 1, \dots, t\}$ for all $t \in \{1, \dots, \ell\}$. We connect the realization $\omega = t$ with a unique tuple $(\tau_1(\omega), \dots, \tau_\ell(\omega))$ such that $\tau_1(t) = \dots = \tau_{t-1}(t) = 0$ and $\tau_t(t) = \dots = \tau_\ell(t) = t$. Below, to simplify notation, the dependence on $\omega$ is left out.
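The distribution of the shock and the tuple attached to each realization can be illustrated by a minimal Python sketch. It assumes $0 < p < 1$; the function names `shock_distribution` and `tau_tuple` are illustrative only and do not come from the cited papers.

```python
def shock_distribution(p, ell):
    """Return [Pr(omega=0), Pr(omega=1), ..., Pr(omega=ell)]."""
    probs = [(1 - p) ** ell]  # the shock never appears
    probs += [(1 - p) ** (t - 1) * p for t in range(1, ell + 1)]
    return probs


def tau_tuple(omega, ell):
    """Tuple (tau_1, ..., tau_ell) attached to a realization of omega:
    zeros before the shock round, omega from the shock round onwards."""
    return tuple(omega if 0 < omega <= s else 0 for s in range(1, ell + 1))


if __name__ == "__main__":
    print(shock_distribution(0.3, 5))
    print(tau_tuple(3, 5))
```

The probabilities sum to one because the geometric terms for $t = 1, \dots, \ell$ add up to $1 - (1-p)^{\ell}$.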

Game stages. Again we distinguish the network formation stage and the other subsequent stages as in (Petrosyan, Sedakov and Bochkarev, 2013).

Network formation stage. This stage does not differ from that of the two-stage model. Let network $g$ be formed at this stage. Denote the player who has more neighbors in $g$ than any other player by $m \in N$, i.e.

$$m = \arg\max_{i \in N} |N_i(g)|. \tag{22}$$

If more than one player satisfies (22), we select one of them for further consideration.

Round 1. At the beginning of the round the shock appears with probability p. It means that player m with this probability becomes inactive in the network. In other words, all links involving player m are eliminated from network g, yet this player still belongs to set N and receives zero payoffs. Note that in round 1, τ1 can take only two values: 0 and 1.

Let $g_{-m}$ denote a network in which all the links with player $m \in N$ are deleted, i.e. $g_{-m} = g \setminus \{(j, m) \in g : j \in N_m(g)\}$. Thus we have a network

$$g^{1,\tau_1} = \begin{cases} g, & \tau_1 = 0, \\ g_{-m}, & \tau_1 = 1. \end{cases}$$
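As an illustration of (22) and of the network $g_{-m}$, the following Python sketch selects the most connected player and removes his links. The representation of an undirected network as a set of two-element `frozenset` links and the tie-breaking rule (the first maximizer in the given order) are assumptions made for this example; the text only requires that one maximizer be selected.

```python
def neighbors(g, i):
    """Neighbors of player i in network g, a set of frozenset({i, j}) links."""
    return {j for link in g if i in link for j in link if j != i}


def most_connected_player(players, g):
    """Player m from (22); ties are broken by the order of `players`."""
    return max(players, key=lambda i: len(neighbors(g, i)))


def remove_player_links(g, m):
    """Network g_{-m}: g without every link that involves player m."""
    return {link for link in g if m not in link}


if __name__ == "__main__":
    players = [1, 2, 3, 4]
    g = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}
    m = most_connected_player(players, g)
    print(m, remove_player_links(g, m))
```

In this line network players 2 and 3 both have two neighbors, so the sketch selects player 2 and only the link between players 3 and 4 survives the shock.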

After observing the network $g^{1,\tau_1}$, players are allowed to reconsider its structure: in particular, players can only delete some “ineffective” links. For this purpose, we introduce $n$-dimensional vectors $d_i(g^{1,\tau_1}) = (d_{i1}(g^{1,\tau_1}), \dots, d_{in}(g^{1,\tau_1}))$, $i \in N$, which show an updated network:

$$d_{ij}(g^{1,\tau_1}) = \begin{cases} 1, & \text{if } i \text{ keeps the link with player } j \in N_i(g^{1,\tau_1}) \text{ in } g^{1,\tau_1}, \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

Let $D_i(g^{1,\tau_1}) = \{d_i(g^{1,\tau_1}) : d_i(g^{1,\tau_1}) \text{ satisfies (23)}\}$, $i \in N$. The profile $d(g^{1,\tau_1}) = (d_1(g^{1,\tau_1}), \dots, d_n(g^{1,\tau_1}))$ updates network $g^{1,\tau_1}$, thus a new network, denoted by $g^{d,1,\tau_1}$, consists of links $(i, j)$ such that $d_{ij}(g^{1,\tau_1}) = d_{ji}(g^{1,\tau_1}) = 1$.

At the same time, each player chooses a control from a given set. In particular, player $i \in N$ chooses $u_i^{1,\tau_1} \in U_i$. Then the behavior of player $i$ in round 1 is a pair $(d_i(g^{1,\tau_1}), u_i^{1,\tau_1})$. A payoff to player $i \in N$ is defined according to a real-valued payoff function $K_i$ which depends on the updated network $g^{d,1,\tau_1}$, control $u_i^{1,\tau_1}$ of player $i$, and controls $u_j^{1,\tau_1}$ of his neighbors $j \in N_i(g^{d,1,\tau_1})$, i.e. $K_i\big(u_i^{1,\tau_1}, u^{1,\tau_1}_{N_i(g^{d,1,\tau_1})}\big)$.

Having received the payoffs in this round, players proceed to the next round, which has a similar structure. Consider an intermediate round $t \in \{2, \dots, \ell\}$ and suppose that we have an updated network $g^{d,t-1,\tau_{t-1}}$ after round $t - 1$.

Round $t$, $t \in \{2, \dots, \ell\}$. At the beginning of this round the shock appears with probability $p$ (if it did not appear before). In case of the shock, player $m$ becomes inactive. If the shock appeared in a previous round, nothing happens. Thus we have a network just prior to round $t$:

$$g^{t,\tau_t} = \begin{cases} g^{d,t-1,\tau_{t-1}}, & \tau_t \in \{0, 1, \dots, t-1\}, \\ g^{d,t-1,\tau_{t-1}}_{-m}, & \tau_t = t. \end{cases}$$

After observing the network $g^{t,\tau_t}$, players are allowed to reconsider its link structure. For this purpose, we introduce $n$-dimensional vectors $d_i(g^{t,\tau_t}) = (d_{i1}(g^{t,\tau_t}), \dots, d_{in}(g^{t,\tau_t}))$, $i \in N$, which show an updated network:

$$d_{ij}(g^{t,\tau_t}) = \begin{cases} 1, & \text{if } i \text{ keeps the link with player } j \in N_i(g^{t,\tau_t}) \text{ in } g^{t,\tau_t}, \\ 0, & \text{otherwise.} \end{cases} \tag{24}$$

Let $D_i(g^{t,\tau_t}) = \{d_i(g^{t,\tau_t}) : d_i(g^{t,\tau_t}) \text{ satisfies (24)}\}$, $i \in N$. The profile $d(g^{t,\tau_t}) = (d_1(g^{t,\tau_t}), \dots, d_n(g^{t,\tau_t}))$ updates network $g^{t,\tau_t}$, thus a new network, denoted by $g^{d,t,\tau_t}$, consists of links $(i, j)$ such that $d_{ij}(g^{t,\tau_t}) = d_{ji}(g^{t,\tau_t}) = 1$.

At the same time, each player chooses a control from a given set. In particular, player $i \in N$ chooses $u_i^{t,\tau_t} \in U_i$. Then the behavior of player $i$ in round $t$ is a pair $(d_i(g^{t,\tau_t}), u_i^{t,\tau_t})$.

In round $t$, a payoff to player $i \in N$ is defined according to the same real-valued payoff function $K_i\big(u_i^{t,\tau_t}, u^{t,\tau_t}_{N_i(g^{d,t,\tau_t})}\big)$.

Having received the payoffs in this round, players proceed to round $t + 1$ unless $t = \ell$. In this case the game ends.

Strategies. To formalize the game, define strategies of players.

Definition 6. A strategy $x_i = \{x_i^{\omega}\}$ of player $i \in N$ is a rule that assigns a profile

$$x_i^{\omega} = \big(g_i, (d_i(g^{1,\tau_1}), u_i^{1,\tau_1}), \dots, (d_i(g^{\ell,\tau_\ell}), u_i^{\ell,\tau_\ell})\big)$$

to each value $\omega \in \{0, 1, \dots, \ell\}$.

Recall that $\omega$ defines the profile $(\tau_1, \dots, \tau_\ell)$ in a unique way; therefore, for any $\omega = t$, $t \in \{0, 1, \dots, \ell\}$, and player $i \in N$, we get

$$x_i^{t} = \begin{cases} \big(g_i, (d_i(g^{1,0}), u_i^{1,0}), \dots, (d_i(g^{t-1,0}), u_i^{t-1,0}), (d_i(g^{t,t}), u_i^{t,t}), \dots, (d_i(g^{\ell,t}), u_i^{\ell,t})\big), & i \neq m, \\ \big(g_m, (d_m(g^{1,0}), u_m^{1,0}), \dots, (d_m(g^{t-1,0}), u_m^{t-1,0}), (0, u_m^{t,t}), \dots, (0, u_m^{\ell,t})\big), & i = m. \end{cases}$$

Let $X_i$ denote the set of strategies of player $i \in N$. Given a value $\omega = t$, consider a profile $x^t = (x_1^t, \dots, x_n^t)$. A payoff to player $i \in N$ for $\omega = t$ equals:

$$\mathcal{K}_i(x^t) = \begin{cases} \sum\limits_{j=1}^{t-1} K_i\big(u_i^{j,0}, u^{j,0}_{N_i(g^{d,j,0})}\big) + \sum\limits_{j=t}^{\ell} K_i\big(u_i^{j,t}, u^{j,t}_{N_i(g^{d,j,t})}\big), & i \neq m, \\ \sum\limits_{j=1}^{t-1} K_m\big(u_m^{j,0}, u^{j,0}_{N_m(g^{d,j,0})}\big), & i = m. \end{cases}$$

Then a payoff $\mathcal{E}_i$ to player $i \in N$ in the whole game is defined as his expected payoff, provided that the strategy profile $x = (x_1, \dots, x_n) \in X_1 \times \dots \times X_n$ is chosen:

$$\mathcal{E}_i(x) = \sum_{t=0}^{\ell} \Pr(\omega = t)\, \mathcal{K}_i(x^t).$$

2.2. Cooperation in the dynamic game with shock

In the previous section we specified the rules of the game and formalized it. Now the repeated game is considered from the perspective of classical cooperative theory.

Characteristic function. Under the cooperative agreement, the value $V(N)$ can be easily determined using the ideas of (Petrosyan, Sedakov and Bochkarev, 2013). Since all the players aim at maximizing their total expected payoff, let

$$V(N) = \max_{x_i,\, i \in N} \sum_{i \in N} \mathcal{E}_i(x) = \sum_{i \in N} \mathcal{E}_i(x^*). \tag{25}$$

A strategy profile $x^* = (x_1^*, \dots, x_n^*)$ whose entries $x_i^*$, $i \in N$, solve (25) is called the cooperative strategy profile.

Proposition 10. Let functions $K_i$, $i \in N$, satisfy property (P). Then the cooperative strategy $x_i^* = \{x_i^{*\omega}\}$ of player $i \in N$ for a given $\omega = t$, $t \in \{0, 1, \dots, \ell\}$, is of the form

$$x_i^{*t} = \big(g_i^*, (g_i^*, u_i^{*0}), \dots, (g_i^*, u_i^{*0}), (g_i^*, u_i^{*t}), \dots, (g_i^*, u_i^{*t})\big) \quad \text{for all } i \in N \setminus \{m\},$$

$$x_m^{*t} = \big(g_m^*, (g_m^*, u_m^{*0}), \dots, (g_m^*, u_m^{*0}), (0, u_m^{*t}), \dots, (0, u_m^{*t})\big).$$

From the previous statement, we conclude that only two networks are possible in the cooperative framework: network $g^*$, which is not changed until the shock appears, and network $g^*_{-m}$, which is not changed after the shock has appeared.

The next result is an extension of the result introduced in (Gao et al., 2017), provided that functions $K_i$, $i \in N$, satisfy property (P). The statement connects the value $V(S)$ for a given coalition $S \subseteq N$ with values of “local” (or stage) characteristic functions in games played in each round. More specifically, given a round $t$ and a number $\tau_t \in \{0, 1, \dots, t\}$, we define stage characteristic functions as in (Gao et al., 2017): $v(S)$ before the shock and $\hat{v}(S)$ after the shock:

$$v(S) = \max_{\substack{(g_i, u_i) \in G_i \times U_i \\ i \in S}} \sum_{i \in S} K_i\big(u_i, u_{N_i(g) \cap S}\big), \tag{26}$$

$$\hat{v}(S) = \max_{\substack{(g_i, u_i) \in G_i \times U_i \\ i \in S \setminus \{m\}}} \sum_{i \in S \setminus \{m\}} K_i\big(u_i, u_{N_i(g) \cap S}\big). \tag{27}$$

Proposition 11. The value $V(S)$ can be found from the recurrence equation $V(S) = p \ell \hat{v}(S) + (1 - p) V_1(S)$, where

$$V_t(S) = v(S) + p(\ell - t)\hat{v}(S) + (1 - p) V_{t+1}(S) \quad \text{for } t = 1, \dots, \ell - 1,$$

with boundary condition $V_\ell(S) = \hat{v}(S)$.

2.3. Cooperative solution

Having determined the values $V(S)$ for all $S \subseteq N$, one can define an imputation, which is an $n$-dimensional vector showing how the maximal total expected payoff $V(N)$ is allocated among players. Let the Shapley value $\Phi = (\Phi_1, \dots, \Phi_n)$ be taken as the solution. For all $i \in N$, its entries can be determined by the formula:

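$$\Phi_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[V(S) - V(S \setminus \{i\})\big], \tag{28}$$

where $\alpha_S = (|N| - |S|)!\,(|S| - 1)!\,/\,|N|!$.

Let $\varphi = (\varphi_1, \dots, \varphi_n)$ and $\hat{\varphi} = (\hat{\varphi}_1, \dots, \hat{\varphi}_n)$ be the “local” Shapley values calculated for characteristic functions $v$ and $\hat{v}$ respectively:

$$\varphi_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[v(S) - v(S \setminus \{i\})\big],$$

$$\hat{\varphi}_i = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[\hat{v}(S) - \hat{v}(S \setminus \{i\})\big].$$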
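The Shapley formula above can be evaluated directly for small games. The sketch below is a brute-force Python illustration, not the authors' implementation: the function name `shapley_value` and the toy characteristic functions in the usage part are hypothetical, and the enumeration is exponential in the number of players.

```python
from itertools import combinations
from math import factorial


def shapley_value(players, v):
    """Shapley value by direct enumeration: for each player i, sum over all
    coalitions S containing i of alpha_S * [v(S) - v(S minus {i})].
    `v` maps a frozenset of players to a number, with v(empty set) = 0."""
    n = len(players)
    phi = {}
    for i in players:
        others = [j for j in players if j != i]
        total = 0.0
        for k in range(n):  # k is the number of partners of i in S
            for rest in combinations(others, k):
                s = frozenset(rest) | {i}
                alpha = factorial(n - len(s)) * factorial(len(s) - 1) / factorial(n)
                total += alpha * (v(s) - v(s - {i}))
        phi[i] = total
    return phi


if __name__ == "__main__":
    players = [1, 2, 3]
    # Hypothetical symmetric stage game: v(S) = |S|^2.
    print(shapley_value(players, lambda s: len(s) ** 2))
    # Hypothetical post-shock game in which player 3 is inactive.
    print(shapley_value(players, lambda s: len(s - {3}) ** 2))
```

In the second toy game the inactive player contributes nothing to any coalition, so his entry is zero, which mirrors the property of the post-shock local Shapley value for player $m$.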

Note that $\hat{\varphi}_m = 0$.

Proposition 12 (Gao et al., 2015). The Shapley value $\Phi$ in (28) can be found in an explicit form by means of the Shapley values $\varphi$ and $\hat{\varphi}$ in stage games:

$$\Phi_i = \frac{(1 - p)\big[1 - (1 - p)^{\ell}\big]}{p}\, \varphi_i + \left(\ell - \frac{(1 - p)\big[1 - (1 - p)^{\ell}\big]}{p}\right) \hat{\varphi}_i, \quad i \neq m, \tag{29}$$

$$\Phi_m = \frac{(1 - p)\big[1 - (1 - p)^{\ell}\big]}{p}\, \varphi_m. \tag{30}$$

2.4. Subgame-consistency problem

The problem of subgame consistency of cooperative solutions was considered in (Yeung and Petrosyan, 2006; Yeung and Petrosyan, 2012) for cooperative differential games. In cooperative dynamic network games this problem was examined in (Gao et al., 2017; Petrosyan and Sedakov, 2016). Before the game $\Gamma_C$ starts, players agree on choosing cooperative strategies $x_1^*, \dots, x_n^*$ from (25), i.e. the strategies that maximize the total expected payoff, and on allocating the value $V(N)$ according to the Shapley value $\Phi$. This means that in $\Gamma_C$ each player $i \in N$ expects his payoff to be equal to $\Phi_i$. If players recalculate the Shapley value after the network formation stage (after choosing $g_1^*, \dots, g_n^*$), it turns out that the recalculated Shapley value differs from the “original” $\Phi$. This fact leads to breaking the cooperative agreement, since some players may refuse to use their cooperative strategies. We study the problem in detail.

Characteristic function and the Shapley value in a subgame. Similarly to (26) and (27), we define characteristic functions $v(g, S)$ before the shock and $\hat{v}(g, S)$ after the shock for any $S \subseteq N$, provided that network $g$ has formed (which is the case after the network formation stage):

$$v(g, S) = \max_{u_i \in U_i,\, i \in S} \sum_{i \in S} K_i\big(u_i, u_{N_i(g) \cap S}\big), \tag{31}$$

$$\hat{v}(g, S) = \max_{u_i \in U_i,\, i \in S \setminus \{m\}} \sum_{i \in S \setminus \{m\}} K_i\big(u_i, u_{N_i(g) \cap S}\big). \tag{32}$$

Consider a game round $t \in \{1, \dots, \ell\}$ and $\tau \in \{0, 1, \dots, t\}$. Let $\Gamma_C^{t,\tau} = (N, V^{t,\tau})$ denote a subgame of the game $\Gamma_C$. The characteristic function $V^{t,\tau}$ for any $S \subseteq N$ is defined similarly to $V(S)$ as:

$$V^{t,\tau}(S) = \begin{cases} (\ell - t + 1)\, \hat{v}(g^*_{-m}, S) & \text{for } \tau \in \{1, \dots, t\}, \\ v(g, S) + p(\ell - t)\, \hat{v}(g^*_{-m}, S) + (1 - p) V^{t+1,0}(S) & \text{for } \tau = 0, \end{cases}$$

with boundary condition $V^{\ell+1,0}(S) = \hat{v}(g^*_{-m}, S)$.

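The entries of the Shapley value $\Phi^{t,\tau} = (\Phi_1^{t,\tau}, \dots, \Phi_n^{t,\tau})$ in subgame $\Gamma_C^{t,\tau}$ can be determined by the formula:

$$\Phi_i^{t,\tau} = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[V^{t,\tau}(S) - V^{t,\tau}(S \setminus \{i\})\big]. \tag{33}$$

Note that $\Phi_m^{t,\tau} = 0$ for all $\tau \in \{1, \dots, t\}$.

Let $\varphi(g) = (\varphi_1(g), \dots, \varphi_n(g))$ and $\hat{\varphi}(g) = (\hat{\varphi}_1(g), \dots, \hat{\varphi}_n(g))$ denote the “local” Shapley values calculated for characteristic functions $v(g, S)$ and $\hat{v}(g, S)$ respectively:

$$\varphi_i(g) = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[v(g, S) - v(g, S \setminus \{i\})\big],$$

$$\hat{\varphi}_i(g) = \sum_{S \subseteq N,\, i \in S} \alpha_S \big[\hat{v}(g, S) - \hat{v}(g, S \setminus \{i\})\big].$$

Note that $\hat{\varphi}_m(g) = 0$. Then we get a result similar to Proposition 12.

Proposition 13 (Gao et al., 2015). The Shapley value $\Phi^{t,\tau}$ in (33), $t \in \{1, \dots, \ell\}$, can be found in an explicit form by means of the Shapley values $\varphi(g^*)$ and $\hat{\varphi}(g^*_{-m})$ in stage games:

$$\Phi_i^{t,0} = \left(1 + \frac{(1 - p)\big[1 - (1 - p)^{\ell - t}\big]}{p}\right) \varphi_i(g^*) + \left(\ell - t - \frac{(1 - p)\big[1 - (1 - p)^{\ell - t}\big]}{p}\right) \hat{\varphi}_i(g^*_{-m}), \quad i \neq m, \tag{34}$$

$$\Phi_i^{t,\tau} = (\ell - t + 1)\, \hat{\varphi}_i(g^*_{-m}), \quad \tau \in \{1, \dots, t\}, \quad i \neq m, \tag{35}$$

$$\Phi_m^{t,0} = \left(1 + \frac{(1 - p)\big[1 - (1 - p)^{\ell - t}\big]}{p}\right) \varphi_m(g^*), \tag{36}$$

$$\Phi_m^{t,\tau} = 0, \quad \tau \in \{1, \dots, t\}. \tag{37}$$

The entries of the expected Shapley value $\Phi^t = (\Phi_1^t, \dots, \Phi_n^t)$ in the remaining rounds of the game starting from round $t \in \{1, \dots, \ell\}$, provided that the shock has not appeared yet, have the following form (Gao et al., 2015):

$$\Phi_i^t = (1 - p)\Phi_i^{t,0} + p\,\Phi_i^{t,t} = \frac{(1 - p)\big[1 - (1 - p)^{\ell - t + 1}\big]}{p}\, \varphi_i(g^*) + \left(\ell - t + 1 - \frac{(1 - p)\big[1 - (1 - p)^{\ell - t + 1}\big]}{p}\right) \hat{\varphi}_i(g^*_{-m}), \quad i \neq m,$$

$$\Phi_m^t = (1 - p)\Phi_m^{t,0} = \frac{(1 - p)\big[1 - (1 - p)^{\ell - t + 1}\big]}{p}\, \varphi_m(g^*).$$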
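The coefficients in the expected Shapley value can be checked numerically. The Python sketch below assumes $0 < p < 1$ and uses hypothetical helper names (`c`, `subgame_coeffs_no_shock`, `expected_coeffs`). It relies on the observation that $(1-p)[1-(1-p)^n]/p = \sum_{k=1}^{n} (1-p)^k$, which can be read as the expected number of the next $n$ rounds that are played before the shock.

```python
def c(p, n):
    """Closed-form coefficient (1-p)[1-(1-p)^n]/p; requires 0 < p < 1."""
    return (1 - p) * (1 - (1 - p) ** n) / p


def subgame_coeffs_no_shock(p, ell, t):
    """Coefficients of phi_i(g*) and phi_hat_i(g*_{-m}) in (34)."""
    k = c(p, ell - t)
    return 1 + k, ell - t - k


def expected_coeffs(p, ell, t):
    """Coefficients of Phi_i^t computed as (1-p)*Phi_i^{t,0} + p*Phi_i^{t,t}."""
    a0, b0 = subgame_coeffs_no_shock(p, ell, t)
    a_shock, b_shock = 0.0, ell - t + 1  # (35) with tau = t
    return (1 - p) * a0 + p * a_shock, (1 - p) * b0 + p * b_shock


if __name__ == "__main__":
    p, ell = 0.3, 5
    for t in range(1, ell + 1):
        print(t, expected_coeffs(p, ell, t), c(p, ell - t + 1))
```

For every round $t$ the two coefficients add up to $\ell - t + 1$, the number of rounds left in the game.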

Thus, we come to the following observation. In game $\Gamma_C$ players agree on choosing cooperative strategies $x_i^*$, $i \in N$, and on allocating the value $V(N)$ according to the Shapley value $\Phi$ determined by formulas (29) and (30). After forming network $g^*$ prescribed by the cooperative strategies, players may recalculate the Shapley value, which becomes $\Phi^1$. Therefore, there may exist a player $i \in N$ such that $\Phi_i \neq \Phi_i^1$. This fact means “inconsistency” of the Shapley value. The Shapley value would be subgame consistent if for any player the following statement was true: the entry $\Phi_i$ equals the sum of cumulative individual stage payoffs to player $i$ up to round $t$ and the entry $\Phi_i^t$ in the subgame starting from this round, provided that all the players follow their cooperative strategies $x_i^*$, $i \in N$.

Mechanism of stage payments. Since the Shapley value is subgame inconsistent (Gao et al., 2015; Petrosyan and Sedakov, 2016), we replace players’ stage payoffs with new payments specified below. Denote payments to player $i \in N$ in all rounds in $\Gamma_C$ by $\beta_i = \{\beta_i^0, \beta_i^{t,\tau}\}$, $t \in \{1, \dots, \ell\}$, $\tau \in \{0, 1, \dots, t\}$. Here $\beta_i^0$ is a payment to player $i$ at the network formation stage, and $\beta_i^{t,\tau}$ is a payment to player $i$ in round $t$, provided that the shock appears in round $\tau$ for $\tau > 0$ (or the shock does not appear before round $t$ for $\tau = 0$).

Definition 7. Imputation distribution procedure (IDP) of the Shapley value $\Phi = (\Phi_1, \dots, \Phi_n)$ is a profile $\beta = (\beta_1, \dots, \beta_n)$ such that

$$\Phi_i = \beta_i^0 + \sum_{\tau=0}^{\ell} \Pr(\omega = \tau) \sum_{t=1}^{\ell} \beta_i^{t,\tau}, \quad i \in N. \tag{38}$$

Definition 8. IDP $\beta$ of the Shapley value $\Phi$ is subgame consistent if for all $i \in N$ and $t \in \{1, \dots, \ell\}$ we have:

$$\Phi_i^{t,\tau} = \sum_{j=t}^{\ell} \beta_i^{j,\tau}, \quad \tau \in \{1, \dots, t\}, \tag{39}$$

$$\Phi_i^{t,0} = \beta_i^{t,0} + \sum_{\tau=t+1}^{\ell} \Pr(\omega = \tau \mid t) \sum_{j=t+1}^{\ell} \beta_i^{j,\tau} + \Pr(\omega = 0 \mid t) \sum_{j=t+1}^{\ell} \beta_i^{j,0},$$

i.e. if the stage payment to player $i$ plus his expected component of the Shapley value in the rest of the game equals $\Phi_i^{t,\tau}$ for all $t$.

Proposition 14. Subgame consistent IDP $\beta$ for the Shapley value $\Phi$ is of the form:

$$\begin{aligned} \beta_i^0 &= \Phi_i - \Phi_i^1, && i \neq m, \\ \beta_i^{t,0} &= \Phi_i^{t,0} - \Phi_i^{t+1} = \varphi_i(g^*), && t \in \{1, \dots, \ell\},\ i \neq m, \\ \beta_i^{t,\tau} &= \Phi_i^{t,\tau} - \Phi_i^{t+1,\tau} = \hat{\varphi}_i(g^*_{-m}), && t \in \{1, \dots, \ell\},\ \tau \neq 0,\ i \neq m, \\ \beta_m^0 &= \Phi_m - \Phi_m^1, \\ \beta_m^{t,0} &= \Phi_m^{t,0} - \Phi_m^{t+1} = \varphi_m(g^*), && t \in \{1, \dots, \ell\}, \\ \beta_m^{t,\tau} &= \Phi_m^{t,\tau} - \Phi_m^{t+1,\tau} = 0, && t \in \{1, \dots, \ell\},\ \tau \neq 0. \end{aligned}$$

The designed subgame consistent IDP prescribes the following mechanism of stage payments: in each round players are paid according to their local Shapley values, whereas at the network formation stage players are paid the difference $\Phi_i - \Phi_i^1$, $i \in N$, which does not always equal zero.

In (Petrosyan and Sedakov, 2016), a more general model is considered, in which there is a second network formation stage where all players except the one affected by the shock can revise the network again. Alternative studies, for example, (Butenko and Petrosyan, 2014; Butenko and Petrosyan, 2015), differ from the model under consideration in that the shock influences not a particular player, but particular links or players’ action spaces. These models are applied to a three-stage problem; however, they can also be extended to a game with an arbitrary number of stages.

3. Strategic support of cooperation

This problem was studied in (Petrosyan and Sedakov, 2015). For players, cooperation is preferable to non-cooperative behavior, as cooperative behavior can be more beneficial for them. However, in dynamic games, a cooperative agreement creates two major problems. The first one is time inconsistency of any dynamic cooperative solution in general: even if all players agree on the solution at the beginning of the game, a player or a group of players, focusing on her/its cooperative payoffs, might want to revise the solution after some stages (time). To make players indifferent to the revision of the cooperative solution, an imputation distribution procedure (Petrosyan and Danilov, 1979) reallocating players’ stage payoffs over time under the cooperative agreement is introduced. The second problem is that the players’ cooperative strategies which result in the cooperative payoffs are not a Nash equilibrium in general. This means that there exists a player who will benefit if she stops following her cooperative strategy prescribed by the agreement. It is shown that, by implementing the time-consistent imputation distribution procedure, it becomes possible to find a Nash equilibrium guaranteeing the cooperative payoffs in some class of strategies. However, this can only be done under a specific condition on the parameters of the dynamic game. When the cooperative payoffs can be obtained as a result of both a time-consistent imputation distribution procedure and a Nash equilibrium, we can say that the cooperative agreement and, therefore, the cooperation of players is strategically supported (Parilina, 2014; Petrosyan, 2008; Petrosyan and Zenkevich, 2009; Yeung and Petrosyan, 2012).

Here the theory of strategically supported cooperation is developed for dynamic games on networks in which a network structure is a central element (see, for example, (Petrosyan and Sedakov, 2015)). Again, during the game, players form a network and choose their control variables, but they can benefit only from their neighbors in the network, as in (Petrosyan, Sedakov and Bochkarev, 2013). The theory will also be applied to repeated games (Abreu et al., 1994; Aumann and Shapley, 1994; Myerson, 1997), which are a special class of dynamic games.

4. The model

The game considered in (Petrosyan and Sedakov, 2015) consists of a network formation stage, which is the same as in the previous models, and subsequent stages of a similar structure, where at stage $t$ each player chooses $(d_i(g^t), u_i(g^t))$ and is rewarded according to his payoff function $\delta^t K_i\big(u_i(g^t), u_{N_i(d(g^t))}(g^t)\big)$, where $\delta \in (0, 1)$ is a common discount factor. Here we note that for any player $i \in N$, his control $u_i(g^t) \in U_i(g^t)$ depends on the network. After players receive their stage payoffs, we proceed to the next stage and the corresponding stage game on the network $g^{t+1}$ given by a single-valued rule $\mathcal{T}$: $g^{t+1} = \mathcal{T}(g^t, u(g^t))$, where $u(g^t) = (u_1(g^t), \dots, u_n(g^t))$.

Definition 9. A strategy $\eta_i$ of player $i \in N$ is a rule that uniquely prescribes behavior $g_i$ of this player at the network formation stage and behavior $(d_i(g^t), u_i(g^t))$ at game stage $t \geq 1$ on network $g^t \in \mathcal{G}(N)$.

Given a strategy profile $\eta = (\eta_1, \dots, \eta_n)$, one can define the payoff to player $i \in N$ in the game as a function of the strategy profile $\eta$ as

$$\mathcal{K}_i(\eta) = \sum_{t=1}^{\infty} \delta^t K_i\big(u_i(g^t), u_{N_i(d(g^t))}(g^t)\big),$$

provided that the discounted sum exists.

Suppose now that players jointly choose strategies $\eta_1, \dots, \eta_n$ to maximize the sum of their payoffs in the game. The profile $\bar{\eta} = (\bar{\eta}_1, \dots, \bar{\eta}_n)$ solving the maximization problem

$$\sum_{i \in N} \mathcal{K}_i(\bar{\eta}) = \max_{\eta} \sum_{i \in N} \mathcal{K}_i(\eta)$$

(if the maximum exists) is called the cooperative strategy profile, and an element of the profile is a cooperative strategy. Since the game is considered in the cooperative setting, the payoff to a player is prescribed by a cooperative solution. As the solution, we take the Shapley value $\Phi = (\Phi_1, \dots, \Phi_n)$, which is calculated for the characteristic function

$$V(S) = \begin{cases} \sum\limits_{i \in N} \mathcal{K}_i(\bar{\eta}), & S = N, \\ \max\limits_{\eta_i,\, i \in S}\ \min\limits_{\eta_j,\, j \in N \setminus S}\ \sum\limits_{i \in S} \mathcal{K}_i(\eta), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset, \end{cases}$$

where the approach from (Von Neumann and Morgenstern, 1944) is used. Thus, in the cooperative setting each player should follow her cooperative strategy, and the payoff to each player $i \in N$ in the game equals $\Phi_i$. However, due to time inconsistency of the Shapley value, the player may not get the value $\Phi_i$ as her payoff. It means that the equality

$$\Phi_i = \sum_{t=1}^{\tau - 1} \delta^t K_i\big(\bar{u}_i(\bar{g}^t), \bar{u}_{N_i(\bar{d}(\bar{g}^t))}(\bar{g}^t)\big) + \delta^{\tau} \Phi_i(\bar{g}^{\tau}) \tag{40}$$

does not hold for at least one $\tau \geq 1$ and player $i \in N$. Here $\bar{g}^{\tau}$ is the network to which the process comes at stage $\tau$ under cooperation, i.e., $\bar{g}^{\tau} = \mathcal{T}(\bar{g}^{\tau-1}, \bar{u}(\bar{g}^{\tau-1}))$, and $\Phi_i(\bar{g}^{\tau})$ is the Shapley value in the infinite-horizon game starting from network $\bar{g}^{\tau}$ and calculated for the characteristic function $V(\bar{g}^{\tau}, S)$, given by

$$V(\bar{g}^{\tau}, S) = \begin{cases} \sum\limits_{i \in N} \sum\limits_{t=\tau}^{\infty} \delta^{t-\tau} K_i\big(\bar{u}_i(\bar{g}^t), \bar{u}_{N_i(\bar{d}(\bar{g}^t))}(\bar{g}^t)\big), & S = N, \\ \max\limits_{\substack{(d_i(g^t), u_i(g^t)), \\ i \in S,\ t \geq \tau}}\ \min\limits_{\substack{(d_j(g^t), u_j(g^t)), \\ j \in N \setminus S,\ t \geq \tau}}\ \sum\limits_{i \in S} \sum\limits_{t=\tau}^{\infty} \delta^{t-\tau} K_i\big(u_i(g^t), u_{N_i(d(g^t))}(g^t)\big) \\ \quad \text{s.t. } g^{t+1} = \mathcal{T}(g^t, u(g^t)),\ t \geq \tau,\ g^{\tau} = \bar{g}^{\tau}, & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset. \end{cases}$$

Here the value $V(\bar{g}^{\tau}, S)$ is the maximal value which coalition $S$ guarantees for itself if its complement $N \setminus S$ acts against it in a zero-sum game, provided that network $\bar{g}^{\tau}$ is given.

To fulfill condition (40) for all stages, we replace players’ stage payoffs with payments (and we also add payments at the network formation stage) according to an imputation distribution procedure (IDP) which reallocates the Shapley value over time and players’ stage payoffs at each game stage. In this model the IDP $\beta = \{\beta_{it}\}_{i \in N,\, t \geq 0}$ of the Shapley value $\Phi$ is determined in such a way that $\sum_{t=0}^{\infty} \delta^t \beta_{it} = \Phi_i$. The IDP $\beta$ is time consistent if for all $\tau \geq 1$ and all players, we have:

$$\Phi_i = \sum_{t=0}^{\tau - 1} \delta^t \beta_{it} + \delta^{\tau} \Phi_i(\bar{g}^{\tau}). \tag{41}$$

Dynamic games on networks of general structure. In this section, we formulate the results about strategic support of cooperation in the case of a dynamic game of a general form.

Proposition 15. The time-consistent IDP $\beta$ of the Shapley value $\Phi$ for each $i \in N$ is given by the following expressions:

$$\beta_{i0} = \Phi_i - \delta \Phi_i(\bar{g}^1),$$

$$\beta_{it} = \Phi_i(\bar{g}^t) - \delta \Phi_i(\bar{g}^{t+1}), \quad t \geq 1,$$

where $\bar{g}^{t+1} = \mathcal{T}(\bar{g}^t, \bar{u}(\bar{g}^t))$.

In general, the cooperative strategy profile $\bar{\eta}$ is not a Nash equilibrium; therefore, even when a time-consistent IDP of the Shapley value is implemented, a player may break the cooperative agreement and switch from her cooperative strategy to some other strategy. Below, a condition under which the cooperative outcome is supported by a Nash equilibrium is proposed. This result is obtained in a class of punishment strategies, which is a sub-class of strategies in the sense of Definition 9. A punishment strategy $\zeta_i$ of player $i \in N$ is determined in such a way that if no one deviates from her cooperative strategy, all players continue to follow these strategies, but if one player $i \in N$ deviates from her cooperative strategy $\bar{\eta}_i$ at some game stage, the remaining players from $N \setminus \{i\}$ start punishing her immediately from the next game stage onwards and never switch their strategies back to cooperative ones (in other words, starting from the next game stage players $i$ and $N \setminus \{i\}$ are involved in a zero-sum game in which $i$ tries to maximize her future payoff, whereas the coalition $N \setminus \{i\}$, acting as a single player, minimizes it).

Consider the following system of implicit inequalities with respect to $\delta$:

$$\begin{cases} \Phi_i \geq \delta V(g, \{i\}), \\ \Phi_i(g) \geq \kappa_i(g) + \delta V(\mathcal{T}(g, u(g)), \{i\}), \end{cases} \quad \text{for all } i \in N,\ g \in \mathcal{L},$$

where $\kappa_i(g)$ is the stage payoff to player $i$ if, deviating from her cooperative strategy, she plays a best response to her opponents’ cooperative strategies in network $g$, and $\mathcal{L} \subseteq \mathcal{G}(N)$ is the set of networks generated by the cooperative strategy profile $\bar{\eta}$. Here $\delta$ implicitly appears in $\Phi_i$, $\Phi_i(g)$, $V(g, \{i\})$, and $V(\mathcal{T}(g, u(g)), \{i\})$. The system above is reduced to the following:

$$\delta \leq \min_{i \in N} \min_{g \in \mathcal{L}} \left\{ \frac{\Phi_i}{V(g, \{i\})};\ \frac{\Phi_i(g) - \kappa_i(g)}{V(\mathcal{T}(g, u(g)), \{i\})} \right\}. \tag{42}$$

Let there exist $\delta$ such that the minimum in the right-hand side of (42) exceeds it.

Proposition 16 (Petrosyan and Sedakov, 2015). For any $\delta$ that solves (42), the strategy profile $(\zeta_1, \dots, \zeta_n)$ with players’ payoffs $\Phi_1, \dots, \Phi_n$ guaranteed by the time-consistent IDP $\beta$ is a Nash equilibrium.

It is worth noting that the system (42) may not have a solution in $(0, 1)$. In that case, we cannot obtain the cooperative outcome $\Phi_1, \dots, \Phi_n$ as a result of a Nash equilibrium.

Repeated Network Games. Repeated games form a class of dynamic games in which a given normal-form game is played over either a finite or an infinite number of periods. In this part of the review, we suppose that there is one stage at which players create a network structure, and after this stage a normal-form game on the network is repeated over an infinite number of periods. In other words, $U_i(g) = U_i(g') = U_i$ for all $i \in N$ and $g \neq g'$, and for any $g \in \mathcal{G}(N)$, we have $\mathcal{T}(g, u(g)) = g$.

In (Petrosyan, Sedakov and Bochkarev, 2013; Petrosyan and Sedakov, 2015) the structure of players’ cooperative strategies was proposed for two-stage games (one network formation stage and one game stage). Since the considered game is repeated, the structure of players’ cooperative strategies in this game is the same. Specifically, at the network formation stage players should choose $\bar{g}_i$, $i \in N$, and form network $\bar{g}$, and from this stage on players do not change the network while choosing controls $\bar{u}_i$, i.e., $\bar{d}_i(\bar{g}) = \bar{g}_i$.

Assuming that players behave cooperatively, again, the Shapley value is taken as a solution of the game. The entries of the Shapley value $\Phi$ are $\Phi_i = \frac{\delta}{1 - \delta} \varphi_i$, $i \in N$, where $\varphi_i$ is the entry of the Shapley value in any of the stage games determined by the characteristic function

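$$v(S) = \begin{cases} \sum\limits_{i \in N} K_i\big(\bar{u}_i, \bar{u}_{N_i(\bar{g})}\big), & S = N, \\ \max\limits_{g_i,\, i \in S}\ \max\limits_{u_i,\, i \in S}\ \sum\limits_{i \in S} K_i\big(u_i, u_{N_i(g) \cap S}\big), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset. \end{cases}$$

Therefore, $V(S) = \frac{\delta}{1 - \delta} v(S)$, $S \subseteq N$. The Shapley value $\Phi(\bar{g})$ in the infinite-horizon game starting from network $\bar{g}$ has a similar form: $\Phi_i(\bar{g}) = \frac{1}{1 - \delta} \varphi_i(\bar{g})$, $i \in N$, where $\varphi_i(\bar{g})$ is the entry of the Shapley value in any of the stage games determined by the characteristic function

$$v(\bar{g}, S) = \begin{cases} \sum\limits_{i \in N} K_i\big(\bar{u}_i, \bar{u}_{N_i(\bar{g})}\big), & S = N, \\ \max\limits_{u_i \in U_i,\, i \in S}\ \sum\limits_{i \in S} K_i\big(u_i, u_{N_i(\bar{g}) \cap S}\big), & S \subset N,\ S \neq \emptyset, \\ 0, & S = \emptyset, \end{cases}$$

provided that the network $\bar{g}$ is given. Therefore, $V(\bar{g}, S) = \frac{1}{1 - \delta} v(\bar{g}, S)$, $S \subseteq N$.

Due to time inconsistency of the solution, the allocation is realized with the use of an imputation distribution procedure. In the case of repeated games, the time-consistent IDP $\beta$ for the Shapley value $\Phi$ is of the form:

$$\beta_{i0} = \frac{\delta}{1 - \delta} \big(\varphi_i - \varphi_i(\bar{g})\big),$$

$$\beta_{it} = \varphi_i(\bar{g}), \quad i \in N, \quad t = 1, 2, \dots$$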
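As a short worked check, offered as a sketch that uses only the formulas stated above, the payments of this IDP discount back to the Shapley value and satisfy the time-consistency condition (41) for every $\tau \geq 1$. The second line uses $\bar{g}^{\tau} = \bar{g}$, which holds in the repeated game because the network does not change.

```latex
\begin{align*}
\sum_{t=0}^{\infty} \delta^t \beta_{it}
  &= \frac{\delta}{1-\delta}\bigl(\varphi_i - \varphi_i(\bar{g})\bigr)
     + \varphi_i(\bar{g}) \sum_{t=1}^{\infty} \delta^t
   = \frac{\delta}{1-\delta}\,\varphi_i = \Phi_i, \\
\sum_{t=0}^{\tau-1} \delta^t \beta_{it} + \delta^{\tau} \Phi_i(\bar{g})
  &= \frac{\delta}{1-\delta}\bigl(\varphi_i - \varphi_i(\bar{g})\bigr)
     + \frac{\delta - \delta^{\tau}}{1-\delta}\,\varphi_i(\bar{g})
     + \frac{\delta^{\tau}}{1-\delta}\,\varphi_i(\bar{g})
   = \frac{\delta}{1-\delta}\,\varphi_i = \Phi_i .
\end{align*}
```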

Under the cooperative agreement, players create network $\bar{g}$ by profile $(\bar{g}_1, \dots, \bar{g}_n)$ at the network formation stage, and do not change it while choosing $(\bar{u}_1, \dots, \bar{u}_n)$ at each subsequent stage. So, if player $i \in N$ deviates from cooperative behavior at the network formation stage, she gets the value $\frac{\delta}{1 - \delta} v(\{i\})$ as her payoff in the game. However, if she deviates at a game stage, she will play her best response to the strategies of the other players and get the value $\kappa_i = \max_{u_i \in U_i} K_i\big(u_i, \bar{u}_{N_i(\bar{g})}\big)$ at this stage, and after the deviation her future payoff will be $\frac{\delta}{1 - \delta} v(\bar{g}, \{i\})$. Therefore, player $i$ will never switch from her cooperative strategy $\bar{\eta}_i$ to any other strategy if

$$\Phi_i \geq \frac{\delta}{1 - \delta} v(\{i\}) \quad \text{and} \quad \Phi_i(\bar{g}) \geq \kappa_i + \frac{\delta}{1 - \delta} v(\bar{g}, \{i\}).$$

These two inequalities can be simplified to $\varphi_i \geq v(\{i\})$, which always holds, and

$$\delta \geq 1 - \frac{\varphi_i(\bar{g}) - v(\bar{g}, \{i\})}{\kappa_i - v(\bar{g}, \{i\})} \quad \text{if } \kappa_i \neq v(\bar{g}, \{i\}).$$

If $\kappa_i = v(\bar{g}, \{i\})$, we have $\Phi_i(\bar{g}) \geq v(\bar{g}, \{i\}) + \frac{\delta}{1 - \delta} v(\bar{g}, \{i\}) = \frac{1}{1 - \delta} v(\bar{g}, \{i\}) = V(\bar{g}, \{i\})$, which always holds. Then we have:

Proposition 17 (Petrosyan and Sedakov, 2015). For any $\delta \geq \delta^*$, where

$$\delta^* = \max_{\substack{i \in N: \\ \kappa_i > v(\bar{g}, \{i\})}} \left\{ 1 - \frac{\varphi_i(\bar{g}) - v(\bar{g}, \{i\})}{\kappa_i - v(\bar{g}, \{i\})} \right\}, \tag{43}$$

the strategy profile $(\zeta_1, \dots, \zeta_n)$ with players’ payoffs $\Phi_1, \dots, \Phi_n$ guaranteed by the time-consistent IDP $\beta$ is a Nash equilibrium.
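The threshold (43) is straightforward to evaluate once the stage-game quantities are known. The Python sketch below is an illustration under stated assumptions: the function name `critical_discount_factor`, the dictionary-based inputs, and the numbers in the usage part are hypothetical, and returning 0.0 when no player has a profitable one-shot deviation is a convention chosen here, since (43) then maximizes over an empty set.

```python
def critical_discount_factor(phi, v_single, kappa):
    """delta* from (43).

    phi[i]      -- local Shapley value phi_i(g_bar) of player i,
    v_single[i] -- value v(g_bar, {i}) of the one-player coalition,
    kappa[i]    -- best one-shot deviation payoff kappa_i.
    Only players with kappa_i > v(g_bar, {i}) enter the maximum.
    """
    candidates = [
        1 - (phi[i] - v_single[i]) / (kappa[i] - v_single[i])
        for i in phi
        if kappa[i] > v_single[i]
    ]
    # Assumption: with no binding player, any delta in (0, 1) works.
    return max(candidates) if candidates else 0.0


if __name__ == "__main__":
    phi = {1: 3.0, 2: 3.0}
    v_single = {1: 1.0, 2: 2.0}
    kappa = {1: 5.0, 2: 6.0}
    print(critical_discount_factor(phi, v_single, kappa))
```

With these toy numbers player 1 requires $\delta \geq 0.5$ and player 2 requires $\delta \geq 0.75$, so the binding threshold is the larger of the two.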

References Abreu, D., Dutta, P. and Smith, L. (1994). The Folk theorem for repeated games: a NEU condition. Econometrica, 62, 939–948. Aumann, R. and Shapley, L. (1994). Long-Term Competition—A Game-Theoretic Anal- ysis. In: Megiddo N. (ed) Essays in Game Theory. In Honor of Michael Maschler, Springer-Verlag, pp. 1–15. Bala, V. and Goyal, S. (2000). A non-cooperative model of network formation. Economet- rica, 68(5), 1181–1231. Bramoull´e, Y. and Kranton, R. (2007). Public goods in networks. Journal of Economic Theory, 135(1), 478–494. Bulgakova, M. and Petrosyan, L. (2015). The Shapley value for the network game with pairwise interactions. International Conference on ”Stability and Control Processes” in Memory of V.I. Zubov, SCP, pp. 229–232. Bulgakova, M. and Petrosyan, L. (2016). About strongly time-consistency of core in the network game with pairwise interactions. Proceedings of 2016 International Conference ”Stability and Oscillations of Nonlinear Control Systems” (Pyatnitskiy’s Conference), STAB 2016, pp. 229–232. Butenko, M. and Petrosyan, L. (2014). A combined solution concept in a multistage network game. Proceedings of the XLV International Conference on Control Processes and Stability (CPS14), pp. 452–457. Butenko, M. and Petrosyan, L. (2015). A two-step solution concept in a network game with shock of a special type. Proceedings of the XLVI International Conference on Control Processes and Stability (CPS15), pp. 573–578. Corbae, D. and Duffy, J. (2008). Experiments with network formation. Games and Eco- nomic Behavior, 64, 81–120. Driessen, T. S. H. and Funaki, Y. (1991). Coincidence of and collinearity between game theoretic solutions. OR Spektrum, 13(1), 15–30. Dutta, B., Van den Nouweland, A. and Tijs, S. (1998). Link formation in cooperative situations. International Journal of Game Theory, 27, 245–256. Feng, X., Zhang, W., Zhang, Y. and Xiong, X. (2014). Information identification in differ- ent networks with heterogeneous information sources. 
Journal of Systems Science and Complexity, 27(1), 92–116. Feri, F. (2007). Stochastic stability in networks with decay. Journal of Economic Theory, 135, 442–457. 66 Hongwei Gao, Yaroslavna Pankratova

Feri, F. and Meléndez-Jiménez, M. (2013). Coordination in evolving networks with endogenous decay. Journal of Evolutionary Economics, 23, 955–1000.
Fosco, C. and Mengel, F. (2011). Cooperation through imitation and exclusion in networks. Journal of Economic Dynamics & Control, 35, 641–658.
Galeotti, A. and Goyal, S. (2010). The Law of the Few. American Economic Review, 100(4), 1468–1492.
Galeotti, A., Goyal, S. and Kamphorst, J. (2006). Network formation with heterogeneous players. Games and Economic Behavior, 54, 353–372.
Gao, H., Dai, Y., Li, W., Song, L. and Lv, T. (2010). One-Way Flow Dynamic Network Formation Games with Coalition-Homogeneous Costs. Contributions to Game Theory and Management, 3, 104–117.
Gao, H., Liu, Z. and Dai, Y. (2011). The Dynamic Procedure of Information Flow Network. Contributions to Game Theory and Management, 4, 172–187.
Gao, H., Petrosyan, L., Qiao, H. and Sedakov, A. (2017). Cooperation in two-stage games on undirected networks. Journal of Systems Science and Complexity, 30(3), 680–693.
Gao, H., Petrosyan, L. and Sedakov, A. (2015). Dynamic Shapley value for repeated network games with shock. Control and Decision Conference (CCDC), 2015 27th Chinese, pp. 6449–6455.
Goyal, S. and Vega-Redondo, F. (2005). Network formation and social coordination. Games and Economic Behavior, 50, 178–207.
Haller, H. (2012). Network extension. Mathematical Social Sciences, 64, 166–172.
Igarashi, A. and Yamamoto, Y. (2013). Computational Complexity of a Solution for Directed Graph Cooperative Games. Journal of the Operations Research Society of China, 1(3), 405–413.
Jackson, M. (2008). Social and economic networks. Princeton: Princeton University Press.
Jackson, M. and Watts, A. (2002). On the formation of interaction networks in social coordination games. Games and Economic Behavior, 41(2), 265–291.
Jackson, M. and Wolinsky, A. (1996). A strategic model of social and economic networks. Journal of Economic Theory, 71, 44–74.
Kuhn, H. W. (1953). Extensive games and the problem of information. Contributions to the Theory of Games II (ed. by Kuhn H.W. and Tucker A.W.), Princeton, 193–216.
Lu, X., Li, J. and Yang, F. (2010). Analyses of location-price game on networks with stochastic customer behavior and its heuristic algorithm. Journal of Systems Science and Complexity, 23(4), 701–714.
Myerson, R. (1997). Game Theory: Analysis of conflict. Harvard University Press.
Parilina, E. (2014). Strategic stability of one-point optimality principles in cooperative stochastic games. Matematicheskaya Teoriya Igr i Ee Prilozheniya, 6(1), 56–72.
Petrosjan, L. A. (2006). Cooperative stochastic games. In: Haurie A., Muto S., Petrosjan L. A., Raghavan T. E. S. (eds) Advances in Dynamic Games Applications to Economics, Management Science, Engineering, and Environmental Management Series: Annals of the International Society of Dynamic Games, Basel: Birkhäuser, 52–59.
Petrosyan, L. A. (1977). Stability of solutions in differential games with many participants. Vestnik Leningradskogo Universiteta. Ser 1. Matematika Mekhanika Astronomiya, 19, 46–52.
Petrosyan, L. A. (2005). Cooperative differential games. Annals of the International Society of Dynamic Games. Applications to Economics, Finance, Optimization, and Stochastic Control, (ed. by Nowak A.S. and Szajowski K.), Basel, 183–200.
Petrosyan, L. (2008). Strategically supported cooperation. International Game Theory Review, 10(4), 471–480.
Petrosyan, L. A. and Danilov, N. N. (1979). Stability of solutions in non-zero sum differential games with transferable payoffs. Vestnik Leningradskogo Universiteta. Ser 1. Matematika Mekhanika Astronomiya, 1, 52–59.
Petrosyan, L. A. and Sedakov, A. A. (2009). Multistage network games with perfect information. Matematicheskaya teoriya igr i ee prilozheniya, 1(2), 66–81.

Petrosyan, L., Sedakov, A. (2014). One-way flow two-stage network games. Vestnik of Saint Petersburg State University. Ser 10: Applied Mathematics, Informatics, Control Processes, 4, 72–81.
Petrosyan, L., Sedakov, A. (2015). Strategic support of cooperation in dynamic games on networks. Proceedings of the International Conference on "Stability and Control Processes" in Memory of V.I. Zubov, SCP, pp. 256–260.
Petrosyan, L., Sedakov, A. (2016). The Subgame-Consistent Shapley Value for Dynamic Network Games with Shock. Dynamic Games and Applications, 6(4), 520–537.
Petrosyan, L. A., Sedakov, A. A. and Bochkarev, A. O. (2013). Two-stage network games. Matematicheskaya teoriya igr i ee prilozheniya, 5(4), 84–104.
Petrosyan, L., Zenkevich, N. (2009). Principles of dynamic stability. Matematicheskaya Teoriya Igr i Ee Prilozheniya, 1(1), 106–123.
Shapley, L. S. (1953). A value for N-person games. Contributions to the Theory of Games II (ed. by Kuhn H.W. and Tucker A.W.). Princeton, 307–317.
Von Neumann, J. and Morgenstern, O. (1944). Theory of Games and Economic Behavior. Princeton: Princeton University Press.
Xie, F., Cui, W. and Lin, J. (2013). Prisoners dilemma game on adaptive networks under limited foresight. Complexity, 18, 38–47.
Watts, A. (2001). A Dynamic Model of Network Formation. Games and Economic Behavior, 34, 331–341.
Yeung, D. W. K. and Petrosyan, L. A. (2006). Cooperative Stochastic Differential Games. Springer-Verlag, New York.
Yeung, D. W. K. and Petrosyan, L. A. (2012). Subgame Consistent Economic Optimization. Birkhäuser.

Contributions to Game Theory and Management, X, 68–78

Games with Incomplete Information on the Both Sides and with Public Signal on the State of the Game⋆

Misha Gavrilovich and Victoria Kreps
National Research University Higher School of Economics, St. Petersburg
St. Petersburg Institute for Economics and Mathematics, RAS

Abstract Supposing that Player 1's computational power is higher than that of Player 2, we give three examples of different kinds of public signal about the state of a two-person zero-sum game with symmetric incomplete information on both sides (neither player knows the state of the game) where Player 1, due to his computational power, learns the state of the game while this is impossible for Player 2. That is, the game with incomplete information on both sides becomes a game with incomplete information on the side of Player 2. Thus we demonstrate that information about the state of a game may arise not only from a private signal but also from a public signal combined with asymmetric computational resources of the players.
Keywords: zero-sum game; incomplete information; asymmetry; finite automata.

1. Introduction
The literature on repeated games with incomplete information usually assumes that players have unlimited computational capacity. Since in practice this assumption does not hold, it is important to study whether and how its absence affects the predictions of the theory.
We consider zero-sum games of players with limited computational capacity, and discuss how these limitations may affect the information structure of the game. We show how a difference in computational resources may give rise to informational asymmetry in an otherwise symmetric game.
Our model of limited computational resources is similar to the model of Abraham Neyman (Neyman, 1997; Neyman, 1998). The strategies available to players are limited to finite automata of different sizes.
Starting with the seminal papers by Rubinstein (Rubinstein, 1986) and by Abreu and Rubinstein (Abreu and Rubinstein, 1988), a number of papers have appeared on repeated games where strategies of players are implemented by finite automata. These papers investigate properties of the set of equilibrium payoffs under this assumption. For an extensive bibliography on the subject see Hernández and Solan (Hernández and Solan, 2016).
We are interested in another aspect. Supposing that Player 1's computational power is higher than that of Player 2, we give three examples of different kinds of public signal about the state of a two-person zero-sum game with symmetric incomplete information on both sides (neither player knows the state of the game, but both know its probability) where Player 1, due to his computational power, learns the state of the game while this would be impossible for Player 2.
⋆ Support from Basic Research Program of the National Research University Higher School of Economics is gratefully acknowledged. This study was partially supported by the grant 16-01-00124-a of Russian Foundation for Basic Research.

In our examples each player chooses a finite automaton. Both chosen automata are given a signal depending on the state of the game. Intuitively it is clear that the higher computational power of Player 1 may let him "know" the state of the game while this would be impossible for Player 2. That is, if Player 1 "computes" the state of the game with the help of his computational resources and Player 2 does not, the game with incomplete information on both sides becomes a game with incomplete information on the side of Player 2. Thus we demonstrate that knowledge of the state of a game may arise not only from private information (a private signal) but also as a result of a public signal and the computational resources of the players.
In the first two examples both chosen automata are given a signal consisting of a string of 1's whose length depends on the state of the game. In the first example the signal is deterministic. In the second example the signal is random.
In the third example both chosen automata are given a random signal consisting of a string of 0's and 1's. The state of the game is determined by the value of the bit (0 or 1) at a certain fixed distance from the end of the string.
In the first example, where the signal is deterministic, Player 2 gets no new information on the state of the game. In the second and third examples, with random signals, Player 2 re-estimates the probability of the state. Hence the players are faced with a game with incomplete information on the side of Player 2 where the posterior probability of the state known to both players is more accurate than the prior one.

2. Games under consideration

As our approach only deals with revealing the information about the state of the game before the game starts, the number of stages is irrelevant. So we are not concerned with repetition of a game and do not go beyond the analysis of games which are played once.
We base our consideration on the classical setting of matrix games with incomplete information on one side and with incomplete information on both sides (see (Harsanyi, 1967-68; Aumann and Maschler, 1995)).
I. The case of symmetric incomplete information on both sides.
Let $\mathcal{A}(p)$ denote the matrix game with incomplete information on both sides given by two square payoff matrices $A_1$ and $A_2$. Before the game starts a chance move determines the "state of nature" $k \in K = \{1, 2\}$ and therefore the payoff matrix $A_k$: with probability $p$ the matrix $A_1$ is played and with probability $1-p$ the matrix $A_2$ is played. Both players know the probability $p$ and do not know the result of the chance move.
As a matter of fact, in such a game with incomplete information on both sides the players are faced with the matrix game given by the payoff matrix $A(p) = pA_1 + (1-p)A_2$. We will denote the matrix game given by a payoff matrix $B$ by the same symbol $B$.
The value $Val\,\mathcal{A}(p)$ is a continuous function of $p$ over the interval $[0, 1]$, where $Val\,\mathcal{A}(0) = Val\,A_2$ and $Val\,\mathcal{A}(1) = Val\,A_1$, since $p$ being equal to 0 or 1 means that the players know what game is played: if $p = 0$ then it is $A_2$ and if $p = 1$ then it is $A_1$.
Note that the absence of information on the state of the game on both sides may be profitable for one player and unprofitable for the other. Consider a simple example: $2\times 2$ matrices with $Val\,A_1 = Val\,A_2 = 0$,
$$A_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad A_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \quad \text{and so} \quad A(p) = \begin{pmatrix} 0 & p \\ 1-p & 0 \end{pmatrix}.$$
It is easy to calculate that $Val\,\mathcal{A}(p) = p(1-p) > 0$ for $0 < p < 1$. Hence in the game $\mathcal{A}(p)$ Player 1 can guarantee a positive payoff, while his guaranteed payoff is only zero if both players know what game is played.
If in the matrices $A_1$ and $A_2$ the elements equal to 1 are replaced by elements equal to $-1$, then $Val\,\mathcal{A}(p) = -p(1-p)$ is negative. In this case it is Player 2 who profits from the absence of information on both sides.
II. The case of incomplete information on the side of Player 2.
Now consider the same game as in case I, but Player 1 is informed of the result of the chance move while Player 2 is not. So Player 1 knows exactly what game is played. Player 2 has no such information. Player 2 knows that Player 1 is the informed player. Let $\mathcal{A}^{asy}(p)$ denote this game with incomplete information on the side of Player 2.
Naturally, in this game Player 1 can guarantee himself not less than in the game $\mathcal{A}(p)$. In any case he may play as though he "forgets" the obtained information. But usually it is profitable for him to use his knowledge of the state of the game.
We demonstrate this for the example of the matrices $A_1$ and $A_2$ given in case I. As Player 1 knows exactly what game is played, he chooses the first row if it is $A_1$ and the second row if it is $A_2$. The best reply of Player 2, who has no information on the state, is to choose the first column with probability $p$ and the second column with probability $1-p$. Thus Player 1's guaranteed payoff ($1-p$ if $A_1$ is played or $p$ if $A_2$ is played) is greater than his guaranteed payoff $p(1-p)$ in the game $\mathcal{A}(p)$ with lack of information on both sides.
It is known (Aumann and Maschler, 1995) that the value $Val\,\mathcal{A}^{asy}(p)$ is a continuous piecewise linear concave function over $[0, 1]$ and, as in the previous case, $Val\,\mathcal{A}^{asy}(0) = Val\,A_2$ and $Val\,\mathcal{A}^{asy}(1) = Val\,A_1$.
In this paper we consider a case which is in a certain sense intermediate between I and II: neither player knows the state of the game, but there is some additional information on this state besides its probability $p$.
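The comparison between cases I and II for the example matrices can be checked numerically. The Python sketch below is an illustration only: the helper names `value_2x2`, `mixed_matrix` and `informed_payoffs` are hypothetical and do not come from the paper, and the closed form used for a 2×2 game without a saddle point is the standard textbook formula rather than anything specific to this setting.

```python
from fractions import Fraction

A1 = ((0, 1), (0, 0))
A2 = ((0, 0), (1, 0))


def value_2x2(a):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a11, a12), (a21, a22) = a
    maximin = max(min(a11, a12), min(a21, a22))
    minimax = min(max(a11, a21), max(a12, a22))
    if maximin == minimax:  # saddle point in pure strategies
        return Fraction(maximin)
    # No saddle point: both players mix (standard closed form).
    return Fraction(a11 * a22 - a12 * a21) / (a11 + a22 - a12 - a21)


def mixed_matrix(p):
    """Payoff matrix A(p) = p*A1 + (1-p)*A2 of the game without information."""
    return tuple(
        tuple(p * x + (1 - p) * y for x, y in zip(row1, row2))
        for row1, row2 in zip(A1, A2)
    )


def informed_payoffs(p):
    """Payoffs (in state 1, in state 2) of the strategy described in case II:
    Player 1 plays row 1 in A1 and row 2 in A2, Player 2 mixes columns (p, 1-p)."""
    q = (p, 1 - p)
    in_state_1 = sum(qj * a for qj, a in zip(q, A1[0]))
    in_state_2 = sum(qj * a for qj, a in zip(q, A2[1]))
    return in_state_1, in_state_2


p = Fraction(1, 3)
print(value_2x2(mixed_matrix(p)))  # p(1-p) = 2/9
print(informed_payoffs(p))         # (1-p, p)
```

Exact rational arithmetic (`Fraction`) is used so that the identity with $p(1-p)$ can be compared without rounding.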
For considerably more complicated cases, Gensbittel (for infinite action spaces) (Gensbittel, 2016) and Gensbittel, Oliu-Barton, Venel (for an evolution of states) (Gensbittel et al., 2014) deal with another intermediate informational structure: the informed player does not observe the state variable directly but receives a stochastic signal whose distribution depends on the state variable. The authors generalize several classical asymptotic results concerning zero-sum repeated games with incomplete information on one side.

III. The case of symmetric information, public signal and automata.
Let $\mathcal{A}_f^{m,n}(p)$ denote a modification of the game $\mathcal{A}(p)$ with incomplete information on both sides (neither player knows the matrix chosen). Here $f$ is an injective function of the state of the game, $f(k)$, $k = 1, 2$. The codomain of the function $f$ is the set of binary strings (i.e. strings consisting of the symbols 0 and 1) of arbitrary length. The function $f$ is known to both players.
Each player chooses a finite automaton. Player 1 chooses an arbitrary automaton (Automaton 1) of size at most $m$, and Player 2 chooses an arbitrary automaton (Automaton 2) of size bounded by $n$, where $m > n$.
Note that the number of automata of size at most $m$ is exponential in $m$, of order $m^{2m} 2^{m+2m} = 2^{2m\log m+3m}$. This rough estimate is obtained as follows. There are exactly two edges leaving each of the $m$ vertices; there are $m$ possibilities for an edge leaving a given vertex. This gives $m^{2m}$ possibilities for the choice of the edges of the automaton. There are $m$ vertices and $2m$ edges, each labelled by either 0 or 1. This gives $2^{m+2m}$ possibilities to choose the labels of the vertices and edges.
Thus the number of possibilities for Player 1 to choose the automaton is of order $2^{2m\log m+3m}$ while for Player 2 this number is $2^{2n\log n+3n}$. This implies that Player 1 has exponentially more options than Player 2 if $m \gg n$. Both players know the size of the automaton of the opponent.
A meaningful strategy of choosing an automaton is as follows. A player runs each automaton of appropriate size on the input $f(k)$, $k = 1, 2$, and chooses one whose output is $k$, if such an automaton exists. In our examples such an automaton exists for Player 1 but not for Player 2.
After the players have chosen their automata, the game sends the public signal $f(k)$, $k = 1, 2$. This signal is received by the chosen automata, which compute their responses. The output of Automaton $i$ is interpreted by Player $i$ as an indication of the state of the game $\mathcal{A}(p)$. Payoffs of the players are determined accordingly.
In Sections 4-5 we give examples of functions $f$ and numbers $m$, $n$ depending on $f$ such that the size $m$ of Player 1's automaton allows him to "compute" the state of the game but the size $n$ of Player 2's automaton is not sufficient for this purpose. Hence Player 1 learns the state of the game and the game $\mathcal{A}_f^{m,n}(p)$ turns into the game with incomplete information on the side of Player 2.
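The rough count of automata given above can be evaluated directly. The following Python sketch is only an illustration: the function names are hypothetical, and the logarithm in the exponent is assumed to be base 2, which the text does not state explicitly.

```python
from math import log2


def automata_count_estimate(size):
    """Rough count m^(2m) * 2^(3m): edge targets times vertex and edge labels."""
    edge_targets = size ** (2 * size)   # 2m edges, m possible targets each
    labels = 2 ** (size + 2 * size)     # m vertex labels and 2m edge labels
    return edge_targets * labels


def log2_automata_count_estimate(size):
    """The same estimate written as an exponent of 2: 2m*log2(m) + 3m."""
    return 2 * size * log2(size) + 3 * size


for size in (4, 8, 16):
    print(size, log2_automata_count_estimate(size))
```

For sizes that are powers of two the two functions agree exactly, for example `automata_count_estimate(4)` equals `2 ** 28`.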

3. Automaton

For the theory of finite automata see the textbooks (Sakarovitch, 2009; Kobrinskii and Trakhtenbrot, 1965); here we give a quick overview and introduce the notation we use.
An automaton is represented by a connected labelled directed graph with a finite set of vertices.
• One vertex is distinguished as the initial vertex $v_0$.
• Each edge of the graph is labelled by either 0 or 1.
• Each vertex is labelled by either 0 or 1.
• There are exactly two edges leaving each vertex, one labelled 0 and one labelled 1.
• There is no restriction on how many edges enter a vertex.

In our context a label on a vertex represents the output of the automaton, which is interpreted as a state of the game. As input the automaton receives a signal which is a binary string; labels on the edges correspond to the symbols of the binary string.
Next we explain how an automaton computes. The computation proceeds as follows: the automaton receives a string $s_1 \dots s_l$ of 0's and 1's from the game. Intuitively, we think that the automaton reads the symbols one by one: it starts at the initial vertex $v_0$ and, upon reading the symbol $s_1$, moves to the vertex $v_1$ by the unique edge labelled $s_1$ coming out of $v_0$. Then, upon reading the symbol $s_2$, it moves to the vertex $v_2$ by the unique edge labelled $s_2$ coming out of $v_1$, and so on. Thus there is a unique path of edges starting from the initial vertex $v_0$
$$v_0 \xrightarrow{s_1} v_1 \xrightarrow{s_2} v_2 \xrightarrow{s_3} \dots \xrightarrow{s_{l-1}} v_{l-1} \xrightarrow{s_l} v_l$$
such that the edge from the vertex $v_{i-1}$ to the vertex $v_i$ is labelled by $s_i$, $1 \le i \le l$. The output of the automaton is the label of the end vertex $v_l$ of this path.
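The computation just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `run_automaton`, the encoding of an automaton as a list of edge pairs plus a list of vertex labels, and the parity automaton used as an example are all assumptions made here.

```python
def run_automaton(edges, labels, signal, start=0):
    """Run a finite automaton on a binary string and return its output.

    edges[v]  = (vertex reached from v by the edge labelled 0,
                 vertex reached from v by the edge labelled 1)
    labels[v] = label (0 or 1) of vertex v, i.e. the output if the run ends at v
    """
    v = start
    for symbol in signal:
        v = edges[v][int(symbol)]  # follow the unique edge labelled by the symbol
    return labels[v]


# Hypothetical two-vertex automaton: its output is the parity of the number of 1's.
parity_edges = [(0, 1), (1, 0)]
parity_labels = [0, 1]

print(run_automaton(parity_edges, parity_labels, "1101"))  # three 1's, so the output is 1
```

The encoding enforces the requirement that exactly two edges leave each vertex, one per input symbol, while placing no restriction on incoming edges.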

4. Results. Degenerate cases
In the first two examples the automaton is degenerate because the signal consists only of 1's and thus only edges labelled 1 matter. In the first case the signal is deterministic, in the second case it is random.
4.1. Example 1: degenerate deterministic case
Consider the function $f(k) = 1^{kn!}$, $k = 1, 2$; here $1^{kn!}$ denotes the string $1 \dots 1$ consisting of 1 repeated $kn!$ times.
Theorem 1. For $m > n$ such that $m$ does not divide $n!$, Player 1 deciphers the signal while Player 2 does not.
Proof. The proof is based on the following observation. Consider the computation of an automaton $G$ with $n$ vertices on a string of 1's, i.e., equivalently, the path $e_1 \dots e_l$ of edges in $G$ labelled by 1. If $l > n$, then the path necessarily has a cycle, i.e. for some $n'$ it holds that $e_i = e_{i+n'}$ for all $i > n$.
Then the output of the automaton $G$ with $n$ vertices is the same for the two strings $1^i$ and $1^{i+rn'}$, where $r$ is a positive integer. Thus the output is the same for any two strings whose lengths are more than $n$ and have the same remainders modulo $n'$. As $n'$ divides $n!$ for each $1 \le n' \le n$, we get that for any automaton of size at most $n$ the output is the same for the strings $f(1) = 1^{n!}$ and $f(2) = 1^{2n!}$.
This proves that the automaton of Player 2 cannot distinguish the two strings $f(1) = 1^{n!}$ and $f(2) = 1^{2n!}$.
On the other hand, as $m$ does not divide $n!$ by the hypothesis of the theorem, it is easy to construct an automaton of size $m$ which distinguishes these two strings, as follows. Consider the automaton whose edges labelled by 1 form a single cycle of size $m$. As $m$ does not divide $n!$, the remainders modulo $m$ of the lengths of $f(1)$ and $f(2)$ are different, so the end vertices of the paths corresponding to the two signals are different. Now label them with different appropriate actions. Hence this automaton correctly distinguishes the states of the game, and therefore it is optimal for Player 1 to choose this automaton in order to be able to use the information about the state of the game.
Thus Player 1 knows the state of the game while Player 2 does not.
Corollary. Under the hypothesis of Theorem 1, in the game $\mathcal{A}_f^{m,n}(p)$ the players are faced with the game $\mathcal{A}^{asy}(p)$ with incomplete information on the side of Player 2. Thus it may be said that the game $\mathcal{A}_f^{m,n}(p)$ is equivalent to the game $\mathcal{A}^{asy}(p)$.
4.2. Example 2: degenerate random case
Definition. We say that the game $\mathcal{A}_f^{m,n}(p)$ with incomplete information on both sides is ε-equivalent to the game $\mathcal{A}^{asy}(p)$ with incomplete information on the side of Player 2 if in the game $\mathcal{A}_f^{m,n}(p)$ the players are faced with the game $\mathcal{A}^{asy}(p')$ where $|p - p'| < \varepsilon$.
Now we consider the game $\mathcal{A}_f^{m,n}(p)$ where the function $f(k)$, $k \in K$, is random. For simplicity assume that $m$ is prime.
The signal $f(1)$ is a string of 1's of a random length $l$, where $l$ takes values uniformly among $m, 2m, \dots, (m-1)m, m^2$.
The signal $f(2)$ is a string of 1's of a random length $l$, where $l$ takes values uniformly among $m+1, 2m+1, \dots, (m-1)m+1, m^2+1$.
Theorem 2. For $m > 100n$ the game $\mathcal{A}_f^{m,n}(p)$ is ε-equivalent to the game $\mathcal{A}^{asy}(p)$ with $\varepsilon = 0.05$.
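The statements of Theorem 1 and Theorem 2 can be checked by brute force for small parameters. The Python sketch below is an illustration under stated assumptions, not part of the paper. Since the signals consist only of 1's, an automaton is represented just by its 1-edges (a map `succ` on vertices, with vertex 0 initial); unreachable vertices are allowed, which only enlarges the set of automata examined; vertex labels are taken to be the guessed state, 1 or 2; and $\alpha_1$, $\alpha_2$ are read as the probabilities of a given output under state 1 and state 2. All function names and the parameter choices ($n = 4$, $m = 5$ and $n = 2$, $m = 211$) are illustrative.

```python
from itertools import product
from math import factorial


def orbit(succ, max_len):
    """Vertices visited from initial vertex 0 after reading 0..max_len ones."""
    path, v = [0], 0
    for _ in range(max_len):
        v = succ[v]
        path.append(v)
    return path


def unary_automata(max_size):
    """All maps on the 1-edges for automata with at most max_size vertices."""
    for k in range(1, max_size + 1):
        yield from product(range(k), repeat=k)


def example1_holds(n, m):
    """Theorem 1 check: no automaton of size <= n separates 1^(n!) from 1^(2n!),
    while the single cycle of size m does (m is assumed not to divide n!)."""
    a, b = factorial(n), 2 * factorial(n)
    small_cannot = all(orbit(s, b)[a] == orbit(s, b)[b] for s in unary_automata(n))
    cycle = tuple((i + 1) % m for i in range(m))
    big_can = orbit(cycle, b)[a] != orbit(cycle, b)[b]
    return small_cannot and big_can


def max_posterior_shift(n, m, priors):
    """Example 2 check: largest |p' - p| over Player 2's automata, outputs, priors."""
    len1 = [j * m for j in range(1, m + 1)]      # possible lengths of f(1)
    len2 = [j * m + 1 for j in range(1, m + 1)]  # possible lengths of f(2)
    worst = 0.0
    for succ in unary_automata(n):
        path = orbit(succ, m * m + 1)
        for labels in product((1, 2), repeat=len(succ)):
            for out in (1, 2):
                a1 = sum(labels[path[l]] == out for l in len1) / m
                a2 = sum(labels[path[l]] == out for l in len2) / m
                for p in priors:
                    denom = a1 * p + a2 * (1 - p)
                    if denom > 0:  # skip outputs that never occur
                        worst = max(worst, abs(a1 * p / denom - p))
    return worst


print(example1_holds(n=4, m=5))
print(max_posterior_shift(n=2, m=211, priors=[i / 20 for i in range(1, 20)]))
```

For these parameters the first call confirms the separation claimed in Theorem 1, and the second returns a posterior shift well below the bound 0.05 of Theorem 2 (211 is prime and exceeds $100n = 200$).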

Remark. If we replace the condition $m > 100n$ by $m > Cn$ for a positive integer constant $C$, then $\varepsilon = 5/C$.
Proof. By the definition of the signal $f$, the remainders modulo $m$ of the lengths of $f(1)$ and $f(2)$ are different. As in the previous proof, Player 1 may pick an automaton which runs through a cycle of length exactly $m$ and hence distinguishes $f(1)$ and $f(2)$. Hence Player 1 learns what game is played.
We will show that, regardless of the prior probability $p$, Player 2 is able to correctly guess the state of the game with probability at most $1/2 + 0.01$. Indeed, as before, any automaton that Player 2 is allowed to choose has the property that there is n′

1)/[m/n′] < 1.01, as the probability $\alpha_1$, respectively $\alpha_2$, is the sum of the probabilities of the signals that make the automaton output 1, respectively 2. Thus $\alpha_1 \le 1.01\alpha_2$ and $\alpha_2 \le 1.01\alpha_1$.
Now use the Bayes formula to re-estimate the prior probability $p$ of state 1:
$$p' = \frac{\alpha_1 p}{\alpha_1 p + \alpha_2 (1-p)} = p\,\frac{\alpha_1}{\alpha_1 + (\alpha_2 - \alpha_1)(1-p)} \le p\,\frac{\alpha_1}{\alpha_1 - 0.01\alpha_1 (1-p)} < 1.05p.$$
So it holds that $p' < 1.05p$.
Hence in the game $\mathcal{A}_f^{m,n}(p)$ the players are faced with the game $\mathcal{A}^{asy}(p')$ where $|p - p'| < \varepsilon$. Thus the game $\mathcal{A}_f^{m,n}(p)$ is 0.05-equivalent to the game $\mathcal{A}^{asy}(p)$.
5. Result. Non-degenerate random case
In the games above we considered signals consisting of only one symbol repeated many times. Much shorter signals suffice for the same effect (namely, that Player 1 can differentiate between the states but Player 2 cannot) if one considers signals using at least two different symbols.
Here we consider the game $\mathcal{A}_f^{m,n}(p)$ where a random signal $f(k)$, $k \in K$, consists of a binary string of both 0's and 1's.
For simplicity assume that $m = 2^L$ for some integer $L$.
To define $f(1)$, consider the probability distribution over the set of binary strings of length $L \le l < 2L$ such that the probability of a string $s_1 \dots s_l$ is 0 if $s_{l-L} = 1$, and is $2^{-l}/L$ if $s_{l-L} = 0$. For this distribution the probability of a string having size $l$ is $1/L$. The signal $f(1)$ takes its value according to this distribution.
Similarly, to define $f(2)$, consider the probability distribution over the set of binary strings of length $L \le l < 2L$ such that the probability of a string $s_1 \dots s_l$ is 0 if $s_{l-L} = 0$, and is $2^{-l}/L$ if $s_{l-L} = 1$. As before, for this distribution the probability of a string having size $l$ is $1/L$. The signal $f(2)$ takes its value according to this distribution.
Note that for the uniform distribution on binary strings of length $l$ such that $L \le l < 2L$ the probability that a string has length $l$ is equal to $2^{L-l}$.
Observe that the signal described above is significantly shorter than in Theorem 2, namely signals of length $l < 2L = 2\log_2 m$ are shorter than signals of length $m$.

Theorem 3. Fix an $0 < \varepsilon \le 0.1$ and an integer $L$. Let $m > 2^{2\varepsilon L} L$ and $n < \exp(2\varepsilon^2 L)$. The game $\mathcal{A}_f^{m,n}(p)$ is ε-equivalent to the game $\mathcal{A}^{asy}(p)$.
Remark. The hypothesis of Theorem 3 assumes that $m$ is substantially larger than $n$. For example, one may take $L = 1000$, $\varepsilon = 0.1$ and $n = e^{19}$. Then the theorem requirements $m = 2^{1000} > 2^{200} \cdot 1000$, $n = e^{19} < e^{20} = \exp(2\varepsilon^2 L)$ are fulfilled and $m = 2^{1000} \approx 10^{301}$, $n = e^{19} \approx 10^{8}$.
To prove the theorem we need the following lemmas.
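The numerical example in the remark to Theorem 3 can be verified with exact arithmetic. The Python sketch below is only a check of that arithmetic; the function name `check_theorem3_example` is hypothetical, and $n = e^{19}$ is handled through its natural logarithm to avoid floating-point overflow and rounding.

```python
from fractions import Fraction
from math import e, log10


def check_theorem3_example(L=1000, eps=Fraction(1, 10), ln_n=19):
    """Check m = 2^L and n = e^ln_n against the hypotheses of Theorem 3."""
    m = 2 ** L                           # exact big integer
    m_bound = 2 ** int(2 * eps * L) * L  # 2^(2*eps*L) * L
    ln_n_bound = 2 * eps ** 2 * L        # natural logarithm of exp(2*eps^2*L)
    return m > m_bound, ln_n < ln_n_bound


print(check_theorem3_example())  # (True, True)
# Decimal orders of magnitude of m = 2^1000 and n = e^19:
print(round(1000 * log10(2)), round(19 * log10(e)))  # 301 8
```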

Lemma 1 Let $S$ be a set of strings of length $L$. Pick randomly and uniformly both an integer number k 0. If $p_S \ge 1/2 + \varepsilon$ then the size of $S$ is bounded above by $2^L \cdot 10\exp(-2\varepsilon^2 L)$.

Remark. One way to construct such a set where $p_S = 1/2 + \varepsilon$ is to take the set of all strings of length $L$ starting with $1 \dots 1$ repeated $[2\varepsilon L]$ times, where $[2\varepsilon L]$ denotes the least integer not less than $2\varepsilon L$. Note that
$$|S| = 2^{(-2\varepsilon+1)L} < 2^L \cdot 10\exp(-2\varepsilon^2 L)$$
as $\varepsilon < 1$.
Proof. Let $S$ be a set such that $p_S \ge 1/2 + \varepsilon$. The inequality $2^{(-2\varepsilon+1)L} < 2^L \cdot 10\exp(-2\varepsilon^2 L)$ implies we may assume that $|S| > 2^{(-2\varepsilon+1)L}$. Now split $S$ into three disjoint sets:
$$S = S_1 \cup S_2 \cup S_3$$
where $S_1$ is the subset of strings containing more than $(1/2 + 2\varepsilon)L$ occurrences of 1's; $S_2$ is the subset of strings containing not more than $(1/2 + 2\varepsilon)L$ but more than $(1/2 + \varepsilon/2)L$ occurrences of 1's. Finally, $S_3$ is the subset of strings containing not more than $(1/2 + \varepsilon/2)L$ occurrences of 1's. We have the following equality:
$$|S| = |S_1| + |S_2| + |S_3|.$$
Estimating the number of $k$'s such that $s_k = 1$ among strings $s_1 \dots s_L \in S$, we also get
$$\varepsilon|S| \le |S_1| + 2\varepsilon|S_2| + \varepsilon/2\,|S_3|.$$
Then
$$\varepsilon(|S_2| + |S_3|) \le 3\varepsilon|S_2| + \varepsilon/2\,|S_1|$$
and thus
$$\varepsilon/2\,|S_3| \le 3\varepsilon|S_2| \quad \text{and} \quad |S_3| \le 6|S_2|.$$
Finally, by the Chernoff bound (see for example (Hagerup, 1990))
$$|S_2| < 2^L \cdot \exp(-(2\varepsilon)^2 L/2) = 2^L \cdot \exp(-2\varepsilon^2 L),$$
$$|S_1| < 2^L \cdot \exp(-(\sqrt{8}\varepsilon)^2 L/2) = 2^L \cdot \exp(-4\varepsilon^2 L).$$
Hence
$$|S| \le 2^L[\exp(-4\varepsilon^2 L) + \exp(-2\varepsilon^2 L) + 6\exp(-2\varepsilon^2 L)] \le 2^L \cdot 10\exp(-2\varepsilon^2 L),$$
thereby proving the lemma.
Remark. Note that there is another proof of the lemma using entropy bounds (see for example (Borda, 2011)).
Now let us estimate how often an automaton of size $n$ may correctly guess the state of the game, i.e. what is the probability that it outputs 1 when receiving the signal $f(1)$.

Lemma 2 An automaton of size $n < 2^L \exp(-2\varepsilon^2 L)$ guesses correctly with probability at most $\varepsilon + 1/2$.

Proof. The automaton has at most $n$ states (vertices) $v_1, \dots, v_n$. Let $V_i$ be the set of strings of length exactly $L$ such that the automaton is in state $v_i$ after reading the string. Let $p_i = 1/2 + \varepsilon_i$ be the probability that the automaton guesses correctly after reading an input string whose first $L$ bits are in $V_i$. The automaton output depends on $v_i$ and the last $l - L$ bits of the input string. Note that these last bits are irrelevant for the correctness of the output. At least for some choice of the rest of the string the conditional probability of success is at least $p_i$. Without loss of generality let this string be $1 \dots 1$. Then we see by Lemma 1 that
$$|V_i| \le \exp((1 - 2\varepsilon_i^2)L).$$
The overall probability of success is at most
$$1/2 + \varepsilon \le \sum_{1 \le i \le n} p_i |V_i| \le \sum_{1 \le i \le n} p_i \cdot \exp(-2\varepsilon_i^2 L).$$
We have to estimate the number of summands. First notice that we may disregard summands where $\varepsilon_i > 2\varepsilon$ as we need at least $2^{\varepsilon}$ of them to get $\varepsilon/2$. By a calculation similar to the calculation above, the proportion of summands where $\varepsilon_i < \varepsilon/2$ cannot be more than 5/6. This implies that we need a number of summands of order $\exp(-2\varepsilon^2 L)$. This completes the proof of the lemma.

Lemma 3 There is an automaton of size of order $L 2^{2\varepsilon L}$ which guesses correctly with probability at least $\varepsilon + 1/2$.

Proof. The automaton is constructed as follows. Let $l = [2\varepsilon L]$ be the least integer not less than $2\varepsilon L$. It has states $v_{s_1 \dots s_i}$ where $s_1 \dots s_i$, $1 \le i \le l$, runs through strings of 0's and 1's of length at most $l$.
Vertices $v_{s_1 \dots s_i}$ and $v_{s_1 \dots s_i s_{i+1}}$ are connected by an edge labelled $s_{i+1}$. Further there are states $v^k_{s_1 \dots s_l}$, $l \le k \le L + l$; for $l \le k$

Assume Player 2's automaton pointed to state 2, whose probability is $1 - p$. Similarly we get $1 - p \le 1 - p' \le (1 + \varepsilon)(1 - p)$. Hence $(1 - p') \le (1 + \varepsilon)(1 - p)$ and thus $p' \ge (1 + \varepsilon)p - \varepsilon \ge (1 - \varepsilon)p$ for $p \ge 1/2$. Hence the game is ε-equivalent to the game $\mathcal{A}^{asy}(p')$.

Conclusions
We consider zero-sum games with incomplete information on both sides with a public signal about the state of the game. Supposing that Player 1's computational power is higher than that of Player 2, we give three examples of different kinds of public signal where Player 1 learns the state of the game while this is impossible for Player 2. Thus we show that a player may receive information about the state of a game due to a public signal and his computational resources.
Note that boundedness of the players' computational resources is equivalent (in a certain sense) to considering effectively computable strategies only. Hence we demonstrate that such a restriction may change the information structure of the game.
We hope to use this effect to shed some light on the open problem of existence of the value of stochastic games formulated in (Mertens, 1986) (see also (Mertens et al., 2015)). Introduced by Shapley (Shapley, 1953), stochastic games model dynamic interactions in which the current state of the game depends on the behavior of the players. These games are games with complete information: players know the current state of the game. We plan to construct an example of a stochastic game for which the solution does not exist in the class of effectively computable strategies.
The authors thank Fedor Sandomirski for valuable remarks and references.

References
Neyman, A. (1997). Cooperation, repetition and automata. In: Hart, S., Mas-Colell, A. (Eds.), Cooperation: Game-Theoretic Approaches. NATO ASI Series F, vol. 155. Springer-Verlag, 233–255.
Neyman, A. (1998). Finitely repeated games with finite automata. Math. Oper. Res., 23, 513–552.
Rubinstein, A. (1986). Finite automata play the repeated prisoner's dilemma. J. Econ. Theory, 39, 83–96.
Abreu, D. and A. Rubinstein (1988). The structure of Nash equilibrium in repeated games with finite automata. Econometrica, 56, 1259–1281.
Hernández, P. and E. Solan (2016). Bounded computational capacity equilibrium. Journal of Economic Theory, 163, 342–364.
Harsanyi, J. (1967-68). Games with Incomplete Information Played by Bayesian Players. Parts I to III. Management Science, 14, 159–182, 320–334, and 486–502.
Aumann, R. and M. Maschler (1995). Repeated Games with Incomplete Information. The MIT Press: Cambridge, Massachusetts - London, England.
Gensbittel, F. (2016). Continuous-time limits of dynamic games with incomplete information and a more informed player. To appear in Int. J. of Game Theory.
Gensbittel, F., Oliu-Barton, M., H. Venel (2014). Existence of the uniform value in repeated games with a more informed controller. J. of Dynamics and Games, 1(3), 411–445.
Sakarovitch, J. (2009). Elements of Automata Theory. Cambridge University Press.
Kobrinskii, N. and B. Trakhtenbrot (1965). Introduction to the Theory of Finite Automata. Amsterdam, North-Holland.
Hagerup, T. (1990). A guided tour of Chernoff bounds. Information Processing Letters, 33(6), 305. doi:10.1016/0020-0190(90)90214-I.
Borda, M. (2011). Fundamentals in Information Theory and Coding. Springer.
Mertens, J.-F. (1986). Repeated games. Proceedings of the International Congress of Mathematicians, Berkeley, California, USA, 1528–1577.
Mertens, J.-F., Sorin, S., Zamir, S. (2015). Repeated games (Econometric Society Monographs). Cambridge Univ. Press.
Shapley, L. (1953). Stochastic Games. Proc. Nat. Acad. Sci. U.S.A., 39, 1095–1100.

Contributions to Game Theory and Management, X, 79–93

Static Game Theoretic Models of Coordination of Private and Public Interests in Economic Systems⋆

Olga I. Gorbaneva1 and Guennady A. Ougolnitsky2
Southern Federal University, Institute of Mathematics, Mechanics and Computer Sciences,
Mil'chakova str. 8a, Rostov-on-Don, 344090, Russia
E-mail: [email protected] [email protected]
WWW home page: http://mmcs.sfedu.ru

Abstract A problem of inefficiency of equilibria (system compatibility) in static game theoretic models of resource allocation is investigated. It is shown that system compatibility in such models is possible if and only if all agents are individualists or collectivists. Administrative and economic control mechanisms providing system compatibility are analyzed.
Keywords: coordination of interests, hierarchical games, public goods economy, resource allocation.

1. Introduction
The problem of coordination of interests plays a key role in the investigation of social-economic systems based on mathematical modeling. The main research directions are the theory of active systems (Mechanism design and management, 2013; Novikov, 2013), the information theory of hierarchical systems (Gorelik and Kononenko, 1982), the theory of contracts (Laffont and Martimort, 2002), and mechanism design (Algorithmic Game Theory, 2007). An important role belongs to the notion of the price of anarchy, which characterizes the degree of coordination of interests of the active agents (Papadimitriou, 2001).
The paper is dedicated to static game theoretic models of coordination of private and public interests (CPPI-models) in resource allocation in economic systems. In a seminal paper by Germeier and Vatel (1975), models were analyzed in which the payoff functions of all players consist of two parts: a public one (the same for all players) and a private one. It is shown that if the payoff function is a convolution by minimum then, under natural assumptions, a Pareto-optimal Nash equilibrium exists in the game (the price of anarchy is equal to one), and an ideal coordination of interests is possible. The research was developed, for example, by Kukushkin (1994). A powerful stream of literature in this domain belongs to the public goods economy, which studies an optimal allocation of the resources of active agents between the production of a social good and their private activity (Bergstrom et al., 1986; Boadway et al., 1989a,b; Warr, 1983). Among recent papers are, for example, (Christodoulou et al., 2015), which is concerned with a mechanism of proportional allocation of divisible resources, and (Kahana and Klunover, 2016), in which conditions of optimal allocation of resources between leisure and labor are obtained for the case when individuals have the same utility function but different abilities and non-labor incomes. It should be noted that a complete coordination of private and public interests is attained extremely rarely, and special control mechanisms are designed to provide it. In a seminal paper by Burkov and Opoitsev (1974) an idea was proposed of optimal synthesis of a game of active agents in which the Nash equilibrium is profitable to the whole active system (the same idea is developed in mechanism design). The authors' papers (Gorbaneva and Ougolnitsky, 2013, 2015) deal with the analysis of system compatibility in resource allocation and the construction of the respective control mechanisms. A monograph by Gorbaneva et al. (2016) describes modeling of corruption in hierarchical control systems. Corruption is treated as an additional feedback on the bribe and a specific way of coordination of interests.
The rest of the paper is organized as follows. The problem setup is given in Section 2. The possibility of system compatibility in the model of coordination of private and public interests is studied in Section 3. Administrative and economic control mechanisms which provide system compatibility, or at least permit approaching it, are introduced and analyzed in Section 4. Section 5 concludes.

⋆ The work is supported by Russian Foundation of Basic Research, project No 17-06-00180.

2. The problem setup

Denote by $N = \{0, 1, 2, \dots, n\}$ the set of elements of an economic system, where 0 is the leader (Principal), and by $M = \{1, 2, \dots, n\}$ the set of followers (agents). The following types of games are considered:
(a) a game in normal form of equal players:
\[ \Gamma_{eq} = \langle M, U, J(u) \rangle = \langle \{1, 2, \dots, n\},\ (U_1, U_2, \dots, U_n),\ (g_1, g_2, \dots, g_n) \rangle; \]
(b) a hierarchical game:
\[ \Gamma_{hi} = \langle N, U, J(u) \rangle = \langle \{0, 1, 2, \dots, n\},\ (U_0, U_1, U_2, \dots, U_n),\ (g_0, g_1, g_2, \dots, g_n) \rangle. \]
Here $U_i$ is the set of strategies of the i-th player, $u_i \in U_i$ is a strategy of the i-th player, and $g_i$ is the payoff function of the i-th player.
The games are considered in the context of resource allocation between public and private interests (objectives). The game theoretic models are based on the approaches of the information theory of hierarchical systems (Germeier and Vatel, 1975; Kukushkin, 1994) and of public goods economics (Bergstrom et al., 1986; Boadway et al., 1989a,b; Warr, 1983; Christodoulou et al., 2015; Kahana and Klunover, 2016).
In the game $\Gamma_{eq}$ each player has an amount of resources $r_i$ which he allocates between public and private interests (production of a public good and his private economic activity, respectively). The strategy $u_i$ of a player is the amount of resources assigned to the production of the public good (public interests). The rest, $r_i - u_i$, finances his private activity. In this case $U_i = [0, r_i]$.
In the hierarchical game $\Gamma_{hi}$ it is assumed that the leader has an amount of resources $r$ which she allocates between the lower control level and her private activity. In turn, the lower control level shares its part between its followers and its private interests. Thus, in the hierarchical game the strategy of the i-th player is a share $u_i$ of the amount of resources $r u_1 u_2 \cdots u_{i-1}$ assigned to the public objectives. In this case $U_i = [0, 1]$. In both setups it is supposed that the public income is divided among the agents completely.
The payoff function of each agent consists of two summands reflecting his private income and his share in the public income, respectively:

\[ g_i = p_i(r_i - u_i) + s_i c(\bar u), \tag{1} \]
where $\bar u = (u_1, \dots, u_n)$ is the vector of resources assigned by the players to the production of a public good (public income);
$c(\bar u)$ is the public income function;
$p_i(r_i - u_i)$ is the private income function of the i-th player;
$s_i$ is the share of the i-th player in the public income, $i \in M$.
Thus, in the models of coordination of private and public interests (CPPI-models) the games described above are specified as follows:
(A) a game in normal form of equal players:

\[ \Gamma_{eq} = \langle M, U, J(u) \rangle = \langle \{1, 2, \dots, n\},\ U_i = [0, r_i],\ g_i = p_i(r_i - u_i) + s_i c(\bar u) \rangle; \tag{2} \]
(B) a hierarchical game:
\[ \Gamma_{hi} = \langle N, U, J(u) \rangle = \langle \{0, 1, 2, \dots, n\},\ U_i = [0, 1],\ g_i = p_i(r_i - u_i) + s_i c(\bar u) \rangle. \]
To build the hierarchical game on the basis of a game in normal form, the set of players $M$ is extended by a specific player 0 (leader, Principal) who represents the interests of the whole system. The set of strategies is extended by the Principal's vector strategy $k = (k_1, k_2, \dots, k_n)$, which is a set of control impacts on the other players depending on the type of control mechanism used. The vector of payoff functions is extended by the Principal's payoff, which is the social welfare function equal to the sum of the payoffs of all agents given the condition $\sum_{i \in M} s_i = 1$:
\[ g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u). \tag{3} \]
To provide system compatibility the following control mechanisms may be used:
(a) an administrative mechanism (compulsion), when the Principal constrains the sets of strategies of the agents, namely, she fixes amounts $q$ (scalar or vector ones) such that an agent cannot assign more or fewer resources to the public objectives:
\[ \pi_a = \{ k_i = q_i = (\underline{q}_i, \overline{q}_i) \mid 0 \le \underline{q}_i,\ \overline{q}_i \le r_i,\ \underline{q}_i \le u_i \le \overline{q}_i \}. \]
In this case $u_i \in U_i(q) \subset U_i$, where $q = (q_1, q_2, \dots, q_n)$ is a matrix of dimension $n \times 2$, $\bar u = (u_1, u_2, \dots, u_n)$, and the social welfare function is
\[ g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u) - C(q), \tag{4} \]
where $C(q)$ is the function of administrative control costs;
(b) an economic mechanism (impulsion), when the Principal affects the agents' payoff functions, namely, she sets the shares $s_i$ of their participation in the public income:
\[ \pi_e = \Bigl\{ k_i = s_i \;\Big|\; 0 \le s_i \le 1,\ \sum_{i \in M} s_i = 1 \Bigr\}. \tag{5} \]
Each of the two mechanisms can be applied with or without feedback. If feedback is present then we obtain a Germeier game with a mechanism

\[ \pi_G = \{ k_i = k_i(\bar u) \}; \tag{6} \]
otherwise a Stackelberg game arises with a mechanism

\[ \pi_{St} = \{ k_i = \mathrm{const} \}. \tag{7} \]

The Germeier games may be accompanied by a corruption mechanism

\[ \pi_b = \{ k_i = k_i(\bar u, \bar b) \}, \quad \bar b = (b_1, b_2, \dots, b_n), \tag{8} \]
where $b_i \in [0; 1]$ is the share of the bribe given to a bribe-taker by the agent. It is supposed here that the Principal-agents hierarchy is extended by another element (or elements), a supervisor. The Principal is not corrupt, but the actual control on her behalf is exercised by the supervisor, who can weaken the Principal's requirements in exchange for a bribe from an agent. The models of corruption are considered in detail in (Gorbaneva et al., 2016).
In the general case the considered game has the form

\[ \Gamma_{attr} = \langle N,\ \bar k \in K,\ \{V_i\}_{i=1}^n,\ \bar J,\ \Pi \rangle, \tag{9} \]
where $N$ is the set of players; $attr$ denotes the game's type (in normal form or hierarchical); $\bar k$ is the vector of the Principal's strategies: if an administrative mechanism is used then $\bar k = \bar q$ is a vector of resource constraints, while in an economic mechanism $\bar k = \bar s$ is a vector of shares of distribution of the public income, $\sum_{i \in M} s_i = 1$; $K$ is the set of the Principal's strategies, which is $K = \prod_{i \in M} [0; r_i]$ in the case of an administrative mechanism and $K = \prod_{i \in M} [0; 1]$ in the case of an economic one; $V_i$ are the sets of the agents' strategies: if a mechanism of corruption is used then $V_i = U_i \times [0; 1]$, otherwise $V_i = U_i$; $\bar J = (g_0, g_1, \dots, g_n)$ is the vector of the players' payoffs, where $g_0$ is the social welfare function maximized by the Principal, of the form $g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u)$, and $g_i$, $i \in M$, are the agents' payoff functions of the form $g_i = p_i(r_i - u_i) + s_i c(\bar u)$; $\Pi$ is the set of control mechanisms used by the Principal. Namely,

\[ \Pi = [\pi_a \vee \pi_e] \,\&\, [\pi_G \vee \pi_{St}] \,\&\, (1 \vee \pi_b), \]
that is, the set contains administrative or economic mechanisms, without or with feedback, possibly including corruption. For example, $\Gamma_{hi} = \langle \{0, 1, 2\},\ \bar k,\ \{V_i\}_{i=1}^n,\ \bar J,\ \{\pi_a \& \pi_G \& \pi_b\} \rangle$ denotes a hierarchical game between the Principal and two agents where the Principal uses an administrative mechanism with feedback and corruption, the Principal's strategy includes resource constraints $\bar k_i = q_i(\bar u, \bar b)$, the set of the Principal's feasible strategies is $K = \prod_{i \in M} [0; r_i]$, the set of an agent's feasible strategies is $V_i = U_i \times [0; 1]$ (including a share of bribe), and the vector of payoff functions $\bar J = (g_0, g_1, \dots, g_n)$ contains the Principal's payoff function $g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u)$ and the agents' payoff functions $g_i = p_i(r_i - u_i) + s_i c(\bar u)$.
Let's introduce in the game (9) an analogue of the price of anarchy (Papadimitriou, 2001). Denote by $NE = \{u^{NE}_{(1)}, \dots, u^{NE}_{(k)}\}$ the set of Nash equilibria in the game (2), by $u^{NE}_{(j)} = (u^{NE}_{(j)1}, \dots, u^{NE}_{(j)n})$ a game outcome, and let $g^{NE}_{0\min} = \min\{g_0(u^{NE}_{(1)}), \dots, g_0(u^{NE}_{(k)})\}$, $g_{0\max} = \max_{u \in U} g_0(u) = g_0(u^{\max})$. Then the price of anarchy in the model (9) is
\[ PA = \frac{g^{NE}_{0\min}}{g_{0\max}}. \tag{10} \]
It is evident that $PA \le 1$. If $PA$ is close to one then the efficiency of equilibria is high and the need for coordination in the model (9) is low or absent (when $PA = 1$); the lower $PA$ is, the greater is the need for coordination.
Two approaches, an empirical one and a theoretical one, can be used in the investigation of an economic control mechanism with feedback. In the empirical approach the methods of distribution of the public income widely used in practice are analyzed:
\[ [\pi_e \& \pi_G]_{emp} = \{ s_i = \bar s_i(u),\ s_i \text{ is given} \}. \]
The Principal only fixes the form of the function $s$ and then withdraws.
An example of the empirical approach is given by the method of proportional allocation, when the share of an agent in the public income is proportional to his share in the production of the public good:

\[ s_i(u) = \begin{cases} \dfrac{u_i}{\sum_{j \in M} u_j}, & \exists m: u_m \ne 0, \\ 0, & \text{otherwise.} \end{cases} \tag{11} \]
The theoretical approach is based on building the economic mechanism that is optimal for the Principal and takes the interests of the agents into account, using Germeier's theorem (Gorelik and Kononenko, 1982):
\[ [\pi_e \& \pi_G]_{G2} = \{ s_i = s_i(u),\ s_i \text{ is found} \}. \]
An administrative mechanism without feedback may be implemented in several variants:
1) The Principal controls the resource allocation only from below, namely, she fixes amounts $q_i$ such that an agent cannot assign fewer resources to the public objectives:
\[ \pi_a \& \pi_{St} = \{ k_i = q_i \in R,\ i = 1, \dots, n \mid 0 \le q_i \le r_i \}. \]
In this case $q_i \le u_i \le r_i$, and the social welfare function has the form
\[ g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u) - C(q_1, q_2, \dots, q_n). \]
2) The Principal controls the resource allocation from both sides, and in turn two cases are possible:
(a) the control amounts for the agents are different, namely, the Principal fixes for each agent thresholds $\underline{q}_i$ and $\overline{q}_i$ such that the agent cannot assign more or fewer resources to the public objectives:
\[ \pi_a \& \pi_{St} = \{ k_i = (\underline{q}_i, \overline{q}_i) \in R^2,\ i = 1, \dots, n \mid 0 \le \underline{q}_i \le \overline{q}_i \le r_i \}. \]
In this case $\underline{q}_i \le u_i \le \overline{q}_i$, and the social welfare function has the form
\[ g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u) - C(\underline{q}_1, \overline{q}_1, \underline{q}_2, \overline{q}_2, \dots, \underline{q}_n, \overline{q}_n). \]
(b) the control amounts for the agents are the same, namely, the Principal fixes thresholds $\underline{q}$ and $\overline{q}$ such that each agent cannot assign more or fewer resources to the public objectives:
\[ \pi_a \& \pi_{St} = \{ k = (\underline{q}, \overline{q}) \in R^2 \mid 0 \le \underline{q} \le \overline{q} \le 1 \}. \]
In this case $\underline{q} r_i \le u_i \le \overline{q} r_i$, and the social welfare function has the form
\[ g_0 = \sum_{i \in M} p_i(r_i - u_i) + c(\bar u) - C(\underline{q}, \overline{q}). \]
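The building blocks of this section (the agent payoff (1), the social welfare (3) and the proportional allocation rule (11)) can be sketched in a few lines of Python. This is only an illustration under assumed data: the square-root income functions, the resource vector and all function names below are hypothetical choices and are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence


def proportional_shares(u: Sequence[float]) -> List[float]:
    """Rule (11): s_i = u_i / sum_j u_j if some u_m != 0, otherwise all shares are 0."""
    total = sum(u)
    if total == 0:
        return [0.0 for _ in u]
    return [u_i / total for u_i in u]


def agent_payoffs(u, r, p, c: Callable, shares) -> List[float]:
    """Formula (1): g_i = p_i(r_i - u_i) + s_i * c(u)."""
    public_income = c(u)
    return [p_i(r_i - u_i) + s_i * public_income
            for u_i, r_i, p_i, s_i in zip(u, r, p, shares)]


def social_welfare(u, r, p, c: Callable) -> float:
    """Formula (3): g_0 = sum_i p_i(r_i - u_i) + c(u)."""
    return sum(p_i(r_i - u_i) for u_i, r_i, p_i in zip(u, r, p)) + c(u)


# Assumed illustrative data (not from the paper)
r = [1.0, 2.0]                                   # resources r_i
p = [math.sqrt, math.sqrt]                       # private income functions p_i
c = lambda u: 2.0 * math.sqrt(sum(u))            # public income function
u = [0.5, 1.0]                                   # contributions to the public good

s = proportional_shares(u)
g = agent_payoffs(u, r, p, c, s)
g0 = social_welfare(u, r, p, c)
```

Since the shares sum to one whenever something is contributed, the agents' payoffs add up to the social welfare, which is exactly the condition under which (3) is stated.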

3. System compatibility in the base model

Let's consider the game theoretic model (2) in normal form. It is supposed that:
- the function $c$ monotonically increases in all $u_i$, $c(0, \dots, 0) = 0$;
- the functions $p_i$ monotonically increase in $(r_i - u_i)$ and monotonically decrease in $u_i$, $p_i(0) = 0$ (when $u_i = r_i$);
- if $u_i > 0$ then $s_i > 0$, $i = 1, \dots, n$. The variant $\sum_{i=1}^n s_i = 0$ corresponds to the case when $u_i = 0$ for all $i$; then the public income is not produced and there is nothing to share.

Definition 1. A model is system compatible if $PA = 1$.

Theorem 1. Suppose that the functions $c$ and $p_i$ are increasing and concave, $p_i(0) = 0$, $c(0) = 0$. Then system compatibility holds if and only if the set of agents consists of two classes: individualists $I$ ($u_i = 0$) and collectivists $C$ ($u_i = r_i$).

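Proof. Denote
\[ x_i = \Bigl( s_i c'\Bigl( \sum_{i=1}^n u_i \Bigr) - p_i'(r_i - u_i) \Bigr)^{-1}(0), \]

\[ y_i = \Bigl( c'\Bigl( \sum_{i=1}^n u_i \Bigr) - p_i'(r_i - u_i) \Bigr)^{-1}(0), \]
that is, $x_i$ and $y_i$ are the roots in $u_i$ of the corresponding first order conditions. Then

\[ u_i^{NE} = \begin{cases} 0, & x_i < 0, \\ x_i, & 0 < x_i < r_i, \\ r_i, & x_i > r_i, \end{cases} \tag{12} \]
\[ u_i^{\max} = \begin{cases} 0, & y_i < 0, \\ y_i, & 0 < y_i < r_i, \\ r_i, & y_i > r_i. \end{cases} \tag{13} \]
It is seen that the values of the strategies coincide on the bounds of the segment $[0, r_i]$, i.e. when the agent is an individualist or a collectivist. Let's prove that the internal values do not coincide. As $s_i < 1$, we have $s_i c'\bigl( \sum_{i=1}^n u_i \bigr) < c'\bigl( \sum_{i=1}^n u_i \bigr)$, therefore $s_i c'\bigl( \sum_{i=1}^n u_i \bigr) - p_i'(r_i - u_i) < c'\bigl( \sum_{i=1}^n u_i \bigr) - p_i'(r_i - u_i)$, so that in the interior $u_i^{NE} < u_i^{\max}$.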

The conditions of Nash equilibrium in the model (2) can be characterized as
$\frac{1}{n} c(r_1, \dots, r_k, 0, \dots, 0) \ge p_i(r_i)$, $i = 1, \dots, k$ (the transition $C \to I$ is not profitable);
$p_j(r_j) \ge \frac{1}{n} c(r_1, \dots, r_k, 0, \dots, r_j, \dots, 0)$, $j = k + 1, \dots, n$ (the transition $I \to C$ is not profitable).

We have

\[ g_0^I = g_0(0, \dots, 0) = \sum_{j=1}^n p_j(r_j), \]
\[ g_0^C = g_0(r_1, \dots, r_n) = c(r_1, \dots, r_n), \]
\[ g_0^{NE} = g_0(u^{NE}) = c(r_1, \dots, r_k, 0, \dots, 0) + \sum_{j=k+1}^n p_j(r_j). \]
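The interior case of Theorem 1 and the price of anarchy (10) can be illustrated numerically. The sketch below assumes a hypothetical symmetric game of two agents with $r_i = 3$, $p_i(x) = \sqrt{x}$, $c(\bar u) = 2\sqrt{u_1 + u_2}$ and fixed shares $s_i = 1/2$; none of these values come from the paper. The Nash equilibrium is found by best-response iteration, and the welfare optimum is searched among symmetric profiles, which is justified here only because the assumed welfare function is concave and symmetric.

```python
import math

R = 3.0          # resource of each agent (assumed)
SHARE = 0.5      # fixed equal shares s_1 = s_2 = 1/2 (assumed)


def p(x):
    """Private income function (assumed concave)."""
    return math.sqrt(x)


def c(total):
    """Public income as a function of the total contribution (assumed concave)."""
    return 2.0 * math.sqrt(total)


def agent_payoff(u_i, u_other):
    """Payoff (1) of one agent."""
    return p(R - u_i) + SHARE * c(u_i + u_other)


def welfare(u1, u2):
    """Social welfare (3)."""
    return p(R - u1) + p(R - u2) + c(u1 + u2)


def argmax_concave(f, lo, hi, iters=200):
    """Ternary search for the maximizer of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


# Nash equilibrium by alternating best responses
u1 = u2 = 0.5
for _ in range(100):
    u1 = argmax_concave(lambda x: agent_payoff(x, u2), 0.0, R)
    u2 = argmax_concave(lambda x: agent_payoff(x, u1), 0.0, R)

# Welfare optimum among symmetric profiles
u_max = argmax_concave(lambda x: welfare(x, x), 0.0, R)

price_of_anarchy = welfare(u1, u2) / welfare(u_max, u_max)
print(round(u1, 4), round(u_max, 4), round(price_of_anarchy, 4))
```

In this assumed example the first order conditions can also be solved by hand: the equilibrium contribution is $r/3 = 1$, the welfare-optimal one is $2r/3 = 2$, and $PA = 4\sqrt{2}/6 \approx 0.943 < 1$, in line with the interior inequality $u_i^{NE} < u_i^{\max}$.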

4. Control mechanisms in static CPPI-models
Thus, the condition of system compatibility $PA = 1$ is rarely satisfied by itself, and therefore special control mechanisms are required to provide it. Suppose that maximization of the social welfare (3) is the objective of a specific agent (Principal, leader, social planner, mechanism designer) who can affect the sets of feasible strategies (administrative mechanism) and/or the payoff functions (economic mechanism) of the other agents in order to implement this objective. Denote the first possibility by $U_i = U_i(q_i)$, and the second one by $g_i = g_i(p_i, u_i)$. Both types of impact may be used without or with feedback on control. In the first case a hierarchical game of the type G1 (Stackelberg game) arises, in the second one a hierarchical game of the type G2 (Germeier game) (Gorelik and Kononenko, 1982). Thus, four types of control mechanisms are possible.
Definition 2. A control mechanism $k$ in the model (9) is system compatible if the optimal answer of the players $u(k)$ makes the model system compatible.

Economic control mechanisms $\pi_e$ in the model (9) are implemented by the Principal's choice of the values $s_i$:

\[ \pi_e = \Bigl\{ k_i = s_i \;\Big|\; 0 \le s_i \le 1,\ \sum_{i \in M} s_i = 1 \Bigr\}. \]
Administrative control mechanisms mean that the Principal can constrain the feasible strategies of the agents:

\[ \pi_a = \{ k_i = (\underline{q}_i, \overline{q}_i) \mid 0 \le \underline{q}_i,\ \overline{q}_i \le r_i,\ \underline{q}_i \le u_i \le \overline{q}_i \}. \]
4.1. Economic mechanisms without a feedback

Suppose that a control mechanism $\pi_e \& \pi_{St} = \{ k_i = s_i \mid 0 \le s_i \le 1,\ \sum_{i \in M} s_i = 1 \}$ is implemented. The first order conditions show that system compatibility in the interior of the domain of feasible strategies is possible only in the degenerate case. Thus, under $\pi_e \& \pi_{St}$, system compatibility in the model (9) means as a rule that all agents are individualists or collectivists. If the condition of system compatibility is not satisfied then the problem of coordination of interests can be posed in a weaker form: to build an economic control mechanism which maximizes the price of anarchy (10).
4.2. Economic mechanisms with a feedback
Suppose now that a control mechanism $\pi_e \& \pi_G = \{ k_i = s_i(u) \mid 0 \le s_i(u) \le 1,\ \sum_{i \in M} s_i = 1 \}$ is implemented in the model. The first order conditions show that system compatibility in the interior of the domain of feasible strategies is possible only if

\[ \frac{\partial s_i(u)}{\partial u_i}\, c(u) = [1 - s_i(u)]\, \frac{\partial c(u)}{\partial u_i}, \quad i \in M. \]

Within the empirical approach $[\pi_e \& \pi_G]_{emp} = \{ s_i = \bar s_i(u),\ s_i \text{ is given} \}$, widespread practical methods of distribution of the public income are investigated. For example, consider the mechanism of proportional allocation (11).
Theorem 2. A mechanism of proportional allocation is system compatible if and only if the function $c(x)$ is linear.

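Proof. In this case the condition of system compatibility takes the form

\[ \sum_{j \ne i} u_j \Bigl[ \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i} \sum_{j \in M} u_j - c\Bigl( \sum_{j \in M} u_j \Bigr) \Bigr] = 0, \quad i \in M. \]
Let's solve the equation
\[ \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i} \sum_{j \in M} u_j - c\Bigl( \sum_{j \in M} u_j \Bigr) = 0. \]
Transform:
\[ \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i} \sum_{j \in M} u_j = c\Bigl( \sum_{j \in M} u_j \Bigr), \qquad \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr) / \partial u_i}{c\bigl( \sum_{j \in M} u_j \bigr)} = \frac{1}{\sum_{j \in M} u_j}, \]
\[ \ln c\Bigl( \sum_{j \in M} u_j \Bigr) = \ln \sum_{j \in M} u_j + \ln \hat c(u_{-i}), \]
where $\ln \hat c(u_{-i})$ is an integration constant with respect to $u_i$, so that
\[ c\Bigl( \sum_{j \in M} u_j \Bigr) = \hat c(u_{-i}) \sum_{j \in M} u_j. \]
The function $c$ depends only on the sum $\sum_{j \in M} u_j$, and $\hat c(u_{-i})$ does not depend on the sum. Therefore, $\hat c(u_{-i}) = \mathrm{const} = c$, which means $c\bigl( \sum_{j \in M} u_j \bigr) = c \cdot \sum_{j \in M} u_j$. The theorem is proved.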
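Theorem 2 can be checked on a small grid. The sketch below assumes two agents with resources $r = (1, 2)$, a common private income function $p(x) = \sqrt{x}$, and compares a linear public income $c(x) = x$ with a concave one $c(x) = 2\sqrt{x}$; these functions, the grid size and all names are illustrative assumptions. For each $c$ it finds the welfare optimum on the grid and tests whether that profile is a Nash equilibrium under the proportional rule (11).

```python
import math

R = [1.0, 2.0]   # resources r_i (assumed values)


def p(x):
    """Private income, the same concave function for both agents (assumption)."""
    return math.sqrt(x)


def share(i, u):
    """Proportional allocation rule (11)."""
    total = sum(u)
    return u[i] / total if total > 0 else 0.0


def agent_payoff(i, u, c):
    """Payoff (1) of agent i under proportional allocation."""
    return p(R[i] - u[i]) + share(i, u) * c(sum(u))


def welfare(u, c):
    """Social welfare (3)."""
    return sum(p(R[i] - u[i]) for i in range(len(u))) + c(sum(u))


def grid(hi, n=200):
    """Uniform grid on [0, hi]."""
    return [hi * k / n for k in range(n + 1)]


def best_response(i, u, c):
    """Grid argmax of agent i's payoff, the other agent's strategy held fixed."""
    def replaced(x):
        v = list(u)
        v[i] = x
        return v
    return max(grid(R[i]), key=lambda x: agent_payoff(i, replaced(x), c))


def welfare_optimum(c):
    """Grid argmax of the social welfare over both strategies."""
    pairs = ((a, b) for a in grid(R[0]) for b in grid(R[1]))
    return max(pairs, key=lambda u: welfare(u, c))


def is_equilibrium(u, c):
    """True if no agent gains by a unilateral deviation on the grid."""
    return all(best_response(i, u, c) == u[i] for i in range(len(u)))


linear_c = lambda x: 1.0 * x                # linear public income
concave_c = lambda x: 2.0 * math.sqrt(x)    # strictly concave public income

compatible_linear = is_equilibrium(welfare_optimum(linear_c), linear_c)
compatible_concave = is_equilibrium(welfare_optimum(concave_c), concave_c)
```

With a linear $c$ the proportional rule gives $g_i = p_i(r_i - u_i) + c\,u_i$, so each agent maximizes exactly his own term of the welfare function and the optimum is an equilibrium; with the concave $c$ the agent at the welfare optimum still gains by contributing more, so the check fails.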

Recall that a function $f(x_1, x_2, \dots, x_n)$ is symmetrical with respect to the variables $x_1, x_2, \dots, x_n$ if a permutation of any pair of the variables does not change the form of the function, i.e. for any $i, j$, $1 \le i, j \le n$, it holds that $f(x_1, x_2, \dots, x_i, \dots, x_j, \dots, x_n) = f(x_1, x_2, \dots, x_j, \dots, x_i, \dots, x_n)$.

Theorem 3. An allocation mechanism si(u) is system compatible if and only if the function si(u) is symmetrical by ui.

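Proof. The first order conditions are:

\[ \begin{cases} \dfrac{\partial p(u_i)}{\partial u_i} = \dfrac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i}, \\[2ex] \dfrac{\partial p(u_i)}{\partial u_i} = \dfrac{\partial s_i(u_i)}{\partial u_i}\, c\Bigl( \sum_{j \in M} u_j \Bigr) + s_i(u)\, \dfrac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i}, \end{cases} \]
or
\[ 0 = \frac{\partial s_i(u_i)}{\partial u_i}\, c\Bigl( \sum_{j \in M} u_j \Bigr) + (s_i(u) - 1)\, \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i}. \]
Let's transform:
\[ \frac{\partial c\bigl( \sum_{j \in M} u_j \bigr) / \partial u_i}{c\bigl( \sum_{j \in M} u_j \bigr)} = \frac{\partial s_i(u_i) / \partial u_i}{1 - s_i(u)}, \qquad \frac{\partial \ln c\bigl( \sum_{j \in M} u_j \bigr)}{\partial u_i} = -\frac{\partial \ln (1 - s_i(u))}{\partial u_i}, \qquad c\Bigl( \sum_{j \in M} u_j \Bigr) = \frac{\hat c(u_{-i})}{1 - s_i(u)}. \]
The left-hand side is symmetrical by $u_i$, therefore the right-hand side should also be symmetrical by $u_i$. It means that $s_i(u)$ is symmetrical by $u_{-i}$. The theorem is proved.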

Notice that the right-hand side, and therefore $s_i(u)$, depends on $\sum_{j \in M} u_j$. Besides, $s_i(u) = 1 - \dfrac{\hat c(u_{-i})}{c\bigl( \sum_{j \in M} u_j \bigr)}$, and the condition $\sum_{i \in M} s_i = 1$ gives
\[ \sum_{i \in M} s_i(u) = n - \frac{\sum_{i \in M} \hat c(u_{-i})}{c\bigl( \sum_{j \in M} u_j \bigr)}, \qquad c\Bigl( \sum_{j \in M} u_j \Bigr) = \frac{\sum_{i \in M} \hat c(u_{-i})}{n - 1}. \]
As the left-hand side $c\bigl( \sum_{j \in M} u_j \bigr)$ does not depend on $n$, the right-hand side also does not depend on $n$. Therefore the denominator $(n - 1)$ should cancel. This is possible only if the sum of $n$ summands in the numerator of the right-hand side can be regrouped into $(n - 1)$ equal summands, which provides the cancellation of $(n - 1)$. Besides, each of the regrouped summands should depend only on $\sum_{j \in M} u_j$. Thus, the numerator of the right-hand side should be representable as $\sum_{i \in M} \hat c(u_{-i}) = (n - 1) \cdot c\bigl( \sum_{j \in M} u_j \bigr)$. Therefore $\hat c(u_{-i})$ must be symmetrical by $u_{-i}$, and finally $\sum_{i \in M} \hat c(u_{-i})$ should depend on $\sum_{j \in M} u_j$, i.e. the mechanism $s_i(u)$ may be represented as
\[ s_i(u) = 1 - \frac{(n - 1)\, \hat c(u_{-i})}{\sum_{i \in M} \hat c(u_{-i})}. \]
Other economic mechanisms are also possible, for example,

\[ s_i(u) = \begin{cases} \dfrac{1}{|\{j : u_j = r_j\}|}, & u_i = r_i, \\ 0, & \text{otherwise.} \end{cases} \tag{14} \]
This mechanism allocates the public income only among collectivists. Notice that in this case all agents have only two rational strategies: $\forall i: U_i = \{0, r_i\}$, therefore the mechanism (14) reduces a general CPPI-model to the CPPI-model with binary sets of strategies (Gorbaneva and Ougolnitsky, 2015).
Let's now formulate the problem of control mechanism design in a general form. Suppose that a social planner who maximizes the social welfare function (3) reports to all agents the control mechanism

\[ s_i(u) = \begin{cases} \dfrac{1}{|\{j : u_j = u_j^{\max}\}|}, & u_i = u_i^{\max}, \\ 0, & \text{otherwise,} \end{cases} \quad i = 1, \dots, n. \tag{15} \]

Then the agents' payoffs are equal to

\[ g_i(u) = \begin{cases} p_i(r_i - u_i^{\max}) + \dfrac{c(u_i^{\max}, u_{-i})}{|\{j : u_j = u_j^{\max}\}|}, & u_i = u_i^{\max}, \\ p_i(r_i - u_i), & \text{otherwise.} \end{cases} \]
It is evident that in this case $U_i = \{0, u_i^{\max}\}$, because if $u_i > 0$, $u_i \ne u_i^{\max}$ then $g_i(u) = p_i(r_i - u_i) < p_i(r_i)$. Therefore, the mechanism (15) also reduces a general CPPI-model to the model with binary sets of strategies. Theorem 2 leads to
Corollary 1. The allocation mechanism

\[ s_i(u) = \begin{cases} \dfrac{1}{|\{j : u_j = u_j^{\max}\}|}, & u_i = u_i^{\max}, \\ 0, & \text{otherwise,} \end{cases} \quad i = 1, \dots, n, \]
is not system compatible.
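The sharing rule (15), and rule (14) as its special case with targets $u_i^{\max} = r_i$, can be written as a short function. The sketch below is illustrative: the function name, the tolerance used to compare floating-point numbers and the sample profile are assumptions.

```python
def target_shares(u, target, tol=1e-12):
    """Rule (15): equal shares among agents with u_i = target_i, zero for the rest.

    With target = r (the resource vector) this is rule (14).
    The tolerance `tol` for comparing floats is an implementation assumption.
    """
    hits = [abs(u_i - t_i) <= tol for u_i, t_i in zip(u, target)]
    count = sum(hits)
    # If nobody meets the target, every share is zero and nothing is divided.
    return [1.0 / count if hit else 0.0 for hit in hits]


# Assumed sample: agents 1 and 3 meet the target, agent 2 contributes nothing
shares = target_shares(u=[2.0, 0.0, 2.0], target=[2.0, 2.0, 2.0])
```

An agent who contributes a positive amount different from the target gets a zero share and loses private income, which is why only the two strategies $0$ and $u_i^{\max}$ remain rational under this rule.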

The difficulty is that the i-th player at the moment of decision does not know $u_{-i}$ and, respectively, $|\{j : u_j = u_j^{\max}\}|$. Therefore it is difficult to estimate the efficiency of the mechanism (15) (to compare the payoffs) in the general case. It is possible to argue that an optimal answer of the i-th player to the mechanism (15) is

\[ u_i^{opt}(s_i) = \begin{cases} u_i^{\max}, & \forall u_{-i} \in U_{-i}: \ p_i(r_i) \le p_i(r_i - u_i^{\max}) + \dfrac{c(u_i^{\max}, u_{-i})}{|\{j : u_j = u_j^{\max}\}|}, \\[2ex] 0, & \forall u_{-i} \in U_{-i}: \ p_i(r_i) \ge p_i(r_i - u_i^{\max}) + \dfrac{c(u_i^{\max}, u_{-i})}{|\{j : u_j = u_j^{\max}\}|}, \end{cases} \tag{16} \]
i.e. one of the two feasible strategies dominates the other one and is, respectively, the dominant strategy. But the optimal answer is uncertain if for different $u_{-i}$ the signs of the inequalities are different (i.e. both strategies are non-dominated) (Gorbaneva and Ougolnitsky, 2015).
The theoretical approach $[\pi_e \& \pi_G]_{G2} = \{ s_i = s_i(u),\ s_i \text{ is found} \}$ leads to the following result.
Theorem 4. If the functions $p_i(x)$ and $c(x)$ are of power type with a positive exponent less than or equal to one then the economic mechanism with feedback $[\pi_e \& \pi_G]_{G2} = \{ s_i = s_i(u),\ s_i \text{ is found} \}$ is system compatible.
Due to some special properties of linear functions it is convenient to consider four cases separately:
1) the functions $p_i(x)$ and $c(x)$ are linear;
2) the functions $p_i(x)$ are linear, and the function $c(x)$ is of power type with a positive exponent less than one;
3) the functions $p_i(x)$ are of power type with a positive exponent less than one, and the function $c(x)$ is linear;
4) the functions $p_i(x)$ and $c(x)$ are of power type with a positive exponent less than one.
A proof is presented for the case of linear functions $p_i(x) = p_i \cdot x$ and $c(x) = c \cdot x$ and is based on the Germeier theorem (Gorelik and Kononenko, 1982). Other cases are analyzed similarly.
Proof. Notice that the punishment strategy is $s_i = 0$, the i-th agent's optimal answer is $E_i = \{u_i = 0\}$, and his payoff is equal to $L_i = p_i r_i$. The Principal's payoff is equal to

\[ K_2 = \max_{s_i} \min_{u_i \in E_i} \Bigl[ \sum_{i \in M} p_i(r_i - u_i) + c\Bigl( \sum_{i \in M} u_i \Bigr) \Bigr] = \max_{s_i} \Bigl[ \sum_{i \in M} p_i(r_i) \Bigr] = \sum_{i \in M} p_i(r_i). \]
Let's determine the set $D_i$ of strategies such that the i-th player's payoff is greater than $L_i$:

\[ p_i(r_i - u_i) + s_i c\Bigl( \sum_{i \in M} u_i \Bigr) > p_i r_i. \]
It is possible only if $s_i > \dfrac{p_i u_i}{c\bigl( \sum_{i \in M} u_i \bigr)}$. The value $K_1$ is equal to

\[ K_1 = \max_{s_i \in D_i} \max_{u_i} \Bigl[ \sum_{i \in M} p_i(r_i - u_i) + c\Bigl( \sum_{i \in M} u_i \Bigr) \Bigr] = \max_{u_i} \Bigl[ \sum_{i \in M} p_i(r_i - u_i) + c\Bigl( \sum_{i \in M} u_i \Bigr) \Bigr]. \]

As it is shown above,

\[ u_j^{\max} = \begin{cases} r_j, & c > p_j, \\ 0, & c < p_j. \end{cases} \]

Let's prove that the inequality $s_i > \dfrac{p_i u_i}{c\bigl( \sum_{i \in M} u_i \bigr)}$ can be satisfied, which makes the strategy $u_j^{\max}$ profitable for the agent. For those agents who have $u_j^{\max} = 0$, it may be provided by the strategy $s_i = \epsilon_i$, for the other agents by $s_i > \dfrac{p_i u_i}{c\bigl( \sum_{i \in C} u_i \bigr)}$. It holds if
\[ \sum_{i=1}^n s_i(u) = \begin{cases} 1, & \exists i: s_i > 0, \\ 0, & \forall i: s_i = 0, \end{cases} \]
or $\sum_{i \in C} p_i r_i < c\bigl( \sum_{i \in C} u_i \bigr)$. As the inequality $c > p_j$ holds for each summand, the condition $\sum_{i \in C} p_i r_i < c\bigl( \sum_{i \in C} u_i \bigr)$ is also satisfied, and due to the equivalence of all transforms the initial inequality holds, too. The theorem is proved.
4.3. Administrative mechanisms of system compatibility
Let's suppose that the Principal can constrain the sets of feasible strategies of the agents. Consider the model (9) with a control mechanism $\pi_a \& \pi_{St} = \{ k_i = (\underline{q}_i, \overline{q}_i) \in R^2,\ i = 1, \dots, n \mid 0 \le \underline{q}_i \le \overline{q}_i \le r_i \}$.
It is clear that if the Principal's possibilities are not bounded then the problem of system compatibility has a trivial solution $\underline{q}_i = \overline{q}_i = u_i^{\max}$, $i \in M$. Therefore, a realistic setup of the problem requires a consideration of the administrative costs of the Principal. Then her payoff function takes the form

\[ g_0(\underline{q}, \overline{q}, u) = \sum_{j \in M} p_j(r_j - u_j) + c(u) - C(\underline{q}, \overline{q}) \to \max, \quad 0 \le \underline{q}_i \le \overline{q}_i \le r_i,\ i \in M, \tag{17} \]
where $C(\underline{q}, \overline{q})$ is the Principal's cost function, continuously differentiable and convex in all arguments. The function increases in $\underline{q}$ and decreases in $\overline{q}$. The function (3) can be considered as a specific case of (17) when $\underline{q} = 0$, $\overline{q} = r = (r_1, \dots, r_n)$.
Definition 3. An administrative mechanism $\underline{q}^{\max}$ ($\overline{q}^{\max}$) is weakly compatible if $u^{\max} = \underline{q}^{\max} \in NE(\underline{q}^{\max})$ and $g_0(\underline{q}, \overline{q}, \underline{q}^{\max}) = \max_{\underline{q}^{\max} \le u \le r} g_0(\underline{q}, r, u)$ (respectively, $u^{\max} = \overline{q}^{\max} \in NE(\overline{q}^{\max})$ and $g_0(\underline{q}, \overline{q}, \overline{q}^{\max}) = \max_{0 \le u \le \overline{q}^{\max}} g_0(0, \overline{q}, u)$).
Notice that when a mechanism is weakly compatible, the value of the social welfare function is certainly not greater than when the mechanism is system compatible, because in the latter case the Principal has no administrative costs.
Theorem 5. For a weakly compatible administrative mechanism in the model (9) with the Principal's payoff function (17), $\underline{q}^{\max} <$ [...] $> 0$, and to find $\overline{q}_i$ it is required to solve the system of equations $-p_i'(r_i - \overline{q}_i) + c'(\overline{q}) = C'_{\overline{q}_i}(\underline{q}, \overline{q}) < 0$. In all three cases the same function on the left-hand side decreases, and the values on the right-hand side are strictly ordered. Therefore, $\underline{q}_i^{\max}$ [...]

The proof follows from Theorem 1, in which it is shown that $u_i^{NE} < u_i^{\max}$; therefore, due to Theorem 5, $u_i^{NE} < \overline{q}_i^{\max}$. Thus, in the condition $\underline{q}_i \le u_i \le \overline{q}_i$ the right inequality is satisfied automatically.
So, let's consider the mechanism $\pi_a \& \pi_{St} = \{ k_i = q_i \in R,\ i = 1, \dots, n \mid 0 \le q_i \le r_i,\ q_i \le u_i \le r_i \}$ with the Principal's payoff function $g_0(q, u) = \sum_{j \in M} p_j(r_j - u_j) + c(u) - C(q) \to \max$, in which the Principal sets only the left bound of the constraints.
Theorem 6. An administrative mechanism $\pi_a \& \pi_{St} = \{ k_i = q_i \in R \mid 0 \le q_i \le r_i,\ q_i \le u_i \le r_i \}$ in the model (9) is system compatible.
Proof. The optimal strategy of an agent in the model (9) without consideration of the condition $q_i \le u_i \le r_i$ is $u_i^{NE}$, calculated by the formula (12), and the strategy optimal for the Principal is $q_i = u_i^{\max}$, calculated by the formula (13). In Theorem 1 it is proved that $u_i^{\max} > u_i^{NE}$, therefore the agent's optimal strategy with consideration of the condition $q_i \le u_i \le r_i$ is $u_i = q_i = u_i^{\max}$, which means system compatibility. The theorem is proved.
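The argument of Theorem 6 can be traced on a small example: with a concave payoff, the agent's best answer under the constraint $q_i \le u_i \le r_i$ is the unconstrained optimum clipped to $[q_i, r_i]$. The sketch below assumes a hypothetical symmetric two-agent game with $r_i = 3$, $p_i(x) = \sqrt{x}$, $c(\bar u) = 2\sqrt{u_1 + u_2}$ and $s_i = 1/2$, for which the unconstrained best response is $(r - u_{other})/2$ and the welfare-optimal contribution is $u^{\max} = 2$; these values and all names are assumptions, not data from the paper.

```python
R = 3.0        # resource of each of the two agents (assumed)
U_MAX = 2.0    # welfare-optimal contribution in this assumed example


def unconstrained_response(u_other):
    """Best response for the assumed game p(x)=sqrt(x), c=2*sqrt(u1+u2), s_i=1/2."""
    return (R - u_other) / 2.0


def constrained_response(u_star, q_low, r):
    """Maximizer of a concave payoff on [q_low, r], given its unconstrained maximizer."""
    return min(max(u_star, q_low), r)


def equilibrium(q_low, iters=50):
    """Alternate constrained best responses until the profile settles."""
    u1 = u2 = 0.0
    for _ in range(iters):
        u1 = constrained_response(unconstrained_response(u2), q_low, R)
        u2 = constrained_response(unconstrained_response(u1), q_low, R)
    return u1, u2


free_play = equilibrium(q_low=0.0)       # no administrative control
controlled = equilibrium(q_low=U_MAX)    # lower bound set to u^max
```

Without control the agents settle at the equilibrium contribution $r/3 = 1$; with the lower bound $q_i = u_i^{\max}$ both are pushed to $u_i = 2$, which is the welfare-optimal profile in this example.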

The following interpretation of the result is possible: if a one-sided constraint on the resource allocation from below is costless then the Principal can compel the agents to make the desirable decision.
Theorem 7. If the Principal's payoff function has the form (17) then a control mechanism $\pi_a \& \pi_{St} = \{ k_i = q_i \in R \mid 0 \le q_i \le r_i,\ q_i \le u_i \le r_i \}$ in the model (9) is weakly compatible if for any $i$ one of the two conditions is satisfied:

\[ \operatorname{Arg\,max}_{u_i \in R} \Bigl[ p_i(r_i - u_i) + s_i c\Bigl( \sum_{i \in M} u_i \Bigr) \Bigr] \le \operatorname{Arg\,max}_{q_i \in R} \Bigl[ \sum_{j \in M} p_j(r_j - q_j) + c\Bigl( \sum_{i \in M} q_i \Bigr) - C(q) \Bigr], \]
or

\[ \operatorname{Arg\,max}_{u_i \in R} \Bigl[ p_i(r_i - u_i) + s_i c\Bigl( \sum_{i \in M} u_i \Bigr) \Bigr] > r_i, \qquad \operatorname{Arg\,max}_{q_i \in R} \Bigl[ \sum_{j \in M} p_j(r_j - q_j) + c\Bigl( \sum_{i \in M} q_i \Bigr) - C(q) \Bigr] > r_i. \]
Notice that weak compatibility is also possible only on the bounds of the segment $[q_i, r_i]$ (in the case of an administrative mechanism). Therefore, to achieve weak compatibility it is necessary to have a partition of the set of agents into individualists ($u_i = q_i$) and collectivists ($u_i = r_i$).
Proof. Let's find a Nash equilibrium. An optimal agent's answer to the Principal's strategy is

\[ u_i = \begin{cases} u_i^*, & q_i < u_i^* < r_i, \\ q_i, & q_i > u_i^*, \\ r_i, & u_i^* > r_i, \end{cases} \]

where $u_i^* = \operatorname{Arg\,max}_{u_i \in R} \bigl[ p_i(r_i - u_i) + s_i c\bigl( \sum_{i \in M} u_i \bigr) \bigr]$ (without consideration of the constraints $q_i \le u_i \le r_i$). From the point of view of the Principal the agent's optimal strategy is

\[ u_i = \begin{cases} u_i^{**}, & q_i < u_i^{**} < r_i, \\ q_i, & q_i > u_i^{**}, \\ r_i, & u_i^{**} > r_i, \end{cases} \]
where $u_i^{**} = \operatorname{Arg\,max}_{u_i \in R} \bigl[ \sum_{j \in M} p_j(r_j - u_j) + c\bigl( \sum_{i \in M} u_i \bigr) - C(q) \bigr]$ (similarly).
Notice that the values $u_i^*$ and $u_i^{**}$ are unique due to the negativity of the second derivatives of the functions $g_i$ and $g_0$, and it is proved in Theorem 1 that $u_i^{\max} \ge u_i^{NE}$. If it is profitable for an agent to be a collectivist ($u_i = r_i$), the Principal has no need to provide control, therefore $q_i = 0$. In this case the function $g_0$ decreases in $q_i$. If an agent is neither individualist nor collectivist ($q_i <$ [...]

\[ q_i = \begin{cases} q_i^*, & 0 < q_i^* < r_i, \\ 0, & 0 > q_i^*, \\ r_i, & q_i^* > r_i. \end{cases} \]
The theorem is proved.
In the presence of a cost of one-sided control from below in resource allocation, the set of agents is divided into three subsets: (1) those who are collectivists independently of the Principal's interests (the set $I_1$); (2) those who assign fewer resources to the production of the public good than the Principal wants, but she cannot prevent it (the set $I_2$); (3) those who would like to assign fewer resources to the production of the public good than the Principal wants, and she can increase this value (the set $I_3$). The Principal can affect only the elements of the third subset, but she cannot always provide system compatibility.
Also, the case $q_i = q$ for all $i$ may be considered. As the amounts $r_i$ are different, in this case the measurement should be done in shares (not amounts) of resources. The Principal's optimization problem has the form

\[ g_0(q, u) = \sum_{j \in M} p_j(r_j - u_j) + c(u) - C(q) \to \max, \quad 0 \le q \le 1,\ i \in M. \]
To find the optimal value $q$ it is required to solve the equation

\[ -\sum_{j \in I_3} r_j p_j'(r_j (1 - q)) + c'\Bigl( q \sum_{j \in I_1} r_j + \sum_{j \in I_2} u_j^* \Bigr) \sum_{j \in I_1} r_j - C'(q) = 0. \tag{18} \]
5. Conclusion
The paper is dedicated to the investigation of static game theoretic models of coordination of private and public interests (CPPI-models) in resource allocation, to revealing the conditions of system compatibility in these models, and to the analysis of control mechanisms that permit to attain or approach system compatibility. System compatibility means that in any Nash equilibrium the social welfare function attains its global maximum. Two control mechanisms are considered: an administrative one (compulsion) and an economic one (impulsion), each of them with or without feedback on control. The following conclusions can be made.
System compatibility in the base CPPI-model without control is possible if and only if all players are individualists (they assign all resources to their private activity) or collectivists (they assign all resources to the production of a public good).
If the function of allocation of the public income among the players is given then system compatibility is possible if and only if the function is symmetrical in the sense of Theorem 3 for each player $i$. If this function is to be found then system compatibility may be provided if the functions of private and public income are of power type with a positive exponent less than or equal to one.
As for administrative control mechanisms, system compatibility and weak compatibility should be distinguished. Weak compatibility means that system compatibility is attained when the values of the constraints are chosen as strategies. If control costs are absent then system compatibility is reachable; otherwise only weak compatibility may be provided, in the case when all agents are individualists or collectivists.
It is also shown that in the model (17) it is sufficient for the Principal to constrain only the agents' individualism, i.e. to use only the bounds from below.

References Algorithmic Game Theory (2007). Ed. by N. Nisan, T. Roughgarden, E. Tardos, V. Vazi- rani. Cambridge University Press. Bergstrom, T., C. Blume and H. Varian (1986). On the private provision of public goods . Journal of Public Economics, 29, 25–49. Boadway, R., P. Pestiau and D. Wildasin (1989a). Non-cooperative behavior and efficient provision of public goods. Public Finance, 44, 1–7. Boadway, R., P. Pestiau and D. Wildasin (1989b). Tax-transfer policies and the voluntary provision of public goods. Journal of Public Economics, 39, 157-176. Burkov, V. N., V. I. Opoitsev (1974). A Meta-game Approach to the Control in Hierarchical Systems. Automation and Remote Control, 35(1), 92–103. Christodoulou, G., A. Sgouritza, B. Tang (2015). On the Efficiency of the Proportional Allocation Mechanism for Divisible Resources. M. Hoefer (Ed.):SAGT, LNCS 9347, 165–177. Germeier, Yu. B., I. A. Vatel (1975). Equilibrium situations in games with a hierarchical structure of the vector of criteria . Lecture Notes on Computer Science, 27, 460–465. Gorbaneva, O. I., G. A. Ougolnitsky (2013). Purpose and Non-Purpose Resource Use Mod- els in Two-Level Control Systems. Advances in Systems Science and Applications, 13(4), 379–391. Gorbaneva, O. I., G. A. Ougolnitsky (2015). System Compatibility, Price of Anarchy and Control Mechanisms in the Models of Concordance of Private and Public Interests. Advances in Systems Science and Applications, 15(1), 45–59. Gorbaneva, O. I., G. A. Ougolnitsky, A. B. Usov (2016). Modeling of Corruption in Hier- archical Organizations. N.Y.: Nova Science Publishers. Gorelik, V. A., A. F. Kononenko (1982). Models of Decision Making in Ecological-Economic Systems. M. (in Russian) Kahana, N., D. Klunover (2016). Private provision of a public good with a time-allocation choice. Social Choice and Welfare, 47, 379–386. Static Game Theoretic Models of Coordination of Private and Public Interests 93

Kukushkin, N. S. (1994). A Condition for Existence of Nash Equilibrium in Games with Private and Public Objectives. Games and Economic Behavior, 7, 177–192.
Laffont, J. J., D. Martimort (2002). The Theory of Incentives. The Principal-Agent Model. Princeton University Press.
Mechanism design and management: mathematical methods for smart organizations (2013). Ed. by D. Novikov. New York: Nova Science Publishers.
Novikov, D. (2013). Theory of Control in Organizations. New York: Nova Science Publishers.
Papadimitriou, C. H. (2001). Algorithms, games, and the Internet. Proc. 33rd Symposium Theory of Computing, 749–753.
Warr, P. (1983). The private provision of a public good is independent of the distribution of income. Economics Letters, 13, 207–211.

Contributions to Game Theory and Management, X, 94–99

On the Conditions on the Integral Payoff Function in the Games with Random Duration⋆

Ekaterina V. Gromova1, Anastasiya P. Malakhova2 and Anna V. Tur3
1 St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034 Russia
E-mail: [email protected]
2 E-mail: [email protected]
3 E-mail: [email protected]

Abstract. In this paper we consider the existence of the integral payoff in differential games with random duration when the random terminal time is defined on an infinite time interval. We present an example of a game with random duration, a game-theoretic model of the development of non-renewable resources.

Keywords: differential games, random duration, environment, pollution control.

1. Introduction

When solving various economic problems it is advantageous to use game-theoretic models with a random time horizon (Petrosjan and Shevkoplyas, 2003). If the random time instant is defined on an infinite interval, the payoff functional turns out to be an improper integral, and the question of its convergence arises. In classical differential games with an infinite time horizon it is common to add a discounting factor to the model (Dockner et al., 2000). In (Aseev and Kryazhimskiy, 2007) the problem of convergence of the integral payoff in deterministic games on an infinite interval was solved. In this paper we generalize that result and present sufficient conditions for convergence of the expected payoff functional for games with random duration (Proposition 1) and random initial time (Proposition 2).

The paper is structured as follows. Section 2 contains a general formulation of differential games with random time horizon. In Section 3 we briefly review the transformation of the expected payoff function. In Section 4 we discuss the convergence problem and describe ways to solve it. The example of the game of extraction of non-renewable resources is presented in Section 5. In Section 6 we generalize the results to the class of games with random initial time.

2. Game formulation

Consider a differential game $\Gamma(x_0, t_0, T)$ with $n$ players. The state equations have the form

$$\dot{x} = g(x, u_1, \dots, u_n), \quad x \in R^n, \quad u_i \in U \subset \mathrm{comp}\, R^l, \quad x(t_0) = x_0. \tag{1}$$

The game starts from the initial state $x_0$ at time $t_0$. We assume that the duration of the game is a random variable $T$ with known probability distribution function $F(t)$, $t \in [t_0, \infty)$ (Petrosjan and Murzov, 1966; Petrosjan and Shevkoplyas, 2003).

⋆ Ekaterina Gromova acknowledges the grant 17-11-01079 of Russian Science Foundation.

We assume that for all admissible controls of the players there exists a piecewise differentiable solution of (1) that is extensible on $[t_0, \infty)$.

Let $h_i(x(\tau), u_1, \dots, u_n)$, or briefly $h_i(\tau)$, be the instantaneous payoff function of player $i$ at time $\tau \in [t_0, \infty)$. The instantaneous payoff function of each player is assumed to be a continuous function of its arguments. The expected integral payoff of player $i$ can be written as

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} \int_{t_0}^{t} h_i(\tau)\, d\tau\, dF(t), \quad i = 1, \dots, n. \tag{2}$$

3. Transformation of integral functional

Transforming the double integral (2) into the standard form used in dynamic programming is important for further study of the game. For a nonnegative instantaneous payoff function $h_i(\tau)$ the result was presented in (Shevkoplyas, 2014).

Theorem 1. Let the instantaneous payoff function $h_i(\tau)$ of each player $i = 1, \dots, n$ be nonnegative for all $\tau \in [t_0, \infty)$ and measurable. Then the expected payoff (2) of player $i$ can be written as follows:

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} \int_{t_0}^{t} h_i(\tau)\, d\tau\, dF(t) = \int_{t_0}^{\infty} (1 - F(\tau))\, h_i(\tau)\, d\tau. \tag{3}$$

This result was also used in (Boukas et al., 1990; Marin-Solano and Shevkoplyas, 2011). In the general case the result was obtained in (Kostyunin and Shevkoplyas, 2011).

Theorem 2. The expected payoff (2) can be written in the form (3) if the following conditions hold:

1. The limit

$$\lim_{T \to \infty} (1 - F(T)) \int_{t_0}^{T} h_i(t)\, dt = 0. \tag{4}$$

2. The following integrals exist in the sense of improper Riemann integrals:

$$\int_{t_0}^{\infty} \int_{t_0}^{t} h_i(\tau)\, d\tau\, dF(t) < +\infty, \quad i = 1, \dots, n. \tag{5}$$

Further on we assume that conditions (4) and (5) hold and the transformation of double integral (2) to (3) takes place.
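The equality of the forms (2) and (3) can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: $t_0 = 0$, an exponentially distributed duration with a hypothetical rate `rate = 1`, and a sign-changing payoff $h(\tau) = \cos\tau$ chosen for this example (it is not taken from the paper). For this choice both integrals equal $\lambda/(\lambda^2+1)$; the function names and the truncation horizon `t_max` are also assumptions of the sketch.

```python
import math

def payoff_double_integral(h, density, t_max=40.0, steps=40_000):
    """Form (2) with t0 = 0: integrate H(t) * f(t) dt, where H(t) = int_0^t h."""
    dt = t_max / steps
    inner = 0.0          # running value of H(t), built by the trapezoid rule
    prev = 0.0           # outer integrand at t = 0 is H(0) * f(0) = 0
    total = 0.0
    for k in range(1, steps + 1):
        a, b = (k - 1) * dt, k * dt
        inner += 0.5 * (h(a) + h(b)) * dt
        cur = inner * density(b)
        total += 0.5 * (prev + cur) * dt
        prev = cur
    return total

def payoff_single_integral(h, survival, t_max=40.0, steps=40_000):
    """Form (3) with t0 = 0: integrate (1 - F(tau)) * h(tau) d tau."""
    dt = t_max / steps
    values = [survival(k * dt) * h(k * dt) for k in range(steps + 1)]
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))

rate = 1.0                                      # hypothetical hazard rate
h = math.cos                                    # illustrative sign-changing payoff
density = lambda t: rate * math.exp(-rate * t)  # f(t) for F(t) = 1 - exp(-rate * t)
survival = lambda t: math.exp(-rate * t)        # 1 - F(t)

k_double = payoff_double_integral(h, density)
k_single = payoff_single_integral(h, survival)
exact = rate / (rate ** 2 + 1.0)                # closed form of both integrals
print(k_double, k_single, exact)
```

Since $(1 - F(T)) \sin T \to 0$ here, condition (4) holds, which is why the two quadratures agree even though the payoff changes sign.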

4. Convergence of expected payoff

Since the expected payoff (3) of player $i$ is an improper integral, it is necessary to ensure its existence. Note that the multiplier $1 - F(t)$ can be written as follows:

$$1 - F(t) = e^{-\int_{t_0}^{t} \lambda(s)\, ds}, \quad \forall t \in [t_0, \infty), \tag{6}$$

where $\lambda(s) = \frac{f(s)}{1 - F(s)}$ is the hazard function and $f(s) = F'(s)$ is the density. Thus, the expected payoff takes the form

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} e^{-\int_{t_0}^{\tau} \lambda(s)\, ds}\, h_i(\tau)\, d\tau, \quad i = 1, \dots, n. \tag{7}$$

The multiplier (6) can be treated as a discounting multiplier. However, in a general problem statement (Marin-Solano and Shevkoplyas, 2011) another discount component $e^{-\rho(t_0, t)}$ can be added, and the expected payoff takes the form

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} e^{-\rho(t_0, \tau) - \int_{t_0}^{\tau} \lambda(s)\, ds}\, h_i(\tau)\, d\tau, \quad i = 1, \dots, n. \tag{8}$$

The presence of a discount multiplier does not guarantee the convergence of the integral (8). The following proposition gives sufficient conditions for the convergence of the improper integral (8).

Proposition 1. Let the following inequalities hold for all admissible pairs $(x, u)$:

$$e^{-\rho(t_0, t) - \int_{t_0}^{t} \lambda(s)\, ds} \max_{u(t) \in U} |h(x(t), u)| \le \mu(t), \quad t \ge t_0,$$

$$\int_{t}^{\infty} e^{-\rho(t_0, \tau) - \int_{t_0}^{\tau} \lambda(s)\, ds}\, |h(x(\tau), u(\tau))|\, d\tau \le \omega(t), \quad t \ge t_0,$$

where $\mu(t)$, $\omega(t)$ are some positive functions of $t$ such that $\lim_{t \to \infty} \mu(t) = +0$ and $\lim_{t \to \infty} \omega(t) = +0$. Then the improper integral (8) converges.

This proposition generalizes the result of (Aseev and Kryazhimskiy, 2007) to the case of random duration of the game. In particular, the result of Aseev and Kryazhimskiy is recovered in the case of an exponential distribution of the random variable $T$, for which $\int_{t_0}^{\tau} \lambda(s)\, ds = \lambda(\tau - t_0)$.

5. Example

As an example, consider a game-theoretic model of emissions management (see (Breton et al., 2005)). There are $n$ players in the game, each of which has industrial production on its territory. It is assumed that the production volume is directly proportional to the emissions $u_i$. Thus, the strategy of player $i$ is the choice of the volume of harmful emissions $u_i \in [0, b_i]$. The solution is sought in the class of program strategies $u_i(t)$. The income of player $i$ at time $t$ is determined by the formula:

$$r_i(u_i(t)) = u_i(t)\left(b_i - \tfrac{1}{2} u_i(t)\right). \tag{9}$$

The dynamics of the total pollution $x$ are described by the equation

$$\dot{x} = \sum_{i=1}^{n} u_i(t), \quad x(t_0) = x_0. \tag{10}$$

Each player bears the costs associated with removing contaminants. The instantaneous payoff (utility) of player $i$ is equal to $r_i(u_i(t)) - d_i x(t)$, $d_i > 0$.

Without loss of generality, we assume that the game starts at $t_0 = 0$. In contrast to the model of (Breton et al., 2005), we assume that the game has a random terminal time $T$, where $T$ is a random variable with the distribution function $F(t) = 1 - e^{-t^2}$, $t \ge 0$, which corresponds to the Weibull distribution with scale parameter $\lambda = 1$ and shape parameter $\delta = 2$. The value $\delta = 2$ corresponds to an increasing failure rate $\lambda(t) = \frac{f(t)}{1 - F(t)} = 2t$, which can be interpreted as depreciation of equipment in the workplace.

The expected payoff of player $i$ for the considered model has the form

$$K_i(0, x_0, u_1, \dots, u_n) = \int_{0}^{\infty} \int_{0}^{t} \left(r_i(u_i(\tau)) - d_i x(\tau)\right) d\tau\; 2t e^{-t^2}\, dt. \tag{11}$$

It was shown in (Kostyunin and Shevkoplyas, 2011) that for this game the conditions of Theorem 2 hold and (11) can be transformed to the form

$$K_i(0, x_0, u_1, \dots, u_n) = \int_{0}^{\infty} \left(r_i(u_i(t)) - d_i x(t)\right) e^{-t^2}\, dt. \tag{12}$$
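To make the transformation from (11) to (12) concrete, the following sketch evaluates both forms for constant controls, so that $x(t) = x_0 + t\sum_j u_j$ by (10). All numerical values (`b`, `u`, `d_1`, `x0`) are hypothetical and chosen only for illustration, and the helper names are assumptions of this sketch. For constant controls the inner integral in (11) is computed by hand, and both forms reduce to $(r_i - d_i x_0)\sqrt{\pi}/2 - d_i \sum_j u_j / 2$.

```python
import math

def income(u_i, b_i):
    """Instantaneous income (9) for a constant emission level u_i."""
    return u_i * (b_i - 0.5 * u_i)

def trapezoid(fn, upper, steps=20_000):
    """Composite trapezoid rule on [0, upper]."""
    dt = upper / steps
    values = [fn(k * dt) for k in range(steps + 1)]
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))

def payoff_form_11(r, d_i, x0, total_u, upper=10.0):
    """Form (11): inner integral of r - d_i*(x0 + total_u*tau) over [0, t], weighted by dF."""
    inner = lambda t: (r - d_i * x0) * t - 0.5 * d_i * total_u * t * t
    return trapezoid(lambda t: inner(t) * 2.0 * t * math.exp(-t * t), upper)

def payoff_form_12(r, d_i, x0, total_u, upper=10.0):
    """Form (12): instantaneous payoff weighted by the survival function exp(-t^2)."""
    return trapezoid(lambda t: (r - d_i * (x0 + total_u * t)) * math.exp(-t * t), upper)

def payoff_closed_form(r, d_i, x0, total_u):
    """Uses int_0^inf exp(-t^2) dt = sqrt(pi)/2 and int_0^inf t*exp(-t^2) dt = 1/2."""
    return (r - d_i * x0) * math.sqrt(math.pi) / 2.0 - 0.5 * d_i * total_u

b = [2.0, 3.0]        # hypothetical emission caps b_i
u = [1.0, 1.5]        # hypothetical constant controls, u_i in [0, b_i]
d_1, x0 = 0.1, 1.0    # hypothetical damage coefficient and initial pollution
r_1 = income(u[0], b[0])

k11 = payoff_form_11(r_1, d_1, x0, sum(u))
k12 = payoff_form_12(r_1, d_1, x0, sum(u))
k_exact = payoff_closed_form(r_1, d_1, x0, sum(u))
print(k11, k12, k_exact)
```

Truncating the integrals at `upper = 10` is harmless here because the weight $e^{-t^2}$ is already below $10^{-43}$ at that point.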

Let us show that for (12) all conditions of Proposition 1 hold and therefore the integral is convergent.

We will use the following estimates: $x(\tau) \le x_0 + \sum_{i=1}^{n} b_i \tau = x_0 + B\tau$ and $r_i(u_i(\tau)) \le \frac{b_i^2}{2}$, where $B = \sum_{i=1}^{n} b_i$.

We need to find positive functions $\mu(t)$, $\omega(t)$ such that $\lim_{t \to \infty} \mu(t) = +0$ and $\lim_{t \to \infty} \omega(t) = +0$.

Consider the expression from the first condition and estimate it:

$$e^{-t^2} \max_{u(t) \in U} |r_i(u_i(t)) - d_i x(t)| \le e^{-t^2} \max_{u(t) \in U} \left(|r_i(u_i(t))| + |d_i x(t)|\right) \le e^{-t^2}\left(\frac{b_i^2}{2} + d_i (x_0 + Bt)\right).$$

Denote

$$\mu(t) = e^{-t^2}\left(\frac{b_i^2}{2} + d_i (x_0 + Bt)\right).$$

We have $x_0 \ge 0$, hence $\mu(t)$ is a positive function. Since linear functions grow asymptotically slower than exponential functions,

$$\lim_{t \to \infty} \mu(t) = +0.$$

The second condition of the proposition also holds. Indeed,

$$\begin{aligned}
\int_{t}^{\infty} e^{-\tau^2} |r_i(u_i(\tau)) - d_i x(\tau)|\, d\tau
&\le \int_{t}^{\infty} e^{-\tau^2} \left(|r_i(u_i(\tau))| + |d_i x(\tau)|\right) d\tau \\
&\le \int_{t}^{\infty} e^{-\tau^2} \left(\frac{b_i^2}{2} + d_i (x_0 + B\tau)\right) d\tau \\
&= \frac{1}{2}\left(d_i B e^{-t^2} + \sqrt{\pi}\, \mathrm{erfc}(t) \left(d_i x_0 + \frac{b_i^2}{2}\right)\right),
\end{aligned} \tag{13}$$

where $\mathrm{erfc}(t)$ is the complementary error function. Since $\mathrm{erfc}(t) = 2\left(1 - \Phi(\sqrt{2}\, t)\right)$, we can also rewrite (13) as follows:

$$\frac{1}{2}\left(d_i B e^{-t^2} + \sqrt{\pi}\, \mathrm{erfc}(t) \left(d_i x_0 + \frac{b_i^2}{2}\right)\right) = \frac{1}{2}\left(d_i B e^{-t^2} + 2\sqrt{\pi} \left(1 - \Phi(\sqrt{2}\, t)\right) \left(d_i x_0 + \frac{b_i^2}{2}\right)\right),$$

where $\Phi$ is the standard normal cumulative distribution function. Therefore, we can denote

$$\omega(t) = \frac{1}{2}\left(d_i B e^{-t^2} + 2\sqrt{\pi} \left(1 - \Phi(\sqrt{2}\, t)\right) \left(d_i x_0 + \frac{b_i^2}{2}\right)\right).$$

Obviously, $\omega(t)$ is positive and

$$\lim_{t \to \infty} \frac{1}{2}\left(d_i B e^{-t^2} + 2\sqrt{\pi} \left(1 - \Phi(\sqrt{2}\, t)\right) \left(d_i x_0 + \frac{b_i^2}{2}\right)\right) = +0.$$

Thus, for this game formulation we have shown the existence of functions $\mu(t)$ and $\omega(t)$ satisfying the relevant conditions.
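The bounds of this example can also be inspected numerically. The sketch below assumes the bound $\mu(t) = e^{-t^2}\left(b_i^2/2 + d_i(x_0 + Bt)\right)$ and the closed form (13) written with the complementary error function `math.erfc`; the parameter values are hypothetical. It compares the closed form of $\omega(t)$ with a direct quadrature of the tail integral of $\mu$ and prints how quickly both functions decay.

```python
import math

def mu(t, b_i, d_i, x0, big_b):
    """Pointwise bound: exp(-t^2) * (b_i^2 / 2 + d_i * (x0 + B * t))."""
    return math.exp(-t * t) * (0.5 * b_i ** 2 + d_i * (x0 + big_b * t))

def omega(t, b_i, d_i, x0, big_b):
    """Closed form (13) of the tail integral of mu over [t, infinity)."""
    return 0.5 * (d_i * big_b * math.exp(-t * t)
                  + math.sqrt(math.pi) * math.erfc(t) * (d_i * x0 + 0.5 * b_i ** 2))

def tail_integral(t, b_i, d_i, x0, big_b, upper=12.0, steps=20_000):
    """Trapezoid approximation of the same tail, truncated at `upper`."""
    dt = (upper - t) / steps
    values = [mu(t + k * dt, b_i, d_i, x0, big_b) for k in range(steps + 1)]
    return dt * (sum(values) - 0.5 * (values[0] + values[-1]))

params = dict(b_i=2.0, d_i=0.1, x0=1.0, big_b=5.0)   # hypothetical values
for t in (0.0, 1.0, 2.0, 4.0):
    print(t, mu(t, **params), omega(t, **params), tail_integral(t, **params))
```

Both columns for $\omega$ should agree up to quadrature error, and both $\mu(t)$ and $\omega(t)$ stay positive while shrinking towards zero as $t$ grows.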

6. Convergence in the games with random initial time

We can generalize the result to the class of game-theoretic models with random initial time (Gromova and Lopez-Barrientos, 2015). Consider a differential game $\hat{\Gamma}(x_0, t_0)$ with $n$ players. The state equations have the same form as (1). Suppose that the game starts at the moment $t_0$, which is a random variable with known probability distribution function $\hat{F}(t)$, $t \in [t_0, \infty)$. Also assume that the game has infinite duration. Define the instantaneous payoff function in the same way as for the game $\Gamma(x_0, t_0, T)$. Thus, the expected integral payoff of player $i$ can be written as the following Lebesgue-Stieltjes integral:

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} \int_{t}^{\infty} h_i(\tau)\, d\tau\, d\hat{F}(t), \quad i = 1, \dots, n. \tag{14}$$

It is rather easy to show that (14) can be transformed to the standard form used in dynamic programming (see (Gromova and Lopez-Barrientos, 2015)):

$$K_i(x_0, t_0, u_1, \dots, u_n) = \int_{t_0}^{\infty} h_i(\tau)\, \hat{F}(\tau)\, d\tau, \quad i = 1, \dots, n. \tag{15}$$
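The step from (14) to (15) amounts to changing the order of integration. The following is a sketch of that step, assuming that Fubini's theorem applies and that $\hat{F}(t_0) = 0$; it is followed by a check on hypothetical data $t_0 = 0$, $\hat{F}(t) = 1 - e^{-t}$, $h_i(\tau) = e^{-2\tau}$, which are chosen only for illustration.

```latex
\begin{align*}
\int_{t_0}^{\infty} \int_{t}^{\infty} h_i(\tau)\, d\tau\, d\hat{F}(t)
  &= \int_{t_0}^{\infty} h_i(\tau) \int_{t_0}^{\tau} d\hat{F}(t)\, d\tau
   = \int_{t_0}^{\infty} h_i(\tau) \bigl(\hat{F}(\tau) - \hat{F}(t_0)\bigr)\, d\tau
   = \int_{t_0}^{\infty} h_i(\tau)\, \hat{F}(\tau)\, d\tau. \\[6pt]
\text{Check, form (14):}\quad
  & \int_{0}^{\infty} \Bigl( \int_{t}^{\infty} e^{-2\tau}\, d\tau \Bigr) e^{-t}\, dt
   = \int_{0}^{\infty} \tfrac{1}{2} e^{-3t}\, dt = \tfrac{1}{6}. \\
\text{Check, form (15):}\quad
  & \int_{0}^{\infty} e^{-2\tau} \bigl(1 - e^{-\tau}\bigr)\, d\tau
   = \tfrac{1}{2} - \tfrac{1}{3} = \tfrac{1}{6}.
\end{align*}
```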

Since (15) is an improper integral, it is necessary to ensure its existence. The multiplier $\hat{F}(t)$ can be interpreted as a discounting component. The following proposition gives conditions for the convergence of the integral.

Proposition 2. Let the following inequalities hold for all admissible pairs $(x, u)$:

$$\hat{F}(t) \max_{u(t) \in U} |h(x(t), u)| \le \mu(t), \quad t \ge t_0,$$

$$\int_{t}^{\infty} \hat{F}(\tau)\, |h(x(\tau), u(\tau))|\, d\tau \le \omega(t), \quad t \ge t_0,$$

where $\mu(t)$, $\omega(t)$ are some positive functions of $t$ such that $\lim_{t \to \infty} \mu(t) = +0$ and $\lim_{t \to \infty} \omega(t) = +0$. Then the improper integral (15) converges.

In future work we plan to check the conditions of Propositions 1 and 2 for different classes of distribution functions $F(t)$ and $\hat{F}(t)$.

References

Aseev, S. M. and Kryazhimskiy, A. V. (2007). The maximum Pontryagin principle and problems of optimal economic growth. M: Nauka, Tr. MIAN, 257, 3–271, 272 p. (in Russian).
Boukas, E. K., Haurie, A. and Michel, P. (1990). An Optimal Control Problem with a Random Stopping Time. Journal of Optimization Theory and Applications, 64(3), 471–480.
Breton, M., G. Zaccour and M. Zahaf (2005). A differential game of joint implementation of environmental projects. Automatica, 41(10), 1737–1749.
Dockner, E., S. Jorgensen, N. V. Long, G. Sorger (2000). Differential Games in Economics and Management Science. Cambridge: Cambridge University Press.
Gromova, E. V. and Jose Daniel Lopez-Barrientos (2015). A differential game model for the extraction of non renewable resources with random initial times. Contributions to Game Theory and Management, 8, 58–63.
Kostyunin, S. and Shevkoplyas, E. (2011). On simplification of integral payoff in differential games with random duration. Vestnik S. Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr., 4, 47–56.
Marin-Solano, J., E. V. Shevkoplyas (2011). Non-constant discounting and differential games with random time horizon. Automatica, 47(12), 2626–2638.
Petrosjan, L. A. and Murzov, N. V. (1966). Game-theoretic problems of mechanics. Litovsk. Mat. Sb. 6, 423–433 (in Russian).
Petrosjan, L. A., Shevkoplyas, E. V. (2003). Cooperative Solutions for Games with Random Duration. Game Theory and Applications, Volume IX. Nova Science Publishers, pp. 125–139.
Shevkoplyas, E. V. (2014). Optimal Solutions in Differential Games with Random Duration. Journal of Mathematical Sciences, 189(6), 715–722.

Contributions to Game Theory and Management, X, 100–128

Modelling of Information Spreading in the Population of Taxpayers: Evolutionary Approach⋆

Suriya Sh. Kumacheva, Elena A. Gubar, Ekaterina M. Zhitkova, Zlata Kurnosykh, Tatiana Skovorodina
St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
E-mail: [email protected] [email protected] [email protected] [email protected] [email protected]

Abstract. Information technologies such as social networks and the Internet make it possible to spread ideas, rumors, advertisements and information effectively and widely. Here we use this fact to describe two different approaches to evaluating the impact of information on the members of a tax system. We consider the impact of spreading information about future audits by the tax authority in a population of taxpayers. It is assumed that all agents pay taxes if they know that the probability of a tax audit is high. However, some agents can hide their true income, and such behavior provokes a tax audit. Each agent adapts her behavior to the received information about future audits, which depends on the behavior of other agents. First, we model the propagation of information as an epidemic process and combine it with a game between the tax authority and taxpayers. Second, we consider an evolutionary game on a network, which defines a structured population of taxpayers, and evaluate the impact of the spreading of information on the changes of population states over time. We formulate mathematical models, analyze the behavior of agents and corroborate all results with numerical simulations.

Keywords: tax audit, tax evasion, total tax revenue, information spreading, evolutionary game on networks.

1. Introduction

The system of tax control, as a key element of the fiscal system, provides several tools to improve the collection of taxes. However, if the budget is restricted, the tax authority should explore new approaches to stimulate tax collection. One of the most effective methods of fighting the concealment of taxes is a total tax audit, but this procedure is very expensive and cannot be applied to the whole population. Another way to enhance the collection of taxes is a selective tax audit. It allows the authority to check special subgroups of taxpayers from the population, taking into account additional information about their incomes and propensity to risk (Gubar et al, 2015).

Previous research (Boure and Kumacheva, 2010; Chander and Wilde, 1998; Vasin and Morozov, 2005) has shown that the goals of taxpayers and the tax authority are often opposite. This forms a conflict situation which can be formulated as a game-theoretical problem. Thus, in many studies authors combine statistical and game-theoretical approaches to describe the behavior of all members of the fiscal system (Chander and Wilde, 1998; Reinganum and Wilde, 1985; Sanchez and Sobel, 1993). Such a complex scheme allows us to estimate the behavior of taxpayers under the assumption of their rationality while including our beliefs about the behavior of the tax authority.

Here we suppose that different shares of taxpayers demonstrate their own propensity to risk. Each taxpayer in the population is assumed to be risk-averse, risk-neutral or risk-loving. Risk-averse taxpayers tend to avoid risk and hence pay taxes. Risk-loving taxpayers do not pay taxes even if a high risk (high probability) of a tax audit exists. Risk-neutral taxpayers are rational: they evaluate the risk of a tax audit and, if it is high enough (for example, if the audit probability is greater than or equal to a threshold value), they prefer to pay taxes.

Since taxpayers and the tax authority each optimize their own cost functions, society should normally find a compromise between the opposite goals of the population as a whole and of separate individuals. The goal of society is to increase the collected taxes in order to use them for social needs, while individuals try to minimize their payments to the fiscal system. To analyze this social conflict we use known indexes such as the Price of Anarchy and the Price of Stability, and we introduce a new index of Social Welfare (Monderer and Shapley, 1996).

⋆ This research was supported by the research grant "Optimal Behavior in Conflict-Controlled Systems" (17-11-01079) of Russian Science Foundation.
However, as noted above, total tax control is a very expensive procedure; hence several new methods have been developed in recent studies (Antocia et al, 2014; Bloomquist, 2006; Chander and Wilde, 1998). For example, one approach proposes spreading information about future audits over the population. This scheme can be considered a useful tool to stimulate the collection of taxes and decrease the costs of tax control. In the current work we consider two different models which include the impact of information on the taxpayers' decisions. First, we assume that the propagation of information resembles the spreading of a virus in an epidemic process (Goffman and Newill, 1964). In this case we use the classical Susceptible-Infected-Susceptible (SIS) model (Kandhway and Kuri, 2014; Kolesin et al, 2014) to describe the transition of information between Informed and Uninformed taxpayers. Then we combine the model of information spreading with a game-theoretical model which describes the behavior of the tax authority and taxpayers and the reaction of agents to the received information (Goffman and Newill, 1964; Gubar, Kumacheva and Zhitkova, 2015; Altman et al, 2014). Second, we model the same idea as an evolutionary game on a structured population (Riehl and Cao, 2015; Altman et al, 2010). As mentioned earlier, modern information technologies can be successfully used to spread various types of information (Nekovee et al, 2007). So we suppose that the population of taxpayers can be described by a network where nodes are individual taxpayers and links define connections between them. We suppose that at the initial time moment the tax authority injects information about a future tax audit into a small part of the population of taxpayers. Economic agents communicate over the time period, and the information spreads. Received information can force an agent to change her strategy. This technique makes it possible to improve the collection of taxes at lower cost.
Propagation of information also initiates migration of economic agents between two subgroups: those who pay taxes and those who evade payments.

We consider a population of n homogeneous taxpayers, where every taxpayer can evade and declare an income less than its true level. In turn, the tax authority can audit every taxpayer from the population. If the evasion is revealed, the taxpayer must pay his tax arrears and a penalty. The tax authority is able to spread information about a future audit. In this case a taxpayer can be informed or uninformed regardless of risk propensity. We assume that each agent selects the best behavior, which depends on the preferences of other participants and incorporates the agent's beliefs or received information about the probability of a future tax audit. That is, if a taxpayer believes that a large part of the population pays taxes and that this reduces the probability of a tax audit, then a part of the taxpayers can deviate from paying taxes. Otherwise, if most taxpayers attempt to evade taxation, the optimal behavior is to pay taxes. Thus the choice of each agent affects the state of the population.

In the current paper we estimate the reaction of taxpayers to information received from the tax authority and circulated in the population. We study several approaches and propose numerical experiments to support the theoretical results.

The paper is organized as follows. Section 2 presents the mathematical model of tax audit in the classical formulation. Section 3 shows the dynamic model of tax control, which includes the knowledge about additional information. Its extension, the evolutionary model with network structure, is considered in Section 4. Numerical examples are presented in Section 5.

2. Game Theoretical Model of Interaction Between Tax Authority and Taxpayers with Information Dissemination

In the current section we study the model of tax control based on the problem presented in (Boure and Kumacheva, 2010). There is a homogeneous set of $n$ taxpayers, each of whom has an income $I_i$, $i = 1, \dots, n$. At the end of the tax period, taxpayer $i$ declares his income as $D_i$, $D_i \le I_i$, $i = 1, \dots, n$. Denote by $\xi$ the tax rate and by $\pi$ the penalty rate, both measured as shares of the amount of money. If the evasion is revealed as the result of a tax audit, then the tax evader must pay the unpaid tax and the penalty, which depend on the evasion level: $(\xi + \pi)(I_i - D_i)$, $i = 1, \dots, n$. The tax authority audits taxpayer $i$ with probability $p_i$ at cost $c_i$.

2.1. Strategies of Players

Taxpayers have two strategies: to pay tax in accordance with the true level of income (to declare $D_i = I_i$) or not to pay (to declare $D_i < I_i$). As previously obtained in (Boure and Kumacheva, 2010), the construction of further arguments depends on the ratio of the parameters $\xi$, $\pi$, $c_i$, $i = 1, \dots, n$. The tax audit of the $i$-th taxpayer can be profitable or not for the tax authority depending on whether the condition

$$(\xi + \pi) I_i \ge c_i \tag{1}$$

is satisfied or not. For the model described above the following results were obtained in (Boure and Kumacheva, 2010).

Proposition 1. If inequality (1) is fulfilled for taxpayer $i$ ($i = 1, \dots, n$), the optimal strategy of the tax authority (maximizing its income) is the audit probability

$$p^* = \frac{\xi}{\xi + \pi} \tag{2}$$

for every $i = 1, \dots, n$. The optimal strategy of taxpayer $i$ is

$$D_i^*(p_i) = \begin{cases} 0, & \text{if } p_i < p^*, \\ I_i, & \text{if } p_i \ge p^*. \end{cases} \tag{3}$$
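The threshold in (2) can be read as an indifference condition: with audit probability $p$, declaring nothing costs the taxpayer $p(\xi + \pi) I_i$ in expectation, while honest declaration costs $\xi I_i$, and the two coincide exactly at $p = \xi/(\xi+\pi)$. The sketch below illustrates this under the assumption that the taxpayer's rule has the threshold form "declare $0$ below $p^*$, declare $I_i$ otherwise"; the function names and the numerical rates are hypothetical.

```python
def audit_threshold(xi, pi):
    """Threshold audit probability p* = xi / (xi + pi) from (2)."""
    return xi / (xi + pi)

def expected_payment(declared, income, p, xi, pi):
    """Tax on the declared income plus expected arrears and penalty on the hidden part."""
    return xi * declared + p * (xi + pi) * (income - declared)

def optimal_declaration(p, income, xi, pi):
    """Assumed threshold rule: evade fully below p*, declare truthfully otherwise."""
    return income if p >= audit_threshold(xi, pi) else 0.0

xi, pi, income = 0.13, 0.13, 100.0     # hypothetical tax rate, penalty rate, income
p_star = audit_threshold(xi, pi)
for p in (0.2, p_star, 0.8):
    evade = expected_payment(0.0, income, p, xi, pi)
    honest = expected_payment(income, income, p, xi, pi)
    print(p, evade, honest, optimal_declaration(p, income, xi, pi))
```

Below the threshold evasion is cheaper in expectation, above it honest declaration is cheaper, which is the behavior attributed to risk-neutral taxpayers.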

Formally, the behavior of taxpayers can be defined by the strategies:

• $y^1$: to pay tax in accordance with the true income level;
• $y^2$: not to pay (to declare $D_i = 0$, due to Proposition 1).

The behavior of each taxpayer depends on many factors; one of them is the risk propensity of the taxpayer. Every taxpayer has one of three statuses: risk-averse, risk-neutral or risk-loving. Risk-averse taxpayers prefer to pay taxes in accordance with their true level of income. They always choose the strategy $y^1$. Risk-loving taxpayers are malicious evaders regardless of external circumstances such as an audit, whether promised or actually conducted. Their choice is the strategy $y^2$. Risk-neutral agents decide to pay or to evade (corresponding to the strategies $y^1$ or $y^2$) depending on two factors. One of them is the preferences of other participants of the system. Every taxpayer compares her own strategy with the strategies of others in order to estimate whether it is profitable to change strategy or not. The received information about the probability of possible tax audits also affects the agent's behavior: the higher this probability, the higher the risk of a penalty.

In real life there are some practical difficulties in the application of the model. The first is that the tax authority often has no information about the relation of the parameters in (1) for every $i$, $i = 1, \dots, n$. Thus, the value of the individual audit probability for each taxpayer is unknown. Therefore, in the current tax period it is assumed that the tax authority intends to audit taxpayers with an average probability $p$ and an average cost $c$. Another difficulty is that the tax authority has a strongly limited budget. It is supposed not to be enough for optimal auditing with probability $p^*$ (from equation (2)). For this reason the tax authority needs to find additional ways to stimulate tax payments.
One of these ways is the injection of information about future audits (which may be false) into the population of taxpayers. In our study we assume that "$p \ge p^*$". We denote the cost of spreading the information by $\tilde{c}$.

Thus, the tax authority has the following methods of influencing the population of taxpayers (strategies):

• $x^1$: to audit a taxpayer who declared $D_i = 0$ with probability $p$;
• $x^2$: to spread information about future audits through the initial share of taxpayers $\nu^0_{inf} = \nu_{inf}(t_0)$.

2.2. Dynamic Model of Spreading Information

As introduced in Section 1, we study two different approaches to the influence of information in the model of tax control. In this section we consider a non-cooperative game $G = \langle N, X_i, U_i \rangle$, where $N$ is the set of players, $X_i$ is the strategy set of player $i \in N$, and $U_i$ is the payoff function of player $i \in N$.

We consider the population of taxpayers as a community network (a complete graph with the vertex set $N$ of dimension $n$). Suppose that at the initial moment of time $t = 0$ the tax authority can inject information about future audits. We suppose that the cost of spreading information for the tax authority differs significantly from the audit cost. Hence at the initial moment this information divides the total population of taxpayers into two groups, Informed and Uninformed:

$$n = n_{inf}(t_0) + n_{noinf}(t_0).$$

The share of informed taxpayers is defined as follows:

$$\nu^0_{inf} = \nu_{inf}(t_0) = \frac{n_{inf}(t_0)}{n}.$$

Let the further transmission of information occur according to the following scheme. Uninformed agents (Noinf) meet Informed agents (Inf), receive the information about a possible audit (are infected by this information) and switch to the Informed state. As soon as the information becomes irrelevant, an Informed agent returns to the Uninformed state (Altman et al, 2014). The process of transferring information resembles the transmission of viruses in an epidemic model (Nekovee et al, 2007; Altman et al, 2014), so we can use the classical Susceptible-Infected-Susceptible model in our case. Fig. 1 demonstrates the transitions between subgroups of the population.

Fig. 1. Scheme of transmission of information from Uninformed to Informed.

Such a process can be formalized by the following system of differential equations (Altman et al, 2014):

$$\frac{d\nu_k(n, t)}{dt} = -\sigma \nu_k(n, t) + \delta \left(1 - \nu_k(n, t)\right) \sum_{i=1}^{n} a_{ki}\, \nu_i(n, t), \tag{4}$$

where $\delta$ is the probability that an Uninformed agent receives the information (the information spreading rate), $\sigma$ is the probability that the information becomes obsolete (the rate of forgetting information), and $\nu_k(n, t)$ is the probability that agent $k$ among the $n$ participants of the population is informed at the moment $t$ (it can also be interpreted as the share of informed agents in the population). The coefficient $a_{ki} = 1$ if there is a connection between taxpayers $k$ and $i$, and $a_{ki} = 0$ if there is no connection.
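System (4) is straightforward to integrate numerically. The sketch below uses a plain forward-Euler scheme on a complete graph without self-loops ($a_{kk} = 0$ is an assumption of this sketch), with hypothetical rates $\delta$ and $\sigma$. For comparison it also computes the nonzero root of the right-hand side of (4) when all $\nu_k$ are equal, i.e. of $-\sigma\nu + \delta(1-\nu)(n-1)\nu = 0$, which gives $\nu = 1 - \sigma/(\delta(n-1))$ whenever this value is positive.

```python
def simulate_sis(adjacency, nu0, delta, sigma, dt=0.05, t_end=200.0):
    """Forward-Euler integration of system (4); returns the vector nu at time t_end."""
    n = len(adjacency)
    nu = list(nu0)
    for _ in range(int(round(t_end / dt))):
        # informed "pressure" on agent k: sum of a_ki * nu_i over neighbours
        pressure = [sum(adjacency[k][i] * nu[i] for i in range(n)) for k in range(n)]
        nu = [nu[k] + dt * (-sigma * nu[k] + delta * (1.0 - nu[k]) * pressure[k])
              for k in range(n)]
    return nu

def complete_graph(n):
    """Adjacency matrix of a complete graph; self-loops excluded by assumption."""
    return [[0 if i == k else 1 for i in range(n)] for k in range(n)]

def homogeneous_steady_state(n, delta, sigma):
    """Nonzero root of -sigma*nu + delta*(1 - nu)*(n - 1)*nu = 0, or 0 if it is not positive."""
    tau = delta / sigma
    return max(0.0, 1.0 - 1.0 / (tau * (n - 1)))

n = 10
start = [1.0] + [0.0] * (n - 1)        # one informed taxpayer at the initial moment
spreading = simulate_sis(complete_graph(n), start, delta=0.05, sigma=0.2)
fading = simulate_sis(complete_graph(n), start, delta=0.01, sigma=0.2)
print(spreading[0], homogeneous_steady_state(n, 0.05, 0.2))
print(max(fading), homogeneous_steady_state(n, 0.01, 0.2))
```

With the first pair of rates the information persists at a positive level; with the second pair the forgetting rate dominates and the informed share dies out.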

Following the method proposed in (Altman et al, 2014), we proceed to consider the stationary state of the system, in which the probability $\nu_{k,\infty}(n)$ that the $k$-th agent is informed does not depend on time:

$$\nu_{k,\infty}(n) = \lim_{t \to \infty} \nu_k(n, t).$$

In this case the stationary state of the system (4), obtained by the same scheme as in (Altman et al, 2014), becomes

$$\nu_{k,\infty}(n) = 1 - \frac{1}{1 + \tau \sum_{i=1}^{n} a_{ki}\, \nu_{i,\infty}(n)}, \tag{5}$$

where $\tau = \frac{\delta}{\sigma}$ is the coefficient of information dissemination efficiency. This equation has two solutions:

• trivial, which is $\nu_{k,\infty}(n) = 0$;
• non-trivial, which corresponds to the SIS process.

If we exclude existing links from the population of $n$ taxpayers ($a_{ki} = 1$), we obtain a new population consisting of $n - n_{inf}$ links. For each agent $k$ in this population the following solution holds (in accordance with the metastable state of the system):

$$\nu_{k,\infty}(n_{inf}) = \begin{cases} 1 - \dfrac{1}{\tau (n_{inf} - 1)}, & \text{if } \tau \ge \dfrac{1}{n_{inf} - 1}, \\ 0, & \text{otherwise.} \end{cases}$$

2.3. The game "Tax authority – one taxpayer"

In this section we formulate a game between the tax authority and one taxpayer. We assume that information has been propagated over the population and that over time the population reaches its steady state. This means that we have a stable distribution of Informed and Uninformed taxpayers. We also assume that if any taxpayer receives information about a future tax audit, then she pays. We denote the share of those agents by $n_{inf}$.

Here we consider a situation where the taxpayer makes her decision in response to received information and incorporates her beliefs about the behavior of the tax authority. Then we have a set of conflict situations in the population between the tax authority and an individual agent $k$, each of which can be formulated as a bimatrix game.

We suppose that a taxpayer from the population chooses the strategy $y^1$ and pays with probability $\omega$, which depends on the share of informed agents, i.e. $\omega = \omega(\nu_{k,\infty}(n_{inf}))$. As discussed before, the population of taxpayers is heterogeneous in its relation to risk. The variable $\omega$ reflects this heterogeneity, that is, $\omega$ characterizes how different parts of the population react to the received information. It is obvious that, due to the properties of probability,

$$n_{inf} = n \cdot \omega.$$

The payoff matrix of the presented game is:

$$\begin{array}{c|cc}
 & y^1 & y^2 \\ \hline
x^1 & \left(-pc + \xi I_i \omega;\; -\xi I_i \omega\right) & \left(-pc + p(\xi + \pi) I_i (1 - \omega);\; -p(\xi + \pi) I_i (1 - \omega)\right) \\
x^2 & \left(-\nu^0_{inf}\tilde{c} + \xi I_i \omega;\; -\xi I_i \omega\right) & \left(-\nu^0_{inf}\tilde{c};\; 0\right)
\end{array}$$

0 Analysis of the payoff matrix. In the case when pc>νinf c˜ (the cost of auditing 0 of the share of the population is greater than the cost of information injection νinf at the initial time moment), the strategy x1 of the tax authority is dominated 2 0 by the strategy x . If the opposite case is possible (pc νinf c˜) and, besides, the 1 1≤ probability of audit is p p∗, the strategies profile (x ,y ) is the Nash Equilibrium. This equilibrium corresponds≥ to a high probability of auditing and high cost of information dissemination. If the latter is higher than the cost of auditing we obtain that there is no need for consideration of the spreading of information because it is unprofitable. Therefore this model is similar to the previous one which was studied in (Boure and Kumacheva, 2010) and is not of interest for a detailed study here. Further let’s assume that the audit cost and information cost are related as follows: 0 νinf c˜ < p c. (6) Due to inequation (6) the strategy profile situation (x1,y1) is not an equilibrium, and thus (x2,y2) is the unique situation which can be equilibrium. The taxpayer’s strategy y2 dominates the strategy y1, if the auditing probability is ω p p∗. (7) ≤ 1 ω − The higher the probability of the strategy ¡¡to pay¿¿, which depends on the share of the informed, the more natural and obvious is the implementation of the inequality (7). If the inequation

$$p\,\bigl(c - (1-\omega)(\xi+\pi) I_i\bigr) > \nu^0_{inf}\tilde c \tag{8}$$

is fulfilled, then $(x^2, y^2)$ is a Nash equilibrium. For the further analysis of the strategies, the following cases of the relation between the parameters of the system are considered.

Case 1. Let condition (1) be satisfied; then $c - (\xi+\pi) I_i \le 0$. If the probability $\omega$ is small ($\omega$ is positive and close to zero, $\omega \to 0+$), then the expression $c - (1-\omega)(\xi+\pi) I_i$ is less than or close to zero. Hence, we have

$$c - (1-\omega)(\xi+\pi) I_i < \frac{\nu^0_{inf}}{p}\,\tilde c.$$

Then condition (8) is not fulfilled, and there is no equilibrium in pure strategies, since $(x^2, y^2)$ is not an equilibrium.

Case 2. If condition (1) is fulfilled, but the probability $\omega$ is high and tends to one from the left ($\omega \to 1-$), then the value of the expression $c - (1-\omega)(\xi+\pi) I_i$ is close to $c$, and the fulfillment of (8) is guaranteed by condition (6). In this case $(x^2, y^2)$ is indeed a Nash equilibrium.

Case 3. Now let (1) not be fulfilled; then the cost of auditing is high, $c - (\xi+\pi) I_i > 0$, and the expression $c - (1-\omega)(\xi+\pi) I_i$ is strictly positive. It is impossible, however, to say whether (8) is satisfied or not. In this case, due to Proposition 1, auditing is not profitable when its cost is large, in other words, $p = 0$.

Thus, the boundary cases are analyzed. Conditions (7) and (8), and therefore the existence of an equilibrium in pure strategies, essentially depend on the values of the probability $\omega$, defined by the dissemination of information in the population of agents.

Modelling of Information Spreading in the Population of Taxpayers 107
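The case analysis above can be checked numerically. The following Python sketch builds the payoff matrix of the game «Tax authority – one taxpayer» and enumerates its pure-strategy Nash equilibria by testing best responses. The function names and the parameter values in the note below, in particular the initial share `nu0 = 0.5`, are illustrative assumptions and not part of the model's calibration.

```python
from itertools import product


def payoff_matrix(p, c, c_tilde, nu0, xi, pi, income, omega):
    """Return {(x, y): (authority payoff, taxpayer payoff)} for the 2x2 game.

    x = 1: audit with probability p, x = 2: inject information;
    y = 1: pay, y = 2: evade.
    """
    paid = xi * income * omega                      # expected tax payment
    fined = p * (xi + pi) * income * (1.0 - omega)  # expected payment after audit
    return {
        (1, 1): (-p * c + paid, -paid),
        (1, 2): (-p * c + fined, -fined),
        (2, 1): (-nu0 * c_tilde + paid, -paid),
        (2, 2): (-nu0 * c_tilde, 0.0),
    }


def pure_nash(game):
    """List the profiles at which neither player gains by deviating alone."""
    equilibria = []
    for x, y in product((1, 2), repeat=2):
        a, b = game[(x, y)]
        authority_ok = all(a >= game[(x_alt, y)][0] for x_alt in (1, 2))
        taxpayer_ok = all(b >= game[(x, y_alt)][1] for y_alt in (1, 2))
        if authority_ok and taxpayer_ok:
            equilibria.append((x, y))
    return equilibria
```

For example, with the hypothetical values p = 0.1, c = 7455, c̃ = 745.5, ν⁰ = 0.5, ξ = 0.13, π = 0.065, I = 47908, conditions (6) and (8) both hold for ω = 0.9 and the only profile returned is (2, 2); for ω = 0.01 condition (8) fails and the list is empty, which agrees with the discussion of small ω.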

The price of stability. In the studied game the best Nash equilibrium is the strategy profile (x2,y2). The value of the game is

$$V(x^2, y^2) = -\nu^0_{inf}\tilde c + 0 = -\nu^0_{inf}\tilde c.$$

Let us use the notion of the Price of Stability given in (Nisan et al, 2007).

Definition 1. The Price of Stability (PoS) of a game is the ratio between the value of the best Nash equilibrium and the value of the optimal solution:

$$PoS = \frac{\max\limits_{s \in S} SW(s)}{\max\limits_{s \in E} SW(s)}, \tag{9}$$

where $SW$ is the Social Welfare function, $S$ is the set of payoffs, and $E \subseteq S$ is the subset of the set of equilibrium strategies.

If the parameters satisfy (6), then the strategy profile which gives the highest value of the system's revenue is the Nash equilibrium $(x^2, y^2)$. In this case the Price of Stability is

$$PoS = \frac{V(x^2, y^2)}{V(x^2, y^2)} = 1. \tag{10}$$

If the information spreading is not profitable and inequality (6) is violated, the highest return of the system is reached if the tax authority chooses the strategy $x^1$. Then the Price of Stability is

$$PoS = \frac{V(x^2, y^2)}{-pc} = \frac{\nu^0_{inf}\tilde c}{pc}. \tag{11}$$

2.4. The game «Tax Authority – n Taxpayers»

Now let us consider a game where the tax authority interacts with $n$ taxpayers by the same scheme as presented in subsection 2.3. As in the previous game, the information about future tax audits has propagated over the population of taxpayers and the system has reached its steady state. The information was injected by the tax authority at the initial time moment. Analogously to the model from (Altman et al, 2014), we define the payments of the $i$-th taxpayer as

$$C_i = \xi I_i\, \omega(\nu_{j,\infty}(n)). \tag{12}$$

Payments of an audited taxpayer are:

$$H_i = p(\xi+\pi) I_i \bigl(1 - \omega(\nu_{j,\infty}(n))\bigr). \tag{13}$$

We denote by $U_i$ the payments of player $i$ and by $U_{i\sigma_i}(n_{inf})$ the payoff of the $i$-th taxpayer if the rate of obsolescence of information is $\sigma_i$. As in the previous section, we define the share of those who paid as $n_{inf}$. Then we construct the Social Welfare function for this system as:

$$SW(\nu) = \sum_{i=1}^{n} U_{i\sigma_i}(n_{inf}) = \sum_{i=1}^{n_{inf}} \xi I_i\, \omega(\nu_{j,\infty}(n)) + \sum_{k=1}^{n - n_{inf}} p(\xi+\pi) I_i \bigl(1 - \omega(\nu_{j,\infty}(n))\bigr). \tag{14}$$

We also assume that the income is uniformly distributed over the population of taxpayers. To simplify the model we use the average income $I$ of the population (as introduced in (Gubar et al, 2015)) instead of the exact value of the income of taxpayer $i$. Then the Social Welfare function (14) is

$$SW(n_{inf}) = n_{inf}\, C + (n - n_{inf})\, H, \tag{15}$$

where $C$ and $H$ are the average values of (12) and (13) respectively. Here we recall the definition of the Price of Anarchy as in (Nisan et al, 2007).

Definition 2. The Price of Anarchy (PoA) is the ratio between the value function of a Nash equilibrium and the optimal objective value function:

$$PoA = \frac{\max\limits_{s \in S} SW(s)}{\min\limits_{s \in E} SW(s)}, \tag{16}$$

where $SW$ is the Social Welfare function, $S$ is the set of payoffs, and $E \subseteq S$ is the subset of the equilibrium strategy set.

Based on this definition we obtain the Price of Anarchy for the system with information spreading (4) – (14):

$$1 \le PoA \le \frac{1}{1 - \left(1 + \frac{1}{\tau}\right)\frac{1}{n}}. \tag{17}$$

However, when analyzing the Price of Anarchy for the considered game, we faced the problem of how to estimate the relation between the individual and the total welfare. The results of this study are presented in the next subsection.

Coefficient of Inter-Social Welfare. In this subsection we consider a modification of the Price of Anarchy index and introduce a new coefficient which takes into account the difference between the individual and the total welfare.

Definition 3. The Coefficient of Inter-Social Welfare (CoISW) is the ratio between the sum of the minimum payments of all taxpayers and the maximum value of the Social Welfare:

$$CoISW = \frac{\sum\limits_{i=1}^{n} \min U_i}{\max SW}. \tag{18}$$

To estimate this index, we follow the same scheme of analysis as in the previous subsections.

Case 1. Let condition (6) be satisfied. For economic reasons the tax authority chooses strategy $x^2$, that is, to inform taxpayers at the initial time moment $t = 0$. In this case the total income of the tax authority $R$ is defined as:

$$R = \bigl(\xi I \omega(\nu_{j,\infty}(n)) - \nu^0_{inf}\tilde c\bigr)\, n. \tag{19}$$

If we consider equation (19) without the second term, which corresponds to budgetary expenses, then this is the Social Welfare function. This function reaches its maximum when the probability of tax payment under the strategy «to pay» is:

$$\omega(\nu_{i,\infty}(n)) = 1.$$

Then, in this case, ninf = n, and maximum value of SW is

$$\max_{n_{inf}} SW(n_{inf}) = SW(n) = \xi I n. \tag{20}$$

Taxpayers prefer to minimize their tax payments and choose $y^2$. Then

$$\min U_i(x^2, \cdot) = U_i(x^2, y^2) = 0.$$

In the case described above the Coefficient of Inter-Social Welfare is $CoISW = 0$.

Case 2. If inequality (6) is not fulfilled, then it is preferable to audit taxpayers rather than to spread information over the population. Thus the tax authority chooses the strategy $x^1$ and audits taxpayers with probability $p$. In this case the aggregated tax income is

$$R = \bigl(\xi I \omega(\nu_{i,\infty}(n_{inf})) + p(\xi+\pi) I (1 - \omega(\nu_{i,\infty}(n_{inf}))) - pc\bigr)\, n. \tag{21}$$

By analogy with the previous case, the last term in (21) is the budgetary expenses. Excluding this term from (21) we have the Social Welfare function:

$$SW(n_{inf}) = \bigl(pn(\xi+\pi) + n_{inf}(\xi - p(\xi+\pi))\bigr)\, I. \tag{22}$$

The function (22) reaches its maximum if every audited taxpayer is an evader ($n_{inf} = 0$) or if $\omega(\nu_{i,\infty}) = 0$. This is logical if it is assumed that the probability of choosing the strategy «to pay» tends to zero in the absence of information. If the last assumption is incorrect and $\omega(\nu_{i,\infty}(n_{inf})) > 0$, then we should analyze the summands of (22) separately. The second term, which depends on $n_{inf}$, decreases if $p \in [0, p^*)$ and increases if $p > p^*$. However, extremely small values of $p$ essentially minimize the first term of (22) (collected taxes and fees). We obtain that the SW function reaches its maximum when the second term is equal to zero, that is, when $p = p^*$. In both considered situations equation (20) is satisfied. If the tax authority audits taxpayers with probability $p \ge p^*$, then taxpayers minimize their payments if they honestly pay taxes. That is, $\min U_i = \xi I_i$ for each $i = 1, \dots, n$. In this case the Coefficient of Inter-Social Welfare is

$$CoISW = \frac{\xi I n}{\xi I n} = 1.$$

If the probability of audit is small(p

After simplifying the model and taking into account that $p^* = \dfrac{\xi}{\xi + \pi}$, the previous equation becomes

$$CoISW = \frac{p}{p^*}.$$

It has an obvious interpretation. In this case the Coefficient of Inter-Social Welfare differs from its optimal value (which is equal to one) by as many times as the actual probability of audit is less than its optimal value ($p^*$). When the auditing probability is $p = p^*$, the trade-off between personal and public welfare is reached.
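A short numerical sketch may help to see why $p = p^*$ plays a special role. It evaluates $p^* = \xi/(\xi+\pi)$, the Social Welfare function (22), and the ratio $CoISW = p/p^*$. The function names and the sample values (ξ = π = 0.2, I = 100, n = 10) are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def optimal_audit_probability(xi, pi):
    """Threshold audit probability p* = xi / (xi + pi)."""
    return xi / (xi + pi)


def social_welfare(n_inf, n, p, xi, pi, income):
    """Equation (22): SW(n_inf) = (p n (xi + pi) + n_inf (xi - p (xi + pi))) I."""
    return (p * n * (xi + pi) + n_inf * (xi - p * (xi + pi))) * income


def coisw_small_audit(p, xi, pi):
    """CoISW = p / p* for an audit probability below the threshold p*."""
    p_star = optimal_audit_probability(xi, pi)
    if not 0.0 <= p < p_star:
        raise ValueError("this expression applies only for 0 <= p < p*")
    return p / p_star


if __name__ == "__main__":
    xi, pi, income, n = 0.2, 0.2, 100.0, 10   # hypothetical sample values
    p_star = optimal_audit_probability(xi, pi)
    print("p* =", p_star)
    # At p = p* the second term of (22) vanishes for every n_inf.
    print([social_welfare(k, n, p_star, xi, pi, income) for k in (0, 5, 10)])
    print("CoISW at p = p*/2:", coisw_small_audit(p_star / 2, xi, pi))
```

At $p = p^*$ the factor $\xi - p(\xi+\pi)$ is zero, so the value of (22) equals $\xi I n$ regardless of $n_{inf}$, which is the statement that equation (20) holds in both situations.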

In section 2 we presented the game-theoretical model which describes the interaction between the tax authority and taxpayers, taking into account information about a future tax audit which was injected by the tax authority at the initial time moment and then circulates in the population. We combined the SIS model, which describes the propagation of information over the population of taxpayers, with the analysis of the different scenarios that can be realized in the system «tax authority – taxpayers». In the next part of the paper we formulate an evolutionary model on the network which also includes agents' beliefs about a possible tax audit. We study a more complicated model which merges the process of information spreading with an evolutionary process in a structured population of taxpayers responding to the received information. We formulate the evolutionary game on the network and evaluate the value of the initial information injection. We also corroborate our theoretical results by numerical simulations.

3. The evolutionary model of Information Dissemination

Suppose that at the initial moment of time the tax authority injects information about the audit. We also suppose that taxpayers can transfer this information to their neighbors and friends according to the network which defines the structure of the population. Hence the total population of taxpayers is divided into subpopulations: Informed ($n_{inf}$) and Uninformed ($n_{noinf}$). We can also say that agents may or may not have a propensity to perceive the information:

$$n = n_{inf}(t_0) + n_{noinf}(t_0).$$

In other words, taxpayers can be inclined to receive information or not. Each taxpayer can then choose whether or not to pay taxes on her true income level, based on the received information. If one taxpayer from the subpopulation of those who are inclined to perceive the information meets another one from this subpopulation, they get the payoffs $(U_{inf}, U_{inf})$. In this case both of them know the same information and pay, hence their payoff is defined by the equation

$$U_{inf} = (1 - \xi) I. \tag{23}$$

Similarly, if a taxpayer who does not perceive the information (and therefore wants to evade) meets a taxpayer of the same type, they get the payoffs $(U_{ev}, U_{ev})$, which are defined by the equation

$$U_{ev} = (1 - p) I + p I (1 - (\xi + \pi)),$$

or, equivalently,

$$U_{ev} = (1 - p(\xi + \pi)) I. \tag{24}$$

We denote the taxpayer's propensity to perceive the information as $\alpha$ and consider the case when an uninformed taxpayer meets an informed taxpayer. As a result of such a meeting, the uninformed taxpayer obtains the information and receives the payoff (3) with probability $\alpha$ if she believes this information, or the payoff (4) if she does not. As in a classical evolutionary game, the instant communication between taxpayers is defined by a two-player bimatrix game. For the cases when taxpayers of different types meet each other, the matrix of payoffs can be written in the form:

$$\begin{array}{c|cc}
 & Inf & Noinf \\ \hline
Inf & (U_{inf},\; U_{inf}) & (U_{inf},\; \alpha U_{inf} + (1-\alpha) U_{ev}) \\
Noinf & (\alpha U_{inf} + (1-\alpha) U_{ev},\; U_{inf}) & (\alpha U_{inf} + (1-\alpha) U_{ev},\; \alpha U_{inf} + (1-\alpha) U_{ev})
\end{array}$$

where Inf is the strategy of a taxpayer who is informed (she perceives the information) and Noinf is the strategy not to be informed.

In this section we compare two modifications of the model of tax control, based on the propensity to risk that different taxpayers demonstrate. The first case of the model does not include the process of information dissemination, and in the absence of information risk-loving taxpayers do not pay. Risk-neutral taxpayers suppose that the probability of auditing is rather small (p

$$R_1 = N \bigl(\nu_a \xi I + p(1 - \nu_a)(\xi + \pi) I - pc\bigr). \tag{25}$$

The second case takes into account the dissemination of information in the population of taxpayers. At the initial moment of time there is an information injection, which is a share of informed taxpayers $\nu^0_{inf} = \nu_{inf}(t_0)$. The cost of a unit of information is still $\tilde c$. At the moment when the system has reached its steady state, $\nu_{inf}$ is the share of those who perceived the information and paid taxes, and $\nu_{ev}$ is the share of those who still evade. In this case the total tax revenue is

$$R_2 = N \bigl(\nu_{inf} \xi I + p \nu_{ev} (\xi + \pi) I - pc - \nu^0_{inf}\tilde c\bigr). \tag{26}$$

4. The evolutionary model with network structure

In this section we suppose that agents transfer information not to a random opponent but to their neighbors and friends. In this case we can describe the possible links between agents using a network. Let $G = (N, L)$ denote an undirected network, where $N = \{1, \dots, n\}$ is the set of economic agents and $L \subset N \times N$ is the edge set. Each edge in $L$ represents a two-player symmetric game between connected taxpayers. The taxpayers choose strategies from a binary set $X = \{A, B\}$ and receive payoffs according to the matrix of payoffs in section 3. At each time moment agents use a single strategy against all opponents, and thus the games occur simultaneously. We denote the strategy state by $x(t) = (x_1(t), \dots, x_n(t))^T$, $x_i(t) \in X$. Here $x_i(t) \in X$ is the strategy of taxpayer $i$, $i = 1, \dots, n$, at time moment $t$. The aggregated payoff of agent $i$ is defined as in (Riehl and Cao, 2015):

$$u_i = \omega_i \sum_{j \in M_i} a_{x_i(t), x_j(t)}, \tag{27}$$

where $a_{x_i(t), x_j(t)}$ is a component of the payoff matrix, $M_i := \{j \in N : \{i, j\} \in L\}$ is the set of neighbors of taxpayer $i$, and the weighting coefficient is $\omega_i = 1$ for cumulative payoffs and $\omega_i = \frac{1}{|M_i|}$ for averaged payoffs. The vector of payoffs of the total population is $u(t) = (u_1(t), \dots, u_n(t))^T$. The state of the population changes according to a rule which is a function of the strategies and payoffs of neighboring agents:

$$x_i(t+1) = f\bigl(\{x_j(t), u_j(t) : j \in M_i \cup \{i\}\}\bigr). \tag{28}$$

Here we suppose that a taxpayer changes her behavior if at least one neighbor has a better payoff. As an example of such dynamics we can use the proportional imitation rule (Sandholm, 2010; Weibull, 1995), in which each agent chooses a neighbor randomly and, if this neighbor received a higher payoff by using a different strategy, the agent switches with a probability proportional to the payoff difference. The proportional imitation rule can be presented as:

$$p\bigl(x_i(t+1) = x_j(t)\bigr) := \left[\frac{\lambda}{|M_i|}\bigl(u_j(t) - u_i(t)\bigr)\right]_0^1 \tag{29}$$

for each agent $i \in N$, where $j \in M_i$ is a uniformly randomly chosen neighbor, $\lambda > 0$ is an arbitrary rate constant, and the notation $[z]_0^1$ indicates $\max(0, \min(1, z))$. Now let us consider two cases of the rule described above.

• Rule. Case 1. The initial distribution of agents is nonuniform. When agent $i$ receives an opportunity to revise her strategy, she considers her neighbors as one homogeneous player with an aggregated payoff function. This payoff function is equal to the mean value of the payoffs of the players included in the homogeneous player. It is assumed that the agent meets any neighbor with uniform probability; then the mixed strategy of such a homogeneous player is the vector of the distribution of pure strategies of the included players. If the payoff of the homogeneous player is better, then player $i$ changes her strategy to the strategy of her more popular neighbor.
• Rule. Case 2. The initial distribution of agents is uniform. In this case agent $i$ keeps her own strategy.
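One synchronous step of the proportional imitation rule (29) with averaged payoffs (27) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the graph is a dictionary of neighbor sets, all agents revise at once, and the payoff values in the note below are hypothetical.

```python
import random


def payoffs(adj, strategy, a, averaged=True):
    """Aggregated payoff (27) of every agent against all of its neighbors."""
    u = {}
    for i, nbrs in adj.items():
        total = sum(a[strategy[i]][strategy[j]] for j in nbrs)
        u[i] = total / len(nbrs) if averaged and nbrs else total
    return u


def clip01(z):
    """The bracket [z]_0^1 = max(0, min(1, z))."""
    return max(0.0, min(1.0, z))


def imitation_step(adj, strategy, a, lam=1.0, rng=random):
    """Apply rule (29) once to every agent, using payoffs of the current state."""
    u = payoffs(adj, strategy, a)
    new_strategy = dict(strategy)
    for i, nbrs in adj.items():
        if not nbrs:
            continue  # isolated agents have nobody to imitate
        j = rng.choice(sorted(nbrs))  # uniformly chosen neighbor
        prob = clip01(lam / len(nbrs) * (u[j] - u[i]))
        if strategy[j] != strategy[i] and rng.random() < prob:
            new_strategy[i] = strategy[j]
    return new_strategy
```

On a three-node path 0 – 1 – 2 with strategies (A, B, B) and hypothetical payoffs of 1 for A and 3 for B, agent 0 has a single neighbor with a payoff advantage of 2, so the clipped switching probability is 1 and the state becomes (B, B, B) after one step. Because payoffs measured in currency units make the bracket saturate at 1 almost immediately, the choice of λ (or a rescaling of payoffs) determines whether the rule is effectively probabilistic.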

5. Numerical simulations

In this section we present numerical examples to support the approaches described in the previous sections and to demonstrate the influence of the structure of the network on the population of taxpayers. In these series of experiments we use the following graph structures: grid and random.

Firstly, we demonstrate the combined model, which contains the spreading of information based on the SIS model and the game between taxpayers and the tax authority. Secondly, we show simulations of information spreading as an evolutionary game on the graph.

5.1. Information spreading in structured population

In this subsection we present a numerical simulation in the population of taxpayers based on the model from section 2.2. Here we suppose that at the initial time moment $t = 0$ the tax authority starts to spread information about a possible tax audit over the population of taxpayers. To simplify calculations we define the function $y(t)$

$$y(t) = \frac{\sum\limits_{i=1}^{N} v_i(t)}{N}, \tag{30}$$

which is the average probability of transfer of information between taxpayers in the population at time moment $t$.

As presented above, we formulate the process of spreading information as an epidemic process on the network. Let $G_N$ be an undirected graph with $N = 30$ nodes, and let $A$ be a symmetric connectivity matrix with binary coefficients $\{a_{ij}\}$:

$$a_{ij} = \begin{cases} 1, & \text{if node } i \text{ has a link with } j, \\ 0, & \text{otherwise.} \end{cases}$$

Remark. If $i = j$ then $a_{ij} = 0$. Here, the value $y_0$ is the probability of receiving information at time moment $t = 0$.

In Figs. 2 – 7 we present a series of experiments varying the coefficient $\sigma$ with step 0.1 and fixed $\delta = 0.1$. In this case information spreads over the population of taxpayers, but the value of the function $y(t)$ decreases because of the growth of the parameter $\sigma$.
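The averaged quantity (30) can be traced with a few lines of Python. Since system (4) itself is not reproduced in this part of the text, the sketch below assumes the standard mean-field SIS form $\dot v_i = \delta (1 - v_i) \sum_j a_{ij} v_j - \sigma v_i$, with $\delta$ read as the transfer rate and $\sigma$ as the obsolescence rate; this form, the explicit Euler scheme, and all names are assumptions and may differ from the equations actually simulated.

```python
def simulate_sis(adj, v0, delta, sigma, dt=0.01, steps=2000):
    """Integrate the assumed mean-field SIS system and return the history of y(t).

    adj is a symmetric 0/1 matrix (list of lists) with zero diagonal,
    v0 is the list of initial probabilities v_i(0).
    """
    n = len(v0)
    v = list(v0)
    history = [sum(v) / n]  # y(0), equation (30)
    for _ in range(steps):
        pressure = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        v = [
            min(1.0, max(0.0, v[i] + dt * (delta * (1.0 - v[i]) * pressure[i] - sigma * v[i])))
            for i in range(n)
        ]
        history.append(sum(v) / n)
    return history


if __name__ == "__main__":
    n = 5  # small complete graph, chosen only for illustration
    complete = [[0 if i == j else 1 for j in range(n)] for i in range(n)]
    y = simulate_sis(complete, [1.0, 0.0, 0.0, 0.0, 0.0], delta=0.1, sigma=0.1, steps=6000)
    print(y[0], y[-1])
```

For a complete graph the nonzero fixed point of this assumed system is $1 - \sigma / (\delta (n - 1))$, which has the same shape as the metastable solution $1 - 1/(\tau(n-1))$ of section 2.2 if $\tau$ is read as $\delta/\sigma$ (again an assumption). For the five-node example this gives 0.75.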

Fig. 2. Experiment 1. δ = 0.1, σ = 0.1. Stationary state: t = 5.717, y(t) = 0.92.

Fig. 3. Experiment 1. δ = 0.1, σ = 0.2. Stationary state: t = 6.621, y(t) = 0.8506.

From the experiments we can see that system (4) reaches its stationary states inside the time interval $t \in [5.7, 8.4]$, while the value of the function $y(t)$ monotonically decreases. If the parameter σ is fixed but the value of δ is decreased, then information still spreads and the behavior of the function $y(t)$ changes; Figs. 8 – 9 demonstrate this process.

Fig. 4. Experiment 1. δ = 0.1, σ = 0.3. Stationary state: t = 6.537, y(t) = 0.778.

Fig. 5. Experiment 1. δ = 0.1, σ = 0.4. Stationary state: t = 6.47, y(t) = 0.7063.

Fig. 6. Experiment 1. δ = 0.1, σ = 0.5. Stationary state: t = 7.689, y(t) = 0.6361.

Fig. 7. Experiment 1. δ = 0.1, σ = 0.6. Stationary state: t = 8.352, y(t) = 0.5665.

Fig. 8. Experiment 2. δ = 0.08, σ = 0.9. Stationary state: t = 6.61, y(t) = 0.5331.

Fig. 9. Experiment 2. δ = 0.07, σ = 0.9. Stationary state: t = 8.114, y(t) = 0.4676.

In the next series of experiments we fix the values of the coefficients as δ = 0.1, σ = 0.8 and estimate the effect of different initial numbers of informed taxpayers in the population. In Table 1 we collect information about the number of links of each taxpayer.

Table 1: The Matrix of links between agents

Node i   Number of links   Node i   Number of links   Node i   Number of links
1        16                11       8                 21       15
2        15                12       19                22       11
3        15                13       16                23       12
4        17                14       16                24       15
5        22                15       13                25       10
6        14                16       11                26       7
7        14                17       15                27       13
8        15                18       15                28       10
9        15                19       14                29       15
10       13                20       10                30       17

In Figs. 10 – 13 we demonstrate the behavior of the system for different initial states. In this case the stationary state of the system is reached at y(t) = 0.4978. We can also notice that the system comes to its steady state faster when the number of informed taxpayers at the initial time moment is larger.

Fig. 10. Experiment 3. Number of informed nodes is 14 at t = 0.

Fig. 11. Experiment 3. Number of informed nodes is 17 at t = 0.

Now we can compute the value of the Social Welfare function SW, based on the known average probability of perceiving information about a future tax audit:

$$SW(y_f) = y_f\, \xi I, \tag{31}$$

where $y_f$ is the value of the function $y(t)$ when the system reaches its steady state. In Experiment 5 we change the initial number of informed taxpayers. In the next series of experiments we compare the behavior of the function $y(t)$ for different values of σ and δ and calculate the SW function.

Fig. 12. Experiment 4. Number of informed nodes is 15 at t = 0.

Fig. 13. Experiment 4. Number of informed nodes is 15 at t = 0.

Fig. 14. Experiment 5. Number of informed nodes is 5 at t = 0. y1(t): δ = 0.9, σ = 0.1, stationary state t = 1.499, y(t) = 0.9916, SW(yf) = 5372.885; y2(t): δ = 0.1, σ = 0.1, stationary state t = 10.09, y(t) = 0.9247, SW(yf) = 5010.395.

Fig. 15. Experiment 5. Number of informed nodes is 5 at t = 0. y1(t): δ = 0.9, σ = 0.5, stationary state t = 1.28, y(t) = 0.958, SW(yf) = 5190.827; y2(t): δ = 0.1, σ = 0.5, stationary state t = 11.95, y(t) = 0.6362, SW(yf) = 3447.186.

Fig. 16. Experiment 5. Number of informed nodes is 5 at t = 0. y1(t): δ = 0.9, σ = 0.9, stationary state t = 1.121, y(t) = 0.9247, SW(yf) = 5010.395; y2(t): δ = 0.1, σ = 0.9, stationary state t = 15.59, y(t) = 0.3625, SW(yf) = 1964.17.

From Figs. 14 – 15 we can notice that if the value of δ grows then the value of the social welfare function SW also increases. However, if at the same time the value of σ is large

Fig. 17. Experiment 5. Number of informed nodes is 10 at t = 0. y1(t): δ = 0.9, σ = 0.9, stationary state t = 1.286, y(t) = 0.9247, SW(yf) = 5010.395; y2(t): δ = 0.1, σ = 0.9, stationary state t = 10.23, y(t) = 0.3625, SW(yf) = 1964.17.

Fig. 18. Experiment 5. Number of informed nodes is 10 at t = 0. y1(t): δ = 0.9, σ = 0.5, stationary state t = 1.548, y(t) = 0.958, SW(yf) = 5190.827; y2(t): δ = 0.1, σ = 0.5, stationary state t = 10.54, y(t) = 0.6362, SW(yf) = 3447.186.

(σ = 0.9), then the value of the SW function is smaller. This means that taxpayers also take into account the importance of the information.

Fig. 19. Experiment 5. Number of informed nodes is 10 at t = 0. y1(t): δ = 0.9, σ = 0.1, stationary state t = 1.662, y(t) = 0.9916, SW(yf) = 5372.885; y2(t): δ = 0.1, σ = 0.1, stationary state t = 10.23, y(t) = 0.9247, SW(yf) = 5010.395.

Fig. 20. Experiment 5. Number of informed nodes is 15 at t = 0. y1(t): δ = 0.9, σ = 0.9, stationary state t = 0.8408, y(t) = 0.9247, SW(yf) = 5010.395; y2(t): δ = 0.5, σ = 0.9, stationary state t = 1.795, y(t) = 0.8656, SW(yf) = 4690.167; y3(t): δ = 0.1, σ = 0.9, stationary state t = 12.87, y(t) = 0.3626, SW(yf) = 1964.17.

Fig. 21. Experiment 5. Number of informed nodes is 15 at t = 0. y1(t): δ = 0.9, σ = 0.5, stationary state t = 0.8463, y(t) = 0.958, SW(yf) = 5190.827; y2(t): δ = 0.1, σ = 0.5, stationary state t = 8.863, y(t) = 0.6362, SW(yf) = 3447.186.

Fig. 22. Experiment 5. Number of informed nodes is 15 at t = 0. y1(t) : δ = 0.9, σ = 0.1, stationary state t = 1.009, y(t) = 0.9916, SW (yf ) = 5372.885; y2(t) : δ = 0.1, σ = 0.1, stationary state t = 7.906, y(t) = 0.9247, SW (yf ) = 5010.395.

Fig. 23. Experiment 5. Number of informed nodes is 20 at t = 0. y1(t): δ = 0.9, σ = 0.9, stationary state t = 0.9392, y(t) = 0.9247, SW(yf) = 5010.395; y2(t): δ = 0.5, σ = 0.9, stationary state t = 2.238, y(t) = 0.8656, SW(yf) = 4690.167; y3(t): δ = 0.1, σ = 0.9, stationary state t = 14.03, y(t) = 0.3626, SW(yf) = 1964.17.

Fig. 24. Experiment 5. Number of informed nodes is 20 at t = 0. y1(t): δ = 0.9, σ = 0.1, stationary state t = 0.8297, y(t) = 0.9916, SW(yf) = 5372.885; y2(t): δ = 0.1, σ = 0.1, stationary state t = 6.033, y(t) = 0.9247, SW(yf) = 5010.395.

Nevertheless, from Figs. 14 – 25 we observe that the number of initially informed taxpayers does not significantly influence the value of the SW function. We can interpret this fact as the importance of information prevailing over the initial number of informed agents in the population. Perhaps these experiments demonstrate one of the possible scenarios for minimizing the cost of audit.

5.2. Evolutionary game on the graph

In this section we present a series of experiments based on the evolutionary model of section 3. We use the network G to define the structure of the population and set the following data for the network, process dynamics and matrix coefficients: the size of the population is n = 30, the share of risk-averse taxpayers in the population is $\nu_a$ = 17% according to psychological research (Niazashvili, 2007), $\frac{\lambda}{|M_i|} = 1$, tax and penalty rates are

Fig. 25. Experiment 5. Number of informed nodes is 20 at t = 0. y1(t): δ = 0.9, σ = 0.5, stationary state t = 1.13, y(t) = 0.958, SW (yf ) = 5190.827; y2(t) : δ = 0.5, σ = 0.5, stationary state t = 1.691, y(t) = 0.9247, SW (yf ) = 5010.395.

ξ = 0.13 due to the income tax rate in Russia (RF Tax Service, 2017), π = 0.065 (for bigger values of π, we obtain even bigger values of optimal audit probability p∗), the average taxpayers' income is I = 47908 due to the value of average income (The web-site of the Russian Federation State Statistics Service, 2017), the optimal value of the probability of audit is p∗ = 0.167, the unit cost of auditing is c = 7455 (minimum wage in St. Petersburg (The web-site of the Russian Federation State Statistics Service, 2017)), the unit cost of information injection is $\tilde c$ = 10% of c = 745.5, and the actual value of the probability of audit is p = 0.1.

Let the number of nodes in the population be n = 30. Then for the first model (which does not include the process of information dissemination) the value of total tax revenue (25) is R1 = 815617.78. For the second model (which takes into account the dissemination of information) we compute the payoff functions of taxpayers: $U_{inf}$ = 41679.96, $U_{ev}$ = 46973.79.

Experiment 5. As the network G we use a random graph to define the structure of the population and assume that the probability of perception of information is α = 0.9. The initial distribution $(\nu_{inf}, \nu_{ev})$ is (19, 11) respectively.
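The two payoff values quoted above follow directly from equations (23) and (24). The Python sketch below recomputes them from the parameters listed in the text (ξ = 0.13, π = 0.065, p = 0.1, I = 47908) and assembles the pairwise payoff matrix of section 3 for a given α; the function names are illustrative only.

```python
def informed_payoff(xi, income):
    """Equation (23): U_inf = (1 - xi) I."""
    return (1.0 - xi) * income


def evader_payoff(p, xi, pi, income):
    """Equation (24): U_ev = (1 - p (xi + pi)) I."""
    return (1.0 - p * (xi + pi)) * income


def pairwise_payoffs(u_inf, u_ev, alpha):
    """Payoff pairs (row player, column player) of the Inf/Noinf bimatrix game."""
    mixed = alpha * u_inf + (1.0 - alpha) * u_ev
    return {
        ("Inf", "Inf"): (u_inf, u_inf),
        ("Inf", "Noinf"): (u_inf, mixed),
        ("Noinf", "Inf"): (mixed, u_inf),
        ("Noinf", "Noinf"): (mixed, mixed),
    }


if __name__ == "__main__":
    u_inf = informed_payoff(0.13, 47908)
    u_ev = evader_payoff(0.1, 0.13, 0.065, 47908)
    print(round(u_inf, 2), round(u_ev, 2))
    for profile, pair in pairwise_payoffs(u_inf, u_ev, alpha=0.9).items():
        print(profile, pair)
```

Since $U_{ev} > U_{inf}$ for these parameters, the Noinf row pays strictly more than the Inf row whenever α is below one, which is consistent with the drift towards strategy B seen in the experiments.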

Fig. 26. Experiment 5: α = 0.9. Initial state is $(\nu_{inf}, \nu_{ev})$ = (19, 11). Strategy A corresponds to yellow dots on the graph and strategy B corresponds to blue dots.

From Fig. 26 we obtain that agents who use strategy A (to perceive information) switch to strategy B (not to perceive information). This process starts if strategy B gives a better payoff than strategy A, which occurs only if most neighbors in the i-th agent's environment are non-informed (see Rule. Case 1). In this case the stationary state of the population is (11, 19). The slight difference between the initial and stationary states is caused by the fact that, if in the initial state agents with strategy A have many connections with B-strategy agents, they change their strategy; otherwise they keep their own behavior.

Fig. 27. Experiment 5: α = 0.9. Final state is (νinf ,νev ) = (11, 19).

Using the initial distribution of Informed and Uninformed taxpayers $(\nu_{inf}, \nu_{ev})$ we compute the revenue of the tax authority (26) for the second model: R2 = 2140450.62. In this experiment we can see that the information injected at the initial time moment significantly increases R, even though the number of informed taxpayers at the final moment has decreased (from 19 to 11). In this case the second model is more effective, that is, the dissemination of information is profitable.

Experiment 6. In Experiment 6, on a random graph, we use the initial proportion of agents (18, 12) and keep the same probability of perception of information α = 0.9.

Fig. 28. Experiment 6: α = 0.9. Initial state is (νinf ,νev) = (18, 12).

In Experiment 6, according to Rule. Case 2, we show that the total population tends to the stationary state (0, 30) from the initial state (18, 12), where the first and the second numbers correspond to agents with strategy A and B respectively.

This replacement occurs because at each stage of the evolutionary process agents with strategy A have many connections with B-strategy agents who receive a better payoff.

Fig. 29. Experiment 6: α = 0.9. Final state is (νinf ,νev) = (0, 30). Total tax revenue R2 = 415850.40.

For this experiment the dissemination of information is unprofitable: R2 decreases significantly because there are not enough agents who received information at the initial time moment; hence at the final moment there are no people in the system who use the information and pay. Thus the aggregated income of the system is R2 = 415850.40, which is two times less than R1.

Experiment 7. This example demonstrates the behavior of a population of taxpayers structured by a random graph with a low probability of perception of information, α = 0.1.

Fig. 30. Experiment 7: α = 0.1. Initial state is (νinf ,νev) = (12, 18).

In Experiment 7 the population contains a mixture of A- and B-strategy agents; here the stationary state is (7, 23). Agents who use strategy A change it to B if they are surrounded by B-strategy neighbors; otherwise they hold their own strategy if they have many connections with A-strategy taxpayers. Total tax revenue is R2 = 1661745.54. This experiment shows that the dissemination of information helps to increase the revenue of the tax authority, despite the fact that the share of informed agents has declined in comparison to the initial value. In this example we have a significant number of agents who received information at the initial time moment; therefore the second model is more attractive, that is, R2 > R1.

Fig. 31. Experiment 7: α = 0.1. Final state is $(\nu_{inf}, \nu_{ev})$ = (7, 23). Total tax revenue R2 = 1661745.54.

Now let us consider the series of experiments in which the number of nodes in the network (the size of the population) is n = 25. Then for the first model the value of total tax revenue (25) is R1 = 679681.49.

Experiments 8 and 9 show that if the probability of perception of information is α = 0.1 and the structure of the graph is a grid, then for both variants of the initial distribution of Uninformed and Informed taxpayers the system tends to the «pure» stationary state $(\nu_{inf}, \nu_{ev})$ = (0, 25).

Fig. 32. Experiment 8. Grid. Probability of perception information α = 0.1. n = 25. Initial state is $(\nu_{inf}, \nu_{ev})$ = (10, 15).

Fig. 33. Experiment 8. Grid. Probability of perception information α = 0.1. n = 25. Stationary state is $(\nu_{inf}, \nu_{ev})$ = (0, 25). Total tax revenue R2 = 454639.50.

In both experiments we obtain that the spreading of information is unprofitable due to the structure of the network.

Experiment 10 shows that for the probability α = 0.1 and the grid structure of the graph the system tends over time to a «mixed» stationary state. Total tax revenue is R2 = 2930844.84.

In Experiment 11 the probability of perception of information is α = 0.9 and the structure of the graph is a grid. For this initial distribution of Uninformed and Informed taxpayers the system tends to a «mixed» stationary state, and the location of informed taxpayers at the initial time moment has a strong influence on the final state. In this case the aggregated revenue of the system is R2 = 978074.58. If

Fig. 34. Experiment 9. Grid. Probability of perception information α = 0.1. n = 25. Initial state $(\nu_{inf}, \nu_{ev})$ = (15, 10).

Fig. 35. Experiment 9. Grid. Probability of perception information α = 0.1. n = 25. Stationary state is $(\nu_{inf}, \nu_{ev})$ = (0, 25). Total tax revenue R2 = 342814.50.

Fig. 36. Experiment 10. Grid. Probability of perception information α = 0.1. n = 25. Initial state is $(\nu_{inf}, \nu_{ev})$ = (20, 5).

Fig. 37. Experiment 10. Grid. Probability of perception information α = 0.1. n = 25. Stationary state is $(\nu_{inf}, \nu_{ev})$ = (17, 8). Total tax revenue R2 = 2930844.84.

the probability of perceiving information is high, the number of agents who do not perceive the information is larger than in the previous case. But for the network with this structure R2 is still bigger than R1.

Fig. 38. Experiment 11. Grid. Probability of perception information α = 0.9. n = 25. Initial state is $(\nu_{inf}, \nu_{ev})$ = (15, 10).

Fig. 39. Experiment 11. Grid. Probability of perception information α = 0.9. n = 25. Stationary state is $(\nu_{inf}, \nu_{ev})$ = (4, 21). Total tax revenue R2 = 978074.58.

In Experiment 12 we also obtain a "mixed" stationary state that is very close to the initial distribution of Informed and Uninformed taxpayers.

Fig. 40. Experiment 12. Grid. Probability of perception information $\alpha = 0.9$. $n = 25$. Initial state is $(\nu_{inf}, \nu_{ev}) = (20, 5)$.

Fig. 41. Experiment 12. Grid. Probability of perception information $\alpha = 0.9$. $n = 25$. Stationary state is $(\nu_{inf}, \nu_{ev}) = (7, 18)$.

This example demonstrates that if the number of agents who have perceived the information is large in the steady state, then the revenue of the system increases significantly. We have the distribution (20, 5) at the initial time moment and (18, 7) in the steady state. That is, if at the initial time moment the number of informed taxpayers is equal to or bigger than 15, we see a weak decrease in the number of agents susceptible to information, but $R_2$ is almost 5 times bigger than $R_1$. In Experiments 13 and 14, for the random graph and probability $\alpha = 0.1$, we obtain, as in the previous cases, that the stationary state depends on the initial distribution of taxpayers, and only if initially informed taxpayers prevail in the population do we have a "mixed" stationary state. The experiments show that if the number of taxpayers susceptible to information is zero, then the total tax revenue of the system does not increase.

Fig. 42. Experiment 13. Random graph. Probability of perception information $\alpha = 0.1$. $n = 25$. Initial state is $(\nu_{inf}, \nu_{ev}) = (15, 10)$.

Fig. 43. Experiment 13. Random graph. Probability of perception information is $\alpha = 0.1$. $n = 25$. Stationary state $(\nu_{inf}, \nu_{ev}) = (0, 25)$. Total tax revenue $R_2 = 342814.5$.

In Experiments 15 and 16, for the random graph and probability $\alpha = 0.9$, the stationary state is "mixed" for different initial distributions of Informed and Uninformed taxpayers. For Experiment 15, $R_2 = 931084.56$; for Experiment 16, $R_2 = 2566224.78$.

Fig. 44. Experiment 14. Random graph. Probability of perception information is $\alpha = 0.1$. $n = 25$. Initial state $(\nu_{inf}, \nu_{ev}) = (20, 5)$.

Fig. 45. Experiment 14. Random graph. Probability of perception information is $\alpha = 0.1$. $n = 25$. Stationary state $(\nu_{inf}, \nu_{ev}) = (18, 7)$. Total tax revenue $R_2 = 3089659.86$.

Fig. 46. Experiment 15. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Initial state $(\nu_{inf}, \nu_{ev}) = (10, 15)$.

Fig. 47. Experiment 15. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Stationary state $(\nu_{inf}, \nu_{ev}) = (3, 22)$. Total tax revenue $R_2 = 931084.56$.

Fig. 48. Experiment 16. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Initial state $(\nu_{inf}, \nu_{ev}) = (15, 10)$.

Fig. 49. Experiment 16. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Stationary state $(\nu_{inf}, \nu_{ev}) = (14, 11)$. Total tax revenue $R_2 = 2566224.78$.

Experiment 17 shows the case where the final stationary state coincides with the initial distribution of taxpayers. In this case $R_2 = 3407289.90$.

From the series of experiments 6-17 we see that the following parameters govern the expert influence upon the system: the probability of perceiving information and the initial distribution of Informed and Uninformed taxpayers. Those values increase total tax revenue significantly. The structure of the initial distribution and the initial location of Informed taxpayers also affect the revenue of the system.

Fig. 50. Experiment 17. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Initial state $(\nu_{inf}, \nu_{ev}) = (20, 5)$.

Fig. 51. Experiment 17. Random graph. Probability of perception information is $\alpha = 0.9$. $n = 25$. Stationary state $(\nu_{inf}, \nu_{ev}) = (20, 5)$. Total tax revenue $R_2 = 3407289.90$.

Thus the simulations presented above show that the total revenue of the system $R_2$ depends on the structure of the network. In both cases, whether the probability $\alpha$ is high ($\alpha = 0.9$) or low ($\alpha = 0.1$), we obtain that if enough agents susceptible to the information remain in the network when the system reaches its steady state, then the total tax revenue is larger than $R_1$, which is computed for the model that does not include the process of information spreading.

6. Conclusion

In this paper we have presented two approaches which combine inter-agent communications and the process of information propagation. First, we formulated a game-theoretical model of interaction between taxpayers and the tax authority which includes information spreading based on the structured SIS model. The second approach uses an evolutionary game on a network to illustrate the idea of using information in the fiscal system. In both cases we present mathematical formulations, an analysis of the behavior of the system and a series of experiments. We investigate the impact of information received from the tax authority on the decisions of taxpayers. We found that the final distribution of taxpayers who pay taxes depends on the network structure, risk status and received information. The propagation of information about a possible tax audit has a positive effect on the total revenue of the fiscal system and increases the total number of taxpayers who prefer to pay taxes honestly.

Contributions to Game Theory and Management, X, 129–142

A Search Game with Incomplete Information on Detective Capability of Searcher⋆

Ryusuke Hohzaki

National Defense Academy, Department of Computer Science, 1-10-20 Hashirimizu, Yokosuka, 239-8686, Japan
E-mail: [email protected]

Abstract This paper deals with the so-called search allocation game (SAG), in which a searcher distributes search resources, such as detection sensors and search time, over a search space to detect a target, and the target moves to evade detection. Although many papers have been published on the SAG, almost all of them deal with complete information games. In this paper, we consider private information of the searcher about the detection effectiveness of the search resource and discuss a two-person zero-sum incomplete information SAG with the detection probability of the target as payoff. We derive its Bayesian equilibrium to evaluate the value of the incomplete information.

Keywords: search theory, game theory, incomplete information.

1. Introduction

This paper deals with the so-called search allocation game (SAG) (Hohzaki, 2013a), in which a searcher distributes search resources, such as detection sensors and search time, over a search space to detect a target, and the target moves to evade detection. Although many papers have been published on the SAG, almost all of them deal with complete information games. In this paper, we consider private information of the searcher about the detection effectiveness of the search resource and discuss a two-person zero-sum incomplete information SAG with the detection probability of the target as payoff. We derive its Bayesian equilibrium to evaluate the value of the incomplete information.

Morse and Kimball (1951) discussed a search game in their well-known book, framing it as the problem of controlling a submarine's passage through a strait. In the history of search theory, researchers first solved optimal search problems for stationary targets and then for moving targets. Koopman (1957) studied the optimal distribution of search resources for detecting targets effectively. His problem became the origin of the research field of so-called optimal resource allocation. These one-sided search problems assume that the searcher knows a probabilistic rule governing the target's existence or movement. De Guenin (1961) and Kadane (1968) studied stationary target problems, and Pollock (1970), Iida (1972), Brown (1980) and Washburn (1983) studied moving target problems. Stone (1975) compiled the theoretical results of these early studies in his book.

⋆ This work was supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (C) Grant Number 26350447.

After that, researchers came to regard the existence distribution and movement of the target in search spaces, which had been assumed to be given by some probabilistic laws, as elements of the target's decision making and began to discuss search games between a searcher and a target. Almost all search game models take the target's existence or movement as the target strategy, whereas they consider various types of searcher's strategies (Hohzaki, 2016). The search game with the distribution strategy of search resources as the searcher's strategy is referred to as the search allocation game (SAG). Nakai (1988) and Iida et al. (1994) discussed SAGs with stationary targets. Iida et al. (1996) and Hohzaki and Iida (1998) dealt with moving target SAGs. Washburn and Hohzaki (2001) developed a SAG model embedding practical factors, e.g. energy and geographical constraints, into the target movement, and Hohzaki (2006) generalized the model. Dambreville and Le Cadre (2002) and Hohzaki (2008) took account of some practical features of search resources in their SAGs. Hohzaki (2007a) analyzed a SAG with occurrence of false contacts in a search space. Multi-stage, cooperative and nonzero-sum SAGs were studied by Hohzaki (2007b), Hohzaki (2009) and Hohzaki (2013b), respectively.

In the previous papers mentioned above, the SAGs were modeled as complete information games. Namely, the target and the searcher share all information about the players as common knowledge. However, we can think of cases in which specific private information of a player has serious effects on the outcome of the game. At the beginning of the game, the initial position and the initial movement energy of the target are naturally known to the target. Hohzaki and Joo (2015) and Matsuo and Hohzaki (2017) were the first to discuss incomplete information SAGs with the target's private information about the initial position and the initial energy, respectively. On the searcher's side, only the searcher properly knows the effectiveness of search resources such as detection sensors because the searcher uses the resources to detect the target. The target tries to learn the detection effectiveness of the sensors and obtains a rough picture of it, although he cannot know it with certainty. How much he can learn depends on the technological history of the sensor, its patents, how open the technology is to the world, and so on. If it is a high technology, it would be kept strictly secret and the target's estimate of its effectiveness would be rather uncertain.

In this paper, we model an incomplete information SAG with private information about search resources such as detection sensors. We derive its equilibrium to evaluate the value of the information. Using numerical examples, we carry out the evaluation concretely and analyze how the target's success or failure in estimating the resource's effectiveness affects the results of the game. The analysis gives us a lesson about R&D of sensing or search resource technology.

2. A Search Game with Searcher's Information about Sensors

The detection capability of the searcher's sensors strongly affects the results of search operations, and naturally only the searcher knows this capability. We consider a one-shot search game between a searcher and a target with the searcher's private information about the sensors' capability.

(A1) A search space consists of a discrete time space, denoted by $\mathbf{T} = \{1, \cdots, T\}$, and a discrete geographic space, denoted by $\mathbf{K} = \{1, \cdots, K\}$.

(A2) There are two players: a searcher and a target. Only the searcher knows the detection capability of the searcher's sensors. The capability is indexed by the sensor's types $H$. The target does not know the sensor type the searcher currently uses but can guess it according to a distribution of the types $\{f(h), h \in H\}$, where $\sum_{h \in H} f(h) = 1$.

(A3) The target starts from its possible initial cells $S_0 \subseteq \mathbf{K}$ and moves in the space $\mathbf{K}$ as time elapses. His motion has some constraints: he can move from current cell $i$ at time $t$ to cells $N(i,t)$ at time $t+1$; he takes energy $\mu(i,j)$ to move from cell $i$ to $j$; he possesses initial energy $e_0$ at time $t = 1$; he can do nothing but stay at his current cell if he exhausts his energy. Let us denote the set of feasible target paths satisfying all the constraints above by $\Omega$. If the target chooses a path $\omega \in \Omega$, he will be at cell $\omega(t) \in \mathbf{K}$ at time $t \in \mathbf{T}$.

(A4) To detect the target, the searcher distributes search resources using a specific type of sensors after an initial time $\tau$, that is, the searchable time period is denoted by $\widehat{\mathbf{T}} = \{\tau, \tau+1, \cdots, T\}$. $\Phi(t)$ resources are available to the searcher at each time $t$. Let us denote a distribution plan of the type $h$ of search resources by $\varphi_h = \{\varphi_h(i,t),\ i \in \mathbf{K},\ t \in \widehat{\mathbf{T}}\}$, where $\varphi_h(i,t)$ is the amount of resources to be scattered in cell $i$ at time $t$. We call the searcher using the $h$-type resources or sensors the $h$-type searcher.

(A5) The distribution of $x$ $h$-type resources in cell $i$ gives the searcher the following detection probability

$$1 - \exp(-\alpha_i^h x) \tag{1}$$

only if the target is there. Parameter $\alpha_i^h$ indicates the detection effectiveness per unit $h$-type resource in cell $i$.

(A6) If the searcher detects the target, the searcher gets reward 1 and the target loses the same.
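As a rough illustration of Assumption (A5), the following Python sketch evaluates the detection probability of Equation (1) and checks two properties: returns diminish as resources accumulate, and, if looks at the same cell are treated as independent, non-detection probabilities multiply, so effort adds up in the exponent. The function name and all numerical values are illustrative assumptions, not taken from the paper.

```python
import math

def detection_probability(alpha: float, x: float) -> float:
    """Equation (1): probability of detecting a target that is in the cell,
    given x units of resource with per-unit effectiveness alpha."""
    return 1.0 - math.exp(-alpha * x)

# Illustrative values (assumptions, not from the paper).
alpha = 0.8
p_one = detection_probability(alpha, 1.0)
p_two = detection_probability(alpha, 2.0)

# Diminishing returns: doubling the resource less than doubles the probability.
diminishing = p_two < 2.0 * p_one

# Two independent looks with x1 and x2 units miss the target with probability
# exp(-alpha * x1) * exp(-alpha * x2), the same as one look with x1 + x2 units.
x1, x2 = 0.3, 0.7
miss_separately = (1.0 - detection_probability(alpha, x1)) * (
    1.0 - detection_probability(alpha, x2)
)
miss_together = 1.0 - detection_probability(alpha, x1 + x2)
```

The multiplicative form of the miss probability is what allows a sum of efforts to appear inside a single exponential when several time periods are combined.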

If we denote the residual energy of the target at the beginning of time $t$ by $e(t)$, we can define the movement constraints in Assumption (A3) on a feasible target path $\omega \in \Omega$ starting from an initial cell $k$ as follows:

(i) Initial cell condition: $\omega(1) \in S_0$
(ii) Movable cell conditions: $\omega(t+1) \in N(\omega(t), t)$, $t = 1, \cdots, T-1$
(iii) Initial energy condition: $e(1) = e_0$
(iv) Preservation conditions of energy: $e(t+1) = e(t) - \mu(\omega(t), \omega(t+1))$, $t = 1, \cdots, T-1$
(v) Energy condition for a motion: $\mu(\omega(t), \omega(t+1)) \le e(t)$, $t = 1, \cdots, T-1$

$\Omega$ is defined as the set of feasible paths satisfying all the conditions above. As seen from Assumption (A6), the problem is a two-person zero-sum game with the detection probability as payoff. Before deriving the payoff function of the game, let us confirm the feasible region of the distribution plan $\varphi_h$ of the $h$-type searcher. It is given by

$$\Psi_h \equiv \left\{ \varphi_h \;\middle|\; \sum_{i \in \mathbf{K}} \varphi_h(i,t) \le \Phi(t),\ \varphi_h(i,t) \ge 0,\ i \in \mathbf{K},\ t \in \widehat{\mathbf{T}} \right\} \tag{2}$$

from Assumption (A4). Assume that the $h$-type searcher takes a pure strategy $\varphi_h$ and the target takes a pure strategy $\omega$. At time $t$, $\varphi_h(\omega(t),t)$ resources are effective for detecting the target because the target is in cell $\omega(t)$. Therefore, the searcher has the detection probability of the target

$$R_h(\varphi_h, \omega) = 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right)$$

as the payoff function from Equation (1). We consider a target mixed strategy $\pi \equiv \{\pi(\omega),\ \omega \in \Omega\}$, where $\pi(\omega)$ is the probability of choosing a path $\omega$. Its feasible region is given by

$$\Pi \equiv \left\{ \{\pi(\omega)\} \;\middle|\; \sum_{\omega \in \Omega} \pi(\omega) = 1,\ \pi(\omega) \ge 0,\ \omega \in \Omega \right\}.$$
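The path set $\Omega$ defined by conditions (i)-(v) can be enumerated by a depth-first search that tracks the residual energy. The sketch below is a minimal illustration under simplifying assumptions: the neighbor sets do not depend on time (the paper allows $N(i,t)$), and the three-cell line graph, the move costs and the names `enumerate_paths`, `neighbors` and `cost` are hypothetical.

```python
from typing import Dict, List, Tuple

def enumerate_paths(
    start_cells: List[int],
    neighbors: Dict[int, List[int]],
    cost: Dict[Tuple[int, int], int],
    e0: int,
    horizon: int,
) -> List[Tuple[int, ...]]:
    """Enumerate all paths (omega(1), ..., omega(T)) satisfying (i)-(v)."""
    paths: List[Tuple[int, ...]] = []

    def extend(path: List[int], energy: int) -> None:
        if len(path) == horizon:
            paths.append(tuple(path))
            return
        here = path[-1]
        for nxt in neighbors[here]:                # (ii) movable cells
            mu = cost[(here, nxt)]
            if mu <= energy:                       # (v) enough energy to move
                extend(path + [nxt], energy - mu)  # (iv) energy bookkeeping

    for s in start_cells:                          # (i) initial cell
        extend([s], e0)                            # (iii) initial energy
    return paths

# Toy instance (an assumption): three cells on a line, staying is free,
# a move to an adjacent cell costs 1, initial energy 1, horizon T = 3.
neighbors = {1: [1, 2], 2: [1, 2, 3], 3: [2, 3]}
cost = {(i, j): (0 if i == j else 1) for i in neighbors for j in neighbors[i]}
paths = enumerate_paths([1], neighbors, cost, e0=1, horizon=3)
```

With one unit of energy the target can afford a single move, so only the paths that stay put or move once survive the energy check.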

By the $h$-type searcher's strategy $\varphi_h$ and the target mixed strategy $\pi$, the searcher has the expected payoff as follows:

$$
\begin{aligned}
R_h(\varphi_h, \pi) &= \sum_{\omega \in \Omega} \pi(\omega) R_h(\varphi_h, \omega) \\
&= \sum_{\omega \in \Omega} \pi(\omega) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\} \\
&= 1 - \sum_{\omega \in \Omega} \pi(\omega) \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right).
\end{aligned} \tag{3}
$$

The $h$-type searcher wants to maximize the payoff. On the other hand, the target desires to minimize the following expected payoff:

$$
\begin{aligned}
R(\varphi, \pi) &= \sum_{h \in H} f(h) R_h(\varphi_h, \pi) \\
&= \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\} \\
&= 1 - \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right),
\end{aligned} \tag{4}
$$

considering all types of searchers' strategies $\varphi \equiv \{\varphi_h,\ h \in H\}$ based on the probability distribution $\{f(h),\ h \in H\}$ of the searcher's type. The expected payoff by a target pure strategy $\omega$ and all types of searchers' strategies $\varphi$, $R(\varphi, \omega)$, is given by

$$R(\varphi, \omega) \equiv \sum_{h \in H} f(h) R_h(\varphi_h, \omega) = \sum_{h \in H} f(h) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\}.$$

At a glance, the payoff looks different between the searcher and the target. Let us discuss this search game and derive an equilibrium.
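To make the payoff definitions concrete, the following sketch computes $R_h(\varphi_h, \pi)$ of Equation (3) and $R(\varphi, \pi)$ of Equation (4) in two ways (as the $f$-weighted sum of the type payoffs and via the closed form in the last line of (4)) on a tiny instance. All names, the two-cell space, the plans and the probabilities are hypothetical values chosen only for illustration.

```python
import math
from typing import Dict, Tuple

Path = Tuple[int, ...]                 # (omega(1), ..., omega(T))
Plan = Dict[Tuple[int, int], float]    # (cell i, time t) -> phi_h(i, t)

def effort(plan: Plan, alpha: Dict[int, float], path: Path, tau: int) -> float:
    """Weighted search effort met along a path from time tau onward."""
    return sum(alpha[cell] * plan.get((cell, t), 0.0)
               for t, cell in enumerate(path, start=1) if t >= tau)

def type_payoff(plan, alpha, pi, tau) -> float:
    """Equation (3): R_h(phi_h, pi)."""
    return sum(p * (1.0 - math.exp(-effort(plan, alpha, w, tau)))
               for w, p in pi.items())

def expected_payoff(plans, alphas, f, pi, tau) -> float:
    """Equation (4), first line: sum over types of f(h) * R_h(phi_h, pi)."""
    return sum(f[h] * type_payoff(plans[h], alphas[h], pi, tau) for h in f)

def expected_payoff_closed_form(plans, alphas, f, pi, tau) -> float:
    """Equation (4), last line: one minus the expected non-detection probability."""
    miss = sum(p * sum(f[h] * math.exp(-effort(plans[h], alphas[h], w, tau)) for h in f)
               for w, p in pi.items())
    return 1.0 - miss

# Toy data (assumptions): two cells, T = 2, search allowed from tau = 1.
tau = 1
pi = {(1, 1): 0.5, (2, 2): 0.5}
f = {1: 0.4, 2: 0.6}
alphas = {1: {1: 0.8, 2: 0.2}, 2: {1: 0.8, 2: 0.6}}
even_plan = {(i, t): 0.5 for i in (1, 2) for t in (1, 2)}
plans = {1: even_plan, 2: even_plan}

r_sum = expected_payoff(plans, alphas, f, pi, tau)
r_closed = expected_payoff_closed_form(plans, alphas, f, pi, tau)
```

Both evaluations agree because the probabilities $\pi(\omega)$ and $f(h)$ each sum to one, which is the step that turns the second line of (4) into the third.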

3. Equilibrium of the One-Shot Search Game

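As seen from Equations (3) and (4), a set of optimal strategies $\varphi_h$ maximizing $R_h(\varphi_h, \pi)$ for every $h$ also maximizes $R(\varphi, \pi)$ in the aggregate, because $R(\varphi, \pi)$ is a sum of the $R_h(\varphi_h, \pi)$ with nonnegative weights $f(h)$ and each $\varphi_h$ appears only in its own term. As a result, all types of searchers and the target play a two-person zero-sum game with the payoff $R(\varphi, \pi)$. A max-min optimization of $R(\varphi, \pi)$ is all we have to do to obtain optimal searchers' strategies for all types. We can transform the minimization of $R(\varphi, \pi)$ with respect to $\pi$ into

$$
\begin{aligned}
\min_{\pi} R(\varphi, \pi) &= \min_{\pi} \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) R_h(\varphi_h, \omega) \\
&= \min_{\omega \in \Omega} \sum_{h \in H} f(h) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\}.
\end{aligned} \tag{5}
$$

We maximize the value minimized above with respect to $\varphi$ to have the following formulation of the max-min optimization:

$$
\begin{aligned}
(P_S) \quad \max_{\varphi, \nu}\ & \nu \\
\text{s.t.}\ & \sum_{h \in H} f(h) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\} \ge \nu, \quad \omega \in \Omega, \qquad (6) \\
& \sum_{i \in \mathbf{K}} \varphi_h(i,t) \le \Phi(t), \quad t \in \widehat{\mathbf{T}},\ h \in H, \\
& \varphi_h(i,t) \ge 0, \quad i \in \mathbf{K},\ t \in \widehat{\mathbf{T}},\ h \in H.
\end{aligned}
$$

Because the left-hand side of (6) is concave in $\varphi$ and the remaining constraints are linear, the feasible region of the variables $\varphi$ and $\nu$ is a convex set and the problem is a convex programming problem. We can numerically solve it with a general-purpose commercial solver to get optimal strategies $\varphi^*$ of all types of searchers.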
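The max-min structure of $(P_S)$ can be seen on an instance small enough for exhaustive search. The sketch below is not the convex-programming approach mentioned above; it swaps in a plain grid search, and it assumes a hypothetical game with one time period, two cells, a target that simply picks one of the two cells, and searchers that always spend the whole budget (reasonable here because the payoff never decreases in the amount of resource). All names and values are illustrative.

```python
import math
from itertools import product

# Illustrative data (assumptions): ALPHA[h] = (effectiveness in cell 1, in cell 2).
ALPHA = {1: (0.8, 0.2), 2: (0.8, 0.6)}
F = {1: 0.5, 2: 0.5}      # target's belief about the searcher's type
PHI = 1.0                 # resource available in the single period

def worst_case(x: dict) -> float:
    """Left-hand side of (6) minimized over the target's two pure strategies.
    x[h] is the amount type h puts in cell 1; the rest goes to cell 2."""
    in_cell_1 = sum(F[h] * (1.0 - math.exp(-ALPHA[h][0] * x[h])) for h in F)
    in_cell_2 = sum(F[h] * (1.0 - math.exp(-ALPHA[h][1] * (PHI - x[h]))) for h in F)
    return min(in_cell_1, in_cell_2)

def grid_max_min(steps: int = 100):
    """Brute-force stand-in for a convex solver: maximize the worst case on a grid."""
    grid = [PHI * k / steps for k in range(steps + 1)]
    best_value, best_plan = -1.0, None
    for x1, x2 in product(grid, repeat=2):
        value = worst_case({1: x1, 2: x2})
        if value > best_value:
            best_value, best_plan = value, {1: x1, 2: x2}
    return best_value, best_plan

value, plan = grid_max_min()
```

The variable `value` plays the role of $\nu$ in $(P_S)$: it is the largest detection probability the searchers can guarantee against every target choice on this grid. For realistic instances the number of paths makes a proper convex solver necessary.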

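We note that the optimal $h$-type searcher's strategy $\varphi_h^*$ maximizes $R_h(\varphi_h, \pi)$ for a given target strategy $\pi$, and its maximized value is the real detection probability if the type $h$ is realized for the searcher. Let us write down the maximization problem.

$$
\begin{aligned}
(P_S^h) \quad D_h^* = \max_{\varphi_h}\ & \left\{ 1 - \sum_{\omega \in \Omega} \pi(\omega) \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \right\} \\
\text{s.t.}\ & \sum_{i \in \mathbf{K}} \varphi_h(i,t) \le \Phi(t), \quad t \in \widehat{\mathbf{T}}, \qquad (7) \\
& \varphi_h(i,t) \ge 0, \quad i \in \mathbf{K},\ t \in \widehat{\mathbf{T}}. \qquad (8)
\end{aligned}
$$

We now derive an optimal target strategy. The optimal strategy $\pi$ must be a best response to the optimal searchers' strategies $\varphi^* = \{\varphi_h^*,\ h \in H\}$ of the problem $(P_S)$, so that $\pi$ minimizes $R(\varphi^*, \pi) = \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) R_h(\varphi_h^*, \omega)$. Conversely, $\varphi_h^*$ must be a best response to the target strategy $\pi$ and hence an optimal solution of the problem $(P_S^h)$. Setting Lagrange multipliers $\lambda_h(t)$ and $\eta_h(i,t)$ corresponding to conditions (7) and (8), we define a Lagrange function by

$$
\begin{aligned}
L_h(\varphi_h; \lambda_h(t), \eta_h(i,t)) \equiv\ & \sum_{\omega \in \Omega} \pi(\omega) \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h(\omega(t),t) \right) \\
& + \sum_{t \in \widehat{\mathbf{T}}} \lambda_h(t) \left( \sum_{i \in \mathbf{K}} \varphi_h(i,t) - \Phi(t) \right) - \sum_{i,t} \eta_h(i,t) \varphi_h(i,t)
\end{aligned}
$$

and have the so-called Karush-Kuhn-Tucker (KKT) conditions as necessary and sufficient conditions for an optimal solution of the problem $(P_S^h)$, as follows:

$$
\begin{aligned}
& \frac{\partial L_h}{\partial \varphi_h(i,t)} = -\alpha_i^h \sum_{\omega \in \Omega_{it}} \pi(\omega) \exp\left( -\sum_{t' \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right) + \lambda_h(t) - \eta_h(i,t) = 0, && (9) \\
& \lambda_h(t) \left( \sum_{i \in \mathbf{K}} \varphi_h(i,t) - \Phi(t) \right) = 0, \quad t \in \widehat{\mathbf{T}}, && (10) \\
& \eta_h(i,t) \varphi_h(i,t) = 0, \quad (i,t) \in \mathbf{K} \times \widehat{\mathbf{T}}, && (11) \\
& \lambda_h(t) \ge 0, \quad t \in \widehat{\mathbf{T}}, && (12) \\
& \eta_h(i,t) \ge 0, \quad (i,t) \in \mathbf{K} \times \widehat{\mathbf{T}}, && (13) \\
& \sum_{i \in \mathbf{K}} \varphi_h(i,t) \le \Phi(t), \quad t \in \widehat{\mathbf{T}}, && (14) \\
& \varphi_h(i,t) \ge 0, \quad (i,t) \in \mathbf{K} \times \widehat{\mathbf{T}}, && (15)
\end{aligned}
$$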

where $\Omega_{it} \equiv \{\omega \in \Omega \mid \omega(t) = i\}$. We unify conditions (9)–(13) to have the following conditions for an optimal solution $\varphi_h^*$.

(i) If $\varphi_h^*(i,t) > 0$,

$$\alpha_i^h \sum_{\omega \in \Omega_{it}} \pi(\omega) \exp\left( -\sum_{t' \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t')} \varphi_h^*(\omega(t'),t') \right) = \lambda_h(t). \tag{16}$$

(ii) If $\varphi_h^*(i,t) = 0$,

$$\alpha_i^h \sum_{\omega \in \Omega_{it}} \pi(\omega) \exp\left( -\sum_{t' \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t')} \varphi_h^*(\omega(t'),t') \right) \le \lambda_h(t). \tag{17}$$

Because the optimal solution of the problem $(P_S)$, $\varphi^* = \{\varphi_h^*\}$, satisfies conditions (14) and (15), we can regard conditions (16) and (17) as the conditions on an optimal $\pi^*$ such that $\varphi_h^*$ becomes a best response to $\pi$. If the conditions hold for every type $h$, $\varphi^* = \{\varphi_h^*\}$ is, in the aggregate, the best response to $\pi$. From the discussion so far, we formulate the problem of finding an optimal target strategy $\pi^*$ as follows:

$$
\begin{aligned}
(P_T) \quad \min_{\pi, \lambda}\ & \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) \left\{ 1 - \exp\left( -\sum_{t \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t)} \varphi_h^*(\omega(t),t) \right) \right\} \\
\text{s.t.}\ & \alpha_i^h \sum_{\omega \in \Omega_{it}} \pi(\omega) \exp\left( -\sum_{t' \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t')} \varphi_h^*(\omega(t'),t') \right) = \lambda_h(t) \qquad (18) \\
& \qquad \text{for } (i,t,h) \in \mathbf{K} \times \widehat{\mathbf{T}} \times H \text{ with } \varphi_h^*(i,t) > 0, \\
& \alpha_i^h \sum_{\omega \in \Omega_{it}} \pi(\omega) \exp\left( -\sum_{t' \in \widehat{\mathbf{T}}} \alpha^h_{\omega(t')} \varphi_h^*(\omega(t'),t') \right) \le \lambda_h(t) \qquad (19) \\
& \qquad \text{for } (i,t,h) \in \mathbf{K} \times \widehat{\mathbf{T}} \times H \text{ with } \varphi_h^*(i,t) = 0, \\
& \sum_{\omega \in \Omega} \pi(\omega) = 1, \qquad (20) \\
& \pi(\omega) \ge 0, \quad \omega \in \Omega. \qquad (21)
\end{aligned}
$$

4. Consistency of Equilibrium with Dynamic Game

The situation of the search game changes as time elapses. At each time on the time horizon, each player comes to acquire private or common information about the situation of the game. That is why we regard our one-shot game as a dynamic game evolving on the time horizon. Here we confirm that the equilibrium of $\pi^*$ and $\varphi^*$, which we derived in Section 3, is consistent with the dynamic game.

Assume that the game has progressed to time $t$. Assume further that the target has not been detected and has residual energy $e$ at cell $i$ while moving along a pre-determined path $\omega$. At the present time $t$, both players recognize the non-detection of the target, but the selection of the path $\omega$ and the current state $(i,e)$ are private information of the target. The realized type $h$ of the searcher and the past distribution of search resources $\{\varphi_h^*(i,t'),\ i \in \mathbf{K},\ t' = \tau, \cdots, t-1\}$ are private information of the searcher. Each player can make a rational estimation of his opponent's strategy, $\varphi^*$ or $\pi^*$, from the problems $(P_S)$ or $(P_T)$. For the game continuing after the current time $t$, rational decision making of the target and the searcher must have the following features:

(i) If the $h$-type searcher took a distribution plan $\varphi^{\tau}_{t-1} \equiv \{\varphi_h(i,t'),\ i \in \mathbf{K},\ t' = \tau, \cdots, t-1\}$ in the past, he should revise his strategy after time $t$ by rationally estimating the target strategy of path selection by

$$\pi'(\omega) = \frac{\pi^*(\omega) \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right)}{\sum_{\omega' \in \Omega} \pi^*(\omega') \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega'(t')} \varphi_h(\omega'(t'),t') \right)} \tag{22}$$

in a Bayesian manner. The revised searcher's strategy coincides with the pre-planned strategy $\{\varphi_h^*(i,t'),\ t' = t, \cdots, T\}$. This is proved later.

(ii) The target should revise the estimation of the probability of the searcher's type $h$ by

$$f'(h) = \frac{f(h) \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega(t')} \varphi_h^*(\omega(t'),t') \right)}{\sum_{h' \in H} f(h') \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^{h'}_{\omega(t')} \varphi_{h'}^*(\omega(t'),t') \right)} \tag{23}$$

in a Bayesian manner. Based on the estimation, the target should choose a path $\tilde{\omega}$ starting from his current state $(i,t,e)$ with probability

$$\tilde{\pi}(\tilde{\omega}) = \frac{\sum_{\omega' \in \Omega(i,t,e;\tilde{\omega})} \pi^*(\omega')}{\sum_{\omega' \in \Omega(i,t,e)} \pi^*(\omega')}. \tag{24}$$

$\Omega(i,t,e)$ is the set of paths going through the state $(i,t,e)$ and $\Omega(i,t,e;\tilde{\omega})$ is the set of paths starting from the state $(i,t,e)$ and having the same route as path $\tilde{\omega}$. The sets are defined by

$$
\begin{aligned}
\Omega(i,t,e) &\equiv \{\omega \in \Omega \mid \omega(t) = i,\ e(\omega,t) = e\}, \\
\Omega(i,t,e;\tilde{\omega}) &\equiv \{\omega \in \Omega(i,t,e) \mid \omega(t') = \tilde{\omega}(t'),\ t' = t+1, \cdots, T\},
\end{aligned}
$$

where $e(\omega,t)$ is the amount of residual energy at time $t$ on the path $\omega$ and is defined by $e(\omega,t) \equiv e_0 - \sum_{t'=1}^{t-1} \mu(\omega(t'),\omega(t'+1))$.

Let us prove that the above claims are correct.

(i) A rational searcher's strategy: If the searcher recognizes no detection and rationally estimates the target path selection by Equation (22) at time $t$, the searcher had better make his strategy by solving the following problem:

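$$
\begin{aligned}
(Q_S) \quad v_t^{h*} = \min_{\varphi^{t}_{T}}\ & \sum_{\omega \in \Omega} \pi'(\omega) \exp\left( -\sum_{t'=t}^{T} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right) \\
\text{s.t.}\ & \sum_{i \in \mathbf{K}} \varphi_h(i,t') \le \Phi(t'), \quad t' = t, \cdots, T, \\
& \varphi_h(i,t') \ge 0, \quad i \in \mathbf{K},\ t' = t, \cdots, T,
\end{aligned}
$$

where $\pi'$ is fixed and $\varphi^{t}_{T} \equiv \{\varphi_h(i,t'),\ i \in \mathbf{K},\ t' = t, \cdots, T\}$. If the game continues up to time $t$ with no detection of the target, the searcher would take the strategy $\varphi^{t*}_{T}$ derived from the problem $(Q_S)$ and evaluate the minimum non-detection probability $v_t^{h*}$ during $[t,T]$. At the initial time $t = 1$, the searcher has to consider the non-detection probability during the early period $[\tau, t-1]$,

$$P^h_{t-1} \equiv \sum_{\omega' \in \Omega} \pi^*(\omega') \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega'(t')} \varphi_h(\omega'(t'),t') \right),$$

and $v_t^{h*}$ during the late period $[t,T]$, and make a plan $\varphi^{\tau}_{t-1} \equiv \{\varphi_h(i,t'),\ i \in \mathbf{K},\ t' = \tau, \cdots, t-1\}$ during the period $[\tau, t-1]$ to minimize the total non-detection probability $P^h_{t-1} v_t^{h*}$. As a result, we have the following problem for an optimal strategy $\varphi^{\tau}_{t-1}$ from Equation (22):

$$
\begin{aligned}
& \min_{\varphi^{\tau}_{t-1}} P^h_{t-1} \min_{\varphi^{t}_{T}} \sum_{\omega \in \Omega} \frac{\pi^*(\omega) \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right)}{P^h_{t-1}} \exp\left( -\sum_{t'=t}^{T} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right) \\
&\quad = \min_{\varphi_h} \sum_{\omega \in \Omega} \pi^*(\omega) \exp\left( -\sum_{t'=\tau}^{T} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right).
\end{aligned}
$$

Since minimizing the non-detection probability is the same as maximizing the detection probability, this problem is equal to the problem $(P_S^h)$ formulated for the $h$-type searcher's optimal strategy.

(ii) A rational target strategy: Consider a target on a path $\omega$ with a current state $(i,t,e)$. Based on the estimation (23), the target has to make a path selection plan $\tilde{\pi}$ of paths starting from the current state by solving the following problem: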

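$$
\lambda^*_{(i,t,e)}(\omega) = \max_{\tilde{\pi}} \sum_{\tilde{\omega} \in \Omega(i,t,e)} \tilde{\pi}(\tilde{\omega}) \sum_{h \in H} \frac{f(h) \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right)}{S(i,t,e;\omega)} \exp\left( -\sum_{t'=t}^{T} \alpha^h_{\tilde{\omega}(t')} \varphi_h(\tilde{\omega}(t'),t') \right),
$$

where $S(i,t,e;\omega) \equiv \sum_{h' \in H} f(h') \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^{h'}_{\omega(t')} \varphi_{h'}(\omega(t'),t') \right)$. If the target takes a path $\omega$ at time $t = 1$, he reaches the state $(i,t,e)$ undetected with probability $S(i,t,e;\omega)$. Therefore, the target would choose a path during $[1, t-1]$ rationally from the following optimization:

$$
\begin{aligned}
& \max_{\pi} \sum_{(i,e)} \sum_{\omega \in \Omega(i,t,e)} \pi(\omega) S(i,t,e;\omega) \lambda^*_{(i,t,e)}(\omega) \\
&\quad = \max_{\pi, \tilde{\pi}} \sum_{(i,e)} \sum_{\omega \in \Omega(i,t,e)} \pi(\omega) \sum_{\tilde{\omega} \in \Omega(i,t,e)} \tilde{\pi}(\tilde{\omega}) \sum_{h \in H} f(h) \exp\left( -\sum_{t'=\tau}^{t-1} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') - \sum_{t'=t}^{T} \alpha^h_{\tilde{\omega}(t')} \varphi_h(\tilde{\omega}(t'),t') \right) \\
&\quad = \max_{\pi} \sum_{\omega \in \Omega} \pi(\omega) \sum_{h \in H} f(h) \exp\left( -\sum_{t'=\tau}^{T} \alpha^h_{\omega(t')} \varphi_h(\omega(t'),t') \right).
\end{aligned}
$$

In the last expression, each full path is constructed by connecting a path $\omega$ during $[1,t]$ with a continuation $\tilde{\omega}$ after time $t$, and the probability of taking the connected path is the product $\pi(\omega) \cdot \tilde{\pi}(\tilde{\omega})$. As a result, we have essentially the same objective function as Function (4). The target had better choose his full path with the probability $\pi^* = \pi(\omega) \cdot \tilde{\pi}(\tilde{\omega})$ at $t = 1$ and obey the rule given by Equation (24) for an optimal path selection after time $t$.

We have made sure that in a state $(i,t,e)$ in the course of the dynamic game, an equilibrium of the game starting from the state is consistent with the equilibrium of the one-shot game, $\varphi^*$ and $\pi^*$, at $t = 1$. By the Bayesian estimations (22) and (23), the searcher keeps his optimal strategy $\varphi_h^*$ at any time and the target revises his path selection plan by Equation (24).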
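The two Bayesian revisions of Equations (22) and (23) can be sketched as follows. The searcher reweights each path by the probability that a target on it would have escaped the past search, and the target reweights each searcher type by the probability that his own path would have escaped that type's search. The function names, the two-cell space and all numbers are hypothetical and serve only to illustrate the update rule.

```python
import math
from typing import Dict, Tuple

Path = Tuple[int, ...]                 # (omega(1), ..., omega(T))
Plan = Dict[Tuple[int, int], float]    # (cell i, time t') -> resource amount

def past_effort(plan: Plan, alpha: Dict[int, float], path: Path, tau: int, t: int) -> float:
    """Sum over t' = tau, ..., t-1 of alpha_{omega(t')} * phi(omega(t'), t')."""
    return sum(alpha[path[s - 1]] * plan.get((path[s - 1], s), 0.0)
               for s in range(tau, t))

def update_path_belief(pi_star, plan, alpha, tau, t):
    """Equation (22): searcher's posterior over target paths given no detection."""
    weights = {w: p * math.exp(-past_effort(plan, alpha, w, tau, t))
               for w, p in pi_star.items()}
    total = sum(weights.values())
    return {w: v / total for w, v in weights.items()}

def update_type_belief(f, plans, alphas, path, tau, t):
    """Equation (23): target's posterior over searcher types given no detection."""
    weights = {h: f[h] * math.exp(-past_effort(plans[h], alphas[h], path, tau, t))
               for h in f}
    total = sum(weights.values())
    return {h: v / total for h, v in weights.items()}

# Toy data (assumptions): search started at tau = 1, current time t = 3.
tau, t = 1, 3
pi_star = {(1, 1, 1): 0.5, (2, 2, 2): 0.5}
searched_cell_1 = {(1, 1): 1.0, (1, 2): 1.0}      # all past effort in cell 1
pi_post = update_path_belief(pi_star, searched_cell_1, {1: 0.8, 2: 0.8}, tau, t)

f = {1: 0.5, 2: 0.5}
alphas = {1: {1: 0.2, 2: 0.2}, 2: {1: 0.6, 2: 0.6}}
plans = {1: searched_cell_1, 2: searched_cell_1}
f_post = update_type_belief(f, plans, alphas, (1, 1, 1), tau, t)
```

In this toy case, surviving two searched periods in cell 1 makes the path through cell 2 more likely in the searcher's eyes, and makes the weaker sensor type more likely in the target's eyes.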

5. Numerical Examples

Here we take some numerical examples to investigate the effects of information about the searcher's type on search games. We consider a search space consisting of a discrete time space $\mathbf{T} = \{1, \cdots, 6\}$ and 19 hexagonal cells $\mathbf{K} = \{1, \cdots, 19\}$, as shown in Figure 1. Cells 9 and 11 are obstacles that block the passage of a target; the target cannot go through these cells.

The target is at Cell $S_0 = \{1\}$ at time $t = 1$ and intends to go to Cell 19 by $t = 6$. At each time step he can move to the cells adjacent to his current position and to the cells at two-cell distance. From Cell 1, for example, he can move to cells 1, 2, 3, 4, 5, 6, 8, 9 and 10 if energy allows him to do so. He consumes no energy to stay at the current cell, energy 1 to move to an adjacent cell and energy 4 to move to a two-cell-distance cell, which sets the parameter $\mu(i,j)$. He has initial energy $e_0 = 8$ at time $t = 1$. The movement constraints allow the target 294 options of paths from Cell 1 to Cell 19. $\Phi(t) = 1$ search resource is available to the searcher at each time $t = 2, \cdots, 6$, that is, $\tau = 2$. The detection effectiveness of the resource differs from cell to cell. In Cells 8, 13, 14 beneath obstacle cell 9 and in Cells 7, 12, 16 beneath the other obstacle cell 11, the effectiveness is lower. The searcher has two types of resources, the effectiveness of which is $\alpha_i = 0.8$ in any cell except the cells specified above. In cell $j \in \{8, 13, 14, 7, 12, 16\}$, the $h = 1$-type resource, say the classic resource, has $\alpha_j = 0.2$ but the $h = 2$-type resource, say the improved resource, has $\alpha_j = 0.6$. The target knows that the resource's type or the searcher's type is $h = 1$ or $h = 2$, but does not know which type of resource the searcher actually uses.
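The effectiveness parameters of this example can be written down as a small table. The sketch below builds $\alpha_i^h$ for both resource types from the values stated above and, as a worked use of Equation (1), evaluates the detection probability of a single look that concentrates the whole unit $\Phi(t) = 1$ on one cell. The variable and function names are hypothetical, and the cell adjacency of Figure 1 is not modeled.

```python
import math

CELLS = range(1, 20)                      # K = {1, ..., 19}
LOW_CELLS = {7, 8, 12, 13, 14, 16}        # cells beneath the obstacle cells 9 and 11
LOW_ALPHA = {1: 0.2, 2: 0.6}              # classic (h = 1) vs improved (h = 2) resource
BASE_ALPHA = 0.8                          # effectiveness in every other cell

# alpha[h][i]: detection effectiveness of the h-type resource in cell i.
alpha = {
    h: {i: (LOW_ALPHA[h] if i in LOW_CELLS else BASE_ALPHA) for i in CELLS}
    for h in (1, 2)
}

def one_look(h: int, cell: int, amount: float = 1.0) -> float:
    """Equation (1) for a single period with `amount` resources in `cell`."""
    return 1.0 - math.exp(-alpha[h][cell] * amount)

classic_low = one_look(1, 8)     # about 0.181
improved_low = one_look(2, 8)    # about 0.451
open_cell = one_look(1, 1)       # about 0.551, identical for both types
```

The two types differ only in the six low-effectiveness cells, so the target's uncertainty about the type matters only for paths that pass through those cells.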

Figure 1. A search space

5.1. Sensitivity analysis of detection probability to f(1)

In Figure 2, the curve with circles indicates the expected detection probabilities, i.e. the optimized values of Problem (PS) or (PT), for f(1) varying from 0 through 1. For comparison, the curve with squares shows the case of no private information, in which the target knows the searcher's type. To calculate the expected detection probability in that case, we weight the two game values for the h = 1-type resource and the h = 2-type resource with f(1) and f(2) = 1 − f(1), respectively. The two curves coincide at f(1) = 0 and f(1) = 1 because at both points the target knows the resource type. The difference between the curves is the value of the searcher's private information about the resource's type. As the searcher has the advantage of knowing his own type, the curve with private information always lies above the curve with no private information. The value of information is largest around f(1) = 0.5, where the target's uncertainty about the searcher's type is largest.

Table 1 gives the detailed results of the numerical example above. From the left, the columns list f(1), the value of the game with private information, the value of the game with no private information, the value of information explained above, and $R_h(\varphi_h^*, \pi^*)$ for h = 1, 2, calculated by Equation (3). The number in the last column is $R(\varphi^*, \pi^*) - R_1(\varphi_1^*, \pi^*)$ in the case of f(1) < 0.5 and $R(\varphi^*, \pi^*) - R_2(\varphi_2^*, \pi^*)$ in the other case of f(1) > 0.5. The number indicates how much the target's wrong estimation of the resource's type affects the detection probability. The number can be rewritten as

\[
R(\varphi^*, \pi^*) - R_h(\varphi_h^*, \pi^*) =
\begin{cases}
(1 - f(1))\,\bigl(R_2(\varphi_2^*, \pi^*) - R_1(\varphi_1^*, \pi^*)\bigr), & \text{if } h = 1, \\
(1 - f(2))\,\bigl(R_1(\varphi_1^*, \pi^*) - R_2(\varphi_2^*, \pi^*)\bigr), & \text{if } h = 2,
\end{cases}
\]

and is proportional to $1 - f(h)$ and to $R_1(\varphi_1^*, \pi^*) - R_2(\varphi_2^*, \pi^*)$. That is why its magnitude tends to be larger the more wrongly the target estimates the resource's type. As seen from Table 1, the wrong estimation affects the detection probability more in the case of f(1) > 0.5 than in the case of f(1) < 0.5. Let us analyze the difference between the two cases by taking f(1) = 0.1 and f(1) = 0.9 as their representatives.


Figure 2. Game values for f(1)

Table 1: Sensitivity analysis of the game to f(1)

f(1)   R(ϕ*,π*)   past model   value of info.   R_1(ϕ_1*,π*)   R_2(ϕ_2*,π*)   R − R_h
0      0.2390     0.2390       0.0000           0.1379         0.2390          0.1011
0.1    0.2380     0.2351       0.0030           0.2280         0.2391          0.0102
0.2    0.2368     0.2311       0.0057           0.2256         0.2396          0.0112
0.3    0.2352     0.2271       0.0081           0.2229         0.2405          0.0123
0.4    0.2331     0.2231       0.0100           0.2184         0.2430          0.0147
0.5    0.2303     0.2192       0.0111           0.2133         0.2472          N/A
0.6    0.2261     0.2152       0.0110           0.2072         0.2546         −0.0285
0.7    0.2210     0.2112       0.0098           0.2042         0.2601         −0.0391
0.8    0.2149     0.2072       0.0077           0.2018         0.2673         −0.0524
0.9    0.2077     0.2032       0.0044           0.1997         0.2797         −0.0720
1      0.1993     0.1993       0.0000           0.1993         0.1810          0.0183
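The columns of Table 1 are linked by simple arithmetic, which can be cross-checked numerically. The sketch below is only a consistency check under stated assumptions: it assumes that the game value is the prior-weighted sum R = f(1)·R_1 + f(2)·R_2 (as implied by the rewritten form of R − R_h), it takes the signs of the last column as that formula implies them, and it allows a tolerance because the table entries are rounded to four digits. The function and variable names are hypothetical.

```python
# Rows of Table 1: f(1), R, past model, value of info., R_1, R_2, R - R_h
TABLE_1 = [
    (0.1, 0.2380, 0.2351, 0.0030, 0.2280, 0.2391, 0.0102),
    (0.2, 0.2368, 0.2311, 0.0057, 0.2256, 0.2396, 0.0112),
    (0.3, 0.2352, 0.2271, 0.0081, 0.2229, 0.2405, 0.0123),
    (0.4, 0.2331, 0.2231, 0.0100, 0.2184, 0.2430, 0.0147),
    (0.6, 0.2261, 0.2152, 0.0110, 0.2072, 0.2546, -0.0285),
    (0.7, 0.2210, 0.2112, 0.0098, 0.2042, 0.2601, -0.0391),
    (0.8, 0.2149, 0.2072, 0.0077, 0.2018, 0.2673, -0.0524),
    (0.9, 0.2077, 0.2032, 0.0044, 0.1997, 0.2797, -0.0720),
]

def wrong_estimation_effect(f1, r1, r2):
    """R - R_h with h = 1 for f(1) < 0.5 and h = 2 for f(1) > 0.5."""
    if f1 < 0.5:
        return (1.0 - f1) * (r2 - r1)   # (1 - f(1)) (R_2 - R_1)
    return f1 * (r1 - r2)               # (1 - f(2)) (R_1 - R_2), with f(2) = 1 - f(1)

TOL = 5e-4  # table entries are rounded, so exact equality cannot be expected

for f1, r, past, info, r1, r2, diff in TABLE_1:
    ok_info = abs((r - past) - info) < TOL
    ok_mix = abs(f1 * r1 + (1.0 - f1) * r2 - r) < TOL
    ok_diff = abs(wrong_estimation_effect(f1, r1, r2) - diff) < TOL
    print(f1, ok_info, ok_mix, ok_diff)
```

The rows for f(1) = 0, 0.5 and 1 are left out because the text treats them as special cases.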

5.2. Effects of wrong estimation about the resource's type

Let us first check the optimal movement strategy of the target. The h = 1-type resources are much less effective at detecting a target moving on paths along the right and left sides of the search space, but are comparatively effective at detecting a target moving in the central area between the two obstacle cells. The h = 2-type resource improves on the h = 1-type one in its detection effectiveness in the side areas, to the left of Cell 9 and to the right of Cell 11. A target who estimates the resource to be of the h = 1 type would therefore be more likely to take paths running in the side areas than a target who estimates it to be of the h = 2 type. We can confirm this tendency in practice: the probabilities of the target taking paths through the side areas and paths through the central area are 0.395 and 0.605, respectively, for f(1) = 0.9, and 0.344 and 0.656 for f(1) = 0.1. We must note, though, that 91 percent of all 294 paths go through the central area. Additionally, the target should use a mixed strategy over many paths running through various areas to make it difficult for the searcher to estimate the target path well. Therefore, the probabilities of taking side paths are not extremely biased even for f(1) = 0.9.

Each type of searcher makes a rational distribution plan of resources, $\varphi_h^*$, considering the features of the target strategy explained above. In the case of f(1) = 0.9, in which the target more strongly believes that the searcher possesses the h = 1-type resource, the h = 2-type searcher puts a lot of the improved resource in the side areas to increase the detection probability. In the case of f(1) = 0.1, in which the target more strongly believes that the searcher uses the h = 2-type resource, the target does not bias the path selection in favor of side paths but takes paths through the central area and the side areas more evenly. In this situation, the h = 1-type searcher can keep the detection probability from becoming much worse even though he has to use the less effective h = 1-type resources in the side areas, as seen from the cases of f(1) = 0 to 0.5 in Table 1.

As discussed so far, the effect of the wrong estimation about the resource's type grows as f(1) increases. We regard the cases of f(1) = 0 and f(1) = 1 as exceptions, however, because in both cases the target is certain of the resource's type. From the analyses above, the searcher can draw a lesson about the usage of new and old sensors: developing and installing new technological sensors is more effective in search operations when the searcher makes his opponents believe that he possesses no advanced sensor. Conversely, against a target who believes that the searcher possesses advanced sensors, even the usage of old-fashioned sensors does not make the detection probability much worse. This lesson may be valid only for the cases set in this section, since such results are usually case-dependent. We merely demonstrate the possibility of these analyses.

6. Conclusion

In this paper, we constructed a general model of an incomplete-information SAG in which the searcher has private information about his search resource, such as detection sensors, and proposed a method to derive its equilibrium in order to evaluate the value of the information. Through some numerical examples, we concretely evaluated the value of the information about high-tech and old-fashioned sensors and analyzed the effects of the target's guess about the sensor's detection effectiveness on the detection probability of the target in a search operation. Based on the analysis, we discussed the value of research and development of sensing technology. In this paper we handled incomplete information about the detection effectiveness of the search resource, but our procedure could be extended to other properties or characteristics of the resource that are uncertain to the target.

Contributions to Game Theory and Management, X, 143–161

Application of Game Theory in the Analysis of Economic and Political Interaction at the International Level

Pavel V. Konyukhovskiy¹ and Victoria V. Holodkova²

¹ St.Petersburg State University, Faculty of Economic, Department of Economic Cybernetics, Chaykovskogo ul. 62, St.Petersburg, 191123, Russia
E-mail: [email protected]
² St.Petersburg State University, Faculty of Economic, Department of Economic Cybernetics, Chaykovskogo ul. 62, St.Petersburg, 191123, Russia
E-mail: Holodkova [email protected]

Abstract The main objective of the work is to consider possible approaches to applying the theory of cooperative games to the simulation of the relationship between the major centers of economic and political influence in the modern world. The authors discuss three main intervals in the repositioning of political forces in the world: until 2008, from 2008 to 2014, and after 2014. The models presented build on political cooperation and are intended to reflect the economic component of this interaction on the world market. The results are subject to more detailed content analysis. Special attention is paid to the construction of characteristic functions for the cooperative games underlying the model of interaction of the centers of political power. In particular, we consider a technique for constructing the characteristic function that uses baskets built on the currencies of the countries forming a coalition.

Keywords: cooperative games, cooperative models of interaction of the international centers of power, stochastic cooperative games, imputations, Shapley value

1. Introduction

In recent years we have seen impressive shifts not only in the economic sphere, but also in those spheres which have traditionally been characterized by the polysemantic term "politics". In just two or three years, the system of international political and economic relations has been significantly transformed. A simple comparison of the problems that were most pressing in 2009-2010 with today's most pressing problems is symptomatic. Just a few years ago, the most discussed question was: "Will there be a second wave of the crisis?" The vast majority of experts answered in the affirmative. Opinions differed only on the timing of the crisis, which greatly reduces the value of these opinions. It is extremely difficult to verify the quality of forecasts of the evolution of Russia and the global economic system that were based on threats, challenges and a wave of crisis. It is hard to overestimate the complexity of the problems faced by the Russian economy in recent years. However, we have to admit that the main reason for today's problems is economic sanctions, and not a second wave of crisis.

2. Introductory provisions

There is a common set of issues about the relationship between the economic and political spheres.

On the one hand, it is generally accepted that economic processes are the foundation of political phenomena. At the same time, there is no doubt that political processes affect the macroeconomic systems of individual countries and regions, and the global economic system as a whole. There are many works devoted to explaining the objective economic reasons for political conflicts, crises and wars. There are many papers considering the economic consequences of political decisions. There is also a lot of research devoted to the economic consequences of economic policy decisions. However, there are significantly fewer studies in which these issues are considered along the logical chain economics-politics-economics. See, for example, (Dergachov, 2011; Dergachov, 2005). Research aimed at finding methods and tools to describe and analyze the causal relationship between economic processes and political events is highly relevant and interesting.

Let us illustrate this issue with a concrete example. The current economic and political situation in Russia is characterized by a complex set of problems, such as the incorporation of Crimea, a cardinal shift in relations with Ukraine, a course toward economic cooperation with China and the BRICS countries, the building of new relationships with the countries of Western Europe, and the freezing of cooperation with the United States.

There are two possible approaches to explaining the sequence of these events. One of them is based on the priority of political factors, the other on the priority of economic ones. In the first case, the key cause is taken to be the decision of the political leadership of the country to pursue a more independent course. This has led to conflicts with the leading powers of the modern world, which are trying to restore the status quo by means of economic sanctions. At the same time, the sanctions have caused harm not only to Russia but also to Western countries. However, guided by basic values, they are not ready to lift them as long as Russia does not back down. It is difficult to disagree with the weight of this argument. As historical experience shows, concessions by individual states on fundamental ideological issues can lead to their destruction.

In the second case, the explanation of the fundamental cause of current events is based on the issue of the balance of power in the world. In the first decade of the 21st century, very serious socio-economic and socio-political transformations took place in Russia. No less important transformations took place in China, India and other countries, which undoubtedly changed their relative weight in the international arena. This led to the emergence of new claims on their part. This version looks more realistic than the ideological, value-oriented version.

Both approaches have a common defect: they are speculative. From a scientific standpoint, conclusions based on model research are preferable. This paper focuses on a mathematical model that allows one to represent and analyze the economic and political processes of the modern world in terms of the interaction of coalitions of world powers, international associations and countries. The key place is occupied by the cooperative effects in such interaction and by the methods of cooperative game theory.

3. The basic game-theoretic model of cooperative interaction centers of political influence (Base-3)

Next we look at several versions of cooperative game models for the assessment of political influence. Of course, we are talking about extremely simplified conceptual models. We restrict our discussion to a game with three participants (players): Russia (1st player), China (2nd player) and the West (3rd player). We agree to call this game Base-3 for short, adding the postfixes a, b, c, etc.

Recall that a classical cooperative game with transferable utility is given by a pair (I, v), where
• $I = \{1, \dots, n\}$ is the set of players;
• $v$ is the characteristic function ($2^I \to R$), which assigns to each subset of players $S \subset I$ (a coalition of players) the value of the utility of the coalition in the event of its formation, $v(S)$.

The definition of a "cooperative game solution" is based on the concepts of "imputation" and "pre-imputation". An imputation of the cooperative game (I, v) is defined as a vector $x = (x_1, x_2, \dots, x_n)$ satisfying the conditions of
• individual rationality
\[ x_i \ge v(\{i\}), \quad \forall i \in \{1, \dots, n\}; \tag{1} \]
• group rationality
\[ \sum_{i=1}^{n} x_i = v(I). \tag{2} \]

In other words, an imputation is a distribution of the full utility of the coalition of all players (the so-called grand or full coalition). An imputation gives each player a utility of not less than he could get individually (without entering into any coalition). A pre-imputation is required to satisfy only the group rationality condition (2).

The methods of calculating influence indices proposed in (Penrose, 1946; Shapley, 1954; Banzhaf, 1965; Johnston, 1978; Deegan, 1978; Holler, 1983) and other works are widely known. There are many studies devoted to the application of the theory of cooperative games to the analysis of the interaction of political forces within elected bodies (parliaments, legislatures, etc.). In particular, it is appropriate to recall such works as (Aleskerov et al., 2008), which analyzes the distribution of power between the factions of the State Duma of the Russian Empire, and (Sokolov, 2008), which is devoted to the calculation of influence indices and considers the examples of the Council of Ministers of the European Union and modern Russia's State Duma.

At the baseline stage of the study, conventional assessments of the impact that a given party or coalition of parties has on the situation in the world can be used as the values of the characteristic function of the Base-3 games. The level of influence is a variable taking values in the interval [0, 1] (0 - lack of influence, 1 - maximum influence). The utility of countries and coalitions is based on the criterion of influence. We thereby develop the already mentioned tradition of political science applications of cooperative games associated with assessing the influence of parliamentary parties and coalitions on the basis of indices: Shapley-Shubik (Shapley, Shubik, 1954), Banzhaf (Banzhaf, 1965), Penrose (Penrose, 1946), and others. In the simplest case, a quantitative assessment of the influence of a player (coalition) may be based, for example, on the share, in the pool of all international problems under consideration, of those problems which cannot be solved without the agreement of this player (coalition). This approach can be justified by the fact that in the sphere of interstate cooperation veto players are usually identified quite simply.
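The two rationality conditions (1) and (2) are easy to check mechanically. The following is a minimal sketch; the function names are hypothetical, and the numbers are illustrative individual values of the kind used for the Base-3 games (individual utilities 0, 0 and 0.8 and a grand-coalition value of 1).

```python
def is_preimputation(x, grand_value, tol=1e-9):
    """Group rationality (2): the components of x sum to v(I)."""
    return abs(sum(x) - grand_value) <= tol

def is_imputation(x, singles, grand_value, tol=1e-9):
    """Conditions (1) and (2): x_i >= v({i}) for every player and sum(x) = v(I)."""
    individually_rational = all(xi >= vi - tol for xi, vi in zip(x, singles))
    return individually_rational and is_preimputation(x, grand_value, tol)

# Illustrative values: v({1}) = 0, v({2}) = 0, v({3}) = 0.8, v(I) = 1.
singles = (0.0, 0.0, 0.8)
print(is_imputation((0.0, 0.0, 1.0), singles, 1.0))   # individually and group rational
print(is_imputation((0.3, 0.3, 0.4), singles, 1.0))   # player 3 gets less than v({3})
print(is_preimputation((0.3, 0.3, 0.4), 1.0))         # still sums to v(I)
```

The second vector shows the difference between the two notions: it distributes the full utility of the grand coalition but violates individual rationality for player 3.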

Table 1. The characteristic function of the game Base-3-a for the period 1992 - 2008.

i    v(i)    S          v(S)
1    0       {1,2}      0
2    0       {1,3}      1
3    0.8     {2,3}      1
             {1,2,3}    1

Source: conditional data

Table 1 shows the characteristic function corresponding to the balance of power in the world for the period of about 1992 - 2008 (game Base-3-a). The third player (West) predominates in this period; players 1 and 2 are not able to create a counterbalance to him. Fig. 1 gives a geometric interpretation of the set of non-negative pre-imputations for the game Base-3-a. In this case pre-imputations are vectors lying on the plane

\[ x_1 + x_2 + x_3 = v(I) \tag{3} \]

in three-dimensional space. For geometric illustrations one traditionally uses a flat picture representing the plane (3). One of the main solution concepts for cooperative games is the Core, or the set of non-dominated imputations

\[ C(v) = \Bigl\{ x \in R^n \;\Bigm|\; (\forall S \subset I)\ \sum_{i \in S} x_i \ge v(S),\ \sum_{i \in I} x_i = v(I) \Bigr\}. \tag{4} \]

From (4) it follows that any imputation belonging to the Core gives every coalition no less than it could obtain without cooperating with the other players. As can be seen in Fig. 1, the Core of the game Base-3-a (period 1992 - 2008) is a singleton

\[ C(v) = \{(0, 0, 1)\}, \tag{5} \]

which marks the maximum influence of the third player (West). The method of constructing the characteristic function is debatable. At the same time, the logic on which it is based looks quite plausible. In particular, the highest level of influence reflects the dominance of the West at the stage that followed the collapse of the Soviet Union. The complete dominance of the West is due to the impossibility of creating a winning coalition without him:

\[ v(\{1,3\}) = v(\{2,3\}) = 1, \]

whereas $v(\{1,2\}) = 0$. This occurs despite the fact that the individual payoff of the West, $v(\{3\}) = 0.8$, is less than 1.

Fig. 1. Geometric illustration of the game Base-3-a, 1992-2008.

Another possible distribution of influence between the players in the game Base-3-a is based on the Shapley vector (the Shapley value)

\[ Sh^{(a)} = \Bigl( \frac{1}{30}, \frac{1}{30}, \frac{14}{15} \Bigr). \]

Recall that the components of the Shapley vector can be expressed as

\[ Sh_i = \sum_{S:\, i \notin S} \frac{s!\,(n - s - 1)!}{n!} \bigl( v(S \cup \{i\}) - v(S) \bigr), \tag{6} \]

where n is the total number of players and s is the number of players in the coalition S. The Shapley vector gives each player $i \in I$ a weighted sum of the increments of coalition utility caused by adding player i to these coalitions. The Shapley value of the game Base-3-a does not belong to the Core. This means that there are potential challenges from the coalitions $\{1,3\}$ and $\{2,3\}$, each of which receives $29/30 < 1$ under $Sh^{(a)}$. The Shapley values can be interpreted as the concessions that player 3 might make to players 1 and 2 when he is interested in reaching a consensus (the creation of a full coalition).
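Formula (6) and the Core definition (4) can be evaluated directly for the Base-3-a values of Table 1. The following is a minimal sketch with exact rational arithmetic; the function names and the dictionary layout of the characteristic function are hypothetical choices made for illustration.

```python
from fractions import Fraction as F
from itertools import combinations
from math import factorial

def shapley(v, n):
    """Shapley vector by formula (6); v maps frozenset -> value, players are 1..n."""
    players = range(1, n + 1)
    result = []
    for i in players:
        others = [j for j in players if j != i]
        total = F(0)
        for s in range(len(others) + 1):
            weight = F(factorial(s) * factorial(n - s - 1), factorial(n))
            for coalition in combinations(others, s):
                S = frozenset(coalition)
                total += weight * (v[S | {i}] - v[S])
        result.append(total)
    return tuple(result)

def in_core(x, v, n):
    """Definition (4): x sums to v(I) and every coalition S gets at least v(S)."""
    players = range(1, n + 1)
    if sum(x) != v[frozenset(players)]:
        return False
    for s in range(1, n):
        for S in combinations(players, s):
            if sum(x[i - 1] for i in S) < v[frozenset(S)]:
                return False
    return True

# Characteristic function of Base-3-a (Table 1), with 0.8 written as 4/5.
BASE_3A = {
    frozenset(): F(0),
    frozenset({1}): F(0), frozenset({2}): F(0), frozenset({3}): F(4, 5),
    frozenset({1, 2}): F(0), frozenset({1, 3}): F(1), frozenset({2, 3}): F(1),
    frozenset({1, 2, 3}): F(1),
}

sh_a = shapley(BASE_3A, 3)
print(sh_a)                                        # components 1/30, 1/30, 14/15
print(in_core((F(0), F(0), F(1)), BASE_3A, 3))     # the Core point (0, 0, 1)
print(in_core(sh_a, BASE_3A, 3))                   # the Shapley vector is outside the Core
```

Exact fractions are used so that the comparison with $v(S)$ is not affected by floating-point rounding.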

One of the advantages of the game Base-3-a is that it allows one to clearly describe the changes in the system of interstate relations observed after the economic crisis of 2008. Singling out 2008 as the beginning of a new stage is highly conditional. As an argument in favor of this choice, we can take the involvement of Russia in the events in South Ossetia (August 2008), which differs visibly from Russian foreign policy during the events in Kosovo in 1999, which was limited to diplomatic demarches. The changes that took place in relations between the world centers of power in 2008 - 2014 are shown in Table 2 (game Base-3-b).

Table 2. The characteristic function of the game Base-3-b (Period 2008 - 2014).

i    v(i)    S          v(S)
1    0       {1,2}      0.2
2    0       {1,3}      1
3    0.8     {2,3}      1
             {1,2,3}    1

Source: conditional data

Under the new conditions, the capability of the coalition of players 1 and 2 (Russia, China) is set as

\[ v(\{1,2\}) = 0.2. \]

These changes mean that Russia and China can achieve predominant influence in those 20 percent of cases that are beyond the control of the player West when he does not enter into any coalition:

\[ v(\{1,2\}) = 0.2 = 1 - v(\{3\}). \]

A geometric interpretation of the game Base-3-b is shown in Fig. 2. The game Base-3-b has an empty Core. This means that now none of the players can reasonably claim undivided influence, unlike in the previous case (see Fig. 1). Within this model, this fact can be interpreted as a mathematical argument for the thesis about the end of the unipolar world.

When the Core is empty, alternative solution concepts can be considered, for example the Nucleolus. This concept is based on the solution that lexicographically minimizes the largest excesses of the coalitions $S \neq \emptyset, I$. A strict definition of the Nucleolus can be found in Schmeidler (1969) or in other specialized books on cooperative game theory, for example Pechersky and Yanovskaya (2004). Recall also that the excess of a coalition S for an imputation x is expressed as

\[ e(S, x) = v(S) - x(S), \quad \text{where } x(S) = \sum_{i \in S} x_i. \]

The excess e(S, x) contrasts the coalition's own capability $v(S)$ with the payoff it receives under the imputation x. Thus, the smaller the excess, the more favorable the imputation is for the coalition, and vice versa. In our example, the nucleolus takes the value

\[ N^{(b)} = \Bigl( \frac{0.2}{3}, \frac{0.2}{3}, 0.8 + \frac{0.2}{3} \Bigr). \tag{7} \]
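The value (7) can be reproduced numerically by comparing sorted excess vectors. The sketch below is a brute-force illustration rather than a general nucleolus algorithm: it assumes three players, searches only a finite grid of non-negative pre-imputations with step 1/60 (an arbitrary choice that happens to contain the point in question), and uses hypothetical function names.

```python
from fractions import Fraction as F
from itertools import combinations

def excesses(v, x):
    """Excesses e(S, x) = v(S) - x(S) of all proper coalitions, largest first."""
    n = len(x)
    values = []
    for s in range(1, n):
        for S in combinations(range(1, n + 1), s):
            values.append(v[frozenset(S)] - sum(x[i - 1] for i in S))
    return sorted(values, reverse=True)

def grid_nucleolus(v, steps=60):
    """Lexicographic minimum of the sorted excess vector over a grid (3 players)."""
    total = v[frozenset({1, 2, 3})]
    best = None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            x = (total * F(a, steps), total * F(b, steps),
                 total * F(steps - a - b, steps))
            key = excesses(v, x)
            if best is None or key < best[0]:   # list comparison is lexicographic
                best = (key, x)
    return best[1]

# Characteristic function of Base-3-b (Table 2).
BASE_3B = {
    frozenset({1}): F(0), frozenset({2}): F(0), frozenset({3}): F(4, 5),
    frozenset({1, 2}): F(1, 5), frozenset({1, 3}): F(1), frozenset({2, 3}): F(1),
    frozenset({1, 2, 3}): F(1),
}

n_b = grid_nucleolus(BASE_3B)
print(n_b)                          # components 1/15, 1/15, 13/15
print(excesses(BASE_3B, n_b)[0])    # largest excess at that point, 1/15
```

The largest excess stays positive at the minimizing grid point, so no allocation on the grid satisfies all coalitions at once, which is consistent with the empty Core of Base-3-b.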

Fig. 2. Geometric illustration of the game Base-3-b, 2008 - 2014.

It should be emphasized that the Nucleolus is not an uncontested solution, because the Core is empty. For example, player 3 can negotiate separately with player 1 on the division

\[ (\delta, 0, 1 - \delta), \quad \text{where } \frac{0.2}{3} < \delta < 0.2, \]

excluding player 2. Similarly, he can exclude player 1 by agreeing with player 2 on

\[ (0, \delta, 1 - \delta), \quad \text{where } \frac{0.2}{3} < \delta < 0.2. \]

At the same time, players 1 and 2 also have the opportunity to challenge the nucleolus by settling on any division that provides them with a total of not less than $v(\{1,2\}) = 0.2$, thus limiting the third player's claims. The Shapley value $Sh^{(b)}$ in this game coincides with the nucleolus.

The next qualitative changes are associated with a further increase in the capacity of the coalition $\{1,2\}$. We may waive the requirement that the influence values of complementary coalitions sum to one. In this case we obtain another version of the game, Base-3-c, see Table 3.

Table 3. The characteristic function of the game Base-3-c (after 2014).

i    v(i)    S          v(S)
1    0       {1,2}      0.4
2    0       {1,3}      1
3    0.8     {2,3}      1
             {1,2,3}    1

Source: conditional data

Fig. 3. Geometric illustration of the game Base-3-c, after 2014.

A geometric interpretation of the game Base-3-c is shown in Fig. 3. Like the preceding examples, it is non-convex, i.e. the condition

\[ (\forall S, T)\quad v(S \cup T) + v(S \cap T) \ge v(S) + v(T) \tag{8} \]

is not satisfied. Moreover, unlike them, it is not superadditive. Indeed, if player 3 joins the coalition $\{1,2\}$, we see that

\[ v(\{1,2\}) + v(\{3\}) = 0.4 + 0.8 = 1.2 > 1 = v(\{1,2,3\}). \]

As can be seen, in this situation we get an additional factor: the coalition rationality conditions

\[ x(\{1,2\}) \ge v(\{1,2\}) = 0.4, \qquad x(\{3\}) \ge v(\{3\}) = 0.8 \]

are incompatible.
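Convexity (8) and superadditivity can both be verified by enumerating pairs of coalitions. The following is a minimal sketch with hypothetical function names; the helper `base3` simply packs the table values into a dictionary, with $v(\emptyset) = 0$ as the usual convention.

```python
from itertools import combinations

def coalitions(n):
    """All subsets of the player set {1, ..., n}, including the empty one."""
    players = range(1, n + 1)
    return [frozenset(c) for s in range(n + 1) for c in combinations(players, s)]

def is_superadditive(v, n, tol=1e-12):
    """v(S u T) >= v(S) + v(T) for all disjoint S and T."""
    cs = coalitions(n)
    return all(v[S | T] >= v[S] + v[T] - tol
               for S in cs for T in cs if not (S & T))

def is_convex(v, n, tol=1e-12):
    """Condition (8): v(S u T) + v(S n T) >= v(S) + v(T) for all S and T."""
    cs = coalitions(n)
    return all(v[S | T] + v[S & T] >= v[S] + v[T] - tol
               for S in cs for T in cs)

def base3(v3, v12):
    """Three-player game of Tables 1-3: only v({3}) and v({1,2}) vary."""
    return {
        frozenset(): 0.0,
        frozenset({1}): 0.0, frozenset({2}): 0.0, frozenset({3}): v3,
        frozenset({1, 2}): v12, frozenset({1, 3}): 1.0, frozenset({2, 3}): 1.0,
        frozenset({1, 2, 3}): 1.0,
    }

for name, game in [("Base-3-a", base3(0.8, 0.0)),
                   ("Base-3-b", base3(0.8, 0.2)),
                   ("Base-3-c", base3(0.8, 0.4))]:
    print(name, is_superadditive(game, 3), is_convex(game, 3))
```

For Base-3-c the violating pair is $S = \{1,2\}$, $T = \{3\}$, exactly the one used in the text.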

The nucleolus of the game is determined by the vector

\[ N^{(c)} = \Bigl( \frac{0.4}{3}, \frac{0.4}{3}, 0.6 + \frac{0.4}{3} \Bigr) = \Bigl( \frac{0.4}{3}, \frac{0.4}{3}, 0.8 - \frac{0.4}{6} \Bigr). \tag{9} \]

The principal difference between solution (9) and solution (7) is that (9) requires concessions from player 3 (West) in order to form a full coalition. Indeed, the share of influence of the West in the nucleolus of Base-3-c is

\[ N^{(c)}_3 = 0.8 - \frac{0.4}{6} < 0.8 = v(\{3\}), \]

which raises serious doubts about the desirability and possibility of a full coalition of all players (achieving a global consensus). We have to admit that this is not a positive conclusion at the level of the model. But this conclusion does not contradict the reality that we have observed over the last year. Few experts and analysts would dispute the thesis that stability and security in the world decreased in 2013 - 2014. Another feature of this game is the mismatch between the nucleolus and the Shapley value

\[ Sh^{(c)} = (0.1, 0.1, 0.8). \]
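As a check, formula (6) can be evaluated by hand for the Base-3-c values of Table 3. For n = 3 the weights are 1/3 for coalitions with s = 0 or s = 2 and 1/6 for coalitions with s = 1; $v(\emptyset) = 0$ is the usual convention.

```latex
\begin{align*}
Sh^{(c)}_1 &= \tfrac13\bigl(v(\{1\}) - v(\emptyset)\bigr)
            + \tfrac16\bigl(v(\{1,2\}) - v(\{2\})\bigr)
            + \tfrac16\bigl(v(\{1,3\}) - v(\{3\})\bigr)
            + \tfrac13\bigl(v(\{1,2,3\}) - v(\{2,3\})\bigr) \\
           &= \tfrac13 \cdot 0 + \tfrac16 \cdot 0.4 + \tfrac16 \cdot 0.2 + \tfrac13 \cdot 0 = 0.1, \\
Sh^{(c)}_2 &= Sh^{(c)}_1 = 0.1 \quad \text{(players 1 and 2 are symmetric)}, \\
Sh^{(c)}_3 &= \tfrac13 \cdot 0.8 + \tfrac16 \cdot 1 + \tfrac16 \cdot 1 + \tfrac13 \cdot (1 - 0.4) = 0.8 .
\end{align*}
```

The three components sum to $v(I) = 1$, and the third one equals $v(\{3\}) = 0.8$.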

These values $Sh^{(c)}$ are a direct consequence of the non-superadditivity of the game. This means that the third player is indifferent between joining and not joining the coalition $\{1,2\}$. In other words, this may mean the appearance of objective tendencies toward the separation of the third player from the other participants of the game.

The thesis about the importance of the transition from a unipolar to a multipolar world is widely known. Having appeared at the beginning of the third millennium, it has taken a leading position in contemporary political science discourse. Let us try to analyze a possible version of the relationships between the players of our model in a multipolar world. Table 4 (Base-3-d) describes a situation in which none of the players (Russia, China, the West) can gain influence in the world alone. At the same time, any coalition of two players receives absolute influence (it can impose its rules on the third party). Of course, the full coalition also has absolute influence.

Table 4. The characteristic function of the game Base-3-d ("Multi-polar world")

i    v(i)    S          v(S)
1    0       {1,2}      1
2    0       {1,3}      1
3    0       {2,3}      1
             {1,2,3}    1

Source: conditional data

A geometric illustration of the game Base-3-d is shown in Fig. 4. As can be seen, the game Base-3-d has an empty Core. The intersections of the coalition rationality lines give us three basic points

\[ x^{1,2} = \Bigl( \frac12, \frac12, 0 \Bigr), \quad x^{1,3} = \Bigl( \frac12, 0, \frac12 \Bigr), \quad x^{2,3} = \Bigl( 0, \frac12, \frac12 \Bigr). \]
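These pairwise points can be generated by one rule. The sketch below assumes the equal-split rule used in the paper for such two-player agreements: each of the two partners keeps his individual value, and the remainder of v(I) is divided equally between them. The function name is hypothetical; the second set of individual values (0, 0.2, 0.4) is the asymmetric variant considered for the game Base-3-e.

```python
from fractions import Fraction as F

def pair_agreement(singles, total, i, j):
    """Players i and j (1-based) keep their individual values and split the rest of v(I)."""
    surplus = total - singles[i - 1] - singles[j - 1]
    x = [F(0)] * len(singles)
    x[i - 1] = singles[i - 1] + surplus / 2
    x[j - 1] = singles[j - 1] + surplus / 2
    return tuple(x)

# Base-3-d: all individual values are zero, v(I) = 1.
symmetric = (F(0), F(0), F(0))
for i, j in [(1, 2), (1, 3), (2, 3)]:
    print((i, j), pair_agreement(symmetric, F(1), i, j))

# Asymmetric individual values v({1}) = 0, v({2}) = 0.2, v({3}) = 0.4 (Base-3-e).
asymmetric = (F(0), F(1, 5), F(2, 5))
print(pair_agreement(asymmetric, F(1), 2, 3))   # components 0, 2/5, 3/5
print(pair_agreement(asymmetric, F(1), 1, 2))   # components 2/5, 3/5, 0
```

In the symmetric case every pair splits the influence in halves; with unequal individual values the partner with the higher individual value receives the larger share.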

Fig. 4. Geometric illustration of the game Base-3-d

Each of them corresponds to an "agreement" between two players to divide the influence equally, to the exclusion of the third party, who receives nothing. These distributions are evidently unstable, since there are obvious threats against them. In any configuration, each of the "united parties" has reason to suspect that its partner can, without loss of utility for himself, come to an agreement with "the odd man out". By the symmetry of the players' capabilities, the Nucleolus of the game Base-3-d is the vector

\[ N^{(d)} = \Bigl( \frac13, \frac13, \frac13 \Bigr). \tag{10} \]

Note that in Base-3-d the nucleolus coincides with the Shapley value. However, (10) has all the disadvantages of an "unstable, contested" solution, from which it is profitable for the players to deviate to $x^{1,2}$, $x^{1,3}$ or $x^{2,3}$. The "symmetric" instability of the solutions $x^{1,2}$, $x^{1,3}$, $x^{2,3}$ in the game Base-3-d is due to the complete symmetry of the players' capabilities. The credibility of such an assumption obviously raises serious doubts. This disadvantage can be partly overcome by differentiating the individual utilities of the players, see Table 5. The characteristic function given in Table 5 is based on estimates that assume a relative increase in the economic and political potential of China in relation to the United States and its allies. We can agree with the proposed values, at least at the level of preliminary analysis. It is not hard to guess that what is of fundamental importance is not so much the absolute values as their ordering relative to each other.

Table 5. The characteristic function of the game Base-3-e (Multi-polar world with asymmetry).

i    v(i)    S          v(S)
1    0       {1,2}      1
2    0.2     {1,3}      1
3    0.4     {2,3}      1
             {1,2,3}    1

Fig. 5. Geometric illustration of the game Base-3-e

Fig. 5 (geometric illustration of the game Base-3-e) demonstrates the transformation of the solutions $x^{1,2}$, $x^{1,3}$, $x^{2,3}$, which assume an agreement between two players to the exclusion of the third (an equal distribution of power between the contracting players is assumed). For example,

\[
x^{2,3} = \Bigl( 0,\ v(\{2\}) + \frac{v(I) - v(\{2\}) - v(\{3\})}{2},\ v(\{3\}) + \frac{v(I) - v(\{2\}) - v(\{3\})}{2} \Bigr)
= \Bigl( 0,\ 0.2 + \frac{1 - 0.2 - 0.4}{2},\ 0.4 + \frac{1 - 0.2 - 0.4}{2} \Bigr) = (0, 0.4, 0.6).
\]

As can be seen, in Base-3-e players 1 and 2, once the distribution $x^{1,2} = (0.4, 0.6, 0)$ is realized, have no incentive to abandon their alliance in favor of a union with the third player, in which they would get a smaller share. The higher individual utility of the third player, and therefore the higher level of his initial claims, works against him: it makes a separate agreement with him less favorable for the other players. The substantial differences between the games Base-3-d and Base-3-e can be demonstrated by comparing the Nucleolus

$$N(e) = \left(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3}\right) \tag{11}$$

(which remains the same as in Base-3-d, since the capabilities of the two-player coalitions are unchanged) and the Shapley vector

$$Sh(e) = \left(\tfrac{7}{30}, \tfrac{1}{3}, \tfrac{13}{30}\right). \tag{12}$$

Comparing (11) and (12), we find that the second player (China) is indifferent between these solutions. At the same time, (12) makes it possible to take into account the inequality of the opportunities of players 1 and 3 (Russia and the West). If the players overcome the temptation of pairwise alliances and try to reach a consensus in the game Base-3-e (to form the grand coalition), the possible contours of such an agreement are determined by the following parameters:

• player 1: from 23 to 33 percent of influence;
• player 2: about 33 percent of influence;
• player 3: from 33 to 43 percent of influence.

4. Extended model of interaction of centers of political influence (Base-4)

One of the most debatable issues of the Base-3 models is the assumption that fixes the number of players. Specialists in game theory are likely to treat this assumption with understanding. However, it has met serious objections from experts in political science and international economics.

The Base-3 models are based on a compromise between the properties of the object under study and the possibility of applying the mathematical tools. Increasing the number of players makes the model more adequate, but at the same time more complicated. In addition, difficulties arise in constructing the characteristic function. With a large number of players, many of their possible coalitions never occur in practice, and therefore their potential utility can be evaluated only hypothetically. As a consequence, the results obtained from models with a large number of players are just as ambiguous and debatable as in the case n = 3.
It should also be added that in passing to cooperative games with more than 4 players we lose the possibility of constructing geometric interpretations (for four-player games three-dimensional illustrations are possible; however, in terms of practical application they are rather decorative). An increase in the number of players does not lead to an increase in the quality of the results. Moreover, for large n we obtain an unwieldy mathematical construction. However, the transition from n = 3 to n = 4 can be carried out relatively easily and clearly.

Consider the model conventionally referred to as Base-4. In this model player 3 is split into players 3 and 4 (West-I, West-II). As the player West-I we continue to treat the United States and its allies. West-II can be treated as the countries traditionally assigned to the Western world which in the long term may form more clearly their own system of goals and interests, not directly correlated with the interests of the United States, Russia or China. The authors do not insist on the inevitability of such a transformation of the system of players. We fully admit other scenarios for the appearance of the fourth player. However, they have no obvious advantages over our scenario. In addition, it seems reasonable and realistic to assume that players 1 and 2 in these models should be treated not as Russia and China alone, but as Russia with its allies and China with the countries oriented toward supporting it. The characteristic function of the game Base-4 is given in Table 6.

Table 6. The characteristic function of the game Base-4

i    v(i)    S        v(S)    S          v(S)    S            v(S)
1    0       {1,2}    0.8     {1,2,3}    1       {1,2,3,4}    1
2    0.2     {1,3}    0.9     {1,2,4}    1
3    0.4     {1,4}    0.3     {1,3,4}    1
4    0       {2,3}    0.3     {2,3,4}    1
             {2,4}    0.5
             {3,4}    0.7

Source: hypothetical data.

The Shapley value for the game Base-4 takes the form

$$Sh(f) = \left(\tfrac{29}{120}, \tfrac{29}{120}, \tfrac{43}{120}, \tfrac{19}{120}\right). \tag{13}$$

Its distinguishing feature is the "equalization of influence" of players 1 and 2, in spite of the initial differences in their capabilities defined by the characteristic function (see Table 3). This can be seen as one of the effects caused by the appearance of a "fourth power", the player West-II. The nucleolus for this game takes the form

$$N(f) = (0.308,\ 0.197,\ 0.308,\ 0.187). \tag{14}$$
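The Shapley vectors (12) and (13) can be reproduced directly from Tables 5 and 6. The following is a minimal sketch in Python, not the authors' procedure: it applies the standard Shapley formula with exact rational arithmetic, and the helper names `shapley` and `game` are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def shapley(n, v):
    """Shapley value of an n-player game; v maps frozenset -> worth."""
    players = range(1, n + 1)
    value = {}
    for i in players:
        others = [j for j in players if j != i]
        total = Fraction(0)
        for size in range(n):
            # Probability weight of a coalition of this size being joined by i
            weight = Fraction(factorial(size) * factorial(n - size - 1), factorial(n))
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (v[s | {i}] - v[s])
        value[i] = total
    return value


def game(worths):
    """Build v from a dict keyed by tuples of players, adding v(empty set) = 0."""
    v = {frozenset(k): Fraction(str(w)) for k, w in worths.items()}
    v[frozenset()] = Fraction(0)
    return v


# Table 5 (Base-3-e)
base3e = game({(1,): 0, (2,): 0.2, (3,): 0.4,
               (1, 2): 1, (1, 3): 1, (2, 3): 1, (1, 2, 3): 1})

# Table 6 (Base-4)
base4 = game({(1,): 0, (2,): 0.2, (3,): 0.4, (4,): 0,
              (1, 2): 0.8, (1, 3): 0.9, (1, 4): 0.3,
              (2, 3): 0.3, (2, 4): 0.5, (3, 4): 0.7,
              (1, 2, 3): 1, (1, 2, 4): 1, (1, 3, 4): 1, (2, 3, 4): 1,
              (1, 2, 3, 4): 1})

sh_e = shapley(3, base3e)
sh_f = shapley(4, base4)
print(sh_e)
print(sh_f)
```

Exact fractions are used so that the result can be compared with (12) and (13) without rounding.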

Note that the nucleolus is the imputation that attains the lexicographic minimum of the maximal excesses over the coalitions $S \neq \emptyset, I$. For $N(f)$

$$\min_{x} \max_{S \neq \emptyset, I} e(S, x) = 0.3077.$$

This excess is attained for the coalition formed by players 1, 2 and 4. In other words, this coalition is the most dissatisfied with what the imputation N(f) offers it. At the same time, we can express some doubts as to the adequacy of the nucleolus as a solution for the game Base-4. First of all, this concerns the ratio of the shares of player 1 and player 2. It is understandable if one is guided solely by the criterion of minimizing the dissatisfaction of the most offended coalition. However, it is extremely difficult to accept if we take into account the actual practice of international relations. Thus, for the game Base-4 the nucleolus can be regarded only as a theoretical guide.
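The excesses under N(f) can be listed with a few lines of Python. This is only an illustrative check, assuming the usual definition e(S, x) = v(S) − x(S) and the rounded shares printed in (14); the variable names are introduced here. With these rounded values the computed maximum is 0.308, consistent with 0.3077, and in this computation coalition {2,3,4} ties with {1,2,4}, since for a three-player coalition with v(S) = 1 the excess equals the share of the excluded player.

```python
# Characteristic function of Base-4 (Table 6), proper coalitions only
v = {
    (1,): 0.0, (2,): 0.2, (3,): 0.4, (4,): 0.0,
    (1, 2): 0.8, (1, 3): 0.9, (1, 4): 0.3,
    (2, 3): 0.3, (2, 4): 0.5, (3, 4): 0.7,
    (1, 2, 3): 1.0, (1, 2, 4): 1.0, (1, 3, 4): 1.0, (2, 3, 4): 1.0,
}

# Rounded nucleolus shares from (14)
x = {1: 0.308, 2: 0.197, 3: 0.308, 4: 0.187}


def excess(coalition):
    """e(S, x) = v(S) - x(S): dissatisfaction of coalition S with x."""
    return v[coalition] - sum(x[i] for i in coalition)


excesses = {s: excess(s) for s in v}
max_excess = max(excesses.values())
# Coalitions whose excess equals the maximum up to rounding noise
most_dissatisfied = sorted(s for s, e in excesses.items()
                           if abs(e - max_excess) < 1e-9)

print(round(max_excess, 3), most_dissatisfied)
```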

5. Development of models of interaction of centers of political influence

We now return to the problems of constructing characteristic functions in cooperative games that describe the relationship between the centers of power and political influence in the modern world. As already mentioned, a methodology based on the principle of influence as the ability to block can be regarded (with all reservations) as acceptable only at the initial stages of a study focused on obtaining general, qualitative conclusions.

In this connection a natural question arises: what are the alternative approaches? In our view, one possible alternative is an approach that assesses the influence of the powers and their potential coalitions on the basis of a portfolio composed of the currencies of the respective countries. The first task, on whose solution the success of this approach depends, is to determine the principles and rules by which such a currency portfolio should be formed.

To date, there are a number of serious studies that have yielded important and interesting results on the dynamics of multi-currency portfolios and, in particular, on the so-called currency with minimal volatility. See in particular (Hovanov et al., 2004; Hovanov, 2005; Hovanov, 2005; Bubenko and Hovanov, 2012).

One of the problems we face when the characteristic function is built on the income of a minimum-volatility currency portfolio formed from the currencies of the coalition members is the stochastic nature of the data. A constructive solution to this problem is associated with the transition from deterministic to stochastic cooperative games. The modern theory of cooperative games has developed several approaches to the definition of a stochastic cooperative game. Among the first studies in this direction were (Charnes and Granot, 1973; Charnes and Granot, 1977). Also worth mentioning is the series of works on the subject (Yeung and Petrosyan, 2006; Yeung and Petrosyan, 2004; Suijs et al., 1999).

In this article by a stochastic cooperative game (SCG) we mean a pair $\Gamma = (I, \tilde v)$, where

• $I = \{1, \dots, n\}$ is the set of participants;
• $\tilde v(S)$ are random variables with known distribution densities $p_{\tilde v(S)}(x)$, interpreted as the revenues (utilities, payoffs) of the corresponding coalitions $S \subset I$.

This approach to stochastic cooperative games was previously presented in Konyukhovskiy (2012).

An imputation of a stochastic cooperative game is a vector $x(\alpha) \in \mathbb{R}^n$ satisfying the conditions

(a) $$(\forall i \in I)\quad P\{x_i(\alpha) \ge \tilde v(\{i\})\} \ge \alpha \tag{15}$$

the stochastic analog of individual rationality;

(b) $$P\Big\{\sum_{i=1..n} x_i(\alpha) \le \tilde v(I)\Big\} \ge \alpha \tag{16}$$

the stochastic analog of group rationality.

Note that condition (15) requires that the share prescribed by the imputation $x(\alpha)$ to the $i$th player be greater than or equal to the random value of his individual gain with probability not less than $\alpha$. In accordance with (15), the $i$th component of the vector $x(\alpha)$ is compared with the $\alpha$-quantile of $F_{\tilde v(\{i\})}(x)$, the distribution function of the random variable $\tilde v(\{i\})$. For compactness of the subsequent expressions we introduce the notation

$$v_\alpha(\{i\}) = F^{(-1)}_{\tilde v(\{i\})}(\alpha) \tag{17}$$

for a player $i$, and

$$v_\alpha(S) = F^{(-1)}_{\tilde v(S)}(\alpha) \tag{18}$$

for a coalition $S \subset I$. Then (15) can be written as

$$(\forall i \in I)\quad x_i(\alpha) \ge v_\alpha(\{i\}). \tag{19}$$

The possibility of passing from condition (15) to (19) follows from the fact that distribution functions are non-decreasing. Indeed, if the condition $x_i(\alpha) \ge \tilde v(\{i\})$ is satisfied with probability level $\alpha$, it is also satisfied for all $\alpha' < \alpha$.

In classical cooperative games, group rationality means that the utility of the grand (complete) coalition must be fully distributed by the imputation. In the stochastic modification, (16) means that the grand coalition is able, with probability not less than $\alpha$, to obtain a gain sufficient to realize the imputation $x(\alpha)$. Note that condition (16) is equivalent to

$$P\Big\{\sum_{i=1..n} x_i(\alpha) \ge \tilde v(I)\Big\} \le 1 - \alpha. \tag{20}$$

From (20), denoting by $v_\alpha(I) = F^{(-1)}_{\tilde v(I)}(\alpha)$ the $\alpha$-quantile of the distribution of $\tilde v(I)$, we obtain $\sum_{i=1..n} x_i(\alpha) \le v_{1-\alpha}(I)$.

We emphasize a quite significant difference. Whereas in ordinary cooperative games group rationality is defined as a strict equality and thus defines a hyperplane in n-dimensional space, in the approach proposed here it takes the form of an inequality and defines a half-space. Thus, the nature of the vectors x that satisfy the definition (15)-(16) differs from the nature of imputations in their classical interpretation. Such objects are sometimes called distributions (allocations).
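In quantile form, conditions (15)-(16) reduce to the comparisons $x_i(\alpha) \ge v_\alpha(\{i\})$ and $\sum_i x_i(\alpha) \le v_{1-\alpha}(I)$, which are easy to check numerically. The sketch below is an illustration only: it assumes normally distributed gains, uses Python's standard `statistics.NormalDist`, and the means, standard deviations and the function name `is_stochastic_imputation` are hypothetical values and names chosen for this example, not data from the paper.

```python
from statistics import NormalDist


def is_stochastic_imputation(x, individual, grand, alpha):
    """Check x_i >= v_alpha({i}) for every i and sum(x) <= v_{1-alpha}(I)."""
    individually_rational = all(
        share >= dist.inv_cdf(alpha) for share, dist in zip(x, individual)
    )
    group_rational = sum(x) <= grand.inv_cdf(1 - alpha)
    return individually_rational and group_rational


# Hypothetical distributions of individual gains and of the grand coalition
individual = [NormalDist(0.1, 0.05), NormalDist(0.2, 0.05), NormalDist(0.3, 0.05)]
grand = NormalDist(1.0, 0.1)
alpha = 0.9

admissible = is_stochastic_imputation([0.20, 0.30, 0.37], individual, grand, alpha)
too_generous = is_stochastic_imputation([0.20, 0.30, 0.40], individual, grand, alpha)
underpaid = is_stochastic_imputation([0.15, 0.30, 0.37], individual, grand, alpha)

print(admissible, too_generous, underpaid)
```

The second vector fails because its total exceeds the (1 − α)-quantile of the grand coalition's gain, the third because player 1 receives less than the α-quantile of his individual gain.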

As a result, the system of conditions that determines an imputation of a stochastic game takes the form

(a) $$(\forall i \in I)\quad x_i(\alpha) \ge v_\alpha(\{i\}) \tag{21}$$

(b) $$\sum_{i=1..n} x_i(\alpha) \le v_{1-\alpha}(I) \tag{22}$$

For the values $v_\alpha(\{i\})$ modern risk management usually applies the term value at risk (VaR). In this connection an advantage of the approach (21)-(22) may be noted: it is logically consistent with the concept of VaR. This potentially opens up opportunities for a meaningful interpretation of the results of subsequent studies of the properties of this class of games and of their solution concepts.

When modeling the interaction of centers of influence in the modern world, the basket of currencies of the countries forming the corresponding coalition can be used as a basis for building the characteristic function. The potential income provided by the currency basket is a random variable.

Since the actual yield is realized under conditions of uncertainty, it is expedient to use random variables in the modeling. Accordingly, models based on stochastic games should be used to simulate these processes. For example, the income received from the basket of a coalition of players S can be treated as a random variable $\tilde v(S)$ with density $p_{\tilde v(S)}(x)$. Of course, this applies both to coalitions formed by individual players and to the grand coalition. In practice, the income of a currency basket can be modeled by normally distributed random variables or by random variables with an asymmetric triangular distribution.

An important specific feature of stochastic cooperative games is a substantial modification of the concept of superadditivity. For ordinary (non-stochastic) cooperative games,

$$v(S \cup T) = v(S) + v(T)$$

(equality between the sum of the utilities of coalitions S and T and the utility of the united coalition $S \cup T$) means that the association is pointless. In a stochastic game, by contrast, the analogous sum

$$\tilde v^{+}(S \cup T) = \tilde v(S) + \tilde v(T)$$

is also a random variable, for which

$$v^{+}_\alpha(S \cup T) \neq v_\alpha(S) + v_\alpha(T)$$

in general.
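The gap between the quantile of a sum and the sum of quantiles is easy to see in the normal case. The sketch below is a hedged illustration: it additionally assumes that $\tilde v(S)$ and $\tilde v(T)$ are independent (an assumption made only for this example), so that their sum is normal with the variances added, and the numerical parameters and the name `quantile_gap` are hypothetical.

```python
from math import hypot
from statistics import NormalDist


def quantile_gap(dist_s, dist_t, alpha):
    """Return (quantile of the sum, sum of the quantiles) at level alpha.

    Assumes dist_s and dist_t are independent normal distributions.
    """
    total = NormalDist(dist_s.mean + dist_t.mean,
                       hypot(dist_s.stdev, dist_t.stdev))
    return total.inv_cdf(alpha), dist_s.inv_cdf(alpha) + dist_t.inv_cdf(alpha)


# Hypothetical incomes of coalitions S and T
income_s = NormalDist(1.0, 0.3)
income_t = NormalDist(0.5, 0.4)

q_of_sum, sum_of_q = quantile_gap(income_s, income_t, 0.95)
median_of_sum, sum_of_medians = quantile_gap(income_s, income_t, 0.5)

print(q_of_sum, sum_of_q)
```

Under these assumptions the two quantities coincide only at the median; for α above 0.5 the quantile of the summed income is strictly smaller than the sum of the separate quantiles.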
The relationship between quantiles of sums and sums of quantiles belongs more to probability theory than to game theory. See, for example, (Watson and Gordon, 1986; Liu and David, 1989). In terms of assessing the influence of cross-national coalitions, this property means that, when stochastic cooperative games are used as the modeling tool, even a simple summation of incomes (utilities) under an inter-coalition agreement that combines currency baskets may bring an additional effect. Accordingly, there is an opportunity for adequate modeling of the effects of the emergence of cross-country coalitions. In particular, the following types of coalitions can be distinguished.

• The currency basket of the united coalition $S \cup T$ is characterized by simple summation of revenues, $\tilde v^{+}(S \cup T) = \tilde v(S) + \tilde v(T)$; its new qualities are determined by the difference between the VaR of the united coalition and the sum of the VaRs of the participants.
• When coalitions S and T merge, the united coalition $S \cup T$ forms a currency basket whose income is described by a new random variable $\tilde v(S \cup T)$ with density $p_{\tilde v(S \cup T)}(x)$ that is not directly related to the sum $\tilde v(S) + \tilde v(T)$.

The second option allows one to reflect the substantial effects of association for specific coalitions.

In such models another direction of analysis of the rationality of imputations arises. Indeed, in the case of a meaningful association (the second type of coalition), the share of utility $x(Q)$ that the imputation $x$ prescribes to a coalition $Q$ can be compared not only with $v_\alpha(Q) = F^{(-1)}_{\tilde v(Q)}(\alpha)$ (the VaR of the utility of Q), but also with the sum of the VaRs of the players included in the coalition Q

$$\tilde v^{+}(S) = \sum_{i \in S} \tilde v(\{i\}). \tag{23}$$

Moreover, comparisons of $v_\alpha(Q)$ with $v_\alpha(S) + v_\alpha(T)$ are possible, where S and T are subsets of Q ($Q = S \cup T$, $S \cap T = \emptyset$). Generally, such comparisons can be conducted over all possible partitions of each set $S \subset I$ into subsets. As a result, we can reach essential conclusions about the preferred forms of cross-national associations and about the influence they can objectively claim.

6. Closing

The methods of the modern theory of cooperative games can serve as effective tools for modeling and analyzing the processes of redistribution of political and economic influence among the world's centers of power.

The examples given in this paper suggest that cooperative game models provide an internally consistent logic for explaining the trends along which the relationships between the world centers of power have evolved and influence has been redistributed among them in recent decades.

A condition for the successful development of cooperative models of interaction between centers of power is the improvement of methods for constructing characteristic functions, so that the objective interests (utilities) of countries and coalitions are accounted for more adequately.

Further research aimed at improving the mathematical tools used in the models is very important. This applies particularly to the development of solution concepts for stochastic cooperative games.

References

Aleskerov, F. T., Kravchenko, A. S. (2008). The distribution of influence in the government of the Russian Empire Dumas. Polit, 3(50).
Bubenko, E. A., Hovanov, N. V. (2012). Using the aggregate economic benefits of constant values for the hedging of exchange risks. Management of economic systems, 12. (Electronic scientific journal http://uecs.ru/instrumentalnii-metody-ekonomiki/).
Heydarov, N. A. (2008). Geopolitical "triangle" Russia-China-US in Eurasia. Magazine "Explorer", February 2.
Grinin, L. E. (2013). Globalization of the world shuffles a deck (which shifted the global economic and political balance). In: Age of Globalization, 2(12), 63–78.
Dergachov, V. A. (2011). The geopolitical theory of large multi-dimensional spaces. Publishing Project Professor Dergacheva.
Dergachov, V. A. (2005). Global Studies. School edition. M.: UNITY-DANA.
Ignatov, T. V., Podolsky, T. V. (2014). The possibilities of global governance of the world financial system: realities and prospects. In: Age of Globalization, 2, 119–128.
Konyukhovskiy, P. V. (2012). The use of stochastic cooperative games in the justification of investment projects. In: Vestnik St. Petersburg University, Ser. 5 "Economy", 2(124) (December), 134–143.
Pechersky, S. L., Yanovskaya, E. B. (2004). Cooperative Games: solutions and axioms. SPb.: Publishing House of Europe. U. of St. Petersburg.
Sokolov, A. V. (2008). Quantitative methods of assessing the impact of participating in the collective decision-making. Polit, 4(51).
Hovanov, N. V. (2005). Measuring the exchange value of economic benefits in terms of aggregated stable currency. In: Finance and business, 2, 33–43.
Hovanov, N. V. (2005). The phenomenological theory of stable meta-money. In: Finance and business, 4, 18–21.
Banzhaf, J. F. (1965). Weighted voting does not work: a mathematical analysis. In: Rutgers Law Review, 19, 317–343.
Charnes, A., Granot, D. (1977). Coalitional and Chance-Constrained Solutions to n-Person Games, II: Two-Stage Solutions. In: Operations Research, Vol. 25, Issue 6, pp. 1013–1019.
Charnes, A., Granot, D. (1973). Prior solutions: extensions of convex nucleolus solutions to chance-constrained games. In: Symposium at Iowa University, pp. 1013–1019.
Coleman, J. S. (1971). Control of Collectivities and the Power of a Collectivity to Act. In: B. Lieberman (ed.) Social Choice, New York: Gordon and Breach, pp. 269–300.
Deegan, J., Packel, E. W. (1978). A New Index of Power for Simple n-Person Games. In: International Journal of Game Theory, 7(2).
Holler, M. J., Packel, E. W. (1983). Power, Luck and the Right Index. In: Journal of Economics, 43.
Hovanov, N. V., Kolari, J. W., Sokolov, M. V. (2004). Computing currency invariant indices with an application to minimum variance currency baskets. In: Journal of Economic Dynamics and Control, 28, 1481–1504.
Johnston, R. J. (1978). On the Measurement of Power: Some Reactions to Laver. In: Environment and Planning, 10.
Liu, J., David, H. A. (1989). Quantiles of Sums and Expected Values of Ordered Sums. In: Austral. J. Statist., 31(3), 469–474.
Penrose, L. S. (1946). The Elementary Statistics of Majority Voting. In: Journal of the Royal Statistical Society, 109, 53–57.
Rae, D. W. (1969). Decision-Rules and Individual Values in Constitutional Choice. In: American Political Science Review, 63, 40–63.
Shapley, L. S., Shubik, M. (1954). A Method for Evaluating the Distribution of Power in a Committee System. In: American Political Science Review, 48(3).
Schmeidler, D. (1969). The nucleolus of a characteristic function game. In: SIAM Journal on Applied Mathematics, 17(6), 1163–1170.
Suijs, J., Borm, P. (1999). Stochastic Cooperative Games: Superadditivity, Convexity, and Certainty Equivalents. In: Games and Economic Behavior, 27, 331–345.
Suijs, J., Borm, P. (1999). A nucleolus for stochastic cooperative games. In: A nucleolus for stochastic cooperative games, pp. 152–181.
Suijs, J., Borm, P., De Waegenaere, A., Tijs, S. (1999). Cooperative games with stochastic payoffs. In: European Journal of Operational Research, 113(1), 193–205.
Watson, R., Gordon, L. (1986). On Quantiles of Sums. In: Austral. J. Statist., 28(2), 192–199.
Yeung, D. W. K., Petrosyan, L. A. (2006). Cooperative Games: solutions and axioms. SPb.: Publishing House of Europe. U. of St. Petersburg.
Yeung, D. W. K., Petrosyan, L. A. (2004). Subgame consistent cooperative solutions in stochastic differential games. In: J. Optimiz. Theory and Appl., 120(3), 651–666.

Contributions to Game Theory and Management, X, 162–174

Game-Theoretic Approach for Modeling of Selfish and Group Routing

Alexander Yu. Krylatov and Victor V. Zakharov St. Petersburg State University, 7/9 Universitetskaya nab., St.Petersburg, 199034, Russia E-mail: [email protected], [email protected]

Abstract The development of methodological tools for modeling traffic flow assignment is a crucial issue, since traffic conditions nowadays significantly influence quality of life. It is no secret that the development of in-vehicle route guidance and information systems could significantly affect route choice, since it is widely believed that they are able to reduce congestion in urban traffic areas. Network users join groups of drivers who rely on the same route guidance system. Therefore, the present paper is devoted to discussing approaches to modeling selfish and group routing. Network performance is closely associated with competition between the users of networks, so the emphasis in our discussion is placed on game-theoretic approaches to appropriate modeling.
Keywords: traffic assignment problem, selfish routing, user equilibrium of Wardrop, group routing, Nash equilibrium, system optimum of Wardrop.

1. Introduction

In 1952 J.G. Wardrop published two principles of traffic flow assignment (Wardrop, 1952). The first principle of Wardrop, associated with user equilibrium assignment, states:

• The journey times on all routes actually used are equal, and less than those which would be experienced by a single vehicle on any unused route.

Clearly, it is still relevant and useful for the evaluation of traffic flow assignment. This means that traffic engineers and decision makers at different levels of management rely on the first principle to identify the used routes in the network and the corresponding flows. The second principle of Wardrop is associated with system optimum:

• The average journey time is a minimum.

The second principle cannot be used for the evaluation of traffic flow assignment, since no driver seeks to minimize the average time. Every driver tries to minimize his own travel time and, hence, the first principle of Wardrop is more appropriate for such evaluation. Nevertheless, the relationship between user equilibrium and system optimum leads to important economic conclusions in transportation about toll pricing on the links of a network (Gartner, 1980). Ultimately, thanks to the second principle of Wardrop it is possible to evaluate the tolls that could be charged to the users of a road network.

A number of studies have dealt with these two principles of Wardrop. An extensive review of existing models and methods for traffic assignment evaluation was made by M. Patriksson in 1994 (Patriksson, 1994). This outstanding book was republished at the beginning of 2015 without loss of its relevance (Patriksson, 2015). In spite of the various results already obtained in this branch of applied mathematics, new investigations still appear, contributing mainly by expanding practical numerical techniques.
For instance, over the past decade much attention has been paid to the class of bush-based algorithms (Zheng and Peeta, 2014). According to (Xie et al., 2013), it is the ability of such algorithms to solve large-scale traffic assignment problems at a high level of precision that attracts many researchers.

On the other hand, the theoretical principles of the traffic assignment problem have developed relatively weakly since the early 1950s. Important substantive insights into the first principle of Wardrop were given by T. Roughgarden in his book (Roughgarden, 2005). He investigated in depth the user equilibrium concept as selfish behaviour of network users. However, definite relationships between selfish routing and group behavior in a network (system optimum) were not established. Nevertheless, it was clearly shown how inefficient selfish routing can be from the system perspective. In other words, when each driver tries to minimize his own travel time (selfish routing), the average traffic conditions deteriorate. Much attention was paid to Braess's paradox and its extension to the second and third Braess graphs. The task of avoiding this paradox seems unsolvable for large road networks. From a practical perspective, any topology of a real road network is exposed to the manifestation of Braess's paradox. Therefore, selfish routing in a real road network leads to a loss of network performance. The only way to increase network performance is to develop the group routing principle (Krylatov et al., 2016).

A number of theoretical results that establish relationships between the user equilibrium of Wardrop, the Nash equilibrium (competition between several groups of users) and the system optimum of Wardrop were obtained in (Krylatov et al., 2016). The investigation was based on the assumption that the impact of in-vehicle route guidance and information (IVRGI) systems on route choice in people's daily trips is increasing nowadays.
Such systems are a result of the rapid development of information technologies in the past three decades, which has led to the emergence of various specialized telecommunication systems that are nowadays introduced in almost every field of human activity. The influence of these systems on decision making seems to be significant. Moreover, the continuing innovative development of such systems is noticeably related to the creation of intelligent vehicles. Indeed, from a consumer perspective one of the main attributes of any intelligent vehicle is an automatic drive mode associated with an automatic in-vehicle route guidance system. Therefore, guidance systems seem to be an integral part of the concept of the intelligent vehicle.

Actually, an automatic guidance system is a great advantage of intelligent vehicles not only from a consumer perspective. The traffic flow of intelligent vehicles could be automatically assigned by a central guidance system in such a way as to minimize the overall travel time of all road network users. In other words, a system optimum (Wardrop, 1952) could be reached on the network by imposing the optimal route choices on the users (Patriksson, 2015; Sheffi, 1985; Wardrop, 1952). Such an assignment of traffic flow is often called an involuntary system optimum, as opposed to a voluntary system optimum (Gartner, 1980). In the voluntary case, users reach the system optimum after paying special tolls, although initially they tend toward user equilibrium assignment (Patriksson, 2015; Sheffi, 1985; Wardrop, 1952). Here it should be mentioned that user equilibrium assignment is supposed to take place when all drivers try to minimize their own travel time without the support of any guidance system. As a result, the overall travel time spent by all users assigned according to the interest of each atomic user is greater than the overall travel time in the system optimum case.
Therefore, a central guidance system is capable of reducing congestion by imposing system optimum assignment on the intelligent vehicles.

Moreover, according to (Bonsall, 1992) there is considerable government interest in the development of in-vehicle guidance systems. This interest reflects a belief that such systems could produce benefits in four ways (Bonsall, 1992):

• to improve people's knowledge of the network and assist them in finding efficient routes;
• to reduce unnecessary mileage, traffic volumes and hence congestion;
• to link in-vehicle guidance systems with traffic control and, perhaps, road pricing systems;
• to obtain more globally efficient routing patterns (system optimum).

Therefore, governments anticipate positive effects from the implementation of in-vehicle guidance systems and try to formalize them. From the mathematical side, all these advantages can be expressed by the system optimum principle. Hence, all the mentioned ways of producing benefits for traffic systems have already been discussed in the previous paragraph in the short form of an "optimization vocabulary".

Despite the interest of governments, as a rule the major contribution to the development of guidance systems is made by private companies. By virtue of the competitive structure of the economy, these companies are forced to compete with each other by offering their users better service. First of all, "better service" means less travel time from the origin to the destination. Thus, each company seeks to route the flows of its own users so as to minimize their average travel time. At the same time, the others try to minimize the average travel time of their own users traveling in the same road network. Under these circumstances the competitive traffic assignment problem arises. The non-cooperative nature of the relations between the companies leads to a set of optimization programs in which the unknown variables of any one program are independent parameters in all the others. Therefore, the competitive traffic assignment problem should be formulated in game-theoretic form as a search for a Nash equilibrium (Nash, 1951).
Thus, investigation of relationships between Wardrop’s system optimum associated with traffic assignment problem and Nash equilibrium associated with competitive traf- fic assignment problem seems quite important. Indeed, when flows of intelligent vehicles are large enough then competitive guidance systems could deviate traf- fic assignment from system optimum significantly. This fact should be taken into consideration by traffic engineers, transportation planners, network designers and etc. in transportation modeling. This paper is completely focused on mentioned relationships and moreover, some common aspects of Nash equilibrium and user equilibrium of Wardrop assignments will be also illuminated. Game-Theoretic Approach for Modeling of Selfish and Group Routing 165

2. State-of-The-Art

We give here an overview of the results obtained in traffic behavior modeling by means of the game-theoretic approach. This is a fairly standard overview of the results on the topic (see (Krylatov et al., 2016; Patriksson, 2015)).

The first attempt to define traffic equilibria in the form of network games was made in the late 1950s by Charnes and Cooper (Charnes and Cooper, 1958). They described the user equilibrium flow as a non-cooperative Nash equilibrium in a game where the players are origin–destination pairs (OD pairs) competing to minimize the travel time of their respective commodity flows. Further discussion along this line was developed by Dafermos in (Dafermos, 1971; Dafermos and Sparrow, 1969). However, these first investigations of the relationship between a Wardrop equilibrium and network games did not lead to any formal expressions.

Rosenthal studied a discrete version of the user equilibrium traffic assignment problem in 1973 (Rosenthal, 1973). It should be stressed that the players are defined as the individual travelers, with strategy spaces equal to their respective sets of available routes. The travelers seek to minimize their individual travel times, which serve as their payoff functions. The game is shown to be equivalent to a non-cooperative, pure-strategy Nash game in the traffic network. Therefore, he was the first to formulate a special case of the competitive traffic assignment problem as we defined it above. It is a special case because each UG consists of a single user.

In 1981 Devarajan extended the discrete version to the continuous case; however, like Charnes and Cooper, he defined OD pairs as the players (Devarajan, 1981) and considered the payoff functions:

$$\varphi_w(y) = \sum_{a \in A_w} \int_0^{y_a} t_a(s)\,ds, \quad \forall w \in W,$$

where $W$ is a set of UGs, $w \in W$; $A_w$ is the set of links included in routes between OD pair $w$; $y_a$ is the traffic flow on a congested link $a$; $t_a$ is the travel time through a congested link $a$. Hence his formulation does not correspond to the competitive traffic assignment principle. Nevertheless, it was proved that the Nash game thus defined is equivalent to a Wardrop equilibrium.

In the mid-1980s more general game formulations of traffic equilibria were given by Fisk (Fisk, 1984) and by Haurie and Marcotte (Haurie and Marcotte, 1985). The travel demand of an OD pair is divided among a number of players and, hence, a player, as defined, may use several routes simultaneously; in equilibrium, all players divide their flow among all routes used in the OD pair. Such a formulation therefore has certain features in common with the competitive traffic assignment problem. However, it is shown that the Nash game is equivalent to a Wardrop equilibrium only in the limiting case, when the number of players in each OD pair tends to infinity while they share the same strategy.

In the 1990s the development of computer networks motivated researchers to begin investigating competitive routing in multiuser communication networks (Orda et al., 1993). According to Orda et al., a single administrative domain was no longer a valid assumption in networking. Communication networks shared by selfish users with their own given flow demands were then considered and modeled as noncooperative games by several research groups (Korilis and Lazar, 1995; Korilis et al., 1995; La and Anantharam, 1997; Altman et al., 2002). As a result of this research, various properties of such systems have been established, and the conditions for the existence and uniqueness of a Nash equilibrium in multiuser communication networks have been widely studied.

During the 2000s Altman et al. extended the results obtained for multiuser communication networks and applied them to road networks (Altman et al., 2002; Altman et al., 2011; Altman and Kameda, 2005). Unlike Haurie and Marcotte, Altman et al. established the convergence of the Nash equilibrium in network games to the Wardrop equilibrium as the number of players grows under weaker convexity assumptions (Altman et al., 2011). They thus raised again the question of the relationship between the Nash equilibrium in noncooperative n-person network games and the Wardrop equilibrium in the traffic assignment problem.

3. Traffic Assignment Problem

This section gives a basic description of the modern traffic assignment problem. Consider a transportation network represented by an oriented graph $G = (N, A)$. We assume that there is a set of OD pairs $W$ and a set of routes $R^w$ between each OD pair $w \in W$. Moreover, we introduce the following notation: $F^w$ is the demand between OD pair $w$; $f_r^w$ is the flow through route $r \in R^w$; $x_a$ is the flow through the arc $a \in A$, $x = (\dots, x_a, \dots)$; $t_a(x) = t_a(x_a)$ is the travel time of flow $x_a$ through the congested arc $a \in A$; $\delta_{a,r}^w$ is an indicator:

$$\delta_{a,r}^w = \begin{cases} 1, & \text{if route } r \in R^w \text{ includes arc } a \in A, \\ 0, & \text{otherwise.} \end{cases}$$

According to (Beckmann et al., 1956; Patriksson, 1994; Sheffi, 1985), equal travel time on all actually used routes, which is less than the travel time on any unused route, can be reached by the assignment strategy obtained from the following optimization program:

$$x^{ue} = \arg\min_{x} \sum_{a \in A} \int_0^{x_a} t_a(u)\,du, \tag{1}$$

subject to

$$\sum_{r \in R^w} f_r^w = F^w \quad \forall w \in W, \tag{2}$$

$$f_r^w \ge 0 \quad \forall r \in R^w,\ w \in W, \tag{3}$$

with definitional constraints

$$x_a = \sum_{w \in W} \sum_{r \in R^w} f_r^w \delta_{a,r}^w \quad \forall a \in A. \tag{4}$$

Let us state the basic theorem on the user equilibrium of Wardrop together with its proof (Patriksson, 2015; Sheffi, 1985). The proof is instructive in its own right.

Theorem 1. The solution $x^*$ of optimization problem (1)–(4) is a user equilibrium of Wardrop.

Proof. The Lagrangian of (1)–(4) is

$$L = \sum_{a \in A} \int_0^{x_a} t_a(u)\,du + \sum_{w \in W} t^w \Big( F^w - \sum_{r \in R^w} f_r^w \Big) + \sum_{w \in W} \sum_{r \in R^w} (-f_r^w)\,\eta_r^w,$$

where $t^w$ and $\eta_r^w \ge 0$, $r \in R^w$, $w \in W$, are Lagrange multipliers. According to the Kuhn–Tucker conditions, the partial derivatives of $L$ at $x^*$ are equal to zero:

$$\frac{\partial L}{\partial f_r^w} = \sum_{a \in A} t_a(x_a) \cdot \frac{\partial x_a}{\partial f_r^w} - t^w - \eta_r^w = 0 \quad \forall r \in R^w,\ w \in W. \tag{5}$$

Due to (4),

$$\frac{\partial x_a}{\partial f_r^w} = \delta_{a,r}^w \quad \forall a \in A,\ r \in R^w,\ w \in W,$$

which turns (5) into

$$\frac{\partial L}{\partial f_r^w} = \sum_{a \in A} t_a(x_a) \cdot \delta_{a,r}^w - t^w - \eta_r^w = 0 \quad \forall r \in R^w,\ w \in W. \tag{6}$$

Complementary slackness requires $f_r^w \cdot \eta_r^w = 0$. If $f_r^w > 0$, then $\eta_r^w = 0$, and if $f_r^w = 0$, then $\eta_r^w \ge 0$. Then we obtain:

$$\sum_{a \in A} t_a(x_a) \cdot \delta_{a,r}^w \ \begin{cases} = t^w, & \text{if } f_r^w > 0, \\ \ge t^w, & \text{if } f_r^w = 0, \end{cases} \quad \forall r \in R^w. \tag{7}$$

Therefore, the solution of (1)–(4) satisfies (7). Consequently, it is a user equilibrium of Wardrop by definition.

Note that, since the objective function (1) has no physical or economic meaning of its own, expression (5) is crucial for selfish routing. It confirms that problem (1)–(4) is a behavioral model of selfish routing in which each user tries to minimize its own travel time.

According to (Beckmann et al., 1956; Patriksson, 1994; Sheffi, 1985), the minimum average travel time can be reached by the assignment strategy obtained from the following optimization program:

$$T(x^{so}) = \min_{x} \sum_{a \in A} t_a(x_a)\,x_a, \tag{8}$$

subject to

$$\sum_{r \in R^w} f_r^w = F^w \quad \forall w \in W, \tag{9}$$

$$f_r^w \ge 0 \quad \forall r \in R^w,\ w \in W, \tag{10}$$

with definitional constraints

$$x_a = \sum_{w \in W} \sum_{r \in R^w} f_r^w \delta_{a,r}^w \quad \forall a \in A. \tag{11}$$

Unlike the user equilibrium model, problem (8)–(11) is transparent, since the objective function (8) has a clear economic meaning (total travel time cost).
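The difference between programs (1)–(4) and (8)–(11) can be seen on a very small instance. The following Python sketch is only an illustration under assumptions that are not part of the paper: a single OD pair with demand F = 1 and two parallel arcs with travel times $t_1(x) = x$ and $t_2(x) = 1$. The helper `ternary_min` is a hypothetical name for a plain one-dimensional search that is adequate here because both objectives are convex in the flow on arc 1.

```python
# Hypothetical two-arc network (an assumption, not taken from the paper):
# one OD pair with demand F, arc 1 with t_1(x) = x, arc 2 with t_2(x) = 1.
F = 1.0


def beckmann(x1):
    """Objective (1): sum over arcs of the integral of t_a from 0 to x_a."""
    x2 = F - x1
    return 0.5 * x1 ** 2 + 1.0 * x2


def total_time(x1):
    """Objective (8): sum over arcs of t_a(x_a) * x_a."""
    x2 = F - x1
    return x1 * x1 + 1.0 * x2


def ternary_min(g, lo, hi, iters=200):
    """Minimize a convex function g on [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


# Flow on arc 1 in the user equilibrium and in the system optimum.
x_ue = ternary_min(beckmann, 0.0, F)
x_so = ternary_min(total_time, 0.0, F)

print("UE flow on arc 1:", round(x_ue, 6), "total time:", round(total_time(x_ue), 6))
print("SO flow on arc 1:", round(x_so, 6), "total time:", round(total_time(x_so), 6))
```

In this instance the user equilibrium sends all flow through arc 1, where both arcs have travel time 1 as condition (7) requires, while the system optimum splits the demand equally and attains a lower total travel time.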

4. Game of OD-pairs

This section is devoted to an alternative game-theoretic formulation of user-equilibrium assignment. All results discussed here were obtained by S. Devarajan (Devarajan, 1981).

Assume that the OD-pairs are the players. We define $A_w \subset A$ as the subset of arcs that are used by traffic flows between pair $w \in W$: $A_w = \{a : a \in r \text{ for some } r \in R^w\}$. Each player $w$ tries to minimize its payoff function

$$P_w(x) = P_w(x^w, x^{-w}) = \sum_{a \in A_w} \int_0^{x_a} t_a(u)\,du, \tag{12}$$

subject to

$$\sum_{r \in R^w} f_r^w = F^w, \tag{13}$$

$$f_r^w \ge 0 \quad \forall r \in R^w, \tag{14}$$

with definitional constraint

$$x_a = \sum_{w \in W} \sum_{r \in R^w} f_r^w \delta_{a,r}^w, \quad \forall r \in R^w,\ a \in A_w. \tag{15}$$

Let us formulate problem (12)–(15) in the form of a noncooperative network game with penalty functions: $\Omega\big(W, \{F_w\}_{w \in W}, \{P_w\}_{w \in W}\big)$, where $F_w = \big\{ f^w \mid f_r^w \ge 0\ \forall r \in R^w,\ \sum_{r \in R^w} f_r^w = F^w \big\}$, $\forall w \in W$, when

$$x_a = \sum_{w \in W} \sum_{r \in R^w} f_r^w \delta_{a,r}^w, \quad \forall r \in R^w,\ a \in A_w.$$

A Nash equilibrium in game $\Omega$ is reached by such $x^*$ that

$$P_w(x^*) \le P_w\big(x^w, x^{-w*}\big) \quad \forall w \in W,$$

where $x^{-w} = (\dots, x^{w-1}, x^{w+1}, \dots)$. Let us state the following theorem; its proof is instructive in its own right.

Theorem 2. A user-equilibrium flow pattern (with continuous flows) is a pure strategy Nash equilibrium in game $\Omega$.

Proof. Assume that for two routes $p$ and $q$ of some player $w$, where route $p$ carries positive flow,

$$\sum_{a \in p} \int_0^{x_a} t_a(u)\,du > \sum_{a \in q} \int_0^{x_a} t_a(u)\,du.$$

Then

$$\sum_{a \in (p - q)} \int_0^{x_a} t_a(u)\,du - \sum_{a \in (q - p)} \int_0^{x_a} t_a(u)\,du = \delta > 0.$$

Due to the continuity of $\int_0^{x_a} t_a(u)\,du$, there exist $\Delta f_1 > 0$, $\Delta f_2 > 0$ such that

$$\sum_{a \in (p - q)} \int_0^{x_a - \Delta f_1} t_a(u)\,du > \sum_{a \in (p - q)} \int_0^{x_a} t_a(u)\,du - \frac{\delta}{3}, \tag{16}$$

$$\sum_{a \in (q - p)} \int_0^{x_a + \Delta f_2} t_a(u)\,du < \sum_{a \in (q - p)} \int_0^{x_a} t_a(u)\,du + \frac{\delta}{3}. \tag{17}$$

Let us substitute $\Delta f = \min\{\Delta f_1, \Delta f_2\}$ for $\Delta f_1$ and $\Delta f_2$ in (16) and (17); the inequalities still hold. Thus, from (16) and (17) we obtain

$$\sum_{a \in (q - p)} \int_0^{x_a + \Delta f} t_a(u)\,du - \sum_{a \in (p - q)} \int_0^{x_a - \Delta f} t_a(u)\,du < \sum_{a \in (q - p)} \int_0^{x_a} t_a(u)\,du - \sum_{a \in (p - q)} \int_0^{x_a} t_a(u)\,du + \frac{2\delta}{3} = -\frac{\delta}{3} < 0. \tag{18}$$

Let $\Delta P_w$ be the change in $P_w$ resulting from a transfer of $\Delta f$ from $p$ to $q$. Since $\int_0^{x_a} t_a(u)\,du$ is an increasing function,

$$\Delta P_w < \sum_{a \in (q - p)} \int_0^{x_a + \Delta f} t_a(u)\,du \cdot \Delta f - \sum_{a \in (p - q)} \int_0^{x_a - \Delta f} t_a(u)\,du \cdot \Delta f < -\frac{\delta}{3}\,\Delta f < 0. \tag{19}$$

Therefore, if no player $w$ can lower his payoff $P_w$ by an interpath flow transfer, then the network is user optimized. However, the equivalence to Nash equilibrium is not yet transparent.

Finally, S. Devarajan drew important conclusions, which we cite below (Devarajan, 1981). Theorem 2 does not guarantee the equivalence of user equilibrium and Nash equilibrium in the game of OD-pairs $\Omega$. Actually, user-optimization is a weak condition for Nash equilibrium. The problem is that the criterion for user-optimization, that a shift of $\Delta f$ from path $p$ to path $q$ not decrease the cost to $\Delta f$, is a weak condition for Nash equilibrium. Pure strategy Nash equilibrium requires that the adoption of any new pure strategy by a player should not improve his payoff. This means we should be able to transfer any number of $f$ (not necessarily equal) from as many paths to another set of paths (all connecting the same OD-pair $w$) and not register a drop in payoff $P_w$. To put it another way, given a pure strategy $(f_{p_1}, \dots, f_{p_n})$, the user optimization criterion refers to changing only two of the $f_{p_i}$'s. For Nash equilibrium, we must test whether shifts which change up to all the $f_{p_i}$'s improve the payoff. Thus, every Nash equilibrium is a user optimal network but not vice versa. However, if we can show that:
• the user optimal solution is unique, and
• Nash equilibrium always exists in this game,
then the two equilibrium concepts are, in fact, equivalent.

5. Game of Individual Users

This section is devoted to another alternative game-theoretic formulation of user-equilibrium assignment. All results discussed here were obtained by R.W. Rosenthal (Rosenthal, 1973).

Let us start with an introductory quotation from R.W. Rosenthal (Rosenthal, 1973). The individuals are assumed to be playing a game in which the pure strategies for each are the individuals' feasible paths. The payoffs (to be minimized) are the sums of the costs of the arcs used. Nash equilibria are sought. In this case these correspond to equilibria for the system. For general n-person games, however, one is not guaranteed that any Nash equilibria must exist; unless the individual strategy sets are extended to include all possible randomizations over the sets of pure strategies. (See Nash, 1951) (The cost of playing a randomized strategy is taken to be the expected cost over the relevant pure strategies.) [...] correspond to fractional solutions to the continuous-variables model. For this class of games, however, it is not necessary to introduce randomizations, since pure-strategy Nash equilibria always exist.

Denote by $x_a^k$ the fraction of individual $k$'s flow which passes through arc $a$, $k \in \{1, \dots, |F|\}$, $a \in A$. If all individuals have chosen their routes, the total cost to an individual traversing route $r$ is

$$P_k(r) = \sum_{a \in r} t_a(x_a). \tag{20}$$

Then we can formulate the following noncooperative network game: $\Upsilon\big(F, R, \{P_k\}_{k \in F}\big)$, where $F = \{1, \dots, |F|\}$ and $R$ is the set of all possible routes.

An equilibrium for the system is a set of feasible paths, one for each individual, such that no individual can decrease his total cost by switching unilaterally to some other feasible path. We shall assume in all that follows that at least one feasible route exists for each individual.

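Consider the following problem:

$$\min \sum_{a \in A} \sum_{u=0}^{x_a} t_a(u) \tag{21}$$

subject to

$$x_a = \sum_{k=1}^{|F|} x_a^k, \quad a \in A, \tag{22}$$

where

$$x_a^k = \begin{cases} 1, & \text{if the chosen route contains arc } a, \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

The following theorem was proved in (Rosenthal, 1973).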

Theorem 3. In game $\Upsilon$, derived from network equilibrium models, pure-strategy Nash equilibria always exist. Furthermore, any solution to problem (21)–(23) is a pure-strategy Nash equilibrium in game $\Upsilon$.

Theorem 3 thus gave the first justification of the close connection between selfish routing and the user equilibrium of Wardrop.
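The mechanism behind Theorem 3 can be sketched in a few lines of Python. The network, the cost functions and all names below (`T`, `ROUTES`, `best_response_dynamics`) are hypothetical and chosen only for illustration; all individuals are assumed to share the same route set. The sketch uses the objective of (21) with the inner sum started at u = 1; the u = 0 term contributes the same constant for every assignment, so the minimizers do not change. Each improving unilateral switch lowers this sum, so the switching process must stop, and it can only stop at a pure-strategy Nash equilibrium.

```python
from collections import Counter

# Hypothetical instance (not from the paper): arc costs t_a(k) for k users.
T = {
    "a": lambda k: k,
    "b": lambda k: 2 * k,
    "c": lambda k: 3,
    "d": lambda k: k + 1,
}
# Routes are tuples of arcs; every individual may use any of them.
ROUTES = [("a", "c"), ("b",), ("a", "d")]


def loads(profile):
    """Number of individuals on each arc for a profile of route indices."""
    x = Counter()
    for r in profile:
        for a in ROUTES[r]:
            x[a] += 1
    return x


def route_cost(r, x):
    """Cost (20) of route r under arc loads x."""
    return sum(T[a](x[a]) for a in ROUTES[r])


def potential(profile):
    """Objective (21), with the inner sum started at u = 1."""
    x = loads(profile)
    return sum(T[a](u) for a in x for u in range(1, x[a] + 1))


def best_response_dynamics(profile):
    """Let individuals switch routes one at a time until nobody can improve."""
    profile = list(profile)
    history = [potential(profile)]
    improved = True
    while improved:
        improved = False
        for k in range(len(profile)):
            best_r = profile[k]
            best_c = route_cost(best_r, loads(profile))
            for r in range(len(ROUTES)):
                trial = profile[:k] + [r] + profile[k + 1:]
                c = route_cost(r, loads(trial))
                if c < best_c:
                    best_r, best_c = r, c
            if best_r != profile[k]:
                profile[k] = best_r
                history.append(potential(profile))
                improved = True
    return profile, history


def is_pure_nash(profile):
    """Check that no individual gains by a unilateral route switch."""
    for k in range(len(profile)):
        current = route_cost(profile[k], loads(profile))
        for r in range(len(ROUTES)):
            trial = list(profile[:k]) + [r] + list(profile[k + 1:])
            if route_cost(r, loads(trial)) < current:
                return False
    return True


equilibrium, history = best_response_dynamics([0, 0, 0, 0])
print("equilibrium routes:", equilibrium)
print("potential after each switch:", history)
```

Best-response switching is used here only as a convenient way to reach an equilibrium; it is not claimed to be the procedure of (Rosenthal, 1973).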

6. Game of Users' Groups

This section is devoted to the relationships between the user equilibrium of Wardrop, the Nash equilibrium (competition between several groups of users) and the system optimum of Wardrop. All results discussed here were obtained by A.Y. Krylatov et al. (Krylatov et al., 2016).

Both of Wardrop's principles are useful from a practical perspective. Their drawback is that they do not apply when, instead of atomic drivers, the behavior of groups of drivers (with a common group interest) has to be taken into account. With the rapid development of guidance systems, the need for a new assignment principle is therefore expected to grow significantly in the coming years. If by a "group of users" (UG) we mean the set of all drivers following the advice of the same guidance system, then a new principle could be formulated roughly as follows:
• Under competitive conditions the average journey time of each group of users is a minimum.
Note that the explicit mention of the competitive environment in this principle is important: without competitive behavior of the companies creating guidance systems, the second principle of Wardrop is sufficient for appropriate modeling. In what follows we associate this competitive traffic assignment principle with the competitive traffic assignment problem.

Consider the same transportation network represented by the oriented graph $G = (N, A)$, the set of OD pairs $W$ and the corresponding sets of routes $R^w$, $w \in W$. According to the competitive traffic assignment principle, each group tries to assign its users among the available routes from origin to destination in such a way that their average travel time is minimal. We introduce the following notation: $M = \{1, \dots, m\}$ is the set of users' groups (UG); $F^{jw} > 0$ is the demand of UG $j$ between OD pair $w$, $F^j = \sum_{w \in W} F^{jw}$; $x_a^j$ is the flow of UG $j$ through the arc $a \in A$, $x^j = (\dots, x_a^j, \dots)$, $x^{-j} = (x^1, \dots, x^{j-1}, x^{j+1}, \dots, x^m)$ and $x_a = (x_a^1, \dots, x_a^m)$; $f_r^{jw}$ is the flow of UG $j$ through route $r \in R^w$; $f^{jw} = (f_1^{jw}, \dots, f_{|R^w|}^{jw})^{\top}$ is the assignment of the flow $F^{jw}$ among the possible routes $R^w$; $f^j = (\dots, f^{jw}, \dots)$ is the strategy of UG $j$ (the assignment of the flows $F^{jw}$ between all OD-pairs), and $f^{-j} = (f^1, \dots, f^{j-1}, f^{j+1}, \dots, f^m)$; $f = (f^1, \dots, f^m)$ is the set of strategies of all UGs.

Each UG tries to minimize the average travel time of its own users. Therefore, the following optimization programs can be formulated for all $j = \overline{1, m}$ (Zakharov and Krylatov, 2016):

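$$T_m^j\big(x^{j*}, x^{-j}\big) = \min_{x^j} T_m^j(x) = \min_{x^j} \sum_{a \in A} t_a(x_a)\,x_a^j, \tag{24}$$

subject to

$$\sum_{r \in R^w} f_r^{jw} = F^{jw} \quad \forall w \in W, \tag{25}$$

$$f_r^{jw} \ge 0 \quad \forall r \in R^w,\ w \in W, \tag{26}$$

with definitional constraints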

$$x_a^j = \sum_{w \in W} \sum_{r \in R^w} f_r^{jw} \delta_{a,r}^w \quad \forall a \in A, \tag{27}$$

$$x_a = \sum_{j=1}^{m} x_a^j \quad \forall a \in A. \tag{28}$$

Note that for each $j \in M$ the set $x^{-j}$ is not fixed, but induced by the assignment decisions of the other UGs. Therefore, we obtain the competitive traffic assignment problem, which can be reformulated as a noncooperative network game with penalty functions $z^j$, $j = \overline{1, m}$: $\Gamma_m\big(M, \{F_m^j\}_{j \in M}, \{T_m^j\}_{j \in M}\big)$, where $F_m^j = \big\{ f^{jw} \mid f_r^{jw} \ge 0\ \forall r \in R^w,\ \sum_{r \in R^w} f_r^{jw} = F^{jw},\ \forall w \in W \big\}$, when

$$x_a^j = \sum_{w \in W} \sum_{r \in R^w} f_r^{jw} \delta_{a,r}^w \quad \text{and} \quad x_a = \sum_{j=1}^{m} x_a^j.$$

Considering the competitive traffic assignment problem in game-theoretic form leads us to the search for a Nash equilibrium. A Nash equilibrium in the game $\Gamma_m$ is realized by strategies $x_m^{ne} = (x^{1*}, \dots, x^{m*})$ such that

$$T_m^j(x_m^{ne}) \le T_m^j\big(x^j, x^{-j*}\big) \quad \forall j \in M. \tag{29}$$

Theorem 4. The following inequalities hold:

$$T(x^{so}) \le T(x_m^{ne}) \le T(x^{ue}). \tag{30}$$

Corollary 1. If $t_a(x_a) \ne \mathrm{const}$, then the following inequalities hold:

$$T(x^{so}) = T(x_1^{ne}) \ \dots$$

1. Competitive in-vehicle route guidance systems decrease the average travel time in an urban traffic area in comparison with atomic vehicle guidance.
2. The fewer competitive in-vehicle guidance systems there are, the lower the average travel time in an urban traffic area.
3. A centralized guidance system guarantees the least travel time in an urban traffic area.

Therefore, since drivers in modern cities choose their routes on their own, even competitive guidance systems could decrease the average travel time. Consequently, from this perspective, the development of intelligent vehicles could significantly improve traffic conditions in urban areas.
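Inequalities (30) can be checked numerically on a toy instance. The sketch below rests on assumptions that are not taken from the paper: one OD pair with total demand 1, two parallel arcs with $t_1(x) = x$ and $t_2(x) = 1$, and m identical groups with demand 1/m each. Under these assumptions the best response of a group to the flow s of the other groups on arc 1 is $(1 - s)/2$, clipped to the group's demand, and repeated best-response sweeps are used as a simple way to approximate the Nash equilibrium of $\Gamma_m$; the function name `nash_total_time` is hypothetical.

```python
def nash_total_time(m, sweeps=2000):
    """Total travel time at the Nash equilibrium of m identical groups.

    Hypothetical instance: demand 1 split equally among m groups,
    arc 1 with t_1(x) = x, arc 2 with t_2(x) = 1.
    """
    d = 1.0 / m                      # demand of each group
    y = [0.0] * m                    # flow of each group on arc 1
    for _ in range(sweeps):
        for j in range(m):
            s = sum(y) - y[j]        # flow of the other groups on arc 1
            # Group j minimizes y_j * (y_j + s) + (d - y_j) * 1 over [0, d];
            # the unconstrained minimizer is (1 - s) / 2.
            y[j] = min(d, max(0.0, (1.0 - s) / 2.0))
    x1 = sum(y)
    return x1 * x1 + (1.0 - x1) * 1.0


T_so = 0.75   # system optimum of this instance (equal split of the demand)
T_ue = 1.0    # user equilibrium of this instance (all flow on arc 1)

for m in (1, 2, 3, 5, 10):
    print(m, round(nash_total_time(m), 6))
```

In this instance a single group reproduces the system optimum, and the total travel time grows toward the user-equilibrium value as the number of competing groups increases, which is consistent with (30) and with the three conclusions listed above.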

7. Conclusion

This paper discussed approaches to modeling selfish and group routing. Surprisingly, the first behavioral model of selfish routing appeared in the 1950s, but no methodological innovations have appeared since then. A significant contribution was made by T. Roughgarden in the 2000s. New relationships between the user equilibrium of Wardrop, the Nash equilibrium (competition between groups of users) and the system optimum of Wardrop were established in the 2010s. Thus, there is a clear opportunity to develop selfish and group routing models on the basis of the game-theoretic approach.

Acknowledgments. The first author was jointly supported by a grant from the Russian Science Foundation (No. 17-11-01079, Optimal Behavior in Conflict-Controlled Systems).

References

Altman, E., T. Basar, T. Jimenez and N. Shimkin (2002). Competitive routing in networks with polynomial costs. IEEE Transactions on Automatic Control, 47(1), 92–96.
Altman, E., R. Combes, Z. Altman and S. Sorin (2011). Routing games in the many players regime. Proceedings of the 5th International ICST Conference on Performance Evaluation Methodologies and Tools, 525–527.
Altman, E. and H. Kameda (2005). Equilibria for multiclass routing problems in multi-agent networks. Advances in Dynamic Games, 7, 343–367.
Beckmann, M. J., C. B. McGuire and C. B. Winsten (1956). Studies in the Economics of Transportation. Yale University Press: New Haven, CT.
Bonsall, P. (1992). The influence of route guidance advice on route choice in urban networks. Transportation, 19, 1–23.
Charnes, A. and W. W. Cooper (1958). Extremal principles for simulating traffic flow in a network. Proceedings of the National Academy of Science of the United States of America, 44, 201–204.
Dafermos, S. C. (1971). An extended traffic assignment model with applications to two-way traffic. Transportation Science, 5, 366–389.
Dafermos, S. C. and F. T. Sparrow (1969). The traffic assignment problem for a general network. Journal of Research of the National Bureau of Standards, 73B, 91–118.
Devarajan, S. (1981). A note on network equilibrium and noncooperative games. Transportation Research, 15B, 421–426.
Fisk, C. S. (1984). Game theory and transportation systems modelling. Transportation Research, 18B, 301–313.
Gartner, N. H. (1980). Optimal traffic assignment with elastic demands: a review. Part I. Analysis framework. Transportation Science, 14(2), 174–191.
Haurie, A. and P. Marcotte (1985). On the relationship between Nash-Cournot and Wardrop equilibria. Networks, 15, 295–308.
Korilis, Y. A. and A. A. Lazar (1995). On the existence of equilibria in noncooperative optimal flow control. Journal of the Association for Computing Machinery, 42(3), 584–613.
Korilis, Y. A., A. A. Lazar and A. Orda (1995). Architecting noncooperative networks. IEEE J. Selected Areas Commun, 13, 1241–1251.
Krylatov, A. Y., V. V. Zakharov and I. G. Malygin (2016). Competitive Traffic Assignment in Road Networks. Transport and Telecommunication, 17(3), 212–221.
La, R. J. and V. Anantharam (1997). Optimal routing control: game theoretic approach. Proc. of the 36th IEEE Conference on Decision and Control, 2910–2915.
Nash, J. (1951). Non-cooperative games. Annals of Mathematics, 54, 286–295.

Orda, A., R. Rom and N. Shimkin (1993). Competitive routing in multiuser communication networks. IEEE/ACM Transactions on Networking, 1(5), 510–521.
Patriksson, M. (1994). The traffic assignment problem: models and methods. VSP Publishers: Utrecht, Netherlands.
Patriksson, M. (2015). The traffic assignment problem: models and methods. Dover Publications, Inc: N.Y., USA.
Rosenthal, R. W. (1973). The network equilibrium problem in integers. Networks, 3, 53–59.
Roughgarden, T. (2005). Selfish Routing and the Price of Anarchy. MIT Press.
Sheffi, Y. (1985). Urban transportation networks: equilibrium analysis with mathematical programming methods. Prentice-Hall, Inc: N.J., USA.
Wardrop, J. G. (1952). Some theoretical aspects of road traffic research. Proc. Institution of Civil Engineers, 2, 325–378.
Xie, J., N. Yu and X. Yang (2013). Quadratic approximation and convergence of some bush-based algorithms for the traffic assignment problem. Transportation Research Part B, 56, 15–30.
Zakharov, V. and A. Krylatov (2016). Competitive routing of traffic flows by navigation providers. Automation and Remote Control, 77(1), 179–189.
Zheng, H. and S. Peeta (2014). Cost scaling based successive approximation algorithm for the traffic assignment problem. Transportation Research Part B, 68, 17–30.

Contributions to Game Theory and Management, X, 175–184

Stationary Nash Equilibria for Two-Player Average Stochastic Games with Finite State and Action Spaces

Dmitrii Lozovanu¹ and Stefan Pickl²

¹ Institute of Mathematics and Computer Science of Moldova Academy of Sciences, Academiei 5, Chisinau, MD-2028, Moldova
² Institute for Theoretical Computer Science, Mathematics and Operations Research, Universität der Bundeswehr München, 85577 Neubiberg-München, Germany

Abstract. The problem of the existence and determination of stationary Nash equilibria in two-player average stochastic games with finite state and action spaces is considered. We show that an arbitrary two-player average stochastic game can be formulated in terms of stationary strategies where each payoff is graph-continuous and quasimonotonic with respect to the player's strategies. Based on this result we substantiate an approach for determining the optimal stationary strategies of the players in the considered games. Moreover, based on the proposed approach, a new proof of the existence of stationary Nash equilibria in two-player average stochastic games is derived, and the known methods for determining optimal strategies in games with quasimonotonic payoffs can be applied.

Keywords: two-player stochastic games, average payoffs, stationary Nash equilibria, optimal stationary strategies

1. Introduction

The aim of this paper is to propose a new approach for determining stationary Nash equilibria in two-player average stochastic games with finite state and action spaces. We substantiate this approach by using a new model in stationary strategies for the considered class of average stochastic games. We show that the payoffs of the players in the proposed model are quasimonotonic (i.e. quasiconvex and quasiconcave) with respect to the corresponding strategies of the players and satisfy the graph-continuity property in the sense of Dasgupta and Maskin (1986). Based on these results, a new proof of the existence of stationary Nash equilibria in the considered two-player average stochastic games is obtained and a new approach for determining the optimal stationary strategies of the players is proposed.

Note that two-player stochastic games with average and discounted payoffs have been studied by Mertens and Neyman (1981), Vieille (2000) and Solan and Vieille (2010), who proved the existence of stationary Nash equilibria and proposed computing procedures for determining the optimal stationary strategies of the players in two-player stochastic games. The approach we propose for two-player average stochastic games differs from those mentioned above, and it can be extended to n-player average stochastic games if the graph-continuity property of the payoffs holds. However, the graph-continuity property may fail for average stochastic games with $n \ge 3$ players. It is well known that for an n-player average stochastic game ($n \ge 3$) a stationary Nash equilibrium may not exist. This was shown by Flesch, Thuijsman and Vrieze (1997), who constructed an example of a 3-player average stochastic game with a fixed starting state for which a stationary Nash equilibrium does not exist. Tijs and Vrieze (1986) showed that for an arbitrary average stochastic game with a finite set of states there always exists a nonempty subset of starting states for which a stationary Nash equilibrium exists. In the general case, determining the states of an average stochastic game for which stationary Nash equilibria exist is an open problem.

2. A Two-Player Average Stochastic Game in Stationary Strategies

We first present the framework of a two-person stochastic game and then specify the formulation of stochastic games with average payoffs when players use pure and mixed stationary strategies.

2.1. The Framework of a Two-Person Stochastic Game

A stochastic game with two players consists of the following elements:
- a state space $X$ (which we assume to be finite);
- a finite set $A^1(x)$ of actions of player 1 for an arbitrary state $x \in X$;
- a finite set $A^2(x)$ of actions of player 2 for an arbitrary state $x \in X$;
- a payoff $f^1(x, a)$ with respect to player 1 for each state $x \in X$ and for an arbitrary action vector $a = (a^1, a^2) \in A^1(x) \times A^2(x)$;
- a payoff $f^2(x, a)$ with respect to player 2 for each state $x \in X$ and for an arbitrary action vector $a = (a^1, a^2) \in A^1(x) \times A^2(x)$;
- a transition probability function $p : X \times \prod_{x \in X} \big(A^1(x) \times A^2(x)\big) \times X \to [0, 1]$ that gives the transition probabilities $p_{x,y}^a$ from an arbitrary $x \in X$ to an arbitrary $y \in X$ for every action vector $a = (a^1, a^2) \in A^1(x) \times A^2(x)$, where $\sum_{y \in X} p_{x,y}^a = 1$, $\forall x \in X$, $a \in A^1(x) \times A^2(x)$;
- a starting state $x_0 \in X$.

The game starts in the state $x_0$ and the play proceeds in a sequence of stages. At stage $t$ the players observe the state $x_t$ and simultaneously and independently choose actions $a_t^i \in A^i(x_t)$, $i = 1, 2$. Then nature selects a state $y = x_{t+1}$ according to the transition probabilities $p_{x_t,y}^{a_t}$ for the given action vector $a_t = (a_t^1, a_t^2)$. Such a play of the game produces a sequence of states and actions $x_0, a_0, x_1, a_1, \dots, x_t, a_t, \dots$ that defines a stream of stage payoffs $f_t^1 = f^1(x_t, a_t)$, $f_t^2 = f^2(x_t, a_t)$, $t = 0, 1, 2, \dots$. The infinite average stochastic game is the game with the following payoffs of the players:

$$\omega_{x_0}^i = \liminf_{t \to \infty} \mathrm{E}\Big( \frac{1}{t} \sum_{\tau=0}^{t-1} f_\tau^i \Big), \quad i = 1, 2,$$

where $\omega_{x_0}^i$ expresses the average payoff per transition of player $i$ in an infinite game. Each player aims to maximize his average payoff per transition. In the case of a single player ($i = 1$) this game becomes the average Markov decision problem with a transition probability function $p : X \times \prod_{x \in X} A(x) \times X \to [0, 1]$ and immediate rewards $f(x, a) = f^1(x, a)$ in the states $x \in X$ for given actions $a \in A(x) = A^1(x)$.

In this paper we study stochastic games in which the players use pure and mixed stationary strategies for selecting actions in the states.

2.2. Pure and mixed stationary strategies of the players

A strategy of player $i \in \{1, 2\}$ in a stochastic game is a mapping $s^i$ that for every state $x_t \in X$ provides a probability distribution over the set of actions $A^i(x_t)$. If these probabilities take only the values 0 and 1, then $s^i$ is called a pure strategy; otherwise $s^i$ is called a mixed strategy. If these probabilities depend only on the state $x_t = x \in X$ (i.e. $s^i$ does not depend on $t$), then $s^i$ is called a stationary strategy; otherwise $s^i$ is called a non-stationary strategy. This means that a pure stationary strategy of player $i \in \{1, 2\}$ can be regarded as a map

$$s^i : x \to a^i \in A^i(x) \quad \text{for } x \in X$$

that determines for each state $x$ an action $a^i \in A^i(x)$, i.e. $s^i(x) = a^i$. Obviously, the corresponding sets of pure stationary strategies $S^1, S^2$ of the players in the game with finite state and action spaces are finite sets.

In the following we identify a pure stationary strategy $s^i(x)$ of player $i$ with the set of boolean variables $s_{x,a^i}^i \in \{0, 1\}$, where for a given $x \in X$ we have $s_{x,a^i}^i = 1$ if and only if player $i$ fixes the action $a^i \in A^i(x)$. So, we can represent the set of pure stationary strategies $S^i$ of player $i$ as the set of solutions of the following system:

$$\begin{cases} \sum\limits_{a^i \in A^i(x)} s_{x,a^i}^i = 1, & \forall x \in X; \\ s_{x,a^i}^i \in \{0, 1\}, & \forall x \in X,\ \forall a^i \in A^i(x). \end{cases}$$

If in this system we replace the restriction $s_{x,a^i}^i \in \{0, 1\}$ for $x \in X$, $a^i \in A^i(x)$ by the condition $0 \le s_{x,a^i}^i \le 1$, then we obtain the set of stationary strategies in the sense of Shapley (1953), where $s_{x,a^i}^i$ is treated as the probability that player $i$ chooses the action $a^i$ every time the state $x$ is reached by any route in the dynamic stochastic game. Thus, we can identify the set of mixed stationary strategies of the players with the set of solutions of the system

$$\begin{cases} \sum\limits_{a^i \in A^i(x)} s_{x,a^i}^i = 1, & \forall x \in X; \\ s_{x,a^i}^i \ge 0, & \forall x \in X,\ \forall a^i \in A^i(x), \end{cases} \tag{1}$$

and for a given profile $s = (s^1, s^2)$ of mixed strategies $s^1, s^2$ of the players the transition probability $p_{x,y}^s$ from a state $x$ to a state $y$ can be calculated as follows:

$$p_{x,y}^s = \sum_{(a^1, a^2) \in A(x)} s_{x,a^1}^1\, s_{x,a^2}^2\, p_{x,y}^{(a^1, a^2)}. \tag{2}$$

In the sequel we will distinguish stochastic games in pure and mixed stationary strategies.

2.3. Average stochastic games in pure stationary strategies

Let $s = (s^1, s^2)$ be a profile of pure stationary strategies of the players and denote by $a(s) = (a^1(s), a^2(s)) \in A^1(x) \times A^2(x)$ the action vector that corresponds to $s$ and determines the probability distributions $p_{x,y}^s = p_{x,y}^{a(s)}$ in the states $x \in X$. Then the average payoffs per transition $\omega_{x_0}^1(s)$, $\omega_{x_0}^2(s)$ for the players are determined as follows:

$$\omega_{x_0}^i(s) = \sum_{y \in X} q_{x_0,y}^s\, f^i(y, a(s)), \quad i = 1, 2,$$

where $q_{x_0,y}^s$ represent the limiting probabilities in the states $y \in X$ for the Markov process with probability transition matrix $P^s = (p_{x,y}^s)$ when the transitions start in $x_0$. So, if for the Markov process with probability matrix $P^s$ the corresponding limiting probability matrix $Q^s = (q_{x,y}^s)$ is known, then $\omega_x^1, \omega_x^2$ can be determined for an arbitrary starting state $x \in X$ of the game. The functions $\omega_{x_0}^1(s)$, $\omega_{x_0}^2(s)$ on $S = S^1 \times S^2$ define a game in normal form that we denote by $\langle \{S^i\}_{i=1,2}, \{\omega_{x_0}^i(s)\}_{i=1,2} \rangle$. This game corresponds to an average stochastic game in pure stationary strategies that in extended form is determined by the tuple $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x, a)\}_{i=1,2}, p, x_0)$.

If an arbitrary profile $s = (s^1, s^2)$ of pure stationary strategies in a stochastic game induces a probability matrix $P^s$ that corresponds to a Markov unichain, then we say that the game possesses the unichain property and, for short, call it a unichain stochastic game; otherwise we call it a multichain stochastic game.

2.4. Average stochastic games in mixed stationary strategies

Let $s = (s^1, s^2)$ be a profile of mixed stationary strategies of the players. Then the elements of the probability transition matrix $P^s = (p_{x,y}^s)$ of the Markov process induced by $s$ can be calculated according to (2). Therefore, if $Q^s = (q_{x,y}^s)$ is the limiting probability matrix of $P^s$, then the average payoffs per transition $\omega_{x_0}^1(s)$, $\omega_{x_0}^2(s)$ for the players are determined as follows:

$$\omega_{x_0}^i(s) = \sum_{y \in X} q_{x_0,y}^s\, f^i(y, s), \quad i = 1, 2, \tag{3}$$

where

$$f^i(y, s) = \sum_{(a^1, a^2) \in A(y)} s_{y,a^1}^1\, s_{y,a^2}^2\, f^i(y, (a^1, a^2)) \tag{4}$$

expresses the average payoff (immediate reward) of player $i$ in the state $y \in X$ when the corresponding stationary strategies $s^1, s^2$ have been applied by players 1 and 2 in $y$.

Let $S^1, S^2$ be the corresponding sets of mixed stationary strategies for players 1 and 2, i.e. each $S^i$ for $i \in \{1, 2\}$ represents the set of solutions of system (1). The functions $\omega_{x_0}^1(s)$, $\omega_{x_0}^2(s)$ on $S = S^1 \times S^2$, defined according to (3), (4), determine a game in normal form that we denote by $\langle \{S^i\}_{i=1,2}, \{\omega_{x_0}^i(s)\}_{i=1,2} \rangle$. This game corresponds to an average stochastic game in mixed stationary strategies that in extended form is determined by the tuple $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x, a)\}_{i=1,2}, p, x_0)$.

2.5. Average stochastic games with random starting state

In this paper we also consider average stochastic games in which the starting state is chosen randomly according to a given distribution $\{\theta_x\}$ on $X$. So, for a given stochastic game we assume that the play starts in the states $x \in X$ with probabilities $\theta_x > 0$, where $\sum_{x \in X} \theta_x = 1$. If the players use mixed stationary strategies for selecting actions in the states, then the payoff functions

$$\psi^i_\theta(s^1, s^2) = \sum_{x \in X} \theta_x\, \omega^i_x(s^1, s^2), \quad i = 1, 2$$

on $S = S^1 \times S^2$ define a game in normal form $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$ that in extended form is determined by $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, \{\theta_x\})$. In the case $\theta_x = 0$, $\forall x \in X \setminus \{x_0\}$, $\theta_{x_0} = 1$ the considered game becomes a stochastic game with fixed starting state $x_0$.
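The definitions (3), (4) and the payoff $\psi^i_\theta$ can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the toy two-state game, the function names (`induced_matrix`, `limiting_matrix`, `average_payoffs`, `psi_theta`) and the data layout are hypothetical, and the limiting matrix $Q^s$ is approximated by a Cesaro average of powers of $P^s$, which is a simple stand-in chosen here rather than a method taken from the paper.

```python
def induced_matrix(p, s1, s2):
    """P^s[x][y] = sum over (a1, a2) of s1[x][a1] * s2[x][a2] * p[x][(a1, a2)][y]."""
    n = len(p)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for (a1, a2), row in p[x].items():
            w = s1[x][a1] * s2[x][a2]
            for y in range(n):
                P[x][y] += w * row[y]
    return P


def limiting_matrix(P, horizon=20000):
    """Approximate Q by the Cesaro average (1/T) * sum_{t<T} P^t (error of order 1/T)."""
    n = len(P)
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(horizon):
        for i in range(n):
            for j in range(n):
                acc[i][j] += power[i][j]
        power = [[sum(power[i][k] * P[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return [[v / horizon for v in row] for row in acc]


def average_payoffs(p, f_i, s1, s2):
    """Return [omega^i_x(s) for every starting state x], following (3) and (4)."""
    n = len(p)
    Q = limiting_matrix(induced_matrix(p, s1, s2))
    stage = [sum(s1[y][a1] * s2[y][a2] * r for (a1, a2), r in f_i[y].items())
             for y in range(n)]
    return [sum(Q[x][y] * stage[y] for y in range(n)) for x in range(n)]


def psi_theta(theta, omega):
    """psi^i_theta = sum_x theta_x * omega^i_x."""
    return sum(t * w for t, w in zip(theta, omega))


# Hypothetical toy data: two states, two actions (0 and 1) per player in each state.
pairs = [(a1, a2) for a1 in (0, 1) for a2 in (0, 1)]
p = [
    {pair: ([0.0, 1.0] if pair[0] == pair[1] else [1.0, 0.0]) for pair in pairs},
    {pair: ([1.0, 0.0] if pair[0] == pair[1] else [0.0, 1.0]) for pair in pairs},
]
f1 = [
    {pair: (1.0 if pair[0] == pair[1] else 0.0) for pair in pairs},
    {pair: 3.0 for pair in pairs},
]
uniform = [[0.5, 0.5], [0.5, 0.5]]  # mixed stationary strategy: s[x][a]

omega1 = average_payoffs(p, f1, uniform, uniform)
psi1 = psi_theta([0.3, 0.7], omega1)
```

With uniform mixing every row of the induced matrix is (0.5, 0.5), so the expected stage rewards 0.5 and 3 are averaged with equal weights and both `omega1` entries are close to 1.75, as is `psi1`.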

3. Some Auxiliary Results

In this section we present some auxiliary results for the average Markov decision problem in the terms of stationary strategies and some auxiliary results related to the existence of pure-strategy Nash equilibria in n-person games.

3.1. Optimal Stationary Policies in the Average Markov Decision Problem

It is well-known that an optimal stationary policy (strategy) for the average Markov decision problem can be found by using the following linear programming model (see Puterman, 2005): Maximize

$$\psi(\alpha, \beta) = \sum_{x \in X} \sum_{a \in A(x)} f(x,a)\, \alpha_{x,a} \tag{5}$$

subject to

$$\begin{cases}
\sum\limits_{a \in A(y)} \alpha_{y,a} - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, \alpha_{x,a} = 0, & \forall y \in X; \\
\sum\limits_{a \in A(y)} \alpha_{y,a} + \sum\limits_{a \in A(y)} \beta_{y,a} - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, \beta_{x,a} = \theta_y, & \forall y \in X; \\
\alpha_{x,a} \ge 0, \ \beta_{x,a} \ge 0, & \forall x \in X, \ a \in A(x),
\end{cases} \tag{6}$$

where $\theta_y$ for $y \in X$ represent arbitrary positive values that satisfy the condition $\sum_{y \in X} \theta_y = 1$ and are treated as the probabilities of choosing the starting state $y \in X$. In the case $\theta_y = 1$ for $y = x_0$ and $\theta_y = 0$ for $y \in X \setminus \{x_0\}$ we obtain the linear programming model for an average Markov decision problem with fixed starting state $x_0$.

This linear programming model corresponds to a multichain case of an average Markov decision problem. If each stationary strategy in the decision problem induces an ergodic Markov chain then the restrictions (6) can be replaced by the restrictions

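$$\begin{cases}
\sum\limits_{a \in A(y)} \alpha_{y,a} - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, \alpha_{x,a} = 0, & \forall y \in X; \\
\sum\limits_{y \in X} \sum\limits_{a \in A(y)} \alpha_{y,a} = 1; \\
\alpha_{y,a} \ge 0, & \forall y \in X, \ a \in A(y).
\end{cases} \tag{7}$$

In the linear programming model (5),(6) the restrictions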

$$\sum_{a \in A(y)} \alpha_{y,a} + \sum_{a \in A(y)} \beta_{y,a} - \sum_{x \in X} \sum_{a \in A(x)} p^a_{x,y}\, \beta_{x,a} = \theta_y, \quad \forall y \in X$$

with the condition $\sum_{y \in X} \theta_y = 1$ generalize the constraint

$$\sum_{y \in X} \sum_{a \in A(y)} \alpha_{y,a} = 1$$

in linear programming model (5),(7) for the ergodic case.

The relationship between feasible solutions of problem (5),(6) and stationary strategies in the average Markov decision problem is the following: Let $(\alpha, \beta)$ be a feasible solution of the linear programming problem (5),(6) and denote $X_\alpha = \{x \in X \mid \sum_{a \in A(x)} \alpha_{x,a} > 0\}$. Then $(\alpha, \beta)$ possesses the property that $\sum_{a \in A(x)} \beta_{x,a} > 0$ for $x \in X \setminus X_\alpha$, and a stationary strategy $s_{x,a}$ that corresponds to $(\alpha, \beta)$ is determined as

$$s_{x,a} = \begin{cases}
\dfrac{\alpha_{x,a}}{\sum\limits_{a \in A(x)} \alpha_{x,a}} & \text{if } x \in X_\alpha; \\[3ex]
\dfrac{\beta_{x,a}}{\sum\limits_{a \in A(x)} \beta_{x,a}} & \text{if } x \in X \setminus X_\alpha,
\end{cases} \tag{8}$$

where $s_{x,a}$ expresses the probability of choosing the action $a \in A(x)$ in the state $x \in X$. Thus, $s$ can be regarded as a mapping that for every state $x \in X$ provides a probability distribution over the set of actions $A(x)$; if these probabilities take only values 0 and 1, then $s$ corresponds to a pure stationary strategy, otherwise it corresponds to a mixed stationary strategy.

Using the linear programming problem (5),(7) Lozovanu, 2016, showed that an average Markov decision problem in terms of stationary strategies can be formulated as follows: Maximize

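$$\psi(s, q, w) = \sum_{x \in X} \sum_{a \in A(x)} f(x,a)\, s_{x,a}\, q_x \tag{9}$$

subject to

$$\begin{cases}
q_y - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, s_{x,a}\, q_x = 0, & \forall y \in X; \\
q_y + w_y - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, s_{x,a}\, w_x = \theta_y, & \forall y \in X; \\
\sum\limits_{a \in A(y)} s_{y,a} = 1, & \forall y \in X; \\
s_{x,a} \ge 0, \ \forall x \in X, \ \forall a \in A(x); \quad w_x \ge 0, & \forall x \in X,
\end{cases} \tag{10}$$

where $\theta_y$ are the same values as in problem (5),(6) and $s_{x,a}$, $q_x$, $w_x$ for $x \in X$, $a \in A(x)$ represent the variables that must be found, where $q_x$ for $x \in X$ express the limiting probabilities in the states for the corresponding strategy $s$.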
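The passage from a feasible solution $(\alpha, \beta)$ of (5),(6) to a stationary strategy by formula (8) can be sketched in a few lines. This is an illustrative sketch only: the function name `strategy_from_lp_solution`, the dictionary layout and the sample numbers are assumptions made here, and the values of `alpha` and `beta` are not claimed to be feasible for any particular problem.

```python
def strategy_from_lp_solution(alpha, beta, tol=1e-12):
    """Apply formula (8): normalise alpha on X_alpha and beta on the remaining states.

    alpha[x][a] and beta[x][a] are nonnegative numbers indexed by state and action.
    """
    s = {}
    for x in alpha:
        alpha_total = sum(alpha[x].values())
        if alpha_total > tol:              # x belongs to X_alpha
            source, total = alpha[x], alpha_total
        else:                              # x belongs to X \ X_alpha
            source, total = beta[x], sum(beta[x].values())
            if total <= tol:
                raise ValueError(f"state {x!r}: both alpha and beta sums vanish")
        s[x] = {a: v / total for a, v in source.items()}
    return s


# Hypothetical numbers, used only to exercise the formula.
alpha = {0: {"a": 0.2, "b": 0.6}, 1: {"a": 0.0, "b": 0.0}}
beta = {0: {"a": 0.0, "b": 0.1}, 1: {"a": 1.0, "b": 3.0}}
s = strategy_from_lp_solution(alpha, beta)
```

In this example state 0 lies in $X_\alpha$ and is normalised from `alpha`, while state 1 falls back to `beta`; both states end up with probabilities 0.25 and 0.75 for the actions `a` and `b`.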

The main property that we shall use for the average stochastic game is given by the following theorem, which has been proven by Lozovanu, 2016.

Theorem 1. Let an average Markov decision problem be given and consider the function

$$\psi(s) = \sum_{x \in X} \sum_{a \in A(x)} f(x,a)\, s_{x,a}\, q_x, \tag{11}$$

where $q_x$ for $x \in X$ satisfy the condition

$$\begin{cases}
q_y - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, s_{x,a}\, q_x = 0, & \forall y \in X; \\
q_y + w_y - \sum\limits_{x \in X} \sum\limits_{a \in A(x)} p^a_{x,y}\, s_{x,a}\, w_x = \theta_y, & \forall y \in X.
\end{cases} \tag{12}$$

Then on the set $S$ of solutions of the system

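$$\begin{cases}
\sum\limits_{a \in A(x)} s_{x,a} = 1, & \forall x \in X; \\
s_{x,a} \ge 0, & \forall x \in X, \ a \in A(x)
\end{cases} \tag{13}$$

the function $\psi(s)$ depends only on $s_{x,a}$ for $x \in X$, $a \in A(x)$ and $\psi(s)$ is quasimonotone on $S$ (i.e. $\psi(s)$ is quasiconvex and quasiconcave on $S$).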
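The quasimonotonicity stated in Theorem 1 can be checked by hand on a very small instance. The sketch below is an illustration under assumptions, not part of the cited proof: it uses a hypothetical two-state unichain decision problem in which only state 0 offers a choice between two actions, so $S$ is a segment parametrised by the probability $t$ of the first action, and the stationary distribution of a two-state chain is taken in closed form. All numerical values and the names `psi` and `is_monotone` are invented for the example.

```python
def psi(t, a_first=0.9, a_second=0.2, b=0.5,
        r_first=1.0, r_second=4.0, r_state1=2.0):
    """Average reward when the first action is used with probability t in state 0.

    a_first, a_second: probability of moving 0 -> 1 under each action;
    b: probability of moving 1 -> 0 (state 1 has a single action).
    """
    a = t * a_first + (1 - t) * a_second   # induced probability of moving 0 -> 1
    r0 = t * r_first + (1 - t) * r_second  # induced expected reward in state 0
    q0, q1 = b / (a + b), a / (a + b)      # stationary distribution of the chain
    return q0 * r0 + q1 * r_state1


def is_monotone(values, tol=1e-12):
    """True if the sequence is nonincreasing or nondecreasing (up to tol)."""
    diffs = [v2 - v1 for v1, v2 in zip(values, values[1:])]
    return all(d <= tol for d in diffs) or all(d >= -tol for d in diffs)


samples = [psi(k / 100) for k in range(101)]
monotone_on_segment = is_monotone(samples)
```

Here $\psi(t) = (2.4 - 0.1t)/(0.7(1+t))$ is a ratio of two linear functions of $t$, hence monotone on $[0,1]$ although it is not linear, which is the behaviour that quasiconvexity together with quasiconcavity describes.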

Remark 1. The function (11) on $S$ depends only on $s_{x,a}$ for $x \in X$, $a \in A(x)$ because system (12) uniquely determines $q_x$, $\forall x \in X$ for a given $s \in S$.

3.2. Existence of Pure Nash equilibria in n-Player Games with Quasimonotonic Payoffs

Let $\langle \{S^i\}_{i=\overline{1,n}}, \{f^i(s)\}_{i=\overline{1,n}} \rangle$ be an n-player game in normal form, where $S^i$, $i = \overline{1,n}$ represent the corresponding sets of strategies (pure strategies) of the players $1, 2, \dots, n$ and $f^i : \prod_{j=1}^{n} S^j \to R^1$, $i = \overline{1,n}$ represent the corresponding payoffs of these players. Let $s = (s^1, s^2, \dots, s^n)$ be a profile of strategies of the players, $s \in S = \prod_{j=1}^{n} S^j$, and define $s^{-i} = (s^1, s^2, \dots, s^{i-1}, s^{i+1}, \dots, s^n)$, $S^{-i} = \prod_{j=1 (j \ne i)}^{n} S^j$, where $s^{-i} \in S^{-i}$. Thus, for an arbitrary $s \in S$ we can write $s = (s^i, s^{-i})$.

Fan, 1966 extended the well-known equilibrium result of Nash, 1951 to the games with quasiconcave payoffs. He proved the following theorem:

Theorem 2. Let $S^i$, $i = \overline{1,n}$ be non-empty, convex and compact sets. If each payoff $f^i : S \to R^1$, $i \in \{1, 2, \dots, n\}$, is continuous on $S$ and quasiconcave with respect to $s^i$ on $S^i$, then the game $\langle \{S^i\}_{i=\overline{1,n}}, \{f^i(s)\}_{i=\overline{1,n}} \rangle$ possesses a pure-strategy Nash equilibrium.

Dasgupta and Maskin, 1986 considered a class of games with discontinuous payoffs and proved a pure Nash equilibria existence result for the case when the payoffs are upper semi-continuous and graph-continuous.

The payoff $f^i : \prod_{j=1}^{n} S^j \to R^1$ is upper semi-continuous if for any sequence $\{s_k\} \subseteq S$ such that $\{s_k\} \to s$ it holds that $\limsup_{k \to \infty} f^i(s_k) \le f^i(s)$.

The payoff $f^i : \prod_{j=1}^{n} S^j \to R^1$ is graph-continuous if for all $\bar{s} \in S$ there exists a function $F^i : S^{-i} \to S^i$ with $F^i(\bar{s}^{-i}) = \bar{s}^i$ such that $f^i(F^i(s^{-i}), s^{-i})$ is continuous at $s^{-i} = \bar{s}^{-i}$.

Dasgupta and Maskin proved the following theorem.

Theorem 3. Let $S^i$, $i = \overline{1,n}$ be non-empty, convex and compact sets. If each payoff $f^i : S \to R^1$, $i \in \{1, 2, \dots, n\}$, is upper semi-continuous on $S$, graph-continuous and quasiconcave with respect to $s^i$ on $S^i$, then the game $\langle \{S^i\}_{i=\overline{1,n}}, \{f^i(s)\}_{i=\overline{1,n}} \rangle$ possesses a pure-strategy Nash equilibrium.

In the following we need to extend this theorem for the case when each payoff $f^i(s^i, s^{-i})$, $i = 1, 2, \dots, n$ is quasimonotonic with respect to $s^i$ on $S^i$, i.e. $f^i(s^i, s^{-i})$ is quasiconvex and quasiconcave with respect to $s^i$ on $S^i$. We can observe that in this case the reaction correspondences of the players

$$\varphi^i(s^{-i}) = \{\hat{s}^i \in S^i \mid f^i(\hat{s}^i, s^{-i}) = \max_{s^i \in S^i} f^i(s^i, s^{-i})\}, \quad i = 1, 2, \dots, n$$

are compact and convex valued and therefore the upper semi-continuity condition can be dropped. So, in this case the theorem can be formulated as follows.

Theorem 4. Let $S^i$, $i = \overline{1,n}$ be non-empty, convex and compact sets. If each payoff $f^i : S \to R^1$, $i \in \{1, 2, \dots, n\}$, is graph-continuous and quasimonotonic with respect to $s^i$ on $S^i$, then the game $\langle \{S^i\}_{i=\overline{1,n}}, \{f^i(s)\}_{i=\overline{1,n}} \rangle$ possesses a pure-strategy Nash equilibrium.

4. The Main Results

In this section we present the results concerning the existence and determination of stationary Nash equilibria for a two-player average stochastic game in stationary strategies. For this purpose we formulate the game in normal form.

4.1. The Game Model in Normal Form

The game model in normal form for the considered two-player average stochastic game is the following: Let $S^i$, $i \in \{1, 2\}$ be the set of solutions of the system

$$\begin{cases}
\sum\limits_{a^i \in A^i(x)} s^i_{x,a^i} = 1, & \forall x \in X; \\
s^i_{x,a^i} \ge 0, & \forall x \in X, \ \forall a^i \in A^i(x)
\end{cases} \tag{14}$$

that determines the set of stationary strategies of player $i$. Each $S^i$ is a convex compact set and an arbitrary extreme point corresponds to a basic solution $s^i$ of system (14), where $s^i_{x,a^i} \in \{0, 1\}$, $\forall x \in X$, $a^i \in A^i(x)$, i.e. such a solution corresponds to a pure stationary strategy of player $i$. On the set $S = S^1 \times S^2$ we define the payoff functions

$$\psi^i_\theta(s^1, s^2) = \sum_{x \in X} \ \sum_{(a^1,a^2) \in A^1(x) \times A^2(x)} s^1_{x,a^1}\, s^2_{x,a^2}\, f^i(x,(a^1,a^2))\, q_x, \quad i = 1, 2 \tag{15}$$

where $q_x$ for $x \in X$ are determined uniquely from the following system of linear equations

$$\begin{cases}
q_y - \sum\limits_{x \in X} \ \sum\limits_{(a^1,a^2) \in A^1(x) \times A^2(x)} s^1_{x,a^1}\, s^2_{x,a^2}\, p^{(a^1,a^2)}_{x,y}\, q_x = 0, & \forall y \in X; \\
q_y + w_y - \sum\limits_{x \in X} \ \sum\limits_{(a^1,a^2) \in A^1(x) \times A^2(x)} s^1_{x,a^1}\, s^2_{x,a^2}\, p^{(a^1,a^2)}_{x,y}\, w_x = \theta_y, & \forall y \in X,
\end{cases} \tag{16}$$

for an arbitrary fixed profile $s = (s^1, s^2) \in S$. The functions $\psi^i_\theta(s^1, s^2)$, $i = 1, 2$ represent the payoff functions for the average stochastic game in normal form that we denote by $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$. This game is determined by the tuple $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, \{\theta_y\})$ where $\theta_y$ for $y \in X$ are given nonnegative values such that $\sum_{y \in X} \theta_y = 1$.

If $\theta_y = 0$, $\forall y \in X \setminus \{x_0\}$ and $\theta_{x_0} = 1$ then we obtain an average stochastic game in normal form $\langle \{S^i\}_{i=1,2}, \{\omega^i_{x_0}(s)\}_{i=1,2} \rangle$ when the starting state $x_0$ is fixed, i.e. $\psi^i_\theta(s^1, s^2) = \omega^i_{x_0}(s^1, s^2)$, $i = 1, 2$. So, in this case the game is determined by $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, x_0)$.

If $\theta_y > 0$, $\forall y \in X$ and $\sum_{y \in X} \theta_y = 1$ then we obtain an average stochastic game when the play starts in the states $y \in X$ with probabilities $\theta_y$. In this case for the payoffs of the players in the game in normal form we have

$$\psi^i_\theta(s^1, s^2) = \sum_{y \in X} \theta_y\, \omega^i_y(s^1, s^2), \quad i = 1, 2. \tag{17}$$

4.2. Stationary Nash Equilibria Existence Results

As we have noted, the existence of stationary Nash equilibria in two-player average stochastic games has been shown by Vieille, 2000. Here we show that this result can also be derived from the following theorem.

Theorem 5. The game $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$ possesses a pure-strategy Nash equilibrium $s^* = (s^{1*}, s^{2*})$ which is a stationary Nash equilibrium for the two-player average stochastic game determined by $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, \{\theta_y\})$. Moreover, if $s^* = (s^{1*}, s^{2*})$ is a pure-strategy Nash equilibrium for the game $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$, where $\theta_y > 0$, $\forall y \in X$, then $s^*$ is a stationary Nash equilibrium for the two-player average stochastic game $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, y)$ with an arbitrary starting state $y \in X$.

Proof. The existence of a pure-strategy Nash equilibrium for the game $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$ follows from Theorem 4. Indeed, according to Theorem 1 the payoff $\psi^1_\theta(s^1, s^2)$ is quasimonotonic with respect to $s^1$ on $S^1$ for a fixed $s^2 \in S^2$ and the payoff $\psi^2_\theta(s^1, s^2)$ is quasimonotonic with respect to $s^2$ on $S^2$ for a fixed $s^1 \in S^1$. The graph-continuity of the payoff functions also follows from Theorem 1 (see the proof of the theorem in Lozovanu, 2016). Note that the graph-continuity of the payoffs holds only for two-player games.

Now let us prove the second part of the theorem. Let $s^* = (s^{1*}, s^{2*})$ be a pure-strategy Nash equilibrium for the game $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$ determined by $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, \{\theta_y\})$, where $\theta_y > 0$, $\forall y \in X$, $\sum_{y \in X} \theta_y = 1$. Then $s^* = (s^{1*}, s^{2*})$ is a Nash equilibrium for the average stochastic game $\langle \{S^i\}_{i=1,2}, \{\psi^i_{\theta'}(s)\}_{i=1,2} \rangle$ with an arbitrary distribution $\{\theta'_y\}$ on $X$, where $\theta'_y > 0$, $\forall y \in X$, $\sum_{y \in X} \theta'_y = 1$, i.e.

$$\psi^i_{\theta'}(s^{i*}, s^{-i*}) \ge \psi^i_{\theta'}(s^i, s^{-i*}), \quad \forall s^i \in S^i, \ i = 1, 2.$$

If here we express $\psi^i_{\theta'}$ via $\omega^i_y$ using (17) then we obtain

$$\sum_{y \in X} \theta'_y \left( \omega^i_y(s^{i*}, s^{-i*}) - \omega^i_y(s^i, s^{-i*}) \right) \ge 0, \quad \forall s^i \in S^i, \ i = 1, 2.$$

This property holds for arbitrary $\theta'_y > 0$, $\forall y \in X$ such that $\sum_{y \in X} \theta'_y = 1$ and therefore for an arbitrary $y \in X$ we have

$$\omega^i_y(s^{i*}, s^{-i*}) - \omega^i_y(s^i, s^{-i*}) \ge 0, \quad \forall s^i \in S^i, \ i = 1, 2.$$

So, $s^* = (s^{1*}, s^{2*})$ is a Nash equilibrium for an arbitrary average stochastic game $\langle \{S^i\}_{i=1,2}, \{\omega^i_y(s)\}_{i=1,2} \rangle$ with an arbitrary starting state $y \in X$.

Remark 2. The graph-continuity of the payoffs may fail to hold in the case of $n > 2$ players, and therefore Theorem 5 cannot be extended to general n-player games.

So, the optimal stationary strategies in a two-player average stochastic game determined by $(X, \{A^i(x)\}_{i=1,2}, \{f^i(x,a)\}_{i=1,2}, p, \{\theta_y\})$ can be found by finding the optimal strategies of the game $\langle \{S^i\}_{i=1,2}, \{\psi^i_\theta(s)\}_{i=1,2} \rangle$, where the strategy sets $S^1, S^2$ and the payoff functions $\psi^1_\theta(s), \psi^2_\theta(s)$ are determined according to (14)-(16).
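For very small games the Nash condition for the normal-form game can be tested directly. The sketch below is a hypothetical illustration, not the authors' procedure: it checks a candidate stationary profile only against pure stationary deviations, which suffices under Theorem 1 because a quasiconvex payoff attains its maximum over the polytope $S^i$ at an extreme point. The game data (a matching-pennies stage in state 0 followed by a forced return from state 1), the function names, and the Cesaro approximation of the limiting matrix are all assumptions made for the example.

```python
from itertools import product


def limiting_matrix(P, horizon=2000):
    """Approximate the limiting matrix by the Cesaro average of P^0, ..., P^(T-1)."""
    n = len(P)
    power = [[float(i == j) for j in range(n)] for i in range(n)]
    acc = [[0.0] * n for _ in range(n)]
    for _ in range(horizon):
        for i in range(n):
            for j in range(n):
                acc[i][j] += power[i][j]
        power = [[sum(power[i][k] * P[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
    return [[v / horizon for v in row] for row in acc]


def payoff(game, s1, s2, i):
    """psi^i_theta(s1, s2) as in (15), with q_x = sum_y theta_y * Q[y][x]."""
    n = len(game["p"])
    P = [[0.0] * n for _ in range(n)]
    reward = [0.0] * n
    for x in range(n):
        for (a1, a2), row in game["p"][x].items():
            w = s1[x][a1] * s2[x][a2]
            reward[x] += w * game["f"][i][x][(a1, a2)]
            for y in range(n):
                P[x][y] += w * row[y]
    Q = limiting_matrix(P)
    q = [sum(game["theta"][y] * Q[y][x] for y in range(n)) for x in range(n)]
    return sum(q[x] * reward[x] for x in range(n))


def pure_strategies(actions):
    """Enumerate pure stationary strategies as 0/1 distributions per state."""
    for choice in product(*actions):
        yield [{a: float(a == c) for a in acts} for acts, c in zip(actions, choice)]


def is_stationary_nash(game, s1, s2, tol=1e-6):
    """Check that no pure stationary deviation improves either player's payoff."""
    base1, base2 = payoff(game, s1, s2, 0), payoff(game, s1, s2, 1)
    if any(payoff(game, d, s2, 0) > base1 + tol for d in pure_strategies(game["A1"])):
        return False
    return not any(payoff(game, s1, d, 1) > base2 + tol
                   for d in pure_strategies(game["A2"]))


# Hypothetical game: matching pennies in state 0, then a forced move back from state 1.
stage0 = {(a1, a2): (1.0 if a1 == a2 else -1.0) for a1 in "HT" for a2 in "HT"}
game = {
    "A1": [["H", "T"], ["w"]],
    "A2": [["H", "T"], ["w"]],
    "p": [{pair: [0.0, 1.0] for pair in stage0}, {("w", "w"): [1.0, 0.0]}],
    "f": [[stage0, {("w", "w"): 0.0}],
          [{pair: -r for pair, r in stage0.items()}, {("w", "w"): 0.0}]],
    "theta": [0.5, 0.5],
}
mixed = [{"H": 0.5, "T": 0.5}, {"w": 1.0}]
pure_h = [{"H": 1.0, "T": 0.0}, {"w": 1.0}]
mixed_is_nash = is_stationary_nash(game, mixed, mixed)
pure_is_nash = is_stationary_nash(game, pure_h, pure_h)
```

In this toy game uniform mixing in state 0 passes the check, while the profile in which both players always play `H` fails it because the second player gains by switching to `T`. The enumeration grows exponentially with the number of states, so the sketch is meant for inspection of small examples only.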

5. Conclusion

For a two-player average stochastic game a stationary Nash equilibrium exists and the optimal stationary strategies of the players can be found by determining the optimal pure strategies for the game in normal form presented in this paper.

References

Dasgupta, P. and Maskin, E. (1986). The existence of equilibrium in discontinuous economic games. Review of Economic Studies, 53, 1–26.
Fan, K. (1966). Applications of a theorem concerning sets with convex sections. Math. Ann., 163, 189–203.
Flesch, J., Thuijsman, F. and Vrieze, K. (1997). Cyclic Markov equilibria in stochastic games. International Journal of Game Theory, 26, 303–314.
Lozovanu, D. (2016). Stationary Nash equilibria for average stochastic games. Buletinul A.S.R.M, ser. Math., 2(81), 71–92.
Mertens, J. F. and Neyman, A. (1981). Stochastic games. International Journal of Game Theory, 10, 53–66.
Nash, J. (1951). Non-cooperative games. Ann. Math., 54, 286–293.
Puterman, M. (2005). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New Jersey.
Shapley, L. (1953). Stochastic games. Proc. Natl. Acad. Sci., U.S.A., 39, 1095–1100.
Solan, E. and Vieille, N. (2010). Computing uniform optimal strategies in two-player stochastic games. Economic Theory (special issue on equilibrium computation), 42, 237–253.
Tijs, S. and Vrieze, O. (1986). On the existence of easy initial states for undiscounted stochastic games. Math. Oper. Res., 11, 506–513.
Vieille, N. (2000). Equilibrium in 2-person stochastic games I, II. Israel J. Math., 119(1), 55–126.

Contributions to Game Theory and Management, X, 185–225

Integrative Approach to Supply Chain Collaboration in Distribution Networks: Impact on Firm Performance⋆

Natalia Nikolchenko and Anastasia Lebedeva
St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
E-mail: [email protected], [email protected]

Abstract The study relies on previous research by Cao and Zhang (2011) and van Dijk (2016) and aims to provide theoretical insights and empirical findings on the impact of supply chain collaboration on the performance of firms and collaborative advantage as an intermediate variable in the context of the supply chain of a Russian distributor and its suppliers. The research is based on a case study of a large Russian distribution network, and it is considered to be explanatory and deductive, concerning the latent constructs in the conceptual supply chain framework. The obtained results indicate that supply chain collaboration improves collaborative advantage most significantly through decision synchronization, incentive alignment and information sharing, which in turn has a direct positive influence on operational and firm performance; moreover, a mediating effect of collaborative advantage on the relationship between supply chain collaboration and operational performance was established.

Keywords: supply chain collaboration, distribution networks, firm performance, collaborative advantage, dimensions of supply chain collaboration, structural equation model.

⋆ This work is supported by the Russian Foundation for Basic Research, project No. 16-01-00805A.

1. Introduction

The increasing number of organizations accessing new markets to seek higher efficiencies in sourcing and production has heightened the importance of supply chain management today. While there are many views held by scholars on how to define supply chain collaboration, some common features are evident. We advocate that collaboration involves multiple firms or autonomous business entities engaging in a relationship that aims to share improved outcomes and benefits. To achieve these improvements in performance, the business entities need to establish an appropriate level of trust; share critical information; make joint decisions; and, when necessary, integrate supply chain processes.

Supply chain collaboration is often defined as two or more companies working together to create a competitive advantage and higher profits than can be achieved by acting alone (Simatupang and Sridharan, 2002). Olorunniwo and Li (2010) take a relational position, arguing that collaboration can also be defined as a relationship between independent firms characterized by openness and trust where risks, rewards and costs are shared between parties.

Focusing more on the outcome of collaboration, Simatupang and Sridharan (2005) also use the term collaboration to describe “the close cooperation among autonomous business partners or units engaging in joint efforts to effectively meet end customer needs with lower costs”. However, Singh and Power (2009) argue that cooperation is when firms exchange basic information and have some long-term relations with multiple suppliers or customers. Coordination occurs at a higher level where a continuous flow of critical and essential information takes place using information technology. Additionally, collaboration is higher than coordination, and, at this stage, a high level of commitment, trust and information sharing is required.

The widespread developments in supply chain technologies, tools and applications such as traceability systems, Quick Response, Efficient Consumer Response, Collaborative Planning, Forecasting and Replenishment and VMI have assumed that firms will engage in a collaborative approach to the implementation and use of technologies (Lehoux et al., 2010; Deakins et al., 2008; Sari, 2008; Emberson and Storey, 2006; Derrouiche et al., 2008; Blackhurst et al., 2006). Taking this into consideration, Cao et al. (2010) argue that supply chain collaboration can be defined in different ways and could be either process focused or relationship focused. Notwithstanding, they derive a model for supply chain collaboration attributed to seven components (information sharing, goal congruence, decision synchronization, incentive alignment, resources sharing, collaborative communication and joint knowledge creation), which they term as mechanisms to reduce costs and risks (Soosay and Hyland, 2015). The study by Simatupang and Sridharan (2005) also proposes a model for the collaborative supply chain comprising five characteristics: collaborative performance system; information sharing; decision synchronization; incentive alignment; and integrated supply chain processes.
With the advent of new technology such as electronic commerce, the collaboration among multiple participants in the large-scale logistics distribution network has become much easier. Collaboration among multiple participants reduces logistics costs, increases profits for large-scale industrial companies, and can benefit the overall economy (Wang et al., 2017).

The paper is structured as follows. The first part presents the theoretical framework. The second part describes the research methodology, including the object of the study, data collection and sample descriptive statistics. It is followed by the results of the correlation and regression analysis of the depth and scope of collaboration, the confirmatory factor analysis and the structural equation model of supply chain collaboration. The final part presents the empirical findings and the managerial implications of the obtained results.

2. Theoretical framework and hypotheses development

Supply chain collaboration is considered a major factor in maintaining a supply chain’s competitive position and deemed an important research topic. It has received increased attention in the field of supply chain management, with a growing number of articles published over the years. Supply chains, being inter-organizational and inter-functional, are known to be more effective with the coordinated and collaborative efforts among partners (Soosay and Hyland, 2015). This concept was first highlighted by Ellram and Cooper (1990) as a motivation for successful supply chain management.

Supply chain collaboration puts firms in a position of achieving better performance. To get there, all participating members should make all necessary arrangements of collaborative practices, play according to rules, and follow all ethical principles to make things work well. Collaborative advantages, obtained through collaborative practices, enable them to achieve the highest standards of excellence in customer services and processes and implement necessary improvements to match or exceed these standards (Simatupang and Sridharan, 2005). To achieve a high standard, a company has to work hard and make all necessary improvements. Collaboration has been referred to as the driving force behind effective SCM and may be the ultimate core capability (Min et al., 2005).

2.1. Supply Chain Collaboration Dimensions

Based on previous research by Cao and Zhang (2011) and van Dijk (2016), this study used the following seven dimensions: information sharing, decision synchronization, incentive alignment, resource sharing, collaborative communication, joint knowledge creation and goal congruence. Our choice is based on the need to compare our results with the results of van Dijk (2016) obtained from a similar model.
For better understanding, the role of each dimension in collaboration is discussed further.

Information sharing refers to the extent to which a firm shares relevant, accurate, complete, and confidential information duly with its supply chain partners (Cao and Zhang 2013). Previous research has identified that decision making and overall supply chain performance improve when information is shared between functions (Li et al., 2006; Simatupang and Sridharan, 2008). Information sharing is reported to improve supply chain agility and visibility. The ability to make better decisions and to take actions on the basis of greater visibility makes information sharing valuable to supply chain members (Davenport et al., 2001). Core guidelines are that visibility should inform action, and that action becomes visible if supply chain members understand better the underlying principles that link integrated information and performance drivers. Information sharing generally facilitates decision synchronization through providing relevant, timely, accurate information required to take effective decisions about supply chain planning and execution. It enables participating supply chain members to make use of integrated information to help fulfill demand more quickly with shorter order cycle time (Fisher, 1997). According to Hall and Saygin (2012), the simple act of transferring data between functions will not improve supply chain performance unless the information is accompanied by more robust requirements for collaboration/cooperation.

Decision synchronization is the extent to which supply chain members are able to coordinate key decisions in planning and execution for optimizing supply chain profitability (Simatupang et al., 2002). The fact that supply chain partners have different decision rights and expertise about supply chain operations determines the importance of decision synchronization (Simatupang and Sridharan, 2005).
The information availability needs to be fully synchronized with the decision making. Wadhwa and Rao (2003) indicated that improved decision knowledge can have a significant impact on supply chain performance. Decision synchronization provides feedback to supply chain performance on how performance metrics guide supply chain members to make effective decisions. It aids and enhances information sharing to identify what kind of information should be collected and transferred to the decision makers. Decision synchronization provides justification for incentive alignment to construct appropriate incentive schemes, because different supply chain members are responsible for different levels of decision making. Finally, decision synchronization helps supply chain members to carry out productive actions associated with integrated supply chain processes such as transportation, customer service and replenishment (Simatupang and Sridharan, 2005).

Incentive alignment can be defined as the process of sharing costs, risks, and benefits amongst the supply chain members (Simatupang and Sridharan, 2002). A successful supply chain partnership requires that all gains and losses should be distributed fairly across the supply chain and the collaboration outcome should be beneficial to all supply chain members (Manthou et al., 2004). Thus, incentive alignment motivates supply chain members to act consistently with their mutual strategic objectives, including making decisions that are optimal for the whole supply chain and providing truthful private information (Simatupang and Sridharan, 2008). Narayanan and Ananth Raman (2004) associate incentive alignment with the performance of the overall supply chain. If the supply chain members lack incentive alignment, their actions will not optimize the performance of the network, resulting in excess inventory, stock-outs, incorrect forecasts, inadequate sales efforts, and poor customer service. If supply chain members align their actions to the mutual purpose of collaboration, that will also enhance their individual profitability. Incentive alignment links performance scoreboards from supply chain performance to incentives. The more transparent the linkages between performance and incentives, the more effectively the given incentives are able to motivate the desired and required behavior. In conjunction with decision synchronization, incentive alignment provides incentives to motivate supply chain members to make effective decisions that reinforce the desired level of performance.

Resource sharing is the process of leveraging capabilities and assets and investing in them with supply chain members (Cao and Zhang 2013). Along with information sharing, resource sharing has been widely referred to as a key determinant of effective coordination (Arshinder et al., 2008; Huiskonen and Pirttil, 2002; Stank et al., 1999).
Resource sharing among supply chain partners varies from tangible elements such as sharing of warehouses, machineries and logistical services to intangible elements such as information sharing and reputations (Ramanathan and Gunasekaran, 2014). Resource sharing is a critical part of many collaborative relationships (Ireland and Crum, 2005). Supply chain partners can develop critical resources that extend firm boundaries and that may be incorporated in interfirm activities and processes. These resources allow the collaborating firms to gain higher returns and sustainable competitive advantage (Dyer and Singh 1998).

Communication is a critical task for each function within a supply chain. The more intensely and frequently the communication takes place across the supply chain, the more comprehensible organizational goals and objectives become, which may increase the overall level of coordination across supply chain functions (Wagner and Buko, 2005). To optimize coordination within a supply chain, the objectives of the organization as a whole must be clear and accessible to all functions (Hugos, 2011). A lack of coordination may take place when necessary information is not available for decision-making and when functions operate without the guide of system-wide objectives (Sahin and Robinson, 2005). However, supply chain management is facilitated by clearly defined reporting structures and easily accessible information networks; hence, individual supply chain functions should be focused on high-level organizational interests to enable the alignment of the supply chain as a whole. Computing and communication technologies have played, and will continue to play, an important role in improving design communication (Demirkan, 2005).
New technologies have been applied in order to enhance distributed organizational interactions and achieve good coordination and communication between distributed project teams (Perry and Sanderson, 1998; Wikforss and Lofgren, 2007). Collaborative communication can increase the degree of the interaction and technical collaboration between different partners, making it easier to remove uncertainty and confusion in the early design stage, which cannot be replaced completely by partnering procurement. Collaborative communication has a positive impact on timeliness, understanding, and accuracy.

According to Malhotra et al. (2005), joint knowledge creation can be described as the degree to which supply chain partners develop a better understanding of and response to the market and competitive environment by working together. Essentially, joint knowledge creation is one of the most important objectives of collaboration (Hardy et al., 2003; Gomes and Dahab, 2010; Cheung et al., 2011). Supply chain collaboration encourages collective learning for improving supply chain performance, which in turn provides benefits to all partners (Simatupang and Sridharan, 2004). Joint knowledge creation, as well as its distribution and shared interpretation, allows firms in the supply chain to create new values such as developing new products, building brand image, responding to customers’ needs, and establishing channel relationships (Johnson and Sohi, 2003; Luo et al., 2006; Kaufman et al., 2000). New product development in a high-tech environment requires the merging and integration of different technologies to network strategic communities inside and outside the company in order to share and transfer and thus create knowledge. Knowledge creation acquires expertise from outside the company. In order to create new knowledge, supply chain partners are engaging in interlinked processes that enable rich information sharing, and building information technology infrastructures that allow them to process information obtained from their partners (Malhotra et al., 2005).

Goal congruence is the extent to which supply chain partners perceive their own objectives to be satisfied by the accomplishment of the supply chain objectives. It is recognized as one of the key elements in the collaborative relationship between supply chain partners (Jap, 2001; Naude and Buttle, 2001). Alignment of goals leads to shared inter-organizational interests and thus assists the collaboration. One of the benefits it provides is the reduction of incentives for opportunism (Lejeune and Yakova, 2005). Congruent goals direct buyers and suppliers in the supply chain towards cooperative behaviours, such as constructive communication, mutual support and adaptation, and high commitment (Jap and Anderson, 2003). As a result, goal congruence facilitates synergy in the supply chain and efficient use of resources (Littler et al. 1995). Engaging in networks and supply chain alliances is a means for involved partners to achieve goals that they could not attain independently (Mohr and Spekman, 1994); the partners also bring their own organizational- and individual-level goals of improving their performance to the process (Schreiner et al., 2009). Goal congruence is a necessary requirement to clearly understand and achieve supply chain members’ goals and objectives, both as independent actors of the alliance and as a part of the supply network as a whole.

In our research, we expected to ascertain a direct connection between supply chain collaboration dimensions and operational and firm performance. The aim of the study is to estimate the level of impact of collaboration dimensions on operational and firm performance, which are the key indicators measuring supply chain performance.

2.2. Operational and Firm Performance

The existence of different perspectives blurs the decision regarding what is (or is not) significant to measure in a supply chain; as a result, a growing number of performance measures has been suggested in the literature. At the end of the 1990s, most of the measures suggested in the area of supply chain management focused on the performance of logistics and distribution networks. Measures related to inventory cost or lead time are undoubtedly important, but they provide a limited and inadequate view when the discussion concerns complex supply chain settings (Mehrjerdi, 2009). According to Van Hoek (1998), the scope of performance measurement in a supply chain needs to be holistic. A similar suggestion is made by other scholars, who agree that an integrated approach needs to be adopted when measuring performance in a supply chain (Bititci et al., 2000; Lambert and Pohlen, 2001). Beamon (1999) claimed that appropriate measures in supply chain management fall into three categories, namely resources, output and flexibility. Gunasekaran et al. (2001) argue that performance measures should be classified into different levels according to the decision-making process; the suggested levels are strategic, tactical and operational. De Toni and Tonchia (2001) suggested that both financial and non-financial measures should be considered. In an important synthesis study, Gunasekaran and Kobu (2007) reviewed the pertinent literature and a number of cases. They identified 46 different measures addressing the performance of a supply chain. They remarked that almost 50 percent of the suggested performance measures relate to the internal business processes of a supply chain (internal view), while the remainder refer to the customer (external view).
Making the choice between the internal and the external view of a supply chain is also associated with finding the right balance between operational efficiency and customer responsiveness (Fisher, 1997). Other research efforts adopt a specific performance measurement framework (e.g. the balanced scorecard) and suggest other sets of measures.

In this study, performance comprises firm performance, which includes such measures as sales growth, satisfaction with collaboration, market share growth, ROI and consumer satisfaction, and operational performance. Operational performance refers to the ability of a company to reduce management costs, order time and lead time, and to improve the effectiveness of using raw materials and distribution capacity (Heizer et al., 2008). It is important to firms because it helps to improve the effectiveness of production activities and to create high-quality products (Kaynak, 2003), leading to increased revenue and profit (Truong et al., 2015). For the purpose of the study, operational performance addresses such parameters as on-time delivery to consumer, order fulfillment lead-time, total logistics costs, inventory turn and stock-outs.

2.3. Collaborative Advantages

Several studies in SCM have attempted to identify empirical evidence of the role of SCC in collaborative advantage (Cao and Zhang, 2011; Kanter, 1994) and performance (Nyaga et al., 2010; Ramanathan and Gunasekaran, 2014; Sheu, Yen, and Chae, 2006; Yu et al., 2013; Zacharia, Nix, and Lusch, 2011). It is well known that competitive advantage determines firms’ profits and performance; recently, however, increasing competition has compelled companies to change their strategies in order to create joint competitive advantage with their partners (Lavie, 2006).
Collaborative advantage is a relational view of inter-organisational competitive advantage (Dyer and Singh, 1998). In contrast to competitive advantage, which focuses only on the firm’s own profit, collaborative advantage seeks to maximise a common profit from joint rent-seeking activities (Lavie, 2006). Collaborative advantage cannot be achieved by any firm alone; rather, it is acquired when different firms pursue collaborative action for synergistic outcomes (Vangen and Huxham, 2003).

One of the primary business strategies for improving supply chain performance is a well-integrated supply chain. Real-time information exchange with suppliers upstream and with customers downstream creates an opportunity for optimization. Linkage, which helps reduce lead times, is expected to reduce adverse effects (e.g. the bullwhip effect) and to contribute to enhancing performance. Theoretically, it is well known that supply chain integration creates strategic advantages. Previous research has asserted that collaborative advantage is a way of improving performance (Sheu, Yen, and Chae, 2006; Yu et al., 2013). Jap (2001) found that joint competitive advantage has a positive influence on economic outcomes.

Collaboration is intended to generate customer value by producing mutual advantages among suppliers, manufacturers, and distributors with respect to the supply of low-cost, high-quality products and services. Many of the problems that manufacturing firms face, such as parts shortages, delivery issues, quality problems, and cost increases, are rooted in the lack of effective supply chain integration (Kim, 2009). Supply chain collaboration makes optimal use of shared resources and knowledge (both internal and external to an organization) to achieve operating synergy and efficiencies, reduce costs, and enhance profits (Stock et al., 2010).
It also allows firms to take advantage of different specialized capabilities through intensive coordination, which allows for the accumulation of economies of scale in production, purchasing, logistics, and problem solving. Supply chain collaboration systematically synchronizes the resources and capabilities of every supply chain participant to enhance service performance, lower total costs, develop innovation, and so on. All of this allows us to predict a direct connection between the dimensions of collaboration and collaborative advantages. Moreover, the links between CA and both firm performance and operational performance are also expected to be significant. Hence, collaborative advantage has a mediating role and enhances the effect of supply chain collaboration dimensions on firm performance and operational performance. The level of impact of the dimensions on collaborative advantages, firm performance and operational performance is estimated further below.

2.4. Distribution Supply Network Structure

Supply chain collaboration helps small and medium-sized companies to reduce costs while increasing operational efficiency. Despite these benefits, supply chain collaboration encounters many challenges, including partner search and selection. Distribution is a key driver of the overall profitability of a firm because it directly impacts both the supply chain costs and the customer experience. Distribution refers to the steps taken to move and store products from the supplier stage to the customer stage in the supply chain. Good distribution can be used to achieve a variety of supply chain objectives ranging from low cost to high responsiveness. As a result, companies in the same industry often select different distribution networks with similar and comparable structure.

Most distribution networks have a network supply chain structure. The network structure is a complex supply chain that combines divergent and convergent structures. It is one of several possible supply chain structures: serial, dyadic, divergent, convergent, and network. The serial structure is the typical structure studied in the literature, in which a supplier, manufacturer, distributor and retailer are considered. This structure is in fact obtained by cascading several dyadic structures. The dyadic structure consists of two business entities. A divergent structure is used to represent a more realistic supply chain in which one entity (e.g. a supplier) distributes stock to several downstream entities. In a convergent structure, several entities (e.g. several suppliers) deliver components to a single manufacturer or to a distribution center (Montoya-Torres and Ortiz-Vargas, 2014). An example of the network supply chain structure is depicted in Figure 1.

Fig. 1. Example of network supply chain structure. Source: Montoya-Torres and Ortiz-Vargas (2014)

Many papers on distribution networks focus mainly on classifying the mathematical models. For example, Vidal and Goetschalckx (1997) reviewed the mixed-integer programming models for strategic production-distribution network design and identified the main features of those models (e.g. assumptions, objective functions, and affecting factors). Beamon (1998) provided a focused review of mathematical modeling approaches and identified four types of models based on the nature of the inputs and the objectives. In addition, the number of articles considered in these previous reviews was limited. For example, Bilgen and Ozkarahan (2004) reviewed optimization models for production distribution network design based on only 35 published articles. Meixell and Gargeya (2005) identified the decisions, objectives, level of integration from production sites to end customers, and globalization variables by reviewing no more than 18 research articles (Mangiaracina et al., 2014). This review considered the literature related to distribution network design, first, in the downstream supply chain (i.e. from manufacturing plants to customers, as shown in Figure 2) and, second, as affected by the flow from upstream to downstream (and not by reverse flows). Reverse logistics, in fact, often requires specific facilities, such as collection centers (where customers bring the products) and/or recovery/manufacturing facilities (where returned products are refurbished/remanufactured) (Melo et al., 2009). Figure 2 presents an example of a distribution network, which is the most relevant for this research.

Fig. 2. Example of distribution network structure. Source: Mangiaracina, Song and Perego (2015)

In our case, we consider the part of the network structure that consists of relationships between suppliers and a distributor. An important feature of this structure is the existence of a decision-making firm related to the distributor organization, which, in fact, plays the role of a 3PL operator in terms of supply chain management. In this paper, we focus on the SCN design of a two-echelon supply chain that involves more than 600 suppliers, 8 distribution centers located in different regions of Russia, and a distribution decision-making center (the headquarters of the distribution firm, i.e. the focal firm). We limit our research to the two-echelon supply chain because our focus is on the upstream relationships between suppliers (mostly manufacturers) and the distributor.

Fig. 3. Considered part of supply chain distribution network. Source: partially adapted from Montoya-Torres and Ortiz-Vargas (2014)

Figure 3 depicts a part of the distribution network structure with a decision-making center. The material flow is directed from suppliers to distribution centers as depicted in Figure 1. The information flow is directed from the distribution centers (DCs) to the decision-making center (DMC) and from the DMC to suppliers, and then back. Thus, each link starting from a manufacturer, passing through a distribution center, and ending at a retailer can be regarded as a potential transportation route. The majority of decisions related to the development strategy, contract system, location of distribution centers, building and equipment of warehouses, integrated information processes and other matters rest with the managing company, while operational management is handled by the regional departments (distribution centers).

2.5. Hypotheses Development

From the theoretical background, we have derived the principal constructs of supply chain collaboration that were used in our theoretical, measurement and structural models. They include supply chain collaboration dimensions (SCCD), collaborative advantage (CA), operational performance (OP), and firm performance (FP). To address the research issues, seven basic and important elements of collaboration and its underlying structure were identified with the help of the existing related literature (Cao and Zhang, 2011; van Dijk, 2016). Thus, the construct SCCD included 7 variables, namely: information sharing, decision synchronization, incentive alignment, resource sharing, collaborative communication, joint knowledge creation and goal congruence. The latent construct CA consisted of 4 items: offering flexibility, process efficiency, innovation and business synergy. To recap, the measurements for the latent construct OP were developed in the theoretical review and included 5 items: on-time delivery to consumer, order fulfillment lead-time, total logistics costs, inventory turn and stock-outs.
Finally, for the latent construct FP, 5 measures were adopted from the theoretical background, namely: sales growth, satisfaction with collaboration, market share growth, ROI, and consumer satisfaction.

The relational view provides theoretical support for our model because we focus on:
- how the dimensions of supply chain collaboration impact collaborative advantage and firm and operational performance;
- how collaborative advantage impacts firm and operational performance.

The developed conceptual supply chain collaboration framework suggests that supply chain members need to embrace the supply chain collaboration dimensions and to carry them out properly. If a firm succeeds in doing so, the properly executed supply chain collaboration dimensions will lead to efficient and effective collaborative advantages, which in turn will have a positive direct impact on operational performance and firm performance. According to Cao and Zhang (2011), by collaborating, supply chain partners can work as if they were part of a single enterprise. They can access and leverage each other’s resources and enjoy the associated benefits. Such collaboration can increase collaborative advantage and enhance firm performance and operational performance. Thus, we can formulate the following hypotheses.

Supply chain collaboration dimensions:
H1a: Supply chain collaboration dimensions have a significant positive direct effect on operational performance;
H1b: Supply chain collaboration dimensions have a significant positive direct effect on firm performance;
H1c: Supply chain collaboration dimensions positively impact collaborative advantage at a significant level.

Collaborative advantage:

H2a: Collaborative advantage has a direct significant impact on operational performance;
H2b: Collaborative advantage has a direct positive significant influence on firm performance;
H2c: Collaborative advantage positively mediates the positive relationship between supply chain collaboration dimensions and operational performance;
H2d: Collaborative advantage positively mediates the positive relationship between supply chain collaboration dimensions and firm performance.

Operational performance:
H3: Operational performance has a direct positive significant impact on firm performance.

Compared to van Dijk (2016), we excluded cross-border business barriers and collaboration barriers from our research. In fact, all considered companies operate on Russian markets and cross-border barriers do not exist. Moreover, the contemporary political environment encourages Russian companies to cooperate, so collaborative barriers between them have been decreasing for the past three years. Most manufacturers were forced to increase their sales volumes in domestic markets, establishing closer relationships with distributors operating in those markets. Figure 4 depicts the conceptual supply chain collaboration hypotheses framework used in this research.

Fig. 4. Conceptual supply chain collaboration hypotheses framework. Source: partially adapted from Cao and Zhang (2011) and van Dijk (2016)
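As a reading aid, hypotheses H1a to H3 can be written as a small system of linear structural equations. This is only a sketch under simplifying assumptions: the coefficient symbols \gamma_i and \beta_i and the disturbance terms \zeta_i are notation introduced here for illustration, the constructs are treated as single standardized variables, and the system is not the authors' own model specification.

```latex
\begin{aligned}
\mathrm{CA} &= \gamma_{1}\,\mathrm{SCCD} + \zeta_{1}
  && \text{H1c: } \gamma_{1} > 0 \\
\mathrm{OP} &= \gamma_{2}\,\mathrm{SCCD} + \beta_{1}\,\mathrm{CA} + \zeta_{2}
  && \text{H1a: } \gamma_{2} > 0,\quad \text{H2a: } \beta_{1} \neq 0 \\
\mathrm{FP} &= \gamma_{3}\,\mathrm{SCCD} + \beta_{2}\,\mathrm{CA} + \beta_{3}\,\mathrm{OP} + \zeta_{3}
  && \text{H1b: } \gamma_{3} > 0,\quad \text{H2b: } \beta_{2} > 0,\quad \text{H3: } \beta_{3} > 0 \\
\text{indirect effect of SCCD on OP via CA} &= \gamma_{1}\beta_{1}
  && \text{H2c: } \gamma_{1}\beta_{1} > 0 \\
\text{indirect effect of SCCD on FP via CA} &= \gamma_{1}\beta_{2}
  && \text{H2d: } \gamma_{1}\beta_{2} > 0
\end{aligned}
```

In this notation the mediation hypotheses H2c and H2d concern products of path coefficients, whereas the remaining hypotheses each concern a single direct path.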

3. Research design

In order to test the conceptual supply chain collaboration framework, a two-step approach to Structural Equation Modelling (SEM) was used (Anderson and Gerbing, 1988). Analyzing research data and interpreting results can be complex and confusing. Traditional statistical approaches to data analysis specify default models, assume that measurement occurs without error, and are somewhat inflexible. In contrast, structural equation modeling requires the specification of a model based on theory and research; it is a multivariate technique that incorporates measured variables and latent constructs and explicitly specifies measurement error. A model (diagram) allows relationships between variables to be specified. Moreover, the two-step approach has a number of comparative strengths that allow meaningful inferences to be made. First, it allows tests of significance for all pattern coefficients. Second, it allows an assessment of whether any structural model would give an acceptable fit. Third, one can make an asymptotically independent test of the substantive or theoretical model of interest. Hence, the suitability of the formulated conceptual model was tested before the structural path relationships in the conceptual supply chain framework were examined to test the hypotheses.

3.1. Object of the Study

The subject of this study is supply chain collaboration in a distribution network where the focal firm is one of the largest participants in the market. The object of the study is therefore the relationship between the focal firm (the distributor) and its suppliers, most of which are manufacturers. The focal firm is a large distribution company that has been operating in Russia for more than 25 years. The market share of the distributor was estimated at 17-20% at the end of 2015. The company is present in more than 150 cities in different regions of Russia, from the North-West to the Far East. As a major player in the market, the company has its own intra-organizational supply chain network, including 8 distribution centers with full category A warehouses. All warehouses are equipped with warehouse management systems (WMS). There are more than 60 sales departments with full category B warehouses. The number of employees is nearly six thousand. The number of suppliers with a valid contract was 632 at the end of 2016. Among the suppliers, there are more than 400 manufacturers. The main suppliers of the company are manufacturers in the electrical industry, which is divided into six segments: Cable production; Industrial electrical equipment; Lighting products; Installation electrical equipment; Safety systems; and Fasteners and Plumbing, the latter being a new direction of development of the focal company. The suppliers include such global giants as Philips, ABB, Schneider Electric, Siemens and others. Most cable production manufacturers are Russian companies, which is also associated with the fact that the level of foreign trade activity has been decreasing for the last several years.

3.2. Data collection

To validate our research model with the data, we adopted a survey questionnaire with measurement items derived from previous research (van Dijk, 2016; Cao and Zhang, 2011).
The setting of this study views SCC as internally and externally focused functional areas. The study therefore categorizes both planning and sharing as market- and operations-oriented activities. To implement this perspective, the relevant literature was reviewed and items for each construct were obtained. The items were then discussed with experts (in operations, marketing, collaborative communications and information sharing) and practitioners. These procedures were intended to ensure face validity and content validity. We followed van Dijk (2016) in the survey development, using a five-point Likert scale where 1 and 5 were “strongly disagree” and “strongly agree”, respectively. The survey incorporated multiple items for each of the constructs. Most of these items were developed or adopted from the available SCC or SCM literature.

The instrument included 19 questions that evaluated the impact of supply chain collaboration constructs and their indicators on the performance of suppliers involved in the distribution network. The first four questions were demographic in nature and evaluated the organization profile. Questions 5 to 10 dealt with the collaborative relationships that suppliers have with their distributor. The third section of the questionnaire (questions 11 to 17) examined SCC development and its impact on organizational performance, and included three open questions asking respondents to share their views on potential areas of improvement in collaboration. Questions 18-19 in the final section aimed at investigating the SCC barriers and impediments that foreign suppliers face; however, for the aforementioned reasons, these indicators were excluded from the research. The questionnaire was prepared in Russian and English versions. The Russian version was sent out to the respondents, and the English version was used in the research for the purpose of language uniformity.

The survey aimed to measure the level of practice of various construct items and targeted a single industry to ensure deep understanding. The questionnaire was initially reviewed by researchers and practitioners in the area of supply chain management. After the instrument was approved, the primary data were collected using Google Forms. The survey link was sent by email to 632 small, medium and large suppliers of the distribution network described above. Respondents were asked to fill in the questionnaire if they had SCC experience. This limitation is acceptable when the subject under study is not a usual practice and the purpose is to get as many responses as possible.

Contacts were obtained from the distributor, and emailing was organized through the decision-making distributor company, which provided a direct connection with the distributor’s business partners. The email highlighted the survey description and additional information, motivations for respondents, and a request to forward the email to another person with more experience in SCC. Over a response period of five weeks, a total of 65 online responses were received, of which 4 had excessive missing values, yielding 61 (9.7 per cent) usable responses. As the subject under study is not a usual practice, the response rate is considered acceptable and is consistent with similar studies (Cao and Zhang, 2011; van Dijk, 2016). A summary of the respondents who participated in the survey is shown in Table 1. The large companies that participated in the survey belong to six different industries, and most of them are manufacturers.

Table 1. Distribution of the respondents in different industries

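| Industry | Number of companies | 50-100 employees | 101-250 employees | 251-500 employees | 501-1000 employees | More than 1000 employees |
|---|---|---|---|---|---|---|
| Cable production | 6 | - | 3 | 1 | 1 | 1 |
| Industrial electrical equipment | 17 | 10 | 2 | 2 | 1 | 2 |
| Lighting products | 1 | - | 1 | - | - | - |
| Installation electrical equipment | 17 | 7 | 1 | 6 | 1 | 2 |
| Fasteners and plumbing | 5 | - | 2 | 2 | 1 | - |
| Safety systems | 15 | 10 | 2 | 1 | - | 2 |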
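The counts in Table 1 can be cross-checked with a few lines of Python. The sketch below transcribes the table, reads each "-" as zero (an assumption), and recomputes the row totals, the totals per company-size class, and the response rate from the 632 questionnaires sent. The names `TABLE_1`, `SIZE_CLASSES` and the helper functions are illustrative and do not come from the study.

```python
# Counts transcribed from Table 1; a "-" cell is read as 0 (assumption).
SIZE_CLASSES = ["50-100", "101-250", "251-500", "501-1000", ">1000"]
TABLE_1 = {
    "Cable production": [0, 3, 1, 1, 1],
    "Industrial electrical equipment": [10, 2, 2, 1, 2],
    "Lighting products": [0, 1, 0, 0, 0],
    "Installation electrical equipment": [7, 1, 6, 1, 2],
    "Fasteners and plumbing": [0, 2, 2, 1, 0],
    "Safety systems": [10, 2, 1, 0, 2],
}
SURVEYS_SENT = 632  # number of suppliers that received the survey link


def industry_totals(table):
    """Number of responding companies per industry (row sums)."""
    return {name: sum(counts) for name, counts in table.items()}


def size_class_totals(table):
    """Number of responding companies per size class (column sums)."""
    columns = zip(*table.values())
    return dict(zip(SIZE_CLASSES, (sum(col) for col in columns)))


def percent(part, whole, digits=1):
    """Share of `part` in `whole`, in percent, rounded to `digits` places."""
    return round(100.0 * part / whole, digits)


usable_responses = sum(industry_totals(TABLE_1).values())
response_rate = percent(usable_responses, SURVEYS_SENT)

print(industry_totals(TABLE_1))
print(size_class_totals(TABLE_1))
print(usable_responses, response_rate)
```

The row sums reproduce the "Number of companies" column, and the grand total divided by 632 reproduces the reported response rate.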

3.3. Sample descriptive statistics

The descriptive statistics of the sample are provided to assess the overall profile of the respondent group and to get a better understanding of the supply chain considered in this research. For the purpose of the study, IBM SPSS Statistics 24 software was used to calculate descriptive statistics.

Almost all firms in the sample operate in the Russian Federation (98.4%); only one firm operates in Italy. The reason for this is the increase in prices for imported products, the reduction of foreign trade activity and, thus, the shift of supplier selection priorities towards the Russian market.

The majority of respondents are concentrated in three industries: Industrial electrical equipment (27.87%), Installation electrical equipment (27.87%), and Safety systems (24.59%). The distribution of respondents by industry, in both frequencies and percentages, is presented in Table 2.

Table 2. Descriptive statistics by industry composition

| Industry description | N | (%) |
|---|---|---|
| Cable production | 6 | 9.84% |
| Fasteners and plumbing | 5 | 8.20% |
| Industrial electrical equipment | 17 | 27.87% |
| Installation electrical equipment | 17 | 27.87% |
| Lighting products | 1 | 1.64% |
| Safety systems | 15 | 24.59% |

Of all respondents, 27 (44.3%) reported that their firm has 50-100 full-time employees (FTEs), and 11 (18%) reported 101-250 FTEs. Slightly more, 12 (19.7%), stated that they have 251-500 FTEs. A smaller number of respondents reported 501-1000 and more than 1000 FTEs: 4 (6.6%) and 7 (11.5%), respectively. Thus, we can conclude that the majority of respondents represent small and medium enterprises.

The majority of respondents, 36 (59.0%), have long-term relationships with their distributor, that is, for more than 5 years; 21 (34.4%) respondents reported a relationship with their distributor of 1-5 years; and only 4 respondents indicated that the relationship with their distributor has lasted for less than one year. This last group of respondents belongs to the Fasteners and Plumbing industry, which represents a new direction of development of the focal company.

As for the type of relationship strategy in the supply chain, most of the respondents (86.9%) stated that they maintain a cooperative relationship with their distributor. The distribution of respondent firms according to the relationship strategy with their distributor is presented in Table 3.

Table 3. Type of supply chain relationship strategy

| Type of supply chain relationship strategy | N | (%) |
|---|---|---|
| Cooperative | 53 | 86.9% |
| Competitive | 6 | 9.8% |
| Command | 2 | 3.3% |

Long-term relationships between partners facilitate a high level of cooperation and, therefore, lead to the cooperative type of supply chain strategy. Another reason why most respondents reported the cooperative type of supply chain strategy is that all of them are partners of a single distributor and, hence, perceive the relationship within the network as a priori cooperative, rather than competitive or command. To support this, the cross-table of type of supply chain relationship strategy and relationship length is provided below.

Table 4. Cross-table of type of supply chain relationship strategy and relationship length

| Strategy/length | <1 year | 1-5 years | More than 5 years |
|---|---|---|---|
| Cooperative | 3 (4.9%) | 18 (29.5%) | 32 (52.5%) |
| Competitive | 1 (1.6%) | 2 (3.3%) | 3 (4.9%) |
| Command | 0 (0.0%) | 1 (1.6%) | 1 (1.6%) |
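The percentages in Table 4 are shares of the whole sample (61 usable responses). The following sketch recomputes them from the cell counts and also derives the row and column totals; it further shows, as an illustrative extra that is not reported in the study, how the share of one strategy within each relationship-length group could be computed. The names `CROSSTAB`, `cell_percentages` and `share_within_length` are hypothetical.

```python
# Cell counts transcribed from Table 4 (strategy x relationship length).
CROSSTAB = {
    "Cooperative": {"<1 year": 3, "1-5 years": 18, ">5 years": 32},
    "Competitive": {"<1 year": 1, "1-5 years": 2, ">5 years": 3},
    "Command": {"<1 year": 0, "1-5 years": 1, ">5 years": 1},
}


def grand_total(crosstab):
    """Total number of respondents in the cross-table."""
    return sum(sum(row.values()) for row in crosstab.values())


def row_totals(crosstab):
    """Respondents per relationship strategy."""
    return {strategy: sum(row.values()) for strategy, row in crosstab.items()}


def column_totals(crosstab):
    """Respondents per relationship-length group."""
    totals = {}
    for row in crosstab.values():
        for length, count in row.items():
            totals[length] = totals.get(length, 0) + count
    return totals


def cell_percentages(crosstab, digits=1):
    """Each cell as a percentage of the grand total, as printed in Table 4."""
    total = grand_total(crosstab)
    return {
        strategy: {length: round(100.0 * n / total, digits) for length, n in row.items()}
        for strategy, row in crosstab.items()
    }


def share_within_length(crosstab, strategy, digits=1):
    """Share of one strategy inside each length group (illustrative, not in the study)."""
    columns = column_totals(crosstab)
    return {
        length: round(100.0 * crosstab[strategy][length] / columns[length], digits)
        for length in columns
    }


print(cell_percentages(CROSSTAB))
print(row_totals(CROSSTAB), column_totals(CROSSTAB))
print(share_within_length(CROSSTAB, "Cooperative"))
```

The row totals agree with Table 3 and the column totals agree with the relationship-length counts given in the text.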

4. Analysis of Modeling Results

4.1. Correlation and Regression Analysis of Depth and Scope of Collaboration

Following van Dijk (2016), the depth and scope of collaboration were assessed by means of the construct collaboration areas. While the scope of collaboration is measured by the number of business processes and activities in collaboration, the depth of collaboration represents the level and degree of integration of processes in collaboration, and it increases with the volume and frequency of material and information exchanges and the employed coordination mechanisms (Skjoett-Larsen et al., 2003). For the purpose of the study, IBM SPSS Statistics 24 and IBM SPSS Amos 24 were used to conduct data analysis. In our research, we asked the respondents to evaluate the extent of collaboration in several areas; the results are presented in Table 5.

Table 5. Collaboration areas

| Collaboration area | Min | Max | Mean | SD |
|---|---|---|---|---|
| Production | 1 | 5 | 1.85 (Little involvement) | 1.263 |
| Inventory management | 1 | 5 | 2.95 (Some involvement) | 1.371 |
| Distribution | 1 | 5 | 2.90 (Some involvement) | 1.411 |
| R&D | 1 | 5 | 1.48 (No involvement) | 0.942 |
| Supply chain design | 1 | 5 | 2.69 (Some involvement) | 1.444 |
| Product development | 1 | 5 | 1.69 (Little involvement) | 1.148 |
| Promotion | 1 | 5 | 4.02 (Great involvement) | 1.008 |

The mean involvement in most collaboration areas was lower than the scale’s mid-point (3). Thus, we can conclude that the respondents perceived a low level and degree of collaboration in most collaboration areas. The only collaboration area with a mean (4.02) above the mid-point was promotion. Hence, the respondents perceive the highest level of collaboration with their distributor to be in the area of promotion. The lowest level of collaboration was assigned by the respondents to the area of R&D, with a mean value of 1.48, followed by product development and production, with means of 1.69 and 1.85 respectively. A higher degree of collaboration is perceived in the areas of supply chain design, distribution and inventory management, which all have means close to the mid-point.

The correlations between collaboration areas and operational and firm performance indicators were calculated to examine the relationship between these independent and dependent variables. The results of the Pearson correlation are presented in Table 6.

Table 6. Correlation between collaboration areas and firm performance and operational performance indicators

| Dependent/Independent | Production | Inventory Management | Distribution | R&D | Supply Chain Design | Product Development | Promotion |
|---|---|---|---|---|---|---|---|
| Sales growth | .138 | .215 | .279* | .104 | .128 | .182 | .210 |
| Satisfaction with collaboration | .181 | .324** | .334** | .268* | .095 | .302* | .228 |
| Market share growth | .038 | .270* | .213 | .002 | .016 | .116 | .144 |
| ROI | .025 | .245 | .200 | .179 | .126 | .120 | .140 |
| Consumer satisfaction | .200 | .319** | .370** | .232 | .178 | .334** | .272* |
| On-time delivery to consumer | .377** | .200 | .354** | .199 | .246 | .353** | .295* |
| Order fulfillment lead time | .427** | .292* | .446** | .326* | .287* | .436** | .235 |
| Total logistics costs | .009 | .154 | .197 | .035 | .165 | -.010 | .033 |
| Inventory turn | .303* | .417** | .410** | .143 | .261* | .240 | .065 |
| Stock-outs | .190 | .247 | .205 | .133 | .247 | .050 | .156 |

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).

Statistically significant correlations were observed in all collaboration areas and were found to be positive. Collaboration in production showed a moderate significant correlation with on-time delivery to consumer (.377**), order fulfillment lead time (.427**) and inventory turn (.303*). Collaboration in inventory management showed a moderate significant correlation with satisfaction with collaboration (.324**), consumer satisfaction (.319**), and inventory turn (.417**), and a weak correlation with market share growth (.270*) and order fulfillment lead time (.292*). Next, collaboration in distribution had a moderate significant correlation with satisfaction with collaboration (.334**), consumer satisfaction (.370**), on-time delivery to consumer (.354**), order fulfillment lead time (.446**) and inventory turn (.410**), and a weak significant correlation with sales growth (.279*). Collaboration in R&D showed a weak significant correlation with satisfaction with collaboration (.268*) and order fulfillment lead time (.326*). Collaboration in supply chain design showed a weak significant correlation with order fulfillment lead time (.287*) and inventory turn (.261*). Collaboration in product development showed a moderate significant correlation with consumer satisfaction (.334**), on-time delivery to consumer (.353**) and order fulfillment lead time (.436**), and a weak significant correlation with satisfaction with collaboration (.302*). Finally, collaboration in promotion demonstrated a weak significant correlation with consumer satisfaction (.272*) and on-time delivery to consumer (.295*). No significant correlations were found for the firm performance variable ROI or for the operational performance variables total logistics costs and stock-outs.
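For readers who want to reproduce this kind of analysis outside SPSS, the sketch below computes a Pearson correlation coefficient and the corresponding t statistic in plain Python. The two Likert-style score lists are invented purely for illustration and are not the survey data; the function names are likewise hypothetical. The final line converts a reported coefficient into a t statistic under the assumption that all 61 usable responses entered that pair, which the text does not state explicitly.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two samples of equal length with at least 3 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical 1-5 Likert answers from eight respondents (illustration only).
collaboration_in_distribution = [1, 2, 3, 4, 5, 3, 2, 4]
order_fulfillment_lead_time = [2, 2, 3, 5, 4, 3, 1, 4]

r = pearson_r(collaboration_in_distribution, order_fulfillment_lead_time)
print(round(r, 3), round(t_statistic(r, len(collaboration_in_distribution)), 3))

# t statistic implied by a reported coefficient, assuming n = 61 for that pair.
print(round(t_statistic(0.446, 61), 2))
```

Turning the t statistic into a two-tailed p-value requires the Student t distribution, which statistical packages such as SPSS evaluate internally and which is deliberately left out of this sketch.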
We then computed two composite variables, one by summing the collaboration area variables and one by summing the firm performance and operational performance variables, and analyzed the correlation between them. The composite variable collaboration areas had a moderate significant correlation with the composite variable of operational and firm performance (.426**).

To gain more detailed insight into the effects of collaboration areas on operational and firm performance indicators, we performed multiple regressions. Following van Dijk (2016) and Bagchi et al. (2005), the cut-off value for adjusted R square was set at .10. To check for multicollinearity, we assessed the variance inflation factor (VIF) of the collaboration area, operational performance and firm performance variables. A VIF between 5 and 10 may be a reason for concern, whereas a VIF above 10 indicates correlation among the predictors that is high enough to cause a multicollinearity problem. Most VIFs were in the range between 1.228 and 4.234; only the area of product development had a higher VIF, 5.457. Nevertheless, all VIF values were below the maximum acceptable cut-off value of 10, which indicates the absence of a multicollinearity problem.

The results of the multiple regression with collaboration areas as independent variables and firm performance indicators as dependent variables are presented in table 7. They show that the firm performance variable satisfaction with collaboration was significantly related to the collaboration areas inventory management, supply chain design and promotion. It is interesting to note that for the relationship between supply chain design and satisfaction with collaboration the regression parameter was negative. According to our information, in most cases all supplies are organized by the distributor on Ex Works terms and the transfer of ownership of the goods takes place in the supplier’s warehouse, so joint supply chain design does not in fact exist.
Thus, we can assume that most respondents do not have any joint practices with their distributor in supply chain design.
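The multicollinearity screen described above can be sketched as follows. The formula VIF = 1 / (1 - R^2) is the standard definition, where R^2 comes from regressing one predictor on all the others; the thresholds 5 and 10 and the three VIF values are those quoted in the text. The function and label names are illustrative assumptions.

```python
def vif_from_r2(r2):
    """VIF of a predictor, given R^2 from regressing it on the other predictors."""
    if not 0 <= r2 < 1:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)

def classify_vif(vif, concern=5.0, severe=10.0):
    """Label a VIF using the 5 / 10 thresholds quoted in the text."""
    if vif > severe:
        return "multicollinearity"
    if vif >= concern:
        return "possible concern"
    return "acceptable"

# VIF values reported in the text: the two ends of the main range
# and the value for product development.
reported = {"range_low": 1.228, "range_high": 4.234, "product_development": 5.457}
labels = {name: classify_vif(v) for name, v in reported.items()}
```

Under these thresholds product development falls into the "possible concern" band, while every reported value stays below 10.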

Table 7. Multiple regressions of collaboration areas and firm performance

Firm performance variable         Collaboration area variable   Regression parameter   Adjusted
                                                                estimate (Beta)        R square
Satisfaction with collaboration   Inventory management*         .350                   .176
Satisfaction with collaboration   Supply chain design*          -.371                  .176
Satisfaction with collaboration   Promotion*                    .362                   .176
Market share growth               Inventory management*         .429                   .121
Market share growth               Promotion*                    .340                   .121
Consumer satisfaction             Promotion*                    .348                   .192
*. P < 0.05

We can suggest that collaboration in inventory management and promotion between suppliers and their distributor is particularly valuable and effective, and thus leads to suppliers’ satisfaction with the collaboration itself. The multiple regression analysis also showed that market share growth was significantly related to the collaboration areas inventory management and promotion. Finally, collaboration in the area of promotion had a significant relationship with consumer satisfaction. The logic of this relationship is quite clear: the distributor has extensive experience in promotion and the opportunity to use best practices in the market, which results in consumer satisfaction.

However, when composite variables were computed by summing the collaboration area variables and the firm performance variables, and a linear regression was run between the two composites, the composite variable collaboration areas had a non-significant positive parameter estimate with the composite variable firm performance (.307). Moreover, the adjusted R square (.079) was lower than the cut-off value of .10.

Multiple regression analysis showed no significant regressions between collaboration areas as independent variables and operational performance indicators as dependent variables. Nevertheless, a simple linear regression was conducted between the composite variables obtained by summing the collaboration area variables and the operational performance variables. As in the Pearson correlation analysis, the sum of collaboration area variables had a significant parameter estimate with the sum of operational performance variables (.445**). Furthermore, the adjusted R square (.185) was higher than the cut-off value of .10. Thus, it can be stated that there is indeed a positive relationship between the scope and depth of collaboration and operational performance.
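The adjusted R square cut-off applied to the composite regressions can be illustrated with a short sketch. It assumes that the reported Beta is a standardized coefficient, so that in a simple regression R^2 equals Beta squared. The sample size N_RESPONDENTS is a hypothetical value chosen for illustration; the actual number of respondents is not restated in this section.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted on n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

CUT_OFF = 0.10  # adjusted R^2 cut-off used in the text

# Assumption: Beta is standardized, so R^2 = Beta^2 in a simple regression.
# N_RESPONDENTS is hypothetical, not the survey's actual sample size.
N_RESPONDENTS = 60
for beta in (0.307, 0.445):
    adj = adjusted_r2(beta ** 2, N_RESPONDENTS, k=1)
    print(f"beta={beta:.3f} adj_R2={adj:.3f} passes_cut_off={adj > CUT_OFF}")
```

Under these assumptions the composite regression on firm performance stays below the cut-off and the one on operational performance clears it, in line with the conclusions drawn in the text.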
The results of the multiple regression with operational performance indicators as independent variables and firm performance indicators as dependent variables are presented in table 8.

Table 8. Multiple regressions of operational performance and firm performance

Firm performance variable         Operational performance variable   Regression parameter   Adjusted
                                                                     estimate (Beta)        R square
Sales growth                      Inventory turn*                    .408                   .165
Satisfaction with collaboration   Inventory turn*                    .337                   .175
Market share growth               Inventory turn***                  .603                   .240
ROI                               Total logistics costs**            .381                   .278
Consumer satisfaction             Inventory turn*                    .315                   .206
***. P < 0.001, **. P < 0.01, *. P < 0.05

The multiple regression of firm performance indicators on operational performance variables showed significant relationships between the operational performance indicator inventory turn and the firm performance indicators sales growth, satisfaction with collaboration, market share growth and consumer satisfaction. In addition, a significant relationship was observed between the operational performance indicator total logistics costs and the firm performance variable ROI. Moreover, a simple linear regression was conducted between the composite variables obtained by summing the operational performance variables and the firm performance variables. The composite variable operational performance had a significant positive parameter estimate with the composite variable firm performance (.550***). In addition, the adjusted R square (.290) was higher than the cut-off value of .10. Thus, it can be stated that there is indeed a positive effect of operational performance on firm performance.

To provide an integrative and comprehensive analysis of collaboration areas, a path diagram of the multiple regressions was constructed. The independent variables of all collaboration areas were combined into one latent construct “collaboration areas”, whereas the latent constructs operational performance and firm performance were treated as dependent variables. The results of the multiple regression analysis are visualized in the path diagram included in Appendix 1. Table 9 shows the standardized regression coefficients and their significance.

Table 9 and the path diagram show that there is a significant positive relationship between the latent construct collaboration areas and the latent construct operational performance (.521*). Moreover, there is a significant positive effect of the latent construct operational performance on the latent construct firm performance (.416*).
However, there is no significant effect of the latent construct collaboration areas on firm performance.

To sum up, the scope and depth of collaboration between the suppliers and their distributor in this study can be evaluated as moderate. The results of the multiple regression analysis showed that collaboration in the areas of inventory management, supply chain design and promotion had the most significant effects (negative in the case of supply chain design) on several firm performance indicators, namely satisfaction with collaboration, consumer satisfaction and market share growth. However, in the other collaboration areas, that is, production, distribution, R&D and product development, no significant results from collaboration were observed.

Table 9. Standardized regression coefficients and their significance

Relationship                                              Regression parameter   P-value
                                                          estimate (Beta)
Collaboration areas → Operational performance             .521*                  .011
Operational performance → Firm performance                .416*                  .011
Collaboration areas → Firm performance                    .081                   .611
Collaboration areas → Product development                 .825**                 .002
Collaboration areas → Supply chain design                 .652***                ***
Collaboration areas → R&D                                 .710**                 .003
Collaboration areas → Distribution                        .670**                 .004
Collaboration areas → Inventory management                .627*                  .010
Collaboration areas → Production                          .883**                 .002
Collaboration areas → Promotion                           .413
Firm performance → Sales                                  .715
Firm performance → Satisfaction with collaboration        .933***                ***
Firm performance → Market share                           .747***                ***
Firm performance → ROI                                    .628***                ***
Firm performance → Consumer satisfaction                  .779***                ***
Operational performance → On-time delivery                .856
Operational performance → Order fulfillment lead time     .966***                ***
Operational performance → Total logistics costs           .386**                 .002
Operational performance → Inventory turn                  .546***                ***
Operational performance → Stock-outs                      .121                   .366
***. P < 0.001, **. P < 0.01, *. P < 0.05

In addition, the relationships between the composite variables of collaboration areas, operational performance and firm performance were analyzed. The regression analysis showed a positive significant effect (.445**) of collaboration areas on operational performance. Moreover, operational performance had a significant positive relationship with firm performance (.550***). To explain these results, note that the term collaboration implies active engagement in the solution of operational issues. Coordination of strategic issues alone, without operational cooperation, is not enough for satisfactory results. In this case, operational performance influences firm performance.

The abovementioned significant positive effects and relationships were also supported by the path diagram of collaboration areas in Appendix 1. The structural model measured the relationships between the unobserved latent constructs collaboration areas and operational performance (.521*), collaboration areas and firm performance (.081), and operational performance and firm performance (.416*). Thus, if the latent construct collaboration areas increases by one standard deviation, the latent construct operational performance increases by .521 standard deviations, an effect that is significant at the 5 percent level. A higher level of collaboration therefore has a significant positive impact on operational performance. Moreover, if the latent construct operational performance increases by one standard deviation, the latent construct firm performance increases by .416 standard deviations, which is also significant at the 5 percent level.

4.2. Descriptive statistics of the Latent Constructs of the Structural Equation Model

Before presenting the results of Confirmatory Factor Analysis (CFA) and Structural Equation Modeling (SEM), we provide descriptive statistics of the latent construct Supply Chain Collaboration Dimensions (SCCD) in table 10 and of the latent construct Collaborative Advantage (CA) in table 11.

Table 10. Descriptive statistics of dimensions of supply chain collaboration

Dimension                     Min   Max   Mean   SD
Information sharing           1     5     4.10   0.926
Decision synchronization      1     5     3.57   1.258
Incentive alignment           1     5     3.16   1.344
Resource sharing              1     5     2.92   1.441
Collaborative communication   2     5     4.36   0.817
Joint knowledge creation      1     5     2.90   1.350
Goal congruence               1     5     3.56   1.245

As shown in table 10, collaborative communication (4.36) and information sharing (4.10) had the highest means and were thus the most used dimensions of collaboration. Decision synchronization (3.57) and goal congruence (3.56) were used to some extent, whereas resource sharing (2.92) and joint knowledge creation (2.90) were perceived as the least used collaboration dimensions in the supply chain.

Table 11. Descriptive statistics of collaborative advantages

Collaborative advantage   Min   Max   Mean   SD
Process efficiency        1     5     3.48   0.868
Offering flexibility      1     5     3.85   0.910
Business synergy          1     5     3.48   0.906
Innovation                1     5     2.97   1.064

The descriptive statistics in table 11 show that offering flexibility (3.85) was evaluated by respondents as the most important advantage derived from collaboration in the supply chain. Business synergy (3.48) and process efficiency (3.48) were evaluated equally, while innovation (2.97) was ranked as the least important advantage.

For a better understanding of the effect of supply chain collaboration, respondents were asked to rate performance improvements due to collaboration in ten specific areas using a five-point Likert scale. The results are presented in table 12.

For both operational performance and firm performance, the means are generally around point 4 (Agree). Hence, we can conclude that most respondents perceived a positive change in operational and firm performance resulting from collaboration. Two operational performance indicators, total logistics costs and stock-outs, have lower means, which are closer to the mid-point 3 (Neutral). Thus, the respondents perceive almost no effect of collaboration on their total logistics costs and stock-outs. Five indicators have the highest means among all performance indicators, namely sales growth (4.34), satisfaction with collaboration (4.30), consumer satisfaction (4.11), market share growth (4.07) and on-time delivery to consumer (4.02). These indicators were perceived by respondents to have achieved the highest improvement through collaboration.

Table 12. Descriptive statistics of firm performance and operational performance

Firm and operational performance   Min   Max   Mean             SD
Firm performance
  Sales growth                     2     5     4.34 (Agree)     0.680
  Satisfaction with collaboration  3     5     4.30 (Agree)     0.558
  Market share growth              2     5     4.07 (Agree)     0.910
  ROI                              2     5     3.49 (Neutral)   0.766
  Consumer satisfaction            3     5     4.11 (Agree)     0.661
Operational performance
  On-time delivery to consumer     2     5     4.02 (Agree)     0.671
  Order fulfillment lead time      2     5     3.92 (Agree)     0.781
  Total logistics costs            2     5     3.39 (Neutral)   0.802
  Inventory turn                   1     5     3.77 (Agree)     0.824
  Stock-outs                       1     5     3.07 (Neutral)   0.946

4.3. Confirmatory Factor Analysis

This research follows a two-step SEM approach. The first step is to develop and assess the measurement model, and the second step is to specify and assess the structural model (Hair, 2010). Confirmatory factor analysis (CFA) is a multivariate statistical procedure that corresponds to the measurement model. It is a theory-driven statistical method, employed to test predefined hypotheses. All latent constructs and indicators were determined in advance and presented in the conceptual framework; therefore, CFA was used to evaluate the fit and validity of the measurement model. After the measurement model was shown to represent the theory adequately with the data obtained for the study, structural equation modeling was used to analyze the hypothesized relationships between constructs. All statistical analyses were completed in IBM SPSS 24 and IBM SPSS Amos 24. The level of significance for all tests was set at 0.05.

Following Van Dijk (2016), we decided to conduct a preliminary test of construct reliability, analyzing each construct separately from the others. From a statistical point of view, reliability is the proportion of variability in the responses that is due to individual differences among respondents. This means that even a reliable survey will have varying responses because respondents have different opinions on the questions, not because the questionnaire questions were unclear or ambiguous. Consequently, a test for reliability was conducted for all four latent constructs.

The preliminary reliability analysis was run using Cronbach’s alpha coefficient. It indicates whether each latent construct, taken separately and disregarding possible correlations with other constructs and potential cross-loadings, is able to capture the concept described.
As a rule, the cut-off value for Cronbach’s alpha is 0.7, although small negative deviations are acceptable (Cooper and Schindler, 2006; Malhotra and Birks, 2006). The results of the Cronbach’s alpha test are presented in table 13.

Table 13. Cronbach’s alpha (a preliminary test of construct reliability)

Latent construct                      Number of    Cronbach’s
                                      indicators   alpha
Supply Chain Collaboration Dimensions 7            0.881
Collaborative Advantage               4            0.755
Firm Performance                      5            0.850
Operational Performance               5            0.732
All items                             34           0.897
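For readers who want to see how a coefficient of this kind is obtained, the following minimal sketch computes Cronbach’s alpha with the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of the summed score). The study itself used SPSS; the item scores below are hypothetical Likert answers, not the survey data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one score list per item,
    with respondents in the same order in every list."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    totals = [sum(scores) for scores in zip(*items)]  # summed score per respondent
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var_sum / variance(totals))

# Hypothetical answers: 3 items, 6 respondents.
items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [3, 5, 3, 1, 4, 4],
]
alpha = cronbach_alpha(items)
reliable = alpha >= 0.7  # cut-off value quoted in the text
```

The same function applied to the indicator columns of one latent construct would give the per-construct values of the kind reported in table 13.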

The results in table 13 indicate that all latent constructs have Cronbach’s alpha coefficients higher than the cut-off value of 0.7. Moreover, the Cronbach’s alpha of the whole dataset is well above the threshold of 0.7. Thus, based on the preliminary test of Cronbach’s alpha, all the latent constructs and their indicators were included in the CFA.

Following the preliminary test of reliability by means of Cronbach’s alpha, CFA was conducted to assess composite, convergent and discriminant validity along with construct reliability (Gerbing and Anderson, 1988), as well as the overall model fit. Each indicator was assigned a priori to the latent construct it measures, and all the latent constructs were allowed to correlate, as there were no grounds for assuming that they are uncorrelated. The output for the measurement model after the initial CFA is included in Appendix 2.

Measurement model fit assessment shows how well the observed data fit the theoretical framework developed at earlier stages. The overall fit of the measurement model was assessed by means of several indices to gain a better understanding of the goodness-of-fit. The rule of thumb suggests relying on at least one absolute fit index and one incremental fit index besides the traditional χ2 results (Hair et al., 2010). Table 14 compares the fit indices expected for a good fit with the obtained ones.

Table 14. Initial CFA. Model fit assessment

Index       Expected                       Obtained
χ2 normed   < 2.0: good fit                1.680 (good)
            2.0-5.0: acceptable fit
CFI         > 0.95: great                  .804 (sometimes acceptable)
            > 0.90: moderate
            > 0.80: sometimes acceptable
RMSEA       < .05: good                    .106 (bad)
            0.05-0.10: moderate
            > 0.10: bad
Source: (Hair et al., 2010; Van Dijk, 2016)
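The thresholds of table 14 can be written as a small lookup, which makes it easy to compare the initial CFA with the revised CFA reported in table 19. This is only a sketch of the rules of thumb listed in the table; the label "poor" for values outside the listed bands is an assumption, since the table does not name that case.

```python
def classify_normed_chi2(value):
    """Normed chi-square (chi-square divided by degrees of freedom)."""
    if value < 2.0:
        return "good"
    if value <= 5.0:
        return "acceptable"
    return "poor"  # assumed label, not listed in the table

def classify_cfi(value):
    if value > 0.95:
        return "great"
    if value > 0.90:
        return "moderate"
    if value > 0.80:
        return "sometimes acceptable"
    return "poor"  # assumed label, not listed in the table

def classify_rmsea(value):
    if value < 0.05:
        return "good"
    if value <= 0.10:
        return "moderate"
    return "bad"

def assess(normed_chi2, cfi, rmsea):
    return (classify_normed_chi2(normed_chi2), classify_cfi(cfi), classify_rmsea(rmsea))

initial_cfa = assess(1.680, 0.804, 0.106)  # obtained values in table 14
revised_cfa = assess(1.355, 0.907, 0.077)  # obtained values in table 19
```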

To find areas where the measurement model can be improved, construct validity is assessed along with modification indices. We start the analysis of construct validity with convergent validity (factor loadings should be greater than 0.5, preferably higher than 0.7). The latent constructs CA and OP had some indicators with low loadings (<0.5) on their respective constructs, which could cause problems for the fit of the structural model, taking into account the relatively small sample size. Table 15 contains the factor loadings produced by the initial CFA.

Table 15. Initial CFA. Factor loadings

Construct → Indicator                     Regression parameter
                                          estimate (Beta)
SCCD → Goal congruence                    .637
SCCD → Knowledge creation                 .698
SCCD → Collaborative communication        .646
SCCD → Resource sharing                   .689
SCCD → Incentive alignment                .867
SCCD → Decision synchronization           .866
SCCD → Information sharing                .695
CA → Innovation                           .657
CA → Offering flexibility                 .794
CA → Process efficiency                   .699
CA → Business synergy                     .487
FP → Sales growth                         .736
FP → Satisfaction with collaboration      .899
FP → Market share                         .761
FP → ROI                                  .540
FP → Consumer satisfaction                .790
OP → On-time delivery                     .866
OP → Order fulfillment lead time          .948
OP → Inventory turn                       .563
OP → Total logistics costs                .402
OP → Stock-outs                           .153

As the results in table 15 show, several indicators had low factor loadings. In particular, the OP indicator stock-outs had an extremely low loading (.153), which could be problematic in further analysis; hence this indicator was regarded as a potential candidate for removal. Some other indicators also had loadings lower than the cut-off value of 0.5, in particular business synergy (.487) and total logistics costs (.402). Rather than automatically eliminating such indicators, researchers should carefully examine the effects of item removal on the composite reliability, as well as on the construct’s content validity.

The results of the CFA served as input for the composite reliability test, as well as for the convergent and discriminant validity tests. In particular, composite reliability (CR), average variance extracted (AVE), maximum shared variance (MSV) and average shared variance (ASV) were computed. The threshold values for these tests are provided in table 16.

Table 16. Reliability and validity threshold values

Reliability and validity tests   Cut-off value
Composite reliability            > 0.70
Convergent validity              CR > AVE
                                 AVE > 0.50
Discriminant validity            MSV < AVE
                                 ASV < AVE
Source: (Hair et al., 2010; Van Dijk, 2016)

In order to calculate the reliability and validity tests, the correlation table and the standardized regression weight table of the initial CFA, including all the latent constructs, were used as input. The results were calculated by means of an Excel macro (Gaskin, 2014). Table 17 summarizes the outcomes of the reliability and validity tests.

Table 17. Reliability and validity test results after initial CFA

       CR      AVE     MSV     FP      SCCD    CA      OP
FP     0.866   0.569   0.287   0.754
SCCD   0.889   0.539   0.116   0.115   0.734
CA     0.759   0.447   0.308   0.536   0.340   0.669
OP     0.751   0.430   0.308   0.475   0.180   0.555   0.656
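The CR and AVE columns of table 17 can be cross-checked against the standardized loadings of table 15. The sketch below uses the common textbook formulas, AVE as the mean squared loading and CR as (sum of loadings)^2 divided by (sum of loadings)^2 plus the summed error variances 1 - loading^2. It is an assumption that the Excel macro used in the study applies exactly these formulas, although the values agree to three decimals.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR from standardized loadings, with error variance 1 - loading^2 per indicator."""
    total = sum(loadings)
    error = sum(1.0 - l * l for l in loadings)
    return total ** 2 / (total ** 2 + error)

# Standardized loadings from table 15 (initial CFA).
loadings = {
    "SCCD": [0.637, 0.698, 0.646, 0.689, 0.867, 0.866, 0.695],
    "CA": [0.657, 0.794, 0.699, 0.487],
    "FP": [0.736, 0.899, 0.761, 0.540, 0.790],
    "OP": [0.866, 0.948, 0.563, 0.402, 0.153],
}
results = {
    name: (composite_reliability(vals), average_variance_extracted(vals))
    for name, vals in loadings.items()
}
# Convergent validity requires AVE > 0.50 (table 16).
convergent_ok = {name: ave > 0.50 for name, (_, ave) in results.items()}
```

The square root of each AVE also reproduces the diagonal entries of the correlation block in table 17, for example sqrt(0.569) for FP.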

As a result of testing reliability and validity, the latent constructs OP and CA showed convergent validity issues (AVE < 0.5), which means that the indicators of these latent constructs do not correlate well with each other. The problem could lie in the low factor loadings of the indicators mentioned previously: stock-outs (.153), business synergy (.487) and total logistics costs (.402). After removing the indicator with the lowest loading on the construct OP, that is, stock-outs, a new reliability and validity analysis was run to determine whether the threshold values were met. The results of the new reliability and validity tests are presented in table 18.

Table 18. Reliability and validity test results after revised CFA

       CR      AVE     MSV     FP      SCCD    CA      OP
FP     0.866   0.569   0.287   0.754
SCCD   0.889   0.539   0.116   0.115   0.734
CA     0.758   0.446   0.304   0.536   0.340   0.668
OP     0.803   0.530   0.304   0.468   0.178   0.551   0.728
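The MSV column and the discriminant validity criterion MSV < AVE can be checked directly from the inter-construct correlations in table 18. The sketch below takes MSV as the largest squared correlation of a construct with any other construct, which is the usual definition; the variable names are illustrative.

```python
# Inter-construct correlations and AVE values from table 18 (revised CFA).
correlations = {
    ("FP", "SCCD"): 0.115,
    ("FP", "CA"): 0.536,
    ("SCCD", "CA"): 0.340,
    ("FP", "OP"): 0.468,
    ("SCCD", "OP"): 0.178,
    ("CA", "OP"): 0.551,
}
ave = {"FP": 0.569, "SCCD": 0.539, "CA": 0.446, "OP": 0.530}

def max_shared_variance(construct, correlations):
    """Largest squared correlation between `construct` and any other construct."""
    return max(r * r for pair, r in correlations.items() if construct in pair)

msv = {name: max_shared_variance(name, correlations) for name in ave}
# Discriminant validity criterion from table 16: MSV < AVE.
discriminant_ok = {name: msv[name] < ave[name] for name in ave}
```

All four constructs satisfy MSV < AVE here, so the remaining issue for CA is convergent validity (AVE below 0.50), not discriminant validity.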

According to the reliability and validity test results, the convergent validity of the construct OP improved and reached the threshold value of 0.5; however, this was still not true for the construct CA. Nevertheless, following van Dijk (2016), in the spirit of the study and because only one indicator had low reliability and its effect on model fit was small, all items of the latent construct CA were included in the model, despite the low loadings of some of them. The revised confirmatory analysis can be found in Appendix 3. Table 19 provides the model fit indices after the revised confirmatory factor analysis. The model fit indicators were found to be acceptable for further analysis.

Table 19. Model fit indicators after revised CFA

Index       Expected*                      Obtained
χ2 normed   < 2.0: good fit                1.355 (good)
            2.0-5.0: acceptable fit
CFI         > 0.95: great                  .907 (moderate)
            > 0.90: moderate
            > 0.80: sometimes acceptable
RMSEA       < .05: good                    .077 (moderate)
            0.05-0.10: moderate
            > 0.10: bad

4.4. Test of Common Method Bias

The revised CFA was further used to test for common method bias by means of a common latent factor (CLF), which captures the common variance among all observed variables in the measurement model. The standardized regression weights from the model with the CLF were then compared with the standardized regression weights of the measurement model without the CLF. The measurement model with the CLF is illustrated in Appendix 4. The CLF should be retained and moved to the structural model if there are differences greater than 0.2 between the standardized regression weights of the two models. The results of the comparison are presented in table 20. As table 20 demonstrates, no difference between the standardized regression weights of the model with the CLF and the measurement model without the CLF was greater than the cut-off value of 0.2; hence, the measurement model without the CLF was carried over to the structural model.

4.5. Structural Equation Model of Supply Chain Collaboration

After the CFA was conducted and the measurement model was accepted, the structural model can be put forward for analysis by means of SEM. SEM represents a combination of linear equations that are used to test causal relationships between latent constructs (Hair et al., 2010). Ultimately, SEM is used to identify the extent to which the theoretically developed model fits the data observed in the sample. The main difference between CFA and SEM is that in SEM the focus shifts to the relationships between latent constructs rather than the relationships between indicators and latent constructs. We used the measurement model without the CLF to build the structural equation model that is illustrated in Appendix 5. Table 21 provides the model fit indices for the structural model.
The results of the structural equation model showed that the latent construct SCCD had a significant positive effect on the latent construct CA (.408*). The latent construct CA had a significant positive influence on the latent construct OP (.520**) and the latent construct FP (.389*). It is also interesting to note that the control variable firm size had a significant negative effect on the latent construct CA (-.419*). No significant direct effects were observed for the relationship between SCCD and OP (.083) or between SCCD and FP (-.038). In addition, the relationship between OP and FP was also insignificant (.253). Table 22 presents the standardized regression weights of the structural model.

Table 20. Comparison of standardized regression weights of the model with CLF and the model without CLF

Relationship                              Estimate        Estimate     Difference
                                          (without CLF)   (with CLF)
SCCD → Goal congruence                    0.996           1            0.004
SCCD → Knowledge creation                 0.528           0.499        -0.029
SCCD → Collaborative communication        0.34            0.291        -0.049
SCCD → Resource sharing                   0.444           0.398        -0.046
SCCD → Incentive alignment                0.683           0.642        -0.041
SCCD → Decision synchronization           0.619           0.559        -0.06
SCCD → Information sharing                0.353           0.299        -0.054
CA → Innovation                           0.496           0.564        0.068
CA → Offering flexibility                 0.84            0.864        0.024
CA → Process efficiency                   0.721           0.724        0.003
CA → Business synergy                     0.265           0.363        0.098
FP → Sales growth                         0.703           0.703        0
FP → Satisfaction with collaboration      0.936           0.939        0.003
FP → Market share                         0.743           0.739        -0.004
FP → ROI                                  0.624           0.649        0.025
FP → Consumer satisfaction                0.775           0.772        -0.003
OP → On-time delivery                     0.851           0.861        0.01
OP → Order fulfillment lead time          0.939           0.958        0.019
OP → Inventory turn                       0.558           0.551        -0.007
OP → Total logistics costs                0.417           0.398        -0.019
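The decision rule for the common latent factor can be reproduced from the two estimate columns of table 20. The sketch below recomputes each difference as the estimate with the CLF minus the estimate without it and applies the 0.2 cut-off; the dictionary keys are shortened labels introduced for illustration.

```python
CUT_OFF = 0.2  # retain the CLF if any absolute difference exceeds this value

# (estimate without CLF, estimate with CLF) from table 20.
weights = {
    "SCCD -> Goal congruence": (0.996, 1.0),
    "SCCD -> Knowledge creation": (0.528, 0.499),
    "SCCD -> Collaborative communication": (0.34, 0.291),
    "SCCD -> Resource sharing": (0.444, 0.398),
    "SCCD -> Incentive alignment": (0.683, 0.642),
    "SCCD -> Decision synchronization": (0.619, 0.559),
    "SCCD -> Information sharing": (0.353, 0.299),
    "CA -> Innovation": (0.496, 0.564),
    "CA -> Offering flexibility": (0.84, 0.864),
    "CA -> Process efficiency": (0.721, 0.724),
    "CA -> Business synergy": (0.265, 0.363),
    "FP -> Sales growth": (0.703, 0.703),
    "FP -> Satisfaction with collaboration": (0.936, 0.939),
    "FP -> Market share": (0.743, 0.739),
    "FP -> ROI": (0.624, 0.649),
    "FP -> Consumer satisfaction": (0.775, 0.772),
    "OP -> On-time delivery": (0.851, 0.861),
    "OP -> Order fulfillment lead time": (0.939, 0.958),
    "OP -> Inventory turn": (0.558, 0.551),
    "OP -> Total logistics costs": (0.417, 0.398),
}

differences = {
    name: round(with_clf - without_clf, 3)
    for name, (without_clf, with_clf) in weights.items()
}
largest = max(abs(d) for d in differences.values())
retain_clf = largest > CUT_OFF
```

The largest absolute difference belongs to the business synergy indicator and is well under the cut-off, which matches the decision to proceed without the CLF.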

Table 21. Structural model fit assessment

Index       Expected*                      Obtained
χ2 normed   < 2.0: good fit                1.355 (good)
            2.0-5.0: acceptable fit
CFI         > 0.95: great                  .907 (moderate)
            > 0.90: moderate
            > 0.80: sometimes acceptable
RMSEA       < .05: good                    .077 (moderate)
            0.05-0.10: moderate
            > 0.10: bad

Table 22. Standardized regression weights of the structural model

Relationship                              Regression parameter   P
                                          estimate (Beta)
SCCD → CA                                 .408*                  .037
Firm Size → CA                            -.419*                 .010
CA → OP                                   .520**                 .009
SCCD → OP                                 .083                   .534
CA → FP                                   .389*                  .028
SCCD → FP                                 -.038                  .762
OP → FP                                   .253                   .123
SCCD → Goal congruence                    .615
SCCD → Knowledge creation                 .594***                ***
SCCD → Collaborative communication        .637***                ***
SCCD → Resource sharing                   .670***                ***
SCCD → Incentive alignment                .817***                ***
SCCD → Decision synchronization           .923***                ***
SCCD → Information sharing                .692***                ***
CA → Innovation                           .560
CA → Offering flexibility                 .911***                ***
CA → Process efficiency                   .674***                ***
CA → Business synergy                     .355**                 .003
FP → Sales growth                         .703***                ***
FP → Satisfaction with collaboration      .944***                ***
FP → Market share                         .737***                ***
FP → ROI                                  .642***                ***
FP → Consumer satisfaction                .771
OP → On-time delivery                     .859***                ***
OP → Order fulfillment lead time          .961***                ***
OP → Inventory turn                       .548
OP → Total logistics costs                .397**                 .007

4.6. Mediation effect of Collaborative Advantage

According to the previously developed conceptual hypotheses framework, the latent construct CA is expected to positively mediate the relationship between the latent constructs SCCD and OP and between SCCD and FP. Hence, the mediation analysis was conducted in SPSS Amos 24. There are several methods to test mediation relationships, such as Sobel’s test (1982) and the Baron and Kenny approach (1986), which are regarded as the more traditional ones. Both of these methods have low power compared to more modern approaches and are typically no longer recommended (e.g., MacKinnon et al., 2002; Biesanz, Falk, & Savalei, 2010). One of the most preferred methods currently is bootstrapping, a resampling method that is used to build a confidence interval for the indirect effect (Preacher & Hayes, 2004). One of the main advantages of the bootstrapping method is that it does not rely on the assumption of normality and can therefore be used for small sample sizes (Preacher & Hayes, 2004), which is the case in this research. Our mediation analysis was performed with 2000 bootstrap replications. To infer the observed significance level of the effects, nonparametric bootstrap bias-corrected confidence intervals were used. The results of the mediation analysis are presented in table 23.

Table 23. Indirect effect of SCCD through CA on OP and FP

Path               Estimate   P-value   Lower   Upper
SCCD → CA → OP     .212       .002      .031    .452
SCCD → CA → FP     .159       .067      -.006   .581
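The point estimates in table 23 are consistent with the usual product-of-coefficients definition of an indirect effect, applied to the standardized paths of table 22. The short check below illustrates this; it is an observation about the reported numbers, not a description of how Amos computed them internally.

```python
# Standardized structural paths from table 22.
paths = {
    ("SCCD", "CA"): 0.408,
    ("CA", "OP"): 0.520,
    ("CA", "FP"): 0.389,
}

def indirect_effect(paths, source, mediator, target):
    """Indirect effect as the product of the two paths through the mediator."""
    return paths[(source, mediator)] * paths[(mediator, target)]

indirect_op = indirect_effect(paths, "SCCD", "CA", "OP")
indirect_fp = indirect_effect(paths, "SCCD", "CA", "FP")
```

Rounded to three decimals, the two products equal the estimates .212 and .159 reported in table 23.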

The indirect effect of SCCD through the mediation variable CA on OP was positive and significant (.212**). The last two columns in table 23 show the lower and upper limits of the 95% confidence intervals. These values correspond to the 2.5th and 97.5th percentiles of the rank-ordered estimates of the indirect effect derived from the 2,000 bootstrap samples. Since zero does not fall within the confidence interval, which ranges from .031 to .452, we can conclude that there is a significant mediation effect. Thus, it can be stated that collaborative advantage positively mediates the relationship between supply chain collaboration dimensions and the operational performance of the firm.
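The resampling logic behind such an interval can be sketched in a few lines. The example below is deliberately simplified: it uses a plain percentile bootstrap rather than the bias-corrected variant used in the study, ordinary least squares on observed variables rather than a latent-variable model in Amos, and synthetic data generated with a fixed seed. All function and variable names are illustrative assumptions.

```python
import random

def slope(x, y):
    """OLS slope of y on x (path a when y is the mediator)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def partial_slope(m, x, y):
    """OLS coefficient of m when y is regressed on both m and x (path b)."""
    n = len(y)
    mean_m, mean_x, mean_y = sum(m) / n, sum(x) / n, sum(y) / n
    dm = [v - mean_m for v in m]
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    s_mm = sum(a * a for a in dm)
    s_xx = sum(a * a for a in dx)
    s_mx = sum(a * b for a, b in zip(dm, dx))
    s_my = sum(a * b for a, b in zip(dm, dy))
    s_xy = sum(a * b for a, b in zip(dx, dy))
    return (s_my * s_xx - s_mx * s_xy) / (s_mm * s_xx - s_mx ** 2)

def indirect(x, m, y):
    """Indirect effect a * b of x on y through m."""
    return slope(x, m) * partial_slope(m, x, y)

def percentile_bootstrap_ci(x, m, y, replications=2000, level=0.95, seed=0):
    """Percentile confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(replications):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect([x[i] for i in idx],
                                  [m[i] for i in idx],
                                  [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * replications)]
    upper = estimates[int((1 + level) / 2 * replications) - 1]
    return lower, upper

# Synthetic data with a built-in mediated effect (not the survey data).
gen = random.Random(42)
x = [gen.gauss(0, 1) for _ in range(200)]
m = [0.6 * xi + gen.gauss(0, 1) for xi in x]
y = [0.6 * mi + 0.1 * xi + gen.gauss(0, 1) for xi, mi in zip(x, m)]

point = indirect(x, m, y)
lower, upper = percentile_bootstrap_ci(x, m, y)
mediation_supported = not (lower <= 0.0 <= upper)
```

As in the text, mediation is taken to be supported when the interval excludes zero. A bias-corrected interval would additionally shift the two percentile cut points according to the share of bootstrap estimates that fall below the point estimate.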

The indirect effect of SCCD through the mediation variable CA on FP was positive but not significant (.159). Moreover, the confidence interval in this case does include zero, which means that CA does not mediate the relationship between SCCD and FP. In this case, we can propose that collaborative advantage forms a sustainable advantage or superiority in operating activities.

To sum up the analysis of the mediation effect: supply chain collaboration dimensions influence the operational performance of the firms involved in the distribution network, which improves the performance of the entire supply chain and, as a result, the performance of the individual firm.

5. Empirical Findings and Managerial Implications

This study set out to empirically test the relationship between supply chain collaboration dimensions, collaborative advantage and operational and firm performance by means of structural equation modeling. The measurement model developed in this research was based on the conceptual SCC hypotheses framework, adapted from previous research (Cao et al., 2011; van Dijk, 2016). The final measurement model was transformed into the final structural equation model, which was used for the mediation analysis of the construct CA to test the hypotheses formulated in the conceptual SCC framework. As a result, the structural model, presented in Figure 5, showed that the latent construct SCCD had a significant positive effect on the latent construct CA (.408*). The latent construct CA had a significant positive influence on the latent construct OP (.520**) and the latent construct FP (.389*). Similar results were obtained by Cao and Zhang (2011): supply chain collaboration had a significant positive direct impact on collaborative advantage (.640**). At the same time, Cao and Zhang (2011) considered only firm performance as a latent construct, without including operational performance as a separate latent construct. Their result was also significant (.500**), which indicated that collaborative advantage had a significant positive direct effect on firm performance.

Fig. 5. SEM full model results of conceptual SCC hypotheses framework. Source: Author’s own

The results obtained by van Dijk (2016) are similar to those achieved by Cao and Zhang (2011) and in this study. There were significant positive effects of dimensions of collaboration on operational performance (.472**), dimensions of collaboration on collaborative advantage (.651***), collaborative advantage on firm performance (.429***) and operational performance on firm performance (.579***).

As for other results obtained in this research, no significant direct effects were observed for the relationship between SCCD and OP (.083) or between SCCD and FP (-.038). In addition, the relationship between OP and FP was also insignificant (.253). It is noteworthy that van Dijk (2016) obtained a comparable result for the relationship between SCCD and FP, which was negative (-.180*), as it was in this research (-.038). This negative effect can be explained by increased costs due to the waste of resources required for collaboration without first achieving collaborative advantages.

Besides that, it is interesting to note that in this research the control variable firm size had a significant negative effect on the latent construct CA (-.419*). This means that there is an inverse relationship between firm size and collaborative advantage. The reason for this relationship is that smaller firms gain more advantages relative to their size than larger firms. In the context of the examined distribution network, we have examined a variety of firms, ranging from small companies (50-100 FTEs) to larger ones (more than 1000 FTEs). For small firms, cooperation with a large distributor provides opportunities to increase market share by leveraging the distributor's resources and advantages. In contrast, larger firms are more competitive and have their own advantages that are no worse than the distributor's; hence, they do not aim to cooperate and access the distributor's resources.

It can be concluded that the different dimensions of supply chain collaboration had a significant positive effect on realizing and achieving collaborative advantages. Moreover, as a result of the mediation analysis, the positive and significant indirect effect (.212**) of SCCD through the mediation variable CA on OP was established. Therefore, improvement in operational performance can be achieved by first obtaining collaborative advantages, in particular, offering flexibility, process efficiency, innovation and business synergy, which, in turn, are achieved by practicing SCC dimensions. The relationship implies that, in order for a supply chain as a whole to perform well, firms should try to create a win–win situation in which all participants collaborate to achieve business synergy and compete with other chains. According to Cao and Zhang (2011), competitive intentions generally make individual firms promote their own interests at the expense of others, which is very insidious for collaboration and can worsen or destroy the relationships. Long-term relationships such as supply chain collaboration have to be motivated by the mutuality of intent, goal congruence, and benefit sharing (Wong, 1999; Tuten and Urban, 2001). Thus, managers need to align goals and benefits with supply chain partners in order to create collaborative advantage. Such collaborative advantage indeed directly increases the performance of each partner in the chain. In addition, no significant mediation effect between SCCD and FP through CA was established. However, the direct relationships between SCCD and CA (.408*) and between CA and FP (.389*) were significant.
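As a consistency check, and assuming the reported values are standardized coefficients taken from the same structural model, each indirect effect in table 23 should be close to the product of the two direct paths it runs through (SCCD → CA at .408, CA → OP at .520, and CA → FP at .389):

```latex
\begin{align*}
\text{SCCD} \to \text{CA} \to \text{OP}: &\quad .408 \times .520 = .21216 \approx .212,\\
\text{SCCD} \to \text{CA} \to \text{FP}: &\quad .408 \times .389 = .158712 \approx .159.
\end{align*}
```

Both products agree with the estimates in table 23 to three decimals, so the two indirect effects have a similar size and differ mainly in the width of their bootstrap confidence intervals.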

Our study found that effective supply chain collaboration leads to better operational performance through collaborative advantage. In addition, the results empirically confirm that supply chain collaborative advantage directly improves operational performance and firm performance. Whereas much of the previous research focused on the direct relationship between collaboration and performance (Duffy and Fearne, 2004; Stank et al., 2001; Tan et al., 1998), our study, following Cao and Zhang (2011) and van Dijk (2016), considers collaborative advantage as an intermediate variable. Thus, we imply that the improvement of firm performance should be realized through the achievement of collaborative advantages first. As the empirical results of this study show, the main instrument for obtaining collaborative advantages is the dimensions of supply chain collaboration. Under the conditions of growing uncertainty in the business environment and increasing competition, decision synchronization (.923***), incentive alignment (.817***) and information sharing (.692***) come to the forefront. Practicing these collaborative dimensions allows firms to improve process visibility and reduce the level of uncertainty in decision-making.

There are different definitions and measures of collaborative advantages, which can help managers to improve shared supply chain processes and achieve benefits for all members. However, this study, consistent with the research by Cao and Zhang (2011) and van Dijk (2016), confirms that the use of such collaborative advantages as offering flexibility, process efficiency, innovation and business synergy is the most efficient.

The empirical findings showed that collaboration in the areas of inventory management, supply chain design and promotion had the most positive significant effect on several firm performance indicators, namely: satisfaction with collaboration, consumer satisfaction and market share growth. Since the term collaboration cannot be considered apart from operational activity, most collaboration areas are related to operational functions, not only to strategic management. Consequently, the effect of collaboration areas on operational performance is much higher than on firm performance. Nevertheless, the operational performance has a significant effect on firm performance.

In conclusion, the main contribution of this research is that, in line with the research by Cao and Zhang (2011) and van Dijk (2016), our study found that the performance of firms practicing collaboration in the supply chain can be improved by obtaining collaborative advantages first. Moreover, unlike other studies, our research explains why small firms tend to collaborate more than larger ones in the context of a distribution network encompassing mainly Russian firms operating in one industry. Taking into account that each industry has its specific features, future research should be aimed at studying networks of firms operating in one industry in order to understand the links and principles of collaboration more deeply.

6. Appendices

Appendix 1. Path Diagram of Collaboration Areas

Appendix 2. Results of Initial Confirmatory Factor Analysis

Appendix 3. Results of Revised Confirmatory Factor Analysis

Appendix 4. Results of Confirmatory Factor Analysis with a Common Latent Factor

Appendix 5. Results of the Structural Equation Model of Supply Chain Collaboration

References

Anderson, J. C. and Gerbing, D. W. (1988). Structural equation modeling in practice: a review and recommended two-step approach. Psychological Bulletin, 103(3), 411–423.
Arshinder, K. A. and Deshmukh, S. G. (2008). Supply chain coordination: perspectives, empirical studies and research directions. International Journal of Production Economics, 115, 316–35.

Bagchi, P., Ha, B., Skjoett-Larsen, T. and Soerensen, L. (2005). Supply chain integration: a European survey. The International Journal of Logistics Management, 16(2), 275–294.
Baron, R. M. and Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Beamon, B. M. (1998). Supply chain design and analysis: models and methods. International Journal of Production Economics, 55(3), 281–294.
Beamon, B. M. (1999). Measuring supply chain performance. International Journal of Operations & Production Management, 19(3/4), 275–92.
Biesanz, J. C., Falk, C. F. and Savalei, V. (2010). Assessing mediational models: Testing and interval estimation for indirect effects. Multivariate Behavioral Research, 45, 661–701.
Bilgen, B. and Ozkarahan, I. (2004). Strategic tactical and operational production-distribution models: a review. International Journal of Technology Management, 28(2), 151–171.
Bititci, U., Suwignjo, P. and Carrie, A. (2000). Quantitative Models for Performance Measurement System. International Journal of Production Economics, 63(1-3), 231–241.
Blackhurst, J., Craighead, C. W. and Handfield, R. B. (2006). Towards supply chain collaboration: an operations audit of VMI initiatives in the electronics industry. International Journal of Integrated Supply Management, 2(1/2), 91–105.
Bulgakova, M. A. and Petrosyan, L. A. (2015). Cooperative network games with pairwise interactions. In: Matematicheskaya Teoriya Igr i Ee Prilozheniya, Vol. 7, Iss. 4, pp. 7–18. Faculty of applied mathematics and control processes St. Petersburg State University: St. Petersburg (in Russian).
Cao, M. and Zhang, Q. (2011). Supply chain collaboration: Impact on collaborative advantage and firm performance. Journal of Operations Management, 29(3), 163–180.
Cao, M., Vonderembse, M., Zhang, Q. and Ragu-Nathan, T. (2010). Supply chain collaboration: conceptualisation and instrument development. International Journal of Production Research, 48(22), 6613–6635.
Cassivi, L. (2006). Collaboration planning in a supply chain. Supply Chain Management: An International Journal, 11(3), 249–258.
Cheung, M., Myers, M. and Mentzer, J. (2011). The value of relational learning in global buyer-supplier exchanges: a dyadic perspective and test of the pie-sharing premise. Strategic Management Journal, 32, 1061–1082.
Cooper, D. R. and Schindler, P. S. (2006). Marketing research. New York: McGraw-Hill/Irwin.
Davenport, T. H., Harris, J. G., De Long, D. W. and Jacobson, A. L. (2001). Data to knowledge to results: building an analytic capability. California Management Review, 43(2), 117–39.
De Toni, A. and Tonchia, S. (2001). Performance measurement systems: models, characteristics, and measures. International Journal of Operations & Production Management, 21(2), 46–70.
Deakins, E., Dorling, K. and Scott, J. (2008). Determinants of successful vendor managed inventory practice in oligopoly industries. International Journal of Integrated Supply Management, 3(3/4), 355–377.
Demirkan, H. (2005). Generating design activities through sketches in multiagent systems. Automation in Construction, 14(6), 699–706.
Derrouiche, R., Neubert, G. and Bouras, A. (2008). Supply chain management: a framework to characterize the collaborative strategies. International Journal of Computer Integrated Manufacturing, 21(4), 426–439.
Duffy, R. and Fearne, A. (2004). The impact of supply chain partnerships on supplier performance. International Journal of Logistics Management, 15(1), 57–71.

Dyer, J. H. and Singh, H. (1998). The relational view: cooperative strategy and sources of interorganizational competitive advantage. Academy of Management Review, 23(4), 660–79.
Ellram, L. M. and Cooper, M. C. (1990). Supply chain management, partnership and the shipper – Third party relationship. The International Journal of Logistics Management, 1(2), 1–10.
Emberson, C. and Storey, J. (2006). Buyer-supplier collaborative relationships: beyond the normative accounts. Journal of Purchasing and Supply Management, 12(5), 236–245.
Fisher, M. L. (1997). What is the right supply chain for your product? Harvard Business Review, 75(2), 105–16.
Gomes, P. and Dahab, S. (2010). Bundling resources across supply chain dyads: The role of modularity and coordination capabilities. International Journal of Operations & Production Management, 30(1), 57–74.
Gunasekaran, A. and Kobu, B. (2007). Performance measures and metrics in logistics and supply chain management: a review of recent literature (1995-2004) for research and applications. International Journal of Production Research, 45(12), 2819–40.
Gunasekaran, A., Patel, C. and Tirtiroglu, E. (2001). Performance measures and metrics in a supply chain environment. International Journal of Operations & Production Management, 21(1-2), 71–87.
Hair, J. F., Black, W. C., Babin, B. J. and Anderson, R. E. (2010). Multivariate data analysis. New York: Prentice Hall.
Hall, D. C. and Saygin, C. (2012). Impact of information sharing on supply chain performance. International Journal of Advanced Manufacturing Technology, 58(1-4), 397–409.
Hardy, C., Phillips, N. and Lawrence, T. B. (2003). Resources, knowledge and influence: The organizational effects of interorganizational collaboration. Journal of Management Studies, 40(2), 321–347.
Heizer, J. H., Render, B. and Weiss, H. J. (2008). Principles of Operations Management. Pearson Prentice Hall, PA.
Hugos, M. (2011). Supply chain coordination. In M. Hugos (Ed.), Essentials of supply chain management (3rd ed., pp. 183–211). Hoboken, NJ: John Wiley & Sons, Inc.
Huiskonen, J. and Pirttila, T. (2002). Lateral coordination in a logistics outsourcing relationship. International Journal of Production Economics, 78(2), 177–185.
Ireland, R. K. and Crum, C. (2005). Supply Chain Collaboration: How to Implement CPFR and Other Best Collaborative Practices. J. Ross Publishing Inc., Florida.
Jap, S. D. (2001). Perspectives on joint competitive advantages in buyer-supplier relationships. International Journal of Research in Marketing, 18(1/2), 19–35.
Jap, S. D. and Anderson, E. (2003). Safeguarding interorganizational performance and continuity under ex post opportunism. Management Science, 49(12), 1684–1701.
Johnson, J. J. and Sohi, R. S. (2003). The development of interfirm partnering competence: Platforms for learning, learning activities and consequences of learning. Journal of Business Research, 56(9), 757–766.
Kanter, R. M. (1994). Collaborative advantage: the art of alliances. Harvard Business Review, 72(4), 96–108.
Kaufman, A., Wood, C. H. and Theyel, G. (2000). Collaboration and technology linkages: A strategic supplier typology. Strategic Management Journal, 21(6), 649–663.
Kaynak, H. (2003). The relationship between total quality management practices and their effects on firm performance. Journal of Operations Management, 21(4), 405–435.
Kim, S. W. (2009). An investigation on the direct and indirect effect of supply chain integration on firm performance. International Journal of Production Economics, 119(2), 328–346.
Kumar, G., Banerjee, R. N., Meena, P. L. and Ganguly, K. (2016). Collaborative culture and relationship strength roles in collaborative relationships: a supply chain perspective. Journal of Business & Industrial Marketing, 31(5), 587–599.

Lambert, D. and Pohlen, T. (2001). Supply chain metrics. International Journal of Logistics Management, 12(1), 1–19.
Lavie, D. (2006). The Competitive Advantage of Interconnected Firms: An Extension of the Resource-based View. The Academy of Management Review, 31(3), 638–658.
Lee, C. W., Kwon, I-W. G. and Severance, D. (2007). Relationship between supply chain performance and degree of linkage among supplier, internal integration, and customer. Supply Chain Management: An International Journal, 12(6), 444–452.
Lehoux, N., D’Amours, S. and Langevin, A. (2010). A win-win collaboration approach for a two-echelon supply chain: a case study in the pulp and paper industry. European Journal of Industrial Engineering, 4(4), 493–514.
Lejeune, N. and Yakova, N. (2005). On characterizing the 4 C’s in supply chain management. Journal of Operations Management, 23(1), 81–100.
Li, G., Lin, Y., Wang, S. and Yan, H. (2006). Enhancing agility by timely sharing of supply information. Supply Chain Management: An International Journal, 11(5), 425–435.
Littler, D., Leverick, F. and Bruce, M. (1995). Factors affecting the process of collaborative product development: A study of UK manufacturers of information and communications technology products. Journal of Product Innovation Management, 12(1), 16–32.
Luo, X., Slotegraaf, R. J. and Pan, X. (2006). Cross-functional coopetition: The simultaneous role of cooperation and competition within firms. Journal of Marketing, 70(2), 67–80.
MacKinnon, D. P., Lockwood, C. M., Hoffman, J. M., West, S. G. and Sheets, V. (2002). A comparison of methods to test mediation and other intervening variable effects. Psychological Methods, 7, 83–104.
Malhotra, N. and Birks, D. (2006). Marketing Research: An Applied Perspective.
Mangiaracina, R., Perego, A. and Song, G. (2012). A quantitative model to support strategic distribution network design. In Proceedings of the 2012 Logistics Research Network Annual Conference, Cranfield University, 1–8.
Manthou, V., Vlachopoulou, M. and Folinas, D. (2004). Virtual e-Chain (VeC) model for supply chain collaboration. International Journal of Production Economics, 87(3), 241–250.
Mehrjerdi, Y. Z. (2009). The collaborative supply chain. Assembly Automation, 29(2), 127–136.
Meixell, M. J. and Gargeya, V. B. (2005). Global supply chain design: a literature review and critique. Transportation Research Part E, 41(6), 531–550.
Melo, M. T., Nickel, S. and Saldanha-da-Gama, F. (2009). Facility location and supply chain management – a review. European Journal of Operational Research, 196(2), 401–412.
Min, S., Roath, A. S., Daugherty, P. J., Genchev, S. E., Chen, H., Arndt, A. D. and Richey, R. G. (2005). Supply chain collaboration: what’s happening? The International Journal of Logistics Management, 16(2), 237–256.
Mohr, J. and Spekman, R. E. (1994). Characteristics of partnership success: partnership attributes, communication behavior, and conflict resolution techniques. Strategic Management Journal, 15(2), 135–152.
Montoya-Torres, J. R. and Ortiz-Vargas, D. A. (2014). Collaboration and information sharing in dyadic supply chains: A literature review over the period 2000–2012. Estudios gerenciales, 30, 343–354.
Narayanan, V. G. and Raman, A. (2004). Aligning Incentives in Supply Chains. Harvard Business Review, 82(11), 94–102.
Naude, P. and Buttle, F. (2001). Assessing relationship quality. Industrial Marketing Management, 29(4), 351–361.
Nyaga, G., Whipple, J. and Lynch, D. (2010). Examining supply chain relationships: Do buyer and supplier perspectives on collaborative relationships differ? Journal of Operations Management, 28(2), 101–114.

Olorunniwo, F. O. and Li, X. (2010). Information sharing and collaboration practices in reverse logistics. Supply Chain Management: An International Journal, 15(6), 454–462.
Papakiriakopoulos, D. and Pramatari, K. (2010). Collaborative performance measurement in supply chain. Industrial Management & Data Systems, 110(9), 1297–1318.
Parilina, E. M. (2009). Cooperative data transmission game in wireless network. In: Matematicheskaya Teoriya Igr i Ee Prilozheniya, Vol. 1, Iss. 4, pp. 93–110. Faculty of applied mathematics and control processes St. Petersburg State University: St. Petersburg (in Russian).
Parilina, E. M. (2014). Strategic stability of one-point optimality principles in cooperative stochastic games. In: Matematicheskaya Teoriya Igr i Ee Prilozheniya, Vol. 6, Iss. 1, pp. 56–72. Faculty of applied mathematics and control processes St. Petersburg State University: St. Petersburg (in Russian).
Perry, M. and Sanderson, D. (1998). Coordinating joint design work: the role of communication and artefacts. Design Studies, 19(3), 273–88.
Preacher, K. J. and Hayes, A. F. (2004). SPSS and SAS procedures for estimating indirect effects in simple mediation models. Behavior Research Methods, Instruments, & Computers, 36, 717–731.
Ramanathan, U. and Gunasekaran, A. (2014). Supply chain collaboration: Impact of success in long-term partnerships. International Journal of Production Economics, 147, 252–259.
Sahin, F. and Robinson, E. P. (2005). Information sharing and coordination in make-to-order supply chains. Journal of Operations Management, 23(6), 579–598.
Sari, K. (2010). Exploring the impacts of radio frequency identification (RFID) technology on supply chain performance. European Journal of Operational Research, 207(1), 174–183.
Schreiner, M., Kale, P. and Corsten, D. (2009). What really is alliance management capability and how does it impact alliance outcomes and success? Strategic Management Journal, 30, 1395–1419.
Sheu, C., Yen, H. and Chae, D. (2006). Determinants of supplier–retailer collaboration: Evidence from an international study. International Journal of Operations and Production Management, 26(1), 24–49.
Simatupang, T. M. and Sridharan, R. (2002). The Collaborative Supply Chain. The International Journal of Logistics Management, 13(1), 15–30.
Simatupang, T. M. and Sridharan, R. (2008). Design for supply chain collaboration. Business Process Management Journal, 14(3), 401–418.
Simatupang, T. M. and Sridharan, R. (2004). A benchmarking scheme for supply chain collaboration. Benchmarking: An International Journal, 11(1), 9–30.
Simatupang, T. M. and Sridharan, R. (2005). An integrative framework for supply chain collaboration. The International Journal of Logistics Management, 16(2), 257–274.
Simatupang, T. M., Wright, A. C. and Sridharan, R. (2002). The knowledge of coordination for supply chain integration. Business Process Management Journal, 8(3), 289–308.
Singh, P. J. and Power, D. (2009). The nature and effectiveness of collaboration between firms, their customers and suppliers: a supply chain perspective. Supply Chain Management: An International Journal, 14(3), 189–200.
Skjoett-Larsen, T., Thernoe, C. and Andersen, C. (2003). Supply chain collaboration. International Journal of Physical Distribution and Logistics Management, 33(6), 531–549.
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. Sociological Methodology, 13, 290–312.
Song, H., Yu, K., Ganguly, A. and Turson, R. (2015). Supply chain network, information sharing and SME credit quality. Industrial Management & Data Systems, 116(4), 740–758.

Soosay, C. A. and Hyland, P. (2015). A decade of supply chain collaboration and directions for future research. Supply Chain Management: An International Journal, 20(6), 613–630.
Stank, T. P., Crum, M. R. and Arango, M. (1999). Benefits of interim coordination in food industry supply chain. Journal of Business Logistics, 20(2), 21–41.
Stank, T. P., Keller, S. B. and Daugherty, P. J. (2001). Supply chain collaboration and logistical service performance. Journal of Business Logistics, 22(1), 29–48.
Stock, J. R., Boyer, S. L. and Harmon, T. (2010). Research opportunities in supply chain management. Journal of the Academy of Marketing Science, 38(1), 32–41.
Sundram, V. P. K., Chandran, V. and Bhatti, M. A. (2015). Supply chain practices and performance: the indirect effects of supply chain integration. Benchmarking: An International Journal, 23(6), 1445–1471.
Tan, K. C., Kannan, V. R. and Handfield, R. B. (1998). Supply chain management: supplier performance and firm performance. International Journal of Purchasing and Materials Management, 34(3), 2–9.
Truong, H. Q., Sameiro, M., Fernandes, A. C., Sampaio, P., Duong, B. A. T., Duong, H. H. and Vilhenac, E. (2015). Supply chain management practices and firms’ operational performance. International Journal of Quality & Reliability Management, 34(2), 176–193.
Tuten, T. L. and Urban, D. J. (2001). An expanded model of business-to-business partnership foundation and success. Industrial Marketing Management, 30(2), 149–164.
Van Dijk, M. L. (2016). Cross-border collaboration in European-Russian supply chains: Integrative approach of provision on design, performance and impediments. Contributions to Game Theory and Management, IX, 118–169.
Van Hoek, R. (1998). Measuring the unmeasurable – measuring and improving performance in the supply chain. Supply Chain Management, 3(4), 187–92.
Vangen, S. and Huxham, C. (2003). Enacting Leadership for Collaborative Advantage: Dilemmas of Ideology and Pragmatism in the Activities of Partnership Managers. British Journal of Management, 14(1), 61–76.
Vidal, C. J. and Goetschalckx, M. (1997). Strategic production-distribution models: a critical review with emphasis on global supply chain model. European Journal of Operational Research, 98(1), 1–18.
Wadhwa, S. and Rao, K. S. (2003). Flexibility and Agility for enterprise synchronization: Knowledge and Innovation Management towards Flexagility. SIC Journal, 12(2), 111–128.
Wagner, S. M. and Buko, C. (2005). An empirical investigation of knowledge sharing in networks. Journal of Supply Chain Management, 41(4), 17–31.
Wang, T., Ma, X., Li, Z., Liu, Y., Xu, M. and Wang, Y. (2017). Profit distribution in collaborative multiple centers vehicle routing problem. Journal of Cleaner Production, 144, 203–219.
Wikforss, O. and Lofgren, A. (2007). Rethinking communication in construction. Electronic Journal of Information Technology in Construction, 12, 337–45.
Wong, A. (1999). Partnering through cooperative goals in supply chain relationships. Total Quality Management, 10(4/5), 786–792.
Xie, D., Wu, D., Luo, J. and Hu, X. (2010). A case study of multi-team communications in construction design under supply chain partnering. Supply Chain Management: An International Journal, 15(5), 363–370.
Yu, T-Y., Jacobs, M. A., Salisbury, W. D. and Enns, H. (2013). The effects of supply chain integration on customer satisfaction and financial performance: an organizational learning perspective. International Journal of Production Economics, 146(1), 346–358.
Zacharia, Z. G., Nix, N. W. and Lusch, R. F. (2011). Capabilities that enhance outcomes of an episodic supply chain collaboration. Journal of Operations Management, 29(6), 591–603.

Contributions to Game Theory and Management, X, 226–232

Blotto Games with Costly Winnings

Irit Nowik1 and Tahl Nowik2
1 Department of Industrial Engineering and Management, Lev Academic Center, P.O.B 16031, Jerusalem 9116001, Israel
E-mail: nowik@jct.ac.il
2 Department of Mathematics, Bar-Ilan University, Ramat-Gan 5290002, Israel
E-mail: tahl@math.biu.ac.il
URL: www.math.biu.ac.il/~tahl

Abstract We introduce a new variation of the stochastic asymmetric Colonel Blotto game, where the n battles occur as sequential stages of the game, and the winner of each stage needs to spend resources for maintaining his win. The limited resources of the players are thus needed both for increasing the probability of winning and for the maintenance costs. We show that if the initial resources of the players are not too small, then the game has a unique Nash equilibrium, and the given equilibrium strategies guarantee the given expected payoff for each player.

1. Introduction

We present a new n-stage game, which is a variation of the Colonel Blotto game. Each player starts the game with some given resource, and at the beginning of each stage he must decide how much resource to invest in that stage. A player wins the given stage with probability corresponding to the relative investments of the players, and if both players invest 0 then no player wins that stage. The winner of the stage receives a payoff which may differ from stage to stage. Since it is possible that certain stages will not be won by any player, this is not a fixed sum game. The players’ resources from which the investments are taken can be thought of as money, whereas the payoffs should be thought of as a quantity of different nature, such as political gain. The two quantities cannot be interchanged, that is, the payoff cannot be converted into resources for further investment.

The new feature of our game is the following. The winner of each stage is required to spend additional resources on the maintenance of his winning. This is a real life situation, where the winnings are some assets, and resources are required for their maintenance, as in wars, territorial contests among organisms, or in the political arena. The winner of a given stage must put aside all resources that will be required for future maintenance costs of the won asset. Thus, a fixed amount will be deducted from the resources of the winner immediately after winning, which should be thought of as the sum of all future maintenance costs for the given acquired asset.

At each stage the player thus needs to decide how much to invest in the given stage, where winning that stage on one hand leads to the payoff of the given stage, but on the other hand the maintenance cost for the given winning negatively affects the probabilities for future winnings. In the present work we show that if the initial resources of the players are not too small then the game has a unique Nash equilibrium, and each player guarantees the payoff of this Nash equilibrium (Theorem 2).

As mentioned, our game presents a variation of the well known Colonel Blotto game (Borel, 1921). In Blotto games two players simultaneously distribute forces across several battlefields. At each battlefield, the player that allocates the largest force wins. The Blotto game has been developed and generalized in many directions (see e.g., Borel, 1921; Friedman, 1958; Lake, 1979; Roberson, 2006; Hart, 2008; Duffy and Matros, 2015). Two main developments are the “asymmetric” and the “stochastic” models. The asymmetric version allows the payoffs of the battlefields to differ from each other, and in the stochastic model the deterministic rule deciding on the winner is replaced by a probabilistic one, by which the chances of winning a battlefield depend on the size of the investment. The present work adds a new feature which changes the nature of the game, in making the winnings costly. The players thus do not know beforehand how much of their resources will be available for investing in winning rather than in maintenance, and so the game cannot be formulated with simultaneous investments, as in the usual Blotto games, but rather must be formulated with sequential stages. At each stage the players need to decide how much to invest in the given stage, based on their remaining available resources and on the future fees and payoffs.

This work was inspired by previous work of the first author with S. Zamir and I. Segev (Nowik, 2009; Nowik et al., 2012) on a developmental competition that occurs in the nervous system, which we now describe. A muscle is composed of many muscle-fibers. At birth each muscle-fiber is innervated by several motor-neurons (MNs) that “compete” to singly innervate it. It has been found that MNs with higher activation thresholds win more competitions than MNs with lower activation thresholds. In Nowik (2009) this competitive process is modeled as a multi-stage game between two groups of players: those with lower and those with higher thresholds. At each stage a competition at the most active muscle-fiber is resolved. The strategy of a group is defined as the average activity level of its members and the payoff is defined as the sum of their wins. If a MN wins (i.e., singly innervates) a muscle-fiber, then from that stage on, it must continually devote resources for maintaining this muscle-fiber. Hence the MNs use their resources both for winning competitions and for maintaining previously acquired muscle-fibers. It is proved in Nowik (2009) that in such circumstances it is advantageous to win in later competitions rather than in earlier ones, since winning at a late stage will encounter less maintenance and thus will negatively affect only the few competitions that were not yet resolved. If $\mu$ is the cost of maintaining a win at each subsequent stage, then in the terminology of the present work, the fee paid by the MNs for winning the $k$th stage of an $n$-stage game is $(n-k)\mu$.

2. The game

The initial data for our game is the following.

1. The number $n$ of stages of the game.
2. Fixed payoffs $w_k > 0$, $1 \le k \le n$, to be received by the winner of the $k$th stage.
3. The initial resources $A, B \ge 0$ of players I, II respectively.
4. Fixed fees $c_k \ge 0$, $1 \le k \le n-1$, to be deducted from the resources of the winner after the $k$th stage.

The rules of the game are as follows. At the $k$th stage of the game, the two players, which we name PI, PII, each has some remaining resource $A_k, B_k$, where $A_1 = A$, $B_1 = B$. PI, PII each needs to decide his investment $x_k, y_k$ for that stage, respectively, with $0 \le x_k \le A_k - c_k$, $0 \le y_k \le B_k - c_k$, and where if $A_k <$ […] $> M$ then there is a unique Nash equilibrium for the game, and each player guarantees the value of this Nash equilibrium.

For $k = 1, \dots, n$ let $W_k = \sum_{i=k}^{n} w_i$ and $W = W_1$. We now show that if $A > M$, then if PI always chooses to invest $x_k \le \frac{w_k}{W_k} A_k$ (as holds for our strategy $\sigma_{n,A,B}$ presented in Definition 1 below), then whatever the random outcomes of the game are, his resources will not run out before the end of the game. We in fact give a specific lower bound on $A_k$ for every $k$, which will be used repeatedly in the sequel.

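Proposition 1. Let
$$M = W \cdot \max_{1 \le k \le n} \left( \frac{c_k}{w_k} + \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}} \right).$$
If $A > M$, and if PI plays $x_k \le \frac{w_k A_k}{W_k}$ for all $k$, then $A_k > \frac{W_k c_k}{w_k}$ for all $1 \le k \le n$. In particular $A_k > 0$ for all $1 \le k \le n$. And similarly for PII.

Proof. For every $1 \le k \le n$ we have $\frac{A}{W} > \frac{M}{W} \ge \frac{c_k}{w_k} + \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}}$, so
$$\frac{A}{W} - \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}} > \frac{c_k}{w_k}.$$
Thus it is enough to show that $\frac{A_k}{W_k} \ge \frac{A}{W} - \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}}$ for all $1 \le k \le n$. We show this by induction on $k$. For $k = 1$ the sum is empty and we get equality. Assuming
$$\frac{A_k}{W_k} \ge \frac{A}{W} - \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}}$$
we get
$$\frac{A_{k+1}}{W_{k+1}} \ge \frac{1}{W_{k+1}} \left( A_k - \frac{w_k A_k}{W_k} - c_k \right) = \frac{1}{W_{k+1}} \left( \frac{W_{k+1} A_k}{W_k} - c_k \right) = \frac{A_k}{W_k} - \frac{c_k}{W_{k+1}} \ge \frac{A}{W} - \sum_{i=1}^{k-1} \frac{c_i}{W_{i+1}} - \frac{c_k}{W_{k+1}} = \frac{A}{W} - \sum_{i=1}^{k} \frac{c_i}{W_{i+1}}.$$

3. Nash equilibrium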

We define the following two strategies $\sigma_{n,A,B}$ and $\tau_{n,A,B}$ for PI, PII respectively. We prove that for $A, B > M$, with $M$ as given in Proposition 1, this pair of strategies is the unique Nash equilibrium, and these strategies guarantee the given payoffs.

Definition 1. At the $k$th stage of the game, let
$$a_k = \frac{w_k A_k}{W_k} - \frac{A_k c_k}{A_k + B_k} \quad \text{and} \quad b_k = \frac{w_k B_k}{W_k} - \frac{B_k c_k}{A_k + B_k},$$
where as mentioned, we formally define $c_n = 0$. The strategy $\sigma_{n,A,B}$ for PI is the following: At the $k$th stage PI invests $a_k$ if it is allowed by the rules of the game. Otherwise he invests 0. The strategy $\tau_{n,A,B}$ for PII is similarly defined with $b_k$.

Recall that $a_k \ne 0$ is allowed by the rules of the game if $0 \le a_k \le A_k - c_k$, whereas $a_k = 0$ is always allowed, even when $A_k - c_k < 0$. We interpret the quantities $a_k, b_k$ as follows. PI first divides his remaining resource $A_k$ among the remaining stages in proportion to the payoff for each remaining stage, which gives $\frac{w_k}{W_k} A_k$. From this he subtracts $\frac{A_k}{A_k + B_k} c_k$, which is the expected fee he will pay for this stage, since $\frac{a_k}{a_k + b_k} = \frac{A_k}{A_k + B_k}$. Note that $W_n = w_n$ and formally $c_n = 0$, so $a_n = A_n$, $b_n = B_n$, i.e. at the last stage the two players invest all their remaining resources. Depending on $A$ and $B$ and on the random outcomes of the game, it may be that PI indeed reaches a stage where $a_k$ is not allowed. In this regard we make the following definition.

Definition 2. The triple $(n, A, B)$ is PI-effective if when PI and PII use $\sigma_{n,A,B}$ and $\tau_{n,A,B}$, then it is impossible that they reach a stage where $a_k$ is not allowed for PI. Similarly PII-effectiveness is defined for PII with $b_k$.

Proposition 2. Let $M$ be as in Proposition 1. If $A > M$ and $B$ is arbitrary, then $(n, A, B)$ is PI-effective. Furthermore, $a_k > 0$ for all $k$. And similarly for PII when $B > M$.

Proof. We need to show that necessarily $0 < a_k \le A_k - c_k$ for all $1 \le k \le n$. We have $a_k = \frac{w_k A_k}{W_k} - \frac{A_k c_k}{A_k + B_k}$, so by Proposition 1, $\frac{w_k}{W_k} > \frac{c_k}{A_k} \ge \frac{c_k}{A_k + B_k}$ and $A_k > 0$, so $\frac{w_k A_k}{W_k} > \frac{A_k c_k}{A_k + B_k}$, giving $a_k > 0$.

For the inequality $a_k \le A_k - c_k$ we first consider $k \le n - 1$. We have from the proof of Proposition 1 that $\frac{A_k}{W_k} - \frac{c_k}{W_{k+1}} \ge \frac{A}{W} - \sum_{i=1}^{k} \frac{c_i}{W_{i+1}} > \frac{c_{k+1}}{w_{k+1}} \ge 0$, so $\frac{A_k}{W_k} > \frac{c_k}{W_{k+1}}$, and so
$$\left(1 - \frac{A_k}{A_k + B_k}\right) c_k \le c_k < \frac{W_{k+1} A_k}{W_k} = \left(1 - \frac{w_k}{W_k}\right) A_k.$$
This gives $c_k - \frac{A_k c_k}{A_k + B_k} < A_k - \frac{w_k A_k}{W_k}$, so $a_k = \frac{w_k A_k}{W_k} - \frac{A_k c_k}{A_k + B_k} < A_k - c_k$. For $k = n$ we note that $c_n = 0$ by definition, and $W_n = w_n$, so $a_n = A_n = A_n - c_n$.
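The threshold $M$ of Proposition 1 and the investments of Definition 1 are easy to evaluate numerically. The following Python sketch is only an illustration under stated assumptions: the function names `threshold_M` and `play` and the sample data are hypothetical, and a stage is assumed to be won with probability proportional to the two investments, as this probability is used in the proof of Theorem 1. The sketch computes $M$ and simulates the game when both players follow $\sigma_{n,A,B}$ and $\tau_{n,A,B}$, asserting at every stage the bounds $0 < a_k \le A_k - c_k$ and $0 < b_k \le B_k - c_k$ that Proposition 2 gives for $A, B > M$.

```python
import random


def threshold_M(w, c):
    """M from Proposition 1; w = [w_1..w_n], c = [c_1..c_{n-1}], with c_n = 0."""
    n = len(w)
    c = list(c) + [0.0]                      # formally c_n = 0
    W = [sum(w[k:]) for k in range(n)]       # W[k] is W_{k+1} in the paper's indexing
    best, prefix = 0.0, 0.0                  # prefix = sum_{i<k} c_i / W_{i+1}
    for k in range(n):
        best = max(best, c[k] / w[k] + prefix)
        if k + 1 < n:
            prefix += c[k] / W[k + 1]
    return W[0] * best


def play(w, c, A, B, rng):
    """Simulate one game in which both players use the investments of Definition 1.

    Assumption of this sketch: a stage is won with probability proportional
    to the investments, and the winner pays the fee c_k after the stage.
    Returns the total payoff of PI.
    """
    n = len(w)
    c = list(c) + [0.0]
    payoff = 0.0
    for k in range(n):
        share = w[k] / sum(w[k:]) - c[k] / (A + B)
        a, b = A * share, B * share
        # Proposition 2: both investments are positive and allowed.
        assert 0 < a <= A - c[k] + 1e-12 and 0 < b <= B - c[k] + 1e-12
        if rng.random() < a / (a + b):       # PI wins stage k
            payoff += w[k]
            A, B = A - a - c[k], B - b
        else:                                # PII wins stage k
            A, B = A - a, B - b - c[k]
    return payoff


w, c = [1.0, 2.0, 3.0], [0.5, 0.2]           # illustrative data, not from the paper
M = threshold_M(w, c)                        # here M = 3
rng = random.Random(0)
mean = sum(play(w, c, 4.0, 5.0, rng) for _ in range(20000)) / 20000
print(M, mean)                               # mean should be close to A*W/(A+B)
```

With $A = 4$ and $B = 5$ both resources exceed $M = 3$, so none of the stage assertions should fail along any sampled path.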

In general, an inductive characterization of PI-effectiveness will also involve induction regarding PII. But if we assume that $B > M$, and so by Proposition 2 all $b_k$ are known to be allowed and positive, then the notion of PI-effectiveness becomes simpler, and may be characterized inductively as follows. When saying that a triple $(n-1, A', B')$ is PI-effective, we refer to the $n-1$ stage game with fees $c_2, \dots, c_{n-1}$ and payoffs $w_2, \dots, w_n$. Starting with $n = 1$, $(1, A, B)$ is always PI-effective. For $n \ge 2$, if $a_1$ is not allowed then $(n, A, B)$ is not PI-effective. If $a_1 = 0$ then it is allowed, and PI surely loses the first stage, and so $(n, A, B)$ is PI-effective iff $(n-1, A, B - b_1 - c_1)$ is PI-effective. Finally if $a_1 > 0$ and it is allowed then $(n, A, B)$ is PI-effective iff both $(n-1, A - a_1 - c_1, B - b_1)$ and $(n-1, A - a_1, B - b_1 - c_1)$ are PI-effective.

The crucial step in proving Theorem 2 below, on the unique Nash equilibrium and the guaranteed payoffs, is the following Theorem 1. We point out that in Theorem 2 we will assume that $A > M$, in which case $(n, A, B)$ is PI-effective, by Proposition 2. But here in Theorem 1 we must consider arbitrary $A \ge 0$ in order for an induction argument to carry through.

Theorem 1. Given $c_1, \dots, c_{n-1}$ and $w_1, \dots, w_n$ let $M$ be as in Proposition 1, and assume that $B > M$ and PII plays the strategy $\tau_{n,A,B}$. For $A \ge 0$, if $(n, A, B)$ is PI-effective, and PI plays according to $\sigma_{n,A,B}$, then his expected payoff is $\frac{AW}{A+B}$. On the other hand, if $(n, A, B)$ is not PI-effective, or if PI uses a different strategy, then his expected payoff is strictly less than $\frac{AW}{A+B}$.

Proof. By induction on $n$. We note that throughout the present proof we do not use the condition $B > M$ directly, but rather only through the statements of Propositions 2 and 1 saying that $(n, A, B)$ is PII-effective, $b_k > 0$ and $c_k < \frac{w_k B_k}{W_k}$ for all $1 \le k \le n$, which indeed continue to hold along the induction process.

If $A = 0$ then $a_k = 0$ for all $k$, which is the only possible investment, and its payoff is $0 = \frac{AW}{A+B}$, so the statement holds. We thus assume from now on that $A > 0$. For $n = 1$ we have $b_1 = B$. The allowed investment for PI is $0 \le s \le A$ with expected payoff $\frac{s}{s+B} w_1 = \frac{s}{s+B} W$, which indeed attains a strict maximum $\frac{A}{A+B} W$ at $s = A = a_1$.

For $n \ge 2$, let $s$ be the investment of PI in the first stage. Assume first that $s = 0$. In this case PII surely wins the first stage and so following this stage we have $A_2 = A$ and $B_2 = B - b_1 - c_1$. The moves for PII dictated by $\tau_{n,A,B}$ for the remaining $n-1$ stages of the game are $\tau_{n-1,A,B-b_1-c_1}$, and so by the induction hypothesis the expected total payoff of PI is at most $\frac{A W_2}{A + B - b_1 - c_1}$. Since Proposition 1 holds for PII, we have $c_1 < \frac{w_1 B}{W} \le \frac{w_1 (A+B)}{W}$, that is, $\frac{w_1}{W} - \frac{c_1}{A+B} > 0$, and since $A > 0$ we get $a_1 = A \left( \frac{w_1}{W} - \frac{c_1}{A+B} \right) > 0$. This means that $s = 0 \ne a_1$, so we must verify the strict inequality $\frac{A W_2}{A + B - b_1 - c_1} < \frac{AW}{A+B}$. This is readily verified, using $A > 0$, $c_1 < \frac{w_1 B}{W}$, $W_2 = W - w_1$, and $b_1 + c_1 = \frac{w_1 B}{W} - \frac{B c_1}{A+B} + c_1 = \frac{w_1 B}{W} + \frac{A c_1}{A+B}$.

We now assume $s > 0$. This is allowed only if $A > c_1$ and $0 < s \le A - c_1$. If PI wins the first stage, which happens with probability $\frac{s}{s + b_1} > 0$, then his expected payoff in the remaining $n-1$ stages of the game is at most $\frac{(A - s - c_1) W_2}{A + B - s - b_1 - c_1}$. Similarly, if he loses the first stage, which happens with probability $\frac{b_1}{s + b_1} > 0$, then his expected payoff in the remaining $n-1$ stages is at most $\frac{(A - s) W_2}{A + B - s - b_1 - c_1}$. Thus, the expected payoff of PI for the whole $n$ stage game is at most $F(s)$, where
$$F(s) = \frac{s}{s + b_1} \left( w_1 + \frac{(A - s - c_1) W_2}{A + B - s - b_1 - c_1} \right) + \frac{b_1}{s + b_1} \cdot \frac{(A - s) W_2}{A + B - s - b_1 - c_1}$$
with $b_1 = \frac{w_1 B}{W} - \frac{B c_1}{A+B}$.

By the induction hypothesis we know furthermore, that in case PI wins the first stage, he will attain the maximal expected payoff $\frac{(A - s - c_1) W_2}{A + B - s - b_1 - c_1}$ in the remaining stages of the game only if $(n-1, A - s - c_1, B - b_1)$ is PI-effective, and he uses $\sigma_{n-1,A-s-c_1,B-b_1}$. Similarly, if he loses the first stage, he will attain the maximal expected payoff $\frac{(A - s) W_2}{A + B - s - b_1 - c_1}$ only if $(n-1, A - s, B - b_1 - c_1)$ is PI-effective and he uses $\sigma_{n-1,A-s,B-b_1-c_1}$. If not, then since both alternatives occur with positive probability, his expected total payoff for the whole $n$ stage game will be strictly less than $F(s)$.

To analyze $F(s)$, we make a change of variable $s = a_1 + x$, that is, we define $\hat F(x) = F(a_1 + x) = F\left( \frac{w_1 A}{W} - \frac{A c_1}{A+B} + x \right)$. After some manipulations we get:
$$\hat F(x) = \frac{AW}{A+B} - \frac{B W^3 x^2}{(A+B) \left( W_2 (A+B) - W x \right) \left( W x - W c_1 + w_1 (A+B) \right)}.$$
Under this substitution, $s = a_1$ corresponds to $x = 0$, and the allowed domain $0 < s \le A - c_1$ corresponds to $-a_1 < x \le A - c_1 - a_1$.

Using $c_1 < \frac{w_1 B}{W}$, one may verify that in the above expression for $\hat F$ the two linear factors appearing in the denominator of the second term are both strictly positive in this domain. It follows that $\hat F$ in the given domain is at most $\frac{AW}{A+B}$, and this maximal value is attained only for $x = 0$ (if it is in the domain), which corresponds to $s = a_1$ for the original $F$. Finally, as mentioned, unless $(n-1, A_2, B_2)$ is PI-effective and PI plays $\sigma_{n-1,A_2,B_2}$, his expected payoff will be strictly less than $F(s)$, and so strictly less than $\frac{AW}{A+B}$.

We may now prove our main result.

Theorem 2. Given $c_1, \dots, c_{n-1}$ and $w_1, \dots, w_n$, let $M$ be as in Proposition 1, and assume $A, B > M$. Then the pair of strategies $\sigma_{n,A,B}$, $\tau_{n,A,B}$ is the unique Nash equilibrium for the game, with expected total payoffs $\frac{AW}{A+B}$, $\frac{BW}{A+B}$. Furthermore, $\sigma_{n,A,B}$ and $\tau_{n,A,B}$ guarantee the expected payoffs $\frac{AW}{A+B}$ and $\frac{BW}{A+B}$.

Proof. Denote $\sigma_0 = \sigma_{n,A,B}$ and $\tau_0 = \tau_{n,A,B}$, and for any pair of strategies $\sigma, \tau$ let $S_1(\sigma, \tau)$, $S_2(\sigma, \tau)$ be the expected payoffs of PI, PII respectively. We first prove the second statement of the theorem. Recall that if both players invest 0 in a given stage then there is no winner to that stage. However, if $B > M$ and PII plays $\tau_0$, then by Proposition 2 we have $b_k > 0$ for all $k$, and so indeed there is a winner to each stage of the game, and thus the total combined payoff of PI and PII is necessarily $W$. It thus follows from Theorem 1 that for any strategy $\sigma$ of PI we have $S_2(\sigma, \tau_0) = W - S_1(\sigma, \tau_0) \ge \frac{BW}{A+B}$. Similarly, if $A > M$ then $S_1(\sigma_0, \tau) \ge \frac{AW}{A+B}$ for all $\tau$, establishing the second statement of the theorem.

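As to the first statement, Theorem 1 applied to both PI and PII implies that the pair $\sigma_0, \tau_0$ is a Nash equilibrium with the given expected payoffs. To show it is unique we argue as follows. Let $\sigma, \tau$ be any other Nash equilibrium and assume that $\sigma \ne \sigma_0$. By Theorem 1 we have $S_1(\sigma, \tau_0) < S_1(\sigma_0, \tau_0)$, and so $S_2(\sigma, \tau_0) = W - S_1(\sigma, \tau_0) > W - S_1(\sigma_0, \tau_0) = S_2(\sigma_0, \tau_0)$. Since the pair $\sigma, \tau$ is a Nash equilibrium we also have $S_2(\sigma, \tau) \ge S_2(\sigma, \tau_0)$, and together we get $S_2(\sigma, \tau) > S_2(\sigma_0, \tau_0)$. Since $S_1(\sigma, \tau) + S_2(\sigma, \tau) \le W$ and $S_1(\sigma_0, \tau_0) + S_2(\sigma_0, \tau_0) = W$, we must have $S_1(\sigma, \tau) < S_1(\sigma_0, \tau_0) \le S_1(\sigma_0, \tau)$, where the last inequality is the guarantee established in the first part of the proof. Thus PI gains by switching from $\sigma$ to $\sigma_0$, contradicting the assumption that $\sigma, \tau$ is a Nash equilibrium.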
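The payoff formula of Theorems 1 and 2 can be checked on a small instance by enumerating all win/lose branches. The Python sketch below is illustrative only: the names `expected_payoff_PI` and `F_hat` and the sample data are hypothetical, the stage-winning probability is assumed proportional to the investments, and `F_hat` encodes the closed form for $\hat F(x)$ as we read it from the proof of Theorem 1. It compares the equilibrium payoff with $\frac{AW}{A+B}$, and the payoff after a first-stage deviation $s = a_1 + x$ with $\hat F(x)$.

```python
def expected_payoff_PI(w, c, A, B, first_investment=None):
    """Exact expected payoff of PI, enumerating all win/lose branches.

    PII always invests b_k of Definition 1. PI invests a_k, except that the
    optional first_investment overrides his stage-1 move (a deviation).
    Assumption of this sketch: stage k is won with probability x_k/(x_k+y_k).
    """
    n = len(w)
    c = list(c) + [0.0]                      # formally c_n = 0

    def rec(k, A, B):
        share = w[k] / sum(w[k:]) - c[k] / (A + B)
        a, b = A * share, B * share
        if k == 0 and first_investment is not None:
            a = first_investment
        p = a / (a + b)                      # probability that PI wins stage k
        if k == n - 1:
            return p * w[k]
        win = w[k] + rec(k + 1, A - a - c[k], B - b)
        lose = rec(k + 1, A - a, B - b - c[k])
        return p * win + (1 - p) * lose

    return rec(0, A, B)


def F_hat(w, c, A, B, x):
    """Closed form for the payoff bound after the substitution s = a_1 + x."""
    W, W2, S = sum(w), sum(w[1:]), A + B
    denom = S * (W2 * S - W * x) * (W * x - W * c[0] + w[0] * S)
    return A * W / S - B * W**3 * x**2 / denom


w, c, A, B = [1.0, 2.0, 3.0], [0.5, 0.2], 4.0, 5.0   # illustrative data
W = sum(w)
a1 = A * (w[0] / W - c[0] / (A + B))
equilibrium = expected_payoff_PI(w, c, A, B)
deviation = expected_payoff_PI(w, c, A, B, first_investment=a1 + 0.3)
print(equilibrium, A * W / (A + B))          # the two numbers should agree
print(deviation, F_hat(w, c, A, B, 0.3))     # deviation payoff vs. closed form
```

In this instance the deviation is followed by the equilibrium continuation, so its payoff coincides with the bound $F(s)$ and lies strictly below the equilibrium payoff.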

References

Borel, E. (1921). La théorie du jeu et les équations intégrales à noyau symétrique. C.R. Acad. Sci., 173, 1304–1308. English translation by L. Savage: “The theory of play and integral equations with skew symmetric kernels”. Econometrica, 21 (1953), 97–100.
Duffy, J. and Matros, A. (2015). Stochastic asymmetric Blotto games: Some new results. Economics Letters, 134, 4–8.
Friedman, L. (1958). Game-theory models in the allocation of advertising expenditures. Operations Research, 6(5), 699–709.
Hart, S. (2008). Discrete Colonel Blotto and General Lotto games. International Journal of Game Theory, 36, 441–460.
Lake, M. (1979). A new campaign resource allocation model. In: Brams, S.J., Schotter, A., Schwödiauer, G. (Eds.) Applied Game Theory. Physica-Verlag, Würzburg, West Germany, Economic Theory, 118–132.
Nowik, I. (2009). The game motoneurons play. Games and Economic Behavior, 66, 426–461.
Nowik, I., Zamir, S. and Segev, I. (2012). Losing the battle but winning the war: game theoretic analysis of the competition between motoneurons innervating a skeletal muscle. Frontiers in Computational Neuroscience, 6, Article 16.
Roberson, B. (2006). The Colonel Blotto game. Economic Theory, 29, 1–24.

Contributions to Game Theory and Management, X, 233–244

Social Welfare under Oligopoly: Does the Strengthening of Competition in Production Increase Consumers’ Well-Being?⋆

Mathieu Parenti1, Alexander V. Sidorov2,3 and Jacques-François Thisse4
1 European Centre for Advanced Research in Economics and Statistics (ECARES), Av. F.D. Roosevelt, 39, 1050 Bruxelles, Belgium, E-mail: [email protected]
2 Sobolev Institute of Mathematics, 4 Acad. Koptyug avenue, 630090, Novosibirsk, Russia, E-mail: [email protected]
3 Novosibirsk State University, 2 Pirogova Street, 630090 Novosibirsk, Russia
4 CORE-Université Catholique de Louvain, 34 Voie du Roman Pays, 1348 Louvain-la-Neuve, Belgium, E-mail: [email protected]

Abstract The paper provides a detailed comparison of social welfare (indirect utility) under three types of imperfect competition in a general equilibrium model: quantity oligopoly (Cournot), price oligopoly (Bertrand) and monopolistic competition (Chamberlin). Folk wisdom suggests that increasing toughness of competition along the sequence Cournot-Bertrand-Chamberlin increases consumers' welfare (indirect utility). We show that this is not true in general. This is accomplished in a simple general equilibrium model where consumers are endowed with separable preferences. We find a sufficient condition, in terms of the representative consumer's preferences, that yields the “intuitive” behavior of the indirect utility, and show that this condition is satisfied by the classes of utility functions that are commonly used in examples (e.g., CES, CARA and HARA). Moreover, we provide a series of numerical examples (as well as analytically verifiable conditions) which illustrate that violation of this condition may result in “counter-intuitive” behavior of the indirect utility, where the weakest level of competition (Cournot) provides the highest consumer welfare.
Keywords: , Bertrand competition, free entry, Lerner index, indirect utility.

1. Introduction

In oligopolistic markets, price (Bertrand) and quantity (Cournot) competition deliver market solutions that typically differ, making it hard to formulate robust predictions. The purpose of this paper is to contribute to this debate by providing a comparison of these types of competition from the consumer's point of view. This is accomplished in an economy involving one sector and a population of consumers endowed with separable preferences and a finite number of labor units. Although we recognize that additive preferences are restrictive, they are widely used in the literature and suffice to shed new light on old questions. Note also that the budget constraint implies that firms do not behave like monopolists.

Our main findings are as follows. Using the concepts of relative love for variety, which measures the intensity of the preference for variety, and social mark-up, which measures the proportion of the utility gain from adding a variety, we show that the ranking of consumers' well-being under these two types of imperfect competition is ambiguous and depends on the behavior of these two characteristics, more precisely of their derivatives, in the neighborhood of zero.

⋆ This work was supported by the Russian Foundation for Fundamental Researches under grant No.15-06-05666.

2. The model

The present paper deals with the model introduced and studied in (Parenti et al., 2017). That paper established the existence and uniqueness of oligopolistic equilibria, as well as their comparative statics and limit behavior; welfare was outside its scope. To save the reader's time, we borrow the model description, the definitions and the key results without proofs, which can be found in the cited paper.

2.1. Firms and consumers

There is one sector supplying a horizontally differentiated good and one production factor, labor. The consumption sector is a continuum $[0, L]$ of identical consumers. Each consumer supplies one unit of labor and owns $1/L$ of firms' profits. The labor market is perfectly competitive and labor is chosen as the numéraire. The differentiated good is made available in the form of a finite number $n \ge 2$ of varieties. Each variety is produced by a single firm and each firm produces a single variety. To operate, every firm needs a fixed requirement $f > 0$ and a marginal requirement $c > 0$ of labor. Without loss of generality we can normalize $c = 1$. Since the wage can also be normalized to 1, the cost of producing $q_i$ units of variety $i = 1, \dots, n$ is equal to $f + 1 \cdot q_i$.

Consumers share the same additive preferences given by
$$U(\mathbf{x}) = \sum_{i=1}^{n} u(x_i), \tag{1}$$
where $u$ is thrice continuously differentiable, strictly increasing, strictly concave over $\mathbb{R}_+$, and such that $u(0) = 0$. The strict concavity of $u$ implies that consumers have a love for variety: when a consumer is allowed to consume $X$ units of the differentiated good, she strictly prefers the consumption profile $x_i = X/n$ to any other profile $\mathbf{x} = (x_1, \dots, x_n)$ such that $\sum_i x_i = X$. Following (Zhelobodko et al., 2012), we define the relative love for variety (RLV) as follows:
$$r_u(x) \equiv -\frac{x u''(x)}{u'(x)},$$
which is strictly positive for all $x > 0$. Very much like the Arrow-Pratt relative risk-aversion, the RLV is a local measure of consumers' variety-seeking behavior. A higher value of the RLV means a stronger love for variety. On the contrary, $r_u(x) = 0$ means that the consumer perceives the varieties as perfect substitutes. Under the CES, we have $u(x) = x^{\rho}$ where $\rho$ is a constant such that $0 < \rho < 1$, thus implying that the RLV is constant and given by $1 - \rho$. Other examples include: (i) the CARA utility $u(x) = 1 - \exp(-\alpha x)$ where $\alpha > 0$ is the absolute love for variety (Behrens and Murata, 2007), while the RLV is increasing and given by $\alpha x$; and (ii) the quadratic utility $u(x) = \alpha x - \beta x^2/2$, with $\alpha, \beta > 0$; the RLV is increasing and given by $\beta x/(\alpha - \beta x)$.

The budget constraint is given by
$$\sum_{i=1}^{n} p_i x_i = y. \tag{2}$$
A consumer's income $y$ is equal to her wage plus her share of total profits:
$$y = 1 + \frac{1}{L} \sum_{i=1}^{n} \Pi_i \ge 1, \tag{3}$$
where the profit earned by firm $i$ is given by
$$\Pi_i = (p_i - 1) q_i - f, \tag{4}$$
$p_i$ being the price set by firm $i$. The first-order condition for utility maximization yields
$$u'(x_i) = \lambda p_i,$$
where $\lambda$ is the Lagrange multiplier defined by
$$\lambda(\mathbf{x}, y) = \frac{\sum_{j=1}^{n} x_j u'(x_j)}{y} \ge 0. \tag{5}$$
A consumer's inverse demand for variety $i$ is such that
$$p_i(x_i, \mathbf{x}_{-i}, y) = \frac{u'(x_i)}{\lambda}, \tag{6}$$
where $\mathbf{x}_{-i} = (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$.

2.2. Market equilibrium

The market equilibrium is defined by the following conditions.
(E.1) Each consumer maximizes her utility (1) subject to (2).
(E.2) Each firm $i$ maximizes its profit (4) with respect to $q_i$ (under Cournot competition) or $p_i$ (under Bertrand competition).
(E.3) Product market clears:
$$L x_i = q_i \quad \text{for } i = 1, \dots, n.$$
(E.4) Labor market clears:
$$n f + \sum_{i=1}^{n} q_i = L.$$
The last condition implies that
$$\bar q \equiv \frac{L}{n} - f \iff \bar x \equiv \frac{1}{n} - \frac{f}{L}$$
are the only candidate symmetric equilibrium output and consumption, which both decrease with $n$. Note that $nf$ is the minimum labor requirement for $n$ firms to operate. Therefore, $n$ cannot exceed $L/f$, which implies $\bar x \ge 0$.

Cournot. Using (5) and (6), we obtain firm $i$'s inverse demand:
$$p_i(\mathbf{x}) = \frac{y^C u'(x_i)}{\sum_{j=1}^{n} x_j u'(x_j)}, \tag{7}$$
where $y^C$ is a consumer's income under Cournot competition. Firm $i$'s profit function is then given by
$$\Pi_i^C(\mathbf{x}) = [p_i(x_i, \mathbf{x}_{-i}) - 1] L x_i - f = \left[ \frac{y^C u'(x_i)}{\sum_{j=1}^{n} x_j u'(x_j)} - 1 \right] L x_i - f.$$
For any given $n \ge 2$, a Cournot equilibrium is a vector $\mathbf{x}^* = (x_1^*, \dots, x_n^*)$ such that each strategy $x_i^*$ is firm $i$'s best reply to the strategies $\mathbf{x}_{-i}^*$ chosen by the other firms. This equilibrium is symmetric if $x_i^* = x^C$ for all $i = 1, \dots, n$.

Bertrand. Assume now that firms compete in prices. Let $\mathbf{p} = (p_1, \dots, p_n)$ be a price vector. In this case, consumers' demand functions $x_i(\mathbf{p})$ are obtained by solving the system of equations (7) with $i = 1, \dots, n$, where $y^C$ is replaced with
$$y^B = 1 + \frac{1}{L} \sum_{i=1}^{n} \Pi_i^B(\mathbf{p}),$$
that is, a consumer's income under Bertrand competition. Here firm $i$'s profits are given by
$$\Pi_i^B(\mathbf{p}) = (p_i - 1) L x_i(\mathbf{p}) - f.$$
A Nash equilibrium $\mathbf{p}^* = (p_1^*, \dots, p_n^*)$ of this game is called a Bertrand equilibrium. This equilibrium is symmetric if $p_i^* = p^B$ for all $i$.

Income Effect and the Income-taking Firms. One major difficulty in general equilibrium with oligopolistic firms is the income effect. Ever since (Gabszewicz and Vial, 1972), it is well known that firms operating in an imperfectly competitive environment are able to manipulate individual incomes through the profits they redistribute to consumers. By changing consumers' incomes, firms affect their demand functions, whence their profits. Accounting for such feedback effects typically leads to the nonexistence of an equilibrium because the resulting profit functions are not quasi-concave (Roberts and Sonnenschein, 1977). This negative result probably explains why many economic models involving imperfectly competitive product markets rely on the CES model of monopolistic competition, where the existence of an equilibrium can be established under very mild conditions. In this paper, we assume that firms recognize that income is endogenous because they operate in a general equilibrium environment. However, firms treat income parametrically, which means that they behave like “income-takers”. This approach is in the spirit of (Hart, 1985), for whom firms may take into account only some effects of their policy on the whole economy.1 Even though our model does not capture all possible strategic aspects, it is a full-fledged general equilibrium model in which oligopolistic firms account for strategic interactions within their group, as well as for endogenous incomes through the distribution of profits. Speaking technically, firms are said to be income-takers when they are aware that the income is endogenous, but treat $y$ parametrically:
$$\frac{\partial y}{\partial x_i} = 0 \quad \left( \frac{\partial y}{\partial p_i} = 0 \right) \quad \text{for all } i.$$

1 When product markets are imperfectly competitive, it is common to assume that firms do not manipulate wages, even though firms also have market power on the labor market. The paper (d'Aspremont et al., 1996) is a noticeable exception.

Proposition 1. Assume that a symmetric equilibrium exists under Cournot and Bertrand competition when the number of firms is equal to $n < L/f$. If firms are income-takers, the equilibrium markups are given by
$$m^C(n) = \frac{1}{n} + \frac{n-1}{n} r_u\left( \frac{1}{n} - \frac{f}{L} \right), \qquad m^B(n) = \frac{n r_u\left( \frac{1}{n} - \frac{f}{L} \right)}{n - 1 + r_u\left( \frac{1}{n} - \frac{f}{L} \right)}, \tag{8}$$
while $m^C(n) > m^B(n)$.

For the proof see Proposition 1 in (Parenti et al., 2017).

2.3. Free Entry Condition

Let $p$ be the price in a symmetric equilibrium, whether Cournot or Bertrand, and denote by
$$m \equiv \frac{p - 1}{p} \in (0, 1)$$
the mark-up, i.e., the relative difference between price and marginal cost. Taking into account that marginal cost coincides with the equilibrium price under perfect competition, we obtain another interpretation of the mark-up as the Lerner index of market power. Note that a zero value of the Lerner index characterizes perfect competition, while imperfect competition, e.g., Cournot or Bertrand oligopoly, is characterized by positive values of $m < 1$. Note that in equilibrium, profits must be non-negative for firms to operate. The budget constraint can be rewritten as follows:
$$y = 1 - \frac{nf}{L} + \frac{1}{L} \sum_{j=1}^{n} \frac{p_j - 1}{p_j} p_j q_j,$$
which, after symmetrization, yields
$$y = 1 - \frac{nf}{L} + \frac{1}{L} m \cdot n p \cdot q = 1 - \frac{nf}{L} + m \cdot y \iff y = \frac{1 - nf/L}{1 - m}. \tag{9}$$
Moreover, a strictly positive profit in the industry is an incentive for new firms to enter. Thus, assuming that entry is free, we obtain one more equilibrium condition:
(E.5) For all firms $i = 1, \dots, n$ the profit $\Pi_i^C(\mathbf{x}) = 0$ (resp. $\Pi_i^B(\mathbf{p}) = 0$).
Applying (E.5) to (3) and (9) we obtain that the zero-profit condition holds if and only if
$$\frac{1 - nf/L}{1 - m} = 1$$
or, equivalently,
$$n = \frac{L}{f} m. \tag{10}$$
Therefore, the equilibrium number of firms increases with the market size and the degree of firms' market power, which is measured by the Lerner index.

The new problem we face now is that the “zero-profit” number of firms is typically non-integer. This is not a technical problem for symmetric equilibria, because in this case the upper limit $n$ of the sum turns into a multiplier. As for the substantive interpretation of “fractional” firms, see, for example, the short discussion in Subsection 4.3 of (Parenti et al., 2017). Note also that (10) implies
$$\bar x = \frac{f (1 - m)}{L m} > 0, \tag{11}$$
provided that $m$ satisfies $0 <$ […] $> n^B$ and $q^C < q^B$. First, we determine sufficient conditions on preferences and market size for a free-entry equilibrium to exist and to be unique. Second, we show that the above inequalities hold for any utility $u$. Since $x$ can take on any positive value, for an equilibrium to exist under any collection of the parameter values, it must be that
$$r_u(x) < 1 \quad \text{for all } x \ge 0. \tag{14}$$
It is well known that a firm's profit function is strictly quasi-concave if the second-order condition for profit-maximization is satisfied at any solution to the first-order condition. The second-order condition always holds if
$$r_{u'}(x) = -\frac{x u'''(x)}{u''(x)} < 2. \tag{15}$$
This condition highlights the need to impose restrictions on the third derivative of the utility $u$ to prove the existence and uniqueness of a Nash equilibrium.

Proposition 2. Assume that (14) and (15) hold. If $f > 0$, then there exists a value $L_0 > 0$ such that, for every $L \ge L_0$, there exists a unique symmetric free-entry Cournot equilibrium and a unique symmetric free-entry Bertrand equilibrium. The equilibrium markups, outputs and numbers of firms satisfy
$$m^C > m^B, \qquad q^C < q^B, \qquad n^C > n^B$$
and
$$\lim_{L \to \infty} m^C(L) = \lim_{L \to \infty} m^B(L) = r_u(0).$$
For the proof see Proposition 2 in (Parenti et al., 2017).

3. Consumers' Welfare

Proposition 2 highlights the existence of a trade-off between per-variety consumption and product diversity. To be precise, when free entry prevails, Cournot competition leads to a larger number of varieties, $n^C > n^B$, while the consumption level per variety is lower than under Bertrand competition, $x^C < x^B$. Therefore, the comparison between $V^C = n^C \cdot u(x^C)$ and $V^B = n^B \cdot u(x^B)$ is a priori ambiguous. In what follows we assume additionally that the elemental utility satisfies
$$\lim_{x \to \infty} u'(x) = 0,$$
which is not too restrictive and typically holds for basic examples of utility functions. To solve the welfare problem we consider an imaginary Social Planner, who chooses the mass of firms $n$ to maximize consumers' utility
$$V(n) = n \cdot u(x)$$
subject to the labor market clearing condition
$$(f + L \cdot x) n = L.$$
Let $\varphi = f/L$; then the Social Planner's problem is equivalent to maximization of the function
$$V(n) = n \cdot u\left( \frac{1}{n} - \varphi \right)$$
on the interval $n \in (0, \varphi^{-1})$. Note that
$$V(0) = V(\varphi^{-1}) = 0,$$
$$V'(n) = u\left( \frac{1}{n} - \varphi \right) - \frac{1}{n} \cdot u'\left( \frac{1}{n} - \varphi \right),$$
$$V''(n) = \frac{1}{n^3} \cdot u''\left( \frac{1}{n} - \varphi \right) < 0,$$
which implies that the graph of $V(n)$ is bell-shaped and there exists a unique social optimum $n^* \in (0, \varphi^{-1})$, with $V'(n) \le 0$ (resp. $V'(n) \ge 0$) for all $n \ge n^*$ (resp. $n \le n^*$). This implies the following statements:

1. If the Bertrand equilibrium number of firms lies to the right of the social optimum, $n^B \ge n^*$, then $V^C < V^B$ holds.
2. If the Cournot equilibrium number of firms lies to the left of the social optimum, $n^C \le n^*$, then $V^C > V^B$ holds.
3. In the intermediate case $n^B < n^* < n^C$ the relation between $V^B$ and $V^C$ is ambiguous.
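A small numerical sketch can make the bell shape of $V(n)$ concrete. It assumes, purely for illustration, the CARA utility $u(x) = 1 - e^{-\alpha x}$ with $\alpha = 1$ and $\varphi = 0.001$, treats $n$ as a real number, and locates $n^*$ by bisection on $V'(n)$; the function names and parameter values are hypothetical choices of this example.

```python
import math

ALPHA, PHI = 1.0, 0.001                      # illustrative parameter values


def u(x):
    return -math.expm1(-ALPHA * x)           # CARA utility 1 - exp(-alpha x)


def du(x):
    return ALPHA * math.exp(-ALPHA * x)


def V(n):
    return n * u(1.0 / n - PHI)              # welfare with n varieties


def dV(n):
    x = 1.0 / n - PHI
    return u(x) - du(x) / n                  # V'(n) = u(x) - u'(x)/n


def social_optimum(lo=1.0, hi=1.0 / PHI, iters=200):
    """Bisection on V'(n); V'' < 0, so V' changes sign exactly once."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dV(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_star = social_optimum()
print(n_star, V(n_star))
```

Since $V$ is strictly concave, any equilibrium number of firms can be ranked against `n_star` by the sign of `dV` at that point, which is exactly the comparison used in statements 1 to 3.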

In what follows, the first case will be referred to as the pro-Bertrand case and the second one as the pro-Cournot case. The problem formulated in the title of this paper may now be stated as follows. Let $\varphi = f/L$ be given. Due to Proposition 2, there exist unique Cournot and Bertrand equilibria for all sufficiently small $\varphi$. This means that these equilibria are parametrized by $\varphi$, i.e., the functions $m^C(\varphi)$, $m^B(\varphi)$, $x^C(\varphi)$, $x^B(\varphi)$, $n^C(\varphi)$, $n^B(\varphi)$ are well-defined for all sufficiently small $\varphi \in (0, \hat\varphi)$. Moreover, for all $\varphi > 0$ there exists a unique socially optimal number of firms $n^*(\varphi)$, while the corresponding consumption of the representative consumer is
$$x^*(\varphi) = \frac{1}{n^*(\varphi)} - \varphi.$$
Therefore, to obtain the pro-Bertrand case we have to prove that $n^*(\varphi) \le n^B(\varphi)$ for all sufficiently small $\varphi$, while the pro-Cournot case holds when $n^*(\varphi) \ge n^C(\varphi)$. In addition, it is (almost) obvious that $x^C(\varphi) \to 0$ and $x^B(\varphi) \to 0$ when $\varphi \to 0$ (for the rigorous proof see (Parenti et al., 2017)), thus “for sufficiently small $\varphi$” is actually equivalent to “for sufficiently small $x$.” Define the following function
$$\Delta_u(x) \equiv [1 - \varepsilon_u(x)] - r_u(x) = \frac{u(x) - x u'(x)}{u(x)} + \frac{x u''(x)}{u'(x)} = \frac{u(x) - x p(x)}{u(x)} + \frac{x p'(x)}{p(x)},$$
where $\varepsilon_u(x) = x u'(x)/u(x)$ is the elasticity of utility.

Vives (Vives, 2001) points out that $1 - \varepsilon_u(x)$ is the degree of preference for a single variety, as it measures the proportion of the utility gain from adding a variety, holding quantity per firm fixed. The subtrahend, $r_u(x)$, may be characterized as the “relative love for variety” (RLV), see (Zhelobodko et al., 2012). In (Dhingra and Morrow, 2014) these values are referred to as the social mark-up and the private mark-up, respectively. See the cited paper for a more detailed discussion of these characteristics of the consumer's demand.

Lemma 1. Let $r_u(0) < 1$ hold; then
$$\Delta_u(0) \equiv \lim_{x \to 0} \Delta_u(x) = 0.$$

Proof. Note that the function $x u'(x)$ is strictly increasing and positive for all $x > 0$. Indeed, $u'(x) > 0$ and $(x u'(x))' = u'(x) + x u''(x) = u'(x)(1 - r_u(x)) > 0$, therefore there exists the limit
$$\lambda = \lim_{x \to 0} x \cdot u'(x) \ge 0.$$
Assume that $\lambda > 0$. This is possible only if $u'(0) = +\infty$. Using the L'Hospital rule we obtain
$$\lambda = \lim_{x \to 0} x \cdot u'(x) = \lim_{x \to 0} \frac{x}{(u'(x))^{-1}} = \lim_{x \to 0} \left( -\frac{(u'(x))^2}{u''(x)} \right) = \lim_{x \to 0} \frac{x u'(x)}{-\frac{x u''(x)}{u'(x)}} = \frac{\lambda}{r_u(0)} > \lambda.$$
This contradiction implies that $\lambda = 0$. Q.E.D.

The CES case is characterized by the identity $\Delta_u(x) = 0$ for all $x > 0$; in the other cases the sign and magnitude of $\Delta_u(x)$ may vary, and the directions of change of the terms $1 - \varepsilon_u(x)$ and $r_u(x)$ may be arbitrary, see (Dhingra and Morrow, 2014). Let
$$\delta_u \equiv \lim_{x \to 0} \Delta_u'(x),$$
finite or infinite. Then the following theorem provides sufficient conditions for the pro-Bertrand and pro-Cournot cases; the obvious gap between (a) and (b) corresponds to the ambiguous case 3 above.

Theorem 1.
(a) Let $\delta_u < r_u(0)$; then for all sufficiently small $\varphi = f/L$ the inequality $V^B > V^C$ holds.
(b) Let $\delta_u > 1$; then for all sufficiently small $\varphi = f/L$ the inequality $V^C > V^B$ holds.

Proof. See Technical Appendix.
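Part (a) can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it takes the CARA utility with $\alpha = 1$ and $\varphi = 0.001$, combines the markup formulas (8) with the free-entry condition (10) written as $n = m/\varphi$, and solves $m = m^C(m/\varphi)$ and $m = m^B(m/\varphi)$ by bisection, treating $n$ as a real number. All function names are hypothetical.

```python
import math

ALPHA, PHI = 1.0, 0.001                      # illustrative values; PHI = f/L


def u(x):
    return -math.expm1(-ALPHA * x)           # CARA utility


def r_u(x):
    return ALPHA * x                         # RLV of the CARA utility


def markup_cournot(n):
    return 1.0 / n + (n - 1.0) / n * r_u(1.0 / n - PHI)


def markup_bertrand(n):
    r = r_u(1.0 / n - PHI)
    return n * r / (n - 1.0 + r)


def free_entry_markup(markup, lo=None, hi=1.0 - 1e-9, iters=200):
    """Solve m = markup(n) with n = m / PHI (condition (10)) by bisection."""
    lo = 1.000001 * PHI if lo is None else lo
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - markup(mid / PHI) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def welfare(n):
    return n * u(1.0 / n - PHI)              # V(n) = n u(1/n - phi)


m_C = free_entry_markup(markup_cournot)
m_B = free_entry_markup(markup_bertrand)
n_C, n_B = m_C / PHI, m_B / PHI
print(n_C, n_B, welfare(n_C), welfare(n_B))
```

For these parameter values the computed equilibria satisfy $m^C > m^B$ and $n^C > n^B$, and welfare is higher under Bertrand competition, in line with the pro-Bertrand case.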

It is obvious that in the CES case $u(x) = x^{\rho}$ we immediately obtain $\delta_u = 0 < r_u(0) = 1 - \rho$; thus CES is a pro-Bertrand function. Considering the CARA, $u(x) = 1 - e^{-\alpha x}$, $\alpha > 0$, HARA, $u(x) = (x + \alpha)^{\rho} - \alpha^{\rho}$, $\alpha > 0$, and Quadratic, $u(x) = \alpha x - x^2/2$, $\alpha > 0$, functions, we obtain $r_u(0) = 0$, while direct calculations show that $\delta_{CARA} = -\alpha/2 < 0$, $\delta_{HARA} = -(1 - \rho)/(2\alpha) < 0$ and $\delta_{Quad} = -1/(2\alpha) < 0$. This implies that these popular classes of utility functions also provide the pro-Bertrand case. To illustrate the opposite, pro-Cournot case, consider the function $u(x) = \alpha x^{\rho_1} + x^{\rho_2}$. Without loss of generality we may assume that $\rho_1 < \rho_2$; then

$$1 - \varepsilon_u(x) = \frac{\alpha(1 - \rho_1) + (1 - \rho_2)x^{\rho_2 - \rho_1}}{\alpha + x^{\rho_2 - \rho_1}}, \qquad r_u(x) = \frac{\alpha\rho_1(1 - \rho_1) + \rho_2(1 - \rho_2)x^{\rho_2 - \rho_1}}{\alpha\rho_1 + \rho_2 x^{\rho_2 - \rho_1}}.$$

Using the L'Hospital rule and straightforward calculations we obtain

$$\lim_{x \to 0} \Delta_u' = \lim_{x \to 0} \frac{1 - \varepsilon_u(x) - r_u(x)}{x} = \lim_{x \to 0} \frac{\alpha(\rho_2 - \rho_1)^2 x^{-\rho_1 - (1 - \rho_2)}}{(\alpha + x^{\rho_2 - \rho_1})(\alpha\rho_1 + \rho_2 x^{\rho_2 - \rho_1})} = +\infty > 1.$$

Remark 1. It is obvious that the difference of the social and private mark-ups may be equivalently represented as the elasticity of the elasticity of utility

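$$\Delta_u(x) = \frac{x\,\varepsilon_u'(x)}{\varepsilon_u(x)}.$$

Moreover, using the L'Hospital rule we obtain that

$$\delta_u = \lim_{x \to 0} \Delta_u'(x) = \lim_{x \to 0} \frac{\Delta_u(x)}{x} = \lim_{x \to 0} \frac{\varepsilon_u'(x)}{\varepsilon_u(x)} = \frac{1}{\varepsilon_u(0)} \lim_{x \to 0} \varepsilon_u'(x),$$

where $\varepsilon_u(0) = 1 - r_u(0) > 0$ exists due to our assumptions. Therefore, the sufficient conditions for the pro-Bertrand and pro-Cournot cases may be transformed as follows:

$$\lim_{x \to 0} \varepsilon_u'(x) < r_u(0)(1 - r_u(0)) = (1 - \varepsilon_u(0))\,\varepsilon_u(0) \;\Rightarrow\; n^* \le n^B < n^C \;\Rightarrow\; V^B > V^C,$$

$$\lim_{x \to 0} \varepsilon_u'(x) > 1 - r_u(0) = \varepsilon_u(0) \;\Rightarrow\; n^B < n^C \le n^* \;\Rightarrow\; V^C > V^B.$$

Consider the class of additive utility functions satisfying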

$$\varepsilon_u'(x) = -(1 - \varepsilon_u(x))' < 0,$$

i.e., the social mark-up is a strictly increasing function in some neighborhood of 0. This means that consumers have a higher preference for variety when they consume more per variety; see Subsection 2.1.1 in (Dhingra and Morrow, 2014). Due to (Spence, 1976) and (Vives, 2001; Chapter 6), this type of consumer behavior is considered "normal", or "intuitive", though there are various types of utility functions which generate "counter-intuitive" behavior. Note that the CARA, HARA and Quadratic utility classes satisfy this condition of a strictly increasing social mark-up, while in the case of CES utility the social mark-up is constant.
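The limits $\delta_u$ quoted above for the CARA, HARA and Quadratic families can be checked numerically. The following is only a minimal sketch: since $\Delta_u(0) = 0$, it approximates $\delta_u$ by the ratio $\Delta_u(x)/x$ at a small $x$, with $\Delta_u = 1 - \varepsilon_u - r_u$. The helper name `delta_ratio` and the parameter values are illustrative assumptions, not part of the original text.

```python
import math

def delta_ratio(u, du, d2u, x):
    """Return Delta_u(x) / x, where Delta_u = 1 - eps_u - r_u."""
    eps = x * du(x) / u(x)       # elasticity of utility eps_u(x)
    rlv = -x * d2u(x) / du(x)    # relative love for variety r_u(x)
    return (1.0 - eps - rlv) / x

# Illustrative parameter values (assumptions) and a small evaluation point.
alpha, rho, x = 2.0, 0.5, 1e-4

# CARA: u(x) = 1 - exp(-alpha x)
cara = delta_ratio(lambda z: -math.expm1(-alpha * z),
                   lambda z: alpha * math.exp(-alpha * z),
                   lambda z: -alpha**2 * math.exp(-alpha * z), x)

# HARA: u(x) = (x + alpha)^rho - alpha^rho
hara = delta_ratio(lambda z: (z + alpha)**rho - alpha**rho,
                   lambda z: rho * (z + alpha)**(rho - 1),
                   lambda z: rho * (rho - 1) * (z + alpha)**(rho - 2), x)

# Quadratic: u(x) = alpha x - x^2 / 2
quad = delta_ratio(lambda z: alpha * z - z**2 / 2,
                   lambda z: alpha - z,
                   lambda z: -1.0, x)

print(cara, hara, quad)
```

For these parameter values the three ratios should be close to $-\alpha/2$, $-(1-\rho)/(2\alpha)$ and $-1/(2\alpha)$ respectively, all below $r_u(0) = 0$, which is condition (a) of Theorem 1.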

Corollary 1. Let the social mark-up be strictly increasing at zero. Then $V^B > V^C$.

This easily follows from Remark 1 and the obvious fact that $r_u(0)(1 - r_u(0)) \ge 0$. This condition is not necessary, however, which may be shown by the function u(x)= √x √x +1 e− , because there we obtain $\varepsilon_u'(0) = 1/6 > 0$, while $(1 - \varepsilon_u(0))\,\varepsilon_u(0) = 1/4 > 1/6$, which implies $V^B > V^C$.

4. Conclusion

Additive preferences are widely used in theoretical and empirical applications of monopolistic competition. This is why we have chosen to compare the market outcomes under two different competitive regimes when consumers are endowed with such preferences. It is our belief, however, that most of our results hold true in the case of well-behaved symmetric preferences. Unlike most models of industrial organization, which assume the existence of an outside good, we have used a limited labor constraint. This has allowed us to highlight the role of the marginal utility of income in firms' behavior. Another distinctive feature of our approach is that firms recognize that consumers' incomes are endogenous through the distribution of profits. The assumption of income-taking firms seems to be a reasonable alternative to the polar cases in which incomes are taken as exogenous, as in partial equilibrium analyses, or incomes are strategically manipulated by firms, which leads to intractable general equilibrium models. In brief, even though our setup is restrictive, it is sufficient to show that whether strengthening of imperfect competition will increase the social welfare depends on the nature of preferences.

Appendix

Proof of Theorem 1

Let

$$x = \frac{1}{n} - \varphi \iff n = \frac{1}{x + \varphi};$$

besides, the equilibrium mark-up is

$$m = \frac{f}{L}\, n = \varphi n,$$

which implies

$$m = \frac{\varphi}{x + \varphi}.$$

Note that the Bertrand equilibrium mark-up is determined by the equation

$$m = \varphi + (1 - \varphi)\, r_u(x).$$

Substituting $m = \varphi/(x + \varphi)$ we obtain

$$\varphi = \frac{1 - x}{2} - \sqrt{\left(\frac{1 - x}{2}\right)^2 - \frac{x\, r_u(x)}{1 - r_u(x)}},$$

which implies that

$$n^B \ge n^* \iff u(x) \le \left[\frac{1 + x}{2} - \sqrt{\left(\frac{1 + x}{2}\right)^2 - \frac{x}{1 - r_u(x)}}\,\right] u'(x)$$

at $x = x^B$. Direct calculation shows that the last inequality is equivalent to

$$\Delta_u(x) \le (1 - r_u(x)) \left[1 - \frac{1 + x}{2}\left(1 + \sqrt{1 - \frac{4x}{(1 - r_u(x))(1 + x)^2}}\,\right)\right]. \tag{16}$$

Taking into account the obvious inequality $\sqrt{1 - z} \le 1 - z/2$, we obtain that (16) will hold provided that

$$\Delta_u(x) \le (1 - r_u(x)) \left[1 - (1 + x)\left(1 - \frac{x}{(1 - r_u(x))(1 + x)^2}\right)\right] = x\left(r_u(x) - \frac{x}{1 + x}\right) \tag{17}$$

holds. Let

$$F_u(x) = x\left(r_u(x) - \frac{x}{1 + x}\right);$$

then $F_u(0) = 0 = \Delta_u(0)$, therefore (17) will hold in some neighborhood of 0 provided that $\Delta_u'(0) < F_u'(0) = r_u(0)$.

Similarly, the Cournot mark-up satisfies

$$m = \frac{\varphi}{m} + \left(1 - \frac{\varphi}{m}\right) r_u(x).$$

Taking into account that $m = \varphi/(x + \varphi)$, we obtain the following equation

$$(1 - r_u(x))(x + \varphi)^2 - (1 - r_u(x))(x + \varphi) + x = 0,$$

which implies

$$x + \varphi = \frac{1}{2}\left(1 - \sqrt{1 - \frac{4x}{1 - r_u(x)}}\,\right)$$

and

$$n^C \le n^* \iff u(x) \ge \frac{1}{2}\left(1 - \sqrt{1 - \frac{4x}{1 - r_u(x)}}\,\right) u'(x)$$

at $x = x^C$. Direct calculation shows that the last inequality is equivalent to

$$\Delta_u(x) \ge \frac{1}{2}(1 - r_u(x)) \left[1 - \sqrt{1 - \frac{4x}{1 - r_u(x)}}\,\right]. \tag{18}$$

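Taking into account that the inequality $\sqrt{1 - z} \ge 1 - \alpha z/2$ holds for any $\alpha > 1$ and $z \in \left[0,\, 4(\alpha - 1)/\alpha^2\right]$, we obtain that (18) will hold provided that

$$\Delta_u(x) \ge \frac{1}{2}(1 - r_u(x)) \left[1 - \left(1 - \frac{2\alpha x}{1 - r_u(x)}\right)\right] = \alpha x \tag{19}$$

for all sufficiently small $x$. Now assume that $\delta_u > 1$ and let $\alpha = \frac{1 + \delta_u}{2}$; then $\alpha > 1$ and $\Delta_u'(0) = \delta_u > \alpha$, which implies that (19) holds in some neighborhood of 0.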
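The two closed-form roots used in the proof can be sanity-checked by substituting them back into the Bertrand and Cournot mark-up equations. The sketch below does this for one pair of illustrative values of $x$ and $r_u(x)$; the function names and the numbers are assumptions chosen only for the check.

```python
import math

def bertrand_phi(x, r):
    """Smaller root of phi^2 - (1 - x) phi + x r / (1 - r) = 0."""
    return (1 - x) / 2 - math.sqrt(((1 - x) / 2) ** 2 - x * r / (1 - r))

def cournot_s(x, r):
    """Smaller root s = x + phi of (1 - r) s^2 - (1 - r) s + x = 0."""
    return 0.5 * (1 - math.sqrt(1 - 4 * x / (1 - r)))

x, r = 0.05, 0.3  # illustrative values of x and r_u(x) (assumptions)

# Bertrand: m = phi / (x + phi) must equal phi + (1 - phi) r.
phi_b = bertrand_phi(x, r)
m_b = phi_b / (x + phi_b)
bertrand_residual = m_b - (phi_b + (1 - phi_b) * r)

# Cournot: with s = x + phi = phi / m, m must equal s + (1 - s) r.
s_c = cournot_s(x, r)
phi_c = s_c - x
m_c = phi_c / s_c
cournot_residual = m_c - (s_c + (1 - s_c) * r)

print(bertrand_residual, cournot_residual)
```

Both residuals should vanish up to floating-point rounding, which confirms that the displayed roots solve the respective mark-up equations for these inputs.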

References

Behrens, K. and Y. Murata (2007). General equilibrium models of monopolistic competition: A new approach. Journal of Economic Theory, 136, 776–787.
d'Aspremont, C., R. Dos Santos Ferreira and L.-A. Gérard-Varet (1996). On the Dixit-Stiglitz model of monopolistic competition. American Economic Review, 86, 623–629.
Dhingra, S. and J. Morrow (2014). Monopolistic Competition and Optimum Product Diversity Under Firm Heterogeneity, mimeo.
Gabszewicz, J. and J.-P. Vial (1972). Oligopoly à la Cournot in general equilibrium analysis. Journal of Economic Theory, 4, 381–400.
Hart, O. (1985). Imperfect competition in general equilibrium: An overview of recent work. In K.J. Arrow and S. Honkapohja, eds., Frontiers in Economics. Oxford: Basil Blackwell.
Parenti, M., A. V. Sidorov, J.-F. Thisse and E. V. Zhelobodko (2017). Cournot, Bertrand or Chamberlin: Toward a reconciliation. International Journal of Economic Theory, 13(1), 29–45.
Roberts, J. and H. Sonnenschein (1977). On the foundations of the theory of monopolistic competition. Econometrica, 45, 101–113.
Spence, M. (1976). Product Selection, Fixed Costs, and Monopolistic Competition. The Review of Economic Studies, 43(2), 217–235.
Vives, X. (2001). Oligopoly pricing: Old ideas and new tools. Cambridge, MA; London, England: The MIT Press.
Zhelobodko, E., S. Kokovin, M. Parenti and J.-F. Thisse (2012). Monopolistic competition in general equilibrium: Beyond the constant elasticity of substitution. Econometrica, 80, 2765–2784.

Contributions to Game Theory and Management, X, 245–286

Cooperation in Bioresource Management Problems ⋆

Anna N. Rettieva Institute of Applied Mathematical Research Karelian Research Centre of RAS Pushkinskaya str., 11, Petrozavodsk, 185910, Russia E-mail: [email protected] WWW home page: http://mathem.krc.karelia.ru/member.php?plang=r&id=25

Abstract This paper gives an overview of previously available results and the author's own results on cooperative behavior analysis in dynamic games related to bioresource management problems. The methodological schemes to maintain cooperation are considered and modified. The incentive condition for rational behavior and a method for constructing the characteristic function are presented. The question of coalition stability is revised and extended. Schemes for determining cooperative behavior in games with asymmetric players are obtained. Some analytical and numerical modelling results for particular dynamic bioresource management problems are presented.

Keywords: dynamic games, bioresource management problem, Nash equilibrium, cooperative equilibrium, incentive equilibrium, dynamic stability, imputation distribution procedure, incentive conditions for rational behavior, coalition stability, asymmetric players, different planning horizons.

⋆ This work was supported by the Russian Science Foundation, project no. 17-11-01079.

1. Introduction

This paper gives an overview of the results of rational behavior analysis in dynamic bioresource management problems. The primary aim of rational resource exploitation is sustainable development of a population. Therefore, studying the difference between cooperative and egoistic (individual) behavior in optimal bioresource management problems is an important issue (e.g., see (Kaitala and Lindroos, 2007; Lindroos et al., 2007)).

Optimal control problems for biological objects are very popular among researchers, and many papers have been dedicated to them. Classical bioresource dynamic models were investigated in (Gimelfarb et al., 1974; Clark, 1985; Goh, 1980). The papers (Baturin et al., 1984; Puh, 1983; Selutin et al., 1999) are dedicated to models with migration processes. Optimal control models of interacting biological species are considered in (Bazikin, 1985; Chaudhuri, 1986; Silvert and Smith, 1977). Discrete-time bioresource optimal control problems were considered in the papers (Abakumov, 1993; Il'ichev et al., 2000; Shapiro, 1979). Models with age-distributed populations are investigated in (Abakumov, 1994; Baturin et al., 1984; Gurman, 1978; Svirezhev and Elizarov, 1972).

The game-theoretic approach to bioresource management problems was pioneered by Smith M.J. (Smith, 1968). Haurie A. (Haurie and Tolwinski, 1984), Petrosyan L.A. (Petrosyan and Zakharov, 1981; Petrosyan and Zakharov, 1997), Tolwinski B. (Tolwinski et al., 1986), Levhari D. (Levhari and Mirman, 1980), Mirman L.J. (Fisher and Mirman, 1992), Vislie J. (Vislie, 1987) and many others applied the game-theoretic approach to resource management problems. The optimal noncooperative and cooperative behavior of players in harvesting problems was obtained in (Ehtamo and Hamalainen, 1993; Hamalainen et al., 1984; Lindroos et al., 2007; De Zeeuw, 2008; Kulmala et al., 2009). Levhari D. and Mirman L.J. (Levhari and Mirman, 1980) presented the "fish war" model, which is convenient for analyzing bioresource exploitation processes in the discrete-time setting. This framework proceeds from a power function of population evolution and logarithmic functions of "instantaneous" payoffs. The total payoff of a player is then a finite or infinite sum of discounted instantaneous payments. Here, Nash equilibrium strategies and cooperative strategies are defined analytically.

As is well known, cooperation leads to a sparing mode of bioresource exploitation. The special importance of cooperative behavior for "common resource" exploitation was stressed by the Nobel laureate E. Ostrom (Ostrom, 1990). This review will focus on the results by Ehtamo H., Fisher R.D., Hamalainen R.P., Haurie A., Kaitala V., Leitmann G., Lindroos M., Mirman L.J., Tolwinski B. (Fisher and Mirman, 1992; Ehtamo and Hamalainen, 1993; Kaitala and Lindroos, 2007; Tolwinski et al., 1986; Haurie and Tolwinski, 1984; Fisher and Mirman, 1996; Hamalainen et al., 1984) in this regard.

There are several methodological schemes to maintain cooperation. Here we focus on two of them: incentive equilibrium and the time-consistent imputation distribution procedure.

The concept of cooperative incentive equilibrium was introduced by Ehtamo H. and Hamalainen R.P. (Ehtamo and Hamalainen, 1993) as a natural extension of D.K. Osborn's work (Osborn, 1976) on cartel stability. In this concept players punish each other for a deviation from cooperative behavior by changing their optimal cooperative strategies.
The question of dynamic stability in differential games has been investigated over the past three decades. Haurie A. (Haurie, 1976) raised the problem of instability of the Nash bargaining solution. The concept of time-consistency (dynamic stability) was introduced by Petrosyan L.A. (Petrosyan, 1977). Time-consistency involves the property that, as the cooperation develops, participants are guided by the same optimality principle at each time moment and hence do not have incentives to deviate from cooperation. Petrosyan L.A. (Petrosyan and Danilov, 1979) developed the notion of a time-consistent imputation distribution procedure. Petrosyan L.A. (Petrosjan, 1993; Petrosjan and Zenkevich, 1996) offered a regularization method to construct time-consistent solutions. Petrosyan L.A. and Zaccour G. (Petrosjan and Zaccour, 2003) presented a time-consistent Shapley value allocation in a differential game of pollution cost reduction. Yeung D.W.K. (Yeung, 2006) introduced the "irrational-behavior-proofness" condition that guarantees the stability of a cooperative agreement against unpredictable collapse of the coalition.

The analysis of stable international environmental agreements (IEA) in game theory was pioneered by Barrett S. (Barrett, 1994) and by Carraro C. and Siniscalco D. (Carraro and Siniscalco, 1992), and was surveyed in (Ioannidis et al., 2000) and (Finus, 2008). IEAs typically use the concept of internal and external stability (D'Aspremont et al., 1983). In classical works (Barrett, 1994; Barrett, 1994) it is assumed that only one coalition can be formed.

The "new coalition theory" (Bloch, 1995; Yi, 1997; Carraro, 2000; Finus, 2008) does not restrict coalition formation to a single coalition but allows for the existence of multiple coalitions. Studies in this direction were published in (Ray and Vohra, 1997; Yi and Shin, 1995; Bloch, 1996; Osmani and Tol, 2010; Eyckmans and Finus, 2003).
The main questions investigated were the rules of coalition formation. These can be the Open Membership Game (Yi and Shin, 1995), the Exclusive Membership Game (Eyckmans and Finus, 2003; Finus and Rundshage, 2003), the Coalition Unanimity Game (Bloch, 1996) and Equilibrium Binding Agreements (Ray and Vohra, 1997). Most of the papers on coalition stability concern agreements on emission reduction, and only a few of them apply these concepts to fisheries (De Zeeuw, 2008; Kulmala et al., 2009; Pintassilgo and Lindroos, 2008; Lindroos, 2008).

Traditionally, cooperative behavior analysis in bioresource management problems rests on the assumption of identical discount factors for all players. If these factors differ (players are asymmetric), standard techniques do not help in evaluating players' payoffs under cooperation. As a matter of fact, the cooperative behavior design problem is underinvestigated in this case, even though asymmetry appears widespread in real ecological problems. For instance, countries concluding a cooperative agreement can have different rates of inflation, environmental conditions, and so on. The papers (Munro, 1979) and (Vislie, 1987) demonstrated that bioresource management conflicts often occur due to differences in discount factors (time preferences).

Consequently, a substantial role in cooperative behavior analysis of bioresource management problems belongs to seeking an optimal compromise in the case of heterogeneous goals pursued by players (different discount factors and fishing costs). The publication (Breton and Keoula, 2014) suggested constructing the cooperative payoff as the weighted sum of individual payoffs (in the continuous-time setting, see (Plourde and Yeung, 1989)). This approach draws criticism: a player with a higher discount factor leaves the bioresource exploitation process quite soon, but has to obtain its share of the total payoff of the coalition.
The cited work demonstrated that all utility from a cooperative agreement goes to participant 1 if the weight coefficients are defined by the Nash bargaining solution. Note that this infringes upon the interests of player 2, which is inadmissible in a cooperative agreement. An alternative approach was introduced in (Sorger, 2006) via a bargaining scheme.

Cooperative and noncooperative behavior analysis in bioresource management problems with random planning horizons is an important problem, both theoretically and practically. The authors of (Marin-Solano and Shevkoplyas, 2011) and (Shevkoplyas, 2011) constructed cooperative strategies and time-consistent solutions in the case of a random planning horizon obeying a given distribution. The Nash bargaining solution was adopted in (Mazalov and Rettieva, 2014) to calculate a common discount factor; subsequently, the problem was reduced to determining a time-consistent distribution of the total cooperative payoff. Munro G.R. (Munro, 2000) obtained cooperative strategies through maximization of the weighted sum of individual payoffs; moreover, it was noted that such a solution satisfies the Nash product maximization problem. A well-known result of this paper is that the cooperative payoff is equally shared in the case of side payments.

Another meaningful applied problem is to find cooperative payoffs in the case of different planning horizons. When one player exploits a bioresource for a shorter period than the other, the former joins the exploitation process (in our case, fishing) for a fixed time and is willing to enter cooperation (owing to obvious profitability). But this player has a shorter planning horizon than its partner, and so the player under consideration is interested in gaining more from cooperation than the player that continues harvesting individually.

The model with random planning horizons in the bioresource exploitation process is the most adequate to reality: external random factors can cause a breach of the cooperative agreement, and the participants know nothing about them in advance. For instance, fishing firms can go bankrupt, their fleet can be damaged, etc. In the case of countries, negative factors include an economic crisis, abrupt variations in the rate of inflation, international or national economic and political situations, and so on. All these processes can break a cooperative agreement, and cooperative behavior of participants in this case has not yet been examined.

In view of the above, cooperative behavior design is very important. Here we present our results in this regard. Almost all the results are derived analytically, which allows their direct application to concrete biological populations with appropriate parameters.

The rest of the paper is structured as follows. The main types of investigated problems are shown in Section 2. Section 3 describes the obtained schemes for cooperation maintenance and cooperative behavior determination. In Section 4 the problem of cooperative behavior determination for games with asymmetric players is considered. Different types of bioresource management problems are treated in Section 5, with cooperative behavior design, cooperation maintenance schemes and the results of numerical experiments. Finally, Section 6 provides the basic results and their discussion.

2. Main problems

2.1. Continuous-time models

The dynamics of the renewable resource is described by the equation

$$x'(t) = f(x(t), u_1(t), \dots, u_n(t)), \quad x(0) = x_0, \tag{1}$$

where $x(t) \ge 0$ denotes the resource size at time $t$, $u_i(t) \ge 0$ represents the strategy (exploitation intensity) of player $i$ at time $t$, $i = 1, \dots, n$, and $f(x(t), u_1(t), \dots, u_n(t))$ indicates the natural growth function. Denote $u(t) = (u_1(t), \dots, u_n(t))$.

We consider players' payoffs over the finite $[0, T]$ or infinite time horizon in the forms:

$$J_i = \int_0^T e^{-\rho t} g_i(x(t), u(t))\,dt + G_i(x(T)) \tag{2}$$

and

$$J_i = \int_0^\infty e^{-\rho t} g_i(x(t), u(t))\,dt, \tag{3}$$

where $g_i(x(t), u(t))$ denotes the "instantaneous" utility of player $i$ at time $t$, and $\rho$ is the discount factor, $0 < \rho < 1$.

Let $u^N(t) = (u_1^N(t), \dots, u_n^N(t))$ be the Nash equilibrium in problem (1), (2) (or (1), (3)).

Under cooperation players wish to maximize the sum of their profits:

$$J^c = \sum_{i=1}^n J_i = \int_0^T e^{-\rho t} \sum_{i=1}^n g_i(x(t), u(t))\,dt + \sum_{i=1}^n G_i(x(T)) \to \max_{u(t)} \tag{4}$$

or

$$J^c = \int_0^\infty e^{-\rho t} \sum_{i=1}^n g_i(x(t), u(t))\,dt \to \max_{u(t)}. \tag{5}$$

Let the set of strategies $u^c(t) = (u_1^c(t), \dots, u_n^c(t))$ be the solution of the problem (1), (4) (or (1), (5)) and $x^c(t)$ be the cooperative trajectory derived from the equation (1) applying the strategies $u^c(t)$.

2.2. Discrete-time models

The renewable resource evolves according to the equation

$$x_{t+1} = f(x_t, u_t), \quad x_0 = x, \tag{6}$$

where $u_t = (u_{1t}, \dots, u_{nt})$, $x_t$ denotes the resource size at time $t$, and $u_{it}$ represents the strategy (exploitation intensity) of player $i$ at time $t$, $i = 1, \dots, n$.

The players' payoffs take the forms:

$$J_i = \sum_{t=0}^{m} \delta^t g_i(x_t, u_t) \tag{7}$$

and

$$J_i = \sum_{t=0}^{\infty} \delta^t g_i(x_t, u_t), \tag{8}$$

where $g_i(x_t, u_t)$ denotes the "instantaneous" utility of player $i$ at time $t$, and $\delta$ is the discount factor, $0 < \delta < 1$.

Let $u_t^N = (u_{1t}^N, \dots, u_{nt}^N)$ be the Nash equilibrium of the game (6), (7) (or (6), (8)).

Under cooperation the discounted sum of players' total utilities over the finite $[0, m]$ or infinite time horizon is maximized:

$$J^c = \sum_{t=0}^{m} \delta^t \sum_{i=1}^{n} g_i(x_t, u_t) \tag{9}$$

or

$$J^c = \sum_{t=0}^{\infty} \delta^t \sum_{i=1}^{n} g_i(x_t, u_t). \tag{10}$$

Let the set of strategies $u_t^c = (u_{1t}^c, \dots, u_{nt}^c)$ be the solution of the problem (6), (9) (or (6), (10)) and $x_t^c$ be the cooperative trajectory derived from the equation (6) applying the strategies $u_t^c$.
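The discrete-time setting (6), (7), (9) can be illustrated by a short simulation. This is only a sketch under stated assumptions: the growth function (a power function) and the logarithmic utilities follow the "fish war" form mentioned in the Introduction, while the exponent, the discount factor, the initial stock and the linear feedback rules are hypothetical values chosen for the example; they are not equilibrium strategies.

```python
import math

def simulate(f, g, strategies, x0, delta, horizon):
    """Iterate x_{t+1} = f(x_t, u_t) and accumulate discounted payoffs.

    strategies[i](x) is a feedback rule returning player i's action u_it.
    Returns (list of individual payoffs J_i as in (7), joint payoff J^c as in (9)).
    """
    x = x0
    payoffs = [0.0] * len(strategies)
    for t in range(horizon + 1):
        u = [rule(x) for rule in strategies]
        for i in range(len(strategies)):
            payoffs[i] += delta**t * g(i, x, u)
        x = f(x, u)
    return payoffs, sum(payoffs)

# Illustrative specification (an assumption): power growth, logarithmic utility.
a = 0.5
f = lambda x, u: (x - sum(u)) ** a
g = lambda i, x, u: math.log(u[i])

# Hypothetical linear feedback rules u_it = gamma_i * x_t.
rules = [lambda x: 0.2 * x, lambda x: 0.3 * x]

J, Jc = simulate(f, g, rules, x0=0.8, delta=0.9, horizon=20)
print(J, Jc)
```

Replacing `rules` by candidate Nash or cooperative strategies for a concrete model makes it possible to compare the individual payoffs with the joint payoff along the resulting trajectory.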

3. Cooperation maintenance

3.1. Incentive equilibrium

One of the methodological schemes to maintain cooperation is the cooperative incentive equilibrium. This concept was introduced in the paper (Ehtamo and Hamalainen, 1993) as a natural extension of Osborn's work (Osborn, 1976) on cartel stability. The incentive equilibrium is applied for maintaining the cooperation and punishing the player who deviates. This concept is presented for the problem with two players.

Following (Ehtamo and Hamalainen, 1993) we assume that the strategy of player $i$ is a causal mapping $\gamma_i : U_j \to U_i$ ($u_j \in U_j$), $i, j = 1, 2$, $i \ne j$, where $U_i$ denotes the set of admissible strategies of player $i$, $i = 1, 2$. In order to give the definitions for both the continuous and the discrete cases, we omit the time parameter in the following definitions.

Definition 1. (Ehtamo and Hamalainen, 1993). A strategy pair $(\gamma_1, \gamma_2)$ is called the cooperative incentive equilibrium if

$$u_1^c = \gamma_1(u_2^c), \quad u_2^c = \gamma_2(u_1^c),$$
$$J_1(u_1^c, u_2^c) \ge J_1(u_1, \gamma_2(u_1)) \quad \forall u_1 \in U_1,$$
$$J_2(u_1^c, u_2^c) \ge J_2(\gamma_1(u_2), u_2) \quad \forall u_2 \in U_2.$$

Thus, when players use incentive equilibrium strategies it is not advantageous for them to deviate from the initial cooperative agreement. The player's profit under deviation is less than under cooperation. In the traditional statement players control their behavior, punishing for deviation by changing the cooperative strategies (see Fig. 1). In (Ehtamo and Hamalainen, 1993) players use punishment strategies which are proportional to the difference between the cooperative and deviating strategies.
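A punishment strategy proportional to the opponent's deviation can be sketched as a linear rule $\gamma_i(u_j) = u_i^c + \eta_i (u_j - u_j^c)$. The coefficient $\eta_i$, the helper name and the numerical values below are hypothetical; the sketch only illustrates the first line of Definition 1 (the consistency conditions), not the payoff inequalities, which depend on the particular model.

```python
def make_incentive_strategy(u_own_c, u_other_c, eta):
    """Return gamma_i(u_j) = u_i^c + eta * (u_j - u_j^c).

    eta is a hypothetical punishment coefficient, not taken from the text.
    """
    def gamma(u_other):
        return u_own_c + eta * (u_other - u_other_c)
    return gamma

# Illustrative cooperative strategies (assumptions).
u1_c, u2_c = 0.2, 0.3

gamma1 = make_incentive_strategy(u1_c, u2_c, eta=0.5)
gamma2 = make_incentive_strategy(u2_c, u1_c, eta=0.8)

# No deviation: both rules return the cooperative strategies.
print(gamma1(u2_c), gamma2(u1_c))
# Player 1 deviates upward: player 2 responds proportionally.
print(gamma2(0.25))
```

Whether such a pair is an incentive equilibrium for a given game is decided by the two inequalities of Definition 1, which have to be checked with the model's payoff functions.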

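[Figure: block diagram. Player 1, with payoff $J_1(u_1, \gamma_2(u_1))$, and Player 2, with payoff $J_2(\gamma_1(u_2), u_2)$, exchange the strategies $(u_{1i}(t), \gamma_1(u_{2i}(t)))$ and $(u_{2i}(t), \gamma_2(u_{1i}(t)))$ and apply the controls $u_{11}(t), \dots, u_{1n}(t)$ and $u_{21}(t), \dots, u_{2n}(t)$ to resources $1, \dots, n$ of the ecological system.]

Fig. 1. Traditional cooperative incentive equilibrium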

In the papers (Mazalov and Rettieva, 2007; Mazalov and Rettieva, 2008; Mazalov and Rettieva, 2010) we presented a new scheme where the center controls the cooperation agreement by changing the harvesting territory. Let us divide the water area into two parts, $s(t)$ and $1 - s(t)$ ($s_t$ and $1 - s_t$ in discrete-time models), where two players exploit the fish stock. The dynamics of the fishery and the players' payoffs have the same forms (1)–(10), but the strategies also depend on the territory sharing:

$$u_i(t) = u_i(t, s(t)), \quad i = 1, 2$$

or

$$u_{it} = u_{it}(s_t), \quad i = 1, 2.$$

Denote by $s^c$ the territory sharing under cooperation. Assume that players deviating from the cooperative equilibrium point are punished by the center proportionally to the value of the deviation. So if the first player deviates, the center increases $s^c$, and if the second player deviates, the center decreases $s^c$, proportionally to the difference between the cooperative and deviating strategies.

The proposed concept is given in the following definition (the time parameter is omitted).

Definition 2. A strategy pair $(\gamma_1, \gamma_2)$ is called the cooperative incentive equilibrium if

$$u_1^c(s^c) = \gamma_1(u_2^c(s^c)), \quad u_2^c(s^c) = \gamma_2(u_1^c(s^c)),$$
$$J_1(u_1^c(s^c), u_2^c(s^c)) \ge J_1(u_1(s), \gamma_2(u_1(s))) \quad \forall u_1 \in U_1, \; 0 \le s \le 1,$$
$$J_2(u_1^c(s^c), u_2^c(s^c)) \ge J_2(\gamma_1(u_2(s)), u_2(s)) \quad \forall u_2 \in U_2, \; 0 \le s \le 1.$$

The application of this scheme for cooperation maintenance is presented in Fig. 2. In Section 5.1 we present the results obtained for different game-theoretic models.

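[Figure: block diagram. The center sets the territory sharing $s(t)$ on the basis of $I(t, s, u_1, \dots, u_n)$; Players $1, \dots, n$, with payoffs $J_1(t, s), \dots, J_n(t, s)$, apply the controls $u_{11}(t, s), \dots, u_{nn}(t, s)$ to resources $1, \dots, n$ of the ecological system.]

Fig. 2. New cooperative incentive equilibrium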

3.2. Dynamic stability and conditions for rational behavior

Let us consider the infinite time horizon problem (1), (5) or (6), (10). For finite horizon problems the following definitions are similar.

Denote the profit of coalition $S \subset N$ as $J^S(u) = \sum_{i \in S} g_i(x(t), u(t))$ (or $J^S(u) = \sum_{i \in S} g_i(x_t, u_t)$).

For the cooperative variant of the game it is required to determine the characteristic function. There are several approaches to constructing the characteristic function (Gromova and Petrosyan, 2015). The classical one is to determine the profit of coalition $S$ assuming that the outside players form the coalition $N \setminus S$ and play against the coalition $S$ (zero-sum game, see (Neumann and Morgenstern, 1953)).

Characteristic function construction

In the papers (Mazalov and Rettieva, 2010; Mazalov and Rettieva, 2014) we constructed the characteristic function in two unusual forms. In the first model players outside coalition $K$ switch to their Nash strategies, which were determined for the initial noncooperative game. This approach was presented by Petrosyan L.A. and Zaccour G. (Petrosjan and Zaccour, 2003). It is the case where players have no information about the fact that the coalition was formed. In the second model we present a new approach where players outside coalition $K$ determine new Nash strategies in the game with $N \setminus K$ players. This case corresponds to the situation where players know that coalition $K$ is formed.

Model without information (Petrosjan and Zaccour, 2003). In this case players forming coalition $K$ do not inform the others. Therefore, players outside coalition $K$ use their Nash strategies determined for the noncooperative case.

The following definitions are given for the game (6), (8). Denote by $u_t^N = (u_{1t}^N, \dots, u_{nt}^N)$ the Nash equilibrium. To determine the cooperative payoff of coalition $K$ it is required to solve the following problem:

$$J^K = \sum_{t=0}^{\infty} \delta^t \Big[ \sum_{i \in K} g_i(\tilde{u}_t) \Big] \longrightarrow \max_{u_i,\; i \in K},$$

where

$$\tilde{u}_i = \begin{cases} u_i, & i \in K, \\ u_i^N, & i \in N \setminus K. \end{cases}$$

Model with informed players. Let us consider the case where players outside the coalition $K$ determine new Nash strategies in the game with $N \setminus K$ players. This case corresponds to the situation where players know that coalition $K$ is formed. To determine the cooperative payoff of coalition $K$ it is required to solve the following problem:

$$J^K = \sum_{t=0}^{\infty} \delta^t \Big[ \sum_{i \in K} g_i(\tilde{u}_t) \Big] \longrightarrow \max_{u_i,\; i \in K},$$

where

$$\tilde{u}_i = \begin{cases} u_i, & i \in K, \\ \tilde{u}_i^N, & i \in N \setminus K, \end{cases}$$

and the individual players' strategies $\tilde{u}_i^N$, $i \in N \setminus K$, are defined from the maximization problems:

$$J_i = \sum_{t=0}^{\infty} \delta^t g_i(u_t) \longrightarrow \max_{u_i,\; i \in N \setminus K}, \quad i \in N \setminus K.$$

In (Mazalov and Rettieva, 2010) these approaches were applied to the fish war model with many players, and we present some results in Section 5.3.

Using the classical or the new approaches we determine the characteristic function $V(S, 0)$ as the profit of coalition $S$, $S \subset N$. When the characteristic function is determined, the imputation set can be defined as

$$\xi = \Big\{ \xi(0) = (\xi_1(0), \dots, \xi_n(0)) : \sum_{i=1}^{n} \xi_i(0) = V(N, 0), \;\; \xi_i(0) \ge V(i, 0), \; i = 1, \dots, n \Big\}.$$

Similarly we determine the characteristic function $V(S, t)$ and the imputation set $\xi(t) = (\xi_1(t), \dots, \xi_n(t))$ for every subgame started from the state $x_t^c$ (or $x^c(t)$) at time $t$.

Further assume that one of the cooperative optimality principles is chosen; it can be the proportional solution, C-core, n-core, the Shapley value or another.

The concept of time-consistency (dynamic stability) was introduced by Petrosyan L.A. (Petrosyan, 1977). Time-consistency involves the property that, as the cooperation develops, participants are guided by the same optimality principle at each time moment and hence do not have incentives to deviate from cooperation. In the paper (Petrosyan and Danilov, 1979) the notion of a time-consistent imputation distribution procedure was developed.

Definition 3. The vector $\beta(t) = (\beta_1(t), \dots, \beta_n(t))$ is an imputation distribution procedure (Petrosyan and Danilov, 1979; Petrosyan and Danilov, 1985) if

$$\xi_i(0) = \int_0^{\infty} e^{-\rho t} \beta_i(t)\,dt, \quad i = 1, \dots, n,$$

or, for a discrete-time problem,

$$\xi_i(0) = \sum_{t=0}^{\infty} \delta^t \beta_i(t), \quad i = 1, \dots, n.$$

The main idea of this scheme is to distribute the cooperation gain along the game path. Then $\beta_i(t)$ can be interpreted as the payment to player $i$ at time moment $t$.

Definition 4. The vector $\beta(t) = (\beta_1(t), \dots, \beta_n(t))$ is a time-consistent imputation distribution procedure (Petrosyan, 1977; Petrosyan and Danilov, 1979) if for all $t \ge 0$

$$\xi_i(0) = \int_0^{t} e^{-\rho \tau} \beta_i(\tau)\,d\tau + e^{-\rho t} \xi_i(t), \quad i = 1, \dots, n,$$

or, for a discrete-time problem,

$$\xi_i(0) = \sum_{\tau=0}^{t-1} \delta^{\tau} \beta_i(\tau) + \delta^t \xi_i(t), \quad i = 1, \dots, n,$$

where $\xi_i(t)$ is the imputation for player $i$ at time $t$.

Here, players following the cooperative trajectory are guided by the same optimality principle at each current time and hence do not have any reasonable motivation to deviate from the cooperation agreement. The application of these concepts to bioresource management problems is given in Sections 5.2, 5.3.

Nonetheless, some irrational player can break out of the cooperation. To indemnify players against the loss of profits in this case, Yeung D.W.K. (Yeung, 2006) introduced the following condition.

Definition 5. The imputation $\xi = (\xi_1, \dots, \xi_n)$ satisfies the irrational-behavior-proofness condition (Yeung, 2006) if

$$\int_0^{t} e^{-\rho \tau} \beta_i(\tau)\,d\tau + e^{-\rho t} V(i, t) \ge V(i, 0), \quad i = 1, \dots, n,$$

or, for a discrete-time problem,

$$\sum_{\tau=0}^{t} \delta^{\tau} \beta_i(\tau) + \delta^{t+1} V(i, t+1) \ge V(i, 0), \quad i = 1, \dots, n \tag{11}$$

for all $t \ge 0$, where $\beta(t) = (\beta_1(t), \dots, \beta_n(t))$ is the time-consistent imputation distribution procedure.
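In the discrete-time case, subtracting the identity of Definition 4 at consecutive steps gives $\beta_i(t) = \xi_i(t) - \delta\,\xi_i(t+1)$. The sketch below uses this relation to build the payments for one player from a sequence of imputations and then checks condition (11). The sequences `xi` and `v` and the value of `delta` are hypothetical numbers chosen for illustration, and the function names are not from the source.

```python
def idp_from_imputations(xi, delta):
    """Time-consistent payments beta(t) = xi(t) - delta * xi(t+1), t = 0..T-1."""
    return [xi[t] - delta * xi[t + 1] for t in range(len(xi) - 1)]

def yeung_condition_holds(beta, v, delta, tol=1e-12):
    """Check condition (11) for every step t covered by the data."""
    paid = 0.0
    for t, b in enumerate(beta):
        paid += delta**t * b
        if paid + delta**(t + 1) * v[t + 1] < v[0] - tol:
            return False
    return True

delta = 0.9
xi = [10.0, 9.0, 8.0, 7.0]   # hypothetical imputations xi_i(t) of one player
v = [8.0, 7.5, 7.0, 6.5]     # hypothetical noncooperative values V(i, t)

beta = idp_from_imputations(xi, delta)

# Definition 4 at t = 3: discounted payments plus the discounted remaining imputation.
reconstructed = sum(delta**t * b for t, b in enumerate(beta)) + delta**3 * xi[3]
print(beta, reconstructed, yeung_condition_holds(beta, v, delta))
```

With these numbers the reconstructed value coincides with $\xi_i(0)$ up to rounding, and condition (11) holds at every step.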

If this condition is satisfied, then player $i$ is irrational-behavior-proof because irrational actions that break the cooperative agreement will not bring her payoff below the initial noncooperative payoff.

In the papers (Rettieva, 2009; Mazalov and Rettieva, 2010; Mazalov and Rettieva, 2012) we introduced a new condition for discrete-time problems which is stronger than Yeung's condition and is easier to verify.

Definition 6. The imputation $\xi = (\xi_1, \dots, \xi_n)$ satisfies the each step rational behavior condition if

$$\beta_i(t) + \delta V(i, t+1) \ge V(i, t), \quad i = 1, \dots, n \tag{12}$$

for all $t \ge 0$, where $\beta(t) = (\beta_1(t), \dots, \beta_n(t))$ is the time-consistent imputation distribution procedure.

The proposed condition offers an incentive to player $i$ to maintain cooperation because at every step she gains more from cooperation than from noncooperative behavior.

In the series of papers (Rettieva, 2010; Mazalov and Rettieva, 2010; Mazalov and Rettieva, 2011; Rettieva, 2011) we verify these conditions for different models (see Sections 5.2, 5.3).

3.3. Coalition stability

For the coalition structure, not only external and internal stability (D'Aspremont et al., 1983) should be examined but also the possible moves of players from one coalition to the other. Carraro (Carraro, 1997) presented the notion of intercoalition stability for such analysis.

In the papers (Rettieva, 2011; Rettieva, 2012) we extend the intercoalition stability concept to the situation where not only one player but a set of coalition members can join the other coalition (coalitional stability). This concept is close to the coalition structure (Finus and Rundshage, 2003) and the α- and β-core concepts (Bloch, 1996).

We consider the bioresource management problem with two types of players: $N = \{1, \dots, n\}$ and $M = \{1, \dots, m\}$. The coalition structure where players of each type form a coalition is investigated. Hence, there can be two coalitions ($K \subset N$ and $L \subset M$) and single players of each type ($N \setminus K$ and $M \setminus L$) in the game. The sizes of the coalitions are the subject of investigation.

The most popular stability concept applied in the game-theoretic literature on IEAs is external and internal stability (D'Aspremont et al., 1983).

Definition 7. Coalition $K$ is internally stable if

$$V_i^k(K, L) = \frac{1}{k} V^k(K, L) \ge V_i^N(K \setminus \{i\}, L), \quad \forall i \in K. \tag{13}$$

Definition 8. Coalition $K$ is externally stable if

$$V_i^N(K, L) \ge V_i^k(K \cup \{i\}, L) = \frac{1}{k+1} V^{k+1}(K \cup \{i\}, L), \quad \forall i \in N \setminus K. \tag{14}$$

Internal stability means that no coalition member wishes to leave the coalition and become a singleton. External stability means that no singleton wishes to join the coalition.

The paper (Rettieva, 2012) extends the intercoalition stability concept to the situation where not only one player but a set of coalition members can join the other coalition. The intercoalition stability is now a special case of coalitional stability. Coalition K is coalitionally internally stable if

\[ V_i^k(K, L) \ge V_i^{l+p}(K \setminus P, L \cup P), \quad \forall i \in P \subset K, \ |P| = p. \tag{15} \]

Coalition K is coalitionally externally stable if

\[ V_j^l(K, L) = \frac{1}{l} V^l(K, L) \ge V_j^{k+q}(K \cup Q, L \setminus Q), \quad \forall j \in Q \subset L, \ |Q| = q. \tag{16} \]

Here internal stability means that no set of members of coalition K wishes to leave it and join coalition L. External stability means that no set of members of coalition L wishes to leave it and join coalition K. For coalition L the conditions take the forms

\[ V_j^l(K, L) \ge V_j^{k+q}(K \cup Q, L \setminus Q), \quad \forall j \in Q \subset L, \]
\[ V_i^k(K, L) \ge V_i^{l+p}(K \setminus P, L \cup P), \quad \forall i \in P \subset K, \]

which coincide with (15), (16). The presented concept is given in the next definition.

Definition 9. Coalition structure (K, L) is stable if conditions (15), (16) are fulfilled.

For $P = \{i\}$ and $Q = \{j\}$ this definition coincides with the intercoalition stability (Carraro, 1997; Osmani and Tol, 2010).

The presented stability concept extends intercoalition stability to models with two or more coalitions and possible moves of a set of coalition members. Moreover, as will be shown below, coalitions with a large number of members are stable under this concept. In Section 5.4 we give the results for the great fish war model (Fisher and Mirman, 1996) with coalition structure.
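Conditions (15) and (16) can be checked mechanically once member payoffs are available. The following is a minimal sketch, assuming that players of each type are symmetric so that a member's payoff depends only on the coalition sizes k and l. The function names and the toy payoffs in the usage note are hypothetical and do not come from the cited papers.

```python
from typing import Callable

# Hypothetical signature: payoff(k, l) -> payoff of one coalition member,
# where k = |K| and l = |L|. Size-only dependence is an assumption.
Payoff = Callable[[int, int], float]


def is_coalitionally_stable(k: int, l: int, v_k: Payoff, v_l: Payoff) -> bool:
    """Check conditions (15) and (16) for the coalition structure (K, L)."""
    # (15): no group of p members of K gains by moving to L.
    internal = all(v_k(k, l) >= v_l(k - p, l + p) for p in range(1, k + 1))
    # (16): no group of q members of L gains by moving to K.
    external = all(v_l(k, l) >= v_k(k + q, l - q) for q in range(1, l + 1))
    return internal and external


def crowded_k(k: int, l: int) -> float:
    """Toy payoff of a member of K: smaller own coalition is better."""
    return -float(k)


def crowded_l(k: int, l: int) -> float:
    """Toy payoff of a member of L: smaller own coalition is better."""
    return -float(l)
```

With these toy payoffs the balanced structure (2, 2) passes both checks, while (5, 1) fails (15) because a single member of K prefers the smaller coalition L.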

4. Cooperation for asymmetric players

Our papers (Rettieva, 2012; Rettieva, 2014; Mazalov and Rettieva, 2015) suggest designing and stimulating cooperative behavior by applying the Nash bargaining solution. The presented approach removes the need for summing up the payoffs of asymmetric players (Breton and Keoula, 2014). The bargaining scheme yields an absolutely different solution (e.g., see a classical example in (Owen, 1968)). Cooperative behavior design based on maximization of the weighted sum of players’ payoffs may lead to the existence of parameter domains where the cooperative payoffs of players are smaller than their noncooperative counterparts (Breton and Keoula, 2014). This is impossible in the suggested scheme with cooperative behavior defined by the bargaining solution: under some parameters, players’ payoffs are greater or equal to Nash equilibrium payoffs (Section 5.5 provides numerical experiments illustrating this fact).

Another meaningful applied problem is to find cooperative payoffs in the case of different planning horizons. The model with random planning horizons in the bioresource exploitation process is the most adequate to reality: external random factors can cause cooperative agreement breach and the participants know nothing about them in advance.

In what follows, we explore a discrete-time game-theoretic bioresource management problem. Players apply different discount factors, which can be interpreted as their heterogeneous time preferences. A generalization of this model is when players’ planning horizons differ due to cooperative agreement breach or other reasons. Although conclusion of an agreement implies fixed exploitation periods, external factors can force a player to leave the game. Therefore, it seems natural to consider planning horizons as random variables.

4.1. Models with different discount factors

We consider discrete-time bioresource management problems (6), (7) and (6), (8) with two players. Denote by $u^N = (u_1^N, u_2^N)$ the Nash equilibrium of the problem (6), (8), and by $V_i(x, \delta_i)$, $i = 1, 2$, the corresponding noncooperative payoffs. Similarly, $u_t^N = (u_{1t}^N, u_{2t}^N)$ and $V_i^n(x, \delta_i)$, $i = 1, 2$, give the Nash equilibrium strategies and payoffs in the n-step game (6), (7).

The papers (Rettieva, 2012; Rettieva, 2013) demonstrate how to determine the total discount factor in the case where the cooperative payoff is distributed proportionally for infinite-time problems. The schemes for determining the total discount factor in order to construct the cooperative payoff are offered. Assume that players use the joint discount factor δ, which should be determined. So, the players solve the following problem

\[ J = \sum_{t=0}^{\infty} \delta^t \big[ g_1(u_{1t}, u_{2t}) + g_2(u_{1t}, u_{2t}) \big] \to \max_{u_{1t}, u_{2t} \ge 0}, \]

where $0 < \delta < 1$ denotes the unknown total discount factor.

$V(x, \delta)$ denotes the cooperative payoff in this case. We suppose that the cooperative payoff is distributed in the portions $\gamma V(x, \delta)$ and $(1 - \gamma) V(x, \delta)$ among players.

In the paper (Rettieva, 2012) it was shown for the fish war model that the joint discount factor exists for the case where the cooperative payoff is distributed proportionally among players. As a result we get the set of admissible parameters δ and γ. To construct the solution we propose to adopt the Nash bargaining scheme. It is necessary to solve the problem

\[ \big( \gamma V(x, \delta) - V_1(x, \delta_1) \big) \big( (1 - \gamma) V(x, \delta) - V_2(x, \delta_2) \big) \to \max_{0 < \delta, \gamma < 1}. \]

In the papers (Rettieva, 2014; Mazalov and Rettieva, 2015) for the game (6), (7) we abandon the construction of a total discount factor and determine cooperative strategies by applying the Nash arbitration procedure. Two bargaining schemes are introduced, viz. the one for the whole duration of the game and the recursive arbitration procedure, which applies the arbitration scheme at each step of the game. In the first case cooperative strategies and payoffs are defined by solving the Nash product maximization problem for the whole duration of the game

\[ \big( V_1^{nc}(x, \delta_1) - V_1^{n}(x, \delta_1) \big) \big( V_2^{nc}(x, \delta_2) - V_2^{n}(x, \delta_2) \big) \to \max_{u_1, u_2 \ge 0}, \]

and in the second case the Nash arbitration scheme is activated at each step of the game.

It has been established that, within the framework of the proposed scheme, the cooperative payoffs of the players are greater or equal to (under some parameters) their payoffs gained by egoistic behavior (Section 5.5 provides numerical experiments illustrating this fact). In Section 5.5 we present the cooperative behavior determination adopting the Nash bargaining solution for the fish war model.

4.2. Models with different planning horizons

4.2.1. Fixed planning horizons

Cooperative behavior has not yet been analyzed in the statement of different planning horizons. In this context, we mention the papers (Shevkoplyas, 2011; Marin-Solano and Shevkoplyas, 2011) where the planning horizon is a random variable with a given distribution. When the harvesting time of a player is smaller than that of another, the former harvests the fish stock for a fixed time and is willing to enter cooperation (owing to obvious profitability). But this player has a smaller planning horizon than its partner; and so, the player under consideration is interested in gaining more from cooperation than the player which continues harvesting individually. The cited authors designed a dynamically stable allocation procedure for this model, but with identical discount factors and harvesting times.

The papers (Rettieva, 2015; Mazalov and Rettieva, 2015) introduced the Nash bargaining solution to construct cooperative strategies in the case of different planning horizons.

Consider the harvesting process with the dynamics (6) and different planning horizons. Players 1 and 2 harvest the fish stock during $n_1$ and $n_2$ steps, respectively. For the sake of definiteness, suppose that $n_1 < n_2$. The players’ payoffs take the form

\[ J_1 = \sum_{t=0}^{n_1} \delta_1^t \ln(u_{1t}^c), \qquad J_2 = \sum_{t=0}^{n_1} \delta_2^t \ln(u_{2t}^c) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u_{2t}^a), \tag{17} \]

where $u_i^c$ ($i = 1, 2$) denote the cooperative strategies and $u_2^a$ indicates the strategy of player 2 during individual catch.

To construct the cooperative strategies and payoffs of the players, apply the Nash bargaining solution for the whole duration of the game. Thus, it is required to solve the following optimization problem:

\[ \big( V_1^c(x, \delta_1)[0, n_1] - V_1^N(x, \delta_1)[0, n_1] \big) \times \big( V_2^c(x, \delta_2)[0, n_1] + V_2^{ac}(x^{c n_1}, \delta_2)[n_1, n_2] - V_2^N(x, \delta_2)[0, n_1] - V_2^{aN}(x^{N n_1}, \delta_2)[n_1, n_2] \big) \to \max, \tag{18} \]

where $V_i^N(x, \delta_i)[0, n_1]$ represent the Nash equilibrium payoffs, $V_2^{ac}(x^{c n_1}, \delta_2)[n_1, n_2]$ gives the payoff of player 2 owing to its individual harvesting after $n_1$ steps of cooperative behavior, and $V_2^{aN}(x^{N n_1}, \delta_2)[n_1, n_2]$ is the payoff of player 2 owing to its individual harvesting after $n_1$ steps of noncooperative behavior.

In Section 5.6 we present the cooperative behavior determination adopting the Nash bargaining solution for the fish war model with different harvesting times.

4.2.2. Random planning horizons

The model with random planning horizons in the bioresource exploitation process is the most adequate to reality: external random factors can cause cooperative agreement breach and the participants know nothing about them in advance. Therefore, it seems natural to consider planning horizons as random variables.

In the paper (Mazalov and Rettieva, 2015) we explore the model where players possess heterogeneous discount factors and, moreover, heterogeneous planning horizons. By assumption, players stop cooperation at a random step: external stochastic processes can cause cooperative agreement breach.

Suppose that players 1 and 2 harvest the fish stock during $n_1$ and $n_2$ steps, respectively. Here, $n_1$ represents a discrete random variable taking values $\{1, \ldots, n\}$ with the corresponding probabilities $\{\theta_1, \ldots, \theta_n\}$. Similarly, $n_2$ is a discrete random variable with the same value set and the probabilities $\{\omega_1, \ldots, \omega_n\}$. We assume that the planning horizons are independent. Therefore, during the time period $[0, n_1]$ or $[0, n_2]$ players enter cooperation, and the problem consists in evaluating their strategies. The players’ payoffs are determined via the expectation operator:

\[ J_1 = E \Big\{ \sum_{t=1}^{n_1} \delta_1^t g_1(u_{1t}, u_{2t}) I_{\{n_1 \le n_2\}} + \Big( \sum_{t=1}^{n_2} \delta_1^t g_1(u_{1t}, u_{2t}) + \sum_{t=n_2+1}^{n_1} \delta_1^t g_1(u_{1t}^a) \Big) I_{\{n_1 > n_2\}} \Big\}, \]
\[ J_2 = E \Big\{ \sum_{t=1}^{n_2} \delta_2^t g_2(u_{1t}, u_{2t}) I_{\{n_2 \le n_1\}} + \Big( \sum_{t=1}^{n_1} \delta_2^t g_2(u_{1t}, u_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t g_2(u_{2t}^a) \Big) I_{\{n_2 > n_1\}} \Big\}, \]

where $u_{it}^a$ specifies the strategy of player i when its partner leaves the game, $i = 1, 2$.

To define cooperative behavior, we employ the Nash bargaining solution; the role of status quo points belongs to the noncooperative payoffs of the players. In Section 5.7 we present the cooperative behavior determination applying the Nash bargaining solution for the fish war model with random harvesting times.

An obvious advantage of the Nash bargaining solution consists in the feasibility of treating players individually. According to the conventional approach, the joint cooperative payoff function represents the sum of players’ individual payoffs, which has little to do with real systems. For instance, if the players are neighboring countries, this becomes even impossible (especially in the case of different planning horizons). Other drawbacks of the traditional cooperative design are described in the Introduction and Section 4. In a certain sense, the Nash bargaining solution resembles a Nash equilibrium (see (Mo and Walrand, 2000)). The players act individually as before, but within the boundaries of a cooperative agreement.
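The expectation for player 1 can be evaluated by enumerating the independent horizons. The sketch below assumes that the stage payoffs along a fixed strategy profile are already known as functions of the step t. The names `g_coop` and `g_alone` are hypothetical placeholders for $g_1(u_{1t}, u_{2t})$ and $g_1(u_{1t}^a)$.

```python
from typing import Callable, Sequence


def expected_payoff_player1(
    theta: Sequence[float],           # P(n1 = 1), ..., P(n1 = n)
    omega: Sequence[float],           # P(n2 = 1), ..., P(n2 = n)
    delta1: float,                    # discount factor of player 1
    g_coop: Callable[[int], float],   # stage payoff while both players harvest
    g_alone: Callable[[int], float],  # stage payoff after player 2 has left
) -> float:
    """Expected discounted payoff J_1 under independent random horizons."""
    total = 0.0
    for n1, p1 in enumerate(theta, start=1):
        for n2, p2 in enumerate(omega, start=1):
            joint = min(n1, n2)  # steps on which both players are present
            payoff = sum(delta1 ** t * g_coop(t) for t in range(1, joint + 1))
            # The range is empty when n1 <= n2, matching the indicator terms.
            payoff += sum(delta1 ** t * g_alone(t) for t in range(n2 + 1, n1 + 1))
            total += p1 * p2 * payoff
    return total
```

The payoff of player 2 follows by swapping the roles of `theta` and `omega` and using the second player's discount factor and stage payoffs.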

5. Some results

Here some results of our investigations in the fields of cooperation maintenance and asymmetric players’ problems are presented.

5.1. Incentive equilibrium

Continuous-time model

A dynamic game model of the bioresource management problem is considered in (Mazalov and Rettieva, 2007; Mazalov and Rettieva, 2008). The center (referee) who shares a reservoir, and the players (countries or fishing firms) that harvest the fish stock on their territory are the participants of this game. The equilibria are constructed in the case where the players punish each other for a deviation from the cooperative equilibrium (Ehtamo and Hamalainen, 1993) and in the case where the center punishes them for the deviations.

Let us divide the water area into two parts, $s$ and $1 - s$, where two players exploit the fish stock during T time periods. The center (referee) shares the reservoir. The dynamics of the fishery is described by the equation

\[ x'(t) = F(x(t)) - q_1 E_1(t)(1 - s) x(t) - q_2 E_2(t) s x(t), \quad 0 \le t \le T, \quad x(0) = x_0, \tag{19} \]

where $x(t) \ge 0$ is the population size at time $t \ge 0$, $F$ denotes the natural growth function of the population, $E_1(t), E_2(t) \ge 0$ give players’ fishing efforts measured as the number of vessels involved in fishing at time t, and $q_1, q_2 > 0$ denote catchability coefficients related to the unit fishing effort of the player. We assume that $E_1$, $E_2$ belong to decision sets $D_1$, $D_2$. Let $D_1 = D_2 \subseteq C([0, \infty))$.

Assume that the fish population evolves according to the Verhulst (Gurman, 1978) model

\[ F(x) = r x \Big( 1 - \frac{x}{K} \Big), \]

where $r > 0$ represents the intrinsic growth rate, and $K > 0$ denotes maximal natural object capacity. The players’ net revenues over the fixed time period $[0, T]$ are defined by

\[ J_1 = g_1(x(T)) + \int_0^T e^{-\rho_1 t} \big[ q_1 E_1(t)(1 - s) x(t) \big( p_1 - k_1 q_1 E_1(t)(1 - s) x(t) \big) \big] dt, \]
\[ J_2 = g_2(x(T)) + \int_0^T e^{-\rho_2 t} \big[ q_2 E_2(t) s x(t) \big( p_2 - k_2 q_2 E_2(t) s x(t) \big) \big] dt, \tag{20} \]

where $p_i$ is the price, $k_i$ gives the catching cost, $\rho_i$ denotes the discount rate, $i = 1, 2$. Functions $g_i(x)$ describe the salvage value of the stock at time T. Following usual assumptions on the utility function we suppose that $g_i'(x) \ge 0$, $g_i''(x) \le 0$, $i = 1, 2$.

The player’s profit is presented as an income over the time period $[0, T]$ that depends on the difference between the price and the catching costs with discounting. Here, catching costs have quadratic forms.

Assume that players punish each other for a deviation from the cooperative equilibrium by increasing the control by a value which is proportional to the difference between cooperative and deviating strategies (Ehtamo and Hamalainen, 1993).
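To make the model (19), (20) concrete, the following minimal sketch integrates the Verhulst dynamics with an explicit Euler scheme and accumulates the discounted running payoffs. Constant efforts, zero salvage values $g_i$, the step size and all parameter values in the usage note are illustrative assumptions, not values from the cited papers.

```python
import math


def simulate_fishery(E1, E2, s, r, K, q, p, k, rho, x0, T, dt=0.01):
    """Euler integration of dynamics (19) with running payoffs from (20).

    Efforts E1, E2 are held constant and salvage values are set to zero;
    both are simplifying assumptions made for this sketch only.
    q, p, k, rho are pairs (player 1, player 2).
    """
    x, J1, J2 = x0, 0.0, 0.0
    for n in range(int(round(T / dt))):
        t = n * dt
        h1 = q[0] * E1 * (1.0 - s) * x  # catch of player 1 on the share 1 - s
        h2 = q[1] * E2 * s * x          # catch of player 2 on the share s
        J1 += math.exp(-rho[0] * t) * h1 * (p[0] - k[0] * h1) * dt
        J2 += math.exp(-rho[1] * t) * h2 * (p[1] - k[1] * h2) * dt
        growth = r * x * (1.0 - x / K)  # Verhulst growth F(x)
        x += (growth - h1 - h2) * dt
    return x, J1, J2
```

For example, `simulate_fishery(1.0, 1.0, 0.5, 0.5, 1.0, (0.1, 0.1), (1.0, 1.0), (1.0, 1.0), (0.05, 0.05), 0.5, 50.0)` returns the terminal stock and the two payoffs for identical players on an evenly shared reservoir.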

Proposition 1. The cooperative incentive equilibrium in the problem (19), (20) has the form

\[ \gamma_1(E_2(t)) = E_1^c(t) + \eta_1(t) \big( E_2(t) - E_2^c(t) \big), \qquad \gamma_2(E_1(t)) = E_2^c(t) + \eta_2(t) \big( E_1(t) - E_1^c(t) \big), \]

where

\[ \eta_1(t) = \frac{q_2 \mu_1 \lambda_1(t) s}{q_1 \mu_2 \lambda_2(t)(1 - s)}, \qquad \eta_2(t) = \frac{1}{\eta_1(t)}, \]

cooperative strategies $E_1^c(t)$, $E_2^c(t)$ and conjugate variables $\lambda_i(t)$, $i = 1, 2$, are defined in (Mazalov and Rettieva, 2010).

Denote by $s^c$ the territory sharing under cooperation. We assume that players deviating from the cooperative equilibrium point are punished by the center rather than by each other, as in (Ehtamo and Hamalainen, 1993).

Theorem 1. The cooperative incentive equilibrium in the problem (19), (20) takes the form

\[ \gamma_1(E_2(t)) = \frac{b_1 - \mu_1^{-1} q_1 \lambda(t)}{a_1 (1 - s_2^*(t)) x(t)}, \qquad \gamma_2(E_1(t)) = \frac{b_2 - \mu_2^{-1} q_2 \lambda(t)}{a_2 s_1^*(t) x(t)}, \]

where

\[ s_2^*(t) = s^c - \frac{s^c}{E_2^c(t)} \big( E_2(t) - E_2^c(t) \big), \qquad s_1^*(t) = s^c + \frac{1 - s^c}{E_1^c(t)} \big( E_1(t) - E_1^c(t) \big), \]

and $E_1^c(t)$, $E_2^c(t)$, $x(t)$, $\lambda(t)$ are defined in (Mazalov and Rettieva, 2010).

We give an example where after the second player’s deviation at time instant t = 20 there is no return to cooperative behavior.

Traditional scheme. Fig. 3–5 present the parameters of the model in the cases of cooperation and deviation (dotted line). Fig. 3 shows the population dynamics. Fig. 4 presents the players’ controls (in this model parameters $\eta_1$ and $\eta_2$ and the controls $E_1$ and $E_2$, respectively, are almost equal). Fig. 5 shows the players’ catch ($v_1(t) = q_1 E_1(t)(1 - s(t)) x(t)$, $v_2(t) = q_2 E_2(t) s(t) x(t)$), respectively.

[Figures omitted. Fig. 3. Population size x(t). Fig. 4. Players’ controls E_i(t). Fig. 5. Players’ catch v_i(t). Horizontal axis: t from 0 to 50.]

Our scheme of incentive equilibrium. Fig. 6–11 present the difference between the parameters in the cases of cooperation and deviation (dotted line). Fig. 6 shows the population dynamics. Fig. 7 and 8 present the players’ controls. Notice that the second player increases his fishing effort, while the first player decreases his. Fig. 9 shows water area sharing (s). One can see that s decreases from 0.5 to 0.1. Fig. 10 and 11 present the players’ catch ($v_1(t) = q_1 E_1(t)(1 - s(t)) x(t)$, $v_2(t) = q_2 E_2(t) s(t) x(t)$), respectively. Notice that the first player’s catch increases slightly, while the second player’s catch decreases quickly (from 1420 to 900 individuals per time instant).

[Figures omitted. Fig. 6. Population size x(t). Fig. 7. Player 1’s control E_1(t). Fig. 8. Player 2’s control E_2(t). Fig. 9. Territory sharing s(t). Fig. 10. Player 1’s catch v_1(t). Fig. 11. Player 2’s catch v_2(t). Horizontal axis: t from 0 to 50.]

According to the results of numerical modelling, the center’s participation in optimal resource exploitation regulation has several interesting features. If the center’s strategy is to punish the defaulting player until the end of the planning period, the honest player has visible advantages even in comparison with the cooperative equilibrium, and his opponent incurs remarkable losses. The center’s strategy here is the territory sharing. The player who breaks the agreement achieved at the beginning of the game is punished by gradually decreasing the harvesting territory. This scheme can be easily realized in practice.

The economic feasibility of cooperation maintenance by the center is an advantage for players who keep the agreement achieved at the beginning of the game. There is no need for monitoring the opponent’s actions, which incurs additional costs, and players completely rely on the center. In the case where players control each other’s behavior, when the second player deviates the first player is compelled to increase his fishing efforts too, i.e. to incur additional cost on the operation of a large number of ships. In the case where the center punishes deviating players, the honest player, conversely, reduces his fishing efforts, but his catch increases owing to the change of the catch territory. Thus, he gets a larger profit with smaller expenses for ships’ operation.

Discrete-time model

A discrete-time dynamic game model of the bioresource management problem is considered in (Mazalov and Rettieva, 2008; Mazalov and Rettieva, 2009; Mazalov and Rettieva, 2011).

Let us divide the water area into two parts: $s$ and $1 - s$, where two players exploit the fish stock. The center (referee) shares the reservoir. The players (countries or fishing firms) that exploit the fish stock during infinite time on their territory are the participants of this game.
The fish population evolves according to the equation (the modified fish war model (Levhari and Mirman, 1980)):

\[ x_{t+1} = (\varepsilon x_t)^{\alpha}, \quad x_0 = x, \tag{21} \]

where $x_t \ge 0$ is the population size at time $t \ge 0$, $0 < \varepsilon < 1$ gives the natural death rate, $0 < \alpha < 1$ denotes the natural birth rate.

Suppose that the players’ utility functions are logarithmic. We consider the problem of maximizing the infinite sum of discounted utilities for two players:

\[ J_1 = \sum_{t=0}^{\infty} \delta_1^t \ln\big( (1 - s) x_t u_t^1 \big), \qquad J_2 = \sum_{t=0}^{\infty} \delta_2^t \ln\big( s x_t u_t^2 \big), \tag{22} \]

where $0 \le u_t^i \le 1$ gives player i’s fishing efforts at time t, $0 < \delta_i < 1$ denotes the discount factor for player i, $i = 1, 2$.

To determine the cooperative equilibrium strategies $u_1^c$, $u_2^c$, an approach of passing from the finite-horizon to the infinite-horizon resource management problem is applied (see (Mazalov and Rettieva, 2011)).

Denote by $s^c$ the territory sharing under cooperation. Assume that the center punishes players for a deviation from the cooperative equilibrium. If the first player deviates, the center increases $s^c$; if the second player deviates, it decreases $s^c$.

Theorem 2. The cooperative incentive equilibrium in the problem (21), (22) takes the form

\[ \gamma_1(u_2) = \frac{\varepsilon (1 - \alpha\delta)}{2 (1 - s_2^*)}, \qquad \gamma_2(u_1) = \frac{\varepsilon (1 - \alpha\delta)}{2 s_1^*}, \]

where

\[ s_2^* = s^c - \frac{s^c}{u_2^c} (u_2 - u_2^c), \qquad s_1^* = s^c + \frac{1 - s^c}{u_1^c} (u_1 - u_1^c). \]

In (Mazalov and Rettieva, 2011) it was shown that in the case of a short-time deviation of the second player at step k,

\[ u_2^k = u_2^{ck} + \Delta_k, \]

followed by his return to cooperation, the following properties are satisfied:

1 The steady-state population size under deviation is equal to the cooperative one when the number of steps tends to infinity:

\[ x_n^{otk} \to (\varepsilon\alpha\delta)^{\frac{\alpha}{1 - \alpha}}. \]

2 The conditions of the incentive equilibrium are satisfied:

\[ J_1^{otk} \ge J_1^c, \qquad J_2^{otk} \le J_2^c, \]

where $J_i^c$ denotes player i’s profit when both players apply cooperative strategies, and $J_i^{otk}$ gives player i’s profit when the second player deviates and the center punishes her ($i = 1, 2$).

3 The player who deviates loses less when the number of steps grows: $D_{n+1}$

5.2. Dynamic stability and conditions for rational behavior

Here, we do not focus on the construction of a time-consistent IDP. Our aim is to underline that the each step rational behavior condition (12) is much easier to verify than Yeung’s condition (11). To show it we check both conditions for a bioresource management problem with two players (Mazalov and Rettieva, 2010).

Assume that the fish population evolves according to the equation (the fish war model (Levhari and Mirman, 1980)):

\[ x_{t+1} = (x_t - u_t^1 - u_t^2)^{\alpha}, \quad x_0 = x, \tag{23} \]

where $x_t \ge 0$ is the population size at time $t \ge 0$, $\alpha$ denotes the natural birth rate, $0 < \alpha < 1$, and $u_t^1, u_t^2 \ge 0$ give players’ catch at time t.

The players’ net revenues over the infinite time horizon take the forms

\[ J_i = \sum_{t=0}^{\infty} \delta^t \ln(u_t^i), \quad i = 1, 2, \tag{24} \]

where $\delta$ denotes the discount factor, $0 < \delta < 1$.

The cooperative payoff has the form

\[ \mu_1 J_1 + \mu_2 J_2, \tag{25} \]

where $\mu_1, \mu_2$ denote the weighting coefficients, $0 \le \mu_1, \mu_2 \le 1$, $\mu_1 + \mu_2 = 1$.

First, we determine the Nash equilibrium. The solution of the Bellman equation

\[ V_i(x) = \max_{u_i \ge 0} \big\{ \ln u_i + \delta V_i\big( (x - u_1 - u_2)^{\alpha} \big) \big\}, \quad i = 1, 2, \tag{26} \]

is sought in the form

\[ V_i(x) = A_i \ln x + B_i, \quad i = 1, 2, \]

and we suppose that the optimal strategies are linear: $u_i = \gamma_i x$, $i = 1, 2$. Hence, from equation (26) we get the optimal catch

\[ u_1^N = u_2^N = \frac{1 - a}{2 - a} x \]

and the payoffs

\[ V_1(x) = V_2(x) = \frac{1}{1 - a} \ln x + \frac{1}{1 - \delta} B, \]

where $a = \alpha\delta$, and

\[ B = \ln(1 - a) + \frac{a}{1 - a} \ln a - \frac{1}{1 - a} \ln(2 - a). \]

To determine the cooperative payoff (25) we apply the Bellman principle again. Similar reasoning leads to the total payoff under cooperation

\[ V_{1,2}(x) = \frac{1}{1 - a} \ln x + \frac{1}{1 - \delta} B_{1,2}, \]

where

\[ B_{1,2} = \mu_1 \ln \mu_1 + \mu_2 \ln \mu_2 + \ln(1 - a) + \frac{a}{1 - a} \ln a. \]
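The closed forms above can be checked numerically by substituting them back into the Bellman equations. The sketch below evaluates the residuals for the Nash payoff and for the cooperative payoff. The cooperative catches $u_i = \mu_i (1 - a) x$ used here are an assumption of this sketch, obtained from the first-order conditions of the weighted problem; they are not quoted from the text.

```python
import math


def nash_value(x, alpha, delta):
    """Noncooperative payoff V_i(x) = ln(x)/(1-a) + B/(1-delta)."""
    a = alpha * delta
    B = (math.log(1 - a) + a / (1 - a) * math.log(a)
         - math.log(2 - a) / (1 - a))
    return math.log(x) / (1 - a) + B / (1 - delta)


def coop_value(x, alpha, delta, mu1):
    """Weighted cooperative payoff V_{1,2}(x) for weights (mu1, 1 - mu1)."""
    a = alpha * delta
    mu2 = 1 - mu1
    B12 = (mu1 * math.log(mu1) + mu2 * math.log(mu2)
           + math.log(1 - a) + a / (1 - a) * math.log(a))
    return math.log(x) / (1 - a) + B12 / (1 - delta)


def nash_residual(x, alpha, delta):
    """Left side minus right side of (26) at the Nash catch."""
    a = alpha * delta
    u = (1 - a) / (2 - a) * x
    x_next = (x - 2 * u) ** alpha
    return nash_value(x, alpha, delta) - (
        math.log(u) + delta * nash_value(x_next, alpha, delta))


def coop_residual(x, alpha, delta, mu1):
    """Bellman residual of the weighted problem (25) at u_i = mu_i (1-a) x."""
    a = alpha * delta
    mu2 = 1 - mu1
    u1, u2 = mu1 * (1 - a) * x, mu2 * (1 - a) * x  # assumed optimal catches
    x_next = (x - u1 - u2) ** alpha
    stage = mu1 * math.log(u1) + mu2 * math.log(u2)
    return coop_value(x, alpha, delta, mu1) - (
        stage + delta * coop_value(x_next, alpha, delta, mu1))
```

Both residuals should vanish up to rounding error for any admissible $x$, $\alpha$, $\delta$ and $\mu_1$, which confirms that the constants $B$ and $B_{1,2}$ are consistent with the dynamics (23).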

The dynamics under cooperation is

\[ x_t = x_0^{\alpha^t} a^{\sum_{j=1}^{t} \alpha^j}. \tag{27} \]

The criterion of equal partition is considered as a solution of the cooperative game (23)–(25). This solution coincides with the Shapley value in the two-person game and can be extended to the principle of cooperative gains’ proportional division. The imputation in the problem (23)–(25) takes the form

\[ \xi_1(t) = \xi_2(t) = \frac{1}{2} V_{1,2} = \frac{1}{2(1 - a)} \ln x_t + \frac{1}{2(1 - \delta)} B_{1,2}, \]

where $x_t$ is obtained in (27).

Theorem 3. The incentive conditions for rational behavior are fulfilled in the problem (23)–(25).

Proof. First, verify the each step rational behavior condition (12). Rewrite it in the form

\[ -\frac{1}{2} \ln x_t + \frac{1}{2} \Big[ \mu_1 \ln \mu_1 + (1 - \mu_1) \ln(1 - \mu_1) - \ln(1 - a) + \frac{2}{1 - a} \ln(2 - a) \Big] \ge 0. \]

It is easy to show that the expression in square brackets is greater than $\frac{2}{1 - a} \ln(2 - a) - 1 > 0$. This inequality follows from

\[ \Big( \big( 1 + \tfrac{1}{b} \big)^{b} \Big)^2 > e, \quad b \ge 1, \]

where $b = \frac{1}{1 - a}$.

Now, verify Yeung’s condition (11). For the presented model it takes the form

\[ \frac{a^t - 1}{2(1 - a)} \ln x_0 + \frac{1}{2(1 - a)} \ln a \Big\{ \delta^t \sum_{j=1}^{t} \alpha^j - \frac{\alpha\delta (1 - \delta^t)}{1 - \delta} \Big\} + \frac{1 - \delta^t}{2(1 - \delta)} \Big[ \mu_1 \ln \mu_1 + (1 - \mu_1) \ln(1 - \mu_1) - \ln(1 - a) + \frac{2}{1 - a} \ln(2 - a) \Big] \ge 0. \]

The first expression and, as was already proved, the expression in square brackets are positive. Now, we need to show that

\[ f(t) = \delta^t \sum_{j=1}^{t} \alpha^j - \frac{\alpha\delta (1 - \delta^t)}{1 - \delta} \le 0, \quad \forall t \ge 1. \]

Notice that $f(1) = 0$. Therefore, it is sufficient to prove that $f(t)$ is decreasing.
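Before following the analytic argument, the sign of $f(t)$ can be probed numerically. The sketch below evaluates $f$ directly from its definition; the parameter grid in the usage note is an arbitrary choice for illustration and is no substitute for the proof.

```python
def f(t, alpha, delta):
    """f(t) = delta^t * sum_{j=1..t} alpha^j - alpha*delta*(1 - delta^t)/(1 - delta)."""
    geometric = sum(alpha ** j for j in range(1, t + 1))
    return delta ** t * geometric - alpha * delta * (1 - delta ** t) / (1 - delta)


def f_is_nonpositive(alpha, delta, horizon=30):
    """True if f(1) vanishes and f(t) < 0 for t = 2, ..., horizon."""
    starts_at_zero = abs(f(1, alpha, delta)) < 1e-12
    negative_after = all(f(t, alpha, delta) < 0 for t in range(2, horizon + 1))
    return starts_at_zero and negative_after
```

For instance, `f_is_nonpositive(0.5, 0.9)` evaluates the claim on the first 30 steps; the value at $t = 2$ equals $\alpha\delta(\alpha\delta - 1)$, which is negative for all admissible parameters.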

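\[ f'(t) = \delta^t \alpha \, \frac{\ln \delta \, (1 - \alpha\delta) - \alpha^t \ln(\alpha\delta)(1 - \delta)}{(1 - \alpha)(1 - \delta)} < 0. \]

Denote $f_1(t) = \ln \delta \, (1 - \alpha\delta) - \alpha^t \ln(\alpha\delta)(1 - \delta)$. This function is decreasing: $f_1'(t) < 0$. To check that $f_1(1) < 0$ consider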

\[ f_2(\alpha, \delta) = f_1(1) = \ln \delta \, (1 - \alpha\delta) - \alpha \ln(\alpha\delta)(1 - \delta) = \ln(\alpha\delta)(1 - \alpha) + (\alpha\delta - 1) \ln \alpha. \]

Function $f_2(\alpha, \delta)$ is increasing with respect to δ and α since

\[ \frac{\partial f_2(\alpha, \delta)}{\partial \delta} = \frac{1 - \alpha}{\delta} + \alpha \ln \alpha > 1 - \alpha + \alpha \ln \alpha > 0 \]

and

\[ \frac{\partial f_2(\alpha, \delta)}{\partial \alpha} = \delta - 1 - \ln(\alpha\delta) + \delta \ln \alpha > \ln \alpha \, (\delta - 1) > 0. \]

Finally, $f_2(1, \delta) = f_2(\alpha, 1) = 0$, therefore $f_2(\alpha, \delta) \le 0$.

As one can see in this simple case, the each step rational behavior condition (12) is easier to verify than the irrational behavior proofness condition (11).

5.3. Characteristic function construction

In the paper (Mazalov and Rettieva, 2010) we investigate the model with many players and infinite planning horizon in contrast to the traditional fish war model with two players (Levhari and Mirman, 1980). The characteristic function for the cooperative game is constructed in two unusual forms.

Let n players (countries or fishing firms) exploit the fish stock during an infinite time horizon. The dynamics of the fishery is described by the equation

\[ x_{t+1} = \Big( \varepsilon x_t - \sum_{i=1}^{n} u_{it} \Big)^{\alpha}, \quad x_0 = x, \tag{28} \]

where $x_t \ge 0$ is the population size at time $t \ge 0$, $\varepsilon \in (0, 1)$ denotes the natural death rate, $\alpha \in (0, 1)$ represents the natural birth rate, $u_{it} \ge 0$ gives the catch of player i, $i = 1, \ldots, n$.

Suppose that player i’s utility function is logarithmic. Then the players’ net revenues over the infinite time horizon are defined by

\[ J_i = \sum_{t=0}^{\infty} \delta^t \ln(u_{it}), \quad i = 1, \ldots, n, \tag{29} \]

where $0 < \delta < 1$ denotes the common discount factor.

To construct the characteristic function in the first model we suppose that the players outside coalition K switch to their Nash strategies, which were determined for the initial noncooperative game (Petrosjan and Zaccour, 2003). It is the case where players have no information about the fact that the coalition was formed. In the second model players outside coalition K determine new Nash strategies in the game with $N \setminus K$ players. This case corresponds to the situation where players know that coalition K is formed.

Model without information. First, we determine the Nash equilibrium and get the optimal catch

\[ u_i^N = \frac{1 - a}{n - a(n - 1)} \varepsilon x \tag{30} \]

and the payoffs

\[ V_i(x) = \frac{1}{1 - a} \ln x + \frac{1}{1 - \delta} B_i, \quad i = 1, \ldots, n, \tag{31} \]

where

\[ B_i = \frac{1}{1 - a} \ln \Big( \frac{\varepsilon}{n - a(n - 1)} \Big) + \ln(1 - a) + \frac{a}{1 - a} \ln a, \quad a = \alpha\delta. \]

Now, we determine the payoff of any coalition K with k players. Suppose that players outside coalition K apply their Nash strategies determined in (30). Hence, we get the optimal catch

\[ u_i^K = \frac{(1 - a)(k - a(k - 1))}{k (n - a(n - 1))} \varepsilon x, \quad i \in K, \tag{32} \]

and the payoff of coalition K

\[ V_K(x) = \frac{k}{1 - a} \ln x + \frac{1}{1 - \delta} B_K, \tag{33} \]

where

\[ B_K = \frac{k}{1 - a} \ln \Big( \frac{\varepsilon (k - a(k - 1))}{n - a(n - 1)} \Big) + k \big( \ln(1 - a) - \ln k \big) + \frac{k a}{1 - a} \ln a. \]

Last, we determine the payoff and optimal strategies in the case of full cooperation (grand coalition). From (32) and (33) we get

\[ u_i^I = \frac{1 - a}{n} \varepsilon x, \quad i = 1, \ldots, n, \]
\[ V_I(x) = \frac{n}{1 - a} \ln x + \frac{1}{1 - \delta} B_I, \tag{34} \]

where

\[ B_I = n B_i + n \Big( \frac{1}{1 - a} \ln(n - a(n - 1)) - \ln n \Big). \]

Finally, we have determined the characteristic function for the game starting at time t from the state x:

\[ V(L, x, t) = \begin{cases} 0, & L = \emptyset, \\ V(\{i\}, x, t) = V_i(x), & L = \{i\}, \\ V(K, x, t) = V_K(x), & L = K, \\ V(I, x, t) = V_I(x), & L = I, \end{cases} \tag{35} \]

where $V_i(x)$, $V_K(x)$, $V_I(x)$ are of the forms (31), (33) and (34).

In (Mazalov and Rettieva, 2010) it was proved that the characteristic function (35) is a superadditive function.

Next, the imputation set should be determined. In (Mazalov and Rettieva, 2010) it was proved that the vector $\beta(t) = (\beta_1(t), \ldots, \beta_n(t))$, where

\[ \beta_i(t) = \xi_i(t) - \delta \xi_i(t + 1), \quad i = 1, \ldots, n, \tag{36} \]

is a time-consistent imputation distribution procedure.

Here, the Shapley value is adopted as the cooperative optimality principle. It takes the form

\[ \xi_i(t) = \frac{1}{1 - a} \ln x_t + \frac{1}{1 - \delta} (B_i + B_\xi), \quad i = 1, \ldots, n, \tag{37} \]

where

\[ B_\xi = \frac{1}{1 - a} \ln(1 + (n - 1)(1 - a)) - \ln n \ge 0. \]
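The constants of the model without information are tied together by simple identities, which the following sketch checks numerically: $B_K$ reduces to $B_i$ for $k = 1$ and to $B_I$ for $k = n$, and the Shapley constant satisfies $B_\xi = B_I / n - B_i$, which follows from dividing (34) equally among symmetric players. The function names and the parameter values in the usage note are hypothetical.

```python
import math


def b_single(n, a, eps):
    """B_i from (31): constant of the Nash payoff of one player."""
    return (math.log(eps / (n - a * (n - 1))) / (1 - a)
            + math.log(1 - a) + a / (1 - a) * math.log(a))


def b_coalition(k, n, a, eps):
    """B_K from (33): coalition of size k, outsiders keep the strategies (30)."""
    return (k / (1 - a) * math.log(eps * (k - a * (k - 1)) / (n - a * (n - 1)))
            + k * (math.log(1 - a) - math.log(k))
            + k * a / (1 - a) * math.log(a))


def b_grand(n, a, eps):
    """B_I from (34): constant of the grand coalition payoff."""
    return (n * b_single(n, a, eps)
            + n * (math.log(n - a * (n - 1)) / (1 - a) - math.log(n)))


def b_shapley(n, a):
    """B_xi from (37): per-player gain constant of the Shapley value."""
    return math.log(1 + (n - 1) * (1 - a)) / (1 - a) - math.log(n)
```

With, say, `n = 5`, `a = 0.54` and `eps = 0.8`, the value `b_grand(n, a, eps) / n - b_single(n, a, eps)` agrees with `b_shapley(n, a)` up to rounding, and the latter is nonnegative.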

Theorem 4. The Shapley value (37) is time-consistent and both conditions for rational behavior ((11) and (12)) are satisfied.

Proof. From (36) and (37) we get

\[ \beta_i(t) = \frac{1}{1 - a} (\ln x_t - \delta \ln x_{t+1}) + B_i + B_\xi, \quad i = 1, \ldots, n. \]

Yeung’s condition (11) takes the form

\[ \frac{1}{1 - a} (\ln x_0 - \delta^t \ln x_t) + \frac{1 - \delta^t}{1 - \delta} (B_i + B_\xi) \ge \frac{1}{1 - a} (\ln x_0 - \delta^t \ln x_t) + \frac{1 - \delta^t}{1 - \delta} B_i \]

and is fulfilled as $B_\xi \ge 0$.

The each step rational behavior condition (12) takes the form

\[ \frac{1}{1 - a} (\ln x_t - \delta \ln x_{t+1}) + B_i + B_\xi \ge \frac{1}{1 - a} (\ln x_t - \delta \ln x_{t+1}) + B_i \]

and is also valid as $B_\xi \ge 0$.

Model with informed players. Consider the case where players outside coalition K determine new Nash strategies in the game with $N \setminus K$ players. This case corresponds to the situation where players know that coalition K is formed.

Hence, the difference from the previous case is only in determining $V_K$. For players from coalition K we solve the Bellman equation

\[ \tilde{V}_K(x) = \max_{u_i, \, i \in K} \Big\{ \sum_{i \in K} \ln u_i + \delta \tilde{V}_K \Big( \Big( \varepsilon x - \sum_{i \in K} u_i - \sum_{i \in N \setminus K} \tilde{u}_i^N \Big)^{\alpha} \Big) \Big\}, \]

where $\tilde{u}_i^N$, $i \in N \setminus K$, corresponds to the solution of the Bellman equation for players outside the coalition K

\[ \tilde{V}_i(x) = \max_{\tilde{u}_i, \, i \in N \setminus K} \Big\{ \ln \tilde{u}_i + \delta \tilde{V}_i \Big( \Big( \varepsilon x - \sum_{i \in K} u_i - \sum_{i \in N \setminus K} \tilde{u}_i \Big)^{\alpha} \Big) \Big\}, \quad i \in N \setminus K. \]

Now, we get the optimal catch of coalition K members

\[ \tilde{u}_i^K = \frac{1 - a}{k (1 + (n - k)(1 - a))} \varepsilon x, \quad i \in K, \]

and the payoff of coalition K

\[ \tilde{V}_K(x) = \frac{k}{1 - a} \ln x + \frac{1}{1 - \delta} \tilde{B}_K, \tag{38} \]

where

\[ \tilde{B}_K = k \Big( \frac{1}{1 - a} \ln \Big( \frac{\varepsilon}{1 + (n - k)(1 - a)} \Big) + \ln(1 - a) + \frac{a}{1 - a} \ln a - \ln k \Big). \]

Hence, the characteristic function for the game starting at time t from the state x will be of the form

\[ V(L, x, t) = \begin{cases} 0, & L = \emptyset, \\ V(\{i\}, x, t) = V_i(x), & L = \{i\}, \\ V(K, x, t) = \tilde{V}_K(x), & L = K, \\ V(I, x, t) = V_I(x), & L = I, \end{cases} \tag{39} \]

where $V_i(x)$, $\tilde{V}_K(x)$, $V_I(x)$ are of the forms (31), (38) and (34).

Theorem 5. The characteristic function (39) has a superadditive property if $k, l \ge \frac{n+1}{3}$.

This result shows that it is profitable to merge two coalitions when both of them have a sufficiently large number of participants.

Similarly to the first model we determine the Shapley value and the time-consistent imputation distribution procedure. From (38) we get

\[ \xi_i(t) = \frac{1}{1 - a} \ln x_t + \frac{1}{1 - \delta} (B_i + B_\xi), \quad i = 1, \ldots, n, \tag{40} \]

where

\[ B_\xi = \sum_{K \subset N, \, i \in K} \frac{(n - k)! (k - 1)!}{n!} \Big[ k \Big( \frac{1}{1 - a} \ln \Big( \frac{1 + (n - 1)(1 - a)}{1 + (n - k)(1 - a)} \Big) - \ln k \Big) - (k - 1) \Big( \frac{1}{1 - a} \ln \Big( \frac{1 + (n - 1)(1 - a)}{1 + (n - k + 1)(1 - a)} \Big) - \ln(k - 1) \Big) \Big] \]
\[ = \frac{1}{n} \sum_{k=1}^{n} \Big[ k \Big( \frac{1}{1 - a} \ln \Big( \frac{1 + (n - 1)(1 - a)}{1 + (n - k)(1 - a)} \Big) - \ln k \Big) - (k - 1) \Big( \frac{1}{1 - a} \ln \Big( \frac{1 + (n - 1)(1 - a)}{1 + (n - k + 1)(1 - a)} \Big) - \ln(k - 1) \Big) \Big] \]
\[ = \frac{1}{1 - a} \ln(1 + (n - 1)(1 - a)) - \ln n. \]

The proof that the Shapley value is time-consistent and both conditions for rational behavior are satisfied is similar to Theorem 4.

Some properties of the two variants of characteristic function construction were proved in (Mazalov and Rettieva, 2010):

1 The second model is better for free-riding.
2 The profit of coalition K in the first model is greater than in the second model.
3 The first model in the case of coalition K formation is better for population size.
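The closed form of $B_\xi$ in the model with informed players rests on a telescoping sum over coalition sizes. The sketch below evaluates the weighted sum directly and compares it with the closed form. Treating the $k - 1 = 0$ term as zero is a convention of this sketch (an empty coalition contributes nothing), and the function names are hypothetical.

```python
import math


def size_term(k, n, a):
    """k * (ln(D / D_k) / (1 - a) - ln k), with the convention that k = 0 gives 0."""
    if k == 0:
        return 0.0
    d_full = 1 + (n - 1) * (1 - a)
    d_k = 1 + (n - k) * (1 - a)
    return k * (math.log(d_full / d_k) / (1 - a) - math.log(k))


def shapley_constant_sum(n, a):
    """Weighted sum of marginal contributions over coalitions containing player i."""
    total = 0.0
    for k in range(1, n + 1):
        weight = math.factorial(n - k) * math.factorial(k - 1) / math.factorial(n)
        count = math.comb(n - 1, k - 1)  # coalitions of size k that contain i
        total += count * weight * (size_term(k, n, a) - size_term(k - 1, n, a))
    return total


def shapley_constant_closed(n, a):
    """Closed form ln(1 + (n-1)(1-a)) / (1-a) - ln n."""
    return math.log(1 + (n - 1) * (1 - a)) / (1 - a) - math.log(n)
```

Each coalition size carries total weight 1/n, so the sum collapses to the last term divided by n, which is exactly the closed form.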

Fig. 12 presents the time-consistent imputation distribution procedure ($\beta_i(t)$) for player i (dark line), player i’s Nash profit $V(\{i\})$ (bright line) and her Shapley value $\xi_i(0)$ (dotted line), $i = 1, \ldots, n$.

Notice that the distribution procedure is greater than the profit in the noncooperative case at every time instant. Hence, the figure shows how to distribute the cooperative gain (the Shapley value) along the game path.

Now, we show the difference between the two approaches of coalition K formation. Fig. 13 presents the population dynamics in the case of non-informed players (dark line) and in the case of informed players (bright line). As one can notice, the population size in the first case is larger. This result shows that for ecological systems the situation where a coalition is formed and other players do not have information about it is preferable.

Fig. 12. IDP, Nash profit and Sh_i(0) (curves β_i(t), V_i(t) and Sh_i(0) against time t)
Fig. 13. Population size (x(t) against time t)

Fig. 14 presents the difference between the profits of coalition K in the two cases considered. Clearly, it is profitable for the coalition to be formed without the other players noticing. Fig. 15 illustrates the difference between the profits of player i outside the coalition K in the two cases. Notice that the second model is better for free-riding.

Fig. 14. Profit of coalition K (V_K(t) against time t)
Fig. 15. Profit of player i outside K (V_i(t) against time t)

5.4. Coalition stability

In (Rettieva, 2011; Rettieva, 2012) we consider a discrete-time game model related to a bioresource management problem (fish catching). The reservoir is divided into regions where players (countries or fishing firms) of two types harvest the fish stock. We assume that there are migratory exchanges between the regions (Fisher and Mirman, 1992; Fisher and Mirman, 1996). So the stock in one region (where players of type 1 exploit the fish) depends not only on the previous stock and catch in that region, but also on the stock and catch in the other region (where players of type 2 exploit the fish).

Here, in contrast to the grand coalition formation, we consider a coalition structure where players of each type can form a coalition. Therefore, there can be two coalitions and single players of each type in the game. The sizes of stable coalitions are the subject of investigation.

Two ways to construct the players' optimal strategies are considered: either all players decide simultaneously (Nash-Cournot strategies), or members of coalitions are assumed to be the leaders and players decide sequentially (Stackelberg strategies). Furthermore, the characteristic function is constructed in an unusual form: players outside the coalition K determine new Nash strategies in the game with N \ K players. This case corresponds to the situation when players know that coalition K is formed (see Section 5.3).

We divide a fishery into regions, which are exploited by two types of players: $i \in N = \{1,\dots,n\}$ and $j \in M = \{1,\dots,m\}$. The players (countries or fishing firms) that harvest the fish stock are the participants of the game. The fish populations evolve according to the system of equations (the great fish war model (Fisher and Mirman, 1996)):

$$
\begin{cases}
x_{t+1} = x_t^{\alpha_1}\, y_t^{\beta_1}, & x_0 = x,\\
y_{t+1} = y_t^{\alpha_2}\, x_t^{\beta_2}, & y_0 = y,
\end{cases}
$$

where $x_t \ge 0$ is the population size in the first region at time $t \ge 0$, $y_t \ge 0$ denotes the population size in the second region at time $t \ge 0$, $0 < \alpha_i < 1$ gives the natural birth rate, and $0 < \beta_i < 1$ denotes the coefficients of migration between the regions, i = 1, 2. Here, $\alpha_i$ represents the direct effect of the stock on the stock in the same territory in the next period, and $\beta_i$ represents the effect of migration between the two parts of the reservoir.

Let the players $N = \{1,\dots,n\}$ exploit the stock $x_t$ and the players $M = \{1,\dots,m\}$ harvest the stock $y_t$. Suppose that the utility functions of the players are logarithmic. Then the players' net revenues over the infinite time horizon are defined by

$$J_i = \sum_{t=0}^{\infty} \delta^t \ln(u_{it}), \quad i \in N, \qquad J_j = \sum_{t=0}^{\infty} \delta^t \ln(v_{jt}), \quad j \in M, \tag{41}$$

where $u_{it} \ge 0$, $v_{jt} \ge 0$ give the players' catch at time $t \ge 0$ ($i \in N$, $j \in M$), and $0 < \delta < 1$ denotes the common discount factor. Each player is interested in maximizing the sum of her discounted utility. And the dynamics become

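$$
\begin{cases}
x_{t+1} = \Big(x_t - \sum\limits_{i=1}^{n} u_{it}\Big)^{\alpha_1} \Big(y_t - \sum\limits_{j=1}^{m} v_{jt}\Big)^{\beta_1}, & x_0 = x,\\[2mm]
y_{t+1} = \Big(y_t - \sum\limits_{j=1}^{m} v_{jt}\Big)^{\alpha_2} \Big(x_t - \sum\limits_{i=1}^{n} u_{it}\Big)^{\beta_2}, & y_0 = y.
\end{cases}
\tag{42}
$$

Nash-Cournot strategies. Players outside coalition K (L) determine new Nash strategies in the game with N \ K (M \ L) players. Players wish to maximize the following functionals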

$$J^k = \sum_{t=0}^{\infty} \delta^t \Big[\sum_{i \in K} \ln(u^k_{it})\Big], \qquad J^l = \sum_{t=0}^{\infty} \delta^t \Big[\sum_{j \in L} \ln(v^l_{jt})\Big],$$

$$J_i^N = \sum_{t=0}^{\infty} \delta^t \ln(u^N_{it}), \quad i \in N \setminus K, \qquad J_j^N = \sum_{t=0}^{\infty} \delta^t \ln(v^N_{jt}), \quad j \in M \setminus L.$$

Stackelberg strategies. Assume that members of coalitions are the leaders and players decide sequentially. Hence, at first, singletons determine the Nash optimal strategies under the assumption that cooperative strategies are known. Then, the coalition members obtain their optimal catch.
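To make the dynamics (42) and the payoffs (41) concrete, the following minimal sketch simulates the two-region stock when every player simply takes a fixed share of the current stock in her region. The shares `gamma` and `mu`, the parameter values and the truncation of the infinite horizon to `steps` periods are assumptions made for illustration only; they are not the equilibrium strategies derived in the paper.

```python
import math

def simulate_two_regions(x0, y0, alpha, beta, gamma, mu, n, m, delta, steps):
    """Iterate dynamics (42) with proportional catches u = gamma*x, v = mu*y.

    alpha, beta: pairs (alpha_1, alpha_2) and (beta_1, beta_2).
    Returns the stock path and the truncated discounted log-payoffs (41)
    of one type-1 player and one type-2 player.
    """
    assert n * gamma < 1 and m * mu < 1, "total catch must leave some stock"
    x, y = x0, y0
    payoff_u = payoff_v = 0.0
    path = [(x, y)]
    for t in range(steps):
        u, v = gamma * x, mu * y              # individual catches
        payoff_u += delta ** t * math.log(u)
        payoff_v += delta ** t * math.log(v)
        x_left, y_left = x - n * u, y - m * v  # escapement in each region
        x, y = (x_left ** alpha[0] * y_left ** beta[0],
                y_left ** alpha[1] * x_left ** beta[1])
        path.append((x, y))
    return path, payoff_u, payoff_v

if __name__ == "__main__":
    path, ju, jv = simulate_two_regions(
        x0=0.8, y0=0.6, alpha=(0.6, 0.5), beta=(0.2, 0.3),
        gamma=0.05, mu=0.04, n=3, m=4, delta=0.9, steps=20)
    print(path[-1], ju, jv)
```

Replacing the fixed shares by Nash-Cournot or Stackelberg strategies changes only the two lines that compute `u` and `v`.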

I) Coalition members' strategies $u_i^k$, $i \in K$ and $v_j^l$, $j \in L$ are fixed. Singletons wish to maximize their net revenues

$$J_i^N = \sum_{t=0}^{\infty} \delta^t \ln(u_{it}), \quad i \in N \setminus K, \qquad J_j^N = \sum_{t=0}^{\infty} \delta^t \ln(v_{jt}), \quad j \in M \setminus L$$

under the dynamics

$$
\begin{cases}
x_{t+1} = \Big(x_t - \sum\limits_{i \in K} u^k_{it} - \sum\limits_{i \in N \setminus K} u_{it}\Big)^{\alpha_1} \Big(y_t - \sum\limits_{j \in L} v^l_{jt} - \sum\limits_{j \in M \setminus L} v_{jt}\Big)^{\beta_1}, & x_0 = x,\\[2mm]
y_{t+1} = \Big(y_t - \sum\limits_{j \in L} v^l_{jt} - \sum\limits_{j \in M \setminus L} v_{jt}\Big)^{\alpha_2} \Big(x_t - \sum\limits_{i \in K} u^k_{it} - \sum\limits_{i \in N \setminus K} u_{it}\Big)^{\beta_2}, & y_0 = y.
\end{cases}
$$

Denote by $\tilde u_i^N$, $i \in N \setminus K$ and $\tilde v_j^N$, $j \in M \setminus L$, the obtained strategies.

II) Coalition members maximize the joint payoff

$$J^k = \sum_{t=0}^{\infty} \delta^t \Big[\sum_{i \in K} \ln(u_{it})\Big], \qquad J^l = \sum_{t=0}^{\infty} \delta^t \Big[\sum_{j \in L} \ln(v_{jt})\Big]$$

under the dynamics

$$
\begin{cases}
x_{t+1} = \Big(x_t - \sum\limits_{i \in K} u_{it} - \sum\limits_{i \in N \setminus K} \tilde u^N_{it}\Big)^{\alpha_1} \Big(y_t - \sum\limits_{j \in L} v_{jt} - \sum\limits_{j \in M \setminus L} \tilde v^N_{jt}\Big)^{\beta_1}, & x_0 = x,\\[2mm]
y_{t+1} = \Big(y_t - \sum\limits_{j \in L} v_{jt} - \sum\limits_{j \in M \setminus L} \tilde v^N_{jt}\Big)^{\alpha_2} \Big(x_t - \sum\limits_{i \in K} u_{it} - \sum\limits_{i \in N \setminus K} \tilde u^N_{it}\Big)^{\beta_2}, & y_0 = y.
\end{cases}
$$

Denote by $\tilde u_i^k$, $i \in K$ and $\tilde v_j^l$, $j \in L$, the obtained coalition members' strategies.

In (Rettieva, 2012) it was proved that the payoff of a singleton is greater under Nash-Cournot strategies than under Stackelberg strategies, while for a coalition member the opposite holds. It was also proved that the payoff of a coalition member is greater when two coalitions K and L form than when players join into one mixed coalition K + L.

Then we checked the internal and external stability of our coalitions (D'Aspremont et al., 1983). Unfortunately, in our model, just like in the classical papers (Barrett, 1994; Carraro and Siniscalco, 1992), only small-size coalitions are internally stable (for Nash-Cournot strategies). For Stackelberg strategies, on the other hand, coalitions are internally, but not externally stable.

We adopt the new coalition stability approach (15), (16) for the presented model (41), (42) and get the coalition stability conditions in the forms

$$C_k^{k} - \sum_{s=1}^{p}\sum_{h=0}^{l} \frac{(p+l-h-s)!\,(h+s-1)!\,(p-1)!\,l!}{(p+l)!\,(p-s)!\,(s-1)!\,(l-h)!\,h!}\,\big[C^{s+h} - C^{s-1+h}\big] \ge 0, \tag{43}$$

$$C_l^{l} - \sum_{s=0}^{k}\sum_{h=1}^{q} \frac{(k+q-h-s)!\,(h+s-1)!\,(q-1)!\,k!}{(k+q)!\,(q-h)!\,(h-1)!\,(k-s)!\,s!}\,\big[C^{s+h} - C^{s+h-1}\big] \ge 0, \tag{44}$$

where the parameters are given in (Rettieva, 2012).

We consider this model for a set of parameters typical for the fish species in Karelian lakes and obtain the following results.

For Nash-Cournot strategies only the coalitions of size 1 (k = 1 or l = 1) are internally stable. External stability is valid for all k if n > 2 (for all l if m > 2).

For Stackelberg strategies internal stability is valid for all k if n < 35 (for all l if m < 51). Unfortunately, there are no externally stable coalitions.

Tables 1 and 2 present the coalition structures which are stable in the sense of coalitional stability (43), (44). We use the notation + for the coalitions that are stable for all $p \in [1, k]$ and $q \in [1, l]$. A pair of numbers (the first for p, the second for q) marks the coalitions that are stable for p and q greater than or equal to these values.

k/l  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
1    +    +    +    +    1,3  1,4  1,5  1,6  1,7  1,8  1,8  1,9  1,9  1,8  1,7
2    +    +    +    +    +    +    1,3  1,4  1,5  1,6  1,7  1,7  1,7  1,7  1,5
3    +    +    +    +    +    +    +    +    +    1,4  1,5  1,5  1,6  1,6  1,4
4    +    +    +    +    +    +    +    +    +    +    +    1,3  1,4  1,4  1,3
5    3,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +
6    4,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +
7    4,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +
8    5,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +
9    5,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +
10   4,1  +    +    +    +    +    +    +    +    +    +    +    +    +    +

Table 1. Nash-Cournot strategies (rows: k, columns: l)

k/l  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
1    +    +    1,2  1,2  1,3  1,3  1,4  1,4  1,5  1,5  1,6  1,6  1,7  1,7  1,7
2    +    +    2,2  2,2  2,2  2,3  2,3  2,4  2,4  2,5  2,5  2,6  2,5  2,7  2,7
3    2,1  +    2,2  2,2  2,2  2,3  2,3  2,4  2,4  2,4  2,5  2,5  2,5  2,6  2,7
4    2,1  2,2  2,2  2,2  2,2  2,3  2,3  2,3  2,4  2,4  2,5  2,5  2,5  2,6  2,6
5    3,1  2,2  2,2  2,2  2,2  2,2  2,3  2,3  2,4  2,4  2,4  2,5  2,5  2,6  2,5
6    3,1  3,3  3,2  2,2  2,2  2,2  2,3  2,3  2,3  2,4  2,4  2,5  2,5  2,5  2,5
7    4,1  3,2  3,2  3,2  3,2  2,2  2,3  2,3  2,3  2,4  2,4  2,4  2,5  2,5  2,5
8    5,1  4,2  4,2  3,2  3,2  3,2  3,3  3,3  2,3  2,4  2,4  2,4  2,5  2,5  2,4
9    5,1  5,2  4,2  4,2  4,2  4,2  3,3  3,3  3,3  3,3  2,4  2,4  2,4  2,5  2,4
10   6,1  5,2  5,2  5,2  4,2  4,2  4,2  4,3  3,3  3,3  3,3  3,4  2,4  2,4  2,3

Table 2. Stackelberg strategies (rows: k, columns: l)
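The reading rule for the tables (a "+" means stable for every p and q, a pair "p,q" gives the smallest stable values) can be sketched as a small lookup. The encoding below copies Table 1; the helper names `thresholds` and `is_stable` are hypothetical and exist only for this illustration.

```python
def _row(first_cells, total=15):
    """Pad a row with '+' cells up to the 15 columns of the table."""
    return first_cells + ["+"] * (total - len(first_cells))

# Table 1 (Nash-Cournot strategies): TABLE_1[k][l - 1] is the cell for (k, l).
TABLE_1 = {
    1: "+ + + + 1,3 1,4 1,5 1,6 1,7 1,8 1,8 1,9 1,9 1,8 1,7".split(),
    2: "+ + + + + + 1,3 1,4 1,5 1,6 1,7 1,7 1,7 1,7 1,5".split(),
    3: "+ + + + + + + + + 1,4 1,5 1,5 1,6 1,6 1,4".split(),
    4: "+ + + + + + + + + + + 1,3 1,4 1,4 1,3".split(),
    5: _row(["3,1"]), 6: _row(["4,1"]), 7: _row(["4,1"]),
    8: _row(["5,1"]), 9: _row(["5,1"]), 10: _row(["4,1"]),
}

def thresholds(table, k, l):
    """Smallest (p, q) for which the structure (k, l) is reported stable."""
    cell = table[k][l - 1]
    if cell == "+":
        return 1, 1
    p_min, q_min = (int(v) for v in cell.split(","))
    return p_min, q_min

def is_stable(table, k, l, p, q):
    """True if moves of p members of K and q members of L are not profitable."""
    p_min, q_min = thresholds(table, k, l)
    return p >= p_min and q >= q_min

if __name__ == "__main__":
    print(thresholds(TABLE_1, 3, 12))       # (1, 5)
    print(is_stable(TABLE_1, 10, 15, 1, 1))  # True
```

The same functions work for Table 2 once its rows are entered in the same format.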

For intercoalition stability (p = 1 and q = 1), even the coalition structure consisting of all players (k = 10, l = 15) is stable for Nash-Cournot strategies. For Stackelberg strategies the maximal stable coalition structure is (k = 3, l = 2).

For the situations where a set of coalition members can move to the other coalition, one can notice that the stability concept is valid for different coalition sizes (the larger the coalition, the larger the set of its members who have an incentive to move). For example, under Nash-Cournot strategies the coalition structure k = 3, l = 12 is protected against the possible moves of 5 or more members of coalition L and is unstable for q < 5. For Stackelberg strategies this coalition structure is also unstable (stability requires p ≥ 2), since it is profitable for any member of coalition K to move to coalition L.

The results of numerical modelling show that the load on the stock is minimal when players join into one coalition (cooperative case). However, the formation of the grand coalition is not natural for asymmetric players. Furthermore, we proved that it is less profitable for players to join into one mixed coalition than to form two coalitions. Hence, to minimize the load on the stock, the coalition structure should consist of a large number of players and be stable.

We give some advice for ecological managers on how to improve population growth in the case of asymmetric exploiters.

For Nash-Cournot strategies internal stability cannot be guaranteed, but the structure is protected from individual moves from one coalition to another (intercoalition stability). For Stackelberg strategies the coalitions are internally stable, but the intercoalition stability condition is not valid.

Therefore, the manager should first determine the coalition formation process and then proceed as follows.

If it is Nash-Cournot, then one should use some mechanism to stabilize the coalitions internally: fines for breaking the cooperative agreement, punishment schemes like the incentive equilibrium (Mazalov and Rettieva, 2010), or transfer schemes. If this is done successfully, then there is no need to worry about possible moves of players from one coalition to another, because the coalitions are intercoalitionally stable for almost all parameters.

If it is Stackelberg, then one should prohibit individual moves from one coalition to another (again, this can be done by government laws or punishment schemes). Then the coalition structure will be stable for most of the parameters in the sense of internal and coalition stability. The manager should not worry about external stability, because the more players decide to enter coalitions, the larger the population size will be.

5.5. Different discount factors

Traditionally, cooperative behavior analysis in bioresource management problems rests on the assumption of identical discount factors for all players. In the papers (Rettieva, 2014; Mazalov and Rettieva, 2014; Mazalov and Rettieva, 2015) we seek an optimal compromise in the case of heterogeneous goals pursued by players (different discount factors).

Consider a discrete-time game-theoretic bioresource management model in which both players have an identical planning horizon but different discount factors. Suppose that two players (countries or fishing firms) harvest a fish stock on a finite planning horizon [0, n]. The fish population evolves according to the equation

$$x_{t+1} = (\varepsilon x_t - u_{1t} - u_{2t})^{\alpha}, \quad x_0 = x, \tag{45}$$

where $x_t \ge 0$ is the population size at time $t \ge 0$, $\varepsilon \in (0,1)$ denotes the natural survival rate, $\alpha \in (0,1)$ indicates the growth rate, and $u_{it} \ge 0$ gives the catch of player i, i = 1, 2.

By assumption, the players possess logarithmic payoff functions and different discount factors. In other words, the payoff functions of the players are defined by

$$J_i = \sum_{t=0}^{n} \delta_i^t \ln(u_{it}), \tag{46}$$

where $\delta_i \in (0,1)$ denotes the discount factor of player i, i = 1, 2.
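As a minimal sketch of the model (45), (46), the code below iterates the stock under linear strategies $u_{it} = \gamma_i x_t$ and accumulates each player's discounted log-payoff. The constant shares and all numerical values are assumptions chosen for illustration; the equilibrium and cooperative shares in the text vary with time.

```python
import math

def simulate_single_stock(x0, eps, alpha, gammas, deltas, n):
    """Stock path of (45) and payoffs (46) for constant catch shares.

    gammas: shares (gamma_1, gamma_2) of the current stock taken by the players.
    deltas: individual discount factors (delta_1, delta_2).
    """
    assert sum(gammas) < eps, "total catch must not exceed the surviving stock"
    x = x0
    payoffs = [0.0, 0.0]
    stock = [x]
    for t in range(n + 1):                       # t = 0, ..., n as in (46)
        catches = [g * x for g in gammas]
        for i in range(2):
            payoffs[i] += deltas[i] ** t * math.log(catches[i])
        x = (eps * x - sum(catches)) ** alpha    # dynamics (45)
        stock.append(x)
    return stock, payoffs

if __name__ == "__main__":
    sparing, _ = simulate_single_stock(0.8, 0.8, 0.3, (0.1, 0.1), (0.9, 0.8), 20)
    greedy, _ = simulate_single_stock(0.8, 0.8, 0.3, (0.2, 0.2), (0.9, 0.8), 20)
    print(sparing[-1], greedy[-1])
```

With smaller shares the escapement is larger at every step, so the stock path under sparing exploitation stays above the path under heavier exploitation.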

Theorem 6. The Nash equilibrium strategies in the problem (45), (46) have the form

$$u^N_{1t} = \frac{\varepsilon a_2 \sum\limits_{j=0}^{t-1} a_2^j}{\sum\limits_{j=0}^{t} a_1^j \sum\limits_{j=0}^{t} a_2^j - 1}\, x, \qquad u^N_{2t} = \frac{\varepsilon a_1 \sum\limits_{j=0}^{t-1} a_1^j}{\sum\limits_{j=0}^{t} a_1^j \sum\limits_{j=0}^{t} a_2^j - 1}\, x,$$

where $a_i = \alpha\delta_i$, i = 1, 2, t = 1,...,n. The individual payoffs of the players make up

$$V_i^N(x, \delta_i) = \sum_{j=0}^{n} (a_i)^j \ln x + \sum_{j=1}^{n} (\delta_i)^{n-j} A_{ij} - (\delta_i)^n \ln k, \quad i = 1, 2, \tag{47}$$

$$A_{lj} = \ln\Bigg[\Bigg(\frac{\varepsilon \sum\limits_{k=1}^{j} a_p^k}{\sum\limits_{k=0}^{j} a_1^k \sum\limits_{k=0}^{j} a_2^k - 1}\Bigg)^{\sum\limits_{k=0}^{j} a_l^k} \Big(\sum_{k=1}^{j} a_l^k\Big)^{\sum\limits_{k=1}^{j} a_l^k}\Bigg], \quad l, p = 1, 2, \ l \ne p, \ j = 1,\dots,n. \tag{48}$$

Multi-step game and recursive Nash bargaining solution. Define cooperative behavior in this model by a recursive bargaining procedure (Rettieva, 2014). At each step, cooperative strategies are found via a bargaining solution, where noncooperative payoffs play the role of status quo points.

Theorem 7. The cooperative payoffs in the problem (45), (46) possess the form

$$H^c_{1n}(\gamma^c_{11},\dots,\gamma^c_{1n},\gamma^c_{21},\dots,\gamma^c_{2n}; x) = \sum_{j=0}^{n} a_1^j \ln(x) - \delta_1^n \ln(k) + \sum_{j=0}^{n-1} \delta_1^{n-j}\Big[\ln(\gamma^c_{1\,n-j}) + \sum_{i=1}^{n-j} a_1^i \ln(\varepsilon - \gamma^c_{1\,n-j} - \gamma^c_{2\,n-j})\Big],$$

$$H^c_{2n}(\gamma^c_{11},\dots,\gamma^c_{1n},\gamma^c_{21},\dots,\gamma^c_{2n}; x) = \sum_{j=0}^{n} a_2^j \ln(x) - \delta_2^n \ln(1-k) + \sum_{j=0}^{n-1} \delta_2^{n-j}\Big[\ln(\gamma^c_{2\,n-j}) + \sum_{i=1}^{n-j} a_2^i \ln(\varepsilon - \gamma^c_{1\,n-j} - \gamma^c_{2\,n-j})\Big].$$

The cooperative strategies can be evaluated recursively using the equations

$$\gamma^c_{2n} \sum_{j=0}^{n-1}\Big(\delta_2^{n-j}\Big[\ln(\gamma^c_{2\,n-j}) + \sum_{i=1}^{n-j} a_2^i \ln(\varepsilon - \gamma^c_{1\,n-j} - \gamma^c_{2\,n-j})\Big] - \delta_2^{j} A_{2\,n-j}\Big) = \gamma^c_{1n} \sum_{j=0}^{n-1}\Big(\delta_1^{n-j}\Big[\ln(\gamma^c_{1\,n-j}) + \sum_{i=1}^{n-j} a_1^i \ln(\varepsilon - \gamma^c_{1\,n-j} - \gamma^c_{2\,n-j})\Big] - \delta_1^{j} A_{1\,n-j}\Big)$$

subject to the constraint

$$\gamma^c_{2n} = \frac{\varepsilon - \gamma^c_{1n} \sum\limits_{i=0}^{n} a_1^i}{\sum\limits_{i=0}^{n} a_2^i},$$

where $A_{ij}$ are defined by (48).

In (Rettieva, 2014; Mazalov and Rettieva, 2014) we have performed numerical simulation for a 20-step game. In Figs. 16-18 the black line corresponds to cooperative behavior and the grey line to the Nash equilibrium.

Fig. 16. Population size (against time t)
Fig. 17. Player 1's catch
Fig. 18. Player 2's catch

Fig. 16 demonstrates the dynamics of the population size, whereas Figs. 17 and 18 show the catch of each player. Note that cooperation appears beneficial to both players and, moreover, improves the ecological situation owing to sparing bioresource exploitation.

Fig. 19. The cooperative payoffs Fig. 20. Player 2’s payoffs

Compare the players' payoffs under different discount factors. Fig. 19 illustrates the payoffs $V_1^{nc}(x, \delta_1)$ and $V_2^{nc}(x, \delta_2)$ for $\delta_1 = 0.1,\dots,0.9$ and $\delta_2 = 0.1,\dots,0.9$. Clearly, a player with a higher discount factor gains more utility from cooperation, and the players obtain identical payoffs when the discount factors coincide.

The cooperative behavior design approach suggested in this paper leads to a player's cooperative payoff which is above or equal to (under some parameters) its Nash equilibrium counterpart. The payoffs of player 2 under cooperative and egoistic behavior are presented in Fig. 20. Hence, the introduced approach stimulates cooperation, which is not always the case for other design methods of cooperative strategies and payoffs (Breton and Keoula, 2014).

5.6. Different fixed planning horizons

The harvesting process with the dynamics (45) and different planning horizons was considered in (Mazalov and Rettieva, 2014; Rettieva, 2015). Players 1 and 2 harvest the fish stock during n1 and n2 steps, respectively. For the sake of definiteness, suppose that n1 < n2. Therefore, in this model players enter cooperation on the time period [0,n1] and we have to find their cooperative strategies. After step n1 till step n2 player 2 continues the harvesting process individually. Hence, the players’ payoffs are defined by

$$J_1 = \sum_{t=0}^{n_1} \delta_1^t \ln(u^c_{1t}), \qquad J_2 = \sum_{t=0}^{n_1} \delta_2^t \ln(u^c_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u^a_{2t}), \tag{49}$$

where $u^c_i$ (i = 1, 2) denote the cooperative strategies and $u^a_2$ indicates the strategy of player 2 during individual catch.

To construct the cooperative strategies and payoffs of the players, we apply the Nash bargaining solution for the whole duration of the game. Thus, it is required to solve the following optimization problem:

$$
\begin{aligned}
&\big(V_1^c(x,\delta_1)[0,n_1] - V_1^N(x,\delta_1)[0,n_1]\big) \cdot \big(V_2^c(x,\delta_2)[0,n_1] + V_2^{ac}(x^{cn_1},\delta_2)[n_1,n_2] - V_2^N(x,\delta_2)[0,n_1] - V_2^{aN}(x^{Nn_1},\delta_2)[n_1,n_2]\big) \\
&= \Big(\sum_{t=0}^{n_1} \delta_1^t \ln(u^c_{1t}) - V_1^N(x,\delta_1)[0,n_1]\Big)\Big(\sum_{t=0}^{n_1} \delta_2^t \ln(u^c_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u^a_{2t}) - V_2^N(x,\delta_2)[0,n_1] - V_2^{aN}(x^{Nn_1},\delta_2)[n_1,n_2]\Big) \to \max_{u^c_{1t},\,u^c_{2t} \ge 0},
\end{aligned}
$$

where $V_i^N(x,\delta_i)[0,n_1]$ represent the Nash equilibrium payoffs defined by (47) (with $n = n_1$), $V_2^{ac}(x^{cn_1},\delta_2)[n_1,n_2]$ gives the payoff of player 2 from its individual harvesting after $n_1$ steps of cooperative behavior, and $V_2^{aN}(x^{Nn_1},\delta_2)[n_1,n_2]$ is the payoff of player 2 from its individual harvesting after $n_1$ steps of noncooperative behavior.
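The structure of this optimization (a product of gains over status quo payoffs) can be illustrated on a deliberately simplified one-stage problem. In the sketch below each player's stage payoff is assumed to be $\ln\gamma_i + \sigma_i\ln(\varepsilon-\gamma_1-\gamma_2)$, where the weights $\sigma_i$ are hypothetical continuation coefficients; the status quo shares follow from the first-order conditions of that stage game, and the bargaining point is found by a plain grid search. It is not the multi-stage procedure of the paper.

```python
import math

def stage_payoff(i, g1, g2, sigma, eps):
    """Assumed stage payoff: log of own share plus weighted log of escapement."""
    return math.log((g1, g2)[i]) + sigma[i] * math.log(eps - g1 - g2)

def nash_shares(sigma, eps):
    """Stage Nash shares from the first-order conditions gamma_i * sigma_i = eps - g1 - g2."""
    s1, s2 = sigma
    denom = s1 * s2 + s1 + s2
    return eps * s2 / denom, eps * s1 / denom

def bargaining_shares(sigma, eps, grid=200):
    """Grid search for shares maximizing the product of gains over the Nash point."""
    n1, n2 = nash_shares(sigma, eps)
    status_quo = [stage_payoff(i, n1, n2, sigma, eps) for i in (0, 1)]
    best, best_product = None, 0.0
    for a in range(1, grid):
        for b in range(1, grid - a):          # keeps g1 + g2 strictly below eps
            g1, g2 = eps * a / grid, eps * b / grid
            gains = [stage_payoff(i, g1, g2, sigma, eps) - status_quo[i]
                     for i in (0, 1)]
            product = gains[0] * gains[1]
            if min(gains) > 0 and product > best_product:
                best, best_product = (g1, g2, gains), product
    return best

if __name__ == "__main__":
    print(nash_shares((0.6, 0.9), 0.8))
    print(bargaining_shares((0.6, 0.9), 0.8))
```

By construction, any point returned by `bargaining_shares` gives both players strictly more than their status quo payoffs, which mirrors the individual-rationality property stressed in the text.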

Theorem 8. The cooperative payoffs in the problem (45), (49) make up

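$$
\begin{aligned}
H^c_{1n_1}(\gamma^c_{11},\dots,\gamma^c_{1n_1},\gamma^c_{21},\dots,\gamma^c_{2n_1}; x)
&= \sum_{j=0}^{n_1} a_1^j \ln x + \sum_{j=1}^{n_1} \delta_1^{n_1-j} \ln(\gamma^c_{1j}) + \sum_{j=1}^{n_1} \delta_1^{n_1-j} \sum_{i=1}^{j} a_1^i \ln(\varepsilon - \gamma^c_{1j} - \gamma^c_{2j}) + \delta_1^{n_1} \ln k \\
&= \frac{1 - a_1^{n_1+1}}{1 - a_1} \ln x + \sum_{j=1}^{n_1} \delta_1^{n_1-j} \ln(\gamma^c_{1j}) + \sum_{j=1}^{n_1} \delta_1^{n_1-j} \frac{a_1(1 - a_1^{j})}{1 - a_1} \ln(\varepsilon - \gamma^c_{1j} - \gamma^c_{2j}) + \delta_1^{n_1} \ln k,
\end{aligned}
$$

$$
\begin{aligned}
H^c_{2n_1}(\gamma^c_{11},\dots,\gamma^c_{1n_1},\gamma^c_{21},\dots,\gamma^c_{2n_1}; x)
&= \sum_{j=0}^{n_2} a_2^j \ln x + \sum_{j=1}^{n_1} \delta_2^{n_1-j} \ln(\gamma^c_{2j}) + \sum_{j=1}^{n_1} \delta_2^{n_1-j} \sum_{i=1}^{n+j} a_2^i \ln(\varepsilon - \gamma^c_{1j} - \gamma^c_{2j}) \\
&\quad + \sum_{j=1}^{n} \delta_2^{n_2-j} B_2^j + \delta_2^{n_1} \sum_{j=0}^{n} a_2^j \ln(1-k) \\
&= \frac{1 - a_2^{n_2+1}}{1 - a_2} \ln x + \sum_{j=1}^{n_1} \delta_2^{n_1-j} \ln(\gamma^c_{2j}) + \sum_{j=1}^{n_1} \delta_2^{n_1-j} \frac{a_2(1 - a_2^{n+j})}{1 - a_2} \ln(\varepsilon - \gamma^c_{1j} - \gamma^c_{2j}) \\
&\quad + \sum_{j=1}^{n} \delta_2^{n_2-j} B_2^j + \delta_2^{n_1} \frac{1 - a_2^{n+1}}{1 - a_2} \ln(1-k),
\end{aligned}
$$

where

$$B_2^j = \sum_{l=0}^{j} a_2^l \ln\Bigg(\frac{\varepsilon}{\sum\limits_{p=0}^{j} a_2^p}\Bigg) + \sum_{l=1}^{j} a_2^l \ln\Big(\sum_{p=1}^{j} a_2^p\Big), \quad j = 1,\dots,n, \quad n = n_2 - n_1.$$

The cooperative strategies of the players are related via

$$\gamma^c_{1t} = \frac{\varepsilon \gamma^c_{11} \sum\limits_{j=t-1}^{n+t} a_2^j}{\varepsilon a_1^{t-1} \sum\limits_{j=0}^{n+t} a_2^j + \gamma^c_{11}\Big(\sum\limits_{j=t-1}^{n+t} a_2^j \sum\limits_{j=0}^{t} a_1^j - (a_1^{t-1} + a_1^{t}) \sum\limits_{j=0}^{n+t} a_2^j\Big)}, \qquad \gamma^c_{2t} = \frac{\varepsilon - \gamma^c_{1t} \sum\limits_{j=0}^{t} a_1^j}{\sum\limits_{j=0}^{n+t} a_2^j}.$$

The strategy of player 1 at the last step (the quantity $\gamma^c_{11}$) follows from one of the first-order optimality conditions.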

We present the simulation results for the planning horizons $n_1 = 10$ and $n_2 = 20$. In Figs. 21-23 the black line corresponds to cooperative behavior and the grey line to the Nash equilibrium. The dynamics of the population size on the whole planning horizon $[0, n_2]$ can be observed in Fig. 21. Clearly, cooperation improves the ecological situation. Figs. 22 and 23 show the catch of player 1 on the time period $[0, n_1]$ and the catch of player 2 on the time periods $[0, n_1]$ and $[n_1, n_2]$, respectively. Interestingly, player 2 has a smaller catch in cooperation than in the Nash equilibrium, but this is compensated by its individual harvesting at subsequent steps.

Fig. 21. Population size (against time t)
Fig. 22. Player 1's payoff
Fig. 23. Player 2's payoff

Now compare the players' payoffs for different planning horizons in the case when player 1 leaves the game earlier. Fig. 24 illustrates the payoffs $V_1^c(n_1, x)$ and $V_2^c(n_2, x)$ for $n_2 = 2,\dots,10$ and $n_1 = 1,\dots,n_2 - 1$. Obviously, the closer $n_1$ is to $n_2$, the smaller the difference between the payoffs.

Finally, we underline that the suggested cooperative behavior design guarantees that the cooperative payoff of a player is greater than or equal to (under some parameters) its payoff in the Nash equilibrium. Fig. 25 shows the payoffs of player 2 under cooperative and noncooperative behavior for different planning horizons. This also shows that the suggested approach stimulates cooperative behavior.

Fig. 24. The cooperative payoffs
Fig. 25. Player 2's payoffs

5.7. Random planning horizons

In (Mazalov and Rettieva, 2015) we explore the model (45), (49), where players possess heterogeneous discount factors and, moreover, heterogeneous planning horizons. By assumption, players stop cooperation at random steps: external stochastic processes can cause a breach of the cooperative agreement.

Suppose that players 1 and 2 harvest the fish stock during $n_1$ and $n_2$ steps, respectively. Here $n_1$ represents a discrete random variable taking values $\{1,\dots,n\}$ with the corresponding probabilities $\{\theta_1,\dots,\theta_n\}$. Similarly, $n_2$ is a discrete random variable with the same value set and the probabilities $\{\omega_1,\dots,\omega_n\}$. We assume that the planning horizons are independent. Therefore, during the time period $[0, n_1]$ or $[0, n_2]$ the players enter cooperation, and the problem consists in evaluating their strategies. The players' payoffs are determined via the expectation operator:

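$$
\begin{aligned}
H_1 &= E\Big\{\sum_{t=1}^{n_1} \delta_1^t \ln(u_{1t})\, I_{\{n_1 \le n_2\}} + \Big(\sum_{t=1}^{n_2} \delta_1^t \ln(u_{1t}) + \sum_{t=n_2+1}^{n_1} \delta_1^t \ln(u^a_{1t})\Big) I_{\{n_1 > n_2\}}\Big\} \\
&= \sum_{n_1=1}^{n} \theta_{n_1}\Big[\sum_{n_2=n_1}^{n} \omega_{n_2} \sum_{t=1}^{n_1} \delta_1^t \ln(u_{1t}) + \sum_{n_2=1}^{n_1-1} \omega_{n_2}\Big(\sum_{t=1}^{n_2} \delta_1^t \ln(u_{1t}) + \sum_{t=n_2+1}^{n_1} \delta_1^t \ln(u^a_{1t})\Big)\Big],
\end{aligned}
\tag{50}
$$

$$
\begin{aligned}
H_2 &= E\Big\{\sum_{t=1}^{n_2} \delta_2^t \ln(u_{2t})\, I_{\{n_2 \le n_1\}} + \Big(\sum_{t=1}^{n_1} \delta_2^t \ln(u_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u^a_{2t})\Big) I_{\{n_2 > n_1\}}\Big\} \\
&= \sum_{n_2=1}^{n} \omega_{n_2}\Big[\sum_{n_1=n_2}^{n} \theta_{n_1} \sum_{t=1}^{n_2} \delta_2^t \ln(u_{2t}) + \sum_{n_1=1}^{n_2-1} \theta_{n_1}\Big(\sum_{t=1}^{n_1} \delta_2^t \ln(u_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u^a_{2t})\Big)\Big],
\end{aligned}
\tag{51}
$$

where $u^a_{it}$ specifies the strategy of player i when its partner leaves the game, i = 1, 2.

To define cooperative behavior, we employ the Nash bargaining solution; the role of status quo points belongs to the noncooperative payoffs of players. Therefore, we begin with the construction of Nash equilibrium strategies.

As step τ occurs in the game, the Bellman functions $V_i^N(\tau, x)$, i = 1, 2 of the players acquire the form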

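$$V_1^N(\tau, x) = \max_{u^N_{1\tau},\dots,u^N_{1n}} \Bigg\{\sum_{n_1=\tau}^{n} \frac{\theta_{n_1}}{\sum\limits_{l=\tau}^{n} \theta_l}\Bigg[\sum_{n_2=n_1}^{n} \frac{\omega_{n_2}}{\sum\limits_{l=\tau}^{n} \omega_l} \sum_{t=\tau}^{n_1} \delta_1^t \ln(u^N_{1t}) + \sum_{n_2=\tau}^{n_1-1} \frac{\omega_{n_2}}{\sum\limits_{l=\tau}^{n} \omega_l}\Big(\sum_{t=\tau}^{n_2} \delta_1^t \ln(u^N_{1t}) + V_1^a(\tau, n_1)\Big)\Bigg]\Bigg\},$$

$$V_2^N(\tau, x) = \max_{u^N_{2\tau},\dots,u^N_{2n}} \Bigg\{\sum_{n_2=\tau}^{n} \frac{\omega_{n_2}}{\sum\limits_{l=\tau}^{n} \omega_l}\Bigg[\sum_{n_1=n_2}^{n} \frac{\theta_{n_1}}{\sum\limits_{l=\tau}^{n} \theta_l} \sum_{t=\tau}^{n_2} \delta_2^t \ln(u^N_{2t}) + \sum_{n_1=\tau}^{n_2-1} \frac{\theta_{n_1}}{\sum\limits_{l=\tau}^{n} \theta_l}\Big(\sum_{t=\tau}^{n_1} \delta_2^t \ln(u^N_{2t}) + V_2^a(\tau, n_2)\Big)\Bigg]\Bigg\},$$

where

$$V_i^a(\tau, n_i) = \sum_{t=\tau}^{n_i} \delta_i^t \ln(u^a_{it}) = \sum_{j=0}^{n_i-\tau} a_i^j \ln x + \sum_{j=1}^{n_i-\tau} \delta_i^{\,n_i-\tau-j} D_i^j, \quad i = 1, 2,$$

$$D_i^j = \sum_{l=0}^{j} a_i^l \ln\Bigg(\frac{\varepsilon}{\sum\limits_{p=0}^{j} a_i^p}\Bigg) + \sum_{l=1}^{j} a_i^l \ln\Big(\sum_{p=1}^{j} a_i^p\Big), \quad i = 1, 2,$$

are the players' payoffs provided that player i, i = 1, 2 harvests the fish stock individually.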
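The constants $D_i^j$ have a simple interpretation that can be verified numerically. Under the assumption (made here for illustration) that a single harvester with j steps to go takes a share γ of the stock and values the escapement with weight $\sum_{p=1}^{j} a^p$, the stage objective is $\ln\gamma + \sum_{p=1}^{j} a^p \ln(\varepsilon-\gamma)$; its maximum is attained at $\gamma^* = \varepsilon / \sum_{p=0}^{j} a^p$ and equals $D^j$ as read above. The function names are hypothetical.

```python
import math

def d_constant(a, eps, j):
    """D^j for a single harvester with parameter a = alpha * delta."""
    s0 = sum(a ** p for p in range(j + 1))       # sum_{p=0}^{j} a^p
    s1 = sum(a ** p for p in range(1, j + 1))    # sum_{p=1}^{j} a^p
    return s0 * math.log(eps / s0) + s1 * math.log(s1)

def stage_objective(gamma, a, eps, j):
    """Log of the catch share plus weighted log of the escapement share."""
    s1 = sum(a ** p for p in range(1, j + 1))
    return math.log(gamma) + s1 * math.log(eps - gamma)

def optimal_share(a, eps, j):
    """Maximizer of the stage objective from its first-order condition."""
    return eps / sum(a ** p for p in range(j + 1))

if __name__ == "__main__":
    a, eps, j = 0.27, 0.8, 5
    g = optimal_share(a, eps, j)
    print(g, stage_objective(g, a, eps, j), d_constant(a, eps, j))
```

The last two printed numbers should coincide up to rounding, and no other admissible share should give a larger value of the stage objective.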

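We get a relationship between $V_i^N(\tau, x)$ and $V_i^N(\tau+1, x)$ of the form

$$V_1^N(\tau, x) = \delta_1^{\tau} \ln(u^N_{1\tau}) + P_{\tau}^{\tau+1} V_1^N(\tau+1, x) + C_{1\tau} \sum_{n_1=\tau+1}^{n} \theta_{n_1} \sum_{t=\tau}^{n_1} \delta_1^t \ln(u^a_{1t}),$$

$$V_2^N(\tau, x) = \delta_2^{\tau} \ln(u^N_{2\tau}) + P_{\tau}^{\tau+1} V_2^N(\tau+1, x) + C_{2\tau} \sum_{n_2=\tau+1}^{n} \omega_{n_2} \sum_{t=\tau}^{n_2} \delta_2^t \ln(u^a_{2t}),$$

where

$$P_{\tau}^{\tau+1} = \frac{\sum\limits_{l=\tau+1}^{n} \omega_l \sum\limits_{l=\tau+1}^{n} \theta_l}{\sum\limits_{l=\tau}^{n} \omega_l \sum\limits_{l=\tau}^{n} \theta_l}, \qquad C_{1\tau} = \frac{\omega_{\tau}}{\sum\limits_{l=\tau}^{n} \omega_l} \cdot \frac{1}{\sum\limits_{l=\tau}^{n} \theta_l}, \qquad C_{2\tau} = \frac{\theta_{\tau}}{\sum\limits_{l=\tau}^{n} \theta_l} \cdot \frac{1}{\sum\limits_{l=\tau}^{n} \omega_l}.$$

Following the standard approach in fish war models, we search for the payoff functions in the form $V_i^N(\tau, x) = A_i^{\tau} \ln x + B_i^{\tau}$ and for linear strategies of the players $u^N_{i\tau} = \gamma^N_{i\tau} x$, i = 1, 2.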
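The coefficients $P_{\tau}^{\tau+1}$, $C_{1\tau}$ and $C_{2\tau}$ are ratios of tail sums of the two horizon distributions and are easy to tabulate. The sketch below uses the probabilities from the simulation in this section ($\theta_i = 0.1$, $\omega_i = 0.005i + 0.0725$, n = 10); the function names are hypothetical, and reading $P_{\tau}^{\tau+1}$ as the conditional probability that neither player stops at step τ is an interpretation, not a statement from the source.

```python
def tail(probs, tau):
    """Sum of probs_l for l = tau, ..., n (steps are numbered from 1)."""
    return sum(probs[tau - 1:])

def continuation_coefficients(theta, omega, tau):
    """Return (P, C1, C2) for step tau as ratios of tail sums."""
    denom = tail(omega, tau) * tail(theta, tau)
    p = tail(omega, tau + 1) * tail(theta, tau + 1) / denom
    c1 = omega[tau - 1] / denom
    c2 = theta[tau - 1] / denom
    return p, c1, c2

if __name__ == "__main__":
    n = 10
    theta = [0.1] * n
    omega = [0.005 * i + 0.0725 for i in range(1, n + 1)]
    for tau in range(1, n + 1):
        print(tau, continuation_coefficients(theta, omega, tau))
```

At the last step both tail sums starting at n + 1 are empty, so the coefficient P vanishes there.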

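Theorem 9. The Nash equilibrium strategies in the problem (45), (50), (51) with random planning horizons take the form

$$\gamma^N_{1\tau} = \frac{\varepsilon \delta_1^{\tau} A_2^{\tau}}{\delta_1^{\tau} A_2^{\tau} + \delta_2^{\tau} A_1^{\tau} + \alpha A_1^{\tau} A_2^{\tau} P_{\tau}^{\tau+1}}, \qquad \gamma^N_{2\tau} = \frac{\varepsilon \delta_2^{\tau} A_1^{\tau}}{\delta_1^{\tau} A_2^{\tau} + \delta_2^{\tau} A_1^{\tau} + \alpha A_1^{\tau} A_2^{\tau} P_{\tau}^{\tau+1}},$$

and the noncooperative payoffs make up

$$V_i^N(\tau, x) = A_i^{\tau} \ln x + B_i^{\tau}, \quad i = 1, 2, \tag{52}$$

where

$$A_1^{\tau} = \frac{\delta_1^{\tau} + C_{1\tau} \sum\limits_{n_1=\tau+1}^{n} \theta_{n_1} \sum\limits_{j=0}^{n_1-\tau} a_1^j}{1 - \alpha P_{\tau}^{\tau+1}}, \qquad A_2^{\tau} = \frac{\delta_2^{\tau} + C_{2\tau} \sum\limits_{n_2=\tau+1}^{n} \omega_{n_2} \sum\limits_{j=0}^{n_2-\tau} a_2^j}{1 - \alpha P_{\tau}^{\tau+1}},$$

$$B_1^{\tau} = \frac{\delta_1^{\tau} \ln(\gamma^N_{1\tau}) + \alpha A_1^{\tau} P_{\tau}^{\tau+1} \ln(\varepsilon - \gamma^N_{1\tau} - \gamma^N_{2\tau}) + C_{1\tau} \sum\limits_{n_1=\tau+1}^{n} \theta_{n_1} \sum\limits_{j=1}^{n_1-\tau} \delta_1^{\,n_1-\tau-j} D_1^j}{1 - P_{\tau}^{\tau+1}},$$

$$B_2^{\tau} = \frac{\delta_2^{\tau} \ln(\gamma^N_{2\tau}) + \alpha A_2^{\tau} P_{\tau}^{\tau+1} \ln(\varepsilon - \gamma^N_{1\tau} - \gamma^N_{2\tau}) + C_{2\tau} \sum\limits_{n_2=\tau+1}^{n} \omega_{n_2} \sum\limits_{j=1}^{n_2-\tau} \delta_2^{\,n_2-\tau-j} D_2^j}{1 - P_{\tau}^{\tau+1}}.$$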

To construct the cooperative strategies and payoffs of the players, we adopt the Nash bargaining solution for the whole duration of the game. Consequently, it is required to solve the problem

$$
\begin{aligned}
&\big(V_1^c(1, x) - V_1^N(1, x)\big)\big(V_2^c(1, x) - V_2^N(1, x)\big) \\
&= \Bigg(\sum_{n_1=1}^{n} \theta_{n_1}\Big[\sum_{n_2=n_1}^{n} \omega_{n_2} \sum_{t=1}^{n_1} \delta_1^t \ln(u^c_{1t}) + \sum_{n_2=1}^{n_1-1} \omega_{n_2}\Big(\sum_{t=1}^{n_2} \delta_1^t \ln(u^c_{1t}) + \sum_{t=n_2+1}^{n_1} \delta_1^t \ln(u^a_{1t})\Big)\Big] - V_1^N(1, x)\Bigg) \\
&\quad \cdot \Bigg(\sum_{n_2=1}^{n} \omega_{n_2}\Big[\sum_{n_1=n_2}^{n} \theta_{n_1} \sum_{t=1}^{n_2} \delta_2^t \ln(u^c_{2t}) + \sum_{n_1=1}^{n_2-1} \theta_{n_1}\Big(\sum_{t=1}^{n_1} \delta_2^t \ln(u^c_{2t}) + \sum_{t=n_1+1}^{n_2} \delta_2^t \ln(u^a_{2t})\Big)\Big] - V_2^N(1, x)\Bigg) \to \max_{u^c_{1t},\,u^c_{2t} \ge 0},
\end{aligned}
$$

where $V_i^N(1, x) = A_i^N \ln x + B_i^N$, i = 1, 2 indicate the Nash equilibrium payoffs defined by (52).

Theorem 10. The cooperative payoffs in the problem (45), (50), (51) with random planning horizons have the form

$$
\begin{aligned}
V_i^c(n-k, x) &= \delta_i^{n-k} \ln(u^c_{i\,n-k}) + \alpha P_{n-k}^{n-k+1} G^i_{n-k+1} \ln(\varepsilon x - u^c_{1\,n-k} - u^c_{2\,n-k}) \\
&\quad + \sum_{l=2}^{k-1} P_{n-k}^{n-l}\big[\delta_i^{n-l} \ln(\gamma^c_{i\,n-l}) + \alpha P_{n-l}^{n-l+1} \ln(\varepsilon - \gamma^c_{1\,n-l} - \gamma^c_{2\,n-l})\big] \\
&\quad + P_{n-k}^{n-1} P_{n-1}^{n}\big[\alpha A_i \ln(\varepsilon - \gamma^c_{1\,n-1} - \gamma^c_{2\,n-1}) + B_i\big] \\
&\quad + P_{n-k}^{n-1} \delta_i^{n-1} \ln(\gamma^c_{i\,n-1}) + \sum_{l=1}^{k} P_{n-k}^{n-l} C_{i\,n-l} V_i^l(n_i),
\end{aligned}
\tag{53}
$$

where

$$V_1^l(n_1) = \sum_{n_1=n-l+1}^{n} \theta_{n_1} \sum_{t=n-l}^{n_1} \delta_1^t \ln(u^a_{1t}), \qquad V_2^l(n_2) = \sum_{n_2=n-l+1}^{n} \omega_{n_2} \sum_{t=n-l}^{n_2} \delta_2^t \ln(u^a_{2t}),$$

$$G_k^1 = \sum_{l=1}^{k} \delta_1^{n-l} \alpha^{k-l} P_{n-k}^{n-l} + \alpha^k A_1 P_{n-k}^{n}, \qquad G_k^2 = \sum_{l=1}^{k} \delta_2^{n-l} \alpha^{k-l} P_{n-k}^{n-l} + \alpha^k A_2 P_{n-k}^{n}.$$

The cooperative strategies are related by

$$\gamma^c_{2\,n-k} = \frac{\delta_1^{n-k} \delta_2^{n-k} \varepsilon - \delta_2^{n-k} \gamma^c_{1\,n-k} G_k^1}{\delta_1^{n-k} G_k^2}, \qquad \gamma^c_{1\,n-k} = \frac{\delta_1^{n-k} \varepsilon \gamma^c_{1\,n-1} G_1^2}{\delta_1^{n-1} \varepsilon G_k^2 + \gamma^c_{1\,n-1}\big(G_k^1 G_1^2 - G_1^1 G_k^2\big)}.$$

The strategy of player 1 at the last step (the quantity $\gamma^c_{1\,n-1}$) is evaluated through one of the first-order optimality conditions.

Our simulation has employed the Monte Carlo method with n = 10 and the following probabilities: $\theta_i = 0.1$, $\omega_i = 0.005i + 0.0725$, i = 1,...,n. Figs. 26 and 27 demonstrate the results of numerical simulation with 50 trials under egoistic and cooperative behavior, respectively. Here points indicate the simulation results and circles correspond to the expected payoffs obtained in (52) and (53).
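The sampling step of such a Monte Carlo experiment can be sketched as follows. The code draws independent horizons $n_1$, $n_2$ from the stated distributions and compares the simulated frequency of the event $n_1 \le n_2$ with its exact probability. The function names, the seed and the number of trials are assumptions for illustration; the payoff evaluation of the paper is not reproduced here.

```python
import random

N_STEPS = 10
THETA = [0.1] * N_STEPS                                       # P(n1 = i)
OMEGA = [0.005 * i + 0.0725 for i in range(1, N_STEPS + 1)]   # P(n2 = i)

def exact_prob_n1_not_later(theta, omega):
    """Exact P(n1 <= n2) for independent horizons."""
    return sum(theta[i] * sum(omega[i:]) for i in range(len(theta)))

def sample_horizons(theta, omega, trials, seed=0):
    """Draw `trials` independent pairs (n1, n2) with a fixed seed."""
    rng = random.Random(seed)
    steps = list(range(1, len(theta) + 1))
    n1 = rng.choices(steps, weights=theta, k=trials)
    n2 = rng.choices(steps, weights=omega, k=trials)
    return list(zip(n1, n2))

if __name__ == "__main__":
    pairs = sample_horizons(THETA, OMEGA, trials=50)
    freq = sum(a <= b for a, b in pairs) / len(pairs)
    print(freq, exact_prob_n1_not_later(THETA, OMEGA))
```

With only 50 trials, as in the figures, the simulated frequency scatters noticeably around the exact value; increasing `trials` tightens the agreement.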

Fig. 26. Nash equilibrium
Fig. 27. Cooperative equilibrium

6. Conclusions

Cooperation plays an important role in bioresource management problems. It leads to a sparing mode of bioresource exploitation and improves the ecological situation. The paper overviews the results in the fields of cooperation maintenance and cooperative behavior determination. Namely, the author's new schemes to obtain and maintain cooperative exploitation are presented. We extend the idea of incentive equilibrium to the case with territory sharing and control from the center. We present an incentive condition for rational behavior that is easier to verify than the existing ones. We extend the internal and external stability conditions to models with a coalition structure and offer the coalition stability concept. It is proposed to apply the Nash bargaining approach to obtain cooperative profits and strategies in the case where players possess different discount factors. Moreover, the models where the players' harvesting times are different (fixed and random) were investigated, and possible concepts for determining cooperative behavior were obtained. Analytical and numerical results for particular resource dynamics and players' payoff functions are given. The author continues to work in this direction, and the latest results were obtained in the field of multicriteria dynamic games (Rettieva, 2017).

References

Abakumov, A. I. (1994). Optimal harvesting in populations' models. Applied and industrial mathematics review, 1(6), 834–849 (in Russian).
Abakumov, A. I. (1993). Control and optimization in the harvesting models. Vladivostok: Dalnauka (in Russian).
Baturin, V. A., Skitnevskii, D. M. and Cherkashin, A. K. (1984). Planning and forecast for ecological-economical systems. Novosibirsk: Nauka (in Russian).
Barrett, S. (1994). Self-enforcing International Environmental Agreements. Oxford Economic Papers, 46, 878–894.
Bazikin, A. D. (1985). Mathematical biophysics of interacting populations. Moscow: Nauka.
Berdnikov, S. V., Vasilchenko, V. V. and Selutin, V. V. (1999). Mathematical modeling of exogenous perturbation in trophic networks. Applied and industrial mathematics review, 6(2), 145–158 (in Russian).
Bloch, F. (1995). Sequential formation of coalitions with fixed payoff division and externalities. Games Econ. Behav., 14, 90–123.
Bloch, F. (1996). Noncooperative models of coalition formation in games with spillovers. In: Carraro C., Siniscalco D. eds. New Direction in Economic Theory of Environment. Cambridge: Cambridge University Press.
Breton, M. and Keoula, M. Y. (2014). A great fish war model with asymmetric players. Ecological Economics, 97, 209–223.
Carraro, C. (1997). The structure of International Environmental Agreements. Paper presented at the FEEM/IPCC/Stanford EMF Conference on "International Environmental Agreements on Climate Change". Venice, pp. 309–328.
Carraro, C. (2000). The economics of coalition formation. In: Gupta J., Grubb M. eds. Climate Change and European Leadership. Kluwer Academic Publishers, pp. 135–156.
Carraro, C. and Siniscalco, D. (1992). The international protection of the environment. J. Publ. Econ., 52, 309–328.
Chaudhuri, K. (1986). A bioeconomic model of harvesting a multispecies fishery. Ecological Modelling, 32, 267–279.
Clark, C. W. (1985). Bioeconomic modelling and fisheries management. NY: Wiley.
D'Aspremont, C., Jacquemin, A., Gabszewicz, J. J. and Weymark, J. A. (1983). On the stability of collusive price leadership. Can. J. Econ., 16(1), 17–25.
De Zeeuw, A. (2008). Dynamic effects on stability of International Environmental Agreements. J. of Environmental Economics and Management, 55, 163–174.
Ehtamo, H. and Hamalainen, R. P. (1993). A cooperative incentive equilibrium for a resource management problem. J. of Economic Dynamics and Control, 17, 659–678.
Eyckmans, J. and Finus, M. (2003). Coalition formation in a global warming game: how the design of protocols affects the success of environmental treaty-making. Working paper 56, CLIMNEG 2.
Finus, M. (2008). Game theoretic research on the design of International Environmental Agreements: insights, critical remarks and future challenges. Int. Rev. of Environmental and Resource Economics, 2, 29–67.
Finus, M. and Rundshagen, B. (2003). Endogenous coalition formation in global pollution control: a partition function approach. In: Carraro C. ed. Endogenous formation of economic coalitions. Cheltenham: Edward Elgar, pp. 199–243.
Fisher, R. D. and Mirman, L. J. (1992). Strategic dynamic interactions: fish wars. J. of Economic Dynamics and Control, 16, 267–287.
Fisher, R. D. and Mirman, L. J. (1996). The complete fish wars: biological and dynamic interactions. J. of Env. Econ. and Manag., 30, 34–42.
Gimelfard, A. A., Ginzburg, L. R., Poluektov, R. A., Puh, Yu. A. and Ratner, V. A. (1974). Dynamic theory of biological populations. Moscow: Nauka (in Russian).
Goh, B. S. (1980). Management and analysis of biological populations. Agricultural and Managed-Forest Ecology. Amsterdam: Elsevier.
Gromova, E. V. and Petrosyan, L. A. (2015). On an approach to the construction of characteristic function for cooperative differential games. Math. Game Theory and Appl., 7(4), 19–39 (in Russian).
Gurman, V. I. and Druzhinina, I. P. (1978). Nature system models. Novosibirsk: Nauka (in Russian).
Hamalainen, R. P., Kaitala, V. and Haurie, A. (1984). Bargaining on whales: A differential game model with Pareto optimal equilibria. Oper. Res. Letters, 3(1), 5–11.
Haurie, A. (1976). A note on nonzero-sum differential games with bargaining solutions. J. of Opt. Theory and Appl., 18, 31–39.
Haurie, A. and Tolwinski, B. (1984). Acceptable equilibria in dynamic games. Large Scale Systems, 6, 73–89.
Il'ichev, V. G., Rohlin, D. B. and Ougolnitskii, G. A. (2000). About economical mechanisms of bioresource management control. Russian Academy of Sciences bulletin. Theory and system control, 4, 104–110 (in Russian).
Ioannidis, A., Papandreou, A. and Sartzetakis, E. (2000). International Environmental Agreements: a literature review. Working Papers, GREEN.
Kaitala, V. T. and Lindroos, M. (2007). Game-theoretic applications to fisheries. In: Handbook of operations research in natural resources, pp. 201–215. Springer.
Kulmala, S., Levontin, P., Lindroos, M. and Pintassilgo, P. (2009). Atlantic salmon fishery in the Baltic Sea - A case of trivial cooperation. In: "Essays on the Bioeconomics of the Northern Baltic Fisheries". Soile Kulmala. PhD thesis, University of Helsinki.
Levhari, D. and Mirman, L. J. (1980). The great fish war: an example using a dynamic Cournot-Nash solution. The Bell J. of Economics, 11(1), 322–334.
Lindroos, M., Kaitala, V. T. and Kronbak, L. G. (2007). Coalition games in fishery economics. In: Advances in Fishery Economics, pp. 184–195. Blackwell Publishing.
Lindroos, M. (2008). Coalitions in International Fisheries Management. Natural Resource Modeling, 21, 366–384.
Marin-Solano, J. and Shevkoplyas, E. V. (2011). Non-constant discounting and differential games with random time horizon. Automatica, 47, 2626–2638.
Mazalov, V. V. and Rettieva, A. N. (2015). Asymmetry in a cooperative bioresource management problem. In: Game-Theoretic Models in Mathematical Ecology. Nova Science Publishers, pp. 113–152.
Mazalov, V. V. and Rettieva, A. N. (2014). Game-theoretic models for cooperative bioresource management problem. In: Models and Methods in the Problem of Interaction between Atmosphere and Hydrosphere: Textbook. Tomsk, pp. 449–489.
Mazalov, V. V. and Rettieva, A. N. (2012). Cooperation maintenance in fishery problems. In: Fishery Management. Nova Science Publishers, pp. 151–198.
Mazalov, V. V. and Rettieva, A. N. (2011). The discrete-time bioresource sharing model. J. Appl. Math. Mech., 75(2), 180–188.
Mazalov, V. V. and Rettieva, A. N. (2010). Fish wars and cooperation maintenance. Ecological Modelling, 221, 1545–1553.
Mazalov, V. V. and Rettieva, A. N. (2010). Incentive conditions for rational behavior in discrete-time bioresource management problem. Doklady Mathematics, 81(3), 399–402.
Mazalov, V. V. and Rettieva, A. N. (2010). Incentive equilibrium in bioresource sharing problem. Journal of Computer and Systems Sciences International, 49(4), 598–606.
Mazalov, V. V. and Rettieva, A. N. (2010). Fish wars with many players. Int. Game Theory Rev., 12(4), 385–405.
Mazalov, V. V. and Rettieva, A. N. (2009). The compleat fish wars with changing area for fishery. IFAC Proceedings Volumes (IFAC – PapersOnLine), 7(1), 168–172.
Mazalov, V. V. and Rettieva, A. N. (2008). Incentive equilibrium in discrete-time bioresource sharing model. Doklady Mathematics, 78(3), 953–955.
Mazalov, V. V. and Rettieva, A. N. (2008). Bioresource management problem with changing area for fishery. Game Theory and Applications, 13, 101–110.
Mazalov, V. V. and Rettieva, A. N. (2008). Incentive equilibrium in bioresource management problem. In: Evolutionary and deterministic methods for design, optimization and control, P. Neittaanmaki, J. Periaux and T. Tuovinen (Eds.). CIMNE. Barcelona. Spain, 451–456.
Mazalov, V. V. and Rettieva, A. N. (2007). Cooperative incentive equilibrium for a bioresource management problem. Contributions to Game Theory and Management, 1, 316–325.
Mo, J. and Walrand, J. (2000). Fair End-to-End Window-Based Congestion Control. IEEE/ACM Transactions on Networking, 8(5), 556–567.
Munro, G. R. (1979). The optimal management of transboundary renewable resources. Can. J. of Econ., 12(8), 355–376.
Munro, G. R. (2000). On the Economics of Shared Fishery Resources. In: Int. Relations and the Common Fisheries Policy. Portsmouth, pp. 149–167.
Neumann, J. and Morgenstern, O. (1953). Theory of Games and Economic Behavior. Princeton: Princeton University Press.
Osborn, D. K. (1976). Cartel problems. American Economic Review, 66, 835–844.
Osmani, D. and Tol, R. S. J. (2010). The Case of Two Self-enforcing International Agreements for Environmental Protection with Asymmetric Countries. Computational Economics, 36(2), 93–119.
Ostrom, E. (1990). Governing the commons: the evolution of institutions for collective action. Cambridge University Press.

Owen, G. (1968). Game theory. NY: Academic Press. Petrosjan, L. A. (1993). Differential game of pursuit. World Scientific, Singapore. Petrosjan, L. A. and Zaccour, G. (2003). Time-consistent Shapley value allocation of pol- lution cost reduction. J. of Econ. Dyn and Contr., 27(3), 381–398. Petrosjan, L. A. and Zenkevich, N. A. (1996). Game theory. World Scientific, Singapore. Petrosyan, L. A. (1997). Stability of differential games with many players’ solutions Vestnik of Leningrad university, 19, 46–52 (in Russian). Petrosyan, L. A. and Danilov, N. N. (1979). Stability of nonantogonostic differential games with transferable payoffs’ solutions. Vestnik of Leningrad university, 1, 52–59 (in Rus- sian). Petrosyan, L. A. and Danilov, N. N. (1985). Coopeartive differential games and applications. Tomsk (in Russian). Petrosyan, L. A. and Zakharov, V. V. (1981). Game-theoretic approach to nature protect problem. Vestnik of Leningrad university, 1, 26–32 (in Russian). Petrosyan, L. A. and Zakharov, V. V. (1997). Mathematical models in ecology. SPb: SPb State University (in Russian). Petrosyan, L. A. and Zakharov, V. V. (1986). Introduction to mathematical ecology. Leningrad: Lenigrad State University (in Russian). Pintassilgo, P. and Lindroos, M. (2008). Coalition formation in straddling stock fisheries: a partition function approach. Int. Game Theory Rev., 10, 303–317. Plourde, C. G. and Yeung, D. (1989). Harvesting of a transboundary replenishable fish stock: a noncooperative game solution. Marine Resource Economics, 6, 57–70. Puh, Yu.A. (1983). Equilibrium and stability in populaitons dynamic models. Moscow: Nauka. Ray, D. and Vohra, R. (1997). Equilibrium binding agreements. Jour. of Econ. Theory, 73, 30–78. Rettieva, A. N. (2017). Equilibria in dynamic multicriteria games. Int. Game Theory Rev., 19(1), 1750002. Rettieva, A. N. (2014). A bioresource management problem with different planning hori- zons. Automation and Remote Control, 76(5), 919–934. Rettieva, A. N. 
(2014). A discrete-time bioresource management problem with asymmetric players. Automation and Remote Control, 75(9), 1665–1676. Rettieva, A. N. (2013). Bioresource management problem with asymmetric players. Com- putational information technologies for environmental sciences: selected and reviewed papers presented at the International conference CITES-2013, 127–131. Rettieva, A. N. (2012). Stable coalition structure in bioresource management problem. Eco- logical Modelling 235-236, 102–118. Rettieva, A. N. (2012). A discrete-time bioresource management problem with asymmetric players. Math. Game Theory and Appl., 4(4), 63–72 (in Russian). Rettieva, A. N. (2011). Fish wars with changing area for fishery. Advances in Dynamic Games. Theory, Applications, and Numerical Methods for Differential and Stochastic Games, 11, 553–563. Rettieva, A. N. (2011). Coalition structure stability in discrete time bioresource manage- ment problem. Math. Game Theory and Appl., 3(3), 39–66 (in Russian). Rettieva, A. N. (2010). Cooperative bioresource management regulation. Appl. and Indust. Math. Rev., 17(5), 663–672 (in Russian). Rettieva, A. N. (2009). Cooperative incentive condition in bioresource sharing problem. Large-scale Systems Control, 26.1, 366–384 (in Russian). Silvert, W and Smith, W. R. (1977). Optimal exploitation of multispecies community. Math. Biosci., 33, 121–134. Shapiro, A. P. (1979). Modeling of biological systems. Vladivostok (in Russian). Shevkoplyas, E. V. (2011). The Shapley value in cooperative differential games with random duration. Annals of the Int. Soc. of Dynamic Games, 11, 359–373. 286 Anna N. Rettieva

Smith, M.J. (1968). Mathematical Ideas in Biology. Cambridge University Press. Sorger, G. (2006). Recursive Nash bargaining over a productive assert. J. of Economic Dynamics & Control, 30, 2637–2659. Svirezhev, Yu. M. and Elizarov, E. Ya. (1972). Mathematical modeling of biological systems. Moscow: Nauka (in Russian). Tolwinski, B., Haurie, A. and Leitmann, G. (1986). Cooperative equilibria in differential games. J. of Math. Anal. and Appl., 119, 182–202. Vislie, J. (1987). On the optimal management of transboundary renewable resources: a comment on Munro’s paper. Can. J. of Econ., 20, 870–875. Yeung, D. W. K. (2006). An irrational-behavior-proof condition in cooperative differential games. Int. Game Theory Review, 8(4), 739–744. Yi, S. S. (1997). Stable coalition structures with externalities. Games and Economic Behav- ior, 20, 201–237. Yi, S. S. and Shin, H. (1995). Endogenous formation of coalitions in oligopoly. Mimeo; Department of Economics, Dartmouth College. Contributions to Game Theory and Management, X, 287–298 Types of Equilibrium Points in Antagonistic Games with Ordered Outcomes

Victor V. Rozen Saratov State University, Faculty of Mathematics and Mechanics, Astrakhanskaya st. 83, Saratov, 410012, Russia E-mail: [email protected]

Abstract. The saddle point concept is basic for antagonistic games with payoff functions. For the larger class of games with ordered outcomes, there are several different generalizations of this concept. In this article we consider three types of equilibrium for games with ordered outcomes: saddle points (or Nash equilibrium points), general equilibrium points and transitive equilibrium points. The main definitions concerning games with ordered outcomes are introduced in Section 1. In Section 2, necessary and sufficient conditions for saddle points in games with ordered outcomes are found. These conditions are formulated in terms of the so-called characteristic sets of the players. Transitive equilibrium points are considered in Section 3. Theorem 3 characterizes transitive equilibrium points in antagonistic games with ordered outcomes as pre-images of saddle points in antagonistic games with payoff functions under strict homomorphisms. The main result of this article is Theorem 4, in which an analogous result is proved for the mixed extension of a game with ordered outcomes. To construct the mixed extension of a game with ordered outcomes, we use the so-called canonical extension of an order to the set of probabilistic measures.

Keywords: game with ordered outcomes, saddle point, general equilibrium point, transitive equilibrium point.

1. Introduction

The aim of this work is to investigate the equilibrium concept in antagonistic games with ordered outcomes. In contrast to games with payoff functions, games with preference relations admit many types of equilibrium points. We first introduce the basic definitions. Formally, a game of $n$ players with preference relations in normal form can be given as a system of the type

\[ G = \langle N, (X_i)_{i \in N}, A, (\omega_i)_{i \in N}, F \rangle, \tag{1} \]

where $N = \{1, \dots, n\}$ is the set of players, $n \ge 2$; $X_i$ is the set of strategies of player $i$; $A$ is the set of outcomes; $\omega_i \subseteq A^2$ is the preference relation of player $i$; $F$ is the realization function, i.e. a mapping from the set of all situations $X = \prod_{i \in N} X_i$ into the set of outcomes $A$. A game $G$ of the type (1) is called a game with ordered outcomes if all $\omega_i$ ($i \in N$) are order relations. For the class of games with ordered outcomes of the type (1), the most important optimality concept is Nash equilibrium.

Definition 1. A situation $x^0 = (x_i^0)_{i \in N}$ in the game $G$ of the form (1) is called a Nash equilibrium point if for all $i \in N$ and $x_i' \in X_i$ the relation

\[ F(x^0 \,\|\, x_i') \overset{\omega_i}{\le} F(x^0) \]

holds. Here $x^0 \,\|\, x_i'$ denotes the situation obtained from $x^0$ by replacing its $i$-th component with $x_i'$.

If the preference relations $\omega_i$ ($i \in N$) do not satisfy the linearity condition, we can consider the following generalization of the Nash equilibrium concept.

Definition 2. A situation $x^0 = (x_i^0)_{i \in N}$ in the game $G$ is called a general equilibrium point if there do not exist $i \in N$ and $x_i' \in X_i$ such that

\[ F(x^0 \,\|\, x_i') \overset{\omega_i}{>} F(x^0). \tag{2} \]

An antagonistic game with ordered outcomes is a game of the type (1) in which the number of players is equal to two and their preferences are mutually inverse. We consider such a game in the form

\[ G = \langle X, Y, A, \omega, F \rangle, \tag{3} \]

where $X$ is the set of strategies of player 1, $Y$ is the set of strategies of player 2, $A$ is the set of outcomes, $\omega$ is a (partial) order relation on the set $A$, and $F\colon X \times Y \to A$ is the realization function. The preferences of player 1 are given by the order $\omega$ and the preferences of player 2 by the inverse order $\omega^{-1}$. We assume that $|X| \ge 2$, $|Y| \ge 2$, $|A| \ge 2$. If the ordered set $\langle A, \omega \rangle$ is a complete lattice, the game $G$ is called a game with lattice-ordered outcomes. For an antagonistic game $G$ of the form (3), Definitions 1 and 2 take the following form.

Definition 3. A situation $(x_0, y_0)$ in the game $G$ of the form (3) is called a Nash equilibrium point (or a saddle point) if for any $x \in X$, $y \in Y$ the relations

\[ F(x, y_0) \overset{\omega}{\le} F(x_0, y_0) \overset{\omega}{\le} F(x_0, y) \tag{4} \]

hold.

Definition 4. A situation $(x_0, y_0)$ in the game $G$ of the form (3) is called a general equilibrium point if there do not exist $x \in X$, $y \in Y$ such that

\[ F(x, y_0) \overset{\omega}{>} F(x_0, y_0) \quad \text{or} \quad F(x_0, y) \overset{\omega}{<} F(x_0, y_0). \tag{5} \]

For antagonistic games with ordered outcomes, we can introduce another type of equilibrium, the so-called transitive equilibrium.

Definition 5. A situation $(x_0, y_0) \in X \times Y$ is called a transitive equilibrium point (or briefly, a Tr-equilibrium point) in the game $G$ of the form (3) if there do not exist $x \in X$, $y \in Y$ such that

\[ F(x, y_0) \overset{\omega}{>} F(x_0, y). \tag{6} \]
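Definitions 3, 4 and 5 can be checked mechanically on a finite game. The following is a minimal sketch, not part of the original paper: the realization table is that of Table 1 in Example 1, while the order (only $b < d$, all other pairs incomparable) is an assumption chosen to agree with the properties stated in that example, because the diagram of Fig. 1 is not reproduced in the text. The names `LESS`, `lt`, `le` and the three predicate functions are hypothetical.

```python
from itertools import product

# Realization table as in Table 1; the order below is an ASSUMPTION.
X = ["x0", "x1"]
Y = ["y0", "y1"]
F = {("x0", "y0"): "c", ("x0", "y1"): "b",
     ("x1", "y0"): "d", ("x1", "y1"): "a"}
# Strict order as a transitively closed set of pairs (s, t) meaning s < t.
LESS = {("b", "d")}  # assumed: only b < d, everything else incomparable


def lt(s, t):
    return (s, t) in LESS


def le(s, t):
    return s == t or lt(s, t)


def is_saddle(x0, y0):
    """Definition 3: F(x, y0) <= F(x0, y0) <= F(x0, y) for all x, y."""
    v = F[x0, y0]
    return (all(le(F[x, y0], v) for x in X)
            and all(le(v, F[x0, y]) for y in Y))


def is_transitive_eq(x0, y0):
    """Definition 5: there are no x, y with F(x, y0) > F(x0, y)."""
    return not any(lt(F[x0, y], F[x, y0]) for x, y in product(X, Y))


def is_general_eq(x0, y0):
    """Definition 4: no x with F(x, y0) > F(x0, y0), no y with F(x0, y) < F(x0, y0)."""
    v = F[x0, y0]
    return (not any(lt(v, F[x, y0]) for x in X)
            and not any(lt(F[x0, y], v) for y in Y))


for x0, y0 in product(X, Y):
    print(x0, y0, is_saddle(x0, y0), is_transitive_eq(x0, y0), is_general_eq(x0, y0))
```

Under the assumed order, the situation `("x0", "y0")` is a general equilibrium point but not a transitive one, and the other three situations are transitive equilibrium points without being saddle points.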

Remark 1. It is easy to show that the implications

Nash equilibrium $\Rightarrow$ Tr-equilibrium $\Rightarrow$ general equilibrium

hold and that the converse implications do not hold.

In particular, in an antagonistic game $G$ an arbitrary general equilibrium point need not be a transitive equilibrium point, since the relation “not greater than” is not transitive (see also Example 1). The motivation for introducing transitive equilibrium points is the fact that in a game $G$ with ordered outcomes, Tr-equilibrium points are exactly the pre-images of saddle points in antagonistic games with payoff functions under strict homomorphisms of games (see Theorem 3). The main types of equilibrium points in antagonistic games with ordered outcomes are saddle points and transitive equilibrium points.

In this work we use some concepts and notation of the theory of ordered sets. In particular, for an arbitrary subset $B$ of an ordered set $\langle A, \omega \rangle$ the operators $\downarrow$ and $\uparrow$ are defined as follows:

\[ B^{\downarrow} = \{a \in A : (\exists b \in B)\; a \overset{\omega}{\le} b\}, \qquad B^{\uparrow} = \{a \in A : (\exists b \in B)\; a \overset{\omega}{\ge} b\}. \]

2. Saddle points in antagonistic games

2.1. Characteristic sets of players

Consider an antagonistic game $G$ of the form (3) with ordered outcomes.

Definition 6. We say that in the game $G$ an outcome $a \in A$ is guaranteed to player 1 by a strategy $x \in X$ if for any strategy $y \in Y$ the relation $F(x, y) \overset{\omega}{\ge} a$ holds; an outcome $a \in A$ is guaranteed to player 2 by a strategy $y \in Y$ if for any strategy $x \in X$ we have $F(x, y) \overset{\omega}{\le} a$.

We denote by $V_x^1$ the set of all outcomes of the game $G$ which are guaranteed to player 1 by the strategy $x$, and by $V_y^2$ the set of all outcomes guaranteed to player 2 by the strategy $y$, i.e.

\[ V_x^1 = \{a \in A : (\forall y \in Y)\; F(x, y) \overset{\omega}{\ge} a\}, \qquad V_y^2 = \{a \in A : (\forall x \in X)\; F(x, y) \overset{\omega}{\le} a\}. \]

Definition 7. We say that in a game $G$ an outcome $a \in A$ is forbidden to player 1 by a strategy $y \in Y$ if for any strategy $x \in X$ the relation $F(x, y) \overset{\omega}{\ge} a$ does not hold; an outcome $a \in A$ is forbidden to player 2 by a strategy $x \in X$ if for any strategy $y \in Y$ the relation $F(x, y) \overset{\omega}{\le} a$ does not hold.

By $U_y^1$ we denote the set of all outcomes of the game $G$ which are not forbidden to player 1 by a strategy $y \in Y$, and by $U_x^2$ the set of all outcomes which are not forbidden to player 2 by a strategy $x \in X$:

\[ U_y^1 = \{a \in A : (\exists x \in X)\; F(x, y) \overset{\omega}{\ge} a\}, \qquad U_x^2 = \{a \in A : (\exists y \in Y)\; F(x, y) \overset{\omega}{\le} a\}. \tag{7} \]

Definition 8. An outcome $a \in A$ is called a guaranteed outcome for player 1 if it is guaranteed by at least one strategy of this player; an outcome $a \in A$ is called a non-forbidden outcome for player 1 if it is not forbidden to player 1 by any strategy of the other player.

The set of all guaranteed outcomes for player 1 is denoted by $V(1)$ and the set of all non-forbidden outcomes for player 1 is denoted by $U(1)$. We have

\[ V(1) = \bigcup_{x \in X} V_x^1, \qquad U(1) = \bigcap_{y \in Y} U_y^1. \tag{8} \]

For player 2 these sets are denoted by $V(2)$ and $U(2)$, respectively, and are defined dually. It follows immediately from the definitions that in any game $G$ the inclusion $V(1) \subseteq U(1)$ holds; it can be seen as an analogue of the well-known relation between the lower and upper values of a game with a payoff function. The dual inclusion $V(2) \subseteq U(2)$ also holds.

Definition 9. We say that a game $G$ satisfies the alternativeness condition if the equality $V(1) = U(1)$ holds.
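For a finite game the sets $V_x^1$, $U_y^1$, $V(1)$ and $U(1)$ can be computed directly from the definitions. The sketch below is illustrative only: the 2×2 game, the "diamond" order on four outcomes and all identifiers (`LESS`, `le`, `V1`, `U1`, `V_1`, `U_1`) are assumptions made for this example and do not come from the paper.

```python
# Hypothetical game: outcomes ordered as a "diamond", a < b, a < c, b < d, c < d,
# with b and c incomparable.
A = ["a", "b", "c", "d"]
X = ["x0", "x1"]
Y = ["y0", "y1"]
F = {("x0", "y0"): "b", ("x0", "y1"): "d",
     ("x1", "y0"): "a", ("x1", "y1"): "c"}
# Transitively closed strict order: (s, t) means s < t.
LESS = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")}


def le(s, t):
    return s == t or (s, t) in LESS


def V1(x):
    """V_x^1: outcomes a with F(x, y) >= a for every y (guaranteed by x)."""
    return {a for a in A if all(le(a, F[x, y]) for y in Y)}


def U1(y):
    """U_y^1: outcomes a with F(x, y) >= a for some x (not forbidden by y)."""
    return {a for a in A if any(le(a, F[x, y]) for x in X)}


V_1 = set().union(*(V1(x) for x in X))          # formula (8): union over x
U_1 = set(A).intersection(*(U1(y) for y in Y))  # formula (8): intersection over y

print(sorted(V_1), sorted(U_1))
print("V(1) is a subset of U(1):", V_1 <= U_1)
print("alternativeness condition:", V_1 == U_1)
```

In this hypothetical game both sets equal {a, b}, so the alternativeness condition of Definition 9 is satisfied.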

The sets of the form $V_x^1$, $U_y^1$ and $V_y^2$, $U_x^2$ are called the characteristic sets of player 1 and player 2, respectively. Using these sets, we define some types of optimal strategies of the players in an antagonistic game with ordered outcomes.

Definition 10. A strategy $x_0 \in X$ of player 1 is called the greatest guaranteeing strategy if it satisfies the condition $V_{x_0}^1 = V(1)$. A strategy $x_0 \in X$ of player 1 is called the greatest restrictive strategy if it satisfies the condition $U_{x_0}^2 = U(2)$. For player 2, the greatest guaranteeing strategy and the greatest restrictive strategy are defined dually.

Definition 11. A strategy of a player is called discriminating if it forces the outcome into the set of guaranteed outcomes of the other player. Thus, discriminating strategies $x_0 \in X$ and $y_0 \in Y$ of players 1 and 2 are characterized, respectively, by the conditions

\[ (\forall y \in Y)\; F(x_0, y) \in V(2), \qquad (\forall x \in X)\; F(x, y_0) \in V(1). \]

2.2. Necessary and sufficient conditions for saddle points

Consider an antagonistic game $G$ with ordered outcomes of the form (3).

Theorem 1. An arbitrary situation $(x_0, y_0)$ in the game $G$ is a saddle point if and only if $x_0$ is a discriminating strategy of player 1 and $y_0$ is a discriminating strategy of player 2.

Proof (of Theorem 1). Suppose a situation $(x_0, y_0)$ is a saddle point in the game $G$. It is easy to show that in this case $F(x_0, y_0)$ is the greatest element of $V(1)$ and the smallest element of $V(2)$. Then we have the following equalities:

\[ V(1) = (F(x_0, y_0))^{\downarrow} = \{a \in A : a \overset{\omega}{\le} F(x_0, y_0)\}, \tag{9} \]

\[ V(2) = (F(x_0, y_0))^{\uparrow} = \{a \in A : a \overset{\omega}{\ge} F(x_0, y_0)\}. \tag{10} \]

Using (9) and (10), we obtain from Definition 3, for any $x \in X$, $y \in Y$, the inclusions $F(x_0, y) \in V(2)$ and $F(x, y_0) \in V(1)$; hence the necessity is proved. We now state the following auxiliary statement.

Lemma 1. 1) The relation $a_1 \overset{\omega}{\le} a_2$ holds for any $a_1 \in V(1)$ and $a_2 \in U(2)$.
2) The relation $b_1 \overset{\omega}{\ge} b_2$ holds for any $b_1 \in V(2)$ and $b_2 \in U(1)$.

Proof (of Lemma 1). Assume $a_1 \in V(1)$; then there exists a strategy $x_0 \in X$ of player 1 such that for any $y \in Y$ the relation $F(x_0, y) \overset{\omega}{\ge} a_1$ holds. The condition $a_2 \in U(2)$ means that the formula $(\forall x \in X)(\exists y \in Y)\; F(x, y) \overset{\omega}{\le} a_2$ holds. Setting $x = x_0$ in this formula, we obtain $F(x_0, y_0) \overset{\omega}{\le} a_2$ for some $y = y_0$. On the other hand, we have $F(x_0, y_0) \overset{\omega}{\ge} a_1$. From these two inequalities we obtain $a_1 \overset{\omega}{\le} a_2$, and 1) is proved. Condition 2) is obtained dually.

We now prove the sufficiency in Theorem 1. Assume that $x_0$ is a discriminating strategy of player 1 and $y_0$ is a discriminating strategy of player 2. Using Definition 11 and the inclusion $V(2) \subseteq U(2)$, we obtain $F(x, y_0) \in V(1)$ and $F(x_0, y) \in V(2) \subseteq U(2)$ for any $x \in X$, $y \in Y$. Then, in accordance with Lemma 1, we obtain for arbitrary $x \in X$ and $y \in Y$ the relation

\[ F(x, y_0) \overset{\omega}{\le} F(x_0, y). \tag{11} \]

Setting $y = y_0$ in (11) gives the left inequality and setting $x = x_0$ gives the right inequality of

\[ F(x, y_0) \overset{\omega}{\le} F(x_0, y_0) \overset{\omega}{\le} F(x_0, y), \]

i.e. the situation $(x_0, y_0)$ is a saddle point in the game $G$.
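The equivalence stated in Theorem 1 can be verified by enumeration on a small game. The following sketch makes the same kind of assumptions as any toy example: the 2×2 game, the diamond order and the identifiers (`V_1`, `V_2`, `discriminating_1`, `discriminating_2`, `theorem1_holds`) are hypothetical and serve only to illustrate Definition 11 and Theorem 1.

```python
from itertools import product

# Hypothetical game with outcomes ordered as a diamond: a < b, a < c, b < d, c < d.
A = ["a", "b", "c", "d"]
X = ["x0", "x1"]
Y = ["y0", "y1"]
F = {("x0", "y0"): "b", ("x0", "y1"): "d",
     ("x1", "y0"): "a", ("x1", "y1"): "c"}
LESS = {("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")}


def le(s, t):
    return s == t or (s, t) in LESS


# V(1): outcomes a such that some x yields F(x, y) >= a for every y.
V_1 = {a for a in A if any(all(le(a, F[x, y]) for y in Y) for x in X)}
# V(2): outcomes a such that some y yields F(x, y) <= a for every x.
V_2 = {a for a in A if any(all(le(F[x, y], a) for x in X) for y in Y)}


def discriminating_1(x0):
    """Definition 11 for player 1: F(x0, y) lies in V(2) for every y."""
    return all(F[x0, y] in V_2 for y in Y)


def discriminating_2(y0):
    """Definition 11 for player 2: F(x, y0) lies in V(1) for every x."""
    return all(F[x, y0] in V_1 for x in X)


def is_saddle(x0, y0):
    """Definition 3: F(x, y0) <= F(x0, y0) <= F(x0, y) for all x, y."""
    v = F[x0, y0]
    return (all(le(F[x, y0], v) for x in X)
            and all(le(v, F[x0, y]) for y in Y))


# Theorem 1: saddle point <=> both strategies are discriminating.
theorem1_holds = all(
    is_saddle(x, y) == (discriminating_1(x) and discriminating_2(y))
    for x, y in product(X, Y)
)
print(sorted(V_1), sorted(V_2), theorem1_holds)
```

In this example the only saddle point is `("x0", "y0")`, and it is also the only situation formed by two discriminating strategies.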

Theorem 2. A situation $(x_0, y_0) \in X \times Y$ is a saddle point in the game $G$ of the form (3) if and only if
1) $G$ satisfies the alternativeness condition;
2) $x_0$ is the greatest guaranteeing strategy of player 1;
3) $y_0$ is the greatest restrictive strategy of player 2.

Lemma 2. The inclusion

\[ V_x^1 \subseteq U_y^1 \tag{12} \]

holds for any $x \in X$, $y \in Y$.

Indeed, in accordance with Definition 6, the condition $a \in V_x^1$ means that $a$ is a common minorant of all elements of the $x$-row; in particular, $a \overset{\omega}{\le} F(x, y)$. Hence $a \in U_y^1$ and the inclusion (12) is proved.

Lemma 3. A situation $(x_0, y_0)$ is a saddle point in the game $G$ if and only if $V_{x_0}^1 = U_{y_0}^1$.

Proof (of Lemma 3). Using the operator $\downarrow$, we can write the required equality in the form

\[ \bigcap_{y \in Y} (F(x_0, y))^{\downarrow} = \bigcup_{x \in X} (F(x, y_0))^{\downarrow}. \tag{13} \]

Since the conditions $a_1 \overset{\omega}{\le} a_2$ and $a_1^{\downarrow} \subseteq a_2^{\downarrow}$ are equivalent, Definition 3 of a saddle point can be presented in the form of the double inclusion

\[ (F(x, y_0))^{\downarrow} \subseteq (F(x_0, y_0))^{\downarrow} \subseteq (F(x_0, y))^{\downarrow} \tag{14} \]

for any $x \in X$ and $y \in Y$. Let us show that conditions (13) and (14) are equivalent. Indeed, assume that condition (14) holds. Then the subset $(F(x_0, y_0))^{\downarrow}$ is the smallest under inclusion among the subsets $(F(x_0, y))^{\downarrow}$ ($y \in Y$) and the greatest among the subsets $(F(x, y_0))^{\downarrow}$ ($x \in X$), hence

\[ \bigcap_{y \in Y} (F(x_0, y))^{\downarrow} = (F(x_0, y_0))^{\downarrow}, \qquad \bigcup_{x \in X} (F(x, y_0))^{\downarrow} = (F(x_0, y_0))^{\downarrow}, \]

and we obtain (13). Conversely, suppose (13) holds. Since for any situation $(x_0, y_0)$ the two inclusions

\[ \bigcap_{y' \in Y} (F(x_0, y'))^{\downarrow} \subseteq (F(x_0, y_0))^{\downarrow} \subseteq \bigcup_{x' \in X} (F(x', y_0))^{\downarrow} \tag{15} \]

hold, we obtain with the help of (13) that the end members in (15) coincide with $(F(x_0, y_0))^{\downarrow}$. Using this fact, we have for any $x \in X$ and $y \in Y$ the following relations:

\[ (F(x, y_0))^{\downarrow} \subseteq \bigcup_{x' \in X} (F(x', y_0))^{\downarrow} = (F(x_0, y_0))^{\downarrow} = \bigcap_{y' \in Y} (F(x_0, y'))^{\downarrow} \subseteq (F(x_0, y))^{\downarrow}, \]

hence (14) holds.

Using Lemmas 2 and 3, we obtain the following

Corollary 1. In the game $G$ of the form (3) a saddle point exists if and only if $\max_{x \in X} V_x^1$ and $\min_{y \in Y} U_y^1$ exist and coincide, i.e.

\[ \max_{x \in X} V_x^1 = \min_{y \in Y} U_y^1 \tag{16} \]

(the operators max and min are taken with respect to inclusion). Moreover, if the left extremum is attained at the point $x = x_0$ and the right extremum at the point $y = y_0$, then the situation $(x_0, y_0)$ is a saddle point in the game $G$.

To prove Theorem 2 it remains to note that the existence of $\max_{x \in X} V_x^1$ is equivalent to the existence of the greatest guaranteeing strategy of player 1 and the existence of $\min_{y \in Y} U_y^1$ is equivalent to the existence of the greatest restrictive strategy of player 2; moreover, it follows from (16) that the game $G$ satisfies the alternativeness condition, which completes the proof of Theorem 2.

Corollary 2. For antagonistic games with lattice-ordered outcomes, the equality (16) takes the form

\[ \max_{x \in X} \inf_{y \in Y} F(x, y) = \min_{y \in Y} \sup_{x \in X} F(x, y) \]

and it coincides with the well-known condition for the existence of a saddle point in an antagonistic game with payoff function (see, for example, Vorob’ev, 1985).

3. Transitive equilibrium points in antagonistic games

3.1. Connection between transitive equilibrium points and saddle points

Consider an antagonistic game $G$ with ordered outcomes of the form (3). Let $\varphi\colon A \to \mathbb{R}$ be a function from the set $A$ of outcomes of the game $G$ into the real numbers. Then we can construct the following antagonistic game with payoff function:

\[ G_{\varphi} = \langle X, Y, \varphi \circ F \rangle. \tag{17} \]

Theorem 3. Let $G$ be an antagonistic game with ordered outcomes of the form (3). Then
1. If a situation $(x_0, y_0) \in X \times Y$ is a saddle point in the game $G_{\varphi}$, where $\varphi\colon A \to \mathbb{R}$ is some strict isotonic function from the set of outcomes $A$ into $\mathbb{R}$, then $(x_0, y_0)$ is a transitive equilibrium point in the game $G$.
In the case when the set of outcomes $A$ is finite or countable, the converse is also true; namely, we have the following statement.
2. If a situation $(x_0, y_0) \in X \times Y$ is a transitive equilibrium point in the game $G$, then there exists a strict isotonic function $\varphi\colon A \to \mathbb{R}$ from the set of outcomes $A$ into $\mathbb{R}$ such that $(x_0, y_0)$ is a saddle point in the game $G_{\varphi}$.

Remark 2. Assertion 2 becomes false when “transitive equilibrium point” is replaced by “general equilibrium point” (see Example 1).
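For a very small game, both assertions of Theorem 3 can be illustrated by brute force. The sketch below is not the construction used in the proof; it simply enumerates candidate functions $\varphi$. It relies on two assumptions that are not in the paper: the order on the outcomes of Table 1 is taken to be only $b < d$ (the diagram of Fig. 1 is not reproduced in the text), and integer values in `range(len(A))` are used for $\varphi$, which suffices here because only the relative order of the values matters. All identifiers are hypothetical.

```python
from itertools import product

A = ["a", "b", "c", "d"]
X = ["x0", "x1"]
Y = ["y0", "y1"]
# Realization table as in Table 1.
F = {("x0", "y0"): "c", ("x0", "y1"): "b",
     ("x1", "y0"): "d", ("x1", "y1"): "a"}
LESS = {("b", "d")}  # ASSUMED order: only b < d


def strict_isotone_maps():
    """Yield every map phi: A -> {0, ..., |A|-1} with s < t implying phi(s) < phi(t)."""
    for values in product(range(len(A)), repeat=len(A)):
        phi = dict(zip(A, values))
        if all(phi[s] < phi[t] for s, t in LESS):
            yield phi


def is_saddle_in_G_phi(phi, x0, y0):
    """Classical saddle point of the payoff game G_phi = <X, Y, phi o F>."""
    v = phi[F[x0, y0]]
    return (all(phi[F[x, y0]] <= v for x in X)
            and all(v <= phi[F[x0, y]] for y in Y))


def is_transitive_eq(x0, y0):
    """Definition 5: there are no x, y with F(x, y0) > F(x0, y)."""
    return not any((F[x0, y], F[x, y0]) in LESS for x, y in product(X, Y))


def has_scalarization(x0, y0):
    """True if some strict isotonic phi turns (x0, y0) into a saddle point of G_phi."""
    return any(is_saddle_in_G_phi(phi, x0, y0) for phi in strict_isotone_maps())


results = {(x, y): (is_transitive_eq(x, y), has_scalarization(x, y))
           for x, y in product(X, Y)}
print(results)
```

In this sketch the two flags agree for every situation; in particular, for `("x0", "y0")`, which is a general but not a transitive equilibrium point under the assumed order, no suitable `phi` is found, in line with Remark 2.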

Proof (of Theorem 3). 1. Suppose that the situation $(x_0, y_0)$ is not a transitive equilibrium point in the game $G$, i.e. there exist strategies $x_1 \in X$ and $y_1 \in Y$ such that $F(x_1, y_0) \overset{\omega}{>} F(x_0, y_1)$. Because the function $\varphi$ is strict isotonic, we obtain $\varphi(F(x_1, y_0)) > \varphi(F(x_0, y_1))$. This contradicts our assumption that $(x_0, y_0)$ is a saddle point in the game $G_{\varphi}$, since a saddle point gives $\varphi(F(x_1, y_0)) \le \varphi(F(x_0, y_0)) \le \varphi(F(x_0, y_1))$.

To prove assertion 2, we need the following lemmas.

Lemma 4 (see Rozen, 1988). Consider an arbitrary ordered set $\langle A, \omega \rangle$ and subsets $B \subseteq A$, $C \subseteq A$. Assume that $\neg(b \overset{\omega}{>} c)$ for any $b \in B$ and $c \in C$. Then there exists a linear co-ordering $\bar{\omega}$ of the order $\omega$ such that $b \overset{\bar{\omega}}{<} a \overset{\bar{\omega}}{<} c$ for any $b \in B \setminus C$, $a \in B \cap C$, $c \in C \setminus B$.

Lemma 5 (see Rozen, 1988). Assume that $\neg(b \overset{\omega}{>} c)$ for any $b \in B$, $c \in C$. Then there exists a strict isotonic function $\varphi_0\colon A \to \mathbb{R}$ such that $\varphi_0(b) \le \varphi_0(c)$ for any $b \in B$ and $c \in C$ (this function is called a separating function).

We now prove assertion 2 of Theorem 3. Let a situation $(x_0, y_0) \in X \times Y$ be a transitive equilibrium point in the game $G$. Put $B = \{F(x, y_0) : x \in X\}$, $C = \{F(x_0, y) : y \in Y\}$. Then the condition $\neg(b \overset{\omega}{>} c)$ holds for any $b \in B$, $c \in C$, and by Lemma 5 there exists a strict isotonic function $\varphi\colon A \to \mathbb{R}$ such that $\varphi(F(x, y_0)) \le \varphi(F(x_0, y))$ for any $x \in X$ and $y \in Y$. Setting here $x = x_0$ and then $y = y_0$, we conclude that the situation $(x_0, y_0)$ is a saddle point in the game $G_{\varphi}$.

Example 1. Consider an antagonistic game $G$ with ordered outcomes in which the realization function $F$ is given by Table 1 and the order relation $\omega$ by its diagram (see Fig. 1).

Table 1. Realization function F

F      y0    y1
x0     c     b
x1     d     a

Fig. 1. Order relation ω

In this game the situation $(x_0, y_0)$ is a general equilibrium point (since the element $F(x_0, y_0) = c$ is minimal in its row and maximal in its column), but it is not a transitive equilibrium point (since $F(x_1, y_0) = d \overset{\omega}{>} b = F(x_0, y_1)$). We now show, without using Theorem 3, that there does not exist a strict isotonic function $\varphi\colon A \to \mathbb{R}$ under which the situation $(x_0, y_0)$ is a saddle point in the game $G_{\varphi}$. Indeed, otherwise we would have the following double inequality for any $x \in X$ and $y \in Y$:

\[ \varphi(F(x, y_0)) \le \varphi(F(x_0, y_0)) \le \varphi(F(x_0, y)). \tag{18} \]

Setting $x = x_1$ and $y = y_1$ in (18), we obtain

\[ \varphi(F(x_1, y_0)) \le \varphi(F(x_0, y_0)) \le \varphi(F(x_0, y_1)), \]

hence $\varphi(F(x_1, y_0)) \le \varphi(F(x_0, y_1))$, that is, $\varphi(d) \le \varphi(b)$. On the other hand, since the function $\varphi$ is strict isotonic, the condition $d \overset{\omega}{>} b$ implies $\varphi(d) > \varphi(b)$, which contradicts the above inequality.

3.2. A mixed extension of an antagonistic game with ordered outcomes

The main result of Section 3 is a description of the set of transitive equilibrium points in the mixed extension of an antagonistic game with ordered outcomes. First we need some notation. For an arbitrary ordered set $\langle A, \omega \rangle$ we denote by $C_0(\omega)$ the set of all isotonic functions from $\langle A, \omega \rangle$ into the real numbers $\mathbb{R}$ and by $C(\omega)$ the set of all strict isotonic functions from $\langle A, \omega \rangle$ into the real numbers. By a probabilistic measure on a finite ordered set $\langle A, \omega \rangle$ we mean a non-negative function $p\colon A \to \mathbb{R}$ such that $\sum_{a \in A} p(a) = 1$. The set of all probabilistic measures on the set $A$ is denoted by $\tilde{A}$. For any $\varphi \in C_0(\omega)$ and $p \in \tilde{A}$ we put $\bar{\varphi}(p) = (\varphi, p)$, where $(\varphi, p)$ is the standard scalar product.

Definition 12. Let $\langle A, \omega \rangle$ be a finite ordered set. The canonical extension of the order $\omega$ to the set of probabilistic measures is the binary relation $\tilde{\omega} \subseteq \tilde{A} \times \tilde{A}$ defined by the formula

\[ p_1 \overset{\tilde{\omega}}{\le} p_2 \iff (\forall \varphi \in C_0(\omega))\; \bar{\varphi}(p_1) \le \bar{\varphi}(p_2). \]

It is known that $\tilde{\omega}$ is an order relation on the set $\tilde{A}$ of probabilistic measures. In explicit form, the order relation $\tilde{\omega}$ can be presented as follows. For an arbitrary probabilistic measure $p \in \tilde{A}$ and an arbitrary subset $B \subseteq A$ put $p(B) = \sum_{a \in B} p(a)$. Then we have the following equivalence:

\[ p_1 \overset{\tilde{\omega}}{\le} p_2 \iff (\forall B \in M(\omega))\; p_1(B) \le p_2(B), \tag{19} \]

where $M(\omega)$ is the family of all majorant stable subsets of the ordered set $\langle A, \omega \rangle$ (a subset $B \subseteq A$ is called majorant stable if the conditions $a \in B$ and $a' \overset{\omega}{>} a$ imply $a' \in B$).

Definition 13. By the mixed extension of a finite antagonistic game $G = \langle X, Y, A, \omega, F \rangle$ with ordered outcomes we mean the antagonistic game with ordered outcomes of the form

\[ \tilde{G} = \langle \tilde{X}, \tilde{Y}, \tilde{A}, \tilde{\omega}, \tilde{F} \rangle, \tag{20} \]

where $\tilde{X}$ is the set of probabilistic measures on $X$, $\tilde{Y}$ is the set of probabilistic measures on $Y$, $\tilde{A}$ is the set of probabilistic measures on $A$, $\tilde{\omega}$ is the canonical extension of the order $\omega$ to the set of probabilistic measures, and $\tilde{F}$ is the mapping from the set $\tilde{X} \times \tilde{Y}$ into $\tilde{A}$ defined as follows. For any probabilistic measures $\mu \in \tilde{X}$ and $\nu \in \tilde{Y}$ we set $\tilde{F}(\mu, \nu) = F_{(\mu,\nu)}$, where $F_{(\mu,\nu)}$ is the probabilistic measure on $A$ given by the equality

\[ F_{(\mu,\nu)}(a) = \sum_{F(x,y) = a} \mu(x)\,\nu(y). \tag{21} \]

Thus the transition from an antagonistic game with ordered outcomes to its mixed extension consists in replacing the basic sets by sets of probabilistic measures and extending the order relation and the realization function.

3.3. Transitive equilibrium points in mixed extension of antagonistic game with ordered outcomes

Theorem 4. Consider a finite antagonistic game $G$ with ordered outcomes and let $\tilde{G}$ be its mixed extension. An arbitrary situation in mixed strategies $(\mu_0, \nu_0) \in \tilde{X} \times \tilde{Y}$ is a transitive equilibrium point in the game $\tilde{G}$ if and only if there exists a strict isotonic function $\varphi \in C(\omega)$ from the set of outcomes $A$ into the real numbers $\mathbb{R}$ such that the situation $(\mu_0, \nu_0)$ is a saddle point in the mixed extension (in the classical sense) of the antagonistic game $G_{\varphi} = \langle X, Y, \varphi \circ F \rangle$ with payoff function.

The proof of the “if part” is based on the following lemmas (see Rozen, 2014).

Lemma 6. Let $\langle A, \omega \rangle$ be an arbitrary finite ordered set and $\varphi$ be a strict isotonic mapping from $\langle A, \omega \rangle$ into the real numbers $\mathbb{R}$. Define an extension $\bar{\varphi}$ of the function $\varphi$ to the set $\tilde{A}$ by setting for any $p \in \tilde{A}$

\[ \bar{\varphi}(p) = \sum_{a \in A} p(a)\,\varphi(a). \tag{22} \]

Then $\bar{\varphi}$ is a strict isotonic mapping from the ordered set $\langle \tilde{A}, \tilde{\omega} \rangle$ into the real numbers $\mathbb{R}$.

Lemma 7. Given a finite antagonistic game $G = \langle X, Y, A, \omega, F \rangle$ with ordered outcomes and an arbitrary isotonic function $\varphi \in C_0(\omega)$, we can construct an antagonistic game $G_{\varphi} = \langle X, Y, \varphi \circ F \rangle$ with payoff function. Let $\overline{\varphi \circ F}$ be the payoff function in the mixed extension (in the classical sense) of the game $G_{\varphi}$. Then for any situation in mixed strategies $(\mu, \nu) \in \tilde{X} \times \tilde{Y}$ the equality

\[ \overline{\varphi \circ F}(\mu, \nu) = \bar{\varphi}\bigl(F_{(\mu,\nu)}\bigr) \tag{23} \]

holds, where the probabilistic measure $F_{(\mu,\nu)}$ is defined by (21) and the function $\bar{\varphi}$ by (22).
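Formulas (21) and (22) and the identity (23) of Lemma 7 are easy to reproduce numerically. The following sketch uses exact rational arithmetic; the realization table is that of Table 1, whereas the function `phi`, the mixed strategies `mu` and `nu`, and all function names are hypothetical choices made for illustration (the values of `phi` are assumed strictly isotonic for an order in which only $b < d$).

```python
from fractions import Fraction

# Realization table as in Table 1.
F = {("x0", "y0"): "c", ("x0", "y1"): "b",
     ("x1", "y0"): "d", ("x1", "y1"): "a"}

# Hypothetical data: a function phi with phi(b) < phi(d), and two mixed strategies.
phi = {"a": 0, "b": 1, "c": 1, "d": 2}
mu = {"x0": Fraction(1, 3), "x1": Fraction(2, 3)}
nu = {"y0": Fraction(1, 4), "y1": Fraction(3, 4)}


def outcome_measure(mu, nu):
    """Formula (21): F_(mu,nu)(a) is the sum of mu(x) * nu(y) over F(x, y) = a."""
    p = {}
    for (x, y), a in F.items():
        p[a] = p.get(a, 0) + mu[x] * nu[y]
    return p


def phi_bar(p):
    """Formula (22): extension of phi to probabilistic measures on A."""
    return sum(p[a] * phi[a] for a in p)


def expected_payoff(mu, nu):
    """Payoff in the classical mixed extension of the game G_phi."""
    return sum(mu[x] * nu[y] * phi[F[x, y]] for x in mu for y in nu)


p = outcome_measure(mu, nu)
print(p)
print(phi_bar(p), expected_payoff(mu, nu))  # both sides of identity (23)
```

The two printed values coincide, which is exactly what identity (23) states for this choice of data.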

We now prove the “if part” of Theorem 4. Suppose that there exists a function $\varphi \in C(\omega)$ such that a situation $(\mu_0, \nu_0)$ is a saddle point in the mixed extension (in the classical sense) of the antagonistic game $G_{\varphi} = \langle X, Y, \varphi \circ F \rangle$ with payoff function. We need to show that $(\mu_0, \nu_0)$ is a transitive equilibrium point in the game $\tilde{G}$. Otherwise we have the relation

\[ F_{(\mu_1,\nu_0)} \overset{\tilde{\omega}}{>} F_{(\mu_0,\nu_1)} \tag{24} \]

for some $\mu_1 \in \tilde{X}$ and $\nu_1 \in \tilde{Y}$. Using Lemma 6, we obtain from (24) the inequality $\bar{\varphi}\bigl(F_{(\mu_1,\nu_0)}\bigr) > \bar{\varphi}\bigl(F_{(\mu_0,\nu_1)}\bigr)$; in accordance with Lemma 7 it can be written in the form

\[ \overline{\varphi \circ F}(\mu_1, \nu_0) > \overline{\varphi \circ F}(\mu_0, \nu_1). \tag{25} \]

On the other hand, since the situation $(\mu_0, \nu_0)$ is a saddle point in the mixed extension of the antagonistic game $G_{\varphi}$, we have

\[ \overline{\varphi \circ F}(\mu_1, \nu_0) \le \overline{\varphi \circ F}(\mu_0, \nu_0) \le \overline{\varphi \circ F}(\mu_0, \nu_1), \]

hence $\overline{\varphi \circ F}(\mu_1, \nu_0) \le \overline{\varphi \circ F}(\mu_0, \nu_1)$, which contradicts (25). Thus the “if part” of Theorem 4 is proved. To prove the converse, we need the following lemma.

Lemma 8. Let $\langle A, \omega \rangle$ be an arbitrary finite ordered set and $P, Q \subseteq \tilde{A}$ be two polyhedrons of probabilistic measures. Assume that for any $p \in P$, $q \in Q$ the condition $\neg\bigl(p \overset{\tilde{\omega}}{>} q\bigr)$ holds. Then there exists a strict isotonic mapping $\varphi \in C(\omega)$ from the ordered set $\langle A, \omega \rangle$ into the real numbers such that $\bar{\varphi}(p) \le \bar{\varphi}(q)$ for any $p \in P$, $q \in Q$.

Proof (of Lemma 8). Let $(A_1, \dots, A_m)$ be a list of all majorant stable subsets of the ordered set $\langle A, \omega \rangle$. Consider the mapping $\xi$ which assigns to every probabilistic measure $p \in \tilde{A}$ the vector $\xi(p) = (p(A_1), \dots, p(A_m)) \in \mathbb{R}^m$. Since the mapping $\xi$ is linear, it maps $P$ and $Q$ onto convex polyhedrons $\xi(P)$ and $\xi(Q)$, respectively. Then $R = \xi(P) + (-1)\,\xi(Q)$ is a convex polyhedron in $\mathbb{R}^m$ (see, for example, Leichtweiss, 1980). Let us show that the polyhedron $R$ does not contain semi-positive vectors. Otherwise, assume that there exists a vector $u = (u_1, \dots, u_m) \in R$ such that $u_1 \ge 0, \dots, u_m \ge 0$ and at least one of these inequalities is strict. In this case there exist $p \in P$ and $q \in Q$ such that

\[ (u_1, \dots, u_m) = (p(A_1), \dots, p(A_m)) - (q(A_1), \dots, q(A_m)), \]

i.e. $p(A_i) - q(A_i) = u_i \ge 0$ for all $i = 1, \dots, m$ and at least one inequality is strict. Using (19), we obtain $p \overset{\tilde{\omega}}{>} q$, which contradicts our assumption. Thus the convex polyhedron $R$ does not contain semi-positive vectors. Then for this polyhedron $R$ there exists a supporting hyperplane with strictly positive normal vector $c \in \mathbb{R}^m$ which contains the null vector $0 \in \mathbb{R}^m$ (see Rozen, 2014, lemma IV.7), i.e. $(\xi(p) - \xi(q), c) \le 0$ for any $p \in P$, $q \in Q$, hence $(c, \xi(p)) \le (c, \xi(q))$. For the conjugate linear mapping $\xi^*$ we have $(\xi^*(c), p) \le (\xi^*(c), q)$ for any $p \in P$, $q \in Q$. It remains to note that when the vector $c \in \mathbb{R}^m$ is positive, $\xi^*(c)$ is a strict isotonic mapping from the ordered set $\langle A, \omega \rangle$ into the real numbers $\mathbb{R}$ (see Rozen, 2014, lemma IV.10). Hence $\varphi = \xi^*(c)$ is the required mapping.
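The mapping $\xi$ used in the proof of Lemma 8 and the explicit form (19) of the canonical order can be sketched for a small ordered set as follows. The ordered set (four outcomes with the single strict relation $b < d$) and every identifier (`LESS`, `up_sets`, `xi`, `canon_le`) are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import combinations

A = ["a", "b", "c", "d"]
LESS = {("b", "d")}  # ASSUMED strict order: only b < d (transitively closed)


def up_sets():
    """All majorant stable subsets B of A: s in B and s < t imply t in B."""
    result = []
    for r in range(len(A) + 1):
        for B in combinations(A, r):
            members = set(B)
            if all(t in members for s, t in LESS if s in members):
                result.append(frozenset(B))
    return result


def xi(p):
    """The vector (p(A_1), ..., p(A_m)) over all majorant stable subsets A_i."""
    return [sum(p.get(a, 0) for a in B) for B in up_sets()]


def canon_le(p1, p2):
    """Formula (19): p1 <= p2 in the canonical order iff p1(B) <= p2(B) for all B."""
    return all(u <= v for u, v in zip(xi(p1), xi(p2)))


delta_a = {"a": Fraction(1)}
delta_b = {"b": Fraction(1)}
delta_d = {"d": Fraction(1)}
half_b_half_d = {"b": Fraction(1, 2), "d": Fraction(1, 2)}

print(len(up_sets()))
print(canon_le(delta_b, delta_d), canon_le(delta_d, delta_b))
print(canon_le(delta_b, half_b_half_d), canon_le(half_b_half_d, delta_d))
print(canon_le(delta_a, delta_b), canon_le(delta_b, delta_a))
```

In this sketch the point masses at `b` and `d` are comparable in the canonical order, as are the mixture of the two and each of its endpoints, while the point masses at `a` and `b` are incomparable, mirroring the assumed order on the outcomes.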

Let us prove the “only if” part of Theorem 4. Assume that a situation $(\mu_0, \nu_0) \in \tilde{X} \times \tilde{Y}$ in mixed strategies is a transitive equilibrium point in a game $\tilde{G}$ of the form (20). Then for all $\mu \in \tilde{X}$ and $\nu \in \tilde{Y}$ we have

$\neg\left(F_{(\mu,\nu_0)} >^{\tilde{\omega}} F_{(\mu_0,\nu)}\right)$. (26)

Put $P = \{F_{(\mu,\nu_0)} : \mu \in \tilde{X}\}$, $Q = \{F_{(\mu_0,\nu)} : \nu \in \tilde{Y}\}$. It is easy to check that the set $P$ coincides with the convex hull of the finite set $\{F_{(x,\nu_0)} : x \in X\}$ and the set $Q$ coincides with the convex hull of the finite set $\{F_{(\mu_0,y)} : y \in Y\}$, hence $P$ and $Q$ are convex polyhedrons. Moreover, it follows directly from (26) that the condition $\neg\,(p >^{\tilde{\omega}} q)$ holds for any $p \in P$, $q \in Q$. Thus all assumptions of Lemma 8 are satisfied. By Lemma 8 there exists a strict isotonic mapping $\varphi \in C(\omega)$ from the ordered set $\langle A, \omega \rangle$ into the real numbers such that the inequalities

$\varphi(F_{(\mu,\nu_0)}) \le \varphi(F_{(\mu_0,\nu)})$ (27)

hold for any $\mu \in \tilde{X}$ and $\nu \in \tilde{Y}$. By Lemma 7, we can write (27) in the form

$\varphi \circ F(\mu,\nu_0) \le \varphi \circ F(\mu_0,\nu)$, (28)

where $\varphi \circ F$ is the payoff function in the mixed extension (in the classical sense) of the game $G_\varphi$. Setting in (28) $\mu = \mu_0$ and then $\nu = \nu_0$, we obtain

$\varphi \circ F(\mu,\nu_0) \le \varphi \circ F(\mu_0,\nu_0) \le \varphi \circ F(\mu_0,\nu)$,

i.e. the situation $(\mu_0,\nu_0)$ is a saddle point in the mixed extension of the game $G_\varphi$, which completes the proof of Theorem 4.
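The saddle-point inequalities above can be checked numerically for a finite game. The following minimal Python sketch assumes that the scalarised payoff $\varphi \circ F$ is given as a matrix `M` whose rows are pure strategies of the first (maximising) player; the matrix and the function names are hypothetical. Because the expected payoff is linear in each player's mixed strategy, it suffices to test deviations to pure strategies.

```python
def expected_payoff(M, mu, nu):
    """Expected value of the matrix payoff M under mixed strategies mu (rows) and nu (columns)."""
    return sum(mu[i] * M[i][j] * nu[j]
               for i in range(len(mu)) for j in range(len(nu)))

def is_saddle_point(M, mu0, nu0, tol=1e-9):
    """Check F(mu, nu0) <= F(mu0, nu0) <= F(mu0, nu) for all mixed mu, nu."""
    rows, cols = range(len(M)), range(len(M[0]))
    value = expected_payoff(M, mu0, nu0)
    # Best pure reply of the row player against nu0 (upper bound over all mu)
    best_row = max(sum(M[i][j] * nu0[j] for j in cols) for i in rows)
    # Best pure reply of the column player against mu0 (lower bound over all nu)
    best_col = min(sum(mu0[i] * M[i][j] for i in rows) for j in cols)
    return best_row <= value + tol and value <= best_col + tol

# Hypothetical scalarised payoff matrix (matching pennies)
M = [[1.0, -1.0],
     [-1.0, 1.0]]
uniform_is_saddle = is_saddle_point(M, [0.5, 0.5], [0.5, 0.5])
pure_is_saddle = is_saddle_point(M, [1.0, 0.0], [0.5, 0.5])
```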

Corollary 3. The set $\mathrm{TrEq}\,\tilde{G}$ of transitive equilibrium points in the mixed extension of the game $G$ can be presented in the form

$\mathrm{TrEq}\,\tilde{G} = \bigcup_{\varphi \in C(\omega)} \mathrm{Sp}\,\tilde{G}_\varphi$,

where $\mathrm{Sp}\,\tilde{G}_\varphi$ is the set of saddle points in the mixed extension (in the classical sense) of the game $G_\varphi$ and the function $\varphi$ runs over the set $C(\omega)$ of all strict isotonic mappings of the ordered set $\langle A, \omega \rangle$ into the real numbers.

Remark 3. The statement of Theorem 4 is not a consequence of the description of equilibrium points in the mixed extension of games with ordered outcomes given in (Rozen, 2010). In particular, assume that a situation $(\mu_0,\nu_0)$ in the mixed extension of an antagonistic game $G$ with ordered outcomes is a general equilibrium point but is not a transitive equilibrium point. Then there exist two strict isotonic functions $\varphi \in C(\omega)$ and $\psi \in C(\omega^{-1})$ such that the situation $(\mu_0,\nu_0)$ is a Nash equilibrium point in the mixed extension (in the classical sense) of the game with payoff functions $\langle X, Y, \varphi \circ F, \psi \circ F \rangle$. However, in this case there does not exist a single strict isotonic function $\varphi \in C(\omega)$ such that the situation $(\mu_0,\nu_0)$ is a saddle point in the mixed extension (in the classical sense) of the antagonistic game $G_\varphi = \langle X, Y, \varphi \circ F \rangle$ (see also Example 1).

References

Vorob’ev, N. N. (1985). Game theory for economists-cyberneticists. Nauka: Moscow (in Russian).
Leichtweiss, K. (1980). Convex Sets. VEB Deutscher Verlag der Wissenschaften: Berlin.
Rozen, V. V. (1988). Reducibility of optimal solutions for games with ordered outcomes. In: Semigroup Theory and Its Applications, pp. 50–60. Saratov State University: Saratov (in Russian).
Rozen, V. V. (2010). Equilibrium points in games with ordered outcomes. In: Contributions to game theory and management. Vol. III. Collected papers presented on the Third International Conference Game Theory and Management (Petrosyan, L. A. and N. A. Zenkevich, eds.), pp. 368–386. Graduate School of Management SPbU: SPb.
Rozen, V. V. (2014). Ordered vector spaces and its applications. Saratov State University: Saratov (in Russian).

Contributions to Game Theory and Management, X, 299–325

Design and Simulation of Coopetition as Lead Generating Mechanism⋆

Maxim Shlegel¹ and Nikolay Zenkevich²
¹ St. Petersburg University, Graduate School of Management. E-mail: [email protected]
² St. Petersburg University, Graduate School of Management. E-mail: [email protected]

Abstract This paper considers coopetition as a form of interaction between companies and agents. An internet-based platform is modelled and used as the method for analysing coopetition. The central part of the research is a simulation of a lead-generating internet-based platform. The result is an assessment of the potential impact on an industry of lead-generating, internet platform-based coopetition among companies that operate in that industry.
Keywords: coopetition, internet-based platform, lead generation, agent-based simulation.

1. Introduction

Organizations can interact in several ways. One classification distinguishes four types: competition, collaboration, coexistence and coopetition (Bengtsson and Kock, 1999). Coopetition is a kind of interaction in which firms operating in one industry cooperate and compete with each other at the same time to improve their financial results (Brandenburger and Nalebuff, 1996). In other words, firms entering coopetition try to increase the value of the whole market and then compete for their share of it: “to create a bigger business pie, while competing to divide it up” (Walley, 2007). One of the best explanations of the phenomenon is attributed to Kirk S. Pickett, who in 1913 described the relationship among oyster dealers: they are not just in competition with each other, but also cooperate to develop more business for each participant of the market, which means that these oyster dealers are in co-opetition, not in competition (Cherrington, 1976). From the above we can conclude that coopetition is competition within cooperation: all players first try to make the market on which they operate “bigger” and then divide this “bigger” market among themselves through competitive activities.

In other words, coopetition is an inter-firm strategy in which companies first focus on increasing the profit that their industry can give them. At that stage they try to enlarge the market or sphere of business in which they operate, and they enter some kind of collaborative relationship with each other. Once the additional value is created, the companies become rivals, each trying to capture the largest part of this additionally created value. As a result, there is an increased chance of creating a win-win situation for all participants of the industry through the creation of a larger market (Liu, 2013).

⋆ This work is supported by the Russian Foundation for Basic Research, project N.16-01-00805A

One argument for coopetition as a form of inter-firm relationship with the potential to capture additional value is resource-based (Lavie, 2006). A general strategy in alliances is to use supplementary and complementary resources in an integrated way. Such an approach can create more value than using these resources separately. This additional value can take the form of innovation, differentiation of organizations, cost reduction, market expansion, and cooperative manufacturing and distribution of products. Another resource-based area of coopetition is resource utilization. Through cooperation, organizations create additional value by utilizing their resources jointly. At the same time, each of them captures an individual portion of the jointly created value through the utilization of its own specific resources (Ritala and Hurmelinna-Laukkanen, 2009). The pace of coopetition is increasing dramatically, as shown by recent research on the ICT sector (Basole, Park and Barnett, 2015).

One of the main reasons why companies enter coopetitive relationships with other organisations is to improve their competitive position. This can be achieved through inter-organisational learning and by obtaining valuable and strategically important resources from such interactions (Luo, 2004). However, these are not the only ways to improve a competitive position. Other examples include (Garrette, Castaner and Dussauge, 2009; Tong and Reuer, 2010; Rothaermel, 2001):

• Adoption of partners' experience and knowledge: when organisations enter close relationships (such as coopetition or cooperation), they enter a common “knowledge pool”. Participation in such a pool gives them a chance to obtain knowledge and experience from their competitors;
• Joint creation of new knowledge: through coopetition, organisations can combine their creative skills to generate new knowledge that can be used by the particular coopetition group. Such knowledge gives all members of the group an additional competitive advantage;
• Joint research and development: by entering joint R&D projects, companies get a chance to manage risks and increase the budgets of research activities;
• Defence against radical innovations that could damage a company: by staying in touch with key competitors through coopetition, organisations can protect their business from the sudden appearance of radical innovations on the market. This can be achieved through a common information field, knowledge sharing and joint R&D projects;
• Creation of entry barriers for newcomers and foreign competitors: coopetitive interactions give organisations the potential to defend their territory with price, technology or market instruments;
• Cost reduction through an increased scale of operations that can be carried out in coopetition (upstream ones): for example, if five organisations place one joint order with a supplier of goods, they can obtain a substantial discount and reduce their costs significantly.

To understand coopetition and its potential for value creation and profitability, it is important to examine the conditions that may affect the formation of coopetition among companies. At least five factors influence this process:

Environment: The coopetitive strategy of organisations can be influenced by the context in which they operate. This context can be described by governmental policy, the peculiarities of resources, the level of competition, the quality of services and other factors (Lado, Boyd and Hanlon, 1997). For instance, in an environment where there is a high probability of intervention from abroad, organisations are motivated to cooperate to protect their market and, at the same time, to compete for the market that they defend. In such a case organisations have more motivation to cooperate, so coopetition becomes upstream-dominated. Conversely, if there is little possibility of intervention, companies are likely to compete more than they cooperate. Many industries now face a dramatic growth of competition due to factors such as internationalisation, the growth of innovation and the development of the internet. As a result, organisations have to find ways to deal with the uncertainties that arise from this situation. That brings competing sides to the idea of cooperating with each other (Burgers, Hill and Kim, 1993). For example, when companies face innovations that could change the whole market and affect the choices and reactions of customers, cooperation among rivals can shift its focus to the adaptation of organizations to the quickly changing environment. By doing this together, companies increase their chances of succeeding and staying on the market (Burgers, Hill and Kim, 1993).

Coopetitional costs: When entering coopetition with other organisations, a company has to take into account that such relationships sometimes give rise to additional costs (coopetitional costs). These costs appear because the complexity of relations increases as the number of participants grows (Lado, Boyd and Hanlon, 1997).
As coopetition involves a cooperative component, it is reasonable to assume that some concepts of cooperation theory are applicable to coopetition. Cooperation theory describes the costs that arise when companies try to maintain cooperative relationships and the potential losses connected with opportunistic behaviour. All these issues can affect the form of coopetition among organisations. It is vital for organisations that the income and value brought by the coopetition they enter outweigh these costs. For this reason, companies have to consider which benefits such coopetition should bring them.

Size of companies: Small and large organisations are statistically less interconnected with their partners than medium-sized organisations. Small companies usually occupy niches and do not have enough power and competitive potential to influence their industry or the alliance that they enter. The situation of large organisations is affected by the antitrust policy of modern governments, which place relations among big companies under strict monitoring and try to coordinate them. It is also important to note that big international organisations have access to far more resources than SMEs, so their motivation to cooperate decreases. Medium-sized companies, in contrast, already have some ability to influence their industries but are still not big enough to face all the difficulties of market turbulence alone. That makes medium-sized companies an ideal subject for cooperative relationships (Burgers, Hill and Kim, 1993) and makes coopetitive interactions at least potentially interesting for them.

How does coopetition affect competition on a particular market? This question has been examined mostly from the perspective of how cooperation influences the market. However, there is also some research in the coopetition context (e.g. Oxley et al., 2009).

Different studies provide quite opposite results. One group of researchers provides evidence that cooperation among organizations reduces the degree of competition on the market (Tong and Reuer, 2010). Another group states that cooperation and coopetition increase competition on the market (Gnyawali, 2006). Joint research and development programs that are widely announced on a particular market also have a positive effect on market value, not only for members of the coalition but also for other companies that do not enter it. On this basis, the authors state that the observed increase in the share prices of companies that do not enter an alliance could be a result of an expected decrease of competition on the market (Oxley et al., 2009).

At the same time, coopetition creates potential problems for companies. There is a risk of opportunistic behaviour (Brandenburger and Nalebuff, 1996): participants can act selfishly when particular circumstances give them a chance to do so. This can involve knowledge expropriation, breach of trust and similar actions.

Since coopetition can be risky, companies that enter it can have problems with trust building. Some sources suggest that the most significant role in the trust-building process belongs to a calculative process (Faulkner, 2000; Lewicki and Bunker, 1996). Dyadic coopetition depends mostly on cost-benefit analysis. In the absence of benefits that an individual participant can calculate, other trust-building mechanisms are not sufficient for starting coopetition.
The emotional basis plays a moderating role. Reputation-based trust decreases opportunistic risks but tends to be insufficient for the coopetition decision. Analysis of the capabilities of a potential partner tends to be a part of the cost-benefit analysis (Czernek and Czakon, 2016). However, the problem of trust could be avoided if there were no direct interactions between the participants of a coopetition. Instead, organizations could interact with a third party whose main interest is the coopetition itself. That party could derive its benefit from the additional value gained through the coopetition. This makes the third party potentially more credible than the other participants of the alliance, who may try to profit by cheating.

The phenomenon of coopetition raises various questions, such as trust building among organisations and the security of companies that choose coopetition as a strategy (Czernek and Czakon, 2016). The academic literature also contains various attempts to classify coopetition strategies, types and activities through analysis of the actual experience of organisations (Rusko, 2011).

One instrument that could serve as a basis for coopetition as a strategic tool for a whole industry is an internet-based platform. The phenomenon of the internet platform (e-platform) is a modern one (Armstrong, 2006). Its current popularity became possible with the rapid development of the internet all around the world. The most frequent type of internet platform is a multi-sided platform, which provides services for different (usually interconnected) groups of users.

Owing to their mechanics, internet-based platforms have already started to provide services for competing companies, and many types and forms of such services are currently offered. There are even some examples of platforms that operate on the principles of coopetition (Ritala, Golnam and Wegmann, 2014). At present, coopetition strategies that could be run through platforms are examined from a descriptive point of view by means of case analysis. However, the possible influence on a particular industry of a coopetition strategy organised on the basis of an internet platform has not been examined in sufficient depth and can be classified as a research gap. Filling this gap could be valuable both for academic knowledge and for the practical use of coopetition strategies in the modern economy.

2. Two-sided platforms

There is currently a significant growth in the popularity of platforms that launch and maintain interactions between two or more parties (sides), such as Airbnb, Amazon and Uber (Caillaud and Jullien, 2003; Rochet and Tirole, 2003; Armstrong, 2006). In the current research, the theory of internet platforms and the concept of a multi-sided market are used mainly to describe a tool (a two-sided platform) that could serve as a basis for lead-generating coopetition. These platforms create value and earn income from intermediation between different groups of users, satisfying their needs (Osterwalder, Pigneur and Smith, 2010). Often the sides in the focus of multi-sided platforms are a business audience that provides the market with services or goods, and customers, who can be described as end users. The first group of users can also be called advertisers (Rochet and Tirole, 2003). Most studies acknowledge that a focus on more than one side is a relevant characteristic of modern industries, to an extent that depends on the industry. “Multisidedness” has become a new strategic tool, widely used by many organizations that demonstrate significant results. Two-sided markets work with intragroup and intergroup network effects; the latter are also called cross-group effects. By one definition, cross-group network effects occur when the benefit enjoyed by a user on one side of the platform depends upon how well the platform does at attracting users on the other side (Armstrong, 2006). On this basis, YouTube can be called a two-sided internet platform that operates with the above-mentioned cross-group effect: its revenues from advertising depend on how satisfied regular video subscribers are.
Another significant example of a multi-sided platform is Amazon, which moved from a simple retailer to the two-sided model by adding other retailers to its business process and inviting them to sell their products on its internet-based platform, called Marketplace (Ritala, Golnam and Wegmann, 2014). Even though many analysts tried to persuade Amazon that such an approach was too risky, today we can see that this move became a significant step that gave the company a chance to survive and continue its growth. Concentration on clients and on market development (not on competitors) gave Amazon a boost for further development, which allows it to develop not only its own company but the whole online industry. This suggests that platforms designed according to the principles and goals of coopetition have great potential for all participants.

One of the key questions for internet-based markets that focus on more than one side is to determine which of the sides contributes more to the demand of its complement (the other side). In other words, the question is why parties might join the internet platform. For the consumer side, the motive is any benefit or additional value offered by the internet platform.

The producer side, in turn, has motives that are mainly linked to the number of potential customers that the business classifies as its target audience. A second possible reason for service providers to become users of a platform is the usefulness of the information and data that can be collected from its audience. As an example of the second reason, there is evidence that B2B companies involved in two-sided markets usually benefit from the private data that their consumers leave on the platforms they use (Fish, 2009).
One possible outcome of such information is well-targeted advertising based on the personal information (age, gender, etc.) of users of social networks such as Facebook.com or vk.com. This information can be used to determine whether a person is a potential user of some service or not.

One more significant peculiarity of multi-sided platforms as a form of business model is that usually one of the sides is not charged for the value that it gets from the platform. Often the end users (customers) are not charged for platform usage and get some services of the platform for free, while business participants that intend to sell their product or to obtain valuable data act as subsidizers, paying to reach their target audience. This means that platforms need to find and demonstrate a good reason for end consumers to join the platform for free, so that significant value can be created for suppliers of services and goods (Mahadevan, 2000).

These peculiarities of value creation for two different groups of users push most internet platforms towards a business model that consists of a set of steps. Movement from one step to another demonstrates an evolution of the business model that seems to be typical of many successful internet ventures (Muzellec, Ronteau and Lambkin, 2015). At the early stages internet platforms concentrate on the value proposition towards end consumers, persuading them to join the platform. At this stage platforms usually ignore any other sides. This continues until the number of users of the platform reaches a critical mass that can be interesting for B2B clients of the platform. At the second stage of development the platform moves its focus to businesses that are interested in the end customers already attracted to the platform. At this stage the platform starts to earn its first revenues.
After the venture reaches its first financial goals, it moves to the third stage, which can be characterized as a reconsideration of all its services in order to increase the value for both sides of its users. The authors call this business model a B2BandC oriented model (Muzellec, Ronteau and Lambkin, 2015).

Researchers also focus mainly on the effects of coopetition at the scale of one company. As a result, there is now a deep understanding of what individual companies can achieve from coopetition. However, although coopetition is emerging as a strategy, it is still not a common practice.

As a result, there are few opportunities to explore the effects that coopetition can have on a whole market or industry.

3. Questions to answer

To design a concept of internet platform-based coopetition among organisations, built on an upstream activity aimed at the generation of leads, we have to answer the following questions:

• What is the possible impact of lead-generating coopetition on companies with different price and quality strategies?
• How does the number of participants in the coopetition process influence the effectiveness of lead-generating coopetition?
• How does the number of participants in the coopetition process influence the average utility that clients get?

4. Agent-based model simulation

To answer these questions, it is necessary to evaluate the possible outcomes of the functioning of a complicated system. Such outcomes are hard to evaluate and predict with simple mathematical calculations. It is also important to note that the possible outcomes of such a system depend on the various decisions of different market participants (competitors, clients). These conditions are reasonable grounds for using the simulation of an agent-based model to test the effectiveness of the suggested concept of coopetitive interaction.

Simulation is used mainly in research where the complexity of the examined system becomes so high that basic calculations are not enough to obtain significant results. In academic research, simulation is described as a problem-solving method (Banks, 2000). The main idea of simulation is to build a model that describes real processes to some extent (Law and Kelton, 2000). One possible application of simulation is the prediction of the results of processes under different values of variables.

To run the simulation, a model is required. The current research uses agent-based modelling (ABM). The main component of ABM is the “agent”. The whole simulation in AB modelling is based on the functions and parameters of agents, which define what they are, what they do and how they behave (Wooldridge and Jennings, 1995). In ABM, agents get a set of rules that define their:

• Boundaries: their limitations, interconnections with other agents, etc.;
• Behaviour and decision-making capabilities: how agents make their choices under various circumstances.

AB models describe the interactions of various agents that are placed in different situations and receive programmed inputs concerning the state of the environment and of other agents. When agents get these inputs, they respond according to some logic. The actions of agents in ABM can be reactive or proactive, depending on their objectives, the environment and the rules of the model (Wooldridge and Jennings, 1995).

In other words, AB modelling deals with the behaviour and interactions, over time, of various agents with different objectives and parameters in an environment defined by a set of rules and principles. It is important to note that agents can act on their own according to their personal goals, or share common goals and act in an organisational context (Jennings, 2001).

There is a strong view that AB modelling is best suited to situations that run without central coordination of the behaviour of agents, or with only a small influence of it. In other words, agent-based models are used to simulate bottom-up problems and cases in which the behaviour and decisions of individual agents can cause global effects and trends (Macy and Willer, 2002).

In the current research, a number of assumptions and limitations make it possible to build a simulation that can serve as a basis for conclusions and further analysis:

1. The AB model built in the current research assumes that there is only one product on one market, with no other goods that could affect the choice of customers;
2. There is only one advertising tool used on the market: pay-per-click advertising. Other advertising and marketing instruments have no effect on the number of leads that an organisation gets;
3. Each client makes a choice based on the principle of utility maximisation;
4. Each client makes a purchase only once within one simulation.
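Assumptions 3 and 4 can be illustrated with a minimal Python sketch. The utility function (quality minus price weighted by a client-specific price sensitivity), the `Company` fields and all numbers are hypothetical placeholders; the source does not specify the functional form used in the model.

```python
from dataclasses import dataclass

@dataclass
class Company:
    name: str
    price: float    # hypothetical price of the single product
    quality: float  # hypothetical quality score

def utility(company, price_sensitivity):
    # Assumed form, for illustration only: quality is valued, price is penalised.
    return company.quality - price_sensitivity * company.price

def choose(companies, price_sensitivity):
    """Assumption 3: the client picks the offer with maximal utility."""
    return max(companies, key=lambda c: utility(c, price_sensitivity))

def simulate(sensitivities, companies):
    """Assumption 4: every client buys exactly once per simulation run."""
    sales = {c.name: 0 for c in companies}
    for s in sensitivities:
        sales[choose(companies, s).name] += 1
    return sales

offers = [Company("A", price=10.0, quality=14.0),
          Company("B", price=6.0, quality=9.0)]
sales = simulate([0.5, 1.0, 2.0], offers)
```

With these made-up numbers the two less price-sensitive clients choose the higher-quality offer and the most price-sensitive client chooses the cheaper one, which is the kind of price and quality trade-off the research questions refer to.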

5. Data collection

Once the model is described and built, its parameters have to be set. It was decided to use parameters from the real world, taken from an industry that could potentially apply lead-generating internet platform-based coopetition. The Russian web-design market was chosen because data describing this industry are readily available.

The following data were taken from the web-design market research conducted by the Russian analytical portal CMS magazine:

- Number of companies that currently operate on the Russian web-design market;
- Average turnover of web-design studios in different regions of Russia;
- Segmentation of companies by price;
- Identification of the instruments that web-design studios use as lead-generating tools.

The data were collected by two methods (CMS magazine, 2012):

- A questionnaire answered by 450 executives of Russian web-design studios (see Appendix 5);
- Data collected from 1234 organisations, based on the profiles of companies registered on the web portal “Runet Rating” (http://www.ratingruneta.ru).

The Yandex Direct budget planning tool provided information on the parameters of pay-per-click advertising and on the market potential (Yandex, April 2016):

- Cost-per-click rates;
- CTR rates;
- Number of potential clients.

Yandex is a Russian search engine that provides PPC advertising services for organisations looking for clients on the Russian market.

Statistics on the conversion rates (CVRs) of the websites of organisations from different spheres of business were taken from a survey of 1,000 landing pages made by the online advertising company “WordStream”. The survey analysed the probability, and its distribution across these landing pages, that a visitor leaves a request for the services offered on a particular web page. These statistics were then broken down by industry (Kim, 2014).

To determine what percentage of total revenue organisations invest in advertising, statistics from The CMO Survey, an annual study of marketing trends, were used. The survey was sent by e-mail, with follow-up reminders, to 3120 organisations operating in different spheres of business. The response rate was 9.3% (289 respondents). The research was held from January to February 2016 (The CMO Survey, 2016).

The data taken from these sources were used to define the bounds of the key parameters that describe the environment and the behaviour and characteristics of agents in the current research.
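The way these parameters combine can be sketched with the usual pay-per-click funnel arithmetic: the advertising budget is a share of revenue, clicks are the budget divided by the cost per click, and leads are clicks multiplied by the conversion rate. The Python sketch below is illustrative only; the function name and all numeric values are made up and are not taken from the sources cited above.

```python
def ppc_funnel(revenue, ad_share, cpc, ctr, cvr):
    """Translate an advertising budget into impressions, clicks and leads.

    revenue  - total revenue of the company (hypothetical value)
    ad_share - share of revenue invested in advertising
    cpc      - cost per click
    ctr      - click-through rate (clicks / impressions)
    cvr      - conversion rate (leads / clicks)
    """
    budget = revenue * ad_share
    clicks = budget / cpc
    impressions = clicks / ctr
    leads = clicks * cvr
    return {"budget": budget, "clicks": clicks,
            "impressions": impressions, "leads": leads}

# Made-up inputs, for illustration only
funnel = ppc_funnel(revenue=1_000_000, ad_share=0.10, cpc=50.0, ctr=0.05, cvr=0.02)
```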

6. Experimental design

The current research is based on an experimental design that tests the model with different parameters. Tests with various parameters provide outputs that are used to detect trends, impacts and phenomena, which in turn serve as a basis for hypothesis testing. The simulation of lead-generating platform-based coopetition evaluates the following outputs:

• ROAS: return on advertising spend of a company (or coalition);
• Profit: difference between the total income gained in one simulation and the money spent on advertising.

The simulation of the agent-based (AB) model in the current research is made in AnyLogic 7.3.1 Personal Learning Edition. It is a program based on the Java programming language that supports agent-based, discrete event, and system dynamics modelling approaches. The main reason for using AnyLogic is its availability: the version used by the author is free of charge. AnyLogic also provides its users with a graphical interface, which simplifies the process of modelling and simulation. Due to the peculiarities of this version of the software, only two distributions are used to describe the parameters: the uniform and the triangular distribution.

7. Description of lead generating internet platform-based coopetition

The concept of a lead generating internet platform-based coopetition (LGIPBC) is based on the idea of co-invested advertising campaigns for a product. Companies that distribute the same product gather into a coalition on the base of an internet platform (Operator). The Operator provides the coalition with a web-page and runs an advertising campaign on the advertising budget of the coalition. The advertising campaign generates traffic of potential clients to the web-page of the coalition. The generated traffic converts into requests for the product distributed by members of the coalition (leads). Each lead generated by the co-invested advertising campaign of the coalition is passed to all members of this coalition, and once the members get the lead, they start competing for it with their sales strategies. The described concept thus includes competition and cooperation at different stages of the interaction process. That means that it can be classified as a concept of coopetition among companies (Brandenburger and Nalebuff, 1996).

The Operator charges members of a coalition for its organization and coordination services and for running the advertising campaign on the budget of the formed coalition. The Operator offers companies that produce the same product to join one of the coalitions. Coalitions are based on groups of companies allocated by the Operator on the market of one particular product. The allocation of groups is based on the characteristics of the product distributed by companies on the market. The following characteristics could be used as a base for the group allocation process:

- number of functions;
- quality of design;
- price.

The Operator also provides participants with a forecast of the possible average price of one lead that participants can get. The possible average price of one lead is inversely related to the number of companies that enter a coalition. Each organization decides whether it is ready to join one of the announced coalitions or rejects the offer made by the Operator. If an organization accepts the offer, then it needs to decide which group's coalition it joins (basing on its own perception of its product and its strategy). The main benefit that members of each particular coalition get is a decrease of the average price of one lead. This is achieved by the following mechanism:

1 Each company that wants to join a coalition pays the entrance fee of this coalition. The entrance fee is set by the Operator;
2 The total sum of the entrance fees paid by members of the coalition is used by the Operator as an advertising budget;
3 The Operator distributes the advertising budget of a particular coalition among the advertising instruments that attract traffic of potential clients to the web-page of the coalition;
4 That traffic of potential clients converts into leads;
5 The Operator provides all members of the coalition with full access to all leads generated by the web-page of this coalition.
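The five steps above can be sketched as simple arithmetic. The sketch below is only an illustration under strong assumptions that are not part of the described mechanism: a constant cost per click, a constant conversion rate, and no competition effects. The function name and the input values are hypothetical.

```python
def coalition_lead_cost(entrance_fee, members, cost_per_click, conversion_rate):
    """Return (leads, cost of one lead for one member) for a pooled campaign.

    Assumes a constant cost per click and a constant conversion rate,
    which is a simplification made only for this illustration.
    """
    budget = entrance_fee * members       # step 2: pooled entrance fees
    clicks = budget / cost_per_click      # step 3: traffic bought with the budget
    leads = clicks * conversion_rate      # step 4: visitors who leave a request
    # step 5: every member gets access to every lead, so a member's own
    # cost of one lead is its entrance fee divided by all coalition leads
    return leads, entrance_fee / leads


# Hypothetical inputs: fee 85,000, cost per click 144, conversion rate 2%
solo_leads, solo_cost = coalition_lead_cost(85_000, 1, 144, 0.02)
pool_leads, pool_cost = coalition_lead_cost(85_000, 5, 144, 0.02)
print(round(solo_cost), round(pool_cost))
```

Under these assumptions a company that advertises alone and a member of a five-company coalition pay the same fee, but the coalition member gets access to five times as many leads, so its cost of one lead is five times lower. Every lead is shared with the other members, which is where the competition stage begins.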

As a result, each member of the coalition gets leads that were generated on the advertising budget of the coalition. The web-page of the coalition generates more leads, at a lower price of one lead for one member of the coalition, compared with a solo advertising campaign led by one company for its own brand.

When participants of the coalition start getting leads, the competition part of the LGIPBC begins. At this point everything depends on the specific features of each participant's individual marketing policy, sales system, quality of the product, etc. After all leads are given to all members of the coalition, the Operator stops the LGIPBC session and suggests that members join the next one. There are three main stages of LGIPBC:

- coalition partition stage;
- co-invested lead generation (cooperating activities);
- competition for customers.

As mentioned before, the Operator is an internet platform. The first group of users of this internet platform consists of companies that distribute some product. The second group of users (second side) is represented by individuals and organisations that could be potential customers of the first group of users of the internet platform. That means that this platform can be classified as a two-sided internet platform (Armstrong, 2006). Since the Operator is a two-sided internet platform, there are grounds for a discussion of the functions and services that could be provided to the second group of users (potential clients of the first group). However, this issue is not discussed in the current research because, from the standpoint of the author, it does not relate directly to coopetition.

8. Coalitional partition stage

Coalitional partition is held among all companies that produce the same product (Companies) with different levels of the characteristics that describe it. $N = \{1, \ldots, i, \ldots, n\}$ is the set of Companies, $n > 0$ is the number of Companies, and $i \in N$ is the current Company.

Each Company $i$ produces a product that can be described in some way. The Operator announces the characteristics of this product (Characteristics). $R = \{R_1, \ldots, R_k, \ldots, R_r\}$ is the set of Characteristics, $r$ is the number of Characteristics, and $R_k \in R$ is a particular Characteristic.

After the set of Characteristics is announced, the Operator defines the maximum and minimum levels of each Characteristic on the market of the product produced by the Companies (Market), basing on research of this Market:

$M = \{\underline{LR}_1 : \overline{LR}_1, \ldots, \underline{LR}_k : \overline{LR}_k, \ldots, \underline{LR}_r : \overline{LR}_r\}$ is the Market, where $LR_k$ is the level of a particular Characteristic, $\underline{LR}_k$ is the minimum level of a particular Characteristic on the Market, and $\overline{LR}_k$ is the maximum level of a particular Characteristic on the Market.

After the Market is described, the Operator starts to distinguish particular groups of Companies on the Market. This is done in the following way:

1 The Operator divides the Market with the help of clustering. As a result it distinguishes a set of groups: $G = \{G_1, \ldots, G_j, \ldots, G_g\}$ is the set of Groups, $g$ is the number of Groups, and $G_j$ is a particular Group;
2 The Operator defines the border Levels of each Characteristic $k$ for each particular group $j$: $\underline{LR}_k^j$ is the minimum level of a particular Characteristic $k$ in a particular group, and $\overline{LR}_k^j$ is the maximum level of a particular Characteristic $k$ in a particular group;
3 As a result, each particular group $j$ out of the set of Groups can be described in the following way: $G_j = \{\underline{LR}_1^j : \overline{LR}_1^j, \ldots, \underline{LR}_k^j : \overline{LR}_k^j, \ldots, \underline{LR}_r^j : \overline{LR}_r^j\}$.

Each Company $i$ on the Market can refer itself to one of the groups. It makes its choice basing on its own perception of the Levels of Characteristics of its own product. $LR_k(i)$ is the perceived level of a particular Characteristic $k$ by the current Company $i$. As a result, each Company can make its own Characteristic profile of its product (Profile): $CP_i = \{LR_1(i), \ldots, LR_k(i), \ldots, LR_r(i)\}$ is the profile made by the current Company $i$.

The Operator announces that on the base of each group $j$ only one coalition $S_j$ can be formed. To enter a particular coalition $S_j$, a Company has to pay an entrance fee. The Operator defines the amount of the entrance fee for each particular group $j$, $AS_j > 0$, basing on the analysis of the Market.

After the groups are defined, the Operator offers each participant to decide to which group it refers itself. Each Company $i$ makes its choice basing on its own perception of the characteristics of its product.

Finally, the Operator announces the expected level of average lead price reduction $PR_j$ from the perspective of the individual investment $AS_j$ of one particular member of coalition $S_j$, for each coalition formed on the base of a particular group $j$, at different levels of the coalition advertising budget:

$$PR_j(X_{S_j}) = \frac{X_{S_j} - AS_j}{M(X_{S_j})}, \qquad (1)$$

where $X_{S_j} > 0$ is the advertising budget of a particular coalition $S_j$,

$$X_{S_j} = AS_j \cdot d_j, \qquad (2)$$

and $d_j > 0$ is the number of members of a particular coalition $S_j$.

The function $M(X_{S_j}) > 0$ describes the relationship between the amount invested in an advertising campaign and the number of leads that come from this advertising campaign. This function can be derived in many ways, one of which is regression analysis. It depends on:

- target audience of a coalition;
- advertising instruments used by the coalition;
- season when the advertising campaign is held.

Each additional participant that joins coalition $j$ decreases $PR_j$. That means that, if there were no increase in competition connected with the growth of the number of coalition members, it would be a wise strategy for Companies to form the maximum coalition, which could maximise the reduction of the price of one lead for its members. The Operator uses $PR_j$ as an additional motivation for Companies to enter one of the coalitions.

Basing on research on trust building among companies, there are some grounds to suggest that organisations decide whether to trust or not mainly on the basis of calculations (Faulkner, 2000; Lewicki and Bunker, 1996). The level of average lead price reduction from the perspective of the individual investment of one particular member of a coalition, $PR_j$, is the instrument aimed at satisfying the calculation-based trust-building criteria.

After all important information is announced, Companies decide whether they want to join one of the coalitions formed on the base of groups. If no Companies join some particular coalition, then this coalition is not formed.
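Expressions (1) and (2) can be evaluated once a form of $M$ is chosen. The sketch below assumes a purely hypothetical square-root response, $M(X) = 0.5\sqrt{X}$; this form, the function names and the input values are illustrative assumptions and are not taken from the research, where $M$ is left to be estimated (for example, by regression analysis).

```python
import math


def leads(budget, scale=0.5):
    """Hypothetical M(X): the number of leads grows with the square root of the budget."""
    return scale * math.sqrt(budget)


def price_reduction(entrance_fee, members, lead_fn=leads):
    """PR_j from (1), with the coalition budget X_Sj = AS_j * d_j from (2)."""
    budget = entrance_fee * members
    return (budget - entrance_fee) / lead_fn(budget)


# Hypothetical coalition: entrance fee 100, four members
print(price_reduction(100, 4))
```

How the value changes with the number of members depends entirely on the assumed shape of `leads`, so a different estimate of $M$ can be passed as `lead_fn`.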

9. Possible strategies of companies

It is important to understand that each Company $i$ has the right to join a coalition based on a group whose Characteristic levels do not match the characteristics of this participant. However, such a strategy can reduce the number of leads converted into orders by this particular Company, because the Levels of Characteristics of its services may not meet the expectations of the potential customers gathered by the coalition that the Company joined.

From the perspective of the whole industry, LGIPBC implies a set of possible strategies that could be chosen by Companies. First, each Company should decide whether it wants to join a coalition or not. That means that the Company has two options:

- to join a coalition (Join);
- not to join a coalition (Avoid).

If Company $i$ chooses to join one of the coalitions, then it has to decide whether it joins a group with a product whose characteristic levels are similar to those of its own product (basing on its own perception), or joins another group. As a result we get the following options:

- to join a group of equals (peer group);
- to join a group with higher characteristic levels (higher group);
- to join a group with lower characteristic levels (lower group).

Finally, when a Company decides to join a coalition and chooses which exact coalition to join, it should decide whether it invests its advertising money only into promotion of the web-page of its coalition, or whether part of its budget goes to advertising of its own web-site. This choice can be described by two options:

- to invest only into promotion of the coalitional web-page (all in coalition move);
- to distribute the advertising budget among its own web-site and the coalitional web-page (distribution move).

As a result we get the following tree of seven possible strategies (see Fig. 1).

Fig. 1. Possible LGIPBC strategies for Companies.
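The count of seven strategies follows from the three decisions described above: one Avoid strategy plus every combination of three group choices and two budget moves. A minimal sketch that enumerates them (the variable names are illustrative):

```python
from itertools import product

GROUPS = ("peer group", "higher group", "lower group")
MOVES = ("all in coalition move", "distribution move")

# One strategy for staying out, plus one per (group, budget move) pair
strategies = [("Avoid",)] + [("Join", g, m) for g, m in product(GROUPS, MOVES)]

for strategy in strategies:
    print(" / ".join(strategy))
```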

Depending on the LGIPBC strategy that a Company chooses, it can potentially get different results. All these strategies are examined in the simulation described below.

10. Profit and ROAS – individual and coalitional

After a coalition is formed, the Operator starts an advertising campaign with a budget $X_{S_j}$, gathered from all entrance fees paid by members of the coalition $S_j$. Each coalition gets its web-page, which is located on the platform. This page lets a potential customer see which companies entered each particular coalition and decide whether to send a request for services on the platform (for this coalition) or not. When a potential client leaves a request for services, each member of the coalition gets this request. At this moment, members of the coalition start competing for this particular lead, to convert it into a contract. This is the moment when the LGIPBC becomes competitive.

When the advertising budget of a particular coalition is spent and the flow of leads stops, the effectiveness of the LGIPBC session is evaluated for each coalition and its participants. In the current research the effectiveness of each LGIPBC session is evaluated through two values: Profit and ROAS.

Evaluating the profit $V(S_j)$ of a coalition $S_j$, we take into account the total sum of investments spent on the advertising campaign and the total income from all sales made by all members of the coalition while the advertising campaign of this coalition was active:

$$V(S_j) = I_{S_j} - X_{S_j}, \qquad (3)$$

where $V(S_j)$ is the profit of a particular coalition $S_j$, $X_{S_j} > 0$ is the advertising budget of a particular coalition $S_j$, and $I_{S_j} \geq 0$ is the total income that coalition $S_j$ managed to get by the end of the LGIPBC session:

$$I_{S_j} = \sum_{i \in S_j} I_i^j, \qquad (4)$$

where $I_i^j \geq 0$ is the individual income that member $i$ of a particular coalition $S_j$ managed to get by the end of the LGIPBC session.

It can be concluded that each member $i$ of a coalition $S_j$ can evaluate only its own personal profit $V_i(j)$:

$$V_i(j) = I_i^j - AS_j. \qquad (5)$$

On the same basis, the return on advertising spend (ROAS) of each member of a coalition $S_j$ can be calculated:

$$ROAS_i(j) = I_i^j / AS_j, \qquad (6)$$

where $ROAS_i(j)$ is the return on advertising spend of the current member of a particular coalition $S_j$.

Finally, to evaluate the effectiveness of the money spent on the advertising campaign of a particular coalition $S_j$, the ROAS of each particular coalition should be calculated:

$$ROAS_{S_j} = I_{S_j} / X_{S_j}. \qquad (7)$$

The profit of each member cannot be announced or predicted before an LGIPBC session is finished. These values depend on a number of factors, including:

- quality perception of clients;
- current market trends;
- economic situation in a country.

In this research there is an attempt to simulate clients' behaviour in order to predict possible profits and to evaluate potentially successful strategies that could maximise the profits of a coalition and of each of its participants.
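The evaluation in (3)–(7) can be sketched as follows. The function name, the dictionary keys and the sample incomes are hypothetical; the sketch assumes that every member pays the same entrance fee, so that the coalition budget follows from (2).

```python
def coalition_results(incomes, entrance_fee):
    """Evaluate one LGIPBC session.

    incomes: individual incomes I_i^j of the members of one coalition.
    entrance_fee: AS_j, assumed to be the same for every member.
    """
    budget = entrance_fee * len(incomes)                        # (2)
    total_income = sum(incomes)                                 # (4)
    return {
        "coalition_profit": total_income - budget,              # (3)
        "coalition_roas": total_income / budget,                # (7)
        "member_profit": [i - entrance_fee for i in incomes],   # (5)
        "member_roas": [i / entrance_fee for i in incomes],     # (6)
    }


# Hypothetical session: three members, entrance fee 40,000 each
result = coalition_results([120_000, 50_000, 0], entrance_fee=40_000)
print(result["coalition_profit"], result["member_profit"])
```

In this made-up session the coalition as a whole is profitable, while the member that won no contracts loses its whole entrance fee, which illustrates why individual and coalitional values are evaluated separately.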

11. Model mechanics description

To estimate the potential effectiveness of LGIPBC, a simulation of an agent-based model was used. This part describes the model used to run the simulation, its environment, and the behaviour and parameters of its agents:

1 The model simulates a market of companies that distribute only one product (Companies), with one possible coalition on this market, $g = 1$ ($S_1$ = Coalition);
2 There is one company ($i = 1$) all parameters of which are manually settable values (the Observed Company);
3 The number of Companies that operate on the market, $n \geq 0$, is a manually settable value; $N = \{1, \ldots, i, \ldots, n\}$ is the set of Companies, $i \in N$ is the current Company;
4 The number of clients on the market, $nl \geq 0$, is a manually settable value; $NL = \{1, \ldots, l, \ldots, nl\}$ is the set of clients, $l \in NL$ is the current client;
5 The number of companies that gather into the Coalition, $d_1 > 0$, is a manually settable value;
6 The value of the coalition entrance fee, $AS_1 > 0$, is a manually settable value;
7 The total advertising budget of the Coalition, $X_{S_1}$, is calculated according to (2);
8 Each Company (Coalition) chooses its own advertising budget $AB_i \geq 0$ for each period of time. In the simulation this budget is assigned on the basis of a uniform distribution and falls into a range with settable borders, where $\overline{AB}$ is the maximum advertising budget and $\underline{AB}$ is the minimum one for the Market;
9 Each member of the Coalition has an advertising budget $AB_i \geq AS_1$. If $AB_i = AS_1$, then this member of the Coalition invests only into the co-invested advertising campaign and does not invest into an advertising campaign of its own web-page. If $AB_i > AS_1$, then this member invests money into the advertising campaign of the web-page of the Coalition and also into an advertising campaign of its own web-page;
10 Each Company $i$ gets its quality level $q_i$, an integer value that is randomly assigned on the basis of a uniform distribution out of $Q = \{\underline{q} : \overline{q}\}$, the set of quality levels, $q_i \in Q$;
11 Each quality level $q$ gets its middle price of a quality level, $MPQL(q)$;
12 When Company $i$ gets a particular level of quality, it also gets its price $p_i$, which is randomly assigned on the basis of a uniform distribution and falls into the range

$$p_i \in [MPQL(q) - \varepsilon \cdot MPQL(q);\; MPQL(q) + \omega \cdot MPQL(q)], \qquad (8)$$

where $\varepsilon \in [0; \gamma]$ and $\omega \in [0; \gamma]$ are randomly assigned on the basis of a uniform distribution, and $\gamma \geq 0$ is a manually settable value.

The maximum and minimum possible prices on the Market can be calculated. The minimum possible price on the Market is $\underline{p} = MPQL(q) - \gamma \cdot MPQL(q)$, while the maximum possible price on the Market is $\overline{p} = MPQL(q) + \gamma \cdot MPQL(q)$;
13 Each Company has its own web-page;
14 The Coalition has its own web-page;
15 Each Company (Coalition) uses pay-per-click (PPC) advertising as an advertising instrument: advertisers pay a pay-per-click cost ($PPCC \geq 0$) each time their advertisement is clicked;
16 PPC advertising is the only way of promotion on the market;
17 When a potential client gets to the web-page that belongs to a particular Company (Coalition), this means that the potential client has clicked on the advertisement of this Company (Coalition), and the advertising budget of this Company (Coalition) is reduced by its PPCC;
18 There are four PPCC rates, which are manually settable values;
19 In the simulation a PPCC is assigned to each Company on the basis of a uniform distribution over the set of possible options. That simulates the choice that each Company makes concerning the PPCC rate it uses;
20 The PPCC of the Coalition is a manually settable value;
21 A particular PPCC defines the probability that a potential client will click on the advertisement of a Company that was assigned this PPCC. That probability is called the click-through rate ($CTR > 0$);
22 Each Company starts its advertising campaign at a random moment of time within manually settable borders;
23 The Coalition and the Observed Company start their advertising campaigns at the beginning of the simulation;
24 The conversion rate ($CVR \geq 0$) defines the probability that a particular client who has entered the web-page of a particular Company (Coalition) makes a request for its services. Each Company gets its $CVR_i$ out of the $CVR$ range according to a triangular distribution, where $\underline{CVR}$ is the minimum possible $CVR$ (manually settable value), $\overline{CVR}$ is the maximum possible $CVR$ (manually settable value), and $CVR_m$ is the most probable one (manually settable value);
25 $CVR_{S_1}$ of the web-page of the Coalition is a manually settable value;
26 When a particular client leaves a request on the web-page of a particular company, this company gets the status of “Potential contractor” of this client;
27 If a particular client leaves a request on the web-page of the Coalition, all members of the Coalition get the status of “Potential contractor” of this client;
28 Each client $l$ has his desired number of requests $NO_l > 0$, which he leaves on web-pages. $NO_l$ is randomly assigned to each client on the basis of a uniform distribution and falls into a range with manually settable borders;
29 If a client leaves a request on the web-page of a Company (Coalition) but has not reached his desired number of requests, he continues to visit web-sites of other Companies (but never returns to a web-page on which he has left a request);
30 If a client leaves a request on the web-page of a Company (Coalition) and reaches his desired number of requests, he stops visiting other web-pages;
31 After a client stops visiting web-pages, he has to make a choice and pick one Contractor out of his set of Potential Contractors;
32 Potential client behaviour description:

(a) Each potential client gets his own subjective level of quality of each Potential Contractor, $q_l(i) \geq 0$:

$$q_l(i) \in \begin{cases} [q_i - q_i \cdot \alpha;\; q_i + q_i \cdot \beta], & (q_i - q_i \cdot \alpha) > 0, \\ [0;\; q_i + q_i \cdot \beta], & (q_i - q_i \cdot \alpha) \leq 0, \end{cases} \qquad (9)$$

where $\alpha \in [0; \tau]$ and $\beta \in [0; \tau]$ are randomly assigned on the basis of a uniform distribution, and $\tau$ is a manually settable value;
(b) Every client $l$ has his quality perception level $\theta_l$, which falls into the quality perception level range of the Market: $\theta_l \in [\underline{\theta}; \overline{\theta}]$, where $\underline{\theta} = p/q$ and $\overline{\theta} = p/s$;
(c) Every client tries to maximise the subjective utility $U_l$ that he gets from a particular company for its price:

$$U_l(p_i, \theta_l, q_l(i)) = \begin{cases} \theta_l \cdot q_l(i) - p_i, & \theta_l \cdot q_l(i) > p_i, \\ 0, & \theta_l \cdot q_l(i) \leq p_i. \end{cases} \qquad (10)$$

As a result, if a potential client chooses between, for example, 5 organisations (potential contractors), he always chooses the company that provides him with the maximum subjective utility;
33 To simulate different market environments and various individual strategies, the current model includes a set of manually settable scenarios:
(a) There is a coalition on the market. The advertising budget of each organisation that entered the coalition can be higher than the coalitional entrance fee (companies invest into the coalitional web-page and into their own web-sites): $AB_i \geq AS_1$;
(b) The Observed Company enters the coalition; however, its advertising budget is equal to the entrance fee of the coalition: $AB_1 = AS_1$;
34 The quality level of the Observed Company, which defines its personal quality move, is manually settable:
(a) If the Observed Company gets manually set $q_1 = 2$, then the Observed Company has chosen the “higher group move”;
(b) If the Observed Company gets manually set $q_1 = 3$, then the Observed Company has chosen the “peer group move”;
(c) If the Observed Company gets manually set $q_1 = 4$, then the Observed Company has chosen the “lower group move”;
35 To evaluate the effectiveness of different strategies, the profit and ROAS of a Company (Coalition) need to be calculated:
(a) The ROAS of Company 1 is calculated in the following way: $ROAS_1 = I_1 / AB_1$, where $ROAS_1$ is the return on advertising spend of Company 1 and $I_1 \geq 0$ is the income of Company 1;
(b) The ROAS of the Coalition $S_1$ is calculated in the following way: $ROAS_{S_1} = I_{S_1} / X_{S_1}$, where $ROAS_{S_1}$ is the return on advertising spend of the Coalition and $I_{S_1} \geq 0$ is the income of the Coalition;
(c) The profit of Company 1 is calculated in the following way: $V_1 = I_1 - AB_1$;
(d) The profit of the Coalition $S_1$ is calculated in the following way: $V_{S_1} = I_{S_1} - AS_1$.
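The client's choice in item 32 can be sketched in a few lines. This is not the AnyLogic model itself: the function names, the tuple layout and the sample values are hypothetical, and returning no contractor when every utility is zero is an assumption, since the description does not say what the client does in that case.

```python
import random


def perceived_quality(q, tau, rng):
    """Draw q_l(i) as in (9): uniform around q, with the lower bound cut at zero."""
    alpha = rng.uniform(0, tau)
    beta = rng.uniform(0, tau)
    low = max(q - q * alpha, 0.0)
    return rng.uniform(low, q + q * beta)


def utility(price, theta, q_perceived):
    """U_l from (10): the surplus theta * q - p if it is positive, otherwise zero."""
    surplus = theta * q_perceived - price
    return surplus if surplus > 0 else 0.0


def choose_contractor(contractors, theta, tau, rng):
    """Pick the potential contractor with the maximum subjective utility.

    contractors: list of (name, quality level, price) tuples.
    Returns None if no contractor gives a positive utility (an assumption).
    """
    best_name, best_utility = None, 0.0
    for name, quality, price in contractors:
        u = utility(price, theta, perceived_quality(quality, tau, rng))
        if u > best_utility:
            best_name, best_utility = name, u
    return best_name


# Hypothetical client with theta = 40,000 and two potential contractors;
# tau = 0 switches the perception noise off, so the outcome is deterministic
offers = [("A", 3, 100_000), ("B", 4, 150_000)]
print(choose_contractor(offers, theta=40_000, tau=0.0, rng=random.Random(0)))
```

With the noise switched off, contractor A gives a surplus of 20,000 against 10,000 for B, so the cheaper company wins although its quality level is lower. With a positive `tau` the same client may choose differently from run to run.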

12. Parameters for the simulation

To run the simulation of the LGIPBC model, it was decided to use data from a particular market. This brings the results of the simulation closer to reality and eases the interpretation and analysis of results. It was decided to use the web-design market as a base for the LGIPBC model, basing on the following criteria:

1 Design of new web-sites has an approximate 85% share in the structure of the income of an average Russian web-design studio. That supports the statement that there is a market for the product (design of a new web-site), and that web-design studios potentially have enough motivation to attract clients through advertising activities.
2 Responses to the multiple-choice question “From which sources does your company get new clients?”, put to the CEOs of the companies, demonstrated the following tendencies:
 - from 80% to 90% of all Russian web-design studios get their clients through personal recommendations;
 - more than 60% of new clients came through the web-design studio link placed on its previous projects;
 - at least 30% of all new clients found these companies with search engines (Google, Yahoo, etc.);
 - from 16% to 21% of new clients came from PPC advertising (Yandex Direct and Google AdWords);
 - from 17% to 27% of new clients came from thematic portals and different platforms that help companies to get clients (such as Avito.ru).

At the same time, approximately 45% of all web-design studios planned to spend the larger part of their advertising budget on PPC advertising. Basing on this data, it can be concluded that PPC advertising (the only advertising activity used in the model) is used on the web-design market, and that the characteristics of this market can be used as parameters for the simulation model.

To define the range of possible advertising budgets, it was decided to apply one of the approaches that derive the advertising budget from the turnover of a company. According to this approach, a company uses some percentage of its turnover for one period of time as its advertising budget for the next period. That means that, to define the potential bounds of the advertising budget range, one needs to know the average turnover of web-design studios and the average share of this turnover that they could use as an advertising budget.

In 2011 the Russian web-design market grew significantly, by approximately 53% compared with the previous year, and reached a volume of 14.9 billion rubles. With the growth of the market, web-design studios saw a significant increase in their turnover, with an average annual turnover of 11.9 million rubles in 2011, a 34% growth compared with the previous year (see Fig. 2). The distribution of annual turnover among companies operating in different regions of the Russian Federation is as follows:

1 Central Federal District: 17 881 077 rubles
2 Northwestern Federal District: 12 645 474 rubles
3 Ural Federal District: 11 965 143 rubles
4 Siberian Federal District: 5 287 525 rubles
5 Volga Federal District: 4 540 238 rubles
6 Southern Federal District: 1 390 925 rubles
7 Far Eastern Federal District: 1 240 000 rubles

Fig. 2. Average annual turnover of a Russian web design studio (million rubles) (CMS magazine, 2012).

According to the Chief Marketing Officer survey of 2016, the average advertising budget of companies that offer services in the B2B sphere is around 8.6% of the total revenue of a company. That brings us to the conclusion that the average advertising budget of a web-design studio is approximately 85,000 rubles per month. It was decided to use this amount, as the most expected one, as the advertising budget of the observed company ($AB_1$ = 85,000). The top border of the advertising budget range ($\overline{AB}$) is set at the level of the average monthly advertising budget in the Central Federal District, 128,000 rubles.

The number of Companies ($n$) on the Market was defined basing on the segmentation of the web-design market by the price criterion. In 2012 there were approximately 2,600 web-design studios operating on the Russian market. The price range among Russian web-design studios is quite wide: prices of organisations that operate in the low-cost segment start at 5,000 rubles, while some companies produce web-sites for prices that start from 2 million rubles. In the research that describes the web-design market, most web-design companies operating on the Russian market were distributed into 7 main price categories (price of an average web-site for an organisation):
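The two budget figures above follow from the turnover data and the 8.6% share. A short check of the arithmetic (the function and variable names are illustrative; the division by 12 assumes that the annual budget is spread evenly over the months):

```python
ADVERTISING_SHARE = 0.086  # share of revenue spent on advertising (B2B services)


def monthly_ad_budget(annual_turnover, share=ADVERTISING_SHARE):
    """Monthly advertising budget as a fixed share of annual turnover."""
    return annual_turnover * share / 12


average_budget = monthly_ad_budget(11_900_000)   # average studio, 2011
central_budget = monthly_ad_budget(17_881_077)   # Central Federal District

print(round(average_budget), round(central_budget))
```

The results, about 85,283 and 128,148 rubles per month, correspond to the rounded values of 85,000 and 128,000 rubles used in the simulation.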

1 Less than 50,000 rubles (35.9%)
2 From 50,000 to 100,000 rubles (31.5%)
3 From 100,000 to 200,000 rubles (18%)
4 From 200,000 to 300,000 rubles (8.8%)
5 From 300,000 to 500,000 rubles (2.8%)
6 From 500,000 to 700,000 rubles (1.6%)
7 Above 700,000 rubles (1.6%)

Basing on this analysis, it was decided to form groups of organisations on the basis of their pricing policy. It was decided to reduce the number of groups from 7 to 3 (see Table 1).

Table 1. Grouping of companies on a price basis

| Price category | Price range | Percentage of participants | Estimated number of participants |
|---|---|---|---|
| 1 | Less than 50,000 rubles | 35.9% | 933.4 |
| 2 | From 50,000 to 200,000 rubles | 49.5% | 1287 |
| 3 | Above 200,000 rubles | 11.5% | 379.6 |

One of the main motivations to unite all companies with prices above 200,000 rubles in one group was the assumption that clients who can afford a web-site for 500,000 rubles do not use PPC instruments to look for a contractor as often as those who look for cheap or middle-priced products. That means that leaving the high-price categories as separate ones could make them unpopular among companies. The second and third price categories were united in one common group, to make it the most numerous group of companies, representing approximately half of the market. In the current simulation it was decided to use the second group as the total market ($n = 1287$), because it has clear price borders that can be used as the price borders of the model: $\underline{p}$ = 50,000 and $\overline{p}$ = 200,000.

Fig. 3. CTR (%) dependence on the average price of one click (Yandex, April 2016).

One of the forms of PPC advertising is PPC advertising based on the platform of a search engine. When people search for a word or phrase using a search engine, they get PPC advertisements in special fields of the page with the search results. According to the data collected by Yandex (a Russian search engine that provides Russian business with PPC advertising services), in April 2016 a PPC campaign built on the single search phrase “To order a web-site” would have the following characteristics (on a 30-day scale):

- average number of ad showings: 66,630;
- click-through rate (CTR): varies from 0.64% to 6.31% depending on the rate (average price of one click) that the organisation chooses for its promotion (see Fig. 3).

Basing on this data, the maximum number of potential clients that visit the web-site of one particular studio can reach 4205 visitors. This number is used to define the number of clients on the simulated Market ($nl = 4205$). The estimated budget needed to get such a number of visitors is above 1,242,000 rubles. In the current simulation the average prices of one click are used as PPCC rates (see Table 2):

Table 2. PPC advertising instrument costs and CTR (Yandex, April 2016)

| PPC advertising instrument | | | | |
|---|---|---|---|---|
| Price per one click (PPCC), rubles | 144 | 253 | 280 | 376 |
| CTR | 0.64% | 1.05% | 5.46% | 6.31% |
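The number of clients on the simulated Market can be reproduced from the number of ad showings and the CTR values in Table 2. In the sketch below the variable names are illustrative, and rounding up to a whole visitor is an assumption about how the figure of 4205 was obtained.

```python
import math

AD_SHOWINGS = 66_630  # average number of ad showings over 30 days

# PPCC rate (rubles per click) -> click-through rate, from Table 2
CTR_BY_PPCC = {144: 0.0064, 253: 0.0105, 280: 0.0546, 376: 0.0631}

# Expected number of clicks for each rate: showings multiplied by CTR
expected_clicks = {ppcc: AD_SHOWINGS * ctr for ppcc, ctr in CTR_BY_PPCC.items()}

# The largest expected number of visitors, rounded up to a whole visitor
max_clients = math.ceil(max(expected_clicks.values()))
print(max_clients)
```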

Table 3. Conversion rates of web-sites in different industries (Kim, 2014)

Finally, it is important to estimate how many visitors of web-design studios' web-sites convert into actual leads by leaving a request for web-site development services. According to “WordStream” data (see Table 3), the median conversion rate of Internet resources is around 2.23% (B2B services), which means that only approximately 2 out of 100 visitors of the web-site of a web studio convert into leads (Kim, 2014). That means that even if a company pays the minimum price per click on its ad in a PPC campaign (144 rubles), one lead costs it approximately 7,200 rubles.
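The cost of one lead in this estimate is the price of one click divided by the conversion rate. A short check with the rounded rate of 2 out of 100 used in the text and, for comparison, with the unrounded median of 2.23% (the function name is illustrative):

```python
def cost_per_lead(cost_per_click, conversion_rate):
    """Average advertising cost of one lead: clicks are paid for, only some convert."""
    return cost_per_click / conversion_rate


rounded_estimate = cost_per_lead(144, 0.02)     # 2 visitors out of 100 convert
median_estimate = cost_per_lead(144, 0.0223)    # median CVR for B2B services

print(round(rounded_estimate), round(median_estimate))
```

The rounded rate reproduces the 7,200 rubles quoted above; with the unrounded median the same click price gives about 6,457 rubles per lead.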

13. The simulation results and analysis
More than 300 simulation rounds were run in the current research. Some conclusions and suggestions can be made based on the data received from these rounds. The values of all simulation parameters were taken from the analysis of the processes and trends that take place in the web-design industry.
To answer the second sub-question of the current research (What is the possible impact of a lead generating coopetition on companies with different price and quality strategies?), the author runs a series of tests with the observed company. The aim of these tests is to detect the best scenario (from the perspective of profit and effectiveness) for different combinations of price and quality of the services provided by the observed company. Effectiveness is evaluated through ROAS. As a result, profiles were created that demonstrate the levels of profit and ROAS under different scenarios (see Table 4). The main aim of these profiles is to help define the best scenarios from the perspectives of ROAS and profit.

Table 4. ROAS and profit profile of observed company with high quality and low price

Price of services of the observed company: 50,000

Scenario    1          2         3         4         5         6
ROAS        1.412429   61.904    0.58851   9.18338   10.0047   28.5714
Profit      35040      127900    -34960    695064    744040    579000

The strategy(s) with the highest profit: 2
The strategy(s) with the highest ROAS:   2
The strategy(s) with the lowest profit:  3
The strategy(s) with the lowest ROAS:    3

When the profit of the observed company is used as the effectiveness criterion, the simulation outcomes demonstrate that in most cases companies benefit from Scenario 4 and Scenario 2 (see Fig. 4). The only category of companies that did not benefit from the presence of a coalition on the market is companies with low quality and high or upper-average prices. Based on this data, it can be assumed that the presence of an LGIPBC has an impact on the profits of companies in a particular industry, and there is reason to suppose that this impact is positive.
When ROAS is taken as the main effectiveness criterion, the simulation demonstrates very similar results (see Fig. 5). The only significant difference is that Scenario 6 also appears as a potentially effective scenario for organisations that have low costs and high or low quality of services. The ROAS perspective also demonstrates that companies with high or upper-average prices and low quality benefit from situations in which there is no LGIPBC on the market. All other participants get an increase in ROAS when the LGIPBC is working and they take part in coopetition.
However, in both effectiveness tests Scenario 2 does not seem realistic, because it is unlikely that all members of the Coalition would refuse to invest money in their own web-sites. Nevertheless, the simulation results demonstrate that organisations with a high quality/high or upper-average price combination and companies with medium quality/low or lower-average price get the best results from this scenario. This could also be used as a basis for the assumption that the LGIPBC increases transparency on the market, helping clients to find the Contractors that suit their needs best.

Fig. 4. Best individual scenarios from the perspective of profit.

Fig. 5. Best individual scenarios from the perspective of ROAS.

The third important assumption that can be made based on the ROAS tests is that Scenario 6 of the LGIPBC could be effective for companies with a low-price policy. It means that such companies can afford not to invest in their own advertising campaigns and can use the coalition as their only source of leads. Based on this assumption, an additional assumption can be made: the LGIPBC has the potential to decrease average prices in a particular industry.
According to the above-mentioned test results, there is a sufficient basis to state that the LGIPBC has a positive impact on the industry and can increase the profits and the effectiveness of advertising campaigns of its participants (except those who have high or upper-average prices and low quality).

Fig. 6. Dependence of ROAS of the coalition on the number of members of the coalition.

The next set of simulation tests was made to answer the third sub-question (How does the number of participants in the coopetition process influence the effectiveness of lead generating coopetition?). Using ROAS as the effectiveness criterion, the author obtains outputs that can be used as a basis for the conclusion that answers the third sub-question of the current research: the number of members of the coalition has an impact on the ROAS of the coalition (see Fig. 6). There is a clear increase of ROAS until the number of members of the coalition reaches a particular level. After this level there is another clear trend: a decrease of the ROAS of the coalition. One possible reason for this trend is that the average income of the coalition starts to decrease when the number of participants grows. Growth in the number of participants could increase transparency and, as a result, decrease prices. In other words, clients see who offers the same quality at a lower price and buy from them. The second test supports the assumption that the LGIPBC has the potential to increase the transparency of a particular market; however, from the author's standpoint, this assumption should be checked more precisely.
Finally, tests were made to determine whether the appearance of the LGIPBC on the market and growth in the number of its members can potentially increase the average utility of one client on the market. As a result, the following tendency was detected (see Fig. 7). Based on the results of the utility tests, we can assume that the existence of a coalition based on the LGIPBC, and an increase in the number of its members, have the potential to increase average utility on the market. As a result, the level of satisfaction of an average client can increase significantly.
This phenomenon, detected in the simulation, can be explained by the assumption that an increase in the number of members of a coalition gives a client a chance to compare more offers at once and identify the best one (from the client's subjective position).

Fig. 7. Dependence of average utility of a client from number of members.

This potential benefit that the market can get from applying the LGIPBC could also be used as grounds for the assumption that the LGIPBC can become a source of significant growth in market transparency, which means an increase in competition among companies and all the outcomes that derive from it.

References

Armstrong, M. (2006). Competition in two-sided markets. Rand J Econ, 37(3), 668–691.
Banks, J. (2000). Getting started with Automod. Vol. II. Chelmsford Massachusetts: Brooks Automation, Inc 2004.
Basole, R. C., H. Park, and B. C. Barnett (2015). Coopetition and convergence in the ICT ecosystem. Telecommunications Policy, 39, 537–552.
Bengtsson, M., and S. Kock (1999). Cooperation and competition in relationships between competitors in business networks. The Journal of Business and Industrial Marketing, 14(3), 178–191.
Brandenburger, A. M., and B. J. Nalebuff (1996). Co-opetition. New York: Doubleday Currency.
Bulgakova, M., Petrosyan L. Cooperative network games with pairwise interactions. Mathematical game theory and applications, 7(4), 7–18.
Burgers, W. P., C. W. L. Hill, and W. C. Kim (1993). A Theory of Global Strategic Alliances: The Case of the Global Auto Industry. Strategic Management Journal, 14(6), 419–432.
Caillaud, B., and B. Jullien (2003). Chicken and egg: competition among intermediation service providers. Rand J. Econ., 34(2), 309–328.
Cherrington, P. (1976). Advertising as a business force: a compilation of experiences (Reissue edition). Manchester, NH: Ayer Co Pub.
Czernek, K., and W. Czakon (2016). Trust-building processes in tourist coopetition: The case of a Polish region. Tourism Management, 52, 380–394.
Faulkner, D. O. (2000). Opposing or complementary functions? In Cooperative strategy. Economic, business and organizational issues, by D. Faulkner, and M. De Rond, 341–361. Oxford University Press.
Fish, T. (2009). My digital footprint. London: Futuretext.
Garrette, B., X. Castaner, and P. Dussauge (2009). Horizontal alliances as an alternative to autonomous production: product expansion mode choice in the worldwide aircraft industry 1945-2000. Strategic Management Journal, (30), 885–894.
Gnyawali, D. R. (2006). Impact of Co-Opetition on Firm Competitive Behavior: An Empirical Examination. Journal of Management, 32, 507–530.
Jennings, N. R. (2001). An agent-based approach for building complex software systems. Communications of the ACM, 44(4), 35–41.
Lado, A. A., N. G. Boyd, and S. C. Hanlon (1997). Competition, Cooperation, and the Search for Economic Rents: A Syncretic Model. The Academy of Management Review, 22(1), 110–141.
Kim, L. (2014). Everything You Know About Conversion Rate Optimization Is Wrong. [online document]. [Accessed March 2014]. Available at: http://www.wordstream.com/blog/ws/2014/03/17/what-is-a-good-conversion-rate
Lavie, D. (2006). The competitive advantage of interconnected firms: an extension of the resource-based view. Academy of Management Review, 31, 638–658.
Law, M. A., D. Kelton, and W. David (2000). Simulation modeling and analysis. New York [u.a.]: McGraw Hill.
Lewicki, R. J., and B. Bunker (1996). Developing and maintaining trust in work relationships. In Trust in organizations: Frontiers of theory and research, by R. M. Kramer, and T. R. Tyler, 114–139. Thousand Oaks, California: Sage.
Liu, R. (2013). Cooperation, competition and coopetition in innovation communities. Prometheus, 31(2), 91–105.
Luo, Y. (2004). Coopetition in international business. Copenhagen Business School Press.
Macy, M. W. and R. Willer (2002). From Factors to Actors: Computational Sociology and Agent-Based Modeling. Annual Review of Sociology, 28, 143–166.
Mahadevan, B. (2000). Business models for Internet-based e-commerce: An anatomy. California Management Review, 42(4), 55–69.
Muzellec, L., S. Ronteau, and M. Lambkin (2015). Two-sided Internet platforms: A business model lifecycle perspective. Industrial Marketing Management, 43, 236–249.
Osterwalder, A., Y. Pigneur, and A. Smith (2010). Business model generation. Hoboken, NJ: Wiley and Sons.
Oxley, J. E., R. C. Sampson, and B. S. Silverman (2009). Arms Race or Détente? How Interfirm Alliance Announcements Change the Stock Market Valuation of Rivals. Management Science, 55(8), 1321–1337.
Ritala, P., A. Golnam, and A. Wegmann (2014). Coopetition-based business models: The case of Amazon.com. Industrial Marketing Management, 43, 236–249.
Ritala, P., and P. Hurmelinna-Laukkanen (2009). What’s in it for me? Creating and appropriating value in innovation-related coopetition. Technovation, 29, 819–828.
Rochet, J. C., and J. Tirole (2003). Platform competition in two-sided markets. J. Eur. Econ. Assoc., 1, 990–1029.
Rothaermel, F. T. (2001). Incumbent’s advantage through exploiting complementary assets via interfirm cooperation. Strategic Management Journal, 22, 687–699.
Rusko, R. (2011). Exploring the concept of coopetition: A typology for the strategic moves of the Finnish forest industry. Industrial Marketing Management, 40, 311–320.
Shubik M. The present and future of game theory. Mathematical game theory and applications, 4(1), 93–116.
The CMO Survey (2016). CMO Survey Report: Highlights and insights. [online document]. [Accessed February 2016]. Available at: https://cmosurvey.org/wp-content/uploads/sites/11/2016/02/The CMO Survey-Highlights and Insights-Feb-2016.pdf
Tong, T. W., and J. J. Reuer (2010). Competitive consequences of interfirm collaboration: How joint ventures shape industry profitability. Journal of International Business Studies, 41, 1056–1073.
Walley, K. (2007). Coopetition. An introduction to the subject and an agenda for research. International Studies and Management and Organization, 37(2), 11–31.
Wooldridge, M. and N. R. Jennings (1995). Intelligent agents: theory and practice. The Knowledge Engineering Review, 10(2), 115–152.
Yandex Direct Budget Forecast. April 2016. Available at: https://direct.yandex.ru

Contributions to Game Theory and Management, X, 326–338

On a Dynamic Traveling Salesman Problem ⋆

Svetlana Tarashnina¹, Yaroslavna Pankratova² and Aleksandra Purtyan³
¹ St. Petersburg State University, Universitetskaya emb., 7/9, St. Petersburg, 199034, Russia
E-mail: [email protected]
² St. Petersburg State University, Universitetskaya emb., 7/9, St. Petersburg, 199034, Russia
E-mail: [email protected]
³ St. Petersburg State University, Universitetskaya emb., 7/9, St. Petersburg, 199034, Russia
E-mail: a.purtyan@mail.ru

Abstract In this paper we consider a dynamic traveling salesman problem (DTSP) in which n objects (the salesman and m customers) move on a plane with constant velocities. Each customer aims to meet the salesman as soon as possible. In turn, the salesman aspires to meet all customers in the minimal time. We formalize this problem as a non-zero sum game of pursuit and find its solution as a Nash equilibrium. Finally, we give some examples to illustrate the obtained results.
Keywords: dynamic traveling salesman problem, non-zero sum game, Nash equilibrium.

⋆ This work was supported by the Russian Science Foundation (grant 17-11-01079).

1. Introduction
We start from the classical traveling salesman problem (TSP). The idea of the TSP is to find a route through a given number of cities that visits each city exactly once and returns to the starting city, such that the length of the tour is minimized. The first instance of the traveling salesman problem is due to Euler in 1759, whose problem was to move a knight to every position on a chessboard exactly once. The traveling salesman first gained fame in a book written by the German salesman B.F. Voigt in 1832 (Michalewicz, 1994) on how to be a successful traveling salesman. He mentions the TSP, although not by that name, by suggesting that covering as many locations as possible without visiting any location twice is the most important aspect of scheduling a tour. The origins of the TSP in mathematics are not really known; all we know for certain is that it appeared around 1931 (Michalewicz, 1994).
The most straightforward method guaranteed to solve a traveling salesman problem of any size optimally is to enumerate every possible route and pick the tour with the shortest length. Since the number of tours grows factorially with the number of cities n, this enumeration cannot be carried out in polynomial time. Many different optimization methods have been used to try to solve the TSP.
The traveling salesman problem has many different real-world applications, making it a very popular problem to solve. For example, some instances of the vehicle routing problem can be modeled as a traveling salesman problem. Here the problem is to find which customers should be served by which vehicles and the minimum number of vehicles needed to serve each customer. There are different variations of this problem, including finding the minimum time to serve all customers.
The TSP can be stated as follows: given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city? For the classical traveling salesman problem there are the following difficulties:
• The rule that one should first go from the starting point to the closest point, then to the point closest to this, etc., in general does not yield the shortest route.
• It is an NP-hard problem in combinatorial optimization, important in operations research and theoretical computer science.
• Algorithms for finding exact solutions work reasonably fast only for small problem sizes.
In this paper we consider a dynamic traveling salesman problem (DTSP) in which all considered objects (the salesman and the customers) are allowed to move on a plane with constant velocities. We apply a game-theoretical approach to solving the DTSP. In fact, we propose to use some methods of pursuit game theory (Isaaks, 1965) for this purpose (Petrosjan and Shirjaev, 1981; Petrosjan, 1983; Kleimenov, 1993; Tarashnina, 1998; Pankratova and Tarashnina, 2004; Pankratova, 2007). This means that each agent is considered as a player who has his own aim, and his profit is described by a payoff function. The players may use admissible strategies and interact with each other. Here we find a solution of the DTSP as a Nash equilibrium in a non-zero sum game of pursuit. In other words, we define strategies of all players that provide the minimal length of the salesman's route.
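The first difficulty listed for the classical TSP (the "go to the closest point" rule does not yield the shortest route) can be seen on a very small instance. The sketch below is an illustration only: the four cities, the function names and the choice of a closed tour starting at city 0 are assumptions made for this example and do not come from the paper. It compares the greedy nearest-neighbor tour with the exact tour found by enumerating all routes.

```python
import itertools
import math

def tour_length(points, order):
    """Length of the closed tour that visits `points` in the given `order`."""
    return sum(
        math.hypot(points[a][0] - points[b][0], points[a][1] - points[b][1])
        for a, b in zip(order, order[1:] + order[:1])
    )

def nearest_neighbor_order(points, start=0):
    """Greedy rule: always go to the closest city that has not been visited yet."""
    order, unvisited = [start], set(range(len(points))) - {start}
    while unvisited:
        last = points[order[-1]]
        nxt = min(
            unvisited,
            key=lambda i: math.hypot(points[i][0] - last[0], points[i][1] - last[1]),
        )
        order.append(nxt)
        unvisited.remove(nxt)
    return order

def brute_force_order(points, start=0):
    """Exact solution by enumerating all (n-1)! tours; feasible only for small n."""
    others = [i for i in range(len(points)) if i != start]
    return min(
        ([start] + list(p) for p in itertools.permutations(others)),
        key=lambda order: tour_length(points, order),
    )

# Hypothetical instance: four cities on a line, the salesman starts at (0, 0).
cities = [(0, 0), (1, 0), (-2, 0), (5, 0)]
greedy = nearest_neighbor_order(cities)
exact = brute_force_order(cities)
print(tour_length(cities, greedy), tour_length(cities, exact))
```

On this instance the greedy tour has length 16 while the optimal tour has length 14, so the greedy rule loses even with four cities. The brute-force search is only usable for small n, which matches the third difficulty in the list.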

2. The game

We have m customers $C_1,\dots,C_m$, who are initially located in different cities and move on a plane with constant velocities, and a salesman S who wants to meet all of them. The players start their motion at the moment t = 0 at initial positions $z_1^0,\dots,z_m^0,z^0$. At each instant t they may choose the directions of their motion. Let $\alpha$ be the velocity of salesman S and $\beta_j$ be the velocity of customer $C_j$, $j=1,\dots,m$, with $\alpha<\beta_j$. Suppose that the salesman never meets the same customer twice and does not return to the starting point (he or she stays at the last meeting point). Thus, the salesman tries to find the shortest route that passes through the customers' current positions once, and each customer also wants to meet the salesman as soon as possible. In contrast to the classical problem, where customers are located at fixed points and may not move, here they move with constant velocities.
A strategy of salesman S is a function of time and the players' positions, i.e.

$u_S(t,z_1^t,\dots,z_m^t,z^t)=u_S.$

The salesman uses piecewise open-loop strategies.
A strategy of customer $C_j$ is a function of time, the players' positions and the velocity vector of the salesman at the current time moment, i.e.

$u_{C_j}(t,z_1^t,\dots,z_m^t,z^t,u_S^t)=u_{C_j},$

where $z_1^t,\dots,z_m^t,z^t$ are the current positions of the players and $u_S^t$ is the velocity vector of S at time instant t. In this game we suppose that the customers use the parallel pursuit strategy (Π-strategy) (Petrosjan, 1965).
Denote by $\mathcal{U}_S$ and $\mathcal{U}_{C_j}$ the sets of admissible strategies of the players, $j=1,\dots,m$.
The game is played as follows: at the initial moment of time the salesman informs customers $C_1,\dots,C_m$ about the chosen direction of his motion. After that, S meets the customers on his route if they cross it. The game is finished when the salesman meets the last customer. S aspires to minimize the total meeting time, i.e. to meet all customers in the minimal time. At the same time, each customer wants to minimize his own meeting time.
The payoff function of customer $C_j$ is

$K_{C_j}(z_1^0,\dots,z_m^0,z^0,u_{C_1},\dots,u_{C_m},u_S)=-T_j,$    (1)

where $T_j$ is the meeting time of S and customer $C_j$.
The payoff function of salesman S is

$K_S(z_1^0,\dots,z_m^0,z^0,u_{C_1},\dots,u_{C_m},u_S)=-\max\{T_1,\dots,T_m\}.$    (2)

The objective of each player in the game is to maximize his own payoff function. So, we define this problem in normal form

$\Gamma(z_1^0,\dots,z_m^0,z^0)=\langle N,\{\mathcal{U}_i\}_{i\in N},\{K_i\}_{i\in N}\rangle,$    (3)

where $N=\{C_1,\dots,C_m,S\}$ is the set of players, $\mathcal{U}_i$ is the set of admissible strategies of player i, and $K_i$ is the payoff function of player i defined by (1) and (2), $i\in N$. The constructed game depends on the initial positions of the players. Let us fix the players' initial positions and consider the game $\Gamma(z_1^0,\dots,z_m^0,z^0)$.
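Payoffs (1) and (2) depend on the strategies only through the meeting times $T_1,\dots,T_m$, so they are easy to evaluate once those times are known. The following minimal sketch shows this; the function names are illustrative assumptions, and the meeting times are the ones reported for Example 1 later in this paper.

```python
def customer_payoffs(meeting_times):
    """Payoff (1): K_{C_j} = -T_j for every customer C_j."""
    return [-t for t in meeting_times]

def salesman_payoff(meeting_times):
    """Payoff (2): K_S = -max{T_1, ..., T_m}, the time of the last meeting with a minus sign."""
    return -max(meeting_times)

# Meeting times T_1, ..., T_4 reported for Example 1 of this paper.
T = [1.285649, 1.178511, 1.037922, 1.162512]
print(customer_payoffs(T))
print(salesman_payoff(T))
```

Because every payoff is a negated time, maximizing a payoff is the same as minimizing the corresponding meeting time, and the salesman's payoff always coincides with the payoff of the customer who is met last.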

3. Basic notions and definitions
We give some notions of pursuit game theory that help to find a solution of the DTSP.

Definition 1. The parallel pursuit strategy (Π-strategy) is a kind of motion of a customer C with regard to the motion of salesman S which ensures that the segment $C^tS^t$ connecting the players' current positions $C^t$ and $S^t$ at each time moment t > 0 is parallel to the initial segment $C^0S^0$ and that its length strictly decreases.
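One way to realize Definition 1 numerically is sketched below. It assumes (this construction is not spelled out in the paper) that the customer copies the component of the salesman's velocity perpendicular to the line of sight and spends the rest of its speed moving along the line of sight towards the salesman. The function names, the Euler time stepping and the sample positions are illustrative assumptions.

```python
import math

def parallel_pursuit_velocity(c_pos, s_pos, s_vel, beta):
    """Velocity of a customer that keeps the segment C^t S^t parallel to itself.

    The customer copies the salesman's velocity component perpendicular to the
    line of sight and uses the rest of its speed `beta` along the line of sight.
    """
    dx, dy = s_pos[0] - c_pos[0], s_pos[1] - c_pos[1]
    dist = math.hypot(dx, dy)
    ex, ey = dx / dist, dy / dist                       # unit vector from C to S
    along = s_vel[0] * ex + s_vel[1] * ey               # salesman's speed along the line of sight
    px, py = s_vel[0] - along * ex, s_vel[1] - along * ey   # perpendicular part
    approach = math.sqrt(beta ** 2 - (px ** 2 + py ** 2))   # real because beta > |s_vel|
    return (px + approach * ex, py + approach * ey)

def simulate(c_pos, s_pos, s_vel, beta, dt=0.01, steps=50):
    """Euler steps; returns the list of line-of-sight vectors S^t - C^t."""
    sights = []
    for _ in range(steps):
        sights.append((s_pos[0] - c_pos[0], s_pos[1] - c_pos[1]))
        v = parallel_pursuit_velocity(c_pos, s_pos, s_vel, beta)
        c_pos = (c_pos[0] + v[0] * dt, c_pos[1] + v[1] * dt)
        s_pos = (s_pos[0] + s_vel[0] * dt, s_pos[1] + s_vel[1] * dt)
    return sights

# Hypothetical data: customer at (3, 5) with speed 4, salesman at (8, 8) moving down with speed 2.
sights = simulate(c_pos=(3.0, 5.0), s_pos=(8.0, 8.0), s_vel=(0.0, -2.0), beta=4.0)
lengths = [math.hypot(x, y) for x, y in sights]
print(lengths[0], lengths[-1])
```

In this sketch the line-of-sight vectors stay parallel to the initial one and their lengths decrease at every step, which are the two properties required by the definition.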

Since we suppose that all customers use the parallel pursuit strategy, the following definition of the Apollonius circle is needed.

Definition 2. The Apollonius circle $A(z_j^0,z^0)$ for initial positions $C_j^0=z_j^0$ and $S^0=z^0$ of customer $C_j$ and salesman S, respectively, is the set of points M such that

$\frac{|S^0M|}{\alpha}=\frac{|C_j^0M|}{\beta_j},$

where $\beta_j>\alpha>0$ (see Fig. 1).
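The set in Definition 2 is a circle whose centre and radius have a closed form. Squaring the defining relation with $k=\alpha/\beta_j$ gives the centre $(z^0-k^2z_j^0)/(1-k^2)$ and the radius $k\,|z^0-z_j^0|/(1-k^2)$. This is the standard derivation and is not quoted from the paper. The sketch below implements it; the function name is an illustrative assumption, and the sample data are those of salesman S and customer $C_3$ in Example 2.

```python
import math

def apollonius_circle(c0, s0, alpha, beta):
    """Centre and radius of {M : |S0 M| / alpha = |C0 M| / beta}, with beta > alpha > 0."""
    if not beta > alpha > 0:
        raise ValueError("the definition requires beta > alpha > 0")
    k2 = (alpha / beta) ** 2
    cx = (s0[0] - k2 * c0[0]) / (1 - k2)
    cy = (s0[1] - k2 * c0[1]) / (1 - k2)
    radius = (alpha / beta) * math.hypot(s0[0] - c0[0], s0[1] - c0[1]) / (1 - k2)
    return (cx, cy), radius

# Data of Example 2: S0 = (7, 8), C3 = (4, 14), alpha = 2, beta_3 = 4.
center, radius = apollonius_circle((4, 14), (7, 8), 2, 4)

# Check the defining ratio at one point M of the circle.
M = (center[0] + radius, center[1])
ratio = math.hypot(M[0] - 7, M[1] - 8) / math.hypot(M[0] - 4, M[1] - 14)
print(center, radius, ratio)
```

For these data the centre is (8, 6), the radius is $\sqrt{20}$, and the ratio of distances at the sample point equals $\alpha/\beta_3=0.5$. Note that the salesman's initial position lies inside the disk because the salesman is the slower player.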

First let us consider a three-person game with salesman S and two customers $C_1$, $C_2$. We have two intersection points of the Apollonius circles. The set of all intersection points of the Apollonius circles is denoted by Z. In this game $Z=\{z_{12},z_{21}\}$ (Fig. 2). The analytical formulas for the coordinates of the intersection points of the Apollonius circles are given in (Pankratova, Tarashnina, Kuzyutin, 2016).

Fig. 1. The Apollonius circle for game $\Gamma(z_j^0,z^0)$

Fig. 2. The Apollonius circles for game $\Gamma(z_1^0,z_2^0,z^0)$
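The intersection points of two Apollonius circles can also be found numerically. The sketch below does not use the analytical formulas of the cited paper; it applies the textbook circle-circle intersection construction, and all function names are illustrative assumptions. As test data it takes salesman S and customers $C_3$, $C_4$ from Example 2, where the paper reports the nearest intersection point (4.278; 8.479).

```python
import math

def apollonius_circle(c0, s0, alpha, beta):
    """Centre and radius of the Apollonius circle (closed form obtained by squaring Definition 2)."""
    k2 = (alpha / beta) ** 2
    center = ((s0[0] - k2 * c0[0]) / (1 - k2), (s0[1] - k2 * c0[1]) / (1 - k2))
    radius = (alpha / beta) * math.hypot(s0[0] - c0[0], s0[1] - c0[1]) / (1 - k2)
    return center, radius

def circle_intersections(c1, r1, c2, r2):
    """Return the 0, 1 or 2 intersection points of two circles."""
    d = math.hypot(c2[0] - c1[0], c2[1] - c1[1])
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []                                   # concentric, separate or nested circles
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)      # distance from c1 to the common chord
    h = math.sqrt(max(r1 ** 2 - a ** 2, 0.0))       # half-length of the common chord
    ux, uy = (c2[0] - c1[0]) / d, (c2[1] - c1[1]) / d
    px, py = c1[0] + a * ux, c1[1] + a * uy
    if h == 0:
        return [(px, py)]
    return [(px - h * uy, py + h * ux), (px + h * uy, py - h * ux)]

# Data of Example 2: S0 = (7, 8), C3 = (4, 14), C4 = (5, 3), alpha = 2, beta_3 = beta_4 = 4.
s0 = (7, 8)
Z = circle_intersections(*apollonius_circle((4, 14), s0, 2, 4),
                         *apollonius_circle((5, 3), s0, 2, 4))
nearest = min(Z, key=lambda p: math.hypot(p[0] - s0[0], p[1] - s0[1]))
print(Z, nearest)
```

The nearest of the two points agrees with the coordinates given in Example 2 up to the rounding used there.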

Fig. 3 shows the Apollonius circles for all pairs of S and $C_j$, $j=1,\dots,m$. Denote by $A_j$ the Apollonius disk corresponding to the Apollonius circle $A(z_j^0,z^0)$. It is known that if customer $C_j$ uses the parallel pursuit strategy and salesman S uses any admissible strategy from $\mathcal{U}_S$, then all possible meeting points of the salesman and customer $C_j$ cover the Apollonius disk (Petrosjan, 1983). In particular, if the salesman moves along a straight line, then the meeting point of S and $C_j$ lies on the Apollonius circle. The union of all Apollonius disks is denoted by A, i.e. $A=A_1\cup\dots\cup A_m$, and $\partial A=\partial(A_1\cup\dots\cup A_m)$ is the boundary of the set A.
In addition, we introduce the notion of the level of the boundary. The boundary $\partial A=\partial^1A$ is called the boundary of the first level. If we remove the boundary of the first level, then the remaining Apollonius disks form a new boundary; we call it the boundary of the second level and denote it by $\partial^2A$, etc.

Fig. 3. The Apollonius circles for game $\Gamma(z_1^0,\dots,z_m^0,z^0)$

4. Nash equilibria
We introduce the following types of behavior of salesman S.

Behavior $u_S^1$: Salesman S moves along a straight line towards customer $C_j$, that is, to the nearest point on the boundary of the union of all Apollonius disks $A_j$, $j=1,\dots,m$ (Fig. 4).
Behavior $u_S^2$: Salesman S moves along a straight line to the nearest intersection point of the Apollonius circles $A(z_j^0,z^0)$ and $A(z_k^0,z^0)$ ($j\neq k$) that belongs to the boundary $\partial A$ (Fig. 5).
Behavior $u_S^3$: Salesman S moves along a straight line to the nearest intersection point of the Apollonius circles $A(z_j^0,z^0)$ and $A(z_k^0,z^0)$ ($j\neq k$) that belongs to the boundary $\partial^2A$ (Fig. 6) and then changes his direction and moves along a straight line towards the last customer $C_l$, $l\neq j\neq k$.

Fig. 4. Behavior $u_S^1$

Fig. 5. Behavior $u_S^2$

Fig. 6. Behavior $u_S^3$
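Behavior $u_S^3$ consists of two straight legs, and the meeting time with the last customer can be estimated leg by leg. The sketch below rests on two assumptions that are not stated in this form in the paper. First, under the Π-strategy the distance to a customer shrinks at the rate $\sqrt{\beta^2-\alpha^2\sin^2\theta}-\alpha\cos\theta$, where $\theta$ is the angle between the salesman's direction and the direction from the customer to the salesman. Second, on the last leg the two players move head-on with closing speed $\alpha+\beta$. The function names are hypothetical. The data are those of Example 3 given later in this section, with the turning point taken from the rounded coordinates printed there.

```python
import math

def pi_closing_speed(c_pos, s_pos, s_dir, alpha, beta):
    """Rate at which |C^t S^t| shrinks when S moves with unit direction s_dir
    and the customer uses the Pi-strategy (assumed formula, see the lead-in)."""
    dx, dy = s_pos[0] - c_pos[0], s_pos[1] - c_pos[1]
    cos_t = (s_dir[0] * dx + s_dir[1] * dy) / math.hypot(dx, dy)
    return math.sqrt(beta ** 2 - alpha ** 2 * (1 - cos_t ** 2)) - alpha * cos_t

def two_leg_meeting(s0, turn_point, c0, alpha, beta):
    """Total meeting time and final point for the last customer under behavior u_S^3."""
    leg1 = math.hypot(turn_point[0] - s0[0], turn_point[1] - s0[1])
    s_dir = ((turn_point[0] - s0[0]) / leg1, (turn_point[1] - s0[1]) / leg1)
    t1 = leg1 / alpha                                   # time to reach the turning point
    d0 = math.hypot(s0[0] - c0[0], s0[1] - c0[1])
    remaining = d0 - pi_closing_speed(c0, s0, s_dir, alpha, beta) * t1
    if remaining <= 0:
        raise ValueError("the customer is met before the turning point")
    t2 = remaining / (alpha + beta)                     # head-on motion on the second leg
    # The line of sight keeps its direction under the Pi-strategy,
    # so S heads along the initial direction from S to C.
    ux, uy = (c0[0] - s0[0]) / d0, (c0[1] - s0[1]) / d0
    final = (turn_point[0] + alpha * t2 * ux, turn_point[1] + alpha * t2 * uy)
    return t1 + t2, final

# Example 3: S0 = (7, 8), turning point (4.278, 8.479), last customer C2 = (13, 6).
total_time, final_point = two_leg_meeting((7, 8), (4.278, 8.479), (13, 6), 2, 4)
print(total_time, final_point)
```

Because the turning point is entered with three decimals only, the result can be expected to match the values reported in Example 3 (meeting time 1.972784, final point (5.399; 8.106)) to about three decimals, not exactly.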

Theorem 1. In the dynamic traveling salesman problem $\Gamma(z_1,z_2,z_3,z_4,z)$ there exists a Nash equilibrium. It is constructed as follows:

• The salesman chooses strategy $u_S^*$ that prescribes to him one type of behavior $u_S^1$, $u_S^2$ or $u_S^3$ and gives the minimal meeting time.
• The customers use the Π-strategy.

Remark 1. If behavior $u_S^1$ exists, then it provides the salesman with the minimum meeting time, and there is no need to consider the other types of behavior, $u_S^2$ and $u_S^3$.

Example 1. Consider a game $\Gamma(z_1^0,\dots,z_4^0,z^0)$ with salesman S and four customers $C_1$, $C_2$, $C_3$, $C_4$ with initial conditions

$S^0=(8;8)$, $C_1^0=(7;1)$, $C_2^0=(3;5)$, $C_3^0=(9;2)$, $C_4^0=(12;4)$

and velocities $\alpha=2$, $\beta_1=3.5$, $\beta_2=4$, $\beta_3=4$, $\beta_4=4$, respectively.

Fig. 7. Nash equilibrium in $\Gamma(z_1^0,z_2^0,z_3^0,z_4^0,z^0)$: behavior $u_S^1$

In this game there exists a Nash equilibrium in which salesman S uses behavior $u_S^1$. In other words, S moves along a straight line towards customer $C_1$ to point (7.636; 5.454). The corresponding trajectory of his motion is shown in Fig. 7 (the thick line). The customers move to the following points using the Π-strategy:
• $C_1$ moves to point (7.636; 5.454),
• $C_2$ moves to point (7.666; 5.666),
• $C_3$ moves to point (7.706; 5.945),
• $C_4$ moves to point (7.671; 5.698).
In fact, the salesman meets the customers in the order $C_3$, $C_4$, $C_2$, $C_1$. That is, customer $C_1$ meets S last. The players' payoffs in the Nash equilibrium are equal to:

$K_{C_1}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.285649,$
$K_{C_2}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.178511,$
$K_{C_3}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.037922,$
$K_{C_4}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.162512,$
$K_S(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.285649.$

Example 2. Consider a game $\Gamma(z_1^0,\dots,z_4^0,z^0)$ with salesman S and four customers $C_1$, $C_2$, $C_3$, $C_4$ with initial conditions:

$S^0=(7;8)$, $C_1^0=(2;11)$, $C_2^0=(3;6)$, $C_3^0=(4;14)$, $C_4^0=(5;3)$

and velocities $\alpha=2$, $\beta_1=4$, $\beta_2=4$, $\beta_3=4$, $\beta_4=4$, respectively.

Fig. 8. Nash equilibrium in $\Gamma(z_1^0,z_2^0,z_3^0,z_4^0,z^0)$: behavior $u_S^2$

In this game there exists a Nash equilibrium in which salesman S uses behavior $u_S^2$. In other words, S moves along a straight line to the nearest intersection point of the Apollonius circles $A(z_3^0,z^0)$ and $A(z_4^0,z^0)$ with coordinates (4.278; 8.479). The corresponding trajectory of the salesman's motion is shown in Fig. 8 (the thick line). The bold dot marks the end of the route, and the dashed lines correspond to the trajectories of customers $C_1$, $C_2$, $C_3$ and $C_4$. The customers move to the following points using the Π-strategy:
• $C_1$ moves to point (5.02; 8.348),
• $C_2$ moves to point (5.376; 8.286),
• $C_3$ and $C_4$ move to point (4.278; 8.479).
In this case the salesman meets the customers in the order $C_2$, $C_1$, $C_3$&$C_4$. Note that the two customers $C_3$ and $C_4$ meet the salesman simultaneously. The players' payoffs in the Nash equilibrium are equal to:

$K_{C_1}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.004803,$
$K_{C_2}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-0.824385,$
$K_{C_3}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.381790,$
$K_{C_4}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.381790,$
$K_S(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.381790.$
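The meeting times of Examples 1 and 2 can be checked numerically. The sketch below assumes (this formula is a reconstruction, not quoted from the paper) that when the salesman moves along a fixed unit direction e and a customer uses the Π-strategy, the distance between them shrinks at the constant rate $\sqrt{\beta^2-\alpha^2\sin^2\theta}-\alpha\cos\theta$, where $\theta$ is the angle between e and the direction from the customer to the salesman. The function name `pi_meeting` and the data layout are illustrative. For Example 2 the target point is entered with the three decimals printed in the text, so agreement is expected to about three decimals only.

```python
import math

def pi_meeting(s0, target, c0, alpha, beta):
    """Meeting time and point when S moves straight from s0 towards `target`
    and the customer starting at c0 uses the Pi-strategy (assumed closing-speed formula)."""
    leg = math.hypot(target[0] - s0[0], target[1] - s0[1])
    ex, ey = (target[0] - s0[0]) / leg, (target[1] - s0[1]) / leg
    dx, dy = s0[0] - c0[0], s0[1] - c0[1]
    dist = math.hypot(dx, dy)
    cos_t = (ex * dx + ey * dy) / dist
    closing = math.sqrt(beta ** 2 - alpha ** 2 * (1 - cos_t ** 2)) - alpha * cos_t
    t = dist / closing
    return t, (s0[0] + alpha * t * ex, s0[1] + alpha * t * ey)

ALPHA = 2

# Example 1: behavior u_S^1, the salesman heads straight for customer C1.
ex1_customers = {"C1": ((7, 1), 3.5), "C2": ((3, 5), 4), "C3": ((9, 2), 4), "C4": ((12, 4), 4)}
ex1 = {name: pi_meeting((8, 8), (7, 1), pos, ALPHA, beta)
       for name, (pos, beta) in ex1_customers.items()}

# Example 2: behavior u_S^2, the salesman heads for the intersection point (4.278; 8.479).
ex2_customers = {"C1": ((2, 11), 4), "C2": ((3, 6), 4), "C3": ((4, 14), 4), "C4": ((5, 3), 4)}
ex2 = {name: pi_meeting((7, 8), (4.278, 8.479), pos, ALPHA, beta)
       for name, (pos, beta) in ex2_customers.items()}

for name in ("C1", "C2", "C3", "C4"):
    print(name, ex1[name][0], ex2[name][0])
```

Sorting the computed times of Example 1 gives the meeting order $C_3$, $C_4$, $C_2$, $C_1$ stated in the text, and the salesman's payoff is minus the largest of the four times.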

Example 3. Consider a game with one salesman and four customers $C_1$, $C_2$, $C_3$, $C_4$ with initial conditions:

$S^0=(7;8)$, $C_1^0=(2;11)$, $C_2^0=(13;6)$, $C_3^0=(4;14)$, $C_4^0=(5;3)$

and velocities $\alpha=2$, $\beta_1=4$, $\beta_2=4$, $\beta_3=4$, $\beta_4=4$, respectively.

Fig. 9. Nash equilibrium in $\Gamma(z_1^0,z_2^0,z_3^0,z_4^0,z^0)$: behavior $u_S^3$

In this game there exists a Nash equilibrium in which salesman S uses behavior $u_S^3$. In other words, S moves along a straight line to the nearest intersection point of the Apollonius circles $A(z_3^0,z^0)$ and $A(z_4^0,z^0)$ with coordinates (4.278; 8.479), and then changes his direction and moves along a straight line towards the last customer $C_2$ to point (5.399; 8.106). The corresponding trajectory of the salesman's motion is shown in Fig. 9 (the thick line). In Fig. 9, besides the Apollonius circles at the initial time moment, one can see the Apollonius circle of the last customer $C_2$ at the moment when the salesman meets customers $C_3$ and $C_4$. The position of customer $C_2$ at this moment is marked by point $C_2'$. The customers move to the following points using the Π-strategy:
• $C_1$ moves to point (4.830; 8.382),
• $C_2$ moves to point (5.399; 8.106),
• $C_3$ and $C_4$ move to point (4.278; 8.479).

In this case the salesman meets the customers in the order $C_1$, $C_3$&$C_4$, $C_2$. This means that the salesman first meets only customer $C_1$, then $C_3$ and $C_4$ at the same time, and finally $C_2$. The players' payoffs in the Nash equilibrium are equal to:

$K_{C_1}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.004803,$
$K_{C_2}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.972784,$
$K_{C_3}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.381790,$
$K_{C_4}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.381790,$
$K_S(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.972784.$

Example 4. Now we consider a case in which the customers are positioned symmetrically relative to salesman S. In this game there exist several Nash equilibria. Consider a game with one salesman and four customers $C_1$, $C_2$, $C_3$, $C_4$ with initial conditions:

$S^0=(7;8)$, $C_1^0=(11;12)$, $C_2^0=(3;4)$, $C_3^0=(3;12)$, $C_4^0=(11;4)$

and velocities $\alpha=2$, $\beta_1=3.5$, $\beta_2=4$, $\beta_3=3.5$, $\beta_4=4$, respectively. Here the Apollonius circles are located symmetrically and, therefore, we get two Nash equilibria in this game. In both equilibria salesman S uses behavior $u_S^3$:

1. Starting at the initial time moment, salesman S moves along a straight line to the intersection point of the Apollonius circles $A(z_3^0,z^0)$ and $A(z_4^0,z^0)$ with coordinates (9.116; 10.857), and then changes his direction and moves along a straight line towards customer $C_2$ to point (8.625; 10.366). In this case the salesman meets the customers in the order $C_1$, $C_3$&$C_4$, $C_2$.

$K_{C_1}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.034963,$
$K_{C_2}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-2.124748,$
$K_{C_3}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.777778,$
$K_{C_4}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.777778,$
$K_S(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-2.124748.$

2. Starting at the initial time moment, salesman S moves along a straight line to the intersection point of the Apollonius circles $A(z_1^0,z^0)$ and $A(z_2^0,z^0)$ with coordinates (4.883; 10.857), and then changes his direction and moves along a straight line towards customer $C_4$ to point (5.374; 10.366). In this case the salesman meets the customers in the order $C_3$, $C_1$&$C_2$, $C_4$.

$K_{C_1}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.777778,$
$K_{C_2}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.777778,$
$K_{C_3}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-1.034963,$
$K_{C_4}(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-2.124748,$
$K_S(z_1^0,z_2^0,z_3^0,z_4^0,z^0,u_{C_1},u_{C_2},u_{C_3},u_{C_4},u_S)=-2.124748.$

So, the route of the salesman can finish at point (8.625; 10.366) or at point (5.374; 10.366). The corresponding trajectories of the salesman's motion are shown in Fig. 10 (the thick lines).

Fig. 10. Nash equilibria in $\Gamma(z_1^0,z_2^0,z_3^0,z_4^0,z^0)$: behavior $u_S^3$ in a symmetric case

5. Conclusion
For the considered dynamic traveling salesman problem we propose a new approach to finding a solution. Applying methods and solution concepts of pursuit game theory, we describe the motion of the salesman and the customers in the form of differential equations and assign the customers the goal of meeting the salesman as soon as possible. We find Nash equilibria and consider different examples which illustrate all possible cases of behavior. Further research could deal with a cooperative version of this dynamic traveling salesman problem, taking into account companies which have many branches interacting with each other and with the main office. In cooperative dynamic games the core is often considered as the main solution concept. However, it is important for a solution to be time-consistent (Petrosjan, 1977). This property and also strong time-consistency of the core are investigated in (Tarashnina, 2002; Pankratova, 2010; Sedakov, 2015).

References
Isaacs, R. (1965). Differential Games: A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. New York: Wiley.
Kleimenov, A. F. (1993). Non zero-sum differential games. Ekaterinburg: Nauka.
Lawler, E. L., Lenstra, J. K., Rinnooy Kan, A. H. G. and Shmoys, D. B. (1986). The Traveling Salesman. John Wiley and Sons.
Michalewicz, Z. (1994). Genetic Algorithms + Data Structures = Evolution Programs. Springer-Verlag, 2nd edition.
Pankratova, Y. (2007). Some cases of cooperation in differential pursuit games. Contributions to Game Theory and Management. Collected papers presented on the International Conference Game Theory and Management / Editors L. A. Petrosjan, N. A. Zenkevich. St. Petersburg: Graduate School of Management, SPbGU, 361–380.
Pankratova, Ya. B. (2010). A Solution of a cooperative differential group pursuit game. Diskretnyi Analiz i Issledovanie Operatsii, 17(2), 57–78 (in Russian).
Pankratova, Ya. and Tarashnina, S. (2004). How many people can be controlled in a group pursuit game. Theory and Decision. Kluwer Academic Publishers, 56, 165–181.
Pankratova, Y., Tarashnina, S. and Kuzyutin, D. (2016). Nash Equilibria in a Group Pursuit Game. Applied Mathematical Sciences, 10(17), 809–821.
Petrosjan, L. A. (1965). On a Family of Differential Games of Survival in R^n. Akad. Nauk Dokl. SSSR Ser. Mat., 1, 52–54.
Petrosjan, L. A. (1977). Ustojchivost’ Reshenij v Differencial’nyh Igrah so Mnogimi Uchastnikami. Vestnik Leningrad Univ. Math., 19(4), 49–52.
Petrosjan, L. A. and Shirjaev, V. D. (1981). Simultaneous Pursuit of Several Evaders by one Pursuer. Vestnik Leningrad Univ. Math., 13.
Petrosjan, L. and Tomskii, G. (1983). Geometry of Simple Pursuit. Nauka, Novosibirsk (in Russian).
Reinelt, G. (1994). The Traveling Salesman: Computational Solutions for TSP Applications. Springer-Verlag.
Tarashnina, S. (1998). Nash equilibria in a differential pursuit game with one pursuer and m evaders. Game Theory and Applications. N.Y. Nova Science Publ. III, 115–123.
Tarashnina, S. (2002). Time-consistent solution of a cooperative group pursuit game. International Game Theory Review, 4, 301–317.
Sedakov, A. A. (2015). Strong Time Consistent Core. Mat. Teor. Igr. Prilozh., 7(2), 69–84.

Contributions to Game Theory and Management, X, 339–349 Constructive and Blocking Powers in Some Applications

Svetlana Tarashnina1 and Nadezhda Smirnova2 1 St.Petersburg State University, 7/9 Universitetskaya nab., St.Petersburg, 199034, Russia E-mail: [email protected] 2 Higher School of Economics Department of Applied Mathematics and Business Informatics, Soyuza Pechatnikov ul. 16, St. Petersburg, 190008, Russia E-mail: [email protected]

Abstract We investigate the prenucleolus, the anti-prenucleolus and the SM-nucleolus in glove market games and weighted majority games. These classes of games are well suited to studying solution concepts that take into account the blocking power of a coalition S with different weights. Analytical formulae for calculating the solutions are presented for glove market games. The influence of the blocking power on players’ payoffs is discussed, and examples are given that demonstrate similarities to and differences from other solution concepts.
Keywords: cooperative TU-game, solution concept, prenucleolus, SM-nucleolus, constructive and blocking power, glove market game, weighted majority game.

1. Introduction
In this paper we consider two types of classical cooperative games: the "glove market game" and the "weighted majority game". Our purpose is to look for ways to distribute the value $v(N)$ according to some excess-based solution concepts of TU-games, such as the prenucleolus (Schmeidler, 1969), the anti-prenucleolus and the SM-nucleolus (Tarashnina, 2011). The interest in these solution concepts stems from the different ways in which they take into account the constructive power and the blocking power of a coalition $S \subseteq N$ in the game. In fact,
• the prenucleolus takes into account only the constructive power;
• the anti-prenucleolus takes into account only the blocking power;
• the SM-nucleolus takes into account the average of the constructive and the blocking power.
Comparing the allocations obtained by these concepts, we attempt to answer what they represent in the considered games and to identify their similarities and differences. In particular, we evaluate the impact of the constructive and the blocking power on the payoffs of players.
Unfortunately, in general there are no analytic formulae for excess-based solutions, which have to be computed numerically by solving a series of linear programming problems. Where analytic formulae do exist, they allow us to find allocations directly and to analyze them. For glove market games we use the formulae from Tarashnina and Sharlai (2015). For weighted majority games we apply the algorithm from Britvin and Tarashnina (2013).
The paper is structured as follows. Section 2 contains notions and definitions of cooperative game theory. Section 3 is devoted to glove market games. Apart from analytic formulae, some illustrative examples are presented and discussed there. In Section 4 we turn to weighted majority games and analyze the considered solutions in this class of games. Finally, we give some insight into the blocking power of "weak" and "strong" players in TU-games.

2. Cooperative game theory concepts
In this paper we deal with cooperative games with transferable utility, or simply TU-games. A cooperative TU-game is a pair $(N, v)$, where $N = \{1, 2, \dots, n\}$ is the set of players and $v : 2^N \to \mathbb{R}^1$ is a characteristic function with $v(\emptyset) = 0$. Here $2^N = \{S \mid S \subseteq N\}$ is the set of coalitions in $(N, v)$. Since the game $(N, v)$ is completely determined by the characteristic function $v$, we shall sometimes represent a TU-game by its characteristic function $v$. Let $G^N$ be the set of TU-games with a finite set of players $N$.
Following the classical cooperative approach, we look for ways to distribute the amount $v(N)$ over the members of the grand coalition. A corresponding vector of payoffs (or a set of vectors) that distributes the amount $v(N)$ among the players is called a solution of the game. Here we consider solutions that belong to the set $X^0(N, v)$ of preimputations of a game $(N, v)$, i.e.
$$X^0(N, v) = \{x \in \mathbb{R}^n : x(N) = v(N)\},$$
where $x(S) = \sum_{i \in S} x_i$ for $S \subseteq N$.
Let $x$ be a preimputation in a game $(N, v)$. The excess $e(x, v, S)$ of a coalition $S$ at $x$ is $e(x, v, S) = v(S) - x(S)$. According to Maschler (1992), the excess of a coalition is a measure of dissatisfaction of the coalition at preimputation $x$, which should be minimized. For each $z \in \mathbb{R}^n$ we define the vector $\theta(z) \in \mathbb{R}^n$, which arises from $z$ by arranging its components in a non-increasing order.
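The excess and the ordering map $\theta$ can be illustrated with a short Python sketch. The three-player game below (a coalition is worth 1 once it has two members) and the names `coalitions`, `excess` and `theta` are assumptions made for illustration only; the empty coalition is skipped, since its excess is always zero.

```python
from fractions import Fraction
from itertools import combinations

def coalitions(players):
    """All non-empty coalitions of `players`, as sorted tuples."""
    return [c for r in range(1, len(players) + 1)
            for c in combinations(sorted(players), r)]

def excess(x, v, S):
    """e(x, v, S) = v(S) - x(S) for a payoff dict x and characteristic function v."""
    return v(S) - sum(x[i] for i in S)

def theta(x, v, players):
    """Excesses of all non-empty coalitions in non-increasing order."""
    return sorted((excess(x, v, S) for S in coalitions(players)), reverse=True)

# Hypothetical game (not from the text): a coalition is worth 1
# as soon as it contains at least two of the three players.
players = [1, 2, 3]
v = lambda S: Fraction(1) if len(S) >= 2 else Fraction(0)

equal = {i: Fraction(1, 3) for i in players}
skewed = {1: Fraction(1), 2: Fraction(0), 3: Fraction(0)}

theta_equal = theta(equal, v, players)
theta_skewed = theta(skewed, v, players)
print([str(e) for e in theta_equal])
print([str(e) for e in theta_skewed])
# Python compares lists lexicographically, which matches the order used below.
print(theta_equal < theta_skewed)
```

In this toy game the equal split has the smaller ordered excess vector, because its largest dissatisfaction is 1/3 rather than 1.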

Definition 1. The prenucleolus of a game $(N, v)$ is the set of vectors in $X^0(v)$ whose $\theta(e(x, v, S)_{S \subseteq N})$'s are lexicographically least, i.e.
$$\mathcal{N}(v) = \{x \in X^0(v) : \theta(e(x, v, S)_{S \subseteq N}) \preceq_{lex} \theta(e(y, v, S)_{S \subseteq N}) \text{ for all } y \in X^0(v)\}.$$
The prenucleolus of a game is a singleton (Schmeidler, 1969), so we denote this single point by $\nu(v)$.
From Definition 1 it follows that the prenucleolus does not take into account the blocking power of a coalition. This allocation method is based on the notion of constructive power. The constructive power relates to the amount $v(S)$: it is the worth of coalition $S$, or, to be exact, what $S$ can reach by cooperation.
Two allocation methods that consider the blocking power are the SM-nucleolus and the anti-prenucleolus. By the blocking power of coalition $S$ we understand the difference between $v(N)$ and $v(N \setminus S)$, that is, the amount $v^*(S)$ that the coalition $S$ brings to $N$ if the latter is formed (its contribution to the grand coalition).
Given a cooperative TU-game $(N, v)$, the dual game $(N, v^*)$ of $(N, v)$ is defined by
$$v^*(S) = v(N) - v(N \setminus S)$$
for all coalitions $S \subseteq N$. Then the dual excess $e(x, v^*, S)$ of a coalition $S \subseteq N$ at $x$ is $e(x, v^*, S) = v^*(S) - x(S)$, where $x$ is a preimputation in $(N, v)$.
Definition 2. The anti-prenucleolus of a game $(N, v)$ is defined as

$$\psi(N, v) = \{x \in X^0(N, v) : \theta(e(x, v^*, S)_{S \subseteq N}) \preceq_{lex} \theta(e(y, v^*, S)_{S \subseteq N}) \text{ for all } y \in X^0(N, v)\},$$
where $\theta(e(x, v^*, S)_{S \subseteq N})$ is the vector of dual excesses whose components are arranged in non-increasing order.
The anti-prenucleolus takes into account only the blocking power of each coalition. Clearly, the anti-prenucleolus of a game $(N, v)$ can be defined as the prenucleolus of the dual game $(N, v^*)$.
In order to define the SM-nucleolus, we consider the weighted sum-excess of a coalition $S \subseteq N$ at each $x \in X^0(N, v)$ as follows:
$$\bar{e}(x, v, S) = \frac{1}{2} e(x, v, S) + \frac{1}{2} e(x, v^*, S).$$
Definition 3. The SM-nucleolus of a game $(N, v)$ is defined as

$$\mu(N, v) = \{x \in X^0(N, v) : \theta(\bar{e}(x, v, S)_{S \subseteq N}) \preceq_{lex} \theta(\bar{e}(y, v, S)_{S \subseteq N}) \text{ for all } y \in X^0(N, v)\},$$
where $\theta(\bar{e}(x, v, S)_{S \subseteq N})$ is the vector of sum-excesses whose components are arranged in non-increasing order.
Here the weights for the constructive and the blocking power are both equal to $\frac{1}{2}$. However, these weights can be arbitrary, as shown in Smirnova and Tarashnina (2012) and Smirnova and Tarashnina (2016).
Notice that the SM-nucleolus coincides with the prenucleolus of the constant-sum game $(N, w)$, where $w = \frac{v + v^*}{2}$ (Tarashnina, 2011).

3. Glove market game
The glove game, proposed by Shapley and Shubik (1969), is one of the most popular market games in cooperative game theory. It describes the following situation: there are two types of complementary products on the market and a finite set of firms, each of which can produce a product of one type. A customer needs both types of products.
Let $N$ consist of two types of players, $N = P \cup Q$, where $P$ and $Q$ are disjoint sets of players. Each player of $P$ owns a right-hand glove and each player of $Q$ owns a left-hand glove. If $j$ members of $P$ and $k$ members of $Q$ form a coalition, they have $\min\{j, k\}$ complete pairs of gloves, each being worth 1. Unmatched gloves are worth nothing. The characteristic function for the game is defined by

$$v(S) = \min\{|P \cap S|, |Q \cap S|\}, \quad S \subseteq N. \tag{1}$$
Thus, the worth of a coalition $S$ is equal to the number of pairs of gloves the coalition can assemble. Without loss of generality, assume $|P| \geq |Q|$.
This model is quite popular in cooperative game theory, and its core as well as other classical solution concepts have already been studied (see Owen, 1975; Aumann and Shapley, 1974; Billera and Raanan, 1981; Einy et al., 1996). The core consists of a payoff vector in which the holders of the scarce commodity (the left-glove owners in our case) obtain a payoff of 1 each and the other players obtain nothing. This result holds for $|P| = 100$ and $|Q| = 99$ as well as for $|P| = 100$ and $|Q| = 1$. The same holds for the prenucleolus, since it belongs to the core (if the latter is nonempty). The following result holds.

Theorem 1. Let $(N, v)$ be a glove market game with characteristic function (1). Suppose that $|P| = p$, $|Q| = q$, $P = \{i_1, \dots, i_p\}$, $Q = \{j_1, \dots, j_q\}$, $p \geq q$. Then the prenucleolus $\nu(v)$ is defined by the formulae

$$\text{for } p > q: \quad \nu_{i_k} = 0, \quad \nu_{j_l} = 1, \tag{2}$$
$$\text{for } p = q: \quad \nu_{i_k} = \frac{1}{2}, \quad \nu_{j_l} = \frac{1}{2}, \tag{3}$$
where $i_k \in P$, $j_l \in Q$.
Shapley and Shubik (1969) noticed the violent insensitivity of the core to the ratio of the sizes of the sets $P$ and $Q$. In contrast, the Shapley value is sensitive to the relative scarcity of the gloves, which is an attractive property. The SM-nucleolus also possesses this desirable sensitivity.
For the SM-nucleolus we present here the following result obtained in Tarashnina and Sharlai (2015).
Theorem 2. Let $(N, v)$ be a glove market game with characteristic function (1). Suppose that $|P| = p$, $|Q| = q$, $P = \{i_1, \dots, i_p\}$, $Q = \{j_1, \dots, j_q\}$, $p \geq q$. Then the SM-nucleolus $\mu(v)$ is defined by the formulae
$$\mu_{i_k} = \frac{q}{4p - 2q}, \quad \mu_{j_l} = \frac{3p - 2q}{4p - 2q}, \tag{4}$$
where $i_k \in P$ and $j_l \in Q$.
Let us introduce analytic formulae for the anti-prenucleolus in the following form.

Theorem 3. Let $(N, v)$ be a glove market game with characteristic function (1). Suppose that $|P| = p$, $|Q| = q$, $P = \{i_1, \dots, i_p\}$, $Q = \{j_1, \dots, j_q\}$, $p \geq q$. Then the anti-prenucleolus $\nu^*(v)$ is defined by the formulae
$$\nu^*_{i_k} = \frac{q}{2p}, \quad \nu^*_{j_l} = \frac{1}{2}, \tag{5}$$
where $i_k \in P$, $j_l \in Q$.
In order to compare the solution concepts we present a matrix $A$ of payoffs, where $a_{pq}$ is the payoff of a right-glove holder in the game that describes the market with $p$ right-hand owners and $q$ left-hand ones, $p \geq 1$, $q \geq 1$.

Table 1. The prenucleolus matrix

Rows: number of right-glove holders; columns: number of left-glove holders.
right \ left    1      2      3      4
1               0,5    1      1      1
2               0      0,5    1      1
3               0      0      0,5    1
4               0      0      0      0,5

As we see in Table 1, the whole payoff is given to the holders of the scarce commodity ("strong" players). The remaining players receive nothing, which is the same outcome (zero) that they obtain if they do not cooperate. These players are unlikely to agree to this kind of allocation. In our opinion, the allocations proposed by the prenucleolus are nonviable.
On the other hand, the anti-prenucleolus takes into account only the blocking power of players: the "weak" players all together receive half of the total amount, and so do the strong ones (see Table 2). Such a division may be unacceptable to the players holding the scarce commodity.

Table 2. The anti-prenucleolus matrix

Rows: number of right-glove holders; columns: number of left-glove holders.
right \ left    1        2        3        4
1               0,5      0,5      0,5      0,5
2               0,25     0,5      0,5      0,5
3               0,167    0,333    0,5      0,5
4               0,125    0,25     0,375    0,5

At the same time, the SM-nucleolus (Table 3) can be seen as a sort of average of the prenucleolus and the anti-prenucleolus. In addition, comparing the payoffs of weak players with the payoffs the Shapley value assigns to them (Table 4), we see that the following inequalities hold:
$$0 \leq x^w_{Pr} \leq x^w_{Sh} \leq x^w_{SM} \leq x^w_{Anti} \leq \frac{1}{2},$$
where $x^w_{Pr}$, $x^w_{Sh}$, $x^w_{SM}$ and $x^w_{Anti}$ are the payoffs of a weak player according to the prenucleolus, the Shapley value, the SM-nucleolus and the anti-prenucleolus, respectively.

Table 3. The SM-nucleolus matrix

Rows: number of right-glove holders; columns: number of left-glove holders.
right \ left    1        2        3        4
1               0,5      0,667    0,7      0,714
2               0,167    0,5      0,625    0,667
3               0,1      0,25     0,5      0,6
4               0,071    0,167    0,3      0,5

Table 4. The Shapley value matrix

Rows: number of right-glove holders; columns: number of left-glove holders.
right \ left    1        2        3        4
1               0,5      0,667    0,75     0,8
2               0,167    0,5      0,65     0,733
3               0,083    0,233    0,5      0,638
4               0,05     0,133    0,271    0,5
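The entries of Tables 1–4 can be re-derived with the following Python sketch. It evaluates formulae (2)–(5) directly; the Shapley value is obtained by counting player orderings, which is a standard computation but not a formula taken from the text. All function names are illustrative, and each rule returns the pair (weak payoff, strong payoff) for a market whose long side has `big` players and whose short side has `small` players.

```python
from fractions import Fraction
from math import comb, factorial

def prenucleolus(big, small):
    """(weak, strong) payoffs from formulae (2)-(3); requires big >= small."""
    if big == small:
        return Fraction(1, 2), Fraction(1, 2)
    return Fraction(0), Fraction(1)

def sm_nucleolus(big, small):
    """(weak, strong) payoffs from formula (4)."""
    d = 4 * big - 2 * small
    return Fraction(small, d), Fraction(3 * big - 2 * small, d)

def anti_prenucleolus(big, small):
    """(weak, strong) payoffs from formula (5)."""
    return Fraction(small, 2 * big), Fraction(1, 2)

def shapley(big, small):
    """(weak, strong) Shapley payoffs, by counting orderings of the players."""
    n = big + small
    # A long-side player completes a new pair iff, among its predecessors,
    # the b short-side players outnumber the a long-side players.
    count = sum(comb(big - 1, a) * comb(small, b)
                * factorial(a + b) * factorial(n - 1 - a - b)
                for a in range(big) for b in range(a + 1, small + 1))
    weak = Fraction(count, factorial(n))
    strong = (small - big * weak) / small  # efficiency: payoffs sum to v(N) = small
    return weak, strong

def right_holder(rule, p, q):
    """Entry a_pq: payoff of a right-glove holder with p right and q left holders."""
    return rule(p, q)[0] if p >= q else rule(q, p)[1]

rules = [("prenucleolus", prenucleolus), ("anti-prenucleolus", anti_prenucleolus),
         ("SM-nucleolus", sm_nucleolus), ("Shapley value", shapley)]
for name, rule in rules:
    print(name)
    for p in range(1, 5):
        print("  ".join(f"{float(right_holder(rule, p, q)):.3f}" for q in range(1, 5)))

# Chain of inequalities for the weak player, checked for 1 <= q <= p <= 4.
chain_ok = all(
    0 <= prenucleolus(p, q)[0] <= shapley(p, q)[0]
    <= sm_nucleolus(p, q)[0] <= anti_prenucleolus(p, q)[0] <= Fraction(1, 2)
    for p in range(1, 5) for q in range(1, p + 1))
print(chain_ok)
```

Exact fractions are used so that the comparison in the inequality chain is not affected by rounding; the check covers only the 4 × 4 range shown in the tables.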

In addition, we present here some significant examples to give the intuition behind the considered solution concepts. The examples demonstrate similarities and differences between the presented solutions and well-known solution concepts of TU-games such as the core and the Shapley value.

Example 1. Suppose that $P = \{1, 2\}$, $Q = \{3\}$. The resulting payoff vectors for this game are presented in the following table:

The core: $C(v) = (0, 0, 1)$
The prenucleolus: $\nu(v) = (0, 0, 1)$
The Shapley value: $\varphi(v) = \left(\frac{1}{6}, \frac{1}{6}, \frac{2}{3}\right)$
The SM-nucleolus: $\mu(v) = \left(\frac{1}{6}, \frac{1}{6}, \frac{2}{3}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{1}{4}, \frac{1}{4}, \frac{1}{2}\right)$

The prenucleolus, as well as the core, assigns one to player 3 and zero to the other players. On the other hand, players 1 and 2 together can prevent player 3 from getting 1 by forming a coalition against him. Therefore, together they have the same blocking power as player 3 does.
The SM-nucleolus takes into account the blocking power of coalition $\{1, 2\}$. It assigns $\frac{2}{3}$ to player 3 and $\frac{1}{6}$ to each of players 1 and 2. Note that for this game the SM-nucleolus coincides with the Shapley value. That result was proved for an arbitrary three-person TU-game in (Tarashnina, 2011). This gives some insight that the SM-nucleolus is a solution concept with properties similar to those of the Shapley value.
The anti-prenucleolus takes into account only the blocking power and assigns $\frac{1}{2}$ to player 3 and $\frac{1}{4}$ to each of players 1 and 2.
Example 2. Suppose that $P = \{1, 2, 3\}$, $Q = \{4\}$. The resulting payoff vectors for this game are presented in the following table:

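The core: $C(v) = (0, 0, 0, 1)$
The prenucleolus: $\nu(v) = (0, 0, 0, 1)$
The Shapley value: $\varphi(v) = \left(\frac{1}{12}, \frac{1}{12}, \frac{1}{12}, \frac{3}{4}\right)$
The SM-nucleolus: $\mu(v) = \left(\frac{1}{10}, \frac{1}{10}, \frac{1}{10}, \frac{7}{10}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{1}{6}, \frac{1}{6}, \frac{1}{6}, \frac{1}{2}\right)$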

The prenucleolus does not assign any positive payoff to the owners of a right-hand glove; the whole amount of the total payoff goes to the holder of the unique left-hand glove. The Shapley value assigns 25 percent of the total payoff to players 1, 2, and 3 altogether, whereas the SM-nucleolus distributes 30 percent of the total payoff among the holders of a right-hand glove. The anti-prenucleolus allocates 50 percent of the total payoff to each of the two types of players.

Example 3. Suppose that $P = \{1, 2, 3\}$, $Q = \{4, 5\}$. The resulting payoff vectors for this game are presented in the table below.

The core: $C(v) = (0, 0, 0, 1, 1)$
The prenucleolus: $\nu(v) = (0, 0, 0, 1, 1)$
The Shapley value: $\varphi(v) = \left(\frac{14}{60}, \frac{14}{60}, \frac{14}{60}, \frac{39}{60}, \frac{39}{60}\right)$
The SM-nucleolus: $\mu(v) = \left(\frac{1}{4}, \frac{1}{4}, \frac{1}{4}, \frac{5}{8}, \frac{5}{8}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}, \frac{1}{2}, \frac{1}{2}\right)$

In this example the Shapley value assigns 35 percent of the total payoff to players 1, 2, and 3 altogether, whereas the SM-nucleolus distributes 37,5 percent of the total amount among the holders of a right-hand glove. As a result, we can notice that the SM-nucleolus distributes a bigger part of $v(N)$ among the players with a non-scarce glove than the Shapley value does.
Finally, let us give an example with $p = q$.
Example 4. Suppose that $P = \{1, 2, 3\}$, $Q = \{4, 5, 6\}$. The resulting payoff vectors for this game are presented below.

The core: $C(v) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$
The prenucleolus: $\nu(v) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$
The Shapley value: $\varphi(v) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$
The SM-nucleolus: $\mu(v) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\right)$

As a matter of fact, all players in this game are treated equally and all considered solution concepts propose the same payoff vector as a solution of the game.
These examples also illustrate what happens to the payoffs of the right- and left-hand glove owners under the different solutions when the sizes of the sets $P$ and $Q$ change. Thus, we have demonstrated the importance of taking into account the blocking power of coalitions. An important class of games where forming a block plays a crucial role is the class of weighted majority games.
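Returning to Example 1, the role of the sum-excess can be made concrete with a small Python sketch. It builds the glove game with $P = \{1, 2\}$, $Q = \{3\}$, its dual, and the equal-weight sum-excess, and then ranks the three payoff vectors listed in Example 1 by their ordered sum-excess vectors. The helper names are illustrative, and the comparison covers only these three candidates; it is not a proof of minimality over all preimputations.

```python
from fractions import Fraction as F
from itertools import combinations

P, Q = {1, 2}, {3}
N = P | Q

def v(S):
    """Glove market characteristic function (1)."""
    S = set(S)
    return F(min(len(S & P), len(S & Q)))

def v_dual(S):
    """Dual game: v*(S) = v(N) - v(N without S)."""
    return v(N) - v(N - set(S))

def sum_excess(x, S):
    """Equal-weight average of the constructive and the blocking excess of S at x."""
    xS = sum(x[i] for i in S)
    return F(1, 2) * (v(S) - xS) + F(1, 2) * (v_dual(S) - xS)

def theta(x):
    """Sum-excesses of all non-empty coalitions, largest first."""
    subsets = [c for r in range(1, len(N) + 1) for c in combinations(sorted(N), r)]
    return sorted((sum_excess(x, S) for S in subsets), reverse=True)

candidates = {
    "prenucleolus": {1: F(0), 2: F(0), 3: F(1)},
    "SM-nucleolus": {1: F(1, 6), 2: F(1, 6), 3: F(2, 3)},
    "anti-prenucleolus": {1: F(1, 4), 2: F(1, 4), 3: F(1, 2)},
}
# Sorting by the ordered vectors uses Python's lexicographic list comparison.
ranking = sorted(candidates, key=lambda name: theta(candidates[name]))
for name in ranking:
    print(name, [str(e) for e in theta(candidates[name])])
```

Among the three vectors, the SM-nucleolus has the lexicographically smallest ordered sum-excess vector: its largest sum-excess is 1/6, against 1/4 for the anti-prenucleolus and 1/2 for the prenucleolus.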

4. The weighted majority games
The excess-based solution concepts that take into account the blocking power have important applications in modelling the power of players in voting games. There are some well-known power indices, such as the Shapley-Shubik power index and the Banzhaf index.

In such games, a proposed bill or decision is either passed or rejected. In a voting body, the voting rule specifies which subsets of players are large enough to pass bills and which are not. Those subsets that can pass bills without outside help are called winning coalitions, while those that cannot are called losing coalitions. In such a case, we can take the worth of a winning coalition to be 1 and the worth of a losing coalition to be 0. The resulting game, in which all coalitions have a value of either 1 or 0, is called a simple game. A simple game is completely specified once its winning coalitions are known, and it is traditional to require it to satisfy some reasonable conditions.
Definition 4. A simple game is a pair $(N, \mathcal{W})$, where $N$ is the set of players and $\mathcal{W}$ is the collection of winning coalitions, such that
– $\emptyset \notin \mathcal{W}$ (the empty set is a losing coalition);
– $N \in \mathcal{W}$ (the grand coalition is winning);
– $S \in \mathcal{W}$ and $S \subseteq T$ imply $T \in \mathcal{W}$ (if $S$ is a winning coalition, so is any coalition that contains $S$).
One common type of simple game is a weighted voting game, which is usually represented by $[q; \omega_1, \dots, \omega_n]$. Such games are defined by a characteristic function of the form
$$v(S) = \begin{cases} 1 & \text{if } \sum_{i \in S} \omega_i > q, \\ 0 & \text{if } \sum_{i \in S} \omega_i \leq q \end{cases}$$
for some non-negative numbers $\omega_i$, called the weights, and some positive number $q$, called the quota. If $q = \frac{1}{2} \sum_{i \in N} \omega_i$, we deal with a weighted majority game.
Example 5. Consider two versions of a voting game with four players, in which player 1 has 5 shares and players 2 to 4 have 2 shares each. The quota is 6 in the first case and 5 in the second. So, the games [6;5,2,2,2] and [5;5,2,2,2] are under consideration, and we propose to look at the behavior of different solutions in these games. The lists of all coalitions and their worths are presented in Tables 5 and 7.
It can be noticed that for the considered class of games the following inequality holds for any $S \subseteq N$:
$$v(S) \leq w(S) \leq v^*(S).$$
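The worths used in Example 5 can be generated mechanically. The Python sketch below builds $v$, the dual $v^*$ and $w = (v + v^*)/2$ for a weighted voting game with the strict winning rule given above, and checks the inequality $v(S) \leq w(S) \leq v^*(S)$ for both games of the example. The helper names `make_game` and `table` are illustrative.

```python
from fractions import Fraction
from itertools import combinations

def make_game(quota, weights):
    """Return (players, v, v_dual, w) for the weighted voting game [quota; weights]."""
    players = tuple(range(1, len(weights) + 1))

    def v(S):
        # A coalition wins when its total weight strictly exceeds the quota.
        return 1 if sum(weights[i - 1] for i in S) > quota else 0

    def v_dual(S):
        rest = tuple(i for i in players if i not in S)
        return v(players) - v(rest)

    def w(S):
        return Fraction(v(S) + v_dual(S), 2)

    return players, v, v_dual, w

def table(quota, weights):
    """Rows (S, v(S), v*(S), w(S)) for all non-empty coalitions S."""
    players, v, v_dual, w = make_game(quota, weights)
    return [(S, v(S), v_dual(S), w(S))
            for r in range(1, len(players) + 1)
            for S in combinations(players, r)]

for quota in (6, 5):
    print(f"[{quota};5,2,2,2]")
    for S, a, b, c in table(quota, [5, 2, 2, 2]):
        print(S, a, b, c)
        assert a <= c <= b  # v(S) <= w(S) <= v*(S)
```

The two games differ only in the rows for coalitions {1} and {2,3,4}, which is where the blocking power of the three small players shows up.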

Case 1. The game [6;5,2,2,2]. Note that the core in this game consists of the unique point (1,0,0,0) and, clearly, the prenucleolus is the same point. Comparing the SM-nucleolus and the Shapley-Shubik power index, one can see that the SM-nucleolus assigns $\frac{1}{10}$ to each weak player, which is more than the Shapley-Shubik power index does. The anti-prenucleolus rates the weak players even higher, presumably because it takes into account the blocking power of coalition $\{2, 3, 4\}$.
Case 2. The game [5;5,2,2,2]. This case differs from the previous one by the status of coalition $\{2, 3, 4\}$. Here coalition $\{2, 3, 4\}$ belongs to the set of winning coalitions of the game. This is a constant-sum simple game and it is known that the core

Table 5. Case 1. The game [6;5,2,2,2].

S            v(S)   v*(S)   w(S)
{1}          0      1       0,5
{2}          0      0       0
{3}          0      0       0
{4}          0      0       0
{1,2}        1      1       1
{1,3}        1      1       1
{1,4}        1      1       1
{2,3}        0      0       0
{2,4}        0      0       0
{3,4}        0      0       0
{1,2,3}      1      1       1
{1,2,4}      1      1       1
{1,3,4}      1      1       1
{2,3,4}      0      1       0,5
{1,2,3,4}    1      1       1

Table 6. Case 1. Solutions of the game [6;5,2,2,2].

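The prenucleolus: $\nu(v) = (1, 0, 0, 0)$
The SM-nucleolus: $\mu(v) = \left(\frac{7}{10}, \frac{1}{10}, \frac{1}{10}, \frac{1}{10}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{1}{2}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}\right)$
The Shapley-Shubik power index: $\varphi(v) = \left(\frac{3}{4}, \frac{1}{12}, \frac{1}{12}, \frac{1}{12}\right)$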

in these games is empty. Here we get $\mu(v) = \nu(v) = \nu^*(v) = \left(\frac{2}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}\right)$. The difference between these payoffs and the Shapley-Shubik payoffs is interpreted as in the previous case: the weak players get less under the Shapley-Shubik power index. However, the fact that coalition $\{2, 3, 4\}$ becomes winning in Case 2 doubles the power of the weak players compared with Case 1 ($\frac{1}{5}$ in Case 2 versus $\frac{1}{10}$ in Case 1), which cannot be said about player 1, whose power decreases.

5. Conclusion
The analysis of the considered games shows that for some classes of games the prenucleolus assigns inappropriate allocations to players. It is then necessary to consider alternative solution concepts like the SM-nucleolus and the anti-prenucleolus, which take into account the blocking power of a coalition in the game. This is especially important for the class of weighted majority games, since forming blocks there may prevent a bill from passing. The very name "blocking power" reflects a key point of a voting process.

Table 7. Case 2. The game [5;5,2,2,2].

S            v(S)   v*(S)   w(S)
{1}          0      0       0
{2}          0      0       0
{3}          0      0       0
{4}          0      0       0
{1,2}        1      1       1
{1,3}        1      1       1
{1,4}        1      1       1
{2,3}        0      0       0
{2,4}        0      0       0
{3,4}        0      0       0
{1,2,3}      1      1       1
{1,2,4}      1      1       1
{1,3,4}      1      1       1
{2,3,4}      1      1       1
{1,2,3,4}    1      1       1

Table 8. Case 2. Solutions of the game [5;5,2,2,2].

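The prenucleolus: $\nu(v) = \left(\frac{2}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}\right)$
The SM-nucleolus: $\mu(v) = \left(\frac{2}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}\right)$
The anti-prenucleolus: $\nu^*(v) = \left(\frac{2}{5}, \frac{1}{5}, \frac{1}{5}, \frac{1}{5}\right)$
The Shapley-Shubik power index: $\varphi(v) = \left(\frac{1}{2}, \frac{1}{6}, \frac{1}{6}, \frac{1}{6}\right)$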
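The Shapley-Shubik rows of Tables 6 and 8 can be re-derived by brute force. The Python sketch below enumerates all orderings of the four players and counts how often each player is pivotal, using the same strict winning rule as the characteristic function; the function name `shapley_shubik` is illustrative. The nucleolus-type solutions are not recomputed here, since they require solving a series of linear programs.

```python
from fractions import Fraction
from itertools import permutations

def shapley_shubik(quota, weights):
    """Share of player orderings in which each player is pivotal.

    Players are indexed from 0 here, so index 0 is player 1 of the text.
    """
    n = len(weights)
    pivots = [0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        total = 0
        for i in order:
            total += weights[i]
            if total > quota:  # the coalition has just become winning
                pivots[i] += 1
                break
    return [Fraction(c, len(orders)) for c in pivots]

case1 = shapley_shubik(6, [5, 2, 2, 2])
case2 = shapley_shubik(5, [5, 2, 2, 2])
print([str(s) for s in case1])
print([str(s) for s in case2])
```

Enumeration is adequate for four players (24 orderings); for larger voting bodies a counting or generating-function method would be the more usual choice.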

Apart from the solution concepts considered here, there is a relatively new solution concept called the α-prenucleoli set. That set consists of points, each of which takes into account the constructive power with the weight $\alpha \in [0, 1]$ and the blocking power with the weight $1 - \alpha$. Clearly, it contains the prenucleolus, the SM-nucleolus and the anti-prenucleolus. Investigation of this set-valued concept helps to find a value of α for which the corresponding solution would possess good properties and be appropriate for players.
In this paper we consider games where the players are divided into weak and strong ones. An interesting direction is the class of games with a veto player or with a major player (Parilina and Sedakov, 2014; Parilina and Sedakov, 2016).

References
Aumann, R. J. and L. S. Shapley (1974). Values of non-atomic games. Princeton University Press.
Billera, L. J. and J. Raanan (1981). Cores of non-atomic linear production games. Mathematics of Operations Research, 6, 420–423.
Britvin, S. V. and S. I. Tarashnina (2013). Algorithms of finding the prenucleolus and the SM-nucleolus of cooperative TU-games. Mat. Teor. Igr Prilozh., 5(4), 14–32 (in Russian).
Einy, E., R. Holzman, D. Monderer and B. Shitovitz (1996). Core and Stable Sets of Large Games Arising in Economics. J. Economic Theory, 68, 200–211.
Maschler, M. (1992). The bargaining set, kernel, and nucleolus: a survey. In: Handbook of Game Theory (Aumann, R. J. and S. Hart, eds), Vol. 1, pp. 591–665. Elsevier Science Publishers BV.
Owen, G. (1975). On the core of linear production games. Mathematical Programming, 9, 358–370.
Parilina, E. and A. Sedakov (2014). Stable Cooperation in Graph-Restricted Games. Contributions to Game Theory and Management, 7, 271–281.
Parilina, E. and A. Sedakov (2016). Stable Cooperation in a Game with a Major Player. International Game Theory Review, 18(2), 1640005.
Schmeidler, D. (1969). The nucleolus of a characteristic function game. SIAM J. Appl. Math., 17, 1163–1170.
Shapley, L. and M. Shubik (1969). Pure competition, coalitional power, and . International Economic Review, 10(3), 337–362.
Smirnova, N. V. and S. I. Tarashnina (2012). Geometrical properties of the [0, 1]-nucleolus in cooperative TU-games. Mat. Teor. Igr Prilozh., 4(1), 55–73 (in Russian).
Smirnova, N. V. and S. I. Tarashnina (2016). Properties of solutions of cooperative games with transferable utilities. Russian Mathematics, 60(6), 63–74.
Tarashnina, S. (2011). The simplified modified nucleolus of a cooperative TU-game. TOP, 19(1), 150–166.
Tarashnina, S. and T. Sharlai (2015). The SM-nucleolus in glove market games. AMS, 19(27), 1331–1340.

Contributions to Game Theory and Management, X, 350–374
Coordination in Multilevel Supply Chain⋆

Ekaterina N. Zenkevich1, Yulia E. Lonyagina2 and Maria V. Fattakhova3 1 St. Petersburg State University 7/9 Universitetskaya emb., St. Petersburg, 199034 Russia E-mail: [email protected] 2 St. Petersburg State University 7/9 Universitetskaya emb., St. Petersburg, 199034 Russia E-mail: [email protected] 3 St. Petersburg State University of Aerospace Instrumentation 67 B. Morskaya str., St. Petersburg, 190000 Russia E-mail: [email protected]

Abstract This article discusses the problem of coordination in multilevel supply chains with a tree-like structure, assuming linear demand in the final markets. The authors suggest three ways to solve the chain coordination problem, i.e. three rules for choosing the players' strategies that satisfy certain optimality criteria. The first is a decentralized solution, which arises when all supply chain participants act independently of each other. The second is the optimization of the overall chain revenue in a cooperative game, the so-called centralized solution. Finally, the third is the weighted Nash solution, which is obtained by optimizing the weighted Nash product. All three approaches are compared on a particular example.
Keywords: Multilevel supply chains, tree-like structure, overall chains revenue, Nash weighted solution.

1. Introduction
The modern world is closely connected with trade and business, of which supply chains are an indispensable part. The need of firms to sell the goods they produce makes them develop their trading activities by organizing systems of trade flows and trade connections. Every year, under the pressure of progress and globalization, not only the number of these systems grows, but also their complexity, namely their structure and scale. In addition, optimization problems arise in supply chains that are already organized, and the importance of solving them is sometimes underestimated. As a result, badly organized operations lead to losses and forgone profit. Therefore, both the wide spread of supply chains and the importance of optimization under the revenue criterion make the problem of coordination among the players in a supply chain highly topical. Accordingly, the goal of this article is to develop a way of coordinating the participants that optimizes the supply chain under the revenue criterion.
In this article we examine one of the most general and widespread kinds of supply chains, namely multilevel supply chains with a tree-like structure (an example of such a chain is depicted in Fig. 1). The coordination problem for these chains is not well studied, because modeling of supply chains with this particular structure has begun only recently. The problem was first examined by Corbett and Karmarkar (2001) and Carr and Karmarkar (2005). Later, modeling of multilevel supply chains continued in the direction of pricing contracts and horizontal competition (Kaya, 2012; Cho, 2014). Only recently have researchers returned to the optimization of multilevel supply chains (Zhou et al., 2015).
In this paper we consider three approaches to the coordination of participants, based on different models of interaction or on different optimization criteria. For each approach we describe the process of interaction among the participants, formulate the corresponding optimization criterion, and design a method for constructing a solution that satisfies it.
⋆ This work is supported by the Russian Foundation for Basic Research, project N 16-01-00805A

Fig. 1. The example of the multilevel supply chain with tree-like distribution structure

The rest of the article is organized as follows. Section 2 is devoted to the mathematical formalization of multilevel supply chains with a tree-like distribution structure. Sections 3, 4, and 5 analyze the decentralized, centralized and weighted Nash solutions, respectively. Section 6 presents an example and compares the results obtained. Section 7 summarizes the results of the research.

2. Mathematical formalization of the supply chains

Consider a tree-like graph $G = (X, F)$ with the set of vertices $X$ and the set of edges $F$; for a vertex $x$, $F_x$ denotes the set of vertices that immediately follow $x$. Denote the root vertex of this tree by $x^1$. In the vertex set $X$ define the sets $X_1, \dots, X_l$, $X_i \subset X$, as follows:

\[ X_1 = \{x^1\}, \qquad X_l = \{x \in X \mid F_x = \emptyset\}, \]
\[ X_{k+1} = (F_x \setminus X_l) \ \text{ for } x \in X_k,\ k = 1,\dots,l-2, \ \text{ if } (F_x \setminus X_l) \neq \emptyset. \tag{1} \]

Comment 1. The sets introduced above form a partition of the set $X$, i.e. $\bigcup_{i=1}^{l} X_i = X$, $X_e \cap X_r = \emptyset$, $e \neq r$.

Definition 1. The subset $X_i \subset X$, $i = 1,\dots,l$, is called the set of vertices (nodes) of level $i$. The nodes of the set $X_l$ are called final.

We denote a node $x$ of the set $X_i$ by $x_j^i$, where the upper index is the number of the level $X_i$ that contains this node and the lower index is the ordinal number of the node within $X_i$. For uniformity, the root node $x^1$ is denoted by $x_1^1$. Furthermore, $m_i$ denotes the number of nodes of level $i$, i.e. $m_i = |X_i|$, where $|X_i|$ is the cardinality of the set $X_i$.

Definition 2. We say that the partition $X_1, \dots, X_l$ of the vertex set $X$ defined by rule (1) defines a supply chain with a tree-like control (distribution) structure.

Definition 3. The sector of a node $x_j^i \in X \setminus X_l$ is the set $F_{x_j^i}$.

Comment 2. The set of sectors together with the root node determines the partition of the vertex set $X$.

By $S_j^i$ we denote the set of index pairs of those nodes that belong to the sector of the node $x_j^i \in X \setminus X_l$, i.e. $S_j^i = \{(k,h) \mid x_h^k \in F_{x_j^i}\}$. Note that by construction $S_j^i \neq \emptyset$.

Assume that every node $x_j^i$, $i = 1,\dots,l$, $j = 1,\dots,m_i$, of the supply chain consists of a finite set of elements $\{x_{jk}^i\}_{k=1}^{n_{ij}}$, for which a set of numbers $\{v_{ijk}\}_{k=1}^{n_{ij}}$, $v_{ijk} \ge 0$ for all $k$, is defined, where $n_{ij} \ge 1$ is an integer. In substance, these elements are a group of competing firms that produce and consume a homogeneous product and have different unit production costs $v_{ijk}$ (production capacity is assumed to be unlimited). For each firm $x_{jk}^i \in x_j^i$ we introduce the variable $q_{ijk} \ge 0$, the current production volume of this firm, and denote the total volume of the homogeneous product produced by all firms $\{x_{jk}^i\}_{k=1}^{n_{ij}}$ of the node $x_j^i$ by $Q_{ij} = \sum_{k=1}^{n_{ij}} q_{ijk}$. Then for the sector of each node $x_j^i \in X \setminus X_l$ of the supply chain the following condition is assumed to hold:

\[ Q_{ij} = \sum_{k=1}^{n_{ij}} q_{ijk} = \sum_{r,h:\,(r,h)\in S_j^i} \ \sum_{t=1}^{n_{rh}} q_{rht}, \tag{2} \]

meaning that there is neither a deficit nor a surplus of the product in the supply chain.

For every node $x_j^i \in X$ we introduce the variable $p_{ij}$, the price at which the firms $\{x_{jk}^i\}_{k=1}^{n_{ij}}$ of the node $x_j^i$ sell a unit of the product. For each final node $x_j^l \in X_l$ the following linear (inverse demand) function is given:

\[ p_{lj} = a_{lj} - b_{lj} Q_{lj}, \tag{3} \]

where $a_{lj} > 0$, $b_{lj} > 0$.

In effect, this means that the final nodes sell their product in non-competitive consumer markets that operate according to the Cournot model with the linear dependence given by formula (3).
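The balance condition (2) and the price rule (3) can be illustrated with a small sketch. The node names, the dictionary layout, and the helper functions below are hypothetical and are not part of the original formalization; the numbers are made up for illustration only.

```python
from fractions import Fraction

# Hypothetical encoding of a small tree-like chain: node -> nodes of its sector.
sectors = {"x11": ["x21", "x22"], "x21": [], "x22": []}

def total_output(volumes, node):
    """Q_ij: total volume produced by all firms of a node."""
    return sum(volumes[node])

def balance_holds(sectors, volumes):
    """Check condition (2) for every node that has a non-empty sector."""
    for node, children in sectors.items():
        if not children:  # final node: condition (2) does not apply
            continue
        supplied = sum(total_output(volumes, c) for c in children)
        if total_output(volumes, node) != supplied:
            return False
    return True

def final_price(a, b, volumes, node):
    """Price in a final market from the linear function (3): p = a - b * Q."""
    return a - b * total_output(volumes, node)

# Illustrative firm volumes q_ijk (two firms in x11 and x21, one firm in x22).
volumes = {
    "x11": [Fraction(5), Fraction(7)],
    "x21": [Fraction(4), Fraction(4)],
    "x22": [Fraction(4)],
}
print(balance_holds(sectors, volumes))  # root output 12 equals 8 + 4
print(final_price(Fraction(100), Fraction(2), volumes, "x21"))
```

Exact rational arithmetic is used so that the equality in (2) can be tested without rounding tolerance.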

Definition 4. The set of values $\left(\{q_{ijk}\}_{i,j,k}, \{p_{ij}\}_{i,j}\right)$ defines the trading flow $d$ in the supply chain.

Definition 5. A flow $d$ is called feasible if $p_{lj} > 0$, $Q_{ij} > 0$, $j = 1,\dots,m_l$.

Let $D$ be the set of all feasible flows in the supply chain. For each firm $x_{jk}^i \in x_j^i$, $i = 1,\dots,l$, $j = 1,\dots,m_i$, $k = 1,\dots,n_{ij}$, define the revenue function $\pi_{ijk}$ on the set $D$ of feasible trading flows as follows:

\[ \pi_{ijk}(d) = \begin{cases} q_{11k}\,(p_{11} - v_{11k}), & \text{if } i = 1; \\ q_{ljk}\,(a_{lj} - b_{lj} Q_{lj} - p_{rh} - v_{ljk}), & \text{if } i = l; \\ q_{ijk}\,(p_{ij} - p_{rh} - v_{ijk}), & \text{in all other cases}, \end{cases} \]

where $p_{rh}$ is the price of the supplier node, i.e. $(i,j) \in S_h^r$.

Let us order the vertex set $X$ of the supply chain: first the root node, then the nodes of the second level in ascending order, then those of the third and fourth levels, and so on up to the final level inclusive. We obtain the ordered system $\{x_1^1, x_1^2, x_2^2, \dots, x_{m_l}^l\}$. This ordered set of all nodes of the supply chain (denote it by $N$) is taken as the set of players. The strategy set $U_{ij}$ of the player $x_j^i$ is the set of all possible vectors $u_{ij} \in D$, where $u_{ij}$ is composed of the ordered variables defined for all firms $\{x_{jk}^i\}_{k=1}^{n_{ij}}$ of the node $x_j^i$ and lies within the domain of feasible flows, namely:

\[ U_{ij} = \begin{cases} \{u_{ij} = (q_{ij1},\dots,q_{ijn_{ij}}, p_{ij}) \in D\}, & x_j^i \in N,\ i = 1,\dots,l-1,\ j = 1,\dots,m_i; \\ \{u_{lj} = (q_{lj1},\dots,q_{ljn_{lj}}) \in D\}, & x_j^l \in N,\ j = 1,\dots,m_l. \end{cases} \tag{4} \]

In this article we examine three formulations of the objectives and optimality criteria. If each participant of the supply chain acts independently and exclusively in its own interest, the model and the corresponding solution are called decentralized. If all participants cooperate and agree to act jointly in order to maximize the total revenue of the supply chain, the problem is called centralized. The third variant, the weighted Nash solution, is the solution of an optimization problem whose objective function is the weighted Nash product and whose status quo point is the solution of the decentralized model for the same supply chain.
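The three cases of the revenue function $\pi_{ijk}(d)$ can be written as one small function. This is only an illustrative sketch: the function name, its keyword arguments, and the sample numbers are assumptions introduced here, not notation from the paper.

```python
def firm_revenue(level, last_level, q, v, own_price=None,
                 supplier_price=None, a=None, b=None, node_output=None):
    """Revenue pi_ijk of one firm for the three cases of the definition.

    level          -- level i of the firm's node
    last_level     -- number of levels l
    q, v           -- the firm's volume q_ijk and unit cost v_ijk
    own_price      -- p_ij, selling price of the node (root and intermediate nodes)
    supplier_price -- p_rh, price charged by the supplier node (non-root nodes)
    a, b           -- market parameters of formula (3) (final nodes only)
    node_output    -- Q_lj, total output of the final node
    """
    if level == 1:                      # root firm: no supplier to pay
        return q * (own_price - v)
    if level == last_level:             # final node: market price from (3)
        return q * (a - b * node_output - supplier_price - v)
    return q * (own_price - supplier_price - v)   # intermediate node

# Illustrative numbers only.
print(firm_revenue(1, 3, q=10, v=2, own_price=5))
print(firm_revenue(2, 3, q=4, v=1, own_price=10, supplier_price=3))
print(firm_revenue(3, 3, q=4, v=1, supplier_price=3, a=20, b=2, node_output=6))
```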

3. Game-theoretic model of the multilevel decentralized supply chain

3.1. Formalization and the optimality criteria

First, let us describe the decision-making procedure in the decentralized model:

Step 1. The root node sets the selling price for the nodes of its sector.
Step 2. The nodes of the second level, having received this information from the root node, set the price of the good for the nodes of their sectors. The procedure is then repeated down to the nodes of the next-to-last level inclusive.
Step 3. The final nodes, based on the prices received from their suppliers and on the demand functions, determine the volumes of the good to be produced for the market.
Step 4. The volumes are distributed among the firms within each node of the final level.
Step 5. Information about the volumes is passed to all higher levels, and within each node the volumes are distributed among its firms.
Step 6. The revenue of each participant of the supply chain is calculated.

The decision-making process described above characterizes the decentralized multilevel tree-like supply chain as a conflict-controlled system with a hierarchical structure; such systems are defined by a sequence of control levels that follow one another in order of priority.

Definition 6. A feasible flow $d^*$ is called optimal if

\[ \pi_{ijk}(d^*) \ge \pi_{ijk}(d^{ij}), \quad \forall\, i = 1,\dots,l,\ j = 1,\dots,m_i,\ k = 1,\dots,n_{ij}, \tag{5} \]

where $d^{ij}$ is the flow obtained when the player $x_j^i$ deviates from its strategy $u_{ij}$.

Consider a non-zero-sum multistage game $\Gamma$ with a hierarchical structure, given by the tuple $\langle Y, \{U_i\}_{i \in Y}, \{H_i\}_{i \in Y} \rangle$, where $Y = \{1, 2, \dots, k\}$ is the set of players, partitioned into subsets according to priority, $U_i$ is the set of control actions of player $i$ on the players subordinate to it, and $H_i$ is the payoff function of player $i$, defined on the Cartesian product $U = \prod_{i \in Y} U_i$ of the players' sets $U_i$. A control vector $u = (u_1, \dots, u_k)$ forms a situation in the game $\Gamma$. Now take the ordered set of supply chain nodes $N = \{x_1^1, \dots, x_{m_l}^l\}$ as the set of players $Y$, and the strategy sets $U_{ij}$ of the players $x_j^i \in N$ as the sets of control actions. To each player $x_j^i \in N$ we assign the vector $\pi_j^i = (\pi_{ij1}, \pi_{ij2}, \dots, \pi_{ijn_{ij}})$. As the payoff functions of the players we then take the correspondingly ordered set of vectors $\pi = \{\pi_1^1, \dots, \pi_{m_l}^l\}$.

The tuple $\langle N, \{U_{ij}\}_{i,j:\,x_j^i \in N}, \{\pi_j^i\}_{i,j:\,x_j^i \in N} \rangle$ then defines a non-zero-sum multistage game with a hierarchical structure, and the coordination problem for the decentralized model of the multilevel supply chain consists in finding a Nash equilibrium in a multilevel hierarchical game with complete information.

3.2. Construction of the two-level decentralized supply chain solution

Let us begin with the particular case $l = 2$, in which the supply chain has only two levels and takes the form shown in Fig. 2.

Fig. 2. Two-level supply chain.

Consider firm $k$ of the final node $x_j^2$, where $1 \le j \le m_2$, $1 \le k \le n_{2j}$. Its revenue function is

\[ \pi_{2jk} = q_{2jk}\,(p_{2j} - p_{11} - v_{2jk}). \tag{6} \]

Substitute into this formula the expression for $p_{2j}$ given by the demand function (3), namely

\[ p_{2j} = a_{2j} - b_{2j} Q_{2j}, \qquad Q_{2j} = \sum_{k=1}^{n_{2j}} q_{2jk}. \]

We obtain

\[ \pi_{2jk} = q_{2jk} \Big( a_{2j} - b_{2j} \sum_{h=1}^{n_{2j}} q_{2jh} - p_{11} - v_{2jk} \Big). \tag{7} \]

To satisfy condition (5), apply the necessary condition for a maximum to the revenue function (7):

\[ \frac{\partial \pi_{2jk}}{\partial q_{2jk}} = \Big( a_{2j} - b_{2j} \sum_{h=1}^{n_{2j}} q_{2jh} - p_{11} - v_{2jk} \Big) - b_{2j} q_{2jk} = 0, \]

and express $q_{2jk}$:

\[ q_{2jk} = \frac{1}{2 b_{2j}} (a_{2j} - p_{11} - v_{2jk}) - \frac{1}{2} \sum_{h=1,\, h \neq k}^{n_{2j}} q_{2jh}. \tag{8} \]

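Writing (6)-(8) for all $k = 1,\dots,n_{2j}$, we arrive at the system

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix} \cdot \begin{pmatrix} q_{2j1} \\ q_{2j2} \\ \vdots \\ q_{2jn_{2j}} \end{pmatrix} = \begin{pmatrix} \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2j1}) \\ \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2j2}) \\ \vdots \\ \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2jn_{2j}}) \end{pmatrix}. \tag{9} \]

The matrix of system (9) is nondegenerate because its rows (columns) are linearly independent; hence the system has a unique solution with respect to all $q_{2jk}$. The inverse of the matrix of system (9) is

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}^{-1}_{[n_{2j} \times n_{2j}]} = \begin{pmatrix} \frac{n_{2j}}{n_{2j}+1} & \frac{-1}{n_{2j}+1} & \cdots & \frac{-1}{n_{2j}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{2j}+1} & \frac{-1}{n_{2j}+1} & \cdots & \frac{n_{2j}}{n_{2j}+1} \end{pmatrix}; \]

multiplying both sides of (9) on the left by this matrix gives

\[ \begin{pmatrix} q_{2j1} \\ q_{2j2} \\ \vdots \\ q_{2jn_{2j}} \end{pmatrix} = \begin{pmatrix} \frac{n_{2j}}{n_{2j}+1} & \frac{-1}{n_{2j}+1} & \cdots & \frac{-1}{n_{2j}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{2j}+1} & \frac{-1}{n_{2j}+1} & \cdots & \frac{n_{2j}}{n_{2j}+1} \end{pmatrix} \cdot \begin{pmatrix} \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2j1}) \\ \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2j2}) \\ \vdots \\ \frac{1}{b_{2j}} (a_{2j} - p_{11} - v_{2jn_{2j}}) \end{pmatrix}. \tag{10} \]

Carrying out the multiplication in (10), we obtain the following expression for $q_{2jk}$:

\[ q_{2jk} = \frac{1}{b_{2j}(n_{2j}+1)} \Big( a_{2j} - p_{11} - n_{2j} v_{2jk} + \sum_{h=1,\, h \neq k}^{n_{2j}} v_{2jh} \Big), \quad k = 1,\dots,n_{2j}. \tag{11} \]

The values found are indeed points of maximum of the revenue functions, since

\[ \frac{\partial^2 \pi_{2jk}}{\partial q_{2jk}^2} = -b_{2j} - b_{2j} = -2 b_{2j} < 0 \]

holds for every $b_{2j} > 0$. Each firm chooses only its own volume $q_{2jk}$, so the sign of this own second derivative is all that the maximum condition requires; the mixed derivatives $\partial^2 \pi_{2jk} / \partial q_{2jk} \partial q_{2jr}$, $r \neq k$, which by (7) equal $-b_{2j}$, do not enter it.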
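As a numerical cross-check of (9)-(11), the sketch below computes the volumes of one final node in two ways: from the closed form (11) and from the explicit inverse used in (10). The function names are hypothetical; the sample parameters ($a = 5000$, $b = 0.25$, four firms with costs 342, 340, 338, 345) are those of the final node $x_1^3$ in the example of Section 6, where the same closed form applies with the supplier price in place of $p_{11}$.

```python
from fractions import Fraction

def leaf_quantities_closed_form(a, b, p_supplier, costs):
    """Formula (11): q_k = (a - p - n*v_k + sum_{h != k} v_h) / (b*(n+1))."""
    n = len(costs)
    total = sum(costs)
    return [(a - p_supplier - n * v + (total - v)) / (b * (n + 1)) for v in costs]

def leaf_quantities_from_system(a, b, p_supplier, costs):
    """Solve system (9) with the explicit inverse of the matrix that has 2 on the
    diagonal and 1 elsewhere: n/(n+1) on the diagonal, -1/(n+1) elsewhere."""
    n = len(costs)
    rhs = [(a - p_supplier - v) / b for v in costs]
    s = sum(rhs)
    # Row k of the inverse times rhs equals rhs_k - s/(n+1).
    return [r - s / (n + 1) for r in rhs]

a, b = Fraction(5000), Fraction(1, 4)
costs = [Fraction(342), Fraction(340), Fraction(338), Fraction(345)]
for p in (Fraction(0), Fraction(1000)):
    closed = leaf_quantities_closed_form(a, b, p, costs)
    solved = leaf_quantities_from_system(a, b, p, costs)
    print(p, [str(x) for x in closed], closed == solved)
```

With a zero supplier price the result reproduces the constant terms 3724, 3732, 3740, 3712 that appear in the example of Section 6.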

We can now find the expression for $Q_{2j}$:

\[ Q_{2j} = \sum_{k=1}^{n_{2j}} q_{2jk} = \sum_{k=1}^{n_{2j}} \frac{1}{b_{2j}(n_{2j}+1)} \Big( a_{2j} - p_{11} - n_{2j} v_{2jk} + \sum_{h=1,\, h \neq k}^{n_{2j}} v_{2jh} \Big) = \frac{n_{2j}(a_{2j} - p_{11}) - \sum_{k=1}^{n_{2j}} v_{2jk}}{b_{2j}(n_{2j}+1)}, \quad j = 1,\dots,m_2. \tag{12} \]

Let us turn to the root sector. For firm $k$ of the root node $x_1^1$ the revenue function has the form

\[ \pi_{11k} = q_{11k}\,(p_{11} - v_{11k}), \quad k = 1,\dots,n_{11}. \tag{13} \]

The no-surplus, no-deficit condition (2) takes the form

\[ Q_{11} = \sum_{k=1}^{n_{11}} q_{11k} = \sum_{j=1}^{m_2} Q_{2j} = \sum_{j=1}^{m_2} \frac{n_{2j}(a_{2j} - p_{11}) - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)}, \]

from which $p_{11}$ can be expressed through the variables $q_{11k}$:

\[ p_{11} = \frac{-\sum_{k=1}^{n_{11}} q_{11k} + \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)}}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}}. \tag{14} \]

Substitute (14) into the revenue formula (13):

\[ \pi_{11k} = q_{11k} \left( \frac{-Q_{11} + \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)}}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} - v_{11k} \right), \quad k = 1,\dots,n_{11}, \tag{15} \]

and apply the necessary condition for a maximum to the revenue functions (15):

\[ \frac{\partial \pi_{11k}}{\partial q_{11k}} = \frac{-Q_{11} + \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)}}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} - v_{11k} - \frac{q_{11k}}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} = 0, \quad k = 1,\dots,n_{11}. \tag{16} \]

Keeping the variables $q_{11k}$ on the left-hand side and moving the remaining parameters to the right-hand side, we obtain the system

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix} \cdot \begin{pmatrix} q_{111} \\ q_{112} \\ \vdots \\ q_{11n_{11}} \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{111} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \\ \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{112} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \\ \vdots \\ \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{11n_{11}} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \end{pmatrix}. \tag{17} \]

The matrix of system (17),

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}_{[n_{11} \times n_{11}]}, \]

is nondegenerate because its columns (rows) are linearly independent. Therefore the values of the variables $q_{11k}$ can be found uniquely by multiplying this system by the inverse matrix, which has the form

\[ \begin{pmatrix} \frac{n_{11}}{n_{11}+1} & \frac{-1}{n_{11}+1} & \cdots & \frac{-1}{n_{11}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{11}+1} & \frac{-1}{n_{11}+1} & \cdots & \frac{n_{11}}{n_{11}+1} \end{pmatrix}. \]

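We obtain the expressions for $q_{11k}$, $k = 1,\dots,n_{11}$:

\[ \begin{pmatrix} q_{111} \\ q_{112} \\ \vdots \\ q_{11n_{11}} \end{pmatrix} = \begin{pmatrix} \frac{n_{11}}{n_{11}+1} & \frac{-1}{n_{11}+1} & \cdots & \frac{-1}{n_{11}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{11}+1} & \frac{-1}{n_{11}+1} & \cdots & \frac{n_{11}}{n_{11}+1} \end{pmatrix} \times \begin{pmatrix} \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{111} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \\ \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{112} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \\ \vdots \\ \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)} - v_{11n_{11}} \sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)} \end{pmatrix}. \tag{18} \]

After simplifying (18) we arrive at the equalities

\[ q_{11k} = \frac{1}{n_{11}+1} \sum_{j=1}^{m_2} \frac{1}{b_{2j}(n_{2j}+1)} \Big( n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh} - n_{2j} \Big( n_{11} v_{11k} - \sum_{r=1,\, r \neq k}^{n_{11}} v_{11r} \Big) \Big), \quad k = 1,\dots,n_{11}. \tag{19} \]

The values (19) are indeed points of maximum, because

\[ \frac{\partial^2 \pi_{11k}}{\partial q_{11k}^2} = -\frac{1}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} - \frac{1}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} = -\frac{2}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} < 0, \tag{20} \]

due to the fact that $\frac{n_{2j}}{b_{2j}(n_{2j}+1)} > 0$ for all $j = 1,\dots,m_2$. As before, each firm chooses only its own volume $q_{11k}$, so the mixed derivatives $\partial^2 \pi_{11k} / \partial q_{11k} \partial q_{11r}$, $r \neq k$, do not enter the maximum condition.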

In formula (19) all the parameters are known, because they are given in the supply chain. Consequently, the values of the variables $q_{11k}$ are known as well. We can then successively find the values of $p_{11}$, $Q_{11}$, $q_{2jk}$, $j = 1,\dots,m_2$, $k = 1,\dots,n_{2j}$, and $p_{2j}$, $j = 1,\dots,m_2$. Thus the optimal flow for the two-level decentralized supply chain is found and the coordination problem is solved. Analytical expressions for the equilibrium values of the variables are given in Table 1.
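The two-level solution can be sketched in a few lines of code. This is an illustration under stated assumptions, not the authors' program: the function name and the data layout are hypothetical, and the root volumes are computed from system (17) solved with the inverse matrix, written through the two aggregate sums that appear in (14).

```python
from fractions import Fraction as F

def solve_two_level(root_costs, markets):
    """Equilibrium of a two-level chain.

    root_costs -- unit costs v_11k of the root firms
    markets    -- list of (a_2j, b_2j, [v_2jk]) for the final nodes x_j^2
    Returns (root volumes q_11k, price p_11, volumes q_2jk per final node).
    """
    n1 = len(root_costs)
    # Aggregate sums from (14): A = sum_j (n a - sum v) / (b (n+1)), C = sum_j n / (b (n+1))
    A = sum((len(v) * a - sum(v)) / (b * (len(v) + 1)) for a, b, v in markets)
    C = sum(len(v) / (b * (len(v) + 1)) for a, b, v in markets)
    s = sum(root_costs)
    # Solution of system (17): q_k = (A - n1*C*v_k + C*sum_{r != k} v_r) / (n1 + 1)
    q_root = [(A - n1 * C * v + C * (s - v)) / (n1 + 1) for v in root_costs]
    p11 = (A - sum(q_root)) / C                                   # formula (14)
    q_final = [[(a - p11 - len(v) * vk + (sum(v) - vk)) / (b * (len(v) + 1))
                for vk in v] for a, b, v in markets]              # formula (11)
    return q_root, p11, q_final

# Smallest possible chain: one root firm, one market with one firm (made-up numbers).
q_root, p11, q_final = solve_two_level([F(10)], [(F(100), F(1), [F(10)])])
print(q_root, p11, q_final)
```

For this smallest chain the result can be checked by hand: the final firm's best response is $q = (a - p_{11} - v_2)/(2b)$, so the root faces $p_{11} = a - v_2 - 2bq$ and chooses $q = (a - v_1 - v_2)/(4b) = 20$ at the price $p_{11} = 50$.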

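Table 1. Analytical expressions for the equilibrium values of the variables

Variable $q_{11k}$, $k = 1,\dots,n_{11}$:
\[ \frac{1}{n_{11}+1} \sum_{j=1}^{m_2} \frac{1}{b_{2j}(n_{2j}+1)} \Big( n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh} - n_{2j} \Big( n_{11} v_{11k} - \sum_{r=1,\, r \neq k}^{n_{11}} v_{11r} \Big) \Big) \]

Variable $Q_{11}$:
\[ \frac{1}{n_{11}+1} \sum_{j=1}^{m_2} \frac{1}{b_{2j}(n_{2j}+1)} \Big( n_{11} n_{2j} a_{2j} - n_{11} \sum_{h=1}^{n_{2j}} v_{2jh} - n_{2j} \sum_{r=1}^{n_{11}} v_{11r} \Big) \]

Variable $p_{11}$:
\[ \frac{-\sum_{k=1}^{n_{11}} q_{11k} + \sum_{j=1}^{m_2} \frac{n_{2j} a_{2j} - \sum_{h=1}^{n_{2j}} v_{2jh}}{b_{2j}(n_{2j}+1)}}{\sum_{j=1}^{m_2} \frac{n_{2j}}{b_{2j}(n_{2j}+1)}} \]

Variable $q_{2jk}$, $j = 1,\dots,m_2$, $k = 1,\dots,n_{2j}$:
\[ \frac{1}{b_{2j}(n_{2j}+1)} \Big( a_{2j} - p_{11} - n_{2j} v_{2jk} + \sum_{h=1,\, h \neq k}^{n_{2j}} v_{2jh} \Big) \]

Variable $Q_{2j}$, $j = 1,\dots,m_2$:
\[ \frac{n_{2j}(a_{2j} - p_{11}) - \sum_{k=1}^{n_{2j}} v_{2jk}}{b_{2j}(n_{2j}+1)} \]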

3.3. Nash equilibrium in the multilevel decentralized game

Let a decentralized tree-like supply chain with a certain number of levels $l$ be given. As in the previous section, we begin solving the coordination problem with the analysis of the final nodes and then proceed in the direction of the root node. Consider the revenue function of firm $k$ of the node $x_j^l$:

\[ \pi_{ljk} = q_{ljk}\,(p_{lj} - p_{it} - v_{ljk}), \quad p_{it}:\ (l,j) \in S_t^i. \tag{21} \]

Substitute into the revenue formula (21) the expression for the variable $p_{lj}$ given by the demand function (3):

\[ \pi_{ljk} = q_{ljk}\,(a_{lj} - b_{lj} Q_{lj} - p_{it} - v_{ljk}). \tag{22} \]

Writing (21)-(22) for all $k = 1,\dots,n_{lj}$ and applying the necessary condition for a maximum,

\[ \frac{\partial \pi_{ljk}}{\partial q_{ljk}} = 0, \quad k = 1,\dots,n_{lj}, \tag{23} \]

we arrive at the following system:

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix} \cdot \begin{pmatrix} q_{lj1} \\ q_{lj2} \\ \vdots \\ q_{ljn_{lj}} \end{pmatrix} = \begin{pmatrix} \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{lj1}) \\ \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{lj2}) \\ \vdots \\ \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{ljn_{lj}}) \end{pmatrix}. \tag{24} \]

System (24) has the matrix

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}_{[n_{lj} \times n_{lj}]}, \]

which is nondegenerate because its columns (rows) are linearly independent. Therefore system (24) can be solved uniquely with respect to the variables $q_{ljk}$, $k = 1,\dots,n_{lj}$, and the unique solution has the form

\[ \begin{pmatrix} q_{lj1} \\ q_{lj2} \\ \vdots \\ q_{ljn_{lj}} \end{pmatrix} = \begin{pmatrix} \frac{n_{lj}}{n_{lj}+1} & \frac{-1}{n_{lj}+1} & \cdots & \frac{-1}{n_{lj}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{lj}+1} & \frac{-1}{n_{lj}+1} & \cdots & \frac{n_{lj}}{n_{lj}+1} \end{pmatrix} \cdot \begin{pmatrix} \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{lj1}) \\ \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{lj2}) \\ \vdots \\ \frac{1}{b_{lj}} (a_{lj} - p_{it} - v_{ljn_{lj}}) \end{pmatrix}, \]

or, after the multiplication,

\[ \begin{pmatrix} q_{lj1} \\ q_{lj2} \\ \vdots \\ q_{ljn_{lj}} \end{pmatrix} = \begin{pmatrix} \frac{1}{b_{lj}(n_{lj}+1)} \Big( a_{lj} - p_{it} - \Big( n_{lj} v_{lj1} - \sum_{h=2}^{n_{lj}} v_{ljh} \Big) \Big) \\ \frac{1}{b_{lj}(n_{lj}+1)} \Big( a_{lj} - p_{it} - \Big( n_{lj} v_{lj2} - \sum_{h=1,\, h \neq 2}^{n_{lj}} v_{ljh} \Big) \Big) \\ \vdots \\ \frac{1}{b_{lj}(n_{lj}+1)} \Big( a_{lj} - p_{it} - \Big( n_{lj} v_{ljn_{lj}} - \sum_{h=1}^{n_{lj}-1} v_{ljh} \Big) \Big) \end{pmatrix}. \tag{25} \]

For the node $x_j^l$ the following equality holds as well:

\[ Q_{lj} = \sum_{k=1}^{n_{lj}} q_{ljk} = \frac{n_{lj}(a_{lj} - p_{it}) - \sum_{k=1}^{n_{lj}} v_{ljk}}{b_{lj}(n_{lj}+1)}. \tag{26} \]

Let us carry out the same operations (21)-(26) for all the final nodes $x_j^l \in X_l$.

Now consider firm $k$ of the node $x_j^{l-1}$. Its revenue function has the form

\[ \pi_{(l-1)jk} = q_{(l-1)jk} \left( p_{(l-1)j} - p_{it} - v_{(l-1)jk} \right), \quad k = 1,\dots,n_{(l-1)j}, \tag{27} \]

where $p_{it}$: $(l-1, j) \in S_t^i$.

Since the node $x_j^{l-1}$ forms a sector, the no-deficit, no-surplus condition (2) gives

\[ \sum_{k=1}^{n_{(l-1)j}} q_{(l-1)jk} = Q_{(l-1)j} = \sum_{h:\,(l,h) \in S_j^{l-1}} Q_{lh} = \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{n_{lh}\left(a_{lh} - p_{(l-1)j}\right) - \sum_{r=1}^{n_{lh}} v_{lhr}}{b_{lh}(n_{lh}+1)}, \tag{28} \]

from which the variable $p_{(l-1)j}$ can be expressed uniquely:

\[ p_{(l-1)j} = f_{(l-1)j}\left(q_{(l-1)j1}, \dots, q_{(l-1)jn_{(l-1)j}}\right) = \frac{-\sum_{k=1}^{n_{(l-1)j}} q_{(l-1)jk} + \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{n_{lh} a_{lh} - \sum_{r=1}^{n_{lh}} v_{lhr}}{b_{lh}(n_{lh}+1)}}{\sum_{h:\,(l,h) \in S_j^{l-1}} \frac{n_{lh}}{b_{lh}(n_{lh}+1)}}. \tag{29} \]

Substitute (29) into the revenue formulas (27),

\[ \pi_{(l-1)jk} = q_{(l-1)jk} \left( f_{(l-1)j} - p_{it} - v_{(l-1)jk} \right), \quad k = 1,\dots,n_{(l-1)j}, \tag{30} \]

and apply the necessary condition for a maximum to formulas (30):

\[ \frac{\partial \pi_{(l-1)jk}}{\partial q_{(l-1)jk}} = f_{(l-1)j} - p_{it} - v_{(l-1)jk} + q_{(l-1)jk} \cdot \frac{-1}{\sum_{h:\,(l,h) \in S_j^{l-1}} \frac{n_{lh}}{b_{lh}(n_{lh}+1)}} = 0, \quad k = 1,\dots,n_{(l-1)j}, \tag{31} \]

or in matrix form:

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix} \cdot \begin{pmatrix} q_{(l-1)j1} \\ q_{(l-1)j2} \\ \vdots \\ q_{(l-1)jn_{(l-1)j}} \end{pmatrix} = \begin{pmatrix} \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{1}{b_{lh}(n_{lh}+1)} \Big( n_{lh} a_{lh} - n_{lh} p_{it} - n_{lh} v_{(l-1)j1} - \sum_{r=1}^{n_{lh}} v_{lhr} \Big) \\ \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{1}{b_{lh}(n_{lh}+1)} \Big( n_{lh} a_{lh} - n_{lh} p_{it} - n_{lh} v_{(l-1)j2} - \sum_{r=1}^{n_{lh}} v_{lhr} \Big) \\ \vdots \\ \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{1}{b_{lh}(n_{lh}+1)} \Big( n_{lh} a_{lh} - n_{lh} p_{it} - n_{lh} v_{(l-1)jn_{(l-1)j}} - \sum_{r=1}^{n_{lh}} v_{lhr} \Big) \end{pmatrix}. \tag{32} \]

Since the matrix of system (32),

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}_{[n_{(l-1)j} \times n_{(l-1)j}]}, \]

is nondegenerate because its columns (rows) are linearly independent, the inverse matrix exists:

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}^{-1}_{[n_{(l-1)j} \times n_{(l-1)j}]} = \begin{pmatrix} \frac{n_{(l-1)j}}{n_{(l-1)j}+1} & \frac{-1}{n_{(l-1)j}+1} & \cdots & \frac{-1}{n_{(l-1)j}+1} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{-1}{n_{(l-1)j}+1} & \frac{-1}{n_{(l-1)j}+1} & \cdots & \frac{n_{(l-1)j}}{n_{(l-1)j}+1} \end{pmatrix}_{[n_{(l-1)j} \times n_{(l-1)j}]}. \tag{33} \]

As a result, system (32) can be solved uniquely with respect to the variables $q_{(l-1)jk}$, $k = 1,\dots,n_{(l-1)j}$:

\[ q_{(l-1)jk} = \frac{1}{n_{(l-1)j}+1} \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{1}{b_{lh}(n_{lh}+1)} \Big( n_{lh} a_{lh} - n_{lh} p_{it} - \sum_{r=1}^{n_{lh}} v_{lhr} - n_{(l-1)j} n_{lh} v_{(l-1)jk} + n_{lh} \sum_{e=1,\, e \neq k}^{n_{(l-1)j}} v_{(l-1)je} \Big), \quad k = 1,\dots,n_{(l-1)j}. \tag{34} \]

The value of $Q_{(l-1)j}$ can then be calculated:

\[ Q_{(l-1)j} = \sum_{k=1}^{n_{(l-1)j}} q_{(l-1)jk} = \frac{1}{n_{(l-1)j}+1} \sum_{h:\,(l,h) \in S_j^{l-1}} \frac{1}{b_{lh}(n_{lh}+1)} \Big[ n_{(l-1)j} \Big( n_{lh} a_{lh} - n_{lh} p_{it} - \sum_{r=1}^{n_{lh}} v_{lhr} \Big) - n_{lh} \sum_{k=1}^{n_{(l-1)j}} v_{(l-1)jk} \Big]. \tag{35} \]

Let us repeat the process (27)-(35) for all the remaining nodes $x_i^{l-1}$ of the same level: $x_i^{l-1} \in X_{l-1}$, $i \neq j$.

Then, in a similar way, we analyze the nodes $x_t^i$ of the sets $X_i$ of level $i$, $i = l-2, l-3, \dots, 2$: in each of the sectors formed by these nodes we solve the two-level subgame, obtaining a solution that depends on the supplier price of the node $x_t^i$, and express this price in terms of the volume variables of that node.

Let us proceed to the analysis of the set of first-level nodes $X_1 = \{x_1^1\}$. The revenue function of firm $k$ of the node $x_1^1$ has the form

\[ \pi_{11k} = q_{11k}\,(p_{11} - v_{11k}). \tag{36} \]

Assume that the variable $p_{11}$ has been expressed through the variables $q_{11k}$, $k = 1,\dots,n_{11}$, and the production cost parameters; this expression is obtained after all $X_i$, $i = 2,\dots,l-1$, have been considered, from the no-deficit, no-surplus condition:

\[ p_{11} = f_{11}\left(q_{111}, \dots, q_{11n_{11}}, v_{it1}, \dots, v_{itn_{it}}, \dots, v_{111}, \dots, v_{11n_{11}}\right), \quad i,t:\ (i,t) \in S_1^1, \tag{37} \]

where $f_{11}$ is a linear function of the arguments $q_{111}, \dots, q_{11n_{11}}$. Substitute (37) into the revenue function (36),

\[ \pi_{11k} = q_{11k} \left( f_{11}\left(q_{111}, \dots, q_{11n_{11}}, v_{it1}, \dots, v_{itn_{it}}, \dots, v_{11n_{11}}\right) - v_{11k} \right), \tag{38} \]

and apply the necessary condition for a maximum to (38):

\[ \frac{\partial \pi_{11k}}{\partial q_{11k}} = f_{11}\left(q_{111}, \dots, q_{11n_{11}}, v_{it1}, \dots, v_{itn_{it}}, \dots, v_{111}, \dots, v_{11n_{11}}\right) - v_{11k} + q_{11k} \frac{\partial f_{11}}{\partial q_{11k}} = 0, \quad k = 1,\dots,n_{11}. \tag{39} \]

Here the values of all derivatives $\frac{\partial f_{11}}{\partial q_{11k}}$, $k = 1,\dots,n_{11}$, are constant because the function $f_{11}$ is linear. System (39) is a system of linear equations in $q_{111}, \dots, q_{11n_{11}}$ with the nondegenerate matrix

\[ \begin{pmatrix} 2 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 2 \end{pmatrix}_{[n_{11} \times n_{11}]} \tag{40} \]

and is therefore uniquely solvable with respect to all $q_{11k}$, $k = 1,\dots,n_{11}$; this solution depends only on the given parameters of the supply chain. Substituting the values obtained successively into the expressions for the remaining unknown variables, we find their equilibrium values. Hence the optimal flow $d^*$ is found and the coordination problem for the decentralized model of the multilevel supply chain is solved.
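The backward pass of this section can be summarized compactly. The following is only a sketch: the coefficients $A_{ij}$, $C_{ij}$, $\bar A_{ij}$, $\bar C_{ij}$ are auxiliary notation introduced here for illustration and do not appear in the original derivation. It relies on the observation, visible in formulas (26) and (35), that the total volume of every node is a linear function of the price $p_{it}$ charged by its supplier.

```latex
% Assumed auxiliary form: Q_{ij} = A_{ij} - C_{ij} p_{it}, with p_{it} the supplier price.
% Final node, read off from (26):
A_{lj} = \frac{n_{lj} a_{lj} - \sum_{k=1}^{n_{lj}} v_{ljk}}{b_{lj}(n_{lj}+1)}, \qquad
C_{lj} = \frac{n_{lj}}{b_{lj}(n_{lj}+1)}.

% Node x_j^i with a non-empty sector: aggregate the sector and invert condition (2),
% as in (28)-(29):
\bar A_{ij} = \sum_{(r,h) \in S_j^i} A_{rh}, \qquad
\bar C_{ij} = \sum_{(r,h) \in S_j^i} C_{rh}, \qquad
p_{ij} = \frac{\bar A_{ij} - Q_{ij}}{\bar C_{ij}}.

% Repeating the first-order argument (30)-(35) then gives
A_{ij} = \frac{n_{ij} \bar A_{ij} - \bar C_{ij} \sum_{k=1}^{n_{ij}} v_{ijk}}{n_{ij}+1}, \qquad
C_{ij} = \frac{n_{ij} \bar C_{ij}}{n_{ij}+1}, \qquad
q_{ijk} = \bar A_{ij} - \bar C_{ij}\,(p_{it} + v_{ijk}) - Q_{ij}.

% Root node: there is no supplier, so p_{it} = 0 and Q_{11} = A_{11}.
```

Read this way, the procedure is one pass from the final nodes to the root that accumulates the pairs $(A_{ij}, C_{ij})$, followed by one pass from the root back to the final nodes that recovers prices and volumes.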

4. Coordination of the centralized multilevel supply chain

Let a multilevel supply chain with a tree-like distribution structure be given. Assume that all its participants form a coalition and decide to act in coordination with the goal of maximizing the total revenue of the whole supply chain under the known linear demand functions in the final nodes. For each firm of this chain write down its revenue function $\pi_{ijk}(d)$ and sum these functions over $i = 1,\dots,l$, $j = 1,\dots,m_i$, $k = 1,\dots,n_{ij}$ to obtain the total supply chain revenue $\Pi(d)$. It is then necessary to find a feasible flow $\hat d$ that satisfies

\[ \hat d = \arg\max_{d \in D} \Pi(d), \]

which leads to the following constrained optimization problem:

\[ \max_{d \in D} \Pi(d) = \max_{q_{ijk},\, p_{ij}} \left[ \sum_{i=2}^{l} \sum_{j=1}^{m_i} \sum_{k=1}^{n_{ij}} \pi_{ijk}\left(q_{ij1}, \dots, q_{ijn_{ij}}, v_{ij1}, \dots, v_{ijn_{ij}}, p_{ij}, p_{th}\right) + \sum_{k=1}^{n_{11}} \pi_{11k}\left(q_{111}, \dots, q_{11n_{11}}, v_{111}, \dots, v_{11n_{11}}, p_{11}\right) \right], \quad p_{th}:\ (i,j) \in S_h^t; \tag{41} \]

\[ p_{lj} = a_{lj} - b_{lj} \sum_{k=1}^{n_{lj}} q_{ljk}, \quad j = 1,\dots,m_l; \tag{42} \]

\[ \sum_{r=1}^{n_{th}} q_{thr} = \sum_{i,j:\,(i,j) \in S_h^t} \ \sum_{k=1}^{n_{ij}} q_{ijk}, \quad t,h:\ x_h^t \notin X_l; \tag{43} \]

\[ q_{ijk} \ge 0, \quad i = 1,\dots,l,\ j = 1,\dots,m_i,\ k = 1,\dots,n_{ij}; \tag{44} \]

\[ p_{ij} \ge 0, \quad i = 1,\dots,l,\ j = 1,\dots,m_i. \tag{45} \]

From the properties of the function $\Pi(d)$ being maximized and the form of the constraints (42)-(45) we conclude that (41)-(45) is a nonlinear optimization problem with linear equality and inequality constraints.

To solve this optimization problem, a program was created in the MATLAB environment. The program performs an iterative search for the maximum point under equality and inequality constraints, based on the sequential quadratic programming method.

The optimization problem (41)-(45), and hence the results obtained from its solution, has only one drawback, but a very substantial one: it requires an additional imputation system afterwards, because under the optimal volumes obtained, which do maximize the revenue of the whole supply chain, the revenue of some participants is zero or negative. For this reason, once the optimal flow in the chain has been identified, a system of contracts among all the participants is needed that states explicitly how the total revenue received is to be allocated. In real life, however, this is very often difficult to implement. Let us therefore consider a method that uses an alternative formulation of the optimization problem and does not require any additional mathematical instruments afterwards.
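The reason why individual revenues are not pinned down by problem (41)-(45) can be made explicit by a short calculation. The derivation below is a sketch added for illustration; it uses only the revenue functions of Section 2 and the balance condition (2).

```latex
% Every firm of a non-root node (i,j) pays p_{rh} q_{ijk} to its supplier node (r,h).
% Summed over the sector of (r,h), these payments equal the sales income of (r,h):
\sum_{(i,j) \in S_h^r} p_{rh}\, Q_{ij} = p_{rh}\, Q_{rh} \quad \text{by (2)}.

% Hence all internal prices cancel in the total, and with (3) for the final nodes:
\Pi(d) = \sum_{i,j,k} \pi_{ijk}(d)
       = \sum_{j=1}^{m_l} \left( a_{lj} - b_{lj} Q_{lj} \right) Q_{lj}
         - \sum_{i=1}^{l} \sum_{j=1}^{m_i} \sum_{k=1}^{n_{ij}} v_{ijk}\, q_{ijk}.
```

The total revenue is thus a quadratic function of the volumes only. The internal prices $p_{ij}$ of the non-final nodes do not affect $\Pi(d)$; they only redistribute it among the participants, which is why a separate allocation rule is needed.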

5. Formalization of the coordination problem using the weighted Nash solution

Consider a game in normal form, i.e. a tuple $\Gamma = \langle N, \{Y_i\}_{i \in N}, \{H_i\}_{i \in N} \rangle$, where $N = \{1, 2, \dots, n\}$ is a nonempty set of players, $Y_i$ is the set of strategies of player $i$, and $H_i$ is the payoff function of player $i$, defined on the Cartesian product $Y = \prod_{i \in N} Y_i$ of the players' strategy sets, $H_i : Y \to R$. We take the ordered set $N = \{x_1^1, x_1^2, x_2^2, \dots, x_{m_l}^l\}$ of all nodes of the supply chain as the set of players, and the sets $U_{ij}$ defined by formula (4) as the strategy sets of the players $x_j^i \in N$. To each player $x_j^i \in N$ we assign the vector $\pi_j^i = (\pi_{ij1}, \pi_{ij2}, \dots, \pi_{ijn_{ij}})$, and as the payoff functions of the players we take the collection of these vectors ordered in the same way as the set of players: $\pi = \{\pi_1^1, \dots, \pi_{m_l}^l\}$.

Let $\pi^* = \{\pi^*_{ijk}\}_{i,j,k}$ denote the revenues of all supply chain participants obtained in the decentralized solution of the coordination problem for the same supply chain. Define the function

\[ \Phi(d) = \prod_{i=1}^{l} \prod_{j=1}^{m_i} \prod_{k=1}^{n_{ij}} \left( \pi_{ijk}(d) - \pi^*_{ijk} \right)^{\alpha_{ijk}}, \]

where $\alpha_{ijk}$ are numbers such that $\alpha_{ijk} > 0$ for all $i = 1,\dots,l$, $j = 1,\dots,m_i$, $k = 1,\dots,n_{ij}$, and

\[ \sum_{i=1}^{l} \sum_{j=1}^{m_i} \sum_{k=1}^{n_{ij}} \alpha_{ijk} = 1. \]

Then the solution of the following constrained optimization problem is, on the one hand, the weighted Nash solution and, on the other hand, a Pareto-optimal flow in the supply chain:

\[ \max_{q_{ijk},\, p_{ij}} \left[ \prod_{i=2}^{l} \prod_{j=1}^{m_i} \prod_{k=1}^{n_{ij}} \left( \pi_{ijk}\left(q_{ij1}, \dots, q_{ijn_{ij}}, v_{ij1}, \dots, v_{ijn_{ij}}, p_{ij}, p_{th}\right) - \pi^*_{ijk} \right)^{\alpha_{ijk}} \times \prod_{k=1}^{n_{11}} \left( \pi_{11k}\left(q_{111}, \dots, q_{11n_{11}}, v_{111}, \dots, v_{11n_{11}}, p_{11}\right) - \pi^*_{11k} \right)^{\alpha_{11k}} \right], \quad p_{th}:\ (i,j) \in S_h^t; \tag{46} \]

\[ \pi_{ijk} \ge \pi^*_{ijk}, \quad i = 1,\dots,l,\ j = 1,\dots,m_i,\ k = 1,\dots,n_{ij}; \tag{47} \]

\[ p_{lj} = a_{lj} - b_{lj} \sum_{k=1}^{n_{lj}} q_{ljk}, \quad j = 1,\dots,m_l; \tag{48} \]

\[ \sum_{r=1}^{n_{th}} q_{thr} = \sum_{i,j:\,(i,j) \in S_h^t} \ \sum_{k=1}^{n_{ij}} q_{ijk}, \quad t,h:\ x_h^t \notin X_l; \tag{49} \]

\[ q_{ijk} \ge 0, \quad i = 1,\dots,l,\ j = 1,\dots,m_i,\ k = 1,\dots,n_{ij}; \tag{50} \]

\[ p_{lj} \ge 0, \quad j = 1,\dots,m_l. \tag{51} \]

To solve this nonlinear optimization problem with nonlinear constraints, a program was created in MATLAB that performs an iterative search for the optimal solution under the given equality and inequality constraints using the sequential quadratic programming method, which the authors regard as the most effective method for the constrained optimization of nonlinear functions.
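The objective (46) together with constraint (47) can be evaluated for a given flow as follows. This is a minimal sketch, not the MATLAB program mentioned above: the function name is hypothetical, the revenues are passed in as flat lists, and the numbers in the usage lines are made up.

```python
import math

def weighted_nash_product(revenues, status_quo, weights):
    """Phi(d) = prod_k (pi_k(d) - pi*_k) ** alpha_k for one flow d.

    revenues   -- revenues pi_ijk(d) of all firms, flattened into one list
    status_quo -- decentralized revenues pi*_ijk in the same order
    weights    -- positive weights alpha_ijk that sum to 1
    Returns None when constraint (47) is violated for some firm.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    gains = [r - s for r, s in zip(revenues, status_quo)]
    if any(g < 0 for g in gains):   # some firm is worse off than at the status quo
        return None
    return math.prod(g ** w for g, w in zip(gains, weights))

# Two firms with equal weights: gains 4 and 9 give sqrt(4) * sqrt(9).
print(weighted_nash_product([5.0, 10.0], [1.0, 1.0], [0.5, 0.5]))
# A flow that leaves the first firm below its status quo revenue is rejected.
print(weighted_nash_product([0.5, 10.0], [1.0, 1.0], [0.5, 0.5]))
```

In an actual solver it is common to maximize the logarithm $\sum \alpha_{ijk} \ln(\pi_{ijk} - \pi^*_{ijk})$ instead, which has the same maximizer on the set where all gains are positive.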

6. Example and comparison of the solutions

Let us consider a specific example of a supply chain and compare the solutions obtained by each of the methods. Let the supply chain have the parameters given in Table 2.

Table 2. Values of the supply chain parameters

Node $x_1^1$: number of firms $n_{11} = 2$; unit production costs $v_{111} = 1500$, $v_{112} = 1505$.
Node $x_1^2$: number of firms $n_{21} = 1$; unit production costs $v_{211} = 700$.
Node $x_1^3$: number of firms $n_{31} = 4$; unit production costs $v_{311} = 342$, $v_{312} = 340$, $v_{313} = 338$, $v_{314} = 345$.
Node $x_2^3$: number of firms $n_{32} = 2$; unit production costs $v_{321} = 120$, $v_{322} = 122$.

For this supply chain we successively find the decentralized solution, then the centralized solution, and finally the weighted Nash solution, in which the following weight coefficients are used:

\[ \alpha_{111} = \alpha_{112} = \frac{1}{3}; \qquad \alpha_{211} = \frac{2}{9}; \qquad \alpha_{311} = \alpha_{312} = \alpha_{313} = \alpha_{314} = \alpha_{321} = \alpha_{322} = \frac{1}{54}. \]

Comment. These numbers were obtained by the authors' algorithm for computing the weight coefficients, according to which the largest weight is assigned to the root node and the weights then decrease from level to level.

Let us find the decentralized solution for this example. The revenue functions of all the firms of the third-level nodes have the form (52) and (53):

\[ \begin{aligned} \pi_{311} &= q_{311} \Big( 5000 - 0.25 \sum_{j=1}^{4} q_{31j} - p_{11} - 342 \Big), \\ \pi_{312} &= q_{312} \Big( 5000 - 0.25 \sum_{j=1}^{4} q_{31j} - p_{11} - 340 \Big), \\ \pi_{313} &= q_{313} \Big( 5000 - 0.25 \sum_{j=1}^{4} q_{31j} - p_{11} - 338 \Big), \\ \pi_{314} &= q_{314} \Big( 5000 - 0.25 \sum_{j=1}^{4} q_{31j} - p_{11} - 345 \Big); \end{aligned} \tag{52} \]

\[ \begin{aligned} \pi_{321} &= q_{321} \Big( 6000 - 0.09 \sum_{j=1}^{2} q_{32j} - p_{21} - 120 \Big), \\ \pi_{322} &= q_{322} \Big( 6000 - 0.09 \sum_{j=1}^{2} q_{32j} - p_{21} - 122 \Big). \end{aligned} \tag{53} \]

Applying the necessary condition for a maximum to all the functions in (52) and (53), we obtain two systems of equations, respectively:

\[ \begin{pmatrix} 0.5 & 0.25 & 0.25 & 0.25 \\ 0.25 & 0.5 & 0.25 & 0.25 \\ 0.25 & 0.25 & 0.5 & 0.25 \\ 0.25 & 0.25 & 0.25 & 0.5 \end{pmatrix} \begin{pmatrix} q_{311} \\ q_{312} \\ q_{313} \\ q_{314} \end{pmatrix} = \begin{pmatrix} 4658 - p_{11} \\ 4660 - p_{11} \\ 4662 - p_{11} \\ 4655 - p_{11} \end{pmatrix}; \tag{54} \]

\[ \begin{pmatrix} 0.18 & 0.09 \\ 0.09 & 0.18 \end{pmatrix} \begin{pmatrix} q_{321} \\ q_{322} \end{pmatrix} = \begin{pmatrix} 5880 - p_{21} \\ 5878 - p_{21} \end{pmatrix}. \tag{55} \]

Solving systems (54) and (55), we obtain the expressions for $q_{3jk}$:

\[ \begin{cases} q_{311} = 3724 - 0.8 p_{11}, \\ q_{312} = 3732 - 0.8 p_{11}, \\ q_{313} = 3740 - 0.8 p_{11}, \\ q_{314} = 3712 - 0.8 p_{11}; \end{cases} \tag{56} \]

\[ \begin{cases} q_{321} = \frac{1}{27} (588200 - 100 p_{21}), \\ q_{322} = \frac{1}{27} (587600 - 100 p_{21}). \end{cases} \tag{57} \]

The no-deficit, no-surplus condition gives

\[ Q_{32} = q_{321} + q_{322} = \frac{1175800}{27} - \frac{200}{27} p_{21} = Q_{21} = q_{211}, \]

from which the value of the variable $p_{21}$ can be expressed:

\[ p_{21} = 5879 - 0.135\, q_{211}. \tag{58} \]

For the only firm of the node $x_1^2$ the revenue function is

\[ \pi_{211} = q_{211}\,(p_{21} - p_{11} - 700); \]

substituting (58) into it, we find

\[ \pi_{211} = q_{211}\,(5879 - 0.135\, q_{211} - p_{11} - 700). \]

Applying the necessary condition for a maximum to this expression yields

\[ \frac{\partial \pi_{211}}{\partial q_{211}} = 0 \ \Rightarrow \ q_{211} = \frac{517900}{27} - \frac{100}{27} p_{11}. \tag{59} \]

The no-surplus, no-deficit condition at the root node can be written as

\[ Q_{11} = q_{111} + q_{112} = Q_{21} + Q_{31} = q_{211} + \sum_{i=1}^{4} q_{31i}, \]

from which, after substituting (56) and (59), $p_{11}$ can be expressed:

\[ p_{11} = \frac{1150520}{233} - \frac{135}{932} (q_{111} + q_{112}). \tag{60} \]

Firms 1 and 2 of the root node have the following revenue functions, respectively:

\[ \pi_{111} = q_{111}\,(p_{11} - 1500), \tag{61} \]

\[ \pi_{112} = q_{112}\,(p_{11} - 1505), \tag{62} \]

which after substituting (60) take the form

\[ \pi_{111} = q_{111} \left( \frac{1150520}{233} - \frac{135}{932} (q_{111} + q_{112}) - 1500 \right), \tag{63} \]

\[ \pi_{112} = q_{112} \left( \frac{1150520}{233} - \frac{135}{932} (q_{111} + q_{112}) - 1505 \right). \tag{64} \]

Applying the necessary condition for a maximum to (63) and (64), we obtain the system

\[ \begin{pmatrix} \frac{135}{466} & \frac{135}{932} \\ \frac{135}{932} & \frac{135}{466} \end{pmatrix} \begin{pmatrix} q_{111} \\ q_{112} \end{pmatrix} = \begin{pmatrix} \frac{801020}{233} \\ \frac{799855}{233} \end{pmatrix}, \]

whose unique solution has the form

\[ \begin{cases} q_{111} = \frac{213916}{27} \approx 7923, \\ q_{112} = \frac{212984}{27} \approx 7888. \end{cases} \tag{65} \]

Substituting the values (65) into equation (60), we get

\[ p_{11} = \frac{616895}{233} \approx 2648. \tag{66} \]

Substituting the value (66) into (56) and (59), we obtain the following:

374176 q = 1606, 311 233 ≈ 376040 q312 = 1614, 233 ≈ (67) 377904 q = 1622, 313 233 ≈ 371380 q = 1594, 314 233 ≈ Coordination in Multilevel Supply Chain 371

19660400 q = 9375; (68) 211 2097 ≈ Then, after having substituted (68) in the formula (58) we will find the meaning for p21: 1074901 p = 4613. (69) 21 233 ≈

By substituting (68) to the formula (58) let us find q321 and q322 from the equation (57): 3284500 q = 4699; (70) 321 699 ≈

9806900 q = 4677. 322 2097 ≈ Finally, using the calculated meanings of production volumes (67) and (70) in the 3 3 finite junctures x1 and x2, let us calculate the meaning of optimal prices p31 and p32 with the usage of the supply function:

790125 p = 3391, 31 233 ≈

1201396 p = 5156. 32 233 ≈ Knowing the equilibrium meanings of all variables, we can calculate the revenue of every participant and then receive the overall supply chain revenue that is equal to:

53525765475416 Πd = 3.6516 107. (71) 1465803 ≈ ·
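The chain of substitutions (56)-(70) can be checked with exact rational arithmetic. The following is only a minimal sketch under the assumption that the coefficients are exactly those printed in equations (56)-(62); the function and variable names are illustrative and are not taken from the authors' MATLAB code. The final prices $p_{31}$ and $p_{32}$ are not recomputed, because the supply functions of the terminal markets are not restated in this part of the text.

```python
from fractions import Fraction as F

# Data copied from the example: marginal costs in (61)-(62) and the
# coefficients of relation (60), p11 = A - B * (q111 + q112).
C111, C112 = 1500, 1505
A, B = F(1150520, 233), F(135, 932)


def solve_root_stage():
    """First-order conditions of (63)-(64), solved by Cramer's rule.

    2B*q111 +  B*q112 = A - C111
     B*q111 + 2B*q112 = A - C112
    """
    r1, r2 = A - C111, A - C112
    det = 3 * B * B  # determinant of [[2B, B], [B, 2B]]
    q111 = (2 * B * r1 - B * r2) / det
    q112 = (2 * B * r2 - B * r1) / det
    return q111, q112


def back_substitute(q111, q112):
    """Recover prices and downstream volumes from the root-stage outputs."""
    p11 = A - B * (q111 + q112)                                  # (60)
    q31 = [a - F(4, 5) * p11 for a in (3724, 3732, 3740, 3712)]  # (56)
    q211 = F(517900, 27) - F(100, 27) * p11                      # (59)
    p21 = 5879 - F(27, 200) * q211                               # (58)
    q321 = (588200 - 100 * p21) / 27                             # (57)
    q322 = (587600 - 100 * p21) / 27
    return {"p11": p11, "q31": q31, "q211": q211, "p21": p21,
            "q321": q321, "q322": q322}


q111, q112 = solve_root_stage()
sol = back_substitute(q111, q112)

# Market clearing at both intermediate levels should hold exactly.
assert q111 + q112 == sol["q211"] + sum(sol["q31"])
assert sol["q321"] + sol["q322"] == sol["q211"]

pi111 = q111 * (sol["p11"] - C111)                     # (61)
pi112 = q112 * (sol["p11"] - C112)                     # (62)
pi211 = sol["q211"] * (sol["p21"] - sol["p11"] - 700)

for name, value in [("q111", q111), ("q112", q112), ("p11", sol["p11"]),
                    ("q211", sol["q211"]), ("p21", sol["p21"]),
                    ("pi111", pi111), ("pi112", pi112), ("pi211", pi211)]:
    print(f"{name} = {value} ~ {float(value):.1f}")
```

Using `Fraction` instead of floating point makes it possible to compare the results with the printed fractions directly rather than only with their rounded values.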

Now let us find the optimal values by solving, with the help of the MATLAB platform, the total revenue maximization problem for the centralized supply chain model and the weighted Nash solution maximization problem. All the obtained values are collected in a single table (Table 3) for ease of comparison. Comparing the total chain revenue in the decentralized and centralized models, we notice that under centralized behavior of the participants the total revenue of the chain increases by $1.1042 \cdot 10^{7}$, or by approximately 30%. A number of analogous numerical experiments have shown that chain centralization yields on average a 25% gain in total revenue in comparison with the decentralized model. Moreover, it is clear from the table that the weighted Nash arbitration solution increases the total profit of the supply chain by approximately 29% relative to the revenue in the Nash equilibrium of the decentralized model. This result is slightly worse than the one obtained by solving the overall supply chain revenue maximization problem. However, the weighted Nash arbitration solution guarantees each participant a positive gain and does not require an imputation procedure.

Table 3. Values of variables and revenue (all entries are approximate)

| Juncture | Variable | Nash equilibrium | Solution of the total profit maximization problem | Weighted Nash arbitration solution |
|---|---|---|---|---|
| $x_1^1$ | Volume of output $q_{111}$ | 7923 | 27566 | 13629 |
| | Volume of output $q_{112}$ | 7888 | 0 | 13126 |
| | Price $p_{11}$ | 2648 | 1553 | 2402 |
| | Revenue $\pi_{111}$ | 9092365 | 60461350 | 12299768 |
| | Revenue $\pi_{112}$ | 9013310 | 0 | 11780161 |
| $x_1^2$ | Volume of output $q_{211}$ | 9375 | 21242 | 21441 |
| | Price $p_{21}$ | 4613 | 0 | 3716 |
| | Revenue $\pi_{211}$ | 11866474 | 16198593 | 13144816 |
| $x_1^3$ | Volume of output $q_{311}$ | 1606 | 0 | 696 |
| | Volume of output $q_{312}$ | 1614 | 0 | 1193 |
| | Volume of output $q_{313}$ | 1622 | 6324 | 2424 |
| | Volume of output $q_{314}$ | 1594 | 0 | 1001 |
| | Price $p_{31}$ | 3391 | 3419 | 3671 |
| | Revenue $\pi_{311}$ | 644733 | 0 | 644770 |
| | Revenue $\pi_{312}$ | 651173 | 0 | 1108408 |
| | Revenue $\pi_{313}$ | 657644 | −4285648 | 2256850 |
| | Revenue $\pi_{314}$ | 635134 | 0 | 925272 |
| $x_2^3$ | Volume of output $q_{321}$ | 4699 | 21242 | 12018 |
| | Volume of output $q_{322}$ | 4677 | 0 | 9423 |
| | Price $p_{32}$ | 5156 | 4088 | 4070 |
| | Revenue $\pi_{321}$ | 1987132 | −24758272 | 2821735 |
| | Revenue $\pi_{322}$ | 1968381 | 0 | 2193425 |
| Total chain | Revenue | $3.65 \cdot 10^{7}$ | $4.76 \cdot 10^{7}$ | $4.72 \cdot 10^{7}$ |
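The comparison drawn from Table 3 can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the rounded totals and the per-firm revenues of the two cooperative solutions exactly as printed in the table, and the dictionary keys are hypothetical labels introduced here.

```python
# Rounded total chain revenues as printed in the last row of Table 3.
totals = {
    "nash_equilibrium": 3.65e7,
    "total_profit_max": 4.76e7,
    "weighted_nash": 4.72e7,
}

# Per-firm revenues of the two cooperative solutions, copied from Table 3.
centralized = {
    "pi111": 60461350, "pi112": 0, "pi211": 16198593,
    "pi311": 0, "pi312": 0, "pi313": -4285648, "pi314": 0,
    "pi321": -24758272, "pi322": 0,
}
weighted_nash = {
    "pi111": 12299768, "pi112": 11780161, "pi211": 13144816,
    "pi311": 644770, "pi312": 1108408, "pi313": 2256850, "pi314": 925272,
    "pi321": 2821735, "pi322": 2193425,
}


def relative_gain(new, base):
    """Relative increase of `new` over `base`."""
    return (new - base) / base


gain_centralized = relative_gain(totals["total_profit_max"], totals["nash_equilibrium"])
gain_weighted = relative_gain(totals["weighted_nash"], totals["nash_equilibrium"])

# Firms that end up with a negative revenue under total profit maximization.
losers = sorted(name for name, value in centralized.items() if value < 0)

print(f"centralized gain: {gain_centralized:.1%}")
print(f"weighted Nash gain: {gain_weighted:.1%}")
print("negative revenue under centralization:", losers)
print("all revenues positive under weighted Nash:",
      all(value > 0 for value in weighted_nash.values()))
```

The negative entries in the centralized column are the reason an imputation procedure is needed there, whereas every entry of the weighted Nash column is positive.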

7. Conclusions

In this paper we have analyzed supply chains with a tree-like distribution structure, in which each juncture represents a set of competing firms that produce and consume a homogeneous product and have different production costs, while the junctures themselves do not compete with each other. It was assumed that the markets in which the final products are sold by the terminal junctures do not compete with each other and operate under the Cournot model with linear supply functions. We have discussed the question of coordinating the participants, i.e. the problem of choosing strategies that satisfy predefined optimality criteria. The multilevel tree-like supply chain was formalized mathematically by means of a tree graph, and three solutions to the coordination problem were proposed: the decentralized solution, the centralized solution and the weighted Nash solution.

The search for the decentralized solution resulted in finding the absolute Nash equilibrium in a multilevel hierarchical game with complete information, for which we constructed an algorithm for finding this equilibrium. For the case of centralized behavior of the participants in a supply chain with the analyzed structure, the coordination problem was formulated as a nonlinear constrained optimization problem. Numerical simulation has shown that this approach increases the total revenue of the supply chain by 25% on average, but it does not guarantee a positive gain to all participants and therefore requires an imputation scheme to be implemented.

The analysis of the numerical results prompted us to look for an alternative approach to supply chain coordination. The weighted Nash solution was chosen as such an approach: as was found experimentally, it gives a smaller gain in revenue than the centralized solution but guarantees a positive gain to all participants.

References

Petrosyan, L. A., Zenkevich, N. A. and E. V. Shevkoplyas (2014). Game theory. 2nd Edition. BCV-Press, Saint-Petersburg, 432 p.
Adida, E., DeMiguel, V. (2011). Supply chain competition with multiple manufacturers and retailers. Operations Research, 59(1), 156–172.
Cachon, G. P. (2003). Supply chain coordination with contracts. Handbooks in Operations Research & Management Science, 11, 227–339.
Carr, M. S., Karmarkar, U. S. (2005). Competition in multi-echelon assembly supply chains. Management Science, 51, 45–59.
Cho, S.-H. (2014). Horizontal mergers in multi-tier decentralized chains. Management Science, 51, 45–59.
Corbett, C., Karmarkar, U. S. (2001). Competition and structure in serial supply chains with deterministic demand. Management Science, 47, 966–978.
Gasratov, M. G., Zacharov, V. V. (2011). Game-theoretic approach for supply chains optimization in case of deterministic demand. Game theory and applications, 3(1), 23–59.
Gorbaneva, O. I., Ougolnitsky, G. A. (2016). Static models of concordance of private and public interests in resource allocation. Game theory and applications, 8(2), 28–57.
Kaya, M., Ozer, O. (2012). Pricing in business-to-business contracts: sharing risk, profit and information. The Oxford Handbook of Pricing Management. Oxford: Oxford University Press, 738–783.
Laseter, T., Oliver, K. (2003). When will supply chain management grow up? Strategy+business, Issue 32.
Tyagi, R. K. (1999). On the effect of downstream entry. Management Science, 45, 59–73.
Vickers, J. (1995). Competition and regulation in vertically related markets. Review of Economic Studies, 62, 1–17.
Zenkevich, N. A., Zyatchin, A. V. (2016). Strong coalitional structure in a transportation game. Game theory and applications, 8(1), 63–79.
Zhou, D., Karmarkar, U. S., Jiang, B. (2015). Competition in multi-echelon distributive supply chains with linear demand. International Journal of Production Research, 53(22), 6787–6807.
Ziss, S. (1995). Vertical separation and horizontal mergers. Journal of Industrial Economics, 43, 63–75.

Contributions to Game Theory and Management, X, 375–395

Strategic Alliances Stability Factors ⋆

Nikolay Zenkevich and Anastasiia Reusova Saint Petersburg State University, Graduate School of Management, St. Petersburg, 199004, Russia, Volkhovskiy Pereulok, 3 E-mail: [email protected] [email protected]

Abstract The article extends the line of research on strategic alliance stability, which has been studied widely in the academic literature for the past decades. Contrary to the majority of existing papers, this study adopts a multi-dimensional view on strategic alliance stability, and differentiates between two major stability components: internal and external stability. Direct and indirect effects of trust, resource complementarity and partners’ long-term orientation on external and internal stability were studied in the paper. Using structural equation modeling (SEM) as an empirical method, the research shows that (1) internal stability is positively influenced by trust and resource complementarity, while (2) external stability is positively affected by partners’ long-term orientation. Moreover, (3) the study supported a hypothesis about a positive relationship between external and internal stability.

Keywords: strategic alliance stability, internal stability, external stability, trust, long-term orientation, resource complementarity.

1. Introduction

Strategic alliances (SA) are widely recognized to be a form of inter-organizational relationships that helps firms withstand competition in a complex business environment (Akkaya, 2007) and create customer value (Iyer, 2002; Umukoroa, Sulaimonb, Kuyeb, 2009). At the same time, some scholars estimate the failure rate of strategic alliances to be as high as 60-65% due to unmet objectives, failed expectations or other reasons (Geringer and Hebert, 1991; Umukoroa, Sulaimonb, Kuyeb, 2009; Gibbs and Humphries, 2016).

As a phenomenon, the stability of long-term cooperative decisions, and strategic alliance stability in particular, is acknowledged to be a fundamental problem that has been studied in the academic literature for the last 30 years. The drawback of most studies on the topic is that they view strategic alliance stability as a static (Jiang, Li and Gao, 2008) and one-dimensional concept (Zenkevich, Koroleva, Mamedova, 2014a), while relationships between partners in an alliance are certainly dynamic, which makes their management challenging at the very least (Douma et al., 2000; Buffenoir, Bourdon, 2013). Therefore, this study is aimed at providing an integrated approach to the concept of strategic alliance stability and its factors.

2. Stability in Strategic Alliances: Theoretical Framework

A strategic alliance (SA) can be defined as a long-term cooperative agreement between partner companies that stay legally independent from each other after alliance formation, share cooperation benefits and governance control over defined objectives and are continuously involved in one or more strategically important areas (Zenkevich, Koroleva, Mamedova, 2014a).

Managing an alliance in a way that promotes cooperation between partners and decreases opportunistic behavior is a highly relevant topic for alliance management. Despite all the advantages that strategic alliances are intended to bring to partner companies, alliance involvement might lead to unexpected and/or unwanted states and events for individual firms in an alliance (Kolenak, 2007). It is not uncommon that such issues lead to deteriorated performance and can cause premature alliance termination (Geringer and Hebert, 1991; Umukoroa, Sulaimonb and Kuyeb, 2009). In part, this phenomenon is addressed in the light of strategic alliance stability (SAS).

⋆ This work is supported by the Russian Foundation for Basic Research, project No. 16-01-00805A

2.1. Strategic Alliance Stability Definition and Conceptualization: Merging Game Theoretic and Managerial Perspective

The focus of researchers on strategic alliance stability has been split between two general concepts: strategic alliance stability and strategic alliance instability (Jiang, Li and Gao, 2008). See Table 1 for reference. It appears that strategic alliance instability rather than strategic alliance stability was the first and dominant focus of numerous studies (e.g., Franko, 1971; Killing, 1982, 1983; Gomes-Casseres, 1987; Inkpen and Beamish, 1997; Yan and Zeng, 1999; Das and Teng, 2000; Gill and Butler, 2003; Nakamura, 2005). Moreover, authors quite often do not conceptually differentiate between SA stability and instability, and sometimes switch between the two in one study (e.g., Yan, 1998; Yan and Zeng, 1999).

The definition of Zenkevich, Koroleva and Mamedova (2014a) is adopted in this paper as the working definition of stability, since it provides a comprehensive approach to the concept and makes it possible to assess SAS with some degree of precision, at least in some aspects.

Based on previous studies of the stability of cooperative relationships in game theory (Moor, 1971; Zenkevich, Petrosyan and Yeung, 2009; Gill and Butler, 2003; Wong, Tjosvold and Zhang, 2005; Kumar, 2011), Zenkevich, Koroleva and Mamedova (2014a, b) introduce several components of strategic alliance stability on two levels. On the first level, there is external and internal, or cooperative, stability. On the second level, the internal (cooperative) stability of strategic alliances is comprised of motivational, strategic and dynamic stability. The overall stability scheme is presented in Fig. 1.

The concept of external stability implies assessing the stability of an alliance as if it were a separate economic entity. Such an evaluation is conventionally done with the help of economic indicators. In the case of a strategic alliance, external stability is implied when the alliance’s economic results have a rising trend. In this context, the economic results of the strategic alliance can include net profit, revenue, market share, etc. If the trend is long-term, partner companies perceive the strategic alliance as a successful one, so they have a lasting motivation to maintain cooperation. It is important to consider the long-term trend because in the short term a strategic alliance might experience losses (e.g., due to the initial stages of alliance implementation, unfavorable external conditions, etc.), which will be perceived as “natural” and will not significantly weaken the participants’ cooperative intent as long as the long-term trend is positive.

Table 1. Definitions of strategic alliance stability/instability

| Academic paper | Definition |
|---|---|
| (Zenkevich, Koroleva and Mamedova, 2014a) | “Strategic alliance stability should be understood as a success of alliance performance during the period of alliance operations under conditions of constant motivation of each partner firm to maximize the results of cooperation.” |
| (Jiang, Li and Gao, 2008) | “. . . we define alliance stability as the degree to which an alliance can run and develop successfully based on an effective collaborative relationship shared by all partners.” |
| (Huang, 2003); (Hong, Yu and Zhichao, 2011) | “Stability, means in the process of movement, or interference, whether or not the system can keep its former state. As for the specific strategic alliance, it means that the strategic alliance, as an organization can keep its stable state, it is a dynamic stability, relative stability.” |
| (Inkpen and Beamish, 1997); (Das, Teng, 2000); (Sim and Ali, 2000) | “. . . joint venture is considered unstable if the partners’ equity holding in the joint venture changed (including take-over by one partner) since the formation or the venture is terminated. Termination as a result of a project ending was not included.” |
| (Qing and Zhang, 2015) | “. . . instability of such an [a competitive] alliance means short and fragile cooperation, and the failure of alliance” |

Source: augmented from (Zenkevich, Koroleva, Mamedova, 2014a)

Fig. 1. Strategic alliance stability structure. Source: (Zenkevich, Koroleva, Mamedova, 2014a)

At the same time, a strategic alliance is an agreement between companies that are eager to attain their own objectives within the alliance, which explains the need to introduce the concept of internal (or cooperative) stability, which is well studied in game theory. Not only has game theory thoroughly studied different components of the internal stability of cooperative relationships, but it has also developed a holistic approach to its evaluation (Zenkevich, Petrosyan and Yeung, 2009). In managerial studies, internal alliance stability has been best described in papers dedicated to the issues of strategic management (e.g., Gill and Butler, 2003; Wong, Tjosvold and Zhang, 2005; Kumar, 2011).

An important assumption for the conceptualization of internal stability is that partners in a strategic alliance are rational, which is why they enter a strategic alliance expecting that the benefits of their cooperation will exceed the possible benefits of operating individually (Zenkevich, Koroleva and Mamedova, 2014a; Qing and Zhang, 2015).

Looking more closely at the structure of internal stability, motivation to cooperate is acknowledged to be essential for strategic alliance stability. Zenkevich, Koroleva and Mamedova (2014a) explain that motivational stability means that partners find it beneficial to actively contribute to alliance operations, or actively commit to alliance activities (Kumar, Scheer and Steenkamp, 1995), because such behavior will increase the overall benefits of the alliance and, hence, the individual benefits of each partner (Gulati, Khanna and Nohria, 1994; Sarkar et al., 2001). Such a definition of strategic alliance stability is close to the understanding of commitment introduced by Das and Teng (1998) and described above.
Motivational stability is determined not only by economic factors and their trends, but also by the relationships among alliance participants (Deitz et al., 2010; Hunt, Lambe and Wittmann, 2002). Motivation for further cooperation is supported by such factors as trust (Anderson and Weitz, 1989; Huo, Ye and Zhao, 2015), respect for cross-cultural differences (Doz and Hamel, 1998; Yan and Luo, 2001), shared goals and objectives (Anderson and Weitz, 1989; Ozorhon et al., 2008) and participants’ commitment (Kumar, Scheer and Steenkamp, 1995).

One can say that an alliance partner is committed to the alliance if it contributes the resources and capabilities necessary for alliance success (Jiang, Li and Gao, 2008). Partners’ commitment has a positive influence on the partners’ relationship because it indicates that alliance partners are loyal and long-term oriented, which increases the levels of reciprocity and cooperation. If partners are committed to the relationship, they are less likely to deviate from cooperation. On the contrary, when partners are not committed to the alliance, they are not likely to establish close cooperation with each other, which destabilizes the relationship. Committed partners tend to evaluate positively the chance of receiving the expected benefits during the lifetime of an alliance (Zaheer and Venkatraman, 1995).

As mentioned above, strategic stability is well studied in game theory (Petrosjan, 1977; Zenkevich, 2009). Assuming that strategic alliance partners are rational, the fact that partners decide to form a strategic alliance means that they find this form of cooperation to be the most beneficial for them compared to all other opportunities in the market, including other partnerships and the opportunity to operate alone.
However, when the strategic alliance is in the implementation phase, some of the partners might come to regard staying within the alliance as no longer beneficial and might be willing to exit the alliance. Strategic stability of a strategic alliance assumes that none of the partners finds it beneficial to deviate from the cooperative agreement while the other partners adhere to it.

Dynamic stability is examined in game theory along with strategic stability as a part of the internal stability of cooperative relationships (Zenkevich, 2009). Dynamic stability of strategic alliances refers to benefit sharing in an alliance, or the payoff structure. The payoff structure is an important issue for alliance partners, as they are motivated not only by the economic benefits generated by the alliance as an economic entity, but also by the benefits that are allocated to them individually (Umukoroa, Sulaimonb and Kuyeb, 2009). It has been mentioned by Franko (1971) that an alliance is stable rather than unstable when partners agree on the initial profit-sharing mechanism and are satisfied with it.

At the stage of alliance formation, partners form an understanding of what kind of benefits, and in what quantity, they find to be fair in comparison with all the threats and possible disadvantages, such as opportunity costs, that they are likely to face due to alliance participation, and with all the inputs they have to make for cooperation. The alliance is dynamically stable when, at each moment of time, the sum of the benefits gained and expected by a partner corresponds to the amount and type of benefits the partner had been expecting to gain when signing the cooperation contract. Dynamic stability assumes that this principle holds for each of the partners in a strategic alliance.
If a partner realizes that it will not be possible to obtain all the benefits that had been expected from the alliance, the partner’s motivation to continue alliance participation might decrease or even disappear (Zenkevich, Koroleva, Mamedova, 2014a).

Nevertheless, although a well-developed payoff structure is necessary for alliance success and stability (Khanna et al., 1998), it is not a sufficient condition for alliance stability on its own (Agarwal, Croson and Mahoney, 2010).

2.2. Strategic Alliance Stability Factors: Hypotheses Development

For the purpose of this research, strategic alliance stability was analyzed as a multidimensional construct; however, the distinction among strategic alliance stability components was made at the most aggregate level: between external and internal stability. Given that little has been done to merge the game-theoretic approach to strategic alliance stability conceptualization, which is comprehensive and all-inclusive, with broader managerial studies that examine strategic alliance stability factors, the benefits of such an SAS conceptualization within the study are clear. First, such an approach preserves the integrity of the concept. Second, with stability conceptualized this way, it becomes possible to identify differences in the relationships between strategic alliance stability factors and the different strategic alliance stability components at the most aggregate level and thus to gain a general understanding of these interconnections. The third benefit is the feasibility of further empirical analysis given the number of constructs to be analyzed in one study.

Partner firms can increase cooperation by altering the factors that affect cooperation (Umukoroa, Sulaimonb and Kuyeb, 2009), thereby influencing strategic alliance stability (Deitz et al., 2010).

Long-term orientation. Studies show that the longer the “shadow of the future”, the less likely partners are to engage in opportunistic activities, because they have to take the possible consequences of such behavior into account (Axelrod, 1984; Heide and Miner, 1992; Das and Rahman, 2010). In turn, long-term orientation lengthens the shadow of the future, making partners dependent on each other’s behavior and their cooperation more vigorous (Das and Rahman, 2010).

Moreover, if partners are long-term oriented, they stay committed to the alliance even in the case of temporary inequalities between them, as they believe that all the inequalities will even out in the long run (Das and Rahman, 2010); therefore, partners will expect to gain at least the amount of benefits indicated by the alliance contract. Long-term orientation of partners also decreases the urge, or the pressure, to gain quick results. The absence of pressure for quick results is especially important for strategic alliances, as they are rarely able to start generating a positive economic outcome right after establishment (Das and Rahman, 2010; Zenkevich, Koroleva and Mamedova, 2014a). If the alliance horizon is set to be long, partners are going to be willing to commit to the relationship and make efforts to preserve it (Ring and Van de Ven, 1994).

As follows from the definition of external SAS, an alliance has to demonstrate an increasing long-term trend in its economic results to be externally stable (Zenkevich, Koroleva and Mamedova, 2014a, b). If partners are long-term oriented, they are likely to believe in the prospects of the alliance (López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013) and, in contrast to partners with a short-term orientation, will not be likely to behave opportunistically, which would have a detrimental effect on the economic results of the alliance (Das and Rahman, 2010). Overall, long-term motivation appears to be important for both internal and external stability of SAs (Zenkevich, Koroleva and Mamedova, 2014a). The following hypotheses are put forward:

H1: Long-term orientation is positively associated with external stability of a strategic alliance
H2: Long-term orientation is positively associated with internal stability of a strategic alliance

Trust. Trust in partner relationships decreases uncertainties and therefore positively affects conflict resolution abilities and enhances cooperation (Granovetter, 1985; Madhok, 1995; Deitz et al., 2010). Trust reduces transaction costs by developing a desirable transaction climate (Granovetter, 1985; Madhok, 1995; Huo, Ye, Zhao, 2015). Without mutual trust, partners would be likely to behave opportunistically by taking advantage of ambiguous situations not explicitly defined by the contract (Williamson, 1975), which would affect the cooperation between partners (Das, Rahman, 2010), in particular the perceived payoff equality and fairness, along with partners’ willingness to stay within an alliance and commit to it.

It has also been claimed by scholars that trust has an impact on the degree to which partners are long-term oriented, as even during hard times for an alliance partners would believe that short-term losses would be compensated by long-term gains (Ganesan, 1994; Lee and Dawes, 2005; Ryu, Park and Min, 2007; Yu and Pysarchik, 2002; Zhao and Cavusgil, 2006; Jiang, Li, Gao, 2008).

However, the association between trust and alliance success in terms of alliance economic performance is not clearly articulated in the literature. It is argued by Nielsen (2007) that trust has a rather indirect impact on the economic results of an alliance and, therefore, on the sequence of economic results over time as well. The following hypotheses are put forward:

H3: Trust is positively associated with internal stability of a strategic alliance
H4: Trust is positively associated with long-term orientation in a strategic alliance

Resource complementarity. Resource complementarity is believed to be a crucial, cornerstone element for reaching and maintaining SAS (Deitz et al., 2010). Deitz et al. (2010) emphasize that partners with complementary resources are able to combine them in a unique way to attain a competitive advantage in the market by extracting value from valuable, rare, durable and inimitable resource combinations (Barney, 1991, 1992). When the competitive level of complementarity is achieved, the probability that partners are willing to change the alliance form or to exit the alliance should decrease significantly (Deitz et al., 2010).

Partners with complementary resources are seen as mutually dependent (Geringer, 1988), as partners’ resource contribution is beneficial for each party by definition. It has been shown in the study of Beamish (1988) that multinational companies are eager to find local partners with complementary resources while expanding their business abroad. On the other hand, Park and Ungson (1997) have shown that low resource complementarity is reflected in increased termination rates of alliances.

By recognizing that a partner supplies resources that complement the firm’s own, a firm also recognizes the original value of its partner for the alliance and the interdependence between partners. Therefore, in this sense resource complementarity leads to increased trust between partners and decreased opportunistic tendencies in a relationship (Morgan and Hunt, 1994; Sarkar et al., 2001). Furthermore, López-Navarro, Callarisa-Fiol and Moliner-Tena (2013) find that resource complementarity influences partner commitment through trust, while finding no support for a direct relationship.

Scholars have proposed and empirically tested the hypotheses that resource complementarity positively influences partner intentions to remain in the JV and cooperative intent, respectively (Deitz et al., 2010; Jiang, Li and Gao, 2008), and Deitz et al. (2010) found support for each case.
Not only is resource complementarity connected to partners’ internal cooperation, but it has also been studied as an antecedent of desirable economic performance due to the synergies created among complementary resources (Lambe, Spekman and Hunt, 2002; Nielsen, 2007). The following hypotheses are put forward:

H5: Resource complementarity is positively associated with external stability of a strategic alliance
H6: Resource complementarity is positively associated with internal stability of a strategic alliance
H7: Resource complementarity is positively associated with partners’ trust

External and internal stability. There is a rationale to assume that external stability, the proxy of which is an upward trend in alliance economic results (Zenkevich, Koroleva and Mamedova, 2014a, b), is positively associated with the internal stability of an SA. As the primary reason for alliance formation is connected to generating economic benefits and gaining an expected financial return (Umukoroa, Sulaimonb and Kuyeb, 2009; Qing and Zhang, 2015), it is expected that economic results are considered by partners during the alliance implementation phase. Moreover, alliance success in the real world is evaluated by partners in comparison with some referent: either another company, the industry, or the alliance itself at a different point in time (Hunt, Lambe and Wittmann, 2002). Therefore, partners continuously evaluate alliance performance and base their decisions on future cooperation on the results of this assessment, deciding how to behave within the alliance, whether or not to stay in the alliance, whether to maintain the same alliance form, etc. (Qing and Zhang, 2015). Therefore, there is a reason to put the following hypothesis forward:

H8: External stability is positively associated with internal stability

2.3. Strategic Alliance Stability Factors: Conceptual Model

The conceptual model of strategic alliance stability factors is depicted in Fig. 2.
Each arrow in the conceptual model represents a hypothesized causal relationship and corresponds to a certain hypothesis. Overall, there are 8 hypotheses on the relationships between SAS factors and SAS components, the connection between SAS components, and the connections between SAS factors. Note that the sign (+) in parentheses stands for a positive association between constructs.

Fig. 2. Conceptual model: strategic alliance stability factors. Source: Adapted from (Deitz et al., 2010; López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013)
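The arrows of the conceptual model can be written down as a small directed graph, which makes it easy to see which constructs act only as predictors, which act only as outcomes, and which play both roles. This is a minimal sketch, assuming the direction of each arrow is the one stated in the wording of H1-H8 in Section 2.2; the dictionary and function names are illustrative and do not come from the paper.

```python
# Hypotheses H1-H8 as directed edges (predictor -> outcome), following Section 2.2.
HYPOTHESES = {
    "H1": ("Long-term orientation", "External stability"),
    "H2": ("Long-term orientation", "Internal stability"),
    "H3": ("Trust", "Internal stability"),
    "H4": ("Trust", "Long-term orientation"),
    "H5": ("Resource complementarity", "External stability"),
    "H6": ("Resource complementarity", "Internal stability"),
    "H7": ("Resource complementarity", "Trust"),
    "H8": ("External stability", "Internal stability"),
}


def classify_constructs(hypotheses):
    """Split constructs by their role in the hypothesized path model."""
    predictors = {src for src, _ in hypotheses.values()}
    outcomes = {dst for _, dst in hypotheses.values()}
    return {
        "predictor_only": sorted(predictors - outcomes),
        "both_roles": sorted(predictors & outcomes),
        "outcome_only": sorted(outcomes - predictors),
    }


roles = classify_constructs(HYPOTHESES)
for role, constructs in roles.items():
    print(f"{role}: {', '.join(constructs)}")
```

Constructs in the `both_roles` group are those that serve as dependent and independent variables at the same time, which is the feature that motivates a path-model approach.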

Given the number of hypotheses and the complex set of interconnections that exist among the constructs, it makes sense to increase model complexity gradually in order to test it. In this way, a deeper understanding of the relationships and of the direct and indirect effects of SAS factors on SAS components might be obtained.

Therefore, the first model to be tested in the following empirical part incorporates only the direct relationships between SAS factors and SAS components (see Fig. 3), which are represented by hypotheses H1, H2, H3, H5 and H6. After the model in Fig. 3 is tested, the direct impact of SAS factors on SAS components can be determined. This differentiation needs to be made in order to define different types of direct and indirect effects.

In the hypotheses scheme in Fig. 4, a new hypothesis (H8) is added to the set of relationships, which makes it possible to examine whether or not External stability is positively associated with Internal stability and, therefore, also to examine the indirect effect of Long-term orientation on Internal stability.

Fig. 5 represents the next set of hypotheses to be tested empirically; it is the last modification of the conceptual model before the final version in Fig. 2. Compared with the model in Fig. 4, the model in Fig. 5 introduces an additional hypothesis H7. By testing the model in Fig. 5, it will be possible to draw conclusions on whether or not Trust plays a mediating role in the relationship between Resource complementarity and Internal stability.

Fig. 3. Hypotheses scheme (1) for empirical test

Fig. 4. Hypotheses scheme (2) for empirical test

Fig. 5. Hypotheses scheme (3) for empirical test

3. Empirical Test of Strategic Alliance Stability Factors Model

The data for the empirical part of the research was collected through a web-based questionnaire. A link to the questionnaire was distributed by email to companies that might have been involved in strategic alliances. Survey respondents were employees of European companies that were involved in strategic alliances. There was no particular focus on the type of strategic alliance or on the industry in which an alliance operates. The database of contact details used to approach respondents was compiled from different sources, in particular from the SDC Platinum and Amadeus (Bureau van Dijk) databases. The total number of respondents was 184; later, the sample was reduced to 175 observations.

Given the nature of the variables under examination, the set of hypotheses and the type of relationships among variables (see Fig. 5 above), in particular the fact that some variables act as both dependent and independent variables, and given the explanatory nature of the research, the most appropriate method for data analysis is structural equation modeling (SEM). SEM is widely used in management research because it enables the researcher to evaluate causal relationships between constructs that cannot be measured directly (latent constructs), which often describe theoretical concepts connected by a complex set of interrelationships. The variables represented by ovals in the conceptual model (Fig. 5) are latent constructs and are referred to below as “latent constructs” or “constructs”.

As the sample size is sufficient for running covariance-based SEM (CB-SEM), this study follows the CB-SEM methodology for the assessment of the conceptual model. For this purpose, the IBM SPSS Amos 19 software package was used. The following parts reproduce the logic of the two-step SEM methodology.

3.1. Data Collection

In this research, primary data was collected from a web-based questionnaire sent out to European firms. Respondents were asked to answer with respect to an alliance that was functioning at the moment of filling out the survey. A 7-point Likert (1932) type scale was used in the survey, as it provides internal scale assessment and is believed to be a powerful tool for data analysis (Hair et al., 2010).

Contact details of respondents were extracted from two databases: SDC Platinum and Amadeus (Bureau van Dijk). Originally, a thousand email addresses of strategic alliances were extracted from SDC Platinum. However, as many strategic alliances are short-term, it is not surprising that 60% of the email addresses in the extracted database no longer existed at the moment of survey distribution. Only one response was generated by this original distribution attempt. For the second attempt, a new database of contact addresses was compiled using Amadeus (Bureau van Dijk). Most of the responses, therefore, were obtained by sending the survey to email addresses from the Amadeus database.

Out of 1167 potential respondents who opened the link to the survey, 184 complete responses were obtained, which constitutes 15.77% of the original number. However, some of the observations represented alliances that were too young (less than 1 year of functioning) to draw any conclusions on their stability. As strategic alliance stability applies to long-term alliances, only the alliances that had existed for at least one year at the moment the respondent filled in the questionnaire were kept in the sample. Consequently, the sample size was reduced to 175 observations. Respondents were managers of strategic alliances, managers of partner companies and employees of both alliances and partner companies that operate in Europe. Raw data was collected in the form of a survey created at surveygizmo.com.
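The sample figures reported above can be checked with a few lines of arithmetic. The following is a minimal sketch; the variable names are illustrative and are not taken from the study.

```python
# Counts reported in Section 3.1 (variable names are illustrative).
opened_link = 1167         # potential respondents who opened the survey link
complete_responses = 184   # complete responses obtained
usable_sample = 175        # after excluding alliances younger than one year

response_rate = complete_responses / opened_link
excluded = complete_responses - usable_sample
retained_share = usable_sample / complete_responses

print(f"response rate: {response_rate:.2%}")
print(f"excluded observations: {excluded}")
print(f"share of complete responses retained: {retained_share:.1%}")
```

The response rate is computed relative to those who opened the link, as in the text, not relative to the number of emails sent.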
Most respondents (48.0%) described themselves as managers of companies that participated in strategic alliances, while 39.4% of respondents were strategic alliance managers. The rest of the respondents were either employed by a company involved in strategic alliances (7.4%) or worked in a strategic alliance (5.1%). Overall, it can be argued that respondents were in a position to provide relevant answers to alliance-related questions, because close to 90% of respondents (87.4%) represented either the alliance management team or the management team of a partner company involved in a strategic alliance.

As for the industries to which the alliances in the sample belong, most are concentrated in the business services industry (19%); the machinery industry comes second (9.2%), followed by the chemical and allied products industry (5.2%). Overall, the sample consists of alliances distributed across over 18 industries.

As for the size of the alliances in the sample, the majority (54.3%) belong to the “micro” category according to the Eurostat classification and have between 1 and 9 permanent employees. The second biggest category (22.9%) is “small” alliances with 10-49 employees. The third biggest category (12.5%) is “large” alliances with 250 or more permanent employees. The rest of the sample (10.2%) is represented by “medium” alliances.

Lastly, respondents were asked to classify their alliance into one of three categories: joint venture, minority equity alliance or non-equity alliance. Such a classification is general enough (Das and Rahman, 2010) to be suitable for the purpose of this study. Most alliances in the sample (46.9%) are non-equity alliances, followed by joint ventures (28.6%) and minority equity alliances (24.6%).

3.2. Structural Equation Modeling (SEM) of Strategic Alliance Stability

The measurement model (MM) corresponds to the conceptual model (Fig. 2) in terms of the latent constructs that need to be measured by a set of measured variables. To recap, the latent constructs are the following: external stability, internal stability, trust, long-term orientation and resource complementarity. Each of them has a set of indicators, or measured variables, used for latent construct assessment.

External stability views a SA as a separate economic entity, so it is possible for an external observer to draw conclusions on its stability. Following the definition of external stability, a strategic alliance is assumed to be externally stable if its economic results show a rising trend (Zenkevich, Koroleva and Mamedova, 2014a, b). Economic results of a strategic alliance might include its net profit, revenue, market share, etc. Therefore, survey participants were asked to evaluate statements about the strategic alliance’s economic results (on a scale from “1” – “Completely disagree” to “7” – “Completely agree”), from the most general to more exact terms.

As discussed earlier in the text, internal stability of a strategic alliance is a multi-dimensional construct comprising motivational, strategic and dynamic stability. Therefore, each of these elements should be reflected in the internal SAS measurement scale. Inter-partner relationships play a great role in strategic alliance stability (Deitz et al., 2010), and partners’ constant mutual involvement in alliance activities is an important element of its stability that eventually affects alliance performance. The extent to which partners are involved in alliance activities stems from their motivation to enhance the alliance’s economic results and thereby to maximize their own benefits (Wong, Tjosvold and Zhang, 2005; Deitz et al., 2010; Gulati, Khanna and Nohria, 1994; Sarkar et al., 2001; López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013).
The next element of internal stability is dynamic stability, which is observed when the benefits that partners expect and gain correspond to the benefits expected at the moment of signing the contract (Zenkevich and Petrosjan, 2006; Kumar, 2011). According to the optimal decision principle (Zenkevich, Petrosyan and Yeung, 2009), the fact that the partners signed the contract and agreed on cooperation indicates that they have accepted the rules of benefit sharing and have a clearly established procedure for splitting benefits among them. Hence, in the case of dynamic stability, the procedure of benefit sharing is also known to participants.

Lastly, if an alliance is strategically stable, all the participants prefer to stay within the particular alliance given all other options available, and are likely to continue cooperation without leaving the alliance prematurely (Zenkevich, 2009; Zenkevich, Koroleva and Mamedova, 2014a, b). Therefore, participants were asked to evaluate statements about partners’ contribution to the alliance, benefit sharing and their attitude to the current alliance.

As discussed previously, trust is an important characteristic of partner relationships in strategic alliances. In their study of third-party supplier relationships, Huo, Ye and Zhao (2015) claim that trust is indicated by one party’s assessment of the other’s honesty and eagerness to consider the party’s perspective. Another indication of the presence of trust in a relationship would be an outside observer’s characterization of the partners’ relationship as honest and truthful, fair and just (López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013). Overall, if partners stay faithful to each other (Deitz et al., 2010), this is an indicator of trust in a relationship.
On the contrary, the fact that partners find it necessary to deal cautiously with each other would indicate the absence of trust (López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013).

Partners with long-term orientation hope that their relationship with each other will bring them economic benefits in the future (López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013; Ganesan, 1994; Kelley and Thibaut, 1978). It follows that, due to the value that the cooperation generates, partners would be concerned about their existing relationship. Logically, in contrast to short-term oriented firms, which would push the partner to generate quicker results (Das and Rahman, 2010) and try to get immediate benefit from each transaction (Das and Teng, 2000; Ganesan, 1994), long-term oriented partners would put their long-term goals before the quick gain (Das and Teng, 2000). Moreover, there is evidence that partners with long-term orientation will adjust their behavior in order to focus on the achievement of long-term goals; e.g., partners will assist each other in resolving issues because they believe that the other partner will do the same for them (Griffith, Harvey and Lusch, 2006; Lee and Dawes, 2005; Lusch and Brown, 1996). In other words, long-term orientation promotes alignment in partners’ goals and actions (López-Navarro, Callarisa-Fiol and Moliner-Tena, 2013).

If partners acknowledge that their resources are complementary, they are likely to assume that each of them adds substantial value to the alliance and that their resources and competencies complement each other.
Moreover, partners are likely to agree that the strategic fit among them is the best possible and that they could not have found a partner with a better strategic fit (Deitz et al., 2010), as the combination of their resources creates a competitive advantage through synergies (Hunt, Lambe and Wittmann, 2002) and helps attain their joint objectives (Lambe, Spekman and Hunt, 2002; Hunt, Lambe and Wittmann, 2002). Moreover, for resources to be complementary, they should be distinct (Hunt, Lambe and Wittmann, 2002), so as to create synergies between partners and provide more benefits than partners could have gained operating individually (Lambe, Spekman and Hunt, 2002).

The measurement model was assessed (overall model fit, construct validity and reliability) via confirmatory factor analysis and the calculation of average variance extracted (AVE), composite reliability (CR) and discriminant validity (DV), after which respecifications of the measurement model were required. After the respecification, the measurement model showed an adequate fit to the empirical data. Factor reliability and validity also proved to be adequate.

Fig. 6 provides a graphical representation of the structural model (SM), which matches the conceptual model. Only causal relationships between latent constructs are shown in the figure; measured variables are omitted for the convenience of the reader.

Fig. 6. Structural Model of Strategic Alliance Stability Factors
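Two of the fit statistics reported for these models in Table 2, the normed χ2 and RMSEA, are simple functions of χ2, the degrees of freedom and the sample size. The sketch below recomputes them from the reported χ2 and df values with N = 175. It assumes the common single-group RMSEA point estimate, sqrt(max(χ2 − df, 0) / (df · (N − 1))); the source does not state the formula, and the function names are illustrative.

```python
from math import sqrt

N = 175  # final sample size reported in Section 3.1

# (chi-square, degrees of freedom) as reported in Table 2
MODELS = {
    "Final CFA": (394.0, 216),
    "SM (direct)": (420.234, 219),
    "SM (ES->IS)": (415.471, 218),
    "SM (ES->IS, RC->T)": (457.317, 219),
    "SM (ES->IS, RC->T, T->LTO)": (417.108, 219),
}


def normed_chi2(chi2, df):
    """Chi-square divided by its degrees of freedom."""
    return chi2 / df


def rmsea(chi2, df, n):
    """Assumed single-group RMSEA point estimate (not confirmed by the source)."""
    return sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


for name, (chi2, df) in MODELS.items():
    print(f"{name:28s} chi2/df = {normed_chi2(chi2, df):.2f}  "
          f"RMSEA = {rmsea(chi2, df, N):.3f}")
```

Under this assumption the RMSEA values round to those reported in Table 2. The normed χ2 for the model with the paths ES→IS and RC→T comes out at about 2.09 against the reported 2.10; the other ratios match to two decimals.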

The structural model (SM) was assessed and showed adequate fit. As mentioned previously in the text, the model was split into several models. All of them showed adequate fit; see Table 2 below.

Table 2. Measurement and structural models comparison

| Model fit indices | Final CFA | SM (direct) | SM (ES→IS) | SM (ES→IS, RC→T) | SM (ES→IS, RC→T, T→LTO) |
|---|---|---|---|---|---|
| χ2 | 394 | 420.234 | 415.471 | 457.317 | 417.108 |
| df | 216 | 219 | 218 | 219 | 219 |
| p | | 0.000 | 0.000 | 0.000 | 0.000 |
| χ2 normed | 1.82 | 1.92 | 1.91 | 2.10 | 1.90 |
| CFI | 0.919 | 0.911 | 0.912 | 0.894 | 0.912 |
| RMSEA | 0.069 | 0.073 | 0.072 | 0.079 | 0.072 |
| RMSEA 90 percent confidence interval | (0.058; 0.079) | (0.062; 0.083) | (0.062; 0.083) | (0.069; 0.089) | (0.062; 0.083) |
| PNFI | N/A | 0.720 | 0.719 | 0.707 | 0.721 |

3.3. Strategic Alliance Stability Factors: Direct and Indirect Relationships

Modeling results show that most, but not all, of the specified relationships are statistically significant. Only 2 relationships out of 8 proved statistically insignificant; therefore, it can be claimed that, overall, the theoretical model adequately fits the data. See Table 3 for reference.

Table 3. Modeling results. Path coefficients and their significance

| Hypothesis | Structural relationship | Estimate |
|---|---|---|
| H1 | Long-term orientation → External stability | 0.433*** |
| H2 | Long-term orientation → Internal stability | 0.025 (ns) |
| H3 | Trust → Internal stability | 0.355*** |
| H4 | Trust → Long-term orientation | 0.609*** |
| H5 | Resource complementarity → External stability | 0.056 (ns) |
| H6 | Resource complementarity → Internal stability | 0.377*** |
| H7 | Resource complementarity → Trust | 0.450*** |
| H8 | External stability → Internal stability | 0.171* |

ns – not significant
* significantly different from zero at the 0.05 level (two-tailed)
** significantly different from zero at the 0.01 level (two-tailed)
*** significantly different from zero at the 0.001 level (two-tailed)

All the significant effects of SAS determinants on both SAS components correspond to theoretical assumptions. SEM has shown that SAS determinants have different effects on the components of SAS. More specifically, Trust and Resource complementarity have a direct positive effect on Internal SAS; the effect of Resource complementarity on External stability is indirect and minor (see Table 4); and Long-term orientation is the only significant and direct determinant of External stability.

These results partially correspond to the findings of previous studies. As for the effects of Trust and Resource complementarity on Internal stability, the results of the empirical test are in line with the findings of Deitz et al. (2010), who find a direct and significant effect of Resource complementarity on the intent to stay within a joint venture as well as on partner commitment. The same authors also show that Trust is positively associated with commitment, although they find only marginal support for the causal relationship between Trust and commitment. Clearly, the conceptualization of stability chosen in this paper differs from that in the paper by Deitz et al. (2010).

Contrary to the results predicted by theory, Resource complementarity did not manifest a significant effect on External stability.
This finding might indicate that when multiple SAS components are taken into consideration, the effect of Resource complementarity on Internal stability prevails. At the same time, treating the External and Internal stability components in separate models is not logical, as both components must be present for an alliance to be stable overall (Zenkevich, Koroleva and Mamedova, 2014a, b).

The effect of Long-term orientation has proven to be positive and significant in relation to External stability, which supports the theoretical assumptions put forward in the respective part of the text. At the same time, the effect of Long-term orientation on Internal stability has been found insignificant in the examined model. Contrary to this result, López-Navarro, Callarisa-Fiol and Moliner-Tena (2013) find a significant and positive relationship between Long-term orientation and partner commitment in export joint ventures. The discrepancy in findings might result, firstly, from a difference in sampling: the current study addresses all alliance types, while the abovementioned research focuses exclusively on export JVs. Secondly, the discrepancy might stem from differences in the conceptualization of the outcome variable. As mentioned for Deitz et al. (2010), the term “commitment” is most closely related to “motivational stability”, which constitutes one part of Internal stability. Therefore, there is an implication for further research: Long-term orientation can be regarded as a factor of one of the Internal stability components, e.g., motivational stability. Thirdly, it can be claimed that the effect of Long-term orientation on External stability prevails in the model and makes the effect of Long-term orientation on Internal stability statistically insignificant. However, as already mentioned, considering External and Internal stability as outcome variables in separate models does not make sense.
López-Navarro, Callarisa-Fiol and Moliner-Tena (2013) have found that Resource complementarity is positively and significantly associated with Trust. This result corresponds to the findings on the association between Resource complementarity and Trust in the current paper (see Table 4). Moreover, Deitz et al. (2010) have found partial mediation by Trust between Resource complementarity and the intent to remain in an alliance. The same result has been obtained for Trust, Resource complementarity and Internal stability in the current paper (see Table 4). In addition, López-Navarro, Callarisa-Fiol and Moliner-Tena (2013) find a significant and positive relationship between Trust and Long-term orientation, which corresponds to the findings in this paper (see Table 4).

The positive and significant effect of External stability on Internal stability has been identified, as predicted by theory. This finding also corresponds to the results of Fu, Lin and Sun (2013), who have found a positive and significant effect on SAS of an increase in the economic results of alliance activities, namely an increase in income. However, in the current study, the effect of External stability on Internal stability is not as strong as the influence of other determinants on particular components of stability. For a summary of the research hypotheses testing, refer to Table 4.

Table 4. Hypotheses test results

| Hyp. | Hypothesis formulation | St. est. | Result |
|---|---|---|---|
| H1 | Long-term orientation is positively associated with external stability of a strategic alliance | 0.433*** | Supported |
| H2 | Long-term orientation is positively associated with internal stability of a strategic alliance | 0.025 (ns) | N/A |
| H3 | Trust is positively associated with internal stability of a strategic alliance | 0.355*** | Supported |
| H4 | Trust is positively associated with long-term orientation in a strategic alliance | 0.609*** | Supported |
| H5 | Resource complementarity is positively associated with external stability of a strategic alliance | 0.056 (ns) | N/A |
| H6 | Resource complementarity is positively associated with internal stability of a strategic alliance | 0.377*** | Supported |
| H7 | Resource complementarity is positively associated with partners’ trust | 0.450*** | Supported |
| H8 | External stability is positively associated with internal stability | 0.171* | Supported |

ns – not significant
* significantly different from zero at the 0.05 level (two-tailed)
** significantly different from zero at the 0.01 level (two-tailed)
*** significantly different from zero at the 0.001 level (two-tailed)

Fig. 7. SEM final results. Dependence paths

Given that the direct and indirect effects of each SAS determinant can be identified, the direct and indirect effects of each construct on ES and IS have been calculated from the data used for the analysis. To differentiate among the different effects, 4 models have been tested (each model includes all the paths of the previous model plus one new path): SM with direct effects between SAS factors and SAS components; SM with an additional path (External stability → Internal stability); SM with an additional path (Resource complementarity → Trust); SM with an additional path (Trust → Long-term orientation). Next, the analysis of direct and indirect effects has been made based on significant paths. See Table 4 for reference. Comparing the direct effects in all 4 models in Table 4, it can be argued that the path coefficient estimates remain approximately the same across models with different numbers of causal relationships. This implies consistency in results for all the models.
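The decomposition into direct and indirect effects can be illustrated with the usual path-tracing rule for standardized coefficients: the indirect effect along a route is the product of the path coefficients on it, and the total indirect effect is the sum over all routes. The sketch below applies this rule to the significant estimates from Table 3. The rule and the helper names (`routes`, `effects`) are assumptions made for illustration; the source does not describe how the Amos output was processed, so the numbers below are not the authors’ reported values.

```python
from math import prod

# Significant standardized path estimates from Table 3
# (the non-significant paths H2 and H5 are left out).
PATHS = {
    ("LTO", "ES"): 0.433,  # H1
    ("T", "IS"): 0.355,    # H3
    ("T", "LTO"): 0.609,   # H4
    ("RC", "IS"): 0.377,   # H6
    ("RC", "T"): 0.450,    # H7
    ("ES", "IS"): 0.171,   # H8
}


def routes(src, dst, seen=()):
    """Yield every directed route from src to dst as a list of edges."""
    for (a, b) in PATHS:
        if a != src or b in seen:
            continue
        if b == dst:
            yield [(a, b)]
        else:
            for rest in routes(b, dst, seen + (a,)):
                yield [(a, b)] + rest


def effects(src, dst):
    """Return (direct, indirect) effects of src on dst by path tracing."""
    direct = PATHS.get((src, dst), 0.0)
    indirect = sum(
        prod(PATHS[edge] for edge in route)
        for route in routes(src, dst)
        if len(route) > 1
    )
    return direct, indirect


for src, dst in [("RC", "IS"), ("RC", "ES"), ("T", "IS"), ("LTO", "IS")]:
    direct, indirect = effects(src, dst)
    print(f"{src} -> {dst}: direct = {direct:.3f}, indirect = {indirect:.3f}")
```

Under these assumptions, Resource complementarity reaches External stability only through Trust and Long-term orientation (0.450 × 0.609 × 0.433, about 0.12), which is in line with the description of that effect as indirect and minor.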

4. Implications and further research

4.1. Managerial implications

Given that a more stable SA is likely to survive external turbulence and experience greater economic success, reaching its strategic goals, it is important to understand the mechanics behind SAS dynamics and use them for SAS management (Jiang, Li and Gao, 2008). The results of the empirical research described above may be used by managers in alliances and managers in partner companies.

While SAS can be assessed with a game theory approach by interpreting financial data along with inside expert estimations (Zenkevich, Koroleva and Mamedova, 2014b), SAS assessment would be incomplete without SAS management. The theoretical results provided in the paper suggest which inter-organizational factors could be altered in order to enhance the external and internal stability of strategic alliances, given the importance of each component for overall alliance stability.

The results in Fig. 7 suggest that the direct determinants of internal SAS are trust and resource complementarity, with the latter having the greater effect on internal stability (0.377 against 0.355). Moreover, trust mediates the relationship between resource complementarity and internal stability. The only factor in the model that affects external stability directly is long-term orientation. Contrary to expectations that scholars and management practitioners might have, the long-term orientation of partners does not directly and significantly affect the internal stability of strategic alliances, and resource complementarity does not directly affect external strategic alliance stability. In practice, this means that managers who wish to enhance overall stability should manage different SAS factors simultaneously in order to reach a higher stability level.
Given that long-term orientation is critically important for the external stability of strategic alliances, i.e., the rising trend of economic results, it should be considered in alliance management. Long-term orientation might prove especially important for alliances approaching their termination date, as partners might no longer feel sufficiently bonded and might demonstrate opportunistic tendencies, which would have a negative impact on the trend of economic results. Therefore, it is advisable for companies to choose partnerships with aligned goals and objectives that lie beyond the goals and objectives of a particular strategic alliance and might serve as an additional link between partners.

While resources are often immobile and it might not be feasible to enhance resource complementarity during the implementation stage of an alliance, it seems reasonable to enhance trust among partners and pay closer attention to relationship management. This could include building communication channels and facilitating communication overall, managing cultural distance in terms of national, professional and organizational cultures, etc. (Elmuti and Kathawala, 2001). Furthermore, given that resource complementarity is one of the criteria for partner selection in many alliances, partners should pay close attention to it, as it not only plays a role at the formation stage of an alliance but also affects SAS at the implementation stage.

Moreover, it has been found that the effect of the trend of economic results (external stability) on internal stability is not as strong as the effect of such determinants as trust and resource complementarity. Therefore, relational factors, often disregarded in strategic alliances (Agarwal, Croson and Mahoney, 2010), should be subject to constant monitoring during the implementation phase of an alliance.

4.2. Research limitations and further research

The study is subject to some limitations that can be addressed in further research. The primary reason for most of the limitations is the scarcity of data and the difficulties connected with data collection.

First, the research does not differentiate between alliance types (e.g., equity, non-equity), because strategic alliances are not easily accessible to an outsider in terms of information collection; e.g., most alliances do not publish financial data and are restricted in providing sensitive information (Jiang, Li and Gao, 2008).

Second, given the sample characteristics, the study results can best be generalized to micro and small European alliances, mainly in the business services industry. Larger alliances and alliances that operate in different fields may have their own peculiarities. Therefore, the results of the current study should be applied in practice with careful consideration of the organizational and industrial conditions in which an alliance operates. The same issue can also be seen as a focus for further examination.

Third, given that internal SAS consists of 3 components (dynamic, strategic and motivational stability; see Fig. 1), an additional study of the interrelationships among them and of their determinants can be considered. Based on the mismatch between the obtained results, the expected findings and the results of other empirical papers, there is a rationale to assume that, e.g., long-term orientation, which did not exhibit a significant effect on internal stability overall, might have an effect on one of its components, most likely motivational stability. Similar conclusions can be drawn about the effects of resource complementarity on strategic and motivational stability, which might be different in each case.
Fourth, given the current tools for SAS assessment (Zenkevich, Koroleva and Mamedova, 2014b), it is now possible to draw conclusions on the presence of strategic alliance stability; the stability level, however, is still hardly quantifiable. Therefore, there is vast potential for researchers to address the issue of a quantitative assessment of the stability level (e.g., by developing stability indices).

References

Agarwal, R., Croson, R., and Mahoney, J. T. (2010). The role of incentives and communication in strategic alliances: an experimental investigation. Strategic Management Journal, 31(4), 413–437.
Akkaya, C. (2007). Technology Based Alliances: A Turkish Perspective. MPRA. Paper 3479.
Anderson, E., and Weitz, B. (1989). Determinants of continuity in conventional industrial channel dyads. Marketing science, 8(4), 310–323.
Axelrod, R. (1984). The evolution of cooperation. New York: Basic Books.
Barney, J. B. (1991). Firm resources and sustained competitive advantage. Journal of management, 17(1), 99–120.
Barney, J. B. (1992). Integrating organizational behavior and strategy formulation research: A resource based analysis. Advances in strategic management, 8(1), 39–61.
Beamish, P. W. (1988). Multinational Joint Ventures in Developing Countries, Routledge. London and New York. 145.
Buffenoir, E., and Bourdon, I. (2013). Managing Extended Organizations and Data Governance. Digital Enterprise Design and Management, 205, 135–145.
Das, T. K., and Rahman, N. (2010). Determinants of partner opportunism in strategic alliances: a conceptual framework. Journal of Business and Psychology, 25(1), 55–74.
Das, T. K., and Teng, B. S. (1998). Between trust and control: Developing confidence in partner cooperation in alliances. Academy of management review, 23(3), 491–512.
Das, T. K., and Teng, B. S. (2000). Instabilities of strategic alliances: An internal tensions perspective. Organization science, 11(1), 77–101.
Deitz, G. D., Tokman, M., Richey, R. G., and Morgan, R. M. (2010). Joint venture stability and cooperation: Direct, indirect and contingent effects of resource complementarity and trust. Industrial Marketing Management, 39(5), 862–873.
Douma, M. U., Bilderbeek, J., Idenburg, P. J., and Looise, J. K. (2000). Strategic alliances: managing the dynamics of fit. Long Range Planning, 33(4), 579–598.
Doz, Y. L., and Hamel, G. (1998). Alliance advantage: The art of creating value through partnering. Harvard Business Press.
Elmuti, D., and Kathawala Y. (2001). An overview of strategic alliances. Management decision, 39(3), 205–18.
Franko, L. G. (1971). Joint venture divorce in the multinational company. The International Executive, 13(4), 8–10.
Fu, S., Lin, J., and Sun, L. (2013). An empirical examination of the stability of the alliance of “a company+ farmers” From the perspective of farmers. Chinese Management Studies, 7(3), 382–402.
Ganesan, S. (1994). Determinants of long-term orientation in buyer-seller relationships. The Journal of Marketing, 58, 1–19. https://warrington.ufl.edu/centers/retailcenter/docs/papers/Ganesan1994.pdf
Geringer, J. M. (1988). Joint venture partner selection: Strategies for developed countries. Praeger Pub Text.
Geringer, J. M., and Hebert, L. (1991). Measuring joint venture performance. Journal of International Business Studies, 22(2), 249–263.
Gibbs, M. R., and Humphries, M. A. (2016). Enterprise Relationship Management: A Paradigm for Alliance Success. Ashgate Publishing, Ltd.
Gill, J., and Butler, R. J. (2003). Managing instability in cross-cultural alliances. Long range planning, 36(6), 543–563.
Gomes-Casseres, B. (1987). Joint venture instability: Is it a problem. Division of Research, Harvard University.
Granovetter, M. (1985). Economic action and social structure: The problem of embeddedness. American journal of sociology, 91(3), 481–510.
Gulati, R., Khanna, T., and Nohria, N. (1994). Unilateral commitments and the importance of process in alliances. Sloan Management Review, 35(3), 61–69.
Hair, J. F., Anderson, R. E., Babin, B. J., and Black, W. C. (2010). Multivariate data analysis: A global perspective (Vol. 7). Upper Saddle River, NJ: Pearson.
Heide, J. B., and Miner, A. S. (1992). The shadow of the future: Effects of anticipated interaction and frequency of contact on buyer-seller cooperation. Academy of management journal, 35(2), 265–291.
Hong, J. and Yu H. and Zhichao, C. (2011). Research About Measures of Enhancing Stability of Competitive Strategic Alliance. Chinese Business Review, 10(12), 1191–1198.
Huang, Y. (2003). Selling China: Foreign direct investment during the reform era. Cambridge University Press.
Hunt, S. D., Lambe, C. J., and Wittmann, C. M. (2002). A theory and model of business alliance success. Journal of Relationship Marketing, 1(1), 17–35.
Huo, B., Ye, Y., and Zhao, X. (2015). The impacts of trust and contracts on opportunism in the 3PL industry: The moderating role of demand uncertainty. International Journal of Production Economics, 170 (PA), 160–170.
Inkpen, A. C., and Beamish, P. W. (1997). Knowledge, bargaining power, and the instability of international joint ventures. Academy of management review, 22(1), 177–202.
Iyer, K. N. S. (2002). Learning in Strategic Alliances: An Evolutionary Perspective. Academy of Marketing Science Review, 2002 (10). http://www.carloscorreia.net/livros/learning strategic alliance.pdf
Jiang, X., Li, Y., and Gao, S. (2008). The stability of strategic alliances: Characteristics, factors and stages. Journal of International Management, 14(2), 173–189.
Khanna, T., Gulati, R., and Nohria, N. (1998). The dynamics of learning alliances: Competition, cooperation, and relative scope. Strategic management journal, 19(3), 193–210.
Killing, J. P. (1982). How to make a global joint venture work. Harvard business review, 60(3), 120–127.
Killing, J. P. (1983). Strategies for Joint Venture Success. Praeger, New York.
Kolenak, J. (2007). Reinforcement of Success of Strategic Alliance of Small and Medium Enterprises in the Czech Republic. Mokso darbai: Vadyba, 3(4), 12–13.
Kumar, M. V. (2011). Are joint ventures positive sum games? The relative effects of cooperative and noncooperative behavior. Strategic Management Journal, 32(1), 32–54.
Kumar, N., Scheer, L. K., and Steenkamp, J. B. E. (1995). The effects of perceived interdependence on dealer attitudes. Journal of marketing research, 32(3), 348–356.
Lambe, C. J., Spekman, R. E., and Hunt, S. D. (2002). Alliance competence, resources, and alliance success: conceptualization, measurement, and initial test. Journal of the academy of Marketing Science, 30(2), 141–158.
Lee, D. Y., and Dawes, P. L. (2005). Guanxi, trust, and long-term orientation in Chinese business markets. Journal of international marketing, 13(2), 28–56.
López-Navarro, M. Á., Callarisa-Fiol, L., and Moliner-Tena, M. Á. (2013). Long-Term Orientation and Commitment in Export Joint Ventures among Small and Medium-Sized Firms. Journal of Small Business Management, 51(1), 100–113.
Madhok, A. (1995). Revisiting multinational firms’ tolerance for joint ventures: A trust-based approach. Journal of international Business studies, 117–137.
Moor, R. E. (1971). The Use of Economics in Investment Analysis. Financial Analysts Journal, 27(6), 63–69.
Morgan, R. M., and Hunt, S. D. (1994). The commitment-trust theory of relationship marketing. The journal of marketing, 58(3), 20–38.
Nakamura, H. R. (2005). Motives, Partner Selection and Productivity Effects of M&As: The Pattern of Japanese Mergers and Acquisition. Thesis (Ph.D.), Institute of International Business, Stockholm School of Economics.
Nielsen, B. B. (2007). Determining international strategic alliance performance: A multidimensional approach. International Business Review, 16(3), 337–361.

Ozorhon, B., Arditi, D., Dikmen, I., and Birgonul, M. T. (2008). Effect of partner fit in in- ternational construction joint ventures. Journal of Management in Engineering, 24(1), 12–20. Park, S. H., and Ungson, G. R. (1997). The effect of national culture, organizational com- plementarity, and economic motivation on joint venture dissolution. Academy of Man- agement journal, 40(2), 279–307. Qing, X., and Zhang, W. (2015).Co-opetition and the Stability of Competitive Contractual Strategic Alliance: Thinking Based on the Modified Lotka-Voterra Model. International Journal of u-and e-Service, Science and Technology, 8(1), 67–78. Petrosjan, L. A. (1977). Stable solutions of differential games with many participants. Vi- estnik of Leningrad University, 19, 46–52. Ring, P. S., and Van de Ven, A. H. (1994). Developmental processes of cooperative interor- ganizational relationships. Academy of management review, 19(1), 90–118. Ryu, S., Park, J. E., and Min, S. (2007). Factors of determining long-term orientation in interfirm relationships. Journal of Business Research,60(12), 1225–1233. Sarkar, M. B., Echambadi, R., Cavusgil, S. T., and Aulakh, P. S. (2001). The influence of complementarity, compatibility, and relationship capital on alliance performance. Journal of the academy of marketing science,29(4), 358–373. Sim, A. B., and Ali, M. Y. (2000). Determinants of stability in international joint ven- tures: Evidence from a developing country context. Asia Pacific Journal of Management, 17(3), 373–397. Umukoroa, F. G., Sulaimonb, A. H. A., and Kuyeb, O. L. (2009). Strategic alliance: an insight into cost of structuring. Serbian Journal of Management, 4(2), 259–272. Williamson, O. E. (1975). Markets and hierarchies. New York: Free Press. Wong, A., Tjosvold, D., and Zhang, P. (2005). Developing relationships in strategic al- liances: Commitment to quality and cooperative interdependence. Industrial Marketing Management, 34(7), 722–731. Yan, A. (1998). 
Structural stability and reconfiguration of international joint ventures. Journal of international business studies, 29(4), 773–795. Yan, A., and Luo, Y. (2001). International joint ventures: Theory and practice. New York: M.E. Sharpe. Yan, A., and Zeng, M. (1999).International joint venture instability: A critique of previous research, a reconceptualization, and directions for future research. Journal of interna- tional Business studies, 30(2), 397–414. Yu, J. P., and Pysarchik, D. T. (2002). Economic and non-economic factors of Korean manufacturer-retailer relations. The International Review of Retail, Distribution and Consumer Research, 12(3), 297–318. Zenkevich N. A., Koroleva A. F., Mamedova Zh. A. (2014a). Concept of Joint Venture Sta- bility. Vestnik of St. Petersburg University. Ser. Management, (1), 28–56. Zenkevich N. A., Koroleva A. F., Mamedova Zh. A. (2014b). Joint Venture Stability As- sessment methodology. Vestnik of St. Petersburg University. Ser. Management, (3), 41–74. Zenkevich, N. A. (2009). Modelirovanie ustojchivogo sovmestnogo predprijatija. Nauchnie doklady, 1(R), Institute of Management, St. Petersburg State University. Zenkevich, N. A., and Petrosjan, L. A. (2006). Time-consistency of cooperative solutions. Discussion Paper, Institute of Management, St. Petersburg State University. Zenkevich, N. A., Petrosyan, L. A., and Yeung, D. V. K. (2009). Dinamicheskie igry i ikh prilozheniya v menedzhmente (Dynamic Games and Their Applications in Manage- ment). Institute of Management, St. Petersburg State University. Zhao, Y., and Cavusgil, S. T. (2006). The effect of supplier’s market orientation on man- ufacturer’s trust. Industrial Marketing Management, 35(4), 405–414. Contributions to Game Theory and Management, X, 396–403

10 Years Game Theory and Management (GTM)

Maria Bulgakova
St. Petersburg State University, 7/9 Universitetskaya nab., St. Petersburg, 199034, Russia
E-mail: mari [email protected]

The International Conference Game Theory and Management (GTM) started in 2007 and has taken place every year since at Saint Petersburg State University (Russia). This paper lists the plenary speakers of the conference over its first 10 years and shows photos of some of them.

GTM 2007
1. Robert Aumann (Hebrew University, Israel) and Roberto Serrano (Brown University, Providence, USA) “An Economic Index of Riskiness”
2. Dean Foster and Sergiu Hart (Hebrew University, Israel) “An Operational Measure of Riskiness”
3. David W. K. Yeung (Hong Kong Baptist University, Hong Kong) and Leon A. Petrosyan (St. Petersburg University, Russia) “Managing Catastrophe-bound Industrial Pollution with Game-theoretic Algorithm: The St. Petersburg Initiative”
4. Georges Zaccour (HEC Montreal, Canada) “Differential Games in Marketing Channels”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 1, 2008, 565 p.

Fig. 1. Plenary speakers of GTM 2007 (photos: Robert Aumann; Sergiu Hart; Leon A. Petrosyan; Georges Zaccour)

GTM 2008
5. Tamer Basar (University of Illinois at Urbana-Champaign, USA) “Hierarchical Games and Reverse Engineering”
6. John F. Nash (Princeton University, USA) “Research on the Problem of Evaluating Cooperative Games; the Method of Agencies and the Search for Confirmation through Model Variants”
7. Geert J. Olsder (Delft University of Technology, the Netherlands) “Be the Boss”
8. Leon A. Petrosyan (St. Petersburg University, Russia) “Stable Cooperation”
9. David W. K. Yeung (Hong Kong Baptist University, Hong Kong) “Cooperative game-theoretic Mechanism Design for Optimal Resource Use”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 2, 2009, 514 p.

Fig. 2. Plenary speakers of GTM 2008 (photos: John F. Nash; Geert J. Olsder; Tamer Basar; David W. K. Yeung)

GTM 2009
10. Pierre Bernhard (University of Nice Sophia Antipolis, France) “Nonzero-sum Dynamic Games in the Management of Biological Systems”
11. Dmitry A. Novikov (Institute of Control Sciences RAS, Russia) “Reflexive Games: Theory and Applications”
12. Reinhard Selten (University of Bonn, Germany) “Incomplete Equilibrium”
13. Myrna Wooders (Vanderbilt University, USA) “Games with many players as models of large economies”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 3, 2010, 486 p.

Fig. 3. Plenary speakers and other participants of GTM 2009 (photos: Dmitry A. Novikov (second from the left); Pierre Bernhard (fifth from the left); Reinhard Selten)

GTM 2010
14. Alain Haurie (University of Geneva, Switzerland) “A Piecewise Deterministic Game Model for International GHG Emission Agreements”
15. Hervé Moulin (Rice University, USA) “Clearing Supply and Demand under Bilateral Constraints”
16. Ralph Tyrrell Rockafellar (University of Washington, USA) “Coherent Modeling of Risk in Optimization under Uncertainty”
17. Arkady Kryazhimskiy (Steklov Institute of Mathematics RAS, Russia) “Market Equilibrium in Negotiations and Growth Models”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 4, 2011, 514 p.

GTM 2011
18. Vladimir V. Mazalov (Institute of Applied Mathematical Research, KRC RAS, Russia) “Bargaining Models and Mechanism Design”
19. Roger B. Myerson (University of Chicago, USA) “Sequential equilibria of games with infinite sets of types and actions”
20. Jörgen W. Weibull (Stockholm School of Economics, Sweden) “Robust set-valued prediction in games”
21. Shmuel Zamir (The Hebrew University, Israel) “Extending the Condorcet Jury Theorem to a general dependent jury”
22. Martin Shubik (Yale University, USA) “The Present and Future of Game Theory”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 5, 2012, 412 p.

Fig. 4. Plenary speakers of GTM 2010 (photos: Alain Haurie; Hervé Moulin; Ralph Tyrrell Rockafellar; Arkady Kryazhimskiy)

GTM 2012
23. Michèle Breton (HEC Montreal, Canada) “Borrowing and lending: two sides of the financing game”
24. Josef Hofbauer (University of Vienna, Austria) “Deterministic Evolutionary Game Dynamics”
25. Sylvain Sorin (Polytechnic University, Paris, France) “Recent advances in zero-sum dynamic games”
26. Ehud Kalai (Northwestern University, USA) “Cooperation in Strategic Games revised”
27. Sergey Aseev (Steklov Mathematical Institute RAS, Moscow, Russia) “The Pontryagin maximum principle for infinite-horizon problems and its applications in economics”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 6, 2013, 458 p.

GTM 2013
28. Finn Kydland (University of California, Santa Barbara, USA) “On dynamic games”
29. Bernard De Meyer (Université Paris 1 Panthéon-Sorbonne, France) “Risk aversion and price dynamics on the stock market”
30. Burkhard Monien (Paderborn University, Germany) “The complexity of computing equilibria”
31. Leon Petrosyan (St. Petersburg University, Russia) “Time-consistent and strategically supported cooperation in dynamic games”

Fig. 5. Plenary speakers and other participants of GTM 2011 (photos: Roger B. Myerson; Vladimir V. Mazalov; Jörgen W. Weibull; Leon A. Petrosyan, Shmuel Zamir, Nikolay Zenkevich and Jörgen W. Weibull; GTM 2011)

Fig. 6. Plenary speakers and other participants of GTM 2012 (photos: Ehud Kalai (in the center); Sergey Aseev (on the left))

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 7, 2014, 438 p.

GTM 2014
32. Guillermo Owen (Naval Postgraduate School, Monterey, California, USA) “A game-theoretic approach to networks”
33. Steffen Jørgensen (University of Southern Denmark, Denmark) “Recent Developments in Lanchester Advertising Differential Games”
34. Abraham Neyman (The Hebrew University of Jerusalem, Israel) “Robust equilibria of continuous-time stochastic games”
35. Fuad T. Aleskerov (Higher School of Economics, Moscow, Russia) “Power in groups: theory and applications”

Fig. 7. Plenary speakers of GTM 2013 (photos: Finn Kydland; Bernard De Meyer; Burkhard Monien)

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 8, 2015, 366 p.

Fig. 8. Plenary speakers and other participants of GTM 2014 (photos: Guillermo Owen; Steffen Jørgensen (on the right); Abraham Neyman; GTM 2014)

SING11 - GTM 2015
36. David Schmeidler (School of Mathematical Sciences, Tel Aviv University, Israel) “Experimental Study of Estimation and Bidding in Common-Value Auctions with public information”
37. Hans Peters (Department of Quantitative Economics, University of Maastricht, the Netherlands) “An axiomatic characterization of the Owen-Shapley spatial power index”
38. Alexander Vasin (Operations Research Department, Moscow State University, Russia) “Auctions of Homogeneous Goods: Game-Theoretic Analysis”
39. Georges Zaccour (Department of Management Sciences, HEC Montreal, Canada) “Durable Agreements in a Class of Stochastic Games”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 9, 2016, 385 p.

Fig. 9. Plenary speakers and other participants of SING11-GTM 2015 (photos: David Schmeidler; Hans Peters; Alexander Vasin; GTM 2015)

GTM 2016
40. Jean-Jacques Herings (School of Business and Economics, Maastricht University, the Netherlands) “Equilibrium and Matching under Price Control”
41. Eric Maskin (Department of Economics, Harvard University, USA) “Elections and Strategic Voting: Condorcet and Borda”
42. Eilon Solan (School of Mathematical Sciences, Tel Aviv University, Israel) “Multiplayer Stochastic Games: Techniques, Results, and Open Problems”
43. Alexander Tarasyev (Department of Dynamic Systems, IMM, RAS, Ekaterinburg, Russia) “Decompositional Algorithms for Construction of Control Strategies in Dynamic Games”

The proceedings of the conference are published in: “Contributions to Game Theory and Management”, L.A. Petrosyan & N.A. Zenkevich eds., Vol. 10, 2017, 376 p.

Fig. 10. Plenary speakers and other participants of GTM 2016 (photos: Alexander Tarasyev; Eric Maskin; Eilon Solan; Leon Petrosyan, Eric Maskin, Nikolay Zenkevich, Jean-Jacques Herings and Eilon Solan)