Int. J. Appl. Math. Comput. Sci., 2015, Vol. 25, No. 2, 337–351 DOI: 10.1515/amcs-2015-0026

COMPUTING THE STACKELBERG/NASH EQUILIBRIA USING THE EXTRAPROXIMAL METHOD: CONVERGENCE ANALYSIS AND IMPLEMENTATION DETAILS FOR MARKOV CHAINS GAMES

KRISTAL K. TREJO^{a}, JULIO B. CLEMPNER^{b}, ALEXANDER S. POZNYAK^{a,∗}

^{a} Department of Automatic Control, Center for Research and Advanced Studies, Av. IPN 2508, Col. San Pedro Zacatenco, 07360 Mexico City, Mexico; e-mail: {apoznyak,ktrejo}@ctrl.cinvestav.mx

^{b} Center for Economics, Management and Social Research, National Polytechnic Institute, Lauro Aguirre 120, Col. Agricultura, Del. Miguel Hidalgo, 11360 Mexico City, Mexico; e-mail: [email protected]

In this paper we present the extraproximal method for computing the Stackelberg/Nash equilibria in a class of ergodic controlled finite Markov chains games. We exemplify the original game formulation in terms of coupled nonlinear programming problems implementing the Lagrange principle. In addition, Tikhonov's regularization method is employed to ensure the convergence of the cost-functions to a Stackelberg/Nash equilibrium point. Then, we transform the problem into a system of equations in the proximal format. We present a two-step iterated procedure for solving the extraproximal method: (a) the first step (the extra-proximal step) consists of a "prediction" which calculates the preliminary position approximation to the equilibrium point, and (b) the second step is designed to find a "basic adjustment" of the previous prediction. The procedure is called the "extraproximal method" because of the use of an extrapolation. Each equation in this system is an optimization problem for which the necessary and sufficient condition for a minimum is solved using a quadratic programming method. This solution approach provides a drastically quicker rate of convergence to the equilibrium point. We present the analysis of the convergence as well as the rate of convergence of the method, which is one of the main results of this paper. Additionally, the extraproximal method is developed in terms of Markov chains for Stackelberg games. Our goal is to analyze completely a three-player Stackelberg game consisting of a leader and two followers. We provide all the details needed to implement the extraproximal method in an efficient and numerically stable way. For instance, a numerical technique is presented for computing the first step parameter (λ) of the extraproximal method. The usefulness of the approach is successfully demonstrated by a numerical example related to a pricing model for airline companies.

Keywords: extraproximal method, Stackelberg games, convergence analysis, Markov chains, implementation.

1. Introduction

The extraproximal approach (Antipin, 2005) can be considered a natural extension of the proximal and gradient optimization methods used for solving the more difficult problems of finding an equilibrium point in games. The simplest and most natural approach to implement the proximal method is to use a simple iteration by omitting the prediction step. However, as shown by Antipin (2005), this approach fails. A more versatile procedure would be to perform an "extraproximal step", i.e., gathering certain information on the future development of the process. Using this information, it is possible to execute the "principle step" based on the preliminary position. It seems more natural to call this procedure an "extraproximal method" (because of the use of extrapolation (Antipin, 2005)). Along with the extra-gradient technique (Poznyak, 2009), the extraproximal method can be considered a natural extension of the proximal and gradient optimization approaches to resolve more complicated game problems, such as the Stackelberg–Nash game.

Our goal is to analyze a three-player Stackelberg game (von Stackelberg, 1934), with a leader and two followers, for a class of ergodic controllable finite Markov chains (Poznyak et al., 2000) using the extraproximal method (Antipin, 2005). The leader commits first to a strategy x in the Markov chains game. Then, the followers realize a Nash solution (Clempner and Poznyak, 2011), obtaining $\varphi_l(\cdot,\cdot\,|x)$ $(l = 1,2)$ as the cost functions. The optimal actions x of the leader are then chosen to minimize its payoff $\varphi_0(\cdot,\cdot\,|x)$. The main concern about Stackelberg games is as follows: the highest leader payoff is obtained when the followers always reply in the best possible way for the leader (this payoff is at least as high as any Nash payoff). For ground-breaking works on the Stackelberg game, see those by Bos (1986; 1991), Harris and Wiens (1980), Merril and Schneider (1966) or von Stackelberg (1934). Surveys can be found in the works of Breitmoser (2012), De Fraja and Delbono (1990), Nett (1993) or Vickers and Yarrow (1998).

In this paper we present the extraproximal method for computing the Stackelberg/Nash equilibria in a class of ergodic controlled finite Markov chains games. We exemplify the original game formulation in terms of coupled nonlinear programming problems implementing the Lagrange principle. In addition, Tikhonov's regularization method is employed to ensure the convergence of the cost-functions to a Stackelberg/Nash equilibrium point. The Tikhonov regularization is one of the most popular approaches to solve discrete ill-posed problems represented in the form of a non-obligatory strict convex function. Then, we transform the problem to a system of equations in the proximal format.

We present a two-step iterated procedure for solving the extraproximal method: (a) the first step (the extra-proximal step) consists of a "prediction" which calculates the preliminary position approximation to the equilibrium point, and (b) the second step is designed to find a "basic adjustment" of the previous prediction. Each equation in this system is an optimization problem for which the necessary condition for a minimum is solved using a quadratic programming method, instead of the iterating projectional gradient method given by Moya and Poznyak (2009). Iterating these steps, we obtain a new quick procedure which leads to a simple and logically justified computational realization: at each iteration of both sub-steps of the procedure, the functional of the game decreases and finally converges to a Stackelberg/Nash equilibrium point (Trejo et al., 2015).

We present the analysis of the convergence as well as the rate of convergence of the method. Additionally, the extraproximal method is developed in terms of Markov chains for Stackelberg games consisting of a leader and two followers. We provide all the details needed to implement the extraproximal method in an efficient and numerically stable way. For instance, a numerical method is presented for computing the first step parameter (λ) of the extraproximal method. The usefulness of the method is successfully demonstrated by a numerical example related to a pricing oligopoly model for airline companies that choose to offer limited airline seats for specific routes.

The paper is organized as follows. The next section presents the preliminaries needed to understand the rest of the paper. The formulation of the problem for the N-player game and the general format iterative version of the extraproximal method is presented in Section 3. A three-player Stackelberg game for a class of ergodic controllable finite Markov chains using the extraproximal method is given in Section 4. Section 5 includes the analysis of the convergence of the extraproximal method and proves the convergence rates, which is one of the main results of this paper. A numerical example related to a pricing oligopoly model for airline companies validates the proposed extraproximal method in Section 6. Final comments are outlined in Section 7.

2. Controllable Markov chains

A controllable Markov chain (Clempner and Poznyak, 2014; Poznyak et al., 2000) is a quadruple $MC = \{S, A, \Upsilon, \Pi\}$, where $S$ is a finite set of states, $S \subset \mathbb{N}$, endowed with the discrete topology; $A$ is the set of actions, which is a metric space. For each $s \in S$, $A(s) \subset A$ is the non-empty set of admissible actions at state $s \in S$. Without loss of generality we may take $A = \cup_{s \in S} A(s)$; $\Upsilon = \{(s,a)\,|\,s \in S,\ a \in A(s)\}$ is the set of admissible state-action pairs, which is a measurable subset of $S \times A$; $\Pi(k) = \big[\pi_{(i,j|k)}\big]$ is a stationary transition controlled matrix, where
\[
\pi_{(i,j|k)} \equiv P\big(s(n+1) = s(j)\,\big|\,s(n) = s(i),\ a(n) = a(k)\big),
\]
representing the probability associated with the transition from state $s(i)$ to state $s(j)$ under an action $a(k) \in A\big(s(i)\big)$ $(k = 1,\ldots,M)$ at time $n = 0,1,\ldots$ .

The dynamics of the game for Markov chains are described as follows. The game consists of $(N+1)$ players (denoted by $l = \overline{0,N}$) and begins at the initial state $s^l(0)$, which (as well as the states further realized by the process) is assumed to be completely measurable. Each player $l$ is allowed to randomize, with distribution $d^l_{(k|i)}(n)$, over the pure action choices $a^l_{(k)} \in A^l\big(s_{(i)}\big)$, $i = \overline{1,N_l}$ and $k = \overline{1,M_l}$. The leader corresponds to $l = 0$ and the followers to $l = \overline{1,N}$. At each fixed strategy of the leader $d^0_{(k|i)}(n)$, the followers make the strategy selection trying to realize a Nash equilibrium. Below we will consider only stationary strategies $d^l_{(k|i)}(n) = d^l_{(k|i)}$.
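To make the objects above concrete, the following minimal Python sketch (not from the paper; the array shapes, the uniform placeholder numbers and the helper name are assumptions) stores the controlled transition law $\pi_{(i,j|k)}$ as a three-dimensional array, a stationary randomized strategy $d_{(k|i)}$ as a row-stochastic matrix, and forms the state-to-state transition matrix induced by that strategy.

```python
import numpy as np

# Controlled transition law: pi[i, j, k] = P(s(n+1)=s(j) | s(n)=s(i), a(n)=a(k)).
# The shapes (4 states, 2 actions) mirror the pricing example later in the paper,
# but the entries here are only a placeholder uniform chain.
N_states, M_actions = 4, 2
pi = np.full((N_states, N_states, M_actions), 1.0 / N_states)

# A stationary randomized strategy: d[i, k] = d_(k|i), each row a distribution over actions.
d = np.full((N_states, M_actions), 1.0 / M_actions)

def induced_transition_matrix(pi, d):
    """P[i, j] = sum_k pi[i, j, k] * d[i, k]: the chain obtained by playing strategy d."""
    return np.einsum('ijk,ik->ij', pi, d)

P = induced_transition_matrix(pi, d)
assert np.allclose(P.sum(axis=1), 1.0)  # each row of P is a probability distribution
```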

These choices induce the state distribution dynamics
\[
P\big(s^l(n+1) = s(j_l)\big) = \sum_{i_l=1}^{N_l}\sum_{k_l=1}^{M_l} \pi^l_{(i_l,j_l|k_l)}\, d^l_{(k_l|i_l)}\, P\big(s^l(n) = s(i_l)\big).
\]
In the ergodic case (when all Markov chains are ergodic for any stationary strategy $d^l_{(k|i)}$) the distributions $P\big(s^l(n+1) = s(j_l)\big)$ exponentially quickly converge to their limits $P\big(s^l = s(i)\big)$ satisfying
\[
P\big(s^l = s(j_l)\big) = \sum_{i_l=1}^{N_l}\sum_{k_l=1}^{M_l} \pi^l_{(i_l,j_l|k_l)}\, d^l_{(k_l|i_l)}\, P\big(s^l = s(i_l)\big). \tag{1}
\]

For any player $l$, his/her individual rationality is the player's cost function $J^l$ of any fixed policy $d^l$, defined over all possible combinations of states and actions, and indicates the expected value when taking action $a^l$ in state $s^l$ and following policy $d^l$ thereafter. The $J$-values can be expressed by
\[
J^l(c^l) := \sum_{i_l,k_l} W^l_{(i_l,k_l)}\, d^l_{(k_l|i_l)}\, P\big(s^l = s_{(i_l)}\big), \tag{2}
\]
where
\[
W^l_{(i_l,k_l)} = \sum_{j_l} J_{(i_l,j_l,k_l)}\, \pi^l_{(i_l,j_l|k_l)},
\]
the function $J_{(i_l,j_l,k_l)}$ being a constant at state $s^{(i_l)}$ when the action $a(k_l)$ is applied. Then, the cost function of each player, depending on the states and actions of all participants, is given by the values $W^l_{(i_1,k_1;\ldots;i_N,k_N)}$, so that the "average cost function" $J^l$ for each player $l$ in the stationary regime can be expressed as
\[
J^l\big(c^0,\ldots,c^N\big) := \sum_{i_0,k_0}\cdots\sum_{i_N,k_N} W^l_{(i_1,k_1,\ldots,i_N,k_N)} \prod_{l'=0}^{N} c^{l'}_{(i_{l'},k_{l'})}, \tag{3}
\]
where
\[
c^l := \big[c^l_{(i_l,k_l)}\big]_{i_l=\overline{1,N_l};\ k_l=\overline{1,M_l}}
\]
is a matrix with elements
\[
c^l_{(i_l,k_l)} = d^l_{(k_l|i_l)}\, P\big(s^l = s(i_l)\big) \tag{4}
\]
satisfying
\[
c^l \in C^l_{\mathrm{adm}} = \Big\{ c^l : \sum_{i_l,k_l} c^l_{(i_l,k_l)} = 1,\ c^l_{(i_l,k_l)} \ge 0,\
\sum_{k_l} c^l_{(j_l,k_l)} = \sum_{i_l,k_l} \pi^l_{(i_l,j_l|k_l)}\, c^l_{(i_l,k_l)} \Big\}. \tag{5}
\]
Notice that by (4) it follows that
\[
P\big(s^l = s(i_l)\big) = \sum_{k_l} c^l_{(i_l,k_l)}, \qquad
d^l_{(k_l|i_l)} = \frac{c^l_{(i_l,k_l)}}{\sum_{k_l} c^l_{(i_l,k_l)}}. \tag{6}
\]
In the ergodic case,
\[
\sum_{k_l} c^l_{(i_l,k_l)} > 0
\]
for all $l = \overline{0,N}$. The individual aim of each participant is
\[
J^l(c^l) \to \min_{c^l \in C^l_{\mathrm{adm}}}.
\]
Obviously, here we have a conflict situation which can be resolved by the Stackelberg–Nash equilibrium concept discussed in detail below.

3. Formulation of the problem

3.1. Stackelberg–Nash equilibrium concept. To simplify the descriptions, below let us introduce the new variables
\[
x := \mathrm{col}\, c^{(0)}, \quad X := C^{(0)}_{\mathrm{adm}}, \qquad
v^{(l)} := \mathrm{col}\, c^{(l)}, \quad V^{(l)} := C^{(l)}_{\mathrm{adm}}, \quad l = \overline{1,N}. \tag{7}
\]
Consider a non-zero sum game with a leader whose strategies are denoted by $x \in X$ and $N$ followers with strategies $v^l \in V^l$, $l = \overline{1,N}$. Denote by
\[
v = (v^1,\ldots,v^N) \in V := \bigotimes_{l=1}^{N} V^l
\]
the joint strategy of the followers. The leader is assumed to anticipate the reactions of the followers. They are trying to reach one of the Nash equilibria for any fixed strategy $x$ of the leader, that is, to find a joint strategy $w^* = \big(w^{1*},\ldots,w^{N*}\big) \in W$ satisfying, for any admissible $x \in X$, $w \in W$ and any $l = \overline{1,N}$, the system of inequalities (the Nash condition)
\[
g_l\big(v^l, w^{\hat l *}\,\big|\,x\big) \le 0 \quad \text{for any } v^l \in V^l \text{ and all } l = \overline{1,N},
\qquad
g_l\big(v^l, w^{\hat l}\,\big|\,x\big) := \varphi_l\big(w^l, w^{\hat l}\,\big|\,x\big) - \varphi_l\big(v^l, w^{\hat l}\,\big|\,x\big), \tag{8}
\]
where $w^{\hat l}$ is a strategy of the rest of the followers adjoint to $v^l$, namely,
\[
w^{\hat l} := \big(v^1,\ldots,v^{l-1},v^{l+1},\ldots,v^N\big) \in W^{\hat l} := \bigotimes_{m=1,\,m\neq l}^{N} V^m
\]
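As an illustration of the change of variables (4)–(6), the sketch below (not from the paper; the helper names and the tolerances are assumptions) builds the matrix $c^l$ from a strategy $d^l$ and a stationary distribution, recovers $d^l$ back via (6), and checks the constraints defining $C^l_{\mathrm{adm}}$ in (5).

```python
import numpy as np

def c_from_strategy(d, p_stat):
    """Eq. (4): c[i, k] = d_(k|i) * P(s = s(i))."""
    return d * p_stat[:, None]

def strategy_from_c(c, tol=1e-12):
    """Eq. (6): d_(k|i) = c[i, k] / sum_k c[i, k] (row sums are positive in the ergodic case)."""
    row = c.sum(axis=1, keepdims=True)
    return c / np.maximum(row, tol)

def is_admissible(c, pi, tol=1e-8):
    """Checks the three conditions of C_adm in Eq. (5)."""
    nonneg = np.all(c >= -tol)
    normalized = abs(c.sum() - 1.0) < tol
    # Ergodicity (balance) constraint: sum_k c[j, k] = sum_{i,k} pi[i, j, k] * c[i, k] for every j.
    lhs = c.sum(axis=1)
    rhs = np.einsum('ijk,ik->j', pi, c)
    return nonneg and normalized and np.allclose(lhs, rhs, atol=tol)
```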

and
\[
w^l := \arg\min_{v^l \in V^l} \varphi_l\big(v^l, w^{\hat l}\,\big|\,x\big).
\]
Here $\varphi_l(v^l, w^{\hat l}|x)$ is the cost-function of the player $l$, which plays the strategy $v^l \in V^l$, while the rest of the players play the strategy $w^{\hat l} \in W^{\hat l}$.

Lemma 1. The Nash equilibrium $\bar w \in W$ for the followers given a fixed strategy of the leader $x \in X$ (8) can be equivalently expressed in the joint format (Tanaka and Yokoyama, 1991)
\[
\max_{v \in W} g(v, \bar w\,|\,x) \le 0,
\]
where
\[
g(v, w\,|\,x) := \sum_{l=1}^{N} \Big[ \varphi_l\big(w^l, w^{\hat l}\,\big|\,x\big) - \varphi_l\big(v^l, w^{\hat l}\,\big|\,x\big) \Big],
\qquad
\varphi_l\big(w^l, w^{\hat l}\,\big|\,x\big) := \min_{z^l \in W^l} \varphi_l\big(z^l, w^{\hat l}\,\big|\,x\big), \tag{9}
\]
\[
v^l \in W^l, \qquad w \in W := \bigotimes_{l=1}^{N} W^l.
\]

Proof. Summing (8) implies (9). Conversely, taking $v^m = w^m$ for all $m \neq l$ in (9), which is valid for any admissible $v^l$, we obtain (8). ∎

Notice that the condition $g(v, \bar w\,|\,x) \le 0$ in (9) is equivalent to
\[
\max_{v \in V} \{ g(v, \bar w\,|\,x) \} \le 0
\]
for any fixed $v \in V$ and any $x \in X$.

Let $f(x,w)$ ($x \in X$, $w \in W$) be the cost function of the leader. The functions of the followers $\varphi_l(v^l, w^{\hat l}|x)$, $l = \overline{1,N}$, and the function of the leader $f(x,w)$ are assumed to be convex in all their arguments.

Definition 1. A strategy $x^* \in X$ of the leader together with the collection $w^* \in W$ of the followers is said to be a Stackelberg–Nash equilibrium if
\[
(x^*, w^*) \in \mathrm{Arg}\min_{x\in X}\ \max_{v\in V,\, w\in W} \big\{ f(x,w)\ \big|\ g(v,w|x) \le 0 \big\}. \tag{10}
\]

3.2. Regularized Lagrange principle application. Applying the Lagrange principle (see, for example, Poznyak et al., 2000) to Definition 1, we may conclude that (10) can be rewritten as
\[
(x^*, w^*) \in \mathrm{Arg}\min_{x\in X}\ \max_{v\in V,\, w\in W,\, \lambda \ge 0} \mathcal{L}(x,v,w,\lambda),
\qquad
\mathcal{L}(x,v,w,\lambda) := f(x,w) + \lambda\, g(v,w|x). \tag{11}
\]
The approximative solution obtained by the Tikhonov regularization (see Poznyak et al., 2000) is given by
\[
(x^*_\delta, w^*_\delta) = \arg\min_{x\in X}\ \max_{v\in V,\, w\in W,\, \lambda\ge 0} \mathcal{L}_\delta(x,v,w,\lambda),
\qquad
\mathcal{L}_\delta(x,v,w,\lambda) := f_\delta(x,w) + \lambda\, g_\delta(v,w|x) - \frac{\delta}{2}\lambda^2, \tag{12}
\]
where $\delta > 0$ and
\[
f_\delta(x,w) := f(x,w) + \frac{\delta}{2}\|x\|^2,
\qquad
g_\delta(v,w|x) := g(v,w|x) - \frac{\delta}{2}\big(\|v\|^2 + \|w\|^2\big). \tag{13}
\]
Notice that with $\delta > 0$ the functions considered become strictly convex, providing the uniqueness of the solution of the discussed conditional optimization problem (12). For $\delta = 0$ we have (11). Notice also that the Lagrange function in (12) satisfies the saddle-point condition (Poznyak, 2009), namely, for all $x \in X$ and $\lambda \ge 0$ we have
\[
\mathcal{L}_\delta(x^*_\delta, v_\delta, w_\delta, \lambda_\delta) \le \mathcal{L}_\delta(x^*_\delta, v^*_\delta, w^*_\delta, \lambda^*_\delta) \le \mathcal{L}_\delta(x_\delta, v^*_\delta, w^*_\delta, \lambda^*_\delta). \tag{14}
\]

3.3. Proximal format. In the proximal format (see Antipin, 2005) the relation (12) can be expressed as
\[
\lambda^*_\delta = \arg\max_{\lambda \ge 0} \Big\{ -\tfrac{1}{2}\|\lambda - \lambda^*_\delta\|^2 + \gamma \mathcal{L}_\delta(x^*_\delta, v^*_\delta, w^*_\delta, \lambda) \Big\},
\qquad
x^*_\delta = \arg\min_{x \in X} \Big\{ \tfrac{1}{2}\|x - x^*_\delta\|^2 + \gamma \mathcal{L}_\delta(x, v^*_\delta, w^*_\delta, \lambda^*_\delta) \Big\},
\]
\[
v^*_\delta = \arg\max_{v \in V} \Big\{ -\tfrac{1}{2}\|v - v^*_\delta\|^2 + \gamma \mathcal{L}_\delta(x^*_\delta, v, w^*_\delta, \lambda^*_\delta) \Big\},
\qquad
w^*_\delta = \arg\max_{w \in W} \Big\{ -\tfrac{1}{2}\|w - w^*_\delta\|^2 + \gamma \mathcal{L}_\delta(x^*_\delta, v^*_\delta, w, \lambda^*_\delta) \Big\}, \tag{15}
\]
where the solutions $x^*_\delta, v^*_\delta, w^*_\delta$ and $\lambda^*_\delta$ depend on the small parameters $\delta, \gamma > 0$. Define the extended variables
\[
\tilde x := x \in \tilde X := X,
\qquad
\tilde y := \begin{pmatrix} v \\ w \\ \lambda \end{pmatrix} \in \tilde Y := V \times W \times \mathbb{R}^{+},
\qquad
\tilde w = \begin{pmatrix} \tilde w_1 \\ \tilde w_2 \end{pmatrix} \in \tilde X \times \tilde Y,
\qquad
\tilde v = \begin{pmatrix} \tilde v_1 \\ \tilde v_2 \end{pmatrix} \in \tilde X \times \tilde Y,
\]
and the functions
\[
\tilde{\mathcal{L}}_\delta(\tilde x, \tilde y) := \mathcal{L}_\delta(x, v, w, \lambda),
\qquad
\Psi_\delta(\tilde w, \tilde v) := \tilde{\mathcal{L}}_\delta(\tilde w_1, \tilde v_2) - \tilde{\mathcal{L}}_\delta(\tilde v_1, \tilde w_2).
\]
For $\tilde w_1 = \tilde x$, $\tilde w_2 = \tilde y$, $\tilde v_1 = \tilde v^*_1 = \tilde x^*_\delta$ and $\tilde v_2 = \tilde v^*_2 = \tilde y^*_\delta$, we have
\[
\Psi_\delta(\tilde w, \tilde v) := \tilde{\mathcal{L}}_\delta(\tilde x, \tilde y^*_\delta) - \tilde{\mathcal{L}}_\delta(\tilde x^*_\delta, \tilde y).
\]
In these variables the relation (15) can be represented in a "short format" as
\[
\tilde v^* = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \Big\{ \tfrac{1}{2}\|\tilde w - \tilde v^*\|^2 + \gamma \Psi_\delta(\tilde w, \tilde v^*) \Big\}. \tag{16}
\]

3.4. Extraproximal method. The extraproximal method for the conditional optimization problem (12) was suggested by Antipin (2005). We design the method for the static Stackelberg–Nash game in a general format. The general format iterative version ($n = 0,1,\ldots$) of the extraproximal method with some fixed admissible initial values ($x_0 \in X$, $v_0 \in V$, $w_0 \in W$, and $\lambda_0 \ge 0$) is as follows.

1. The first half-step (prediction):
\[
\bar\lambda_n = \arg\min_{\lambda \ge 0} \Big\{ \tfrac{1}{2}\|\lambda - \lambda_n\|^2 - \gamma \mathcal{L}_\delta(x_n, v_n, w_n, \lambda) \Big\},
\qquad
\bar x_n = \arg\min_{x \in X} \Big\{ \tfrac{1}{2}\|x - x_n\|^2 + \gamma \mathcal{L}_\delta(x, v_n, w_n, \bar\lambda_n) \Big\},
\]
\[
\bar v_n = \arg\min_{v \in V} \Big\{ \tfrac{1}{2}\|v - v_n\|^2 - \gamma \mathcal{L}_\delta(x_n, v, w_n, \bar\lambda_n) \Big\},
\qquad
\bar w_n = \arg\min_{w \in W} \Big\{ \tfrac{1}{2}\|w - w_n\|^2 - \gamma \mathcal{L}_\delta(x_n, v_n, w, \bar\lambda_n) \Big\}, \tag{17}
\]
or, in the extended variables,
\[
\hat v_n = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \Big\{ \tfrac{1}{2}\|\tilde w - \tilde v_n\|^2 + \gamma \Psi_\delta(\tilde w, \tilde v_n) \Big\}. \tag{18}
\]

2. The second (basic) half-step:
\[
\lambda_{n+1} = \arg\min_{\lambda \ge 0} \Big\{ \tfrac{1}{2}\|\lambda - \lambda_n\|^2 - \gamma \mathcal{L}_\delta(\bar x_n, \bar v_n, \bar w_n, \lambda) \Big\},
\qquad
x_{n+1} = \arg\min_{x \in X} \Big\{ \tfrac{1}{2}\|x - x_n\|^2 + \gamma \mathcal{L}_\delta(x, \bar v_n, \bar w_n, \bar\lambda_n) \Big\},
\]
\[
v_{n+1} = \arg\min_{v \in V} \Big\{ \tfrac{1}{2}\|v - v_n\|^2 - \gamma \mathcal{L}_\delta(\bar x_n, v, \bar w_n, \bar\lambda_n) \Big\},
\qquad
w_{n+1} = \arg\min_{w \in W} \Big\{ \tfrac{1}{2}\|w - w_n\|^2 - \gamma \mathcal{L}_\delta(\bar x_n, \bar v_n, w, \bar\lambda_n) \Big\}, \tag{19}
\]
or, in the "short format",
\[
\tilde v_{n+1} = \arg\min_{\tilde w \in \tilde X \times \tilde Y} \Big\{ \tfrac{1}{2}\|\tilde w - \tilde v_n\|^2 + \gamma \Psi_\delta(\tilde w, \hat v_n) \Big\}. \tag{20}
\]

Remark 1. The direct one-step procedure, when $\lambda_{n+1} = \lambda_n$, does not work (see the counter-example of Antipin (2005)).

4. Markov format for the extraproximal method

We consider a three-player Stackelberg game, $l = \overline{1,N}$. Player $l$ should be understood to refer to players in a general expression of the game. In the remainder of this paper, the leader is designated as player (0) and the followers as players (1) and (2). Numbers are used to refer to players in a specific expression. We also assume that the number of strategies and actions that this description can take is finite and fixed. Here we will apply the iterative quadratic method providing a significantly quicker rate of convergence.

4.1. Cost functions and notation. The individual cost-functions of the followers and the leader are defined as follows.

Cost-function for the leader:
\[
J^{(0)}\big(c^{(1)}, c^{(2)}, c^{(0)}\big)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(0)}_{(i_1,k_1;\,i_2,k_2;\,i_0,k_0)}\, c^{(1)}_{(i_1,k_1)}\, c^{(2)}_{(i_2,k_2)}\, c^{(0)}_{(i_0,k_0)}. \tag{21}
\]

Cost-functions for the followers:

Follower 1:
\[
J^{(1)}\big(c^{(1)}, c^{(2)}, c^{(0)}\big)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(1)}_{(i_1,k_1;\,i_2,k_2;\,i_0,k_0)}\, c^{(1)}_{(i_1,k_1)}\, c^{(2)}_{(i_2,k_2)}\, c^{(0)}_{(i_0,k_0)}. \tag{22}
\]

Follower 2:
\[
J^{(2)}\big(c^{(1)}, c^{(2)}, c^{(0)}\big)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(2)}_{(i_1,k_1;\,i_2,k_2;\,i_0,k_0)}\, c^{(1)}_{(i_1,k_1)}\, c^{(2)}_{(i_2,k_2)}\, c^{(0)}_{(i_0,k_0)}. \tag{23}
\]

For $n = 0,1,\ldots$, let us define the vectors
\[
x_n = \mathrm{col}\,c^{(0)}(n), \qquad
v_n = \begin{pmatrix}\mathrm{col}\,c^{(1)}(n)\\ \mathrm{col}\,c^{(2)}(n)\end{pmatrix}, \qquad
w_n = \begin{pmatrix}\mathrm{col}\,c^{(\hat 1)}(n)\\ \mathrm{col}\,c^{(\hat 2)}(n)\end{pmatrix},
\]
together with the best-reply variables
\[
\mathring w_n = \begin{pmatrix}\mathrm{col}\,\mathring c^{(1)}(n)\\ \mathrm{col}\,\mathring c^{(2)}(n)\end{pmatrix}
= \begin{pmatrix}\mathrm{col}\,\arg\min_{z\in C^{(1)}_{\mathrm{adm}}} J^{(1)}\big(z, c^{(\hat 2)}(n), x_n\big)\\ \mathrm{col}\,\arg\min_{z\in C^{(2)}_{\mathrm{adm}}} J^{(2)}\big(c^{(\hat 1)}(n), z, x_n\big)\end{pmatrix}.
\]
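For concreteness, the trilinear cost functions (21)–(23) can be evaluated as tensor contractions. The sketch below is an illustration under assumed shapes and storage conventions (it is not code from the paper): each $W^{(l)}$ is stored as an array with one axis per player, of length $N_l M_l$, and the $c^{(l)}$ matrices are flattened accordingly.

```python
import numpy as np

def average_cost(W, c1, c2, c0):
    """Eqs. (21)-(23): J^(l) = sum over all indices of
    W^(l)[(i1,k1),(i2,k2),(i0,k0)] * c1[i1,k1] * c2[i2,k2] * c0[i0,k0]."""
    return np.einsum('abc,a,b,c->', W, c1.ravel(), c2.ravel(), c0.ravel())

# Hypothetical dimensions: every player has 4 states and 2 actions (as in the pricing example).
rng = np.random.default_rng(0)
W0 = rng.random((8, 8, 8))                      # placeholder for the leader's cost tensor W^(0)
c = [rng.dirichlet(np.ones(8)).reshape(4, 2)    # admissible-looking c^(l) matrices (sum to 1)
     for _ in range(3)]
J0 = average_cost(W0, c[0], c[1], c[2])
```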

Let us introduce for simplicity the following notation:
\[
W^{(l)}_{(i_1,k_1;\,i_2,k_2;\,i_0,k_0)} = W^{(l)}, \quad
c^{(l)}_{(i_l,k_l)} = c^{(l)}, \quad
c^{(\hat l)}_{(i_l,k_l)} = c^{(\hat l)}, \quad
\mathring c^{(l)}_{(i_l,k_l)} = \mathring c^{(l)}, \quad
\bar c^{(l)}_{(i_l,k_l)} = \bar c^{(l)}.
\]
For the leader we have
\[
f(x, w) = \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(0)}\, \mathring c^{(1)}\, \mathring c^{(2)}\, c^{(0)}. \tag{24}
\]
From
\[
J^{(1)}\big(\mathring c^{(1)}, c^{(\hat 2)}\,\big|\,c^{(0)}\big) \le J^{(1)}\big(c^{(1)}, c^{(\hat 2)}\,\big|\,c^{(0)}\big),
\qquad
J^{(2)}\big(\mathring c^{(2)}, c^{(\hat 1)}\,\big|\,c^{(0)}\big) \le J^{(2)}\big(c^{(2)}, c^{(\hat 1)}\,\big|\,c^{(0)}\big),
\]
we have
\[
\varphi_l\big(w^l, w^{\hat l}\,\big|\,x\big)
= J^{(1)}\big(\mathring c^{(1)}, c^{(\hat 2)}\,\big|\,c^{(0)}\big) + J^{(2)}\big(\mathring c^{(2)}, c^{(\hat 1)}\,\big|\,c^{(0)}\big)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} \Big( W^{(1)} \mathring c^{(1)} c^{(\hat 2)} c^{(0)} + W^{(2)} c^{(\hat 1)} \mathring c^{(2)} c^{(0)} \Big), \tag{25}
\]
as well as
\[
\varphi_l\big(v^l, w^{\hat l}\,\big|\,x\big)
= J^{(1)}\big(c^{(1)}, c^{(\hat 2)}\,\big|\,c^{(0)}\big) + J^{(2)}\big(c^{(2)}, c^{(\hat 1)}\,\big|\,c^{(0)}\big)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} \Big( W^{(1)} c^{(1)} c^{(\hat 2)} c^{(0)} + W^{(2)} c^{(\hat 1)} c^{(2)} c^{(0)} \Big). \tag{26}
\]
For computing the Nash equilibrium point, we define the function $g(v, w|x)$ as follows:
\[
g(v, w\,|\,x) = \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}
\Big( W^{(1)} \mathring c^{(1)} c^{(\hat 2)} c^{(0)} + W^{(2)} c^{(\hat 1)} \mathring c^{(2)} c^{(0)}
- W^{(1)} c^{(1)} c^{(\hat 2)} c^{(0)} - W^{(2)} c^{(\hat 1)} c^{(2)} c^{(0)} \Big). \tag{27}
\]

4.2. Tikhonov's regularization method. The regularized problem for (24) is defined as follows:
\[
f_\delta(x, w) = \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(0)}\, \mathring c^{(1)}\, \mathring c^{(2)}\, c^{(0)} + \frac{\delta}{2}\big\|c^{(0)}\big\|^2,
\]
and for (27) we have
\[
g_\delta(v, w\,|\,x)
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}
\Big( W^{(1)} \mathring c^{(1)} c^{(\hat 2)} c^{(0)} + W^{(2)} c^{(\hat 1)} \mathring c^{(2)} c^{(0)}
- W^{(1)} c^{(1)} c^{(\hat 2)} c^{(0)} - W^{(2)} c^{(\hat 1)} c^{(2)} c^{(0)} \Big)
- \frac{\delta}{2}\Big( \big\|\big(c^{(1)};c^{(2)}\big)\big\|^2 + \big\|\big(c^{(\hat 1)};c^{(\hat 2)}\big)\big\|^2 \Big). \tag{28}
\]

4.3. Lagrange principle. Let us apply the Lagrange principle (Poznyak, 2009) with the Lagrange function
\[
\mathcal{L}_\delta(x, v, w, \lambda) = f_\delta(x, w) + \lambda\, g_\delta(v, w\,|\,x) - \frac{\delta}{2}\lambda^2
= \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0} W^{(0)} \mathring c^{(1)} \mathring c^{(2)} c^{(0)}
+ \lambda \sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}
\Big( W^{(1)} \mathring c^{(1)} c^{(\hat 2)} c^{(0)} + W^{(2)} c^{(\hat 1)} \mathring c^{(2)} c^{(0)}
- W^{(1)} c^{(1)} c^{(\hat 2)} c^{(0)} - W^{(2)} c^{(\hat 1)} c^{(2)} c^{(0)} \Big)
- \lambda \frac{\delta}{2}\Big( \big\|\big(c^{(1)};c^{(2)}\big)\big\|^2 + \big\|\big(c^{(\hat 1)};c^{(\hat 2)}\big)\big\|^2 \Big)
+ \frac{\delta}{2}\big\|c^{(0)}\big\|^2 - \frac{\delta}{2}\lambda^2.
\]

4.4. Extraproximal method for Markov chains. The procedure for solving for the Stackelberg/Nash equilibrium point by the extraproximal method consists of the implementation of the following "iterative rules". Substituting the explicit form of $\mathcal{L}_\delta$ given above, every sub-step except the $\lambda$-updates becomes an explicit quadratic program in the corresponding block of variables (in $c^{(0)}$ for the $x$-steps, in $(c^{(1)};c^{(2)})$ for the $v$-steps and in $(c^{(\hat 1)};c^{(\hat 2)})$ for the $w$-steps), with the remaining blocks frozen at their current values; these programs are solved with the quadratic programming solver of Section 4.5, while the $\lambda$-updates admit the closed forms given below.

1. First half-step.

1.a. For
\[
\bar\lambda_n = \arg\min_{\lambda\ge 0}\Big\{\tfrac{1}{2}(\lambda-\lambda_n)^2 - \gamma\mathcal{L}_\delta(x_n,v_n,w_n,\lambda)\Big\},
\]
we have
\[
\bar\lambda_n = \frac{1}{1+\gamma\delta}\Bigg[\lambda_n
+ \gamma\sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}\Big(W^{(1)}\mathring c^{(1)}c^{(\hat 2)}(n)c^{(0)}(n) + W^{(2)}c^{(\hat 1)}(n)\mathring c^{(2)}c^{(0)}(n)
- W^{(1)}c^{(1)}(n)c^{(\hat 2)}(n)c^{(0)}(n) - W^{(2)}c^{(\hat 1)}(n)c^{(2)}(n)c^{(0)}(n)\Big)
- \gamma\frac{\delta}{2}\Big(\big\|\big(c^{(1)}(n);c^{(2)}(n)\big)\big\|^2 + \big\|\big(c^{(\hat 1)}(n);c^{(\hat 2)}(n)\big)\big\|^2\Big)\Bigg]^{+}.
\]

1.b. For
\[
\bar x_n = \arg\min_{x\in X}\Big\{\tfrac{1}{2}\|x-x_n\|^2 + \gamma\mathcal{L}_\delta(x,v_n,w_n,\bar\lambda_n)\Big\},
\]
the minimization is carried out over $c^{(0)}$ with the objective
\[
\Big(\tfrac{1}{2}+\gamma\tfrac{\delta}{2}\Big)\big\|c^{(0)}\big\|^2 - c^{(0)\top}(n)\,c^{(0)}
+ \gamma\sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}\Big[W^{(0)}\mathring c^{(1)}\mathring c^{(2)}c^{(0)}
+ \bar\lambda_n\Big(W^{(1)}\mathring c^{(1)}c^{(\hat 2)}(n)c^{(0)} + W^{(2)}c^{(\hat 1)}(n)\mathring c^{(2)}c^{(0)}
- W^{(1)}c^{(1)}(n)c^{(\hat 2)}(n)c^{(0)} - W^{(2)}c^{(\hat 1)}(n)c^{(2)}(n)c^{(0)}\Big)\Big]
\]
(up to terms that do not depend on $c^{(0)}$).

1.c. For
\[
\bar v_n = \arg\min_{v\in V}\Big\{\tfrac{1}{2}\|v-v_n\|^2 - \gamma\mathcal{L}_\delta(x_n,v,w_n,\bar\lambda_n)\Big\},
\]
the minimization is carried out over $(c^{(1)};c^{(2)})$ in the same manner.

1.d. For
\[
\bar w_n = \arg\min_{w\in W}\Big\{\tfrac{1}{2}\|w-w_n\|^2 - \gamma\mathcal{L}_\delta(x_n,v_n,w,\bar\lambda_n)\Big\},
\]
the minimization is carried out over $(c^{(\hat 1)};c^{(\hat 2)})$ in the same manner.

2. Second (basic) half-step.

2.a. For
\[
\lambda_{n+1} = \arg\min_{\lambda\ge 0}\Big\{\tfrac{1}{2}(\lambda-\lambda_n)^2 - \gamma\mathcal{L}_\delta(\bar x_n,\bar v_n,\bar w_n,\lambda)\Big\},
\]
we have
\[
\lambda_{n+1} = \frac{1}{1+\gamma\delta}\Bigg[\lambda_n
+ \gamma\sum_{i_1,k_1}\sum_{i_2,k_2}\sum_{i_0,k_0}\Big(W^{(1)}\mathring c^{(1)}\bar c^{(\hat 2)}(n)\bar c^{(0)}(n) + W^{(2)}\bar c^{(\hat 1)}(n)\mathring c^{(2)}\bar c^{(0)}(n)
- W^{(1)}\bar c^{(1)}(n)\bar c^{(\hat 2)}(n)\bar c^{(0)}(n) - W^{(2)}\bar c^{(\hat 1)}(n)\bar c^{(2)}(n)\bar c^{(0)}(n)\Big)
- \gamma\frac{\delta}{2}\Big(\big\|\big(\bar c^{(1)}(n);\bar c^{(2)}(n)\big)\big\|^2 + \big\|\big(\bar c^{(\hat 1)}(n);\bar c^{(\hat 2)}(n)\big)\big\|^2\Big)\Bigg]^{+}.
\]

2.b. For
\[
x_{n+1} = \arg\min_{x\in X}\Big\{\tfrac{1}{2}\|x-x_n\|^2 + \gamma\mathcal{L}_\delta(x,\bar v_n,\bar w_n,\bar\lambda_n)\Big\},
\]
2.c. for
\[
v_{n+1} = \arg\min_{v\in V}\Big\{\tfrac{1}{2}\|v-v_n\|^2 - \gamma\mathcal{L}_\delta(\bar x_n,v,\bar w_n,\bar\lambda_n)\Big\},
\]
and 2.d. for
\[
w_{n+1} = \arg\min_{w\in W}\Big\{\tfrac{1}{2}\|w-w_n\|^2 - \gamma\mathcal{L}_\delta(\bar x_n,\bar v_n,w,\bar\lambda_n)\Big\},
\]
the minimizations are carried out, respectively, over $c^{(0)}$, $(c^{(1)};c^{(2)})$ and $(c^{(\hat 1)};c^{(\hat 2)})$, with the data taken at the predicted values $\bar c^{(\cdot)}(n)$ and $\bar\lambda_n$, exactly as in steps 1.b–1.d.

4.5. Quadratic programming solver. Let us restrict the expression (3) for the admissible set and the stationary case with the following constraint. Each matrix $d^{(l)}_{(i_l|k_l)}$ represents a stationary mixed strategy that belongs to the simplex
\[
S^{(N_l M_l)} := \Big\{ d^{(l)}_{(i_l|k_l)} \in \mathbb{R}^{N_l M_l} : d^{(l)}_{(i_l|k_l)} \ge 0,\
\sum_{k_l=1}^{M_l} d^{(l)}_{(i_l|k_l)} = 1 \Big\}.
\]
A necessary and sufficient condition for $d^{(l)}_{(i_l|k_l)}$ to be a Stackelberg/Nash equilibrium point is the solution of the following QPP (quadratic programming problem):
\[
J^l\big(d_{(i_0,k_0)}, d_{(i_1,k_1)}, \ldots, d_{(i_N,k_N)}\big) \to \min_{d^{(l)}_{(i_l|k_l)}} \tag{29}
\]
subject to
\[
d^{(l)}_{(i_l|k_l)} \in S^{(N_l M_l)}.
\]
We consider the problem
\[
\tfrac{1}{2} X^\top H X + C^\top X \to \min_{X},
\qquad
A X \le b, \quad A_{\mathrm{eq}} X = b_{\mathrm{eq}}, \tag{30}
\]
where $0 \le X \le 1$, $A^l \in \mathbb{R}^{(N_l+1)\times(N_l M_l)}$ and $b^l \in \mathbb{R}^{N_l+1}$.

The vector $C = C^l_{(N|M)}$ is defined as
\[
C^l_{(1|1)} = W^l_{(1|1)}, \quad C^l_{(2|1)} = W^l_{(2|1)}, \quad \ldots, \quad C^l_{(N|1)} = W^l_{(N|1)}, \quad \ldots, \quad
C^l_{(1|M_l)} = W^l_{(1|M_l)}, \quad \ldots, \quad C^l_{(N_l|M_l)} = W^l_{(N_l|M_l)},
\]
that is, $C_{(N|M)}$ is equal to $W_{(i_1,k_1,\ldots,i_N,k_N)}$ ordered by columns. The vector $b^l$ is defined as
\[
b^l = (\underbrace{0,\ldots,0}_{N_l\ \text{times}},\, 1)^\top \in \mathbb{R}^{N_l+1}.
\]
We will construct the matrix $A^l \in \mathbb{R}^{N_l\times(N_l M_l)}$ using the ergodicity constraints defined in (5),
\[
0 = \sum_{k_l=1}^{M_l}\sum_{i_l=1}^{N_l} \pi^{(l)}_{(i_l j_l|k_l)}\, c^{(l)}_{(i_l|k_l)} - \sum_{k_l=1}^{M_l} c^{(l)}_{(j_l|k_l)}, \qquad j_l = \overline{1,N_l}. \tag{31}
\]
Then, we have, for $j_l = 1,\ldots,N_l$,
\[
\sum_{k_l=1}^{M_l}\sum_{i_l=1}^{N_l} \pi^{(l)}_{(i_l j_l|k_l)}\, c^{(l)}_{(i_l|k_l)} - \sum_{k_l=1}^{M_l} c^{(l)}_{(j_l|k_l)} = 0.
\]
Developing the formulas of (31) for $N$ players and multiplying by $c^{(l)}_{(i_l|k_l)}$ for each component of (31), we have
\[
BL^{(j_l,i_l)} = \big[\pi^{(l)}_{(j_l,i_l|k_l)} - \delta_{(j_l,i_l)}\big]_{j_l=\overline{1,N_l},\,i_l=\overline{1,N_l}}
= \begin{bmatrix}
\pi^{(l)}_{(1,1|1)} - 1 & \cdots & \pi^{(l)}_{(N_l,1|M_l)} \\
\vdots & \ddots & \vdots \\
\pi^{(l)}_{(N_l,1|1)} & \cdots & \pi^{(l)}_{(N_l,N_l|M_l)} - 1
\end{bmatrix},
\]
where $\delta_{(j_l,i_l)}$ is Kronecker's delta. Then $A_{\mathrm{eq}}$ is defined as follows:
\[
A_{\mathrm{eq}} = \begin{bmatrix}
BL^{(1)} & 0 & \cdots & 0 \\
0 & BL^{(2)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & BL^{(M_l)} \\
[1] & 0 & \cdots & 0 \\
0 & [1] & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & [1]
\end{bmatrix},
\]
where $[1] = (1,\ldots,1) \in \mathbb{R}^{N_l}$, and $b_{\mathrm{eq}} \in \mathbb{R}^{\sum_{l=1}^{N} N_l + N}$ is
\[
b_{\mathrm{eq}} = \big(0\ \ldots\ 0\ \ 1\ \ldots\ 1\big)^\top.
\]
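As an illustration of how a problem of the form (30) can be handled numerically, the sketch below solves a small quadratic program with equality constraints and box bounds using SciPy's SLSQP routine. The problem data are random placeholders, not the matrices of the paper, and the helper name is an assumption; a single normalization row stands in for the full $A_{\mathrm{eq}}$ built from the $BL$ blocks.

```python
import numpy as np
from scipy.optimize import minimize

def solve_qpp(H, C, A_eq, b_eq):
    """Solves (30): min 1/2 X'HX + C'X  subject to  A_eq X = b_eq,  0 <= X <= 1."""
    n = C.size
    fun = lambda X: 0.5 * X @ H @ X + C @ X
    jac = lambda X: H @ X + C                      # gradient (H symmetric)
    cons = [{'type': 'eq', 'fun': lambda X: A_eq @ X - b_eq,
             'jac': lambda X: A_eq}]
    x0 = np.full(n, 1.0 / n)                       # feasible starting point for the sum-to-one row
    res = minimize(fun, x0, jac=jac, bounds=[(0.0, 1.0)] * n,
                   constraints=cons, method='SLSQP')
    return res.x

# Placeholder data: 8 variables (4 states x 2 actions of one player).
rng = np.random.default_rng(1)
M = rng.random((8, 8))
H = M @ M.T                        # symmetric positive semidefinite Hessian
C = rng.random(8)
A_eq = np.ones((1, 8))             # stand-in for the last row of A_eq (normalization)
b_eq = np.array([1.0])
X_opt = solve_qpp(H, C, A_eq, b_eq)
```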

5. Convergence analysis

5.1. Auxiliary results. Let us prove the following auxiliary results.

Lemma 2. Let $f(z)$ be a convex function defined on a convex set $Z$. If $z^*$ is a minimizer of the function
\[
\varphi(z) = \tfrac{1}{2}\|z - x\|^2 + \alpha f(z) \tag{32}
\]
on $Z$ with fixed $x$, then $f(z)$ satisfies the inequality
\[
\tfrac{1}{2}\|z^* - x\|^2 + \alpha f(z^*)
\le \tfrac{1}{2}\|z - x\|^2 + \alpha f(z) - \tfrac{1}{2}\|z - z^*\|^2. \tag{33}
\]

Proof. By the necessary condition for a minimum at $z^*$,
\[
\langle z^* - x + \alpha\nabla f(z^*),\, z - z^* \rangle \ge 0,
\]
and the convexity property of $f(z)$, expressed as
\[
f(z) \ge f(z^*) + \langle \nabla f(z^*),\, z - z^* \rangle,
\]
we derive
\[
0 \le \langle z^* - x + \alpha\nabla f(z^*),\, z - z^* \rangle
= \langle z^* - x,\, z - z^* \rangle + \alpha\langle \nabla f(z^*),\, z - z^* \rangle
\le \langle z^* - x,\, z - z^* \rangle + \alpha\big[f(z) - f(z^*)\big].
\]

Using this inequality in the identity
\[
\tfrac{1}{2}\|z - x\|^2 = \tfrac{1}{2}\|z - z^*\|^2 + \langle z^* - x,\, z - z^* \rangle + \tfrac{1}{2}\|z^* - x\|^2,
\]
we get (33). The lemma is proven. ∎

Lemma 3. If all partial derivatives of $\tilde{\mathcal{L}}_\delta(\tilde x, \tilde y)$ satisfy the Lipschitz condition with positive constant $C_0$, then the following Lipschitz-type condition holds:
\[
\Big|\big[\Psi_\delta(\tilde w + h, \tilde v + g) - \Psi_\delta(\tilde w, \tilde v + g)\big]
- \big[\Psi_\delta(\tilde w + h, \tilde v) - \Psi_\delta(\tilde w, \tilde v)\big]\Big| \le C_0\|h\|\,\|g\|, \tag{34}
\]
valid for any $\tilde w, h, \tilde v, g \in \tilde X \times \tilde Y$.

Proof. By Lagrange's formula
\[
f(x+h) - f(x) = \int_0^1 \langle \nabla f(x + th),\, h \rangle\, \mathrm{d}t,
\]
we have
\[
\big[\Psi_\delta(\tilde w + h, \tilde v + g) - \Psi_\delta(\tilde w, \tilde v + g)\big]
- \big[\Psi_\delta(\tilde w + h, \tilde v) - \Psi_\delta(\tilde w, \tilde v)\big]
= \int_0^1 \langle \nabla\Psi_\delta(\tilde w + th, \tilde v + g) - \nabla\Psi_\delta(\tilde w + th, \tilde v),\, h \rangle\, \mathrm{d}t
\le \int_0^1 \big|\langle \nabla\Psi_\delta(\tilde w + th, \tilde v + g) - \nabla\Psi_\delta(\tilde w + th, \tilde v),\, h \rangle\big|\, \mathrm{d}t
\le \int_0^1 C_0\|h\|\,\|g\|\, \mathrm{d}t = C_0\|h\|\,\|g\|,
\]
which proves the lemma. ∎

5.2. Main convergence theorem. The following theorem presents the convergence conditions of (17)–(19) and gives an estimate of its rate of convergence.

Theorem 1. Assume that the problem (16) has a solution. Let $\tilde{\mathcal{L}}_\delta(\tilde x, \tilde y)$ be differentiable in $\tilde x$ and $\tilde y$, and let its partial derivative with respect to $\tilde y$ satisfy the Lipschitz condition with positive constant $C$. Then, for any $\delta \in (0,1)$, there exists a small enough $\gamma_0 = \gamma_0(\delta) > 0$ (depending on the Lipschitz constant) such that, for every $\gamma \in (0, \gamma_0)$, the iterates $\tilde v_n$ of the method converge to the solution $\tilde v^*_\delta$ of (16) with the rate estimate given at the end of the proof below.

Proof. Taking in (33) $\alpha = \gamma$, $z^* = \hat v_n$, $x = \tilde v_n$, $z = \tilde w$ and $f(\cdot) = \Psi_\delta(\cdot, \tilde v_n)$, so that $f(z^*) = \Psi_\delta(\hat v_n, \tilde v_n)$, we obtain
\[
\tfrac{1}{2}\|\hat v_n - \tilde v_n\|^2 + \gamma\Psi_\delta(\hat v_n, \tilde v_n)
\le \tfrac{1}{2}\|\tilde w - \tilde v_n\|^2 + \gamma\Psi_\delta(\tilde w, \tilde v_n) - \tfrac{1}{2}\|\tilde w - \hat v_n\|^2. \tag{37}
\]
Again setting in (33) $\alpha = \gamma$, $z^* = \tilde v_{n+1}$, $x = \tilde v_n$, $z = \tilde w$ and $f(\cdot) = \Psi_\delta(\cdot, \hat v_n)$, we get
\[
\tfrac{1}{2}\|\tilde v_{n+1} - \tilde v_n\|^2 + \gamma\Psi_\delta(\tilde v_{n+1}, \hat v_n)
\le \tfrac{1}{2}\|\tilde w - \tilde v_n\|^2 + \gamma\Psi_\delta(\tilde w, \hat v_n) - \tfrac{1}{2}\|\tilde w - \tilde v_{n+1}\|^2. \tag{38}
\]
Selecting $\tilde w = \tilde v_{n+1}$ in (37) and $\tilde w = \hat v_n$ in (38), we obtain
\[
\tfrac{1}{2}\|\hat v_n - \tilde v_n\|^2 + \gamma\Psi_\delta(\hat v_n, \tilde v_n)
\le \tfrac{1}{2}\|\tilde v_{n+1} - \tilde v_n\|^2 + \gamma\Psi_\delta(\tilde v_{n+1}, \tilde v_n) - \tfrac{1}{2}\|\tilde v_{n+1} - \hat v_n\|^2, \tag{39}
\]
\[
\tfrac{1}{2}\|\tilde v_{n+1} - \tilde v_n\|^2 + \gamma\Psi_\delta(\tilde v_{n+1}, \hat v_n)
\le \tfrac{1}{2}\|\hat v_n - \tilde v_n\|^2 + \gamma\Psi_\delta(\hat v_n, \hat v_n) - \tfrac{1}{2}\|\hat v_n - \tilde v_{n+1}\|^2. \tag{40}
\]
Adding (39) and (40) and using (34) with
\[
\tilde w + h = \tilde v_{n+1}, \quad \tilde w = \hat v_n, \quad \tilde v + g = \tilde v_n, \quad \tilde v = \hat v_n,
\qquad
h = \tilde v_{n+1} - \hat v_n, \quad g = \tilde v_n - \hat v_n,
\]
we finally conclude that
\[
\|\tilde v_{n+1} - \hat v_n\|^2
\le \gamma\big[\Psi_\delta(\tilde v_{n+1}, \tilde v_n) - \Psi_\delta(\hat v_n, \tilde v_n)\big]
- \gamma\big[\Psi_\delta(\tilde v_{n+1}, \hat v_n) - \Psi_\delta(\hat v_n, \hat v_n)\big]
\le \gamma C\|\tilde v_{n+1} - \hat v_n\|\,\|\tilde v_n - \hat v_n\|,
\]
so that
\[
\|\tilde v_{n+1} - \hat v_n\| \le \gamma C\|\tilde v_n - \hat v_n\|. \tag{41}
\]
Next, setting $\tilde w = \tilde v^*_\delta$ in (37) and (38), adding the two resulting inequalities, multiplying by two and rearranging, we obtain
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2 + \|\hat v_n - \tilde v_n\|^2
- 2\gamma\Psi_\delta(\tilde v^*_\delta, \hat v_n)
+ 2\gamma\big[\Psi_\delta(\tilde v_{n+1}, \hat v_n) + \Psi_\delta(\hat v_n, \tilde v_n) - \Psi_\delta(\tilde v_{n+1}, \tilde v_n)\big]
\le \|\tilde v^*_\delta - \tilde v_n\|^2.
\]
Adding and subtracting the term $\Psi_\delta(\hat v_n, \hat v_n)$, we have
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2 + \|\hat v_n - \tilde v_n\|^2
+ 2\gamma\big[\Psi_\delta(\hat v_n, \hat v_n) - \Psi_\delta(\tilde v^*_\delta, \hat v_n)\big]
+ 2\gamma\big[\Psi_\delta(\tilde v_{n+1}, \hat v_n) - \Psi_\delta(\hat v_n, \hat v_n) + \Psi_\delta(\hat v_n, \tilde v_n) - \Psi_\delta(\tilde v_{n+1}, \tilde v_n)\big]
\le \|\tilde v^*_\delta - \tilde v_n\|^2.
\]
Using (34) with $\tilde w + h = \tilde v_{n+1}$, $\tilde w = \hat v_n$, $\tilde v + g = \tilde v_n$ and $\tilde v = \hat v_n$ (so that $h = \tilde v_{n+1} - \hat v_n$ and $g = \tilde v_n - \hat v_n$), the inequality above becomes
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2 + \|\hat v_n - \tilde v_n\|^2
+ 2\gamma\big[\Psi_\delta(\hat v_n, \hat v_n) - \Psi_\delta(\tilde v^*_\delta, \hat v_n)\big]
- 2\gamma C\|\tilde v_{n+1} - \hat v_n\|\,\|\tilde v_n - \hat v_n\|
\le \|\tilde v^*_\delta - \tilde v_n\|^2.
\]
Applying (41) to the last term on the left-hand side and in view of the strict convexity property of $\Psi_\delta$ given by
\[
\Psi_\delta(\hat v_n, \hat v_n) - \Psi_\delta(\tilde v^*_\delta, \hat v_n) \ge \delta\|\hat v_n - \tilde v^*_\delta\|^2,
\]
we get
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2
+ 2\gamma\delta\|\hat v_n - \tilde v^*_\delta\|^2
+ \big(1 - 2\gamma^2 C^2\big)\|\tilde v_n - \hat v_n\|^2
\le \|\tilde v^*_\delta - \tilde v_n\|^2.
\]
Applying the identity $2\langle a - c,\, c - b \rangle = \|a - b\|^2 - \|a - c\|^2 - \|c - b\|^2$ with $a = \hat v_n$, $b = \tilde v^*_\delta$ and $c = \tilde v_n$ to the left-hand side of the last inequality, we have
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2
+ \big(1 + 2\gamma\delta - 2\gamma^2 C^2\big)\|\tilde v_n - \hat v_n\|^2
+ 4\gamma\delta\langle \hat v_n - \tilde v_n,\, \tilde v_n - \tilde v^*_\delta \rangle
+ 2\gamma\delta\|\tilde v_n - \tilde v^*_\delta\|^2
\le \|\tilde v^*_\delta - \tilde v_n\|^2.
\]
Defining $d = 1 + 2\gamma\delta - 2\gamma^2 C^2$ and completing the square in the third and fourth terms yields
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 + \|\tilde v_{n+1} - \hat v_n\|^2
+ \Big\|\sqrt{d}\,(\hat v_n - \tilde v_n) + \frac{2\gamma\delta}{\sqrt{d}}\,(\tilde v_n - \tilde v^*_\delta)\Big\|^2
- \frac{(2\gamma\delta)^2}{d}\|\tilde v_n - \tilde v^*_\delta\|^2
+ 2\gamma\delta\|\tilde v_n - \tilde v^*_\delta\|^2
\le \|\tilde v^*_\delta - \tilde v_n\|^2,
\]
finally implying that
\[
\|\tilde v^*_\delta - \tilde v_{n+1}\|^2 \le q\,\|\tilde v^*_\delta - \tilde v_n\|^2 \le q^{\,n+1}\,\|\tilde v^*_\delta - \tilde v_0\|^2 \xrightarrow[n\to\infty]{} 0,
\qquad
q = 1 - 2\gamma\delta + \frac{(2\gamma\delta)^2}{d} \in (0,1).
\]
The theorem is proven. ∎

6. Numerical example

We consider a pricing oligopoly model (Karpowicz, 2012; Nowak and Romaniuk, 2013) with a leader and two followers. For instance, let us consider AA, one of the most profitable airlines globally, which has always had the reputation of a leader and industry challenger. Let us also consider two local flying companies, called AI and AII, as the followers of AA on the market. AI and AII were forced to immediately start competing with international airlines for routes, getting access to airports, securing flight slots and landing rights, and attracting a new customer base.

We suppose that each consumer buys at most one ticket. It is a common practice for airlines to price discriminate, i.e., consumers coming to the market at different times have valuations inversely related to the length of time between the purchase and the flight. This is the case of business people who decide to travel at the last moment and pay much more money for getting a ticket. On the other hand, people who plan to travel a year ahead pay significantly less money. The linear demand assumes a continuum of people between these extremes.

The dynamics of the game are as follows: (a) AA is the first mover and manages all its advantage, attracting the highest-value customers who pay a uniformly high price; (b) AI and AII price discriminate over the residual demand. The firms can choose to offer limited airline seats for a specific route: high, medium, low, or none at all. The market price of the seats decreases with increasing total quantity offered by the companies. If the companies decide to offer a high quantity of airline seats, the price collapses so that profits drop to zero. Figure 1 describes the states and actions corresponding to the Markov chain of the pricing problem.
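Before turning to the data of the example, the contraction factor of Theorem 1 can be checked numerically. The snippet below (an illustration, not part of the paper) evaluates q = 1 − 2γδ + (2γδ)²/d for the step sizes used below and an assumed Lipschitz constant C = 1, which is only a placeholder since the paper does not report this constant.

```python
def contraction_factor(gamma, delta, C):
    """q = 1 - 2*gamma*delta + (2*gamma*delta)**2 / d, with d = 1 + 2*gamma*delta - 2*gamma**2*C**2."""
    d = 1.0 + 2.0 * gamma * delta - 2.0 * gamma ** 2 * C ** 2
    return 1.0 - 2.0 * gamma * delta + (2.0 * gamma * delta) ** 2 / d

q = contraction_factor(gamma=0.008, delta=14.3, C=1.0)  # roughly 0.81 for these values
assert 0.0 < q < 1.0  # the error contracts at every iteration
```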

Fig. 1. Pricing Markov chain.

Following Eqn. (2), the individual utilities are as follows:

AA        high  medium  low  none
High        0     30     43    69
Medium     22     39     47    71
Low        25     37     43    61
None        0      0      0     0

AI        high  medium  low  none
High        0     22     24     0
Medium     30     39     37     0
Low        43     47     43     0
None       59     41     55     0

AII       high  medium  low  none
High        0     24     27     3
Medium     28     39     39     5
Low        37     45     43    10
None       54     39     59     0

The transition matrices are
\[
\pi^{AA}_{(i,j|1)} = \begin{bmatrix}
0.0239 & 0.8486 & 0.0471 & 0.0804 \\
0.2961 & 0.3267 & 0.2228 & 0.1543 \\
0.0975 & 0.1078 & 0.6152 & 0.1794 \\
0.2028 & 0.2719 & 0.3546 & 0.1707
\end{bmatrix},
\qquad
\pi^{AA}_{(i,j|2)} = \begin{bmatrix}
0.3579 & 0.0507 & 0.2801 & 0.3113 \\
0.5572 & 0.3785 & 0.0120 & 0.0523 \\
0.1751 & 0.2076 & 0.4572 & 0.1602 \\
0.0829 & 0.2980 & 0.3511 & 0.2681
\end{bmatrix},
\]
\[
\pi^{AI}_{(i,j|1)} = \begin{bmatrix}
0.3288 & 0.2152 & 0.3739 & 0.0821 \\
0.1308 & 0.3639 & 0.4474 & 0.0578 \\
0.1394 & 0.1225 & 0.4843 & 0.2538 \\
0.5846 & 0.1538 & 0.0937 & 0.1679
\end{bmatrix},
\qquad
\pi^{AI}_{(i,j|2)} = \begin{bmatrix}
0.4525 & 0.3901 & 0.1061 & 0.0513 \\
0.0755 & 0.2378 & 0.3888 & 0.2979 \\
0.0373 & 0.1512 & 0.7654 & 0.0461 \\
0.2750 & 0.1673 & 0.3104 & 0.2474
\end{bmatrix},
\]
\[
\pi^{AII}_{(i,j|1)} = \begin{bmatrix}
0.2564 & 0.7202 & 0.0005 & 0.0230 \\
0.0115 & 0.3802 & 0.2628 & 0.3455 \\
0.2662 & 0.0333 & 0.3895 & 0.3111 \\
0.0513 & 0.5530 & 0.3145 & 0.0812
\end{bmatrix},
\qquad
\pi^{AII}_{(i,j|2)} = \begin{bmatrix}
0.0077 & 0.0217 & 0.0346 & 0.9360 \\
0.0758 & 0.3909 & 0.3933 & 0.1401 \\
0.2946 & 0.1857 & 0.1726 & 0.3470 \\
0.0611 & 0.2346 & 0.5648 & 0.1395
\end{bmatrix}.
\]

Each company knows how the increased offer lowers airline seat prices and their profits. Strategic dependence is a common practice in the global airline industry. In this case, AA does this so it can predict the strategic pricing movements of other airlines. By anticipating the predicted response of the followers, AA is able to make a strategy to become and stay successful.
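A small sanity check on these data (a sketch, not part of the paper): every matrix π must be row stochastic, which holds up to the rounding of the published four-decimal entries.

```python
import numpy as np

# Transition matrix of AA under action k = 1, copied from the display above.
pi_AA_1 = np.array([
    [0.0239, 0.8486, 0.0471, 0.0804],
    [0.2961, 0.3267, 0.2228, 0.1543],
    [0.0975, 0.1078, 0.6152, 0.1794],
    [0.2028, 0.2719, 0.3546, 0.1707],
])

# Rows are probability distributions over the next state (up to rounding of the printed values).
assert np.all(pi_AA_1 >= 0.0)
assert np.allclose(pi_AA_1.sum(axis=1), 1.0, atol=1e-3)
```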

Given δ = 14.3, γ = 0.008 and applying the extraproximal method for Markov chains (see Section 4.4), we obtain c_{(i|k)} for the leader (see Fig. 3) and c_{(i|k)} for the followers (see Figs. 4 and 5). Figure 2 shows the convergence of the parameter λ. Then, applying (6), we have that the strategies needed to converge to a Stackelberg equilibrium are as follows:
\[
d^{AA} = \begin{bmatrix}
0.8995 & 0.1005 \\
0.0656 & 0.9344 \\
0.2172 & 0.7828 \\
0.2827 & 0.7173
\end{bmatrix}, \tag{42}
\]
\[
d^{AI} = \begin{bmatrix}
0.1077 & 0.8923 \\
0.1311 & 0.8689 \\
0.9035 & 0.0965 \\
0.8349 & 0.1651
\end{bmatrix}, \tag{43}
\]
\[
d^{AII} = \begin{bmatrix}
0.7331 & 0.2669 \\
0.9214 & 0.0786 \\
0.1230 & 0.8770 \\
0.8856 & 0.1144
\end{bmatrix}. \tag{44}
\]

Fig. 2. Convergence of the parameter λ.

Fig. 3. Convergence of the strategies of AA.

Air travel has become a commodity and major routes are saturated with fierce competition. Low-cost carriers have significantly influenced consumer behavior towards cheap price bargains among regular travelers and, increasingly, among business travelers. AA has consistently been one of the most profitable airlines globally, and has always had the reputation of a leader and industry challenger, fixing its strategies (42) only on the most lucrative routes. AI, looking at the strategies of AA, decided on a fully branded product/service differentiation strategy (43) as follows: (a) to compete with AA choosing a mixed strategy d_{(1,2)} = 0.8923 for a lucrative international route and the rest, d_{(1,1)} = 0.1077, for a domestic route seriously considered by AA; (b) to strongly compete with AA in strategy d_{(2,1)} for an international route; and (c) to fully compete with AA in strategies d_{(3,1)} and d_{(4,1)} for the most lucrative domestic routes. AII, instead, launches a strategy (44) for local and short-haul routes to stay at the forefront of competition, deciding (a) to compete with AA in strategy d_{(1,1)} for a domestic route; (b) to compete partially with AA choosing a mixed strategy d_{(2,2)} = 0.0786 for a lucrative international route and the rest, d_{(2,1)} = 0.9214, for a local route not seriously considered by AA; (c) to absolutely compete with AA in strategy d_{(3,2)}, choosing seriously only an international route; and (d) to partially compete with AA choosing a mixed strategy d_{(4,2)} = 0.1144 for a lucrative international route and the rest, d_{(4,1)} = 0.8856, for a local route partially considered by AA.

Fig. 4. Convergence of the strategies of AI.

Remark 2. Price efficiency and effectiveness are critical for an airline's ability to compete and survive. However, this does not mean that every airline should seek to offer the lowest price. Instead, it is important to offer the appropriate price for routes not considered by the leader airline.
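Each row of (42)–(44) should be a probability distribution over the two actions available in the corresponding state; a short check of the reported equilibrium strategies (illustrative only) is:

```python
import numpy as np

d_AA = np.array([[0.8995, 0.1005], [0.0656, 0.9344], [0.2172, 0.7828], [0.2827, 0.7173]])
d_AI = np.array([[0.1077, 0.8923], [0.1311, 0.8689], [0.9035, 0.0965], [0.8349, 0.1651]])
d_AII = np.array([[0.7331, 0.2669], [0.9214, 0.0786], [0.1230, 0.8770], [0.8856, 0.1144]])

for d in (d_AA, d_AI, d_AII):
    assert np.all(d >= 0.0) and np.allclose(d.sum(axis=1), 1.0, atol=1e-4)
```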

Fig. 5. Convergence of the strategies of AII.

7. Conclusion

The main contribution of this paper was the development of the extraproximal method for computing the Stackelberg/Nash equilibria in a class of ergodic controlled finite Markov chains games. The nonlinear programming problem was represented using an implementation of the Lagrange principle. Another important contribution was the use of the regularizing parameter, which provides strong convexity for the cost-functions and, hence, the correctness of the convergence analysis.

For solving the extraproximal method, we presented a two-step iterated procedure which involved an iterative solution of a quadratic programming problem for the solution of the Stackelberg game in terms of Markov chains. A numerical method was presented for computing the first step of the extraproximal method (parameter λ). The convergence of the suggested procedure to the Stackelberg/Nash equilibrium was also analyzed. It is important to note that we provided all the details needed to implement the extraproximal method in an efficient and numerically stable way for ergodic controlled finite Markov chains games.

References

Antipin, A.S. (2005). An extraproximal method for solving equilibrium programming problems and games, Computational Mathematics and Mathematical Physics 45(11): 1893–1914.

Bos, D. (1986). Public Enterprise Economics, North-Holland, Amsterdam.

Bos, D. (1991). Privatization: A Theoretical Treatment, Clarendon Press, Oxford.

Breitmoser, Y. (2012). On the endogeneity of Cournot, Bertrand, and Stackelberg competition in supergames, International Journal of Industrial Organization 30(1): 16–29.

Clempner, J.B. and Poznyak, A.S. (2011). Convergence method, properties and computational complexity for Lyapunov games, International Journal of Applied Mathematics and Computer Science 21(2): 349–361, DOI: 10.2478/v10006-011-0026-x.

Clempner, J.B. and Poznyak, A.S. (2014). Simple computing of the customer lifetime value: A fixed local-optimal policy approach, Journal of Systems Science and Systems Engineering 23(4): 439–459.

De Fraja, G. and Delbono, F. (1990). Game theoretic models of mixed oligopoly, Journal of Economic Surveys 4(1): 1–17.

Harris, R. and Wiens, E. (1980). Government enterprise: An instrument for the internal regulation of industry, Canadian Journal of Economics 13(1): 125–132.

Karpowicz, M.P. (2012). Nash equilibrium design and price-based coordination in hierarchical systems, International Journal of Applied Mathematics and Computer Science 22(4): 951–969, DOI: 10.2478/v10006-012-0071-0.

Merril, W. and Schneider, N. (1966). Government firms in oligopoly industries: A short-run analysis, Quarterly Journal of Economics 80(5): 400–412.

Moya, S. and Poznyak, A.S. (2009). Extraproximal method application for a Stackelberg–Nash equilibrium calculation in static hierarchical games, IEEE Transactions on Systems, Man, and Cybernetics, B: Cybernetics 39(6): 1493–1504.

Nett, L. (1993). Mixed oligopoly with homogeneous goods, Annals of Public and Cooperative Economics 64(3): 367–393.

Nowak, P. and Romaniuk, M. (2013). A fuzzy approach to option pricing in a Levy process setting, International Journal of Applied Mathematics and Computer Science 23(3): 613–622, DOI: 10.2478/amcs-2013-0046.

Poznyak, A.S. (2009). Advanced Mathematical Tools for Automatic Control Engineers, Vol. 1: Deterministic Techniques, Elsevier, Amsterdam.

Poznyak, A.S., Najim, K. and Gomez-Ramirez, E. (2000). Self-learning Control of Finite Markov Chains, Marcel Dekker, Inc., New York, NY.

Tanaka, K. and Yokoyama, K. (1991). On ε-equilibrium point in a noncooperative n-person game, Journal of Mathematical Analysis and Applications 160(2): 413–423.

Trejo, K.K., Clempner, J.B. and Poznyak, A.S. (2015). A Stackelberg security game with random strategies based on the extraproximal theoretic approach, Engineering Applications of Artificial Intelligence 37: 145–153.

Vickers, J. and Yarrow, G. (1998). Privatization—An Economic Analysis, MIT Press, Cambridge, MA.

von Stackelberg, H. (1934). Marktform und Gleichgewicht, Springer, Vienna.

Kristal K. Trejo holds an M.Sc. from the Department of Automatic Control at the Center for Research and Advanced Studies (CINVESTAV-IPN), Mexico. She received her B.Sc. in communications and electronic engineering from the Autonomous University of Zacatecas, Mexico. She follows a line of investigation focused on game theory and optimization research.

Julio B. Clempner has more than ten years of experience in the field of management consulting. He holds a Ph.D. in computer science from the Center for Computing Research at the National Polytechnic Institute. He specializes in applications of high technology related to project management, analysis and design of software, development of software (ad-hoc and products), information technology strategic planning, evaluation of software, business process reengineering and IT government, for infusing advanced computing technologies into diverse lines of businesses. Dr. Clempner's research interests are focused on justifying and introducing the Lyapunov equilibrium point in game theory. He is a member of the Mexican National System of Researchers (SNI), and of several North American and European professional organizations. Dr. Julio Clempner is on the editorial board of the International Journal of Applied Mathematics and Computer Science.

Alexander Poznyak graduated from the Moscow Physical Technical Institute (MPhTI) in 1970. He earned his Ph.D. and Doctor's degrees from the Institute of Control Sciences of the Russian Academy of Sciences in 1978 and 1989, respectively. From 1973 up to 1993 he served this institute as a researcher and a leading researcher, and in 1993 he accepted the post of a full professor (3-E) at CINVESTAV of IPN in Mexico. Presently, he is the head of the Automatic Control Department. He was the supervisor of 35 Ph.D. theses (28 in Mexico). He has published more than 180 papers in various international journals and 10 books. He is a regular member of the Mexican Academy of Sciences and the System of National Investigators (SNI-3). He is a fellow of the IMA (Institute of Mathematics and Its Applications, Essex, UK) and an associate editor of the Oxford-IMA Journal on Mathematical Control and Information and Kybernetika, as well as of the Iberamerican International Journal on Computations and Systems. He was also an associate editor of the Conference on Decision and Control (CDC) and the American Control Conference (ACC), and a member of the editorial board of the IEEE Control Systems Society (IEEE CSS).

Received: 21 March 2014
Revised: 15 November 2014