Foundations of Management, Vol. 8 (2016), ISSN 2080-7279 DOI: 10.1515/fman-2016-0022 289

SEVERAL REMARKS ON THE ROLE OF CERTAIN POSITIONAL AND SOCIAL GAMES IN THE CREATION OF THE SELECTED STATISTICAL AND ECONOMIC APPLICATIONS Ewa DRABIK Warsaw University of Technology, Faculty of Management, Warsaw, Poland e-mail: [email protected]

Abstract: The was created on the basis of social as well as gambling games, such as , poker, baccarat, hex, or one-armed bandit. The aforementioned games lay solid foundations for analogous mathematical models (e.g., hex), artificial intelligence algorithms (hex), theoretical anal- ysis of computational complexity attributable to various numerical problems (baccarat), as well as illustration of several economic dilemmas  particularly in the case where the winner takes every- thing (e.g., noughts and crosses). A certain gambling games, such as a horse racing, may be successful- ly applied to verify a wide spectrum of market mechanism, for example, market effectiveness or customer behavior in light of incoming information regarding a specific product. One of a lot appli- cations of the slot machine (one-armed bandit) is asymptotically efficient allocation rule, which was as- signed by T.L. Lai and H. Robbins (1985). In the next years, the rule was developed by another and was named a multi-armed. The aim of the paper is to discuss these social games along with their potential mathematical models, which are governed by the rules predominantly applicable to the social and natural sciences. Key words: positional games, social games, gambling games, one-armed bandit, chess.

1 Introduction Melvin Dresher working at RAND in 1950), gender discrimination game, chicken game, or Colonel Blot- The humanity has been accompanied by gambling to game were created. and gambling games for ages, as they entertained This paper aims to address that social games people, provoked excitement, and gave an illusory and gambling played a key role in the development hope for the better life. According to a representative of the game theory, statistics, and their applications. number of scientists, the behavior of gamblers very often reflects the conduct of market players. In fact, gambling may be defined as a permanent pursuit 2 Some remarks about positional games of win, which in itself eventually becomes the prize. The overwhelming majority of dilemmas related The decision-making process under high-risk cir- to the (PI) games were defined cumstances pushes the arguments of rationality in the period from 1935 to 1941 and incorporated to the background. The fact in question has been into the so-called Scottish Book. The Scottish Book included in the modern finance theory, for example, referred to a notebook purchased by a wife of Stefan the prospect theory developed by Banach and used by mathematicians of the Lwów and . The attention was particularly School of Mathematics (such as Stanisław Mazur, drawn to the market players who  while making Stanisław Ulam, and Hugo Steinhaus) for jotting decisions under risk  were more prone to follow down mathematical problems meant to be solved. herd and suffer from panic attacks. Their behavior The Scottish Book used to be applied for almost six very often proves to be irrational. In the ensuing years. Many problems presented therein were created years, the problems revolving around the game theo- in previous years and not all of them were solved. ry were becoming more and more complex. In order After the World War II, Łucja Banach brought the to provide a reasonable solution as well as to de- Book to Wrocław, where it was handwritten by Hu- scribe a sequence of social and economic phenome- go Steinhaus and sent in 1956 to Los Alamos (USA) na, a variety of new games such as a prisoner’s to Stanisław Ulam. Ulam translated it into English, dilemma (originally framed by Merrill Flood and

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM 290 Ewa Drabik copied at his own expense, and dispatched to a varie- where ty of universities. The book in question proved p0 = a(∅), p1 = b(p0), p2 = a(p1), to enjoy such a great popularity that it was soon pub- p3 = b(p0, p2), p4 = a(p1, p3) lished and edited  mainly in English (Duda, 2010; Definition 2: Mauldin, 1981). Formally, we can write PI games as follows. A game A, B,φ is called determined if Let A be the set of strategies of player I and B be the inf sup φ(a, b)  v  sup inf φ(a, b) (1) bB bB set of strategies of player II. aA aA Remark: A game is determined if and only if the  : A  B   , game has a value. where A game is not determined if      , inf supφ(a,b)  v  supinf φ(a,b) (2) bB aA aA bB ( is the set of real numbers). Note. If the game is not determined, then the left- This game is played as follows: hand side of (1) is larger than the right-hand side Player I chooses a  A , and player II chooses b  B of (1). Both choices are made independently and without If the game has a value v and there exists an a0 such any knowledge about the choice of the other player. that (a0,b)  v for all b, then a0 is called an optimal Then player II pays to player I the value (a, b). for player I. If (a,b0)  v for all a, the b0 is If (a, b)<0 it means that player II gets from player I called an optimal strategy for player II. the value (a, b). We will say that P ω ,f is a win for player I or The following is the idea of an infinite game of PI (perfect information): a win for the player II if P ω , f has value 1 or 0,  let {0,1, 2,...} , respectively. If f : Pω   has the property that there exists an n such that f(p ,p ...) does not de-  there is a set P called the set of choices, 0 1 ω pend on the choice pi with i < n, then P ,f is  player I chooses p0 P , next player II chooses called a finite game. p1 P, then player I chooses p 2  P , and so on. The following theorem is true. There is a function f : Pω   such that the end Theorem 1: player II pays to player I the value f(p0, p1,…). Every finite game has a value. Definition 1: Proof (Mycielski, 1992). The triple A,B,φ is said to be the PI-game if there exists a set P such that A is set of all function: 3 The game of Hex and some infinity games   with perfect information A  a : Pn  P , where P0  { } ,    nω  One of the simplest finite PI-games is Hex. Hex is

 n  played as follows. B  b : P  P  0nω  We use a board with a ‘honeycomb’ with 122 cells, and we use 61 white and 61 black pieces (Fig. 1). and there exists a function

ω f : P   such that φ(a,b) = f(p0, p1, …),

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM Several Remarks on the Role of Certain Positional and Social Games in the Creation . . . 291

Figure 1. The board of the hex (source: Pijanowski, 1972)

The players alternatively put white or black stones A function φ : E  R is given, and a point pfirst  P is on the hexagon board. White begins and he/she wins fixed. The player I and player II pick alternately if the white stones connect the top of the board with p0  pfirst , q0 Q, p1  P, q1 Q , … such that the bottom. Black wins if the black stones connect (p ,q )  E and (q , p )  E , thereby defining the left edge with the right edge. i i i i1 a zigzag path composed of arrows. We define three Theorem 2: PI-games. (i) When the board is filled with stones, then one G : player II pays to player I the value of the players has won and the other has lost. 1 n1 (ii) White has a winning strategy. 1 lim sup (φ(pi ,qi )  φ(qi , pi1 )) (3) n 2n Proof (Mycielski, 1992). i0 Hex has a relative called Bridge_it. But for G2: player II pays to player I the value Bridge_it, a useful description of winning strategy 1 n1 lim inf (φ(p i , q i )  φ(q i , p i1 )) (4) for player I (white) has been used. However, this n  2n i0 does not seem to solve for the problem on hex. G3: the game ends as soon as a closed loop arises Some games of the some type are difficulty com- in the path defined by the players, that is, as soon pared in the sense that the problem of deciding as player I picks any p {p ,..,p } or player II if a position is a win for player I or for player II is n 0 n1 complete (see Theorem 1). picks any qn {q0 ,..,qn1}. The more general games are the infinity games with Then player II pays to player I the ‘loop average’ PI. v defined as follows. In the first case, pn = pm for some m < n and then Let G be a finite bipartite-oriented graph. In the other words, G is a system (P, Q, I) where P and Q are finite disjoint sets and E  ((P  Q) (Q  P)) and I be set of arrows. We assume, moreover, that for each (a, b)  E, there exists c such that (b, c)  E. (5)

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM 292 Ewa Drabik

Thus, in all three games, the players are competing for example, a card consisting of 2 and 3 is worth 5, to minimize or maximize the means for some num- but a hand consisting of 6 and 7 is worth 3 (i.e., the 3 bers they encounter on the arrows of the graph. being right digit in combined points total 13 or 6 + 7

As the game G3 is finite, by Theorem 1, it has value = 3 (mod 10)). Logically then, the highest possible v. The infinite games G1 and G2 are determined, hand value in baccarat is 9. The baccarat refers to because they have the same value as G3. So there is anything with a value zero, in a hand of K, 4 and 6. at least one example where infinite PI-games help us The object of the game is to have the higher total to analyze some finite PI-games. But, some infinite (closer 9) at the end of the play. One player is de- PI-games are not so simple. signed as the banker who deals four cards face down: Example (the chase): A dog tries to catch a hare two to himself and two to the player. They look at in an unbounded plane. The purpose of the dog is their cards; if either has an eight or nine, this is im- to minimize the time of the game and the purpose mediately announced and the hands are turned face of the hare is to maximize it. We assume that each up and compared. If neither hand is an eight nor dog is farther than the hare and that only the veloci- nine, the player has a choice to accept or refuse ties are bounded while acceleration are not. There a third card; if accepted it is dealt face up. are neat solution of those game: at each moment t, Traditional practice dictates that one always accept the hare should run full speed toward any point at a card if one’s hand total is between 0 and 4 and such that at is the distant from him among all points always refuse a card if one’s hand total is 6 or 7. that he can reach prior to the dog. The dog at each The player makes his/her decision either to accept instant t should run full speed toward that the same or to refuse another card. Once both the banker and point at. The hare does not have to change the point the player have made their decision, the hands are at during the game. turned face up and compared. If the player’s hand exceeds the banker’s hand, each wagering player 4 Casino baccarat chemin de fer receives back their wager and a matching amount as a bimatrix game from the bank, and the position of banker passes. If the banker’s hand exceeds the player’s hand, all The game of baccarat played a key role in the devel- wagers are forfeit and placed into the bank. opment of game theory. Bertrand’s 1889 analysis The banker position does not change. If there is a tie, of whether player should draw or stand on a two wagers remain as they are for the next hand. A win cards total of five was the starting point of Borel for players pays even money. Again, hands of equal investigation of strategic games. In 1924, Borel de- value result in a push. In the laws of baccarat, no one scribed Bertrand’s study. In 1928, J. von Neumann, code is accepted as authoritative, different clubs after providing theorem, remarked that he make their own rules. would analyze baccarat in a paper. But a solution The probability that a two-card hand has a value of 0 of the game would have to wait until the drawn is of the computer age. 2 2  4   1  25 Baccarat is a card game played in casinos. There are    9   13 13 169 three popular variants of the game: punto banco     (or “North American baccarat”), baccarat chemin de The probability that two-card hand has value fer, and baccarat banque. In baccarat chemin de fer, i  {1, 2 , , 9} is: both players can make choice: player and banker. 2 4 1 1 4  1  16 Each baccarat coup has three possible outcomes:     8   “player,” “banker,” or “tie.” Cards that have a point 13 13 13 13 13  169 value of 2–9 are worth face value (in points); 10s, Js, Player’s strategy is restricted. He/she must draw Qs, and Ks have no point (i.e., are worth zero). Aces on a two-card total of 4 or less and should on a two- are worth 1 point. Hands are valued according to the card total of 6 or 7. When his/her two-card total is 5, rightmost digit of the sum of their consistent cards: he/she is free to draw or stand as he/she chooses.

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM Several Remarks on the Role of Certain Positional and Social Games in the Creation . . . 293

The decision is usually made by the player with the to many disciplines, such as economy, biology, largest bet. Player has two pure strategies, stand and steering theory (compare the available literature on a two-card total of 5 or draw a two-card total of 5. [4] or [1]). Let X denote the value of players’ two-card hand, Unstated but implicit in the definition was require- and let Y denote the value of banker’s two-card ment that the players insert one or more coins into hand. Banker has a stand or draw decision in each slot (hence, the name “slot machine”) in order 88 strategic situations. Another word, the pair (X, Y) to activate the machine. Now, almost all slots ma- 88 is a typical bimatrix game, which has 2  2 strate- chines are electronic and controlled by microproces- gies (Ethier, 2010). Ethier described some algorithms sors. The microprocessor is programmed a random- to show some strategy in for bac- number generator that operates continually. carat bimatrix game (see Ethier, 2010). The moment the player pulls the handle or presses The algorithms based on baccarat game may be ful- the button to activate the machine, the most recently filled in several manners, for example, through tax generated random numbers determine the decrease, state pension increments, system of incen- almost instantly. Despite all this, the underlying tives such as discounts or guarantees conferred upon mathematics of the act of the modern slot machine companies for private and public investments, road much different from that of classical machine. construction, and community objects developments. The slot machines find a lot of applications, for in- In the absence of simplifying assumptions, finding stance, in statistics. One of them is asymptotically Nash equilibrium in the original game baccarat che- efficient allocation adaptive rule, which was as- min de fer requires sophisticated algorithms, featured signed by Lai and Robbins (1985). In the next years, by a high level of computational complexity. This the rule was developed by other scientists and it was algorithm demands an in-depth analysis of numerous named multi-armed bandit problem. case studies. Baccarat, similar to chess, laid founda- One of the strategies  the asymptotically efficient tions for the development of artificial intelligence. adaptive allocation rule  that is used in the realm of statistics may be described as follows. 5 Asymptotically efficient allocation rules and one-armed bandit Let wi (i = 1, 2, ...) denote statistical populations (treatments, manufacturing processes, etc.) specified One-armed bandit is a mechanical or an electrical by univariate (or another) density function f(w,i) machine, most frequently equipped with three drums with respect measure , where f(.,.) is known param- holding a variety of card faces, used for gambling eter and i is unknown parameter belonging to some purposes. This simple machine has been known since set . Assume that

1887, when Charles Fey devised an automatic mech-  anism to make the trade of offered products more  wf(w, θ) dv(w)   for all   (6) attractive. Since the machines proved to be very   successful, in the middle of the 20th century, the We sample w1, w2, … sequentially from the k popu- mechanical constructions were substituted by com- lations in order to achieve the greatest possible ex- puterized devices that were used in casinos. They pected value Sn = w1+..., wn as n, where n is used to entertain and divert the ladies whose hus- number of observations. The multi-armed bandits bands and fiancés devoted themselves to card games. problem derives from an imagined slot machine with First machines worked on a fully random basis. Cur- k  2 arms. When an arm is pulled, the player wins rently, because of the applicable software, the ran- a random reward. For each arm j, there is an un- domness is of partial nature. The participation known probability distribution of the reward. in such games and generation of shots attracted The player wants to choose depending in some way the attention of many scientists who predominantly on the reward of previous trials, so as to maximize aimed to elaborate on profit-maximizing strategies. the long-run total expected reward. The strategies concerned became attributable

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM 294 Ewa Drabik

An adaptive allocation rule  is a sequence of ran- t Let gnt :    (n=1,2,...; t=1,...,n) be the Borel dom variables 1, 2, ... taking values in the set {1, function such that for every , ..., k} and such that the  = 0 when player don’t play t (W1) (sample) or t = 1 when player play. The expected 1 P r  g (w ,..., w ) for all t  n 1  o(n ) reward is θ nt 1 t

 (W2) μ(θ)   wf(w; θ)dν(w) (7)  n   limlim sup Pθ g nt (w1,..., w t )  μ(λ)  ε /log n ε0    n t1  Another way, the expected reward is  1/I(θ/λ) k n k whenever () > (). ES  E w I   μ(θ)T (j) (8) n  i {i  j} i1  n j0 i 1 j0 (W3) where: gnt is nondecreasing in n  t for every fixed t = 1, 2, … ESn is conditional expected value of regret, t n Let ht(w1,...,wt) be the Borel function, h,t :    T (j)  I is the number of times that  sam- n  {φi  j} such that for every  i1 ples to stage n. The problem of maximizing ESn is, (W4) therefore, equivalent to that of minimizing to regret ht  gnt for all  * R n (θθ  nμ  ES n  (W5) * (9) (μ  μ(θ ))ET (j)    j n 1 j:μ(θ )μ * Pθ  max h t (w 1 ,..., w t )  μ(θ)  ε  o (n ) j  t  where: for every  > 0. * * μ  max μ θ 0 , μ θ1  μ(θ ) Condition W5 can be satisfied for the average: for some ht(w1,…, wt) = (w1 + . . . + wt)/t if θ*  θ 0 , θ 1 2 Eθwi   (i  1,..., t) Let I(,) denote the Kullback–Leibler number and  f(w;θ) / I(θ(λ)  log  f(w;θ)dv(w) (10) gw...,w w σ2a for n  t  f(w;λ)  where: such that 0 < I(,) <  whenever () > ().  is the standard deviation,

Generally, regret is ant (n=1,2,...; t=1,...,n) is positive constant such that for all t, a is nondecreasing for all n  t,   nt  * *  R n (θ)   (μ  μ(θ j )) / I(θ j ,θ )logn , and there exists   0 such that j:μ(θ )μ*  j  1/2 logn ε(logn) a   for all t  n n (11) nt t t1/2 The rule : Remark: If 2 Let w1, w2, … be a sequence of random variables  (w  θ)  f(w;θ)  (2π σ2 )1/2 exp  , whose distribution is parameterized by unknown  2   2σ  parameter . then (θ  λ)2 I(θ(λ)  2σ2

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM Several Remarks on the Role of Certain Positional and Social Games in the Creation . . . 295

Let w,..., w be the successive rewards obtained It is worth mentioning one of the most fascinating 1 Tn (j) from arm j up to stage n. The upper confidence personalities of sports and science, Robert James Fischer who was famous for his exceptionally talent- bound Un(j) and the point estimate n(j), the mean ed and rebellious mind. He played hundreds of out- reward under arm j, are given by standing games, implemented many innovative μ (j)  h (w ,..., w ) n Tn ( j ) 1 Tn ( j ) solutions, and introduced the so-called Fischer clock enabling to keep track of the total time each player U n (j)  gn,T ( j ) (w1,..., wT ( j) ) . n n takes for his or her own moves. Owing to some per- Define the “leader” at stage n as the population with sonal reasons, he was not able to play the game with the largest estimated mean among all population. the computer. Just wonder who would have won The following rule  decides which arms to play. in such a competition. In the first stage, we observe w1, w2, …, wn and In the light of the game theory, it should be empha- compute Un(j) and n(j). Next we use one of the fol- sized that because of its limited range of strategies, lowing decisions: the chess game is indeterminate, of which fact the 1) If arm j is already one of m-leaders choose to vast majority of chess players remain unaware. play the m-leaders, 2) If arm j is not among the m-leaders and Un(j) is 7 Summary and remarks less than n(k) for every m-leader k, then again the m-leaders play. It is worth noting that certain gambling games, such 3) If arm j is not among the m-leaders and n  Un as horse racing, may be successfully applied to veri- of least best of m-leaders, then play the (m − 1)  fy a wide spectrum of market mechanisms, for ex- best of m-leaders and arm j. ample, market effectiveness, or consumer behavior Note in any case, the (m − 1) of m-leaders always in light of incoming information regarding a specific get played. product. Theorem 3: The methods of reasoning based on the game theory The rule  jest is asymptotically efficient. are used for examining the equilibrium states in a diversity of economic phenomena. In this case, Proof (Ananatharam V., Varaiya P., and Warland J., the suboptimal equilibrium (i.e., the best state for all 1987). individual players) is replaced by the general equilib- rium. 6 Some remarks about chess The situation concerned occurs whenever particular The most common “finite” positional game with PI players abstain from the available, yet globally unde- is chess, which laid foundations for artificial intelli- sirable, ways of conduct (e.g., through subsidizing gence algorithms applied in various domains, includ- the investments that for particular players seem un- ing the construction of dynamic equilibrium models justifiable and unreasonable or consumption that is as well as the description of economic systems lack- not necessary for the attainment of global equilibri- ing the equilibrium. um corresponding to the notion of Nash equilibri- um). The artificial intelligence is more and more often applied in energetic  to create systems not only Certainly, since Keynes’s times the developers of monitoring the course of specific processes but also economic politics have attempted to move the eco- involved in planning and decision-making proce- nomic equilibrium towards the production increase dures. It is used also for the purposes of image pro- and, analogous, towards the growth of demand de- cessing, for example, in cameras, supporting termining the shape of manufacturing system that financial decisions, as well as many other domains would be maintainable without an unreasonable ac- of everyday life. cumulation of potential reserves.

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM 296 Ewa Drabik

8 References [4] Lai, T.L., Robbins, H., 1985. Asymptotically Efficient Adaptive Allocation Rules. Advanced [1] Ananatharam, V., Varaiya, P., Warland, J., 1987. in Applied Mathematics, Vol. 6, pp.4-22. Asymptotically Efficient Allocation Rules for [5] Mauldin, R.D., 1981. The Scottish Book. Math- the Multiarmed Bandit Problem with Multiple ematics from the Scottish Café. Boston – Basel – Plays – Part I: I.I.D. Rewards. IEEE Transaction Stuttgart: Birkhausen. of Automatic Control, Vol. Ac–32, No. 11, [6] Mycielski, J., 1992. Games with Perfect Infor- pp.968-976. mation. In: R.J. Aumann, S. Hart (eds.). Hand- [2] Duda, R., 2010. Lwow School of Mathematics. book of Game theory with Economic Wroclaw: Wroclaw University Publishing Application, Vol. 1, North – Holland, Amster- House. dam, pp.20-40. [3] Ethier, S.N., 2010. The Doctrine of Chances: [7] Pijanowski, L., 1972. Przewodnik gier (Game Probabilistic Aspects of Gambling. Berlin – Guide). Warszawa: Iskry. Heidelberg: Springer Verlag.

Brought to you by | Politechnika Warszawska - Warsaw University of Technology Authenticated Pobrano z http://repo.pw.edu.pl / Downloaded from Repository of Warsaw UniversityDownload Dateof Technology | 1/13/17 12:21 2021-09-30 PM