Game Theory and its application in Artificial Intelligence

Jelena Grujić, AI Lab, VUB

Why is Google DeepMind hiring people who know game theory?

What is this going to be about?

• What is ?

• How do people make decisions?

• Why could AI beat Go before it could beat poker?

Prisoner’s dilemma

Game Theory

• Your payoff depends on unknown decisions of others

• John von Neumann, 1944

• John Nash, 1950 – Nash equilibrium

• John Maynard Smith, 1973 – Evolutionarily Stable State

Nash equilibrium
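A Nash equilibrium is a pair of choices where neither player can do better by changing only their own choice. The following is a minimal sketch for the Prisoner's Dilemma. The payoff values (5, 3, 1, 0) are the usual textbook ones and are an assumption here, since the slides do not give numbers; the function name is also illustrative.

```python
from itertools import product

# Assumed textbook payoffs (not from the slides); higher is better.
# PAYOFFS[(row_action, col_action)] = (row_payoff, col_payoff)
C, D = "C", "D"  # cooperate, defect
PAYOFFS = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}
ACTIONS = (C, D)


def pure_nash_equilibria(payoffs, actions):
    """Return all action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        # Row player compares (a, b) with every alternative row action.
        row_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in actions)
        # Column player compares (a, b) with every alternative column action.
        col_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in actions)
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ACTIONS))  # [('D', 'D')]
```

Mutual defection is the only equilibrium even though both players would earn more from mutual cooperation, which is what makes the game a dilemma.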

Evolutionarily stable state


The puzzle of cooperation

• Charles Darwin (1896): “If it could be proved that any part of the structure of any one species had been formed for the exclusive good of another species, it would annihilate my theory, for such could not have been produced through natural selection.”

• Robert May (2005): “The most important unanswered question in evolutionary biology, and more generally in the social sciences, is how cooperative behavior evolved and can be maintained in human or other animal groups and societies”

• Cooperation is costly – lower fitness

• Defectors more likely to evolve

• However, it is everywhere
  • Genomes
  • Cells
  • Multicellular organisms
  • Human and animal societies

Five rules for the evolution of cooperation

Nowak, Science (2006)

Axelrod tournament, 1980
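Axelrod's tournament pitted strategies for the repeated Prisoner's Dilemma against each other. The sketch below is a toy round-robin, not a reconstruction of the original tournament: the payoff values, the 200-round match length, and the four entrants are assumptions chosen for illustration (the strategy names echo those that appear in the experiment figures). Each strategy also plays a copy of itself.

```python
# Assumed payoffs (not from the slides): temptation 5, reward 3, punishment 1, sucker 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not theirs else theirs[-1]


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def grim_trigger(mine, theirs):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in theirs else "C"


def play(strategy_a, strategy_b, rounds=200):
    """Play one repeated game and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(strategies, rounds=200):
    """Every strategy meets every other strategy and a copy of itself."""
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, a in enumerate(names):
        for b in names[i:]:
            score_a, score_b = play(strategies[a], strategies[b], rounds)
            totals[a] += score_a
            if b != a:  # in self-play, count the score only once
                totals[b] += score_b
    return totals


entrants = {"tit_for_tat": tit_for_tat, "always_defect": always_defect,
            "always_cooperate": always_cooperate, "grim_trigger": grim_trigger}
print(round_robin(entrants))
# {'tit_for_tat': 1999, 'always_defect': 1608, 'always_cooperate': 1800, 'grim_trigger': 1999}
```

The ranking depends on who enters: always-defect wins every individual match it plays here, yet ends up last overall because it earns so little against the retaliating strategies and against itself.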

Tit-For-Tat

Experiment (BEEL.VUB.AC.BE)

The network

[Figure: Cooperation level, by strategy: ~All D, Grim Trigger, Generous TFT, ~All C, All C]

[Figure: Earnings, by strategy: ~All D, Grim Trigger, Generous TFT, ~All C, All C]

Brief history of AI beating humans

• Tic-Tac-Toe 19?? (Boter-kaas-en-eieren)

• Chess 1997

• Go March 2016

• Poker January 2017

Tic-Tac-Toe (~19000)

Minimax algorithm
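Tic-tac-toe is small enough to search completely: there are only 3^9 = 19,683 ways to fill nine cells with X, O, or blank, which is presumably where the figure of about 19000 comes from. A minimal minimax sketch follows; the board encoding (a 9-character string) and the function names are illustrative choices, not taken from the slides.

```python
from functools import lru_cache

# The eight winning lines, as index triples into a 9-character board string.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)  # each position is evaluated only once
def minimax(board, player):
    """Value of the position for X with `player` to move: +1 win, 0 draw, -1 loss."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, cell in enumerate(board) if cell == " "]
    # X picks the best outcome for X; O picks the worst outcome for X.
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Index of a move with the best minimax value for `player`."""
    other = "O" if player == "X" else "X"
    moves = [i for i, cell in enumerate(board) if cell == " "]

    def score(i):
        return minimax(board[:i] + player + board[i + 1:], other)

    return max(moves, key=score) if player == "X" else min(moves, key=score)


print(minimax(" " * 9, "X"))        # 0: perfect play from the empty board is a draw
print(best_move("XX OO    ", "X"))  # 2: X completes the top row
```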

• Combinatorial Game Theory

• John Horton Conway (Game of Life)

Chess

• 1980: AI beats really good players

• 1997: IBM's Deep Blue beats Kasparov

• 2002-2003: commercial computers

• 2005-2006: phones

• limited number of moves ahead (plies), 12 for Deep Blue

• evaluation function

• pruning

Go
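As a reference point for the comparison with Go, the chess recipe (search a limited number of plies, score the horizon with an evaluation function, prune lines that cannot matter) can be sketched as below. This is generic textbook alpha-beta search, not Deep Blue's implementation; the toy tree and the `children` and `evaluate` helpers are hypothetical stand-ins for a real move generator and a real evaluation function.

```python
import math


def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Depth-limited minimax with alpha-beta pruning.

    children(state) -> list of successor states (empty at the end of the game)
    evaluate(state) -> heuristic score from the maximizing player's viewpoint
    """
    successors = children(state)
    if depth == 0 or not successors:
        return evaluate(state)  # search horizon or terminal position
    if maximizing:
        value = -math.inf
        for child in successors:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:  # the opponent will never allow this line
                break
        return value
    value = math.inf
    for child in successors:
        value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                     True, children, evaluate))
        beta = min(beta, value)
        if alpha >= beta:
            break
    return value


# Hypothetical toy tree: inner lists are positions, numbers are scored leaves.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
evaluated = []  # records every position handed to the evaluation function


def children(state):
    return state if isinstance(state, list) else []


def evaluate(state):
    evaluated.append(state)
    if isinstance(state, list):  # cut off at the horizon: crude average guess
        return sum(state) / len(state)
    return state


evaluated.clear()
value = alphabeta(TREE, 2, -math.inf, math.inf, True, children, evaluate)
print(value, len(evaluated))  # 3 7: two of the nine leaves were pruned
```

Calling the same function with `depth=1` stops one ply earlier and relies on the crude averaging guess, which returns a different (and misleading) value for this tree. The quality of the evaluation function is what the whole approach depends on.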

• Branching factor: about 35 legal moves per position in chess, about 250 in Go

• 2×10^170 legal positions (atoms in the universe: 10^80)

• We can’t calculate the evaluation function

• Old tricks don’t work

• Proper AI

• Training your program
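The gap between the two branching factors compounds with every move searched. A rough back-of-the-envelope sketch: the branching factors (35 and 250) are the ones quoted above, while the typical game lengths (80 plies for chess, 150 for Go) are commonly quoted rough figures and are assumptions here.

```python
import math


def log10_tree_size(branching_factor, plies):
    """log10 of branching_factor ** plies, a rough count of move sequences."""
    return plies * math.log10(branching_factor)


# Whole-game estimates (game lengths are assumed, see the text above).
chess = log10_tree_size(35, 80)
go = log10_tree_size(250, 150)
print(f"chess ~ 10^{chess:.0f}, Go ~ 10^{go:.0f}")  # chess ~ 10^124, Go ~ 10^360

# The same 12-ply lookahead costs far more in Go than in chess.
print(f"12 plies: chess ~ 10^{log10_tree_size(35, 12):.1f}, "
      f"Go ~ 10^{log10_tree_size(250, 12):.1f}")  # 10^18.5 versus 10^28.8
```

Both whole-game estimates dwarf the 10^80 atoms mentioned above, but the Go figure does so by a much wider margin, which is why deeper brute-force search alone was not enough.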

AlphaGo

• March 2016

• Trained on expert games

• Playing against itself

• AlphaGo Zero, October 2017

Libratus

● No-limit Texas hold ’em

● 10^160 situations

● January 2017, $1,766,250

● What is different?

● Imperfect information

● We are not in one state, we are in a distribution of the states

● Enter game theory
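When a player cannot tell which state the game is in, the goal shifts from finding the best move to finding a good mixed strategy. One simple way to compute one is regret matching, the basic update behind counterfactual regret minimization, a standard approach for large poker games. The sketch below applies it to rock-paper-scissors in self-play. It illustrates the idea only and is not Libratus's actual algorithm; the function names and the biased starting point are assumptions made for the example.

```python
# Row player's payoff in rock-paper-scissors (rows and columns: rock, paper,
# scissors); the column player receives the negative.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]
N = 3


def strategy_from(regrets):
    """Play each action in proportion to its positive regret (uniform if none)."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1.0 / N] * N


def action_values(opponent, as_row):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    if as_row:
        return [sum(PAYOFF[a][b] * opponent[b] for b in range(N)) for a in range(N)]
    return [sum(-PAYOFF[a][b] * opponent[a] for a in range(N)) for b in range(N)]


def self_play(iterations=10000):
    """Return both players' average strategies after regret-matching self-play."""
    # Assumed starting point: the row player begins biased towards rock.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        strategies = [strategy_from(r) for r in regrets]
        values = [action_values(strategies[1], True),
                  action_values(strategies[0], False)]
        for p in range(2):
            expected = sum(s * v for s, v in zip(strategies[p], values[p]))
            for a in range(N):
                # Regret: how much better action a would have done than the mix.
                regrets[p][a] += values[p][a] - expected
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


row_avg, col_avg = self_play()
print([round(p, 2) for p in row_avg])  # each entry should be close to 1/3
```

The strategy played in any single iteration keeps cycling; it is the average over all iterations that approaches the equilibrium of playing each action one third of the time.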


Conclusion

• Prisoner’s Dilemma

• What strategies do humans use?

• For imperfect-information games, we need game theory

For the curious:

• Sandholm, Tuomas. "The state of solving large incomplete-information games, and application to poker." AI Magazine 31.4 (2010): 13-32.

• Game Over: Kasparov and the Machine

• AlphaGo - The Movie

DataBeers Brussels

Web site: http://databeers.brussels
Twitter: @DataBeersBru
Facebook: brudatabeers