Thesis for the Degree of Doctor

DEEP LEARNING APPLIED TO GO-GAME: A SURVEY OF APPLICATIONS AND EXPERIMENTS

바둑에 적용된 깊은 학습:

응용 및 실험에 대한 조사

June 2017

Department of Digital Media

Graduate School of Soongsil University

Hoang Huu Duc


Thesis for the Degree of Doctor

DEEP LEARNING APPLIED TO GO-GAME: A SURVEY OF APPLICATIONS AND EXPERIMENTS

Thesis supervisor: Jung Keechul

Thesis submitted in partial fulfillment of the requirements for the Degree of Doctor

June 2017

Department of Digital Media

Graduate School of Soongsil University

Hoang Huu Duc

To approve the submitted thesis for the

Degree of Doctor by Hoang Huu Duc

Thesis Committee

Chair KYEOUNGSU OH (signature)

Member LIM YOUNG HWAN (signature)

Member KWANGJIN HONG (signature)

Member KIRAK KIM (signature)

Member KEECHUL JUNG (signature)

June 2017

Graduate School of Soongsil University

ACKNOWLEDGEMENT

First of all, I would like to express my deep gratitude and special thanks to my advisor, Professor Jung Keechul. My advisor commented on my work and guided me toward the most suitable choices in my research direction. I am grateful for his encouragement of my research and for his support in resolving the difficulties I faced throughout my study. His advice was always useful in finding the best way forward in my research, and it is priceless to me.

Secondly, I would like to thank Prof. Lim Young Hwan, the dean of our project. From my first arrival in Korea until the end of my Ph.D. course, he supported, advised, and encouraged me through most of the difficult situations in my studies in Korea.

I am also profoundly grateful to my committee members: Professor Kyeoungsu Oh (chairman), Prof. Lim Young Hwan, Prof. Jung Keechul, Dr. Kwangjin Hong, and Dr. Kirak Kim for serving on my committee. Thank you for your insightful comments and suggestions. I would especially like to thank the HCI lab members; many of you supported me in study and research throughout my Ph.D. course.

Besides, I deeply thank my wife, Nguyen Thi Quynh Trang, and my extended family for all of the sacrifices and support you made to help me while I had to live far away and alone for my Ph.D. course. I would also like to thank all of my friends who encouraged me to complete my program.

TABLE OF CONTENTS

ABSTRACT IN ENGLISH ······························································· ix

ABSTRACT IN KOREAN ······························································· xii

CHAPTER 1: INTRODUCTION ...... 1

1.1 BACKGROUND ...... 1
1.1.1 Machine Learning ...... 2
1.1.2 Deep learning ...... 3
1.1.3 Convolutional Neural Networks ...... 5
1.1.4 Go-game ...... 5
1.2 RESEARCH MOTIVATION ...... 7
1.3 PERSPECTIVE AND OVERVIEW ...... 8
1.3.1 Machine Learning development ...... 8
1.3.2 Deep learning history ...... 9
1.3.3 Contemporary machine learning ...... 17
1.4 CONTRIBUTIONS ...... 17

CHAPTER 2: GO-GAME AND ITS COMPLEXITY ...... 19

2.1 CHALLENGE ...... 19
2.2 APPLYING DEEP LEARNING INTO GO-GAME ...... 20
2.3 GAME’S RULE SETS, POSSIBLE MOVES, AND LEGAL MOVES ...... 23
2.3.1 Go-game’s rule sets ...... 23
2.3.1.1 Making sure that the game will come to an end-state ...... 23
2.3.1.2 Deciding the winner of the game ...... 23
2.3.1.3 Determining whether a group of stones is dead or alive at endgame states ...... 24
2.3.1.4 Board sizes ...... 24
2.3.1.5 The Stones ...... 25
2.3.1.6 Playing a game ...... 26
2.3.1.7 Territory ...... 28
2.3.1.8 Ko ...... 28
2.3.1.9 Superko ...... 28
2.3.1.10 Eye ...... 29
2.3.1.11 Shoulder Hit ...... 29
2.3.1.12 Chain ...... 29
2.3.1.13 Seki (mutual life) ...... 30

2.3.1.14 Suicide ...... 31
2.3.1.15 Komi ...... 31
2.3.1.16 Kosumi ...... 32
2.3.1.17 Handicap ...... 32
2.3.1.18 Goal ...... 32
2.3.1.19 Score ...... 33
2.3.1.20 Endgame state ...... 34
2.3.2 Possible moves ...... 34
2.3.3 Legal moves ...... 35
2.4 THE COMPLEXITY OF MOVES AND STRATEGIES ...... 37
2.4.1 State-space complexity ...... 37
2.4.2 Game tree size ...... 37
2.4.3 Ply ...... 39
2.4.4 The complexity of Go-game moves ...... 40
2.4.5 Comparison between Go-game and ...... 41
2.4.6 The complexity of Go-game’s strategies ...... 45
2.4.7 Basic strategic ...... 45
2.4.8 Opening strategy ...... 47
2.4.9 Middle phase and endgame ...... 48
2.5 HOT TREND IN GO-GAME RESEARCHING AND THE CHALLENGE ...... 49
2.6 THE REASON PEOPLE MUST CONCERN IN GO-GAME RESEARCH ...... 51
2.7 UNDERSTANDING ABOUT THE HUGE COMPLEXITY OF GO-GAME ...... 52
2.8 UNDERSTANDING HUGE NECESSARY RESOURCES ...... 53
2.9 PROVIDE A FULLY OBSERVATION ...... 54
2.10 SHOWING EXPERIMENTS IN “DEEP LEARNING APPLIED TO GO-GAME” ...... 54

CHAPTER 3: DEEP LEARNING APPLIED TO GO-GAME ...... 55

3.1 WHY APPLYING DEEP LEARNING INTO GAME BUILDING ...... 55
3.2 TYPICAL EVALUATION FUNCTION ...... 56
3.3 TYPICAL DEEP LEARNING ALGORITHM ...... 58
3.4 GO-GAME RESEARCHES AND PROJECTS ...... 59
3.4.1 Orego ...... 61
3.4.1.1 How It Works ...... 62
3.4.1.2 Related work and publications ...... 63
3.4.2 Fuego ...... 64
3.4.3 GNU Go ...... 65
3.5 GO-GAME SOFTWARE ...... 66
3.6 GOGUI PROJECT – PLATFORM TO PLAY GO-GAME WITHOUT SUGGESTIONS ...... 70
3.7 FUEGO PROJECT (SUGGESTING MOVE ENGINE) ...... 71
3.8 ALPHAGO PROJECT ...... 72
3.9 GOGUI EXPERIMENT ...... 74
3.10 FUEGO ENGINEERING ATTACHING TO GOGUI PROGRAM ...... 77
3.11 KGS GO SERVER ...... 80

CHAPTER 4: DEEP LEARNING BASED GO-GAME ...... 84

4.1 INTRODUCTION ...... 84
4.2 ARCHITECTURE OF CONVOLUTIONAL NEURAL NETWORK ...... 85
4.3 TRAINING PROCEDURE ...... 89
4.4 TRAINING DATA ...... 90
4.5 SOLVING WORK ...... 96

CHAPTER 5: EXPERIMENTS ...... 99

5.1 EXPERIMENTS FOLLOWING HUUDUCGO’S SUGGESTIONS ...... 99
5.2 EXPERIMENTS FOLLOWING OREGO’S SUGGESTIONS ...... 108
5.3 EXPERIMENTS FOLLOWING FUEGO’S SUGGESTIONS ...... 113

CHAPTER 6: DISCUSSIONS ...... 118

REFERENCES ··········································································· 120

LIST OF TABLES

[Table 2-1] Number of legal moves on common board sizes (rounded) ...... 36

[Table 2-2] Complexity of some well-known games ...... 40

LIST OF FIGURES

[Figure 2-1] The model for tic-tac-toe game-tree ...... 21

[Figure 2-2] A 19x19 board size ...... 26

[Figure 2-3] Illegal moves in Ko rule...... 29

[Figure 2-4] Keeping alive using seki rule ...... 30

[Figure 2-5] End-game status: Score calculation includes related factors ...... 34

[Figure 2-6] Game-tree model of Go-game ...... 39

[Figure 3-1] UML class diagram of the basic structure of Orego ...... 63

[Figure 3-2] FUEGO Main Application Documentation ...... 64

[Figure 3-3] FUEGO Go-game using GTP commands ...... 65

[Figure 3-4] Explain details of each command in FUEGO project ...... 65

[Figure 3-5] Playing Go-game at “Play OK” website ...... 70

[Figure 3-6] Both types of networks used in AlphaGo ...... 74

[Figure 3-7] GoGui interface without attached engines ...... 75

[Figure 3-8] Selecting the board size for expected game ...... 75

[Figure 3-9] While playing, gamers must comply with the game’s rules ...... 76

[Figure 3-10] Rule sets are implemented in GoGui...... 77

[Figure 3-11] Fuego has resigned to an online player ...... 79

[Figure 3-12] Calculating after Fuego has resigned ...... 79

[Figure 3-13] CGoban3 interface ...... 80

[Figure 3-14] Log in interface of CGoban3 ...... 81

[Figure 3-15] Selecting a room to play in CGoban3 software ...... 83

[Figure 4-1] One layer of the CNN in the pooling stage...... 86

[Figure 4-2] A kernel operates on one pixel ...... 87

[Figure 4-3] Some samples of effects by convolving kernel matrices...... 89

[Figure 4-4] Go-game CNN structure ...... 90

[Figure 4-5] Layer C1 with 4 boards of 18x18 size from original board-state ...... 90

[Figure 4-6] Coding Go-game moves in a text file ...... 91

[Figure 4-7] Input data files in sgf format saved as text files...... 91

[Figure 4-8] A typical board-state file in SGF...... 92

[Figure 4-9] Transfer procedures between: sgf code, board-code, matrix index .... 92

[Figure 4-10] The suggesting Go-game moves with weights ...... 93

[Figure 4-11] Moves coding and showing on game board...... 94

[Figure 4-12] Transfer SGF code onto game-board position ...... 95

[Figure 4-13] SGF code B[jj] transfer to game-board ...... 95

[Figure 4-14] Board-state file with moves coding in sgf format ...... 96

[Figure 4-15] Different accuracies of suggestion on different data sets ...... 97

[Figure 4-16] The suggestion with weights at third move...... 97

[Figure 4-17] The suggestions after two moves have been made...... 98

[Figure 5-1] First move for black player of Orego and Fuego...... 100

[Figure 5-2] HuuDucGo’s suggestion with 10 best choices for first move...... 101

[Figure 5-3] 2nd move of white player suggested by Orego and Fuego...... 101

[Figure 5-4] 2nd move: HuuDucGo’s suggestion includes two others ...... 102

[Figure 5-5] HuuDucGo's suggestion for 3rd move...... 103

[Figure 5-6] 3rd move: Orego converges at 1st choice, Fuego at 3rd choice...... 103

[Figure 5-7] HuuDucGo's suggestion for 4th move...... 104

[Figure 5-8] 4th move: Orego and Fuego also converge at 2nd choice...... 104

[Figure 5-9] 4th move: 1st and 2nd choices have approximately equal high weights ...... 105

[Figure 5-10] Fifth suggestions at a given board of Orego and Fuego...... 106

[Figure 5-11] The fifth suggestion of HuuDucGo includes Fuego and Orego...... 106

[Figure 5-12] The 11th move suggestions of Orego and Fuego...... 107

[Figure 5-13] HuuDucGo's suggestion at 11th move in a given board-state ...... 107

[Figure 5-14] Fuego converges with Orego at Q4...... 108

[Figure 5-15] HuuDucGo also converges to Orego at Q4 by the 3rd choice...... 109

[Figure 5-16] 2nd move: Suggestion of Orego and Fuego...... 109

[Figure 5-17] HuuDucGo's suggestion converges at the 2nd choice ...... 110

[Figure 5-18] Fuego converges with Orego at 3rd move...... 111

[Figure 5-19] 3rd move: HuuDucGo converges with Orego's at 6th choice...... 111

[Figure 5-20] 4th move: Fuego converges with Orego...... 112

[Figure 5-21] HuuDucGo's suggestion converges with Fuego at 4th move...... 112

[Figure 5-22] First move of Orego converges with Fuego...... 113

[Figure 5-23] HuuDucGo converges with Fuego at the 3rd choice...... 114

[Figure 5-24] 2nd move: Orego suggests at D16 while Fuego at Q16...... 114

[Figure 5-25] 2nd move: HuuDucGo converges with Fuego...... 115

[Figure 5-26] 3rd move: following Fuego, Orego converges at C4 position ...... 115

[Figure 5-27] 3rd move: HuuDucGo converges with Fuego by 7th choice...... 116

[Figure 5-28] Orego converges with Fuego at 4th move...... 116

[Figure 5-29] HuuDucGo converges at the 4th move by the 3rd choice ...... 117

ABSTRACT

Deep Learning Applied To Go-Game: A Survey of Applications and Experiments

HOANG, HUU DUC
Department of Digital Media
Graduate School of Soongsil University

Go-game was invented more than four thousand years ago and has been widely used to teach people to think intelligently. Many kings played it and taught it to their sons to sharpen their minds.

In the 1980s, personal computers were created and became popular. Many games were programmed to let people play against the computer or to support users in playing a traditional game. Many famous traditional games played by our ancestors, such as Shogi, Chess, Xiangqi, Go-game, and tic-tac-toe, remain with us, and most of them have been implemented as computer programs; many of the easier games have been programmed successfully, and the more complex ones are still being improved day by day. In particular, some games with a very high level of complexity, which demand intelligent thinking to play with their strategies and rules, such as Shogi, Chess, and Go-game, are very hard to program so that the computer plays as intelligently as a human.

Machine learning has emerged as the solution for letting a computer think and learn from data in a way similar to the human brain. It uses algorithms to build analytical models, helping computers “learn” from given data for a particular purpose. It can now be applied to huge quantities of data to create exciting new applications such as driverless cars and robotic devices. Deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks. It is known as a new approach that improves the learning process to obtain better-trained data sets and therefore more accurate results.

My thesis concentrates on applying deep learning to Go-game. I present a survey on “deep learning applied to Go-game”. I also introduce some open-source projects about Go-game and show how deep learning techniques can be applied to a concrete problem, namely calculating the next move in Go-game with higher accuracy and effectiveness. I worked with my advisor, Professor Jung Keechul, to build a program that suggests the best next move in Go-game. In this method, we use five hidden layers, including three CNN layers, for training the data.

The contributions of my thesis are to show approaches for applying deep learning to HCI with higher performance and effectiveness. Firstly, I have provided a complete overview of deep learning and of how to apply it, via a survey of deep learning applied to Go-game. Secondly, I have shown the huge resources that Go-game research and projects require. Thirdly, I have shown experiments in applying deep learning to Go-game and described our work in this field. Finally, I have compared three programs: my program (HuuDucGo), Orego, and Fuego. The convergence of their suggestions shows that HuuDucGo's suggestions are acceptable. The particular experiments of our work apply deep learning to calculating and suggesting the best next moves in playing Go-game, with acceptable results that converge with the suggestions of the two other large programs, Orego and Fuego.

국문초록

바둑에 적용된 깊은 학습:

응용 및 실험에 대한 조사

황후덕

디지털 미디어학과

숭실대학교 대학원

4천년 전에 Go-game(바둑)이 발명되어 지능을 가르치기 위해 널리 사용되었습니다. 많은 왕이 바둑을 두었고, 지능을 향상시키기 위해 아들에게 가르쳤습니다.

1980년대에 퍼스널 컴퓨터가 만들어지고 인기를 얻었습니다. 많은 게임이 사람들이 컴퓨터와 대국하거나 기존 게임을 플레이할 때 사용자를 지원할 수 있도록 프로그래밍되었습니다. Shogi, Chess, Xiangqi, Go-game, tic-tac-toe 등과 같이 우리 조상이 즐기던 많은 유명한 전통 게임이 지금까지 남아 컴퓨터 프로그램으로 구현되었고, 비교적 쉬운 게임들은 성공적으로 프로그래밍되었습니다. 복잡한 게임들은 점진적으로 개선되도록 지금도 계속 개발되고 있습니다. 특히 장기, 체스, 바둑과 같이 전략과 규칙의 복잡성이 매우 높아 지능적인 사고가 필요한 게임은, 인간만큼 지능적으로 플레이하도록 프로그램하기가 매우 어렵습니다.

기계 학습은 컴퓨터가 인간의 두뇌와 유사하게 데이터로부터 생각하고 학습하게 하는 솔루션으로 부상했습니다. 알고리즘을 사용하여 분석 모델을 작성함으로써 컴퓨터가 특정 목적을 위해 주어진 데이터에서 “학습”하도록 돕습니다. 이제는 무인 자동차, 로봇 장치와 같은 흥미진진한 새로운 응용 프로그램을 만들기 위해 방대한 양의 데이터에 적용할 수 있습니다. 딥 러닝(Deep Learning)은 인공 신경망(artificial neural networks)이라고 불리는, 뇌의 구조와 기능에서 영감을 받은 알고리즘과 관련된 기계 학습의 하위 분야입니다. 이는 더 잘 훈련된 데이터 세트를 얻도록 학습 과정을 향상시켜 더 정확한 결과를 가져오는 새로운 접근법으로 알려져 있습니다.

내 논문은 Go-game에 딥 러닝을 적용하는 데 중점을 둡니다. 나는 “Go-game에 적용된 딥 러닝”에 관한 조사를 수행합니다. 또한 Go-game에 대한 오픈 소스 프로젝트와, 더 높은 정확성과 효과로 다음 수를 계산하는 문제에 딥 러닝 기술을 적용하는 방법을 소개합니다. 저는 지도 교수인 Jung Keechul 교수와 함께 Go-game의 다음 최선 수를 제안하는 프로그램을 만들었습니다. 이 방법에서는 3개의 CNN 레이어를 포함한 5개의 은닉 레이어를 사용하여 데이터를 학습합니다.

저의 논문의 기여는 더 높은 성능과 효과로 HCI에 딥 러닝을 적용하는 접근법을 보여주는 것입니다. 첫째, “Go-game에 적용된 딥 러닝”이라는 조사를 통해 딥 러닝과 그 적용 방법에 대한 완전한 관찰을 제공했습니다. 둘째, Go-game 연구 및 프로젝트에 필요한 거대한 자원을 보여주었습니다. 셋째, Go-game에 적용된 실험과 이 분야에서의 우리의 작업을 보여주었습니다. 마지막으로 내 프로그램(HuuDucGo), Orego, Fuego의 세 가지 프로그램을 비교했습니다. 이들 사이의 제안의 수렴은 HuuDucGo의 제안이 받아들일 만하다는 것을 보여줍니다. 우리 작업의 구체적인 실험은, Orego와 Fuego라는 두 개의 큰 프로그램의 제안과 수렴하는 수용 가능한 결과로, Go-game을 둘 때 다음 최선 수를 계산하고 제안하는 데 딥 러닝을 적용하는 것입니다.

CHAPTER 1: INTRODUCTION

1.1 Background

Recently, deep learning has become a very hot topic in the IT research field; at its core, deep learning is a machine learning technique. In the last five years it has emerged as the most effective technique for training the data-driven components of AI systems, and in that short time it has changed the IT world profoundly, especially in the processing of big data.

Nowadays, deep learning technology is helping the world advance at unprecedented speed. Over the last six years, the world has seen tremendous strides in the quality and apparent intelligence of the technology products we use every day.

First of all, voice recognition technology has improved greatly compared to before, and users can now use voice commands to interact far more with smart devices.

Then there is image recognition technology, a feature now widely available in consumer products. You can search and organize your photos without tagging them, based only on what is in the picture, whether a cat, a snowfall, or lush scenery. Many of these products can even read out and describe the elements of a photograph for visually impaired users.

What many people do not realize is that all of these technologies essentially derive from the same source. They are developed from “deep learning”, a special branch of AI. Many scientists still prefer to call it by its original name: deep neural networks.

In fact, no engineer directly programs the computer to perform the features mentioned above. Instead, they create an algorithm that lets the computer learn by itself and then expose it to terabytes of relevant data, such as hundreds of thousands of flower pictures or voice recordings. This constant exposure gradually “trains” the computer until it can identify the required images and voices on its own, much as a child learns about the world after a long time observing it.

Deep neural networks are not a new concept; they have been around since the 1950s, and many algorithmic breakthroughs took place in the 1980s and 1990s. The reason they have only recently come so far is that scientists can finally take advantage of vast computing power combined with the enormous amounts of data (video, audio, and text files) on the Internet, the deciding factors that allow neural networks to work effectively.

The most attractive part of researching deep learning is that we must rethink existing concepts creatively. If they are applied creatively, we can obtain much higher performance and better results when processing big data.

1.1.1 Machine Learning

Machine learning is an area of research and application that emerged in the early 1980s, starting as a technical field within computer science that drew on cognitive science and human-factors engineering. It has grown significantly and steadily for more than thirty years, and it interests professionals from many other disciplines, incorporating diverse knowledge and approaches. To a considerable extent, machine learning now accumulates a collection of semi-autonomous areas of research and practice within state-of-the-art information science. Furthermore, the continuing synthesis of disparate perspectives and approaches in machine learning has become a dramatic example of how different epistemologies and paradigms can be accommodated and integrated in a vibrant and productive intellectual project.

Machine learning research covers the design and use of computer technology, with attention to the interaction between humans and computers. Researchers in the field not only observe the ways humans interact with computers but also design technologies that let humans interact with computers fluently. The biggest challenge is getting AI machines to work while understanding the environment around them. The development of artificial intelligence also opens a new and complex threat: the morality of the intelligent machine. If we build an intelligent machine that can think and decide for itself, many unforeseen problems can arise.

1.1.2 Deep learning

Deep learning is a subfield of machine learning concerned with algorithms inspired by the organization of the brain, known as artificial neural networks. Deep learning is a type of machine perception that uses multiple layers during training.

Objects are trained by algorithms to recognize things using supervised learning, and to cluster objects using unsupervised learning. The difference between supervised and unsupervised learning is whether or not you have a labeled training set to work with. Deep learning, working with other algorithms, can therefore help you classify, cluster, and predict; it does so by automatically learning to read the signals, or structure, in data. When deep learning algorithms train, they make guesses about the data, measure the error of those guesses against the training set, and then correct the way they make guesses in order to become more accurate. Furthermore, we also want computers to model the surrounding world well enough to exhibit intelligence. To achieve this goal, a large quantity of knowledge about our world must somehow be stored, explicitly or implicitly, in the computers. Because it seems daunting to formalize all of that knowledge manually in a form computers can use to answer questions and generalize to new contexts, much of the field has turned to learning algorithms to capture a large fraction of that knowledge. Much research has been done to create and improve learning algorithms, but artificial intelligence remains a challenge.
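
To illustrate this guess-measure-correct cycle concretely, the following minimal sketch (not the thesis code; the toy data, learning rate, and model are illustrative assumptions) trains a tiny logistic-regression "network" by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labeled data: 200 points in 2-D, label 1 if their coordinates sum > 0.
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

w, b, lr = np.zeros(2), 0.0, 0.5              # initial guess of the parameters

for step in range(100):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))    # guess: predicted probabilities
    error = p - y                             # measure the error against labels
    w -= lr * (X.T @ error) / len(X)          # correct the guesses (gradient step)
    b -= lr * error.mean()

accuracy = ((p > 0.5) == y).mean()
print(f"training accuracy after 100 steps: {accuracy:.2f}")
```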

The only limitation that exists in machine learning and deep learning is our imagination.

1.1.3 Convolutional Neural Networks

In machine learning, a convolutional neural network (CNN) is a type of feed-forward network whose connectivity pattern between neurons is inspired by the structure of the animal visual cortex. Individual cortical neurons respond to stimuli in a limited region of space known as the receptive field. The receptive fields of different neurons partially overlap so that together they tile the visual field. The response of an individual neuron to stimuli within its receptive field can be approximated mathematically by a convolution operation. Convolutional neural networks were inspired by biological processes and are variations of multilayer perceptrons designed to require minimal preprocessing. They have wide application in video and image recognition, recommender systems, and natural language processing.
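
To make the convolution operation concrete, the following small numpy sketch (illustrative only, not the thesis implementation; the kernel values are an assumption) slides a 3x3 kernel over a 19x19 board plane, exactly the receptive-field idea described above:

```python
import numpy as np

def conv2d(plane, kernel):
    """Valid 2-D convolution of a single board plane with a small kernel."""
    kh, kw = kernel.shape
    h, w = plane.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # each output cell is the weighted sum over the kernel's receptive field
            out[i, j] = np.sum(plane[i:i + kh, j:j + kw] * kernel)
    return out

board = np.zeros((19, 19))          # 1 where a black stone sits, 0 elsewhere
board[3, 3] = board[3, 4] = board[4, 3] = 1.0
kernel = np.ones((3, 3)) / 9.0      # a simple averaging kernel (illustrative)
feature_map = conv2d(board, kernel) # 17x17 map of local stone density
print(feature_map.shape, feature_map.max())
```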

1.1.4 Go-game

For the “game of Go”, or Go-game, it can be said that the first and most important reason for its extraordinary vitality is that its rules are extremely, almost amazingly, simple. Go is a board game and an intellectual sport for two players.

From the very beginning, Go-game was appreciated for its emphasis on methodology, and it has a very long history. There are many legends about its origins, including a theory widely accepted by many people that it dates from the beginning of the Yiran era (the age of a famous Chinese king).

Up to the present day, despite being over 4,000 years old, Go is not aging or falling apart but is more alive than ever: 68 countries around the world have jointly established an international Go association, and the game has attracted millions of fans.

After nearly 100 years in which Chinese players neglected their own creation in favor of Chess, China is now trying to recover its standing. Japan rose to the top of the Go world and held the leading position for many years.

Some of today's Go instructors think that the Go board resembles the universe, formed by 360 celestial bodies. There are 19 vertical lines and 19 horizontal lines on the board, giving 361 points in total. The one extra point in the center, called the Tian Yuan, is the Taiji (Korean: 태극), representing the center of the universe. The number 360 is the number of days in a lunar year, divided into four: the four corners are spring, summer, autumn, and winter. The black and white stones represent day and night. The board is thus a transformative image of Heaven and Earth.

The ancients created Go-game not only to pass the time or to learn how to win and lose, but to cultivate the mind, generate wisdom, and express the artistic talent of the player. Go has also been associated with meditation, military strategy, and matters of national security. Formerly, in the Chinese court, anyone who did not know how to play Go was considered “still deficient”.

In conclusion, Go-game is an excellent creation of our ancestors that trains not only people's intellect but also their character. Its complexity lies in the enormous number of possible moves and the variety of strategies. That complexity challenges not only humans but also the machine learning field, and deep learning in particular. The more complex the game, the more attractive it becomes.

1.2 Research motivation

First of all, deep learning is an attractive technique that can help make devices intelligent enough to interact with humans as if they had brains. If we understand deep learning profoundly, we can intervene in their behavior and drive them according to our purposes.

Secondly, although deep learning is not a new idea, it has become a hot trend in recent years because of the state-of-the-art computing power of current computer systems, something that could not have been matched even ten years ago.

Thirdly, since 2010, social networks such as LinkedIn, Twitter, and Facebook have developed significantly, with hundreds of millions of users all over the world. They are among the largest big-data resources created daily, and no person can process them without the help of powerful computer systems that handle big data automatically and efficiently:

+ Collecting big data automatically according to the goal of its use.

+ Analyzing big data with a concentration on the intended purpose.

+ Modeling big data.

+ Processing big data to find the expected features.

Nowadays, we can handle big data because current enormous computing power lets us calculate millions or even billions of times faster than previous computing systems.

Finally, by applying deep learning to Go-game on a powerful computing system, we can create an incredible machine that resolves big data quickly, intelligently, and correctly.

1.3 Perspective and overview

The rapid emergence of computing has made effective human-computer interaction essential. It is needed by the increasing number of users whose professional schedules do not allow the elaborate training and experience once required to take advantage of powerful computing. Growing attention to usability is also driven by competitive pressure for better productivity, the need to reduce frustration, and the need to reduce overhead costs such as training. As computing affects more aspects of our lives, the need for usable systems becomes even more important.

1.3.1 Machine Learning development

Design in machine learning is more complicated than in many other areas of engineering. It is inherently interdisciplinary, drawing on and influencing diverse fields such as software engineering, human factors, computer graphics, and psychology. Moreover, the developer's task of making a complicated system appear sensible to the user is in itself a very hard and complicated task.

The principles of human factors applied to machine interaction became a subject of intense applied research when the complexity of equipment began to exceed the limits of human abilities for safe operation. Furthermore, the complexity of computation and of software development poses additional requirements.

The engineering paradigm common to many other fields can be turned into a technical approach to engineering usability in computing systems, and that approach is now in widespread use. The paradigm pursues an iterative cycle of analysis, design, implementation, and evaluation. Executable engineering structures human-factors activity so that it can work alongside software engineering research. The development of executable systems draws on technologies from user-oriented software architecture, process and data modeling, standards, and tools for modeling, interface media, and constructing and testing user interfaces. Note that each of these could be the topic of its own research project. These technologies are covered in the following sections on the psychology of deep learning and the computer science of machine learning.

1.3.2 Deep learning history

Which modifiable components of a learning system are responsible for its success or failure? What changes to them help improve performance? This has been called the fundamental credit assignment problem (Minsky, 1963). Learning, or credit assignment, is about finding weights that make the neural network exhibit desired behavior, such as driving a car. Depending on the goal and on how the neurons are connected, the behavior may require long causal chains of computational stages, where each stage transforms the aggregate activation of the network. Deep learning is about accurately assigning credit across many such stages.

Feedforward neural networks are acyclic; recurrent neural networks are cyclic. In a sense, recurrent neural networks are the deepest of all neural networks: in principle they can generate and process memories of arbitrary sequences of input patterns.

To evaluate whether the credit assignment in a given neural network application is of the deep or the shallow type, we use the concept of credit assignment paths, which are chains of possibly causal links between events: from the input, through the hidden layers, to the output layers in feed-forward networks, or through transformations over time in recurrent networks.

If a credit assignment path (a path through the graph starting with an input) is of the form (..., k, t, ..., q), where k and t are the first successive elements with modifiable weights (it is possible that t = q), then the length of the suffix list (t, ..., q) is the path's depth.

This depth limits how far backwards credit assignment can go down the causal chain to find a modifiable weight. The depth of the deepest credit assignment path within an event sequence is called the solution depth. Given some fixed neural network topology, the smallest depth of any solution is called the problem depth.

Sometimes we also refer to the depth of an architecture: supervised feed-forward neural networks with fixed topology have a problem-independent maximal problem depth bounded by the number of non-input layers. In general, recurrent neural networks can learn to solve problems of potentially unlimited depth.

Deep learning, as a branch of machine learning, employs algorithms to process data and imitate the thinking process, or to develop abstractions. Deep learning uses layers of algorithms to process data, understand human speech, and visually recognize objects. Information is passed through each layer, with the output of the previous layer providing input for the next layer. The first layer in a network is called the input layer, while the last is called the output layer. All the layers between the two are referred to as hidden layers. Each layer is typically a simple, uniform algorithm containing one kind of activation function.
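
A minimal sketch of that layered flow of information follows; the layer sizes (361 board inputs, one hidden layer of 64 units, 361 output scores) are illustrative assumptions for this example, not the thesis architecture.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)

# Illustrative layer sizes: 361 board inputs -> 64 hidden units -> 361 move scores.
W1, b1 = rng.normal(0, 0.1, (64, 361)), np.zeros(64)    # input -> hidden
W2, b2 = rng.normal(0, 0.1, (361, 64)), np.zeros(361)   # hidden -> output

x = rng.integers(0, 2, 361).astype(float)  # a fake flattened board
h = relu(W1 @ x + b1)                      # hidden layer: affine map + activation
y = softmax(W2 @ h + b2)                   # output layer: probability over moves
print(y.argmax(), y.sum())                 # index of highest-scoring move, ~1.0
```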

Feature extraction is another aspect of deep learning. It uses an algorithm to automatically construct meaningful “features” of the data for the purposes of training, learning, and understanding. In traditional machine learning the data scientist, or programmer, is responsible for feature extraction, whereas deep learning largely automates it.

The history of deep learning can be traced back to 1943, when Walter Pitts and Warren McCulloch created a computer model based on the neural networks of the human brain. They used a combination of algorithms and mathematics they called “threshold logic” to mimic the thought process. Since that time, deep learning has evolved steadily, with only two significant breaks in its development, both tied to the infamous artificial intelligence winters.

Henry J. Kelley is given credit for developing the basics of a continuous back propagation model in 1960. In 1962, a simpler version based only on the chain rule was developed by Stuart Dreyfus. While the concept of back propagation (the backward propagation of errors for the purpose of training) existed in the early 1960s, it was clumsy and inefficient, and would not become useful until 1985.

The earliest efforts at developing deep learning algorithms came from Alexey Grigoryevich Ivakhnenko (who developed the Group Method of Data Handling) and Valentin Grigorʹevich Lapa (author of Cybernetics and Forecasting Techniques) in 1965. They used models with polynomial (complicated equation) activation functions that were then analyzed statistically. From each layer, the statistically best features were forwarded on to the next layer, a slow, manual process.

During the 1970s the first AI winter kicked in, the result of promises that could not be kept. The resulting lack of funding limited both deep learning and AI research. Fortunately, some individuals carried on the research without funding.

The first “convolutional neural networks” were used by Kunihiko Fukushima, who designed neural networks with multiple pooling and convolutional layers. In 1979, he developed an artificial neural network called the Neocognitron, which used a hierarchical, multilayered design. This design allowed the computer to “learn” to recognize visual patterns. The networks resembled modern versions, but were trained with a reinforcement strategy of recurring activation in multiple layers, which gained strength over time. Additionally, Fukushima's design allowed important features to be adjusted manually by increasing the “weight” of certain connections.

Many of the concepts of Neocognitron continue to be used. The use of top- down connections and new learning methods have allowed for a variety of neural networks to be realized. When more than one pattern is presented at the same time, the Selective Attention Model can separate and recognize individual patterns by shifting its attention from one to the other. (The same process many of us use when multitasking). A modern Neocognitron can not only identify patterns with missing information (for example, an incomplete number 5), but can also complete the image by adding the missing information. This could be described as “inference.”

Back propagation, the use of errors in training deep learning models, evolved significantly in 1970, when Seppo Linnainmaa wrote his master's thesis, including FORTRAN code for back propagation. Unfortunately, the concept was not applied to neural networks until 1985, when Rumelhart, Williams, and Hinton demonstrated that back propagation in a neural network could provide “interesting” distributed representations. Philosophically, this discovery brought to light the question within cognitive psychology of whether human understanding relies on symbolic logic or on distributed representations. In 1989, Yann LeCun provided the first practical demonstration of back propagation at Bell Labs. He combined convolutional neural networks with back propagation to read “handwritten” digits; this system was eventually used to read the numbers on handwritten checks.

This is also when the second AI winter (1985 to the mid-1990s) kicked in, which affected research on neural networks and deep learning. Various overly optimistic individuals had exaggerated the “immediate” potential of artificial intelligence, breaking expectations and angering investors. The anger was so intense that the phrase “artificial intelligence” reached pseudoscience status. Fortunately, some people continued to work on AI and deep learning, and significant advances were made. In 1995, Corinna Cortes and Vladimir Vapnik developed the support vector machine (a system for mapping and recognizing similar data). LSTM (long short-term memory) for recurrent neural networks was developed in 1997 by Sepp Hochreiter and Juergen Schmidhuber.

The next significant evolutionary step for deep learning took place in 1999, when computers started becoming faster at processing data and GPUs (graphics processing units) were developed. Faster processing, with GPUs handling pictures, increased computational speeds by a factor of 1000 over a ten-year span. During this time, neural networks began to compete with support vector machines. While a neural network could be slow compared to a support vector machine, neural networks offered better results using the same data, and they have the added advantage of continuing to improve as more training data is added.

Around the year 2000, the vanishing gradient problem appeared. It was discovered that “features” (lessons) formed in the lower layers were not being learned by the upper layers, because no learning signal reached those layers. This was not a fundamental problem for all neural networks, just those with gradient-based learning methods. The source of the problem turned out to be certain activation functions, which condensed their input, reducing the output range in a somewhat chaotic fashion and producing large areas of input mapped onto an extremely small range. In those areas, a large change in the input is reduced to a small change in the output, resulting in a vanishing gradient. Two solutions used to address this problem were layer-by-layer pre-training and the development of long short-term memory.
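
As a small numeric illustration (not from the thesis) of why saturating activations cause this: the derivative of the logistic sigmoid is at most 0.25, so multiplying one such factor per layer shrinks the learning signal roughly geometrically as it travels backwards. The pre-activation values below are arbitrary examples.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)   # maximum value is 0.25, reached at x = 0

# Chain-rule product of layer-wise sigmoid derivatives at illustrative
# pre-activation values: the gradient reaching lower layers collapses.
pre_activations = np.array([2.0, -1.5, 3.0, 0.5, -2.5, 1.0, 2.2, -0.8])
grad = 1.0
for depth, a in enumerate(pre_activations, start=1):
    grad *= sigmoid_grad(a)
    print(f"after layer {depth}: surviving gradient factor = {grad:.2e}")
```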

In 2001, a research report by the META Group (now called Gartner) described the challenges and opportunities of data growth as three-dimensional: increasing volume of data, increasing speed of data, and an increasing range of data sources and types. This was a call to prepare for the onslaught of Big Data, which was just starting.

In 2009, Fei-Fei Li, an AI professor at Stanford, launched ImageNet, a free database of more than 14 million labeled images. The Internet is, and was, full of unlabeled images, but labeled images were needed to “train” neural nets. Professor Li said, “Our vision was that Big Data would change the way machine learning works. Data drives learning.”

By 2011, the speed of GPUs had increased significantly, making it possible to train convolutional neural networks without layer-by-layer pre-training. With the increased computing speed, it became obvious that deep learning had significant advantages in efficiency and speed. One example is AlexNet, a convolutional neural network whose architecture won several international competitions during 2011 and 2012; it used rectified linear units to improve speed, together with dropout.

Also in 2012, Google Brain released the results of an unusual project known as the Cat Experiment. The free-spirited project explored the difficulties of “unsupervised learning”. Deep learning typically uses supervised learning, meaning the convolutional neural net is trained using labeled data (think of the images from ImageNet). With unsupervised learning, a convolutional neural net is given unlabeled data and is then asked to seek out recurring patterns.

In just the last few years, deep learning has driven advances in a variety of fields such as object perception, automatic translation, and speech recognition, problems that had long been very difficult for artificial intelligence researchers. The ability to analyze big data and to use deep learning in computer systems that adapt to what they receive, without a human programmer's hand, will quickly pave the way for more breakthroughs in the future. These breakthroughs could include virtual assistants, self-driving systems, applications in graphic design and music creation, and the development of new materials that help robots understand the world around them.

1.3.3 Contemporary machine learning

Nowadays, machine learning has grown with numerous successes, but applying learning algorithms usually requires spending a long time hand-engineering the input feature representation. This is true for many problems in vision, audio, robotics, and other areas. To address this, researchers have developed deep learning algorithms that automatically learn a good representation for the input. These algorithms today enable many groups to achieve ground-breaking results in vision, speech, language, robotics, and other areas.

With deep learning, we can classify, cluster or predict anything we have data about: images, video, sound, text and DNA, time series (touch, stock markets, economic tables, the weather). That is, anything that humans can sense and that our technology can digitize. With deep learning, we are basically giving society the ability to behave much more intelligently, by accurately interpreting what's happening in the world around us with software.

1.4 Contributions

My research is about understanding Go-game and its background through a survey on deep learning applied to Go-game, and about presenting experiments. The experiments come from our journal paper “Suggest moving positions in Go-Game with Convolutional Neural Networks trained data”, published in IJHIT (International Journal of Hybrid Information Technology) Vol. 9 No. 4 in 2016, available at http://dx.doi.org/10.14257/ijihit.2016.9.4.05.

My contributions are the following:

+ Applying CNNs in a Go-game engine that calculates and suggests acceptable best next moves for novice players.

+ Showing experiments with Orego, Fuego, and our own work: presenting several real games and comparing them with my program, whose suggestions largely converge with those of the two larger projects.

+ Suggesting a new approach to applying deep learning that produces acceptable suggested moves. Instead of the traditional approach based on tree search, I apply CNNs to trained data and produce move suggestions from board-state input data.

CHAPTER 2: GO-GAME AND ITS COMPLEXITY

2.1 Challenge

At first sight, Go-game seems to be an easy and simple game, with pieces of two basic colors (white and black) and only one type of piece for each player. It is easy to compare it with Chess or Xiangqi, which have seven types of pieces, each with totally different movement rules. Yet the large number of possible moves, together with strategies and rules with many restrictions, makes each move far more complicated than in Chess. This has made Go-game an attractive goal for many researchers to conquer, even now.

Go-game is an old game played by our ancestors. In the early stages of computer game development, many researchers tried to write an intelligent Go program that could calculate each move in acceptable time. Go is a board game in which two players compete to control the most territory on the game board. It is normally played on a 19x19 grid (9x9 or 13x13 for smaller boards) with flat, round pieces called “stones”. One player uses black stones, the other white. The players take turns placing their stones on empty intersections of the grid. The opponents spend the game trying to surround or border empty intersections with their stones while complying with the game's rules.

The most important part of writing a Go program is building the evaluation function. A good evaluation function creates a good engine that can suggest an advantageous, high-scoring move in a given situation. In board games such as Chess, Xiangqi, and Shogi there are already many possible moves to consider; Go-game's complexity is many times larger. How to choose the intelligent or best move in each turn, so as to gain an advantage over the opponent, depends not only on strategy but also on adjusting smartly to the opponent's moves. In each board state there is usually not a single best choice; normally there are several or even tens of good positions with approximately equal scores that give an advantage under different strategies. Computing over an enormous state space of up to 10^170 positions, combined with complicated tactics that lead to a steeply non-linear optimal value function, led many researchers to conclude that it is impossible to represent and learn such a function. In about the last six years, the most successful algorithms have sidestepped this problem with Monte-Carlo tree search.

2.2 Applying deep learning into Go-game

Although many people have paid much attention to applying deep learning in AI systems and devices, it remains a challenge even now.

To understand how AI programs are capable of playing games such as Chess and Go-game, we have to understand what a game tree is. A game tree represents game states (positions) as nodes and possible actions as edges. The root of the tree represents the state at the beginning of the game; the next level represents the possible states after the first move, and so on.


[Figure 2-1] The model for tic-tac-toe game-tree

Knowing the complete game tree is useful for a game-playing AI, because it allows the program to pick the best possible move in a given game state. This can be done with the minimax algorithm: at each turn, the AI figures out which move minimizes the worst-case outcome. To do that, it finds the node in the tree corresponding to the current state of the game and then picks the action that minimizes the worst possible loss it might suffer. This requires traversing the whole game tree down to the nodes representing end-of-game states, so the minimax algorithm needs the complete game tree. A full game tree is fine for tic-tac-toe, but not useful for chess, and even less so for Go-game.
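
For illustration, here is a minimal recursive minimax sketch over an abstract two-player game; the Game interface used (legal_moves, play, is_over, score) is an assumption made for this sketch, not an API of Orego, Fuego, or the thesis program.

```python
# Minimal recursive minimax sketch over an abstract two-player game.
# The Game interface (legal_moves, play, is_over, score) is assumed for this
# illustration only; score() returns +1 for a maximizer win, -1 for a loss, 0 for a draw.

def minimax(game, maximizing):
    if game.is_over():
        return game.score(), None
    best_move = None
    best_value = float("-inf") if maximizing else float("inf")
    for move in game.legal_moves():
        value, _ = minimax(game.play(move), not maximizing)
        if (maximizing and value > best_value) or (not maximizing and value < best_value):
            best_value, best_move = value, move
    return best_value, best_move
```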

Personal computing power has only recently become sufficient to tackle the great complexity of calculating a move in Go-game.

The broad research field of cognitive science, which incorporated cognitive psychology, linguistics, AI, cognitive anthropology, and the philosophy of mind, was founded at the end of the 1970s. Part of cognitive science was articulated as scientifically and systematically informed application, which came to be known as “cognitive engineering”. At the point when personal computing presented the practical need for deep learning, cognitive science offered the people, concepts, skills, and vision for meeting such needs. In this sense, deep learning was an instance of “cognitive engineering”.

It is fair to say that deep learning is achieving state-of-the-art results across a chain of difficult problem domains. There is a great deal of excitement around artificial intelligence and deep learning at this time, and it is an amazing opportunity to get in on the ground floor of a really powerful technique.

Because of Go-game's great complexity, a strong and powerful computing system is needed to calculate, at each move, a good choice that gains an advantage over the opponent. The program must not only choose among the possible and legal moves but also follow basic strategy and think about the endgame. That is why, even now, Go-game remains a challenge for researchers.

2.3 Game's rule sets, possible moves, and legal moves

Go-game has simple rule sets, but its complexity is enormous because of the huge number of possible moves. Moreover, the combination of rules, strategies, and moves makes Go-game even more complicated.

2.3.1 Go-game’s rule sets

In general, there are three closely related issues that any variation of the rules ought to address:

2.3.1.1 Making sure that the game will come to an end-state:

Players must be prevented from going around in circles, and neither player should be able to drag the game out indefinitely, either to avoid losing or to irritate the opponent. Possible methods include time control, the ko rule, the superko rule, or placing an upper bound on the number of moves. This interacts with the scoring procedure used, since territory scoring penalizes extended play after the boundaries of the territories have been settled.

2.3.1.2 Deciding the winner of the game:

Terms that may enter the score calculation include: prisoners captured during the game, komi, dead groups on the board at the endgame state, the amount of territory controlled by a player but not occupied by their stones, each player's living stones, the number of passes, and the number of disjoint living stone groups on the board.

2.3.1.3 Determining whether a group of stones is dead or alive at endgame states:

If the two players are unable to agree on the result, some rule sets provide for mediation by means of hypothetical attempts to capture the group. Others allow the game to resume until the group is captured or is clearly shown to be alive.

Many different official rule sets of Go-game have developed around the world. They vary in significant ways, such as the method used to calculate the final score, and in small ways, such as whether the two types of “bent four in the corner” situations result in removing the dead stones automatically at the endgame state or whether the position has to be played out, and whether the players start the game with a fixed number of stones or with an unbounded number. Popular official rule sets include the Korean, Chinese, American Go Association, Japanese, Ing, and New Zealand rules.

In most cases the differences between the rule sets are negligible: the choice of rule set rarely changes the final score by more than one point, and the strategies and tactics of the game are almost unaffected by it. The differences that do arise come mostly from passes and the scoring of seki.

2.3.1.4 Board sizes:

Board sizes of 9x9 and 13x13 are often used by beginners or for learning to play Go. Sizes of 11x11, 15x15, and 17x17 can also be used, although rarely; in fact, any odd board size from 5x5 upwards is fine.

Normally, Go-game is played on a 19x19 grid, or board. [Figure 2-2] shows an empty board. There are nine marked points, usually referred to as the star points; they serve as reference points and as the markers on which handicap stones are placed in handicap games.

On the board, columns are labeled with the characters “A”, “B”, ..., “T”, skipping the character “I”. Rows are numbered 1 to 19. A stone's position is therefore indicated by its column-row intersection, for example C3, D9, or T19.

Therefore, each intersection of the game board is in one of the three following states (a small coordinate-conversion sketch follows this list):

+ Blank (no stones);

+ Black stone;

+ White stone.
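
As an illustrative helper (an assumption of this sketch, not taken from any of the programs discussed later), the conversion between a board label such as C3 and zero-based array indices, with the letter 'I' skipped:

```python
# Column letters used on a 19x19 Go board: 'I' is traditionally skipped.
COLUMNS = "ABCDEFGHJKLMNOPQRST"

def label_to_index(label):
    """Convert a label like 'C3' to zero-based (row, col) indices."""
    col = COLUMNS.index(label[0].upper())
    row = int(label[1:]) - 1
    return row, col

def index_to_label(row, col):
    """Convert zero-based (row, col) indices back to a board label."""
    return f"{COLUMNS[col]}{row + 1}"

print(label_to_index("C3"))    # (2, 2)
print(index_to_label(18, 18))  # 'T19'
```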

2.3.1.5 The Stones:

The pieces (tokens) used are black and white lens-shaped disks, called stones.

Black starts out with 181 stones and White with 180. The total of 361 stones corresponds to the number of intersections on the standard 19x19 board. In a physical game, the stones are usually kept in wooden bowls next to the board.


[Figure 2-2] A 19x19 board size

2.3.1.6 Playing a game:

+ At the beginning of a new game, the board is empty.

+ One player takes the black stones, the other player the white ones.

+ The player with the black stones, referred to as “Black”, makes the first move. The player with the white stones, referred to as “White”, makes the second move. Thereafter, they alternate moves.

+ A move is made by placing a stone on an intersection.

+ A stone can be placed on any blank intersection, but the move must be legal, not merely possible.

+ A stone does not move after being played, unless it is captured and taken off the board.

+ Capture: removing from the board any of the opponent's stones that have no liberties.

+ Self-capture: removing from the board any of the player's own stones that have no liberties.

+ Liberty: in a given position, a liberty of a stone is a blank intersection adjacent to that stone or adjacent to a stone connected to it (see the flood-fill sketch after this list).

+ A stone, or a solidly connected chain of stones, of one player is captured and removed from the board when all the intersections directly adjacent to it are occupied by the enemy.

+ Connected stones and points: two stones of the same color (or two blank intersections) are said to be connected when it is possible to draw a path from one to the other passing only through adjacent intersections in the same state. The definition of connected empty points is used only at the endgame state to determine each player's score.

+ A move may not recreate a former board position (see the ko and superko rules below).

+ Each player may pass (skip) their turn at any time.

+ If there are two consecutive passes, the game ends.

+ The territory of each player consists of all the points the player has either occupied or surrounded.
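
The capture and liberty rules above can be checked mechanically. A minimal sketch (illustrative only; the numpy encoding 0 = empty, 1 = black, 2 = white is an assumption of this example) finds a chain and its liberties by flood fill:

```python
import numpy as np

EMPTY, BLACK, WHITE = 0, 1, 2

def chain_and_liberties(board, row, col):
    """Return the connected chain containing (row, col) and its liberties."""
    color = board[row, col]
    assert color != EMPTY
    chain, liberties, stack = set(), set(), [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in chain:
            continue
        chain.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < board.shape[0] and 0 <= nc < board.shape[1]:
                if board[nr, nc] == EMPTY:
                    liberties.add((nr, nc))      # adjacent blank point: a liberty
                elif board[nr, nc] == color:
                    stack.append((nr, nc))       # same color: part of the chain
    return chain, liberties

board = np.zeros((19, 19), dtype=int)
board[3, 3] = board[3, 4] = BLACK
board[2, 3] = board[2, 4] = board[3, 2] = board[3, 5] = WHITE
chain, libs = chain_and_liberties(board, 3, 3)
print(len(chain), len(libs))   # 2 2: a 2-stone chain, captured when liberties reach 0
```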

2.3.1.7 Territory:

In the endgame position, an empty point is said to belong to one player's territory if, after all dead stones and groups have been removed, it is bounded only by that player's stones (directly or through adjacent empty points).

2.3.1.8 Ko:

A move may not recreate the board position that followed one's own previous move. That means a player may not capture just one stone if that stone was placed on the previous ply and that ply itself captured exactly one stone. This rule keeps the game from looping forever. A player can take advantage of the ko rule by placing a stone knowing that the opponent will not be allowed to recapture it on the very next move. If the same situation still exists on a subsequent turn, the move is no longer forbidden and the ko advantage may be reversed. The rule prevents players from endlessly capturing and recapturing a single stone, which is called “back and forth”; such exchanges are normally allowed later because the game may still be developing elsewhere on the board in the meantime.

2.3.1.9 Superko:

This rule is designed to make sure that the game eventually comes to an end, by preventing indefinite repetition of previous positions. Its purpose resembles the threefold repetition rule in chess. The superko rule simply forbids moves that would cause repetition, whereas in chess some rules allow repeated moves as a way to force a draw, or cap the number of moves in a game so that reaching the cap leads to a draw (some Xiangqi software uses 200 moves as the maximum number of plies).
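
A minimal positional-superko check (illustrative only, reusing the board encoding assumed in the earlier sketch): store a snapshot of every whole-board position seen so far and reject any move whose resulting position has already occurred.

```python
import numpy as np

class SuperkoTracker:
    """Rejects moves that would recreate any previous whole-board position."""

    def __init__(self):
        self.seen = set()

    def record(self, board):
        self.seen.add(board.tobytes())        # hashable snapshot of the position

    def violates_superko(self, board_after_move):
        return board_after_move.tobytes() in self.seen

tracker = SuperkoTracker()
board = np.zeros((19, 19), dtype=int)
tracker.record(board)
board[3, 3] = 1                               # Black plays
tracker.record(board)
board[3, 3] = 0                               # a hypothetical move undoing it
print(tracker.violates_superko(board))        # True: the empty board was seen before
```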

[Figure 2-3] Illegal moves in Ko rule.

2.3.1.10 Eye:

An eye is a connected group of one or more empty points completely surrounded by a group or chains of stones of a single color.

2.3.1.11 Shoulder Hit:

A shoulder hit is a stone placed diagonally next to an opponent's stone, so it is not quite attached. Generally this move aims to reduce the opponent's territorial potential while being hard to capture.

2.3.1.12 Chain:

A solidly connected group of stones is called a “chain”. A chain is independently alive if it is adjacent to two eyes.

2.3.1.13 Seki (mutual life):

A group of stones of one color that is not independently alive is said to be alive in seki if it nevertheless cannot be captured. For example, in [Figure 2-4], on the left, each of the black and white groups has only one eye, so neither is independently alive. However, whoever plays at the red-circled point would allow the opponent to capture that group by playing in its eye, so under seki both players' groups are alive. Looking at the red-circled point: it is not surrounded by stones of only one color, and so it is not counted as territory for either player. In more complex cases, as shown in [Figure 2-4] on the right, a blank intersection may be bounded by a single-color group that is in seki. Under the Korean and Japanese rule sets such a point is nonetheless treated as neutral territory for endgame scoring purposes; normally, these rules count a blank point as territory for a player only if it is bounded by a chain or chains of that player's color that are independently alive. Seki can arise in many different ways. The simplest are:

+ each color has a group with no eyes and two shared liberties;

+ each color has a group with one eye and one shared liberty.

[Figure 2-4] Keeping alive using seki rule


2.3.1.14 Suicide:

Suicide occurs when a player places a stone where it has no liberties. Such a move is illegal under most rules, and even if it were allowed it is difficult to see how it could be to the mover's advantage. Currently, most official rule sets forbid any play that results in the player's own stones being removed from the board (self-capture), but some rule sets permit suicide when at least two stones are involved. Multi-stone suicide rarely occurs in a real game, but in some situations a suicide move can change the whole board: it can threaten the enemy's eye shape and lead to the capture of a large group.

2.3.1.15 Komi:

Komi is compensation added to the final score of the second player, who plays the white stones. Because the black player moves first and thereby gains an advantage, rule sets generally add a compensation score to balance the match. Komi is usually between 5 and 7 points.

Komi is typically 6.5 points in the Korean and Japanese rule sets and 7.5 points in the Chinese rule sets. Komi usually applies only to games where both players are of similar rank. When the players have different ranks, the weaker player typically takes black (and so moves first) while the stronger player takes white, and the players often agree on a komi of 0.5 points; this simply means that in an otherwise drawn game, the black player wins.
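As a small, hedged illustration of how komi decides a result, the following sketch compares two already-computed score totals; the function name and the example numbers are assumptions made purely for illustration.

def winner(black_score, white_score, komi=6.5):
    """Return the winner; a non-integer komi means a draw is impossible."""
    return "black" if black_score > white_score + komi else "white"

# Example: black 78 points, white 72 points, komi 6.5 -> white wins by 0.5.
print(winner(78, 72, komi=6.5))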

2.3.1.16 Kosumi:

Kosumi is a move placed at a position diagonally adjacent to another stone of the same color while the two adjoining intersections are empty. Tactically it appears simple, but the aim of a kosumi can be to attack, connect, defend, cut, move out, and so on.

2.3.1.17 Handicap:

The official rule sets differ in how handicap stones are placed on the board:

+ Free placement: Stones may be placed anywhere on the board, as if the player's turn were repeated (Chinese rule sets);

+ Fixed placement: Tradition dictates the placement of the stones according to the handicap (Japanese rule sets).

Territory and area scoring rules also differ in the compensation given for each handicap stone. Komi also varies among a few common values: 5.5, 6.5, or 7.5.

2.3.1.18 Goal:

- The goal of Go is to control more territory than your opponent. At the end of the game, each player's controlled territory and score (including handicap and komi) are calculated to determine who wins; the player with the higher score wins. The game is never a draw, because komi is not an integer while the players' territory and area scores are.

2.3.1.19 Score:

At the endgame state, a player's score is the number of intersections counted in their area.

The most prominent differences among rule sets are their scoring methods. There are two official scoring systems: territory scoring and area scoring. A third system is rarely used nowadays; it was used in the past and is mainly of historical and theoretical interest. The score may differ between methods even though the result (who wins) is almost always the same, so we should take care to distinguish between counting methods and scoring systems. Two scoring systems are in popular use, and there are also two ways of counting an area score.

+ Territory scoring (used in the Korean and Japanese rule sets): A player's score is the number of empty intersections that player has surrounded minus the number of that player's stones the opponent has captured. The Korean and Japanese rules have special provisions for seki, which are not a necessary part of a territory scoring system.

+ Area scoring: A player's score has two components: the number of empty positions surrounded only by that player's stones and the number of that player's stones on the board.

In practice, counting is often done by having each player place the prisoners they have captured into the opponent's territory and then counting the remaining territory.
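To make the area-scoring description concrete, here is a minimal sketch under the same assumption used in the earlier sketches that the board is a 2-D list with 0 = empty, 1 = black, 2 = white. It ignores the removal of dead stones, which a real scorer must perform before counting, and it is not the scoring code of any program discussed later.

# A minimal sketch of area scoring: stones on the board plus empty regions
# surrounded exclusively by one color; white receives komi.
def area_score(board, komi=7.5):
    size = len(board)
    score = {1: 0.0, 2: komi}
    seen = set()
    for r in range(size):
        for c in range(size):
            if board[r][c] in (1, 2):
                score[board[r][c]] += 1
            elif (r, c) not in seen:
                # Flood-fill an empty region and record which colors border it.
                region, borders, stack = set(), set(), [(r, c)]
                while stack:
                    rr, cc = stack.pop()
                    if (rr, cc) in region:
                        continue
                    region.add((rr, cc))
                    for nr, nc in ((rr - 1, cc), (rr + 1, cc), (rr, cc - 1), (rr, cc + 1)):
                        if 0 <= nr < size and 0 <= nc < size:
                            if board[nr][nc] == 0:
                                stack.append((nr, nc))
                            else:
                                borders.add(board[nr][nc])
                seen |= region
                if len(borders) == 1:   # region bounded by a single color only
                    score[borders.pop()] += len(region)
    return score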

2.3.1.20 Endgame state:

The endgame is the final stage of the game when the life/death status of all big groups has been determined and the remaining moves aim at expansion of own territory and reduction of the territory of the opponent. By the endgame, the board has been more or less divided up into separate territories, and most of the fighting tends to affect only two of them, occurring at a mutual boundary. The opening and middle game are much like a single large battle between two armies; the endgame is like a number of smaller battles going on in different places simultaneously.

[Figure 2-5] End-game status: Score calculation include related factors

2.3.2 Possible moves

The number of possible Go games is extremely large. It is often compared to the number of atoms in the universe (about 10^80), but in fact it is vastly larger.

In rule sets that permit cycles, such as those that only impose the basic ko rule, games can go on forever, and there is no limit to how many games can be played. It is therefore necessary to consider only rule sets that contain some form of superko rule, i.e. ko rules that prevent the game from going on indefinitely. The most popular choice is the Logical Rules of Go, which from a mathematical perspective are the easiest to work with.

If we ignore the capture rule (under which a stone or group of stones can be removed from the board, freeing occupied intersections and allowing play to loop), the number of possible move sequences on a 19×19 board is:

361! = 1.4379232588848906548323625114999e+768 ≈ 1.4 × 10^768.
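This order of magnitude is easy to check with Python's arbitrary-precision integers; the snippet below is only a sanity check of the figure quoted above.

import math

moves_no_capture = math.factorial(361)
print(len(str(moves_no_capture)))   # 769 digits, i.e. 361! is on the order of 10^768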

2.3.3 Legal moves

When counting the possible moves of a game, not all of them are legal because of rules such as the ko rule. An upper bound on the number of positions on a 19×19 Go board is not hard to calculate: each intersection is either empty, black, or white, so the number of possible positions is exactly 3^361 (ignoring capture), which is ~1.741 × 10^172. This bound does not account for symmetry.

The larger the board, the larger the number of possible positions, growing exponentially with board size.

However, many of these positions contain chains of stones without a liberty and are therefore not legal. The exact numbers of legal positions, calculated for square boards including 19×19, are shown in [Table 2-1].

[Table 2-1] Number of legal positions on common board sizes (rounded)

Board size    Number of legal positions
9×9           ~1.039 × 10^38
13×13         ~3.724 × 10^79
17×17         ~1.908 × 10^137
19×19         ~2.082 × 10^170

An approximation to this number, called L19, has been known since 2006.

L19 is approximately 2.081681994 * 10^170

In 2016, John Tromp finally computed the exact number:

L19 = 208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935
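The upper bound 3^361 and the published exact count can be checked in the same way; the snippet below only reproduces the orders of magnitude quoted in this section, with the long integer literal being Tromp's published value.

upper_bound = 3 ** 361
L19 = int(
    "2081681993819799846" "9947863334486277028" "6522453884530548425"
    "6394568209274196127" "3801537852564845169" "8519643907259916015"
    "6281285460898883144" "2712971531931755773" "6620397247064840935"
)
print(len(str(upper_bound)))   # 173 digits, so 3^361 is about 1.74e172
print(L19 / upper_bound)       # about 0.012: roughly 1.2% of all arrangements are legal positions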

How was it done?

It turns out that the problem reduces to raising a sparse matrix with 363 billion rows and columns to the 361st power. This formulation was known in the early 2000s, but the computing power to carry it out has only recently become available. The computation started on March 6, 2015, ran until December 26, 2015, and after some post-processing the number was announced on January 20, 2016. An estimated 30 petabytes of disk I/O was generated. If you would like to check the computation by running the program again, you will need something like a server with 15 TB of fast scratch disk space, 8 to 16 cores, and 192 GB of RAM, and you should expect to wait a few months.

2.4 The complexity of moves and strategies

2.4.1 State-space complexity:

The state-space complexity of a game is the number of legal game positions reachable from the initial position. When this is too hard to compute exactly, an upper bound can usually be obtained by also counting illegal positions or positions that can never arise in the course of a game.

2.4.2 Game tree size:

Game tree size is the total number of possible games that can be played: the number of leaf nodes of the game tree rooted at the initial position. Typically, the game tree is vastly larger than the state space, because the same position can appear in many games when moves are made in different orders, depending on the players' strategies. An upper bound for the size of a game tree can sometimes be computed by simplifying the game in a way that only increases the game tree's size (for instance, by allowing illegal moves) until it becomes tractable. The game tree is built on "plies" rather than "turns" when counting the tree's leaves.


[Figure 2-6] Game-tree model of Go-game

2.4.3 Ply:

In two-player sequential games, a ply is a single action taken by one of the players. The term is used to clarify what is meant when one might otherwise say "turn", since "turn" means different things in different traditions. For instance, in standard chess terminology, each move consists of a turn taken by each player; a ply in chess is therefore only a half-move, and after the 20th move of a chess game, 40 plies have been completed: 20 played by white and 20 by black. In Go, by contrast, the ply is the unit in which moves are counted: saying that a game is 150 moves long means it lasted 150 plies. In computing, the ply concept is important because one ply corresponds to one level of the game tree.

However, in games where the number of moves is not limited, such as Go, the game tree is infinite. The complexity of some games is shown in [Table 2-2].

[Table 2-2] Complexity of some well-known games

Game              Board size     State-space complexity    Game-tree complexity
                  (positions)    (as log to base 10)       (as log to base 10)
Chess             64             47                        123
Tic-tac-toe       9              3                         5
Lines of Action   64             23                        64
Shogi             81             71                        226
Xiangqi           90             40                        150
Go (19×19)        361            170                       360
Stratego          92             115                       535

2.4.4 The complexity of Go-game moves

On a 19×19 board, there are about 3^361 × 0.012 ≈ 2.1×10^170 possible positions, most of which are the end result of about (120!)^2 ≈ 4.5×10^397 different games without capture, for a total of about 9.3×10^567 games. If captures are allowed, there are about 10^(7.49×10^48) possible games, most of which last for over 1.6×10^49 moves.

An un-pruned Go game tree could have more than 361! branches, since moves

(but not board states) can be repeated: for example, in ko fights, or, more generally, after a capture of a group. Also, one fact which would automatically prune a Go tree is the existence of illegal moves, such as immediate ko recapture or group suicide. Finally, there are situations, like multiple ko, eternal ko, and eternal life, where optimal play goes on forever unless you use a superko rule.

2.4.5 Comparison between Go-game and Chess.

In 1997, Garry Kasparov was defeated by Deep Blue, a computer program developed by IBM and running on a supercomputer. This was the first time a reigning world chess champion was defeated by a computer program under tournament conditions. Superficially, AlphaGo's 2016 win against Lee Sedol can be compared to Deep Blue's win against Kasparov, except that AlphaGo's win came almost 20 years later. To see why, we have to understand the differences between chess and Go.

Go, due to its higher complexity, could not be tackled with Deep Blue's approach. Good progress was made with Monte Carlo Tree Search, but this is also a somewhat disappointing solution: pure Monte Carlo Tree Search does not use any domain knowledge. This means that a Go program using pure Monte Carlo Tree Search knows nothing about how to play Go at the beginning of each new game; there is no learning through experience.

In chess, each player begins with 16 pieces of six different types. Each piece type moves differently. The target of the game is to capture the opponent's king.

Go-game starts with an empty board. At each turn, a player places a stone (the equivalent of a piece in chess) on the board. Stones all obey the same rules. The goal of the game is to capture as much territory as possible. It can therefore be argued that Go-game has simpler rules than chess.

Although the rules of Go might appear simpler than those of chess, the complexity of Go is much higher. At each game state, a player faces a larger number of possible moves than in chess (about 35 in chess versus about 250 in Go). Furthermore, games usually last longer: a typical game of Go lasts about 150 moves, compared to about 80 in chess.

Because of this, the total number of possible games of Go has been estimated at 10^761, compared to 10^120 for chess. Both are very large numbers: the entire universe is estimated to contain "only" about 10^80 atoms. But Go is the more complex of the two games, which is also why it has been such a challenge for computers to play, until now.

The complexity of Go stems mostly from the virtually infinite number of ways of placing one's stones more efficiently than the opponent in order to take more points at the end of the game. This gives a player many extra degrees of freedom, flexibility, and possibilities for achieving the goal of winning the game, and there is an incredible amount of subjectivity involved.

This is the main reason why many Go professionals will give different "best" moves for the same position: each professional looks for the move that he or she regards as the best and most efficient, cooperating with and making use of the stones already placed.

Chess strategies are more or less agreed upon: if you ask ten grandmasters where to play in a given chess position, they would probably all give more or less the same answer. Ask ten top Go professionals about the next move in a given complex board position and they will probably all give different answers.

During a game of Go the board builds up: more and more stones are placed and complex patterns develop as each player tries to make their stones work together more efficiently than the opponent's. The game becomes more and more interesting as multiple fights and unresolved positions occur simultaneously while the game proceeds.

Since stones on one side of the board significantly affect the outcome of situations on another side, there are clearly non-local influences that contribute to the whole-board development of the game. Therefore, making a severe mistake in a game of Go may result in a local loss but not necessarily in losing the game, as there will still be many opportunities to win by forcing advantages elsewhere on the board.

Chess is orders of magnitude less flexible: with each move, the number of pieces remains the same or is reduced, and the number of possible positions usually shrinks greatly (a pawn is not allowed to move backwards, castling is allowed only once, pieces are captured, and so on). As fewer and fewer pieces are left during the game, chess becomes much simpler and less interesting. Also, making one (small) mistake in a chess game usually determines the final outcome.

What further complicates Go compared to chess are the so-called ko situations, in which complex fights all over the board are "weighed" against each other. Winning or losing a ko fight may mean giving up one side of the board in exchange for huge advantages on another side.

Go is at least several orders of magnitude more complex than chess, primarily because of the huge number of possible ways the game can flow (after each move) towards a different line of development. In Go, the number of ways in which a single stone can (and will) affect the whole-board situation in the long term is also many orders of magnitude larger than for a single piece movement in chess.

If you want an ultra-high degree of freedom to use your creativity, imagination, and perseverance in an open game with more aspects than you are likely to encounter in real life, with virtually infinite nuances and complex details that you cannot find in any other existing game, you should definitely go for Go.

Go is the most fascinating, challenging, and profound of games: ultra-simple and ultra-complex at the same time, balanced, detailed, flexible, versatile, and ancient. Choosing between Go and chess is like choosing between living on Earth or on the Moon.

2.4.6 The complexity of Go-game’s strategies.

Strategy deals with global influence, interaction among distant stones, keeping the entire board in mind during local fighting, and other issues that concern the overall game. It is therefore possible to accept a tactical loss when it confers a strategic advantage. Beginning players often start a game by placing stones on the board almost at random, without strategic meaning, as if the game were one of chance. With experience, an understanding of how stones connect to gain greater power develops, and a few basic common opening sequences can be understood. Learning about life and death helps in a fundamental way to develop a player's strategic understanding of weak groups. A player who can both handle adversity and play aggressively is said to display fighting spirit, or kiai.

2.4.7 Basic strategies:

+ Connection: Keeping one's own stones connected means that fewer groups need to maintain living shape; having fewer groups makes them easier to defend.

+ Cut: Keeping the opponent's stones disconnected means that the opponent must defend and establish living shape for more groups.

+ Stay alive: The simplest way to keep stones alive is to establish a foothold in one of the four corners or along one of the four edges of the board. A group must have at least two eyes to be "alive"; the opponent then cannot destroy either eye, because any such move would be prohibited by the rules as suicide.

+ Mutual life (seki): Keeping a group alive in seki is better than letting it die. Seki is a situation in which neither player can place a stone on a particular point without then allowing the opponent to play at another point and capture. The most common example is that of adjacent chains sharing a few liberties: if either player fills a shared liberty, they reduce their own group to a single eye, allowing the other player to capture that chain on the next move.

+ Dead group: A group that has fewer than two eyes is called a "dead group" and is eventually removed from the board as captured.

+ Invasion: Setting up a living group in an area where the opponent has greater influence reduces the opponent's points in proportion to the area one occupies.

+ Reduction: Placing a stone far enough into the opponent's area of influence to decrease the amount of territory they eventually get, but not so far that it can easily be cut off from friendly stones outside.

+ Sente: Sente is a move that forces one's opponent to respond (the response is gote), for example by putting an opponent's group in danger of capture. A player who can usually place stones with sente has the initiative and can control the flow of the game. A move that overwhelmingly compels the opponent into a specific follow-up move is said to have "sente", or "initiative"; the opponent then has "gote". In most games, the player who keeps sente most of the time will win. Gote means "succeeding move", while its opposite, sente, means "preceding move". Sente indicates which player has the initiative at a given point in the game, and which moves result in taking and holding the initiative. More precisely, when a player attacks in sente while the other defends in gote, the attacker is said to have the initiative. Having sente is favorable, since it permits control of the flow of the game. A player can break out of gote and regain sente by choosing to accept some loss at the local level and attacking elsewhere to take the initiative.

+ Sacrifice: Allowing a group or chain to die in order to break free of the opponent's sente and carry out a plan, or to play in a more important area.

The strategies involved can be very complex and abstract. High-level players often spend years to improve their understanding of strategies, and beginners may play hundreds of games against their opponents before they are able to win regularly.

2.4.8 Opening strategy:

In the opening of a game, players usually play in the corners of the board first, as the two nearby edges make it easier to surround territory and establish their stones. The next good positions are the four sides, where one edge still supports their stones. Opening moves are generally made on the third or fourth line from the edge, with occasional moves on the second and fifth lines. In general, stones on the third line are good defensive moves and offer stability, whereas stones on the fourth line are good attacking moves and influence more of the board. The opening is the most difficult stage of the game for professional-level players and takes a disproportionate amount of the total playing time.

In the opening stage, players usually play established sequences called joseki, which are locally balanced exchanges. However, the joseki chosen should also produce a satisfactory result on a global scale. It is generally advised to keep a balance between influence and territory; which of these takes precedence is usually a matter of individual taste.

2.4.9 Middle phase and endgame:

The middle game is the most combative phase, and it usually lasts for more than 100 moves. During it, players often invade each other's territories and attack formations that lack the two eyes needed for viability. Such groups may be saved or sacrificed for something more significant on the board. A player may succeed in capturing a large weak group of the opponent's, which usually proves decisive and ends the game by resignation. However, matters may be more complicated still, with apparently dead groups reviving, major trade-offs, and skillful play that attacks in such a way as to build territory rather than kill.

The end of the middle game and the transition to the endgame are often marked by a few features. At this point the game breaks up into areas that barely affect one another, whereas before the central area of the board related to all parts of it. No large weak groups remain in serious danger. Moves can reasonably be attributed a definite value, such as 20 points or fewer, rather than simply being necessary to play. Both players then set limited objectives in their plans: capturing or saving stones, making or destroying territory. For strong players, these changing aspects of the game usually occur at much the same time. In summary, the middle game transitions into the endgame when the concepts of influence and strategy need to be reassessed in terms of specific final results on the board.

2.5 The hot trend in Go-game research and its challenges

Although much Go research with many different orientations has emerged in recent years, Go remains a hot topic and keeps attracting new researchers, since there are more than 40 million Go players in the world and about 8 million in Korea. The 19×19 board size has become standard. The game reached Korea in the 5th century CE and later Japan in the 7th century CE.

According to the Korea Baduk Association, Go (Baduk) is very popular in Korea, and Koreans have a reputation for playing very fast; they are producing some of the world's strongest players. Both China and Korea have a growing population of very strong young players, a phenomenon which bodes well for the future development of the game.

According to the American Go Association, the number of Go players is still increasing day by day, for ten main reasons (for more details, visit the following link: "http://www.usgo.org/top-ten-reasons-play-go"):

1. Go is the simplest of all games.

2. Go is the most complex of all games.

3. Go is the most popular game in the world today.

4. Go is about building, not destroying.

5. You always know where you fit in.

6. All players are equal.

7. It is easy to learn from mistakes.

8. Ancient rituals impart important values.

9. Every game has a winner.

10. Go is the oldest game still played in its original form.

For the reasons listed above, many new researchers are still interested in Go. But after the success of AlphaGo, this interest has experienced a wave of slacking off.

AlphaGo is arguably the greatest product of machine learning running on a supercomputer. However, ordinary people do not need to play against a super machine or the world Go champion; a good product that helps them improve their Go skills is enough. So many projects are still running, and a lot of Go software is still being sold.

For new Go researchers, an additional burden has been added. If they decide to continue their Go research, they must carefully weigh the following factors:

+ The huge complexity of processing Go, with its enormous number of moves and its rule sets, requires a large investment of time and resources.

+ Whether the effect of their work can surpass the many current projects and research efforts of such large scope and scale.

+ Whether, in professional terms, they can surpass the brilliance of AlphaGo from the Google DeepMind project.

If they want to avoid failure, they must weigh their abilities against the required resources. This is a significant challenge.

2.6 What to consider before starting Go-game research

According to our survey of several large, long-running Go projects, building a complete Go program requires a large investment of time and resources:

+ A profound knowledge of Go and considerable practical experience playing it.

+ A profound knowledge of, and experience with, deep learning and the many algorithms used by those who have researched and developed Go programs.

+ Much time spent building algorithms and engines, and much time spent testing the results and the accuracy of the programs and algorithms.

+ A powerful system to run tests on big data, especially during the training process.

After our calculations and comparisons, researchers must recognize that if they lack these resources they may have to stop their Go research or project. It is useful for new researchers in this field to know:

+ The complexity of the Go-game.

+ The resources needed for Go-game research.

+ The necessary platform and background for Go-game research.

+ The huge scope and scale needed for a Go-game research effort or project.

This gives new researchers who are thinking about Go development, and who want to start research or a project in this field, a basis for assessing their situation and deciding whether to continue or stop. Although at first sight Go does not seem complicated, as my analysis shows, its complexity far exceeds that of other popular board games such as Chess and Xiangqi.

For comparison, consider two previous projects:

+ Fuego has 31 publications in 6 years (from 2008 to 2014).

+ Orego has 47 members, 31 publications in 14 years. (2003-2017).

2.7 Understanding the huge complexity of Go-game:

On the 19×19 board, the average game is about 150-250 plies long and lasts at least 40 minutes. A typical online game at the normal (19×19) size lasts 45 minutes to 2 hours, but professionals can play 5 to 6, or sometimes 10, hours in even matches. Choosing games appropriate to one's level is important for playing and learning Go.

With its many rules, complex strategies, and huge number of possible moves, people have to spend many years learning and practicing to play it well.

2.8 Understanding the huge resources required

At first sight, Go seems very simple to play, but it has an enormous number of possible moves combined with many rules and strategies, so playing at a professional level takes a great deal of time to study and practice.

Although hundreds of publications and dozens of software projects and studies about Go have been completed or are in progress, only AlphaGo, from Google's DeepMind project, using a supercomputer and huge amounts of data, has achieved brilliant success. Because Go is many times more complex than chess, a great deal of time and truly huge resources must be invested to succeed in Go research and to handle:

+ Learning to understand Go deeply, including its rules and strategy.

+ Learning about legal moves and the size of its game tree.

+ Understanding the methods used to approach Go.

+ Understanding the complexity of its huge number of possible moves.

2.9 Providing a full overview

My thesis provides an overview of a Go-game research effort from start to end. From it you can estimate how much time and how many resources such work will take, and how much important knowledge must be prepared before starting your own work.

Many projects use open source code and allow people to download it and take advantage of it freely. On the other hand, some projects' software is commercial, and you have to pay to play it.

2.10 Showing experiments in "deep learning applied to Go-game"

I have surveyed typical solutions in "deep learning applied to Go-game", covering the main parts of each step in building a Go program using deep learning. I have shown experiments with a typical open source Go program and some experiments from our own work on improving Go play using deep learning with 5 hidden layers and a 3-layer Convolutional Neural Network.

Besides, if new researchers want to start a new engine to improve Go play, they can use the GoGui platform or similar open source code to save the resources that would be needed to rebuild a Go platform from scratch.

CHAPTER 3: DEEP LEARNING APPLIED TO GO-GAME

3.1 Why apply deep learning to game building

Various deep learning architectures, such as convolutional neural networks, deep neural networks, recurrent neural networks, and deep belief networks, have been applied to many AI fields like automatic speech recognition, audio recognition, natural language processing, bioinformatics, and computer vision, where they have been shown to deliver state-of-the-art results on various AI tasks.

Research in this field attempts to build better representations and set up models to learn these representations from large-scale unlabeled data. Several of the representations are inspired by advances in neuroscience and are loosely based on the interpretation of information processing and communication patterns in a nervous system, such as neural coding, which attempts to define the relationship between various stimuli and the associated neuronal responses in the brain.

A Go program needs to choose where to play its next stone. This decision is difficult to make because of the broad range of effects a single stone can have across the whole board and the complexity of the interactions that various chains of stones can have with one another. To address this problem, various architectures have arisen, the most popular being:

+ Tree search;

+ Monte Carlo methods;

+ Pattern matching;

+ Knowledge-based systems;

+ Machine learning.

Although knowledge-based systems have been applied to Go with some success, the skill level of such a program is closely tied to the knowledge of the domain experts and programmers who created it. The idea for overcoming this limitation is to apply machine learning techniques so that the program automatically generates rules, strategies, patterns, and conflict-resolution rules. Generally, this is done by using a genetic algorithm or neural network to review a large database of professional games, or by letting the program play many games against itself or against other opponents (human or other AI programs). After a game is finished, these algorithms can use the data from that game to improve their performance.

3.2 Typical evaluation function

An evaluation function, also called a static or heuristic evaluation function, is the most important function used by game-playing programs to estimate the value or goodness of a position within the search algorithms. A typical evaluation function is built to prioritize speed over accuracy: it looks only at the current position and does not explore possible moves.

Evaluation functions in Go take into account the territory controlled, the influence of the stones, the number of prisoners, and the life and death of groups on the board. There are two different kinds of evaluation functions:

+ Concrete evaluation function: Analyzes the entire board. This kind is hard to implement and usually requires a very large search tree. Neural networks applied to Go often implement this kind.

+ Conceptual evaluation function: Analyzes different concepts of the game (group status, string status, connectivity, etc.) and how these concepts relate to one another across the whole board.

The following is an example of the output of a concrete evaluation function on a 5×5 board:

Section >> Board reference A5'
Connectivity listing to Node A1'
string#00 (A4', A3', A2', A1'); length=4; liberties=5
string#01 (A4', A3', A2', ..., A1'); length=6; liberties=5
string#02 (A4', A3', A2', ..., C2', C1', ..., A1'); length=8; liberties=6
...
Connectivity listing to Node A2'
string#00 (A4', A3', A2'); length=3
...
Connectivity listing to Node A3' ...
Connectivity listing to Node A4' ...
Connectivity listing to Node ... ...
Connectivity ...

Thus, the database would be very large, although CPU cycles could be saved by cross-referencing the nodes contained in the strings using pointers.
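To make the idea of a concrete (whole-board) evaluation function more tangible, the following rough sketch scores stones and their adjacent empty points; the weights and the function name are arbitrary illustrative choices, not the evaluation function of any engine surveyed here.

# A rough sketch of a concrete evaluation; positive values favor black (1),
# negative favor white (2); board is a 2-D list with 0 = empty.
def evaluate(board):
    size = len(board)
    stone_diff, liberty_diff = 0, 0
    for r in range(size):
        for c in range(size):
            color = board[r][c]
            if color == 0:
                continue
            sign = 1 if color == 1 else -1
            stone_diff += sign
            # Count empty neighbours as a cheap proxy for influence and safety.
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= nr < size and 0 <= nc < size and board[nr][nc] == 0:
                    liberty_diff += sign
    return stone_diff + 0.5 * liberty_diff   # weights chosen arbitrarily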

3.3 Typical deep learning algorithm

The deep learning algorithm is the heart of an AI program. Deep learning is also known as hierarchical learning, deep machine learning, or deep structured learning, and it forms a class of machine learning algorithms.

Conventional neural networks usually have one or two hidden layers and are used for supervised prediction or classification. Deep learning architectures differ from such networks in that they have more hidden layers, and they can learn in either a supervised or an unsupervised manner, for both supervised and unsupervised learning tasks.

Deep learning is a branch of machine learning based on learning representations of data. An observation can be represented in many ways, such as a vector of per-pixel intensity values, a set of edges, regions of a particular shape, and so on. Some representations are better than others at simplifying the learning task. One of the promises of deep learning is to replace handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.

The typical structure of a deep learning algorithm is as follows (a minimal code sketch is given after the list):

+ It uses a cascade of many layers of nonlinear processing units for feature extraction and transformation. Each successive layer uses the output of the previous layer as its input. The algorithms may be supervised or unsupervised, and applications include pattern analysis (unsupervised) and classification (supervised).

+ It is based on learning multiple levels of features or representations of the data. Higher-level features are derived from lower-level features to form a hierarchical representation.

+ It is part of the broader machine learning field of learning representations of data.

+ It learns multiple levels of representations that correspond to different levels of abstraction; the levels form a hierarchy of concepts.
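The "cascade of nonlinear layers" idea can be illustrated with a few lines of code. The sketch below assumes NumPy is available and uses random, untrained weights; it only shows how each successive layer consumes the previous layer's output, not any particular trained model.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def forward(x, layers):
    """layers is a list of (weight_matrix, bias_vector) pairs."""
    for weights, bias in layers:
        x = relu(weights @ x + bias)   # nonlinear transformation of the features
    return x

# Example: a 3-layer cascade mapping a 361-dimensional board vector to 361 scores.
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((128, 361)) * 0.01, np.zeros(128)),
          (rng.standard_normal((128, 128)) * 0.01, np.zeros(128)),
          (rng.standard_normal((361, 128)) * 0.01, np.zeros(361))]
scores = forward(rng.standard_normal(361), layers)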

3.4 Go-game research and projects

The complexity itself is the attraction! That is the great motivation that has drawn many researchers to tackle Go up to now.

The first Go program was written in 1968 by Albert Lindsey Zobrist as part of his thesis on pattern recognition. He introduced an influence function to estimate territory and Zobrist hashing to detect ko.

In April 1981, Jonathan K. Millen published a paper in Byte discussing Wally, a Go program with a 15×15 board that fit within the KIM-1 microcomputer's 1 KB of RAM. In November 1984, Bruce F. Webster published a paper in the same magazine discussing a Go program he had written for the Apple Macintosh, including the MacFORTH source.

In 1998, many strong players were still able to beat computer programs while giving handicaps of 25-30 stones, an enormous handicap that very few human players could ever accept. There was a striking case at the 1994 World Computer Go Championship where the winning program, Go Intellect, lost all three of its games against youth players despite receiving a 15-stone handicap. In general, players who understood and exploited a program's weaknesses could beat it with much larger handicaps than typical players could. Nevertheless, Go research has continued, building ever better software that can play as professionally as humans.

Developments in machine learning and Monte Carlo Tree Search brought the most successful Go programs to high dan level on the 9×9 board. In 2009, the first programs appeared that could reach and hold low dan-level ranks on the KGS Go Server on the 19×19 board as well.

At the 2010 European Go Congress in Finland, the program MogoTW played a 19×19 game against Catalin Taranu (5p). MogoTW received a handicap of only seven stones and won.

In 2011, the Japanese program Zen reached 5 dan on the KGS server, playing games with only 15 seconds per move. The account that reached the 5 dan rank used a cluster version of Zen running on a 26-core machine.

In 2012, Zen beat Takemiya Masaki (9p) by 11 points with a five-stone handicap, followed by a 20-point win with a four-stone handicap.

In 2013, Crazy Stone beat Yoshio Ishida (9p) in a 19×19 game with a four-stone handicap.

The Codecentric Go Challenge 2014, a best-of-five match on a 19×19 board, was played between Franz-Jozef Dickhut (6d) and Crazy Stone. No stronger player had ever agreed to play a serious competition against a Go program on even terms. In the end Franz-Jozef Dickhut won, although Crazy Stone took the first game by 1.5 points.

3.4.1 Orego:

Orego is a multi-year project to develop programs that play the classical Asian game of Go. The code is available from GitHub or, in less frequent releases, by downloading a .jar from https://webdisk.lclark.edu/drake/orego/.

The Orego project has been aided by grants from:

1- The Atkinson Faculty Development Program.

2- The W. M. Keck Foundation

3- The James F. and Marion L. Miller Foundation

4- The John S. Rogers Science Research Program (Lewis & Clark College)

5- The Willamette Valley REU-RET Consortium for Mathematics Research (funded by the National Science Foundation)

The main object is an instance of edu.lclark.orego.ui.Orego. This contains an instance of edu.lclark.orego.mcts.Player, which in turn contains:

+ an edu.lclark.orego.core.Board, which keeps track of the board state;

+ an edu.lclark.orego.book.OpeningBook, which generates opening book moves;

+ one or more edu.lclark.orego.mcts.McRunnables, which perform Monte Carlo simulations;

+ an edu.lclark.orego.mcts.TreeDescender and an edu.lclark.orego.mcts.TreeUpdater, which work with the Monte Carlo search "tree" stored in an edu.lclark.orego.mcts.TranspositionTable; and

+ an edu.lclark.orego.time.TimeManager, which lets the program decide how to use its allotted time.

3.4.1.1 How It Works:

The instance of Orego accepts GTP commands from a user or other program. Each command is passed to the handleCommand method, which tokenizes the command and calls the proper methods based on user input.

A typical GTP command is genmove black, requesting a move. On receiving this command, handleCommand calls the bestMove method from the Player. This starts a series of threads wrapped around the McRunnables, each of which repeatedly calls its own performMcRun method. This method:

1. Copies the state from the Player's Board to its own Board;

2. Has the Player's TreeDescender generate moves out to the frontier of the tree;

3. Completes the run by calling the McRunnable's own playout method;

4. Has the Player's TreeUpdater update the state of the tree.

Once the allotted time for the move has run out, the Player stops the threads and returns the move from the root of the tree with the most wins.
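The run loop described above can be summarized in a short sketch. This is a generic illustration in Python, not Orego's actual Java code; legal_moves, play, and winner are assumed helper callbacks supplied by the caller, and winner is assumed to return 1 for a win and 0 for a loss from the searching player's point of view.

import copy
import random

def board_key(board):
    return tuple(tuple(row) for row in board)

def mc_run(shared_board, tree, legal_moves, play, winner):
    """One simulated run; tree maps position keys to (wins, visits) statistics."""
    board = copy.deepcopy(shared_board)           # 1. copy the Player's board state
    path = [board_key(board)]
    while board_key(board) in tree:               # 2. descend to the frontier of the tree
        moves = legal_moves(board)
        if not moves:
            break
        play(board, random.choice(moves))         # a real engine biases this choice
        path.append(board_key(board))
    for _ in range(400):                          # 3. finish with a capped random playout
        moves = legal_moves(board)
        if not moves:
            break
        play(board, random.choice(moves))
    result = winner(board)                        # assumed: 1 for a win, 0 for a loss
    for key in path:                              # 4. update win/visit counts along the path
        wins, visits = tree.get(key, (0, 0))
        tree[key] = (wins + result, visits + 1)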

[Figure 3-1] UML class diagram of the basic structure of Orego

3.4.1.2 Related work and publications

There are many publications and presentations from this project, including posters from 2011, 2012, and 2013. Current team members:

1) Dr. Peter Drake (Associate Professor of Computer Science, Lewis & Clark College)

2) Andrea Dean (Lewis & Clark '17)

3) Gregory Aaronson (Lewis & Clark '16)

In total, Orego has had 47 members and 31 publications from 2003 until now, a span of 14 years.

For more details, visit: https://sites.google.com/a/lclark.edu/drake/research/orego.

3.4.2 Fuego:

Fuego participates in some of the Computer Go tournaments on KGS. It won the 45th tournament (9×9) in October 2008 and the 53rd tournament (19×19) in November 2009. It came second in a very strong field in the 54th tournament (9×9) in December 2009, won the 55th (9×9) in January 2010, and placed third in the 74th (9×9) in August 2011.

Fuego occasionally plays on KGS under the user names Fuego19, Fuego13, and Fuego9. In March 2013, Fuego19 achieved a KGS rank of 1d, running on a 16-core machine.

Experimental versions of Fuego play on the Computer Go Server.

Several rated Fuego bots operated by Aloril play on OGS under names such as Fuego9x9_1M.

[Figure 3-2] FUEGO Main Application Documentation


[Figure 3-3] FUEGO Go-game using GTP commands

[Figure 3-4] Explain details of each command in FUEGO project

3.4.3 GNU Go:

GNU Go is a free program that plays Go. It has played thousands of games, stored on the NNGS Go server. It now also plays regularly on the WING server in Japan and the Legend Go Server in Taiwan, and many volunteers run GNU Go clients on KGS. It has established itself as the leading non-commercial Go program in the recent competitions it has taken part in.

GNU Go 3.8, released February 19, 2009, is the latest stable version. GNU Go is portable and is known to work well on various systems such as Windows, GNU/Linux, and Mac OS X.

All source files are distributed under the GNU General Public License, except 'gmp.c', 'gmp.h', 'gtp.c', and 'gtp.h'.

The two files 'gtp.c' and 'gtp.h' are copyright the Free Software Foundation. In the interest of promoting the Go Text Protocol, these files are under a less restrictive license than the GPL and are free for unrestricted use.

The two files 'gmp.c' and 'gmp.h' were placed in the public domain by William Shubert, their author, and are free for unrestricted use.

3.5 Go-game software

I have collected some famous and popular Go software and checked how it works:

+ AlphaGo: Built by the Google DeepMind project, it is the first computer program to beat a professional 9-dan human Go player. Engine: combines Monte Carlo Tree Search with two neural networks, a policy network and a value network.

+ AYA: by Hiroshi Yamashita, who rewrote AYA as a Monte Carlo version of his program.

+ Crazy Stone: A research program, 2007 Computer Olympiad runner-up and 2013 UEC Cup winner, written by Remi Coulom (sold in Japan as Saikyo no Igo).

+ Fuego: An open source Monte Carlo engine, 2009 Computer Olympiad 9×9 champion and 2010 UEC Cup winner; the first program to beat a 9-dan professional on the 9×9 board.

+ GNU Go: An open source classical Go program (engine), free to use. It compiles on many platforms, including GNU/Linux, Unix, Windows, and Mac OS.

+ Go++: Commercial, 2002 Computer Olympiad champion. Written by Michael Reiss (sold in Japan as Strongest Go or Tuyoi Igo).

+ Leela: The first Monte Carlo program offered for sale to the public; now free. Written by a professional chess program author.

+ The Many Faces of Go: Commercial, 2008 Computer Olympiad champion. Written by David Fotland (sold in Japan as AI Igo).

+ MyGoFriend: by Frank Karger. MyGoFriend's GUI can be configured for many languages (English, Korean, Japanese, Chinese, French, German, Dutch, Russian, Taiwanese, and Spanish), but a drawback is that scoring always follows the Chinese rule set.

+ MoGo: by Sylvain Gelly, with a parallel version built by many people. A university research program that introduced UCT; 2007 Computer Olympiad champion.

+ Pachi: A strong open source Monte Carlo program written by Petr Baudis. Jonathan Chetwynd provides an online version, Peepo, which is integrated with maps and comments while playing.

+ SmartGo: A commercial Go program for Windows and Apple (via a Windows emulator). SmartGo for Windows (version 2.8.3) comes with over 45,000 professional games and over 2,000 problems. It can be localized in English, French, German, Japanese, Polish, and Russian, and a free 15-day trial is available at the SmartGo website: "http://www.smartgo.com/". SmartGo for Windows is a complete tool for Go players, with a database of more than 89,000 professional games. It is written by Anders Kierulf, the inventor of the Smart Game Format.

+ Steenvreter: by Erik van der Werf. Steenvreter is an MCTS-based Go program. It won the Computer Olympiad in 2007 (9×9) and placed second in the 2011 Computer Olympiad (19×19 and 13×13).

+ Zen: A strong Go engine by the individual Japanese programmer Yoji Ojima (aka Yamato), with a parallel version built by Hideki Kato (sold in Japan as Tencho no Igo).

+ Goban: A standalone (against GNU Go) and Internet Go client for Mac OS X. Link: "http://www.sente.ch/software/goban".

+ BiGo Assistant: Database software for professional and amateur Go (Baduk, Weiqi) games. It allows searching by joseki, positions, and game information fields. Link: "http://bigo.ufgo.org".

+ PilotGO: A Go-game recorder and SGF viewer/editor for PalmOS. Link at: http://minas.ithil.org/pilotgone.

+ GoSuite: A Go recorder and SGF viewer/editor for PocketPC, also including the Vieka GNU Go port for PocketPC, allowing you to play against your PocketPC PDA. Link: http://senseis.xmp.net/?GoSuite.

+ American Go Association (http://www.usgo.org/go-software): This page sells many types and kinds of Go software for many devices, such as PC, Palm Pilot, and Mac, or for playing online. You can also choose software for teaching Go.

+ Igowin: Smart Games (http://www.smart-games.com/igowin.html): It uses all of the Go playing levels from The Many Faces of Go. The free version of Igowin uses a 9×9 board and can be played on a PC or phone; it is very good for beginners learning to play Go. To play on the full 19×19 size, you must buy the program.

+ PlayOK (https://www.playok.com): This is an online playing website providing 32 online games (6 games in Korean, 7 in Vietnamese). People can join and play against other players on the Internet in a human-to-human setting; the site does not provide a computer opponent for human-versus-computer play. To play Go, first choose your language and then click on the "Go" link as shown in [Figure 3-5].


[Figure 3-5] Playing Go-game at “Play OK” website

3.6 GoGui project: a platform to play Go-game without move suggestions

GoGui is a graphical interface for programs that play Go using GTP (the Go Text Protocol). The Go Text Protocol is a text-based protocol for communicating with computer Go programs. It is a modern alternative to the Go Modem Protocol and may potentially replace it for use in Go competitions in the near future. It is also intended, via the use of auxiliary programs, to make it easier for Go programmers to connect to Go servers on the Internet and to run automatic regression testing.

GoGui is a Java client compliant with GTP versions 1 and 2. It also contains a collection of GTP tools for automatic game play, networking, regression testing, and more. GoGui is licensed under the GNU GPL and was written by Markus Enzenberger.
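Because GTP is a plain-text protocol, any engine can be driven from a few lines of script. The sketch below assumes GNU Go is installed and started in GTP mode; the helper function is illustrative only. Each reply starts with "=" on success or "?" on failure and is terminated by a blank line.

import subprocess

engine = subprocess.Popen(["gnugo", "--mode", "gtp"],
                          stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                          universal_newlines=True)

def gtp(command):
    engine.stdin.write(command + "\n")
    engine.stdin.flush()
    reply = []
    while True:                      # a GTP response ends with a blank line
        line = engine.stdout.readline()
        if line.strip() == "":
            break
        reply.append(line.strip())
    return " ".join(reply)

print(gtp("boardsize 9"))            # "="
print(gtp("play black E5"))          # "="
print(gtp("genmove white"))          # e.g. "= C3"
gtp("quit")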

GoGui is an open source project developed by many people and researchers, with source code provided at "https://sourceforge.net/projects/gogui/". The project is now finished, and its source code and main interface are used and integrated into many other engine development projects, such as the Fuego project.

3.7 Fuego project (a move-suggesting engine)

Fuego is a collection of C++ libraries for developing Go software. It includes a Go player that uses Monte Carlo tree search. The initial version of the code was released by the Computer Go Group at the University of Alberta and is based in part on the previous projects Smart Game Board and Explorer. Fuego is provided under the GNU Lesser General Public License. For more information, visit "http://fuego.sourceforge.net".

Fuego's download site is at: https://sourceforge.net/projects/fuego

In total, Fuego has 31 publications over 6 years (2008 to 2014); for more details, visit "http://fuego.sourceforge.net/publications.html".

Fuego's biggest tournament successes so far were winning the 9×9 competition at the 2009 Computer Olympiad and winning the 4th UEC Cup in 2010 (19×19). In online competitions, Fuego has participated in some of the Computer Go tournaments on KGS. It won the 45th tournament (9×9) in October 2008 and the 53rd tournament (19×19) in November 2009, came second in a very strong field in the 54th tournament (9×9) in December 2009, won the 55th (9×9) in January 2010, and took third place in the 74th (9×9) in August 2011. Fuego occasionally plays on KGS under the user names Fuego19, Fuego13, and Fuego9. In March 2013, Fuego19 achieved a KGS rank of 1d, running on a 16-core machine. Experimental versions of Fuego play on the Computer Go Server. Several rated Fuego bots (probably an older version) operated by Aloril play on OGS under names such as Fuego9x9_1M. For more details about Fuego's tournament results, visit http://fuego.sourceforge.net/competitions.html.

3.8 AlphaGo project

When first building AlphaGo, Hassabis and his team trained and ran the system on a single PC. But in October 2015, before the match against the three-time European champion Fan Hui, the team upgraded the system to one with higher processing power.

Such neural networks typically run on a large number of connected computers, each equipped with GPUs, to execute the machine learning algorithms.

In October 2015, Hassabis said AlphaGo runs on 48 CPUs and 8 GPUs and the distributed version of AlphaGo runs on 1202 CPUs and 176 GPUs.

AlphaGo relies on two components: a tree search procedure and convolutional networks that guide the tree search. The convolutional networks are conceptually somewhat similar to the evaluation function in Deep Blue, except that they are learned rather than designed. The tree search procedure can be regarded as a brute-force approach, whereas the convolutional networks provide a level of intuition to the game play.

In total, three convolutional networks are trained, of two different kinds: two policy networks and one value network. Both types of networks take as input the current game state, represented as an image.

The value network provides an estimate of the value of the current state of the game: the probability that the black player will ultimately win the game, given the current state. The input to the value network is the whole game board, and the output is a single number representing the probability of a win.

The policy networks provide guidance regarding which action to choose, given the current state of the game. The output is a probability value for each possible legal move (i.e. the output of the network is as large as the board). Actions (moves) with higher probability values correspond to actions that have a higher chance of leading to a win.
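The two kinds of outputs can be illustrated with a small sketch. This assumes NumPy and uses random weights in place of trained networks; it only shows the shapes involved (a probability per board point from the policy head and a single win probability from the value head), not DeepMind's actual architecture.

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def policy_and_value(features, policy_weights, value_weights):
    """features: shared representation of the current position."""
    move_probabilities = softmax(policy_weights @ features)   # one entry per board point
    win_probability = sigmoid(value_weights @ features)       # a single scalar
    return move_probabilities, win_probability

# Example with random weights for a 19x19 board (361 points).
rng = np.random.default_rng(1)
features = rng.standard_normal(256)
p, v = policy_and_value(features,
                        rng.standard_normal((361, 256)) * 0.01,
                        rng.standard_normal(256) * 0.01)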


[Figure 3-6] Both types of networks used in AlphaGo

3.9 GoGui experiment

The GoGui project was developed in Java to support playing Go fully according to its complex rule sets. The program allows people to play a full game to the end, and it calculates the final score of the game to determine the winner. However, it does not include an engine to calculate or suggest moves. Instead, being open source, it allows external engines to be attached to do the thinking. Without attached engines, we can use it to play as on a physical board, with scoring assistance. [Figure 3-7] shows the interface and main menu of the GoGui program.


[Figure 3-7] GoGui interface without attached engines

If no engine is attached, the program works as a game board with rule checking, letting one or two people play a game without any move suggestions.

[Figure 3-8] Selecting the board size for expected game


In this interface, before starting a new game, players can change the board size from the default 19×19 to a smaller size. On the 19×19 board, the average game is about 250 plies and lasts at least 40 minutes; a typical online game at that size lasts 45 minutes to 1 hour, but professionals can play 5 to 6, or sometimes 10, hours in even matches. Choosing a board size balanced with the player's level is important for playing and learning Go.

If a player tries to place a stone on an illegal position, the program alerts them with a message box and disallows the move based on the game rules. As shown in [Figure 3-9], the position at J2 is a suicide move, and the program prevents it with a clear message.

[Figure 3-9] During play, players must comply with the game's rules

In the GoGui Java project, each rule is handled independently, and the whole rule set is checked whenever a move is chosen to be played.


[Figure 3-10] Rule sets are implemented in GoGui.

3.10 Attaching the Fuego engine to the GoGui program

FUEGO is a software framework. In contrast to a small library with restricted functionality and a stable application programming interface (API), FUEGO provides a large number of functions and classes, and its API is not stable across major releases. Porting applications that depend on the FUEGO libraries to a new major version can therefore be a significant effort. For each major version a stable branch is created, which receives critical bug fixes that do not break the API; it can be used by applications that do not need or want to follow new developments on the main branch. The code is divided into five libraries and uses a highly consistent coding style. It is written in C++ with portability in mind. Besides the standard C++ library, it uses selected parts of the Boost libraries, which are available for a large number of platforms. Platform-dependent functionality, such as process creation or time measurement, is encapsulated in classes in the smart game library, whose default implementations use POSIX function calls. No attempt is made to provide functionality specific to graphical user interfaces; graphical user interfaces or other controllers can interface with the game engines by attaching to them.

In fact, the Fuego engine is not intelligent enough to beat advanced players. I let it play against some online players on the website "playok.com", and the humans usually won. See the message and the resulting score board of a game that Fuego (playing white) resigned in [Figure 3-11] and [Figure 3-12]:


[Figure 3-11] Fuego has resigned to an online player

[Figure 3-12] Calculating after Fuego has resigned

3.11 KGS Go Server

The KGS Go Server, known until 2006 as the Kiseido Go Server, is a game server first developed in 1999 and established in 2000 for people to play Go. The system was developed by William M. Shubert and its code is now written entirely in Java.

[Figure 3-13] CGoban3 interface

A list of the top 100 players, sorted by KGS-calculated rank, is regularly updated and maintained. International tournament games and national championship games are relayed on this server. Monthly Computer Go tournaments are held in the Computer Go room on KGS.

The KGS Go Server is distinguished by its kibitz culture. Kibitzing is common and popular in high-level games and may include off-topic discussion, though this is discouraged by the administrators. The two players cannot see kibitzers’ comments until after the game.

There are several client programs for connecting to KGS. CGoban 3 is for normal use on any system that supports Java; it supports 30 different languages and can also be used as a Smart Game Format (SGF) file editor and viewer. KGS GTP is another Java program, intended for use by Go-playing programs. KGS Client for Android is for mobile phones running the Android operating system; it supports several languages, though not as many as CGoban 3. KGS used to offer a Java applet version of CGoban, but applet support was removed in late 2015 or early 2016.

KGS allows games on any square board from 2x2 up to 38x38, including the standard 19x19, 13x13, and 9x9 boards. Several game types are offered on KGS:

[Figure 3-14] Log in interface of CGoban3.

+ Ranked: used for KGS rating calculations. Only games played on 19x19 boards can be ranked, and only if both players select the ranked option. The remaining game types in this list are non-ranked.

+ Free: players are not required to register to log in, but free games are not used in KGS rating calculations. If you want to know your Go level and obtain a rank, register an account and play a sufficient number of ranked games.

+ Teaching games: Allow the player with white stones to initiate exploration of alternative lines of play.

+ Rengo: games played between two pairs of players.

+ Simul: allows one player to play two or more games at the same time.

+ Demo: in which one person plays both black and white stones, and may have alternative lines of play. Demo games are used for reviews, lectures and lessons, as well as relaying non-KGS games of interest. Relay of non-KGS games requires permission of the source, and advance notice.

In addition, non-ranked games may be marked private.

The players on KGS may be rated, using levels from 30 kyu to 9 dan according to their results in ranked games. In addition, certified professional players may use their professional ranks.

As shown in [Figure 3-14], to log in with an existing account, fill in the “Name” and “Password” fields and press the OK button. To play as an unranked guest instead, press the Guest button.


[Figure 3-15] Selecting a room to play in CGoban3 software

After logging in to CGoban3, select the room you want to join and start playing Go. After you have played the required number of games, the system will rate the level of your account.

CHAPTER 4: DEEP LEARNING BASED GO-GAME SYSTEM

4.1 Introduction

Nowadays, Go remains very popular in contemporary life, with tens of millions of players around the world. Even though enormous computing power is now available and many new algorithms and inventions have been applied in machine learning to make machines more intelligent, able to reason and make decisions somewhat as human brains do, Go long remained a hard problem to solve. Intelligent solutions based on new algorithms have nonetheless become more and more capable, to the point that an AI program can now beat human players, even the most expert ones, at Go.

To improve the Go-game move suggestion engine, we address this problem using a Convolutional Neural Network with three convolutional layers. Several researchers have also applied Convolutional Neural Networks to improve the strength of Go programs, with some success (Schraudolph et al., 1994; Enzenberger, 1996; Sutskever & Nair, 2008), but with limitations: the computing power available at the time limited the number of hidden layers, so most of them used only one hidden layer to reduce processing time. In our engine, we use a Convolutional Neural Network with five hidden layers, three of them convolutional, and billions of connections to represent and learn Go knowledge. Increasing the size and depth of the layers leads to a qualitative change in performance: a strong Go move evaluation function can be represented and learnt by such architectures.

We demonstrate the application of Convolutional Neural Networks to a Go program that helps beginning players learn the game by suggesting possible moves: the program calculates and suggests the 10 best choices and marks their order from 1st to 10th, where a higher number indicates a worse choice.

4.2 Architecture of Convolutional Neural Network

Convolutional Neural Networks are a class of models frequently used in machine learning, in both the supervised and the unsupervised setting, because of their ability to handle large amounts of training data. A Convolutional Neural Network consists of a number of layers, each of which contains a number of parameters whose values are unknown a priori and need to be trained.

A Convolutional Neural Network contains three types of component:

+ Convolutional layers: which apply a specified number of convolution filters to the image. For each sub-region, the layer performs a set of mathematical operations to produce a single value in the output feature map. Convolutional layers then typically apply a ReLU activation function to the output to introduce nonlinearities into the model.

+ Pooling layers: which downsample the image data extracted by the convolutional layers to reduce the dimensionality of the feature map and decrease processing time. A commonly used pooling algorithm is max pooling, which extracts sub-regions of the feature map (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other values.

[Figure 4-1] One layer of the CNN in the pooling stage. Units of the same color have tied weights and units of different color represent different filter maps

+ Dense (fully connected) layers: which perform classification on the features extracted by the convolutional layers and downsampled by the pooling layers. In a dense layer, every node in the layer is connected to every node in the preceding layer.

Each layer in a Convolutional Neural Network contains neurons. Each neuron receives as input the outputs of neurons in the previous layer; these inputs are weighted, summed together, and passed through a non-linear "activation" function.
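
As an illustration of how these three component types fit together, the sketch below defines a tiny network with one convolutional layer, one max-pooling layer, and one dense layer. It is written with PyTorch purely for illustration (the thesis does not prescribe a framework), and it is deliberately much smaller than the network described later in this chapter.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBoardCNN(nn.Module):
    # One convolutional layer, one pooling layer, and one dense layer,
    # mirroring the three component types described above.
    def __init__(self, board_size=19, num_moves=19 * 19):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # convolutional layer
        self.pool = nn.MaxPool2d(2)                              # max-pooling layer
        pooled = board_size // 2
        self.fc = nn.Linear(16 * pooled * pooled, num_moves)     # dense (fully connected) layer

    def forward(self, x):
        x = F.relu(self.conv(x))      # ReLU introduces the non-linearity
        x = self.pool(x)              # downsample the feature map
        x = x.flatten(start_dim=1)
        return self.fc(x)             # one score per board position

# Usage: one 19x19 board-state encoded as a single-channel "image".
scores = TinyBoardCNN()(torch.zeros(1, 1, 19, 19))
print(scores.shape)                   # torch.Size([1, 361])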


[Figure 4-2] A kernel operates on one pixel

We now describe the training procedure and the precise network architecture in detail. The input to a layer is an m x m x r image, where m is the width and height of the image and r is the number of channels (for an RGB image, r = 3). A convolutional layer has k kernels of size n x n x q, with n smaller than m; q can be equal to or smaller than the number of channels r.

In multi-layer neural networks, the typical set of equations is illustrated by formula (4.1): layer j computes an output vector h^j from the output h^{j-1} of the layer below it, with the input data x serving as the output of the first layer: h^0 = x.

h^j = tanh(b^j + W^j h^{j-1})                                          (4.1)

Here the parameters b^j and W^j are, respectively, a vector of offsets (biases) and a matrix of weights. The element-wise tanh can be replaced by the sigmoid sigm(k) = 1/(1 + e^{-k}) = (1 + tanh(k/2))/2. The top-layer output h^i is used to compute a prediction and is combined with a supervised target y in a loss function L(h^i, y), typically convex in b^i + W^i h^{i-1}. The output layer may use a non-linearity different from the one used in the other layers; a common choice is the softmax, expressed as:

- 87 - i i i1 bt Wt h i e ht  i i i1 bn Wn h (4.2)  n e

i th i i i where Wt is the t row of W , ht is positive and  t ht 1. The softmax output can be used as estimator of P(Y = t|x), with the interpretation that Y is the class associated with input pattern x. In this case one often uses the negative

i i conditional log-likelihood L(h ,y) = −logP(Y = y|x) = −log hy as a loss, whose expected value over (x,y) pairs is to be minimized.
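
As a purely illustrative aid (not the training code used in this thesis), the following Python/NumPy sketch evaluates equations (4.1) and (4.2) for a toy fully connected network with one hidden layer; the layer sizes are arbitrary assumptions.

import numpy as np

def forward(x, weights, biases):
    # Forward pass following Eq. (4.1): hidden layers use tanh.
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(b + W @ h)                    # h^j = tanh(b^j + W^j h^{j-1})
    # Output layer following Eq. (4.2): softmax over the pre-activations.
    a = biases[-1] + weights[-1] @ h
    a = a - a.max()                               # subtract the max for numerical stability
    return np.exp(a) / np.exp(a).sum()

def nll_loss(p, y):
    # Negative conditional log-likelihood: L = -log P(Y = y | x) = -log p[y].
    return -np.log(p[y])

# Toy usage: a 361-dimensional input (a flattened 19x19 board), one hidden layer of 128 units.
rng = np.random.default_rng(0)
sizes = [361, 128, 361]
weights = [rng.normal(0.0, 0.01, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
p = forward(rng.normal(size=361), weights, biases)
print(nll_loss(p, 72))                            # loss for an arbitrary target class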

Neural network training is stochastic:

+ The same algorithm trained on the same data can give different results.

+ Reason: stochastic algorithms use randomness as part of the learning process; in neural networks, the weights are initialized randomly.


[Figure 4-3] Sample effects of convolving an image with different kernel matrices.

4.3 Training procedure

We now give the details of the training procedure and the precise network architecture. The input to a convolutional layer is an image of size m x m x r, where m is the width and height of the image and r is the number of channels (RGB images have r = 3). Convolutional layers have k kernels of size n x n x q, where n is smaller than m and q can be smaller than or equal to the number of channels r.
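
For reference, convolving an m x m input with an n x n kernel (stride 1, no padding) yields an (m - n + 1) x (m - n + 1) feature map; with padding p and stride s the size is (m + 2p - n)/s + 1. A small helper, given only as a sketch, illustrates this calculation:

def conv_output_size(m, n, stride=1, padding=0):
    # Spatial size of the feature map from an n x n kernel applied to an m x m input.
    return (m + 2 * padding - n) // stride + 1

print(conv_output_size(19, 3))              # a 3x3 kernel on a 19x19 board -> 17
print(conv_output_size(19, 3, padding=1))   # padding of 1 preserves the size -> 19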


[Figure 4-4] Go-game CNN structure

First, layer C1 is the convolutional layer: we divide the 19x19 board into four 18x18 boards for calculation.

[Figure 4-5] Layer C1 with 4 boards of 18x18 size from original board-state

Layer S2 is the subsampling layer: we divide each 18x18 board into four boards of size 9x9 and then convolve them with our kernel.

Continuing in this way, after the training process we sort the positions top-down by their weights and show the best move position first.
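
The exact way the 19x19 board is divided into four 18x18 boards is easiest to see in code. The sketch below assumes the four sub-boards are the four overlapping 18x18 corner windows of the full board; this is an illustrative interpretation, not the thesis implementation itself.

import numpy as np

def split_into_quadrants(board, sub=18):
    # Four overlapping sub x sub windows anchored at the four corners of the board.
    n = board.shape[0]
    return [
        board[:sub, :sub],           # top-left
        board[:sub, n - sub:],       # top-right
        board[n - sub:, :sub],       # bottom-left
        board[n - sub:, n - sub:],   # bottom-right
    ]

board = np.zeros((19, 19), dtype=np.int8)
print([q.shape for q in split_into_quadrants(board)])   # four (18, 18) arrays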

4.4 Training Data

For calculation and comparison, I use data from the KGS Go Server at http://www.gokgs.com. This data source contains sequences of board-states s_t for entire games played between humans of various ranks. A board-state includes the positions of all stones on the 19x19 board, and the position sequences allow one to recover the move sequence. Each saved move has three parts: the color [B/W] for Black or White, and the coordinates [cr], where c is the column and r is the row, both written as letters from “a” to “s” (including the letter “i”). For example, a Black move at position a-19 on the game board is saved in the SGF file as B[aa].

[Figure 4-6] Coding Go-game moves in a text file

Each move is encoded as a value in a 19x19 matrix in the computer, one entry per stone position on the 19x19 board. To convert between the SGF code, the board code, and the matrix code used for calculation, I built procedures that transform the row and column separately and correctly.
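
The following Python sketch illustrates the two coordinate systems described above: SGF letters (a to s, including “i”, with row “a” at the top) versus the board labels used by players (columns A to T skipping “I”, rows numbered upward from the bottom). It is an illustration of the conversion rules, not the procedures built for the thesis.

SGF_LETTERS = "abcdefghijklmnopqrs"     # SGF columns and rows: a..s, including "i"
BOARD_COLS = "ABCDEFGHJKLMNOPQRST"      # board labels skip the letter "I"

def sgf_to_matrix(move):
    # "jj" -> (row, col) indices into the 19x19 matrix, row 0 at the top.
    return SGF_LETTERS.index(move[1]), SGF_LETTERS.index(move[0])

def sgf_to_board_label(move):
    # "jj" -> "K10": columns skip "I", rows are numbered upward from the bottom.
    row, col = sgf_to_matrix(move)
    return BOARD_COLS[col] + str(19 - row)

print(sgf_to_matrix("aa"))           # (0, 0), the a-19 corner
print(sgf_to_board_label("jj"))      # K10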

[Figure 4-7] Input data files in sgf format saved as text files.


[Figure 4-8] A typical board-state file in SGF.

[Figure 4-9] Transfer procedures between: sgf code, board-code, matrix index

On each turn, when the player enters the command “reg_genmove”, the program suggests the 10 best choices, the first (number 1) being the best. The second option is “reg_genperc”, for advanced viewing: it shows only the 6 best choices but encloses the weight of each choice.

[Figure 4-10] The suggesting Go-game moves with weights

As shown in [Figure 4-10], the best choice for the Black player is position Q16 and the second best is R16. For the next move, the suggestion shows the more advanced option to the White player: my program shows the weight of each choice alongside the suggested moves (6 choices only). These weights make it easier for people to compare the options before deciding on their move.


[Figure 4-11] Moves coding and showing on game board.

If the player enters an illegal command, help information appears. I designed the command set to follow the GTP standard, but for advanced testing my program outputs more information than a standard GTP program does.

[Figure 4-11] shows the code and the position of the move “B[jj]” in the stored game-state file, which uses the SGF (Smart Game Format). At each step, from reading a stored input file into the computer matrix to displaying it on the game board, we must translate in two steps via the integer matrix: the SGF code, which uses the lowercase letter “i”, is converted to the game-board code, which uses letters for the column (excluding “I”) and integers for the row. For example, the SGF code B[jj] corresponds to B[k10] in the game-board code, as shown in [Figure 4-12]:


[Figure 4-12] Transfer SGF code onto game-board position

On the game board itself, however, the move B[jj] is named K10, as shown in [Figure 4-13].

[Figure 4-13] SGF code B[jj] transfer to game-board

4.5 Solving Work

In our current work we only suggest moves for new players. We have been converting the engine to use functions compatible with the GoGui/Fuego project and attaching it to the GoGui platform so that it can use GoGui’s graphical interface.

Besides that, we can test our suggested moves by letting our engine play against the Fuego or Orego engines.

We use data from the KGS Go Server (http://www.gokgs.com). After the training stages, we test our Go program, covering all the options: switching between the Black and White player, and changing the selection of training data sets.

[Figure 4-14] Board-state file with move coding in SGF format

The Convolutional Neural Network’s suggestion process produces a large number of different output results for a single move in Go, and many of them are not garbage with unusable values. Our program therefore calculates the 10 best-scoring positions to show for the task of helping beginners.
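
A minimal sketch of this final selection step, assuming the network’s output has already been arranged as a 19x19 array of move weights; it simply keeps the 10 best-scoring empty points. This is an illustration of the idea rather than the program’s actual code.

import numpy as np

def top_k_moves(scores, board, k=10):
    # Return the k highest-weighted empty points as (board label, weight) pairs.
    cols = "ABCDEFGHJKLMNOPQRST"             # board labels skip the letter "I"
    candidates = [(scores[r, c], r, c)
                  for r in range(19) for c in range(19)
                  if board[r, c] == 0]       # only suggest empty points
    candidates.sort(reverse=True)
    return [(cols[c] + str(19 - r), float(round(s, 3))) for s, r, c in candidates[:k]]

scores = np.random.rand(19, 19)              # stand-in for the network's output weights
board = np.zeros((19, 19), dtype=np.int8)    # empty board
print(top_k_moves(scores, board))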


a) 40531 input files b) 21115 input files

[Figure 4-15] Different accuracies of suggestion on different data sets

[Figure 4-16] and [Figure 4-17] show the difference between the calculated weights of the suggested third move for the Black player at the same board-state. We can see that, although the first two best moves in these two suggestions are the same, their weights are different. From the third move onward, depending on the data set, some suggested moves differ and their weights vary.

[Figure 4-16] The suggestion with weights at third move.


[Figure 4-17] The suggestions after two moves have been made.

CHAPTER 5: EXPERIMENTS

As shown in Chapter 4, I have implemented and trained my Go program (called HuuDucGo) to suggest the next best moves for players. I have run it and compared it with the Fuego and Orego Go programs. As described above, Fuego and Orego are large projects in computer Go research.

To make the comparison, I use some given sequences of board-states. Each comparison is made on the same board-state for all programs, and, to keep it fair, each program is asked for its own suggestion on that board-state.

To generate a move, use the “genmove <color>” command; to make a move at position [cr], use the “play <color> <cr>” command; to ask for a suggestion without playing it, use the “reg_genmove <color>” command.
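
These commands follow the Go Text Protocol, so any GTP engine can be driven in the same way. The Python sketch below shows how a controller might send them to an engine over standard input and output; the binary name “fuego” is an assumption, and the sketch is only illustrative.

import subprocess

engine = subprocess.Popen(["fuego"], stdin=subprocess.PIPE,
                          stdout=subprocess.PIPE, text=True)

def gtp(command):
    # Send one GTP command and collect the response, which ends with a blank line.
    engine.stdin.write(command + "\n")
    engine.stdin.flush()
    lines = []
    while True:
        line = engine.stdout.readline()
        if line.strip() == "":
            break
        lines.append(line.strip())
    return "\n".join(lines)

print(gtp("boardsize 19"))
print(gtp("play black Q4"))          # make a move at Q4
print(gtp("reg_genmove white"))      # ask for a suggestion without playing it
print(gtp("genmove white"))          # let the engine generate and play a move
gtp("quit")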

5.1 Experiments following HuuDucGo’s suggestions.

First, I follow the suggestions of HuuDucGo and ask each program for its suggestion in order to compare how well they converge. Convergence of the suggestions indicates that HuuDucGo’s results are acceptable compared with the others.


[Figure 5-1] First move for black player of Orego and Fuego.

In [Figure 5-1], for the first move of the Black player, the two other programs suggest the same position, Q4. The same position also appears in HuuDucGo’s list, in third place (see [Figure 5-2]). In this case, Orego and Fuego suggest Q4 while HuuDucGo’s best choice is Q16. In fact, because the game board is symmetric, the first move at the four positions Q4, Q16, D4, and D16 is equivalent. In HuuDucGo, Q4 is the 3rd best move in the suggestion list.


[Figure 5-2] HuuDucGo’s suggestion with 10 best choices for first move.

Next, I played at position Q4 in all three programs and asked for the next move for the White player (using the “reg_genmove” command) to compare:

[Figure 5-3] 2nd move of white player suggested by Orego and Fuego.


[Figure 5-4] 2nd move: HuuDucGo’s suggestion includes two others.

Because the board is symmetric, the four positions Q4, Q16, D4, and D16 mirror one another; with Q4 already occupied, the remaining three positions are equivalent. [Figure 5-4] shows that HuuDucGo, Orego, and Fuego give different suggestions, but they do not differ much in calculating the best position for the next move, and HuuDucGo converges with Orego at its 2nd choice and with Fuego at its 5th choice.


[Figure 5-5] HuuDucGo's suggestion for 3rd move.

[Figure 5-6] 3rd move: Orego converges at 1st choice, Fuego at 3rd choice.


[Figure 5-7] HuuDucGo's suggestion for 4th move.

[Figure 5-8] 4th move: Orego and Fuego also converge at 2nd choice.


[Figure 5-9] 4th move: the 1st and 2nd choices have approximately equal high weights.

In this case, although Fuego and Orego converge with HuuDucGo at its second choice, the weights of the 1st and 2nd choices are not very different. Because HuuDucGo’s suggestion is based on the given board-state and data set, a different data set gives different suggestions.

To test moves deeper into the game, I asked for the 5th suggestion after the four moves ;B[Q16]; W[D4]; B[Q3]; W[D17], as shown in [Figure 5-10] and [Figure 5-11]. HuuDucGo and Orego converge on the same suggested move, F3, for the Black player, while Fuego suggests D15. In HuuDucGo’s list, D15 is the second best move. This means that, in this case, HuuDucGo’s suggestions converge closely with those of the two other big programs.


[Figure 5-10] Fifth suggestions at a given board of Orego and Fuego.

[Figure 5-11] The fifth suggestion of HuuDucGo includes Fuego and Orego.

Taking the suggestions further, after ten moves the 11th suggested move shows a difference, as shown in [Figure 5-12] and [Figure 5-13].

[Figure 5-12] The 11th move suggestions of Orego and Fuego.

[Figure 5-13] HuuDucGo's suggestion at the 11th move for a given board-state

In this case, at the 11th move, HuuDucGo does not converge with Orego in any of its ten suggestions, but Fuego’s suggestion, position Q17, nearly converges with HuuDucGo’s 3rd choice. The convergence of the move suggestions with these two big programs indicates that, although HuuDucGo uses a different method for programming Go, it produces good results. It also shows that, because of the great complexity of Go and its moves, even the big programs sometimes give different suggestions.

5.2 Experiments following Orego’s suggestions.

Second, I follow the suggestions of the Orego program and ask for the next move suggestions in order to compare the three programs: Orego, Fuego, and HuuDucGo.

[Figure 5-14] Fuego converges with Orego at Q4.


[Figure 5-15] HuuDucGo also converges to Orego at Q4 by the 3rd choice.

Following Orego by first playing at Q4, I continue by checking the second suggestions in [Figure 5-16] and [Figure 5-17].

[Figure 5-16] 2nd move: Suggestion of Orego and Fuego.


[Figure 5-17] HuuDucGo's suggestion converges at the 2nd choice.

We can see that HuuDucGo’s list contains Orego’s suggestion as its second choice. Because of the symmetry of the board, D4 and D16 (HuuDucGo’s first and second choices) and Q16 (Fuego’s suggestion) have similar weights for the next move at that board-state. For the next comparison, the board follows Orego’s two suggestions: ;B[Q4]; W[D16];. As shown in [Figure 5-18] and [Figure 5-19], HuuDucGo’s suggestions for the 3rd move include Orego’s.


[Figure 5-18] Fuego converges with Orego at 3rd move.

[Figure 5-19] 3rd move: HuuDucGo converges with Orego's at 6th choice.


[Figure 5-20] 4th move: Fuego converges with Orego.

[Figure 5-21] HuuDucGo's suggestion converges with Fuego at 4th move.

5.3 Experiments following Fuego’s suggestions

Third, I follow the suggestions of the Fuego program and ask for the next move suggestions in order to compare the three programs: Fuego, Orego, and HuuDucGo.

At the first move of the game, HuuDucGo converges with Fuego at the third choice of suggestions.

[Figure 5-22] First move of Orego converges with Fuego.


[Figure 5-23] HuuDucGo converges with Fuego at the 3rd choice.

[Figure 5-24] 2nd move: Orego suggests at D16 while Fuego at Q16.


[Figure 5-25] 2nd move: HuuDucGo converges with Fuego.

At the second move, HuuDucGo converges with Fuego at the fifth choice.

[Figure 5-26] 3rd move: following Fuego, Orego converges at C4 position.


[Figure 5-27] 3rd move: HuuDucGo converges with Fuego by 7th choice.

[Figure 5-28] Orego converges with Fuego at 4th move.


[Figure 5-29] HuuDucGo converges at the 4th move by its 3rd choice.

In summary, I have made experiments following each of the three programs and compared them to find the convergence of their move suggestions. Because of the complexity of Go and the variety of strategies, even the big programs do not converge with each other at some move sequences. In the early moves, however, each of the two big programs and my program HuuDucGo nearly always converges with the other two.

CHAPTER 6: DISCUSSIONS

I have shown that deep Convolutional Neural Networks can calculate and suggest Go moves based on board-state input data. The results show that, using a Convolutional Neural Network, we can train on data to suggest the best moves in Go, which is helpful to novice players learning the game.

In deep learning, CNNs are most often used for image recognition. We have contributed a new approach that applies deep learning with a Convolutional Neural Network to Go, using the Go-game CNN structure. Furthermore, the convergence of its move suggestions with those of big project programs such as Orego and Fuego shows that my program is genuinely useful for helping novice players. Convolutional Neural Networks can outperform traditional tree-search-based programs. Our work also demonstrates that Convolutional Neural Networks can be applied as a supervised deep learning technique, alongside many traditional approaches, to obtain better effectiveness and performance.

In the experiments, my program HuuDucGo almost always converges with the two big project programs, Orego and Fuego, over many sequences and moves. The fact that my move suggestions largely converge with those of these two big programs shows that HuuDucGo, which uses a CNN, a method different from the traditional approach to programming Go, also produces good results. In the comparison, the two big programs themselves sometimes give different suggestions because of the complexity of Go and the variety of playing strategies. Despite this variety, HuuDucGo converges with both big Go programs in its next-move suggestions.

Future work: In my current position as a lecturer at the Vietnam-Korea Friendship IT College, this is a great opportunity to study further and apply deep learning to my teaching in computer science and information technology. With the GoGui software and the game of Go as a platform, it is very convenient to teach, learn, and apply machine learning algorithms by building an engine, which is the first step toward building one’s own intelligent software. Moreover, with a basic background in deep learning and machine learning, I can not only teach but also study and create intelligent software, systems, machines, and devices. In particular, in today’s digital world, people on social networks create big data such as images, videos, and comments every day, and the only way to process such big data is to use AI machines with high computing power. The key to managing our digital world is machine learning, with deep learning as its most advanced branch. I hope my thesis can play a part in helping students and researchers find an easier way to build an AI system as their first step into the contemporary computer science world.

REFERENCES

[1] Chaslot, G. M. J., Winands, M. H., HERIK, H. J. V. D., Uiterwijk, J. W., & Bouzy, B. (2008). Progressive strategies for Monte-Carlo tree search. New Mathematics and Natural Computation, 4(03), 343-357.

[2] Clark, C., & Storkey, A. (2014). Teaching deep convolutional neural networks to play go. arXiv preprint arXiv:1412.3409.

[3] Maddison, C. J., Huang, A., Sutskever, I., & Silver, D. (2014). Move evaluation in go using deep convolutional neural networks. arXiv preprint arXiv:1412.6564.

[4] Coulom, R. (2006, May). Efficient selectivity and backup operators in Monte- Carlo tree search. In International Conference on Computers and Games (pp. 72-83). Springer Berlin Heidelberg.

[5] Chorus, P. (2009). Implementing a computer player for abalone using alpha- beta and monte-carlo search (Doctoral dissertation, Maastricht University).

[6] Park, D. (2015). Space-state complexity of Korean chess and Chinese chess. arXiv preprint arXiv:1507.06401.

[7] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems.

[8] Enzenberger, M., Muller, M., Arneson, B., & Segal, R. (2010). Fuego—an open-source framework for board games and Go engine based on Monte Carlo tree search. IEEE Transactions on Computational Intelligence and AI in Games, 2(4), 259-270.

[9] Gelly, S., & Silver, D. (2011). Monte-Carlo tree search and rapid action value estimation in computer Go. Artificial Intelligence, 175(11), 1856-1875.

[10] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., ... & Petersen, S. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540), 529-533.

[11] Tromp, J. (2016, June). The Number of Legal Go Positions. In International Conference on Computers and Games (pp. 183-190). Springer International Publishing.

[12] Kocsis, L., & Szepesvári, C. (2006, September). Bandit based monte-carlo planning. In European conference on machine learning (pp. 282-293). Springer Berlin Heidelberg.

[13] Kochel, K., & Stinia, M. (2015). Educational values of traditional board games. Yearbook of the International Society of History Didactics / Jahrbuch der Internationalen Gesellschaft für Geschichtsdidaktik, 36.

[14] Nowakowski, R. J. (1998). Games of no chance (Vol. 29). Cambridge University Press.

[15] Duc, H. H., Jihoon, L., & Keechul, J. (2016). Suggesting Moving Positions in Go-Game with Convolutional Neural Networks Trained Data. International Journal of Hybrid Information Technology, 9(4), 51-58.

[16] Maddison, C. J., Huang, A., Sutskever, I., & Silver, D. (2014). Move evaluation in go using deep convolutional neural networks. arXiv preprint arXiv:1412.6564.

[17] Van Den Herik, H. J., Uiterwijk, J. W., & Van Rijswijck, J. (2002). Games solved: Now and in the future. Artificial Intelligence, 134(1-2), 277-311.

[18] Enzenberger, M., Muller, M., Arneson, B., & Segal, R. (2010). Fuego—an open-source framework for board games and Go engine based on Monte Carlo tree search. IEEE Transactions on Computational Intelligence and AI in Games, 2(4), 259-270.

[19] Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge university press.

[20] Schraudolph, N. N., Dayan, P., & Sejnowski, T. J. (1994). Temporal difference learning of position evaluation in the game of Go. In Advances in neural information processing systems (pp. 817-824).

[21] Chen, S. J. Y. J. C., Yang, T. N., & Hsu, S. C. (2004). Computer chinese chess. ICGA Journal, 27(1), 3-18.

[22] Sutskever, I., & Nair, V. (2008). Mimicking go experts with convolutional neural networks. Artificial Neural Networks-ICANN 2008, 101-110.

[23] Allis, L. V. (1994). Searching for solutions in games and artificial intelligence. Ponsen & Looijen.

[24] Slany, W. (2000, October). The complexity of graph Ramsey games. In International Conference on Computers and Games (pp. 186-203). Springer Berlin Heidelberg.

[25] Saks, M., & Wigderson, A. (1986, October). Probabilistic Boolean decision trees and the complexity of evaluating game trees. In Foundations of Computer Science, 1986., 27th Annual Symposium on (pp. 29-38). IEEE.

[26] Müller, M. (2002). Computer go. Artificial Intelligence, 134(1-2), 145-179.

[27] Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural networks, 61, 85-117.

[28] Wu, D. J. (2011). Move Ranking and Evaluation in the game of Arimaa. BA diss., Harvard College, Cambridge, MA.

[29] Anzai, Y. (2012). Pattern recognition and machine learning. Elsevier.

[30] Lohre, B., Dodson, S., Sylvester, N., & Drake, P. (2011). Decision Trees for Local Search in Monte Carlo Go.

[31] Müller, M. (2010). Fuego-GB Prototype at the Human machine competition in Barcelona 2010: a Tournament Report and Analysis.

[32] Enzenberger, M., Muller, M., Arneson, B., & Segal, R. (2010). Fuego—an open-source framework for board games and Go engine based on Monte Carlo tree search. IEEE Transactions on Computational Intelligence and AI in Games, 2(4), 259-270.

[33] Enzenberger, M., & Müller, M. (2009, May). A lock-free multithreaded Monte-Carlo tree search algorithm. In Advances in Computer Games (pp. 14- 20). Springer Berlin Heidelberg.

[34] Müller, M. (2009). Fuego at the Computer Olympiad in Pamplona 2009: A tournament report.

[35] Drake, P. (2014). Go: the deepest game. Journal of Computing Sciences in Colleges, 30(1), 113-113.

[36] Basaldua, J., Stewart, S., Moreno-Vega, J. M., & Drake, P. D. (2014). Two online learning playout policies in Monte Carlo Go: An application of win/loss states. IEEE Transactions on Computational Intelligence and AI in Games, 6(1), 46-54.

[37] Robinson, W. B., Mendez, E. M., Hale, B. K., Johnson, L. A., & Weber, F. D. (1998). U.S. Patent No. 5,734,838. Washington, DC: U.S. Patent and Trademark Office.

[38] Richards, N., Moriarty, D. E., & Miikkulainen, R. (1998). Evolving neural networks to play Go. Applied Intelligence, 8(1), 85-96.

- 122 -