CALIFORNIA STATE UNIVERSITY SAN MARCOS

PROJECT SIGNATURE PAGE

PROJECT SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE

MASTER OF SCIENCE

IN

COMPUTER SCIENCE

PROJECT TITLE: Using Computer Vision Techniques to Play an Existing Video Game

AUTHOR: Christopher E. Erdelyi

DATE OF SUCCESSFUL DEFENSE: May 6, 2019

THE PROJECT HAS BEEN ACCEPTED BY THE PROJECT COMMITTEE IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN COMPUTER SCIENCE.

Xin Ye PROJECT COMMITTEE CHAIR SIGNATURE DATE

Shaun-bill Wu PROJECT COMMITTEE MEMBER DATE

Using Computer Vision Techniques to Play an Existing Video Game

Presented to the faculty of the College of Science and Mathematics at California State University, San Marcos

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Christopher Erdelyi
[email protected]
March 2019

Abstract

Game playing algorithms are commonly implemented in video games to control non-player characters (hereafter, "NPC's,") in order to provide a richer or more competitive game environment. However, directly programming opponent algorithms into the game can cause the game-controlled NPC's to become predictable to human players over time. This can negatively impact player enjoyment and interest in the game, especially when the algorithm is supposed to compete against human opponents. To extend the revenue-generating lifespan of a game, the developers may wish to continually refine the algorithms – but these updates would need to be downloaded to every player's installed copy of the game. Thus, it would be beneficial for the game's algorithm to run independently from the game itself, located on a server which can be easily accessed and updated by the game developers. Furthermore, the same basic algorithm setup could be used across many games that the developer creates, by using computer vision to determine game states, rather than title-specific Application Program Interfaces (hereafter, "API's.")

In this paper, we propose a method for playing a racing game using computer vision, and controlling the game only through the inputs provided to human players. Using the Open Source Computer Vision Library (hereafter known by its common name, "OpenCV") to take screenshots of the game and apply various image processing techniques, we can represent the game world in a manner suitable for an external driving algorithm to process. The driving algorithm then makes decisions based on the state of the processed image, and sends inputs back to the game via keyboard emulation.

The driving algorithm created for this project was tuned using more than 50 separate adjustments, and run multiple times on each adjustment to measure how far the player's vehicle could travel before crashing or stalling. These results were then compared to a set of baseline tests, where random input was used to steer the vehicle. The results indicate that our computer vision-based approach shows promise and, if enhanced, could be used to compete successfully against human players.

Acknowledgements

Thank you Dr. Ye, for your suggestions on computer vision and driving algorithm design, and for guiding me throughout the research project. I also thank my family, friends, and coworkers for their patience and support while I completed the Master’s program.

Table of Contents

List of Abbreviations and Definitions
1. Introduction and Background
2. Related Work
   2.1 DeepMind: Capture the Flag
   2.2 OpenCV: Grand Theft Auto V
3. Program Flow Explanation and Diagrams
   3.1 Overall Process Flow for Experiment
   3.2 OpenCV Image Manipulation: Functions and Order of Operations
   3.3 Visual Analysis Processing Steps
4. Hardware and Software Utilized
5. Approach and Implementation
   5.1 Approach
   5.2 Capture Display
   5.3 Overlay Mask on Screen Capture
   5.4 Examine Processed Image
   5.5 Driving Algorithm Chooses Next Input Action
   5.6 Emulate System Keypresses
6. Experimental Results
   6.1 Setup and Baseline Results
   6.2 Driving Algorithm Tuning: Iterative Results
   6.3 Experiment Results on Static Drive Algorithm Configuration
   6.4 Driving Behavior
7. Conclusion and Future Work
References
External Figure References

Table of Figures

Figure 1. Screengrab of a video demo for DeepMind playing Capture the Flag. [External Figure 1]
Figure 2. Canny Edge Detection on GTA V. [External Figure 2]
Figure 3. Hough Lines on GTA V image. [External Figure 3]
Figure 4. Lane marker overlay in GTA V. [External Figure 4]
Figure 5. Closed-loop process cycle.
Figure 6. OpenCV processing steps for emulator screenshots.
Figure 7. Visual analysis steps.
Figure 8. Grayscale image conversion.
Figure 9. Threshold function generated black and white image.
Figure 10. Processed game image after Canny edge detection is applied.
Figure 11. Processed game image after Gaussian blurring has been applied to Canny edges.
Figure 12. Processed game image with Hough lines.
Figure 13. Processed game image after second application of Hough lines.
Figure 14. Turns navigated vs algorithm tuning iteration.
Figure 15. Trial results over 30 attempts.

List of Abbreviations and Definitions

API: Application Program Interface. In this paper, we are referring to communications definitions or tools that allow for one program to interact with another directly.

CPU: Central Processing Unit. The general purpose computing cores used in personal computers.

GPU: Graphics Processing Unit. The computing cores which are architected to specialize in computer graphics generation.

NPC: Non-Player Character. Refers to an in-game avatar which may act and look similar to a human player’s avatar, but is controlled by the game itself.

OpenCV: Open Computer Vision. An open-source library of functions that allow for real-time computer vision.

OS: Operating System. Software which manages computer hardware, software, and services.

PAL: Phase Alternating Line. An analogue television encoding standard with a resolution of 576 interlaced lines.

RAM: Random Access Memory. Refers to computer memory for temporary program storage.

ROM: Read Only Memory. In this paper, it refers to the test game’s program file. The name originated from the fact that cartridge-based video games were stored on solid state memory chips, and did not have the ability to be written to.


1. Introduction and Background

In today's video games, one common requirement of the main game program is to control a wide variety of non-player characters, which interact with the human player. These non-player characters, or "NPC's," can be cooperative characters, enemies, or environmental figures that add decoration and flair to the game's world. Traditionally, computer-controlled enemy players, or "bots," are controlled by hard-coded logic within a game [25]. Games have traditionally implemented various forms of pathfinding algorithms to control their NPC's. These pathfinding methods require a full understanding of a map's topology, along with decision-making functions, such as the A* algorithm [2]. When used for pathfinding, the A* algorithm is a search algorithm which "repeatedly examines the most promising unexplored location it has seen. When a location is explored, the algorithm is finished if that location is the goal; otherwise, it makes note of all that location's neighbors for further exploration" [4]. Modified A* algorithms for avoiding randomized obstacles were also proposed, and found to be successful [28]. However, even these modified implementations require the environment to be completely known to the game algorithms, and therefore, the algorithm must be integrated with the game itself.

Externally run algorithms do exist for some games. Two companies creating external algorithm programs, which can be used for gaming, are OpenAI [3] and Alphabet's DeepMind [5]. For example, DeepMind has partnered with Blizzard Entertainment to create a machine-learning capable version of their game, StarCraft II. However, for this experimental version of the game, Blizzard created API's to allow DeepMind's software to interact with it [27]. This approach would require an active developer to continue providing support for their published game; any game whose support has been discontinued would likely never have similar integration with external software.

A majority of games on the market use integrated algorithms. Their "bots" maintain a limited set of behaviors throughout the game's existence. Pre-programmed algorithms can be quickly surpassed by human players in any genre of video game. Once strategies are discovered to exploit gaps in the algorithm's in-game abilities, the bots no longer present a challenge to the player. Alternatively, bots that play poorly as part of a human player's team can be distracting and frustrating. Some players may voice their frustration on Reddit subforums for games, such as Blizzard's Heroes of the Storm, in which algorithm-controlled teammates play with confusing and sub-optimal strategies [13].

To many, the pre-programmed game logic feels stale, unintelligent, and unentertaining, and players can lose interest in the game if demands for algorithm updates are not met. Additionally, players often find tricks, glitches, or other unintended (or unforeseen) routes or techniques to get around a racetrack with the shortest recorded time. For example, players of Mario Kart DS (released for the portable DS game system) discovered a technique dubbed "snaking," which involves using a speed boost originally intended for taking corners, and applying it to straight sections of the track [1]. Such discoveries and exploits make the algorithm uncompetitive. To alleviate this problem, developers sometimes invest significant resources to upgrade the game logic, and provide it as a downloadable update. Developers will sometimes also share insight into the bots' algorithm changes via a blog post [24].

However, with increasing broadband data speeds [16], and with computer vision software becoming freely available [17], an alternative to such a downloaded update is possible. We propose that an algorithm, combined with computer vision, can be used to teach a program how to play a game as it exists today, using only the button inputs available to any human player. Such programs could be run on remote servers, and interact with human players via any multiplayer interfaces already working with a game. In order to give a community of gamers an experience that can evolve with them, our proposed game-playing algorithm would allow a game's developer to make changes to the behavior of their bots, without rewriting any portion of the game's code.

This paper is divided into seven main sections. In this section, we describe some background on the problem area we focused on for this research project. In section two, we discuss related work in the field of computer vision for gaming. In section three, we describe the overall program structure, and the process steps performed for image manipulation and measurement. In section four, we record the hardware and software used, for reference. In section five, we discuss the approach and implementation details undertaken to perform the research. In section six, we discuss the experimental results of the research. In section seven, we offer our conclusion on the research performed, and suggest future work to build and improve upon the work we have done.

2. Related Work

There have been many similar, documented attempts to play a moderately complex game using external algorithms. Two such experiments are presented here. Both experiments utilize a game's raw pixel output, combined with human-gameplay controller inputs, to perform decision-making and subsequent manipulation of the game environment. In both experiments, the program does not have access to a map, or any other external representation of the game environment. The movement and actions are informed purely by the visuals generated by the game.

2.1 DeepMind: Capture the Flag

DeepMind conducted such an experiment to play a first-person shooter game, which was built upon a visually modified version of Quake 3 [10]. This modified game was created for in-game agents to play Capture the Flag, leveraging machine learning to increase the capabilities of the game-playing logic. In the game, the avatars on each team were tasked with "tagging" enemy combatants, locating the enemy flag, and returning it to their own base. In addition, the environment layouts are procedurally generated, ensuring that the agents cannot simply memorize the layout between training runs. Figure 1, below, shows both the raw pixels seen and evaluated by the DeepMind-based software (left half of the image) and a representation of one of the procedurally generated maps (right half of the image).


Figure 1. Screengrab of a video demo for DeepMind playing Capture the Flag. [External Figure 1].

In this Capture the Flag experiment, the DeepMind machine-learning algorithm was able to learn and play the game successfully. The agents were trained with 450,000 matches' worth of data, and the learned skillset enabled them to beat human players 75% of the time. Video of the algorithm in action clearly shows DeepMind's reaction-time advantage over a human player, while demonstrating sufficient tactical skill in defending, teaming up with another player, and capturing the flag to score points [9].

2.2 OpenCV: Grand Theft Auto V

A second experiment was run by Harrison Kinsley of PythonProgramming.net, on Rockstar Games' Grand Theft Auto V, hereafter referred to as "GTA V" [11]. This game contains a relatively photorealistic representation of city and highway roads, complete with painted lane markers. In this experiment, Mr. Kinsley used OpenCV to perform edge detection of lanes on the in-game roads, in order to create a self-driving car program. Two important functions were used to perform the image analysis: Canny edge detection, and Hough line generators. Canny edge detection works by examining individual image pixels, comparing each pixel to its neighbors to look for sharp intensity changes [6].

Hough lines take an image (such as a Canny-processed image) containing individual pixels, and through the properties of the transform, allow each pixel to "vote" for the line it belongs to [14]. The Hough line with the highest "vote" count becomes the drawn line. Here, OpenCV's integrated Canny edge detection and Hough lines functions were utilized to locate the lane markers, and create two lane guides for the program to follow. In figure 2 below, the original game display, left, is shown alongside the Canny edge-detected output, right.

Figure 2. Canny Edge Detection on GTA V. [External Figure 2].

In figure 3 below, Hough lines have been drawn on top of the Canny edge-detected output from figure 2. Minimum length thresholds for line generation were set, to exclude small features like landscaping, and the vehicle's mirror. The resulting lines roughly approximate the lane markings on the game's road.


Figure 3. Hough Lines on GTA V image. [External Figure 3].

In figure 4 below, lane markers are defined by choosing the longest two Hough lines generated. The lane markers are superimposed in green over the original game screenshot, to represent the boundaries that the program is measuring for steering input.

Figure 4. Lane marker overlay in GTA V. [External Figure 4].

The PythonProgramming.net approach allowed the vehicle to successfully navigate the road, by measuring the slope of the two lane lines [12].

3. Program Flow Explanation and Diagrams

In this section, we broadly explain our program's control loop, and the methods and order of operations used to perform the computer vision processing. Figure 5 contains the repeating control loop of the program. Figure 6 contains the order of operations of the computer vision processing steps. Figure 7 explains the process for analyzing the image and choosing a driving direction.

3.1. Overall Process Flow for Experiment

The program's loop begins with displaying the emulated game image in windowed mode on the desktop. Screenshots are taken of the game window, by setting the coordinates of the screenshot function to span only the displayed game within the emulator window. These screenshots are then passed through several OpenCV functions, which render a representation of the game state useful to the driving algorithm. The manipulated image is then passed to the driving algorithm function, where it is examined to determine the next controller output state. These output commands are passed to a Python keypress emulator function, which sends the desired keypresses to the Linux OS. With the game emulator selected and operating as the targeted window, Linux directs the keypress commands to the emulator. The emulator accepts these keypress commands, thereby controlling the game, and altering the next output state. This altered output state becomes the input for the next cycle of the loop. These steps are shown in Figure 5, below.


Figure 5. Closed-loop process cycle.
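To make the cycle concrete, the skeleton below sketches one pass through the loop in Python. The helper bodies are simplified stand-ins (an assumed capture region, a placeholder threshold step, and a placeholder left/right decision), not the project's actual processing or decision logic, which is described in sections 5.2 through 5.6.

import time
import numpy as np
import mss
import cv2

def build_track_mask(frame):
    # Stand-in for the full OpenCV pipeline of section 3.2 (grayscale + threshold only here).
    gray = cv2.cvtColor(frame, cv2.COLOR_BGRA2GRAY)
    _, mask = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
    return mask

def choose_direction(mask):
    # Stand-in decision: steer away from whichever half contains more white (barrier) pixels.
    half = mask.shape[1] // 2
    return "Right" if mask[:, :half].sum() > mask[:, half:].sum() else "Left"

with mss.mss() as sct:
    region = {"top": 100, "left": 0, "width": 512, "height": 448}  # assumed emulator window position
    while True:
        frame = np.array(sct.grab(region))   # 1. capture the emulator window
        mask = build_track_mask(frame)       # 2. OpenCV image manipulation
        direction = choose_direction(mask)   # 3. driving algorithm picks the next action
        print(direction)                     # 4. stand-in for the keypress emulation step
        time.sleep(0.1)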

3.2. OpenCV Image Manipulation: Functions and Order of Operations

The Python OpenCV functions are executed in a specific order to achieve the image processing output desired for our driving algorithm. First, a full-color screenshot is taken of the emulator window's game state, and the pixel values are saved into a NumPy array. NumPy arrays are N-dimensional array objects, which allow high-performance operation on their elements [15].

This screenshot is converted into grayscale, and then passed through a threshold function, which converts the image to strictly black and white. These steps reduce the amount of computation that the Canny function is required to perform. The Canny edge detection then creates an image in black and white, where all edges are marked out as thin white lines. A Gaussian blur filter is applied to the image to account for small gaps in detected edges. Gaussian blur filters are low-pass noise filters, which eliminate Gaussian-style noise [26]. The result is that small errors in edge detection are smoothed out, and lines become more continuous. This blur-filtered image is then passed to the Hough lines function, which draws thick white lines across all edges passing a preset length threshold value. This process leaves us with an incomplete masking of the boundaries of the track, so the Gaussian blur is applied a second time, which further fills in the gaps between original Canny-produced edge lines, and the Hough lines. Finally, the Hough lines function is applied a second time, which masks the track boundaries in the image sufficiently well for our driving algorithm to function. These steps are shown in Figure 6, below. The function performed is listed on the left, with the resulting image output on the right.

Screenshot Captured and Stored in NumPy Array → Original Color Image
OpenCV: Convert Array Image to Black and White → Black and White Image
OpenCV: Perform Canny Edge Detection on Image → Black and White, Edges Only
OpenCV: Perform Gaussian Blur on Image → Black and White, Edges Only, Blurred to Reduce Aliasing
OpenCV: Apply Hough Lines to Image → Black and White, Lines Painted Over Edge Connection Points
OpenCV: Perform Gaussian Blur on Image → Black and White, Lines Painted and Blurred
OpenCV: Apply Hough Lines to Image → Black and White, Lines Cover Greater Proportion of Boundaries
Return Array Image to Main Function → Pseudo-Masked Image: Black and White, Boundaries Masked in White

Figure 6. OpenCV processing steps for emulator screenshots.


3.3 Visual Analysis Processing Steps

The driving algorithm performs visual analysis on the processed images to determine the next keypress output. The algorithm relies on the fact that out-of-bounds areas are masked in white pixels, while in-bounds track areas are black. Therefore, the algorithm attempts to steer away from white pixels. In order to do so, the image is first examined for the presence of white pixels. Three horizontal rows are inspected, in the top third, middle third, and bottom third regions of the image. These correspond to far, intermediate, and near distances from the player vehicle in the game window. Each row has 19 equidistantly spaced points, which are sampled for the pixel's value. Once all rows and points have been sampled, the algorithm compares the quantity of white boundary pixels on the left and right halves of the image. This is performed by splitting each row's sampled pixel set. Pixel columns 0 through 8 correspond to the left side of the image; pixel columns 10 through 18 correspond to the right side of the image. Pixels in column 9 are in the direct center of the image, and are saved for special condition checks. The difference in the number of pixels between the left and right halves is stored.

Next, additional checks are performed to look for special conditions. These include upcoming 90-degree turns, acceleration cues from having black pixels straight ahead in all 3 evaluated rows, and an acceleration bump to attempt to free a vehicle stuck on the track. These conditions assert their own driving directions, overriding or modifying the value calculated in the previous step. Finally, the algorithm chooses a button press combination based on the pixel boundaries measured on the left and right halves, plus any input modifiers from the special conditions checks. These button presses are then passed to the keypress emulator to carry out. These steps are shown in Figure 7, below.

Sample pixel values across the image: 3 rows, 19 columns.

Compare the count of white boundary pixels on the left and right halves of the image.

Check for special conditions based on pixel layouts (90-degree turn, etc.).

Choose a button input based on the special conditions and the boundary pixel left-right balance.

Figure 7. Visual analysis steps.

4. Hardware and Software Utilized

This project required the purchase and utilization of the following hardware and software. The hardware is documented to provide a reference for the observed performance of the program. Hardware with faster or slower computation abilities may affect the results of this type of experiment, since we are attempting to perform analysis and responses in real time. Software and software versions are documented to reference available features, plus programming environment compatibility.

4.1 Hardware

CPU: Intel Core i7-8700K, 3.7GHz, 6 physical cores + 6 logical cores

RAM: 16GB DDR4-3200 RAM

Storage: Sandisk 512GB Solid State Drive


Display: 1920x1080 resolution, 60hz refresh rate

4.2 Software

Ubuntu Linux 18.04.01 LTS

Python version 3.6.6

MSS (Multiple Screen Shots) – Python package v. 3.3.1

NumPy – Python package v. 1.15.4

PyUserInput – Python package v. 0.1.11

OpenCV 3.4.3.18 – computer vision processing

higan v106 (Byuu.org) – Super Nintendo Entertainment System emulation

F-Zero (PAL version) – Game ROM

5. Approach and Implementation

This section describes the planning and decisions made to create a working program. The game selected to experiment on is “F-Zero,” published in 1991 for the Super Nintendo Entertainment System [22]. This is a futuristic-style racing game, which does not attempt to display photorealistic graphics. It was chosen for both the relatively simple visuals and high maximum speed of gameplay, in order to evaluate how the OpenCV visual manipulation would work in a time-sensitive task. The higan emulator is used to play the game within the Linux desktop environment [8]. The Python script interacts with the emulator in order to play the game.

5.1 Approach

Our approach to the creation of a game-playing program for "F-Zero" centered on two basic requirements: First, that all programming packages utilized would be open source, and that the environment would be able to run within Python and Linux; Second, that the program would interact with the game only via normal controller inputs, and not utilize any APIs to directly interface with the game. This required research into available software packages, including OpenCV, Python keypress emulation packages, and open-source game console emulators. Once suitable software was found, testing was done to ensure the various software packages could all interface with each other.

Next, basic research was done to understand controller button functionality for the game itself. We also needed a rough estimate of in-game acceleration and deceleration rates when the accelerator was pressed and released, and whether there was any non-linear behavior in the game control scheme.

Finally, lane detection had to be designed for the system. Our original intent was to detect the barriers on the left and right edges of the game's race tracks, and measure the relative angle of the lines to determine turning direction. However, we found that as distant objects came nearer to the vehicle, the resolution of the objects did not increase. Instead, the existing objects are made larger, resulting in a stair-step appearance that gets more significant as the objects approach the bottom-center of the game window. These stair-step changes in pixel color and intensity are recognized by the Canny algorithm as edges. Additionally, the stair-step appearance has the effect of breaking any continuous lines found by the Canny edge detector into many shorter lines. Therefore, we determined that there were too many potential edges to allow driving by defining lane lines. Instead, applying a mask to the image, based on the presence of edges, was chosen to represent passable and impassable game areas.

To have the program play the game, the following events would happen in a continuous loop:

1. The emulator will run in windowed mode on the game desktop, in one corner of the display.
2. Screenshot software within Python will then rapidly capture the image in the game emulator's window, using the MSS library. MSS provides high-performance screenshot functions useful for computer vision [23].
3. The screenshot will then be manipulated and analyzed by OpenCV, following the below steps in order:
   a. The image will be converted to grayscale.
   b. The grayscale image will be passed through a thresholding function, to obtain a binary black and white image.
   c. Canny edge detection will be performed on the black and white image.
   d. Gaussian blurring will be applied to the image containing Canny edges, to eliminate aliasing.
   e. Hough lines processing will be used to paint white lines over the blurred Canny edge image. This results in a partially-masked image, which needs to be filled in more completely for the driving algorithm. Therefore, we repeat the blur and Hough-line drawing steps.
   f. Gaussian blur will be re-applied to the image, which is now partially masked with white Hough lines. This sets up the image for another pass through the Hough lines function.
   g. Hough lines processing will be performed a second time, more completely masking the image. The result is an image which is sufficiently masked to allow for the driving algorithm to be applied. A third pass is unnecessary, because additional masking was found to only affect the outer edges of the image, which are not measured for this driving algorithm.
4. The processed image will then be analyzed by the driving algorithm. Measurement points will be taken horizontally across the screen, in order to determine the presence of barriers (white pixels) or open track (black pixels). Decision-tree logic will be implemented to determine:
   a. If the car needs to turn;
   b. How sharply to turn;
   c. If the vehicle should be accelerating.
5. The driving algorithm will then determine what keys should be pressed, in order to cause the selected action chosen by the decision tree. Button press durations will have a base time duration, with sharper turns or accelerations causing longer button press commands.
6. The keypress software will then send the emulated keyboard commands back out to the Linux OS, taking advantage of the "target window" functionality to allow Linux to direct the commands back to the game console emulator.
7. The emulator will respond to the keypresses, altering the display within the game window, and provide new input for the next loop of the cycle.


5.2 Capture Emulator Display

This process was enabled by using the Python package MSS [21]. The monitors() function is used to set the display area. The grab() function is used to perform the screen capture of the higan emulator window. The output of the grab() function is saved into an array, using NumPy to perform the conversion.
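A minimal capture snippet, under the assumption of a particular window position on the primary display (the offsets and dimensions below are placeholders, not the project's actual values), looks like this:

import mss
import numpy as np

with mss.mss() as sct:
    primary = sct.monitors[1]                 # geometry of the primary display
    region = {"top": primary["top"] + 60,     # assumed offsets framing only the game image
              "left": primary["left"] + 10,   # inside the emulator window
              "width": 512,
              "height": 400}
    frame = np.array(sct.grab(region))        # BGRA pixel data as a NumPy array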

5.3 Overlay Mask on Screen Capture

This process makes use of several OpenCV functions: CvtColor, Threshold, Canny, GaussianBlur, and HoughLinesP, along with the COLOR_BGR2GRAY conversion code. CvtColor is a function which converts the color space of an image. COLOR_BGR2GRAY is the conversion code specifying the formula for grayscale color space conversion, while Threshold will convert all pixels beyond a defined intensity to either black or white [18]. Canny is the function called to perform Canny edge detection on the image [19]. GaussianBlur is a function called to apply a Gaussian blur filter to the image [20]. HoughLinesP is a function used to draw probabilistic Hough lines on the image [21].

First, the image is cropped to remove game UI elements on the top and bottom of the screen, such as the lap timer, and the avatar of the vehicle itself. This is done using simple Python array slicing, which letterboxes the game window into a region of interest for further processing.

Second, the image color information is removed to aid in Canny edge detection processing. The image is converted from a color capture to grayscale, using the CvtColor function with the color space defined as COLOR_BGR2GRAY. This lessens the processing load of the Canny function, since the grayscale image is only 8 bits per pixel, and does not negatively affect the accuracy of the Canny function's edge finding. Additionally, brightness thresholds are applied to the image, to flatten out some artistic shading that occasionally bands across the game's racetrack. This is achieved using the Threshold function. The application of this function preemptively eliminates any false edges from being detected, since these are drivable areas.

The Threshold function is applied to drive gray shades below the defined threshold to black, and to drive gray shades above the threshold to white. Figure 8 portrays a side-by-side example of the original game display image, and the grayscale converted image. Figure 9 shows the results of applying the Threshold function to the grayscale image.

Figure 8. Grayscale image conversion.


Figure 9. Threshold function generated black and white image.

Third, the grayscale image is processed through Canny edge detection to define the image's edges. OpenCV's documentation for the Canny function recommends a Gaussian blur be applied prior to the Canny edge detection processing, in order to remove random noise (such as film grain or digital camera sensor read noise) from the image; however, our digital game screenshots contain no random noise, so this function is skipped to speed up processing [23]. The output of the Canny function is a masked image, where the pixels representing the detected edges are converted to white, and the remainder of the image is flattened to black. This is portrayed in Figure 10.


Figure 10. Processed game image after Canny edge detection is applied.

Fourth, the Canny-produced image is smoothed out using Gaussian blur to remove any minor aliasing. This filtering has the effect of connecting short line segments that may be otherwise missed by the Hough lines function, particularly near the bottom and bottom-center of the image. Figure 11 shows the effect of applying the Gaussian blur filter.


Figure 11. Processed game image after Gaussian blurring has been applied to Canny edges.

Fifth, the Probabilistic Hough Lines function attempts to find the lines represented by the white pixels in the image. The probabilistic variant of the Hough Lines function allows us to define a minimum line length, and a maximum gap allowable between pixels in a potential line. A general description of Hough Transforms is available on the University of Edinburgh's website [7]. Figure 12 portrays the image state after the first application of the Hough lines function.


Figure 12. Processed game image with Hough lines.

Sixth, the Gaussian blur and probabilistic Hough Lines functions are run on the image a second time, to more completely mask the off-limits areas of the racetrack. Figure 13 demonstrates the improvement in masking by running through both functions a second time. This level of masking was observed to be sufficient for the driving algorithm to operate as expected. A third application of the Gaussian blur and probabilistic Hough Lines functions incurs performance penalties without increasing the performance of the driving algorithm.


Figure 13. Processed game image after second application of Hough lines.
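Putting the steps of this section together, the sketch below shows one plausible implementation of the masking sequence. The crop bounds, threshold level, Canny limits, blur kernel, and Hough parameters are illustrative assumptions rather than the exact values used in the project.

import cv2
import numpy as np

def overlay_mask(frame_bgra):
    # Crop away the HUD at the top and the vehicle avatar at the bottom (assumed bounds).
    roi = frame_bgra[60:360, :]
    # Grayscale conversion, then hard thresholding to flatten track shading.
    gray = cv2.cvtColor(roi, cv2.COLOR_BGRA2GRAY)
    _, bw = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
    # Canny edge detection marks boundaries as thin white lines.
    mask = cv2.Canny(bw, 50, 150)
    # Two rounds of Gaussian blur plus probabilistic Hough lines grow the mask.
    for _ in range(2):
        blurred = cv2.GaussianBlur(mask, (5, 5), 0)
        lines = cv2.HoughLinesP(blurred, 1, np.pi / 180, threshold=50,
                                minLineLength=30, maxLineGap=10)
        if lines is not None:
            for x1, y1, x2, y2 in lines[:, 0]:
                cv2.line(mask, (x1, y1), (x2, y2), 255, 8)  # paint thick white lines
    return mask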

5.4 Examine Processed Image

The processed image array is passed into the function responsible for making driving decisions for the vehicle. Since the game's camera is stationary, inputting turning directions will affect how much of the OpenCV-detected barriers (white pixels) exist on each side of the screen. For example, if the vehicle is positioned closer to the left side of the track, the off-limits area (masked in white from the previous processing) will take up more of the left half of the game image. In order to accurately steer the vehicle, the program must measure where the black and white pixels exist on the fully processed image. Then, it can choose which driving directions are necessary to avoid the white pixels, and drive towards the black pixels.

To examine the image, a mask is applied, sampling the array along 3 horizontal rows, at 19 equidistant points across each row. The sampled points are saved into their own array. Three rows were chosen to examine various distances ahead of the player vehicle. The row in the top third of the image represents distances far ahead of the vehicle; the row in the middle third represents intermediate distances in front of the vehicle; and the row in the bottom third represents distances very close to the vehicle. Three rows were determined, via experimentation, to be the minimum number required for the driving algorithm to successfully steer the vehicle around the course. More rows can provide more distance information, at a cost of processing time and algorithm complexity.

The number of columns came about experimentally as well. The game contains some sections of track that are almost as wide as the entire game display, and others that are only about twice as wide as the vehicle itself. There are also points where the track width will grow or shrink very quickly. As a result, it was found that the screen had to be measured nearly all the way across, but with sufficient measurement density to locate the vehicle relative to the boundaries. With 19 measurement points per row, nine are used for each of the left and right sides, while one is located at the center, directly ahead of the vehicle. Experimentation found that 9 points per side are enough to center the vehicle in its lane.

Finally, loops are used to examine the left and right halves of the image, by counting the number of white pixels seen on each row. The counts are tallied and saved into Left-Top, Left-Middle, Left-Bottom, Right-Top, Right-Middle, and Right-Bottom variables. With this information, a number of analytical combinations are possible. The simplest is to total the left and right side variables, and do a comparison. More white pixels on the left side indicates that the barrier is nearest to the left, and therefore the car should turn to the right, and vice-versa.

Additional analysis is also performed. Checks are done to see if the entire top row contains white barrier pixels, signifying an upcoming 90-degree turn; if the top two rows contain entirely white barrier pixels, signifying that a barrier is very near and straight ahead; or if the combined white pixel count of all 3 rows is very high, signifying that the vehicle is stuck next to, or facing, a barrier. Also, the difference in the number of white pixels counted on the left and the right informs how sharply the vehicle should be turning. For example, if the number of white pixels on the left half of the image is far greater than on the right half, the car should turn more sharply to avoid colliding with the barriers.

If only a small imbalance is noticed between left and right side pixel counts, then a nudging steering action would be more appropriate.
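The sampling and counting described above can be sketched as follows, assuming a masked image in which white (non-zero) pixels mark out-of-bounds areas; the row offsets and the return format are illustrative, not the project's exact implementation.

import numpy as np

def count_boundary_pixels(mask):
    # Sample three horizontal rows (far, intermediate, near) at 19 equidistant columns.
    height, width = mask.shape
    rows = [height // 6, height // 2, (5 * height) // 6]   # assumed positions within each third
    cols = np.linspace(0, width - 1, 19).astype(int)
    left = right = center = 0
    for r in rows:
        samples = mask[r, cols] > 0          # True wherever a white boundary pixel is seen
        left += int(samples[:9].sum())       # columns 0-8: left half of the image
        right += int(samples[10:].sum())     # columns 10-18: right half of the image
        center += int(samples[9])            # column 9: directly ahead of the vehicle
    return left, right, center               # at most 27 per side (3 rows x 9 points)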

5.5 Driving Algorithm Chooses Next Input Action

This project contains four grades of steering input for each side: a nudge, an easy turn, a normal turn, and a hard banking turn. These are selected based on the comparison counts between left and right pixels; a greater imbalance towards one side or the other causes a more aggressive turning action to be selected. If the program does not detect an imbalance, then the chosen direction should be to continue straight ahead, until we do come to an imbalanced state. If the points directly in the center of the sampled image area are black, we assume that the way ahead is clear, and the selected action is to accelerate more aggressively. If, however, there is a white barrier pixel detected in the center of the image, the program should approach more slowly, to give the vehicle time to navigate the upcoming required turn. In this situation, we use a nudge forward action. Finally, the program can detect when a 90-degree turn is ahead by measuring just the top line's points. If nearly all of the top line's points are white on both the left and right hand sides, the program assumes that a wall is directly ahead, and doubles the left and right turn values prior to doing a left-right comparison. This allows the program to select a more aggressive turn, which is required to avoid crashing into a perpendicular barrier. The resulting directions are strings, which will be input into the keypress function later.

An example pixel balance configuration for left-turn steering is presented below. The right and left variables contain the sums of the white pixels counted on the right and left halves of each of the 3 pixel sampling rows, respectively. The maximum count for either variable is therefore 27 pixels. In the if-statement, a hard left is chosen when the difference between right and left is 21 or more pixels. Logically, this means the system has measured far more white barrier pixels on the right half of the screen than on the left, and the player vehicle should steer aggressively towards the barrier-free section of the track. The heading variable is an optional terminal debugging message. The variables hardLeft, turnLeft, easyLeft, nudgeLeft, and nudgeForward each contain a string value which is passed to the keypress function. This string value is then interpreted by the keypress function, which generates the corresponding output.

if right > left:
    heading = "Left"
    if right - left > 20:
        return hardLeft, left, right, heading
    elif right - left >= 14:
        return turnLeft, left, right, heading
    elif right - left >= 4:
        return easyLeft, left, right, heading
    elif right - left >= 2:
        return nudgeLeft, left, right, heading
    else:
        return nudgeForward, left, right, heading

5.6 Emulate System Keypresses

The driving algorithm, having selected the next driving direction, now needs to communicate these directions to the game. Since we are using an emulator on a computer, the inputs are configurable to be controlled by a keyboard, rather than a gamepad. To take advantage of existing system keypress emulation packages for Python, keyboard entry was chosen as the input method for the game emulator.

The original game input was set up to use five primary buttons to control the vehicle's direction and forward impulse: directional pad left and right for rotational turns; shoulder buttons L and R for lateral left and right movement; and the B-button for acceleration. Braking exists in the original game, but is not used by this program. Additionally, the Super Nintendo controller was only capable of registering binary on/off button presses. Therefore, button press duration was used to measure player intent, and games (such as F-Zero) reacted quickly, with no programmed-in acceleration (or "lag") before the corresponding action reached full effect. In this project's program, therefore, timers are used to set keypress duration. Short duration timers represent rapid keypresses, such as small adjustments to correct the course of the racing vehicle; long duration timers represent sustained maneuvers, such as turning a corner, or accelerating straight ahead to gain speed.

Regarding turning, there are three types of turns: in addition to normal rotational turning using the directional pad, a left or right slide can be executed using the shoulder L and R buttons alone. This movement does not change heading, but alters track positioning at a modest pace. Finally, a pivoting turn can be executed by pressing the directional pad simultaneously with the corresponding direction shoulder button. This turn is the most aggressive, but has not been implemented in this iteration of the program.

Forward acceleration is controlled by long keypresses for maximum acceleration; blipping the throttle (short duration keypresses) during either nudge turns, or when obstacles ahead are detected; or by releasing the accelerator to coast.

Specific to the PyUserInput package, the function press_key initiates the emulated keypress, and continuously generates the emulated keypress to the Linux OS. The function release_key ends the emulated keypress generation. In our implementation, each keypress has a configurable baseline time duration, saved as a variable called timeout. This is derived from the current system time, obtained from the function time.time(), plus an offset. All keypresses last at least the duration of the baseline, but most will modify and extend this duration. Additionally, multiple buttons can be pressed at once, and each key may have its own keypress duration.

The keypress logic contains both time duration settings, and button press and release duties. First, a series of if-statements looks for a match to the current input command. Once the command is matched, the associated keys are pressed and released according to the baseline timeout duration, plus an additional delay, set up in its corresponding if-statement. When the current system time surpasses the time defined by the timeout plus the delay, the keypress is released, and the if-statement breaks.

Example Python keypress timing configurations for left turns are presented below. 'Left' corresponds to the left directional arrow on the keyboard, which is interpreted by the higan emulator as a left directional-pad button press on the original controller input. This provides rotational turns to the left. 'B' corresponds to the B key on the keyboard, which is interpreted by the higan emulator as pressing the B-button on the original controller input, and provides acceleration to the vehicle. 'L' corresponds to the L key on the keyboard, which is interpreted by the higan emulator as pressing the left shoulder button on the original controller input. This causes a leftward sliding movement of the vehicle, equivalent to a translation on the X-axis.

timeout = time.time() + 0.05  # 50 milliseconds

if directions == "Left":
    k.press_key('Left')
    while True:
        if time.time() > timeout + 0.05:
            k.release_key('Left')
            break

elif directions == "nudgeLeft":
    k.press_key('L')
    k.press_key('B')
    while True:
        if time.time() > timeout + 0.03:
            k.release_key('L')
            k.release_key('B')
            break

elif directions == "hardLeft":
    k.press_key('Left')
    while True:
        if time.time() > timeout + 0.10:
            k.release_key('Left')
            break

elif directions == "easyLeft":
    k.press_key('Left')
    k.press_key('B')
    while True:
        if time.time() > timeout:
            k.release_key('B')
        if time.time() > timeout + 0.03:
            k.release_key('Left')
            break
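For reference, the k object used in the listing above is a keyboard controller from the PyUserInput package; to the best of our understanding of that package, a typical setup looks like the following.

from pykeyboard import PyKeyboard  # provided by the PyUserInput package

k = PyKeyboard()
k.press_key('Left')    # begin emitting the emulated keypress to the OS
k.release_key('Left')  # stop emitting the emulated keypress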

6. Experimental Results

In this section, we will discuss the setup of the experiment parameters, the baseline results obtained from random input, the results achieved by measuring the masked game image, and observed driving behavior of the driving algorithm.

6.1 Setup and Baseline Results

The setup and tuning for this experiment was performed on the track titled "Mute City I." This track contains 7 turns, varying between right-angle turns, 45-degree turns, and one U-turn. The width of the track also varies between sections. The game allows for a choice between four vehicles, each with different characteristics. The vehicle chosen for this experiment was the default selection vehicle, the Blue Falcon, because it offers a balance between acceleration, top speed, and manageable turning characteristics. The experiments were run on practice mode, with another competing vehicle also set up as the Blue Falcon.

The steering was adjusted, and further granularity introduced into the driving algorithm, over the course of 60 separate adjustments. Due to variability in the game framerate, the driving algorithm's output could differ between trial runs. Therefore, each driving algorithm adjustment was observed between three and five times before moving forward with the next adjustment. The best result seen for each iteration was recorded.

Baseline results were taken by using a random number generator to generate a number between 0 and 1, and assigning a value of 0.0 to 0.33 as a left turn; 0.33 to 0.67 as a right turn; and 0.67 to 1.0 as straight ahead. This way, input directions were equally represented by the values returned by the random number generator. 32 trials were attempted using standard "left" and "right" turn durations, as used by the final driving algorithm. A further 16 trials were run using the "hard left" and "hard right" turn durations. Baseline trials always resulted in the vehicle crashing out at or before the first turn of the track. When using the standard turn duration, the vehicle would travel further down towards the first turn than when using the hard turn durations. Baseline trials never resulted in the vehicle navigating the first turn successfully.
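The baseline input mapping is simple enough to state directly; the snippet below is a sketch of it, with the function name chosen for illustration.

import random

def baseline_direction():
    # Map a uniform draw in [0, 1) to one of three equally weighted inputs.
    r = random.random()
    if r < 0.33:
        return "Left"
    elif r < 0.67:
        return "Right"
    return "Straight"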

6.2 Driving Algorithm Tuning: Iterative Results

The driving algorithm required tuning to successfully navigate any of the turns on the course. Much of the tuning was focused on defining the left-to-right pixel balance for counting white barrier pixels, as well as changing the duration of button presses for each type of turn. Initial values were set up by estimating the pixel counts needed for each turn type, and by playing the game via the emulator to estimate the keypress time duration needed to cause particular turning and acceleration rates. During the tuning, conditional checks for right-angle turns, rate-of-change measurements between runs, and "vehicle stuck" conditions were implemented and improved. These augmented the vehicle's basic ability to successfully navigate the course at a slow speed.

The results of the iterative changes were recorded during testing, and are shown in the table below. The first 25 results can be seen as gradually improving, until changes caused a regression in the driving algorithm's effectiveness. As the changes were refined and results improved, new changes were introduced that would cause regressions again. These swings in ability are visualized in Figure 14. Comparing the first 30 results to the second 30 results shows an increase in the average completed turn count, from 2.23 to 3.3 turns per run. This is an increase of 48% in the average turn completion per run. When compared with the baseline trials, these results suggest that there is useful information encoded in the game images manipulated by the OpenCV processes, which the algorithm is able to use to drive the vehicle. Furthermore, the increase in average turn completion between the grouped results suggests that an improved driving algorithm can extract more performance out of the images as they are currently processed.

Iteration    Turns Successfully    Iteration    Turns Successfully
Number       Navigated             Number       Navigated
Baseline     0                     32           3
1            2                     33           3
2            2                     34           4
3            2                     35           2
4            2                     36           4
5            2                     37           1
6            2                     38           6
7            2                     39           1
8            2                     40           2
9            1                     41           2
10           3                     42           2
11           2                     43           6
12           2                     44           5
13           2                     45           2
14           2                     46           6
15           2                     47           1
16           2                     48           4
17           3                     49           2
18           3                     50           4
19           2                     51           5
20           3                     52           5
21           2                     53           2
22           3                     54           6
23           3                     55           2
24           3                     56           2
25           5                     57           7
26           2                     58           2
27           2                     59           1
28           2                     60           6
29           1                     Group        Avg Turns
30           1                     1-30         2.23
31           1                     31-60        3.3


Figure 14. Turns navigated vs algorithm tuning iteration.

Current algorithm configurations allow the vehicle to navigate between 1 and 7 turns on the course. No two attempts by the program are the same. Turning points, oscillations, and acceleration moments vary between trial laps. We believe this is caused, in part, by the OpenCV-manipulated image frame rate being both variable (each frame takes a different amount of time to complete) and relatively slow (under 10 frames per second). This results in, effectively, a low-bandwidth discrete control loop between the game and the Python-based program. Typical positions to get stuck (or crash out) are turn 3, a 90-degree sharp right turn after 2 consecutive 45-degree right turns; turn 4, a 90-degree right turn after a straightaway; or a stall after navigating turn 6.

6.3 Experiment Results on Static Drive Algorithm Configuration

An additional experiment was run on a static driving algorithm setup, over 30 trial runs, to measure the variability of the outcomes from the algorithm's choices. Each trial was limited to three minutes of Python script runtime, after which the program would automatically close.

The number of successfully completed turns varied between 1 and 6, but the mode of the set was 2 successful turns. The mean was 3.13 turns, and the standard deviation of all 30 trials was 1.477 turns. We consider these results to be poor, because they demonstrate that the driving algorithm is not able to consistently navigate around the track, despite having the ability to do so. This is evidenced by the fact that the program was able to navigate 3 turns or more a total of 15 times, or half of all trial runs. Since the driving algorithm is fixed and unchanging, and the processing frame rate is known to be slow and variable, these results highlight the significant effect that the slow and inconsistent updates have on the program's operation. Figure 15 below visualizes the significant amount of variability that arises from providing the program with a limited amount of game state information.

Figure 15. Trial results over 30 attempts.

6.4 Driving Behavior

While the initial computer vision processing of the screen was able to quickly yield good results, determining how to steer the vehicle based on the masked image required many revisions before working adequately. Ultimately, the vehicle is able to avoid driving straight into barriers, but is susceptible to a "ping-pong" action, oscillating between the lane barriers.

This oscillation ultimately reduces the vehicle speed, and is difficult for the program to recover from in the current configuration. Random incidents, like being bumped by a computer-controlled vehicle, do show that the driving algorithm can react to what is happening on the screen. Often, the driving algorithm is able to recover from the unexpected heading and velocity change, steering itself back onto the course without crashing out of the race or getting stuck.

Despite the sub-optimal turning behavior, the vehicle has made it to the end of the track on some attempts, completing 7 out of 7 turns. This is significant, because if the driving algorithm can complete one full lap, then it should be able to repeat the same algorithm and complete any number of laps. However, the vehicle cannot currently complete a full lap, even when it does not crash out against a barrier, due to a bug in the PyUserInput package which causes the emulated keypresses to stop working. After roughly two minutes, the emulated keypress commands slowly become ineffective. The number of emulated keypress inputs transmitted by the PyUserInput functions dwindles, essentially halting the control inputs of the game emulator. If the program is quit and restarted, it will begin analyzing the emulator window from the game state it was left in, and the emulated keypress inputs will be fully functional again.

7. Conclusion and Future Work

In this paper, we have demonstrated a technique for using OpenCV functions to mask a game image, converting the game's visual output into a black and white image with boundaries marked in white. The processed, masked image was evaluated by a simple driving algorithm, to test whether the boundaries marked by the OpenCV process were sufficient to navigate the in-game vehicle safely through the track. Through experimentation with a basic driving algorithm, our program was able to avoid the boundaries generated by the OpenCV processing. This was demonstrated to us over the course of 60 adjustments to the driving algorithm, where each run had the number of successful turns recorded. Baseline testing, consisting of random inputs, resulted in crashes before the first turn every time. In contrast, during 30 trials of running the OpenCV processing and driving algorithm, the vehicle never crashed before the first turn. As a result, we feel confident that the image masking performed by the OpenCV functions results in image data that can be used by a driving algorithm to successfully navigate around the game's course.

This type of real-time analysis is possible because computer vision techniques, such as Canny edge detection and Hough Lines drawing, can be executed at multiple frames per second on general purpose processors. However, computing power is a bottleneck for this type of computer-vision processing. Since the Python code, OS, and game emulator are all running on the CPU, there is a relatively low frame rate for the screen capture and image processing routines. Currently, these are processed at or below 10 frames per second. This effectively limits the amount of information that the driving algorithm gets to work with, since it is unable to immediately measure the results of its inputs. Low frame rates may be a contributing cause of the lane oscillation behavior.

Experimental runs are also inconsistent. One run's driving inputs may vary from another's, despite using the same settings for screen analysis and turn choice. This also seems to be a consequence of having a low and variable frame rate, meaning the driving algorithm is likely measuring a different game state between two runs, despite being in roughly the same position on the racetrack. The frame rate may be increased by using a more powerful CPU, or by utilizing computer vision software that can be run on a GPU. Additionally, bug-free keypress emulation packages could allow the current hardware to lap the course successfully. Informal experiments with overclocking the CPU, and manually restarting the main Python program after 2 minutes, have led to the vehicle completing several laps without human control intervention.

Further experiments should focus on replacing the hard-coded driving algorithm with state-based artificial intelligence, or attempting to use machine learning to build a driving model. These changes can apply to the keypress timing and duration logic, or to the visual analysis of the screen, or both. Machine learning-based input control schemes could be implemented via Recurrent Neural Networks, or other similar algorithms, which include a state vector to represent the previous button inputs chosen. For example, instead of a small number of discrete turn inputs like "hardLeft" and "easyLeft," a machine learning model could generate many more turn descriptions. This would allow the program to have more granular control over the vehicle. This type of neural network could be created by recording the inputs of a human playing the game many times, and using this data to train the network. An advantage of this approach is that the neural network would emulate its human-based example, and may therefore drive in a way that appears natural to other human players.

Genetic Programming could also be used in an unsupervised manner for a button control scheme, by allowing the model to measure for itself the effects of different button press combinations and durations. A scoring system could be implemented, which rewards the model for turning to avoid barriers, and getting further around the track. The button presses could be random initially, but as button press combinations are found that drive the internal score higher, the score-accumulating presses would be favored over random inputs.

Additionally, artificial intelligence could be added to the visual analysis of the OpenCV-processed game images. Currently, the driving algorithm does not take into account the previous state, or states, of the game image. Adding state-based memory would allow the algorithm to perform differently, depending on what happened in the past. For example, the vehicle may bump into the barriers of the track and take damage. This damage can be repaired by driving over certain sections of the track, which require a slight diversion off of the fastest racing path. If the artificial intelligence were able to visually detect how many times the vehicle made contact with the barrier, it would be able to decide whether or not to take itself off the ideal racing line, and drive over the repair section of the track. This would help ensure that the program is able to finish the race, since it would reduce the chances of crashing out by accumulating too much damage to the vehicle.

Another possible application of neural networks is to replace the current 3-row pixel sampling setup to detect turns. Instead, supervised training on a classifier neural network, such as a Multi-Layer Perceptron, could examine the masked image to determine controller inputs. One possible method would be to section apart the masked images into a grid, analyze each section independently, and record the percentage of that section's white pixels as a floating point number within an array. This approach would have an advantage over the current 3-row arrangement, because it would examine the entire game's screen output.

Artificial intelligence could also be added to the visual analysis of the OpenCV-processed game images. Currently, the driving algorithm does not take into account any previous state of the game image. Adding state-based memory would allow the algorithm to behave differently depending on what happened in the past. For example, the vehicle may bump into the barriers of the track and take damage. This damage can be repaired by driving over certain sections of the track, which requires a slight diversion off of the fastest racing path. If the artificial intelligence could visually detect how many times the vehicle made contact with the barrier, it could decide whether to leave the ideal racing line and drive over the repair section of the track. This would help ensure that the program finishes the race, since it would reduce the chance of crashing out from accumulating too much vehicle damage.

Another possible application of neural networks is to replace the current 3-row pixel sampling setup for detecting turns. Instead, a classifier neural network trained in a supervised manner, such as a Multi-Layer Perceptron, could examine the masked image to determine controller inputs. One possible method would be to divide the masked image into a grid, analyze each section independently, and record the percentage of white pixels in each section as a floating-point number within an array. This approach would have an advantage over the current 3-row arrangement because it would examine the game's entire screen output, taking into account both the closest and the furthest-away sections of the track. With those sections analyzed, the vehicle would be more likely to find a path out of a corner, since open track space may appear at one of the extremes of the screen. Analyzing further into the game distance would also help the vehicle avoid hitting a perpendicular wall at high speed.
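As a rough sketch of the grid-based analysis proposed above, the following code divides a masked (black-and-white) frame into cells and records each cell's fraction of white pixels as a floating-point feature vector. The 6x8 grid, the use of scikit-learn's MLPClassifier, and the variable names are assumptions rather than part of the project's implementation.

```python
# Sketch of grid-based feature extraction from a masked frame.
# GRID_ROWS x GRID_COLS is an assumed grid size; leftover edge pixels are ignored.
import numpy as np

GRID_ROWS, GRID_COLS = 6, 8


def grid_features(mask):
    """Return a flat array of per-cell white-pixel fractions for a binary mask."""
    h, w = mask.shape
    cell_h, cell_w = h // GRID_ROWS, w // GRID_COLS
    features = np.empty(GRID_ROWS * GRID_COLS, dtype=np.float32)
    for r in range(GRID_ROWS):
        for c in range(GRID_COLS):
            cell = mask[r * cell_h:(r + 1) * cell_h, c * cell_w:(c + 1) * cell_w]
            features[r * GRID_COLS + c] = np.count_nonzero(cell) / cell.size
    return features

# These feature vectors could then train a supervised classifier that maps a masked
# frame to a controller input, for example with scikit-learn (one possible choice):
#
#   from sklearn.neural_network import MLPClassifier
#   X = np.array([grid_features(m) for m in recorded_masks])  # masks from human play
#   y = recorded_inputs                                        # e.g. "hardLeft", "easyLeft"
#   model = MLPClassifier(hidden_layer_sizes=(64, 32)).fit(X, y)
#   action = model.predict(grid_features(new_mask).reshape(1, -1))[0]
```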

References

1. Anonymous. MarioKart DS Snaking (Why the Hate?). Reddit Web site. https://www.reddit.com/r/3DS/comments/k5y1g/mariokart_ds_snaking_why_the_hate/. Published September 5, 2011. Accessed September 2, 2018.

2. Botea A., Bouzy B., Buro M., Bauckhage, C., Nau, D. Pathfinding in Games. Artificial and Computational Intelligence in Games. 2013; 6: 21-31. DOI: 10.4230/DFU.Vol6.12191.21

3. Brockman, G. and Sutskever, I. Introducing OpenAI. OpenAI Web site. https://blog.openai.com/introducing-openai/. Published December 11, 2015. Accessed February 4, 2019.

4. Cui X. and Shi H. A*-based Pathfinding in Modern Computer Games. International Journal of Computer Science and Network Security. 2011; 11(1): 125-130. http://paper.ijcsns.org/07_book/201101/20110119.pdf. Accessed April 13, 2019.

5. DeepMind staff. Solve Intelligence. Use It to Make the World a Better Place. DeepMind Web site. https://www.deepmind.com/about. Publication date unlisted. Accessed February 27, 2019.

6. Ding, L. and Goshtasby, A. On the Canny edge detector. DOI: 10.1.1.103.7086. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.103.7086&rep=rep1&type=pdf. Accessed April 12, 2019.

7. Fisher R., Perkins S., Walker A., Wolfart E. Hough Transform. University of Edinburgh Web site. http://homepages.inf.ed.ac.uk/rbf/HIPR2/hough.htm. Published 2003. Accessed February 25, 2019.

8. Higan staff. Higan. Byuu.org website. https://byuu.org/emulation/higan/. Publication date unlisted. Accessed April 10, 2019.

9. Jaderberg, M. Human-Level in First-Person Multiplayer Games with Population-Based Deep RL. YouTube Web site. https://www.youtube.com/watch?v=dltN4MxV1RI&feature=youtu.be. Published July 6, 2018. Accessed January 30, 2019.

10. Jaderberg M., Czarnecki W., Dunning I., et al. Human-Level Performance in First-Person Multiplayer Games with Population-Based Deep Reinforcement Learning. Hosted by ArXiv Web site. https://arxiv.org/pdf/1807.01281.pdf. Published July 3, 2018. Accessed January 30, 2019.

11. Kinsley H. Reading Game Frames in Python with OpenCV – Python Plays GTA V. Python Programming Web site. https://pythonprogramming.net/game-frames-open-cv-python-plays-gta-v/. Published April 10, 2017. Accessed September 2, 2018.

12. Kinsley H. Self Driving Car – Python Plays GTA V. Python Programming Web site. https://pythonprogramming.net/self-driving-car-python-plays-gta-v/. Published April 10, 2017. Accessed September 2, 2018.

13. Lou H. AI in Video Games: Toward a More Intelligent Game. Harvard University – Science in the News Web site. http://sitn.hms.harvard.edu/flash/2017/ai-video-games-toward-intelligent-game/. Published August 28, 2017. Accessed March 10, 2019.

14. Mordvintsev, A. et al. Smoothing Images. Readthedocs.io website. https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_filtering/py_filtering.html. Published circa 2013. Accessed April 14, 2019.

15. NumPy developers. NumPy. Numpy.org website. http://www.numpy.org. Updated 2019. Accessed April 14, 2019.

16. Ookla. Fixed Broadband Speedtest Data Q2-Q3 2018 United States. Speedtest Web site. https://www.speedtest.net/reports/united-states/2018/fixed/. Published December 12, 2018. Updated February 12, 2019. Accessed March 5, 2019.

17. OpenCV. About. OpenCV Web site. https://opencv.org/about.html. Publication date unlisted. Accessed August 3, 2018.

18. OpenCV Dev Team. Miscellaneous Image Transformations. OpenCV Web site. https://docs.opencv.org/2.4/modules/imgproc/doc/miscellaneous_transformations.html. Publication date unlisted. Updated March 11, 2019. Accessed March 11, 2019.

19. OpenCV Dev Team. Canny Edge Detector. OpenCV Web site. https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/canny_detector/canny_detector.html. Publication date unlisted. Updated March 11, 2019. Accessed March 11, 2019.

20. OpenCV Dev Team. Smoothing Images. OpenCV Web site. https://docs.opencv.org/2.4/doc/tutorials/imgproc/gausian_median_blur_bilateral_filter/gausian_median_blur_bilateral_filter.html. Publication date unlisted. Updated March 11, 2019. Accessed March 11, 2019.

21. OpenCV Dev Team. Hough Line Transform. Opencv.org website. https://docs.opencv.org/3.4/d9/db0/tutorial_hough_lines.html. Publication date unlisted. Accessed April 14, 2019.

22. Oxford, N. Super NES Retro Review: F-Zero. USGamer website. https://www.usgamer.net/articles/super-nes-classic-game-by-game-2-f-zero. Published July 11, 2017. Accessed April 14, 2019.

23. PyPi staff. Project Description. PyPi Web site. https://pypi.org/project/mss/. Publication date unlisted. Updated February 23, 2019. Accessed March 1, 2019.

24. Roamingnumeral. Dev Blog: Making a More Human Bot. North American League of Legends website. https://na.leagueoflegends.com/en/news/game-updates/gameplay/dev-blog-making-more-human-bot. Published April 15, 2014. Accessed March 10, 2019.

25. SapphireLore. Bot AI's For Some Heroes are in Major Need of an Update. Reddit Web site forum post. https://www.reddit.com/r/heroesofthestorm/comments/6d1uof/bot_ais_for_some_heroes_are_in_major_need_of_an/. Published May 24, 2017. Accessed March 10, 2019.

26. Sinha, U. The Hough Transform. AIShack.com website. http://aishack.in/tutorials/hough-transform-basics/. Published circa 2011. Accessed April 13, 2019.

27. Vinyals, O., Gaffney, S., Ewalds, T. Deepmind and Blizzard open Starcraft II as an AI Research Environment. Deepmind Web site. https://deepmind.com/blog/deepmind-and-blizzard-open-starcraft-ii-ai-research-environment/. Published August 9, 2017. Accessed February 26, 2019.

28. Wang, J., Lin, Y. Game AI: Simulating Car Racing Game by Applying Pathfinding Algorithms. International Journal of Machine Learning and Computing. 2012; 2(1): 13-18. http://www.ijmlc.org/papers/82-A1090.pdf

External Figure References

1. Embedded YouTube video screengrab. Taken from https://deepmind.com/blog/capture-the-flag/. Accessed April 13, 2019.

2. PythonProgramming.net "Python Plays GTA V" tutorial series. https://pythonprogramming.net/lane-region-of-interest-python-plays-gta-v/?completed=/direct-input-game-python-plays-gta-v/. Accessed April 13, 2019.

3. PythonProgramming.net "Python Plays GTA V" tutorial series. https://pythonprogramming.net/hough-lines-python-plays-gta-v/?completed=/lane-region-of-interest-python-plays-gta-v/. Accessed April 13, 2019.

4. PythonProgramming.net "Python Plays GTA V" tutorial series. https://pythonprogramming.net/finding-lanes-self-driving-car-python-plays-gta-v/?completed=/hough-lines-python-plays-gta-v/. Accessed April 13, 2019.
