Intelligent Bomberman with Reinforcement Learning Ngo Hung Minh Triet University of Twente P.O

Total Page:16

File Type:pdf, Size:1020Kb

Intelligent Bomberman with Reinforcement Learning Ngo Hung Minh Triet University of Twente P.O Intelligent Bomberman with Reinforcement Learning Ngo Hung Minh Triet University of Twente P.O. Box 217, 7500AE Enschede The Netherlands [email protected] ABSTRACT corresponding games. Bomberman is a strategical, maze-based game where the In this study, we show the correlation between reinforce- players defeat their enemies by placing a multi-direction ment learning algorithms with different state representa- count-down bomb that would explode and destroy obsta- tions and the ability to learn and play the game Bomber- cles and other players. In this paper, a simplified version man. We used five reinforcement learning algorithms: Q- of Bomberman is implemented in Java, where different Learning, Sarsa, Double Q-Learning, and Deep Q Neural controlled agents are placed. Each agent represents one of Network and two state representations: 5-tiles information five reinforcement learning methods: Q-Learning, Sarsa, and complete information. Each configuration is tested Double Q-Learning, and Deep Q Neural Network with with three tests: the learnability test, the advance test, two state representations: 5-tiles information and com- and the comparative test. In each of the tests, the agents plete information. Then, we investigate whether a specific are evaluated per generation. Each generation consists reinforcement learning method can successfully learn to of 10000 training episodes and 100 evaluation episodes. play Bomberman efficiently by evaluating them with ad- Then, win rates and average rewards are extracted to make hoc agents and finally against each other. The configu- comparisons. The results show that although simple Q- ration of 5-tiles information with Sarsa archives the best Learning works, but performance deteriorates in scenar- overall quantitative results. ios with larger state space. While a preliminary version of the Deep Q network shows superior results and looks Keywords promising compare to Q-Learning with the use of the neu- ral network to tackle the ample state space. In the end, Reinforcement Learning, Bomberman, Computer Game, 5-tiles information with Sarsa configuration archives the Q-Learning, Neural network, Deep Reinforcement Learn- best overall quantitative results. ing In Section 2, we introduce the game and describe the im- plementation of simple Bomberman in Java with differ- 1. INTRODUCTION ent state representations. Then, in section 3, we give an In the past decades, Reinforcement Learning is gaining overview of five reinforcement algorithms we used for the more attention. Being inspired by animal learning the- agents. After that, we present the experimental setups and ories, reinforcement learning (RL) is developed with the results in section 4. In section 5, we discuss the research idea that an agent can deduce from its experience to de- process and draw reflections from thorough our research. cide which actions to take to maximize predefined values. Finally, we end the paper by summarising our paper and This agent learns by interacting with the environment and discuss some future improvements. receiving feedback indicating the values or rewards of the action. The objective is that after the learning process 2. BOMBERMAN is that the agent will perceive a specific state of an envi- In 1983, Hudson Soft published the original variant of ronment and reason the actions that would give the most Bomberman, and new generations of Bomberman are still rewards. [11, 16]. being published to this day. Nowadays, there are almost Nowadays, there are many studies on creating reinforce- 100 different variants of Bomberman games released on ment learning agents for various genre of games rang- various platforms [7]. In the original variant of Bomber- ing from grid games including traditional games such as man, which is named Bankudan Otoko in Japan but is shougi, Go, and chess [13, 15, 9, 12, 14] to modern com- called Bomber Man in most releases, the player is trapped puter games such as Mario, Atari, Bomberman, Ms. Pack- in a maze with monsters. In each stage, the objective is to man [8, 4, 2] in order to improve the efficiency and win- find the exit and eliminate all monsters by moving around rate of the reinforcement learning agents against other re- the maze, placing bombs, obtains power-ups that empower inforcement learning agents or human champions of the player's bombs. When the goal is met, the player can proceed to the next stage. While single-player campaigns Permission to make digital or hard copies of all or part of this work for mode occupied a large portion of Bomberman, we focus personal or classroom use is granted without fee provided that copies on the multiplayer variant in this research, where players are not made or distributed for profit or commercial advantage and that are trapped in the same maze, and their only objective copies bear this notice and the full citation on the first page. To copy oth- is to eliminate each other by placing bombs strategically erwise, or republish, to post on servers or to redistribute to lists, requires until the very last man is the winner. Furthermore, since prior specific permission and/or a fee. the game can prolong forever, over the iterations, there 35th Twente Student Conference on IT July. 2nd, 2021, Enschede, The Netherlands. will be more challenges that help end the game, such as Copyright 2021, University of Twente, Faculty of Electrical Engineer- more power-ups or bombs appear randomly to speed up ing, Mathematics and Computer Science. the termination of the game. 1 2.1 Environment for all the agents in the game. This probability will in- We implemented an environment that represents a simpli- crease linearly with turn count to 1.0, which prevents the fied version of Bomberman. The core of the environment is game from running forever. Therefore, to win the game, Java and JavaFX as the framework that helps visualize the the agents must learn how to dodge bombs and finish the game. The grid size can be customized, but in our experi- opponent as soon as possible. ments, a board with the size of 7x7 (including the unbreak- able border)is used, with its environment configured as in 2.2 State representation Figure 1. The players in each test type are different but According to Figure 1, as there are 36 changeable tiles ranging from two to four. There are two types of agents: (free/breakable/unbreakable), 36 tiles to place a bomb, 36 ad-hoc agents and learning agents. Ad-hoc agents con- tiles for the agents (ranging from two to four) to move into, the number of state can be approximated as 336 · sist of stationary agents and random agents. This type of 36 36 49 agent is used to assess the ability to learn and compatibil- 2 · 4 ≈ 4:9 · 10 different game states in total. In ity (the quality of the agents' policies) of a learning agent this part, we introduce two ways to represent the game. with Bomberman. Each learning agent is implemented One of them will show the complete information of the with a combination of a reinforcement learning algorithm game, which results in the enormous state space mentioned and a state representation. Each turn, the agents receive above. The other is an attempt to minimize the state the state representation from the environment (the game) spaces by limiting the agent knowledge (how much the that the agents use to determine their next move. After agent can perceive from the game/environment) and only receiving all the moves from the active agents, the environ- showing the agent's surroundings to them. ment executes the moves and sends back the results and 2.2.1 Complete information rewards of the agents to them, and then the cycle repeats In this configuration, the game is converted to a flatten- until there is less than one agent on the field. able array that contains all the information of the current game state. The array is in 2D and has the same size as the board. Each of the cells in the array contains the informa- tion of the respective tile in the game. That information is computed by the following values: • Tile type: Free, Breakable wall, Obstacles (Unbreak- able wall or Bomb). • Whether the tile contains the agent. • Whether the tile contains any of the other agents. • The foreseeable explosion countdown of the tile. In other words, the agent knows whether there will be an explosion in this tile with the precise countdown. As mentioned in the game rules section, the bombs in this game can be chained to create a bigger explosion. There- fore, The environment uses BFS (Breadth-First Search) to calculate the foreseeable explosion countdown in order to group explosion tiles together, and the group count- down value is equal to the lowest countdown bomb in the Figure 1. The game configuration uses for all tests in this group. This way, the agent knows whether the tile is safe study. Blue tiles are breakables walls, deep brown tiles are according to the nearest bomb countdown but the precise unbreakable walls, and learning agents (in this case, 4) are countdown of the exploding bomb even it is chained by represent by human-like figures and are distinct by colors. another bomb. In this configuration, the number of states This configuration separates the agents with walls, and each is equal to the calculated game states, which is ≈ 4:9·1049. agent is contained in a smaller grid which turns out to be the balance between complexity and fairness. To win the game, 2.2.2 5-tiles information the agents need to learn how to break walls and approach Since the main goal of this study is to evaluate how well then eliminate the opponents. the simple reinforcement learning algorithm performs in Bomberman, we think that it is not feasible to store or The rules of the game are as follows, each turn, the agent explore the state space with the size of a 50 digits number.
Recommended publications
  • Bomberman Story Ds Walkthrough Guide
    Bomberman story ds walkthrough guide Continue Comments Share Hudson Soft (JP), Ubisoft (International) One player, multiplayer Japan: May 19, 2005 North America: June 21, 2005 Europe: July 1, 2005 Bomberman is a game in the Bomberman series released in 2005 for the Nintendo DS. One player This game is very similar to the original Bomberman. The player must navigate the mazes and defeat the enemies before finding a way out. In this game, there is more than one item in each level and they are stored in inventory on the second screen. By clicking on the subject with a stylus, the player can activate the item at any time. They may choose to use the items immediately or be more conservative with them. There are 10 worlds each with 10 stages. The last stage is always the boss and there is a bonus stage after the fifth stage. In bonus level, the player is invincible and must defeat all enemies on one of the screens. They can then move to another screen to collect as many items as they can with the remaining time. Worlds/Bosses This section is a stub. You can help by expanding it. Multiplayer this game can wirelessly communicate with other Nintedo DS systems to play multiplayer. You can play a multi-card game or play the same card. There is no difference in gameplay depending on one card or multiple cards to play. With a wireless connection, you can play with up to 8 players at a time. When you die in a multiplayer game, you send to the bottom screen to shoot missiles at other players still alive.
    [Show full text]
  • Video Game Archive: Nintendo 64
    Video Game Archive: Nintendo 64 An Interactive Qualifying Project submitted to the Faculty of WORCESTER POLYTECHNIC INSTITUTE in partial fulfilment of the requirements for the degree of Bachelor of Science by James R. McAleese Janelle Knight Edward Matava Matthew Hurlbut-Coke Date: 22nd March 2021 Report Submitted to: Professor Dean O’Donnell Worcester Polytechnic Institute This report represents work of one or more WPI undergraduate students submitted to the faculty as evidence of a degree requirement. WPI routinely publishes these reports on its web site without editorial or peer review. Abstract This project was an attempt to expand and document the Gordon Library’s Video Game Archive more specifically, the Nintendo 64 (N64) collection. We made the N64 and related accessories and games more accessible to the WPI community and created an exhibition on The History of 3D Games and Twitch Plays Paper Mario, featuring the N64. 2 Table of Contents Abstract…………………………………………………………………………………………………… 2 ​ Table of Contents…………………………………………………………………………………………. 3 ​ Table of Figures……………………………………………………………………………………………5 ​ Acknowledgements……………………………………………………………………………………….. 7 ​ Executive Summary………………………………………………………………………………………. 8 ​ 1-Introduction…………………………………………………………………………………………….. 9 ​ 2-Background………………………………………………………………………………………… . 11 ​ ​ ​ 2.1 - A Brief of History of Nintendo Co., Ltd. Prior to the Release of the N64 in 1996:……………. 11 ​ 2.2 - The Console and its Competitors:………………………………………………………………. 16 ​ ​ Development of the Console……………………………………………………………………...16
    [Show full text]
  • Diplomová Práce
    České vysoké učení technické Fakulta elektrotechnická DIPLOMOVÁ PRÁCE Hraní obecných her s neúplnou informací General Game Playing in Imperfect Information Games Prohlášení Prohlašuji, že jsem svou diplomovou práci vypracoval samostatně a použil jsem pouze podklady (literaturu, projekty, SW atd.) uvedené v přiloženém seznamu. V Praze, dne 7.5.2011 ……………………………………. podpis Acknowledgements Here I would like to thank my advisor Mgr. Viliam Lisý, MSc. for his time and valuable advice which was a great help in completing this work. I would also like to thank my family for their unflinching support during my studies. Tomáš Motal Abstrakt Název: Hraní obecných her s neúplnou informací Autor: Tomáš Motal email: [email protected] Oddělení: Katedra kybernetiky Fakulta elektrotechnická, České vysoké učení technické v Praze Technická 2 166 27 Praha 6 Česká Republika Vedoucí práce: Mgr. Viliam Lisý, MSc. email: [email protected] Oponent: RNDr. Jan Hric email: [email protected] Abstrakt Cílem hraní obecných her je vytvořit inteligentní agenty, kteří budou schopni hrát konceptuálně odlišné hry pouze na základě zadaných pravidel. V této práci jsme se zaměřili na hraní obecných her s neúplnou informací. Neúplná informace s sebou přináší nové výzvy, které je nutno řešit. Například hráč musí být schopen se vypořádat s prvkem náhody ve hře či s tím, že neví přesně, ve kterém stavu světa se právě nachází. Hlavním přínosem této práce je TIIGR, jeden z prvních obecných hráčů her s neúplnou informací který plně podporuje jazyk pro psaní her s neúplnou informací GDL-II. Pro usuzování o hře tento hráč využívá metodu založenou na simulacích. Přesněji, využívá metodu Monte Carlo se statistickým vzorkováním.
    [Show full text]
  • Descargar Bomberman Blast Wii Iso
    Descargar bomberman blast wii iso Continue Progress continues We have already had 10,965 updates with Dolphin 5.0. Follow Dolphin's ongoing progress through the Dolphin blog: August/September 2019 Progress Report. The Bomberman Blast Wii IsoBomberman Blast Wii ControlsBomberman Blast is an action game developed and published by Hudson Soft for the Wii and WiiWare. It is part of the Bomberman series. This is the second in the bomberman downloadable games trilogy, after Bomberman Live for Xbox Live Arcade, followed by Bomberman Ultra for PlayStation Network. The game was released in two versions: full featured retail release and downloadable, lower-priced WiiWare. Bomberman Blast is originally a retail game. Featuring a whole new story mode, just like the old games as well as the 8-player online combat mode, it seemed like Hudson was trying. Vicky's dolphin emulator needs your help! Dolphin can play thousands of games and changes happen all the time. Help us keep up! Join us and help us make this the best resource for the dolphin. Bomberman BlastDeveloper (s)Hudson SoftPublisher (s)Hudson EntertainmentSeriesBombermanPlatform (s)WiiRelease Date (s) JP September 25, 2008Genre (s)ActionMode (s)Single game, Multiplayer (8)Entry TechniquesWi Remote, GameCube ControllerCompatibility5PerfectIGameSRB668 The WiiWare VersionDolphin Forum threadOpen IssuesSearch GoogleSearch WikipediaBomberman Blast (Bomberman in Japan) is a brand new addition to the Bomberman series available on WiiWare. Up to eight players can fight online at the same time via Nintendo Wi-Fi Connection. Simple controls make this a great game for family and friends to enjoy at any time. You can summon new items by shaking the Wii Remote Controller, creating new levels of Bomberman excitement.
    [Show full text]
  • 10-Yard Fight 1942 1943
    10-Yard Fight 1942 1943 - The Battle of Midway 2048 (tsone) 3-D WorldRunner 720 Degrees 8 Eyes Abadox - The Deadly Inner War Action 52 (Rev A) (Unl) Addams Family, The - Pugsley's Scavenger Hunt Addams Family, The Advanced Dungeons & Dragons - DragonStrike Advanced Dungeons & Dragons - Heroes of the Lance Advanced Dungeons & Dragons - Hillsfar Advanced Dungeons & Dragons - Pool of Radiance Adventure Island Adventure Island II Adventure Island III Adventures in the Magic Kingdom Adventures of Bayou Billy, The Adventures of Dino Riki Adventures of Gilligan's Island, The Adventures of Lolo Adventures of Lolo 2 Adventures of Lolo 3 Adventures of Rad Gravity, The Adventures of Rocky and Bullwinkle and Friends, The Adventures of Tom Sawyer After Burner (Unl) Air Fortress Airwolf Al Unser Jr. Turbo Racing Aladdin (Europe) Alfred Chicken Alien 3 Alien Syndrome (Unl) All-Pro Basketball Alpha Mission Amagon American Gladiators Anticipation Arch Rivals - A Basketbrawl! Archon Arkanoid Arkista's Ring Asterix (Europe) (En,Fr,De,Es,It) Astyanax Athena Athletic World Attack of the Killer Tomatoes Baby Boomer (Unl) Back to the Future Back to the Future Part II & III Bad Dudes Bad News Baseball Bad Street Brawler Balloon Fight Bandai Golf - Challenge Pebble Beach Bandit Kings of Ancient China Barbie (Rev A) Bard's Tale, The Barker Bill's Trick Shooting Baseball Baseball Simulator 1.000 Baseball Stars Baseball Stars II Bases Loaded (Rev B) Bases Loaded 3 Bases Loaded 4 Bases Loaded II - Second Season Batman - Return of the Joker Batman - The Video Game
    [Show full text]
  • Bomberman-Quickstart 4.Pdf
    Bomberman English QS 6/8/00 10:00 AM Page 2 TM Bomberman English QS 6/8/00 10:00 AM Page 4 NATION NATION Meet the Fuse that Just Won’t Lose. 95 double-click “My Computer,” then Controls double-click the CD-ROM icon, then <Space Bar> • Drop Bomb or those of you not familiar with double-click the setup icon. or • Drop Spooge the Bomberman experience, I will Online Manual / Help Joystick • Grab spare you the details. The Contains the manual you are holding in Button 1 • Throw Bomb Fdynamics of the game are as easy as your hands. <Enter> • Action Button 1-2-3. About Bomberman or • Stop kicked Bomb 1. Drop a bomb. This contains information not ready at Joystick • Punch Bomb 2. Run like hell. the time of this writing. You should Button 2 • Activate Trigger 3. Watch your back (and your front, read this in its entirety as it will contain your left, etc. Just watch out.) information on the various tools <Cursor Keys> Move Bomberman Getting Started System Requirements available, as well as any changes in MINIMUM: IBM or 100% compatible Pentium® Start Game functionality omitted within Battle Mode Setup 90, 16 MB of RAM, 40 MB free this manual. This is a 1-10 player no-holds barred hard disk space, CD-ROM drive, LocalBus or PCI When you start the game it action game! Play either against SVGA video card, Sound Blaster® Exit Game automatically detects the number of human or computer opponents in this or 100% compatible sound card, Windows® 95 This will end your current session of gamepads on your machine.
    [Show full text]
  • Bomberman As an Artificial Intelligence Platform
    MSc 2.º CICLO FCUP 2016 Bomberman Bomberman as an as an Artificial Intelligence Intelligence an Artificial as Artificial Intelligence Platform Manuel António da Cruz Lopes Dissertação de Mestrado apresentada à Faculdade de Ciências da Universidade do Porto em Área Científica 2016 Platform Manuel António António Manuelda Cruz Lopes !" #$ %& $''''''(''''''(''''''''' Manuel Ant´onioda Cruz Lopes Bomberman as an Artificial Intelligence Platform Departamento de Ci^enciade Computadores Faculdade de Ci^enciasda Universidade do Porto 2016 Manuel Ant´onioda Cruz Lopes Bomberman as an Artificial Intelligence Platform Tese submetida Faculdade de Ci^enciasda Universidade do Porto para obteno do grau de Mestre em Ci^enciade Computadores Advisors: Prof. Pedro Ribeiro Departamento de Ci^enciade Computadores Faculdade de Ci^enciasda Universidade do Porto Junho de 2016 2 Para os meus pais e para o meu irm~ao 3 4 Acknowledgments I would like to thank my supervisor, Prof. Pedro Ribeiro, for the opportunity to work on this research topic, giving me enough freedom to experiment and learn on my own, and yet he has always been there to guide me. Also to my friends and family that gave me the moral support needed to endure this journey. Thank you. 5 6 Abstract Computer games have a long tradition in the Artificial Intelligence (AI) field. They serve the purpose of establishing challenging goals with long term positive repercus- sions, with researchers trying to develop AIs capable of beating the world human experts. They also provide a good amount of exposition in mainstream media, whilst also giving an engaging environment for teaching computer science concepts in general and AI methods in particular.
    [Show full text]
  • CSSE220 Bomberman Programming Assignment – Team Project
    CSSE220 BomberMan programming assignment – Team Project You will write a game that is patterned off the 1980’s BomberMan game. You can find a description of the game, and much more information here: http://strategywiki.org/wiki/Bomberman Also, the original NES instructions manual: http://www.gamesdatabase.org/Media/SYSTEM/Nintendo_NES/manual/Formated/Bomberman_- _1987_-_Hudson_Soft.pdf You can also find an online playable version of the game here: http://game-oldies.com/play-online/bomberman-nintendo-nes# Table of Contents Essential features of your program ........................................................................................ 3 Nice features to add .......................................................................................................................3 Additional features that you might add include ................................................................................... 4 A major goal of this project.................................................................................................... 4 Parallel work ..................................................................................................................................4 Development cycles .......................................................................................................................4 Milestones (all due at the beginning of class, except as noted) ............................................... 5 Cycle 0: UML Class Diagram ............................................................................................................5
    [Show full text]
  • Fall Guys: Ultimate Knockout Und Super Bomberman R Online Treffen Sich Im Crossover
    ** PRESSEMELDUNG ** 27. Mai 2021 FALL GUYS: ULTIMATE KNOCKOUT UND SUPER BOMBERMAN R ONLINE TREFFEN SICH IM CROSSOVER SUPER BOMBERMAN R ONLINE erscheint heute für PlayStation, Xbox, Switch und Steam als Free-to-Play-Titel und lädt den “Bean Bomber” aus Fall Guys ein Konami Digital Entertainment B.V. und Mediatonic verkünden heute ein besonderes Crossover zwischen dem neu veröffentlichten SUPER BOMBERMAN R ONLINE und dem „bohnigen Stolper Royale“ Fall Guys: Ultimate Knockout. Als Teil der Zusammenarbeit können alle Spieler von SUPER BOMBERMAN R ONLINE einen exklusiven „Bean Bomber“-Spielcharakter* erhalten. Somit ist dies das erste Mal, dass es ein Fall Guy in ein anderes Spiel geschafft hat. *Bis auf Weiteres im Spiel auf allen Plattformen verfügbar. Kein zusätzlicher Kauf notwendig. Der “Bean Bomber” kann über das Schlachtfeld hechten, um nicht von unzähligen Bomben getroffen zu werden, während Spieler im neuen „Kampf 64“-Onlinemodus gegeneinander antreten. Zusätzlich wird auch die Bomberman-Reihe in Fall Guys: Ultimate Knockout Einzug halten, nämlich mit dem neuen „Bomberman“-Kostüm! Der ikonische Charakter schlägt am 4. Juni 2021 im Blunderdome ein und wird im Ingame-Shop von Fall Guys für insgesamt zehn Kronen erhältlich sein. “Bean Bomber”-Charakter in SBRO “Bomberman”-Kostüm in Fall Guys Gameplay des “Bean Bombers” in SBRO “Bean Bomber”-Profil in SBRO SUPER BOMBERMAN R ONLINE ist ab heute für PlayStation®4, Xbox One, Nintendo Switch™ und PC via Steam als downloadexklusiver Free-to-Play-Titel verfügbar. Das Spiel ist ebenfalls auf PlayStation®5 und Xbox Series X|S spielbar. PlayStation Store – LINK Microsoft Store (Xbox) – LINK Nintendo eShop – LINK Steam – LINK Google Stadia – LINK KONAMIs Richtlinien für erstellte Nutzerinhalte (bspw.
    [Show full text]
  • SUPER BOMBERMAN R a Fast Paced 1 to 8 Player Party Action Game That Fully Utilizes the New and Unique Nintendo Switch™!
    RELEASE DATE March 3rd 2017 GENRE Action PRICE $49.99 HAVE A BLAST!! SUPER BOMBERMAN R A fast paced 1 to 8 player party action game that fully utilizes the new and unique Nintendo Switch™! Bomberman, the iconic party-battle game, is returning as “SUPER BOMBERMAN R” on the Nintendo Switch and will be released alongside the launch of the new system for home and on-the-go fun! Dive into the simple yet addictive gameplay In SUPER BOMBERMAN R by controlling the main character, placing bombs, and battling enemies and rivals with a modern visual style including 3D stages and photo-realistic graphics. Take out CPU-controlled adversaries in ‘Story’ mode to clear a series of 50 unique stages in order to save the galaxy. SUPER BOMBERMAN R also makes full use of the Nintendo Switch’s first of its kind system capabilities which allows gamers to play where ever, whenever and with whomever they choose. Players can go head-to-head in a heated “Battle” mode where up to eight players are dropped within a maze until the “last man standing” is declared the winner. All iconic Bomberman elements return to the game including a dynamite list of items found within the destructible areas of each maze. Removing a wall will often reveal a power-up to extend the rage of the Nintendo Switch™ explosion or endow Bomberman with a useful skill including faster movement, the ability to kick or throw bombs and more! As players gain ever more devastating capabilities, they must use all their skill and ingenuity to ensure their opponents get caught in a blast while avoiding it’s fiery touch! Features: • +50 levels of single player or Co-Op maps! • 3D stages with dynamic environments • Bomberman’s siblings and well-known enemies are back with rich personality • Battle mode for maximum of 8 players, local connection battles, and online battles Marketing Calendar JAN FEB MAR APR Advertising PR Community Blood Nintendo Crossover Language Violence Retail / POP ©2017 Konami Digital Entertainment ©2017 Nintendo.
    [Show full text]
  • Bomberman Tournament
    BomberMan Tournament https://blogs.forbes.com/olliebarder/files/2018/06/bomberman_mizuno.jpg Mustafa Fuad Rifet Ibrahim Overview ● Game + Tournament ● 3 Agents Game + Tournament ● Environment (crates, walls, coins,...) ● Agents (move, drop bombs) ● score Game + Tournament ● Multiple eps ● 400 steps ● 0.5s thinking time ● Slow thinking penalty ● No multiprocessing ● One core of i7-8700K, up to 8GB RAM ● Simple agents http://randomhoohaas.flyingomelette.com/bomb/gba-st/img/art-multiplay.JPG Lord_Voldermort https://cdn.wallpapersafari.com/6/42/hSU3zx.jpg Lord_Voldermort Lord_Voldermort Lord_Voldermort Self: ● Empty ● Danger ● Bomb Lord_Voldermort Self: ● Empty ● Danger ● Bomb Self_Bomb: ● True ● False Lord_Voldermort Left,Up,Right,Down: Self: ● Wall ● Empty ● Enemy ● Danger ● Crate ● Bomb ● Coins ● Bombs Self_Bomb: ● ● Danger True ● ● Empty False ● Priority Lord_Voldermort Lord_Voldermort https://hci.iwr.uni-heidelberg.de/vislearn/HTML/teaching/courses/FML/bomberman_rl/agent.php?id=Lord_Voldemort NOBEL https://upload.wikimedia.org/wikipedia/commons/0/07/Alfred_Nobel3.jpg NOBEL https://neuro.cs.ut.ee/wp-content/uploads/2015/12/dqn.png NOBEL Fixed Q-Targets Fix Update frequently Update after n steps https://neuro.cs.ut.ee/wp-content/uploads/2015/12/dqn.png NOBEL Dueling DQN Wang et al, Dueling Network Architectures for Deep Reinforcement Learning, 2016 NOBEL Wang et al, Dueling Network Architectures for Deep Reinforcement Learning, 2016 NOBEL NOBEL (Prioritized) Experience Replay: ● Data efficiency ● Learning the whole game ● Valuable experiences
    [Show full text]
  • BOMBERMAN CSEE 4840 Embedded System Design
    BOMBERMAN CSEE 4840 Embedded System Design CH2788 Cheng-Han Hsieh JW 2862 Jian Wang MW2724 Ming Wang SC3198 Shang-Ming Cherng 1. Project Instruction In this project, we intend to implement a popular game called BOMBERMAN. It will be a game of multiple players. Each player plays the game in separate machine and communicates with others through network. The general goal for game player is to kill each other by strategically placing bombs which will explode after predefined period. Bombs in the game have predefined effective hurt range, within which any player will be killed. After several games, we will decide the winner who kills the most people. 2. Design Block The following is the design block of our project. It is composed of CPU, SDRAM CONTROL, SDRAM CONTROL, ENTERNET CONTROL, AUDIO CONTROL, VGA CONTROL and PS2 CONROL. We will describe how they will be designed in this report. ENTERNET NIOS SYSTEM ENTERNET CONTROL CPU AUDIO AUDIO CHIP CONTROL SDRAM VGA VGA SDRAM CONTROL CONTROL KEYBOARD PS2_CONTROL 3. Software/ Hardware Implementation In the part we will describe how will we implement our project in software and hardware. Software and hardware have identical control table that is used to control character and rendesr the scene on screen. a. Software Software is mainly responsible for characters controlling, event processing and network communication. Players will use arrow keys on keyboard to make Bomber men move. For each movement software will check whether the Bomber men can do that or not through checking control table. After checking we can decide how to change the position of characters.
    [Show full text]