Accelerating Gradient Boosting Machine

Total Page:16

File Type:pdf, Size:1020Kb

Accelerating Gradient Boosting Machine Accelerating Gradient Boosting Machine Haihao Lu * 1 2 Sai Praneeth Karimireddy * 3 Natalia Ponomareva 2 Vahab Mirrokni 2 Abstract available implementations: scikit-learn (Pedregosa et al., Gradient Boosting Machine (GBM) (Friedman, 2011), R gbm (Ridgeway et al., 2013), LightGBM (Ke et al., 2001) is an extremely powerful supervised learn- 2017), XGBoost (Chen & Guestrin, 2016), TF Boosted ing algorithm that is widely used in practice. Trees (Ponomareva et al., 2017), etc. GBM routinely features as a leading algorithm In spite of the practical success of GBM, there is a consid- in machine learning competitions such as Kaggle erable gap in its theoretical understanding. The traditional and the KDDCup. In this work, we propose Ac- interpretation of GBM is to view it as a form of steepest celerated Gradient Boosting Machine (AGBM) by descent in functional space (Mason et al., 2000; Friedman, incorporating Nesterov’s acceleration techniques 2001). While this interpretation serves as a good starting into the design of GBM. The difficulty in accel- point, such framework lacks rigorous non-asymptotic con- erating GBM lies in the fact that weak (inexact) vergence guarantees, especially when compared to the grow- learners are commonly used, and therefore the ing body of literature on first order convex optimization. errors can accumulate in the momentum term. To overcome it, we design a “corrected pseudo resid- In convex optimization literature, Nesterov’s acceleration is ual” and fit best weak learner to this corrected a successful technique to speed up the convergence of first- pseudo residual, in order to perform the z-update. order methods. In this work, we show how to incorporate Thus, we are able to derive novel computational Nesterov momentum into the gradient boosting framework guarantees for AGBM. This is the first GBM type to obtain an accelerated gradient boosting machine. of algorithm with theoretically-justified acceler- ated convergence rate. Finally we demonstrate 1.1. Our contributions with a number of numerical experiments the ef- We propose the first accelerated gradient boosting algorithm fectiveness of AGBM over conventional GBM that comes with strong theoretical guarantees and can be in obtaining a model with good training and/or used with any type of weak learner. In particular: testing data fidelity. 1. We propose a novel accelerated gradient boosting al- gorithm (AGBM) (Section 3) and prove (Section 4) 1. Introduction that it reduces the empirical loss at a rate of O(1=m2) after m iterations, improving upon the O(1=m) rate Gradient Boosting Machine (GBM) (Friedman, 2001) is obtained by traditional gradient boosting methods a powerful supervised learning algorithm that combines 2. We propose a computationally inexpensive practical multiple weak-learners into an ensemble with excellent pre- variant of AGBM, taking advantage of strong convexity diction performance. GBM works very well for a number of of loss function, which achieves linear convergence tasks like spam filtering, online advertising, fraud detection, (Section 5). We also list the conditions (on the loss anomaly detection, computational physics (e.g., the Higgs function) under which AGBMs would be beneficial. Boson discovery), etc; and has routinely featured as a top al- 3. With a number of numerical experiments with weak gorithm in Kaggle competitions and the KDDCup (Chen & tree learners (one of the most popular type of GBMs) Guestrin, 2016). GBM can naturally handle heterogeneous we confirm the effectiveness of AGBM. datasets (highly correlated data, missing data, categorical 4. Our accelerated boosting framework will be open- data, etc). It is also quite easy to use with several publicly sourced at this url1. * 1 2 Equal contribution MIT, Cambridge, MA, USA Google, Apart from theoretical contributions, we paved the way to New York, NY, USA 3EPFL, Lausanne, Switzerland. Correspon- dence to: Haihao Lu <[email protected]>. speed up some practical applications of GBMs, which cur- rently require a large number of boosting iterations. For th Proceedings of the 35 International Conference on Machine 1 Learning, Stockholm, Sweden, PMLR 80, 2018. Copyright 2018 Link removed for anonymity; will be added upon acceptance. by the author(s). Accelerating Gradient Boosting Machine example, GBMs with boosted trees for multi-class problems cal explanation of acceleration techniques by studying the are commonly implemented as a number of one-vs-rest continuous-time interpretation of accelerated gradient de- learners, resulting in more complicated boundaries (Fried- scent via dynamical systems (Su et al., 2016; Wilson et al., man et al., 1998) and a potentially a larger number of boost- 2016; Hu & Lessard, 2017). ing iterations required. Additionally, it is a common practice Accelerated Greedy Coordinate and Matching Pursuit to build many very weak learners for problems where it is Methods: Recently, (Locatello et al., 2018) and (Lu et al., easy to overfit. Such large ensembles result not only in 2018) discussed how to accelerate matching pursuit and slow training time, but slower inference. AGBMs can be greedy coordinate descent algorithms respectively. Their potentially beneficial for all these applications. methods however require a random step and are hence only ‘semi-greedy’, which does not fit in the boosting framework. 1.2. Related Literature Accelerated GBM: Recently, (Biau et al., 2018) and Convergence Guarantees for GBM: After being first in- (Fouillen et al., 2018) proposed an accelerated version of troduced by Friedman et al. (Friedman, 2001), several GBM by directly incorporating Nesterov’s momentum in works established its guaranteed convergence, without ex- GBM, however, no theoretical justification was provided. plicitly stating the convergence rate (Collins et al., 2002; Furthermore, as we argue in Section 5.2, their proposed Mason et al., 2000). Subsequently, when the loss function algorithm may not converge to the optimum. was both smooth and strongly convex, (Bickel et al., 2006) proved an exponential convergence rate—more precisely that O(exp(1="2)) iterations are sufficient to ensure that 2. Gradient Boosting Machine the training loss is within " of its optimal value. (Telgar- We consider a supervised learning problem with n training sky, 2012) then studied the primal-dual structure of GBM p examples (xi; yi); i = 1; : : : ; n such that xi 2 R is the and demonstrated that in fact only O(log(1=")) iterations feature vector of the i-th example and yi is a label (in a clas- are needed. However the constants in their rate were non- sification problem) or a continuous response (in a regression standard and less intuitive. This result was recently im- problem). In the classical version of GBM (Friedman, 2001), proved upon by (Freund et al., 2017) and (Lu & Mazumder, we assume we are given a base class of learners B and that 2018), who showed a similar convergence rate but with more our target function class is the linear combination of such transparent constants such as the smoothness and strong con- base learners (denoted by lin(B)). Let B = fbτ (x) 2 Rg be vexity constant of the loss function, and the density of weak a family of learners parameterized by τ 2 T . The prediction learners, etc. Additionally, if the loss function is assumed to corresponding to a feature vector x is given by an additive be smooth and convex (but not necessarily strongly convex), model of the form: (Lu & Mazumder, 2018) also showed that O(1=") iterations M ! are sufficient. We refer the reader to (Telgarsky, 2012), (Fre- X f(x) := β b (x) 2 lin(B) ; (1) und et al., 2017) and (Lu & Mazumder, 2018) for a more m τm detailed literature review on the theoretical results of GBM m=1 convergence. where bτm (x) 2 B is a weak-learner and βm is its corre- Accelerated Gradient Methods: For optimizing a smooth sponding additive coefficient. Here, βm and τm are chosen convex function, (Nesterov, 1983) showed that the standard in an adaptive fashion in order to improve the data-fidelity gradient descent algorithm can be made much faster, re- as discussed below. Examples of learners commonly used sulting in the accelerated gradient descent method. While in practice include wavelet functions, support vector ma- chines, and classification and regression trees (Friedman gradient descent requires O(1=") iterations,p accelerated gra- dient methods only requires O(1= "). In general, this rate et al., 2001). We assume the set of weak learners B is of convergence is optimal and cannot be improved upon scalable, namely that the following assumption holds. (Nesterov, 2004). Since its introduction in 1983, the main- Assumption 2.1. If b(·) 2 B, then λb(·) 2 B for any λ > 0. stream research community’s interest in Nesterov’s accel- erated method started around 15 years ago; yet even today Assumption 2.1 holds for most of the set of weak learners we most researchers struggle to find basic intuition as to what are interested in. Indeed scaling a weak learner is equivalent is really going on in accelerated methods. Such lack of intu- to modifying the coefficient of the weak learner, so it does ition about the estimation sequence proof technique used by not change the structure of B. (Nesterov, 2004) has motivated many recent works trying The goal of GBM is to obtain a good estimate of the function to explain this acceleration phenomenon (Su et al., 2016; f that approximately minimizes the empirical loss: Wilson et al., 2016; Hu & Lessard, 2017; Lin et al., 2015; n Frostig et al., 2015; Allen-Zhu & Orecchia, 2014; Bubeck ? n X o L = min L(f) := `(yi; f(xi)) (2) et al., 2015). Some have recently attempted to give a physi- f2lin(B) i=1 Accelerating Gradient Boosting Machine where `(yi; f(xi)) is a measure of the data-fidelity for the 3. Accelerated Gradient Boosting Machine i-th sample for the loss function `.
Recommended publications
  • Arxiv:2006.04059V1 [Cs.LG] 7 Jun 2020
    Soft Gradient Boosting Machine Ji Feng1;2, Yi-Xuan Xu1;3, Yuan Jiang3, Zhi-Hua Zhou3 [email protected], fxuyx, jiangy, [email protected] 1Sinovation Ventures AI Institute 2Baiont Technology 3National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China Abstract Gradient Boosting Machine has proven to be one successful function approximator and has been widely used in a variety of areas. However, since the training procedure of each base learner has to take the sequential order, it is infeasible to parallelize the training process among base learners for speed-up. In addition, under online or incremental learning settings, GBMs achieved sub-optimal performance due to the fact that the previously trained base learners can not adapt with the environment once trained. In this work, we propose the soft Gradient Boosting Machine (sGBM) by wiring multiple differentiable base learners together, by injecting both local and global objectives inspired from gradient boosting, all base learners can then be jointly optimized with linear speed-up. When using differentiable soft decision trees as base learner, such device can be regarded as an alternative version of the (hard) gradient boosting decision trees with extra benefits. Experimental results showed that, sGBM enjoys much higher time efficiency with better accuracy, given the same base learner in both on-line and off-line settings. arXiv:2006.04059v1 [cs.LG] 7 Jun 2020 1. Introduction Gradient Boosting Machine (GBM) [Fri01] has proven to be one successful function approximator and has been widely used in a variety of areas [BL07, CC11]. The basic idea is to train a series of base learners that minimize some predefined differentiable loss function in a sequential fashion.
    [Show full text]
  • Xgboost: a Scalable Tree Boosting System
    XGBoost: A Scalable Tree Boosting System Tianqi Chen Carlos Guestrin University of Washington University of Washington [email protected] [email protected] ABSTRACT problems. Besides being used as a stand-alone predictor, it Tree boosting is a highly effective and widely used machine is also incorporated into real-world production pipelines for learning method. In this paper, we describe a scalable end- ad click through rate prediction [15]. Finally, it is the de- to-end tree boosting system called XGBoost, which is used facto choice of ensemble method and is used in challenges widely by data scientists to achieve state-of-the-art results such as the Netflix prize [3]. on many machine learning challenges. We propose a novel In this paper, we describe XGBoost, a scalable machine learning system for tree boosting. The system is available sparsity-aware algorithm for sparse data and weighted quan- 2 tile sketch for approximate tree learning. More importantly, as an open source package . The impact of the system has we provide insights on cache access patterns, data compres- been widely recognized in a number of machine learning and sion and sharding to build a scalable tree boosting system. data mining challenges. Take the challenges hosted by the machine learning competition site Kaggle for example. A- By combining these insights, XGBoost scales beyond billions 3 of examples using far fewer resources than existing systems. mong the 29 challenge winning solutions published at Kag- gle's blog during 2015, 17 solutions used XGBoost. Among these solutions, eight solely used XGBoost to train the mod- Keywords el, while most others combined XGBoost with neural net- Large-scale Machine Learning s in ensembles.
    [Show full text]
  • Downloaded from the Cancer Imaging Archive (TCIA)1 [13]
    UC Irvine UC Irvine Electronic Theses and Dissertations Title Deep Learning for Automated Medical Image Analysis Permalink https://escholarship.org/uc/item/78w60726 Author Zhu, Wentao Publication Date 2019 License https://creativecommons.org/licenses/by/4.0/ 4.0 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, IRVINE Deep Learning for Automated Medical Image Analysis DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Computer Science by Wentao Zhu Dissertation Committee: Professor Xiaohui Xie, Chair Professor Charless C. Fowlkes Professor Sameer Singh 2019 c 2019 Wentao Zhu DEDICATION To my wife, little son and my parents ii TABLE OF CONTENTS Page LIST OF FIGURES vi LIST OF TABLES x LIST OF ALGORITHMS xi ACKNOWLEDGMENTS xii CURRICULUM VITAE xiii ABSTRACT OF THE DISSERTATION xvi 1 Introduction 1 1.1 Dissertation Outline and Contributions . 3 2 Adversarial Deep Structured Nets for Mass Segmentation from Mammograms 7 2.1 Introduction . 7 2.2 FCN-CRF Network . 9 2.3 Adversarial FCN-CRF Nets . 10 2.4 Experiments . 12 2.5 Conclusion . 16 3 Deep Multi-instance Networks with Sparse Label Assignment for Whole Mammogram Classification 17 3.1 Introduction . 17 3.2 Deep MIL for Whole Mammogram Mass Classification . 19 3.2.1 Max Pooling-Based Multi-Instance Learning . 20 3.2.2 Label Assignment-Based Multi-Instance Learning . 22 3.2.3 Sparse Multi-Instance Learning . 23 3.3 Experiments . 24 3.4 Conclusion . 27 iii 4 DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification 29 4.1 Introduction .
    [Show full text]
  • Gradient Boosting Meets Graph Neural Networks
    Published as a conference paper at ICLR 2021 BOOST THEN CONVOLVE:GRADIENT BOOSTING MEETS GRAPH NEURAL NETWORKS Sergei Ivanov Liudmila Prokhorenkova Criteo AI Lab; Skoltech Yandex; HSE University; MIPT Paris, France Moscow, Russia [email protected] [email protected] ABSTRACT Graph neural networks (GNNs) are powerful models that have been successful in various graph representation learning tasks. Whereas gradient boosted decision trees (GBDT) often outperform other machine learning methods when faced with heterogeneous tabular data. But what approach should be used for graphs with tabular node features? Previous GNN models have mostly focused on networks with homogeneous sparse features and, as we show, are suboptimal in the heterogeneous setting. In this work, we propose a novel architecture that trains GBDT and GNN jointly to get the best of both worlds: the GBDT model deals with heterogeneous features, while GNN accounts for the graph structure. Our model benefits from end- to-end optimization by allowing new trees to fit the gradient updates of GNN. With an extensive experimental comparison to the leading GBDT and GNN models, we demonstrate a significant increase in performance on a variety of graphs with tabular features. The code is available: https://github.com/nd7141/bgnn. 1 INTRODUCTION Graph neural networks (GNNs) have shown great success in learning on graph-structured data with various applications in molecular design (Stokes et al., 2020), computer vision (Casas et al., 2019), combinatorial optimization (Mazyavkina et al., 2020), and recommender systems (Sun et al., 2020). The main driving force for progress is the existence of canonical GNN architecture that efficiently encodes the original input data into expressive representations, thereby achieving high-quality results on new datasets and tasks.
    [Show full text]
  • Comparative Performance of Machine Learning Algorithms in Cyberbullying Detection: Using Turkish Language Preprocessing Techniques
    Comparative Performance of Machine Learning Algorithms in Cyberbullying Detection: Using Turkish Language Preprocessing Techniques Emre Cihan ATESa*, Erkan BOSTANCIb, Mehmet Serdar GÜZELc a Department of Security Science, Gendarmerie and Coast Guard Academy (JSGA), Ankara, Turkey. b Department of Computer Engineering, Ankara University, Ankara, Turkey. c Department of Computer Engineering, Ankara University, Ankara, Turkey. Abstract With the increasing use of the internet and social media, it is obvious that cyberbullying has become a major problem. The most basic way for protection against the dangerous consequences of cyberbullying is to actively detect and control the contents containing cyberbullying. When we look at today's internet and social media statistics, it is impossible to detect cyberbullying contents only by human power. Effective cyberbullying detection methods are necessary in order to make social media a safe communication space. Current research efforts focus on using machine learning for detecting and eliminating cyberbullying. Although most of the studies have been conducted on English texts for the detection of cyberbullying, there are few studies in Turkish. Limited methods and algorithms were also used in studies conducted on the Turkish language. In addition, the scope and performance of the algorithms used to classify the texts containing cyberbullying is different, and this reveals the importance of using an appropriate algorithm. The aim of this study is to compare the performance of different machine learning algorithms in detecting Turkish messages containing cyberbullying. In this study, nineteen different classification algorithms were used to identify texts containing cyberbullying using Turkish natural language processing techniques. Precision, recall, accuracy and F1 score values were used to evaluate the performance of classifiers.
    [Show full text]
  • Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting
    www.nature.com/scientificreports OPEN Enhanced Prediction of Hot Spots at Protein-Protein Interfaces Using Extreme Gradient Boosting Received: 5 June 2018 Hao Wang, Chuyao Liu & Lei Deng Accepted: 10 September 2018 Identifcation of hot spots, a small portion of protein-protein interface residues that contribute the Published: xx xx xxxx majority of the binding free energy, can provide crucial information for understanding the function of proteins and studying their interactions. Based on our previous method (PredHS), we propose a new computational approach, PredHS2, that can further improve the accuracy of predicting hot spots at protein-protein interfaces. Firstly we build a new training dataset of 313 alanine-mutated interface residues extracted from 34 protein complexes. Then we generate a wide variety of 600 sequence, structure, exposure and energy features, together with Euclidean and Voronoi neighborhood properties. To remove redundant and irrelevant information, we select a set of 26 optimal features utilizing a two-step feature selection method, which consist of a minimum Redundancy Maximum Relevance (mRMR) procedure and a sequential forward selection process. Based on the selected 26 features, we use Extreme Gradient Boosting (XGBoost) to build our prediction model. Performance of our PredHS2 approach outperforms other machine learning algorithms and other state-of-the-art hot spot prediction methods on the training dataset and the independent test set (BID) respectively. Several novel features, such as solvent exposure characteristics, second structure features and disorder scores, are found to be more efective in discriminating hot spots. Moreover, the update of the training dataset and the new feature selection and classifcation algorithms play a vital role in improving the prediction quality.
    [Show full text]
  • Development and Evaluation of the Combined Machine Learning Models for the Prediction of Dam Inflow
    water Article Development and Evaluation of the Combined Machine Learning Models for the Prediction of Dam Inflow Jiyeong Hong 1 , Seoro Lee 1, Joo Hyun Bae 2 , Jimin Lee 1, Woon Ji Park 1, Dongjun Lee 1, Jonggun Kim 1 and Kyoung Jae Lim 1,* 1 Department of Regional Infrastructure Engineering, Kangwon National University, Chuncheon-si 24341, Korea; [email protected] (J.H.); [email protected] (S.L.); [email protected] (J.L.); [email protected] (W.J.P.); [email protected] (D.L.); [email protected] (J.K.) 2 Korea Water Environment Research Institute, Chuncheon-si 24408, Korea; [email protected] * Correspondence: [email protected] Received: 3 September 2020; Accepted: 15 October 2020; Published: 20 October 2020 Abstract: Predicting dam inflow is necessary for effective water management. This study created machine learning algorithms to predict the amount of inflow into the Soyang River Dam in South Korea, using weather and dam inflow data for 40 years. A total of six algorithms were used, as follows: decision tree (DT), multilayer perceptron (MLP), random forest (RF), gradient boosting (GB), recurrent neural network–long short-term memory (RNN–LSTM), and convolutional neural network–LSTM (CNN–LSTM). Among these models, the multilayer perceptron model showed the best results in predicting dam inflow, with the Nash–Sutcliffe efficiency (NSE) value of 0.812, root mean squared errors (RMSE) of 77.218 m3/s, mean absolute error (MAE) of 29.034 m3/s, correlation coefficient (R) of 0.924, and determination coefficient (R2) of 0.817. However, when the amount of dam inflow is below 100 m3/s, the ensemble models (random forest and gradient boosting models) performed better than MLP for the prediction of dam inflow.
    [Show full text]
  • On Interpretability of Artificial Neural Networks
    On Interpretability of Artificial Neural Networks: A Survey Feng-Lei Fan, Student Member, IEEE, Jinjun Xiong, Senior Member, IEEE, Mengzhou Li, Student Member, IEEE, and Ge Wang, Fellow, IEEE Abstract— Deep learning as represented by the artificial deep learning, interpretability is needed to hold a deep learning neural networks (DNNs) has achieved great success in many system accountable [57]. If a model builder can explain why a important areas that deal with text, images, videos, graphs, and model makes a particular decision under certain conditions, so on. However, the black-box nature of DNNs has become one of users would know whether such a model contributes to an the primary obstacles for their wide acceptance in mission- adverse event or not. It is then possible to establish standards critical applications such as medical diagnosis and therapy. Due to the huge potential of deep learning, interpreting neural and protocols to use the deep learning system optimally. networks has recently attracted much research attention. In this The lack of interpretability has become a main barrier of deep paper, based on our comprehensive taxonomy, we systematically learning in its wide acceptance in mission-critical applications. review recent studies in understanding the mechanism of neural For example, regulations were proposed by European Union in networks, describe applications of interpretability especially in medicine, and discuss future directions of interpretability 2016 that individuals affected by algorithms have the right to research, such as in relation to fuzzy logic and brain science. obtain an explanation [61]. Despite great research efforts made Index Terms—Deep learning, neural networks, on interpretability of deep learning and availability of several interpretability, survey.
    [Show full text]
  • An Intuitive Exploration of Artificial Intelligence
    An Intuitive Exploration of Artificial Intelligence Simant Dube An Intuitive Exploration of Artificial Intelligence Theory and Applications of Deep Learning Simant Dube Technology and Innovation Office Varian Medical Systems A Siemens Healthineers Company Palo Alto, CA, USA ISBN 978-3-030-68623-9 ISBN 978-3-030-68624-6 (eBook) https://doi.org/10.1007/978-3-030-68624-6 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
    [Show full text]
  • Comparison of Different U-Net Models for Building Extraction from High- Resolution Aerial Imagery Firat ERDEM, Ugur AVDAN
    ISSN: 2148-9173 Vol: 7 Issue:3 Dec 2020 International Journal of Environment and Geoinformatics (IJEGEO) is an international, multidisciplinary, peer reviewed, open access journal. Comparison of Different U-Net Models for Building Extraction from High- Resolution Aerial Imagery Firat ERDEM, Ugur AVDAN Chief in Editor Prof. Dr. Cem Gazioğlu Co-Editors Prof. Dr. Dursun Zafer Şeker, Prof. Dr. Şinasi Kaya, Prof. Dr. Ayşegül Tanık and Assist. Prof. Dr. Volkan Demir Guest Editor Dr. Nedim Onur AYKUT Editorial Committee (December 2020) Assos. Prof. Dr. Abdullah Aksu (TR), Assit. Prof. Dr. Uğur Algancı (TR), Prof. Dr. Bedri Alpar (TR), Prof. Dr. Levent Bat (TR), Prof. Dr. Paul Bates (UK), İrşad Bayırhan (TR), Prof. Dr. Bülent Bayram (TR), Prof. Dr. Luis M. Botana (ES), Prof. Dr. Nuray Çağlar (TR), Prof. Dr. Sukanta Dash (IN), Dr. Soofia T. Elias (UK), Prof. Dr. A. Evren Erginal (TR), Assoc. Prof. Dr. Cüneyt Erenoğlu (TR), Dr. Dieter Fritsch (DE), Prof. Dr. Çiğdem Göksel (TR), Prof.Dr. Lena Halounova (CZ), Prof. Dr. Manik Kalubarme (IN), Dr. Hakan Kaya (TR), Assist. Prof. Dr. Serkan Kükrer (TR), Assoc. Prof. Dr. Maged Marghany (MY), Prof. Dr. Michael Meadows (ZA), Prof. Dr. Nebiye Musaoğlu (TR), Prof. Dr. Masafumi Nakagawa (JP), Prof. Dr. Hasan Özdemir (TR), Prof. Dr. Chryssy Potsiou (GR), Prof. Dr. Erol Sarı (TR), Prof. Dr. Maria Paradiso (IT), Prof. Dr. Petros Patias (GR), Prof. Dr. Elif Sertel (TR), Prof. Dr. Nüket Sivri (TR), Prof. Dr. Füsun Balık Şanlı (TR), Prof. Dr. Uğur Şanlı (TR), Duygu Ülker (TR), Prof. Dr. Seyfettin Taş (TR), Assoc. Prof. Dr. Ömer Suat Taşkın (US), Assist.
    [Show full text]
  • Imitation Learning in Relational Domains: a Functional-Gradient Boosting Approach
    Imitation Learning in Relational Domains: A Functional-Gradient Boosting Approach Sriraam Natarajan and Saket Joshi+ and Prasad Tadepalli+ and Kristian Kersting# and Jude Shavlik∗ Wake Forest University School of Medicine, USA + Oregon State University, USA # Fraunhofer IAIS, Germany, ∗ University of Wisconsin-Madison, USA Abstract follow a meaningful trajectory. Nevertheless, techniques used from supervised learning have been successful for imitation Imitation learning refers to the problem of learn- learning [Ratliff et al., 2006]. We follow this tradition and ing how to behave by observing a teacher in ac- investigate the use of supervised learning methods to learn tion. We consider imitation learning in relational behavioral policies. domains, in which there is a varying number of ob- jects and relations among them. In prior work, sim- Inverse reinforcement learning is a popular approach to ple relational policies are learned by viewing imi- imitation learning studied under the name of apprentice- tation learning as supervised learning of a function ship learning [Abbeel and Ng, 2004; Ratliff et al., 2006; from states to actions. For propositional worlds, Neu and Szepesvari, 2007; Syed and Schapire, 2007]. Here functional gradient methods have been proved to one would assume that the observations are generated by a be beneficial. They are simpler to implement than near-optimal policy of a Markov Decision Process (MDP) most existing methods, more efficient, more natu- with an unknown reward function. The approaches usually rally satisfy common constraints on the cost func- take the form of learning the reward function of the agent and tion, and better represent our prior beliefs about solving the corresponding MDP.
    [Show full text]
  • Episodic Exploration for Deep Deterministic Policies for Starcraft Micromanagement
    Published as a conference paper at ICLR 2017 EPISODIC EXPLORATION FOR DEEP DETERMINISTIC POLICIES FOR STARCRAFT MICROMANAGEMENT Nicolas Usunier∗, Gabriel Synnaeve∗, Zeming Lin, Soumith Chintala Facebook AI Research {usunier,gab,zlin,soumith}@fb.com ABSTRACT We consider scenarios from the real-time strategy game StarCraft as benchmarks for reinforcement learning algorithms. We focus on micromanagement, that is, the short-term, low-level control of team members during a battle. We propose several scenarios that are challenging for reinforcement learning algorithms because the state- action space is very large, and there is no obvious feature representation for the value functions. We describe our approach to tackle the micromanagement scenarios with deep neural network controllers from raw state features given by the game engine. We also present a heuristic reinforcement learning algorithm which combines direct exploration in the policy space and backpropagation. This algorithm collects traces for learning using deterministic policies, which appears much more efficient than, e.g., -greedy exploration. Experiments show that this algorithm allows to successfully learn non-trivial strategies for scenarios with armies of up to 15 agents, where both Q-learning and REINFORCE struggle. 1 INTRODUCTION StarCraft1 is a real-time strategy (RTS) game in which each player must build an army and control individual units to destroy the opponent’s army. As of today, StarCraft is considered one of the most difficult games for computers, and the best bots only reach the level of high amateur human players (Churchill, 2015). The main difficulty comes from the need to control a large number of units in partially observable environment, with very large state and action spaces: for example, in a typical game, there are at least 101685 possible states whereas the game of Go has about 10170 states.
    [Show full text]