Reinforcement Learning in Supervised Problem Domains

Total Page:16

File Type:pdf, Size:1020Kb

Reinforcement Learning in Supervised Problem Domains Technische Universität München Fakultät für Informatik Lehrstuhl VI – Echtzeitsysteme und Robotik reinforcement learning in supervised problem domains Thomas F. Rückstieß Vollständiger Abdruck der von der Fakultät für Informatik der Technischen Universität München zur Erlangung des akademischen Grades eines Doktors der Naturwissenschaften (Dr. rer. nat.) genehmigten Dissertation. Vorsitzender: Univ.-Prof. Dr. Daniel Cremers Prüfer der Dissertation 1. Univ.-Prof. Dr. Patrick van der Smagt 2. Univ.-Prof. Dr. Hans Jürgen Schmidhuber Die Dissertation wurde am 30. 06. 2015 bei der Technischen Universität München eingereicht und durch die Fakultät für Informatik am 18. 09. 2015 angenommen. Thomas Rückstieß: Reinforcement Learning in Supervised Problem Domains © 2015 email: [email protected] ABSTRACT Despite continuous advances in computing technology, today’s brute for- ce data processing approaches may not provide the necessary advantage to win the race against the ever-growing amount of data that can be wit- nessed over the last decades. In this thesis, we discuss novel methods and algorithms that are capable of directing attention to relevant details and analysing it in sequence to overcome the processing bottleneck and to keep up with this data explosion. In the first of three parts, a novel exploration technique for Policy Gradi- ent Reinforcement Learning is presented which replaces traditional ad- ditive random exploration with state-dependent exploration, exploring on a higher, more strategic level. We will show how this new exploration method converges faster and finds better global solutions than random exploration can. The second part of this thesis will introduce the concept of “data con- sumption” and discuss means to minimise it in supervised learning tasks by deriving classification as a sequential decision process and ma- king it accessible to Reinforcement Learning methods. Depending on previously selected features and the internal belief state of a classifier a next feature is chosen by a sequential online feature selection that learns which features are most informative at each given time step. In experi- ments this attentive hybrid learning system shows significant reduction in required data for correct classification. Finally, the third major contribution of this thesis is a novel sequence learning approach that learns an explicit contextual state while traver- sing a sequence. This context helps distinguish the current input and mitigates the need for a predictor capable of dealing with sequential data. We show the close relationship to concepts from theoretical com- puter science, in particular that of deterministic finite automata and re- gular languages and demonstrate experimentally the capabilities of this hybrid algorithm. All three parts share in common a tight integration of Reinforcement Learning and Supervised Learning which not only delivers an orthogo- nal view onto this research but also establishes for the first time a general framework of such hybrid algorithms. iii ZUSAMMENFASSUNG Trotz des ständigen technischen Fortschritts in der Computertechnolo- gie ist es durchaus möglich, dass die heutigen Holzhammer-Methoden der Datenanalyse uns nicht den nötigen Vorteil bringen, um das Rennen gegen das stetige Datenwachstum zu gewinnen, das in den letzten Jahr- zehnten zu beobachten ist. In dieser Dissertation werden neue Methoden und Algorithmen diskutiert, die in der Lage sind, ihre Aufmerksamkeit auf die relevanten Details zu richten und diese sequentiell zu verarbei- ten, um den Flaschenhals der Informationsverarbeitung zu überwinden und mit der Datenexplosion Schritt zu halten. Im ersten von drei Teilen wird eine neuartige Explorationsmethode für Reinforcement Learning (bestärkendes Lernen) mit Policy Gradients vor- gestellt, welches die traditionelle Art des additiven Explorierens durch zustandsabhängige Exploration ersetzt und auf einem höheren und mehr strategischen Level arbeitet. Wir zeigen, dass diese neue Form der Explo- ration schneller konvergiert und bessere und globalere Lösungen finden kann als zufällige Exploration. Der zweite Teil dieser Arbeit stellt das Konzept des “Datenkonsums” vor und diskutiert Mittel, um diesen für überwachte Lernvorgänge zu minimieren. Dies wird ermöglicht indem Klassifizierung als sequenzi- eller Entscheidungsprozess hergeleitet und dadurch bestärkenden Lern- methoden zugänglich gemacht wird. Abhängig von vorherig ausgewähl- ten Features und des internen Zustands eines Klassifizierers wählt eine sequenzielle Komponente ein neues Feature in Echtzeit aus, indem es lernt, welches Feature den höchsten Informationsgehalt zum jeweiligen Zeitpunkt trägt. In Experimenten zeigt dieses hybride Lernsystem ei- ne signifikante Verringerung der nötigen Datenmenge für die korrekte Klassifizerung. Der dritte bedeutende Beitrag dieser Dissertation beschreibt eine neue sequenzielle Lernmethode, die während des Traversierens einer Sequenz einen expliziten kontextuellen Zustand aufbaut. Dieser Kontext unter- stützt die Charakterisierung der aktuellen Eingabe und ermöglicht das Auseinanderhalten zeitlicher Eingabesignale ohne die Hilfe einer sequen- tiellen Klassifikationsmethode. Wir zeigen die enge Verwandtschaft zu Konzepten aus der Theoretischen Informatik auf, insbesondere zur Au- tomatentheorie und Regulären Sprachen, und demonstrieren die Mög- lichkeiten dieses hybriden Algorithmus experimentell. iv Alle drei Teile haben eine enge Integration von bestärkenden und über- wachten Lernmethoden gemeinsam. Dies stellt nicht nur eine alternati- ve Ansicht auf die hier vorgestellten Forschungsergebnisse dar, sondern etabliert zum ersten Mal ein allgemeines Rahmensystem solcher kombi- nierter Algorithmen. v ACKNOWLEDGEMENTS I would like to thank my supervisors Prof. Jürgen Schmidhuber and Prof. Patrick van der Smagt for their help and support. Without either of them, this thesis would not have been possible. I thank Jürgen for believing in me and letting me join his group, for his inspiring vision and valuable conversations, and for his support and funding during the first years of my Ph.D. I want to thank Patrick for jumping in and taking over the supervision without knowing me much, for funding my work in the later years of my Ph.D., and for the interesting lunch discussions during my visits. His constructive and criticial suggestions and diligent proofreading have helped improve this thesis tremendously. Special thanks go to my colleagues at TUM (a.k.a. “die Buam”), for the many great and interesting years together. Christian Osendorfer, my “roommate” for many years, was always there to bounce off ideas, dis- cuss crazy theories and help out with complex mathematical problems. Thanks go to Frank Sehnke, Alex Graves and Martin Felder, the other Buam in the team. We worked on some great projects together, always had each other’s backs and made the years of Ph.D. slavery a fun experi- ence. The other contributors of PyBrain, in particular Justin Bayer, Tom Schaul and Daan Wierstra, also deserve a mention for their great work on this joint project. Everyone at the chair of Robotics and Embedded Systems, especially Prof. Alois Knoll, Dr. Gerhard Schrott, our secre- taries Monika Knürr, Gisela Hibsch and Amy Bücherl, have helped and supported me in various ways over the years and made my time at TUM more about the research and less about bureaucracy. A big Dankeschön goes to all my friends. I’m lucky to have met so many great people, both during my time in Tübingen, and after my move to München. They supported me in every decision, were there for me during good and bad times, and kept me sane and happy over all these years. I miss you all incredibly. I also want to thank my parents Wolfgang and Berta for a happy and unburdened upbringing, for giving me the opportunity to make my own choices in life and for always giving me good advise for the important decisions. Finally, I would like to thank my wonderful partner Steve Pennells for supporting me unconditionally during all this time, for his sacrifices and his endless patience over the last years and for being such a great companion on our journey around the globe and through life. vi CONTENTS abstract iii zusammenfassung iv acknowledgements vi 1 introduction 3 I Overview1 2 outline 5 3 contributions 7 3.1 Scientific Contributions . 7 3.2 Technical Contributions . 8 4 function approximation 13 II Literature Review 11 4.1 General . 13 4.2 Function Approximation in Reinforcement Learning . 14 4.3 Linear Regression . 15 4.4 Logistic and Multinomial Regression . 17 4.5 Neural Networks . 18 4.5.1 Feed-Forward Neural Networks . 19 4.5.2 Recurrent Neural Networks and LSTMs . 22 4.6 Locally Weighted Projection Regression . 24 4.7 Sequence Learning . 24 5 reinforcement learning 27 5.1 General Remarks and Notation . 27 5.1.1 Formal Definition of Reinforcement Learning . 28 5.2 Model-based Reinforcement Learning . 31 5.3 Value-Based Reinforcement Learning . 31 5.3.1 Discrete Value-Based RL . 32 5.3.2 Continuous Value-Based RL . 35 5.4 Direct Reinforcement Learning . 41 5.4.1 Overall performance measure . 41 vii viii Contents 5.4.2 Finite Difference Methods . 41 5.4.3 Likelihood Ratio Methods . 42 5.5 Exploration . 45 5.5.1 Exploration in Discrete Action Spaces . 45 5.5.2 Exploration in Continuous Action Spaces . 46 6 state-dependent exploration 51 III Methods and Experiments 49 6.1 Introduction . 51 6.2 Exploration in Parameter Space . 53 6.3 State-Dependent Exploration . 55 6.3.1 REINFORCE for General Multi-Dimensional FA . 56 6.3.2 Derivation from
Recommended publications
  • Predrnn: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal Lstms
    PredRNN: Recurrent Neural Networks for Predictive Learning using Spatiotemporal LSTMs Yunbo Wang Mingsheng Long∗ School of Software School of Software Tsinghua University Tsinghua University [email protected] [email protected] Jianmin Wang Zhifeng Gao Philip S. Yu School of Software School of Software School of Software Tsinghua University Tsinghua University Tsinghua University [email protected] [email protected] [email protected] Abstract The predictive learning of spatiotemporal sequences aims to generate future images by learning from the historical frames, where spatial appearances and temporal vari- ations are two crucial structures. This paper models these structures by presenting a predictive recurrent neural network (PredRNN). This architecture is enlightened by the idea that spatiotemporal predictive learning should memorize both spatial ap- pearances and temporal variations in a unified memory pool. Concretely, memory states are no longer constrained inside each LSTM unit. Instead, they are allowed to zigzag in two directions: across stacked RNN layers vertically and through all RNN states horizontally. The core of this network is a new Spatiotemporal LSTM (ST-LSTM) unit that extracts and memorizes spatial and temporal representations simultaneously. PredRNN achieves the state-of-the-art prediction performance on three video prediction datasets and is a more general framework, that can be easily extended to other predictive learning tasks by integrating with other architectures. 1 Introduction
    [Show full text]
  • Machine Learning Or Econometrics for Credit Scoring: Let’S Get the Best of Both Worlds Elena Dumitrescu, Sullivan Hué, Christophe Hurlin, Sessi Tokpavi
    Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds Elena Dumitrescu, Sullivan Hué, Christophe Hurlin, Sessi Tokpavi To cite this version: Elena Dumitrescu, Sullivan Hué, Christophe Hurlin, Sessi Tokpavi. Machine Learning or Econometrics for Credit Scoring: Let’s Get the Best of Both Worlds. 2020. hal-02507499v2 HAL Id: hal-02507499 https://hal.archives-ouvertes.fr/hal-02507499v2 Preprint submitted on 2 Nov 2020 (v2), last revised 15 Jan 2021 (v3) HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Machine Learning or Econometrics for Credit Scoring: Let's Get the Best of Both Worlds∗ Elena, Dumitrescu,y Sullivan, Hu´e,z Christophe, Hurlin,x Sessi, Tokpavi{ October 18, 2020 Abstract In the context of credit scoring, ensemble methods based on decision trees, such as the random forest method, provide better classification performance than standard logistic regression models. However, logistic regression remains the benchmark in the credit risk industry mainly because the lack of interpretability of ensemble methods is incompatible with the requirements of financial regulators. In this paper, we pro- pose to obtain the best of both worlds by introducing a high-performance and inter- pretable credit scoring method called penalised logistic tree regression (PLTR), which uses information from decision trees to improve the performance of logistic regression.
    [Show full text]
  • Almost Unsupervised Text to Speech and Automatic Speech Recognition
    Almost Unsupervised Text to Speech and Automatic Speech Recognition Yi Ren * 1 Xu Tan * 2 Tao Qin 2 Sheng Zhao 3 Zhou Zhao 1 Tie-Yan Liu 2 Abstract 1. Introduction Text to speech (TTS) and automatic speech recognition (ASR) are two popular tasks in speech processing and have Text to speech (TTS) and automatic speech recog- attracted a lot of attention in recent years due to advances in nition (ASR) are two dual tasks in speech pro- deep learning. Nowadays, the state-of-the-art TTS and ASR cessing and both achieve impressive performance systems are mostly based on deep neural models and are all thanks to the recent advance in deep learning data-hungry, which brings challenges on many languages and large amount of aligned speech and text data. that are scarce of paired speech and text data. Therefore, However, the lack of aligned data poses a ma- a variety of techniques for low-resource and zero-resource jor practical problem for TTS and ASR on low- ASR and TTS have been proposed recently, including un- resource languages. In this paper, by leveraging supervised ASR (Yeh et al., 2019; Chen et al., 2018a; Liu the dual nature of the two tasks, we propose an et al., 2018; Chen et al., 2018b), low-resource ASR (Chuang- almost unsupervised learning method that only suwanich, 2016; Dalmia et al., 2018; Zhou et al., 2018), TTS leverages few hundreds of paired data and extra with minimal speaker data (Chen et al., 2019; Jia et al., 2018; unpaired data for TTS and ASR.
    [Show full text]
  • Self-Supervised Learning
    Self-Supervised Learning Andrew Zisserman Slides from: Carl Doersch, Ishan Misra, Andrew Owens, Carl Vondrick, Richard Zhang The ImageNet Challenge Story … 1000 categories • Training: 1000 images for each category • Testing: 100k images The ImageNet Challenge Story … strong supervision The ImageNet Challenge Story … outcomes Strong supervision: • Features from networks trained on ImageNet can be used for other visual tasks, e.g. detection, segmentation, action recognition, fine grained visual classification • To some extent, any visual task can be solved now by: 1. Construct a large-scale dataset labelled for that task 2. Specify a training loss and neural network architecture 3. Train the network and deploy • Are there alternatives to strong supervision for training? Self-Supervised learning …. Why Self-Supervision? 1. Expense of producing a new dataset for each new task 2. Some areas are supervision-starved, e.g. medical data, where it is hard to obtain annotation 3. Untapped/availability of vast numbers of unlabelled images/videos – Facebook: one billion images uploaded per day – 300 hours of video are uploaded to YouTube every minute 4. How infants may learn … Self-Supervised Learning The Scientist in the Crib: What Early Learning Tells Us About the Mind by Alison Gopnik, Andrew N. Meltzoff and Patricia K. Kuhl The Development of Embodied Cognition: Six Lessons from Babies by Linda Smith and Michael Gasser What is Self-Supervision? • A form of unsupervised learning where the data provides the supervision • In general, withhold some part of the data, and task the network with predicting it • The task defines a proxy loss, and the network is forced to learn what we really care about, e.g.
    [Show full text]
  • A Comprehensive Case Study: an Examination of Machine Learning and Connectionist Algorithms
    A Comprehensive Case Study: An Examination of Machine Learning and Connectionist Algorithms A Thesis Presented to the Department of Computer Science Brigham Young University In Partial Fulfillment of the Requirements for the Degree Master of Science Frederick Zarndt [email protected] June 1995 Contents 1. Introduction . 1 2. Databases . 6 2.1 Database Names . 7 2.2 Database Statistics . 11 3. Learning Models . 20 3.1 Decision Trees . 20 3.2 Nearest Neighbor . 21 3.3 Statistical . 21 3.4 Rule Based . 22 3.6 Neural Networks . 22 4. Method . 24 4.1 Cross-Validation . 24 4.2 Consistency of Training/Testing Sets and Presentation Order . 26 4.3 Attribute Discretization . 26 4.4 Unknown Attribute Values . 28 4.5 Learning Model Parameters . 30 5. Results . 31 6. Limitations . 52 7. Conclusion . 53 Appendix A: Standard Database Format (SDF) . 55 Appendix B: Tools . 58 B.1 Database Tools . 58 B.1.1 verify . 58 B.1.2 verify.ksh . 58 B.1.3 dsstats . 58 B.1.4 dsstats.ksh . 59 B.1.5 xlate . 59 B.2 Test Tools . 61 B.2.1 xval.<algorithm> . 61 B.2.2 test-ml . 66 B.2.3 test-conn . 67 B.3 Result Extraction Tools . 68 B.3.1 average . 68 B.3.2 summarize . 68 Appendix C: Database Generators . 69 Appendix D. Learning Model Parameters . 70 D.1 Decision Trees . 70 D.2 Nearest Neighbor . 70 D.3 Statistical . 71 D.4 Rule Based . 71 D.5 Neural Networks . 72 Appendix E. Database Descriptions . 80 Bibliography . 85 A Comprehensive Case Study: An Examination of Machine Learning and Connectionist Algorithms Frederick Zarndt Department of Computer Science M.S.
    [Show full text]
  • Infinite Variational Autoencoder for Semi-Supervised Learning
    Infinite Variational Autoencoder for Semi-Supervised Learning M. Ehsan Abbasnejad Anthony Dick Anton van den Hengel The University of Adelaide {ehsan.abbasnejad, anthony.dick, anton.vandenhengel}@adelaide.edu.au Abstract with whatever labelled data is available to train a discrimi- native model for classification. This paper presents an infinite variational autoencoder We demonstrate that our infinite VAE outperforms both (VAE) whose capacity adapts to suit the input data. This the classical VAE and standard classification methods, par- is achieved using a mixture model where the mixing coef- ticularly when the number of available labelled samples is ficients are modeled by a Dirichlet process, allowing us to small. This is because the infinite VAE is able to more ac- integrate over the coefficients when performing inference. curately capture the distribution of the unlabelled data. It Critically, this then allows us to automatically vary the therefore provides a generative model that allows the dis- number of autoencoders in the mixture based on the data. criminative model, which is trained based on its output, to Experiments show the flexibility of our method, particularly be more effectively learnt using a small number of samples. for semi-supervised learning, where only a small number of The main contribution of this paper is twofold: (1) we training samples are available. provide a Bayesian non-parametric model for combining autoencoders, in particular variational autoencoders. This bridges the gap between non-parametric Bayesian meth- 1. Introduction ods and the deep neural networks; (2) we provide a semi- supervised learning approach that utilizes the infinite mix- The Variational Autoencoder (VAE) [18] is a newly in- ture of autoencoders learned by our model for prediction troduced tool for unsupervised learning of a distribution x x with from a small number of labeled examples.
    [Show full text]
  • Unsupervised Speech Representation Learning Using Wavenet Autoencoders Jan Chorowski, Ron J
    1 Unsupervised speech representation learning using WaveNet autoencoders Jan Chorowski, Ron J. Weiss, Samy Bengio, Aaron¨ van den Oord Abstract—We consider the task of unsupervised extraction speaker gender and identity, from phonetic content, properties of meaningful latent representations of speech by applying which are consistent with internal representations learned autoencoding neural networks to speech waveforms. The goal by speech recognizers [13], [14]. Such representations are is to learn a representation able to capture high level semantic content from the signal, e.g. phoneme identities, while being desired in several tasks, such as low resource automatic speech invariant to confounding low level details in the signal such as recognition (ASR), where only a small amount of labeled the underlying pitch contour or background noise. Since the training data is available. In such scenario, limited amounts learned representation is tuned to contain only phonetic content, of data may be sufficient to learn an acoustic model on the we resort to using a high capacity WaveNet decoder to infer representation discovered without supervision, but insufficient information discarded by the encoder from previous samples. Moreover, the behavior of autoencoder models depends on the to learn the acoustic model and a data representation in a fully kind of constraint that is applied to the latent representation. supervised manner [15], [16]. We compare three variants: a simple dimensionality reduction We focus on representations learned with autoencoders bottleneck, a Gaussian Variational Autoencoder (VAE), and a applied to raw waveforms and spectrogram features and discrete Vector Quantized VAE (VQ-VAE). We analyze the quality investigate the quality of learned representations on LibriSpeech of learned representations in terms of speaker independence, the ability to predict phonetic content, and the ability to accurately re- [17].
    [Show full text]
  • A Primer on Machine Learning
    A Primer on Machine Learning By instructor Amit Manghani Question: What is Machine Learning? Simply put, Machine Learning is a form of data analysis. Using algorithms that “ continuously learn from data, Machine Learning allows computers to recognize The core of hidden patterns without actually being programmed to do so. The key aspect of Machine Learning Machine Learning is that as models are exposed to new data sets, they adapt to produce reliable and consistent output. revolves around a computer system Question: consuming data What is driving the resurgence of Machine Learning? and learning from There are four interrelated phenomena that are behind the growing prominence the data. of Machine Learning: 1) the ever-increasing volume, variety and velocity of data, 2) the decrease in bandwidth and storage costs and 3) the exponential improve- ments in computational processing. In a nutshell, the ability to perform complex ” mathematical computations on big data is driving the resurgence in Machine Learning. 1 Question: What are some of the commonly used methods of Machine Learning? Reinforce- ment Machine Learning Supervised Machine Learning Semi- supervised Machine Unsupervised Learning Machine Learning Supervised Machine Learning In Supervised Learning, algorithms are trained using labeled examples i.e. the desired output for an input is known. For example, a piece of mail could be labeled either as relevant or junk. The algorithm receives a set of inputs along with the corresponding correct outputs to foster learning. Once the algorithm is trained on a set of labeled data; the algorithm is run against the same labeled data and its actual output is compared against the correct output to detect errors.
    [Show full text]
  • Analysis of Earthquake Forecasting in India Using Supervised Machine Learning Classifiers
    sustainability Article Analysis of Earthquake Forecasting in India Using Supervised Machine Learning Classifiers Papiya Debnath 1, Pankaj Chittora 2 , Tulika Chakrabarti 3, Prasun Chakrabarti 2,4, Zbigniew Leonowicz 5 , Michal Jasinski 5,* , Radomir Gono 6 and Elzbieta˙ Jasi ´nska 7 1 Department of Basic Science and Humanities, Techno International New Town Rajarhat, Kolkata 700156, India; [email protected] 2 Department of Computer Science and Engineering, Techno India NJR Institute of Technology, Udaipur 313003, Rajasthan, India; [email protected] (P.C.); [email protected] (P.C.) 3 Department of Basic Science, Sir Padampat Singhania University, Udaipur 313601, Rajasthan, India; [email protected] 4 Data Analytics and Artificial Intelligence Laboratory, Engineering-Technology School, Thu Dau Mot University, Thu Dau Mot City 820000, Vietnam 5 Department of Electrical Engineering Fundamentals, Faculty of Electrical Engineering, Wroclaw University of Science and Technology, 50-370 Wroclaw, Poland; [email protected] 6 Department of Electrical Power Engineering, Faculty of Electrical Engineering and Computer Science, VSB—Technical University of Ostrava, 708 00 Ostrava, Czech Republic; [email protected] 7 Faculty of Law, Administration and Economics, University of Wroclaw, 50-145 Wroclaw, Poland; [email protected] * Correspondence: [email protected]; Tel.: +48-713-202-022 Abstract: Earthquakes are one of the most overwhelming types of natural hazards. As a result, successfully handling the situation they create is crucial. Due to earthquakes, many lives can be lost, alongside devastating impacts to the economy. The ability to forecast earthquakes is one of the biggest issues in geoscience. Machine learning technology can play a vital role in the field of Citation: Debnath, P.; Chittora, P.; geoscience for forecasting earthquakes.
    [Show full text]
  • Automatic Classification of Oranges Using Image Processing and Data Mining Techniques
    Automatic classification of oranges using image processing and data mining techniques Juan Pablo Mercol, Juliana Gambini, Juan Miguel Santos Departamento de Computaci´on, Facultad de Ciencias Exactas y Naturales, UBA, Buenos Aires, Argentina [email protected],{jgambini,jmsantos}@dc.uba.ar, Abstract Data mining is the discovery of patterns and regularities from large amounts of data using machine learning algorithms. This can be applied to object recognition using image processing techniques. In fruits and vegetables production lines, the quality assurance is done by trained people who inspect the fruits while they move in a conveyor belt, and classify them in several categories based on visual features. In this paper we present an automatic orange’s classification system, which uses visual inspection to extract features from images captured with a digital camera. With these features train several data mining algorithms which should classify the fruits in one of the three pre-established categories. The data mining algorithms used are five different decision trees (J48, Classification and Regression Tree (CART), Best First Tree, Logistic Model Tree (LMT) and Random For- est), three artificial neural networks (Multilayer Perceptron with Backpropagation, Radial Basis Function Network (RBF Network), Sequential Minimal Optimization for Support Vector Machine (SMO)) and a classification rule (1Rule). The obtained results are encouraging because of the good accuracy achieved by the clas- sifiers and the low computational costs. Keywords: Image processing, Data Mining, Neural Networks, Decision trees, Fruit Qual- ity Assurance, Visual inspection, Artificial Vision. 1 Introduction During the last years, there has been an increase in the need to measure the quality of several products, in order to satisfy customers needs in the industry and services levels.
    [Show full text]
  • A Gradient-Based Split Criterion for Highly Accurate and Transparent Model Trees
    A Gradient-Based Split Criterion for Highly Accurate and Transparent Model Trees Klaus Broelemann and Gjergji Kasneci SCHUFA Holding AG, Wiesbaden, Germany fKlaus.Broelemann,[email protected] Abstract As a consequence, the choice of a predictive model for a specific task includes a trade-off between simplicity and Machine learning algorithms aim at minimizing the transparency on one side and complexity and higher predic- number of false decisions and increasing the accu- tive power on the other side. If the task is mainly driven by racy of predictions. However, the high predictive the quality of the prediction, complex state-of-the-art models power of advanced algorithms comes at the costs are best suited. Yet, many applications require a certain level of transparency. State-of-the-art methods, such as of transparency. This can be driven by regulatory constraints, neural networks and ensemble methods, result in but also by the user’s wish for transparency. A recent study highly complex models with little transparency. shows that model developers often feel the pressure to use We propose shallow model trees as a way to com- transparent models [Veale et al., 2018], even if more accu- bine simple and highly transparent predictive mod- rate models are available. This leads to simple models, such els for higher predictive power without losing the as linear or logistic regression, where more complex models transparency of the original models. We present would have much higher predictive power. In other cases, a novel split criterion for model trees that allows transparency is not required, but optional.
    [Show full text]
  • Accurate and Intuitive Contextual Explanations Using Linear Model
    Accurate and Intuitive Contextual Explanations using Linear Model Trees Aditya Lahiri Narayanan Unny Edakunni American Express, AI Labs American Express, AI Labs Bangalore, Karnataka, India Bangalore, Karnataka, India [email protected] [email protected] Abstract 1 Introduction With the ever-increasing use of complex machine learning The Finance domain has a number of scenarios where sophis- models in critical applications within the finance domain, ticated machine learning models are used to predict various explaining the modelâĂŹs decisions has become a neces- quantities of interest. This ranges from prediction of credit sity. With applications spanning from credit scoring[18] to and fraud risk of transactions to digital marketing. In many credit marketing[1], the impact of these models is undeni- of these instances, it is not enough to provide accurate pre- able. Among the multiple ways in which one can explain dictions, but is also important to provide an intuitive reason the decisions of these complicated models[16], local post in coming up with the prediction that can be understood by hoc model agnostic explanations have gained massive adop- a human. In instances like credit approval, it is also man- tion. These methods allow one to explain each prediction dated by law to provide an explanation to consumer for the independent of the modelling technique that was used while decision made for approval or denial of the credit[8, 13]. In training. As explanations, they either give individual feature applications like credit line extension[14], predictions from attributions[20] or provide sufficient rules that represent these sophisticated machine learnt models could be used by conditions for a prediction to be made[21].
    [Show full text]