A Deep Learning Approach to Big Data an Application to Traffic

Total Page:16

File Type:pdf, Size:1020Kb

A Deep Learning Approach to Big Data an Application to Traffic Fakult¨atBauingenieurwesen Institut f¨urBauinformatik THESIS A Deep Learning Approach to Big Data An Application to Traffic Prediction Submitted by: Falk H¨ugle(3254131) Advisors: Prof. Dr.-Ing. Raimar Scherer Dipl.-Ing. Ngoc Trung Luu Selbstst¨andigkeitserkl¨arung Hiermit versichere ich, dass ich die vorliegende Arbeit ohne unzul¨assigeHilfe Dritter und ohne Benutzung anderer als der angegebenen Hilfsmittel angefertigt habe; die aus fremden Quellen direkt oder indirekt ¨ubernommenen Gedanken sind als solche kenntlich gemacht. Die Arbeit wurde bisher weder im Inland noch im Ausland in gleicher oder ¨ahnlicher Form einer anderen Pr¨ufungsbeh¨ordevorgelegt und ist auch noch nicht ver¨offentlicht worden. New York, 19.05.2017 ............................. Falk H¨ugle Contents List of Figures II List of Tables III List of AlgorithmsIV 1 Introduction 1 1.1 Subject and Goals...................................1 1.2 Structure........................................1 2 Academic and Historical Context2 2.1 Academic Context...................................2 2.2 Historical Context...................................3 3 Theory 7 3.1 Machine Learning....................................7 3.1.1 Basics......................................7 3.1.2 Learning Paradigms..............................8 3.1.3 Data in Machine Learning........................... 10 3.1.4 Statistical Learning Theory.......................... 11 3.1.5 Loss and Cost Functions............................ 15 3.1.6 Types of Problems............................... 20 3.1.7 Types of Models................................ 28 3.1.8 Hyperparameter Optimization........................ 29 3.1.9 Assessing Performance............................. 33 3.2 Artificial Neural Networks............................... 35 3.2.1 Basics...................................... 35 3.2.2 Artificial Neurons............................... 37 3.2.3 Types of Architectures............................. 42 3.2.4 Learning Algorithms.............................. 54 3.2.5 Improving Generalization........................... 67 3.3 Deep Learning..................................... 74 3.3.1 Theoretical Justification............................ 74 3.3.2 Challenges in Training Deep Neural Networks................ 76 3.3.3 Solutions to Challenges............................ 79 3.3.4 New Developments............................... 86 3.4 Big Data......................................... 89 4 Application to Traffic Prediction 94 4.1 Problem Description.................................. 94 4.2 Related Research.................................... 95 4.3 Traffic Research Basics................................. 96 4.4 Data and Data Pre-Processing............................ 97 4.5 Model.......................................... 101 4.6 Implementation..................................... 109 4.7 Training......................................... 110 4.8 Model Evaluation.................................... 116 4.9 Possible Improvements and Future Research..................... 122 4.10 Other Applications in Civil Engineering....................... 123 4.11 Conclusion....................................... 124 References 125 Acronyms 138 I List of Figures 3.1 Loss Functions - Regression.............................. 16 3.2 Loss Functions - Classification............................. 18 3.3 Linear Regression example............................... 22 3.4 Logistic Regression example.............................. 25 3.5 Density Estimation example.............................. 27 3.6 Biological Neuron.................................... 38 3.7 Artificial Neuron.................................... 38 3.8 Activation Functions.................................. 41 3.9 Perceptron and Multilayer Perceptron........................ 44 3.10 Autoencoder...................................... 45 3.11 Fully Connected and Stacked Elman Recurrent Neural Network.......... 48 3.12 Boltzmann Machine and Restricted Boltzmann Machine.............. 53 3.13 Deep Boltzmann Machine............................... 54 3.14 Momentum and Nesterov Momentum update rule.................. 58 3.15 Recurrent Neural Network unrolled in time..................... 62 3.16 Underfitting and Overfitting.............................. 68 3.17 Comparison Deep Learning and conventional models................ 74 3.18 Rectified Linear Activation Function......................... 79 3.19 Maxout Unit...................................... 80 3.20 Long Short-Term Memory Unit............................ 81 3.21 Unsupervised Pre-Training effectiveness....................... 84 4.1 Fundamental Diagram of Traffic Flow........................ 96 4.2 Traffic Plot: Occupancy vs. Flow........................... 97 4.3 Traffic Plot: Speed vs. Flow.............................. 97 4.4 Traffic Plot: Occupancy vs. Speed........................... 97 4.5 Traffic and Weather Station locations........................ 98 4.6 Incident Data format.................................. 100 4.7 Model Architecture................................... 103 4.8 Correlation Significance Matrix............................ 105 4.9 Correlation Matrix................................... 105 4.10 Training and Validation Error............................. 116 4.11 Trained Weights - Convolutional Layers....................... 117 4.12 Trained Weights - LSTM Layer............................ 118 4.13 Mixture Components Probabilities.......................... 119 II List of Tables 3.1 Performance Metrics Classification.......................... 34 3.2 Model Comparison 1.................................. 34 3.3 Model Comparison 2.................................. 34 4.1 Hyperparameters - Model............................... 110 4.2 Traffic Regimes..................................... 111 4.3 Hyperparameters - Learning Algorithm....................... 111 4.4 Hyperparameters - Regularization.......................... 112 4.5 Result MAE and MSE................................. 120 4.6 Result Accuracy, Precision and Recall........................ 120 4.7 Model Comparison LSTM vs. Trivial - Paired T-Test................ 121 4.8 Model Comparison LSTM vs. Trivial - McNemar Test............... 121 4.9 Model Comparison LSTM vs. RNN - Paired T-Test................. 121 4.10 Model Comparison LSTM vs. RNN - McNemar Test................ 122 III List of Algorithms 1 Perceptron Learning Algorithm............................ 55 2 Batch Gradient Descent................................ 56 3 Mini-Batch Stochasic Gradient Descent....................... 57 4 Backpropagation.................................... 61 5 Backpropagation Through Time........................... 63 6 Contrastive Divergence................................. 66 IV Chapter 1 Introduction 1.1 Subject and Goals The primary goals of this thesis are to give an overview over the field of Deep Learning (DL), i.e. Machine Learning (ML) with deep Artificial Neural Networks (ANNs), and to demonstrate an application ofDL to a particular Big Data (BD) problem. Specifically, a broad background introduction to the theory ofML and ANNs is provided. A comprehensive exploration of the fieldDL then elucidates, from a theoretical perspective, why DL works, what its advantages are over shallow ANNs, which problems occur in training Deep Neural Networks (DNNs), and which recent theoretical advances have helped overcome these challenges. Furthermore, the subject ofBD is explored in some detail. In this context, it is exhibited how, in light of ever larger and more ubiquitous Data Sets,DL presents itself as an excellent approach to addressBD problems. In order to further substantiate the validity of this data-driven method, a practical application of DL to Traffic Modeling (TM) is discussed in detail. This use case exemplifies a scenario in which learning complex patterns directly from data is advantageous compared to explicitly modeling the relationships between a system's constituent parts. This research project encompasses acquisition and Pre-Processing of a massive Data Set, model design and its implementation using state of the artDL libraries, model training leveraging cloud accessible supercomputing resources, as well as generation of predictions on test data and their statistical analysis. 1.2 Structure Chapter 2 lays out the academic context ofDL, including its definition and relationship to other fields. Furthermore, the history ofDL is explored in the broader context of the history of Artificial Intelligence (AI) and, in particular, ANN research. Chapter 3 focuses on theory. The subject requires a thorough understanding ofML and ANNs, which are each discussed in dedicated subchapters. These two subchapters are to be understood as a reference for the rest of the thesis and can be skipped if the reader has sufficient background knowledge. The third subchapter provides a detailed view onDL, in particular, its theoretical justification, what challenges exist in training DNNs, and what solutions are available to overcome these challenges. Furthermore, promising new developments in the field are outlined. The fourth subchapter provides an overview of the subject ofBD and discusses its relationship toDL. Chapter 4 describes an application ofDL to theBD problem ofTM. In particular, the underlying data, model design, model implementation, and training are explained in detail. Moreover, possible use cases of the model, as well as results of example calculations are given, based on which model quality is compared to alternative approaches. Lastly, other possible applications ofDL in Civil Engineering are discussed. No particular
Recommended publications
  • A Public Transport Bus Assignment Problem: Parallel Metaheuristics Assessment
    David Fernandes Semedo Licenciado em Engenharia Informática A Public Transport Bus Assignment Problem: Parallel Metaheuristics Assessment Dissertação para obtenção do Grau de Mestre em Engenharia Informática Orientador: Prof. Doutor Pedro Barahona, Prof. Catedrático, Universidade Nova de Lisboa Co-orientador: Prof. Doutor Pedro Medeiros, Prof. Associado, Universidade Nova de Lisboa Júri: Presidente: Prof. Doutor José Augusto Legatheaux Martins Arguente: Prof. Doutor Salvador Luís Bettencourt Pinto de Abreu Vogal: Prof. Doutor Pedro Manuel Corrêa Calvente de Barahona Setembro, 2015 A Public Transport Bus Assignment Problem: Parallel Metaheuristics Assess- ment Copyright c David Fernandes Semedo, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa A Faculdade de Ciências e Tecnologia e a Universidade Nova de Lisboa têm o direito, perpétuo e sem limites geográficos, de arquivar e publicar esta dissertação através de exemplares impressos reproduzidos em papel ou de forma digital, ou por qualquer outro meio conhecido ou que venha a ser inventado, e de a divulgar através de repositórios científicos e de admitir a sua cópia e distribuição com objectivos educacionais ou de investigação, não comerciais, desde que seja dado crédito ao autor e editor. Aos meus pais. ACKNOWLEDGEMENTS This research was partly supported by project “RtP - Restrict to Plan”, funded by FEDER (Fundo Europeu de Desenvolvimento Regional), through programme COMPETE - POFC (Operacional Factores de Competitividade) with reference 34091. First and foremost, I would like to express my genuine gratitude to my advisor profes- sor Pedro Barahona for the continuous support of my thesis, for his guidance, patience and expertise. His general optimism, enthusiasm and availability to discuss and support new ideas helped and encouraged me to push my work always a little further.
    [Show full text]
  • Admb Package
    Using AD Model Builder and R together: getting started with the R2admb package Ben Bolker March 9, 2020 1 Introduction AD Model Builder (ADMB: http://admb-project.org) is a standalone program, developed by Dave Fournier continuously since the 1980s and re- leased as an open source project in 2007, that takes as input an objective function (typically a negative log-likelihood function) and outputs the co- efficients that minimize the objective function, along with various auxiliary information. AD Model Builder uses automatic differentiation (that's what \AD" stands for), a powerful algorithm for computing the derivatives of a specified objective function efficiently and without the typical errors due to finite differencing. Because of this algorithm, and because the objective function is compiled into machine code before optimization, ADMB can solve large, difficult likelihood problems efficiently. ADMB also has the capability to fit random-effects models (typically via Laplace approximation). To the average R user, however, ADMB represents a challenge. The first (unavoidable) challenge is that the objective function needs to be written in a superset of C++; the second is learning the particular sequence of steps that need to be followed in order to output data in a suitable format for ADMB; compile and run the ADMB model; and read the data into R for analysis. The R2admb package aims to eliminate the second challenge by automating the R{ADMB interface as much as possible. 2 Installation The R2admb package can be installed in R in the standard way (with install.packages() or via a Packages menu, depending on your platform.
    [Show full text]
  • Automated Likelihood Based Inference for Stochastic Volatility Models H
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Institutional Knowledge at Singapore Management University Singapore Management University Institutional Knowledge at Singapore Management University Research Collection School Of Economics School of Economics 11-2009 Automated Likelihood Based Inference for Stochastic Volatility Models H. Skaug Jun YU Singapore Management University, [email protected] Follow this and additional works at: https://ink.library.smu.edu.sg/soe_research Part of the Econometrics Commons Citation Skaug, H. and YU, Jun. Automated Likelihood Based Inference for Stochastic Volatility Models. (2009). 1-28. Research Collection School Of Economics. Available at: https://ink.library.smu.edu.sg/soe_research/1151 This Working Paper is brought to you for free and open access by the School of Economics at Institutional Knowledge at Singapore Management University. It has been accepted for inclusion in Research Collection School Of Economics by an authorized administrator of Institutional Knowledge at Singapore Management University. For more information, please email [email protected]. Automated Likelihood Based Inference for Stochastic Volatility Models Hans J. SKAUG , Jun YU November 2009 Paper No. 15-2009 ANY OPINIONS EXPRESSED ARE THOSE OF THE AUTHOR(S) AND NOT NECESSARILY THOSE OF THE SCHOOL OF ECONOMICS, SMU Automated Likelihood Based Inference for Stochastic Volatility Models¤ Hans J. Skaug,y Jun Yuz October 7, 2008 Abstract: In this paper the Laplace approximation is used to perform classical and Bayesian analyses of univariate and multivariate stochastic volatility (SV) models. We show that imple- mentation of the Laplace approximation is greatly simpli¯ed by the use of a numerical technique known as automatic di®erentiation (AD).
    [Show full text]
  • Multivariate Statistical Methods to Analyse Multidimensional Data in Applied Life Science
    Multivariate statistical methods to analyse multidimensional data in applied life science Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften (Dr. rer. nat.) der Naturwissenschaftlichen Fakult¨atIII Agrar- und Ern¨ahrungswissenschaften, Geowissenschaften und Informatik der Martin-Luther-Universit¨atHalle-Wittenberg vorgelegt von Frau Trutschel (geb. Boronczyk), Diana Geb. am 18.02.1979 in Hohenm¨olsen Gutachter: 1. Prof. Dr. Ivo Grosse, MLU Halle/Saale 2. Dr. Steffen Neumann, IPB alle/Saale 3. Prof. Dr. Andr´eScherag, FSU Jena Datum der Verteidigung: 18.04.2019 I Eidesstattliche Erklärung / Declaration under Oath Ich erkläre an Eides statt, dass ich die Arbeit selbstständig und ohne fremde Hilfe verfasst, keine anderen als die von mir angegebenen Quellen und Hilfsmittel benutzt und die den benutzten Werken wörtlich oder inhaltlich entnommenen Stellen als solche kenntlich gemacht habe. I declare under penalty of perjury that this thesis is my own work entirely and has been written without any help from other people. I used only the sources mentioned and included all the citations correctly both in word or content. __________________________ ____________________________________________ Datum / Date Unterschrift des Antragstellers / Signature of the applicant III This thesis is a cumulative thesis, including five research articles that have previously been published in peer-reviewed international journals. In the following these publications are listed, whereby the first authors are underlined and my name (Trutschel) is marked in bold. 1. Trutschel, Diana and Schmidt, Stephan and Grosse, Ivo and Neumann, Steffen, "Ex- periment design beyond gut feeling: statistical tests and power to detect differential metabolites in mass spectrometry data", Metabolomics, 2015, available at: https:// link.springer.com/content/pdf/10.1007/s11306-014-0742-y.pdf 2.
    [Show full text]
  • Solving the Maximum Weight Independent Set Problem Application to Indirect Hex-Mesh Generation
    Solving the Maximum Weight Independent Set Problem Application to Indirect Hex-Mesh Generation Dissertation presented by Kilian VERHETSEL for obtaining the Master’s degree in Computer Science and Engineering Supervisor(s) Jean-François REMACLE, Amaury JOHNEN Reader(s) Jeanne PELLERIN, Yves DEVILLE , Julien HENDRICKX Academic year 2016-2017 Acknowledgments I would like to thank my supervisor, Jean-François Remacle, for believing in me, my co-supervisor Amaury Johnen, for his encouragements during this project, as well as Jeanne Pellerin, for her patience and helpful advice. 3 4 Contents List of Notations 7 List of Figures 8 List of Tables 10 List of Algorithms 12 1 Introduction 17 1.1 Background on Indirect Mesh Generation . 17 1.2 Outline . 18 2 State of the Art 19 2.1 Exact Resolution . 19 2.1.1 Approaches Based on Graph Coloring . 19 2.1.2 Approaches Based on MaxSAT . 20 2.1.3 Approaches Based on Integer Programming . 21 2.1.4 Parallel Implementations . 21 2.2 Heuristic Approach . 22 3 Hybrid Approach to Combining Tetrahedra into Hexahedra 25 3.1 Computation of the Incompatibility Graph . 25 3.2 Exact Resolution for Small Graphs . 28 3.2.1 Complete Search . 28 3.2.2 Branch and Bound . 29 3.2.3 Upper Bounds Based on Clique Partitions . 31 3.2.4 Upper Bounds based on Linear Programming . 32 3.3 Construction of an Initial Solution . 35 3.3.1 Local Search Algorithms . 35 3.3.2 Local Search Formulation of the Maximum Weight Independent Set . 36 3.3.3 Strategies to Escape Local Optima .
    [Show full text]
  • Cognicrypt: Supporting Developers in Using Cryptography
    CogniCrypt: Supporting Developers in using Cryptography Stefan Krüger∗, Sarah Nadiy, Michael Reifz, Karim Aliy, Mira Meziniz, Eric Bodden∗, Florian Göpfertz, Felix Güntherz, Christian Weinertz, Daniel Demmlerz, Ram Kamathz ∗Paderborn University, {fistname.lastname}@uni-paderborn.de yUniversity of Alberta, {nadi, karim.ali}@ualberta.ca zTechnische Universität Darmstadt, {reif, mezini, guenther, weinert, demmler}@cs.tu-darmstadt.de, [email protected], [email protected] Abstract—Previous research suggests that developers often reside on the low level of cryptographic algorithms instead of struggle using low-level cryptographic APIs and, as a result, being more functionality-oriented APIs that provide convenient produce insecure code. When asked, developers desire, among methods such as encryptFile(). When it comes to tool other things, more tool support to help them use such APIs. In this paper, we present CogniCrypt, a tool that supports support, participants suggested tools like a CryptoDebugger, developers with the use of cryptographic APIs. CogniCrypt analysis tools that find misuses and provide code templates assists the developer in two ways. First, for a number of common or generate code for common functionality. These suggestions cryptographic tasks, CogniCrypt generates code that implements indicate that participants not only lack the domain knowledge, the respective task in a secure manner. Currently, CogniCrypt but also struggle with APIs themselves and how to use them. supports tasks such as data encryption, communication over secure channels, and long-term archiving. Second, CogniCrypt In this paper, we present CogniCrypt, an Eclipse plugin that continuously runs static analyses in the background to ensure enables easier use of cryptographic APIs. In previous work, a secure integration of the generated code into the developer’s we outlined an early vision for CogniCrypt [2].
    [Show full text]
  • Management and Analysis of Wildlife Biology Data Bret A
    2 Management and Analysis of Wildlife Biology Data bret a. collier and thomas w. schwertner INTRODUCTION HE GENERAL FOCUS in this chapter is on outlining the range of op- tions available for the management and analysis of data collected during wildlife biology studies. Topics presented are inherently tied to data analy- Tsis, including study design, data collection, storage, and management, as well as graphical and tabular displays best suited for data summary, interpretation, and analysis. The target audience is upper-level undergraduate or masters-level gradu- ate students studying wildlife ecology and management. Statistical theory will only be presented in situations where it would be beneficial for the reader to understand the motivation behind a particular technique (e.g., maximum likelihood estima- tion) versus attempting to provide in-depth detail regarding all potential assump- tions, methods of interpretation, or ways of violating assumptions. In addition, po- tential approaches that are currently available and are rapidly advancing (e.g., applications of mixed models or Bayesian modeling using Markov Chain Monte Carlo methods) are presented. Also provided are appropriate references and pro- grams for the interested reader to explore further these ideas. The previous authors (Bart and Notz 2005) of this chapter suggested that before collecting data, visit with a professional statistician. Use the consultation with a professional to serve as a time to discuss the objectives of the research study, out- line a general study design, discuss the proposed methods to be implemented, and identify potential problems with initial thoughts on sampling design, as well as to determine what statistical applications to use and how any issues related to sam- pling will translate to issues with statistical analysis.
    [Show full text]
  • On Desktop Application for Some Selected Methods in Ordinary Differential Equations and Optimization Techniques
    International Journal of Mathematics and Statistics Invention (IJMSI) E-ISSN: 2321- 4767 P-ISSN: 2321-4759 www.ijmsi.org Volume 8 Issue 9 || September, 2020 || PP 20-34 On Desktop Application for Some Selected Methods in Ordinary Differential Equations and Optimization Techniques 1Buah, A. M.,2Alhassan, S. B. and3Boahene,B. M. 1 Department of Mathematical Sciences, University of Mines and Technology, Tarkwa, Ghana 2 Department of Mathematical Sciences, University of Mines and Technology, Ghana 3 Department of Mathematical Sciences, University of Mines and Technology, Ghana ABSTRACT-Numerical methods are algorithms used to give approximate solutions to mathematical problems. For many years, problems solved numerically were done manually, until the software was developed to solve such problems. The problem with some of this softwareis that, though some are free, the environment they offer is not user friendly. With others, it would cost you a fortune to get access to them and some suffer execution time and overall speed. In this research, a desktop application with a friendly user interface, for solving some selected methods in numerical methods for Ordinary Differential Equations (ODEs) and sequential search methods in optimization techniques has been developed using Visual C++. The application can determine the error associated with some of the methods in ODEs if the user supplies the software with the exact solution. The application can also solve optimization problems (minimization problems) and return the minimum function value and the point at which it occurs. KEYWORDS:ODEs, optimization, desktop application ----------------------------------------------------------------------------------------------------------------------------- ---------- Date of Submission: 11-11-2020 Date of Acceptance: 27-11-2020 ------------------------------------------------------------------------------------------------------------------------ --------------- I. INTRODUCTION A Differential Equation (DE) is an equation involving a function and its derivatives [1].
    [Show full text]
  • Analysis and Processing of Cryptographic Protocols
    Analysis and Processing of Cryptographic Protocols Submitted in partial fulfilment of the requirements of the degree of Bachelor of Science (Honours) of Rhodes University Bradley Cowie Grahamstown, South Africa November 2009 Abstract The field of Information Security and the sub-field of Cryptographic Protocols are both vast and continually evolving fields. The use of cryptographic protocols as a means to provide security to web servers and services at the transport layer, by providing both en- cryption and authentication to data transfer, has become increasingly popular. However, it is noted that it is rather difficult to perform legitimate analysis, intrusion detection and debugging on data that has passed through a cryptographic protocol as it is encrypted. The aim of this thesis is to design a framework, named Project Bellerophon, that is capa- ble of decrypting traffic that has been encrypted by an arbitrary cryptographic protocol. Once the plain-text has been retrieved further analysis may take place. To aid in this an in depth investigation of the TLS protocol was undertaken. This pro- duced a detailed document considering the message structures and the related fields con- tained within these messages which are involved in the TLS handshake process. Detailed examples explaining the processes that are involved in obtaining and generating the var- ious cryptographic components were explored. A systems design was proposed, considering the role of each of the components required in order to produce an accurate decryption of traffic encrypted by a cryptographic protocol. Investigations into the accuracy and the efficiency of Project Bellerophon to decrypt specific test data were conducted.
    [Show full text]
  • MULTIVALUED SUBSETS UNDER INFORMATION THEORY Indraneel Dabhade Clemson University, [email protected]
    Clemson University TigerPrints All Theses Theses 8-2011 MULTIVALUED SUBSETS UNDER INFORMATION THEORY Indraneel Dabhade Clemson University, [email protected] Follow this and additional works at: https://tigerprints.clemson.edu/all_theses Part of the Applied Mathematics Commons Recommended Citation Dabhade, Indraneel, "MULTIVALUED SUBSETS UNDER INFORMATION THEORY" (2011). All Theses. 1155. https://tigerprints.clemson.edu/all_theses/1155 This Thesis is brought to you for free and open access by the Theses at TigerPrints. It has been accepted for inclusion in All Theses by an authorized administrator of TigerPrints. For more information, please contact [email protected]. MULTIVALUED SUBSETS UNDER INFORMATION THEORY _______________________________________________________ A Thesis Presented to the Graduate School of Clemson University _______________________________________________________ In Partial Fulfillment of the Requirements for the Degree Master of Science Industrial Engineering _______________________________________________________ by Indraneel Chandrasen Dabhade August 2011 _______________________________________________________ Accepted by: Dr. Mary Beth Kurz, Committee Chair Dr. Anand Gramopadhye Dr. Scott Shappell i ABSTRACT In the fields of finance, engineering and varied sciences, Data Mining/ Machine Learning has held an eminent position in predictive analysis. Complex algorithms and adaptive decision models have contributed towards streamlining directed research as well as improve on the accuracies in forecasting. Researchers in the fields of mathematics and computer science have made significant contributions towards the development of this field. Classification based modeling, which holds a significant position amongst the different rule-based algorithms, is one of the most widely used decision making tools. The decision tree has a place of profound significance in classification-based modeling. A number of heuristics have been developed over the years to prune the decision making process.
    [Show full text]
  • The Maritime Pickup and Delivery Problem with Cost and Time Window Constraints: System Modeling and A* Based Solution
    The Maritime Pickup and Delivery Problem with Cost and Time Window Constraints: System Modeling and A* Based Solution CHRISTOPHER DAMBAKK SUPERVISOR Associate Professor Lei Jiao University of Agder, 2019 Faculty of Engineering and Science Department of ICT Abstract In the ship chartering business, more and more shipment orders are based on pickup and delivery in an on-demand manner rather than conventional scheduled routines. In this situation, it is nec- essary to estimate and compare the cost of shipments in order to determine the cheapest one for a certain order. For now, these cal- culations are based on static, empirical estimates and simplifications, and do not reflect the complexity of the real world. In this thesis, we study the Maritime Pickup and Delivery Problem with Cost and Time Window Constraints. We first formulate the problem mathe- matically, which is conjectured NP-hard. Thereafter, we propose an A* based prototype which finds the optimal solution with complexity O(bd). We compare the prototype with a dynamic programming ap- proach and simulation results show that both algorithms find global optimal and that A* finds the solution more efficiently, traversing fewer nodes and edges. iii Preface This thesis concludes the master's education in Communication and Information Technology (ICT), at the University of Agder, Nor- way. Several people have supported and contributed to the completion of this project. I want to thank in particular my supervisor, Associate Professor Lei Jiao. He has provided excellent guidance and refreshing perspectives when the tasks ahead were challenging. I would also like to thank Jayson Mackie, co-worker and friend, for proofreading my report.
    [Show full text]
  • Author Guidelines for 8
    Proceedings of the 53rd Hawaii International Conference on System Sciences | 2020 Easy and Efficient Hyperparameter Optimization to Address Some Artificial Intelligence “ilities” Trevor J. Bihl Joseph Schoenbeck Daniel Steeneck & Jeremy Jordan Air Force Research Laboratory, USA The Perduco Group, USA Air Force Institute of Technology, USA [email protected] [email protected] {Daniel.Steeneck; Jeremy.Jordan}@afit.edu hyperparameters. This is a complex trade space due to Abstract ML methods being brittle and not robust to conditions outside of those on which they were trained. While Artificial Intelligence (AI), has many benefits, attention is now given to hyperparameter selection [4] including the ability to find complex patterns, [5], in general, as mentioned in Mendenhall [6], there automation, and meaning making. Through these are “no hard-and-fast rules” in their selection. In fact, benefits, AI has revolutionized image processing their selection is part of the “art of [algorithm] design” among numerous other disciplines. AI further has the [6], as appropriate hyperparameters can depend potential to revolutionize other domains; however, this heavily on the data under consideration itself. Thus, will not happen until we can address the “ilities”: ML methods themselves are often hand-crafted and repeatability, explain-ability, reliability, use-ability, require significant expertise and talent to appropriately train and deploy. trust-ability, etc. Notably, many problems with the “ilities” are due to the artistic nature of AI algorithm development, especially hyperparameter determination. AI algorithms are often crafted products with the hyperparameters learned experientially. As such, when applying the same Accuracy w w w w algorithm to new problems, the algorithm may not w* w* w* w* w* perform due to inappropriate settings.
    [Show full text]