Delivering a Machine Learning Course on HPC Resources


Stefano Bagnasco, Federica Legger, Sara Vallero
This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement LHCBIGDATA No 799062

The course
● Title: Big Data Science and Machine Learning
● Graduate Program in Physics at the University of Torino
● Academic year 2018-2019:
○ Starts in 2 weeks
○ 2 CFU, 10 hours (theory + hands-on)
○ 7 registered students
● Academic year 2019-2020:
○ March 2020
○ 4 CFU, 16 hours (theory + hands-on)
○ Already 2 registered students

The programme
● Introduction to big data science
○ The big data pipeline: state-of-the-art tools and technologies
● ML and DL methods:
○ supervised and unsupervised models
○ neural networks
● Introduction to computer architecture and parallel computing patterns
○ Initiation to OpenMP and MPI (2019-2020)
● Parallelisation of ML algorithms on distributed resources
○ ML applications on distributed architectures
○ Beyond CPUs: GPUs, FPGAs (2019-2020)

The aim
● Applied ML course:
○ Many courses on advanced statistical methods are available elsewhere
○ Focus on hands-on sessions
● Students will
○ Familiarise with:
■ ML methods and libraries
■ Analysis tools
■ Collaborative models
■ Container and cloud technologies
○ Learn how to:
■ Optimise ML models
■ Tune distributed training
■ Work with the available resources

Hands-on
● Python with Jupyter notebooks
● Prerequisites: some familiarity with numpy and pandas
● ML libraries:
○ Day 2: MLlib
■ Gradient Boosted Trees (GBT)
■ Multilayer Perceptron Classifier (MPC)
○ Day 3: Keras
■ Sequential model (a minimal sketch is shown after this block of slides)
○ Day 4: BigDL
■ Sequential model
● Coming: CUDA, MPI, OpenMP

ML input dataset for the hands-on
● Open HEP dataset @ UCI, 7 GB (.csv): https://archive.ics.uci.edu/ml/datasets/HIGGS
● Signal (heavy Higgs) + background (ttbar)
● 10M MC events (balanced, 50%:50%)
○ 21 low-level features
■ pT's, angles, MET, b-tag, ...
○ 7 high-level features
■ Invariant masses (m(jj), m(jjj), ...)
● Reference: Baldi, Sadowski, and Whiteson, "Searching for Exotic Particles in High-Energy Physics with Deep Learning", Nature Communications 5

Infrastructure: requirements
● Commodity hardware (CPUs)
● Non-dedicated and heterogeneous resources:
○ Bare metal
■ 1 x 24 cores, 190 GB RAM
■ 4 x 28 cores, 260 GB RAM
○ IaaS Cloud (on premises)
■ 10 VMs, 8 cores, 70 GB RAM
● Uniform application/service orchestration layer -> Kubernetes
● High-throughput vs. high-performance -> Spark
● Distributed datasets -> HDFS
● Elasticity: allow scaling up when there are unused resources

What about HPCs?
● HPC = high-performance processors + low-latency interconnect
● HPC clusters are typically managed with a batch system
● The OCCAM HPC facility at the University of Torino employs a cloud-like management strategy coupled with lightweight virtualisation -> OCCAM facility
○ https://c3s.unito.it/index.php/super-computer

The OCCAM supercomputer
● APPLICATION, defined by:
○ Runtime environment
○ Resource requirements
○ Execution model
● VIRTUALIZATION: the pivotal technologies for the middleware architecture are Linux containers, currently managed with Docker
○ Cloud-like: package, ship and run distributed application components with guaranteed platform parity across different environments
○ Democratizing virtualization by providing it to developers in a usable, application-focused form
● COMPUTING MODEL, supported workload types:
○ HPC cluster: batch-like, multi-node workloads using MPI and inter-node communication
○ VIRTUAL WORKSTATION: code execution (e.g. R or ROOT) in a single multicore node, possibly with GPU acceleration
○ PIPELINES: multi-step data analysis requiring high-memory, large single-image nodes
○ JUPYTER-HUB: on demand, with Spark backend for ML and Big Data workloads, autoscaling
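As a pointer for the Day 3 hands-on mentioned above, the following is a minimal sketch (not the course's actual notebook) of training a small Keras Sequential classifier on the UCI HIGGS CSV. The column layout follows the UCI dataset description (label first, then 21 low-level + 7 high-level features); the column names, the 1M-event subsample and the training hyperparameters are illustrative assumptions.

```python
# Minimal sketch of a Day 3-style exercise (not the course notebook): train a small
# Keras Sequential classifier on the UCI HIGGS CSV.
# Assumptions: UCI column layout (label first, then 28 features); column names,
# the 1M-event subsample and the hyperparameters below are illustrative.
import pandas as pd
from tensorflow import keras

N_FEATURES = 28
cols = ["label"] + [f"f{i}" for i in range(N_FEATURES)]
df = pd.read_csv("HIGGS.csv", header=None, names=cols, nrows=1_000_000)

X = df[cols[1:]].to_numpy()
y = df["label"].to_numpy()

model = keras.Sequential([
    keras.layers.Input(shape=(N_FEATURES,)),
    keras.layers.Dense(100, activation="relu"),   # 1 hidden layer, 100 units, as in the results table
    keras.layers.Dense(1, activation="sigmoid"),  # binary signal-vs-background output
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[keras.metrics.AUC(name="auc")])
model.fit(X, y, batch_size=1024, epochs=5, validation_split=0.1)
```

The single hidden layer with 100 units mirrors the Keras configuration reported in the lessons-learned table later in the slides; subsampling with nrows keeps the exercise feasible on a single node, which is exactly the limitation the distributed-training part of the course addresses.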
Infrastructure: architecture
● OAuth login
● Total resources: 216 CPU cores, 1.9 TB memory, 2.3 TB HDFS storage, 1 Gbps network
● Spark drivers and executors run as containers on the Kubernetes workers, spanning high-class bare-metal hardware, lower-class bare-metal hardware and virtual machines
● HDFS holds the datasets
● A Kubernetes control plane manages the cluster

Infrastructure: elasticity
● The Spark driver continuously scales up to reach the requested number of executors
● No static quotas are enforced, but a minimum number of executors is granted to each tenant
● Custom Kubernetes Operator (alpha version), the Farm Operator:
○ lets tenants occupy all available resources in a FIFO manner
○ undeploys exceeding executors only to grant the minimum number of resources to all registered tenants
(A configuration sketch for requesting executors on Kubernetes from Spark is shown after these slides.)

Scaling tests
● Scanned:
○ #cores per executor
○ #cores per machine
○ #cores in a homogeneous cluster
● Models: BigDL NN, MLlib GBT, MLlib MPC, compared against perfect scaling
● Strong scaling efficiency = time(1)/(N * time(N)), with N = #cores

ML models and lessons learned

Model | AUC | Time | # events | Cores | Note
MLlib GBT | 82 | 15m | 10M | 25 | Doesn't scale
MLlib MPC (4 layers, 30 hidden units) | 74 | 9m | 10M | 25 | Scales well, but can't build complex models
Keras Sequential (1 layer, 100 hidden units) | 81 | 18m | 1M | 25 | No distributed training, cannot process 10M events
BigDL Sequential (2 layers, 300 hidden units) | 86 | 3h15m | 10M | 88 | 1 core/executor required

Summary
● Applied ML course for Ph.D. students focusing on distributed training of ML models
● The infrastructure runs on 'opportunistic' resources
● The architecture can be 'reused' on OCCAM

Spares

Farm Kube Operator
https://github.com/svallero/farmcontroller
● The Spark driver deploys executor Pods with a given namespace/label/name (let's call this triad a selector)
● But a Pod is not a scalable Kubernetes resource (a Deployment is)
● The Farm Operator implements two Custom Resource Definitions (CRDs), each with its own Controller:
○ Farm resource
○ FarmManager resource
● The Farm Operator can be applied to any other app (farm type) with similar features
● CAVEAT: the farm app should be resilient to the live removal of executors (e.g. Spark, HTCondor)
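Referenced in the elasticity slide above, the following is a minimal PySpark sketch of how a driver can request a pool of executors from a Kubernetes cluster. The master URL, container image, namespace and resource figures are illustrative placeholders, not the actual configuration of the course infrastructure; the executor settings only echo the orders of magnitude discussed in the slides.

```python
# Minimal sketch (illustrative values only) of a PySpark driver asking Kubernetes
# for executors, in the spirit of the setup described in the slides.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("higgs-hands-on")
    # Placeholder master URL and image; the real deployment details are not in the slides.
    .master("k8s://https://kubernetes.default.svc:443")
    .config("spark.kubernetes.container.image", "example.org/spark-py:latest")
    .config("spark.kubernetes.namespace", "ml-course")
    .config("spark.executor.instances", "10")  # requested executors; the Farm Operator may grant fewer
    .config("spark.executor.cores", "5")       # ~5 cores/executor, in line with the scaling tests
    .config("spark.executor.memory", "8g")
    .getOrCreate()
)

# Datasets are served from HDFS in the course infrastructure.
df = spark.read.csv("hdfs:///datasets/HIGGS.csv", inferSchema=True)
print(df.count())
spark.stop()
```

The driver keeps trying to reach the requested number of executor instances, which is exactly the behaviour the custom operator exploits to redistribute idle resources among tenants.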
Farm Kube Operator (continued)
● Farm resource:
○ Collects Pods with a given selector
○ Implements scale-down
○ Defines a minimum number of executors (quota)
○ Reconciles on selected Pod events
● FarmManager resource:
○ Reconciles on Farm events
○ Scales down Farms over quota only if some other Farm requests resources and is below its own quota
○ Simple algorithm: the number of killed pods per Farm is proportional to the number of Pods over the quota (should be improved; a schematic sketch follows after the slides)

OCCAM HPC facility at the University of Turin
● Managed using container-based, cloud-like technologies
● Computing applications run on Virtual Clusters deployed on top of the physical infrastructure

OCCAM specs
● 2 Management nodes
○ CPU - 2x Intel Xeon E5-2640 v3, 8 cores, 2.6 GHz
○ RAM - 64 GB / 2133 MHz
○ DISK - 2x 1 TB HDD, RAID0
○ NET - IB 56 Gb + 2x 10 Gb + 4x 1 Gb
○ FORM FACTOR - 1U
● 4 Fat nodes
○ CPU - 4x Intel Xeon E7-4830 v3, 12 cores, 2.1 GHz
○ RAM - 768 GB / 1666 MHz (48x 16 GB) DDR4
○ DISK - 1x 800 GB SSD + 1x 2 TB HDD 7200 rpm
○ NET - IB 56 Gb + 2x 10 Gb
● 4 GPU nodes
○ CPU - 2x Intel Xeon E5-2680 v3, 12 cores, 2.5 GHz
○ RAM - 128 GB / 2133 MHz (8x 16 GB) DDR4
○ DISK - 1x 800 GB SSD, SAS 6 Gbps, 2.5"
○ NET - IB 56 Gb + 2x 10 Gb
○ GPU - 2x NVIDIA K40 on PCI-E Gen3 x16
● 32 Light nodes
○ CPU - 2x Intel Xeon E5-2680 v3, 12 cores, 2.5 GHz
○ RAM - 128 GB / 2133 MHz (8x 16 GB)
○ DISK - 400 GB SSD, SATA, 1.8"
○ NET - IB 56 Gb + 2x 10 Gb
○ FORM FACTOR - high density (4 nodes per RU)

Scaling tests #1
● Optimise #cores per executor
● Model: MLlib MPC and GBT, 1M events
● One machine (t2-mlwn-01.to.infn.it)
● In the 'literature', #cores = 5 is the magic number to achieve maximum HDFS throughput
● Result: #cores = 5 per executor is optimal; GBT does not scale well, as expected since GBT training is hard to parallelise

Scaling tests #2
● Optimise #executors
● #cores/executor = 5
● Model: MLlib MPC, 1M and 10M events
● One machine

Scaling tests #3
● Scaling on homogeneous resources
○ bare metal, 4 machines with 56 cores and 260 GB RAM each

ML models and lessons learned (2)
(Plots: GBT fast, MPC, GBT slow, Keras Sequential.)
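As noted in the Farm Kube Operator (continued) slide, here is a schematic Python sketch of the proportional scale-down rule, not the actual controller code from the farmcontroller repository: when a Farm below quota requests executors, pods are reclaimed from over-quota Farms in proportion to how far each one exceeds its quota. Function and variable names are illustrative.

```python
# Schematic sketch (not the operator's actual code) of the FarmManager scale-down rule:
# pods are reclaimed from over-quota Farms, proportionally to their excess over quota.
def pods_to_kill(farms, needed):
    """farms: dict name -> (running_pods, quota); returns dict name -> pods to remove."""
    excess = {name: max(running - quota, 0) for name, (running, quota) in farms.items()}
    total_excess = sum(excess.values())
    if total_excess == 0:
        return {name: 0 for name in farms}
    reclaim = min(needed, total_excess)  # never push a Farm below its own quota
    return {name: round(reclaim * exc / total_excess) for name, exc in excess.items()}

# Example: tenant C is below quota and asks for 4 executors.
print(pods_to_kill({"A": (20, 10), "B": (15, 10), "C": (2, 10)}, needed=4))
# -> roughly {'A': 3, 'B': 1, 'C': 0}
```

Integer rounding can make the reclaimed total drift by a pod in corner cases, one illustration of why such a simple apportionment scheme leaves room for improvement.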