DEEP-HybridDataCloud ASSESSMENT OF AVAILABLE TECHNOLOGIES FOR SUPPORTING ACCELERATORS AND HPC, INITIAL DESIGN AND IMPLEMENTATION PLAN

DELIVERABLE: D4.1

Document identifier: DEEP-JRA1-D4.1
Date: 29/04/2018
Activity: WP4
Lead partner: IISAS
Status: FINAL
Dissemination level: PUBLIC
Permalink: http://hdl.handle.net/10261/164313

Abstract This document describes the state of the art of technologies for supporting bare-metal, accelerators and HPC in cloud and proposes an initial implementation plan. Available technologies will be analyzed from different points of views: stand-alone use, integration with cloud middleware, support for accelerators and HPC platforms. Based on results of these analyses, an initial implementation plan will be proposed containing information on what features should be developed and what components should be improved in the next period of the project.

Copyright Notice
Copyright © Members of the DEEP-HybridDataCloud Collaboration, 2017-2020.

Delivery Slip
From: Viet Tran (IISAS / JRA1), 25/04/2018
Reviewed by: Marcin Plociennik (PSNC), 20/04/2018; Cristina Duma Aiftimiei (INFN), 25/04/2018; Zdeněk Šustr (CESNET), 25/04/2018

Approved by Steering Committee 30/04/2018

Document Log
Issue | Date | Comment | Author/Partner
TOC | 17/01/2018 | Table of Contents | Viet Tran / IISAS
0.01 | 06/02/2018 | Writing assignment | Viet Tran / IISAS
0.99 | 10/04/2018 | Partner contributions | WP members
1.0 | 19/04/2018 | Version for first review | Viet Tran / IISAS
1.1 | 22/04/2018 | Updated version according to recommendations from first review | Viet Tran / IISAS
2.0 | 24/04/2018 | Version for second review | Viet Tran / IISAS
2.1 | 27/04/2018 | Updated version according to recommendations from second review | Viet Tran / IISAS
3.0 | 29/04/2018 | Final version | Viet Tran / IISAS

Table of Contents
Executive Summary
1. Introduction
2. Available technologies for obtaining bare-metal like performance
2.1. Paravirtualization technologies
2.1.1. SR-IOV and PCI Passthrough
2.1.2. virtio
2.2. Container technologies
2.2.1. Docker
2.2.2. udocker
2.2.3. Linux containers (LXC/LXD)
2.2.4. Singularity
2.2.5. Other available technologies
2.3. Comparison of technologies
3. Support for paravirtualization and containers in cloud middleware
3.1. OpenStack
3.1.1. OpenStack Heat
3.1.2. OpenStack Magnum
3.1.3. OpenStack nova-lxd
3.2. OpenNebula
3.2.1. OpenNebula LXDoNe
3.2.2. ONEDock
3.3. Kubernetes
3.4. Apache Mesos + Marathon + Chronos
4. Support for Accelerated computing
4.1. Accelerators and Deep Learning
4.1.1. Types of accelerators
4.1.2. Using accelerators in Deep Learning
4.2. Support for accelerators at hypervisor/container level
4.2.1. PCI passthrough and SR-IOV
4.2.2. GPU-specific virtualization methods
4.2.3. Device mapping (passthrough) for LXD/Docker
4.2.4. nvidia-docker runtime
4.2.5. Comparison of approaches for supporting accelerators
4.3. Support for accelerators at cloud middleware level
4.3.1. OpenStack
4.3.2. OpenNebula
4.3.3. Kubernetes
4.3.4. Apache Mesos
5. Interaction with HPC resources using PaaS approach
5.1. Specifics of HPC systems
5.2. Using containers in HPC
5.3. Local schedulers
5.4. PaaS Interaction and interfaces
6. Initial implementation plan
6.1. Task structure and coordination activities
6.2. Coordination with other WPs
6.2.1. Coordination with WP5
6.2.2. Coordination with WP3
6.3. Initial implementation plan
6.3.1. Improving support for containers in cloud middleware
6.3.2. Improving support for accelerators in cloud middleware
6.3.3. Interaction with HPC resources
6.4. Risk assessments
7. Conclusion
8. List of Figures
9. List of tables
10. Acronyms
11. References and links
11.1. References
11.2. Links

Executive Summary
The DEEP-HybridDataCloud (Designing and Enabling E-Infrastructures for intensive data Processing in a Hybrid DataCloud) project was approved in July 2017 within the EINFRA-21-2017 call of the Horizon 2020 framework programme of the European Commission. It will develop innovative services to support intensive computing techniques that require specialized HPC hardware, such as GPUs or low-latency interconnects, to explore very large datasets.
This document describes the state of the art of technologies for supporting bare-metal, accelerators and HPC in cloud. Based on the analysis, an initial implementation plan for improving the support of accelerators and HPC in DEEP-HybridDataCloud is proposed.
There are two main approaches to reducing the overhead of virtualization layers and reaching bare-metal like performance in cloud computing: 1) using paravirtualization technologies in hypervisors and 2) replacing hypervisors with container technologies. Although paravirtualization technologies in hypervisors can improve performance, they are still cumbersome to deploy and use. Container technologies, on the other hand, have been improving rapidly in recent years and have proved themselves a viable alternative for application delivery and even for full virtualization in cloud computing. There are many different implementations of container technologies, and also many cloud platforms supporting them, from traditional Cloud management frameworks like OpenStack to container-centric management environments like Kubernetes.
Regarding support for accelerators, paravirtualization techniques have several severe limitations, while GPU virtualization is still not generally available on common hardware. On the other hand, recent developments in GPU runtime support, such as nvidia-docker, have greatly improved the usability, portability and stability of container technologies. Therefore the use of container technologies will be the approach chosen in the DEEP-HybridDataCloud project. DEEP-HybridDataCloud has selected two implementations to be further improved: nova-lxd, as a replacement of full Cloud hypervisors, and udocker, as the container technology to be used in the case of HPC platforms.
The HPC computing model differs from the Cloud computing model in many aspects: architecture, software installation, queuing system, user access and so on. It is therefore necessary to work out a solution that allows high-level services to use HPC resources efficiently. There are two possible approaches to making HPC resources available for PaaS services: 1) to access an HPC management system and submit batch jobs, and 2) to allow HPC nodes to be managed by a Cloud management system. These approaches are not mutually exclusive, and neither of them can be chosen as the best solution, as this strongly depends on local HPC conditions and policies. The work on integration with HPC platforms will be tightly coordinated with WP5, which addresses the PaaS layer.

1. Introduction
The DEEP-HybridDataCloud (Designing and Enabling E-Infrastructures for intensive data Processing in a Hybrid DataCloud) project was approved in July 2017 within the EINFRA-21-2017 call of the Horizon 2020 framework programme of the European Commission. It will develop innovative services to support intensive computing techniques that require bare-metal performance, specialized accelerators such as GPUs, and HPC platforms in the cloud, to explore very large datasets.
The objective of WP4 of the DEEP-HybridDataCloud project is to fulfill the requirements mentioned above by working as closely as possible with hardware resources, exploiting the full potential of the computational performance provided by the hardware, including accelerators, low-latency interconnects and HPC platforms. This work package will cooperate tightly with WP5 "High Level Hybrid Cloud solutions", where the resources provided in this work package will be managed and accessed via the PaaS Orchestration service provided by WP5.
This document describes the state of the art of technologies for supporting bare-metal, accelerators and HPC in cloud, together with an initial implementation plan for WP4. Bare-metal performance can be achieved either by using paravirtualization technologies or by using container technologies. The latter have been emerging as a viable alternative for application delivery and even as a replacement for traditional full virtualization in cloud computing. Available technologies will be analyzed from different points of view: stand-alone use, integration with cloud middleware, and support for accelerators and HPC platforms. Based on the results of these analyses, an initial implementation plan will be proposed containing information on what features should be developed and what components should be improved in the next period of the project.

2. Available technologies for obtaining bare-metal like performance
Virtualization is one of the key technologies enabling cloud computing. With the introduction of hardware-assisted virtualization in common processors (Intel VT-x and AMD-V), it is possible to create a fully simulated hardware environment (virtual machine) for executing an unmodified guest operating system in complete isolation. This is the basis of "Infrastructure as a Service" (IaaS) cloud computing.
Full virtualization requires explicit support in hardware components to perform efficiently. Such support is commonly available in modern processors, but not in other devices like network and graphics cards. As a result, there are high additional overheads for accessing these devices, for example network latency, which is critical for high performance computing. There are two main approaches to reducing the overhead of virtualization layers and reaching bare-metal like performance in cloud computing: 1) using paravirtualization technologies in hypervisors and 2) replacing hypervisors with container technologies.
The paravirtualization approach allows virtual machines to access devices directly, bypassing hypervisor layers (PCI passthrough, SR-IOV) or via specially modified drivers (virtio). This can reduce the overheads of virtualization, but the bypassing also introduces certain flaws in the virtualization layers. Container technologies, on the other hand, rely on isolation provided by the operating system, e.g. namespaces in Linux, instead of hardware virtualization. As a result, container technologies eliminate the overheads caused by hardware virtualization; however, the containers must share the kernel with the host operating system. Container technologies therefore provide a lower level of isolation than full virtualization, and the containers must use the same operating system as the host. More details about the technologies and their comparison are provided in the following subsections.

2.1. Paravirtualization technologies

2.1.1. SR-IOV and PCI Passthrough
PCI Passthrough is a technology that allows assigning direct access to a PCI/PCIe device to a virtual machine guest. This removes the virtualization overhead associated with copying data between a virtual and a real device driver. Since GPUs are nowadays usually PCIe devices, PCI Passthrough technology allows the virtualization of GPU access. PCI Passthrough is supported in both the KVM [KVMSRIOV] and Xen [XENSRIOV] hypervisors (and the derived commercial products by Red Hat and Citrix), as well as in numerous other Level 1 and Level 2 hypervisors.
PCI Passthrough is enabled by the PCI-SIG Single Root I/O Virtualization standard [PCISIGa] (SR-IOV). This standard provides a consistent way to bypass the VM host's involvement in data movement by providing each virtual machine with its own memory space, DMA channels and interrupts. As a result, a real PCI device will appear in the device tree as several devices, each assignable to a different virtual guest. The original PCI device is called a Physical Function (PF) and is discovered, configured and managed as a normal PCI device. The PF is also the point of management of the Virtual Functions (VF) – the virtual representations of the device, with a more restricted set of functionality. VFs need to be supported by the hardware of the PCI device, so the number of VFs one PF can configure is limited. The relationship between a physical PCI device, its PF and VFs, the host and guests is shown in Figure 1. Unlike direct assignment of hardware to a virtual machine in PCI Passthrough, using SR-IOV's VFs provides a level of flexibility and the option to assign the device to more than one virtual machine.
Hardware support for the SR-IOV standard is called VT-d [VTD] by Intel and IOMMU by AMD. It is usually necessary to enable these features in the BIOS of the computer on which the SR-IOV virtualization capabilities are to be used. On Linux, the SR-IOV standard and its configuration capabilities are supported by a kernel module. As the VFs are configured anew at each boot, the configuration of a virtual PCI device may not be persistent; for example, a NIC VF will have a different MAC address assigned after each reboot.

Figure 1. The illustration of SR-IOV relationships between a physical PCI device, its PF and VFs and the host and guests in a virtualization environment
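As a concrete illustration of the kernel-level configuration described above, the following sketch shows how the number of VFs exposed by a PF can be set through the Linux sysfs interface. The interface name and VF count are hypothetical examples; the sriov_totalvfs and sriov_numvfs attributes are the standard kernel interface, but their availability depends on the NIC driver and on VT-d/IOMMU being enabled in the BIOS.

```python
"""Minimal sketch: enable SR-IOV Virtual Functions on a PF via sysfs.

Assumes a Linux host with IOMMU/VT-d enabled and an SR-IOV capable NIC;
the interface name below is only an example and the script must run as root.
"""
import os

PF_IFACE = "enp2s0f0"                       # hypothetical PF interface name
SYSFS = "/sys/class/net/{}/device".format(PF_IFACE)


def enable_vfs(num_vfs):
    # Maximum number of VFs supported by the hardware of this PF.
    with open(os.path.join(SYSFS, "sriov_totalvfs")) as f:
        total = int(f.read())
    if num_vfs > total:
        raise ValueError("PF supports at most %d VFs" % total)
    # Some drivers require resetting the count to 0 before changing it.
    for value in (0, num_vfs):
        with open(os.path.join(SYSFS, "sriov_numvfs"), "w") as f:
            f.write(str(value))


if __name__ == "__main__":
    enable_vfs(4)   # the resulting VFs appear as new PCI devices (virtfn0..3)
```

Since VFs are recreated at every boot, such a step would typically be automated (e.g. via a udev rule or a boot-time service) on compute nodes that expose VFs to guests.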

2.1.2. virtio
Virtio [VIRTIO] is a paravirtualization method for virtualizing access mainly to disks (block devices) and network interfaces. It is architecturally similar to Xen's paravirtualized drivers – the disk or NIC driver in the guest machine is aware of being virtualized and cooperates with the hypervisor. It is implemented as an abstraction layer over devices, providing a standardized interface through which a guest OS can access simplified versions of block devices or NIC adapters. Virtio is essentially a set of virtual devices and their drivers. It was devised by Rusty Russell as part of Lguest [LGUEST], a Linux virtualization hypervisor included in the Linux kernel from version 2.6.23 (October 2007) to version 4.14 (November 2017). Lguest is capable of virtualizing only 32-bit kernels, though an experimental version of Lguest64 [LGUEST64] exists.
A virtual device provided by virtio appears to the guest OS as a standard PCI device with Vendor ID 0x1AF4 and Device ID between 0x1000 and 0x103F. The Subsystem ID identifies the type of the virtual device (a short detection sketch follows the list):
01 Network Card
02 Block Device
03 Console
04 Entropy Source
05 Memory Ballooning
06 IO Memory
07 RPMSG
08 SCSI Host
09 9P Transport
10 MAC802.11 WLAN
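Because every virtio device advertises the vendor ID 0x1AF4 mentioned above, a guest can enumerate its paravirtualized devices simply by scanning the PCI bus. The following sketch (an illustration only, not part of any cited tool) walks /sys/bus/pci/devices and reports the virtio devices together with their PCI device IDs.

```python
"""List virtio PCI devices inside a guest by scanning sysfs.

Illustrative sketch only: it relies on the virtio vendor ID 0x1AF4
documented above; interpretation of the subsystem IDs follows the table
in the text.
"""
import os

VIRTIO_VENDOR = 0x1AF4
PCI_PATH = "/sys/bus/pci/devices"


def read_hex(path):
    with open(path) as f:
        return int(f.read().strip(), 16)


def list_virtio_devices():
    devices = []
    for dev in sorted(os.listdir(PCI_PATH)):
        base = os.path.join(PCI_PATH, dev)
        if read_hex(os.path.join(base, "vendor")) != VIRTIO_VENDOR:
            continue
        devices.append((dev, read_hex(os.path.join(base, "device"))))
    return devices


if __name__ == "__main__":
    for addr, dev_id in list_virtio_devices():
        print("{}  virtio device id 0x{:04x}".format(addr, dev_id))
```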

2.2. Container technologies
Most scientific computing facilities, such as HPC or grid infrastructures, are shared among different research disciplines, and thus the system software environment needs to be generic enough to accommodate different user and application profiles; hence they are multi-user environments. Because of managerial and technical constraints, such infrastructures cannot afford to offer every research project a tailored environment on their machines. Researchers then need to customize their application software to fit the computing center environment at the level of system software and batch system. Therefore, the interest in exploring the applicability of container technology on such systems is rather evident from the end-user point of view. Containers provide a way to pack and deploy software, including all its dependencies, so that it can be executed seamlessly, independently of the underlying Linux operating system and environment. The main benefit of integrating the execution of containers in HPC systems would then be to provide a way to execute applications homogeneously across different resource centers [Gomes 2017].

2.2.1. Docker
Docker is a popular container platform intended both for on-premises infrastructure and for the cloud. The basic building block is an image – a lightweight, standalone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries and settings. Images are built from file system layers and share common files, thus minimizing disk usage and image size. An instantiated Docker image is called a container, and there can be multiple containers of the same image instantiated on the same machine. On a single machine, containers share the host's system kernel, so they start instantly and use fewer resources (RAM and CPU). Regardless of the environment, the software in the container will always run the same.
The difference between containers and virtual machines lies in their functionality. While virtual machines virtualize hardware, containers, as an abstraction at the application layer, virtualize the operating system. This makes containers more portable and efficient.
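The workflow described above (pull an image, instantiate it as a container, run the packaged software) can also be driven programmatically through the Docker SDK for Python. The sketch below assumes a locally running Docker daemon that the user is allowed to access; the image and command are arbitrary examples.

```python
"""Pull an image and run a container through the Docker SDK for Python.

Sketch only: assumes a local Docker daemon and access to its socket;
the image and command are arbitrary examples.
"""
import docker

client = docker.from_env()                       # connect to the local daemon

image = client.images.pull("ubuntu", tag="16.04")  # image = shareable layers
print("pulled:", image.tags)

# A container is an instance of the image; several containers can be
# started from the same image on one machine, sharing the host kernel.
output = client.containers.run("ubuntu:16.04", "cat /etc/os-release",
                               remove=True)
print(output.decode())
```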

2.2.2. udocker
udocker is a basic user tool to execute simple Docker containers in user space without requiring root privileges. It enables download and execution of Docker containers by non-privileged users in Linux systems where Docker is not available. It can be used to pull and execute Docker containers in Linux batch systems and interactive clusters that are managed by other entities, such as grid infrastructures or externally managed batch or interactive systems.
The tool is a wrapper around several tools that mimics a subset of the Docker capabilities, including pulling images and running containers with minimal functionality. It does not require any type of privileges nor the deployment of services by system administrators. It can be downloaded and executed entirely by the end user. udocker is written in Python and has a minimal set of dependencies, so that it can be executed on a wide range of Linux systems. It does not make use of Docker nor require its presence.
udocker "executes" the containers by simply providing a chroot-like environment over the extracted container. The current implementation supports different methods to mimic chroot, enabling execution of containers without requiring privileges, using tools and libraries such as the following (a short usage sketch is given after the list):
• PRoot
• Fakechroot
• runC
• Singularity
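Since udocker is a self-contained command-line tool, a typical non-privileged workflow on a batch or login node looks like the sketch below. The pull, create and run commands follow the udocker CLI; the image name and the wrapping in Python subprocess calls are only an illustration.

```python
"""Run a Docker image with udocker as an unprivileged user.

Sketch only: it shells out to the udocker CLI (pull / create / run),
assuming the udocker script is already available on the PATH. The image
name is an arbitrary example.
"""
import subprocess

IMAGE = "ubuntu:16.04"        # any image available from Docker Hub


def udocker(*args):
    # udocker needs no root privileges and no running daemon.
    return subprocess.check_output(("udocker",) + args).decode().strip()


udocker("pull", IMAGE)                     # download the image layers
container_id = udocker("create", IMAGE)    # extract them into a container
print(udocker("run", container_id, "cat", "/etc/os-release"))
```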

2.2.3. Linux containers (LXC/LXD)
LXD is a next generation system container manager. It offers a user experience similar to virtual machines, but using Linux containers instead. LXD is built on top of LXC to provide a better user experience. Under the hood, LXD uses LXC through liblxc and its Go binding to create and manage the containers. It is an alternative to LXC's tools and distribution template system, with the added features that come from being controllable over the network.
The core of LXD is a privileged daemon that exposes a REST API over a local unix socket as well as over the network (if enabled). Clients, such as the command line tool provided with LXD itself, do everything through that REST API. This means that whether you are talking to your local host or a remote server, everything works the same way. LXD is image based, with pre-made images available for a wide number of Linux distributions.
Some of the biggest features of LXD are listed below; a short pylxd-based example follows the list:
• Secure by design (unprivileged containers, resource restrictions and much more);
• Scalable (from containers on your laptop to thousands of compute nodes);
• Intuitive (simple, clear API and crisp command line experience);
• Image based (with a wide variety of Linux distributions published daily);
• Support for cross-host container and image transfer (including live migration with CRIU);
• Advanced resource control (CPU, memory, network I/O, block I/O, disk usage and kernel resources);
• Device passthrough (USB, GPU, unix character and block devices, NICs, disks and paths);
• Network management (bridge creation and configuration, cross-host tunnels, …);
• Storage management (support for multiple storage backends, storage pools and storage volumes).
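Because everything LXD does goes through its REST API, the same operations are available from Python via the pylxd binding. The sketch below launches a container from a remote image; the image alias, server and container name are examples only, and the exact pylxd calls may differ slightly between versions.

```python
"""Create and start an LXD container through its REST API using pylxd.

Sketch only: assumes the LXD daemon is running locally and the calling
user may talk to its unix socket; image source and container name are
illustrative placeholders.
"""
from pylxd import Client

client = Client()     # connects to the local LXD unix socket by default

config = {
    "name": "demo-xenial",
    "source": {
        "type": "image",
        "protocol": "simplestreams",
        "server": "https://images.linuxcontainers.org",
        "alias": "ubuntu/xenial",
    },
}

container = client.containers.create(config, wait=True)
container.start(wait=True)
print(container.name, container.status)
```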

2.2.4. Singularity
Singularity is a container solution created out of necessity for scientific and application-driven workloads [Singularity]. This technology has been developed with the necessities and specificities of HPC environments in mind. Its goal is to provide the advantages of Linux containers, like portability, reproducibility and user freedom, while maintaining traditional HPC security standards. It makes use of kernel features like chroot, bind mounts, Linux namespaces, cgroups, etc. [Younge] [Godlove]. Like other container solutions, Singularity containers are distributed in image files. It uses a custom single image file that wraps all the data necessary to run the container. Although this technology supports Docker image files, those must be imported beforehand.
Pros
• Developed with HPC systems in mind;
• Experimental GPU support;
• Single image file;
• Compatible with Docker images (through import);
• Support for MPI applications (capability of running containerized, multi-node jobs using native RDMA-enabled MPI).
Cons
• Own image file format;
• Experimental GPU support;
• Requires an image import procedure for non-Singularity images;
• Requires root privileges for installation;
• Requires system administrator's intervention.
2.2.5. Other available technologies
Shifter
Shifter is a platform designed to allow users to “safely and efficiently run Docker containers in an HPC environment”, developed at NERSC [Shifter]. It is released by CRAY as a part of the default installation of its CLE6.0 operating system, even though tools are available to install it on CentOS6 and CentOS7. It is currently used in production environments at a handful of sites, notably at NERSC in the US and CSCS in Switzerland. Like Singularity, the core idea is to run user-defined containers in userspace. Main features:
• Specifically designed to run Docker images, no image building capabilities;
• An Image Gateway service pulls images from DockerHub (or a private repository) and converts them to a single squashFS-based file;
• A simple binary on the execution node starts the image;
• Minimal MPI support; GPU support is currently being refactored.

Charliecloud
Charliecloud [CharlieCloud] is a lightweight tool to run unprivileged Linux containers on HPC facilities. Developed at LANL, it is a minimal system (about 1000 LoC) with an API to run containers on worker nodes. Container building is done by wrapping Docker in a sandboxed environment, and the Docker ecosystem, including DockerHub, is available. The container is then run on the nodes by a separate runtime, independent of Docker.

2.3. Comparison of technologies
A brief comparison of the available virtualization technologies is provided in Table 1.

Table 1 compares full virtualization, paravirtualization, OS containers and application containers along several criteria.

Performance
• Full virtualization: low performance for specific devices (GPUs, network) due to the high overhead of virtualization.
• Paravirtualization: better performance for devices supporting paravirtualization.
• OS container: practically native performance.
• Application container: practically native performance.

Limitations
• Full virtualization: no.
• Paravirtualization: hardware dependence, potential problems with suspending/migration.
• OS container: the same kernel between host and container (therefore the same type of operating system); possible problems when applications try to modify kernel configuration (e.g. loading drivers).
• Application container: the same kernel between host and container (therefore the same type of operating system); possible problems when applications try to modify kernel configuration (e.g. loading drivers).

Security
• Full virtualization: highest level of isolation.
• Paravirtualization: hardware-assisted isolation (IOMMU).
• OS container: software isolation in the OS kernel (namespaces).
• Application container: software isolation in the OS kernel (namespaces).

Software / privilege requirements
• Full virtualization: pre-installed hypervisor.
• Paravirtualization: pre-installed hypervisor.
• OS container: installation requires root access, creating containers in user space.
• Application container: various – Docker requires root access for creating containers, Singularity requires root access for installation but can create containers in user space, udocker does not require root access at all.

Usability
• Full virtualization: virtualization of unprepared environments and operating systems not supporting virtualization (as a guest).
• Paravirtualization: virtualization of environments that support the concept of paravirtualization, obtaining better performance than full virtualization.
• OS container: replacement of VMs.
• Application container: application delivery.

Portability
• Full virtualization: depending on the type of hypervisor – Type 1 hypervisors are mostly portable across hardware, Type 2 hypervisors require support for the specific OS on which the hypervisor is to be deployed.
• Paravirtualization: both the host OS and the guest OS have to support paravirtualization; not very portable.
• OS container: supported only on certain operating systems; the created container is not easily portable to a different host.
• Application container: portability depends on the support of the respective container technologies for various operating systems; very portable across supported operating systems.

Implementations
• Full virtualization: KVM, VirtualBox, VMware, Xen, Hyper-V.
• Paravirtualization: virtio, SR-IOV, Xen.
• OS container: LXC/LXD.
• Application container: Docker, udocker, Singularity.

Table 1. Virtualization technology comparison

3. Support for paravirtualization and containers in cloud middleware

3.1. OpenStack
OpenStack is a cloud management system that governs large pools of compute, storage, and networking resources. Currently, it supports many hypervisors for the management of virtual machines as well as containers (see Figure 2). Most installations use only one hypervisor type. It is, however, possible to schedule different hypervisors within the same installation, but a multi-hypervisor OpenStack cloud requires at least one compute node for each hypervisor type. OpenStack supports the following hypervisor drivers:
• Ironic – not a hypervisor in the traditional sense, this driver provisions physical hardware through pluggable sub-drivers (for example, PXE for image deployment, and IPMI for power management);
• KVM – Kernel-based Virtual Machine. The virtual disk formats that it supports are inherited from QEMU, since it uses a modified QEMU program to launch the virtual machine. The supported formats include raw images, qcow2, and VMware formats;
• VMware vSphere – runs VMware-based Linux and Windows images through a connection with a vCenter server or directly with an ESXi host;
• Xen – XenServer, Xen Cloud Platform (XCP), used to run Linux or Windows virtual machines (a user must install the nova-compute service in a para-virtualized VM);
• Hyper-V – server virtualization with Microsoft's Hyper-V, used to run Windows, Linux, and FreeBSD virtual machines. Runs nova-compute natively on the Windows virtualization platform;
• Virtuozzo – or its community edition OpenVZ, provides both types of virtualization: Kernel Virtual Machines and OS Containers;
• LXD – supported by the nova-lxd plugin, used to run Linux Containers (through libvirt);
• Magnum – supports containers within Docker Swarm, Kubernetes, and Apache Mesos.

Figure 2. Overview of OpenStack resources

3.1.1. OpenStack Heat
OpenStack Heat is the orchestration service in OpenStack, whose main objective is to provide an engine for managing the entire lifecycle of infrastructure and applications within an OpenStack cloud. Orchestration not only makes it possible to reproduce a given deployment whenever needed, it also allows sharing these topologies with other users or cloud infrastructures. This tool allows IT administrators to automate all the tasks involved in the process in a single step and, finally, to create an orchestration service where the end users can launch their applications. The whole implementation of the service is based on a template where the whole topology is defined by means of a declarative language, like HOT in the case of OpenStack Heat. Heat provides an OpenStack-native API and an API compatible with AWS CloudFormation.
The architecture of Heat is shown in Figure 3. The user creates the template where the application is defined, and the OpenStack client sends this request to Heat, which deploys the final environment in the cloud. Heat is composed of the following services:
• heat: the client, which communicates with heat-api-cfn to execute AWS CloudFormation APIs;
• heat-api: an OpenStack REST API that processes user requests by sending them to the heat-engine;
• heat-api-cfn: an API compatible with AWS CloudFormation that processes the requests by sending them to the heat-engine;
• heat-engine: the main component, which connects to other OpenStack services (like Nova, Glance or Neutron) to deploy the final application, as well as receiving events from the two other services.

Figure 3. OpenStack Heat Architecture
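To make the template-driven workflow more concrete, the sketch below defines a minimal HOT template and submits it through the Heat API with the python-heatclient binding. The authentication details, image and flavor names are placeholders; in a real deployment they would come from the local OpenStack installation.

```python
"""Submit a minimal HOT template to OpenStack Heat with python-heatclient.

Sketch only: credentials, endpoint, image and flavor names are
placeholders that must match the target OpenStack deployment.
"""
from keystoneauth1.identity import v3
from keystoneauth1 import session
from heatclient import client as heat_client

HOT_TEMPLATE = """
heat_template_version: 2016-10-14
description: Minimal single-server stack
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: ubuntu-16.04          # placeholder image name
      flavor: m1.small             # placeholder flavor name
"""

auth = v3.Password(auth_url="https://keystone.example.org:5000/v3",
                   username="demo", password="secret",
                   project_name="demo", user_domain_id="default",
                   project_domain_id="default")
heat = heat_client.Client("1", session=session.Session(auth=auth))

# heat-api forwards the request to heat-engine, which in turn talks to
# Nova, Glance and Neutron to realize the described topology.
stack = heat.stacks.create(stack_name="demo-stack", template=HOT_TEMPLATE)
print(stack["stack"]["id"])
```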

3.1.2. OpenStack Magnum
OpenStack Magnum [Magnum] is a technology for supporting containers in the OpenStack environment. It started as an interface between Docker and the OpenStack compute module (Nova). The OpenStack Containers Team has been developing it since 2014. The tool is written in Python. Currently, OpenStack Magnum supports Docker Swarm, Kubernetes, and Apache Mesos. The software is licensed under the Apache license. Its main aim is to provide an API for container orchestration.
There are two main components: a REST API server and the Conductor. The REST server accepts requests and outputs messages. The REST API supports 6 main objects (see Figure 4): 1) bay (cluster in newer versions) – a collection of virtual machines for hosting containers, 2) bay model (cluster template in newer versions) – similar to a flavor at the bay level (OS image, orchestration engine), 3) pod – a collection of containers running on a single host, 4) service – a logical set of pods and an access policy, 5) replication controller – a manager of pods (also provides scaling and upgrades), and 6) node – a Nova instance.

Figure 4. Magnum API resources [Otto 2015]
The Conductor processes the messages and creates the outputs, which means interacting with containers and handling container orchestration by interacting with OpenStack Heat. Multi-tenancy is provided by Keystone. The whole integration of Magnum into the OpenStack environment is shown in Figure 5. It exploits: 1) Nova for instance management, 2) Neutron for configuration of a private network (inside the bay), 3) Glance for image management, and 4) Cinder for management of mount points (inside containers).

Figure 5. Magnum architecture [Cacciatore 2015]

3.1.3. OpenStack nova-lxd
OpenStack nova-lxd [nova-lxd], [nova-lxd_GitHub] is a plugin that allows managing Linux Containers (LXC) in an OpenStack cloud. Its development started in 2015. The plugin is written in Python. OpenStack nova-lxd is installed on Nova servers only. It bridges the nova-compute daemon with LXD, which is a daemon for LXC. It enables the use of LXD as a hypervisor in the OpenStack environment (alongside KVM, VMware, and Hyper-V). Thus a user is able to execute all the standard operations on an LXC container (like boot, reboot, stop, delete, snapshot, resize, migration, and so on) and is also able to manage the resources of a container (like the number of CPUs, the size of memory, or disk). The plugin implements a RESTful API on top of the LXC libraries and offers it to an OpenStack user (see Figure 6). LXC containers are managed in the same manner as KVM instances – either via Horizon or via the Nova CLI. It is released under the Apache 2.0 license.
Functionalities implemented or tested
• Life cycle control: deploy, shutdown, restart, reset, suspend and resume containers.
• Monitoring of hosts and containers.
• Network attachment and floating IP association.
• Security group association.
• Ubuntu Xenial LXD flavored image made available.
• Image custom property "Hypervisor_type: lxd".
Functionalities not implemented
• Native LXD Ceph support.
• Live migration between compute hosts.

Figure 6. OpenStack nova-lxd architecture diagram. Modification of [Graber 2016]
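Because nova-lxd plugs in below the regular Nova API, containers are requested exactly like KVM virtual machines. The sketch below uses the openstacksdk client against a cloud whose compute nodes run the LXD driver; the cloud name, image, flavor and network are placeholders, and the referenced image is assumed to be an LXD-compatible container image.

```python
"""Boot an instance on a nova-lxd cloud through the standard Nova API.

Sketch only: it assumes a clouds.yaml entry called "deep-lxd" and that
the referenced image is an LXD-compatible (container) image; all
identifiers are placeholders.
"""
import openstack

conn = openstack.connect(cloud="deep-lxd")

image = conn.compute.find_image("xenial-lxd")     # placeholder image name
flavor = conn.compute.find_flavor("m1.small")     # placeholder flavor name
network = conn.network.find_network("private")    # placeholder network name

server = conn.compute.create_server(
    name="lxd-demo",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.name, server.status)   # the "VM" is in fact an LXC container
```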

3.2. OpenNebula
OpenNebula [OpenNebula] was first established as a research project in 2005, and the first public release of the software was in March 2008. It has evolved through open-source releases and now operates as an open source project. It provides a simple but feature-rich and flexible solution for the comprehensive management of virtualized data centers to enable private, public and hybrid IaaS clouds. The latest series of releases (5.4.x) has improved storage and network management and high availability.

3.2.1. OpenNebula LXDoNe
LXDoNe [LXDoNe] is an addon for OpenNebula to manage LXD containers. It uses the pylxd API for several container tasks. Its development started in 2017. LXD is a daemon which provides a REST API to drive LXC containers, which are lightweight OS-level virtualization instances. Unlike virtual machines, they share the kernel with the host and, therefore, they don't suffer from hardware emulation processing penalties.
Functionalities implemented
• Life cycle control: deploy, shutdown, restart, reset, suspend and resume containers.
• Support for Direct Attached Storage (DAS) file systems such as ext4 and btrfs.
• Support for Storage Area Networks (SAN) implemented with Ceph.
• Monitoring of hosts and containers.
• Limiting a container's resource usage: RAM and CPU.
• Support for VNC sessions.
• Deployment of containers with several disks and Network Interface Cards (NICs).
• Support for dummy and VLAN network drivers.
• Full support for OpenNebula's contextualization in LXD containers (using special LXD images that will be uploaded to the marketplace).
Functionalities not implemented
• Native LXD Ceph support: LXD's Ceph module is not cloud ready.
Installation
• Frontend setup is straightforward and supported.
• Virtualization node setup is standard for Ubuntu 16.04 (Xenial Xerus), not supported for CentOS7 (containers run only if privileged).
• Usage of a pre-built image downloadable from the ONE marketplace.
• LXD virtualization node, bridged network and template creation in ONE are supported and straightforward.

3.2.2. ONEDock
ONEDock [ONEDock] is a development of the INDIGO-DataCloud project, which includes a set of extensions for OpenNebula to use Docker containers as first-class entities, just as if they were lightweight Virtual Machines (VM). For that, Docker is configured to act as a hypervisor, so that it behaves just as KVM or other hypervisors do in the context of OpenNebula. When OpenNebula is asked for a VM, a Docker container will be deployed instead. In the context of OpenNebula, it is managed as if it were a VM, and the user will be able to use IP addresses to access the container. ONEDock deploys Docker containers on top of bare-metal nodes. ONEDock provides four components that need to be integrated into the OpenNebula deployment:
• ONEDock Datastore, which makes it possible to create a datastore that contains Docker images. It is self-managed in the sense that images are created as references that are automatically downloaded from Docker Hub;
• ONEDock Transfer Manager, which stages the Docker images that are in a Docker datastore onto the virtualization hosts;
• ONEDock Monitoring Driver, which monitors the virtualization hosts in the context of the Docker hypervisor;
• ONEDock Virtual Machine Manager, which carries out the tasks related to the lifecycle of the Docker containers as if they were VMs.
ONEDock seamlessly integrates the benefits of Docker containers (quick deployment, limited overhead, availability of Docker images, etc.) into a Cloud Management Platform such as OpenNebula. On the other hand, it provides new features for containers that are usually reserved for VMs (e.g. enhanced IP addressing, attachment of block devices, etc.).

3.3. Kubernetes
Kubernetes is an open-source platform written in Go for managing containerized applications across multiple hosts, providing basic mechanisms for deployment, maintenance, and scaling of applications [Kubernetes]. It can run on various platforms: a local machine, VMs on a Cloud IaaS provider, or a rack of bare metal servers. A Kubernetes cluster is made of a master node and a set of worker nodes. In a production environment, these run in a distributed setup on multiple nodes. Kubernetes has six main components that form a functioning cluster (see Figure 7): 1) API server; 2) Scheduler; 3) Controller manager; 4) kubelet; 5) kube-proxy; and 6) etcd. Each of these components can run as a standard Linux process, or as a Docker container [Schroder 2017].
A Kubernetes cluster consists of nodes, previously known as minions. Nodes are the working machines in a Kubernetes cluster. Depending on the nature of the cluster, a node can be a virtual or a physical machine. Kubernetes does not inherently create nodes; they already exist in a pool of virtual or physical machines. When such a node is added to Kubernetes, it is checked whether it contains all the necessary services. Nodes are managed by the master components running on a master node.
Master components make global decisions about the cluster and include components such as kube-apiserver – a horizontally scalable front-end to the Kubernetes control plane, kube-scheduler for scheduling Pods on nodes, and kube-controller-manager running controllers (Node Controller, Replication Controller, Endpoints Controller and Service Account & Token Controllers). All the cluster data is stored in the distributed key-value store etcd, which is also a part of the master components.

Figure 7. Kubernetes architecture [Parthasarathy 2017]
Each node in the cluster has the services necessary to run Pods, which are the basic building blocks of Kubernetes. A Pod represents a unit of deployment, i.e., a single instance of an application on a Kubernetes cluster. It is used to encapsulate a single application container, or multiple tightly coupled containers working together, with a storage resource, a unique network IP and run-time options. The most common container runtime in Kubernetes is Docker, but other runtimes such as cri-o, rktlet, or frakti can be integrated via the Container Runtime Interface (CRI) without the need to recompile. To scale applications horizontally in Kubernetes, Pods have to be replicated with a Controller, so there is one Pod for each replicated instance of the application.
Running a container on a laptop is relatively simple. However, connecting containers across multiple hosts, scaling them when needed, deploying applications without downtime, and service discovery, among other aspects, are hard challenges. Kubernetes addresses those challenges with a set of primitives and a powerful API.
Pros
• Kubernetes is a mature, production-grade container orchestration system. It is a purely API-driven container orchestrator, with an API server, a scheduler, and a controller with fault-tolerance, self-discovery and scaling.
• Kubernetes is distinguished from similar container orchestration systems, such as Apache Mesos [Mesos] and Docker Swarm [DockerSwarm], by its Google heritage. Borg, the very advanced internal datacenter management system used by Google for a decade, inspired Kubernetes. Google donated Kubernetes to the Cloud Native Computing Foundation [CNCF], hosted at the Linux Foundation [LinuxFoundation] and supported by a number of big companies including Google, Cisco, Docker, IBM, and Intel. The idea is to create a reference architecture for cloud technologies that anyone can use.
Cons
• Manual installation can be complex, but flexible. There are many deployment tools for Kubernetes like kubeadm, kops, kargo, and others.
Several tools are available for monitoring a Kubernetes cluster. The Web UI (Dashboard) allows users to manage and troubleshoot applications running in the cluster, as well as the cluster itself. Kubernetes is supported by the Heapster monitoring platform, which allows inspecting application performance at different levels (containers, pods, services, and whole clusters). Heapster runs as a pod in the cluster and gathers usage information from all the nodes of the cluster. Collected data is pushed to a configurable backend for storage and visualization, such as InfluxDB+Grafana, Google Cloud Monitoring, Kafka, Elasticsearch and many others. These tools differ in what kind of information they consume from Heapster, i.e., monitoring metrics, events, or both.
A Kubernetes cluster can be created via a hosted solution on one of many cloud providers (currently 13 providers). Another similar option is a Turnkey Cloud solution, which allows creating a Kubernetes cluster on a range of Cloud IaaS providers. Turnkey Cloud solutions are also available for running on one's own internal, secure, cloud network via IBM Cloud Private or Kubermatic (on-premises turnkey cloud solutions). The kubeadm toolkit provides complete control over the cluster installation; it expects existing machines to be executed on. The type of the machines does not matter, which makes it easier to integrate kubeadm with other systems, e.g., Ansible.
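As an illustration of the API-driven model described above, the sketch below creates a single-container Pod with the official Kubernetes Python client. It assumes a reachable cluster and a valid local kubeconfig; the Pod name and the nginx image are arbitrary examples.

```python
"""Create a single-container Pod using the Kubernetes Python client.

Sketch only: assumes a running cluster and a valid ~/.kube/config; the
nginx image is used as an arbitrary example workload.
"""
from kubernetes import client, config

config.load_kube_config()            # credentials from the local kubeconfig
v1 = client.CoreV1Api()              # talks to the kube-apiserver

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="demo-pod", labels={"app": "demo"}),
    spec=client.V1PodSpec(
        containers=[client.V1Container(name="web", image="nginx:1.13")]
    ),
)

# The API server stores the object in etcd; kube-scheduler then assigns
# it to a node, whose kubelet starts the container.
created = v1.create_namespaced_pod(namespace="default", body=pod)
print(created.metadata.name, created.status.phase)
```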

3.4. Apache Mesos + Marathon + Chronos
Apache Mesos
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to be easily built and run effectively. Mesos is built using the same principles as the Linux kernel, only at a different level of abstraction. The Mesos kernel runs on every machine and provides applications (e.g. Hadoop, Spark, Kafka, Elasticsearch) with APIs for resource management and scheduling across entire datacenter and cloud environments. Mesos aims to provide a scalable and resilient core for enabling various frameworks to efficiently share clusters. Because cluster frameworks are both highly diverse and rapidly evolving, Mesos' overriding design philosophy has been to define a minimal interface that enables efficient resource sharing across frameworks, and otherwise push control of task scheduling and execution to the frameworks. Pushing control to the frameworks has two benefits. First, it allows frameworks to implement diverse approaches to various problems in the cluster (e.g., achieving data locality, dealing with faults), and to evolve these solutions independently. Second, it keeps Mesos simple and minimizes the rate of change required of the system, which makes it easier to keep Mesos scalable and robust.

Figure 8. Mesos architecture diagram, showing two running frameworks (Hadoop and MPI) [Hindman 2011]
Mesos consists of a master process that manages slave daemons running on each cluster node, and frameworks that run tasks on these slaves. The master implements fine-grained sharing across frameworks using resource offers. Each resource offer is a list of free resources on multiple slaves. The master decides how many resources to offer to each framework according to an organizational policy, such as fair sharing or priority. To support a diverse set of inter-framework allocation policies, Mesos lets organizations define their own policies via a pluggable allocation module.
Each framework running on Mesos consists of two components: a scheduler that registers with the master to be offered resources, and an executor process that is launched on slave nodes to run the framework's tasks (see Figure 8). While the master determines how many resources to offer to each framework, the frameworks' schedulers select which of the offered resources to use. When a framework accepts offered resources, it passes Mesos a description of the tasks it wants to launch on them. To maintain a thin interface and enable frameworks to evolve independently, Mesos does not require frameworks to specify their resource requirements or constraints. Instead, Mesos gives frameworks the ability to reject offers. A framework can reject resources that do not satisfy its constraints in order to wait for ones that do. Thus, the rejection mechanism enables frameworks to support arbitrarily complex resource constraints while keeping Mesos simple and scalable.
Example of resource offer
Figure 9 below shows an example of how a framework gets scheduled to run a task:

Figure 9. Resource offer example [Hindman 2011]
1. Agent 1 reports to the master that it has 4 CPUs and 4 GB of memory free. The master then invokes the allocation policy module, which tells it that framework 1 should be offered all available resources.
2. The master sends a resource offer describing what is available on agent 1 to framework 1.
3. The framework's scheduler replies to the master with information about two tasks to run on the agent, using <2 CPUs, 1 GB RAM> for the first task, and <1 CPU, 2 GB RAM> for the second task.
4. Finally, the master sends the tasks to the agent, which allocates appropriate resources to the framework's executor, which in turn launches the two tasks (depicted with dotted-line borders in the figure). Because 1 CPU and 1 GB of RAM are still unallocated, the allocation module may now offer them to framework 2.
In addition, this resource offer process repeats when tasks finish and new resources become free.
Pros
• Support for Composing, Docker, Mesos virtualization containers.
• Running several frameworks (even the same framework in different versions) on the same cluster.
• Proven to scale, used at Apple with 75,000 nodes.
• Extremely flexible; it is currently used in production by many enterprises and there are many Mesos frameworks available that can fit different needs.
• Relatively old, so it is easier to find production stories and best practices for using it.
Cons
• A little bit old.
• Too many languages. To have a running Mesos stack, many components are involved: Mesos (C++), Marathon (Scala), Mesos-DNS (Golang), etc. It is not so common to find developers who are proficient in so many languages at the same time.
Marathon
Marathon is a framework (or meta-framework) that can launch applications and other frameworks. Marathon can also serve as a container orchestration platform, which can provide scaling and self-healing for containerized workloads. Figure 10 below shows the architecture of Mesos + Marathon.

Figure 10. Marathon architecture
Chronos
Chronos is a replacement for cron: a distributed and fault-tolerant scheduler that runs on top of Apache Mesos and can be used for job orchestration. It supports custom Mesos executors as well as the default command executor. Thus, by default, Chronos executes sh (on most systems bash) scripts. Chronos can be used to interact with systems such as Hadoop (incl. EMR), even if the Mesos slaves on which execution happens do not have Hadoop installed. Chronos is also natively able to schedule jobs that run inside Docker containers. Chronos has a number of advantages over regular cron. It allows you to schedule jobs using the ISO8601 repeating interval notation, which enables more flexibility in job scheduling. Chronos also supports the definition of jobs triggered by the completion of other jobs, and it supports arbitrarily long dependency chains.

Chronos is a Mesos scheduler for running schedule-based and dependency-based jobs. Scheduled jobs are configured with ISO8601-based schedules with repeating intervals. Typically, a job is scheduled to run indefinitely, such as once per day or per hour. Dependent jobs may have multiple parents, and will be triggered once all parents have been successfully invoked at least once since the last invocation of the dependent job. Internally, the Chronos scheduler main loop is quite simple (see Figure 11). The pattern is as follows:
1. Chronos reads all job states from the state store (ZooKeeper);
2. Jobs are registered within the scheduler and loaded into the job graph for tracking dependencies;
3. Jobs are separated into a list of those which should be run at the current time (based on the clock of the host machine), and those which should not;
4. Jobs in the list of jobs to run are queued, and will be launched as soon as a sufficient offer becomes available;
5. Chronos will sleep until the next job is scheduled to run, and begin again from step 1.
Furthermore, a dependent job will be queued for execution once all parents have successfully completed at least once since the last time it ran. After the dependent job runs, the cycle resets.

Figure 11. Chronos architecture
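To show how containerized workloads typically reach the stack described in this section, the sketch below posts a minimal application definition to Marathon's REST API (the /v2/apps endpoint). The Marathon URL and the Docker image are placeholders; a comparable JSON job description could be sent to Chronos for scheduled jobs.

```python
"""Submit a Docker-based application to Marathon via its REST API.

Sketch only: the Marathon endpoint and the image are placeholders; a
real deployment would also need authentication and error handling.
"""
import json
import requests

MARATHON_URL = "http://marathon.example.org:8080"   # placeholder endpoint

app = {
    "id": "/demo/web",
    "cpus": 0.5,
    "mem": 128,
    "instances": 2,
    "container": {
        "type": "DOCKER",
        "docker": {"image": "nginx:1.13", "network": "BRIDGE"},
    },
}

# Marathon forwards the request to Mesos, which offers resources and
# launches the tasks on agents (see the resource-offer example above).
resp = requests.post(MARATHON_URL + "/v2/apps",
                     data=json.dumps(app),
                     headers={"Content-Type": "application/json"})
resp.raise_for_status()
print(resp.json()["id"], "deployed")
```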

4. Support for Accelerated computing

4.1. Accelerators and Deep Learning
Nowadays, accelerators are designed to accelerate NNs and other ML/DL algorithms [AI Accelerator]. They are often many-core designs and generally focus on low-precision arithmetic. Early attempts at using accelerators to improve artificial intelligence applications date back to the early 1990s: Digital Signal Processors (DSP) were used to accelerate OCR software, FPGA-based accelerators were used for NNs, and heterogeneous multiprocessors employed specialized processors for different tasks (e.g., the Cell microprocessor). With the advent of general-purpose computing on GPU accelerators (GPGPU), GPU accelerators quickly became popular also for ML/DL applications [Cano 2017] [Sze 2017]. The main reason is that the mathematical bases of NNs and image manipulation are similar.

4.1.1. Types of accelerators
A list of DL accelerator technology providers [DLAccList] is already quite long and includes global technology companies like Google, Intel, IBM, Microsoft or NVIDIA, along with many startups worldwide. NVIDIA GPU-based accelerators are among the most popular ones for both training and inference. Cloud providers offer virtualized or bare-metal servers equipped with NVIDIA Tesla accelerators, and NVIDIA puts significant effort into providing users with GPU-optimised containers including DL frameworks. The two recent, frequently mentioned NVIDIA architectures from 2016 and 2018 are (respectively) NVIDIA Pascal and NVIDIA Volta.
NVIDIA Pascal is the first NVIDIA architecture to integrate the revolutionary NVIDIA NVLink high-speed bidirectional interconnect [NVIDIAPascal]. This technology is designed to scale applications across multiple GPUs, delivering a 5× acceleration in interconnect bandwidth. NVIDIA included optimizations for DL frameworks already in the Pascal architecture: half-precision 16-bit floating point instructions and 8-bit integer instructions, along with NVLink, allowed significant speed-ups to be reached in both NN training and DL inference.
NVIDIA Volta, which is the current architecture, goes even further [NVIDIAVolta]. It contains special ASIC "Tensor Cores" designed to speed up convolutions and matrix operations, along with doubled NVLink throughput. Equipped with 640 Tensor Cores, Volta delivers over 100 teraflops (TFLOPS) of DL performance, over a 5× increase compared to the prior generation NVIDIA Pascal architecture. NVIDIA Volta uses the next generation of the NVIDIA NVLink high-speed interconnect technology, which delivers 2× the throughput compared to the previous generation of NVLink. This enables more advanced model- and data-parallel approaches for strong scaling to achieve the absolute highest application performance. NVIDIA Volta also comes with Volta-optimized CUDA and NVIDIA Deep Learning SDK libraries like cuDNN, NCCL, and TensorRT, for existing frameworks and applications.

Although GPU accelerators are the mainstream solution for DL and provide a big speed-up compared to CPUs, some applications, like mobile phones or robotic systems, still need better power efficiency, mainly for DL inference. This has led to the development of AI accelerators based on application-specific integrated circuits (ASIC). The main drawback of this approach is their lack of adaptability. The most notable accelerators in this direction are:
• Google TPU [GoogleTPU2],
• Intel Nervana Neural Network Processor [Kloss 2017],
• Microsoft BrainWave project [Brainware],
• IBM TrueNorth neuromorphic chip [Feldman 2016],
• Accelerators in mobile phones, e.g. the Apple A11 Bionic chip [Dilger 2017], Huawei Kirin 970 [Boxall 2017], Samsung Exynos 9810 [SamsungExynos], MediaTek Helio P60 [MediaTek].
There are also many other vendors, as listed in Table 2.

Vendor type | Vendor names
IC Vendors | Intel, Qualcomm, NVIDIA, Samsung, AMD, Apple, Xilinx, IBM, STMicroelectronics, NXP, MediaTek, HiSilicon, Rockchip
Tech Giants & HPC Vendors | Google, Amazon_AWS, Microsoft, Aliyun, Tencent Cloud, Baidu, Baidu Cloud, HUAWEI Cloud, Fujitsu
IP Vendors | ARM, Synopsys, Imagination, CEVA, Cadence, VeriSilicon
Startups in China | Cambricon, Horizon Robotics, DeePhi, Bitmain, Chipintelli, Thinkforce
Startups Worldwide | Cerebras, Wave Computing, Graphcore, PEZY, KnuEdge, Tenstorrent, ThinCI, Koniku, Adapteva, Knowm, Mythic, Kalray, BrainChip, AImotive, DeepScale, Leepmind, Krtkl, NovuMind, REM, TERADEEP, DEEP VISION, Groq, KAIST DNPU, Kneron, Vathys, Esperanto Technologies
Table 2. Growing list of accelerator vendors [Shan 2018]

4.1.2. Using accelerators in Deep Learning
CPUs are designed for more general computing workloads. GPUs, in contrast, are less flexible, but they are designed to execute the same instructions in parallel (the extended SIMD paradigm). DNNs are structured in a very uniform manner, such that at each layer of the network thousands of identical artificial neurons perform the same computation. Therefore, the structure of a DNN fits quite well with the kinds of computation that a GPU can efficiently perform [Perez 2015]. In comparison with CPUs, GPUs have many more computing resources and higher memory bandwidth, which leads to faster computation. This is extremely important because training a DNN can take from days to weeks.

Accelerated libraries
• CUDA [CUDA] is a platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs). GPU-accelerated CUDA libraries enable drop-in acceleration across multiple domains such as linear algebra, image and video processing, deep learning and graph analytics. The NVIDIA CUDA Toolkit [CudaToolkit] provides a development environment for creating high performance GPU-accelerated applications.
• The NVIDIA CUDA Deep Neural Network library (cuDNN) [cuDNN] is a GPU-accelerated library of primitives for DNNs. cuDNN provides highly tuned implementations of standard routines such as forward and backward convolution, pooling, normalization, and activation layers. It allows DL users to focus on training neural networks and developing software applications rather than spending time on low-level GPU performance tuning. cuDNN accelerates widely used deep learning frameworks, including Caffe2, MATLAB, CNTK, TensorFlow, Theano, PyTorch, etc.
• OpenCL™ (Open Computing Language) [OpenCL] is the open, royalty-free standard for cross-platform, parallel programming of diverse processors found in personal computers, servers, mobile devices and embedded platforms. OpenCL greatly improves the speed and responsiveness of a wide spectrum of applications in numerous market categories including gaming and entertainment titles, scientific and medical software, professional creative tools, vision processing, and neural network training and inferencing.
• Intel Math Kernel Library (Intel MKL) [MKL] is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math. The routines in MKL are hand-optimized specifically for Intel processors.
Other libraries that can be listed here include the NVIDIA Deep Learning SDK, which is a part of the NVIDIA toolkit. It provides powerful tools and libraries for designing and deploying GPU-accelerated DL applications, including libraries for DL primitives, inference, video analytics, linear algebra, sparse matrices, and multi-GPU communications. Besides cuDNN, other SDK libraries are, e.g., the Deep Learning Inference Engine (TensorRT), Deep Learning for Video Analytics (DeepStream SDK), Linear Algebra (cuBLAS), Sparse Matrix Operations (cuSPARSE) and Multi-GPU Communication (NCCL).
Deep Learning frameworks and libraries
Many popular machine learning (ML) and deep learning (DL) frameworks and libraries already offer the possibility to use accelerators to speed up the learning process (Table 3). These libraries also use optimised accelerated libraries, e.g. CUDA (cuDNN), OpenMP, OpenCL, etc., to improve the performance even further. The main feature of many-core accelerators is their massively parallel architecture, allowing them to speed up computations that involve matrix-based operations. Interest in GPGPU can also be found in many other large-scale life simulation packages under dynamic development.
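As a small illustration of how these frameworks hand work to an accelerator, the sketch below pins a matrix multiplication to the first GPU with TensorFlow (1.x style API, current at the time of writing). It assumes the GPU-enabled TensorFlow build and at least one CUDA-capable device are available; cuDNN/cuBLAS kernels are used underneath by the framework.

```python
"""Run a matrix multiplication on a GPU with TensorFlow (1.x style API).

Sketch only: requires the GPU-enabled TensorFlow build, CUDA/cuDNN and
at least one visible GPU device.
"""
import tensorflow as tf

# Explicitly place the computation on the first GPU device.
with tf.device("/device:GPU:0"):
    a = tf.random_normal([2048, 2048])
    b = tf.random_normal([2048, 2048])
    c = tf.matmul(a, b)

# log_device_placement prints where each operation actually runs.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    result = sess.run(c)
    print("result shape:", result.shape)
```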

TensorFlow (Google Brain). TensorFlow [Tensorflow] and TensorFlow Lite [TensorflowLite] from Google Brain form an open source software library for numerical computation using data flow graphs. TensorFlow is designed for large-scale distributed training and inference. The distributed TensorFlow architecture consists of a distributed master and worker services with kernel implementations including mathematical, array manipulation, control flow, and state management operations.

Keras. Keras [Keras] is a minimalist Python library for DL that can run on top of TensorFlow, CNTK or Theano, with beta support for MXNet and announced support for Deeplearning4j.

CNTK (Microsoft). The Microsoft Cognitive Toolkit (CNTK) implements efficient DNN training for speech, image, handwriting and text data [CNTK]. Its network is specified as a symbolic graph of vector operations, such as matrix add/multiply or convolution, with building blocks (operations). CNTK supports FFNNs, CNNs and RNNs.

Caffe (BAIR). Caffe [Caffe] is a DL framework developed by Yangqing Jia at Berkeley Artificial Intelligence Research (BAIR) for image processing.

Caffe2 (Facebook). Caffe2 [Caffe2] is a lightweight, modular, and scalable DL framework with a focus on mobile, developed by Yangqing Jia and his team at Facebook. It aims to provide an easy and straightforward way to experiment with DL and leverage community contributions of new models and algorithms. Caffe2 is used at production level at Facebook, while development is done in PyTorch.

Torch (R. Collobert, K. Kavukcuoglu, C. Farabet). Torch [Torch] is a scientific computing framework with wide support for ML algorithms, based on the Lua programming language. It is aimed at large-scale learning (speech, image, and video applications) and supports supervised learning, unsupervised learning, reinforcement learning, NNs, optimization, graphical models, and image processing.

PyTorch (A. Paszke, S. Gross, S. Chintala, G. Chanan). PyTorch [PyTorch] is a Python package for building DNNs and performing complex tensor computations. While Torch uses Lua, PyTorch leverages the rising popularity of Python and offers an interface to the same optimized C libraries used by Torch. PyTorch uses a technique called reverse-mode auto-differentiation (i.e., dynamic computational graphs), which allows the behaviour of a network to be changed with little effort.

MXNet (Apache). Apache MXNet [MXNet] is a DL framework that allows mixing symbolic and imperative programming to maximize efficiency and productivity. MXNet contains a dynamic dependency scheduler that automatically parallelizes both symbolic and imperative operations on the fly.

Theano (LISA group, University of Montreal). Theano [Theano] is a compiler for mathematical expressions in Python that transforms symbolic structures into very efficient code using NumPy and efficient native libraries like BLAS to run as fast as possible on CPUs or GPUs. Theano supports extensions for multi-GPU data parallelism and has a distributed framework for training models. Theano is still maintained – but no longer actively developed – by the LISA group at the University of Montreal, Quebec.

Chainer. Chainer [Chainer] is a Python-based DL framework aiming at flexibility. It provides automatic differentiation APIs based on the define-by-run approach (i.e., dynamic computational graphs), as well as object-oriented high-level APIs to build and train NNs.

Deeplearning4j. Deeplearning4j (DL4J) [DL4J] is a modern open-source, distributed DL library implemented in Java (JVM), aimed at the industrial Java development ecosystem and Big Data processing.

H2O / Deep Water (H2O.ai). H2O, Sparkling Water and Deep Water are developed by H2O.ai. They are Hadoop-compatible frameworks for DL over Big Data as well as for Big Data predictive analytics [H2O].

Deep Water is H2O DL with native implementations of DL models for the GPU-optimized backends TensorFlow, MXNet, and Caffe.

Table 3. Deep Learning frameworks and libraries
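As a minimal illustration of how the frameworks in Table 3 sit on top of the accelerated libraries, the following sketch (assuming a host with the NVIDIA driver, CUDA/cuDNN and PyTorch installed; any of the other frameworks could be used instead) checks that the GPU is visible both to the framework and to the driver tools:

    # ask the DL framework whether CUDA devices are usable and how many there are
    python3 -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"
    # the NVIDIA driver utility reports driver version, GPU utilization and memory
    nvidia-smi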

The dependencies among accelerators/HPC, accelerated libraries and DL frameworks and libraries are depicted in Figure 12.

Figure 12. Deep Learning frameworks and libraries with accelerated support

More details about ML/NN/DL frameworks and libraries are available in the Deliverable D6.1 – State-of-the-art Deep Learning, Neural Networks and Machine Learning frameworks and libraries.

4.2. Support for accelerators at hypervisor/container level
In the past, support for accelerators at the hypervisor/container level was not adequate. Technologies like PCI passthrough or device mapping depend on host configurations, which makes generalization and system-independent implementation difficult. For example, the PCI passthrough technique used by hypervisors needs exact information about the PCI slot of the device; therefore, administrators must configure hypervisors manually for each piece of hardware. Similarly, device-mapping techniques in container technologies allow containers to access devices, but the software drivers for the devices must match between the host and the containers, preventing the implementation of generic containers with accelerator support. More generic solutions were implemented only recently. For example, the first official version (version 1.0) of nvidia-docker was released in January 2017 and version 2.0 in December 2017. Similarly, LXD 3.0 (released in April 2018) is the first version of LXD with GPU plugin support. The details of each approach and their comparison are provided in the following subsections.

4.2.1. PCI passthrough and SR-IOV
PCI passthrough [PCIPASS] is a method to attach a PCI device directly to a virtual machine. This technique can be used in cases where a physical device does not need to be shared between multiple virtual guests or when it is not possible to share it. In this case, dedicating the device to one guest by giving it direct access to the device on the PCI bus means practically native performance even in a virtualized environment – a significantly faster access than in the case of device emulation. Another advantage of PCI passthrough is that the device itself does not need to be known to the hypervisor, since it does not need to be emulated. However, PCI passthrough presents some major disadvantages as well. The device cannot be shared between different machines (as it is attached exclusively to a VM), and in order to enable live migration of a guest which has direct access to a PCI device, the device's status needs to be migrated too; for this, PCI hotplug support in the device is necessary. PCI passthrough requires both hardware support (VT-d in Intel chipsets, IOMMU in AMD chipsets) and hypervisor support. The guest operating system must also support PCI passthrough to be able to fully use its capabilities.
SR-IOV stands for Single Root Input/Output Virtualization, and it is a specification that allows PCI functions to be exposed in an isolated way, by means of the creation of virtual functions that are similar in functionality to the functions offered by the physical resource. Using the SR-IOV technology, a PCI device exposes several virtual functions; therefore, a single physical PCI device may be assigned to several virtual guests in a transparent and configurable way. Combining PCI passthrough with SR-IOV is the most advantageous situation, since it is possible to attach a device (like a network card) to one or several virtual machines, with the sharing done at the SR-IOV level.
With PCI passthrough it is potentially possible to attach any PCI device to a VM. However, some hardware vendors have imposed restrictions in their device drivers, which refuse to work if they detect that they are being loaded inside a virtualized environment. This is the case for some consumer-focused NVIDIA GPU cards, whose drivers refuse to work under any virtualization platform (see [NVIDIA-PATCHES] for instance), even if the card works perfectly inside the VM. For those circumstances, it is necessary to spoof the CPUID opcode so that the system appears to be non-virtualized.
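On the host side, configuring PCI passthrough typically starts by identifying the device and verifying that the IOMMU is active. A minimal sketch of such checks, using standard Linux tools (the exact kernel messages and group layout depend on the platform):

    # identify the GPU's PCI address and vendor/product IDs, later needed by the hypervisor
    lspci -nn | grep -i nvidia
    # verify that the IOMMU (Intel VT-d / AMD-Vi) has been enabled at boot
    dmesg | grep -i iommu
    # inspect IOMMU groups: the GPU should sit in its own group for clean isolation
    find /sys/kernel/iommu_groups/ -type l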

4.2.2. GPU-specific virtualization methods
Since GPU shaders are excellent parallel processing units, they are used in an extensive range of applications where matrix calculations are needed – among them the already mentioned deep learning. The advent of specialized General Purpose GPUs (GPGPUs) has led to the necessity of devising methods for their virtualization in cloud environments, so that a pool of GPGPUs can be flexibly allocated to a pool of virtual computation nodes. To this end, every major producer of GPUs has designed a standard for virtualization of its GPU resources. These standards usually rely on a combination of hardware and software support and apportion the GPU resources – memory, shaders, video encoders and decoders – either as physical slices or as time slots. The standards are NVIDIA GRID vGPU [VGPU], AMD MxGPU [Wong 2016] and Intel GVT [GVT].

The following table summarizes current support for these GPU virtualization technologies in the main hypervisor platforms:

Hypervisor | NVIDIA PCI-passthrough | NVIDIA GRID | AMD MxGPU | Intel GVT
KVM | partial/coming [NVGFORUM] [NVGDOCS] | partial/coming [NVGFORUM] [NVGDOCS] | available [GIM] [AMDDRIV] | available [GVTLINUX]
Xen | partial/coming [NVGFORUM] [NVGDOCS] | partial/coming [NVGFORUM] [NVGDOCS] | available [GIM] [AMDDRIV] | available [GVTLINUX]
VMWare | available [NVPASS] | available [XENAPP] | available [AMDDRIV] | not available
Table 4. Current support for GPU virtualization technologies in the main hypervisor platforms

NVIDIA GRID vGPU
• Physical sharing of RAM (slices)
• Time sharing of shaders (slots)
• Time sharing of video encoders and decoders (slots)
• Supports CUDA since GRID version 5.0
• Software-based management of virtualization
  ◦ Software component for the hypervisor
  ◦ Virtual GPU driver for the guest OS
AMD MxGPU
• Physical sharing of RAM (slices)
• Physical sharing of shaders (slices)
• No virtualization of video encoding and decoding
• Support for OpenCL
• Hardware-based management of virtualization, based on the SR-IOV standard
Intel GVT
• Different modes of operation
  ◦ GVT-d for dedicated access to the GPU by one host
  ◦ GVT-g for time-shared access to the GPU's shaders and encoders/decoders, and slice-shared access to the RAM
  ◦ GVT-s for sharing via a virtual driver
• Support for OpenCL
• Software-based management of GPU virtualization

4.2.3. Device mapping (passthrough) for LXD/Docker
Mellanox docker-passthrough-plugin
This network plugin allows direct/passthrough access between the native Ethernet or InfiniBand networking device and the Docker container(s). It provides two modes of operation:
• SR-IOV mode – In this mode, a given netdev interface is used as a PCIe physical function to define a network. All container instances get one PCIe VF-based network device when they are started. This mode uses the PCIe SR-IOV capability of the network devices. SR-IOV mode provides native access to the actual PCIe-based networking device without any overhead of virtual devices. With this mode, every container can get dedicated NIC Tx and Rx queues to send or receive application data without any contention with other containers.
• Passthrough mode – In this mode, a given netdev interface is mapped to a container. This means that there is one network device per network, and therefore every container gets one network. In some cases there is a need to map a bonded device directly, without an additional layer and without consuming any extra MAC address; in such cases this passthrough plugin driver is equally useful.
In some sense, both modes are similar to the passthrough mode of KVM or similar virtualization technologies. With these plugin-based interfaces, there is no limitation on the IP address subnet for the netdevice of the container and the netdevice of the host. Any container can have any IP address, on the same or a different subnet as that of the host or other containers. In the future, more settings for each such netdevice and network will be added. In SR-IOV mode, the plugin driver takes care of enabling/disabling SR-IOV and of assigning a VF-based network device to a container when it is started. This reduces the administrative overhead of dealing with SR-IOV enablement [MELLANOX].
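Beyond network devices, generic device mapping is also the basic way to expose a GPU to a container with LXD or plain Docker. A minimal sketch (assuming a host with the NVIDIA kernel driver loaded; the matching user-space driver libraries still have to be present inside the image, which is exactly the limitation discussed later in Table 5):

    # LXD: attach the host GPU to an existing container using the 'gpu' device type
    lxc config device add mycontainer gpu0 gpu
    # Docker: map the NVIDIA device nodes explicitly into the container
    docker run --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0 <image> <command>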

4.2.4. nvidia-docker runtime

Figure 13. nvidia-docker runtime

NVIDIA proposes a solution to preserve the portability advantage of Linux containers, in this case Docker containers, despite the specificities and constraints of using specialized hardware. This kind of device requires the use of drivers. Those drivers, within a traditional Docker solution, would need to be fully installed inside the container and match the underlying device. Thus, the version of the host driver would have to exactly match the driver version installed in the container, which makes the Docker images unsharable: each image would have to be built locally for each host. nvidia-docker acts as a runtime that allows the use of Docker images that are independent of the target host (see Figure 13). Containerizing GPU applications provides several benefits [nvidia-docker], among them:
• Ease of deployment
• Isolation of individual devices
• Running across heterogeneous driver/toolkit environments
• Requiring only the NVIDIA driver to be installed on the host
• Facilitating collaboration: reproducible builds, reproducible performance, reproducible results.
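In practice the runtime is exercised with an ordinary docker invocation. A minimal sketch using the public nvidia/cuda image (the exact invocation differs between nvidia-docker 1.x and 2.x):

    # nvidia-docker 1.x: a wrapper command mounts the driver volume automatically
    nvidia-docker run --rm nvidia/cuda nvidia-smi
    # nvidia-docker 2.x: registered as an additional Docker runtime
    docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

If the host driver is installed correctly, both variants print the same nvidia-smi output inside the container as on the host, without any driver libraries baked into the image.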

4.2.5. Comparison of approaches for supporting accelerators

Availability:
• PCI-passthrough: generally available on new hardware.
• GPU virtualization: limited availability, only certain types of the newest Intel, AMD and NVIDIA cards; support for some virtualization platforms is still not sufficiently stable.
• Device mapping: generally available, supported by most container technologies.
• GPU runtime: generally available implementations with different levels of maturity.

Usability:
• PCI-passthrough: manual configuration required, hardware dependence, difficult to generalize.
• GPU virtualization: if available, can be used to share GPGPU resources across a virtualized environment and allocate shader units to nodes.
• Device mapping: potential compatibility issues caused by conflicting versions of the software drivers on the host and in containers.
• GPU runtime: easy to use; automatic detection and inclusion of software drivers from host to container, removing the driver-version conflicts of device mapping.

Limitations:
• PCI-passthrough: cannot suspend/migrate the VM; stability and security issues.
• GPU virtualization: different GPU manufacturers offer different versions of capabilities, placing limitations on the usability of heterogeneous environments.
• Device mapping: versions of software drivers must match between container and host.
• GPU runtime: the NVIDIA driver installation needs to live on the same partition as the volume directory of the Docker plugin.

Implementation:
• PCI-passthrough: hypervisors.
• GPU virtualization: KVM, VMWare.
• Device mapping: containers, LXD 3.0.
• GPU runtime: nvidia-docker.

Table 5. Comparison of approaches for supporting accelerators

4.3. Support for accelerators at cloud middleware level

4.3.1. OpenStack
OpenStack provides support for GPU accelerators via PCI passthrough. On the compute node, the hardware properties of GPU accelerators must be identified and included in a whitelist in the Nova configuration. An alias is assigned to the accelerator on the master (controller) node, and the alias is then added to the properties of flavors that require accelerators. The configuration process for supporting accelerators is manual.
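A minimal sketch of this configuration (option names as in recent OpenStack releases such as Queens; the vendor/product IDs and the alias and flavor names below are hypothetical examples):

    # nova.conf on the compute node: whitelist the GPU by vendor/product ID
    #   [pci]
    #   passthrough_whitelist = { "vendor_id": "10de", "product_id": "1db4" }
    # nova.conf on the controller node: define an alias for the same device
    #   [pci]
    #   alias = { "vendor_id": "10de", "product_id": "1db4", "device_type": "type-PCI", "name": "gpu" }
    # attach the alias to a flavor so that instances created from it receive one GPU
    openstack flavor set gpu.large --property "pci_passthrough:alias"="gpu:1"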

One of the largest limitations of supporting accelerators via PCI passthrough in OpenStack is the inability to suspend/migrate virtual machines that use accelerators. In the case of a security update or other maintenance activities that require rebooting compute nodes, site administrators must stop the running virtual machines and restart them afterwards. The most recent release of OpenStack (Queens, released on 28 February 2018) provides preliminary, experimental support for GPU virtualization; both the hypervisor and the GPU cards must support it.

4.3.2. OpenNebula
OpenNebula has been able to keep track of PCI devices and assign them to virtual machines since version 5.0 (2016). Still, many issues may arise in other layers (hardware, hypervisor, drivers within the virtual machine) and OpenNebula can provide only limited assistance in overcoming them. The issue that is most difficult to tackle relates to device isolation. The GPU accelerator must be made available to the hosted virtual machine over PCI, while other PCI devices must remain, naturally, under the control of the host. This requires hardware support: not only must the hardware support VT-d and IOMMU, but the GPU device must also be installed in a dedicated IOMMU group to achieve proper isolation. Otherwise, overly benevolent rights would have to be granted to OpenNebula users, allowing them to control all PCI hardware, which is, of course, not acceptable, certainly not in multi-tenant environments.
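At the template level, the request for a passed-through device is expressed with the PCI attribute of the VM template. A minimal sketch (hypothetical template and IDs; the device can be selected by any combination of vendor, device and class ID):

    # the PCI attribute requests any NVIDIA VGA-class device present on the selected host
    cat > gpu-vm.tpl <<'EOF'
    NAME   = "gpu-vm"
    CPU    = 4
    MEMORY = 8192
    PCI    = [ VENDOR = "10de", CLASS = "0300" ]
    EOF
    onetemplate create gpu-vm.tpl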

4.3.3. Kubernetes
Kubernetes includes experimental support for managing NVIDIA GPUs spread across nodes [KubernetesGPU]. From version 1.8 onward, the recommended way to consume GPUs is to use the device plugin framework. The device plugin framework enables vendors to advertise their resources to the kubelet without changing the Kubernetes core code. This is done by implementing a device plugin and deploying it manually or as a DaemonSet. There are currently two device plugin implementations for NVIDIA GPUs:
• the official NVIDIA GPU device plugin,
• the NVIDIA GPU device plugin used by GKE/GCE.
Official NVIDIA GPU device plugin
This device plugin is deployed in Kubernetes as a DaemonSet. It requires each GPU node to have the NVIDIA drivers installed, Docker with the nvidia-docker utility, and the default Docker runtime set to the nvidia runtime. Docker must be at least v1.12, the first version with support for custom container runtimes, which makes it possible to use the nvidia runtime. A node GPU can be shared among different containers, and multiple GPUs can be isolated between containers. There is support for the NVIDIA Optimus technology as well. Known issues and limitations:
• CUDA Multi Process Service is not supported at the moment (v1.9);

• a GPU-accelerated X server inside the container is not supported;
• limiting GPU resources (e.g., bandwidth, memory, CUDA cores) per container is not supported;
• the device plugin does not support macOS, Microsoft Windows and Tegra (arm64) platforms; PowerPC64 (ppc64) support is still on the way;
• exclusive access to a GPU is not supported, but it can be enforced through Kubernetes or at the driver level by setting the compute mode of the GPU;
• Docker versions must follow driver versions.
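Once the device plugin is running, workloads request GPUs through the extended resource nvidia.com/gpu. A minimal sketch of a pod definition (hypothetical pod name; the nvidia/cuda image is used only as an example):

    cat <<'EOF' | kubectl create -f -
    apiVersion: v1
    kind: Pod
    metadata:
      name: gpu-test
    spec:
      restartPolicy: Never
      containers:
      - name: cuda
        image: nvidia/cuda
        command: ["nvidia-smi"]
        resources:
          limits:
            nvidia.com/gpu: 1   # scheduled only on nodes advertising a free GPU
    EOF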

4.3.4. Apache Mesos
Apache Mesos supports NVIDIA GPUs from version 1.0.0, and the minimum required NVIDIA driver version is 340.29. Enabling GPU support in an Apache Mesos cluster is straightforward: first, users need to configure agent nodes so that they expose GPUs to the Apache Mesos master nodes; second, they need to enable the framework GPU capability so that the master includes the GPUs in the resource offers sent to the framework.
Mesos exposes GPUs as a simple SCALAR resource in the same way it always has for CPUs, memory, and disk. However, unlike CPUs, memory, and disk, only whole numbers of GPUs can be selected. At the time of this writing, NVIDIA GPU support is only available for tasks launched through the Mesos containerizer (i.e., no support exists for launching GPU-capable tasks through the Docker containerizer). That said, the Mesos containerizer now supports running Docker images natively, so this limitation should not affect most users. Moreover, Apache Mesos mimics the support provided by nvidia-docker to automatically mount the proper NVIDIA drivers and tools directly into the Docker container. This means that anybody can easily test GPU-enabled Docker containers locally and deploy them to Apache Mesos with the assurance that they will work without modification.
Agent Flags
The following isolation flags are required to enable NVIDIA GPU support on an agent.

--isolation="filesystem/linux,cgroups/devices,gpu/nvidia"
The filesystem/linux flag tells the agent to use Linux-specific commands to prepare the root filesystem and volumes (e.g., persistent volumes) for containers that require them. Specifically, it relies on Linux mount namespaces to prevent the mounts of a container from being propagated to the host mount table. In the case of GPUs, this flag is required to properly mount certain NVIDIA binaries (e.g., nvidia-smi) and libraries (e.g., libnvidia-ml.so) into a container when necessary. The cgroups/devices flag tells the agent to restrict access to a specific set of devices for each task that it launches (i.e., a subset of all devices listed in /dev). When used in conjunction with the gpu/nvidia flag, the cgroups/devices flag allows access to specific GPUs to be granted and revoked on a per-task basis.
By default, all GPUs on an agent are automatically discovered and sent to the Mesos master as part of its resource offer. However, it may sometimes be necessary to restrict access to only a subset of

the GPUs available on an agent. This is useful, for example, if a specific GPU device should be excluded because an unwanted NVIDIA graphics card is listed alongside a more powerful set of GPUs. When this is required, the following additional agent flags can be used to accomplish this:

--nvidia_gpu_devices="<list_of_gpu_ids>" --resources="gpus:<number_of_gpus>"

For the --nvidia_gpu_devices flag, a comma-separated list of GPU ids is needed, as determined by running nvidia-smi on the host where the agent is to be launched.
Framework Capabilities
Once the agent is launched with the flags above, GPU resources will be advertised to the Mesos master alongside all of the traditional resources such as CPUs, memory, and disk. However, the master will only forward offers that contain GPUs to frameworks that have explicitly enabled the GPU_RESOURCES framework capability. The choice to make frameworks explicitly opt in to the GPU_RESOURCES capability was made to keep legacy frameworks from accidentally consuming non-GPU resources on GPU-capable machines (and thus preventing GPU jobs from running). This is less of a concern if all nodes have GPUs, but in a mixed-node environment it can be a significant problem.
External Dependencies
Any host running a Mesos agent with NVIDIA GPU support MUST have a valid NVIDIA kernel driver installed. It is also highly recommended to install the corresponding user-level libraries and tools available as part of the NVIDIA CUDA toolkit. Many jobs that use NVIDIA GPUs rely on CUDA, and not including it will severely limit the type of GPU-aware jobs that can be run on Mesos. The minimum supported version of CUDA is 6.5.
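Putting the flags together, a GPU-enabled agent could be started as sketched below (hypothetical master address, work directory and GPU ids; the flag names are those documented for Mesos 1.x):

    mesos-agent --master=<master_host>:5050 \
                --work_dir=/var/lib/mesos \
                --isolation="filesystem/linux,cgroups/devices,gpu/nvidia" \
                --nvidia_gpu_devices="0,1" \
                --resources="gpus:2"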

5. Interaction with HPC resources using PaaS approach

5.1. Specifics of HPC systems
HPC architecture
Typical HPC systems consist of at least one head node and a set of worker nodes connected to each other via a very fast network (e.g., InfiniBand, Ethernet, OmniPath). Worker nodes (WNs) have their own IP address pool and are not reachable from outside the cluster. Depending on site policies, outbound communication may or may not be allowed; the typical approach is to allow all outbound communication because many modern applications require it. Some new HPC configurations allow a higher isolation level to be provided by defining stricter communication rules, e.g., limiting communication only to the WNs assigned to a job, or to specific ports, protocols or IPs. This can be defined by site administrators or by the job description.

Access to HPC
The typical way to access HPC resources is to log in to the head node and prepare and submit jobs from there. Short compilation of program code and pre- or post-processing of submitted jobs might be permitted. SSH or gsissh is used for accessing head nodes. In some cases, graphical access to a head node is also allowed via, e.g., X2Go, an NX client, or a VNC client. Access to worker nodes depends on the site policy and configuration. Usually it is possible to access WNs via the local job scheduler only, but on some sites SSH access is allowed to WNs on which the user's jobs are currently running.
It is possible to set up additional head nodes dedicated to a project or to a group of users. For example, scientific portals allow users to define jobs visually in the portal and submit them to HPC resources. Users do not need to care about the submission process, since the portal server submits jobs directly to the job management system, possibly using an API. Grid middleware allows HPC resources to be accessed without using HPC head nodes: grid middleware running on HPC sites accepts jobs from grid brokers and submits them directly to the local job management system. Users can use grid job submission clients to run their jobs on different sites with different management systems. Usually grid middleware is used from a grid user interface (grid head node) or from GUIs (graphical user interfaces). Examples of middleware include gLite WMS with glite-wms-client and QCG Broker with qcg-client.
Access to storage
There are different kinds of storage systems on HPC systems and various ways of accessing them:
• The typical way is to have a storage system shared between head nodes and worker nodes. Users can upload and download data by accessing the head node with sftp, scp or gsisftp. Two kinds of storage can be provided: low-performance long-term storage, e.g., the home directory filesystem, and high-performance short-term storage, e.g., the scratch directory filesystem. All data in long-term storage are regularly backed up, while short-term storage is typically not backed up or archived. All data intensively used by computations should be copied from the home directory filesystem to the scratch directory filesystem at the beginning of the computation and moved back to home at the end. Some sites do not share the home filesystem; there users must stage in and stage out their input and output data.
• There are some specialized storage systems used by specific projects. E.g., the WLCG (Worldwide LHC Computing Grid) experiments ATLAS and ALICE keep their data in the DPM system [LCGDM]. Users access the storage system directly to upload/download data. Worker nodes can also access this storage and get the necessary data.
• Another way of providing storage for HPC is to use logical storage, e.g., LFC (Logical File Catalog) or OneData. Users do not need to use the head node to prepare their data, but interact with the storage itself. Logical storage hides the complexity of the underlying storage and can provide replication to serve data efficiently. This is a user-friendly way of accessing data but may degrade the performance of job processing in some conditions.

Registration
Users must be authorized to run jobs on HPC resources. Usually users need to apply for an HPC account individually on each HPC system or group of systems. This depends strongly on site policies, and procedures can differ between sites (way of application, necessary documents, assignment of a name id, expiration time of an account, etc.); e.g., registration may depend on the amount of requested resources (for example CPU hours). Usually permission to use HPC is granted for a specific time and with a given limit of available resources. Users must periodically report the results of their computations. In the case of scientific research this is usually done by providing sites with a list of publications in which the usage of the HPC infrastructure was referenced. A written status report may be required by the end of the usage period.
It is possible to register with many HPC systems at the same time, which is common at the site or infrastructure level. E.g., the central PRACE portal is used to apply for computation projects on HPC machines from the PRACE infrastructure, and the PLGrid portal is used to apply for accounts on Polish HPC machines. In both cases, information about users is added to local LDAP systems, replicated to specific machines and used in the authorization process.
Another approach is used in the EGI gLite middleware. Because of the large number of users and sites, it was not practical to create a physical unix account for each user. Instead, pool accounts are used. They have generic names (e.g., atlas001 to atlas299) and are assigned to users when needed and cleaned when no longer needed. Users register with VOs (Virtual Organizations), which authorize them to use the resources dedicated to the respective VO. Certificates are used for authentication. A site authorizes a VO as a whole, while maintaining control over individual users via white and black lists.
Another possible way is to use shared accounts, e.g., to run jobs submitted by a generic project user via a portal. One HPC account is used to submit jobs on behalf of users registered in the portal. This approach causes problems with job separation, job monitoring and accounting, which must be solved by the portal's server-side software. A project representative negotiates with the HPC owner for available resources and a way of settling accounts (money, publications, etc.).
Queue names
In order to submit jobs to local resource management systems (LRMS), users need to specify the queue name and job parameters. Different queues can have different limits, permissions and priorities. The names are local to HPC systems and usually differ between sites; therefore, users must know the queue name before submitting jobs. In addition, queue names and parameters can change over time. The configuration of queues can be found by running command-line programs or can be read from site documentation.
gLite middleware provides an information system describing the resources and services that sites provide. LDAP with the GLUE schema is used to publish information about available resources and their status. Site administrators configure queue names for each VO statically.

DEEP-HybridDataCloud – 777435 41 QCQ middleware hides queue names from users, which just submit jobs to sites. Queues configuration in kept in a Broker configurations file and needs to be updated when site configurations change. Some job management systems support scheduling queues. This is a queue that takes a user’s job and schedules it to destination a queue depending on job parameters, e.g., time, memory size, number of cores, tags, etc. This way users do not need to care about the queues configuration, but just need to properly describe their jobs. This solution is available in the NQE management system and SLURM Workload Manager. Queue limits Each queue can have limits for job resources, especially time, number of cores and memory size. In addition, there are limits for the total number of jobs or total number of given jobs in the queue. Different sites have different policies for setting queue limits. The most typical limitation is the running time of jobs. Some sites have strict limits to 24 hours while most sites limit duration to no more than 7 days. Extra-long queues of more than 7 days may be available as well but have other strict limits, for instance, only one job can run at a time. Time limit can be connected with queue priority; e.g., shorter jobs can have higher priority. Depending on site policies, the priority or fairshare can be set up to promote shorter jobs, bigger jobs, or less memory bound jobs. When users submit jobs with parameters that exceed queue limits then the submission fails. Queue waiting time The time jobs spend in a queue depends on many factors including: user priority, number of user jobs, total number of jobs, fairshare configuration, resources available for the queue. It is very difficult to predict the job waiting time. Some job managers can show expected job start time, but it is very inaccurate and can be useless if a site supports urgent computing (highest priority jobs that can even preempt normal jobs). Preemption is a form of a guarantee that the job will start (almost) immediately. In normal HPC operation, preemptive jobs are rather exceptional and are not used. Better form of providing time guarantees for jobs is to use reservations. Reservations can be created for a user or a group of users, for a specific time or periodically. Any other users cannot use reserved resources. Access to software Most user jobs use applications available on HPC systems and installed by site administrators. Access to the software can be different on each HPC site, especially the path to executable and available versions. One solution to this problem is the use of modules. But even then the way of starting applications can be different, e.g., module paths can be different or can have different names. E.g., intel compiler can be available after “module add icc” or “module add tools/intel”. Some heterogeneous sites require that users are aware of architecture (“module load gaussian/amd”

DEEP-HybridDataCloud – 777435 42 vs “module load gaussian/intel”) while other sites hide this from users and prepare software in the same paths but optimized for the given node. Some projects or middleware try to unify application namespaces. E.g., PLGrid requires that all Polish HPC sites must have the same application in the same mode path, e.g. /plgrid/apps/gaussian/ EGI infrastructure used VO tags and VO software directory where VO-manager could install required software. This unifies software across sites, but requires significant effort. QCG tries to unify access to applications by providing application templates. A user fills in a form specific for an application and QCG Broker knows how to run the application on destination node. This is configured in QCG-Computing service on each site. The potential solution for the software problems is to use containerized jobs and software provided by user. While it solves some problems, it introduces others, especially connected to computation performance. Applications installed by administrators are optimized for the given site, especially for interconnection or CPU architecture. Using a generic version of applications instead of the onsite compiled or tuned is usually wasting of resources. This concerns also libraries, especially GPGPU, MPI and computational libraries. Running parallel jobs Sites can have different ways to start parallel MPI jobs. At least two different approaches were found. One approach is to use “mpirun -np N” from job script. The number of requested MPI processes must be given twice – in job parameters and in the mpirun commandline. The second approach is to use srun from job script and srun takes the number of mpi processes from job contexts. Therefore, job scripts should be prepared for each site individually. Conclusions Access to HPC is non-standardized and each site can have different configuration, rules and policies. It is not easy to automate the access to HPC resources from higher levels (e.g., PaaS environments).
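The sketch below contrasts the two job-script conventions mentioned above for a SLURM-managed site (hypothetical application name and task count; other batch systems use analogous directives):

    #!/bin/bash
    # Variant A: mpirun in the job script, process count repeated explicitly
    #SBATCH --ntasks=32
    mpirun -np 32 ./my_app

    #!/bin/bash
    # Variant B: srun takes the process count from the job context
    #SBATCH --ntasks=32
    srun ./my_app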

5.2. Using containers in HPC
As described in Section 5.1, users have rather limited rights in the HPC environment and typically use software pre-installed by site administrators. This can, however, be extended to almost any application if the HPC system allows submitting containerized jobs. Container technology, also referred to as "operating system virtualization", provides a way to pack an application with all necessary libraries or system dependencies such that it runs isolated from other applications, the underlying Linux operating system and the environment (Sec. 2.2). Obvious advantages for users are the ability to run any custom application in an HPC environment and the mobility of applications, i.e., homogeneous deployment of an application across different HPC clusters. The former also means that users can easily update the software to the latest available version, which may generally improve the output value of their application. Pre-built containers of various popular software frameworks are publicly available via, e.g., Docker Hub [DockerHub], allowing these frameworks and their features to be assessed quickly. Possible downsides of containerized applications on HPC include: a) there is not always an obvious way for a container to access specialized hardware and libraries, e.g., GPU, InfiniBand, MPI; b) mobility supposes generalization for various HPC systems, i.e., less optimization, which may result in partially degraded performance; c) possible security issues.
Section 2.2 reviewed the major container tools available and being developed at the time of writing. The most popular one, Docker [Docker], cannot actually be used in a satisfactory way in HPC systems because of security concerns [Gomes 2017]. When Docker starts a container, processes running within the container are executed with the root id, potentially making it possible to gain privileged access on the host machine. Docker also requires Linux kernel version 3.10+ [DockerFAQ], which is not the case for RHEL6 or CentOS6, still largely used in HPC systems.
Among the tools allowing containerized applications to run in an HPC environment, the two which have received attention in recent years are Singularity [Singularity] and udocker [uDocker] (see Sec. 2.2). The main difference between them for an HPC user is that Singularity has to be installed by a system administrator, while udocker is entirely a user tool. The latter allows users to profit from the most recent developments as soon as a new version is available. As an example, at the time of this writing, the Singularity version available in the EPEL repository for RHEL7 [SingularityInEPEL] is 2.2.1, while the most recent Singularity version from sources is 2.4.5 [Singularity]. Often, due to HPC policies, system administrators prefer installing software from official repositories rather than building from sources; an HPC user therefore depends on the HPC administrator and on the repository maintainer to get the most recent version. In the case of Singularity, the most recent versions (2.3+) allow pulling Docker images from Docker Hub and significantly simplify access to GPUs; neither feature exists in 2.2.1, so they are not available to many HPC users. Udocker does not suffer from this, since a regular user can install the most recent version under his/her account. Thus, both Singularity (2.3+) and udocker can pull images from Docker Hub and provide access to GPUs. In tests performed on our local GPU cluster, we did not notice any statistically significant difference in computational speed between tasks running on a GPU by means of either Singularity or udocker.
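A minimal sketch of how the two tools are typically driven on a GPU node (the image name is chosen only as an example; option names, in particular udocker's GPU setup option, may differ between releases):

    # udocker: installed and run entirely by the user, no root privileges required
    udocker pull nvidia/cuda
    udocker create --name=cudatest nvidia/cuda
    udocker setup --nvidia cudatest
    udocker run cudatest nvidia-smi

    # Singularity 2.3+: installed by the administrator, --nv exposes the host GPU
    singularity exec --nv docker://nvidia/cuda nvidia-smi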

5.3. Local schedulers
Job schedulers were designed to schedule large-scale scientific modeling and simulations on supercomputers and became a key component of today's scalable computing infrastructures. The role of job schedulers is to orchestrate all of the work executed on the computing infrastructure, which affects the effectiveness of the system. In recent years, job workloads have diversified from long-running, synchronously parallel simulations to include short-duration, independently parallel high-performance jobs and data analysis jobs; consequently, different schedulers have been developed to address both job workload and computing system heterogeneity.
A list of the most common job schedulers adopted in publicly funded scientific computing infrastructures, together with their related features, is presented in Table 6; the main systems are described below:

• The Portable Batch System (PBS) Scheduler [PBS] is directly descended from NQS [NQS], the first batch scheduler developed under NASA funding. PBS is a fully featured job scheduler that includes a separate queue manager, resource manager, and scheduler; the Maui scheduler is often used in place of the native PBS scheduler. PBS continues to be developed by Altair Engineering (PBSPro) and Adaptive Computing (TORQUE/Maui/Moab Cluster Suite), which both offer open-source and commercial versions. Recently, scalability challenges have been addressed, support for individual GPU scheduling has been added, and with the adoption of the Moab HPC Suite (the job scheduler that succeeded the Maui Cluster scheduler), full support of array scheduling policies, dynamic priorities, extensive reservations, and fairshare capabilities has been introduced.
• Grid Engine is a full-featured, very flexible scheduler originally developed and released under the name CODINE in 1993 by Genias Software [Gentzsch 1994], later acquired and matured by Sun Microsystems and Oracle, and currently offered commercially by Univa. There are also several open-source versions, including Son of Grid Engine and Open Grid Scheduler, though further development of these offerings seems to be waning.
• IBM Platform Load Sharing Facility (LSF) is a full-featured and high-performing scheduler that is very intuitive to configure and use. It was based on research and development of the Utopia job scheduler at the University of Toronto [Zhou 1993]. OpenLava [OpenLAVA] is an open-source derivative of LSF that has reasonable feature parity.
• HT Condor (High Throughput Condor) continues to be developed by Prof. Myron Livny's team at the University of Wisconsin [Thain 2005]. HT Condor was designed to cycle-scavenge from distributed heterogeneous desktop computers to managed clusters. It implements a single queue with many resource- and job-distinguishing options, including prioritization and fair execution. It works especially well for many smaller, single-process jobs with execution on a diverse set of computers, clusters, and supercomputers.
• The Simple Linux Utility for Resource Management (Slurm) [Yoo 2003] is an extremely scalable, full-featured scheduler with a modern multi-threaded core scheduler and a very high-performing plug-in module architecture. The plug-in module architecture makes it highly configurable for a wide variety of workloads, network architectures, queuing policies, scheduling policies, etc. It was inspired by the Quadrics RMS scheduler. To ease transition from other schedulers, translation interfaces are provided. Slurm began development in the early 2000s at Lawrence Livermore National Laboratory to address scalability issues in job schedulers. It is entirely open source; SchedMD provides consulting services and community development leadership.

Feature | PBS | Grid Engine | LSF | HT Condor | SLURM
Parallel/array jobs | both | both | both | both | both
Cost/licensing | Commercial, Open source | Commercial, Open source | Commercial, Open source | Open source | Open source
OS support | Linux | Linux | Linux | Linux | Linux
Scheduler type | Centralized | Centralized | Centralized | Centralized | Centralized
Scalability | up to 80K | 10K+ | 10K+ (1K+ with OpenLava) | up to 40K per scheduler | 100K+
GPU support | Yes (limited in the open source version) | Yes | Yes | Yes | Yes
Table 6. List of the most common job schedulers adopted in scalable computing infrastructures and their related features

5.4. PaaS Interaction and interfaces
Exposing HPC resources to end-users through the Platform-as-a-Service layer aims to lower the barriers to accessing HPC environments. In recent years, great effort has focused on the adoption of cloud computing for running HPC workloads: nowadays almost all cloud providers are able to provide virtual instances with accelerators and low-latency interconnection networks. Several benefits can come from this integration, which facilitates both workload management and interaction with remote resources. Compared to traditional HPC environments, in cloud environments users can 1) quickly adjust their resource pools via a mechanism known as elasticity, and 2) optimize resource utilization, for example using spot instances. These features help add flexibility to HPC applications.
Cloud technology can simplify the use of HPC resources, in particular for non-IT-specialized users with no expertise in system administration and configuration of complex computing environments. In fact, an emerging approach both in public and private clouds is to create "HPC as a service" platforms where HPC applications are executed in the cloud without requiring users to have any understanding of the underlying cloud infrastructure: the provisioning of the resources and their configuration is completely automated. An example is AWS CfnCluster, which allows an HPC cluster to be quickly deployed on AWS with pre-installed open-source batch schedulers and MPI libraries. The user can interact with the HPC environment through the web-based EnginFrame portal that gives access to applications (both batch and interactive), data, and jobs.
Other important factors are contributing to changing the way HPC systems are used: the development of new applications, like deep learning, and the increasingly wide adoption of containers to support a DevOps approach for developing and running applications. These new trends are leading HPC system admins and architects to embrace new approaches: 1) run both containerized and non-containerized workloads on existing HPC clusters; 2) run HPC workloads on local or cloud-resident Kubernetes/Mesos clusters.
Concerning the first approach, support for containerized jobs and applications is being added to the main workload managers; moreover, the use of udocker greatly simplifies the execution of Docker containers in any HPC cluster, since this tool allows the containers to be run in user space without requiring root privileges. The second approach is more innovative, and it is likely to become more common as the availability of new application frameworks (Artificial Intelligence, Big Data, Deep Learning, etc.) designed and implemented to run natively on these platforms (Kubernetes, Mesos) increases. In any case, the two approaches are not exclusive and can co-exist. Indeed, the co-existence of the two environments, the traditional on-premise HPC cluster and the Kubernetes/Mesos cluster, allows mixed workloads to be run and the most suitable environment to be selected depending on the characteristics of the workload itself. For example, tightly coupled applications can benefit from better performance running on HPC clusters (with a high-performance network); on the other hand, embarrassingly parallel applications can be deployed on the cloud or on container orchestration platforms like Kubernetes or Mesos, exploiting built-in features like elasticity and resilience.
As a final remark, it is worth highlighting that, from a business point of view, the hybrid cloud model is currently gaining importance since it ensures sustainability for several companies with HPC workloads. With this model, it is possible to leverage the existing computing infrastructure and, depending on peak demands, parts of the workload can be moved temporarily to the cloud. From the analysis of the presented approaches, it is evident that exploiting hybrid/mixed environments can be a successful approach; in this case the role of a smart orchestrator/broker is crucial to select the best environment for running the users' workloads.

6. Initial implementation plan
The motivation of the DEEP Hybrid DataCloud project is the need to support intensive computing techniques that require specialized HPC hardware, like GPUs or low-latency interconnects, to explore very large datasets. A Hybrid Cloud approach enables access to such resources, which are not easily reachable by researchers at the scale needed in the current EU e-infrastructure. The project is structured into six different work packages, covering Networking Activities (NA) devoted to coordination, communication and community liaison; Service Activities (SA) focused on the provisioning of services and resources for the execution of the data analysis challenges; and Joint Research Activities (JRAs), dealing with the development of new components and technologies to support data analysis. The interaction between the different work packages is presented in Figure 14.

Figure 14. DEEP-HybridDataCloud Work Packages and their relations

WP4 "Accelerated and High Performance Computing in the Cloud" lies in the lowest layer among the JRA work packages. The research and development activities done in this work package are close to the hardware and infrastructure, addressing the gaps that currently exist in the support of accelerators, specialized hardware and HPC systems in general, in order to exploit the full potential of computing performance provided by the hardware. At the upper level, WP5 "High Level Hybrid Cloud solutions" will take care of the provisioning of the platform exploiting the outcomes from WP4 in a hybrid approach, delivering an execution platform and ensuring that applications can be spawned across several cloud infrastructures. The highest level among the JRA work packages is WP6 "DEEP as a Service", which will provide solutions to ensure that scientists have an easy way to deploy and execute their compute-intensive applications based on containers (from NA2/WP2) that will be executed on a hybrid cloud platform (JRA2/WP5), exploiting the specialized hardware that their application requires (JRA1/WP4).

6.1. Task structure and coordination activities
The main research activity in WP4 is to work close to the hardware and infrastructure, addressing the gaps that currently exist in the support of accelerators, specialized hardware and HPC systems in general. It will ensure that bare-metal-like performance is delivered through the adopted solution, and that the resources can be shared in multi-tenancy environments. Proper interfaces will be exposed to the upper layers, from the virtualization layer to the cloud management framework to the platform one. On top of that, high-level access to HPC resources will be investigated, providing seamless access and data sharing from Cloud infrastructures.

This WP can be logically divided into three tasks:
• Task 4.1 – Bare-metal like performance: The aim of this task is to build on existing middleware to interact as closely as possible with bare-metal resources. This will include minimizing the overhead of the software virtualization layer using para-virtualization or container technologies, providing seamless access to performance-critical hardware resources (accelerators, InfiniBand) from virtual machines/containers (in cooperation with Task 4.2), and ensuring support at the cloud middleware and PaaS levels.
• Task 4.2 – Extended support for accelerated computing: This task is dedicated to providing extended support for accelerators through all software layers, to hide implementation details and provide uniform access to accelerators from the PaaS and application levels. It will work closely with Task 4.1 for supporting accelerators in containers and cloud middleware, and with Task 4.3 for supporting accelerators on HPC platforms.
• Task 4.3 – Interaction with HPC resources with a PaaS approach: This task will investigate how to integrate HPC-like environments with high-level services and propose a way to execute hybrid, containerized applications within a batch-oriented environment, accessing GPUs and specialized low-latency networks.

6.2. Coordination with other WPs

6.2.1. Coordination with WP 5
Activities in WP4 will aim to improve virtualization techniques in order to minimize the overhead and achieve bare-metal performance; moreover, efforts will focus on making hardware accelerators and low-latency networks first-class citizens in the cloud computing landscape. Complementarily, the main goal of WP5 is to provide uniform and transparent access to IaaS-level resources, including HPC resources, with a hybrid approach. Therefore, WP5 and WP4 will work in close synergy in order to maximize the benefits of leveraging WP4 developments at the PaaS level: WP4 will implement the full support of containers and specialized devices in the virtualization layer and will expose to the PaaS the interfaces needed to provision the requested resources (VMs or containers). WP5 will build an abstraction layer on top of the WP4 interfaces ensuring federated access to the underlying IaaS environments.

6.2.2. Coordination with WP3
The activities taking place in all tasks of WP4 have a direct connection with the two tasks of WP3. The development work occurring in WP4 is supported by services provided or organized by WP3, such as the repository for artifacts, the continuous integration (CI) system, source code repositories, issue trackers, user support, etc. The SQA and the software release and maintenance management have been defined by WP3 after input and discussions with the other JRAs.

The organization of the testbed in WP3 is accomplished in collaboration with WP4, specifically regarding the type of resources and how they should be integrated, such as: IaaS Cloud Management Frameworks (OpenStack and OpenNebula), GPUs, HPC clusters, and Docker container management frameworks (Mesos, Kubernetes).

6.3. Initial implementation plan
According to the task structure, the implementation plan can be divided into three main parts:
• Improving support for containers in cloud middleware: using container technologies as a replacement for hypervisors in traditional cloud middleware and also as the main application delivery mechanism for HPC platforms.
• Improving support for accelerators in cloud middleware: this will include improving support for accessing accelerators at the container layer, as well as improving support for accelerators in the configurations/drivers of the cloud middleware layers.
• Interaction with HPC resources: either accessing HPC platforms via traditional batch systems or managing HPC platforms via cloud middleware.
The schedule of the tasks is given in Table 7:

Table 7 presents the implementation timeplan over the 30 project months (November 2017 – April 2020), with the following task breakdown, deliverables and milestones:
• Task 4.1 – Bare-metal like performance: improving udocker; improving support for nova-lxd in OpenStack.
• Task 4.2 – Extended support for accelerated computing: improving GPU support in udocker; improving support for low-latency interconnects in udocker; adding support for GPU in nova-lxd; improving GPU support in Mesos/Kubernetes.
• Task 4.3 – Interaction with HPC resources: support for accessing HPC from WP5; switching to a cloud management system.
• Deliverables: D4.1 – Assessment of available technologies for supporting accelerators and HPC, initial design and implementation plan; D4.2 – First implementation of software platform for accessing accelerators and HPC; D4.3 – Final implementation of software platform for accessing accelerators and HPC.
• Milestones: MS4.1 – Assessment of available technologies carried out; MS4.2 – Initial design and implementation plan for software platform created; MS4.3 – First implementation of the software platform; MS4.4 – First implementation of the software platform.
Table 7. Implementation timeplan

6.3.1. Improving support for containers in cloud middleware
nova-lxd
In December 2017, LXD 2.21 was released, announcing as one of its new features "a new Infiniband device type which supports physical passthrough of Infiniband devices as well as SR-IOV allocated cards" [LXD 2.21 Release]. Since release 2.5, LXD has also had support for GPU hotplug [LXD 2.5 Release]. So far, there is no evidence that these features can be exploited from OpenStack's nova-lxd perspective. Therefore, we will work towards further developing the nova-lxd OpenStack component to better exploit these underlying capabilities.
udocker
Aiming to maximize application performance and portability throughout (software- and hardware-wise) heterogeneous systems, especially advanced computing systems, udocker will be further developed to improve support for GPUs and other specialized devices like InfiniBand. This task will mostly be centered on software development; however, it will also require testing and benchmarking on High Performance, Grid and Cloud computing resources.

6.3.2. Improving support for accelerators in cloud middleware
Improving support for accessing accelerators at hypervisor/container level
This task is focused on improving support for accelerators at the hypervisor/container level. This will include automatic detection of accelerators and their software drivers, with the configuration of hypervisors/containers performed accordingly, simplifying access to accelerators and enabling generalization at higher software layers (cloud) without dependency on hardware details. The work will be carried out in cooperation with the activities in Section 6.3.1 for supporting GPUs in udocker and LXD.
Improving support for accessing accelerators at higher layers of the cloud stack
This task is focused on improving support for accelerators at higher software layers like Mesos, Kubernetes and OpenStack. This will include extending the software drivers of cloud middleware to support accelerators (nova-lxd) and improving deployment strategies (nova-scheduler, Mesos Chronos). In addition, it will cooperate with WP5 on the configuration and deployment of a GPU-enabled PaaS layer.

6.3.3. Interaction with HPC resources

As indicated in Section 5, the HPC model is different from the Cloud model, and it is necessary to work out a solution that allows efficient use of HPC resources by high-level services. There are two possible approaches to making HPC resources available to PaaS services:
• accessing the HPC management system and submitting batch jobs,
• allowing HPC nodes to be managed by a Cloud management system.

These approaches are not exclusive, and neither can be singled out as the best solution; the choice strongly depends on local HPC conditions and policies. In the short run, it is not possible to force HPC centers to change their resource management systems and switch to a more cloud-based model; therefore, services should adapt to the specifics of HPC and implement the necessary intermediate layer for interaction with the HPC management system. However, for some HPC centers it may be a suitable solution in the longer term to switch to cloud-based resource provisioning and follow a cluster-on-demand approach. Therefore, the implementation plan for Task 4.3 covers both approaches.

Support WP5 in accessing HPC resources
During the first phase, the activities will focus on supporting WP5 in accessing HPC resources. An interface for accessing HPC resources will be prepared, together with its implementation for various HPC systems. This task will be done in strong cooperation with WP5, and the implementation will follow the requirements coming from WP5.
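To make the kind of intermediate layer discussed above more concrete, the following minimal sketch shows how batch jobs could be submitted to a Slurm-based system on behalf of a PaaS service. Slurm is used only as an example scheduler, and the function names, options and partition are illustrative rather than the actual WP4/WP5 interface.

    # Minimal sketch of an intermediate layer between a PaaS service and an HPC
    # batch system. Slurm is used only as an example scheduler; the function names,
    # options and the "gpu" partition are illustrative, not the actual WP4/WP5 API.
    import subprocess
    import tempfile

    def submit_job(command, nodes=1, walltime="01:00:00", partition="gpu"):
        """Write a batch script and submit it with sbatch; return the Slurm job id."""
        script = "\n".join([
            "#!/bin/bash",
            f"#SBATCH --nodes={nodes}",
            f"#SBATCH --time={walltime}",
            f"#SBATCH --partition={partition}",
            command,
            "",
        ])
        with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
            f.write(script)
            path = f.name
        out = subprocess.run(["sbatch", "--parsable", path],
                             check=True, capture_output=True, text=True)
        return out.stdout.strip().split(";")[0]   # "<jobid>" or "<jobid>;<cluster>"

    def job_state(job_id):
        """Query the job state with squeue; an empty result means the job has left the queue."""
        out = subprocess.run(["squeue", "-h", "-j", str(job_id), "-o", "%T"],
                             capture_output=True, text=True)
        return out.stdout.strip() or "COMPLETED"

    # Example: run a containerized application (e.g., via udocker) inside the batch job.
    job_id = submit_job("udocker run gpu-test nvidia-smi")
    print("submitted job", job_id, "state:", job_state(job_id))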

Switching to cloud management system
As part of the project integration testbed in WP3, PSNC will dedicate some resources to prepare a small cluster managed by cloud middleware in order to test the cluster-on-demand approach. The tests can consider both open source (e.g., Kubernetes) and commercial software (e.g., Huawei Fusion), depending on the local conditions in HPC centers. Specific points of interest to be dealt with during the implementation of the project are:
• separation of work environments in the HPC cluster (each job/workload should have a strictly separated environment so as not to interfere with other work);
• sharing of resources between HPC and cloud environments (especially file and object stores, and integration of sync&share systems with HPC);
• management of templates of working environments (e.g., container-based templates for various classes of execution environment, with support for creation, update and optimization);
• management of authentication and authorization information.

The specific implementation depends on the requirements from WP5 and on local HPC conditions. The focus will be on using IAM as an intermediate module between HPC and PaaS services.
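As a purely illustrative sketch of this IAM-based mediation, the snippet below validates an IAM-issued OAuth2 access token using standard token introspection (RFC 7662) before mapping the request to a local HPC account. The IAM endpoint URL, client credentials and mapping policy are assumptions made for the example, not project decisions.

    # Purely illustrative sketch: validating an IAM-issued OAuth2 access token with
    # standard token introspection (RFC 7662) before mapping the request to a local
    # HPC account. The IAM endpoint URL, client credentials and mapping policy are
    # assumptions made for the example, not project decisions.
    import requests

    IAM_INTROSPECTION_URL = "https://iam.example.org/introspect"   # assumed deployment URL
    CLIENT_ID, CLIENT_SECRET = "hpc-gateway", "change-me"          # assumed gateway credentials

    def validate_token(access_token):
        """Return the introspection result if the token is active, otherwise raise."""
        resp = requests.post(IAM_INTROSPECTION_URL,
                             data={"token": access_token},
                             auth=(CLIENT_ID, CLIENT_SECRET))
        resp.raise_for_status()
        info = resp.json()
        if not info.get("active"):
            raise PermissionError("token is not active")
        return info

    def map_to_local_account(info):
        """Map the IAM identity to a local HPC account (site-specific policy)."""
        return "deep_" + info.get("sub", "unknown")[:8]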

6.4. Risk assessments

The potential risks, their impacts and mitigation strategies are provided in Table 8.

Risk: Dealing with requirements from specific use cases may require certain specific skills and/or domain knowledge.
Impact: Medium
Mitigation: Keep continuous interactions with other responsible WP representatives. Testing/validation phases could start from their requirements. If there is not sufficient technical detail, additional communications will be established immediately.

Risk: Partners do not deliver their contributions on time.
Impact: Low
Mitigation: Regular meetings will be carried out among the partners, coordinated by the WP leader, with sufficient technical content to track the development progress and introduce corrective countermeasures when needed.

Risk: Limited resources of specialized hardware during development.
Impact: Medium
Mitigation: The consortium has adequate technical capacity for increasing resources allocated to the project in case of need. The usage of specialized resources will be monitored globally within the project to identify the need.

Table 8. Risk assessments

7. Conclusion

In this document the state of the art of technologies for supporting bare-metal, accelerators and HPC in cloud has been analyzed. Based on this analysis, an initial implementation plan for improving the support of accelerators and HPC in DEEP-HybridDataCloud has been proposed.

Container technologies have improved rapidly in recent years and have proved themselves a viable approach to application delivery and even an alternative to full virtualization for cloud computing. These technologies have gained wide support, from traditional Cloud management frameworks to container-centric management environments, and also have a strong position on HPC systems. Regarding support for accelerators, recent developments in GPU runtime support such as nvidia-docker have greatly improved the usability, portability and stability of container technologies. Therefore, the use of container technologies is the approach chosen in the DEEP-HybridDataCloud project. The project has selected two implementations to be further improved: nova-lxd, as a replacement of full Cloud hypervisors, and udocker, as the container technology to be used on HPC platforms. The first release of the implementations is planned for Month 12 of the DEEP-HybridDataCloud project (Milestone MS4.3) and will be reported in Deliverable D4.2 "First implementation of software platform for accessing accelerators and HPC".

The HPC computing model differs from the Cloud computing model in many aspects: architecture, software installation, queuing system, user access and so on. Therefore, it is necessary to work out a solution that allows HPC resources to be used efficiently by high-level services. The work on integration with an HPC platform will be tightly coordinated with WP5, which addresses the PaaS layer, and the first implementation will be released at Month 12 and reported in D4.2 together with the other tasks.

8. List of Figures

Figure 1. The illustration of SR-IOV relationships between a physical PCI device, its PF and VFs and the host and guests in a virtualization environment
Figure 2. Overview of OpenStack resources
Figure 3. OpenStack Heat Architecture
Figure 4. Magnum API resources
Figure 5. Magnum architecture
Figure 6. OpenStack nova-lxd architecture diagram
Figure 7. Kubernetes architecture
Figure 8. Mesos architecture diagram, showing two running frameworks (Hadoop and MPI)
Figure 9. Resource offer example
Figure 10. Marathon architecture
Figure 11. Chronos architecture
Figure 12. Deep Learning frameworks and libraries with accelerated support
Figure 13. nvidia-docker wrapper
Figure 14. DEEP-HybridDataCloud Work Packages and their relations

9. List of tables

Table 1. Virtualization technology comparison
Table 2. Growing list of accelerator vendors
Table 3. Deep Learning frameworks and libraries
Table 4. Current support for GPU virtualization technologies in the main hypervisor platforms
Table 5. Comparison of approaches for supporting accelerators
Table 6. List of the most common job schedulers adopted in scalable computing infrastructures and their related features
Table 7. Implementation time plan
Table 8. Risk assessments

10. Acronyms

API Application Programming Interface
ASIC Application-Specific Integrated Circuit
AWS Amazon Web Services
CNCF Cloud Native Computing Foundation
CNTK (Microsoft) Cognitive Toolkit
CRI Container Runtime Interface
cuDNN CUDA Deep Neural Network
DAS Direct Attached Storage
DL Deep Learning
DNN Deep Neural Networks
FPGA Field-Programmable Gate Array
GPU Graphics Processing Unit
GPGPU General Purpose Graphics Processing Unit
GVT (Intel) Graphics Virtualization Technology
HPV High-Performance Virtualization
IC Integrated Circuit
IOMMU Input–Output Memory Management Unit
IP Semiconductor Intellectual Property Core (e.g. IP vendor)
LFC Logical File Catalog
LSF Load Sharing Facility
LXC Linux Containers
LXD Open source container management extension for LXC
MKL (Intel) Math Kernel Library
ML Machine Learning
MPI Message Passing Interface
NIC Network Interface Card
NN Neural Network
PaaS Platform as a Service
PBS Portable Batch System
PCI Peripheral Component Interconnect
PF Physical Function
REST REpresentational State Transfer
SAN Storage Area Network
SDK Software Development Kit
SIMD Single Instruction Multiple Data
SLURM Simple Linux Utility for Resource Management
SQA Software Quality Assurance
SR-IOV Single Root I/O Virtualization
TPU (Google) Tensor Processing Unit
VF Virtual Function
VNC Virtual Network Computing
VT-d Intel Virtualization Technology for Directed I/O
WLCG Worldwide LHC Computing Grid

11. References and links

11.1. References

[Boxall 2017] Boxall, A.: Huawei Kirin 970: Everything you need to know, https://www.digitaltrends.com/mobile/huawei-kirin-970-ai-news/, accessed Sep 2017
[Cacciatore 2015] Cacciatore, K., Czarkowski, P., Dake, S., Garbutt, J., Hemphill, B., Jainschigg, J., Moruga, A., Otto, A., Peters, C. and Whitaker, B.E., 2015. Exploring Opportunities: Containers and OpenStack. OpenStack White Paper, 19
[Cano 2017] Cano, A.: 2017, A survey on graphic processing unit computing for large-scale data mining. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
[Dilger 2017] Dilger, D.E.: Inside iPhone 8: Apple's A11 Bionic introduces 5 new custom silicon engines, http://appleinsider.com/articles/17/09/23/inside-iphone-8-apples-a11-bionic-introduces-5-new-custom-silicon-engines, accessed Sep 2017
[Feldman 2016] Feldman, M.: IBM Finds Killer App for TrueNorth Neuromorphic Chip, https://www.top500.org/news/ibm-finds-killer-app-for-truenorth-neuromorphic-chip/, accessed Sep 2016
[Gentzsch 1994] Gentzsch, W.: 1994. Codine, Computing in Distributed Networked Environments, User's Guide and Reference Manual. Genias Software GmbH, Erzgebirgstr. 2B, D-93073 Neutraubling, Germany
[Gomes 2017] Gomes, J., Campos, I.: Researchers Advance User-Level Container Solution for HPC, https://www.hpcwire.com/2017/12/18/researchers-advance-user-level-container-solution-hpc/, accessed Mar 2018
[Graber 2016] Graber, S.: On the way to safe containers, Linux Security Summit 2016, Toronto, Canada, https://events.static.linuxfound.org/sites/events/files/slides/Linux%20Security%20Summit%202016-%20On%20the%20way%20to%20safe%20containers_0.pdf
[Hindman 2011] Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A., Katz, R., Shenker, S., Stoica, I.: 2011, NSDI 2011, Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
[Kloss 2017] Kloss, C.: Intel® Nervana™ Neural Network Processor: Architecture Update, https://ai.intel.com/intel-nervana-neural-network-processor-architecture-update/, accessed Dec 2017
[Otto 2015] Otto, A.: Magnum making containers a first class resource in OpenStack, https://www.OpenStack.org/assets/vancouver-summit/slidedecks/Adrian-Otto-Magnum-Making-Containers-a-First-Class-Resource-in-OpenStack.pdf#page=1&zoom=auto,-54,540, accessed Feb 2018

[Parthasarathy 2017] Parthasarathy, A., June 2017, Kubernetes vs Docker Swarm, https://platform9.com/blog/kubernetes-docker-swarm-compared/
[Perez 2015] Perez, C.E.: Why are GPUs well-suited to deep learning?, https://www.quora.com/Why-are-GPUs-well-suited-to-deep-learning, accessed Apr 2018
[Sato 2017] Sato, K.: An in-depth look at Google's first Tensor Processing Unit (TPU), May 2017, https://cloud.google.com/blog/big-data/2017/05/an-in-depth-look-at-googles-first-tensor-processing-unit-tpu
[Shan 2018] Shan, T.: A list of ICs and IPs for AI, Machine Learning and Deep Learning, Jan 2018, https://github.com/basicmi/Deep-Learning-Processor-List
[Sze 2017] Sze, V., Chen, Y.H., Yang, T.J. and Emer, J.: 2017. Efficient processing of deep neural networks: A tutorial and survey. arXiv preprint arXiv:1703.09039
[Schroder 2017] Schroder, C.: What Makes Up a Kubernetes Cluster? Apr. 2017, https://www.linux.com/news/learn/chapter/intro-to-kubernetes/2017/4/what-makes-kubernetes-cluster
[Thain 2005] Thain, D., Tannenbaum, T., Livny, M.: "Distributed Computing in Practice: The Condor Experience", Concurrency and Computation: Practice and Experience, Vol. 17, No. 2-4, pages 323-356, February-April, 2005
[Wong 2016] Wong, T.: White Paper - AMD MULTIUSER GPU: HARDWARE-ENABLED GPU VIRTUALIZATION FOR A TRUE WORKSTATION EXPERIENCE, https://www.amd.com/Documents/Multiuser-GPU-White-Paper.pdf, accessed Apr 2018
[Yoo 2003] Yoo, A. B., Jette, M. A., Grondona, M., 2003. Slurm: Simple Linux utility for resource management. In: Workshop on Job Scheduling Strategies for Parallel Processing. Springer, pp. 44-60
[Zhou 1993] Zhou, S., Zheng, X., Wang, J., Delisle, P., 1993. Utopia: a load sharing facility for large, heterogeneous distributed computer systems. Software: Practice and Experience 23 (12), 1305–1336

11.2. Links

[Accelerator] What is an accelerator: https://www.techopedia.com/definition/31677/accelerator, accessed Feb 2018
[AI Accelerator] AI accelerator: https://en.wikipedia.org/wiki/AI_accelerator, accessed Feb 2018
[AMDDRIV] https://pro.radeon.com/en/solutions/vdi/, accessed Apr 2018
[Brainware] Microsoft unveils Project Brainwave for real-time AI, Aug 2017: https://www.microsoft.com/en-us/research/blog/microsoft-unveils-project-brainwave/, accessed Apr 2018

[Caffe] Deep learning framework by Berkeley Artificial Intelligence Research (BAIR): http://caffe.berkeleyvision.org/, accessed Feb 2018
[Caffe2] A New Lightweight, Modular, and Scalable Deep Learning Framework: https://caffe2.ai/, accessed Feb 2018
[Chainer] A Powerful, Flexible, and Intuitive Framework for Neural Networks: https://chainer.org/index.html, accessed Feb 2018
[CharlieCloud] Unprivileged Containers for User-Defined Software Stacks in HPC: https://hpc.github.io/charliecloud, accessed Feb 2018
[CNCF] Cloud Native Computing Foundation: https://www.cncf.io/, accessed Feb 2018
[CNTK] Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit: https://docs.microsoft.com/en-us/cognitive-toolkit/, accessed Feb 2018
[Cuda] CUDA Zone - NVIDIA Development: https://developer.NVIDIA.com/cuda-zone, accessed Feb 2018
[cuDNN] NVIDIA cuDNN - GPU Accelerated Deep Learning: https://developer.NVIDIA.com/cudnn, accessed Feb 2018
[CudaToolkit] NVIDIA CUDA Toolkit: https://developer.NVIDIA.com/cuda-toolkit, accessed Feb 2018
[DIGITS] The NVIDIA Deep Learning GPU Training System: https://developer.NVIDIA.com/digits, accessed Feb 2018
[DLAccList] Deep Learning Processor List: https://github.com/basicmi/Deep-Learning-Processor-List, accessed Feb 2018
[Docker] Virtualization technology: https://www.docker.com/, accessed Feb 2018
[DockerFAQ] Docker FAQ - What platforms does Docker run on?: https://docs.docker.com/engine/faq/#what-platforms-does-docker-run-on, accessed Apr 2018
[DockerHub] Docker Hub - Dev-test pipeline automation: https://hub.docker.com/, accessed Apr 2018
[DockerSwarm] Swarm, a Docker-native clustering system: https://github.com/docker/swarm, accessed Apr 2018
[DL4J] Deeplearning4j - The first commercial-grade, open-source, distributed deep-learning library written for Java and Scala, integrated with Hadoop and Spark: https://deeplearning4j.org/, accessed Feb 2018
[GIM] https://github.com/GPUOpen-LibrariesAndSDKs/MxGPU-Virtualization, accessed Apr 2018
[Godlove] Containers for HPC, analytics, machine learning, reproducible and trusted computing: http://users.ugent.be/~kehoste/eum18/Singularity_Dave_Godlove.pdf, accessed Apr 2018

[GVT] Intel® Graphics Virtualization Technology (Intel® GVT): https://01.org/igvt-g, accessed Apr 2018
[GVTLINUX] https://github.com/intel/gvt-linux/wiki/GVTg_Setup_Guide, accessed Apr 2018
[H2O] H2O.ai Fast Scalable Machine Learning: http://h2o.ai/, accessed Feb 2018
[HPV SRV-IO] High-Performance Virtualization - SR-IOV and Amazon's C3 Instances: https://glennklockwood.blogspot.pt/2013/12/high-performance-virtualization-sr-iov.html, accessed Mar 2018
[Keras] High-level neural networks API: https://keras.io/, accessed Feb 2018
[Kubernetes] Production-Grade Container Orchestration: https://kubernetes.io/, accessed Feb 2018
[KubernetesGPU] Schedule GPUs: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/, accessed Feb 2018
[KVMSRIOV] How to assign devices with VT-d in KVM: https://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM, accessed Apr 2018
[LCGDM] LcgDM - Data Management Servers: http://lcgdm.web.cern.ch/, accessed Apr 2018
[LGUEST] Lguest - The Simple x86 Hypervisor: http://lguest.ozlabs.org/, accessed Apr 2018
[LGUEST64] https://github.com/IxLabs/lguest64, accessed Apr 2018
[LinuxFoundation] LinuxFoundation.org | Enable Open Source Ecosystems: http://www.linuxfoundation.org/, accessed Feb 2018
[LXD 2.21 Release] LXD 2.21 Release: https://discuss.linuxcontainers.org/t/lxd-2-21-has-been-released/953, accessed Mar 2018
[LXD 2.5 Release] LXD 2.5 release announcement: https://linuxcontainers.org/lxd/news/#lxd-25-release-announcement, accessed Mar 2018
[LXDoNe] https://github.com/OpenNebula/addon-lxdone, accessed Apr 2018
[MediaTek] MediaTek Powers the Future of Mobile with New Helio P60 Chipset, Bringing Big Core Power & AI Experiences to Consumers: https://www.mediatek.com/news-events/press-releases/mediatek-powers-the-future-of-mobile-with-new-helio-p60-chipset-bringing-big-core-power-ai-experiences-to-consumers, accessed Mar 2018
[MELLANOX] docker-passthrough-plugin: https://github.com/Mellanox/docker-passthrough-plugin, accessed Mar 2018
[nova-lxd] Linux Containers - nova-lxd webpage: https://linuxcontainers.org/lxd/getting-started-OpenStack/, accessed Feb 2018
[nova-lxd_GitHub] OpenStack - nova-lxd on GitHub: https://github.com/OpenStack/nova-lxd, accessed Feb 2018

[NQS] The Network Queueing System: http://gnqs.sourceforge.net/docs/papers/mnqs_papers/original_cosmic_nqs_paper.htm, accessed Feb 2018
[NVGDOCS] http://docs.NVIDIA.com/grid/4.5/grid-vgpu-release-notes-red-hat-el-kvm/index.html, accessed Apr 2018
[NVGFORUM] https://gridforums.NVIDIA.com/default/topic/501/NVIDIA-virtual-gpu-technology/NVIDIA-grid-vgpu-on-kvm-hypervisors/, accessed Apr 2018
[NVIDIA-PATCHES] https://patchwork.kernel.org/patch/9230579/, accessed Apr 2018
[NVIDIAAC] NVIDIA Accelerated Computing: https://developer.NVIDIA.com/computeworks, accessed Feb 2018
[nvidia-docker] Docker Engine Utility for NVIDIA GPUs: https://github.com/NVIDIA/nvidia-docker, accessed Mar 2018
[NVIDIAPascal] NVIDIA PASCAL ARCHITECTURE - Infinite Compute for Infinite Opportunities: https://www.NVIDIA.com/en-us/data-center/pascal-gpu-architecture/, accessed Apr 2018
[NVIDIAVolta] NVIDIA VOLTA - The New GPU Architecture Designed to Bring AI to Every Industry: https://www.NVIDIA.com/en-us/data-center/volta-gpu-architecture/, accessed Apr 2018
[NVPASS] Dedicated GPU Technology for Virtual Desktops: http://www.NVIDIA.com/object/dedicated-gpus.html, accessed Apr 2018
[Magnum] https://docs.OpenStack.org/magnum/, accessed Feb 2018
[Mesos] Apache Mesos - open-source project to manage computer clusters: http://mesos.apache.org/, accessed Feb 2018
[MKL] Intel Math Kernel Library: https://software.intel.com/en-us/intel-mkl/, accessed Feb 2018
[MXNet] Apache MXNet - A flexible and efficient library for deep learning: https://mxnet.apache.org/, accessed Feb 2018
[ONEDock] Docker support for OpenNebula: https://github.com/indigo-dc/onedock, accessed Apr 2018
[OpenCL] Open Computing Language - The Khronos Group Inc.: https://www.khronos.org/opencl/; https://developer.NVIDIA.com/opencl, accessed Feb 2018
[OpenLAVA] OpenLava: http://www.openlava.org/, accessed Feb 2018
[OpenMP] OpenMP - API specification for parallel programming: http://www.openmp.org, accessed Feb 2018
[Open MPI] Open MPI - Open Source High Performance Computing: https://www.open-mpi.org/, accessed Feb 2018

[OpenNebula] OpenNebula – Flexible Enterprise Cloud Made Simple: https://opennebula.org/, accessed Feb 2018
[PBS] Portable Batch System Scheduler: http://www.pbspro.org/, accessed Feb 2018
[PCIPASS] Linux virtualization and PCI passthrough: https://www.ibm.com/developerworks/library/l-pci-passthrough/l-pci-passthrough-pdf.pdf, accessed Apr 2018
[PCISIGa] PCI-SIG SR-IOV Primer: https://www.intel.sg/content/dam/doc/application-note/pci-sig-sr-iov-primer-sr-iov-technology-paper.pdf, accessed Apr 2018
[SR-IOV LXC] Single-Root Input/Output Virtualization (SR-IOV) with Linux* Containers: https://software.intel.com/en-us/articles/single-root-inputoutput-virtualization-sr-iov-with-linux-containers, accessed Mar 2018
[PyTorch] Deep learning framework that puts Python first: http://pytorch.org/, accessed Feb 2018
[SamsungExynos] Samsung Exynos - Exynos 9 Series 9810 Processor: http://www.samsung.com/semiconductor/minisite/exynos/products/mobileprocessor/exynos-9-series-9810/, accessed Mar 2018
[Singularity] Singularity Official Webpage: http://singularity.lbl.gov/, accessed Mar 2018
[SingularityInEPEL] Singularity RPM in Fedora Packages: https://apps.fedoraproject.org/packages/singularity/, accessed Apr 2018
[Shifter] Shifter - Containers for HPC: https://github.com/NERSC/shifter, accessed Feb 2018
[TensorFlow] TensorFlow - An open-source software library for Machine Intelligence: https://www.tensorflow.org/, accessed Feb 2018
[TensorflowLite] TensorFlow Lite: https://www.tensorflow.org/mobile/, accessed Feb 2018
[Theano] Theano: http://deeplearning.net/software/theano/, accessed Apr 2018
[Torch] Torch - scientific computing framework for LuaJIT: http://torch.ch/, accessed Apr 2018
[uDocker] udocker - a basic user tool to execute simple docker containers in user space: https://github.com/indigo-dc/udocker, accessed Apr 2018
[VGPU] NVIDIA GRID Virtual GPU User Guide: http://images.NVIDIA.com/content/grid/pdf/GRID-vGPU-User-Guide.pdf, accessed Apr 2018
[VIRTIO] https://wiki.osdev.org/Virtio, accessed Apr 2018
[VTD] Understanding VT-d: Intel Virtualization Technology for Directed I/O: https://software.intel.com/en-us/blogs/2009/06/25/understanding-vt-d-intel-virtualization-technology-for-directed-io, accessed Apr 2018
[Younge] A Tale of Two Systems: Using Containers to Deploy HPC Applications on Supercomputers and Clouds: http://ieeexplore.ieee.org/document/8241093/, accessed Mar 2018

[XENAPP] http://www.NVIDIA.com/object/xenapp.html, accessed Apr 2018
[XENSRIOV] Xen PCI Passthrough: https://wiki.xen.org/wiki/Xen_PCI_Passthrough, accessed Apr 2018
