Turbocharging the AI Pipeline with Python and Anaconda

Stan Seibert Director of Community Innovation @ Anaconda

What is AI? Augmented AI = Artificial Intelligence?

• Create insight by combining:
  • Data
  • Software
  • Domain Expertise

© 2018 Anaconda - Confidential & Proprietary

Automated Operational AI Is Hard

“A mature system might end up being (at most) 5% machine learning code and (at least) 95% glue code.”
– Google AI Researchers

Source: “Machine Learning: the High-Interest Credit Card of Technical Debt”, Google Inc, 2015

Critical Components

• Open Source
  • Build on cutting-edge innovation in the community
• Reproducible Environments
  • Manage the data science software environment
• Team Collaboration
  • Share and collaborate with fellow data scientists and other stakeholders

Deep Learning: A "Killer App" for GPUs

Examples: Image Captioning


https://research.googleblog.com/2016/09/show-and-tell-image-captioning-open.html

Examples: Time Series Analysis

• Want to model and predict behavior in foreign exchange markets
• GPU-trained RNN using previous 20 values to predict next value

http://on-demand.gputechconf.com/gtc/2017/presentation/s7625-daniel-egloff-going-deeper-in-finance.pdf

Examples: Style Transfer

[Images: style transfer in the styles of Picasso, van Gogh, and Monet]

http://genekogan.com/works/style-transfer/

The Deep Learning Big Bang

• (Much) bigger training sets
• Faster & specialized hardware
• Open source tools
• Algorithm research

Python: The Language of Deep Learning

TensorFlow Keras PyTorch

How do GPUs help?

[Diagram comparing GPU and CPU architectures]

The GPU: A Parallel Execution Engine

[Diagram: execution unit replicated × 80]

Critical Components

• Open Source
  • Build on cutting-edge innovation in the community
• Reproducible Environments
  • Manage the data science software environment
• Team Collaboration
  • Share and collaborate with fellow data scientists and other stakeholders

Expanding Beyond Deep Learning

What Else Do Data Scientists Want?

• Data Preparation / ETL
• Feature Engineering
• Simulation
• Scalable Analytics
• Fast Visualization

• ...ability to integrate tools together in a complete pipeline

Numba: A JIT Compiler for Python

• An open-source, function-at-a-time compiler library for Python
• Compiler toolbox for different targets and execution models:
  • single-threaded CPU, multi-threaded CPU, GPU
  • regular functions, “universal functions” (array functions), etc.
• Speedup: 2x (compared to basic code) to 200x (compared to pure Python)
• Combine ease of writing Python with speeds approaching FORTRAN
• BSD licensed (including GPU compiler)
• Goal is to empower scientists who make tools for themselves and other scientists

CUDA Python

Annotations from the code example on this slide:

• Decorator denotes a CUDA kernel function (launched from CPU, runs on GPU)
• Work with NumPy array elements and attributes on the GPU
• Special CUDA function for atomic addition
• Launch CUDA kernel with 32 blocks of 32 threads each

Numba in a Machine Learning World

[Diagram: Simulation and Machine Learning linking causes and effects]

Numba in a Machine Learning World

[Diagram: Simulation and Machine Learning linking “cars on a road” and “pixels in a camera”]

Numba in a Machine Learning World

[Diagram: Numba (simulation) and TensorFlow (machine learning) linking “cars on a road” and “pixels in a camera”]

GPU Array Interop

Modifying these projects * to share GPU tensors through a common Python interface

* PyTorch support in community PR

Breaking down GPU data silos

[Diagram: ETL/Data Prep, Machine Learning, Database, and Data Visualization each holding a separate copy of the data on the GPU]

CPU transfer

[Diagram: moving data between ETL/Data Prep, Machine Learning, Database, and Data Visualization requires a CPU transfer at each step]

Better: Keep the Data on the GPU

[Diagram: ETL/Data Prep, Machine Learning, Database, and Data Visualization sharing one copy of the data on the GPU]

Apache Arrow

• Rapidly becoming the standard for many kinds of structured data
• Can Arrow work on the GPU?
• Answer: Yes!

Dataframes on the GPU

• GPU Dataframes are Arrow-format data structures on the GPU
• Designed to be passed between different applications, languages and runtimes:
  • Example: MapD database -> Python notebook -> XGBoost
• RAPIDS dataframe support includes a 3-tier library structure (names to be changed soon):
  • libgdf: library of GPU dataframe functions
  • pygdf: Python wrapper around GDF + JIT functionality
  • dask_gdf: distributed GPU dataframes

Dask: Distributed Computing Made Easy

• Scalable execution of task graphs, from single computers to 1000+ node clusters
• Orchestrate CPU and GPU tasks on data structures distributed across many nodes
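The task-graph model can be sketched on a single machine (array sizes are arbitrary):

```python
# Dask splits the array into independent chunks; .sum() builds a task
# graph over them, and .compute() executes the graph in parallel. The
# same code runs unchanged against a distributed cluster scheduler.
import dask.array as da

x = da.ones((1000, 1000), chunks=(250, 250))  # 16 independent chunks
total = (x * 2).sum()        # lazy: builds the task graph
result = total.compute()     # executes it
```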

Reproducible Environments

Moving From Tools to Projects

• Every data science project is a software integration problem
  • Need to bring together many tools, plus their dependencies
• Important to be able to record the state of the environment for future reproducibility
  • Often latest versions of libraries
  • Other times want older versions of libraries
• Want to be able to easily compare different versions
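A recorded environment state typically lives in an environment.yml file; a hypothetical example (the package pins are illustrative, echoing the deep-learning setup later in the deck):

```yaml
# Hypothetical environment.yml, as produced by: conda env export
# Recreate elsewhere with: conda env create -f environment.yml
name: deeplearn
channels:
  - defaults
dependencies:
  - python=3.6
  - notebook
  - keras-gpu
  - cudatoolkit=9.0
```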

Conda Package Manager

• Language independent
• Platform independent
• No special privileges required
• No VMs or containers
• Enables:
  • Reproducibility
  • Collaboration
  • Scaling

[Diagram: conda managing three environments — A: Python v2.7, NumPy v1.10, Pandas v0.16; B: Python v3.6, Numba, NumPy v1.11, cudatoolkit 9.2; C: R, R Essentials]

The Deep Learning Technology Stack

NEURAL NETWORKS  KERAS
TENSOR MATH      CUPY, MXNET, TENSORFLOW  ...and many others
PRIMITIVES       MKL-DNN, CUBLAS, CUDNN, NCCL
OS/DRIVERS
HARDWARE         MULTI-CORE CPU, MANY-CORE CPU (XEON PHI), GPU

The Deep Learning Technology Stack Installable with Conda

NEURAL NETWORKS  CHAINER, KERAS
TENSOR MATH      CUPY, MXNET, TENSORFLOW
PRIMITIVES       MKL-DNN, CUBLAS, CUDNN, NCCL
OS/DRIVERS
HARDWARE         MULTI-CORE CPU, MANY-CORE CPU (XEON PHI), GPU

GPU-Accelerated Packages in Anaconda

Starting a Deep Learning Project

conda create -n deeplearn python=3.6 notebook keras-gpu
conda activate deeplearn
jupyter notebook

Force a particular CUDA toolkit:

conda create -n deeplearn python=3.6 notebook keras tensorflow-gpu \
    cudatoolkit=9.0

Jupyter Notebooks: DL Narratives

Packing Models with Conda

[Diagram: conda packages — “My Model” depends on TensorFlow and Keras; “My App” depends on My Model, Pandas, and Tornado]

Data Science For Teams

GPU Needs for Data Science Teams

• Centralized access
  • High-end GPUs are most easily managed in the data center
  • Data scientists connect remotely to train and run their GPU-accelerated models
• Resource management
  • Best practice is to allow only one application to use a GPU at a time
  • Need to reserve the GPU exclusively for a user while they are running GPU code
• Support for varying compute needs
  • Many projects don't need a GPU, some need one, a very small number need two or more
  • The most cost-efficient approach is a heterogeneous cluster with a mixture of GPU and non-GPU nodes

Anaconda Enterprise: An AI Enablement Platform For Teams At Scale

From One Data Scientist to Thousands
From One Machine to Thousands

Anaconda Enterprise 5

Anaconda Enterprise: Cloud Native AI

• Anaconda is the industry’s trusted distributor for all core AI technologies
• Simplifies and automates AI governance and reproducibility
• Cloud native approach: modern, dynamic, API-oriented, and container-based

Heterogeneous Clusters

[Architecture diagram — AE Master (Node 1): AE services (UI, Git, database, object storage), kube-apiserver, etcd, persistent disk, and deploy storage; AE Kubernetes Masters (Nodes 2 and 3); AE Worker (CPU): docker/kubelet hosting editor sessions (Session 1: Py, Session 3: Py) and deployments (Deployment 1: Py, Deployment 3: Py); AE Worker (GPU): docker/kubelet hosting Session 2 (TensorFlow), Session 4 (PyTorch), and Deployment 2 (TF)]

Projects that need GPU resources can request them

GPU device available to TensorFlow running in notebook

Varying Hardware Needs

• Inference in batches, or one at a time?
  • GPU only 30% faster with batch size = 1
  • GPU 10x faster when batch size >= 32
• Do you need GPU levels of throughput in production?

Conclusion

• GPU-accelerated AI with Anaconda brings together many parts:
  • Open source technology: deep learning, GPU dataframes, Numba for custom algorithms and simulation
  • Reproducible environments: conda package manager, Anaconda Distribution
  • Team collaboration: resource management and deployment with Anaconda Enterprise
• Learn more at: https://www.anaconda.com/

Questions?
