Using IPUs from Docker Release 1.4.0

Graphcore Ltd
Dec 08, 2020

CONTENTS

1 Introduction
2 Initial setup
3 Using gc-docker
  3.1 Loading docker images
  3.2 Verifying IPU access from inside container
  3.3 Mounting directories from the host
  3.4 Setting environment variables
4 Running a TensorFlow application on an IPU
5 Extending the images
6 Further reading
7 Trademarks & copyright

CHAPTER ONE
INTRODUCTION

This guide explains how you can run applications in Docker on a Linux machine with one or more physical IPU devices.

Prerequisites:

• A machine with IPU devices
• Ubuntu 18.04 / CentOS 7.6

CHAPTER TWO
INITIAL SETUP

First check whether your machine has the IPU device driver installed. You can check this with the following command:

    $ modinfo ipu_driver

If the driver is installed, you should see something similar to:

    $ modinfo ipu_driver
    filename:       /lib/modules/4.15.0-55-generic/updates/dkms/ipu_driver.ko
    version:        1.0.39
    description:    IPU PCI Driver
    author:         Graphcore Limited
    license:        GPL
    srcversion:     49FFB7D8556EB58899AE41A
    alias:          pci:v00001D95d00000003sv*sd*bc*sc*i*
    alias:          pci:v00001D95d00000002sv*sd*bc*sc*i*
    alias:          pci:v00001D95d00000001sv*sd*bc*sc*i*
    depends:
    retpoline:      Y
    name:           ipu_driver
    vermagic:       4.15.0-55-generic SMP mod_unload
    parm:           memmap_start:array of ulong
    parm:           memmap_size:array of ulong

If so, proceed to the next section. If instead the command returns an error along the lines of:

    $ modinfo ipu_driver
    modinfo: ERROR: Module ipu_driver not found.

you will need to install the driver. See the Getting Started Guide for your IPU system for more information.

CHAPTER THREE
USING GC-DOCKER

The Graphcore Poplar SDK includes some command line tools for managing the IPU system.
The gc-docker command is a small wrapper around docker run that adds the flags needed to use a set of IPU devices inside a running container. If gc-docker is not on your path, you will need to go to the Poplar installation directory and enable the command line tools:

    $ cd [poplar-installation-path]
    $ source enable.sh

This must be done in each new shell. Alternatively, you can run the following command to source it automatically in all new Bash login shells:

    $ echo 'source [full-path-to-extracted-poplar]/enable.sh' >> ~/.bash_profile

3.1 Loading docker images

First, download the Poplar image bundle from the Graphcore Downloads portal. Then load the bundle into your local Docker daemon:

    $ docker load --input=poplar-docker-images-1.3.0.tar.gz

Check that the images have loaded and that tags have been applied. For example (output trimmed):

    $ docker images
    REPOSITORY             TAG               IMAGE ID
    graphcore/tools        1.3.0             c2f5ebc91d4b
    graphcore/tensorflow   1                 6175c27cb631
    graphcore/tensorflow   1-amd             6175c27cb631
    graphcore/tensorflow   2                 ae99a3fd3181
    graphcore/tensorflow   2-amd             ae99a3fd3181
    graphcore/tensorflow   1-intel           db5fef31303d
    graphcore/tensorflow   2-intel           cb3b3a41321e
    graphcore/pytorch      1.3.0             d84478558ab0
    graphcore/poplar       1.3.0             c744278a89b2
    ubuntu                 bionic-20200903   c14bccfdea1c

• graphcore/tools: contains tools to interact with IPU devices.
• graphcore/poplar: contains Poplar and PopART.
• graphcore/tensorflow: is based on graphcore/poplar, with TensorFlow installed on top. These images are tagged with 1 and 2 to choose between TensorFlow 1 and TensorFlow 2. AMD-optimised builds are the default (the plain 1 and 2 tags) and can also be selected explicitly with the 1-amd and 2-amd tags. Builds that use Intel-specific instructions are available with the 1-intel and 2-intel tags.
• graphcore/pytorch: is based on graphcore/poplar, with PyTorch and PopTorch installed.

Note: This tarball method of container image delivery will be replaced with a Docker registry in future, which will enable docker pull to be used instead.
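
The image check in this section can also be scripted. The following is a minimal sketch, not part of the Graphcore tooling: it compares the output of docker images --format "{{.Repository}}:{{.Tag}}" against a list of expected images. The function names and the EXPECTED list are illustrative assumptions; adjust the list to match the bundle you actually loaded.

```python
import subprocess

# Illustrative assumption: the tags expected after loading the 1.3.0 bundle.
EXPECTED = [
    "graphcore/tools:1.3.0",
    "graphcore/poplar:1.3.0",
    "graphcore/tensorflow:1",
    "graphcore/tensorflow:2",
    "graphcore/pytorch:1.3.0",
]


def parse_image_list(text):
    """Return the set of 'repository:tag' strings, one per non-empty line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def missing_images(text, expected=EXPECTED):
    """Return the expected images that do not appear in the listing."""
    present = parse_image_list(text)
    return [name for name in expected if name not in present]


def list_local_images():
    """Ask the local Docker daemon for its images (requires Docker)."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a machine with Docker installed, missing_images(list_local_images()) returns an empty list when every expected image is present.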

3.2 Verifying IPU access from inside container

First check that you have access to the IPU devices on the host. To do this, run gc-info -l and check that the output contains a list of devices.

Next, do the same inside a container:

    $ gc-docker -- --rm -ti graphcore/tools gc-info -l

The output should be the same.

Check that you can run a TensorFlow container with gc-docker, and make sure the IPUs are visible to TensorFlow:

    $ gc-docker -- --rm -ti graphcore/tensorflow:2 python3
    Python 3.6.9 (default, Nov 7 2019, 10:44:02)
    [GCC 8.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import tensorflow
    >>> tensorflow.config.list_physical_devices("IPU")
    [PhysicalDevice(name='/physical_device:IPU:0', device_type='IPU')]
    >>>

The syntax for running an image with gc-docker is similar to that of docker run, which is:

    $ docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

The main difference is that docker run is replaced with gc-docker --. So, in the TensorFlow example above, we used the graphcore/tensorflow:2 image and ran python3 as the command. No arguments were passed to python3.

The -- part of the command tells gc-docker that the rest of the arguments should be passed directly to docker run. gc-docker also has a few options of its own, which are given before the --. For example, you can pass a subset of IPU devices using --device-id n:

    $ gc-docker --device-id 4 -- --rm -ti graphcore/tools gc-info -l

Note that device IDs inside the container always start from zero, so a selected subset of devices is renumbered from 0. For example, if you use devices 4 to 7, they will have IDs 0 to 3 in the container.

The --echo option is also useful. It makes gc-docker print the Docker command it would have run.
For example:

    $ gc-docker --echo --device-id 4 -- --rm -ti graphcore/tools gc-info -l
    docker run --device=/dev/ipu4:/dev/ipu4 --device=/dev/ipu4_ex:/dev/ipu4_ex -ti graphcore/tools gc-info -l

For more information, use the --help option or refer to the IPU Command Line Tools document.

3.3 Mounting directories from the host

You can mount volumes to share data between the host machine and the Docker container environment. This is useful when you need to read input data or write out results.

Volumes are mounted using the -v option. The basic syntax is -v <path_on_host>:<path_in_container>. For example, to mount /home/me/cat_pics from your host machine as /cats in the container, you could run the following command:

    $ gc-docker -- -ti -v /home/me/cat_pics:/cats graphcore/tensorflow ls -a /cats
    .  ..  mog.jpg

3.4 Setting environment variables

If you need environment variables set inside the Docker environment, add -e VAR_NAME="var value" to your Docker options. For example:

    $ gc-docker -- -ti -e POPLAR_LOG_LEVEL=TRACE graphcore/tensorflow:2 python3

CHAPTER FOUR
RUNNING A TENSORFLOW APPLICATION ON AN IPU

To demonstrate the workflow for running a TensorFlow application on IPUs in a Docker development environment, we will use one of the TensorFlow applications from the Graphcore public examples repository.

First, get the code:

    $ git clone https://github.com/graphcore/examples.git
    $ cd examples

A common pattern when working with a Docker-based development environment is to mount the current directory into the container (as described in Mounting directories from the host), then set the working directory inside the container with -w <dir name>. For example, -v "$(pwd):/app" -w /app.
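
The mount (-v), environment (-e) and working directory (-w) options described so far can be combined in one command line. The following is a minimal sketch, under stated assumptions, of assembling such a command from a script. The function gc_docker_command and its parameters are hypothetical and not part of the Poplar SDK; it uses only the gc-docker and docker run options shown in this guide, and it builds the argument list without running anything.

```python
import shlex


def gc_docker_command(image, command, mounts=None, env=None,
                      workdir=None, device_id=None,
                      docker_opts=("--rm", "-ti")):
    """Build an argv list for gc-docker (hypothetical helper)."""
    argv = ["gc-docker"]
    if device_id is not None:
        # gc-docker's own options go before the "--" separator.
        argv += ["--device-id", str(device_id)]
    # Everything after "--" is passed through to docker run.
    argv.append("--")
    argv += list(docker_opts)
    for host_path, container_path in (mounts or {}).items():
        argv += ["-v", f"{host_path}:{container_path}"]
    for name, value in (env or {}).items():
        argv += ["-e", f"{name}={value}"]
    if workdir is not None:
        argv += ["-w", workdir]
    argv.append(image)
    argv += list(command)
    return argv


argv = gc_docker_command(
    "graphcore/tensorflow:2",
    ["python3"],
    mounts={"/home/me/cat_pics": "/cats"},
    env={"POPLAR_LOG_LEVEL": "TRACE"},
    workdir="/cats",
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(argv))
```

Returning a list rather than a string means the result can be passed directly to subprocess.run without a shell, so paths containing spaces need no extra quoting.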

Applying the mount-and-working-directory pattern, you can run the LSTM example with the following command:

    $ gc-docker -- -ti -v "$(pwd):/app" -w /app graphcore/tensorflow:1 python3 code_examples/tensorflow/kernel_benchmarks/lstm.py

Many machine learning applications make heavy use of shared memory. To avoid running out of it, it is recommended to add the Docker option --ipc=host.

CHAPTER FIVE
EXTENDING THE IMAGES

These base images can be used to create new images for more specialised purposes, or to package an application for deployment to platforms such as Kubernetes or Kubeflow.

As an example, here is a simple Dockerfile that creates a Jupyter notebook environment with TensorFlow and access to IPUs:

    FROM graphcore/tensorflow:2
    RUN pip3 install notebook
    CMD ["jupyter", "notebook", "--allow-root", "--ip=0.0.0.0", "--port=8080"]

You can build and run this with the following commands:

    $ docker build -t notebook .
    $ gc-docker -- -p 8080:8080 notebook

CHAPTER SIX
FURTHER READING

You can find documentation for the Graphcore software products on the Developer page of the Graphcore website.

CHAPTER SEVEN
TRADEMARKS & COPYRIGHT

Graphcore® and Poplar® are registered trademarks of Graphcore Ltd.

AI-Float™, Colossus™, Exchange Memory™, In-Processor-Memory™, IPU-Core™, IPU-Exchange™, IPU-Fabric™, IPU-Link™, IPU-M2000™, IPU-Machine™, IPU-POD™, IPU-Tile™, PopART™, PopLibs™, PopVision™, PopTorch™, Streaming Memory™ and Virtual-IPU™ are trademarks of Graphcore Ltd.

All other trademarks are the property of their respective owners.

Copyright © 2016-2020 Graphcore Ltd. All rights reserved.
