What If What I Need Is Not in PowerAI (Yet)? What You Need to Know to Build from Scratch?
IBM Systems
Jean-Armand Broyelle
June 2018

IBM Systems – Cognitive Era
© 2017 International Business Machines Corporation

Things to consider when you have to rebuild a framework

CUDA Downloads

CUDA 8 – under Legacy Releases

CUDA 8 Install Steps

cuDNN and NVIDIA drivers

cuDNN v6.0 for CUDA 8.0

Prepare your environment
• When something goes wrong, it is better to remove the local Anaconda installation
  $ cd ~; rm -rf anaconda2 .conda
• Reinstall Anaconda
  $ cd /tmp; wget https://repo.anaconda.com/archive/Anaconda2-5.1.0-Linux-ppc64le.sh
  $ bash /tmp/Anaconda2-5.1.0-Linux-ppc64le.sh
• Activate PowerAI
  $ source /opt/DL/tensorflow/bin/tensorflow-activate
• When you need to install a Python package
  – Use pip install
  – Use pip install xx==<version> to force the installation of a given version

Keras installation & TensorFlow 1.8.0 compilation

Install Keras
• Put your Anaconda in the $PATH
  $ export PATH=/opt/anaconda2/bin:$PATH
• Activate PowerAI TensorFlow
  $ source /opt/DL/tensorflow/bin/tensorflow-activate
• Install Keras via pip
  $ pip install keras
• Test the stack
  $ python
  >>> import keras
  Using TensorFlow backend.

Recompile TensorFlow 1.8.0
Download and build Bazel (minimum version for TF 1.8.0 is Bazel 0.10)
Bazel is the build tool required to compile TensorFlow.
  $ mkdir bazel
  $ cd bazel
  $ wget https://github.com/bazelbuild/bazel/releases/download/0.10.0/bazel-0.10.0-dist.zip
  $ unzip bazel-0.10.0-dist.zip
  $ ./compile.sh
  Building Bazel from scratch..
  …
  …
  Build successful! Binary is here: /root/bazel/output/bazel
Add the Bazel binary to the PATH
  $ export PATH=/root/bazel/output/:$PATH

Download and compile TensorFlow (latest release)
  git clone --recurse-submodules https://github.com/tensorflow/tensorflow.git
  cd tensorflow/
  ./configure
  Please specify the location of python. [Default is /opt/anaconda2/bin/python]: <leave blank>
  Please input the desired Python library path to use. Default is [/opt/DL/tensorflow/lib/python2.7/site-packages]: <leave blank>
  Do you wish to build TensorFlow with jemalloc as malloc support? [Y/n]: Y
  Do you wish to build TensorFlow with Google Cloud Platform support? [Y/n]: Y
  Do you wish to build TensorFlow with Hadoop File System support? [Y/n]: Y
  Do you wish to build TensorFlow with Amazon S3 File System support? [Y/n]: N
  Do you wish to build TensorFlow with Apache Kafka Platform support? [Y/n]: N
  Do you wish to build TensorFlow with XLA JIT support? [y/N]: N
  Do you wish to build TensorFlow with GDR support? [y/N]: N
  Do you wish to build TensorFlow with VERBS support? [y/N]: N
  Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: N
  Do you wish to build TensorFlow with CUDA support? [y/N]: Y
  Please specify the CUDA SDK version you want to use, e.g. 7.0.
  [Leave empty to default to CUDA 9.0]: <leave blank>
  Please specify the location where CUDA 9.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: <leave blank>
  Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7.0]: <leave blank>
  Please specify the location where cuDNN 7 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: <leave blank>
  Do you wish to build TensorFlow with TensorRT support? [y/N]: N
  Please specify the NCCL version you want to use. [Leave empty to default to NCCL 1.3]: <leave blank>
  Please specify a list of comma-separated Cuda compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,5.2]: 6.0
  Do you want to use clang as CUDA compiler? [y/N]: N
  Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: <leave blank>
  Do you wish to build TensorFlow with MPI support? [y/N]: N
  Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -mcpu=native]: -mcpu=power8
  Would you like to interactively configure ./WORKSPACE for Android builds?
  [y/N]: N

Build pip package
  bazel build --copt='-std=gnu99' --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
  bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
Install the package
  pip install /tmp/tensorflow_pkg/tensorflow-1.8.0-cp27-cp27mu-linux_ppc64le.whl

Exit the TensorFlow build directory and verify the installation
  [root@289961934b8a ~]# python
  Python 2.7.13 |Anaconda custom (64-bit)| (default, Sep 15 2017, 20:54:27)
  [GCC 4.8.4] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  Anaconda is brought to you by Continuum Analytics.
  Please check out: http://continuum.io/thanks and https://anaconda.org
  >>> import tensorflow as tf
  >>> tf.__version__
  '1.8.0'
  >>>

Get access to TensorFlow's latest build
We now have two options for you to access the latest TensorFlow version for Power Systems:
1. TensorFlow version 1.8 for Power has been built and is available through our Linux Technology Center partnership with IBM Unicamp process. You can get TF 1.8 for Power (along with readme info, etc.) directly by contacting IBM, for example by calling the Montpellier team.
2. You can also access a Jenkins instance at the Open Source software hub with Oregon State, which builds TensorFlow images as part of a community build.
Google includes us in their community distribution, and you can see the build status here: https://github.com/tensorflow/

Caffe rebuild and installation with Large Model Support

Prerequisites
• Install CUDA Toolkit and driver, cuDNN 7
• Install OpenBLAS lib and devel packages
  $ sudo yum install openblas-*
• Build NCCL
  $ git clone https://github.com/NVIDIA/nccl.git
  $ cd nccl
  $ make
• Install lmdb packages
  $ sudo yum install lmdb-* liblmdb-dev
• Install protobuf packages
  $ yum install protobuf-* libprotobuf-dev protobuf-compiler
• Install hdf5 packages
  $ yum install hdf5-* libhdf5-dev
• Install boost packages
  $ sudo yum install boost-* libboost-*
• Install leveldb and snappy
  $ sudo yum install libleveldb-dev libsnappy-dev
• Other packages
  $ sudo yum install libgflags-* libgoogle-glog-dev

Rebuild Caffe with LMS
• The build and install steps for Caffe LMS follow the standard steps for BVLC Caffe.
• Public GitHub link for Caffe LMS code: https://github.com/ibmsoe/caffe/tree/master-lms
  $ git clone https://github.com/ibmsoe/caffe.git
  $ cd caffe
  $ git checkout master-lms
  commit d04b5c56edc0a5a046ae5f0ea0edd310aed99d4d
• Update Makefile.config to point to the dependent library and include file paths
• Compile the source code
  $ make all
• Executable files are generated under the folder "build/tools".

Validation and test
• Environment variable settings:
• NOTE: In the settings below, the path must be changed to match the installation paths on the system used.
• User specific aliases and functions:
  $ export LD_LIBRARY_PATH=/home/sarithav/DL/nccl/lib:$LD_LIBRARY_PATH
• Command used for training using Caffe LMS:
  $ ./build/tools/caffe train -solver solver.prototxt -gpu 0,1,2,3 -lms 1000

torch7 compilation

Install torch7
Torch7 requires LuaJIT (interpreter) and LuaRocks (Lua package manager) to work. The official git repository (https://github.com/torch/luajit-rocks) does not support the ppc64le architecture, but ported code exists in https://github.com/PPC64/
• Install prerequisites
  $ yum install -y readline-devel
• Clone git repository
  $ git clone --recurse-submodules https://github.com/regiscely/torch7
• Compile
  $ cd torch7/
  $ export TORCH_NVCC_FLAGS="-D__CUDA_NO_HALF_OPERATORS__"
  $ ./install.sh
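
Before launching a long source build such as the ones above, it can help to confirm that the build tools are visible on the PATH. The following is only a minimal sketch: the helper `missing_tools` and the list `ASSUMED_BUILD_TOOLS` are hypothetical names introduced for illustration, and the tool list is an assumption rather than something stated in the slides. It also assumes Python 3.3 or later, because `shutil.which` is not available in the Python 2.7 interpreter shipped with Anaconda2.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Hypothetical list of tools, not taken from the slides.
# Adjust it to what your build really needs.
ASSUMED_BUILD_TOOLS = ["git", "gcc", "make", "cmake", "nvcc"]

missing = missing_tools(ASSUMED_BUILD_TOOLS)
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("All assumed build tools were found on PATH.")
```

An empty result only means that the executables were found; it says nothing about their versions, which still have to match what each framework expects.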