May 22-25, 2017 | San Francisco

Port, Develop, Tune on POWER8

Why Linux on POWER?

● Enterprise Class Hardware

● 2x price/performance

● Easy to port to, Easy to use

● Optimized for performance

– High-performing, highly reliable platform, capable of handling large quantities of data more efficiently

– Enables high-speed off-load capabilities with technologies such as CAPI and GPUs

– Flexible, fast execution of analytics algorithms

● 4X threads per core vs. x86 (up to 1536 threads per system)

● Memory: Large, fast workspace maximizes business insight

● 4X memory bandwidth vs. x86 (up to 32 TB of memory)

● Cache: Ensures continuous data load for fast responses

● 6X more cache vs. x86 (>19 MB cache per core)

Linux is Linux everywhere!

Software: It's Linux!

▶ Linux on POWER is Linux, with full distribution support from the major enterprise Linux distributions:

– Red Hat Enterprise Linux (RHEL) from Red Hat,

– Ubuntu from Canonical, and

– SUSE Linux Enterprise Server (SLES) from SUSE.

● Thousands of open source binary packages are supported on the POWER platform

● Most standard packages are available directly from the major Linux distributions.

● Thousands of community maintained ppc64le packages that run on POWER

● Open Source POWER Availability Tool (OSPAT) searches ppc64le distros and ported apps

The definitive guide to porting Linux applications to POWER

▶ Research

▶ Plan

▶ Set up

▶ Build

▶ Resources

Overview of Research Process

▶ Plan for the port

▶ Get started

▶ Get hardware access

▶ Choose Linux distribution

▶ Prepare development environment

▶ Determine database requirements

▶ Gather test cases and tools

▶ Are you ready to get started?

Research

Researching your app – Getting Started

▶ Many packages are trivial to port to POWER

▶ They often require only a simple recompile or make command

▶ Some require minor tweaks to #ifdef configurations

▶ Some have – which implies big endian

▶ Some need additional libraries, e.g. some math libraries

▶ Some simply require identifying and loading the right dependencies

▶ If the port is not trivial, we need to build a plan...

Access to Hardware

Access to POWER8 hardware

▶ If you have your own hardware already, you are set

▶ But if not...

– There are POWER cloud resources available throughout the world

– Use the OpenPOWER Developer Resources Map:

● https://developer.ibm.com/linuxonpower/cloud-resources/

Cloud centers available worldwide

Highlights of current clouds

Cloud Hardware Recommendations

● If you are an academic or open source developer consider these cloud resources:

– Oregon State University- Open Source Lab (OSL)

● In partnership with IBM, the OSL provides access to IBM POWER based servers for developing and testing open source projects. Access is by request.

– Unicamp MiniCloud

● No-charge access to Power virtual machines (VMs) for developing, testing, or migrating applications to Linux on Power. Hosted by the University of Campinas, Brazil. Access is by request.

● If you are an ISV or need access to GPUs, CAPI, or NVLink technologies, consider:

– Nimbix

● The Nimbix Jarvice Cloud platform is the first commercial cloud environment to feature POWER8 with NVLink. It features ready to deploy instances for deep learning and NVLink application deployment and development.

● If you don’t have hardware and don’t want to use a cloud, use an emulator

– QEMU (user-mode emulation), available in the SDK

– IBM Power Functional Simulator (full system simulation), also provided with the SDK

In this mode, you can develop and port applications without the need for Power hardware.

Hardware: Simulators

▶ If you want to try the port through a simulator, consider:

– Install the IBM Software Development Kit (SDK) for Linux on Power on your Linux laptop/desktop

– Run your Power binaries on the same desktop through emulation technologies

● Both QEMU (user-mode emulation) and the IBM Power Functional Simulator (full system simulation) are provided with the SDK. In this mode, you can develop and port applications without the need for Power hardware.

Linux Distributions

Linux Distribution

● Do you need the latest and greatest technology?

– Ubuntu, Debian, and Fedora typically have frequent releases that integrate the latest open source projects. They are good choices for initial development in cloud, containers, cognitive, HPC, and other areas where the latest technology is absolutely necessary.

● Do you need the best support?

– Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server (SLES), and Ubuntu, in partnership with IBM, provide long-term distribution support and are a common target for key ISV applications that expect long-term stability.

● Community supported distributions

– Ubuntu (and its many variants, such as Kubuntu),

– CentOS, and

– openSUSE provide community supported distributions, which offer much of the same application support as their enterprise versions but without the corresponding support costs.

Selecting a Distribution based on apps

● Are prerequisite and dependent packages available on the distribution of your choice?

– Use the Open Source POWER Availability Tool (OSPAT) search engine to see if the correct packages and versions of packages are available on the distribution of your choice.

Open Source POWER Availability Tool (OSPAT)

OSPAT Search for Elasticsearch

OSPAT Future Directions

▶ Currently scans common distro contents

▶ Adding pointers to all packages ported to POWER

▶ Planning to add Ruby gems, Python pips, Node.js npms

▶ Including HPC ported applications lists

▶ Adding biobuilds.org contents

▶ Working with OSU Center for Genome Research and Biocomputing (CGRB)

– http://cgrb.oregonstate.edu/

– Listing their projects

– Hosting build instructions at .com/ppc64le/build-scripts

▶Working with Nimbix to build docker containers

▶Populating https://hub.docker.com/u/ppc64le/ with pre-built POWER containers

▶Tracking binaries when available

Other Distributions

● There are several other Linux distributions that support the POWER platform including:

– Fedora (be sure to search for ppc64le packages)

– Debian

– CentOS

Languages

Application Programming Language

● What language is your application written in?

– Power supports most but not all current languages (see list below)

– Most compiled applications will simply require a re-build

– Most interpreted languages will run without changes

● Supported languages: Ada, Angular.js, awk, C# (Mono), Clang, Clang++ (LLVM), Clojure (JVM), D (LLVM), DoT.js, Erlang, Fixedheader.js, G++, GCC, GNU Fortran, GNU Go, GNU Objective C, GNU Objective C++, Go Lang, Haskell, HHVM, Java (OpenJDK), JQuery, JRuby, Julia (LLVM), Lua, Modula 2/3, Node.js/V8, OCaml, Octave, Perl, Phantom.js, PHP, pypy, Python, Ruby, Rust (LLVM), S Lang, Scala (JVM), SpiderMonkey, SQL, Swift (LLVM)

Porting a Compiled Language

Compiled languages

Effort required to port: Recompile and test*

C, C++, and Fortran are compiled to instructions for a specific machine/platform. The C language in particular can expose more of the underlying machine architecture to the program and is thus slightly less portable. That said, most applications written in a compiled language with no platform dependencies will only require a recompile to run on Power.

*It’s estimated that less than 5% of Linux applications from any platform written in C/C++ will require changes.

The Source Code Advisor, which is part of the IBM Software Development Kit (SDK) for Linux on Power, can analyze your source code and show you which areas require changes. The SDK is discussed in more detail below.

C/C++ Applications

Many of the most popular compilers are available and optimized for POWER, including the following:

● GNU Compiler Collection (GCC)

● Clang

● XL C/C++ for Linux

● XL Fortran for Linux

IBM Advance Toolchain for Linux on POWER

Install the Advance Toolchain for the latest open source compilers, runtime libraries, and tools enabled and optimized for POWER8.

● Current GCC compilers and language levels

● Languages: C/C++, Fortran, Go

● Includes cross compilers for SDK client

● Optimized POSIX runtime libraries: libc, libm, libpthread

● Extra libraries: zlib, OpenSSL, Boost, TCMalloc, Intel TBB, SPHDE

● Performance profiling tools: oprofile, valgrind

Learn more about the Advance Toolchain for Linux on Power (including how to install it):

● https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W51a7ffcf4dfd_4b40_9d82_446ebc23c550/page/IBM%20Advance%20Toolchain%20for%20PowerLinux%20Documentation?section=installation

IBM Software Development Kit (SDK) for Linux on Power

If you’re porting an application written in C/C++, consider using the SDK for Linux on Power, which is a free, Eclipse-based integrated development environment (IDE) that includes powerful tools to aid developers porting to Linux on Power:

● Migration Advisor

● Build Advisor

● Source Code Advisor

Each of these tools, and many others (including the IBM Advance Toolchain), are integrated into the SDK, which you can download from here: https://developer.ibm.com/linuxonpower/sdk-download/ or in source form from here: https://github.com/open-power-sdk

Migration Advisor

The Migration Advisor scans your project source code and reports issues that are likely due to portability or performance including:

● Non-portable compiler intrinsics

● Non-portable API calls

● Non-portable assembly

● Preprocessor masking of architecture-specific optimizations

● Endian issues

● Sub-optimal intrinsics

In addition, the Migration Advisor suggests likely remedies for these issues and, for many of them, offers to implement the remedy directly in the source code with a single click.
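The endian issues in the list above usually come from code that reads or writes multi-byte values in the host's native byte order. The following is a minimal sketch of that idea using only Python's standard `struct` module. The function names and the sample value are illustrative assumptions and have nothing to do with how the Migration Advisor itself works.

```python
import struct
import sys


def encode_native(value: int) -> bytes:
    """Pack a 32-bit unsigned integer in the host's native byte order.

    The result differs between little-endian and big-endian hosts,
    which is the kind of hidden dependency that breaks when data
    moves between platforms.
    """
    return struct.pack("=I", value)


def encode_portable(value: int) -> bytes:
    """Pack a 32-bit unsigned integer with an explicit little-endian layout.

    The result is identical on every host, so files and network
    messages written this way can be read back anywhere.
    """
    return struct.pack("<I", value)


def decode_portable(data: bytes) -> int:
    """Inverse of encode_portable."""
    return struct.unpack("<I", data)[0]


if __name__ == "__main__":
    sample = 0x01020304
    print("host byte order:", sys.byteorder)
    print("native  :", encode_native(sample).hex())
    print("portable:", encode_portable(sample).hex())
```

On a little-endian host the two encodings match; on a big-endian host only the portable one stays the same, which is why stating the byte order explicitly is the safer habit.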

Build Advisor

The Build Advisor scans your project build output

● Recommends “best practices” you can implement for improved results

● Looks for compilers (type and version),

● optimization levels,

● processor-specific optimization, and

● other compiler and linker flags.
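The internals of the Build Advisor are not described in these slides. As a rough illustration of the kind of check listed above, the sketch below inspects a single gcc-style compile command for an optimization level and a processor-specific flag. The function name and the wording of the notes are hypothetical.

```python
import shlex


def review_compile_command(command: str) -> list:
    """Return advisory notes for one gcc-style compile command.

    This is an illustrative check only, not the Build Advisor's logic.
    """
    args = shlex.split(command)
    notes = []

    # gcc honours the last -O option on the command line.
    opt_flags = [a for a in args if a.startswith("-O")]
    if not opt_flags or opt_flags[-1] == "-O0":
        notes.append("no optimization level set (consider -O2 or -O3)")

    # -mcpu=power8 asks gcc to generate POWER8 instructions.
    if not any(a.startswith("-mcpu=") for a in args):
        notes.append("no processor-specific flag (consider -mcpu=power8)")

    return notes


if __name__ == "__main__":
    for cmd in ("gcc -c foo.c", "gcc -O3 -mcpu=power8 -c foo.c"):
        print(cmd, "->", review_compile_command(cmd) or "looks fine")
```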

Source Code Advisor

The Source Code Advisor leverages technology developed by IBM Research

● Analyzes run-time characteristics of a program

● Looks for performance issues that cannot be determined at compile-time.

● Reports issues with explanations and suggestions for possible remedies

● Offers to implement the remedy directly in the source code with a single click.

Java

If your application is written in Java:

● Most open source projects are built with OpenJDK, which is usually a safe choice for your development needs.

● OpenJDK is available from the distro – see instructions here: http://openjdk.java.net/install

● The IBM JDK is available here: https://developer.ibm.com/javasdk/downloads/

NOTE: Oracle's JDK is not available for ppc64le

Python

Python versions 2 and 3 are available from the distros.

● Python 2 continues to be provided for backwards compatibility

● Python 3 should be the new standard.

There is also a miniconda environment for ppc64le available here:

● https://repo.continuum.io/miniconda/Miniconda2-4.3.14-Linux-ppc64le.sh

A repository of packages ported to the miniconda environment is here:

● https://repo.continuum.io/pkgs/

Node.js

Node is available directly from IBM here: https://developer.ibm.com/node/

There is also a docker container available for installation from that same site.

IBM has also tested almost all npms

● Automated testing, using each tool's own supplied tests

● Found some errors with npms pulling down Intel-based binaries

– Generally very few problems found.
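One way to spot a package that has pulled down an Intel-based binary is to read the ELF header of the downloaded file. The sketch below is an illustration, not the test procedure IBM used. It relies on the ELF header layout (the `EI_DATA` byte at offset 5 and the 16-bit `e_machine` field at offset 18) and on the `e_machine` values 21 for 64-bit PowerPC and 62 for x86-64. The function names are hypothetical.

```python
import struct

# e_machine values defined by the ELF specification.
EM_PPC64 = 21
EM_X86_64 = 62


def elf_target(header: bytes) -> str:
    """Describe the target of an ELF file from its first 20 bytes."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return "not an ELF file"

    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    little = header[5] == 1

    # e_machine is a 16-bit field at offset 18, stored in the file's byte order.
    (machine,) = struct.unpack("<H" if little else ">H", header[18:20])

    if machine == EM_PPC64:
        return "ppc64le" if little else "ppc64"
    if machine == EM_X86_64:
        return "x86_64"
    return "other (e_machine=%d)" % machine


def file_target(path: str) -> str:
    """Read the header of the file at `path` and describe its target."""
    with open(path, "rb") as f:
        return elf_target(f.read(20))
```

Running `file_target` over the shared objects under a project's `node_modules` directory would flag any file reported as `x86_64` on a ppc64le host.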

Go

● Go Lang (from )

● Latest golang for Power will be available from the golang download site - https://golang.org/dl/

● The Advance Toolchain also provides golang binaries.

● Docker images are in ppc64le/golang

● gccgo has also been ported, but golang is recommended

Libraries

Mathematics libraries

● Engineering and Scientific Subroutine Library (ESSL)

● ESSL is a collection of high-performance mathematical subroutines providing a wide range of functions for many common scientific and engineering applications. Languages: Fortran, C, and C++ (serial, SMP, and SPMD).

● libblas

● The Basic Linear Algebra Subprograms (BLAS) library provides routines used for optimizing mathematical computation on POWER8. libblas is available for Ubuntu and for Fedora (the Fedora package also runs on RHEL).

● Mathematical Acceleration Subsystem (MASS) for Linux

● MASS consists of libraries of mathematical intrinsic functions tuned specifically for optimum performance on POWER architectures. Languages: C, C++ and Fortran.

Compiler Optimization Flags

Compilers compared

● XL (xlc, xlC, xlf): latest release 13.1.5/15.1.5, 12/2016

● GNU (gcc, g++, gfortran): latest release 6.3, 12/2016

● Clang: latest release 3.8.0, 03/2016

Architecture: generate instructions that run on POWER8

– XL: -qarch=pwr8 (default)

– GNU: -mcpu=power8

– Clang: -target powerpcle-unknown-linux- -mcpu=pwr8

Optimization levels: disable all optimizations

– XL: -O0 -qnoopt (default)

– GNU: -O0 (default)

– Clang: -O0 (default)

Optimization levels

– XL: -O, -O2, -O3, -O4, -O5[1]

– GNU: -O or -O1, -O2, -O3, -Ofast

– Clang: -O0, -O2, -O3, -Os

Recommended optimization (a good balance between run-time performance and compilation time)

– XL: commercial code: -O3 or -O3 -qipa; technical computing/analytic: -O3 or -O3 -qhot

– GNU: commercial code: -O3 -mcpu=power8; technical computing/analytic: -O3 -mcpu=power8 -funroll-loops

– Clang: -O2

Additional optimizations: feedback directed optimization

– XL: -qpdf1, -qpdf2

– GNU: -fprofile-generate, -fprofile-use

– Clang: -fprofile-instr-generate, -fprofile-instr-use

Additional optimizations: interprocedural optimizations

– XL: -qipa

– GNU: -flto

Additional optimizations: OpenMP

– XL: -qsmp=omp

– GNU: -fopenmp

– Clang: -fopenmp

Additional optimizations: loop optimizations

– XL: -qhot

– GNU: -fpeel-loops -funroll-loops

– Clang: -funroll-loops
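The recommended flags in the table above can be captured in a small lookup so that a build script picks them by compiler family and workload type. This is only a sketch: the dictionary keys, the `build_command` helper, and the choice of the `-O3 -qipa` and `-O3 -qhot` variants for XL (the table also allows plain `-O3`) are assumptions made for illustration.

```python
# Recommended flags taken from the table above. The keys are hypothetical names.
RECOMMENDED_FLAGS = {
    ("xl", "commercial"): ["-O3", "-qipa"],
    ("xl", "technical"): ["-O3", "-qhot"],
    ("gnu", "commercial"): ["-O3", "-mcpu=power8"],
    ("gnu", "technical"): ["-O3", "-mcpu=power8", "-funroll-loops"],
    # The table lists a single recommendation for Clang.
    ("clang", "commercial"): ["-O2"],
    ("clang", "technical"): ["-O2"],
}

# C compiler driver for each family, as named in the table header.
DRIVERS = {"xl": "xlc", "gnu": "gcc", "clang": "clang"}


def build_command(family: str, workload: str, source: str) -> list:
    """Return an argument list that compiles `source` to an object file."""
    flags = RECOMMENDED_FLAGS[(family, workload)]
    return [DRIVERS[family]] + flags + ["-c", source]


if __name__ == "__main__":
    print(" ".join(build_command("gnu", "technical", "app.c")))
```

The returned list can be passed directly to `subprocess.run` on a machine where the chosen compiler is installed.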

Databases

According to Gartner, by 2018, more than 70% of new in-house applications will be developed on an open source database management system (OSDBMS), and 50% of existing commercial RDBMS instances will have been converted.

And guess what? Open source databases running on POWER8 deliver 1.8-2X+ greater value versus equivalent x86 solutions.

The number of ported databases is constantly changing. Use the Open Source POWER Availability Tool (OSPAT) to find out if yours is already available on the platform. Additional community maintained packages that may not have been captured by OSPAT yet are listed here: https://developer.ibm.com/linuxonpower/open-source-pkgs/open-source-db/

Recommended Databases: MongoDB

Here’s an intro to NoSQL databases that run on IBM Power Systems running Linux and why you might choose one.

MongoDB Enterprise Server

Type: Document Store | Latest version: v3.4 | Distribution support: RHEL 7.1, 7.2 and Ubuntu 16.04

MongoDB is a NoSQL document store designed for unstructured or semi-structured big data. Run it on POWER8 and you gain a high-performance DBaaS platform that delivers an integrated, real-time view of all your data throughout your enterprise.

Learn more:

● Get started with MongoDB on IBM Power Systems

● Tuning guide for MongoDB on IBM Power Systems

● Go to the MongoDB website

Recommended Databases: Neo4j

Neo4j

Type: Graph store | Latest version: 3.2 | Distribution support: RHEL 7.3 and Ubuntu 16.04

Neo4j is an industry-leading open source graph database that, when run on POWER, resets the scalability benchmark for what real-time graph processing can deliver for your data.

Learn more:

● Get started with Neo4j on IBM Power Systems

● Neo4j on IBM Power Systems solution brief

● Neo4j and IBM Power Systems data sheet

● Go to the Neo4j website

Recommended Databases: Redis

Redis

Type: In-memory key value store | Latest version: 3.2.8 | Distribution support: RHEL 7.3, Ubuntu 14.04, 16.04

The Redis NoSQL open source data engine and Power Systems facilitate developer processes, delivering next-generation applications with near-zero latency, high availability, and seamless scalability to meet user expectations.

Learn more:

● Redis on IBM Power Systems solution brief

● Go to the Redis Labs website

Recommended Databases: Cassandra

Cassandra

Type: Wide-column store | Latest version: 3.7 | Distribution support: RHEL 7.3, Ubuntu 16.04

Apache Cassandra is a free and open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

Learn more:

● The Cassandra database on IBM POWER Systems running Linux

● Go to the Apache Cassandra website

Recommended Databases: EnterpriseDB

EnterpriseDB (EDB) Postgres Advanced Server

Type: Relational DBMS | Latest version: 9.5 | Distribution support: RHEL 7.3

EDB Postgres Advanced Server is an open source object relational database for enterprise data management. Run it on IBM POWER LC servers for big data and gain a 1.8x price-performance advantage guarantee over x86 servers.

Learn more:

● EDB Postgres Advanced Server 9.5 on IBM Power Systems install and tuning guide

● Get started with EDB Postgres Advanced Server 9.5 on IBM Power Systems

● EDB Postgres on IBM Power Systems solution brief

● Go to the EDB website

Recommended Database: Kinetica

Kinetica

Type: In-memory accelerated by GPU | Latest version: 6.0 | Distribution support: RHEL 7.3, Ubuntu 16.04

Kinetica is a GPU-accelerated database for real-time analysis of large and streaming datasets. Kinetica on POWER8 with NVLink Technology and Tesla P100 GPUs delivers 2.4x more Kinetica queries per hour than accelerated x86 solutions.

Learn more:

● Kinetica on IBM Power Systems

● Go to the Kinetica website

Recommended Data Platform: Hortonworks

Hortonworks Data Platform (HDP)

100% open source Apache Hadoop and Spark on IBM Power Systems built with OpenPOWER technology. Hortonworks Data Platform on OpenPOWER LC delivers 1.7X Hadoop workload performance compared to x86.

Learn more:

● Deploying an OpenStack-based private cloud and Hortonworks Data Platform (HDP) on a Linux on IBM Power Systems server

● HDP on IBM Power Systems solution brief

● HDP on IBM Power Systems reference architecture

● Go to the Hortonworks website

Build

Building an application

Do I know my package's build instructions?

If so, Build It!

But if not... (guidelines for open source projects)

Check availability of build instructions

● Build instructions, which often specify the dependencies/prerequisites, are at times available at any of the following locations. We check these to see if they are useful or applicable to us, either in part or in whole.

● A –Project web page

● B –Github home page of the project

● C –Project README file/install script etc.

● D –Intel Dockerfiles

● E – For packages already ported by the Z team: https://github.com/linux-on-ibm-z

External CI engine (Travis-CI/Jenkins) etc.

● Many times, build instructions can easily be inferred from the build logs or the .travis.yml file

● We can also check the autoport tool at this step, though the team does not use it very actively.

Check for availability of project specific dependency lists as part of the source code

● This varies from project to project and to a large extent depends on the underlying language in which the source is written. A few examples:

● A –Gemfile (Ruby Projects)

● B – requirements.txt (Python projects)

● C –package.json (node.js projects)

● D –Makefile (C/C++ projects)
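Looking for these dependency lists is easy to automate. The sketch below walks a source tree and reports which of the four file names from the list above it finds. The `find_manifests` function and the `MANIFESTS` mapping are hypothetical helpers, not part of any IBM tool.

```python
import os

# File names from the list above, mapped to the language they indicate.
MANIFESTS = {
    "Gemfile": "Ruby",
    "requirements.txt": "Python",
    "package.json": "Node.js",
    "Makefile": "C/C++",
}


def find_manifests(project_dir: str) -> dict:
    """Map each detected language to the dependency files found for it."""
    found = {}
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            if name in MANIFESTS:
                path = os.path.join(root, name)
                found.setdefault(MANIFESTS[name], []).append(path)
    return found


if __name__ == "__main__":
    for language, paths in find_manifests(".").items():
        print(language, "->", ", ".join(paths))
```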

Install common dependencies

● Determined again by the underlying language in which the project is written, this includes the common set that may or may not have been covered in step 3. These include build essentials (gcc/g++, make, etc.), node.js and npm (node.js packages), ruby/ruby-dev (ruby packages), etc.

Trial and error

● This is generally the final step. We go ahead with what we find and can install using the above steps and attempt to build the package. Other dependencies are identified and installed as the build fails and reports missing dependencies.
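The trial-and-error loop goes faster when the missing names are pulled out of the build log automatically. The sketch below recognizes two message shapes only: gcc's "fatal error: <header>: No such file or directory" and Python's "No module named '<module>'". Other tools word their errors differently, so the patterns and the `missing_dependencies` helper are a starting point under those assumptions, not a complete solution.

```python
import re

# Two common "missing dependency" messages:
# a header gcc cannot find, and a Python module that is not installed.
PATTERNS = [
    re.compile(r"fatal error: (?P<name>\S+): No such file or directory"),
    re.compile(r"No module named '?(?P<name>[\w.]+)'?"),
]


def missing_dependencies(build_log: str) -> list:
    """Return the distinct missing names, in the order first reported."""
    names = []
    for line in build_log.splitlines():
        for pattern in PATTERNS:
            match = pattern.search(line)
            if match and match.group("name") not in names:
                names.append(match.group("name"))
    return names


if __name__ == "__main__":
    log = (
        "foo.c:1:10: fatal error: zlib.h: No such file or directory\n"
        "ModuleNotFoundError: No module named 'yaml'\n"
    )
    print(missing_dependencies(log))
```

Mapping a missing header or module to the distribution package that provides it is still a manual step in this sketch.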

Optimize

Short Discussion

▶ Optimizing is much harder to turn into simple steps

▶ General guidelines

http://developer.ibm.com/linuxonpower/porting-guide