Front cover

AI and Big Data on IBM Power Systems Servers

Ivaylo B. Bozhinov
Anto Ajay Raj John
Rafael Freitas de Lima
Ahmed (Mash) Mashhour
James Van Oosten
Fernando Vermelho
Allison White

Redbooks

IBM Redbooks

AI and Big Data on IBM Power Systems Servers

March 2019

SG24-8435-00

Note: Before using this information and the product it supports, read the information in “Notices” on page xv.

First Edition (March 2019)

This edition applies to IBM Power Systems servers that are based on POWER9 and POWER8 processor technology and that run the stack that is described in “Software levels” on page 120.

© Copyright International Business Machines Corporation 2019. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Figures ...... vii

Tables ...... xi

Examples ...... xiii

Notices ...... xv
Trademarks ...... xvi

Preface ...... xvii
Authors ...... xviii
Now you can become a published author, too! ...... xix
Comments welcome ...... xix
Stay connected to IBM Redbooks ...... xx

Chapter 1. Solution overview ...... 1
1.1 Introduction ...... 2
1.1.1 Types of AI ...... 2
1.1.2 What is a data lake ...... 3
1.2 Artificial intelligence solutions ...... 4
1.2.1 IBM PowerAI ...... 4
1.2.2 IBM Watson Machine Learning Accelerator ...... 5
1.2.3 IBM Watson Studio Local ...... 7
1.2.4 H2O Driverless AI ...... 9
1.2.5 IBM PowerAI Vision ...... 10
1.2.6 Key differences among the popular AI solutions ...... 11
1.3 Data platforms ...... 11
1.3.1 Apache Hadoop and Hortonworks ...... 11
1.3.2 IBM Spectrum Scale ...... 13
1.3.3 IBM Elastic Storage Server ...... 15
1.3.4 IBM Spectrum Scale HDFS Transparency Connector ...... 17
1.4 When to use a GPU ...... 19
1.5 Hortonworks Data Platform GPU support ...... 22
1.5.1 Native GPU support ...... 22
1.5.2 GPU discovery ...... 23
1.5.3 GPU isolation and ...... 23
1.5.4 GPU scheduling ...... 24
1.5.5 Hortonworks Data Platform 3 and YARN container with GPU support ...... 24
1.6 Linux on Power ...... 24
1.7 Client use cases ...... 25
1.7.1 Asian job-hunting services company ...... 25
1.7.2 Large bank ...... 26
1.7.3 South American IT services provider ...... 26
1.7.4 Governmental agency ...... 26
1.7.5 European IT services company ...... 26

Chapter 2. Integration overview ...... 29
2.1 Architecture overview ...... 30
2.1.1 Infrastructure stack ...... 31

2.2 System configurations ...... 33
2.2.1 IBM Watson Machine Learning Accelerator configuration ...... 33
2.2.2 IBM Watson Studio Local configurations ...... 35
2.2.3 Configuring an HDP system ...... 38
2.2.4 Configuring a proof of concept ...... 38
2.2.5 Conclusion ...... 39
2.3 Deployment options ...... 39
2.3.1 Deploying IBM Watson Studio Local in stand-alone mode or with IBM Watson Machine Learning Accelerator ...... 39
2.3.2 Using the Hadoop Integration service versus using an Apache Livy connector ...... 40
2.3.3 Deploying H2O Driverless AI in stand-alone mode or within IBM Watson Machine Learning Accelerator ...... 40
2.3.4 Running Spark jobs ...... 43
2.4 IBM Watson Machine Learning Accelerator and Hortonworks Data Platform ...... 43
2.5 IBM Watson Studio Local with Hortonworks Data Platform ...... 44
2.6 IBM Watson Studio Local with IBM Watson Machine Learning Accelerator ...... 46
2.7 IBM Spectrum Scale and Hadoop Integration ...... 47
2.7.1 Information Lifecycle Management ...... 48
2.8 Security ...... 50
2.8.1 Datalake security ...... 51
2.8.2 IBM Watson Machine Learning Accelerator security with Hadoop ...... 53
2.8.3 IBM Watson Studio Local security ...... 53
2.8.4 IBM Spectrum Scale security ...... 55

Chapter 3. Integrating new data ...... 57
3.1 Data ingestion overview ...... 58
3.2 Types of data ingestion ...... 59
3.3 Options for data ingestion ...... 61
3.4 Using data connectors to work with external data sources ...... 63
3.5 How integration improves the artificial intelligence models ...... 64

Chapter 4. Integration details ...... 65
4.1 Integrating IBM Watson Machine Learning Accelerator with Hortonworks ...... 66
4.1.1 Running remote Spark jobs with Livy ...... 67
4.1.2 Accessing the Hadoop data from IBM Watson Machine Learning Accelerator ...... 69
4.2 Integrating IBM Watson Studio Local, IBM Watson Machine Learning Accelerator, and Hadoop clusters ...... 72
4.3 Integrating IBM Watson Studio Local with Hortonworks Data Platform ...... 79

Chapter 5. Accessing real-time data ...... 87
5.1 Hortonworks DataFlow ...... 88
5.1.1 Planning a Hortonworks DataFlow installation ...... 89
5.2 Apache NiFi ...... 90
5.2.1 Adding the Apache NiFi service to an HDF cluster ...... 91
5.2.2 Working with Apache NiFi ...... 91
5.2.3 Integrating Apache NiFi with a data science tool environment ...... 92
5.3 Apache Storm ...... 92
5.3.1 Working with Apache Storm ...... 93
5.3.2 Integrating Apache Storm with a data science tool environment ...... 93
5.4 Apache Spark Streaming ...... 94
5.4.1 Working with Apache Spark Streaming ...... 95
5.5 Kafka Streams ...... 97
5.6 Integrating streaming tools with data science tools ...... 100
5.6.1 Application overview ...... 100
5.6.2 Configuring a Kafka topic ...... 101
5.6.3 Starting a streamer by using the nmap-ncat utility ...... 102
5.6.4 Running the Spark Streaming engine ...... 103
5.6.5 Merging data by using NiFi and saving it on HDFS ...... 105
5.6.6 Conclusion of the data stream integration proof of concept ...... 116

Appendix A. Additional information ...... 119
System topology ...... 120
Software levels ...... 120

Appendix B. Installing an IBM Watson Machine Learning Accelerator notebook ...... 121
Customizing a notebook package ...... 122
Adding a notebook package ...... 123
Creating a Spark Instance Group with a notebook ...... 125
Creating notebooks for users ...... 130
Testing notebooks ...... 133
Conclusion and additional information ...... 134

Related publications ...... 135
IBM Redbooks ...... 135
Online resources ...... 135
Help from IBM ...... 136

Figures

1-1 A diagram showing how different computer models relate to each other ...... 2
1-2 Performance relative to DL and ML ...... 3
1-3 IBM PowerAI distribution ...... 5
1-4 IBM Watson Machine Learning Accelerator parts ...... 6
1-5 IBM Watson Machine Learning Accelerator supported frameworks and libraries ...... 7
1-6 IBM Watson Studio Local components ...... 8
1-7 IBM Watson Studio Local node roles ...... 9
1-8 H2O Driverless AI console ...... 10
1-9 IBM PowerAI Vision ...... 10
1-10 Key differences among the AI solutions for Power Systems servers ...... 11
1-11 Applications interacting with data systems and sources ...... 12
1-12 Hortonworks software stack on top of the IBM Spectrum Scale file system ...... 13
1-13 IBM Spectrum Scale ...... 15
1-14 IBM ESS hardware and software stack configuration ...... 17
1-15 Overall dataflow between IBM Spectrum Scale, storage, and the compute nodes ...... 18
1-16 NVIDIA Volta GPU specifications ...... 21
1-17 The nvidia-smi command output ...... 23
2-1 How the components fit together ...... 30
2-2 Disk space of a VM ...... 36
2-3 CPU output of a VM ...... 36
2-4 HDP recommended configurations ...... 38
2-5 H2O Driverless AI as shown in the management console ...... 41
2-6 Watson Studio connections with Hortonworks ...... 45
2-7 IBM Spectrum Scale flow of data architecture ...... 47
2-8 IBM Spectrum Scale GUI: Information Lifecycle Add Rule window ...... 49
2-9 IBM Spectrum Scale GUI: Information Lifecycle Rule Type window ...... 49
2-10 IBM Spectrum Scale: Information lifecycle rules configuration ...... 50
2-11 IBM Watson Studio Local Hadoop registration service ...... 54
3-1 Data ingestion ...... 58
3-2 Hortonworks Data Platform ...... 59
3-3 Hortonworks Data Platform overview ...... 60
3-4 HDF data ingestion ...... 60
3-5 Apache NiFi ...... 61
3-6 Deep learning ...... 62
3-7 All data sets ...... 63
3-8 Application layer view ...... 63
3-9 Types of data connectors ...... 64
4-1 IBM Watson Machine Learning Accelerator with Hortonworks and Livy ...... 67
4-2 Hadoop data access from IBM Watson Machine Learning Accelerator ...... 69
4-3 Adding a data connector in an IBM Watson Machine Learning Accelerator SIG ...... 70
4-4 New Data Connector details ...... 71
4-5 IBM Watson Studio Local with IBM Watson Machine Learning Accelerator ...... 72
4-6 Registering a new application instance ...... 73
4-7 Specifying the application template ...... 73
4-8 Specifying an application name ...... 73
4-9 Specifying the top-level consumer for the application ...... 74
4-10 Selecting the resource groups ...... 74
4-11 Specifying the repository packages ...... 74

4-12 Specifying the parameter values associated with the SIG ...... 75
4-13 The Livy2 application in the Registered state ...... 75
4-14 The started Livy2 Application showing the URL in the outputs ...... 76
4-15 The IBM Watson Studio Local to IBM Watson Machine Learning Accelerator Livy Notebook Test ...... 77
4-16 IBM Watson Studio Local with Hortonworks ...... 79
4-17 Add Registration window ...... 80
4-18 Starting the environment ...... 82
4-19 Installing the package ...... 83
4-20 Saving the image ...... 83
4-21 Selecting Hadoop Integration from the admin console ...... 84
4-22 Displaying the details of the registered Hadoop system ...... 84
4-23 Pushing the run time ...... 85
5-1 Hortonworks Data Platform components ...... 88
5-2 Hortonworks DataFlow components ...... 89
5-3 Data flow on the NiFi GUI ...... 91
5-4 Basic Storm application topology ...... 93
5-5 Flow of the Spark Streaming engine ...... 95
5-6 Apache Kafka architecture ...... 98
5-7 Anatomy of a Kafka topic ...... 98
5-8 The anatomy of a Kafka topic partition ...... 99
5-9 Kafka Streams integration into a Kafka flow ...... 99
5-10 Kafka Streams components and application flow ...... 100
5-11 Proof of concept application overview ...... 101
5-12 Spark History Server ...... 105
5-13 NiFi GUI: Adding a processor to the project ...... 106
5-14 NiFi GUI: Selecting the GetKafka processor ...... 107
5-15 NiFi GUI: Selecting the MergeContent processor ...... 108
5-16 NiFi GUI: Selecting the PutHDFS processor ...... 109
5-17 NiFi GUI: Application overview ...... 109
5-18 NiFi GUI: Linked processors ...... 110
5-19 NiFi GUI: GetKafka processor properties ...... 111
5-20 NiFi GUI: MergeContent processor properties ...... 111
5-21 NiFi GUI: PutHDFS processor properties ...... 112
5-22 NiFi GUI: MergeContent processor settings ...... 113
5-23 NiFi GUI: PutHDFS processor settings ...... 114
5-24 NiFi GUI: Starting the first data flow processor ...... 115
5-25 NiFi GUI: Checking queued messages ...... 115
5-26 NiFi GUI: Merged file on the queue between the MergeContent and PutHDFS processors ...... 116
A-1 System topology ...... 120
B-1 Notebook Management menu ...... 123
B-2 Notebook Management user interface ...... 123
B-3 Add Notebook wizard ...... 124
B-4 Notebook copy in progress ...... 125
B-5 Spark Instance Groups menu ...... 125
B-6 Creating a Spark Instance Group ...... 126
B-7 New Spark Instance Group: Part 1 ...... 126
B-8 New Spark Instance Group: Part 2 ...... 127
B-9 New Spark Instance Group: Part 3 ...... 127
B-10 New Spark Instance Group: Part 4 ...... 128
B-11 Starting a Spark Instance Group ...... 129
B-12 The Creating Notebooks for Users menu option ...... 130
B-13 Creating Notebooks for Users details ...... 131
B-14 My Notebooks window ...... 132
B-15 Notebook server login ...... 133
B-16 Notebook server home page ...... 133
B-17 Notebook test ...... 134

Tables

1-1 Concepts that define a data lake ...... 3
2-1 Software levels ...... 30
2-2 System details ...... 31
2-3 IBM ESS GS2 ...... 32
2-4 Hardware requirements ...... 34
2-5 Software requirements ...... 34
2-6 Three-node configuration ...... 35
2-7 Sample configuration of a 7-node cluster ...... 37
2-8 Nine-node configuration ...... 37
2-9 Proof of concept configuration ...... 38
A-1 Software levels recommendations ...... 120

Examples

4-1 Code snippet to perform logistic regression by using Sparkmagic ...... 68
4-2 Code snippet to perform logistic regression by using remote data ...... 69
4-3 Loading data by using an HDFS connector ...... 71
4-4 Running the TensorFlow Keras model ...... 77
4-5 Notebook code to get the list of Hadoop clusters ...... 80
4-6 List of available Hadoop systems ...... 80
4-7 Configuring a Spark session ...... 81
4-8 Setting up Sparkmagic ...... 81
4-9 Starting the remote Livy session ...... 81
4-10 Running Spark on the Hadoop cluster from the IBM Watson Studio Local notebook ...... 81
5-1 Python code: Importing and instantiating SparkContext and Streaming Context ...... 96
5-2 Python code: Creating a DStream ...... 96
5-3 Python code: Splitting the lines into words ...... 96
5-4 Python code: Counting the words ...... 96
5-5 Python code: Starting the StreamingContext .start() function ...... 97
5-6 Program output ...... 97
5-7 Linux terminal: Creating a Kafka topic ...... 101
5-8 Linux terminal: Listing the Kafka topic information ...... 101
5-9 Linux terminal: Producing messages on the Kafka topic ...... 102
5-10 Linux terminal: Consuming messages on a Kafka topic ...... 102
5-11 Linux terminal: yum ...... 102
5-12 Linux terminal: Starting ncat on port 9999 with the output of /var/log/messages ...... 102
5-13 Linux terminal: Listing the active TCP ports ...... 103
5-14 Linux terminal: Testing the TCP service with telnet ...... 103
5-15 Python code: Declaring libraries ...... 103
5-16 Python code: Defining a function ...... 104
5-17 Python code: Defining Spark objects ...... 104
5-18 Python code: Defining the DStream ...... 104
5-19 Python code: Running the send_to_kafka function ...... 104
5-20 Python code: Starting the Spark Streaming job ...... 105
5-21 Linux terminal: Submitting a Spark job by using a Python program ...... 105
5-22 Linux terminal: Listing data that is stored on HDFS ...... 116

Notices

This information was developed for products and services offered in the US. This material might be available from IBM in other languages. However, you may be required to own a copy of the product or product version in that language in order to access it.

IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to evaluate and verify the operation of any non-IBM product, program, or service.

IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US

INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some jurisdictions do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.

This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.

Any references in this information to non-IBM websites are provided for convenience only and do not in any manner serve as an endorsement of those websites. The materials at those websites are not part of the materials for this IBM product and use of those websites is at your own risk.

IBM may use or distribute any of the information you provide in any way it believes appropriate without incurring any obligation to you.

The performance data and client examples cited are presented for illustrative purposes only. Actual performance results may vary depending on specific configurations and operating conditions.

Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only.

This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to actual people or business enterprises is entirely coincidental.

COPYRIGHT LICENSE:

This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs.

Trademarks

IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at http://www.ibm.com/legal/copytrade.shtml

The following terms are trademarks or registered trademarks of International Business Machines Corporation, and might also be trademarks or registered trademarks in other countries.

AIX®
Cognos®
Db2®
GPFS™
IBM®
IBM Cloud™
IBM Elastic Storage™
IBM Spectrum™
IBM Spectrum Conductor™
IBM Spectrum Scale™
IBM Watson®
OpenCAPI™
POWER®
Power Architecture®
Power Systems™
POWER8®
POWER9™
Redbooks®
Redbooks (logo)®
SPSS®
Tivoli®
Watson™
Watson Health™
WebSphere®

The following terms are trademarks of other companies:

Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Other company, product, or service names may be trademarks or service marks of others.

Preface

As big data becomes more ubiquitous, businesses are wondering how they can best leverage it to gain insight into their most important business questions. Using machine learning (ML) and deep learning (DL) in big data environments can identify historical patterns and build artificial intelligence (AI) models that can help businesses to improve customer experience, add services and offerings, identify new revenue streams or lines of business (LOBs), and optimize business or manufacturing operations. The power of AI for big data is being harnessed across all industries, so it is important that businesses familiarize themselves with all of the tools and techniques that are available for integration with their data lake environments.

In this IBM® Redbooks® publication, we cover the best practices for deploying and integrating some of the best AI solutions on the market, including:
- IBM Watson® Machine Learning Accelerator (see note for product naming)
- IBM Watson Studio Local
- IBM Power Systems™
- IBM Spectrum™ Scale
- IBM Data Science Experience (IBM DSX)
- IBM Elastic Storage™ Server
- Hortonworks Data Platform (HDP)
- Hortonworks DataFlow (HDF)
- H2O Driverless AI

We map out all the integrations that are possible with our different AI solutions and how they can integrate with your existing or new data lake. We also walk you through some of our client use cases and show you how some of the industry leaders are using Hortonworks, IBM PowerAI, and IBM Watson Studio Local to drive decision making. We also advise you on your deployment options, when to use a GPU, and why you should use the IBM Elastic Storage Server (IBM ESS) to improve storage management. Lastly, we describe how to integrate IBM Watson Machine Learning Accelerator and Hortonworks with or without IBM Watson Studio Local, how to access real-time data, and how to address security.

Note: IBM Watson Machine Learning Accelerator is the new product name for IBM PowerAI Enterprise.

Note: Hortonworks merged with Cloudera in January 2019. The new company is called Cloudera. References to Hortonworks as a business entity in this publication refer to the merged company. Product names beginning with Hortonworks continue to be marketed and sold under their original names.

Authors

This book was produced by a team of specialists from around the world working at IBM Redbooks, Austin Center.

Ivaylo B. Bozhinov has been working at IBM Bulgaria for 4 years as a technical support professional. His main areas of expertise are IBM Power Systems products, IBM AIX®, IBM i, and Enterprise Linux. He has a bachelor’s degree from the State University of Library Studies and Information Technologies in Sofia, Bulgaria. He holds several IBM certifications, which include Hadoop Administration, Hadoop Foundation, Data Science Foundation, IBM Private Cloud, and IBM Blockchain Foundation. His areas of interest include AI, DL, ML, blockchain, and cloud.

Anto Ajay Raj John is a Senior Software Architect working at IBM India for the past 13 years. He works as a Performance Analyst for Deep Learning (AI) frameworks on IBM POWER® processor-based platforms. He is responsible for design input for future IBM enterprise AI Systems and optimizing the stack. Before he worked on DL, he worked in areas like high-performance computing (HPC), system solving, and enterprise operating systems. His areas of interest include AI, distributed systems, and future systems. He holds a few patents and publications and a Master of Engineering degree in Computer Science from BITS Pilani, India.

Rafael Freitas de Lima is a Solution Architect at IBM Brazil for analytics and cognitive solutions. Since joining IBM in 2009, he has worked with data analytics, and for the last 3 years with cloud-native applications and Watson services. He holds a BSc degree in Information Systems and an MBA in Big Data. His areas of expertise include Business Intelligence (BI), cloud, conversational systems, and big data.

Ahmed (Mash) Mashhour is a Power Systems Global subject matter expert (SME). He is IBM AIX, Linux, and IBM Tivoli® certified with 14 years of professional experience in IBM AIX and Linux systems. He is an IBM AIX back-end SME who supports several customers in the US, Europe, and the Middle East. His core experience is in IBM AIX, Linux systems, clustering management, virtualization tools, and various Tivoli and database products. Mash has authored several publications inside and outside IBM. He has hosted more than 70 classes around the world.

James Van Oosten works in IBM Lab Services with a focus on AI solutions on Power Systems servers. He is an expert on ML and DL. His prior roles include analytic subsystem development in IBM Watson Health™ Cloud, control system development on the Blue Gene , enterprise Java development in IBM WebSphere® Application Server, and storage management development on IBM i systems.

Fernando Vermelho is an IBM System Lab Services Senior Big Data and Deep Learning Specialist in Brazil. He has worked at IBM since 2008, when he joined as a Client Technical Specialist for AS/400. He later moved to AIX and Power Systems servers. He holds a degree in IT Management from the Universidade Paulista - Brazil and an extension course in Big Data Analysis through Machine Learning from Fundação Instituto de Administração (FIA) in Brazil. His areas of expertise include ML with large data sets, DL, IBM Watson Visual Recognition patterns, AIX, and Power Systems servers.

Allison White is a Customer Success Manager and Sales Representative who works primarily in Enterprise Software startups. She has a Master of Science degree in Management with a focus on Digital Business from the Nova School of Business and Economics in Lisbon, Portugal, and a Bachelor of Arts degree from the University of Texas at Austin, Texas, US. Her main areas of interest are customer success, technical sales, and product marketing.

The project that produced this publication was managed by:

Scott Vetter, PMP

Thanks to the following people for their contributions to this project:

Jesus Alvarez, Kathy Bennett, Jeffery Bisti, Army Brown, Jason M. Furmanek, Lotic Fura, Diane Fahr, Marty Fullam, Geoffery Goldman, Jim Gosack, Beth Hoffman, Kristy Knight, Rajyalakshmi D Marathu, Wayne Namerow, Kanchana Padmanabhan, Indrajit Poddar, Keshav Ranganathan, Steve Roberts, Akhil Tandon, Dustin VanStee, James Wang, Jim Woodbury, Theresa Xu, and Steve Zehner
IBM

Zohar Nissare-Houssen
Cloudera

Now you can become a published author, too!

Here’s an opportunity to spotlight your skills, grow your career, and become a published author—all at the same time! Join an IBM Redbooks residency project and help write a book in your area of expertise, while honing your experience using leading-edge technologies. Your efforts will help to increase product acceptance and customer satisfaction, as you expand your network of technical contacts and relationships. Residencies run from two to six weeks in length, and you can participate either in person or as a remote resident working from your home base.

Find out more about the residency program, browse the residency index, and apply online at: ibm.com/redbooks/residencies.html

Comments welcome

Your comments are important to us!

We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways:
- Use the online Contact us review Redbooks form found at:
  ibm.com/redbooks
- Send your comments in an email to:
  [email protected]
- Mail your comments to:
  IBM Corporation, IBM Redbooks
  Dept. HYTD Mail Station P099
  2455 South Road
  Poughkeepsie, NY 12601-5400

Stay connected to IBM Redbooks

- Find us on Facebook:
  http://www.facebook.com/IBMRedbooks
- Follow us on Twitter:
  http://twitter.com/ibmredbooks
- Look for us on LinkedIn:
  http://www.linkedin.com/groups?home=&gid=2130806
- Explore new Redbooks publications, residencies, and workshops with the IBM Redbooks weekly newsletter:
  https://www.redbooks.ibm.com/Redbooks.nsf/subscribe?OpenForm
- Stay current on recent Redbooks publications with RSS Feeds:
  http://www.redbooks.ibm.com/rss.html


Chapter 1. Solution overview

This chapter provides an introduction to the technologies that play a key part in the integration of big data and artificial intelligence (AI).

The following topics are covered in this chapter:
- Introduction
- Artificial intelligence solutions
- Data platforms
- When to use a GPU
- Hortonworks Data Platform GPU support
- Linux on Power
- Client use cases

1.1 Introduction

This publication is focused on the best practices for integrating IBM AI solutions, including IBM PowerAI, IBM Watson Machine Learning Accelerator, and IBM Watson Studio Local with big data solutions like Hortonworks Data Platform (HDP) and Hortonworks DataFlow (HDF). Integration of H2O Driverless AI is also covered.

Note: IBM Watson Machine Learning Accelerator is the new product name for IBM PowerAI Enterprise.

1.1.1 Types of AI

A review of some of the terminology in Figure 1-1 might be required before you learn about the solutions.

Figure 1-1 A diagram showing how different computer models relate to each other

AI is any technique that enables computers to mimic human intelligence, which includes the general form of AI that you see in science fiction movies. Cognitive applications work with humans to extend and augment their areas of expertise.

Machine learning (ML) is an application of AI that can learn and improve from experience without being explicitly programmed. ML includes supervised, unsupervised, and reinforcement learning:
- Supervised learning uses data that has both the input and the output to form a mathematical model that can be used to make an output prediction on new input data.
- Unsupervised learning takes a set of input data and looks for groupings or classes within the data.
- Reinforcement learning is an area of ML where an agent learns by being rewarded or not based on the actions that it takes. It is used in game playing and autonomous vehicles.
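To make the supervised versus unsupervised distinction concrete, here is a minimal Python sketch that uses scikit-learn to train a classifier on labeled data and then cluster the same inputs without labels. The Iris data set and the model choices are illustrative assumptions on our part, not part of the stack that this book describes.

# A minimal sketch contrasting supervised and unsupervised learning.
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Supervised: inputs plus known outputs train a model that predicts labels.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
print("supervised accuracy:", clf.score(X_test, y_test))

# Unsupervised: only inputs are given; the model looks for groupings.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("unsupervised cluster assignments:", km.labels_[:10])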

Deep learning (DL) is a subset of ML that uses layers of artificial neural networks (stacks of simple processing units) to extract features from vast amounts of data to get to the answers. DL models can be trained in a supervised fashion for problems like classification of images or in an unsupervised fashion for problems like pattern analysis.
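As a hedged illustration of how a DL model grows by stacking layers, here is a minimal tf.keras sketch; the layer widths and the 784-feature input shape are arbitrary assumptions, not a model from this book.

# A hedged sketch: a DL model is extended by stacking layers.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    # Adding layers lets the model represent and interpret more
    # information from more data.
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()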

Figure 1-2 shows that classical ML techniques can outperform DL when training with small to medium amounts of data, but they reach a limit where the models cannot improve learning with more data. DL technologies excel with vast amounts of data because DL models can be extended by adding layers so that they can represent and interpret more information from more data. Data can be thought of as the fuel for DL, and Apache Hadoop data lakes can be an excellent source of that fuel.

Figure 1-2 Performance relative to DL and ML

1.1.2 What is a data lake

There are numerous views about what constitutes a data lake, many of which are overly simplistic. At its core, a data lake is a central location in which to store all your data regardless of its source or format. It is typically built by using Hadoop or another scale-out architecture that can cost-effectively store significant volumes of data.

One of the most common misunderstandings is confusing the concepts of a data lake and a data warehouse. Table 1-1 provides a comparison of the key attributes of a data warehouse in contrast to a data lake.

Table 1-1 Concepts that define a data lake

Schema
  Data warehouse: Schema-on-write.
  Data lake: Schema-on-read.

Scale
  Data warehouse: Scales to moderate to large volumes at moderate cost.
  Data lake: Scales to huge volumes at low cost.

Access methods
  Data warehouse: Accessed through standardized SQL and Business Intelligence (BI) tools.
  Data lake: Accessed through SQL-like systems and programs that are created by developers. Also supports big data analytics tools.

Workload
  Data warehouse: Supports batch processing and thousands of concurrent users performing interactive analytics.
  Data lake: Supports batch and stream processing, and has an improved capability over data warehouses to support big data inquiries from users.

Data
  Data warehouse: Cleansed.
  Data lake: Raw and refined.

Data complexity
  Data warehouse: Complex integrations.
  Data lake: Complex processing.

Cost/Efficiency
  Data warehouse: Efficiently uses CPU/IO, but has high storage and processing costs.
  Data lake: Efficiently uses storage and processing capabilities at low cost.
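To make the schema-on-read attribute in Table 1-1 concrete, here is a hedged PySpark sketch (Spark is part of HDP); the HDFS path and the event_type field are placeholder assumptions.

# A hedged schema-on-read sketch: the raw JSON was stored with no declared
# schema, and Spark infers one only at read time.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("schema-on-read").getOrCreate()

df = spark.read.json("hdfs:///data/raw/clickstream/*.json")
df.printSchema()                          # schema inferred when read
df.groupBy("event_type").count().show()   # ad hoc analytics on raw data

The same files could later be read with a different, stricter schema without rewriting them, which is the flexibility that a schema-on-write warehouse does not offer.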

Because all data can be stored in them, data lakes are a powerful alternative to the challenges that are presented by data integration in a traditional data warehouse, especially as organizations turn to mobile and cloud-based applications and the Internet of Things (IoT). Companies also want to know how to train, classify at scale, and set up DL pipelines while using the existing Hadoop data lake and its data and processing power.

1.2 Artificial intelligence solutions

This section describes the AI solutions for IBM Power Systems servers in more detail. It provides a summary of the functional differences of the solutions.

1.2.1 IBM PowerAI

In this section, we introduce IBM PowerAI (also referred to as IBM PowerAI base) and IBM Watson Machine Learning Accelerator.

IBM PowerAI base

IBM PowerAI base is a downloadable enterprise software distribution that includes open source ML frameworks like scikit-learn and DL frameworks like TensorFlow, PyTorch, and Caffe. It is available at no charge, and it is hardened and supported for Power Systems servers. When integrated with IBM Watson Studio Local, IBM PowerAI is one of the available notebook environments. IBM PowerAI is also included in IBM Watson Machine Learning Accelerator for large-scale deployments.

IBM PowerAI includes the following features to boost performance and reduce training time:
- A distributed deep learning (DDL) library that scales out DL to hundreds of Power Systems servers and leverages the advantages of IBM POWER9™ processor-based servers and NVIDIA GPUs. DDL is limited to four nodes on IBM PowerAI, so use IBM Watson Machine Learning Accelerator for larger configurations.
- A Snap ML library that brings distributed GPU acceleration to ML frameworks like scikit-learn.

- Large model support (LMS), which enables models to be larger (that is, include higher resolution images) by integrating with DL frameworks. LMS uses the fast NVLink connections on the POWER9 processors to swap portions of the model to and from main memory at various points during training. TensorFlow LMS performance is improved in IBM PowerAI V1.5.4, and LMS for PyTorch is provided as a technology preview.
- Frameworks that are built to leverage the advantages of IBM POWER9 processor-based servers and NVIDIA GPUs.

Figure 1-3 provides a stack view of the components that make up IBM PowerAI.

Figure 1-3 IBM PowerAI distribution

1.2.2 IBM Watson Machine Learning Accelerator

IBM Watson Machine Learning Accelerator is an enterprise-ready AI solution that includes IBM PowerAI and IBM Spectrum Conductor™. It provides an end-to-end DL platform for groups of data scientists. It can create and manage Spark clusters that are called Spark Instance Groups (SIGs), which share the underlying resources of the IBM Watson Machine Learning Accelerator servers, including memory, CPUs, and GPUs. The resource sharing is controlled by the scheduling policy, which can be adjusted dynamically to accelerate job completion.

Note: In this publication, IBM Watson Machine Learning Accelerator is the new product name for IBM PowerAI Enterprise.

IBM Watson Machine Learning Accelerator includes DDL and LMS for faster time to results. The model development tools include real-time training visualization, accuracy monitoring at run time, and hyper-parameter search and optimization. The DDL can scale to thousands of nodes.

IBM Watson Machine Learning Accelerator combines popular open source DL frameworks, efficient AI development tools, and accelerated Power Systems servers. Now, organizations can deploy a fully optimized and supported AI platform that delivers blazing performance, dependability, and resilience. IBM Watson Machine Learning Accelerator is a complete environment for data science as a service (DSaaS), enabling your organization to bring new applied AI applications into production.

The IBM Watson Machine Learning Accelerator software bundle includes Anaconda sourced libraries, IBM PowerAI base, and IBM Spectrum Conductor Deep Learning Impact.

Figure 1-4 provides the major components of IBM Watson Machine Learning Accelerator.

Figure 1-4 IBM Watson Machine Learning Accelerator parts

Anaconda is an open source distribution of Python and R programming languages for data science and ML that simplifies the management and deployment of packages. IBM Spectrum Conductor provides multitenancy by representing lines of business (LOBs) as consumers, each with their own set of SIGs that can share resource pools and be controlled by dynamic policies. IBM Spectrum Conductor Deep Learning Impact includes auto hyper-parameter tuning that can create tens of training jobs with different hyper-parameters, and then monitor the jobs and remove ones that appear less promising.

Figure 1-5 shows the supported frameworks and libraries that make up the IBM Watson Machine Learning Accelerator platform.

Figure 1-5 IBM Watson Machine Learning Accelerator supported frameworks and libraries

1.2.3 IBM Watson Studio Local

IBM Watson Studio Local (formerly called IBM Data Science Experience (IBM DSX) Local) is an enterprise solution for data scientists and data engineers. It includes IBM PowerAI and a suite of data science tools like RStudio, Spark, Jupyter Notebooks, and Zeppelin Notebooks.

Work is organized in projects that consist of notebooks, models, data sources, and remote data assets. Users associate tools like Jupyter Notebooks, with R, Python, and Scala kernels, with Spark runtime environments. The environments can be customized and saved. Data assets include files and connections to existing data sources like Hadoop Distributed File Systems (HDFSs) and databases. Users can transform and shape the data by using the Data Refinery component.

IBM Watson Studio Local is the on-premises solution version of the IBM Watson Studio product. It is a platform that provides tools like RStudio, Spark, Jupyter Notebooks, and Zeppelin Notebooks for data scientists, data engineers, application developers, and subject matter experts (SMEs) to work collaboratively and easily with data to build and train models at scale. It gives the flexibility to build models where your data is and deploy them anywhere in a hybrid environment.

Chapter 1. Solution overview 7 Figure 1-6 shows the major components of IBM Watson Studio Local.

Figure 1-6 IBM Watson Studio Local components

There are several roles that are supported for project collaborators, including project administrators, editors, and viewers. A community section with sample notebooks is provided to help accelerate developer time to value. An administration console is provided to manage and monitor hardware, users, and services.

IBM Watson Studio Local runs on a Kubernetes cluster of servers that take on roles of master (control plane), storage, and compute. The architecture requires a minimum of four nodes. The deployment role is optional when you do the installation, but is required if models are deployed. IBM Watson Studio Local can manage its users for authentication or integrate with an existing Lightweight Directory Access Protocol (LDAP) enterprise directory server.

Figure 1-7 is a graphical representation of the roles of the various nodes in IBM Watson Studio Local.

Figure 1-7 IBM Watson Studio Local node roles

1.2.4 H2O Driverless AI

H2O Driverless AI is H2O.ai's solution for automated machine learning (ML). It simplifies many of the data science ML tasks. Data can be ingested from the cloud, Hadoop, or desktop systems. The data visualization is automatic, showing the data shape, outliers, and missing values. The automated ML uses best practice model recipes and the underlying machine power to iterate across thousands of possible models, which includes feature engineering and parameter tuning. The automated scoring pipeline includes the feature transformations and models for rapid deployment to production.

H2O Driverless AI is supported on IBM Power Systems servers, leverages GPUs for accelerated ML, and can be installed and managed as an IBM Watson Machine Learning Accelerator application.

Figure 1-8 shows a sample of the H2O Driverless AI console.

Figure 1-8 H2O Driverless AI console

1.2.5 IBM PowerAI Vision

IBM PowerAI Vision eases the task of labeling objects in images and video through a “point and click” interface that enables non-SMEs to contribute to the data science project.

The various stages of ML are shown in Figure 1-9.

Figure 1-9 IBM PowerAI Vision

A model can be trained with an initial set of labeled data and used to label new images. The new labeled images can be reviewed, corrected, and used to update the model to improve its accuracy. The process is repeated until the necessary model accuracy is reached. Using IBM PowerAI Vision is beyond the scope of this book.

1.2.6 Key differences among the popular AI solutions

Figure 1-10 shows the key differences among the solutions. Both IBM PowerAI and IBM Watson Machine Learning Accelerator support DDL, but IBM Watson Machine Learning Accelerator can scale to thousands of nodes. The IBM Watson Studio Local notebook-based environment is a good choice for data science teams that are comfortable working with Jupyter Notebooks and R. H2O Driverless AI and IBM PowerAI Vision enable people with less specialized data science skills to work on AI projects.

Figure 1-10 Key differences among the AI solutions for Power Systems servers

1.3 Data platforms

In this section, we introduce the data platforms on which this publication is based.

1.3.1 Apache Hadoop and Hortonworks

Apache Hadoop is one of the most popular data lake technologies. It is a highly scalable open source platform that can process very large data sets across hundreds to thousands of parallel computing nodes. It provides a cost-effective storage solution for data ingestion with no initial format requirements. As an open source platform, it is built on code contributions from a community of some of the world’s best developers.

Figure 1-11 shows the relationship between applications, the databases, and the sources of that data.

Figure 1-11 Applications interacting with data systems and sources

Hortonworks Data Platform

HDP is a Hadoop and Spark distribution that enables enterprises to securely store, process, and analyze large amounts of data-at-rest. It includes an HDFS that can scale to thousands of nodes.

The storage for HDFS can be mapped to local drives or to IBM Elastic Storage Server (IBM ESS) through the IBM Spectrum Scale™ HDFS Transparency connector. HDP includes Apache Spark, a distributed processing framework that includes MLlib, the Spark ML library.
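As a quick orientation, here is a hedged Linux terminal sketch of the standard HDFS client commands that an HDP cluster accepts; the directory and file names are placeholders.

$ hdfs dfs -mkdir -p /data/raw/clickstream        # create a landing directory
$ hdfs dfs -put clicks-2019-03-01.json /data/raw/clickstream/
$ hdfs dfs -ls /data/raw/clickstream              # verify the ingested file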

With the emergence of new business models and innovations, Hadoop is on the verge of reinventing itself as the next-generation data platform that powers technology advancement. With the release of a newer, more powerful version of Hadoop in 2018, the partnership of IBM and Hortonworks extended its capabilities, driving new levels of data science and ML, and further building Hadoop into an enterprise-class data platform with advanced analytic capabilities. The result is an integrated solution that combines HDP, HDF, and IBM technology.

This IBM and Hortonworks solution enables a data-lake-based Hadoop infrastructure with improved exploration, discovery, testing, and advanced data query. It also offers massive scalability, security, and governance with the ability to federate both data-at-rest and data-in-motion across the entire organization. Users can easily query both relational databases and Hadoop on-premises or in the cloud. They benefit from self-service data access and the ability to do ad hoc and real-time queries.

HDP is the data lake of choice for IBM Power Systems servers, and it integrates well with IBM Watson Studio Local.

Hortonworks DataFlow

HDF provides real-time streaming analytics for data-in-motion that complements HDP. It is an integrated solution that includes Apache NiFi and MiNiFi, Kafka, Storm, Streaming Analytics Manager, and Schema Registry to provide a real-time streaming data analytics platform. It is managed by an Ambari installation and coordinated by ZooKeeper.
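For a feel of the HDF tooling, here is a hedged Linux terminal sketch that creates and lists a Kafka topic with the Kafka 1.x style commands of that era; the install path, ZooKeeper host, and topic name are assumptions, and 5.6.2, “Configuring a Kafka topic” walks through a tested example.

$ /usr/hdf/current/kafka-broker/bin/kafka-topics.sh --create \
    --zookeeper zk-host:2181 --replication-factor 1 --partitions 1 \
    --topic sensor-events
$ /usr/hdf/current/kafka-broker/bin/kafka-topics.sh --list \
    --zookeeper zk-host:2181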

1.3.2 IBM Spectrum Scale

IBM Spectrum Scale (formerly IBM General Parallel File System (IBM GPFS™)) is a cluster file system that provides concurrent access to a single file system or set of file systems from multiple nodes. The nodes can be SAN-attached, network-attached, a mixture of SAN-attached and network-attached, or in a shared-nothing cluster configuration. These setups enable high-performance access to this common set of data to support a scale-out solution or to provide a high availability platform. IBM Spectrum Scale is the parallel file system at the heart of IBM ESS.

IBM Spectrum Scale enables the unification of virtualization, analytics, and file and object use cases into a single scale-out storage solution, and it provides a single namespace for all data, which offers a single point of management. With support for a wide range of data communication protocols, such as POSIX file systems, Network File System (NFS), Server Message Block (SMB), object storage like Swift and S3, block storage like iSCSI, and HDFS, customers can run their in-place analytics workloads without needing to duplicate data sets.

These various communication protocols offer several benefits, such as consolidating various data sources efficiently in one global namespace. IBM Spectrum Scale provides a unified data management solution, enables efficient space utilization, and avoids unnecessary data moves when access methods differ.

Using storage policies transparent to users, data can be compressed or tiered to help cut costs. Data can also be tiered to high-performance media, including server cache, based on a heat map of data to lower latency and improve performance.
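As a hedged sketch of such a storage policy, the following ILM rule migrates files that have not been accessed in 30 days from a flash pool to a capacity pool; the file system name (gpfs01) and pool names (system and nearline) are assumptions for illustration.

$ cat /tmp/tier-cold.pol
RULE 'tier_cold' MIGRATE FROM POOL 'system'
     THRESHOLD(80,60) TO POOL 'nearline'
     WHERE (CURRENT_TIMESTAMP - ACCESS_TIME) > INTERVAL '30' DAYS

$ mmapplypolicy gpfs01 -P /tmp/tier-cold.pol -I test   # dry run: report only
$ mmapplypolicy gpfs01 -P /tmp/tier-cold.pol -I yes    # perform the migration

The test run reports which files the rule would migrate before any data moves, which is a useful check before applying a new policy.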

Figure 1-12 shows how the Hortonworks software stack is built on top of the IBM Spectrum Scale file system.

Figure 1-12 Hortonworks software stack on top of the IBM Spectrum Scale file system

Employing HDP with IBM Spectrum Scale to optimize your data architecture can provide many cost benefits because the typical enterprise data warehouse dedicates 70% of its storage volume to unused data and 55% of its processing power to low-value extract, transform, and load (ETL) workloads.

Here are the top benefits of using IBM Spectrum Scale with HDP:
- Extreme scalability with a parallel file system architecture.
  IBM Spectrum Scale is a parallel architecture. With a parallel architecture, no single metadata node can become a bottleneck. Every node in the cluster can serve both data and metadata, enabling a single IBM Spectrum Scale file system to store billions of files. This architecture enables clients to grow their HDP environments seamlessly as the data grows. Additionally, one of the key value propositions of IBM Spectrum Scale, especially with IBM ESS, is running diverse and demanding workloads, plus the ability to create an active archive tier.
- A global namespace that can span multiple Hadoop clusters and geographical areas.
  Using the IBM Spectrum Scale global namespace, clients can create active and remote data copies and enable real-time, global collaboration. This namespace enables global organizations to form data lakes across the globe and host their distributed data under one namespace. IBM Spectrum Scale also enables multiple Hadoop clusters to access a single file system while still providing all the required data isolation semantics. The IBM Spectrum Scale Transparent Cloud Tiering feature can archive data into an S3/Swift-compatible cloud object storage system, such as IBM Cloud™ Object Storage or S3, by using the powerful IBM Spectrum Scale information lifecycle management (ILM) policies.
- A reduced data center footprint with the industry's best in-place analytics.
  IBM Spectrum Scale has the most comprehensive support for data access protocols. It supports data access by using NFS, SMB, Object, POSIX, and the HDFS API. This feature eliminates the need to maintain separate copies of the same data for traditional applications and for analytics.
- True software-defined storage (SDS) that is deployed as software or as a pre-integrated system.
  You can deploy IBM Spectrum Scale as software directly on commodity storage-rich servers running the HDP stack, or deploy it as part of a pre-integrated system by using the IBM ESS. Clients can use software-only options to start small while still using enterprise storage benefits. With IBM ESS, clients can control cluster sprawl and grow storage independently of the compute infrastructure. IBM ESS uses erasure coding to eliminate the need for the three-way replication for data protection that is required with other solutions.

Scalability with a parallel file system architecture

IBM Spectrum Scale delivers scalable capacity and performance to handle demanding data analytics, content repositories, and technical computing workloads.

Storage administrators can combine flash, disk, cloud, and tape storage into a unified system that is higher performing and lower cost than traditional approaches. With thousands of customers and more than 15 years of demanding production deployments, IBM Spectrum Scale is a file system that can adapt to both application performance and capacity needs across the enterprise. By including IBM Spectrum Scale in the software-defined infrastructure (SDI), organizations can streamline data workflows, help improve service, reduce costs, manage risk, and deliver business results while positioning the enterprise for future growth.

Figure 1-13 shows some connections that are provided by IBM Spectrum Scale.

Figure 1-13 IBM Spectrum Scale

1.3.3 IBM Elastic Storage Server

IBM ESS is a modern implementation of SDS that combines IBM Spectrum Scale running on POWER8® processor-based I/O-intensive servers that are connected to many high-density, high-performance, and highly available dual-ported storage enclosures. This implementation eliminates data silos, simplifies storage management, and delivers high performance.

A key advantage of IBM ESS is its ability to lower capacity requirements. IBM ESS requires only 30% extra capacity to offer data protection benefits that are similar to HDFS triple replication. IBM Power Systems servers along with IBM ESS offer the most optimized hardware stack for running analytics workloads. Clients can see up to a three-times reduction in storage and compute infrastructure by moving to IBM ESS and IBM Power Systems servers compared to commodity scale-out x86 systems.

By consolidating storage requirements across your organization by using IBM ESS, you can reduce inefficiency, lower acquisition costs, and support demanding workloads.

IBM ESS with HDP allows data movement between different systems for data analysis. This movement is faster and more lightweight because the unified namespace that is offered by IBM Spectrum Scale eliminates the need to copy data from a POSIX-based file system to HDFS.

IBM ESS hardware and software solution benefits

In an enterprise big data solution environment, Hadoop generates and consumes data from the data lake. Moving data to and from the HDFS (by running distcp) is arduous, time-consuming, and consumes unnecessary disk space.

IBM ESS includes a software RAID function that eliminates the need for the classic three-way replication for data availability, protection, and fast access to the data by the Hadoop HDFS. Instead, IBM ESS requires just an extra 30% of capacity to offer similar or better data protection.

The IBM ESS solution supports a massive volume of data, which is always a requirement for a real-world data lake. A single IBM ESS can deliver up to 40 GBps of throughput, and multiple IBM ESSs can scale out to exabytes of storage to support a massive expansion of the business. This growing data can be managed and tiered across practically any kind of storage: disk technologies, cloud, and tape.

The core of the IBM ESS is IBM Spectrum Scale, which manages all disks, data, and file system-related tasks on the IBM ESS solution in a fully POSIX-supported mode. The file systems can be accessed by any UNIX-based or Linux-based operating system. IBM Spectrum Scale also supports the SMB protocol, which enables a wide range of servers to access the data inside the namespace (file system) without needing a specific client program.

The hardware part of the IBM ESS solution is composed of IBM Storage enclosures with high-density solid-state drives (SSDs) or hard disk drives (HDDs), redundant communication adapters, and redundant electrical connections. Data management, transport, and distribution to the initiator servers are done by POWER8 processor-based servers. The result is a highly available, high-performance solution.

Figure 1-14 shows the IBM ESS hardware and software stack configurations, which are summarized here:

GS1S, GS2S, GS4S
- Storage units: 5147-024: 24-slot 2U enclosure for 2.5” SSDs
- Number of drive enclosures: 1, 2, or 4
- Drive capacity: 3.84 TB or 15 TB 2.5” SSDs
- Number of drives: 24 - 96 SSDs
- Maximum usable capacity per ESS: 60 TB (GS1S) to 1.1 PB (GS4S)

GL1S, GL2S, GL4S, GL6S
- Storage units: 5147-084: 84-slot 5U enclosure for 3.5” HDDs
- Number of drive enclosures: 1, 2, 4, or 6
- Drive capacity: 4, 8, or 10 TB NL-SAS 3.5” HDDs
- Number of drives: 82 - 502 HDDs
- Maximum usable capacity per ESS: 600 TB (GL1S) to 3.9 PB (GL6S)

GH12, GH14, GH24
- Storage units: 5147-024: 24-slot 2U enclosures for 2.5” SSDs, and 5147-084: 84-slot 5U enclosures for 3.5” HDDs
- Number of drive enclosures: 1 or 2 24-drive enclosures for SSDs; 2 or 4 84-slot drive enclosures for HDDs
- Drive capacity: 3.84 TB or 15 TB 2.5” SSDs; 4, 8, or 10 TB NL-SAS 3.5” HDDs
- Number of drives: 24 - 48 SSDs and 168 or 336 HDDs
- Maximum usable capacity per ESS: 60 TB SSD + 1 PB HDD (GH14) to 530 TB SSD + 2.5 PB HDD (GH24)

GL1C, GL2C, GL4C, GL6C
- Storage units: 5147-106: 106-slot 4U drive enclosure for 3.5” HDDs
- Number of drive enclosures: 1, 2, 4, or 6
- Drive capacity: 10 TB NL-SAS HDDs
- Number of drives: 104 - 634 HDDs
- Maximum usable capacity per ESS: 0.7 PB (GL1C) to 4.6 PB (GL6C)

Capacities shown are approximate by using 8+2P erasure coding and a single Data + Metadata pool. Example configurations are shown for the minimum and maximum capacity figures; other intermediate options are also available.

System information
- Systems: IBM Power 5148-22L ESS Data Server (two per Building Block) and IBM Power 5148-21L ESS Management Server (one per cluster)
- Interconnects: Ethernet: 10GbE, 40GbE, and 100GbE; InfiniBand: 56 Gbps FDR and 100 Gbps EDR
- Operating system: Red Hat Enterprise Linux (RHEL), Little Endian
- Software: IBM Spectrum Scale (Data Access Edition or Data Management Edition) and xCAT
- Supported RAID levels: Erasure coding: 8+2P or 8+3P; 3-way or 4-way mirroring; or a combination

Figure 1-14 IBM ESS hardware and software stack configuration

1.3.4 IBM Spectrum Scale HDFS Transparency Connector

IBM Spectrum Scale HDFS Transparency Connector enables applications to use a standard HDFS client to access IBM Spectrum Scale through native remote procedure call (RPC) requests. All data transmission and metadata operations in HDFS use the RPC mechanism and are processed by the NameNode and DataNode services within HDFS. IBM Spectrum Scale HDFS Transparency Connector integrates both the NameNode and the DataNode services, and responds to the requests from HDFS clients. HDFS clients can continue to access IBM Spectrum Scale seamlessly, just as they do with HDFS.
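For example, a Spark job keeps using an ordinary hdfs:// URI; with the connector in place, the same call is served by IBM Spectrum Scale. The following PySpark lines are a minimal sketch, and the NameNode host and path are placeholders:

   from pyspark.sql import SparkSession

   spark = SparkSession.builder.appName("transparency-demo").getOrCreate()

   # The standard HDFS client path is unchanged; the connector answers the RPCs.
   df = spark.read.csv("hdfs://<namenode-host>:8020/user/demo/cars.csv", header=True)
   df.show(5)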

Figure 1-15 shows the overall dataflow between IBM Spectrum Scale, storage, and the compute nodes.

Figure 1-15 Overall dataflow between IBM Spectrum Scale, storage, and the compute nodes

Here are the key advantages of IBM Spectrum Scale HDFS Transparency Connector:
 Reduces the data center footprint. IBM Spectrum Scale provides in-place analytics for your Hadoop data, which eliminates the need for several copies of the same data or large migrations of data between HDFS and the POSIX file system.
 Controls cluster sprawl. IBM Spectrum Scale delivers federation across both IBM Spectrum Scale and HDFS clusters, which turns isolated data lakes into a data ocean while still maintaining needed separations.
 Makes HDFS access enterprise-ready. IBM ESS (a preintegrated IBM Spectrum Scale based system) adds storage functions that are necessary to the enterprise (for example, encryption, disaster recovery (DR), and software RAID) to your Hadoop setup.
 An IBM Spectrum Scale Client is not needed on every Hadoop node. The HDFS client can access data on IBM Spectrum Scale as it does with HDFS storage.
 Improved security management by Kerberos authentication and encryption for RPCs.
 Support for more Hadoop components and HDFS-compliant APIs and commands (for example, distcp and webhdfs).
 IBM Spectrum Scale provides an HDFS connector and allows existing Hadoop applications to run directly on IBM Spectrum Scale.

IBM Spectrum Scale and HDFS Transparency differences

Here are other key HDFS Transparency and IBM Spectrum Scale differences to note:
 If a file is set with an access control list (ACL) (POSIX or NFSv4 ACLs), IBM Spectrum Scale HDFS Transparency Connector does not provide an interface to disable the ACL check at the connector layer. If you want to disable the ACL for a file, the only way is to remove the ACL.
 IBM Spectrum Scale provides its own caching mechanism and does not support HDFS caching. Caching that is done by IBM Spectrum Scale is more optimized and controlled, especially when you run multiple workloads. The hdfs cacheadmin interface is not supported by IBM Spectrum Scale HDFS Transparency Connector.
 The NFS Gateway from native HDFS is not needed by IBM Spectrum Scale HDFS Transparency Connector. If your applications require an NFS interface, use the IBM Spectrum Scale protocol Cluster Export Services (CES) and NFS for better performance, scaling, and automatic NFS server failover.
 IBM Spectrum Scale provides multiple protocol interfaces, including POSIX, NFS, and SMB. Customers can use IBM Spectrum Scale Protocol for NFS to access the data.
 The distcp -diff option is not supported for snapshots over IBM Spectrum Scale HDFS Transparency. Other distcp options are supported.
 The hdfs dfs interface is supported, but others (such as hdfs fsck) are not needed for IBM Spectrum Scale HDFS Transparency Connector.

1.4 When to use a GPU

Traditionally, most computing tasks were performed on the CPU. However, there is a certain class of tasks that benefits most from using GPUs. CPUs are built for general-purpose work and handle versatile computing tasks; in particular, single-threaded workloads run well on a CPU. However, some computing tasks are parallel in nature. A task such as a large matrix multiplication can be split into many parallel jobs that act on different data elements, which is more efficient on a GPU. GPUs typically contain many cores with lower frequency, whereas CPUs have fewer cores (typically fewer than 80 processing threads per socket) with higher frequency. GPUs are also efficient at handling floating point operations.

DL problems are massively parallel operations that perform many matrix operations and operate on a massive amount of data. The efficiency of a deep neural network is based on the availability of large data sets and the depth and width of the neural architecture. A typical training phase of a deep neural network consists of forward propagation, gradient computation, and backward propagation, which together constitute one iteration of training. You must perform several thousands of these iterations to create an efficient DL model. Running such massively parallel tasks for many iterations on the CPU takes several days. Data scientists and DL researchers must perform several experiments to arrive at a good neural model or neural architecture. The DL tasks are floating-point rich and massively parallel, which makes them a good candidate for the GPU.

However, using a GPU does not mean that the CPU is not used in these computations. The GPU always works with the CPU. The CPU works on the parts of the computation that are single-threaded or serial in nature, and offloads the bulk of the parallel tasks to the GPU. On completion of these parallel jobs, the GPU returns the control flow back to the CPU. Choosing a GPU or CPU for a computation task depends on the nature of the workload that is run.

A typical DL training task is broadly divided into three phases:
 Data loading
 Data transformation
 Training

Generally, the data loading and data transformation tasks are completed on the CPU because most of the libraries that handle these tasks are single-threaded and optimized for the CPU. The training is carried out on the GPU, where the program spends more than 90% of the total application run time. However, recently there have been several advancements in the libraries for data loading and transformation that make those tasks suitable for the GPU environment. When we reach this point, almost 100% of the application runs on the GPU and benefits from the parallelism that it offers.
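As a concrete illustration of this split, here is a minimal PyTorch sketch. The data set is synthetic and the model is deliberately trivial; in a real workload, the DataLoader worker processes prepare and transform batches on the CPU while the model parameters and the forward and backward passes live on the GPU:

   import torch
   from torch.utils.data import DataLoader, TensorDataset

   # Synthetic stand-in for a real data set (features plus class labels).
   ds = TensorDataset(torch.randn(10000, 1024), torch.randint(0, 10, (10000,)))
   loader = DataLoader(ds, batch_size=256, num_workers=4)  # loading runs on CPU workers

   device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
   model = torch.nn.Linear(1024, 10).to(device)            # training runs on the GPU
   opt = torch.optim.SGD(model.parameters(), lr=0.01)
   loss_fn = torch.nn.CrossEntropyLoss()

   for x, y in loader:
       x, y = x.to(device), y.to(device)  # the CPU hands each batch to the GPU
       opt.zero_grad()
       loss = loss_fn(model(x), y)        # forward propagation
       loss.backward()                    # gradient computation and backward propagation
       opt.step()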

Most of the servers that are available in the accelerator market have one or more GPUs. These GPUs are either attached to the server through a PCIe interface or an NVLink interface, or are on an external drawer that is connected to the CPU. The nature of the workload can help determine the number of GPUs that you need for the computation. DL problems are data parallel in nature: different segments of data can be operated on in parallel across different GPUs. Considering that DL problems need enormously large amounts of data, it is imperative that you employ multiple GPUs to work on the problem in parallel, thus expediting the training task.

IBM Power Systems servers support the most powerful GPUs on the market. The NVIDIA Tesla V100 Tensor Core GPU is the most advanced data center GPU ever built to accelerate AI, high-performance computing (HPC), and graphics. It is powered by the NVIDIA Volta architecture, comes in 16 GB and 32 GB memory configurations, and offers the performance of up to 100 CPUs in a single GPU. Data scientists, researchers, and engineers can now spend less time optimizing memory usage and more time designing the next AI breakthrough. With 640 Tensor Cores, the Tesla V100 is the world’s first GPU to break the 100 teraflops (TFLOPS) barrier of DL performance. The next generation of NVIDIA NVLink connects multiple V100 GPUs at up to 300 GBps to create the world’s most powerful computing servers. AI models that consumed weeks of computing resources on previous systems can now be trained in a few days. With this dramatic reduction in training time, a whole new world of problems is now solvable with AI.

GPUs are fast compared to CPUs, so it is essential to make sure that there are no bottlenecks in the path where the computations are sent to the GPU. If there are bottlenecks, the GPUs starve or are underused. DL training takes a batch of data from the local memory of the server or from a data lake that is attached to the server and loads the data into the main memory of the server CPU. From there, it pushes the data to the GPU through the available communication medium. An AI solution architecture should have an efficient data lake, superior CPU memory, and a high-bandwidth channel from the CPU to the GPU to avoid any bottlenecks so that the GPU computation can occur efficiently.

The use of NVIDIA NVLink in IBM Power Systems servers helps to achieve a bandwidth of around 300 GBps between the CPU and all GPUs, unlike PCIe-attached GPUs, which can reach a bandwidth of only 64 GBps. This is one of the major advantages of performing AI tasks on Power Systems servers. For example, handling larger DL models that do not fit inside the GPU memory is managed across both the CPU and GPU memory. In such scenarios, the artifacts of the training must be constantly moved between the CPU and the GPU memory at high speed. These are the AI applications of the future because providing more depth and width to a neural model effectively produces better accuracy. However, a model with more depth and width occupies more memory. The higher bandwidth of NVLink helps greatly in this scenario.

Figure 1-16 provides the major specifications of the Tesla V100.

Tesla V100 SXM2 specifications:
  GPU architecture: NVIDIA Volta
  NVIDIA Tensor Cores: 640
  NVIDIA CUDA Cores: 5120
  Double-precision performance: 7.8 TFLOPS
  Single-precision performance: 15.7 TFLOPS
  Tensor performance: 125 TFLOPS
  GPU memory: 32 GB / 16 GB HBM2
  Memory bandwidth: 900 GBps
  ECC: Yes
  Interconnect bandwidth: 300 GBps
  System interface: NVIDIA NVLink
  Form factor: SXM2
  Maximum power consumption: 300 W
  Thermal solution: Passive
  Compute APIs: CUDA, DirectCompute, OpenCL, OpenACC

Figure 1-16 NVIDIA Volta GPU specifications

1.5 Hortonworks Data Platform GPU support

HDP 3.0 with Hadoop 3.1 provides new capabilities for enterprises, which include faster application development through containerization, erasure coding, NameNode federation, and better support for DL applications by managing GPU resources as a first-class asset. In this section, the focus is the introduction of GPU support as a native asset and its benefits.

GPUs have many cores and can handle massive but simple tasks simultaneously. A CPU has fewer cores, but they are more powerful and can complete complex tasks. For use cases such as matching images, real-time vision processing, and aerial navigation, GPUs are the best option. DL work runs much faster on GPUs because of the higher core density.

The main benefits of using GPUs within the HDP cluster nodes are:
 No need to set up separate clusters.
 GPU discovery and configuration: The admin can configure GPU resources and architectures on each node.
 GPU isolation and monitoring: After you start a task with GPU resources, NodeManager should isolate and monitor the task's resource usage.
 GPU scheduling: The YARN scheduler should count a GPU as a resource type just like CPU and memory.

1.5.1 Native GPU support

HDP 3.1 GPU support is available on YARN as a native asset so that administrators and operators may schedule and operate with GPU resources. Before there was native GPU support, users used node-labels to partition clusters to use GPUs, which put machines that were equipped with GPUs in a different partition. The users had to submit jobs that required GPUs to the partition.

Many applications need specific and expensive hardware, such as GPUs, on which to run. Before native GPU support, an application that had to run on specific nodes in the cluster could do so in YARN only through node labeling, which has its own specific requirements. Node labeling works for either a decentralized or a centralized per-node configuration, but not for both together. The only endpoints for managing centralized node labels are a command-line interface (CLI) or REST, and changing labels was complex and difficult. Without native and more comprehensive GPU support, there is no isolation of GPU resources: multiple tasks can compete for a GPU resource simultaneously, which can cause task failures and GPU memory exhaustion. That is why native GPU support was so important.

1.5.2 GPU discovery

To discover how many GPUs you have on your system, run nvidia-smi. Figure 1-17 shows the output of the command.

[root@p460a01 ~]# nvidia-smi
Fri Dec 14 19:34:46 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.72       Driver Version: 410.72       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000004:04:00.0 Off |                    0 |
| N/A   32C    P0    42W / 300W |     10MiB / 31744MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000004:05:00.0 Off |                    0 |
| N/A   36C    P0    42W / 300W |     10MiB / 31744MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000035:03:00.0 Off |                    0 |
| N/A   31C    P0    40W / 300W |     10MiB / 31744MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000035:04:00.0 Off |                    0 |
| N/A   37C    P0    42W / 300W |     10MiB / 31744MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Figure 1-17 The nvidia-smi command output

You might want to discover your GPUs automatically. This task is possible with HDP 3.0 and Hadoop 3.1.
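To enable automatic discovery, YARN's GPU resource plug-in can be pointed at the GPU devices. The following yarn-site.xml fragment is a minimal sketch based on the Apache Hadoop 3.1 GPU documentation; verify the property names and values against the documentation for your HDP level:

   <property>
     <name>yarn.nodemanager.resource-plugins</name>
     <value>yarn.io/gpu</value>
   </property>
   <property>
     <!-- "auto" lets the NodeManager discover GPUs by running nvidia-smi. -->
     <name>yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices</name>
     <value>auto</value>
   </property>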

1.5.3 GPU isolation and monitoring

When multiple applications use a single GPU’s resources on the same machine, you might need to use GPU isolation. If multiple processes use a single GPU, a process must wait until the current one finishes before it can start, which can cause delays. Another issue might be frameworks like TensorFlow that aggressively try to use the GPU’s resources. If you assign two TensorFlow applications to a single GPU, they cannot both run because the GPU runs out of memory.
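At the application level, that aggressive allocation can be tamed. With the TensorFlow 1.x API (the level that ships with the software stack in this book), a process can be told not to grab all of the GPU memory at start-up. This is a hedged sketch; the memory fraction is an illustrative value to tune per workload:

   import tensorflow as tf

   # Allocate GPU memory on demand instead of claiming it all at session start.
   config = tf.ConfigProto()
   config.gpu_options.allow_growth = True
   # Optionally cap this process at roughly half of the GPU memory.
   config.gpu_options.per_process_gpu_memory_fraction = 0.5
   sess = tf.Session(config=config)

Application-level settings like these only mitigate contention inside one process, which is why cluster-level isolation is still needed.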

This is why you need GPU isolation. The application performance should not be affected by a lack of GPU resources. The granularity for GPUs is per-GPU device, which is achieved by using control groups (cgroups) or Docker to enforce isolation.

1.5.4 GPU scheduling

With GPU scheduling as a native YARN resource, it is much easier for administrators to allocate the needed resources to applications that request them. GPU scheduling ensures that specific applications always have sufficient GPU resources and that applications may be denied if there are resource shortages. It is common that a model training process exceeds the maximum available GPU memory (OutOfMemory errors). In some specific use cases, the required amount of memory can be large, which can be a serious problem if you do not have your training session isolated from the rest of your infrastructure.

Consider an example without a resource manager. If you have a model running in production mode and start a poorly configured training session that requests all the available GPU memory, at least one of the two applications will fail because of the memory shortage. YARN is useful in an ML environment because its scheduler and resource management capabilities are essential when applied to GPUs.
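With native support, a container simply declares how many GPUs it needs when the job is submitted. The following submission of the Hadoop distributed shell example application is a sketch adapted from the Apache Hadoop GPU documentation; the JAR path is a placeholder for your HDP installation:

   yarn jar <path-to>/hadoop-yarn-applications-distributedshell.jar \
     -jar <path-to>/hadoop-yarn-applications-distributedshell.jar \
     -shell_command nvidia-smi \
     -container_resources memory-mb=3072,vcores=1,yarn.io/gpu=1 \
     -num_containers 1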

1.5.5 Hortonworks Data Platform 3 and YARN container with GPU support

YARN can support many types of use cases. Features like GPU pooling and isolation extend DL frameworks like Caffe, PyTorch, and TensorFlow so that GPUs can be shared as a resource among data scientists. The YARN native support for managing GPU isolation enables running complex DL data analytics in the same Hadoop cluster where ETL, batch jobs, TensorFlow ML loads, and other diversified loads run in a single cluster that is powered by YARN. You do not have to set up separate partitions, machines, or clusters for GPU workloads. Native GPU support in HDP 3 facilitates the usage of Docker or cgroup containers where one or more GPU cores are assigned and isolated for those containers. NVIDIA plug-in support is integrated into YARN to support containers so that GPU cores can be used exclusively by those containers.

1.6 Linux on Power

IBM Power Systems servers offer the best infrastructure solution for deploying IBM Watson Studio Local. The IBM Power System LC922 and IBM Power System LC921 servers provide the ideal base for the control, storage, and compute (non-GPU) components of IBM Watson Studio Local, and the IBM Power System AC922 server offers an optimized GPU-attached environment for the compute components of IBM Watson Studio Local. The Power LC922 server's superior Spark performance and storage capacities paired with the Power AC922 server’s leadership performance for GPU-accelerated algorithms deliver the perfect mix of performance, scalability, and price-point for an IBM Watson Studio Local deployment.

Built specifically for enterprise AI, the POWER9 processor is the only processor with state-of-the-art I/O subsystem technology, including next-generation NVIDIA NVLink, PCIe Gen4, and IBM OpenCAPI™. These interfaces give the POWER9 processor a memory bandwidth superhighway that can run ML and DL applications with unmatched performance and scalability, handling even very large data sets with ease.

IBM estimates that Watson can process up to 60 million pages of plain text per second. Plain text is unstructured, much like about 80% of all existing information that current IT systems must process. The surprising part is that Watson can make sense of almost any free-form information source, whether it is someone speaking or a dictation of someone's handwritten notes. It can learn, and learn quickly. It is positioned to assist with patient care, revolutionize customer service, create recipes for chefs, and take on a number of other interesting and previously insurmountable challenges. Some of these challenges are critically important, and others are simply fun and challenging applications that might well become highly strategic in the future.

A Power Systems server's enhanced security features and reduced exposure to security vulnerabilities allow IT managers to avoid the costly consequences of security breaches, such as:
 Strengthening existing IT security and carrying out more training
 Contacting those whose records might have been exposed
 Legal action that is taken by people who might have suffered a financial loss
 Impacts on share price
 Costs to regain market position

Linux on Power is the same Linux as on x86 servers. It does not cost more to run Linux on Power. In fact, in most cases the total cost of acquisition and the total cost of ownership (TCO) favor running it on Power Systems servers.

1.7 Client use cases

The following sections describe several client use cases.

1.7.1 Asian job-hunting services company

Resources: HDP, IBM Spectrum Scale storage, IBM PowerAI, and Power Systems servers

An Asia-based job-hunting services company that offers services for personal staffing, recruitment, and new graduate support was looking for a new cloud-based solution to help support their daily business activities. Due to the necessity of storing and analyzing company information, recruiting activities, job postings, and hiring history, the company was adding more than a terabyte of data per day to their data volume, a rate of growth that was unsustainable with their current solution. In addition to upgrading their data volume capabilities, they also needed to use AI to manage more complex data, including handling audio, video, and still images.

For a job-hunting services company, speed and accuracy of matching data is critical. They decided to implement HDP, IBM Spectrum Scale storage, IBM PowerAI, and Power Systems servers to support their Hadoop and Spark workloads because of their fast performance, reliability, and scalability. The company deployed IBM Spectrum Scale storage with the File Placement Optimizer so that they could start small by adding the software on Hadoop nodes while still achieving faster data import and in-place analytics than if using HDFS. If needed, they also can add IBM ESS-based shared storage in the future within the same Hadoop cluster. With IBM ESS, they can expect to use up to 60% less storage.

The solution provides them with 70% faster processing than their previous solution while adding increased analytical ability with a smaller storage footprint.

1.7.2 Large bank

Resources: HDP, IBM ESS, Power Systems servers, and IBM Spectrum Scale

A large, award-winning bank with over 16,000 employees had the vision of providing a data-driven customer experience to their millions of customers. Their goal was to deliver near real-time data visualizations, new services, and one-to-one personalized offers to their customers. They also needed to develop an open platform to import all structured and unstructured data from a multitude of different sources and to store and extract analytical insights from the data.

Their current solution could not use data science to identify patterns within data sets and prototype new data models in a timely manner. To augment their current analytical capabilities, they chose HDP on Power Systems servers with IBM ESS. They ran a HiBench workload that drove IBM ESS disk utilization to the maximum, and it performed significantly better than the Intel-based solution. With Hortonworks running on Power Systems servers and the implementation of IBM ESS and IBM Spectrum Scale, they optimized the management of their data lake while using less than half the number of compute nodes at a lower cost.

1.7.3 South American IT services provider

Resources: HDP

A leading South American IT services provider that specializes in business processes, help desk, and IT services needed to improve their cloud infrastructure to offer database services to their largest hosting customer. They also had the vision of instituting a new private cloud for analytical and AI-based services for their clients.

They decided on HDP after seeing proofs of concept (PoCs) for analytics, AI, and open source database cloud services and the cost-effectiveness compared to other solutions. The Power platform fit their needs by providing an open source architecture with a 1.7x performance advantage over other solutions and high-performance storage capacity.

1.7.4 Governmental agency

Resources: Power Systems servers, HDP, IBM Streams, and IBM Cognos® Analytics

A Middle Eastern governmental organization with over a million employees needed a data-driven decision-making platform to perform advanced analytics and help effectively manage crises.

They decided on the Power platform because it was an end-to-end solution that can better capture the necessary data and analyze key trends to better predict crises and prepare a resolution.

1.7.5 European IT services company

Resources: HDP, POWER9 processor-based servers, IBM Watson Studio Local, and IBM ESS

A prominent European IT services company servicing over 120 million accounts needed a way to improve their customer experience by using data-driven decision-making. They decided on a data lake strategy, but needed a new infrastructure to quickly recognize data patterns and leverage data science to automatically create models. They required an open source, cloud-based platform that improved performance, reduced cost, and could handle all structured and unstructured data from different sources.

The company decided on HDP on POWER9 processor-based servers because of its cost reduction and scalability. They also added IBM Watson Studio Local to their HDP platform to take advantage of all of its data science tools and enable their developers to create AI-based solutions.


Chapter 2. Integration overview

This chapter describes how to integrate the various products to build a scalable solution.

The following topics are described in this chapter:
 Architecture overview
 System configurations
 Deployment options
 IBM Watson Machine Learning Accelerator and Hortonworks Data Platform
 IBM Watson Studio Local with Hortonworks Data Platform
 IBM Watson Studio Local with IBM Watson Machine Learning Accelerator
 IBM Spectrum Scale and Hadoop Integration
 Security

2.1 Architecture overview

This architecture overview describes the software components that are used to build the artificial intelligence (AI) environment as the platform, which includes the AI frameworks; the software that is used to build a multi-tenant data science environment; the distributed computing, training, and inference environment; and a tiered data management environment (Figure 2-1). This architecture addresses some of the bigger challenges facing developers and data scientists by cutting down the time that is required for AI system training and simplifying the development experience.

[Figure 2-1 is a stack diagram: data science applications (IBM Watson Studio Local and H2O.ai) and AI frameworks (TensorFlow, Keras, PyTorch, and Caffe, with IBM Deep Learning Impact, in IBM PowerAI Enterprise and IBM PowerAI Base) run on IBM Spectrum Conductor with Spark; the data cluster layer is Hortonworks HDF and HDP (Hadoop); the storage layer is IBM Spectrum Scale and IBM Elastic Storage Server (ESS); and the compute layer is NVIDIA GPUs on POWER9 and POWER8 processor-based Power Systems servers.]

Figure 2-1 How the components fit together

During the lab testing for this publication, the products and versions that were used are listed in Table 2-1.

Table 2-1   Software levels

  Product                                     Version
  IBM PowerAI                                 1.5.4
  IBM Watson Machine Learning Accelerator     1.1.2
  H2O Driverless AI                           1.4.2
  IBM Watson Studio Local                     1.2.1.0
  Hortonworks Data Platform (HDP)             2.6.5.0
  Hortonworks DataFlow (HDF)                  3.1.2.0
  IBM Elastic Storage Server (IBM ESS)        5.3.1.1
  IBM Spectrum Scale                          5.0.1.2
  IBM Power Systems servers                   POWER9 and POWER8 processors

2.1.1 Infrastructure stack

The infrastructure includes compute nodes, GPU accelerators, a storage solution, and networking.

Compute nodes

The resource of primary interest for AI is the GPUs, and they are on the IBM Power Systems server nodes:
 IBM Power System AC922 servers feature POWER9 CPUs and support 2 - 6 NVIDIA Tesla V100 GPUs with NVLink, providing CPU:GPU bandwidth of up to 300 GBps. This system supports up to 2 TB of total memory. For more information, see IBM Power System AC922.
 IBM Power System S822LC for High-Performance Computing (Power S822LC for HPC) servers feature POWER8 CPUs and support 2 - 4 NVIDIA Tesla P100 GPUs with NVLink, providing CPU:GPU bandwidth of 64 GBps. For more information, see IBM Power System S822LC for High Performance Computing.

Table 2-2 provides the compute configuration system details that are provisioned on the lab environment for this book.

Table 2-2   System details

  Power AC922 server: four NVIDIA V100 GPUs; RHEL 7.5; 40 POWER9 cores @ 3.8 GHz; 1 TB of memory; two 1.92 TB solid-state drives (SSDs).

  Two Power LC922 servers: no GPUs; RHEL 7.5; 40 POWER9 cores @ 3.8 GHz; 512 GB of memory; two 128 GB SATA DOM drives and four 8 TB hard disk drives (HDDs).

  Two Power S822LC for Big Data servers: no GPUs; RHEL 7.5; 20 POWER8 cores @ 3.5 GHz; 512 GB of memory; two 128 GB SATA DOM drives, four 1.92 TB SSDs, and two 1.6 TB NVMe SSDs.

  Power S822LC for HPC server: no GPUs; RHEL 7.5; 20 POWER8 cores @ 3.5 GHz; 1 TB of memory; two 3.8 TB SSDs and one 2.9 TB NVMe SSD.

GPU

There are up to four NVIDIA Tesla V100 GPUs integrated into an air-cooled Power AC922 server, and up to six V100 GPUs per water-cooled Power AC922 server. This GPU is the most advanced data center GPU ever built to accelerate AI, high-performance computing (HPC), and graphics. It is based on the NVIDIA Volta architecture, comes in 16 GB and 32 GB configurations, and offers the performance of up to 100 CPUs in a single GPU.

Storage

To support the variety and velocity of data that is used and produced by AI, the storage system must be intelligent, have enough capacity, and be highly performant. Key attributes include tiering and public cloud access, multiprotocol support, security, and extensible metadata to facilitate data classification. Performance is multi-dimensional for data acquisition, preparation, and manipulation; high-throughput model training on GPUs; and latency-sensitive inference. The type of storage that is used depends on the location of the data and the stage of processing for AI. Corporate and older data usually is in an organizational data lake in Hadoop Distributed File System (HDFS).

IBM ESS combines IBM Spectrum Scale software with IBM POWER8 processor-based I/O-intensive servers and dual-ported storage enclosures. IBM Spectrum Scale is the parallel file system at the heart of IBM ESS and scales system throughput as it grows while still providing a single namespace, which eliminates data silos, simplifies storage management, and delivers high performance.

For more information, see IBM Elastic Storage Server.

The storage configuration that is allocated for the integration that is used in this publication is shown in Table 2-3.

Table 2-3   IBM ESS GS2

  Specification                       Details
  Machine type and model number       5126-GS2
  Serial number                       218E54G
  Drive type                          Forty-eight 400 GB SSDs
  File system capacity                11.17 TB
  File system block size              16 MB

Networking

The exact architecture, vendors, and components that are needed to build the network subsystem depend upon the organization’s preference and skills. InfiniBand or high-speed Ethernet and a network topology that allows both north and south (server to storage) traffic and can also support east and west (server to server) traffic are required. Adopting a topology that extends to an InfiniBand Island structure enables the training environment to scale for large clusters. An adequate network subsystem with the necessary throughput and bandwidth to connect the different tiers of storage is required.

IBM ESS offers network adapter options. Three PCI slots are reserved for SAS adapters and one PCI slot is configured by default with a 4-port 10/100/1000 Ethernet adapter for management. Three other PCIe3 slots are available to configure with any combination of Dual-Port 10 GbE, Dual-Port 40 GbE, or Dual-Port InfiniBand PCI adapters.

The following networking interfaces are allocated for the integration that is deployed in this publication:
 10 Gb administration network
 56 Gb EDR InfiniBand network

2.2 System configurations

A small starter configuration for IBM Watson Studio Local, IBM Watson Machine Learning Accelerator, and HDP depends on the needs of the organization. For example, choosing whether to use a 3-node or 9-node IBM Watson Studio Local cluster depends on the number of users that need access to the system. If an organization expects 10 - 30 users to use the system concurrently, then a 3-node configuration is sufficient. However, if more than 30 concurrent users are expected, then a 9-node cluster should be put in place.

IBM Watson Machine Learning Accelerator is the cluster manager, and can set up service-level agreements (SLAs) between the instances and usage of resources. IBM Watson Machine Learning Accelerator provides many optimizations that accelerate performance; improve resource utilization; and reduce installation, configuration, and management complexities. HDP integration with IBM Watson Studio Local and IBM Watson Machine Learning Accelerator presents a unique mix of technologies that improve the work of a data scientist. This integration makes your work with big data more efficient, and it makes data science more accessible and scalable by bringing all enterprise data to a new level of accurate prediction.

2.2.1 IBM Watson Machine Learning Accelerator configuration

IBM Watson Machine Learning Accelerator is optimized for the Power AC922 and the Power S822LC for High-Performance Computing servers. IBM Watson Machine Learning Accelerator is optimized to use IBM Power Systems servers with NVLink between the Power CPUs and NVIDIA GPUs, which is not available on any other platform. IBM Watson Machine Learning Accelerator is supported on Power AC922 servers with NVIDIA Tesla V100 GPUs, and Power S822LC servers with NVIDIA Tesla P100 GPUs.

Powered by the revolutionary NVIDIA Volta architecture, the Tesla V100 GPU satisfies the most stringent demands of today’s next-generation AI, analytics, deep learning (DL), and HPC workloads. The Tesla V100 GPU accomplishes these tasks by delivering 112 teraflops (TFLOPS) of deep-learning performance to accelerate compute-intensive workloads. It delivers over 40% performance improvements for HPC and over 4X deep-learning performance improvements over the previous-generation NVIDIA Pascal architecture-based Tesla P100 GPU.

The base system configuration is as follows:
 Two IBM POWER9 or POWER8 CPUs.
 128 GB or more of memory.
 NVIDIA Tesla P100 or V100 GPUs with NVLink are required.
 An NVIDIA NVLink interface to the Tesla GPUs is preferred.

Here are the software requirements:
 Red Hat Enterprise Linux 7.5 for POWER9, with any subsequent updates.
 Third-party software from NVIDIA, such as CUDA, CUDA Deep Neural Network (cuDNN), and NCCL for CUDA.

Table 2-4 and Table 2-5 list the minimum hardware and software system requirements for running IBM Watson Machine Learning Accelerator in a production environment. You might have extra requirements (such as extra CPU and RAM) depending on the Spark Instance Groups (SIGs) that run on the hosts, especially for compute hosts that run on workloads.

Table 2-4   Hardware requirements

  RAM: 64 GB on management hosts and 32 GB on compute hosts. In general, the more memory that your hosts have, the better the performance is.

  Disk space that is required to extract the installation files from the IBM Watson Machine Learning Accelerator installation package: 16 GB (first management host only).

  Disk space that is required to install IBM Spectrum Conductor: 12 GB on management and compute hosts.

  Disk space that is required to install IBM Spectrum Conductor Deep Learning Impact: 11 GB on management and compute hosts.

  Extra disk space (for the Spark Instance Group (SIG) package, logs, and other items): might be 30 GB on management hosts for a larger cluster; 1 GB * N slots + the sum of the service package sizes (including dependencies) on compute hosts. The disk space requirements depend on the number of SIGs and the Spark applications that you run. Long-running applications like notebooks and streaming applications can generate huge amounts of data that is stored in Elasticsearch.

Table 2-5   Software requirements

  POWER8 processor: RHEL 7.5 (ppc64le), with the following GPU software: cuDNN 7.3.1 library, NVIDIA CUDA 10.0.130, NVIDIA GPU driver 410.72, Anaconda 5.2, and NVIDIA NCCL 2.3.5.

  POWER9 processor: RHEL 7.5 (ppc64le) with the security fix RHSA-2018:1374 - Security Advisory, with the following GPU software: cuDNN 7.3.1 library, NVIDIA CUDA 10.0, NVIDIA GPU driver 410.72, Anaconda 5.2, and NVIDIA NCCL 2.3.5.

To get the preferred performance from integrating Hadoop with IBM Watson Studio Local and IBM Watson Machine Learning Accelerator, start with choosing the correct hardware and software stack. The planning stage is vital in terms of determining the performance and the total cost of ownership (TCO) that is associated with it.

2.2.2 IBM Watson Studio Local configurations

This section describes what small starter configurations look like for IBM Watson Studio Local (formerly known as IBM Data Science Experience (IBM DSX)) deployed with an HDP datalake. It also describes advanced configurations.

IBM and Hortonworks worked together to integrate IBM Watson Studio Local with HDP.

The system requirements for IBM Watson Studio Local describe in detail the hardware and software requirements for 3-node, 7-node, and 9-node configurations. These configurations are recommended for production environments. For testing purposes, you can choose a smaller configuration.

In this section, the following topics are described:
 Requirements for a 3-node configuration
 Requirements for a 7-node configuration
 Requirements for a 9-node configuration

Requirements for a 3-node configuration

Here is the example 3-node configuration that we deployed:
 Number of servers used: Three virtual machines (VMs) (servers can be physical machines or VMs)
 Installed operating system: Red Hat Enterprise Linux 7.5 for POWER9

Table 2-6   Three-node configuration

  Node            CPUs     RAM      Storage
  Master node 1   Eight    126 GB   One 1.8 TB solid-state drive (SSD)
  Master node 2   Eight    126 GB   One 1.8 TB SSD
  Master node 3   Eight    126 GB   One 1.8 TB SSD

Figure 2-2 and Figure 2-3 show the storage space and CPU output of one of the VMs.

[root@p460a26 ~]# lsblk
NAME                      MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
vda                       252:0    0   1.8T  0 disk
├─vda1                    252:1    0    10M  0 part
├─vda2                    252:2    0     1G  0 part /boot
├─vda3                    252:3    0   1.2T  0 part
│ ├─rhel_p460a26-root     253:0    0   100G  0 lvm  /
│ ├─rhel_p460a26-swap     253:1    0     4G  0 lvm  [SWAP]
│ ├─rhel_p460a26-var      253:2    0   100G  0 lvm  /var
│ ├─rhel_p460a26-home     253:3    0    50G  0 lvm  /home
│ ├─rhel_p460a26-ibm      253:4    0   500G  0 lvm  /ibm
│ └─rhel_p460a26-data     253:5    0   500G  0 lvm  /data
└─vda4                    252:4    0 533.5G  0 part
  └─rhel_p460a26-docker   253:6    0   200G  0 lvm

Figure 2-2 Disk space of a VM

[root@p460a26 ~]# lscpu
Architecture:          ppc64le
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             8
NUMA node(s):          1
Model:                 2.0 (pvr 004d 0200)
Model name:            POWER8 (architected), altivec supported
Hypervisor vendor:     KVM
Virtualization type:   para
L1d cache:             64K
L1i cache:             32K
NUMA node0 CPU(s):     0-7

Figure 2-3 CPU output of a VM

Requirements for a 7-node configuration

When considering a 7-node configuration, consider the following items (a quick verification sketch follows this list):
 The installation requires at least 10 GB on the root partition.
 If you plan to place /var on its own partition, reserve at least 10 GB for that partition.
 IBM Statistical Package for the Social Sciences (IBM SPSS®) Modeler add-on requirement: If you plan to install the SPSS Modeler add-on, add 0.5 CPU and 8 GB of memory for each stream that you plan to create.
 All servers must be synchronized in time (ideally through NTP or Chrony).
 SSH between nodes should be enabled.
 YUM should not be already running.
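For example, you might verify the last three prerequisites on each node with commands like the following ones (the host name is a placeholder, and chronyd may be ntpd in your environment):

   systemctl status chronyd         # confirm that the time synchronization service is active
   chronyc tracking                 # check the clock offset against the time source
   ssh node2.example.com hostname   # confirm password-less SSH between nodes
   ps -ef | grep -w yum             # confirm that no YUM process is running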

Table 2-7 shows a sample configuration.

Table 2-7   Sample configuration of a 7-node cluster

  Control or storage nodes: three servers (bare metal or VMs); eight cores; 46 GB of RAM; minimum 300 GB with Extended File System (XFS) format for the installer files partition + minimum 500 GB with XFS format for the data storage partition + minimum 200 GB of extra raw disk space for Docker.

  Compute nodes: three servers (bare metal or VMs); 16 cores; 64 GB of RAM; minimum 300 GB with XFS format for the installer files partition + minimum 200 GB of extra raw disk space for Docker. If you add more cores, a total of 48 - 50 cores that are distributed across multiple nodes is recommended.

  Deployment node: one server (bare metal or VM); 16 cores; 64 GB of RAM; minimum 300 GB with XFS format for the installer files partition + minimum 200 GB of extra raw disk space for Docker. If you add more cores, a total of 48 - 50 cores that are distributed across multiple nodes is recommended.

Requirements for a 9-node configuration

Table 2-8 shows the minimum requirements for a 9-node configuration.

Table 2-8   Nine-node configuration

  Control nodes: three servers (bare metal or VMs); four cores; 16 GB of RAM; minimum 500 GB with XFS format for the installer files partition.

  Storage nodes: three servers (bare metal or VMs); eight cores; 48 GB of RAM; minimum 500 GB with XFS format for the installer files partition + minimum 500 GB with XFS format for the data storage partition.

  Compute nodes: three servers (bare metal or VMs); 16 cores; 64 GB of RAM; minimum 500 GB with XFS format for the installer files partition. If you add more cores, a total of 48 - 50 cores that are distributed across multiple nodes is recommended.

2.2.3 Configuring an HDP system

Figure 2-4 shows some recommendations for HDP deployment with IBM Watson Studio Local and IBM Watson Machine Learning Accelerator for a production system.

Figure 2-4 HDP recommended configurations

2.2.4 Configuring a proof of concept

Table 2-9 shows a proof of concept (PoC) configuration for a minimally functioning environment with cost-sensitive variations.

Table 2-9   Proof of concept configuration

  System management node (cluster type: all): one 1U Power LC921 server; two sockets; 32 cores; 32 GB of memory; two 4 TB HDDs; MicroSemi PM8069 storage controller (internal); 1 GbE network: internal (4-port OS), with three cables (two OS + one baseboard management controller (BMC)); 10 GbE network: one 2-port Intel adapter (two ports), with two direct-access cables (DACs); RHEL 7.5 for POWER9 processor-based systems.

  Master/edge node (cluster type: all): one 1U Power LC921 server; two sockets; 40 cores; 256 GB of memory; four 4 TB HDDs; MicroSemi PM8069 storage controller (internal); 1 GbE network: internal (4-port OS), with three cables (two OS + one BMC); 10 GbE network: two 2-port Intel adapters (four ports), with four DACs; RHEL 7.5 for POWER9 processor-based systems.

  Worker node (cluster type: PoC): three 1U Power LC922 servers; two sockets; 44 cores; 256 GB of memory; four 4 TB HDDs; MicroSemi PM8069 storage controller (internal); 1 GbE network: internal (4-port OS), with three cables (two OS + one BMC); 10 GbE network: one 2-port Intel adapter (two ports), with two DACs; RHEL 7.5 for POWER9 processor-based systems.

  Note: The 1 GbE network infrastructure hosts the campus, management, provisioning, and service logical networks. The 10 GbE network infrastructure hosts the data network.

2.2.5 Conclusion

IBM Watson Studio Local, IBM Watson Machine Learning Accelerator, and HDP deployed on IBM Power Systems servers, with their many hardware threads, high memory bandwidth, and tightly integrated NVIDIA GPUs, are suitable for running machine learning (ML) and DL computations that use open source frameworks on large data sets in enterprise environments.

2.3 Deployment options

In this section, we consider some of the deployment options for the IBM AI solutions and H2O Driverless AI.

2.3.1 Deploying IBM Watson Studio Local in stand-alone mode or with IBM Watson Machine Learning Accelerator

When IBM Watson Studio Local is installed as a stand-alone product, it provides a premier enterprise development and deployment environment for data scientists. Projects can be used to segment lines of business (LOBs) and use cases. Its integration with a datalake through the Hadoop Integration service is superb. You can use its environment for Jupyter with Python 3.6 and IBM PowerAI V1.5.3 for GPU to leverage the Power Systems nodes that have GPUs.

IBM Watson Machine Learning Accelerator is installed on POWER8 or POWER9 processor-based nodes with GPUs. Version 1.1.2 includes IBM PowerAI V1.5.4, which includes recent releases of DL frameworks like TensorFlow 1.12 with Keras.

IBM Watson Machine Learning Accelerator also supports multiple LOBs and use cases through IBM Spectrum Conductor consumers. Consumers can have different resources that are allocated to them, and sharing policies can be defined statically or dynamically to give administrators full control over how the cluster is used. IBM Watson Machine Learning Accelerator Deep Learning Impact can help data scientists reach their goals quicker.

IBM Watson Studio Local can complement an IBM Watson Machine Learning Accelerator cluster by submitting remote jobs to run on it. The IBM Watson Studio Local notebooks can use Sparkmagic to connect to a Livy service application that is installed on IBM Watson Machine Learning Accelerator and submit Spark workloads on the SIG that is associated with the Livy application.

IBM PowerAI, IBM Watson Machine Learning Accelerator, and IBM Watson Studio Local can be installed in an IBM Cloud Private environment. That option enables resource sharing by all three products.

2.3.2 Using the Hadoop Integration service versus using an Apache Livy connector

IBM Watson Studio Local can connect to a Hadoop cluster by using an Apache Livy connection or the new Hadoop Integration service. The Hadoop Integration service has many advantages:
 You can push customized environments that are running in IBM Watson Studio Local to HDP.
 You can select the pushed environment in which notebooks connecting to HDP run.
 On HDP servers that are secured with Kerberos, use the data connector so that jobs may run as the calling user without needing a keytab for each user, which is important for fine-grained authorization and the associated logging of resource usage.
 You can define HDFS and Hive data sources.
 You can shape and transform data in HDP by using the IBM Watson Studio Local Refinery.

Using the IBM Watson Studio Local Hadoop Integration service to connect to HDP and Cloudera datalakes is a best practice.

2.3.3 Deploying H2O Driverless AI in stand-alone mode or within IBM Watson Machine Learning Accelerator

H2O Driverless AI can be installed as a stand-alone product or as an IBM Watson Machine Learning Accelerator application. If you install H2O Driverless AI as an IBM Watson Machine Learning Accelerator application, you can manage it (start and stop) from the IBM Watson Machine Learning Accelerator console.

To install H2O Driverless AI as an application on IBM Watson Machine Learning Accelerator, download the Linux tar.sh file from the following website:

https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/dai/rel-1.3.1-12/ppc64le-centos7/dai-1.3.1-linux-ppc64le.sh

The H2O Driverless AI software is available for use in pure user-mode environments as a self-extracting tar.sh archive. This form of installation does not require a privileged user to install or to run.

For more information about the H2O.ai tar.sh package, see Using Driverless AI 1.5.4.

The NVLink interface in IBM Power Systems servers enables the H2O Driverless AI next-generation AI platform to get the maximum performance gains. Together, H2O Driverless AI and IBM PowerAI provide companies with a data science platform or an AI workbench that addresses a broad set of use cases for ML and DL in every industry.

These integrations happen mainly with H2O Driverless AI, but there is a possibility to integrate with H2O Sparkling Water as well.

H2O Driverless AI

H2O Driverless AI provides the following functions:
 Automates data science and ML workflows.
 H2O Driverless AI is started on a single host that can have GPUs or run with CPUs only.
 Shared file system for data and logs.
 If you use GPUs, the entire host is taken (with the current integration).
 An application instance is created for each user of H2O Driverless AI.
 Environment variables through parameters are used to configure H2O Driverless AI.
 If H2O Driverless AI goes down, it fails over to another host, and IBM Spectrum Conductor starts it on that host.

Figure 2-5 shows H2O Driverless AI on the management console.

Figure 2-5 H2O Driverless AI as shown in the management console

H2O Sparkling Water

H2O Sparkling Water provides the following functions:
 You can use it to combine the ML algorithms of H2O Driverless AI with the capabilities of Spark.
 It runs as a notebook in a SIG.
 When the notebook starts, it forms a mini-cluster of executors.
 These executors stay alive for the entire duration of the notebook.
 IBM Spectrum Conductor disables preemption to prevent reclamation of these hosts.
 Multiple users can share an H2O Sparkling Water notebook instance or have dedicated ones per user.

If you want to work with H2O Driverless AI in IBM Watson Studio Local, install the H2O Flow add-on to use an H2O Flow session within IBM Watson Studio Local to create documents and work with models.

For more information, see H2O Flow add-on.

IBM Spectrum Conductor integration with H2O Driverless AI

At the time of writing, H2O Driverless AI does not support NVIDIA CUDA 10.0, which is a prerequisite for IBM Watson Machine Learning Accelerator V1.1.2 and IBM PowerAI V1.5.4. The workaround is to install the NVIDIA CUDA 9.0 libraries without the driver. The IBM Watson Machine Learning Accelerator cluster continues to run with the CUDA 10.0 driver, but the CUDA 9.0 libraries are available for H2O Driverless AI.

To configure the workaround, complete the following steps:
1. Download the CUDA 9.0 local repository package:
   wget https://developer.nvidia.com/compute/cuda/9.0/Prod/local_installers/cuda-repo-rhel7-9-0-local-9.0.176-1.ppc64le-rpm
2. Install the repository package:
   rpm -i cuda-repo-rhel7-9-0-local-9.0.176-1.ppc64le.rpm
3. Install the CUDA 9.0 toolkit:
   yum install cuda-toolkit-9-0
4. Install H2O Driverless AI by completing the following steps:
   a. Download the Linux tar.sh package from the H2O website.
   b. Download conductor-h2o-driverlessai-master.zip from GitHub.
   c. Install H2O Driverless AI on all the nodes by reviewing the following video at Box.
   d. Run cd /var/dai.
   e. Run the dai.sh installer.
   f. Extract the GitHub package to the same directory. Make sure that the directory has the same owner and group as the executor user, for example, egoadmin.
5. After installing H2O Driverless AI, edit or create the /etc/dai/EnvironmentFile.conf file so that it contains the following parameter definitions:
   DRIVERLESS_AI_CUDA_VERSION=cuda-9.0
   LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:/usr/local/cuda-9.0/extras/CUPTI/lib64

6. Log in to the IBM Spectrum Conductor console to register H2O Driverless AI as an application instance that can be stopped and started by using the application management support. Starting at the console, complete the following steps:
   a. Click Workload and select the application instance.
   b. Click Add.
   c. Register by using the yaml file by selecting the yaml file and completing the form.
   d. Specify the top-level consumer.
   e. Specify the resource group (RG).
   f. Specify the data directory, for example, /var/dai-data.
   g. Specify the execution user, which is the OS user that runs the application (the same as the SIG execution user).
   h. Specify the installation directory, for example, /var/dai/dai....
   i. Click Register.
7. Select the H2O Driverless AI application instance and start it.
8. Review the output section for dai_url.
9. Click dai_url to start H2O Driverless AI.

2.3.4 Running Spark jobs

In an environment with IBM Watson Machine Learning Accelerator, IBM Watson Studio Local, and HDP, the preferred place to run Spark jobs depends on several factors. When the data of interest is in the HDP datalake, running Spark jobs on the datalake takes advantage of the data locality and distributed nature of Spark.

If you want to take advantage of GPUs and the IBM PowerAI optimized DL frameworks, run the jobs on IBM Watson Machine Learning Accelerator or IBM Watson Studio Local (assuming IBM Watson Studio Local has Power Systems servers with GPUs).

2.4 IBM Watson Machine Learning Accelerator and Hortonworks Data Platform

The following main components are included with IBM Watson Machine Learning Accelerator:
 IBM Spectrum Conductor Deep Learning Impact:
   – Data Manager and extract, transform, and load (ETL)
   – Training Visualization and Training
   – Hyper-parameter optimization
 IBM Spectrum Conductor:
   – Multi-tenancy support and security
   – User reporting and charge back
   – Dynamic resource allocation
   – External data connectors

Officially, IBM Watson Machine Learning Accelerator with IBM Spectrum Conductor integrates with IBM Cloud Private and IBM Watson Studio. Through this integration, it runs notebooks from SIGs to submit Spark workloads.

However, IBM Watson Machine Learning Accelerator can easily integrate with HDP in two ways:
 You can configure data connectors that enable HDFS data to be accessed by IBM Watson Machine Learning Accelerator models.
 IBM Watson Machine Learning Accelerator can submit a Spark job to an HDP cluster that is configured with Apache Livy. Running the job directly on HDP takes advantage of the data locality.

For both approaches, you can request them directly from a notebook by using the following sample code:

 Livy

   %load_ext sparkmagic.magics
   %spark add -s <session-name> -l python -u <livy-endpoint-url> -a u -k

   To see all of the available options for the sparkmagic extension inside the notebook, run the following command:

   %spark?

   The Livy UI provides full details about each interaction inside the session, including the input and output that are returned.

 HDFS

   df = spark.read.load("hdfs://<namenode-host>:8020/user/user1/datasets/cars.csv", format="csv")
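After the session is added, code in a cell that starts with the %%spark cell magic runs on the remote cluster. The following usage lines are a hedged sketch; the session name and path are placeholders:

   %%spark -s <session-name>
   df = spark.read.load("hdfs://<namenode-host>:8020/user/user1/datasets/cars.csv", format="csv")
   df.count()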

2.5 IBM Watson Studio Local with Hortonworks Data Platform

There are two types of integration for IBM Watson Studio Local with Hadoop:
 Hadoop is used as a data source. IBM Watson Studio Local supports both secure and non-secure connections to HDFS and Hive. You can configure a connection by using the UI or programmatically. If you configure a connection by using the UI, you can browse for files, preview them, and share data sources with project collaborators.
 A Spark environment in the Hadoop cluster is used to run IBM Watson Studio Local notebooks and batch jobs. The main advantage of this approach is cutting down the performance impact of moving data from Hadoop to the Spark cluster in IBM Watson Studio Local.

Figure 2-6 shows the logical connections between IBM Watson Studio Local and Hortonworks.

[Figure 2-6 depicts the following flow: on a Kerberos-enabled cluster, the IBM Watson Studio Local gateway (which carries a JWT token for the IBM Watson Studio Local user) retrieves a ticket from the KDC, and then submits jobs to and retrieves output from the registered Hadoop server through WebHDFS, WebHCAT, and Livy for Spark or Spark2 (Livy is optionally installed). Requests run with doAs = the IBM Watson Studio Local user, and the Hadoop services perform impersonation checks.]

Figure 2-6 Watson Studio connections with Hortonworks

IBM Watson Studio Local interacts with an HDP cluster through four services:
 WebHDFS: Used to browse and preview HDFS data.
 WebHCAT: Used to browse and preview Hive tables.
 Livy for Spark and Livy for Spark2: Used to submit jobs to the Spark or Spark2 engines on the Hadoop cluster.

Note: WebHCAT is no longer supported as of HDP 3.0. As of HDP 3.0, IBM Watson Studio Local integrates with HDP through HDFS, and Spark integration is through Livy.
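As an illustration of the WebHDFS interface, a directory listing can be retrieved with a simple REST call. The host, port, and path below are placeholders (50070 is the usual NameNode HTTP port on HDP 2.x); on a Kerberized cluster the request must also carry SPNEGO credentials:

   curl -i "http://<edge-node>:50070/webhdfs/v1/user/demo?op=LISTSTATUS"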

The Hadoop registration service should be installed on an edge node of the HDP cluster. The gateway component authenticates all incoming requests and forwards them to the Hadoop services. In a Kerberized cluster, the keytab of the Hadoop registration service user and the Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO) keytab for the edge node are used to acquire the ticket to communicate with the Hadoop services. All requests to the Hadoop service are submitted as the IBM Watson Studio Local user.

For more information, see Set up a remote Hadoop cluster to work with Watson Studio Local.

2.6 IBM Watson Studio Local with IBM Watson Machine Learning Accelerator

When IBM Watson Studio Local is installed on Power Systems servers with GPUs (such as the Power S822LC HPC or Power AC922 servers), the IBM PowerAI DL libraries are automatically included (including GPU-enabled libraries for TensorFlow and Caffe). Users of IBM Watson Studio Local can create notebooks and build models that use these frameworks and the Power Systems and GPU-optimized libraries.

These IBM PowerAI libraries that are included with IBM Watson Studio Local come without support, which is similar to how the IBM PowerAI base product is available for no charge without support. Clients who want support can purchase IBM Watson Machine Learning Accelerator, which also comes with capabilities such as IBM Spectrum Conductor Deep Learning Impact. IBM Spectrum Conductor Deep Learning Impact provides tuning for DL model training tasks, and IBM Spectrum Conductor has multi-tenant Spark cluster capabilities. IBM Watson Studio Local can be integrated with IBM Watson Machine Learning Accelerator by using the Apache Livy interface. In this scenario, IBM Watson Studio Local is the development and collaboration platform, and the IBM Watson Machine Learning Accelerator cluster is used to run model training.

In general terms, IBM PowerAI integrated with IBM Watson Studio Local provides optimized DL frameworks and libraries to leverage IBM Power Servers with GPUs (the Power AC922 server). As mentioned earlier, IBM Watson Studio Local is the platform where notebooks are created, data sources are identified and accessed, and collaboration between team members is enabled. IBM Watson Machine Learning Accelerator, if available, can be used for Spark-based ETL and DL model tuning and training. IBM Watson Studio Local is where the user creates the notebook (where the model training program is created). Without IBM Watson Machine Learning Accelerator, the training job is run on the compute cluster that is part of IBM Watson Studio Local. With IBM Watson Machine Learning Accelerator, the training job is scheduled on the IBM Watson Machine Learning Accelerator cluster.

Both IBM Watson Studio Local and IBM PowerAI are enterprise software offerings from IBM for data scientists that are built with open source components. IBM Watson Studio Local provides interactive interfaces such as notebooks and RStudio. IBM PowerAI provides DL frameworks such as TensorFlow and Caffe, among others, which are built and optimized for IBM Power Systems servers.

Although IBM PowerAI libraries are included with IBM Watson Studio Local and automatically deployed when IBM Watson Studio Local is installed, no customer support is included for IBM PowerAI. To get support (in addition to extra capabilities), clients can purchase and deploy IBM Watson Machine Learning Accelerator instead. An example of how a client can leverage both of these offerings together is where data scientists collaborate, experiment, and build their models in IBM Watson Studio Local, and run the production training (with multiple GPUs) on an IBM Watson Machine Learning Accelerator cluster.

The Power AC922 server is the best choice for enterprise AI. POWER processor-based servers have a number of significant advantages over Intel processor-based servers in performance and price. Work with your colleagues in the Analytics organization to position Power Systems as the preferred platform for the client's IBM Watson Studio Local deployment and their data science initiatives.

IBM Watson Studio Local is available as part of the IBM enterprise private cloud offering that is called IBM Cloud Private.

2.7 IBM Spectrum Scale and Hadoop Integration

The IBM Spectrum Scale HDFS Transparency Connector is a remote procedure call (RPC) API that connects IBM GPFS with HDFS. It offers a set of interfaces that enable applications to use the HDFS Client to access the IBM Spectrum Scale namespace by using traditional HDFS requests.

Figure 2-7 shows the flow of the data from big data applications to the IBM Spectrum Scale namespace.


Figure 2-7 IBM Spectrum Scale flow of data architecture

All data transmission and metadata operations in HDFS are performed through RPCs and processed by the NameNode and DataNode services within HDFS. The IBM Spectrum Scale HDFS Transparency Connector integrates both NameNode and DataNode services and responds to requests as HDFS does, which ensures that any application that reads, writes, or runs any other operation against HDFS receives the expected response.

Some advantages of HDFS transparency are:
 HDFS-compliant APIs and a shell command-line interface (CLI).
 Application client isolation from storage. The application client may access the IBM Spectrum Scale file system without the IBM Spectrum Scale Client installed.
 Improved security management by using Kerberos authentication and encryption in RPC.
 File system monitoring by Hadoop Metrics2 integration.
 IBM Spectrum Scale services management directly on the Ambari GUI.

Avoiding unnecessary data moves reduces cost in a Hadoop environment like HDP. With IBM Spectrum Scale, HDP components may access a namespace (file system mount point) on the Hadoop cluster by using the same data that is accessible to any other software (SAS, IBM Db2, Oracle Database, SAP, and others) without using a command such as distcp. Copying data in this way is a common requirement when you run HDFS and must access data from an external file system.
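As an illustration, the following minimal sketch shows the same files being reached through both the POSIX interface and the HDFS Transparency Connector. The mount point and paths are hypothetical, and the sketch assumes that the file system's data directory is mapped as the HDFS root:

# POSIX access to the IBM Spectrum Scale file system for non-Hadoop software
ls /gpfs/datalake/user1/sales
# The same directory through the HDFS Transparency Connector, with no distcp copy
hdfs dfs -ls /user1/sales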

2.7.1 Information Lifecycle Management

IBM Spectrum Scale uses the policy-based Information Lifecycle Management (ILM) toolkit to automate the storage of data into tiered storage. With the ILM toolkit, you can manage file placement rules across different pools of storage, like flash drives for hot data, SATA disks for commonly accessed data, and even cold data on the cloud or a tape drive.

With IBM Spectrum Scale policy-based ILM tools, you can accomplish the following tasks:
 Create storage pools to partition a file system's storage into collections of disks or a RAID with similar properties.
 Create file sets to partition the file system namespace to allow administrative operations at a finer granularity than that of the entire file system.
 Create policy rules based on data attributes to determine the initial file data placement and manage file data placement throughout the life of the file.

Some rule conditions that can be used are:
 Date and time when the file was last accessed.
 Date and time when the file was last modified.
 File set name (directory).
 File name or extension.
 File size.
 Owner of the file (user ID and group ID).

You can create and manage policies and policy rules by using the IBM Spectrum Scale CLI or the GUI. To create a policy rule by using the GUI, complete the following steps: 1. Log in to the IBM Spectrum Scale web interface and in the left pane select Files → Information Lifecycle. Click Add Rule. The window that is shown in Figure 2-8 opens.

Figure 2-8 IBM Spectrum Scale GUI: Information Lifecycle Add Rule window

2. Select the type of rule that you want to create, such as Placement, Migration, Compression, or Deletion. The window that is shown in Figure 2-9 opens.

Figure 2-9 IBM Spectrum Scale GUI: Information Lifecycle Rule Type window

3. Select the pool where the new files must be placed (when you are creating a placement policy), the Placement Criteria (for example, only files with the extension .xml), and more. For a Migration Policy rule, you can select the source and target pools where the files must be moved, and then the Rules that trigger the migration, such as when the source pool reaches a defined utilization threshold or a file has not been accessed for a number of days.

Figure 2-10 shows some more configurations that can be selected for the Information Lifecycle rule.

Figure 2-10 IBM Spectrum Scale: Information lifecycle rules configuration

There are many types of ILM rules that can be created on IBM Spectrum Scale. The GUI makes creating a policy to meet your needs simple and fast.
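For CLI-oriented administrators, here is a minimal sketch of an equivalent policy. The pool names, file system name, and file path are hypothetical, and the rules mirror the placement and migration examples above. Save the rules to a file (for example, /tmp/ilm.pol):

/* Place new .xml files in the flash-backed system pool */
RULE 'xmlPlacement' SET POOL 'system' WHERE UPPER(NAME) LIKE '%.XML'
/* Default placement for all other new files */
RULE 'default' SET POOL 'data'
/* Migrate files out of the system pool when it is 80% full, stopping at 60% */
RULE 'cooldown' MIGRATE FROM POOL 'system' THRESHOLD(80,60) TO POOL 'data'

Then install the placement rules and evaluate the migration rules on demand:

mmchpolicy gpfs0 /tmp/ilm.pol
mmapplypolicy gpfs0 -P /tmp/ilm.pol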

2.8 Security

Security is at the forefront of most clients’ minds. Security incidents and data breaches often make significant news because of the potential impact that they have on a business and its customers. Although some companies are making a move to the public cloud, others want to keep their data local and secure behind their firewall.

Protecting the datalake is paramount in an enterprise. The security mechanisms must include authentication, authorization, audit, encryption, and policy management:
 Authentication is the process of proving who you are.
 Authorization is the process of verifying that you have the authority to access a resource.
 Records that indicate who did what provide the audit capability.
 Encryption is the mechanism that is used to protect data-at-rest and data-in-motion.
 Policy management is done through the product administration consoles.

Ideally, these mechanisms can be managed in a consistent and cohesive fashion.

2.8.1 Datalake security

Hortonworks uses Apache Ranger for its centralized security management, although Ambari is used for some of the fundamental setup, such as enabling Kerberos. Apache Ranger is a framework and service to enable, monitor, and manage security for the Hadoop platform and services. The Administration Portal is the Ranger interface for security administration. Hadoop components like Knox, YARN, Hive, HBase, and HDFS have lightweight plug-ins that provide integration with Ranger.

From the Ambari administration console, administrators can enable and configure Kerberos for authentication and set perimeter security policies by using Knox.

From the Ranger administration console:
 Administrators can set fine-grained access control for the authorization policies.
 Administrators can view the centralized audit reports showing user activity.
 Administrators can configure Ranger KMS for cryptographic key management for HDFS Transparent Encryption. Ranger KMS lets you create and manage the keys, in addition to providing audit and access control of the keys. Ranger KMS cannot be used to enable HDFS encryption directly.

Hadoop authentication

Hadoop uses Kerberos for authentication and the propagation of identities for users and services. Kerberos includes the client, the server, and a trusted party that is known as the Kerberos Key Distribution Center (KDC). The KDC is a separate server from the Hadoop cluster. It includes a database of users and services that are known as principals, and is composed of an Authentication Server and a Ticket Granting Service (TGS). The Authentication Server is used for the initial authentication, and it issues a Ticket Granting Ticket (TGT) to the authenticated principal. The TGS is used to get service tickets by using the TGT. Host and service resources use a special file that is called a keytab that includes their principal and key. The keytab resolves the issue of providing a password when decrypting a TGT. The Hortonworks security documentation provides the detailed steps for enabling Kerberos, including the Kerberos server setup.
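For example, a service can obtain a TGT non-interactively from its keytab. The principal and keytab path here are hypothetical:

# Obtain a TGT by using a keytab instead of a password
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs@EXAMPLE.COM
# Verify the cached ticket
klist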

If you already have a Lightweight Directory Access Protocol (LDAP) server solution, you can configure various HDP components such as Ambari, Knox, and Ranger to use it. IBM Watson Studio Local also supports the configuration of an external LDAP server for its users. The two solutions can share an LDAP server and avoid the problem of defining users in multiple places.

The Apache Knox Gateway is a reverse proxy that provides the perimeter security for a Hadoop cluster. It exposes Hadoop REST and HTTP services for HDFS, Hive, HBase, and others without revealing the cluster internals. It provides SSL for over-the-wire encryption and encapsulates Kerberos authentication.

Hadoop authorization

The Apache Ranger service can be installed through the Ambari Add Service wizard. Before adding Ranger, set up a database instance and an Apache Solr instance. The database is used for administration and logging, and Ranger uses Apache Solr to store audit logs. After Ranger is installed, the Ranger plug-ins can be enabled for each of the services being administered, including HDFS, HBase, HiveServer2, Storm, Knox, YARN, Kafka, and NiFi.

Log in to the Ranger portal at http://ranger_host:6080 to open the Ranger Console. The Service Manager page opens and shows the Access Manager, Audit, and Settings tabs at the top of the page:
 The Access Manager → Resource Based Policy option opens the Service Manager page so that you can add access policies for the services that are plugged into Ranger.
 The Access Manager → Tag Based Policies option opens the Service Manager for Tag Based policies page so that you can add tag-based services that you use to control access to resources that span multiple Hadoop components.
 The Access Manager → Reports option opens the Reports page where you can generate user access reports.
 The Settings → User/Groups option shows a list of users and groups that can access the Ranger portal.
 The Settings → Permissions option opens a Permissions page where you can edit the permissions for users and groups.

Hadoop auditing

Ranger offers the ability to store audit logs by using Apache Solr. Solr is an open source, highly scalable search platform. Store your audit logs in HDFS too so that they can be exported to a security information and event management (SIEM) system. Having the audit log data in HDFS also provides the ability to create an anomaly detection application by using ML and DL algorithms.
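As a sketch, the following Ranger plug-in audit properties route audit records to both Solr and HDFS. The property names follow the Ranger conventions that HDP uses, and the HDFS destination path is hypothetical:

xasecure.audit.destination.solr=true
xasecure.audit.destination.hdfs=true
xasecure.audit.destination.hdfs.dir=hdfs://namenode:8020/ranger/audit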

Selecting the Ranger console Audit tab opens an Audit page where you can create a search to monitor a user’s activity.

Hadoop data protection

Hadoop provides protection mechanisms for data-in-motion and data-at-rest.

Hadoop wire encryption

Data on the wire is encrypted to ensure its privacy and confidentiality. Hadoop can be configured to encrypt data as it moves into the cluster, through the cluster, and out of the cluster. There are a number of protocols that must be configured for encryption: RPC, data transfer protocol (DTP), HTTP, and Java Database Connectivity (JDBC).

Clients use RPC to interact directly with the Hadoop Cluster. DTP is used when the client is transferring data within a data node. Clients use HTTP to connect to the cluster through a browser or a REST API. HTTP is also used between mappers and reducers during a data shuffle. JDBC is used to communicate with the Hive server.

RPC encryption is enabled by setting the HDFS property hadoop.rpc.protection=privacy.

DTP encryption is enabled by setting the HDFS property dfs.encrypt.data.transfer=true.

The DTP encryption algorithm, “3des” or “rc4”, is set by using the dfs.encrypt.data.transfer.algorithm property.
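In configuration-file terms, these settings look like the following sketch; hadoop.rpc.protection belongs in core-site.xml, and the DTP properties belong in hdfs-site.xml:

<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value>
</property>
<property>
  <name>dfs.encrypt.data.transfer</name>
  <value>true</value>
</property>
<property>
  <name>dfs.encrypt.data.transfer.algorithm</name>
  <value>3des</value>
</property>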

For more information about enabling SSL encryption for the various Hadoop components, see Enabling SSL for HDP Components.

HDFS Transparent Data Encryption

HDFS encryption is an end-to-end encryption mechanism for stored data in HDFS. The data is encrypted by the HDFS client during a write operation and decrypted by an HDFS client during a read operation. HDFS sees the data as a byte stream during these operations. HDFS encryption is composed of encryption keys, encryption zones, and Ranger KMS. The keys are a new level of permission-based access beyond the standard HDFS access control. An encryption zone is a special HDFS directory in which all data is encrypted. The Ranger KMS generates and manages the encryption zone keys, provides access to the keys, and stores audit data that is associated with the keys.

The native HDFS encryption feature is supported since HDFS Transparency 3.0.0.
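A minimal sketch of creating an encryption zone follows. The key name and directory path are hypothetical, the key is stored by the configured KMS provider (Ranger KMS in this setup), and the commands typically require HDFS administrator privileges:

# Create an encryption zone key
hadoop key create ezkey1
# Create an empty directory and turn it into an encryption zone
hdfs dfs -mkdir /apps/secure
hdfs crypto -createZone -keyName ezkey1 -path /apps/secure
# List the configured encryption zones
hdfs crypto -listZones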

2.8.2 IBM Watson Machine Learning Accelerator security with Hadoop

IBM Spectrum Conductor supports Kerberos authentication, which can be extended to HDFS. You can authenticate by using a Kerberos TGT or by using a principal with a keytab. The keytab is preferred for long-running applications.

The SIG to which you submit applications must reference a path to the Hadoop configuration. Set the SIG environment variable HADOOP_CONF_DIR to the path of the Hadoop configuration. For example: HADOOP_CONF_DIR=/opt/hadoop-2-6-5/etc/hadoop.

When submitting a Spark batch job with a keytab for HDFS, specify the principal and keytab as options that are passed with the --conf flag. This spark-submit command shows the syntax for keytab authentication:

spark-submit --master spark://spark_master_url \
  --conf spark.yarn.keytab=path_to_keytab \
  --conf spark.yarn.principal=principal@REALM.COM \
  --class main-class application.jar hdfs://namenode:9000/path/to/input

For access through Livy, follow the HDP instructions to configure the Livy service to authenticate with Kerberos. Those instructions include creating a user, principal, and keytab for Livy.

2.8.3 IBM Watson Studio Local security

When setting up HDP to work with IBM Watson Studio Local by using the Hadoop registration service, more setup is required for the Hadoop cluster that is configured with Kerberos. A gateway with JSON Web Token (JWT)-based authentication is configured so that IBM Watson Studio Local users can securely authenticate with the registration service.

Figure 2-11 shows the various components of an IBM Watson Studio Local and Hadoop Integration.

Figure 2-11 IBM Watson Studio Local Hadoop registration service

Use a Kerberos keytab for the Hadoop registration service user and an SPNEGO keytab for the edge node to acquire the ticket to communicate with the Hadoop services. The request that is submitted to the Hadoop services is submitted as the IBM Watson Studio Local user.

For access through Livy, follow the HDP instructions to configure the Livy service to authenticate with Kerberos. Those instructions include creating a user, principal, and keytab for Livy’s use.

Here are some IBM Watson Studio Local considerations to note when configuring your security:
 To help with General Data Protection Regulation (GDPR) readiness, install IBM Watson Studio Local Version 1.2.0.3 or later.
 Disk volumes should be encrypted. To encrypt data storage, use Linux Unified Key Setup-on-disk-format (LUKS). If you decide to use this approach, format the partition with the highly scalable Linux XFS before you install IBM Watson Studio Local (see the sketch after this list).
 IBM Watson Studio Local can be accessed only through SSL/TLS (HTTPS). For more information, see 2.8, “Security” on page 50. Use JDBC/SSL-based mechanisms to communicate with remote data sources.
 Access failures are recorded in the logs.
 The Hadoop Integration service uses HTTPS, which enables secure communication.
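A minimal sketch of preparing an encrypted XFS volume with LUKS follows. The device name, mapper name, and mount point are hypothetical:

# Initialize LUKS encryption on the raw device (this destroys any existing data)
cryptsetup luksFormat /dev/sdb
# Open the encrypted device under a device-mapper name
cryptsetup luksOpen /dev/sdb wsl_data
# Create an XFS file system and mount it before installing IBM Watson Studio Local
mkfs.xfs /dev/mapper/wsl_data
mount /dev/mapper/wsl_data /ibm/wsl-data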

2.8.4 IBM Spectrum Scale security

The IBM Spectrum Conductor software component and IBM Spectrum Scale wrap security features around these frameworks for production deployments in many regulated organizations. These solutions employ current security protocols and are subjected to extensive security scanning and penetration testing. The IBM Spectrum Conductor software implements end-to-end security, from data acquisition and preparation to training and inference.

The product security implementation has the following features:
 Authentication: Support for Kerberos, Active Directory (AD) and LDAP, and operating system (OS) authentication, including Kerberos authentication for HDFS.
 Authorization: Fine-grained access control, access control lists (ACLs) or role-based access control (RBAC), Spark binary lifecycle, notebook updates, deployments, resource plans, reporting, monitoring, log retrieval, and execution.
 Impersonation: Different tenants may define production execution users.
 Encryption: SSL and authentication between all daemons, and storage encryption by using IBM Spectrum Scale.


Chapter 3. Integrating new data

This chapter describes the options for integrating new data into the datalake and how that data can be used to improve artificial intelligence (AI) models.

People are more connected, which leads to an accumulation of data sources and constantly growing data. The increased volume of data requires ever-increasing computing power to derive value from it. The speed and the number of directions from which data enters the enterprise are increasing because of advances in network technology, which results in data coming in faster than you can process it. The faster the data enters and the more varied the sources, the harder it is to derive value from the data. Fortunately, large data lakes and AI can help you with these issues.

The following topics are covered in this chapter:
 Data ingestion overview
 Types of data ingestion
 Options for data ingestion
 Using data connectors to work with external data sources
 How integration improves the artificial intelligence models

3.1 Data ingestion overview

Data ingestion is the process of bringing data into the data processing system, which in this case is Hortonworks Data Platform (HDP). The process consists of importing, transferring, loading, and processing data for later use or storage. It involves connecting to various data sources, extracting the data, and detecting changed data. Data ingestion subsystems must fetch data from various sources, such as IBM Db2, web logs, application logs, streaming data, social media, and many other sources.

Figure 3-1 shows how data ingestion works in HDP.

Figure 3-1 Data ingestion

Figure 3-2 shows the layout of HDP.

Figure 3-2 Hortonworks Data Platform

3.2 Types of data ingestion

There are two different platforms that Hortonworks supports: HDP and Hortonworks DataFlow (HDF). HDP is best known for handling data-at-rest, and HDF is known for its efficiency regarding data-in-motion. Both platforms are independent from each other and can operate separately, but are often integrated.

Some components of HDP are the Hadoop Distributed File System (HDFS), YARN, MapReduce, Tez, Hive, HBase, Pig, and other useful tools.

Figure 3-3 shows an overview of HDP in the Ambari console.

Figure 3-3 Hortonworks Data Platform overview

In HDP, you can store any type of data, whether it is structured, semi-structured, or unstructured. HDP can store a large amount of data, that is, terabytes and petabytes of many different data types. It supports multiple standby nodes and enables clusters to scale up to thousands of nodes for high scalability and availability.

HDF solves the real-time challenges of collecting and transporting data from many sources and provides interactive command and control of live flows with full and automated data provenance that is powered by Apache NiFi, Apache Kafka, and Apache Storm to name a few.

Figure 3-4 shows an overview of the different types of data that can be ingested with HDF.

Connecting Data Between Ecosystems Without Coding: 220+ Processors


Figure 3-4 HDF data ingestion

But how exactly does new data get into your cluster? Data can be ingested into your cluster from many different sources, such as new log entries from your web server, new sensor data from your Internet of Things (IoT) devices, or new stock trades. Here is where Apache NiFi, Apache Kafka, and Apache Storm come in.

3.3 Options for data ingestion

Here are some data ingestion options:
 Import data into HDP by using Apache Sqoop, which is a tool that is used to transfer bulk data between a relational database server and HDFS. Apache Sqoop supports Kerberos security, compression, parallel import and export, incremental load, full load, and data load directly to Hive. Relational database ingestion can be done by using Apache NiFi too. However, with Sqoop you can specify a table, and in NiFi you must specify a query. A minimal Sqoop sketch follows this list.
 Ingest files into a landing server and use the HDFS command-line interface (CLI) to ingest the data.
 Stream real-time data by using Apache Kafka, Storm, and NiFi. In our proof of concept (PoC), we use Apache NiFi Version 1.5.0.3.1.2.0-7. The interface is similar to Figure 3-5.

Figure 3-5 Apache NiFi

Apache NiFi is a robust, secure, and performant data delivery and data flow management platform. It helps you access the data that you need from almost any source. It can help you perform event processing, transformation, routing, filtering, and prioritization, and securely move that data to its required destination while keeping a traceable and auditable record. NiFi supports different types of protocols, such as messaging, block, TCP, UDP, and many others. NiFi can run in parallel with other applications, but it performs best when the entire system (or multiple systems in a cluster) is dedicated to it. When collecting real data streams from IoT devices, Apache MiNiFi is used. Apache MiNiFi provides all the key functions of NiFi in varying, smaller footprints in multiple form factors:
– A MiNiFi Java agent
– A C++ agent
– Libraries for Android and iOS

With Apache MiNiFi agents, you can capture intelligence at the edge nodes by performing lightweight operations such as data filtering. MiNiFi is a subproject of NiFi, and NiFi runs in the data center. MiNiFi runs on hardware that is probably dedicated to a different primary purpose. Whether this hardware is an IoT device, a cash register or point of sale system, a car modem, or a physical sensor, MiNiFi's job is to process and extract data while not taking unnecessary resources from the primary function.
 A combination of NiFi and Kafka. You can use NiFi to read and write from external systems and quickly build a scalable data ingestion pipeline by using a codeless approach. Apache Kafka is used as a low-latency, scalable publish and subscribe system to decouple data producers and consumers and enable data retention and caching in a streaming architecture.
 You can also use IBM Watson Machine Learning Accelerator with Spark data frames to read data directly from the datalake by using the IBM Watson Machine Learning Accelerator console.
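As referenced in the first option above, here is a minimal Sqoop sketch. The connection details, credentials, table name, and target directory are hypothetical:

# Import a Db2 table into HDFS in parallel (-P prompts for the password)
sqoop import \
  --connect jdbc:db2://db2host.example.com:50000/SAMPLE \
  --username db2inst1 -P \
  --table EMPLOYEE \
  --target-dir /user/user1/employee \
  --num-mappers 4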

Figure 3-6 shows how to create a data set by using IBM Spectrum Conductor Deep Learning Impact V1.2.1. IBM Spectrum Conductor Deep Learning Impact supports Lightning Memory-Mapped Database (LMDB), TFRecord, and other data sets.

Figure 3-6 Deep learning

Depending on the deep learning (DL) framework that you are using, different IBM Spectrum Conductor Deep Learning Impact data set types can be used. If you are using Caffe, you can create the LMDB and images for object classification data sets. If you are using TensorFlow, you can create the TensorFlow Records, Images for object classification, Images for object detection, Images for vector output, CSV files, and other generic data set types.

Figure 3-7 shows the available data set formats in IBM Spectrum Conductor Deep Learning Impact.

Figure 3-7 All data sets

Figure 3-8 shows an IBM Spectrum Conductor Spark Instance Group (SIG) that is built with the IBM Spectrum Conductor Deep Learning Impact template.

Figure 3-8 Application layer view

3.4 Using data connectors to work with external data sources

Spark is one of the most popular open source tools for data processing in the advanced analytics environment. Before the release of IBM Spectrum Conductor with the Spark feature, if you wanted the hosts in your SIG to connect to an external data source such as IBM Cloud Object Storage (IBM COS), IBM Spectrum Scale, or HDFS, there was a large amount of configuration that was required on those hosts. In the context of IBM Spectrum Conductor, a data connector contains the type, the URI, the authentication method, and all of the required libraries to access the data source.

Data connectors simplify data source administration and configuration, and separate credential management and data usage from applications. Users can optionally make some selections about which data connectors are active and which one to use as the default. For batch applications, these selections are used as the default, but can be changed at run time. For notebook applications, these values are always used.

Figure 3-9 shows the types of data connectors that you can use to connect to IBM Watson Machine Learning Accelerator.

Figure 3-9 Types of data connectors

3.5 How integration improves the artificial intelligence models

The capabilities of POWER9 processors and NVIDIA GPUs in POWER9 processor-based servers can speed data ingestion by using machine intelligence. IBM Power Architecture® is the only NVLink CPU-to-GPU-enabled architecture. With the integration of HDP and HDF together with IBM Watson Machine Learning Accelerator, you can train a model more easily and efficiently. The construction of an ML/DL model requires a large amount of historical data to detect a pattern that is related to what you are trying to predict. The best tool to do this task in HDP is Spark. After this task is complete, you can start streaming data-in-motion through NiFi. NiFi sends that data to Storm or Spark Streaming, which applies the model and makes the prediction.
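To make that last step concrete, here is a minimal PySpark Structured Streaming sketch for scoring a stream. The Kafka broker, topic, and model path are hypothetical; it assumes that the spark-sql-kafka connector is on the classpath and that the saved pipeline's first stages parse and featurize the incoming payload column:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.ml import PipelineModel

spark = SparkSession.builder.appName("StreamScoring").getOrCreate()

# Subscribe to the Kafka topic that NiFi feeds
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "kafka01.example.com:6667")
       .option("subscribe", "clickstream")
       .load())

# Kafka delivers the payload as bytes; expose it as a string column
events = raw.select(col("value").cast("string").alias("payload"))

# Load a previously trained pipeline model from HDFS and score the stream
model = PipelineModel.load("hdfs:///user/user1/models/ctr_model")
query = (model.transform(events)
         .writeStream
         .format("console")
         .start())
query.awaitTermination()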


Chapter 4. Integration details

This chapter provides more details about the integration steps that were introduced in Chapter 2, “Integration overview” on page 29.

The following topics are covered in this chapter:
 Integrating IBM Watson Machine Learning Accelerator with Hortonworks
 Integrating IBM Watson Studio Local, IBM Watson Machine Learning Accelerator, and Hadoop clusters
 Integrating IBM Watson Studio Local with Hortonworks Data Platform

4.1 Integrating IBM Watson Machine Learning Accelerator with Hortonworks

IBM Watson Machine Learning Accelerator integrates with Hortonworks in two ways:
 A remote Spark job can be submitted from a Jupyter Notebook running on an IBM Spectrum Conductor Spark Instance Group (SIG) to a Hortonworks cluster that has an Apache Livy service running on it. Sparkmagic is used in the notebook to make the connection to Livy.
 Data is read by Spark jobs running on IBM Watson Machine Learning Accelerator by using a Hadoop Distributed File System (HDFS) URL directly or by defining an HDFS connector in IBM Spectrum Conductor.

Apache Livy is a service that enables interaction with a Spark cluster over a REST interface. It supports long-running Spark sessions and multi-tenancy. For more information, see Apache Livy - A REST Service for Apache Spark.
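For illustration, the following minimal sketch drives the Livy REST API directly with curl. The host name is hypothetical, and the session ID (0 here) comes from the response to the first request:

# Create an interactive PySpark session
curl -s -X POST -H 'Content-Type: application/json' \
     -d '{"kind": "pyspark"}' http://livy-host:8998/sessions
# Submit a code statement to the new session
curl -s -X POST -H 'Content-Type: application/json' \
     -d '{"code": "sc.parallelize(range(100)).count()"}' \
     http://livy-host:8998/sessions/0/statements
# Poll for the statement result
curl -s http://livy-host:8998/sessions/0/statements/0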

Sparkmagic runs in a Jupyter Notebook. It is a set of tools for interactively working with remote Spark clusters through Apache Livy. For more information, see GitHub - Jupyter magics and kernels for working with remote Spark clusters.

One of the major building blocks for running an application on IBM Watson Machine Learning Accelerator is a SIG. Before you create a SIG, you must complete a set of prerequisite steps. For more information about SIGs, see IBM Knowledge Center.

The first step in building an IBM Watson Machine Learning Accelerator cluster is defining a resource group (RG), which contains a set of resources that is part of the IBM Watson Machine Learning Accelerator cluster. RGs provide a simple way of organizing and grouping resources (hosts). Instead of creating policies for individual resources, you create and apply them to an entire group. A cluster administrator can define multiple RGs, assign them to consumers, and configure a distinct resource plan for each group.

An RG's host list is either static or dynamic. A static list is where you specifically choose which hosts to use. A dynamic list is where you specify either all hosts or a resource requirement so that when new hosts join, they are automatically added to the RG.

If you plan to run a GPU-dependent application and have GPU-based hosts in the cluster, make them part of an RG too. In this case, create two RGs: one with CPU cores and the other with GPUs.

For the RG with the GPU hosts, ensure that all the hosts in the RG have a number value in the ngpus column and that the number in the Total slots for this group column is equal to the total number of GPUs on all the hosts in the RG. RGs can be static or dynamic by using different host attributes to define membership or by using tags. Resources can be logical entities that are independent of nodes (bandwidth capacity and software licenses). Those resources are used by consumers.

A consumer is a logical structure that creates the association between the workload demand and the resource supply. Consumers are organized hierarchically into a tree structure to reflect the structure of a business unit, department, projects, and so on. You create a consumer and assign a set of RGs to it.

Consumers also help set up different roles for different users to use the resources as needed, and have demarcated differences between them. For example, two departments in an organization (sales and finance) can run different Spark applications for their individual needs with different sets of resources. There can also be resources that can be shared among the different departments based on their business requirements.

4.1.1 Running remote Spark jobs with Livy

This section shows how to run a job remotely on a Hortonworks Spark cluster by using Apache Livy and Sparkmagic. Figure 4-1 shows an IBM Watson Machine Learning Accelerator Jupyter Notebook that uses Sparkmagic to connect to the Apache Livy service running on an HDP edge node.

Figure 4-1 IBM Watson Machine Learning Accelerator with Hortonworks and Livy

On the Hadoop side, install Livy2 on an edge node, which is a node that includes the Hadoop clients for Spark, Hive, and HDFS. The Hadoop distribution vendors typically have detailed documentation about how to accomplish this task. For Hortonworks Data Platform (HDP) V2.6.5, we followed the instructions that are found at Installing Livy - Hortonworks Data Platform.

Install a custom Jupyter Notebook with Sparkmagic on IBM Watson Machine Learning Accelerator by following the instructions that are found in Appendix B, “Installing an IBM Watson Machine Learning Accelerator notebook” on page 121.

The following example uses the Spark MLlib binomial logistic regression code sample from Classification and regression - Spark 2.2.0 Documentation. In this case, we use a different data set for the data that is accessed from HDFS.

Logistic regression is a machine learning (ML) method that is used to describe data and explain the relationship between one dependent binary variable and one or more nominal, ordinal, interval, or ratio-level independent variables.

Criteo is an ad click-through rate data set. It contains data about users and the pages that they visit, and it is used to calculate the probability that they click certain ads. The data that is used in this example consists of a portion of Criteo's traffic over 7 days. Each row corresponds to a display ad that is served by Criteo. Positive (clicked) and negative (non-clicked) examples were subsampled at different rates to reduce the data set size. The examples are chronologically ordered.

To install the data into HDFS, run the following commands on one of the nodes in the Hadoop cluster:
 wget https://s3-us-west-2.amazonaws.com/criteo-public-svm-data/criteo.kaggle2014.svm.tar.gz
 tar -xzvf criteo.kaggle2014.svm.tar.gz
 hdfs dfs -put criteo.kaggle2014.svm

In our sample, the target directory is /user/user1/datasets.

Example 4-1 shows the Spark MLlib code running remotely on the Hadoop cluster and reading the data from the HDFS file system. This is an example of data locality, that is, running the application close to where the data is.

Example 4-1 Code snippet to perform logistic regression by using Sparkmagic

%load_ext sparkmagic.magics

%spark add -s test -l python -u http://129.40.2.75:8999 -a u -k

%%spark
from pyspark.ml.classification import LogisticRegression

# Load training data from HDFS
training = spark.read.format("libsvm").load("/user/user1/datasets/criteo.kaggle2014.test.svm")

%%spark
# Create a logistic regression model
lr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)

# Fit the model by using the training data
lrModel = lr.fit(training)

# Print the coefficients and intercept for logistic regression
print("Coefficients: " + str(lrModel.coefficients))
print("Intercept: " + str(lrModel.intercept))

4.1.2 Accessing the Hadoop data from IBM Watson Machine Learning Accelerator

Figure 4-2 shows a Spark job running in an IBM Watson Machine Learning Accelerator Jupyter Notebook and accessing data on a Hadoop cluster. There are a couple of options for reading HDFS data in IBM Watson Machine Learning Accelerator: it can be read into a Spark data frame directly by using an HDFS URL or by defining an HDFS connector. The latter option includes options for Kerberos.

Figure 4-2 Hadoop data access from IBM Watson Machine Learning Accelerator

In the following example, we reuse the logistic regression example with the Criteo data set. The model runs locally on IBM Watson Machine Learning Accelerator with the data coming from the remote Hadoop cluster.

Example 4-2 shows the Spark reading of the data by using the HDFS URL.

Example 4-2 Code snippet to perform logistic regression by using remote data

import time
from pyspark.ml.classification import LogisticRegression

# Load training data from the remote Hadoop cluster
training = spark.read.format("libsvm").load("hdfs://ib-hdprb001.pbm.ihost.com:8020/user/user1/datasets/criteo.kaggle2014.test.svm")

# Create a logistic regression model
lr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)

# Fit the model
lrModel = lr.fit(training)

# Print the coefficients and intercept for logistic regression
print("Coefficients: " + str(lrModel.coefficients))
print("Intercept: " + str(lrModel.intercept))

# We can also use the multinomial family for binary classification
mlr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8, family="multinomial")

# Fit the model
mlrModel = mlr.fit(training)

An HDFS data connector can be defined by updating the IBM Watson Machine Learning Accelerator SIG configuration. The SIG must be stopped to make configuration changes. The SIG details include a pane for defining data connectors.

Figure 4-3 shows the data connector pane. Click Add to define the data connector.

Figure 4-3 Adding a data connector in an IBM Watson Machine Learning Accelerator SIG

Provide a name, type, and access URI for the new data connector. Figure 4-4 shows details for the new data connector.

Figure 4-4 New Data Connector details

To use the new connector, save it, and then redeploy and start the SIG. The portion of the notebook that reads the remote data into a Spark data frame can now specify the location directly without the HDFS URL prefix, as shown in Example 4-3.

Example 4-3 Loading data by using an HDFS connector

# Load training data from the remote Hadoop cluster
training = spark.read.format("libsvm").load("/user/user1/datasets/criteo.kaggle2014.test.svm")

This section covered IBM Watson Machine Learning Accelerator integration with Hadoop. We showed how to run a Spark job on the Hadoop cluster by using Sparkmagic in a notebook and how to read HDFS data into a local Spark job.

4.2 Integrating IBM Watson Studio Local, IBM Watson Machine Learning Accelerator, and Hadoop clusters

This section shows how to integrate IBM Watson Studio Local with IBM Watson Machine Learning Accelerator and Hadoop clusters.

Figure 4-5 illustrates the interaction architecture between IBM Watson Studio Local and IBM Watson Machine Learning Accelerator. Livy is installed as an application on IBM Watson Machine Learning Accelerator, and Sparkmagic is used in an IBM Watson Studio Local notebook to create a connection for Spark job execution on IBM Watson Machine Learning Accelerator.

Figure 4-5 IBM Watson Studio Local with IBM Watson Machine Learning Accelerator

To integrate IBM Watson Studio Local notebooks with IBM Watson Machine Learning Accelerator, install a Livy application on the IBM Spectrum Conductor SIG, as described at the cws-livy GitLab project.

Here are the application installation step details: 1. From the IBM Spectrum Computing Cluster Management Console, click Workload → Application Instances, and then click New to register an application, as shown in Figure 4-6.

Figure 4-6 Registering a new application instance

2. In the New Application Instance dialog, click Browse to select the Livy_0.5.0.yaml application template file that was downloaded from the cws-livy GitLab project, as shown in Figure 4-7.

Figure 4-7 Specifying the application template

3. Next, you are guided through the New Application wizard: a. For the application instance name, specify Livy2, as shown in Figure 4-8.

Figure 4-8 Specifying an application name

b. For Consumer, select the one that is associated with the SIG. In this case, we had a SIG called Notebook and the consumer has the same name, as shown in Figure 4-9.

Figure 4-9 Specifying the top-level consumer for the application

c. For Resource Group and Plans, select the resource that you used for the Spark master in the SIG definition. The resources are usually the computer hosts, as shown in Figure 4-10. The Resource Groups and Plans selection box might not appear until you hover the cursor over the location.

Figure 4-10 Selecting the resource groups

d. For the repository package, specify livy_0_5_0_pacage.tar.gz, which was downloaded from the cws-livy GitLab project, as shown in Figure 4-11.

Figure 4-11 Specifying the repository packages

e. Specify the parameter values. For the port number, 8998 is a typical default. If you install the Livy application on multiple SIGs, you need a unique port number for each SIG. The deployment directory should be unique. We used the deployment directory of the SIG and added /livy2 to it. The execution user should be the same as the one that you selected for your SIG. If the SIG was built by using the IBM Spectrum Conductor Deep Learning Impact template, the execution user is the ego admin user, typically called egoadmin. For our test, we used an OS user who is called demouser to run the SIG. The SPARK_HOME and Spark Master URL details can be found in the associated SIG. For our test, we had a Notebook SIG defined. We went to the SIG list by clicking Workload → Spark → Spark Instance Groups, and then selected the Notebook SIG to see the details for the Spark Deployment and Master URL. For an example of the parameters, see Figure 4-12.

Figure 4-12 Specifying the parameter values associated with the SIG

f. The last window of the wizard provides a summary of the inputs. Click Register. It can take several minutes for the application registration to complete.

g. Click Done to exit the wizard after the application instance successfully registers. The application now is in a Registered state, as shown in Figure 4-13.

Figure 4-13 The Livy2 application in the Registered state

h. Select the check box in the Livy2 application row and click Deploy. You do not need to provide a timeout value in the deploy question message box, so click Deploy again.

i. After deployment completes, select the check box in the Livy2 application row and click Start. Click Start again in the start question box.

j. Once started, the application has a green Service Summary and shows the URL for the Livy2 application, as shown in Figure 4-14. The URL is what you need to specify in the IBM Watson Studio Local notebook that connects to Livy by using Sparkmagic.

Figure 4-14 The started Livy2 Application showing the URL in the outputs

On IBM Watson Studio Local, create a notebook to test the connection to the Livy service by completing the following steps:
1. In the first notebook cell, load the Sparkmagic extension by running the following command:
%load_ext sparkmagic.magics
2. In the second notebook cell, create a Livy session to the IBM Watson Machine Learning Accelerator SIG by running the following command. Use the URL that you obtained while creating the Livy application instance.
%spark add -s <session_name> -l python -u <livy_url> -a u -k
3. In the third notebook cell, run Spark code by using the %%spark -l <language> magic, where the language for the Livy session is either Python, Scala, or R. The default language is Python. In our example, we used the Spark Context (sc) to create a distributed collection of 1000 elements and then counted the elements (answer: 1000).
%%spark
sc.parallelize(range(1000)).count()
4. After the application run completes, clean up the Livy session to release the associated resources by running the following command:
%spark cleanup

The test run is shown in Figure 4-15.

Figure 4-15 The IBM Watson Studio Local to IBM Watson Machine Learning Accelerator Livy Notebook Test

After the Livy connection is verified, more sophisticated deep learning (DL) models can be run. Example 4-4 shows the TensorFlow Keras Fashion MNIST model running on IBM Watson Machine Learning Accelerator from an IBM Watson Studio Local notebook. The Fashion MNIST example is described in the TensorFlow basic classification documentation.

The In [#]: prefixes represent the Jupyter Notebook cell numbers in which the code is run.

Example 4-4 Running the TensorFlow Keras model

In [1]: %load_ext sparkmagic.magics

In [2]: %spark add -s test -l python -u http://icpc2.rch.stglabs.ibm.com:8998 -a u -k

In [3]: %%spark
# TensorFlow and tf.keras
import tensorflow as tf
from tensorflow import keras
# Helper libraries
import matplotlib.pyplot as plt
print(tf.__version__)

In [4]: %%spark
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

In [5]: %%spark
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
print("train images shape: ", train_images.shape)
print("train labels count: ", len(train_labels))

In [6]: %%spark
print("test images shape: ", test_images.shape)
print("test labels count: ", len(test_labels))

In [7]: %%spark
train_images = train_images / 255.0
test_images = test_images / 255.0

In [8]: %%spark
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

In [9]: %%spark
predictions = model.predict(test_images)
predictions[0]

In [10]: %spark cleanup

In [11]:
#@title MIT License
#
# Copyright (c) 2017 François Chollet
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.

4.3 Integrating IBM Watson Studio Local with Hortonworks Data Platform

Figure 4-16 illustrates a typical interaction architecture between IBM Watson Studio Local and HDP.

Figure 4-16 IBM Watson Studio Local with Hortonworks

IBM Watson Studio Local provides a Hadoop registration service that eases the setup of an HDP cluster for IBM Watson Studio Local. Using Hadoop registration is the preferred approach because it provides more functions for scheduling jobs as YARN applications.

To integrate IBM Watson Studio Local with HDP, complete the following steps:
1. Install DSXHI (the Hadoop registration gateway service and the Hadoop registration REST service) on one of the edge nodes of the HDP cluster by using the steps that are detailed in IBM Knowledge Center.
2. In the IBM Watson Studio Local browser, click Admin Console → Hadoop Integration.

3. Add the HDP node by using the Add Registration window, which is shown in Figure 4-17.

Figure 4-17 Add Registration window

Adding the registration integrates IBM Watson Studio Local with the HDP nodes for remote Spark job submission.
4. After registration is complete, you can view details about the registered Hadoop cluster, push runtime images to the Hadoop cluster, and work with images. For more information, see Hadoop Integration.
5. After the runtime environment push completes, use an IBM Watson Studio Local notebook to work with the Hadoop cluster. Example 4-5 shows the import of the dsx_core_utils package and the call to dsx_core_utils.get_dsxhi_info().

Example 4-5 Notebook code to get the list of Hadoop clusters

import dsx_core_utils
DSXHI_SYSTEMS = dsx_core_utils.get_dsxhi_info(showSummary=True)

The result of the get_dsxhi_info call shows a list of the Hadoop clusters, type of service (livyspark2), and the image ID names. Example 4-6 shows how the result appeared on our test system.

Example 4-6 List of available Hadoop systems

Available Hadoop systems:
  systemName  LIVYSPARK  LIVYSPARK2                                  imageId
0     pw8002              livyspark2                  dsx-scripted-ml-python2
1     pw8002              livyspark2                  dsx-scripted-ml-python3
2     pw8002              livyspark2      py-tqdm-1.0-dsx-scripted-ml-python2
3     pw8002              livyspark2  tensorflow-1.12-dsx-scripted-ml-python3
4     pw8002              livyspark2    tf-numpy-1.12-dsx-scripted-ml-python3

6. You can configure the Spark session that is used when running on the selected registered Hadoop Integration system. Example 4-7 shows an example configuration.

Example 4-7 Configuring a Spark session

myConfig={
    "queue": "default",
    "driverMemory": "2G",
    "numExecutors": 2
}

7. Run dsx_core_utils.setup_livy_sparkmagic to set up Sparkmagic to connect to the selected Hadoop Integration System, as shown in Example 4-8.

Example 4-8 Setting up Sparkmagic

# Set up sparkmagic to connect to the selected registered HI
# system with the specified configs. **NOTE** This notebook
# requires Spark 2, so you should set 'livy' to 'livyspark2'.

dsx_core_utils.setup_livy_sparkmagic(
    system="pw8002",
    livy="livyspark2",
    imageId="dsx-scripted-ml-python2",
    addlConfig=myConfig)
# (Re-)load spark magic to apply the new configs.
%reload_ext sparkmagic.magics

8. Create a remote Livy session to connect to that Hadoop Integration System by using the following command:
%spark add -s <session_name> -l python -u <livy_url> -a u -k
Example 4-9 shows the command to connect to our test HDP cluster.

Example 4-9 Starting the remote Livy session

session_name = 'tfsess1'
livy_endpoint = 'https://ib-hdprb001.pbm.ihost.com:8443/gateway/p460a10/livy2/v1'
webhdfs_endpoint = 'https://ib-hdprb001.pbm.ihost.com:8443/gateway/p460a10/webhdfs/v1'
%spark add -s $session_name -l python -k -u $livy_endpoint

A successful start shows a table that includes the Livy session ID and the YARN application ID for pyspark. Any cell in the notebook that has %%spark as its first line runs the job remotely on the Hadoop system. You can access any data from the HDFS file system by using functions like spark.read.format() to read the data frames from the file system. Example 4-10 shows a Spark MLlib sample run.

Example 4-10 Running Spark on the Hadoop cluster from the IBM Watson Studio Local notebook

%%spark
from pyspark.ml.classification import LogisticRegression

# Load training data

training = spark.read.format("libsvm").load("data/mllib/sample_libsvm_data.txt")

lr = LogisticRegression(maxIter=10, regParam=0.3, elasticNetParam=0.8)

# Fit the model lrModel = lr.fit(training)

# Print the coefficients and intercept for logistic regression print("Coefficients: " + str(lrModel.coefficients)) print("Intercept: " + str(lrModel.intercept))

9. Clean up the remote Livy session by running the following command, which terminates the session and releases resources back to the remote Hadoop Integration system: %spark cleanup

Image customization

IBM Watson Studio Local uses containers to manage the complete lifecycle of the images of the applications, which provides the complete benefits of containers to the IBM Watson Studio Local environment.

IBM Watson Studio Local supports creating custom images and pushing those images to systems that are registered through Hadoop Integration.

To customize an environment by using the Python tqdm library, complete the following steps. This process can be extended to other packages.
1. Start the environment. From your project home page, use the Environments tab to start a Jupyter with Python 2.7 environment if it is not already running (a green dot indicates that the environment is running).

Figure 4-18 Starting the environment

2. Install the package by using the environment’s terminal.

From your project home page, use the Environments tab to start a terminal shell for the environment that you started in step 1 on page 82. When you are inside the terminal, run the following command to install tqdm:
conda install tqdm -y
When the command completes, you can exit the terminal, as shown in Figure 4-19.

Figure 4-19 Installing the package

3. Save the environment as a custom image. From your project home page, use the Environments tab to save the environment that you edited in step 2 on page 82.

Figure 4-20 Saving the image

4. Select the Hadoop Integration option from the Admin Console, as shown in Figure 4-21.

Figure 4-21 Selecting Hadoop Integration from the admin console

5. Display the details, as shown in Figure 4-22.

Figure 4-22 Displaying the details of the registered Hadoop system

6. Push the run time to the Hadoop cluster, as shown in Figure 4-23.

Figure 4-23 Pushing the run time

7. Select the new image in the notebook. The Hadoop Integration utilities provide a list of images that can be used when connecting to the Hadoop cluster. Select the image to use when connecting to the cluster.

Working with Hadoop data

IBM Watson Studio Local supports the addition of many data sources and remote data sets, including Hive and HDFS for Hadoop clusters. For more information, see Add data sources and remote data sets.

You can shape and refine the data sets by using the Data Refinery.


Chapter 5. Accessing real-time data

This chapter provides information about tools that fetch and work with data in real time.

Not all big data analysis is done by using historical data that is stored in a database. Some analytics and machine learning (ML) jobs are performed by using real-time data while it still is “hot”. This data comes from various sources, such as Internet of Things (IoT) devices and transactional, stock market, and social media feeds, and arrives in many different formats, such as JSON, plain text, databases, CSV files, PDF files, images, videos, and audio files.

The Apache Hadoop project has many tools that collect and manage real-time or streaming data.

The following topics are covered in this chapter:
 Hortonworks DataFlow
 Apache NiFi
 Apache Storm
 Apache Spark Streaming
 Apache Kafka Streams
 Integrating streaming tools with data science tools

5.1 Hortonworks DataFlow

Hortonworks DataFlow (HDF) is a scalable platform for real-time streaming analytics that ingests, processes, and analyzes data from various sources to obtain immediate information and valuable insights. HDF addresses key challenges that enterprises face regarding real-time streaming processing (data-in-motion) at high volume and high scale, data provenance, and data tracking from IoT devices, edge applications, and various models of streaming sources.

The biggest challenge for enterprises that work with streaming data is to obtain data quickly and securely while clearly tracing the data. HDF provides these features through Apache NiFi and MiNiFi, which address these challenges by providing visibility into real-time operation, control, and data flow management.

HDF is a versatile tool that you can use to build data-in-motion apps and put them into production faster.

With the data-at-rest solution that is provided by Hortonworks Data Platform (HDP), HDF provides you with all the tools that you need to create an end-to-end solution for real-time data analytics. With both HDP and HDF, you have the necessary tools to acquire and ingest data in a fast and secure way, the distributed storage and distributed databases to store this massive data, and the tools to filter and clean the data by removing all the dirty and unnecessary data. There are tools to do ML tasks on the data and find patterns in the data. Each step is secured by single sign-on and audited, and there is a way to track the entire application data flow.

Figure 5-1 gives an overview of the HDP components.

Figure 5-1 Hortonworks Data Platform components

Figure 5-2 gives an overview of the HDF components.

Figure 5-2 Hortonworks DataFlow components (including the Streaming Analytics Manager and the Schema Registry)

Like HDP, HDF is a cluster of components from the Hadoop project that is maintained by the Apache Software Foundation. Hortonworks integrates many of these components to talk to each other, which streamlines work and delivers results faster.

This chapter describes each of the main components of HDF for Streaming and Real-time Analytics.

5.1.1 Planning a Hortonworks DataFlow installation

As mentioned earlier, HDF is composed of components from the Apache Software Foundation, many of them related to the Hadoop project. HDF can be installed in an existing HDP environment or on a new HDP cluster. It also can be installed independently, without a traditional Hadoop solution such as HDP.

Installing Hortonworks DataFlow services on an existing Hortonworks Data Platform cluster

If you already have an existing HDP cluster with Kafka and Storm that is protected by Ranger and Knox and you want to use NiFi and MiNiFi on the same cluster, install the HDF Management Pack on the Ambari server, which enables the HDF repositories on the HDP nodes, and then do the push installation of the services (NiFi and MiNiFi).

To install HDF on an existing HDP cluster, complete the following steps:
1. Upgrade Ambari.
2. Upgrade HDP.
3. Install the databases if they do not exist (they are needed for the HDF services).
4. Install the HDF Management Pack on the Ambari server.
5. Update the HDF base URL.
6. Add the HDF services to the existing HDP cluster.
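Step 4 is typically performed from the command line on the Ambari server. The following terminal sketch illustrates the mechanics; the management pack file name, version, and download location are illustrative, so substitute the ones for your HDF release:

# Install the HDF management pack into Ambari (file path is illustrative)
# ambari-server install-mpack \
  --mpack=/tmp/hdf-ambari-mpack-3.1.2.0.tar.gz \
  --verbose
# Restart Ambari so that the new stack definitions are loaded
# ambari-server restart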

Installing HDF services on a new HDP cluster

This scenario applies if you want to create an HDP cluster and then add the HDF services NiFi and MiNiFi.

For this scenario, you must determine the entire HDP cluster size depending on your cluster utilization. For help with this sizing, contact your IBM Support Services Representative (IBM SSR) or Hortonworks sales representative.

To install HDF services on a new HDP cluster, complete the following steps:
1. Install Ambari.
2. Install the databases (they are needed for the HDP and HDF services).
3. Install and create an HDP cluster by using the Ambari interface.
4. Install the HDF Management Pack on the Ambari server.
5. Update the HDF base URL.
6. Add the HDF services to the new HDP cluster.

Installing an HDF cluster

Consider a scenario where you want to install the entire HDF stack without an HDP cluster. This stack has the following components:
- Operations:
  – Ambari
  – ZooKeeper
  – Schema Registry
  – NiFi Registry
- Security: Ranger
- Streaming and integration:
  – NiFi
  – MiNiFi
  – SAM
  – Storm
  – Kafka

To install the HDF cluster, complete the following steps:
1. Install Ambari.
2. Install the databases (they are needed for the HDF services).
3. Install the HDF Management Pack on the Ambari server.
4. Install an HDF cluster by using Ambari.

5.2 Apache NiFi

There are many open source tools that you can use to ingest data from a plain text or log file into the Hadoop file system, query data from a relational or NoSQL database and then ingest it into the Hadoop file system, or move data from almost any source to almost any destination, such as loading a web server log into database tables instead of into the Hadoop Distributed File System (HDFS).

Apache NiFi automates the flow of data between the components of a system to process and distribute data, and it secures and tracks the data in a highly available, loss-tolerant, low-latency, and high-throughput environment.

By using a web-based interface and a flow-based programming model, it is possible to create a flow of data that delivers an app in a few hours or even minutes.

Apache NiFi is part of the HDF and can be installed as a service from the HDF repository. NiFi also can be installed as a stand-alone application.

5.2.1 Adding the Apache NiFi service to an HDF cluster

To add the Apache NiFi service to an HDF cluster or an HDP+HDF cluster, complete the following steps:
1. Access and log in to the Ambari interface.
2. On the Dashboard View, select Actions → Add Service.
3. Select NiFi and NiFi Registry in the Add Service wizard window and click Next.
4. Select where the NiFi and NiFi Registry master services will be installed.
5. In the Assign Slave and Client window, select where the NiFi certificate authority will be installed.
6. In the Customize Service window, set Encrypt Configuration Master Key Password and Sensitive property values encryption password for the NiFi configuration, and Encrypt Configuration Master Key Password for the NiFi Registry configuration. (Both passwords must have 12 or more characters.)

The next window is a summary window. Click Install, and the installation of Apache NiFi on the HDF and HDP cluster begins.

5.2.2 Working with Apache NiFi

NiFi enables a consumer to receive, route, transform, sort, and send data by using a drag-and-drop GUI. An Apache NiFi data flow can be changed while the system is running, for example, routing data and loading it into a new database that previously was not part of the data flow.

Figure 5-3 shows a simple flow of data on the NiFi GUI.

Figure 5-3 Data flow on the NiFi GUI

The data flow in NiFi is created from building blocks that are called processors. Each processor is responsible for one task, such as connecting to a database and running a query, fetching data from an FTP server, loading fields from a JSON file into variables, or inserting a previously loaded variable into another database.

5.2.3 Integrating Apache NiFi with a data science tool environment

NiFi can work with almost any kind of data source and data format. The data can come from and go to an FTP server, an HTTP source, or a UDP stream of data, and NiFi can connect to a relational database to extract, transform, and load (ETL) data.

IBM Watson Studio Local works with Jupyter Notebook to connect to various services, such as an Apache Spark cluster running remotely on an IBM PowerAI cluster or a Hadoop cluster.

Using Apache NiFi, you can submit jobs remotely to a Spark Livy cluster or create a processor to process part of the data flow directly on the Spark cluster.
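As a sketch of what such a remote submission looks like at the REST level, the following Python fragment posts a batch job to the Livy /batches endpoint and polls its state. The Livy host name and the HDFS path of the application are illustrative assumptions:

import json
import time

import requests

# illustrative endpoint; Livy listens on port 8998 by default
livy_url = "http://livy.example.com:8998"

# submit a PySpark application that is already staged on HDFS (illustrative path)
payload = {"file": "hdfs:///apps/spark-streaming-101.py", "name": "nifi-submitted-job"}
resp = requests.post(livy_url + "/batches",
                     data=json.dumps(payload),
                     headers={"Content-Type": "application/json"})
batch_id = resp.json()["id"]

# poll the batch state until Livy reports a terminal state
while True:
    state = requests.get("{0}/batches/{1}/state".format(livy_url, batch_id)).json()["state"]
    print("batch", batch_id, "state:", state)
    if state in ("success", "dead", "killed"):
        break
    time.sleep(5)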

The most common approach to integrate NiFi with an external data science tool such as IBM Watson Studio Local is to create a normal data flow into NiFi and then ingest the data into HDFS or Hive, or a Kafka topic, and then use this data directly on a Jupyter Notebook or other tool.
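For example, after NiFi lands data in HDFS, a notebook session can read it back with a few lines of PySpark. This minimal sketch assumes a configured Spark environment and uses the HDFS output directory from the proof of concept later in this chapter:

from pyspark.sql import SparkSession

# create (or reuse) a Spark session inside the notebook
spark = SparkSession.builder.appName("ReadNiFiOutput").getOrCreate()

# read the merged text files that NiFi wrote to HDFS (path from section 5.6.5)
df = spark.read.text("hdfs:///user/user1/output-101")
df.show(5, truncate=False)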

A data science or Hadoop application is built by using many tools that communicate with each other. This chapter describes some of these tools and presents a use case.

5.3 Apache Storm

Apache Storm is a scalable, fault-tolerant, and distributed computing system that processes real-time data flows and integrates with Apache Hadoop.

The jobs on Storm are called topologies. Unlike the traditional MapReduce jobs that are processed on Apache Hadoop, which work with batch processing, topologies process streams. Topologies are composed of various components that are grouped in a directed acyclic graph (DAG). Each component of the DAG consumes one or more data flows and propagates one or more new flows of data.

Apache Storm also offers guaranteed data processing: data that was not processed initially can be reprocessed, so no data is lost.

There are many use cases for Apache Storm, such as ETL replacement, real-time analytics, and dynamic ML computing.

A Storm topology has two main components or entities, which are called spouts and bolts. Together, they produce and consume a stream of tuples. Here are the definitions of these terms (a minimal Python bolt sketch follows the definitions):

Tuple    The data structure (such as an integer, float, string, or array) that is used by the application.
Spout    The entity that is responsible for connecting to the source of the data streams. It has two modes:
         – Unreliable: “Fire and forget”.
         – Reliable: Retry if a tuple fails.
Bolt     Consumes the streams that flow from one or more spouts and produces new streams. A bolt entity can run functions, filter the input tuples, make joins, query a database, and more.
Stream   The endless flow of tuples that the Storm topology serializes and passes to the next bolt.
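The following minimal sketch shows what a bolt can look like in Python. It uses the third-party streamparse library, which is not part of HDF; the topology wiring and the upstream spout are omitted:

from streamparse import Bolt


class WordCountBolt(Bolt):
    """Counts words that arrive as single-field tuples from an upstream spout."""

    def initialize(self, conf, ctx):
        self.counts = {}

    def process(self, tup):
        word = tup.values[0]
        self.counts[word] = self.counts.get(word, 0) + 1
        # emit a new (word, running count) tuple for downstream bolts
        self.emit([word, self.counts[word]])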

Superficially, the topology structure is similar to a MapReduce job. The main difference is that a Storm topology works with a data flow and runs until it is stopped, while a MapReduce job has a defined end.

Figure 5-4 shows the basic topology of a Storm application.

Figure 5-4 Basic Storm application topology

5.3.1 Working with Apache Storm

Apache Storm supports multiple programming languages, that is, each component can be written in a different language. The main language for Storm is Clojure, a general-purpose programming language for multithreaded programming that runs on top of a Java virtual machine (JVM). Storm also supports other languages, such as Ruby, Python, and Fancy, through adapter libraries. Any bolt that is not written in Clojure runs as a subprocess, which is not the most efficient option.

Unlike Apache NiFi, Storm does not use a GUI, and all the spouts, bolts, and topologies must be written in code. This task can be complicated for someone who is not familiar with Java or another programming language, but the gain in performance from using Apache Storm for data-in-motion and larger data flows is significant.

5.3.2 Integrating Apache Storm with a data science tool environment

Apache Storm integrates well with many components of the Hadoop project. This integration is done by a spout, which captures data from a source; a bolt, which can process the data flowing from a Hadoop component; and a Storm topology, which ingests the data into another Hadoop component.

Here are some of the components that are supported by Apache Storm:
- Apache Kafka
- Apache HBase
- Apache HDFS
- Apache Hive
- Apache Solr
- Apache Cassandra
- Apache RocketMQ
- Java Database Connectivity (JDBC) for general database connections
- Message Queuing Telemetry Transport (MQTT) for small messages that are frequently used for IoT devices
- Redis
- Elasticsearch
- MongoDB
- Kinesis
- Druid
- YARN
- Docker
- Mesos
- Kubernetes

Because Apache Storm can integrate with most Hadoop components, it also can integrate directly with any data science tool. For example, an ML model running in a Python notebook can consume messages from a Kafka topic that is loaded by a bolt from a Storm topology.

5.4 Apache Spark Streaming

Spark Streaming is an extension of the core Spark API that permits scalable stream processing with high performance and fault tolerance. The data can be ingested from many data sources and processed by complex algorithms that are built from high-level functions, such as map, reduce, join, and window. The processed data can be stored on disk (HDFS), written into database tables, or pushed to online dashboards. It is also possible to apply SparkML ML algorithms or even run traditional Spark graph analysis on this live data.

Figure 5-5 shows the flow of the Spark Streaming engine.

Figure 5-5 Flow of the Spark Streaming engine

Spark Streaming receives the input data stream, breaks it into small batches that are processed by the Spark engine, and then generates the final output stream in batches.

Inside Spark Streaming, the continuous data flow is represented by a DStream, the basic high-level abstraction for flowing data. A DStream can be created from a data tool like Kafka, NiFi, or Flume, or it can be derived from earlier DStreams. Internally, a DStream is represented as a sequence of resilient distributed data sets (RDDs).

You can write a Spark Streaming program by using Java, Python, or Scala. Spark and Spark Streaming programs are often written in Scala because it is the framework’s native language and not all APIs that are implemented for Scala are available in Python.

Besides the DStream, there is another important component to be aware of for Spark Streaming, which is called the StreamingContext. It is similar to the SparkContext that is used in Spark. The StreamingContext is the starting point for any streaming task that uses Spark Streaming, and it contains the methods to ingest a continuous flow of data into a Spark Streaming application.

The Spark Streaming API comes with many methods to process streaming data, which are similar to the operations that are used on RDDs, such as map, flatMap, filter, count, reduce, groupByKey, reduceByKey, sortByKey, and join. Spark Streaming also provides an extra API for window-based and state-based processing. These additional methods include window, countByWindow, reduceByWindow, reduceByKeyAndWindow, and updateStateByKey.
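As a sketch of the windowed API, the following fragment computes word counts over a sliding window. It assumes a pairs DStream of (word, 1) tuples and a StreamingContext named ssc, like the ones in the word count example that follows in 5.4.1; the checkpoint directory is illustrative and is required for window operations that use an inverse function:

# checkpointing is required for incremental window operations (illustrative path)
ssc.checkpoint("hdfs:///tmp/streaming-checkpoint")

windowedCounts = pairs.reduceByKeyAndWindow(
    lambda x, y: x + y,  # add the counts of records entering the window
    lambda x, y: x - y,  # subtract the counts of records leaving the window
    30,                  # window length: the last 30 seconds of data
    10)                  # slide interval: recompute every 10 seconds
windowedCounts.pprint()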

5.4.1 Working with Apache Spark Streaming

Spark Streaming is a framework API that requires a programming language like Java, Scala, or Python to write programs that work with data-in-motion.

The Spark Streaming API is more accessible than the API that is offered by Apache Storm because it does not require a significant amount of effort to learn if you are familiar with the traditional Spark programming paradigm.

Here is a quick example of a Spark Streaming program that receives data from a server listening on a TCP port and then processes this data. This example was taken from the Apache Spark website.
1. Import SparkContext and StreamingContext, which are the entry points for any streaming method. Instantiate SparkContext with two execution threads, and instantiate StreamingContext with the two execution threads coming from the SparkContext object. Define a batch interval of 1 second. Example 5-1 shows the Python code that is needed to accomplish this step.

Example 5-1 Python code: Importing and instantiating SparkContext and StreamingContext

from pyspark import SparkContext
from pyspark.streaming import StreamingContext

sc = SparkContext("local[2]", "NetworkWordCount")
ssc = StreamingContext(sc, 1)

2. With StreamingContext defined as ssc, create a DStream that represents the streaming data that comes from a TCP source that will be defined. Example 5-2 shows the Python code that is needed to accomplish this step.

Example 5-2 Python code: Creating a DStream

lines = ssc.socketTextStream("server.ibm.com", 9999)

3. The lines DStream object is the stream of data that is received from the defined server. Each record coming from this DStream is a line of text. Split these lines into words. (You are making a basic word count program with data-in-motion.) The delimiter character to split words is a space. Example 5-3 shows the Python code that is needed to accomplish this step.

Example 5-3 Python code: Splitting the lines into words

words = lines.flatMap(lambda line: line.split(" "))

4. The flatMap operation from the DStream object creates a DStream with multiple records from each record coming from the source DStream. Each line is split into multiple words and generates the new DStream that is called words. The next step is to count these words. Example 5-4 shows the Python code that is needed to accomplish this step.

Example 5-4 Python code: Counting the words

pairs = words.map(lambda word: (word, 1))
wordCounts = pairs.reduceByKey(lambda x, y: x + y)

wordCounts.pprint()

5. The words DStream is mapped to tuples of (word, 1) in a new DStream that is called pairs. The pairs DStream is then reduced by key to get the count of each word for the defined batch interval (ssc = StreamingContext(sc, 1)), and the result is loaded into the new DStream wordCounts. The wordCounts.pprint() function prints the counts of each word that is generated during the time interval.

6. The preceding code only sets up the computation. To start the Spark Streaming engine and make the program start capturing the data flowing from the defined TCP server, call the StreamingContext start() function. Example 5-5 shows the Python code that is needed to accomplish this step.

Example 5-5 Python code: Starting the StreamingContext

ssc.start()
ssc.awaitTermination()

When the program runs, any data coming from the server on the defined port is processed by Spark Streaming and then printed on the screen after the defined time window, as shown in Example 5-6.

Example 5-6 Program output

-------------------------------------------
Time: 2018-12-19 11:20:20
-------------------------------------------
('spark', 19)
('streaming', 19)
('count', 5)
('ibm', 3)
('word', 3)

5.5 Apache Kafka Streams

To understand Kafka Streams, you must first learn about Apache Kafka.

Apache Kafka is a simple but highly distributed and performant message broker system. The Kafka configuration and topology are simple: producers send records or messages to the Kafka cluster, which manages those messages and delivers them to the consumers.

Figure 5-6 shows the basic architecture of Apache Kafka.

Figure 5-6 Apache Kafka architecture

The key abstraction of Kafka is the topic. Producers publish records to a topic, and consumers read from one or more topics. Each record consists of a key, a value, and a time stamp.

A Kafka cluster maintains each topic as partitioned logs. Each record persists in a slot of a partition, and each new record is appended to the end of the partition. Therefore, a partition is an ordered and immutable sequence of records. Each record in the partition is assigned a sequential ID number that is called an offset, which uniquely identifies the record within the partition.

Figure 5-7 shows the anatomy of a Kafka topic.

Figure 5-7 Anatomy of a Kafka topic

The Kafka cluster persists all published records in the partitions, whether or not they are consumed by a consumer. The records remain in the topic for a defined retention period. After the retention period expires, the records are discarded.
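The retention period can be tuned per topic. As a sketch, the following terminal command (using the ZooKeeper server name from the examples later in this chapter) sets a seven-day retention on a topic; the exact tooling options can vary by Kafka version:

# Set retention.ms to 7 days (604,800,000 ms) for one topic
# ./bin/kafka-configs.sh --zookeeper zookeeperserver.pbm.ihost.com:2181 \
  --entity-type topics --entity-name kafka-topic-101 \
  --alter --add-config retention.ms=604800000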

Consumers control the offset position in their topics. Usually, a consumer advances its offset linearly as it reads records from the topic, but records can be consumed in any order. This flexibility to change the offset permits a consumer to reprocess data from a past record. Figure 5-8 shows how messages are organized and consumed in a Kafka topic partition.

Figure 5-8 The anatomy of a Kafka topic partition
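As a sketch of this offset control, the following Python fragment uses the kafka-python client to rewind a consumer to the beginning of a partition and reprocess its records; the broker address and topic name match the examples later in this chapter:

from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="ib-hdprb001.pbm.ihost.com:6667")

# manually assign one partition and rewind to the earliest retained offset
tp = TopicPartition("kafka-topic-101", 0)
consumer.assign([tp])
consumer.seek(tp, 0)

for record in consumer:
    print(record.offset, record.value)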

Because producing to and consuming from topics are lightweight operations, Kafka can sustain very high message rates (billions of messages per day) in a cluster of just a few brokers.

Kafka Streams is a client library that connects to an Apache Kafka cluster to build applications and microservices. The input and output data is stored in Kafka cluster topics. Because Kafka Streams uses an existing Kafka cluster infrastructure, it does not require a separate processing cluster.

Figure 5-9 shows how Kafka Streams integrates into a Kafka flow.

Figure 5-9 Kafka Streams integration into a Kafka flow

The topology of a Kafka Streams program is similar to an Apache Storm topology. Both are composed of processing nodes that look like the spouts and bolts of a Storm program. Each processing node can have one of three different roles:
- Source processor: The processor that is responsible for consuming the records of a topic and sending them to another processor. The source processor is always the first processor of a Streams topology.
- Stream processor: The processor that is responsible for computing on the data or records that come from another stream processor or from a source processor. This is where the programming logic is. It is normal to have multiple stream processors in a single topology.
- Sink processor: The final processor in a Kafka Streams topology. It is responsible for receiving the data from a stream processor and writing the records into a topic or to another destination (HDFS, a database, or others).

Figure 5-10 shows the Kafka Streams components and application flow.

Figure 5-10 Kafka Streams components and application flow

5.6 Integrating streaming tools with data science tools

This section shows how you can create an application that integrates some HDP and HDF components into a real-time data stream, which can be connected to any data science tool, such as IBM Watson Studio Local or IBM Watson Machine Learning Accelerator.

5.6.1 Application overview

The application has the following functions:
- Creates a TCP service to stream data on port 9999.
- Listens on TCP port 9999 for a stream of data by using Spark Streaming.
- Transfers the data, in batches, to a Kafka topic.
- Consumes the Kafka topic in real time by using NiFi.
- Merges 10,000 messages into a single large file by using NiFi.
- Saves those files into HDFS by using NiFi.

Figure 5-11 shows the flow of the proof of concept (PoC) application that integrates some Hadoop components.

Figure 5-11 Proof of concept application overview

5.6.2 Configuring a Kafka topic

The sample application uses a Kafka topic as a message broker to retain the flow of data that is produced by Spark Streaming.

This section covers the creation and testing of a Kafka topic by using the Kafka command-line interface (CLI). To do so, complete the following steps:
1. Create a topic that is named kafka-topic-101 by using the --create flag on the Kafka CLI, as shown in Example 5-7.

Example 5-7 Linux terminal: Creating a Kafka topic

# cd /usr/hdp/current/kafka
# ./bin/kafka-topics.sh --create --zookeeper zookeeperserver.pbm.ihost.com:2181 \
  --replication-factor 1 --partitions 1 --topic kafka-topic-101
Created topic "kafka-topic-101".

2. To check the topic, use the --describe flag on the Kafka CLI, as shown in Example 5-8.

Example 5-8 Linux terminal: Listing the Kafka topic information

# ./bin/kafka-topics.sh --describe --zookeeper zookeeperserver.pbm.ihost.com:2181 \
  --topic kafka-topic-101
Topic:kafka-topic-101   PartitionCount:1   ReplicationFactor:1   Configs:
        Topic: kafka-topic-101   Partition: 0   Leader: 1001   Replicas: 1001   Isr: 1001

3. To test whether the Kafka topic is working, start a producer, send some messages, and then start the consumer to read all the messages from the topic by using the --from-beginning flag, as shown in Example 5-9 and Example 5-10.

Example 5-9 Linux terminal: Producing messages on the Kafka topic

# ./bin/kafka-console-producer.sh --broker-list ib-hdprb001.pbm.ihost.com:6667 \
  --topic kafka-topic-101
>message one
>message two
>message three
>it's working good !
>

Example 5-10 Linux terminal: Consuming messages on a Kafka topic

# ./bin/kafka-console-consumer.sh --bootstrap-server ib-hdprb001.pbm.ihost.com:6667 \
  --topic kafka-topic-101 --from-beginning
message one
message two
message three
it's working good !
Processed a total of 4 messages

5.6.3 Starting a streamer by using the nmap-ncat utility

The nmap-ncat utility is a feature-packed networking utility that reads and writes data across networks from the command line. If you do not have the nmap-ncat utility on your server, search for the rpm package in the Red Hat yum repository, as shown in Example 5-11.

Example 5-11 Linux terminal: yum

# yum search nmap-ncat
Loaded plugins: product-id, search-disabled-repos, subscription-manager
=============================== N/S matched: nmap-ncat ===============================
nmap-ncat.ppc64le : Nmap's Netcat replacement
  Name and summary matches only, use "search all" for everything.

Complete the following steps:
1. For your real-time data stream application, use /var/log/messages as your streamer file. Stream this file by running the tail -f command and redirecting the output to a TCP port by using the ncat utility, as shown in Example 5-12.

Example 5-12 Linux terminal: Starting ncat on port 9999 with the output of /var/log/messages

# tail -f /var/log/messages | nc -lk 9999

2. The ncat utility is transmitting the /var/log/messages file in real time to TCP port 9999. Check the service by running netstat, and test the data stream by running telnet, as shown in Example 5-13 and Example 5-14.

Example 5-13 Linux terminal: Listing the active TCP ports

# netstat -nlp | grep 9999
tcp    0    0 0.0.0.0:9999    0.0.0.0:*    LISTEN    12369/nc
tcp6   0    0 :::9999         :::*         LISTEN    12369/nc

The netstat utility shows that a process of PID 12369 is listening on port 9999 by using the nc command.

Example 5-14 Linux terminal: Testing the TCP service with telnet

# telnet localhost 9999
Trying ::1...
Connected to localhost.
Escape character is '^]'.
Dec 19 14:47:28 ib-hdprb001 xinetd[17566]: EXIT: hdfstransparency status=0 pid=11475 duration=30(sec)
Dec 19 14:47:28 ib-hdprb001 xinetd[17566]: START: hdfstransparency pid=11591 from=10.0.0.79
Dec 19 14:47:28 ib-hdprb001 xinetd[17566]: EXIT: hdfstransparency status=0 pid=11476 duration=30(sec)
Dec 19 14:47:28 ib-hdprb001 xinetd[17566]: START: hdfstransparency pid=11592 from=10.0.0.76

5.6.4 Running the Spark Streaming engine

In our application, Apache Spark Streaming collects the flow of data coming from TCP port 9999 of the Streamer server and processes this data by formatting, sorting, and counting it by the type of message that is logged. This formatted data is sent (published) to the previously created Kafka topic and consumed by the Apache NiFi processor.

For this task, use Spark Streaming over Python (pyspark).

Complete the following steps:
1. In the first lines of the Python program, declare the libraries that you want to use, as shown in Example 5-15.

Example 5-15 Python code: Declaring libraries

from pyspark.streaming import StreamingContext
from pyspark import SparkContext
from kafka import KafkaProducer

2. Define a function to write your messages to your previously created Kafka topic, which is named kafka-topic-101, as shown in Example 5-16.

Example 5-16 Python code: Defining a function

def send_to_kafka_topic(content):
    messages = content.collect()
    for line in messages:
        producer.send('kafka-topic-101', str(line))
    producer.flush()

The send_to_kafka_topic function is called when a new cycle occurs on the StreamingContext.
3. Define the Spark objects in your code to communicate with the Spark server and the Kafka broker server, as shown in Example 5-17.

Example 5-17 Python code: Defining Spark objects

# define the Kafka producer connecting to our broker server
producer = KafkaProducer(bootstrap_servers='ib-hdprb001.pbm.ihost.com:6667')

# define SparkContext and StreamingContext
sc = SparkContext(master="local[*]", appName="LogAnalysis-101")
ssc = StreamingContext(sc, 10)

The LogAnalysis-101 string in the SparkContext definition is the name of the job that is submitted to Spark. In this case, you are running the code on a master node of the HDP and Spark cluster, which is why the master points to local[*]. If you want to connect to a remote Spark cluster, such as an IBM Watson Machine Learning Accelerator Spark cluster, use master="spark://remote.master.server:7077" instead. For the StreamingContext definition, the second argument is the batch interval that is processed on Spark Streaming. Spark listens to messages for 10 seconds before it starts processing them, and repeats this cycle after each 10-second interval.
4. Define the DStream. In this example, we name it stream, and it defines the connection to the server that is sending messages. In this example, the ncat server is already configured, as shown in Example 5-18.

Example 5-18 Python code: Defining the DStream

stream = ssc.socketTextStream("ib-hdprb001.pbm.ihost.com", 9999)

stream is our Spark Streaming DStream that receives the flow of data every 10 seconds. The .foreachRDD function applies a function to each RDD (each batch of records) that the DStream generates. In this case, each record is a line of the /var/log/messages file coming from the Streamer server on port 9999.
5. Run the send_to_kafka_topic function to pass the messages from our DStream RDDs as arguments, as shown in Example 5-19. The send_to_kafka_topic function writes those messages to the Kafka topic.

Example 5-19 Python code: Running the send_to_kafka_topic function

stream.foreachRDD(send_to_kafka_topic)

6. Start the Spark Streaming job, as shown in Example 5-20.

Example 5-20 Python code: Starting the Spark Streaming job

ssc.start()
ssc.awaitTermination()

This code must be run by using the syntax that is shown in Example 5-21.

Example 5-21 Linux terminal: Submitting a Spark job by using a Python program

# spark-submit \
  --jars spark-streaming-kafka-0-8-assembly_2.11-2.4.0.jar \
  spark-streaming-101.py

Figure 5-12 shows the jobs that are submitted to the Spark cluster. Note the LogAnalysis-101 entry.

Figure 5-12 Spark History Server

5.6.5 Merging data by using NiFi and saving it on HDFS

This section describes how to use NiFi to retrieve and use messages from a Kafka topic, and then send them to a data lake through an HDFS connector.

Apache NiFi has a GUI that you can use to create big data applications by using processors as building blocks that you connect.

In this example, you create the NiFi application by using only three processors that do the following tasks:
- Consume messages from the Kafka topic.
- Wait for 10,000 messages to accumulate, and then create a large file.
- Unload the file into HDFS.

To create these processors, complete the following steps:
1. The first processor is called GetKafka. Go to the NiFi interface, click the processor icon, and drag it to the center of the NiFi project, as shown in Figure 5-13.


Figure 5-13 NiFi GUI: Adding a processor to the project

2. A window opens. Select the processor that you want to create in the project. Search for GetKafka and click Add, as shown in Figure 5-14.


Figure 5-14 NiFi GUI: Selecting the GetKafka processor

3. Add the next processor by dragging the processor icon to the center of the project and selecting the processor that is named MergeContent, as shown in Figure 5-15.

Figure 5-15 NiFi GUI: Selecting the MergeContent processor

4. Add the last processor, which is named PutHDFS, as shown in Figure 5-16.


Figure 5-16 NiFi GUI: Selecting the PutHDFS processor

The NiFi application project should look like Figure 5-17.

Figure 5-17 NiFi GUI: Application overview

5. To connect the processors, hover the mouse over a processor until a dark circle with an arrow appears. Drag the arrow to the next processor to create the flow of the program. Link GetKafka to MergeContent, and then link MergeContent to PutHDFS.

6. For every connection that is made between processors, a window opens and asks which content should be passed from one processor to another. For the link between GetKafka and MergeContent, select Success. For the link between MergeContent and PutHDFS, select Merged. After the processors are linked, the project application looks like Figure 5-18.

Figure 5-18 NiFi GUI: Linked processors

7. Almost every NiFi processor must be configured. To configure your processors, double-click a processor and a window with the properties of the processor opens. For this project, complete the following configurations:
– GetKafka:
  • The Zookeeper server
  • The Topic Name
– MergeContent: Minimum Number of Entries
– PutHDFS:
  • Hadoop Configuration Resources
  • Directory
Figure 5-19 on page 111, Figure 5-20 on page 111, and Figure 5-21 on page 112 show the configuration for each of the three processors.

Figure 5-19 NiFi GUI: GetKafka processor properties

Figure 5-20 NiFi GUI: MergeContent processor properties

Figure 5-21 NiFi GUI: PutHDFS processor properties

8. You defined which content is passed from one processor to another, but the processors must be configured further. In the Configure Processor window, click the Settings tab, and select the Automatically Terminate Relationships check boxes for the previously non-selected content, as shown in Figure 5-22 on page 113 and Figure 5-23 on page 114.

For MergeContent, select:
– Failure
– Original

Figure 5-22 NiFi GUI: MergeContent processor settings

For PutHDFS, select:
– Failure
– Success

Figure 5-23 NiFi GUI: PutHDFS processor settings

9. After every processor is configured, start the first processor, GetKafka, to see the messages that populate the queue between the GetKafka and MergeContent processors. To start a processor, right-click it and select Start, as shown in Figure 5-24.

Figure 5-24 NiFi GUI: Starting the first data flow processor

After a few seconds, the messages start to flow from the GetKafka processor to the next processor, which is not started yet. The messages wait in a queue between the processors, as shown in Figure 5-25.

Figure 5-25 NiFi GUI: Checking queued messages

10. Start the MergeContent processor, which takes the 10,000 messages that come from GetKafka and merges them into one large file. After MergeContent starts and receives the messages from the GetKafka processor, the merged file appears in the queue between MergeContent and PutHDFS, as shown in Figure 5-26.

Figure 5-26 NiFi GUI: Merged file on the queue between the MergeContent and PutHDFS processors

11. Start the last processor, which unloads the merged file into HDFS. This file can be used by another application for an ML task, analytics, or another type of insight.
12. To check the file that was stored on HDFS, run the command that is shown in Example 5-22.

Example 5-22 Linux terminal: Listing data that is stored on HDFS

# hadoop fs -ls /user/user1/output-101
Found 1 items
-rw-r--r--   3 nifi hadoop     978554 2018-12-21 15:19 /user/user1/output-101/1999736852385173

5.6.6 Conclusion of the data stream integration proof of concept

For this PoC, you integrated the following components from Apache Hadoop:
- The /var/log/messages file and the nmap-ncat utility
- Spark / pyspark
- Spark Streaming
- Kafka
- NiFi
- HDFS

You did not transform the data during the Spark stage. The Spark and Spark Streaming code only listened to the messages flowing from a server and then loaded those messages into a Kafka topic. Many transformations can be done on the data before publishing it to Kafka, and many sources can be used to obtain the data instead of a dummy server streaming the /var/log/messages content. Apache Spark Streaming is also useful for consuming messages from a Kafka topic, transforming those messages, and publishing them to another Kafka topic or even to another Kafka server.
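As a small sketch of such a transformation, the following fragment reuses the stream DStream and the send_to_kafka_topic function from 5.6.4 and publishes only the log lines that mention errors:

# keep only the lines that mention errors before publishing them to Kafka
errors = stream.filter(lambda line: "error" in line.lower())
errors.foreachRDD(send_to_kafka_topic)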

Many configurations can be made by connecting the Apache Hadoop components that are provided by HDP and HDF. These components can be, and commonly are, combined with other non-Hadoop components, such as ML and deep learning (DL) tools, relational and non-relational databases, and web servers.


Appendix A. Additional information

This appendix provides additional information about the lab system setup that was used to create this publication. It is suitable for test work, but does not represent a typical client implementation.

The following topics are covered in this appendix:
- System topology
- Software levels

System topology

The basic system configuration is shown in Figure A-1.

Figure A-1 System topology

Software levels

Table A-1 lists the software levels of all the software stacks that were used in this publication's environment on the Power Architecture.

Table A-1 Software levels recommendations

Software                                       | OS level         | NVIDIA GPU driver | Apache Spark version | NVIDIA CUDA | NVIDIA CUDA Deep Neural Network (cuDNN) | IBM Spectrum Scale
Hortonworks Data Platform (HDP) 3.0            | RHEL 7.5 (Maipo) | 410.72            | 2.3.0.2.6.5.0-292    | 10          | 7.3.1                                   | 5.0.2-0
Hortonworks DataFlow (HDF) 3.1.2.0             | RHEL 7.5 (Maipo) | 410.72            | 2.3.0.2.6.5.0-292    | 10          | 7.3.1                                   | 5.0.2-0
IBM Watson Machine Learning Accelerator V1.1.2 | RHEL 7.5 (Maipo) | 410.72            | 2.3.0.2.6.5.0-292    | 10          | 7.3.1                                   | 5.0.2-0
IBM Data Science Experience (IBM DSX) Local V1.2.1.0 (20181110_1424) ppc64le | RHEL 7.5 (Maipo) | 410.72 | 2.3.0.2.6.5.0-292 | 10 | 7.3.1 | 5.0.2-0
H2O Driverless AI 1.4.2                        | RHEL 7.5 (Maipo) | 410.72            | 2.3.0.2.6.5.0-292    | 9.0         | 7.3.1                                   | 5.0.2-0


Appendix B. Installing an IBM Watson Machine Learning Accelerator notebook

The examples in this appendix help you customize a notebook package, add a notebook package, create a Spark Instance Group (SIG) that includes the notebook package, start a notebook service, and test a notebook.

The base notebook package and some sample notebooks are available on GitHub.

The following topics are covered in this appendix:
- Customizing a notebook package
- Adding a notebook package
- Creating a Spark Instance Group with a notebook
- Creating notebooks for users
- Testing notebooks

Customizing a notebook package

In this section, we customize a notebook package to include Anaconda and the sparkmagic module. That module is used in a Jupyter Notebook to connect to a Livy server running on a Hadoop cluster. Once connected, Spark jobs can be run remotely.

Start by downloading the PowerAI-1.5.4.1-Notebook-Base.tar.gz notebook package from GitHub to the workstation where you are running the IBM Watson Machine Learning Accelerator Cluster Management Console.

The PowerAI-1.5.4.1-Notebook-Base.tar.gz package includes the scripts that are used to start, stop, and monitor the notebook server.

This section shows how to update the PowerAI-1.5.4.1-Notebook-Base.tar.gz package to create a PowerAI-1.5.4.1-Notebook-HI.tar.gz package. A newer version of the notebook package might be available, such as Version 1.5.4.2. If so, use the newer version in the following instructions by replacing V1.5.4.1 with the latest version.
1. Create a custom working directory and extract the PowerAI-1.5.4.1-Notebook-Base.tar.gz package into it by running the following commands:
   mkdir custom
   tar -C custom -xzvf PowerAI-1.5.4.1-Notebook-Base.tar.gz
2. Download the Anaconda repo file by running the following commands:
   cd custom/package
   wget https://repo.continuum.io/archive/Anaconda3-5.2.0-Linux-ppc64le.sh
   cd ..
3. (Optional) Update the build.version file to include today's date. This example shows the edits by using the vi editor:
   vi build.version (or use an editor of your choice)
   a. Enter insert mode (a) and edit the build date and number. For example:
      Build Date: "Jan 17 2019"
      Build Number: 20190117
   b. Press Esc and save the update (:wq).
4. (Optional) Add more pip packages of interest for your custom notebook by updating the added_packs file:
   vi scripts/added_packs
   a. Enter insert mode (a) and scroll down to the bottom to add more packages (one per line):
      theano
      keras
      pandas
      sparkmagic
      tensorvision
   b. Press Esc and save the update (:wq).
5. From inside the custom directory, compress the updated custom package (HI = Hadoop Integration) by running the tar command:
   tar -czvf ../PowerAI-1.5.4.1-Notebook-HI.tar.gz .
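After the customized package is deployed and a notebook is running, the sparkmagic package that was added in step 4 can be used to open a Livy session. The following sketch shows the general shape; the Livy endpoint is illustrative, and the exact magic options depend on the sparkmagic version:

# In a Jupyter Notebook cell: load the sparkmagic extension and add a Livy session
%load_ext sparkmagic.magics
%spark add -s demo -l python -u http://livy.example.com:8998

# In a later cell: run code remotely on the Hadoop cluster through Livy
%%spark
print(sc.version)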

In the next section, we add this notebook package to IBM Watson Machine Learning Accelerator.

Adding a notebook package

This section describes how to add the notebook package that was customized in “Customizing a notebook package” on page 122 into IBM Watson Machine Learning Accelerator.

To add the notebook package, complete the following steps:
1. Using the management console, go to the notebook management user interface by selecting Workload → Spark → Notebook Management, as shown in Figure B-1.

Figure B-1 Notebook Management menu

2. Click Add, as shown in Figure B-2.

Figure B-2 Notebook Management user interface

3. Complete the details for the PowerAI-1.5.4.1-Notebook-HI.tar.gz notebook and click Add, as shown in Figure B-3:
a. Give it a name (PowerAIHI) and a version (1.5.4.1, or match the version of the notebook .tar file).
b. Click Browse to find the tar.gz file that you downloaded or customized.
c. Select the Enable monitoring for the notebook and Enable collaboration for the notebook check boxes.
d. Complete the Start command, Stop command, and Job monitor command fields as follows:
   • ./scripts/start_jupyter.sh
   • ./scripts/stop_jupyter.sh
   • ./scripts/jobMonitor.sh
e. In the Longest update interval for job monitor field, specify the number of seconds (180 is a good default).

Figure B-3 Add Notebook wizard

Wait for the copy process to complete, as shown in Figure B-4. The copy speed depends on the network speed and the package size (over 300 MB in this case).

Figure B-4 Notebook copy in progress

After the Add Notebook process is complete, you can add the notebook to an existing SIG by stopping it and updating its configuration, or you can create a SIG and add the notebook to it.

Creating a Spark Instance Group with a notebook

To create a SIG with a notebook enabled, complete the following steps in the IBM Watson Machine Learning Accelerator cluster management console:
1. Select Spark → Spark Instance Groups, as shown in Figure B-5.

Figure B-5 Spark Instance Groups menu

Appendix B. Installing an IBM Watson Machine Learning Accelerator notebook 125 2. Click New, as shown in Figure B-6.

Figure B-6 Creating a Spark Instance Group

3. Complete the SIG details. Name the SIG, specify a deployment directory that is on the local storage, and specify the OS user for the SIG, as shown in Figure B-7.

Figure B-7 New Spark Instance Group: Part 1

We create demouser as the execution user for our tests. IBM Spectrum Conductor Deep Learning Impact template-based SIGs run as egoadmin. For the Spark deployment directory, we append a local directory, a run-as user, and the SIG name (/cwslocal + /demouser + /notebook).

4. Select the check box on the notebook to enable it in the SIG, as shown in Figure B-8.

Figure B-8 New Spark Instance Group: Part 2

For the consumers, we used the default that is shown in Figure B-9.

Figure B-9 New Spark Instance Group: Part 3

For the Resource Groups, we specified the compute host resource group (RG) except for the GPU executors, as shown in Figure B-10.

Figure B-10 New Spark Instance Group: Part 4

Click Create and Deploy Instance Group to create and deploy the SIG.

5. Start the SIG when its status is Ready, as shown in Figure B-11.

Figure B-11 Starting a Spark Instance Group

After the SIG starts, notebooks can be created for the users.

Creating notebooks for users

To create notebooks for users, complete the following steps:
1. Go to the Notebooks tab in the SIG details and click Create Notebooks for Users, as shown in Figure B-12.

Figure B-12 The Creating Notebooks for Users menu option

2. Select the notebook and users to create and click Create, as shown in Figure B-13.

Figure B-13 Creating Notebooks for Users details

3. The notebook is created. You can access it by clicking My Notebooks, as shown in Figure B-14.

Figure B-14 My Notebooks window

4. A list of the notebook services is shown. Select the notebook in which you are interested. The notebook URL opens in a browser window.

Testing notebooks

To test a notebook, complete the following steps:
1. Log in to the notebook service, as shown in Figure B-15.

Figure B-15 Notebook server login

2. Create a Python 3 notebook by selecting New → Python 3, as shown in Figure B-16.

Figure B-16 Notebook server home page

A simple notebook test can output the artificial intelligence (AI) framework version and Python dictionary by using a code fragment similar to the following one, where framework is replaced by the module name of the AI framework that you want to test (for example, tensorflow):

import framework
print(framework.__version__)
print(framework.__dict__)

An example of the test output is shown in Figure B-17.

Figure B-17 Notebook test

Conclusion and additional information

This appendix covered the customization of a notebook package, adding the notebook package to IBM Watson Machine Learning Accelerator, creating a SIG that includes the notebook package, starting a notebook service, and a notebook test sample.

The IBM Redbooks GitHub website includes several sample notebooks that can be downloaded from GitHub and uploaded into the notebook server by using the Jupyter upload button.

Related publications

The publications that are listed in this section are considered suitable for a more detailed description of the topics that are covered in this book.

IBM Redbooks

The following IBM Redbooks publications provide more information about the topics in this document. Some publications that are referenced in this list might be available in softcopy only.
- Enterprise Data Warehouse Optimization with Hadoop on IBM Power Systems Servers, REDP-5476
- Hortonworks Data Platform with IBM Spectrum Scale: Reference Guide for Building an Integrated Solution, REDP-5448
- IBM PowerAI: Deep Learning Unleashed on IBM Power Systems Servers, SG24-8409
- IBM Power System AC922 Technical Overview and Introduction, REDP-5494
- IBM Spectrum Archive Enterprise Edition V1.2.6 Installation and Configuration Guide, SG24-8333

You can search for, view, download, or order these documents and other Redbooks, Redpapers, web docs, drafts, and extra materials, at the following website: ibm.com/redbooks

Online resources

These websites are also relevant as further information sources:
- Additional ingestion resources
  https://hortonworks.com/products/data-platforms/hdf/
- Base notebook package and sample notebooks
  https://github.com/IBMRedbooks/SG24-8535-AI-and-Hortonworks-Redbook
- The conductor-h2o-driverlessai-master.zip file
  https://github.ibm.com/kjdoyle/conductor-h2o-driverlessai
- Enabling SSL encryption for the various Hadoop components
  https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/enabling-ssl-for-components.html
- Fashion MNIST example
  https://www.tensorflow.org/tutorials/keras/basic_classification
- Hadoop registration service
  https://content-dsxlocal.mybluemix.net/docs/content/SSAS34_current/local/hadoop.html
- H2O.ai tar.sh package
  http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/install/linux-tarsh.html
- H2O Driverless AI
  https://content-dsxlocal.mybluemix.net/docs/content/SSAS34_current/local/h20flow.html
- H2O Driverless AI: Linux tar.sh package
  https://www.h2o.ai/download/
- IBM Elastic Storage Server (IBM ESS)
  https://www.ibm.com/us-en/marketplace/ibm-elastic-storage-server
- IBM Power System AC922 server
  https://www.ibm.com/id-en/marketplace/power-systems-ac922
- Installing H2O Driverless AI
  https://ibm.box.com/s/5kpojhp87qdpcmj0hnp1bccp90etbtdh
- Installing H2O Driverless AI as an application on IBM Watson Machine Learning Accelerator: Source file
  https://s3.amazonaws.com/artifacts.h2o.ai/releases/ai/h2o/dai/rel-1.3.1-12/ppc64le-centos7/dai-1.3.1-linux-ppc64le.sh

Help from IBM

IBM Support and downloads ibm.com/support

IBM Global Services ibm.com/services


Back cover

SG24-8435-00

ISBN 0738457515

Printed in U.S.A.

ibm.com/redbooks