Lenovo Big Data Validated Design for Cazena SaaS using Cloudera on ThinkAgile HX
Last update: 27 June 2019
Version 1.0
Configuration Reference Number: BGDCL03XX91

Reference architecture for Cloudera Enterprise with Cazena SaaS Data Lakes
Solution based on the ThinkSystem HX platform running VMware vSphere virtualization
Deployment considerations for scalable racks including detailed validated bills of material
Solution based on Cazena’s Fully Managed SaaS Data Lake running on the ThinkAgile HX platform

Dan Kangas (Lenovo)
Venkat Chandra (Cazena)
John Piekos (Cazena)
Keith Adams (Lenovo)
Ajay Dholokia (Lenovo)
Gary Cudak (Lenovo)

Table of Contents

1 Introduction
2 Business problem and business value
    Business Problem
    Business Value
        2.2.1 Time to Production
        2.2.2 Deploy ML & Analytics Quickly with the AppCloud
        2.2.3 Plug & Play Enterprise Deployment:
        2.2.4 Enabling Multi-tenancy
        2.2.5 DevOps 24x7 Production Operations
        2.2.6 Secure data lake with a highly-differentiated security model.
        2.2.7 Best of Breed Data Lake - Optimized for ThinkAgile HX
3 Requirements
    Functional Requirements
    Non-functional Requirements
4 Architectural Overview
    Cazena
    Cloudera Enterprise
    Lenovo ThinkAgile HX Appliance
5 Component Model
    Cazena Components
        5.1.1 Cazena AppCloud
        5.1.2 Data Ingestion
        5.1.3 Cazena Security
        5.1.4 The Cloudera Data Lake
        5.1.5 Cazena Operations (DevOps)
    ThinkAgile HX - Nutanix
        5.2.1 Nutanix Prism
        5.2.2 Controller VM (CVM)
6 Operational Model
    Hardware Description
        6.1.1 ThinkAgile HX Key features
        6.1.2 Lenovo ThinkAgile HX5520 Appliance
        6.1.3 Lenovo ThinkAgile HX3320 Appliance
        6.1.4 Lenovo ThinkSystem NE0152T
        6.1.5 Lenovo RackSwitch G8272
    Cluster HX Node Configurations
        6.2.1 Node T-shirt Sizes
        6.2.2 Node types and VM organization
        6.2.3 System Management node
    Cluster Software Stack
    Cazena Orchestration
    Cazena AppCloud
    Cloudera Service Role Layouts
    System Management
    Networking
        6.8.1 Data Network
        6.8.2 Hardware Management Network
        6.8.1 10Gb and 25Gb Data Network Configurations
    Predefined Lenovo HX Cluster Configurations
        6.9.1 Cluster Storage Capacity
        6.9.2 Storage Tiering with NVMe and SSD Drives
7 Deployment Considerations
    Increasing Cluster Performance
    Processor Selection
    Memory Size and Performance
    Designing for Storage Capacity and Performance
        7.4.1 Node Capacity
        7.4.2 Estimating Disk Space
    Scaling Considerations
8 Cluster Hardware Bill of Materials
    HX5520 Bill of Materials
    HX3320 Bill of Materials
    Systems Management Node Bill of Materials
    Network
    Rack
9 Acknowledgements
10 Resources
11 Document History

1 Introduction

This document describes the reference architecture for Cazena’s big data Software as a Service (SaaS) offering with Cloudera running on the Lenovo ThinkAgile HX platform. This solution delivers Cloudera Enterprise as an SaaS, fully-managed private cloud data lake. The reference architecture is a predefined and