Ready Solutions for Data Analytics Cloudera Hadoop 6.1 Architecture Guide
April 2019
H17614.1

Abstract

This reference architecture guide describes the architectural recommendations for Cloudera Hadoop 6.1 software on Dell EMC PowerEdge servers and Dell EMC Networking switches.

Copyright © 2017-2019 Dell Inc. or its subsidiaries. All rights reserved.

Published April 2019

Dell believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS-IS.” DELL MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. USE, COPYING, AND DISTRIBUTION OF ANY DELL SOFTWARE DESCRIBED IN THIS PUBLICATION REQUIRES AN APPLICABLE SOFTWARE LICENSE.

Dell, EMC, and other trademarks are trademarks of Dell Inc. or its subsidiaries. Other trademarks may be the property of their respective owners. Published in the USA.

Dell EMC
Hopkinton, Massachusetts 01748-9103
1-508-435-1000
In North America 1-866-464-7381
www.DellEMC.com

Ready Architecture for Cloudera Hadoop 6.1

CONTENTS

Figures
Tables

Chapter 1 Executive Summary
    Document purpose
    Audience
    Hadoop overview
    Cloudera Enterprise software overview
        Hadoop for the enterprise
        Data management
        Cloudera Enterprise components
        Cloudera Enterprise Data Hub
    Cloudera Hadoop 6.1 Ready Solution
    Solution use case summary

Chapter 2 Ready Architecture Components
    Solution components
    Dell EMC PowerEdge rack servers
        Dell EMC PowerEdge R640 server
        Dell EMC PowerEdge R740xd server

Chapter 3 Solution Architecture Overview
    Cluster architecture
        High-level node architecture
        High availability
    Network architecture
        Network definitions
        Cluster physical networks
        Physical network components
    Rack server hardware configurations
        Infrastructure Nodes
        Worker Nodes
        Edge Nodes
        Node configuration

Chapter 4 References
    Cloudera partnership and certification
    Dell EMC Customer Solution Centers
    Technical support

Appendix A Dell EMC PowerEdge R740xd Worker Nodes Physical Rack Configuration
    Worker Nodes single-rack configuration
    Worker Nodes initial rack configuration
    Worker Nodes additional pod rack configuration

Appendix B Tested Component Versions
    Software versions
    Network switch firmware versions
    Dell EMC PowerEdge R640 firmware versions
    Dell EMC PowerEdge R740xd firmware versions

Glossary
Index

FIGURES

1 Solution components
2 Dell EMC PowerEdge R640 server 10 x 2.5 in. chassis
3 Dell EMC PowerEdge R740xd server 3.5 in. chassis
4 Cluster architecture
5 Cluster network fabric architecture
6 Hadoop 25 GbE network connections
7 Dell EMC PowerEdge R640 network ports
8 Dell EMC PowerEdge R740xd Worker Node network ports
9 25 GbE single-pod networking equipment
10 Dell EMC Networking Z9100-ON multiple-pod networking equipment
11 Multiple-pod view using Dell EMC Networking Z9100-ON switches (based on Layer 3 ECMP)

TABLES

1 Cloudera Enterprise components/services
2 Cloudera Enterprise Data Hub
3 Solution use cases
4 Data processing and access components
5 Cluster node roles
6 Service locations by node
7 Recommended number of nodes and pods for 25 GbE cluster
8 Alternative number of nodes and pods for 25 GbE cluster
9 Rack and pod density scenarios
10 CDH network definitions
11 Cluster networks
12 Network/bond/interface cross reference
13 Per rack network equipment
14 Per pod network equipment
15 Per cluster aggregation network switches for multiple pods
16 Per node network cables required
17 Hardware configurations: Dell EMC PowerEdge R640 Infrastructure Nodes
18 Dell EMC PowerEdge R640 Infrastructure Node volumes
19 Dell EMC PowerEdge R640 Infrastructure Node partitions
20 Hardware configurations: Dell EMC PowerEdge R740xd Worker Nodes
21 Dell EMC PowerEdge R740xd Worker Node volumes
22 Dell EMC PowerEdge R740xd Worker Node partitions
23 Hardware Configurations – Dell EMC PowerEdge R640 Edge Nodes
24 Dell EMC PowerEdge R640 Edge Node volumes
25 Dell EMC PowerEdge R640 Edge Node partitions
26 Specialized Worker Node subtypes
27 Solution Support Matrix
28 Single-rack configuration: Worker Nodes
29 Initial pod rack configuration: