Cloudera-Intel-Cisco Hadoop Benchmark TOI (External) What Matters in a Hadoop Cluster?
Floris Grandvarlet (Cisco)
Patrick Schotts (Intel)
Woody Christy (Cloudera)

Acknowledgments

The authors acknowledge the contributions of:

Intel: Stephen G. Anderson, Rob Kypriotakis, Jacob A. Ohara, Gert Pauwels, Richard B. Pilling
Cisco: Arnaud Bassaler, Peter Ruttens, Michel Sumbul, Karthik Kulkarni
Cloudera: Sandeep Brahmarouthu, Jonathan Cooper, Rob Johnson, Kunal Kusoorkar, Dwai Lahiri, Jonathan Seidman

ALL DESIGNS, SPECIFICATIONS, STATEMENTS, INFORMATION, AND RECOMMENDATIONS (COLLECTIVELY, “DESIGNS”) IN THIS PAPER ARE PRESENTED “AS IS,” WITH ALL FAULTS. CISCO AND ITS SUPPLIERS DISCLAIM ALL WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OR ARISING FROM A COURSE OF DEALING, USAGE, OR TRADE PRACTICE. IN NO EVENT SHALL CISCO OR ITS SUPPLIERS BE LIABLE FOR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, OR INCIDENTAL DAMAGES, INCLUDING, WITHOUT LIMITATION, LOST PROFITS OR LOSS OR DAMAGE TO DATA ARISING OUT OF THE USE OR INABILITY TO USE THE DESIGNS, EVEN IF CISCO OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

THE DESIGNS ARE SUBJECT TO CHANGE WITHOUT NOTICE. USERS ARE SOLELY RESPONSIBLE FOR THEIR APPLICATION OF THE DESIGNS. THE DESIGNS DO NOT CONSTITUTE THE TECHNICAL OR OTHER PROFESSIONAL ADVICE OF CISCO, ITS SUPPLIERS OR PARTNERS. USERS SHOULD CONSULT THEIR OWN TECHNICAL ADVISORS BEFORE IMPLEMENTING THE DESIGNS. RESULTS MAY VARY DEPENDING ON FACTORS NOT TESTED BY CISCO.

CCDE, CCENT, Cisco Eos, Cisco Lumin, Cisco Nexus, Cisco StadiumVision, Cisco TelePresence, Cisco WebEx, the Cisco logo, DCE, and Welcome to the Human Network are trademarks; Changing the Way We Work, Live, Play, and Learn and Cisco Store are service marks; and Access Registrar, Aironet, AsyncOS, Bringing the Meeting To You, Catalyst, CCDA, CCDP, CCIE, CCIP, CCNA, CCNP, CCSP, CCVP, Cisco, the Cisco Certified Internetwork Expert logo, Cisco IOS, Cisco Press, Cisco Systems, Cisco Systems Capital, the Cisco Systems logo, Cisco Unity, Collaboration Without Limitation, EtherFast, EtherSwitch, Event Center, Fast Step, Follow Me Browsing, FormShare, GigaDrive, HomeLink, Internet Quotient, IOS, iPhone, iQuick Study, IronPort, the IronPort logo, LightStream, Linksys, MediaTone, MeetingPlace, MeetingPlace Chime Sound, MGX, Networkers, Networking Academy, Network Registrar, PCNow, PIX, PowerPanels, ProConnect, ScriptShare, SenderBase, SMARTnet, Spectrum Expert, StackWise, The Fastest Way to Increase Your Internet Quotient, TransPath, WebEx, and the WebEx logo are registered trademarks of Cisco Systems, Inc. and/or its affiliates in the United States and certain other countries.

All other trademarks mentioned in this document or website are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. (0809R)

© 2014 Cisco Systems, Inc. All rights reserved.

Cloudera-Intel-Cisco v2.0 Public

Contents

1. Introduction
Executive Summary
2. Benchmark Test bed
2.1. Hardware
2.2. Software
2.3. Software post-installation configuration
2.4. Architecture
2.5. Server Configuration and Cabling
2.6. Rack
3. CPU Benchmark
3.1. Overview
3.2. CPU Test Architecture
3.3. CPU Benchmarks Caveats
3.3.1. Cloudera Manager Architecture
3.3.2. Power measurements
3.4. Results
3.4.1. Tera Results for CPU
3.4.2. Word Count for CPU
3.4.3. Power Results for CPU
3.4.4. Consolidated Results with Pricing
3.5. CPU Benchmark Results Conclusion
4. Cluster Benchmark
4.1. Overview
4.2. Benchmark Caveat
4.2.1. Benchmark Caveat: RAID Configuration
4.2.2. Benchmark Caveat: Network Bandwidth
4.3. Benchmark Hyper-Threading
4.3.1. Hyper-Threading details
4.4. Benchmark Network Bandwidth
4.4.1. TeraGen and TeraSort details
4.5. Benchmark Hyper-Threading/Networking results conclusion
4.6. Benchmark Data Nodes Scale-out
4.7. Benchmark HDD scaling
4.8. Benchmark HDD/Scaling results conclusion