SOLUTION BRIEF

Mellanox and IBM Enable Enterprise Ready Open Hadoop for Big Data Analytics and Artificial Intelligence

Executive Summary
Hadoop is an open-source software framework for storing data and running big data and analytic applications. It allows large data sets to be aggregated and reviewed to yield insights that are valuable to businesses. IBM runs the data platform from Hortonworks, a leading innovator in Hadoop and Apache Spark, on IBM Power Systems™ and uses Mellanox interconnect solutions to deliver an enterprise-ready open data platform built for modern big data analytic applications. By combining open technology innovation with industry-leading performance, IT efficiency and commodity hardware, IBM, Mellanox and Hortonworks accelerate big data analytics and artificial intelligence so that organizations can unlock and scale data-driven insights like never before.

A Mellanox and IBM integrated solution built on open hardware and software provides:
1.7x performance advantage versus competing x86-based solutions
3x reduction in storage infrastructure requirements
Seamless integration with IBM PowerAI for machine and deep learning platforms
Enterprise-ready solution built on fully tested open hardware and software
Industry-leading support and expertise

Hortonworks Data Platform
Hortonworks Data Platform (HDP) uses the Hadoop Distributed File System (HDFS) for scalable, fault-tolerant big data storage and Hadoop's centralized Yet Another Resource Negotiator (YARN) architecture for resource and workload management. YARN enables a range of data processing engines, including SQL, real-time streaming and batch processing, to interact simultaneously with shared datasets, avoiding unnecessary and costly data silos and unlocking an entirely new approach to analytics.

HDP on Power Systems
Power Systems with IBM POWER8® processors and differentiated hardware acceleration technology are designed to deliver breakthrough performance for big data analytics workloads. The POWER8 processor delivers industry-leading performance for big data analytics applications running on HDP, with multi-threading designed for fast execution of analytics, a multi-level cache for continuous data load and fast response, and a large, high-bandwidth memory workspace to maximize throughput for data-intensive applications.

IBM Power Systems OpenPOWER LC server family
Designed for flexibility and seamless integration into existing clusters and clouds, the new IBM OpenPOWER LC server family offers the data-crushing POWER8 processor in a range of purpose-built system configurations, from compute-dense to storage-rich.


The LC family's innovative design, developed in partnership with the OpenPOWER Foundation, offers hardware accelerator offload for compute, storage and networking workloads, for incredible speed-ups to analytics and massive efficiencies in data movement.

The POWER8 processor's leading thread density, large cache, memory bandwidth and superior I/O capabilities are a great match for in-memory Apache Spark workloads including SQL, streaming, graph and machine learning analytics.

High-performance and Efficient
Optimized for ultra-low latency fabrics, the Mellanox SX1410 series Ethernet switch exceeds requirements for high performance and can be deployed as an ideal top-of-rack switch for HDP deployments. Its unique form factor includes 48 10GbE host ports and 12 40GbE uplink ports. The uplink ports can be further split using 2-to-1 or 4-to-1 breakout cables to provide a higher number of 10GbE ports at the expense of 40GbE ports. This allows for as many as 64 10GbE ports on one switch that occupies only a single rack unit.
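The port figures quoted for the SX1410 can be checked with a little arithmetic. The sketch below is illustrative only: it models just the 4-to-1 breakout case, treats every split port as server-facing, and takes the 64-port figure as a hard ceiling. The function name `port_layout` and the oversubscription calculation are assumptions introduced here, not part of any Mellanox tool.

```python
HOST_PORTS_10G = 48      # fixed 10GbE host ports (from the brief)
UPLINK_PORTS_40G = 12    # 40GbE uplink ports (from the brief)
MAX_PORTS_10G = 64       # ceiling on 10GbE ports quoted in the brief


def port_layout(split_uplinks: int) -> dict:
    """Port counts and bandwidth after splitting some uplinks 4-to-1.

    Assumes each split uplink becomes four server-facing 10GbE ports.
    """
    if not 0 <= split_uplinks <= UPLINK_PORTS_40G:
        raise ValueError("split_uplinks must be between 0 and 12")
    ports_10g = HOST_PORTS_10G + 4 * split_uplinks
    if ports_10g > MAX_PORTS_10G:
        raise ValueError("exceeds the 64-port 10GbE ceiling quoted in the brief")
    ports_40g = UPLINK_PORTS_40G - split_uplinks
    down_gbps = 10 * ports_10g   # aggregate server-facing bandwidth
    up_gbps = 40 * ports_40g     # aggregate uplink bandwidth
    return {
        "ports_10g": ports_10g,
        "ports_40g": ports_40g,
        "downlink_gbps": down_gbps,
        "uplink_gbps": up_gbps,
        "oversubscription": down_gbps / up_gbps,
    }


if __name__ == "__main__":
    for n in (0, 4):
        print(n, port_layout(n))
```

Under these assumptions, the default layout balances 480 Gb/s of host bandwidth against 480 Gb/s of uplink bandwidth, and the 64-port figure corresponds to splitting four of the twelve uplinks, which leaves eight 40GbE uplinks.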

The Mellanox SX1410 can process data packets at full line rate without dropping packets, which ensures optimal network and application performance. To better handle I/O contention, Mellanox employs a superior buffer to predictably manage I/O across all ports on the switch. This allows the switch to divide buffer resources fairly, just as it does bandwidth, allowing full use of switch capacity. For example, when a microburst or incast (many-to-one) event occurs, the network must not give one application or client the majority of its capacity and inadvertently starve the others. The SX1410 series switches provide fair and predictable performance to prevent this sort of predicament.

Increasing Hadoop Efficiency
Mellanox ConnectX®-4 Ethernet adapters support 10, 25, 40, 50 and 100Gbps Ethernet speeds and provide sub-microsecond latency along with offload mechanisms for RDMA, erasure coding, TCP and UDP, as well as overlay network and OVS offloads. By using offload capabilities to bypass the CPU, server resources are freed, leaving more CPU cores available to analyze data or perform other compute tasks. This allows for higher scalability and efficiency within Hortonworks environments. In doing so, the adapter reduces application runtime and offers the flexibility and scalability to make infrastructure run as efficiently and productively as possible. Data centers can use the adapter to increase operational efficiency, improve server utilization and maximize application productivity, while reducing total cost of ownership (TCO).

IBM Spectrum Scale and Elastic Storage Server
An integrated storage system running IBM Spectrum Scale software on IBM Power Systems, IBM Elastic Storage Server can serve as the underlying storage for HDP. IBM Spectrum Scale is a software-defined storage system based on a parallel file system architecture that provides file (NFS, SMB, POSIX) and object (S3, Swift) access and supports HDFS APIs. Support for HDFS APIs enables in-place analytics on enterprise storage instead of copying data from enterprise storage to analytics silos. In-place analytics not only eliminates duplication of data but also avoids the problems of running analytics on stale data. In addition, Spectrum Scale provides shared storage to HDP, which allows compute and storage to be decoupled to enable optimized configurations.

IBM and Mellanox: Better Together
The Mellanox Ethernet interconnect solution of switches, adapters, connectors and cables enables the network to scale linearly as application workload requirements increase. By leveraging advanced offload and acceleration capabilities, Mellanox solutions mitigate network and virtualization penalties and minimize CPU burdens to achieve maximum network efficiency. Building and deploying data analysis solutions is easy thanks to single-sourced, reliable components that are tested to work together with IBM hardware and Hortonworks Data Platform software. The result is a high-throughput, low-latency network that is capable of scaling from 10 to 25, 40, 50 and even 100 Gbps Ethernet.
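To give a feel for what the step from 10 to 100 Gbps Ethernet means for data movement, the sketch below estimates the ideal time to move a dataset over a single link at each speed named in the brief. It is a back-of-the-envelope illustration only: the 1 TB dataset size is a hypothetical value, and the calculation assumes full line rate with no protocol overhead, so real transfers will take longer.

```python
LINK_SPEEDS_GBPS = (10, 25, 40, 50, 100)  # Ethernet speeds named in the brief


def transfer_seconds(dataset_bytes: float, link_gbps: float) -> float:
    """Ideal time to move dataset_bytes over one link at full line rate.

    Ignores protocol overhead, congestion and storage limits.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    bits = dataset_bytes * 8
    return bits / (link_gbps * 1e9)


if __name__ == "__main__":
    one_terabyte = 1e12  # hypothetical dataset size (decimal terabyte)
    for speed in LINK_SPEEDS_GBPS:
        seconds = transfer_seconds(one_terabyte, speed)
        print(f"{speed:>3} GbE: {seconds:7.1f} s")
```

Because the ideal transfer time is inversely proportional to link speed, a tenfold increase in speed cuts the time to a tenth under these idealized assumptions.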

© Copyright 2016. Mellanox Technologies. All rights reserved. Mellanox, Mellanox logo, and ConnectX are registered trademarks of Mellanox Technologies, Ltd. Mellanox NEO is a trademark of Mellanox Technologies, Ltd. All other trademarks are property of their respective owners. v8.04.17

Reducing Infrastructure Costs and Complexity
To reduce costs and ensure interoperability, Mellanox introduced Ethernet breakout cables and optical transceivers. Breakout cables (cables that connect at a higher speed at the switch and "fan out" to multiple lower-speed links) allow four 10G connections at the servers to be aggregated into a single 40G connection at the top-of-rack switch. This lowers costs and simplifies cable management by reducing the number of cables required. Stringent interoperability testing with Mellanox switches and adapters also provides a high degree of reliability and guaranteed integration.

Superior performance for Apache Hadoop and Spark Workloads
HDP on Power Systems over Mellanox interconnects delivers more data faster, enabling valuable analytics for better and quicker decision making. In IBM performance testing of typical Apache Hadoop workloads, HDP on Power Systems delivered 70 percent more queries per hour than x86-based solutions, based on average response time, and a 40 percent average reduction in query response time.

Conclusion
IBM partners with Mellanox and Hortonworks to unlock superior data throughput and speed for connected enterprise data of all types, from all sources. With industry-leading performance and IT efficiency combined with the best of open innovation to accelerate big data analytics and AI, organizations can unlock and scale data-driven insights for their business like never before.

IBM Power Systems with POWER8® processors and differentiated acceleration technology are designed to crush big data workloads, raising the bar for what's possible with next-gen data-driven applications today.

Mellanox ConnectX®-4 adapters reduce the CPU overhead of packet processing through advanced hardware-based stateless offloads and a flow steering engine, while Mellanox switches provide efficiencies for analytic workloads and help OpenPOWER-based servers achieve new performance capabilities.

By leveraging recent advances in CPU, memory, storage and networking from industry leaders like IBM and Mellanox, Hortonworks Data Platform has the power it needs to solve challenging data analytics problems.

EXPLORE FURTHER

View the Mellanox IBM OEM website: www.mellanox.com/oem/ibm

Visit the IBM Hortonworks on Power website: ibm.biz/hortonworksOnPower

Financial Services Solution Brief: https://www.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=POS03163USEN&

Reference Architecture and Design: https://public.dhe.ibm.com/common/ssi/ecm/po/en/pol03270usen/POL03270USEN.PDF

Solution Brief: https://www.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=POS03160USEN&

Join the Hortonworks Community: https://community.hortonworks.com/

350 Oakmead Parkway, Suite 100, Sunnyvale, CA 94085 Tel: 408-970-3400 | Fax: 408-970-3403 www.mellanox.com
