
Lustre* Performance Is Superior to HDFS* with the Latest Intel® Xeon® Processor Family

White Paper

Intel® Enterprise Edition for Lustre* Software
High-Performance Data Division

Intel® Enterprise Edition for Lustre* software shows a significant performance advantage over Apache HDFS*—upgrading to the Intel® Xeon® processor E5-2600 v4 product family provides an additional performance benefit

Executive Summary

Financial services organizations seeking to improve their profit margin and competitive advantage recognize the value of affordable high-performance computing (HPC) and real-time analytics. Distributed processing using the MapReduce* framework and a parallel file system enables enterprises to keep up with today’s exponential data growth and use that data to reach better business decisions more quickly than ever before. Oliver Wyman finds that during the past 20 years, the margins on deposits and cash equities have declined by 33 to 50 percent while the need for computing power in the financial services industry has grown 200 to 500 percent faster than revenue [1].

Understanding that better data processing performance translates to increased business value, Intel conducted two performance tests:

• One test compared the performance of a financial services application running on the MapReduce framework on Intel® Enterprise Edition for Lustre* software (Intel® EE for Lustre* software) against the performance of the same application running on Apache Hadoop Distributed File System* (HDFS*). Both systems used servers based on the Intel® Xeon® processor E5-2600 v3 product family.
• Another test compared the Intel EE for Lustre software performance from the above test against the performance of the identical system running on the Intel Xeon processor E5-2600 v4 product family.

The tests revealed that Intel EE for Lustre software outperforms HDFS by as much as 30 percent [2]. Upgrading to the next-generation Intel Xeon processor family can boost Lustre performance by another 18 percent [2].

Author
Chakravarthy Nagarajan
HPC Solutions Architect, Intel

Table of Contents
Executive Summary
Background
Methodology
Test Configurations
Results
Next Steps
Conclusion

Background

Successful financial services firms are using a modern compute infrastructure to answer line-of-business questions associated with a wide range of use cases, including fraud detection, customer segmentation analysis, customer sentiment analysis, risk aggregation, counterparty risk analytics, credit risk assessment, and 360-degree customer service. This infrastructure includes the ability to perform massively parallel processing and real-time analytics of an array of disparate data sources. Such an infrastructure uses tools such as the Cloudera Distribution for Apache Hadoop* software, the MapReduce* framework, and a parallel file system for storage. Firms that want to cement their success aim to future-proof their data centers by investing in an infrastructure that can handle ever-increasing operational efficiency demands even as the volume of data grows.

Intel is committed to helping the financial industry continue to grow and prosper by providing the technology and information necessary for financial services firms to choose infrastructure components that best meet their business needs. In particular, firms need to know which parallel file system is the most efficient, and whether upgrading servers can provide significant performance benefits. In the white paper, “Big Data Meets High-Performance Computing,” Intel described how Intel® Enterprise Edition for Lustre* software (Intel® EE for Lustre* software) and Hadoop combine to bring big data analytics to high-performance computing (HPC) configurations.

Intel recently compared the performance of a financial services application running on the MapReduce framework on Intel EE for Lustre software against the performance of the same application running on Apache Hadoop Distributed File System* (HDFS*). Both systems used servers based on the Intel® Xeon® processor E5-2600 v3 product family. Intel also compared the performance of Intel EE for Lustre software running on the Intel Xeon processor E5-2600 v3 product family to the performance of the identical system running on the next-generation Intel Xeon processor (v4).

Methodology

The following sections describe the methodology used for the two performance tests. We used single concurrency only for both tests.

Test 1: Comparing Lustre* and HDFS* Performance

We used two clusters: one running Intel EE for Lustre software and the other running HDFS. We measured average job execution time, CPU utilization, disk utilization, memory utilization, and network utilization of the name node and one data node. For the Lustre configuration, we also used Intel® Manager for Lustre* software to monitor maximum read/write bandwidth, Lustre maximum CPU, and memory utilization.

The steps involved in Test 1 were as follows:

1. We generated a data set of a given size using the Tata Consultancy Services Limited* (TCS*) MapReduce-based FINRA* DataGenerator. Note that the data was directly generated on both clusters (Lustre and HDFS) by the DataGenerator; there was no need to copy the data set to Lustre or HDFS.
2. We prepared each cluster to collect performance monitoring data using the TCS MasterCraft Data Profiler* tool.
3. We generated the workload from the name node of the cluster.
4. On completion of the batch jobs, we collected the execution log of each job, collected performance monitoring data using the MasterCraft Data Profiler, and aggregated this data at the name node.
5. We calculated the average execution time for each job and monitored CPU, memory, disk, and network utilization for the batch execution duration.
6. We repeated steps 1-5 for different data sizes.

Test 2: Comparing Lustre* Performance on Refreshed Hardware

We compared the Lustre performance obtained from Test 1 to that of another cluster based on the next generation of processor (Intel Xeon processor E5-2600 v4 product family). We measured the same metrics as in Test 1: average job execution time, CPU utilization, disk utilization, memory utilization, and network utilization of the name node and one data node. As in Test 1, we used Intel Manager for Lustre software to monitor maximum read/write bandwidth, Lustre maximum CPU, and memory utilization.

The steps involved in Test 2 were as follows:

1. We generated a data set of a given size using the TCS MapReduce-based FINRA DataGenerator. Note that the data was directly generated in Lustre by the DataGenerator; there was no need to copy data to Lustre.
2. We prepared the cluster to collect performance monitoring data using the TCS MasterCraft Data Profiler.
3. We generated the workload from the name node of the cluster.
4. On completion of the batch jobs, we collected the execution log of each job, collected performance monitoring data using the MasterCraft Data Profiler, and aggregated this data at the name node.
5. We calculated the average job execution time and monitored CPU, memory, disk, and network utilization for the batch execution duration.
6. We repeated steps 1-5 for different data sizes.

Model Parameters

We tuned the number of map and reduce tasks to obtain the best performance. Additionally, Table 1 shows the parameters that we tuned on each processor family to achieve the best performance gain.
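Step 5 of both test procedures (averaging job execution time at each data size) can be sketched as follows. This is a minimal illustration only: the record layout, the cluster labels, and the timing values are hypothetical placeholders, not the format of the actual execution logs and not measurements from these tests.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical execution-log records: (cluster label, data size in TB, job time in s).
# The values are placeholders for illustration, not results from the paper.
records = [
    ("cluster_a", 1, 100.0),
    ("cluster_a", 1, 110.0),
    ("cluster_b", 1, 150.0),
    ("cluster_b", 1, 130.0),
]


def average_execution_time(records):
    """Group job times by (cluster, data size) and return the mean of each group."""
    groups = defaultdict(list)
    for cluster, size_tb, seconds in records:
        groups[(cluster, size_tb)].append(seconds)
    return {key: mean(times) for key, times in groups.items()}


averages = average_execution_time(records)
for (cluster, size_tb), avg in sorted(averages.items()):
    print(f"{cluster} {size_tb} TB: {avg:.1f} s average")
```

Grouping by both cluster and data size keeps runs at different data sizes (step 6) from being averaged together.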

Table 1. Model Parameter Settings for Best Performance

Configuration Item | Parameter | Intel® Xeon® processor E5-2600 v3 product family | Intel® Xeon® processor E5-2600 v4 product family
Container Memory | yarn.nodemanager.resource.memory-mb | 114 GB | 248 GB
Container Virtual CPU Cores | yarn.nodemanager.resource.cpu-vcores | 56 | 88
Container Memory Maximum | yarn.scheduler.maximum-allocation-mb | 8 GB | 16 GB
Container Virtual CPU (vCPU) Cores Maximum | yarn.scheduler.maximum-allocation-vcores | 56 | 88
Container vCPU Cores Minimum | yarn.scheduler.minimum-allocation-vcores | 1 | 1
Java* Heap Size of Map/Reduce | mapred.child.java.opts | 1000 MB | 4096 MB
Map Maximum Memory | mapreduce.map.memory.mb | 2048 MB | -1
Reduce Maximum Memory | mapreduce.reduce.memory.mb | 4096 MB | -1
Maximum Tasks per Job | mapred.jobtracker.maxtasks.per.job | Default | -1
Reduce Input Limit | mapred.reduce.input.limit | Default | -1
MapReduce Task I/O Factor | mapreduce.task.io.sort.factor | Default | 100
MapReduce I/O Sort MB | mapreduce.task.io.sort.mb | 500 | 1024
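To make the Table 1 settings concrete, the sketch below renders the YARN values from the v4 column in the property format used by Hadoop configuration files such as yarn-site.xml. The paper does not say how the settings were applied, so treat this as one possible illustration. Two assumptions are ours: the table's "GB" values are converted to megabytes at 1024 MB per GB, and the helper name `to_hadoop_xml` is hypothetical.

```python
from xml.sax.saxutils import escape

GB_IN_MB = 1024  # assumption: the table's "GB" values are binary gigabytes

# YARN settings from the v4 column of Table 1; memory values converted to MB.
yarn_settings_v4 = {
    "yarn.nodemanager.resource.memory-mb": 248 * GB_IN_MB,
    "yarn.nodemanager.resource.cpu-vcores": 88,
    "yarn.scheduler.maximum-allocation-mb": 16 * GB_IN_MB,
    "yarn.scheduler.maximum-allocation-vcores": 88,
    "yarn.scheduler.minimum-allocation-vcores": 1,
}


def to_hadoop_xml(settings):
    """Render a dict of settings as a Hadoop-style <configuration> document."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(str(value))}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


xml_text = to_hadoop_xml(yarn_settings_v4)
print(xml_text)
```

Note that the container memory on each processor family (114 GB and 248 GB) sits just below the physical RAM of the corresponding nodes (128 GB and 256 GB), which leaves headroom for the operating system and Hadoop daemons.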

Test Configurations

All tests used the following software:

• MapReduce framework: part of the Cloudera Distribution for Hadoop software version 5.4.2
• Operating system: CentOS* 6.7 [3]

The Lustre performance tests used Intel EE for Lustre software version 2.4.1.0, as well as the Hadoop Adapter for Lustre (HAL), which optimizes MapReduce processing on Lustre using Hadoop connectors and patches for easier management. In both Lustre configurations, we used a stripe count of 8, which we found to be the optimum stripe count for obtaining the best performance from Intel EE for Lustre software for our particular MapReduce job.

The following sections describe the specific configuration of the test clusters. Clusters 1 and 2 were used in Test 1; Cluster 3 was used in Test 2.

Cluster 1: Hadoop+HDFS (Intel Xeon processor E5-2600 v3 product family)

• 1 cluster manager, 1 name node, 7 data nodes
• 8 nodes, each equipped with Intel® Xeon® processor E5-2697 v3 @ 2.60 GHz with 56 cores and 128 GB RAM
• 50 TB of total cluster storage and 1 GbE network connectivity between compute nodes

Cluster 2: Hadoop+Lustre (Intel Xeon processor E5-2600 v3 product family)

• 1 resource manager, 1 history server, 7 node managers
• 8 nodes, each equipped with Intel Xeon processor E5-2697 v3 @ 2.60 GHz with 56 cores and 128 GB RAM
• 60 TB of Lustre storage, 10 GbE network connectivity between compute nodes and the Lustre file system

Cluster 3: Hadoop+Lustre (Intel Xeon processor E5-2600 v4 product family)

• 1 resource manager, 1 history server, 7 node managers
• 8 nodes, each equipped with Intel Xeon processor E5-2699 v4 @ 2.20 GHz with 88 cores and 256 GB RAM
• 60 TB of Lustre storage, 10 GbE network connectivity between compute nodes and the Lustre file system

Results

The following Map and Reduce query was submitted for each test:

    SELECT sum(routed_share_quantity * route_price)
    FROM default.rt_query_extract
    WHERE issue_symbol like 'XLP'

This query was run as a MapReduce job for different data sizes on the various clusters. Validations were performed to verify that the same data was read and written on both systems under test. The number of Maps and Reduces differs per cluster, based on best-effort tuning on each cluster.

Figure 1 illustrates the results from Test 1, which compared Lustre performance to HDFS performance. As can be seen, execution time was significantly less for Lustre for both the 1-TB and 7-TB data sets. For the larger data set, Lustre outperformed HDFS by 30 percent.

Figure 2 illustrates the value of refreshing hardware. While the performance of Lustre was quite good on the cluster equipped with the Intel Xeon processor E5-2697 v3, running the exact same test on the next-generation Intel Xeon processor E5-2699 v4 resulted in an 18-percent performance increase.

By choosing Intel EE for Lustre software and upgrading their servers, financial firms can expect nearly a 50-percent increase in performance compared to running HDFS.
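The paper does not state how the 30-percent and 18-percent figures combine into "nearly 50 percent." The sketch below works through three plausible readings; which one the authors intended is an assumption on our part, and the variable names are illustrative.

```python
import math

LUSTRE_OVER_HDFS = 0.30  # Test 1 result for the larger data set
V4_OVER_V3 = 0.18        # Test 2 result

# Reading 1: simply add the two percentages.
additive = LUSTRE_OVER_HDFS + V4_OVER_V3

# Reading 2: treat each figure as a throughput speedup and compound them.
compounded_speedup = (1 + LUSTRE_OVER_HDFS) * (1 + V4_OVER_V3) - 1

# Reading 3: treat each figure as a cut in execution time and compound them.
compounded_time_cut = 1 - (1 - LUSTRE_OVER_HDFS) * (1 - V4_OVER_V3)

print(f"additive gain:         {additive:.1%}")
print(f"compounded speedup:    {compounded_speedup:.1%}")
print(f"compounded time saved: {compounded_time_cut:.1%}")
```

The three readings give 48 percent, about 53 percent, and about 43 percent respectively, so "nearly 50 percent" matches the additive reading most closely. All three rest on the I/O-bound workload and the cluster configurations described above.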

[Figure 1 chart: "Performance Comparison, HDFS* and Lustre*." Execution time in seconds (lower is better) by data size (1 TB and 7 TB), HDFS versus Lustre.]

Figure 1. For large data sets, Intel® Enterprise Edition for Lustre* software outperformed Apache HDFS* by 30 percent.

[Figure 2 chart: "Lustre* Performance Boost Provided by Next-Generation Processor." Execution time in seconds (lower is better) by data size, comparing the two Intel® Xeon® processor generations.]

Figure 2. Upgrading the server to the next-generation Intel® Xeon® processor resulted in an 18-percent performance gain for Intel® Enterprise Edition for Lustre* software.

Next Steps

The MapReduce tests documented in this white paper used an analytic I/O-bound application, which is limited by the Lustre bandwidth. We intend to perform a similar comparison using a CPU-bound application. Also, we plan to test the performance of HDFS on the Intel Xeon processor E5-2600 v4 product family, with both an analytic I/O-bound and a CPU-bound application.

Conclusion

Our performance benchmark tests, which showed that Lustre runs 30 percent faster than HDFS, confirm that HPC applications can use their existing Lustre parallel file system to conduct I/O-bound analytics.

In addition, we demonstrated the value of refreshing hardware: upgrading to the next generation of the Intel Xeon processor helped boost Lustre performance by 18 percent (again, for an I/O-bound application). We found that the performance of I/O-bound applications is limited by the Lustre bandwidth. We anticipate that CPU-bound applications may achieve an even greater benefit from upgrading to a more powerful processor.

Learn More

You may find the following resources useful:

• Big Data Meets High-Performance Computing paper
• Intel® Solutions for Lustre* software

For more information visit intel.com/lustre or email hpdd-sales@intel.com.

[1] cloudera.com/content/dam/cloudera/Resources/PDF/whitepaper/Cloudera_Financial_Services_Industry.pdf
[2] Refer to the Results section of the document to view the configurations used to obtain these performance gains.
[3] Community ENTerprise Operating System. A free rebuild of source packages from the Red Hat Enterprise Linux*.

Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. See intel.com/products/processor_number for details.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to intel.com/performance.

Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference intel.com/performance/resources/benchmark_limitations.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104.

System configurations, SSD configurations and performance tests conducted are discussed in detail within the body of this paper. For more information go to intel.com/performance.

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software, or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer, or learn more at intel.com.

THE INFORMATION PROVIDED IN THIS PAPER IS INTENDED TO BE GENERAL IN NATURE AND IS NOT SPECIFIC GUIDANCE. RECOMMENDATIONS (INCLUDING POTENTIAL COST SAVINGS) ARE BASED UPON INTEL’S EXPERIENCE AND ARE ESTIMATES ONLY. INTEL DOES NOT GUARANTEE OR WARRANT OTHERS WILL OBTAIN SIMILAR RESULTS.

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

Copyright © 2016 Intel Corporation. All rights reserved. Intel, the Intel logo, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.
0916/EYAR/KC/PDF 334705-001US