Unlock Bigdata Analytic Efficiency With Ceph Data Lake

Jian Zhang, Yong Fu

March 2018

Agenda

• Background & Motivations
• The Workloads, Reference Architecture Evolution and Performance Optimization
• Performance Comparison with Remote HDFS
• Summary & Next Step

Challenges of Scaling Hadoop* Storage

BOUNDED storage and compute resources on Hadoop nodes bring challenges:

Typical challenges:
• Data/capacity growth
• Multiple storage silos
• Costs: space, spend, power, utilization, upgrade cost
• Inadequate performance
• Provisioning and configuration

Source: 451 Research, Voice of the Enterprise: Storage Q4 2015
*Other names and brands may be claimed as the property of others.

Options To Address The Challenges

Large Cluster:
• Lacks isolation -- noisy neighbors hinder SLAs
• Lacks elasticity -- rigid cluster size
• Can't scale compute/storage costs separately

More Clusters:
• Cost of duplicating datasets across clusters
• Lacks on-demand provisioning
• Can't scale compute/storage costs separately

Compute and Storage Disaggregation:
• Isolation of high-priority workloads
• Shared big datasets
• On-demand provisioning
• Compute/storage costs scale separately

Compute and Storage disaggregation provides Simplicity, Elasticity, Isolation

Unified Hadoop* API for Cloud Storage

Hadoop Compatible File System (HCFS) abstraction layer provides a unified storage API interface:

hadoop fs -ls s3a://job/

adl:// oss:// s3n:// gs:// s3:// s3a:// wasb://

(Connectors introduced over time, 2006–2016.)
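As a hedged illustration of the HCFS abstraction, the sketch below reads a dataset through the s3a:// connector from PySpark. The endpoint, bucket, and credentials are placeholders, and it assumes the hadoop-aws JAR (with its AWS SDK dependency) is on the classpath.

from pyspark.sql import SparkSession

# Hypothetical endpoint and bucket names, for illustration only.
spark = (SparkSession.builder
         .appName("s3a-hcfs-demo")
         # Point the S3A connector at a Ceph RGW endpoint instead of AWS S3.
         .config("spark.hadoop.fs.s3a.endpoint", "rgw.example.com:80")
         .config("spark.hadoop.fs.s3a.connection.ssl.enabled", "false")
         .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")
         .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")
         # Path-style access is commonly needed for non-AWS object stores.
         .config("spark.hadoop.fs.s3a.path.style.access", "true")
         .getOrCreate())

# Same DataFrame API regardless of whether the path is hdfs:// or s3a://.
df = spark.read.parquet("s3a://job/store_sales/")
print(df.count())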

Proposal: Hadoop* with Disaggregated Object Storage

Compute layer (Compute 1 … Compute N): Hadoop services (SQL, …) running on virtual machines, containers, or bare metal, accessing storage through HCFS.

Disaggregated object storage cluster (Object Storage 1 … Object Storage N):
• Object storage services co-located with gateways
• Dynamic DNS or load balancer in front
• Data protection via replication or erasure coding
• Storage tiering

Workloads

• Simple Read/Write
  • DFSIO: TestDFSIO is an example of a benchmark that attempts to measure the storage's capacity for reading and writing bulk data.
  • Terasort: a popular benchmark that measures the time to sort one terabyte of randomly distributed data on a given computer system.
• Data Transformation
  • ETL: taking data as it is originally generated and transforming it into a format (Parquet, ORC) that is better tuned for analytical workloads (see the sketch below).
• Batch Analytics
  • Consistently executing analytical processes over large data sets.
  • Leverages 54 queries derived from TPC-DS*, with intensive reads across objects in different buckets.
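A minimal sketch of the ETL step described above, assuming the raw data is CSV text in a hypothetical s3a:// bucket and that the s3a connector is configured as in the earlier sketch; it simply rewrites the data as Parquet so the batch-analytics queries can read a columnar format. The bucket names and partition column (the TPC-DS store_sales date key) are illustrative.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("etl-to-parquet").getOrCreate()

# Read the data in its original, row-oriented form (CSV here as an example).
raw = spark.read.option("header", "true").csv("s3a://raw-data/store_sales/")

# Rewrite it as Parquet, partitioned by a column queries typically filter on.
(raw.write
    .mode("overwrite")
    .partitionBy("ss_sold_date_sk")
    .parquet("s3a://warehouse/store_sales_parquet/"))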

Bigdata on Object Storage Performance Overview -- Batch Analytics

Chart: 1TB dataset batch analytics and interactive query, Hadoop/Spark/Presto comparison (lower is better); query time in seconds for Parquet (SSD), ORC (SSD), Parquet (HDD), and ORC (HDD), with Hadoop 2.8.1/Spark 2.2.0, Presto 0.170, and Presto 0.177.
• HDD configuration: 740 HDD OSDs, 40x compute nodes, 20x storage nodes
• SSD configuration: 20 SSDs, 5x compute nodes, 5x storage nodes

• Significant performance improvement from Hadoop 2.7.3/Spark 2.1.1 to Hadoop 2.8.1/Spark 2.2.0 (improvement in s3a)
• Batch analytics performance of the 10-node AFA is almost on par with the 60-node HDD cluster

Hardware Configuration & Topology -- Dedicated LB

5x Compute Node (one head node plus Hadoop/Hive/Spark/Presto on each node; 1x 10Gb NIC to the load balancer):
• Intel® Xeon™ processor E5-2699 v4 @ 2.2GHz, 64GB mem
• 2x 10Gb 82599 NIC
• 2x SSDs
• 3x data storage drives (can be eliminated)

Software:
• Hadoop 2.7.3
• Spark 2.1.1
• Hive 2.2.1
• Presto 0.177
• RHEL 7.3

5x Storage Node, 2x RGW nodes, 1x LB node:
• Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz
• 128GB memory
• 2x 82599 10Gb NIC
• 1x Intel® P3700 1.0TB SSD as WAL and rocksdb
• 4x 1.6TB Intel® SSD DC S3510 as data drives
• 2x 400GB S3700 SSDs
• 1 OSD instance on each S3510 SSD
• RHEL 7.3
• RHCS 2.3

Topology: the LB fronts the cluster with 4x 10Gb NIC; RGW1/RGW2 and the MON connect to the 5 OSD nodes over 2x 10Gb NIC, and each OSD node runs 4 OSD instances (one per S3510 SSD).

Improve Query Success Ratio with Troubleshooting

Charts: query success rate (%) on the 54 selected TPC-DS queries.
• 1TB query success %: hive-parquet, spark-parquet, presto-parquet (untuned), presto-parquet (tuned)
• 1TB & 10TB query success %: spark-parquet, spark-orc, presto-parquet

• 100% of the selected TPC-DS queries passed with tunings
• Improper default configuration issues:
  • small capacity sizes
  • wrong middleware configuration
  • improper Hadoop/Spark configuration for different data sizes and formats

Chart: count of issues by type -- s3a driver issue, runtime issue, middleware issue, improper default configuration, deployment issue, compatibility issue, Ceph issue.

Optimizing HTTP Requests -- The Bottlenecks

ss output on the compute node shows a long list of ESTAB connections from the s3a client (10.0.2.36) to the gateway (10.0.2.254:80): new connections are opened all the time and existing connections are not reused.

RadosGW log (500 errors returned at roughly 2-second intervals):
2017-07-18 14:53:52.259976 7fddd67fc700  1 ====== starting new request req=0x7fddd67f6710 =====
2017-07-18 14:53:52.271829 7fddd5ffb700  1 ====== starting new request req=0x7fddd5ff5710 =====
2017-07-18 14:53:52.273940 7fddd7fff700  0 ERROR: flush_read_list(): d->client_c->handle_data() returned -5
2017-07-18 14:53:52.274223 7fddd7fff700  0 WARNING: set_req_state_err err_no=5 resorting to 500
2017-07-18 14:53:52.274253 7fddd7fff700  0 ERROR: s->cio->send_content_length() returned err=-5
2017-07-18 14:53:52.274257 7fddd7fff700  0 ERROR: s->cio->print() returned err=-5
2017-07-18 14:53:52.274258 7fddd7fff700  0 ERROR: STREAM_IO(s)->print() returned err=-5
2017-07-18 14:53:52.274267 7fddd7fff700  0 ERROR: STREAM_IO(s)->complete_header() returned err=-5

• Compute time (read data + sort) takes the biggest part of query time
• Unreused connections cause high read time and poor performance

Optimizing HTTP Requests -- S3A Input Policy

Background:
• The S3A filesystem client supports the notion of input policies, similar to the POSIX fadvise() API call. This tunes the behavior of the S3A client to optimize HTTP GET requests for various use cases. To optimize HTTP GET requests, you can take advantage of the experimental input policy fs.s3a.experimental.input.fadvise.
• Ticket: https://issues.apache.org/jira/browse/HADOOP-13203

Solution: enable the random read policy in core-site:
• fs.s3a.experimental.input.fadvise = random
• fs.s3a.readahead.range = 64K

How a seek in the IO stream is handled (diff = targetPos - pos):
• diff = 0: sequential read
• diff > 0: skip forward within the current stream
• diff < 0: close the stream and open a connection again (backward seek)

• By reducing the cost of closing and reopening HTTP requests, this is highly efficient for file IO that accesses a binary file through a series of PositionedReadable.read() and PositionedReadable.readFully() calls.
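A hedged sketch of applying the two settings above from PySpark (they can equally be set in core-site.xml). The property names are the real S3A keys quoted on this slide; the app name and paths are illustrative.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("s3a-random-read")
         # Random/positioned-read friendly input policy (Hadoop 2.8+).
         .config("spark.hadoop.fs.s3a.experimental.input.fadvise", "random")
         # Read ahead 64 KB past the requested range before closing a stream.
         .config("spark.hadoop.fs.s3a.readahead.range", "65536")
         .getOrCreate())

# Columnar formats such as Parquet/ORC issue many small positioned reads,
# which is exactly the access pattern the random policy is meant to serve.
df = spark.read.parquet("s3a://warehouse/store_sales_parquet/")
df.createOrReplaceTempView("store_sales")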

Optimizing HTTP Requests -- Performance

Chart: 1TB batch analytics query on Parquet (seconds) -- Hadoop 2.7.3/Spark 2.1.1: 8305; Hadoop 2.8.1/Spark 2.2.0 (untuned): 8267; Hadoop 2.8.1/Spark 2.2.0 (tuned): 2659.
• The readahead feature is supported from Hadoop 2.8.1 but not enabled by default. After applying the random read policy, the 500 errors are gone and performance improved about 3x.

Chart: 1TB dataset batch analytics and interactive query, Hadoop 2.8.1/Spark 2.2.0 comparison (lower is better, seconds) -- SSD: Parquet 2659, ORC 2120; HDD: Parquet 4810, ORC 3346.
• The all-flash storage architecture also shows a great performance benefit and lower TCO compared with HDD storage.

Optimizing HTTP Requests -- Resource Utilization Comparison

Charts (before vs. after optimization): load balancer network IO (sum of rxkB/s and txkB/s), Hadoop shuffle disk bandwidth (sum of rkB/s and wkB/s), and OSD disk bandwidth (sum of rkB/s and wkB/s) over time.

Resolving the RGW BW Bottleneck -- The Bottlenecks

Charts: network IO (sum of rxkB/s and txkB/s) over the whole run and for a single query.

• LB bandwidth has become the bottleneck
• Many messages were observed blocked at the load balancer server (sending to the s3a driver), but not much was blocked on the receiving side of the s3a driver

Hardware Configuration -- No LB with More RGWs and Round-Robin DNS

5x Compute Node (head node plus Hadoop/Hive/Spark/Presto on each node; a DNS server replaces the dedicated load balancer; 1x 10Gb NIC to the storage network):
• Intel® Xeon™ processor E5-2699 v4 @ 2.2GHz, 64GB mem
• 2x 10Gb 82599 NIC
• 2x SSDs
• 3x data storage drives (can be eliminated)

Software:
• Hadoop 2.7.3
• Spark 2.1.1
• Hive 2.2.1
• Presto 0.177
• RHEL 7.3

5x Storage Node:
• Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz
• 128GB memory
• 2x 82599 10Gb NIC
• 1x Intel® P3700 1.0TB SSD as WAL and rocksdb
• 4x 1.6TB Intel® SSD DC S3510 as data drives
• 2x 400GB S3700 SSDs
• 1 OSD instance on each S3510 SSD
• RHEL 7.3
• RHCS 2.3

Topology: the compute nodes reach dedicated RGW nodes (RGW1 … RGW5 in the diagram) directly through round-robin DNS, with no load balancer; the RGWs and MON connect to the 5 OSD nodes (4 OSD instances each) over 2x 10Gb NIC.

Performance Evaluation -- No LB with More RGWs and Round-Robin DNS

Chart: 1TB batch analytics query (Parquet) -- dedicated single load balancer: 2659 s; round-robin DNS: 2244 s.
Chart: Query 42 query time (10TB dataset, Parquet) -- dedicated load balancer: 129 s; round-robin DNS: 78 s.

• 18% performance improvement with more RGWs and round-robin DNS
• Query 42 (which has less shuffle) is 1.64x faster in the new architecture

Key Resource Utilization Comparison

Charts: S3A driver network IO (sum of rxkB/s and txkB/s) with the dedicated load balancer vs. with round-robin DNS.

• The compute side (Hadoop s3a driver) can read more data from the OSDs, which shows that the DNS deployment gains more network throughput than a single gateway with bonding/teaming technology

Hardware Configuration -- RGW and OSD Collocated

5x Compute Node (head node plus Hadoop/Hive/Spark/Presto on each node; DNS server on the head; 1x 10Gb NIC to the storage network):
• Intel® Xeon™ processor E5-2699 v4 @ 2.2GHz, 64GB mem
• 2x 10Gb 82599 NIC
• 2x SSDs
• 3x data storage drives (can be eliminated)

Software:
• Hadoop 2.7.3
• Spark 2.1.1
• Hive 2.2.1
• Presto 0.177
• RHEL 7.3

5x Storage Node (RGW instances co-located with the OSDs: RGW1 … RGW5):
• Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz
• 128GB memory
• 2x 82599 10Gb NIC
• 1x Intel® P3700 1.0TB SSD as WAL and rocksdb
• 4x 1.6TB Intel® SSD DC S3510 as data drives
• 2x 400GB S3700 SSDs
• 1 OSD instance on each S3510 SSD
• RHEL 7.3
• RHCS 2.3

Topology: storage nodes connect over 3x 10Gb NIC, with the MON and the co-located RGW1 … RGW5 on the OSD nodes (4 OSD instances each).

Performance Evaluation with RGW & OSD Collocated

Chart: 1TB batch analytics query on Parquet -- dedicated RGWs vs. RGW+OSD co-location are roughly on par (2242 s and 2162 s, a difference of under 4%).
Chart: other workloads (1TB dfsio_write, 10TB Parquet ETL) -- dedicated RGWs vs. RGW+OSD co-location, time in seconds.

• No need for extra dedicated RGW servers; RGW instances and OSDs go through different network interfaces by enabling ECMP
• No performance degradation, but much lower TCO

RGW & OSD Collocated -- RGW Scaling

Charts: 1TB DFSIO write throughput (MB/s) and 10TB ETL time (seconds, lower is better) as a function of the number of RGWs (1 to 18).

• Scaling out RGWs can improve performance until the OSDs (storage) saturate
• How many RGWs deliver the best performance should therefore be decided by the bandwidth of each RGW server and the throughput of the OSDs (see the sketch below)
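A hedged back-of-the-envelope helper for the sizing rule above: the useful number of RGWs is roughly the aggregate OSD throughput divided by the per-RGW network bandwidth, since extra gateways past storage saturation add no throughput. The numbers in the example are illustrative, not measurements from this deck.

import math

def estimate_rgw_count(osd_throughput_mb_s: float, per_rgw_bandwidth_mb_s: float) -> int:
    """Smallest number of RGWs whose combined front-end bandwidth
    can keep the OSD back end busy; beyond this the OSDs saturate."""
    return max(1, math.ceil(osd_throughput_mb_s / per_rgw_bandwidth_mb_s))

# Example: ~2,300 MB/s of aggregate OSD write throughput behind RGW hosts
# that each expose a single 10 GbE NIC (~1,150 MB/s usable).
print(estimate_rgw_count(osd_throughput_mb_s=2300, per_rgw_bandwidth_mb_s=1150))  # -> 2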

Using Ceph RBD as Shuffle Storage -- Eliminate the Physical Drive on the Compute Node!

Chart: 1TB batch analytics query on Parquet -- RBD for shuffle and physical device for shuffle are within about 1% of each other (2262 s and 2244 s).
Chart: UC11, 1TB dataset -- per-query times with physical shuffle storage vs. RBD shuffle storage (physical-query_time vs. rbd-query_time).

• Mount two RBDs remotely on each compute node instead of using a physical shuffle device (a hedged sketch follows below)
• Ideally, the latency of remote RBDs is higher than that of a local physical device, but the bandwidth of the remote RBDs is not lower than the local device either
• So the overall performance of a TPC-DS query set on RBDs may be close to, or even better than, the physical device when there are heavy shuffles
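A hedged sketch of the RBD-for-shuffle idea: the comments show the kind of rbd/mkfs/mount preparation assumed on each compute node (pool, image names, and sizes are made up), and the Python part simply points Spark's shuffle directories at the resulting mounts.

from pyspark.sql import SparkSession

# Assumed per-node preparation (run once, outside Spark; names are illustrative):
#   rbd create shuffle-pool/shuffle0 --size 200G
#   rbd map shuffle-pool/shuffle0                     # -> /dev/rbd0
#   mkdir -p /mnt/rbd-shuffle0 && mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt/rbd-shuffle0
#   (repeat for a second image mounted at /mnt/rbd-shuffle1)

spark = (SparkSession.builder
         .appName("rbd-shuffle")
         # Spread shuffle/spill files across the two remotely mounted RBDs
         # instead of a local physical drive.
         .config("spark.local.dir", "/mnt/rbd-shuffle0,/mnt/rbd-shuffle1")
         .getOrCreate())

# On YARN, the NodeManager's yarn.nodemanager.local-dirs would be pointed
# at the same mount points instead of spark.local.dir.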

Hardware Configuration -- Remote HDFS

5x Compute Node (name node on the head; node manager and Spark on each node; 1x 10Gb NIC to the data nodes):
• Intel® Xeon™ processor E5-2699 v4 @ 2.2GHz, 64GB mem
• 2x 10Gb 82599 NIC
• 2x S3700 SSDs as shuffle storage

Software:
• Hadoop 2.7.3
• Spark 2.1.1
• Hive 2.2.1
• Presto 0.177
• RHEL 7.3

5x Data Node (Data Node 1 … Data Node 5):
• Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz
• 128GB memory
• 2x 82599 10Gb NIC
• 1x Intel® P3700 1.0TB SSD as WAL and rocksdb
• 7x 400GB S3700 SSDs as data store
• RHEL 7.3
• RHCS 2.3

Bigdata on Cloud vs. Remote HDFS -- Batch Analytics

Chart: 1TB dataset batch analytics and interactive query comparison with remote HDFS (lower is better); query time in seconds for Parquet and ORC, over object store vs. remote HDFS. HDD configuration: 740 HDD OSDs, 24x compute nodes; SSD configuration: 20 SSDs, 5x compute nodes.

• On-par performance compared with remote HDFS
• With optimizations, bigdata analytics on object storage is on par with remote HDFS, especially on Parquet-format data
• The performance of the s3a driver is close to the native DFS client, demonstrating that the compute/storage-separated solution delivers considerable performance compared with the co-located solution

Bigdata on Cloud vs. Remote HDFS -- DFSIO

Chart: DFSIO on 1TB dataset (throughput, MB/s) -- dfsio write and dfsio read for remote HDFS (replication 1), remote HDFS (replication 3), and Ceph over s3a.
Chart: disk bandwidth over time (sum of rkB/s and wkB/s).

• DFSIO write performance on Ceph is better than remote HDFS (by 43%), but read performance is 34% lower
• Writes to Ceph hit the disk bottleneck
• Shuffle storage blocked at consuming data from …

Bigdata on Cloud vs. Remote HDFS -- Terasort

Chart: 1TB Terasort throughput (MB/s) -- remote HDFS: 887; Ceph over s3a: 442.
Charts: OSD data drive disk bandwidth (sum of rkB/s and wkB/s) and OSD data drive IO latencies (sum of await and svctm) over time.

• The time cost of the reduce stage is the biggest part
• The reduce stage reads and writes concurrently

Bigdata on Cloud vs. Remote HDFS -- Ongoing Rename Optimizations

DirectOutputCommitter: an implementation in Spark 1.6 that returns the destination address as the working path, so there is no need to rename/move task output; no good robustness against failures; removed in Spark 2.0.
IBM's "Stocator": targets OpenStack Swift; good robustness, but it is another file system, not a committer for s3a.
Staging committer: one choice of the new s3a committer*; needs a large amount of local disk capacity for staged data.
Magic committer: one choice of the new s3a committer*; if you know your object store is consistent, or you use S3Guard, this committer has higher performance.

* The new s3a committers have been merged into trunk and will be released in Hadoop 3.1 or later.

Rename operation can be improved!
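As a hedged illustration only (this assumes Hadoop 3.1+ with the new S3A committers and the spark-hadoop-cloud binding, which is beyond the Hadoop 2.7/2.8 setup benchmarked here), switching Spark to the magic committer looks roughly like the following; the property and class names come from the upstream s3a committer documentation.

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("s3a-magic-committer")
         # Select the "magic" S3A committer instead of rename-based commits.
         .config("spark.hadoop.fs.s3a.committer.name", "magic")
         .config("spark.hadoop.fs.s3a.committer.magic.enabled", "true")
         # Bind Spark's commit protocol to the S3A committers
         # (classes shipped in the spark-hadoop-cloud module).
         .config("spark.sql.sources.commitProtocolClass",
                 "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
         .config("spark.sql.parquet.output.committer.class",
                 "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
         .getOrCreate())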

Summary and Next Step

Summary
• Bigdata on a Ceph data lake is functionally ready, validated with the industry-standard decision-support workload TPC-DS
• Bigdata on the cloud delivers on-par performance with remote HDFS for batch analytics; write-intensive operations still need further optimization
• All-flash solutions demonstrated a significant TCO benefit compared with HDD solutions

Next
• Expand the scope of analytic workloads
• Rename operation optimizations to improve performance
• Accelerate performance with a speed-up layer for shuffle

Experiment Environment

Cluster roles:
• Hadoop head (1 node): Hadoop name node, secondary name node, resource manager, Hive metastore service, Yarn history server, Spark history server, Presto server
• Hadoop slave (5 nodes): data node, node manager, Presto server
• Load balancer (1 node): HAProxy
• OSD (5 nodes): Ceph OSD
• RGW (5 nodes): Ceph RADOS gateway

Hardware:
• Processor: Intel® Xeon® CPU E5-2699 v4 @ 2.20GHz, 44 cores, HT enabled (Hadoop and Ceph nodes); Intel® Xeon® CPU E3-1280 @ 3.50GHz, 4 cores, HT enabled (load balancer)
• Memory: 128GB (Hadoop head), 64GB (Hadoop slave), 128GB (OSD/RGW), 32GB (load balancer)
• Storage: 4x 1TB HDD (Hadoop head); 2x Intel S3510 480GB SSD (Hadoop slave); 1x Intel S3510 480GB SSD (load balancer, RGW); 1x Intel® P3700 1.6TB SSD as journal plus 4x 1.6TB Intel® SSD DC S3510 and 2x 400GB S3700 as data store (OSD)
• Network: 10Gb (Hadoop head/slave), 40Gb (load balancer), 10Gb+10Gb (OSD), 10Gb (RGW)

SW Configuration:
• Hadoop version: 2.7.3 / 2.8.1
• Spark version: 2.1.1 / 2.2.0
• Hive version: 2.2.1
• Presto version: 0.177
• Executor memory: 22GB
• Executor cores: 5
• # of executors: 24
• JDK version: 1.8.0_131
• Memory overhead: 5GB

S3A Key Performance Configuration:
• fs.s3a.connection.maximum = 10
• fs.s3a.threads.max = 30
• fs.s3a.socket.send.buffer = 8192
• fs.s3a.socket.recv.buffer = 8192
• fs.s3a.threads.keepalivetime = 60
• fs.s3a.max.total.tasks = 1000
• fs.s3a.multipart.size = 100M
• fs.s3a.block.size = 32M
• fs.s3a.readahead.range = 64k
• fs.s3a.fast.upload = true
• fs.s3a.fast.upload.buffer = array
• fs.s3a.fast.upload.active.blocks = 4
• fs.s3a.experimental.input.fadvise = random
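If it helps to see the table above in applied form, here is a hedged sketch that feeds the listed S3A settings into a SparkSession; the values are the ones in the table (readahead expressed in bytes), and the same keys could instead be placed in core-site.xml.

from pyspark.sql import SparkSession

# Values copied from the "S3A Key Performance Configuration" table above.
S3A_CONF = {
    "fs.s3a.connection.maximum": "10",
    "fs.s3a.threads.max": "30",
    "fs.s3a.socket.send.buffer": "8192",
    "fs.s3a.socket.recv.buffer": "8192",
    "fs.s3a.threads.keepalivetime": "60",
    "fs.s3a.max.total.tasks": "1000",
    "fs.s3a.multipart.size": "100M",
    "fs.s3a.block.size": "32M",
    "fs.s3a.readahead.range": "65536",   # 64k
    "fs.s3a.fast.upload": "true",
    "fs.s3a.fast.upload.buffer": "array",
    "fs.s3a.fast.upload.active.blocks": "4",
    "fs.s3a.experimental.input.fadvise": "random",
}

builder = SparkSession.builder.appName("s3a-tuned")
for key, value in S3A_CONF.items():
    # Prefixing with spark.hadoop. forwards each key to the Hadoop Configuration.
    builder = builder.config("spark.hadoop." + key, value)
spark = builder.getOrCreate()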

Legal Disclaimer & Optimization Notice

• Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.
• INFORMATION IN THIS DOCUMENT IS PROVIDED "AS IS". NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
• Copyright © 2018, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

Optimization Notice Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
