Intel Cirrascale and Petrobras Case Study

Case Study Intel® Xeon Phi™ Coprocessor Intel® Xeon® Processor E5 Family Big Data Analytics High-Performance Computing Energy Accelerating Energy Exploration with Intel® Xeon Phi™ Coprocessors Cirrascale delivers scalable performance by combining its innovative PCIe switch riser with Intel® processors and coprocessors To find new oil and gas reservoirs, organizations are focusing exploration in the deep sea and in complex geological formations. As energy companies such as Petrobras work to locate and map those reservoirs, they need powerful IT resources that can process multiple iterations of seismic models and quickly deliver precise results. IT solution provider Cirrascale began building systems with Intel® Xeon Phi™ coprocessors to provide the scalable performance Petrobras and other customers need while holding down costs. Challenges • Enable deep-sea exploration. Improve reservoir mapping accuracy with detailed seismic processing. • Accelerate performance of seismic applications. Speed time to results while controlling costs. • Improve scalability. Enable server performance and density to scale as data volumes grow and workloads become more demanding. Solution • Custom Cirrascale servers with Intel Xeon Phi coprocessors. Employ new compute blades with the Intel® Xeon® processor E5 family and Intel Xeon Phi coprocessors. Cirrascale uses custom PCIe switch risers for fast, peer-to-peer communication among coprocessors. Technology Results • Linear scaling. Performance increases linearly as Intel Xeon Phi coprocessors “Working together, the are added to the system. Intel® Xeon® processors • Simplified development model. Developers no longer need to spend time optimizing data placement. and Intel® Xeon Phi™ coprocessors help Business Value • Faster, better analysis. More detailed and accurate modeling in less time HPC applications shed improves oil and gas exploration. performance constraints.” • Enhanced ROI. The processor and coprocessors work together to deliver maximum value from compute resources. • More processing power. Performs more processing in the same data center – David Driggers, CEO, space, for reduced costs. Cirrascale Speeding Seismic Analysis with Petrobras, the Brazilian multinational Coprocessors energy corporation. “Like many oil and A premier provider of build-to-order gas companies, Petrobras is continually infrastructure, Cirrascale works with its looking for ways to scale performance of customers to engineer custom server its compute-intensive seismic applications solutions that can scale as data sets grow and increase compute density to make larger and workloads become more maximum use of its data center space,” demanding. One of its customers is says David Driggers, CEO of Cirrascale. To help accelerate seismic applications, Designing a Solution for To enable better scaling, Cirrascale Petrobras had been using servers Optimum Scaling developed a PCIe switch riser that equipped with graphic processing Cirrascale set out to design a system to allows multiple Intel Xeon Phi units (GPUs) to augment the main optimize scaling and minimize latency. coprocessors to communicate directly processors. But the company found the In a traditional dual-processor server with each other as peers, without GPU approach did not fully capitalize with four coprocessors, infrastructure crossing through the main processors. on multithreaded applications. providers such as Cirrascale typically Each proprietary riser holds up to four attach one pair of coprocessors to each Intel Xeon Phi coprocessors. Engineers from Petrobras and Intel of the main processors. “Coprocessors The Cirrascale team developed a demonstrated demanding seismic code in the same pair communicate directly, compute blade design with two switch running on a system with four Intel but to communicate with the other risers holding a combined eight Intel Xeon Phi coprocessors. They delivered pair, they must go through the main Xeon Phi coprocessors, integrated on performance per watt competitive with processors first—a slower path the blade with the Intel Xeon processor Intel® processor-only systems, but with that imposes a latency penalty on E5 family. “The Intel Xeon processor E5 significantly more processing being performance,” says Ellis. “This approach family delivers superior performance done on each server node thanks to the also complicates programming, for high-performance computing Intel coprocessors. However, because because programmers are forced [HPC] applications compared with of the massive and growing data sets to place data in different locations previous generations. At the same time, involved in seismic processing, the depending on which coprocessor will these processors support massive Petrobras team needed to know how make actual use of it.” well the performance would scale as memory footprints and an abundance more coprocessors were added. “Having of high-speed I/O,” says Driggers. twice the number of coprocessors “Working together, the Intel Xeon doesn’t necessarily deliver twice processors and Intel Xeon Phi the performance,” says Scott Ellis, coprocessors help HPC applications engineering fellow at Cirrascale. “System shed performance constraints.” latency can take a toll.” 2 Scalable Intel® Xeon Phi™ performance helps energize the future Delivering Outstanding Maximizing Benefits from Test Results Compute Resources Lessons Learned Tests demonstrated the outstanding The Cirrascale riser with Intel “For true HPC, we find it is important scalability of the new Cirrascale coprocessors also enables companies to really understand how our code implementation. The testing was to obtain more work from their matches the architecture of the conducted using a server with eight Intel Intel Xeon processor E5 family than systems we are considering,” says Xeon Phi 3120P coprocessors plus two with GPU-based implementations. Paulo Souza, HPC consultant and Intel Xeon processors E5-2670. Scaling Instead of merely offloading work software engineer at Petrobras. “That was measured in gigasamples per to coprocessors, the Cirrascale understanding has enabled us to second, with each sample representing implementation shares work among the perform optimizations with the Intel® one point in a complex model of an oil processors and coprocessors through Xeon Phi™ coprocessor and obtain or gas reservoir. “The test results show standard message passing interface excellent results.” that we can scale with nearly a one-to- (MPI) protocols typically used in local one ratio of performance gain to Intel networks. “The main processors coprocessors added,” says Driggers. perform as much as 40 percent of the “While we tested with the Intel Xeon Phi computational work,” says Driggers. 3120P, the results indicated that any Intel Xeon Phi coprocessor would have Simplifying Application the same scaling, allowing customers to Development choose the best price-performance for Using Intel Xeon Phi coprocessors “ Developers no longer need their specific job.” allows software developers to work more quickly since they can employ to spend hours taking into Providing Faster, Better the same tools and approaches as they account where a piece of data Seismic Analysis do with the Intel Xeon processor E5 The increased performance helps family. “Developers no longer need resides or how to get it from deliver better, more accurate modeling to spend hours taking into account one place to another. All of the results. “Companies like Petrobras can where a piece of data resides or how perform more iterations of seismic to get it from one place to another,” Intel® Xeon Phi™ coprocessors models and generate more precise says Ellis. “All of the Intel Xeon Phi on a Cirrascale server blade images of reservoir locations, avoiding coprocessors on a Cirrascale server costly drilling mistakes. This is a key blade communicate as peers.” communicate as peers.” consideration since a large portion of oil reserves that companies like Reducing Cost of Ownership – Scott Ellis, Petrobras are investigating are located Petrobras was impressed with the Engineering Fellow, beneath deep and ultra-deep waters,” scalability test results for the Intel Xeon Cirrascale says Driggers. “Locating reservoirs Phi processor blade system. “Scalable more quickly also helps reduce performance was one of the most processing costs.” important capabilities we needed the Intel Xeon Phi coprocessors to provide,” says Paulo Souza, HPC consultant and software engineer at Petrobras. 3 The company also sees the potential For its part, the Cirrascale team Find the solution that’s right for to reduce total cost of ownership using appreciates the ease of working with your organization. Contact your Intel Intel Xeon Phi coprocessors. “With Intel Xeon Phi coprocessors. “Having representative, visit Intel’s Business eight cards per server, we can reduce a robust set of tools is very important Success Stories for IT Managers, and the overall number of server nodes,” for us to embrace a technology,” says check out IT Center, Intel’s resource for says Souza. “That saves data center real Driggers. “Intel Xeon Phi coprocessors the IT industry. estate and helps reduce electricity and have a mature tool set based on open More information about Intel® Xeon Phi™ maintenance costs.” standards that is simple for us to use, coprocessors can be found at saves time, and allows us to speed www.intel.com/xeonphi up adoption of the Intel coprocessor- based systems for our customers.” Software developer resources can be found at https://software.intel.com/

Intel Cirrascale and Petrobras Case Study

Accelerating HPL Using the Intel Xeon Phi 7120P Coprocessors

Power Measurement Tutorial for the Green500 List

Towards Better Performance Per Watt in Virtual Environments on Asymmetric Single-ISA Multi-Core Systems

Comparing the Power and Performance of Intel's SCC to State

NVIDIA Ampere GA102 GPU Architecture Whitepaper

Using Intel Processors for DSP Applications

Quantitative Analysis Modern Processor Design: Fundamentals of Superscalar Processors

High Performance Embedded Computing in Space

Low-Power High Performance Computing

PAKCK: Performance and Power Analysis of Key Computational Kernels on Cpus and Gpus

EPYC: Designed for Effective Performance

Measurement, Control, and Performance Analysis for Intel Xeon Phi