Accelerate Hybrid Cloud AI Workloads Solution Brief
Data Center | Hybrid Cloud

Ease your journey to hybrid/multicloud with a reference architecture for Intel® technology and VMware Cloud Foundation

Executive Summary

To remain competitive in today's world, organizations need a modern data center. Companies using these data centers must accelerate their product development, compete more successfully at a lower cost, and reduce their downtime and maintenance overhead. Technology must move and change with the times; solutions for hosting applications and services must innovate and change as well.

Companies with older and outdated data centers will want to meet these challenges by upgrading to hybrid cloud solutions, where the data center can easily and seamlessly interface between on-premises and cloud systems. Based on VMware Cloud Foundation with VMware Tanzu, Intel addressed these requirements by offering a hybrid/multicloud reference architecture (available in a Base and a Plus configuration) that is easily deployable and manageable for virtual machines (VMs) and containers.

Solution Benefits

Intel's VMware Cloud Foundation reference architecture takes advantage of Intel® compute, memory, storage, and networking innovations to help enable software-defined data centers and hybrid/multicloud adoption.

• Fast AI inference. AI workloads can benefit from innovations from Intel such as Intel® DL Boost.
• Flexibility and portability. VMware Cloud Foundation helps enable enterprises to run their workloads where it makes most sense, whether that's on-premises, in a public cloud, or in several clouds at once.

[Diagram: private cloud and public cloud running on VMware Cloud Foundation, which comprises VMware vSphere, VMware vSAN, VMware NSX-T, VMware vRealize Suite, and VMware SDDC Manager.]

Figure 1. VMware Cloud Foundation supports software-defined data centers that can benefit greatly from Intel® compute, memory, storage, and networking technologies.
Business Challenge: Building a Hybrid Cloud Machine-Learning Architecture

A data center, and the machine learning that it enables, need to run optimally for a business to stay competitive. Balancing high performance and cost is a continual challenge. Enterprises seek infrastructure that is characterized by less downtime, less setup time, easier maintenance, and lower overhead costs, without sacrificing performance. Legacy data centers cannot take advantage of the cost efficiencies and new technologies available in a hybrid/multicloud environment. Such data centers cannot adapt to changing workload requirements quickly and nimbly.

For companies with outdated data center technologies, meeting these challenges involves replacing legacy hardware and software with modern, hybrid-cloud-capable solutions that can accelerate the entire software and hardware provisioning, deployment, and maintenance lifecycle along with application development, testing, and delivery.

But, especially for machine learning, companies may be daunted by assembling and maintaining hybrid cloud infrastructure. Machine learning requires large datasets that are difficult to get into the cloud, and machine-learning models must be continuously retrained and updated. In addition, data for machine learning can be sensitive, highly regulated, or may contain intellectual property, which raises security concerns. Intel and VMware have teamed up to help take the guesswork out of building a machine-learning solution. VMware Cloud Foundation is a hybrid cloud platform that runs on Intel® hardware, offering an easily deployable and manageable hybrid/multicloud platform for managing VMs and orchestrating containers.

Typical Workloads in the Hybrid/Multicloud Data Center

The combination of VMware Cloud Foundation and Intel® technology running on VMs or in containers can support a wide variety of use cases:

• Machine-learning training. Image classification is one of the most popular use cases for deep learning. Training such models can be time-consuming and, without the right tools, requires specialized skills. The VMware Cloud Foundation platform works with various machine-learning frameworks, including DataRobot, which is a popular automated machine-learning platform that takes advantage of optimizations for Intel® architecture. With a library of hundreds of powerful open-source machine-learning algorithms, the DataRobot platform applies many best practices to machine learning and helps to accelerate and scale data science capabilities while increasing transparency, accuracy, and collaboration.
• Machine-learning inference. Once a model is trained, it can be run on new data sets to uncover hidden insights. Inference is compute-intensive, and can benefit from innovations from Intel such as Intel® Deep Learning Boost (Intel® DL Boost) with Vector Neural Network Instructions (VNNI), available starting with vSphere 7 and ESXi 7.0, which are foundational components of the VMware Cloud Foundation 4 platform.
• Data warehousing and analytics. Data warehouses are considered one of the core components of business intelligence. They are a central location to store data from one or more disparate sources as well as current and historical data. The VMware hybrid/multicloud platform supports data warehousing, including industry-proven solutions based on Microsoft SQL Server 2019 or Oracle Database 19c.

A Closer Look at Intel® DL Boost and VNNI

2nd Gen Intel® Xeon® Scalable processors offer something unique that is not available with any other processor on the market: Intel® DL Boost with VNNI. This technology takes advantage of, and improves upon, Intel® AVX-512. VNNI improves AI performance by combining three instructions into one, thereby optimizing the use of compute resources, utilizing the cache more effectively, and avoiding potential bandwidth bottlenecks. In Intel benchmarks, VNNI speeds the delivery of inference results by up to 30x, compared to the previous-generation Intel Xeon Scalable processor.¹

Solution Value: High Performance in the Hybrid/Multicloud Environment

VMware Cloud Foundation is a full-stack HCI solution that helps accelerate adoption of hybrid/multicloud environments. When combined with Intel technology, VMware Cloud Foundation provides consistently high performance, reduced data center footprint, and efficient operations management.

With the end-to-end solution that Intel and VMware offer, enterprises can quickly launch database processing and AI, and scale workloads to accommodate future needs. The unified cloud solution presented in this solution brief can run containerized applications and traditional VMs that are located in an on-premises data center as well as in the public cloud, such as on Amazon Web Services.

Container provisioning and lifecycle management are provided by VMware Tanzu Kubernetes Grid (TKG). The hybrid/multicloud structure of the solution allows enterprises to extend available resources and easily migrate workloads from on-premises to the cloud and back. This solution provides infrastructure and operations across private and public clouds with excellent performance and reliability from Intel® hardware components.

Enterprises can use Intel® Optane™ technology to boost their VMware Cloud Foundation workload performance by placing data closer to the CPU.

Users can also take advantage of Intel DL Boost with VNNI. Intel conducted experiments to show the improvement of inference performance with an Intel architecture-optimized container stack that uses the new VNNI instruction set.
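To make the "three instructions into one" description of VNNI more concrete, the following is a minimal Python sketch rather than Intel's implementation. It assumes that the fused instruction is the AVX-512 VNNI instruction VPDPBUSD and that the older three-step sequence is VPMADDUBSW, VPMADDWD, and VPADDD, as described in Intel's public instruction documentation; this brief does not name them. The function names are hypothetical, and the code models a single 32-bit lane, not a full 512-bit register.

```python
from typing import Sequence


def _wrap_int32(x: int) -> int:
    """Wrap a Python int to signed 32-bit two's complement."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def _saturate_int16(x: int) -> int:
    """Clamp a value to the signed 16-bit range."""
    return max(-32768, min(32767, x))


def fused_lane(acc: int, a_u8: Sequence[int], b_s8: Sequence[int]) -> int:
    """Model one 32-bit lane of a VPDPBUSD-style fused operation (assumed).

    Four unsigned 8-bit values are multiplied by four signed 8-bit values,
    the products are summed without intermediate saturation, and the sum
    is added to a 32-bit accumulator.
    """
    assert len(a_u8) == len(b_s8) == 4
    return _wrap_int32(acc + sum(a * b for a, b in zip(a_u8, b_s8)))


def three_step_lane(acc: int, a_u8: Sequence[int], b_s8: Sequence[int]) -> int:
    """Model the same lane with the assumed pre-VNNI three-step sequence."""
    assert len(a_u8) == len(b_s8) == 4
    # Step 1 (VPMADDUBSW-style): adjacent u8*s8 products are added and
    # saturated to signed 16 bits.
    lo = _saturate_int16(a_u8[0] * b_s8[0] + a_u8[1] * b_s8[1])
    hi = _saturate_int16(a_u8[2] * b_s8[2] + a_u8[3] * b_s8[3])
    # Step 2 (VPMADDWD-style, multiplying by ones): widen the two 16-bit
    # partial sums to 32 bits and add them.
    widened = lo * 1 + hi * 1
    # Step 3 (VPADDD-style): add the result to the 32-bit accumulator.
    return _wrap_int32(acc + widened)


if __name__ == "__main__":
    activations = [1, 2, 3, 4]      # unsigned 8-bit inputs
    weights = [5, -6, 7, -8]        # signed 8-bit weights
    print(fused_lane(100, activations, weights))
    print(three_step_lane(100, activations, weights))
```

In this model the two paths agree whenever no 16-bit intermediate saturates. With extreme inputs such as `[255, 255, 0, 0]` and `[127, 127, 0, 0]` the three-step path clamps its partial sum while the fused path does not, which is one reason, beyond instruction count, that a fused int8 dot product is attractive for inference.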
Our tests benchmarked the ResNet50 v1.5 topology with int8 and fp32 precision, using the Intel® Optimization for TensorFlow container stack with pretrained models from Intel's Model Zoo. We ran three tests:³

• Compare the performance improvement of Intel DL Boost with VNNI using int8 precision against fp32 precision. As shown in Figure 2, int8 precision enabled a 4.1x improvement for the Base configuration and a 4.38x improvement for the Plus configuration. For a small decrease in precision, performance quadrupled.
• Compare throughput from the default TensorFlow container against a container using the Intel Optimization for TensorFlow. Framework optimizations from the Intel Optimization for TensorFlow can provide a 2.33x improvement for the Base configuration and a 2.61x improvement for the Plus configuration.
• Compare the results of running VMware Cloud Foundation 4.0.1 (which takes advantage of Intel DL Boost and VNNI)

Intel Optane technology is a new class of non-volatile memory and storage media that fills the gap between high-performing volatile memory and lower-performing NAND storage and HDDs. By placing data closer to the CPU, Intel Optane technology helps architects confidently deploy an agile, high-performing infrastructure that helps organizations create innovative services and optimize their infrastructure investments.

Intel Optane technology can be deployed in two different ways:

• Intel® Optane™ persistent memory (PMem) gives enterprises the ability to extract more from larger datasets by combining more capacity and native persistence in a DIMM form factor. Data can be accessed, processed, and analyzed in near real time to deliver deep insights, improve operations, and create new revenue streams.
• Intel® Optane™ SSDs help remove data bottlenecks to accelerate transactions and time to insights, so users get what they need, when they need it. With high quality of service and at least 6x faster performance