Intel® AI Workshop 2021

Accelerate Your AI Journey

Laurent Duhem – HPC/AI Solutions Architect ([email protected])
Shailen Sobhee – AI Software Technical Consultant ([email protected])

Notices and Disclaimers

▪ Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration.

▪ No product or component can be absolutely secure.

▪ Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. For more complete information about performance and benchmark results, visit http://www.intel.com/benchmarks.

▪ Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit http://www.intel.com/benchmarks.

▪ Intel® Advanced Vector Extensions (Intel® AVX) provides higher throughput to certain processor operations. Due to varying processor power characteristics, utilizing AVX instructions may cause a) some parts to operate at less than the rated frequency and b) some parts with Intel® Turbo Boost Technology 2.0 to not achieve any or maximum turbo frequencies. Performance varies depending on hardware, software, and system configuration and you can learn more at http://www.intel.com/go/turbo.

▪ Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

▪ Cost reduction scenarios described are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.

▪ Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.

▪ © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

About This Presentation

▪ This presentation is tailored for developers and curious data scientists seeking optimal performance in their daily production work.
▪ No AI theory will be covered during this talk.
▪ Preliminary knowledge of the ML frameworks/packages shown below is certainly a must.

▪ Intel's software offering is vast and addresses multiple hardware flavors (XPUs) – so buckle up for a dense session. ;-)

AI is Interdisciplinary

Data pipeline: Create → Transmit → Ingest → Integrate → Stage → Clean → Normalize
Machine learning: reinforcement learning, regression, classification, clustering, ensemble methods, Bayes methods
More AI: symbolic reasoning, analogical reasoning, evolutionary computing, and more
More analytics: search/query, statistics, etc.
Deep learning: image recognition, object detection, image segmentation, natural language processing (NLP), speech ⇄ text, recommender systems, image generation
Together these turn data into business, operational, and security insights.

Agenda: AI Solutions | Flexible Hardware Acceleration for AI | Software Ecosystem Portfolio


INTELLIGENT SOLUTIONS

Intel AI Builders – Industry-leading ISVs and system integrators; solutions to accelerate adoption of artificial intelligence (AI) across verticals and workloads. builders.intel.com/ai

OEM Systems – Ready-to-deploy, end-to-end solutions jointly designed and developed by Intel & partners to simplify customer experiences. intel.com/ai/deploy-on-intel-architecture.html

Public Cloud Solutions – AI-optimized solutions developed by Intel with cloud service providers (CSPs), including for Amazon Web Services (AWS), Baidu Cloud, Google Cloud Platform, Microsoft Azure & more.

Intel® Select Solutions – AI-optimized solutions for real-world demands that are pre-configured and rigorously benchmark tested to accelerate infrastructure deployment. intel.com/selectsolutions

INTELLIGENT SOLUTIONS: CSP PaaS Offerings – Overview

| | AWS | Azure | GCP |
|---|---|---|---|
| Name | SageMaker | Azure Machine Learning with Brainwave | Google App Engine |
| Type | PaaS | PaaS | PaaS |
| Instance | C5 | Fv2 or HC series | Flexible Environment |
| Description | A fully managed platform to easily build, train and deploy machine learning models at any scale | A fully managed cloud service to easily build, deploy, and share predictive analytics solutions | A fully managed serverless platform to build highly scalable applications |
| OS | N/A | N/A | N/A |
| HW SKUs | C5 instance (Skylake) | Intel® Arria® 10 FPGA | – |
| FW | Pre-configured DAAL4Py | Marketplace approach for optimized frameworks | WIP |
| Use case | Ad targeting, prediction & forecasting, industrial IoT & machine learning | – | Modern web applications and scalable mobile backends |
| CSP value prop | Ease of use; pre-configured environment | – | – |

INTELLIGENT SOLUTIONS: CSP IaaS Offerings – Overview

| | AWS DL AMI | Azure Data Science VMs | Azure CycleCloud | Google Compute Engine |
|---|---|---|---|---|
| Instance | C5 (pre-installed pip packages) or C5 customer-built DL engine (clean slate) | Fv2 or HC series | HC series | Platform based on Skylake |
| Description | Pre-installed or customer-built deep learning AMIs | Azure VM images, pre-installed, configured and tested with several popular AI/DL tools | Easy-to-set-up clusters with Singularity containers | Scalable, high-performance virtual machines |
| HW SKUs | Intel® Xeon® Platinum 8000 series (Skylake) | Various HW platforms | Any HW platform (validated on Skylake) | Intel® Xeon® Platinum family (code-named Skylake) |
| Optimized FW | TensorFlow, MXNet, and PyTorch | TensorFlow and VM templates on Marketplace | TensorFlow | TensorFlow |
| Instance size | 2 to 72 vCPU | Fsv2 series, 2 to 72 vCPU | Any instance size | Up to 160 vCPU |
| Memory | 144 GiB | Up to 144 GiB | – | Up to 3.75 TB |
| Use case | Advanced compute-intensive workloads: high-performance web servers, HPC, batch processing, ad serving, gaming, distributed analytics and ML/DL inference | Batch processing, web servers, analytics and gaming | HPC workloads, but can also run deep learning | Improve and manage patient data, create intuitive customer experiences |
| CSP value prop | Best price performance | Lower per-hour list price; best value in price-performance in Azure | Dynamically provision HPC Azure clusters and orchestrate data and jobs for hybrid and cloud workflows; easily transition from on-prem to cloud | Industry-leading price and performance portfolio; compliance and global reach |

INTELLIGENT SOLUTIONS: CSP IaaS Offerings – Overview

• Amazon Web Services: intel.ai/aws

• Baidu Cloud: intel.ai/baidu

• Google Cloud Platform: intel.ai/gcp

• Microsoft Azure: intel.ai/microsoft


Agenda: AI Solutions | Flexible Hardware Acceleration for AI | Software Ecosystem Portfolio

FLEXIBLE ACCELERATION: Delivering AI from Cloud to Device

Cloud/DC Edge Device

CPU only – for mainstream AI use cases

CPU + GPU – when compute is dominated by AI, HPC, graphics, and/or real-time media

CPU + custom – when compute is dominated by deep learning (DL), e.g. Intel® FPGAs

Targets span DL training/inference, custom DL, and DL inference across cloud/DC, edge, and device.

DC = Data Center DL = Deep Learning

FLEXIBLE ACCELERATION: Intel® Xeon® Scalable Processors

THE ONLY DATA CENTER CPU OPTIMIZED FOR AI
▪ Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
▪ Intel® Deep Learning Boost (Intel® DL Boost)
▪ Intel® Advanced Matrix Extensions (Intel® AMX)

2019 – 2nd Gen Cascade Lake (14nm): new AI acceleration (VNNI), new memory/storage hierarchy
2021 – 3rd Gen Cooper Lake (14nm): Intel DL Boost (bfloat16); 3rd Gen Ice Lake (10nm): shipping 1H'21
2022 – Next (4th) Gen Sapphire Rapids: next-generation technologies, Intel AMX

LEADERSHIP PERFORMANCE

The Evolution of Microprocessor Parallelism: more cores → more threads → wider vectors

| | 64-bit Intel® Xeon® | 5100 series | 5500 series | 5600 series | E5-2600 v2 series | E5-2600 v3/v4 series | Intel® Xeon® Scalable Processor¹ |
|---|---|---|---|---|---|---|---|
| Core(s) | 1 | 2 | 4 | 6 | 12 | 18–22 | Up to 28 |
| Threads | 2 | 2 | 8 | 12 | 24 | 36–44 | Up to 56 |
| SIMD width | 128 | 128 | 128 | 128 | 256 | 256 | 512 |
| Vector ISA | Intel® SSE3 | Intel® SSE3 | Intel® SSE4.1 | Intel® SSE4.2 | Intel® AVX | Intel® AVX2 | Intel® AVX-512 |

Scalar: one A + B = C per instruction. SIMD: many A + B = C element pairs per instruction.

1. Product specification for launched and shipped products available on ark.intel.com.
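The scalar-vs-SIMD contrast above can be felt from Python: a hand-written element loop issues one addition at a time, while NumPy dispatches the same work to vectorized kernels (and, in the Intel® Distribution for Python, to oneMKL-backed routines). A minimal sketch, not a benchmark:

```python
import time
import numpy as np

def scalar_add(a, b):
    # One A + B = C element at a time, as in the "scalar" column above.
    c = [0.0] * len(a)
    for i in range(len(a)):
        c[i] = a[i] + b[i]
    return c

n = 100_000
a = np.random.rand(n)
b = np.random.rand(n)

t0 = time.perf_counter()
c_scalar = scalar_add(a, b)
t1 = time.perf_counter()
c_simd = a + b  # NumPy dispatches to vectorized (SIMD) kernels
t2 = time.perf_counter()

assert np.allclose(c_scalar, c_simd)
print(f"scalar loop: {t1 - t0:.4f}s, vectorized: {t2 - t1:.4f}s")
```

The absolute timings vary by machine; the point is only that the vectorized path does the same arithmetic with far fewer instructions.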

FLEXIBLE ACCELERATION: Intel® Deep Learning Boost Overview

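Intel DL Boost's VNNI instructions fuse int8 multiplies with accumulation into int32 registers, which is why int8-quantized inference runs fast on Cascade Lake and later. The NumPy sketch below mirrors only the arithmetic (quantize, multiply-accumulate in int32, rescale), not the instruction itself; the scales and the `quantize` helper are illustrative assumptions:

```python
import numpy as np

def quantize(x, scale):
    # Symmetric int8 quantization: round to the nearest step, clamp to int8 range.
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
act = rng.standard_normal(64).astype(np.float32)  # activations
wgt = rng.standard_normal(64).astype(np.float32)  # weights

s_a, s_w = 0.05, 0.05
qa, qw = quantize(act, s_a), quantize(wgt, s_w)

# VNNI-style dot product: int8 x int8 products accumulated in int32,
# then rescaled back to float. Real VNNI does this across 512-bit
# vectors in a single instruction; this just mirrors the math.
acc = np.dot(qa.astype(np.int32), qw.astype(np.int32))
approx = float(acc) * (s_a * s_w)
exact = float(np.dot(act, wgt))
print(f"fp32 dot: {exact:.3f}, int8 + int32-accumulate dot: {approx:.3f}")
```

The int8 result tracks the fp32 result to within the quantization error, which is the trade DL Boost makes for roughly 4x more multiply-accumulates per vector than fp32.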

FLEXIBLE ACCELERATION: Intel FPGA for AI

Falcon Mesa – first to market to accelerate AI workloads

Flexible: precision, sparsity, and more
Real-time: latency, speech workloads
Evolving AI workloads: recurrent neural networks (RNN), long short-term memory (LSTM), adversarial networks, reinforcement learning, neuromorphic computing, and more
Deploying AI+ for system-level functionality: AI + I/O ingest, AI + networking, AI + security, AI + pre/post processing, and more

Enabling real-time AI in a wide range of embedded, edge, and cloud apps

All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.

FLEXIBLE ACCELERATION: Movidius VPU – built for edge AI, flexible form factors, edge experiences

Deep learning inference + computer vision + media

Faster memory bandwidth

Groundbreaking high-efficiency architecture


19 FLEXIBLE Habana – an Intel Company ACCELERATION

Visit www.habana.ai

AI processor company

All products, computer systems, dates, and figures are preliminary based on current expectations, and are subject to change without notice.

FLEXIBLE ACCELERATION: Accelerating AI – leadership performance for data-level parallel AI workloads

7nm process technology

EMIB (2D) and Foveros (3D) packaging technologies


Agenda: AI Solutions | Flexible Hardware Acceleration for AI | Software Ecosystem Portfolio

OPTIMIZED SOFTWARE: Software as a Differentiator

"For every order of magnitude performance from new hardware, there are >2 orders of magnitude unlocked by software." – Chief Architect, SVP Intel Architecture, Graphics and Software

Intel has¹:
▪ #1 contributor to the Linux kernel, with >1/2 million lines of code modified each year
▪ 1,500 software engineers
▪ >100 operating systems optimized
▪ Top 3 contributor to Chromium OS
▪ 10,000 high-touch customer deployments
▪ Top 10 contributor to OpenStack
▪ >12M developers using Intel® software

Uses of Intel® software: developers using C++, Fortran, and Python; optimizing code for vectorization, parallelization, multithreading, and memory use to fully utilize the CPU.

Source 1: Intel internal numbers

OPTIMIZED SOFTWARE: Programming Challenges for Multiple Architectures

Application workloads need diverse hardware: scalar (CPU), vector (GPU), spatial (FPGA), and matrix (other accelerator) architectures.

Growth in specialized workloads means a variety of data-centric hardware is required. Today, a separate programming model and toolchain is required for each architecture – a CPU programming model, a GPU programming model, an FPGA programming model, and other accelerator programming models – all sitting beneath middleware & frameworks.

Software development complexity limits freedom of architectural choice across XPUs.

OPTIMIZED SOFTWARE: oneAPI – One Programming Model for Multiple Architectures and Vendors

Freedom to make your best choice: choose the best accelerated technology – the software doesn't decide for you.

Realize all the hardware value: performance across CPUs, GPUs, FPGAs, and other accelerators, via both an industry initiative and an Intel product.

Develop & deploy software with peace of mind: open industry standards provide a safe, clear path to the future, compatible with existing languages and programming models including C++, Python, SYCL, OpenMP, Fortran, and MPI – spanning XPUs (CPU, GPU, FPGA, other accelerators).

OPTIMIZED SOFTWARE: Intel's oneAPI Ecosystem

Built on Intel's rich heritage of CPU tools, expanded to XPUs. Middleware & frameworks are powered by oneAPI.

oneAPI: a cross-architecture language based on C++ and SYCL standards, powerful libraries designed for acceleration of domain-specific functions, and a low-level hardware interface across XPUs (CPU, GPU, FPGA, other accelerators).

Intel® oneAPI product: languages, libraries (oneMKL, oneTBB, oneVPL, oneDPL, oneDAL, oneDNN, oneCCL), analysis & debug tools, a compatibility tool, and a complete set of advanced compilers and porting, analysis and debugger tools.

Powered by oneAPI: frameworks and middleware built using one or more of the oneAPI industry specification elements, the DPC++ language, and the libraries listed on oneapi.com.

Available Now

Visit software.intel.com/oneapi for more details. Some capabilities may differ per architecture and custom tuning will still be required. Other accelerators are to be supported in the future.

OPTIMIZED SOFTWARE: Powerful oneAPI Libraries

| Library name | Short name | Description |
|---|---|---|
| oneAPI DPC++ Library | oneDPL | Key algorithms and functions to speed up DPC++ kernel programming |
| oneAPI Math Kernel Library | oneMKL | Math routines including matrix algebra, fast Fourier transforms (FFT), and vector math |
| oneAPI Data Analytics Library | oneDAL | Machine learning and data analytics functions |
| oneAPI Deep Neural Network Library | oneDNN | Neural network functions for deep learning training and inference |
| oneAPI Collective Communications Library | oneCCL | Communication patterns for distributed deep learning |
| oneAPI Threading Building Blocks | oneTBB | Threading and memory management template library |
| oneAPI Video Processing Library | oneVPL | Real-time video decoding, encoding, transcoding, and processing functions |

Designed for acceleration of key domain-specific functions; pre-optimized for each target platform for maximum performance.

OPTIMIZED SOFTWARE: Intel® oneAPI Toolkits – a complete set of proven developer tools expanded from CPU to XPU

Intel® oneAPI Base Toolkit – a core set of high-performance tools for building C++ and Data Parallel C++ applications and oneAPI library-based applications (for native code developers).

Add-on toolkits:
▪ Intel® oneAPI Tools for HPC – deliver fast Fortran, OpenMP & MPI applications that scale
▪ Intel® oneAPI Tools for IoT – build efficient, reliable solutions that run at the network's edge
▪ Intel® oneAPI Rendering Toolkit – create performant, high-fidelity visualization applications

Specialized workloads:
▪ Intel® AI Analytics Toolkit – accelerate machine learning & data science pipelines with optimized DL frameworks & high-performing Python libraries (for data scientists & AI developers)
▪ Intel® Distribution of OpenVINO™ Toolkit – deploy high-performance DL inference applications from edge to cloud (for AI application, media, & vision developers)

Latest version is 2021.1

OPTIMIZED SOFTWARE: Intel® oneAPI Base Toolkit

Accelerate data-centric workloads: a core set of tools and libraries for developing high-performance applications on Intel® CPUs, GPUs, and FPGAs.

Direct programming: Intel® oneAPI DPC++/C++ Compiler, Intel® DPC++ Compatibility Tool, Intel® Distribution for Python, Intel® FPGA Add-on for oneAPI Base Toolkit.

API-based programming: Intel® oneAPI DPC++ Library (oneDPL), Intel® oneAPI Math Kernel Library (oneMKL), Intel® oneAPI Data Analytics Library (oneDAL), Intel® oneAPI Threading Building Blocks (oneTBB), Intel® oneAPI Video Processing Library (oneVPL), Intel® oneAPI Collective Communications Library (oneCCL), Intel® oneAPI Deep Neural Network Library (oneDNN), Intel® Integrated Performance Primitives (Intel® IPP).

Analysis & debug tools: Intel® VTune™ Profiler, Intel® Advisor, Intel® Distribution for GDB.

Who uses it?
▪ A broad range of developers across industries
▪ Add-on toolkit users, since this is the base for all toolkits

Top features/benefits:
▪ Data Parallel C++ compiler, library and analysis tools
▪ DPC++ Compatibility Tool helps migrate existing code written in CUDA
▪ Python distribution includes accelerated scikit-learn, NumPy, and SciPy libraries
▪ Optimized performance libraries for threading, math, data analytics, deep learning, and video/image/signal processing

Learn more: intel.com/oneAPI-BaseKit

OPTIMIZED SOFTWARE: Intel® AI Analytics Toolkit – powered by oneAPI

Accelerate end-to-end AI and data analytics pipelines with libraries optimized for Intel® architectures.

Deep learning: Intel® Optimization for TensorFlow, Intel® Optimization for PyTorch, Model Zoo for Intel® Architecture, Intel® Low Precision Optimization Tool.

Data analytics & machine learning: accelerated data frames (Intel® Distribution of Modin with the OmniSci backend); Intel® Distribution for Python with XGBoost, scikit-learn, daal4py, NumPy, SciPy, pandas, and numba.

Samples and end-to-end workloads.

Who uses it? Data scientists, AI researchers, ML and DL developers, AI application developers.
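The Intel® Distribution of Modin accelerates pandas-style data frames by partitioning work across cores, and the advertised migration is a one-line import swap. A minimal sketch of that idea, run here with stock pandas so it stays self-contained (the swap is shown in the comment):

```python
import pandas as pd
# To accelerate with the Intel® Distribution of Modin, only the import changes:
#   import modin.pandas as pd
# The DataFrame code below stays identical; Modin partitions it across cores.

df = pd.DataFrame({"passengers": [1, 2, 1, 4],
                   "fare": [7.5, 12.0, 7.5, 30.0]})
mean_fare = df.groupby("passengers")["fare"].mean()
print(mean_fare.to_dict())  # → {1: 7.5, 2: 12.0, 4: 30.0}
```

This "same API, different engine" pattern is what makes the accelerated data-frame layer a drop-in for existing pandas pipelines.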

Top features/benefits:
▪ Deep learning performance for training and inference with Intel-optimized DL frameworks and tools
▪ Drop-in acceleration for data analytics and machine learning workflows with compute-intensive Python packages

Supported hardware architectures¹: CPU, GPU. (1. Hardware support varies by individual tool; architecture support will be expanded over time. Other names and brands may be claimed as the property of others.)

Get the toolkit via the Intel installer, Docker, Apt/Yum, Conda, or the Intel® DevCloud.
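The "drop-in acceleration" claim above works by patching scikit-learn so supported estimators run on oneDAL kernels. A hedged sketch: the `patch_sklearn` entry point below is daal4py's documented mechanism, guarded here so the stock path still runs when daal4py is not installed.

```python
# Hedged sketch: daal4py (shipped in the Intel® Distribution for Python)
# can patch scikit-learn so supported estimators use oneDAL kernels.
try:
    from daal4py.sklearn import patch_sklearn
    patch_sklearn()  # subsequent sklearn imports dispatch to oneDAL where supported
except ImportError:
    pass  # stock scikit-learn still works, just without the acceleration

import numpy as np
from sklearn.cluster import KMeans

# Two tight point pairs: KMeans with k=2 should separate them.
X = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
assert labels[0] == labels[1] and labels[2] == labels[3] and labels[0] != labels[2]
print("clusters:", labels.tolist())
```

Because the patch changes only the backend, the result is numerically equivalent scikit-learn output; the speedup shows on large datasets, not on a toy example like this one.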

Learn more: software.intel.com/oneapi/ai-kit

OPTIMIZED SOFTWARE: Intel® oneAPI Toolkits – Free Availability

Run the tools locally (downloads, repositories, containers) or run the tools in the cloud (Intel® DevCloud).

Get started quickly: code samples, quick-start guides, webinars, training. software.intel.com/oneapi

OPTIMIZED SOFTWARE: AI Software Stack for Intel® XPUs

Intel offers a robust software stack to maximize performance of diverse workloads.

DL/ML tools: Open Model Zoo, Model Zoo for Intel® Architecture, Intel® Low Precision Optimization Tool, and end-to-end workloads (Census, NYTaxi, Mortgage, …), delivered through the Intel® AI Analytics Toolkit and the Intel® Distribution of OpenVINO™ Toolkit.

DL/ML middleware & frameworks: develop DL models in frameworks (TensorFlow, PyTorch); ML & analytics in Python (scikit-learn, pandas, NumPy, numba, XGBoost, Modin, SciPy, daal4py); deploy DL models with the Model Optimizer and Inference Engine.

Libraries & compiler: Intel® oneAPI Base Toolkit – DPC++/DPPY compiler, oneMKL, oneDAL, oneTBB, oneCCL, oneDNN, oneVPL; kernel selection, and writing/customizing kernels.

A full set of AI, ML and DL software solutions delivered with Intel's oneAPI ecosystem.

Intel® oneAPI Available Now on Intel® DevCloud

A development sandbox to develop, test and run your workloads across a range of Intel CPUs, GPUs, and FPGAs:
▪ Use Intel oneAPI toolkits
▪ Learn Data Parallel C++
▪ Evaluate workloads using Intel's oneAPI beta software
▪ Build heterogeneous applications
▪ Prototype your project

software.intel.com/devcloud/oneapi

No downloads | No hardware acquisition | No installation | No set-up & configuration. Get up & running in seconds!

33 Explore Intel oneAPI Toolkits in the DevCloud

OPTIMIZED SOFTWARE: Getting Started with Intel® AI Analytics Toolkit

▪ Overview: visit the Intel® AI Analytics Toolkit (AI Kit) page for more details and up-to-date product information; see the release notes.
▪ Installation: download the AI Kit from Intel, Anaconda, or any of your favorite package managers; get started quickly with the AI Kit Docker container; follow the installation guide and the getting started guide.
▪ Hands-on: code samples; build, test and remotely run workloads on the Intel® DevCloud for free – no software downloads, no configuration steps, no installations.
▪ Learning: machine learning & analytics blogs at Intel Medium; the Intel AI blog site; webinars and articles at Intel® Tech Decoded.
▪ Support: ask questions and share information with others through the Community Forum; discuss with experts at the AI Frameworks Forum.

Download Now

Thank you