Mellanox In-Network Computing for AI and the Development with NVIDIA (SHARP – NCCL) Qingchun Song, Mellanox July 2, 2019

Total Page:16

File Type:pdf, Size:1020Kb

Mellanox In-Network Computing for AI and the Development with NVIDIA (SHARP – NCCL) Qingchun Song, Mellanox July 2, 2019 Mellanox In-Network Computing For AI and The Development With NVIDIA (SHARP – NCCL) Qingchun Song, Mellanox July 2, 2019 © 2019 Mellanox Technologies | Confidential 1 Data Processing Revolution – Data Centric Compute-Centric Data-Centric Von Neumann DataFlow Machine Machine © 2019 Mellanox Technologies | Confidential 2 CPU-Centric HPC/AI Center Everything CPU Network Storage © 2019 Mellanox Technologies | Confidential 3 Data-Centric HPC/AI Center Workload Workload Workload Communication Framework (MPI) CPU Functions Network Storage Functions Functions In-CPU Computing In-Network Computing In-Storage Computing © 2019 Mellanox Technologies | Confidential 4 In-Network Computing Architecture CPU-Centric (Onload) Data-Centric (Offload) CPU GPU CPU GPU GPU CPU GPU CPU CPU GPU CPU GPU GPU CPU GPU CPU Communications Latencies Communications Latencies of 30-40us of 3-4us © 2019 Mellanox Technologies | Confidential 5 In-Network Computing to Enable Data-Centric Data Centers CPU GPU Scalable Hierarchical Aggregation and GPU Reduction Protocol CPU CPU GPU GPUDirect Over RDMA NVMe Fabrics GPU CPU © 2019 Mellanox Technologies | Confidential 6 In-Network Computing Connects the World’s Fastest HPC and AI Supercomputer • Summit CORAL System, World’s Fastest HPC / AI System • Nvidia V100 GPU + InfiniBand HCA + In-Network Computing Fabric © 2019 Mellanox Technologies | Confidential 7 GPUDirect RDMA Technology and Advantages © 2019 Mellanox Technologies | Confidential 8 Scalable Hierarchical Aggregation And Reduction Protocol (SHARP) © 2019 Mellanox Technologies | Confidential 9 Accelerating HPC and AI Applications Accelerating HPC Applications ▪ Significantly reduce MPI collective runtime ▪ Increase CPU availability and efficiency ▪ Enable communication and computation overlap Enabling Artificial Intelligence Solutions to Perform Critical and Timely Decision Making ▪ Accelerating distributed machine learning © 2019 Mellanox Technologies | Confidential 10 AllReduce Example – Trees ▪ Many2One and One2Many traffic patterns – possible network congestion ▪ Probably not a good solution for large data ▪ Large scale requires higher tree / larger radix ▪ Result distribution – over the tree / MC Switch EndNode Stage1 Stage2 © 2019 Mellanox Technologies | Confidential 11 AllReduce (Example) - Recursive Doubling ▪ The data is recursively divided, processed by CPUs and distributed ▪ The rank’s CPUs are occupied performing the reduce algorithm ▪ The data is sent at least 2x times, consumes at least twice the BW Rank 1 Rank 2 Rank 3 Rank 4 Step 1 ½ Data Calculation phase Step 2 ¼ Data Step 3 ¼ Data Result sending phase Step 4 ½ Data © 2019 Mellanox Technologies | Confidential 12 Scalable Hierarchical Aggregation Protocol Reliable Scalable General Purpose Primitive, Applicable to Multiple Use-cases ▪ In-network Tree based aggregation mechanism ▪ Large number of groups ▪ Multiple simultaneous outstanding operations ▪ Streaming aggregation Accelerating HPC applications SHArP Tree Root ▪ Scalable High Performance Collective Offload ▪ Barrier, Reduce, All-Reduce, Broadcast ▪ Sum, Min, Max, Min-loc, max-loc, OR, XOR, AND ▪ Integer and Floating-Point, 16 / 32 / 64 bit SHArP Tree Aggregation Node ▪ Up to 1KB payload size (in Quantum) (Process running on HCA) ▪ Significantly reduce MPI collective runtime SHArP Tree Endnode (Process running on HCA) ▪ Increase CPU availability and efficiency ▪ Enable communication and computation overlap Accelerating Machine Learning applications SHArP Tree ▪ Proven the many-to-one Traffic Pattern ▪ CUDA , GPUDirect RDMA © 2019 Mellanox Technologies | Confidential 13 Scalable Hierarchical Aggregation Protocol ▪ SHARP Tree is a Logical Construct Switch/Router ▪ Nodes in the SHArP Tree are IB Endnodes HCA ▪ Logical tree defined on top of the physical underlying fabric ▪ SHArP Tree Links are implemented on top of the IB transport (Reliable Connection) ▪ Expected to follow the physical topology for performance but not required SHArP Tree Root Physical Topology ▪ SHARP Operations are Executed by a SHARP Tree ▪ Multiple SHArP Trees are Supported SHArP Tree Aggregation Node ▪ Each SHArP Tree can handle Multiple Outstanding (Process running on HCA) SHArP Operations SHArP Tree Endnode (Process running on HCA) ▪ Within a SHArP Tree, each Operation is Uniquely Identified by a SHArP-Tuple ▪ GroupID ▪ SequenceNumber One SHArP Tree © 2019 Mellanox Technologies | Confidential 14 SHARP Principles of Operation - Request Aggregation Request SHArP Tree Root © 2019 Mellanox Technologies | Confidential 15 SHARP Principles of Operation – Response Aggregation Response SHArP Tree Root © 2019 Mellanox Technologies | Confidential 16 NCCL Ring ▪ Simple ▪ Linear Latency ▪ Support in NCCL-2.3 & previous version ▪ Multiple rings Switch Switch © 2019 Mellanox Technologies | Confidential 17 NCCL Tree ▪ Support added in NCCL-2.4 ▪ Keep Intra-node chain ▪ Node leaders participate in tree ▪ Binary double tree ▪ Multiple rings -> Multiple trees Network Fabric NIC NIC NIC © 2019 Mellanox Technologies | Confidential 18 NCCL SHARP Network Fabric NIC NIC NIC © 2019 Mellanox Technologies | Confidential 19 NCCL SHARP ▪ Collective network Plugin NCCL Allreduce Bandwidth - 1 Ring 25 ▪ Replace Inter-node tree with SHARP Tree SHARP Ring Tree ▪ Keeps Intra-node ring 20 ▪ Aggregation in network switch ▪ Streaming from GPU memory with GPU Direct RDMA 15 ▪ 2x BW compared to NCCL-TREE 10 Bandwidth ( GB/s) Bandwidth 5 0 8 16 32 64 128 256 512 2048 Message Size(MB) 1024 SHARP Enables 2X Higher Data Throughput for NCCL © 2019 Mellanox Technologies | Confidential 20 SHARP AllReduce Performance Advantages (128 Nodes) SHARP enables 75% Reduction in Latency Scalable Hierarchical Aggregation and Reduction Protocol Providing Scalable Flat Latency © 2019 Mellanox Technologies | Confidential 21 SHARP AllReduce Performance Advantages 1500 Nodes, 60K MPI Ranks, Dragonfly+ Topology Scalable Hierarchical SHARP Enables Highest Performance Aggregation and Reduction Protocol © 2019 Mellanox Technologies | Confidential 22 SHARP Performance – Application (OSU) Network-Based Computing Laboratory http://nowlab.cse.ohio-state.edu/ The MVAPICH2 Project http://mvapich.cse.ohio-state.edu/ Source: Prof. DK Panda, Ohio State University © 2019 Mellanox Technologies | Confidential 23 SHARP Performance Advantage for AI ▪ SHARP provides 16% Performance Increase for deep learning, initial results ▪ TensorFlow with Horovod running ResNet50 benchmark, HDR InfiniBand (ConnectX-6, Quantum) 16% 11% P100 NVIDIA GPUs, RH 7.5, Mellanox OFED 4.4, HPC-X v2.3, TensorFlow v1.11, Horovod 0.15.0 © 2019 Mellanox Technologies | Confidential 24 NCCL-SHARP Performance – DL Training ResNet50 25000 20456 20000 18755 15000 Images/sec 10000 5000 0 NCCL NCCL-SHARP System Configuration: (4) HPE Apollo 6500 systems configured with (8) NVIDIA Tesla V100 SXM2 16GB, (2) HPE DL360 Gen10 Intel Xeon-Gold 6134 (3.2 GHz/8-core/130 W) CPUs, (24) DDR4-2666 CAS-19-19-19 Registered Memory Modules HPE 1.6 TB NVMe SFF (2.5") SSD, ConnectX-6 HCA, IB Quantum Switch (EDR speed), Ubuntu 16.04 © 2019 Mellanox Technologies | Confidential 25 Accelerating All Levels of HPC / AI Frameworks Application ▪ Data Analysis ▪ Real Time ▪ Deep Learning Communication ▪ Mellanox SHARP In-Network Computing ▪ MPI Tag Matching ▪ MPI Rendezvous ▪ Software Defined Virtual Devices Network ▪ Network Transport Offload GPUDirect ▪ RDMA and GPU-Direct RDMA ▪ SHIELD (Self-Healing Network) RDMA ▪ Enhanced Adaptive Routing and Congestion Control Connectivity ▪ Multi-Host Technology ▪ Socket-Direct Technology ▪ Enhanced Topologies © 2019 Mellanox Technologies | Confidential 26 Thank You © 2019 Mellanox Technologies | Confidential 27.
Recommended publications
  • Mellanox Technologies Announces the Appointment of Dave Sheffler As Vice President of Worldwide Sales
    MELLANOX TECHNOLOGIES ANNOUNCES THE APPOINTMENT OF DAVE SHEFFLER AS VICE PRESIDENT OF WORLDWIDE SALES Former VP of AMD joins InfiniBand Internet Infrastructure Startup SANTA CLARA, CA – January 17, 2000 – Mellanox Technologies, a fabless semiconductor startup company developing InfiniBand semiconductors for the Internet infrastructure, today announced the appointment of Dave Sheffler as its Vice President of Worldwide Sales, reporting to Eyal Waldman, Chief Executive Officer and Chairman of the Board. “Dave Sheffler possesses the leadership, business capabilities, and experience which complement the formidable engineering, marketing, and operations talent that Mellanox has assembled. He combines the highest standards of ethics and achievements, with an outstanding record of grow- ing sales in highly competitive and technical markets. The addition of Dave to the team brings an additional experienced business perspective and will enable Mellanox to develop its sales organi- zation and provide its customers the highest level of technical and logistical support,” said Eyal Waldman. Prior to joining Mellanox Dave served as Vice President of Sales and Marketing for the Americas for Advanced Micro Devices (NYSE: AMD). Previously, Mr. Sheffler was the VP of Worldwide Sales for Nexgen Inc., where he was part of the senior management team that guided the company’s growth, leading to a successful IPO and eventual sale to AMD in January 1996. Mellanox Technologies Inc. 2900 Stender Way, Santa Clara, CA 95054 Tel: 408-970-3400 Fax: 408-970-3403 www.mellanox.com 1 Mellanox Technologies Inc MELLANOX TECHNOLOGIES ANNOUNCES THE APPOINTMENT OF DAVE SHEFFLER AS VICE PRESIDENT OF WORLDWIDE Dave’s track record demonstrates that he will be able to build a sales organization of the highest caliber.
    [Show full text]
  • IBM Cloud Web
    IBM’s iDataPlex with Mellanox® ConnectX® InfiniBand Adapters Dual-Port ConnectX 20 and 40Gb/s InfiniBand PCIe Adapters Data center cost structures are shifting, from equipment-focused to space-, power- and personnel-focused. Total cost of owner- ship and green initiatives are major spending drivers as data centers are being refreshed. Topping off those challenges is the rapid evolution of information technology becoming a competitive advantage through business process management and deployment of service oriented architectures. I/O technology plays a key role in meeting many of the spending drivers – provisioning capacity for future growth, efficient scaling of compute, LAN and SAN capacity, reduction of space and power in the data center helping reduce TCO and doing so while enhancing data center agility. IBM iDataPlex and ConnectX InfiniBand Adapter Cards IBM’s iDataPlex with Mellanox ConnectX InfiniBand adapters High-Performance for the Flexible Data Center offers a solution that is eco-friendly, flexible, responsive, and Scientists, engineers, and analysts eager to solve challenging positioned to address the evolving requirements of Internet-Scale problems in engineering, financial markets, and the life and earth and High-Performance sciences rely on high-performance systems. In addition, with the use of multi-core CPUs, virtualized infrastructures and networked Unprecedented Performance, Scalability, and storage, users are demanding increased efficiency of the entire Efficiency for Cloud Computing IT compute and storage infrastructure.
    [Show full text]
  • MLNX OFED Documentation Rev 5.0-2.1.8.0
    MLNX_OFED Documentation Rev 5.0-2.1.8.0 Exported on May/21/2020 06:13 AM https://docs.mellanox.com/x/JLV-AQ Notice This document is provided for information purposes only and shall not be regarded as a warranty of a certain functionality, condition, or quality of a product. NVIDIA Corporation (“NVIDIA”) makes no representations or warranties, expressed or implied, as to the accuracy or completeness of the information contained in this document and assumes no responsibility for any errors contained herein. NVIDIA shall have no liability for the consequences or use of such information or for any infringement of patents or other rights of third parties that may result from its use. This document is not a commitment to develop, release, or deliver any Material (defined below), code, or functionality. NVIDIA reserves the right to make corrections, modifications, enhancements, improvements, and any other changes to this document, at any time without notice. Customer should obtain the latest relevant information before placing orders and should verify that such information is current and complete. NVIDIA products are sold subject to the NVIDIA standard terms and conditions of sale supplied at the time of order acknowledgement, unless otherwise agreed in an individual sales agreement signed by authorized representatives of NVIDIA and customer (“Terms of Sale”). NVIDIA hereby expressly objects to applying any customer general terms and conditions with regards to the purchase of the NVIDIA product referenced in this document. No contractual obligations are formed either directly or indirectly by this document. NVIDIA products are not designed, authorized, or warranted to be suitable for use in medical, military, aircraft, space, or life support equipment, nor in applications where failure or malfunction of the NVIDIA product can reasonably be expected to result in personal injury, death, or property or environmental damage.
    [Show full text]
  • Meridian Contrarian Fund Holdings As of 12/31/2016
    Meridian Contrarian Fund Holdings as of 12/31/2016 Ticker Security Name % Allocation NVDA NVIDIA CORP 5.8% MSFT MICROSOFT CORP 4.1% CFG CITIZENS FINANCIAL GROUP INC 3.8% ALEX ALEXANDER & BALDWIN INC 3.5% EOG EOG RESOURCES INC 3.3% CACI CACI INTERNATIONAL INC 3.3% USB US BANCORP 3.1% XYL XYLEM INC/NY 2.7% TOT TOTAL SA 2.6% VRNT VERINT SYSTEMS INC 2.5% CELG CELGENE CORP 2.4% BOH BANK OF HAWAII CORP 2.2% GIL GILDAN ACTIVEWEAR INC 2.1% LVS LAS VEGAS SANDS CORP 2.0% MLNX MELLANOX TECHNOLOGIES LTD 2.0% AAPL APPLE INC 1.9% ENS ENERSYS 1.8% ZBRA ZEBRA TECHNOLOGIES CORP 1.8% MU MICRON TECHNOLOGY INC 1.7% RYN RAYONIER INC 1.7% KLXI KLX INC 1.7% TRMB TRIMBLE INC 1.6% IRDM IRIDIUM COMMUNICATIONS INC 1.5% QCOM QUALCOMM INC 1.5% Investors should consider the investment objective and policies, risk considerations, charges and ongoing expense of an investment carefully before investing. The prospectus and summary prospectus contains this and other information relevant to an investment in the Fund. Please read the prospectus or summary prospectus carefully before you invest or send money. To obtain a prospectus, please contact your investment representative or access the Meridian Funds’ website at www.meridianfund.com. ALPS Distributors, Inc., a member FINRA is the distributor of the Meridian Mutual Funds, advised by Arrowpoint Asset Management, LLC. ALPS, Meridian and Arrowpoint are unaffiliated. Arrowpoint Partners is a trade name for Arrowpoint Asset Management, LLC., a registered investment adviser. The portfolio holdings for the Meridian Funds are published on the Funds' website on a calendar quarter basis, no earlier than 30 days after the end of the quarter.
    [Show full text]
  • Storage for HPC and AI
    Storage for HPC and AI Make breakthroughs faster with artificial intelligence powered by high performance computing systems and storage BETTER PERFORMANCE Unlock the value of data with high performance storage built for HPC. The flood of information generated by sensors, satellites, simulations, high-throughput devices and medical imaging is pushing data repositories to sizes that were once inconceivable. Data analytics, high performance computing (HPC) and artificial intelligence (AI) are technologies designed to unlock the value of all that data, driving the demand for powerful HPC systems and the storage to support them. Reliable, efficient and easy-to-adopt HPC storage is the key to enabling today’s powerful HPC systems to deliver transformative decision making, business growth and operational efficiencies in the data-driven age. Dell Technologies | Ready Solutions for HPC 2 THE INTELLIGENCE BEHIND DATA INSIGHTS Articial Machine Deep intelligence learning learning AI is a complex set of technologies underpinned by machine learning (ML) and deep learning (DL) algorithms, typically run on powerful HPC systems and storage. Together, they enable organizations to gain deeper insights from data. AI is an umbrella term that describes a machine’s ability to act autonomously and/or interact in a human-like way. ML refers to the ability of a machine to perform a programmed function with the data The capabilities of AI, ML and DL can unleash predictive and prescriptive analytics on a given to it, getting progressively better at the task over time as it analyzes more data massive scale. Like lenses, AI, ML and DL can be used in combination or alone — depending and receives feedback from users or engineers.
    [Show full text]
  • Mellanox for Big Data
    Mellanox for Big Data March 2013 Company Overview Ticker: MLNX . Leading provider of high-throughput, low-latency server and storage interconnect • FDR 56Gb/s InfiniBand and 10/40/56GbE • Reduces application wait-time for data • Dramatically increases ROI on data center infrastructure . Company headquarters: • Yokneam, Israel; Sunnyvale, California • ~1,160 employees* worldwide . Solid financial position • Record revenue in FY12; $500.8M • Q1’13 guidance ~$78M to $83M • Cash + investments @ 12/31/12 = $426.3M * As of December 2012 © 2013 Mellanox Technologies 2 Leading Supplier of End-to-End Interconnect Solutions Storage Server / Compute Switch / Gateway Front / Back-End Virtual Protocol Interconnect Virtual Protocol Interconnect 56G IB & FCoIB 56G InfiniBand 10/40/56GbE & FCoE 10/40/56GbE Fibre Channel Comprehensive End-to-End InfiniBand and Ethernet Portfolio ICs Adapter Cards Switches/Gateways Host/Fabric Software Cables © 2013 Mellanox Technologies 3 RDMA | Efficiency, Latency, & Application Performance Without RDMA With RDMA and Offload ~53% CPU ~88% CPU Utilization Utilization User SpaceUser SpaceUser ~47% CPU ~12% CPU Overhead/Idle Overhead/Idle System System Space System System Space © 2013 Mellanox Technologies 4 Big Data Solutions © 2013 Mellanox Technologies 5 Big Data Needs Big Pipes . Capabilities are Determined by the weakest component in the system . Different approaches in Big Data marketplace – Same needs • Better Throughput and Latency • Scalable, Faster data Movement Big Data Applications Require High Bandwidth and Low Latency Interconnect * Data Source: Intersect360 Research, 2012, IT and Data scientists survey © 2013 Mellanox Technologies 6 Unstructured Data Accelerator - UDA . Plug-in architecture • Open-source Hive Pig - https://code.google.com/p/uda-plugin/ MapMap ReduceReduce HBase .
    [Show full text]
  • Dual-Port Adapter with Virtual Protocol Interconnect for Dell Poweredge
    ADAPTER CARDS ConnectX®-2 VPI Dual-Port adapter with Virtual Protocol Interconnect ® for Dell PowerEdge C6100-series Rack Servers ConnectX-2 adapter cards with Virtual Protocol Interconnect (VPI) supporting InfiniBand and Ethernet connectivity provide the highest performing and most flexible interconnect solution for Enterprise Data Centers, High-Performance Computing, and Embedded environments. Clustered data bases, parallel processing, transactional services and high-performance embedded I/O applications will achieve significant performance improvements resulting in reduced completion time and lower cost per operation. ConnectX-2 with VPI also simplifies network deployment by consolidating cables and enhancing performance in virtualized server environments. Virtual Protocol Interconnect BENEFITS VPI-enabled adapters make it possible for Dell PowerEdge C6100-series Rack Servers to operate – One adapter for InfiniBand, 10 Gigabit over any converged network leveraging a consolidated software stack. With auto-sense capability, Ethernet or Data Center Bridging fabrics each ConnectX-2 port can identify and operate on InfiniBand, Ethernet, or Data Center Bridging – World-class cluster performance (DCB) fabrics. FlexBoot™ provides additional flexibility by enabling servers to boot from remote – High-performance networking and storage InfiniBand or LAN storage targets. ConnectX-2 with VPI and FlexBoot simplifies I/O system design access and makes it easier for IT managers to deploy infrastructure that meets the challenges of a – Guaranteed bandwidth and low-latency dynamic data center. services – Reliable transport World-Class Performance – I/O consolidation InfiniBand —ConnectX-2 delivers low latency, high bandwidth, and computing efficiency for – Virtualization acceleration performance-driven server and storage clustering applications. Efficient computing is achieved by – Scales to tens-of-thousands of nodes offloading from the CPU routine activities which allows more processor power for the application.
    [Show full text]
  • COPA-DATA Case Study: Built Scalable and Efficient Data Center with Microsoft Hyperconverged Platform and Mellanox Networking
    CASE STUDY COPA-DATA Case Study: Built Scalable and Efficient Data Center with Microsoft Hyperconverged Platform and Mellanox Networking More and more manufacturing businesses operate around the globe. To manage their local equipment and operations effectively, these companies resort to industrial automation to control, monitor and streamline their geographically diverse HIGHLIGHTS manufacturing operations. COPA-DATA, headquartered in Austria, is one of the technological leaders for such automation solutions. COPA-DATA develops its software, zenon, for ergonomic and highly dynamic process solutions. zenon gives users access to • Hyperconverged platform built on Microsoft all data relating to individual machines, assembly lines, or a company’s entire production Windows Server 2016 and Storage Spaces Direct site from a single system. Additional services such as predictive analysis, machine with RoCE learning, cross-site reporting, remote maintenance, and control can be fully cloud-based • Mellanox Ethernet Storage Fabric of 10/25/100GbE using Microsoft Azure or implemented in hybrid scenarios – opening the door to service- for compute and storage oriented business models. Recognized by its innovations and software solutions, COPA- DATA won the 2017 Microsoft Internet of Things (IoT) Partner of the Year award. • Cost efficient and scalable for future expansion COPA-DATA has over 135,000 installed systems worldwide and serves companies in • Over 90% reduction in backup time the Food & Beverage, Energy & Infrastructure, Automotive and Pharmaceutical sectors. • Up to 820K IOPS (read only) and up to 500K IOPS To better support its customers with local contact persons and local technical support, (70% read, 30% write), tested with VMFleet COPA-DATA has established offices in Europe, North America and Asia, and collaborates with partners and distributors throughout the world.
    [Show full text]
  • NVIDIA Corp NVDA (XNAS)
    Morningstar Equity Analyst Report | Report as of 14 Sep 2020 04:02, UTC | Page 1 of 14 NVIDIA Corp NVDA (XNAS) Morningstar Rating Last Price Fair Value Estimate Price/Fair Value Trailing Dividend Yield % Forward Dividend Yield % Market Cap (Bil) Industry Stewardship Q 486.58 USD 250.00 USD 1.95 0.13 0.13 300.22 Semiconductors Exemplary 11 Sep 2020 11 Sep 2020 20 Aug 2020 11 Sep 2020 11 Sep 2020 11 Sep 2020 21:37, UTC 01:27, UTC Morningstar Pillars Analyst Quantitative Important Disclosure: Economic Moat Narrow Wide The conduct of Morningstar’s analysts is governed by Code of Ethics/Code of Conduct Policy, Personal Security Trading Policy (or an equivalent of), Valuation Q Overvalued and Investment Research Policy. For information regarding conflicts of interest, please visit http://global.morningstar.com/equitydisclosures Uncertainty Very High High Financial Health — Moderate Nvidia to Buy ARM in $40 Billion Deal with Eyes Set on Data Center Source: Morningstar Equity Research Dominance; Maintain FVE Quantitative Valuation NVDA Business Strategy and Outlook could limit Nvidia’s future growth. a USA Abhinav Davuluri, CFA, Analyst, 19 August 2020 Undervalued Fairly Valued Overvalued Nvidia is the leading designer of graphics processing units Analyst Note that enhance the visual experience on computing Abhinav Davuluri, CFA, Analyst, 13 September 2020 Current 5-Yr Avg Sector Country Price/Quant Fair Value 1.67 1.43 0.77 0.83 platforms. The firm's chips are used in a variety of end On Sept. 13, Nvidia announced it would acquire ARM from Price/Earnings 89.3 37.0 21.4 20.1 markets, including high-end PCs for gaming, data centers, the SoftBank Group in a transaction valued at $40 billion.
    [Show full text]
  • Mellanox Technologies Announces Complete PCI-Expresstm HCA Product Line at Intel® Developer Forum
    Mellanox Technologies Announces Complete PCI-ExpressTM HCA Product Line at Intel® Developer Forum SANTA CLARA, CALIFORNIA, and YOKNEAM, ISRAEL - Sep 15, 2003 - Mellanox® Tech- nologies Ltd., the leader in InfiniBandTM semiconductors, today announced enhanced support for the Mellanox family of HCA, Host Channel Adapter, devices by offering a full product line of PCI-ExpressTM InfiniBand HCA Cards. The InfiniHost Express product family incorporates Mell- anox 3rd generation architectural enhancements for maximum capability, and includes an 8X PCI Express interconnect, designed to connect to Intel chipsets, enabling up to 40Gb/sec of total data bandwidth from the InfiniBand fabric all the way into the server’s core logic. Both, standard height and low profile cards will be offered. "Our 3rd generation HCA will continue to advance InfiniBand capabilities," said Michael Kagan, vice president of architecture and software for Mellanox. "New enhancements in InfiniHost Express will double the data bandwidth and deliver the data with lower latencies." Mellanox has worked with Sandia Labs, Intel and other leading enterprise vendors to demonstrate how clusters can improve visualization performance by using industry standard servers and the open standard InfiniBand architecture interconnect. This demonstration at IDF features visualiza- tion software running on an InfiniBand cluster of Intel XeonTM and Itanium® 2 based servers. Future deployments of such systems will incorporate InfiniHost PCI-Express based HCA cards to further improve latency, bandwidth,
    [Show full text]
  • Mellanox Ethernet Adapter Brochure
    Ethernet World-Class Performance Ethernet SmartNICs Product Line Ethernet Network Adapters with Advanced Hardware Accelerators, Unequaled RoCE Capabilities and Enhanced Security, Enabling Data Center Efficiency and Scalability Ethernet Mellanox® Ethernet Network Interface Cards (NICs) enable the highest data center performance for hyperscale, public and private clouds, storage, machine learning, artificial intelligence, big data and telco platforms World-Class Performance and Scale Mellanox ConnectX Ethernet SmartNICs offer best-in-class network performance serving low-latency, high throughput applications at 10, 25, 40, 50, 100 and up to 200 Gb/s Ethernet speeds. Mellanox Ethernet adapters deliver industry-leading connectivity for performance-driven server and storage applications. These ConnectX adapter cards enable high bandwidth coupled with ultra-low latency for diverse applications and systems, resulting in faster access and real-time responses. ConnectX adapter cards provide best-in-class performance and efficient computing through advanced acceleration and offload capabilities, including RDMA over Converged Ethernet (RoCE), NVMe-over-Fabrics (NVMe-oF), virtual switch offloads, GPU communication acceleration, hardware acceleration for virtualization, encryption/decryption hardware acceleration, and the connectivity of multiple compute or storage hosts to a single interconnect adapter. ConnectX network acceleration technology frees the CPU resources for compute tasks, allowing for higher scalability and efficiency within. ConnectX adapter cards are part of Mellanox’s complete end-to-end Ethernet networking portfolio for data centers which also includes Open Ethernet switches, application acceleration packages, and cabling to deliver a unique price-performance value proposition for network and storage solutions. Using Mellanox, IT managers can be assured of the highest performance, reliability and most efficient network fabric at the lowest cost for the best return on investment.
    [Show full text]
  • Fiscal Year 2020 Form 10-K
    UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549 Form 10-K (Mark One) ☒ ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the fiscal year ended February 1, 2020 or ☐ TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the transition period from to Commission file number 0-30877 Marvell Technology Group Ltd. (Exact name of registrant as specified in its charter) Bermuda 77-0481679 (State or other jurisdiction of (I.R.S. Employer incorporation or organization) Identification No.) Canon’s Court, 22 Victoria Street, Hamilton HM 12, Bermuda (Address of principal executive offices) (441) 296-6395 (Registrant’s telephone number, including area code) Securities registered pursuant to Section 12(b) of the Act: Title of each class Trading Symbol Name of each exchange on which registered Common shares, $0.002 par value per share MRVL The Nasdaq Stock Market LLC Securities registered pursuant to Section 12(g) of the Act: None Indicate by check mark if the registrant is a well-known seasoned issuer, as defined in Rule 405 of the Securities Act. Yes ☐ No ☒ Indicate by check mark if the registrant is not required to file reports pursuant to Section 13 or Section 15(d) of the Act. Yes ☐ No ☒ Indicate by check mark whether the registrant (1) has filed all reports required to be filed by Section 13 or 15(d) of the Securities Exchange Act of 1934 during the preceding 12 months (or for such shorter period that the registrant was required to file such reports), and (2) has been subject to such filing requirements for the past 90 days.
    [Show full text]