AWS' Secret Weapon Is Revolutionizing Computing
by David Vellante, June 18th, 2021
Contributing Author: David Floyer

AWS is pointing the way to a revolution in system architecture. Much in the same way that AWS defined the cloud operating model last decade, we believe it is once again leading in future systems. The secret sauce underpinning these innovations is specialized designs that break the stranglehold of inefficient and bloated centralized processing architectures. We believe these moves position AWS to accommodate a diversity of workloads that span the cloud, the data center and both the near and far edge. In this Breaking Analysis we’ll dig into the moves AWS has been making, explain how it got here, why we think this is transformational for the industry, and what it means for customers, partners and AWS’ many competitors.

AWS’ Architectural Journey – The Path to Nitro & Graviton

The IaaS revolution started by AWS gave easy access to VMs that could be deployed and decommissioned on demand. Amazon used a highly customized version of Xen that allowed multiple VMs to run on one physical machine, with the hypervisor functions handled by the x86 processors. According to Werner Vogels, as much as 30% of the processing was wasted supporting hypervisor functions and managing other parts of the system, including storage and networking. These overheads led AWS to develop custom ASICs to help accelerate workloads. In 2013, AWS began shipping custom chips and partnered with Intel to announce EC2 C3 instances. But as the AWS cloud scaled, Amazon wasn’t satisfied with the performance gains and saw architectural limits down the road. That prompted AWS to start a partnership with Annapurna Labs in 2014, and it launched EC2 C4 instances in 2015. The ASIC in C4 offloaded functions for storage and networking but still relied on Intel Xeon as the control point. AWS shelled out a reported $350M to acquire Annapurna in 2015, a meager sum for the secret sauce of its future system design. The acquisition led to the modern version of Project Nitro in 2017. [Nitro offload cards were first introduced in 2013.] At this point AWS introduced C5 instances, replaced Xen with KVM and more tightly coupled the hypervisor with the ASIC. Last year, Vogels said that this milestone offloaded the remaining components, including the control plane and the rest of the I/O, and enabled nearly 100% of the processing to support customer workloads. It also enabled a bare-metal version of compute that spawned the partnership with VMware to launch VMware Cloud on AWS. Then in 2018, AWS took the next step and introduced Graviton, its custom-designed Arm-based chip. This broke the dependency on x86 and launched a new era of architecture, which now supports a wide variety of configurations for data-intensive workloads.
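To make the Graviton path concrete, the following is a minimal sketch, in Python with boto3, of what launching a Graviton2-based instance looks like. This is our own illustration rather than anything prescribed by AWS or described above; the region, instance size and AMI ID are assumptions, and a real arm64 AMI would be needed in practice.

import boto3

# Illustrative only: launch a Graviton2 (Arm-based) EC2 instance.
ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder; Graviton requires an arm64 AMI
    InstanceType="c6g.large",         # Graviton2-based compute-optimized family
    MinCount=1,
    MaxCount=1,
)
print(response["Instances"][0]["InstanceId"])

Apart from pointing at an arm64 AMI and an Arm instance family, the call is the same as an x86 launch; the migration effort discussed below sits largely in the application stack, not in the provisioning API.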
These moves set the framework for other AWS innovations, including new chips optimized for machine learning training and AI inferencing. The bottom line is that AWS has architected an approach that offloads the work currently done by the central processor. It has set the stage for the future, allowing shared memory, memory disaggregation and independent resources that can be configured to support workloads from the cloud to the edge, at much lower cost than can be achieved with general-purpose approaches. Nitro is the key to this architecture. To summarize: AWS Nitro is a set of custom hardware and software that runs on Arm-based chips spawned from Annapurna. AWS has moved the hypervisor, network and storage virtualization to dedicated hardware, freeing the CPU to run customer workloads more efficiently. The reason this is so compelling, in our view, is that AWS now has the architecture in place to compete at every level of the massive TAM comprising the public cloud, on-prem data centers and both the near and far edge.

Sets the Direction for the Entire Industry

The chart below pulls data from the ETR data set. It lays out the key players competing for the future of cloud, data center and the edge. We’ve superimposed NVIDIA and Intel; they don’t show up directly in the ETR survey, but they are clearly platform players in the mix. The data shows Net Score on the vertical axis, a measure of spending velocity, and Market Share on the horizontal axis, a measure of pervasiveness in the data set. We’re not going to dwell on the relative positions here; rather, let’s comment on the players, starting with AWS. We’ve laid out the path AWS took to get here, and we believe it is setting the direction for the future.

AWS

AWS is pushing hard on migration from x86 to its Arm-based platforms. Patrick Moorhead at the Six Five Summit spoke with David Brown, who heads EC2 at AWS. Brown talked extensively about migrating from x86 to AWS’ Arm-based Graviton2 and announced a new developer challenge to accelerate migration to Arm. The carrot Brown laid out for customers is 40% better price performance. He gave the example of a customer running 100 server instances that can do the same work with 60 instances by migrating to Graviton2 (we run the simple arithmetic in the sketch below). There’s some migration work required of customers, but the payoff is large. Generally, we bristle at the thought of migrations. The business value of a migration is a function of the benefit achieved, less the cost of the migration, which must account for business disruption, code freezes, retraining and time-to-value variables. But it seems in this case AWS is minimizing the migration pain. The benefit to customers, according to Brown, is that AWS currently offers something like 400 different EC2 instance types. As we reported earlier this year, nearly 50% of the new EC2 instances shipped last year were Arm-based, and AWS is working hard to accelerate the pace of migration away from x86 onto its own designs. Nothing could be clearer.
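A quick back-of-the-envelope check of Brown’s example, sketched in Python. The hourly rate is our own illustrative assumption, applied identically to both fleets; it is not AWS pricing data.

# Sketch of Brown's server-count example: 100 x86 instances replaced by
# 60 Graviton2 instances doing the same work, at an assumed identical rate.
x86_fleet = 100          # instances in Brown's example
graviton_fleet = 60      # instances needed for the same work
hourly_rate = 0.085      # assumed $/hr per instance, purely illustrative

x86_cost = x86_fleet * hourly_rate
graviton_cost = graviton_fleet * hourly_rate
savings = 1 - graviton_cost / x86_cost
print(f"Fleet cost reduction: {savings:.0%}")  # prints: Fleet cost reduction: 40%

Actual savings depend on per-instance pricing, which varies by family and region; the point is simply that the server-count reduction alone is enough to account for the headline 40% figure.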
Intel

Intel is finally responding in earnest to these market forces. We essentially believe Intel is taking a page out of Arm’s playbook, and we’ll dig into that a bit today. In 2015, Intel paid $16.7B for Altera, a maker of FPGAs. Also at the Six Five Summit, Navin Shenoy of Intel presented details of what Intel is calling an IPU, or Infrastructure Processing Unit. This is a departure from Intel norms, where everything is controlled by a central processing unit.

IPUs are basically smart NICs, as are DPUs; don’t get caught up in the acronym soup. As we’ve reported, this is all about offloading work, disaggregating memory and evolving SoCs (systems on chip) and SoPs (systems on package). But let this sink in a bit. Intel’s moves this past week, it seems to us, are clearly designed to create a platform that enables its partners to build a Nitro-like offload capability. And the basis of that platform is a $16.7B acquisition. Compare that to AWS’ $350M tuck-in of Annapurna. That’s incredible. Shenoy said in his presentation, “We’ve already deployed IPUs using FPGAs in very high volume at Microsoft Azure and we’ve recently announced partnerships with Baidu, JD Cloud and VMware.” Let’s look at VMware in particular.

VMware

VMware is the other really prominent platform player in this race. In 2020, VMware announced Project Monterey, a Nitro-like architecture that it claims is not reliant on any specific FPGA or SoC. VMware is partnering with, and intends to accommodate, new technologies including Intel’s FPGAs, Nvidia’s Arm-based BlueField NICs and Pensando’s smart NICs. It is also partnering with Dell, HPE and Lenovo to drive end-to-end integration across those companies’ respective solutions. So VMware is firmly in the mix. However, these are early days and Monterey is a project, not a product. VMware likely chose to work with Intel for a variety of reasons, including the fact that most software running on VMware has been built for x86. As well, Pat Gelsinger was leading VMware at the time and probably saw the future pretty clearly, both the company’s and his own. Despite the Intel connection, the architectural design of Monterey appears to allow VMware to incorporate innovations from other suppliers, including AMD and Arm-based platforms like BlueField. The bottom line is that VMware has a project that moves it toward a Nitro-like offering and appears to be ahead of the non-cloud competition with respect to this trend. But in our view, being the Switzerland of smart NICs is only a first step toward having full control over the architecture, as AWS has with Nitro. Specifically, we refer to VMware possibly designing an underlying solution, optimized for VMware, that separates the compute completely from other components. Perhaps this is the intent, but currently the details are sketchy. The next major step would be to design a custom chip like AWS Graviton.