“Zeppelin”: An SoC for Multichip Architectures


IEEE JOURNAL OF SOLID-STATE CIRCUITS

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

Thomas Burd, Senior Member, IEEE, Noah Beck, Sean White, Milam Paraschou, Member, IEEE, Nathan Kalyanasundharam, Gregg Donley, Alan Smith, Member, IEEE, Larry Hewitt, and Samuel Naffziger, Fellow, IEEE

Abstract: AMD’s “Zeppelin” system-on-a-chip (SoC) combines eight high-performance “Zen” cores with a shared 16-MB L3 Cache, along with six high-speed I/O links and two DDR4 channels, using the infinity fabric (IF) to provide a high-speed, low-latency, and power-efficient connectivity solution. This solution allows the same SoC silicon die to be designed into three separate packages and provides highly competitive solutions in three different market segments. IF is critical to this high-leverage design re-use, utilizing a coherent, scalable data fabric (SDF) for on-die communication, as well as inter-die links, extending up to eight dies across two packages. To support this scalability, an energy-efficient, custom physical-layer link was designed for in-package, high-speed communication between the dies. Using an additional scalable control fabric (SCF), a hierarchical power and system management unit (SMU) was used to monitor and manage a distributed set of dies to ensure the products stay within infrastructure limits. It was essential that the floor plan of the SoC was co-designed with the package substrate. The SoC used a 14-nm FinFET process technology and contains 4.8B transistors on a 213 mm² die.

Index Terms: 14 nm, high-frequency design, microprocessors, multi-chip module (MCM), scalable fabric, system-on-a-chip (SoC) architecture.

Manuscript received May 18, 2018; revised August 4, 2018 and September 17, 2018; accepted September 18, 2018. This paper was approved by Guest Editor Masato Motomura. (Corresponding author: Thomas Burd.)
T. Burd, N. Kalyanasundharam, and G. Donley are with Advanced Micro Devices, Santa Clara, CA 95054 USA.
N. Beck and S. White are with Advanced Micro Devices, Boxborough, MA 01719 USA.
M. Paraschou and S. Naffziger are with Advanced Micro Devices, Fort Collins, CO 80528 USA.
A. Smith and L. Hewitt are with Advanced Micro Devices, Austin, TX 78735 USA.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSSC.2018.2873584
0018-9200 © 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

AMD’s next-generation system-on-a-chip (SoC), code-named “Zeppelin,” was designed with the flexibility to allow the single silicon design to target products in a multitude of markets, including server, mainstream desktop PCs, and high-end desktop PCs [1]. The Zeppelin SoC was designed in Global Foundries’ 14-nm LPP FinFET process technology, utilizing a back-end stack of 13 copper interconnect layers with a top-level aluminum redistribution layer [2], [3].

The highest priority design goal was to provide an SoC that was architected with leadership server capabilities but that also had the scalability and configurability to support additional complementary markets. These include:

1) Client Market: Single-chip AM4 package with two DDR4 channels and 24 PCIe Gen3 lanes [4], platform compatible with the previous generation of AMD SoCs.
2) High-End Desktop Market: Two-chip sTR4 package with four DDR4 channels and 64 PCIe Gen3 lanes.
3) Server Market: Four-chip SP3 package with eight DDR4 channels and 128 PCIe Gen3 lanes for one-socket systems, scalable with coherent interconnect to support two-socket systems.

The critical enabler for this flexibility is the infinity fabric (IF), comprised of two key components, or planes. The first is the scalable data fabric (SDF) that provides coherent data transport between cores, memory controllers, and IO, and can do so within the same die, across dies within the same package, or between packages in a two-socket system. The second is the scalable control fabric (SCF) that provides a common command and control mechanism for system configurability and management. Similar to the SDF, the SCF connects all the components within the SoC, among dies within the same package, and between packages in a two-socket system.

A flexible, yet power-efficient physical implementation of the IF was a key requirement for competitive products, which drove a customized, on-package, high-speed Serializer-Deserializer (SerDes) link interface. While not as power efficient as other on-package interconnect solutions, such as embedded multi-die interconnect bridge (EMIB), at 2 pJ/bit versus 1.2 pJ/bit [5], the IF solution provides much greater product design flexibility. EMIB requires dies to be physically adjacent, while IF utilizes package routing layers to support much more complex connection topologies, but with a custom SerDes solution to minimize transmission energy as compared to existing off-package SerDes solutions.

II. ARCHITECTURE

A. Functional Overview

The SoC, as shown in Fig. 1, consists of two core complexes (CCXs), in which each complex contains four high-performance “Zen” x86 cores providing two-way simultaneous multi-threading (SMT), each with a 512-kB L2 Cache, and a shared 8-MB L3 Cache [3]. There are two DDR4 channels with ECC supporting two DIMMs per channel at speeds up to 2666 MT/s. There are two combo physical-layer links, each of which can be configured as a 16-lane PCIe Gen3 interface, an eight-lane SATA interface, or a 16-lane inter-socket SerDes interface. An additional four high-speed SerDes interfaces provide die-to-die links. There is an IO complex that provides an integrated southbridge, including PCIe and SATA controllers, four USB 3.1 Gen 1 ports, as well as SPI, LPC, UART, I2C, and RTC interfaces. All of these components are connected with the IF, providing coherent data transport between all the IPs on the SoC.

Fig. 1. “Zeppelin” SoC architecture.

B. Core Complex and Cache Hierarchy

The CCX, detailed in [3] and [6], can fetch and decode up to four instructions per cycle (IPC), and dispatch up to six micro-operations per cycle, utilizing eight parallel execution units, providing 52% higher IPC performance than the prior-generation x86 processor core [7]. As shown in Fig. 2, within the Zen core there is a 64 kB, four-way set-associative instruction cache with 32 B/cycle of fetch bandwidth, and a 32 kB, eight-way set-associative data cache with 48 B/cycle of load/store bandwidth. The private, 512 kB L2 cache supports 64 B/cycle of bandwidth to the L1 caches with 12-cycle latency. The fast, shared L3 cache supports 32 B/cycle of bandwidth to the L2 caches with a 35-cycle latency. The L3 cache is filled from L2 victims for all four cores of the CCX, and L2 tags are duplicated in the L3 cache for probe filtering and fast cache transfer. The hierarchy can support up to 50 outstanding misses from L2 to L3 per core, and 96 outstanding misses from L3 to main memory.

Fig. 2. “Zen” cache hierarchy.

C. Infinity Fabric

The IF’s SDF was built around several design tenets:

2) bandwidth scalability to support a broad range of products from Ryzen™ Mobile to EPYC™ servers (and even Radeon™ GPUs);
3) guaranteed quality of service (QoS) for real-time clients;
4) standardized interfaces to enable automated build flows for rapid deployment of network-on-chip (NOC); and
5) low latency, which is perhaps the most important tenet.

IF uses the enhanced coherent HyperTransport (cHT+) protocol built upon the cHT used in multiple generations of server deployments [8]. Zeppelin uses a seven-state MDOEFSI coherence protocol, in which the states are exclusively modified (M), dirty (D), shared modified (O), exclusive clean (E), forwarder clean (F), shared clean (S), and invalid (I). A distributed SRAM-based full directory is supported. The directory protocol supports directed multi-cast and broadcast probes. The protocol also allows for probe responses to be combined at the links.

SDF uses two standard interfaces: scalable data port (SDP) and fabric transport interface (FTI). Along with the standard interfaces, a modular design was key to building complex topologies. The main blocks within the data fabric, as shown in Fig. 3, are master, slave, transport switch, and Coherent AMD Socket Extender (CAKE). There are two types of masters on Zeppelin: cache coherent master (CCM) and an IO master and slave (IOMS). The master block in the data fabric abstracts the complexities of identifying the request target and routing functions away from the clients. Clients of the data fabric that initiate requests use an SDP port to talk to a master block in the data fabric. Clients that service requests use a slave SDP port. There are two types of slaves: the coherent slave (CS), traditionally known as a home agent, which hosts the directory, participates in ordering, and is responsible for maintaining coherency; and the IO slave, which provides access to devices. IOMS is built as a single block to allow upstream responses to push prior posted writes on the same port. CS interfaces with the memory controller shown as UMC in Fig. 1.

Fig. 3. Infinity data fabric topology.

The Zeppelin SoC has two DDR4 channels, two CCXs, support for up to four IF on-package (IFOP) links, and two IF inter-socket (IFIS) links.
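The per-die resources described above can be multiplied out to check them against the three package options quoted in the Introduction. The following is a minimal sketch rather than anything taken from the paper: the `Package` class and its field names are hypothetical, and the peak DRAM bandwidth figure assumes a standard 64-bit DDR4 data bus (8 data bytes per transfer), which the text does not state.

```python
from dataclasses import dataclass

# Per-die figures quoted in the text.
CCX_PER_DIE = 2
CORES_PER_CCX = 4
THREADS_PER_CORE = 2            # two-way SMT
L3_MB_PER_CCX = 8
DDR4_CHANNELS_PER_DIE = 2
COMBO_LINKS_PER_DIE = 2
PCIE_LANES_PER_COMBO_LINK = 16
DDR4_MT_PER_S = 2666            # maximum quoted transfer rate
# Assumption (not stated in the text): a standard 64-bit DDR4 data bus,
# i.e. 8 data bytes per transfer, with ECC bits not counted.
DDR4_BYTES_PER_TRANSFER = 8


@dataclass(frozen=True)
class Package:
    """Hypothetical record for one package option; names are illustrative."""
    name: str
    dies: int
    quoted_ddr4_channels: int
    quoted_pcie_lanes: int

    @property
    def cores(self) -> int:
        return self.dies * CCX_PER_DIE * CORES_PER_CCX

    @property
    def threads(self) -> int:
        return self.cores * THREADS_PER_CORE

    @property
    def l3_mb(self) -> int:
        return self.dies * CCX_PER_DIE * L3_MB_PER_CCX

    @property
    def ddr4_channels(self) -> int:
        return self.dies * DDR4_CHANNELS_PER_DIE

    @property
    def pcie_lane_ceiling(self) -> int:
        # Upper bound if every combo link on every die ran as 16-lane PCIe.
        return self.dies * COMBO_LINKS_PER_DIE * PCIE_LANES_PER_COMBO_LINK

    @property
    def peak_dram_gb_per_s(self) -> float:
        # Theoretical peak in decimal GB/s under the 64-bit bus assumption.
        return self.ddr4_channels * DDR4_MT_PER_S * DDR4_BYTES_PER_TRANSFER / 1000


# Package figures as quoted in the Introduction.
PACKAGES = [
    Package("AM4", dies=1, quoted_ddr4_channels=2, quoted_pcie_lanes=24),
    Package("sTR4", dies=2, quoted_ddr4_channels=4, quoted_pcie_lanes=64),
    Package("SP3", dies=4, quoted_ddr4_channels=8, quoted_pcie_lanes=128),
]

for p in PACKAGES:
    # The quoted channel counts should equal dies times channels per die,
    # and the quoted lane counts should not exceed the per-die ceiling.
    assert p.ddr4_channels == p.quoted_ddr4_channels
    assert p.quoted_pcie_lanes <= p.pcie_lane_ceiling
    print(f"{p.name}: {p.cores} cores, {p.threads} threads, {p.l3_mb} MB L3, "
          f"{p.ddr4_channels} DDR4 channels, "
          f"{p.quoted_pcie_lanes}/{p.pcie_lane_ceiling} PCIe lanes, "
          f"{p.peak_dram_gb_per_s:.1f} GB/s peak DRAM bandwidth")
```

Under these assumptions the quoted DDR4 channel counts match the per-die multiplication for all three packages, and the sTR4 and SP3 lane counts equal the ceiling of 32 lanes per die, while the AM4 figure of 24 lanes sits below it. The bandwidth numbers are theoretical peaks only and say nothing about sustained throughput.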