Graphene-Based Wireless Agile Interconnects for Massive Heterogeneous Multi-Chip Processors

1 Graphene-based Wireless Agile Interconnects for Massive Heterogeneous Multi-chip Processors Sergi Abadal, Robert Guirado, Hamidreza Taghvaee, Akshay Jain, Elana Pereira de Santana, Peter Haring Bol´ıvar, Mohamed Saeed, Renato Negra, Zhenxing Wang, Kun-Ta Wang, Max C. Lemme, Joshua Klein, Marina Zapater, Alexandre Levisse, David Atienza, Davide Rossi, Francesco Conti, Martino Dazzi, Geethan Karunaratne, Irem Boybat and Abu Sebastian Abstract—The main design principles in computer architecture Multicore have recently shifted from a monolithic scaling-driven approach Multi-chiplet 500–1000 mm2 processor with to the development of heterogeneous architectures that tightly system with 1–10 chiplets regular NoC integrated co-integrate multiple specialized processor and memory chiplets. 1–10 Tb/s off-chip NiP 100–1000 ns In such data-hungry multi-chip architectures, current Networks- (e.g. [3]) 1–10 pJ/bit in-Package (NiPs) may not be enough to cater to their heteroge- Point-to-point neous and fast-changing communication demands. This position paper makes the case for wireless in-package nanonetworking 2 as the enabler of efficient and versatile wired-wireless inter- CPU 100–500 mm | 10–100 cores | ~1 Tb/s connect fabrics for massive heterogeneous processors. To that 10–100 ns | 0.1–1 pJ/bit | Mesh Acc end, the use of graphene-based antennas and transceivers with Accelerator chiplet with dense, unique frequency-beam reconfigurability in the terahertz band versatile, lightweight NoC is proposed. The feasibility of such a nanonetworking vision and the main research challenges towards its realization are 2 analyzed from the technological, communications, and computer 1–100 mm 100–10000 cores architecture perspectives. 0.1–1 Tb/s 1–10 ns 0.1–0.5 pJ/bit Flexible INTRODUCTION The end of Dennard scaling has led to the rise of multicore Fig. 1. The chip-scale communication landscape in the heterogeneous chiplet era: Network-in-Package (NiP) to interconnect chiplets, Network-on-Chip processors, whereby a set of independent cores are integrated (NoC) for multicore processors, and dense fabrics for accelerators. For the within a single chip. In theory, the potential of these processors three scenarios, we list popular system sizes, number of nodes, bisection scales with the number of cores [1]. In practice, realizing bandwidth, latency, energy per transmitted bit, and topology. such a potential requires an on-chip interconnect capable of providing high-throughput and low-latency data sharing among cores. This fundamentally implies that communication, requirements. At the system level, communication becomes and not computation, becomes the main bottleneck in these dominated by off-chip transfers and mandated by the heteroge- multicore chips –especially as we scale towards thousand cores nous mixture of chiplets of a particular SiP. At the chiplet [2]. level, some accelerators are extremely data-intensive and re- Current architectural trends maintain the multicore essence, quire an ultra-dense on-chip interconnect, further extending the spectrum of communication needs of SiPs. AI accelerators are arXiv:2011.04107v1 [cs.ET] 8 Nov 2020 but have recently resorted to disintegration and specialization to continue improving performance. Disintegration means that a very popular example of this trend [3]. large chips are being replaced by ensembles of smaller, het- To satisfy the communication needs of multicore chips, erogeneous chiplets that are interconnected through a silicon Network-on-Chip (NoC) has been the standard interconnect interposer within a System-in-Package (SiP), as illustrated in fabric in the last decade [2]. NoCs are packet-switched net- Figure 1 [3]. Specialization means that chiplets are becoming works of integrated routers and wires typically arranged in a increasingly optimized to accelerate a given particular compu- mesh topology. This approach, however, has important limi- tation. This has two key implications in the communications tations when scaled beyond tens of cores. In particular, chip- wide and broadcast transfers suffer from increasing latency Sergi Abadal ([email protected]), Robert Guirado, Hamidreza Taghvaee, and Akshay Jain are with the Universitat Politecnica` de Catalunya. Elana and energy consumption due to the many hops needed to Pereira de Santana and Peter Haring Bol´ıvar are with the University of Siegen. reach the destination(s) [2]. Thus, when developing ultra-dense Mohamed Saeed, Renato Negra, Kun-Ta Wang, and Max C. Lemme are with accelerators, the NoC is forced to become a specialized fabric the RWTH Aachen University. Zhenxing Wang, Kun-Ta Wang, and Max C. Lemme are also with AMO GmbH. Joshua Klein, Marina Zapater, Alexandre with little flexibility [3]. Further, when moving to multi-chip Levisse, and David Atienza are with the Swiss Federal Institute of Technology SiPs, the concept of NoC has been extended to incorporate off- Lausanne. Marina Zapater is also with University of Applied Sciences and chip links and form the Network-in-Package (NiP) [4], aggra- Arts Western Switzerland. Davide Rossi and Francesco Conti are with the University of Bologna. Martino Dazzi, Geethan Karunaratne, Irem Boybat vating their existing issues and adding new ones. For instance, and Abu Sebastian are with IBM Research Europe. NiPs are rigid and prone to over-provisioning in a scenario 2 TABLE I Steered beams COMPARISON OF DIFFERENT INTERCONNECT TECHNOLOGIES FOR Graphene Chiplets NETWORK-IN-PACKAGE (NIP). CAPACITY REFERS TO BISECTION (multiple bands) BANDWIDTH. Antenna Metric Interposer Optics Wireless Graphene Medium Wires Waveguides Package Package Frequency Baseband Optical mmWave Terahertz Capacity (Tb/s) 0.1–1 1–100 0.01–0.1 0.1–1 Latency (ns) 10–100 10–100 1–10 1–10 Energy (pJ/b) 1–10 0.1–10 1–10 1–10 Multiplexing No Time* Time Total** Broadcast Poor Expensive Native Native *only if global waveguides are used. **space, time, and frequency. that demands a versatile solution to address heterogeneous communication needs. This, together with the connectivity Package Mem constraints of each chiplet determined by the amount of Mem Chiplet Chiplet physical connection pins, severely limits the applicability and Interposer scalability of the SiP approach. ... Emerging technologies such as nanophotonics or wireless Package Substrate interconnects were originally proposed to address the issues Rest of the system of NoCs [5]. Now, given the increasing importance of off- chip links in chiplet-based architectures and the advent of Fig. 2. Schematic diagram of a System-in-Package (SiP) hosting a hetero- accelerators, there is a renewed interest in these technologies. geneous set of chiplets. The interconnect fabric is composed of a silicon interposer Network-in-Package (NiP) augmented with graphene-based agile In particular, nanophotonic networks have been proposed due wireless links. to its massive bandwidth density and high energy efficiency [6]. While this approach solves the bandwidth problem of NiPs, its flexibility is limited by the need for laying down the open issues and research challenges towards the realization waveguides across the system. The wireless approach, enabled of this nanonetworking vision, from the technological integra- by recent advances in integrated millimeter-wave (mmWave, tion aspect up to the heterogeneous architecture design and 30-300 GHz) devices, is theoretically capable of providing simulation. the desired reconfigurability and, thus, becomes an appealing complement to the wired networks [7]. AGILE WIRELESS INTERCONNECT In this paper, we make the case for wireless communications within a multi-chip package as the enabler of efficient and FABRICS ENABLED BY GRAPHENE scalable NiP-based architectures. While the approach may Figure 2 illustrates the proposed wireless networking vision be feasible with conventional Complementary Metal–Oxide– within the context of heterogeneous multi-chip architectures. Semiconductor (CMOS) mmWave technology, we pose that In this paradigm, graphene-based antennas and transceivers are its full potential will only be realized at the terahertz band co-integrated with the computing elements of a SiP and use (0.3–3 THz) and by means of graphene-based antennas and the system package as the wireless propagation medium. The transceivers (see Table I for a summarized comparison). At the wireless links form a network within and across chiplets that antenna, graphene offers unique tunability and miniaturization complements the wired interconnect fabric. properties, which enable the realization of compact tunable In this context, wireless communications offer multiple beam-steering arrays [8]. At the transceiver, graphene may benefits. By not needing to lay down wires between source enable ultra-high modulation speeds at low cost by virtue of its and destination, wireless is able to bypass I/O pin limitations high responsivity, excellent linearity, and wide dynamic range or wire routing constraints. The newly available bandwidth can [9]. We claim that, with the appropriate physical, link, and be shared dynamically to adapt to the needs of the architecture. network layer protocols, the proposed wireless communication The latency, critical in this scenario, can be reduced by plane will be able to deliver not only the architectural flexibil- an order of magnitude, especially in transfers that would ity required by heterogeneous multi-chip processors, but also otherwise require the traversal of long multihop paths within enough bandwidth to satisfy the needs of massive architectures

Graphene-Based Wireless Agile Interconnects for Massive Heterogeneous Multi-Chip Processors

A Product Engine for Energy-Efficient Execution of Binary Neural Networks Using Resistive Memories

Connecting Artificial Intelligence (AI) to Internet of Things (Iot) November 11 to 13, 2020

Ecocloud E-Newsletter in This Issue: Welcome Message from the Executive Committee in the News 2017 Ecocloud Annual Event Around

Iccad 2020 Virtual Event

Adressing Thermal Issues in 3D Stacks in the Nano-Tera CMOSAIC Project

Opening Session & Award Presentations

Temperature-Aware Design and Management for 3D Multi-Core Architectures

PRESS RELEASE DATE 2017 Event Highlights Electronics for the Internet of Things Era and Wearable and Smart Medical Devices

A Semi-Analytical Thermal Modeling Framework for Liquid-Cooled Ics Arvind Sridhar, Member, IEEE,Mohamedm.Sabry,Member, IEEE, and David Atienza, Senior Member, IEEE

System-Level Thermal-Aware Design of 3D Multiprocessors with Inter-Tier Liquid Cooling

Ayse Kivilcim Coskun

CEDA Standards Committee Nominations Requested for EDA