1

Graphene-based Wireless Agile Interconnects for Massive Heterogeneous Multi-chip Processors Sergi Abadal, Robert Guirado, Hamidreza Taghvaee, Akshay Jain, Elana Pereira de Santana, Peter Haring Bol´ıvar, Mohamed Saeed, Renato Negra, Zhenxing Wang, Kun-Ta Wang, Max C. Lemme, Joshua Klein, Marina Zapater, Alexandre Levisse, David Atienza, Davide Rossi, Francesco Conti, Martino Dazzi, Geethan Karunaratne, Irem Boybat and Abu Sebastian

Abstract—The main design principles in computer architecture Multicore have recently shifted from a monolithic scaling-driven approach Multi-chiplet 500–1000 mm2 processor with to the development of heterogeneous architectures that tightly system with 1–10 chiplets regular NoC integrated co-integrate multiple specialized processor and memory chiplets. 1–10 Tb/s off-chip NiP 100–1000 ns In such data-hungry multi-chip architectures, current Networks- (e.g. [3]) 1–10 pJ/bit in-Package (NiPs) may not be enough to cater to their heteroge- Point-to-point neous and fast-changing communication demands. This position paper makes the case for wireless in-package nanonetworking 2 as the enabler of efficient and versatile wired-wireless inter- CPU 100–500 mm | 10–100 cores | ~1 Tb/s connect fabrics for massive heterogeneous processors. To that 10–100 ns | 0.1–1 pJ/bit | Mesh Acc end, the use of graphene-based antennas and transceivers with Accelerator chiplet with dense, unique frequency-beam reconfigurability in the terahertz band versatile, lightweight NoC is proposed. The feasibility of such a nanonetworking vision and the main research challenges towards its realization are 2 analyzed from the technological, communications, and computer 1–100 mm 100–10000 cores architecture perspectives. 0.1–1 Tb/s 1–10 ns 0.1–0.5 pJ/bit Flexible INTRODUCTION The end of Dennard scaling has led to the rise of multicore Fig. 1. The chip-scale communication landscape in the heterogeneous chiplet era: Network-in-Package (NiP) to interconnect chiplets, Network-on-Chip processors, whereby a set of independent cores are integrated (NoC) for multicore processors, and dense fabrics for accelerators. For the within a single chip. In theory, the potential of these processors three scenarios, we list popular system sizes, number of nodes, bisection scales with the number of cores [1]. In practice, realizing bandwidth, latency, energy per transmitted bit, and topology. such a potential requires an on-chip interconnect capable of providing high-throughput and low-latency data sharing among cores. This fundamentally implies that communication, requirements. At the system level, communication becomes and not computation, becomes the main bottleneck in these dominated by off-chip transfers and mandated by the heteroge- multicore chips –especially as we scale towards thousand cores nous mixture of chiplets of a particular SiP. At the chiplet [2]. level, some accelerators are extremely data-intensive and re- Current architectural trends maintain the multicore essence, quire an ultra-dense on-chip interconnect, further extending the spectrum of communication needs of SiPs. AI accelerators are

arXiv:2011.04107v1 [cs.ET] 8 Nov 2020 but have recently resorted to disintegration and specialization to continue improving performance. Disintegration means that a very popular example of this trend [3]. large chips are being replaced by ensembles of smaller, het- To satisfy the communication needs of multicore chips, erogeneous chiplets that are interconnected through a silicon Network-on-Chip (NoC) has been the standard interconnect interposer within a System-in-Package (SiP), as illustrated in fabric in the last decade [2]. NoCs are packet-switched net- Figure 1 [3]. Specialization means that chiplets are becoming works of integrated routers and wires typically arranged in a increasingly optimized to accelerate a given particular compu- mesh topology. This approach, however, has important limi- tation. This has two key implications in the communications tations when scaled beyond tens of cores. In particular, chip- wide and broadcast transfers suffer from increasing latency Sergi Abadal ([email protected]), Robert Guirado, Hamidreza Taghvaee, and Akshay Jain are with the Universitat Politecnica` de Catalunya. Elana and energy consumption due to the many hops needed to Pereira de Santana and Peter Haring Bol´ıvar are with the University of Siegen. reach the destination(s) [2]. Thus, when developing ultra-dense Mohamed Saeed, Renato Negra, Kun-Ta Wang, and Max C. Lemme are with accelerators, the NoC is forced to become a specialized fabric the RWTH Aachen University. Zhenxing Wang, Kun-Ta Wang, and Max C. Lemme are also with AMO GmbH. Joshua Klein, Marina Zapater, Alexandre with little flexibility [3]. Further, when moving to multi-chip Levisse, and David Atienza are with the Swiss Federal Institute of Technology SiPs, the concept of NoC has been extended to incorporate off- Lausanne. Marina Zapater is also with University of Applied Sciences and chip links and form the Network-in-Package (NiP) [4], aggra- Arts Western Switzerland. Davide Rossi and Francesco Conti are with the University of Bologna. Martino Dazzi, Geethan Karunaratne, Irem Boybat vating their existing issues and adding new ones. For instance, and Abu Sebastian are with IBM Research Europe. NiPs are rigid and prone to over-provisioning in a scenario 2

TABLE I Steered beams COMPARISON OF DIFFERENT INTERCONNECT TECHNOLOGIES FOR Graphene Chiplets NETWORK-IN-PACKAGE (NIP).CAPACITYREFERSTOBISECTION (multiple bands) BANDWIDTH. Antenna Metric Interposer Optics Wireless Graphene Medium Wires Waveguides Package Package Frequency Baseband Optical mmWave Terahertz Capacity (Tb/s) 0.1–1 1–100 0.01–0.1 0.1–1 Latency (ns) 10–100 10–100 1–10 1–10 Energy (pJ/b) 1–10 0.1–10 1–10 1–10 Multiplexing No Time* Time Total** Broadcast Poor Expensive Native Native *only if global waveguides are used. **space, time, and frequency. that demands a versatile solution to address heterogeneous communication needs. This, together with the connectivity Package Mem constraints of each chiplet determined by the amount of Mem Chiplet Chiplet physical connection pins, severely limits the applicability and Interposer scalability of the SiP approach. ... Emerging technologies such as nanophotonics or wireless Package Substrate interconnects were originally proposed to address the issues Rest of the system of NoCs [5]. Now, given the increasing importance of off- chip links in chiplet-based architectures and the advent of Fig. 2. Schematic diagram of a System-in-Package (SiP) hosting a hetero- accelerators, there is a renewed interest in these technologies. geneous set of chiplets. The interconnect fabric is composed of a silicon interposer Network-in-Package (NiP) augmented with graphene-based agile In particular, nanophotonic networks have been proposed due wireless links. to its massive bandwidth density and high energy efficiency [6]. While this approach solves the bandwidth problem of NiPs, its flexibility is limited by the need for laying down the open issues and research challenges towards the realization waveguides across the system. The wireless approach, enabled of this nanonetworking vision, from the technological integra- by recent advances in integrated millimeter-wave (mmWave, tion aspect up to the heterogeneous architecture design and 30-300 GHz) devices, is theoretically capable of providing simulation. the desired reconfigurability and, thus, becomes an appealing complement to the wired networks [7]. AGILE WIRELESS INTERCONNECT In this paper, we make the case for wireless communications within a multi-chip package as the enabler of efficient and FABRICSENABLEDBY GRAPHENE scalable NiP-based architectures. While the approach may Figure 2 illustrates the proposed wireless networking vision be feasible with conventional Complementary Metal–Oxide– within the context of heterogeneous multi-chip architectures. Semiconductor (CMOS) mmWave technology, we pose that In this paradigm, graphene-based antennas and transceivers are its full potential will only be realized at the terahertz band co-integrated with the computing elements of a SiP and use (0.3–3 THz) and by means of graphene-based antennas and the system package as the wireless propagation medium. The transceivers (see Table I for a summarized comparison). At the wireless links form a network within and across chiplets that antenna, graphene offers unique tunability and miniaturization complements the wired interconnect fabric. properties, which enable the realization of compact tunable In this context, wireless communications offer multiple beam-steering arrays [8]. At the transceiver, graphene may benefits. By not needing to lay down wires between source enable ultra-high modulation speeds at low cost by virtue of its and destination, wireless is able to bypass I/O pin limitations high responsivity, excellent linearity, and wide dynamic range or wire routing constraints. The newly available bandwidth can [9]. We claim that, with the appropriate physical, link, and be shared dynamically to adapt to the needs of the architecture. network layer protocols, the proposed wireless communication The latency, critical in this scenario, can be reduced by plane will be able to deliver not only the architectural flexibil- an order of magnitude, especially in transfers that would ity required by heterogeneous multi-chip processors, but also otherwise require the traversal of long multihop paths within enough bandwidth to satisfy the needs of massive architectures the dense maze of wired interconnects. A particularly relevant with data-intensive accelerators [10]. example is broadcast, which involves flooding the network in This work aims to provide a description of the envis- wired NiPs. In the wireless case, broadcast becomes scalable aged nanonetworking scenario (Figure 2) from the techno- since additional chiplets only need to incorporate a transceiver logical, communications, and architectural perspectives. We to participate in the communication. first briefly discuss relevant literature in integrated graphene The use of graphene antennas turns the wireless intercon- antennas and transceivers. We also depict the structure of nect into a true wireless nanonetwork. Such compact beam- a typical multi-chip package and its wireless propagation steerable, frequency-tunable antennas lead to substantial im- characteristics, to then outline the roles of the wireless com- provements in antenna gain, achievable capacity, and network- munications plane within the architecture. Finally, we describe level flexibility. These additional benefits can be exploited 3 by the architecture through a judicious orchestration of the directive modes with steerable beams for parallel high-speed space, time, and frequency wireless channels. We next describe transmissions. The protocol stack, which in this context would the most salient characteristics at the antenna, network, and mostly be in charge of determining the most appropriate architecture fronts. transmission mode and combination of links, can exploit the monolithic nature of the system. In other words, the designer GRAPHENE-BASED ANTENNAS has control over multiple aspects of the whole architecture. This allows to eliminate issues such as the hidden terminal The concept of wireless flexibility in this scenario can or the deafness problem, where the transmitting and receiving only be realized by means of a compact, high-bandwidth, beams are not aligned. Moreover, it provides unique opportu- highly-reconfigurable wireless technology. Graphene is ideally nities for optimization not available in other scenarios, such as suited to this purpose thanks to its outstanding optoelectronic the use of heuristics or pre-established transmission schedules properties. Through the support of Surface Plasmon Polaritons adapted to known applications. Thanks to these opportunities, (SPPs) in the THz band, graphene opens the door to the a simple yet agile protocol stack can be envisaged to be design of THz antennas with ultra-broad tunable resonance and capable of dealing with the extreme density of the wireless reconfigurable beams. Although full-fledged prototypes are network and the stringent latency deadlines imposed by the still not available, the potential of graphene antennas has been applications (i.e. <10 ns). proven in multiple works [8], [11]. Moreover, the integration of graphene to mature process platforms such as CMOS has been HETEROGENEOUS COMPUTING ARCHITECTURES demonstrated with great success [12]. Therefore, graphene Dense wireless networks at the chip scale have the po- antennas can be integrated into conventional processor chips. tential to eliminate the bottlenecks of computing systems. As summarized in Figure 3, the unique plasmonic properties Historically, these systems have been based on von Neumann of graphene in the THz band allows to consider multiple architectures where processors and memory are separated and, spatial and frequency wireless channels. This is because SPPs hence, data communication is key. In this context, the ideal are slow waves allowing the reduction of the resonant length of scenario would be having many interconnected cores sharing the antenna by up to two orders of magnitude [8]. Moreover, data through a low-latency, multi-Tb/s, and energy-efficient SPPs in graphene are tunable electrically, which implies that memory. This, however, is hard to achieve in processors the resonant frequency of the antenna can be tuned within a with many cores, which are forced to use complex memory wide range by just changing a bias voltage [8]. Tunability also hierarchies to minimize the negative impact of data movement implies RF switchability, as antenna elements can be tuned [2]. In these processors, the cost of communication increases in or out of a certain fixed frequency. These are excellent by an order of magnitude when moving from the nearby small conditions for the development of ultra-compact, CMOS- caches to the larger but distant off-chip memories. This not compatible and digitally programmable transmitarrays able to only hurts performance and programmability, but also limits control both the frequency and direction of radiation without the scalability of the processor. the need for many RF chains [11]. As we discuss next, these Disintegration and specialization have been recently adopted antennas open the door to the creation of multiple spatial and to scale performance further, but communication continues frequency channels even in area-constrained scenarios such as being a bottleneck. On the one hand, disintegration implies that of chip-scale communications. placing multiple small chiplets on a common substrate and communicating then through high-speed interconnects. How- WIRELESS NETWORKSWITHINA COMPUTING ever, these interconnects cannot sustain architectures with PACKAGE more than a handful of chiplets due to physical and topo- The computing package is a new scenario for wireless logical constraints. On the other hand, specialization implies communications and possesses unique characteristics. Waves having dedicated engines such as Neural Processing Units propagate through a dense environment as illustrated in Figure (NPUs) for specific workloads like AI. In this case, the 4, which leads to frequent reflections at the surrounding of custom interconnects are generally not flexible enough to deal chips and at the package limits. Moreover, some materials with a wide range of applications. Beyond this, disruptive present at the chips such as bulk silicon are lossy. Hence, technologies such as in-memory computing are opening the the channel introduces high attenuation and potentially dense door to chiplets that deliver unprecedented performance by multipath [13]. Fortunately, this in-package channel is static partially alleviating the memory wall [14]. However, for highly and controlled. This implies that the package dimensions pipelined execution and for being general enough to deal and materials can be chosen to reduce both the losses and with different applications, an agile communication fabric is the channel length, making it compatible with low-power, essential. ultra-reliable broadband communication at mmWave and THz In this context, the vision of agile wireless networks at frequencies [13]. the chip scale is a very promising approach. On classical The use of graphene antennas allows to envision an agile architectures, wireless networks provide a way to connect and dense chip-scale wireless nanonetwork capable of switch- cores with distant off-chip memories with similar performance ing between multiple frequencies and transmission modes. than for on-chip caches. Moreover, their broadcast capabilities Links can be formed within or across chips, using either open the door to scalable cache coherence, which are neces- omnidirectional modes for low-latency broadcast or highly sary to guarantee the programmability of massively parallel 4

Biasing lines

Source Graphene Substrate

(a) Direction of radiation (b) (c)

Fig. 3. Evolution of graphene antennas from (a) simple graphene dipoles illustrating miniaturization and tunability, to (b) compact Yagi-Uda arrays showing joint frequency-beam reconfigurability, and (c) compact arrays with full steering capabilities. Data obtained from electromagnetic simulations and from [11].

in a very low antenna efficiency. The main reason is that the 0 graphene sheets that constitute the antenna must have high quality to achieve resonance and, while graphene shows such -10 quality as a free-standing layer, it quickly degrades when inte- grated into a THz component or circuit. Multiple approaches -20 have been proposed to remedy this, including the doping of the graphene sheets to improve efficiency, the improvement of the graphene integration process, or via impedance matching -30 between the antenna and the transceiver [10]. Path loss [dB] Besides an antenna, transceiver circuits operating at THz 0 1 2 10 10 10 frequencies need to be developed as well [10]. Traditional Distance [mm] CMOS transistors have an increasingly limited performance at Fig. 4. Path loss as a function of distance in an interposer system composed this higher frequency range, because it exceeds their maximum of four chiplets with four 60-GHz monopoles per chiplet. The inset illustrates oscillation frequency (fmax) and cutoff frequency (fT ). To the simulated landscape and the different propagation phenomena [13]. address this issue, heterogeneous technology solutions can be envisioned. Circuits can be implemented in high-frequency processors. On novel architectures based on disintegration and technologies like Silicon-Germanium (SiGe), which can op- specialization, the flexibility of THz wireless channels allows erate at 300 GHz, and be co-integrated with the graphene to improve the connectivity of chiplets while minimizing over- antennas. In order to reach even higher frequencies, graphene- provisioning. These features, coupled with the capabilities based active-mixing components could also be established as of emerging in-memory computing technologies, can be the an alternative for frequency up- and down-conversion with low cornerstone of new breed of general-purpose architectures with power consumption [15]. accelerator-level performance. Finally, the implementation of transceivers exploiting the unique properties of graphene-based field effect transistor RESEARCH CHALLENGES (FET)s is also a promising yet challenging alternative. Like in the antenna case, the reported f and f for graphene-based Albeit promising, the realization of multi-chip architectures max T FETs are lower than the expected values due to technology integrating wireless nano-networks entails multiple open chal- immaturity [15]. Consequently, the main challenge comes lenges. We next describe them in a bottom-up approach. again from the inability of the graphene FET to provide the necessary gain at THz frequencies.A solution for the receiver ANTENNAAND TRANSCEIVER IMPLEMENTATION to overcome this issue is reported in [9], but it comes at Multiple research groups have simulated graphene antennas the cost of a drastic reduction of the receiver sensitivity. in the past years, including resonant, leaky-wave, and even The receiver sensitivity is crucial to compensate the losses reflectarray antennas [8], [10], [11]. Simulation and numerical of the wireless channel. Another possible solution to provide analysis show an attractive reduction of the antenna size, as the imperative gain for both transmitters and receivers is to well as the possibility of tuning the antenna response by exploit graphene-based devices in a parametric transreactance only changing the biasing voltage. Unfortunately, until now, architecture. By not relying on the transimpedance of the no graphene antenna has been manufactured that matches transistor as in prior works, the device is not limited by the the theoretical predictions. The THz emission of a graphene fmax and fT and offers the potential to provide the required antenna observed in preliminary experiments is weak, resulting gain in the THz band. 5

TECHNOLOGICAL INTEGRATION relatively costly due to the need for bulky and power-hungry Graphene’s high carrier mobility, gate-tunable carrier den- components such as Phase-Locked Loops (PLLs). Second, sity and general compatibility with silicon technology [12] the channel is prone to (static) multipath [13]. Third, the bit -12 are key to the networking approach laid out in this paper. error rate required in this scenario is ∼10 . At upper layers However, many of its record-breaking performance indicators of design, the protocols must manage the reconfigurability have been achieved only in highly idealized environments properties of the graphene antennas. This represents an open or on champion devices with nearly perfect materials exfo- issue as none of the existing wireless chip-scale networks liated manually from natural crystals. Graphene transfer and proposals, e.g. [7], support dynamic beam-steering or fre- integration within the chip environment with its dielectrics, quency tuning simply because conventional on-chip antennas electrical contacts and passive components, and at the required cannot have such capabilities. In other words, there are no performance for wireless in-package communication, remains protocols in the literature that can orchestrate the space-time- an open challenge. frequency channels offered by graphene antennas with the In this direction, recent research has achieved the inte- simplicity needed at the chip scale, while adapting to the traffic gration of graphene-based RF components such as diodes requirements. A promising approach to tackle this problem is in a full transceiver circuit [9]. However, the target here to co-design the Medium Access Control (MAC) and network are the subTHz and THz ranges, which require different, protocols with the architecture, a possibility that is unique to more demanding designs for higher performance. Potentially, this scenario due to its monolithic nature. quartz-based monolithic integrated circuits would be able to attain the required integration for graphene-based wireless on- MASSIVELY PARALLEL HETEROGENEOUS chip communication. However, variability-tolerant design still ARCHITECTURES needs to be realized. The main reason is that, in the end, the processes also have to be compatible with future back-end- The performance, efficiency and flexibility of on-chip and of-line CMOS integration, especially with respect to process in-package wireless communication is an excellent opportunity technology and temperatures. to eliminate the bottleneck of current computing systems. The integration of graphene antennas is even more criticial The main challenges here are to identify the parts of the as they not only require significantly higher performance, architecture where wireless can make a larger impact, to then i.e. higher carrier mobility, than transceiver circuits, but also co-design the network and architecture to optimally exploit demand higher controllability on its electrical properties to the wireless interconnect and caring not to overload it. Often, achieve the promised tunability. The doping level of graphene the challenge resides in the disruptive nature of the potential is not yet intentionally controllable, neither replacing specific innovations that wireless can bring in the architecture domain, atoms or by means of chemical doping. Even if the doping which has been traditionally driven by rather incremental level can be controlled, there is usually a trade-off between the optimization cycles. carrier mobility and the doping level, meaning that high quality To illustrate this process, let us consider the case of het- and high controllability are hard to achieve simultaneously. erogeneous architectures for AI with conventional cores for Besides, in order to stabilize the performance of the graphene generic computations and accelerators for certain tasks. In the antenna, a delicate encapsulation of the graphene sheet is accelerator side, let us consider an emerging technology such highly demanded for the graphene antennas. This layer would as in-memory computing, which is particularly well suited for also affect the doping level of graphene significantly. All this deep learning [14]. requires high-level process and materials , which In deep learning applications, each layer of a deep neural remains a big challenge for the immature graphene-based network can be mapped to an in-memory computing core technology. that stores the corresponding synaptic weights. However, such accelerators present unique challenges when it comes to core- to-core communication and interconnection with conventional MODELSAND PROTOCOLSFOR WIRELESS IN-PACKAGE digital cores. The physically localized nature of synaptic COMMUNICATIONS weights in AI applications entails the need for communication The chip scale is one of the last frontiers in wireless across larger physical distances which could be prohibitive for communications and, as such, challenges abound. The scenario wired interconnects. Moreover, deep neural networks often exhibits a unique combination of unexplored aspects, stringent feature the need of multicasting data simultaneously from constraints (i.e. low area and power) and high performance one core to multiple others. In fact, one-to-many connectivity requirements (i.e. 1–10 ns and 10–100 Gb/s). For instance, seems to be increasingly prevalent in the most advanced the chip-scale THz channel remains largely unexplored, as networks, with one clear example being DenseNet, which cur- very few works have studied mmWave and THz wave prop- rently holds the record accuracy for ImageNet classification. In agation in enclosed packages [13]. Clearly, this prevents the this context, wireless technology brings multiple benefits, but direct application of channel models and protocols used in also poses several co-design challenges such as the optimal other scenarios and calls for novel, opportunistic, and highly dimensioning of the wireless network, the development of optimized solutions instead. appropriate multicasting schemes, or the optimal mapping In the envisaged scenario, physical layer protocols are chal- of neurons in the different wirelessly connected in-memory lenged by three main constraints. First, coherent schemes are computing cores. 6

HETEROGENEOUS SIMULATION FRAMEWORKS (ii) establishing methodologies for protocol-architecture co- design, and (iii) integrating wireless communication models The performance evaluation of complex computing systems into architectural simulators. embedding emerging technologies and architectures is gener- ally a considerable challenge that involves exploring the design space. The use of emerging technologies opens new explo- ration dimensions and thereby complicates the search, which ACKNOWLEDGMENT still needs to be performed in a computationally acceptable This work has been supported by the European Commission time-frame. Moreover, the exploration can only be done if the under H2020 grants WiPLASH (GA 863337), 2D-EPL (GA simulation tools are updated with appropriate performance and 952792), and Graphene Flagship (GA 881603); the European cost models of the new technology. Research Council under Consolidator grants COMPUSAPIEN Relevant to this work, wireless networking is generally (GA 725657) and PROJESTOR (GA 682675), the German poorly supported in computer architecture simulators. Existing Ministry of Education and Research under grant GIMMIK tools [4], [7] provide simple estimations of energy, perfor- (03XP0210) and the and the German Research Foundation mance, and area trade-offs for simple wireless interconnects under grant HIPEDI (WA 4139/1-1). in standard architectures. A cross-cutting integration between physical-level and system-level considerations is missing, which prevents the study of thermal effects, radiation and REFERENCES interference patterns, or their impact in the wireless network [1] B. Bohnenstiehl, A. Stillmaker, J. Pimentel, T. Andreas, B. Liu, A. Tran, performance. Therefore, new system simulators are required to E. Adeagbo, and B. Baas, “KiloCore: A Fine-Grained 1,000-Processor guide both application designers in the optimization of their Array for Task-Parallel Applications,” IEEE Micro, vol. 37, no. 2, pp. applications, and architects in evaluating complex heteroge- 63–69, 2017. [2] G. Nychis, C. Fallin, T. Moscibroda, O. Mutlu, and S. Seshan, “On-chip neous systems. networks from a networking perspective: congestion and scalability in The challenges associated with the development of such many-core interconnects,” in Proceedings of the SIGCOMM, 2012, pp. a simulator are several. First, the flexibility provided by the 407–18. [3] Y. S. Shao, J. Clemons, and R. Venkatesan et al., “Simba: Scaling frequency-beam reconfigurability and adaptive MAC protocols Deep-Learning Inference with Multi-Chip-Module-Based Architecture,” must be accurately modeled and integrated. Second, system- in Proceedings of the MICRO-52, 2019. level aspects such as the impact of transceiver circuits in [4] N. Enright Jerger, A. Kannan, Z. Li, and G. H. Loh, “NoC Architectures for Silicon Interposer Systems,” Proceedings of MICRO-47, pp. 458– the thermal profile of the chip, and vice versa, must be 470, 2014. evaluated and modeled as well. Third, and as mentioned above, [5] J. Kim, K. Choi, and G. Loh, “Exploiting new interconnect technologies wireless communication may lead to disruptive changes in in on-chip communication,” IEEE Journal on Emerging and Selected the architecture –and the simulator has to be prepared to Topics in Circuits and Systems, vol. 2, no. 2, pp. 124–136, 2012. [6] M. Wade, R. Meade, and C. Ramamurthy et al., “TeraPHY: A Chiplet cover such cases. This may involve the modeling of emerging Technology for Low-Power, High-Bandwidth In-Package Optical I/O,” technologies and specialized workloads that may be interesting IEEE Micro, vol. 40, no. 2, pp. 63–71, 2020. for wireless-enabled architectures, e.g. in-memory computing [7] S. Shamim, N. Mansoor, R. S. Narde, V. Kothandapani, A. Ganguly, and J. Venkataraman, “A Wireless Interconnection Framework for Seamless cores for AI acceleration. In all cases, the simulator must Inter and Intra-chip Communication in Multichip Systems,” IEEE Trans- avoid calling complex physical simulators and, instead, rely actions on Computers, vol. 66, no. 3, pp. 389–402, 2017. on accurate behavioral models based on antenna and circuit [8] D. Correas-Serrano and J. S. Gomez-Diaz, “Graphene-based Antennas for Terahertz Systems: A Review,” FERMAT, 2017. [Online]. Available: characterization, wave propagation simulation, protocol de- http://arxiv.org/abs/1704.00371 sign, and architecture development. [9] M. Saeed, A. Hamed, Z. Wang, M. Shaygan, D. Neumaier, and R. Negra, “Graphene integrated circuits: new prospects towards receiver realisa- tion,” Nanoscale, vol. 10, no. 1, pp. 93–99, 2018. [10] S. Abadal, E. Alarcon,´ M. C. Lemme, M. Nemirovsky, and A. Cabellos- CONCLUSION Aparicio, “Graphene-enabled Wireless Communication for Massive Multicore Architectures,” IEEE Communications Magazine, vol. 51, Computing system design trends are leading towards multi- no. 11, pp. 137–143, 2013. chip and accelerator-rich architectures, thus increasing the [11] A. Singh, M. Andrello, N. Thawdar, and J. M. Jornet, “Design and operation of a graphene-based plasmonic nano-antenna array for com- pressure cast to the already struggling wired interconnects. munication in the terahertz band,” IEEE Journal on Selected Areas in Wireless interconnects, with their broadcast capabilities and Communications, vol. 38, no. 9, pp. 2104–2117, 2020. flexibility, have been regarded as a valid complement to the [12] D. Neumaier, S. Pindl, and M. C. Lemme, “Integrating graphene into wired backbone, but they fall short in terms of bandwidth. In semiconductor fabrication lines,” Nature Materials, vol. 18, no. 6, p. 525, 2019. this position paper, we claim that graphene antennas can make [13] X. Timoneda, S. Abadal, A. Franques, D. Manessis, J. Zhou, J. Torrellas, a disruptive impact in future computing systems by virtue of E. Alarcon,´ and A. Cabellos-Aparicio, “Engineer the Channel and Adapt its high bandwidth and beam-frequency reconfigurability. The to it: Enabling Wireless Intra-Chip Communication,” IEEE Transactions on Communications, vol. 68, no. 5, pp. 3247–3258, 2020. use of graphene can actually turn a mere wireless interconnect [14] A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh, and E. Eleftheriou, into an agile and dense nanonetwork capable of adapting to “Memory devices and applications for in-memory computing,” Nature the needs of the architecture. For this vision to be realized, Nanotechnology, vol. 15, pp. 529–544, 2020. [15] A. Hamed, M. Saeed, and R. Negra, “Graphene-based frequency- however, challenges need to be addressed at multiple levels, conversion mixers for high-frequency applications,” IEEE Transactions from which we highlight the need for (i) optimized methods on Microwave Theory and Techniques, vol. 68, no. 6, pp. 2090–2096, to integrate the graphene antenna into the chip environment, 2020. 7

Sergi Abadal is a Distinguished Researcher at Universitat Politecnica` de Joshua Klein is a doctoral assistant at the Ecole´ Polytechnique Fed´ erale´ de Catalunya (UPC). His research interests include graphene antennas, chip-scale Lausanne (EPFL). His research interests include system-level design, real-time wireless communications, and computer architecture. systems, and RISC architectures for machine learning.

Robert Guirado is a graduate student at UPC. His research interests include wireless networks-on-chip, graph neural networks, and antenna design. Marina Zapater is Associate Professor at HES-SO and external collaborator in EPFL. Her research interests include the design and optimization of heterogeneous architectures, from AI-enabled edge devices to HPC servers.

Hamidreza Taghvaee is a PhD student at UPC. His main research interests include electromagnetics, metamaterials, and antenna design. Alexandre Levisse is a postdoctoral researcher in the Embedded Systems Laboratory (ESL) at EPFL. His research focuses on emerging technologies and architecture design for energy-efficient computing.

Akshay Jain is a Postdoctoral researcher at UPC, Spain. His research interests include 5G and beyond networks, machine learning, and optimization methods.

David Atienza is Professor of EE and Heads the Embedded Systems Laboratory (ESL) at EPFL. His research focuses on design methodologies for energy-efficient HPC and edge-AI computing for IoT. Elana Pereira de Santana is a PhD student at the Institute of High Frequency and Quantum at University of Siegen. Her research interests include 2-D materials and graphene-based antennas for THz communications.

Davide Rossi Assistant Professor at the University of Bologna. His research interests include ultra-low power multicore SoC design. Peter Haring Bol´ıvar is Chair for High Frequency and Quantum Electronics, University of Siegen. His research interests include terahertz technology, high frequency electronics, nanotechnology, and photonics.

Francesco Conti is Assistant Professor the University of Bologna. His research interests include machine learning applications and accelerators. Mohamed Saeed is a Postdoctoral researcher at the chair of High Frequency Electronics, RWTH Aachen University. His research interests include mixed signal and high frequency circuits in CMOS, III-V, and emerging technologies.

Martino Dazzi is a predoctoral researcher at IBM Research Europe. His research interests include machine learning, in-memory computing, and hard- ware acceleration for machine learning applications. Renato Negra is Professor for High Frequency Electronics at the RWTH Aachen University. His research interests include high frequency systems and circuits in both silicon, III-V, and 2D-material technologies.

Geethan Karunaratne is a predoctoral researcher at IBM Research Europe. His research interests include in-memory computing and emerging brain- Zhenxing Wang is the head of Graphene Electronics Group of AMO GmbH, inspired computing paradigms such as hyperdimensional computing. Aachen. His research interests include electronic, optoelectronic devices and circuits based on graphene and 2-D materials.

Irem Boybat is a postdoctoral researcher at IBM Research Europe. Her Kun-Ta Wang is a PhD student at RWTH Aachen University and research research interests include in-memory computing, neuromorphic computing assistant at AMO GmbH, Aachen. His research interests include graphene and emerging memory technologies. based antenna and electronic devices.

Max C. Lemme is Professor of Electronic Devices at RWTH Aachen Abu Sebastian is a Distinguished Research Staff Member at IBM Research University and Director of AMO GmbH, Aachen. His research interests Europe. His research interests include neuromorphic and in-memory comput- include electronic, optoelectronic and nanoelectromechanical devices based ing, exploratory memory devices and nanoscale science and engineering. on graphene, 2-D materials and Perovskites.