A Many-core Architecture for In-Memory Data Processing Sandeep R Agrawal Sam Idicula Arun Raghavan
[email protected] [email protected] [email protected] Oracle Labs Oracle Labs Oracle Labs Evangelos Vlachos Venkatraman Govindaraju Venkatanathan Varadarajan
[email protected] [email protected] venkatanathan.varadarajan@oracle. Oracle Labs Oracle Labs com Oracle Labs Cagri Balkesen Georgios Giannikis Charlie Roth
[email protected] [email protected] [email protected] Oracle Labs Oracle Labs Oracle Labs Nipun Agarwal Eric Sedlar
[email protected] [email protected] Oracle Labs Oracle Labs ABSTRACT ACM Reference format: For many years, the highest energy cost in processing has been Sandeep R Agrawal, Sam Idicula, Arun Raghavan, Evangelos Vlachos, Venka- traman Govindaraju, Venkatanathan Varadarajan, Cagri Balkesen, Georgios data movement rather than computation, and energy is the limiting Giannikis, Charlie Roth, Nipun Agarwal, and Eric Sedlar. 2017. A Many-core factor in processor design [21]. As the data needed for a single Architecture for In-Memory Data Processing. In Proceedings of MICRO-50, application grows to exabytes [56], there is clearly an opportunity Cambridge, MA, USA, October 14–18, 2017, 14 pages. to design a bandwidth-optimized architecture for big data compu- https://doi.org/10.1145/3123939.3123985 tation by specializing hardware for data movement. We present the Data Processing Unit or DPU, a shared memory many-core that is specifically designed for high bandwidth analytics workloads. 1 INTRODUCTION The DPU contains a unique Data Movement System (DMS), which A large number of data analytics applications in areas varying provides hardware acceleration for data movement and partition- from business intelligence, health sciences and real time log and ing operations at the memory controller that is sufficient to keep telemetry analysis already benefit from working sets that span up with DDR bandwidth.