GEMM-like Tensor-Tensor Contraction Paul Springer Paolo Bientinesi AICES, RWTH Aachen University AICES, RWTH Aachen University Aachen, Germany Aachen, Germany
[email protected] [email protected] 1 INTRODUCTION 2 GEMM-LIKE TENSOR CONTRACTION Tensor contractions (TCs)—a generalization of matrix-matrix mul- Let A, B, and C be tensors (i.e., multi-dimensional arrays), a tensor tiplications (GEMM) to multiple dimensions—arise in a wide range contraction is: of scientific computations such as machine learning1 [ ], spectral CI α × AI × BI + β × CI ; (1) element methods [10], quantum chemistry calculations [2], mul- C A B C tidimensional Fourier transforms [4] and climate simulations [3]. where IA, IB, and IC denote symbolic index sets. We further define Despite the close connection between TCs and matrix-matrix prod- three important index sets: Im := IA \ IC, In := IB \ IC, and ucts, the performance of the former is in general vastly inferior to Ik := IA \ IB, respectively denoting the free indices of A, the that of an optimized GEMM. To close such a gap, we propose a novel free indices B, and the contracted indices. Moreover, we adopt the approach: GEMM-like Tensor-Tensor (GETT) contraction [8]. Einstein notation, indicating that all indices of Ik are implicitly GETT captures the essential features of a high-performance summed (i.e., contracted). GEMM. In line with a highly optimized GEMM, and in contrast to Adhering to this definition, we can reformulate Equation (1) previous approaches to TCs, our high-performance implementa- such that its similarity to a matrix-matrix multiplication becomes tion of the GETT approach is fully vectorized, exploits the CPU’s apparent: cache hierarchy, avoids explicit preprocessing steps, and operates C C α ×A A ×B B +β ×C C ; (2) Π ¹Im [In º Π ¹Im [Ik º Π ¹In [Ik º Π ¹Im [In º on arbitrary dimensional subtensors while preserving stride-one A B C memory accesses.