
A Systematic Survey of General Sparse Matrix-Matrix Multiplication

Jianhua Gao
School of Computer Science and Technology
Beijing Institute of Technology
Beijing, China
[email protected]

Weixing Ji
School of Computer Science and Technology
Beijing Institute of Technology
Beijing, China
[email protected]

Zhaonian Tan
School of Computer Science and Technology
Beijing Institute of Technology
Beijing, China
[email protected]

Yueyan Zhao
School of Information and Electronics
Beijing Institute of Technology
Beijing, China
[email protected]

arXiv:2002.11273v1 [cs.DC] 26 Feb 2020

Abstract—SpGEMM (General Sparse Matrix-Matrix Multiplication) has attracted much attention from researchers in the fields of multigrid methods and graph analysis. Many optimization techniques have been developed for certain application fields and computing architectures over the decades. The objective of this paper is to provide a structured and comprehensive overview of the research on SpGEMM. Existing optimization techniques have been grouped into different categories based on their target problems and architectures. Covered topics include SpGEMM applications, size prediction of the result matrix, matrix partitioning and load balancing, result accumulating, and target architecture-oriented optimization. The rationales of different algorithms in each category are analyzed, and a wide range of SpGEMM algorithms are summarized. This survey sufficiently reveals the latest progress and research status of SpGEMM optimization from 1977 to 2019. More specifically, an experimental comparative study of existing implementations on CPU and GPU is presented. Based on our findings, we highlight future research directions and how future studies can leverage our findings to encourage better design and implementation.

Index Terms—SpGEMM, Parallel Optimization, Sparse Matrix Partitioning, Parallel Architecture.

I. INTRODUCTION

General sparse matrix-matrix multiplication, abbreviated as SpGEMM, is a fundamental and expensive computational kernel in numerous scientific computing applications and graph algorithms, such as algebraic multigrid solvers [1] [2] [3] [4], triangle counting [5] [6] [7] [8], multi-source breadth-first searching [9] [10] [11], shortest path finding [12], colored intersecting [13] [14], and subgraphs [15] [16]. Hence the optimization of SpGEMM has the potential to impact a wide variety of applications.

Given two input sparse matrices A and B, SpGEMM computes a sparse matrix C such that C = A × B, where A ∈ Rp×q, B ∈ Rq×r, C ∈ Rp×r, and the operator × represents the general matrix multiplication. Let aij denote the element in the ith row and the jth column of matrix A. We write ai∗ to denote the vector comprising the entries in the ith row of A, and a∗j to denote the vector comprising the entries in the jth column of A. In the general matrix-matrix multiplication (GEMM), the ith row of the result matrix C can be generated by

    ci∗ = (ai∗ · b∗1, ai∗ · b∗2, ..., ai∗ · b∗r),    (1)

where the operation · represents the dot product of two vectors [17].

Compared with GEMM, SpGEMM needs to exploit the sparsity of the two input matrices and the result matrix, because of the huge memory and computational cost that the zero entries incur under dense formats. Most of the research on SpGEMM is based on the row-wise SpGEMM formulation given by Gustavson [18]. The i-th row of the result matrix C is given by ci∗ = Σj aij ∗ bj∗, where the operator ∗ represents scalar multiplication and the computation only occurs on the non-zero entries of the sparse matrices A and B. The pseudocode for Gustavson's SpGEMM algorithm is presented in Algorithm 1.

Algorithm 1: Pseudocode for Gustavson's SpGEMM algorithm.
  Input: A ∈ Rp×q, B ∈ Rq×r
  Output: C ∈ Rp×r
  1  for ai∗ ∈ A do
  2    for aij ∈ ai∗ and aij is non-zero do
  3      for bjk ∈ bj∗ and bjk is non-zero do
  4        value ← aij ∗ bjk;
  5        if cik ∉ ci∗ then
  6          cik ← 0;
  7        end
  8        cik ← cik + value;
  9      end
  10   end
  11 end

The goal of this survey paper is to provide a structured and broad overview of the extensive research on SpGEMM design and implementation spanning multiple application domains and research areas. We not only strive to provide an overview of existing algorithms, data structures, and libraries, but also try to uncover the design decisions behind the various implementations. It gives an in-depth presentation of the many algorithms and libraries available for computing SpGEMM.

Sparse matrices have been the topic of many surveys and reviews. Duff et al. [19] provide an extensive survey of sparse matrix research developed before the year of 1976. A broad review of sparse matrix and vector multiplication is presented by Grossman et al. [20]. Over the past decades, SpGEMM has attracted intensive attention in a wide range of application fields, and many new designs and optimizations have been proposed targeting different architectures. The development of multi/many-core architectures and parallel programming has brought new opportunities for the acceleration of SpGEMM, and the challenging problems that have emerged need to be considered intensively. To the best of our knowledge, this is the first survey paper that overviews the developments of SpGEMM over the decades. The goal of this survey is to present a working knowledge of the underlying theory and practice of SpGEMM for solving large-scale scientific problems, and to provide an overview of the algorithms, data structures, and libraries available to solve these problems, so that readers can both understand the methods and know how to use them best. This survey covers the application domains, challenging problems, architecture-oriented optimization techniques, and performance evaluation of available implementations for SpGEMM. We adopt the categories of challenging problems given in [21] and summarize existing techniques about size

II. LITERATURE OVERVIEW

A. Research Questions

This study follows the guidelines of the systematic literature review proposed by Kitchenham [22], which was initially used in medical science but later gained interest in other fields as well. According to the three main phases of planning, conducting, and reporting, we have formulated the following research questions:

RQ1: What are the applications of SpGEMM, and how are they formulated?
RQ2: What is the current state of SpGEMM research?
RQ3: How do the SpGEMM implementations on off-the-shelf processors perform?
RQ4: What challenges could be inferred from the current research effort that will inform future research?

Regarding RQ1, the study will address how SpGEMM is used in these applications, including sub-questions such as: What are the typical applications of SpGEMM? Is it used in an iterative algorithm? Do the matrices stay constant during the computation? This will give an insight into the requirements of SpGEMM in solving real problems. In RQ2, the study looks at existing techniques that were proposed in recent decades. Regarding RQ3, we will perform some evaluations on both CPU and GPU to get a general idea about the performance of these implementations. Finally, in RQ4, we will summarize the challenges and future research directions according to our investigative results.

B. Search Procedure

There have been some other reviews presented in the related work section [21]. However, they cover a more limited number of studies. We evaluate and interpret all high-quality studies related to SpGEMM in this study.
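As a concrete rendering of Algorithm 1, the following is a minimal, runnable Python sketch of Gustavson's row-wise SpGEMM operating directly on CSR arrays (indptr, indices, data). It is an illustration only, not code from any of the surveyed libraries; the function name and the choice of a dense per-row accumulator (one of several accumulator strategies discussed in Section IV) are ours.

```python
import numpy as np

def gustavson_spgemm(p, r, A_indptr, A_indices, A_data,
                     B_indptr, B_indices, B_data):
    """Row-wise SpGEMM (Algorithm 1): C = A * B for A (p x q) and B (q x r) in CSR form."""
    acc = np.zeros(r)                     # dense accumulator for the current row of C
    occupied = []                         # column indexes touched in the current row
    C_indptr, C_indices, C_data = [0], [], []
    for i in range(p):
        for t in range(A_indptr[i], A_indptr[i + 1]):        # non-zero a_ij in row i of A
            j, a_ij = A_indices[t], A_data[t]
            for s in range(B_indptr[j], B_indptr[j + 1]):    # non-zero b_jk in row j of B
                k = B_indices[s]
                if k not in occupied:
                    occupied.append(k)
                acc[k] += a_ij * B_data[s]                   # c_ik <- c_ik + a_ij * b_jk
        for k in sorted(occupied):        # emit row i of C with ascending column indexes
            C_indices.append(k)
            C_data.append(acc[k])
            acc[k] = 0.0                  # reset the accumulator for the next row
        occupied.clear()
        C_indptr.append(len(C_indices))
    return np.array(C_indptr), np.array(C_indices), np.array(C_data)
```

Wrapping the returned arrays in scipy.sparse.csr_matrix((data, indices, indptr), shape=(p, r)) and comparing against A @ B is a quick way to sanity-check the sketch.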
predicting of the result matrix, load balancing, and intermedi- To have a broad coverage, we perform a systematic literature ate results merging. survey by indexing papers from several popular digital libraries The main factors that differentiate our survey from the (IEEE Explore Digital Library, ACM Digital Library, Elsevier existing related work are: ScienceDirect, Springer Digital Library, Google Scholar, Web of Science, DBLP) using the keywords “SpGEMM”, “Sparse • the systematic process for collecting the data; matrix”, “sparse matrix multiplication”, “sparse matrix-matrix • a classified list of studies on both applications and opti- multiplication”. It is an iterative process, and the keywords mization techniques; are fine-tuned according to the returned results step by step. • analysis of the collected studies in the view of both Just like other literature survey work, we try to broaden the algorithm design and architecture optimization; search as much possible while maintaining a manageable result • an detailed evaluation of 6 library implementations. . In order to do that, only refereed journal and conference The rest of the paper is organized as follows. Section publications were included. Then, we read the titles and 2 gives a brief overview of this survey. Section 3 intro- abstracts of these papers and finally include 87 papers of more duces several classical applications of SpGEMM in detail. than 142 papers we found in this survey. In section 4, the related work on three pivotal problems of SpGEMM computation is presented. In section 5, the related C. Search Result work of SpGEMM optimization based on different platforms It is difficult to give a sufficient and systematic taxonomy is introduced in detail. Besides, we perform a performance for classifying SpGEMM research because the same topic may evaluation employing different SpGEMM implementations in have different meanings in different contexts. For example, section 6. A discussion about the challenges and future work load balance of SpGEMM in distributed computing, shared of SpGEMM is presented in section 7, and followed by our memory multi-core computing, single GPU computing, multi- conclusion in section 8. GPU computing, CPU+GPU computing. We select several TABLE I: Categories of selected papers. Algorithm 2: Construction of several important operators Category Papers in AMG [2]. Input: A, B Multigrid solvers [1] [2] [3] [4] [23] [24] [25] Triangle counting [5] [6] [7] [8] [26] [27] [28] Output: A1, ..., AL,P0, ..., PL−1 Application MBFS [9] [10] [11] [29] 1 A0 ← A, B0 ← B; Other [12] [13] [14] [15] [16] [30] [31] 2 for l = 0, ..., L − 1 do [32] [33] [34] [35] [36] [37] [38] [39] [40] [41] [42] // Strength-of-connection of edges Algorithm [18] [21] [43] [44] [45] [46] [47] 3 Sl =strength(Al); [14] [7] [48] [49] [50] [2] [51] [52] // Construct interpolation and [53] [42] [54] [3] [55] [56] [57] Optimization [58] [59] [60] [61] [62] [63] [64] restriction [16] [65] [66] [67] [68] [69] [25] 4 Pl =interpolation(Al,Sl); [70] [71] [72] [73] [74] [75] [26] // Galerkin product (triple sparse [41] [76] [?] 
[77] Many-core CPU [69] [51] [25] [59] [58] [68] matrices multiplication) GPU [21] [57] [47] [58] [68] T 5 Al+1 = Pl AlPl; Distributed [78] [42] [79] [67] [46] 6 end Heterogeneous [17] [53] [80] [81] FPGA [82] [83] Programming models [84] [85] [78] [54] Surveys [19] [20] [22] Benchmark [86] the coarse-grid operator is constructed by a Galerkin triple- matrix product (line 5 of Algorithm 2), which is implemented with two sparse matrix-matrix multiplications [1] [2] [3] [4] topics that are frequently covered by existing papers, and [25]. 80% present the correlation between papers and topics in Table I. These matrix-matrix multiplications, taking more than of the total construction time, are the most expensive com- III.TYPICAL APPLICATION ponents. Moreover, the construction of operators (and thus SpGEMM) is a considerable part of the overall execution since SpGEMM is mainly used in and graph it may occur at every time step (for transient problems) or even analytics, and among which algebraic multigrid solver, triangle multiple times per time step (for non-linear problems), making counting, and multi-source breadth-first search are three most it important to optimize SpGEMM [25]. classical applications of SpGEMM. In the following subsec- tions, we will introduce the procedures of these three applica- B. Triangle Counting tions and how the SpGEMM is applied in these applications Triangle counting is one of the major kernels for graph in detail. analysis and constitutes the core of several critical graph algorithms, such as triangle enumeration [8] [7] [26], subgraph A. Algebraic Multigrid Solvers isomorphism [30] [31], and k-truss computation [31] [32]. Algebraic Multigrid (AMG) is a multi-grid method de- The problem of triangle counting in an undirected graph is veloped by utilizing some important principles and concepts to find the number of triangles in the graph. Given a graph of Geometric multigrid (GMG), and it requires no explicit G = (V,E), it can be represented by a square sparse matrix geometric knowledge of the problem. AMG method itera- A of dimension n × n, where n is the number of vertexes tively solves large and sparse linear systems Ax = b by in graph G: n = |V |, and either the matrix entry aij or automatically constructing a hierarchy of grids and inter-grid aji represents the edge (i, j) [5]. Azad et al. [8] describe transfer operators. It achieves optimality by employing two an -based triangle counting method based complementary processes: smoothing and coarse-grid correc- on Cohen’s MapReduce algorithm [6]. First, they construct tion. The smoother is generally fixed to be a simple point- an adjacency matrix A from graph G, with its rows ordered wise method such as Gauss-Seidel or weighted Jacobi for with the increasing number of non-zeros they contain (equal eliminating the high-frequency error. Coarse-grid correction to the degree of the corresponding vertex). Assuming that involves three phases, including projecting error to a coarser sparse matrix L is the lower triangular part of A holding edges grid through restriction operator, solving a coarse-grid system (i, j) where i > j, and U is the upper triangular part of A of equations, and then projecting the solution back to the fine holding edges (i, j) where i < j, then A = L + U. Next, grid through interpolation (also called prolongation) operator all the wedges of (i, j, k) where j is the smallest numbered [23] [24]. 
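Line 5 of Algorithm 2 is where SpGEMM enters AMG: the Galerkin coarse-grid operator is formed by two back-to-back sparse matrix-matrix products. The sketch below illustrates this with scipy.sparse as a stand-in for the AMG libraries discussed above; the piecewise-constant interpolation operator P is a toy placeholder, not the output of a real strength/interpolation routine.

```python
import scipy.sparse as sp

def galerkin_product(A_l, P_l):
    """Galerkin triple product A_{l+1} = P_l^T * A_l * P_l, computed as two SpGEMMs."""
    AP = A_l @ P_l            # first SpGEMM:  (n_f x n_f) * (n_f x n_c)
    return P_l.T @ AP         # second SpGEMM: (n_c x n_f) * (n_f x n_c)

# toy setup: 1D Poisson matrix on 8 fine points, aggregated pairwise onto 4 coarse points
n_f, n_c = 8, 4
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n_f, n_f), format="csr")
P = sp.csr_matrix(([1.0] * n_f, (list(range(n_f)), [i // 2 for i in range(n_f)])),
                  shape=(n_f, n_c))
print(galerkin_product(A, P).toarray())
```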
vertex are counted via the SpGEMM operation: B = L × U, Algorithm 2 describes the construction of several important and B(i, k) represents the count for (i, j, k) wedges. Finally, operators in AMG. First, the strength operation identifies the finding the close wedges by doing element-wise multiplication strongly connected edges in the matrix Al at level l to construct with the original matrix: C = A. ∗ B. Algorithm 3 presents strength-of-connection matrix Sl (line 3 of Algorithm 2). Next, the pseudocode for the serial adjacency matrix-based triangle the interpolation operator Pl is constructed to transfer informa- counting algorithm, and an example of triangle counting is tion from level l +1 to level l (line 4 of Algorithm 2). Finally, presented in Fig. 1. Algorithm 3: Pseudocode for the serial adjacency matrix- connectivity, finding the shortest path, and finding k-hop based triangle counting algorithm [6] [8]. neighbors [36] [37]. Input: Ordered adjacency matrix A of graph G The goal of the BFS is to traverse a graph from a given source vertex, which can be performed by a sparse matrix- Output: Number Nt of triangles in graph G 1 (L, U) ←Split(A); vector multiplication (SpMV) between the adjacency matrix 2 B ← SpGEMM(L,U); of a graph and a sparse vector [29]. 3 C ←ElementWiseSpGEMM(A, B); Let us take an example in a sparse unweighted graph G = // Sum all entries of matrix C (V,E) represented by sparse matrix A. Assume n = |V |, then the size of A is n × n. Let x be a sparse vector with x(i) = 1 4 Nt ← Sumi,j(cij)/2; and all other elements being zero, then the 1-hop vertexes from source vertex i, denoted as Vi1, can be derived by the SpMV operation: y = Ax. Repeat the operation from Vi1, we Besides, Wolf et al. [27] develop a GraphBLAS like can receive 1-hop vertexes from Vi1 or 2-hop vertexes from i. approach for triangle enumeration using an overloaded Finally, a complete BFS for graph G from vertex i is yielded SpGEMM of the graph adjacency matrix (A) and incidence [9]. matrix (H): C = A × H, which can also be used for triangle In contrast, Multi-Source BFS (MS-BFS) executes multiple counting. Although this method is useful for more complex independent BFSs concurrently on the same graph from mul- graph calculations, it is less efficient than the adjacent matrix- tiple source vertexes. It can be performed by SpGEMM. A based method. Furthermore, a variant of the adjacency matrix- simple example for MS-BFS is presented in Fig. 2 [10]. Let based triangle counting algorithm is given by Wolf et al. in {1, 2} be two source vertexes, A is the adjacency matrix of [7], which uses the lower triangle portion of the adjacency the graph, and X = (x1, x2) is a rectangular matrix repre- matrix to count triangles. senting the source vertexes, where x1 and x2 are two sparse column vectors with x1(1) = 1 and x2(2) = 1 respectively i i a a a and all other elements being zero. Let Bi = (b1, b2), then B1 = A × X. Obviously, B1 is a sparse rectangular matrix representing the 1-hop vertexes (denoted as V1) from source b d b d vertexes {1, 2}. Repeat the SpGEMM operation of adjacency matrix and V1, we can derive the 1-hop vertexes from V1, also 2-hop vertexes from {1, 2} by B2 = A × B1. Finally, we get c c c the breadth-first traversal results from vertices 1 and 2, which {1, 3, 4, 2, 5, 6} {2, 3, 4, 1, 5, 6} G Two triangles are and , respectively. Fig. 1: An example of triangle counting. Graph G includes two triangles: (a, b, c) and (a, d, c).
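Algorithm 3 maps almost line-for-line onto scipy.sparse, as sketched below. The degree-based row ordering used by Azad et al. only affects performance, not the count, so it is omitted here; the small graph at the bottom is an arbitrary example, not the graph of Fig. 1.

```python
import numpy as np
import scipy.sparse as sp

def count_triangles(A):
    """Triangle count of an undirected graph from its symmetric 0/1 adjacency matrix A."""
    L = sp.tril(A, k=-1, format="csr")   # strictly lower triangular part of A
    U = sp.triu(A, k=1, format="csr")    # strictly upper triangular part of A
    B = L @ U                            # SpGEMM: B[i, k] counts wedges i-j-k with j < i and j < k
    C = A.multiply(B)                    # element-wise product keeps only closed wedges
    return int(C.sum()) // 2             # each triangle is counted twice

# 5-vertex example containing the triangles (0, 1, 2) and (0, 2, 3)
edges = [(0, 1), (0, 2), (1, 2), (0, 3), (2, 3), (3, 4)]
rows = [i for i, j in edges] + [j for i, j in edges]
cols = [j for i, j in edges] + [i for i, j in edges]
A = sp.csr_matrix((np.ones(len(rows)), (rows, cols)), shape=(5, 5))
print(count_triangles(A))   # -> 2
```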

Collected from social media, sensor feeds, the World Wide Web, biological and genetic fields, co-author networks, and citations, increasing large amounts of data are being analyzed with graph analytics to reveal the complicated underlying relationship between different data entities [28] [31]. One of the fundamental techniques for graph analytics is subgraph finding, and the triangle is one of the most important sub- graphs. Moreover, the number of triangles is an important metric in many network analysis applications, including link recommendation [33], spam detection [34], social network analysis [34], and dense neighborhood graph discovery [35]. In the adjacency matrix-based triangle counting algorithm, the Fig. 2: An example of BFS. Vertexes filling in horizontal lines 2 SpGEMM operation B = L × U costs d n operations, where and vertical lines are being visited by the BFS starting from d is the average degree and n is the number of vertices [8]. vertex {1} and {2} respectively, and vertexes filling in left Considering that the computational complexity increases with oblique lines have been visited by the BFS starting from vertex the scale of the problem, the optimization of triangle counting {1} or {2} [10]. (thus SpGEMM) becomes extremely considerable. Many applications require hundreds or even billion of BFSs C. Multi-Source Breadth-first Search over the same graph. Examples of such applications include Breadth-first search (BFS) is a key and fundamental sub- calculating graph , enumerating the neighborhoods routine in many graph analysis algorithms, such as finding for all vertexes, and solving the all-pairs shortest distance problem. Comparing with running parallel and distributed 2) Probabilistic Method: The second method estimates the BFSs sequentially, executing multiple BFSs concurrently in sparse structure of the result matrix based on random sampling a single core allows us to share the computation between and probability analysis on the input matrices, which is known different BFSs without paying the synchronization cost [9]. as the probabilistic method. As the most expensive part of MS-BFS, SpGEMM is worthy Cohen [48] thinks that the problem of estimating the number of more attention to be paid. of non-zeros in result matrix can be converted to estimate In addition, SpGEMM is also one of the most important the size of reachability sets in a directed graph. Then he components for graph contraction [38], graph matching [39], presents a Monte Carlo-based algorithm that estimates the size graph coloring [13] [14], all pairs shortest path [12], sub-graph of reachability sets with high confidence and small relative [15] [16], cycle detection or counting [40], and molecular error. The non-zero structure not only can guide efficient dynamics [41] [42]. Because of limited space, we will not memory allocation for the output matrix but also can be introduce these applications and corresponding SpGEMM used to determine the optimal order of multiplications when formulation in detail. computing a chain product of three or more matrices, where the optimal order means the minimization of a total number IV. METHODOLOGY of operations. Therefore, Cohen [48] proposes a method that The development of multi/many-core architecture and par- estimates the non-zero structure of the result matrix in time allel programming has brought new opportunities for the linear based on the previous reachability-set size estimation acceleration of SpGEMM, and three challenging problems algorithm. 
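Cohen's estimator rests on reachability-set size estimation and does not fit in a short listing, but the flavor of probabilistic size prediction can be conveyed with a far simpler (and far cruder) row-sampling heuristic: compute a few rows of C exactly and extrapolate. The sketch below is explicitly not Cohen's algorithm and carries no accuracy guarantee; it only illustrates why sampling-style estimates are cheap compared with a full symbolic pass.

```python
import numpy as np
import scipy.sparse as sp

def sampled_nnz_estimate(A, B, sample_rows=32, seed=0):
    """Crude estimate of nnz(A @ B): multiply a random subset of rows of A by B
    and scale the observed non-zero count up to the full row dimension."""
    rng = np.random.default_rng(seed)
    p = A.shape[0]
    rows = rng.choice(p, size=min(sample_rows, p), replace=False)
    sampled_nnz = (A[rows, :] @ B).nnz          # exact nnz of the sampled rows of C
    return int(sampled_nnz * p / len(rows))

A = sp.random(2000, 2000, density=0.002, format="csr", random_state=1)
B = sp.random(2000, 2000, density=0.002, format="csr", random_state=2)
print(sampled_nnz_estimate(A, B), (A @ B).nnz)  # estimate vs. exact count
```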
For any constant error probability δ > 0 and any emerged need to be considered intensively [21], including size the tolerated relative error  > 0, it can compute a 1 ±  predicting of result matrix, matrix partitioning and load bal- approximation of z of result matrix in time O(n/2), where ancing, and intermediate result accumulating. In the following z denotes the number of non-zeros in the result matrix. Then, subsections, we will introduce related studies in these three Amossen et al. [49]√ improve this method to expected time challenges in detail. O(n) for  > 4/ 4 z. Pagh et al. also present a method to estimate column/row sizes of the result matrix based on A. Size Prediction the non-zeros of input matrices, and a detailed description is The first problem is the unknown number of non-zeros in the referred to the Lemma 1 in [50]. result matrix before real execution. The number of non-zeros This type of method does not require SpGEMM-like com- of a sparse matrix usually dominates its memory overhead, putation and then yields a low time cost for estimation. while in SpGEMM, the sparsity of the result matrix is always However, considering that this type of method usually gives unknown in advance. Therefore, precise memory allocation an approximate estimation, hence an extra memory allocation for the result matrix is impossible before real execution. The must be performed when the estimation fails. research focusing on the size prediction of the result matrix 3) Upper Bound Method: The third method computes an can be mainly classified as the following four approaches. upper bound of the number of non-zeros in the result matrix 1) Precise Method: The SpGEMM algorithms, which use and allocates corresponding memory space, which is known as precise method to solve size prediction problem, usually the upper bound method. The most commonly used method of consist of two phases: symbolic and numeric phase. In the calculating upper bound for C is to count the number of non- symbolic phase, the precise number of non-zeros in result zeros of the corresponding row in matrix B for each non- C is computed, while the real values are calculated entry in matrix A. Assume that input matrices A and B are in the numeric phase. compressed with CSR format, and I and J are the row pointer The implementations of SpGEMM in KokkosKernels [87], array and column index array, respectively. Then Algorithm 4 cuSPARSE [88], MKL [89] and RMerge [43] are typical rep- gives the pseudocode for computing the upper bound of the resentatives of this method. Besides, Demouth [44], Nagasaka number of non-zeros in each row of the result matrix, which et al. [45] and Demirci et al. [46] also adopt this approach. is stored in array U = {u0, u1, . . . , up−1}. In [47], Deveci et al. design a graph compression technique, Bell et al. [2] propose a classical ESC (Expansion, Sorting, similar to the color compression idea [14], to compress matrix Compression) method to accelerate AMG method, and it B by packing its columns as bits in order to speedup the allocates the memory space of the upper-bounded size for symbolic phase. result matrix, which is applied in CUSP [90]. Nagasaka et al. Accurate pre-computation of the number of non-zeros not [51] also count a maximum of scalar non-zero multiplications only saves the memory usage but also enables the sparse per row of the result matrix. 
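Counting this upper bound is a single pass over the CSR index arrays of A and B. The sketch below is equivalent in effect to Algorithm 4 (given next), with u[i] bounding the number of non-zeros in row i of C; the CSR fields are named indptr/indices rather than the I/J arrays of the listing.

```python
import numpy as np
import scipy.sparse as sp

def row_upper_bound(A, B):
    """u[i] = sum over the non-zeros a_ij of row i of A of nnz(row j of B),
    an upper bound on nnz(row i of C) for C = A @ B (both inputs in CSR)."""
    b_row_nnz = np.diff(B.indptr)                       # nnz of every row of B
    u = np.zeros(A.shape[0], dtype=np.int64)
    for i in range(A.shape[0]):                         # mirrors the two loops of Algorithm 4
        cols = A.indices[A.indptr[i]:A.indptr[i + 1]]
        u[i] = b_row_nnz[cols].sum()
    return u

A = sp.random(1000, 1000, density=0.005, format="csr", random_state=3)
B = sp.random(1000, 1000, density=0.005, format="csr", random_state=4)
print(row_upper_bound(A, B).sum(), (A @ B).nnz)         # total bound vs. exact nnz(C)
```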
Then each thread allocates the structure of C to be reused for different multiplies with the based on the maximum and reuses the hash table same structure of input matrices [47]. Moreover, it presents throughout the computation by re-initializing for each row. significant benefits in graph analytics, because most of them This type of method is efficient and straightforward to work only on the symbolic structure, no numeric phase [7]. implement, while it does not consider the merging of the However, the calculation of two phases also means that it values with the same index items in the calculation process, needs to iterate through the input matrices twice, indicating and usually leads to over-allocation. Especially for some the higher computation overhead than the following methods. particular matrices, the estimated size with the upper bound Algorithm 4: Pseudocode for computing the upper bound Result matrix C is obtained by summing the of the number of non-zeros in each row of the result matrix, of each column a∗i of A and corresponding row bi∗ of B,

and all sparse matrices are compressed with CSR format. i.e. q I ,J ,I X Input: A A B. C = a∗i ⊗ bi∗, (2) Output: U = {u0, u1, . . . , up−1}. i=1 1 for i = 0, p − 1 do where the operator ⊗ represents the outer product of 2 ui = 0; vectors. 3 for k = IA(i),IA(i + 1) − 1 do 2) Inner-product formulation [53] [55] [45] [56]. This 4 j = JA(k); formulation is based on the row-wise and column-wise 5 ui+ = IB(j + 1) − IB(j); partitioning of input matrices A and B, respectively. Result 6 end matrix C consists of the inner product of each row of 7 end matrix A and each column of matrix B, i.e. X cij = aik ∗ bkj, (3) method and precise size have large difference. k∈I(i,j) 4) Progressive Method: The fourth method, known as the where i = 1, 2, ..., p, j = 1, 2, ..., r, and I(i, j) denotes the progressive method, dynamically allocates memory as needed. set of indexes k such that both the entries aik and bkj are It first allocates memory of proper size and then starts matrix- non-zero. matrix multiplication. The reallocation of larger memory is 3) Row-by-row-product formulation [53] [3] [56] [46]. This required if the current memory is insufficient. formulation is based on the row-wise partitioning of input The implementation of SpGEMM in Matlab [52] is the matrices. Each row ci∗ of result matrix C is obtained by representative of this type of method. It first guesses the size summing the intermediate multiplication results of each of the result matrix and allocates a new memory space that non-zero entry aik of ai∗ and corresponding row bk∗of B. i.e. is larger by a constant factor (typically 1.5) than the current X space if more space is required at some point. Meanwhile, the ci∗ = aik ∗ bk∗, (4) columns which are already computed are required to be copied k∈Ii(A) into the new memory. where Ii(A) denotes the set of column indexes k of the If this type of method estimates successfully in first, then non-zero entries in the i-th row of A. it has no additional time overhead in SpGEMM. However, 4) Column-by-column-product formulation [3] [56]. This if the first estimation fails, this type of method is extremely formulation is based on the column-wise partitioning of inefficient, especially in GPU. Moreover, if the SpGEMM input matrices, which is similar to the row-by-row-product algorithm is implemented in thread-level parallel, and threads formulation. Each column c∗j of result matrix C is ob- are using their local storage, then it must take time to aggregate tained by summing the intermediate multiplication results all the independent storage into the result matrix. of each non-zero entry bkj of b∗j and corresponding Besides, the hybrid methods are also proposed to cope with column a∗k, i.e. the problem. Weifeng Liu and Brian Vinter propose a hybrid X method for result matrix pre-allocation [21]. They calculate c∗j = a∗k ∗ bkj, (5) the upper bound of the number of non-zeros in each row k∈Ij (B) of the result matrix and divide all rows into multiple groups where Ij(B) denotes the set of row indexes k of the non- according to the number of non-zeros. Then they allocate space zero entries in the j-th column of B. of the upper-bound size for the short rows and progressively 2) Block Partitioning: Four formulations of SpGEMM rep- allocate space for the long rows. resent four different 1D matrix partitioning schemes: column- B. Matrix Partition and Load Balancing by-row, row-by-column, row-by-row, and column-by-column. 
However, considering the irregular distribution of non-zero The second problem is matrix partitioning and load balanc- entries in sparse matrices, plain slice-based matrix parti- ing. With the development of multi/many-core architecture, tioning usually results in load unbalancing. Therefore, some reasonable partitioning of matrices and balanced assigning of researchers pay attention to load balancing for specific archi- tasks among multiple cores or nodes are critical for making tectures. full use of hardware resources. Weifeng Liu and Brian Vinter propose a method that 1) Formulations of SpGEMM: It is closely related between guarantees a balanced load through assigning rows with a matrix partitioning and SpGEMM formulations. There are four different number of non-zeros to the different number of bins formulations of SpGEMM can be devised. [21] for GPU platform. In order to balance the load and 1) Outer-product formulation [53] [42] [54] [3] [55] [56]. efficiently exploit GPU resources, Nagasaka et al. [45] divide This formulation is based on the column-wise and row- the rows into groups in the symbolic phase by the number of wise partitioning of input matrices A and B, respectively. intermediate products, which is the upper bound of the number of non-zeros in output matrix. In the numeric phase, they 3) Hypergraph Partitioning: Traditional block-based ma- divide the rows into groups by the number of non-zeros in each trix partitioning schemes seldom consider the communication row. Winter et al. [57] also design a balanced work assigning problem from the sparse patterns of input or output matri- scheme for GPU. Each thread block is assigned an equal ces. Akbudak et al. [42] propose an elementary hypergraph number of non-zeros of A while ignoring the row boundaries partitioning (HP) model to partition input matrices for outer- of a matrix. Deveci et al. [58] propose two chunking methods product parallel SpGEMM. It aims at minimizing communi- for KNL and GPU. The difference lies in partitioning A in the cation cost while maintaining a balanced computational load column-wise method for KNL, but for GPU partitioning A in by a predetermined maximum allowable imbalance ratio. The the 2D method because of its multi-level memory property. HP model is composed of three components. The first is the Deveci et al. [68] also utilize row-by-row matrix partitioning vertex for representing the outer product of each column of A but with different tasks assigning scheme (thread-sequential, with the corresponding row of B. The second is the vertex team-sequential, thread-parallel, and thread-flat-parallel) for for each non-zero entry of C to enable 2D nonzero-based high performance computing architectures. Chen et al. [59] partitioning of the output matrix. The third is the hyperedge design a three-level partitioning scheme to improve the load (or net) for each non-zero entry of C to encode the total balance of SpGEMM on Sunway TaihuLight supercomputer. volume of communication that occurred in intermediate result The first level represents that A is partitioned into NP sub- accumulating. Assuming that the partitioning of the input matrices, where NP is the number of core groups applied matrices A and B in the HP model is presented as in the computation. In the second level, B is partitioned into  r  NC sub-matrices, where NC is the number of computing B1 r processing element (CPE) cores applied in each core group.  B2  Aˆ = AQ = [Ac Ac ··· Ac ] and Bˆ = QB =   , (6) 1 2 K  . 
 Sub-A and sub-B are further partitioned into several sets in  .  the third level. r BK Kurt et al. [60] propose an adaptive task partition scheme across virtual warps of a thread block in GPU. For example, where Q is the induced by the partitioning, for a thread block of size 128 and a virtual warp of size 4, each the HP model for outer-product parallel SpGEMM includes thread block will preside over four rows of A simultaneously. two phases. In the first phase (also called as multiplication c Then 32(=128/4) threads are assigned cyclically to deal with phase), each processor Pk owns column stripe Ak and row r the corresponding entries of B for each row of A. They stripe Bk of the permuted matrices, and then finishes the c r also give a systematic exploration and analysis of upper communication-free local SpGEMM computations Ak × Bk. data movement requirements for SpGEMM for the first time. The second phase (also called as summation phase) reduces The work distribution scheme is similar to the row merging partial results yielded in the first phase to calculate the final method by GPU sub-warps in [43] [61]. Ballard et al. [3] value of each non-zero entry of C. In order to maintain a present a detailed theoretical prediction for communication balanced load over two phases, the two-constraint formulation cost for SpMM (sparse-dense matrix multiplication) operation is proposed. Each vertex is assigned two weights, which rep- within the Galerkin of smoothed aggregation resent the computational overhead associated with the vertex AMG methods. They also conclude that the row-by-row- for two phases, respectively. Moreover, an imbalance ratio is product algorithm is the best 1D method for the first two introduced to direct partitioning for load balancing. multiplications and that the outer-product algorithm is the best Ballard et al. [66], based on the HP model, present a fine- 1D method for the third multiplication. grained framework to prove sparsity-dependent communica- Van de Geijn et al. [62] propose a 2D block data de- tion lower bounds for both parallel and sequential SpGEMM. composition and mapping methods for dense matrix-matrix Further, they reduce identifying a communication-optimal al- multiplication based on the platform with distributed memory. gorithm for given input matrices to solve the HP problem. It is applied in SpGEMM by Buluc et al. [63] [64] [16]. Akbudak et al. [55] also apply the HP model to improve the Noted increasing hypersparse matrices after 2D block data outer-product parallel and inner-product parallel sparse matrix- decomposition, they propose a new format DCSC (doubly matrix multiplication (SpMM) algorithms for the Xeon Phi compressed sparse column) to exploit the hypersparsity of the architecture. The HP model allocates A-matrix columns and individual sub-matrices. Then, they developed two sequential B-matrix rows that contribute to the same C-matrix entries SpGEMM algorithms based on the outer-product and column- into the same parts as much as possible in the inner-product- by-column-product multiplications, respectively, which mainly parallel SpMM. In the outer-product-parallel SpMM, the HP improve the hypersparse sub-matrices multiplication after the model allocates A-matrix rows that require the same rows of 2D partitioning. Finally, they present a parallel efficient sparse B into the same parts as much as possible. The reason is that SUMMA [62]. Furthermore, Azad et al. [65] first implement a it enables the exploiting of temporal locality. 
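The hypergraph machinery above does not fit in a few lines, but the simpler 1D idea that many schemes in this subsection start from (balance a per-row work estimate rather than the row count) can be sketched as follows. This is a generic illustration, not the partitioner of any particular paper; flops_per_row could be, for example, the upper bound u[i] computed earlier.

```python
import numpy as np

def balanced_row_blocks(flops_per_row, num_parts):
    """Split rows 0..p-1 into num_parts contiguous blocks of roughly equal total work."""
    prefix = np.concatenate(([0], np.cumsum(flops_per_row)))   # prefix[i] = work of rows < i
    total = prefix[-1]
    targets = total * np.arange(1, num_parts) / num_parts      # ideal cumulative-work split points
    cuts = np.searchsorted(prefix, targets)                    # rows where the prefix crosses a target
    return np.concatenate(([0], cuts, [len(flops_per_row)]))   # block b = rows [bounds[b], bounds[b+1])

flops = np.array([5, 1, 0, 40, 3, 2, 30, 7, 1, 11])
print(balanced_row_blocks(flops, 4))    # -> [ 0  4  6  7 10]
```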
3D parallel SpGEMM algorithm, based on the work of Buluc Besides, Akbudak et al. [56] also apply the graph model et al. that exploits both the inter-node parallelism within a third and HP model on the outer-product, inner-product, and row- processor grid dimension and the multi-threading parallelism by-row-product parallel SpGEMM algorithms. Demirci et al. within the node. [46] adopt a similar bipartite graph model for matrices parti- tioning on the distributed-memory platform but complement 2) Sort-based: A sort-based merging is performed by sort- a 3-constraint edge weight assignment scheme to maintain ing the intermediate results according to the column indexes a balanced load among tablet servers. Selvitopi et al. [67] and then summing the values with the same column indexes. apply the graph model and hypergraph model for simultane- The difference between sort-based works often lies in the ous assigning of the map and reduce tasks to improve data different sort algorithms utilized, including radix sort, merge locality and maintain a balanced load among processors in sort, and sort. both map and reduce phases. Then, Considering the similarity 1) Radix sort based. Bell et al. [2] adopt the radix-sort between the map and reduce tasks in the MapReduce and the method (implemented in [84]) to sort the intermediate accumulation of intermediate results in SpGEMM, they apply results, and Dalton et al. [71] employ the B40C radix sort the obtained MapReduce partitioning scheme based on graph algorithm [72] which allows specifications in the number and hypergraph models to improve the data locality and load and location of the sorting bits to accelerate the sort balancing in SpGEMM. operation. 2) Merge sort based. In the GPU-accelerated SpGEMM C. Result Accumulating given by Gremse et al. [43], the sparse rows of B selected The third problem is the expensive intermediate result and weighted by corresponding non-zeros of A, are merged accumulating. In outer-product SpGEMM, result matrix C is in a hierarchical way similar to merge sort. At each level, obtained by summing the outer-product result of each row of the number of rows is reduced by half, but the length of A and the corresponding column of B, where the intermediate rows becomes larger due to merging. Winter et al. [57] outer-product result is usually a sparse matrix. In row-by-row- utilize the ESC method to sort intermediate results yielded product SpGEMM, the result matrix is obtained by summing in a chunk while ignoring row boundaries of a matrix. the multiplication result of each non-zero entry of A and the Simple sorting of the chunks is performed based on their corresponding row of B, where the intermediate multiplication global chunk order in the merge stage. Final, a prefix results are also sparse. The other two formulations are similar. scanning is utilized to merge rows shared between chunks So no matter which SpGEMM formulation is chosed, the to generate the final result. merging of sparse structures will be involved, and it is a 3) Heap sort based. Azad et al. [65] represent the interme- challenging problem. diate result as a list of triples, and each list of triples is sorted by the (j, i) pair, where the triple (i, j, val) 1) Accumulator: Here, we call the that holds stores the row index, column index, and value of a intermediate results as the accumulator, and let us take row- non-zero entry. 
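The expand-sort-compress pattern behind these sort-based accumulators can be written for a single row of C in a few lines of NumPy, as below. Real implementations perform the sort with tuned radix or merge kernels and process many rows in parallel; this sketch is only meant to show what "sort then compress" means.

```python
import numpy as np

def esc_row(cols, vals):
    """Expand-Sort-Compress for one row of C.
    cols/vals hold the expanded intermediate products a_ij * b_jk (duplicate columns allowed);
    returns the compressed (column, value) pairs of the result row."""
    order = np.argsort(cols, kind="stable")          # Sort by column index
    cols, vals = np.asarray(cols)[order], np.asarray(vals)[order]
    boundaries = np.flatnonzero(np.diff(cols)) + 1   # positions where a new column starts
    starts = np.concatenate(([0], boundaries))
    out_cols = cols[starts]                          # Compress: one entry per distinct column
    out_vals = np.add.reduceat(vals, starts)         # sum the duplicates within each segment
    return out_cols, out_vals

cols = [4, 1, 4, 7, 1, 1]
vals = [0.5, 1.0, 0.25, 2.0, -1.0, 3.0]
print(esc_row(cols, vals))   # columns [1, 4, 7], values [3.0, 0.75, 2.0]
```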
A k-way merge on k lists of triples by-row-product SpGEMM as an example to introduce the T ,T , ..., T is performed by maintaining a heap of size accumulator (because most of SpGEMM research is based on 1 2 k k, which stores the current lexicographically minimum the row-by-row-product formulation). There are two types of entry in each list of triples. Then a multi-way merge accumulators at present: dense and sparse accumulators. routine finds the minimum triple from the heap and In dense accumulator, intermediate results for a row are merges it into the current result. Nagasaka et al. [51] stored in an array of size r in dense format, where r is implement a heap sort-based sparse accumulator adopting the column dimension of result matrix. It does not require the one-phase SpGEMM in KNL and multi-core archi- extra time cost for hashing, sorting or merging operations, tecture. In merge-based sparse accumulator (similar to while it may not be suitable to parallelization with large heap sort-based accumulator) proposed by Junhong et al. threads and large r values. Deveci et al. [68] [7] utilize dense [70], each thread stores multiple column indexes, and accumulator for CPU or KNL instead of GPU because of its then performs an intra-thread-min operation to ob- high memory requirements. Patwary et al. [69] utilize dense tain the local minimum column index inta-min. Mean- accumulator with a column-wise partitioning of B to reduce while, the intermediate result value for each intra-min memory overhead. Elliott et al. [25] note that prolongation is fetched by each thread. Next, each thread performs matrices (P ) in the AMG method are usually tall and skinny, inter-thread-min(intra-min) operation to obtain the so they use a local dense lookup array for each thread and minimum column index min among the threads at the allocate memory space fast for result matrix by page-aligned same warp. Final, the inter-sum(value(min)) is utilized and size-tracked technologies. to obtain the reduction sum of multiple threads. In row-by-row-product SpGEMM, each row of result matrix C can be expressed as c = P a ∗ b , which is Besides, Junhong et al. [70] propose a register-aware sort- i∗ k∈Ii(A) ik k∗ also known as sparse accumulator [70]. In sparse accumulator, based sparse accumulator and a vectorized method for parallel intermediate results for a row are stored in the sparse formats segmented sum [73], where the sparse accumulator exploits the like COO and CSR. Therefore, the insertion operations at N-to-M pattern [74] to sort the data in the register. It is for random positions hold expensive time overhead in SpGEMM the first time that the two methods are implemented for GPU with a sparse accumulator. There are two types of methods registers. for intermediate result merging: sort-based and hash-based 3) Hash-based: The hash-based sparse accumulator first methods. allocates a memory space based on the upper bound estimation as the hash table and uses the column indexes of the inter- A. Multi-core and Many-core Platforms mediate results as the key. Then the hash table is required to be shrunk to a dense state after hash functions. At last, The research on multi-core and many-core platforms mainly sorting the values of each row of the result matrix according focuses on two problems: memory management and load to their column indexes to obtain the final result matrix balance. compressed with sparse format [70]. Result accumulation in 1) CPU-based Optimization: Memory management. 
To cuSPARSE [44] [88] is based on the hash-based sparse accu- exploit data locality, Patwary et al. optimize the traditional mulator. Deveci et al. [75] [47] [58] design a sparse, two-level, sequential SpGEMM algorithm by using dense arrays in [69]. hashmap-based accumulator which supports parallel insertions In order to reduce deallocation cost in SpGEMM, Nagasaka or accumulations from multiple vector lanes. It is stored in a et al. compute the amount of memory required by each thread structure and consists of 4 parallel arrays, column and allocate this amount of thread-private memory in each indexes, and their numerical values in the numeric phase, as thread independently on the Intel KNL processor [51]. They well as beginning indexes of the linked list corresponding to also develop a hash table-based SpGEMM algorithm targeting the hash values and the indexes of the next elements within shared-memory multi-core and many-core processors. For ar- the liked list. In [68], they implement linked list-based and chitectures with reasonably large memory and modest thread linear probing hashmap accumulators utilizing different data counts, Elliott et al. present a single-pass OpenMP variant structures. Junhong et al. [70] optimize the data allocations of Gustavsons SpGEMM algorithm in [25]. They partition of each thread in a hash-based accumulator utilizing GPU the rows of the right-hand side matrix and then perform registers. Besides, Yaar et al. [26] utilize hashmap-based ac- Gustavson algorithm locally using thread-local storage for cumulators in triangle counting. Weber et al. [41] use the hash data and column arrays while constructing the row array in table to store each row of matrix C based on the distributed place. To reduce the overhead of memory allocation, they blocked compress sparse row (BCSR) format. Nagasaka et al. use page-aligned allocations and track used size and capacity [45] utilize the hash table for counting the number of non- separately. To implement efficient SpGEMM for many large- zeros of each row in the symbolic phase and for calculation of scale applications, Chen et al. propose optimized SpGEMM values of result matrix in the numeric phase. In the hash-based kernels based on COO, CSR, ELL, and CSC formats on the sparse accumulator given by [45], Nagasaka et al. utilize the Sunway TaihuLight supercomputer in [59]. In this paper, they vector registers in KNL and multicore architectures for hash use the DMA transmission to access data for CPE core in probing to accelerate the hash-based SpGEMM. the main memory. Beyond doubt, continuous data transfer Besides, Weifeng Liu et al. [21] propose a hybrid parallel using DMA performs better than discrete memory accesses. result merging method. Heapsort-like, bitonic ESC and merge To study the relationship between different multi-level memory methods are employed respectively according to the upper architectures and the performance of SpGEMM , Deveci bound of the number of non-zeros in each row. et al. design different chunking-based SpGEMM algorithms for Intel KNL where the memory subsystems have similar 4) Discussion: Sort-based join and hash-based join have latency [58]. They find that the performance difference of been two popular join algorithms, and the debate over which subsystems decreases with the better temporal and spatial is the best joint algorithm has been going on for decades. locality of SpGEMM. Considering that the accesses to B can Kim et al. 
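For contrast with the sort-based sketch earlier, a hash-style sparse accumulator for one row of C can be illustrated with an ordinary Python dictionary standing in for the open-addressing or linked-list hash tables described above; the actual KNL and GPU implementations differ substantially in how they size, probe, and reuse the table.

```python
def hash_accumulate_row(i, A_indptr, A_indices, A_data,
                        B_indptr, B_indices, B_data):
    """Row i of C = A @ B using a hash map keyed by column index (A, B given as CSR arrays)."""
    table = {}                                           # column index -> partial sum
    for t in range(A_indptr[i], A_indptr[i + 1]):        # non-zero a_ij in row i of A
        j, a_ij = A_indices[t], A_data[t]
        for s in range(B_indptr[j], B_indptr[j + 1]):    # non-zero b_jk in row j of B
            k = B_indices[s]
            table[k] = table.get(k, 0.0) + a_ij * B_data[s]
    return sorted(table.items())                         # shrink to a sorted sparse row

# tiny CSR example: A = [[1, 0, 2], [0, 3, 0]], B = [[0, 1, 0], [4, 0, 0], [5, 0, 6]]
A_indptr, A_indices, A_data = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]
B_indptr, B_indices, B_data = [0, 1, 2, 4], [1, 0, 0, 2], [1.0, 4.0, 5.0, 6.0]
print(hash_accumulate_row(0, A_indptr, A_indices, A_data, B_indptr, B_indices, B_data))
# row 0 of C: [(0, 10.0), (1, 1.0), (2, 12.0)]
```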
[76] claim that sort-based join will be faster than be irregular depending on the structure of A, a data placement hash-based join when SIMD wide enough through a series of method of storing matrix B as much as possible for KNL is experiments. Albutiu et al. [?] devise a new massively parallel proposed. In [68], Deveci et al. provide heuristics to choose the sort-merge (MPSM) join algorithm, and the experimental best kernel parameters according to problem characteristics, result shows that the MPSM algorithm outperforms competing and present their results for the selection of partitioning hash join. However, the experiments performed by Balkesen schemes and data structures with trade-offs between memory et al. [77] show that the hash-based join is still superior, and access and computational cost. Before the kernel call and sort-based joint algorithms outperform hash-based join only service requests from thousands of threads, a memory pool when vast amounts of data are involved. Therefore, there is is allocated and initialized beforehand. A chunk of memory is no consensus on which join method is the best. returned to the request thread and it will not be released until the thread puts the chunk back to the pool. The parameters of V. ARCHITECTURE ORIENTED OPTIMIZATION the memory pool are architecture-specific and the number of chunks is chosen based on the available concurrency of the Different computer architectures usually have different com- target device. puting and storage properties, so the optimization of SpGEMM Load balance. In [69], matrices are divided into small par- is closely related to the architecture used. In following sub- titions and dynamic load balancing is used over the partitions. sections, we will introduce the related research of optimizing Experiments show that reducing the size of each partition leads SpGEMM based on the multi/many-core platforms, FPGA to better load balance on a system with two Intel Xeon E5- (Field-Programmable Gate Array), heterogeneous computing 2697 V3 processors. They find that better results are achieved platform, and distributed platform respectively. when the total number of partitions is 6-10 times the number of threads. To reduce the overhead of dynamic scheduling, allocate 38 bins and put them into five bin groups. Then, the static scheduling is used in the SpGEMM algorithm on the all rows are assigned to corresponding bins according to the Intel KNL processor [51]. A three-level partitioning scheme is number of the non-zeros. Finally, they allocate a temporary presented to achieve load balancing based on the architecture matrix for the non-zero entries in the result matrix C based on of the Sunway [59]. For the form of C = A × B, on the the size of each bin. Because the rows in the first two bins only first level, the left sparse matrix A is partitioned into multiple require trivial operations, these two bins are excluded from sub-A. Each core group in the SW26010 processor performs subsequent more complex computation on the GPUs. Thus a the multiplication of a sub-A and B. On the second level, B better load balancing can be expected. In [57], Winter et al. is partitioned into multiple sub-B. Each CPE core in a core split the non-zeros of matrix A uniformly, assigning the same group multiplies the sub-A by a sub-B. On the third level, to number of non-zeros to each thread block. 
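The uniform nonzero-splitting scheme attributed to Winter et al. (equal non-zeros per thread block, ignoring row boundaries) reduces, on the host side, to integer arithmetic plus a binary search over the CSR row pointer, which is also how the auxiliary row-lookup information mentioned just below can be produced. The sketch is illustrative, not their implementation.

```python
import numpy as np

def split_nonzeros(indptr, num_blocks):
    """Give each block an equal share of A's non-zeros, ignoring row boundaries.
    Returns (nnz_ranges, first_row), where first_row[b] is the row containing the
    first non-zero assigned to block b."""
    nnz = indptr[-1]
    starts = (np.arange(num_blocks + 1) * nnz) // num_blocks         # nnz index range per block
    first_row = np.searchsorted(indptr, starts[:-1], side="right") - 1
    ranges = [(int(a), int(b)) for a, b in zip(starts[:-1], starts[1:])]
    return ranges, first_row

indptr = np.array([0, 3, 3, 8, 9, 14])      # 5 rows, 14 non-zeros, row 1 empty
ranges, first_row = split_nonzeros(indptr, 4)
print(ranges)       # [(0, 3), (3, 7), (7, 10), (10, 14)]
print(first_row)    # [0 2 2 4]
```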
Though memory fit the 64-KB local data memory in each CPE core, sub-A and requirements from A are static, the actual workload for each sub-B are further partitioned into several sets. block varies based on the generated intermediate products. A 2) GPU-based Optimization: Memory management. Liu static splitting of A0s non-zeros does not require processing, et al. summarize the extra irregularity that an efficient parallel but the second stage still needs to associate each entry from SpGEMM algorithm has to handle in [21]. One of them is A with a row. Therefore, the global load balancing in parallel that the number of non-zeros in the result sparse matrix is checks A0s row pointers and writes the required information unknown in advance. In their paper, memory pre-allocation into an auxiliary array. The cost of this step is negligible for the result matrix is organized using a hybrid approach compared to enumerating temporary products. that saves a large amount of global memory space. Besides, the very limited on-chip scratchpad memory is efficiently B. Field-Programmable Gate Array (FPGA) utilized. To make full use of the high-speed registers and on Related research of SpGEMM on FPGA is quite rare. Lin GPU, Liu et al. design different register-aware SpGEMM algo- et al. are the first to address SpGEMM on FPGAs. In [82], rithms for different sparse accumulators respectively in [70]. the authors design and implement the SpGEMM algorithm These sparse accumulators include sort-based, merge-based, on FPGAs. They study the performance of their design in and hash-based, where merge-based is similar to the merge terms of computational latency, as well as the associated sort-based accumulator mentioned in Section IV. They fully power-delay and energy-delay trade off. Taking advantage of utilize the GPU registers to fetch data, finish computations, the sparsity of the input matrices, their design allows user- and store results out. In [57], Winter et al. propose an adaptive tunable power-delay and energy-delay trade off by employing chunk-based GPU SpGEMM approach (AC-SpGEMM) to cut the different numbers of processing elements in architecture down the computational overhead throughout all stages. They design and different block size in blocking decomposition. perform chunk-based ESC (explicitly expand all temporary Such ability allows designers to employ different on-chip products, sort them and combine them to yield the final resources for different system power-delay and energy-delay matrix) to produce deterministic bit-stable results. With the requirements. Jamro et al. think that indices comparisons help of the local task distribution, this stage performs multiple dominate floating-point operations. Therefore they propose a iterations of ESC, fully utilizing local memory resources highly parallel architecture, which can carry out at least 8×8 while keeping temporary results in scratchpad memory. To indices comparison in a single clock cycle in [83]. achieve the performance portability on massively threaded architectures, Deveci et al. design a hierarchical and memory- C. Heterogeneous Computing Platform efficient SpGEMM algorithm in [47] by introducing two thread Researchers mainly focus on load balancing problems on scalable data structures, a multilevel hashmap and a memory heterogeneous platforms. For heterogeneous systems com- pool. Since the memory subsystems of NVIDIA GPU differ posing CPUs and GPUs, Siegel et al. 
develop a task-based significantly in both bandwidth and latency, two different data programming model and a runtime-supported execution model placements methods are proposed in [58]. One is to keep the to achieve dynamic load balancing in [80]. They partition partitions of A and C in fast memory and stream the partitions the SpGEMM problem in blocks in the form of tasks to of B to fast memory. Another is to keep the partitions of address the unbalancing introduced by the sparse matrix and B in fast memory and stream the partitions of A and C to the heterogeneous platform. For a CPU-GPU heterogeneous fast memory. Which method to be chosen often depends on computing system, Matam et al. propose several heuristic the input matrix properties. To select proper parameters of methods to identify the proper work division of a subclass the memory pool on GPU, Deveci et al. over-estimate the of matrices between CPU and GPU in [53]. Liu et al. propose concurrency to access memory efficiently. They check the a load-balancing method based on output space decomposition available memory and reduce the number of chunks if the in [17]. They assign rows to a fixed number of bins through memory allocation becomes too expensive on GPUs [68]. a much faster linear-time traverse on the CPU. Moreover, Load balancing. To balance the workload of SpGEMM on they decompose the output space in a more detailed way that GPU, a heuristic method is presented to assign the rows of guarantees much more efficient load balancing. Their method the result matrix to multiple bins with different subsequent is always load balanced in all stages for maximizing resource computational methods [21]. In this paper, Liu et al. first utilization of GPUs. Besides, in [81], Rubensson et al. present a method for parallel block SpGEMM on distributed memory the balancing of computations leads to faster task execution. clusters based on their previously proposed Chunks and Tasks They show the validity of their scheduling on the MapReduce model [54]. They use a quad- to parallelization of SpGEMM. Their model leads to tremendous exploit data locality without prior information about the matrix savings in data transfer and consequently achieves up to 4.2x sparsity pattern. The quad-tree representation, combined with speedup for SpGEMM compared with random scheduling. the Chunks and Tasks model, leads to favorable weak and In [46], Demirci et al. design a distributed SpGEMM strong scaling of the communication cost with the number algorithm on Accumolo, which is a highly scalable, distributed of processes. Matrices are represented as sparse quad-trees key-value store built on the design of Googles Bigtable. The of chunk objects, and the leaves in the hierarchy are block- basic idea of this work is that reducing the values for a key sparse sub-matrices. Sparsity is dynamically detected and may locally before they are sent to a remote node. They use a tablet occur at any level in the hierarchy and/or within the sub-matrix server to iterate through multiple rows in sub-A and create leaves. In case GPU is available, this design uses both CPUs batches of rows to be processed together before performing and GPUs for leaf-level multiplication, thus making full use any computation or a batch-scan operation. Batch processing of the computing capacity of each node. of rows significantly reduces the latency and communication volume among servers, because both the number of lookup D. 
Distributed Platform operations and the redundant retrieval of rows of B are The research of SpGEMM on the cluster system mainly reduced. focuses on the minimizing of inter-node communication. In VI.PERFORMANCE EVALUATION [78], Jin et al. conduct a series of experiments to study the data communication issue of SpGEMM on the PC cluster. They A. Overview find that due to communication overheads, it is difficult for The performance of the SpGEMM kernel is dominated by clusters to exploit low-level parallelism. Thus, to successfully several factors, including sparse format, matrix sparsity, and achieve computation parallel, it is critical to exploit high-level computational device. It is extremely challenging to figure out parallelism sufficiently. Increasing the degree of high-level the exact relationship between the SpGEMM kernel and these parallelism makes easier the task of balancing the workload. factors. Dynamic scheduling algorithms always have less unbalanced In this section, we conduct a series of experiments to overhead than static scheduling algorithms, no matter uni- compare the SpGEMM performance of several state-of-art casting or multicasting is used for data communication. In approaches, and the non-zeros of sparse matrices are cached [42], for outer-product-parallel SpGEMM, Akbudak et al. and involved in calculation in double . Our tested propose three hypergraph models that achieve simultaneous libraries include Intel (MKL) [89], partitioning of input and output matrices without any repli- NVIDIA CUDA Sparse Library (cuSPARSE) [88], CSPARSE cation of input data. All three hypergraph models perform [91], CUSP [90], bhSPARSE [17] [21] and KokkosKernels conformable 1D column-wise and row-wise partitioning of the [87]. input matrices A and B, respectively. The first hypergraph Intel MKL is a highly-parallel closed-source library for model performs two-dimensional nonzero-based partitioning Intel processors. One of its sub-library, named spblas, includes of the output matrix, whereas the second and third models standard C and Fortran API for sparse linear algebra. It perform 1D row-wise and column-wise partitioning of the uses a highly encapsulated internal data structure to perform output matrix, respectively. This partitioning scheme leads to a operations upon sparse matrices. We call mkl sparse sp2m two-phase parallel SpGEMM algorithm: communication-free to perform SpGEMM, and both input matrices are encoded local SpGEMM computations and the multiple single-node- with either CSR or BSR. This interface supports two-stage accumulation operations on the local SpGEMM results. execution. The row pointer array of the output matrix is In [79], Bethune et al. study the performance of SpGEMM calculated in the first stage. In the second stage, the remaining using the open-source sparse linear algebra library DBCSR column indexes and value arrays of the output matrix are (Distributed Block Compressed Sparse Row) on different calculated. At the same time, We record the execution time distributed systems. DBCSR matrices are stored in a blocked of these two stages. compressed sparse row (CSR) format distributed over a two- cuSPARSE is a closed-source library provided by NVIDIA. dimensional grid of P MPI processes. Inter-process com- It contains a set of basic linear algebra subroutines used for munication is based on the communication-reducing 2.5D handling sparse matrices. In our test program, input matri- algorithm. 
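The CPU-side experiments in this section amount to timing one SpGEMM call on matrices read from disk. For reference, the sketch below shows a harness of that shape using scipy's built-in SpGEMM as a stand-in for the library under test; scipy is not one of the evaluated libraries, and the file path is a placeholder.

```python
import time
import scipy.io
import scipy.sparse as sp

def time_spgemm(path, repeats=5):
    """Read a MatrixMarket file and time C = A * A (squaring A is a common SpGEMM benchmark)."""
    A = sp.csr_matrix(scipy.io.mmread(path))      # COO from disk -> CSR, as the evaluated libraries do
    best, nnz_C = float("inf"), 0
    for _ in range(repeats):
        t0 = time.perf_counter()
        C = A @ A                                  # the SpGEMM call under test
        best = min(best, time.perf_counter() - t0)
        nnz_C = C.nnz
    return best, nnz_C

if __name__ == "__main__":
    # "matrix.mtx" is a placeholder path, e.g. a matrix downloaded from SuiteSparse beforehand
    seconds, nnz_C = time_spgemm("matrix.mtx")
    print(f"best of 5: {seconds:.4f} s, nnz(C) = {nnz_C}")
```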
VI. PERFORMANCE EVALUATION

A. Overview

The performance of the SpGEMM kernel is dominated by several factors, including the sparse format, the matrix sparsity, and the computational device. It is extremely challenging to figure out the exact relationship between the SpGEMM kernel and these factors.

In this section, we conduct a series of experiments to compare the SpGEMM performance of several state-of-the-art approaches; the non-zeros of the sparse matrices are cached and involved in the calculation in double precision. The tested libraries include the Intel Math Kernel Library (MKL) [89], the NVIDIA CUDA Sparse Library (cuSPARSE) [88], CSPARSE [91], CUSP [90], bhSPARSE [17] [21], and KokkosKernels [87].

Intel MKL is a highly parallel closed-source library for Intel processors. One of its sub-libraries, named spblas, includes standard C and Fortran APIs for sparse linear algebra. It uses a highly encapsulated internal data structure to perform operations on sparse matrices. We call mkl_sparse_sp2m to perform SpGEMM, and both input matrices are encoded with either CSR or BSR. This interface supports two-stage execution: the row pointer array of the output matrix is calculated in the first stage, and the remaining column index and value arrays of the output matrix are calculated in the second stage. We record the execution time of these two stages.

cuSPARSE is a closed-source library provided by NVIDIA. It contains a set of basic linear algebra subroutines for handling sparse matrices. In our test program, the input matrices encoded with the COO format are first converted into the CSR format by calling cusparseXcoo2csr; then we call cusparseXcsrgemmNnz and cusparseDcsrgemm in turn to complete a two-stage SpGEMM, which is similar to MKL. The execution time of each step is also recorded.

CSPARSE is an open-source, single-threaded sparse matrix package written in C. It uses the CSC format to store sparse matrices. We call cs_multiply to calculate the product of two sparse matrices and also record the execution time of this procedure.
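The two-stage mkl_sparse_sp2m pattern described above looks roughly as follows. This is a sketch under the assumption that A and B have already been created as CSR handles; the request-stage enumerators are quoted from memory and should be checked against the MKL documentation.

```cpp
// Sketch of the two-stage mkl_sparse_sp2m call pattern (inspector-executor
// Sparse BLAS). Stage names are assumptions; verify with the MKL reference.
#include <mkl_spblas.h>

// A and B are CSR handles previously created with mkl_sparse_d_create_csr.
sparse_matrix_t spgemm_two_stage(sparse_matrix_t A, sparse_matrix_t B) {
  matrix_descr general;
  general.type = SPARSE_MATRIX_TYPE_GENERAL;

  sparse_matrix_t C = nullptr;
  // Stage 1 (symbolic): computes only the row pointer array of C.
  mkl_sparse_sp2m(SPARSE_OPERATION_NON_TRANSPOSE, general, A,
                  SPARSE_OPERATION_NON_TRANSPOSE, general, B,
                  SPARSE_STAGE_NNZ_COUNT, &C);
  // Stage 2 (numeric): fills the remaining column index and value arrays.
  mkl_sparse_sp2m(SPARSE_OPERATION_NON_TRANSPOSE, general, A,
                  SPARSE_OPERATION_NON_TRANSPOSE, general, B,
                  SPARSE_STAGE_FINALIZE_MULT, &C);
  return C;  // export with mkl_sparse_d_export_csr, release with mkl_sparse_destroy
}
```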

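For cuSPARSE, the two calls named above belong to the legacy csrgemm path (available up to CUDA 10.x and removed from later toolkits in favour of cusparseSpGEMM); the sketch below shows how the size-prediction and numeric stages fit together. It is illustrative only: all array arguments are assumed to be valid device pointers and error handling is omitted.

```cpp
// Sketch of the legacy cuSPARSE two-stage csrgemm path (CUDA 10.x era).
#include <cuda_runtime.h>
#include <cusparse.h>

// C (m x n) = A (m x k) * B (k x n), all in CSR with 0-based indices.
void csr_spgemm_legacy(cusparseHandle_t handle, int m, int n, int k,
                       const cusparseMatDescr_t descrA, int nnzA,
                       const double* valA, const int* rowPtrA, const int* colIndA,
                       const cusparseMatDescr_t descrB, int nnzB,
                       const double* valB, const int* rowPtrB, const int* colIndB,
                       const cusparseMatDescr_t descrC,
                       double** valC, int** rowPtrC, int** colIndC, int* nnzC) {
  cusparseSetPointerMode(handle, CUSPARSE_POINTER_MODE_HOST);
  cudaMalloc((void**)rowPtrC, sizeof(int) * (m + 1));

  // Stage 1: size prediction -- computes rowPtrC and the total nnz of C.
  cusparseXcsrgemmNnz(handle, CUSPARSE_OPERATION_NON_TRANSPOSE,
                      CUSPARSE_OPERATION_NON_TRANSPOSE, m, n, k,
                      descrA, nnzA, rowPtrA, colIndA,
                      descrB, nnzB, rowPtrB, colIndB,
                      descrC, *rowPtrC, nnzC);

  cudaMalloc((void**)colIndC, sizeof(int) * (*nnzC));
  cudaMalloc((void**)valC, sizeof(double) * (*nnzC));

  // Stage 2: numeric multiplication -- fills colIndC and valC.
  cusparseDcsrgemm(handle, CUSPARSE_OPERATION_NON_TRANSPOSE,
                   CUSPARSE_OPERATION_NON_TRANSPOSE, m, n, k,
                   descrA, nnzA, valA, rowPtrA, colIndA,
                   descrB, nnzB, valB, rowPtrB, colIndB,
                   descrC, *valC, *rowPtrC, *colIndC);
}
```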
TABLE II: Detailed information of the evaluated libraries.

Library        | Version  | Compiler       | Open source | Size prediction | Accumulator                 | Supported format
MKL            | 2019     | icc 19.0.3.199 | No          | precise         | -                           | CSC/CSR/BSR
cuSPARSE       | 10.0.130 | nvcc 10.0.130  | No          | precise         | hash-based                  | CSR
CSPARSE        | 3.1.4    | icc 19.0.3.199 | Yes         | progressive     | dense                       | CSC
CUSP           | 0.5.1    | nvcc 8.0.61    | Yes         | upper bound     | sort-based                  | COO/CSR
bhSPARSE       | Latest   | nvcc 8.0.61    | Yes         | hybrid          | hybrid                      | CSR
KokkosKernels  | 2.8.00   | nvcc 10.0.130  | Yes         | precise         | CPU: dense; GPU: hash-based | CSR

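As a usage illustration of the CSPARSE interface described above, the following small program builds a matrix in triplet form, compresses it to CSC, and multiplies it by itself with cs_multiply. The toy 2x2 data is ours and is not part of the benchmark; cs.h is the plain C header of CSparse (SuiteSparse).

```cpp
// Minimal CSPARSE (CSparse) usage sketch: triplet -> CSC -> C = A * A.
extern "C" {
#include "cs.h"
}

int main() {
  // A = [1 2; 0 3] entered as (row, col, value) triplets.
  cs* T = cs_spalloc(2, 2, 4, 1, 1);   // m, n, nzmax, values, triplet
  cs_entry(T, 0, 0, 1.0);
  cs_entry(T, 0, 1, 2.0);
  cs_entry(T, 1, 1, 3.0);
  cs* A = cs_compress(T);              // triplet -> compressed-column (CSC)
  cs_spfree(T);

  cs* C = cs_multiply(A, A);           // C = A * A
  cs_print(C, 0);                      // brief = 0 prints the full matrix

  cs_spfree(A);
  cs_spfree(C);
  return 0;
}
```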
CUSP is an open-source library for sparse basic linear algebra and graph computations based on Thrust and written in C++. It provides a flexible API for users and is highly encapsulated. We call multiply to calculate the product of two sparse matrices encoded with the COO format and record the execution time of this call.

bhSPARSE is an open-source library that provides sparse linear algebra interfaces for sparse matrix computations on heterogeneous parallel processors. A SpGEMM implementation based on the CSR format is included in this library, and it is claimed to be efficient and general for the multiplication of irregular sparse matrices. It is composed of four stages: calculating an upper bound, binning, computing the result matrix, and arranging data. A detailed introduction can be found in paper [21].

KokkosKernels implements local computational kernels for linear algebra and graph operations using the Kokkos shared-memory parallel programming model [85]. It also supports two-stage execution and records the execution times of the symbolic stage and the numeric stage, respectively. Moreover, we test its performance on the Intel multi-core platform and the NVIDIA GPU platform, respectively.

B. System Setup

The computer used for the experiments runs a 64-bit Ubuntu operating system. The host memory is 16GB. The CPU of this machine is an Intel Core i7-4790K with a 4GHz clock frequency. The GPU is an NVIDIA GeForce GTX 1080Ti, which is based on the Pascal architecture and is equipped with 3584 CUDA cores and 11GB of device memory. The versions, compilers, open-source information, size prediction methods, intermediate result accumulators, and supported formats of the evaluated libraries are set out in Table II.

C. Benchmark

The sparse matrices used in our experiments are from the SuiteSparse Matrix Collection (formerly the University of Florida Sparse Matrix Collection) [86]. We choose square matrices to perform the SpGEMM kernel C = A × A. The corresponding dataset is shown in Fig. 3. The dimensions of the matrices range from 14K to 1M, and the numbers of non-zeros range from 0.4M to 11.5M. These matrices come from many applications, including the finite element method (FEM), quantum chromodynamics (QCD), protein data, electromagnetics, circuit simulation, directed graphs, structural problems, and web connectivity. Besides, it can be seen from Fig. 3 that the matrices on the left have a regular distribution, with most of their non-zero elements distributed near the main diagonal, while the sparse matrices on the right have an irregular non-zero distribution.

Fig. 3: The detailed information of the dataset.
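Returning to the CUSP interface described above, its SpGEMM call is a single templated function; the sketch below multiplies a COO matrix by itself on the device. The 5-point Poisson generator is used only to have self-contained input data and is not part of our benchmark.

```cpp
// Sketch of the CUSP call path: C = A * A with device COO matrices (nvcc).
#include <cusp/coo_matrix.h>
#include <cusp/multiply.h>
#include <cusp/gallery/poisson.h>

int main() {
  cusp::coo_matrix<int, double, cusp::device_memory> A, C;
  cusp::gallery::poisson5pt(A, 64, 64);  // sparse matrix of a 64x64 grid
  cusp::multiply(A, A, C);               // C = A * A
  return 0;
}
```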
D. Evaluation Result

We conduct three groups of experiments on the benchmark to evaluate the performance of the aforementioned libraries.

1) Sparse Storage Format: Although a variety of storage formats have been proposed for sparse matrices, the existing SpGEMM implementations involve only a few of them. CSR is the most commonly used sparse format, and it is used in the SpGEMM implementations of CUSP, MKL, cuSPARSE, bhSPARSE, and KokkosKernels. Besides, MKL supports the multiplication of two sparse matrices encoded with the BSR format, and CSPARSE only supports the CSC format.

In this experiment, we compare the SpGEMM performance in MKL, where the involved sparse matrices are encoded with the CSR or BSR format. The SpGEMM kernel in MKL is composed of four phases: format conversion (COO to CSR/BSR), stage1 (the first stage mentioned earlier), stage2 (the second stage mentioned earlier), and format export (exporting the CSR/BSR matrix from the internal representation). Fig. 4 and Fig. 5 provide the proportion of execution time of each phase, where all involved sparse matrices are encoded with the CSR and BSR formats, respectively.

Fig. 4: The proportion of execution time of each phase. The involved matrices are encoded with the CSR format. Format conversion, stage1, stage2 and format export represent the format conversion from COO to CSR, the calculation of the row pointer array of the output matrix, the calculation of the remaining column index and value arrays of the output matrix, and the format export from the internal representation to the CSR format, respectively.

It can be seen from Fig. 4 that when the involved matrices are stored in the CSR format, the execution time of the format conversion takes up a small proportion for most sparse matrices. The execution time of the last three phases varies from matrix to matrix. Both stage1 and stage2 account for a larger proportion than the other phases, while the execution time of the format export phase varies greatly. From Fig. 5, we can see that the execution time of the stage1 phase takes up a small ratio for most sparse matrices, while the other three phases account for a larger proportion.

Fig. 5: The proportion of execution time of each phase. The involved matrices are encoded with the BSR format. Format conversion, stage1, stage2 and format export represent the format conversion from COO to BSR, the calculation of the row pointer array of the output matrix, the calculation of the remaining column index and value arrays of the output matrix, and the format export from the internal representation to the BSR format, respectively.

A comparison of the execution time of each phase based on different sparse formats is set out in Fig. 6. From Fig. 6(a) we can see that it takes a longer time to convert a matrix from COO to BSR than to CSR for every sparse matrix, and the time difference is huge for some sparse matrices. As can be seen from Fig. 6(b), stage1 based on the CSR format, which computes the row pointer array of the output matrix, takes a longer time than that based on the BSR format on most sparse matrices. From Fig. 6(c) we can see that the SpGEMM based on the BSR format gives better performance than the SpGEMM based on the CSR format on most irregular sparse matrices, while there is no specific winner for the regular sparse matrices. Fig. 6(d) indicates that there is no absolute advantage in exporting the internal representation to CSR or BSR.

The final performance comparison between CSR-based and BSR-based SpGEMM is presented in Fig. 7. It can be seen that neither CSR-based nor BSR-based SpGEMM shows an overwhelming advantage. The specific performance of SpGEMM has a complex relationship with the chosen sparse format and the matrix sparsity.
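The per-phase proportions reported in Figs. 4 and 5 only require wall-clock timing of each phase in isolation; a small harness of the kind sketched below (our own helper, not library code) is sufficient. For GPU libraries the device must be synchronized before each timestamp is taken.

```cpp
// Sketch of a per-phase timing harness used to derive phase proportions.
#include <chrono>
#include <cstdio>
#include <functional>
#include <utility>
#include <vector>

double time_ms(const std::function<void()>& phase) {
  // For device-side phases, call cudaDeviceSynchronize() inside `phase`
  // (or around it) so the measured interval covers the actual work.
  auto t0 = std::chrono::steady_clock::now();
  phase();
  auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

void report_proportions(const std::vector<std::pair<const char*, std::function<void()>>>& phases) {
  std::vector<double> ms;
  double total = 0.0;
  for (const auto& p : phases) { ms.push_back(time_ms(p.second)); total += ms.back(); }
  for (size_t i = 0; i < phases.size(); ++i)
    std::printf("%-18s %8.3f ms  (%5.1f%%)\n", phases[i].first, ms[i], 100.0 * ms[i] / total);
}
```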

Fig. 6: Comparison of the execution time of each phase in different sparse formats, including CSR and BSR. (a) Format conversion. (b) Stage1. (c) Stage2. (d) Format export.

Fig. 7: Performance comparison of the SpGEMM implementation in MKL where the involved sparse matrices are compressed with the CSR or BSR format.

2) SpGEMM Implementation: In this section, we compare the performance of different SpGEMM implementations. First, we compare the performance of each stage of the libraries whose SpGEMM implementation predicts the number of non-zeros of the output matrix using the precise method. Among the existing libraries, MKL, cuSPARSE, and KokkosKernels utilize the precise method in the first symbolic phase.

Fig. 8 presents the speedup of the execution time of cuSPARSE and KokkosKernels over MKL (SpGEMM based on CSR) in the symbolic phase. The SpGEMM in KokkosKernels encounters runtime errors on the mac_econ_fwd500, scircuit, and webbase-1M matrices, so their values are set to 0 (the same below). From Fig. 8 we can see that for most regular matrices, the implementation of the symbolic phase in cuSPARSE is better than the other libraries, while the CUDA-based implementation of the symbolic phase in KokkosKernels is the best for most irregular matrices. Besides, MKL and the OpenMP-based implementation of the symbolic phase in KokkosKernels show worse performance than the other libraries.

The speedup of the execution time of cuSPARSE and KokkosKernels over MKL in the second numeric phase can be compared in Fig. 9. It can be seen that the CUDA-based implementation of the numeric phase in KokkosKernels outperforms cuSPARSE and OpenMP-based KokkosKernels on every sparse matrix except the three problematic matrices and the matrix cant. Especially on the irregular sparse matrices, its performance advantage is considerable.

Fig. 8: The speedup of the execution time of cuSPARSE and of CUDA-based and OpenMP-based KokkosKernels over MKL (based on CSR) in the symbolic phase.
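The "precise method" referred to above can be implemented as a separate symbolic pass. The sketch below is a generic Gustavson-style counting pass (not code from any of the evaluated libraries) that marks the columns touched in each row of C = A * B and produces the exact row pointer array of C.

```cpp
// Precise (symbolic) size prediction: exact nnz per row of C from CSR inputs.
#include <vector>

std::vector<int> symbolic_row_ptr(int m, int n,
                                  const std::vector<int>& a_row_ptr,
                                  const std::vector<int>& a_col_idx,
                                  const std::vector<int>& b_row_ptr,
                                  const std::vector<int>& b_col_idx) {
  std::vector<int> row_ptr(m + 1, 0);
  std::vector<int> marker(n, -1);                // last row that touched column j
  for (int i = 0; i < m; ++i) {
    int count = 0;
    for (int p = a_row_ptr[i]; p < a_row_ptr[i + 1]; ++p) {
      const int k = a_col_idx[p];
      for (int q = b_row_ptr[k]; q < b_row_ptr[k + 1]; ++q) {
        const int j = b_col_idx[q];
        if (marker[j] != i) { marker[j] = i; ++count; }  // count each column once
      }
    }
    row_ptr[i + 1] = row_ptr[i] + count;         // exact nnz of row i of C
  }
  return row_ptr;
}
```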

Fig. 9: The speedup of cuSPARSE and of CUDA-based and OpenMP-based KokkosKernels over MKL (based on CSR) in the numeric phase.

Fig. 10 presents the proportion of execution time of the symbolic and numeric phases in different SpGEMM implementations, where MKL_1 and MKL_2 represent the symbolic and numeric phases in MKL, respectively, and the others are similar. It can be seen from Fig. 10 that the execution time of the symbolic phase is less than or equal to that of the numeric phase on every sparse matrix in cuSPARSE and OpenMP-based KokkosKernels. Moreover, there are 8 and 12 matrices out of the whole 20 matrices whose time proportions of the symbolic phase are less than or equal to 30% in cuSPARSE and OpenMP-based KokkosKernels, respectively. In MKL and CUDA-based KokkosKernels, however, there are 11 and 16 out of 17 matrices, respectively, whose time proportions of the symbolic phase are larger than 30%, and the time proportions of 3 and 7 out of the 17 matrices, respectively, are larger than 50%.

The final performance of the SpGEMM implementations in the different libraries is summarised in Fig. 11. It can be seen that CUDA-based KokkosKernels, cuSPARSE, bhSPARSE, and OpenMP-based KokkosKernels outperform the other SpGEMM implementations on 12, 5, 2, and 1 sparse matrices, respectively.

VII. CHALLENGES AND FUTURE WORK

SpGEMM has gained much attention in the past decades, and work has been conducted in many directions. We believe that it will be used in many other fields with the development of IoT, big data, and AI. A systematic literature review sheds light on the potential research directions and on some challenges that require further study.

Fig. 10: The proportion of execution time of the symbolic and numeric phases in different SpGEMM implementations: (a) MKL, (b) cuSPARSE, (c) CUDA-based KokkosKernels, (d) OpenMP-based KokkosKernels. MKL_1 and MKL_2 represent the symbolic and numeric phases in MKL, respectively, and the others are similar.

Fig. 11: Performance comparison of the SpGEMM implementations in CSPARSE, MKL, CUSP, bhSPARSE, cuSPARSE, and KokkosKernels. To be fair, the results in this figure do not include the time cost of format conversion and format export, and the performance of MKL in this figure is based on the CSR format.

One of the challenges of SpGEMM is exploiting the sparsity of the sparse matrices. Researchers find that the distribution of the non-zero elements dominates the performance of SpMV and SpGEMM on the same architecture. To further explore the potential performance improvement, automatic format selection algorithms have been designed for SpMV based on machine learning models. Significant performance improvement has been observed using auto-tuning and format selection based on these models. However, the situation for SpGEMM is more complicated because two sparse matrices are involved. The performance depends not only on the sparsity of the two input matrices but also on the combination of the sub-matrices partitioned from the input matrices.

Size prediction of the result matrix is also a challenge for SpGEMM. Upper-bound prediction may result in memory over-allocation, and other methods may lead to memory re-allocation. Both over-allocation and re-allocation are challenging for acceleration devices, such as GPUs, DSPs, FPGAs, and TPUs. On-device memory capacity is limited and may not accommodate such a large amount of data. It also takes time to copy data to and from device memory because of the separated memory spaces. Precise size prediction not only reduces the overhead of memory management but also indirectly improves the performance of result accumulation, because a smaller memory footprint and little re-allocation are required for manipulating the output matrix.

Finally, heterogeneous architecture-oriented optimization is a challenge for diversified systems. The CPU is good at processing complicated logic, while the GPU is good at dense computations. Besides, DSP and FPGA accelerators may be used in different systems. One of the critical questions in porting SpGEMM to a heterogeneous system is how to achieve load balance and minimize the communication traffic. Moreover, each sub-matrix may have its own non-zero distribution. Ideally, relatively dense blocks should be mapped to acceleration devices, while super-sparse blocks should be mapped to the CPU. Only in this way can each device give full play to its architecture and optimize the overall computing performance of SpGEMM.

VIII. CONCLUSION

The design of an efficient SpGEMM algorithm, as well as its implementation, is critical to many large-scale scientific applications. Therefore, it is not surprising that SpGEMM has attracted much attention from researchers over the years. In this survey, we highlight some of the developments in recent years and emphasize the applications, challenging problems, architecture-oriented optimizations, and performance evaluation. In concluding, we stress the fact that despite recent progress, there are still important areas where much work remains to be done, such as support for different formats, heterogeneous architecture-oriented optimization, and efficient size prediction of the target matrix. On the other hand, heterogeneous architectures may give rise to new optimization opportunities, and efficient implementation on specific architectures is within reach and can be expected to continue in the near future.

REFERENCES

[1] J. Georgii and R. Westermann, "A streaming approach for sparse matrix products and its application in Galerkin multigrid methods," Electronic Transactions on Numerical Analysis, vol. 37, no. 263-275, pp. 3–5, 2010.
[2] N. Bell, S. Dalton, and L. N. Olson, "Exposing fine-grained parallelism in algebraic multigrid methods," SIAM J. Scientific Computing, vol. 34, no. 4, 2012. [Online]. Available: https://doi.org/10.1137/110838844
[3] G. Ballard, C. Siefert, and J. Hu, "Reducing communication costs for sparse matrix multiplication within algebraic multigrid," SIAM Journal on Scientific Computing, vol. 38, no. 3, pp. C203–C231, 2016. [Online]. Available: https://doi.org/10.1137/15M1028807
[4] A. Bienz, "Reducing communication in sparse solvers," Ph.D. dissertation, University of Illinois at Urbana-Champaign, Sep. 2018.
[5] T. A. Davis, "Graph algorithms via SuiteSparse:GraphBLAS: triangle counting and k-truss," in 2018 IEEE High Performance extreme Computing Conference (HPEC), Sep. 2018, pp. 1–6.
[6] J. Cohen, "Graph twiddling in a MapReduce world," Computing in Science and Engg., vol. 11, no. 4, pp. 29–41, Jul. 2009. [Online]. Available: https://doi.org/10.1109/MCSE.2009.120
[7] M. M. Wolf, M. Deveci, J. W. Berry, S. D. Hammond, and S. Rajamanickam, "Fast linear algebra-based triangle counting with KokkosKernels," in 2017 IEEE High Performance Extreme Computing Conference (HPEC), Sep. 2017, pp. 1–7.
[8] A. Azad, A. Buluç, and J. Gilbert, "Parallel triangle counting and enumeration using matrix algebra," in 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, May 2015, pp. 804–811.
[9] J. R. Gilbert, S. Reinhardt, and V. B. Shah, "High-performance graph algorithms from parallel sparse matrices," in International Workshop on Applied Parallel Computing. Springer, 2006, pp. 260–269.
[10] M. Then, M. Kaufmann, F. Chirigati, T.-A. Hoang-Vu, K. Pham, A. Kemper, T. Neumann, and H. T. Vo, "The more the merrier: Efficient multi-source graph traversal," Proc. VLDB Endow., vol. 8, no. 4, pp. 449–460, Dec. 2014. [Online]. Available: http://dx.doi.org/10.14778/2735496.2735507
[11] A. Buluç and J. R. Gilbert, "The combinatorial BLAS: design, implementation, and applications," International Journal of High Performance Computing Applications, vol. 25, no. 4, pp. 496–509, 2011. [Online]. Available: https://doi.org/10.1177/1094342011403516
[12] T. M. Chan, "More algorithms for all-pairs shortest paths in weighted graphs," in Proceedings of the Thirty-ninth Annual ACM Symposium on Theory of Computing, ser. STOC '07. New York, NY, USA: ACM, 2007, pp. 590–598. [Online]. Available: http://doi.acm.org/10.1145/1250790.1250877
[13] H. Kaplan, M. Sharir, and E. Verbin, "Colored intersection searching via sparse rectangular matrix multiplication," in Proceedings of the Twenty-second Annual Symposium on Computational Geometry. ACM, 2006, pp. 52–60.
[14] M. Deveci, E. G. Boman, K. D. Devine, and S. Rajamanickam, "Parallel graph coloring for manycore architectures," in 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), May 2016, pp. 892–901.
[15] V. Vassilevska, R. Williams, and R. Yuster, "Finding heaviest h-subgraphs in real weighted graphs, with applications," CoRR, vol. abs/cs/0609009, 2006. [Online]. Available: http://arxiv.org/abs/cs/0609009
[16] A. Buluç and K. Madduri, "Parallel breadth-first search on distributed memory systems," in Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC '11. New York, NY, USA: ACM, 2011, pp. 65:1–65:12. [Online]. Available: http://doi.acm.org/10.1145/2063384.2063471
[17] W. Liu and B. Vinter, "A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors," J. Parallel Distrib. Comput., vol. 85, no. C, pp. 47–61, Nov. 2015. [Online]. Available: http://dx.doi.org/10.1016/j.jpdc.2015.06.010
[18] F. G. Gustavson, "Two fast algorithms for sparse matrices: Multiplication and permuted transposition," ACM Trans. Math. Softw., vol. 4, pp. 250–269, 1978.
[19] I. S. Duff, "A survey of sparse matrix research," Proceedings of the IEEE, vol. 65, no. 4, pp. 500–535, April 1977.
[20] M. Grossman, C. Thiele, M. Araya-Polo, F. Frank, F. O. Alpak, and V. Sarkar, "A survey of sparse matrix-vector multiplication performance on large matrices," CoRR, vol. abs/1608.00636, 2016. [Online]. Available: http://arxiv.org/abs/1608.00636
[21] W. Liu and B. Vinter, "An efficient GPU general sparse matrix-matrix multiplication for irregular data," in 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014, pp. 370–381. [Online]. Available: https://doi.org/10.1109/IPDPS.2014.47
[22] B. Kitchenham, "Procedures for performing systematic reviews," Keele, UK, Keele Univ., vol. 33, 08 2004.
[23] R. D. Falgout, "An introduction to algebraic multigrid," Computing in Science & Engineering, vol. 8, no. 6, pp. 24–33, Nov 2006.
[24] W. L. Briggs, V. E. Henson, and S. F. McCormick, A Multigrid Tutorial: Second Edition. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2000.
[25] J. J. Elliott and C. M. Siefert, "Low thread-count Gustavson: A multi-threaded algorithm for sparse matrix-matrix multiplication using perfect hashing," in 2018 IEEE/ACM 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (scalA), Nov 2018, pp. 57–64.
[26] A. Yaşar, S. Rajamanickam, M. Wolf, J. Berry, and Ü. V. Çatalyürek, "Fast triangle counting using cilk," in 2018 IEEE High Performance extreme Computing Conference (HPEC), Sep. 2018, pp. 1–7.
[39] M. O. Rabin and V. V. Vazirani, "Maximum matchings in general graphs through randomization," J. Algorithms, vol. 10, no. 4, pp. 557–567, Dec. 1989. [Online]. Available: https://doi.org/10.1016/0196-6774(89)90005-9
[40] K. Censor-Hillel, P. Kaski, J. H. Korhonen, C. Lenzen, A. Paz, and J. Suomela, "Algebraic methods in the congested clique," CoRR, vol. abs/1503.04963, 2015. [Online]. Available: http://arxiv.org/abs/1503.04963
[41] V. Weber, T. Laino, A. Pozdneev, I. Fedulova, and A. Curioni, "Semiempirical molecular dynamics (SEMD) I: Midpoint-based parallel sparse matrix-matrix multiplication algorithm for matrices with decay," Journal of Chemical Theory and Computation, vol. 11, no. 7, pp. 3145–3152, 2015, PMID: 26575751. [Online]. Available: https://doi.org/10.1021/acs.jctc.5b00382
[42] K. Akbudak and C. Aykanat, "Simultaneous input and output matrix partitioning for outer-product-parallel sparse matrix-matrix multiplication," SIAM Journal on Scientific Computing, vol. 36, no. 5, pp. C568–C590, 2014. [Online]. Available: https://doi.org/10.1137/13092589X
[43] F. Gremse, A. Höfter, L. Schwen, F. Kiessling, and U. Naumann, "GPU-accelerated sparse matrix-matrix multiplication by iterative row merging," SIAM Journal on Scientific Computing, vol. 37, no. 1, pp. C54–C71, 2015. [Online]. Available: https://doi.org/10.1137/130948811
[27] M. M. Wolf, J. W. Berry, and D. T. Stark, "A task-based linear algebra
[44] J.
Demouth, “Sparse matrix-matrix multiplication on the gpu,” in GPU building blocks approach for scalable graph analytics,” in 2015 IEEE Technology Conference 2012, 2012. High Performance Extreme Computing Conference (HPEC), Sep. 2015, [45] Y. Nagasaka, A. Nukada, and S. Matsuoka, “High-performance and pp. 1–6. memory-saving sparse general matrix-matrix multiplication for nvidia [28] L. Wang, Y. Wang, C. Yang, and J. D. Owens, “A comparative study pascal gpu,” in 2017 46th International Conference on Parallel Pro- on exact triangle counting algorithms on the gpu,” in Proceedings cessing (ICPP), Aug 2017, pp. 101–110. of the ACM Workshop on High Performance Graph Processing, ser. [46] G. V. Demirci and C. Aykanat, “Scaling sparse matrix-matrix HPGP ’16. New York, NY, USA: ACM, 2016, pp. 1–8. [Online]. multiplication in the accumulo database,” Distributed and Parallel Available: http://doi.acm.org/10.1145/2915516.2915521 Databases, Jan 2019. [Online]. Available: https://doi.org/10.1007/ [29] Y. Zhang, L. Wu, G. Wei, and S. Wang, “A novel algorithm s10619-019-07257-y for all pairs shortest path problem based on matrix multiplication [47] M. Deveci, C. Trott, and S. Rajamanickam, “Performance-portable Digital Signal Processing and pulse coupled neural network,” , sparse matrix-matrix multiplication for many-core architectures,” vol. 21, no. 4, pp. 517 – 521, 2011. [Online]. Available: in 2017 IEEE International Parallel and Distributed Processing http://www.sciencedirect.com/science/article/pii/S1051200411000650 Symposium Workshops, IPDPS Workshops 2017, Orlando / Buena [30] D. Eppstein and E. S. Spiro, “The h-index of a graph and its application Vista, FL, USA, May 29 - June 2, 2017, 2017, pp. 693–702. [Online]. to dynamic subgraph statistics,” in Proceedings of the 11th International Available: https://doi.org/10.1109/IPDPSW.2017.8 Symposium on Algorithms and Data Structures, ser. WADS ’09. Berlin, [48] E. Cohen, “Size-estimation framework with applications to transitive Heidelberg: Springer-Verlag, 2009, pp. 278–289. closure and reachability,” J. Comput. Syst. Sci., vol. 55, no. 3, pp. [31] S. Samsi, V. Gadepally, M. B. Hurley, M. Jones, E. K. Kao, 441–453, Dec. 1997. [Online]. Available: http://dx.doi.org/10.1006/jcss. S. Mohindra, P. Monticciolo, A. Reuther, S. Smith, W. S. Song, 1997.1534 D. Staheli, and J. Kepner, “Static graph challenge: Subgraph [49] R. R. Amossen, A. Campagna, and R. Pagh, “Better size estimation for isomorphism,” CoRR, vol. abs/1708.06866, 2017. [Online]. Available: sparse matrix products,” 6 2010. http://arxiv.org/abs/1708.06866 [32] V. Gadepally, J. Bolewski, D. Hook, D. Hutchison, B. A. Miller, [50] R. Pagh and M. Stockel,¨ “The input/output complexity of sparse matrix and J. Kepner, “Graphulo: Linear algebra graph kernels for nosql multiplication,” CoRR, vol. abs/1403.3551, 2014. [Online]. Available: databases,” CoRR, vol. abs/1508.07372, 2015. [Online]. Available: http://arxiv.org/abs/1403.3551 http://arxiv.org/abs/1508.07372 [51] Y. Nagasaka, S. Matsuoka, A. Azad, and A. Bulu, “High-performance [33] C. E. Tsourakakis, P. Drineas, E. Michelakis, I. Koutis, and C. Faloutsos, sparse matrix-matrix products on intel knl and multicore architectures.” “Spectral counting of triangles via element-wise sparsification and [52] J. Gilbert, C. Moler, and R. Schreiber, “Sparse matrices in : triangle-based link recommendation,” Social Network Analysis and Design and implementation,” SIAM Journal on Matrix Analysis and Mining, vol. 1, no. 2, pp. 75–81, Apr 2011. [Online]. 
Available: Applications, vol. 13, no. 1, pp. 333–356, 1992. [Online]. Available: https://doi.org/10.1007/s13278-010-0001-9 https://doi.org/10.1137/0613024 [34] A. Pavan, K. Tangwongsan, S. Tirthapura, and K.-L. Wu, “Counting [53] K. Matam, S. R. Krishna Bharadwaj Indarapu, and K. Kothapalli, and sampling triangles from a graph stream,” Proc. VLDB Endow., “Sparse matrix-matrix multiplication on modern architectures,” in 2012 vol. 6, no. 14, pp. 1870–1881, Sep. 2013. [Online]. Available: 19th International Conference on High Performance Computing, Dec http://dx.doi.org/10.14778/2556549.2556569 2012, pp. 1–10. [35] N. Wang, J. Zhang, K.-L. Tan, and A. K. H. Tung, “On [54] E. H. Rubensson and E. Rudberg, “Chunks and tasks: A programming triangulation-based dense neighborhood graph discovery,” Proc. VLDB model for parallelization of dynamic algorithms,” Parallel Comput., Endow., vol. 4, no. 2, pp. 58–68, Nov. 2010. [Online]. Available: vol. 40, no. 7, pp. 328–343, Jul. 2014. [Online]. Available: http://dx.doi.org/10.14778/1921071.1921073 http://dx.doi.org/10.1016/j.parco.2013.09.006 [36] T. Mattson, D. Bader, J. Berry, A. Buluc, J. Dongarra, C. Faloutsos, [55] K. Akbudak and C. Aykanat, “Exploiting locality in sparse matrix- J. Feo, J. Gilbert, J. Gonzalez, B. Hendrickson, J. Kepner, C. Leiserson, matrix multiplication on many-core architectures,” IEEE Transactions A. Lumsdaine, D. Padua, S. Poole, S. Reinhardt, M. Stonebraker, on Parallel and Distributed Systems, vol. 28, no. 8, pp. 2258–2271, S. Wallach, and A. Yoo, “Standards for graph algorithm primitives,” in Aug 2017. 2013 IEEE High Performance Extreme Computing Conference (HPEC), [56] K. Akbudak, O. Selvitopi, and C. Aykanat, “Partitioning models for Sep. 2013, pp. 1–2. scaling parallel sparse matrix-matrix multiplication,” vol. 4, pp. 1–34, [37] A. Buluc¸ and J. R. Gilbert, “Highly parallel sparse matrix-matrix 2018. multiplication,” arXiv preprint arXiv:1006.2183, 2010. [57] M. Winter, D. Mlakar, R. Zayer, H.-P. Seidel, and M. Steinberger, [38] J. R. Gilbert, S. Reinhardt, and V. B. Shah, “A unified framework “Adaptive sparse matrix-matrix multiplication on the gpu,” in for numerical and combinatorial computing,” Computing in Science Proceedings of the 24th Symposium on Principles and Practice Engineering, vol. 10, no. 2, pp. 20–25, March 2008. of Parallel Programming, ser. PPoPP ’19. New York, NY, USA: ACM, 2019, pp. 68–81. [Online]. Available: http://doi.acm.org/10.1145/ Endow., vol. 7, no. 1, pp. 85–96, Sep. 2013. [Online]. Available: 3293883.3295701 http://dx.doi.org/10.14778/2732219.2732227 [58] M. Deveci, S. D. Hammond, M. M. Wolf, and S. Rajamanickam, [78] D. Jin and S. G. Ziavras, “A super-programming technique for large “Sparse matrix-matrix multiplication on multilevel memory architectures sparse matrix multiplication on pc clusters,” IEICE TRANSACTIONS on : Algorithms and experiments,” CoRR, vol. abs/1804.00695, 2018. Information and Systems, vol. 87, no. 7, pp. 1774–1781, 2004. [Online]. Available: http://arxiv.org/abs/1804.00695 [79] I. Bethune, A. Gloess, J. Hutter, A. Lazzaro, H. Pabst, and F. Reid, [59] Y. Chen, K. Li, W. Yang, G. Xiao, X. Xie, and T. Li, “Performance-aware “Porting of the dbcsr library for sparse matrix-matrix multiplications to model for sparse matrix-matrix multiplication on the sunway taihulight intel xeon phi systems.” supercomputer,” vol. 30, pp. 923–938, 2019. [80] J. Siegel, O. Villa, S. Krishnamoorthy, A. Tumeo, and X. Li, “Efficient [60] S. E. Kurt, V. Thumma, C. Hong, A. 
Sukumaran-Rajam, and P. Sadayap- sparse matrix-matrix multiplication on heterogeneous high performance pan, “Characterization of data movement requirements for sparse matrix systems,” in 2010 IEEE International Conference On Cluster Computing computations on gpus,” in 2017 IEEE 24th International Conference on Workshops and Posters (CLUSTER WORKSHOPS). IEEE, 2010, pp. High Performance Computing (HiPC), Dec 2017, pp. 283–293. 1–8. [61] F. Gremse, K. Kpper, and U. Naumann, “Memory-efficient sparse [81] E. H. Rubensson and E. Rudberg, “Locality-aware parallel block-sparse matrix-matrix multiplication by row merging on many-core architec- matrix-matrix multiplication using the chunks and tasks programming tures,” vol. 40, pp. C429–C449, 2018. model,” Parallel Comput., vol. 57, no. C, pp. 87–106, Sep. 2016. [62] R. A. van de Geijn and J. Watts, “Summa: Scalable universal matrix [Online]. Available: https://doi.org/10.1016/j.parco.2016.06.005 multiplication algorithm,” Austin, TX, USA, Tech. Rep., 1995. [82] C. Y. Lin, Z. Zhang, N. Wong, and H. K. So, “Design space exploration [63] A. Buluc and J. R. Gilbert, “On the representation and multiplication for sparse matrix-matrix multiplication on fpgas,” in 2010 International of hypersparse matrices,” in 2008 IEEE International Symposium on Conference on Field-Programmable Technology, Dec 2010, pp. 369– Parallel and Distributed Processing. IEEE, 2008, pp. 1–11. 372. [83] E. Jamro, T. Pabis, P. Russek, and K. Wiatr, “The algorithms for [64] ——, “Challenges and advances in parallel sparse matrix-matrix multi- fpga implementation of sparse matrices multiplication,” Computing and plication,” in 2008 37th International Conference on Parallel Processing. Informatics, vol. 33, pp. 667–684, 01 2014. IEEE, 2008, pp. 503–510. [84] J. Hoberock and N. Bell, Thrust: A parallel template library, 2013, [65] A. Azad, G. Ballard, A. Buluc¸, J. Demmel, L. Grigori, O. Schwartz, version 1.7.0. [Online]. Available: http://thrust.github.io/ S. Toledo, and S. Williams, “Exploiting multiple levels of parallelism [85] H. C. Edwards, C. R. Trott, and D. Sunderland, “Kokkos: Enabling in sparse matrix-matrix multiplication,” CoRR, vol. abs/1510.00844, manycore performance portability through polymorphic memory access 2015. [Online]. Available: http://arxiv.org/abs/1510.00844 patterns,” Journal of Parallel and Distributed Computing, vol. 74, no. 12, [66] G. Ballard, A. Druinsky, N. Knight, and O. Schwartz, “Hypergraph pp. 3202 – 3216, 2014, domain-Specific Languages and High-Level partitioning for sparse matrix-matrix multiplication,” ACM Trans. Frameworks for High-Performance Computing. [Online]. Available: Parallel Comput., vol. 3, no. 3, pp. 18:1–18:34, Dec. 2016. [Online]. http://www.sciencedirect.com/science/article/pii/S0743731514001257 Available: http://doi.acm.org/10.1145/3015144 [86] T. A. Davis and Y. Hu, “The university of florida sparse matrix [67] O. Selvitopi, G. V. Demirci, A. Turk, and C. Aykanat, “Locality- collection,” ACM Trans. Math. Softw., vol. 38, no. 1, pp. 1:1–1:25, 2011. aware and load-balanced static task scheduling for mapreduce,” [87] Kokkoskernels. [Online]. Available: https://github.com/kokkos/ Future Generation Computer Systems, vol. 90, pp. 49 – 61, kokkos-kernels 2019. [Online]. Available: http://www.sciencedirect.com/science/article/ [88] NVIDIA. Nvidia cusparse library. [Online]. Available: https://developer. pii/S0167739X17329266 nvidia.com/cusparse [68] M. Deveci, C. Trott, and S. Rajamanickam, “Multi-threaded sparse [89] Intel. 
Intel math kernel library. [Online]. Available: https://software. matrix-matrix multiplication for many-core and gpu architectures,” intel.com/en-us/mkl vol. 78, pp. 33–46, 2018. [90] S. Dalton and N. Bell. Cusp : A c++ templated sparse matrix library. [69] M. M. A. Patwary, N. R. Satish, N. Sundaram, J. Park, M. J. Anderson, [Online]. Available: http://cusplibrary.github.com S. G. Vadlamudi, D. Das, S. G. Pudov, V. O. Pirogov, and P. Dubey, [91] T. Davis. Csparse: a concise sparse matrix package. [Online]. Available: Parallel Efficient Sparse Matrix-Matrix Multiplication on Multicore http://www.cise.ufl.edu/research/sparse/CSparse Platforms, 2015. [70] J. Liu, X. He, W. Liu, and G. Tan, “Register-aware optimizations for parallel sparse matrix–matrix multiplication,” International Journal of Parallel Programming, Jan 2019. [Online]. Available: https: //doi.org/10.1007/s10766-018-0604-8 [71] S. Dalton, L. N. Olson, and N. Bell, “Optimizing sparse matrix - matrix multiplication for the GPU,” ACM Trans. Math. Softw., vol. 41, no. 4, pp. 25:1–25:20, 2015. [Online]. Available: https://doi.org/10.1145/2699470 [72] D. Merrill and A. Grimshaw, “High performance and scalable radix sorting: a case study of implementing dynamic parallelism for gpu computing.” Parallel Processing Letters, vol. 21, pp. 245–272, 06 2011. [73] G. E. Blelloch, M. A. Heroux, and M. Zagha, “Segmented operations for sparse matrix computation on vector multiprocessors,” School of Computer Science, Carnegie Mellon University, Tech. Rep. CMU-CS- 93-173, Aug. 1993. [74] K. Hou, W. Liu, H. Wang, and W.-c. Feng, “Fast segmented sort on gpus,” in Proceedings of the International Conference on Supercomputing, ser. ICS ’17. New York, NY, USA: ACM, 2017, pp. 12:1–12:10. [Online]. Available: http://doi.acm.org/10.1145/3079079. 3079105 [75] M. Deveci, K. Kaya, and U. V. atalyurek,¨ “Hypergraph sparsification and its application to partitioning,” in 2013 42nd International Conference on Parallel Processing, Oct 2013, pp. 200–209. [76] C. Kim, T. Kaldewey, V. W. Lee, E. Sedlar, A. D. Nguyen, N. Satish, J. Chhugani, A. Di Blas, and P. Dubey, “Sort vs. hash revisited: Fast join implementation on modern multi-core cpus,” Proc. VLDB Endow., vol. 2, no. 2, pp. 1378–1389, Aug. 2009. [Online]. Available: https://doi.org/10.14778/1687553.1687564 [77] C. Balkesen, G. Alonso, J. Teubner, and M. T. Ozsu,¨ “Multi- core, main-memory joins: Sort vs. hash revisited,” Proc. VLDB