Breadth First Search Vectorization on the Intel Xeon Phi Mireya Paredes Graham Riley Mikel Luján University of Manchester University of Manchester University of Manchester Oxford Road, M13 9PL Oxford Road, M13 9PL Oxford Road, M13 9PL Manchester, UK Manchester, UK Manchester, UK
[email protected] [email protected] [email protected] ABSTRACT Today, scientific experiments in many domains, and large Breadth First Search (BFS) is a building block for graph organizations can generate huge amounts of data, e.g. social algorithms and has recently been used for large scale anal- networks, health-care records and bio-informatics applica- ysis of information in a variety of applications including so- tions [8]. Graphs seem to be a good match for important cial networks, graph databases and web searching. Due to and large dataset analysis as they can abstract real world its importance, a number of different parallel programming networks. To process large graphs, different techniques have models and architectures have been exploited to optimize been applied, including parallel programming. The main the BFS. However, due to the irregular memory access pat- challenge in the parallelization of graph processing is that terns and the unstructured nature of the large graphs, its large graphs are often unstructured and highly irregular and efficient parallelization is a challenge. The Xeon Phi is a this limits their scalability and performance when executing massively parallel architecture available as an off-the-shelf on off-the-shelf systems [18]. accelerator, which includes a powerful 512 bit vector unit The Breadth First Search (BFS) is a building block of with optimized scatter and gather functions.