Beyond Macrobenchmarks: Microbenchmark-based Graph Database Evaluation

Matteo Lissandrini (Aalborg University)
Martin Brugnara (University of Trento)
Yannis Velegrakis (University of Trento)
[email protected] [email protected] [email protected]

ABSTRACT

Despite the increasing interest in graph databases their requirements and specifications are not yet fully understood by everyone, leading to a great deal of variation in the supported functionalities and the achieved performances. In this work, we provide a comprehensive study of the existing graph database systems. We introduce a novel micro-benchmarking framework that provides insights on their performance that go beyond what macro-benchmarks can offer. The framework includes the largest set of queries and operators so far considered. The graph database systems are evaluated on synthetic and real data, from different domains, and at scales much larger than any previous work. The framework is materialized as an open-source suite and

There are two categories of graph management systems that address two complementary yet distinct sets of functionalities. The first is that of graph processing systems [27, 42, 44], which analyze graphs to discover characteristic properties, e.g., average connectivity degree, density, and modularity. They also perform batch analytics at large scale, implementing computationally expensive graph algorithms, such as PageRank [54], SVD [23], strongly connected components identification [63], and core identification [12, 20]. Those are systems like GraphLab, Giraph, Graph Engine, and GraphX [67]. The second category is that of graph databases (GDB for short) [5]. Their focus is on storage and querying tasks where the priority is on high-throughput and transactional operations.
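To make the "characteristic properties" mentioned above concrete, the following is a minimal sketch of how two of them, average connectivity degree and density, could be computed for a small undirected graph. It is an illustration only, not code from any of the systems cited: the function name `graph_properties`, the edge-list input format, and the toy graph are all assumptions introduced here, and the sketch treats the graph as simple (no self-loops or parallel edges) with every node appearing in at least one edge.

```python
from collections import defaultdict


def graph_properties(edges):
    """Return (average degree, density) of an undirected simple graph.

    `edges` is an iterable of (u, v) pairs. This input format is an
    assumption of this sketch; nodes without edges cannot be represented.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # self-loops are ignored in a simple graph
        adjacency[u].add(v)
        adjacency[v].add(u)

    n = len(adjacency)  # number of nodes
    # Each undirected edge is stored twice, once per endpoint.
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2

    # Average degree: every edge contributes to the degree of two nodes.
    avg_degree = 2 * m / n if n else 0.0
    # Density: edges present over the n * (n - 1) / 2 edges possible.
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return avg_degree, density


# Hypothetical toy graph: a triangle (a, b, c) plus one pendant node d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
avg_degree, density = graph_properties(edges)
print(f"average degree = {avg_degree:.2f}, density = {density:.2f}")
```

Both measures require a pass over the whole graph, which is typical of the analytical workloads of graph processing systems, in contrast to the localized storage and querying operations that graph databases prioritize.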