Citation: Alekh Jindal et al. "Graph Analytics Using Vertica Relational Database." 2015 IEEE International Conference on Big Data (Big Data), October 29 - November 1, 2015, Santa Clara, California, USA. IEEE, December 2015: 1191-1200. © 2015 IEEE.
As Published: http://dx.doi.org/10.1109/BigData.2015.7363873
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Version: Original manuscript (arXiv:1412.5263v1 [cs.DB], 17 Dec 2014)
Citable link: http://hdl.handle.net/1721.1/111783
Terms of Use: Creative Commons Attribution-NonCommercial-ShareAlike (http://creativecommons.org/licenses/by-nc-sa/4.0/)

Graph Analytics using the Vertica Relational Database

Alekh Jindal (CSAIL, MIT), Samuel Madden (CSAIL, MIT), Malú Castellanos (Vertica, HP Software), Meichun Hsu (Vertica, HP Software)

Abstract

Graph analytics is becoming increasingly popular, with a deluge of new systems for graph analytics having been proposed in the past few years. These systems often start from the assumption that a new storage or query processing system is needed, in spite of graph data often being collected and stored in a relational database in the first place. In this paper, we study the Vertica relational database as a platform for graph analytics. We show that vertex-centric graph analysis can be translated to SQL queries, typically involving table scans and joins, and that modern column-oriented databases are very well suited to running such queries. Specifically, we present an experimental evaluation of the Vertica relational database system on a variety of graph analytics, including iterative analysis, a combination of graph and relational analyses, and more complex 1-hop neighborhood graph analytics, showing that it is competitive with two popular vertex-centric graph analytics systems, namely Giraph and GraphLab.

1. INTRODUCTION

Recent years have seen growing interest in the area of graph data management from businesses and academia. This focus on graphs arises from their use in a number of new applications, including social network analytics, transportation, ad and e-commerce recommendation systems, and web search. As a result, a deluge of new graph data management systems has been proposed. In particular, many systems focus on graph analytics, e.g., efficiently computing statistics and other metrics over graphs, such as PageRank or shortest paths. These graph analytics workloads are seen as quite different from traditional database analytics, largely due to the iterative nature of many of these computations, and the perceived awkwardness of expressing graph analytics as SQL queries (which typically involves multiple self-joins on tables of nodes and edges). Examples of these new systems include a number of so-called "vertex-centric" systems (e.g., Pregel [1], Giraph [2], GraphLab [3], Trinity [4], and Pregelix [5]).

1.1 Why Relational Databases?

Given the popular demand for graph analytics, a natural question is whether or not traditional database systems really are a bad fit for these graph analytics workloads. This question arises because, in many real-world scenarios, graph data is collected and stored in a relational database in the first place and it is expensive to move data around. Given this, if it is avoidable, users may prefer not to export their data from the relational database into a graph database. Rather, they would like to perform the graph analytics (with comparable performance) directly with the relational engine, without the expensive step of copying data into a file system (or a distributed storage system like HDFS) in order to be processed by a graph system, and then (possibly) back into the relational system for further processing. Indeed, some early efforts to implement graph queries in relational databases [6,7,8] have shown promise in this regard, but have typically only evaluated one or a small number of benchmarks, and have not demonstrated the feasibility of implementing an efficient, general-purpose graph engine in a relational system.

Apart from the need to avoid copying data in and out of file systems, graph engines suffer from another limitation. As graphs get larger and larger, users frequently want to (or will have to) select a subset of a graph before performing analysis on it. For example, it is unlikely that a user will run a single-source shortest path query on the entire trillion-node Facebook graph; this would be prohibitively slow on any system. Rather, it is more likely that users will run several shortest paths queries over different subsets of the graph (e.g., the N-hop neighbors of some particular user). Furthermore, real-world graphs have vertices and edges accompanied by several other attributes. For example, edges in a social network may be of different types such as friends, family, or classmates. Similarly, nodes may have several attributes to describe the properties of each person in the social network, e.g., their username, birth-date, and so on. Given such metadata, an analyst would typically do some ad-hoc relational analysis in addition to the graph analysis. For instance, the analyst may want to preprocess and shape the graph before running the actual graph algorithm, e.g., filtering edges based on timestamp or limiting the graph to just close friends of a particular user. Similarly, he may want to analyze the output of the graph analysis, e.g., computing aggregates to count the number of edges or nodes satisfying some property, or other statistics. Such pre- and post-processing of graph data requires relational operators such as selection, projection, aggregation, and join, for which relational databases are highly optimized.

In addition to combining relational analysis, several graph analyses compute aggregates over a larger neighborhood. For example, counting the triangles in a graph requires every vertex to access its neighbors' neighbors (which could potentially form the three vertices of a triangle). Likewise, finding whether a vertex acts as a bridge (weak tie) between two disconnected vertices requires every vertex to check for the presence of edges between its neighbors, i.e., the 1-hop neighborhood. Vertex-centric interfaces like Pregel [1] are tedious for expressing the above queries, as they require sending neighborhood information to all of a vertex's neighbors in the first superstep and then performing the actual analysis in the second superstep. SQL, on the other hand, is a much more powerful and general purpose language for capturing such analyses. For example, in Vertica, we can express triangle counting as a three-way self-join over the edge table very efficiently [7]. Similarly, we can detect weak ties using two inner joins (to get the two vertices on either side of the bridge) and one outer join (to make sure that the two vertices are not connected by an edge). Thus, by simply adding more joins, SQL is more flexible at expressing such graph analyses.
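As an illustration of the join-based formulation mentioned above, the following is a minimal sketch (not taken from the paper) of triangle counting as a three-way self-join on an edge table. It assumes a simple schema edge(from_node, to_node) in which every undirected edge is stored in both directions:

SELECT COUNT(*) AS num_triangles
FROM edge e1
JOIN edge e2
  ON e2.from_node = e1.to_node
 AND e1.from_node < e2.from_node      -- order the three vertices so each triangle is counted once
JOIN edge e3
  ON e3.from_node = e2.to_node
 AND e3.to_node   = e1.from_node      -- close the cycle back to the first vertex
 AND e2.from_node < e3.from_node;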

Finally, the query optimizer in a relational database picks the best plan for such multi-join queries and the system can take care of re-segmenting the data when needed. As an example, the optimizer may decide to fully pipeline the triangle counting query and never materialize the intermediate output. This is in contrast to the static query execution plans in typical graph analytics systems such as Giraph. Furthermore, since graph manipulations in Giraph are not implemented as operators, it is difficult to modify or extend the Giraph execution pipeline. Relational databases, on the other hand, are extensible by design.

1.2 Why Column Stores?

Graph analytics typically involves scan-oriented table joins followed by aggregates, which are highly suited for column-oriented relational databases like Vertica. In an earlier effort, we compared the performance of different relational data stores over graph queries and column stores were a clear winner [9]. This is due to a combination of several features in modern column stores, including efficient data storage (vertical partitioning and compression), vectorized data access, query pipelining, late materialization, and numerous join optimizations. Furthermore, in contrast to the narrow tables in raw graphs, the presence of metadata results in wide vertex and edge tables, where column stores perform especially well as they only need to access the columns that are relevant for the analysis.

In this paper, we describe four key aspects necessary to build high-performance graph analytics in the Vertica column-oriented database. First, we look at how we can translate the logical query plans of vertex-centric graph queries into relational operators and run them as standard SQL. Although vertex compute functions can be rewritten into pure SQL in some cases, we find that table UDFs (offered by many relational databases, including Vertica) are sufficient to express arbitrarily complex vertex functions as well. Second, we show several query optimization techniques to tune the performance of graph queries on Vertica. These include considering updating vs replacing the nodes table on each iteration, incremental evaluation of queries, and eliminating redundant joins. Third, we outline several features specific to a column store like Vertica that make it well suited to run graph analytics queries. Finally, we show how Vertica can be optimized using table UDFs to run iterative graph analytics in-memory, which significantly reduces the disk I/Os (and overall query time) at the cost of a higher memory footprint.

Contributions. In summary, our key contributions are as follows:

(1.) We take a closer look at vertex-centric graph processing, using the Giraph system (a popular graph analytics system) as an example. We show that vertex-centric graph processing can be expressed as a query execution plan, which in the case of Giraph is a fixed plan that is used to run all Giraph programs. We then show that this plan can be expressed as a logical query plan that can be optimized using a relational query optimizer (Section 2).

(2.) We show how we can translate this vertex-centric plan into SQL, which can be run on standard relational databases. We describe several query optimizations to improve the performance of vertex-centric queries and describe Vertica-specific features to run these queries efficiently. As a concrete example, we discuss the physical query execution plan of single source shortest path on Vertica. Lastly, we show how Vertica can be extended via table UDFs to run the entire unmodified vertex-centric query in-memory and as a single transaction (Section 3).

(3.) We provide an extensive experimental evaluation of several typical graph queries on large, billion-edge graphs in Vertica. We compare it with two popular vertex-centric graph processing systems, GraphLab and Giraph. Our key findings are: (i) Vertica has comparable end-to-end performance to these popular vertex-centric systems, (ii) Vertica has a much smaller memory footprint than the other systems, at the cost of much greater disk I/O, (iii) we can extend Vertica to trade an increased memory footprint for faster runtimes, comparable to that of GraphLab, (iv) relational engines naturally excel at combining graph analysis with relational analysis, and (v) relational engines can implement more complex 1-hop neighborhood graph analyses, which vertex-centric programming cannot express efficiently (Section 4).

2. BACKGROUND

Vertex-centric graph processing, first proposed in Pregel [1], has become the most popular general purpose way of processing graph data, due to its ease-of-use and the proven ability of engines based on it to scale to large graphs [3, 10, 4, 11, 5]. In this section, we first recap the vertex-centric programming model. Then, to understand the graph processing in a typical vertex-centric system, we analyze the execution pipeline in Giraph, an open-source implementation of Pregel, and express it as a logical query plan. Other Pregel-like systems use a similar static query plan, though some may use different scheduling strategies, e.g., GraphLab.

2.1 Vertex-centric Model

In the vertex-centric programming model, the user provides a UDF (the vertex program) specifying the computation that happens at each vertex of the graph. The UDFs update the vertex state and communicate by sharing messages with neighboring vertices. The underlying execution engine may choose to run the vertex-centric programs synchronously, as a series of supersteps with synchronization between them, or asynchronously, where threads update a representation of the graph in shared memory. Programmers do not have to worry about details such as how the graph is partitioned across nodes/threads, how it is distributed across multiple machines, or how message passing and coordination works. Each vertex may be on the same physical machine or a different, remote machine. The concept is similar to MapReduce, where programmers only specify the map and the reduce functions without worrying about the system details. To illustrate, Listing 1 shows how a programmer would implement single source shortest paths (SSSP) using Giraph (other Pregel-like systems have very similar syntax). In this program, each vertex compares its current shortest distance to the source to the distance reported by each of its neighbors, and if a shorter distance is found, updates its distance and propagates the updated distance to its neighbors.

public void compute(Iterable<IntWritable> messages) {
  // get the minimum distance
  if (getSuperstep() == 0)
    setValue(new IntWritable(Integer.MAX_VALUE));
  int minDist = isSource() ? 0 : Integer.MAX_VALUE;
  for (IntWritable message : messages)
    minDist = Math.min(minDist, message.get());
  // send messages to all edges if new minimum is found
  if (minDist < getValue().get()) {
    setValue(new IntWritable(minDist));
    for (Edge edge : getEdges()) {
      int distance = minDist + edge.getValue().get();
      sendMessage(edge.getTargetVertexId(), new IntWritable(distance));
    }
  }
  voteToHalt(); // halt
}
Listing 1: Single Source Shortest Path in Giraph.

2.2 Giraph Execution Pipeline

We now provide a detailed study of the execution workflow used in Giraph, to illustrate the key steps in vertex-centric program execution. Giraph runs graph analyses as user-provided vertex-centric programs on top of Hadoop MapReduce. The user provides the computation that happens at each vertex of the graph and Giraph takes care of running it in a distributed fashion over a large cluster. Specifically, Giraph executes the vertex-centric query as a map-only job (the GiraphMapper) on Hadoop MapReduce. However, it uses the Hadoop MapReduce infrastructure only to allocate nodes, using mappers simply as containers for Giraph workers. These map jobs run for the duration of the job, repeatedly executing the compute UDF and communicating with other mappers over sockets. To illustrate this data flow, the left side of Figure 1 shows the physical execution of Giraph with four workers W1 to W4. The execution in Giraph is organized into supersteps, wherein each worker operates in parallel during the superstep and the workers synchronize at the end of the superstep. During the InputSuperstep, the system splits the input graph into a list of vertices V and a list of edges E as it reads data from HDFS. Each worker reads the split assigned to it, parses it into vertices and edges, and partitions them across all workers, typically using a hash-based partitioner. Each worker then builds its ServerData, consisting of three components: (i) the partition store to keep the partition vertices and related metadata, (ii) the edge store to keep the partition edges and related metadata, and (iii) the message store to keep the incoming messages for this partition. At the end of the InputSuperstep, i.e., when all workers have finished creating the ServerData, the workers are ready to perform the actual vertex computation. In each superstep after the InputSuperstep, the workers run the vertexCompute UDF for the vertices in their respective partition and shuffle the outgoing messages across all workers. The workers then update their respective ServerData and wait for everyone to finish the superstep (the synchronize barrier). Finally, when there are no more messages to process, the workers store the output graph back in HDFS during the OutputSuperstep. Thus, we see that similar to MapReduce execution in Hadoop [12], Giraph has a static hard-coded query execution pipeline.

Figure 1: Giraph Physical Execution Pipeline and its Logical Representation.

2.3 Logical Query Plan

The Giraph execution pipeline can be represented as a logical query plan, consisting of relational operators and the vertexCompute UDF. The right side of Figure 1 shows such a simplified logical query plan. Assuming that the graph structure itself remains unchanged (which is true for several typical vertex-centric graph analyses, such as PageRank, shortest paths, and connected components), the Giraph execution pipeline is essentially a distributed vertex update query. That is, it takes the set of vertices V (each having an id and a value), edges E (each having a source and a destination vertex id), and messages M (each having a destination vertex id and the message value), runs the vertexCompute UDF for each vertex, and produces the set of output vertices (V') and messages (M'), as shown on the right in Figure 1.

The downside of the above vertex-centric query execution in Giraph is that all graph analysis is forced to fit into a fixed query plan. This is not desirable for several analyses. For example, triangle counting, which requires a three-way join over the edges table, is very difficult to fit in this model. Moreover, the Giraph logical query plan is not really implemented as a composition of query operators, making it very difficult to modify, extend, or add functionality to the execution pipeline. For instance, the join with M is implemented as a sort merge join; changing to another join implementation would require several deep changes in the system. Furthermore, even if one could extend or modify the physical execution pipeline, e.g., switch merge join to hash join, Giraph cannot make dynamic decisions regarding the best physical plan. For example, hash join may be suitable for very large numbers of intermediate messages and merge join better for small numbers of messages. Giraph does not have this flexibility. Finally, Giraph is a custom-built query processor restricted to a specific type of graph analysis. It cannot be used for broader types of queries, e.g., multi-hop analysis, or end-to-end graph analysis, e.g., analyzing the output of graph analysis, or combining multiple graph analyses.

In the rest of the paper, we show how relational databases can overcome many of these limitations and, in particular, how Vertica is highly suited for a variety of graph analytics.

3. GRAPH ANALYTICS USING VERTICA

In this section, we describe how vertex-centric queries can be run in SQL on a relational database like Vertica. The goal of this section is to show how we can: (i) translate vertex-centric graph analyses to standard SQL queries, (ii) apply several query optimizations to improve the performance of graph analyses, and (iii) leverage key features of Vertica for efficiently executing these graph analytics queries.

3.1 Translation to SQL

In the following, we describe how we rewrite and translate the Giraph logical query plan to standard SQL.

Figure 2: Rewriting Giraph Logical Query Plan.

Figure 3: Logical query plans for three vertex-centric queries: (i) single source shortest path (SSSP), (ii) connected components (CC), and (iii) PageRank.

3.1.1 Eliminating the message table

Consider again the Giraph logical query plan, shown on the left in Figure 2. The relation M in this plan is an intermediate output and an artifact of message passing, a consistency mechanism in Giraph. Since relational databases take care of consistency and allow us to operate directly on the relational tables, we can get rid of M. Note that the relation M is used to communicate the new values of vertices to their neighbors. Therefore, we can push down the vertex compute function and obtain the new messages M' by joining the new vertex values (V') with the outgoing edges (E), as shown in the middle of Figure 2. Finally, we can replace M with V ⋈ E and get rid of relation M completely, as shown on the right in Figure 2. This simplified query plan deals only with relations V and E as the input and produces the modified relation V' as output.

3.1.2 Translating the vertex compute functions

The vertexCompute in the Giraph logical query plan (Figure 1) can be an arbitrary user defined function, similar to map/reduce in the MapReduce framework. However, for many graph analytics, the vertex function involves relatively simple and well defined aggregate operations, which can be expressed directly in relational algebra/SQL. For example, the vertex function for SSSP in Listing 1 finds the MIN of the neighboring distances and applies the filter for detecting smaller distances, i.e.:

SSSP: vertexCompute ↦ σ_{d' < d} ( Γ_{d' = min(V2.d + 1)} )

That is, the aggregate computes the minimum distance over the incoming neighbors (plus one) and the selection keeps only those new distances that are smaller than the already known distance. The resulting vertices can then be used to update the vertex relation, as shown in Listing 2.

UPDATE vertex AS v SET v.d = v'.d
FROM (
  SELECT v1.id, MIN(v2.d+1) AS d
  FROM vertex AS v1, edge AS e, vertex AS v2
  WHERE v2.id = e.from_node AND v1.id = e.to_node
  GROUP BY e.to_node, v1.d
  HAVING MIN(v2.d+1) < v1.d
) AS v'
WHERE v.id = v'.id;
Listing 2: Shortest Path in SQL.

Finally, a driver program (run via a stored procedure) repeatedly runs the above shortest path query as long as there are any updates, i.e., at least one of the vertices finds a shorter distance.

3.2 Query Optimizations

The advantage of expressing graph analysis as relational queries is that we can apply several relational query optimizations, i.e., we have the flexibility to optimize the queries in several different ways in order to boost performance, in contrast to the hard-coded execution pipeline in Giraph. In the following, we present three such query optimizations that can be used to tune the performance. We use vertex-centric single source shortest paths as the running example. However, of course, these optimizations are applicable in general to SQL-based graph analytics.

3.2.1 Update Vs Replace

Graph queries often involve updating large portions of the graph over and over again. However, a large number of updates can be a barrier to good performance, especially in read-optimized systems like Vertica. To overcome this problem, we can instead replace the vertex or edge table with a new copy of the table containing the updated values. For instance, single source shortest path involves updating all vertices that find a smaller distance in an iteration. As we explore the graph in parallel, the number of such vertices can quickly grow very large. Therefore, instead of updating the vertices in the existing vertex relation, we can create a new vertex relation containing the updated values.
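To make the replace strategy concrete, below is a minimal sketch (not taken from the paper) of one SSSP iteration using the replace approach, assuming the same vertex(id, d) and edge(from_node, to_node) schema as Listing 2. The fresh table is built from scratch and then swapped in with a drop-and-rename, the same pattern the paper later uses for PageRank in Section 4.1:

-- Build a fresh vertex table with the distances for this iteration.
CREATE TABLE vertex_prime AS
SELECT v1.id,
       CASE WHEN MIN(v2.d + 1) < v1.d THEN MIN(v2.d + 1) ELSE v1.d END AS d
FROM vertex AS v1
LEFT JOIN edge   AS e  ON v1.id = e.to_node
LEFT JOIN vertex AS v2 ON v2.id = e.from_node
GROUP BY v1.id, v1.d;

-- Swap the new copy in place of the old vertex relation.
DROP TABLE vertex;
ALTER TABLE vertex_prime RENAME TO vertex;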


Vertica supports pipelined query execution, which avoids materializing intermediate results that would otherwise require repeated access to disk. This is important because graph queries involve join operations that can have large intermediate results, which can benefit dramatically from pipelining. For instance, in each iteration, single source shortest path joins a vertex with its incoming edges and incoming nodes, thereby resulting in an intermediate result with cardinality equal to the number of edges. We can induce pipelining for such queries by creating sort orders on join and group by attributes. Additionally, we can express graph operations as nested queries, allowing the query optimizer to employ pipelining between the inner and outer query when possible. This is in contrast to Giraph, which blocks the execution and materializes all intermediate output before running the vertex compute function. Thus, pipelining allows Vertica to avoid materializing large intermediate outputs, which are typical in graph queries. This reduces memory footprint and improves performance.

3.3.4 Intra-query Parallelism

Vertica includes capabilities that allow it to employ multiple cores to process a single query. To allow Vertica to explore graphs in parallel as much as possible, we rewrite graph exploration queries that involve a self-join on the edges table by adding a GROUP BY clause on the edge id, and let Vertica partition the groups across CPU cores to process subgraphs in parallel. Though such parallel graph exploration ends up doing more work in each iteration, it still reduces the number of joins and results in much better performance.

3.3.5 Example SSSP Query Execution on Vertica

We now look at a specific example of the physical query execution plan for single source shortest path (SSSP) on Vertica. Figure 5 shows the plan for running SSSP over the Twitter graph (41 million nodes, 1.4 billion edges). The query involves two joins, one hash and the other sort merge. To perform the hash join, the system broadcasts the smaller node relation (shown in the middle). While scanning the large edge relation, the query applies two SIP (sideways information passing) filters, one for the hash join condition and the other for the merge join condition, in order to filter unnecessary tuples at the scan step itself. The hash join blocks on the smaller node relation. However, once the hash table is built, the remainder of the query is fully pipelined, including the merge join, the group by, and writing the final output. This is possible because the merge join and the group by are on the same key. As a result, data does not need to be re-segmented for the group by and the system performs 1-pass aggregation locally on each machine.

Figure 5: Query execution plan of shortest paths in Vertica. (Base query: EXPLAIN SELECT e.to_node AS id, MIN(n1.value+1) AS value FROM twitter_node AS n1, twitter_edge AS e, twitter_node AS n2 WHERE n1.id = e.from_node AND n2.id = e.to_node GROUP BY e.to_node HAVING MIN(n1.value+1) < MIN(n2.value);)

The above query execution plan is different from the Giraph query execution pipeline of Figure 1 in three ways: (i) it filters the unnecessary tuples from the large edge table as early as possible by using sideways information passing, (ii) it fully pipelines the query execution as opposed to blocking the data flow at the vertex function in Giraph, and (iii) it picks the best join execution strategies and broadcasts the data wherever required as compared to the static hard-coded join implementation in Giraph. As a result, Vertica is able to produce better execution strategies for such graph queries.

3.4 Extending Vertica

Relational databases are extensible by design via the use of UDFs. We see how we can extend Vertica to address two issues: (i) how to run unmodified vertex programs without translating to SQL, and (ii) avoiding the expensive intermediate disk I/O in iterative graph queries.

Figure 6: Rewriting Giraph Query Plan using Table UDFs.

3.4.1 Running Unmodified Vertex Programs

We saw in Section 3.1 that several common vertex functions can be rewritten as relational operators. However, certain algorithms, such as collaborative filtering, have more sophisticated vertex function implementations which cannot easily be mapped to SQL operators. We can, however, still run vertex functions as table UDFs in Vertica without translating to relational operators. The middle of Figure 6 shows such a logical query plan. We first partition the vertices, then sort the vertices in each partition, and finally invoke the table UDF for each partition. The table UDF iterates over each vertex, invokes the vertexCompute function over it, and outputs the union of updated vertices (V') and outgoing messages (M'). As an optimization, by batching several vertices in each table UDF, we can significantly reduce the UDF overhead in relational databases. This query plan can be further improved by replacing the table joins with unions, as shown on the right of Figure 6. The table UDF is then responsible for segregating the tuples from different tables before calling the vertex function.
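The paper does not give the exact SQL for invoking the vertex UDF, but the plan in the middle of Figure 6 roughly corresponds to a query of the following shape. This is a hedged sketch that assumes a hypothetical Vertica transform (table) UDF named vertexCompute, a precomputed partition-id column pid on the vertex table, and tables vertex(id, value, pid), edge(from_node, to_node), and message(to_node, value):

-- Partition the joined input by pid, sort each partition by vertex id, and let
-- the transform UDF emit the union of updated vertices (V') and messages (M').
SELECT vertexCompute(id, value, to_node, msg)        -- hypothetical transform UDF
       OVER (PARTITION BY pid ORDER BY id)
FROM (
  SELECT v.id, v.value, v.pid, e.to_node, m.value AS msg
  FROM vertex AS v
  LEFT JOIN edge    AS e ON e.from_node = v.id       -- outgoing edges of each vertex
  LEFT JOIN message AS m ON m.to_node   = v.id       -- incoming messages for each vertex
) AS t;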

3.4.2 Avoiding Intermediate Disk I/Os

Iterative queries generate a significant amount of intermediate data in each iteration. Since relational databases run iterative queries via an external driver program, the output of each iteration is spilled to disk, thereby resulting in substantial additional I/O. This I/O also happens when running vertex functions as table UDFs. However, we can implement a special table UDF in which the UDF instances load the entire graph at the beginning and store the graph in shared memory, without writing the output of each iteration to disk (emulating the Giraph-like map-only behavior using table UDFs in Vertica). The right side of Figure 6 shows such a query plan. Of course, this approach has less I/O at the cost of a higher memory footprint. And since the entire graph analysis runs as a single transaction, many of the database overheads such as locking, logging, and buffer lookups are further reduced. However, the UDF is now responsible for materializing and updating V ⋈ E, as well as propagating the messages from one iteration to the other (the synchronization barrier). Still, once implemented (our current implementation runs on multiple cores of a single node; future work will look at distributing it across several nodes), the shared memory extension allows users to run unmodified vertex programs (or those which are difficult to translate to SQL). In some cases it can also yield a significant speed-up (up to 2.6 times) over even the native SQL variants (as shown in our experiments).

Figure 7: In-memory Vertex-centric Query Execution in Vertica.

4. EXPERIMENTS

In this section, we describe the experiments we performed to analyze and benchmark the performance of Vertica (version 6.1.2) on graph analytics, and to compare it to dedicated graph-processing systems (Giraph and GraphLab). We organize our experiments as follows. First, we look at the performance of several typical graph queries over large billion-edge graphs. Second, we dig deeper and look at the memory footprint and disk I/Os incurred and analyze the differences. Third, we evaluate our in-memory table UDF implementation of vertex-centric programs over different graph analyses. Fourth, we study end-to-end graph processing, comprising graph algorithms combined with relational operators to sub-select portions of the graph prior to running analytics, project out graph structure from graph meta-data, and perform aggregations and joins after the graph analysis completes. Finally, we look at more complex graph analytics beyond vertex-centric analysis, namely 1-hop analysis, and evaluate the utility of Vertica on these operations.

Hardware. Our test bed consists of a cluster of 4 machines, each having 12 (6x2) 2 GHz Xeon cores running 24 threads with hyper-threading, 48GB of memory, and a 1.4TB disk, running RHEL Santiago 6.4. We ran all experiments with a cold cache and report the average of three runs.

Baselines. We compare Vertica with two popular vertex-centric graph processing engines, Giraph (version 1.0.0 running on Hadoop 1.0.4 with 4 workers per node) and GraphLab (version 2.2 running on 4 nodes via MPI and using all available threads).

Datasets. We ran our benchmarks on a variety of datasets of varying sizes, including both directed as well as undirected graphs. Table 1 shows the different datasets used in our evaluation. All datasets are publicly available at http://snap.stanford.edu/data.

Type        Name               Nodes        Edges
Directed    Twitter-small      81,306       1,768,149
Directed    GPlus              107,614      13,673,453
Directed    LiveJournal        4,847,571    68,993,773
Directed    Twitter            41,652,230   1,468,365,182
Undirected  YouTube            1,134,890    2,987,624
Undirected  LiveJournal-undir  3,997,962    34,681,189
Table 1: The datasets used in the evaluation.

Data Preparation. The queries in our experiments must read data from an underlying data store before running the analysis. While Vertica reads data from its internal data store, Giraph and GraphLab read the data from HDFS. All datasets are stored as a list of nodes and a list of edges. For GraphLab, we further split the data files into 4 parts, such that each node can load (ingress=grid) the graph in parallel during analysis. Table 2 summarizes the data preparation costs for the three systems. We can see that Giraph and GraphLab simply copy the raw files to HDFS and load faster than Vertica. However, Vertica has significantly less disk usage, due to compression and encoding, compared to Giraph and GraphLab.

Metric             Dataset      Vertica   GraphLab  Giraph
Upload Time (sec)  LiveJournal  45.927    15.621    12.049
Upload Time (sec)  Twitter      916.421   472.358   267.799
Disk Usage (GB)    LiveJournal  0.423     3.030     3.030
Disk Usage (GB)    Twitter      9.964     73.140    73.140
Table 2: Data preparation over two datasets.

4.1 Typical Vertex-centric Analysis

We first look at the performance of three typical graph queries, namely PageRank, SSSP, and connected components, on Vertica and compare it with Giraph and GraphLab. Then, we break the total query time into the time to load/store from disk and the actual graph analysis time. We used the built-in PageRank, SSSP, and connected components algorithms for Giraph (provided as example algorithms) and GraphLab (provided in the graph analytics toolkit). For Vertica, we implemented these three algorithms as described below.

PageRank. We implemented the PageRank query shown in Figure 4 as a combination of two SQL statements on the Vertex (V) and Edge (E) tables: (i) to compute the outbound PageRank contributed by every vertex:

CREATE TABLE V_outbound AS
SELECT id, value/COUNT(to_id) AS new_value
FROM E, V
WHERE E.from_id = V.id
GROUP BY id, value;

and (ii) to compute the total PageRank of a vertex as a sum of incoming PageRanks:

CREATE TABLE V_prime AS
SELECT to_id AS id, 0.15/N + 0.85*SUM(new_value) AS value
FROM E, V_outbound
WHERE E.from_id = V_outbound.id
GROUP BY to_id;

After each iteration, we replace the old vertex table V with V_prime, i.e., drop V and rename V_prime to V.

7 PageRankPageRankPageRank (per (per superstep(per superstep superstep for for Twitter) forTwitter) Twitter) PageRankPageRankPageRank (total (total (totalof of10 10 ofiterations) 10iterations) iterations) PageRankPageRankPageRank (3 (3iterations) iterations)(3 iterations) LineRankLineRankLineRank (per (per iteration)(per iteration) iteration) LiveJournalLiveJournalLiveJournalTwitterTwitterTwitter InputInputInput 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 910 10Outp10OutpOutp ut ut ut GraphLabGraphLabGraphLab 41.4441.4441.44493.511493.511493.511 GiraphGiraphGiraph143.529143.529143.52966.65066.65066.650103.600103.600103.60083.30883.30883.30887.06787.06787.06781.50181.50181.50185.20885.20885.20883.54883.54883.54879.21279.21279.21281.44581.44581.44585.85685.85685.85615.74715.74715.747 LiveJournalLiveJournalLiveJournalTwitterTwitterTwitter LiveJournalLiveJournalLiveJournalTwitterTwitterTwitter GiraphGiraphGiraph 126.82126.82126.821040.811040.811040.81 VerticaVerticaVertica 0 07.1638787.1638780 7.163878100.724782100.724782100.72478276.58791876.58791876.58791877.87374777.87374777.87374777.55075577.55075577.55075576.98139676.98139676.98139676.45138576.45138576.45138576.79525276.79525276.79525278.10283378.10283378.10283376.88100276.88100276.8810020 0 0 GiraphGiraphGiraph 91.301666691.301666691.3016666553.67333553.67333553.67333 GiraphGiraphGiraph 666666766666676666667333333333333333333333333333 VerticaVerticaVertica 62.2306362.2306362.23063643.535643.535643.535 VerticaVerticaVertica 18.698666618.698666618.6986666193.0605193.0605193.0605 VerticaVerticaVertica 60.650005260.650005260.65000521453.4187751453.4187751453.418775 ShortestShortestShortest Path Path Path(per (per superstep(per superstep superstep for for Twitter) forTwitter) Twitter) 666666766666676666667 ShortestShortestShortest Path-1 Path-1 Path-1 600600600 800800800 InputInputInput 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6 7 7 7 8 8 8 9 9 910 10 1011 11 1112 12 1213 13 1314 14 1415 15 1516 16 1617 17Outp17OutpOutp 553.7553.7553.7 ut ut ut GiraphGiraphGiraph GiraphGiraphGiraph LiveJournalLiveJournalLiveJournalTwitterTwitterTwitter GiraphGiraphGiraph148.048148.048148.0483.3303.3303.3301.1371.1371.1371.3841.3841.3847.3057.3057.30543.35743.35743.35715.78615.78615.7869.1469.1469.1462.4952.4952.4951.4161.4161.4161.2351.2351.2355.3515.3515.3511.7231.7231.7231.3671.3671.3671.1731.1731.1731.1261.1261.1263.0393.0393.0391.0881.0881.08813.74413.74413.744 VerticaVerticaVertica VerticaVerticaVertica GraphLabGraphLabGraphLab 30.25330.25330.253392.107392.107392.107 VerticaVerticaVertica 0 013.00768913.0076890 13.00768927.23340927.23340927.2334098.9366538.9366538.93665320.16783720.16783720.16783720.92998220.92998220.92998243.52610143.52610143.52610131.52689331.52689331.52689318.73158418.73158418.73158415.78384515.78384515.78384515.08415915.08415915.0841597.741647.741647.7416410.35488310.35488310.3548839.0573979.0573979.0573978.5847888.5847888.58478810.31774810.31774810.31774810.04499510.04499510.0449955.9957955.9957955.995795 450450450 600600600 GiraphGiraphGiraph 66.56066.56066.560303.889303.889303.889 ShortestShortestShortest Path Path Path(per (per superstep(per superstep superstep for for Twitter)-1 forTwitter)-1 Twitter)-1 VerticaVerticaVertica 39.73290139.73290139.732901279.364689279.364689279.364689 PageRankPageRankPageRankPageRankPageRank (per superstep (per (per (per (per superstep superstep superstep superstep for Twitter) for for for Twitter)for Twitter) Twitter) Twitter) PageRankPageRankPageRankPageRankPageRank (total (totalof(total 
(total10(total ofiterations)of of 1010of 10 iterations)10iterations) iterations) iterations) PageRankPageRankPageRankPageRankPageRank (3 iterations) (3 (3iterations) (3 (3iterations) iterations) iterations) LineRankLineRankLineRankLineRank (perLineRank iteration) (per (per (per iteration)(per iteration) iteration) iteration) LiveJournalLiveJournalLiveJournalTwitterTwitterTwitter InputInputInputInputInput 1 1 11 12 2 22 23 3 33 43 4 44 54 5 55 65 6 66 76 7 77 87 8 88 98 9 9910910 10101110Outp11Outp1112 12 1213 13 1314 14 1415 15 1516 16 1617 17 1718 18 1819 19 1920 20 2021 21 2122 22 2223 23 2324 24 2425 25 2526 26 2627 27 2728 28 2829 29 2930 30 3031 31 3132 32 3233 33 3334 34 3435 35 3536 36 3637 37 3738 38 3839 39 3940 40 4041 41 4142 42 4243 43 4344 44 4445 45 4546 46 4647 47 4748 48 4849 49 4950 50 5051 51 5152 52 5253 53 5354 54 5455 55 5556 56 5657 57 5758 58 5859 59 5960 60 6061 61 6162 62 6263 63 6364 64 6465 65 6566 66 6667 67 6768 68 6869 69 6970 70 7071 71 7172 72 7273 73 7374 74 7475 75Outp75OutpOutp 300300300 400400400 ConnectedConnectedConnectedLiveJournal ComponentLiveJournal Component ComponentTwitterTwitter Input InputInput1 1 2 1 2 3 2 3 4 3 4 5 4 5 6 5 6 7 6 7 8 7 8 9 8 910 9Outp10 10OutpOutp ututut ut ut ut GraphLabGraphLabGraphLab 41.4441.4441.44 493.511493.511493.511 GiraphGiraphGiraph ut ut GraphLabGraphLab 41.44 41.44493.511493.511 GiraphGiraphGiraph147.582147.582143.529147.582143.529143.52961.60861.60866.65061.60866.65066.65062.57162.571103.60062.571103.600103.60018.63118.63183.30818.63183.30883.30810.81110.81187.06710.81187.0672.15887.0672.15881.5012.15881.5011.45181.5011.45185.2081.45185.2081.26485.2081.26483.5481.26483.54883.5481.371.3779.2121.3779.2121.43979.2121.43981.4451.43981.4451.29381.4451.29385.8561.29385.8561.18985.8561.18915.7471.18915.7475.82415.7475.8245.8241.0681.0681.0681.0341.0341.0341.0911.0911.0911.1441.1441.1441.0141.0141.0140.9460.9460.9460.9570.9570.9571.0481.0481.0481.0871.0871.0871.0171.0171.0171.0971.0971.0971.0391.0391.0390.970.970.971.0211.021LiveJournal1.0211.0381.0381.0381.0561.056Twitter1.0560.9870.9870.9871.0071.0071.0070.9680.9680.9681.0071.0071.0071.0071.0071.0070.990.99LiveJournal0.990.9870.9870.9871.0211.021Twitter1.0211.0211.0211.0210.9640.9640.9641.1821.1821.1821.1311.1311.1311.031.031.030.9450.9450.9450.9250.9250.9251.0621.0621.0621.0381.0381.0381.0321.0321.0320.9540.9540.9541.0061.0061.0061.0421.0421.0421.0241.0241.0240.8870.8870.8871.1181.1181.1180.9880.9880.9880.9840.9840.9840.9560.9560.9561.0241.0241.0241.0261.0261.0260.920.920.920.920.920.921.0821.0821.0821.0281.0281.0281.0481.0481.0481.0441.0441.0440.9620.9620.9620.9790.9790.9791.0361.0361.0360.9460.9460.9461.0041.0041.0040.9660.9660.9660.9510.9510.9511.0091.0091.0090.960.960.960.9190.9190.9191.0961.0961.0961.0191.0191.0195.5035.5035.503 GiraphGiraph143.529143.52966.65066.650103.600103.60083.30883.30887.06787.06781.50181.50185.20885.20883.54883.54879.21279.21281.44581.44585.85685.85615.74715.747 LiveJournalLiveJournalLiveJournalLiveJournalTwitterTwitterTwitterTwitter LiveJournalLiveJournalLiveJournalLiveJournalTwitterTwitterTwitterTwitter 193.1193.1193.1 GiraphGiraphGiraph LiveJournalLiveJournalLiveJournal126.82126.82Twitter126.82TwitterTwitter1040.811040.811040.81 VerticaVerticaVerticaVerticaVerticaVertica 
5.66030105.660301007.1638785.6603017.1638787.16387888.86565488.865654100.72478288.865654100.724782100.72478277.75300777.75300776.58791877.75300776.58791876.58791870.8516370.8516377.87374770.8516377.87374762.43566677.87374762.43566677.55075562.43566677.55075520.99962177.55075520.99962176.98139620.99962176.9813968.72108876.9813968.72108876.4513858.72108876.45138511.35781476.45138511.35781476.79525211.35781476.79525211.19492576.79525211.19492578.10283311.19492578.1028339.86509878.1028339.86509876.8810029.86509876.8810025.3888876.8810025.388885.388880 00 Time (seconds) Time (seconds) Time Time (seconds) Time (seconds) Time GiraphGiraph 126.82126.821040.811040.81 (seconds) Time (seconds) Time VerticaVertica 0 7.1638780 7.163878100.724782100.72478276.58791876.58791877.87374777.87374777.55075577.55075576.98139676.98139676.45138576.45138576.79525276.79525278.10283378.10283376.88100276.8810020 0 GiraphGiraphGiraphGiraphGiraph91.301666691.301666691.301666691.301666691.3016666553.67333553.67333553.67333553.67333553.67333 GiraphGiraphGiraphGiraphGiraph GraphLabGraphLabVerticaVerticaGraphLabVertica 42.20942.20962.2306362.2306342.20962.23063462.632462.632643.535462.632643.535643.535 66666676666667666666766666676666667333333333333333333333333333333333333333333333 150150150 200200200 VerticaVertica 62.2306362.23063643.535643.535 VerticaVerticaVerticaVerticaVertica18.698666618.698666618.698666618.698666618.6986666193.0605193.0605193.0605193.0605193.0605 VerticaVerticaVerticaVerticaVertica60.650005260.650005260.650005260.650005260.65000521453.4187751453.4187751453.4187751453.4187751453.418775 91.391.391.3 666666766666676666667 GiraphGiraphGiraph 82.042582.042582.0425437.904437.904437.904 ShortestShortestShortestShortest ShortestPath (perPath Path Path Path superstep (per (per (per (per superstep superstep superstep superstep for Twitter) for for forTwitter) for Twitter) Twitter) Twitter) 66666676666667 VerticaVertica 62.03949762.039497377.965377.965 18.718.718.7 VerticaShortestShortestShortestShortestShortest Path-162.039497 Path-1Path-1 Path-1 Path-1377.965 0 600600600 0 800800800 Input InputInputInputInput1 112 11 223 22 334 33 445 44 556 55 667 66 778 77 889 88 9109 99 1010111010 1112111111 1212131212 1314131313 1415141414 1516151515 1617161616Outp171717Outp17OutpOutpOutp 60006000 553.7553.7553.7 8000 8000 ututut GiraphGiraphGiraph553.7553.7 GiraphGiraphGiraph GiraphGiraphGiraph ut ut UntitledUntitledGiraphUntitledGiraph 1 1 1 UntitledUntitledUntitled 2 2 2 UntitledUntitledUntitled Giraph1 1 Giraph1UntitledUntitledUntitled 2 2 2 LiveJournalLiveJournalLiveJournalLiveJournalLiveJournalTwitterTwitterTwitterTwitterTwitter GiraphGiraph148.048148.048148.0483.330148.048148.0483.3303.3301.1373.3303.3301.1371.1371.3841.1371.1371.3841.3847.3051.3841.3847.3057.30543.3577.3057.30543.35743.35715.78643.35743.35715.78615.7869.14615.78615.7869.1469.1462.4959.1469.1462.4952.4951.4162.4952.4951.4161.4161.2351.4161.4161.2351.2355.3511.2351.2355.3515.3511.7235.3515.3511.7231.7231.3671.7231.7231.3671.3671.1731.3671.3671.1731.1261.1731.1731.1731.1263.0391.1261.1261.1263.0391.0883.0393.0393.0391.08813.7441.0881.0881.08813.74413.74413.74413.744 VerticaVerticaVertica 0 
0013.00768913.00768913.00768927.23340927.23340927.2334098.9366538.9366538.93665320.16783720.16783720.16783720.92998220.92998220.92998243.52610143.52610143.52610131.52689331.52689331.52689318.73158418.73158418.73158415.78384515.78384515.78384515.08415915.08415915.0841597.741647.741647.7416410.35488310.35488310.3548839.0573979.0573979.0573978.5847888.58478810.31774810.31774810.31774810.04499510.04499510.0449955.9957955.9957955.995795 VerticaVerticaVerticaVerticaVertica VerticaVerticaVerticaVerticaVertica GraphLabGraphLabGraphLabGraphLabGraphLab 30.25330.25330.25330.25330.253392.107392.107392.107392.107392.107 VerticaVertica 0 13.0076890 13.00768927.23340927.2334098.9366538.93665320.16783720.16783720.92998220.92998243.52610143.52610131.52689331.52689318.73158418.73158415.78384515.78384515.08415915.0841597.741647.7416410.35488310.3548839.0573979.0573978.5847888.58478810.3177488.58478810.31774810.04499510.0449955.9957955.995795 450450450 600600600 GiraphGiraphGiraph 66.56066.56066.560 303.889303.889303.889 450 450 600 600 Giraph1,200Giraph1,2001,200 66.56066.560303.889303.889 500500500 500500500 462.6462.6462.6 ShortestShortestShortest Path Path Path (per (per (per superstep superstep superstep for for for Twitter)-1 Twitter)-1 Twitter)-1 VerticaVerticaVerticaVerticaVertica 39.73290139.73290139.73290139.732901GraphLab39.732901GraphLab279.364689GraphLab279.364689279.364689279.364689279.364689 1040.81040.81040.8 GraphLabGraphLabGraphLab GraphLabGraphLabGraphLab 437.9437.9437.9 ShortestShortest Path (perPath superstep (per superstep for Twitter)-1 for Twitter)-1 GiraphGiraphGiraph GiraphGiraphGiraph 392.1392.1392.1 GiraphGiraphGiraph 378.0378.0378.0 900900900 InputInputInput3753753751 11 2 22 3 33 4 44 5 55 6 66 7 77 8 88 9 99 101010375375111111375121212 131313 141414 151515 161616 171717 181818 191919 202020 212121 222222 232323 242424 252525 262626 272727 282828 292929 303030 313131 323232 333333 343434 353535 363636 373737 383838 393939 404040 414141 424242 434343 444444 454545 464646 474747 484848 494949 505050 515151 525252 535353 545454 555555 565656 575757 585858 595959 606060 616161 626262 636363 646464 656565 666666 676767 686868 696969 707070 717171 727272 737373 747474 757575OutpOutpOutp 300300300 400400400 ConnectedConnectedConnectedConnectedConnected Component ComponentVerticaComponent Component VerticaComponentVertica Input Input 1 1 2 2 3 Vertica3 4VerticaVertica4 5 5 6 6 7 7 8 8 9 910 1011 1112 1213 13Vertica14VerticaVertica1415 1516 1617 1718 1819 1920 2021 2122 2223 2324 2425 2526 2627 2728 2829 2930 3031 3132 3233 3334 3435 3536 3637 3738 3839 3940 4041 4142 4243 4344 4445 4546 4647 4748 4849 4950 5051 5152 5253 5354 5455 5556 5657 5758 5859 5960 6061 6162 6263 6364 6465 6566 6667 6768 6869 6970 7071 7172 7273 7374 7475 Outp75 Outp 300 300 400 400 303.9303.9303.9 ut ut ututut Giraph 
Figure 8: Typical vertex-centric analysis using Vertica. Panels: (a) PageRank, (b) Shortest Path, (c) Connected Components; runtimes (seconds) of GraphLab, Giraph, and Vertica on the LiveJournal and Twitter graphs.

Figure 9: Per-iteration runtime of typical vertex-centric analysis on the Twitter graph. Panels: (a) PageRank, (b) Shortest Path, (c) Connected Components; per-iteration runtimes (seconds) of Giraph and Vertica, including the input and output steps.

Figure 10: Cost breakdown of typical vertex-centric analysis on the Twitter graph. Panels: (a) PageRank, (b) Shortest Path, (c) Connected Components; load/store time versus algorithm time for GraphLab, Giraph, and Vertica.

Single Source Shortest Path (SSSP). We implemented SSSP on Vertica using the query shown in Listing 4, i.e., we incrementally compute the distances and replace the old vertex table with the new one, unless the number of updates is less than 5,000, in which case we update the vertex table in-place.

Connected Components. We implemented the HCC algorithm [15] in Vertica (this is the same algorithm used in Giraph). HCC assigns each node to a component identifier. We initialize the vertex values with their ids, and in each iteration the vertex updates are computed (similar to SSSP) as follows:

CREATE TABLE v_update_prime AS
SELECT v1.id, MIN(v2.value) AS value
FROM v_update AS v2, edge AS e, vertex AS v1
WHERE v2.id=e.from_node AND v1.id=e.to_node
GROUP BY v1.id, v1.value
HAVING MIN(v2.value) < v1.value;

As in shortest paths, we apply these updates either in-place or by replacing the vertex table, depending upon the number of updates. Additionally, we apply two optimizations: (1) we do not perform incremental computation at first; rather, we update all vertices in the first few iterations, i.e., we use the entire vertex table instead of v_update in the above query; and (2) since the component ids can be propagated in either edge direction, we propagate them in opposite directions over alternate iterations, i.e., we change the join condition in the above query to v2.id=e.to_node AND v1.id=e.from_node. These optimizations significantly speed up the convergence of the algorithm, since component ids can be propagated quickly along either edge direction.
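Listing 4 is referenced above but not reproduced in this section. For concreteness, the following is a minimal sketch of what the incremental SSSP update step could look like under the same vertex and edge schema, assuming unweighted edges (distances measured in hops); it mirrors the connected-components query above, with MIN(v2.value + 1) in place of MIN(v2.value).

-- Sketch of the incremental SSSP update (not the paper's Listing 4 verbatim);
-- v_update holds the vertices whose distance changed in the previous iteration.
CREATE TABLE v_update_prime AS
SELECT v1.id, MIN(v2.value + 1) AS value
FROM v_update AS v2, edge AS e, vertex AS v1
WHERE v2.id=e.from_node AND v1.id=e.to_node
GROUP BY v1.id, v1.value
HAVING MIN(v2.value + 1) < v1.value;

With weighted edges, v2.value + 1 would become v2.value + e.weight.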
Figures 8(a), 8(b), and 8(c) show the query runtime of PageRank (10 iterations), SSSP, and Connected Components on GraphLab, Giraph, and Vertica over directed graphs using 4 machines. We can see that Vertica outperforms Giraph and is comparable to GraphLab on these three queries, both on the smaller LiveJournal graph and on the billion-edge Twitter graph. The reason is that these queries are full scan-oriented join queries for which Vertica is heavily optimized. A reader might suspect that our SQL implementations are thoroughly hand-crafted, which is indeed true: one of our goals in this paper is to show that if developers really care about their graph algorithms, they can tune them quite a bit in SQL engines. In Section 4.3, we evaluate our more general shared memory implementation to run arbitrary vertex-centric UDFs.

Figure 11: The memory footprint of different systems. Panels: (a) GraphLab, (b) Giraph, (c) Vertica; per-node memory footprint (GB) over the PageRank iterations on the Twitter graph.

Figure 12: The I/O footprint of different systems. Panels: (a) Input IO, (b) Output IO, (c) Total IO; accumulated read and write I/O (GB) per PageRank iteration for GraphLab, Giraph, and Vertica.

Note that the advantage of Vertica varies across benchmarks. For example, on the Twitter graph, Vertica has a 40% improvement over Giraph for PageRank, but only an 8% improvement for SSSP. To investigate this, consider the per-iteration runtimes of these two algorithms, as shown in Figures 9(a) and 9(b) (the toolkit implementation in GraphLab does not produce per-iteration runtimes). For PageRank, the iteration runtimes remain relatively unchanged, since all vertices are updated in each iteration. For SSSP, the iteration runtime rises to a peak and then falls, since the number of updated vertices first increases and then decreases. From Figures 9(a) and 9(b), we can also see that Giraph incurs a very high runtime for the input iteration, which is the cost of loading the raw data and transforming it into its internal graph representation; Vertica has no such input step. Both Vertica and Giraph have similar runtimes for iterations with a large number of vertex updates, e.g., PageRank iterations or the peak iteration in SSSP, due to a full scan over the dataset. However, Giraph has better runtimes for iterations with fewer vertex updates, e.g., the non-peak iterations of SSSP, due to its in-memory graph representation. Figure 9(c) shows the per-iteration runtimes of Vertica and Giraph for Connected Components. We can see that by propagating component ids in either edge direction, Vertica converges significantly faster.

Finally, Figures 10(a), 10(b), and 10(c) show the runtime breakdown of GraphLab, Giraph, and Vertica on the Twitter graph. We can see that the total runtime of GraphLab is dominated by the load/store costs. In fact, the analysis time for SSSP in GraphLab is just 25 seconds, a very small fraction of the overall cost. In contrast, the runtimes of Giraph are dominated by the analysis time. Vertica, on the other hand, pipelines the data from disk and therefore has no distinct load/store phase. However, efficient data storage and retrieval makes Vertica competitive in terms of the total runtime.

4.2 Resource Consumption

We now study the resource consumption, i.e., the memory footprint and disk I/O, of Vertica, Giraph, and GraphLab. Figures 11(a), 11(b), and 11(c) show the per-node memory consumption of GraphLab, Giraph, and Vertica when running PageRank on the Twitter graph using 4 machines. We can see that out of the total 48GB of memory per node, both GraphLab and Giraph have a peak memory usage of close to 32GB, i.e., 66% of the total memory. In contrast, Vertica has a peak usage of only 5.2GB, i.e., 11% of the total memory. Thus, Vertica has a much smaller memory footprint.

The picture changes completely if we look at the disk I/O. Figures 12(a) and 12(b) show the number of bytes read and written to disk by GraphLab, Giraph, and Vertica in each PageRank iteration over the Twitter graph. We can see that GraphLab and Giraph have high read I/O in the input step and no disk reads thereafter, whereas Vertica has no upfront read but incurs disk reads in each iteration. Likewise, Vertica incurs high write I/O in each iteration, whereas GraphLab and Giraph incur writes only in the output step. Thus, in total, while Vertica incurs less read I/O than Giraph (due to better encoding and compression), it incurs much more write I/O (due to materializing the output of each iteration to disk).
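To make this I/O pattern concrete, the following is a rough sketch, in generic SQL, of how an iteration of the disk-based implementations materializes its result. The statements and the temporary name vertex_prime are illustrative, paraphrasing the description of the SSSP driver above, and are not the exact code used in the paper.

-- Case 1: many vertices changed. Rebuild the vertex table from the updates;
-- this full materialization is the per-iteration write I/O visible in Figure 12(b).
CREATE TABLE vertex_prime AS
SELECT v.id, COALESCE(u.value, v.value) AS value
FROM vertex AS v LEFT JOIN v_update_prime AS u ON v.id=u.id;
DROP TABLE vertex;
ALTER TABLE vertex_prime RENAME TO vertex;

-- Case 2: fewer than roughly 5,000 vertices changed. Patch the vertex table in place.
UPDATE vertex
SET value = u.value
FROM v_update_prime AS u
WHERE vertex.id = u.id;

-- In either case, v_update_prime becomes v_update for the next iteration,
-- and the loop stops once no vertex values change.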
This expensive I/O for writing the iteration output also occurs when running vertex functions as table UDFs. To illustrate this, Figure 13(a) shows the cost breakdown when running PageRank as table UDFs. We can see that writing the intermediate output (shown in red) is the major cost in the query runtime.

4.3 In-memory Graph Analysis

To test whether we can avoid the expensive intermediate I/O in Vertica, we implemented the shared memory UDFs as described in Section 3.4.2. Figure 13(b) shows the runtime of the shared memory table UDF on the Twitter-small graph. We can see that the shared memory implementation is about 7.5 times faster than the disk-based table UDF and more than 2 times faster than the SQL implementation. Finally, Figure 13(c) shows the cost breakdown of the shared memory table UDF. We can see that the system spends most of the time inside the UDF and there is no per-iteration output cost. Thus, we see that developers can indeed sacrifice memory footprint for better I/O performance in Vertica.

Figure 13: Analyzing resource consumption and improving I/O performance in Vertica. Panels: (a) cost breakdown, (b) vertex analysis using shared memory, (c) shared memory cost breakdown.

Figure 14: In-memory vertex-centric graph processing. Panels: (a) PageRank, (b) Shortest Path, (c) Connected Components; load time and algorithm time for GraphLab and Vertica (shared memory).
We also tested the shared memory Vertica extension with the larger graphs and compared it with GraphLab on a single node. Figures 14(a), 14(b), and 14(c) show the results on the LiveJournal graph. We can see that the actual graph analysis time (Algorithm Time) of GraphLab and Vertica are now very similar. However, GraphLab still suffers from the expensive loading time, while Vertica benefits from more efficient data storage. As a result, the performance gap between GraphLab and Vertica widens. We also scaled SSSP to the billion-edge Twitter graph on the shared memory Vertica extension (single node). The single-node algorithm runtime in this case is 44.2 seconds, which is just 1.7 times that of GraphLab on 4 nodes (single-node GraphLab runs out of memory on the Twitter graph). Thus, we see that we can extend Vertica to exhibit performance characteristics similar to main-memory graph engines.

Finally, note that on Vertica the shared memory runtimes are better than the SQL runtimes (Figures 8(a), 8(b), and 8(c)) by 2.1 times on PageRank, 2.4 times on SSSP, and 2.6 times on Connected Components. This is largely because the shared memory system avoids the expensive disk I/O after each iteration.

4.4 Mixed Graph & Relational Analyses

Finally, we consider situations where users want to combine graph analysis with relational analysis. We extended the graph datasets in our experiment with the following metadata. For each node, we added 24 uniformly distributed integer attributes with cardinality varying from 2 to 10^9, 8 skewed (zipfian) integer attributes with varying skewness, 18 floating point attributes with varying value ranges, and 10 string attributes with varying size and cardinality. For each edge, we added three additional attributes: the weight, the creation timestamp, and an edge type (friend, family, or classmate), chosen uniformly at random. These attributes are meant to model the additional relational metadata that would be associated with properties of users in a social media context, or of web pages in a search engine. The total size of the Twitter graph with this metadata is 66GB. We consider the following end-to-end graph analyses:

(i) Sub-graph Projection & Selection. In this analysis, we extract the graph before performing the graph analysis. This includes projecting out just the node ids and discarding all other attributes, as well as extracting a subgraph of filtered nodes (e.g., attribute = 4) connected by edges of type 'Family'. We run PageRank and SSSP over such an extracted subgraph.

(ii) Graph Analysis Aggregation. In this analysis, we gather the distributions of PageRank (density of nodes by their importance) and distance values (density of neighbors by their distance). This includes computing equi-width histograms after running PageRank and SSSP, respectively.

(iii) Graph Joins. Finally, we combine the output of PageRank and Shortest Paths to emit those nodes which are either very near (path distance less than a given threshold) or relatively very important (PageRank greater than a given threshold).

Vertica is well suited for the relational operators in the above three analyses; a minimal SQL sketch of them is shown below. We compare against Giraph, which allows users to provide custom input/output formats that can be used to perform the projection and selection, and we write additional MapReduce jobs for the aggregation and join.
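To make the three analyses concrete, here is a minimal SQL sketch. The table and column names are hypothetical stand-ins (node(id, attribute), edge(from_node, to_node, type), and the PageRank and SSSP outputs stored as pagerank(id, value) and sssp(id, value)), and the bucket width and thresholds are placeholders rather than the values used in the experiments.

-- (i) Sub-graph projection & selection: keep only node ids, nodes with a given
--     attribute value, and edges of type 'Family'.
CREATE TABLE sub_edge AS
SELECT e.from_node, e.to_node
FROM edge AS e
JOIN node AS n1 ON n1.id=e.from_node
JOIN node AS n2 ON n2.id=e.to_node
WHERE e.type='Family' AND n1.attribute=4 AND n2.attribute=4;

-- (ii) Graph analysis aggregation: equi-width histogram over the PageRank output.
SELECT FLOOR(p.value / 0.0001) AS bucket, COUNT(*) AS num_nodes
FROM pagerank AS p
GROUP BY FLOOR(p.value / 0.0001)
ORDER BY bucket;

-- (iii) Graph join: nodes that are either very near the source or very important.
SELECT p.id
FROM pagerank AS p JOIN sssp AS s ON p.id=s.id
WHERE s.value < DIST_THRESHOLD OR p.value > RANK_THRESHOLD;

Here DIST_THRESHOLD and RANK_THRESHOLD stand for constants chosen by the analyst.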


Figure 15: Mixed graph and relational analysis. Panels: (a) Subgraph Projection & Selection, (b) Graph Analysis Aggregation, (c) Graph Join; runtimes (seconds) of Giraph and Vertica on the Twitter dataset.

sis is relational in nature. Rather, developers are required to stitch Query Dataset Vertica Giraph Youtube 259.56 230.01 Strong Overlap Giraph programs with programs in another data processing engine LiveJournal-undir 381.05 out of memory Youtube 746.14 out of memory such as Hadoop MapReduce or Spark. Similar results are obtained Weak Ties on aggregation and join queries. LiveJournal-undir 1,475.99 out of memory In summary, several data management operations, such as pro- Table 3: 1-hop Neighborhood Analysis. PageRankPageRankPageRank + + DM+ DM DM ShortestPathShortestPathShortestPath + + +DM DM DM jection, selection, aggregation, and join, fit naturally with graph processing systems. For instance, in Giraph, each vertex needs to ProjectionProjectionProjection SelectionSelectionSelection AggregatiAggregatiAggregatioperations.JoinJoinJoin To perform theseProjectionProjectionProjection operationsSelectionSelectionSelection efficiently,AggregatiAggregatiAggregati JoinJoinJoin we need sev- ononon ononon GiraphGiraphGiraph 226.643333226.643333226.643333 191.32191.32191.32287.90466287.90466287.90466eral430.13466430.13466430.13466 features fromGiraphGiraphGiraph relational146.963333146.963333146.963333 databases,139.71666139.71666139.71666 including186.82166186.82166186.82166 efficient430.13466430.13466430.13466 layouts, broadcast its neighborhood in the first superstep in order to access 333333333333333333333333 666666666666666666666666666666666666666666666666666666 333333333333333333333333 666666667666666667666666667666666666666666666666666666666666666666666666666666666 VerticaVerticaVertica 28.298666628.298666628.298666611.12666611.12666611.126666 30.65933330.65933330.659333indexes,84.16852384.16852384.168523 and statistics,VerticaVerticaVertica in the54.802866654.802866654.8028666 graph6.496790333333336.49679033333333 management6.4967903333333354.03505354.03505354.035053 system.84.16852384.16852384.168523 Column- the 1-hop neighborhood. Thereafter, we perform the actual analysis 666666766666676666667 666666676666666766666667 333333333333333333333333oriented333333333333333333333333 databases are already666666766666676666667 well equipped with these333333333333333333333333 techniques, in the subsequent supersteps. Unfortunately, this results in repli- whereas native graphs stores lack optimized implementations of cating the graph several times in memory. As an optimization in them. As a result500500500 Vertica excels on benchmarks that include a mix Giraph, we can reduce the number of messages by sending mes- of relational and graph operations. GiraphGiraphGiraph VerticaVerticaVertica 430.1430.1430.1sages only to those neighbors with a higher node id. In addition, we can reduce the size of messages by sending only the ids which 4.5 Beyond375375375 Vertex-centric Analyses are smaller than the id of the receiving vertex. Still, passing neigh- Let us now look at more advanced graph analytics. Consider the borhoods as messages is not natural in the vertex-centric model and incurs the overhead of serializing and deserializing the nodes ids. following two 1-hop250250250 neighborhood graph queries. Table3 shows the performance of Vertica and Giraph over the (i) Strong Overlap. Find all pairs of nodes having a large186.8186.8186.8 number two 1-hop neighborhood analyses running on 4 nodes. 
We can see of common neighbors,147.0 between147.0147.0 them.139.7139.7 For139.7 simplicity, we define Time (seconds) Time Time (seconds) Time Time (seconds) Time 125125125 that Giraph runs out of memory when scaling to larger graphs (even overlap as the number of common neighbors. However, we could 84.284.284.2 easily extend the algorithm to include54.854.854.8 other definitions of overlap.54.054.054.0 after allocating 12GB to each of the 4 workers on each of the 4 6.56.56.5 nodes). This is because the graphs are quite dense and sending Such an analysis to000 find the strongly overlapping nodes in a graph could be useful in detectingProjectionProjectionProjection similar entities.SelectionSelectionSelection To implementAggregationAggregationAggregation such theJoinJoinJoin entire 1-hop neighborhood results in memory usage proportion a query using SQL we do a self-join on theRelationalRelationalRelational edge table OperatorsOperators Operators followed to the graph size times the average out-degree of a node. Vertica, by a group by on the two leaf nodes and a count of the number of on the other hand, does not suffer from such issues and works for common neighbors between them. Finally, all those pairs of nodes larger graphs as well. which have less than the threshold number of common neighbors In summary, while Giraph is good for typical graph algorithms are filtered. such as PageRank and Shortest Paths, it is hard to program and SELECT e1.from_node as n1,e2.from_node as n2, count(*) poor in performance for advanced graph analytics such as finding FROM edge e1 nodes with strong overlap. Vertica, on the other hand is quite facile JOIN edge e2 ON e1.to_node=e2.to_node at such analysis and allows programmers to perform a wide range AND e1.from_node THRESHOLD (ii) Weak Ties. Find all nodes that act as a bridge between two oth- 5. RELATED WORK erwise disconnected node-pairs. The goal is to find all such weak Existing graph data management systems address two classes of ties which connect at least a threshold number of node pairs. This query workloads: (i) low latency online graph query processing, is a slightly more complicated query because we need to test for e.g. social network transactions, and (ii) offline graph analytics, disconnection between node pairs. Using SQL, this could be im- e.g. PageRank computation. Typical examples of systems for on- plemented as a three-way join, with the second join being a left line graph processing include RDF stores (such as Jena [16] and join, and counting all cases when the third edge does not exist. AllegeroGraph [17]) and key value stores (such as Neo4j [18] and SELECT e1.to_node AS Id, HypergraphDB [19]). Some recent graph processing systems wrap sum(CASE WHEN e3.to_node IS NULL THEN 1 ELSE 0 END)/2 AS C FROM edge e1 around relational databases to provide efficient online query pro- JOIN edge e2 ON e1.to_node=e2.from_node cessing. These include TAO [20] from Facebook and FlockDB [21] AND e1.from_node<>e2.to_node from Twitter, both of which wrap around MySQL to build a dis- LEFT JOIN edge e3 ON e2.to_node=e3.from_node AND e1.from_node=e3.to_node tributed and scalable graph processing system. Thus, low latency GROUP BY e1.to_node online graph query processing can be mapped to traditional online HAVING C > THRESHOLD processing in relational databases. 
(ii) Weak Ties. Find all nodes that act as a bridge between two otherwise disconnected node-pairs. The goal is to find all such weak ties which connect at least a threshold number of node pairs. This is a slightly more complicated query because we need to test for disconnection between node pairs. Using SQL, this could be implemented as a three-way join, with the second join being a left join, and counting all cases when the third edge does not exist.

SELECT e1.to_node AS Id,
       sum(CASE WHEN e3.to_node IS NULL THEN 1 ELSE 0 END)/2 AS C
FROM edge e1
JOIN edge e2 ON e1.to_node = e2.from_node
            AND e1.from_node <> e2.to_node
LEFT JOIN edge e3 ON e2.to_node = e3.from_node
                 AND e1.from_node = e3.to_node
GROUP BY e1.to_node
HAVING C > THRESHOLD

While the above queries are straightforward to implement and run on Vertica, they are tedious to implement in vertex-centric graph processing systems. For instance, in Giraph, each vertex needs to broadcast its neighborhood in the first superstep in order to access the 1-hop neighborhood. Thereafter, we perform the actual analysis in the subsequent supersteps. Unfortunately, this results in replicating the graph several times in memory. As an optimization in Giraph, we can reduce the number of messages by sending messages only to those neighbors with a higher node id. In addition, we can reduce the size of messages by sending only the ids which are smaller than the id of the receiving vertex. Still, passing neighborhoods as messages is not natural in the vertex-centric model and incurs the overhead of serializing and deserializing the node ids.

Table 3: 1-hop Neighborhood Analysis (runtime in seconds).

Query            Dataset              Vertica     Giraph
Strong Overlap   Youtube              259.56      230.01
Strong Overlap   LiveJournal-undir    381.05      out of memory
Weak Ties        Youtube              746.14      out of memory
Weak Ties        LiveJournal-undir    1,475.99    out of memory

Table 3 shows the performance of Vertica and Giraph over the two 1-hop neighborhood analyses running on 4 nodes. We can see that Giraph runs out of memory when scaling to larger graphs (even after allocating 12GB to each of the 4 workers on each of the 4 nodes). This is because the graphs are quite dense and sending the entire 1-hop neighborhood results in memory usage proportional to the graph size times the average out-degree of a node. Vertica, on the other hand, does not suffer from such issues and works for larger graphs as well.

In summary, while Giraph is good for typical graph algorithms such as PageRank and Shortest Paths, it is hard to program and poor in performance for advanced graph analytics such as finding nodes with strong overlap. Vertica, on the other hand, is quite facile at such analysis and allows programmers to perform a wide range of such analyses.
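As one example of such an ad-hoc variation, the weak-ties query above can be turned into a top-k report by replacing the HAVING clause with an ORDER BY and LIMIT; this is an illustrative sketch rather than a query from the evaluation.

-- Top-10 weak ties, ranked by the number of disconnected pairs they bridge.
SELECT e1.to_node AS Id,
       sum(CASE WHEN e3.to_node IS NULL THEN 1 ELSE 0 END)/2 AS C
FROM edge e1
JOIN edge e2 ON e1.to_node = e2.from_node
            AND e1.from_node <> e2.to_node
LEFT JOIN edge e3 ON e2.to_node = e3.from_node
                 AND e1.from_node = e3.to_node
GROUP BY e1.to_node
ORDER BY C DESC
LIMIT 10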

5. RELATED WORK

Existing graph data management systems address two classes of query workloads: (i) low-latency online graph query processing, e.g. social network transactions, and (ii) offline graph analytics, e.g. PageRank computation. Typical examples of systems for online graph processing include RDF stores (such as Jena [16] and AllegroGraph [17]) and key-value stores (such as Neo4j [18] and HypergraphDB [19]). Some recent graph processing systems wrap around relational databases to provide efficient online query processing. These include TAO [20] from Facebook and FlockDB [21] from Twitter, both of which wrap around MySQL to build a distributed and scalable graph processing system. Thus, low-latency online graph query processing can be mapped to traditional online processing in relational databases.

Graph analytics, on the other hand, is seen as completely different from traditional data analytics, typically due to its scale and iterative nature. As a result, a plethora of graph processing systems have been recently proposed. These include vertex-centric systems, e.g. Pregel [1], Giraph [2], GraphLab [3] and its extensions [22, 23, 24], GPS [10], Trinity [4], GRACE [11, 25], and Pregelix [5]; neighborhood-centric systems, e.g. Giraph++ [26] and NScale [27, 28]; datalog-based systems, e.g. SociaLite [29, 30] and GrDB [31, 32]; SPARQL-based systems, e.g. G-SPARQL [33]; and matrix-oriented systems, e.g. Pegasus [15]. Some recent works have also looked at specific graph analyses using relational databases. Examples include triangle counting [7], shortest paths [34, 6], subgraph pattern matching [8], and social network analysis [35, 36]. Others have looked at the utility of combining graph analysis with relational operators [37, 38]. In this work, we place particular emphasis on the question of whether relational databases in general, and Vertica in particular, can provide a performant, easy-to-use engine on which graph analytics can be layered, as opposed to looking at specific analyses.

6. LESSONS LEARNED & CONCLUSION

Graph analytics is emerging as an important application area, with several graph data management systems having been recently proposed. These systems, however, require users to switch to yet another data management system. This paper demonstrates that efficient and scalable graph analytics is possible within the Vertica relational database system. We implemented a variety of graph analyses on Vertica as well as on two popular vertex-centric graph analytics systems. Our results show that Vertica has comparable end-to-end runtime performance, without requiring the use of a purpose-built graph engine. In addition, we showed that using table UDFs, developers can trade memory footprint for reduced disk I/O and improve the performance of iterative queries even further. In summary, the key takeaways from our analysis and evaluation in this paper are:
Graph analytics can be expressed and tuned in RDBMSs. Relational databases are general-purpose data processing systems, and graph operations can be expressed as relational operators in a relational database. Furthermore, relational databases can be effectively tuned to offer good performance on graph queries, yielding performance that is competitive with dedicated graph stores.

Column-oriented databases excel at graph analytics. We showed that column-oriented databases like Vertica can have comparable or better performance than popular graph processing systems (specifically, Giraph) on a variety of graph analytics. This is because these queries typically involve full scans, joins, and aggregates over large tables, for which Vertica is heavily optimized. These optimizations include efficient physical data representation (columnar layout, compression, sorting, segmentation), pipelined query execution, and an optimizer to automatically pick the best physical plan.

The RDBMS advantage: Graph + Relational Analysis. Apart from outperforming graph processing systems on graph analysis, the real advantage of relational databases is the ability to combine graph analysis with relational analysis. This is because graph analysis is typically accompanied by relational analysis, either as a preparatory step or as a final reporting step. Such relational analysis is either impossible or highly inefficient in graph processing systems, while it is natural to express in relational databases.

Some graph queries are difficult to express in vertex-centric graph systems. We showed that the vertex-centric programming model is a poor fit for several complex graph analyses. In particular, it is very difficult to program and run graph analyses involving 1-hop neighborhoods on vertex-centric systems. This is because the scope of vertex-centric computations is restricted to a vertex and its immediate neighbors. In contrast, RDBMSs do not have such limitations and can efficiently execute 1-hop neighborhood computations.

RDBMSs are a box full of goodies. Relational databases come with many features that are not present (yet) in the next generation of graph processing systems. These include update handling, transactions, checkpointing and recovery, fault tolerance, durability, integrity constraints, type checking, ACID, etc. Current-generation graph processing systems simply do not come with this rich set of features. One might think of stitching multiple systems together and coordinating between them to achieve these features, e.g. using HBase for transactions and Giraph for analytics, but then we have the additional overhead of stitching these systems together. Of course, with time, graph engines may evolve these features, but our results suggest that these graph systems should be layered on RDBMSs, not built outside of them, which would enable them to inherit these features for free without giving up performance. Furthermore, in many scenarios, the raw data for the graphs is maintained in a relational database in the first place. In such situations, it would make sense to be able to perform graph analytics within the same data management system.

7. REFERENCES

[1] G. Malewicz, M. H. Austern et al., "Pregel: A System for Large-Scale Graph Processing," SIGMOD, 2010.
[2] "Apache Giraph," http://giraph.apache.org.
[3] Y. Low, J. Gonzalez et al., "Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud," PVLDB, 2012.
[4] B. Shao, H. Wang, and Y. Li, "Trinity: A Distributed Graph Engine on a Memory Cloud," SIGMOD, 2013.
[5] "Pregelix," http://hyracks.org/projects/pregelix.
[6] A. Welc, R. Raman et al., "Graph Analysis Do We Have to Reinvent the Wheel?" GRADES, 2013.
[7] "Counting Triangles with Vertica," vertica.com/2011/09/21/counting-triangles.
[8] J. Huang, K. Venkatraman, and D. J. Abadi, "Query Optimization of Distributed Pattern Matching," ICDE, 2014.
[9] A. Jindal, "Benchmarking Graph Databases," http://istc-bigdata.org/index.php/benchmarking-graph-databases, 2013.
[10] S. Salihoglu et al., "GPS: A Graph Processing System," SSDBM, 2013.
[11] G. Wang, W. Xie et al., "Asynchronous Large-Scale Graph Processing Made Easy," CIDR, 2013.
[12] J. Dittrich et al., "Hadoop++: Making a Yellow Elephant Run Like a Cheetah (Without It Even Noticing)," PVLDB, vol. 3, no. 1, 2010.
[13] "Physical Design Automation in Vertica," www.vertica.com/2014/07/21/physical-design-automation-in-the-hp-vertica-analytic-database.
[14] A. Lamb, M. Fuller et al., "The Vertica Analytic Database: C-Store 7 Years Later," PVLDB, vol. 5, no. 12, 2012.
[15] U. Kang et al., "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations," ICDM, 2009.
[16] "Apache Jena," http://jena.apache.org.
[17] "AllegroGraph," http://franz.com/agraph/allegrograph.
[18] "Neo4j," http://www.neo4j.org.
[19] "HyperGraphDB," http://www.hypergraphdb.org.
[20] N. Bronson, Z. Amsden et al., "TAO: Facebook's Distributed Data Store for the Social Graph," USENIX ATC, 2013.
[21] "FlockDB," http://github.com/twitter/flockdb.
[22] J. E. Gonzalez, Y. Low et al., "PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs," OSDI, 2012.
[23] A. Kyrola, G. Blelloch, and C. Guestrin, "GraphChi: Large-Scale Graph Computation on Just a PC," OSDI, 2012.
[24] C. Xie, R. Chen et al., "SYNC or ASYNC: Time to Fuse for Distributed Graph-parallel Computation," PPoPP, 2014.
[25] W. Xie, G. Wang et al., "Fast Iterative Graph Computation with Block Updates," PVLDB, vol. 6, no. 14, pp. 2014-2025, 2013.
[26] Y. Tian, A. Balmin et al., "From 'Think Like a Vertex' to 'Think Like a Graph'," PVLDB, vol. 7, no. 3, pp. 193-204, 2013.
[27] A. Quamar, A. Deshpande, and J. Lin, "NScale: Neighborhood-centric Large-Scale Graph Analytics in the Cloud," arxiv.org/abs/1405.1499.
[28] A. Quamar, A. Deshpande, and J. Lin, "NScale: Neighborhood-centric Analytics on Large Graphs," VLDB, 2014.
[29] M. S. Lam, S. Guo, and J. Seo, "SociaLite: Datalog Extensions for Efficient Social Network Analysis," ICDE, 2013.
[30] J. Seo, J. Park et al., "Distributed SociaLite: A Datalog-Based Language for Large-Scale Graph Analysis," VLDB, 2013.
[31] W. E. Moustafa, H. Miao et al., "A System for Declarative and Interactive Analysis of Noisy Information Networks," SIGMOD, 2013.
[32] W. E. Moustafa, G. Namata et al., "Declarative Analysis of Noisy Information Networks," GDM, 2011.
[33] S. Sakr, S. Elnikety, and Y. He, "G-SPARQL: A Hybrid Engine for Querying Large Attributed Graphs," CIKM, 2012.
[34] J. Gao, R. Jin et al., "Relational Approach for Shortest Path Discovery over Large Graphs," PVLDB, vol. 5, no. 4, 2011.
[35] S. Lawande, L. Shrinivas et al., "Scalable Social Graph Analytics Using the Vertica Analytic Platform," BIRTE, 2011.
[36] R. Angles, A. Prat-Pérez et al., "Benchmarking Database Systems for Social Network Applications," GRADES, 2013.
[37] R. S. Xin, J. E. Gonzalez et al., "GraphX: A Resilient Distributed Graph System on Spark," GRADES, 2013.
[38] A. Jindal, P. Rawlani, and S. Madden, "Vertexica: Your Relational Friend for Graph Analytics!" VLDB, 2014.
