
Relaxed Synchronization and Eager Scheduling in MapReduce

Karthik Kambatla, Naresh Rapolu, Suresh Jagannathan, Ananth Grama (Purdue University)

Abstract

MapReduce has rapidly emerged as a programming paradigm for large-scale distributed environments. While the underlying programming model based on maps and reduces has been shown to be effective in specific domains, significant questions relating to performance and application scope remain unresolved. This paper targets questions of performance through relaxed semantics of the underlying map and reduce constructs in iterative MapReduce applications such as eigenspectra/pagerank and clustering (K-Means). Specifically, it investigates the notion of partial synchronizations combined with eager scheduling to overcome the global synchronization overheads associated with reduce operations. It demonstrates the use of these partial synchronizations on two complex problems, illustrative of much larger classes of problems. The following specific contributions are reported in the paper: (i) partial synchronizations combined with eager scheduling are capable of yielding significant performance improvements, (ii) diverse classes of applications on sparse graphs and in data analysis (clustering) can be efficiently computed in MapReduce on hundreds of distributed hosts, and (iii) a general API for partial synchronizations and eager map scheduling holds significant performance potential for other applications with highly irregular data and computation profiles.

1. Introduction

Motivated by the large amounts of data generated by web-based applications, scientific experiments, business transactions, etc., and the need to analyze this data in effective, efficient, and scalable ways, there has been significant recent activity in developing suitable programming models, runtime systems, and development tools. The distributed nature of data sources, coupled with rapid advances in networking and storage technologies, naturally leads to abstractions for supporting large-scale distributed applications. The inherent heterogeneity, latencies, and unreliability of the underlying platforms make the development of such systems challenging.

With a view to supporting large-scale distributed applications in unreliable wide-area environments, Dean and Ghemawat proposed a novel programming model based on maps and reduces, called MapReduce [8]. Programs in MapReduce are expressed as map and reduce operations. A map takes in a list of key-value pairs and applies a programmer-specified function independently on each pair in the list. A reduce operation takes as input a list, indexed by a key, of all corresponding values, and applies a reduction function on the values. It outputs a list of key-value pairs, where the keys are unique and the values are determined by the function applied to all of the values associated with the key. The inherent simplicity of this programming model, combined with underlying system support for scheduling, fault tolerance, and application development, made MapReduce an attractive platform for diverse data-intensive applications. Indeed, MapReduce has been used effectively in a wide variety of data processing applications. The large volumes of data processed at Google, Yahoo, and Amazon stand testimony to the effectiveness and scalability of the MapReduce paradigm. Hadoop [10], the open-source implementation of MapReduce, is in wide deployment and use.
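To make these constructs concrete, here is a minimal, self-contained illustration (our own toy example, not code from the paper or from Hadoop): a word count phrased as a map that emits (word, 1) pairs and a reduce that sums the values grouped under each word.

    from collections import defaultdict

    def word_map(doc_id, text):
        # programmer-specified map: applied independently to each (key, value) pair
        return [(word, 1) for word in text.split()]

    def word_reduce(word, counts):
        # programmer-specified reduce: folds all values that share the same key
        return (word, sum(counts))

    def mapreduce(pairs, map_fn, reduce_fn):
        grouped = defaultdict(list)
        for k, v in pairs:
            for out_k, out_v in map_fn(k, v):    # map phase
                grouped[out_k].append(out_v)     # group intermediate values by key
        return [reduce_fn(k, vs) for k, vs in grouped.items()]  # reduce phase

    print(mapreduce([("d1", "a rose is a rose")], word_map, word_reduce))
    # -> [('a', 2), ('rose', 2), ('is', 1)]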
A majority of the applications currently executing in the MapReduce framework have a data-parallel, uniform access profile, which makes them ideally suited to map and reduce abstractions. Recent research interest, however, has focused on more unstructured applications that do not lend themselves naturally to data-parallel formulations. Common examples include sparse unstructured graph operations (as encountered in diverse domains including social networks, financial transactions, and scientific datasets), discrete optimization and state-space search techniques (in business optimization and planning), and discrete event modeling. For these applications, there are two major unresolved questions: (i) can the existing MapReduce framework effectively support such applications in a scalable manner? and (ii) what enhancements to the MapReduce framework would significantly enhance its performance and scalability without compromising desirable attributes of programmability and fault tolerance?

This paper primarily focuses on the second question; namely, it seeks to extend the MapReduce semantics to support specific classes of unstructured applications on large-scale distributed environments. Recognizing that one of the key bottlenecks in supporting such applications is the global synchronization associated with the reduce operation, it introduces the notions of partial synchronization and eager scheduling. The underlying insight is that for an important class of applications, algorithms exist that do not need global synchronization for correctness. Specifically, while global synchronizations optimize serial operation counts, violating these synchronizations merely increases operation counts without impacting the correctness of the algorithm. Common examples of such algorithms include the computation of eigenvectors (pageranks) through (asynchronous) power methods, branch-and-bound based discrete optimization with lazy bound updates, and other heuristic state-space search algorithms. For such algorithms, a global synchronization can be replaced by a partial synchronization. However, this partial synchronization must be combined with suitable locality-enhancing techniques so as to minimize the adverse effect on operation counts. These locality-enhancing techniques take the form of min-cut graph partitioning and aggregation into maps in graph analyses, periodic quality equalization in branch-and-bound, and other such operations that are well known in the parallel processing community. Replacing global synchronizations with partial synchronizations also allows us to schedule subsequent maps in an eager fashion. This has the important effect of smoothing the load imbalances associated with typical applications. This paper combines partial synchronizations, locality enhancement, and eager scheduling, along with algorithmic asynchrony, to deliver distributed performance improvements of 100 to 200% (and beyond in several cases). The enhanced programming semantics that yield this significant performance improvement do not impact application programmability.

We demonstrate all of our results on the wide-area Google cluster in the context of Pagerank and clustering (K-Means) implementations. These applications are representative of a broad set of application classes. They are selected because of their relative simplicity as well as their ubiquitous interaction pattern. Our extended semantics have a much broader application scope.

The rest of the paper is organized as follows: in Section 2, we provide a more comprehensive background on MapReduce and Hadoop, and motivate the problem; in Section 3, we outline our primary contributions and their significance; in Section 4, we present our new semantics and the corresponding API; in Section 5, we discuss our implementations of Pagerank and K-Means clustering and analyze the performance gains of our approach. We outline avenues for ongoing work and conclusions in Sections 8 and 9.

2. Background and Motivation

The primary design motivation for the functional MapReduce abstractions is to allow programmers to express simple concurrent computations, while hiding the messy details of parallelization, fault tolerance, data distribution, and load balancing in a single library [8]. The simplicity of the API makes programming extremely easy. Programs are expressed as sequences (iterations) of map and reduce operations with intervening synchronizations. The synchronizations are implicit in the reduce operation, since it must wait for all map outputs. Once a reduce operation has terminated, the next set of maps can be scheduled. Fault tolerance is achieved by rescheduling maps that time out. A reduce operation cannot be initiated until all maps have completed. Maps generally correspond to fine-grained tasks, which allows the scheduler to balance load. There are no explicit primitives for localizing data access. Such localization must be achieved by aggregating fine-grained maps into coarser maps, using traditional parallel processing techniques. It is important to remember, however, that increasing the granularity carries with it the potential downside of an increased makespan resulting from failures.
As may be expected, for many applications the dominant overhead in the program is associated with the global reduction operations (and the inherent synchronizations). When executed in wide-area distributed environments, these synchronizations often incur substantial latencies associated with the underlying network and storage infrastructure. In contrast, partial synchronizations take over an order of magnitude less time. For this reason, fine-grained maps, which are natural for programming data-parallel MapReduce applications, do not always deliver the expected performance improvements and scalability.

The obvious question that follows from these observations is whether we can decrease the number of global synchronizations, perhaps at the expense of an increased number of partial synchronizations (we define a partial synchronization to imply a synchronization/reduction only across a subset of the maps). The resulting algorithm(s) may be suboptimal in serial operation counts, but may be more efficient and scalable in a MapReduce framework.

A particularly relevant class of algorithms where such tradeoffs are possible are iterative techniques applied to unstructured problems (where the underlying data access patterns are unstructured). This broad class of algorithms underlies applications ranging from pagerank computations on the web to sparse solvers in scientific computing and clustering algorithms. In many of these applications, an additional complexity arises from the inherent load imbalance associated with the maps. For example, in the computation of pagerank, a hub node may have a significantly higher degree (in- or out-) than the spokes. This leads to correspondingly more computation for the hubs than for the spokes.

We seek to answer the following key questions relating to the application scope and performance of MapReduce (or the general paradigm of maps and reduces) in the context of applications that tolerate algorithmic asynchrony: (i) what are suitable abstractions (MapReduce extensions) for distributed asynchronous algorithms? (ii) for an application class of interest, can the performance benefits of localization, partial synchronization, and eager scheduling of maps overcome the sub-optimality in terms of serial operation counts? and (iii) can this framework be used to deliver scalable and high performance over wide-area distributed systems?

To alleviate the overhead of global synchronization, we propose a two-level scheme: a local reduce operation applies the specified reduction function to only those key-value pairs emanating from the preceding map(s) at the same host. A global reduce operation, on the other hand, incurs significant network and file system overheads. We illustrate the use of these primitives in the context of the Pagerank and K-Means clustering algorithms. In Pagerank, the rank of a node is determined by the ranks of its neighbors. In the traditional MapReduce formulation, in every iteration, the map involves each node pushing its Pagerank to all its outlinks and the reduce accumulates all neighbors' contributions to compute the Pagerank of the corresponding node. These iterations continue until the Pageranks converge.

Consider an alternate formulation in which the graph has been partitioned, and a map now corresponds to the local-pagerank computation of all nodes within the partition. For each of the internal nodes (nodes that have no edges leaving the partition), a partial reduction accurately computes the rank (assuming the neighbors' ranks were accurate to begin with). On the other hand, boundary nodes (nodes that have edges leading to other partitions) must have a global reduction to account for remote neighbors. It follows, therefore, that if the ranks of the boundary nodes were accurate, the ranks of internal nodes could be computed simply through local iterations. Thus follows a two-level scheme wherein partitions (maps) iterate on local data to convergence and then perform a global reduction.
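For a purely illustrative, hypothetical example of this tradeoff: suppose a traditional formulation needs 30 global reductions to converge, while the partitioned formulation converges in 10 global rounds, each preceded by 5 local iterations in each of 100 partitions. The relaxed scheme then performs 10 x 5 x 100 = 5,000 partial reductions plus 10 global reductions, so the total reduction count is far higher, but the number of expensive global synchronizations drops from 30 to 10, and it is the global synchronizations that dominate running time on a wide-area platform.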
It is easy to see that this two-level scheme increases the serial operation count. Moreover, it increases the total number of synchronizations (partial + global) compared to the traditional formulation. However, and perhaps most importantly, it reduces the number of global reductions. Since this is the major overhead, the program has significantly better performance and scalability.

Indeed, optimizations such as these have been explored in the context of traditional HPC platforms as well, with some success. However, the difference in overhead between a partial and a global synchronization, in relation to the intervening useful computation, is not as large for HPC platforms. Consequently, the performance improvement is significantly amplified on distributed platforms. It also follows that performance improvements from MapReduce deployments on wide-area platforms, as compared to single-processor executions, are not expected to be significant unless the problem is scaled significantly to amortize overheads. However, MapReduce formulations are motivated primarily by the distributed nature of the underlying data and sources, as opposed to the need for parallel speedup. For this reason, performance comparisons must be made with respect to traditional MapReduce formulations, as opposed to the speedup and efficiency measures more often used in the parallel programming community. This is further reinforced by the fact that in such environments it is difficult to even estimate the resources available to a program.

While most of our development efforts and all of our validation results are in the context of pagerank and kmeans, the concepts of partial reductions combined with locality-enhancing techniques and eager map scheduling have wider applicability. In general, our semantic extensions apply to the wider class of applications that admit asynchronous algorithms: algorithms for which relaxed synchronization impacts only performance, and not correctness.

3. Technical Contributions

To address the cost of global synchronization and granularity, we propose and validate the following solutions within the MapReduce framework:

• We provide support for global and partial synchronizations. Partial synchronizations are enforced only on a subset of maps. Global synchronizations are identical to reduce operations, enforced on all the maps.
• Following partial synchronizations, the subsequent maps are scheduled in an eager fashion, i.e., as soon as the partial synchronization operation completes.
• We use partial synchronizations and eager scheduling in combination with a coarser-grained, locality-enhancing allocation of tasks to maps.
• We validate the aforementioned techniques to compute pagerank on sparse unstructured (power-law type) real web graphs and to cluster high-dimensional data using unsupervised clustering algorithms, namely K-Means. These applications are illustrative of a broader class of asynchrony-tolerant algorithms. We show that while the number of serial operations (and indeed the sum of partial and global reductions) is much higher, the reduction in the number of global reductions yields performance improvements of over 200% compared to a traditional MapReduce implementation on a 460-node cluster provided by the IBM-Google consortium as part of the CluE NSF program.
4. Proposed Semantic Extensions

As mentioned earlier, numerous applications use MapReduce iteratively. Hence, improving the performance of iterative MapReduce is very important, specifically because of the high overhead due to synchronization and the starting/stopping of MapReduce jobs (input and output data handling).

In this section, we present the semantics and API that we propose for iterative MapReduce to alleviate synchronization overheads while preserving programming simplicity.

4.1. Semantics for Iterative MapReduce

Figure 1 describes the formal syntax of our proposed language. For our purposes, lists form an interesting value in the language. MapReduce operates on lists and not on their elements. Any operation on an element of a list has to be defined in terms of an abstraction, so that our MapReduce constructs can evaluate them on the elements to compute new lists. We define L, G, and I as the local, global, and iterative versions of MapReduce, one invoked from another. These operators evaluate on a tuple ⟨e, fm, fr, l⟩ of a termination condition, a map function, a reduce function, and a list. Programs in the language are defined as applications of our iterative MapReduce applied to functions and an input list.

To capture the salient behavior of the system, we assume a set of processors and a set of associated stores. We define lookup functions, σ (local) and λ (global). We also define two different evaluation rules for MapReduce: =⇒g for the evaluation rules that change the global state, and =⇒l to describe the evaluations that change the local state.
This is further rules for MapReduce — =⇒g for the evaluation reinforced by the fact that in rules that change the global state, and =⇒ l to environments, it is difficult to even estimate the describe the evaluations that change the local state. λ ǫ Λ = L → Z (6) v ǫ Value (1) f ǫFn :: = λx.e (7)

p ǫ Processors = {P1, ..., Pm} (2) l ǫ List ::= [v1, ..., vn] (8)

Σ ǫ LocalStore = {Σ1, ..., Σm} (3) eǫP :: = f (9)

Λ ǫ GlobalStore = {Λ} (4) | Apply( I , < e, fm, fr,l>)(10) σ ǫ Σ = L → Z (5) Fig. 1. Iterative Relaxed MapReduce: Syntax

  ⟨l, σ⟩ =⇒ σ(l)   (LOCAL-LOOKUP)        ⟨l, λ⟩ =⇒ λ(l)   (GLOBAL-LOOKUP)

  Apply(I, ⟨e, fm, fr, l⟩) =⇒g I e fm fr l   (APPLY-ITER)

  (ITER-MAPRED)
      while (condg): ⟨G condl fm fr lg, λ⟩ =⇒g ⟨lg′, λ′⟩
      --------------------------------------------------
      ⟨I condg fm fr lg, λ⟩ =⇒g ⟨lg′, λ′⟩

  (MAPRED-GLOBAL)
      ch ⟨lg, λ, σ⟩ ≡ ⟨l̄l, λ, σ′⟩
      while (cond): map (L fm fr) ⟨l̄l, σ′⟩ =⇒l ⟨l̄l′, σ″⟩
      agg ⟨l̄l′, σ″, λ⟩ ≡ ⟨lg′, σ″, λ′⟩        fold fr ⟨lg′, λ′⟩ =⇒g ⟨lg″, λ″⟩
      --------------------------------------------------
      ⟨G cond fm fr lg, λ, σ⟩ =⇒g ⟨lg″, λ″, σ″⟩

  (MAPRED-LOCAL)
      map fm ⟨ll, σ⟩ =⇒l ⟨ll″, σ″⟩        fold fr ⟨ll″, σ″⟩ =⇒l ⟨ll′, σ′⟩
      --------------------------------------------------
      L fm fr ⟨ll, σ⟩ =⇒l ⟨ll′, σ′⟩

  (CHUNKIFY)
      ch lg ≡ l̄l ≡ {ll | ll ⊂ lg ∧ ⋂_{ll ∈ l̄l} ll = ∅};   ∀ li, σi[li ↦ λ(l)]

  (AGGREGATE)
      agg l̄l ≡ lg ≡ ⋃_{ll ∈ l̄l} ll;   ∀ li, λ[l ↦ li]

Fig. 2. Iterative Relaxed MapReduce: Semantics
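To make the intended operational reading of these rules concrete, the following Python sketch walks through ITER-MAPRED, MAPRED-GLOBAL, and MAPRED-LOCAL. All names and signatures are our own simplified shorthand for the formal constructs; they are not the paper's implementation or the Hadoop API.

    def chunkify(lg, nparts):
        # ch: partition the global list into disjoint local lists (one per local store)
        return [lg[i::nparts] for i in range(nparts)]

    def aggregate(chunks):
        # agg: merge the local lists back into a single global list
        return [x for chunk in chunks for x in chunk]

    def local_mapreduce(fm, fr, ll):
        # L (MAPRED-LOCAL): map fm over the local list, then fold fr locally;
        # only the local store is touched
        return fr([y for x in ll for y in fm(x)])

    def global_mapreduce(cond_l, fm, fr, lg, nparts):
        # G (MAPRED-GLOBAL): chunkify, iterate local MapReduce over each chunk
        # until the local condition holds, aggregate, then perform one global fold
        chunks = chunkify(lg, nparts)
        while not cond_l(chunks):
            chunks = [local_mapreduce(fm, fr, c) for c in chunks]
        return fr(aggregate(chunks))

    def iter_mapreduce(cond_g, cond_l, fm, fr, lg, nparts):
        # I (ITER-MAPRED): repeat global MapReduce until the global condition holds
        while not cond_g(lg):
            lg = global_mapreduce(cond_l, fm, fr, lg, nparts)
        return lg

In this reading, making cond_l terminate after a single local pass recovers the behavior of traditional iterative MapReduce, which is the property noted in the discussion below.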

Figure 2 describes the semantics. A typical program in the language is an application of IterMapReduce (I) to a tuple consisting of a termination condition, map and reduce functions, and the list of data. The computation rule [APPLY-ITER] evaluates to running the global IterMapReduce.

I, the iterative MapReduce, takes cond, map, reduce, and a list to operate on. The cond function operates on the global heap until the termination condition is met. Iterative MapReduce calls global MapReduce iteratively till this condition is met.

[MAPRED-GLOBAL] describes the behavior of global MapReduce. This operation uses another condition to determine the termination of the local MapReduce and operates on data local to the MapReduce it gets associated with. The main feature is to have a mechanism of synchronization across the local and global stores. This is achieved by our Chunkify and Aggregate functions. Chunkify (ch) partitions the global list into a sequence of multiple lists, each made available to the local MapReduce, and copies these chunks to the respective local stores. Aggregate takes care of aggregating the data from the multiple local MapReduce computations and copies the changes to the global store. During the evaluation of global MapReduce, we first Chunkify the data. We then have the required values in the local store. We subsequently apply the standard map to a curried version of local MapReduce, (L fm fr), and the local list, iteratively until it terminates. Only the local store is modified during this stage. At the end of these iterations, we need to synchronize globally, requiring us to copy data back to the global store; this is performed by the Aggregate function.

[MAPRED-LOCAL] controls the behavior of local MapReduce. This is very close in structure to the original MapReduce model, with the main difference being that it operates locally, with no global store changes, as explained in the semantics.

The semantics use while, map, and fold constructs (bold-faced in the semantics) which carry their usual functional definitions. From a systems perspective, the map/fold functions process the data given to them in parallel and collect the output.

In addition to the inputs to the regular MapReduce, our iterative version requires a termination condition. Such a termination condition has to be thought of even for iterating over traditional MapReduce. Also, our iterative MapReduce reduces to traditional MapReduce if the local condition is set to one iteration. Our semantics do not increase programming complexity.

TABLE 1. Measurement testbed
  CluE cluster:    Intel(R) Xeon(TM) CPU 2.80 GHz
  Machines (460):  4 GB RAM, 2x 400 GB hard drives
  VM:              1 VM per host
  Software:        Hadoop 0.17.3, Java 1.6
  Heap space:      1 GB per slave

4.2. API

The rigorous semantics advocate a source-to-source translator for iterative MapReduce with relaxed synchronization and eager scheduling. We formulated a simple-to-implement API to validate our claims, as discussed below.

In the traditional MapReduce API, the user provides map and reduce functions along with functions to split and format the input data. In addition to these, for iterative MapReduce applications, the user must provide functions for the termination of the global and local MapReduce iterations, and functions to convert data into the formats required by the local map and reduce functions. The Chunkify and Aggregate functions discussed in the semantics are built using these conversion functions.

To provide more flexibility to the programmer and to allow better exploitation of application-specific optimizations, we propose four new functions: gmap, greduce, lmap and lreduce (two different sets of map and reduce for the local and global versions of the operations). Global map uses pool(s) to schedule lmap and lreduce to exploit the available parallelism. The functions Emit() and EmitIntermediate() support data flow in traditional MapReduce. We introduce their local equivalents, EmitLocal() and EmitLocalIntermediate(). Aggregate iterates over the data emitted by EmitLocal() and sends it to the global reduce through the EmitIntermediate() function.

Our API has been validated in our implementations of the pagerank and kmeans algorithms used in the evaluation of these relaxed semantics.
Pager- source translator for iterative MapReduce with re- ank (more generally, computing the eigenspectra laxed synchronization and eager scheduling. We of a matrix) and kmeans (representing clustering formulated a simple-to-implement API to validate algorithms). our claims, as discussed below — In our attempt to model our experiments in a real- In the traditional MapReduce API, the user pro- life setting, we use the 460 node cluster provided by vides map and reduce functions along with the func- IBM-Google consortium as part of the CluE (Cluster tions to split and format the input data. In addition to Exploratory) NSF program. these, for iterative MapReduce applications, the user Table 1 lists the physical resources, software must provide functions for termination of global and restrictions on the cluster. Since this is a and local MapReduce iterations, and functions to shared cluster, there is the potential that several jobs convert data into the formats required by the lo- are concurrently executing during our experiments. cal map, reduce functions. The Chunkify and Nonetheless, our results strongly validate our in- Aggregate functions discussed in the semantics sights. are built using these conversion functions. To provide more flexibility to the programmer and 5.1. Implementation to allow better exploitation of application-specific optimizations, we propose four new functions — For both applications, we implemented base ver- gmap, greduce, lmap and lreduce (two different sets sions which conform to the popular MapReduce of map and reduce for the local and global versions implementations known for these applications. We of the operations). Global map uses pool(s) to also implement modified versions; relaxed synchro- schedule lmap and lreduce to exploit available par- nization and eager scheduling realized by rewriting allelism. Functions Emit() and EmitIntermediate() the global map function (gmap) to have a local MapReduce (lmap and lreduce) as described by the 5.2.1. General Pagerank. The general MapReduce following pseudo-code - implementation of pagerank iterates over a map task gmap (xs : X list) { 1that emits the pageranks of all the source nodes to while (no−local −convergence−intimated ) { 2the corresponding destinations in the graph, and a 3reduce task that accumulates the pagerank contribu- lmap (x); // emits (lkey, lval) 4tions from various sources to a single destination. 5 In the actual implementation, the map function lreduce ( ) ; 6 emits tuples of the type destination-node, pager- } 7 < 8ank contributed to this destination node by the for each value in lreduce−output { 9source>. The reduce task operates on a destination EmitIntermediate (key, value); 10node, which gathers the pageranks from its incom- } 11ing source nodes and computes a new pagerank for } 12itself. Thus, after every iteration, the nodes have When a regular map phase is given an input of renewed pageranks which propagate through the list, we give each map task a smaller graph in subsequent iterations till they converge. list (instead of an element of the list). One can observe that even a small change in the The argument to gmap is a list(xs). pagerank of one node is broadcast to all the nodes We run a local MapReduce with each element in the graph in successive iterations of MapReduce, of xs as input to lmap. We use a hashtable to incurring a potentially significant global synchro- store the intermediate and final results of the local nization cost. MapReduce. 
Applying this method to any application translates to knowing the local termination condition. The local termination can be convergence (as in Pagerank and K-Means) or a particular number of iterations.

5.2. Pagerank

The pagerank of a node is the scaled sum of the pageranks of all of its neighbors. Mathematically, the pagerank of a node d is given by the following expression:

    PR_d = (1 − χ) + χ · Σ_{(s,d) ∈ E} s.pagerank / s.outlinks        (11)

where χ is the damping factor, and s.pagerank and s.outlinks correspond to the pagerank and the out-degree of the source node, respectively.

For both implementations, the input is a graph represented as adjacency lists. We start with a pagerank of 1 for all nodes. The pageranks converge to their correct values after a few iterations of applying the above expression for each node. We define convergence as the pageranks of all nodes not changing by more than a particular threshold (10^-5 in our case) in subsequent iterations.

5.2.1. General Pagerank. The general MapReduce implementation of pagerank iterates over a map task that emits the pageranks of all the source nodes to the corresponding destinations in the graph, and a reduce task that accumulates the pagerank contributions from various sources to a single destination. In the actual implementation, the map function emits tuples of the type <destination-node, pagerank contributed to this destination node by the source>. The reduce task operates on a destination node, which gathers the pageranks from its incoming source nodes and computes a new pagerank for itself. Thus, after every iteration, the nodes have renewed pageranks which propagate through the graph in subsequent iterations till they converge. One can observe that even a small change in the pagerank of one node is broadcast to all the nodes in the graph in successive iterations of MapReduce, incurring a potentially significant global synchronization cost.

Our baseline for performance comparison is a MapReduce implementation for which maps correspond to complete partitions, as opposed to single-node updates. We use this as a baseline because the performance of this formulation was noted to be on par with or better than the adjacency-list formulation where the update of a single node is associated with a map. For this reason, our baseline provides a more competitive implementation.
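The following single-machine Python sketch condenses one such iteration of the general formulation, directly following equation (11). The dictionary-based graph layout, the function names, and the toy graph are our own illustration; the paper's implementation runs the same two phases as Hadoop map and reduce tasks over adjacency lists.

    from collections import defaultdict

    def general_pagerank_iteration(outlinks, ranks, chi=0.75):
        # map phase: each source pushes rank/outdegree to every destination it links to
        contrib = defaultdict(list)
        for s, dests in outlinks.items():
            for d in dests:
                contrib[d].append(ranks[s] / len(dests))
        # reduce phase: each destination accumulates its neighbours' contributions
        return {d: (1 - chi) + chi * sum(vs) for d, vs in contrib.items()}

    outlinks = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks = {n: 1.0 for n in outlinks}
    for _ in range(50):                       # in practice: until max change < 1e-5
        new = general_pagerank_iteration(outlinks, ranks)
        ranks = {n: new.get(n, 1 - 0.75) for n in ranks}
    print(ranks)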

5.2.2. Eager Pagerank. We begin our description of Eager Pagerank with an intuitive description of how the underlying algorithm accommodates asynchrony. In a graph with a power-law type distribution, one may assume that each hub is surrounded by a large number of spokes, and that inter-hub edges are comparatively infrequent. This allows us to relax strict synchronization on inter-hub edges until the subgraph in the proximity of a hub has relatively self-consistent pageranks. Disregarding the inter-hub edges does not lead to algorithmic inconsistency since, after every few local iterations of MapReduce calculating the pageranks in the subgraph, there is a global synchronization (following a global map) leading to a dissemination of the pageranks in this subgraph to other subgraphs via inter-hub edges. This propagation reimposes consistency on the global state.

Consequently, we update only the (neighboring) nodes in the smaller subgraph. We achieve this by a set of iterations of local MapReduce as described in the semantics. Evidently, this method leads to improved efficiency if each map operates on a hub or a group of topologically localized nodes. Such topology is inherent in the way we collect data, as it is crawler-induced. One can also use one-time graph partitioning using tools like Metis [17]. We use Metis, as our synthetic data set is not inherently partitioned.

In the Eager Pagerank implementation, the map task operates on a sub-graph. The local MapReduce, within the global map, computes the pagerank of the constituent nodes in the sub-graph. Hence, we run the local MapReduce to convergence. Instead of waiting for all the other global map tasks operating on different subgraphs, we eagerly schedule the next local map and local reduce iterations on the individual subgraph inside a single global map task. Upon local convergence of the sub-graphs, we synchronize globally, so that all nodes can propagate their computed pageranks to other sub-graphs. This iteration over global MapReduce runs to convergence. Such an Eager Pagerank incurs more computational cost, since local reductions may proceed with imprecise values of global pageranks. However, the pagerank of any node propagated during the global reduce is representative, in a way, of the sub-graph it belongs to. Thus, one may observe that the local and global reduce functions are functionally identical.

Note that in Eager Pagerank, the local reduce waits on a local synchronization barrier, while the local maps are executed on a single host in the cluster. The local synchronization does not incur any inter-host communication delays. This makes the local overheads considerably lower than the global overheads.
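A single global map task of Eager Pagerank can be sketched as follows. This is a simplified, single-machine Python illustration: the partition format, the names, and the convergence test are our assumptions, the local map and reduce are folded into one update step, and contributions from nodes outside the partition are only refreshed at the global synchronization, which is not modelled here.

    def eager_pagerank_gmap(partition, ranks, chi=0.75, eps=1e-5):
        # partition: {node: [outlinks within this partition]} for the nodes owned
        # by this map task; ranks: pageranks as of the last global synchronization
        local = {n: ranks[n] for n in partition}
        while True:                               # local MapReduce iterations (lmap/lreduce)
            new = {}
            for d in partition:
                incoming = [local[s] / len(dests)
                            for s, dests in partition.items() if d in dests]
                new[d] = (1 - chi) + chi * sum(incoming)
            converged = all(abs(new[d] - local[d]) < eps for d in partition)
            local = new
            if converged:                          # local convergence intimated
                break
        # EmitIntermediate: the converged local ranks feed the global reduce,
        # which disseminates them to other partitions via inter-partition edges
        return list(local.items())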
5.2.3. Input data. Table 2 describes the two graphs that are considered for our experiments on Pagerank, both conforming to power-law distributions. Our primary validation dataset is the Stanford web graph from the Stanford WebBase project [20]. It is a medium-sized crawl, conducted in 2002, of 280K nodes and about 3 million edges. To study the impact of problem size and related parameters, we also rely on synthetic graphs with power-law distributions, generated through preferential attachment [7] in igraph [13]. The algorithm used to create the synthetic graph is described below, along with its justification.

TABLE 2. Input graph properties
  Input graph       Stanford webgraph   Power-law (synthetic)
  Nodes             280,000             100,000
  Edges             3 million           3 million
  Damping factor    0.75                0.85

Preferential attachment based graph generation. The graph is generated by adding vertices one at a time, connecting each to numConn vertices already in the network, chosen uniformly at random. For each of these numConn vertices, numIn and numOut of its inlinks and outlinks are chosen uniformly at random and connected to the joining vertex. This is done for all of the nodes newly connected to the incoming vertex. This method of creating a graph is closely related to the evolution of the web: the procedure increases the probability of a highly reputed site getting linked to new sites, since it has a greater probability of being in an inlink from other randomly chosen sites.
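The growth rule just described can be sketched in stand-alone Python as follows. The paper generates its synthetic graph with igraph; this simplified version (the parameter names numConn, numIn, and numOut follow the text, everything else is our own) is only meant to illustrate how the procedure biases in-links towards already well-connected vertices.

    import random

    def generate_power_law_graph(n, num_conn=2, num_in=1, num_out=1, seed=0):
        random.seed(seed)
        out_edges = {0: set(), 1: {0}}          # seed the network with two vertices
        in_edges = {0: {1}, 1: set()}
        for v in range(2, n):
            out_edges[v], in_edges[v] = set(), set()
            # connect the new vertex to numConn existing vertices, chosen uniformly
            for t in random.sample(range(v), min(num_conn, v)):
                ins = random.sample(sorted(in_edges[t]), min(num_in, len(in_edges[t])))
                outs = random.sample(sorted(out_edges[t]), min(num_out, len(out_edges[t])))
                out_edges[v].add(t)
                in_edges[t].add(v)
                # also wire the newcomer to a few of t's existing in/out neighbours;
                # popular vertices therefore keep attracting new in-links
                for nb in ins:
                    out_edges[nb].add(v)
                    in_edges[v].add(nb)
                for nb in outs:
                    out_edges[v].add(nb)
                    in_edges[nb].add(v)
        return out_edges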

Figures 3 and 4 show the distribution of inlinks for the two input graphs. The best line fit gives the power-law exponent for the two graphs, showing their conformity with a hubs-and-spokes model. Very few nodes have a very high inlink value, emphasizing our point that very few nodes require frequent global synchronization. More often than not, even these nodes (hubs) mostly have spokes as their neighbours.

Fig. 3. Inlink distribution of the Stanford webgraph (number of links vs. number of pages, log-log scale).
Fig. 4. Inlink distribution of the synthetic power-law graph (number of links vs. number of pages, log-log scale).

Crawlers inherently induce locality in the graphs, as they crawl neighborhoods before crawling remote sites. As we also use synthetic data in addition to the Stanford Web Graph, we partition it using Metis. A good partitioning algorithm that minimizes edge-cuts has the desired effect of reducing global synchronizations as well. Metis, though not tailored for power-law graphs, does a decent partitioning of the graph. This partitioning is performed off-line (only once) and takes about [GET THIS NUMBER] seconds, which is negligible compared to the runtime of pagerank, and hence is not included in the reported numbers. There also exist other ways of generating the inputs to the map tasks [5].

5.2.4. Results. To show the dependence of performance on global synchronizations, we vary the number of iterations of the algorithm by altering the number of partitions the graph is split into. Fewer partitions result in a smaller number of larger subgraphs. Each map task does more work, which would normally result in fewer global iterations in the relaxed case. The fundamental observation here is that it takes fewer iterations to converge for a graph having already-converged subgraphs. The trends are more pronounced when the graph follows the power-law distribution more closely. In either case, the total number of iterations is lower than in the general case. For Eager Pagerank, if the number of partitions is decreased to one, the whole graph is given to one global map and its local MapReduce would compute the final pageranks of all the nodes. If the partition size is one, each partition gets a single adjacency list, and Eager Pagerank becomes regular Pagerank, because each map task operates on a single node.

Figures 5 and 6 show the number of iterations taken by the relaxed and general implementations of Pagerank, on the Stanford and synthetic webgraphs that we use for input, as we vary the number of partitions. The number of iterations does not change in the general case, as each iteration performs the same work irrespective of the number of partitions and partition sizes. The results for Eager Pagerank are consistent with our predictions: for both graphs, the number of global iterations to converge increases monotonically with the number of partitions.

Fig. 5. Number of iterations to converge (y-axis) for different numbers of partitions (x-axis) for the Stanford webgraph, for a damping factor of 0.75.
Fig. 6. Number of iterations to converge (y-axis) for different numbers of partitions (x-axis) for the synthetic webgraph, for a damping factor of 0.85.

The time to solution depends strongly on the number of iterations but is not completely determined by it. It is true that the global synchronization costs would decrease, but when we reduce the number of partitions significantly, the work to be done by each map task increases. This increase potentially results in an increased cost of computation, even more than the benefit of the decrease in communication. Hence, there is an optimal number of partitions for every application on a given cluster. For a well-partitioned graph we expect an inverted bell-curve.

Figures 7 and 8 show the runtimes for the relaxed and general implementations of Pagerank on the Stanford and synthetic webgraphs with varying numbers of partitions. One may notice the significant performance gains the relaxed case achieves over the general case for both graphs. On average, we see a 2x to 3x improvement in runtimes. The improvement in the case of the synthetic graph is expected to be more uniform and consistent, which can be attributed to its close conformance to a power-law distribution, leading to more accurate partitioning. This in turn leads to better load balancing on the map tasks along with faster convergence inside the subgraph. This is observed consistently in our experiments.

Fig. 7. Time to converge (y-axis) for various numbers of partitions (x-axis) for the Stanford webgraph, for a damping factor of 0.75.
Fig. 8. Time to converge (y-axis) for various numbers of partitions (x-axis) for the synthetic webgraph, for a damping factor of 0.85.

Finally, given that both graphs exhibit power-law characteristics, we notice their respective points of optimality: 400 partitions for the Stanford webgraph and 200 for the synthetic webgraph. We also observe that the smaller the graph, the lower this optimal number.

5.3. K-Means

K-Means is a popular technique used for unsupervised clustering. Implementation of the algorithm in the MapReduce framework is straightforward, as shown in [21, 6]. Briefly, in the map phase, every point chooses its closest cluster centroid, and in the reduce phase, every centroid is updated to be the mean of all the points which chose that particular centroid. The iterations of map and reduce phases continue until the centroid movement is below a given threshold. The Euclidean distance metric is usually used to calculate the centroid movement.

In the proposed relaxed MapReduce framework, each global map handles a unique subset of the input points. The local map and reduce iterations inside the global map cluster the given subset of the points using the common input-cluster-centroids. Once convergence is achieved in the local iterations, the global map emits the input-cluster-centroids and their associated updated-centroids. The global reduce calculates the final-centroids, each of which is the mean of all updated-centroids corresponding to a single input-cluster-centroid. The final-centroids form the input-cluster-centroids for the next iteration. These iterations continue until the input-cluster-centroids converge.

The algorithm used in the relaxed approach to K-Means is similar to the one recently shown by Yom-Tov and Slonim [25] for pairwise clustering. An important observation from their results is that the input to the global map should not be the same subset of the input points in every iteration. Every few iterations, the input points need to be partitioned differently across the global maps so as to avoid the algorithm moving towards local optima. Also, the convergence condition includes detection of oscillations along with the Euclidean metric.

We use the K-Means implementation in the normal MapReduce framework from the Apache Mahout project [16]. Sampled US Census data of 1990 from the UCI Machine Learning repository [23] was used as the clustering data for the comparison between the normal and relaxed approaches. The sample size is around 200K points, each with 68 dimensions. For both relaxed and normal K-Means, the initial centroids were chosen randomly for the sake of generality. Algorithms such as canopy clustering can be used to identify initial centroids for faster execution and better quality of the final clusters.
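As an illustration of this scheme, here is a minimal, single-machine Python sketch with one-dimensional points. The structure (local k-means per partition, unweighted global mean per input centroid) follows the description above, but all names and the toy data are our own simplification, not the Mahout-based implementation used in the experiments.

    def kmeans_gmap(points, centroids, eps=0.01):
        # one global map: run local k-means on this partition's points until the
        # centroid movement falls below eps (the local convergence condition)
        cur = list(centroids)
        while True:
            clusters = [[] for _ in cur]
            for p in points:                  # lmap: assign each point to its closest centroid
                clusters[min(range(len(cur)), key=lambda i: abs(p - cur[i]))].append(p)
            new = [sum(c) / len(c) if c else cur[i] for i, c in enumerate(clusters)]
            if max(abs(a - b) for a, b in zip(cur, new)) < eps:
                # emit (input-centroid id, locally updated centroid) for the global reduce
                return list(enumerate(new))
            cur = new

    def kmeans_greduce(emitted, k):
        # global reduce: mean of the updated centroids reported for each input centroid
        return [sum(c for i, c in emitted if i == j) /
                max(1, sum(1 for i, _ in emitted if i == j)) for j in range(k)]

    emitted = []
    for part in ([1.0, 1.2, 0.9], [5.0, 5.5], [0.8, 5.2]):   # three global map tasks
        emitted += kmeans_gmap(part, [1.0, 5.0])
    print(kmeans_greduce(emitted, k=2))       # roughly [0.94, 5.15]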
Figure ?? shows the number of iterations and the time taken to converge for a particular dataset of 200K points with varying thresholds. All the experiments were conducted on the 460-node cluster described above. To eliminate the impact of the random partitioning performed after every 5 iterations (chosen heuristically), each experiment was run 10 times and the average number of iterations and time taken were considered. We observe a consistent speedup of about 2 with respect to the parallel K-Means algorithm implemented in traditional MapReduce.

The number of local MapReduce iterations inside each global map increases as the threshold for convergence decreases. It is important to observe from the figure that this increase in computation does not translate to an increase in the overall time. The total time-to-converge depends on the number of global synchronizations (iterations). Also, global synchronization costs increase with the cluster size, implying scalable performance gains.

Figure 9 shows the number of iterations and time-to-converge with a varying number of partitions (disjoint subsets of points). A global map applies local MapReduce clustering on its input partition. An increase in the number of partitions decreases the size of each subset of points. The normal K-Means clustering takes 18 iterations for this dataset to converge for a threshold of 0.01. From the graph, one can observe that (1) relaxed K-Means is significantly faster for the partitions considered, and (2) as in Pagerank, there exists an optimal number of partitions, namely at 109 partitions.

Fig. 9. Iterations and time-to-converge for different numbers of partitions.

6. Discussion

Generality. Our relaxed synchronization mechanisms can be generalized to broad classes of applications. The pagerank application, which relies on an asynchronous mat-vec, is representative of eigenvalue/linear system solvers (computing eigenvectors using the power method of repeated multiplications by a unitary matrix). A wide range of similar applications requiring the spectra of graphs, global alignments, and random walks can be computed using this algorithmic template. For all of these applications, our methods and results are directly applicable. Algorithms such as shortest paths over sparse graphs and graph alignment can be directly cast into our framework. Beyond graphs, the asynchronous mat-vec forms the core of iterative linear system solvers. Asynchronous kmeans clustering immediately validates the usefulness of our approach in various clustering and data-mining applications. Finally, the goal of this paper is to examine tradeoffs between serial operation count and distributed performance. These tradeoffs manifest themselves in a much wider application class.

Programming Complexity. While relaxed synchronization requires slightly more programming effort than traditional MapReduce, we argue that the programming complexity is not substantial. This is also evidenced by the simplicity of the semantics used in the paper to describe it.
Fault-tolerance. While our approach relies on existing MapReduce mechanisms for fault tolerance, in the event of failure(s) our recovery times may be slightly longer, as each map task is longer and re-execution would take longer. However, all of our results are reported on a production wide-area cluster of 460 nodes, with real-life transient failures. This leads us to believe that the overhead is not significant.

Scalability. In general, it is difficult to estimate the resources available to, and used by, a program execution in a Cloud computing environment. We draw our assertions on the scalability of our formulation by inferring resource availability from the number of graph partitions. Please note that these inferences are meant to be qualitative in nature, and not based on raw data (since the data is not made available by design). Typically, clusters run on the order of 10 map tasks per node. Since each map handles one complete partition of the graph, for very large numbers of partitions (e.g., 6400), we potentially use the entire 460 nodes in the Google cluster for the map phase. Such high node utilization incurs heavy network delays during copying and merging before the reduce phase, leading to increased synchronization overheads. The significant speedups obtained using our semantics for iterative MapReduce in such large data center environments demonstrate that our solution is indeed scalable.

7. Related work

Over the past few years, the MapReduce programming model has gained attention primarily because of its simple programming model and the wide range of underlying hardware environments. There have been efforts exploring both the systems aspects as well as the application base for MapReduce.

Chen et al. [4] describe their experiences with large data-intensive tasks – analyzing the NetFlix user data and scene matching in GPS-tagged images – along with compute-intensive tasks such as earthquake simulations. They observe that small application-specific optimizations greatly improve the performance and runtime. For data-intensive tasks, they show that intelligent partitioning of the input data to maps heavily reduces network communication during global synchronization. They also note that several applications are entirely data parallel, obviating the need for a reduce operation. They argue that the MapReduce runtime (e.g., Hadoop) should be optimized for such reduction-free computations. They also propose indexing of inputs and intermediate data to enable faster access, provenance tracking for linkage and dependence, and sampling of input data to learn characteristics such as skews in the map and reduce computations. HBase [11] and HyperTable [12] are ongoing projects that use the primitives of HDFS (the Hadoop Distributed File System) to provide a distributed, column-oriented store for efficient indexing of data being used in the MapReduce system.
A number of efforts [15, 22, 1] target optimizations to the MapReduce runtime and scheduling systems. Proposals include dynamic resource allocation to fit job requirements and system capabilities, and detecting and eliminating bottlenecks within a job. Such improvements, combined with our efficient application semantics, would significantly increase the scope and scalability of MapReduce applications. The simplicity of the MapReduce programming model has also motivated its use in traditional shared memory systems [21].

A dominant component of a typical Hadoop execution corresponds to the underlying communication and I/O. This happens even though the MapReduce runtime attempts to reduce communication by trying to instantiate a task at the node or the rack where the data is present. Afrati et al. [2] study this important problem and propose alternate computational models for sorting applications to reduce communication between hosts in different racks. Our extended semantics deal with the same problem, but from the application's perspective, irrespective of the underlying hardware resources.

There have been recent efforts aimed at bridging the gap between relational databases and MapReduce [24]. These efforts propose a merge phase after the reduce phase to compute joins on the outputs of various reductions. Olston et al. [18] propose an SQL-style declarative language, Pig Latin, on top of the MapReduce model. The underlying compiler deals with the automatic creation of the procedural map and reduce functions for data-parallel operations. Such languages further enhance programmability and allow the system to optimize execution. Dryad [14], DryadLINQ [26], and Sawzall [19] aim to make MapReduce an implicit low-level programming primitive. A program written in LINQ can be parsed to form a DAG, which is then used by the Dryad system to schedule the data-parallel portions on a distributed cluster. Most of these efforts are aimed at bringing MapReduce programming primitives into high-level languages and also at overcoming the rigidity of MapReduce semantics. They are also helpful in the logical separation of data-parallel and task-parallel regions in a general application.

All of the aforementioned efforts try to improve the efficiency of MapReduce by adding software layers on top of existing MapReduce semantics or by improving the underlying runtime. In contrast, we aim to extend MapReduce semantics to leverage algorithmic asynchrony in important application classes.

8. Future Work

The myriad trade-offs associated with the wide range of overheads on different platforms pose intriguing challenges. We identify some of these challenges as they relate to our proposed solutions:

Generality of semantic extensions. We have demonstrated the use of partial synchronization and eager scheduling in the context of a specific application class. While we have argued in favor of their broader applicability, these claims must be quantitatively established. Several task-parallel applications with complex interactions are not naturally suited to traditional MapReduce formulations. Are the proposed set of semantic extensions adequate for such applications? MapReduce tries to strike a balance between programmability and performance. It attempts to divest the programmer of the burdens of optimizing locality, scheduling, and communication. In doing so, it often compromises performance. These trade-offs must be viewed in the context of the underlying platforms and applications.

Implications for tightly coupled systems. MapReduce has been shown to be an effective alternative to Pthreads even in shared-memory systems [21] for specific applications. It is important to note that shared-memory MapReduce has a much larger design space because of the higher control over the spawned map and reduce tasks (thread pools). A number of performance optimizations are possible here, ranging from pipelining iterates of map and reduce operations to reordering map and reduce operations in order to aggregate maps and enhance computation-communication characteristics. Other optimizations involve speculative execution of maps, relying on promises as the outputs of maps as opposed to the values themselves. This rich space of optimizations bears significant research investigation.

Integration of proposed semantics with high-level programming languages. Our proposed semantics and their variants can be integrated with SQL-type declarative languages, namely Pig Latin [18] and DryadLINQ [26]. The resulting system would potentially improve the efficiency of MapReduce significantly, while decreasing the load on application programmers to make decisions over the distribution of data.

Optimal granularity for maps. As shown in our work, as well as in the results of others, the performance of a MapReduce program is a sensitive function of map granularity. An automated technique, based on execution traces and sampling [9], can potentially deliver these performance increments without burdening the programmer with locality-enhancing aggregations.
System-level enhancements. Often, when executing iterative MapReduce programs, the output of one iteration is needed in the next iteration. Currently, the output from a reduction is written to the distributed file system (DFS) and must be accessed from the DFS by the next set of maps, which involves significant overhead. Using online data structures (for example, Bigtable) provides credible alternatives; however, issues of fault tolerance must be resolved.

9. Conclusion

In this paper, we propose extended semantics for MapReduce based on partial synchronizations and eager map scheduling. We demonstrate that, when combined with locality-enhancing techniques and algorithmic asynchrony, these extensions are capable of yielding significant performance improvements. We demonstrate our results in the context of the problem of computing pageranks on a web graph. Our results strongly motivate the use of partial synchronizations for broad application classes. Finally, these enhancements in performance do not adversely impact the programmability and fault-tolerance features of the underlying MapReduce framework.

References

[1] Improving MapReduce Performance in Heterogeneous Environments. San Diego, CA, December 2008. USENIX Association.
[2] Foto N. Afrati and Jeffrey D. Ullman. A New Computation Model for Rack-based Computing. http://infolab.stanford.edu/~ullman/pub/mapred.pdf.
[3] R. E. Bryant. Data-intensive supercomputing: The case for DISC. Technical Report CMU-CS-07-128, Carnegie Mellon University, 2007.
[4] Shimin Chen and Steven W. Schlosser. Map-reduce meets wider varieties of applications. Technical Report IRP-TR-08-05, Intel Research Pittsburgh, 2008.
[5] Vijayanand Chokkapu and Asif ul Haque. Pagerank calculation using MapReduce. Technical report, Cornell Web Lab, 2008. http://www.infosci.cornell.edu/weblab/papers/Chokkapu2008.pdf.
[6] Cheng-Tao Chu, Sang Kyun Kim, Yi-An Lin, YuanYuan Yu, Gary Bradski, Andrew Y. Ng, and Kunle Olukotun. Map-reduce for machine learning on multicore. Advances in Neural Information Processing Systems 19, pages 281–288.
[7] D. J. de S. Price. A general theory of bibliometric and other cumulative advantage processes. Journal of the American Society for Information Science, 27:292–306, 1976.
[8] Jeffrey Dean and Sanjay Ghemawat. MapReduce: Simplified data processing on large clusters. Proc. of OSDI, 2004.
[9] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification, 2nd edition, Chapter 8. A Wiley-Interscience Publication, 2001.
[10] Hadoop. http://hadoop.apache.org.
[11] HBase. http://hadoop.apache.org/hbase.
[12] Hypertable. http://www.hypertable.org.
[13] The igraph library. http://igraph.sourceforge.net/.
[14] Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, and Dennis Fetterly. Dryad: Distributed data-parallel programs from sequential building blocks. Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems, 2007.
[15] Karthik Kambatla, Abhinav Pathak, and Himabindu Pucha. Towards optimizing Hadoop provisioning for the cloud. 1st Workshop on Hot Topics in Cloud Computing (HotCloud), 2009.
[16] Apache Mahout. http://lucene.apache.org/mahout.
[17] METIS. http://glaros.dtc.umn.edu/gkhome/views/metis.
[18] Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, and Andrew Tomkins. Pig Latin: A not-so-foreign language for data processing. SIGMOD '08: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2008.
[19] R. Pike, S. Dorward, R. Griesemer, and S. Quinlan. Interpreting the data: Parallel analysis with Sawzall. Scientific Programming Journal, Special Issue on Grids and Worldwide Computing Programming Models and Infrastructure, 13(4):227–298, 2003.
[20] Stanford WebBase Project. http://diglib.stanford.edu:8091/~testbed/doc2/WebBase/.
[21] Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, and Christos Kozyrakis. Evaluating MapReduce for multi-core and multiprocessor systems. Proceedings of the 13th Intl. Symposium on High-Performance Computer Architecture (HPCA), Phoenix, AZ, 2007.
[22] Thomas Sandholm and Kevin Lai. MapReduce optimization using dynamic regulated prioritization. SIGMETRICS/Performance '09: Proceedings of the 2009 Joint International Conference on Measurement and Modeling of Computer Systems, 2009.
[23] UCI Machine Learning Repository: US Census Data (1990). http://kdd.ics.uci.edu/databases/census1990/USCensus1990.html.
[24] H.-C. Yang, A. Dasdan, R.-L. Hsiao, and D. S. Parker Jr. Map-reduce-merge: Simplified relational data processing on large clusters. SIGMOD '07: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2007.
[25] Elad Yom-Tov and Noam Slonim. Parallel pairwise clustering. SDM '09: Proceedings of the SIAM Data Mining Conference, 2009.
[26] Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Úlfar Erlingsson, Pradeep Kumar Gunda, and Jon Currey. DryadLINQ: A system for general-purpose distributed data-parallel computing using a high-level language. Symposium on Operating Systems Design and Implementation (OSDI), San Diego, CA, 2008.