Pascal Viewer: a Tool for the Visualization of Parallel Scalability Trends

PaScal Viewer: a Tool for the Visualization of Parallel Scalability Trends 1st Anderson B. N. da Silva 2nd Daniel A. M. Cunha Pro-Reitoria´ de Pesquisa, Inovaçao˜ e Pos-Graduaç´ ao˜ Dept. de Eng. de Comp. e Automaçao˜ Instituto Federal da Paraiba Univ. Federal do Rio Grande do Norte Joao Pessoa, Brazil Natal, Brazil [email protected] [email protected] 3st Vitor R. G. Silva 4th Alex F. de A. Furtunato 5th Samuel Xavier de Souza Dept. de Eng. de Comp. e Automaçao˜ Diretoria de Tecnologia da Informaçao˜ Dept. de Eng. de Comp. e Automaçao˜ Univ. Federal do Rio Grande do Norte Instituto Federal do Rio Grande do Norte Univ. Federal do Rio Grande do Norte Natal, Brazil Natal, Brazil Natal, Brazil [email protected] [email protected] [email protected] Abstract—Taking advantage of the growing number of cores particular environment. The configuration of this environment in supercomputers to increase the scalabilty of parallel programs includes the number of cores, their operating frequency, and is an increasing challenge. Many advanced profiling tools have the size of the input data or the problem size. Among the var- been developed to assist programmers in the process of analyzing data related to the execution of their program. Programmers ious collected information, the elapsed time in each function can act upon the information generated by these data and make of the code, the number of function calls, and the memory their programs reach higher performance levels. However, the consumption of the program can be cited to name just a few information provided by profiling tools is generally designed [2]. It is possible to measure the parallel efficiency of the to optimize the program for a specific execution environment, program in the analyzed environment. However, this value with a target number of cores and a target problem size. A code optimization driven towards scalability rather than specific would probably change if the same program is executed in performance requires the analysis of many distinct execution different environments. For this reason, from the information environments instead of details about a single environment. With collected from a single run alone, it is not possible to evaluate the goal of providing more useful information for the analysis and how the efficiency will evolve when the program is executed optimization of code for parallel scalability, this work introduces with a different number of cores or with a different problem the PaScal Viewer tool. It presents an novel and productive way to visualize scalability trends of parallel programs. It consists size. To discover the efficiency trends, developers need to of four diagrams that offers visual support to identify parallel perform the same analysis in many different environments. efficiency trends of the whole program, or parts of it, when Then, they can compare the data manually and say if and running on scaling parallel environments with scaling problem when the algorithm tends to be scalable. sizes. The information provided by these profiling tools is very Index Terms—parallel programming, efficiency, scalability, performance optimization, visualization tool useful to optimize the program execution for a single environment, with a target number of cores and a target problem size. I. INTRODUCTION However, when the goal is to analyze and optimize the code for parallel scalability, developers need to focus their attention The number of cores in supercomputers continues to grow. on the variation of efficiency values when the program runs Taking advantage of this to increase the performance of in many distinct execution configurations. In this case, it is parallel programs is a continuous challenge. Developers must more relevant to know fewer details about many executions understand how their program behaves when more cores are than many more details about a single execution. In addition, used to process data, when more data need to be processed, current profiling tools use different techniques for collecting or both—when data need to be processed by more cores. and analyzing data and, in some cases, they are developed for Many techniques and profiling tools have been developed to specific architectures and/or different parallelization models. assist programmers in the process of collecting and analyzing These tools present the information collected in large data data related to the execution of their program [1]. These tools tables or complex graphs, and because of that, demand a provide a large amount of information, measurements, details, ”good” knowledge of their visualization interfaces [3]. Some and characteristics of the program. All of this information is approaches, such as [4], present the efficiency values for usually related to one single execution of the program in a different environments in a single line chart. From such chart, This research was supported by High Performance Computing Center at developers could infer the program scalability trends, but this UFRN (NPAD/UFRN). task is not always simple for a large number of environment configurations are depicted. Weak scalability trend are also framework [6]. The color diagrams are drawn using Bokeh, difficult to infer from these line charts. an interactive visualization library [7]. SPERF is a simple tool that can automatically instrument A. The Color Diagrams parallel code with the insertion of time measurement [5]. From these measurements, developers can assess the execution The proposed color diagrams simplify the understanding time of each parallel region or regions of interest. With and the visualization of the scalability trends from a parallel these specific time measurements, developers can verify the program. Fig. 1 presents the four diagrams generated from efficiency of the whole program or part of it. They can check, execution data collected from a theoretical program with the for example, if the scalability of the program as a whole following characteristics: 2 deteriorates because of a specific region. In this way, they can • The serial execution time is given by T Serial = n ; 2 focus in code regions that breaks scalability, much in the same • The parallel execution time is given by T Parallel = n =p+ way they do for optimizing single-run performance bottlenecks log2(p); using traditional tools. This work uses the output of SPERF to • p corresponds to the number of cores and n corresponds construct visualization diagrams that unveils scalability trends. to the problem size; However, since the output of SPERF is a formatted text file, Each diagram is presented as a graphic of two axes. The it can also be generated by another tool or by a productivity horizontal axis corresponds to the number of cores and the script. vertical axis corresponds to the problem size. Both are orga- This paper presents a visualization tool for the scalability nized in same order presented in the input file. The numerical analysis of parallel programs. For this, the tool takes as input values of each diagram element can be visualized in a tooltip, the execution time measurements of a program in different as shown in Fig.1. configuration environments as provided by SPERF, translates The diagram located on the upper left corner of Fig. 1 these values into the corresponding efficiency values, and presents the parallel efficiency values. Each element of the presents, through simple diagrams, the efficiency trends of this diagram, represented by a color, corresponds to a particular program. The objective of the tool is to avoid a tedious manual execution scenario, with a specific number of cores and prob- comparison of efficiency values. The tool is independent of lem size. The numerical values depicted in this first diagram architecture, parallelization model or profiling tool. It displays are showed at Table I and Fig. 2. These values serve as base four color diagrams related to each analyzed region. One for constructing the other three diagrams and provide a general diagram holds the efficiency values and the other three show view of the program behavior. the variation of these values: when the number of cores is fixed; when the problem size is fixed; and when the number TABLE I of cores and the problem size change proportionally at given EFFICIENCY VALUES OF A THEORETICAL PROGRAM. rates. This tool can assist developers during the scalability analysis of their parallel programs in a simple and productive way. It helps on the identification of hot spots that when refactored could optimize the program scalability. The remainder of this work is organized as follows Section II describes the tool and its color diagrams. The results of a simple case study are presented in Section III. Section IV presents the related work. And finally the contribution is summarized in Section V with an outlook of future works. II. THE PASCAL VIEWER The other three diagrams present the results of the difference This work presents a tool that introduces a novel and between the efficiency values represented in the first diagram productive way to view the scalability trends of a parallel for each two bordering execution scenarios. The colors in these program, named Parallel Scalability Viewer or simply PaScal diagrams change according to two distinct ranges. One range Viewer. For this, the tool translates the efficiency values for the positive values and another for the negative ones. In the collected from program executions into four color diagrams. case of Fig. 1, the color range for the positive values varies These diagrams offer support to identify efficiency variation from white to green (from #FFFFFF to #004337, in RGB) when the program runs on parallel environments and when it and for the negative values varies from white to brown (from processes different amounts of data. In this sense, the tool aids #FFFFFF to #5D3506, in RGB). developers in the identification of parallel scalability, including The diagram located on the bottom left corner allows the the analysis of whether this scalability is weak or strong, for scalability analysis of the program when the number of cores the whole program or parts of it.

Pascal Viewer: a Tool for the Visualization of Parallel Scalability Trends

Distributed Algorithms with Theoretic Scalability Analysis of Radial and Looped Load ﬂows for Power Distribution Systems

Scalability and Performance Management of Internet Applications in the Cloud

Scalability and Optimization Strategies for GPU Enhanced Neural Networks (Genn)

Parallel System Performance: Evaluation & Scalability

Multiprocessing and Scalability

Computer Architecture: Parallel Processing Basics

Parallel Programming with Openmp

Challenges for the Message Passing Interface in the Petaflops Era

Scalable SIMD-Efficient Graph Processing on Gpus

Computing at Massive Scale: Scalability and Dependability Challenges

CUDA Basics Murphy Stein New York University

Evaluation of a Multithreaded Architecture for Cellular Computing