Jornada de Seguimiento de Proyectos, 2010. Programa Nacional de Tecnologías Informáticas

Hardware and Software Support for High Performance Computing TIN2007-67537-C03

Javier Díaz Bruguera∗ (Universidad de Santiago de Compostela) and Ramón Doallo Biempica† (Universidad de A Coruña)

Abstract

The objectives of this project are a continuation of the results obtained during the development of the project TIN2004-07797-C02. Taking these results as our starting point, we have dealt with new objectives, some of them oriented to solving the new challenges that arose with the installation of the Finisterrae at CESGA (Galician Supercomputing Center) in 2007. The objectives are organized into three main areas: (1) Performance and programmability improvement of HPC systems, improving the functionality of HPC systems, with special focus on irregular codes, exploring two approaches: analytical modeling and runtime solutions. (2) Software tools for HPC and Grid facilities, developing middleware for system management of the Finisterrae supercomputer and Grid environments. (3) Performance improvement for multimedia applications and general purpose processors, where we tackle the design of algorithms and architectures for video compression, real-time visualization and functional units of general purpose processors.

Keywords: High performance computing, constellation architecture, multicore and multithreaded processors, efficient software.

1 Objectives of the project

∗Email: [email protected]    †Email: [email protected]

It has to be pointed out that the project proposal comprised three subprojects and research groups: USC, UDC and CESGA; however, the project was finally approved with only two of the three subprojects, USC and UDC. This has affected some of the objectives. Thus, two groups have been involved in this proposal: the Computer Architecture Group at the University of Santiago de Compostela (USC Group) and the Computer Architecture Group at the University of A Coruña (UDC Group). As global background, the project is a continuation of the research lines developed by the USC and UDC groups in recent years on high performance computing, both on the hardware side and on the software side. The objectives are organized into three main areas:

1. Improvement of the performance and programmability of HPC systems. The main concern of this part of the project is to improve the functionality of HPC systems, with special focus on irregular codes. We organized the proposal into two main topics: the characterization of irregular codes, combining compiler and run-time techniques, and the study of the functionalities of PGAS languages as an efficient alternative for constellation architectures (like Finisterrae).

(a) Compiler and Run-time support for performance analysis and optimization of irregular codes. We explore two approaches to deal with their complexity: analytical modeling, and runtime solutions such as inspector/executor. Both approaches require compiler support for analyzing these complex codes. This support will be provided by XARK (http://xark.des.udc.es), a compiler framework developed by the UDC Group [4].
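The inspector/executor strategy named above can be sketched in a few lines. This is an illustrative Python sketch of the general technique, not the project's actual XARK-based implementation: the inspector scans the indirection array at run time and partitions the iterations of an irregular reduction into blocks with disjoint write sets, which the executor could then run as independent parallel tasks.

```python
# Hedged sketch of the inspector/executor technique for an irregular
# reduction y[idx[i]] += f(x[i]); all names are illustrative.

def inspector(idx, n_blocks):
    """Partition iterations so that two iterations writing to the same
    element of y always land in the same block; blocks therefore have
    disjoint write sets and can run concurrently."""
    blocks = [[] for _ in range(n_blocks)]
    owner = {}  # target element -> block that writes it
    for i, target in enumerate(idx):
        b = owner.setdefault(target, i % n_blocks)
        blocks[b].append(i)
    return blocks

def executor(blocks, idx, x, y, f):
    """Apply the reduction block by block; in a real run each block
    would be one parallel task, with no cross-task synchronization."""
    for block in blocks:
        for i in block:
            y[idx[i]] += f(x[i])
    return y
```

The inspector cost is paid once and amortized over repeated executions of the loop, which is the usual argument for this technique on irregular codes.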

(b) Analysis and improvement of performance and programmability in HPC systems using PGAS approaches. The objectives of this line are: (1) to compare the programmability and performance using traditional approaches versus PGAS languages, (2) to propose performance optimizations and programmability enhancements by means of PGAS language extensions and/or libraries, and (3) to extend the usability and performance features of the HTA for the programming of hybrid systems such as constellation architectures.

2. Software tools for HPC and GRID facilities. The understanding and characterization of the performance of Grid applications and the accurate simulation of Grid systems are the focus of this research line.

(a) Management of large-scale HPC facilities. The research lines proposed in this new project take advantage of the results achieved in the previous MEC project. Specifically, the first goal is to use AdCIM [27] to develop customized and integrated system administration applications for the Finisterrae constellation architecture and for the Grid. One of the objectives related to the application of AdCIM, "Application of the AdCIM framework for the systematic development of customized tools for selected administration domains of CESGA", could not be carried out, since the CESGA subproject was not funded and the staff and infrastructure of CESGA were essential for the achievement of this objective.

(b) Fault tolerance of high-performance applications. In the frame of the previous MEC project a tool named CPPC (Controller/Precompiler for Portable Checkpointing) was developed [23]. In this project we continue the development of the CPPC tool to achieve the complete automation of the checkpointing process, so that the tool performs a source-to-source transformation of an MPI code into a fault-tolerant one by inserting the necessary functions of the library at safe points.
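The kind of code such a checkpointing transformation produces can be sketched as follows. This is a minimal Python illustration of the general save/restart idea (dump the live variables at a safe point, reload them and skip completed work after a failure); the function names are invented for the example and are not the CPPC library API.

```python
# Minimal sketch of what a checkpointing transformation inserts:
# at a "safe point" the live variables are dumped to disk; on restart
# the code reloads them and resumes from the saved iteration.
import os
import pickle

CKPT = "demo.ckpt"

def checkpoint(state):
    # write to a temporary file first, then rename, so a crash during
    # the dump cannot corrupt the previous checkpoint
    with open(CKPT + ".tmp", "wb") as f:
        pickle.dump(state, f)
    os.replace(CKPT + ".tmp", CKPT)

def restart():
    if os.path.exists(CKPT):
        with open(CKPT, "rb") as f:
            return pickle.load(f)
    return {"i": 0, "acc": 0}  # initial live state

def long_computation(n, fail_at=None):
    st = restart()
    for i in range(st["i"], n):
        st["acc"] += i            # the application's real work
        st["i"] = i + 1
        checkpoint(st)            # safe point: no messages in transit
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated crash")
    return st["acc"]
```

In the MPI setting the extra difficulty, addressed by the analyses described in Section 2, is choosing safe points where no in-transit or inconsistent communications exist.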

(c) Software support for performance characterization and optimization of Grid applications. We propose to study the optimization of Grid resources to execute computationally intensive applications efficiently. Thus, we consider the Grid as a huge computing system for executing very time-consuming applications. Within this topic, the following objectives are considered: the understanding and characterization of the performance of these applications in Grids, the accurate simulation of Grid systems to reproduce real executions, and the improvement of the performance of the execution of these HPC applications.

3. Hardware for multimedia and general purpose processors. We tackle the implementation of video compression algorithms on programmable processors, exploiting the parallelism provided by the organization of the processors. On the other hand, we focus on the design of improved algorithms and architectures for the computation of essential operations for multimedia and other applications.

(a) Algorithms and architectures for multimedia. We focus on processors with EPIC and VLIW architectures. These processors provide instruction level parallelism and require efficient programming methodologies to exploit the SIMD programming paradigm, multithreading, software pipelining and the memory hierarchy. We will implement the video compression algorithms on those architectures. A special effort is devoted to the development of new algorithms for motion estimation. Moreover, we have addressed the design of units for real-time visualization in various applications.

(b) Design of functional units for general-purpose processors. Our goal in this project is to further improve the implementation of essential operations, such as square root and inverse square root, by developing multiplicative algorithms with reduced latency. Computations related to multimedia are error sensitive: small errors can propagate and result in large final errors. We therefore propose the use of error estimates that can help to obtain more reliable results. Another topic addressed in the project is the design of decimal floating-point hardware.
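The multiplicative algorithms mentioned in (b) can be illustrated with the classical Newton-Raphson iteration for the inverse square root, x' = x(3 - a·x²)/2, which uses only multiplications and additions and converges quadratically (each step roughly doubles the number of correct bits). The sketch below is a generic textbook illustration, not the reduced-latency algorithm developed in the project; the seed choice is crude, where a hardware unit would use a small lookup table.

```python
# Newton-Raphson inverse square root: multiply/add steps only.
import math

def inv_sqrt(a, steps=8):
    assert a > 0
    # Normalize a = m * 4**e with m in [1, 4), so that
    # 1/sqrt(a) = 2**(-e) / sqrt(m); this mirrors the exponent
    # handling of a floating-point unit.
    e = 0
    m = a
    while m >= 4.0:
        m /= 4.0
        e += 1
    while m < 1.0:
        m *= 4.0
        e -= 1
    x = 1.0 / m                           # crude seed in place of a table
    for _ in range(steps):
        x = x * (1.5 - 0.5 * m * x * x)   # one multiplicative NR step
    return x * 2.0 ** (-e)

def sqrt_via_inv(a):
    # sqrt(a) = a * (1/sqrt(a)): square root from the same iteration
    return a * inv_sqrt(a)
```

The rounding problem the project attacks appears exactly here: turning the final approximation into a correctly rounded result classically requires computing a remainder, which the proposed variants avoid in most cases.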

2 Level of success achieved in the project

It has to be pointed out that the project proposal comprised three subprojects and research groups: USC, UDC and CESGA; however, the project was finally approved with only two of the three subprojects, USC and UDC. This might have affected the level of success of some of the objectives, but most of them have been addressed. The two groups finally participating in the project have had a strong research collaboration for many years; in fact, several topics are being developed by teams composed of members of both groups. In any case, it is clear that the research interests of both groups are complementary. In order to show how the objectives of the project are being dealt with, we indicate for every objective in the previous section the level of success and the group involved in its development. Note that in the technical report of the project the objectives were decomposed into a set of more detailed subobjectives or tasks, which for space reasons are not listed here; we strongly recommend consulting those tasks in the technical report. For every topic a few representative publications are included, although there are other publications not referenced here. The most outstanding results are:

1. Improvement of the performance and programmability of HPC systems

(a) Compiler and Run-time support for performance analysis and optimization of irregular codes (UDC, USC). This objective had several tasks: (1) extension of XARK, (2) analysis and optimization of complex memory hierarchies, and (3) run-time characterization and performance improvement of parallel irregular codes. The XARK compiler framework (http://xark.des.udc.es) developed by the Computer Architecture Group of the UDC has been extended. In this part of the project two main research lines have been conducted.
The main contribution of the first research line is the formalization of a recognition algorithm that makes it possible to build a hierarchical representation of a program using the concept of computational kernel. This hierarchical representation provides an optimizing compiler with relevant information for improving the performance of a program on general-purpose parallel architectures (e.g., multi-core processors) and on specific-purpose parallel architectures such as GPUs. The internals of XARK consist of two demand-driven algorithms that analyze the Gated Single Assignment (GSA) form. This approach was shown to be a generic, robust and extensible solution for the automatic recognition of computational kernels. Another important contribution of the first research line is the definition of a collection of computational kernels that are representative of regular and irregular real applications. Finally, note that the internals of the XARK compiler framework have been evaluated with a set of well-known benchmark suites from different application domains, more specifically, Sparskit-II, Perfect Club, SPEC CPU2000 and the PLTMG code. The second research line is the construction of an interprocedural GSA (IGSA) form. This IGSA form will be the basis to extend XARK for the recognition of complex syntactic variants of computational kernels that have been implemented using function/procedure calls. At this moment, a simple and fast algorithm to build IGSA on top of the Static Single Assignment (SSA) form available in modern optimizing compilers has been developed [3]. IGSA hinges on the concept of locator as an abstraction of contiguous and non-contiguous memory regions represented as data structures that use arrays and/or pointer variables. Work in progress focuses on the evaluation of the IGSA construction algorithm using an implementation based on the GIMPLE-SSA intermediate representation of the GNU GCC compiler.
The algorithm is evaluated in terms of memory consumption and execution time using C and Fortran codes from well-known benchmark suites. Finally, a new research line that is a natural evolution of the objectives of this part of the project has been started: its objective is to propose a new intermediate representation for parallelizing compilers based on the concept of computational kernel of the XARK compiler framework [4]. With respect to the analysis and optimization of complex memory hierarchies, we completed the modeling of the Discrete Fourier Transform (DFT) using Probabilistic Miss Equations (PMEs) [10]. Namely, we focused on the DFTs generated by the SPIRAL tool from Carnegie Mellon University. These codes present very complex access patterns, and the accurate modeling of their performance required extending the PME model to consider physically indexed caches as well as hardware prefetching. The results were very successful: our model can not only find DFTs of a quality similar to those found by SPIRAL's own search while requiring much shorter search times, but it can sometimes even find faster codes. Moreover, we extended the Probabilistic Miss Equations (PME) model to estimate the Worst Case Memory Performance (WCMP) of codes with regular access patterns in the absence of information on the base addresses of the data structures [1]. This is a very interesting feature, since such addresses are not available in many situations. Our first approach was not completely safe, failing to provide the WCMP in some situations; in the last year we developed a totally safe prediction. A model to predict the Best Case Memory Performance (BCMP), and a model to predict the WCMP of codes with irregular access patterns, were also developed. Finally, we designed a cache called Set Balancing Cache (SBC) that is able to shift lines from sets whose working sets cannot fit in them to sets with more lines than their working set [26].
Comparisons with related work proved the efficiency of the SBC design in terms of performance, area and power consumption.

(b) Analysis and improvement of performance in HPC (USC). A framework based on a methodology to obtain analytical models of MPI applications on multiprocessor environments has been developed [15]. The framework consists of an instrumentation stage followed by an analysis stage, based on the CALL instrumentation tool and on the R language, respectively. A number of functionalities were developed in the analysis stage to help the performance analyst to study analytical models in parallel environments. One of the most relevant features is an automatic fit process to obtain an analytical model of parallel programs. An efficient and flexible process to automatically generate and fit all candidate analytical functions was developed. The process allows the user to introduce information about the behavior of the code, so that the precision of the obtained model depends on the amount of information provided. If no information is introduced, the monitored parameters are used to build the initial functions. This process provides a full search over all possible candidate analytical functions with physical meaning. A methodology to automatically obtain a complete characterization of the communication behavior in MPI environments was also developed [16]. This method automatically detects the message sizes where the communication behavior changes; thereby, the range of message sizes can be split into different intervals. As the communication behavior inside each interval presents a linear dependency on the message size, each interval can be characterized by its own LogP-based parameter set. Both the parameter assessment and the detection of different communication behaviors are obtained using microbenchmark measurements.
The detection process splits the message size range into contiguous intervals with different behaviors of overhead and gap. The procedure finds significant changes in both features, and groups measurements based on the similarity of the parameter estimations using data mining techniques. The parameters of some LogP-based models are assessed for each message size interval. Practical results in real systems show the accuracy of the method, as it can automatically detect architectural characteristics that influence the performance of communications in MPI environments. In addition, a method for obtaining statistical models to characterize the performance of parallel codes has also been proposed. This method, based on model selection techniques using Akaike's information criterion (AIC), provides the user with a statistical model of the code under study, as well as statistical information that can be used to validate the proposed model. The proposed methodology generates a set of candidate models based on information that can be provided by the user; this information contains a number of metrics and variables that might influence the performance of the application. Then, a model selection technique based on AIC is used to select the most accurate model. Furthermore, the Akaike weights and the relative importance of the terms are also calculated. This information helps the user to evaluate the appropriateness of the proposed best model, and it provides a guide on how to improve the analysis process. This method has been implemented in the modeling framework, and validated in different situations, such as different implementations of collective operations in Open MPI and NPB benchmarks with different communication-to-computation ratios. Another research line focused on a study to characterize Finisterrae, a several-node NUMA system comprising two SMP cells per node and four dual-core Itanium2 Montvale processors per cell.
The main objective was to determine the performance effect of bus contention and cache coherency, as well as the suitability of porting strategies for irregular codes in such a complex architecture. Results show that, for big data sizes, the effect of sharing a bus degrades the final performance but masks the cache coherency effects [21]. Furthermore, these outcomes allow us to study thread-to-core mappings and memory allocation policies. We are currently working on the development of strategies to guide applications at runtime, which will be especially relevant and will facilitate the task of a hypothetical scheduler when defining a policy to map threads to cores and to allocate memory in a cell [20]. A different research line focused on the idea of effectively using on-chip hardware counters to improve the performance of memory accesses at runtime [19]. We have selected two different contexts in which the benefits of this idea can be important: the efficient execution of irregular codes, and the use of page migration to improve the execution of parallel codes. Previous developments in terms of IRAD and distance metrics were adapted to this new context.

(c) Programmability in HPC systems using PGAS approaches (UDC). Regarding the comparison of programmability and performance of traditional parallel programming paradigms and PGAS approaches, the performance of UPC (both HP UPC and Berkeley UPC) on the Finisterrae supercomputer was compared to that of more traditional approaches (MPI and OpenMP), using a series of representative kernel and application benchmarks. The performance results were analyzed, detecting bottlenecks and inefficiencies in current UPC compilers and libraries. As a result, scalable performance is usually obtained through costly manual optimizations, making code development with UPC harder.
Moreover, the UPC compiler technology is generally immature, so UPC codes usually show poorer performance than their counterpart C codes. We defined a methodology to characterize the UPC learning process. The application of this new approach in several training sessions for novice UPC programmers has shown that UPC is much easier to learn than the message-passing paradigm, although the performance obtained is usually much poorer. The results of this task strongly support the development of the following tasks, particularly the optimization of UPC libraries [11] and the improvement of the programmability of collective operations, removing some restrictions that limit the general adoption of UPC. With respect to the development of PGAS language extensions and new libraries to enhance programmability and performance of HPC systems, focusing on irregular codes, we tackled the optimization of UPC standard collectives on multicore cluster configurations, taking advantage of inter/intra-node awareness as well as data locality. UPC performance can be improved by maximizing the locality of the data, reducing remote accesses, and through an efficient mapping of threads to particular cores. Moreover, we designed and developed a parallel numerical library for UPC (BLAS1, BLAS2 and BLAS3 dense operations). In order to assist UPC performance improvement, Servet, a suite of benchmarks focused on detecting a set of parameters of multicore systems (cache hierarchy, topology and sizes, as well as memory and network throughput), has been implemented [12]. The detected parameters have a high influence on overall performance, and exploiting them significantly improves the scalability of UPC codes. Finally, we have designed and developed libraries that improve UPC programmability, supporting irregular collective operations as well as computational kernels (e.g., HashMap, Montecarlo).
Additionally, a set of collective operations that do not require the use of shared variables is provided (the avoidance of this restriction is important for both programmability and performance enhancement). On the other hand, a version of our Hierarchically Tiled Arrays (HTA) library was developed based on Threading Building Blocks (TBB), an API native to multicore systems that enables the exploitation of their advantages [?]. We then compared task-oriented parallel programming using TBBs versus the data-parallel approach on which HTAs rely in shared memory systems. We concluded that the performance was similar, while HTAs improved programmer productivity. Usability enhancements to the HTA class, such as HTA dynamic repartitioning and overlapped tiling, were also developed. We also examined the design issues, opportunities and challenges that the migration of HTAs from distributed to shared memory environments brings. Finally, we began the exploration of the issues related to the application of HTAs to problems that are typically expressed with other parallelization paradigms, as well as with data structures different from the matrices that HTAs support. Expressing these new kinds of problems using data parallelism involves, in our opinion, extending data parallel programming with new abstractions.

2. Software tools for HPC and GRID facilities

(a) Management of large-scale HPC facilities (UDC). The AdCIM framework (adcim.des.udc.es), a result of the previous TIN2004-07797-C02 project, is a model-driven framework, based on the CIM model, for the management of large-scale and heterogeneous systems. AdCIM uses a more efficient XML representation of the CIM model (called miniCIM) to represent management and configuration data extracted from managed machines.
These data are extracted from a large number of configuration sources, such as flat text configuration files, by means of custom text-to-XML parsing, and persisted transparently and scalably into an LDAP repository. AdCIM consolidates and integrates these data using the expressive power of the CIM model and its relation model, exposing them through a web service interface both to external applications and to web forms generated on the fly from CIM definitions using the XForms technology. This project had several objectives related to the application of AdCIM; one of them, "Application of the AdCIM framework for the systematic development of customized tools for selected administration domains of CESGA supercomputers", could not be carried out, since the CESGA subproject was not funded and the staff and infrastructure of CESGA were essential for the achievement of this objective. On the other hand, AdCIM has been extended to Grid environments, specifically the Monitoring and Discovery System (MDS) of the Grid middleware Globus Toolkit. The Index Service of MDS collects data from various grid-managed sources and provides a query/subscription interface to those data. This interface connects the Index Service to existing services and monitoring agents. The AdCIM framework was successfully integrated into MDS [5], making it possible to interrelate and represent new types of data in MDS's Index Service and, for grid applications, to access these data via MDS. Next, using the new Globus UsefulRP subsystem, we developed a new CIM-based information provider which communicates with a new database backend for data storage, replacing the default Globus memory-backed backend. This new infrastructure greatly improved the scalability of the use of CIM data on Globus. Additionally, and based on the collaboration with the LORIA-INRIA MADYNES group in France, a new AdCIM extension was developed to manage Wireless Mesh Networks [8].
This extension was used to provide AdCIM with new ontological reasoning processes which apply knowledge to diagnose and troubleshoot various configuration problems.

(b) Fault tolerance of high-performance applications (UDC). CPPC (ComPiler for Portable Checkpointing, http://cppc.des.udc.es) is a checkpointing tool focused on the insertion of fault tolerance into long-running message-passing applications [25]. It was initially developed in the context of the previous MEC project. During this new project we have continued the development of the CPPC tool to make the insertion of fault tolerance into message-passing codes totally transparent to the user [24]. Three analyses were incorporated into the compiler: (1) a data-flow analysis to determine the variables to be saved in the checkpoint file, (2) a message-flow analysis to determine the safe points, that is, regions in the code where neither in-transit nor inconsistent communications exist, and (3) a heuristic analysis of code complexity to select the best safe points at which to dump the checkpoint file. CPPC was extensively evaluated using a large number of very different applications on the Finisterrae supercomputer hosted by CESGA (Supercomputing Center of Galicia). The tests included benchmarks and real applications, both sequential and parallel codes. A statistical approach was used to better estimate the performance of the tool. A service-based architecture called CPPC-G was developed. CPPC-G provides services for the automatic handling of checkpoint files generated by CPPC-enabled applications executed on the Grid [7]. The new Grid services automatically ask for the necessary resources; start and monitor the execution; create backup copies of the checkpoint files; detect failed executions; and restart failed applications automatically. An experimental evaluation of the framework was performed, measuring the impact of the CPPC-G services on an application execution.
Experimental results show that the overhead introduced by CPPC-G is small, especially when compared to the typical execution times of long-running applications.

(c) Software support for performance characterization and optimization of Grid applications (USC). An extension of the GridSim toolkit to simulate parallel applications was developed. This extension consists of a set of modular components, each of which can be extended to include new capabilities. A job model was implemented to support different kinds of applications. Also, support for modelling custom internal networks was developed, and three of the most common network behaviors were implemented. Furthermore, different local parallel schedulers were implemented in such a way that they can be easily extended. Besides, due to the importance of resource failures for parallel applications on Grids, support for checkpointing and failure simulation was also added to the simulator. This toolkit establishes the basis for a systematic analysis of parallel applications in a Grid; in particular, it could be used to analyze the performance of different metaschedulers. Metaschedulers based on analyses over this toolkit are currently being studied. Finally, this tool is useful to engineer Grid infrastructures for parallel applications and to analyze the effects of different design decisions. Also, it can help to design parallel Grid applications, because their behavior can be simulated and tuned to achieve better performance. In addition, new metrics to validate the efficiency of schedulers on heterogeneous systems have been proposed. These metrics can be considered generalizations of classical ones to represent execution times and throughput. They are especially useful for heterogeneous systems, and in particular for grids. They quantify how far the scheduler's decision is from assigning each job to its best resource, which implies the need to know the execution time of each job on the best resource for it.
Usually this can only be known through simulations; however, a study based on runtime estimations for real jobs has also been performed. The features of these metrics were analyzed and compared with the traditional metrics from theoretical and practical points of view. Apart from that, the WRF model, a simulation program used both for operational forecasting and for research, was migrated to a grid environment of the e-Ciencia network. Although such models are often used on supercomputing platforms, this approach offers two main advantages: transparent access to other supercomputing centers, and deployment on more grid computing resources, which is especially relevant for the statistical analysis of weather forecasts, where the execution of hundreds or thousands of cases with small variations in the initial conditions is necessary. Finally, two different works were also performed: on the one hand, the integration of all available resources in the computer labs of our University using grid technology, thus facilitating their reuse by researchers from the universities of Galicia to solve specific problems; on the other hand, the analysis of the technical feasibility of, and a business model for, the use of external resources located in centers with high computing capacity for render-based applications.

3. Hardware for multimedia and general purpose processors

(a) Algorithms and architectures for multimedia (USC, UDC). A number of significant advances have been achieved in the field of the H.264 video standard. We have obtained the most interesting results in arithmetic coding, extending previous results to the decoder. Parallel decoding has been enhanced by speeding up arithmetic decoding using a new architecture. Efficient implementations of transform coding have also been developed, balancing the cost and the processing speed required to interface with other tasks.
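For reference, the core of H.264 transform coding is the 4x4 integer transform Y = C·X·Cᵀ, a DCT approximation whose coefficients are only ±1 and ±2 and can therefore be computed with shifts and adds. The sketch below shows it as plain matrix products for clarity; it is a generic illustration of the standard transform, not the optimized EPIC/VLIW implementation discussed above.

```python
# The H.264 4x4 forward integer transform as plain matrix products;
# in hardware every multiply is a shift or an add.

CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def matmul4(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose4(A):
    return [list(r) for r in zip(*A)]

def forward_transform(X):
    """Y = CF . X . CF^T for a 4x4 residual block X of integers."""
    return matmul4(matmul4(CF, X), transpose4(CF))
```

A quick sanity check of the transform: a constant block must produce energy only in the DC coefficient, which is the property that makes it useful for compression.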
We have implemented several image coding and processing algorithms on a massively parallel processor array (MPPA) by Ambric (www.ambric.com). Outstanding performance has been achieved by exploiting both data and functional parallelism across hundreds of processors. These results clearly outperform previous ones obtained on VLIW architectures; therefore, we are currently more interested in orienting our research towards MPPAs. Our first implementation of an ASIP for video and image applications covers entropy coding and decoding [17]. An architecture has been proposed that achieves better results than any other programmable architecture we are aware of. For motion estimation and other tasks, we are analyzing the possibilities of MPPAs and studying how to make a breakthrough by introducing new instructions [9].

(b) Design of functional units for general-purpose processors (USC). In binary floating-point arithmetic, we have focused on improving the rounding stage of multiplicative algorithms for division, square root and their reciprocals. We have developed several variations of the traditional rounding algorithm that avoid the calculation of the remainder in a large number of cases (see for example [22]), outperforming previous solutions in the literature. On the other hand, we have been working on error estimates for floating-point computations. It is well known that floating-point computations are error prone due to the accumulation of rounding errors. We have developed error estimates that allow an estimation of the accuracy of the computation [14]. Regarding the implementation of binary transcendental functions, we developed a new CORDIC algorithm with low latency for all the operating modes [2]. The resulting architecture can trade off latency for power by lowering the voltage, allowing a significant reduction of power consumption with the same latency as the previous fastest proposals. As part of this work, we developed VHDL synthesizable code for the unit.
This code is in the verification phase, in order to develop an IP core meeting the quality standards required by industry. In the topic of adder design, we have developed a framework that allows the specification of a broad family of adders and that will be the basis for a CAD tool for adder design. As part of this framework, we developed a simple method to extend any prefix adder with the Ling enhancement to reduce latency [28].
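The prefix-adder family covered by such a framework shares one algebraic core: the associative (generate, propagate) prefix operator. As a behavioral illustration only (not the authors' framework, and without the Ling enhancement), a Kogge-Stone prefix adder can be modeled in software:

```python
def kogge_stone_add(a, b, width=16):
    """Behavioral model of a Kogge-Stone parallel-prefix adder (mod 2^width)."""
    # Per-bit generate and propagate signals.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(width)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(width)]
    G, P = g[:], p[:]
    # log2(width) prefix levels; at each level, position i combines with
    # position i - d via (g1, p1) o (g2, p2) = (g1 | (p1 & g2), p1 & p2).
    # Iterating i downwards keeps the previous level's values intact.
    d = 1
    while d < width:
        for i in range(width - 1, d - 1, -1):
            G[i] = G[i] | (P[i] & G[i - d])
            P[i] = P[i] & P[i - d]
        d *= 2
    # The carry into bit i is the group generate of bits i-1..0.
    carries = [0] + G[:-1]
    s = 0
    for i in range(width):
        s |= (p[i] ^ carries[i]) << i
    return s
```

The same prefix operator, wired with different fan-out and depth trade-offs, yields the other adders of the family (Sklansky, Brent-Kung, Han-Carlson), which is what makes a single specification framework possible.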

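For reference, the classical rotation-mode CORDIC recurrence that such designs build on rotates a vector by decreasing elementary angles atan(2^-i), steering the residual angle toward zero. The plain software model below is an illustration of the baseline recurrence, not the low-latency algorithm of [2], which restructures these iterations; it converges for |theta| up to roughly 1.74 rad.

```python
import math

def cordic_sincos(theta, iterations=32):
    """Rotation-mode CORDIC returning (cos(theta), sin(theta))."""
    # Elementary angles atan(2^-i) and the aggregate scale factor K,
    # both precomputed constants in a hardware implementation.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    k = 1.0
    for i in range(iterations):
        k /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    # Starting at x = 1/K pre-compensates the rotation gain.
    x, y, z = k, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0
        x, y, z = (x - d * y * 2.0 ** -i,
                   y + d * x * 2.0 ** -i,
                   z - d * angles[i])
    return x, y
```

Each iteration uses only shifts and adds, which is why CORDIC remains attractive for hardware transcendental units; the latency cost of the long iteration chain is exactly what low-latency variants attack.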
Regarding the design of decimal floating-point units, we have developed several algorithms and architectures for addition, multiplication, division and transcendental functions, improving on previous implementations. Of special interest is a new decimal multiplier that allows a significant reduction in both latency and area compared to previous designs. This research attracted the attention of IBM and led to a 13-month joint project to transfer the technology to the IBM R&D Center in Böblingen (Germany). IBM is expected to incorporate the developed architectures in future generations of its Power and Z series processors [29].

Moreover, we have been working on the design of algorithms and units to achieve real-time visualization in various applications. Bézier surfaces are among the most useful primitives for high-quality modeling in CAD/CAM tools and graphics software. Traditionally, Bézier surfaces are tessellated on the CPU and the resulting set of triangles is sent to the GPU, so the CPU-GPU bus can become a bottleneck. We have made two proposals for synthesizing Bézier models directly on the GPU, one based on the use of a set of virtual vertices in the vertex shader and the other on the efficient exploitation of the geometry shader capabilities. On the other hand, we have developed a new scheme to join models for hybrid terrain representation, combining data with different topologies [6]. Besides, an architecture based on a local convexification algorithm for the hybrid representation of terrain was designed and implemented on a Virtex-II FPGA. We have also proposed a method based on the application of an enriched hierarchical radiosity algorithm that provides high-quality illumination for input scenes with low-resolution objects [18].
Our method produces high-quality images with a significant reduction in computational cost, and we have developed approaches to deal with its high memory requirements. Besides, we have implemented the Monte Carlo radiosity algorithm on a GPU using CUDA. Finally, we have implemented a shallow water simulation on a GPU using Brook+, based on computational kernel recognition techniques.
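GPU tessellation of Bézier surfaces ultimately reduces to evaluating patch points at sampled (u, v) parameter values; for a tensor-product patch this is repeated linear interpolation (de Casteljau's algorithm). The host-side sketch below shows that evaluation step for a 4x4 bicubic control grid; it is illustrative only, not the proposed vertex- or geometry-shader implementations.

```python
def de_casteljau(points, t):
    # Repeated linear interpolation at parameter t collapses the control
    # points of a Bezier curve down to the single point on the curve.
    pts = list(points)
    while len(pts) > 1:
        pts = [tuple((1 - t) * a + t * b for a, b in zip(p, q))
               for p, q in zip(pts, pts[1:])]
    return pts[0]

def bezier_patch_point(control_grid, u, v):
    # Tensor-product surface evaluation: evaluate each row of the 4x4
    # control grid at u, then evaluate the resulting column curve at v.
    row_points = [de_casteljau(row, u) for row in control_grid]
    return de_casteljau(row_points, v)
```

A tessellator simply runs this evaluation over a grid of (u, v) samples and connects the resulting points into triangles; doing it on the GPU avoids shipping those triangles across the CPU-GPU bus.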

3 Achievement indicators

As stated in the previous section, the USC group has participated in objectives 1(a), 1(b), 2(c), 3(a) and 3(b), and the UDC group has participated in objectives 1(a), 1(b), 2(a), 2(b) and 3(a). The main results obtained as part of the project have been summarized in Section 2. It has to be pointed out that the level of achievement of the objectives included in the project proposal is very high. All the objectives have been addressed during the execution of the project and interesting results have been obtained. Note that both groups had extensive experience in the topics of the project and the objectives were a natural evolution of the research being developed in the groups, which obviously facilitated achieving all of them. The achievement indicators are summarized in this section. Note that, for space reasons, the complete list of publications and PhD theses is not included; only the most representative ones are referenced.

3.1 Subproject 1. USC group

Several of the indicators summarized here can be found on the web site of the group (www.ac.usc.es).

• Level of achievement of the objectives. Very high. As shown in the previous section, the objectives tackled by the USC group have been achieved.

• Relevance and originality of the results. The relevance and originality of the results have been outlined in the previous section.

• Publications. During the development of the project, 2007-2010, we have published 24 papers in journals and 70 conference papers, and we have edited 2 books.

• Usefulness of the results and relationship with the social environment. Our experience in performance analysis is going to be used in a new project, funded by the Ministry of Industry of Spain (Avanza program) and titled A new remote render, in collaboration with several local companies in the multimedia area. On the other hand, we are participating in a research project, funded by HP Labs, in collaboration with CESGA and the Computer Architecture Group at the University of A Coruña; the objective of this project is to adapt previous results obtained in our group to the HP NUMA systems. Finally, our experience in multimedia, particularly in video compression and computer graphics, has allowed us to participate in the development of the Center for Experimentation and Production of Digital Contents of the University of Santiago de Compostela, funded by the Ministry of Industry (Avanza program), in collaboration with several groups of the University of Santiago de Compostela and local companies.

• Training capacity of the group. The CAG-USC offers, as part of the Dep. of Electronics and Computer Science of the University of Santiago de Compostela and in conjunction with the UDC group, the PhD Program Interuniversity Program in Information Technologies, which includes a specialization in HPC technologies. The Program has been awarded by the Ministry of Education (since its creation in 2004) the Quality Mention (Mención de Calidad, ref. MCD2004-00378) for its academic excellence, which has provided additional MEC funds for invited professors and student mobility. This Program will be replaced in Sep. 2010 by the Master in High Performance Computing (currently in the verification process), which will be fully focused on HPC and will use the supercomputing infrastructure provided by the Galicia Supercomputing Center (CESGA). There were five PhD students included in the project proposal. Three of those students, Natalia Seoane, Manuel Aldegunde and Daniel Piso, have presented their PhD dissertations during the development of the project, and the other two are finishing theirs and will present them soon. Moreover, as part of the funding provided for the development of the project, three new students have been incorporated into the research team, one through a research contract and two through grants. These students are still working on their PhDs, with presentations expected in one or two years. Furthermore, during the development of the project the group has incorporated other PhD students working in related areas, funded by other research projects and contracts, and some of their PhD dissertations have been finished as well. In summary, since the starting date of the project six PhD dissertations have been presented and currently there are 11 PhD students working on topics related to the project.

• Collaboration with other groups. Our group maintains collaborations with several groups in Spain and abroad: the Institute of Computer Science at the Academy of Sciences of the Czech Republic, the University of California at Irvine, the Politecnico di Torino, the University of Málaga, HP Labs, IBM and the Centro de Supercomputación de Galicia (CESGA). We are participating with several local companies in a research project funded by the Ministry of Industry of Spain (Avanza program), and we participate in several research networks, for example HiPEAC (funded by the EU), CAPAP-H (Ministry of Education of Spain), the Spanish network for e-science (Ministry of Education of Spain), Middleware Grid (Ministry of Education of Spain), the GIS network (Office of R&D of Galicia), Mathematica Consulting and Computing (Office of R&D of Galicia), the Galician Network on High Performance Computing (GHPC, Office of R&D of Galicia) and the Galician Network on Parallel and Grid Technologies (RedeGRID, Office of R&D of Galicia).

3.2 Subproject 2. UDC group

• Level of achievement of the objectives. Very high. As shown in the previous section, the objectives tackled by the UDC group have been achieved.

• Relevance and originality of the results. The relevance and originality of the results have been outlined in the previous section.

• Publications. During the development of the project, 2007-2010, we have published 16 papers in international journals, 62 international conference papers, and edited 4 special issues of international journals.

• Usefulness of the results and relationship with the social environment. The research line on programmability of HPC systems using PGAS approaches has led to a contract funded by Hewlett-Packard, currently under development with the participation of the GAC-USC and CESGA. Also, the UPC community has shown significant interest in the UPC collectives microbenchmark suite, which has motivated its upcoming public release in collaboration with HP Labs at Marlboro (USA). Regarding the research line on fault tolerance of high-performance applications, CPPC performance has been evaluated on a public supercomputing infrastructure hosted at CESGA, using real applications in production at the center. There is increasing interest from CESGA in fault tolerance tools in order to maximize the profit from the available resources. CPPC is the only publicly available portable checkpointer for message-passing applications; it is an open-source project, available at http://cppc.des.udc.es under the GPL license. Finally, the improvements in Java communications have also attracted the interest of companies (in particular in the electronic trading sector, Comunytek Consultores) in the outcomes of our research.

• Training capacity of the group. The CAG-UDC offers, in conjunction with the Dep. of Electronics and Computer Science of the University of Santiago de Compostela, the PhD Program Interuniversity Program in Information Technologies, which includes a specialization in HPC technologies. The Program has been awarded by the Ministry of Education (since its creation in 2004) the Quality Mention (Mención de Calidad, ref. MCD2004-00378) for its academic excellence, which has provided additional MEC funds for invited professors, student mobility, and funding of members of the examination committees of European PhDs (Mención de Doctor Europeo). This Program will be replaced in Sep. 2010 by the Master in High Performance Computing (currently in the verification process), which will be fully focused on HPC and will use the supercomputing infrastructure provided by the Galicia Supercomputing Center (CESGA). Three PhD theses have been developed during this project, authored by Gabriel Rodríguez, Guillermo López (European PhD) and Iván Díaz (to be presented in early 2010). Currently, five members of the group hold a predoctoral fellowship: 2 FPIs granted through this project (R. Concheiro and D. Rolán), 2 FPUs (J. González and J. Andión) obtained in the last call of 2008, and 1 predoctoral fellowship from the Galician Government (C. Teijeiro).

• Collaboration with other groups. The Computer Architecture Group (CAG-UDC) has established the following collaborations during this project: Polaris Group, Dep. of Computer Science, Univ. of Illinois at Urbana-Champaign (USA); IBM T.J. Watson Research Center, New York (USA); Hewlett-Packard Labs, Marlboro (USA); Spiral Group, Dep. of Electrical and Computer Engineering, Carnegie Mellon University (USA); Dep. of Computer Science, Univ. of Texas at Austin (USA); WSI/GRIS Group, Dep. of Graphical-Interactive Systems, University of Tübingen (Germany); Computer Graphics Systems Group, Hasso Plattner Institut, University of Potsdam (Germany); Computer Architecture Group, School of Computer Science, Chemnitz Technical University (Germany); Centre for Advanced Computing and Emerging Technologies (ACET), University of Reading (UK); Dep. of Computing, University of Portsmouth (UK); Compiler and Architecture Design Group, Dep. of Computer Science, University of Edinburgh (UK); Madynes Group, INRIA-LORIA (France); OASIS Group, INRIA at Sophia Antipolis (France); Graphics Group, Dep. of Computer Science, Lund University (Sweden); Institute of Computing Technology, Chinese Academy of Sciences (China); Alchemy Group, INRIA Futurs (France).
The CAG-UDC has also participated actively in several HPC-related research networks, 3 of them EU-funded: HiPEAC (High-Performance Embedded Architectures and Compilers Network of Excellence), HiPEAC-2, and ComplexHPC (Open European Network for High Performance Computing on Complex Environments). The CAG has also led the Galician Network on High Performance Computing (http://ghpc.udc.es), funded by the Galician Government with 180,000 Euros for 2007-2009, with the aim of promoting and coordinating HPC R&D initiatives and collaborations in Galicia. This network is composed of 12 interdisciplinary research groups.

3.3 Coordination, development of the project

As stated in previous sections, the project proposal comprised three subprojects and research groups: USC, UDC and CESGA; however, the project was finally approved with only two of the three subprojects, USC and UDC. To coordinate the two groups we have used the same policy that provided very satisfactory results in previous projects developed by both groups. Periodic meetings at several levels have been used to coordinate the research groups. The two principal researchers of the two groups have kept almost permanent contact to analyze the evolution of every topic from a global perspective. On the other hand, the researchers responsible for every topic met to determine the best way to deal with the objectives of the project. Finally, the researchers involved in every task met frequently.

Finally, we would like to point out that a six-month extension of the project has been approved recently, which will allow us to fully complete almost every objective of the project.

References

[1] D. Andrade, B.B. Fraguela, R. Doallo. Static Prediction of Worst-case Data Cache Performance in the Absence of Base Address Information. Proc. 15th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS'09), pp. 45-54, April 2009.
[2] E. Antelo, J. Villalba and E.L. Zapata. A Low-Latency Pipelined 2D and 3D CORDIC Processors. IEEE Transactions on Computers, vol. 57, no. 3, pp. 404-417, March 2008.
[3] M. Arenaz, P. Amoedo, J. Touriño. Efficiently Building the Gated Single Assignment Form in Codes with Pointers in Modern Optimizing Compilers. Proc. 14th International Euro-Par Conference on Parallel Processing, Euro-Par 2008, pp. 360-369, 2008.
[4] M. Arenaz, J. Touriño, R. Doallo. XARK: An extensible framework for automatic recognition of computational kernels. ACM Trans. Program. Lang. Syst., 30(6):1-56, 2008.
[5] I. Díaz, G. Fernández, M.J. Martín, P. González, J. Touriño. Integrating the Common Information Model with MDS4. 9th IEEE/ACM International Conference on Grid Computing, Grid 2008, pp. 298-303, Tsukuba, Japan, September 2008.
[6] M. Bóo and M. Amor. Dynamic Hybrid Terrain Representation Based on Convexity Limits Identification. International Journal of Geographical Information Science, 23(4):417-439, 2009.
[7] D. Díaz, X.C. Pardo, M.J. Martín, P. González. Application-Level Fault-Tolerance Solutions for Grid Computing. 8th IEEE International Symposium on Cluster Computing and the Grid, CCGRID 2008, IEEE Computer Society, pp. 554-559, Lyon, France, May 2008.
[8] I. Díaz, C. Popi, O. Festor, J. Touriño, R. Doallo. Ontological Configuration Management for Wireless Mesh Routers. 9th International Workshop on IP Operations and Management, IPOM 2009, Lecture Notes in Computer Science, vol. 5843, pp. 116-129, Venice, Italy, October 2009.
[9] C. Díaz Resco, R.R. Osorio and J.D. Bruguera. High Performance Image Processing on a Processor. Proc. 12th Euromicro Conference on Digital System Design (DSD'2009), Patras (Greece), pp. 233-236, 2009.
[10] B.B. Fraguela, Y. Voronenko, M. Püschel. Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling. 18th Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT'09), pp. 271-280, September 2009.
[11] J. González-Domínguez, M.J. Martín, G.L. Taboada, J. Touriño, R. Doallo, A. Gómez. A Parallel Numerical Library for UPC. 15th International European Conference on Parallel and Distributed Computing, Euro-Par 2009, pp. 630-641, Delft, The Netherlands, 2009.
[12] J. González-Domínguez, G.L. Taboada, B.B. Fraguela, M.J. Martín, J. Touriño. Servet: A Benchmark Suite for Autotuning on Multicore Clusters. 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010, Atlanta, Georgia, USA, 2010 (accepted).
[13] J. Guo, G. Bikshandi, B.B. Fraguela, D. Padua. Writing productive stencil codes with overlapped tiling. Concurrency and Computation: Practice and Experience, 21(1):25-39, January 2009.
[14] T. Lang and J.D. Bruguera. A Hardware Error Estimate for Floating Point Computations. Proc. SPIE Conference on Advanced Signal Processing Algorithms, Architectures and Implementations XVIII, San Diego, USA, pp. 70740N1-70740N11, 2008.
[15] D.R. Martínez, V. Blanco, T.F. Pena, J.C. Cabaleiro, F.F. Rivera. Performance Modeling of MPI Applications using Model Selection Techniques. PDP 2010 - The 18th Euromicro International Conference on Parallel, Distributed and Network-Based Computing, 2010 (accepted).
[16] D.R. Martínez, J.C. Cabaleiro, T.F. Pena, F.F. Rivera, V. Blanco. Accurate Analytical Performance Model of Communications in MPI Applications. 8th International Workshop on Performance Modeling, Evaluation, and Optimization of Ubiquitous Computing and Networked Systems (PMEO-UCNS'2009), 23rd IEEE International Parallel and Distributed Processing Symposium, Rome, 2009.
[17] R.R. Osorio and J.D. Bruguera. An FPGA Architecture for CABAC Decoding in Many-core Systems. IEEE 19th International Conference on Application-specific Systems, Architectures and Processors (ASAP 2008), Leuven (Belgium), pp. 293-298, 2008.
[18] E.J. Padrón, M. Amor, M. Bóo and R. Doallo. Hierarchical Radiosity for Multiresolution Systems Based on Normal Tests. Computer Journal (accepted).
[19] J.C. Pichel, D.B. Heras, J.C. Cabaleiro, F.F. Rivera. Increasing data reuse of sparse algebra codes on simultaneous multithreading architectures. Concurrency and Computation: Practice and Experience, 21(15):1838-1856, October 2009.
[20] J.C. Pichel, D.B. Heras, J.C. Cabaleiro, A.J. García-Loureiro, F.F. Rivera. Increasing the locality of iterative methods and its application to the simulation of semiconductor devices. International Journal of High Performance Computing, 2009 (accepted).
[21] J.C. Pichel, J.A. Lorenzo, D.B. Heras, J.C. Cabaleiro, T.F. Pena. Analyzing the Execution of Sparse Matrix-Vector Product on a SMP-NUMA System. Journal of Supercomputing, 2010 (accepted). J.A. Lorenzo, J.C. Pichel, D. LaFrance-Linden, F.F. Rivera, D.E. Singh. Lessons Learnt Porting Parallelisation Techniques for Irregular Codes to NUMA Systems. PDP 2010, 2010 (accepted).
[22] D. Piso and J.D. Bruguera. Variable Latency Goldschmidt Algorithm based on a New Rounding Method and Remainder Estimate. IEEE Transactions on Computers (submitted).
[23] G. Rodríguez, M.J. Martín, P. González and J. Touriño. Controller/Precompiler for portable checkpointing. IEICE Transactions on Information and Systems, E89-D(2), pp. 408-417, 2006.
[24] G. Rodríguez, M.J. Martín, P. González, J. Touriño. A heuristic approach for the automatic insertion of checkpoints in message-passing codes. Journal of Universal Computer Science, 15(14):2894-2911, 2009.
[25] G. Rodríguez, M.J. Martín, P. González, J. Touriño, R. Doallo. CPPC: a compiler-assisted tool for portable checkpointing of message-passing applications. Concurrency and Computation: Practice & Experience (accepted). DOI: 10.1002/cpe.1541.
[26] D. Rolán, B.B. Fraguela, R. Doallo. Adaptive Line Placement with the Set Balancing Cache. Proc. 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42), pp. 529-540, December 2009.
[27] J. Salceda, I. Díaz, J. Touriño, R. Doallo. A Middleware Architecture for Distributed Systems Management. Journal of Parallel and Distributed Computing, 64(6):759-766, June 2004.
[28] A. Vázquez and E. Antelo. New Insights on Ling Adders. 19th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP 2008), Leuven, Belgium, pp. 233-238, 2008.
[29] A. Vázquez, E. Antelo and P. Montuschi. Improved Design of High-Performance Parallel Decimal Multipliers. IEEE Transactions on Computers, 2009 (accepted).