High Performance Computing on an IBM Cell Processor

Bioinformatics

May08-24

Final Report

Client:

Iowa State University

Department of Electrical and Computer Engineering

Faculty Advisor:

Dr. Zhao Zhang

Team Members:

Kyle Byerly

Matt Rohlf

Shannon McCormick

Bryan Venteicher

Submitted: May 5, 2008

DISCLAIMER: This document was developed as a part of the requirements of an electrical and computer engineering course at Iowa State University, Ames, Iowa. This document does not constitute a professional engineering design or a professional land surveying document. Although the information is intended to be accurate, the associated students, faculty, and Iowa State University make no claims, promises, or guarantees about the accuracy, completeness, quality, or adequacy of the information. The user of this document shall ensure that any such use does not violate any laws with regard to professional licensing and certification requirements. This use includes any work resulting from this student-prepared document that is required to be under the responsible charge of a licensed engineer or surveyor. This document is copyrighted by the students who produced this document and the associated faculty advisors. No part may be reproduced without the written permission of the senior design course coordinator

Table of Contents

1. Requirement Specifications

1.1 Problem/Need Statement

1.2 Proposed Solution

1.3 Concept Sketch

1.4 System Description

1.5 Operating Environment

1.6 User Interface Description

1.7 Functional Requirements

1.8 Non-functional Requirements

1.9 Market/Literature Survey

1.10 Deliverables

2 Project Plan

2.1 Work Breakdown Structure

2.2 Resources

2.2.1 Organizational Chart

2.2.2 Personnel

2.2.3 Materials

2.2.4 Financial

2.3 Project Schedule

2.3.1 Project Gantt Chart

2.3.2 Deliverable Schedule Gantt Chart

3.0 Project Design

3.1 Design Method and Engineering Specifications

3.1.1 Design Method

3.1.2 Input Specification

3.1.3 Output Specification

3.1.4 User Interface Specification

3.1.5 Hardware Specification

3.1.6 Software Specification

3.1.7 Test Specification

4.0 Implementation

4.1 Ported Program Selection

4.2 Explanation of DNAPenny

4.3 Previous Parallelization of DNAPenny

4.4 Cell/B.E. Parallelization Models

4.5 Libspe2

4.6 ClustalW Cell/B.E. Prototype

5 Testing

7.0 Other stuff

8.0 References


List of Figures

Figure 1: System Block Diagram

Figure 2: Work Breakdown Structure

Figure 3: Organization Chart

Figure 4: Project Schedule Gantt Chart

Figure 5: Project Deliverables Gantt Chart

Figure 6 – infile.orig

Figure 7 - in_b17x1237.txt

Figure 8 - in_b12x1236.txt

Figure 9 - in_b10x1219.txt

Figure 10 - in_a18x822.txt

Figure 11 - in_a15x822.txt

List of Tables

Table 1: Personnel Resources

Table 2: Hours Fall 2007 semester, excluding first three weeks

Table 3: Materials Resources

Table 4: Financial Resources

Table 5 - infile.orig

Table 6 - in_b17x1237.txt

Table 7 - in_b12x1236.txt

Table 8 - in_b10x1219.txt

Table 9 - in_a18x822.txt

Table 10 - in_a15x822.txt

Table 11: foo

List of Definitions

BioPerf: Suite of bioinformatics applications packaged as a benchmark

Cell/B.E.: Cell Broadband Engine; advanced microprocessor architecture designed by Sony, Toshiba, and IBM

EIB: Element Interconnect Bus; a high speed bus in the Cell/B.E. connecting the PPE and the SPEs

GProf: The GNU Profiler; a tool that profiles applications

HPC: High performance computing; comprises parallel applications on supercomputers or computer clusters

Linux: Free open-source operating system modeled after Unix

PowerPC: A family of RISC processors designed by Apple, IBM, and Motorola

PPE: Power Processor Element; PowerPC processor core with some extensions

RAM: Random access memory; most common type of computer memory which is used by applications to perform essential tasks

Sony PlayStation 3: Video game console released in fall 2006

SPE: Synergistic Processing Element; specialized vector processor found in the Cell/B.E.

1. Requirement Specifications

This section contains the problem/need statement, concept sketch, system block diagram, system description, operating environment, user interface description, functional requirements, non-functional requirements, market/literature survey, and the deliverables.

1.1  Problem/Need Statement

The advent of high-performance supercomputers in the 1980s allowed researchers in biology, physics, engineering, and mathematics to tackle increasingly complex problems. However, until recently, the computer systems needed for such research were extremely expensive and remained out of reach of all but the best-funded researchers and institutions. The significant technological advancements of the last decade have put high performance computing within the reach of even modest budgets.

Biological researchers face ever-increasing computation times because the volume of data to be processed is growing exponentially. Commodity computing hardware currently cannot provide adequate performance. However, the team believes that the Cell Broadband Engine (Cell/B.E.) found in the Sony PlayStation 3 (PS3) can achieve superior performance at a similar cost.

Constraints are not a major factor for this project. The project requires no budget because the client provides the PlayStation 3 and the software to be ported is open source. The constraints considered are listed below.

·  Only one PS3 is available for the group to use.

·  The simulator provided is not useful for speedup comparisons.

·  Only one book on bioinformatics algorithms is available.

The team will take the following approach to port the application to the Cell/B.E.

1.1.1  Familiarization with Programming on the Cell/B.E.

Learning how to program on the Cell/B.E. is the first step in completing this project. This includes learning how to use the SPEs and how the PPE and the SPEs coordinate with one another. The group will complete some of the labs provided on an M.I.T. website, which will further aid in learning how to program the Cell/B.E.

1.1.2  Determining Which Application to Port to the Cell/B.E.

An application must be chosen to port to the Cell/B.E. The group will research what porting work has already been done to ensure that it does not duplicate a project that others in the field have already started or finished.

1.1.3  Familiarization with the Application and Algorithms

In order to port the application to the Cell/B.E., the group must first understand the original application. This includes reading the source code and learning about the underlying algorithms implemented in the application. A book on algorithms provided by the faculty advisor will aid in learning these algorithms and will be studied by all members of the group.

1.1.4  Porting the Application to the Cell/B.E.

This is the actual task of modifying and re-compiling the application to run on the Cell/B.E.

1.2  Proposed Solution

The team believes the Cell/B.E. found in the PlayStation 3 can provide exceptional performance compared to traditional computers at a similar price point. The team will port an application from the BioPerf benchmark suite to the Cell/B.E.

1.3  Concept Sketch

The team is porting a bioinformatics application from the BioPerf benchmark suite to the Cell/B.E. processor. The Cell/B.E. processor is well suited to providing high performance in this computationally intensive application. The team is not creating anything new; instead, the team is making an existing application perform better.

1.4  System Description

The Cell/B.E. consists of a PPE and up to eight SPEs. The PPE is responsible for running the operating system and coordinating the SPEs. Each SPE is an independent vector processor capable of performing four operations at once (for example, on four 32-bit values packed into one 128-bit register). The PPE and SPEs are connected by a high-speed bus called the Element Interconnect Bus (EIB).
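
The division of labor described above can be pictured with a toy model. The following Python sketch is not Cell/B.E. code and does not use the Cell SDK; it only imitates the roles, with the PPE dealing out work and each SPE handling four elements per vector step. The worker count of six, the function names, and the scaling workload are all assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

LANES = 4     # four elements per vector step, as described in the text
NUM_SPES = 6  # assumed worker count; the text only says "up to eight"


def split_for_spes(data: Sequence[float], num_spes: int = NUM_SPES) -> List[List[float]]:
    """PPE role: deal the input out in contiguous, near-equal chunks."""
    base, extra = divmod(len(data), num_spes)
    chunks, start = [], 0
    for i in range(num_spes):
        size = base + (1 if i < extra else 0)
        chunks.append(list(data[start:start + size]))
        start += size
    return chunks


def spe_scale(chunk: Sequence[float], factor: float, lanes: int = LANES) -> Tuple[List[float], int]:
    """SPE role: handle `lanes` elements per step; return results and step count."""
    out: List[float] = []
    steps = 0
    for i in range(0, len(chunk), lanes):
        out.extend(x * factor for x in chunk[i:i + lanes])
        steps += 1
    return out, steps


def run(data: Sequence[float], factor: float) -> Tuple[List[float], int]:
    """Combine per-SPE results in order; report the longest per-SPE step count."""
    results: List[float] = []
    max_steps = 0
    for chunk in split_for_spes(data):
        out, steps = spe_scale(chunk, factor)
        results.extend(out)
        max_steps = max(max_steps, steps)
    return results, max_steps


if __name__ == "__main__":
    values = [float(i) for i in range(48)]
    scaled, steps = run(values, 2.0)
    print(f"{len(scaled)} results, {steps} vector steps on the busiest worker")
```

In this model the elapsed work is set by the busiest worker, which is why an even split across the SPEs matters as much as the four-wide vector step.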

The system block diagram below shows the same data being input into the same application. The application on the top is not being run on a PlayStation 3, while the application on the bottom has been ported to take advantage of the Cell/B.E. found in the PlayStation 3.

Figure 1: System Block Diagram

1.5  Operating Environment

The ported application will execute on the Linux operating system running on the Sony PlayStation 3. The PlayStation 3 will be stored in a dry and temperature controlled environment.

1.6  User Interface Description

The existing applications are all command-line based. The team has no reason to change the interfaces.

1.7  Functional Requirements

Below is a list of functional requirements for the ported application.

·  The ported application shall run on the Cell/B.E.

·  Ported application shall return the same results as the original application.

·  Ported application shall return its running time for comparison to original application.

1.8  Non-functional Requirements

Below is a list of non-functional requirements for the ported application.

·  The ported application shall run faster with the SPEs than without.

·  The user interface will not be altered.
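
The functional and non-functional requirements above could be checked with a small external harness. The following Python sketch is one possible approach, not the team's actual test procedure: it runs two command lines, compares their standard output, and reports wall-clock times and the resulting speedup. The function names are hypothetical, and the commands in the demo are stand-ins for the original and ported builds.

```python
import subprocess
import sys
import time
from typing import Dict, List, Tuple, Union


def run_timed(cmd: List[str]) -> Tuple[str, float]:
    """Run a command, returning its standard output and wall-clock seconds."""
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    elapsed = time.perf_counter() - start
    return proc.stdout, elapsed


def compare(original_cmd: List[str], ported_cmd: List[str]) -> Dict[str, Union[bool, float]]:
    """Check that both builds print the same output and compute the speedup."""
    out_original, t_original = run_timed(original_cmd)
    out_ported, t_ported = run_timed(ported_cmd)
    return {
        "same_output": out_original == out_ported,
        "original_s": t_original,
        "ported_s": t_ported,
        "speedup": t_original / t_ported,  # > 1.0 means the port is faster
    }


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere; replace them with the
    # real original and ported executables and their input files.
    original = [sys.executable, "-c", "print('result')"]
    ported = [sys.executable, "-c", "print('result')"]
    print(compare(original, ported))
```

Comparing byte-for-byte output is the strictest reading of "the same results"; a single timed run is also noisy, so repeated runs would give a fairer comparison.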

1.9  Market/Literature Survey

There are no paying customers for the team’s deliverables. However, researchers would be interested in using the ported application to reduce the time they spend computing results.

A group of researchers has previously ported parts of the BioPerf suite to the Cell/B.E. [Sachdeva]. The team’s work will be similar to that of Sachdeva. However, the team will port an application that was not ported by Sachdeva.

1.10  Deliverables

The team will deliver the source code of the ported application and benchmarks comparing the ported and un-ported application. Also, the team will deliver the project plan, design document, poster, website, and presentations.

2  Project Plan

This section contains the work breakdown, resource requirements, and the project schedule.

2.1  Work Breakdown Structure

This section contains the work breakdown structure.

Figure 2: Work Breakdown Structure

Task No. 1 – Problem Definition

Task Objective: To determine the scope of the project considered, and to decide what is to be done in terms of speed comparisons and benchmarking.

Task Approach: Meeting with the client and faculty advisor. Researching the BioPerf website and technical websites to determine what work is already in progress/completed in porting the application to the Cell/B.E.

Task Expected Results: Which application to port and which other processors to compare run-times with.

Subtask 1a – Researching the Programming of the Cell/B.E.

Subtask Objective: The team needs to become familiar with programming on the Cell/B.E., the libraries available for the Cell/B.E., and how the SPEs can be used in a useful and efficient way.

Subtask Approach: Studying the M.I.T. lecture slides provided by the faculty advisor and completing various labs offered on the M.I.T. lecture slides. Researching other technical sites which focus on programming on the Cell/B.E.

Subtask Expected Results: A better understanding of how to program the Cell/B.E., as well as how it works and how to use the SPEs in a useful way.

Subtask 1b – Research of BioPerf Suite

Subtask Objective: To become familiar with the BioPerf suite of applications and to aid in the decision of what application to port.

Subtask Approach: Reading documentation provided with the applications in the BioPerf suite, as well as determining which applications have already been ported or are in the process of being ported to the Cell/B.E.

Subtask Expected Results: Aid in the decision of which application to port to the Cell/B.E. Understanding the application that is to be ported and how the Cell/B.E. may be able to reduce the run-time of the application with proper use of the SPEs in the Cell/B.E.

Subtask 1c – Research Parallel Algorithms

Subtask Objective: Gain a better understanding of the underlying algorithms utilized in bioinformatics applications.

Subtask Approach: Read and understand the algorithms book provided by the faculty advisor.

Subtask Expected Results: An understanding of the algorithms present in bioinformatics applications sufficient to allow the team to port one of the applications to the Cell/B.E.

Subtask 1d – Research Existing Results

Subtask Objective: To identify applications already ported to the Cell/B.E. so redundant work is avoided.

Subtask Approach: Studying technical papers on the Cell/B.E. in order to identify any porting already in progress.

Subtask Expected Results: Knowledge of previous work done in porting the applications to the Cell/B.E. which will aid in the decision of which application the group will port.

Task 2 – Technology and Implementation Considerations and Selection

Task Objective: Find the software that is best suited for the project.

Task Approach: This will be broken up into two main parts: finding the Linux distribution best suited to the team’s task, and determining which bioinformatics application to port.

Task Expected Results: A pairing of Linux distribution and application to port.

Subtask 2a – Decide Which Linux to Install

Subtask Objective: Find the distribution of Linux that is most suitable to install.

Subtask Approach: Evaluate different distributions, such as Ubuntu, Red Hat Enterprise Linux, Fedora Core, and Yellow Dog Linux, for their suitability for the task. Considerations include ease of installation, the team’s prior knowledge of each distribution, and, importantly, which distributions have the greatest support for the bioinformatics software the team will be optimizing for the Cell/B.E.

Subtask Expected Results: A distribution of Linux chosen that best meets the team’s needs.

Subtask 2b – Decide Which Specific Application to Port

Subtask Objective: Using its prior research of the BioPerf suite, the team will evaluate each application in the suite for its suitability for porting.

Subtask Approach: The team will find applications that have many elements that the team feels would be efficient to parallelize on the Cell/B.E., and that would be possible for a team of four to accomplish. The team will find these by analyzing the code base of the individual applications, as well as finding algorithms that are parallelizable or uniquely suited for the Cell/B.E.

Subtask Expected Results: An application hand selected for porting.
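
One way to weigh candidate applications is to estimate how much each could gain from parallelization. The following Python sketch uses Amdahl's law, which the report itself does not cite, to turn an assumed parallelizable fraction of run time (for example, one read from a profiler) into an upper bound on speedup. The application names and fractions are placeholders, not measurements.

```python
from typing import Dict, List, Tuple


def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the run time is parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def rank_candidates(fractions: Dict[str, float], workers: int) -> List[Tuple[str, float]]:
    """Sort candidates by estimated speedup, best first."""
    estimates = {name: amdahl_speedup(p, workers) for name, p in fractions.items()}
    return sorted(estimates.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder fractions for two hypothetical candidates, not real profiles.
    candidates = {"app_a": 0.50, "app_b": 0.90}
    for name, estimate in rank_candidates(candidates, workers=6):
        print(f"{name}: at most {estimate:.2f}x")
```

The estimate ignores data-transfer and coordination costs, so it is best read as a ceiling that helps rule candidates out rather than a prediction of the ported run time.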

Task 3 – End-Product Design

Task Objective: To produce a design for the end-product.

Task Approach: The end-product design process will contain three parts: the design requirements, design process, and design documentation.

Task Expected Results: The team will have a design for the end-product.

Subtask 3a – Design Requirements