
USING CELL B.E. AND SYSTEM-Z TO ACCELERATE

AN IMAGE STITCHING APPLICATION

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science

By

Sree Vardhani Malladi

Jawaharlal Technological University

Bachelor of Technology in Information Technology, 2007

August 2009 University of Arkansas

ABSTRACT

Image stitching is application software that combines multiple images with overlapping fields of view to produce a high-resolution composite image. This thesis examines the performance of a parallelizable portion of an image stitching algorithm when it is executed on the Cell Broadband Engine, a multicore chip that has been architected for intensive gaming and high performance computing applications. Performance is compared for the same algorithm running on a PC, an IBM System-z mainframe, an IBM QS21 Cell BE using just the main processor, and a QS21 Cell BE using the main and 16 satellite processors running in parallel.

This thesis is approved for recommendation to the Graduate Council.

Thesis Director:

______Dr. Craig W. Thompson

Thesis Committee:

______Dr. Amy Apon

______Dr. Gordon Beavers

______Dr. Jackson Cothren

THESIS DUPLICATION RELEASE

I hereby authorize the University of Arkansas Libraries to duplicate this thesis when needed for research and/or scholarship.

Agreed ______Sree Vardhani Malladi

Refused ______

ACKNOWLEDGEMENTS

I thank my thesis advisor Dr. Craig Thompson; my thesis committee, Dr. Amy Apon, Dr. Gordon Beavers, and Dr. Jackson Cothren; Dr. David Douglas; and my project mates Hung Bui and Wesley Emeneker for all their help, guidance, encouragement and support in completing the project.

I also thank my father, mother, family and friends for all their support and encouragement during this project.

Finally, I thank IBM for supporting this research project and especially Hema Reddy from the IBM Systems and Technologies University Alliances team for her guidance.

TABLE OF CONTENTS

1. Introduction
1.1 Context
1.2 Problem
1.3 Thesis Statement
1.4 Approach
1.5 Organization of this Thesis
2. Background
2.1 Image Stitching Algorithm
2.1.1 Image Processing to Find Key Points
2.1.2 Image Matching
2.1.3 Image Orientation
2.1.4 Rebuild Image
2.2 System-z and DB2
2.3 Cell BE
2.3.1 Power Processing Element (PPE)
2.3.2 Synergistic Processing Elements (SPE)
2.3.3 QS21 and QS22
2.3.4 Programming the Cell BE
2.3.5 Performance Optimizations of the QS21
3. Architecture
3.1 High Level Design
3.2 Design
3.3 Implementation
4. Methodology, Results and Analysis
4.1 Methodology
4.2 Results
4.2.1 Timing the Overall Application
4.2.2 Timing the Key Point Detection
4.2.3 Timing Results in more detail
4.2.4 Timing the Communication
4.3 Analysis
4.4 Comparison with Recent Results
5. Conclusions
5.1 Summary
5.2 Contributions
5.3 Future Work
References

LIST OF FIGURES

Figure 1: Stitched Image
Figure 2: High Level Block Diagram of Cell Broadband Engine
Figure 3: Block Diagram of PPE
Figure 4: Block Diagram of SPE
Figure 5: High Level Block Diagram of IBM QS21
Figure 6: Architecture Design
Figure 7: Image Extraction Runtime
Figure 8: Key Point Detection Run Time
Figure 9: Communication Time from Z to Cell
Figure 10: Communication Time from Cell to Z


1. INTRODUCTION

1.1 Context

Images often overlap. Image registration involves mapping and matching image features to a common scale and coordinate system so the images can be overlaid to form a larger image or even a model. When there are many images in an image set, one can refer to the process of registering them all into one large mosaic image as image stitching.

Image stitching is important in military and industrial applications. These applications include urban modeling, precision agriculture, disaster management, and medical imaging. In all these applications, we take many images of a location, then match the features and line them up to build an image collage. The consumer market is beginning to explore applications of image stitching – for instance, composing many pictures of the Taj Mahal taken from different points of view into a composite panorama.

Image stitching involves first identifying interesting points (key features) in each image along with their descriptors, then matching key points across images that may overlap, then transforming the images’ scale and coordinate system to create the composite image.

Below is an example of a Stitched Image.


Figure 1 : Stitched Image

1.2 Problem

Image stitching is often a manual process in which humans identify key registration features, and then a computer is used to transform the images’ scale and coordinate systems. A small number of automated image stitching algorithms have been developed by research groups including Google and the University of Arkansas.

The process of matching many images is computationally expensive. Image sets often range from hundreds to thousands of images and may require a database for storing and retrieving the images. Many data sets, like Google Earth, are not refreshed often because of expense, but they could be if the process of image stitching were automated and fast. If a parallel model can accelerate image stitching, then running this application on large image databases can be accomplished more quickly.

1.3 Thesis Statement

The objective of this thesis is to determine how performance changes when an existing single-threaded (non-parallel) implementation of an image stitching algorithm is modified and instrumented so that one of the computationally intensive portions of the algorithm can execute in an integrated environment consisting of an IBM System-z mainframe communicating with a Cell Broadband Engine multicore accelerator. The IBM mainframe is used to store a database of images and key point files, and the Cell BE uses its multicore architecture to run portions of the image stitching algorithm in parallel.

1.4 Approach

Our approach can be seen as a sequence of moves. In this project, we started with an existing image stitching algorithm that runs on PCs; that implementation does not take advantage of parallelism. To take advantage of a mainframe DBMS (IBM DB2) on IBM System-z, where we can store a large collection of images (as BLOBs), we first ported the image stitching algorithm to the mainframe. Then, to take advantage of the availability of a Cell Broadband Engine, we ported a data intensive portion of the image stitching algorithm, the key point extraction phase, to the IBM QS21, a Cell BE-based accelerator, to take advantage of its sixteen synergistic processing units, which are computationally intensive units that provide vector processing. We instrumented our ported system so we could measure the communication, computation, and overall processing times in order to compare the results with the original configuration.

1.5 Organization of this Thesis

Chapter 2 provides the reader with an understanding of the Image Stitching application as well as the architecture and performance of the IBM z900 System-z mainframe and the Cell BE QS21 accelerator. Chapter 3 describes the overall design and implementation of the test configuration and test harness used to measure performance of the image application before and after modification to run on the System-z/Cell BE.

Chapter 4 describes and analyzes the result of measuring performance (in terms of communication time and overall application time) to identify tradeoffs (advantages and disadvantages) in the design. Chapter 5 provides conclusions and suggests future work.


2. BACKGROUND

This section provides the reader with background on the three key concepts of the present project: the image stitching algorithm, the IBM System-z mainframe including the DB2 database system, and the IBM Cell Broadband Engine accelerator.

2.1 Image Stitching Algorithm

Image stitching is the process of combining multiple images with overlapping fields of view to produce a high-resolution image. There are four stages of the stitching process, described in the sections below.

2.1.1 Image Processing to Find Key Points

The first step of the algorithm is to process each individual image to identify key points (also called interesting points). This computationally intensive step is the only step where we have so far explored parallelism using the Cell BE, as will be reported in Chapters 3 and 4.

There are several approaches to finding key points. One approach uses scale invariant feature transformation (SIFT) [1], in which key points that are invariant to scale and orientation are identified. Another approach detects affine invariant key points, in which key points that are invariant to affine transformations (translations, rotation, scaling and shearing) are detected [2].

Lowe’s algorithm to find the interesting points based on SIFT [1] has four major phases: scale-space extrema detection, key point localization, orientation assignment, and key point description.

The scale-space extrema detection step searches over all scales and image locations to identify potentially interesting points that are invariant to scale and orientation. The search is implemented efficiently by using a difference-of-Gaussian function. The image is upsampled by a factor of two and then convolved over the Gaussian blur scales, and candidate key points are detected in the differences of the Gaussians. This is repeated over all the octaves, with the image downscaled by a factor of two at each octave until it reaches one fourth of its original size.
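As a minimal illustration (a C sketch written for this explanation, not the thesis code), one difference-of-Gaussian octave can be formed by pixel-wise subtraction of adjacent Gaussian-blurred levels; the array names are hypothetical, and the blurring itself is assumed to be done elsewhere:

    /* Sketch: one octave of the difference-of-Gaussian pyramid.
     * gauss[s] holds the image blurred at scale s; each DoG level is the
     * pixel-wise difference of two adjacent scales. */
    void dog_octave(float **gauss, float **dog, int levels, int npixels)
    {
        for (int s = 0; s < levels - 1; s++)
            for (int i = 0; i < npixels; i++)
                dog[s][i] = gauss[s + 1][i] - gauss[s][i];
    }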

In the key point localization step, a quadratic surface model is fit at each candidate pixel of the Gaussian scale space (obtained from the above process) over a 3x3 neighborhood around it to determine its location and scale, and key points are retained based on their stability (peak values of the derivatives around the pixel surface) [1].

In the orientation assignment step, one or more orientations are assigned to every key point location depending on the local image gradient directions. In this stage, the image data is transformed into key points, each with orientation(s), scale, and location. At this point, the key point representation is invariant to scale and rotation.

The final step of SIFT is to describe each key point using a 128-dimensional descriptor vector (a 4x4 spatial grid around the key point with 8 orientation bins per cell: 4*4*8 = 128). These descriptor vectors are used in the next stages of the image stitching application.

There are existing implementations of optimized image matching applications that use SIFT or affine invariant features to detect the key point descriptors and then match them accordingly. SIFT++ is one such implementation [3]: Andrea Vedaldi developed this C++-based implementation of SIFT, which has the flexibility to choose the number of octaves (eight by default), which octave to start with, and other options such as processing more than one image. With eight octaves, an image is upsampled once and downsampled seven times. In our implementation, we use Vedaldi’s code as our base and port it onto the QS21 to accelerate the process.

2.1.2 Image Matching

The image matching step of the image stitching algorithm involves identifying all key points that appear in multiple images. Matching is done pair-wise between images over the 128-dimensional descriptor vectors obtained from the extraction stage, and the same matching can then be extended to multiple images. There are several image matching implementations available for clusters, grids, and graphics processing unit (GPU) accelerators [4, 5].

2.1.3 Image Orientation

The image orientation step uses the matched key points to determine the relative orientation of each image to each of its overlapping images. Relative orientation matrices are used in bundle adjustment1 to refine the orientation of each image with respect to the scene. The 3D scene coordinates of each key point matched in two or more images are computed as well.

2.1.4 Rebuild Image

Once the scene structure is developed, each image is projected onto the scene to stitch together a panorama image.

1 “Bundle adjustment is the problem of refining a visual reconstruction to produce jointly optimal 3D structure and viewing parameter estimates.” -- Bill Triggs, Philip F. McLauchlan, Richard I. Hartley, and Andrew W. Fitzgibbon, “Bundle Adjustment - A Modern Synthesis,” Proceedings of the International Workshop on Vision Algorithms: Theory and Practice, pp. 298-372, September 21-22, 1999.


2.2 System-z and DB2

The mainframe used in this application for storing the images is IBM’s System z900. The IBM z-Series mainframes are descendants of the IBM 360, 370 and 390 mainframes; the z stands for “zero down time.” IBM z-Series mainframes are architected for enhanced flexibility, availability and reliability. Redundant I/O interconnect and enhanced driver maintenance help to protect not just the mainframe but also its applications from unplanned down time.

The current z900 is based on the 64-bit z/Architecture, which reduces memory and storage bottlenecks. The z900 has two processor clusters, each with six to ten processing units (PUs) [7], and a large instruction and data cache size and bandwidth [8]. The z/Architecture has 64-bit registers, integer instructions, branch instructions, address generation, control registers, I/O architecture, operands, and general registers that are used for cryptographic instructions [7].

The z900 series supports DB2, COBOL, C, C++, Windows and more. The two most common ways of running Linux on the z-Series are z/VM and the Integrated Facility for Linux. In the z/VM mode, each Linux system runs in its own virtual machine. System-z is capable of multiple levels of virtualization, including virtualization of the CPU, memory, and network. Using CPU virtualization, one physical machine can be divided into multiple logical partitions (LPARs), where each LPAR is a separate virtual machine running its own operating system; one can run tens of Linux images on the same LPAR [9]. How the real hardware resources are allocated to each virtual machine is specified by the tuning controls in z/VM (z Virtual Machine) [9].

One advantage of the System-z virtual Linux machine [10] is improved scalability: z/VM can support real storage from 2 GB to 256 GB. Linux on System-z also reduces cost (due to virtualization), improves service, and manages risk.

IBM’s DB2 is relational database management software. Images in DB2 can be stored as Large Objects (LOBs), Binary Large Objects (BLOBs), or Double-Byte Character Large Objects (DBCLOBs). When creating tables in DB2, we can use one table per table space, multiple table spaces, or even partitioned table spaces [11]. DB2 on System-z provides database security and data scalability along with constant availability.

2.3 Cell BE

The IBM Cell Broadband Engine (Cell BE) is a heterogeneous multicore chip that addresses three major design challenges: memory latency, power, and frequency.

Source: IBM Corporation, CELL architecture tutorials, March 2009

Figure 2 : High Level Block diagram of Cell Broadband Engine

Figure 2 shows the architecture of the Cell BE chip, which includes two kinds of processor elements: a main processor called the Power Processing Element (PPE) and eight satellite processors called Synergistic Processor Elements (SPEs). The PPE is a 64-bit implementation of IBM’s PowerPC Reduced Instruction Set Computer (RISC) architecture and is used for program control and operating system functions. Each Synergistic Processor Element (SPE) consists of a Synergistic Processor Unit (SPU) and a Memory Flow Controller (MFC).

The Cell BE is based on Single Instruction Multiple Data (SIMD) architecture and has two sorts of storage: main storage associated with the PPE and local storage (LS) for each of the SPEs.

The PPE and SPE processing elements are connected to each other, and to the input/output and memory interfaces, via an Element Interconnect Bus (EIB) [12]. The following subsections describe these parts in more detail.

2.3.1 Power Processing Element (PPE)

The Power Processing Element is a 64-bit IBM PowerPC architecture processor that can run both 32-bit and 64-bit instructions and applications. It is the main processor and is mainly used for program control, system and memory management, the operating system, task management, and managing the SPEs. It supports multithreading and integrated vector multimedia operations [13].

The PPE has an instruction unit, an execution unit, and a vector/scalar unit. The instruction unit has a level 1 (L1) instruction cache, instruction buffers, and dependency checking logic; it is responsible for instruction fetch, issue, decode, branch, and completion. The execution unit has an integer execution unit and a load/store unit; it takes care of all the fixed-point instructions along with load and store instructions. The vector/scalar unit consists of a floating-point unit, vector multimedia extension units, and individual instruction queues; it increases the overall throughput of the processor and is responsible for the floating-point and vector operations [14].

Source: IBM Corporation, CELL architecture tutorials, March 2009

Figure 3 : Block Diagram of PPE

Figure 3 is a block diagram of a PPE showing the Power Processing Unit (PPU) and its two-level cache. The PPE consists of the Power Processing Unit and the power processor storage subsystem (PPSS) [15]. Instruction and execution control are handled by the PPU. It includes a set of 64-bit PowerPC registers, 32 128-bit vector registers, a 32-KB level 1 (L1) instruction cache, a 32-KB level 1 (L1) data cache, an instruction-control unit, a load and store unit, a fixed-point integer unit, a floating-point unit, a vector unit, a branch unit, and a virtual-memory management unit. The PPSS consists of a unified 512-KB level 2 (L2) instruction and data cache, queues, and a bus interface unit that handles bus arbitration and pacing on the EIB. Delayed execution pipelines are used by the PPE in order to prevent out-of-order execution while improving the performance of the system.

2.3.2 Synergistic Processing Elements (SPE)

The eight synergistic processing elements (SPEs) of each Cell BE chip are the accelerating units of the system. As shown in Figure 4, each SPE consists of a synergistic processing unit (SPU), a memory flow controller (MFC), a local store (LS) of size 256 KB, and a file of 128-bit registers. Figure 4 is a block diagram of a Synergistic Processing Element.

Source: IBM Corporation, CELL architecture tutorials, March 2009

Figure 4 : Block Diagram of SPE

Unlike the PowerPC registers, the SPU’s SIMD registers can operate on both integer and floating-point instructions, and on both scalar and vector operations. The SPEs can be controlled by the PPE to initiate actions like start, stop, signal, and interrupt, and they can work independently or in coordination with the PPE or other SPEs.

Every SPU has its own program counter and fetches its instructions and loads and stores its data from its own local store. An SPU can issue two instructions per cycle, one for computation and another for memory access. A memory access operation can also be initiated by the PPU or another SPE.

The 256-KB local store is the only memory directly addressable by the SPU. The main memory cannot be accessed by the SPU directly; data must be copied explicitly from the PPU main memory to each SPU local store. A DMA operation handles both address translation and moving the data.

The SPU interacts with the other elements of the processor through its memory flow controller via the Element Interconnect Bus (EIB). To load or store data into its local store, the SPU makes asynchronous DMA requests, in a pipelined fashion, to the DMA controllers in the memory flow controller, which communicates with the other connected elements over the EIB.

2.3.3 QS21 and QS22

QS21

As shown in Figure 5, the IBM QS21 is a Cell BE-based blade server consisting of two high performance Cell BE processors connected to each other in a symmetric multiprocessing (SMP) fashion and delivered as a single-wide (one rack unit) blade that supports the Linux operating system. The two processors are connected by a coherent BIF bus via the south bridges, and a dual-channel Broadcom Gigabit Ethernet controller supports a fault-tolerant network setup [18].

Source: IBM QS21 performance, March 20, 2009

Figure 5 : High level Block Diagram of IBM QS21

Advantages of the Cell BE include the following [from 17]:

 Performance acceleration of computationally intensive workloads.

 Automatic server recovery and restart, and recovery after boot hang.

 High performance in applications like image/signal processing and graphics rendering.

QS22

The QS22 is a next-generation Cell BE blade server based on the PowerXCell 8i processor. The QS22 improves over the QS21 by adding enhanced double-precision floating-point support, which yields application results faster and with more reliability [19]. Like the QS21, the QS22 is a single-wide system that supports Linux. It has two 3.2 GHz PowerXCell 8i processors and 32 GB of DDR2 memory, along with two Gigabit Ethernet ports.

2.3.4 Programming the Cell BE

From the programming point of view, programming the Cell BE’s SPEs is similar to working with Linux threads. The SDK contains libraries that assist in managing the code running on the SPEs and communicating with this code during execution. Writing a program for the Cell BE involves the following steps [20]; a minimal sketch of the PPE side is shown after the list.

1. The PPE program starts and loads the SPE program into the SPE’s local store.

2. The PPE program instructs the SPEs to execute the program.

3. The SPE transfers required data to its local store from the PPE main memory (under control of the SPE program).

4. The SPE program processes the data.

5. The SPE program transfers the processed data from its local store to the PPE main memory.

6. The SPE notifies the PPE of program termination.
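The following is a minimal PPE-side sketch of this flow using the SDK’s libspe2 API; it is illustrative only, and the embedded SPE program handle name (sift_spu) is hypothetical rather than taken from the thesis code.

    /* PPE-side sketch: create an SPE context, load the embedded SPE program,
     * and run it in a POSIX thread (spe_context_run blocks until the SPE stops). */
    #include <libspe2.h>
    #include <pthread.h>
    #include <stdio.h>

    extern spe_program_handle_t sift_spu;   /* hypothetical, produced by embedspu */

    static void *run_spe(void *arg)
    {
        spe_context_ptr_t ctx = (spe_context_ptr_t)arg;
        unsigned int entry = SPE_DEFAULT_ENTRY;
        if (spe_context_run(ctx, &entry, 0, NULL, NULL, NULL) < 0)  /* steps 3-5 run inside the SPE program */
            perror("spe_context_run");
        return NULL;
    }

    int main(void)
    {
        spe_context_ptr_t ctx = spe_context_create(0, NULL);   /* step 1 */
        spe_program_load(ctx, &sift_spu);
        pthread_t t;
        pthread_create(&t, NULL, run_spe, ctx);                 /* step 2 */
        pthread_join(t, NULL);                                   /* step 6: SPE has stopped */
        spe_context_destroy(ctx);
        return 0;
    }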

Both the PPE and SPEs can be programmed in C or C++ using a common API that the SDK libraries provide. The Power Processing Element manages each SPE as a POSIX thread, and IBM’s Cell SDK libspe2 library provides the SPE management for these processes. Compiler tools, including make, are used for embedding SPE executables into the PPE executable. Both the PPE and SPEs can execute SIMD instructions, but they use different instruction sets.

The PPE supports both the Vector/SIMD (Single Instruction Multiple Data) multimedia extension instruction set and the PowerPC instruction set. Generally, it is preferred to use the eight SPEs to perform the massive SIMD operations and let the PPE program manage the application flow; however, it may be useful in some cases to handle some SIMD computation on the PPE. The operating system kernel running on the PPE manages virtual memory, along with mapping each SPE’s local store and problem state into their respective effective-address spaces [16].

Each SPE has a SIMD instruction set, 128 vector registers, two in-order execution units, and no operating system. The SPU’s local store is unprotected and untranslated storage with no virtual memory; thus, safeguarding against stack overflow and illegal accesses outside of array bounds is handled on the PPU side. An illegal instruction halts the SPU.

Data transfers are done with the help of DMA requests. Data must be moved between PPE main memory and the 256 KB SPE local stores with explicit DMA commands [16]. Each DMA transfer can move only 16 KB at a time; if the programmer needs to transfer more than 16 KB, then she has to use the DMA list command. A DMA list can contain up to 2,048 DMA transfers of up to 16 KB each. The DMA commands transfer data from an effective address to the local store, or from the local store to an effective address, depending on whether a get or a put command is issued; each transfer moves a contiguous block of the requested size starting at the given effective address or local store address.
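As a small illustration of these constraints (a sketch, not the thesis code), the following SPU-side fragment pulls a 64 KB block from main memory in 16 KB pieces with mfc_get and waits on a tag group before using the data; the buffer name and sizes are hypothetical, and the effective address is assumed to be suitably aligned.

    /* SPU-side sketch: fetch 64 KB of floats in 16 KB DMA chunks, since a
     * single DMA transfer is limited to 16 KB. */
    #include <spu_mfcio.h>

    #define CHUNK   16384                     /* hardware limit: 16 KB per DMA transfer */
    #define NCHUNKS 4                         /* hypothetical: fetch 64 KB in total     */

    static float ls_buf[NCHUNKS * CHUNK / sizeof(float)] __attribute__((aligned(128)));

    void fetch_block(unsigned long long ea)   /* ea: aligned main-memory address */
    {
        unsigned int tag = 1;
        for (int i = 0; i < NCHUNKS; i++)
            mfc_get((char *)ls_buf + i * CHUNK, ea + (unsigned long long)i * CHUNK,
                    CHUNK, tag, 0, 0);
        mfc_write_tag_mask(1 << tag);         /* select tag group 1            */
        mfc_read_tag_status_all();            /* block until all chunks arrive */
    }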

2.3.5 Performance Optimizations of the QS21

Performance of the QS21 depends on the performance of the Cell BE, which in turn depends upon the performance of the SPUs. The performance of a QS21 server is expected to be good, as it combines two Cell BE chips.

The performance of the SPU can be predicted because the DMA transfers are pipelined. Even though the SPUs need to DMA all the data they use, the impact of the DMA operations is limited: instruction prefetch delivers at least seventeen instructions sequentially from the branch target, DMA operations are buffered, and they can access the LS at most once every eight cycles [15].

The local store of the SPU is filled and emptied only by DMA operations. The bandwidth of the local store and of DMA transfers is as follows [15]:

 16-bytes-per-cycle load and store bandwidth, quadword aligned only

 128-bytes-per-cycle DMA-transfer bandwidth

 128-byte instruction prefetch per cycle

The Memory Flow Controller (MFC) maintains queues of DMA commands to support DMA transfers. Once a DMA command has been queued to the MFC, the MFC can autonomously execute a sequence of DMA transfers while the SPU continues to execute instructions. This overlap of SPU instructions and DMA commands makes it easier to schedule DMA transfers so that memory latency is hidden [15]. The page and segment tables of the PPU take care of the system storage attributes.

The Element Interconnect Bus (EIB) can transfer 16 bytes of data every bus cycle. As each address request can transfer up to 128 bytes, the theoretical maximum data bandwidth on the EIB is 204.8 GB per second (at a frequency of 1.6 GHz: 1.6 GHz x 128 bytes = 204.8 GB/s) [15].

3. ARCHITECTURE

3.1 High Level Design

Figure 6, shown below, provides a high level view of the architecture we are evaluating. The IBM System z900 (mainframe) stores and manages a collection of images using the DB2 relational database management system. A QS21 Cell BE accelerator is accessed across a network. The image stitching application can run on the mainframe, on the Cell BE’s main PPU alone, on the PPU with portions of the algorithm distributed over the sixteen SPUs, or on a standard Intel processor (not shown). The objective of this study is to compare the performance of the SIFT portion of the image stitching algorithm on these four machine configurations.

Figure 6 : Architecture Design

3.2 Design

This design is developed to demonstrate that the Cell BE and System-z, integrated together, can perform image stitching for large data sets and may also accelerate the algorithm.

To implement the first step of the design, one needs to determine whether the two environments, the System z900 and the QS21, can talk to each other, and also to instrument the timing of their communication and tune it accordingly. The communication between the Cell and the System-z should be bi-directional, and either one can act as client or server.

As a first step, the first phase of the image stitching algorithm, image extraction, is implemented. Image extraction is computationally intensive, taking about 40% of the time of the entire algorithm, and needs to be accelerated. The mapping of this part onto the System-z and the Cell BE is done in the following way.

The images that need to be stitched are placed in DB2 on the System-z via JDBC. Then, depending on the requirement, the desired images are retrieved and sent over for processing. These images are stored as BLOBs (binary large objects) inside DB2. TIFF libraries and ImageMagick libraries are used for chunking and formatting the images required by the application before storing them into the database.

A BLOB can store up to 2 GB of data, so images bigger than that need to be chunked before storing. Working with very large images without chunking them is also not recommended, because the local store on the Cell BE has memory limitations. These images can be stored in tables and managed with the help of hash tables for storage and retrieval.

For example, when this application is used for larger image sets, such as thousands of images, more storage space is needed to send and receive data back and forth to the mainframe. In the same way, when stitching larger images, more space is needed on the Cell to store the images and the key point files. So, eventually, chunking or striping the images is more advantageous.

These images can be of any format, including .TIFF or .PIF. For now, .TIFF files are tiled and stored; later, depending upon the required application, they are retrieved, converted into specific formats, and used.

During implementation, however, this subsection of retrieving images from DB2 is not used; instead, images are loaded into main memory via the file system.

Once the image storing step is done, portions of the image stitching algorithm are moved onto the Cell to accelerate the application. The transfer of images and data to and from the Cell processor is done using socket programming. Different sizes of data were transferred and timed.
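As a minimal sketch of this style of transfer (in C, with a hypothetical receiver address; not the thesis socket code), the sender connects a TCP socket and streams the image buffer, prefixed by its length so the receiver knows how many bytes to expect:

    /* Client-side sketch: send an image buffer over TCP, length first. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int send_image(const char *ip, int port, const unsigned char *img, uint32_t len)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons((uint16_t)port);
        inet_pton(AF_INET, ip, &addr.sin_addr);        /* e.g. the QS21's address */
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            close(fd);
            return -1;
        }

        uint32_t netlen = htonl(len);                  /* length prefix, network order */
        write(fd, &netlen, sizeof(netlen));

        uint32_t sent = 0;                             /* stream the image bytes */
        while (sent < len) {
            ssize_t n = write(fd, img + sent, len - sent);
            if (n <= 0) { close(fd); return -1; }
            sent += (uint32_t)n;
        }
        close(fd);
        return 0;
    }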

On the Cell side, the focus so far is just on image extraction, the first step in image stitching, because it is the most computationally intensive part of the algorithm and takes a relatively long time. Upon receiving images, the Cell BE passes them as parameters to the extraction program to find the key points and compute the descriptor vectors.

The transferred images pass through the four stages of the extraction process: scale-space extrema detection, key point localization, orientation assignment, and key point description. At the beginning, the MATLAB version of Lowe’s SIFT detector was used, but due to MATLAB licensing issues on the Cell BE and System-z, the C++ version by Andrea Vedaldi is used as the base code. This open source C++ image extraction code [3] is what runs on the mainframe and the Cell BE.

As the algorithm’s stages are sequential and interdependent, data parallelism is implemented within each stage: for the computationally intensive inner loops, the image is split into strips and processed over the available SPEs in parallel.

As each SPU has a local store of 256 KB and is programmed with statically sized buffers, one needs to work on portions of the image that best fit the local store, because utilizing the maximum available storage space increases the performance of the system. Depending on the size of the image, before computing over the SPEs one needs to assess what portion of the image is sent over, how many times the SPEs need to be called to complete the given task, and how many times each SPE needs to DMA data.
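A simple sketch of this kind of bookkeeping (hypothetical sizes, not the thesis code) computes how many image rows fit in a local-store data buffer, how many strips that implies, and how many 16 KB DMA transfers each strip needs:

    /* PPE-side sketch: plan image strips for the SPEs. */
    #include <stdio.h>

    #define NUM_SPES  16                /* QS21: two Cell BE chips, 8 SPEs each         */
    #define LS_BUDGET (128 * 1024)      /* hypothetical local-store bytes left for data */
    #define DMA_MAX   16384             /* 16 KB per DMA transfer                       */

    int main(void)
    {
        int width = 1024, height = 1024;                /* test image size             */
        int row_bytes = width * (int)sizeof(float);     /* image stored as floats      */
        int rows_per_strip = LS_BUDGET / row_bytes;     /* rows that fit in the budget */
        int strips = (height + rows_per_strip - 1) / rows_per_strip;
        int dmas_per_strip = (rows_per_strip * row_bytes + DMA_MAX - 1) / DMA_MAX;

        printf("rows/strip=%d strips=%d (about %d per SPE) DMAs/strip=%d\n",
               rows_per_strip, strips, (strips + NUM_SPES - 1) / NUM_SPES, dmas_per_strip);
        return 0;
    }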

Using image strips to run the application in parallel is challenging, because the algorithm first upsamples the image to twice its size, so the memory requirement doubles; the allocation of the strips and their sizes must therefore be accounted for, as a lot of memory is required during the scale spacing.

Before attempting the parallel architecture, the image extraction portion of the algorithm was compiled and run on the mainframe alone, on the PC alone, and then on the Power Processing Element of the Cell BE processor without using the SPEs. Then, based on profiling, parts of it were selected to run on the synergistic processing elements to accelerate the process.

The first phase produces files with a “.key” extension; the names of the files are the same as the names of the input image files. These files hold the locations and the 128 x n descriptor vectors for the key points of the image that are used for matching the images.

The matching phase of the algorithm does not care about the order of the descriptor vectors in the file, so the writes can be done in any order, but synchronization is needed because all write operations for an image go to the same file. These files can be used directly for matching or stored back on the System-z for later use, depending upon the application. The files are sent back to the System-z using the socket programs.

We open both client and server ports in both environments, and the sockets communicate with each other whenever a step completes.

3.3 Implementation

As mentioned in the design phase, the images are stored and retrieved in both the file system and the DB2 database of the System-z. Images are read into memory, then transferred across the network to the QS21 blade server using sockets. Initially, the sockets were implemented in C and later in Java. Once the blade server receives an image into its main memory, portions of that image are in turn transferred to its sub-processors (SPUs) to run SIFT in parallel on strips of the image.

The performance of the Cell is highest if data is 128-byte aligned during data transfers; at a minimum, the data must be 16-byte (quadword) aligned. So the data needs to be 128- or 16-byte aligned before transfers between the PPE and the SPEs. The chunk sizes for the data transfers should also be powers of 2; for this reason, we divided the images into power-of-2 chunks for processing.
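A minimal sketch of allocating such buffers on the PPE side (not the thesis code) uses posix_memalign so the start address meets the 128-byte preference, with the size padded to a multiple of 16 bytes:

    /* PPE-side sketch: allocate a 128-byte aligned, 16-byte padded buffer for
     * DMA-friendly transfers (static buffers can use __attribute__((aligned(128)))). */
    #include <stdlib.h>
    #include <string.h>

    float *alloc_chunk(size_t n_floats)
    {
        void *p = NULL;
        /* round the byte count up to a multiple of 16 so DMA sizes stay legal */
        size_t bytes = (n_floats * sizeof(float) + 15u) & ~(size_t)15u;
        if (posix_memalign(&p, 128, bytes) != 0)
            return NULL;
        memset(p, 0, bytes);
        return (float *)p;
    }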

As the SPEs are treated as POSIX threads, threads, along with direct memory access (DMA) techniques, are used for creating SPE contexts and for transferring data between the main Power Processor Element and the SPEs, respectively. As parallelism is implemented by distributing data, we look for the available SPEs and try to utilize as many as possible. Since the QS21 has 16 SPUs (8 per Cell BE chip), context creation, invocation, and waiting are done on all 16 SPEs each time so that every SPE finishes its task.

The synergistic processing elements access the PPU main memory with the help of DMA. Each DMA can transfer only 16 KB of data, so if larger sizes are needed, the algorithm must break big chunks into small enough pieces. For this application, as the image is stored in the 'float' data type, the maximum data transfer per DMA is 4,096 floating-point values, i.e., 16,384 bytes.

The data transfer could use single or double buffering or DMA lists. The present implementation used single buffering.
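For reference, a double-buffered variant (a sketch under the same SDK assumptions as above, not the thesis implementation) alternates two local-store buffers and two tag groups so that the DMA for the next chunk overlaps computation on the current one:

    /* SPU-side sketch: double buffering with two buffers and two tag groups. */
    #include <spu_mfcio.h>

    #define CHUNK 16384
    static float buf[2][CHUNK / sizeof(float)] __attribute__((aligned(128)));
    static float checksum;                                /* placeholder result */

    static void process_chunk(const float *data, int n)   /* hypothetical per-chunk work */
    {
        for (int i = 0; i < n; i++)
            checksum += data[i];
    }

    void stream_chunks(unsigned long long ea, int nchunks)
    {
        int cur = 0;
        mfc_get(buf[cur], ea, CHUNK, cur, 0, 0);                 /* prefetch chunk 0      */
        for (int i = 0; i < nchunks; i++) {
            int nxt = cur ^ 1;
            if (i + 1 < nchunks)                                  /* start the next fetch  */
                mfc_get(buf[nxt], ea + (unsigned long long)(i + 1) * CHUNK,
                        CHUNK, nxt, 0, 0);
            mfc_write_tag_mask(1 << cur);                         /* wait for current only */
            mfc_read_tag_status_all();
            process_chunk(buf[cur], (int)(CHUNK / sizeof(float)));
            cur = nxt;
        }
    }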

The Gaussian scale spacing portion of the image extraction was transferred to the SPEs for parallel processing. Specifically, the CopyAndDownSample step of Lowe’s extraction was used first. This portion of code takes an image of size n x n and downsamples it to a quarter of its size; two 'for' loops traverse all the points of the image to resize it. Even though CopyAndDownSample does not take significant time during computation, it was selected as a starting point.
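A plain-C sketch of this kind of downsampling (keeping every other pixel in each direction; written for illustration, not copied from Vedaldi’s code) shows the loop nest whose rows can be divided among the SPEs:

    /* Sketch: copy-and-downsample an n x n image to (n/2) x (n/2) by taking
     * every other pixel in each direction. */
    void copy_and_downsample(const float *src, float *dst, int n)
    {
        int half = n / 2;
        for (int y = 0; y < half; y++)          /* output row y reads input row 2*y */
            for (int x = 0; x < half; x++)
                dst[y * half + x] = src[(2 * y) * n + (2 * x)];
    }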

For this, the sizes of the source and destination are required, so the size of the image and the number of iterations assigned to each SPE are calculated beforehand and the information is sent to the SPEs with the help of control blocks. The transfer size differs for every loop, as the image size is different for each level of each octave. The CopyAndUpSample function takes an image of size n x n and quadruples its size; in this case the source image to be transferred is smaller than the destination. Care has to be taken that the source, destination, and SPE program instruction sizes together do not exceed the 256 KB limit of an SPE local store.

This implementation can be extended to the rest of the Gaussian scale space, including CopyAndUpSample, and then to the scale-space extrema detection. The convolution function was ported to the SPUs because it takes about 40% of the execution time when we run the image processing stage. This phase was implemented in several different ways, which are compared with each other below.

In the very first implementation, the data is transferred with simple DMA transfers and then put back in the same way; the Cell SDK's mfc_get and mfc_put are used to get data from and put data back to main memory, using single buffering. As SIFT++ [3] uses 1D convolution, the data is put back transposed: after the data is put back, it has to be transposed, and with this transposed array as input the 1D convolution is called again. This process is followed at all levels of each octave.

Thus, a transpose step was added to the existing code after each convolution call. This was later extended to use double buffering with the same DMA transfers. In both the single- and double-buffered versions the data is taken row-wise, and the SPU local store was not completely utilized.

In an improved implementation, convolution was re-implemented as two-dimensional (2D) convolution using an existing vectorized 2D convolution provided in the image libraries of the Cell SDK. In theory, 2D convolution takes more time than 1D convolution, but the unoptimized transpose was so expensive that 2D convolution became much faster than 1D convolution plus transposition. There are existing 3x3, 5x5, 7x7 and 9x9 convolutions for single-precision floating point, unsigned integers, etc.; as the base code uses single-precision floating point, we used the existing '1f' convolutions.

For this we needed to calculate the weight matrix from the value of the standard deviation sigma, and as a 13x13 '1f' kernel is needed for the last octave, the existing convolution was extended to 13x13. These convolution functions take care of clamping and wrapping data at the borders, but the data above and below the scanline is the programmer's responsibility.
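A minimal sketch of building such a weight matrix (a normalized 2D Gaussian of a given odd size for a given sigma; not the SDK's or the thesis code) follows:

    /* Sketch: fill a size x size (size odd, e.g. 13) Gaussian weight matrix for
     * standard deviation sigma, normalized so the weights sum to 1. */
    #include <math.h>

    void gaussian_weights(float *w, int size, float sigma)
    {
        int r = size / 2;
        float sum = 0.0f;
        for (int y = -r; y <= r; y++)
            for (int x = -r; x <= r; x++) {
                float v = expf(-(float)(x * x + y * y) / (2.0f * sigma * sigma));
                w[(y + r) * size + (x + r)] = v;
                sum += v;
            }
        for (int i = 0; i < size * size; i++)   /* normalize */
            w[i] /= sum;
    }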

To account for the scanline, we padded extra rows above the data when sending the first few rows of the image and padded rows below when sending the last few rows.

The resulting image was put back row-wise into the destination buffer on the PPU using DMA transfers. As the convolution is 2D, we need to call the convolution only once; this cuts the number of convolution calls in the main algorithm to half of the original.

The first, 1D implementation was then re-implemented using DMA lists, in which a scatter/gather technique is used. Here the data is DMAed in the same way, but instead of one row at a time, four rows of data are transferred at a time.

This data is then processed and put back into the destination buffer on the PPU using the DMA list command mfc_putl. Here we fill a DMA list that holds how much data needs to be transferred and where to transfer it; if data must be transferred to n places, the DMA list needs to be twice n in size. In this case, as each data transfer is the size of one row, a DMA list of twice the row count is used.
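A sketch of building and issuing such a list on the SPU side is shown below; it assumes the SDK's mfc_list_element_t layout and uses hypothetical row geometry, so it should be read as an illustration rather than the thesis code.

    /* SPU-side sketch: scatter four processed rows back to main memory with a
     * single DMA-list put. Each list element gives one transfer's size and the
     * low 32 bits of its effective address; the high bits come from 'ea_base'. */
    #include <spu_mfcio.h>

    #define ROWS      4
    #define ROW_BYTES 4096                      /* e.g. 1024 floats per row */

    static float rows_ls[ROWS][ROW_BYTES / sizeof(float)] __attribute__((aligned(128)));
    static mfc_list_element_t list[ROWS] __attribute__((aligned(8)));

    void put_rows(unsigned long long ea_base, unsigned int dst_stride_bytes)
    {
        unsigned int tag = 2;
        for (int i = 0; i < ROWS; i++) {
            list[i].notify = 0;
            list[i].size   = ROW_BYTES;
            list[i].eal    = (unsigned int)(ea_base + (unsigned long long)i * dst_stride_bytes);
        }
        /* the list size argument is in bytes: ROWS * sizeof(mfc_list_element_t) */
        mfc_putl(rows_ls, ea_base, list, ROWS * sizeof(mfc_list_element_t), tag, 0, 0);
        mfc_write_tag_mask(1 << tag);
        mfc_read_tag_status_all();
    }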

This implementation avoids the transpose step that was added after each convolution in the first implementation. It can be extended to process more than four rows at a time, depending upon the space available in the SPU local store. The convolution is still called the same number of times as in the original.

The compute-orientations portion of the extraction phase can also be parallelized: the data to be distributed over the SPEs would be the detected key points, and computing the orientations for each point can be spread over the SPEs. Here again we use data parallelism, as the task itself cannot be parallelized.

The next phase, the matching algorithm, is a brute-force search in which key points with descriptor vectors are compared and matched according to thresholds. This could be implemented on the Cell to take advantage of the SPU cores and reduce the time taken to match the points. Matching is presently implemented for two images and can be extended to match multiple images.
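A plain-C sketch of such a brute-force match (squared Euclidean distance over the 128-dimensional descriptors with a simple distance threshold; the threshold and array layout are hypothetical, and this is not the thesis matching code):

    /* Sketch: for each descriptor in image A, find the nearest descriptor in
     * image B and accept the pair if the squared distance is under a threshold. */
    #define DESC_DIM 128

    static float sq_dist(const float *a, const float *b)
    {
        float d = 0.0f;
        for (int i = 0; i < DESC_DIM; i++) {
            float t = a[i] - b[i];
            d += t * t;
        }
        return d;
    }

    /* descA: na x 128 values, descB: nb x 128 values; match[i] = index in B or -1. */
    void brute_force_match(const float *descA, int na, const float *descB, int nb,
                           int *match, float threshold)
    {
        for (int i = 0; i < na; i++) {
            float best = threshold;
            match[i] = -1;
            for (int j = 0; j < nb; j++) {
                float d = sq_dist(descA + i * DESC_DIM, descB + j * DESC_DIM);
                if (d < best) { best = d; match[i] = j; }
            }
        }
    }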

The image matching, orientation, and image rebuilding phases may also be parallelized in the future.

The implemented algorithm is instrumented to collect performance times related to sending images of varying sizes across the communication channels and to completing the main processing steps. Performance results, analysis, and conclusions are the subject of the next chapter.

4. METHODOLOGY, RESULTS AND ANALYSIS

4.1 Methodology

This section describes how we tested the SIFT algorithm under various system configurations to determine how performance varied.

Four configurations for running the SIFT step of the image stitching algorithm were considered and instrumented with timings:

 Stand-alone Intel-class machine – We used an Intel Celeron machine with a clock speed of 1.86 GHz.

 Stand-alone System-z – We used the Walton College IBM System z900. The clock speed of the machine is 1.2 GHz.

 Stand-alone QS21 using only the PPE – We used a QS21 on loan from IBM. The clock speed of the PPE is 1.42 GHz.

 QS21 using the PPE and SPEs – We used a QS21 on loan from IBM. The clock speed of this machine is 3.2 GHz.

We used images of size 1024x1024 (which can fit comfortably in the memory of a Cell BE PPE). For all testing, we considered the times to access data from the file system of the local host. [In the case of the IBM System-z, we separately stored the data in the DB2 database and used JDBC to access it, but we do not report the performance results of accessing that data.]

To test each configuration, we started with the same SIFT implementation (the C++-based code from Vedaldi [3]). We initially used 1D convolution, but it proved too slow, so we converted the code to use 2D convolution (only in the final configuration that used the SPUs). We modified the code to make use of the SPEs in the fourth configuration.

The code was unit tested over one octave at the beginning to make sure the results obtained were correct, then over the entire SIFT algorithm (that is, over all the octaves). For this testing, all portions of the parallel version were run over one octave level with a defined sigma, and the end results were checked by comparing the data computed on the PPU with the data obtained after using the SPUs. Once these results were satisfactory, the same approach was extended to the entire SIFT++ code. Because the image size and the input parameters vary for every call, the DMA transfer sizes need special care: to utilize the available local store, one tends to transfer 16 KB at a time, and though this size is constant, the image size varies with each call depending upon the functionality. This requires particular attention during testing as the code is extended from a single octave to several octaves.
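A small sketch of the kind of check used for this verification (comparing the PPU-only output against the PPU+SPU output within a floating-point tolerance; the tolerance is a hypothetical choice):

    /* Sketch: report the largest absolute difference between two result buffers. */
    #include <math.h>
    #include <stdio.h>

    int compare_results(const float *ppu_out, const float *spu_out, int n, float tol)
    {
        float worst = 0.0f;
        for (int i = 0; i < n; i++) {
            float d = fabsf(ppu_out[i] - spu_out[i]);
            if (d > worst)
                worst = d;
        }
        printf("max abs diff = %g\n", worst);
        return worst <= tol;                    /* 1 = results agree within tolerance */
    }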

4.2 Results

4.2.1 Timing the Overall Application

This section describes the overall performance (in seconds) of the first phase of the image stitching algorithm (image extraction/processing) when it is run in the four different configurations. All timings are for one 1024x1024 pixel image; the key point file generated in this case is 4.42 MB.

Figure 7 : Image Extraction Runtime

These results show the Intel PC outperforming the IBM mainframe as well as the Cell BE PPU-only and Cell BE PPU+SPU configurations. These results seem counterintuitive in that we might expect an Intel-based PC to underperform a mainframe coupled with a hardware accelerator, so it is necessary to look at more detailed results to understand the timings. When we separately time the substeps, the picture is clearer, as shown in the following table.

Configuration      Transfer image     Load image    Key point    Compute key points    Transfer key point
                   from z to QS21     from file     detection    and store in file     file from QS21 to z
                   (seconds)          (seconds)     (seconds)    (seconds)             (seconds)

PC alone           N/A                0.07          6            5.9                   N/A
System-z           N/A                0.06          34           51.9                  N/A
QS21 PPU           N/A                0.06          9            39.9                  N/A
QS21 PPU+SPU       N/A                0.06          3            39.9                  N/A
System-z + QS21    0.353              0.06          3            39.9                  0.141

Times to load images from files appear fairly constant and small (around 0.06 seconds) for all configurations. Similarly, the time to transfer an image from the IBM mainframe to the Cell BE is small.

The dominant timing differences appear in the Key Point Detection step and the time it takes to compute and store the key points.

4.2.2 Timing the Key Point Detection

When key point detection is implemented with the fast 2D convolution, the times taken to detect the key points in the four configurations are as shown in Figure 8.

[Bar chart: key point detection time in seconds. PC alone: 6, System-z: 34, PPU: 9, PPU+SPU: 3]

Figure 8 : Key point Detection Run time

This graph shows that, considering just the SIFT algorithm, the Cell PPU alone underperforms the PC running the same algorithm; this is because the Intel chip is faster than the Cell's 64-bit PowerPC core. The IBM mainframe version is noticeably slower. But when the SPUs are included, the parallelism of this step vaults the Cell BE past the PC for the convolution portion.

4.2.3 Timing Results in more detail

The convolution step is described here in more detail, considering a 1024x1024 image that traverses the extraction process once over one octave level; the Gaussian scale-space copy-and-downsample step and the different convolution implementations are discussed below.

Our initial SPU implementation of the SIFT operation CopyAndDownSample turned out to be slower than running the algorithm on the PPU alone:

PPU alone: 0.004 Seconds

PPU + SPU: 0.027 Seconds

The time taken for the SPE context creation is 0.013 seconds.

When 1D convolution was re-implemented with single buffer DMA transfers the timings were worse:

PPU alone: 0.26 Seconds

PPU + SPU: 0.39 Seconds

Transpose alone: 0.16 Seconds

1D convolution alone: 0.035 Seconds


When 1D convolution was re-implemented with single buffer DMA-lists, the timings improved:

PPU alone: 0.26 Seconds

PPU + SPU: 0.08 Seconds

1D convolution alone: 0.04 Seconds

When 2D convolution was implemented with single buffer DMA transfers, the overall time improved substantially:

PPU+SPU: 0.027 Seconds

As the 2D convolution is implemented using SIMD instructions, it runs only on the SPU, and as mentioned in the implementation section this method has no additional transpose overhead. Even though the local store of the SPU is limited to 256 KB, asynchronous DMA lists and direct memory access operations work around these memory limitations through prioritized and looped data transfers.

The 1D convolution convolves the matrix once, then transposes the matrix and convolves it again. When this is implemented with the SPU, we do the convolution, send the data back to the PPU, transpose the matrix, and call the SPU again for the next convolution. The transpose can be avoided if we use DMA lists instead of plain DMA transfers when moving the data between the PPU and the SPUs.

4.2.4 Timing the Communication

To understand the communication overhead between the System-z and the Cell BE, we used a round-trip messaging utility to measure the time to send an image from the System-z to the Cell BE and back again for image sizes ranging from 1 MB to 10 MB. Figure 9 shows the results: communication time grows linearly with image size.

We also considered the time to send multiple 2 MB images round trip from the System-z to the Cell BE and back, varying the number of images sent from 1 to 10. Again, communication time grows linearly, as shown in Figure 9.

Figure 9 : Communication time from Z to Cell

After images are processed, files containing key points are sent back to the mainframe for storage. The images used here are 1024x1024 pixels, sent eight at a time. The time taken to send the processed data (key points) back to the System-z was measured, as shown in Figure 10. When running large data sets, the resulting key points have to be stored back on the System-z, as the Cell cannot hold the results; these key points are used in the later stages for image matching and orientation.

We observed that the time taken to send the images to the Cell is less than the time taken by the Cell to send the key point files back to the System-z. This is because the key point files are big, as they hold a 128-dimensional descriptor vector for each key point of the image. This difference in timings may vary depending upon the number of key points detected in the respective images.

Figure 10 : Communication time from Cell to Z

4.3 Analysis

The image stitching algorithm operates correctly in all four configurations, giving the same results. Still, the timings vary in a way we did not initially expect. Considered overall, the PC implementation outperformed the mainframe substantially and, to a lesser extent, the Cell BE PPE and Cell BE PPE+SPE configurations.

The computationally expensive 2D convolution step did show a significant speedup on the Cell BE PPE+SPE configuration (due to parallelism) over the PC.

The overall performance of the Cell PPE+SPE was still slower than the PC because of communication costs. After sending the image, the mainframe waits for the task to complete and then receives the results; in the current implementation, this step is not pipelined. This is a clear area for future work.

Similarly, within the Cell BE, the CopyAndDownSample process of the Gaussian scale step does not show any improvement when using the 16 co-processors. This is because the data transfer between the processors is more expensive than the computation itself, so for trivial computations, such as nested loops with little work per element, it is not advisable to parallelize. In this architecture the memory is not shared, so extra steps are required to create the contexts, call the SPUs, and transfer the data over to the local stores.

4.4 Comparison with Recent Results

One other research team has recently reported an implementation of the SIFT transformation, the first phase of the image stitching application, that uses a PlayStation 3 Cell BE [21]. The base code used is AutoPano-C. The Cell BE of the PS3 has six working SPEs, and they do not use a large database for image storage. They implemented SIFT by dividing it into four tasks: image blurring, difference of Gaussians, gradient map calculation, and descriptor calculation. They compared PPU vs. PPU+SPU but did not compare against other environments, and their results stated that the experiment achieved linear scalability.

5. CONCLUSIONS

5.1 Summary

This project took over a year to complete, and several of the steps below took months. We started with a Cell BE simulator, initially demonstrated connectivity with it via sockets from a calling program running in a UNIX environment and later from the IBM System-z mainframe environment running Linux, and at that point determined that a simulator approach could not yield realistic performance timings for any application.

Then we acquired the Cell BE QS21 hardware accelerator from IBM and learned to program it. Around that time, we considered several candidate algorithms for showing off mainframe-Cell BE integration (among them sorting and the image stitching algorithm). We settled on the latter because of its high potential value and our access to the author of that algorithm, Dr. Jackson Cothren.

Soon after, we found that considerable effort was needed to port the algorithm to the IBM mainframe and Cell BE environments (due to licensing issues with the MATLAB implementation), so we re-wrote the application to remove dependencies on MATLAB, first by porting to Octave and later by using the C++ version developed by Andrea Vedaldi.

We initially ran the image stitching application only on the System-z. Then we ported the SIFT portion (which finds interesting points in an image and is part of the overall image stitching algorithm) to the QS21 main processor. We developed the DB2 database to store chunked images, then re-sized the database images to extract smaller image chunks from larger images so the chunks could fit the Cell BE memory. Finally, we distributed the SIFT algorithm to the Cell BE's satellite SPU processors. We instrumented the code, ran the experiments, and reported the results (Chapter 4).

We found that the System-z helps in storing large image data sets and that the System-z and the Cell BE, combined, can be used for stitching images from large databases of images. Still, our current databases of images are not very large (measured in thousands of images) and could comfortably fit in a database such as MySQL on a PC. We also found that, in overall time, a PC image stitching implementation outperforms the mainframe/Cell BE running image stitching, because the Intel processor is faster than the 64-bit PowerPC architecture (the architecture of the Cell BE main processor). We found that the Cell BE does accelerate the application over a desktop PC (just for the SIFT algorithm, not overall) and that the co-processors of the Cell can be used to optimize that part of the solution, but when combined with the mainframe, the overall performance degrades.

5.2 Contributions

Our work provides several contributions. We demonstrated that we could store a collection of images in DB2 as BLOBs and that, to use the Cell BE with its small memory, we need to break big images into smaller ones for processing on the Cell BE. We showed how to port the application (the C++ version) onto the z900 and from there onto the Cell BE QS21. The application was developed as a Single Instruction Multiple Data model, in which the instructions of a particular portion of the application remain the same for all the SPUs and are executed over different portions of the images. We initially developed a one-dimensional DMA/DMA-list convolution on the QS21, but when this proved too slow, we re-worked the algorithm to demonstrate speedup using a two-dimensional convolution and extended it to a two-dimensional weight matrix of size 13x13. We instrumented our system, measured the communication and overall processing times, and analyzed the results, showing that the Cell BE currently provides limited speed advantages over a stand-alone PC in certain steps of the algorithm, but not overall.

5.3 Future Work

The above results are preliminary. More work is needed before we can judge whether, and by how much, the image stitching algorithm can be substantially accelerated by the Cell BE architecture. The following are areas for future research:

 The communication from the System-z to the Cell BE and from the Cell BE main processor to the peripheral processors can be pipelined.

 We could make better use of the memory of the QS21 PPU and SPUs to transfer larger chunk sizes than in our reported results. The downside is that more revisions of the image stitching algorithm would be needed.

 The data can be stored and managed using hash tables for different data sets in larger applications, so that when the images are chunked for processing they need not be re-stitched, as we already know how they match.

 Other optimization techniques can be implemented, such as external addressing, where RAM can be accessed directly from the SPU local store. These techniques may improve performance.

 We could repeat our experiments with the IBM Cell BE QS22, a next-generation Cell BE architecture based on the PowerXCell 8i processor. The QS22 has larger memory than the QS21 and supports double-precision floating-point arithmetic for greater precision.

 Our present experiment only handles the SIFT portion of the overall image stitching algorithm, in which interesting points are identified. The step of matching images could also be parallelized using the Cell BE, and rescaling and translating the coordinate systems of individual images could also be accomplished in parallel.

 Finally, it would be interesting to parallelize the entire image stitching algorithm in such a way that portions could be accelerated using either the Cell BE or a grid/cluster of commodity PCs, to compare these two parallelism architectures. Then we could compare optimization of this algorithm using general purpose commodity software with the benefits of vector accelerators.


REFERENCES

[1] D. Lowe, “Distinctive Image Features from Scale-Invariant Key Points,” International Journal of Computer Vision, vol. 60, Feb. 2004, pp. 91-110.

[2] K. Mikolajczyk and C. Schmid, “An Affine Invariant Interest Point Detector,” Proceedings of the 7th European Conference on Computer Vision, Part I (ECCV '02), Springer-Verlag, London, UK, 2002, pp. 128-142.

[3] A. Vedaldi, “A Lightweight C++ Implementation of SIFT,” http://vision.ucla.edu/~vedaldi/code/siftpp/siftpp.html, accessed November 6, 2008.

[4] J. Xing and Z. Miao, “An Improved Algorithm on Image Stitching Based on SIFT Features,” Proceedings of the Second International Conference on Innovative Computing, Information and Control, September 5-7, p. 453.

[5] Xiaohua Wang and Weiping Fu, “Optimized SIFT Image Matching Algorithm,” IEEE International Conference on Automation and Logistics (ICAL 2008), Sept. 2008, pp. 843-847.

[6] IBM System-z/OS Concepts, ftp://www.redbooks.ibm.com/redbooks/SG246366/zosbasics_textbook.pdf, accessed January 31, 2009.

[7] IBM eServer z900 Series Technical Guide, http://www.redbooks.ibm.com/redbooks/SG245975.html, accessed March 28, 2009.

[8] E. M. Schwarz, M. A. Check, C.-L. K. Shum, T. Koehler, S. B. Swaney, J. D. MacDougall, and C. A. Krygowski, “The Microarchitecture of the IBM eServer z900 Processor,” IBM Journal of Research and Development, vol. 46, no. 4/5, 2002, pp. 381-396.

[9] Linux on IBM System-z: Performance and Tuning, http://www.redbooks.ibm.com/redbooks/SG246926/wwhelp/wwhimpl/js/html/wwhelp.htm, accessed March 29, 2009.

[10] Linux on IBM System-z with z/VM, http://www.vm.ibm.com/library/zvmlinux.pdf, accessed March 29, 2009.

[11] LOBs with DB2 for z/OS: Stronger and Faster, http://www.redbooks.ibm.com/redpieces/pdfs/sg247270.pdf, accessed April 17, 2009.

[12] C. R. Johns and D. A. Brokenshire, “Introduction to the Cell Broadband Engine Architecture,” IBM Journal of Research and Development, vol. 51, no. 5, Sep. 2007, pp. 502-520.


[13] T. Chen, R. Raghavan, J. N. Dale, and E. Iwata, “Cell Broadband Engine Architecture and its First Implementation: A Performance View,” IBM Journal of Research and Development, vol. 51, no. 5, Sep. 2007, pp. 559-572.

[14] J. A. Kahle, M. N. Day, H. P. Hofstee, C. R. Johns, T. R. Maeurer, and D. Shippy, “Introduction to the Cell Multiprocessor,” IBM Journal of Research and Development, vol. 49, no. 4/5, pp. 589-604.

[15] Cell Broadband Engine Architecture Overview, http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/FC857AE550F7EB83872571A80061F788/$file/CBE_Programming_Tutorial_v3.0.pdf, accessed March 21, 2009.

[16] The Cell Project at IBM Research, http://www.research.ibm.com/cell/heterogeneousCMP.html, accessed May 6, 2008.

[17] IBM BladeCenter QS21, http://www-03.ibm.com/systems/bladecenter/hardware/servers/qs21/index.html, accessed March 28, 2009.

[18] IBM BladeCenter Products and Technology, http://www.redbooks.ibm.com/redbooks/pdfs/sg247523.pdf, accessed March 28, 2009.

[19] IBM BladeCenter QS22, http://www-03.ibm.com/systems/bladecenter/hardware/servers/qs22/index.html, accessed March 25, 2009.

[20] Basics of SPE Programming, http://www.kernel.org/pub/linux/kernel/people/geoff/cell/ps3-linux-docs/ps3-linux-docs-08.06.09/CellProgrammingTutorial/BasicsOfSPEProgramming.html, accessed November 6, 2008.

[21] K. Bomjun, T. Choi, C. Heejin, and G. Kim, “Parallelization of the Scale-Invariant Key Point Detection Algorithm for Cell Broadband Engine Architecture,” 5th IEEE Consumer Communications and Networking Conference (CCNC 2008), 10-12 Jan. 2008, pp. 1030-1034.
