A Thesis

entitled

Design of A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm and Its Implementation

by

Xinyu Guo

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Engineering

Dr. Hong Wang, Committee Chair
Dr. Vijay Devabhaktuni, Committee Member
Dr. Mansoor Alam, Committee Member
Dr. Patricia R. Komuniecki, Dean, College of Graduate Studies

The University of Toledo
August 2012

Copyright 2012, Xinyu Guo

This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the expressed permission of the author.

An Abstract of

Design of A Systolic Array-Based FPGA Parallel Architecture for the BLAST Algorithm and Its Implementation

by

Xinyu Guo

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Engineering

The University of Toledo
August 2012

This thesis proposes a systolic array-based Field Programmable Gate Array (FPGA) parallel architecture for the Basic Local Alignment Search Tool (BLAST) algorithm. BLAST is a heuristic biological sequence alignment algorithm used by bioinformatics experts. In contrast to other designs, which detect at most one hit per clock cycle, this design uses a Multiple Hits Detection Module, a pipelined systolic array, to find multiple hits in a single clock cycle. In addition, a Hits Combination Block is proposed that merges overlapping hits from the systolic array into a single hit. Together, these two components implement the first step of the BLAST architecture. With the Ungapped Extension Blocks parallelized to accelerate the second step of BLAST, the new design achieves better results than previously published architectures.
Acknowledgements

I sincerely thank my advisor, Dr. Hong Wang, and my co-advisor, Dr. Vijay Devabhaktuni, for giving me the opportunity to pursue my research under their guidance. Working with them has been a great learning experience. I would also like to express my deepest gratitude to them for offering instructive advice at all times, treating me with patience, and constantly pushing me to do better.

Last but not least, I thank my family and friends, who have been a constant source of encouragement and support throughout my journey in the Master's Degree program.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
1 Introduction
    1.1 Background
    1.2 Research Approach
    1.3 The Documentation Organization
2 The BLAST Algorithm
    2.1 BLAST Overview
    2.2 Finding Hits
    2.3 Ungapped Extension
    2.4 Gapped Extension
3 Previous Work
4 Parallel Architecture Design of The BLAST Algorithm with Multiple Hits Detection
    4.1 The Architecture Overview
    4.2 Multiple Hits Detection Module
    4.3 Hits Combination Block
    4.4 Parallel Architecture for Multiple Hits Detection Module and Hits Combination Block
    4.5 Ungapped Extension Block
    4.6 Gapped Extension Block
5 Performance Analysis
    5.1 Hardware Performance Comparison
    5.2 Comparing Execution Time to Software Versions
6 Conclusion and Future Work
References
A Project Architecture Generated from the Schematic Viewer
    A.1 Architecture of the array_component
    A.2 Architecture of two systolic arrays in the project
    A.3 The connection of array_components
B Source Code of the Project
    B.1 Source Code for the Processing Unit
    B.2 Source Code for the Clock_Division Block
    B.3 Source Code for 3-input AND Gate
    B.4 Source Code for Multiple Hits Detection Module
    B.4 Testbench Source Code for the Systolic Array Architecture

List of Tables

4.1 5-bit binary description for amino acids in a protein.
5.1 Synthesis performance comparison between Tree-BLAST and ours.
5.2 Synthesis performance comparison between families of FPGA-based Accelerators and ours.
5.3 Experimental database for different algorithms.
5.4 Execution time of our architecture and software versions.

List of Figures

1-1 Exponential growth of biological sequence database on a yearly basis [1].
1-2 An FPGA manufactured by ALTERA.
1-3 Logic blocks distribution in an FPGA.
2-1 Illustration of Needleman-Wunsch algorithm.
4-1 The structure of the project.
4-2 A 2-D 4x4 systolic array.
4-3 Architecture of the Multiple Hits Detection Module.
4-4 Behavior Simulation of Clock Division Block.
4-5 Architecture of Processing Unit of Multiple Hits Detection Module.
4-6 The Architecture of Multiple Hits Detection Module.
4-7 Systolic array status 1.
4-8 Systolic array status 2.
4-9 Systolic array status 3.
4-10 Hits Combination Block.
4-11 Parallel Architecture of Multiple Hits Finder Array and Hits Combination Block.
4-12 Ungapped Extension Block.
4-13 Extension procedure.