Numerical Solutions of Differential Equations on Fpga-Enhanced Computers

NUMERICAL SOLUTIONS OF DIFFERENTIAL EQUATIONS ON FPGA-ENHANCED COMPUTERS A Dissertation by CHUAN HE Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY May 2007 Major Subject: Electrical Engineering NUMERICAL SOLUTIONS OF DIFFERENTIAL EQUATIONS ON FPGA-ENHANCED COMPUTERS A Dissertation by CHUAN HE Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Approved by: Co-Chairs of Committee, Mi Lu Wei Zhao Committee Members, Guan Qin Gwan Choi Jim Ji Head of Department, Costas N. Georghiades May 2007 Major Subject: Electrical Engineering iii ABSTRACT Numerical Solutions of Differential Equations on FPGA-Enhanced Computers. (May 2007) Chuan He, B.S., Shandong University; M.S., Beijing University of Aeronautics and Astronautics Co-Chairs of Advisory Committee: Dr. Mi Lu Dr. Wei Zhao Conventionally, to speed up scientific or engineering (S&E) computation programs on general-purpose computers, one may elect to use faster CPUs, more memory, systems with more efficient (though complicated) architecture, better software compilers, or even coding with assembly languages. With the emergence of Field Programmable Gate Array (FPGA) based Reconfigurable Computing (RC) technology, numerical scientists and engineers now have another option using FPGA devices as core components to address their computational problems. The hardware-programmable, low-cost, but powerful “FPGA-enhanced computer” has now become an attractive approach for many S&E applications. A new computer architecture model for FPGA-enhanced computer systems and its detailed hardware implementation are proposed for accelerating the solutions of computationally demanding and data intensive numerical PDE problems. New FPGA- optimized algorithms/methods for rapid executions of representative numerical methods such as Finite Difference Methods (FDM) and Finite Element Methods (FEM) are designed, analyzed, and implemented on it. Linear wave equations based on seismic data processing applications are adopted as the targeting PDE problems to demonstrate the effectiveness of this new computer model. Their sustained computational performances are compared with pure software programs operating on commodity CPU- based general-purpose computers. Quantitative analysis is performed from a hierarchical set of aspects as customized/extraordinary computer arithmetic or function units, iv compact but flexible system architecture and memory hierarchy, and hardware- optimized numerical algorithms or methods that may be inappropriate for conventional general-purpose computers. The preferable property of in-system hardware reconfigurability of the new system is emphasized aiming at effectively accelerating the execution of complex multi-stage numerical applications. Methodologies for accelerating the targeting PDE problems as well as other numerical PDE problems, such as heat equations and Laplace equations utilizing programmable hardware resources are concluded, which imply the broad usage of the proposed FPGA-enhanced computers. v DEDICATION To my wonderful and loving wife vi TABLE OF CONTENTS Page ABSTRACT..................................................................................................................... iii DEDICATION ...................................................................................................................v TABLE OF CONTENTS..................................................................................................vi LIST OF TABLES ......................................................................................................... viii LIST OF FIGURES...........................................................................................................ix 1 INTRODUCTION......................................................................................................1 2 BACKGROUND AND RELATED WORK..............................................................4 2.1 Application Background: Seismic Data Processing...........................................4 2.2 Numerical Solutions of PDEs on High-Performance Computing (HPC) Facilities .............................................................................................................6 2.3 Application-Specific Computer Systems ...........................................................7 2.4 FPGA and Existing FPGA-Based Computers....................................................9 2.4.1 FPGA and FPGA-Based Reconfigurable Computing....................................9 2.4.2 Hardware Architecture of Existing FPGA-Based Computers......................10 2.4.3 Floating-Point Arithmetic on FPGAs...........................................................13 2.4.4 Numerical Algorithms/Methods on FPGAs.................................................14 3 HARDWARE ARCHITECTURE OF FPGA-ENHANCED COMPUTERS FOR NUMERICAL PDE PROBLEMS...................................................................16 3.1 SPACE System for Seismic Data Processing Applications.............................17 3.2 Universal Architecture of FPGA-Enhanced Computers ..................................20 3.3 Architecture of FPGA-Enhanced Computer Cluster........................................23 4 PSTM ALGORITHM ON FPGA-ENHANCED COMPUTERS ............................28 4.1 PSTM Algorithm and Its Implementation on PC Clusters...............................28 4.2 The Design of Double-Square-Root (DSR) Arithmetic Unit...........................32 4.2.1 Hybrid DSR Arithmetic Unit .......................................................................32 4.2.2 Fixed-point DSR Arithmetic Unit................................................................36 4.2.3 Optimized 6th-Order DSR Travel-Time Solver...........................................38 4.3 PSTM Algorithm on FPGA-Enhanced Computers..........................................40 4.4 Performance Comparisons ...............................................................................43 5 FDM ON FPGA-ENHANCED COMPUTER PLATFORM...................................48 5.1 The Standard Second Order and High Order FDMs ........................................50 5.1.1 2nd-Order FD Schemes in Second Derivative Form ...................................50 5.1.2 High Order Spatial FD Approximations ......................................................54 vii Page 5.1.3 High Order Time Integration Scheme..........................................................59 5.2 High Order FD Schemes on FPGA-Enhanced Computers ..............................61 5.2.1 Previous Work and Their Common Pitfalls .................................................61 5.2.2 Implementation of Fully-Pipelined Laplace Computing Engine .................63 5.2.3 Sliding Window Data Buffering System......................................................64 5.2.4 Data Buffering for High Order Time Integration Schemes..........................73 5.2.5 Data Buffering for 3D Wave Modeling Problems .......................................74 5.2.6 Extension to Elastic Wave Modeling Problems...........................................76 5.2.7 Damping Boundary Conditions....................................................................78 5.3 Numerical Simulation Results..........................................................................80 5.3.1 Wave Propagation Test in Constant Media..................................................81 5.3.2 Acoustic Modeling of Marmousi Mode ......................................................84 5.4 Optimized FD Schemes with Finite Accurate Coefficients .............................88 5.5 Accumulation of Floating-Point Operands ......................................................94 5.6 Bring Them Together: Efficient Implementation of the Optimized FD Computing Engine............................................................................................98 6 FEM ON FPGA-ENHANCED COMPUTER PLATFORM .................................103 6.1 Floating-Point Summation and Vector Dot-Product on FPGAs ....................106 6.1.1 Floating-Point Summation Problem and Related works............................106 6.1.2 Numerical Error Bounds of the Sequential Accumulation Method ...........109 6.1.3 Group-Alignment Based Floating-Point Summation Algorithm ...............111 6.1.4 Formal Error Analysis and Numerical Experiments ..................................113 6.1.5 Implementation of Group-Alignment Based Summation on FPGAs.........116 6.1.6 Accurate Vector Dot-Product on FPGAs ...................................................122 6.2 Matrix-Vector Multiply on FPGAs ................................................................124 6.3 Dense Matrix-Matrix Multiply on FPGAs .....................................................131 7 CONCLUSIONS....................................................................................................138 7.1 Summary of Research Work ..........................................................................138 7.2 Methodologies for Accelerating Numerical PDE Problems on FPGA- Enhanced Computers......................................................................................141 REFERENCES...............................................................................................................145 VITA ..............................................................................................................................152 viii LIST OF

Load more