On accelerating concurrent short-running general-purpose tasks using FPGAs

Alexander Kroh

A thesis in fulfillment of the requirements for the degree of

Doctor of Philosophy

School of Computer Science and Engineering

Faculty of Engineering

The University of New South Wales

February 2020


Originality Statement

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project’s design and conception or in style, presentation and linguistic expression is acknowledged.’

Alexander Kroh February 6, 2020

Copyright Statement

‘I hereby grant the University of New South Wales or its agents a non-exclusive licence to archive and to make available (including to members of the public) my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known. I acknowledge that I retain all intellectual property rights which subsist in my thesis or dissertation, such as copyright and patent rights, subject to applicable law. I also retain the right to use all or part of my thesis or dissertation in future works (such as articles or books).’

‘For any substantial portions of copyright material used in this thesis, written permission for use has been obtained, or the copyright material is removed from the final public version of the thesis.’

Alexander Kroh February 6, 2020

Authenticity Statement

‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis.’

Alexander Kroh February 6, 2020

INCLUSION OF PUBLICATIONS STATEMENT

UNSW is supportive of candidates publishing their research results during their candidature as detailed in the UNSW Thesis Examination Procedure.

Publications can be used in their thesis in lieu of a Chapter if:
• The candidate contributed greater than 50% of the content in the publication and is the “primary author”, i.e. the candidate was responsible primarily for the planning, execution and preparation of the work for publication.
• The candidate has approval to include the publication in their thesis in lieu of a Chapter from their supervisor and Postgraduate Coordinator.
• The publication is not subject to any obligations or contractual agreements with a third party that would constrain its inclusion in the thesis.

Please indicate whether this thesis contains published material or not:

☐ This thesis contains no publications, either published or submitted for publication.

☒ Some of the work described in this thesis has been published and it has been documented in the relevant Chapters with acknowledgement.

☐ This thesis has publications (either published or submitted for publication) incorporated into it in lieu of a chapter and the details are presented below.

CANDIDATE’S DECLARATION I declare that:
• I have complied with the UNSW Thesis Examination Procedure.
• Where I have used a publication in lieu of a Chapter, the listed publication(s) below meet(s) the requirements to be included in the thesis.

Abstract

FPGA technology is becoming a vital alternative to CPU-based processing as the performance of CPU technology plateaus. This is particularly prevalent in data centers and is impacting supercomputer design. In this thesis, I investigate the ability of FPGA technology to accelerate general-purpose systems, such as desktop computers and mobile devices. With high-performance, energy-efficient FPGA compute hardware, such systems could benefit from both a reduced execution time and an extended battery life.

Rather than executing a few long-running tasks, general-purpose systems typically host a large volume and variety of short-running tasks. For this reason, low-latency communication between the CPU and the FPGA is essential. My investigation begins by evaluating communication mechanisms between the tightly-coupled CPU and FPGA on the Xilinx Zynq platform as an example of a modern, commercial, heterogeneous system. This platform is an example of the growing trend of improving communication latency and throughput by co-locating the FPGA and the CPU in the same package.

My investigation assesses the potential of accelerating an operating system task scheduler. I demonstrate that scheduler priority queue acceleration can improve the performance of inter-process communication, but only if the communication method between CPU and FPGA is carefully chosen.

I then derive a formula for minimising a single task’s completion time by partitioning the workload between the CPU and FPGA. That formula considers the latency and computational overhead required for signalling between the CPU and FPGA. The formula decides if, and by how much, the task should be executed in hardware.

I extend that formula to a dynamic execution environment and propose a framework that supports the acceleration of concurrent short-lived general-purpose workloads. I evaluate that framework in an emulated multitasking environment. Multiple applications that share a diverse set of accelerators are used to emulate general-purpose workloads.

I conclude that modern tightly-coupled devices can support short-lived general-purpose workloads and advance the state-of-the-art in how to effectively use such technology for this application. In a dynamic environment, response to changing demand must be quick or a diverse set of resident accelerators must be maintained to avoid reconfiguration delays overwhelming the throughput gains of FPGA acceleration.

Acknowledgements

I would first and foremost like to acknowledge my supervisor, Oliver Diessel. Oliver and I have shared numerous good times throughout my PhD. I am thankful that he kept me motivated and squelched dark thoughts of returning to industry before completion. Without his guidance and support, I would not have grown to become the researcher and author that I am today.

I would like to thank the numerous anonymous reviewers for their constructive feedback that has helped guide my research. Those comments and suggestions not only improved submitted work, but also shaped the direction of future work.

I would also like to thank my numerous colleagues, in particular Hamed Nosrati, Dimitris Agiakatsikas, Lingkan Gong and Amirreza Zarrabi, with whom I have shared many coffees, chats and memories during the course of my candidature.

I would like to acknowledge the Australian government, which supported my research through an Australian Government Research Training Program Scholarship. I also thank Data61 and the University of New South Wales for the top-up scholarships and travel funds that allowed me to make the most of my candidature.

Finally, I would like to thank my family. My patient wife, Melanie Kroh for her support and understanding as I embarked on my research adventure. My mother, Irmgard Kroh, for providing a cave and nutrition while writing my dissertation. And, of course, my daughter, Emily Kroh for her love and support.

Abbreviations

ABI application binary interface.

ACP accelerator coherency port.

API application programming interface.

ASIC application-specific integrated circuit.

ASID address space identifier.

ASIP application specific integrated processor.

AXI advanced extensible interface.

BRAM block random access memory (RAM).

CAPI Coherent Accelerator Processor Interface.

CPU central processing unit.

CSR configuration space registers.

DE device.

DMA direct memory access.

DPR dynamic partial reconfiguration.

DRAM dynamic RAM.

DSB data synchronisation barrier.

DSP digital signal processing.

FIFO first in, first out.

FIR finite impulse response.

FIU FPGA interface unit.

FPGA field programmable gate array.

FPT the International Conference on Field-Programmable Technology.

GP general-purpose.

GPU graphics processing unit.

H2RC the International Workshop for Heterogeneous High-performance Reconfigurable Computing.

HARP hardware accelerator research program.

HLS high-level synthesis.

HP high-performance.

HW hardware.

ID identifier.

ILA integrated logic analyser.

IO input/output.

IOMMU input/output (IO) memory management unit (MMU).

IPC inter-process communication.

IRQ interrupt request.

ISA Industry Standard Architecture.

JPEG Joint Photographic Experts Group.

LLC last level cache.

LUT lookup table.

MMIO memory-mapped IO.

MMU memory management unit.

MRE mean relative error.

OS operating system.

PaaS platform as a service.

PCI peripheral component interconnect.

PCIe peripheral component interconnect (PCI) express.

PMU performance monitoring unit.

PSL POWER Service Layer.

QPI QuickPath Interconnect.

RAM random access memory.

SaaS software as a service.

SCU snoop control unit.

SEV send event.

SM shared memory.

SMMU system MMU.

SO strongly-ordered.

SoC system on chip.

SRAM static RAM.

SW software.

TCB thread control block.

TLB translation lookaside buffer.

TRETS ACM Transactions on Reconfigurable Technology and Systems.

WCET worst case execution time.

WFE wait for event.


Contents

Abstract
Abbreviations
List of Figures
List of Tables

1 Introduction
1.1 Thesis objectives
1.2 Thesis contributions
1.3 Thesis structure
2 Background
2.1 Reconfigurable computing
2.1.1 Automating design
2.1.2 Dynamic partial reconfiguration
2.2 General-purpose processors
2.3 Sharing compute resources
2.3.1 CPU time-sharing
2.3.2 Accelerator time-sharing
2.4 Virtual memory
2.4.1 Hardware virtual memory
2.5 The memory hierarchy
2.5.1 Memory consistency
2.6 Accelerator coupling
2.6.1 Loose coupling
2.6.2 Integrated CPU and FPGA
2.6.3 Tight coupling
2.7 Commercial tightly-coupled systems
2.7.1 IBM
2.7.2 Intel
2.7.3 Xilinx
2.7.4 Commercial systems feature matrix
2.8 Hardware model
3 Fine-grained transfers on tightly-coupled Zynq
3.1 Contributions
3.2 Publications
3.3 Prior work
3.4 System architecture
3.4.1 Software scheduler
3.4.2 Accelerator design
3.4.3 Target system hardware
3.4.4 GP connected accelerator
3.4.5 ACP connected accelerator
3.4.6 Comparison of accelerator structures
3.5 Evaluation
3.6 Chapter summary
4 Cooperative processing of short-running tasks
4.1 Contributions
4.2 Publications
4.3 Prior work
4.4 Partitioning model
4.5 Evaluation of communication overheads
4.5.1 CPU overhead
4.5.2 Latency
4.6 Hardware accumulator evaluation
4.7 Chapter summary
5 Accelerator sharing in multi-user environments
5.1 Contributions
5.2 Publications
5.3 Prior work
5.4 Framework architecture
5.4.1 Reconfigurable slots
5.4.2 IOMMU
5.4.3 Job queues
5.4.4 Job router
5.4.5 CSR manager
5.4.6 System monitor
5.4.7 Software system
5.5 Evaluation
5.5.1 ABI overhead
5.5.2 Framework evaluation
5.5.3 Static homogeneous accelerator cluster
5.5.4 Dynamic accelerator cluster
5.5.5 The impact of fine-grained job partitioning
5.6 Chapter summary
6 Conclusion
6.1 Applications for tightly-coupled systems
6.2 Communication models for engineers
6.3 Accelerator sharing for concurrent short-running tasks
6.4 Future work
References
Appendix A

List of Figures

2.1 Virtual memory translation using a page table.
2.2 The cache hierarchy.
2.3 CPU-FPGA coupling architectures.
2.4 Hardware model.
3.1 Accelerated OS kernel scheduler architecture.
3.2 Software architecture of the legacy task scheduler.
3.3 Hardware architecture of the priority queue.
3.4 Address mapping of GP accelerator peripheral.
3.5 Architecture of GP HW scheduler communication.
3.6 Architecture of ACP HW scheduler communication.
3.7 Hot cache IPC execution cycles for given receiver thread priorities.
3.8 Legacy scheduler anomaly investigation.
3.9 Hot cache execution cycles with probability density for IPC from thread priority 255 to 254 for the scheduler architectures studied.
3.10 Branch mispredictions correlated with CPU execution cycles. Samples sorted by execution time.
4.1 Cooperative system execution and overheads.
4.2 Cache-coherent data flow on Zynq-7000 series SoC.
4.3 MMIO write transactions with SO and DE memory attributes.
4.4 Cache-coherent SM writes with signalling on Zynq-7000 series SoC.
4.5 CPU overhead and out-of-order execution.
4.6 MMIO latencies.
4.7 Cache-coherent SM latencies.
4.8 Accumulator connection and communication.
4.9 Accumulator throughput for software and hardware.
4.10 Accumulator execution time for α partitioning.
5.1 Overview of accelerated shared-library framework.
5.2 Hardware system of accelerated shared-library framework.
5.3 Accelerator service queue-slot selection logic.
5.4 Software system of accelerated shared-library framework.
5.5 Context switch hazards for HW/SW workload partitioning.
5.6 Execution time of the ABI accelerator with varying number of arguments.
5.7 Normalised makespan when applications call one type of function.
5.8 System utilisation when applications call one type of function.
5.9 System utilisation without dynamic rebalancing when applications call one type of function.
5.10 Framework response to function request and execution assuming no accelerators have been configured.
5.11 Normalised makespan with 8 accelerator slots, 100 ms monitoring period and applications calling 16 types of functions.
5.12 System utilisation with 8 accelerator slots, 100 ms monitoring period and applications calling 16 types of functions.
5.13 Execution and reconfiguration trace of 2 applications with 100 ms monitoring period.
5.14 Execution and reconfiguration trace of 2 applications with 1 ms monitoring period.
5.15 Normalised makespan with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions.
5.16 System utilisation with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions.
5.17 Execution time of a typical 50 ms workload using 10× accelerators with monitoring period 1 ms. Refer to Figure 5.10 for an explanation of timeline events.
5.18 Normalised makespan with 8 accelerator slots and 8 applications that call 16 types of functions as the monitoring period is varied.
5.19 System utilisation with 8 accelerator slots and 8 applications that call 16 types of functions as the monitoring period is varied.
5.20 Normalised makespan with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions. Only one job is submitted per resident accelerator.
5.21 System utilisation with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions. Only one job is submitted per resident accelerator.

List of Tables

2.1 Summary of CPU-FPGA communication bandwidth and latency for PCIe-based and QPI-based platforms [CCF+16].
2.2 Commercial tightly-coupled systems feature matrix.
2.3 Vendor support for assumed hardware model.
3.1 Command and data encoding for ACP-based scheduler accelerator operations.
3.2 Median scheduler operation cost (CPU cycles).
3.3 Median IPC execution time for priority 254 receiver.
3.4 IPC execution time variance for priority 254 receiver.
4.1 Zynq-7020 CPU overheads, measured in CPU cycles, for short transfers between CPU and various targets.
4.2 Zynq-7020 communication latency between CPU and programmable logic.
4.3 Accumulator model parameters given a 667 MHz CPU clock frequency and 214 MHz FPGA clock frequency.
5.1 Job queue MMIO address encoding.
5.2 Assumed accelerator CSR and addressing.
5.3 Parameters of system under test.
A.1 Zynq-7020 CPU overheads, measured in CPU cycles, for short SM writes followed by a DSB to ensure completion and SEV signalling.
A.2 Zynq-7020 CPU overheads, measured in CPU cycles, for short reads from various targets.
A.3 Zynq-7020 CPU overheads, measured in CPU cycles, for short writes to various targets.


Chapter 1

Introduction

Traditional central processing unit (CPU) performance has become limited by the exponential growth in power requirements as performance increases. Although we have turned to graphics processing units (GPUs) to improve performance-per-watt, such architectures are tuned for applications that perform parallel floating point operations rather than general-purpose computations. On the other hand, reconfigurable custom compute hardware (HW) has shown promise in a wide range of computing applications and we have seen a rapid adoption of reconfigurable field programmable gate array (FPGA) devices in the cloud [WUZY19, SFJ+19, CSZ+14], data centers [PCC+14, WPAH16], and high-performance supercomputing [VB13, CA07, BBB+07].

The application domain of FPGA-based accelerators has traditionally been limited to long-running, compute-intensive workloads. Examples include large-scale database queries [CP16, BS13, WPC+16, YOY17], machine learning [WXH+16, WGY+17, KAA+17], data mining [SRS12, ZZJB13, BP06], genome sequencing [CCC+16, PWP+03], pattern matching [SIOA17, DMBS12, ISA16, CH07], and filtering [FAL+16]. These tasks involve transferring large amounts of data between main memory and FPGA-local memory or building an application-specific data streaming machine. Therefore, such tasks rely on high memory throughput rather than low memory latency.


In contrast, my research is focused on accelerating many concurrent, short-running tasks that are commonly found in servers and consumer mobile/desktop computers. For such tasks, the overhead of transferring data between the CPU and the accelerator often outweighs the savings in execution time. Additionally, the long reconfiguration delay of FPGA HW may lead to a task executing to completion on the CPU before an accelerator is available to assist with processing.

Tightly-coupled, high-performance CPU-FPGA systems have emerged in which a general-purpose processor and programmable logic are located on the same device [IBM14, Xil14, Int19a]. Not only does this close proximity reduce the communication latency between them, it also allows HW and software (SW) to access shared memory (SM) via the on-chip last level cache (LLC). While such low-latency data access has shown promise in traditional applications [Bea16, WC17, WMW+16], it may also extend the range of applications that are suitable for HW acceleration from long-running tasks to short-running tasks.

By targeting concurrent general-purpose tasks for acceleration, we increase the likelihood that accelerators can be shared between tasks. This amortises the cost of reconfiguration across the tasks that the accelerator benefits. When an accelerator is required and not yet configured, the requesting short-running task may complete before it is ready for use. However, once it is available, that accelerator can be used by the requesting task and any similar task that executes on the system. While this benefits multitasked systems that are operated by a single user, we also expect such a model to benefit servers, software as a service (SaaS) and platform as a service (PaaS) systems. In this case, many users share the compute platform. SaaS and PaaS users can be migrated between servers to further increase the potential for accelerator sharing.

With this research, we expect that a larger number of end-users could benefit from faster and cheaper computational machines that exploit HW acceleration. My vision is that all general-purpose computers will be provisioned with reconfigurable HW and that this reconfigurable HW will improve the performance of both server workloads and everyday activities such as email filtering, document processing, or even photo classification and tagging.


1.1 Thesis objectives

Major CPU vendors, such as IBM, Intel and ARM, have developed platforms that can integrate programmable logic, as embodied in FPGAs, more closely with their CPUs. IBM has released an accelerator card interconnect that provides a view of the memory hierarchy that is consistent with that of the CPU [SBJS15]. Intel provides reconfigurable devices that are compatible with a typical CPU socket [Int19b]. Finally, Intel and Xilinx have both released devices that co-locate an ARM CPU and an FPGA on the same package [Xil14] [Int19a].

I hypothesise that the emergence of these new tightly-coupled CPU-FPGA architectures allows short-running tasks to benefit from HW acceleration.

The aims of my thesis are as follows:

A1 To demonstrate that tightly-coupled CPU-FPGA systems extend the spectrum of applications that can benefit from HW acceleration.

A2 To provide models that SW engineers can use to select the most appropriate communication mechanisms for short-running tasks.

A3 To show that overall performance of general-purpose systems can be improved when many concurrent, short-running tasks share FPGA-based accelerators.

1.2 Thesis contributions

My thesis makes the following key contributions:

C1 I show that tightly-coupled CPU-FPGA systems can improve the performance of inter-process communication by offloading the management of SW task scheduling queues from the operating system (OS) to HW.

This contribution is motivated by A1.


C2 Via C1, I evaluate very short transfers (32 bits) between CPU and FPGA using the communication methods that are provided by the tightly-coupled Zynq-7020 system on chip (SoC).

This contribution is motivated by A2.

C3 I extend C2 to provide a more comprehensive evaluation of the CPU overhead and latency of communication across the available communication methods on the Zynq-7020 SoC. This contribution allows SW engineers to select the optimum communication method for their application.

This contribution is motivated by A2.

C4 I propose and evaluate a methodology for partitioning a single accumulator workload between SW and a single FPGA-based accelerator to minimise overall execution time.

This contribution is motivated by A2.

C5 I extend C4 to consider many concurrent accelerators that are shared by many concurrent tasks.

This contribution is motivated by A2.

C6 I propose and evaluate a framework that supports a dynamic range of FPGA-based accelerators and incorporates the low-latency communication methods identified in C3. A set of concurrent, short-running workloads for a set of functions are partitioned using C5 to minimise execution time.

This contribution is motivated by A3.

C7 Using C6, I identify CPU time-sharing as a hazard for cooperative processing (C4) in multitasking environments. I propose and evaluate a method for compensating for that hazard.

This contribution is motivated by A3.


1.3 Thesis structure

While the primary contribution of my thesis is an exploration of the benefits of tightly- coupled accelerators for short-running tasks, my thesis spans a number of disjoint domains. An independent literature survey is therefore provided in each of the main contribution chapters (Chapters 3, 4 and 5).

Chapter 2 provides background information to the reader. It surveys commercially available tightly-coupled systems and identifies their differentiating features. The focus of that survey is the Zedboard that was used throughout this thesis as an apparatus for experimentation. Chapter 2 also presents a HW model that identifies the expected HW features of my work and how selected commercially available platforms implement it. Chapter 2 concludes by defining and modelling general-purpose tasks.

Chapter 3 covers my research in accelerating an OS SW task scheduler (C1) and, in the process, evaluates very short transfers on the tightly-coupled Zynq-7020 SoC (C2). A priority queue was implemented in HW and connected to the CPU using a range of the available communication ports. The chosen method of connection determined whether HW provided a performance improvement or degradation. Since OS task scheduler acceleration is a domain that lies outside the main scope of this thesis, Chapter 3 includes a brief literature survey of that area.

Chapter 4 covers my research in partitioning a workload between HW and SW to maximise system utilisation and minimise completion time. This work extends the exploration of communication methods in Chapter 3 to consider a range of transfer sizes (C3) and provides a methodology for partitioning a workload in a single task environment (C4).

In Chapter 5 I present a framework for accelerating many concurrent short-lived tasks (C6). Hazards associated with CPU time-sharing and cooperative processing are described along with proposed countermeasures (C7). Finally, I extend the partitioning algorithm from Chapter 4 to consider a dynamic set of resident accelerators in a multitasking environment (C5).


Chapter 2

Background

General-purpose systems are systems that are capable of performing a wide variety of tasks. In this chapter, I provide an overview of the concepts and HW of such systems. I begin by outlining reconfigurable computing and its application to general-purpose systems. I then describe general-purpose processors and their key features. I continue by comparing the techniques used by these different technologies to support a wide range of concurrent tasks. After a brief summary of the memory hierarchy of general-purpose systems, I identify methods of coupling reconfigurable HW to different locations in that hierarchy. The chapter next identifies and discusses commercially available tightly-coupled general-purpose systems. I conclude this chapter with a model of the assumed HW that is used throughout this thesis.

2.1 Reconfigurable computing

Reconfigurable HW provides a flexible, programmable HW system. Operations are programmed as custom logic circuits that can be connected in series or in parallel using a configurable routing network. The flexibility of such customised HW provides an improved execution time and energy efficiency when compared to traditional general-purpose CPUs [FBCS12].

The most common reconfigurable device is the FPGA. The primary logic unit of an FPGA is a lookup table (LUT). Through the use of design tools, engineers configure each LUT to provide an output that is determined by the set of inputs applied to the LUT.

LUTs can be configured for a wide range of arithmetic and logic operations. Two concrete examples are a multiplexer and an adder. Given a LUT with 3 inputs¹, the design tools can implement a multiplexer by using one input to select which of the remaining 2 inputs should be observed at the output. Alternatively, the tools may configure such a LUT to provide single-bit addition. In that case, the 2 inputs provide 1-bit values that should be added while the remaining input provides a carry in. The LUT is then configured to report the sum of those signals while a second LUT can be configured to report the carry out. By cascading these addition LUTs, the bit-width of the input numbers can be extended and optimised to meet the application’s needs. Additionally, the remaining LUTs on the FPGA can be configured such that many of these addition operations are performed in parallel or sequentially. This flexibility allows a target application to be optimised for energy and execution time.
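To make the LUT configurations above concrete, the following C sketch models a 3-input LUT as an 8-entry truth table and shows the multiplexer and full-adder (sum and carry) configurations described in the text. The table encodings and names are illustrative only; real devices are configured by the vendor tools, not by software like this.

    #include <stdint.h>
    #include <stdio.h>

    /* A 3-input LUT is an 8-entry truth table: the inputs form an index and
     * the stored bit is the output.  These are illustrative configurations:
     * a 2:1 multiplexer and the sum/carry functions of a full adder. */
    static const uint8_t lut_mux2  = 0xCA; /* bit i = (i&4) ? (i>>1)&1 : i&1 */
    static const uint8_t lut_sum   = 0x96; /* odd parity of the three inputs */
    static const uint8_t lut_carry = 0xE8; /* majority of the three inputs   */

    static int lut3(uint8_t table, int in2, int in1, int in0)
    {
        int index = (in2 << 2) | (in1 << 1) | in0;
        return (table >> index) & 1;
    }

    int main(void)
    {
        /* One-bit full adder built from two LUTs, as described in the text. */
        int a = 1, b = 1, carry_in = 0;
        printf("sum=%d carry=%d\n",
               lut3(lut_sum,   carry_in, b, a),
               lut3(lut_carry, carry_in, b, a));
        /* The same LUT shape configured as a multiplexer (select=1 picks b). */
        printf("mux(sel=1, b=0, a=1) = %d\n", lut3(lut_mux2, 1, 0, 1));
        return 0;
    }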

When compared to fixed HW, the flexibility provided by an FPGA comes at the cost of performance. The additional circuits that allow a LUT and their interconnection to be configured adds delay to the designed circuit. Considering this, FPGAs provide a set of fixed optimised HW resources, such as block memories for local storage, multipliers, digital signal processing (DSP) blocks, general-purpose processors [Xil14], and machine learning functions [Xil20a].

Despite the performance penalty, FPGAs have become a popular alternative to fixed application-specific integrated circuit (ASIC) HW. The exponential development cost of ASICs with respect to design complexity makes FPGA HW more favourable for low-volume production and products still under development. When an FPGA is used, the functionality of HW can be changed quickly and remotely without changing the physical HW platform. Putnam et al. showed that the large-scale use of FPGAs in data centers can improve the throughput of web search engine ranking [PCC+14]. Such a large-scale deployment using ASICs would require that each ASIC be physically replaced in each machine when a better ranking algorithm is discovered.

¹ Modern devices comprise 6-input LUTs, which aim to reduce user circuit area and improve speed.

Many cloud providers now provide FPGA HW in their PaaS product range [WUZY19] [SFJ+19]. The end user provides a HW design and leases time for its deployment on a remote FPGA for data processing.

2.1.1 Automating design

A challenge for FPGA application design lies in the nature of programming parallel circuits [BRS13]. One must consider parallel processing and how a design will be mapped to the available resources on the FPGA. This presents a steep learning curve for new programmers and has led to a shortage of skilled FPGA programmers.

A technique known as high-level synthesis (HLS) is being used to decrease the gap between the performance of reconfigurable HW and its programmability [RJ16]. These advancements in tools allow FPGA designs to be constructed from more familiar higher-level languages, such as C. Using HLS, a wider set of engineers are able to exploit the performance improvement and energy efficiency of FPGA-based designs.

Others extrapolate opportunities for reconfigurable HW from compiled applications. Vahid et al. developed a run-time tool for profiling an application to identify frequently executed code segments [VSL08]. They synthesise these segments to HW while the application is executing. Once the HW design has been generated and configured into the FPGA, the application will begin to use the accelerator rather than executing in SW. This technique allows existing applications to use FPGA-based accelerators when the original application’s sources are unavailable.


2.1.2 Dynamic partial reconfiguration

The set of useful accelerators varies across tasks. One task may benefit from an accelerator that performs decryption while another benefits from decompression acceleration. Given a wide range of tasks for general-purpose systems [GRE+01, CPM97], it is not feasible to maintain all accelerator configurations in programmable logic at the same time.

A technique known as dynamic partial reconfiguration (DPR) extends the effective area of the FPGA by sharing it in both space and time [VF18] [GSB+00] [WH95]. When using DPR, the programmable logic is partitioned into regions that can each host an accelerator. The target region is first disconnected from the routing network. The logic configuration of the target area is then replaced by a new configuration. Finally, the new configuration is connected back to the routing network and is ready to be used by an application.
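The DPR sequence above can be summarised as three steps. The C sketch below names those steps with hypothetical helper functions (decouple_region, write_partial, recouple_region); they are placeholders rather than part of any vendor API, and the stubs here simply succeed.

    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical stubs for the three DPR steps described in the text; a real
     * system would drive the decoupling logic and the configuration port here. */
    static int decouple_region(int slot) { (void)slot; return 0; }
    static int write_partial(int slot, const void *bits, size_t len)
    { (void)slot; (void)bits; (void)len; return 0; }
    static int recouple_region(int slot) { (void)slot; return 0; }

    /* Swap the accelerator hosted in a reconfigurable region. */
    static int configure_accelerator(int slot, const void *bits, size_t len)
    {
        if (decouple_region(slot) != 0)           /* 1. disconnect from routing */
            return -1;
        if (write_partial(slot, bits, len) != 0)  /* 2. load the new logic      */
            return -1;
        return recouple_region(slot);             /* 3. reconnect, ready to use */
    }

    int main(void)
    {
        static const unsigned char bitstream[16] = {0};  /* placeholder data */
        printf("configured: %d\n",
               configure_accelerator(0, bitstream, sizeof bitstream));
        return 0;
    }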

Ideally, reconfigurable regions are homogeneous. When regions are of uniform size and contain identical logic and routing resources, each accelerator can be configured into any available reconfigurable region [SWP04] [Bre96]. An alternative approach is to synthesise all accelerators for all reconfigurable regions [CKPP15]. The accelerator implementation that corresponds to the target region is selected when the accelerator is configured.

As the area requirements of accelerators vary, some logic in fixed-sized regions will inherently be unused. Burns et al. propose a method to support reconfigurable regions of flexible size by rerouting resources to fit the available area at configuration time [BDH+97]. While this leads to a more flexible placement, it also leads to fragmentation of the FPGA area. Although there may be sufficient area to configure an accelerator, that area may be sparsely distributed throughout the FPGA. Diessel et al. propose a method of compacting the configured accelerators to increase the usable area for new configurations [DE97]. While those methods improve the area utilisation of the FPGA, they incur overheads in reconfiguration latency and processing time.

When the sequence of tasks is known in advance, the scheduling of accelerators in programmable logic can be computed offline [GSB+00]. The FPGA area is then shared in time by swapping an accelerator that is no longer needed for the accelerator that will be needed next. However, the arrival times of tasks in general-purpose systems are sporadic. It is difficult to predict which task the user will perform next. Therefore, it is difficult to predict which accelerators should be configured in programmable logic at any given time [WP02].

2.2 General-purpose processors

General-purpose CPUs consist of one or more processor cores that generally execute one instruction, from a fixed set, for each system clock cycle. An instruction stream is loaded from memory and executed sequentially by each core. Collectively, these instructions can perform a wide range of tasks.

CPU instructions can be placed into four categories:
1. data access instructions move data between system memory and the CPU core’s register file;
2. arithmetic instructions perform arithmetic and logic operations on data that is contained in the register file;
3. control flow instructions change, sometimes conditionally, the stream of instructions that the core executes;
4. system instructions control CPU state, such as operating mode and coprocessor behaviour.
Each core is capable of executing an independent stream of instructions.

Modern superscalar processors are capable of executing more than one instruction per cycle. The CPU identifies data dependencies between instructions to determine if a strict order of execution is required. When instructions can be executed independently, and if processing resources are available, the instructions can be executed in parallel or out of program order [PH90].

The CPU uses IO peripherals to collect instructions and data, and to present processing results to the user. Peripherals are controlled via a set of registers known as configuration space registers (CSR). CPU cores access the CSR using dedicated system instructions (programmed IO) or by memory-mapped IO (MMIO). MMIO is the most common method of communication and is used throughout this thesis. MMIO enables the CPU to access the CSR of connected peripherals by executing read and write instructions to an address range that is reserved for the target peripheral. For example, SW may enable a peripheral by writing the value 0x00000001 to a control register of the peripheral that is mapped to address 0x40000000. SW may then transfer data to the peripheral by writing a sequence of values to the address 0x40000004.
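As a minimal sketch of the MMIO example above, the following C fragment uses volatile pointers to the illustrative control and data addresses from the text (0x40000000 and 0x40000004). The macro names are assumptions; a real register map would come from the peripheral’s documentation, and this only makes sense on a bare-metal target that actually maps those addresses.

    #include <stdint.h>

    /* Illustrative register addresses taken from the example in the text. */
    #define PERIPH_CTRL  ((volatile uint32_t *)0x40000000u)
    #define PERIPH_DATA  ((volatile uint32_t *)0x40000004u)

    /* Enable the peripheral, then stream a buffer to it one word at a time.
     * The volatile qualifier stops the compiler from merging or reordering
     * these accesses; the memory attributes discussed in Section 2.5.1
     * determine what the hardware is still allowed to do. */
    static void mmio_send(const uint32_t *buf, unsigned n)
    {
        *PERIPH_CTRL = 0x00000001u;     /* enable, as in the example above  */
        for (unsigned i = 0; i < n; i++)
            *PERIPH_DATA = buf[i];      /* each write is a bus transaction  */
    }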

A general-purpose system can be constructed with a variety of CPUs and memories of various sizes. The address at which memory is accessed by the CPU also depends on where that memory is connected to the system. With a wide range of computer HW configurations available, an OS is used to provide an abstract model of HW to hosted applications. Most important in that model are processor time-sharing and virtual memory. These abstractions are discussed in the sections that follow.

2.3 Sharing compute resources

General-purpose systems typically host many concurrent tasks that compete for resources. When fewer processing resources are available than tasks demand, processing resources must be shared. In this section, I identify methods for sharing processing time in the context of both CPU and FPGA systems.

2.3.1 CPU time-sharing

When more SW tasks execute than there are CPU cores, each task is given a fixed time in which to execute before it is swapped for another task. The OS configures a timer to interrupt the CPU when the time slice of a task expires. Since a task can be interrupted at any time, the CPU must ensure that enough task context is saved so that the task can resume execution unaware of the interruption when the context is later restored. In this case, the context is the CPU register file content and the position of the task in its instruction stream.
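A minimal sketch of the saved context described above is shown below, assuming ARMv7-style registers; the structure layout and field names are illustrative rather than taken from any particular OS.

    #include <stdint.h>

    /* Per-task context as described in the text: the register file content
     * and the task's position in its instruction stream.  The register set
     * shown is ARMv7-style (r0-r12, sp, lr, pc, cpsr) for illustration. */
    struct task_context {
        uint32_t r[13];   /* general-purpose registers r0-r12        */
        uint32_t sp, lr;  /* stack pointer and link register         */
        uint32_t pc;      /* resume point in the instruction stream  */
        uint32_t cpsr;    /* status flags                            */
    };

    /* A thread control block (TCB) saves this context on a time-slice expiry. */
    struct tcb {
        struct task_context ctx;
        int priority;     /* scheduling priority */
    };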


Such a time-sharing environment is referred to as being multiprogrammed when the system provides one CPU core and multitasked when tasks are scheduled across many cores. In this thesis, we refer to multitasking as an environment in which tasks are scheduled on one or more CPU cores.

2.3.2 Accelerator time-sharing

Time-sharing of the FPGA area was discussed in Section 2.1.2. In this section I discuss time-sharing of a configured accelerator. Accelerator time-sharing follows the same principles as CPU time-sharing: the context of the accelerator is periodically swapped for the context of another. While the context of a SW task is limited to a small number of CPU registers, the context of a HW task includes all memories that are local to the accelerator. Therefore, a large amount of data may need to be saved and restored, particularly when the entire dataset of the task is stored locally for low-latency access.

One approach to context extraction is to use the configuration port of the FPGA to read back the state of FPGA resources [SLM00] [KP05] [HTK15] [JHE+13]. While the configuration includes the content of local memories used by the accelerator, it also includes the content of unused memories. The extraction of such a large quantity of data requires additional time for which the accelerator must be idle. Additionally, it does not consider shared state between the accelerator and external resources. Such state may reflect the status of outstanding transactions to or from off-chip memory.

While mechanisms for saving and restoring CPU context are inherently built into the instruction set of the CPU, equivalent features must be designed into an accelerator if its context is to be swapped efficiently [KHT07]. In this case, a port must be implemented by the accelerator that allows application-specific context to be exchanged. The logic and routing required for extraction presents an area overhead in programmable logic, but reduces context switching time.

The context size can be further reduced by notifying an accelerator of an impending context switch [MNM+04] [XPN16]. The accelerator can then continue execution until it arrives at a suitable preemption point. While this assumes that accelerators will respond to the context switch request, it allows the accelerator to complete outstanding transactions to external resources before its context is swapped.

The need to store context can be eliminated completely when tasks are partitioned into short-running subtasks [VPKG18] [LP09a]. The tasks can then execute to completion before their hypothetical time slice expires. This avoids both the overhead and design complexity of context swapping.

2.4 Virtual memory

The virtual memory system provides a uniform view of memory layout across systems and applications. It achieves that by translating each memory access made by a hosted application to a unique location in RAM. This allows many concurrent applications to access private physical memory using common virtual addresses.

A trivial implementation of a virtual memory system is to divide the application address space into segments that can be relocated to disjoint segments of physical memory [Den65]. Base and limit segment descriptor registers are used to configure a virtual-physical translation for each segment. When a virtual address is accessed, HW performs a lookup of these segments to perform the correct translation. The transaction is then directed to physical memory using the translated address.
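The following C sketch illustrates the base-and-limit translation just described; the descriptor layout and fault handling are simplified assumptions.

    #include <stdint.h>
    #include <stddef.h>

    /* Simplified segment descriptor: a base-and-limit pair plus the physical
     * address the segment is relocated to. */
    struct segment {
        uint32_t base;   /* virtual address at which the segment starts  */
        uint32_t limit;  /* segment length in bytes                      */
        uint32_t phys;   /* physical address of the relocated segment    */
    };

    /* Returns the translated physical address, or 0 when no segment matches
     * (a real MMU would raise a translation fault instead). */
    static uint32_t translate(const struct segment *segs, size_t n, uint32_t vaddr)
    {
        for (size_t i = 0; i < n; i++) {
            if (vaddr >= segs[i].base && vaddr - segs[i].base < segs[i].limit)
                return segs[i].phys + (vaddr - segs[i].base);
        }
        return 0;  /* translation fault */
    }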

When multiple tasks share time on a CPU core, the set of segment descriptors must be exchanged as part of the task’s context. This ensures that the correct translations are performed for the running application and prevents one application from accessing the address space of another.

The segment descriptor approach is limited by the number of segment descriptor registers that the system provides. An alternate approach that overcomes that limitation is to store the content of such registers in system memory [BCD69]. When switching between address spaces, only the memory address of those registers needs to be saved and restored. The number of segment descriptors is now limited only by the size of system memory. Such data structures are referred to as page tables as they represent a table of virtual to physical memory translations. Memory is partitioned into fixed sized pages that are typically 4 KB in size. Such page tables are found in modern virtual memory systems [ARM05] [Int16].

Figure 2.1: Virtual memory translation using a page table. (The root and second-level page tables are indexed by fields of the virtual address; for example, virtual address 0x00508400 selects root table entry 5, which points to a second-level table, in which entry 8 yields the physical address 0x43266400.)

To conserve memory, page tables provide a hierarchical lookup (Figure 2.1). The root page table is indexed by the most significant bits of the address and, if a translation in that virtual address range exists, the table entry provides the memory address of the next table in the hierarchy. The next significant bits of the virtual address are then examined to find an index within that table. At the last level of the hierarchical lookup, the table entry provides the physical address of the page as well as the memory access permissions and attributes.
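A simplified two-level walk in the style of Figure 2.1 is sketched below, assuming 4 KB pages with a 12-bit root index and an 8-bit second-level index, so that virtual address 0x00508400 selects root entry 5 and second-level entry 8. The entry format (a physical address plus a valid bit) is a deliberate simplification of real formats such as the ARMv7 short-descriptor tables; the sketch mirrors what the HW table walker does rather than code meant to run on a host.

    #include <stdint.h>

    #define PTE_VALID 0x1u

    /* Two-level walk: bits [31:20] index the root table, bits [19:12] index
     * the second-level table and bits [11:0] are the page offset. */
    static uint32_t walk(const uint32_t *root, uint32_t vaddr)
    {
        uint32_t l1 = root[vaddr >> 20];               /* index 5 for 0x00508400 */
        if (!(l1 & PTE_VALID))
            return 0;                                   /* translation fault */

        const uint32_t *l2_table = (const uint32_t *)(uintptr_t)(l1 & ~0xFFFu);
        uint32_t l2 = l2_table[(vaddr >> 12) & 0xFF];   /* index 8 for 0x00508400 */
        if (!(l2 & PTE_VALID))
            return 0;

        return (l2 & ~0xFFFu) | (vaddr & 0xFFFu);       /* page base + offset */
    }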

The overhead of repeated page table traversals is avoided by using a translation lookaside buffer (TLB). The TLB is a cache that stores recent translations for the running application. Subsequent translations can then be served from the TLB rather than requiring a page table lookup. When performing a context switch to another application, we must ensure that the TLB serves only the cached translations of the new application. This is done by tagging each TLB entry with an address space identifier (ASID). We then configure the TLB to only serve translations for the configured identifier. That allows translations for one application to remain resident while another application executes.
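The ASID tagging described above can be modelled in software as follows; the entry layout, TLB size and fully-associative search are illustrative assumptions.

    #include <stdint.h>
    #include <stdbool.h>

    /* A hit requires both the virtual page number and the current ASID to
     * match, so entries belonging to other address spaces can stay resident. */
    struct tlb_entry {
        uint32_t vpn;    /* virtual page number    */
        uint32_t pfn;    /* physical frame number  */
        uint8_t  asid;   /* owning address space   */
        bool     valid;
    };

    #define TLB_ENTRIES 32

    static bool tlb_lookup(const struct tlb_entry *tlb, uint8_t cur_asid,
                           uint32_t vaddr, uint32_t *paddr)
    {
        uint32_t vpn = vaddr >> 12;  /* 4 KB pages assumed */
        for (int i = 0; i < TLB_ENTRIES; i++) {
            if (tlb[i].valid && tlb[i].asid == cur_asid && tlb[i].vpn == vpn) {
                *paddr = (tlb[i].pfn << 12) | (vaddr & 0xFFFu);
                return true;   /* hit */
            }
        }
        return false;          /* miss: walk the page tables */
    }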

On modern systems, the TLB is filled by a dedicated coprocessor that traverses the page table structures. The page table walker may also anticipate future translations and speculatively walk the page table structures. This reduces data access latency and the likelihood of a CPU stall when the translation is needed.

The TLB and page table walker are collectively referred to as the MMU. Each core is provided with a private MMU for translations.

As well as a translation service, the MMU allows memory to be paged to disk when system memory becomes exhausted. In that case the OS may free memory by temporarily moving it to disk. It then removes corresponding translations from page tables and TLBs. The next time that data is accessed, the MMU will interrupt the OS because a translation is unavailable. The OS responds by restoring data from disk, perhaps to a different location in system memory, and updates the corresponding page table entries. The application can then continue unaware that its memory was ever moved to disk.

Memory paging also allows applications to begin sooner and consume less memory. By prioritising the loading of application instructions and data according to the order in which they are accessed, the application can begin execution as soon as the first instruction has been loaded from disk. Instructions and data that are never executed or accessed may never occupy system memory.

2.4.1 Hardware virtual memory

MMUs have also been deployed for HW peripherals. On Intel architectures, such an MMU is referred to as an IOMMU. On ARM, it is referred to as a system MMU (SMMU). We adopt Intel’s naming convention in this thesis.

The IOMMU provides memory translation and memory protection for peripherals that access system memory. While that allows applications and peripherals to share a common address space for data, IOMMUs are more commonly used when hosting other OSs as an application on the system [PH90]. In that case, the IOMMU ensures that the hosted OS and selected peripherals have a consistent view of memory addressing.

Winterstein et al. propose that each accelerator should be provisioned with its own IOMMU [WC17]. This allows the application to “pass a pointer” directly to accelerators rather than manually translating addresses on the accelerator’s behalf. The system improves performance by allowing accelerators to share memory through a common addressing scheme and by avoiding the translation of addresses that accelerators never access. For example, an algorithm that operates on a tree may only access 5% to 28% of tree nodes [WC17].

With the common goal of simplifying the programming model, Mirian et al. explored integration methods for MMUs and FPGA-based accelerators [MC15]. The authors investigate systems where accelerators are provided with dedicated IOMMUs and where accelerators share the MMU of a CPU core. The CPU in this study was implemented in the FPGA. Although the authors conclude that an independent IOMMU is best, it is difficult to determine if the observed benefit is not simply due to a collective increase in TLB size.

Vogel et al. provide a detailed study of the design trade-offs for an IOMMU implemented in programmable logic [VMB19]. The flexibility provided by the programmable logic enables their design to outperform a hard macro IOMMU.

As well as a unified address space, the IOMMU maintains the memory abstraction models provided by the OS. This allows data accessed by HW to be loaded lazily and paged to disk when memory is low. Without an IOMMU, the OS must pin memory that is used by peripherals to ensure that it is always available. For short-running tasks, the overhead of calling the OS to pin such memory can be significant.

Figure 2.2: The cache hierarchy. (Each core has a private register file, write buffers, and L1 and L2 caches; all cores share a last level cache (LLC), which is backed by RAM.)

2.5 The memory hierarchy

Two competing concerns in memory design are storage capacity and access speed. For example, static RAM (SRAM) technology is 8 to 16 times faster than dynamic RAM (DRAM) technology, but DRAM provides a 4 to 8 times larger capacity [PH90].

Due to the increasing gap between processor performance and memory speed, modern CPU architectures include low-latency on-chip buffers for frequently accessed data [Goo83]. While off-chip memory exploits the high capacity of DRAM, these on-chip caches use SRAM for low-latency access. A hierarchy of caches is typically deployed with latency and capacity increasing with their distance from the CPU core (Figure 2.2). While there is generally a private cache for each core and a shared LLC, the number of caches in the hierarchy differs from system to system [ARM13].

Caches store data in lines of size 32 B to 128 B. When a CPU core accesses data that is not resident in the cache, a full cache line of data will be transferred to the cache from another source in the memory hierarchy. We refer to this as a cache miss. The data is typically found in lower levels of the memory hierarchy, but may also be served from the caches of other CPU cores.
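As a small illustration of line granularity, the sketch below computes which cache line a byte address falls into, assuming 64 B lines (within the 32 B to 128 B range mentioned above).

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative only: with 64 B lines, all addresses that share the same
     * line base are fetched together on a miss. */
    #define LINE_BYTES 64u

    int main(void)
    {
        uint32_t addr      = 0x40000123u;
        uint32_t line_base = addr & ~(LINE_BYTES - 1u);  /* first byte of the line */
        uint32_t offset    = addr &  (LINE_BYTES - 1u);  /* byte within the line   */

        printf("address 0x%08x lies at offset %u of line 0x%08x\n",
               (unsigned)addr, (unsigned)offset, (unsigned)line_base);
        return 0;
    }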


A resident cache line must be evicted in order to make space for the arriving data. If the evicted line has been modified, the data must also be updated in lower levels of the cache hierarchy. Once the cache line has been filled, the data access request of the CPU can be served, and all future access to data within that cache line will be made with a reduced latency until that cache line is evicted.

Data coherency between caches is configured by SW. The cacheability and coherency of data access can be configured using the memory access attributes that are stored in the page tables of the virtual memory system. SW can configure whether or not data can be cached, and whether coherency should be maintained across per-core caches.

Alternatively, dedicated system instructions may be used to control coherency. SW can invalidate a cache line such that future accesses are served from lower levels of the cache hierarchy. This operation is particularly important when SW reads data that has been written by a peripheral that is not connected to the cache hierarchy. Similarly, SW can clean a cache line such that its content is written to lower levels in the hierarchy where it can be observed by such peripherals.

Cache maintenance instructions can only be executed by the OS [ARM05]. By cleaning or invalidating a cache line, an application can corrupt the data of another application. Therefore, an application must use the OS as a proxy for such operations via a system call.

2.5.1 Memory consistency

Transaction buffering is used throughout the memory hierarchy to allow the CPU to continue execution before a write transaction completes. To maintain consistency, future reads may be served from those buffers. Similarly, two writes to the same address can be merged such that only the final write transaction arrives at the target.

Write-merging can interfere with the correct operation of a peripheral. In most cases, write-merging is safe for applications. The intermediate value will not be observed by SW after an updated value has been written [ARM05]. However, when those transactions are destined for an MMIO-connected peripheral, the order and presence of those transactions may impact the behaviour of the peripheral. Such peripherals typically provide a control register for resetting the peripheral to a known state. If a sequence of writes to that register is merged, the reset signal will not be observed by the peripheral.

Write-merging can be controlled in a similar way to cache coherency. Memory access properties are provided by the page tables of the virtual memory system. On the ARM architecture, the attributes are strongly-ordered (SO) and device (DE). The SO attribute prevents transactions from using the cache and enforces a strict ordering of those transactions. In that case, the CPU stalls until the target acknowledges that the transaction is complete. The DE attribute prevents transactions from using the cache, but allows transactions to be merged. In that case, barrier instructions must be executed by the CPU to enforce the desired ordering. On the other hand, the DE attribute reduces CPU execution time because stalls are avoided for transactions that do not require a strict ordering [PS15].
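A common pattern on ARMv7-A cores, such as those on the Zynq, is to map device registers with the DE attribute and insert a data synchronisation barrier (DSB) before relying on a write having completed. The sketch below assumes GCC inline assembly on an ARM target and an illustrative register address; it is not tied to any particular peripheral.

    #include <stdint.h>

    /* Illustrative control register; whether the page is SO- or DE-mapped is
     * decided by the OS in the page tables, not by this code. */
    #define ACCEL_START  ((volatile uint32_t *)0x43C00000u)

    static inline void dsb(void)
    {
        /* ARMv7-A data synchronisation barrier (GCC inline assembly assumed):
         * stall until outstanding memory transactions have completed. */
        __asm__ volatile("dsb" ::: "memory");
    }

    static void start_accelerator(void)
    {
        *ACCEL_START = 1u;  /* may be buffered when the page is DE-mapped  */
        dsb();              /* ensure the write has reached the device     */
        /* ... it is now safe to wait for, or signal, the accelerator ...  */
    }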

2.6 Accelerator coupling

In this section I discuss historical and modern methods of integrating a CPU and FPGA into a system. I begin by describing the features of the traditional loose-coupling approach, where the FPGA is connected as a coprocessor. I then discuss the features of a system that integrates both FPGA and CPU such that accelerators have direct access to the CPU’s register file. I conclude by describing the tightly-coupled architecture that is the foundation of this thesis.

2.6.1 Loose coupling

FPGA HW has historically been connected to a high-performance, general-purpose CPU via an accelerator card or module [Bit09, PS14, VKVF16, SMT+12]. The CPU offloads work to a coprocessor that is implemented in programmable logic. A direct connection between FPGA and CPU provides a control interface to the CPU, typically via PCIe or AXI buses. The CPU in this case is the master and initiates all communication on the bus.

The control interface provides MMIO access to the CSR of the FPGA and instantiated accelerators. MMIO communication can be divided into three components: 1. the address of the register to be accessed; 2. the data to be transferred to or from the register; and 3. a signal to indicate that a transaction has been requested.

Transfers to the MMIO bus are synchronous in that the connected peripheral is made aware that the transaction is taking place. In some cases, the signal of a transaction has side effects, such as pushing register content to a first in, first out (FIFO) buffer rather than a register. Similarly, a read transaction may return data from that FIFO while also removing that data from the FIFO.
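The sketch below illustrates this kind of side-effecting MMIO access for a hypothetical FIFO-backed register block; the register layout and status bit are assumptions rather than any particular device's interface.

#include <stdint.h>

/* Hypothetical register layout of a FIFO-backed MMIO peripheral. */
struct fifo_regs {
    volatile uint32_t data;    /* write: push a word; read: pop a word */
    volatile uint32_t status;  /* bit 0 assumed to indicate FIFO empty */
};

static inline void fifo_push(struct fifo_regs *r, uint32_t word)
{
    r->data = word;            /* each store is a separate bus transaction */
}

static inline int fifo_pop(struct fifo_regs *r, uint32_t *word)
{
    if (r->status & 1u)        /* nothing to pop                           */
        return -1;
    *word = r->data;           /* the read itself removes the entry        */
    return 0;
}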

MMIO transfers for large quantities of data are CPU intensive because the CPU must manage the transfer of each word of data. Due to a limited amount of buffering, the CPU may stall waiting for previous transactions to complete before it can submit the next.

A direct memory access (DMA) engine was introduced to early systems to reduce the CPU overhead of data transfer to peripherals. In this case, dedicated HW is programmed to manage the transfer on behalf of the CPU. Early Industry Standard Architecture (ISA) implementations provided a fixed set of DMA controllers to the system. Each DMA engine had direct master access to both RAM and the FPGA. As a slave peripheral to the DMA controller, the accelerator was still unable to initiate its own transactions to RAM. For this reason, short transfers needed to be managed independently by the CPU.

Modern PCIe interfaces allow connected peripherals to initiate transfers to RAM themselves. Peripherals can now manage efficient short transfers, but for loosely-coupled architectures, such transfers are not coherent with the CPU cache (Figure 2.3a). Further to this, the latency of RAM access is large relative to RAM that is co-located with the FPGA on the accelerator card. Therefore, a bulk transfer of all data is typically made to the local memory of the FPGA before processing begins.

An interrupt request (IRQ) provides a method for peripherals to signal completion. This allows the CPU to process background tasks while waiting for tasks on the accelerator card to complete. When a peripheral asserts an IRQ signal, CPU execution is interrupted. The CPU then determines the cause of the IRQ and schedules the registered event handler.

2.6.2 Integrated CPU and FPGA

An integrated architecture for CPU and FPGA has also been explored (Figure 2.3b). Such architectures provide the FPGA with direct access to the CPU instruction pipeline and register file. This architecture allows custom CPU instructions to be constructed and provides low-latency access to data stored in the CPU core [KBT10] [YRS05] [OK17] [YMHB00] [HW97] [WH95].

Yiannacouras et al. propose a framework for reducing the development effort of soft-core processor design [YRS05]. The engineer can select specific instructions that should be included in the processor.

Such augmentation of the processor's instruction set leads to an application-specific instruction-set processor (ASIP) rather than a general-purpose processor. One counter-example is the dynamic instruction set computer (DISC) proposed by Wirthlin et al. [WH95]. DISC uses DPR to change the active set of custom instructions to match the application's needs. While instructions are being reconfigured, the instruction pipeline of the CPU is stalled.

To my knowledge, such architectures are not commercially available for high-performance CPUs. This architecture is currently limited to processors that are implemented in the FPGA fabric itself, such as the MIPS, NIOSII and Microblaze soft-core processors. Such processors are limited by the performance of reconfigurable HW when compared to a fixed processor design.


Figure 2.3: CPU-FPGA coupling architectures. (a) Traditional loose coupling. (b) CPU integration. (c) Tight coupling.


2.6.3 Tight coupling

As the focus architecture for my research, this section provides only an overview of tightly-coupled architectures. More detailed information is provided at relevant junctures throughout the thesis. The unique features provided by commercially available platforms will be discussed in the sections that follow.

Tightly-coupled architectures allow the FPGA to access data directly from CPU caches (Figure 2.3c). The FPGA is typically connected to the shared LLC of the system. The cache hierarchy maintains coherency between the LLC and the private CPU caches to avoid the need for SW-managed coherency. Not only does this method of integration reduce the latency of data access, it also avoids the CPU overhead of maintaining coherency. Because cache maintenance is a privileged operation, that overhead may include the cost of a system call to the OS to perform the operation on the application's behalf.

The range of low-latency signalling between CPU and FPGA is extended when using tightly-coupled systems. While IRQs are still supported, the CPU can signal to the FPGA in the same way that it signals to other CPU cores, allowing it to be a peer to other processors rather than merely a coprocessor.

On some systems, it is possible to extend the cache hierarchy into the FPGA by enabling CPU caches to retrieve data from the FPGA [Int19b]. A signal from the CPU cache to the soft FPGA cache can be available to notify the soft cache when a cache line has been written to by a CPU. In that case, the cache line can be invalidated in the FPGA to maintain cache coherency.

2.7 Commercial tightly-coupled systems

In this section, I identify commercially available systems that support a tight coupling between FPGA and CPU. I describe the unique features of each architecture and how those features may be exploited. I conclude this section with a matrix of features that each system provides.

2.7.1 IBM

IBM provides a Coherent Accelerator Processor Interface (CAPI) to PCIe connected accelerators [IBM14]. The FPGA card is expected to implement a POWER Service Layer (PSL) shell in the programmable logic. The shell provides a layer of abstraction for the PCIe interface and includes an IOMMU and local coherent cache.

The cache and the IOMMU provide a programming model that is uniform between HW and SW. The application can pass pointers, thanks to the unified addressing between HW and SW, and need not consider cache coherency. The local cache also serves to further reduce the latency of main memory access by storing recently accessed data.

The PSL includes task management functions, such as job control and preemptive context-switching. Although a method is provided to send a signal to an accelerator, that method is via PCIe and hence delivered with high latency.

2.7.2 Intel

Intel provides a tightly-coupled CPU and FPGA using their QuickAssist QPI platform [Int19b]. Rather than a PCIe-connected accelerator card, the QuickAssist platform enables an FPGA to be inserted into a second CPU socket of the motherboard. Communication between sockets is via a low-latency QuickPath Interconnect (QPI) bus. For improved data throughput, data buses are 512 bits wide.

Like IBM, Intel provides a shell, named the FPGA interface unit (FIU), to be instantiated in the FPGA. That shell provides virtual channels that accelerators use to access RAM and a coherent soft cache. The FIU also provides a channel for MMIO transactions from the CPU to the accelerator. In contrast to CAPI, the shell does not provide task management services. Instead, these protocols and logic must be implemented entirely by the application.

Although Intel provides a fixed IOMMU, Intel’s shell also provides a soft IOMMU to improve flexibility. The fixed IOMMU supports translations for only one device. The soft IOMMU provides independent translations for many concurrent accelerators.

QuickAssist provides a low-latency signalling mechanism between CPU and FPGA via cache line invalidation notifications. The accelerator must first write to cache-coherent memory to ensure that data is present in the soft cache. When SW writes to a corresponding cache line, the line is evicted from the soft cache of the FPGA to maintain coherency. The eviction additionally transfers a “uMSG” packet to the accelerator to inform it of the eviction. From this uMSG, the accelerator receives a low-latency signal to inform it that SM content has been updated.

While direct IRQ signalling is not supported, the shell allows accelerators to tunnel IRQ signals through the provided virtual channels.

As part of Intel's hardware accelerator research program (HARP), numerous studies have evaluated the performance of Intel's QuickAssist platform. Chang et al. improved the performance of genome sequencing by 4× [CCC+16]. István et al. demonstrated a 4× throughput improvement for pattern matching in databases when compared to a similar SW-only implementation [ISA16].

Choi et al. evaluate memory bandwidth and latency of a QuickAssist platform and a loosely-coupled PCIe-based accelerator card [CCF+16]. They find that QuickAssist provides 3.4× more throughput than the PCIe connection of the accelerator card (Table 2.1). However, local memory on the accelerator card provides 0.4× more throughput than the QuickAssist platform. From this we infer that, for long-running tasks, a bulk transfer of data to the FPGA-local memory of a loosely-coupled system could improve overall throughput despite its overhead. However, for short-running tasks, the high latency of that transfer may outweigh the improvement in bandwidth.


Table 2.1: Summary of CPU-FPGA communication bandwidth and latency for PCIe-based and QPI-based platforms [CCF+16].

Coupling   Method            Latency (µs)   Throughput (GB/s)
Loose      PCIe              160            1.6
Loose      FPGA-local DRAM   0.54           9.5
Tight      QPI               0.36           7.0
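These relative figures appear to express the improvement over the baseline rather than the raw ratio: 7.0/1.6 − 1 ≈ 3.4 for QPI over PCIe, and 9.5/7.0 − 1 ≈ 0.4 for FPGA-local DRAM over QPI. This reading is inferred from Table 2.1 and is not stated explicitly in [CCF+16].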

Weisz et al. use the QuickAssist platform to explore the opportunities for cache-coherent architectures in pointer chasing applications [WMW+16]. Pointer chasing is fundamental to image and speech recognition, as well as machine learning. The proposed solution involves the CPU performing the traversal of the tree while feeding the FPGA pointers for tree payload data. Tree meta-data can be shared via cache-coherent QPI while payload data can be sourced from RAM to reduce cache pollution.

2.7.3 Xilinx

Xilinx has released the Zynq family of devices that tightly couples high-performance, embedded processors with programmable logic. The Zynq is the embedded SoC that we used to explore all of the ideas presented in this thesis. We therefore provide more detail on the Zynq than other commercially available systems. The Zynq has many similarities with Intel’s equivalent range of tightly-coupled embedded systems. Both vendors use ARM processors, which define the interface between CPU and FPGA.

More precisely, we used Avnet’s Zedboard to evaluate our design [Avn19]. The Zedboard is a low-cost development platform that features the Zynq-7020 SoC. Zynq-7000 series SoCs provide dual ARM Cortex-A9 application processors and on-chip programmable logic. Communication between the ARM cores and the programmable logic is achieved via a range of ARM AXI communication buses [ARM11]:

• General-purpose (GP) AXI3: The GP AXI ports offer 32-bit data transfers with bulk transfer sizes of up to 64 bytes. The Zynq-7000 series provides two GP ports in both master and slave interface configurations. The master interfaces provide MMIO interfaces from the CPU to the FPGA, while the slave interfaces provide the FPGA with access to RAM.

• High-performance (HP) AXI3: The four HP AXI ports offer 64-bit data transfers with bulk transfer sizes of up to 128 bytes. The Zynq-7000 series provides the HP ports only as slave interfaces from FPGA to RAM. The HP port provides improved throughput over the GP slave equivalent due to the increased bus width and buffer depth.

• Accelerator coherency port (ACP) AXI3: Like the HP port, the ACP provides a slave interface port that offers 64-bit data transfers with bulk transfer sizes of up to 128 bytes. Unlike the HP port, transfers can optionally be cache-coherent with the CPU. In the case of the ACP, the slave device is the snoop control unit (SCU), which connects each ARM core to the memory hierarchy and maintains cache-coherence between the connected masters and ACP-connected peripherals [ARM12]. When the accelerator performs a read transaction, the SCU can retrieve the appropriate data from the shared L2 cache or the L1 cache of any connected CPU. When the accelerator performs a write transaction, the SCU writes the appropriate data into the L2 cache and invalidates any corresponding L1 cache lines in all CPUs.

It is not possible to implement a coherent soft cache on Zynq-7000 series devices. The ACP does not provide the necessary mechanisms for serving data to the CPU or for the SCU to broadcast cache line invalidations to the FPGA. However, the more recent Zynq Ultrascale+ does provide such mechanisms [Xil19a]. The Ultrascale+ also provides a fixed IOMMU while a soft IOMMU must be implemented on the Zynq-7000 devices.

Both the Zynq-7000 series and the Zynq Ultrascale+ provide low-latency signalling mechanisms between CPU and FPGA in both directions. In addition to traditional IRQ support, the Zynq extends inter-core signalling to the FPGA fabric. This allows the integration of a soft-core within the FPGA that is considered a peer to the fixed-core processors. When a CPU core executes the wait for event (WFE) instruction, it is placed in a low-power mode until another CPU executes a send event (SEV) instruction. From programmable logic, the accelerator can issue an event by toggling the EVENT EVENTI signal. Similarly, the accelerator can observe the execution of the SEV instruction as a toggling of the EVENT EVENTO signal.
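A minimal C sketch of these event primitives is shown below. sev() results in a toggle of EVENT EVENTO that the programmable logic can observe, and wfe() parks the core in a low-power state until an event (for example, one raised through EVENT EVENTI) arrives; the shared completion flag is an illustrative assumption.

/* Event signalling helpers for the mechanism described above. */
static inline void sev(void) { __asm__ volatile("sev" ::: "memory"); }
static inline void wfe(void) { __asm__ volatile("wfe" ::: "memory"); }

/* Example: wait for an accelerator to raise a completion flag in shared
 * memory, sleeping between events rather than spinning continuously. */
static void wait_for_completion(volatile unsigned *flag)
{
    while (*flag == 0)
        wfe();
}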

The Zynq-7020 SoC provides maximum CPU and FPGA operating frequencies of 667 MHz and 464 MHz respectively [Xil]. However, the interface between processor and programmable logic is limited to 240 MHz. Significantly higher frequencies can be expected with more modern devices, such as the Zynq Ultrascale+, which provides maximum CPU and FPGA operating frequencies of 1500 MHz and 600 MHz respectively [Xil20b].

Many have studied the capabilities of the tightly-coupled Zynq [SWWB13] [PS15] [SSS15]. Sadri et al. evaluated the energy and performance of a finite impulse response (FIR) filter using each port available on the Zynq [SWWB13]. They found that the ACP offered similar performance to an HP port while transactions could be served from the cache. When the dataset grew beyond the size of the cache, ACP bandwidth fell from 1650 MB/s to 650 MB/s. The authors also observed a reduction in ACP performance when a SW task was scheduled that competed with the accelerator for cache bandwidth. Other studies have found similar results when comparing Intel’s competing Cyclone device with the Zynq [MRAF18] [CFMRAF17] [MSFRA15].

Powell et al. explored the effect of coherency and buffering attributes on transactions [PS15]. The authors found that these memory attributes have a significant impact on performance. DE transactions completed faster than SO because completion is reported once the transaction is accepted by write buffers in the interconnect. To enforce the desired strict ordering, SO transactions only report completion when they have arrived at their destination.

2.7.4 Commercial systems feature matrix

A summary of the features supported by the described commercially-available tightly- coupled systems is presented in Table 2.2.


Table 2.2: Commercial tightly-coupled systems feature matrix.

System              MMIO   Cache-coherent   FPGA cache   IOMMU          Signalling
IBM CAPI            PCIe   Yes              Yes          soft (shell)   IRQ
Intel QuickAssist   PCIe   Yes              Yes          soft (shell)   uMSG/IRQ
Xilinx Zynq-7000    AXI    Yes              No           soft           SEV/IRQ
Xilinx Zynq US+     AXI    Yes              Yes          fixed/soft     SEV/IRQ

Figure 2.4: Hardware model. CPU cores (each with an MMU and local caches) communicate with FPGA accelerators (behind an IOMMU with a local cache) via CPU-mastered MMIO, low-latency signalling via IRQ, and SM access through the coherent LLC backed by RAM.

2.8 Hardware model

The HW platform considered in this thesis is assumed to be composed of one or more general-purpose processors that support protected execution modes and MMUs (Figure 2.4). These features are required for an operating system that provides a concurrent execution environment for multiple general-purpose applications.

The HW system is also assumed to provide an FPGA for configurable HW accelerators. The FPGA and the CPU are assumed to communicate directly via MMIO, and via SM. SM communication is assumed to be cache-coherent in order to support low-latency transfers and to avoid the CPU overhead of cache-maintenance operations [KL08].


Table 2.3: Vendor support for assumed hardware model.

System              MMIO   IRQ   Cache-coherent SM
Xilinx Zynq-7000    AXI    Yes   AXI to L2
Intel QuickAssist   PCIe   Yes   PCIe to L3
IBM CAPI            PCIe   Yes   PCIe/CAPI to L3

We assume that the system includes an IOMMU to provide a unified address space between SW and accelerator. That IOMMU may be provided in fixed HW or in the programmable logic. The IOMMU simplifies the programming model and memory management, and avoids the overhead of manual memory translation for short-running tasks.

The IOMMU requires IRQ support to interrupt the CPU when a virtual memory translation is not available. This is important when the OS loads applications lazily or moves memory pages to and from disk.

This thesis will demonstrate that the usefulness of low-latency signalling from CPU to FPGA is limited by its low bandwidth. We find that the latency of MMIO is relatively low and allows some context to be transferred with the signal. Therefore, we do not include low-latency signalling from CPU to FPGA in our model.

The assumed HW model is provided by the major FPGA vendors (Table 2.3).


Chapter 3

Fine-grained transfers on tightly-coupled Zynq

In this chapter, I evaluate the ability of the tightly-coupled Xilinx Zynq-7000 series All Programmable System on Chip to support fine-grained interactions between CPU and FPGA. The target for my case study is the SW task scheduler of a microkernel OS. Fine-grained interaction in this case involves the insertion and removal of a SW thread handle from a priority queue that is implemented in HW (Figure 3.1). Communication between the kernel SW and the HW-resident task queue involves the transfer of a single 32-bit word in both cases.

The task scheduler is the primary function of any OS. Not only is the scheduler invoked periodically to ensure CPU time-sharing between threads, it is also invoked on demand to facilitate communication or synchronisation between two threads, and when scheduling high-priority tasks in response to critical external events.

A large amount of effort is invested into the optimisation of the scheduler. The scheduler’s role is to facilitate the concurrent execution of many tasks, rather than contributing to the processing required by those tasks. For that reason, the execution time of the scheduler is considered a system overhead that all users seek to minimise.


Figure 3.1: Accelerated OS kernel scheduler architecture. Applications and the OS kernel execute in SW; the kernel's task scheduler enqueues and dequeues threads via a priority queue implemented in HW.

Given that the OS scheduler is highly optimised and involves operations that are very short in nature, we expect it to be challenging to gain any improvement in execution time through HW acceleration. The choice of this case study motivates us to optimise the communication methods used to the best of our ability. By carefully selecting the fine-grained communication methods used between HW and SW, we were able to achieve a 5.5% reduction in execution time for synchronous communication between two threads.

Anticipating the difficulty of accelerating short-running tasks, the study presented in this chapter was primarily motivated by a desire to determine whether or not migrating SW functions to programmable logic reduces the execution time jitter of a SW task. Reduced jitter is particularly important for real-time systems that require a known bound on execution time. Large jitter leads to pessimism in the worst case execution time (WCET) – a metric that determines whether or not an external event achieves a timely response.

Jitter is caused by the non-deterministic nature of execution on modern superscalar processors. Instruction throughput enhancing features, such as the cache hierarchy and branch predictor, reduce the amount of time that the CPU spends idle. This is typically done by predicting program and data flow with stochastic heuristics. For example, the instruction and data caches are typically configured for random cache-line replacement when new data requires space in the cache (Section 2.5). That random replacement causes jitter in execution time as the choice of cache line determines future data access completion times. When SW functions are implemented in HW, the associated code and data no longer need to occupy the cache. This improves the likelihood that critical OS or application code stays resident in the cache.

In this study we also found that HW acceleration provides a significant improvement in execution time variance. The migration of the SW task scheduler into HW removed most sources of non-deterministic execution time and reduced execution time variance by 58%. Unfortunately, further improvement was limited by the dominating influence of the branch predictor of the CPU.

3.1 Contributions

The contributions of this work are:

• To show that tightly-coupled CPU-FPGA systems can improve execution time variance for high-performance real-time systems;

• To show that tightly-coupled CPU-FPGA systems can improve the performance of inter-process communication by offloading the management of the SW task scheduling queues from the OS SW to HW; and

• To show that tightly-coupled CPU-FPGA systems extend the range of applications that can benefit from reconfigurable custom computing. Without the use of cache-coherent interconnects, a performance benefit was not observed.

3.2 Publications

Parts of this work were peer-reviewed and accepted into the International Workshop for Heterogeneous High-performance Reconfigurable Computing (H2RC) in 2015 and in ACM Transactions on Reconfigurable Technology and Systems (TRETS) in 2019. While this research is my own work, it was only made possible with the editorial support and shepherding provided by my coauthor and supervisor, Oliver Diessel.

[KD15] A. Kroh and O. Diessel. Towards OS kernel acceleration in heterogeneous systems. First International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC), 2015.

[KD19] A. Kroh and O. Diessel. Efficient fine-grained processor-logic interactions on the cache-coherent Zynq platform. ACM Trans. Reconfigurable Technol. Syst., 11(4):25:1–25:22, January 2019.

3.3 Prior work

The migration of an OS kernel scheduler from SW to HW has been studied prior to this work, primarily as a means of improving execution time and jitter in real-time systems. However, prior work has generally been limited to either soft-core systems [OLAH13, DG12], simulated HW [NRL07, LSV05, KSM03] or loosely-coupled systems [MB02, DT13]. To the best of our knowledge, the performance benefits of this migration have not been evaluated using low-latency, cache-coherent communication with fixed-core processors.

Ong et al. augmented a soft-core NIOS-II processor with a HW-accelerated task scheduler [OLAH13]. The accelerator was connected to the system interconnect bus and additionally provided a periodic timer and processor interrupt. The authors observed a 72% improvement in inter-task communication execution time and a reduction in IRQ handler jitter from 25.4% to 1.59%. Although these improvements are significant, the use of a soft- rather than fixed-core processor penalises the CPU in the evaluation. It is well known that fixed circuits have higher maximum operating frequencies than programmable logic.

HW-assisted scheduling has shown promise in symmetric multi-core architectures. Using cycle-accurate simulation, Nácul et al. showed that HW-assisted scheduling reduces the context switch time between two threads from 10,000 CPU cycles to 947 CPU cycles [NRL07]. While the scheduler chooses the next thread for execution, it has no access to the CPU register file: saving and restoring CPU state must be done in SW. The proposed system uses dedicated ports on an ARM926EJ-S processor for communication between the CPU and the scheduler, implemented in HW. The authors measured throughput improvements in graphic filtering and network packet processing applications to be 46× and 10×, respectively.

Mooney et al. proposed a modular OS framework [MB02]. In their work, the system engineer can choose between HW or SW equivalent implementations for OS subsystems. Key subsystems include dynamic memory management, locking, and deadlock detection. HW-assisted locking aims to improve both execution time and the predictability of lock access times. By moving locking to HW, deadlock detection improves system safety without significant run-time overheads. Simulated HW experiments showed that HW-assisted OS functions can provide speed-ups of 27% or more for database applications.

HW schedulers can be provided as programmable logic [KSM03], or as dedicated circuits within an ASIC [NRL07]. While ASIC implementations provide fixed scheduling policies, programmable logic allows for flexible, application-specific scheduling policies that can be swapped on-demand.

The deployment of HW schedulers has been explored as both independent coprocessors and integrated CPU features. In the latter case, the scheduler has direct access to the execution pipeline and register file of the CPU. This allows the scheduler to swap the entire thread context in a single cycle without degrading CPU pipeline performance [DG12]. Such an architecture requires modification of the CPU itself. Modern SoCs, such as the Xilinx Zynq and Intel Cyclone V, offer programmable logic and a high-performance general purpose fixed-core processor on a single die, but these systems do not provide direct access to the CPU register file.

Dahlstrom et al. implemented a HW-accelerated SW task scheduler on the Zynq SoC [DT13]. The focus of that work was on obscuring the view of thread context from other threads by storing it in programmable logic. The study showed up to 50% speedup (1500 CPU cycles), even though the entire thread context (68 bytes) had to be transferred between CPU and accelerator on every context switch. Although that work used a tightly-coupled CPU-FPGA system, it did not explore the use of the low-latency, cache-coherent communication that such coupling provides.

3.4 System architecture

In our work, we decided to investigate accelerating the seL4 microkernel [KAE+14]. seL4 kernel operations are very short in nature and difficult to accelerate using the traditional CPU-offload model. Additionally, the kernel is the central gateway for all application resource management and communication: the performance of the kernel impacts all hosted applications. seL4 has a complete WCET analysis [BSC+11, SKH16] and has recently been extended for real-time applications [LMAH18]. seL4 is considered to be a microkernel because operating system services and device drivers are implemented as user applications rather than being provided directly by the kernel. A key advantage of this approach is that only a small amount of SW must be trusted to ensure the correct operation of the system. Drivers, servers and applications all execute in a low-privilege operating mode of the CPU and are isolated by the MMU HW of the CPU (Section 2.4). Because of this isolation, inter-process communication (IPC) is used frequently to communicate between tasks.

We decided to attempt to accelerate seL4's task scheduler because it is a very frequently used function of the kernel. Although the scheduler is not a long-running operation, the performance of the scheduler is critical to IRQ handler latency and efficient IPC between client-server application SW. Once the system has been initialised, the kernel provides three key functions, all of which can result in an invocation of the kernel scheduler:

1. IRQ delivery. When the kernel receives an IRQ exception, the kernel unblocks any thread that is waiting for the IRQ. If the unblocked thread is of a higher priority than the currently active thread, the kernel must reinsert the active thread into the scheduling queue and replace it with the new highest priority runnable thread.

2. Preemption IRQ. The preemption IRQ ensures the fair sharing of CPU time between threads of the same scheduling priority. When the preemption IRQ arrives, the kernel reinserts the current thread into the scheduling queue and replaces it with the next runnable thread in round-robin order.

3. Inter-process communication (IPC). IPC is a primitive for data transfer and synchronisation between threads. When a thread sends an IPC request to another thread, the kernel blocks the sender and inserts the receiving thread into the scheduling queue. The scheduler is then invoked to choose a new thread for execution.

seL4 provides a fixed-priority preemptive scheduler such that a thread never executes while a runnable thread of higher priority exists in the system. It has been carefully tuned for low-latency IPC through SW optimisations that consider the number of instructions executed, data locality and cache footprint. We therefore expect it to be challenging to gain a benefit from HW acceleration.

3.4.1 Software scheduler

The set of runnable threads in the seL4 microkernel is implemented as a doubly-linked list with one list for each thread priority (Figure 3.2). The kernel maintains the next and previous pointers of this list along with the thread context as part of the thread control block (TCB) of each thread. The kernel appends a thread to the tail of its associated list when it has exhausted its allocated execution time interval or when it transitions from the blocked to the runnable state. If the thread has been preempted, perhaps because a higher priority thread has become unblocked and is now runnable, the kernel records the remaining execution time of the thread and adds the thread to the head of its associated list, rather than the tail. The kernel maintains a set of head and tail pointers for each priority in a global structure known as the ksReadyQueues.

Figure 3.2: Software architecture of the legacy task scheduler. The ksReadyQueues structure holds head and tail pointers for each of the 256 priorities; each non-empty priority points to a doubly-linked list of TCBs.

When the kernel invokes the scheduler, it walks the ksReadyQueues from the highest priority (255) to the lowest priority (0) until it finds a non-empty list of runnable threads. If a non-empty list of runnable threads is found, the scheduler removes the thread at the head of this list and marks it as the active thread. If no runnable thread is found in the system, the kernel schedules an implicit idle thread until an external IRQ causes a waiting thread to become runnable again.
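A simplified C sketch of the legacy lookup just described is given below. The types and field names are illustrative rather than seL4's actual definitions.

#include <stddef.h>

#define NUM_PRIO 256

struct tcb {
    struct tcb *next, *prev;
    /* ... thread context ... */
};

struct ready_queue { struct tcb *head, *tail; };

static struct ready_queue ksReadyQueues[NUM_PRIO];

/* Walk from priority 255 down to 0 and pop the head of the first
 * non-empty list; return NULL if no thread is runnable (idle thread). */
static struct tcb *choose_thread_legacy(void)
{
    for (int prio = NUM_PRIO - 1; prio >= 0; prio--) {
        struct tcb *t = ksReadyQueues[prio].head;
        if (t != NULL) {
            ksReadyQueues[prio].head = t->next;
            if (t->next != NULL)
                t->next->prev = NULL;
            else
                ksReadyQueues[prio].tail = NULL;  /* list is now empty */
            return t;
        }
    }
    return NULL;
}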

The seL4 kernel implements a fastpath, a hand-optimised path for common operating system calls. The fastpath allows IPC from a low-priority sender thread to a higher- or equal-priority receiver thread to complete without costly scheduler invocations. Under these conditions, because the kernel implements a fixed-priority preemptive scheduler, the kernel knows that there is no runnable thread with a higher priority than the IPC sending thread. If the receiver is blocked waiting for this IPC and is of a higher- or equal-priority to the sender, the kernel knows that the receiver will be the new highest priority runnable thread in the system. For this reason, the sender can be blocked and the receiver can immediately become the new active thread without invoking the task scheduler of the kernel.

Since the commencement of this work, the SW architecture of the kernel scheduler was further optimised by supplementing the design with a two-level bitmap lookup. Each bit in the second-level bitmap corresponds to one of 32 priorities. If the bit is set, the associated priority contains at least one runnable thread. If the bit is clear, there are no runnable threads for the associated priority. In the same way, the first-level bitmap reflects the presence of a runnable thread in each group of 32 priorities. The scheduler uses the single-cycle count leading zeros (CLZ) instruction on the first-level bitmap and maps the result to the appropriate second-level bitmap. The scheduler repeats this process on the second-level bitmap to find the highest priority at which a runnable thread can be found. Finding the highest-priority runnable thread thus becomes a constant-time operation, irrespective of its location within the ksReadyQueues.
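A C sketch of this two-level lookup is shown below, using the GCC builtin that compiles to the CLZ instruction. The word and bit ordering are assumptions and may differ from seL4's implementation.

#include <stdint.h>

static uint32_t l1_bitmap;      /* bit g set: group g has a runnable thread  */
static uint32_t l2_bitmap[8];   /* bit p set: priority 32*g + p is runnable  */

/* Two reads and two CLZ operations locate the highest runnable priority. */
static int highest_runnable_priority(void)
{
    if (l1_bitmap == 0)
        return -1;                                  /* nothing runnable     */
    int group = 31 - __builtin_clz(l1_bitmap);      /* highest set group    */
    int bit   = 31 - __builtin_clz(l2_bitmap[group]);
    return group * 32 + bit;
}

/* Clearing a priority whose queue has become empty touches both levels. */
static void mark_queue_empty(int prio)
{
    int group = prio / 32, bit = prio % 32;
    l2_bitmap[group] &= ~(1u << bit);
    if (l2_bitmap[group] == 0)
        l1_bitmap &= ~(1u << group);
}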

We consider both SW implementations in our evaluation. Although the bitmap lookup improves the average execution time of the scheduler, the scheduler performs better without bitmap augmentation when thread priorities are high.

3.4.2 Accelerator design

The HW design must provide the same features as the SW design to ensure compatibility. It must allow a thread to be inserted at both the head and tail of a selected ksReadyQueue and allow a thread to be removed from the head of the highest priority non-empty ksReadyQueue.

A simple structure that satisfies these design goals is a priority queue. While much research has been undertaken on priority queue implementations [MLC14, IK07], the behaviour of insertions with equal priority is generally ill-defined. In our case, it is important that threads of equal priority are removed from the priority queue in the order in which they were inserted. Since our research is not focused on priority queue HW design and implementation, we used a trivial implementation to explore our ideas.


Figure 3.3: Hardware architecture of the priority queue. 256 FIFOs (one per priority), each with 32-bit Din/Dout ports and WE, RE and H controls; the E (empty) output of each FIFO feeds a priority encoder whose output replaces SEL when P is asserted.

We used a HW architecture that closely follows that of the SW architecture for this study (Figure 3.3). We replaced the ksReadyQueues by FIFOs, where FIFO data represents a reference to the TCB of a thread in main memory. By transferring only the location of the TCB in main memory, we reduced the throughput requirement for priority queue transactions. This also preserved the ability of the CPU to optimise reads and writes to cacheable global memory when accessing thread context.

The H signal of the priority queue allows the scheduler to add a thread to either the head or the tail of a FIFO. Write enable (WE) and read enable (RE) signals control the addition (push) or removal (pop) of a FIFO entry. If the scheduler asserts neither WE nor RE, a read operation returns the appropriate entry from a FIFO without removal. The scheduler uses the SEL signal to select a specific priority (FIFO) for the priority queue operation. Each FIFO also provides an E signal, which indicates if the corresponding FIFO is empty. Each E signal is routed to a 256-bit asynchronous priority encoder. When the state of any E signal changes, the priority encoder output updates to reflect this change before the next rising edge of the subsystem clock. The priority encoder allows the scheduler to both identify and remove the highest priority runnable thread from the set of runnable threads in a single clock cycle. With the P signal asserted, the priority queue ignores the SEL signal and uses the priority encoder output in its place to select the highest priority non-empty FIFO as the target of the transaction.

The HW implementation provides acceleration by offloading the task of manipulating the priority queue from SW. The SW implementation of the ksReadyQueues requires that SW maintains a doubly-linked list of threads for each priority. Our HW implementation relieves the burden of list maintenance from SW by allowing SW to manipulate the head or tail of a ksReadyQueue with a single transaction to the accelerator. Additionally, the highest priority thread in the ksReadyQueues can be requested and removed from the schedule in a single transaction. This eliminates the need to search the scheduling queue or maintain a hierarchy of bitmaps.

We acknowledge that the use of fixed-size FIFO components raises scalability concerns in the design; however, more scalable priority queue implementations can also be supported. One must only ensure that the number of clock cycles required to update the highest priority runnable thread is less than the number of clock cycles between scheduler transactions. Alternatively, FIFO content can be stored in off-chip RAM with high-priority runnable threads cached in block RAM for low-latency access.

3.4.3 Target system hardware

In our study, we considered the GP AXI3 master and ACP AXI3 slave interfaces of the tightly-coupled Zynq-7020 SoC (Section 2.7.3). The GP master port is the only port that provides direct communication between the CPU and the accelerator. Accesses to GP-connected peripherals were made with the DE memory attribute to allow transactions to be buffered. The ACP provides the best performing SM interface due to its coherency with the CPU cache. All transactions to SM were configured to make use of the L2 cache of the CPU.

Communication between CPU and accelerator has three components:

1. A command must be transferred to the accelerator to communicate the desired action. An example of such a command is to push a thread to the head of a particular scheduling queue.

2. Data must be transferred to or from the accelerator. In our case, this data represents a reference to a thread in main memory.

3. A signalling mechanism is needed to inform the accelerator that a new command is available and that a response is expected.

If we connect the accelerator as a slave peripheral on the GP AXI bus, signalling is a by-product of the AXI handshaking protocol during a transfer. If we connect the accelerator as a master peripheral, communication is via SM and we must provide some other method for signalling. We can use a second GP master port for this purpose, or we can use the EVENT EVENTO signalling method. SW can assert a single wire in the CPU for exactly one CPU clock cycle by executing the SEV instruction. This signal is generally used to signal an event to other embedded processor cores. However, the programmable logic can observe a toggled variant of this signal. The processor toggles the EVENT EVENTO signal for each execution of the SEV instruction.

The following subsections describe the connection of the accelerator to the GP AXI master port and the ACP AXI slave port.


Figure 3.4: Address mapping of GP accelerator peripheral. Bit 11: P; bits 10-3: SEL; bit 2: H; bits 1-0: zero (word alignment).

Figure 3.5: Architecture of GP HW scheduler communication. An AXI3 adapter translates GP AXI3 transactions into the priority queue's Din/Dout, WE, RE, H, P and SEL signals, decoded from the transaction address.

3.4.4 GP connected accelerator

We designed the accelerator as a slave peripheral with the CPU communicating directly with the accelerator through MMIO. The accelerator decodes the address that is provided by the CPU to produce a command for the scheduler transaction as shown in Figure 3.4. Bits 0 and 1 of the address are reserved for word alignment, bit 2 is mapped to the H signal, which selects between the head or tail of the queue, and bits 3 through 10 are mapped to the SEL signal, which selects the priority for the transaction. Additionally, bit 11 is mapped to the P signal, which instructs the accelerator to ignore bits 3 through 10 and instead select the highest priority non-empty queue for the transaction. We developed an AXI adapter component to translate the complex AXI handshake protocol into a simple write enable (WE) and read enable (RE) signalling mechanism (Figure 3.5).
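The following C sketch shows how SW might form these addresses for the encoding just described. The base address, the DE-attribute mapping of the peripheral and the helper names are assumptions; only the bit positions follow Figure 3.4.

#include <stdint.h>

/* Hypothetical base address at which the GP-connected scheduler
 * peripheral is mapped (with the DE memory attribute). */
#define SCHED_BASE    0x43C00000u

#define SCHED_HEAD    (1u << 2)             /* H: head rather than tail        */
#define SCHED_PRIO(p) ((uint32_t)(p) << 3)  /* SEL: priority 0..255            */
#define SCHED_P       (1u << 11)            /* P: use highest non-empty queue  */

static inline volatile uint32_t *sched_reg(uint32_t offset)
{
    return (volatile uint32_t *)(SCHED_BASE + offset);
}

/* Enqueue a thread handle at the tail (or head) of its priority's queue:
 * a single MMIO write with the command encoded in the address. */
static inline void sched_enqueue(uint32_t tcb_ref, unsigned prio, int at_head)
{
    *sched_reg(SCHED_PRIO(prio) | (at_head ? SCHED_HEAD : 0)) = tcb_ref;
}

/* Identify and remove the highest priority runnable thread: a single
 * MMIO read with bit 11 (P) of the source address set. */
static inline uint32_t sched_dequeue_highest(void)
{
    return *sched_reg(SCHED_P);
}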

SW inserts a thread into the schedule at a specific priority by initiating a single write to memory. A handle to the thread is transferred as the data portion of the transfer, while the desired priority is encoded in the address portion of the transfer. Similarly, SW removes a thread from the schedule by initiating a single read request from a specific memory address. A single read transaction can also be used to both identify and remove the highest priority runnable thread by executing a read instruction with bit 11 of the source address set.

Table 3.1: Command and data encoding for ACP-based scheduler accelerator operations.

Operation                    31   30   29   28-10     9-8   7-0
Enqueue                      0    0    H    &Thread   0     SEL
Dequeue                      1    0    H    0         0     SEL
Dequeue (highest priority)   1    0    0    0         0     0

We clocked the HW implementation of the ksReadyQueues directly from the GP AXI bus clock at 100MHz.

3.4.5 ACP connected accelerator

Communication must be achieved indirectly through SM when using the ACP port (Section 2.7.3). The ACP bus is 64 bits wide: two words can be transferred per clock cycle. We assigned one word to represent the handle to the thread. The constraints of a seL4 thread object enforce an alignment of 10 bits and the object must be accessible within the range (0xE0000000 to 0xFFFFFFFF). These constraints provided 12 bits for encoding the command (Table 3.1).

The SM approach prevented us from using the AXI handshaking protocol to signal the accelerator when a new command was provided. Instead, we used the SEV processor instruction, which the accelerator detects as a state toggle of the EVENT EVENTO wire in programmable logic (Section 2.7.3). The transfer of the signal and data in this case was decoupled; we had to ensure that the data was observable by the accelerator before the signal of its presence arrived. We used the data synchronisation barrier (DSB) processor instruction to ensure that any store operation had completed before the SEV instruction was executed.
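A sketch of the CPU side of this command path is shown below: the command is placed in the cache-coherent mailbox, the DSB ensures the store is observable before the event is raised, and the SEV toggles EVENT EVENTO. The mailbox structure and names are illustrative assumptions; only the ordering of store, barrier and event follows the description above.

#include <stdint.h>

/* Layout of the 64-bit cache-coherent mailbox: one word carries the
 * command (encoded as in Table 3.1), the other the accelerator's reply
 * (the highest priority runnable thread). */
struct sched_mailbox {
    volatile uint32_t cmd;
    volatile uint32_t highest;
};

static inline void acp_dsb(void) { __asm__ volatile("dsb" ::: "memory"); }
static inline void acp_sev(void) { __asm__ volatile("sev" ::: "memory"); }

static void sched_cmd_acp(struct sched_mailbox *mb, uint32_t cmd)
{
    mb->cmd = cmd;   /* 1. place the command in coherent shared memory      */
    acp_dsb();       /* 2. wait until the store is observable via the ACP   */
    acp_sev();       /* 3. toggle EVENT EVENTO to wake the accelerator      */
}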

In HW, we extended the priority queue implementation to include a trivial finite state machine (Figure 3.6). After the processor signals the accelerator, the accelerator reads the provided command and data from SM by issuing a 64-bit read transaction on the ACP AXI bus.


Figure 3.6: Architecture of ACP HW scheduler communication. A finite state machine (Idle, Read, Process, Write) is triggered by a toggle of the EVENT EVENTO signal, issues ACP AXI3 read and write transactions, and drives the priority queue's Din/Dout, WE, RE, H, P and SEL signals.

The data portion of the transfer is mapped directly to the data input of the priority queue while the command is decoded and routed to the relevant control wires. The priority queue operation is completed by the next clock cycle. The accelerator then sets the priority queue control signals such that the highest priority thread is reported. Finally, the accelerator issues a write transaction to the ACP. This write clears the command that was provided by SW and reports the highest priority runnable thread through SM.

Once the accelerator has completed any command, the first word in the 64-bit SM region is cleared to signal completion and the second word is speculatively filled with the highest priority thread in the schedule. In this way, the CPU can retrieve the highest priority thread from the low-latency cache at any time and then issue the appropriate command to remove this thread from the schedule. By always storing the highest priority runnable thread at a pre-determined location, we make it possible for the CPU to retrieve this thread from the low-latency cache as soon as it is required. The CPU thus continues program execution while the accelerator processes the request and replaces the highest priority runnable thread in preparation for the next scheduling event.

We connected the ACP AXI accelerator described above directly to the ACP AXI3 slave port within the Zynq SoC. Each transaction requires both a read phase (to retrieve the provided command) and a write phase (to signal completion and update the highest priority runnable thread). Although this extends the time for the accelerator to complete a transaction, communication latency is reduced since data can be transferred asynchronously through the low-latency cache.

3.4.6 Comparison of accelerator structures

The GP-connected accelerator is connected to the CPU via a single GP AXI3 port. In this arrangement, the accelerator is unable to initiate transfer to either CPU or SM because it is a slave peripheral on the connected bus. Each transaction from the CPU to the accelerator is a self-contained query, insertion or removal operation to the priority queue.

The cache-coherent ACP-connected accelerator contains the same functional logic as the GP-connected accelerator, but uses an alternative interfacing approach. This accelerator includes a state machine to manage accelerator-mastered transactions to cache-coherent SM via the ACP. Each transaction has two phases: The accelerator first uses a read transaction to pull the request from SM. Once the priority queue operation has been processed, the accelerator pushes the result back to SM.

The ACP-connected accelerator can be connected to the HP port of the Zynq processor without modification. However, SW must then manage data coherency with the accelerator. This is done by either configuring SM access to bypass the CPU cache, or by executing fine-grained cache maintenance instructions on the CPU for each transaction. While both methods increase the SM access latency of both CPU and accelerator, the latter approach also requires additional CPU cycles to maintain data coherency. Due to these additional overheads, neither method is further considered in this work.


3.5 Evaluation

We consider only IPC performance in this thesis because it is a sufficient example to demonstrate fine-grained communication. The reader can find an evaluation of IRQ latency and its variance in [KD19].

We can invoke the kernel scheduler in a controlled way by performing an IPC from one thread to a thread of lower priority. IPCs to receivers of equal or higher priority are handled by the fastpath and do not invoke the scheduler. When the fastpath is avoided and the scheduler is invoked, the priority of the sending thread has no impact on scheduler execution time.

We used the ARM performance counters to count the number of CPU execution cycles required to perform an IPC from a thread of the highest priority to threads of lower priorities. In our benchmark, the IPC receiver thread was initially blocked, waiting for IPC. The sending thread first read the state of the CPU cycle counter from the performance monitoring unit (PMU) of the CPU and issued the IPC system call. Upon receiving the system call exception, the kernel unblocked the receiver thread and issued a scheduler transaction to insert it back into the schedule. The kernel then blocked the current thread as it waited on the IPC reply and issued a second scheduler transaction to find the new highest runnable thread in the system. By benchmark design, the highest priority runnable thread was always the IPC receiver and hence it was removed and scheduled for execution. Finally, the receiver thread read the CPU cycle counter. We took the resulting execution time of the IPC as the difference between the two readings of the CPU cycle counter.
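A sketch of the measurement harness described above is given below. Reading the Cortex-A9 cycle counter (PMCCNTR) from user mode assumes the kernel has enabled the counter and granted user access via PMUSERENR; the IPC call shown is a placeholder for the seL4 system call used in the benchmark, and in the actual experiment the first reading is taken by the sender and the second by the receiver after it is scheduled.

#include <stdint.h>

/* Read the Cortex-A9 PMU cycle counter (PMCCNTR). */
static inline uint32_t read_cycle_counter(void)
{
    uint32_t cycles;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(cycles));
    return cycles;
}

/* Placeholder for the IPC invocation to the lower-priority receiver. */
extern void ipc_to_lower_priority_receiver(void);

uint32_t benchmark_ipc_once(void)
{
    uint32_t start = read_cycle_counter();
    ipc_to_lower_priority_receiver();   /* sender blocks, receiver runs */
    uint32_t end = read_cycle_counter();
    return end - start;                 /* CPU cycles at 667 MHz        */
}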

In our study, the IPC merely served as a synchronisation signal between two threads. The IPC contained no data payload. Since transfer of data occurs in linear time across all scheduler architectures under investigation, varying the payload size was not expected to yield interesting information. Threads also shared the same virtual memory translations (Section 2.4); a virtual address space switch was not performed as part of the study.

We ran the benchmark with a hot cache by using 16 cache warming iterations before collecting 250 samples. We set the HW scheduler AXI bus clock and logic clock to 100 MHz while the CPU operated at its maximum frequency of 667 MHz. The SW was compiled with arm-linux gcc 4.7.4.


Figure 3.7: Hot cache IPC execution cycles for given receiver thread priorities. Median CPU execution cycles (at 667 MHz) versus IPC receiver thread priority for the legacy SW, bitmap SW, ACP HW and GP HW schedulers.

We initially ran the benchmark with each of the four scheduler architectures described in Section 3.4. The legacy scheduler architecture was implemented entirely in SW. The bitmap scheduler architecture addresses the scalability issues of the legacy scheduler, but at the cost of maintaining additional data structures. The GP scheduler architecture featured a HW-accelerated priority queue that the CPU communicated with directly via the GP AXI port. The ACP scheduler architecture used cache-coherent SM to communicate between CPU and accelerator and used the SEV signalling mechanism for synchronisation.

We tested receiver priorities in the range 250 to 0; however, we observed no change in trend below priority 220 and have therefore excluded some of these measurements for clarity. The median results of this benchmark are presented in Figure 3.7. Error bars indicate 1st and 3rd quartiles.

We found that the execution time of the legacy SW scheduler implementation generally increased linearly as the receiver thread priority decreased. This is because the scheduler traverses the ksReadyQueues from the highest priority to the lowest priority until it finds a runnable thread. The lower the priority of the receiver, the more entries the legacy scheduler must examine before it finds the waiting receiver. This behaviour also increases the cache footprint of the scheduler: when the receiver thread is priority 0, the scheduler reads from 256 queue heads. Because head and tail pointers are interleaved, this results in 512 words (2 KB) being loaded into the cache. Another way of looking at this is that the scheduling behaviour results in the eviction of 2 KB of data from the cache in the worst case. This evicted data can be data that is frequently used by an application; system performance will then be further degraded because that data must be reloaded from high-latency main memory when the application is next scheduled.

We investigated the discontinuities in the legacy scheduler curve and found that they correlate with an unusually high number of branch mispredictions (Figure 3.8). During execution, conditional branch instructions prevent the instruction prefetcher from always maintaining a full instruction pipeline. The prefetcher does not know which branch of execution to load until the branch condition is evaluated. In these cases, the branch predictor attempts to predict the path that execution will take. If the branch predictor is correct, the CPU continues uninterrupted. If the prediction is incorrect, the CPU must flush the instruction pipeline and wait for the prefetcher to fill the pipeline with the correct instruction stream. Although this micro-architectural feature improves CPU utilisation and application performance, it can add a significant amount of variance to execution time. Our results show that branch mispredictions on the ARM Cortex-A9 processor can, with high probability in some cases, result in more CPU cycles to perform fewer iterations of a SW loop.

seL4 developers introduced a bitmap scheduler to address the scalability issue of the legacy scheduler implementation. This was done by extending the implementation to include a hierarchical bitmap representation of the non-empty ksReadyQueues. This optimisation leads to some overheads, but an O(1) lookup complexity was observed (Figure 3.7). The increased execution time of the bitmap scheduler for high-priority receiver threads reflects the additional operations required for the traversal and maintenance of the bitmap.


Figure 3.8: Legacy scheduler anomaly investigation. Branch mispredictions and CPU execution cycles (at 667 MHz) versus IPC receiver thread priority.

The bitmap scheduler requires 2 additional reads to locate the highest priority thread: one for each level of the bitmap hierarchy. Once the scheduler identifies the thread to be scheduled, it marks the thread as active and removes it from the scheduling queue. Because this thread is the only runnable thread in the system, the bitmap scheduler additionally marks the ksReadyQueue as empty in the second level and marks the priority group as empty in the first level. For the sake of abstraction, the scheduler cannot simply write 0 to these words: it must perform the correct bit operation to clear only the relevant bit at each level. The result is that the bitmap scheduler performs another 2 reads and 2 writes to update the bitmap; however, these 4 memory accesses are likely to operate on memory that is already in the cache rather than suffering the penalty of loading data from main memory a second time.

We also see that the directly-connected GP accelerator offered only marginal improvements over the optimised bitmap scheduler (Figure 3.7). While the entire task of priority queue management is offloaded to the accelerator, the performance gain is reduced due to the communication latency between the CPU and accelerator. We can see that the legacy SW implementation still outperforms the GP-connected accelerator when the IPC receiver priority is very high. This condition provides the best case performance for the legacy scheduler because the iterative search only needs to examine two priority levels before the highest priority runnable thread is located.

Surprisingly, when communication is achieved using low-latency, cache-coherent SM rather than direct communication, we observed an increase in execution time relative to the GP accelerator. This is because the signalling mechanism is decoupled from the data transfer. The CPU must execute a DSB in order to stall the CPU until data has reached the L1 cache (Section 2.5.1). Only then can the CPU signal the accelerator with the SEV processor instruction. If the CPU executes the SEV instruction before this time, the accelerator may read a stale command from SM.

The decision to accelerate a short-running OS kernel task scheduler on a high-performance CPU was in part motivated by the challenges involved. We anticipated that accelerating the scheduler would be difficult and that it would drive us to find a truly optimal method for fine-grained interactions on an ARM-based CPU-FPGA heterogeneous platform. Our results show that the legacy SW scheduler architecture still performs better than both the GP- and ACP-connected HW schedulers for high-priority IPC receiver threads. We therefore took the best-case performance of the legacy SW implementation as a target for the execution time of our HW-accelerated task scheduler.

We constructed microbenchmarks to investigate why the direct communication of the GP approach outperforms the low-latency SM approach of the ACP-connected accelerator. We measured the median number of CPU cycles required to perform each scheduler operation (Table 3.2). The execution time was measured at the application programming interface (API) level of the scheduler, which includes the cost of validating arguments and packing commands into the appropriate format for each accelerator architecture. The overhead of reading the ARM performance counters was measured and subtracted from the results. Note that such fine-grained benchmarking avoids some performance penalty events, such as cache misses.

From the microbenchmark results we found that the GP-connected accelerator requires the least number of CPU cycles to insert a thread into the schedule. These numbers do not reflect the latency of the transaction; they represent the CPU execution time to issue the command. The CPU can continue execution as soon as the transaction has reached the write buffers (Section 2.5.1), well before it is received by the accelerator.

Table 3.2: Median scheduler operation cost (CPU cycles).

System                  Enqueue   Dequeue highest priority   Total
Legacy (priority 255)        22                         49      71
Legacy (priority 0)          22                        954     976
Bitmap                       30                         59      89
GP                           12                         94     106
ACP                          47                         74     121
ACP (polling)                30                         40      70
Hybrid                       10                         31      41

We also found that the GP-connected accelerator requires the greatest number of CPU cycles to retrieve the highest priority thread from the schedule. This is because of an immediate dependency on the result of the transfer; we must test the validity of the returned data in case the schedule is empty. When the returned value is NULL, the kernel must activate the idle thread instead of the highest priority runnable thread. The CPU must stall until this branch in execution is determined.

The ACP results reported a lower execution time for retrieving the highest priority thread than the GP-connected accelerator. This is because the CPU can retrieve the result directly from the low-latency cache instead of waiting for a response from HW. However, the synchronisation time adds significant overhead for insertions. The total time to perform both an insertion and removal of the highest priority runnable thread is thus higher for the ACP scheduler than for the GP scheduler.

To quantify the costs of data synchronisation and signalling, we modified the ACP-connected scheduler to operate in polling mode. In this mode, the accelerator continuously polls SM for a new command. We found that data synchronisation was still required when inserting a thread into the schedule. This is because there was not enough time between thread insertion and thread removal to allow the insertion transaction to complete. The barrier forces CPU buffers to write through to the cache sooner, which reduces the observation time (and therefore completion time) of the accelerator. On the other hand, thread selection is performed as the last scheduler transaction before returning control to the chosen thread. The time required to restore the thread context and leave the kernel is large enough to avoid this memory barrier. The microbenchmark results of the ACP scheduler in polling mode are included in Table 3.2. The polled ACP scheduler requires the least number of CPU cycles to retrieve a thread across all systems that have been discussed so far.

Table 3.3: Median IPC execution time for priority 254 receiver.

System          Median IPC execution (CPU cycles)   Speedup
Legacy                                       1029      2.6%
Bitmap                                       1057      0.0%
GP                                           1038      1.8%
ACP                                          1044      1.2%
ACP (polling)                                 994      6.0%
Hybrid                                        999      5.5%

We investigated a hybrid implementation to take advantage of the best performing communication channel for each operation. In this implementation, the scheduler uses the GP port for all priority queue commands. After processing any command, the accelerator provides the highest priority runnable thread via ACP in low-latency, cache-coherent SM.
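The listing below is a minimal sketch of how such a hybrid CPU-side driver might look: commands are issued as buffered MMIO writes through the GP port, while the head of the priority queue is read back from cache-coherent SM that the accelerator keeps up to date via the ACP. The base address, command encoding and shared result word are illustrative assumptions, not the actual seL4 implementation.

/* Hybrid scheduler driver sketch: GP-port MMIO for commands, cache-coherent SM
 * for the result.  All addresses and encodings below are hypothetical. */
#include <stdint.h>

#define GP_SCHED_BASE 0x40000000u                 /* hypothetical GP-port address */
#define CMD_ENQUEUE   0x1u
#define CMD_DEQUEUE   0x2u

static volatile uint32_t *const sched_cmd = (volatile uint32_t *)(GP_SCHED_BASE + 0x0);
static volatile uint32_t *const sched_arg = (volatile uint32_t *)(GP_SCHED_BASE + 0x4);

/* Head of the ready queue, written by the accelerator into coherent SM (ACP). */
static volatile uint32_t sched_head;

static inline void sched_enqueue(uint32_t thread, uint32_t prio)
{
    /* Buffered (DE) MMIO writes: the CPU continues once the writes are buffered. */
    *sched_arg = thread;
    *sched_cmd = CMD_ENQUEUE | (prio << 8);
}

static inline uint32_t sched_dequeue(void)
{
    /* The SM word already reflects the queue state after the previous command,
     * so read the head first (a low-latency L2 cache hit)... */
    uint32_t thread = sched_head;                 /* 0 would mean: run the idle thread */
    /* ...then ask the accelerator to pop it; again a buffered write, so the CPU
     * does not wait for the accelerator to respond. */
    *sched_cmd = CMD_DEQUEUE;
    return thread;
}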

We repeated the IPC benchmarks that were detailed at the beginning of this section for both the polling-mode ACP-connected accelerator and the hybrid accelerator. Our results focus on comparing the execution time of each architecture with the best case performance of the legacy SW implementation (receiver thread priority 254). The distribution of collected samples is presented as box and violin plots in Figure 3.9 while numerical results are presented in Table 3.3. The violin plot shows the probability density of the collected samples: larger plot widths correspond to values at which we observed more samples.

The results show that the polling-mode ACP-connected accelerator and the hybrid accelerator perform better than both SW approaches in all cases. The hybrid approach offers a 2.9% reduction in median execution time when compared with the best-case performance of the legacy SW scheduler. We observed a 5.5% reduction compared with the bitmap scheduler.

Figure 3.9: Hot cache execution cycles with probability density for IPC from thread priority 255 to 254 for the scheduler architectures studied.

A reduction in scheduler execution time allows more tasks to be scheduled in a given period of time. The time that would otherwise be consumed by kernel execution can now be used by an application to complete sooner, thereby allowing other tasks to begin sooner.

The results in Figure 3.9 also provide information about the execution time variance. The schedulers that use HW acceleration showed well-defined modes. The ARM event counters [ARM05] showed that these modes correlate with branch mispredictions (Figure 3.10). This pattern of mispredictions was also observed in the pure SW implementations. However, noise from other sources of non-deterministic execution masked the effect.

HW acceleration improved the execution time variance in all cases. The hybrid approach, in particular, showed improvements of 58% when compared to the legacy SW implementation and 56% when compared with the bitmap optimisation (Table 3.4).

All other things being equal, the reduction in variance allows more tasks to be scheduled safely in a given period of time. Since the execution time is more deterministic, we can reduce the compute power that is reserved for ensuring that a critical real-time task will complete on time. One can then either schedule more tasks on the processor, or reduce the size of the compute resource to conserve energy and manufacturing costs.

Figure 3.10: Branch mispredictions correlated with CPU execution cycles. Samples sorted by execution time. (a) Legacy SW scheduler. (b) Hybrid scheduler.

Table 3.4: IPC execution time variance for priority 254 receiver.

System          Variance   Improvement
Legacy               120            0%
Bitmap               120            0%
GP                    65           46%
ACP                  110            8%
ACP (polling)         80           33%
Hybrid                51           58%

Although we have reduced the variance of the system, the branch predictor remains a dominant source of variance. The branch predictor is an important micro-architectural feature for enhanced performance. Although it can be disabled to further reduce variance, doing so causes a significant reduction in CPU utilisation and an increase in program execution time.

3.6 Chapter summary

We have evaluated the potential for tightly-coupled CPU-FPGA systems to improve the CPU execution time and jitter of frequent, short-running operations. The target application, the seL4 kernel SW task scheduler, proved difficult to accelerate and required careful selection of HW-SW communication strategies.

We evaluated communication via MMIO and also cache-coherent SM. No single strategy on its own was able to provide a performance improvement over the best case execution time of the legacy SW. However, a carefully selected combination of these strategies was able to achieve a 5.5% improvement in CPU execution time.

Although the improvement in execution time is small, our results show that the use of FPGA-based accelerators in tightly-coupled systems need not be limited to coarse-grained acceleration. They can also improve the execution time of very short-running tasks, such as appending to and removing from a linked list.

In our study, execution time was reduced when all communication from the CPU to the FPGA was made through MMIO. With that method, write buffering allowed the CPU to quickly continue executing meaningful work. Similarly, execution time was reduced when all communication from the FPGA to the CPU was made through SM. This allowed the CPU to perform more meaningful work by preventing high-latency reads from stalling the CPU.

In the next chapter, I consider the partitioning of short-running tasks across a CPU and an FPGA for cooperative processing. The collection of communication methods identified in this chapter is used to minimise both communication latency and CPU idle time. These methods improve overall task completion time and allow smaller workloads to benefit from HW acceleration.


Chapter 4

Cooperative processing of short-running tasks

Due to the cost of repeated data movement between CPU and FPGA, the use of FPGA-based accelerators has traditionally been limited to offloading long-running tasks from the CPU to programmable logic. These tasks are typically processed entirely by the accelerator. While the workload is being processed, SW will either wait until the task is complete or schedule other tasks for processing.

Although modern heterogeneous platforms (Section 2.6) reduce the costs of CPU-FPGA data transfers, the traditional offload model is cemented as the popular choice. For these systems to become truly heterogeneous, the utilisation of all computational resources should be optimised. In particular, the CPU and FPGA should cooperate by dividing the workload between them so as to maximise system throughput.

For long-running tasks, the cost of communication is small relative to processing time. Therefore, communication overhead can be ignored and workloads can be partitioned proportionally to the throughput that each compute resource provides.

In contrast, the cost of communication for short-running tasks is relatively large: communication costs must be considered when partitioning such tasks across compute resources. Over a fixed period of time, the CPU cycles required for communication reduce the amount of work that can be processed by the CPU. Similarly, communication latencies reduce the amount of work that can be processed by an accelerator. In the extreme case, the communication overhead may outweigh the acceleration provided by HW.

In this chapter, I present a model that predicts if, and by how much, short workloads should be partitioned between HW and SW for cooperative processing. As well as the throughput of each resource, the model considers both the CPU overhead and the latency of communication when partitioning the workload. The model allows engineers to choose a workload partitioning that minimises task completion time.

I then extend the contributions of Chapter 3 by providing a detailed evaluation of the CPU overhead and latency of short transfers between CPU and FPGA on the tightly-coupled Zynq-7020 SoC. Such transfers are essential to efficiently synchronise between cooperating HW and SW tasks and provide key parameters for the model. While these metrics are specific to the Zynq-7020 SoC, similar techniques can be applied to other platforms to find communication metrics.

Finally, I demonstrate how the derived model and the communication metrics can be used to choose the optimum workload partitioning for a stream-based integer accumulator task. The model determines when cooperative processing becomes beneficial within 8% of the optimum, chooses a partitioning that leads to a task completion time within 2% of the optimum, and predicts task completion time with 12% mean relative error (MRE).

4.1 Contributions

The contributions of this work are:

• We derive a model that determines the optimal partitioning of work between a CPU and a connected FPGA for any cooperative workload.


• We predict for which cooperative workloads the execution time of a task will be reduced if some portion of the work is processed in HW.

• We measure the communication overheads of short transfers between the CPU and FPGA on the tightly-coupled Zynq-7020 SoC.

• Using a stream-based integer accumulator task, we demonstrate that the model can be used to predict optimum workload partitioning between CPU and FPGA.

4.2 Publications

Parts of this work were accepted via peer review into the International Conference on Field-Programmable Technology (FPT) for presentation in 2018. While most of that work is my own, I credit my coauthor and supervisor, Oliver Diessel, for his overall guidance and editorial support.

[KD18] A. Kroh and O. Diessel. A short-transfer model for tightly-coupled CPU- FPGA platforms. In 2018 International Conference on Field-Programmable Technology (FPT), pages 366–369, Dec 2018.

4.3 Prior work

Offline HW/SW functional partitioning of a task has been studied extensively in prior work [Tei12,THM15,LVL03]. The goal of that work is fast and automated design space exploration for minimising execution time, energy consumption and HW costs. Sequential segments of code are identified and grouped into nodes termed basic blocks. Those blocks are connected to form a graph that reflects the flow of program execution. The challenge is in formulating an algorithm that chooses which of those blocks should be executed on which of the available resources [KPPK11, WWLS13, WSC10, CLL+96].


Those algorithms optimise partitioning by considering the benefit in execution time, the FPGA area required and the cost of migrating control and data. However, few studies have involved physical HW that imposes realistic migration costs [VRKP14, LBK+16].

Offline HW/SW partitioning is difficult to optimise when program execution through the graph is determined at run-time. For example, the number of iterations of a program loop may be determined by the provided dataset. If the iteration count is unknown, the benefit of migration is difficult to balance with its cost. For a range of tasks, Vaz et al. showed that the iteration count could only be determined offline in 11% of cases, while 46% could be determined online [VRKP14]. The authors propose that partitioning decisions should be made online once the number of iterations is known.

While the prior work mentioned above could extend our research, our focus is on partitioning the dataset itself across compute resources rather than the functional partitioning of a task.

In this work, both CPU and FPGA perform a common function to process the workload. We partition the workload online with loop iteration bounds providing an indicator as to the execution time on each resource. Like the prior work mentioned above, we must consider the migration costs when determining how much work to offload. These migration costs account for synchronisation overheads between CPU and FPGA using short transfers.

While many have evaluated communication performance on various platforms, the evaluations are throughput-oriented and focus on traditional large data transfers between CPU and FPGA. To our knowledge, no research has been conducted into how these throughput models can be applied to an application that uses frequent small transfers.

The achievable PCIe bandwidth for large transfers between system memory and the FPGA was measured to be 3 GB/s, but falls to ∼65 MB/s for 1 KB transfers [JK13]. In a cooperative system with frequent short transfers, we expect transfers as small as one byte.

The throughput of MMIO communication between CPU and FPGA for payload size in the 2 B to 2 MB range has been evaluated on both the Intel Cyclone and Xilinx Zynq CPU-FPGA devices. That work found that a peak bandwidth of 30 MB/s is achieved for the Cyclone device for packet sizes larger than 8 B [MSFRA15]. On the Zynq device, 90 MB/s can be achieved for reads and 55 MB/s for writes when the packet size is larger than 128 B [CFMRAF17]. Although small transfer sizes were considered, only throughput is evaluated and the range of small transfer sizes evaluated is limited.

Other work has explored the performance of both MMIO and SM communication in order to assist an engineer in the selection of the optimal channel. On Zynq, it is shown that MMIO is best for transfers in the range 16 B to 64 K [SSS15], cache-coherent SM is best for sizes up to 64 KB [SSS15,SWWB13], and uncached SM communication is best for sizes up to 2 MB [SWWB13]. A similar study on Cyclone found that the choice depends on the direction of data transfer [MRAF18], with writes being served better via MMIO and reads from SM.

4.4 Partitioning model

The focus of this work is cooperative computation, in which two compute resources perform a common function on a subset of the provided data. Once all compute resources have processed their individual workloads, a nominated compute resource aggregates the partial results and returns the final result to the application (Figure 4.1).

Workloads best suited for this model can be flexibly divided into independent datasets that require infrequent synchronisation between them. Examples include image processing and matrix multiplication, where the dataset is typically partitioned into blocks for processing [JSNV13]. Our accumulator study presents a workload that allows fine-grained partitioning. Each integer to be accumulated can be considered independently.

The CPU overheads of migrating a task from SW to HW are the CPU execution cycles required to both initiate processing by the accelerator and to collect the result. The initiation overhead may include preparing memory for FPGA access, the transfer of operating parameters, and a command to the accelerator to indicate that it can begin processing. On the return path, the overhead may include preparing memory for CPU access, retrieving the partial result from the accelerator or memory, and aggregating the partial results of HW and SW processing.

Figure 4.1: Cooperative system execution and overheads.

Latency overheads are the combined latency of accelerator initiation and of returning the partial result to the CPU. The initiation latency is the time between when the CPU executes the initiation procedure and when the accelerator can begin processing. Although this includes the latency of the first dataset reaching the accelerator, the latency of further data access is considered in the throughput that the accelerator and its connected data bus provide. The return latency includes the transfer of any outstanding data when the accelerator completes processing and a signal to the CPU to indicate that the result is available.

The execution time T of a cooperative task is determined by the longest completion time, including transfer latencies and overhead, across the accelerator (Ta) and CPU (Tc). If too much work is given to the accelerator, the CPU will become idle as it waits for the accelerator to complete. If too little work is sent to the accelerator, an opportunity for parallel execution is lost. Therefore, the workload must be carefully partitioned to minimise the completion time, which occurs when Ta = Tc.

When the workload, N, is large, the overheads of accelerator initiation and returning the result can be ignored as they are small relative to the computation time. The completion time of the accelerator and CPU partitions can then be calculated using (4.1a) and (4.1b) respectively, where α∗ is the fraction of the workload that should be processed by the accelerator. By equating (4.1a) and (4.1b), we see that α∗ partitions the workload proportionally to the throughput provided by the accelerator (Xa) and the CPU (Xc) (4.2).

\[
T_a = \frac{\alpha^* N}{X_a} \quad \text{(4.1a)}
\qquad\qquad
T_c = \frac{(1 - \alpha^*) N}{X_c} \quad \text{(4.1b)}
\]

\begin{align*}
\frac{\alpha^* N}{X_a} &= \frac{(1 - \alpha^*) N}{X_c}
    && \text{sub. (4.1a) and (4.1b) into } T_a = T_c \\
\alpha^* X_c &= (1 - \alpha^*) X_a \\
\alpha^* (X_c + X_a) &= X_a \\
\alpha^* &= \frac{X_a}{X_a + X_c} && \text{(4.2)}
\end{align*}

For small workloads, cooperative computation requires careful attention to data transfer costs in terms of both transfer latencies (Da) and CPU overheads (Oc) (Figure 4.1, Eq. (4.3a) and Eq. (4.3b)). Transfer latencies on both the initiation and return paths reduce the amount of work that the FPGA can complete. Programmable logic must wait for the processing command to arrive and ensure that the result is available to the CPU as soon as it is needed.

\[
T'_a = \frac{\alpha N}{X_a} + D_a \quad \text{(4.3a)}
\qquad\qquad
T'_c = \frac{(1 - \alpha) N}{X_c} + O_c \quad \text{(4.3b)}
\]


The transfer of the processing command is performed by the CPU. The CPU execution cycles required to perform this transfer represent an opportunity cost to the CPU as these cycles could be used to process the workload. By considering both CPU overheads and transfer latencies, the workload can be partitioned by α (4.4) such that the CPU and FPGA complete the processing of their respective parts at the same time.

\begin{align*}
\frac{\alpha N}{X_a} + D_a &= \frac{(1 - \alpha) N}{X_c} + O_c
    && \text{sub. (4.3a) and (4.3b) into } T'_a = T'_c \\
\frac{\alpha N}{X_a} - \frac{(1 - \alpha) N}{X_c} &= O_c - D_a \\
\alpha X_c - (1 - \alpha) X_a &= \frac{X_a X_c (O_c - D_a)}{N} \\
\alpha (X_c + X_a) &= X_a \left( \frac{X_c (O_c - D_a)}{N} + 1 \right) \\
\alpha &= \frac{X_a}{X_c + X_a} \left( 1 - \frac{X_c (D_a - O_c)}{N} \right) \\
\alpha &= \alpha^* \left( 1 - \frac{X_c (D_a - O_c)}{N} \right)
    && \text{sub. (4.2)} \quad \text{(4.4)}
\end{align*}

Workload partitioning improves task completion time only if the time required for communication is masked by a reduction in completion time. To determine the workload, NL, for which we benefit from using the programmable logic, we must ensure two conditions are met. First, the time to initiate the accelerator to perform some subset αNL of the workload, to compute (1 − α)NL work on the CPU and to receive the results is less than or equal to the time required to process NL work on the CPU alone (4.5), where O0 is the proportion of Oc associated with calling and processing the function in SW with N = 0 work. Second, the time to process αNL on the accelerator should also be less than or equal to the time to process NL work on the CPU alone (4.6). We find NL using Eq. (4.8) by solving the two inequalities (4.5) and (4.6) simultaneously.


\begin{align*}
\frac{N_L}{X_c} + O_0 &\ge \frac{(1 - \alpha) N_L}{X_c} + O_c && \text{(4.5)} \\
\frac{N_L}{X_c} + O_0 &\ge \frac{\alpha N_L}{X_a} + D_a && \text{(4.6)}
\end{align*}

\begin{align*}
\frac{N_L}{X_c} - \frac{(1 - \alpha) N_L}{X_c} &\ge O_c - O_0 && \text{rearranging (4.5)} \\
N_L - N_L + \alpha N_L &\ge X_c (O_c - O_0) \\
\alpha N_L &\ge X_c (O_c - O_0) && \text{(4.7)} \\
\frac{N_L}{X_c} + O_0 &\ge \frac{X_c (O_c - O_0)}{X_a} + D_a && \text{sub. (4.7) into (4.6)} \\
N_L &\ge X_c \left( \frac{X_c (O_c - O_0)}{X_a} + D_a - O_0 \right) && \text{(4.8)}
\end{align*}
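The partitioning decision is cheap enough to evaluate at run time. The C sketch below illustrates Eqs. (4.2), (4.4) and (4.8); the struct layout, the parameter values in main() and the unit convention (workload in bytes, throughput in bytes per nanosecond, overheads and latencies in nanoseconds) are illustrative assumptions rather than the implementation used in this thesis.

/* Sketch of the partitioning model of Section 4.4.  Units are assumed to be
 * consistent: workload N in bytes, throughputs X in bytes/ns, overheads and
 * latencies in ns.  This is an illustration of Eqs. (4.2), (4.4) and (4.8),
 * not the code used in the thesis. */
#include <stdio.h>

struct model {
    double Xa;   /* accelerator throughput             */
    double Xc;   /* CPU throughput                     */
    double Oc;   /* CPU overhead of offloading         */
    double O0;   /* SW call overhead with N = 0 work   */
    double Da;   /* combined initiation/return latency */
};

/* Eq. (4.2): large-workload split, proportional to throughput. */
static double alpha_star(const struct model *m)
{
    return m->Xa / (m->Xa + m->Xc);
}

/* Eq. (4.4): split that accounts for overhead and latency at workload N. */
static double alpha(const struct model *m, double N)
{
    return alpha_star(m) * (1.0 - m->Xc * (m->Da - m->Oc) / N);
}

/* Eq. (4.8): smallest workload for which offloading a share to HW pays off. */
static double n_lower(const struct model *m)
{
    return m->Xc * (m->Xc * (m->Oc - m->O0) / m->Xa + m->Da - m->O0);
}

int main(void)
{
    /* Illustrative parameters only, loosely in the spirit of Table 4.3; the
     * exact values depend on how MB/s figures are converted to bytes/ns. */
    struct model m = { .Xa = 0.594, .Xc = 0.231, .Oc = 80, .O0 = 36, .Da = 401 };
    double N = 1024.0;  /* bytes */

    printf("alpha*          = %.2f\n", alpha_star(&m));
    printf("alpha(N=%.0f B) = %.2f\n", N, alpha(&m, N));
    printf("N_L             = %.0f B\n", n_lower(&m));
    return 0;
}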

4.5 Evaluation of communication overheads

To evaluate our model, we must first find values for the CPU overhead (Oc) and latency (Da) of the available methods for initiating an accelerator and retrieving the result on our target platform. Knowledge of these overheads also assists in choosing the most appropriate communication primitives for our accelerator.

We used the Avnet Zedboard for our study, which features a Zynq-7020 SoC (Section 2.7.3). The Zynq comprises two ARM Cortex-A9 CPU cores with tightly-coupled programmable logic. Data transfer can be achieved by sharing memory in either RAM or with the low-latency L2 cache of the CPU. In the latter case, memory access by the FPGA is coherent with that of the CPU (Figure 4.2). Therefore, the CPU overhead of preparing memory for shared access is avoided.

Figure 4.2: Cache-coherent data flow on Zynq-7000 series SoC.

The Zynq also supports CPU mastered direct communication between CPU and FPGA via MMIO. Write transfers propagate from the CPU to the FPGA through interconnects that each provide internal buffering. When access is configured with the strongly-ordered (SO) memory attribute, the CPU must wait for each write to be acknowledged by the target before another transaction can be issued (Figure 4.3a). On the other hand, the device (DE) attribute allows the CPU to continue as soon as the write is buffered (Figure 4.3b). While the DE attribute reduces the CPU overhead of writes, buffers throughout the interconnect may merge writes to the same address such that some writes never reach their target (Section 2.5.1). SO and DE reads follow the same principles as writes, except that data is returned with the acknowledgement of completion. Because reads modify CPU state (the register file), the number of outstanding transactions before the CPU must stall is reduced.

Figure 4.3: MMIO write transactions with SO and DE memory attributes. (a) MMIO write transaction with SO memory attribute. (b) MMIO write transaction with DE memory attribute.

While both SM and MMIO can be used to transfer data, MMIO additionally provides a signal to the accelerator that data has been transferred. An approach to signalling when using cache-coherent SM is provided by the send event (SEV) instruction. ARM suggests that this mechanism would be useful for synchronising access to memory that is shared between CPU and accelerator [ARM05]. When data has been written to cache-coherent SM, the data synchronisation barrier (DSB) instruction is used to stall the CPU until the data can be observed in the cache by the accelerator. The CPU then executes the SEV instruction to toggle the logical state of the EVENT EVENTO signal in programmable logic. This notifies the accelerator that data has been updated (Figure 4.4a). Although data synchronisation extends the execution time of the transfer, it prevents the accelerator from reading stale data from the cache (Figure 4.4b).

Figure 4.4: Cache-coherent SM writes with signalling on Zynq-7000 series SoC. (a) SEV signalling with data synchronisation. (b) SEV signalling only.
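A minimal sketch of the producer-side sequence in Figure 4.4a is shown below, assuming GCC-style inline assembly on the Cortex-A9; the mailbox layout is an illustrative assumption.

/* Producer-side sketch of the signalling sequence in Figure 4.4a: write the
 * command into cache-coherent SM, drain it to where the ACP can observe it
 * (DSB), then signal the accelerator with SEV.  GCC-style inline assembly on
 * an ARMv7 Cortex-A9 is assumed; the command layout is illustrative. */
#include <stdint.h>

struct accel_mailbox {
    volatile uint32_t opcode;
    volatile uint32_t operand;
};

static struct accel_mailbox mbox;   /* assumed to be placed in cache-coherent SM */

static inline void sm_signal_command(uint32_t opcode, uint32_t operand)
{
    mbox.operand = operand;
    mbox.opcode  = opcode;
    __asm__ volatile ("dsb" ::: "memory");  /* stall until the writes reach the cache */
    __asm__ volatile ("sev");               /* toggle the PL event line to notify HW  */
}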

In the subsections that follow, the CPU overhead and latency of short transfers using MMIO and SM are evaluated to acquire the metrics that are used by our model. In the case of MMIO, both SO and DE memory attributes are evaluated. In the case of SM, we also evaluate the overhead of signalling using the SEV instruction. The latency of bulk data transfers is application specific and will be discussed in our HW accumulator evaluation in Section 4.6.

For all of our experiments, the CPU, programmable logic and DDR RAM were configured to operate at 667 MHz, 214 MHz and 533 MHz respectively. The theoretical maximum bandwidths of the ACP, L2 cache, DDR RAM and GP port are 1632 MB/s, 5086 MB/s, 2132 MB/s and 816 MB/s respectively at their configured operating frequencies [Xil14]. All experiments were run as bare metal applications.


Figure 4.5: CPU overhead and out-of-order execution.

4.5.1 CPU overhead

The CPU overhead of a short transfer is the CPU execution time required to perform that transfer to or from a target memory or accelerator. The execution time depends on the operation, the target, the access attributes and whether or not subsequent instructions can be executed out-of-order. Out-of-order execution reduces the CPU overhead that is attributed to CPU pipeline stalls (Figure 4.5).

We evaluated the CPU overhead of reads and writes to various targets. In our experiments, MMIO communication to programmable logic was issued to an AXI memory controller that we connected directly to the GP port at the CPU-FPGA boundary of the Zynq. This controller was designed to respond immediately to all requests and thereby eliminated latency due to soft interconnects and peripherals.

Communication via cache-coherent SM is via the ACP on the Zynq. An ACP read transaction from programmable logic to SM can be served from any level of the memory hierarchy, including the private L1 cache of either CPU core. On the other hand, an ACP write transaction from programmable logic is always issued to the L2 cache. In this case any corresponding cache lines present in the private L1 cache of each CPU core are invalidated. When the CPU reads data back from SM, read requests are served by the L2 cache of the CPU. Therefore, the CPU overheads of cache-coherent SM communication were measured as the CPU execution time required to write to the L1 cache for writes and read from the L2 cache for reads.

Table 4.1: Zynq-7020 CPU overheads, measured in CPU cycles, for short transfers between CPU and various targets.

Target                        Words = 1     2     3     4     5     6     7     8
In-order execution
L2 Cache read                        28    41    54    67    80    93   106   119
FPGA MMIO read (DE)                  76    86   103   119   145   164   178   188
FPGA MMIO read (SO)                  76   146   205   278   343   413   478   542
L1 Cache write                        7     7     8     9    10    11    12    13
L1 Cache write; DSB; SEV             24    24    25    26    27    28    29    30
FPGA MMIO write (DE)                 14    17    20    23    26    29    79    86
FPGA MMIO write (SO)                 85   164   241   317   385   476   565   626
Out-of-order execution
L2 Cache read                        12    25    38    51    64    77    90   103
FPGA MMIO read (DE)                  60    70    87   103   129   148   158   172
FPGA MMIO read (SO)                  60   130   189   262   327   397   461   526
L1 Cache write                        1     1     2     2     3     3     4     5
L1 Cache write; DSB; SEV             24    24    25    26    27    28    29    30
FPGA MMIO write (DE)                  1     1     4     7    10    13    63    70
FPGA MMIO write (SO)                 69   145   225   298   369   468   549   610

We used the CPU cycle counter of the ARM performance monitoring unit (PMU) to measure the CPU overheads for reads and writes of various sizes to and from various targets. Both SO and DE memory access attributes were explored when accessing the FPGA using MMIO. We measured a pessimistic execution time by flushing the CPU write buffers and execution pipeline before and after the transfer instruction(s) were issued. To consider the impact of out-of-order execution, we also measured a best-case execution time by adding 12 instructions that could be executed out-of-order after the transfer instruction(s). In both cases, the overhead of reading the CPU cycle counter and executing the pad instructions (when present) were subtracted from the measured result. The median CPU execution cycles of 100 measurements for both reads and writes were recorded. While key results are summarised in Table 4.1, a complete set of results can be found in Appendix A.
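A single measurement might look like the sketch below; the PMU register accesses are the standard ARMv7 CP15 encodings for the cycle counter, while the measured MMIO target, the barrier placement and the overhead-subtraction parameter are illustrative assumptions.

/* Sketch of timing one short transfer with the Cortex-A9 cycle counter
 * (PMCCNTR).  The CP15 accessors are the standard ARMv7 PMU encodings; the
 * measured MMIO write and the overhead value passed in are illustrative. */
#include <stdint.h>

static inline uint32_t pmu_cycles(void)
{
    uint32_t cc;
    __asm__ volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r"(cc)); /* read PMCCNTR */
    return cc;
}

static inline void pmu_enable(void)
{
    __asm__ volatile ("mcr p15, 0, %0, c9, c12, 0" :: "r"(1u));       /* PMCR.E = 1  */
    __asm__ volatile ("mcr p15, 0, %0, c9, c12, 1" :: "r"(1u << 31)); /* enable CCNT */
}

#define GP_ACCEL_REG ((volatile uint32_t *)0x40000000u)  /* hypothetical MMIO target */

uint32_t measure_mmio_write(uint32_t value, uint32_t counter_read_overhead)
{
    uint32_t start, end;

    pmu_enable();
    __asm__ volatile ("dsb" ::: "memory");   /* drain write buffers before measuring  */
    start = pmu_cycles();
    *GP_ACCEL_REG = value;                   /* the short transfer being measured     */
    __asm__ volatile ("dsb" ::: "memory");   /* pessimistic case: drain buffers again */
    end = pmu_cycles();

    return (end - start) - counter_read_overhead;  /* subtract counter read cost */
}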

Our results show that CPU overheads for FPGA writes of up to 6 words in size were similar to those obtained for L1 cache writes when out-of-order execution is possible and the DE attribute is used. This is because the DE attribute allows the CPU to continue to execute after issuing the transaction to the interconnect. While the L1 cache operates at the same frequency as the CPU, the interconnect to the FPGA operates at 1/3 of that frequency. Therefore, once CPU write buffers are full, each additional write requires 3 CPU cycles to execute. Once interconnect buffers are full, additional writes must stall the CPU until a previous write transaction has been processed.

When using the SEV signalling mechanism to notify the accelerator of L1 cache writes, an additional 23 CPU cycles are required when compared to the L1 cache write itself. Such signalling can alternatively be achieved by writing the last word of the transfer directly to the FPGA. In that case, at most 14 CPU cycles are required when the DE attribute is used. Therefore, this method of signalling is only beneficial if it significantly reduces the latency of transfers from CPU to FPGA.

For all memory attributes and execution orderings, the CPU overhead of an FPGA read is much higher than the overhead of an L2 cache read. This is due to CPU pipeline stalls as it waits for the read transaction to complete. The duration of that stall is determined by the latency of the transaction, which is higher for the FPGA than the L2 cache.

4.5.2 Latency

Latency is the time between when a transfer begins and when data arrives at its target. The latency represents time when the accelerator is not able to process data and must be considered when balancing the work between HW and SW. The latency of communication between CPU and accelerator is part of the Da parameter in our model. The remaining part is the latency of the initial and final bulk data transfers from and to SM (if required).

Prior work has measured only the round-trip latency of a transfer [MSFRA15] [MRAF18] [CFMRAF17] [SSS15] [SWWB13]. Round-trip latency is convenient because a common reference of time can be used to record both the start and end time of the experiment.


Figure 4.6: MMIO latencies. (a) FPGA to CPU MMIO latency. (b) CPU to FPGA MMIO latency.

Figure 4.7: Cache-coherent SM latencies. (a) CPU to FPGA SM latency. (b) FPGA to CPU SM latency.

However, our model requires that we exclude the latency of read requests (Figure 4.6a) and write acknowledgements (Figure 4.6b) to measure the latency of the transfer of the data itself.

For cache-coherent SM communication, data transfer propagates through the cache. Therefore, we measure the latency as the time between when the writer transfers data to the cache and when that data is observed in the cache by the reader (Figure 4.7).

To provide a common reference of time, we used the low-latency SEV signalling mechanism provided by the Zynq. We executed the SEV instruction at key points in the instruction stream to observe when neighbouring instructions were executed by the CPU. We assumed that this signal has no latency since it does not propagate through interconnects within the CPU or programmable logic. We used Vivado’s integrated logic analyser (ILA) soft IP core to measure the elapsed FPGA clock cycles between events in programmable logic and the CPU. For consistency, we present our results in CPU cycles.

For MMIO communication from CPU to FPGA, we executed the SEV instruction immediately before a write instruction. We then measured the time from when the SEV instruction was observed until the data arrived at the CPU-FPGA boundary.

For MMIO communication from FPGA to CPU, we executed the SEV instruction immediately after the read instruction. The SEV instruction enforces a serialising barrier in so far as any previous instruction that modifies CPU state completes before the SEV is executed. In our case, the CPU state is the target CPU register of the read. Therefore, these instructions will execute in program order. We measured the latency as the time from when the data arrived at the CPU-FPGA boundary until the SEV instruction was observed.

The latency of SM communication was measured by first initialising a pre-determined word in memory to a known value. The sender was then configured to change the value of that word and the receiver was configured to continuously read that word until it observed that the value had changed. When the FPGA is the sender, we programmed the CPU to execute the SEV instruction when it observed a change in memory content. The latency was then measured as the time from when the write transaction arrived at the CPU-FPGA boundary until the CPU executed the SEV instruction. When the FPGA was the receiver, the CPU executed the SEV instruction immediately before writing to SM. The latency was then measured as the time from when the SEV instruction was observed until the updated value appeared at the CPU-FPGA boundary.
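The two CPU-side halves of this protocol might look like the following sketch; the flag word, its initial value and its placement in cache-coherent SM are illustrative, while the SEV instruction is the real ARMv7 event signal described above.

/* CPU-side halves of the SM latency measurement described above. */
#include <stdint.h>

static volatile uint32_t sm_flag = 0;   /* pre-determined word in coherent SM */

/* CPU as sender: signal the reference point, then update the shared word. */
static void cpu_send(void)
{
    __asm__ volatile ("sev");           /* start point observed by the ILA     */
    sm_flag = 1;                        /* FPGA polls until it sees this value */
}

/* CPU as receiver: poll until the FPGA's write is visible, then signal. */
static void cpu_receive(void)
{
    while (sm_flag == 0)                /* continuously read the shared word   */
        ;
    __asm__ volatile ("sev");           /* end point observed by the ILA       */
}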

The results of our experiments (Table 4.2) show that the latency of short transfers from the CPU to the FPGA is lowest when communication is direct via MMIO. In Section 4.5.1, we found that writes from the CPU to the FPGA have a similar overhead to L1 cache writes when up to 6 words are transferred. Depending on the throughputs provided by each resource, the CPU overhead of MMIO communication may outweigh the increased latency when more than 6 words must be transferred. Therefore, we conclude that MMIO should be used for all transfers less than 7 words in size, with larger transfers dependent on the opportunity cost of CPU overhead and transfer latency to each resource.

Table 4.2: Zynq-7020 communication latency between CPU and programmable logic.

Direction   Method   Latency (CPU cycles)
CPU→FPGA    MMIO                       32
CPU→FPGA    SM                         42
FPGA→CPU    MMIO                       36
FPGA→CPU    SM                         36

Regarding transfers from FPGA to CPU, we found that transfer latency was the same for both MMIO and SM communication. Since the overhead of L2 cache reads is significantly higher than FPGA reads (Section 4.5.1), we conclude that SM should be used for all communication from FPGA to CPU.

4.6 Hardware accumulator evaluation

The objectives for this chapter were to create a model for low-latency communication on tightly-coupled CPU-FPGA platforms and to demonstrate that it supports cooperative processing of short-running tasks that are partitioned across CPU and FPGA. As previously foreshadowed, we chose an accumulator task for this demonstration because it is easily partitioned and cooperatively executed on both CPU and FPGA. Although this application is trivial, it allows us to focus on workload partitioning and communication overhead, rather than on the underlying data processing algorithm. The methods used in this study can be applied to a range of other applications.

The accumulator has a known workload (the number of values to accumulate). As previously explained, the challenge for this application is to partition the workload such that the CPU and FPGA complete their share of the work at the same time. With this application we show that our model determines the optimum partitioning ratio to minimise completion time. We also show that NL (the minimum workload at which we benefit from HW acceleration) is accurately predicted by our model.

Figure 4.8: Accumulator connection and communication.

We connected our accumulator to a DMA controller implemented in programmable logic. The DMA controller provided a stream of integers from SM for processing (Figure 4.8). The DMA engine was configured using two MMIO writes from the CPU to the DMA engine via the GP port. These writes provided the address of the first integer and the number of bytes to transfer, and instructed the DMA engine to begin the transfer.

The DMA engine provided independent channels for up- and down-stream transfers. We used the second channel to transfer the result of the computation back to SM once the last integer had been accumulated. We assumed that this needed only to be programmed once to choose a fixed location in SM for the result of all accumulator operations. For this reason, the overhead of programming the second channel was excluded from our experiments.

The DMA engine was connected to the ACP to avoid the overheads of cache-maintenance. Our IP was capable of processing two 32-bit words concurrently in a single cycle to match the 64-bit data bus of the ACP. The system throughput was thus limited by the bandwidth provided by the ACP.
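Putting the pieces together, the CPU side of a cooperative accumulate call might look like the sketch below; the DMA register offsets, the result/valid words in SM and the externally supplied split fraction (obtained from Eq. (4.4)) are illustrative assumptions, not the actual implementation.

/* Sketch of a cooperative accumulate call: start the FPGA on its share with
 * two buffered MMIO writes to the DMA engine, accumulate the SW share on the
 * CPU, then aggregate the partial results. */
#include <stdint.h>
#include <stddef.h>

#define DMA_BASE 0x40400000u                          /* hypothetical GP-port address */
#define DMA_SRC  (*(volatile uint32_t *)(DMA_BASE + 0x0))
#define DMA_LEN  (*(volatile uint32_t *)(DMA_BASE + 0x4))  /* write also starts DMA */

/* Fixed result location in cache-coherent SM, programmed once on the return channel. */
static volatile uint32_t hw_sum;
static volatile uint32_t hw_done;

uint32_t accumulate(const uint32_t *data, size_t n, double a)  /* a: HW fraction */
{
    uint32_t sw_sum = 0;
    size_t hw_n = 0;

    if (a > 0.0)
        hw_n = (size_t)(a * (double)n) & ~(size_t)1;  /* even count: 64-bit ACP beats */

    if (hw_n > 0) {
        hw_done = 0;
        DMA_SRC = (uint32_t)(uintptr_t)data;          /* buffered (DE) MMIO writes    */
        DMA_LEN = (uint32_t)(hw_n * sizeof(*data));
    }

    for (size_t i = hw_n; i < n; i++)                 /* CPU processes its own share  */
        sw_sum += data[i];

    while (hw_n && !hw_done)                          /* result arrives via coherent SM */
        ;
    return sw_sum + (hw_n ? hw_sum : 0);
}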


Figure 4.9: Accumulator throughput for software and hardware.

We measured the final component of CPU overhead, O0, by using the CPU cycle counter to measure the cost of calling the accumulator function with 0 integers to accumulate.

The final component of latency was found by using the ILA to measure the time between the DMA controller transaction arriving at the CPU boundary and the first integer pair arriving at the accelerator. Alternatively, this could be found from the DMA and interconnect IP core specifications.

Finally, the throughput of the compute resources (Xa and Xc) was measured using the CPU cycle counter when calling the accumulator function with very large workloads (Figure 4.9); the communication overheads can be ignored in this case. The complete set of parameters, including α and NL, is shown in Table 4.3. For our accumulator application, NL must be a multiple of 8 bytes to match the 64-bit wide ACP bus.

We measured the completion time of the accumulator when using SW- and HW-only, as well as a cooperative system in which the partitioning was chosen by our model (Figure 4.10). Included in our results is the expected execution time as predicted by our model. We also include the empirical optimum result for each workload, which was found experimentally by varying the partitioning ratio from 0.0 to 1.0 in increments of 0.001 and executing the task to completion.

Table 4.3: Accumulator model parameters given a 667 MHz CPU clock frequency and 214 MHz FPGA clock frequency.

Symbol   Value           Source
Xa       594 MB/s        Application dependent (Figure 4.9)
Xc       231 MB/s        Application dependent (Figure 4.9)
O0       36 ns           Application dependent
Oc.1     1.5 ns          Table 4.1: FPGA (DE) write of 2 words (1 cycle)
Oc.2     42 ns           Table 4.1: L2 cache read of 1 word (28 cycles)
Oc       80 ns           Oc.1 + Oc.2 + O0
Da.1     48 ns           Table 4.2: CPU→FPGA MMIO (32 cycles)
Da.2     224 ns          Application dependent
Da.3     75 ns           Application dependent
Da.4     54 ns           Table 4.2: FPGA→CPU SM (36 cycles)
Da       401 ns          Da.1 + Da.2 + Da.3 + Da.4
α        0.72 − 56N⁻¹    Eq. (4.4)
NL       96 B            Eq. (4.8)
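As a check on these parameters, the large-workload split follows directly from Eq. (4.2), and substituting the measured overheads into Eq. (4.4) yields the α row of the table; the exact constant in the second term depends on how the MB/s throughputs are converted to bytes per nanosecond.

\[
\alpha^* = \frac{X_a}{X_a + X_c} = \frac{594}{594 + 231} \approx 0.72,
\qquad
\alpha = \alpha^* \left( 1 - \frac{X_c (D_a - O_c)}{N} \right) \approx 0.72 - \frac{56\,\text{B}}{N}.
\]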

Our results show that our model predicted the workload NL, for which the cooperative approach begins to outperform the SW-only approach. This prediction was within 8% of the empirical optimum for the accumulator application. Our model was able to predict the completion time of the partitioned cooperative system with a MRE of 12% for workloads in the range 8 B to 1 KB. Over the same range, the MRE between the execution time obtained using the model and the best-case observations was 2%.

4.7 Chapter summary

In this chapter I presented a model for partitioning a task between a CPU and tightly-coupled programmable logic for all cooperatively processed workloads. This model can be applied to short-running tasks as it considers both the CPU overhead and latency of communication between compute resources.

We have also reported the CPU overhead and latency of short MMIO and SM transfers between the CPU and the FPGA on the Zynq-7020 device. These parameters, as well as interconnect delays, were used in our model to estimate the partitioning and completion time of a suitable application.

Figure 4.10: Accumulator execution time for α partitioning.

Our model was able to predict, within 8%, the workload at which we benefit from task partitioning for a generic stream-based application. The task completion time was predicted by our model with an MRE of 12%. We used our model to calculate a partitioning ratio that reduced task completion time to be within 2% of the empirical optimum.

Our model considers a single task that benefits from a single accelerator and assumes that the accelerator has been configured before the task begins execution. In the next chapter, I extend this model to a dynamic multitasking environment in which the programmable logic hosts a number of concurrent, reconfigurable accelerators that are shared between many diverse tasks.


Chapter 5

Accelerator sharing in multi-user environments

This chapter is focused on whether FPGA technology can be used to improve the performance of general-purpose systems, such as personal computers and mobile devices. Unlike special-purpose systems, general-purpose systems execute a variety of short-running tasks in a multitasking environment. When accelerating a short-running task, the overhead of migrating a task from CPU to FPGA and back can quickly outweigh any improvement in execution time. In order to accelerate such tasks, we must minimise the initiation overheads of using the HW. These include configuring an appropriate accelerator and transferring work to it.

A significant task migration overhead is the time required to configure the FPGA to perform a desired function. Ideally, one avoids the recurring configuration overhead by sharing accelerators between applications. In this way, accelerators can stay resident and be reused by other applications. However, accelerators are typically designed to be specific to the application to be accelerated and are difficult to reuse [ISA16].

Shared libraries, on the other hand, are designed to contain functions that are common to many general-purpose applications. Examples of such functions from the C standard library are strcmp for comparing two strings and qsort for sorting an array. One might use qsort to sort database query results and strcmp to compare text fields between records, thereby defining the desired ordering.
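The database-style use mentioned above might combine the two functions as follows; this is a minimal, self-contained illustration using only the standard library, with the record contents invented for the example.

/* Minimal illustration of the shared-library functions mentioned above:
 * qsort orders an array of records and strcmp supplies the text-field
 * comparison that defines the ordering. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int by_name(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

int main(void)
{
    const char *records[] = { "mallory", "alice", "bob" };   /* example data only */
    size_t n = sizeof(records) / sizeof(records[0]);

    qsort(records, n, sizeof(records[0]), by_name);

    for (size_t i = 0; i < n; i++)
        puts(records[i]);
    return 0;
}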

Shared libraries are provided by the system and are linked into applications at run time. This reduces application size since common sections of application code and data need only be stored once on disk. Once loaded, read-only sections of these libraries can be shared across applications to conserve memory and cache footprint.

Our motivation for choosing to accelerate shared libraries extends to the inherently high engineering effort of FPGA accelerators. Because shared libraries are shared across many applications, the development costs can be divided across those applications. With lower development costs, there is greater incentive to optimise the design.

Shared libraries also make applications more portable across platforms by providing abstract interfaces to the application. This problem is notoriously difficult to manage for FPGA HW, for which a new bitstream is needed for each combination of application and FPGA device. Shared libraries, on the other hand, can be ported to new HW without modifying the application. This is particularly important when the end user does not have access to the source code of the application, but can control the libraries to which it will be linked at run-time.

A secondary overhead for task migration is transferring data for access by the FPGA. This overhead has been addressed by HW advancements in heterogeneous CPU-FPGA systems. Vendors are providing cache-coherent interconnects that eliminate the need for designers to manage a consistent view of memory content between the FPGA and CPU (Section 2.6). Additionally, the CPU’s virtual memory model can be replicated by the FPGA. This provides consistent memory addressing between HW and SW. The overhead of calling the OS to prepare memory for access by the FPGA is eliminated.


5.1 Contributions

The objective for this work is to demonstrate the benefits of accelerating functions that are common across concurrent applications in general-purpose systems.

The core contributions of this work are:

1. A framework for sharing accelerators concurrently across multiple, diverse applications on a multi-core platform (Section 5.4).

2. An efficient method for transferring function arguments between HW and SW (Section 5.4.3).

3. Algorithms for cooperatively partitioning work across CPU and FPGA-based accelerators in a multitasking environment (Section 5.4.7).

4. A performance evaluation of our argument transfer method with respect to traditional methods (Section 5.5.1).

5. An evaluation of the potential for FPGA-based shared library accelerators to improve the overall performance of general-purpose, multitasking systems (Section 5.5.2).

5.2 Publications

The work in this chapter is in the process of being drafted into a paper submission. This work is my own, with shepherding provided by my supervisor, Oliver Diessel.

5.3 Prior work

A number of frameworks for FPGA-based accelerators have been proposed for heterogeneous CPU-FPGA systems [JK13,Egu10,GWC+14,LP09b,LP07,So07,Int19c]. These frameworks aim to provide a uniform interface to FPGA resources for instantiating, initialising and submitting processing requests to application-specific accelerators [PAA+06, VPI05].

FPGA accelerator frameworks often have competing design goals. RIFFA is designed to provide minimal engineering effort by focusing on cross-platform and cross-OS portability [JK13]. RIFFA achieves this by sacrificing platform-specific performance enhancing features. EPEE, on the other hand, is optimised for performance at the cost of portability [GWC+14]. OS-specific features, such as zero-copy, are used to minimise latency on Linux-based systems.

From the perspective of the application developer, our framework is optimised for both performance and portability. Accelerated shared libraries need only be ported to each platform, rather than for each application-platform combination.

Accelerator frameworks typically provide a blocking interface to their applications. Multithreading has been proposed as a method to ensure an application can progress when a blocking call to an accelerator is made [Egu10]. This model follows a family of work that proposes the use of light-weight SW threads as wrappers for HW threads [LP09b, So07]. A HW thread is a task that is being performed by an accelerator. The SW thread is used as a proxy between the accelerator and the host application or the OS.

Our framework is similar to a HW thread scheme except that the proxy thread is not expected to block until the HW thread has completed. Our proxy thread assists the HW thread by partitioning the workload, initiating the HW threads, and processing part of the workload itself.

Our framework partitions work between HW and SW for parallel execution on the available resources. A comparable approach to workload partitioning and processing is the foundation of the OpenCL standard [Khr19]. OpenCL considers loops within code as candidates for partitioning and parallel execution. Each iteration of the loop is called a work-item and can be scheduled on any number and combination of computational units, such as CPUs and FPGAs. Using OpenCL, an FPGA has been viewed as a resource that can change its function depending on the nature of the work to be scheduled [VPK19].

Figure 5.1: Overview of accelerated shared-library framework.

FPGA HW has traditionally been used to accelerate long-running tasks. This allows partitioning algorithms to be complex without adding significant overhead to completion time. It also allows for scheduling algorithms with application-domain knowledge and a complete set of sub-tasks to be scheduled over a long period of time [VPK19].

In contrast, our target domain is general-purpose systems that may host a large number and variety of short-lived applications. Task arrival times are sporadic and difficult to predict. It is not possible to formulate a globally optimal schedule for tasks and resources.

5.4 Framework architecture

Our framework provides middleware to connect applications to FPGA-based accelerators (Figure 5.1). The framework and the OS work together to detect and predict application demand for common functions. The OS responds to the anticipated demand by ensuring that a useful set of accelerators are active using DPR. The framework is implemented in HW to avoid OS system call overhead by allowing applications to submit jobs directly to the framework.


In this work, we identify tasks as abstract operations that must be performed. We assume that tasks can be partitioned into smaller subtasks. When such a subtask is submitted to a compute resource, it is referred to as a job. Jobs cannot be partitioned further and must run to completion.

The framework provides performance metrics to the application so that the application may choose the set of compute resources for jobs in order to minimise task execution time. The metrics include the estimated wait time of a submitted job and the current total throughput provided by the configured accelerators. If the application calculates acceleration to be beneficial, part of the workload will be processed by the CPU and the rest will be submitted to our framework for processing by one or more accelerators. When our framework schedules the job on an appropriate accelerator, the designated accelerator gains access to the memory address space of the application. The accelerator updates the job data and status in application memory during processing. If an accelerator is not present for the submitted job, the framework reports an error to the application and notifies the OS of the demand for the new accelerator. We expect most GP tasks to be small enough that they can be processed in SW before an accelerator for that function is configured. Our goal is therefore to ensure that an accelerator is ready for that function’s next use when that function is found to be requested frequently.
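From the application's side, the interaction just described might look roughly like the sketch below. All fw_-prefixed names, the metric fields and the job descriptor are hypothetical placeholders for whatever interface the middleware exposes; they are not an API defined in this thesis.

/* Hypothetical application-side flow for the framework described above: query
 * metrics, decide the HW share, submit a job, process the SW share, aggregate. */
#include <stdint.h>
#include <stddef.h>

struct fw_metrics { uint64_t accel_bytes_per_s; uint64_t queue_wait_ns; };

struct fw_job {
    uint32_t          function_id;  /* identifies the shared-library function   */
    const void       *data;         /* virtual address; translated by the IOMMU */
    size_t            len;          /* bounded job size (bytes)                 */
    volatile uint32_t done;         /* written by the accelerator in app memory */
    volatile uint64_t result;
};

/* Provided by the (hypothetical) framework middleware. */
int fw_get_metrics(uint32_t function_id, struct fw_metrics *m);
int fw_submit(struct fw_job *job);            /* non-zero: accelerator not resident */

uint64_t checksum(uint32_t function_id, const uint8_t *buf, size_t n, double hw_fraction)
{
    struct fw_metrics m;
    struct fw_job job = { .function_id = function_id, .done = 0 };
    size_t hw_n = 0;
    uint64_t sum = 0;

    /* Offload only if metrics are available and the chosen split is worthwhile. */
    if (fw_get_metrics(function_id, &m) == 0 && hw_fraction > 0.0) {
        job.data = buf;
        job.len  = hw_n = (size_t)(hw_fraction * n);
        if (fw_submit(&job) != 0)
            hw_n = 0;                         /* fall back to SW-only processing */
    }

    for (size_t i = hw_n; i < n; i++)         /* CPU processes the remainder */
        sum += buf[i];

    while (hw_n && !job.done)                 /* accelerator updates job in app memory */
        ;
    return sum + (hw_n ? job.result : 0);
}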

The HW for our proposed framework (Figure 5.2) can be divided into four parts: (1) the reconfigurable accelerator slots for cooperative processing; (2) the IOMMU to provide a unified address space between the CPU and accelerators; (3) the CSR manager for mapping job parameters to the CSR of the target accelerator; and (4) the system monitor to collect and report performance statistics and to initiate reconfiguration events. Each of these is detailed along with the operating principles of the system in the following subsections.

5.4.1 Reconfigurable slots

The reconfigurable slots provide areas in the FPGA fabric to instantiate the accelerator logic. Given a large number of functions to be accelerated and a limited amount of FPGA area, we expect accelerators to be instantiated, retired or replicated at run-time in response to changing demands using DPR [VPKG18].

Figure 5.2: Hardware system of accelerated shared-library framework.

Our framework provides a standard interface to each slot such that accelerators are interchangeable. The interface that our framework provides includes an MMIO port for accelerator control, a cache-coherent SM port for data access and a signal to indicate to the framework when a job can be retired and the next can be scheduled.

The framework does not provide mechanisms for preemptive scheduling of HW jobs due to the associated FPGA area, design complexity and time overhead. Preemptive scheduling involves periodically interrupting a HW job and saving its state so that it can be resumed at a later time [SLM00,MG13,KP05,LHLT10,HTK15]. Another job can then be selected, restored and resumed. To support preemptive scheduling, both framework and accelerator must provide logic for saving and restoring accelerator registers. Unlike preemptive scheduling of SW jobs on a CPU, there is no bound to the number of registers that must be saved and restored. Therefore, there is no bound on the time and memory required to change accelerator state. Instead, we rely on scheduling small amounts of work to completion [VPK19].

Without preemptive scheduling of HW jobs, a single application can monopolise an accelerator by submitting a long-running, uninterruptible job. To mitigate this, a bound is placed on the size of a submitted workload. Long-running tasks are expected to be divided into smaller sub-tasks that can be run to completion in a short period of time. For example, a large image processing task can be partitioned into blocks. Each block can be scheduled for processing by one or more accelerators. This partitioning provides the scheduling flexibility required to frequently switch between tasks without the overhead of saving and restoring a large number of registers.

Partitioning also allows SW to spread a task across multiple accelerator replicas. This is particularly important when the number of accelerators changes after jobs have been submitted. With inadequate partitioning and no mechanism for repartitioning within the framework, the additional accelerators cannot be used.

For tasks that cannot be broken down into short running subtasks, the programmable logic can be partitioned such that some accelerators operate within our framework and others operate as traditional bare-metal accelerators.

5.4.2 IOMMU

A traditional accelerator will access memory via its physical address rather than the virtual address of the application. A traditional application must therefore perform a manual translation of each contiguous virtual range of memory to provide a set of physical addresses that the accelerator can use. This translation requires a call to the OS and is an overhead that we endeavoured to avoid for our short-running library functions. To that end, our framework provides an IOMMU for the hosted accelerators that performs the same function as the MMU of the CPU.

Our IOMMU accesses OS-provided page table structures to: 1. translate any virtual address provided by the application to a physical address that can be used by the accelerator; or 2. interrupt the CPU such that the OS can respond to failed translations. With the set of valid translations controlled by the OS, memory need not be pinned to ensure residency and memory access can be restricted such that one application cannot use an accelerator to access the memory of another application.

Examining page table structures is a time-consuming process. Therefore MMUs typically include a TLB to cache frequently translated addresses. Unlike prior work [WC17], our IOMMU features a unified TLB: a single translation cache is maintained and serves all accelerators. Translations performed for one accelerator can therefore be reused by another accelerator.

The unified TLB is beneficial to tasks that are partitioned in space or time. Cached translations are reused when a task is partitioned across replicated accelerators. Similarly, translations are reused when a task is partitioned into dissimilar functions that are executed sequentially. For example, an application may wish to render an encrypted Joint Photographic Experts Group (JPEG) image file. The application must first decrypt, then decompress, and finally transform the file content such that it can be displayed at the correct scale on screen. Each of these operations can be assisted by HW acceleration.

A unified TLB that serves multiple parallel accelerators implies that all transactions and translations are tagged with an address space identifier (ASID). This ASID differentiates a virtual address of one application from that of another application. Since an application may be malicious or contain bugs, we cannot assume that an ASID provided by an application is valid. Instead, it is the framework that must identify which application has submitted each job and associate it with the correct ASID.

We use the address of MMIO transactions to the framework to identify the applications from which they originate and the MMU of the CPU to restrict the addresses that the application can access. This ensures that an application’s identity cannot be forged and prevents one application from submitting a job that may access the private data of another. Our framework extracts the ASID from the most significant address bits of the MMIO transaction that is used to access the CSR of the framework. We use the MMU of the CPU to prevent an application from accessing the physical address range that is associated with other ASIDs. Once decoded, the ASID is bound to the submitted job and will be used to tag any SM transfer that is submitted to the IOMMU for the application.

Our approach to binding jobs with ASIDs requires little FPGA area. One must only consider the additional signals and registers required to carry a wider address bus from the CPU to the framework.

5.4.3 Job queues

The job queues hold SW-submitted jobs until they can be scheduled on an accelerator. This frees the CPU to continue execution as soon as the job has been submitted. In its simplest form, a job references a function and contains a set of function arguments. The arguments provide input data directly and/or provide the location of IO buffers in memory.

In order to minimise OS overhead for short-running tasks, the job queues are accessed directly by the application via MMIO. The job queues thereby define the application binary interface (ABI) between SW and HW.

The shared library SW submits a job by transferring a function identifier (ID), function arguments and a job size to our framework. The function ID must be decoded immediately to select the destination queue for the incoming job. We therefore provide the function ID in the same way that the ASID is provided: by encoding it into the address bits of MMIO transactions to the framework.

It is important that thread safe library functions remain thread safe after they are accelerated using our framework. A thread safe library function is a library function that allows multiple threads within an application to safely execute that function concurrently without additional synchronisation. A thread safe library function typically does not modify resources that are shared across threads, such as global variables. For example, two calls to a library function may safely operate on two independent data sets. However, the accelerated shared library introduces a shared resource: the accelerator framework. Without synchronisation, the configuration of the CSR of the framework by one thread can be interrupted and overwritten by another. One solution to such inter-application concurrency is to replicate the CSR of each accelerator for each application. Unfortunately this does not address intra-application concurrency concerns – each application may contain many concurrent threads.

The ABI hence presents two design problems for shared library accelerators: 1. How should multiple threads in one or more applications submit jobs at the same time? 2. How many function arguments should be transferred via the job queue registers and how many via SM (e.g. the stack)?

As it turns out, our solution to problem 2) is a by-product of our solution to problem 1). By restricting the amount of data that is transferred to the word size of the MMIO data bus, jobs can be submitted in a single atomic MMIO transaction. There is then no need to replicate the CSR of the accelerators for each application, nor to synchronise access from each thread to the CSR using locks. Because the MMIO data width is typically the size of a pointer, the data that is transferred is a pointer to the function’s arguments. Although it takes longer for the accelerator to fetch a small number of arguments from memory than it would to transfer them via MMIO (Chapter 4), we avoid CPU thread synchronisation time and the use of FPGA block RAM (BRAM) resources. A quantitative study of the overheads of this method is presented in Section 5.5.1.


Table 5.1: Job queue MMIO address encoding.

Bit range     Number of bits   Purpose
31 - 30       2                Constant: FPGA base address.
29 - 22       8                ASID of origin.
21 - 14       8                Function ID.
13 - 2        12               Workload size.
1 - 0         2                0 (Word alignment).
Total         32

Storing the workload size with the argument list pointer in the job queues allows the framework to estimate the wait time for arriving jobs. When these metrics are exported to the application in real-time, they assist the application in deciding how much work to offload to the accelerator(s). However, to ensure an atomic transfer that incorporates the additional data, an MMIO bus that is two words wide is required. Because the bus provided by our target platform is only one word wide, we encode the workload size in the remaining address bits of the MMIO transaction. Our encoding for the address that an argument list pointer should be transferred to is shown in Table 5.1.
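To make the encoding concrete, the following C sketch packs the fields of Table 5.1 into an MMIO offset. It is an illustrative sketch rather than the implementation: the field positions follow the bit layout reconstructed in Table 5.1, and the names queue_window, job_offset and submit_job are assumptions. In the real system the ASID bits are fixed by the physical alias page that the OS maps into the application, so user code only selects the function ID and workload size within its own window.

#include <stdint.h>

/* Sketch only: field positions follow Table 5.1. The ASID bits (29-22) are
 * determined by the alias page mapped by the OS, not by user code. */
static inline uintptr_t job_offset(uint32_t func_id,   /* 8 bits  */
                                   uint32_t work_size) /* 12 bits */
{
    return ((uintptr_t)(func_id   & 0xffu)  << 14)   /* bits 21-14: function ID   */
         | ((uintptr_t)(work_size & 0xfffu) <<  2);  /* bits 13-2 : workload size */
                                                     /* bits 1-0  : word aligned  */
}

/* queue_window is the job-queue CSR alias mapped into the application.
 * A job is submitted with a single word write of the argument-list pointer,
 * so no locking is required between threads. */
static inline void submit_job(volatile uint8_t *queue_window,
                              uint32_t func_id, uint32_t work_size,
                              const void *arg_list)
{
    volatile uint32_t *slot =
        (volatile uint32_t *)(queue_window + job_offset(func_id, work_size));
    *slot = (uint32_t)(uintptr_t)arg_list;  /* one atomic MMIO transaction */
}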

Apart from the arguments to be transferred, we must also consider how accelerator status and function return values are shared between the accelerator and the application. When solving this problem, we must consider that multiple threads within the application may be concurrently submitting multiple parallel jobs to the accelerators. The location of the status of each of these jobs must be known and accessible at any time such that each thread can determine when its submitted jobs are complete. Our solution is to store status values with the list of arguments. When the job is complete (or an error occurs), the framework updates both the status and the return fields using the argument pointer that was submitted when the job was enqueued.
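As a sketch of how the status and return value can live alongside the arguments, the C fragment below groups them into one block that both the application and the framework reference through the submitted pointer. The field names and status codes are assumptions made for illustration; they are not the framework's actual layout.

#include <stdint.h>

enum { JOB_PENDING = 0, JOB_DONE = 1, JOB_ERROR = 2 };  /* assumed codes */

struct job_block {
    uint32_t args[8];          /* function arguments fetched by the CSR manager */
    volatile uint32_t status;  /* written by the framework via SM               */
    volatile uint32_t retval;  /* function return value, if any                 */
};

/* Each thread polls only its own job_block, so no inter-thread
 * synchronisation is needed to detect completion. */
static inline uint32_t wait_for_job(const struct job_block *job)
{
    while (job->status == JOB_PENDING)
        ;  /* in practice SW processes more of the workload here */
    return job->retval;
}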

5.4.4 Job router

The job router defines the path from the application HW/SW boundary to the accelerator for incoming jobs. This path specifies where jobs will be queued and to which slots the job can be scheduled for processing.

The router first decodes incoming jobs by function ID and inserts them into the queue that was assigned for that function. With many more accelerated functions than queues, it is not possible to concurrently route all functions to accelerators. Jobs that arrive for functions that are not being accelerated are submitted in error and will not be queued. Instead, their job status will be updated to report an error. Because the CPU cannot continue until the submitted transaction to the framework has been accepted, such failed jobs are handled with highest priority. Jobs for functions that are routed to full queues are treated in the same way, except that a different error code is returned.

Recording the IDs of functions that were requested but unavailable indicates to the framework a demand for the missing accelerators. When the current set of instantiated accelerators is re-evaluated, the framework may replace an unused accelerator with an accelerator for the requested function. With knowledge of the configured accelerator set, the shared library should not submit work for a function that it finds is not currently being accelerated; it should only submit such a job if it wishes to inform the framework that there will be demand for that function in the near future.

The framework supports the replication of accelerators to improve the combined throughput for functions that are in high demand. This is achieved by allowing each queue to be served by multiple accelerator slots. For scalability, we design this as a one-to-one slot to queue routing rather than a one-to-many queue to slot routing. When scheduling a queued job to an accelerator, our design replaces an O(n) idle accelerator search for each nonempty queue with an O(1) nonempty queue lookup for each idle accelerator. Since each queue need not store a list of associated accelerators, our scalable design reduces BRAM requirements and simplifies the logic.

Since our framework does not support saving and restoring accelerator state, and to avoid jobs being stalled in a queue indefinitely, we constrain the system such that a slot cannot be reconfigured while it is processing a job, or while it is the only slot that is servicing a nonempty queue.

5.4.5 CSR manager

The CSR manager provides the abstraction needed to connect a traditional MMIO CSR-based accelerator to our framework. Once a job is ready to be scheduled on an accelerator, the CSR manager retrieves the job function arguments from SM using the argument list pointer that was provided by the application. It then copies the arguments to the accelerator CSR. The CSR manager then updates the control registers of the accelerator so that the accelerator begins processing. Once complete, the CSR manager retrieves the return value from the accelerator and copies it back to the application using SM.

The CSR manager assumes that accelerator control registers conform to a specific design pattern (Table 5.2). The pattern matches that of an AXI peripheral that was generated using Vivado HLS. This provides a uniform way in which the manager controls all accelerators, except in the number of data words (function arguments) that the manager must transfer from the argument list pointer to the accelerator CSR.

The CSR manager uses little logic for identifying the next queue-accelerator pair for scheduling (Figure 5.3). Per-slot multiplexers are instantiated to provide a vector that indicates whether the associated queue for each slot is empty. The input of those multiplexers is a vector that identifies which queues are empty, and the slot-queue routing table is used to select which empty signal is observed at the output of each multiplexer. A bit-wise NOR of the multiplexer outputs and the busy signals provided by each accelerator indicates which queue-accelerator pairs can be serviced by the CSR manager. A priority encoder is then used to choose which slot will be served next, and a reverse lookup table identifies the queue to which it was bound. Assuming a small number of accelerators and a relatively large processing time for each, we expect our priority encoder to schedule HW jobs in round-robin order in practice.


Table 5.2: Assumed accelerator CSR and addressing.

Offset         Symbol   Description
0x00           CTRL     Accelerator control (start/done)
0x04           GIE      Global interrupt enable
0x08           IER      Interrupt enable
0x0C           ISR      Interrupt status
0x10           Ret      Function return value
0x18           Arg0     First function argument
...            ...      ...
0x18 + N × 4   ArgN     Last function argument
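Although the CSR manager is implemented in logic, the register offsets of Table 5.2 imply a simple control sequence, which the C sketch below merely illustrates. The AP_START and AP_DONE bit positions follow the usual Vivado HLS ap_ctrl convention and are an assumption about the generated accelerators rather than something specified by the framework.

#include <stdint.h>

#define CSR_CTRL  0x00u   /* bit 0: ap_start, bit 1: ap_done (assumed HLS layout) */
#define CSR_RET   0x10u
#define CSR_ARG0  0x18u
#define AP_START  0x1u
#define AP_DONE   0x2u

static inline void csr_write(volatile uint8_t *csr, uint32_t off, uint32_t v)
{
    *(volatile uint32_t *)(csr + off) = v;
}

static inline uint32_t csr_read(volatile uint8_t *csr, uint32_t off)
{
    return *(volatile uint32_t *)(csr + off);
}

/* Copy n_args words of the fetched argument list into the accelerator CSR,
 * start the job, wait for completion and return the function return value. */
static uint32_t run_job(volatile uint8_t *csr, const uint32_t *args, int n_args)
{
    for (int i = 0; i < n_args; i++)
        csr_write(csr, CSR_ARG0 + 4u * (uint32_t)i, args[i]);
    csr_write(csr, CSR_CTRL, AP_START);
    while (!(csr_read(csr, CSR_CTRL) & AP_DONE))
        ;  /* the real CSR manager serves other slots instead of busy-waiting */
    return csr_read(csr, CSR_RET);
}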


Figure 5.3: Accelerator service queue-slot selection logic.

Once a queue-accelerator pair has been selected, the CSR manager uses the argument list pointer and ASID provided by the queued job to copy function arguments from SM to the CSR of the accelerator. The CSR manager then configures the SM interface of the slot to tag all transactions with the provided ASID. Thereafter the CSR manager writes to the appropriate control register of the accelerator to start the job. Finally, the CSR manager is free to serve other queue-accelerator pairs.

When an accelerator has finished a job, it must update the job status and, where appropriate, the return value of the accelerated function call in SM. Each accelerator provides a signal to indicate when a job has finished. Those signals are aggregated and processed by a priority encoder to choose which accelerator will be served next. As before, a round-robin scheduling is expected in practice due to the execution delay of a job.

Although the job completion handler runs concurrently with the job initiator, it must be stalled when an invalid job arrives (e.g. a job for which the function is not routed to a queue). In that event, we apply back pressure to the MMIO interface to prevent further jobs from arriving from the CPU. We then wait for any in-flight status and return value updates to complete and report an error status for the submitted job as soon as possible. Once reported, back pressure is released and the CPU can submit more jobs.

5.4.6 System monitor

The monitor provides run-time system performance statistics that assist shared libraries in optimally partitioning workloads. The monitor provides the total size and number of jobs in the queue such that applications can estimate the wait time of any job that the application submits. The monitor also provides the available SM bandwidth such that the expected throughput of memory-bound accelerators can be penalised when memory bandwidth is limited. Lastly, the monitor reports the utilisation of accelerators to the reconfiguration manager (Section 5.4.7) such that the configured set of accelerators can be optimised for the demonstrated demand.

We use worst-case values for HW performance estimation because it is difficult to recover from over-utilisation of HW resources. Once a job has been submitted to our framework, it cannot be modified, repartitioned or removed. If a HW job finishes later than expected, the application must idle until the job has finished. However, if HW jobs finish earlier than expected, SW can repartition the remaining work across the accelerators using updated performance metrics.

We measure the available SM bandwidth (BW) by counting the number of clock cycles that the SM bus is inactive for over a fixed time period. The application can use the reported available bandwidth to determine if jobs scheduled to n_a accelerators, that each require BW_a bandwidth, will be compute bound or memory bound. An application can then scale the theoretical throughput of each accelerator (X_a) to find the estimated effective throughput (X'_a) per accelerator (Eq. 5.1).

X'_a = X_a \times \min\left( \frac{BW}{n_a BW_a},\ 1 \right)    (5.1)

We estimate the amount of time that a new job will wait in the queue (D_q) to be the sum of two components (Eq. 5.2): 1. The sequential initiation delay D_a of scheduling the existing q queued jobs on the accelerators; and 2. The parallel processing delay of those q queued jobs and n_a running jobs, each of which requires W_i work to be completed. Since accelerators do not provide real-time updates of progress, D_q represents the worst-case queue wait time.

D_q = D_a q + \frac{1}{n_a X'_a} \sum_{i=1}^{q+n_a} W_i    (5.2)
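For reference, the following C sketch shows how an application might evaluate Eqs. (5.1) and (5.2) from the exported metrics. The structure and field names are assumptions; only the arithmetic follows the equations above.

/* Assumed snapshot of the read-only metrics for one function. */
struct fn_metrics {
    double   bw_avail;    /* BW  : available SM bandwidth                */
    double   bw_per_acc;  /* BWa : bandwidth required by one accelerator */
    double   x_acc;       /* Xa  : theoretical accelerator throughput    */
    double   d_init;      /* Da  : per-job initiation delay              */
    unsigned n_acc;       /* na  : accelerators routed to this function  */
    unsigned q_jobs;      /* q   : jobs currently queued                 */
    double   work_sum;    /* sum of W_i over queued and running jobs     */
};

/* Eq. (5.1): bandwidth-adjusted effective throughput per accelerator. */
static double effective_throughput(const struct fn_metrics *m)
{
    double scale = m->bw_avail / ((double)m->n_acc * m->bw_per_acc);
    if (scale > 1.0)
        scale = 1.0;
    return m->x_acc * scale;
}

/* Eq. (5.2): worst-case time a newly submitted job waits in the queue. */
static double queue_wait(const struct fn_metrics *m)
{
    return m->d_init * (double)m->q_jobs
         + m->work_sum / ((double)m->n_acc * effective_throughput(m));
}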

The monitor stores metrics in cache-coherent SM to minimise access latency from applications. The framework need only share metrics for functions that are currently accelerated. Hence metrics could be bound to the queues to which both functions and slots are routed. However, applications are not made aware of reconfiguration events; the framework may reroute queues between the application identifying a function’s assigned queue and analysing its metrics. For this reason, metrics are bound to each function and only updated if the function is being accelerated.



Figure 5.4: Software system of accelerated shared-library framework.

Since it is not feasible to update the metrics of all active functions in a single memory transaction, the framework maintains a bitfield of metrics that are scheduled to be updated and updates them sequentially with independent SM transactions.

5.4.7 Software system

The SW system is composed of the OS and one or more applications (Figure 5.4). The OS is modified to include a framework driver module that monitors and controls the configuration of accelerators in HW. The runtime environment that is provided by the OS links applications to the accelerated shared library functions.

In this section, we describe the behaviour of the SW system chronologically from the time the system is booted to the time an accelerated shared library function has run to completion and returns control to the application.

100 5. Accelerator sharing in multi-user environments

Boot

At boot time, the OS determines whether compatible HW for our framework is connected to the system and, if so, installs the framework driver module into the OS. The module includes the IOMMU drivers, IRQ handlers, reconfiguration manager and a SM page for monitor metrics.

Periodic monitoring and reconfiguration

Once the system has booted, the reconfiguration manager periodically reconfigures the set of accelerators and associated routes to match user demand. The reconfiguration manager determines the demand for each accelerated function by periodically examining two metrics that are reported by the system monitor.

One metric for demand is the aggregate work that remains to be processed. However, the remaining work that is reported by the monitor can be misleading: unless the monitoring period is less than the processing time of a job, the queued work can be empty at one monitoring sample, then filled and drained before the next. In this case, the manager would observe no work in either sample and conclude that the function was unused.

To overcome this limitation, the reconfiguration manager also examines the amount of work that was submitted between two samples. When examining the submitted work, we do not consider whether or not that work has been processed during the monitoring period. However, this metric may exclude a potentially large amount of work that remains to be processed when work was submitted before both samples were recorded.

Therefore, the manager considers both submitted work and remaining work when estimating demand. Specifically, we define the demand for a function over a monitoring period to be the maximum of the remaining work and the submitted work. We then define the demand for an accelerator to be the demand for the associated function divided by the number of accelerators that have been configured for that function.
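A minimal sketch of that demand estimate, assuming the monitor exports per-function counters for remaining and submitted work (the field names are illustrative):

struct fn_sample {
    double   work_remaining;  /* work still queued at the end of the period */
    double   work_submitted;  /* work submitted during the period           */
    unsigned n_replicas;      /* accelerators configured for the function   */
};

/* Demand for a function over one monitoring period. */
static double function_demand(const struct fn_sample *s)
{
    return (s->work_remaining > s->work_submitted)
         ? s->work_remaining : s->work_submitted;
}

/* Demand per configured accelerator; functions without a resident
 * accelerator only report the single-bit request described below. */
static double accelerator_demand(const struct fn_sample *s)
{
    return (s->n_replicas > 0)
         ? function_demand(s) / (double)s->n_replicas
         : function_demand(s);
}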


The reconfiguration manager (Algorithm 1) begins by iterating over all accelerators that are idle. Although these accelerators are idle, they may have been used heavily during the last monitoring period. Therefore, the slot is a candidate for reconfiguration only if the demand for the accelerator is below the constant C RECONFIG.

Algorithm 1: Periodic reconfiguration algorithm.
Input: Accelerator status and demand; function request bitmap.
Result: Improved set of configured accelerators and job routes; clears function request bitmap.
 1  foreach s ∈ idle slots do
 2      if demand for s is less than C RECONFIG then
 3          // Search for an overutilised slot to assist
 4          s_h := accelerator with highest demand;
 5          if demand for s_h is greater than C REPLICATE then
 6              reconfigure s to replicate s_h;
 7              route s → queue of s_h;
 8          else if function requests ≠ {} and available queues ≠ {} then
 9              // Search for a new function to accelerate
10              f := next function request;
11              q := find available queue;
12              reconfigure s for f;
13              route s → q;
14              route f → q;
15              clear request for f;
16          end
17      end
18  end
19  clear all function requests;

Functions that are not routed to queues provide little information about function demand. They only provide a single bit that indicates if the accelerator was requested at least once during the sample period. On the other hand, configured accelerators provide a rich set of metrics that better quantify demand. Therefore, we prioritise accelerator replication over new function requests. The reconfiguration manager searches the set of accelerators to find the accelerator that has the highest demand. If the demand for that accelerator is above C REPLICATE, the idle accelerator will be retired and the slot will be reconfigured to assist the overloaded accelerator.

If the slot has not been reconfigured to replicate another and a new function has been requested, we reconfigure the accelerator to serve one of the new function requests. Due to the reconfiguration delay of the accelerator, the configured accelerator might not be used by the function call that requested it. However, the request implies that the accelerator may be needed again in the future. Because the accelerator provides no information on whether it will be used again in the future, we always reconfigure accelerators that are idle and have a demand below C RECONFIG for new function requests when a replication candidate is not found.

The reconfiguration manager continues this process until it has considered all slots in the framework. It then clears all function requests to ensure requests are recent. Finally, the reconfiguration manager waits for an interrupt to indicate that the next monitoring period has expired before repeating the process.

Login

When a user logs into the system, the login scripts configure the runtime system to use the accelerated libraries if available. The scripts configure the runtime by replacing the search paths for SW-only libraries with paths to their accelerated alternatives [BNP10]. By providing both SW-only and accelerated versions of the library we avoid overheads, such as checking accelerator availability, for each function call. This method also allows users to opt-out of using the accelerated libraries by overriding the configuration made by the login scripts.

Load

The process of loading an application for execution is unchanged when our framework is used. The dynamic linker identifies unresolved symbols in the application that it binds to shared library symbols. The dynamic linker finds these libraries via the search paths that were adjusted to include or exclude the accelerated libraries at login time. Once all symbols have been resolved, the application is initialised.

103 5. Accelerator sharing in multi-user environments

Initialisation

Once the application is loaded, the dynamic linker calls the initialisation function of each linked library. If that library is an accelerated shared library, the initialisation function of that library executes a system call to the OS to prepare the application to use the framework. The system call directs the OS to 1. allocate a new ASID for the application; 2. register the ASID and page table of the application with the IOMMU; 3. map the appropriate job queue CSR alias into the virtual address space of the application for direct access; and 4. map the read-only monitor metrics page into the virtual address space of the application.
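As a sketch of what such an initialisation function might look like, the fragment below uses a hypothetical character device with an ioctl/mmap interface. The device path, request code, offsets and page sizes are assumptions for illustration and are not the framework's actual ABI.

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

#define FRAMEWORK_DEV        "/dev/accel-framework"  /* hypothetical */
#define FRAMEWORK_IOC_ATTACH 0x4101                  /* hypothetical */

static volatile uint8_t *job_queue;  /* write-only job-queue CSR alias  */
static const uint8_t    *metrics;    /* read-only monitor metrics page  */

__attribute__((constructor))
static void accel_lib_init(void)
{
    int fd = open(FRAMEWORK_DEV, O_RDWR);
    if (fd < 0)
        return;  /* framework absent: fall back to SW-only processing */

    /* Steps 1 and 2: the driver allocates an ASID for this process and
     * registers its page table with the IOMMU. */
    if (ioctl(fd, FRAMEWORK_IOC_ATTACH, 0) < 0) {
        close(fd);
        return;
    }

    /* Step 3: map the job-queue CSR alias for direct MMIO submission. */
    job_queue = mmap(NULL, 4096, PROT_WRITE, MAP_SHARED, fd, 0);

    /* Step 4: map the read-only monitor metrics page. */
    metrics = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 4096);

    if (job_queue == MAP_FAILED || metrics == MAP_FAILED) {
        job_queue = NULL;
        metrics = NULL;
    }
}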

Call

When an application calls an accelerated shared library function, the function partitions the workload into small jobs and schedules them for execution on the available compute resources. Our partitioning algorithm reduces the overall completion time of the function by exploiting the parallelism provided by the FPGA-based accelerators. Rather than waiting for the accelerators to complete, the function also processes some of the workload on the CPU itself.

Optimum workload partitioning requires an accurate prediction of the completion time of each job. To minimise overall completion time, partitioning aims to ensure that both HW and SW partitions finish execution at as close a time as possible. If the SW partition finishes before HW, CPU execution cycles are wasted while waiting for HW to complete. If HW partitions complete before SW, an opportunity for parallel execution is lost.

SW should dynamically rebalance partitions in multitasking environments because the completion time of SW jobs is difficult to predict. Such tasks can be interrupted at any time to allow other tasks to consume their fair share of CPU time. In the extreme cases, the decision to send all or none of the workload to the accelerator(s) depends on where a process is up to in its scheduling time slice. If the task is at the beginning of its time slice, a SW-only approach could be optimal due to the initiation delay of HW jobs. On the other hand, if its time slice has almost expired, a HW-only approach could be optimal. In that case, HW jobs continue uninterrupted and could complete before the application is rescheduled for SW execution (Figure 5.5). We refer to this issue as a context switch hazard. Because there is no way of knowing where a task is up to in its time slice, in our framework, SW periodically evaluates the progress of HW jobs and rebalances the work to minimise the completion time of each task.


Figure 5.5 contrasts three scenarios of FPGA and SW execution of two tasks over time: a context switch hazard avoided, a context switch hazard with static workload partitioning, and a context switch hazard with dynamic rebalancing.

Figure 5.5: Context switch hazards for HW/SW workload partitioning.

The completion time of a HW job is also difficult to predict. Although in our framework individual HW jobs are uninterruptible, it is difficult to accurately predict how long a job will spend queued before it is scheduled. The framework may increase the number of accelerators for a function at any time, thereby allowing more jobs to be scheduled in parallel. With more parallelism, the remaining jobs will wait for less time in the queue. Like the context switch hazard identified above, such mispredictions result in HW completing earlier than expected and can be resolved with periodic monitoring and rebalancing in SW.

To improve scheduling flexibility and parallel execution, in our framework SW submits more jobs to the accelerators than there are resident accelerators. Submitting a single large HW job for each replica allows those jobs to continue processing in the event of a context switch hazard. However, it slows the response to accelerator replication because jobs cannot be repartitioned to use the additional accelerators once submitted. On the other hand, submitting a single small job to each accelerator allows us to respond quickly to accelerator replication, but it limits the amount of work that can be processed if the application’s time slice expires. Further, the monitor will observe that the accelerator is underutilised because we have withheld a lot of work from the framework in the hope that the workload can be better partitioned across replicas when they become available. Our partitioning algorithm hence divides the HW workload further into a number of smaller HW jobs. In this way, dynamic replication can be exploited while maximising the work performed during a context switch hazard.

The partitioning algorithm uses both offline and real-time performance metrics to decide how the workload will be partitioned. Offline metrics, such as throughput and initiation overhead, provide an estimate of completion time in an environment where there is no competition for resources. Online metrics, such as queuing delay and available bandwidth, are provided by the system monitor (Section 5.4.6) and are used to adjust the offline performance metrics for current demand.

For small workloads, the delay and CPU overhead of accelerator initiation and synchronisation can outweigh the reduction in execution time. Eq. (5.3) (derived in Section 4.4) can be used to calculate the workload size for which we begin to benefit from using the accelerator (N_L). This equation considers the compute throughput of the CPU (X_c) and the accelerator (X_a) as well as the latency (D_a) and CPU overheads (O_c and O_0) of communication. We adjust Eq. (5.3) to Eq. (5.4) by considering the bandwidth-adjusted accelerator throughput (X'_a). If the workload size of the submitted task is greater than N_L, but there is no accelerator configured, the application submits an empty job for the function. This job will be rejected by the framework because the function is not being accelerated, but it informs the framework that there might be some future benefit to accelerating that function.

N_L \ge X_c \left( \frac{X_c (O_c - O_0)}{X_a} + D_a - O_0 \right)    (5.3)

N_L \ge X_c \left( \frac{X_c (O_c - O_0)}{X'_a} + D_a - O_0 \right)    (5.4)

When a task has exclusive access to one accelerator, the function can determine the fraction of the workload that should be processed by the accelerator (α) using Eq. (5.5) (Derived in Section 4.4). That equation considers the throughput ratio of compute resources and compensates for the CPU overhead and the latency of short transfers, particularly when the workload size (N) is small. For large workloads, these overheads are small relative to workload completion time and thus have less impact on the partitioning ratio.

\alpha = \frac{X_a}{X_a + X_c} \left( 1 - \frac{X_c (D_a - O_c)}{N} \right)    (5.5)

We modify Eq. (5.5) to consider the available bandwidth, accelerator replication and queuing delay (Eq. 5.6). The effective throughput (X'_a) is scaled by the number of available accelerators (n_a) and the latency considers both the queuing delay (Section 5.4.6) and the sequential scheduling latency and CPU overhead of submitting n_j jobs to the framework.

\alpha = \frac{n_a X'_a}{n_a X'_a + X_c} \left( 1 - \frac{X_c \left( D_q + n_j \left[ D_a - O_c \right] \right)}{N} \right)    (5.6)
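A small C sketch of the resulting offload decision, evaluating Eqs. (5.4) and (5.6) with the same symbols; the structure and parameter names are assumptions, and the returned fraction is clamped to [0, 1].

/* Throughputs in work units per second, delays and overheads in seconds,
 * workload N in work units; names mirror the symbols in the text. */
struct partition_params {
    double   x_cpu;    /* Xc : CPU throughput                             */
    double   xa_eff;   /* Xa': bandwidth-adjusted accelerator throughput  */
    double   d_init;   /* Da : accelerator initiation latency             */
    double   d_queue;  /* Dq : estimated queue wait (Eq. 5.2)             */
    double   o_cpu;    /* Oc : CPU overhead of an offloaded call          */
    double   o_sw;     /* O0 : CPU overhead of a SW-only call             */
    unsigned n_acc;    /* na : accelerators available                     */
    unsigned n_jobs;   /* nj : HW jobs to submit                          */
};

/* Eq. (5.4): smallest workload for which offloading is worthwhile. */
static double min_offload_size(const struct partition_params *p)
{
    return p->x_cpu * (p->x_cpu * (p->o_cpu - p->o_sw) / p->xa_eff
                       + p->d_init - p->o_sw);
}

/* Eq. (5.6): fraction of the workload N to hand to the accelerators. */
static double hw_fraction(const struct partition_params *p, double N)
{
    double x_hw  = (double)p->n_acc * p->xa_eff;
    double alpha = x_hw / (x_hw + p->x_cpu)
                 * (1.0 - p->x_cpu * (p->d_queue
                        + (double)p->n_jobs * (p->d_init - p->o_cpu)) / N);
    if (alpha < 0.0) alpha = 0.0;
    if (alpha > 1.0) alpha = 1.0;
    return alpha;
}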


Our dynamic workload partitioning algorithm (Algorithm 2) first checks if acceleration is beneficial and, if not, processes the entire workload on the CPU alone (Line 2). If the work could be accelerated but no accelerator is available, the algorithm periodically submits an empty job to request an accelerator and processes work in C COWORK MAX batches in SW until an accelerator is available (Line 7). C COWORK MAX should be chosen to balance a quick response to accelerator configuration with the overhead of polling accelerator state. In practice, C COWORK MAX can be kept low because the number of accelerators is communicated via low-latency SM rather than an OS system call.

Once an accelerator is available, the algorithm begins to partition work across compute resources until all work has been processed (Line 13). The algorithm submits at least n_j jobs to the framework (Line 20). n_j should be chosen to balance the initiation overhead of each job with scheduling flexibility in case an accelerator is replicated. Our choice of n_j = 6 will be explained in Section 5.5.4. While waiting for HW jobs to complete, SW processes work in C COWORK MAX blocks (Line 21). Once HW jobs have completed, we may find that a context switch or accelerator replication reduced the amount of work that was processed in parallel by SW. In that case, the algorithm rebalances the remaining workload by repeating the procedure at Line 13.

Our framework also supports multi-core CPU architectures. In this section, we have described our system from a single-threaded viewpoint only because we detail the process of an isolated call to an accelerated library function. Because the thread safety of libraries is preserved by our framework (Section 5.4.3), the application can use traditional multithreading techniques to take advantage of multi-core CPU architectures.

5.5 Evaluation

The overarching contribution of this chapter is the proposal and evaluation of a framework for accelerating common, short-running tasks. Central to that framework is a low-latency method for transferring tasks between SW and HW. Therefore, we begin our evaluation by measuring the overhead of our transfer method. We then evaluate the performance of our framework in a multi-user environment using synthetic workloads that emulate a concurrent and dynamic set of applications.


Algorithm 2: Workload partitioning algorithm.
Input: N work to be processed; volatile monitor metrics; volatile n_a accelerators configured for this function; n_j jobs to submit; C COWORK MAX work to process in SW before rebalancing.
Result: N work processed.
 1  N_L := calculate minimum workload size using Eq. 5.4;
 2  if N ≤ N_L then
 3      Process N in SW;
 4      return
 5  end
 6  // Process some work in SW while waiting for an accelerator
 7  while N > 0 and n_a = 0 do
 8      Submit empty job to request accelerator;
 9      N_c := min(N, C COWORK MAX);
10      N := N − N_c;
11      Process N_c work in SW;
12  end
13  while N > 0 do    // Partition work across HW and SW
14      α := calculate partitioning ratio using Eq. 5.6;
15      // Included only in a system without dynamic rebalancing:
16      Process N work in SW;
17      if α > 0 then
18          N_f := N × α;
19          N := N − N_f;
20          Submit n_j jobs of N_f / n_j work each;    // n_j = 6 in our exploration
21          while HW jobs are incomplete do
22              // Included only in a system with dynamic rebalancing. While processing, SW monitors HW progress to compensate for context switch hazards and replication.
23              N_c := min(N, C COWORK MAX);
24              N := N − N_c;
25              Process N_c work in SW;
26          end
27      else
28          Process remaining N work in SW;
29      end
30  end


We evaluate using real HW to ensure that subtle SW and HW limitations are not overlooked. The Zedboard [Avn19] satisfies the requirements of our HW model (Section 2.8). In our experiments, we configured both CPUs to operate at their maximum frequency of 667 MHz. Programmable logic was configured to operate at 111 MHz (1/6 of the CPU speed).

We chose Xillinux-1.3 [Xil19b] as the OS and root file system for our evaluation because it was easy to set up and provides a complete user environment based on the popular Ubuntu Linux distribution. We implemented our framework as a loadable kernel module rather than modifying the kernel source directly.

5.5.1 ABI overhead

Our framework allows applications to directly and concurrently submit jobs to the accelerators. Our method avoids synchronisation overhead by transferring the memory address of all function arguments in a single atomic transfer (Section 5.4.3). However, the framework must now read the function arguments from memory and copy them to the accelerator’s CSR before the accelerator can begin accessing IO buffers (when required) and processing. It is important that we measure these costs.

Besides the number of arguments to transfer, the argument access latency from the FPGA is expected to influence performance. The latency in this case is influenced primarily by the residency of memory translations in the TLB of the IOMMU. With a two-level page table, the page table walker of the IOMMU must make two random access memory reads to translate the application virtual address to a physical address. These reads have a direct data dependency: the first transaction must complete before the second transaction can begin. Once the translation is complete, reading the arguments can be done in a single AXI burst transaction. Memory translation is therefore assumed to be the most significant influence on latency and is used as a second variable in our experiment.

We designed a custom accelerator to be used for our ABI benchmark. The accelerator performs no operation: it simply returns success once it has started. The execution time of the accelerator, as viewed from the CPU, is therefore only the time required for SW to prepare and submit function arguments and for the completion status to be read by the CPU. The execution time was measured by first reading the CPU cycle count at the beginning of the experiment using the ARM performance counters. The arguments were then transferred to the accelerator using either direct CSR access via MMIO, or by submitting a job via our framework. SW then waited in a loop until the status flag of the accelerator indicated that the job was complete. Finally, a second cycle count was taken. After subtracting the overhead of reading the cycle counter, the difference between the two cycle count readings was accepted as the makespan of the benchmark for each sample collected.
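A condensed sketch of that measurement on the Cortex-A9 is shown below; it assumes that user-space access to the PMU cycle counter has already been enabled by the kernel, and the submit and poll hooks are placeholders for the two transfer methods being compared.

#include <stdint.h>

/* Read the ARMv7 cycle counter (PMCCNTR). User-space access must have been
 * enabled beforehand (PMUSERENR), otherwise this traps. */
static inline uint32_t read_cycles(void)
{
    uint32_t c;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(c));
    return c;
}

/* Placeholders for the two submission methods under test. */
extern void submit_via_csr_or_framework(void);
extern int  job_done(void);

/* One benchmark sample: submit, spin on the status flag, and return the
 * elapsed cycles minus the measured cost of reading the counter itself. */
static uint32_t benchmark_once(uint32_t counter_overhead)
{
    uint32_t start = read_cycles();
    submit_via_csr_or_framework();
    while (!job_done())
        ;
    return read_cycles() - start - counter_overhead;
}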

In our experiment, we varied the number of arguments to transfer and whether or not address translation was required. The size of each argument was fixed to 32-bits for simplicity while the number of arguments was varied from 0 to 32. TLB translation residency was controlled by flushing the TLB before starting the experiment. When a cached translation was desired, we ran the experiment twice, discarding the first result. We were thereby sure that the framework used the cached translation during the next iteration. Each experiment was run 1000 times.

The results show that the direct MMIO access to accelerator CSR provides the best performance when the number of arguments is less than five 32-bit values (Figure 5.6). For five or more arguments, transferring all arguments via a single pointer provides the best performance, but only when a translation table walk is not required. In the case where a walk is necessary, we found that 21 or more arguments are required before our method outperforms direct CSR access.

Error bars in Figure 5.6 indicate the 1st and 3rd quartiles of the sample distribution. The high variance in direct CSR access time is due to buffering throughout the interconnect that leads to multimodal delays.



Figure 5.6: Execution time of the ABI accelerator with varying number of arguments.

Our results are promising for our method of low-latency argument transfer. Although we must transfer a number of arguments to the accelerator before outperforming direct CSR access, fewer CPU cycles are required to manage the transfer (Chapter 3). This enables the CPU to process more work in SW while waiting for HW jobs to complete.

We measured the duration of the gettimeofday Linux system call to estimate the additional CPU cycles required to synchronise access to the accelerator CSR. This synchronisation could take the form of acquiring a lock or using the OS as a proxy for submitting jobs. The gettimeofday system call requires very little work in the OS; it populates a user-provided location in memory with the current time. By providing a NULL pointer as argument, we avoided that data copy and further reduced the execution time of gettimeofday. We measured the CPU cycles required in the same way that we measured the cost of submitting a job to an accelerator and found that the gettimeofday system call requires 280 CPU cycles on our target platform. Considering that overhead, our method of argument transfer outperforms direct accelerator CSR access for any number of arguments, when address translations are resident in the IOMMU’s TLB.

5.5.2 Framework evaluation

Our aim is to reduce the execution time of general-purpose systems by providing a framework by which common short-running functions can be accelerated across multiple concurrent applications. Although these applications share accelerators, the set of accelerators and the order in which they are used differs across applications. For example, a web browser and photo editor may both decompress and render images, but a photo editor may also apply image filtering to adjust sharpness or colour. Our evaluation of dynamic performance must therefore consider the concurrent execution of a wide range of application personalities.

We are also concerned with the volatility of the set of applications that execute on a general-purpose system. The set of applications is expected to change as users move from one task to another. The optimum set of resident accelerators will change with the active set of applications. Our monitor must quickly retire and replace accelerators with changing demand.

The focus of our work is not on the accelerators themselves; it is the framework. We therefore designed synthetic applications, shared library functions and accompanying FPGA- based accelerators to emulate a wide range of application and accelerator personalities. This method also allows us to change the behaviour of the accelerator at run-time to emulate DPR.

We designed an application that emulates a range of general-purpose applications by sequentially executing randomly selected shared library functions, each with a random amount of work to process. Each library function can be accelerated and contains only a small amount of sequential code for determining which computational units will be used for processing. Therefore, our applications can be considered to be perfectly parallelisable.

To simplify the experiment, random numbers for function and workload selection follow a uniform distribution. We do not favour any particular function when making a selection and each call is selected independently. We defined a work unit to be 1 CPU cycle for convenience and tuned a loop of SW to execute exactly N CPU cycles of work, where N is provided at run-time. That routine is called by library functions when work is to be processed on the CPU.
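A minimal sketch of such a tuned loop is shown below; CYCLES_PER_ITER stands for the calibrated per-iteration cost on the target CPU and would be measured offline, so the value used here is only an assumption.

#include <stdint.h>

#define CYCLES_PER_ITER 4u  /* illustrative; calibrated offline for the CPU */

/* Burn approximately n_cycles of CPU work. The volatile accumulator keeps
 * the compiler from optimising the loop away. */
static void process_in_sw(uint64_t n_cycles)
{
    volatile uint64_t sink = 0;
    for (uint64_t i = 0; i < n_cycles / CYCLES_PER_ITER; i++)
        sink += i;
    (void)sink;
}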

Similarly, we designed an FPGA-based accelerator that accepts a run-time configurable execution time. The execution time of our accelerator is set when the framework copies function arguments from SM to the accelerator CSR. Once the prescribed time has expired, the accelerator reports completion and the framework reports the job status and function return values to the application via SM. The number of arguments expected by our accelerator was fixed at 8× 32-bit arguments. The required number of FPGA clock cycles to emulate accelerator execution time is determined by the CPU/FPGA clock ratio and the desired speedup of the accelerator.

We designed both a SW-only and an accelerated library. While the accelerated library partitions the requested work across compute resources, the SW-only counterpart uses only the tuned SW loop to emulate workload processing. The desired library is chosen manually before each experiment by adjusting the LD_LIBRARY_PATH user environment variable.

Our constructed system has the key parameters presented in Table 5.3. We used the default Linux SW task scheduler time slice of 10 ms. The workload size of each call to a shared library function is chosen randomly to produce CPU execution times in a range that spans the boundary of that time slice. This ensures a mixed range of context switch hazards (Section 5.4.7 and Figure 5.5). Each experiment processes a total workload that requires 8 seconds to complete when processed only in SW on a single CPU core. That workload is divided between applications to bound execution time when the number of applications is large.

We varied the monitoring period, P , throughout our evaluation. Both the monitoring period and the reconfiguration time combined determine the response time of the system to changing demand. For simplicity, we assume that accelerators require P ms of time to be configured before they can be used to accelerate a function if not already resident.


Table 5.3: Parameters of system under test.

Name                                   Value
Number of reconfigurable slots         8
Number of library functions            16
Number of applications                 1-8 per CPU
CPU time slice                         10 ms
Workload per call                      1-100 ms†
Total experiment workload              8 s†
Accelerator speedup                    0×-10×
Monitoring period                      P ms
Idle threshold (C RECONFIG)            0.5 ms†
Overloaded threshold (C REPLICATE)     P / 2 ms†
Reconfiguration delay                  P ms
† equivalent uniprocessor time.

C RECONFIG and C REPLICATE, the workload sizes at which we define accelerators to be under- or overutilised, are set to preserve recently used accelerators, but provide a high replication rate. We leave the exploration of these parameters as future work and select constants for our evaluation. We chose C RECONFIG such that accelerators may be swapped when demand is reported to be equivalent to 0.5 ms of work or less when processed in SW. C REPLICATE is chosen to be proportional to the monitoring period because the reported demand depends on the period over which it was measured. We set C REPLICATE such that accelerators were replicated when the processing time of the reported demand exceeds half of the configured monitoring period.

We vary the speedup that accelerators provide in the range 0× to 10× throughout our experiments. An accelerator with speedup 0× provides no benefit; all work will be processed in SW only. Although HLS has shown speedups between 4× and 126× when compared to an Intel i5 2.67 GHz CPU [LRL+12], a heterogeneous solution has demonstrated speedups in the range 1× to 2× [KPPK11]. That range is in line with the 2.6× throughput of our accumulator case study in Section 4.6.

Our evaluation compares the makespan of a set of random application personalities when using our framework to a system that uses only SW to process the same workload. We evaluate how performance correlates with the number of accelerator slots provided by our framework and the speedup that the accelerators provide. We also evaluate how well our system scales with the number of concurrent applications that must share access to the provided framework and accelerators.

Our experiment considers a varying number of parallel applications, each with a different seed for the generation of function call lists and workload sizes. To eliminate variations caused by disk IO, we synchronise applications using semaphores. Once all applications are loaded and ready to begin, a master application determines the start time of the experiment using the gettimeofday system call and records it in SM. The master then signals all applications to begin sequential execution of their function call list and associated workload sizes. Once complete, applications record their completion time and block on a semaphore until all other applications have completed. Each application then reports the difference between the time reported by the master and its individual completion timestamp. The makespan for the sample is taken as the maximum of the reported execution times.
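A condensed sketch of that synchronisation, using POSIX semaphores placed in shared memory together with the master's start timestamp; the structure, names and initialisation are assumptions (the semaphores would be created process-shared with sem_init(&sem, 1, 0), which is omitted here).

#include <semaphore.h>
#include <stdio.h>
#include <sys/time.h>

/* Hypothetical shared state, e.g. placed in a shared-memory segment. */
struct experiment {
    sem_t          start;       /* posted once per worker by the master */
    sem_t          finished;    /* posted by each worker on completion  */
    struct timeval start_time;  /* recorded by the master               */
};

/* Master: record the start time, release all workers, wait for them. */
static void run_master(struct experiment *e, int n_workers)
{
    gettimeofday(&e->start_time, NULL);
    for (int i = 0; i < n_workers; i++)
        sem_post(&e->start);
    for (int i = 0; i < n_workers; i++)
        sem_wait(&e->finished);
}

/* Worker: wait for the start signal, run its call list, report its time. */
static void run_worker(struct experiment *e, void (*call_list)(void))
{
    struct timeval end;
    sem_wait(&e->start);
    call_list();
    gettimeofday(&end, NULL);
    double elapsed = (double)(end.tv_sec - e->start_time.tv_sec)
                   + (double)(end.tv_usec - e->start_time.tv_usec) / 1e6;
    printf("completion time: %.3f s\n", elapsed);
    sem_post(&e->finished);
}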

Each experiment was repeated 25 times and the median result was collected. This was repeated 50 times, with different seeds for each application. The results were aggregated by taking the arithmetic mean over the 50 medians for each set of experimental parameters. Our final result was then normalised to the execution time of a SW-only approach that was measured using the same procedure. The maximum standard deviation across all of our experiments was 5.0%.

5.5.3 Static homogeneous accelerator cluster

We begin by evaluating a system that does not require accelerators to be reconfigured. This eliminates the effect of the monitoring period and reconfiguration delay. It allows us to focus on the overhead of our framework, the workload partitioning algorithm and context-switch hazards.

We modified the function list generation of applications to utilise only a single common function across all applications. We explored the impact of varying the number of applications, reconfigurable slots, and accelerator speedup on system performance.



Figure 5.7: Normalised makespan when applications call one type of function.


For a system that implements dynamic rebalancing, our results show that the reduction in execution time when our framework is used scales with the number of accelerators and their provided speedup (Figure 5.7). Because applications execute on a dual-core CPU, the CPUs must share the provided accelerators. In the case of two reconfigurable slots, each providing 1× speedup, there is twice the compute power available to each CPU. Hence the execution time is halved.

We calculated the combined utilisation of all CPU cores and accelerators to quantify how well the available resources are used. We define the system utilisation as the ratio of the expected makespan T_e and the observed makespan T_o (Eq. 5.7). The expected makespan is the SW-only makespan scaled by the acceleration potential of the available n_s slots that are shared across N_c CPU cores. Each slot provides a speedup S and cooperates with one CPU core to process the workload (Eq. 5.8).



Figure 5.8: System utilisation when applications call one type of function.

U = \frac{T_e}{T_o}    (5.7)

T_e = \frac{T_{SW}}{1 + \frac{n_s}{N_c} S}    (5.8)
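For completeness, the two expressions in code form; t_sw, n_slots, n_cores and s correspond to T_SW, n_s, N_c and S above.

/* Eq. (5.8): expected makespan of the accelerated system. */
static double expected_makespan(double t_sw, unsigned n_slots,
                                unsigned n_cores, double s)
{
    return t_sw / (1.0 + ((double)n_slots / (double)n_cores) * s);
}

/* Eq. (5.7): utilisation as the ratio of expected to observed makespan. */
static double utilisation(double t_expected, double t_observed)
{
    return t_expected / t_observed;
}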

The utilisation of our system when accelerating a single function is above 93% in all cases (Figure 5.8). The case where only 2 applications share the accelerators provides the best utilisation overall. This is because context-switch hazards are not present: each application is scheduled on an independent CPU core.

When dynamic rebalancing is not provided, we find a clear separation in utilisation between a system that hosts 2 applications and a system that hosts more (Figure 5.9). This is because context switch hazards are avoided when each application has exclusive use of one of the two CPU cores. In the best case, dynamic rebalancing provides a 9.0% improvement in utilisation. This occurs when accelerator speedup is low. When accelerator speedup is high, functions are more likely to complete within the 10 ms scheduler time slice, thereby avoiding a context switch hazard.



Figure 5.9: System utilisation without dynamic rebalancing when applications call one type of function.

Our static accelerator evaluation shows that our framework performs efficiently when demand is anticipated. Methods for anticipating demand are beyond the scope of this work. We instead turn our focus to our framework’s response to the current demand. In a dynamic system, we expect a reduced utilisation since accelerators cannot be used while they are being reconfigured for another function.

5.5.4 Dynamic accelerator cluster

We evaluated the response of our framework to changing application demands by increasing the range of functions that applications execute. A wide range of uniformly distributed, randomly selected functions is unlikely to produce a system that shares accelerators. We therefore limit our system to using just 16 functions – twice the number of reconfigurable slots. The response time depends on the frequency at which our reconfiguration manager monitors demand and the reconfiguration time of accelerators (Figure 5.10). With demand generated at random, we expect it to be difficult to obtain high utilisation. In addition, we expect our system to provide acceleration either for functions that already have a resident accelerator or by reconfiguring accelerators to match the reported demand.



Figure 5.10: Framework response to function request and execution assuming no accelerators have been configured.

We set n_j to six jobs for our dynamic system experiment. Since the number of resident accelerators is expected to be low for each function, the fine-grained partitioning feature of our algorithm is expected to have an impact on performance. Assuming the available 8 slots are shared evenly across the two CPU cores, two jobs can be scheduled sequentially when the first accelerator is configured. The remaining four can then be scheduled in parallel if the monitor reconfigures slots to replicate the accelerator. If slots are not reconfigured, the remaining four will eventually be scheduled and processed sequentially.

Our initial investigation considered a monitoring period of 100 ms. Considering the average area under the curve over the range 0× to 10×, our system was able to reduce the normalised makespan of the application set to 67% (Figure 5.11). However, the acceleration potential of the system is between 5× and 41× per CPU core, depending on the configured speedup of the accelerators. We found an average utilisation of just 15% (Figure 5.12).

The performance of the system improves with the number of applications because the workload is more diverse. With a monitoring period of 100 ms and a scheduling time slice of 10 ms, all 16 applications will execute across the two cores before accelerators are (re)configured. With each application executing a random function, the likelihood of one of those functions finding a resident accelerator is increased.

Figure 5.11: Normalised makespan with 8 accelerator slots, 100 ms monitoring period and applications calling 16 types of functions.

The utilisation of the system decreases as accelerator speedup increases because high-throughput accelerators have a greater impact on performance. An idle accelerator is associated with a cost that is proportional to the throughput that it would otherwise provide. Further to this, accelerators finish faster and thereby spend relatively more time idle or being reconfigured.
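One way to make this cost concrete (our shorthand here, not a formula defined elsewhere in this thesis) is as the CPU-equivalent work forgone while a slot sits idle:

\[
W_{\text{lost}} \;\approx\; s \cdot t_{\text{idle}},
\]

where s is the accelerator's configured speedup relative to a single CPU core and t_idle is the time the slot spends idle or being reconfigured. Doubling s doubles the cost of the same idle period, which is why utilisation falls as speedup rises.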

We further explored the behaviour of our framework by tracing its execution and reconfiguration events. Figure 5.13 shows the execution of two applications across the available 16 functions when 8 accelerators provide a speedup of 10×. For these results, in contrast to all others, we used the same seed to generate the workload of each application. The trace includes the anticipated execution time of each function call if it were executed only in SW. Underlying that trace is a heat map that shows the number of accelerators that are available to each function as the execution progresses.



Figure 5.12: System utilisation with 8 accelerator slots, 100 ms monitoring period and applications calling 16 types of functions.


Figure 5.13: Execution and reconfiguration trace of 2 applications with 100 ms monitoring period.


The execution trace shows that accelerators are frequently configured too late to be used by the function calls that they were configured to accelerate. However, function calls still benefit from acceleration at 120 ms, 130 ms and 720 ms, when the accelerator for that function was available at the time the function was called. Further, the delayed reconfiguration results in a diversity of resident accelerators, which increases the probability that at least one accelerator will be available to accelerate a future function call. Nevertheless, many accelerators remain idle, and accelerated functions seldom use the full compute potential of the available slots.

We examined the response of our framework to different monitoring periods by repeating the above experiment. With a monitoring period of 1 ms (Figure 5.14), our framework responds faster to the measured demand and frequently allocates all 8 slots to the function being executed. We see that, although function calls make better use of the available resources, due to workload demand and consequent accelerator replication, the range of resident accelerators becomes limited. If a short-running task for a non-resident accelerator arrives when our monitoring period is 1 ms, it waits an average of 1.5 ms before an accelerator is configured and a further 2 ms before it is replicated.

We reran our makespan and utilisation experiments of Figures 5.11 and 5.12 with a monitoring period of 1 ms to investigate the response in a more dynamic environment. The average normalised makespan of our application set improved by 43% to 23% (Figure 5.15). The average utilisation increased by 21% to 35% (Figure 5.16). Although our framework now responds more quickly to changing demands, high-throughput accelerators are difficult to utilise. A typical job in our experiment (∼50 ms) will complete quickly once multiple accelerators are resident (Figure 5.17). Therefore, high-throughput accelerators spend a relatively short time processing the workload when compared to their reconfiguration time.

Figures 5.18 and 5.19 show the performance of systems with different monitoring periods when 8 applications execute concurrently. For our remaining experiment, we use the performance of a system that hosts 8 applications and a 1 ms monitoring period as a baseline for comparison.



Figure 5.14: Execution and reconfiguration trace of 2 applications with 1 ms monitoring period.


Figure 5.15: Normalised makespan with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions.



Figure 5.16: System utilisation with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions.

Figure 5.17: Execution time of a typical 50 ms workload using 10× accelerators with a 1 ms monitoring period: segments of 1.5 ms, 2 ms and 0.65 ms at effective speedups of 1×, 11× and 41×, totalling 4.15 ms. Refer to Figure 5.10 for an explanation of timeline events.
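As a back-of-the-envelope check on these timeline segments (reading 11× as one 10× accelerator working alongside a CPU core, and 41× as four replicas plus the core), the 4.15 ms elapsed time accounts for essentially the entire 50 ms of SW-equivalent work:

\[
1.5\times 1 \;+\; 2\times 11 \;+\; 0.65\times 41 \;\approx\; 50\ \text{ms (SW-equivalent)},
\qquad
1.5 + 2 + 0.65 = 4.15\ \text{ms (elapsed)}.
\]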



Figure 5.18: Normalised makespan with 8 accelerator slots and 8 applications that call 16 types of functions as the monitoring period is varied.


Figure 5.19: System utilisation with 8 accelerator slots and 8 applications that call 16 types of functions as the monitoring period is varied.


Our results show that short-running tasks can be accelerated in a dynamic environment. For the range of workloads that we studied, applications benefit particularly well when the framework responds more quickly to changing demand.

5.5.5 The impact of fine-grained job partitioning

Our final experiment evaluates the benefit of dividing a single HW job into smaller jobs when the application observes a single resident accelerator. We modified the shared library functions to submit only one job for each resident accelerator and repeated our makespan and utilisation experiments. We observed that fine-grained job partitioning improved the average normalised makespan by 7.7% over accelerator speedups in the range 0× to 10× (Figure 5.20). The average utilisation was reduced by 9.0% over the same range (Figure 5.21). On closer inspection, we found that, although our framework responds quickly to demand, applications respond slowly to replication. Jobs were rarely scheduled on accelerator replicas, particularly when the number of applications was 2. This is because the application must first wait for the initial large job to complete and, in the absence of context switch hazards, both SW and HW finish processing the entire workload at almost the same time.
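A minimal sketch of the two submission strategies compared above is given below. The function and structure names are ours, not those of the shared library; the point is only that the fine-grained strategy produces slices that replicas configured later can pick up, whereas the coarse strategy commits the whole HW partition to a single job.

    #include <stdio.h>
    #include <stddef.h>

    struct job { size_t offset, len; };  /* a contiguous slice of the HW partition */

    /* Split a HW partition of 'hw_len' elements into 'njobs' roughly equal jobs.
     * With njobs == number of resident accelerators this is the coarse strategy;
     * with a larger njobs (e.g. n_j = 6) later slices can be claimed by replicas
     * as the monitor brings them online. */
    static size_t partition_hw(struct job *jobs, size_t njobs, size_t hw_len)
    {
        size_t chunk = (hw_len + njobs - 1) / njobs;   /* ceiling division */
        size_t n = 0;
        for (size_t off = 0; off < hw_len; off += chunk) {
            size_t rem = hw_len - off;
            jobs[n++] = (struct job){ off, rem < chunk ? rem : chunk };
        }
        return n;
    }

    int main(void)
    {
        struct job jobs[8];
        /* Coarse strategy: one job per resident accelerator (here, one). */
        size_t n = partition_hw(jobs, 1, 48000);          /* 48000 is illustrative */
        printf("coarse: %zu job(s)\n", n);
        /* Fine-grained strategy: n_j = 6 jobs submitted up front. */
        n = partition_hw(jobs, 6, 48000);
        for (size_t i = 0; i < n; i++)
            printf("job %zu: offset %zu, length %zu\n", i, jobs[i].offset, jobs[i].len);
        return 0;
    }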

5.6 Chapter summary

In this chapter we have described and evaluated our framework for accelerating general-purpose tasks in a multi-user environment. The framework allows applications to share a set of accelerators to avoid reconfiguration overheads for short-lived tasks. Function demand is monitored and responded to by optimising the available set of accelerators using DPR. When an accelerator is overloaded, a replica may be reconfigured and routed to the job queue that serves that function.

Our results show that the optimum method for transferring a small number of function arguments on the Zynq-7020 is direct CSR access. However, such a design pattern does not address the concurrency concerns of sharing accelerators across threads and applications.



Figure 5.20: Normalised makespan with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions. Only one job is submitted per resident accelerator.


Figure 5.21: System utilisation with 8 accelerator slots, 1 ms monitoring period and applications calling 16 types of functions. Only one job is submitted per resident accelerator.


We have shown that the alternative approach provided by our framework both solves the concurrency concerns and provides the best performance when five or more 32-bit values are passed between SW and HW and previous translations remain in the TLB.
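For context, direct CSR access from user space on a Zynq-class device is typically achieved by memory-mapping the register region, as in the sketch below (the physical base address and register layout are hypothetical, not those of our accelerators). The sketch also illustrates why this pattern, although cheap for a handful of 32-bit arguments, provides no arbitration between threads or applications on its own.

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define CSR_PHYS_BASE 0x43C00000UL  /* hypothetical accelerator CSR base */
    #define CSR_SPAN      0x1000UL

    int main(void)
    {
        int fd = open("/dev/mem", O_RDWR | O_SYNC);
        if (fd < 0) { perror("open /dev/mem"); return 1; }

        volatile uint32_t *csr = mmap(NULL, CSR_SPAN, PROT_READ | PROT_WRITE,
                                      MAP_SHARED, fd, CSR_PHYS_BASE);
        if (csr == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        /* Passing five 32-bit arguments is just five stores; no system call is
         * involved, which is why CSR access is fast for small transfers. */
        for (uint32_t i = 0; i < 5; i++)
            csr[i] = 0xC0DE0000u | i;
        csr[8] = 1;  /* hypothetical 'start' register */

        /* Nothing here prevents another thread or process from mapping the same
         * registers and clobbering the arguments mid-call; this is the
         * concurrency concern that the framework's job-queue interface addresses. */
        munmap((void *)csr, CSR_SPAN);
        close(fd);
        return 0;
    }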

Our framework reduces the makespan of a random set of application personalities when the demand monitoring period is 1 ms. In that case, applications complete 77% faster than an equivalent SW-only solution. While our results show less improvement for a longer monitoring period (33% faster), the set of resident accelerators is then more diverse. That improves the likelihood that a short-running function will have an accelerator available as soon as it is called.

In Section 5.4.7, we identified context switch hazards as a challenge for HW/SW workload partitioning. Our results show a 9.0% improvement in utilisation when the workload is rebalanced dynamically to compensate for such hazards.

Our results show that dividing a HW partition into many small jobs and simultaneously submitting them to the framework improves performance by 9.0% when accelerator replication is expected.

Our research shows that short-running tasks can be accelerated in multitasking environments if the response time to changing demand is minimised, and also if a diverse set of resident accelerators is maintained.


Chapter 6

Conclusion

This chapter highlights how my research outcomes meet the research aims set out in Chapter 1. It then concludes by proposing directions for future work.

6.1 Applications for tightly-coupled systems

In Chapter 3, I detailed a case study of accelerating the OS SW task scheduler. The seL4 scheduler is heavily optimised for a short execution time, hence we were sceptical of the benefit that HW acceleration would provide. Nevertheless, we were able to reduce execution time by 5.5% and the variance in the execution time by 58%.

While that chapter presents just one concrete application, the study demonstrates that a task that could not otherwise be accelerated can benefit when a tight coupling of CPU and FPGA is provided.


6.2 Communication models for engineers

In Chapter 3, I presented an in-depth evaluation of the available communication methods provided by the tightly-coupled Zynq-7020 SoC. That was followed in Chapter 4 with metrics that quantify the latency and required CPU execution cycles for short transfers between CPU and FPGA. With that work, SW engineers can choose the most appropriate method of communication for their application.

In Chapter 4, I presented a model for partitioning a workload between HW and SW in a single-user environment. Given the metrics provided in that chapter and the expected throughput of both CPU and accelerator, I demonstrated that SW engineers can calculate the HW/SW partitioning ratio that leads to a completion time that is within 2% of the empirical optimum. SW engineers can also use the developed model to predict, with 12% MRE, the execution time of workloads of varying size.
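As a schematic illustration only (the notation below is ours and omits details of the Chapter 4 model), the partitioning ratio can be derived by requiring the SW and HW shares of a workload of size W to complete at the same time once signalling latency is included:

\[
\frac{(1-\alpha)\,W}{R_{\mathrm{sw}}} \;=\; t_{\mathrm{sig}} + \frac{\alpha\,W}{R_{\mathrm{hw}}}
\quad\Longrightarrow\quad
\alpha \;=\; \frac{R_{\mathrm{hw}}\left(W - R_{\mathrm{sw}}\,t_{\mathrm{sig}}\right)}{W\left(R_{\mathrm{hw}} + R_{\mathrm{sw}}\right)},
\]

where R_sw and R_hw are the CPU and accelerator throughputs, t_sig is the CPU-FPGA signalling cost, and the HW fraction α is clamped to [0, 1] so that the task remains entirely in SW when the overhead outweighs the benefit.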

In Chapter 5, I extended the model presented in Chapter 4 to include many accelerators and queue wait time. Although that model worked well in the absence of CPU core time sharing, we identified SW context-switch hazards as a limitation of our model. Since we cannot predict when and for how long SW execution will be stopped, we must periodically monitor progress and repartition the work if a context switch is detected.
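A minimal, self-contained sketch of that periodic check appears below; the names are illustrative rather than the framework's API. The idea is simply that a shortfall in SW progress beyond some tolerance is treated as evidence of a context switch, and the remaining work is shifted towards HW.

    #include <stdio.h>
    #include <stddef.h>

    struct partition { size_t sw_items, hw_items; };

    /* If SW completed noticeably fewer items than planned for this monitoring
     * period, assume a context switch stalled it and shift the shortfall to HW. */
    static void rebalance(struct partition *p, size_t sw_planned,
                          size_t sw_completed, double tolerance)
    {
        size_t behind = sw_planned > sw_completed ? sw_planned - sw_completed : 0;
        if (sw_planned && (double)behind / (double)sw_planned > tolerance) {
            p->sw_items -= behind;
            p->hw_items += behind;
        }
    }

    int main(void)
    {
        struct partition p = { .sw_items = 6000, .hw_items = 4000 };
        /* Monitoring tick: SW planned 600 items this period but managed only 150,
         * e.g. because it lost its 10 ms time slice to another application. */
        rebalance(&p, 600, 150, 0.25);
        printf("SW items: %zu, HW items: %zu\n", p.sw_items, p.hw_items);
        return 0;
    }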

Along with that work, I provided and evaluated a model for transferring function arguments between SW and HW. That model may be used by SW engineers to select an optimal argument transfer method based on their required transfer size and synchronisation requirements.

6.3 Accelerator sharing for concurrent short-running tasks

Chapter 5 proposed a framework that uses our task partitioning methodology to accelerate many concurrent short-running tasks. With 8 accelerator slots that provide speedups in the range 0× to 10× for perfectly parallel workloads with SW-only execution times in the range 1 ms to 100 ms, we found that our framework improves overall system performance with an average speedup of:

• 88% when one type of task is accelerated;

• 33% when a dynamic set of tasks are accelerated; and

• 77% when the response time to demand is reduced.

Although the best performance was observed when the response time was quick, a slower response time led to a more diverse set of resident accelerators. That diversity increased the likelihood that an accelerator would be immediately available to a task, thereby avoiding reconfiguration delay.

6.4 Future work

In dynamic, multitasking environments, smaller workloads may benefit from an increased monitoring period. Our experiments have evaluated the response of our framework to changing demands, but smaller workloads will require an unreasonably quick response time to instantiate an accelerator before execution completes in SW. It may be beneficial to maintain the residency of a diverse range of accelerators to improve the likelihood that an accelerator is available for such functions before they are called. Our results show that an increased monitoring period reduces utilisation but increases the diversity of resident accelerators. One avenue for further investigation is to find the optimum monitoring period for different workload sizes, or to find the ideal balance between replication and diversity.

Augmenting the current demand with an anticipated demand may further improve the probability that an accelerator will be available at the time it is needed. We have considered application personalities that are generated using uniformly distributed random numbers. In practice, we expect some shared library functions to be more popular than others. A trivial metric for anticipating demand is the most frequently used function. Others may include function call dependencies, heuristics or machine learning.
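The simplest of these metrics can be sketched in a few lines (illustrative code only): maintain a per-function call counter and prefer to keep the accelerator for the most frequently called function resident.

    #include <stdio.h>

    #define NUM_FUNCTIONS 16

    /* Per-function call counters accumulated at run time. */
    static unsigned call_count[NUM_FUNCTIONS];

    static void record_call(int fn) { call_count[fn]++; }

    /* Anticipated demand: the function called most often so far. */
    static int most_popular_function(void)
    {
        int best = 0;
        for (int fn = 1; fn < NUM_FUNCTIONS; fn++)
            if (call_count[fn] > call_count[best])
                best = fn;
        return best;
    }

    int main(void)
    {
        /* A synthetic call history in which function 3 dominates. */
        int history[] = { 3, 1, 3, 3, 7, 3, 2, 3 };
        for (size_t i = 0; i < sizeof history / sizeof history[0]; i++)
            record_call(history[i]);
        printf("anticipate demand for function %d\n", most_popular_function());
        return 0;
    }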


Although we have shown an average makespan reduction of 77%, we have also found that system resources are underutilised.

We chose a trivial reconfiguration algorithm (Section 5.4.7) to evaluate our framework. This algorithm could be improved by applying heuristics to decide whether or not an accelerator should be replaced to serve a new function. A simple first step is to take the historic demand for each accelerator into account.

Knowledge of function dependencies can be used to reduce idle time by speculatively reconfiguring slots. For example, image decompression may commonly be followed by image resizing for rendering. The accelerator for resizing the image can be instantiated early such that it is ready for use as soon as the resize function is called. Dependencies can be determined on- or offline and controlled by either the kernel module or the shared library. The kernel module can monitor the utilisation of the decompression accelerator and ensure that a resize accelerator will also be configured. The library itself could request the appropriate accelerator for the next processing stage shortly before the current stage completes.
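A sketch of this idea, with hypothetical names, is a static table of likely successors that is consulted as each stage nears completion.

    #include <stdio.h>

    #define NUM_FUNCTIONS 16
    #define NO_SUCCESSOR  (-1)

    /* Hypothetical dependency hints: e.g. decompression (function 4) is usually
     * followed by resizing (function 9). Unlisted entries default to 0 (no hint). */
    static const int likely_next[NUM_FUNCTIONS] = {
        [4] = 9, [9] = NO_SUCCESSOR,
    };

    /* Stand-in for a request to the reconfiguration manager. */
    static void request_accelerator(int fn)
    {
        printf("speculatively configuring accelerator for function %d\n", fn);
    }

    static void on_stage_almost_done(int fn)
    {
        int next = likely_next[fn];
        if (next > 0)              /* 0 and NO_SUCCESSOR both mean "no hint" */
            request_accelerator(next);
    }

    int main(void)
    {
        on_stage_almost_done(4);   /* decompression finishing: prefetch resize */
        return 0;
    }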

Our partitioning algorithm imposes a maximum CPU execution time before rebalancing the work between SW and HW (Section 5.4.7). Future work might investigate varying this maximum to explore the trade-off between computational overhead and resource utilisation. Although the accelerators may then spend more time idle, the CPU overhead of using the framework would be reduced, freeing the CPU to process more meaningful work.

If the execution time overhead of our argument transfer approach is significant on other platforms, a future direction of research could be alternate approaches for concurrent, multi-user CSR access. Such work would investigate the optimum level of CSR replication and the synchronisation primitives that should be used to protect them. Of course, the overhead of synchronisation may be higher than the overhead of an OS system call – we may find it best to move CSR access back into the OS.


Appendix A

Table A.1: Zynq-7020 CPU overheads, measured in CPU cycles, for short SM writes followed by a DSB to ensure completion and SEV signalling.

Target            Words = 1    2    3    4    5    6    7    8
In-order execution
L1 Cache                 24   24   25   26   27   28   29   30
L2 Cache                 36   33   35   35   37   37   39   39
OCM (DE)                 34   38   40   44   46   50   52   56
OCM (SO)                 43   62   85  104  127  146  169  188
RAM (DE)                 31   34   37   40   43   47   58   61
RAM (SO)                 63  102  145  184  225  264  305  344
FPGA (DE)                31   34   37   40   43   46   91   98
FPGA (SO)                97  179  256  326  409  485  565  641
Out-of-order execution
L1 Cache                 24   24   25   26   27   28   29   30
L2 Cache                 36   33   35   35   37   37   39   39
OCM (DE)                 33   37   39   43   45   49   51   55
OCM (SO)                 42   64   84  106  126  148  168  190
RAM (DE)                 31   34   37   40   43   47   58   62
RAM (SO)                 63  104  144  185  224  265  304  345
FPGA (DE)                31   34   37   40   43   46   96   99
FPGA (SO)                96  175  249  331  408  490  570  643


Table A.2: Zynq-7020 CPU overheads, measured in CPU cycles, for short reads from various targets.

Target            Words = 1    2    3    4    5    6    7    8
In-order execution
L1 Cache                  7    8    9   10   11   12   13   14
L2 Cache                 28   41   54   67   80   93  106  119
OCM (DE)                 25   29   31   35   46   50   52   56
OCM (SO)                 25   47   67   89  109  131  151  173
RAM (DE)                 73   78   85   91  142  149  156  159
RAM (SO)                 73  139  210  281  343  416  485  556
FPGA MMIO (DE)           76   86  103  119  145  164  178  188
FPGA MMIO (SO)           76  146  205  278  343  413  478  542
Out-of-order execution
L1 Cache                  1    1    2    2    3    3    4    5
L2 Cache                 12   25   38   51   64   77   90  103
OCM (DE)                  9   13   15   19   30   34   36   40
OCM (SO)                  9   31   51   73   93  115  135  157
RAM (DE)                 57   63   69   73  126  130  139  145
RAM (SO)                 57  125  194  260  327  400  462  540
FPGA MMIO (DE)           60   70   87  103  129  148  158  172
FPGA MMIO (SO)           60  130  189  262  327  397  461  526

Table A.3: Zynq-7020 CPU overheads, measured in CPU cycles, for short writes to various targets.

Target            Words = 1    2    3    4    5    6    7    8
In-order execution
L1 Cache                  7    7    8    9   10   11   12   13
L2 Cache                 19   16   18   18   20   20   22   22
OCM (DE)                 16   20   22   26   28   32   34   38
OCM (SO)                 25   47   67   89  109  131  151  173
RAM (DE)                 14   17   20   23   26   30   40   45
RAM (SO)                 45   86  125  166  205  246  285  326
FPGA (DE)                14   17   20   23   26   29   79   86
FPGA (SO)                85  164  241  317  385  476  565  626
Out-of-order execution
L1 Cache                  1    1    2    2    3    3    4    5
L2 Cache                 14    1    2    2    4    4    6    6
OCM (DE)                  2    4    6   10   12   16   18   22
OCM (SO)                  9   31   51   73   93  115  135  157
RAM (DE)                  1    1    4    7   10   14   24   29
RAM (SO)                 29   70  109  150  189  230  269  310
FPGA (DE)                 1    1    4    7   10   13   63   70
FPGA (SO)                69  145  225  298  369  468  549  610
