Hardware Accelerated Design Space Exploration for Application Specific MPSoCs

Isuru Nawinne

A thesis in fulfillment of the requirements for the degree of

Doctor of Philosophy

School of Computer Science and Engineering

Faculty of Engineering

The University of New South Wales

June 2016

THE UNIVERSITY OF NEW SOUTH WALES
Thesis/Dissertation Sheet

Surname or Family name: Nawinne

First name: Isuru    Other name/s: Bandara

Abbreviation for degree as given in the University calendar: PhD

School: School of Computer Science and Engineering
Faculty: Faculty of Engineering
Title: Hardware Accelerated Cache Design Space Exploration for Application Specific MPSoCs

Abstract (350 words maximum)

The performance of a computing system heavily depends on the memory hierarchy. Fast but expensive cache memories are commonly employed to bridge the increasing performance gap between processors and memory devices. Benefits drawn from a cache vary significantly with the diverse memory access patterns of software application programs, especially in the domain of embedded systems. Modern embedded processors acknowledge this relation between applications and caches, by incorporating cache memories which are configurable at design-time.

Design space exploration of caches in an application specific system is a difficult problem, which typically takes days to solve, if not weeks, using software-based techniques. The problem becomes more complex for multiprocessor systems with hierarchical caches, executing many application programs. A typical such design space can be of vast proportions containing up to several trillions of unique design points, which is infeasible to be accurately explored using existing techniques. This dissertation presents a design space exploration framework which uses hardware accelerated simulation to quickly determine the best set of cache configurations for a multiprocessor cache hierarchy. The proposed framework was able to achieve up to 456 times faster simulation times compared to the fastest known software-based simulator, with similar accuracy in cache access time. Further, a novel exploration algorithm is presented, which was able to improve the cache access times by up to 18.9%, while reducing total cache size by up to 74.15% at the same time. A new run-time concept is introduced, called switchable cache, where a processor can select from multiple pre-defined cache configurations, leveraging the abundant transistors available due to what is known as the dark silicon phenomenon. An architecture to enable seamless integration of multiple cache configurations is described. A novel design space exploration algorithm is presented to rapidly pre-determine the optimal set of configurations at design-time, for a given group of applications. The use of Answer Set Programming, which guarantees optimal solutions for NP-Hard problems, is explored to reliably solve the switchable cache tuning problem.

Declaration relating to disposition of project thesis/dissertation

I hereby grant to the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or in part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all property rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation.

The University recognises that there may be exceptional circumstances requiring restrictions on copying or conditions on use. Requests for restriction for a period of up to 2 years must be made in writing. Requests for a longer period of restriction may be considered in exceptional circumstances and require the approval of the Dean of Graduate Research.

ORIGINALITY STATEMENT

‘I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgement is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.’

Signed ……………………………………………......

Date ……………………………………………......

COPYRIGHT STATEMENT

‘I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or here after known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstract International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.'

Signed ……………………………………………......

Date ……………………………………………......

AUTHENTICITY STATEMENT

‘I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.’

Signed ……………………………………………......

Date ……………………………………………......

Abstract

The performance of a computing system heavily depends on the memory hierarchy. Fast but expensive cache memories are commonly employed to bridge the increasing performance gap between processors and memory devices. Benefits drawn from a cache vary significantly with the diverse memory access patterns of software application programs, especially in the domain of embedded systems. Modern embedded processors acknowledge this relation between applications and caches, by incorporating cache memories which are configurable at design-time.

Design space exploration of caches in an application specific system is a difficult problem, which typically takes days to solve, if not weeks, using software-based techniques. The problem becomes more complex for multiprocessor systems with hierarchical caches, executing many application programs. A typical such design space can be of vast proportions containing up to several trillions of unique design points, which is infeasible to be accurately explored using existing techniques.

This dissertation presents a design space exploration framework which uses hardware accelerated simulation to quickly determine the best set of cache configurations for a multiprocessor cache hierarchy. The proposed framework was able to achieve up to 456 times faster simulation times compared to the fastest known software-based simulator. Further, a novel exploration algorithm is presented, which was able to improve the cache access times by up to 18.9%, while reducing total cache size by up to 74.15% at the same time.

A new run-time concept is introduced, called switchable cache, where a processor can select from multiple pre-defined cache configurations, leveraging the abundant transistors available due to what is known as the dark silicon phenomenon. An architecture to enable seamless integration of multiple cache configurations is described. A novel design space exploration algorithm is presented to rapidly pre-determine the optimal set of configurations at design-time, for a given group of applications. The use of Answer Set Programming, which guarantees optimal solutions for NP-Hard problems, is explored to reliably solve the switchable cache tuning problem.

Publications

Isuru Nawinne, Haris Javaid, Roshan Ragel and Sri Parameswaran. Switchable Cache: Utilizing Dark Silicon for Application Specific Cache Optimizations. IET Computers & Digital Techniques, IET, 2016

Isuru Nawinne, Haris Javaid, Roshan Ragel, Swarnalatha Radhakrishnan and Sri Parameswaran. Exploring Multi-Level Cache Hierarchies in Application Specific MPSOCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 34(12), Pages 1991-2003, IEEE, 2015.

Isuru Nawinne, Josef Schneider, Haris Javaid and Sri Parameswaran. Hardware-Based Fast Exploration of Cache Hierarchies in Application Specific MPSoCs. In Proceedings of the International Conference on Design, Automation & Test in Europe (DATE‘14), Article No. 283, European Design and Automation Association (EDAA), 2014.

Isuru Nawinne and Sri Parameswaran. A Survey on Exact Cache Design Space Exploration Methodologies for Application Specific SoC Memory Hierarchies. In Proceedings of the 8th IEEE International Conference on Industrial and Information Systems (ICIIS‘13), Pages 332-337, IEEE, 2013.

Acknowledgements

My journey as a graduate student started nearly four years ago. When I took my first steps the end seemed so far away, illusive and an intimidating prospect. The path I had to tread was treacherous and full of obstacles. Time and again I saw golden glimpses of my destination, much like Bilbo Baggins saw the Lonely Mountain over the tops of Mirkwood trees. Now that the journey has reached its conclusion with the writing of this thesis dissertation, I have many a person to be thankful to who aided me along the way.

First and foremost, my most sincere gratitude goes to my supervisor Prof. Sri Parameswaran for his continuous support throughout my term as a PhD student under him, for his patience, motivation, enthusiasm, and profound knowledge. He guided me to be a researcher with integrity and strengthened my belief in not giving up. His enthusiasm in effectively communicating ideas taught me to be a good presenter. I watched him carefully nurture and inspire the extraordinary in every student, just like a good gardener patiently cares for his plants, enjoying watching them grow and blossom. Sri gave me the opportunity and guidance to become a good teacher, which, considering it is my career of choice, I'm confident will be immensely helpful to me in the future. I appreciate the support he gave every time I needed it. He was a great mentor to me in my studies as well as in life in general. I enjoyed every little chat we had sharing his immense experience, which opened my eyes to a great many things. I could not have asked for a better supervisor, and I hope I would be just as good to my students in the days to come.

I shall convey my heartfelt gratitude to Dr. Roshan Ragel, who has been a caring mentor and an inspiration to me in the past and in the present. He gave direction to the young researcher in me, and has been there every time I needed guidance. It is from him that I learnt that there is always another answer than the ones given to you, which is such a simple yet liberating perception of reality. And the numerous hours spent discussing university education with him helped me become a more insightful academic. I will always be thankful to him and his wife Dr. Swarnalatha Radhakrishnan for all the care bestowed on me.

One of the compelling reasons I decided to pursue my graduate studies at UNSW was the vibrant constitution of personalities in the embedded systems research group. I first joined the group as a research associate soon after graduating with my Bachelor’s degree, thanks to the incredible opportunity given by Dr. Jude Angleo Ambrose. Since then I have had the privilege of working alongside several brilliant young engineers and scientists of the finest calibre. To name a few who positively affected my time in the group: Haris Javaid who helped me complement my own work from different perspectives and guided me to hone my skills in scientific writing; Jorgen Peddersen who inspired me to think critically and whom I shared many enjoyable board game sessions with; Josef Schneider whom I depended on in my foray into FPGA design and whose zestfulness and good humour kept the group uplifted; and Mahanama and Pasindu who helped me in so many ways as great friends, and shared all those lunch time chats where we envisioned great plans to make the world a better place, which I earnestly hope will bear fruit as time unfolds. My sincere thanks go to all members of the research group, the time spent with whom will be cherished.

I would like to express my gratitude to Terasic Technologies who provided the FPGA equipment, and to Sam Swiss for helping me with the development of the parameterized cache component, both of which were instrumental in performing the experiments presented in this thesis.

Most importantly, I dedicate this thesis to my family who were there for me throughout my time as a graduate student and in life in general, especially the moments when I needed them the most. I’m grateful to my parents and my sister who always made me believe in myself. I lovingly thank my mother for being the greatest source of courage and inspiration to me, and my father for teaching me to face life with a smile. My partner, Sumudu, deserves special gratitude from me for being a colossal strength to me through these four years which we had to spend apart, without whom my journey would not have been a reality, let alone a success. I’m truly grateful to her for bearing with me when I was stressed, and for her loving care and understanding, which helped me persevere with my work. I dearly hope I have made them all proud.

I thank my aunt Freda and uncle Stanley for everything they did for me when I first moved to Sydney for my studies. Thanks go to all my good friends, especially Hiranya, Gihan and Dhanushka for taking care of me when I came down with sickness, and for all the wonderful times we shared at No 4 Botany street, the memories of which will always be cherished.

Lastly, I shall end the acknowledgements by quoting a set of verses by William Ernest Henley titled Invictus, which I turn to for inspiration.

Out of the night that covers me,
Black as the pit from pole to pole,
I thank whatever gods may be
For my unconquerable soul.

In the fell clutch of circumstance
I have not winced nor cried aloud.
Under the bludgeonings of chance
My head is bloody, but unbowed.

Beyond this place of wrath and tears
Looms but the Horror of the shade,
And yet the menace of the years
Finds, and shall find me, unafraid.

It matters not how strait the gate,
How charged with punishments the scroll,
I am the master of my fate:
I am the captain of my soul.

Abbreviations

SoC System on Chip

MPSoC Multiprocessor System on Chip

FPGA Field Programmable Gate Array

NP-Hard Non-deterministic Polynomial Time - Hard

DRAM Dynamic Random Access Memory

WCET Worst Case Execution Time

ILP Integer Linear Programming

RTL Register Transfer Level

HDL Hardware Description Language

MRU Most Recently Used

LRU Least Recently Used

FIFO First In First Out

MRA Most Recently Accessed

MRE Most Recently Evicted

CLT Central Lookup Table

GPU Graphics Processor Unit

CUDA Compute Unified Device Architecture

VHDL Very-High-Speed-Integrated-Circuit Hardware Description Language

PCIe Peripheral Component Interconnect Express

SMP Symmetric Multiprocessor

USB Universal Serial Bus

CPU Central Processing Unit

DSE Design Space Exploration

FP Forward Pass

BP Backward Pass

ALM Adaptive Logic Module

ASP Answer Set Programming

CP Constraint Programming

Nomenclature

B - Block Size

S - Set Size

A - Associativity

Pe - e-th Processor Core

P - Number of Processor Cores

Li - i-th Cache Level

N - Number of Cache Levels

Cij - j-th Cache on Level Li

Mi - Number of Caches on Level Li

Kijk - k-th Configuration for Cache Cij

Dij - Number of Configurations in the Sub-Design-Space of Cache Cij

HRijk - Hit Rate (Application Dependent) of Cache Configuration Kijk

HLijk - Hit Latency of Cache Configuration Kijk

MLijk - Miss Latency of Cache Configuration Kijk

ULijk - Update Latency of Cache Configuration Kijk

HEijk - Hit Energy of Cache Configuration Kijk

UEijk - Update Energy (after a miss) of Cache Configuration Kijk

CTOT - Total Number of Caches in the System

DTOT - Total Number of Cache Hierarchy Design Points

Tijk - Average Cache Access Time (Per Access and Application Dependent) of Cache Configuration Kijk

TTOT - Total of Average Cache Access Times over all L1 Caches

Kijkmin - Configuration with minimum Tijk in the Sub-Design-Space of Cache Cij

NS - Number of Allowed Switchable Cache Configurations

Ai - i-th Application Program in the Pool of Applications

NA - Number of Application Programs in the Pool

Kj - j-th Candidate Cache Configuration

NCC - Number of Candidate Cache Configurations

HRij - Hit Rate for Application Ai on Cache Configuration Kj

HLj - Hit Latency of Cache Configuration Kj

ULj - Update Latency of Cache Configuration Kj

Tij - Average Cache Access Time (Per Access) for Application Ai on Cache Configuration Kj

Tm - Access Time for Main Memory

fi - Normalized Frequency of Occurrence for Application Ai

Tavg - Average Tij over all Applications

NU - Number of Unique Cache Configurations in a Selection

Tclock - Clock Cycle Time

HEj - Hit Energy of Cache Configuration Kj

UEj - Update Energy of Cache Configuration Kj

Eij - Average Cache Access Energy (Per Access) for Application Ai on Cache Configuration Kj

Em - Access Energy for Main Memory

Eavg - Average Eij over all Applications

Contents

1 Introduction 1

1.1 Cache Design Space Exploration ...... 3

1.2 Design Space Exploration of Multiprocessor Multi-Level Cache Hierarchies ...... 6

1.3 Application Specific Cache Optimizations by Exploiting Dark Silicon ...... 9

2 Cache Basics 13

2.1 Structure of a Cache ...... 13

2.2 Hierarchical and Shared Cache Organizations ...... 16

2.3 Cache Design Spaces ...... 17

3 Literature Review 19

3.1 Types of Cache Simulation Methods ...... 22

3.2 Trace-Driven Cache Simulation Techniques ...... 25

3.3 Hardware Assistance in Cache Simulation ...... 39

3.4 Exploring Multiprocessor Cache Hierarchies ...... 43

3.5 Cache Optimizations in Multi-Programmed Environments ...... 50

4 Hardware Acceleration for Multiprocessor Cache Simulation 53

4.1 Target Multiprocessor System Architecture ...... 55

4.2 Design Space Exploration Methodology ...... 56

4.2.1 Hybrid Simulation Framework ...... 57

4.2.2 Selection of Cache Configurations ...... 58

4.3 Implementation of hSim ...... 63

4.4 Experimental Setup ...... 69

4.5 Test Results ...... 71

4.6 Summary ...... 79

5 Iterative Design Space Exploration of Multi-Level Caches 80

5.1 Target Multiprocessor Multi-Level Cache Hierarchy ...... 83

5.2 Problem Formulation ...... 84

5.3 Iterative DSE Methodology ...... 85

5.3.1 Cache Analysis ...... 86

5.3.2 Algorithm ...... 87

5.3.3 Convergence Criteria ...... 92

5.3.4 Case for Hardware-Accelerated Simulation ...... 93

5.3.5 Hardware-Accelerated Simulation Process ...... 94

5.4 Experiments ...... 96

5.4.1 Fairness of Comparison ...... 100

5.5 Test Results ...... 100

5.5.1 Convergence ...... 101

5.5.2 Simulation Times ...... 112

5.5.3 Stability and Empirical Optimality ...... 113

5.5.4 Alternative Iteration Policies ...... 118

5.6 Summary ...... 120

6 Dark Silicon and Application Specific Cache Optimizations 121

6.1 Introduction ...... 121

6.2 Switchable Cache Architecture ...... 125

6.3 Switchable Cache Tuning ...... 132

6.3.1 Problem Formulation ...... 133

6.3.2 Analysis ...... 134

6.3.3 Exploration Algorithm ...... 136

6.4 Experiments & Results ...... 141

6.5 Discussion ...... 148

6.5.1 Optimizing for Energy ...... 148

6.5.2 Extended Usage Scenarios for Switchable Cache ...... 149

6.6 Summary ...... 154

7 Answer Set Programming in Cache Design Space Exploration 155

7.1 Introduction ...... 155

7.2 Related Applications of ASP ...... 157

7.3 Problem Formulation ...... 159

7.4 Answer Set Programming (ASP) ...... 160

7.4.1 Overview ...... 160

7.4.2 Problem Encoding in ASP ...... 162

7.5 Experiments & Results ...... 166

7.5.1 Comparison of ASP & Heuristic Searches ...... 167

7.5.2 ASP Search Strategies & Parallelism ...... 171

7.6 Summary ...... 176

8 Conclusion 177

8.1 Future Work ...... 181

8.1.1 Simulating Cache Coherency ...... 181

8.1.2 Future of Run-Time Cache Switching ...... 183

Bibliography 185

List of Figures

1.1 Profiles of: (a) Execution time; and (b) Energy consumption for G721 encoder on a Tensilica Xtensa processor using different cache configurations, by Shwe et al. [SJP13]. Maximum hits correspond to the largest configuration...... 5

1.2 An example MPSoC with seven caches arranged in three levels, forming a vast design space...... 6

2.1 Structure and organization within a cache memory...... 14

2.2 Hierarchical cache organizations in uniprocessor and multiprocessor systems...... 16

3.1 Overview and flow of memory access trace driven simulation methods. 25

3.2 Simulation data structures used by Janapsatya et al. in [JIS06]. . . . 27

3.3 Example CLT data structure used by Haque et al. in SCUD algorithm [HPJP12]: A CLT contains an entry for each memory block in the system. Every CLT entry is associated with records which represent the different cache set sizes in the simulation. Records indicate the availability of the memory block in a group of configurations with the same set size and varying associativities...... 33

3.4 Overview of SPCE algorithm by Viana et al. in [VGRBV08]...... 36

3.5 T-SPaCS algorithm by Zang et al. in [ZGR11] (Si - set size for level i, B - block size, Ai - associativity for level i, C - conflict tables, b - number of simulated block sizes)...... 38

3.6 Operation of the FPGA cache simulator in [SPP14a]...... 41

3.7 DIMSim algorithm by Haque et al. in [HRA+12] ...... 46

4.1 Shared memory multiprocessor architecture with private L1 caches and a shared L2 cache ...... 55

4.2 Multiprocessor memory hierarchy with four processors (P1 to P4), four L1 caches (C1,1 to C1,4) and one L2 cache (C2,1) ...... 56

4.3 Hybrid simulation platform where cache hit rates are calculated on FPGA...... 57

4.4 Graphical overview of the simulation methodology flow, described in Algorithm 1, used to explore the design space of a two-level multi- processor cache hierarchy and determine suitable configurations. . . . 62

4.5 Connection interfaces and operation overview of the hardware simulator (hSim) module...... 63

4.6 Internal implementation of the cache simulator core: (a) example top level of the simulator core, with a maximum set size of eight and maximum associativity of four; (b) complete pipeline inside a simulator core for set sizes 8-to-2 and associativities 4-to-1...... 65

4.7 Detailed schematic symbol showing all signals for the hSim module as implemented in Altera Qsys system integration tool [Altb]. Widths of the address and data signals are configurable...... 67

4.8 Connection and usage of hSim in an MPSoC on FPGA: (a) multiple hSim modules connected in the positions of private L1 caches; (b) a single hSim module connected in place of a shared L2 cache, to simulate the corresponding sub-design-spaces ...... 68

4.9 Energy Consumption against Access Time for private L1 cache configurations, in Experiment 1...... 72

4.10 Energy Consumption against Access Time for private L1 cache configurations, in Experiment 2...... 73

4.11 Energy Consumption against Access Time for shared L2 cache configurations, in Experiments 1 and 2, based on the selected L1 configurations ...... 75

5.1 Effects of changing a cache’s configuration on the explorations in adjacent cache levels...... 81

5.2 An example architecture for the target MPSoC memory hierarchy. P1 to P4 represent processors and C1,1 to C3,1 represent caches organized in three levels ...... 83

5.3 Overview of the forward pass (FP), where assistance of FPGA hardware is used for parallel design space explorations on each cache level ...... 94

5.4 Example use of hardware simulators (hSim) in level L2 of a cache hierarchy. Components hSim2,1 and hSim2,2 work in parallel to simulate sub-design-spaces of the two shared L2 caches ...... 95

5.5 Interface and structure of the hardware simulator (hSim) module. . . 96

5.6 Altera DE5-NET FPGA board used in the experimental setup. . . . . 98

5.7 Results from Test A1. Changes in selected configuration sizes for the caches Ci,j at the design point reached in each iteration step ...... 101

5.8 Results from Test A1. Changes in resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step...... 103

5.9 Results from Test A1. Changes in TTOT at the design point reached in each iteration step...... 103

5.10 Results from Test B1. Changes in selected cache configuration sizes for the caches Ci,j at the design point reached in each iteration step ...... 105

5.11 Results from Test B1. Changes in resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step...... 106

5.12 Results from Test B1. Changes in TTOT at the design point reached in each iteration step...... 106

5.13 Changes in selected cache configuration sizes for the caches Ci,j at each iteration step, in (a) Test A2 and (b) Test B2 where exploration starts at level LN. Final design points reached are the same as those of Tests A1 and B1 respectively ...... 110

5.14 Number of iterations taken to re-stabilize when an offset is manually introduced to the originally selected design point for System A..... 114

5.15 Results from Test C1. Changes in: (a) selected configuration sizes for the caches Ci,j; (b) resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step...... 116

5.16 Design space in System C, showing the optimal design point. Vertical axis denotes TTOT. Horizontal axis denotes the total cache size of the hierarchy...... 117

5.17 Changes in selected cache configuration sizes for the caches Ci,j at each iteration step, in Test A10 where Round Robin traversal of cache levels is used. Final design point reached is the same as that of Test A1...... 119

6.1 Average cache access time for a group of four applications (adpcm, bzip2, fft, fdct) when using variable and fixed cache configurations. . 123

6.2 Implementation of the switchable cache...... 126

6.3 Example switchable cache use cases. Each application uses its optimal cache configuration...... 127

6.4 Detailed schematic symbol showing all signals for the switchable cache as implemented in Altera Qsys...... 129

6.5 Example scenario with eight application programs and four switchable cache configurations. More than one application sharing the same cache configuration (Applications B and E share cache configuration 2 to achieve best performance)...... 132

6.6 (a) Search tree node structure. (b) Example of tree level expansion. . 138

6.7 Average cache access time against chip area for the switchable cache in Group A. Each design point represents a set of selected cache configurations. (a) Complete design space. (b) Optimal and Pareto-optimal points. (c) Speed-up for a given area budget, over using largest fixed cache out of all applications’ individual optimal configurations. . . . . 144

6.8 Average cache access time against average cache access energy for the switchable cache in Group A. Each design point represents a set of selected cache configurations. (a) Complete design space. (b) Optimal and Pareto-optimal points. (c) Speed-up for a given energy budget, over using largest fixed cache out of all applications’ individual optimal configurations...... 146

6.9 Energy-Delay-Product per cache access against chip area for the switchable cache in Group A. Each design point represents a set of selected cache configurations. (a) Complete design space. (b) Optimal and Pareto-optimal points...... 147

6.10 Potential usage of a multi-port switchable cache in multiprocessor system. Application B migrates from CPU 2 to CPU 4, while still using the same cache configuration 3...... 150

6.11 Overview of a switchable cache with multiple data/address ports. . . 151

6.12 Example of switching caches between different phases in an applica- tion’s execution...... 151

6.13 Example of using cache switching in a pipelined multiprocessor system ...... 152

7.1 Comparison of search times for application groups 8, 9 and 10. . . . . 170

7.2 Comparison of search times for application group 8, using different ASP search strategies and multiple threads...... 173

7.3 Search times spent to: (a) find the optimal solution; (b) verify the optimality of the solution...... 174

List of Tables

4.1 Applications used in the Experiments ...... 70

4.2 Simulated Configurations for Private L1 Caches and Shared L2 Cache 70

4.3 L1 Cache Configurations with Minimum E and T from Experiment 1 74

4.4 L1 Cache Configurations with Minimum E and T from Experiment 2 74

4.5 L2 Cache Configurations with Minimum T and E from Experiments 1 and 2 ...... 75

4.6 Total Estimated Memory Access Energy and Time for Applications . 76

4.7 Simulation Times to Calculate Hit Rates in Hardware ...... 77

5.1 Applications Executed in System A ...... 97

5.2 Applications Executed in System B ...... 97

5.3 Design Space Parameters for Systems A and B ...... 99

5.4 Selected Design Point in Test A1 ...... 104

5.5 Selected Design Point in Test B1 ...... 107

5.6 Results in Comparison ...... 108

5.7 Explored Portion of Design Space ...... 109

5.8 Simulation Times when using Hardware Assistance ...... 112

5.9 Offset Design Points from Latin Hypercube Sampling ...... 114

5.10 Design Space Parameters for System C ...... 115

5.11 Optimal Design Point for System C ...... 118

6.1 Overheads of Switchable Cache ...... 128

6.2 Candidate Cache Configurations ...... 128

6.3 Average Cache Access Times ...... 130

6.4 Application Groups in Experiments ...... 141

6.5 Design Space Exploration Results - Solutions ...... 141

6.6 Design Space Exploration Results - Statistics ...... 142

7.1 Candidate Cache Configurations ...... 166

7.2 Application Groups ...... 167

7.3 Search Times and Optimality ...... 168

7.4 ASP - Search Times (in Minutes) for Multiple Threads & Search Strategies ...... 172

Chapter 1

Introduction

Recent advancements in digital electronics technology have benefitted computer processors in many ways. Modern day processors are able to operate at extreme clock frequencies and low voltage levels, and draw minute amounts of energy. They are able to incorporate increasing numbers of transistors on board, which enables various optimizations that enhance performance. However, the performance of memory systems associated with processors has not quite scaled in a similar manner. Hence, the memory is often identified as a performance bottleneck in most systems. Over the years, caching of data and instructions has emerged as a popular and reliable solution to alleviate the impact of slow memory devices, and bridge the performance gap between processor and memory.

Caches are expensive but fast memory devices that can hold a subset of data close to the processor for efficient access. Conceptually, caches provide enhanced memory access performance through principles of temporal and spatial locality, which derive from the notions that recently used data are likely to be reused, and that adjacent data blocks are likely to be used in sequence. The organization of data within a cache is determined by a set of parameters: block size, set size and associativity. Different values of these parameters create unique structures which are known as cache configurations. How fast a cache can be accessed or updated is directly related to the cache’s configuration. Moreover, whether an access to a cache memory is a hit (data being available in the cache) or a miss (data not being available in the cache) depends on the cache’s configuration.

The sequence of memory accesses generated by a processor depends not only on the architecture of the processor itself, but also on the software application program that is being executed. Numerous research works [GRZVD04, JIS06, SJP13] have shown that the performance benefits drawn from the same cache configuration differ significantly between different application programs, especially in the domain of embedded systems. Conversely, the cache hit rate sustained by a given program varies between different cache configurations. The application dependent behaviour of cache memories has given rise to various design optimizations. Complementing many microarchitectural enhancements on caches, finding the cache configuration that caters best for a given application program stands as a dominant design decision for application specific embedded computing systems.

Modern day off-the-shelf Systems-on-Chip (SoCs) recognize the need to customize various on-chip components to suit the embedded application program under design. Most SoC and processor manufacturers, such as ARM [ARM] and Cadence Tensilica Xtensa [XTE], provide designers with the facility to select the values for the cache parameters block size, set size and associativity. Hence the designer is tasked with analysing the application program to determine which cache configuration would allow the optimal performance.


1.1 Cache Design Space Exploration

The design space of a cache consists of hundreds of different configurations, each providing a unique hit latency (time to determine whether an access is a hit). Furthermore, each cache configuration sustains a unique hit rate for a given application program. Cache design space exploration involves assessing the suitability of a collection of cache configurations for a given application, with the aim of determining the best configuration. Finding out the hit rates for all candidate cache configurations is an essential and integral part of the cache design space exploration process.
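As an illustration, the following is a minimal Python sketch of how such a sub-design-space can be enumerated before simulation (the parameter ranges below are hypothetical, not the exact candidate sets used in the experiments of later chapters):

```python
from itertools import product

# Hypothetical parameter ranges for one cache's sub-design-space.
block_sizes = [4, 8, 16, 32, 64]                # bytes per block
set_sizes = [2 ** n for n in range(1, 12)]      # 2 .. 2048 sets
associativities = [1, 2, 4, 8]                  # direct mapped .. 8-way

# Every (B, S, A) triple is one candidate configuration; each candidate
# must be simulated against the application's memory accesses to obtain
# its hit rate.
sub_design_space = list(product(block_sizes, set_sizes, associativities))
print(len(sub_design_space))  # 5 * 11 * 4 = 220 candidate configurations
```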

Testing an application program with all the cache configurations in a design space, in actual hardware, is an inefficient exercise with respect to time and resources. Simulating the behaviour of individual cache configurations to calculate the hit rates is indeed possible, and is a common approach in practice as well as in research, albeit a tedious process. While software based simulation techniques [Hil, JIS06, HPJP10, VGRBV08] tend to be heavily time consuming, state-of-the-art hardware accelerated simulation techniques [SPP14a, SPP14b] provide a much faster alternative.

The major obstacle for accurately performing such simulations is the extraction of memory access traces of an application. A significant length of time is required to generate the sequence of memory accesses through an instruction set simulation, and traces generated as such require massive storage space. For example, 72 hours were spent on extracting the memory access trace [NSJP14] for encoding 24 low resolution images into one second long MPEG2 format video, using a software encoder running on a Tensilica Xtensa processor [XTE]. The generated trace contained over 12 billion memory accesses (each access associated with a memory address and access type - read or write), which took 129.4GB worth of storage space.
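A rough consistency check on those figures (assuming each trace record stores approximately a word-sized address plus an access-type flag) is:

$$\frac{129.4\ \text{GB}}{12 \times 10^{9}\ \text{accesses}} \approx 11\ \text{bytes per access}$$

which is why traces for even modest embedded workloads quickly reach hundreds of gigabytes.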


Specifically due to the issue of lengthy trace extraction, designers in practice often resort to using fractional samples of memory accesses for the simulations, in order to maintain a reasonable time-to-market. Memory access trace samples are obtained only from critical sections of application execution. However, using sampled memory access traces directly affects the accuracy of the design space exploration, as the behaviour of a cache over the entirety of application execution is not covered in the simulations; more often than not the exploration ends with sub-optimal results.

At this point, it is worthwhile noting a common misconception, which states that a larger cache will always provide better performance. This claim rarely holds true, as Shwe et al. show in [SJP13]. Figure 1.1 shows execution time and energy consumption profiles for the G721 encoder application running on a Tensilica Xtensa processor using different cache configurations. The configuration providing the fastest execution time is marked by a red triangle, and the configuration consuming the least energy is marked by an orange diamond in Figure 1.1. Neither of the above configurations coincides with the largest cache configuration in the design space, which sustains the most cache hits and is marked by a yellow square.

Larger cache configurations always sustain more cache hits than smaller configurations, as more data can be accommodated in the cache. However, more cache hits do not necessarily guarantee better performance. The latency taken to make a single cache access grows with increasing cache sizes, which ultimately leads to degraded overall performance as well as high energy and chip area costs. The experimental data presented in this dissertation show that relatively smaller cache configurations are more likely to provide optimal access times on average for application dependent embedded systems. Therefore, accurately exploring cache design spaces is crucial in achieving optimal memory access performance at low costs.
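The trade-off can be made concrete with a common average-access-time model (a hedged sketch in the notation of the Nomenclature; the exact cost model used in later chapters also accounts for update latencies):

$$T = HR \cdot HL + (1 - HR)\cdot ML$$

A larger configuration typically raises the hit rate $HR$, but it also raises the hit latency $HL$ (and the energy per access), so $T$ can grow even as the number of hits improves.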


[Figure 1.1 legend: unique cache configuration; configuration with minimum execution time; configuration with maximum cache hits; configuration with minimum energy consumption.]

Figure 1.1: Profiles of: (a) Execution time; and (b) Energy consumption for G721 encoder on a Tensilica Xtensa processor using different cache configurations, by Shwe et al. [SJP13]. Maximum hits correspond to the largest configuration.


1.2 Design Space Exploration of Multiprocessor Multi-Level Cache Hierarchies

Multiprocessor Systems on Chip (MPSoCs) are becoming increasingly common in embedded devices. MPSoCs typically feature multiple cache memories organized in a hierarchical manner. Commonly used hierarchical memory architectures involve private first level caches (one for each processor core) and shared caches in subsequent levels, as depicted in Figure 1.2.

[Figure 1.2 annotations: four CPUs with private L1 caches, two shared L2 caches, one shared L3 cache and main memory. No. of caches: 7; candidate configurations per cache: 10; design points for the hierarchy: 10^7; traces required for simulation per design point: 7; total no. of traces: 7 × 10^7.]

Figure 1.2: An example MPSoC with seven caches arranged in three levels, forming a vast design space.

The performance of any given cache in a hierarchy is affected by the configurations of other caches, due to the relationships between connected caches. Therefore, the complete design space of an MPSoC cache hierarchy is the cross product of the individual sub-design-spaces of all the caches in the hierarchy, thus containing billions or even trillions of unique design points. For example, for the cache hierarchy shown in Figure 1.2 with seven caches, if only 10 candidate configurations are considered per each cache's sub-design-space, the overall cache hierarchy design space contains a total of 10^7 unique design points to select from. The typical sized design spaces explored in this dissertation contain more than 10 trillion design points. As pointed out in Section 1.1, exploring the sub-design-space of an individual cache can be time consuming by itself; therefore exploring a cache hierarchy design space becomes a far more tedious exercise.
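In the notation of the Nomenclature, the cross product can be written as:

$$D_{TOT} = \prod_{i=1}^{N} \prod_{j=1}^{M_i} D_{ij}$$

so the seven-cache hierarchy above, with $D_{ij} = 10$ for every cache, yields $D_{TOT} = 10^{7}$ design points.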

When exploring a multiprocessor multi-level cache hierarchy, a more difficult challenge presents itself at the trace extraction process. For the cache hierarchy shown in Figure 1.2, seven different memory access traces (one for each cache) are required to simulate one design point. This means 7 × 10^7 different access traces (each of which can typically be several hundred gigabytes in size and take several hours to be generated) in total are needed for the complete design space. Thus, simulating each and every design point is impractical as it would take years. For the same reason, thorough explorations on such design spaces have seldom been attempted. Even state-of-the-art methods [HRA+12, HKH+13] resort to using a single memory access trace (combined from all processors) to evaluate all cache sub-design-spaces, which severely compromises accuracy for design time and inevitably leads to sub-optimal results.

Hardware accelerated simulation is a recent advent in research across many domains [CPN+09], where time intensive software components in a process can be substituted with specialized hardware. The most recent advancements in cache simulation research [SPP14a, SPP14b] present fast hardware accelerators to perform cache hit counting simulations on individual cache design spaces.


This dissertation presents the first hardware-based framework to rapidly perform cache hit counting simulations on multiprocessor multi-level cache hierarchies, with improved accuracy over state-of-the-art software-based methods. The proposed framework involves using an FPGA (Field Programmable Gate Array) device in the design process, which accommodates the specialized hardware components. Importantly, the presented hardware-based approach completely eliminates the need to pre-extract memory access traces, which is a major limiting factor for existing methods, by seamlessly integrating hardware simulator components into the MPSoC under investigation. Moreover, the access patterns on shared caches are effectively captured in real-time simulation. In the experiments, the proposed framework was able to achieve up to 456 times faster simulation times compared to the fastest known software-based multiprocessor multi-level cache simulator [HRA+12].

Furthermore, a novel design space exploration algorithm is presented in this dissertation, which, combined with the hardware simulation framework, can achieve higher exploration accuracy. The algorithm enables an unprecedented portion of the vast multiprocessor multi-level cache hierarchy design space to be explored, through a carefully crafted set of stages. The experiments show that the new exploration algorithm was able to improve an MPSoC’s average cache access time by up to 18.9%, while reducing total cache size by up to 74.15% at the same time, compared to previous techniques. Provided experimental data include an extensive set of tests in order to assess the optimality of the proposed algorithm.

While enabling improved results in terms of cache access time and cache size, perhaps a more important aspect of the proposed hardware-based framework and algorithm is that a thorough and accurate exploration of a generic multiprocessor multi-level cache hierarchy design space is made practically feasible, without compromising time-to-market. The methodology allows cache hierarchies with more than two levels of caches and with typical cache parameter ranges to be explored in reasonable time, which is highly desirable but was lacking in existing methods.

1.3 Application Specific Cache Optimizations by Exploiting Dark Silicon

Improvements in manufacturing technologies have enabled smaller sized transistors over the years. Even though the operating voltage thresholds have also reduced, the power consumption per transistor has not scaled down as much. Today's technology allows billions of transistors to be placed on the same silicon die. It is widely expected that future silicon chips will contain transistors in such abundance that keeping a significant portion of them, let alone a whole chip, powered at the same time may not be possible without encountering concentrated overheating. Thus, the majority of a chip would have to be kept powered down, for the chip to function safely. This phenomenon is referred to as Dark Silicon.

Taylor [Tay12] has predicted that, by the year 2020, a staggering 93.75% of a silicon chip design will have to be kept dark (powered off) at a time. In the light of this phenomenon, researchers have proposed various techniques to exploit the Dark Silicon on a chip to perform application specific optimizations [BJS+14, CX13, CMP+14, TRGM13].

As discussed in Section 1.1, memory access performance of the same cache configuration varies between different application programs, due to contrasting memory access patterns. In embedded systems that execute a number of different applications on the same processor, using a fixed cache configuration prevents all applications from achieving optimal memory access performance. Contrastingly, having the ability to use distinct cache configurations for different applications executed on the same processor can allow significant performance gains.

Run time re-configurable caches provide the facility to change the internal organization of a cache memory while the system is in operation. However, re-configurable caches can only change between a handful of inter-dependent cache configurations, thus failing to include the optimal configurations for many application programs within the re-configurable design space. Moreover, the extra logic circuitry required to enable run time re-configuration creates overheads in terms of critical path delays and excess power consumption, which are not desirable in the context of Dark Silicon.

This dissertation describes an architecture, called switchable cache, where a single cache memory device can consist of several configurations separately within itself, by leveraging available Dark Silicon. The presented architecture proposes to keep only one cache configuration in Bright Silicon (i.e. powered on) and the rest in Dark Silicon (i.e. powered off), based on the application program under execution. Therefore, every application could use the pre-determined optimal cache configuration each time the application is executed. Experimental data show that switchable caches can significantly improve overall memory access performance while imposing negligible overheads.

The Dark Silicon budget available for caching purposes will be limited when system-wide optimizations aim to collectively exploit the benefits. In realistic situations where the number of different configurations in the switchable cache is limited by the availability of Dark Silicon, but a higher number of application programs are to be executed by the processor, selecting the optimal set of configurations for the switchable cache becomes a new design problem. For instance, if eight applications are to be executed on the system, but only four cache configurations can be accommodated by the switchable cache, the ideal four configurations should be identified at design time based on the memory access behaviour of the group of eight applications.

The design space for such an optimization problem could easily grow to massive proportions, containing several trillions of design points and growing exponentially with the number of programs and the number of candidate cache configurations. A problem instance with eight application programs, four switchable cache configurations and a pool of 315 candidate cache configurations to be selected from forms a design space with 26.38 trillion design points. The tuning of the switchable cache can be identified as an NP-Hard optimization problem, since the knapsack problem can be understood as a special case of it, and solutions by means of conventional methods are extremely time intensive.
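Using the notation of the Nomenclature, the tuning problem can be sketched as choosing an assignment $\sigma$ of applications to candidate configurations (an illustrative formulation only; the problem is formalized in Chapter 6):

$$\min_{\sigma}\; T_{avg} = \sum_{i=1}^{N_A} f_i \, T_{i\,\sigma(i)} \quad \text{subject to} \quad \bigl|\{\sigma(i) : 1 \le i \le N_A\}\bigr| \le N_S,\qquad \sigma(i) \in \{1,\dots,N_{CC}\}$$

that is, every application $A_i$ is mapped to one of the $N_{CC}$ candidate configurations, while at most $N_S$ distinct configurations may be instantiated in the switchable cache.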

A new design time algorithm is presented in this dissertation to rapidly pre-determine the optimal or a near-optimal set of switchable cache configurations, which substantially improves the overall cache access performance for a given group of application programs. Using the data provided by the hardware cache simulators, the proposed heuristic algorithm could quickly find the solution, in under two seconds for most experiments. The presented work is the very first in the direction of switchable caches and their associated design space exploration problem.


To enhance the robustness of the solution for switchable cache tuning, the design tool should be able to guarantee the optimality of the design space exploration. Thus, an alternative search methodology is presented by employing Answer Set Programming, which is a declarative logic programming technique that is primarily aimed at solving difficult NP-Hard problems with guaranteed optimality.

Design optimization of cache memories is a broad topic which encompasses a wide range of problems, most of which involve exploring design spaces of massive proportions. Satisfactorily solving such problems in reasonable time requires innovative design space exploration methods, and the solutions enable application-specific processing systems to gain improved performance and alleviate memory-related bottleneck issues. Moreover, the ability to quickly explore vast cache design spaces in a thorough manner can allow designers and researchers to gain invaluable insight. In the following chapters, this dissertation presents novel design space exploration methodologies and associated hardware implementations, which make quickly solving difficult cache design optimization problems practical and feasible.

Chapter 2 provides a brief overview of cache memory fundamentals. A comprehensive survey of literature on cache design space exploration is presented in Chapter 3. Chapter 4 presents the implementation details of the hardware-based multiprocessor cache simulation framework, while Chapter 5 describes a novel design space exploration algorithm which uses the simulation framework. In Chapter 6, the switchable cache architecture is presented and the associated design space exploration problem is solved using a heuristic algorithm. Using answer set programming to solve the switchable cache tuning problem is discussed in Chapter 7. Finally, Chapter 8 concludes the dissertation and provides directions for furthering cache design space exploration research.

Chapter 2

Cache Basics

A cache memory is intended to hold a subset of data from memory close to the processor for efficient access. Caches become beneficial due to the temporal and spatial localities within application programs, where most recently used data are likely to be reused and adjacent data elements are likely to be accessed in sequence. In the following sections, basic concepts and terminology with regard to cache structure, organizations and hierarchical systems will be discussed, which will be used in describing the design space exploration methods in upcoming chapters.

2.1 Structure of a Cache

The structure and size of a cache memory, as mentioned in Chapter 1, are governed by the parameters Block Size (B), Set Size (S) and Associativity (A). Figure 2.1 depicts the basic internal organization of a cache memory device, with respect to the above parameters.


[Figure 2.1 shows a cache organized as a set size of S = 8 sets, each containing A = 4 blocks (one per associative way), with the block size, set size, associativity, sets and associative ways labelled.]

Figure 2.1: Structure and organization within a cache memory.

In caching, data is fetched, stored and evicted as units called blocks. Block Size denotes the size of a data block in number of bytes. Typical block sizes range from four bytes (as most systems use four byte words) up to and over 256 bytes in certain cases. Larger block sizes can better exploit spatial locality, as more adjacent data words will be fetched to the cache anticipating future use, although causing the bus traffic and energy consumption to increase.

As caches are designed to hold a subset of data from memory, several memory block addresses are mapped to the same cache block location, and any given memory block address can be mapped to either one or a small fixed number of cache locations. Associativity defines the number of such locations within a cache that a given mem- ory block can be mapped to. An associativity of one (A = 1) means every memory

14 2. Cache Basics block maps to one fixed location in the cache, known as a direct mapped cache. Higher levels of associativity (as depicted in Figure 2.1) allows one-to-many map- ping for memory blocks, and essentially decreases the chance of one cached block to be evicted to make space for a new one. Associativities of two, four or eight are commonly used, while much higher levels can also be seen in practice, although rare. The storage arrays for the additional degrees of associativity is referred to as ways (e.g. 4-way set associative). Associative ways are searched in parallel to find a matching address tag for a cache access, therefore higher degrees of associativity impose heavy logic overheads, delays and higher energy consumption.

A set is the location (or the collection of locations) that a given memory block may be mapped to in a cache. The parameter Set Size denotes the number of such sets in the cache structure, which essentially determines the size of the cache. Set size can range from one up to and over 1024 for general purpose use, depending on the system. A set size of one with a very high degree of associativity is known as a fully associative cache, where the mapping of memory blocks to cache locations is any-to-any.
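For a byte address $a$, the mapping implied by these parameters can be sketched as follows (assuming $B$ and $S$ are powers of two):

$$\text{offset} = a \bmod B, \qquad \text{index} = \lfloor a / B \rfloor \bmod S, \qquad \text{tag} = \lfloor a / (B \cdot S) \rfloor$$

and the data capacity of the configuration is $B \cdot S \cdot A$ bytes.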

Since there are several locations available for a given memory block to reside in within set associative caches, a decision has to be made on which location is to be evicted to make space for a fetched data block. A common policy for this decision is for the Least Recently Used (LRU) block to be evicted, which supports temporal locality, and requires the chronological order of accesses to the cached blocks to be maintained in records. The other most common block replacement policy is First In First Out (FIFO), which does not exploit temporal locality as well as LRU, but is far simpler to implement as much less record keeping is involved.
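The following is a minimal Python sketch of this behaviour, in the style of a software trace-driven simulator (illustrative only; the hardware-based simulator described in Chapter 4 counts hits very differently):

```python
class CacheSet:
    """One set of an A-way set-associative cache with LRU replacement."""

    def __init__(self, associativity):
        self.associativity = associativity
        self.tags = []  # ordered from most recently used to least recently used

    def access(self, tag):
        """Return True on a hit, False on a miss; update LRU order either way."""
        if tag in self.tags:
            self.tags.remove(tag)        # hit: promote to most recently used
            self.tags.insert(0, tag)
            return True
        if len(self.tags) == self.associativity:
            self.tags.pop()              # miss on a full set: evict the LRU block
        self.tags.insert(0, tag)         # place the newly fetched block
        return False


def count_hits(addresses, block_size, set_size, associativity):
    """Trace-driven hit counting for one configuration (B, S, A)."""
    sets = [CacheSet(associativity) for _ in range(set_size)]
    hits = 0
    for addr in addresses:
        block = addr // block_size                 # memory block number
        index, tag = block % set_size, block // set_size
        if sets[index].access(tag):
            hits += 1
    return hits
```

Implementing FIFO instead only requires leaving the recency order untouched on a hit.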


2.2 Hierarchical and Shared Cache Organizations

More than one cache memory device can be employed between a processor and main memory to improve memory access performance. Figure 2.2 depicts typical hierarchical cache organizations in uniprocessor and multiprocessor systems. The cache levels closer to the processor are referred to as upper levels and the cache levels closer to the memory are referred to as lower levels. The lowermost cache is sometimes called the last level cache. It is common for the upper level caches to be embedded into the same chip die as the processor, hence achieving very low access latencies. Last level caches are typically much larger than upper level caches, and may therefore be placed off chip.

[Figure 2.2 shows a uniprocessor hierarchy with private L1, L2 and L3 caches alongside a multiprocessor hierarchy with per-processor private L1 caches, shared L2 caches and a shared L3 cache, both backed by main memory.]

Figure 2.2: Hierarchical cache organizations in uniprocessor and multiprocessor systems.

Hierarchical caches may implement either inclusive or exclusive relationships between adjacent cache levels. In inclusive hierarchies, all data cached in an upper level cache forms a subset of the data cached in the connected lower level cache. Thus, a cache miss from an upper level always results in data being requested from the next lower level, which in turn could either be a hit or a miss. In exclusive hierarchies, which are less commonly implemented, two connected cache levels hold mutually exclusive sets of data. Therefore, a miss at either level triggers an access to the main memory.
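For an inclusive two-level hierarchy, the effective access time seen by the processor can be sketched as follows (a hedged approximation using per-level hit rates and hit latencies, and the main-memory access time $T_m$ from the Nomenclature; the detailed per-configuration model appears in Chapter 5):

$$T \approx HL_{L1} + (1 - HR_{L1})\left[ HL_{L2} + (1 - HR_{L2})\, T_m \right]$$

which is one reason the configurations chosen for adjacent levels cannot be tuned in isolation.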

Shared memory multiprocessor systems use generic cache hierarchies as shown in Figure 2.2, where processors can use private caches as well as share access to lower level caches. Shared caches result in contention between processors (or upper level caches) to gain concurrent access, which is dependent on the memory access behaviour of the application programs being executed. Systems which implement Von Neumann or Modified Harvard architectures can employ caches which hold both data and instruction blocks, which are known as unified caches. In Modified Harvard machines, unified caches can typically be seen providing shared access at lower levels, while upper level caches tend to be private.

2.3 Cache Design Spaces

The design space of a single cache memory is composed of the parameters block size, set size and associativity. Every design point in such a space represents a unique cache configuration with a particular combination of values for the above parameters. A typical design space can have several hundred such configurations, each of which exhibits unique access and update latencies as well as a unique hit rate for a given memory access sequence.

Due to the interactions between connected caches, the design space of a generic inclusive hierarchy is the cross product of the sub-design-spaces of all individual caches. A design point in such a space gives a unique cache hierarchy where every cache has a fixed configuration from its sub-design-space. With each cache's sub-design-space consisting of three dimensions, the design space of the cache hierarchy is composed of a massive number of dimensions and usually contains billions or even trillions of unique design points.
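For illustration only (the per-cache count of roughly 300 configurations and the five-cache hierarchy are assumptions, not figures from the thesis), the cross product already reaches the trillions for a modest hierarchy:

```latex
\left| \mathcal{D}_{\text{hierarchy}} \right| \;=\; \prod_{c \,\in\, \text{caches}} \left| \mathcal{D}_{c} \right|
\;\approx\; 300^{5} \;\approx\; 2.4 \times 10^{12}
```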

Chapter 3

Literature Review

The speed difference between processor and memory is an age-old problem in computer systems. Improvements in DRAM (Dynamic Random Access Memory) technology have always evolved at a slower rate than those of processor logic [HP11, MV99, KGB96], making computation increasingly faster and cheaper relative to storage and communication. Therefore, processors could operate at much higher frequencies, and the gap with memory speed only continued to increase over the years. This essentially meant a processor has to wait a long time (several processor clock cycles) for a memory access request to be serviced. Even though processor frequency scaling has eventually reached technological limits, the advent of multiprocessors and many-core systems only increased the pressure on the memory side. Systems with more processor cores suffer more from additional memory contention issues, making the memory an even more severe performance bottleneck.

A myriad of solutions have been proposed throughout the years to improve memory access performance. The vast majority of the solutions involve either increasing memory bandwidth [MV99] or making efficient use of the available bandwidth, using techniques such as Dynamic Access Ordering [MWL95], Logic-DRAM integration [PAC+97] and Address/Data Compression [FP91, LDK99, BM96]. Caching of data is by far the most prominent solution among all, where small but fast memory devices are used ahead of main memory in order to provide efficient access to a subset of data. Caching becomes significantly beneficial due to the temporal and spatial localities present in application programs (see Chapter 2), cutting down the number of accesses to the main memory by over 99% in some cases.

Cache memories are in widespread use, covering a broad spectrum of computing systems including general purpose computer processors, high performance systems and embedded systems. Accordingly, research over the past few decades contains numerous works focusing on cache related optimizations [SD95, KCDM98, GG04].

Due to the power and performance critical nature of modern embedded systems, the necessity to find the optimal (or near optimal) cache configuration has become an important issue in computer systems design. Modern embedded systems can have one or more processor cores on board and execute different classes of application programs including multi-media, security, compression, control and productivity among others. Sequences of memory accesses made by application programs are often highly diverse. Therefore, different applications sustain different cache hit rates from the same cache configuration, and the same application can achieve varying hit rates from different cache configurations [GRZVD04, JIS06, SJP13]. Finding the cache hit rate of a particular configuration for a given application program, known commonly as cache simulation, is therefore an important component when evaluating a cache memory. As discussed in the following sections, the process of cache simulation is a lengthy exercise which requires massive amounts of storage space as well as time.

Even though the hit rate is a major constituent of cache performance, the hit rate alone is not a sufficient performance measure. Attributes such as cache access latency, update latency, etc. also vary between different cache configurations, complicating the matter of selecting an optimal cache. For example, a set associative cache usually provides more cache hits than a direct mapped cache does, but the access latency of a set associative cache is comparatively higher. In [SJP13], Shwe et al. demonstrate that the cache configurations achieving the most cache hits provide neither the best application execution time nor the best energy efficiency (see Figure 1.1). Thus, proper design space explorations are required to identify cache configurations which provide optimal or near optimal cache access times, taking all attributes into account. The same holds true when the objective of optimization is cache power consumption.

A typical design space of a cache memory is formed based on the cache parameters Block Size, Set Size and Associativity, while other factors such as the replacement policy may also be considered. Definitions of the above parameters can be found in Chapter 2. The size of a cache block usually ranges from 4 bytes up to 256 bytes, in increasing powers of two, in various systems. The set size can typically vary between 1 (for fully associative caches) and 256, in increasing powers of two, and can be even higher in some cases. The value range of associativity is usually between 1 and 16, but fully associative caches may have higher degrees of associativity. With the above three cache parameters, a typical design space of a single cache memory can contain several hundred unique cache configurations.
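As an illustration of the scale involved, the short sketch below enumerates a design space built from the parameter ranges quoted above (the exact ranges in any particular system may of course differ):

```python
from itertools import product

# Parameter ranges quoted in the text, as powers of two.
block_sizes = [2**i for i in range(2, 9)]      # 4 B .. 256 B
set_sizes = [2**i for i in range(0, 9)]        # 1 .. 256 sets
associativities = [2**i for i in range(0, 5)]  # 1-way .. 16-way

design_space = list(product(block_sizes, set_sizes, associativities))
print(len(design_space))  # 7 * 9 * 5 = 315 unique cache configurations
```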


3.1 Types of Cache Simulation Methods

The literature contains a rich body of work on cache design space exploration, of which the majority is on performance estimation of configurations for a single cache. The prior work can be broadly categorized into two classes: analytical methods; and exact simulation methods. Analytical methods such as [BCB74, Rao78, Aga87, LMW96, CR09] involve mathematical modeling and analysis of cache behaviour, for example to estimate the WCET (worst case execution time). Exact simulation methods are based around counting cache hits through simulations. Owing to the high computational demands of simulation techniques, most early works are predominantly analytical.

Works such as [LMW99] by Li et al. use Integer Linear Programming (ILP) to model the behaviour of generic instruction caches. However, ILP-based techniques typically do not scale favourably in terms of analysis time. Theiling et al. [TFW00] propose to perform an analysis of instruction caches based on compiled program executables. The authors use an abstract interpretation approach and classify instructions as all-hit (every access is a cache hit), all-miss (every access is a cache miss), persistent (never evicted from the cache) or non-classified (belonging to none of the other three classes). However, a similar analysis for a data or unified cache would require the data access sequence to be generated a priori.

Timing analysis of data caches has been studied in [WMH+97, FW98]. Ferdinand et al. make use of an abstract interpretation approach for data cache analysis in [FW98]. A major difficulty in data cache analysis using instructions from a compiled application program is precisely predicting the range of data memory addresses accessed by a particular instruction (for example in execution loops). Sen et al. [SS07] attempt to address this issue by partially unrolling loops.


Analytical methods as a whole can provide cache performance estimates reasonably quickly. However, it is particularly difficult to consider the different memory access patterns of application programs through model analysis. Thus, analytical methods fail to properly capture the application-dependent behaviour of caches. In contrast, exact cache simulation techniques focus on providing a more realistic estimate of cache hits by explicitly considering the cache access patterns of application programs.

State-of-the-art exact cache design space exploration techniques can be broadly categorised into three sub-classes: 1) System Simulations [XTM]; 2) Instruction Set Simulations (for example, in SystemC) [LPB06]; and 3) Trace-driven Simulations [GRZVD04, JIS06, TTYO09]. In full system simulation tools such as [XTM], the complete system under investigation (including processors, peripherals, memory, etc. in addition to caches) is evaluated through software simulations. These simulations are typically cycle accurate and often occur at gate level to achieve utmost accuracy, or can be performed at RTL (Register Transfer Level) or behavioural level. Methods such as performance counters are used to record data related to cache behaviour. However, system simulations are very costly to conduct, requiring high end computing resources, and still incur significant simulation times (even several days to complete). Moreover, most system simulation tools are difficult to customize and hard to program with respect to cache configurations.

Instruction set simulators [LPB06] are somewhat similar in operation, and generally work at behavioural level using tools such as SystemC or HDLs (Hardware Description Languages). Only the processor architecture is simulated along with the cache models, instead of the complete system, which allows relatively faster simulation times compared to full system simulations.


While system simulations and instruction set simulators offer precise estimations of application-dependent cache performance, the major drawback of using such simulation tools for cache design space exploration is the need for multiple repetitions. Only a single cache configuration can be evaluated in a single simulation, therefore multiple simulations are required to cover a cache design space in order to identify the optimal configuration. As an individual simulation is costly by itself, having to perform multiple repetitions is not an appealing solution at design time.

In contrast, trace-driven cache simulators do not require processor architecture related information and are the fastest of all, and therefore can be used early in the design process. In trace-driven cache simulations, the cache sub-system is isolated from all other components. Therefore, the simulation itself does not depend on other factors such as the type of the processor used, and can be performed faster compared to the other classes of simulations mentioned earlier. The memory footprint of the application program, in the form of an access trace, is the only necessary input. Even though generating a memory access trace incurs heavy time costs similar to system simulations, only one trace needs to be generated per application program. Once the trace is obtained, the same trace can be used to simulate hundreds of different cache configurations, thereby saving a significant amount of design time.


3.2 Trace-Driven Cache Simulation Techniques

Early trace-driven cache simulation tools such as Dinero IV [Hil] by Hill were widely used in practice to calculate accurate cache hit rates through simulation. The Dinero IV algorithm allows the designer to simulate the behaviour of a specified cache configuration using an extracted memory access trace of a given application executed on a given processor. The configuration could be for either an instruction cache or a data cache, or both. The relevant instruction and/or data access traces should be fed into the simulator accordingly as input, and the simulator provides the cache hit/miss rate for the specified cache configuration, as illustrated in Figure 3.1.

However, Dinero IV simulates one cache configuration at a time, therefore the designer has to run the simulator repeatedly for different cache configurations to find the most suitable configuration for a given application's trace. Typically, a memory access trace for a few seconds of an application can consist of millions or even billions of memory accesses. Therefore, repeatedly executing the simulator algorithm for different cache configurations using such a trace can consume a lengthy amount of time, several hours and even days for large design spaces.
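The style of computation performed by a single-configuration trace-driven simulator can be sketched as below; this is a simplified, hypothetical re-implementation (one configuration per pass, LRU replacement), not the actual Dinero IV tool.

```python
from collections import deque

def simulate(trace, block_size, num_sets, associativity):
    """Count hits/misses for one cache configuration over an address trace (LRU)."""
    sets = [deque() for _ in range(num_sets)]
    hits = misses = 0
    offset_bits = block_size.bit_length() - 1
    for addr in trace:
        block = addr >> offset_bits          # block address
        ways = sets[block % num_sets]        # set the block maps to
        if block in ways:
            ways.remove(block)
            ways.append(block)               # refresh recency on a hit
            hits += 1
        else:
            if len(ways) == associativity:
                ways.popleft()               # evict the least recently used block
            ways.append(block)
            misses += 1
    return hits, misses

# Exploring a design space this way means re-running simulate() once per configuration.
```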


Figure 3.1: Overview and flow of memory access trace driven simulation methods.


Different methods have been proposed to alleviate the high time and space costs of trace-driven simulation. Wang and Baer's approach in [WB91] was to reduce the size of the memory access trace used in the simulation, which lessens the simulation time and greatly reduces the storage space and time required to generate the trace, at the cost of simulation accuracy. Other approaches, such as [HS90] by Heidelberger and Stone, proposed to use the high correlation of activity between different cache sets to reduce the number of re-simulations required.

The literature on cache design space exploration marked a significant milestone when parallel simulation of cache configurations was introduced by Sugumar et al. [SA95]. There, the fundamental idea is to emulate cached address tags for multiple cache configurations simultaneously, reducing the time spent to explore the design space. Sugumar's work [SA95] proposed to efficiently count cache hits for a range of set-associative caches with the same block size, but varying associativities and varying set sizes. Compared to previous simulation techniques, parallel simulation covers a number of cache configurations in a single pass over a given memory access trace. Moreover, storing of the actual cached data is not emulated, in contrast to early works. Instead, cache hits are counted for the configurations under simulation, using generalized binomial tree structures which keep track of address tags.

Extending Sugumar's work, Janapsatya et al. introduced a trace-driven cache simulation method [JIS06] for first level (L1) caches, based on the formulations presented earlier in [HS89] by Hill et al. It is a single-pass simulation method, in the sense that the memory access trace is read just once to evaluate all the configurations. A forest of binomial tree data structures, linked lists and an array are used to model the space of all cache configurations that should be explored, as illustrated in Figure 3.2.



Figure 3.2: Simulation data structures used by Janapsatya et al. in [JIS06].

The array shown in Figure 3.2 contains hit/miss counters for each configuration under simulation and stores pointers to the tree structures for the relevant block size. Each level in a binomial tree structure corresponds to a set of cache configurations with the same block size and set size but varying associativities, and each individual tree node represents a single cache set. Linked lists are connected to each tree node, representing the set-associative cache ways and therefore the different associativities. The linked lists store the address tags which are compared against the access addresses from the input memory access trace, to assess whether each access is a hit or a miss. The address tags stored in the linked lists are sorted according to the access history, from most recently used (MRU) to least recently used (LRU), as shown in Figure 3.2. It should be noted that cached data is not stored in the simulation, as the tags are sufficient to determine whether an access is a hit or a miss.

Visiting all of the tree nodes for each memory address in the access trace would be an exhaustive task which can consume a lengthy amount of time, especially with a vast number of cache configurations. In order to remedy this issue and improve simulation efficiency, Janapsatya et al. make use of two critical observations, initially introduced by Mattson et al. [MGST70] and Hill et al. [HS89].

Property 1: The first observation states that when a cache hit occurs for an address MA in a cache configuration K (block size B, set size S, associativity A), all other configurations K' (block size B, set size S', associativity A) where S' > S, with LRU replacement policy, can also be guaranteed to have hits for address MA.

Property 2: The second observation states that when a cache hit occurs for an address MA in a cache configuration K (block size B, set size S, associativity A), all other configurations K' (block size B, set size S, associativity A') where A' > A, with LRU replacement policy, can also be guaranteed to have hits for address MA.

The above two correlations between cache configurations were first used by Mattson et al. in [MGST70] to find the frequency of accesses to different levels in a memory hierarchy, and by Hill et al. in [HS89] to analyse the effects of associativity on cache miss rate. Janapsatya et al. [JIS06] use the correlations from Property 1 and Property 2 in a manner that allows cache hits/misses to be assessed for a group of configurations at once, enabling rapid evaluation of hit/miss rates: simulating a subset of cache configurations from the design space is sufficient to accurately evaluate the complete design space. This improvement reduces the time complexity of Janapsatya's algorithm considerably, enabling up to 45 times faster simulation on average compared to Dinero IV.
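A minimal way to exploit the two properties is sketched below (an illustrative fragment written for this discussion, assuming LRU replacement and a fixed block size): once a hit is established for one configuration, hits are credited to the configurations implied by Properties 1 and 2 without searching them. Applying both properties transitively also covers configurations that are larger in both dimensions.

```python
def credit_hits(hit_counters, set_sizes, assocs, hit_set_size, hit_assoc):
    """Apply Properties 1 and 2 after a hit in configuration (hit_set_size, hit_assoc):
    the hit also holds for every configuration with a larger set size (same
    associativity) or a larger associativity (same set size), assuming LRU
    replacement and a fixed block size.
    `hit_counters` maps (set_size, associativity) -> hit count."""
    for s in set_sizes:
        for a in assocs:
            if (s >= hit_set_size and a == hit_assoc) or \
               (s == hit_set_size and a >= hit_assoc):
                hit_counters[(s, a)] += 1
```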

Janapsatya’s work employs analytical cache models for timing and energy consump- tion, to quantify performance and energy measures and assess suitability of the ex- plored cache configurations. The equations incorporate the calculated exact cache hit/miss rates in to finding the memory access latencies and consumed energy.

The work in [TTYO09] by Tojo et al. later proposed additional improvements to the approach of Janapsatya et al. [JIS06]. Tojo's work utilizes the cache inclusion property presented in [MGST70] to define a new heuristic. The cache inclusion property states that a cache configuration K' is a subset of a cache K if all the contents of K' are included in the contents of K. It can be observed that any cache configuration can automatically be a subset of another configuration with a higher number of sets. Therefore, the heuristic is constructed as follows:

Property 3: When a cache hit occurs for an address MA in a direct mapped cache configuration K (block size B, set size S, associativity 1), all other configurations K' (block size B, set size S', associativity A) where S' > S can also be guaranteed to have hits for address MA.

Based on Property 3, Tojo et al. proposed a modified algorithm [TTYO09] called CRCB1, which further reduces the subset of cache configurations that need to be simulated in order to explore the complete design space, without compromising the accuracy. Further, the authors of [TTYO09] extend their heuristic to cover additional ground, using the observation described below:

Property 4: Consecutive accesses to the most recently accessed memory address MA are guaranteed to result in hits for all the cache configurations K (block size B, set size S, associativity A) where S ≥ 1, B ≥ 1 and A ≥ 1.

Property 4 can essentially be viewed as a generalization of Property 3. This observation is used in the CRCB2 algorithm, which is added on top of CRCB1, and reduces the number of cache hit/miss assessments in the simulation by a significant amount. The CRCB2 approach is claimed to provide, on average, 1.8 times faster trace-driven cache simulation compared to Janapsatya's method [JIS06].

Haque, Janapsatya et al. proposed enhancements to the original algorithm from [JIS06] in their subsequent work [HJP09] called SuSeSim (Super Set Simulator). There, two additional correlations among cache configurations were observed in the design space, which could be used to further reduce the total simulation time.

Property 5: When a cache miss occurs for an address MA in a cache configuration K (block size B, set size S, associativity A), all other configurations K' (block size B, set size S', associativity A), where S' < S, are also guaranteed to have misses for address MA.


Property 6: For an address MA, a cache hit in a configuration K (block size B, set size S, associativity A) implies that cache misses will occur in the more recently used cache ways of all configurations K' (block size B, set size S', associativity A) where S' < S.

In contrast to Properties 1 to 4, which are used to evaluate cache hits, Properties 5 and 6 aid in evaluating cache misses in a group of configurations. Consequently, the simulation algorithm in SuSeSim [HJP09] counts cache misses for each configuration rather than cache hits. Haque's method therefore takes a bottom-up approach, where the cache configurations with a higher number of sets are evaluated first, which enables smaller set sizes to be covered automatically. Haque et al. make use of doubly linked lists to store cache tags and incorporate forward and reverse search functions when searching for tag matches. This allows the tag searching to be 16% faster on average and the overall algorithm to be 33% faster than the method in [TTYO09].

When designing embedded processor systems, the FIFO (First In First Out) cache replacement policy is generally preferred over the LRU (Least Recently Used) replacement policy for set associative caches. This is largely because FIFO replacement is comparatively simpler to implement and consumes less chip area as well as less energy. Building on the previous works, Haque et al. proposed a single-pass cache simulation method [HPJP10] named DEW for caches using FIFO replacement. Several data structure enhancements were introduced in [HPJP10] compared to the previous method [HJP09].

Each cache tag stored in the linked list in [HPJP10] is associated with a wave pointer which points to the corresponding tag in the cache with the next larger set size.


The additional information from the wave pointers allows the simulation algorithm to directly access the location where a cache entry should exist, without having to search through a list. Thus the search time is dramatically reduced compared to the previous approaches.

Also in [HPJP10], a binomial tree node, which represents a cache set in a configuration, is associated with details about the most recently accessed address (MRA) and the most recently evicted address (MRE). Property 4 discussed above states that consecutive accesses to the most recently accessed address are always hits. Therefore it follows that:

Property 7: Access to the most recently evicted memory address MA is always a miss for all the cache configurations K (block size B, set size S, associativity A) with S ≥ 1, B ≥ 1 and A ≥ 1.

Thus, storing the MRA and MRE tags for each cache set allows faster assessment of cache hits and misses, respectively, in the simulation. According to temporal locality, which states that recently accessed addresses are more likely to be accessed again, the MRA address is the most likely to be re-accessed out of all the resident blocks, whereas the MRE address is the most likely to be re-accessed out of all the evicted blocks. The use of Property 7 potentially reduces the search time even further for the simulator. Haque et al. claim that the DEW algorithm is up to 40 times faster than Dinero IV, and at least 8 times faster in the worst case, for MediaBench applications. However, it is worth noting that the above improvements to the simulation time are achieved at the expense of storage space for the simulator.
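The effect of the MRA/MRE bookkeeping can be illustrated with the following fragment (a hypothetical simplification of the DEW fast path, not the published algorithm): the two stored tags are checked before any list traversal or wave-pointer lookup is attempted.

```python
from dataclasses import dataclass, field

@dataclass
class SimSet:
    tags: list = field(default_factory=list)  # resident tags, MRU first
    mra: int | None = None                    # most recently accessed tag
    mre: int | None = None                    # most recently evicted tag

def classify_fast(tag: int, s: SimSet):
    """Fast-path classification for one simulated cache set.

    Assumes `mre` is cleared whenever the evicted block is fetched again.
    Returns "hit", "miss", or None when a full tag search is still required.
    """
    if tag == s.mra:
        return "hit"    # Property 4: repeated access to the MRA is always a hit
    if tag == s.mre:
        return "miss"   # Property 7: access to the MRE is always a miss
    return None         # fall back to searching the tag list / wave pointers
```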



Figure 3.3: Example CLT data structure used by Haque et al. in the SCUD algorithm [HPJP12]: A CLT contains an entry for each memory block in the system. Every CLT entry is associated with records which represent the different cache set sizes in the simulation. Records indicate the availability of the memory block in a group of configurations with the same set size and varying associativities.


The subsequent work of Haque et al. [HPJP12], named SCUD, presented a different approach from their previous methods. It is still a trace-driven exact cache simulation, however the simulation space is observed from the perspective of memory blocks, as opposed to cache locations. Therefore the authors use a data structure named the Central Lookup Table (CLT), as depicted in Figure 3.3.

The simulator in [HPJP12] consists of one CLT for each cache block size simulated, and CLTs are sorted by the block addresses. A CLT contains entries for all memory blocks present in the cache configurations. Each memory block entry is attributed with records, which serve to indicate the availability of the said memory block in different cache configurations, and a count (number of configurations where the block is available). In a single block entry, there are as many records as the number of different cache set sizes in the simulation. Each record contains information about which configurations with different associativities contain the memory block, and the associated count.

In SCUD, Haque et al. use a binomial tree of cache set nodes in association with the new CLT. The binomial tree is similar to the ones in [JIS06, HJP09], and is used to update the CLTs while reading through the memory access trace. Using the CLTs provides the SCUD simulator with the ability to quickly determine whether a memory access is a hit or a miss in all configurations under simulation. This is made possible by the Count value associated with each CLT entry. A count of 0 for a particular block indicates a miss in all configurations, and the count being the highest possible value (which depends on the simulated design space) indicates hits in all configurations.
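The role of the count can be illustrated as follows; this is a simplified, hypothetical rendering of the CLT idea written for this discussion, not the published data structure.

```python
def classify_with_clt(clt_entry: dict, total_configs: int):
    """Classify one access using a CLT-style entry for the accessed block.

    `clt_entry` is assumed to hold a per-configuration availability map
    {(set_size, associativity): bool} and a precomputed 'count' of the
    configurations currently holding the block.
    """
    if clt_entry["count"] == 0:
        return "miss in all configurations"
    if clt_entry["count"] == total_configs:
        return "hit in all configurations"
    # Otherwise the per-record availability flags decide configuration by configuration.
    return {cfg: ("hit" if present else "miss")
            for cfg, present in clt_entry["availability"].items()}

# Example entry for a block resident in 2 of 4 simulated configurations.
entry = {"count": 2, "availability": {(1, 1): False, (1, 2): False,
                                      (2, 1): True,  (2, 2): True}}
print(classify_with_clt(entry, total_configs=4))
```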


Haque et al. claim that the SCUD simulator is on average 19 times faster than Dinero IV for MediaBench applications and 10 times faster for SPEC CPU2000 applications. The downside is that the simulation speedups are obtained at a considerable cost of storage space. Since the simulator stores a vast amount of information associated with each memory block in the CLT for all the configurations, the space complexity increases exponentially when simulating a large number of cache configurations, especially for systems with relatively large memories.

Most of the correlation properties studied above are based on the contents of a smaller configuration being a subset of the contents of a larger configuration. This made it possible to draw conclusions about larger configurations when simulating a smaller one, or vice versa. Haque et al. proposed a set of intersection properties [HPP11] for caches with FIFO replacement, which predict the availability of memory blocks in other configurations subject to certain conditions.

Viana et al. formulated a different trace-driven cache simulator, called SPCE, in [VGRBV08]. Viana’s approach closely resembles Janapsatya’s method, but uses different data structures to evaluate cache hits and misses.

In the SPCE algorithm, Viana et al. determine whether a memory access is a hit or a miss by keeping track of how many unique addresses, mapping to the same cache set, were accessed after the previous reference of the current address. The term Conflicts is used for this count. In other words, once a block is fetched into the cache, the number of conflicts on the same set determines when that block will be evicted from the cache due to a conflict. If the associativity of the concerned configuration is larger than the number of conflicts recorded, then the currently accessed block must still be available in the cache. A set of structures named Conflict Tables is used to analyse the conflicts. One table is created for each degree of associativity under consideration, containing entries for different block sizes and set sizes. Considering all the tables together, there are as many entries as the number of cache configurations explored.

The SPCE algorithm uses a stack structure to keep track of previously accessed memory blocks (Figure 3.4). If an address is not found in the stack, it is pushed onto the top of the stack, and the access is deemed a miss for all configurations. Once an address is found, the stack is scanned to see how many conflicts occurred after it was previously accessed. This determines which levels of associativity allow that block to remain in the cache, and the conflict tables are updated accordingly. The cache inclusion property is used to determine the cache set sizes where hits could occur. The address is then removed and pushed back onto the top of the stack. The final hit and miss rates are calculated from the values in the conflict tables at the end of the simulation.
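The core of the stack analysis can be sketched as below (an illustrative fragment for a single block size and a single set size; the real algorithm considers all block and set sizes and maintains one conflict table per associativity):

```python
def conflicts_since_last_access(stack: list, block: int, num_sets: int):
    """Count blocks mapping to the same set as `block` that were accessed
    since `block` was last referenced (stack is most recent first).
    Returns None if `block` has never been accessed before (a compulsory miss).
    """
    if block not in stack:
        return None
    conflicts = 0
    for other in stack:
        if other == block:
            break
        if other % num_sets == block % num_sets:   # maps to the same cache set
            conflicts += 1
    # The access is a hit in every configuration whose associativity exceeds
    # the conflict count; the corresponding conflict table entries are updated.
    return conflicts
```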


Figure 3.4: Overview of SPCE algorithm by Viana et al. in [VGRBV08].


The formulation of the SPCE algorithm provides the benefit of single-pass simulation of a memory access trace without consuming excessive storage space, albeit with similar space complexity to the binomial tree based methods. However, the approach results in a large number of operations being carried out on the stack structure, which consumes the majority of the simulation time. The results show that the SPCE simulator obtains the miss rates for a given trace 14.88 times faster than Dinero IV on average, for the applications in the Motorola PowerStone benchmark suite. Therefore this method is not as fast as the other simulators discussed above, but is efficient in terms of storage space.

Extending the work by Viana et al., Zang et al. [ZGR11] proposed a stack-based single-pass cache simulator for two-level caches. The major challenge in two-level cache simulation is producing the filtered access trace for the L2 cache. The L2 cache's access trace is comprised of the missed accesses from the L1 cache. A single-pass simulator analyses a vast space of L1 cache configurations simultaneously, and the L2 access trace for each of the L1 configurations is unique. This results in n different L2 cache simulations, where n is the number of simulated L1 cache configurations. Therefore the storage space and simulation time consumption could increase exponentially beyond practical bounds.

In order to avoid this complication with two cache levels, Zang et al. limit their scope to exclusive two-level caches with LRU replacement for L1 and FIFO replacement for L2. In exclusive caches, the content of each cache level is a disjoint set from the other; a cache block in one level is guaranteed not to exist in the other level. This enables the simulator to view the two cache levels as one single cache using the original access trace, with only a minimal loss of accuracy in the L2 miss rate estimation. Figure 3.5 depicts the two-level cache simulator, named T-SPaCS.

37 3. Literature Review


Figure 3.5: T-SPaCS algorithm by Zang et al. in [ZGR11] (Si - set size for level i, B - block size, Ai - associativity for level i, C - conflict tables, b - number of simulated block sizes).

However, the combination of two caches enlarges the stack structure dramatically, which degrades the simulator's performance considerably. In order to remedy this, the authors associate tree and array data structures to determine conflicts faster for different set sizes and different associativities, which in turn increases the space complexity of the algorithm.

Zang et al. continued their work in [ZGR12] by modifying T-SPaCS to simulate unified two-level caches. In unified cache architectures, the instruction and data caches in the first level are separate while the second-level cache hosts both types of blocks. The modified simulator in [ZGR12] is called U-SPaCS. The memory access trace is divided into two separate instruction and data traces, and two stacks are used for these accordingly. Separate analyses are performed for the two L1 instruction and data caches, and the L2 analysis occurs in the event of a miss from either of the L1 caches. Both the T-SPaCS [ZGR11] and U-SPaCS [ZGR12] simulators support only exclusive two-level caches, and do not possess the ability to explore inclusive cache hierarchies.

3.3 Hardware Assistance in Cache Simulation

The major limitation of trace-driven cache design space exploration is the significantly high simulation time consumption, which occurs in two forms. The first is the time taken to extract the memory access trace of an application program executed by a processor, which is a painstakingly slow process. Typically, trace extraction is done by simulating the instruction set of the processor in software, which takes up to several days to generate the access trace for a few seconds of execution of the program. For instance, encoding 24 low-resolution frames with MPEG2 (one second worth of video) [NSJP14] takes 72 hours to extract the memory access trace, which contains 12 billion address references. Extracting the address traces through specialized hardware is faster, but such devices incur extremely high costs and can only be used to extract very small portions of an access trace. The second contribution to the high simulation time consumption comes from the software simulation itself, which could take up to several hours depending on the number of simulated cache configurations and the size of the access trace.


Modern design optimization methodologies involve Hybrid Simulation techniques [CPN+09], where the repetitive and time intensive portions in the design process are accelerated using the assistance of hardware components. Such acceleration methods can be found in the literature of cache simulation and design space exploration. In [LL03] Lu et al. discuss a method to listen to the memory access bus in real-time and use the retrieved trace data to evaluate a cache. Interfacing with the memory bus is implemented using an FPGA (Field Programmable Gate Array) device in [LL03], and additional FPGAs are used to emulate up to four cache configurations.

Similarly, the work in [HNL06] by Hong et al. presents an FPGA-based emulator that models a last-level cache actively and in real-time by listening to the memory accesses generated by a host processor system. Hong's work emulates a single cache configuration at a time, requiring multiple re-simulations for a complete design space exploration. Both Hong's [HNL06] and Lu's [LL03] approaches fall under the category of system simulation, thus requiring repetitive re-simulations to cover a cache design space. Even though the hardware assistance in such methods helps avoid having to extract memory access traces a priori, simulation using multiple FPGA devices connected to the memory bus incurs heavy costs in terms of resources as well as synthesis time.

Han et al. [HXXY11] proposed an acceleration scheme which uses a Graphics Processing Unit (GPU) for the simulation of a cache. CUDA (Compute Unified Device Architecture) processors of NVIDIA [NVI] GPUs are used in [HXXY11] to efficiently emulate cache behaviour. However, Han's method can only simulate up to 6 cache configurations in a single pass over a pre-extracted memory access trace.


The methods in [LL03, HNL06, HXXY11] only analyse a single configuration or a few configurations for a cache in a single simulation run. The need for multiple re-simulations, combined with the high resource costs of hardware acceleration, makes such methods unattractive for efficient cache design space exploration.

In the state-of-the-art work [SPP14a], Schneider et al. proposed a novel solution to alleviate the high time penalties of trace-driven cache simulation, by introducing hardware acceleration to parallel simulation techniques such as [JIS06, HJP09]. Schneider's simulator [SPP14a] is a hardware component designed in VHDL and implemented on FPGA logic fabric. As depicted in Figure 3.6, the simulator operates on an FPGA device and receives the memory access trace from the host computer attached to the FPGA through a PCIe (Peripheral Component Interconnect Express) connection.

The major benefit of Schneider's approach, over previous hardware accelerations, is that a complete design space with several hundred cache configurations can be simulated with a single pass over a memory access trace. The hardware in [SPP14a] is designed to use the LRU replacement policy for set-associative cache configurations.


Figure 3.6: Operation of the FPGA cache simulator in [SPP14a].


The simulation is done in parallel for different degrees of associativity, and the operation of the simulation is pipelined in hardware with respect to different cache set sizes, which significantly improves simulation speed. The structure of Schneider's FPGA simulation core adopts most features of the prior binomial-tree-based simulation methods, especially utilizing Property 2 and Property 5 described in Section 3.2 to reduce the logic footprint used by the simulator on the FPGA, which allows a large number of cache configurations to be encompassed in the design space. Therefore, only the cache sets for the largest simulated set size (the first pipeline stage of the simulation) store the tag information, and decisions for smaller set sizes are inferred using the above inclusion properties.

Schneider et al. were able to demonstrate simulation speedups of up to 53 times using the FPGA cache simulator, compared to the fastest software simulators available, which makes it the fastest design tool presented to date for exploring the cache configuration design space. The presented simulation times are in the order of seconds and milliseconds, which are substantial improvements over the previous cache simulation methods. Additionally, the simulator core itself operating in hardware leaves open the possibility of real-time extraction of memory access information from a processor system working in the same FPGA, eliminating the tedious process of extracting the memory access trace beforehand.

In their subsequent work [SPP14b], Schneider et al. presented a variation of the original FPGA cache simulator, which implements FIFO cache replacement policy for set associative caches. A new set of cache inclusion properties were used in [SPP14b] to reduce the number of cache sets to be simulated in order to cover the given design space. Speedups up to 11 times were reported by the authors, compared to the fastest software based FIFO cache simulator [HPP11].


3.4 Exploring Multiprocessor Cache Hierarchies

Many computing systems have adopted multiprocessors in order to cope with emerging parallel workloads. Modern embedded systems are a forerunner in this respect, where a major shift has occurred towards using Multi-Processor Systems on Chip (MPSoC), which allow overlapped and parallel execution of programs to achieve higher throughputs. Sharing memory address spaces is a preferred way of facilitating communication between programs on a multiprocessor. Among the many shared memory models, the Symmetric Multi-Processor (SMP) model is the most widely used architecture. There, all the processors in the system share a single memory, with partitioned address spaces. The unique feature of this model is that each processor has similar memory access times, as opposed to distributed memory models. Consequently, the recent literature looks at methods to explore the design space of cache hierarchies for such multiprocessor systems.

Multiprocessor caches involve additional complications due to the caching of shared data. When one processor writes to a shared memory block which is already cached by another processor, the entry in the second processor's cache becomes invalid (or stale). Different cache coherence techniques are used to make sure that the contents of all the caches are up to date [Mar08]. Capturing the effects of coherence management in cache simulation is a difficult prospect.

Typical multiprocessor cache hierarchies contain private caches for individual processors (usually at L1), as well as shared caches in lower levels (such as L2 and L3). Such cache hierarchies are mostly inclusive, meaning data present in upper levels is a subset of the data available in lower levels. Therefore, it is virtually impossible to simulate cache configurations for a multiprocessor cache hierarchy in a single pass using a memory access trace. The sub-design-space of each individual cache is dependent on the configurations of the other caches because: the access trace seen by lower-level caches depends on the upper-level cache configurations; and the miss latency experienced by upper-level caches depends on the lower-level cache configurations. Thus, the overall design space of a multiprocessor cache hierarchy is the cross product of the sub-design-spaces of all individual caches, which is of massive proportions. Such a design space can contain several trillions of unique design points, which makes exhaustive explorations practically infeasible.

The majority of the available multiprocessor cache design exploration techniques are system simulations. Iyer presented a simulation tool [Iye03] named CASPER to evaluate a multiprocessor cache hierarchy using extracted memory access traces, where each cache has a fixed configuration. CASPER provides a rich set of cache performance measures in addition to implementing the MESI coherence mechanism [PP84]. Access traces should be pre-extracted from the system under investigation, at every cache in the hierarchy, and fed to the simulator. However, since [Iye03] is a system simulation tool, multiple re-simulations are required to assess different cache hierarchy design points. As discussed earlier, extraction of a single trace is a painstakingly slow process, therefore obtaining multiple traces several times over is a daunting prospect which is practically infeasible.

Similarly, the work of Han et al. [HXXY11], which uses GPU processing to accelerate the simulation process, suffers from having to pre-extract multiple memory access traces. Han's method provides the ability to simulate a set of cache configurations for an MPSoC quickly using a GPU device, once the traces are available.


Lu et al. [LL03] presented a simulation framework where a given multiprocessor cache hierarchy can be emulated in FPGA hardware. The strength of Lu's work is not having to extract access traces, as multiple FPGA devices are used to interface with the memory subsystem of an MPSoC in real-time. Each FPGA hosts a hardware model which acts as a cache and actually stores the data, so the access traces to the lower-level caches can be produced dynamically. However, the framework in [LL03] can only support up to four different caches in a system. Moreover, being a system simulation tool, [LL03] requires multiple re-simulations to assess different design points in the cache hierarchy design space.

Due to the sheer size and complexity of multiprocessor cache design spaces, single-pass trace-driven simulation of such cache hierarchies has seldom been addressed. The literature contains approximate methods, where a minimal number of traces is used and small portions of the design space are explored in order to produce a near-optimal result.

Haque et al. proposed a single-pass trace-driven cache simulation framework for SMP MPSoC architectures in [HRA+12], considering cache coherency at the L1 caches. The method in [HRA+12] assumes a two-level inclusive cache hierarchy, with a shared L2 cache and private L1 caches per processor. The simulator core employed here is derived from CIPRASim [HPP11], using FIFO replacement policy. The aim of Haque's simulator, named DIMSim, is to find a reasonable set of cache configurations which allows the MPSoC to meet required memory access timing constraints.



Figure 3.7: DIMSim algorithm by Haque et al. in [HRA+12]


Figure 3.7 illustrates the simulation flow of the DIMSim algorithm, which is comprised of three distinct stages. A single memory access trace, from the memory's point of view, is used to configure the shared L2 cache at the first stage. The original access trace is comprised of a time-ordered sequence of memory accesses to the main memory.

In order to simulate the configurations for the L1 caches in the system, separate access traces for each individual L1 cache need to be derived from the original memory access trace. This is shown as the secondary trace generation step in Figure 3.7, at the second stage of the simulation flow.

Additional pieces of information are recorded in the secondary traces, which allow the simulator to consider cache coherence for the L1 simulation. For example, every access in the secondary traces is attributed with whether the access was a hit in the L2 configuration selected in Stage 1. Assuming that the two cache levels are inclusive, the misses in the selected L2 configuration are considered to be misses by default in all L1 configurations. Making use of the inclusiveness of the cache hierarchy along with the additional information recorded in the secondary traces, simulations for each L1 cache are carried out in Stage 3.

However, once the L1 caches are selected and in place, the accesses seen by the L2 cache are in reality composed of the cache misses from the L1 caches. Therefore, the access trace used for the L2 simulation in Stage 1 is no longer valid under the final selection of L1 caches. Thus, in this method, the L2 cache configuration selected using the original memory access trace is essentially non-optimal.

The other point worth noting is that the generated secondary traces may not be in correct chronological order when parallel execution of different applications is considered. This means that, with the L1 caches present, the order of accesses to the shared L2 cache could potentially differ from the original memory access trace, owing to the varying hit latencies of the different L1 caches. Thus, the method in [HRA+12] explores only a fragment of the overall design space, and hence cannot guarantee optimal cache performance.

In [HKH+13], Haque et al. perform a similar simulation, where the flow starts at the L1 caches and then combines the L1 access traces to simulate configurations for a shared L2 cache. Optimality still cannot be guaranteed, as the combined access trace used for the L2 simulation is not necessarily the one observed in reality.

Obtaining accurate memory access traces is a vital part of the exact simulation of multiprocessor cache configurations. Thus, extracting the memory access trace from actual hardware is preferable to software simulation of the execution. Wilson et al. based their work in [WJ90] on multiprocessor cache simulation for bus traffic analysis, by obtaining traces from hardware. Rawlins and Gordon-Ross proposed a run-time tuning methodology [RGR11] for reconfigurable data caches in a dual-core processor system. The main objective of the tuner is to reduce the energy consumption of the data caches. It uses a simple algorithm and heuristics where the caches are initialized with the smallest values for all parameters, which are periodically incremented until no further decrease in energy is observed. [RGR11] is a run-time approach and does not provide the designer with information on all concerned cache configurations.

Mariani et al. [MPZS12] evaluated evolutionary algorithms based on various heuristics to explore the multiprocessor cache design space and find an approximated Pareto front. Each design point is individually evaluated in a full system simulation, in contrast to trace-driven cache analysis. Therefore, several hundreds of such simulations are performed in the course of the evolutionary algorithm, even for the smallest of design spaces. While having the potential to find reasonable solutions, performing several hundred system simulations is a slow process taking weeks to complete. As described in [MPZS12], exploring a small design space of 128000 points by assessing only 550 points through system simulations takes 165 hours (at 18 minutes per simulation). Results show that the optimality of the achieved approximated solution varies with the explored fraction of the design space, where accuracies of 83.6% and 97.9% were achieved by exploring respectively 0.28% of a small design space (80 points out of 27648) and 1.95% of a smaller design space (80 points out of 4096).

Due to the high time cost of simulation and trace extraction, all of the above-mentioned works regarding optimization of multi-level multiprocessor cache hierarchies are predominantly limited to exploring two-level hierarchies and to exploring each cache level only once, using a limited number of memory access traces. Therefore, opportunities exist for improving the design space exploration process by speeding up the simulations, while covering generic cache hierarchies with several levels (such as L3) for application specific MPSoCs.


3.5 Cache Optimizations in Multi-Programmed Environments

Many computing systems in the embedded domain are implemented as multi-programmed systems, where the same processor and memory sub-system are used by several different application programs. Due to the application-dependent behaviour of cache performance, achieving the best performance for all programs using the same cache memory is a difficult prospect. Two types of run-time solutions can be found to alleviate this difficulty and obtain better memory access performance. The first is using run-time re-configurable cache memories, where the internal set organization can be dynamically altered. The second is using switchable caches, which house a number of different cache configurations of which only one is kept active at any given time.

Re-configurable caches [GRLC08, ZV03, ZVL04] have been proposed to perform application-dependent cache optimizations at run-time. Reconfiguration may alter the structures of the data and tag arrays to achieve performance gains with various applications. However, re-configurable caches can only represent a limited set of inter-dependent configurations. For example, the re-configurable cache presented in [ZV03] has a maximum size of 8KB and four associative ways (each way being 2KB). A way shut-down technique is used to turn off unwanted cache ways (shutting down three ways gives a 2KB direct mapped cache, shutting down two ways can give 4KB caches, either direct mapped or two-way associative), and a logical way concatenation technique is employed to combine cache ways (combining four ways gives an 8KB direct mapped cache, combining two ways can give either an 8KB two-way associative cache or a 4KB direct mapped cache). While the physical cache block size is 16 bytes, logical cache blocks are used to emulate block sizes of 32 bytes and 64 bytes. However, changing one cache parameter essentially limits the possible values for the other parameters (i.e. if the block size is 16 bytes, then the set size can only be 128, 256 or 512; if the block size is 32 bytes, then the set size can only be 64, 128 or 256; if the associativity is four-way, then the set size can only be 32, 64 or 128). Therefore, not all configurations up to a total size of 8KB are implementable through re-configuration. For example, a block size of 16 bytes cannot co-exist with a set size of 64, and a cache size of 4KB cannot co-exist with four-way associativity. As a result, the re-configurable cache in [ZV03] can only represent 18 different cache configurations. Contrastingly, a typical cache design space can contain over 300 configurations to select from. Hence, a given re-configurable cache may not be able to provide significant performance gains for a wide range of applications.

On the other hand, run-time cache switching provides a simpler way to change between configurations by leveraging the available Dark Silicon on future chips. The number of cache configurations that a switchable cache may hold is limited. However, any configuration from the cache design space may be included, in contrast to re-configurable caches, thereby enabling comparatively higher cache access performance.

For either re-configurable or switchable caches, identifying the optimal set of cache configurations that provides the best memory access performance for a given group of application programs is a vital step in the design process, especially when the number of applications using the cache is higher than the number of allowable configurations. In [VGRK+06], Viana et al. discuss similar analytical methods to determine a configuration subset, out of the small design space of a re-configurable cache, such that design-time tuning tools can be efficiently used to explore the smaller subset.


Subset selection is performed with the aim of reducing energy consumption. The algorithm in [VGRK+06], based on Keogh's heuristic [KCHP01], takes approximately a minute for a small re-configurable cache design space of 40 configurations, and would take approximately 10 minutes for a regular-sized design space.

Since architectures such as switchable caches, which exploit Dark Silicon, have only recently been proposed, the literature lacks robust exploration methods to optimally select groups of cache configurations from a vast design space. Existing hardware-based simulation techniques such as [SPP14b] can rapidly provide applications' cache performance measures for a large space of cache configurations at design-time. With such information at the disposal of designers, better tuning methods could be developed which can quickly determine the optimal cache configurations at design-time, to take better advantage of the available run-time cache optimizations.

Chapter 4

Hardware Acceleration for Multiprocessor Cache Simulation

Performing trace-driven simulations on multiprocessor cache hierarchies has thus far been difficult, due to the high time costs associated with software simulators. However, using specialized hardware accelerators in the simulation process has the potential to allow significantly faster simulation times. The work presented in this chapter describes the first ever hardware-accelerated simulation framework to rapidly perform design space exploration for a multiprocessor multi-level cache hierarchy containing private and shared caches.

As discussed in Chapter 3, existing software based design space exploration methods such as [HRA+12] and [HKH+13] are hindered by two major limiting factors. First, extracting multiple memory access traces from different points in a cache hierarchy imposes significant costs, time-wise as well as storage-wise (several days and several terabytes). Second, software simulation of cache design spaces using the extracted traces is itself a slow process, taking several hours of design time. Compared to the existing methods, the new hardware-accelerated simulation framework aims to achieve two key goals:

• Eliminate the need to extract multiple memory access traces by obtaining the memory access traces in real-time from different points of an MPSoC memory subsystem, exactly as experienced by individual caches;

• Provide much faster simulation times by simultaneously processing the observed memory access traces using specialized hardware in real-time, parallel to the MPSoC execution.

The rest of this chapter is organized as follows: Section 4.1 identifies the target MPSoC architecture for the proposed design space exploration method; the methodology is detailed in Section 4.2; hardware implementation details are presented in Section 4.3; Sections 4.4 and 4.5 present demonstrations of the proposed hardware based method, targeting a two-level cache hierarchy with four private L1 data caches and a shared L2 data cache for a quad-core system executing non-communicating applications.


4.1 Target Multiprocessor System Architecture

This chapter focuses on shared memory multiprocessor architectures with multi-level cache hierarchies containing private and shared caches, such as the example depicted in Figure 4.1. The memory hierarchy itself can be illustrated as in Figure 4.2. The system can contain a set of processors (P), sharing the same main memory. Accesses to the memory go through a hierarchy of caches linked by a shared interconnect.

The system may contain N many levels of caches (Li: 1 ≤ i ≤ N). Each level Li can contain Mi many caches (Cij: 1 ≤ j ≤ Mi). Each such cache Cij is associated with a sub-design-space containing up to Dij different cache configurations (Kijk: 1 ≤ k ≤ Dij).


Figure 4.1: Shared memory multiprocessor architecture with private L1 caches and a shared L2 cache.

The system used for demonstrations contains four processor cores with four private L1 caches and one shared L2 cache. It is assumed that the memory accesses produced by the processors are blocking, and that the caches used in the system do not implement advanced techniques such as block pre-fetching, similar to the prior works [SA95, JIS06, ZGR11, HRA+12]. These assumptions allow deterministic simulation of cache hits and misses. It should be noted that this work considers a cache hierarchy where no coherency control is performed.


Figure 4.2: Multiprocessor memory hierarchy with four processors (P1 to P4), four L1 caches (C1,1 to C1,4) and one L2 cache (C2,1).

4.2 Design Space Exploration Methodology

This section presents the proposed methodology to explore the design space of a multiprocessor cache hierarchy, highlighting the use of hardware modules to accelerate the simulation process. Details of the exploration process and the hardware-software collaboration paradigm will be established in Subsections 4.2.1 and 4.2.2. Thereafter, Section 4.3 will comprehensively describe the implementation of the simulation hardware.


4.2.1 Hybrid Simulation Framework

Many modern design optimization methodologies tend to use Hybrid Simulation where the repetitive and time consuming portions in the design process are accelerated using assisting hardware components [CPN+09]. The hybrid simulation framework presented in this work utilizes an FPGA device connected to a host computer, as illustrated in Figure 4.3.


Figure 4.3: Hybrid simulation platform where cache hit rates are calculated on FPGA.

The target MPSoC exists in the FPGA device, along with hardware cache simulation modules (hSim). Input data for the application programs being executed on the MPSoC are provided by the host computer, through a USB (Universal Serial Bus) connection (other fast connections such as PCI Express could alternatively be used here). The responsibilities of an hSim module are: to unobtrusively observe the memory accesses generated by the MPSoC in real-time; to decide whether the observed accesses will be cache hits or misses in all cache configurations in the design space; and to calculate the hit rates for all cache configurations in the design space, in parallel to the execution of the application programs. The resulting cache hit rates are sent to the host computer where analytical models are applied to estimate the timing and energy measures for different cache configurations. Further details about the hSim modules and how the memory access extraction is done in real-time are presented in Section 4.3.

Initially, the MPSoC in the FPGA device does not contain any actual caches. The cache hierarchy is explored in N different stages, starting from the L1 caches and moving down the hierarchy until the last cache level LN. In an arbitrary i-th stage of exploration, the individual sub-design-spaces of all Mi caches in level Li of the hierarchy are explored in parallel (simultaneously) to calculate cache hit rates. After the results (the hit rates for all the configurations in the design space, for each cache in level i) are sent to the host computer, timing and energy values are estimated for all configurations and a selection decision is made based on minimum energy or maximum performance as per the design requirement. Afterwards, actual caches with the selected configurations are inserted into level Li in the cache hierarchy and the system is re-synthesized before exploration moves on to the next stage (level Li+1).

4.2.2 Selection of Cache Configurations

The hit rates for different cache configurations that are provided by the hSim modules are used for the calculation of Average Cache Access Time (T) and Average Access Energy Consumption (E). Values of T and E are normalized per single access to a cache. These two measures are used to select a suitable cache configuration, depending on the design requirement. The model for T is described in Equation (4.1) and provides a time estimate for accessing a cache with a given configuration.


T = HL + (1 − HR) × ML (4.1)

E = HE + (1 − HR) × ME (4.2)

The term HL (Hit Latency) represents the time required to make a single access to the cache (i.e. time to determine whether the access was a hit or a miss and load data to the bus), which encompasses the effects of design space parameters of the cache’s configuration such as block size, set size and associativity. Hit rate for the cache configuration is given by HR and the time penalty for a cache miss is given by ML (Miss Latency). Similarly, Equation (4.2) provides an average energy measure for accessing a cache with a given configuration. Energy required to make a single cache access (i.e. determine hit/miss status and load data to the bus) is given by HE (Hit Energy), representing the effects of the configuration itself, and the energy penalty in the event of a cache miss is given by ME (Miss Energy).

For all the configurations in the design space, the values of HL and HE are obtained by using the detailed cache analysis tool CACTI 6.5 [MBJ07] by Muralimanohar et al. It should be noted that any suitable analytical model may be used for this purpose depending on the designer’s requirement, and the performance and accuracy of the model used is beyond the scope of this work. Since HR is provided by the simulation done in hardware, the only unknown terms are ML and ME.


ML = Tf + UL (4.3)

ME = Ef + UE (4.4)

Equations (4.3) and (4.4) describe the time and energy penalties incurred in a cache miss. Terms Tf and Ef respectively represent the time and energy spent on fetching the missing cache block from the next level in the memory subsystem, while UL (Update Latency) and UE (Update Energy) respectively denote the time and energy to write the fetched block to the cache. For the caches in the i-th level of the hierarchy (Li), miss penalties depend on the properties of the next cache level (Li+1). Penalties for the misses occurring at the last cache level (LN) depend on the properties of the main memory (usually DRAM - Dynamic Random Access Memory). Values for the parameters in Equations (4.3) and (4.4) are obtained through the CACTI 6.5 tool, which takes the contention when accessing a shared memory device into account. Since this work is aimed at speeding up the design space exploration process, Equations 4.1, 4.2, 4.3 and 4.4 are formulated to closely match the model used in [HRA+12] so that a reasonable comparison between the methods can be provided.
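To make the combination of Equations (4.1)-(4.4) concrete, the following C sketch evaluates T and E for a single cache configuration. It is only a minimal model of the analytical step performed on the host computer; the structure fields, function names and numeric values are hypothetical illustrations and are not taken from the framework's actual implementation.

#include <stdio.h>

/* Hypothetical per-configuration parameters: HL, HE, UL and UE would come
 * from an analytical model such as CACTI, and HR from the hSim counters. */
typedef struct {
    double hl;   /* hit latency (ns)              */
    double he;   /* hit energy (pJ)               */
    double ul;   /* update latency on a miss (ns) */
    double ue;   /* update energy on a miss (pJ)  */
    double hr;   /* hit rate, between 0.0 and 1.0 */
} cache_cfg;

/* Equations (4.1) and (4.3): T = HL + (1 - HR) * ML, with ML = Tf + UL,
 * where tf is the average access time of the next level (or DRAM). */
double avg_access_time(const cache_cfg *c, double tf)
{
    double ml = tf + c->ul;
    return c->hl + (1.0 - c->hr) * ml;
}

/* Equations (4.2) and (4.4): E = HE + (1 - HR) * ME, with ME = Ef + UE,
 * where ef is the average access energy of the next level (or DRAM). */
double avg_access_energy(const cache_cfg *c, double ef)
{
    double me = ef + c->ue;
    return c->he + (1.0 - c->hr) * me;
}

int main(void)
{
    /* Illustrative numbers only; they are not values from this thesis. */
    cache_cfg k = { 1.2, 5.0, 0.8, 3.0, 0.992 };
    printf("T = %.3f ns, E = %.3f pJ\n",
           avg_access_time(&k, 20.0),     /* 20 ns next-level access */
           avg_access_energy(&k, 50.0));  /* 50 pJ next-level energy */
    return 0;
}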

The complete flow of the selection process is described in Algorithm 1, which explores the cache hierarchy in stages. In a single stage of exploration (lines 1-16) the algorithm finds the suitable configuration for all Mi caches in the level Li. Parallel exploration of the sub-design-spaces of all caches in a given level, to calculate cache hit rates for different configurations, is given in lines 2 and 3. This step is carried out on the FPGA device with the aid of hSim modules, simultaneous to the execution of


Algorithm 1: Configuring N-level Cache Hierarchy with Mi caches at any level Li

1 for each cache level Li where i:=1 to N do

2 for each individual cache Cij in level Li where j:=1 to Mi do

3 ∀ configurations Kijk, calculate hit rates (HR) using real-time extracted memory access traces. (Done in parallel on the FPGA)

4 for each individual cache Cj in level Li where j:=1 to Mi do

5 for each configuration Kijk where k:=1 to Dij do

6 if i < N then

7 Estimate ML and ME for Kijk, using UL and UE values of Li+1

8 else if i = N then

9 Estimate ML and ME for Kijk, using UL and UE values of DRAM

10 Estimate T and E for configuration Kijk

// Find configuration Kijk−selected for the cache Cj

11 if best performance then

12 Select configuration Kijk with minimum T

13 else if least energy consumption then

14 Select configuration Kijk with minimum E

15 Include a cache Cj with configuration Kijk−selected, into the MPSoC

16 Re-synthesize the MPSoC on the FPGA

application programs on the MPSoC. Lines 5-10 describe the estimation of energy and performance measures (T and E) using analytical models. When assessing the miss penalties, the access time and energy of the next cache level is considered. In the case of the last level cache, the access time and energy of the main memory is used. Making the selection of a cache configuration is described in lines 11 through 14. Line 15 represents the event of configuring a cache using the selected configuration (Kijk−selected) in the actual MPSoC on the FPGA. Once all the caches in the level Li are configured in this manner, the system is re-synthesized and the exploration process moves to the level Li+1.
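The per-cache selection in lines 11-14 of Algorithm 1 then reduces to an argmin over the cache's sub-design-space once T and E have been evaluated. A hedged sketch of that step is shown below; it reuses the hypothetical cache_cfg structure and helper functions assumed in the earlier sketch, and the design-goal flag is an illustrative stand-in for the designer's requirement.

/* Pick the best configuration index for one cache in level Li.
 * 'cfgs' holds the Dij candidate configurations with hit rates already
 * filled in from the hSim results; 'tf'/'ef' describe the next level
 * (a lower cache level or DRAM). 'minimize_energy' chooses between the
 * two design goals of lines 11-14 in Algorithm 1. Names are illustrative. */
int select_configuration(const cache_cfg *cfgs, int d_ij,
                         double tf, double ef, int minimize_energy)
{
    int best = 0;
    double best_metric = 1e300;
    for (int k = 0; k < d_ij; k++) {
        double metric = minimize_energy
                            ? avg_access_energy(&cfgs[k], ef)
                            : avg_access_time(&cfgs[k], tf);
        if (metric < best_metric) {
            best_metric = metric;
            best = k;
        }
    }
    return best;  /* index of the configuration Kijk-selected */
}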

Figure 4.4 further illustrates the flow of the exploration process for a 2-level cache hierarchy with four private L1 caches and one shared L2 cache (as in Figure 4.2). The process starts at the top-left corner of Figure 4.4. The top half of the figure depicts the first stage of exploration where the four L1 caches are configured, while the bottom half shows the second stage where the shared L2 cache is configured.


Figure 4.4: Graphical overview of the simulation methodology flow, described in Algorithm 1, used to explore the design space of a two-level multiprocessor cache hierarchy and determine suitable configurations.


4.3 Implementation of hSim

Exploring a multi-level cache hierarchy in an MPSoC requires memory accesses to be extracted from different points in the memory subsystem in order to carry out the simulation of cache configurations. Therefore, the hardware cache simulator module (hSim) in this chapter was specifically designed such that it can be connected to different positions in the memory subsystem of the MPSoC operating on the FPGA device. The module was designed using VHDL (VHSIC Hardware Description Language), and based around the simulation circuitry developed by Schneider et al. [SPP14a], which assumes the LRU (Least Recently Used) replacement policy.

The hSim module is designed to be connected in the place of a cache memory, to simulate many different configurations for that particular cache in the hierarchy. The memory accesses, which come from the processor (or from the previous cache(s) in the hierarchy), are passed through to the next level cache (or to the main memory). When hSim is connected in the place of an L1 cache, it receives the memory access addresses from the corresponding processor. When the module is connected in the place of a last level cache, it passes the access addresses through to DRAM.

Figure 4.5: Connection interfaces and operation overview of the hardware simulator (hSim) module.

Figure 4.5 illustrates the connection interfaces and internal operation details of hSim. The module consists of three ports for external connections. The first port connects to the previous cache level of the memory hierarchy (to receive memory addresses) and the second port connects to the next cache level. A third port is used for control signals such as enabling and disabling of the hSim module, reading results, etc. The control signals can be sent from any processor in the MPSoC. Address and data widths for the ports are parameterized and hence customizable upon requirement. Using these three ports, a number of hSim modules can be flexibly connected to different points in the memory hierarchy.

A copy of every memory access that passes through the hSim module is unobtrusively obtained and fed into the simulator core. This chapter uses the hardware cache simulator circuit introduced by Schneider et al. in [SPP14a], which is illustrated in Figure 4.6. The simulator core circuitry is designed as a pipeline, with each stage simulating cache configurations with a fixed set size and varying associativity. Figure 4.6(a) depicts an example top level (1st pipeline stage) with a simulated set size of eight and associativities ranging from four to one. The set size of the top level is the maximum set size simulated. Only the top level of the simulator contains a tag array, which is used to compare against incoming address tokens. [SPP14a] uses the associativity inclusion property (a given cache configuration is always a subset of another cache configuration with the same set size and block size, but higher associativity) to avoid having to use multiple tag arrays at the top level. A set of registers is used as hit counters corresponding to the simulated associativity levels.



Figure 4.6: Internal implementation of the cache simulator core: (a) example top level of the simulator core, with a maximum set size of eight and maximum associativity of four; (b) complete pipeline inside a simulator core for set sizes 8-to-2 and associativities 4-to-1.


Figure 4.6(b) shows a complete pipeline inside a full simulator core, for set sizes eight to two and associativity levels four to one. [SPP14a] uses the set size inclusion property (a given cache configuration is always a subset of another cache configuration with the same block size and associativity, but higher set size) to avoid having to store address tags in pipeline stages other than the top level. Instead, hit information is passed from one stage to the next through pipeline registers. The simulated cache block size is parameterized and can be set through the control port. Readers can refer to [SPP14a] for further details of the simulator core.
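As a software analogue of the associativity inclusion property used by the top level, the following C sketch keeps one LRU stack per set: the depth at which an address is found is the smallest associativity that would still have produced a hit, so one lookup updates the counters of every simulated associativity at once. The parameters (16-byte blocks, eight sets, maximum associativity four) are assumptions chosen to mirror the example in Figure 4.6(a); the real simulator core is a pipelined VHDL circuit and additionally forwards hit information to the smaller set sizes, which this sketch does not model.

#include <stdint.h>
#include <string.h>

#define MAX_ASSOC  4   /* maximum simulated associativity            */
#define SETS       8   /* set size of the (top-level) pipeline stage */
#define BLOCK_BITS 4   /* 16-byte cache blocks                       */

/* One LRU stack per set, holding the MAX_ASSOC most recently used tags. */
static uint32_t lru[SETS][MAX_ASSOC];
static int      filled[SETS];
static uint64_t hits[MAX_ASSOC + 1];   /* hits[a] = hits of an a-way cache */

/* Process one memory access address. A hit found at LRU depth d (0-based)
 * is a hit in every simulated cache with associativity greater than d:
 * this is the associativity inclusion property exploited in [SPP14a]. */
void simulate_access(uint32_t addr)
{
    uint32_t blk = addr >> BLOCK_BITS;
    uint32_t set = blk % SETS;
    uint32_t tag = blk / SETS;

    int depth = -1;
    for (int d = 0; d < filled[set]; d++)
        if (lru[set][d] == tag) { depth = d; break; }

    if (depth >= 0)                                  /* hit at depth d */
        for (int a = depth + 1; a <= MAX_ASSOC; a++)
            hits[a]++;

    /* LRU update: move (or insert) the tag at the most-recently-used slot. */
    int shift = (depth >= 0) ? depth
                             : (filled[set] < MAX_ASSOC ? filled[set]++
                                                        : MAX_ASSOC - 1);
    memmove(&lru[set][1], &lru[set][0], shift * sizeof(uint32_t));
    lru[set][0] = tag;
}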

There are two different clock signal inputs associated with the hSim module: the main system clock for interface and control operations (typically the same clock that the MPSoC operates on); and a separate clock for the simulator core. A clock crossing buffer is used to feed the observed memory accesses into the simulator core. The pipelined design of the cache simulator core allows it to operate at a relatively high frequency (up to 200MHz in tests). Due to the sparse nature of memory accesses in application programs, hSim can be integrated into a system working at a much higher frequency than the simulator core itself, without the buffer overflowing.

Reading the counter values at the end of the simulation is handled through the control port. A connected CPU can send a read request to the control port, specifying the required counter. Relevant firmware functions for control and communication with hSim have been developed in the C language.
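The exact interface of those firmware functions is not reproduced here, so the following read-out routine is only a hypothetical sketch of what a memory-mapped control-port driver for hSim could look like. The base address, register offsets and command encoding are invented for illustration and do not describe the real register map.

#include <stdint.h>

/* Hypothetical hSim control-port register map (illustrative only). */
#define HSIM_BASE        0x80001000u
#define HSIM             ((volatile uint32_t *)HSIM_BASE)
#define HSIM_REG_CTRL    0u   /* enable / disable the simulator core */
#define HSIM_REG_SELECT  1u   /* index of the counter to read        */
#define HSIM_REG_DATA    2u   /* value of the selected counter       */

#define HSIM_CTRL_ENABLE  0x1u
#define HSIM_CTRL_DISABLE 0x0u

void hsim_start(void) { HSIM[HSIM_REG_CTRL] = HSIM_CTRL_ENABLE;  }
void hsim_stop(void)  { HSIM[HSIM_REG_CTRL] = HSIM_CTRL_DISABLE; }

/* Read one hit (or total access) counter after the simulation has ended. */
uint32_t hsim_read_counter(uint32_t counter_id)
{
    HSIM[HSIM_REG_SELECT] = counter_id;  /* specify the required counter */
    return HSIM[HSIM_REG_DATA];          /* fetch its current value      */
}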

The schematic symbol of the hSim module as implemented in the Altera Qsys system integration tool [Altb] (compatible with the Altera Avalon bus architecture) is shown in Figure 4.7. The ports on hSim can be easily adapted to other interconnect architectures using bridging components.


Figure 4.7: Detailed schematic symbol showing all signals for the hSim module as implemented in the Altera Qsys system integration tool [Altb]. Widths of the address and data signals are configurable.

Multiple instances of the hSim module can be connected to the MPSoC memory hierarchy. Figure 4.8(a) shows four hSim modules being connected in place of the L1 caches, simulating the corresponding sub-design-spaces (as per the first stage in Algorithm 1). Memory accesses pass through the hSim into the shared L2 cache, while hSim uses a copy of each access address for the simulation.

Accesses coming from multiple different sources can be connected in combination to the input port of the hSim module, as shown in Figure 4.8(b), which enables hSim to simulate sub-design-spaces of shared caches taking realistic bus contention into account. All L1 cache misses generate accesses to main memory, which pass through the hSim module connected in place of the shared L2 cache.

Figure 4.8: Connection and usage of hSim in an MPSoC on FPGA: (a) multiple hSim modules connected in the positions of private L1 caches; (b) a single hSim module connected in place of a shared L2 cache, to simulate the corresponding sub-design-spaces.


4.4 Experimental Setup

The target system of Figure 4.1 was used for the experiments presented in this chapter, to demonstrate the process of cache design space exploration. Since a hardware cache coherence mechanism is not assumed to be present in the final system, and hence cache misses occurring due to maintaining coherency are not counted, separate (non-communicating) application programs were executed on the four processors. Even though data are not shared between the programs, the sharing of the L2 cache affects the hit rates of different cache configurations for L2.

The MPSoC was built using the Altera Qsys system integration tool [Altb], with four Nios II/f embedded processor cores [Nio] operating at 200MHz, and was deployed in a Stratix V GX FPGA device on an Altera DE5-NET board [Alta]. One gigabyte of DDR3 SDRAM (Double Data Rate type 3 Synchronous Dynamic Random Access Memory) operating at 800MHz on the DE5-NET board was used as the main memory for the MPSoC.

We used six benchmark application programs, from the SPEC2006 benchmark suite (bzip2 compression, bzip2 de-compression) and the MiBench suite (lame mp3-encoding, lame mp3-decoding, rijndael aes-encryption, jpeg), to create two groups of four applications each. Table 4.1 shows the two groups of applications. Column one gives the processor core number. Columns two, three and four respectively contain the application program, the size of the data input used and the number of memory accesses generated by the application, in Experiment 1. Columns five, six and seven respectively contain the application program, the size of the data input used and the number of memory accesses generated by the application, in Experiment 2.


Table 4.1: Applications used in the Experiments

Core | Experiment 1: Application, Input size (KB), Memory Accesses | Experiment 2: Application, Input size (KB), Memory Accesses
0    | rijndael aes, 17, 99,299,728      | lame encode, 3, 259,070,294
1    | bzip2 decompress, 18, 66,365,689  | bzip2 compress, 131, 143,691,935
2    | jpeg, 769, 53,597,119             | rijndael aes, 41, 238,329,483
3    | lame decode, 25, 70,702,236       | jpeg, 769, 53,597,119

Table 4.2: Simulated Configurations for Private L1 Caches and Shared L2 Cache

                      Private L1 Cache sub-design-spaces           Shared L2 Cache sub-design-space
                      (27 configurations for each L1 cache)        (180 configurations for the shared L2 cache)
                      Block Sizes (Bytes) | Set Sizes | Assoc.     Block Sizes (Bytes) | Set Sizes | Assoc.
                      4, 16, 32           | 1 - 256   | 1          32 - 256            | 1 - 256   | 1 - 16

Four instances of the hSim module (with the simulator core of each operating at 100MHz) were initially connected to the cache-less MPSoC in order to simulate the cache hits in the L1 cache sub-design-spaces for the four processors in parallel. Each of the L1 hSim modules was parameterized to simulate 27 different cache configurations as described in Table 4.2 (27 configurations were used since the Nios II data cache module is direct mapped only). The first three columns in Table 4.2 respectively present the block sizes, set sizes and associativities considered in the L1 sub-design-spaces. Energy and performance measures were calculated as described in Section 4.2 using the cache hit rates obtained from the hSim modules, and L1 caches were put into the MPSoC based on the selected configurations (minimum E for Experiment 1 and minimum T for Experiment 2 were considered). Thereafter, a single hSim module was connected to the MPSoC to simulate the cache hits in the shared L2 cache sub-design-space. The hSim module for L2 was parameterized to simulate 180 different cache configurations as described in Table 4.2. The last three columns in Table 4.2 respectively present the block sizes, set sizes and associativities considered in the L2 sub-design-space.
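The configuration counts in Table 4.2 can be reconstructed by assuming power-of-two steps for the set sizes and associativities (the table gives only the ranges): each L1 sub-design-space then contains 3 block sizes × 9 set sizes (1, 2, ..., 256) × 1 associativity = 27 configurations, and the L2 sub-design-space contains 4 block sizes (32, 64, 128, 256) × 9 set sizes × 5 associativities (1, 2, 4, 8, 16) = 180 configurations.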

4.5 Test Results

The results obtained from the simulations are presented in Figures 4.9, 4.10 and 4.11, with the Average Cache Access Energy (Ecache) on the vertical axis plotted against the Average Cache Access Time (Tcache) on the horizontal axis. Each plot displays a subset of configurations in the respective sub-design-space, with low energy and low access time values. A cross (+) represents a unique cache configuration in the subset. The configuration giving the least energy in the sub-design-space is marked with a red triangle (▲), whereas the configuration giving the fastest access time is marked with a green circle (•). The largest cache configuration in the sub-design-space is marked with a yellow square (■).

Figures 4.9 and 4.10 report the results for all private L1 caches in Experiments 1 and 2 respectively. A designer can decide which configuration to select depending on the requirement and design constraints. For example, for the jpeg application, the configuration with block size = 32 bytes and set size = 256 (8KB cache) gives the fastest access time with a 99.2% hit rate, while the configuration with block size = 16 bytes and set size = 32 (512B cache) gives the lowest energy consumption, with only a 90.3% hit rate. A couple of observations are worth noting from the plots in Figures 4.9 and 4.10: that the two instances of the rijndael aes application, using two different sized input data, achieve minimum E from two different cache configurations; and that the largest L1 cache configuration provides neither the fastest access time nor the least energy consumption (except in the case of jpeg).

Figure 4.9: Energy Consumption against Access Time for private L1 cache configurations, in Experiment 1 (one panel per application: rijndael_aes, bzip2_decompress, jpeg, lame_decode; axes: Average Cache Access Energy (pJ) against Average Cache Access Time (ns)).

Figure 4.10: Energy Consumption against Access Time for private L1 cache configurations, in Experiment 2 (one panel per application: lame_encode, bzip2_compress, rijndael_aes, jpeg; axes: Average Cache Access Energy (pJ) against Average Cache Access Time (ns)).

When choosing suitable L1 cache configurations in the two experiments, configurations displaying minimum energy were selected for the applications in Experiment 1, and configurations displaying minimum access times were selected for the applications in Experiment 2. Details of the selected L1 configurations are shown in Tables 4.3 and 4.4. The first two columns state the processor core number and the application program respectively. Columns three to seven describe the properties of the L1 configuration with minimum energy (block size, set size, associativity, resultant cache size and achieved hit rate respectively). Columns eight to twelve describe the properties of the L1 configuration with minimum access time (block size, set size, associativity, resultant cache size and achieved hit rate respectively).


Table 4.3: L1 Cache Configurations with Minimum E and T from Experiment 1

Core | Application | L1 Config. with minimum E: Blk Size, Set Size, Assoc, Cache Size, Hit Rate | L1 Config. with minimum T: Blk Size, Set Size, Assoc, Cache Size, Hit Rate

0 rijndael aes 32B 128 1 4KB 99.5% 32B 128 1 4KB 99.5%

1 bzip2 decompress 16B 32 1 512B 93.9% 32B 32 1 1KB 95.6%

2 jpeg 16B 32 1 512B 90.3% 32B 256 1 8KB 99.2%

3 lame decode 32B 16 1 512B 96.6% 32B 16 1 512B 96.6%

Table 4.4: L1 Cache Configurations with Minimum E and T from Experiment 2

Core | Application | L1 Config. with minimum E: Blk Size, Set Size, Assoc, Cache Size, Hit Rate | L1 Config. with minimum T: Blk Size, Set Size, Assoc, Cache Size, Hit Rate

0 lame encode 32B 16 1 512B 98.3% 32B 16 1 512B 98.3%

1 bzip2 compress 16B 32 1 512B 92.7% 32B 64 1 2KB 96.9%

2 rijndael aes 32B 4 1 128B 86.8% 32B 128 1 4KB 99.4%

3 jpeg 16B 32 1 512B 90.3% 32B 256 1 8KB 99.2%

Based on the selected configurations for the private L1 caches in the MPSoC, Figure 4.11 reports the results obtained for the shared L2 cache. Details of the L2 cache configurations with minimum energy and minimum access times are shown in Table 4.5. The first column states the experiment number. Columns two to six describe the properties of the L2 configuration with minimum energy (block size, set size, associativity, resultant cache size and achieved hit rate respectively). Columns seven to eleven describe the properties of the L2 configuration with minimum access time (block size, set size, associativity, resultant cache size and achieved hit rate respectively).

Figure 4.11: Energy Consumption against Access Time for shared L2 cache configurations, in Experiments 1 and 2 (one panel per experiment; axes: Average Cache Access Energy (pJ) against Average Cache Access Time (ns)), based on the selected L1 configurations.

Table 4.5: L2 Cache Configurations with Minimum T and E from Experiments 1 and 2

Experiment | L2 Config. with minimum E: Block Size, Set Size, Assoc, Cache Size, Hit Rate | L2 Config. with minimum T: Block Size, Set Size, Assoc, Cache Size, Hit Rate
1          | 256B, 4, 16, 16KB, 99.5%                                                     | 256B, 4, 16, 16KB, 99.5%
2          | 128B, 8, 16, 16KB, 99.8%                                                     | 128B, 1, 16, 2KB, 98.9%

The MPSoC in Experiment 1 observed both minimum energy and minimum access time using a 16KB L2 cache with a 99.5% hit rate. The MPSoC in Experiment 2 obtained minimum energy using a 16KB cache with a 99.8% hit rate, while minimum access time was achieved by a 2KB cache with a 98.9% hit rate. In either experiment, the largest L2 cache configuration in the sub-design-space is well outside the plotted region in Figure 4.11, further underlining its unsuitability.


Table 4.6: Total Estimated Memory Access Energy and Time for Applications

Core | Experiment 1: Application, Input size (KB), Estimated Energy (µJ) | Experiment 2: Application, Input size (KB), Estimated Time (s)
0    | rijndael aes, 17, 248.35        | lame encode, 3, 57.32
1    | bzip2 decompress, 18, 182.84    | bzip2 compress, 131, 36.85
2    | jpeg, 769, 144.36               | rijndael aes, 41, 64.56
3    | lame decode, 25, 182.07         | jpeg, 769, 14.19

The total energy and time spent on memory accesses by the applications in both experiments are estimated in Table 4.6, based on the total memory access counts from Table 4.1 and the average cache access energy/time of the selected cache configurations. Column one gives the processor core number. Columns two, three and four respectively report the application program, the size of the data input used and the total estimated memory access energy for the application, in Experiment 1. Columns five, six and seven respectively contain the application program, the size of the data input used and the total estimated memory access time for the application, in Experiment 2.

Since this work involves counting hit rates using specialized hardware for the cache configuration design space, it is important to assess the time consumed by the hSim modules to produce the results. Simulation times and memory access trace size details from Experiments 1 and 2 are tabulated in Table 4.7. The first column gives the simulation stage. Total simulation time is recorded in the second column. The third and fourth columns respectively present the total number of memory accesses processed by the hSim modules and the time spent per million memory accesses.


Table 4.7: Simulation Times to Calculate Hit Rates in Hardware

Simulation Stage | Total Time (s) | Memory Accesses | Time per Million Accesses (s)

Experiment 1: Four L1 Caches 30.6 289,964,772 0.106

Experiment 1: Shared L2 Cache 6.3 42,360,645 0.149

Experiment 2: Four L1 Caches 84.3 694,688,826 0.121

Experiment 2: Shared L2 Cache 3.2 23,943,761 0.133

In Experiment 1, simulation of 27 configurations for each of the four private L1 caches took 30.6 seconds, with 289.9 million memory accesses processed in total by the four hSim modules. With the selected L1 configurations (with minimum energy estimates) put into the MPSoC, the combined trace of L1 misses was 42.3 million accesses. The hSim module simulating 180 shared L2 cache configurations took 6.3 seconds to process this trace. In Experiment 2, four hSim modules (each simulating 27 L1 configurations) took 84.3 seconds to process a total of 694.7 million memory accesses. After the L1 caches with minimum access time estimates were put into the MPSoC, the hSim module simulating 180 shared L2 cache configurations observed 23.9 million accesses. The L2 simulation in Experiment 2 took 3.2 seconds to calculate the hit rates.

These values represent the time taken purely for simulation, as reported in the prior works, excluding the overheads of re-synthesizing the system on the FPGA. The observed time per million memory accesses taken by the hSim modules ranges from 0.106 seconds to 0.149 seconds. The software based method in [HRA+12] takes 68 seconds per million accesses to explore the design space of a similar multiprocessor cache hierarchy, with a total time of 9468 seconds for 139 million memory accesses. In comparison, our hardware based method took 87.5 seconds and 36.9 seconds for the two experiments demonstrated here, even when processing much larger memory access traces. Therefore the hardware based calculation of cache hit rates using hSim modules in the experiments is up to 456 times faster than software based simulation, owing to the parallel simulation in hardware.
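Expressed as a ratio of the per-million-access times reported above, the speedup follows directly from the figures: 68 seconds per million accesses for the software simulator against 0.149 seconds per million accesses for the slowest hSim stage gives 68 / 0.149 ≈ 456, and against the fastest stage (0.106 seconds per million accesses) the ratio is approximately 642.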

The time overhead for synthesizing the hSim for the FPGA was in the range of one to two hours (depending on the size of the simulated design space and the capacity of the FPGA device used). Partial re-synthesis could be used to avoid having to re-synthesize the hSim and to only re-synthesize the MPSoC when caches are included in the system. This overhead is still significantly smaller than that of extracting a memory access trace: as demonstrated in Chapter 1, extracting a memory access trace from a mere one-second execution of the MPEG2 encoder took 72 hours.

Additionally, simulation using hSim uses the real memory access traces observed by the caches in a multi-level hierarchy, as opposed to software simulation where L1 memory access traces are derived from the extracted L2 trace (due to high extraction times). It should also be noted that simulation accuracy depends on the trace size, and simulation times are directly proportional to the number of memory accesses processed, either in hardware or software. Hence the several orders of magnitude faster simulation in hardware proves valuable to achieve better accuracy. In hardware based simulation, increasing the number of cache configurations to explore requires more FPGA logic elements, rather than increasing the simulation time.


4.6 Summary

A hardware based design space exploration methodology was presented in this chapter, to rapidly determine the cache configurations for a multi-level cache hierarchy in an application specific MPSoC. The proposed method significantly reduces the simulation time taken during the MPSoC design stages by rapidly calculating the cache hits for all configurations using specialized hardware. The hSim modules designed for this purpose can be flexibly connected to multiple different points in an MPSoC cache hierarchy on an FPGA device, to extract and analyse the memory access information at those points in real-time. With such fast and flexible simulation made possible, designers can explore the cache hierarchy design space with ease and gain better insight into application-dependent cache behavior. An additional benefit offered by using the proposed hSim module is that the actual memory accesses generated by the processors can be observed in real-time.

Chapter 5

Iterative Design Space Exploration of Multi-Level Caches

Extending the fast hardware simulator components designed in Chapter 4, this chapter presents an improved design space exploration algorithm to optimize multi-level cache hierarchies for application specific multiprocessors. While the hardware acceleration in Chapter 4 achieves much faster simulation times compared to previous works [HRA+12, HKH+13], limitations on optimality from the exploration algorithm are still present. The work presented in this chapter describes an iterative algorithm which selectively explores an extended portion of the design space to find an optimal (or near-optimal) solution.

Figure 5.1: Effects of changing a cache's configuration on the explorations in adjacent cache levels (Effect 1: on the access trace seen by the level below; Effect 2: on the miss penalty seen by the level above).

Due to the interlinked relationships between connected cache levels, individually optimizing each cache in the hierarchy once does not guarantee that an optimal or near-optimal cache hierarchy is found. The selected configuration of a particular cache affects the exploration of the lower and upper level connected caches, as shown in Figure 5.1. When a configuration is selected for the L2 cache C2,1, the change affects the design space explorations in:

• the connected upper level (L1) caches C1,1 and C1,2 through the miss penalty;

• the connected lower level (L3) cache C3,1 through the access trace.

In turn, a cache configuration selection for either the L1 or L3 caches affects the connected L2 cache. This cycle of effects cannot be wholly captured by performing a single optimization pass over the cache hierarchy, in either direction. The improved algorithm presented in this chapter takes multiple iterations over a generic cache hierarchy containing inclusive non-coherent caches, with the goal of minimizing the overall time spent on memory accesses by the system. Each iteration of the algorithm captures the effects of any change in a selected cache configuration and propagates those effects to all relevant connected caches in adjacent levels through a carefully crafted sequence of simulation steps. Demonstrations are provided to show that the iterative algorithm converges after a few iterations (in other words, after exploration of only a fraction of the design space) on the final design point for the cache hierarchy.

The rest of this chapter is organized as follows: Section 5.1 identifies the target MPSoC architecture and generic memory hierarchy for the proposed design space exploration method; a comprehensive formulation of the design space exploration problem is provided in Section 5.2; the iterative DSE methodology is detailed in Section 5.3; Sections 5.4 and 5.5 present demonstrations of the proposed iterative algorithm, using two- and three-level cache hierarchies, and illustrate the algorithm's convergence, stability and optimality through extensive testing.


5.1 Target Multiprocessor Multi-Level Cache Hierarchy

This chapter concerns an application specific MPSoC architecture that consists of P processor cores and an inclusive cache hierarchy with N levels. Each cache in the hierarchy can either be private or shared. An instance of the memory hierarchy is depicted in Figure 5.2, where P = 4 and N = 3 (processors such as ARM Cortex [ARM] and Tensilica Xtensa [XTE] have been used in similar commercial architectures).


Figure 5.2: An example architecture for the target MPSoC memory hierarchy. P1 to P4 represent processors and C1,1 to C3,1 represent caches organized in three levels.

Similar to prior works on trace-driven simulation [JIS06, SA95, ZGR11, HRA+12], assumptions are made that in-order processors are used producing blocking memory accesses, and that the caches used in the system do not implement advanced techniques such as block pre-fetching. These assumptions allow deterministic simulation of cache hits and misses. Lower level caches are assumed to be inclusive, where the cached data form a superset for all connected upper levels.


5.2 Problem Formulation

Given an MPSoC such as the one described in Section 5.1, with:

• a known set of target application programs;

• a known set of processors Pe (1 ≤ e ≤ P);

• an established cache hierarchy, with:

– a cache in the hierarchy denoted by Ci,j and a level denoted by Li, where i is the level number and j is the cache number in the particular level;

– a fixed number of levels (N) and therefore i ≤ N;

– a fixed number of caches per level (Mi for level Li) and therefore, for level Li, j ≤ Mi; and

– predetermined one-to-one or many-to-one connections between caches in levels Li and Li+1 and therefore Mi ≥ Mi+1.

• a known set of candidate cache configurations in the design space for each cache Ci,j, with:

– a configuration in the set denoted by Ki,j,k, where k is the configuration number;

– a fixed number of configurations (Di,j) and therefore k ≤ Di,j; and

– known hit latency (HLi,j,k) and update latency (ULi,j,k) in case of a miss, for each configuration Ki,j,k;

find the set of cache configurations Ki,j,kmin for all caches Ci,j in the hierarchy which minimizes the average memory access time of the system.


5.3 Iterative DSE Methodology

When exploring an N-level cache hierarchy, there are CTOT caches to be configured, as given by Equation 5.1. Consequently the total number of design points for the given hierarchy (DTOT) can be expressed by Equation 5.2, which can grow exponentially with N, Mi and Di,j.

CTOT = Σ_{i=1}^{N} Mi    (5.1)

DTOT = Π_{i=1}^{N} Π_{j=1}^{Mi} Di,j    (5.2)
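As a concrete instance, for the example hierarchy of Figure 5.2 with four L1 caches, two L2 caches and one L3 cache, Equation 5.1 gives CTOT = 4 + 2 + 1 = 7 caches to be configured.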

As outlined in Chapter 1, CTOT many access traces are required to evaluate a single design point for a cache hierarchy. Evaluating the complete design space will require CTOT × DTOT many access traces, which can be up to several trillions in number. Considering the extreme lengths of time required for trace extraction as well as the magnitude of the design space, one cannot afford to evaluate all the design points one by one (or even a significant portion of the design space) in reasonable time. To tackle the problem, this section:

• presents an algorithm which visits a handful of selected cache hierarchy design points;

• combines hardware-assisted simulation with the proposed algorithm to speed up the process of counting cache hits, in parallel for all caches in a level.


The following subsections provide a comprehensive description of the proposed methodology.

5.3.1 Cache Analysis

The term Ti,j,k is defined in this work as the average cache access time for a cache with configuration Ki,j,k. For a cache, Ti,j,k is the combined average of the time spent on cache hits and the time spent on cache misses. The value of Ti,j,k, which is normalized per single access to the cache, can be described by Equations 5.3 and 5.4.

Ti,j,k = HLi,j,k + [1 − HRi,j,k] × MLi,j,k    (5.3)

MLi,j,k = Ti+1,j′,k′ + ULi,j,k    (5.4)

The term HRi,j,k refers to the hit rate sustained by a cache with configuration Ki,j,k. For the same configuration, the hit latency is given by the term HLi,j,k and the miss latency (penalty) is represented by MLi,j,k. The term ULi,j,k specifies the time to update the cache entry after a miss is detected and the data is fetched from a lower level. Further, Ci+1,j′ refers to the connected cache in the lower level for any particular cache Ci,j (which can be the main memory in the case of i = N). Hence the term Ti+1,j′,k′ represents the average time taken to access the next level cache.

The values of HLi,j,k and ULi,j,k can be obtained from an analytical model of the cache being used (taking the bus contention into account), since they are properties of the particular cache configuration Ki,j,k and are therefore assumed to be provided in the problem formulation. However, HRi,j,k depends on the cache configuration as well as the memory access trace seen by the cache. Hence, HRi,j,k needs to be obtained through a trace-driven simulation for each cache configuration Ki,j,k.

Equations 5.3 and 5.4 point to the fact that evaluating Ti+1,j′,k′ requires a similar evaluation to be performed in the next cache level.

TTOT = Σ_{j=1}^{M1} T1,j,k    (5.5)

For each cache in level L1, T1,j,k represents the average time spent on a memory access by the corresponding processor Pe (1 ≤ e ≤ P). Thus, minimizing the average memory access time can be expressed as minimizing TTOT in Equation 5.5.

5.3.2 Algorithm

In Algorithms 2 and 3, the iterative process is described where each iteration explores every cache in the hierarchy at least once. Rather than visiting the cache levels in a Round-Robin manner (Section 5.5.4 points out that Round-Robin is slower), an iteration is divided into a forward pass (FP) and a backward pass (BP). The cache hierarchy is traversed from the upper-most level to the lowest level in the forward pass, propagating the effects of cache configuration changes to the levels below via updated memory access traces (Effect 1 in Figure 5.1). The backward pass then traverses the hierarchy from bottom to top, propagating the effects of cache configuration changes (made in the FP) to the levels above via updated miss latencies (Effect 2 in Figure 5.1). Initially, there are no caches physically present in the hierarchy. Optimization starts at the first cache level. The optimization process is constructive, in the sense that caches with selected configurations are added to each level in the initial forward pass, and then modified in the subsequent passes to achieve a lower Ti,j,k. Hence, the algorithm iteratively updates the selected cache hierarchy design point, until a stable state is reached.

An implicit upper bound to the size of the selected caches is provided by carefully setting the maximum values for the design space parameters (i.e. maximum block size, set size and associativity). Therefore, the combined chip area of selected caches is bounded, which prevents the cache hierarchy from being unnecessarily large.

The memory access traces observed by the first level of caches are the same in every iteration as per the stated assumptions in Section 5.1. Thus, the trace-driven simulation of the design spaces of L1 caches is performed only once (pre-pass) at the beginning, as shown in lines 1-7 in Algorithm 2, which reduces the overhead of counting cache hits. The cache hits are simultaneously calculated for all the configurations in each L1 cache's sub-design-space (line 1) in parallel. T1,j,k is evaluated for every configuration as given by lines 4-5. The value of T2,j′,k′ is obtained using the access time of Dynamic Random Access Memory (DRAM), since there are no

Algorithm 2: Pre-pass

// Pre-pass: hardware simulation for L1 needs to be done only once

1 ∀ j:= 1 to M1 and ∀ k:= 1 to D1,j: Calculate HR1,j,k

2 for j:=1 to M1 do

3 for k:= 1 to D1,j do

4 Calculate ML1,j,k for K1,j,k, by using the DRAM access time for T2,j′,k′

5 Evaluate T1,j,k

6 Select K1,j,kmin | T1,j,kmin = min(T1,j,∗)
7 Include K1,j,kmin into the MPSoC for C1,j

caches present in the lower levels as yet. Then the configurations with the minimum access times for every cache are selected and included in the hierarchy (lines 6-7). Real-time simulation of cache accesses ensures that the different memory access patterns of applications and the contention for shared caches are reflected in the Ti,j,k values.

Algorithm 3 describes the iterative portion of the optimization process. In the forward pass (FP), the algorithm visits one cache level at a time starting from level L2. A configuration change in any level above would change the access traces received by lower levels necessitating a re-calculation of hits for every configuration in each cache’s design space, as given by lines 4-6.

The hardware simulator components (hSim, described in Chapter 4) are integrated into the MPSoC in line 4 and the system is re-synthesized in line 5 (the use of hardware simulators is illustrated in Section 5.3.5). Then lines 8-13 re-evaluate Ti,j,k for all configurations. The value of Ti+1,j′,k′ is taken from DRAM parameters only if the examined level is the last one, or for all levels during the first iteration. Otherwise, Ti+1,j′,k′ is obtained from the cache level below. The configurations with the minimum access times are then selected and included in the MPSoC (lines 14-15), moving from the previous design point to a new one. The FP ends after optimizing the last cache level (LN).

The backward pass (BP) begins at the penultimate cache level. Any configuration change in the level below would affect the miss penalty of the level currently visited by the BP. Thus, MLi,j,k needs to be updated (line 19), consequently having to update the access times for every cache configuration (line 20). Finally, the fastest set of cache configurations for each cache in the level is selected and included in the system as given by lines 21-22, updating the design point of the cache hierarchy.


Algorithm 3: Iteratively optimizing an N-level cache hierarchy.

1 Iteration number r = 1

2 repeat // Forward Pass (FP):

3 for i:=2 to N do

4 Replace Ci,j with hardware cache simulators ∀ j:= 1 to Mi in the MPSoC
5 Re-synthesize the MPSoC

6 ∀ j:=1 to Mi and ∀ k:= 1 to Di,j: Calculate HRi,j,k

7 for j:=1 to Mi do

8 for k:= 1 to Di,j do
9 if i = N OR r = 1 then

10 Calculate MLi,j,k by substituting the DRAM access time for Ti+1,j′,k′

11 else

12 Calculate MLi,j,k using Ti+1,j′,k′ from the next cache level

13 Evaluate Ti,j,k

14 Select Ki,j,kmin | Ti,j,kmin = min(Ti,j,∗)
15 Include Ki,j,kmin into the MPSoC for Ci,j

// Backward Pass (BP):

16 for i:=N − 1 to 1 do

17 for j:=1 to Mi do

18 for k:= 1 to Di,j do

19 Calculate MLi,j,k using Ti+1,j′,k′ from the next cache level

20 Evaluate Ti,j,k

21 Select Ki,j,kmin | Ti,j,kmin = min(Ti,j,∗)
22 Include Ki,j,kmin into the MPSoC for Ci,j

23 r := r + 1

24 until no change made in any selected configuration Ki,j,kmin in the backward pass;


One FP and one BP conclude a single iteration in the algorithm. If the BP does not incur any cache configuration change, the algorithm terminates and the set of configurations Ki,j,kmin for every cache in the hierarchy is returned.

When the configuration design space of a particular cache is explored at a given stage of algorithm progression, the algorithm fixes the configurations of other caches in the hierarchy at the current selections. The number of design points visited by the algorithm (DALG) can therefore be expressed by Equation 5.6 (R = the number of iterations taken to achieve convergence).

DALG = Σ_{j=1}^{M1} D1,j + R × ( Σ_{i=2}^{N} Σ_{j=1}^{Mi} Di,j + Σ_{i=N−1}^{1} Σ_{j=1}^{Mi} Di,j )    (5.6)

Note that DALG grows linearly, in contrast to DTOT in Equation 5.2 which grows exponentially with respect to N, Mi and Di,j. Hence, the number of design points visited by Algorithm 3 is a mere fraction of the total space when R is small, which is the case as observed in Section 5.5. As an example, if there are only ten candidate cache configurations per cache for the seven-cache hierarchy of Figure 5.1, there will be a total of 10^7 design points, and Algorithm 3 visits 130 points out of those.
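As a check of that count: with M1 = 4, M2 = 2, M3 = 1 and ten configurations per cache, Equation 5.6 gives 4 × 10 = 40 points for the pre-pass, (2 + 1) × 10 = 30 points for a forward pass and (2 + 4) × 10 = 60 points for a backward pass, so DALG = 40 + R × 90; with R = 1 this is exactly the 130 visited design points quoted above.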


5.3.3 Convergence Criteria

Consider the following two observations on the exploration process given in Algorithm 3.

1. In an FP, a cache's selected configuration may be changed due to an updated memory access trace seen by the said cache, when one or more upper level cache configuration changes have occurred. Even if there were no configuration changes made in a complete FP, the values of Ti,j,k may still have changed in one or more caches. A subsequent BP may propagate the changes in Ti,j,k values as updated miss penalties to upper levels, therefore potentially causing configuration changes.

2. In a BP, a cache's selected configuration may be altered due to the change in the miss latency/penalty seen by the said cache, when a lower level cache configuration change has occurred. Even if there were no configuration changes made in a complete BP, the values of Ti,j,k may still have changed in one or more caches. However, having no configuration changes made implies that a subsequent FP will see the same memory access traces as before and consequently the same Ti,j,k values and the same selected configurations.

Therefore, when there are no changes seen in any of the cache configurations during a BP, it can be concluded that there will not be any further changes to the cache configurations in a subsequent FP. Hence, this condition is used as the termination criterion for Algorithm 3.


5.3.4 Case for Hardware-Accelerated Simulation

As empirically shown with experimental results in Section 5.5, up to three iterations are taken to reach the stable state. This means that up to 13 trace extractions and simulations are performed in total (as per Algorithm 3) for a cache hierarchy with four L1 caches, two L2 caches and one L3 cache: four L1 traces in the pre-pass, plus three traces (two L2 and one L3) in the forward pass of each of the three iterations. A completely software based process would become infeasible in such a scenario due to extreme time consumption, as highlighted in Chapters 1 and 3.

Hardware accelerated simulation is used here to simultaneously count hits for a cache's configuration sub-design-space, similar to the design presented in Chapter 4. This enables the speeding up of simulations, and eliminates the time spent on trace extraction. The parallel simulation of the design spaces for all the caches in each level (given by line 1 in Algorithm 2 and line 6 in Algorithm 3) overlaps the simulation time, thus significantly speeding up the overall process. In contrast to the framework in Chapter 4, the hSim modules used in this chapter have the capability to integrate either the Least Recently Used (LRU) or First In First Out (FIFO) replacement policy for associative caches in the hardware simulator core.


5.3.5 Hardware-Accelerated Simulation Process

The assistance of hardware is incorporated into the FP, as that is where the effects of memory access traces are considered for subsequent lower cache levels. Figure 5.3 provides an overview of how an FPGA device connected to a host PC provides this assistance. At a given cache level Li in the FP, the hardware simulators do the following simultaneously and in parallel for each cache Ci,j:

• observe the memory access trace received by cache Ci,j;

• count hits sustained by each candidate cache configuration Ki,j,k in the cache’s sub-design-space.


Figure 5.3: Overview of the forward pass (FP), where assistance of FPGA hardware is used for parallel design space explorations on each cache level.



Figure 5.4: Example use of hardware simulators (hSim) in level L2 of a cache hierarchy. Components hSim2,1 and hSim2,2 work in parallel to simulate sub-design-spaces of the two shared L2 caches.

After the configurations Ki,j,kmin are selected at level Li, the MPSoC is updated to reflect those changes and the FP moves on to the next cache level Li+1.

An example of how the hardware simulators (hSim modules) are connected to the MPSoC in the FP is depicted in Figure 5.4. The hSim module can be connected in place of a cache memory, to simulate different configurations for that particular cache in the hierarchy.

Figure 5.5 illustrates the interfacing details of hSim, which consists of three ports. The first port connects to the previous cache level of the memory hierarchy (to receive memory addresses) and the second port connects to the next level. The third port is used for control signals. The memory accesses coming from the processor, or the previous cache(s) in the hierarchy, are passed through to the next level cache. A copy of each address is non-intrusively obtained and fed into the simulator core. Accesses coming from different sources can be connected in combination to the input port of the hSim module, which enables it to simulate configurations for shared caches.

Figure 5.5: Interface and structure of the hardware simulator (hSim) module.

Further details about the hardware-assisted simulation framework are presented in Chapter 4. The experiments presented in this chapter were conducted using the FIFO replacement policy, which is widely used in embedded systems. Interested readers are referred to the recent work [SPP14b] for a detailed description of the hardware implementation of the simulator core using the FIFO replacement policy.

5.4 Experiments

To rigorously evaluate the convergence and stability of the algorithm, a series of comprehensive experiments is presented in this section. Algorithm 3 is applied on two different quad-core MPSoCs, System A and System B, each executing a different group of four application programs. In System A, the MPSoC contains a three-level cache hierarchy with four private L1 caches, two shared L2 caches and one shared L3 cache, similar to Figure 5.2. In System B, the MPSoC contains a two-level cache hierarchy with four private L1 caches and one shared L2 cache.


Tests A1, A2, B1 and B2 are presented to demonstrate the convergence of Algorithm 3 for System A and System B. Tests A3 to A9 evaluate the stability of the final design point achieved for System A. An alternative iteration policy is investigated in Test A10. Test C1 examines the optimality of the design point found by the algorithm, using a dual-core MPSoC (System C) containing a two-level cache hierarchy with two private L1 caches and one shared L2 cache.

The 12 tests (A1 to A10, B1 and B2) were performed using the quad-core MPSoCs. Since coherence between caches is not considered in this work, non-communicating application programs were used for the experiments. A set of benchmark applications from the SPEC2006 benchmark suite (bzip2 compression, bzip2 de-compression) and the MiBench suite (lame mp3-encoding, lame mp3-decoding, rijndael aes-encryption, jpeg encoding) were used in groups of four for the experiments, as shown in Table 5.1 and Table 5.2.

Table 5.1: Applications Executed in System A

Processor Application Input Size Memory Accesses

1 lame encode 3KB 259,070,294

2 bzip2 compress 131KB 143,691,935

3 rijndael aes 41KB 238,329,483

4 jpeg encode 769KB 53,597,114

Table 5.2: Applications Executed in System B

Processor Application Input Size Memory Accesses

1 rijndael aes 17KB 99,299,728

2 bzip2 de-compress 18KB 66,365,689

3 jpeg encode 769KB 53,597,119

4 lame decode 25KB 70,702,236

First and second columns respectively give the processor core and the assigned application program. The size of the input data used is given in the third column. The last column reports the total number of memory accesses generated by the processors for the corresponding applications.

The MPSoCs in all tests were created using the Altera Quartus and Qsys tools [Altb], using Nios II/f [Nio] embedded processor instances. An in-house developed parameterized set-associative cache component was used to include caches with the desired configurations in the MPSoC at various stages of Algorithm 3. The cache component implements FIFO as the replacement policy and write-through as the write policy (it should be noted that the presented algorithm does not assume a particular write policy; the algorithm is independent of the write policy used by the system designer). An Altera DE5-NET evaluation board [Alta], depicted in Figure 5.6, containing a Stratix V GX FPGA and 512 megabytes of DDR3 SDRAM, was used to deploy the MPSoC and the hardware simulator components. CACTI 6.5 [MBJ07] was used as the analytical model to obtain timing values for different cache configurations.

Figure 5.6: Altera DE5-NET FPGA board used in the experimental setup.


Table 5.3: Design Space Parameters for Systems A and B

Caches | Block Sizes (Bytes) | Set Sizes (Powers of 2) | Associativities | No. of Configs | Design Points
System A L1 | 4, 8, 16 | 0 to 6 | 1, 2, 4 | 63 | 10.4 trillion (System A total)
System A L2 | 16, 32, 64 | 0 to 7 | 1, 2, 4 | 72 |
System A L3 | 32 to 256 | 0 to 7 | 1, 2, 4, 8 | 128 |
System B L1 | 4, 8, 16 | 0 to 7 | 1, 2, 4 | 72 | 5.4 billion (System B total)
System B L2 | 16 to 256 | 0 to 7 | 1 to 16 | 200 |

Table 5.3 reports the design space parameters for each cache in the hierarchies explored in the tests. The first column lists the cache levels of the hierarchies used in System A and System B. Block sizes in Bytes are given in column two for caches in different levels. The third column gives the set sizes as powers of two, and the fourth column gives the associativities. The fifth column reports the number of cache configurations in each cache's sub-design-space. The last column states the total number of design points in each system's cache hierarchy. With the given parameters, the three-level hierarchy in System A has 10.4 trillion design points (63^4 × 72^2 × 128 from Equation 5.2) and the two-level hierarchy in System B has 5.4 billion design points (72^4 × 200 from Equation 5.2).
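These totals follow directly from multiplying the per-cache sub-design-space sizes, as the following worked check shows (a sanity check, not part of the tool flow itself):

    # System A: four L1 caches (63 configurations each), two L2 (72 each), one L3 (128).
    system_a = 63**4 * 72**2 * 128   # ~1.04e13, i.e. about 10.4 trillion design points

    # System B: four L1 caches (72 configurations each), one shared L2 (200 configurations).
    system_b = 72**4 * 200           # ~5.4e9, i.e. about 5.4 billion design points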


5.4.1 Fairness of Comparison

Works such as [HRA+12, HKH+13, ZGR11] used software-based trace-driven simulation and were the most thorough explorations of two-level multiprocessor cache hierarchies prior to [NSJP14] (presented here in Chapter 4). The work in [NSJP14] was the first to combine the power of hardware-accelerated trace-driven simulation of a complete cache design space with generalized multi-level multiprocessor cache hierarchy exploration. Prior works used a single memory access trace to simulate configurations for every cache in the hierarchy. The framework in [NSJP14] allowed each cache to be simulated with the memory access trace actually observed by that cache, which leads to a more accurate design space exploration. Further, the method in [NSJP14] achieved improved simulation speeds, making such simulations practical in reasonable time. Hence [NSJP14] is considered the state-of-the-art and used as the reference point for comparisons, whereas the work in this chapter further improves the design space exploration to yield better results in terms of memory performance.

5.5 Test Results

The primary goal of the experiments was to evaluate the convergence of Algorithm 3 to a stable set of cache configurations, and to compare the achieved result for TTOT with that of the state-of-the-art work [NSJP14]. Additionally, the stability and optimality of the algorithm are demonstrated in the following sections.


5.5.1 Convergence

A. Test A1

A three-level cache hierarchy for the quad-core system was explored in Test A1. Figures 5.7, 5.8 and 5.9 present an overview of the results. Changes in selected configurations and average access times at different passes of the algorithm are reported for every cache. The horizontal axes mark the steps in forward pass (FP) and backward pass (BP) of each iteration.

Figure 5.7 reports changes in selected configurations. Initially, the MPSoC does not have any caches. In the FP of the first iteration, the caches are added to the system. At this stage, the miss penalty ML for each cache level is calculated using the DRAM access latency. Misses are therefore costly and high hit rates are desirable, which leads to large cache configurations being selected. During the BP of the first iteration, the existence of lower level caches is taken into account, which makes upper level misses less costly. When selecting configurations, the importance of keeping the miss penalty low diminishes while the importance of reducing hit latency increases. Therefore, smaller configurations are selected with faster average access times, albeit with reduced hit rates.

Figure 5.7: Results from Test A1. Changes in selected configuration sizes for the caches Ci,j at the design point reached in each iteration step.

The changes in average access times (Ti,j,kmin ) observed by the algorithm are reported in Figure 5.8, which correspond to the configuration changes shown in Figure 5.7.

For example, a configuration change happens for C2,1 at the 1-BP L2 step, resulting in a reduced T2,1,kmin. The value of Ti,j,k is calculated considering the currently selected cache configurations in the lower cache levels, and depending on the access traces received from upper levels, as described in Section 5.3. When a configuration change is made to a cache, the previously calculated Ti,j,kmin values for caches in other levels become obsolete. Subsequent iterations over the hierarchy update these values until no further changes need to be made.
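To see why a change in one level invalidates the values computed for the others, the per-configuration metric can be thought of as hit latency plus miss rate times miss penalty, where the miss penalty at level Li is the access time of the configuration currently selected at level Li+1 (or the DRAM latency below the last level). The numbers in the sketch below are illustrative only; the exact model is the one described in Section 5.3.

    def access_time(hit_latency, hit_rate, miss_penalty):
        # Average time per access for one candidate configuration.
        return hit_latency + (1.0 - hit_rate) * miss_penalty

    # If the selected L2 configuration changes and its own access time drops
    # from 4.0ns to 3.0ns, every L1 candidate's metric changes with it, which
    # is why previously computed Ti,j,k values become obsolete.
    t_l1_before = access_time(hit_latency=1.0, hit_rate=0.95, miss_penalty=4.0)  # 1.20ns
    t_l1_after  = access_time(hit_latency=1.0, hit_rate=0.95, miss_penalty=3.0)  # 1.15ns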

As shown in Figure 5.9, the configuration changes made at each step result in a continued reduction in TTOT as the algorithm progresses. The algorithm converges to a stable state after three iterations. At the termination of the algorithm, there are no further changes to be propagated from any cache level to another (i.e. no selection of Ki,j,k can be made with a lesser Ti,j,k for any cache). T1,j,k at this design point gives the final set of average times spent on a memory access by each processor core in the multiprocessor environment.

Figure 5.8: Results from Test A1. Changes in resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step.


Figure 5.9: Results from Test A1. Changes in TTOT at the design point reached in each iteration step.


Table 5.4: Selected Design Point in Test A1

Cache Block Size (Bytes) Set Size Associativity Cache Size (Bytes)

C1,1 8 32 1 256

C1,2 8 16 4 512

C1,3 16 16 1 256

C1,4 8 16 4 512

C2,1 32 32 2 2048

C2,2 32 32 4 4096

C3,1 256 128 8 262144

Figure 5.7 and Figure 5.9 highlight the differences between the design points reached by Algorithm 3 and the state-of-the-art [NSJP14], where only a single pass is performed. It can be observed that the sizes of the selected cache configurations have reduced in the new iterative method (by up to 93.75%), while reducing TTOT by 16% at the same time. Using Equation 5.6, the number of explored design points is 2256, which amounts to a mere 2.2 × 10^-8 % of the design space.
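The quoted fraction is simply the ratio of explored points to the total size of the design space, as a quick check confirms:

    explored = 2256
    total = 1.04e13                      # design points in System A's hierarchy
    fraction = explored / total * 100    # ~2.2e-8 percent of the design space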

Table 5.4 reports the set of selected cache configurations at the final design point. First column gives the cache identifier. Second, third and fourth columns present the Block Size, Set Size and Associativity of the selected configurations respectively. The last column gives the size of the selected configuration.


B. Test B1

Similarly, Figures 5.10, 5.11 and 5.12 present the algorithm's convergence for the two-level cache hierarchy in Test B1. The design points visited at each iteration step and the corresponding values of TTOT are shown in Figure 5.10 and Figure 5.12 respectively. Figure 5.11 gives the successive Ti,j,kmin values calculated for each cache in the hierarchy.

Figure 5.10: Results from Test B1. Changes in selected cache configuration sizes for the caches Ci,j at the design point reached in each iteration step.

Figure 5.11: Results from Test B1. Changes in resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step.

Figure 5.12: Results from Test B1. Changes in TTOT at the design point reached in each iteration step.


Table 5.5: Selected Design Point in Test B1

Cache Block Size (Bytes) Set Size Associativity Cache Size (Bytes)

C1,1 16 64 2 2048

C1,2 8 32 1 256

C1,3 8 16 4 512

C1,4 8 32 1 256

C2,1 128 16 8 16384

After three iterations, the algorithm converges to the design point reported in Table 5.5. It can be observed that the sizes of the selected cache configurations have reduced in the new iterative method (by up to 93.75%), while reducing TTOT by 18.9% at the same time. Using Equation 5.6, the number of explored design points is 1952, which is a mere 3.6 × 10^-5 % of the total design space.

In contrast to the new iterative method, the state-of-the-art work [NSJP14] presented in Chapter 4 optimizes each individual cache only once. Table 5.6 reports the resulting design points from the new iterative method in comparison with the state-of-the-art. The first column indicates the test number and the second column indicates the measure to be compared (size of caches and value of TTOT). The third and fourth columns respectively report the values of the corresponding measure at the final design point from [NSJP14] and from Algorithm 3. The last column presents the achieved reduction as a percentage, for the corresponding measure. Algorithm 3 was able to find design points with smaller caches in both Tests A1 and B1. For example, the size of C1,3 was reduced from 4KB to 256B in Test A1, which is a 93.75% reduction. The iterative method was able to reduce the total size of the caches used by 10.37% in Test A1 and by a significant 74.15% in Test B1. At the same


Table 5.6: Results in Comparison

Test | Measure | [NSJP14] | Algorithm 3 | Reduction [%]
A1 | C1,1 size | 1KB | 256B | 75.00
A1 | C1,2 size | 1KB | 512B | 50.00
A1 | C1,3 size | 4KB | 256B | 93.75
A1 | C1,4 size | 1KB | 512B | 50.00
A1 | C2,1 size | 16KB | 2KB | 87.50
A1 | C2,2 size | 16KB | 4KB | 75.00
A1 | C3,1 size | 256KB | 256KB | 0.00
A1 | Total cache size | 294KB | 263.5KB | 10.37
A1 | TTOT | 3.32ns | 2.79ns | 16.0
B1 | C1,1 size | 4KB | 2KB | 50.00
B1 | C1,2 size | 1KB | 256B | 75.00
B1 | C1,3 size | 8KB | 512B | 93.75
B1 | C1,4 size | 512B | 256B | 50.00
B1 | C2,1 size | 64KB | 16KB | 75.00
B1 | Total cache size | 73.5KB | 19KB | 74.15
B1 | TTOT | 3.81ns | 3.09ns | 18.9

time, the overall average memory access time was reduced by 16.0% in Test A1 and by 18.9% in Test B1, achieving faster MPSoC performance. These observations evidently reinforce the hypothesis that carefully selected smaller caches have the potential for better performance than larger caches in application specific systems.


Table 5.7: Explored Portion of Design Space

Test | Total Design Points | Explored in [NSJP14] | Explored in Algorithm 3 | Increase in No. of Design Points
A1 | 1.04 × 10^13 | 524 | 2256 | 4.31×
B1 | 5.4 × 10^9 | 488 | 1952 | 4×

Table 5.7 reports the increased number of design points evaluated by Algorithm 3, compared to the state-of-the-art [NSJP14]. Column one indicates the test number, while the total size of the design space is given in column two. The third column gives the number of design points explored in [NSJP14]. The fourth column reports the total number of design points explored by Algorithm 3, and the last column reports the increase compared to column three. The results show that the new iterative method explores 4.31× and 4× larger portions of the design spaces in Tests A1 and B1 respectively, in order to find better design points for the cache hierarchies. However, the absolute increase in design points is in the thousands, which remains practically feasible in reasonable time since the power of hardware-accelerated simulation is exploited.

C. Tests A2 and B2

The goal of Tests A2 and B2 is to investigate whether the algorithm converges to the same design point regardless of whether the exploration starts from the top of the cache hierarchy or the bottom (i.e. either L1 or LN). This essentially changes the starting point of the optimization in the design space and makes the algorithm take a completely different path of design points. Either approach was taken in the previous non-iterative methods [HRA+12] and [HKH+13].



Figure 5.13: Changes in selected cache configuration sizes for the caches Ci,j at each iteration step, in (a) Test A2 and (b) Test B2 where exploration starts at level LN .

Final design points reached are the same as those of Tests A1 and B1 respectively.


For this, the same cache hierarchies were explored as in Tests A1 and B1, starting from the bottom cache level. To achieve this, the pre-pass of the algorithm has to be altered such that cache levels from the bottom to the top are traversed once prior to the first FP.

An overview of the results is presented in Figure 5.13. The pre-pass performs three simulations in Test A2 (Figure 5.13(a)), at levels L3, L2 and L1 in sequence, before the first FP takes over the iterations. Similarly, in Test B2 (Figure 5.13(b)), two simulation steps are performed in the pre-pass. Selected configurations are reported for every cache, at different iteration steps in the algorithm. The process follows different points in the design space when compared to Tests A1 and B1, as shown in Figure 5.13. However, the algorithm terminates at the same design points given in Table 5.4 and Table 5.5, empirically showing that convergence can be achieved regardless of whether the optimization starts at the top or the bottom of the hierarchy.

As in the previous tests, the initially selected configurations and their associated access times (Ti,j,kmin) are made obsolete by successive changes made in other cache levels. The size increase in selected cache configurations happens due to the sharing of lower level caches. For a given shared L2 cache, the observed accesses coming from a particular L1 cache (denoted L1*) reduce when the size of that L1 cache is increased. Depending on the interleaving pattern of accesses received from the other L1 caches, the reduced accesses coming from L1* are likely to incur more misses than before. Therefore, large configurations with high hit rates generally tend to be favourable for shared lower level caches.


5.5.2 Simulation Times

In order to propagate the effects of access trace changes to the lower levels in a hierarchy, a number of cache design space simulations are performed in the FP. In each of these simulations, cache hits for a number of configurations are simultane- ously counted. At each cache level, simulations are performed by using hardware simulators in parallel for all caches in the level.

The overall time taken by the hardware simulators to count cache hits is reported in Table 5.8. Columns 2 to 5 respectively report the measurements for Tests A1, A2, B1 and B2. Row one gives the number of simulation steps encountered by the algorithm, where HRi,j,k is calculated. Each simulation step requires the MPSoC to

Table 5.8: Simulation Times when using Hardware Assistance

| Test A1 | Test A2 | Test B1 | Test B2
No. of Simulation Steps (No. of Re-synthesis) | 7 | 7 | 4 | 4
Time for Re-synthesis (hours) | 10.5 | 10.5 | 6 | 6
Time for Simulation (s) | 1595 | 2681 | 303 | 476
Total Time (hours) | 10.94 | 11.25 | 6.08 | 6.13
Total Memory Accesses (millions) | 2600 | 4305 | 572 | 799
Sim. Time per million Accesses (s) | 0.61 | 0.62 | 0.53 | 0.59

be re-synthesized. The time taken for re-synthesis of the MPSoC is reported in row two. Time spent on simulations is reported in row three. The total times for re-synthesis and simulation are presented in the fourth row. The fifth and sixth rows present the total number of memory accesses consumed by the hardware simulators, and the respective simulation time per million memory accesses.

As shown in Algorithm 3, each simulation step is associated with integrating the hardware simulators into the MPSoC, including caches with the selected configurations, and re-synthesizing. Re-synthesis takes several minutes to a few hours, depending on the size of the selected caches, the size of the configuration design spaces and the size of the FPGA. However, the use of hardware simulators is still appealing when the ability to simulate several design spaces in parallel is considered, and when compared to the lengthy time needed to extract several traces if hardware simulators are not used.

5.5.3 Stability and Empirical Optimality

A. Tests A3 to A9

Due to the vast proportions of the design space, exhaustively finding a proof of optimality for the final design point is practically infeasible. Instead, a set of tests (A3 to A9) was conducted in order to assess the stability of the final design point reached by Algorithm 3. In each test, an offset was manually introduced to the final design point after the algorithm converged for System A. Then the algorithm was allowed to continue iterating, to check whether and when it returns to the originally selected design point.


To select random offset design points, Latin Hypercube Sampling [IDZ80] was used, which is a standard statistical method to select a plausible collection of parameter values covering a multidimensional design space. The selected offset points include the ones with maximum and minimum cache sizes in the design space. Table 5.9 reports the total cache sizes for the random offset design points.

In each test A3 to A9, the algorithm was first applied on System A to find the original convergence. Thereafter, a random offset from Table 5.9 was applied to the selected design point, and the exploration process was continued. The algorithm succeeded

Table 5.9: Offset Design Points from Latin Hypercube Sampling

Test Offset Design Point (Total cache size in Bytes)

A4 40064B

A5 14848B

A6 9088B

A7 29696B

A8 311296B

A9 1024B


Figure 5.14: Number of iterations taken to re-stabilize when an offset is manually introduced to the originally selected design point for System A.

in returning to the original design point in all seven tests, underlining its stability. Figure 5.14 presents a summary of the results. An average of 1.29 iterations were taken for the algorithm to re-stabilize. These results also indicate that the final design point is potentially a global optimum and not a local one.

B. Test C1

It is practically infeasible to conduct an exhaustive search due to the sheer size of a typical cache hierarchy design space. However, for the sake of comparison and completeness, a minute instance of the problem can be explored using Algorithm 3 as well as an exhaustive method. Such a comparison enables further investigation of the optimality of the final design point.

In Test C1, Algorithm 3 was compared with an exhaustive exploration of a small design space instance for a dual-core MPSoC (System C) executing lame mp3 encoding and jpeg encoding. In System C, the MPSoC contains a two-level cache hierarchy with two private L1 caches and one shared L2 cache. Table 5.10 reports the design space parameters used in Test C1. The first column states the cache level, and columns two to four respectively show the ranges of block size, set size and associativity. Column five reports the number of configurations in each cache's sub-design-space, while the last column gives the total number of design points.

Table 5.10: Design Space Parameters for System C

Caches | Block Sizes (Bytes) | Set Sizes | Associativities | No. of Configs | Design Points
L1 | 8, 16 | 32, 64 | 2, 4 | 8 | 768
L2 | 8, 16 | 64, 128 | 2, 4, 8 | 12 |



Figure 5.15: Results from Test C1. Changes in: (a) selected configuration sizes for the caches Ci,j; (b) resulting Ti,j,kmin for the caches Ci,j as seen by the algorithm, at the design point reached in each iteration step.


As illustrated in Figure 5.15, after two iterations, Algorithm 3 was able to converge to the design point described in Table 5.11. The first column in Table 5.11 specifies the cache identifier, and columns two to four respectively show the block size, set size and associativity of the selected configurations. The last column reports the size of the selected caches. The exhaustive search consumed several days (in the computing environment reported in Section 5.4) to find the absolute optimal design point depicted in Figure 5.16, which is the exact same point described in Table 5.11. This result further strengthens the case for the optimality of the proposed iterative algorithm.


Figure 5.16: Design space in System C, showing the optimal design point. Vertical axis denotes TTOT . Horizontal axis denotes the total cache size of the hierarchy.


Table 5.11: Optimal Design Point for System C

Cache Block Size (Bytes) Set Size Associativity Cache Size (Bytes)

C1,1 8 32 2 512

C1,2 8 64 2 1024

C2,1 16 128 8 16384

5.5.4 Alternative Iteration Policies

As an alternative to the back-and-forth style iterations of Algorithm 3, Round Robin traversal of cache levels was investigated in Test A10 for System A. The optimization starts at cache level L1 and proceeds one level at a time down to LN, as before. After level LN, the optimization moves directly back to L1, instead of performing a backward traversal through all cache levels.

Figure 5.17 presents a summary of the Test A10 results. The final design point reached through this method is the same as that of Test A1, as given in Table 5.4. However, the Round Robin method took four iterations to reach the final point, as opposed to the three iterations in Test A1. A total of nine simulation steps (one for L1 and eight for L2 and L3 in the four iterations) were required, as opposed to only seven simulation steps in Test A1 using Algorithm 3.



Figure 5.17: Changes in selected cache configuration sizes for the caches Ci,j at each iteration step, in Test A10 where Round Robin traversal of cache levels is used. Final design point reached is the same as that of Test A1.

Discussion: Generic evolutionary algorithms such as [MPZS12] need the objective function (i.e. cache access time or execution time) to be evaluated individually for each design point in the population. Hence, such algorithms cannot exploit the capabilities of fast hardware cache simulators [NSJP14], which quickly explore the sub-design-spaces of individual caches in a level. The algorithm presented in this chapter is designed to traverse the cache hierarchy level by level, exploring individual cache sub-design-spaces, and hence employs the assistance of hardware cache simulation to perform fast design space exploration.


5.6 Summary

This chapter presented an algorithm which, for the first time, traverses a multi-level MPSoC cache hierarchy iteratively and finds a suitable design point (set of cache configurations) improving the average memory access speed. The assistance of hardware simulation was used for fast calculation of cache hits and for real-time trace extraction, which made it feasible to perform multiple iterations over a cache hierarchy. The convergence of the algorithm and the stability of the final design point were demonstrated using several comprehensive tests. The tests show that the same design point can be reached regardless of the starting point and also with different iteration policies.

Cache configuration simulation for each cache's design space can be performed quickly due to the use of hardware simulators. The multiple syntheses of the MPSoC required by the algorithm may still consume considerable time. However, it is several orders of magnitude lower compared to spending days or weeks on trace extraction for software-based simulators. Advanced techniques such as partial reconfiguration and incremental synthesis of FPGA hardware are available for designers to greatly reduce the time consumed by repeated synthesis.

Chapter 6

Dark Silicon and Application Specific Cache Optimizations

6.1 Introduction

As technology nodes continue to shrink, future Silicon chips are predicted to have transistors in such abundance that the whole chip cannot be powered simultaneously as the power consumption per transistor does not continue to scale (known as the Dark Silicon phenomenon). According to Taylor [Tay12], a staggering 93.75% of a chip design has to be kept dark (powered off) by the year 2020 in order to maintain safe operating temperatures. Many researchers have proposed methods to leverage the Dark Silicon on a chip to perform application specific optimizations [CX13, CMP+14, TRGM13].


Cache memories are consistently used to bridge the performance gap between processor and memory, and to reduce the energy consumption of memory accesses [GRZVD04, HS12]. Assuming that the largest cache configuration provides the fastest memory access time is a common misconception, as explained in Chapter 1. As shown in the literature [Kha14, LM08, SPP14b], different applications executing on the same processor often secure the best performance with different cache configurations. Configurable caches being readily available to designers further motivates the application-specific tuning of cache parameters.

In the domain of embedded systems, a typical processor repeatedly executes a set of applications. Using a fixed cache configuration in such a system will often yield suboptimal performance. For example, Figure 6.1 shows the estimated average cache access times for a processor executing four application programs (adpcm, bzip2, fft, fdct) with different cache compositions. Data point C1 represents the scenario with all applications using their optimal cache configurations. Data points C2-C9 represent a set of scenarios where a fixed cache configuration is used for all four applications. From the experiments, C2-C9 perform between 42% and 179% slower compared to C1. Therefore, having the ability to use distinct cache configurations for different applications can allow significant memory access performance gains. To exploit this benefit, a simple cache architecture (called switchable cache) is described in this chapter, which can change between different pre-determined configurations at run-time, by leveraging the Dark Silicon area offered by future chips.

When the available Dark Silicon budget limits the number of different configurations in the switchable cache, and a higher number of application programs are to be executed by the processor (say only four cache configurations can be accommodated while eight applications execute on the system), selecting an optimal set of configurations for the switchable cache becomes a new design problem. As described in Section 6.3.2, the design space for such a problem can easily grow to vast proportions (several trillions of design points). A new design-time algorithm is presented to rapidly pre-determine the optimal or a near-optimal set of switchable cache configurations, which maximizes the cache access performance for a given group of application programs. This work is the very first in the direction of switchable caches and their associated design space exploration problems.

C1: Each application using its optimal cache configuration

Optimal Cache Configuration | Block Size (Bytes) | Set Size | Associativity | Cache Size (Bytes)
adpcm | 4 | 64 | 8 | 2048
bzip2 | 4 | 64 | 16 | 4096
fft | 8 | 16 | 4 | 512
fdct | 8 | 64 | 2 | 1024

C2-C9: Fixed cache configuration

Fixed Cache Configuration | Block Size (Bytes) | Set Size | Associativity | Cache Size (Bytes)
C2 | 4 | 128 | 1 | 512
C3 | 4 | 64 | 4 | 1024
C4 | 8 | 64 | 16 | 8192
C5 | 8 | 64 | 4 | 2048
C6 | 16 | 64 | 4 | 4096
C7 | 16 | 32 | 2 | 1024
C8 | 32 | 128 | 2 | 8192
C9 | 32 | 16 | 8 | 4096

Figure 6.1: Average cache access time for a group of four applications (adpcm, bzip2, fft, fdct) when using variable and fixed cache configurations.

Highlights:

• This chapter presents the first work to perform cache design optimization in the context of Dark Silicon.

• A cache architecture with minimal overheads is described, which can switch at run-time between different configurations pre-determined at design-time, by leveraging the chip area due to Dark Silicon offered by future chips.

• A fast design space exploration algorithm is presented, to find the optimal or near-optimal set of switchable cache configurations for a given group of applications along with the cache-to-application mapping, when there are more applications than the number of cache configurations that can be accommodated in the system. The presented DSE method is applicable to switchable caches as well as traditional reconfigurable caches.

The rest of this chapter is organized as follows: Section 6.2 presents the design and implementation of the proposed switchable cache architecture; the DSE problem and methodology associated with the switchable cache are detailed in Section 6.3; Section 6.4 presents demonstrations of the proposed DSE algorithm through extensive testing. Extended use cases for the switchable cache architecture and the proposed design space exploration algorithm are discussed in Section 6.5.


6.2 Switchable Cache Architecture

Enabling application programs to use distinct cache configurations, as opposed to a fixed configuration, allows for greater performance (as seen in the example shown in Figure 6.1). The proposed architecture provides the facility to encapsulate a set of cache cores, each with a unique configuration, and a mechanism to change between the configurations at run-time.

Traditional reconfigurable caches employ dynamic re-organization of tag and data arrays to share hardware between different configurations. Complexity is therefore increased with additional hardware to perform reconfiguration at run-time, thereby increasing the timing delays and the amount of Bright Silicon. Reconfiguration may alter the structures of tag and data arrays to achieve performance gains with various applications. However, reconfigurable caches can only represent a limited set of configurations, as increasing one cache dimension necessitates another dimension being reduced. For example, the 8KB reconfigurable cache presented in [ZV03], with a maximum of four associative ways and two logical block sizes, can only represent 18 specific configurations. In contrast, a typical cache design space can consist of over 300 configurations to select from. Hence, a given reconfigurable cache is unlikely to provide significant performance gains for a wide range of applications. With the switchable cache, a simple multiplexing over conventional cache cores is proposed, which imposes negligible overheads, is not limited to a small subset of configurations, and enables legacy cache hardware to be used.

Implementation of the switchable cache is described in Figure 6.2. From a system perspective, the overall module appears as a regular cache, with the exception of an additional control port. The Switchable Cache Control port can be used by the CPU to activate a cache core with the desired configuration, or to check the currently used cache core. The configuration selection unit contains a single register and provides simple functionality to select a cache configuration, by writing to the register, or to check the current selection. Signals generated by the selection unit are used to power up only the desired cache core and to multiplex the memory accesses among the cache cores. Two multiplexer layers are used, one at the CPU-side interface and the other at the memory-side interface. Each layer contains circuitry to multiplex signals both ways (to and from the cache). The selection unit generates the corresponding control signals for the multiplexers, based on the value of the selection register.

Figure 6.2: Implementation of the switchable cache.


Figure 6.3: Example switchable cache use cases. Each application uses its optimal cache configuration.

Figure 6.3 illustrates a usage scenario for the switchable cache. The system executes four application programs, and contains cache cores with optimal configurations for each application. Only the cache core corresponding to the active application is kept powered. All other cache cores are powered off, or kept dark.

To evaluate the overheads of the two-way multiplexer circuitry and control logic, a switchable cache and a fixed cache were implemented with the same configuration (block size = 16B, set size = 256, associativity = 1, size = 4KB). In Table 6.1, columns two and three respectively report the path delay and power from Synopsys Design Compiler using 45nm technology. Column four gives the logic utilization in Adaptive Logic Modules (ALMs) from Altera Quartus II. The two-way multiplexer logic was shown to impose an additional 0.13ns delay on cache accesses.


Table 6.1: Overheads of Switchable Cache

Path Delay Power Logic Utilization

(ns) (mW) (ALMs)

Fixed Cache 1.19 0.3755 3373

Switchable Cache 1.32 0.3930 3507

Overhead 0.13 (10.9%) 0.0175 (4.7%) 134 (3.9%)

Table 6.2: Candidate Cache Configurations

Block Sizes (Bytes) | Set Sizes | Associativities | No. of Configs
4 to 256 | 1 to 256 | 1, 2, 4, 8, 16 | 315

A set of experiments were performed to evaluate the performance of the switchable cache as opposed to using a fixed cache. Memory access patterns of four individual applications (adpcm, bzip2, fft, fdct) were analysed to find the optimal cache configuration for each application, out of the 315 candidate configurations described in Table 6.2. The first three columns in Table 6.2 present the ranges of block size, set size and associativity, in that order. Column four gives the total number of configurations with the given parameters. Simulation techniques from [SPP14b] were used to obtain the cache hit rates for every combination of application and cache configuration, using simultaneous hardware simulation. CACTI 6.5 [MBJ07] with 32nm technology was used for the analysis and calculation of cache access times. The experiments were carried out using an SoC containing a Nios II/f [Nio] embedded processor, on a Stratix V GX FPGA with 512 megabytes of DDR3 SDRAM. All candidate cache configurations were simultaneously simulated in hardware using real-time memory access traces of all four applications, to calculate cache hit rates and average access times. Figure 6.4 presents a detailed schematic symbol of the switchable cache as implemented in the Altera Qsys system integration tool [Altb].

Figure 6.4: Detailed schematic symbol showing all signals for the switchable cache as implemented in Altera Qsys.


Table 6.3: Average Cache Access Times

Cache Composition | Average Cache Access Time (ns)
C1 | 0.66
C2 | 1.58
C3 | 0.99
C4 | 1.18
C5 | 0.94
C6 | 1.41
C7 | 1.85
C8 | 1.53
C9 | 1.57

The results from the experiments are reported in Figure 6.1 and Table 6.3. The first column of Table 6.3 gives the cache composition (described in Figure 6.1) and the second column reports the average time per cache access. In the first experiment, cache cores with the optimal configuration for each application were included in the switchable cache (composition C1). The corresponding average time per cache access was calculated to be 0.66ns. In subsequent experiments, fixed cache cores with a random candidate configuration (compositions C2-C9) were used for all applications. The calculated average cache access times ranged from 0.94ns to 1.85ns. The experiments show that having the ability to change between the optimal cache configurations for individual applications can improve memory access speed by 42% (compared to fixed configuration C5) to 179% (compared to fixed configuration C7).
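The 42%-179% range follows directly from Table 6.3 by comparing each fixed composition against C1; the small check below uses the rounded table values, so the largest ratio comes out marginally above the reported figure:

    c1 = 0.66                                           # ns, switchable cache (C1)
    fixed = {"C2": 1.58, "C3": 0.99, "C4": 1.18, "C5": 0.94,
             "C6": 1.41, "C7": 1.85, "C8": 1.53, "C9": 1.57}
    slowdown = {name: (t / c1 - 1.0) * 100 for name, t in fixed.items()}
    # Smallest slowdown: C5 at about 42%; largest: C7 at about 180%.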

When deciding the set of unique cache configurations to put into the switchable cache, the optimal configuration for every application is selected. This requires a priori knowledge about the applications at design-time. Design space exploration methods such as [SPP14b] can be used to identify the optimal cache configuration for each application. The application-to-cache mapping is not hardwired in the proposed switchable cache mechanism. Using the provided control port, the application program itself has the capability to activate the required cache configuration at run-time. In the case where a new program (which was not known at design-time) is to be executed on the system, an analysis such as that in [SPP14b] can be performed to identify the most suitable cache configuration out of the ones already available in the fabricated switchable cache. The program can then activate the selected cache configuration at run-time.


6.3 Switchable Cache Tuning

A new problem arises when there are more applications than the number of switchable cache configurations affordable with the available Dark Silicon budget. In such a scenario, application programs will invariably have to share cache configurations, as depicted in Figure 6.5. Thus, a set of configurations has to be selected for the switchable cache and assigned to the programs in such a way that the memory access performance of all the applications as a group is optimized. The same optimization problem is applicable to designing reconfigurable caches as well.

It should be noted that to achieve an optimal solution for the complete group of applications, certain application programs may need to use sub-optimal cache con- figurations.

Figure 6.5: Example scenario with eight application programs and four switchable cache configurations. More than one application shares the same cache configuration (Applications B and E share cache configuration 2 to achieve the best performance).


Subsections 6.3.1 and 6.3.2 respectively define the problem and provide a mathematical analysis.

6.3.1 Problem Formulation

Given an SoC containing a switchable cache, as described in Section 6.2, with:

• a Dark Silicon budget, in number of switchable configurations NS;

• a set of known application programs Ai

(Ai|1 ≤ i ≤ NA, NA > NS) to be executed, with:

– known frequency of occurrence fi for each application Ai (normalized to a scale of 0 to 1); and

• a set of known candidate cache configurations Kj

(Kj|1 ≤ j ≤ NCC , NCC > NS) with:

– known hit latency HLj for each configuration Kj; and

– known update latency ULj for each configuration Kj.

select the set of NS configurations from the candidates for the switchable cache which minimizes the average cache access time for the set of NA application programs.

Note: NA = NS = NCC gives the simplest form of the problem, which has a trivial solution.


6.3.2 Analysis

The term Tij is defined here as the cache access time for application program Ai using cache configuration Kj. Equation 6.1 describes the system of equations representing the cache access times of configuration Kj for all applications Ai. HRij represents the cache hit rate achieved by application Ai with cache configuration Kj. Tm denotes the access time for the main memory, which is accessed in case of a cache miss.

Tij = HLj + (1 − HRij) × (Tm + ULj),   for i = 1, …, NA        (6.1)

The goal is to find a set of cache configurations such that the weighted average cache access time over all application programs is minimized. In other words, Tavg as given in Equation 6.2 needs to be minimized, where the coefficient fi is the normalized frequency of occurrence for application Ai. Constant values for fi may be determined at design-time based on the expected system behavior, or assumed to be fi = 1 when the occurrence frequencies are indeterminable.


Tavg = ( Σ_{i=1}^{NA} fi × Tij ) / ( Σ_{i=1}^{NA} fi )        (6.2)
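Equations 6.1 and 6.2 transcribe directly into code, which is convenient for checking a candidate assignment by hand. All numeric inputs below are placeholders:

    def t_access(hl, hr, ul, tm):
        # Equation 6.1: access time of one application on one configuration.
        return hl + (1.0 - hr) * (tm + ul)

    def t_avg(times, freqs):
        # Equation 6.2: frequency-weighted average access time over all applications,
        # where times[i] is Tij for application Ai under its assigned configuration.
        return sum(f * t for f, t in zip(freqs, times)) / sum(freqs)

    times = [t_access(hl=0.9, hr=0.97, ul=0.4, tm=20.0),   # application A1
             t_access(hl=1.1, hr=0.99, ul=0.5, tm=20.0),   # application A2
             t_access(hl=0.9, hr=0.93, ul=0.4, tm=20.0)]   # application A3
    freqs = [1.0, 0.6, 0.4]
    average = t_avg(times, freqs)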

Without the restriction of NS (Dark Silicon budget), NA cache configurations could be selected from the candidates, each of which minimizes Tij for at least one ap- plication. Let NU be defined as the size of the set of unique configurations, which minimize Tij for all applications (note that NU < NA). When the number of config- urations that can be put into the switchable cache is limited (i.e. NS < NU ), a set of NS configurations need to be carefully selected from the set of candidates, to be shared between applications according to a particular assignment.

Size of design space = C(NCC, NS) × (NS)^NA        (6.3)

Equation 6.3 gives the total size of the design space. There are C(NCC, NS) possible ways to select NS configurations for the switchable cache out of NCC candidates. For any selected NS configurations, there are (NS)^NA possible ways to map the NA applications (i.e. assign applications to selected cache configurations). For example, if there are eight application programs (NA = 8), 315 candidate cache configurations (NCC = 315) and four configurations need to be selected for the switchable cache (NS = 4), Equation 6.3 gives over 26 trillion (26.38 × 10^12) design points.

The size of the design space increases exponentially with NA and NCC .
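A one-line check of the quoted figure using Equation 6.3:

    from math import comb

    N_A, N_CC, N_S = 8, 315, 4
    design_points = comb(N_CC, N_S) * N_S ** N_A   # ~2.638e13, i.e. over 26 trillion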

Actual cache access time is used as the objective function instead of clock cycles, since clock cycle time depends on other components in the system. However if a clock cycle time Tclock can be provided, ceil(Tavg/Tclock) can be used as the objective.


6.3.3 Exploration Algorithm

An overview of the proposed design space exploration method is presented in Algorithm 4. The first task is to obtain the cache access timing Ti,j for all NA application programs when using all NCC cache configurations. Simulation techniques from [SPP14b] and CACTI 6.5 [MBJ07] were used to obtain Ti,j, as given by lines 1-3 in Algorithm 4. All cache configurations are simultaneously simulated using a hardware simulator [SPP14b] connected to a processor which executes the applications.

Algorithm 4: Selecting optimal NS cache configurations for switchable cache

// Simulate: obtain Ti,j for all NA programs on all NCC cache configurations.
1  for i := 1 to NA do
2      ∀ j := 1 to NCC : simultaneously evaluate HRi,j
3      ∀ j := 1 to NCC : calculate Ti,j
// Preprocess: sort Ti,j for all NA programs.
4  for i := 1 to NA do
5      Sort the list of Ti,j in ascending order.
// Search: determine the optimal set of NS cache configurations by tree-search.
6  Create root node with the set of top (0th) cache configurations in the sorted lists as the selected set.
7  for i := 1 to NA do
8      root→selected_conf[i] := 0
9  root→NU := determine_NU(root→selected_conf)       // number of unique cache configurations in the selection
10 root→Tavg := determine_Tavg(root→selected_conf)   // Tavg for the selected set of cache configurations
11 min_Tavg := ∞                                     // currently found best Tavg
12 optimal := find_minimum(root)                     // recursive search function


The hardware simulator provides the hit rates for every cache configuration, which are then combined with the cache timing data from [MBJ07] to obtain the Ti,j values. Each application's list of Ti,j is then sorted in ascending order (lines 4 and 5).

To efficiently explore the vast design space, the exploration problem is formulated as a search tree. Fast search times are achieved through careful design of the tree nodes and the tree expansion procedure, which are explained in detail in the following paragraphs. Figure 6.6(a) depicts the structure of a tree node. A tree node represents a selected set of cache configurations assigned to the set of applications in a particular manner. Hence, each tree node is attributed with: NU, the number of unique cache configurations in the selected set; and Tavg, the average cache access time when using the selected set of configurations. NU in a selected set may range from one configuration (being shared among all applications) to as many configurations as the number of applications.

At the beginning of the search, a root node is created (line 6 of Algorithm 4). In the root node, each application is assigned its optimal cache configuration, obtained from the top of the sorted lists (lines 7 and 8). NU and Tavg attributes of the root are then calculated (lines 9 and 10). Tavg at the root node is the lowest in the search tree. However, it should be noted that the root node may not represent a valid design point because NU at the root is almost always greater than NS, except in rare cases where many applications share the same optimal cache configuration.

Current best min Tavg is initialized to a large number (line 11), and the recursive search function is called on the root node (line 12).



Figure 6.6: (a) Search tree node structure. (b) Example of tree level expansion.

Algorithm 5 describes the recursive tree-search function to find the optimal (or near-optimal) design point. First, the function expands the next level of tree nodes based on the provided root node (lines 1-9). Figure 6.6(b) illustrates the expansion. NA new nodes are created from the root node. Each new node has a single application's cache configuration changed compared to the root. The change is made by selecting the next configuration in order from the sorted list of Ti,j (lines 3-5), which aims to reduce NU by letting Tavg increase slightly. The target is to find a node with NU equal to (or less than) NS, which denotes a valid design point.


Algorithm 5: Recursive tree-search to find minimal Tavg.
Function: find_minimum
Input: root node

   // Expand next tree level, with only one application's cache config changed in each node.
 1 for i := 1 to NA do
 2     Create new node.
 3     for k := 1 to NA do
 4         new→selected_conf[k] := root→selected_conf[k]
 5     new→selected_conf[i] := root→selected_conf[i] + 1
       // Find number of unique cache configurations in the selection.
 6     new→NU := determine_NU(new→selected_conf)
       // Find Tavg for the selected set of cache configurations.
 7     new→Tavg := determine_Tavg(new→selected_conf)
       // Heuristic: NU must be non-increasing in the next tree level.
 8     if new→NU ≤ root→NU then
 9         root→next[i] := new
   // Find the node with minimum Tavg in the new level.
10 min_node := determine_min(root→next)
   // Base case: the minimum node on the next level satisfies the condition on NS.
11 if min_node→NU ≤ NS then
12     if min_node→Tavg < min_Tavg then
13         min_Tavg := min_node→Tavg          // Current best Tavg.
14     return(min_node)
   // Recursive case: the minimum node on the next level does not satisfy the condition on NS.
   // Sub-trees under all new (non-discarded) nodes should be searched.
15 else
16     for i := 1 to NA do
17         if root→next[i]→Tavg < min_Tavg then
18             branch_min_node := find_minimum(root→next[i])
19             if branch_min_node→Tavg < min_Tavg then
20                 min_Tavg := branch_min_node→Tavg
21                 min_node := branch_min_node
22     return(min_node)


To improve the efficiency of the algorithm, a heuristic is applied which restricts the new nodes to have non-increasing NU (lines 8 and 9). The tree is not expanded further along new nodes that do not meet this criterion: once NU and Tavg are calculated for the new nodes (lines 6 and 7), any node with an increased NU compared to the root is immediately discarded. The first node on the second level in Figure 6.6(b) is an example of a discarded node.

The node with the minimum Tavg out of the new nodes is then found in line 10, called min_node. If min_node satisfies the condition on NS (i.e. NU ≤ NS), it is a valid design point and no better solution could be found within the newly created nodes or in their sub-trees (lines 11-14). The third node on the third level in Figure 6.6(b) is an example of a valid design point. Since the exploration started with the best cache configuration for each application, the min_node described above is a potential candidate to be the optimal design point.

Otherwise, if min_node does not satisfy NU ≤ NS, the sub-trees under all next-level nodes have to be searched (lines 15-22) to find a valid node. The search is bounded by the best min_Tavg currently found: only nodes with Tavg less than the current best are expanded (line 17).

Without the heuristic, the worst-case time complexity of Algorithm 5 is $O((N_A)^{N_{CC}})$ and the worst-case space complexity is $O(N_A \times N_{CC})$. However, applying the heuristic, along with starting the exploration at the individual optimal cache configurations, drastically reduces the actual average-case complexities. Moreover, all assessed nodes except the one with min_Tavg are cleared from memory in every recursive step, to further minimize the algorithm's memory footprint.
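Putting the pieces together, the following compact sketch follows the structure of Algorithm 5 (level expansion, the non-increasing NU heuristic and bounding on min_Tavg) using the Node helpers sketched earlier. It is an illustrative reading under those assumptions rather than the thesis implementation, and adds a guard for applications that have no slower configuration left.

import math

def find_minimum(root, sorted_lists, times, NS, state):
    """Recursive tree search in the spirit of Algorithm 5.
    `state` carries the running bound, e.g. state = {"min_Tavg": math.inf}."""
    children = []
    for i in range(len(root.selected_conf)):
        conf = list(root.selected_conf)
        if conf[i] + 1 >= len(sorted_lists[i]):
            continue                                  # no slower configuration left for app i
        conf[i] += 1                                  # next configuration in sorted order
        new = Node(conf,
                   determine_NU(conf, sorted_lists),
                   determine_Tavg(conf, sorted_lists, times))
        if new.NU <= root.NU:                         # heuristic: NU must not increase
            children.append(new)
    root.next = children
    if not children:
        return None
    min_node = min(children, key=lambda n: n.Tavg)
    if min_node.NU <= NS:                             # base case: valid design point found
        state["min_Tavg"] = min(state["min_Tavg"], min_node.Tavg)
        return min_node
    for child in children:                            # recursive case: search bounded sub-trees
        if child.Tavg < state["min_Tavg"]:
            branch = find_minimum(child, sorted_lists, times, NS, state)
            if branch is not None and branch.Tavg < state["min_Tavg"]:
                state["min_Tavg"] = branch.Tavg
                min_node = branch
    return min_node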


6.4 Experiments & Results

A set of experiments was carried out to evaluate the algorithm. Seven groups of eight applications were used, as reported in Table 6.4, with twelve benchmarks from SPEC2006, MiBench and the WCET project (1:jpeg 2:aes 3:adpcm-encode 4:adpcm-decode 5:lms 6:mp3-encode 7:mp3-decode 8:fft 9:fir 10:fdct 11:bzip-compress 12:bzip2-decompress). The first row gives the group number and the second row lists the applications. In all experiments, the number of applications in a group is eight (NA=8) with equal frequency of occurrence (fi=1 ∀i), the number of candidate cache configurations is 315 (NCC=315 from Table 6.2), and the number of switchable configurations in the cache is four (NS=4).

Table 6.4: Application Groups in Experiments

Group Group Group Group Group Group Group A B C D E F G 6 7 3 11 3 10 9 12 10 5 7 9 5 7 12 1 8 7 3 2 5 8 12 9 3 8 9 12 4 1 2 12 2 7 1 4 2 12 3 4 4 11 6 10 5 6 9 10 11 1 6 10 6 2 7 11

Table 6.5: Design Space Exploration Results - Solutions

Group    Tavg (ns)    Chip Area (mm2)    Eavg (pJ)    Speed Improvement
A        1.165        0.0458             3.8          17.8%
B        1.025        0.0335             3.1          16.5%
C        0.932        0.0455             3.7          21.5%
D        1.104        0.0355             3.1          14.9%
E        0.809        0.0455             3.7          26.2%
F        1.002        0.0363             3.6          20.3%
G        1.058        0.0455             3.7          4.7%


Table 6.6: Design Space Exploration Results - Statistics

Group    Nodes Visited    Valid Design Points    Search Time (s)    Exhaustive Search Time (hrs)    Speed-up
A        2,342,817        2,866                  1.085              103.9                           3.4×10^5
B        110,945          752                    0.078              103.8                           4.8×10^6
C        23,945           64                     0.024              103.8                           1.6×10^7
D        6,261,377        7,491                  1.964              103.8                           1.9×10^5
E        10,279           20                     0.016              103.7                           2.3×10^7
F        449              36                     0.009              103.8                           4.1×10^8
G        19,681           100                    0.021              103.8                           1.8×10^7

Table 6.5 presents a summary of the design space exploration results for the seven experiments. The first column gives the application group. Columns two to five respectively present the attributes of the selected design point: average cache access time Tavg; chip area for the caches; average energy consumption per access Eavg; and cache access speed improvement compared to using a fixed cache (with the largest configuration out of all the applications' individual optimal caches in the group). Group E sustained the maximum speed improvement (26.2%).

Statistics regarding the algorithm's performance are reported in Table 6.6. The first column gives the application group. Columns two and three respectively give the total number of tree nodes visited by the algorithm and the number of valid design points considered as potential optimal solutions. The search time taken by the algorithm is reported in column four, and the time taken by the exhaustive search is given in column five. The last column gives the search speed-up of the proposed method compared to the exhaustive search.


The experimentation was done on a machine with a 2.2GHz Intel Xeon processor and 256GB of memory. In the majority of the experiments the algorithm consumed only a fraction of a second to find the solution, except for groups A and D which took approximately one and two seconds respectively. The search starts with each application using its optimal cache configuration, and at each algorithm step Tavg is allowed to degrade slightly by sharing cache configurations among applications. The algorithm finishes faster when the solution has a Tavg relatively close to that of the starting point.

In experiments A and D, Tavg of the final solution is comparatively higher, which makes the algorithm take a slightly longer time.

To assess the optimality of the solutions, exhaustive explorations were conducted on the same design spaces. With the given design space parameters, each exhaustive search took over 103 hours, on the same machine described above. The solutions found by our algorithm were compared with the optimal solutions from the exhaus- tive search and verified to be exactly matching.

It should be noted that a re-configurable cache is unlikely to cover all of the selected unique configurations, as its small design space contains only a limited number of inter-dependent configurations, and would therefore have to settle for higher average memory access times.

Figure 6.7 depicts the design space of application group A with respect to chip area budget (using 32nm technology). Horizontal axes mark the chip area in square millimetres and the vertical axes mark Tavg in nanoseconds. Each design point represents a selected set of configurations. Figure 6.7(a) presents the complete design space, and Figure 6.7(b) focuses on the section of interest containing the absolute optimal design point as well as the Pareto-optimal points (which denote


Figure 6.7: Average cache access time against chip area for the switchable cache in Group A. Each design point represents a set of selected cache configurations. (a) Complete design space. (b) Optimal and Pareto-optimal points. (c) Speed-up for a given area budget, over using largest fixed cache out of all applications’ individual optimal configurations.


the design points where Tavg cannot be further reduced without an increase in chip area). A suitable Pareto point can be chosen when the chip area is constrained to an upper limit. Design points with large caches (area > 0.0458mm2 for group A) yield higher Tavg, as the array look-up time to determine a cache hit increases with cache size. Figure 6.7(c) shows the speed-ups achieved by using a Pareto-optimal configuration for a given chip area budget, compared to using a fixed cache (with the largest configuration out of all applications’ individual optimal configurations). In application group A, speed-ups of up to 17.8% were achieved.

Similarly, Figure 6.8(a) presents the design space of application group A with respect to average energy per cache access, and Figure 6.8(b) focuses on the section of interest containing the absolute optimal design point and Pareto-optimal points. Speed-ups of up to 17.8% were achieved for given energy budgets over using a fixed cache with the largest configuration out of all applications' individual optimal configurations, as shown in Figure 6.8(c).

Figure 6.9(a) depicts the energy-delay-product of the design space with respect to the chip area, whereas Figure 6.9(b) shows the optimal design point and the Pareto front.


Figure 6.8: Average cache access time against average cache access energy for the switchable cache in Group A. Each design point represents a set of selected cache configurations. (a) Complete design space. (b) Optimal and Pareto-optimal points. (c) Speed-up for a given energy budget, over using largest fixed cache out of all applications’ individual optimal configurations.


Figure 6.9: Energy-Delay-Product per cache access against chip area for the switch- able cache in Group A. Each design point represents a set of selected cache config- urations. (a) Complete design space. (b) Optimal and Pareto-optimal points.


6.5 Discussion

6.5.1 Optimizing for Energy

The design space exploration illustrated above minimizes Tavg, treating energy and area as costs. Alternatively, the cache access energy Eavg of the applications can be minimized with the same algorithm, trading off area and performance, as outlined below.

The term Eij can be defined as the cache access energy (per access) for application program Ai using cache configuration Kj. Equation 6.4 describes the system of equations representing the cache access energy of configuration Kj for all applications Ai. HRij represents the cache hit rate achieved by application Ai with cache configuration Kj, and HEj and UEj denote the hit energy and update energy of configuration Kj. Em denotes the access energy of the main memory, which is accessed in case of a cache miss.

\[
E_{ij} = HR_{ij} \times HE_j + (1 - HR_{ij}) \times (E_m + UE_j), \qquad 1 \leq i \leq N_A
\tag{6.4}
\]


\[
E_{avg} = \frac{\sum_{i=1}^{N_A} f_i \times E_{ij}}{\sum_{i=1}^{N_A} f_i}
\tag{6.5}
\]

The goal is to find a set of cache configurations such that the weighted average cache access energy over all application programs is minimized. In other words,

Eavg as given in Equation 6.5 needs to be minimized, where the coefficient fi is the normalized frequency of occurrence for application Ai. Constant values for fi may be determined at design-time based on the expected system behavior, or assumed to be fi=1 when the occurrence frequencies are indeterminable.
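A brief numeric sketch of Equations 6.4 and 6.5; the hit rate, HE, UE and Em values below are placeholder numbers, not measured data.

def access_energy(hr, he, em, ue):
    """E_ij = HR_ij * HE_j + (1 - HR_ij) * (E_m + UE_j)   (Equation 6.4)."""
    return hr * he + (1.0 - hr) * (em + ue)

def weighted_avg_energy(freqs, energies):
    """E_avg weighted by the normalized frequencies f_i   (Equation 6.5)."""
    return sum(f * e for f, e in zip(freqs, energies)) / sum(freqs)

# Two applications sharing configuration j, with placeholder energies in pJ.
energies = [access_energy(0.95, 2.0, 40.0, 1.0),   # app 1: high hit rate
            access_energy(0.80, 2.0, 40.0, 1.0)]   # app 2: lower hit rate
print(weighted_avg_energy([1.0, 1.0], energies))   # 6.875 pJ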

6.5.2 Extended Usage Scenarios for Switchable Cache

Outside the context of Dark Silicon, the concept of switching cache configurations can be extended to many intriguing usage scenarios. This section provides brief insights on several such opportunities.

In multiprocessor systems

In multiprocessor systems where application programs (or tasks) can migrate between processor cores [BABP06, CHC+04], cache switching may be used to prevent loss of cached data and cold re-starts. In such a scenario, more than one configuration in the switchable cache may be active at the same time to serve concurrent applications.


Figure 6.10 illustrates an example of a switchable cache being used in a multiprocessor system. The switchable cache needs to employ multiple data ports to support several processors at the same time. The design of the switchable cache control may be modified in such a way that the port-to-cache-configuration mapping can be altered through control signals, initiated by individual processors or a centralized controller (see Figure 6.11).

The design space exploration problem, when using a switchable cache with a multiprocessor system, is much the same as described in Section 6.3. However, considering concurrent execution of multiple applications can add further complexity to the design problem.


Figure 6.10: Potential usage of a multi-port switchable cache in multiprocessor system. Application B migrates from CPU 2 to CPU 4, while still using the same cache configuration 3.


Figure 6.11: Overview of a switchable cache with multiple data/address ports.

In phase & data dependent optimizations

Many research works have targeted hardware optimizations to exploit the different execution phases present within application programs [ICM06, IKGC11, SIKN07]. Most such works aim to reduce energy consumption by the processor through various optimizations. It is also possible to employ cache switching between different execution phases of the same application program, in order to improve performance as well as reduce energy consumption.

Figure 6.12: Example of switching caches between different phases in an application's execution.


Figure 6.12 shows a simple example of an application's execution timeline with different phases. Optimal cache configurations for each phase may be identified through trace-driven simulations on each phase. Cache switching can be done at identified points between phases, to always obtain optimal cache performance. The immediate issue here is the loss of cached data between phases, if only one configuration is kept active at a time. To mitigate this issue, either efficient data migration is required, or all cache configurations could be kept active in parallel to get the best possible performance at the cost of power and chip area.

Furthermore, similar optimizations could be performed on highly data-dependent streaming pipeline processing systems, such as jpeg image compression. Similar to works such as [JSPH11], pre-processing can be used to identify the nature of input data blocks at the beginning, and the information could be used to select suitable cache configurations for the processors in the system.

Figure 6.13 shows an example scenario where data-dependent cache switching may be employed.

Figure 6.13: Example of using cache switching in a pipelined multiprocessor system.

The system contains six processors, each with a switchable cache, arranged in four pipeline stages. Data blocks enter the pipeline from the left end and leave from the right end. The pre-processing stage determines the suitable cache configuration for each of the processors for the incoming data block, and records that information. CPUs in the pipeline may then use the recorded information to perform cache switching before processing each data block.

In critical systems, to improve reliability

Another potential extended application is in critical/secure systems, as a mode of redundant caching to improve reliability and protection against attacks targeting cached data. More than one cache configuration can be kept active and used by the same processor, with the cached data duplicated between the caches. The redundant data may potentially be used to detect system failures or malicious attacks, and subsequently for recovery purposes as well.


6.6 Summary

In this chapter, a cache architecture was proposed where individual applications can use different cache configurations which optimize all applications' memory access performance by leveraging the unavoidable Dark Silicon in future chips. Experimental data were provided to show that having the ability to switch between cache configurations can provide substantial performance gains in multi-programmed environments.

Further, a rapid design space exploration algorithm was presented to identify the set of optimal cache configurations for a switchable cache, based on the group of applications that is expected to be executed on the system. The proposed formulation of the search tree allowed significantly fast search times to be achieved. Through rigorous testing, the algorithm was shown to be able to quickly find the solution, and the accuracy of the results was verified through comparison against exhaustive search. The concept of the switchable cache can be further extended to multiprocessor systems, phase-and-data-dependent systems, and critical and secure systems, which opens up many new avenues to be explored.

Chapter 7

Answer Set Programming in Cache Design Space Exploration

7.1 Introduction

Design optimization of caches consists of a diverse set of problems, which include: optimizing individual caches [LM08]; optimizing hierarchical caches [ZGR11]; configuring multiprocessor caches [NJR+]; and tuning reconfigurable and switchable caches [NJRP15]. Due to the number of configurable cache parameters and the associated value ranges, typical cache design optimization problems concern large design spaces of up to trillions of design points. Exploring such design spaces within reasonable search times requires specialized algorithms.

The tuning of switchable caches (or reconfigurable caches) is such a design problem that presents a vast design space to be explored, as discussed in Chapter 6. As using

large cache configurations rarely provides the best performance [SJP13], the switchable cache architecture allows several pre-determined configurations to be on board for a given cache, and dynamically switches between them according to the application under execution. Only one configuration is active at a given time while the others are kept powered off, to exploit the additional area available in chips with dark silicon. Programs have the ability to activate the assigned optimal cache configuration using a control signal, allowing each application to gain the best cache access performance.

The design problem calls for selecting the best set of configurations for a switchable cache, to be used in a system executing a group of application programs. With more applications in the system than the number of switchable cache configurations, an optimal set of cache configurations has to be selected for the complete application group. A problem instance with eight application programs, four switchable configurations and a pool of 315 candidate configurations to select from forms a design space with 26.38 trillion design points (see Section 6.3.2). Switchable cache tuning can be identified as an NP-Hard optimization problem, since the knapsack problem can be understood as a special case of it.

A brute-force exhaustive search for the above design space instance takes in excess of 100 hours to find the optimal solution, as per the experimental data provided in Section 6.4. The heuristic tree search algorithm presented in Chapter 6 can quickly minimize the cache access times for the group of applications. However, the search does not guarantee the optimal solution to the problem, and search times can vary from seconds to hours. A robust design tool should employ a more consistent exploration methodology, where the optimal solution can be guaranteed within a reasonable time for every problem instance.


Answer Set Programming (ASP) is a declarative programming technique that is primarily aimed at solving difficult NP-Hard problems while guaranteeing optimality. This chapter explores the use of ASP to guarantee the optimal solution for the switchable cache tuning problem within reasonable search time, as opposed to fast but sub-optimal problem-specific heuristics and slow exhaustive searches.

Highlights:

• This chapter proposes an ASP-based problem encoding for the NP-Hard switchable cache tuning problem, which guarantees optimal solutions and consistent search times.

• The performance and optimality of the ASP-based method are compared against problem-specific heuristic and exhaustive search algorithms.

• ASP search and optimization strategies are investigated as reliable cache design space exploration methods, through extensive experiments.

The rest of the chapter is organized as follows: Section 7.2 briefly reviews prior applications of ASP in related problem domains; Section 7.3 defines the target design problem; an overview of Answer Set Programming and the problem encoding are presented in Section 7.4; and Section 7.5 presents experimental data and results.

7.2 Related Applications of ASP

Answer Set Programming has been used in a number of design space exploration problems in recent years. Coban et al. [CTE08] represented a wire routing problem


(in circuit placement and routing) using three different declarative techniques: Answer Set Programming (ASP); Constraint Programming (CP); and Integer Linear Programming (ILP), and then compared the formulations for knowledge representation and computational efficiency. The authors of [CTE08] have shown that the ASP formulation performs up to 96 times faster than the ILP formulation. In their experiments, CP struggled to solve the problems within a reasonable time. A similar finding was made by Ishebabi et al. [Ish09] for solving multiprocessor synthesis problems. They have shown that an ASP formulation is faster by up to three orders of magnitude compared to an ILP formulation (a few seconds versus up to 8 hours).

Muhlbauer et al. [MGB11] presented an automatic system synthesis in a hardware/software co-design problem on an FPGA that focuses on handling streaming data. Cilardo et al. [CSM14] proposed an automated ASP-based method to perform an optimized mapping in a Simulink to MPSoC translation. The constraints optimized in the DSE are resource utilization and execution time. The authors show that ASP not only converges to the optimal solution, but also does so within reasonable execution time.

Yonga et al. [YMB15] presented an ASP formulation for the synthesis of a heterogeneous system-on-chip based distributed camera network. The authors have concluded that the use of ASP has helped them to overcome the size explosion and exponential synthesis time problems encountered in other synthesis approaches, such as ILP or CP.

None of the works mentioned above focus on ASP-based cache design space exploration. In other domains, ASP-based formulations have shown guaranteed optimality within reasonable search times, as an alternative to approximate approaches such

as application-specific heuristics. Moreover, ASP is found to be more efficient than ILP, as it has been shown to overcome the size explosion and exponential synthesis time problems. Therefore, for the first time, this chapter proposes and studies an ASP-based method to solve the cache design space exploration problem and investigates the suitability of different ASP search and optimization strategies.

7.3 Problem Formulation

Given a switchable cache system with a constrained number NS of switchable configurations, NCC candidate cache configurations (each configuration represented by Kj with a known hit latency HLj and a known update latency ULj, where 1 ≤ j ≤ NCC and NCC > NS), and NA application programs (each program represented by Ai, where 1 ≤ i ≤ NA and NA > NS, executed at a normalized frequency fi with 0 ≤ fi ≤ 1), select the set of NS cache configurations such that the average cache access time for the given set of application programs is optimal.

Note: An instance with $N_{CC} = 315$, $N_A = 8$ and $N_S = 4$ has a design space of 26.38 trillion points (${}^{315}C_{4} \times 4^{8}$).
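The quoted design space size can be reproduced directly from these parameters; a one-off check (the variable names are illustrative):

from math import comb

N_CC, N_A, N_S = 315, 8, 4
# Choose N_S of the N_CC candidate configurations, then assign one of the
# N_S selected configurations to each of the N_A applications.
design_points = comb(N_CC, N_S) * N_S ** N_A
print(design_points)    # 26,375,932,477,440  (about 26.38 trillion)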

The term Tavg, given in Equation 7.1, is defined as average cache access time over the group of applications, which is the objective function to be minimized.

\[
T_{avg} = \frac{\sum_{i=1}^{N_A} T_{ij}}{N_A}
\tag{7.1}
\]


\[
T_{ij} = HR_{ij} \times HL_j + (1 - HR_{ij}) \times (T_m + UL_j), \qquad 1 \leq i \leq N_A
\tag{7.2}
\]

The term Tij represents the cache access time for application program Ai using cache configuration Kj. System of equations 7.2 describes Tij for all applications and all cache configurations. HRij denotes the cache hit rate achieved by application Ai with cache configuration Kj. Tm denotes the access time for the main memory, which is accessed in case of a cache miss.

7.4 Answer Set Programming (ASP)

7.4.1 Overview

Answer Set Programming provides a declarative framework for representing and reasoning about logical problems. It has a well defined formal set of semantics based on the logic programming stable model semantics [GL88].

While ASP was originally developed for modeling problems in the artificial intelligence (AI) sub-field of knowledge representation and reasoning, it has gained broad

interest due to the development of high-performance ASP reasoners. Two of the most prominent of these are the Potassco collection of solvers [GKK+11] and the DLV system [LPF+06].

An important feature of ASP is that it allows problems to be represented in a compact and intuitive manner using a syntax very similar to that of a Prolog [CM03] program. An ASP program consists of a set of rules and facts. Rules can contain first-order logic variables, and consequently, can be used to compactly represent relations between categories of objects. Facts are a set of literals describing the state of the problem space.

Given an ASP program, its answer sets consist of minimal sets of facts that are con- sistent with the program. Intuitively, these answer sets represent possible solutions to the problem. Furthermore, an ASP program can contain optimization statements that express cost functions over the facts in the answer sets [SNS02]. Through such statements, preferences over the possible solutions can be specified and an optimal solution can be identified.

Computing the answer sets of a logic program is undertaken in two stages. In the first stage the rules of the logic program are grounded, such that all logic variables are replaced with their ground instances based on the provided facts. The second stage consists of solving, where the answer sets of the grounded logic program are generated and any optimization statements are applied. It is particularly important to keep the distinction between these two stages in mind when modeling a problem. The reasoner is most effective during the solving stage; when the many optimization techniques and search strategies provided by the solver can be employed. However, a poorly chosen problem representation can result in a blow out during grounding,

where the grounder fails to terminate in a timely manner or may generate a combinatorial explosion in the size of the resulting ground program. For a discussion of the practical aspects of modeling problems with ASP, the interested reader is referred to [GKKS12].

7.4.2 Problem Encoding in ASP

The switchable cache tuning problem can be encoded as an ASP logic program, as described below. There are three distinct components in the problem encoding: input data to the problem (facts) ; logic to generate the candidate solutions (answer sets); and logic to declare the optimization criteria. Input data to the switchable cache tuning problem can be defined as a list of facts in the following format:

fact(application, cache_config, access_time).    (7.3)

Facts are provided in the format given by Equation 7.3, containing cache access times (Tij) for every application using every cache configuration. For example, the literal fact(app05,cache202,12301) says application program app05 using cache configuration cache202 achieves an average access time of 1.2301 nanoseconds. A total of NCC × NA literals are required to describe the problem space.
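A small sketch of how such a fact base might be generated from the measured access times. The file name, the appXX/cacheYYY naming scheme and the fixed-point scaling (1.2301 ns encoded as 12301, since the solver's aggregates operate on integers) follow the example literal above but are otherwise assumptions.

def write_facts(times, path="facts.lp", scale=10_000):
    """Emit one fact/3 literal per (application, configuration) pair."""
    with open(path, "w") as fp:
        for i, row in enumerate(times, start=1):
            for j, t in enumerate(row, start=1):
                fp.write(f"fact(app{i:02d},cache{j:03d},{round(t * scale)}).\n")

# times[i][j] would come from the hardware simulations combined with CACTI data.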

The answer set generation is specified in such a way that the objective function (Tavg) can be evaluated for each stable model. The rule given in Equation 7.4 generates the candidate solutions (i.e., the answer sets); it states that every solution must consist of between one and four (inclusive) selected cache configurations. The objective function (Tavg in Equation 7.1) needs to be evaluated on each solution.


1 { selected_cache(K) : fact(_, K, _) } 4.    (7.4)

Next, the logic for the optimization criteria is formulated. The rules stated in Equations 7.5-7.7 declare the logical criteria for this evaluation. A predicate is defined that encodes the minimum access time per application for a given cache configuration. This minimum time can be defined with respect to the set of all access times for that application, by finding the time that is not dominated by a lower access time.

performance(A, T) :- fact(A, K, T), selected_cache(K).    (7.5)

non_minimal_time(A, T2) :- performance(A, T1), performance(A, T2), T1 < T2.    (7.6)

minimal_time(A, T) :- performance(A, T), not non_minimal_time(A, T).    (7.7)

Equation 7.5 defines a predicate for the performance of an application over all selected caches in a solution. Equation 7.6 says that out of any two instances of the performance predicate, the one with the higher access time is non-minimal. The rule in Equation 7.7 creates a double negation to find the instance of the performance predicate with minimal access time.


When selecting $N_S$ cache configurations, ${}^{N_{CC}}C_{N_S} \times N_S \times N_A$ propositions will be enumerated at the grounding stage for the rule minimal_time(A, T) given in Equation 7.7. This amounts to over 12 billion propositions with $N_{CC} = 315$, $N_A = 8$ and $N_S = 4$. In the experiments done in this chapter, the ASP grounding stage took 5.2 seconds on average to ground the program and create a propositional database of size 35.9MB.

#minimize { T,A : minimal_time(A, T) }.    (7.8)

Finally, the optimization criterion is given in terms of a cost function that accumulates the value of the minimum application access times for each cache configuration (Equation 7.8). The optimal solution will be the set of cache configurations that is minimal over all answer sets.

Alternatively, Equation 7.8 may be replaced with Equations 7.9 and 7.10, where the minimal Tavg could be explicitly calculated for each selection of caches, before searching for the optimal solution.

minimal_avg_time(Tavg) :- Tavg = #avg { T,A : minimal_time(A, T) }.    (7.9)

#minimize { T : minimal_time(A, T) }.    (7.10)

However, the averaging statement creates ${}^{N_{CC}}C_{N_S} \times N_S^{N_A}$ propositions at the grounding stage for the predicate minimal_avg_time(Tavg), which is a staggering 26.38 trillion with $N_{CC} = 315$, $N_A = 8$ and $N_S = 4$. This essentially leads to a blow-out situation, as described in Section 7.4.1, consuming excessive time and memory during the grounding stage. This difference between encodings highlights the importance of careful formulation of the problem representation.


7.5 Experiments & Results

This section presents experimental data in order to compare the use of an ASP solver against the heuristic tree search algorithm and an exhaustive brute-force search. The state-of-the-art ASP solver Clasp [GKS12] version 4.5.3 was used in all the experiments presented in this chapter. The execution environment used for the experiments consists of four 8-core Intel Xeon processors (32 physical cores / 64 virtual cores) working at 2.2GHz, and 256GB of memory. Twelve benchmark applications were used from the suites SPEC2006, MiBench, and the WCET project (jpeg, aes, adpcm-encode, adpcm-decode, lms, mp3-encode, mp3-decode, fft, fir, fdct, bzip-compress, bzip2-decompress).

In all experiments, the number of applications in a group was set to eight (NA = 8), the number of candidate cache configurations was set to 315 (NCC = 315 from Table 7.1) and the number of switchable configurations in the cache was set to four (NS = 4). With these parameters, the design space consists of 26.38 trillion design points. Columns one to three in Table 7.1 respectively present the ranges of cache block sizes, set sizes and associativities as powers of two.

Values for the terms HLj and ULj for all candidate configurations were obtained from the CACTI 6.5 [MBJ07] cache analysis tool. To accurately determine the cache hit rates HRij for all applications on all candidate configurations, rapid parallel hardware simulations were performed using an Altera Stratix V GX FPGA device and NIOS II/f embedded processors.

Table 7.1: Candidate Cache Configurations

Block Sizes (Bytes)    Set Sizes          Associativities    No. of Configs
2^b (2 ≤ b ≤ 8)        2^s (0 ≤ s ≤ 8)    2^a (0 ≤ a ≤ 4)    315

7.5.1 Comparison of ASP & Heuristic Searches

Applications were grouped into sets of eight, as given in Table 7.2. To compare the performance of the Clasp ASP solver with the encoding described in Section 7.4.2

Table 7.2: Application Groups

Group    Applications
1        mp3-e, mp3-d, adpcm-e, adpcm-d, jpeg, aes, bzip2-c, bzip2-d
2        adpcm-e, fdct, fir, aes, mp3-d, jpeg, bzip2-d, adpcm-d
3        fdct, lms, mp3-d, aes, bzip2-d, adpcm-e, fir, adpcm-d
4        lms, mp3-d, bzip2-d, adpcm-d, bzip2-e, mp3-e, jpeg, fdct
5        fft, mp3-d, adpcm-e, lms, mp3-e, fir, aes, fdct
6        lms, fft, bzip2-d, bzip2-e, jpeg, mp3-e, fir, fdct
7        adpcm-e, fft, fir, mp3-e, aes, mp3-d, bzip2-d, bzip2-e
8        mp3-e, mp3-d, adpcm-e, fft, jpeg, aes, fdct, bzip2-d
9        jpeg, aes, fdct, bzip2-d, mp3-e, adpcm-d, adpcm-e, fft
10       fir, fdct, bzip2-d, adpcm-d, adpcm-e, fft, jpeg, aes

against the heuristic search algorithm, both methods were applied to each application group. Additionally, exhaustive brute-force searches were performed on all application groups to verify the optimality of the solutions achieved by the other two search methods.

Table 7.3 presents the CPU times consumed and the optimality achieved by all three search methods. The first column shows the application group, while the second column displays the time taken by the exhaustive search to find the optimal solution. Columns three and four respectively present the search time taken by the heuristic search and whether the final solution is optimal. Similarly, the search time of the ASP solver and the solution's optimality are given in columns five and six.

Table 7.3: Search Times and Optimality

Application    Exhaustive       Heuristic Search             ASP Solver
Group          Search Time      Search Time    Optimal       Search Time    Optimal
1              103.9 hrs        1.085 sec      yes           6.19 mins      yes
2              103.8 hrs        0.078 sec      yes           8.43 mins      yes
3              103.8 hrs        0.024 sec      yes           2.90 mins      yes
4              103.8 hrs        1.964 sec      yes           4.38 mins      yes
5              103.7 hrs        0.016 sec      yes           3.29 mins      yes
6              103.8 hrs        0.009 sec      yes           1.73 mins      yes
7              103.8 hrs        0.021 sec      yes           3.30 mins      yes
8              103.7 hrs        6.310 hrs      yes           11.37 mins     yes
9              103.6 hrs        4.730 min      no            6.16 mins      yes
10             103.7 hrs        1.863 hrs      no            5.61 mins      yes


From the results, the heuristic tree search performs much faster for application groups 1 to 7 (in under two seconds). However, for application groups 8 and 10, its search times are several orders of magnitude slower (6.31 hours and 1.86 hours respectively) than those of the ASP solver, and it fails to find the optimal solution at all in groups 9 and 10. To achieve fast search times, the heuristic search starts with each application assigned its optimal cache configuration, and progressively reduces the number of unique configurations (NU) in the selection until the criterion NU ≤ NS is met. An application's selected configuration is changed to the next-fastest-in-order only if that change reduces NU. This condition in the heuristic may cause the search to ignore certain tree branches. On the rare occasions where the optimal solution resides deep down a tree branch and the heuristic causes that same branch to be ignored, the search may converge to a sub-optimal solution. Failures can also occur due to the search traversing deep down certain branches before encountering bounds, in cases where the final cache access times for applications are significantly different to the starting values.

In contrast, the search times for the ASP solver are consistent, in the range of 1.73 minutes to 11.37 minutes, with an average of 5.33 minutes. The solutions found by the ASP solver are optimal in every problem instance.

Figure 7.1 presents a closer look at the search times taken for application groups 8, 9 and 10. The horizontal axis marks the application groups, and the vertical axis represents the search time on a logarithmic scale. Altogether, the results reveal that the ASP solver achieves consistent and reasonably fast search times along with better optimality, while the problem-specific heuristic falls short on both aspects in certain problem instances.


Figure 7.1: Comparison of search times for application groups 8, 9 and 10.

The Clasp ASP solver was configured to use 64 threads in the above comparisons. Parallelizing the brute-force exhaustive search in a similar manner is indeed possible, with guaranteed optimal solutions similar to ASP. However, even if a theoretically maximal parallel efficiency were assumed, an exhaustive search would still be significantly slower (approximately 1.6 hours) compared to the ASP solver.


7.5.2 ASP Search Strategies & Parallelism

The ASP solver Clasp [GKS12] inherently incorporates different search strategies and the ability to use up to 64 parallel threads to perform the solving. The available search strategy options are listed below. The interested reader is referred to [GKKS12] for further discussion on the search and optimization strategies.

• auto - strategy based on problem type

• frumpy - conservative defaults

• jumpy - aggressive defaults

• tweety - defaults geared towards ASP problems

• handy - defaults geared towards large problems

• crafty - defaults geared towards crafted problems

• trendy - defaults geared towards industrial problems

• many - default portfolio

The ASP encoding from Section 7.4.2 was tested under all 8 strategy options of Clasp and thread counts of 1, 8, 16, 32 and 64, for the switchable cache tuning problem on application group 8. The resulting search times are presented in Table 7.4 and compared in Figure 7.2. The horizontal axis in Figure 7.2 represents the thread count, while the vertical axis represents the search time. Average CPU time per thread is reported in order to make the comparison independent of the execution environment.
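For reference, a hedged sketch of how such a sweep could be scripted, assuming the standard clingo command-line front end (which embeds Clasp) with its --configuration and --parallel-mode options; the file names are placeholders for the encoding and fact files described in Section 7.4.2.

import subprocess

def run_solver(strategy="jumpy", threads=16, files=("encoding.lp", "facts.lp")):
    """Run one (strategy, thread-count) combination and return the solver output.
    Assumes a clingo binary on the PATH; --configuration selects the Clasp search
    strategy and --parallel-mode sets the number of solving threads."""
    cmd = ["clingo", *files,
           f"--configuration={strategy}",
           f"--parallel-mode={threads}"]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

for strategy in ("jumpy", "handy", "trendy"):
    print(f"--- {strategy} ---")
    print(run_solver(strategy, threads=16))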


Table 7.4: ASP - Search Times (in Minutes) for Multiple Threads & Search Strategies

             Number of threads
Option       1        8        16       32       64
auto         458.1    551.9    672.5    633.3    290.7
frumpy       24h+     24h+     24h+     24h+     24h+
jumpy        113.1    28.9     20.7     17.3     11.4
tweety       343.1    119.1    82.6     53.1     25.6
handy        135.1    43.7     27.1     15.1     11.4
crafty       249.1    63.8     66.3     32.0     38.5
trendy       163.0    52.1     45.0     25.9     17.1
many         456.9    492.6    683.4    610.7    285.0

As expected, the majority of the search strategies benefit from an increasing number of threads. Strategy jumpy, with an aggressive search configuration, achieved the fastest search times, closely matched by the handy strategy (which is specifically aimed at large problems), especially at higher thread counts. Strategy trendy, which is aimed at solving industrial problems, also sustained reasonable search times. Notably, strategy frumpy, which uses a conservative search configuration, failed to converge after 24 hours in each of the tests.

The behavior of strategies auto and many contrasted with the others. Search times increased with thread counts up to 16 and reduced thereafter, as can be seen in Figure 7.2. The increases in search time are caused by different threads using different search and optimization strategies. While multiple strategies can lead to greater robustness, in some cases the information being generated by the

different optimization strategies can also result in negative interference, for example by flooding some threads with unhelpful information. This could potentially cause a thrashing scenario where extra threads become an overhead. However, with higher thread counts such as 32 and 64, enough threads in the system may share the same search and optimization strategies. Therefore, the benefits drawn from the information generated by other threads could outweigh the overheads.

Figure 7.2: Comparison of search times for application group 8, using different ASP search strategies and multiple threads.


To gain further insight, the times taken by different ASP search strategies to locate the optimal solution, as opposed to verifying the solution's optimality, were analyzed. An allocation of 16 execution threads was assigned for each of the tests. Figure 7.3(a) reports the actual search time spent by each option to arrive at the optimal solution, while Figure 7.3(b) presents the normalized times in proportion to the total search times.


Figure 7.3: Search times spent to: (a) find the optimal solution; (b) Verify the optimality of the solution.


The breakdown of search times reveals that most search strategies arrive at the optimal solution quickly, in approximately one minute. The exceptions were the crafty and trendy options, which took the longest, at 12.2 and 9.0 minutes respectively. A significantly large fraction of the search time (between 80.00% and 99.94%) was spent by all search strategies on guaranteeing the optimality of the answer.

Interestingly, option many was the fastest (0.4 minutes) to find the solution, even though it was the slowest to complete the search (683.4 minutes). The fraction of time spent by option many to arrive at the solution was a mere 0.06% of the total; the remaining 683 minutes were spent completing the search and verifying optimality. Similarly, option auto took only one minute to find the solution, which is only 0.15% of its total search time (672.5 minutes).

Further investigations were performed on the search strategies and the answers (stable models) they select in the optimization process. It was noted that the starting models for strategies many and auto were generally in the vicinity of the optimal solution (error < 50%) in most cases. However, the starting models for the other strategies contained errors several orders of magnitude larger compared to the optimal, which is likely why they take relatively longer to converge.


7.6 Summary

This chapter presented an ASP-based problem encoding to perform design space exploration for the switchable cache tuning problem, using the state-of-the-art ASP solver Clasp [GKS12]. The search performance of the presented DSE method was compared against the problem-specific heuristic tree search proposed in Chapter 6. Through extensive experiments, it was demonstrated that the proposed method is indeed reliable in the context of cache design optimization. The experimental results show that consistent and reasonable search times can be achieved using ASP, in addition to guaranteed optimal solutions. Further, the thread-level parallelism and the different search and optimization strategies of ASP were evaluated, and certain optimization strategies were demonstrated to provide far better search performance than others in cache design space exploration.

Chapter 8

Conclusion

Cache memories play a vital role in improving memory access performance for modern SoCs and MPSoCs. Cache performance and energy consumption depend not only on the cache's architecture and configuration, but also on the sequence of memory accesses observed by the cache. The combination of processor micro-architecture and the application programs executed on the system determines the memory access sequence. Consequently, cache performance becomes highly application dependent. Applications executed on embedded processing systems are typically known a priori at design-time, thus presenting an opportunity to tune a system for better cache performance. Such application specific cache optimizations can allow processor systems to achieve better memory access performance and mitigate memory related bottleneck issues.

Design-time optimizations of cache memories involve exploring vast design spaces, especially for multiprocessor systems. With the parameters block size, set size and associativity, the design space of a single cache can contain hundreds of configurations. Accurately exploring a cache design space requires the memory access trace received by the cache to be extracted and used in simulations to count the hit rates sustained by different configurations. Such cache simulations are highly time intensive, largely due to the costs of memory access trace extraction.

A design space for a generic multiprocessor cache hierarchy is the cross product of the sub-design-spaces of all individual caches. With typical cache parameter ranges, such a design space can contain several trillions of design points. Moreover, exploring such a design space requires multiple memory access traces (for different caches) to be obtained multiple times, due to the dependencies between connected caches. Thus, proper explorations of generic multiprocessor cache hierarchies have seldom been attempted. Prior research works, as well as designers in practice, use approximations such as sampled fractions of memory access traces, or design space pruning, producing sub-optimal results within feasible time frames.

This dissertation presented the first hardware-based framework to rapidly explore generic multiprocessor multi-level cache hierarchy design spaces using parallel simulation of multiple cache configurations. The framework couples an FPGA device to the design process, which houses specialized hardware modules to count cache hits for multiple cache sub-design-spaces simultaneously. The proposed framework was able to achieve up to 456 times faster simulation times compared to the fastest known software-based multiprocessor multi-level cache simulation tool. The seamless integration of the simulator modules into the MPSoC under investigation eliminates the need to pre-extract memory access traces, which was a major limitation in prior software-based methods, and also allows access contention on shared caches to be captured effectively. Real-time trace extraction and accelerated simulation in hardware, combined with the flexibility to connect hardware simulator modules at different places in the memory hierarchy, enable generic MPSoC cache hierarchies to be explored with ease, whereas prior methods were predominantly limited to two cache levels.

Using the benefits of the hardware-based simulation framework, a novel design space exploration algorithm was presented to explore an unprecedented portion of the massive multiprocessor multi-level cache hierarchy design space. The new algorithm is evolutionary in nature, and traverses a cache hierarchy in several iterations of carefully crafted steps until convergence is achieved. The experimental results show that the iterative exploration algorithm was able to improve an MPSoC's average cache access time by up to 18.9%, while simultaneously reducing the total cache size by up to 74.15% compared to state-of-the-art methods. Convergence and stability of the proposed method were thoroughly evaluated through extensive testing. The optimality of the results was empirically demonstrated, which was lacking in prior works, by comparing against an exhaustive search over a design space instance.

The switchable cache architecture was introduced for multi-programmed environments where several applications use the same processor and cache memory. The proposed architecture exploits the Dark Silicon available in future chips in order to house several configurations within the same cache, where only one configuration is activated at a time while the remaining ones are kept in Dark Silicon. The processor is given the ability to switch between the available cache configurations at run-time, in order to achieve better application dependent cache performance. With minute overheads, the switchable cache concept can provide improved cache performance across all applications in the system, as opposed to using a fixed cache. Compared to run-time re-configurable caches, which impose high overheads and cover a limited selection of cache configurations, the proposed method can provide memory access performance gains to a large number of application programs.


A design space exploration problem presents itself when many applications share a switchable cache with a limited number of concurrent configurations. To this end, a new design-time algorithm was presented to rapidly pre-determine the optimal or a near-optimal set of cache configurations for a switchable (or re-configurable) cache, for a given group of applications. The presented algorithm is the very first in the direction of design space exploration associated with switchable caches, and performs a heuristic tree search starting with each application using its individual optimal cache configuration. Using the data provided by the hardware cache simulators, the proposed heuristic algorithm could rapidly find the solution, in under two seconds in most experiments. Alternative design space exploration methods were examined by employing the declarative logic programming language Answer Set Programming. ASP provides the benefit of not having to declare the method to find the solution, requiring only that the problem be described in an efficient manner. The presented work is the first use of ASP to solve large cache design optimization problems, where optimal solutions are guaranteed reliably within reasonable search times.

In conclusion, this dissertation presented a selection of novel application specific cache design optimization techniques along with a hardware acceleration framework and a cache architecture to improve memory access performance in multiprocessor multi-level cache hierarchies and multi-programmed systems. In addition to significant performance improvements, the presented contributions enable accurate cache design space explorations to be practical and feasible without compromising the time-to-market of modern embedded computing devices. By enabling fast and thorough exploration of vast cache design spaces, designers and researchers are granted the ability to gain invaluable insight into application dependent cache behaviour, furthering the domain of cache memory optimization.


8.1 Future Work

With the proposed technology and concepts in cache design optimization, many new avenues open up which present intriguing prospects for further improvements. Such extensions include accommodating advanced concepts such as block pre-fetching, cache partitioning, etc. in the simulation, which can affect the hit rate and access/update latencies, in addition to the possibility of the proposed search algorithms being used in other problem domains. One of the most prominent avenues to explore is the inclusion of the effects of cache coherency management in simulation.

8.1.1 Simulating Cache Coherency

Coherency management comes into the picture when multiple processors use individual private caches but share access to common data blocks. In such environments, scenarios may arise where two or more processors simultaneously hold copies of a given data block in their private caches. One processor updating or writing to such a data block renders the copies present in the other caches obsolete or stale.

Practical systems implement various mechanisms to maintain coherence among cached shared data, such as directories, snooping and snarfing, all of which require additional logic circuitry and communication buses, and introduce delays on the critical path. In general, coherency management protocols can be categorized into two classes: write invalidate and write update. In write invalidate protocols, the remaining copies are marked as invalid once one copy is updated, by listening to all memory access addresses. Write update protocols use both address and data information to update all the remaining cached copies as soon as one copy is written. In comparison, write update protocols promise better utilization of cache space, albeit with high communication and logic overheads, while write invalidate protocols are far simpler to implement yet suffer from higher cache miss rates. The latter is usually preferred in embedded devices, where simplicity is highly desirable.
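
As a toy illustration of the two protocol classes (and not of any specific protocol such as MESI), the sketch below models each private cache as a dictionary and shows how a write either invalidates or updates the peer copies; all names are invented for illustration.

```python
# Illustrative sketch only: protocol semantics, not a full coherence protocol.
# Each private cache is modelled as a dict {block_address: data}.

def write(caches, writer, addr, value, policy="invalidate"):
    """Apply a write in `writer`'s cache and keep peer copies coherent."""
    caches[writer][addr] = value
    for owner, cache in caches.items():
        if owner == writer or addr not in cache:
            continue
        if policy == "invalidate":
            del cache[addr]          # peer copy becomes stale -> later miss
        else:                        # "update": push new data to every copy
            cache[addr] = value      # extra bus traffic, but no future miss

caches = {"cpu0": {0x100: 1}, "cpu1": {0x100: 1}}
write(caches, "cpu0", 0x100, 42, policy="invalidate")
assert 0x100 not in caches["cpu1"]
```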

Including the effects of a given cache coherence mechanism in the simulation of a multiprocessor cache design space is a challenging task, especially prior to the introduction of hardware assisted simulation. With real-time parallel simulation of multiprocessor caches now possible in hardware, an opportunity opens up to use the real-time memory access sequences from parallel processors to simulate coherency management. Such functionality could potentially be achieved through either centralized or distributed evaluation of the access addresses received by the parallel simulator modules. A centralized control body could monitor every memory access generated by every processor in the MPSoC and produce control signals directing individual cache simulator modules to either update or invalidate the corresponding address tags in all simulated configurations. In a distributed scheme, all collaborating simulator modules could share access addresses among themselves to achieve similar functionality. If a unique and known portion of the address space is used for shared data in the multiprocessor system, monitoring the addresses and generating the signals would become considerably less complex.
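
A minimal software sketch of the centralized alternative follows, purely to illustrate the idea; the class names, the shared-address range and the per-module interface are assumptions, not part of the proposed hardware framework. A monitor observes every simulated access and, for writes falling in a known shared-data region, signals the other simulator modules to invalidate the matching tags.

```python
# Hedged sketch of a centralised coherency monitor for simulator modules.
# All names and the shared-data address range are invented for illustration.

SHARED_RANGE = range(0x8000_0000, 0x8010_0000)  # assumed shared-data region

class CacheSimModule:
    """Stand-in for one per-processor cache simulator module."""
    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.tags = set()            # addresses currently held (all configurations)

    def access(self, addr):
        self.tags.add(addr)

    def invalidate(self, addr):
        self.tags.discard(addr)

class CoherencyMonitor:
    """Observes every access and issues write-invalidate style control signals."""
    def __init__(self, modules):
        self.modules = modules

    def observe(self, cpu_id, addr, is_write):
        self.modules[cpu_id].access(addr)
        if is_write and addr in SHARED_RANGE:
            for m in self.modules.values():
                if m.cpu_id != cpu_id:
                    m.invalidate(addr)   # direct peer modules to drop the tag

mods = {i: CacheSimModule(i) for i in range(2)}
mon = CoherencyMonitor(mods)
mon.observe(0, 0x8000_0040, is_write=False)
mon.observe(1, 0x8000_0040, is_write=True)
assert 0x8000_0040 not in mods[0].tags
```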

8.1.2 Future of Run-Time Cache Switching

The switchable cache architecture proposed in Chapter 6 exploits Dark Silicon in order to perform application dependent cache optimizations. Outside the context of Dark Silicon, the concept of switching cache configurations may potentially be extended to many other use cases, as discussed in Section 6.5.2.

For example, multiple configurations in a switchable cache could be kept active simultaneously in order to cater for multiprocessor systems where application programs can migrate between processor cores. A migrating application could use the switching functionality to continue with the same cache configuration it used prior to the migration, retaining its optimal cache access times while avoiding the loss of cached data and a cold re-start with a new cache memory.
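
To make the idea concrete, here is a minimal sketch with invented names (Core, switch_cache and the task record are illustrative only, not interfaces from this thesis): the migration handler simply re-activates, on the destination core, the configuration the task was tuned for.

```python
# Hedged sketch: carry a task's pre-tuned switchable-cache configuration along
# when it migrates, and re-activate it on the destination core.

class Core:
    def __init__(self, name):
        self.name = name
        self.active_cfg = None

    def switch_cache(self, cfg_id):
        # In hardware this would assert the configuration-select lines.
        self.active_cfg = cfg_id

def migrate(task, dst):
    dst.switch_cache(task["cache_cfg"])   # keep the configuration tuned for the task
    task["core"] = dst.name
    return task

task = {"name": "h264_enc", "cache_cfg": "c16k_4w", "core": "cpu0"}
migrate(task, Core("cpu1"))
```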

Some application programs exhibit distinct memory access behaviours during different phases of execution within the application itself. In addition to switching caches between different applications, it is also possible to employ cache switching between different execution phases of the same application in order to improve memory access performance. Such functionality would require memory access traces to be analysed at design time to identify the distinct execution phases. Similar optimizations may be applied to data dependent streaming pipeline multiprocessor systems, such as JPEG image compression and H.264 video encoding. In such systems, pre-processing of incoming data may be used to predict the memory access behaviour once input data blocks reach the processing stages, and this information could be used to select suitable cache configurations at the separate processors in the pipelined system.
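
The sketch below, again with invented interfaces and thresholds, illustrates one simple way such design-time phase detection could work: the memory access trace is cut into fixed windows, each window is summarised by the set of cache blocks it touches, and consecutive windows whose working sets overlap strongly are grouped into a phase that can then be simulated and tuned separately.

```python
# Hedged sketch: detect execution phases from a design-time memory access trace
# by comparing simple working-set signatures of fixed-size trace windows.

def phase_signatures(trace, window=10_000, block=32):
    """Yield the set of touched cache blocks for each trace window."""
    for i in range(0, len(trace), window):
        yield {addr // block for addr in trace[i:i + window]}

def detect_phases(trace, similarity=0.7):
    """Group consecutive windows whose working sets overlap strongly."""
    phases, current = [], None
    for sig in phase_signatures(trace):
        if current and len(sig & current) / max(len(sig | current), 1) >= similarity:
            current |= sig               # same phase: grow its signature
        else:
            current = set(sig)           # new phase starts here
            phases.append(current)
    return phases

# Toy trace with two distinct working sets -> two detected phases.
phase_a = [(i * 4) % 4096 for i in range(30_000)]
phase_b = [0x8000 + (i * 4) % 8192 for i in range(30_000)]
print(len(detect_phases(phase_a + phase_b)))
```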

Cache switching could be used as a mode of redundancy, in order to improve reliability and protection against attacks targeting cached data. For example, two cache configurations can be activated and used by the same processor, where the cached data are duplicated between the two caches. The redundant data may potentially be used to detect system failures or malicious attacks, and thereafter for recovery purposes as well.

Operating systems could use cache switching at context switches in multi-threaded environments. Such applications would, however, require efficient data relocation into the cache when a thread is re-activated, in order to retain previously cached data. The trade-off between using cache switching to obtain optimal application dependent cache access times and under-utilizing the cache by losing previously cached data is an interesting research question in itself.
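
As a final illustration, the tiny sketch below (invented names; not an existing operating system interface) shows how a scheduler hook might select the configuration tuned for the incoming thread at each context switch, accepting the trade-off discussed above.

```python
# Hedged sketch: an OS hook that activates the switchable-cache configuration
# tuned (at design time) for the thread being scheduled in.

THREAD_CFG = {"audio_dec": "c8k_2w", "gui": "c16k_4w"}   # hypothetical tuning table
DEFAULT_CFG = "c16k_4w"

def on_context_switch(next_thread, switch_cache):
    """`switch_cache(cfg_id)` stands in for asserting the configuration-select lines."""
    switch_cache(THREAD_CFG.get(next_thread, DEFAULT_CFG))

on_context_switch("gui", switch_cache=lambda cfg: print("activate", cfg))
```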

Bibliography

[Aga87] A. Agarwal. Analysis of Cache Performance for Operating Systems and Multiprogramming. PhD thesis, Stanford University, Stanford, CA, 1987.

[Alta] Altera DE5 Development and Education Board.

[Altb] Altera Qsys System Integration Tool.

[ARM] ARM Cortex Processors.

[BABP06] S. Bertozzi, A. Acquaviva, D. Bertozzi, and A. Poggiali. Supporting Task Migration in Multi-Processor Systems-on-Chip: A Feasibility Study. In Proceedings of the Conference on Design, Automation and Test in Europe (DATE’06), pages 15–20. European Design and Automation Association (EDAA), 2006.

[BCB74] J. Bell, D. Casasent, and C.G. Bell. An Investigation of Alternative Cache Organizations. IEEE Transactions on Computers, 23(4):346–351, 1974.

[BJS+14] H. Bokhari, H. Javaid, M. Shafique, J. Henkel, and S. Parameswaran. darkNoC: Designing Energy-Efficient Network-on-Chip with Multi-Vt Cells for Dark Silicon. In Proceedings of the 51st Annual IEEE/EDAC/ACM Design Automation Conference (DAC’14). IEEE, 2014.

[BM96] P. L. Bird and T. N. Mudge. An Instruction Stream Compression Technique. Technical report, The University of Michigan, USA, 1996.

[CHC+04] J. Chen, H. Hsu, K. Chuang, C. Yang, A. Pang, and T. Kuo. Multiprocessor Energy-Efficient Scheduling with Task Migration Considerations. In Proceedings of the 16th Euromicro Conference on Real-Time Systems (ECRTS’04), pages 101–108, June 2004.

[CM03] W. Clocksin and C. S. Mellish. Programming in PROLOG. Springer Science & Business Media, 2003.

[CMP+14] E. G. Cota, P. Mantovani, M. Petracca, M. R. Casu, and L. P. Carloni. Accelerator Memory Reuse in the Dark Silicon Era. Computer Architecture Letters, 13(1):2012–2015, 2014.

[CPN+09] E. S. Chung, M. K. Papamichael, E. Nurvitadhi, J. C. Hoe, and K. Mai. PROTOFLEX: Towards Scalable, Full-System Multiprocessor Simulations Using FPGAs. ACM Transactions on Reconfigurable Technology and Systems, 2(2), 2009.

[CR09] S. Chattopadhyay and A. Roychoudhury. Unified Cache Modeling for WCET Analysis and Layout Optimizations. In Proceedings of the 30th IEEE Real-Time Systems Symposium (RTSS’09), pages 47–56, Dec 2009.

[CSM14] A. Cilardo, D. Socci, and N. Mazzocca. ASP-based optimized mapping in a simulink-to-MPSoC design flow. Journal of Systems Architecture, 60:108–118, 2014.

[CTE08] E. Coban, F. Türe, and E. Erdem. Comparing ASP, CP, ILP on two Challenging Applications: Wire Routing and Haplotype Inference. In Proceedings of the 2nd International Workshop on Logic and Search (LaSh’08), pages 166–180, 2008.

[CX13] J. Cong and B. Xiao. Optimization of Interconnects Between Accelerators and Shared Memories in Dark Silicon. In Proceedings of the IEEE/ACM International Conference on Computer Aided Design (ICCAD’13). IEEE, 2013.

[FP91] M. Farrens and A. Park. Dynamic Base Register Caching: A Technique for Reducing Address Bus Width. In Proceedings of the 18th Annual International Symposium on Computer Architecture (ISCA’91), pages 128–137. ACM, 1991.

[FW98] C. Ferdinand and R. Wilhelm. On Predicting Data Cache Behavior for Real-Time Systems. In Proceedings of the ACM SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems (LCTES’98), pages 16–30. Springer-Verlag, 1998.

[GG04] A. Ghosh and T. Givargis. Cache Optimization for Embedded Processor Cores: An Analytical Approach. ACM Transactions on Design Automation of Electronic Systems, 9(4):419–440, October 2004.

[GKK+11] M. Gebser, B. Kaufmann, R. Kaminski, M. Ostrowski, T. Schaub, and M. T. Schneider. Potassco: The Potsdam Answer Set Solving Collection. AI Communications, 24(2):107–124, 2011.

[GKKS12] M. Gebser, R. Kaminski, B. Kaufmann, and T. Schaub. Answer Set Solving in Practice. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2012.

[GKS12] M. Gebser, B. Kaufmann, and T. Schaub. Multi-threaded ASP Solving with clasp. Theory and Practice of Logic Programming, 12(4-5):525–545, September 2012.

[GL88] M. Gelfond and V. Lifschitz. The Stable Model Semantics for Logic Programming. In R. A. Kowalski and K. A. Bowen, editors, Proceedings of the 5th International Conference and Symposium on Logic Programming, pages 1070–1080. MIT Press, 1988.

[GRLC08] A. Gordon-Ross, J. Lau, and B. Calder. Phase-based Cache Reconfiguration for a Highly-configurable Two-level Cache Hierarchy. In Proceedings of the 18th ACM Great Lakes Symposium on VLSI (GLSVLSI’08), pages 379–382. ACM Press, 2008.

[GRZVD04] A. Gordon-Ross, C. Zhang, F. Vahid, and N. Dutt. Chapter 6: Tuning Caches to Applications for Low-Energy Embedded Systems. In Ultra Low-Power Electronics and Design. Springer, 2004.

[Hil] M. D. Hill. Dinero IV Trace-Driven Uniprocessor Cache Simulator.

[HJP09] M. S. Haque, A. Janapsatya, and S. Parameswaran. SuSeSim: A Fast Simulation Strategy to Find Optimal L1 Cache Configuration for Embedded Systems. In Proceedings of the 7th IEEE/ACM International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS’09), pages 295–304, 2009.

[HKH+13] M. S. Haque, A. Kumar, Y. Ha, Q. Wu, and S. Luo. TRISHUL: A Single-Pass Optimal Two-level Inclusive Data Cache Hierarchy Selection Process for Real-time MPSoCs. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC’13), pages 320–325, 2013.

[HNL06] J. Hong, E. Nurvitadhi, and S. L. Lu. Design, Implementation, and Verification of Active Cache Emulator (ACE). In Proceedings of the 14th ACM/SIGDA International Symposium on Field Programmable Gate Arrays (FPGA’06), pages 63–72. ACM, 2006.

[HP11] J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 5th edition, 2011.

[HPJP10] M. S. Haque, J. Peddersen, A. Janapsatya, and S. Parameswaran. DEW: A Fast Level 1 Cache Simulation Approach for Embedded Processors with FIFO Replacement Policy. In Proceedings of the Design Automation & Test in Europe Conference (DATE’10), pages 496–501, 2010.

[HPJP12] M. S. Haque, J. Peddersen, A. Janapsatya, and S. Parameswaran. SCUD: A Fast Single-Pass L1 Cache Simulation Approach for Embedded Processors with Round-Robin Replacement Policy. In Proceedings of the Design Automation Conference (DAC’10), pages 356–361, 2012.

[HPP11] M. S. Haque, J. Peddersen, and S. Parameswaran. CIPARSim: Cache Intersection Property Assisted Rapid Single-Pass FIFO Cache Simulation Technique. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD’11), pages 126–133. IEEE, November 2011.

[HRA+12] M. S. Haque, R. Ragel, A. Ambrose, S. Radhakrishnan, and S. Parameswaran. DIMSim: A Rapid Two-Level Cache Simulation Approach for Deadline-Based MPSoCs. In Proceedings of the 8th IEEE/ACM/IFIP International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS’12), pages 151–160, 2012.

[HS89] M.D. Hill and A.J. Smith. Evaluating Associativity in CPU Caches. IEEE Transactions on Computers, 38(12):1612–1630, 1989.

[HS90] P. Heidelberger and H. S. Stone. Parallel Trace-driven Cache Simula- tion by Time Partitioning. In Proceedings of the 22nd Conference on Winter Simulation (WSC’90), pages 734–737. IEEE Press, 1990.

[HS12] A. Basu, M. D. Hill, and M. M. Swift. Reducing Memory Reference Energy with Opportunistic Virtual Caching. In Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA’12). IEEE, 2012.

[HXXY11] W. Han, L. Xiang, G. Xiaopeng, and L. Yi. GPU Accelerating for Rapid Multi-Core Cache Simulation. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, pages 1387–1396. IEEE, May 2011.

[ICM06] C. Isci, G. Contreras, and M. Martonosi. Live, Runtime Phase Monitoring and Prediction on Real Systems with Application to Dynamic Power Management. In Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’06), pages 359–370. IEEE Computer Society, 2006.

[IDZ80] R. L. Iman, J. M. Davenport, and D. K. Zeigler. Latin Hypercube Sampling (Program User’s Guide). Technical Report SAND79-1473, Sandia Laboratories, Albuquerque, Jan 1980.

[IKGC11] N. Ioannou, M. Kauschke, M. Gries, and M. Cintra. Phase-Based Application-Driven Hierarchical Power Management on the Single-chip Cloud Computer. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT’11), pages 131–142, Oct 2011.

[Ish09] H. Ishebabi. Answer Set versus Integer Linear Programming for Automatic Synthesis of Multiprocessor Systems from Real-Time Parallel Programs. International Journal of Reconfigurable Computing, 2009.

[Iye03] R. Iyer. On Modeling and Analyzing Cache Hierarchies using CASPER. In Proceedings of the 11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems (MASCOTS’03). IEEE, 2003.

[JIS06] A. Janapsatya, A. Ignjatovic, and S. Parameswaran. Finding Optimal L1 Cache Configuration for Embedded Systems. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC’06), pages 1–6, 2006.

[JSPH11] H. Javaid, M. Shafique, S. Parameswaran, and J. Henkel. Low-Power Adaptive Pipelined MPSoCs for Multimedia: An H.264 Video Encoder Case Study. In Proceedings of the 48th Design Automation Conference (DAC’11), pages 1032–1037. ACM, 2011.

[KCDM98] C. Kulkarni, F. Catthoor, and H. De Man. Hardware Cache Optimization for Parallel Multimedia Applications. In David Pritchard and Jeff Reeve, editors, Euro-Par’98 Parallel Processing, volume 1470 of Lecture Notes in Computer Science, pages 923–932. Springer Berlin Heidelberg, 1998.

[KCHP01] E. Keogh, S. Chu, D. Hart, and M. Pazzani. An Online Algorithm for Segmenting Time Series. In Proceedings of the IEEE International Conference on Data Mining (ICDM’01), 2001.

[KGB96] A. Kagi, J.R. Goodman, and D. Burger. Memory Bandwidth Limitations of Future Microprocessors. In Proceedings of the 23rd Annual International Symposium on Computer Architecture (ISCA’96), pages 78–78, May 1996.

[Kha14] R. Khatwal. Application Specific Cache Simulation Analysis for Application Specific Instruction Set Processor. International Journal of Computer Applications, 90(13):31–41, 2014.

[LDK99] S. Liao, S. Devadas, and K. Keutzer. A Text-Compression-Based Method for Code Size Minimization in Embedded Systems. ACM Transactions on Design Automation of Electronic Systems, 4(1):12–38, January 1999.

[LL03] S. Lu and K. Lai. Implementation of HW$im: A Real-Time Configurable Cache Simulator. In Field Programmable Logic and Application, pages 638–647. Springer Berlin Heidelberg, 2003.

[LM08] Y. Liang and T. Mitra. Static Analysis for Fast and Accurate Design Space Exploration of Caches. In Proceedings of the 6th IEEE/ACM International Conference on Hardware/Software Co-Design and System Synthesis (CODES+ISSS’08), page 103. ACM Press, 2008.

[LMW96] Y. S. Li, S. Malik, and A. Wolfe. Cache Modeling for Real-Time Software: Beyond Direct Mapped Instruction Caches. In Proceedings of the 17th IEEE Real-Time Systems Symposium, pages 254–263, Dec 1996.

[LMW99] Y. S. Li, S. Malik, and A. Wolfe. Performance Estimation of Embedded Software with Instruction Cache Modeling. ACM Transactions on Design Automation of Electronic Systems, 4(3):257–279, July 1999.

[LPB06] M. Loghi, M. Poncino, and L. Benini. Cache Coherence Tradeoffs in Shared-Memory MPSoCs. ACM Transactions on Embedded Computing Systems (TECS), 2006.

[LPF+06] N. Leone, G. Pfeifer, W. Faber, T. Eiter, G. Gottlob, S. Perri, and F. Scarcello. The DLV System for Knowledge Representation and Reasoning. ACM Transactions on Computational Logic, 7(3):499–562, 2006.

[Mar08] M. R. Marty. Cache Coherence Techniques for Multicore Processors. PhD thesis, University of Wisconsin - Madison, 2008.

[MBJ07] N. Muralimanohar, R. Balasubramonian, and N. Jouppi. Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0. Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’07), pages 3–14, December 2007.

[MGB11] F. Mühlbauer, M. Großhans, and C. Bobda. Rapid Prototyping of OpenCV Image Processing Applications using ASP. In Proceedings of the 22nd IEEE International Symposium on Rapid System Prototyping (RSP’11), pages 16–22. IEEE, 2011.

[MGST70] R.L. Mattson, J. Gecsei, D.R. Slutz, and I.L. Traiger. Evaluation Techniques for Storage Hierarchies. IBM Systems Journal, 9(2):78–117, 1970.

[MPZS12] G. Mariani, G. Palermo, V. Zaccaria, and C. Silvano. OSCAR: An Optimization Methodology Exploiting Spatial Correlation in Multicore Design Spaces. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), 31(5):740–753, 2012.

[MV99] N. R. Mahapatra and B. Venkatrao. The Processor-Memory Bottleneck: Problems and Solutions. Crossroads, 5(3es):2, 1999.

[MWL95] S. A. McKee, W. A. Wulf, and T. C. Landon. Bounds on Memory Bandwidth in Streamed Computations. Technical report, Lecture Notes in Computer Science 966: Europar’95 Parallel Processing, 1995.

[Nio] Nios II/f Core: Fast for Performance-Critical Applications.

[NJR+] I. Nawinne, H. Javaid, R. Ragel, S. Radhakrishnan, and S. Parameswaran. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), (12):1991–2003.

[NJRP15] I. Nawinne, H. Javaid, R. Ragel, and S. Parameswaran. Switchable Cache: Utilizing Dark Silicon for Application Specific Cache Optimizations. IET Computers & Digital Techniques, 2015.

[NSJP14] I. Nawinne, J. Schneider, H. Javaid, and S. Parameswaran. Hardware-Based Fast Exploration of Cache Hierarchies in Application Specific MPSoCs. In Proceedings of the Design Automation and Test in Europe Conference (DATE’14). IEEE, March 2014.

[NVI] nVIDIA CUDA Parallel Computing Platform.

[PAC+97] D. Patterson, T. Anderson, N. Cardwell, R. Fromm, K. Keeton, C. Kozyrakis, R. Thomas, and K. Yelick. A Case for Intelligent RAM. IEEE Micro, 17(2):34–44, March 1997.

[PP84] M. S. Papamarcos and J. H. Patel. A Low-Overhead Coherence Solution for Multiprocessors with Private Cache Memories. In Proceedings of the 11th Annual International Symposium on Computer Architecture (ISCA’84), pages 348–354. ACM, 1984.

[Rao78] G. S. Rao. Performance Analysis of Cache Memories. Journal of the ACM (JACM), 25(3):378–395, July 1978.

[RGR11] M. Rawlins and A. Gordon-Ross. CPACT - The Conditional Parameter Adjustment Cache Tuner for Dual-Core Architectures. In Proceedings of the 29th IEEE International Conference on Computer Design (ICCD’11), pages 396–403. IEEE, October 2011.

[SA95] R. A. Sugumar and S. G. Abraham. Set-Associative Cache Simulation Using Generalized Binomial Trees. ACM Transactions on Computer Systems (TOCS), 13(February):32–56, 1995.

[SD95] C. Su and A. M. Despain. Cache Design Trade-offs for Power and Performance Optimization: A Case Study. In Proceedings of the 1995 International Symposium on Low Power Design (ISLPED’95), pages 63–68. ACM, 1995.

[SIKN07] H. Sasaki, Y. Ikeda, M. Kondo, and H. Nakamura. An Intra-Task DVFS Technique Based on Statistical Analysis of Hardware Events. In Proceedings of the 4th International Conference on Computing Frontiers (CF’07), pages 123–130. ACM, 2007.

[SJP13] S. M. M. Shwe, H. Javaid, and S. Parameswaran. RExCache: Rapid Exploration of Unified Last-Level Cache. In Proceedings of the 18th Asia and South Pacific Design Automation Conference (ASP-DAC’13), pages 582–587. IEEE, January 2013.

[SNS02] P. Simons, I. Niemelä, and T. Soininen. Extending and Implementing the Stable Model Semantics. Artificial Intelligence, 138(1-2):181–234, 2002.

[SPP14a] J. Schneider, J. Peddersen, and S. Parameswaran. A Scorchingly Fast FPGA-Based Precise L1 LRU Cache Simulator. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC’14). IEEE, January 2014.

[SPP14b] J. Schneider, J. Peddersen, and S. Parameswaran. MASHfifo: A Hardware-Based Multiple Cache Simulator for Rapid FIFO Cache Analysis. In Proceedings of the Design Automation Conference (DAC’14). IEEE, 2014.

[SS07] R. Sen and Y. N. Srikant. WCET Estimation for Executables in the Presence of Data Caches. In Proceedings of the 7th ACM & IEEE International Conference on Embedded Software (EMSOFT’07), pages 203–212. ACM, 2007.

[Tay12] M. B. Taylor. Is Dark Silicon Useful? Harnessing the Four Horsemen of the Coming Dark Silicon Apocalypse. In Proceedings of the Design Automation Conference (DAC’12), pages 1131–1136. IEEE, 2012.

[TFW00] H. Theiling, C. Ferdinand, and R. Wilhelm. Fast and Precise WCET Prediction by Separated Cache and Path Analyses. Real-Time Systems, 18(2/3):157–179, May 2000.

[TRGM13] Y. Turakhia, B. Raghunathan, S. Garg, and D. Marculescu. HaDeS: Architectural Synthesis for Heterogeneous Dark Silicon Chip Multiprocessors. In Proceedings of the 50th Annual ACM/EDAC/IEEE Design Automation Conference (DAC’13), 2013.

[TTYO09] N. Tojo, N. Togawa, M. Yanagisawa, and T. Ohtsuki. Exact and Fast L1 Cache Simulation for Embedded Systems. In Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC’09), pages 817–822. IEEE, January 2009.

[VGRBV08] P. Viana, A. Gordon-Ross, E. Barros, and F. Vahid. A Table-based Method for Single-pass Cache Optimization. In Proceedings of the 18th ACM Great Lakes symposium on VLSI (GLSVLSI’08), page 71. ACM Press, 2008.

[VGRK+06] P. Viana, A. Gordon-Ross, E. Keogh, E. Barros, and F. Vahid. Configurable Cache Subsetting for Fast Cache Tuning. In Proceedings of the Design Automation Conference (DAC’06), pages 695–700, 2006.

[WB91] W. Wang and J. Baer. Efficient Trace-Driven Simulation Method for Cache Performance Analysis. ACM Transactions on Computer Systems (TOCS), 9(3):27–36, 1991.

[WJ90] A. W. Wilson Jr. Multiprocessor Cache Simulation Using Hardware Collected Address Traces. In Proceedings of the 23rd Annual Hawaii International Conference on System Sciences, pages 252–260, 1990.

[WMH+97] R. T. White, F. Mueller, C. A. Healy, D. B. Whalley, and M. G. Harmon. Timing Analysis for Data Caches and Set-Associative Caches. In Proceedings of the 3rd IEEE Real-Time Technology and Applications Symposium, pages 192–202, Jun 1997.

[XTE] Cadence Tensilica Xtensa Customizable Processors.

[XTM] XTMP: The XTensa Modeling Protocol and XTensa SystemC Modeling for Fast System Modeling and Simulation. http://ip.cadence.com/hwdes.

[YMB15] F. Yonga, M. Mefenza, and C. Bobda. ASP-Based Encoding Model of Architecture Synthesis for Smart Cameras in Distributed Networks. ACM Transactions on Design Automation of Electronic Systems, 2015.

[ZGR11] W. Zang and A. Gordon-Ross. T-SPaCS: A Two-Level Single-Pass Cache Simulation Methodology. IEEE Transactions on Computers, 62(2):390–403, 2011.

[ZGR12] W. Zang and A. Gordon-Ross. A Single-pass Cache Simulation Methodology for Two-level Unified Caches. In Proceedings of the IEEE International Symposium on Performance Analysis of Systems & Software (ISPASS’12), pages 168–177. IEEE, April 2012.

[ZV03] C. Zhang and F. Vahid. Cache Configuration Exploration on Prototyping Platforms. In Proceedings of the 14th IEEE International Workshop on Rapid Systems Prototyping, pages 164–170, 2003.

[ZVL04] C. Zhang, F. Vahid, and R. Lysecky. A self-tuning cache architecture for embedded systems. ACM Transactions on Embedded Computing Systems (TECS), 3(2):407–425, May 2004.
