Application Specific Programmable Processors for Reconfigurable Self-Powered Devices
ACTA UNIVERSITATIS OULUENSIS
C Technica 651

TEEMU NYLÄNDEN

APPLICATION SPECIFIC PROGRAMMABLE PROCESSORS FOR RECONFIGURABLE SELF-POWERED DEVICES

Academic dissertation to be presented, with the assent of the Doctoral Training Committee of Technology and Natural Sciences of the University of Oulu, for public defence in the Wetteri auditorium (IT115), Linnanmaa, on 7 May 2018, at 12 noon

UNIVERSITY OF OULU GRADUATE SCHOOL; UNIVERSITY OF OULU, FACULTY OF INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING

UNIVERSITY OF OULU, OULU 2018

Copyright © 2018 Acta Univ. Oul. C 651, 2018
Supervised by Professor Olli Silvén
Reviewed by Professor Leonel Sousa and Doctor John McAllister
ISBN 978-952-62-1874-8 (Paperback)
ISBN 978-952-62-1875-5 (PDF)
ISSN 0355-3213 (Printed)
ISSN 1796-2226 (Online)
Cover Design Raimo Ahonen
JUVENES PRINT TAMPERE 2018

University Lecturer Tuomo Glumoff
University Lecturer Santeri Palviainen
Postdoctoral research fellow Sanna Taskila
Professor Olli Vuolteenaho
University Lecturer Veli-Matti Ulvinen
Planning Director Pertti Tikkanen
Professor Jari Juga
University Lecturer Anu Soikkeli
Professor Olli Vuolteenaho
Publications Editor Kirsti Nurkkala

Nyländen, Teemu, Application specific programmable processors for reconfigurable self-powered devices.
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering
Acta Univ. Oul. C 651, 2018
University of Oulu, P.O.
Box 8000, FI-90014 University of Oulu, Finland

Abstract

The current Internet of Things solutions for simple measurement and monitoring tasks are evolving into ubiquitous sensor networks that constantly observe both our well-being and the conditions of our living environment. The oncoming omnipresent wireless infrastructure is expected to feature artificial intelligence capabilities that can interpret human actions, gestures and even needs. All of this will require processing power on a par with, and energy efficiency far beyond, that of the current mobile devices.

The current Internet of Things devices rely mostly on commercial off-the-shelf low power microcontrollers. Optimized solely for low power, with little attention paid to computing performance, the present solutions are far from achieving the energy efficiency, let alone the compute capability, required by the future Internet of Things solutions. Since this domain is application specific by nature, the use of general purpose processors for signal processing tasks is counterintuitive. Instead, dedicated accelerator based solutions are more likely to be able to meet these strict demands.

This thesis proposes one potential solution for meeting the low energy, flexibility and performance requirements of the Internet of Things domain in a cost-effective manner using reconfigurable heterogeneous processing solutions. A novel graphics processing unit-style accelerator for the Internet of Things application domain is presented. Since the accelerator can be reconfigured, it can be used for most applications of the Internet of Things domain, as well as other application domains. The solution is assessed using two computer vision applications, and is demonstrated to achieve an excellent combination of performance and energy efficiency.
The accelerator is designed using an efficient and rapid co-design flow of software and hardware, featuring ease of development close to that of commercial off-the-shelf solutions, which also enables a cost-efficient design flow.

Keywords: application specific processing, energy efficient computing, general purpose computing on graphics processing units, internet of things, reconfigurable architectures

Nyländen, Teemu, Sovelluskohtaiset ohjelmoitavat prosessorit uudelleenkonfiguroitaviin energiaomavaraisiin laitteisiin (Application specific programmable processors for reconfigurable self-powered devices).
University of Oulu Graduate School; University of Oulu, Faculty of Information Technology and Electrical Engineering
Acta Univ. Oul. C 651, 2018
University of Oulu, P.O. Box 8000, 90014 University of Oulu

Tiivistelmä (Finnish abstract)

In the future, the Internet of Things will completely change our living environment. It will enable interactive environments in place of today's passive ones. In addition, our living environment will react to our actions and speech, and even to our emotions. This omnipresent wireless infrastructure will require unprecedented computing performance combined with extreme energy efficiency.

Current Internet of Things solutions rely almost entirely on commercial off-the-shelf general purpose microcontrollers. These, however, are optimized purely for low power consumption rather than for energy efficiency, let alone for the computing performance that the future Internet of Things will require. Since the Internet of Things inherently demands application specific computing, using general purpose processors for signal processing tasks is illogical. Instead, using application specific accelerators for the computation would likely make it possible to reach the targeted level of requirements.

This thesis presents one possible solution for simultaneously achieving low energy consumption, high performance and flexibility in a cost-effective manner, using reconfigurable heterogeneous processor solutions. The work presents a new graphics processing unit-style reconfigurable accelerator for the Internet of Things application domain, which can be used with most compute-intensive applications.

The properties of the proposed accelerator are evaluated using two computer vision applications as examples, and it is shown to achieve an excellent combination of energy efficiency and performance. The accelerator is designed using an efficient and rapid software and hardware co-design flow, which achieves nearly the same ease of development as commercial off-the-shelf processors and in turn enables cost-effective development and design work.

Keywords (Asiasanat): energy efficient computing, internet of things, application specific processing, reconfigurable architectures, general purpose graphics processing unit

To my family

Preface

The research work for this thesis was conducted in the Center for Machine Vision and Signal Analysis (CMVS) at the University of Oulu, between 2010 and 2017. These years were extremely educational, both scientifically and personally.

First of all, I would like to thank Professor Olli Silvén for supervising this thesis. He co-authored all of my publications, and I am deeply grateful to him for his invaluable guidance, support and encouragement during the research work. Working with him has been extremely interesting and motivating. I would also like to thank Dr. Jari Hannuksela for hiring me to CMVS in the first place and giving me a chance to work in such an inspiring environment. Docent Jani Boutellier deserves huge thanks for all the invaluable advice during the work. He always had time for discussions and advice when needed.
I thank all the co-authors of my publications, Dr. Janne Janhunen, Karri Nikunen, Ilkka Hautala and Heikki Kultala. Their cooperation helped to improve the quality of this thesis. I would also like to thank Dr. Miguel Bordallo López for all the fruitful discussions, particularly regarding GPGPU.

I am grateful to the reviewers of this thesis, Professor Leonel Sousa and Dr. John McAllister, for their constructive feedback. The corrections and additions made based on their comments helped to improve the scientific content of this thesis.

The main funding partners concerning this thesis have been the Finnish Metals and Engineering Competence Cluster Ltd (FIMECC) and the Finnish Funding Agency for Innovation (Tekes). Both are highly appreciated for supporting energy efficient computing research. I was also fortunate to receive personal grants from several Finnish funding organizations: Nokia Foundation, Walter Ahlström Foundation, Riitta and Jorma J. Takanen Foundation, Ulla Tuominen Foundation and Tauno Tönning Foundation. All these funders are highly appreciated.

Furthermore, my warmest thoughts go to Dr. Ville Niemelä, Kalle Kaisto and Tuomo Hänninen. The coffee breaks, with discussions fully lacking any scientific content, were a perfect counterweight to the scientific work. I would also like to thank Dr. Kai Loo for the discussions and peer support during this thesis.

Finally, I thank my parents, my sister Paula and my fiancée Elina for their invaluable support and encouragement, and especially for enduring my absentmindedness during the past couple of years.
Oulu, March 14, 2017
Teemu Nyländen

Abbreviations

AI      artificial intelligence
API     application programming interface
AR      augmented reality
ASIC    application specific integrated circuit
ASP     application specific processor
CGRA    coarse grained reconfigurable array
COTS    commercial off-the-shelf
CPU     central processing unit
DVFS    dynamic voltage and frequency scaling
DLP     data level parallelism
EPI     energy per instruction
etc.    et cetera
FFT     fast Fourier transform
FPGA