ART vs. NDK vs. GPU acceleration: A study of performance of image processing algorithms on Android

DEGREE PROJECT IN COMPUTER SCIENCE AND ENGINEERING, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2017

ANDREAS PÅLSSON
Master in Computer Science
KTH ROYAL INSTITUTE OF TECHNOLOGY, SCHOOL OF COMPUTER SCIENCE AND COMMUNICATION
Date: June 26, 2017
Supervisor: Cyrille Artho
Examiner: Johan Håstad
Swedish title: ART, NDK eller GPU acceleration: En prestandastudie av bildbehandlingsalgoritmer på Android

Abstract

The Android ecosystem contains three major execution platforms, each suitable for different purposes. Android applications are normally written in the Java programming language, but computationally intensive parts of Android applications can be sped up by using a native language or by utilising the parallel architecture found in graphics processing units (GPUs). The experiments conducted in this thesis measure the performance benefits of switching from Java to C++ or RenderScript, Google's GPU acceleration framework.

The experiments consist of common tasks in image processing. For some of these tasks, optimized libraries and implementations already exist. The performance of the implementations provided by third parties is compared to our own.

Our results show that for advanced image processing on large images, the benefits are large enough to warrant using C++ or RenderScript instead of Java on modern smartphones. However, if the image processing is conducted on very small images (e.g. thumbnails) or the image processing task contains few calculations, moving to a native language or RenderScript is not worth the added development time and static code complexity.

RenderScript is the best choice if the GPU vendors provide an optimized implementation of the processing task.
If no such implementation is provided, both C++ and RenderScript are viable choices. If full precision is required in the floating-point arithmetic, a C++ implementation is recommended. If the desired effect can be achieved without compliance with the IEEE floating-point arithmetic standard, RenderScript provides better run time performance.

Sammanfattning (Summary)

The Android ecosystem contains three execution platforms suited to different purposes. Android applications are usually written in the Java programming language, but computationally intensive parts of an Android application can be sped up by using a statically compiled language or by exploiting the parallel architecture found in graphics processors. The experiments carried out in this project aim to measure the performance improvements that can be achieved by switching from Java to C++ or RenderScript, Google's graphics acceleration framework.

The experiments consist of commonly used image processing algorithms. For some of these, optimized libraries and other ready-made implementations exist. The performance of the third-party libraries is compared with that of our implementations.

Our results show that for advanced image processing, the performance improvements are good enough to justify using C++ or RenderScript instead of Java on modern smartphones. When the image processing is done on very small images (for example thumbnails) or contains few calculations, switching from Java to RenderScript or C++ is not worth the extra development time and the static code complexity.

RenderScript is the best choice when the graphics processor vendors provide implementations of the algorithm to be run. If no such implementation exists, both C++ and RenderScript are applicable choices. If accurate precision is required, a C++ implementation is recommended. If, on the other hand, full precision is not needed in the floating-point calculations, RenderScript is recommended instead.

Contents

1 Introduction
  1.1 Problem
  1.2 Research Question
  1.3 Scope
  1.4 Ethics and sustainability
  1.5 Structure
2 Background
  2.1 Native and interpreted languages
    2.1.1 Java
    2.1.2 C++
    2.1.3 Performance
  2.2 Android
  2.3 Android application compilation
  2.4 Dalvik
  2.5 Android Runtime
    2.5.1 Ahead-of-time (AOT) compilation
    2.5.2 Improved garbage collection
    2.5.3 Just-in-time (JIT) compilation
  2.6 Android Native Development Kit
  2.7 RenderScript
    2.7.1 Compilation and deployment
    2.7.2 Floating point precision
  2.8 Image Processing
    2.8.1 Image smoothing
    2.8.2 Grayscaling
    2.8.3 Thresholding
  2.9 Color spaces
3 Related Work
  3.1 Java and C++ benchmarks
  3.2 Dalvik vs ART
  3.3 Using GPU for calculations
    3.3.1 RenderScript
    3.3.2 OpenCL
4 Method
  4.1 Choice of method and algorithms
  4.2 Development environment and devices
  4.3 Implementation
    4.3.1 Color space conversion
    4.3.2 Blurring
    4.3.3 Grayscaling and thresholding
  4.4 Measuring Runtime Performance
    4.4.1 Image processing
    4.4.2 Setup
  4.5 Verifying results
5 Results
  5.1 Color space conversion
  5.2 Blurring
    5.2.1 Box filter
    5.2.2 Median filter
    5.2.3 Gaussian filter
  5.3 Grayscaling
  5.4 Thresholding
6 Discussion
  6.1 Color space conversion
  6.2 Blurring
  6.3 Grayscaling and thresholding
  6.4 Overall Performance
  6.5 Threats to validity
    6.5.1 Choice of algorithms
    6.5.2 High variance
    6.5.3 Devices
    6.5.4 Image sizes
    6.5.5 Optimization
  6.6 Future Research
7 Conclusion
Bibliography
A Tables
  A.1 Blurring
    A.1.1 Box filter
    A.1.2 Median filter
    A.1.3 Gaussian filter
  A.2 Grayscaling
  A.3 Thresholding

Chapter 1 Introduction

The first version of the mobile operating system Android was released in fall 2008.
It is, as of January 2017, the most widely used smartphone operating system [14]. It is used all over the world, on devices and networks of varying quality. For these reasons, it is important for mobile application developers to be able to develop high-quality applications that work well on low-end devices in developing countries.

Android application developers can choose to write the business logic of an application in a native language (a source language that is directly compiled to machine code) or in Java; Google recommends Java [8]. However, for computationally intensive tasks it can be advantageous to use a native language, as native code is generally faster than Java [8] and is therefore less likely to impede the user experience.

Moreover, a developer can use GPU (graphics processing unit) accelerated computing to exploit the full capabilities of the device. This means offloading compute-intensive portions of code to the device's GPU, while the remainder of the code stays on the CPU (central processing unit). This allows the device to take advantage of the massively parallel architecture of the GPU. Code written to run on a GPU does not have to be custom-written for each type of GPU, but can be compiled from a higher-level language. This means that GPU acceleration is more readily available to developers today than it traditionally has been.

As today's users of technological products see more and more virtual and augmented reality products, it is of utmost importance to keep the experience as smooth as possible. Many new technologies offer a more visual experience than before, which further increases the need for performance, since graphics processing requires large amounts of heavy calculation.

1.1 Problem

Java is the recommended programming language for building Android applications.
However, the Java programming language contains features designed to improve safety and convenience at the expense of performance, e.g., automatic memory management. Therefore, Google suggests that it might be useful for a developer to use a native language over Java in two cases [8]:

• Squeeze extra performance out of a device to achieve low latency or run computationally intensive applications, such as games or physics simulations.
• Reuse your own or other developers' C or C++ libraries.

This thesis intends to examine the first bullet point and investigate how large the performance benefits can be when conducting real-time image processing. Furthermore, the use of GPU acceleration can provide greater performance improvements due to increased levels of parallelization in the hardware.

The problem is that, as the performance of the Android system itself improves, it is hard to know whether the performance benefit of using a language other than Java is worth the extra complexity of adding another programming language to a software project. Image processing contains many computationally intensive processes and is therefore a candidate where it might be useful to switch to a native language or a framework that allows use of GPU acceleration.

1.2 Research Question

The question this thesis intends to answer is the following:

Can performance increases in run time warrant the usage of C++ or GPU acceleration frameworks over Java when writing image processing algorithms on Android?

1.3 Scope

The reason a developer might not want to choose a native language or a GPU acceleration framework over Java, despite performance benefits, is likely that the performance improvements do not outweigh the complexity added to the software project. This project does not intend to extensively measure the code development complexity added by using these components in an Android project.
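To make the kind of workload behind the research question concrete, the following is a minimal Java sketch of two of the image processing tasks named in the contents, grayscaling and thresholding, applied to an array of packed ARGB pixels and timed with `System.nanoTime()`. It is an illustration only, not the implementation used in this thesis: the class name `PixelLoopSketch`, the integer luma weights (77, 150, 29), the cutoff of 128, the synthetic test image, and the single-run timing are all assumptions.

```java
// Hypothetical illustration; names, weights and cutoff are assumptions.
class PixelLoopSketch {

    // Converts packed ARGB pixels to grayscale in place.
    // The integer weights sum to 256 and approximate common luma weights.
    static void grayscale(int[] argb) {
        for (int i = 0; i < argb.length; i++) {
            int p = argb[i];
            int a = (p >>> 24) & 0xFF;
            int r = (p >>> 16) & 0xFF;
            int g = (p >>> 8) & 0xFF;
            int b = p & 0xFF;
            int y = (77 * r + 150 * g + 29 * b) >> 8;
            argb[i] = (a << 24) | (y << 16) | (y << 8) | y;
        }
    }

    // Maps each grayscale pixel to white if its value is at least
    // the cutoff, otherwise to black. Alpha is preserved.
    static void threshold(int[] gray, int cutoff) {
        for (int i = 0; i < gray.length; i++) {
            int p = gray[i];
            int v = ((p & 0xFF) >= cutoff) ? 0xFF : 0x00;
            gray[i] = (p & 0xFF000000) | (v << 16) | (v << 8) | v;
        }
    }

    public static void main(String[] args) {
        // Synthetic 1920x1080 image; a real test would load a bitmap.
        int width = 1920, height = 1080;
        int[] pixels = new int[width * height];
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = 0xFF000000 | (i & 0x00FFFFFF);
        }

        long start = System.nanoTime();
        grayscale(pixels);
        threshold(pixels, 128);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("Elapsed: " + (elapsedNanos / 1_000_000.0) + " ms");
    }
}
```

A single timed run like this is noisy, since JIT compilation and garbage collection both affect the result; a more careful measurement would repeat the run many times and report a summary statistic. The same per-pixel loops are what a C++ or RenderScript version would have to reproduce for a comparison to be meaningful.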
1.4 Ethics and sustainability

The work presented in this thesis aims to be as ethical as possible, in the sense that all results presented in this thesis are reproducible from the description in chapter 4.
