Optimizing Android in the ARM Ecosystem
ARM Strategic Software Alliances

ARM Engineering Global Coverage

ARM Android Ecosystem Strategy
§ Deliver value throughout the growing Ecosystem
§ Make it easy for ARM Silicon Partners to Deliver Android
  § Off-the-shelf ARM Processor/Board ports and recipes
§ Make Android better on ARM for OEMs
  § Optimize Key Open Source ingredients
§ Engage Developers in use of advanced ARM Tech
  § Blog Posts, Webinars
  § Tools, Libraries
  § Developer Knowledge Sites
  § Developer Relations

Android Boot Recipe for new ARM SoCs
§ ARM brings the latest Android releases up on the latest SoCs
§ “From Zero to Boot” Recipe Blog Post
§ ARM Partners can easily apply the recipe to their SoC
§ Typical bring-up times are on the order of a few days!

Android Bring-Up on ARM SoCs Info
§ From Zero to Boot recipe
  § http://blogs.arm.com/software-enablement/498-from-zero-to-boot-porting-android-to-your-arm-platform/
§ Building Android for ARM Boards - Existing ports description
  § http://linux-arm.org/LinuxKernel/LinuxAndroidPlatform
§ Git repos - existing ports
  § Android Filesystem Patches for ARM Boards
    § http://linux-arm.org/git?p=armdroid.git;a=summary
  § Android Kernel for ARM Boards source tree
    § http://linux-arm.org/git?p=linux-2.6-armdroid.git;a=summary
§ http://source.android.com/source/initializing.html
§ http://developer.android.com/index.html

Key Ingredient Technologies In Google Platforms

[Diagram: software stacks of the Google platforms (Mobile, Tablet, Google TV, Chrome OS). Components shown: Chrome Browser, AIR, Flash, Webkit, V8 JavaScript Engine, SKIA 2D Graphics, LLVM, VP8, Android Middleware, Linux Middleware, Dalvik, X Windows, Linux Kernel.]

§ ARM and Partners focus on optimizing Common Ingredient Technologies across Google OS
  § GCC, Kernel, V8 JavaScript Engine, Webkit, LLVM, Browser, VP8
§ Ongoing contributions to many upstream open source projects
§ Deep technical engagements with many industry-shaping companies, including Google

www.linaro.org
§ Linaro is a not-for-profit engineering company that delivers core Linux technology for the benefit of members
Key goals:
§ Use shared investment to provide high ROI to members
§ Accelerate time to market for member products
§ Reduce fragmentation and resulting costs
§ Work closely with ARM to deliver Linux software and tools for new ARM technology – big.LITTLE, server, ARMv8
§ Make ARM a leading architecture in open source

ARM and Linaro Deliver Significant Goodness
§ ARM Upstream
  § GCC Patches for new CPU Cores
  § Kernel Patches for new CPU Cores
§ Linaro
  § Upstream latest Kernel & GCC patches
  § Transfer advanced ARM tech to world
  § Deliver customized tools and platforms for membership

Linaro Android Performance Improvements
§ PandaBoard 4430 running Android 4.0.4 (ICS) from Linaro is twice as fast as stock Android 4.0.4
§ http://www.youtube.com/watch?v=mrQRYmYip6Q&feature=plcp

[Video comparison: Linaro Android vs. Stock Android]

Linaro Available
§ 10 Jul: Jelly Bean on AOSP
§ 13 Jul: Linaro initial build for
§ 26 Jul: Linaro experimental builds
  § Origen and PandaBoard
§ 30 Aug: Linaro Jelly Bean builds in regular release
§ Links:
  § https://groups.google.com/forum/?fromgroups=#!topic/android-platform/_W63mhUNU0E
  § http://www.youtube.com/watch?v=YPFHwOpW_Ts&feature=plcp
  § http://www.androidauthority.com/galaxy-nexus-gt-i9250-android-4-1-1-jelly-bean-jro03l-aosp-rom-108895/
  § http://www.linaro.org/linaro-blog/2012/08/26/linaro-android-jellybean-on-galaxy-nexus-gsm-speeding-up-phones/

Linaro Core Roadmap

[Roadmap chart, Version 1.0, November 2012. Status legend: Concept, Adv. Planning, Development, Released, Upstream. Timeline: 2012 H1, 2012 H2, 2013 H1, 2013 H2, Future. Entries legible in the chart include Ubuntu 12.10, Android JellyBean, Android AArch64, AArch64 toolchain, GDB for Android, LAVA FastModels, GCC performance and optimizations, Power-aware scheduler, big.LITTLE MP, DeviceTree, uprobes, TrustZone/TEE, QEMU KVM, LPAE, OpenGL ES, DMA-BUF and eMMC 4.5.]

Linaro AArch64 Roadmap

[Roadmap chart, Version 1.0, November 2012. Status legend: Concept, Adv. Planning, Development, Released, Upstream. Timeline: 2012 H1, 2012 H2, 2013 H1, 2013 H2, 2014 H1, Future. Several entries lead into "Upstreaming Community Support". Entries recovered from the chart:]

LEG Core AArch64 support:
- pre-built kernel
- pre-built GNU tools

LEG Bottom-up AArch64 support:
- pre-built kernel
- pre-built GNU tools
- File system support

AArch64 LAVA Integration:
- Community AEM model
- LTP tests running

AArch64 LAVA Integration:
- Member platform readiness
- Directed Member LAVA testing

AArch64 Libraries:
- Basic Libraries
- Performance Libraries

AArch64 Cross build platform:
- Initial platform bootstrapping
- based on OpenEmbedded

Member Landing Team:
- Readiness for early Si
- Model → FPGA → Si
- Pulls from WGs

big.LITTLE:
- 64-bit support
- In-kernel migration path
- MP Focus

AArch64 GNU Tool Chain:
- Upstreaming to OSS projects
- static compiler, assembler, linker, loader, and library
- Debugger/Profiler

AArch64 Kernel Review:
- Public set of patches available
- Linaro support for public review

AArch64 Private code review:
- Invitation only
- F2F at Connect
- Review against ARM specification

Summary:
• Focus on 64-bit bootstrap for members
• Integration directly into LAVA
• Core support for server
• big.LITTLE enablement

Key Ingredient Technologies In Google Platforms

[Diagram repeated: software stacks of the Google platforms (Mobile, Tablet, Google TV, Chrome OS).]

§ Webkit is a common key component across all Google Client Platforms
§ Underlying Framework for high performance HTML5

Optimizing HTML5 with NEON
§ ARM and partners optimizing HTML5 tags
§ 2D bitmap graphics
  § NEON-optimized SKIA
§ 2D vector graphics
  § NEON-optimized SKIA
  § feFilters optimized using SMP and NEON
  § Up to 4X improvement

Improving HTML5 with

§ CSS3 Animation (Accelerated Compositing)
§ HTML5 2D Canvas (Accelerated 2D Canvas)

Evidence of improvement in Webkit using a GPU

Key Ingredient Technologies In Google Platforms

[Diagram repeated: software stacks of the Google platforms (Mobile, Tablet, Google TV, Chrome OS).]

§ The V8 JavaScript Engine is a common key component in high performance HTML5

JavaScript Acceleration on ARM
§ JavaScript accelerated by compiling to native code using JIT
§ ARM performance increased 5X in 1 year
§ ARM has worked on multiple JITs
  § Google V8
  § TraceMonkey
  § JaegerMonkey
  § IonMonkey
  § SquirrelFish / Nitro
  § Tamarin
§ ARM contributions to upstream projects result in performance optimizations across the ecosystem
  § Example: ARM and Google continually optimizing V8 together
  § Optimizations released into public: http://code.google.com/p/v8
  § More info at: http://bit.ly/v8ARMTurbo

[Diagram labels: Google V8; Chrome, Android, Chrome OS, GoogleTV]

Renderscript: High Performance GP/GPU Compute
§ Android’s Renderscript Compute is the first computation platform ported to run directly on a mobile device GPU
§ SW parallelization across all cores
§ ARM directly sponsoring LLVM activity
  § ARM LLVM CPU code-generation
  § ARM optimizing LLVM code Gen
  § All work delivered upstream
§ supports Renderscript Compute directly on ARM Mali-T604

[Diagram labels: Renderscript Code (C99 Like Language); LLVM; Renderscript “binary” (LLVM Intermediary); Compute API; OpenGL ES 2.0 (Native); Compute Driver & (Native Code); ARMv7 CPU (VFP, NEON); Mali GPU]
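
The per-element kernel model behind Renderscript Compute can be illustrated in plain C. This is only a sketch under stated assumptions: it does not use the Renderscript API or toolchain, the names `pixel4`, `invert_kernel` and `for_each_pixel` are hypothetical, and the loop runs serially where a real runtime would be free to spread the elements across CPU cores or a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical RGBA element type, standing in for a 4-channel pixel. */
typedef struct { uint8_t r, g, b, a; } pixel4;

/* A kernel maps exactly one input element to one output element. */
typedef void (*pixel_kernel)(const pixel4 *in, pixel4 *out);

/* Example kernel: invert the colour channels and keep alpha unchanged. */
void invert_kernel(const pixel4 *in, pixel4 *out)
{
    out->r = (uint8_t)(255 - in->r);
    out->g = (uint8_t)(255 - in->g);
    out->b = (uint8_t)(255 - in->b);
    out->a = in->a;
}

/* Hypothetical driver: applies the kernel to every element.
 * Each iteration is independent, which is what lets a runtime
 * hand different elements to different cores. */
void for_each_pixel(pixel_kernel k, const pixel4 *in, pixel4 *out, size_t n)
{
    assert(k != NULL && in != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i)
        k(&in[i], &out[i]);
}
```

Because the kernel touches only its own element, the same kernel body could be scheduled on any number of cores without changing the result.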

LLVM for future Native Web Apps
§ NaCl
  § Native Client enables native compiled code in the browser
  § Current support for C/C++
§ PNaCl
  § Portable Native Client compiles NaCl applications to LLVM bitcode

[Diagram labels: C/C++ Code; NaCl SDK; PNaCl Cross Compiler; pexe portable executable (LLVM Bitcode); HTML5; Internet; Browser; LLVM Backend Translator; ARMv7 CPU (VFP, NEON)]

ARM Engaging with Android Developers
§ DS-5 Advanced Native Development Tools
  § Detailed Profiling of ARM CPU and GPU
  § Free Community Edition
§ androidtools.org site
  § Aggregate website showing Android App development methodologies
  § Test and review of available Android tools
§ Blogs
  § NEON Coding
  § Tools Setup
  § V8 JavaScript Activity
§ projectNe10.org open source library project
  § Lets developers get the most out of ARMv7/NEON coding
  § Open Source Project suggestions/contributions welcome
§ http://www.malideveloper.com/ Mali Dev Resources

DS-5 Streamline Profiler for Android

§ Find hotspots, system glitches, critical conditions at a glance

Select from 40+ CPU counters, OS level and custom metrics

Select one or more processes to visualize their instant load on CPU

Accumulate counters, measure time and find instant hotspots

Combined task switch trace and sampled profile for all threads

Mali GPU and Graphics Analysis

CPU, GPU fragment and vertex processing activity

OpenGL® API events

Frame buffer filmstrip

Hardware and Software counters

Visualize application activity per processor or processor activity per application

NEON - Enhancing User Experiences

Watch any video in any format

Game processing

Edit & Enhance captured videos

Video stabilization

Process megapixel photos quickly

Antialiased rendering & compositing

Voice recognition

Advanced User Interfaces

Powerful multichannel hi-fi audio processing

NEON in Open Source Today
§ Google WebM – 11,000 lines NEON assembler!
§ Bluez – official Linux Bluetooth protocol stack
§ Pixman (part of cairo 2D graphics library)
§ ffmpeg – libavcodec
  § LGPL media player used in many Linux distros and products
  § Extensive NEON optimizations
§ x264 – 2009
  § GPL H.264 encoder – e.g. for video conferencing
§ Android – NEON optimizations
  § Skia library, S32A_D565_Opaque 5x faster using NEON
  § Available in Google Skia tree from 03-Aug-2009
§ LLVM – code generation backend used by Android RenderScript
§ Eigen2 – C++ vector math / linear algebra template library
§ TheorARM – libtheora NEON version (optimized by Google)
§ libjpeg / libjpeg-turbo – optimized JPEG decode
§ libpng – optimized PNG decode
§ FFTW – NEON enabled FFT library
§ Liboil / liborc – runtime compiler for SIMD processing
§ webkit – used by Chrome Browser
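
To make the Skia entry more concrete: the routine name S32A_D565_Opaque suggests blending 32-bit source pixels with alpha onto a 16-bit RGB565 destination. The scalar C sketch below shows that kind of per-pixel work, which is the sort of loop NEON can process several pixels at a time. It is not Skia's code; the function names, the premultiplied 0xAARRGGBB layout and the rounding are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale x (0..255) by f (0..255), i.e. x * f / 255 with rounding. */
static uint32_t scale255(uint32_t x, uint32_t f)
{
    return (x * f + 127u) / 255u;
}

/* Blend one premultiplied 0xAARRGGBB pixel over one RGB565 pixel.
 * Both pixel layouts are assumptions made for this sketch. */
uint16_t blend_argb_over_565(uint32_t src, uint16_t dst)
{
    uint32_t inv = 255u - (src >> 24);   /* share of dst that shows through */
    uint32_t sr = (src >> 16) & 0xFFu;
    uint32_t sg = (src >> 8) & 0xFFu;
    uint32_t sb = src & 0xFFu;

    /* Expand the 5/6/5-bit channels to 8 bits by replicating top bits. */
    uint32_t dr5 = (dst >> 11) & 0x1Fu;
    uint32_t dg6 = (dst >> 5) & 0x3Fu;
    uint32_t db5 = dst & 0x1Fu;
    uint32_t dr = (dr5 << 3) | (dr5 >> 2);
    uint32_t dg = (dg6 << 2) | (dg6 >> 4);
    uint32_t db = (db5 << 3) | (db5 >> 2);

    /* Source-over for premultiplied colour: out = src + dst * (1 - alpha). */
    uint32_t r = sr + scale255(dr, inv);
    uint32_t g = sg + scale255(dg, inv);
    uint32_t b = sb + scale255(db, inv);
    if (r > 255u) r = 255u;
    if (g > 255u) g = 255u;
    if (b > 255u) b = 255u;

    /* Repack to RGB565 by dropping the low bits of each channel. */
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Blend a row of n source pixels onto n destination pixels in place. */
void blend_row(const uint32_t *src, uint16_t *dst, size_t n)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; ++i)
        dst[i] = blend_argb_over_565(src[i], dst[i]);
}
```

Every pixel goes through the same unpack, multiply, add and repack steps with no dependency on its neighbours, which is why this kind of routine is a natural candidate for SIMD.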

NEON and Project Ne10
§ NEON™ is a wide SIMD data processing architecture
  § Extension of the ARM® instruction set
  § 32 registers, 64-bits wide (dual view as 16 registers, 128-bits wide)
§ Dozens of Android™ subsystems & thousands of applications use NEON
§ NEON optimizations improve execution and battery performance
  § VP8 codec contains >11K lines of NEON code
  § Some Skia routines >500% improvement
  § HTML5 SVG filters improved by up to 400%
§ Ne10 is a library of highly optimized common functions callable from C
  § Lets you get the most out of ARMv7/NEON without arduous coding
  § Free as in both beer and speech (Apache License)
  § Easy to use out-of-the-box
§ Project Ne10 is an open source project hosted by ARM
  § Hosted on github, suggestions/contributions welcome: github.com/projectNe10
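
As a small illustration of the 128-bit register view mentioned on this slide, the C sketch below adds two float arrays four lanes at a time using the standard NEON intrinsics from `arm_neon.h` (`vld1q_f32`, `vaddq_f32`, `vst1q_f32`). The function name `add_f32` is hypothetical, and the scalar path is an assumption added so the same source also builds on targets without NEON. It is not code from Ne10 or from any project named in this deck.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Element-wise out[i] = a[i] + b[i] for n floats. */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* One 128-bit Q register holds four 32-bit floats. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
    }
#endif

    /* Scalar loop: handles the tail, or everything when NEON is absent. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

On an ARMv7 toolchain the NEON path typically needs to be enabled at build time (for example with GCC's `-mfpu=neon`); without it the scalar loop does all the work and the result is the same.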

Additional NEON Technical Resources
§ NEON blog posts
  § Coding Examples
§ NEON simulator Web App
  § Nevada Project on github
  § Joint project with Szeged
  § Open Source
  § http://szeged.github.com/nevada/
§ NEON Best Practices
  § In Cortex-A Guide
  § When and how to use NEON
  § http://infocenter.arm.com/help/topic/com.arm.doc.den0013c/index.html

HTML5 – The Converging OS

§ Websites become the App Store
§ Apps become the Web!
§ Open architecture enables many stores
§ Ability to monetize remains
§ Do you sign off on this App?
§ Many leading Apps already are in HTML5

[Diagram labels: Standards Enable Convergence; Unified Content Delivery]

Profiling JavaScript HTML5 Execution
§ ARM created an extension to and Webkit Browsers
§ Developers see hotspot analysis while specific JavaScript is executing
§ Zero in on key areas to optimize in browser engine for web Apps
§ Find bottlenecks in specific web Apps

Mobile OS: A True Web-based Platform
§ Firefox Mobile OS uses Android Kernel
§ ARM & Thundersoft are collaborating to integrate Streamline
§ Full profiling of Firefox Mobile OS from Web Apps to Kernel

[Diagram: platform stack, top to bottom]
User Interface & Apps
Mozilla Gecko Web Engine
Standard APIs (JavaScript)
Contacts, NFC, Camera, Bluetooth, SMS, Telephony, Audio, Location, Settings
OS Kernel (e.g., Android Linux, etc.): Android Kernel & Device Driver Framework
Device Hardware

12/3/12

ARM growing the Android Ecosystem
§ Creating and growing an Ecosystem requires:
  § Correct Technology Investment
    § Processor Architecture for 21st Century Web-centric computing
    § Tools to get the best from Processor Architecture
    § Optimization of Open SW Technologies for the present and future
    § Ready availability of Tools and Hardware Reference Designs
  § An Ecosystem nucleus that partners and customers can orbit around
    § Access the latest technology
    § Collaborate
    § Compete on a level playing field
    § Partner with each other
    § Deliver solutions
    § Attain Success!
