2015-Samos-06


Imposing Coarse-Grained Reconfiguration to General Purpose Processors

M. Duric*, M. Stanic*, I. Ratkovic*, O. Palomar*†, O. Unsal*, A. Cristal*†‡, M. Valero*†, A. Smith§

*Barcelona Supercomputing Center, {first}.{last}@bsc.es
†Universitat Politècnica de Catalunya
‡IIIA-CSIC
§Microsoft Research, [email protected]

Abstract: Mobile devices execute applications with diverse compute and performance demands. This paper proposes a general purpose processor that adapts the underlying hardware to a given workload. Existing mobile processors need to utilize more complex heterogeneous substrates to deliver the demanded performance. They incorporate different cores and specialized accelerators. In contrast, our processor utilizes only modest homogeneous cores and dynamically provides an execution substrate suitable to accelerate a particular workload. Instead of incorporating accelerators, the processor reconfigures one or more cores into accelerators on the fly. It improves performance with minimal hardware additions.

The accelerators are made of general purpose ALUs reconfigured into a compute fabric and the general purpose pipeline that streams data through the fabric. To enable reconfiguration of ALUs into the fabric, the floorplan of a 4-core processor is changed to place the ALUs in close proximity on the chip. A configurable switched network is added to couple and dynamically reconfigure the ALUs to perform computation of frequently repeated regions, instead of executing general purpose instructions. Through this reconfiguration, the mobile processor specializes its substrate for a given workload and maximizes performance of the existing resources. Our results show that reconfiguration accelerates a set of selected compute intensive workloads by 1.56x, 2.39x, and 3.51x when configuring the accelerator from 1, 2, or 4 cores, respectively.

Keywords: reconfigurable computing, dynamic processors

I. INTRODUCTION

Mobile devices are becoming ubiquitous and the mobile market segment is starting to dominate the computer industry. The competitive market requires constant improvement in the quality of these devices. Modern mobiles enable new compelling user experiences such as speech and gesture interfaces. Mobile applications manage an ever-increasing amount of data such as photos, videos, and a world of content available in the cloud. On such inputs, the applications repeat their computation many times and offer potential for acceleration. To harness this potential, recent mobile processors have shifted toward heterogeneous computing. Heterogeneous architectures [1], [2] integrate cores with different power and performance characteristics, GPUs, and specialized accelerators in a single chip. These architectures are more complex, but yield extra performance by choosing the appropriate unit for each workload, depending on the workload characteristics and particular needs.

Accelerators usually increase performance by using fixed hardware structures that match the computation of hot code regions. Fixed-function hardware limits the applicability of accelerators. Hence, multiple accelerators are typically incorporated in the design. They incur additional area overheads and increase the design and verification cost of mobile processors. Minimizing their costs directly affects the final cost of the mobile processor. To moderate the costs and mitigate the limited applicability, various designs propose reconfigurable compute accelerators [3], [4], [5], [6], [7], [8], [9], [10]. They increase performance over different workloads, but require a considerable amount of compute resources to enable acceleration specialized for each workload. To avoid adding such resources to mobile processors, we propose reconfiguration of the general purpose cores into accelerators.

This work contributes a novel reconfigurable yet low cost chip multiprocessor (CMP) which: 1) maximizes the computational capabilities of the existing general purpose cores; 2) minimizes the amount of extra hardware for accelerators.¹ The processor is made of four composable lightweight cores [12], [13]. Composability avoids static placement of the resources and allows a workload to dynamically optimize its execution substrate by composing one or more cores into a large processor. Our approach goes one step further than composability and reconfigures one or more composable cores into accelerators. The accelerators consist of the core's ALUs reconfigured into a compute fabric and the core's pipeline that streams data through the fabric. The fabric resembles the power-efficient computing substrate introduced by [8]. However, instead of integrating such a substrate, our fabric places all the ALUs available in the processor at the center of the chip and connects them with a configurable circuit-switched network. The network permits the fabric to be configured to perform various commonly repeated computations. The pipeline of the accelerator can be configured as well, by composing the resources of one or more cores around the fabric to tune the memory capabilities of the accelerator and further increase its performance.

This work has been partially funded by the Spanish Government (TIN2012-34557), the European Research Council under the EU's 7th FP (FP/2007-2013) / ERC GA n. 321253, and Microsoft Research.

¹ A sketch of this idea [11] with a very basic implementation and simplified evaluation showed promising results.

978-1-4799-2079-2/13/$31.00 ©2013 IEEE
978-1-4673-7311-1/15/$31.00 ©2015 IEEE

II. BACKGROUND AND RELATED WORK

We build our design on composable cores based on an Explicit Data Graph Execution (EDGE) architecture [14], [15]. The EDGE architecture is an efficient research vehicle for low power mobile computing. Compiler analysis is used to divide a program into blocks of instructions that execute atomically (Atomic Instruction Block or AIB). An AIB consists of a sequence of dataflow instructions. EDGE compilers statically generate the dataflow and encode it in the EDGE ISA. The encoded dataflow defines producer-consumer relationships between EDGE instructions and avoids power-hungry out-of-order hardware. Instructions inside the block communicate directly. Each instruction leverages two reservation stations, which hold left and right operands respectively. Producer instructions encode targets that route their outputs to the appropriate reservation stations of consumer instructions. Register operations are used only for handling less-frequent inter-block communication, by keeping the temporary results between the AIBs. Composing EDGE cores increases performance and efficiency [12], [13]. Rather than fixing the size of cores at design time, one or more lightweight physical cores can be composed at runtime to form a larger, wider-issue logical core by using an on-chip routed network between the cores. Reconfiguration of composable cores is an advanced dynamic feature that adds a circuit-switched network between the ALUs of composable cores to specialize one or more cores into an accelerator and further increase their performance and efficiency.

There is previous work that proposes to reconfigure existing resources of a processor into some kind of accelerator. In the context of EDGE architectures, vector execution on EDGE cores [16], [17] dynamically repurposes the general purpose cores into a vector processor. The vector processor executes vector AIBs, which allocate the existing compute resources and repeat the computation by streaming the values of large vectors. A vector memory unit is incorporated to decouple vector memory accesses [...]

[...] architectures, which incorporate a set of configurable data processing units. Each unit performs one compute task of the parallel workload (e.g. matrix arithmetic, signal or image processing), while data passes from one unit to another in a pipeline. The units are coupled by using a configurable interconnect in a grid-like compute fabric. Such a fabric is integrated into a processor pipeline like a back-end processor. The pipeline arranges compute data, while the fabric computes the data. The fabric is configured by mapping compute instructions to it. It improves performance by "spatially" executing multiple compute instructions in different stages of the fabric. The fabric also improves efficiency by repeating a once-configured computation many times, which avoids fetch and decode overheads for each compute instruction [19]. Instead of integrating such a fabric into a lightweight mobile processor, we propose reconfiguration of the processor compute resources into a fabric. Our fabric is designed to resemble the previously proposed computing substrates in performance and efficiency, while avoiding as much of their area overhead as possible.

III. RECONFIGURATION OF CORES

The reconfiguration of composable cores extends their capabilities beyond general purpose processing. Bulk resources can be allocated and used either as general purpose processors or accelerators. Each application running on this processor dynamically tunes the amount of allocated resources and their configuration to achieve the desired performance and efficiency. For example, one application may be executed by using one or more cores of the CMP for general purpose processing and another one accelerated by using the cores reconfigured into an accelerator.

Reconfiguration makes an accelerator of one or more general purpose cores. The accelerator is composed of the general purpose ALUs reconfigured to perform like a compute fabric and the general purpose pipeline that streams [...]
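As an illustrative aside, the dataflow execution model summarized in Section II (each instruction holds a left and a right operand, and producers name the consumer slots that receive their result) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the EDGE ISA encoding or the authors' implementation: the names `Instr`, `run_block`, and `OPS`, the tuple format for targets, and the restriction to two-operand arithmetic are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
import operator

# Hypothetical opcode table; real EDGE ISAs define their own instruction sets.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


@dataclass
class Instr:
    op: str
    targets: list          # list of (consumer_index, "left" | "right")
    left: object = None    # models the left reservation station
    right: object = None   # models the right reservation station

    def ready(self):
        return self.left is not None and self.right is not None


def run_block(instrs, inputs):
    """Fire instructions as their operands arrive.

    inputs: list of (instr_index, slot, value) operands injected into the
    block, e.g. values read from registers at block entry.
    Returns a dict mapping instruction index to its computed result.
    """
    results = {}
    pending = list(inputs)
    while pending:
        idx, slot, value = pending.pop()
        ins = instrs[idx]
        setattr(ins, slot, value)
        if ins.ready() and idx not in results:
            out = OPS[ins.op](ins.left, ins.right)
            results[idx] = out
            # Route the result directly to the consumers' operand slots.
            for tgt_idx, tgt_slot in ins.targets:
                pending.append((tgt_idx, tgt_slot, out))
    return results


# Block computing (a + b) * (a - b) with a = 5, b = 3.
block = [
    Instr("add", [(2, "left")]),
    Instr("sub", [(2, "right")]),
    Instr("mul", []),
]
results = run_block(block, [(0, "left", 5), (0, "right", 3),
                            (1, "left", 5), (1, "right", 3)])
print(results[2])  # 16
```

No instruction in the sketch names a register for intermediate values; the only ordering constraint is operand arrival, which is the property the text credits with avoiding out-of-order hardware.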