4x ARM Cortex-A57 • 4x ARM Cortex-A53 • Adreno 430 GPU • Hexagon V56 DSP • 20 nm Process Technology
Total Page:16
File Type:pdf, Size:1020Kb
Recommended publications
								- 
  Enabling the Use of Low Power Mobile and Embedded Technologies For

  Enabling the Use of Embedded and Mobile Technologies for High-Performance Computing. Author: Nikola Rajović. Advisor: Alex Ramirez. A thesis submitted in fulfilment of the requirements for the degree of Doctor per la Universitat Politècnica de Catalunya, Departament d'Arquitectura de Computadors. Barcelona, Spring 2017.

  To my family. Мојој породици.

  Acknowledgements. During my PhD studies I received help and support from many people; it would be strange if I did not. Thus I would like to thank professors Mateo Valero and Veljko Milutinović for recognizing my interest in HPC and introducing me to my advisor, professor Alex Ramirez. I would like to thank him for trusting in me and giving me an opportunity to be a pioneer of HPC on mobile ARM platforms, for guiding and shaping my work over the last five years, for helping me understand what real priorities are, and for being pushy when it was really needed. In addition, thank you Carlos and Puzo for helping me to have a smooth start of my PhD and for filling the gap when Alex was too busy. For the last two years of my PhD I have been working with Alex remotely, and I would like to thank the local crew who helped me: professor Eduard Ayguade for providing me with some very hard-to-get manuscripts and agreeing to be my "ponente" at the University; professor Jesus Labarta for thorough discussions about parallel performance issues and know-how lessons on demand; Alex Rico for helping me deliver work for the Mont-Blanc project and for friendly advice every so often; Filippo Mantovani for making sure I could use the prototypes without outages and for filtering out internal bureaucracy issues.
- 
  Stochastic Modeling and Performance Analysis of Multimedia SoCs

  Balaji Raman, Ayoub Nouri, Deepak Gangadharan, Marius Bozga, Ananda Basu, Mayur Maheshwari, Jérôme Milan, Axel Legay, Saddek Bensalem, Samarjit Chakraborty.

  To cite this version: Balaji Raman, Ayoub Nouri, Deepak Gangadharan, Marius Bozga, Ananda Basu, et al. Stochastic Modeling and Performance Analysis of Multimedia SoCs. SAMOS XIII - Embedded Computer Systems: Architectures, Modeling, and Simulation, Jul 2013, Agios Konstantinos, Samos Island, Greece. pp. 145-154, 10.1109/SAMOS.2013.6621117. hal-00878094.

  HAL Id: hal-00878094. https://hal.archives-ouvertes.fr/hal-00878094. Submitted on 29 Oct 2013.

  HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

  Affiliations: Balaji Raman (1), Ayoub Nouri (1), Deepak Gangadharan (2), Marius Bozga (1), Ananda Basu (1), Mayur Maheshwari (1), Jérôme Milan (3), Axel Legay (4), Saddek Bensalem (1), and Samarjit Chakraborty (5). (1) VERIMAG, France; (2) Technical University of Denmark; (3) Ecole Polytechnique, France; (4) INRIA Rennes, France; (5) Technical University of Munich, Germany. E-mail: [email protected]

  Abstract: Quality of video and audio output is a design-time constraint for portable multimedia devices. [...] to estimate buffer size for an acceptable output quality.
- 
  An Emerging Architecture in Smart Phones

  International Journal of Electronic Engineering and Computer Science, Vol. 3, No. 2, 2018, pp. 29-38. http://www.aiscience.org/journal/ijeecs

  ARM Processor Architecture: An Emerging Architecture in Smart Phones. Naseer Ahmad, Muhammad Waqas Boota*. Department of Computer Science, Virtual University of Pakistan, Lahore, Pakistan.

  Abstract: ARM is a 32-bit RISC processor architecture. It is developed and licensed by the British company ARM Holdings. ARM Holdings does not manufacture or sell CPU devices; it only licenses the processor architecture to interested parties. There are two main types of licences: implementation licenses and architecture licenses. ARM processors have a unique combination of features; for example, the ARM core is very simple compared to general-purpose processors. An ARM chip has several peripheral controllers, a digital signal processor, and an ARM core. ARM processors consume less power but provide high performance. Nowadays, the ARM Cortex series is very popular in smartphone devices, and we also look at the important characteristics of the Cortex series. We discuss the ARM processor and systems on a chip (SoCs), including the Qualcomm Snapdragon, Nvidia Tegra, and Apple SoCs. In this paper, we discuss the features of the ARM processor and the Intel Atom processor and see which processor is best. Finally, we discuss the future of the ARM processor in smartphone devices.

  Keywords: RISC, ISA, ARM Core, System on a Chip (SoC)

  Received: May 6, 2018 / Accepted: June 15, 2018 / Published online: July 26, 2018. © 2018 The Authors. Published by American Institute of Science. This Open Access article is under the CC BY license.
- 
  Comparative Study of Various Systems on Chips Embedded in Mobile Devices

  Innovative Systems Design and Engineering, www.iiste.org, ISSN 2222-1727 (Paper), ISSN 2222-2871 (Online), Vol. 4, No. 7, 2013 - National Conference on Emerging Trends in Electrical, Instrumentation & Communication Engineering.

  Deepti Bansal (Assistant Professor), BVCOE, New Delhi. Tel N: +919711341624. Email: [email protected]

  Abstract: Systems-on-chips (SoCs) are the latest incarnation of very large scale integration (VLSI) technology. A single integrated circuit can contain over 100 million transistors. Harnessing all this computing power requires designers to move beyond logic design into computer architecture, meet real-time deadlines, ensure low-power operation, and so on. These opportunities and challenges make SoC design an important field of research. In this paper we focus on the various aspects of SoCs and the applications they offer. The different parameters to be checked for functional verification, such as integration and complexity, are also described in brief. We focus mainly on the applications of systems on chip in mobile devices, and then compare various mobile vendors in terms of parameters such as cost, memory, features, weight, battery life, and audio and video applications. A brief discussion of the upcoming SoC technologies for smartphones announced by Intel, Microsoft, Texas etc. is also taken up.

  Keywords: System on Chip, Core Frame Architecture, Arm Processors, Smartphone.

  1. Introduction: What Is SoC? We first need to define system-on-chip (SoC). A SoC is a complex integrated circuit that implements most or all of the functions of a complete electronic system.
- 
  Overview of CPU Power Consumption and Management in Smartphones

  Prof. Sasu Tarkoma, University of Helsinki, Aalto University, Helsinki Institute for Information Technology

  Contents
  • Modern smartphone SoC and CPUs
    – The CPU: power states
    – Power management basics
  • Smartphone solutions
    – Linux CPU Frequency subsystem
    – Power models
  • Intra-device task offloading
    – Sensor hub
    – Heterogeneous multiprocessing
  • Computation offloading

  Smartphones
  • Smartphones have become hubs for applications and connecting with the Internet
  • Cloud has emerged as a backend for mobile applications
  • Mobile data and WiFi are the dominant protocols for connecting with Internet resources
  • The next generation solutions are addressing limitations of the current smartphones
    – Coordination of resource usage
    – Offloading in its many forms
    – Heterogeneous environment and the emergence of IoT / M2M / wearables

  Observations
  • Smartphone and mobile device hardware and software evolve rapidly
  • Multiple wireless protocols
  • Heterogeneous computing over multiple cores
    – Dedicated subsystems (sensor hubs)
    – Increasing number of sensing subsystems
    – Always-on sensing
  • Battery technology has not kept pace with the development
  • Software is not, in many cases, optimized
  • Difficult to balance between local versus distributed processing
  • Difficult to control traffic across interfaces

  Mobile Evolution

  | | 1995 | 2000 | 2005 | 2010 | 2015 |
  |---|---|---|---|---|---|
  | Processor | Single | Single | Single | Dual-core | Quad-core and beyond, auxiliary processors, sensor hubs |
  | Cellular generation | 2G | 2.5-3G | 3.5G | Transition toward 4G | 4G |
  | Standard | GSM | GPRS | HSPA | HSPA, LTE | LTE, LTE-A |
  | Downlink (Mb/s) | 0.01 | 0.1 | 1 | 10 | 100 |
  | Display pixels (x1000) | 4 | 16 | 64 | 256 | 1024 |
  | Communication modules | - | - | WiFi, Bluetooth | WiFi, Bluetooth | WiFi, Bluetooth LE, RFID |
  | Battery capacity (Wh) | 1 | 2 | 3 | 4 | 5 |
  | Software (MB) | 0.1 | 1 | 10 | 100 | 1000 |

  [Figure: Example Smartphone SoC block diagram with Modem, Multicore, and Multimedia subsystems; labels include LTE World Modem, Krait CPUs with L1 and L2 caches, Adreno GPU, DSPs, GPS, Wi-Fi, BT, FM, and audio/video hardware accelerators.]
- 
  DISI - University of Trento

  PhD Dissertation. International Doctorate School in Information and Communication Technologies, DISI - University of Trento. Cyber-Physical Systems: Two Case Studies in Design Methodologies. Luca Rizzon. Advisor: Prof. Roberto Passerone, Università degli Studi di Trento. April 2016.

  Abstract: To analyze embedded systems, engineers use tools that can simulate the performance of software components executed on hardware architectures. When the embedded system functionality is strongly correlated to physical quantities, as in the case of a Cyber-Physical System (CPS), we need to model physical processes to determine the overall behavior of the system. Unfortunately, embedded systems simulators are not generally suitable for evaluating physical processes, and in the same way physical model simulators hardly capture the functionality of computing systems. In this work, we present a methodology to concurrently explore these aspects using the metroII design framework. The methodology provides guidelines for the implementation of these models in the design environment. To demonstrate the feasibility of the proposed approach, we applied the methodology to two case studies. The first case study concerns a binaural guidance system developed to be included in a smart rollator for older adults. The second consists of an energy recovery device which harvests energy from the heat dissipated by a high-performance processor and powers a smart sink able to provide cooling or to serve as a wireless sensing node.

  Keywords: Cyber-Physical Systems, Design Methodology, Binaural Synthesis, Energy Harvesting

  Contents
  1 Introduction
  1.1 CPS Design Challenges
  1.2 Structure of the Thesis
  2 Design Methodology
  2.1 State of the Art
  2.2 MetroII Design Framework
- 
  M-Line BROCHURE

  The Novasom Industries M-line was created for those advanced multimedia applications where the computing power and the presence of specific HW accelerators are needed as much as the advanced connectivity to various kinds of displays while maintaining the classic low-level industrial connectivity.

  Novasom Industries M11 series, based on the new Intel Apollo Lake x5 6th generation Atom CPU with Microsoft Windows 10 and UHD (4k) video capabilities, is perfect for the typical Kiosks & Digital-Signage applications. The M7 is based on the Rockchip RK3328, a 4x A53 processor, and can drive UHD (4k) displays, has USB3 & 2, HDMI and supports Android OS. The M8 board runs Qualcomm Snapdragon 410E with Android and Windows 10 IoT and can be connected to FHD displays. The M9, based on the Rockchip RK3399, offers an Android-like experience and high-end multimedia applications with UHD, multiple video and camera input. All the M-Line boards support Linux OS.

  RASPMOOD form factor for M7, M8 and M9: dimensions, mechanical holes, expansion pin on strip, connector kind and position are similar to the famous Pi Family. So if you've started with a toy-board and want to use it in an industrial proposal, we are ready.

  • Complete SBC with immediate bootstrap
  • Native Android & Linux O.S. support (M7, M8 & M9)
  • Native Windows 10 and Linux (M11)
  • Embedded UPS manager with battery and Redundant Power Input
  • HD Audio output and Optical SPDIF
  • mPCIe interface slot (M9, M7 & M11)
  • Fluidity and no scratch on Heavy UHD play guaranteed
  • Fully certified board, visit www.novasomindustries.com for details
- 
  Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM

  Luigi Nardi (1), Bruno Bodin (2), M. Zeeshan Zia (1), John Mawer (3), Andy Nisbet (3), Paul H. J. Kelly (1), Andrew J. Davison (1), Mikel Luján (3), Michael F. P. O'Boyle (2), Graham Riley (3), Nigel Topham (2) and Steve Furber (3)

  Fig. 1: The SLAMBench framework makes it possible for experts coming from the computer vision, compiler, run-time, and hardware communities to cooperate in a unified way to tackle algorithmic and implementation alternatives.

  Abstract: Real-time dense computer vision and SLAM offer great potential for a new level of scene modelling, tracking and real environmental interaction for many types of robot, but their high computational requirements mean that use on mass market embedded platforms is challenging. Meanwhile, trends in low-cost, low-power processing are towards massive parallelism and heterogeneity, making it difficult for robotics and vision researchers to implement their algorithms in a performance-portable way. In this paper we introduce SLAMBench, a publicly-available software framework which represents a starting point for quantitative, comparable and validatable experimental research to investigate trade-offs in [...]

  While classical point feature-based simultaneous localisation and mapping (SLAM) techniques are now crossing into mainstream products via embedded implementation in projects like Project Tango [3] and Dyson 360 Eye [1], dense SLAM algorithms with their high computational requirements are largely at the prototype stage on PC or laptop platforms. Meanwhile, there has been a great focus in computer vision on developing benchmarks for accuracy comparison, but not on analysing and characterising the envelopes for performance and energy consumption.
- 
  QTEE STOR, Is Shown in Figure 4

  Qualcomm Snapdragon, Qualcomm Trusted Execution Environment, Qualcomm Secure Storage Solutions, Qualcomm Secure Processing Unit, Qualcomm Secure File System, and Qualcomm Fast Trusted Storage are products of Qualcomm Technologies, Inc. and/or its subsidiaries. Qualcomm and Snapdragon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Other products and brand names may be trademarks or registered trademarks of their respective owners. The contents of this document are provided on an "as-is" basis without warranty of any kind. Qualcomm Technologies, Inc. specifically disclaims the implied warranties of merchantability and fitness for a particular purpose.

  Qualcomm Technologies, Inc., 5775 Morehouse Drive, San Diego, CA 92121, U.S.A. © 2019 Qualcomm Technologies, Inc. and/or its affiliated companies. All Rights Reserved.

  Contents
  Overview
  Acronyms
  Limitation of pure software-based solutions
  Hardware building blocks
  Qualcomm® Trusted Execution Environment
  Hardware Crypto Engine
- 
  Report on Tuned Linux-ARM Kernel and Delivery of Kernel Patches to the Linux Kernel Version 1.0

  D5.4 - Report on Tuned Linux-ARM kernel and delivery of kernel patches to the Linux kernel, Version 1.0

  Document Information
  Contract Number: 288777
  Project Website: www.montblanc-project.eu
  Contractual Deadline: M18
  Dissemination Level: PU
  Nature: Other
  Coordinator: Alex Ramirez (BSC)
  Contributors: Roxana Rusitoru (ARM)
  Reviewers: Isaac Gelado (BSC)
  Keywords: Linux kernel, HPC, Transparent HugePage

  Notices: The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no 288777. © 2013 Mont-Blanc Consortium Partners. All rights reserved.

  Change Log
  v0.1: Initial draft released to the WP5 contributors
  v0.2: Revised draft released to the WP5 contributors
  v1.0: Version to send to the WP leaders and the EC

  Contents
  1 Introduction
  2 Analysis of Transparent HugePages: pandaboard
  2.1 Introduction
  2.2 Hardware setup
  2.2.1 Limitations
  2.3 Benchmarks
  2.3.1 Setup
  2.3.1.1 Compilation flags
  2.3.1.2 Processor affinity
  2.3.1.3 ATLAS
  2.4 Sample HPC application - bigDFT
  2.5 Methodology
  2.6 Results
  2.6.1 Observations
  2.7 Conclusions
  2.8 Further work
  2.8.1 Kernel profiling
  2.8.2 In-depth analysis of results
- 
  arXiv:1910.06663v1 [cs.PF] 15 Oct 2019

  AI Benchmark: All About Deep Learning on Smartphones in 2019

  Andrey Ignatov (ETH Zurich, [email protected]), Radu Timofte (ETH Zurich, [email protected]), Andrei Kulik (Google Research, [email protected]), Seungsoo Yang (Samsung, Inc., [email protected]), Ke Wang (Huawei, Inc., [email protected]), Felix Baum (Qualcomm, Inc., [email protected]), Max Wu (MediaTek, Inc., [email protected]), Lirong Xu (Unisoc, Inc., [email protected]), Luc Van Gool∗ (ETH Zurich, [email protected])

  Abstract: The performance of mobile AI accelerators has been evolving rapidly in the past two years, nearly doubling with each new generation of SoCs. The current 4th generation of mobile NPUs is already approaching the results of CUDA-compatible Nvidia graphics cards presented not long ago, which together with the increased capabilities of mobile deep learning frameworks makes it possible to run complex and deep AI models on mobile devices. In this paper, we evaluate the performance and compare the results of [...]

  [...] compact models as they were running at best on devices with a single-core 600 MHz Arm CPU and 8-128 MB of RAM. The situation changed after 2010, when mobile devices started to get multi-core processors, as well as powerful GPUs, DSPs and NPUs, well suitable for machine and deep learning tasks. At the same time, there was a fast development of the deep learning field, with numerous novel approaches and models that were achieving a fundamentally new level of performance for many practical tasks, such as image classification, photo and speech processing, neural language understanding, etc.
- 
  Arm-Based Computing Platform Solutions: Accelerating Your Arm Project Development

  Standard Hardware Solutions • AIM-Linux & AIM-Android Services • Integrated Peripherals • Trusty Ecosystem. www.advantech.com

  Key Factors for Arm Business Success

  Advantech's Arm computing solutions provide an open and unified development platform that minimizes effort and increases resource efficiency when deploying Arm-based embedded applications. Advantech Arm computing platforms fulfill the requirements of power-optimized mobile devices and performance-optimized applications with a broad offering of Computer-on-Modules, single board, and box computer solutions based on the latest Arm technologies. This year, Advantech's Arm computing will roll out three new innovations to lead embedded Arm technologies into new arenas:
  1. The i.MX 8 series aims for next generation computing performance and targets new application markets like AI.
  2. Developing a new standard: UIO20/40-Express, an expansion interface for extending various I/Os easily and quickly for different embedded applications.
  3. We are announcing the Advantech AIM-Linux and AIM-Android, which provide unified BSP, modularized App Add-Ons, and SDKs for customers to accelerate their application development.

  Standardized Hardware Solutions
  • Computer on Module
  • Single Board Computer
  • Computer Box

  AIM-Linux & AIM-Android
  • Unified Embedded Platforms
  • App Add-Ons