Hexagon Dsp: an Architecture Optimized for Mobile Multimedia and Communications

Total Page:16

File Type:pdf, Size:1020Kb

Hexagon Dsp: an Architecture Optimized for Mobile Multimedia and Communications ................................................................................................................................................................................................................. HEXAGON DSP: AN ARCHITECTURE OPTIMIZED FOR MOBILE MULTIMEDIA AND COMMUNICATIONS ................................................................................................................................................................................................................. THE QUALCOMM HEXAGON DSP IS USED FOR BOTH MODEM PROCESSING AND MULTIMEDIA ACCELERATION.BY OFFLOADING MULTIMEDIA TASKS FROM THE CPU TO THE DSP, SIGNIFICANT POWER SAVINGS CAN BE ACHIEVED.THIS ARTICLE PROVIDES AN OVERVIEW OF THE HEXAGON ARCHITECTURE.THE PROCESSOR IS DESIGNED TO DELIVER SUPERIOR ENERGY EFFICIENCY COMPARED TO MOBILE CPU ALTERNATIVES AND THEREBY HELP ACHIEVE LONG BATTERY LIFE FOR IMPORTANT MOBILE APPLICATIONS. ......To be competitive, a modern multimedia DSP, however, is licensed for mobile product must provide a rich user expe- programming by OEMs and third-party soft- rience and long battery life. Chips for these ware vendors. This article provides an over- Lucian Codrescu ecosystems integrate multiple subsystems, view of the multimedia DSP and builds on each customized for a particular application the presentation from HotChips 25.1 Willie Anderson domain. By specializing a subsystem to a task, Figure 2 shows the various Hexagon gen- performance and power can be enhanced erations. Version 2 (V2) was the first pro- Suresh Venkumanhanti beyond what is possible with a homogenous duction version and appeared in the initial CPU-based computing platform. Snapdragon mobile products in 2007. V3 Mao Zeng Figure 1 shows a block diagram of the featured an improved implementation with Snapdragon 800. This chip contains dedicated better power consumption. These early ver- Erich Plondke subsystems for camera, display, video, audio/ sions of Hexagon targeted voice and audio voice, sensors, graphics, cellular modem, Wi- processing. Example functions include wide- Chris Koob Fi, and more. Each subsystem contains dedi- band vocoders, echo cancellation, audio cated hardware, and many contain special- postprocessing filters, MP3/AAC play- Ajay Ingle purpose processing engines and software cus- back, speaker protection algorithms, and tomized to the task. so on. Charles Tabony The Snapdragon 800 has two instances of V4 and V5 expanded the application tar- the Hexagon digital-signal processor (DSP). gets to include image processing for camera Rick Maule The modem (mDSP) is dedicated and and video; computer vision tasks such as customized for modem processing, whereas hand, gesture, and face recognition; and Qualcomm the application DSP (aDSP) is used for mul- processing of sensor input (gyro, accelerome- timedia acceleration. The modem processor ter, fingerprint, and so on). Unless otherwise is a closed subsystem and is programmed noted, this article will focus on the latest only within Qualcomm Technologies. The Hexagon V5 core. ....................................................... 34 Published by the IEEE Computer Society 0272-1732/14/$31.00 c 2014 IEEE • aDSP: Real-time media & sensor processing Snapdragon 800 Audio Camera Krait Krait Adreno CPU CPU Sensors Display GPU JPEG Krait Krait Hexagon Misc. CPU CPU Video aDSP connectivity Other 2-Mbyte L2 Multimedia fabric System fabric Hexagon Modem Fabric & memory controller mDSP • mDSP: Dedicated LPDDR3 LPDDR3 modem processing Figure 1. Snapdragon 800 block diagram. The chip contains dedicated subsystems for camera, display, video, audio/voice, sensors, graphics, cellular modem, and Wi-Fi. V5 mDSP V3 mDSP V4 mDSP 28 nm 45 nm 28 nm Dec. 2012 June 2009 Dec. 2010 V4 aDSP V3 aDSP Low-tier V1 aDSP V2 aDSP Low-tier 28 nm 65 nm 65 nm 45 nm Apr. 2011 Oct. 2006 Dec. 2007 Nov. 2009 V5 aDSP V3 aDSP V4 aDSP 28 nm 45 nm 28 nm Dec. 2012 Aug. 2009 Dec. 2010 Time Figure 2. Hexagon digital-signal processor (DSP) evolution. The figure shows the evolution from Version 1 in October 2006 through Version 5 in December 2012. ............................................................. MARCH/APRIL 2014 35 .............................................................................................................................................................................................. HOT CHIPS All user-level registers are replicated per thread. There are two sets of user registers: Instruction general registers and control registers. The cache general registers include thirty-two 32-bit registers that can be accessed either as single Instruction unit registers or as aligned 64-bit register pairs. The general registers contain all pointer, sca- lar, vector, and accumulator data. The con- L2 trol registers include special-purpose registers cache/ such as the program counter, status register, TCM and loop registers. Data unit Data Unit Execution Execution (load/ (load/ unit unit store/ store/ (64-bit (64-bit Data-processing instructions ALU) ALU) vector) vector) There are two identical 64-bit single- instruction, multiple-data (SIMD) execution Data cache units. Each unit supports all multiply, shift, arithmetic logic unit (ALU), and bit manipu- lation instructions. Supported data types include Register file/thread 8-, 16-, 32-, and 64-bit integer; 16- and 32-bit fractional with optional rounding and saturation; 16-bit complex; and Figure 3. Hexagon block diagram. The architecture features a four-wide very single-precision IEEE-compatible float- long instruction word (VLIW) with dual load/store and dual single-instruction, ing point. multiple-data (SIMD) execution units and supports hardware multithreading. Each unit is capable of supporting: Hexagon is a multithreaded very long four 16 Â 16 multiplies; instruction word (VLIW) DSP. The design two 32 Â 16 multiplies; or philosophy is to maximize work per cycle for one 32 Â 32 multiply, one complex performance, but target the microarchitec- multiply, or one floating-point fused ture to modest clock speeds and low power. multiply-add (FMA). Many of the instructions are complex and Instruction-set architecture overview application specific. Complex instructions The architecture’s foundation is a stati- targeted to a particular application domain cally scheduled four-way VLIW. The VLIW can provide high performance and energy approach puts the burden of instruction par- efficiency. For example, Figure 4 depicts a allelism on the compiler and thereby avoids complex multiply instruction used in a costly and power-hungry dynamic-scheduling 16-bit fixed-point fast-Fourier transform hardware.2 The VLIW approach is popular (FFT). Without such an instruction, it would among commercial DSPs. Figure 3 shows a take four multiplies, four shifts, four adds, block diagram of Hexagon. and two saturates to perform the operation. It should be clear that packing all the work in Registers and memory a single instruction executed in a single pipe- The Hexagon processor features a unified lined execution unit provides large efficiency byte-addressable memory. This memory has gains. a single 32-bit virtual address space that holds The Hexagon instruction set architecture both instructions and data. It operates in (ISA) contains numerous special-purpose little-endian mode. A full-featured memory instructions designed to accelerate key multi- management unit (MMU) translates virtual media kernels. Multimedia algorithms with to physical addresses. special instruction support include ............................................................ 36 IEEE MICRO variable-length encode/decode, such as context-adaptive binary-arithmetic- Rs I R I R Rs coding processing in H.264 video; features from accelerated segment Rt I R I R Rt test (FAST) corner detection image processing; FFTalgorithms; ∗∗ ∗∗ sliding-window filters; linear-feedback shift; table lookup from an arbitrary bit 32 32 32 32 field index; 0x8000 <<0–1 <<0–1 <<0–1 <<0–1 0x8000 elliptic curve cryptography; and – cyclic redundancy check (CRC) calculation. Add Add Load/store instructions Sat_32 Sat_32 Dual load/store units access signed or High 16bits High 16bits unsigned 8-, 16-, 32-, and 64-bit values in memory. There is a rich variety of addressing modes, including I R Rd absolute 32-bit, Figure 4. Complex multiply instruction. Such an instruction executed in a base plus scaled immediate and base single pipelined execution unit provides large efficiency gains for plus scaled register, applications that use complex arithmetic. auto-incrementing by register and immediate, circular addressing, and bit reversed. that is generated from it by the compiler. The “.new” suffix implies the source predicate is To increase the number of instruction generated in the same packet. In this exam- combinations allowed in packets, the load/ ple, the dot-new construct enables the work store units also support 32-bit ALU to be done in one instruction packet instead instructions. of two. The C statement is as follows: Conditional execution and program flow if (R2 == 4) The Hexagon ISA includes conditional R3 = *R4; execution. Conditional execution is useful to else remove branches through if-conversion and R5 = 5; is helpful for a VLIW processor. Compare Assembly code with braces delineate instructions target one of four predicate regis- packet boundaries: ters. These predicate registers can then be { used to conditionally execute certain instruc- P0 = cmp.eq(R2,#4) tions. Not all instructions can be condi- if (P0.new) R3 = memw(R4) tional—only
Recommended publications
  • Snapdragon-870-5G-Mobile-Platform
    The Snapdragon 870 5G Mobile Platform is the total package, backed by truly global 5G, premium intelligence, boosted performance, and geared-up gaming. These features and more deliver elite experiences with unstoppable speed and efficiency. Truly powerful, truly global 5G The world is yours to explore with truly global 5G connectivity. Multi-gigabit speeds and support for both mmWave and Sub-6 GHz spectrums empower new potential—no matter where you go with 5G. 870 Stream desktop-quality content right on your mobile device, upload or download in a flash, and communicate with friends (near or far) in real time. • Snapdragon X55 5G Modem-RF System with insanely fast peak speeds up to 7.5 Gbps • Supports all key regions and frequency bands including mmWave, sub-6, TDD, FDD and Dynamic Spectrum Sharing (DSS) • Supports both standalone and non-standalone modes, global roaming and global multi-SIM Boosted performance Break mobile barriers with our Qualcomm® Kryo™ 585 CPU—now supplying an uptick in speed. Designed for powerful computing, it delivers top performance with maximum power efficiency. Plus, when your battery does need a boost, Qualcomm® Quick Charge™ 4+ technology gets you back on track in record time. • Kryo 585 CPU clocks up to 3.2 GHz • Display support for up to 4K at 60 Hz Smarter, faster, longer Our 5th gen Qualcomm® AI Engine works with mind-boggling speed and efficiency, enabling responsive on-device interactions and assistance. Plus, the Qualcomm® Hexagon™ 698 Processor with Hexagon Tensor Accelerator pushes up to 15 TOPS performance. • AI real-time translation makes it easy to form connections with users across the globe • Qualcomm® Sensing Hub provides always-on contextual awareness Geared-up gaming No challenge is too great with the complete Qualcomm® Snapdragon Elite Gaming™ armory on your side.
    [Show full text]
  • 1028224 Shihua Hang a Flexibility Metric for Processors
    Eindhoven University of Technology MASTER A flexibility metric for processors Huang, S. Award date: 2019 Link to publication Disclaimer This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain Department of Electrical Engineering Electronic System Group A Flexibility Metric for Processors Master Thesis Report Shihua Huang 1028224 Supervisors: Luc Waeijen Henk Corporaal Kees van Berkel Version 1 Eindhoven, February 2019 Abstract In recent years, the substantial growth in computing power has encountered a bottleneck, as the miniaturization of CMOS technology reaches its limit. No more exponential scaling occurs as being described by Moore's law. In the coming years of computing, new advancements have to be made on the architectural side.
    [Show full text]
  • Qualcomm® Vision Intelligence 300/400 Platforms (QCS603/QCS605)
    Qualcomm® Vision Intelligence 300/400 Platforms (QCS603/QCS605) The Qualcomm Vision Intelligence Platform is Highlights purpose-built with powerful image processing and machine learning for smart camera products in the consumer and enterprise IoT spaces. Cameras have evolved to become smarter Reference platforms designed to support and more relevant in the Internet of Things (IoT). The Qualcomm Vision Intelligence a host of machine learning solutions Platform is designed to provide superior image Perform face/body detection, face recognition, object processing together with enhanced Artificial classification, license plate recognition, etc. Third party Intelligence (AI) capabilities in a cost-effective platform to serve a variety of IoT devices. algorithms for performing on-device stitching of dual camera These include action, VR/360, home security, streams are also demonstrated on the reference platforms. enterprise security and wearable cameras. The platform features Qualcomm Dual ISPs and 4K Ultra HD video with Technologies’ first family of system-on-chips enhanced features (SoCs) built specifically for IoT in an advanced 10-nanometer process and is engineered Dual ISPs support staggered HDR, low light noise reduction, to support exceptional power and thermal and enhanced auto-focus performance. Premium 4K efficiency. It also features Qualcomm @60fps HEVC video capture and playback with support for Technologies’ most advanced image sensor secondary streams for preview and streaming. processor (ISP) and digital signal processor (DSP) to date, along with cutting-edge CPU, GPU, camera processing software, Heterogeneous computing for on- connectivity and security. device machine learning and more The Qualcomm Vision Intelligence 300 Highly optimized custom CPU, GPU and DSP designed platform is based on our QCS603 and to provide high compute capability at low power.
    [Show full text]
  • Qualcomm® Snapdragon™ Embedded Platforms HW and SW Overview
    Qualcomm® Snapdragon™ embedded platforms HW and SW Overview ​Ziv Kahana, Director of Engineering ​Constantine Elster, Senior Staff Engineer ​Qualcomm Israel, Ltd. ​Oct 2017 Qualcomm Snapdragon is a product of Qualcomm Technologies, Inc. Agenda 1 2 3 4 5 Chipset Hardware Development Software Eco system overview sub-systems platforms distributions and typical (DragonBoard) and features applications overview 2 Bringing Snapdragon platforms to embedded devices Identifying the challenges Mobile OEMs Embedded Customers Relationship o High touch, 1-1 o Low-touch, web-based Primary fulfillment o Direct o Distribution Minimum order o 10,000s o 100 Customers o High dependency, few o Low dependency, many Roadmap influence o Strong o Weak Engineering capability o Strong, large teams o Varied, small teams Primary support o Direct o Web-based/Contract work End-product volume o High o Low Design type o Iterative o Clean-slate 3 Snapdragon 410E and 600E embedded platforms Drawing from the mobile portfolio for a targeted, tiered offering Supported for longevity Snapdragon 600E o Available through distribution for a 1.5 GHz quad-core Qualcomm® Krait™ 300 CPU minimum of 10 years from Snapdragon 600 and 410 commercial sample in 2015 Available through Snapdragon 410E Arrow Electronics 1.2 GHz quad-core ARM v8 Cortex-A53, o 1st time Snapdragon platforms are 32/64-bit capable sold through 3rd party distribution Qualcomm Krait is a product of Qualcomm Technologies, Inc. 4 Snapdragon embedded platforms Snapdragon 410E Snapdragon 600E Application Processor - APQ8016E
    [Show full text]
  • Qt5wm6k4xb.Pdf
    UC Irvine UC Irvine Electronic Theses and Dissertations Title Cooperative Power and Resource Management for Heterogeneous Mobile Architectures Permalink https://escholarship.org/uc/item/5wm6k4xb Author Hsieh, Chenying Publication Date 2019 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, IRVINE Cooperative Power and Resource Management for Heterogeneous Mobile Architectures DISSERTATION submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in Computer Science by Chenying Hsieh Dissertation Committee: Professor Nikil Dutt, Chair Professor Tony Givragis Professor Ardalan Amiri Sani 2019 Portion of Chapter 2 © 2018 Springer Portion of Chapter 3 © 2019 IEEE Portion of Chapter 4 © 2019 IEEE All other materials © 2019 Chenying Hsieh TABLE OF CONTENTS Page LIST OF FIGURES iv LIST OF TABLES vi ACKNOWLEDGMENTS vii CURRICULUM VITAE viii ABSTRACT OF THE DISSERTATION x 1 Introduction 1 1.1 Emerging Mobile Heterogeneous Architectures . .1 1.2 Motivation and Challenges . .4 1.3 Thesis Overview . .5 2 MEMCOP: Memory-Aware Co-operative Power Management Governor for Mobile Games 7 2.1 Introduction . .7 2.2 Related Work . 10 2.2.1 Standalone Governors . 12 2.2.2 Integrated Governors . 12 2.3 MEMCOP: Memory-Aware Cooperative CPU-GPU DVFS Governor . 13 2.3.1 Background . 13 2.3.2 Performance and Energy Efficiency for Mobile Games . 15 2.3.3 Memory Access Behavior of Mobile Games . 17 2.3.4 Memory-aware Cooperative CPU-GPU DVFS governor . 19 2.4 Experiments . 25 2.4.1 Experimental Setup and Methodology . 25 2.4.2 Not All F-V Settings Are Useful .
    [Show full text]
  • Qualcomm Technologies, Inc
    T Qualcomm Technologies, Inc. Qualcomm, Snapdragon, Adreno, Vuforia, DragonBoard, Chromatix, and Hexagon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. OptiZoom, UbiFocus, ChromaFlash and FastCV are trademarks of Qualcomm Incorporated. All Qualcomm Incorporated trademarks are used with permission. Other products and brand names may be trademarks or registered trademarks of their respective owners. Qualcomm Snapdragon, Qualcomm Adreno, Qualcomm Hexagon, FastCV, Chromatix, Qualcomm OptiZoom, Qualcomm UbiFocus and Qualcomm ChromaFlash are products of Qualcomm Technologies, Inc. Qualcomm Vuforia is a product of Qualcomm Connected Experiences, Inc. Qualcomm Technologies, Inc. 5775 Morehouse Drive San Diego, CA 92121 U.S.A. © 2014 Qualcomm Technologies, Inc. All Rights Reserved. 1 Executive summary ............................................................................................................................... 1 2 The rise of mobile imaging ..................................................................................................................... 1 2.1 Ecosystem drivers .................................................................................................................... 2 2.2 Technology drivers ................................................................................................................... 3 3 Supporting DSLR-like image quality in a mobile form factor ................................................................. 5 3.1 Flexibility and
    [Show full text]
  • Qualcomm® Snapdragon™ 670 Mobile Platform
    Qualcomm® Snapdragon™ 670 Mobile Platform Snapdragon 670 Mobile Sought-aer mobile experiences Platform advancements Connecting you to your world, faster • Snapdragon X12 LTE modem: Lightning fast Get the speed you need with the Snapdragon X12 LTE Modem, download speeds of up to 600 Mbps with 3x supporting cutting-edge features for next-generation connectivity. carrier aggregation. In a snap you can Download TV shows or the latest tracks of your favorite artist nearly download music and listen to it via Bluetooth 5 2x faster than the previous generation modem—even in congested areas that see a lot of action. Plus, switch between LTE and Wi-Fi with audio broadcast and ultra-low power ear automatically, selecting the strongest signal so your voice and video bud support. calls come in loud and clear. • Qualcomm Spectra™ 250 Image Signal A smarter personal assistant Processor: Designed with new architectures to Qualcomm Artificial Intelligence (AI) Engine fuels innovative bring the finest photos and video experiences on-device experiences, making your interactions smart and intuitive. to your device. Aspiring videographers will be With up to 1.8x overall improvement over the previous generation, the excited to capture images up to 25 megapixels AI Engine runs applications quickly and eciently while providing and enjoy cinema-grade quality video with better battery management. It can also help you capture better photos with automatic setting adjustments and perform tasks with vibrant colors. enhanced voice recognition. • Qualcomm® Adreno™ 615 Visual Processing Subsystem: Supports life-like experiences, Powering your favorite activities stunning graphics, and console-like gaming at Experience blazing-fast performance and cross-tasking that’s so lower power.
    [Show full text]
  • Qualcomm-Snapdragon-7C-Compute-Platform-Product-Brief.Pdf
    The Qualcomm® Snapdragon™ 7c compute platform is upgrading what you should expect in an entry-level PC Qualcomm® Beautifully thin and light systems that offer multi-day battery life and offer LTE cellular connectivity can Snapdragon™ be available for everyone. Why haven’t traditional PCs offered that experience? Delivering an always on, always connected superior PC experience requires innovation and invention. It needs the advanced and efficient technologies that Qualcomm Technologies has been putting in smart phones for years, brought into your PC. Snapdragon 7c compute platforms deliver that 7c experience. Thinner, lighter, and quieter with up to 20% faster system performance and up 2x the Compute Platform battery life of typical entry-level PCs. Upgrade your PC to a Snapdragon 7c and get the battery life you deserve, with the performance you need. Always On, Always Connected Your phone is always on and always connected. Why isn’t your PC? Delivering the smartphone experience on your PC requires extraordinary efficiency, delivering long battery life, and robust connectivity options. The Snapdragon 7c compute platform is specifically engineered to bring those capabilities to entry level PCs. With battery life that can be measured in days and Wi-Fi and cellular connectivity, you can have a superior Always On, Always Connected PC computing experience. Multi-day battery life Using an advanced 8nm manufacturing process, the Snapdragon 7c compute platform delivers great performance capabilities with incredible efficiency for unbelievable battery life. Bringing all the core technologies of your PC into a single system on a chip (SoC) highly integrates an octa-core processor and powerful graphics with Wi-Fi and cellular connectivity technologies to give you the performance you need in a system that provides battery life measurable in days, not hours.
    [Show full text]
  • Thundersoft Turbox® D845 SOM Datasheet
    Thundercomm TurboX D845 System on Module Thundercomm TurboXTM D845 System on Module A high performance embedded platform based on Qualcomm® Snapdragon ™ 845 processor Description Thundercomm TurboXTM D845 System on Module(SOM) is a high performance intelligent module, integrating Android or Linux system, based on Qualcomm SDA845 processor. It integrates the advanced 10 nm Fin FET process, a customized 64-bit ARM v8-compliant Octacore Qualcomm Kryo 385 applications processor. TurboX D845 SOM supports short-range wireless communication through Wi-Fi 802.11 a/b/g/n/ac and BT5.0. This SOM supports 3840*2400@60fps display, it can connect 4 x camera modules, and integrating multiple audio and video input/output interfaces. This SOM provides a variety of GPIO, I2C, UART and SPI standard interfaces. In addition, it supports two MIPI-DSI, four MIPI-CSI and SOM common standard protocol interfaces such as USB3.1, PCIE2.1/3.0, I2S and SLIMBUS. TurboX D845 SOM provides convenient and stable system solution for IOT field, it can be embedded into the device on VR/AR, Drone, Robot, Smart Camera, AI devices, and any other connecting fields. The size of module is 37mm x 60mm x 7.1mm, weight 10.3g, with 336PINs. Features The following table shows the detailed features and performance on TurboX D845 SOM. Processors 64-bit applications processor (Kryo 385) with 2 MB L3 cache Quad high-performance Kryo cores at 2.649 GHz - Kryo Gold cluster with 256 kB L2 Applications cache per core Processor Quad low-power Kryo cores at 1.766 GHz - Kyro Silver cluster with 128 kB L2 cache per core Digital signal Compute DSP with Qualcomm Hexagon Vector Extensions (dual-HVX512) processor processing Copyright © 2018 All Rights Reserved , Thundercomm Technology Co., Ltd.
    [Show full text]
  • Qualcomm Hexagon V67 Programmer's Reference Manual
    Qualcomm® Hexagon™ V67 Programmer’s Reference Manual 80-N2040-45 Rev. B February 25, 2020 All Qualcomm products mentioned herein are products of Qualcomm Technologies, Inc. and/or its subsidiaries Qualcomm and Hexagon are trademarks of Qualcomm Incorporated, registered in the United States and other countries. Other product and brand names may be trademarks or registered trademarks of their respective owners. This technical data may be subject to U.S. and international export, re-export, or transfer (“export”) laws. Diversion contrary to U.S. and international law is strictly prohibited. Qualcomm Technologies, Inc. 5775 Morehouse Drive San Diego, CA 92121 U.S.A. © 2018, 2020 Qualcomm Technologies, Inc. and/or its subsidiaries. All rights reserved. Contents Figures .............................................................................................................................. 14 Tables ................................................................................................................................ 15 1 Introduction......................................................................................... 17 1.1 Features....................................................................................................................... 17 1.2 Functional units .......................................................................................................... 19 1.2.1 Memory............................................................................................................. 20 1.2.2 Registers...........................................................................................................
    [Show full text]
  • MSM8953 (14 X 12 Mm) Qualcomm Technologies, Inc
    MSM8953 (14 x 12 mm) Device Specification (Advance Information) 80-P6190-1 Rev. A Qualcomm Technologies, Inc. Device description Key features (see Section 1.2 for details) The MSM™ device uses the advanced 14 nm FinFET Advanced RF techniques with WTR3925 process for lower active power dissipation and faster peak Downlink carrier aggregation (DL-CA) CPU performance. It includes a customized 64-bit ARM Uplink carrier aggregation (UL-CA) Cortex A-53 octa-core applications processor non-package- on-package (non-PoP) LPDDR3 SDRAM memory. Two 4-lane DSI D-PHY 1.1, FHD (1920 × 1200) 60 fps Video 4k at 30 fps, 1080p at 60 fps, 2560 buffer width (10 layers blending), VESA DSC 1.1 Adreno GPU with NOTE In this document, MSM8953 refers only to UBWC the 14 × 12 mm package. Three 4-lane CSI (4 + 4 + 4) D-PHY 1.2 at 2.1 Gbps per lane The latest air interface standards are supported, including: Dual 14-bit image signal processing (ISP): 13 MP and LTE up to Cat 7 (FDD and TDD), including 2 × 20 CA 13 MP; 24 MP at 30 fps ZSL with dual-ISP; 13 MP WCDMA up to Rel-10 HSPA+, including 4C-HSDPA+ 30 ZSL with single-ISP and HSPDA 2 + 1 CA Support for USB 3.0, eMMC 5.1, and SD 3.0 TD-SCDMA up to DC-HSPA+ 64 QAM Low-power Qualcomm sensor core (SSC) with QDSP6 to GSM, GPRS, and EDGE support always-on use cases CDMA up to 1X Advanced and 1xEV-DOrB WLAN 802.11 b/g/n/ac, DBS, Bluetooth 4.1, and FM with WCN3615, WCN3660B, and WCN3680B Key processor and memory characteristics include: Non-PoP high-speed memory and LPDDR3 SDRAM Customized octa-core
    [Show full text]
  • A Chained Barrel Processor
    Technical Disclosure Commons Defensive Publications Series November 2020 A Chained Barrel Processor N/A Follow this and additional works at: https://www.tdcommons.org/dpubs_series Recommended Citation N/A, "A Chained Barrel Processor", Technical Disclosure Commons, (November 02, 2020) https://www.tdcommons.org/dpubs_series/3737 This work is licensed under a Creative Commons Attribution 4.0 License. This Article is brought to you for free and open access by Technical Disclosure Commons. It has been accepted for inclusion in Defensive Publications Series by an authorized administrator of Technical Disclosure Commons. : A Chained Barrel Processor A Chained Barrel Processor ABSTRACT A barrel processor implements a technique to interleave a set of B instruction streams in a round-robin manner, such that a given thread/stream has an instruction slot once every B cycles. The barrel approach delivers throughput in a way that is efficient in circuit area and power, but suffers from the constraint that a given thread of execution can issue only once every N cycles, limiting single-threaded performance. This disclosure describes a hybrid architecture and microarchitecture that improves single-threaded performance while preserving the bandwidth advantages for multithreaded applications. KEYWORDS ● Barrel processor ● Multithreading ● Reduced instruction set computer ● RISC ● Single instruction, multiple data ● SIMD ● Replicated barrel operation ● Chained operation ● Execution pipeline BACKGROUND The “end of Moore’s law,” the point at which improvements in semiconductor density can be relied upon to deliver continuous exponential improvements in price/performance, is no longer a hypothetical future. This has many consequences for computing. Architectural and Published by Technical Disclosure Commons, 2020 2 Defensive Publications Series, Art.
    [Show full text]