High-Performance Architectures for Embedded Memory Systems
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Intel® Strongarm® SA-1110 High- Performance, Low-Power Processor for Portable Applied Computing Devices
Advance Copy Intel® StrongARM® SA-1110 High- Performance, Low-Power Processor For Portable Applied Computing Devices PRODUCT HIGHLIGHTS ■ Innovative Application Specific Standard Product (ASSP) delivers leadership performance, integration and low power for palm-size devices, PC companions, smart phones and other emerging portable applied computing devices As businesses and individuals rely increasingly on portable applied ■ High-speed 100 MHz memory bus and a computing devices to simplify their lives and boost their productivity, flexible memory these devices have to perform more complex functions quickly and controller that adds efficiently. To satisfy ever-increasing customer demands to support for SDRAM, communicate and access information ‘anytime, anywhere’, SMROM, and variable- manufacturers need technologies that deliver high-performance, robust latency I/O devices — provides design functionality and versatility while meeting the small-size and low-power flexibility, scalability and restrictions of portable, battery-operated products. Intel designed the high memory bandwidth SA-1110 processor with all of these requirements in mind. ■ Rich development The Intel® SA-1110 is a highly integrated 32-bit StrongARM® environment enables processor that incorporates Intel design and process technology along leading edge products with the power efficiency of the ARM* architecture. The SA-1110 is while reducing time- to-market software compatible with the ARM V4 architecture while utilizing a high-performance micro-architecture that is optimized to take advantage of Intel process technology. The Intel SA-1110 provides the performance, low power, integration and cost benefits of the Intel SA-1100 processor plus a high speed memory bus, flexible memory controller and the ability to handle variable-latency I/O devices. -
Comparison of Contemporary Real Time Operating Systems
ISSN (Online) 2278-1021 IJARCCE ISSN (Print) 2319 5940 International Journal of Advanced Research in Computer and Communication Engineering Vol. 4, Issue 11, November 2015 Comparison of Contemporary Real Time Operating Systems Mr. Sagar Jape1, Mr. Mihir Kulkarni2, Prof.Dipti Pawade3 Student, Bachelors of Engineering, Department of Information Technology, K J Somaiya College of Engineering, Mumbai1,2 Assistant Professor, Department of Information Technology, K J Somaiya College of Engineering, Mumbai3 Abstract: With the advancement in embedded area, importance of real time operating system (RTOS) has been increased to greater extent. Now days for every embedded application low latency, efficient memory utilization and effective scheduling techniques are the basic requirements. Thus in this paper we have attempted to compare some of the real time operating systems. The systems (viz. VxWorks, QNX, Ecos, RTLinux, Windows CE and FreeRTOS) have been selected according to the highest user base criterion. We enlist the peculiar features of the systems with respect to the parameters like scheduling policies, licensing, memory management techniques, etc. and further, compare the selected systems over these parameters. Our effort to formulate the often confused, complex and contradictory pieces of information on contemporary RTOSs into simple, analytical organized structure will provide decisive insights to the reader on the selection process of an RTOS as per his requirements. Keywords:RTOS, VxWorks, QNX, eCOS, RTLinux,Windows CE, FreeRTOS I. INTRODUCTION An operating system (OS) is a set of software that handles designed known as Real Time Operating System (RTOS). computer hardware. Basically it acts as an interface The motive behind RTOS development is to process data between user program and computer hardware. -
Performance Impact of Memory Channels on Sparse and Irregular Algorithms
Performance Impact of Memory Channels on Sparse and Irregular Algorithms Oded Green1,2, James Fox2, Jeffrey Young2, Jun Shirako2, and David Bader3 1NVIDIA Corporation — 2Georgia Institute of Technology — 3New Jersey Institute of Technology Abstract— Graph processing is typically considered to be a Graph algorithms are typically latency-bound if there is memory-bound rather than compute-bound problem. One com- not enough parallelism to saturate the memory subsystem. mon line of thought is that more available memory bandwidth However, the growth in parallelism for shared-memory corresponds to better graph processing performance. However, in this work we demonstrate that the key factor in the utilization systems brings into question whether graph algorithms are of the memory system for graph algorithms is not necessarily still primarily latency-bound. Getting peak or near-peak the raw bandwidth or even the latency of memory requests. bandwidth of current memory subsystems requires highly Instead, we show that performance is proportional to the parallel applications as well as good spatial and temporal number of memory channels available to handle small data memory reuse. The lack of spatial locality in sparse and transfers with limited spatial locality. Using several widely used graph frameworks, including irregular algorithms means that prefetched cache lines have Gunrock (on the GPU) and GAPBS & Ligra (for CPUs), poor data reuse and that the effective bandwidth is fairly low. we evaluate key graph analytics kernels using two unique The introduction of high-bandwidth memories like HBM and memory hierarchies, DDR-based and HBM/MCDRAM. Our HMC have not yet closed this inefficiency gap for latency- results show that the differences in the peak bandwidths sensitive accesses [19]. -
Schedule 14A Employee Slides Supertex Sunnyvale
UNITED STATES SECURITIES AND EXCHANGE COMMISSION Washington, D.C. 20549 SCHEDULE 14A Proxy Statement Pursuant to Section 14(a) of the Securities Exchange Act of 1934 Filed by the Registrant Filed by a Party other than the Registrant Check the appropriate box: Preliminary Proxy Statement Confidential, for Use of the Commission Only (as permitted by Rule 14a-6(e)(2)) Definitive Proxy Statement Definitive Additional Materials Soliciting Material Pursuant to §240.14a-12 Supertex, Inc. (Name of Registrant as Specified In Its Charter) Microchip Technology Incorporated (Name of Person(s) Filing Proxy Statement, if other than the Registrant) Payment of Filing Fee (Check the appropriate box): No fee required. Fee computed on table below per Exchange Act Rules 14a-6(i)(1) and 0-11. (1) Title of each class of securities to which transaction applies: (2) Aggregate number of securities to which transaction applies: (3) Per unit price or other underlying value of transaction computed pursuant to Exchange Act Rule 0-11 (set forth the amount on which the filing fee is calculated and state how it was determined): (4) Proposed maximum aggregate value of transaction: (5) Total fee paid: Fee paid previously with preliminary materials. Check box if any part of the fee is offset as provided by Exchange Act Rule 0-11(a)(2) and identify the filing for which the offsetting fee was paid previously. Identify the previous filing by registration statement number, or the Form or Schedule and the date of its filing. (1) Amount Previously Paid: (2) Form, Schedule or Registration Statement No.: (3) Filing Party: (4) Date Filed: Filed by Microchip Technology Incorporated Pursuant to Rule 14a-12 of the Securities Exchange Act of 1934 Subject Company: Supertex, Inc. -
Error Detection and Correction Methods for Memories Used in System-On-Chip Designs
International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249 – 8958, Volume-8, Issue-2S2, January 2019 Error Detection and Correction Methods for Memories used in System-on-Chip Designs Gunduru Swathi Lakshmi, Neelima K, C. Subhas ABSTRACT— Memory is the basic necessity in any SoC Static Random Access Memory (SRAM): It design. Memories are classified into single port memory and consists of a latch or flipflop to store each bit of multiport memory. Multiport memory has ability to source more memory and it does not required any refresh efficient execution of operation and high speed performance operation. SRAM is mostly used in cache memory when compared to single port. Testing of semiconductor memories is increasing because of high density of current in the and in hand-held devices. It has advantages like chips. Due to increase in embedded on chip memory and memory high speed and low power consumption. It has density, the number of faults grow exponentially. Error detection drawback like complex structure and expensive. works on concept of redundancy where extra bits are added for So, it is not used for high capacity applications. original data to detect the error bits. Error correction is done in Read Only Memory (ROM): It is a non-volatile memory. two forms: one is receiver itself corrects the data and other is It can only access data but cannot modify data. It is of low receiver sends the error bits to sender through feedback. Error detection and correction can be done in two ways. One is Single cost. Some applications of ROM are scanners, ID cards, Fax bit and other is multiple bit. -
Embedded DRAM
Embedded DRAM Raviprasad Kuloor Semiconductor Research and Development Centre, Bangalore IBM Systems and Technology Group DRAM Topics Introduction to memory DRAM basics and bitcell array eDRAM operational details (case study) Noise concerns Wordline driver (WLDRV) and level translators (LT) Challenges in eDRAM Understanding Timing diagram – An example References Slide 1 Acknowledgement • John Barth, IBM SRDC for most of the slides content • Madabusi Govindarajan • Subramanian S. Iyer • Many Others Slide 2 Topics Introduction to memory DRAM basics and bitcell array eDRAM operational details (case study) Noise concerns Wordline driver (WLDRV) and level translators (LT) Challenges in eDRAM Understanding Timing diagram – An example Slide 3 Memory Classification revisited Slide 4 Motivation for a memory hierarchy – infinite memory Memory store Processor Infinitely fast Infinitely large Cycles per Instruction Number of processor clock cycles (CPI) = required per instruction CPI[ ∞ cache] Finite memory speed Memory store Processor Finite speed Infinite size CPI = CPI[∞ cache] + FCP Finite cache penalty Locality of reference – spatial and temporal Temporal If you access something now you’ll need it again soon e.g: Loops Spatial If you accessed something you’ll also need its neighbor e.g: Arrays Exploit this to divide memory into hierarchy Hit L2 L1 (Slow) Processor Miss (Fast) Hit Register Cache size impacts cycles-per-instruction Access rate reduces Slower memory is sufficient Cache size impacts cycles-per-instruction For a 5GHz -
Semiconductor Memories
Semiconductor Memories Prof. MacDonald Types of Memories! l" Volatile Memories –" require power supply to retain information –" dynamic memories l" use charge to store information and require refreshing –" static memories l" use feedback (latch) to store information – no refresh required l" Non-Volatile Memories –" ROM (Mask) –" EEPROM –" FLASH – NAND or NOR –" MRAM Memory Hierarchy! 100pS RF 100’s of bytes L1 1nS SRAM 10’s of Kbytes 10nS L2 100’s of Kbytes SRAM L3 100’s of 100nS DRAM Mbytes 1us Disks / Flash Gbytes Memory Hierarchy! l" Large memories are slow l" Fast memories are small l" Memory hierarchy gives us illusion of large memory space with speed of small memory. –" temporal locality –" spatial locality Register Files ! l" Fastest and most robust memory array l" Largest bit cell size l" Basically an array of large latches l" No sense amps – bits provide full rail data out l" Often multi-ported (i.e. 8 read ports, 2 write ports) l" Often used with ALUs in the CPU as source/destination l" Typically less than 10,000 bits –" 32 32-bit fixed point registers –" 32 60-bit floating point registers SRAM! l" Same process as logic so often combined on one die l" Smaller bit cell than register file – more dense but slower l" Uses sense amp to detect small bit cell output l" Fastest for reads and writes after register file l" Large per bit area costs –" six transistors (single port), eight transistors (dual port) l" L1 and L2 Cache on CPU is always SRAM l" On-chip Buffers – (Ethernet buffer, LCD buffer) l" Typical sizes 16k by 32 Static Memory -
M32R Family Software Manual MITSUBISHI 32-BIT SINGLE-CHIP MICROCOMPUTER
To our customers, Old Company Name in Catalogs and Other Documents On April 1st, 2010, NEC Electronics Corporation merged with Renesas Technology Corporation, and Renesas Electronics Corporation took over all the business of both companies. Therefore, although the old company name remains in this document, it is a valid Renesas Electronics document. We appreciate your understanding. Renesas Electronics website: http://www.renesas.com April 1st, 2010 Renesas Electronics Corporation Issued by: Renesas Electronics Corporation (http://www.renesas.com) Send any inquiries to http://www.renesas.com/inquiry. Notice 1. All information included in this document is current as of the date this document is issued. Such information, however, is subject to change without any prior notice. Before purchasing or using any Renesas Electronics products listed herein, please confirm the latest product information with a Renesas Electronics sales office. Also, please pay regular and careful attention to additional and different information to be disclosed by Renesas Electronics such as that disclosed through our website. 2. Renesas Electronics does not assume any liability for infringement of patents, copyrights, or other intellectual property rights of third parties by or arising from the use of Renesas Electronics products or technical information described in this document. No license, express, implied or otherwise, is granted hereby under any patents, copyrights or other intellectual property rights of Renesas Electronics or others. 3. You should not alter, modify, copy, or otherwise misappropriate any Renesas Electronics product, whether in whole or in part. 4. Descriptions of circuits, software and other related information in this document are provided only to illustrate the operation of semiconductor products and application examples. -
IXP400 Software's Programmer's Guide
Intel® IXP400 Software Programmer’s Guide June 2004 Document Number: 252539-002c Intel® IXP400 Software Contents INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY RELATING TO SALE AND/OR USE OF INTEL PRODUCTS, INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT, OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. The Intel® IXP400 Software v1.2.2 may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. MPEG is an international standard for video compression/decompression promoted by ISO. Implementations of MPEG CODECs, or MPEG enabled platforms may require licenses from various entities, including Intel Corporation. This document and the software described in it are furnished under license and may only be used or copied in accordance with the terms of the license. The information in this document is furnished for informational use only, is subject to change without notice, and should not be construed as a commitment by Intel Corporation. -
Comparative Architectures
Comparative Architectures CST Part II, 16 lectures Lent Term 2006 David Greaves [email protected] Slides Lectures 1-13 (C) 2006 IAP + DJG Course Outline 1. Comparing Implementations • Developments fabrication technology • Cost, power, performance, compatibility • Benchmarking 2. Instruction Set Architecture (ISA) • Classic CISC and RISC traits • ISA evolution 3. Microarchitecture • Pipelining • Super-scalar { static & out-of-order • Multi-threading • Effects of ISA on µarchitecture and vice versa 4. Memory System Architecture • Memory Hierarchy 5. Multi-processor systems • Cache coherent and message passing Understanding design tradeoffs 2 Reading material • OHP slides, articles • Recommended Book: John Hennessy & David Patterson, Computer Architecture: a Quantitative Approach (3rd ed.) 2002 Morgan Kaufmann • MIT Open Courseware: 6.823 Computer System Architecture, by Krste Asanovic • The Web http://bwrc.eecs.berkeley.edu/CIC/ http://www.chip-architect.com/ http://www.geek.com/procspec/procspec.htm http://www.realworldtech.com/ http://www.anandtech.com/ http://www.arstechnica.com/ http://open.specbench.org/ • comp.arch News Group 3 Further Reading and Reference • M Johnson Superscalar microprocessor design 1991 Prentice-Hall • P Markstein IA-64 and Elementary Functions 2000 Prentice-Hall • A Tannenbaum, Structured Computer Organization (2nd ed.) 1990 Prentice-Hall • A Someren & C Atack, The ARM RISC Chip, 1994 Addison-Wesley • R Sites, Alpha Architecture Reference Manual, 1992 Digital Press • G Kane & J Heinrich, MIPS RISC Architecture -
Case Study on Integrated Architecture for In-Memory and In-Storage Computing
electronics Article Case Study on Integrated Architecture for In-Memory and In-Storage Computing Manho Kim 1, Sung-Ho Kim 1, Hyuk-Jae Lee 1 and Chae-Eun Rhee 2,* 1 Department of Electrical Engineering, Seoul National University, Seoul 08826, Korea; [email protected] (M.K.); [email protected] (S.-H.K.); [email protected] (H.-J.L.) 2 Department of Information and Communication Engineering, Inha University, Incheon 22212, Korea * Correspondence: [email protected]; Tel.: +82-32-860-7429 Abstract: Since the advent of computers, computing performance has been steadily increasing. Moreover, recent technologies are mostly based on massive data, and the development of artificial intelligence is accelerating it. Accordingly, various studies are being conducted to increase the performance and computing and data access, together reducing energy consumption. In-memory computing (IMC) and in-storage computing (ISC) are currently the most actively studied architectures to deal with the challenges of recent technologies. Since IMC performs operations in memory, there is a chance to overcome the memory bandwidth limit. ISC can reduce energy by using a low power processor inside storage without an expensive IO interface. To integrate the host CPU, IMC and ISC harmoniously, appropriate workload allocation that reflects the characteristics of the target application is required. In this paper, the energy and processing speed are evaluated according to the workload allocation and system conditions. The proof-of-concept prototyping system is implemented for the integrated architecture. The simulation results show that IMC improves the performance by 4.4 times and reduces total energy by 4.6 times over the baseline host CPU. -
Embedded Electronic Coding System
1 GUARD FOR BLIND PEOPLE GUARD FOR BLIND PEOPLE ELECTRONICS AND COMMUNICATION LITAM 2 GUARD FOR BLIND PEOPLE ABSTRACT Aim of this project is to design and develop the electronic guard for blind people on embedded plat form. This project was developed for keep the right way for blind people. It has two important units; they are object detecting sensor unit and micro-controller alarm unit. The object detecting sensor is sense the opposite objects, if the blind person is going to hit any object, the sensor sense that object and given to controller. The controller activates the driver circuit; it will produce the alarm sound, now the people easily identify the opposite object. ELECTRONICS AND COMMUNICATION LITAM 3 GUARD FOR BLIND PEOPLE CHAPTER 1 INTRODUCTION ELECTRONICS AND COMMUNICATION LITAM 4 GUARD FOR BLIND PEOPLE 1.1 METHODOLOGY OF STUDY An embedded based electronic code locking system is designed and implemented using PIC Micro controller to make security. The entire project was developed under embedded systems. EMBEDDED SYSTEMS: A system is something that maintains its existence and functions as a whole through the interaction of its parts. E.g. Body, Mankind, Access Control, etc A system is a part of the world that a person or group of persons during some time interval and for some purpose choose to regard as a whole, consisting of interrelated components, each component characterized by properties that are selected as being relevant to the purpose. • Embedded System is a combination of hardware and software used to achieve a single specific task. • Embedded systems are computer systems that monitor, respond to, or control an external environment.