Introduction to Digital Signal Processors
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
EEMBC and the Purposes of Embedded Processor Benchmarking Markus Levy, President
EEMBC and the Purposes of Embedded Processor Benchmarking Markus Levy, President ISPASS 2005 Certified Performance Analysis for Embedded Systems Designers EEMBC: A Historical Perspective • Began as an EDN Magazine project in April 1997 • Replace Dhrystone • Have meaningful measure for explaining processor behavior • Developed business model • Invited worldwide processor vendors • A consortium was born 1 EEMBC Membership • Board Member • Membership Dues: $30,000 (1st year); $16,000 (subsequent years) • Access and Participation on ALL Subcommittees • Full Voting Rights • Subcommittee Member • Membership Dues Are Subcommittee Specific • Access to Specific Benchmarks • Technical Involvement Within Subcommittee • Help Determine Next Generation Benchmarks • Special Academic Membership EEMBC Philosophy: Standardized Benchmarks and Certified Scores • Member derived benchmarks • Determine the standard, the process, and the benchmarks • Open to industry feedback • Ensures all processor/compiler vendors are running the same tests • Certification process ensures credibility • All benchmark scores officially validated before publication • The entire benchmark environment must be disclosed 2 Embedded Industry: Represented by Very Diverse Applications • Networking • Storage, low- and high-end routers, switches • Consumer • Games, set top boxes, car navigation, smartcards • Wireless • Cellular, routers • Office Automation • Printers, copiers, imaging • Automotive • Engine control, Telematics Traditional Division of Embedded Applications Low High Power -
Cloud Workbench a Web-Based Framework for Benchmarking Cloud Services
Bachelor August 12, 2014 Cloud WorkBench A Web-Based Framework for Benchmarking Cloud Services Joel Scheuner of Muensterlingen, Switzerland (10-741-494) supervised by Prof. Dr. Harald C. Gall Philipp Leitner, Jürgen Cito software evolution & architecture lab Bachelor Cloud WorkBench A Web-Based Framework for Benchmarking Cloud Services Joel Scheuner software evolution & architecture lab Bachelor Author: Joel Scheuner, [email protected] Project period: 04.03.2014 - 14.08.2014 Software Evolution & Architecture Lab Department of Informatics, University of Zurich Acknowledgements This bachelor thesis constitutes the last milestone on the way to my first academic graduation. I would like to thank all the people who accompanied and supported me in the last four years of study and work in industry paving the way to this thesis. My personal thanks go to my parents supporting me in many ways. The decision to choose a complex topic in an area where I had no personal experience in advance, neither theoretically nor technologically, made the past four and a half months challenging, demanding, work-intensive, but also very educational which is what remains in good memory afterwards. Regarding this thesis, I would like to offer special thanks to Philipp Leitner who assisted me during the whole process and gave valuable advices. Moreover, I want to thank Jürgen Cito for his mainly technologically-related recommendations, Rita Erne for her time spent with me on language review sessions, and Prof. Harald Gall for giving me the opportunity to write this thesis at the Software Evolution & Architecture Lab at the University of Zurich and providing fundings and infrastructure to realize this thesis. -
Automatic Benchmark Profiling Through Advanced Trace Analysis Alexis Martin, Vania Marangozova-Martin
Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin, Vania Marangozova-Martin To cite this version: Alexis Martin, Vania Marangozova-Martin. Automatic Benchmark Profiling through Advanced Trace Analysis. [Research Report] RR-8889, Inria - Research Centre Grenoble – Rhône-Alpes; Université Grenoble Alpes; CNRS. 2016. hal-01292618 HAL Id: hal-01292618 https://hal.inria.fr/hal-01292618 Submitted on 24 Mar 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin , Vania Marangozova-Martin RESEARCH REPORT N° 8889 March 23, 2016 Project-Team Polaris ISSN 0249-6399 ISRN INRIA/RR--8889--FR+ENG Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin ∗ † ‡, Vania Marangozova-Martin ∗ † ‡ Project-Team Polaris Research Report n° 8889 — March 23, 2016 — 15 pages Abstract: Benchmarking has proven to be crucial for the investigation of the behavior and performances of a system. However, the choice of relevant benchmarks still remains a challenge. To help the process of comparing and choosing among benchmarks, we propose a solution for automatic benchmark profiling. It computes unified benchmark profiles reflecting benchmarks’ duration, function repartition, stability, CPU efficiency, parallelization and memory usage. -
DSP56303 Product Brief, Rev
Freescale Semiconductor DSP56303PB Product Brief Rev. 2, 2/2005 DSP56303 24-Bit Digital Signal Processor 16 6 6 3 Memory Expansion Area The DSP56303 is intended Triple X Data HI08 ESSI SCI PrograM Y Data for use in telecommunication Timer RAM RAM RAM 4096 × 24 2048 × 24 2048 × 24 applications, such as multi- bits bits bits line voice/data/ fax (default) (default) (default) processing, video Peripheral Expansion Area conferencing, audio PM_EB XM_EB PIO_EB YAB YM_EB applications, control, and Address External 18 Generation XAB Address general digital signal Unit PAB Bus Address Six-Channel DAB Switch processing. DMA Unit External 24-Bit Bus 13 Bootstrap DSP56300 Interface ROM and Inst. Core Cache Control Control DDB 24 Internal YDB External Data XDB Data Bus Bus PDB Switch Data Switch GDB EXTAL Power Clock Management Data ALU 5 Generator Program Program Program × + → XTAL Interrupt Decode Address 24 24 56 56-bit MAC JTAG PLL Two 56-bit Accumulators Controller Controller Generator 56-bit Barrel Shifter OnCE™ DE 2 MODA/IRQA MODB/IRQB RESET MODC/IRQC PINIT/NMI MODD/IRQD Figure 1. DSP56303 Block Diagram The DSP56303 is a member of the DSP56300 core family of programmable CMOS DSPs. Significant architectural features of the DSP56300 core family include a barrel shifter, 24-bit addressing, instruction cache, and DMA. The DSP56303 offers 100 million multiply-accumulates per second (MMACS) using an internal 100 MHz clock at 3.0–3.6 volts. The DSP56300 core family offers a rich instruction set and low power dissipation, as well as increasing levels of speed and power to enable wireless, telecommunications, and multimedia products. -
Low Power Asynchronous Digital Signal Processing
LOW POWER ASYNCHRONOUS DIGITAL SIGNAL PROCESSING A thesis submitted to the University of Manchester for the degree of Doctor of Philosophy in the Faculty of Science & Engineering October 2000 Michael John George Lewis Department of Computer Science 1 Contents Chapter 1: Introduction ....................................................................................14 Digital Signal Processing ...............................................................................15 Evolution of digital signal processors ....................................................17 Architectural features of modern DSPs .........................................................19 High performance multiplier circuits .....................................................20 Memory architecture ..............................................................................21 Data address generation .........................................................................21 Loop management ..................................................................................23 Numerical precision, overflows and rounding .......................................24 Architecture of the GSM Mobile Phone System ...........................................25 Channel equalization ..............................................................................28 Error correction and Viterbi decoding ...................................................29 Speech transcoding ................................................................................31 Half-rate and enhanced -
Automatic Benchmark Profiling Through Advanced Trace Analysis
View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Hal - Université Grenoble Alpes Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin, Vania Marangozova-Martin To cite this version: Alexis Martin, Vania Marangozova-Martin. Automatic Benchmark Profiling through Advanced Trace Analysis. [Research Report] RR-8889, Inria - Research Centre Grenoble { Rh^one-Alpes; Universit´eGrenoble Alpes; CNRS. 2016. <hal-01292618> HAL Id: hal-01292618 https://hal.inria.fr/hal-01292618 Submitted on 24 Mar 2016 HAL is a multi-disciplinary open access L'archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destin´eeau d´ep^otet `ala diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publi´esou non, lished or not. The documents may come from ´emanant des ´etablissements d'enseignement et de teaching and research institutions in France or recherche fran¸caisou ´etrangers,des laboratoires abroad, or from public or private research centers. publics ou priv´es. Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin , Vania Marangozova-Martin RESEARCH REPORT N° 8889 March 23, 2016 Project-Team Polaris ISSN 0249-6399 ISRN INRIA/RR--8889--FR+ENG Automatic Benchmark Profiling through Advanced Trace Analysis Alexis Martin ∗ † ‡, Vania Marangozova-Martin ∗ † ‡ Project-Team Polaris Research Report n° 8889 — March 23, 2016 — 15 pages Abstract: Benchmarking has proven to be crucial for the investigation of the behavior and performances of a system. However, the choice of relevant benchmarks still remains a challenge. To help the process of comparing and choosing among benchmarks, we propose a solution for automatic benchmark profiling. -
Rlsc & DSP Advanced Microprocessor System Design
Purdue University Purdue e-Pubs ECE Technical Reports Electrical and Computer Engineering 3-1-1992 RlSC & DSP Advanced Microprocessor System Design; Sample Projects, Fall 1991 John E. Fredine Purdue University, School of Electrical Engineering Dennis L. Goeckel Purdue University, School of Electrical Engineering David G. Meyer Purdue University, School of Electrical Engineering Stuart E. Sailer Purdue University, School of Electrical Engineering Glenn E. Schmottlach Purdue University, School of Electrical Engineering Follow this and additional works at: http://docs.lib.purdue.edu/ecetr Fredine, John E.; Goeckel, Dennis L.; Meyer, David G.; Sailer, Stuart E.; and Schmottlach, Glenn E., "RlSC & DSP Advanced Microprocessor System Design; Sample Projects, Fall 1991" (1992). ECE Technical Reports. Paper 302. http://docs.lib.purdue.edu/ecetr/302 This document has been made available through Purdue e-Pubs, a service of the Purdue University Libraries. Please contact [email protected] for additional information. RISC & DSP Advanced Microprocessor System Design Sample Projects, Fall 1991 John E. Fredine Dennis L. Goeckel David G. Meyer Stuart E. Sailer Glenn E. Schmottlach TR-EE 92- 11 March 1992 School of Electrical Engineering Purdue University West Lafayette, Indiana 47907 RlSC & DSP Advanced Microprocessor System Design Sample Projects, Fall 1991 John E. Fredine Dennis L. Goeckel David G. Meyer Stuart E. Sailer Glenn E. Schrnottlach School of Electrical Engineering Purdue University West Lafayette, Indiana 47907 Table of Contents Abstract ................................................................................................................... -
BOOM): an Industry- Competitive, Synthesizable, Parameterized RISC-V Processor
The Berkeley Out-of-Order Machine (BOOM): An Industry- Competitive, Synthesizable, Parameterized RISC-V Processor Christopher Celio David A. Patterson Krste Asanović Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2015-167 http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-167.html June 13, 2015 Copyright © 2015, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. The Berkeley Out-of-Order Machine (BOOM): An Industry-Competitive, Synthesizable, Parameterized RISC-V Processor Christopher Celio, David Patterson, and Krste Asanovic´ University of California, Berkeley, California 94720–1770 [email protected] BOOM is a work-in-progress. Results shown are prelimi- nary and subject to change as of 2015 June. I$ L1 D$ (32k) L2 data 1. The Berkeley Out-of-Order Machine BOOM is a synthesizable, parameterized, superscalar out- exe of-order RISC-V core designed to serve as the prototypical baseline processor for future micro-architectural studies of uncore regfile out-of-order processors. Our goal is to provide a readable, issue open-source implementation for use in education, research, exe and industry. uncore BOOM is written in roughly 9,000 lines of the hardware L2 data (256k) construction language Chisel. -
A Free, Commercially Representative Embedded Benchmark Suite
MiBench: A free, commercially representative embedded benchmark suite Matthew R. Guthaus, Jeffrey S. Ringenberg, Dan Ernst, Todd M. Austin, Trevor Mudge, Richard B. Brown {mguthaus,jringenb,ernstd,taustin,tnm,brown}@eecs.umich.edu The University of Michigan Electrical Engineering and Computer Science 1301 Beal Ave., Ann Arbor, MI 48109-2122 Abstract Among the common features of these This paper examines a set of commercially microarchitectures are deep pipelines, significant representative embedded programs and compares them instruction level parallelism, sophisticated branch to an existing benchmark suite, SPEC2000. A new prediction schemes, and large caches. version of SimpleScalar that has been adapted to the Although this class of machines has been the chief ARM instruction set is used to characterize the focus of the computer architecture community, performance of the benchmarks using configurations relatively few microprocessors are employed in this similar to current and next generation embedded market segment. The vast majority of microprocessors processors. Several characteristics distinguish the are employed in embedded applications [8]. Although representative embedded programs from the existing many are just inexpensive microcontrollers, their SPEC benchmarks including instruction distribution, combined sales account for nearly half of all memory behavior, and available parallelism. The microprocessor revenue. Furthermore, the embedded embedded benchmarks, called MiBench, are freely application domain is the fastest growing market available to all researchers. segment in the microprocessor industry. The wide range of applications makes it difficult to 1. Introduction characterize the embedded domain. In fact, an Performance based design has made benchmarking embedded benchmark suite should reflect this by a critical part of the design process [1]. -
Low-Power High Performance Computing
Low-Power High Performance Computing Panagiotis Kritikakos August 16, 2011 MSc in High Performance Computing The University of Edinburgh Year of Presentation: 2011 Abstract The emerging development of computer systems to be used for HPC require a change in the architecture for processors. New design approaches and technologies need to be embraced by the HPC community for making a case for new approaches in system design for making it possible to be used for Exascale Supercomputers within the next two decades, as well as to reduce the CO2 emissions of supercomputers and scientific clusters, leading to greener computing. Power is listed as one of the most important issues and constraint for future Exascale systems. In this project we build a hybrid cluster, investigating, measuring and evaluating the performance of low-power CPUs, such as Intel Atom and ARM (Marvell 88F6281) against commodity Intel Xeon CPU that can be found within standard HPC and data-center clusters. Three main factors are considered: computational performance and efficiency, power efficiency and porting effort. Contents 1 Introduction 1 1.1 Report organisation . 2 2 Background 3 2.1 RISC versus CISC . 3 2.2 HPC Architectures . 4 2.2.1 System architectures . 4 2.2.2 Memory architectures . 5 2.3 Power issues in modern HPC systems . 9 2.4 Energy and application efficiency . 10 3 Literature review 12 3.1 Green500 . 12 3.2 Supercomputing in Small Spaces (SSS) . 12 3.3 The AppleTV Cluster . 13 3.4 Sony Playstation 3 Cluster . 13 3.5 Microsoft XBox Cluster . 14 3.6 IBM BlueGene/Q . -
The DENX U-Boot and Linux Guide (DULG) for Canyonlands
The DENX U-Boot and Linux Guide (DULG) for canyonlands Table of contents: • 1. Abstract • 2. Introduction ♦ 2.1. Copyright ♦ 2.2. Disclaimer ♦ 2.3. Availability ♦ 2.4. Credits ♦ 2.5. Translations ♦ 2.6. Feedback ♦ 2.7. Conventions • 3. Embedded Linux Development Kit ♦ 3.1. ELDK Availability ♦ 3.2. ELDK Getting Help ♦ 3.3. Supported Host Systems ♦ 3.4. Supported Target Architectures ♦ 3.5. Installation ◊ 3.5.1. Product Packaging ◊ 3.5.2. Downloading the ELDK ◊ 3.5.3. Initial Installation ◊ 3.5.4. Installation and Removal of Individual Packages ◊ 3.5.5. Removal of the Entire Installation ♦ 3.6. Working with ELDK ◊ 3.6.1. Switching Between Multiple Installations ♦ 3.7. Mounting Target Components via NFS ♦ 3.8. Rebuilding ELDK Components ◊ 3.8.1. ELDK Source Distribution ◊ 3.8.2. Rebuilding Target Packages ◊ 3.8.3. Rebuilding ELDT Packages ♦ 3.9. ELDK Packages ◊ 3.9.1. List of ELDT Packages ◊ 3.9.2. List of Target Packages ♦ 3.10. Rebuilding the ELDK from Scratch ◊ 3.10.1. ELDK Build Process Overview ◊ 3.10.2. Setting Up ELDK Build Environment ◊ 3.10.3. build.sh Usage ◊ 3.10.4. Format of the cpkgs.lst and tpkgs.lst Files ♦ 3.11. Notes for Solaris 2.x Host Environment • 4. System Setup ♦ 4.1. Serial Console Access ♦ 4.2. Configuring the "cu" command ♦ 4.3. Configuring the "kermit" command ♦ 4.4. Using the "minicom" program ♦ 4.5. Permission Denied Problems ♦ 4.6. Configuration of a TFTP Server ♦ 4.7. Configuration of a BOOTP / DHCP Server ♦ 4.8. Configuring a NFS Server • 5. -
Exploring Coremark™ – a Benchmark Maximizing Simplicity and Efficacy by Shay Gal-On and Markus Levy
Exploring CoreMark™ – A Benchmark Maximizing Simplicity and Efficacy By Shay Gal-On and Markus Levy There have been many attempts to provide a single number that can totally quantify the ability of a CPU. Be it MHz, MOPS, MFLOPS - all are simple to derive but misleading when looking at actual performance potential. Dhrystone was the first attempt to tie a performance indicator, namely DMIPS, to execution of real code - a good attempt, which has long served the industry, but is no longer meaningful. BogoMIPS attempts to measure how fast a CPU can “do nothing”, for what that is worth. The need still exists for a simple and standardized benchmark that provides meaningful information about the CPU core - introducing CoreMark, available for free download from www.coremark.org. CoreMark ties a performance indicator to execution of simple code, but rather than being entirely arbitrary and synthetic, the code for the benchmark uses basic data structures and algorithms that are common in practically any application. Furthermore, EEMBC carefully chose the CoreMark implementation such that all computations are driven by run-time provided values to prevent code elimination during compile time optimization. CoreMark also sets specific rules about how to run the code and report results, thereby eliminating inconsistencies. CoreMark Composition To appreciate the value of CoreMark, it’s worthwhile to dissect its composition, which in general is comprised of lists, strings, and arrays (matrixes to be exact). Lists commonly exercise pointers and are also characterized by non-serial memory access patterns. In terms of testing the core of a CPU, list processing predominantly tests how fast data can be used to scan through the list.