A Multicore Computing Platform for Benchmarking

A MULTICORE COMPUTING PLATFORM FOR BENCHMARKING DYNAMIC PARTIAL RECONFIGURATION BASED DESIGNS

by
DAVID A. THORNDIKE

Submitted in partial fulfillment of the requirements for the degree of Master of Science

Thesis Advisor: Dr. Christos A. Papachristou

Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY
August, 2012

CASE WESTERN RESERVE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

We hereby approve the thesis/dissertation of David A. Thorndike, candidate for the Master of Science degree *.

(signed) Christos A. Papachristou (chair of the committee)
Francis L. Merat
Francis G. Wolff

(date) June 1, 2012

*We also certify that written approval has been obtained for any proprietary material contained therein.

Table of Contents

List of Figures
List of Tables
Abstract
1. Introduction
    1.1 Motivation
    1.2 Contributions
    1.3 Thesis Outline
2. Background
    2.1 Multicore Computing
    2.2 Reconfigurable Computing
    2.3 Dynamic Partial Reconfigurability
    2.4 Related Work
3. Platform Development Process
    3.1 OpenSPARC
    3.2 LEON3
    3.3 Xilinx Virtex-5
    3.4 GNU Tools / GRLIB
    3.5 Xilinx ISE
    3.6 SnapGear Linux (for SMP)
    3.7 Benchmarking Applications
4. General Implementation Flow
    4.1 Hardware and Software Tools
    4.2 Board Functional Verification
    4.3 Hardware Development
    4.4 Software Development
5. Results
6. Summary
    6.1 Conclusion
    6.2 Future Work
Appendix A
Bibliography

List of Figures

Figure 1. IBM's Cell Processor
Figure 2. DPR for Functional Modification and Size Reduction
Figure 3. Leon3 SMP with reconfigurable coprocessors
Figure 4. Xilinx XUPV5-LX110T development board (ML509)
Figure 5. Virtex-5 FPGA ML50x Evaluation Platform Block Diagram
Figure 6. Block diagram of configurable LEON3 processor
Figure 7. Example LEON3 multicore SoC with other GRLIB IP cores
Figure 8. Linear speedup of OpenMPBench's BasicMath
Figure 9. LEON3 Xilinx ML509 Template Design from leon3mp.vhd
Figure 10. LEON3 processor core configurability

List of Tables

Table 1. Soft-core CPUs suitable for FPGA implementation
Table 2. Xilinx Virtex-5 LX50T and LX110T Comparison
Table 3. Single Core Synthesis Results
Table 4. Multicore Synthesis Results
Table 5. Benchmark Performance Results

A Multicore Computing Platform for Benchmarking Dynamic Partial Reconfiguration Based Designs

Abstract

by
DAVID ANDREW THORNDIKE

With the increasing application of multiple processor cores (multicores) within embedded system applications, as well as the pervasive utilization of the field-programmable gate array (FPGA), the embedded system development community has been exploring the advantages of the dynamically reconfigurable nature of FPGAs. Given size and power limitations, a primary motivation for this interest is to enable dynamic customization of hardware to optimize system performance for the various algorithms that a system encounters. This work presents a hardware-based platform for studying dynamic reconfiguration of FPGAs in the context of multicore embedded systems. It also presents a methodology for developing the hardware and software for these systems. An important aspect of this work was to maximize the utilization of open source hardware and software intellectual property (IP). An example of the basic implementation flow is also provided, along with some benchmarking results.

1. Introduction

1.1 Motivation

Over the past few decades, since the commercial introduction of the Field Programmable Gate Array (FPGA) by Xilinx, the configurable nature of FPGAs has brought them to the center of attention for many in the field of computing. Though recent advances by Actel have introduced the first commercially available FPGAs with configurable analog blocks (Morris, 2005), FPGAs have primarily occupied the realm of digital logic and microprocessor functionality. When the density of these devices became sufficiently large, designers were able to implement multiple microprocessor cores within a single FPGA in order to achieve the performance and power advantages that brought multicore architectures to general-purpose CPUs and ASIC platforms.

With the advancement of dynamic partial reconfiguration capabilities provided by Xilinx tools and devices, new opportunities have unfolded to explore dynamic reconfigurability of multicore systems. The intent of this work is to develop a hardware platform on which the capabilities of this powerful and flexible technology can be explored.

Since reconfigurable computing has found wide-ranging applicability throughout the field of computing, the primary focus of this work will be on its application to embedded systems. These are microprocessor-based systems with customized software for the sole purpose of performing or controlling a set of specific functions, often involving real-time operations. The end-user may be provided options or choices in the operation of the system, but unlike with a personal computer, the user is generally not provided the ability to program or change the software of an embedded system (Heath, 2003).

1.2 Contributions

In this work, a reconfigurable multicore computing platform is presented. A method is demonstrated to create such a platform through the application of an open source soft-core processor and open source development tools along with Xilinx tools and a Virtex-5 evaluation board, also referred to as XUPV5 or ML509.

1.3 Thesis Outline

This thesis is organized as follows:

• Chapter 2: Presents background information on multicore computing and reconfigurable computing, including dynamic partial reconfiguration (DPR). Related work in the area of DPR design on the ML509 is also described.
• Chapter 3: Describes components