OpenSPARC™ Internals


OpenSPARC T1/T2 CMT Throughput Computing

David L. Weaver, Editor

Sun Microsystems, Inc. 4150 Network Circle Santa Clara, CA 95054 U.S.A. 650-960-1300

Copyright 2002-2008 Sun Microsystems, Inc., 4150 Network Circle • Santa Clara, CA 95054 USA. All rights reserved.

This product or document is protected by copyright and distributed under licenses restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of Sun and its licensors, if any. Third-party software, including font technology, is copyrighted and licensed from Sun suppliers.

Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and other countries, exclusively licensed through X/Open Company, Ltd. For Netscape Communicator, the following notice applies: Copyright 1995 Netscape Communications Corporation. All rights reserved.

Sun, Sun Microsystems, the Sun logo, Solaris, OpenSolaris, OpenSPARC, Java, MAJC, Sun Fire, UltraSPARC, and VIS are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. or its subsidiaries in the U.S. and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the U.S. and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc.

The OPEN LOOK and Sun Graphical User Interface was developed by Sun Microsystems, Inc. for its users and licensees. Sun acknowledges the pioneering efforts of Xerox in researching and developing the concept of visual or graphical user interfaces for the computer industry.
Sun holds a non-exclusive license from Xerox to the Xerox Graphical User Interface, which license also covers Sun’s licensees who implement OPEN LOOK GUIs and otherwise comply with Sun’s written license agreements.

RESTRICTED RIGHTS: Use, duplication, or disclosure by the U.S. Government is subject to restrictions of FAR 52.227-14(g)(2)(6/87) and FAR 52.227-19(6/87), or DFAR 252.227-7015(b)(6/95) and DFAR 227.7202-3(a).

DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID.

ISBN 978-0-557-01974-8
First printing, October 2008

Contents

Preface
1 Introducing Chip Multithreaded (CMT) Processors
2 OpenSPARC Designs
  2.1 Academic Uses for OpenSPARC
  2.2 Commercial Uses for OpenSPARC
    2.2.1 FPGA Implementation
    2.2.2 Design Minimization
    2.2.3 Coprocessors
    2.2.4 OpenSPARC as Test Input to CAD/EDA Tools
3 Architecture Overview
  3.1 The UltraSPARC Architecture
    3.1.1 Features
    3.1.2 Attributes
      3.1.2.1 Design Goals
      3.1.2.2 Register Windows
    3.1.3 System Components
      3.1.3.1 Binary Compatibility
      3.1.3.2 UltraSPARC Architecture MMU
      3.1.3.3 Privileged Software
  3.2 Processor Architecture
    3.2.1 Integer Unit (IU)
    3.2.2 Floating-Point Unit (FPU)
  3.3 Instructions
    3.3.1 Memory Access
      3.3.1.1 Memory Alignment Restrictions
      3.3.1.2 Addressing Conventions
      3.3.1.3 Addressing Range
      3.3.1.4 Load/Store Alternate
      3.3.1.5 Separate Instruction and Data Memories
      3.3.1.6 Input/Output (I/O)
      3.3.1.7 Memory Synchronization
    3.3.2 Integer Arithmetic / Logical / Shift Instructions
    3.3.3 Control Transfer
    3.3.4 State Register Access
      3.3.4.1 Ancillary State Registers
      3.3.4.2 PR State Registers
      3.3.4.3 HPR State Registers
    3.3.5 Floating-Point Operate
    3.3.6 Conditional Move
    3.3.7 Register Window Management
    3.3.8 SIMD
  3.4 Traps
  3.5 Chip-Level Multithreading (CMT)
4 OpenSPARC T1 and T2 Processor Implementations
  4.1 General Background
  4.2 OpenSPARC T1 Overview
  4.3 OpenSPARC T1 Components
    4.3.1 OpenSPARC T1 Physical Core
    4.3.2 Floating-Point Unit (FPU)
    4.3.3 L2 Cache
    4.3.4 DRAM Controller
    4.3.5 I/O Bridge (IOB) Unit
    4.3.6 J-Bus Interface (JBI)
    4.3.7 SSI ROM Interface
    4.3.8 Clock and Test Unit (CTU)
    4.3.9 EFuse
  4.4 OpenSPARC T2 Overview
  4.5 OpenSPARC T2 Components
    4.5.1 OpenSPARC T2 Physical Core
    4.5.2 L2 Cache
    4.5.3 Memory Controller Unit (MCU)
    4.5.4 Noncacheable Unit (NCU)
    4.5.5 System Interface Unit (SIU)
    4.5.6 SSI ROM Interface (SSI)
  4.6 Summary of Differences Between OpenSPARC T1 and OpenSPARC T2
    4.6.1 Microarchitectural Differences
    4.6.2 Instruction Set Architecture (ISA) Differences
    4.6.3 MMU Differences
    4.6.4 Performance Instrumentation Differences
    4.6.5 Error Handling Differences
    4.6.6 Power Management Differences
    4.6.7 Configuration, Diagnostic, and Debug Differences
5 OpenSPARC T2 Memory Subsystem — A Deeper Look
  5.1 Caches
    5.1.1 L1 I-Cache
    5.1.2 L1 D-Cache
    5.1.3 L2 Cache
  5.2 Memory Controller Unit (MCU)
  5.3 Memory Management Unit (MMU)
    5.3.1 Address Translation Overview
    5.3.2 TLB Miss Handling
    5.3.3 Instruction Fetching
    5.3.4 Hypervisor Support
    5.3.5 MMU Operations
      5.3.5.1 TLB Operation Summary
      5.3.5.2 Demap Operations
  5.4 Noncacheable Unit (NCU)
  5.5 System Interface Unit (SIU)
  5.6 Data Management Unit (DMU)
  5.7 Memory Models
  5.8 Memory Transactions
    5.8.1 Cache Flushing
    5.8.2 Displacement Flushing
    5.8.3 Memory Accesses and Cacheability
    5.8.4 Cacheable Accesses
    5.8.5 Noncacheable and Side-Effect Accesses
    5.8.6 Global Visibility and Memory Ordering
    5.8.7 Memory Synchronization: MEMBAR and FLUSH
    5.8.8 Atomic Operations
    5.8.9 Nonfaulting Load
6 OpenSPARC Processor Configuration
  6.1 Selecting Compilation Options in the T1 Core
    6.1.1 FPGA_SYN ...