Architectures of Processor Chips Basic Architectural Features
Recommended publications
-
Datasheet Fujitsu Sparc Enterprise T5440 Server
DATASHEET FUJITSU SPARC ENTERPRISE T5440 SERVER. THE SYSTEM THAT MOVES WEB APPLICATION CONSOLIDATION INTO MID-RANGE COMPUTING. UP TO 4 HIGH PERFORMANCE PROCESSORS, HIGH MEMORY AND EXTENSIVE CONNECTIVITY PROVIDE THE INFRASTRUCTURE FOR BACK OFFICE AND DATA CENTER CONSOLIDATION TASKS.
FUJITSU SPARC ENTERPRISE FOR WEB SECURITY, EFFICIENCY AND PERFORMANCE
Fujitsu SPARC Enterprise throughput computing servers are the ultimate in Web and front-end business processes. Designed for space efficiency, low power consumption, and maximum compute performance, they provide high throughput, energy-saving, and space-saving solutions in Web server deployment. Built on UltraSPARC T2 or UltraSPARC T2 Plus processors, everything is integrated together on each processor chip to reduce the overall component count. This speeds performance, lowers power use, and reduces component failure. Add in the no-cost virtualization technology from Logical Domains and Solaris Containers and you have a fully scalable environment for server consolidation. Finish it off with on-chip encryption and 10 Gigabit Ethernet freeways and they provide the complete environment for secure data processing and lightning fast throughput.
SPARC ENVIRONMENTS MEAN MANAGEABILITY AND RELIABILITY
Based on a four socket design, Fujitsu SPARC Enterprise T5440 provides up to 256 threads and 512GB of memory for outstanding workload consolidation. These servers can deliver outstanding data throughput performance in web and network environments while also delivering excellent server consolidation capability for back office and departmental database solutions. Fully supported by solid management and the top scalability and openness of the Solaris Operating system, you have the ability to maximise thread utilization, deliver application capability, and scale as large as you need. The intrinsic service management in Fujitsu SPARC Enterprise T5440 combined with the SPARC hardware architecture and Solaris operating system -
Oracle® Developer Studio 12.6
Oracle® Developer Studio 12.6: C++ User's Guide Part No: E77789 July 2017 Copyright © 2017, Oracle and/or its affiliates. All rights reserved. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, delivered to U.S. Government end users are "commercial computer software" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, use, duplication, disclosure, modification, and adaptation of the programs, including any operating system, integrated software, any programs installed on the hardware, and/or documentation, shall be subject to license terms and license restrictions applicable to the programs. -
18-741 Advanced Computer Architecture Lecture 1: Intro And
18-742 Fall 2012 Parallel Computer Architecture Lecture 10: Multithreading II. Prof. Onur Mutlu, Carnegie Mellon University, 9/28/2012
Reminder: Review Assignments
Due: Sunday, September 30, 11:59pm.
• Mutlu, “Some Ideas and Principles for Achieving Higher System Energy Efficiency,” NSF Position Paper and Presentation 2012.
• Ebrahimi et al., “Parallel Application Memory Scheduling,” MICRO 2011.
• Seshadri et al., “The Evicted-Address Filter: A Unified Mechanism to Address Both Cache Pollution and Thrashing,” PACT 2012.
• Pekhimenko et al., “Linearly Compressed Pages: A Main Memory Compression Framework with Low Complexity and Low Latency,” CMU SAFARI Technical Report 2012.
Feedback on Project Proposals
• In your email
• General feedback points
• Concrete mechanisms, even if not fully right, is a good place to start testing your ideas
Last Lecture
• Asymmetry in Memory Scheduling
• Wrap up Asymmetry
• Multithreading: Fine-grained, Coarse-grained
Today
• More Multithreading
Readings: Multithreading
Required
• Spracklen and Abraham, “Chip Multithreading: Opportunities and Challenges,” HPCA Industrial Session, 2005.
• Kalla et al., “IBM Power5 Chip: A Dual-Core Multithreaded Processor,” IEEE Micro 2004.
• Tullsen et al., “Exploiting choice: instruction fetch and issue on an implementable simultaneous multithreading processor,” ISCA 1996.
• Eyerman and Eeckhout, “A Memory-Level Parallelism Aware Fetch Policy for SMT Processors,” HPCA 2007.
Recommended
• Hirata et al., “An Elementary Processor Architecture with Simultaneous Instruction Issuing from Multiple Threads,” ISCA 1992.
• Smith, “A pipelined, shared resource MIMD computer,” ICPP 1978.
• Gabor et al., “Fairness and Throughput in Switch on Event Multithreading,” MICRO 2006.
• Agarwal et al., “APRIL: A Processor Architecture for Multiprocessing,” ISCA 1990.
Review: Fine-grained vs. -
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC® CMP Processor
Dynamic Helper Threaded Prefetching on the Sun UltraSPARC® CMP Processor
Jiwei Lu, Abhinav Das, Wei-Chung Hsu, Department of Computer Science and Engineering, University of Minnesota, Twin Cities, {jiwei,adas,hsu}@cs.umn.edu
Khoa Nguyen, Santosh G. Abraham, Scalable Systems Group, Sun Microsystems Inc., {khoa.nguyen,santosh.abraham}@sun.com
Abstract
Data prefetching via helper threading has been extensively investigated on Simultaneous Multi-Threading (SMT) or Virtual Multi-Threading (VMT) architectures. Although reportedly large cache latency can be hidden by helper threads at runtime, most techniques rely on hardware support to reduce context switch overhead between the main thread and helper thread as well as rely on static profile feedback to construct the helper thread code. This paper develops a new solution by exploiting helper threaded prefetching through dynamic optimization on the latest UltraSPARC Chip-Multiprocessing (CMP) processor. Our experiments show that by utilizing the otherwise idle processor core, a single user-level helper thread is sufficient to improve the runtime performance of the main thread without triggering multiple thread slices.
[26], [28], the processor checkpoints the architectural state and continues speculative execution that prefetches subsequent misses in the shadow of the initial triggering missing load. When the initial load arrives, the processor resumes execution from the checkpointed state. In software pre-execution (also referred to as helper threads or software scouting) [2], [4], [7], [10], [14], [24], [29], [35], a distilled version of the forward slice starting from the missing load is executed, minimizing the utilization of execution resources. Helper threads utilizing run-time compilation techniques may also be effectively deployed on processors that do not have the necessary hardware support for hardware scouting (such as checkpointing and resuming regular execution). Initial research on software helper threads developed the underlying run-time compiler -
Chapter 2 Java Processor Architectural
INFORMATION TO USERS This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. ProQuest Information and Learning, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA, 800-521-0600. UMI
The JAFARDD Processor: A Java Architecture Based on a Folding Algorithm, with Reservation Stations, Dynamic Translation, and Dual Processing by Mohamed Watheq Ali Kamel El-Kharashi, B. Sc., Ain Shams University, 1992, M. Sc., Ain Shams University, 1996. A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in the Department of Electrical and Computer Engineering. We accept this dissertation as conforming to the required standard. Dr. F. Gebali, Supervisor (Department of Electrical and Computer Engineering) Dr. -
Debugging Multicore & Shared-Memory Embedded Systems
Debugging Multicore & Shared-Memory Embedded Systems. Classes 249 & 269, 2007 edition. Jakob Engblom, PhD, Virtutech
Scope & Context of This Talk
• Multiprocessor revolution
• Programming multicore
• (In)determinism
• Error sources
• Debugging techniques
Scope and Context of This Talk
• Some material specific to shared-memory symmetric multiprocessors and multicore designs
– There are lots of problems particular to this
• But most concepts are general to almost any parallel application
– The problem is really with parallelism and concurrency rather than a particular design choice
Introduction & Background
Multiprocessing: what, why, and when?
The Multicore Revolution is Here!
• The imminent event of parallel computers with many processors taking over from single processors has been declared before...
• This time it is for real. Why?
• More instruction-level parallelism hard to find
– Very complex designs needed for small gain
– Thread-level parallelism appears alive and well
• Clock frequency scaling is slowing drastically
– Too much power and heat when pushing envelope
• Cannot communicate across chip fast enough
– Better to design small local units with short paths
• Effective use of billions of transistors
– Easier to reuse a basic unit many times
• Potential for very easy scaling
– Just keep adding processors/cores for higher (peak) performance
Parallel Processing
• John Hennessy, interviewed in the ACM Queue, sees the following eras of computer architecture evolution:
1. Initial efforts and early designs. 1940. ENIAC, Zuse, Manchester, etc.
2. Instruction-Set Architecture. Mid-1960s. Starting with the IBM System/360 with multiple machines with the same compatible instruction set
3. -
Sun SPARC Enterprise T5440 Servers
Sun SPARC Enterprise® T5440 Server Just the Facts. SunWIN token 526118, December 16, 2009, Version 2.3. Distribution restricted to Sun Internal and Authorized Partners Only. Not for distribution otherwise, in whole or in part. T5440 Server Just the Facts Dec. 16, 2009 Sun Internal and Authorized Partner Use Only Page 1 of 133
Copyrights ©2008, 2009 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Sun Fire, Sun SPARC Enterprise, Solaris, Java, J2EE, Sun Java, SunSpectrum, iForce, VIS, SunVTS, Sun N1, CoolThreads, Sun StorEdge, Sun Enterprise, Netra, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, and SunSpectrum Bronze are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd.
Revision History
Version | Date | Comments
1.0 | Oct. 13, 2008 | Initial version
1.1 | Oct. 16, 2008 | Enhanced I/O Expansion Module section; Notes on release tabs of XSR-1242/XSR-1242E rack; Updated IBM 560 and HP DL580 G5 competitive information; Updates to external storage products
1.2 | Nov. 18, 2008 | Number -
Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux
Hindawi Publishing Corporation, EURASIP Journal on Embedded Systems, Volume 2008, Article ID 582648, 16 pages, doi:10.1155/2008/582648
Research Article: Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux
Emiliano Betti (1), Daniel Pierre Bovet (1), Marco Cesati (1), and Roberto Gioiosa (1, 2)
(1) System Programming Research Group, Department of Computer Science, Systems, and Production, University of Rome “Tor Vergata”, Via del Politecnico 1, 00133 Rome, Italy
(2) Computer Architecture Group, Computer Science Division, Barcelona Supercomputing Center (BSC), c/ Jordi Girona 31, 08034 Barcelona, Spain
Correspondence should be addressed to Roberto Gioiosa. Received 30 March 2007; Accepted 15 August 2007. Recommended by Ismael Ripoll.
Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high responsiveness, open-source hard real-time operating system for multiprocessor systems capable of providing high real-time performance while maintaining the code simple and not impacting on the performances of the rest of the system. Moreover, ASMP-LINUX does not require code changing or application recompiling/relinking. In order to assess the performances of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.
Copyright © 2008 Emiliano Betti et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. INTRODUCTION
This article describes a modified Linux kernel called ASMP- [...] significantly higher than that of single-core processors, we can expect that in a near future many embedded systems will make use of multicore processors. -
UltraSparc-III vs Intel IA-64
UltraSparc-III vs Intel IA-64. Maria Celeste Marques Pinto, Departamento de Informática, Universidade do Minho, 4710 - 057 Braga, Portugal. ICCA’03
UltraSparc-III vs Intel IA-64
• Introduction
• Framework Definition
• Architecture Comparison
• Future Trends
• Conclusions
Introduction
• UltraSparc-III (US-III) is the third generation from the UltraSPARC family of Sun
• Is a RISC processor and uses the 64-bit SPARC-V9 architecture
• IA-64 is Intel’s extension into a 64-bit architecture
• IA-64 processor is based on a concept known as EPIC (Explicitly Parallel Instruction Computing)
Framework Definition
• Reliability
• Instruction level Parallelism (ILP)
– instructions per cycle
• Branch Handling
– Techniques: branch delay slots, predication
– Strategies: static, dynamic
Framework Definition
• Memory Hierarchy
– main memory and cache memory
– cache levels location
– cache organization:
  • fully associative - every entry has a slot in the "cache directory" to indicate where it came from in memory
  • one-way set associative - only a single directory entry be searched
  • two-way set associative - two entries per slot to be searched (and is extended to
Framework Definition
• Pipeline
– increase the speed of CPU processing
– several stages that performs part of the work necessary to execute an instruction
• Instruction Set (IS)
– is the hardware "language" in which the software tells the processor what to do
– can be divided into four basic types of operations such as arithmetic, logical, program-control -
Sun Blade 1000 and 2000 Workstations
Sun Blade™ 1000 and 2000 Workstations Just the Facts. Copyrights 2002 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Sun Blade, PGX, Solaris, Ultra, Sun Enterprise, Starfire, SunPCi, Forte, VIS, XGL, XIL, Java, Java 3D, SunVideo, SunVideo Plus, Sun StorEdge, SunMicrophone, SunVTS, Solstice, Solstice AdminTools, Solstice Enterprise Agents, ShowMe, ShowMe How, ShowMe TV, Sun Workstation, StarOffice, iPlanet, Solaris Resource Manager, Java 2D, OpenWindows, SunCD, Sun Quad FastEthernet, SunFDDI, SunATM, SunCamera, SunForum, PGX32, SunSpectrum, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunSolve, SunSolve EarlyNotifier, and SunClient are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and in other countries, exclusively licensed through X/Open Company, Ltd. FireWire is a registered trademark of Apple Computer, Inc., used under license. OpenGL is a trademark of Silicon Graphics, Inc., which may be registered in certain jurisdictions. Netscape is a trademark of Netscape Communications Corporation. PostScript and Display PostScript are trademarks of Adobe Systems, Inc., which may be registered in -
SPARC Enterprise T5440 Server Architecture
SPARC ENTERPRISE T5440 SERVER ARCHITECTURE: Unleashing UltraSPARC T2 Plus Processors with Innovative Multi-core Multi-thread Technology. White Paper, July 2009
TABLE OF CONTENTS
THE ULTRASPARC T2 PLUS PROCESSOR
– THE WORLD'S FIRST MASSIVELY THREADED SYSTEM ON A CHIP (SOC)
– TAKING CHIP MULTITHREADED DESIGN TO THE NEXT LEVEL
– ULTRASPARC T2 PLUS PROCESSOR ARCHITECTURE
SERVER ARCHITECTURE
– SYSTEM-LEVEL ARCHITECTURE
– CHASSIS DESIGN INNOVATIONS
ENTERPRISE-CLASS MANAGEMENT AND SOFTWARE
– SYSTEM MANAGEMENT TECHNOLOGY
– SCALABILITY AND SUPPORT FOR INNOVATIVE MULTITHREADING TECHNOLOGY
CONCLUSION
Chapter 1 The UltraSPARC T2 Plus Processors
The UltraSPARC T2 and UltraSPARC T2 Plus processors are the industry’s first system on a chip (SoC), supplying the most cores and threads of any general-purpose processor available, and integrating all key system functions.
The World's First Massively Threaded System on a Chip (SoC)
The UltraSPARC T2 Plus processor eliminates the need for expensive custom hardware and software development by integrating computing, security, and I/O onto a single chip. Binary compatible with earlier UltraSPARC processors, no other processor delivers so much performance in so little space and with such small power requirements, letting organizations rapidly scale the delivery of new network services with maximum efficiency and predictability. The UltraSPARC T2 Plus processor is shown in Figure 1.
Figure 1. The UltraSPARC T2 Plus processor with CoolThreads technology -
OpenSPARC – An Open Platform for Hardware Reliability Experimentation
OpenSPARC – An Open Platform for Hardware Reliability Experimentation
Ishwar Parulkar and Alan Wood, Sun Microsystems, Inc.; James C. Hoe and Babak Falsafi, Carnegie Mellon University; Sarita V. Adve and Josep Torrellas, University of Illinois at Urbana-Champaign; Subhasish Mitra, Stanford University
IEEE SELSE 4 - March 26, 2008 www.OpenSPARC.net
Outline
1. Chip Multi-threading (CMT)
2. OpenSPARC T2 and T1 processors
3. Reliability in OpenSPARC processors
4. What is available in OpenSPARC
5. Current university research using OpenSPARC
6. Future research directions
World's First 64-bit Open Source Microprocessor
• OpenSPARC.net
• Governed by GPLv2
• Complete processor architecture & implementation
• Register Transfer Level (RTL)
• Hypervisor API
• Verification suite and architectural models
• Simulation model for operating system bringup on s/w
Chip Multithreading (CMT)
[Table: workload characteristics; column headings not recoverable from the extracted text]
Instruction-level Parallelism: Low Low Low Medium Low High
Thread-level Parallelism: High High High High High
Instruction/Data Working Set: Large Large Medium Large Large
Data Sharing: Low Medium High Medium High Medium
Memory Bottleneck
[Figure: relative performance (log scale, 1 to 10000) of CPU frequency vs. DRAM speeds, 1980 to 2005; CPU 2x every 2 years, DRAM 2x every 6 years, with the gap between them marked.] Source: Sun World Wide Analyst Conference Feb. 25, 2003
Single Threading
HURRY Up to 85% Cycles Waiting for Memory