Dynamic Helper Threaded Prefetching on the Sun Ultrasparc® CMP Processor

Total Page:16

File Type:pdf, Size:1020Kb

Dynamic Helper Threaded Prefetching on the Sun Ultrasparc® CMP Processor Dynamic Helper Threaded Prefetching on the Sun UltraSPARC® CMP Processor Jiwei Lu, Abhinav Das, Wei-Chung Hsu Khoa Nguyen, Santosh G. Abraham Department of Computer Science and Engineering Scalable Systems Group University of Minnesota, Twin Cities Sun Microsystems Inc. {jiwei,adas,hsu}@cs.umn.edu {khoa.nguyen,santosh.abraham}@sun.com Abstract [26], [28], the processor checkpoints the architectural state and continues speculative execution that Data prefetching via helper threading has been prefetches subsequent misses in the shadow of the extensively investigated on Simultaneous Multi- initial triggering missing load. When the initial load Threading (SMT) or Virtual Multi-Threading (VMT) arrives, the processor resumes execution from the architectures. Although reportedly large cache checkpointed state. In software pre-execution (also latency can be hidden by helper threads at runtime, referred to as helper threads or software scouting) [2], most techniques rely on hardware support to reduce [4], [7], [10], [14], [24], [29], [35], a distilled version context switch overhead between the main thread and of the forward slice starting from the missing load is helper thread as well as rely on static profile feedback executed, minimizing the utilization of execution to construct the help thread code. This paper develops resources. Helper threads utilizing run-time a new solution by exploiting helper threaded pre- compilation techniques may also be effectively fetching through dynamic optimization on the latest deployed on processors that do not have the necessary UltraSPARC Chip-Multiprocessing (CMP) processor. hardware support for hardware scouting (such as Our experiments show that by utilizing the otherwise checkpointing and resuming regular execution). idle processor core, a single user-level helper thread Initial research on software helper threads is sufficient to improve the runtime performance of the developed the underlying run-time compiler main thread without triggering multiple thread slices. algorithms or evaluated them using simulation. With Moreover, since the multiple cores are physically the advent of processors supporting SMT and VMT, decoupled in the CMP, contention introduced by helper threading has been shown to be effective on the helper threading is minimal. This paper also discusses Intel Pentium-4 SMT processor and the Itanium-2 several key technical challenges of building a light- processor [7], [13], [24], [25], [29]. In an SMT weight dynamic optimization/software scouting system processor such as Pentium-4, many of the processor on the UltraSPARC/Solaris platform. core resources such as L1 caches, issue queues are either partitioned or shared. Helper threads need to be constructed, deployed and monitored carefully so that 1. Introduction the negative resource contention effects do not Modern processors spend a significant fraction of outweigh the gains due to prefetching. The VMT overall execution time waiting for the memory systems method used a novel combination of performance to deliver cache lines. This observation has motivated monitoring and debugging features of the Itanium-2 to copious research on hardware and software data toggle between the main thread and helper thread prefetching schemes. Execution-based prefetching is a execution. However, on Itanium-2, the large overhead promising approach that aims to provide high in toggling between these modes limits the number of prefetching coverage and accuracy. These schemes cycles available for actual helper code execution to a exploit the abundant execution resources that are couple of hundred cycles. Only a few missing loads severely underutilized following an L2 or L3 cache can be launched in this short time interval. miss on contemporary processors supporting Almost all general-purpose processor chips are Simultaneous Multi-Threading (SMT) [8] or Virtual moving to Chip Multi-Processors (CMP) [19], [20], Multi-Threading (VMT) [24]. In hardware pre- including the Gemini, Niagara, Panther chips from execution or scouting [3], [15], [18], [22], [23], [25], Sun, the Power4 and Power5 chips from IBM [16] and recent announcements from AMD/Intel. The IBM and Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’05) 0-7695-2440-0/05 $20.00 © 2005 IEEE Sun CMPs have a single L2 cache shared by all the Section 5 evaluates the performance of dynamic helper cores and private L1 caches for each of the cores. threading and Section 6 draws the conclusion. Since the cores do not share any execution or L1 resources, helper thread execution on one core has 2. Background and Related Works minimal negative impact on main thread execution on another core. However, since the L2 cache is shared, Many researchers have proposed to use one or more the helper thread may prefetch data into the L2 cache speculative threads to warm up the shared resources, on behalf of the main thread. Since such CMPs are such as the caches and the branch prediction tables, to soon going to be almost universal in the general- reduce the penalty of cache misses and branch mis- purpose arena and since many single-thread predictions. These helper threads (also called scout applications are dominated by off-chip stall cycles, threads) usually execute a code segment pre- helper thread prefetching on CMPs is an attractive constructed at compile time by identifying the proposition that needs further investigation. instructions on the execution path leading to the The minimal sharing of resources between cores performance bottleneck [5]. gives rise to unique issues that are not present in an 2.1. SMT/VMT vs. CMP SMT or VMT implementation. First, how does the main thread initiate helper thread execution for a Pre-computation based helper threads have been particular L2 cache miss? In an SMT system, both evaluated on Pentium-4 with hyper-threading and on threads are co-located on the same core enabling fast Itanium-2 system with special hardware support for synchronization. Second, how does the main thread VMT [7], [13], [15], [24], [29]. Other helper communicate register values to the helper thread? In threading works such as Data-Driven Multi-Threading the Itanium-2 VMT system, the register file is (DDMT) [3], Simultaneous Subordinate Micro- effectively shared between the main and helper threading (SSMT) [28] and Transparent Threads [10] threads. target at achieving effective helper threaded We have devised innovative mechanisms to prefetching on SMT processors. address these issues, implemented a complete dynamic Unlike SMT and VMT, which share many critical optimization system for helper thread based resources, Chip Multi-processing (CMP) processors prefetching, and measured actual speedups on an limit sharing, for example, to only the L2/L3 cache. existing Sun UltraSPARC IV+ CMP chip [27]. In our While the restricted resource sharing moderates the system, the main thread is bound to one core and the benefit of helper threading to only L2/L3 cache runtime performance monitoring code, the runtime prefetching, it also avoids the drawback of hard-to- optimizer and the dynamically generated helper code control resource contention encountered by helper execute on the other core. Runtime performance threading on SMT. The impact of different resource monitoring selects program regions that have sharing levels to thread communication cost on CMP delinquent loads. The helper code generated for these as well as the corresponding performance margin for regions is optimized to prefetch for delinquent loads. pre-execution have been quantitatively assessed by The main thread uses a mailbox in shared memory to Brown and et al. on a research Itanium CMP [14]. communicate and initiate helper thread execution. The 2.2. Dynamic vs. Static Helper Threading normal caching mechanism maintains this mailbox in the L2 cache and also in the L1 cache of the helper Software-based dynamic optimization [6], [17], [30], thread's core. We address many other implementation [31], [33] adapts to the runtime behavior of a program issues in this first evaluation of helper thread due to the change of input data, and/or the underlying prefetching on a physical Chip-Multiprocessor and micro-architecture. Current dynamic optimizations measure significant performance gains on several include data prefetching, procedure inlining, partial SPEC benchmarks and a real-world application. dead code elimination, and code layout, which have The remainder of this paper is organized as been proven to be useful complements to static follows. Section 2 provides the background and related optimizations. Optimizations such as data cache work. Section 3 discusses the helper thread model in prefetching and branch mis-prediction reduction are our optimization framework, including code selection, usually difficult to perform at compile time since helper thread dispatching, communication and cache miss and branch mis-prediction information synchronization. Section 4 introduces the dynamic may not be available. Furthermore, program hot spots optimization framework on the UltraSPARC system. may as well change under different input data sets or Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO’05) 0-7695-2440-0/05 $20.00 © 2005 IEEE start detected new no system busy? phase? no main thread still alive? yes yes no yes prepare profile from the samples dispatch new helper task, if any. no new profile arrives? optimize hot loops and end prepare helper tasks yes check phase status
Recommended publications
  • Sun Fire E2900 Server
    Sun FireTM E2900 Server Just the Facts February 2005 SunWin token 401325 Sun Confidential – Internal Use Only Just The Facts Sun Fire E2900 Server Copyrights ©2005 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Sun Fire, Netra, Ultra, UltraComputing, Sun Enterprise, Sun Enterprise Ultra, Starfire, Solaris, Sun WebServer, OpenBoot, Solaris Web Start Wizards, Solstice, Solstice AdminSuite, Solaris Management Console, SEAM, SunScreen, Solstice DiskSuite, Solstice Backup, Sun StorEdge, Sun StorEdge LibMON, Solstice Site Manager, Solstice Domain Manager, Solaris Resource Manager, ShowMe, ShowMe How, SunVTS, Solstice Enterprise Agents, Solstice Enterprise Manager, Java, ShowMe TV, Solstice TMNscript, SunLink, Solstice SunNet Manager, Solstice Cooperative Consoles, Solstice TMNscript Toolkit, Solstice TMNscript Runtime, SunScreen EFS, PGX, PGX32, SunSpectrum, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunStart, SunVIP, SunSolve, and SunSolve EarlyNotifier are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. All other product or service names mentioned
    [Show full text]
  • Sun Ultratm 5 Workstation Just the Facts
    Sun UltraTM 5 Workstation Just the Facts Copyrights 1999 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Ultra, PGX, PGX24, Solaris, Sun Enterprise, SunClient, UltraComputing, Catalyst, SunPCi, OpenWindows, PGX32, VIS, Java, JDK, XGL, XIL, Java 3D, SunVTS, ShowMe, ShowMe TV, SunForum, Java WorkShop, Java Studio, AnswerBook, AnswerBook2, Sun Enterprise SyMON, Solstice, Solstice AutoClient, ShowMe How, SunCD, SunCD 2Plus, Sun StorEdge, SunButtons, SunDials, SunMicrophone, SunFDDI, SunLink, SunHSI, SunATM, SLC, ELC, IPC, IPX, SunSpectrum, JavaStation, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunVIP, SunSolve, and SunSolve EarlyNotifier are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. OpenGL is a registered trademark of Silicon Graphics, Inc. Display PostScript and PostScript are trademarks of Adobe Systems, Incorporated, which may be registered in certain jurisdictions. Netscape is a trademark of Netscape Communications Corporation. DLT is claimed as a trademark of Quantum Corporation in the United States and other countries. Just the Facts May 1999 Positioning The Sun UltraTM 5 Workstation Figure 1. The Ultra 5 workstation The Sun UltraTM 5 workstation is an entry-level workstation based upon the 333- and 360-MHz UltraSPARCTM-IIi processors. The Ultra 5 is Sun’s lowest-priced workstation, designed to meet the needs of price-sensitive and volume-purchase customers in the personal workstation market without sacrificing performance.
    [Show full text]
  • Sun Ultratm 2 Workstation Just the Facts
    Sun UltraTM 2 Workstation Just the Facts Copyrights 1999 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun Logo, Ultra, SunFastEthernet, Sun Enterprise, TurboGX, TurboGXplus, Solaris, VIS, SunATM, SunCD, XIL, XGL, Java, Java 3D, JDK, S24, OpenWindows, Sun StorEdge, SunISDN, SunSwift, SunTRI/S, SunHSI/S, SunFastEthernet, SunFDDI, SunPC, NFS, SunVideo, SunButtons SunDials, UltraServer, IPX, IPC, SLC, ELC, Sun-3, Sun386i, SunSpectrum, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunVIP, SunSolve, and SunSolve EarlyNotifier are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. OpenGL is a registered trademark of Silicon Graphics, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. Display PostScript and PostScript are trademarks of Adobe Systems, Incorporated. DLT is claimed as a trademark of Quantum Corporation in the United States and other countries. Just the Facts May 1999 Sun Ultra 2 Workstation Figure 1. The Sun UltraTM 2 workstation Sun Ultra 2 Workstation Scalable Computing Power for the Desktop Sun UltraTM 2 workstations are designed for the technical users who require high performance and multiprocessing (MP) capability. The Sun UltraTM 2 desktop series combines the power of multiprocessing with high-bandwidth networking, high-performance graphics, and exceptional application performance in a compact desktop package. Users of MP-ready and multithreaded applications will benefit greatly from the performance of the Sun Ultra 2 dual-processor capability.
    [Show full text]
  • Chapter 2 Java Processor Architectural
    INFORMATION TO USERS This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer. The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand comer and continuing from left to right in equal sections with small overlaps. ProQuest Information and Leaming 300 North Zeeb Road, Ann Arbor, Ml 48106-1346 USA 800-521-0600 UMI" The JAFARDD Processor: A Java Architecture Based on a Folding Algorithm, with Reservation Stations, Dynamic Translation, and Dual Processing by Mohamed Watheq AH Kamel El-Kharashi B. Sc., Ain Shams University, 1992 M. Sc., Ain Shams University, 1996 A Dissertation Submitted in Partial Fulfillment of the Requirements for the Degree of D o c t o r o f P h il o s o p h y in the Department of Electrical and Computer Engineering We accept this dissertation as conforming to the required standard Dr. F. Gebali, Supervisor (Department of Electrical and Computer Engineering) Dr.
    [Show full text]
  • Debugging Multicore & Shared- Memory Embedded Systems
    Debugging Multicore & Shared- Memory Embedded Systems Classes 249 & 269 2007 edition Jakob Engblom, PhD Virtutech [email protected] 1 Scope & Context of This Talk z Multiprocessor revolution z Programming multicore z (In)determinism z Error sources z Debugging techniques 2 Scope and Context of This Talk z Some material specific to shared-memory symmetric multiprocessors and multicore designs – There are lots of problems particular to this z But most concepts are general to almost any parallel application – The problem is really with parallelism and concurrency rather than a particular design choice 3 Introduction & Background Multiprocessing: what, why, and when? 4 The Multicore Revolution is Here! z The imminent event of parallel computers with many processors taking over from single processors has been declared before... z This time it is for real. Why? z More instruction-level parallelism hard to find – Very complex designs needed for small gain – Thread-level parallelism appears live and well z Clock frequency scaling is slowing drastically – Too much power and heat when pushing envelope z Cannot communicate across chip fast enough – Better to design small local units with short paths z Effective use of billions of transistors – Easier to reuse a basic unit many times z Potential for very easy scaling – Just keep adding processors/cores for higher (peak) performance 5 Parallel Processing z John Hennessy, interviewed in the ACM Queue sees the following eras of computer architecture evolution: 1. Initial efforts and early designs. 1940. ENIAC, Zuse, Manchester, etc. 2. Instruction-Set Architecture. Mid-1960s. Starting with the IBM System/360 with multiple machines with the same compatible instruction set 3.
    [Show full text]
  • Sun SPARC Enterprise T5440 Servers
    Sun SPARC Enterprise® T5440 Server Just the Facts SunWIN token 526118 December 16, 2009 Version 2.3 Distribution restricted to Sun Internal and Authorized Partners Only. Not for distribution otherwise, in whole or in part T5440 Server Just the Facts Dec. 16, 2009 Sun Internal and Authorized Partner Use Only Page 1 of 133 Copyrights ©2008, 2009 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Sun Fire, Sun SPARC Enterprise, Solaris, Java, J2EE, Sun Java, SunSpectrum, iForce, VIS, SunVTS, Sun N1, CoolThreads, Sun StorEdge, Sun Enterprise, Netra, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, and SunSpectrum Bronze are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. T5440 Server Just the Facts Dec. 16, 2009 Sun Internal and Authorized Partner Use Only Page 2 of 133 Revision History Version Date Comments 1.0 Oct. 13, 2008 - Initial version 1.1 Oct. 16, 2008 - Enhanced I/O Expansion Module section - Notes on release tabs of XSR-1242/XSR-1242E rack - Updated IBM 560 and HP DL580 G5 competitive information - Updates to external storage products 1.2 Nov. 18, 2008 - Number
    [Show full text]
  • Day 2, 1640: Leveraging Opensparc
    Leveraging OpenSPARC ESA Round Table 2006 on Next Generation Microprocessors for Space Applications G.Furano, L.Messina – TEC-EDD OpenSPARC T1 • The T1 is a new-from-the-ground-up SPARC microprocessor implementation that conforms to the UltraSPARC architecture 2005 specification and executes the full SPARC V9 instruction set. Sun has produced two previous multicore processors: UltraSPARC IV and UltraSPARC IV+, but UltraSPARC T1 is its first microprocessor that is both multicore and multithreaded. • The processor is available with 4, 6 or 8 CPU cores, each core able to handle four threads. Thus the processor is capable of processing up to 32 threads concurrently. • Designed to lower the energy consumption of server computers, the 8-cores CPU uses typically 72 W of power at 1.2 GHz. G.Furano, L.Messina – TEC-EDD 72W … 1.2 GHz … 90nm … • Is a cutting edge design, targeted for high-end servers. • NOT FOR SPACE USE • But, let’s see which are the potential spin-in … G.Furano, L.Messina – TEC-EDD Why OPEN ? On March 21, 2006, Sun made the UltraSPARC T1 processor design available under the GNU General Public License. The published information includes: • Verilog source code of the UltraSPARC T1 design, including verification suite and simulation models • ISA specification (UltraSPARC Architecture 2005) • The Solaris 10 OS simulation images • Diagnostics tests for OpenSPARC T1 • Scripts, open source and Sun internal tools needed to simulate the design and to do synthesis of the design • Scripts and documentation to help with FPGA implementation
    [Show full text]
  • Sun Ultratm 25 Workstation & Sun Ultra 45 Workstation Just the Facts
    Sun UltraTM 25 Workstation & Sun Ultra 45 Workstation Just the Facts SunWIN Token# 473547 SunWIN Token# 460409 Copyrights © 2006 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Ultra, Sun Blade, Java, Solaris, Java, NetBeans, Sun Fire, Sun StorEdge, SunLink, SunSpectrum, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunSolve, SunPCi, and SunVTS are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and other countries, exclusively licensed through X/Open Company, Ltd. Ultra 25/45 JTF - 12/10/07 Sun Confidential – Internal Use Only 2 Table of Contents Positioning.....................................................................................................................................................................4 Introduction...............................................................................................................................................................4 Product Family Placement .......................................................................................................................................5 Sun Ultra 45 vs Sun Ultra 25 Workstation...............................................................................................................5
    [Show full text]
  • Ultrasparctm II Microprocessor
    UltraSPARCTM II Microprocessor High-Performance, Highly-Scalable, Multiprocess- The UltraSPARC II processor microarchitecture is designed ing, 64-bit SPARC™ V9 RISC Microprocessor to provide up to 4-way glueless multiprocessing support and supports up to 64-way systems. The processor sup- ports multiple L2 cache speeds and sizes to enable high- performance multiprocessing systems. Balanced overall system performance requires optimal performance along three critical levels: memory band- width, media processing, and raw compute performance. A highly-scalable, high-performance system interconnect ensures a bottleneck-free computing environment result- ing in high memory bandwidth. VIS™ (Visual Instruction Set) multimedia extensions boost the performance of graphics-intensive multimedia applications, and thus reduce overall system costs by eliminating the need for a special-purpose media processor. And the UltraSPARC II delivers superior raw compute performance by using the Placeholder for illustration or photo most innovative RISC microprocessor architecture and state-of-the-art process technology. The UltraSPARC II processor not only helps the system designer by implementing industry-standard testing and instrumentation interfaces, it also uses Error Checking & Correction (ECC) and parity to increase system reliability. With high performance, high scalability, and high reliabil- ity, the UltraSPARC II is the processor of choice for today’s workstations and servers. FEATURES • Full 64-bit implementation of SPARC V9 architecture • 100% binary compatibility with previous versions of SPARC systems The state-of-the-art UltraSPARC™ II processor is the second • Built-in MP support (glueless 4-way and up to 64-way) generation in the UltraSPARC s-series microprocessor fam- • High-performance UPA system interconnect ily.
    [Show full text]
  • Overview of Sun Integrated Lights out Manager
    Overview of Sun Integrated Lights Out Manager February 2008 Sun Microsystems, Inc. Copyright © 2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. U.S. Government Rights - Commercial software. Government users are subject to the Sun Microsystems, Inc. standard license agreement and applicable provisions of the FAR and its supplements. Use is subject to license terms. This distribution may include materials developed by third parties. Parts of the product may be derived from Berkeley BSD systems, licensed from the University of California. UNIX is a registered trademark in the U.S. and in other countries, exclusively licensed through X/Open Company, Ltd. X/Open is a registered trademark of X/Open Company, Ltd. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. Intel is a trademark or registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. AMD and Opteron are trademarks or registered trademarks of Advanced Micro Devices. Sun, Sun Microsystems, the Sun logo, Java, Solaris, Sun Blade, Sun Fire, and Sun SPARC Enterprise are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. This product is covered and controlled by U.S. Export Control laws and may be subject to the export or import laws in other countries. Nuclear, missile, chemical biological weapons or nuclear maritime end uses or end users, whether direct or indirect, are strictly prohibited.
    [Show full text]
  • Ultrasparc-III Ultrasparc-III Vs Intel IA-64
    UltraSparc-III UltraSparc-III vs Intel IA-64 vs • Introduction Intel IA-64 • Framework Definition Maria Celeste Marques Pinto • Architecture Comparition Departamento de Informática, Universidade do Minho • Future Trends 4710 - 057 Braga, Portugal [email protected] • Conclusions ICCA’03 ICCA’03 Introduction Framework Definition • UltraSparc-III (US-III) is the third generation from the • Reliability UltraSPARC family of Sun • Instruction level Parallelism (ILP) • Is a RISC processor and uses the 64-bit SPARC-V9 architecture – instructions per cycle • IA-64 is Intel’s extension into a 64-bit architecture • Branch Handling • IA-64 processor is based on a concept known as EPIC (Explicitly – Techniques: Parallel Instruction Computing) • branch delay slots • predication – Strategies: • static •dynamic ICCA’03 ICCA’03 Framework Definition Framework Definition • Memory Hierarchy • Pipeline – main memory and cache memory – increase the speed of CPU processing – cache levels location – several stages that performs part of the work necessary to execute an instruction – cache organization: • fully associative - every entry has a slot in the "cache directory" to indicate • Instruction Set (IS) where it came from in memory – is the hardware "language" in which the software tells the processor what to do • one-way set associative - only a single directory entry be searched – can be divided into four basic types of operations such as arithmetic, logical, • two-way set associative - two entries per slot to be searched (and is extended to program-control
    [Show full text]
  • Sun Blade 1000 and 2000 Workstations
    Sun BladeTM 1000 and 2000 Workstations Just the Facts Copyrights 2002 Sun Microsystems, Inc. All Rights Reserved. Sun, Sun Microsystems, the Sun logo, Sun Blade, PGX, Solaris, Ultra, Sun Enterprise, Starfire, SunPCi, Forte, VIS, XGL, XIL, Java, Java 3D, SunVideo, SunVideo Plus, Sun StorEdge, SunMicrophone, SunVTS, Solstice, Solstice AdminTools, Solstice Enterprise Agents, ShowMe, ShowMe How, ShowMe TV, Sun Workstation, StarOffice, iPlanet, Solaris Resource Manager, Java 2D, OpenWindows, SunCD, Sun Quad FastEthernet, SunFDDI, SunATM, SunCamera, SunForum, PGX32, SunSpectrum, SunSpectrum Platinum, SunSpectrum Gold, SunSpectrum Silver, SunSpectrum Bronze, SunSolve, SunSolve EarlyNotifier, and SunClient are trademarks, registered trademarks, or service marks of Sun Microsystems, Inc. in the United States and other countries. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. in the United States and other countries. Products bearing SPARC trademarks are based upon an architecture developed by Sun Microsystems, Inc. UNIX is a registered trademark in the United States and in other countries, exclusively licensed through X/Open Company, Ltd. FireWire is a registered trademark of Apple Computer, Inc., used under license. OpenGL is a trademark of Silicon Graphics, Inc., which may be registered in certain jurisdictions. Netscape is a trademark of Netscape Communications Corporation. PostScript and Display PostScript are trademarks of Adobe Systems, Inc., which may be registered in
    [Show full text]