High-Performance Throughput Computing


Throughput computing, achieved through multithreading and multicore technology, can lead to performance improvements that are 10 to 30 times those of conventional processors and systems. However, such systems should also offer good single-thread performance. Here, the authors show that hardware scouting increases the performance of an already robust core by up to 40 percent for commercial benchmarks.

Shailender Chaudhry, Paul Caprioli, Sherman Yip, and Marc Tremblay, Sun Microsystems

Published by the IEEE Computer Society, 0272-1732/05/$20.00 © 2005 IEEE, May–June 2005.

This article was based on a keynote speech by Marc Tremblay at the 2004 International Symposium on Computer Architecture in Munich, with some updated information.

Microprocessors took over the entire computer industry in the 1970s and 1980s, leading to the phrase "the attack of the killer micros" as a popular topic on the newsgroup comp.arch for many years. We believe that the disruption offered by throughput computing is similar in scope. Over the next several years, we expect chip multithreading (CMT) processors to take over all computing segments, from laptops and desktops to servers and supercomputers, effectively creating "the attack of throughput computing." Most microprocessor companies have now announced plans and/or products with multiple cores and/or multiple threads per core. Sun Microsystems has been developing CMT technology for almost a decade through the Microprocessor Architecture for Java Computing (MAJC, pronounced "magic") program [1,2] and a variety of SPARC processors [3-7]. IBM, through the Power4 and Power5 [8,9], and Intel, through hyperthreading, are shipping products in this space. AMD has announced plans to ship dual-core chips in late 2005. What is misunderstood, though, is how you can build a microprocessor that has several multithreaded cores and still delivers high-end single-thread performance. The combination of high-end single-thread performance and a high degree of multicore multithreading (for example, tens of threads) is what we call high-performance throughput computing, the topic of this article.

A variety of papers have shown how workloads running on servers [10] and desktops [11] can greatly benefit from thread-level parallelism. It is no surprise that large symmetric multiprocessors (SMPs) were so successful in the 1990s in delivering very high throughput in terms of transactions per second, for instance, while running large, scalable, multithreaded applications. Successive generations of SMPs delivered better individual processors, better interconnects (certainly in terms of bandwidth, not necessarily in terms of latency), and better scalability. Gradually, applications and operating systems have matched the hardware scalability. A decade later, architects designing CMT processors face a dilemma: Many applications and some operating systems already scale to tens of threads. This leads to the question, "Should you populate a processor die with many very simple cores, or should you use fewer but more powerful cores?" We simulated (and built, or are building) two very different cores and two different threading models, and discuss how they benefit from multithreading and from growing the number of cores on a chip. We show that if multithreading is the prime feature and the rest of the core (that is, the rest of the pipeline and the memory subsystem) is architected around it, as opposed to simply adding threading to an existing core, these two different cores benefit greatly from multithreading (a 1.8 to 3.2× increase in throughput) and from multicores (a 3.1 to 3.7× increase). Server workloads dictate that high throughput be the primary goal. On the other hand, the impact of critical sections, Amdahl's law, response-time requirements, and pipeline efficiency force us to try to design a high-performance single-thread pipeline, although not at the expense of throughput.

Unfortunately, as we discuss in a later section, two big levers that industry traditionally used—clock rate and traditional out-of-order execution—no longer work very well in the context of an aggressive 65-nm process and memory latencies now ranging in the hundreds of cycles. Clearly the industry needs new techniques. This article describes hardware scouting, in which the processor launches a hardware thread (invisible to software) that runs in front of the head thread. The goal here is to bring all interesting data and instructions (and control state) into the on-chip caches. Scouting heavily leverages some of the existing threaded hardware to boost single-thread performance. The control speculation accuracy, the scout's depth, the memory-level parallelism (MLP) impact, and the overall effect on performance and cache sizes form the core of this article. We will show that this microarchitecture technique will, for some configurations, improve performance by 40 percent on TPC-C, double the effective L2 cache size for SPECint2000, and make a 1-Mbyte L2 cache behave as a 64-Mbyte one for SPECfp2000.

Researchers have proposed several other techniques to reduce the impact of ever-increasing memory latencies, as the "Scaling the Memory Wall" sidebar explains.

Throughput computing

Systems designed for throughput computing emphasize the overall work performed over a fixed time period, as opposed to focusing on a metric describing how fast a single core or a thread executes a benchmark. The work is the aggregate amount of computation performed by all functional units, all threads, all cores, all chips, all coprocessors (such as network and cryptography accelerators), and all the network interface cards in a system. The scope of throughput computing is broad, especially if the microarchitecture and operating system collaborate to enable a thread to handle a full process. In this way, a 32-thread CMT can appear to be an SMP of 32 virtual cores from the viewpoint of the operating system and applications. At one end of the spectrum, 32 different processes could run on a 32-way CMT. At the other end, a single application scaling to 32 threads—like many large commercial applications running on today's SMPs—can run on a CMT. Any point in between is also possible. For instance, you could run four application servers, each scaling to eight threads. Obviously, there are sweet spots, and it is the task of the load-balancing software and operating system to appropriately leverage the hardware.

Throughput computing relies on the fact that server processors run large data sets and/or a large number of distinct jobs, resulting in memory footprints that stress all levels of the memory hierarchy. This stress results in much higher miss rates for memory operations than those typical of SPECint2000, for instance. A compounding effect is the miss cost. In about 2007, servers will already be driving over 1 Tbyte of data provided by hundreds of dual inline memory modules. This makes the memory physically far from processors (requiring long board traces, heavy loading, backplane connectors, and so on), resulting in large cache-miss penalties. As a result, cores are often idle, waiting for data or instructions coming from distant […]

Sidebar: Scaling the Memory Wall

Tolerating memory latency has of course been a goal for architects and researchers for a long time. Various designs have employed caches to tolerate memory latency by using the temporal and spatial locality that most programs exhibit. To improve the latency tolerance of caches, other designs have employed nonblocking caches that support load hits under a miss and multiple misses.

The use of software prefetching has proven effective when the compiler can statically predict which memory reference will cause a miss and disambiguate pointers [1,2]. This is effective for a small subset of applications. In particular, however, it is not very effective in commercial database applications, a prime target for servers. Notice that software prefetching can also hurt performance if it

• generates extraneous load misses by speculatively hoisting multiple loads above branches based on static analysis, and
• increases instruction bandwidth.

Hardware prefetching alleviates the instruction cache pressure, using dynamic information to predict misses and discover access patterns. Its drawback (and the reason why system vendors often turn it off) is that it is hard to map a single hardware algorithm to different applications, which would require a variety of algorithms. We have observed severe cache pollution when applying a generic hardware prefetch to codes with significantly dif- […]

[…] no longer mostly rely on faster transistors and faster wires to provide an easy performance scaling, certain ways of using the transistor budget compensate for the lack of scaling; these reach high levels of performance, especially in the context of throughput computing.

References
1. C.-K. Luk and T.C. Mowry, "Compiler-Based Prefetching for Recursive Data Structures," Proc. 7th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 96), ACM Press, 1996, pp. 222-233.
2. T.C. Mowry, M.S. Lam, and A. Gupta, "Design and Evaluation of a Compiler Algorithm for Prefetching," Proc. 5th Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS 92), ACM Press, 1992, pp. 62-73.
3. S. Chaudhry and M. Tremblay, "Method and Apparatus for Using an Assist Processor to Pre-fetch Data Values for a Primary Processor," US Patent 6,415,356.
4. J.D. Collins et al., "Dynamic Speculative Precomputation," Proc. 34th Ann. ACM/IEEE Int'l Symp. Microarchitecture (Micro-34), IEEE CS Press, 2001, pp.
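The compiler-inserted prefetching described in the sidebar can be sketched with GCC's __builtin_prefetch intrinsic (a hint the hardware may ignore). The linked-list walk below is a hypothetical example; the node layout and the one-node prefetch distance are invented for illustration:

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Sum a linked list, prefetching one node ahead so its cache line is
 * (ideally) resident by the time the loop reaches it.  The second
 * argument (0) means "read"; the third (1) means low temporal locality. */
long sum_list(struct node *head)
{
    long sum = 0;
    for (struct node *n = head; n != NULL; n = n->next) {
        if (n->next != NULL)
            __builtin_prefetch(n->next->next, 0, 1);
        sum += n->value;
    }
    return sum;
}
```

A one-node lookahead rarely hides hundreds of cycles of miss latency, which illustrates why, as the sidebar notes, software prefetching pays off only for a small subset of applications.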
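To make the "single hardware algorithm" limitation concrete, here is a toy model of one classic hardware prefetch algorithm: a per-load-PC stride detector. The table-entry layout and the two-hit confidence rule are invented for illustration and stand in for many real designs:

```c
#include <stdint.h>

/* One entry of a hypothetical stride-prediction table, indexed by the
 * program counter of a load instruction. */
struct stride_entry {
    uint64_t last_addr;   /* last address this load accessed */
    int64_t  stride;      /* last observed stride */
    int      confident;   /* set once the same stride is seen twice */
};

/* Record a load at `addr`; return the address to prefetch next, or 0
 * when there is no confident prediction yet. */
uint64_t stride_observe(struct stride_entry *e, uint64_t addr)
{
    int64_t s = (int64_t)(addr - e->last_addr);
    e->confident = (e->last_addr != 0 && s == e->stride);
    e->stride = s;
    e->last_addr = addr;
    return e->confident ? addr + (uint64_t)s : 0;
}
```

A sequential scan (addresses 100, 164, 228, ...) trains such an entry in two accesses, while a pointer-chasing workload with irregular addresses never becomes confident; a generic prefetcher that guesses anyway is one source of the cache pollution the authors report observing.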
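Returning to the hardware scouting idea from the main text, its memory-level-parallelism benefit can be caricatured in a toy timing model. A blocking in-order core pays each miss latency in sequence, whereas a scout running in the shadow of the first miss issues the later loads early so their misses overlap. The cache geometry, the 200-cycle miss latency, and the assumption of perfect overlap are all invented for illustration; this is a best-case sketch, not the authors' simulator:

```c
#include <stdint.h>
#include <string.h>

#define LINES      64   /* direct-mapped cache, 64 lines */
#define LINE_SHIFT 6    /* 64-byte lines */
#define MISS_LAT   200  /* invented miss latency, in cycles */

struct cache { uint64_t tag[LINES]; int valid[LINES]; };

/* Look up `addr`; returns 1 on a hit and installs the line on a miss. */
static int access_line(struct cache *c, uint64_t addr)
{
    uint64_t line = addr >> LINE_SHIFT;
    unsigned idx = (unsigned)(line % LINES);
    if (c->valid[idx] && c->tag[idx] == line) return 1;
    c->valid[idx] = 1;
    c->tag[idx] = line;
    return 0;
}

/* Core that blocks on every miss: miss latencies serialize. */
long stall_blocking(const uint64_t *trace, int n)
{
    struct cache c; memset(&c, 0, sizeof c);
    long stall = 0;
    for (int i = 0; i < n; i++)
        if (!access_line(&c, trace[i])) stall += MISS_LAT;
    return stall;
}

/* Core with a scout: while the first miss is outstanding, the scout
 * runs ahead and issues the remaining loads, so (best case) every miss
 * in the window overlaps the first and the total stall is one latency. */
long stall_scouting(const uint64_t *trace, int n)
{
    struct cache c; memset(&c, 0, sizeof c);
    int missed = 0;
    for (int i = 0; i < n; i++)
        if (!access_line(&c, trace[i])) missed = 1;
    return missed ? MISS_LAT : 0;
}
```

For four loads to four distinct lines, the blocking core stalls 800 cycles and the scouting core 200; the gap between the two is the MLP that scouting recovers.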