The AMD Opteron™ CMP Northbridge Architecture: Now and in the Future

Total Page:16

File Type:pdf, Size:1020Kb

The AMD Opteron™ CMP Northbridge Architecture: Now and in the Future The AMD Opteron™ CMP NorthBridge Architecture: Now and in the Future Pat Conway & Bill Hughes August, 2006 AMD Opteron™ – The Industry’s First Native Dual-Core 64-bit x86 Processor Integration: • Two 64-bit CPU cores • 2MB L2 cache • On-chip Router & Memory Controller Bandwidth: • Dual channel DDR (128-bit) memory bus • 3 HyperTransport™ (HT) links (16-bit each x 2 GT/sec x 2) Usability and Scalability: • Socket compatible: Platform and TDP! • Glueless SMP up to 8 sockets • Memory capacity & BW scale w/ CPUs Power Efficiency: • AMD PowerNow!™ Technology with optimized power management • Industry-leading system level power efficiency 2 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future AMD Opteron™ – The Industry’s First Native Dual-Core 64-bit x86 Processor 3 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future A Clean Break with the Past Chip Chip Chip Chip Chip Chip Chip Chip X X X X X X X X SRQ SRQ MCPMCP MCPMCP MCPMCP MCPMCP Crossbar Crossbar Mem.Ctrlr HT Mem.Ctrlr HT 8 GB/S PCI-EPCI-E USB Memory BridgeBridge USB Memory PCI-EPCI-E SRQ SRQ I/O Hub Controller I/OI/OI/O Hub HubHub Controller BridgeTM Crossbar Crossbar PCIeBridgeTM Hub PCIe Mem.Ctrlr HT Mem.Ctrlr HT PCIPCI Hub BridgeBridge 8 GB/S 8 GB/S TM TM PCIePCIeTM PCIePCIeTM Bridge Bridge XMBXMB XMBXMB XMBXMB XMBXMB Bridge Bridge 8 GB/S USBUSB I/OI/O HubHub PCIPCI Legacy x86 Architecture AMD64’s Direct Connect Architecture • 20-year old traditional front-side bus (FSB) architecture • Industry-standard technology • CPUs, Memory, I/O all share a bus • Direct Connect Architecture reduces FSB bottlenecks • Major bottleneck to performance • HyperTransport™ interconnect offers scalable high bandwidth and low latency • Faster CPUs or more cores ≠ performance • 4 memory controllers – increases memory capacity and bandwidth 4 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future 4P System — Board Layout 5 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future System Overview DRAM DRAM 12.8 GB/s 128-bit MCT Core 0 MCT Core 0 Core 1 Core 1 SRI SRI ncHT XBAR XBAR I/O I/O I/O HT HT HT HT HT-HB HT-HB 4.0GB/s per cHT direction @ 2GT/s Data HT HT Rate I/O I/O HT HT HT-HB XBAR XBAR HT-HB Core 0 Core 0 Core 1 Core 1 “NorthBridge” SRI SRI MCT MCT DRAM DRAM 6 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Northbridge Microarchitecture Overview Core0 Core1 Virtual Channel Use Data Request • Read Y L2 L2 • Non-posted Write • Cache Block Commands Posted Request Posted Writes Y System Probe Broadcast probes N Request APIC Response • Read Response Y Interface (interrupts) • Probe response (SRI) • Completion Memory DRAM Crossbar Controller Controller 2 DDR2 channels (XBAR) (MCT) HT0 HT1 HT2 7 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Northbridge Command Flow Core 0 Victim Buffer (8-entry) Write Buffer (4-entry) Instruction MAB (2-entry) Data MAB (8-entry) All buffers are 64-bit Core 1 command/address System Request to Queue Core 24-entry Memory Command to Address MAP HT0 Input HT1 Input HT2 Input Queue DCT & GART 20-entry Router Router Router Router Router 10-entry Buffer 16-entry Buffer 16-entry Buffer 16-entry Buffer 12-entry Buffer XBAR HT0 Output HT1 Output HT2 Output 8 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Northbridge Data Flow Core 0 All buffers are 64-byte cache lines Victim Buffer (8-entry) Write Buffer (4-entry) from Core 1 Host Bridge HT0 input HT1 input HT2 input from DCT 8-entry Buffer 8-entry Buffer 8-entry Buffer 5-entry Buffer 8-entry Buffer XBAR XBAR HT0 output HT1 output HT2 output System Request Memory Data Queue Data Queue 12-entry 8-entry to Core to Host to DCT Bridge 9 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Lessons Learned #1 Allocation of XBAR Command buffer across Virtual Channels can have big impact on performance Memory 0 L2 L2 Memory 2 5: D 4 : D 3 4: RP0 : R D 3: PI2 P 0 P 2 3: PI0 2: RD MP traffic analysis gives 4: RP2 5: RP0 the best allocation 3: PI1 e.g. Opteron Read Transaction 6: D 8:SD Request (2 visits) 1: RD Probe (3 visits) 4: PI1 5: RP1 Response (8 visits) P 1 P 3 7: SD Memory 1 L2 L2 Memory 3 10 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Lessons Learned #2 Memory Latency is the Key to Application Performance! Performance vs Average Memory Latency Performance (single 2.8GHz core, 400MHz vs DDR2 Average PC3200, 2GT/s HT Memory with 1MB cache in MP Latency system) (single 2.8GHz7 core, 400MHz DDR2 PC3200, 2GT/s HT with 1MB cache120% in MP system) 7 6 100% 120% 5 80% OLTP1 6 4 100% OLTP2 60% 3 SW99 5 SSL 40% 80% OLTP12 JBB 4 OLTP21 20% System Performance 60% 3 SW990 0% Processor Performance SSL 1N 2N 4N (SQ) 8N (TL) 8N (L) 40% 2 JBB 4 Node Square 1 1 Node 8N Ladder 20% I/O I/O System Performance I/O P0 P2 P4 P6 I/O P0 P0 P2 0 0% Performance Processor P1 P3 I/O I/O P1 P3 P5 P7 I/O 1N 2N 4NI/O (SQ)I/O 8N (TL) 8N (L) AvgD 0 hops 1 hops 1.8 hops Latency x + 0ns x + 44ns (124 cpuclk) x + 105ns (234 cpuclk) 8N Twisted Ladder 4 Node Square 8N Ladder 1 Node I/O P0 P2 P4 P6 I/O I/O I/O I/O P0 P2 P4 P6 I/O P0 2 Node P0 P2 I/O P1 P3 P5 P7 I/O I/O P0 P1 I/O 0.5 hops 1.5 hops P1 P3 I/O x + 17ns (47 cpuclk) x + 76ns (214 cpuclk) I/O P1 P3 P5 P7 I/O 11 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Looking Forward HyperTransport™-based Accelerators Imagine it, Build it • Open platform for SOASOA XMLXML system builders (“Torrenza”) AcceleratorAccelerator AcceleratorAccelerator – 3rd Party Accelerators –Media Native Native Dual-Core Dual-Core –FLOPs 8 GB/S –XML SRQ SRQ –SOA Crossbar Crossbar Mem.Ctrlr HT Mem.Ctrlr HT • AMD Opteron™ Socket or HTX 8 GB/S 8 GB/S slot Other Other PCI-EPCI-E FLOPs Optimized Bridge FLOPs Optimized Bridge Accelerator • HyperTransport interface is an SiliconSilicon Accelerator open standard see 8 GB/S hypertransport.org USBUSB I/OI/O HubHub • Coherent HyperTransport PCIPCI interface available if the accelerator caches system memory (under license) 13 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future AMD’s Next Generation Processor Technology Ideal for 65nm SOI and beyond Native quad core die Expandable shared L3 cache Enhanced Direct IPC enhanced Connect Architecture CPU cores and Northbridge • 32B instruction fetch • Four ungangable x16 HyperTransportTM links • Improved branch prediction (up to 5.2GT/sec) • Out-of-order load execution • Enhanced crossbar • Up to 4 DP FLOPS/cycle • Next-generation • Dual 128-bit SSE dataflow memory support • Dual 128-bit loads per cycle • FBDIMM when appropriate • Bit Manipulation extensions (LZCNT/POPCNT) • Enhanced power • SSE extensions (EXTRQ/INSERTQ, MOVNTSD/MOVNTSS) management and RAS 14 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Balanced, Highly Efficient Cache Structure Efficient memory handling reduces the need for “brute force” cache sizes L1 L2 L3 Dedicated L1 64KB 512KB • Locality keeps most critical 2+MB Cache Cache Core 1 Control data in the L1 cache •Low latency • 2 128 bit data paths L1 L2 • 2 loads per cycle C2 L1 L2 C3 L1 L2 C4 15 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Balanced, Highly Efficient Cache Structure Efficient memory handling reduces the need for “brute force” cache sizes L1 L2 L3 64KB 512KB Dedicated L2 2+MB Cache Cache Core 1 Control • Sized to accommodate the majority of working sets L1 L2 today C2 • Dedicated to help eliminate conflicts common in shared L1 L2 caches C3 L1 L2 C4 16 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Balanced, Highly Efficient Cache Structure Efficient memory handling reduces the need for “brute force” cache sizes L1 L2 L3 64KB 512KB 2+MB Cache Cache Core 1 Control Shared L3 – Coming Soon L1 • Allocation policy which L2 optimizes movement, C2 placement and replication L1 of data for multi-core L2 • Ready for expansion C3 L1 L2 C4 17 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future Additional HyperTransport™ Ports • Enable Fully Connected 4 Node (four x16 HT) and 8 Node (eight x8 HT) • Reduced network diameter – Fewer hops to memory • Increased Coherent Bandwidth –more links Four x16 HT – cHT packets visit fewer links –HyperTransport3 OR • Benefits – Low latency because of lower diameter – Evenly balanced utilization of HyperTransport links – Low queuing delays Low latency under load Eight x8 HT 18 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future 4 Node Performance 4 Node Square 4 Node fully connected I/O I/O I/O I/O 16 16 P0 P2 P0 P2 P1 P3 P1 P3 I/O I/O I/O I/O + 2 EXTRA LINKS W/ HYPERTRANSPORT3 4N SQ (2GT/s 4N FC (2GT/s 4N FC (4.4GT/s HyperTransport) HyperTransport) HyperTransport3) Diam 2 Avg Diam 1.00 Diam 1 Avg Diam 0.75 Diam 1 Avg Diam 0.75 XFIRE BW 14.9GB/s XFIRE BW 29.9GB/s XFIRE BW 65.8GB/s (2X) (4X) XFIRE (“crossfire”) BW is the link-limited all-to-all communication bandwidth (data only) 19 21 August 2006 The Opteron CMP NorthBridge Architecture, Now and in the Future 8 Node Performance 8 Node Fully Connected 8 Node8 Node 6HT 2x4 2x4 I/O I/O 8N Twisted Ladder 8 P0 P2 P5 P4 16 8 I/O P0 P2 P4 P6 I/O I/OI/O 5 4 3 4 3 5 I/O P1 P3 I/O P0 P7 I/O P0 0 0 P2 1 1 OR 2 2 I/O P1 P6 I/O 4 3 4 I/O P1 P3 P5 P7 I/O 2 2 1 1 P4 P6 P1 0 0 P3 P2 P3 5 4 3 4 3 5 I/OI/O I/O P5 P7 I/O I/O 8N TL (2GT/s 8N 2x4 (4.4GT/s 8N FC (4.4GT/s HyperTransport) HyperTransport3)
Recommended publications
  • ATX-945G Industrial Motherboard in ATX Form Factor with Intel® 945G Chipset
    ATX-945G Industrial Motherboard in ATX form factor with Intel® 945G chipset Supports Intel® Core™2 Duo, Pentium® D CPU on LGA775 socket 1066/800/533MHz Front Side Bus Intel® Graphics Media Accelerator 950 Dual Channel DDR2 DIMM, maximum 4GB Integrated PCI Express LAN, USB 2.0 and SATA 3Gb/s ATX-945G Product Overview The ATX-945G is an ATX form factor single-processor (x8) and SATA round out the package for a powerful industrial motherboard that is based on the Intel® 945G industrial PC. I/O features of the ATX-945G include a chipset with ICH7 I/O Controller Hub. It supports the Intel® 32-bit/33MHz PCI Bus, 16-bit/8MHz ISA bus, 1x PCI Express Core™2 Duo, Pentium® D, Pentium® 4 with Hyper-Threading x16 slot, 2x PCI Express x1 slots, 3x PCI slots, 1x ISA slot Technology, or Celeron® D Processor on the LGA775 socket, (shared w/ PCI), onboard Gigabit Ethernet, LPC, EIDE Ultra 1066/800/533MHz Front Side Bus, Dual Channel DDR2 DIMM ATA/100, 4 channels SATA 3Gb/s, Mini PCI card slot, UART 667/533/400MHz, up to 4 DIMMs and a maximum of 4GB. The compatible serial ports, parallel port, floppy drive port, HD Intel® Graphic Media Accelerator 950 technology with Audio, and PS/2 Keyboard and Mouse ports. 2048x1536x8bit at 75Hz resolution, PCI Express LAN, USB 2.0 Block Diagram CPU Core™2 Duo Pentium® D LGA775 package 533/800/1066MHz FSB Dual-Core Hyper-Threading 533/800/1066 MHz FSB D I M Northbridge DDR Channel A M ® x Intel 945G GMCH 2 DDRII 400/533/667 MHz D I CRT M M DB-15 DDR Channel B x 2 Discrete PCIe x16 Graphics DMI Interface 2 GB/s IDE Device
    [Show full text]
  • Motherboards, Processors, and Memory
    220-1001 COPYRIGHTED MATERIAL c01.indd 03/23/2019 Page 1 Chapter Motherboards, Processors, and Memory THE FOLLOWING COMPTIA A+ 220-1001 OBJECTIVES ARE COVERED IN THIS CHAPTER: ✓ 3.3 Given a scenario, install RAM types. ■ RAM types ■ SODIMM ■ DDR2 ■ DDR3 ■ DDR4 ■ Single channel ■ Dual channel ■ Triple channel ■ Error correcting ■ Parity vs. non-parity ✓ 3.5 Given a scenario, install and configure motherboards, CPUs, and add-on cards. ■ Motherboard form factor ■ ATX ■ mATX ■ ITX ■ mITX ■ Motherboard connectors types ■ PCI ■ PCIe ■ Riser card ■ Socket types c01.indd 03/23/2019 Page 3 ■ SATA ■ IDE ■ Front panel connector ■ Internal USB connector ■ BIOS/UEFI settings ■ Boot options ■ Firmware upgrades ■ Security settings ■ Interface configurations ■ Security ■ Passwords ■ Drive encryption ■ TPM ■ LoJack ■ Secure boot ■ CMOS battery ■ CPU features ■ Single-core ■ Multicore ■ Virtual technology ■ Hyperthreading ■ Speeds ■ Overclocking ■ Integrated GPU ■ Compatibility ■ AMD ■ Intel ■ Cooling mechanism ■ Fans ■ Heat sink ■ Liquid ■ Thermal paste c01.indd 03/23/2019 Page 4 A personal computer (PC) is a computing device made up of many distinct electronic components that all function together in order to accomplish some useful task, such as adding up the numbers in a spreadsheet or helping you to write a letter. Note that this defi nition describes a computer as having many distinct parts that work together. Most PCs today are modular. That is, they have components that can be removed and replaced with another component of the same function but with different specifi cations in order to improve performance. Each component has a specifi c function. Much of the computing industry today is focused on smaller devices, such as laptops, tablets, and smartphones.
    [Show full text]
  • ECS P4M890T-M2 Manual.Pdf
    Preface Copyright This publication, including all photographs, illustrations and software, is protected under international copyright laws, with all rights reserved. Neither this manual, nor any of the material contained herein, may be reproduced without written consent of the author. Version 1.0B Disclaimer The information in this document is subject to change without notice. The manufacturer makes no representations or warranties with respect to the contents hereof and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. The manufacturer reserves the right to revise this publication and to make changes from time to time in the content hereof without obligation of the manufacturer to notify any person of such revision or changes. Trademark Recognition Microsoft, MS-DOS and Windows are registered trademarks of Microsoft Corp. MMX, Pentium, Pentium-II, Pentium-III, Pentium-4, Celeron are registered trademarks of Intel Corporation. Other product names used in this manual are the properties of their respective owners and are acknowledged. Federal Communications Commission (FCC) This equipment has been tested and found to comply with the limits for a Class B digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reason- able protection against harmful interference in a residential installation. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instructions, may cause harmful interference
    [Show full text]
  • Comparison of High Performance Northbridge Architectures in Multiprocessor Servers
    Comparison of High Performance Northbridge Architectures in Multiprocessor Servers Michael Koontz MS CpE Scholarly Paper Advisor: Dr. Jens-Peter Kaps Co-Advisor: Dr. Daniel Tabak - 1 - Table of Contents 1. Introduction ...............................................................................................................3 2. The x86-64 Instruction Set Architecture (ISA) ...................................................... 3 3. Memory Coherency ...................................................................................................4 4. The MOESI and MESI Cache Coherency Models.................................................8 5. Scalable Coherent Interface (SCI) and HyperTransport ....................................14 6. Fully-Buffered DIMMS ..........................................................................................16 7. The AMD Opteron Northbridge ............................................................................19 8. The Intel Blackford Northbridge Architecture .................................................... 27 9. Performance and Power Consumption .................................................................32 10. Additional Considerations ..................................................................................34 11. Conclusion ............................................................................................................ 36 - 2 - 1. Introduction With the continuing growth of today’s multi-media, Internet based culture, businesses are becoming more dependent
    [Show full text]
  • AMD Opteron™ 4200 Series Processor
    AMD Opteron™ 4200 Series Processor AMD Opteron™ 4200 Series Processor What’s new in the AMD Opteron™ 4200 Series Processor (Codenamed “Valencia”) and the new “Bulldozer” Microarchitecture? Platform Processor Socket Chipset Opteron 4000 Opteron 4200 C32 56x0 / 5100 (codenamed “San Marino” or “Adelaide”) (codenamed “Valencia") (with BIOS update) In their first microarchitecture rebuild since Opteron, AMD introduces the Bulldozer module, implemented in both the Opteron 4200 and 6200 Series. The modules are engineered to emphasize throughput, core count, and parallel server workloads. A module consists of two tightly coupled cores, in which some core resources are shared. The effect is between a core with two threads and a dual-core processor, in which each core is fully independent. Performance, scalability, and efficiency for today’s applications. Processor Features Platform Features . 32nm process technology . 2 DDR3 memory channels, LRDIMM, RDIMM, UDIMM . Up to 8 Bulldozer cores up to 1600 GT/s . Evolved integrated northbridge o 1.25 and 1.35V Low-voltage DRAM o 8M L3 cache o 1.5V also available o Bandwidth improvements . 2 HyperTransport™ technology 3.0 links up to 6.4 GT/s o Memory capacity improvements . Advanced Platform Management Link (APML) for o Power efficiency improvements system management . Up to16MB combined L2 + L3 cache . Compatible with existing C32 socket with BIOS update Product Specifications Clock Frequency / Turbo L3 Max Memory System Bus Process TDP1 / ACP2 Model Cores L2 Cache Core Frequency (GHz) Cache Speed Speed 4284 3.0 / 3.7 8 4 x 2MB 4280 2.8 / 3.5 95 W / 75W 4238 3.3 / 3.7 8MB DDR3-1600 MHz 6.4 GT/s 4234 3.1 / 3.5 6 3 x 2MB 32nm 4226 2.7 / 3.1 4274 HE 2.5 / 3.5 8 4 x 2MB 65 W / 50W 8MB DDR3-1600 MHz 6.4 GT/s 4228 HE 2.8 / 3.6 6 3 x 2MB 35W / 32W 4256 EE 1.6 / 2.8 8 4 x 2MB 8MB DDR3-1600 MHz 6.4 GT/s 1 TDP stands for Thermal Design Power.
    [Show full text]
  • ESPRIMO Mobile V5505
    Issue February 2008 ESPRIMO Mobile V5505 Pages 3 The ESPRIMO Mobile V5505 is a versatile all-round notebook, equally suitable for occasional or professional users, and heavy-duty operation by mobile field sales people. This professional notebook features a superb 15.4-inch WXGA display, an ergonomic keyboard and a built-in super multi DVD writer drive. LAN are integrated for easier connection, courtesy of the latest Intel® Centrino® Duo Mobile Technology based on Intel® 965GM chipset. It has great connections; four USB ports and an integrated 3in1 Card Reader. Connect to printers, scanners, cameras or any other accessory. Elegance Designed for the most demanding mobile users, ESPRIMO Mobile is the perfect synthesis of form, function and style. Power-saving Intel® Centrino® Duo Mobile for long independent working Increased productivity due to latest Intel® 965GM chipset which supports latest technology standards: DDR2 memory, PCI Express and S-ATA Ergonomics Enjoy the viewing quality of brilliant 15.4-inch WXGA TFT display Convenient working with full sized keyboard Connectivity Ideal connectivity through WLAN. Antennas are integrated for best signal reception. Variety of interfaces for best connection to the peripherals Reliability Germany‘s quality standard Award-winning best-in-class manufacturing Mechanical and function stability through extensive quality tests Data Sheet ⏐ Issue: February 2008⏐ ESPRIMO Mobile V5505 Page 2 / 3 System ESPRIMO Mobile V5505 Processor Intel® Core™2 Duo Processor Up to T7500 (2.2 GHz) Second level
    [Show full text]
  • Blackford:Blackford: AA Dualdual Processorprocessor Chipsetchipset Forfor Serversservers Andand Workstationsworkstations
    Blackford:Blackford: AA DualDual ProcessorProcessor ChipsetChipset forfor ServersServers andand WorkstationsWorkstations Kai Cheng, Sundaram Chinthamani, Sivakumar Radhakrishnan, Fayé Briggs and Kathy Debnath IntelIntel CorporationCorporation 8/22/20068/22/2006 © Copyright 2006, Intel Corporation. All rights reserved. Hot Chips 2006 *Third party marks and brands are the property of their respective owners. Digital Enterprise Group 1 LegalLegal DisclaimeDisclaimerr • Intel, the Intel logo, Centrino, the Centrino logo, Intel Core, Core Inside, Pentium, Pentium Inside, Itanium, Itanium Inside, Xeon, Xeon Inside, Pentium III Xeon, Celeron, Celeron Inside, and Intel SpeedStep are trademarks or registered trademark of Intel Corporation or its subsidiaries in the United States and other countries. • This document is provided “as is” with no warranties whatsoever, including any warranty of merchantability, non-infringement fitness for any particular purpose, or any warranty otherwise arising out of any proposal, specification or sample • Information in this document is provided in connection with Intel products. No license, express or implied, by estoppels or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. • Intel does not control or audit the design or implementation of 3rd party benchmarks or websites referenced in this document.
    [Show full text]
  • Memory Controller Design and Optimizations for High-Performance Systems
    MEMORY CONTROLLER DESIGN AND OPTIMIZATIONS FOR HIGH-PERFORMANCE SYSTEMS Project Report By Yanwei Song, Raj Parihar In Partial Fulfillment of the Course CSC458 – Parallel and Distributed Systems April 2009 University of Rochester Rochester, New York TABLE OF CONTENT ABSTRACT........................................................................................................................ 3 1. INTRODUCTION ...................................................................................................... 4 2. OVERVIEW: DRAM BASCIS .................................................................................. 5 3. DRAM MEMORY CONTROLLER .......................................................................... 7 4. PARALLELISM IN MEMORY SYSTEM.............................................................. 10 5. SURVEY: TECHNIQUES/ IMPLEMENTATIONS ............................................... 11 6. SIMULATIONS AND ANALYSIS......................................................................... 13 7. FUTURE WORK...................................................................................................... 17 8. CONCLUSION......................................................................................................... 18 REFERENCES ................................................................................................................. 19 2 ABSTRACT “Memory wall” continues to grow despite the technological advancement and gap between memory sub-system and processor’s clock speed is still increasing. On-chip
    [Show full text]
  • AMD Athlon Northbridge with 4X AGP and Next Generation Memory
    AMD AthlonTM Northbridge with 4x AGP and Next Generation Memory Subsystem Chetana Keltcher, Jim Kelly, Ramani Krishnan, John Peck, Steve Polzin, Sridhar Subramanian, Fred Weber Advanced Micro Devices, Sunnyvale, CA * AMD Athlon was formerly code-named AMD-K7 1 Outline of the Talk • Introduction • Architecture • Clocking and Gearbox • Performance • Silicon Statistics • Conclusion 2 Introduction • NorthBridge: “Electronic traffic cop” that directs data flow between the main memory and the rest of the system • Bridge the gap between processor speed and memory speed – Higher bandwidth busses • Example: AGP 2.0, EV6 and AMD Athlon system bus – Better memory technology • Example: Double data rate SDRAM, RDRAM 3 System Block Diagram 100MHz for PC-100 SDRAM AMD Athlon 200MHz for DDR SDRAM System Bus 533-800MHz for RDRAM DRAM CPU 200 MHz, 64 bits NorthBridge Scaleable to 400 MHz 66MHz, 32 bits (1x,2x,4x) Graphics CPU AGP Bus Device 33MHz, 32 bits PCI PCI Bus Devices SouthBridge ISA Bus IDE USBSerial Printer Port Port 4 Features of the AMD Athlon Northbridge • Can support one or two AMD Athlon or EV6 processors • 200MHz data rate (scaleable to 400MHz), 64-bit processor interface • 33MHz, 32-bit PCI 2.2 compliant interface • 66MHz, 32-bit AGP 2.0 compliant interface supports 1x, 2x and 4x data transfer modes • Versions for SDRAM, DDR SDRAM and RDRAM memory • Single bit error correction and multiple bit error detection (ECC) • Distributed Graphics Aperture Remapping Table (GART) • Power management features including powerdown self-refresh of SDRAM
    [Show full text]
  • TECHNICAL MANUAL of Intel 945GC +Intel 82801G Based Mini-ITX M/B
    TECHNICAL MANUAL Of Intel 945GC +Intel 82801G Based Mini-ITX M/B For Intel Atom Processor NO.G03-NC92-F Rev1.0 Release date: Oct., 2008 Trademark: * Specifications and Information contained in this documentation are furnished for information use only, and are subject to change at any time without notice, and should not be construed as a commitment by manufacturer. Environmental Protection Announcement Do not dispose this electronic device into the trash while discarding. To minimize pollution and ensure environment protection of mother earth, please recycle. ii TABLE OF CONTENT USER’S NOTICE .................................................................................................................................. iv MANUAL REVISION INFORMATION ............................................................................................ iv ITEM CHECKLIST.............................................................................................................................. iv CHAPTER 1 INTRODUCTION OF THE MOTHERBOARD 1-1 FEATURE OF MOTHERBOARD..................................................................................... 1 1-2 SPECIFICATION.................................................................................................................. 2 1-3 LAYOUT DIAGRAM ........................................................................................................... 3 CHAPTER 2 JUMPER SETTING, CONNECTORS AND HEADERS 2-1 JUMPER SETTING ..............................................................................................................6
    [Show full text]
  • Manual Title
    HP ProBook 4415s Notebook PC HP ProBook 4416s Notebook PC HP ProBook 4515s Notebook PC Maintenance and Service Guide Addendum Addendum Part Number: 577544-001 Addendum revision history Part number Publication date Description -001 September 2009 Updated commodities in the Product Description table. Updated spare parts throughout MSG. © Copyright 2009 Hewlett-Packard Development Company, L.P. AMD Athlon, AMD Sempron, AMD Turion, and ATI Mobility Radeon are trademarks of Advanced Micro Devices, Inc. Bluetooth is a trademark owned by its proprietor and used by Hewlett-Packard Company under license. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein. First Edition: September 2009 Product description changes The item descriptions in the table below are supplemental to the Product Description table in chapter 1. 14.0-in 14.0-in 15.6-in 15.6-in UMA Discrete UMA Discrete RS880M RX881 RS880M RX881 Description 4415s 4416s 4515s 4515s AMD Turion™ II Ultra Dual-Core Mobile M600 processor, 2.4-GHz with 2-MB XXXX L2 cache AMD Turion II Dual-Core Mobile M520 processor, 2.3-GHz with 1-MB L2 cache XXXX AMD Turion II Dual-Core Mobile M500 processor, 2.2-GHz with
    [Show full text]
  • Cache Hierarchy and Memory Subsystem of the Amd Opteron Processor
    [3B2-14] mmi2010020016.3d 30/3/010 12:7 Page 16 .......................................................................................................................................................................................................................... CACHE HIERARCHY AND MEMORY SUBSYSTEM OF THE AMD OPTERON PROCESSOR .......................................................................................................................................................................................................................... THE 12-CORE AMD OPTERON PROCESSOR, CODE-NAMED ‘‘MAGNY COURS,’’ COMBINES ADVANCES IN SILICON, PACKAGING, INTERCONNECT, CACHE COHERENCE PROTOCOL, AND SERVER ARCHITECTURE TO INCREASE THE COMPUTE DENSITY OF HIGH-VOLUME COMMODITY 2P/4P BLADE SERVERS WHILE OPERATING WITHIN THE SAME POWER ENVELOPE AS EARLIER-GENERATION AMD OPTERON PROCESSORS.AKEY ENABLING FEATURE, THE PROBE FILTER, REDUCES BOTH THE BANDWIDTH OVERHEAD OF TRADITIONAL BROADCAST-BASED COHERENCE AND MEMORY LATENCY. ......Recent trends point to high and and latency sensitive, which favors the use growing demand for increased compute den- of high-performance cores. In fact, a com- sity in large-scale data centers. Many popular mon use of chip multiprocessor (CMP) serv- server workloads exhibit abundant process- ers is simply running multiple independent and thread-level parallelism, so benefit di- instances of single-threaded applications in rectly from additional cores. One approach multiprogrammed mode. In addition, for Pat Conway to exploiting
    [Show full text]