A Case for Toggle-Aware Compression for GPU Systems

Total Page:16

File Type:pdf, Size:1020Kb

A Case for Toggle-Aware Compression for GPU Systems A Case for Toggle-Aware Compression for GPU Systems Gennady Pekhimenko†, Evgeny Bolotin?, Nandita Vijaykumar†, Onur Mutlu†, Todd C. Mowry†, Stephen W. Keckler?# †Carnegie Mellon University ?NVIDIA #University of Texas at Austin ABSTRACT bandwidth utilization (e.g., of on-chip and o-chip intercon- nects [15, 5, 64, 58, 51, 60, 69]). Several recent works focus on Data compression can be an eective method to achieve higher bandwidth compression to decrease memory trac by trans- system performance and energy eciency in modern data- mitting data in a compressed form in both CPUs [51, 64, 5] intensive applications by exploiting redundancy and data simi- and GPUs [58, 51, 69], which results in better system perfor- larity. Prior works have studied a variety of data compression mance and energy consumption. Bandwidth compression techniques to improve both capacity (e.g., of caches and main proves to be particularly eective in GPUs because they are memory) and bandwidth utilization (e.g., of the on-chip and often bottlenecked by memory bandwidth [47, 32, 31, 72, 69]. o-chip interconnects). In this paper, we make a new observa- GPU applications also exhibit high degrees of data redun- tion about the energy-eciency of communication when com- dancy [58, 51, 69], leading to good compression ratios. pression is applied. While compression reduces the amount of While data compression can dramatically reduce the num- transferred data, it leads to a substantial increase in the number ber of bit symbols that must be transmitted across a link, of bit toggles (i.e., communication channel switchings from 0 compression also carries two well-known overheads: (1) la- to 1 or from 1 to 0). The increased toggle count increases the tency, energy, and area overhead of the compression/decom- dynamic energy consumed by on-chip and o-chip buses due pression hardware [4, 52]; and (2) complexity and cost to to more frequent charging and discharging of the wires. Our support variable data sizes [22, 57, 51, 60]. Prior work has results show that the total bit toggle count can increase from addressed solutions to both of these problems. For exam- 20% to 2.2× when compression is applied for some compression ple, Base-Delta-Immediate compression [52] provides a low- algorithms, averaged across dierent application suites. We latency, low-energy hardware-based compression algorithm. characterize and demonstrate this new problem across 242 GPU Decoupled and Skewed Compressed Caches [57, 56] provide applications and six dierent compression algorithms. To miti- mechanisms to eciently manage data recompaction and gate the problem, we propose two new toggle-aware compres- fragmentation in compressed caches. sion techniques: Energy Control and Metadata Consolidation. These techniques greatly reduce the bit toggle count impact of 1.1. Compression & Communication Energy the data compression algorithms we examine, while keeping In this paper, we make a new observation that yet another im- most of their bandwidth reduction benets. portant problem with data compression must be addressed to implement energy-ecient communication: transferring data 1. Introduction in compressed form (as opposed to uncompressed form) leads Modern data-intensive computing forces system designers to to a signicant increase in the number of bit toggles, i.e., the deliver good system performance under multiple constraints: number of wires that switch from 0 to 1 or 1 to 0. An increase shrinking power and energy envelopes (power wall), increas- in bit toggle count causes higher switching activity [65, 9, 10] ing memory latency (memory latency wall), and scarce and ex- for wires, causing higher dynamic energy to be consumed pensive bandwidth resources (bandwidth wall). While many by on-chip or o-chip interconnects. The bit toggle count dierent techniques have been proposed to address these is- of compressed data transfer increases for two reasons. First, sues, these techniques often oer a trade-o that improves the compressed data has a higher per-bit entropy because the one constraint at the cost of another. Ideally, system architects same amount of information is now stored in fewer bits (the would like to improve one or more of these system parameters, “randomness” of a single bit grows). Second, the variable-size e.g., on-chip and o-chip1 bandwidth consumption, while nature of compressed data can negatively aect the word/it simultaneously avoiding negative eects on other key param- data alignment in hardware. Thus, in contrast to the common eters, such as overall system cost, energy, and latency charac- wisdom that bandwidth compression saves energy (when it teristics. One potential way to address multiple constraints is is eective), our key observation reveals a new trade-o: en- to employ dedicated hardware-based data compression mech- ergy savings obtained by reducing bandwidth versus energy anisms (e.g., [71, 4, 14, 52, 6]) across dierent data links in loss due to higher switching energy during compressed data the system. Compression exploits the high data redundancy transfers. This observation and the corresponding trade-o observed in many modern applications [52, 57, 6, 69] and are the key contributions of this work. can be used to improve both capacity (e.g., of caches, DRAM, To understand (1) how applicable general-purpose data non-volatile memories [71, 4, 14, 52, 6, 51, 60, 50, 69, 74]) and compression is for real GPU applications, and (2) the severity of the problem, we use six compression algorithms [4, 52, 14, 1Communication channel between the last-level cache and main memory. 51, 76, 53] to analyze 221 discrete and mobile graphics appli- 978-1-4673-9211-2/16/$31.00 ©2016 IEEE 188 cation traces from a major GPU vendor and 21 open-source, on-chip interconnect energy consumption with C-Pack com- general-purpose GPU applications. Our analysis shows that pression algorithm to a much more acceptable 1.1× increase. although o-chip bandwidth compression achieves a signi- cant compression ratio (e.g., more than 47% average eective 2. Background bandwidth increase with C-Pack [14] across mobile GPU ap- plications), it also greatly increases the bit toggle count (e.g., a Data compression is a powerful mechanism that exploits corresponding 2.2× average increase). This eect can signi- the existing redundancy in the applications’ data to relax cantly increase the energy dissipated in the on-chip/o-chip capacity and bandwidth requirements for many modern sys- interconnects, which constitute a signicant portion of the tems. Hardware-based data compression was explored in memory subsystem energy. the context of on-chip caches [71, 4, 14, 52, 57, 6] and main memory [2, 64, 18, 51, 60], but mostly for CPU-oriented ap- 1.2. Toggle-Aware Compression plications. Several prior works [64, 51, 58, 60, 69] examined In this work, we develop two new techniques that make band- the specics of memory bandwidth compression, where it is width compression for on-chip/o-chip buses more energy- critical to decide where and when to perform compression ecient by limiting the overall increase in compression- and decompression. related bit toggles. Energy Control (EC) decides whether to While these works evaluated the energy/power benets of send data in compressed or uncompressed form, based on a bandwidth compression, the overhead of compression was model that accounts for the compression ratio, the increase limited to the examined overheads of 1) the compression/de- in bit toggles, and current bandwidth utilization. The key compression logic and 2) the newly-proposed mechanisms/de- insight is that this decision can be made in a ne-grained signs. To our knowledge, this is the rst work that examines manner (e.g., for every cache line), using a simple model the energy implications of compression on the data trans- to approximate the commonly-used Energy × Delay and ferred over the on-chip/o-chip buses. Depending on the Energy × Delay2 metrics. In this model, Energy is directly type of the communication channel, the transferred data bits proportional to the bit toggle count; Delay is inversely pro- have dierent eect on the energy spent on communication. portional to the compression ratio and directly proportional We provide a brief background on this eect for three major to the bandwidth utilization. Our second technique, Metadata communication channel types. Consolidation (MC), reduces the negative eects of scattering On-chip Interconnect. For the full-swing on-chip inter- the metadata across a compressed cache line, which happens connects, one of the dominant factors that denes the energy with many existing compression algorithms [4, 14]. Instead, cost of a single data transfer (commonly called a it) is the MC consolidates compression-related metadata in a contigu- activity factor—the number of bit toggles on the wires (com- ous fashion. munication channel switchings from 0 to 1 or from 1 to 0). Our toggle-aware compression mechanisms are generic The bit toggle count for a particular it depends on both the and applicable to dierent compression algorithms (e.g., it’s data and the data that was previously sent over the same Frequent Pattern Compression (FPC) [4] and Base-Delta- wires. Several prior works [63, 10, 73, 66, 9] examined more Immediate (BDI) compression [52]), dierent communication energy-ecient data communication in the context of on- channels (on-chip and o-chip buses), and dierent archi- chip interconnects [10], reducing the number of bit toggles. tectures (e.g., GPUs, CPUs, and hardware accelerators). We The key dierence between our work and these prior works demonstrate that our proposed mechanisms are mostly or- is that we aim to address the eect of increase (sometimes thogonal to dierent data encoding schemes also used to a dramatic increase, see Section 3) in bit toggle count specif- minimize the bit toggle count (e.g., Data Bus Inversion [63]), ically due to data compression.
Recommended publications
  • Gs-35F-4677G
    March 2013 NCS Technologies, Inc. Information Technology (IT) Schedule Contract Number: GS-35F-4677G FEDERAL ACQUISTIION SERVICE INFORMATION TECHNOLOGY SCHEDULE PRICELIST GENERAL PURPOSE COMMERCIAL INFORMATION TECHNOLOGY EQUIPMENT Special Item No. 132-8 Purchase of Hardware 132-8 PURCHASE OF EQUIPMENT FSC CLASS 7010 – SYSTEM CONFIGURATION 1. End User Computer / Desktop 2. Professional Workstation 3. Server 4. Laptop / Portable / Notebook FSC CLASS 7-25 – INPUT/OUTPUT AND STORAGE DEVICES 1. Display 2. Network Equipment 3. Storage Devices including Magnetic Storage, Magnetic Tape and Optical Disk NCS TECHNOLOGIES, INC. 7669 Limestone Drive Gainesville, VA 20155-4038 Tel: (703) 621-1700 Fax: (703) 621-1701 Website: www.ncst.com Contract Number: GS-35F-4677G – Option Year 3 Period Covered by Contract: May 15, 1997 through May 14, 2017 GENERAL SERVICE ADMINISTRATION FEDERAL ACQUISTIION SERVICE Products and ordering information in this Authorized FAS IT Schedule Price List is also available on the GSA Advantage! System. Agencies can browse GSA Advantage! By accessing GSA’s Home Page via Internet at www.gsa.gov. TABLE OF CONTENTS INFORMATION FOR ORDERING OFFICES ............................................................................................................................................................................................................................... TC-1 SPECIAL NOTICE TO AGENCIES – SMALL BUSINESS PARTICIPATION 1. Geographical Scope of Contract .............................................................................................................................................................................................................................
    [Show full text]
  • Nvidia Forceware Graphics Drivers for XP, Manual and Notes
    ForceWare Graphics Drivers Release 162 Notes Version 162.18 for Windows XP NVIDIA Corporation June 29, 2007 Confidential Information Published by NVIDIA Corporation 2701 San Tomas Expressway Santa Clara, CA 95050 Notice ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation. Trademarks NVIDIA, the NVIDIA logo, 3DFX, 3DFX INTERACTIVE, the 3dfx Logo, STB, STB Systems and Design, the STB Logo, the StarBox Logo, NVIDIA nForce, GeForce, NVIDIA Quadro, NVDVD, NVIDIA Personal Cinema, NVIDIA Soundstorm, Vanta, TNT2, TNT,
    [Show full text]
  • Release 85 Notes
    ForceWare Graphics Drivers Release 85 Notes Version 88.61 For Windows Vista x86 and Windows Vista x64 NVIDIA Corporation May 2006 Published by NVIDIA Corporation 2701 San Tomas Expressway Santa Clara, CA 95050 Notice ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE. Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation. Trademarks NVIDIA, the NVIDIA logo, 3DFX, 3DFX INTERACTIVE, the 3dfx Logo, STB, STB Systems and Design, the STB Logo, the StarBox Logo, NVIDIA nForce, GeForce, NVIDIA Quadro, NVDVD, NVIDIA Personal Cinema, NVIDIA Soundstorm, Vanta, TNT2, TNT,
    [Show full text]
  • Manycore GPU Architectures and Programming, Part 1
    Lecture 19: Manycore GPU Architectures and Programming, Part 1 Concurrent and Mul=core Programming CSE 436/536, [email protected] www.secs.oakland.edu/~yan 1 Topics (Part 2) • Parallel architectures and hardware – Parallel computer architectures – Memory hierarchy and cache coherency • Manycore GPU architectures and programming – GPUs architectures – CUDA programming – Introduc?on to offloading model in OpenMP and OpenACC • Programming on large scale systems (Chapter 6) – MPI (point to point and collec=ves) – Introduc?on to PGAS languages, UPC and Chapel • Parallel algorithms (Chapter 8,9 &10) – Dense matrix, and sorng 2 Manycore GPU Architectures and Programming: Outline • Introduc?on – GPU architectures, GPGPUs, and CUDA • GPU Execuon model • CUDA Programming model • Working with Memory in CUDA – Global memory, shared and constant memory • Streams and concurrency • CUDA instruc?on intrinsic and library • Performance, profiling, debugging, and error handling • Direc?ve-based high-level programming model – OpenACC and OpenMP 3 Computer Graphics GPU: Graphics Processing Unit 4 Graphics Processing Unit (GPU) Image: h[p://www.ntu.edu.sg/home/ehchua/programming/opengl/CG_BasicsTheory.html 5 Graphics Processing Unit (GPU) • Enriching user visual experience • Delivering energy-efficient compung • Unlocking poten?als of complex apps • Enabling Deeper scien?fic discovery 6 What is GPU Today? • It is a processor op?mized for 2D/3D graphics, video, visual compu?ng, and display. • It is highly parallel, highly multhreaded mulprocessor op?mized for visual
    [Show full text]
  • CPU-GPU Benchmark Description
    UC San Diego UC San Diego Electronic Theses and Dissertations Title Heterogeneity and Density Aware Design of Computing Systems Permalink https://escholarship.org/uc/item/9qd6w8cw Author Arora, Manish Publication Date 2018 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, SAN DIEGO Heterogeneity and Density Aware Design of Computing Systems A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science (Computer Engineering) by Manish Arora Committee in charge: Professor Dean M. Tullsen, Chair Professor Renkun Chen Professor George Porter Professor Tajana Simunic Rosing Professor Steve Swanson 2018 Copyright Manish Arora, 2018 All rights reserved. The dissertation of Manish Arora is approved, and it is ac- ceptable in quality and form for publication on microfilm and electronically: Chair University of California, San Diego 2018 iii DEDICATION To my family and friends. iv EPIGRAPH Do not be embarrassed by your failures, learn from them and start again. —Richard Branson If something’s important enough, you should try. Even if the probable outcome is failure. —Elon Musk v TABLE OF CONTENTS Signature Page . iii Dedication . iv Epigraph . v Table of Contents . vi List of Figures . viii List of Tables . x Acknowledgements . xi Vita . xiii Abstract of the Dissertation . xvi Chapter 1 Introduction . 1 1.1 Design for Heterogeneity . 2 1.1.1 Heterogeneity Aware Design . 3 1.2 Design for Density . 6 1.2.1 Density Aware Design . 7 1.3 Overview of Dissertation . 8 Chapter 2 Background . 10 2.1 GPGPU Computing . 10 2.2 CPU-GPU Systems .
    [Show full text]
  • Trabajo #1 De Computación Gráfica
    TRABAJO #1 DE COMPUTACIÓN GRÁFICA MONITORES 1.¿Cuántos tipos de monitores hay? Hay actualmente tres tipos principales de tecnologías: CRT, o tubos de rayos catódicos (los de siempre), LCD, o pantallas de cristal líquido (Liquid Crystal Display), y las pantallas de plasma. LCD: Una pantalla de cristal líquido o LCD (acrónimo del inglés Liquid crystal display) es una pantalla delgada y plana formada por un número de píxeles en color o monocromos colocados delante de una fuente de luz o reflectora. A menudo se utiliza en pilas, dispositivos electrónicos, ya que utiliza cantidades muy pequeñas de energía eléctrica. CTR: El monitor esta basado en un elemento CRT (Tubo de rayos catódicos), los actuales monitores, controlados por un microprocesador para almacenar muy diferentes formatos, así como corregir las eventuales distorsiones, y con capacidad de presentar hasta 1600x1200 puntos en pantalla. Los monitores CRT emplean tubos cortos, pero con la particularidad de disponer de una pantalla completamente plana. PLASMA: Se basan en el principio de que haciendo pasar un alto voltaje por un gas a baja presión se genera luz. Estas pantallas usan fósforo como los CRT pero son emisivas como las LCD y frente a estas consiguen una gran mejora del color y un estupendo ángulo de visión. 2.Compara en una tabla similitudes y diferencias, ventajas y desventajas, entre CRT, LCD, PLASMA, TFT, HDA, y otras tecnologías Similitud Diferencia Ventajas Desventajas CRT Lo usan los -Permiten -Ocupan monitores de reproducir una más espacio plasma mayor variedad (cuanto mas cromática. fondo, mejor geometría). -Distintas resoluciones se -Los pueden ajustar modelos al monitor.
    [Show full text]
  • Time Complexity Parallel Local Binary Pattern Feature Extractor on a Graphical Processing Unit
    ICIC Express Letters ICIC International ⃝c 2019 ISSN 1881-803X Volume 13, Number 9, September 2019 pp. 867{874 θ(1) TIME COMPLEXITY PARALLEL LOCAL BINARY PATTERN FEATURE EXTRACTOR ON A GRAPHICAL PROCESSING UNIT Ashwath Rao Badanidiyoor and Gopalakrishna Kini Naravi Department of Computer Science and Engineering Manipal Institute of Technology Manipal Academy of Higher Education Manipal, Karnataka 576104, India f ashwath.rao; ng.kini [email protected] Received February 2019; accepted May 2019 Abstract. Local Binary Pattern (LBP) feature is used widely as a texture feature in object recognition, face recognition, real-time recognitions, etc. in an image. LBP feature extraction time is crucial in many real-time applications. To accelerate feature extrac- tion time, in many previous works researchers have used CPU-GPU combination. In this work, we provide a θ(1) time complexity implementation for determining the Local Binary Pattern (LBP) features of an image. This is possible by employing the full capa- bility of a GPU. The implementation is tested on LISS medical images. Keywords: Local binary pattern, Medical image processing, Parallel algorithms, Graph- ical processing unit, CUDA 1. Introduction. Local binary pattern is a visual descriptor proposed in 1990 by Wang and He [1, 2]. Local binary pattern provides a distribution of intensity around a center pixel. It is a non-parametric visual descriptor helpful in describing a distribution around a center value. Since digital images are distributions of intensity, it is helpful in describing an image. The pattern is widely used in texture analysis, object recognition, and image description. The local binary pattern is used widely in real-time description, and analysis of objects in images and videos, owing to its computational efficiency and computational simplicity.
    [Show full text]
  • HP Z800 Workstation Overview
    QuickSpecs HP Z800 Workstation Overview HP recommends Windows Vista® Business 1. 3 External 5.25" Bays 2. Power Button 3. Front I/O: 3 USB 2.0, 1 IEEE 1394a, 1 Headphone out, 1 Microphone in DA - 13278 North America — Version 2 — April 13, 2009 Page 1 QuickSpecs HP Z800 Workstation Overview 4. Choice of 850W, 85% or 1110W, 89% Power Supplies 9. Rear I/O: 1 IEEE 1394a, 6 USB 2.0, 1 serial, PS/2 keyboard/mouse 5. 12 DIMM Slots for DDR3 ECC Memory 2 RJ-45 to Integrated Gigabit LAN 1 Audio Line In, 1 Audio Line Out, 1 Microphone In 6. 3 External 5.25” Bays 10. 2 PCIe x16 Gen2 Slots 7. 4 Internal 3.5” Bays 11.. 2 PCIe x8 Gen2, 1 PCIe x4 Gen2, 1 PCIe x4 Gen1, 1 PCI Slot 8. 2 Quad Core Intel 5500 Series Processors 12 3 Internal USB 2.0 ports Form Factor Rackable Minitower Compatible Operating Genuine Windows Vista® Business 32-bit* Systems Genuine Windows Vista® Business 64-bit* Genuine Windows Vista® Business 32-bit with downgrade to Windows® XP Professional 32-bit custom installed** Genuine Windows Vista® Business 64-bit with downgrade to Windows® XP Professional x64 custom installed** HP Linux Installer Kit for Linux (includes drivers for both 32-bit & 64-bit OS versions of Red Hat Enterprise Linux WS4 and WS5 - see: http://www.hp.com/workstations/software/linux) For detailed OS/hardware support information for Linux, see: http://www.hp.com/support/linux_hardware_matrix *Certain Windows Vista product features require advanced or additional hardware.
    [Show full text]
  • Opengl Performance Tools
    OpenGL Performance Tools Jeff Kiel NVIDIA Corporation Performance Tools Agenda Problem statement GPU pipelined architecture at a glance NVPerfKit 2.0: Driver and GPU Performance Data GLExpert: OpenGL API Assistance NVPerfSDK: Integrated into your application NVPerfAPI PDH, NVDevCPL Solutions to common bottlenecks NVShaderPerf: Shader Performance Copyright © NVIDIA Corporation 2004 What’s The Problem? Why is my app running at 13FPS after CPU tuning? How can I determine what is going in that GPU? How come IHV engineers are able to figure it out? Copyright © NVIDIA Corporation 2004 GPU architecture at a glance Pipelined architecture: each unit needs the data from the previous unit to do its job Method: Bottleneck identification and elimination Goal: Balance the pipeline Copyright © NVIDIA Corporation 2004 GPU Pipelined Architecture (simplified view) GPU …110010100100… Vertex Vertex Pixel Frame CPU Rasterizer Setup Shader Shader buffer Texture Storage + Filtering Vertices Pixels Copyright © NVIDIA Corporation 2004 GPU Pipelined Architecture (simplified view) GPU Vertex Vertex Pixel Frame CPU Rasterizer Setup Shader Shader buffer Texture Storage + Filtering OneOne unitunit cancan limitlimit thethe speedspeed ofof thethe pipelinepipeline…… Copyright © NVIDIA Corporation 2004 Classic Bottleneck Identification Modify target stage to decrease workload FPS FPS If performance/FPS improves greatly, this stage is the bottleneck Careful not to change the workload of other stages! Copyright © NVIDIA Corporation 2004 Classic Bottleneck Identification
    [Show full text]
  • INTEL's TOP CPU the Pentium 4 3.4Ghz Extreme Edition Intel
    44> 7625274 81181 Chips and chipsets. They’re the heart of any computing system, and as usual, we have packed this issue of PC Modder with dozens of pos- sible processor and motherboard matchups. Our Case Studies will THE PARTS SHOP help you discover what sort of performance you can expect to 19 AMD’s Top CPU achieve when overclocking various chip and chipset combos. When The Athlon 64 FX53 you’re ready to crank up your own silicon, our Cool It articles will provide tips on keeping excess heat under control. And then when 20 Intel’s Top CPU it’s time to build the ultimate home for your motherboard, our Cut Pentium 4 3.4GHz Extreme Edition It section will give you some case ideas to drool over. Whether you’re a modding novice or master, you’ll find this issue is filled with all the 21 Grantsdale Grants Intel Users’ Wishes tips and tools you need to build faster and more beautiful PCs. New i915 Chipset Adds Many New Technologies 22 Special FX SiS Adds Support For AMD’s FIRST-TIMERS Athlon 64FX CPU 4 The Need For Speed 23 Chipset Entertainment Take Your CPU To Ultimate Heights VIA’s PM880 Chipset One Step At A Time Ready For Media PCs 8 Cutting Holes & Taking Names & HDTV Your First Case Mod 24 May The 12 Water For First-Timers Force Be Tips To Help Watercooling With You Newbies Take The Plunge NVIDIA’s nForce3 Chipset 16 Benchmark Basics A Primer On 25 Essential Overclocking Utilities Measuring Power Modding Is As Much About The Your PC’s Right Software As The Right Hardware Performance 31 Sizing Up Sockets How Your Processor Saddles Up 33 The Mad Modder’s Toolkit Minireviews, Meanderings & Musings Copyright 2004 by Sandhills Publishing Company.
    [Show full text]
  • Linux Hwmon Documentation
    Linux Hwmon Documentation The kernel development community Jul 14, 2020 CONTENTS i ii CHAPTER ONE THE LINUX HARDWARE MONITORING KERNEL API Guenter Roeck 1.1 Introduction This document describes the API that can be used by hardware monitoring drivers that want to use the hardware monitoring framework. This document does not describe what a hardware monitoring (hwmon) Driver or Device is. It also does not describe the API which can be used by user space to communicate with a hardware monitoring device. If you want to know this then please read the following file: Documentation/hwmon/sysfs-interface.rst. For additional guidelines on how to write and improve hwmon drivers, please also read Documentation/hwmon/submitting-patches.rst. 1.2 The API Each hardware monitoring driver must #include <linux/hwmon.h> and, in most cases, <linux/hwmon-sysfs.h>. linux/hwmon.h declares the following regis- ter/unregister functions: struct device * hwmon_device_register_with_groups(struct device *dev, const char *name, void *drvdata, const struct attribute_group **groups); struct device * devm_hwmon_device_register_with_groups(struct device *dev, const char *name, void *drvdata, const struct attribute_group␣ ,!**groups); struct device * hwmon_device_register_with_info(struct device *dev, const char *name, void *drvdata, const struct hwmon_chip_info *info, const struct attribute_group **extra_ ,!groups); (continues on next page) 1 Linux Hwmon Documentation (continued from previous page) struct device * devm_hwmon_device_register_with_info(struct device *dev, const char *name, void *drvdata, const struct hwmon_chip_info *info, const struct attribute_group **extra_ ,!groups); void hwmon_device_unregister(struct device *dev); void devm_hwmon_device_unregister(struct device *dev); hwmon_device_register_with_groups registers a hardware monitoring device. The first parameter of this function is a pointer to the parent device.
    [Show full text]
  • HP Z400 Workstation Overview
    QuickSpecs HP Z400 Workstation Overview HP recommends Windows Vista® Business 1. 3 External 5.25" Bays 2. Power Button 3. Front I/O: 2 USB 2.0, 1 IEEE 1394a (optional card required), Headphone, Microphone DA - 13276 North America — Version 4 — April 17, 2009 Page 1 QuickSpecs HP Z400 Workstation Overview 4. 3 External 5.25” Bays 9. Rear I/O: 6 USB 2.0, PS/2 keyboard/mouse 1 RJ-45 to Integrated Gigabit LAN 5. 4 DIMM Slots for DDR3 ECC Memory 1 Audio Line In, 1 Audio Line Out, 1 Microphone In 6. 2 Internal 3.5” Bays 10. 2 PCIe x16 Gen2 Slots 7. 475W, 85% efficient Power Supply 11.. 1 PCIe x4 Gen2, 1 PCIe x4 Gen1, 2 PCI Slots 8. Dual/Quad Core Intel 3500 Series Processors 12 4 Internal USB 2.0 ports Form Factor Convertible Minitower Compatible Operating Genuine Windows Vista® Business 32-bit* Systems Genuine Windows Vista® Business 64-bit* Genuine Windows Vista® Business 32-bit with downgrade to Windows® XP Professional 32-bit custom installed** (expected available until August 2009) Genuine Windows Vista® Business 64-bit with downgrade to Windows® XP Professional x64 custom installed** (expected available until August 2009) HP Linux Installer Kit for Linux (includes drivers for both 32-bit & 64-bit OS versions of Red Hat Enterprise Linux WS4 and WS5 - see: http://www.hp.com/workstations/software/linux) Novell Suse SLED 11 (expected availability May 2009) *Certain Windows Vista product features require advanced or additional hardware. See http://www.microsoft.com/windowsvista/getready/hardwarereqs.mspx and http://www.microsoft.com/windowsvista/getready/capable.mspx for details.
    [Show full text]