Optimizing Parallel Reduction in Cuda to Reach GPU Peak Performance

Total Page:16

File Type:pdf, Size:1020Kb

Optimizing Parallel Reduction in Cuda to Reach GPU Peak Performance Optimizing Parallel Reduction In Cuda To Reach GPU Peak Performance 1Adityo Mahardito 2Adang Suhendra 3Deni Tri Hasta 1Gunadarma University([email protected]) 2Gunadarma University([email protected]) 3Gunadarma University([email protected]) Abstract GPUs are massively multi threaded many core chips that have hundreds of cores and thousands of con- current threads swith high performance and memory bandwidth. Now days, GPUs have already been used to accelerate some numerically intensive high performance computing applications in parallel not only used to graphic processing. This thesis aims primarily to demonstrate the programming model ap- proaches that can be maximize the performance of GPUs. This is accomplished by a proof of maximum reach of bandwidth memory and get speed up from the GPU that used to process parallel computation. The programming environment that used is NVIDIA’s CUDA, it is parallel architecture and model for general- purpose computing on a GPU. 1 Introduction trol. Many companies have produced GPUs under a number of brand names. Intel, NVIDIA, AMD/AT, Currently, demand for hardware resources for pro- S3 Graphics, VIA Technologies, and Matrox pro- cessing data on computers is increasing. This can duce GPUs. But, NVIDIA and ATI control nearly be proved by continuously increasing the mini- 100% of the market. Recently, NVIDIA began re- mum requirement of software applications. The leasing cards supporting an API extension to the C phenomenon of the advent multi core CPUs and programming language CUDA (”Compute Unified many core GPUs means that mainstream processor Device Architecture”), which allows specified func- chips are now parallel systems. Many applications tions from a normal C program to run on the GPU’s that process large data sets can use a data paral- stream processors[8]. lel programming model to speed up the computa- The performance offered by the GPU, in fact, tions. The application outside the field of computa- depend-ing on the method approach that used tion itself are accelerated by data parallel process- while create a CUDA program, even though using ing, from general image rendering and processing, same algorithm. Because in the parallel program- signal processing, physics simulation to computa- ming, especially on CUDA, the thinks that must tional finance or computational biology. However, be considered to obtain optimal results are mem- to more support of data parallel processing, the ory usage and instruction usage. The main focus GPU can be considered than the CPU, because there of this thesis is make optimization of parallel pro- are very significant differences between the capabil- gramming on the CUDA environment that running ity that are owned by the CPUs and GPUs to pro- over GPU using parallel reduction algorithm. The cess sets large data on parallel computation. In parallel reduction algorithm that implemented is the other side, GPUs are very efficient not only at a summation algorithm that computes the sum of manipulating computer graphics, but also at their large arrays of values. highly parallel structure makes them more effective than general purpose CPUs for a range of complex 2 Literature Review algorithms [2]. The GPU is specialized for compute intensive, 2.1 Graphic Processor Unit highly parallel computation and therefore designed such that more transistors are devoted to data pro- A graphics processing unit or GPU is a processor at- cessing rather than data caching and flow con- tached to a graphics card dedicated to calculating floating point operations. GPU has evolved into a highly parallel, multithreaded, manycore processor with tremendous computational horsepower and very high memory bandwidth. The reason behind the discrepancy in floating- point capability between the CPU and the GPU is that the GPU is specialized for compute intensive, highly parallel computation and therefore designed such that more transistors are devoted to data pro- cessing rather than data caching and flow control. Figure 1: Streaming Processor (SP) This difference makes the CPU and GPU suited for different kinds of problems. In a simplified way, one can say that a CPU performs best when a few, isn’t specific in saying about SFU architecture, just small pieces of data are processed in a complex, assume that each SFU is also a fully pipelined, sin- but sequential way. This lets the CPU utilize the gle issue, in order microprocessor. There’s a MT many transistors used for caching, branch predic- issue unit that dispatches instructions to all of the tion and instruction level parallelism. The GPU, on SPs and SFUs in the group. the other hand, need massively data parallel prob- In addition to the processor cores in a SM, lems to work efficiently. The programming model there’s a very small instruction cache, a read only most commonly used when programming a GPU is data cache and a 16KB read/write shared mem- based on the stream programming model. In the ory. These cache sizes are kept purposefully small stream programming model, input to and output because unlike a conventional desktop micropro- from a computation comes in the form of streams. cessor, the datasets we’re trying to cache here are A stream is a collection of homogeneous data ele- small. Each SP ends up working on an individual ments on which some operation, called a kernel, is pixel and despite the move to 32-bit floating point to be performed, and the operation on one element values, there’s only so much data associated with is independent of the other elements in the stream. a single pixel. The 16KB memory is akin to Cell’s Another important difference between a general local stores in that it’s not a cache, but a software purpose processor and a typical GPU is the mem- managed data store so that latency is always pre- ory bandwidth. Because of simpler memory mod- dictable. With this many cores in a single SM, con- els and no requirements from legacy operating sys- trol and predictability and very important to mak- tems, the GPU can support more than 100 GB/s ing the whole thing work efficiently. of memory bandwidth, while the bandwidth of The next cluster is Texture Processor Cluster general-purpose processors is around 20 GB/s [7]. (TPC), shown by Figure 2. NVIDIA purposefully designed its GPU architecture to be modular, so a single TPC can be made up of any number of SMs. 2.1.1 GPU Architecture In the GT200 architecture it was made up of three SMs and it can be different with the other series Start from Streaming processor (see at Figure 1), [4]. it is a fully pipelined, single-issue, in order micro- processor complete with two ALUs and a FPU. An SP doesn’t have any cache, so it’s not particularly great at anything other than cranking through tons of mathematical operations. Since an SP spends most of its time working on pixel or vertex data, the fact that it doesn’t have a cache doesn’t really matter, because if you add up enough of these cores of GPU technology you can start to get something productive given that graphics rendering is a highly parallelizable task[3]. The GPU has a many-core processor containing an array of Streaming Multiprocessors (SMs). A SM is an array of SPs, consists of 8 Streaming Proces- sors (SPs), along with two more processors called Special Function Units (SFUs). Each SFU has four FP multiply units which are used for transcenden- tal operations (e.g. sin, cosin) and interpolation, the latter being used in some of the calculations for things like anisotropic texture filtering. NVIDIA Figure 2: Texture Processor Cluster The components of the TPC however haven’t a device and how threads are grouped into blocks. changed, a TPC is made up of SMs, some control The memory model defines the different types of logic and a texture block. Remember that a SM is memories that are available to a CUDA program. a total of 8 SPs and 2 SFUs, so that brings the to- tal up to 24 SPs and 6 SFUs per cluster in GT200. 2.3 CUDA Programming Model The texture block includes texture addressing and filtering logic as well as a L1 texture cache. NVIDIA’s Compute Unified Device Architecture The modular theme continues with the Stream- (CUDA) is a programming model and hard- ing Processor Array (SPA) that is composed of a ware/software environment for GPGPU, which con- number of TPCs, see Figure 3 [5]. sists of the following components: • An extension to the C programming language that al-lows programmers to define GPGPU functions called kernels which are executed by multiple threads on the GPU. • A compiler and related tools for translating CUDA source code to GPU devicespecific bi- nary • A software stack consisting of GPGPU appli- cation li-braries, CUDA runtime libraries and Figure 3: Streaming Processor Array CUDA device driver • A CUDA enabled GPU device Similarly, the processed data elements need to be stored back in global memory for copying back to host memory. The task of effectively hiding the 2.3.1 CUDA Kernels global memory access la-tency and managing the The CUDA programming language provides a memory hierarchy is very crucial for obtaining max- means for programmers to express a problem that imal performance from the GPU. Each SM manages exhibits significant data-parallelism as a CUDA ker- the creation, execution, synchronization and de- nel. A CUDA kernel is a function that is executed struction of concurrent threads in hardware, with in Single Program Multiple Data (SPMD) on a large zero scheduling overhead, and this is one of the key set of data elements. factors in achieving very high execution through- Kernel code is written on the thread level with put.
Recommended publications
  • Release Notes for X11R6.8.2 the X.Orgfoundation the Xfree86 Project, Inc
    Release Notes for X11R6.8.2 The X.OrgFoundation The XFree86 Project, Inc. 9February 2005 Abstract These release notes contains information about features and their status in the X.Org Foundation X11R6.8.2 release. It is based on the XFree86 4.4RC2 RELNOTES docu- ment published by The XFree86™ Project, Inc. Thereare significant updates and dif- ferences in the X.Orgrelease as noted below. 1. Introduction to the X11R6.8.2 Release The release numbering is based on the original MIT X numbering system. X11refers to the ver- sion of the network protocol that the X Window system is based on: Version 11was first released in 1988 and has been stable for 15 years, with only upwardcompatible additions to the coreX protocol, a recordofstability envied in computing. Formal releases of X started with X version 9 from MIT;the first commercial X products werebased on X version 10. The MIT X Consortium and its successors, the X Consortium, the Open Group X Project Team, and the X.OrgGroup released versions X11R3 through X11R6.6, beforethe founding of the X.OrgFoundation. Therewill be futuremaintenance releases in the X11R6.8.x series. However,efforts arewell underway to split the X distribution into its modular components to allow for easier maintenance and independent updates. We expect a transitional period while both X11R6.8 releases arebeing fielded and the modular release completed and deployed while both will be available as different consumers of X technology have different constraints on deployment. Wehave not yet decided how the modular X releases will be numbered. We encourage you to submit bug fixes and enhancements to bugzilla.freedesktop.orgusing the xorgproduct, and discussions on this server take place on <[email protected]>.
    [Show full text]
  • Stephen Clarke-Willson
    Contact Stephen Clarke-Willson www.linkedin.com/in/drstephencw Programmer, Producer, Executive (LinkedIn) Sammamish www.arena.net (Company) www.above-the-garage.com (Personal) Summary above-the-garage.com/blog (Blog) Software technology development leadership and management. Top Skills Specialties: Technical / Product team building and management; Systems Programming System Architecture experience as first, second and third level manager in fast growing Game Development companies; systems programming, systems architecture, technology development. Publications Guild Wars Microservices and 24/7 Uptime Guild Wars 2 - Scaling from one to Experience millions Applying Game Design To Virtual NCSOFT Environments VP of Technology Nano-Plasm March 2019 - Present (1 year 7 months) Bellevue, WA ArenaNet LLC 13 years 6 months Programmer in the role of Studio Technical Director May 2013 - March 2019 (5 years 11 months) Bellevue, WA Lead engineering staff (about 100 members) for "gaming as a service" with continuous high volume content creation and delivery at MMO scale. Translate business objectives into innovative yet achievable technical challenges. Created technical unit with flat reporting structure and peer input reviews. Programmer in the roles of Server Programmer / Server Team Lead October 2005 - April 2013 (7 years 7 months) Developing multi-threaded, restartable, dynamically updatable, high performance, internet-resilient, bug free server code for the MMO Guild Wars (1 and 2). Above the Garage Productions Programmer / Owner Page 1 of 4 May 2004 - October 2005 (1 year 6 months) Game Technology Developer, Above the Garage Productions Developed downloadable music system "DirectSong.com" (from payment system [PayPal] to delivery system to embedded music player using Microsoft WMA technology).
    [Show full text]
  • Reviving the Development of Openchrome
    Reviving the Development of OpenChrome Kevin Brace OpenChrome Project Maintainer / Developer XDC2017 September 21st, 2017 Outline ● About Me ● My Personal Story Behind OpenChrome ● Background on VIA Chrome Hardware ● The History of OpenChrome Project ● Past Releases ● Observations about Standby Resume ● Developmental Philosophy ● Developmental Challenges ● Strategies for Further Development ● Future Plans 09/21/2017 XDC2017 2 About Me ● EE (Electrical Engineering) background (B.S.E.E.) who specialized in digital design / computer architecture in college (pretty much the only undergraduate student “still” doing this stuff where I attended college) ● Graduated recently ● First time conference presenter ● Very experienced with Xilinx FPGA (Spartan-II through 7 Series FPGA) ● Fluent in Verilog / VHDL design and verification ● Interest / design experience with external communication interfaces (PCI / PCIe) and external memory interfaces (SDRAM / DDR3 SDRAM) ● Developed a simple DMA engine for PCI I/F validation w/Windows WDM (Windows Driver Model) kernel device driver ● Almost all the knowledge I have is self taught (university engineering classes were not very useful) 09/21/2017 XDC2017 3 Motivations Behind My Work ● General difficulty in obtaining meaningful employment in the digital hardware design field (too many students in the field, difficulty obtaining internship, etc.) ● Collects and repairs abandoned computer hardware (It’s like rescuing puppies!) ● Owns 100+ desktop computers and 20+ laptop computers (mostly abandoned old stuff I
    [Show full text]
  • 4010, 237 8514, 226 80486, 280 82786, 227, 280 a AA. See Anti-Aliasing (AA) Abacus, 16 Accelerated Graphics Port (AGP), 219 Acce
    Index 4010, 237 AIB. See Add-in board (AIB) 8514, 226 Air traffic control system, 303 80486, 280 Akeley, Kurt, 242 82786, 227, 280 Akkadian, 16 Algebra, 26 Alias Research, 169 Alienware, 186 A Alioscopy, 389 AA. See Anti-aliasing (AA) All-In-One computer, 352 Abacus, 16 All-points addressable (APA), 221 Accelerated Graphics Port (AGP), 219 Alpha channel, 328 AccelGraphics, 166, 273 Alpha Processor, 164 Accel-KKR, 170 ALT-256, 223 ACM. See Association for Computing Altair 680b, 181 Machinery (ACM) Alto, 158 Acorn, 156 AMD, 232, 257, 277, 410, 411 ACRTC. See Advanced CRT Controller AMD 2901 bit-slice, 318 (ACRTC) American national Standards Institute (ANSI), ACS, 158 239 Action Graphics, 164, 273 Anaglyph, 376 Acumos, 253 Anaglyph glasses, 385 A.D., 15 Analog computer, 140 Adage, 315 Anamorphic distortion, 377 Adage AGT-30, 317 Anatomic and Symbolic Mapper Engine Adams Associates, 102 (ASME), 110 Adams, Charles W., 81, 148 Anderson, Bob, 321 Add-in board (AIB), 217, 363 AN/FSQ-7, 302 Additive color, 328 Anisotropic filtering (AF), 65 Adobe, 280 ANSI. See American national Standards Adobe RGB, 328 Institute (ANSI) Advanced CRT Controller (ACRTC), 226 Anti-aliasing (AA), 63 Advanced Remote Display Station (ARDS), ANTIC graphics co-processor, 279 322 Antikythera device, 127 Advanced Visual Systems (AVS), 164 APA. See All-points addressable (APA) AED 512, 333 Apalatequi, 42 AF. See Anisotropic filtering (AF) Aperture grille, 326 AGP. See Accelerated Graphics Port (AGP) API. See Application program interface Ahiska, Yavuz, 260 standard (API) AI.
    [Show full text]
  • Manycore GPU Architectures and Programming, Part 1
    Lecture 19: Manycore GPU Architectures and Programming, Part 1 Concurrent and Mul=core Programming CSE 436/536, [email protected] www.secs.oakland.edu/~yan 1 Topics (Part 2) • Parallel architectures and hardware – Parallel computer architectures – Memory hierarchy and cache coherency • Manycore GPU architectures and programming – GPUs architectures – CUDA programming – Introduc?on to offloading model in OpenMP and OpenACC • Programming on large scale systems (Chapter 6) – MPI (point to point and collec=ves) – Introduc?on to PGAS languages, UPC and Chapel • Parallel algorithms (Chapter 8,9 &10) – Dense matrix, and sorng 2 Manycore GPU Architectures and Programming: Outline • Introduc?on – GPU architectures, GPGPUs, and CUDA • GPU Execuon model • CUDA Programming model • Working with Memory in CUDA – Global memory, shared and constant memory • Streams and concurrency • CUDA instruc?on intrinsic and library • Performance, profiling, debugging, and error handling • Direc?ve-based high-level programming model – OpenACC and OpenMP 3 Computer Graphics GPU: Graphics Processing Unit 4 Graphics Processing Unit (GPU) Image: h[p://www.ntu.edu.sg/home/ehchua/programming/opengl/CG_BasicsTheory.html 5 Graphics Processing Unit (GPU) • Enriching user visual experience • Delivering energy-efficient compung • Unlocking poten?als of complex apps • Enabling Deeper scien?fic discovery 6 What is GPU Today? • It is a processor op?mized for 2D/3D graphics, video, visual compu?ng, and display. • It is highly parallel, highly multhreaded mulprocessor op?mized for visual
    [Show full text]
  • University of Klagenfurt Digital Signal Processor (DSP) MD SARWAR
    University of Klagenfurt Digital Signal Processor (DSP) GROUP MEMBERS : MD SARWAR ZAHAN (MATRIX:1461419) BASHIRU OTOKITI (MATRIX:1361474) Topic: “GPU Processing” Proc. IEEE 96(5), 2008 AGENDA Introduction GPU Algorithm About GPU CPU VS GPU Short History Application of GPU GPU Pipeline, Architecture Conclusion WHAT IS GPU ? A graphics processing unit (GPU) is a dedicated processor that performs rapid mathematical calculations for rendering high quality video and images . The Abstract goal of a GPU is to enable a representation of a 3D world as realistically as possible. SHORT HISTORY OF GPU 2010 to 1970s 1980s 1990s 2000 to 2010 present ARCADE SYSTEM BOARDS S3 GRAPHICS NVIDIA & AUDI NEC 7220 NVIDIA 3D GRAPHICS RENDERING PIPELINE Image: 3D GRAPHICS RENDERING PIPELINE Vertex Processing: Process and transform individual vertices. Rasterization: Convert each primitive into a set of fragments. Fragment Processing: Process individual fragments. Output Merging: Combine the fragments of all primitives into color-pixel for the display. GPU ARCHITECTURE Image: NVidia GeForce 6800 Series GPU Board Host (CPU). 6 parallel vertex processors (receive data from the host). Image: NVidia GeForce 6800 GPU Architecture triangle setup stage (takes care of primitive assembly). rasterizer stage which produces the fragments. 16 processors (computes the output colors of each fragment). GPU COMPUTING Parallelism is the future of computing. GPU has moved from a fixed-function into full-fledged parallel programmable processor. GPU follow a single program multiple-data (SPMD) programming model. Image: SPMD Model SPMD Tasks are split up and run simultaneously on multiple processors with different input for faster results. GPU SOFTWARE ENVIRONMENTS Famous languages for GPU programming: NVIDIA’s (CUDA) OpenCL HLSL Cg GPU PERFORMANCE EVALUATION Image: GPU Performance Scan performance on CPU, graphics-based GPU (using OpenGL), and direct-compute GPU (using CUDA).
    [Show full text]
  • Lecture: Manycore GPU Architectures and Programming, Part 1
    Lecture: Manycore GPU Architectures and Programming, Part 1 CSCE 569 Parallel Computing Department of Computer Science and Engineering Yonghong Yan [email protected] https://passlab.github.io/CSCE569/ 1 Manycore GPU Architectures and Programming: Outline • Introduction – GPU architectures, GPGPUs, and CUDA • GPU Execution model • CUDA Programming model • Working with Memory in CUDA – Global memory, shared and constant memory • Streams and concurrency • CUDA instruction intrinsic and library • Performance, profiling, debugging, and error handling • Directive-based high-level programming model – OpenACC and OpenMP 2 Computer Graphics GPU: Graphics Processing Unit 3 Graphics Processing Unit (GPU) Image: http://www.ntu.edu.sg/home/ehchua/programming/opengl/CG_BasicsTheory.html 4 Graphics Processing Unit (GPU) • Enriching user visual experience • Delivering energy-efficient computing • Unlocking potentials of complex apps • Enabling Deeper scientific discovery 5 What is GPU Today? • It is a processor optimized for 2D/3D graphics, video, visual computing, and display. • It is highly parallel, highly multithreaded multiprocessor optimized for visual computing. • It provide real-time visual interaction with computed objects via graphics images, and video. • It serves as both a programmable graphics processor and a scalable parallel computing platform. – Heterogeneous systems: combine a GPU with a CPU • It is called as Many-core 6 Graphics Processing Units (GPUs): Brief History GPU Computing General-purpose computing on graphics processing units (GPGPUs) GPUs with programmable shading Nvidia GeForce GE 3 (2001) with programmable shading DirectX graphics API OpenGL graphics API Hardware-accelerated 3D graphics S3 graphics cards- single chip 2D accelerator Atari 8-bit computer IBM PC Professional Playstation text/graphics chip Graphics Controller card 1970 1980 1990 2000 2010 Source of information http://en.wikipedia.org/wiki/Graphics_Processing_Unit 7 NVIDIA Products • NVIDIA Corp.
    [Show full text]
  • Graphical Process Unit a New Era
    Nov 2014 (Volume 1 Issue 6) JETIR (ISSN-2349-5162) Graphical Process Unit A New Era Santosh Kumar, Shashi Bhushan Jha, Rupesh Kumar Singh Students Computer Science and Engineering Dronacharya College of Engineering, Greater Noida, India Abstract - Now in present days every computer is come with G.P.U (graphical process unit). The graphics processing unit (G.P.U) has become an essential part of today's mainstream computing systems. Over the past 6 years, there has been a marked increase in the performance and potentiality of G.P.U. The modern G.P.Us is not only a powerful graphics engine but also a deeply parallel programmable processor showing peak arithmetic and memory bandwidth that substantially outpaces its CPU counterpart. The G.P.U's speedy increase in both programmability and capability has spawned a research community that has successfully mapped a broad area of computationally demanding, mixed problems to the G.P.U. This effort in general-purpose computing on the G.P.Us, also known as G.P.U computing, has positioned the G.P.U as a compelling alternative to traditional microprocessors in high-performance computer systems of the future. We illustrate the history, hardware, and programming model for G.P.U computing, abstract the state of the art in tools and techniques, and present 4 G.P.U computing successes in games physics and computational physics that deliver order-of- magnitude performance gains over optimized CPU applications. Index Terms - G.P.U, History of G.P.U, Future of G.P.U, Problems in G.P.U, eG.P.U, Integrated graphics ________________________________________________________________________________________________________ I.
    [Show full text]
  • PC Hardware Contents
    PC Hardware Contents 1 Computer hardware 1 1.1 Von Neumann architecture ...................................... 1 1.2 Sales .................................................. 1 1.3 Different systems ........................................... 2 1.3.1 Personal computer ...................................... 2 1.3.2 Mainframe computer ..................................... 3 1.3.3 Departmental computing ................................... 4 1.3.4 Supercomputer ........................................ 4 1.4 See also ................................................ 4 1.5 References ............................................... 4 1.6 External links ............................................. 4 2 Central processing unit 5 2.1 History ................................................. 5 2.1.1 Transistor and integrated circuit CPUs ............................ 6 2.1.2 Microprocessors ....................................... 7 2.2 Operation ............................................... 8 2.2.1 Fetch ............................................. 8 2.2.2 Decode ............................................ 8 2.2.3 Execute ............................................ 9 2.3 Design and implementation ...................................... 9 2.3.1 Control unit .......................................... 9 2.3.2 Arithmetic logic unit ..................................... 9 2.3.3 Integer range ......................................... 10 2.3.4 Clock rate ........................................... 10 2.3.5 Parallelism .........................................
    [Show full text]
  • User's Manual
    HS-862 User’s Manual Edition 1.1 Copyright Copyright© 2002, 2003. All rights reserved. This document is copyrighted and all rights are reserved. The information in this document is subject to change without prior notice to make improvements to the products. This document contains proprietary information and protected by copyright. No part of this document may be reproduced, copied, or translated in any form or any means without prior written permission of the manufacturer. All trademarks and/or registered trademarks contains in this document are property of their respective owners. Disclaimer Taiwan Commate Computer Inc. shall not be liable for any incidental or consequential damages resulting from the performance or use of this product. Taiwan Commate Computer Inc. does not issue a warranty of any kind, express or implied, including without limitation implied warranties of merchantability or fitness for a particular purpose. The company has the right to revise the manual or include changes in the specifications of the product described within it at any time without notice and without obligation to notify any person of such revision or changes. Trademark All trademarks are the property of their respective holders. Any question please visit our website at http://www.commell.com.tw. HS-862 User’s Manual 1 Packing List Hardware HS-862 CPU Card ...................................................... X 1 Cable Kit IDE Flat Cable (40-pin) ............................................... X 1 Floppy Cable............................................................... X 1 DB9 COM / DB15 Parallel Port Cable with Bracket .... X 1 Dual USB Cable with Bracket ..................................... X 2 PS/2 Keyboard and Mouse Cable............................... X 1 Audio Cable with Bracket...........................................
    [Show full text]
  • DPX®-S415 Is the Latest, Highly Integrated Industrial Single Board Computer from Advantech-Innocore
    ® AMD® Athlon™ II Neo, Turion™ II DPX -S415 Neo Gaming Platform Features Very high performance AMD® platform Comprehensive Gaming features High performance integrated or PCI-Express graphics Low power consumption Small format GRAPHICS Introduction The DPX®-S415 is the latest, highly integrated industrial single board computer from Advantech-Innocore. The DPX®-S415 offers unrivalled performance, long lifecycle and low power with AMD’s Athlon X2 and Turion Neo processors and includes the fastest chipset graphics available (Radeon HD4270). A full feature set of I/O and COMs designed specifically for gaming devices is also included on the board and it is backward compatible both mechanically and in the software API with the DPX®-S410 series. Feature Summary Mobile AMD Athlon II X2, Turion II X2 Neo Mobile AMD Athlon X2 Neo, performance Turion X2 Neo performance AMD 785E Embedded chipset Up to 2.2GHz HT3 CPUs Highest performance chipset graphics Very low power operation (CPU 8 to 25W) CPU/Chipset Radeon HD4270 AMD 785E chipset. System Max. 8GB DDR3 SDRAM Embedded/long lifecycle chipset and CPUs 2 x Gigabit Ethernet LAN Highest performance chipset 2 x Compact Flash, 2 x CFast™ graphics – Radeon HD4270 2 x SATA DOM sockets 2 x SO-DIMM socket, 8 GB Max Sound (on-board 3 channel amp) Memory DDR3 up to 800 MHz (1600 MT/s) 2 MB SRAM, 1 MB ROM, EEPROM 64 bit OS and RAM in excess of 4 GB 5 x RS232 AMI PCI/ PnP/ ACPI BIOS 2 x CCTalk/RS232 BIOS BIOS Flash can be write protected 2 x RS232/TTL Fast boot option.
    [Show full text]
  • CHAPTER15 Video Hardware 16 1738 Ch15 7/30/04 10:31 AM Page 868
    16 1738 ch15 7/30/04 10:31 AM Page 867 CHAPTER15 Video Hardware 16 1738 ch15 7/30/04 10:31 AM Page 868 868 Chapter 15 Video Hardware Video Display Technologies Along with the mouse and keyboard, the video display is a vital part of the user interface of any computer. Actually, it is a latecomer to computing; before CRT monitors came into general use, the teletypewriter was the standard computer interface—a large, loud device that printed the input and output characters on a roll of paper. The first CRT displays used on computers were primitive by today’s standards; they displayed only text in a single color (usually green), but to users at the time they were a great improvement, allowing real- time display of input and output data. Over time, color displays were introduced, screen sizes increased, and LCD technologies moved from the portable computer to the desktop. The latest trends, large-screen plasma displays and LCD/DLP projectors, reflect the increasing convergence of entertainment and computer tech- nologies exemplified by developments such as Windows XP Media Center PCs. Today, PC video displays are much more sophisticated, but you must be careful when selecting video hardware for your computer. A slow video adapter or monitor can slow down even the fastest and most powerful PC. Incorrect monitor and video adapter combinations can also cause eyestrain or be unsuit- able for the tasks you want to accomplish. The video subsystem of a PC consists of two main components: ■ Monitor (or video display). The monitor can be a CRT or an LCD panel for desktop use, or a wide- screen LCD TV, plasma display, or projector using LCD or DLP technology.
    [Show full text]