Evolution of NV40

Chris Seitz

Progression of Graphics

    Demo            Chip   Transistors
    Virtua Fighter  NV1    1 M
    Wanda           NV1x   22 M
    Wolfman         NV2x   63 M
    Dawn            NV3x   130 M
    Nalu            NV4x   222 M

©2004 NVIDIA Corporation. All rights reserved.

'95-'98: Texture Mapping & Z-Buffer

[Diagram: the CPU runs the application and geometry stages; the GPU runs the rasterization stage (Rasterizer, Texture Unit, Raster Operations Unit). 2D triangles and textures travel from system memory over the PCI bus; the frame buffer and textures reside in video memory.]

Multitexturing

Base texture modulated by light map.

[Figure: base texture x light map = combined image. From UT2004 (c) Epic Games Inc. Used with permission.]

1999-2000: Transform and Lighting

[Diagram: "Fixed Function Pipeline". The CPU runs the application stage; the GPU runs the geometry stage (Transform & Lighting Unit) and the rasterization stage (Rasterizer, Texture Unit, Register Combiner, Raster Operations Unit). 3D triangles and textures travel from system memory over the AGP bus; the frame buffer and textures reside in video memory.]

2001: Programmable Vertex Shader

[Diagram: the CPU runs the application stage; the GPU runs the geometry stage (Vertex Shader, no flow control) and the rasterization stage (Rasterizer w/ Z-Cull, Texture Unit, Register Combiner with 4 textures, Raster Operations Unit). 3D triangles and textures travel from system memory over the AGP bus; the frame buffer and textures reside in video memory.]
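The Multitexturing slide above describes a base texture "modulated by" a light map. A minimal CPU-side sketch of that operation in Python follows. It assumes that "modulate" means a per-channel multiply of colors in the range [0, 1] and that both images have the same resolution; the function names `modulate` and `multitexture` are hypothetical and do not come from the slides.

```python
def modulate(base_rgb, light_rgb):
    """Multiply a base texel by a light-map texel, channel by channel.

    Both inputs are (r, g, b) tuples with components in [0, 1].
    """
    return tuple(b * l for b, l in zip(base_rgb, light_rgb))


def multitexture(base_image, light_map):
    """Apply modulate() to every pixel of two equally sized images.

    Images are lists of rows; each row is a list of (r, g, b) tuples.
    Assumption: real hardware would filter a lower-resolution light map;
    here both images are taken to have identical dimensions.
    """
    return [
        [modulate(b, l) for b, l in zip(base_row, light_row)]
        for base_row, light_row in zip(base_image, light_map)
    ]


if __name__ == "__main__":
    base = [[(1.0, 0.5, 0.25), (1.0, 1.0, 1.0)]]
    light = [[(0.5, 0.5, 0.5), (0.0, 0.0, 0.0)]]
    print(multitexture(base, light))
```

A light-map texel of (0, 0, 0) blacks out the base color and (1, 1, 1) leaves it unchanged, which is why a single multiply is enough to apply precomputed lighting.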
Vertex Shader

A programmable processor for any per-vertex computation

    void VertexShader(
        // Input per vertex
        in float4 positionInModelSpace,
        in float2 textureCoordinates,
        in float3 normal,

        // Input per batch of triangles
        uniform float4x4 modelToProjection,
        uniform float3 lightDirection,

        // Output per vertex
        out float4 positionInProjectionSpace,
        out float2 textureCoordinatesOutput,
        out float3 color
    )
    {
        // Vertex transformation
        positionInProjectionSpace = mul(modelToProjection, positionInModelSpace);

        // Texture coordinates copy
        textureCoordinatesOutput = textureCoordinates;

        // Vertex color computation
        color = dot(lightDirection, normal);
    }

Bump Mapping

Bump mapping involves fetching the per-pixel normal from a normal map texture (instead of using the interpolated vertex normal) in order to compute lighting at a given pixel.

[Figure: diffuse light + normal map = diffuse light with bumps.]

'02-'03: Programmable Pixel Shader

[Diagram: the CPU runs the application stage; the GPU runs the geometry stage (Vertex Shader with static & dynamic flow control) and the rasterization stage (Rasterizer w/ Z-Cull, Pixel Shader with static flow control, Texture Unit with 16 textures, Raster Operations Unit). 3D triangles and textures travel from system memory over the AGP bus; the frame buffer and textures reside in video memory.]

GeForce 6800

Revolutionary performance & complete Shader Model 3.0

Complete Native Shader Model 3.0 Support
  • Full support for shader model 3.0
  • Vertex Texture Fetch / Long programs / Pixel Shader flow control
  • Full speed fp32 shading

OpenEXR High Dynamic Range Rendering
  • Floating point frame buffer blending
  • Floating point texture filtering

Unparalleled Performance
  • 222M xtors / 0.13um @ IBM
  • 6 vertex units / 16 pixel pipelines

Next Generation Video
  • VMR / High quality compositing
  • Hardware MPEG encode / decode
  • HDTV Output

PCI Express
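As a CPU-side illustration of the Bump Mapping slide earlier in this section, the sketch below computes diffuse lighting from a normal fetched per pixel rather than from an interpolated vertex normal. It is written in Python, not Cg, so it can be run on its own. The [0, 1] to [-1, 1] texel encoding is a common convention assumed here, not something the slides state, and the names `decode_normal`, `diffuse`, and `shade_pixel` are hypothetical.

```python
import math


def decode_normal(texel):
    """Turn an RGB normal-map texel in [0, 1] into a unit-length normal.

    Assumption: components are stored as (n + 1) / 2, a common encoding.
    """
    n = [2.0 * c - 1.0 for c in texel]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0.0:
        # Degenerate texel: fall back to an unperturbed surface normal.
        return [0.0, 0.0, 1.0]
    return [c / length for c in n]


def diffuse(normal, light_direction):
    """Clamped N.L diffuse term; both vectors are expected to be unit length."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_direction)))


def shade_pixel(normal_map_texel, light_direction):
    """Light one pixel using the normal fetched from the normal map."""
    return diffuse(decode_normal(normal_map_texel), light_direction)


if __name__ == "__main__":
    light = (0.0, 0.0, 1.0)                      # light pointing along +Z
    print(shade_pixel((0.5, 0.5, 1.0), light))   # flat texel, faces the light
    print(shade_pixel((1.0, 0.5, 0.5), light))   # texel tilted fully toward +X
```

The only difference from per-vertex lighting is where the normal comes from: the dot product matches the one in the vertex shader above, but it is evaluated once per pixel with a normal that varies across the triangle.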
2004: Shader Model 3.0 & 64-Bit Color Support

[Diagram: the CPU runs the application stage; the GPU runs the geometry stage (Vertex Shader with static & dynamic flow control) and the rasterization stage (Rasterizer w/ Z-Cull, Pixel Shader with static & dynamic flow control, Texture Unit, Raster Operations Unit with 64-bit color). 3D triangles and textures travel from system memory over the PCIe bus; the frame buffer and textures reside in video memory.]

Shader Model 3.0

Pixel shader:
  • Longer shaders → More complex shading
  • Dynamic flow control → Better performance
  • Derivative instructions → Shader antialiasing
  • Support for 32-bit floating-point precision → Fewer artifacts
  • Face register → Faster two-sided lighting

Vertex shader:
  • Texture access → Simulation on GPU, displacement mapping
  • Geometry Instancing → Better performance

[Images: Lord of the Rings™: The Battle for Middle-earth™; Far Cry; SpeedTree.]

Complete Native DirectX 9 Support

                                 DX9     DX9.0c
    Vertex Shader Model          2.0     3.0
    Vtx Shader Instructions      256     2^16 (65,535)
    Displacement Mapping         -       yes
    Vertex Texture Fetch         -       yes
    Geometry Instancing          -       yes
    Dynamic Flow Control         -       yes

    Pixel Shader Model           2.0     3.0
    Required Shader Precision    fp24    fp32
    Pixel Shader Instructions    96      2^16 (65,535)
    Subroutines                  -       yes
    Loops & Branches             -       yes
    Dynamic Flow Control         -       yes

GeForce 6800 Graphics

Unparalleled Performance
  • Up to 8x pixel shading performance: combination of 4x the pipes and 2x the math per pipe
  • 2x vertex shading performance: MIMD architecture, dual-issue, very efficient branching
  • Next generation UltraShadow: 4x the performance of NV35, 32 ppc for z/stencil rendering
  • 256-bit DDR3, 1.1 GHz DDR data rate

GeForce 6800 series 3D Pipeline

[Diagram: Triangle Setup, Z-Cull, Shader Instruction Dispatch, L2 Tex, Fragment Crossbar, and four Memory Partitions.]
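The performance figures on the GeForce 6800 Graphics slide combine by simple multiplication, as the short Python sketch below shows. The 4x, 2x, 256-bit, and 1.1 GHz inputs come from the slide. The bandwidth formula (bus width in bytes times data rate) is the usual way to derive a theoretical peak, and it is an assumption that this is the intended reading of the slide; the function names are hypothetical.

```python
def combined_speedup(pipe_ratio, math_per_pipe_ratio):
    """Peak shading speedup when both pipe count and per-pipe math scale."""
    return pipe_ratio * math_per_pipe_ratio


def peak_bandwidth_gb_per_s(bus_width_bits, data_rate_ghz):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits / 8 gives bytes moved per transfer; data_rate_ghz is
    the effective (DDR) transfer rate in billions of transfers per second.
    """
    return bus_width_bits / 8 * data_rate_ghz


if __name__ == "__main__":
    # 4x the pipes and 2x the math per pipe gives the slide's "up to 8x".
    print(combined_speedup(4, 2))
    # 256-bit bus at a 1.1 GHz data rate: about 35.2 GB/s peak.
    print(peak_bandwidth_gb_per_s(256, 1.1))
```

Both results are upper bounds: the slide says "up to 8x", and sustained memory bandwidth is lower than the theoretical peak.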
NV40 Shader Performance

[Chart: multiples of GeForce FX 5950 performance on shaders, on a scale from 0.0 to 10.0. Benchmarks shown: UT, UE3, Tomb Raider, Splinter Cell, Shadermark, Protomark, HL2, HDR Light Demo, Halo, Gunmetal, Film, FarCry, DeusEx, Aquamark, 3DMark03.]

Shader Model 3.0 / 64-bit Floating Point Processing

Unreal Engine 3.0 Running On GeForce 6 Series
  • 2 Million Triangle Detail Mesh Models
  • High Dynamic Range Rendering
  • Fully Customizable Shaders

Unreal Engine 3.0 images courtesy of Epic Games.

Shader Model 3.0 / 64-bit Floating Point Processing

Unreal Engine 3.0 Running On GeForce 6 Series
  • 100 Million Triangle Source Content Scene
  • High Dynamic Range Rendering

Unreal Engine 3.0 images courtesy of Epic Games.

Demo: Real-Time Tone Mapping

The image is entirely computed in 64-bit color and tone-mapped for display.

[Figure: renderings of the same scene, from low to high exposure.]

Looking Ahead: Now + 10 years

[Chart: logarithmic vertical axis from 0.1 to 1,000,000; horizontal axis 1994, 2004, 2014. Series: CPU Frequency (GHz), Bus Bandwidth (GB/sec), Pixel Fill Rate (MPixels/sec), Vertex Rate (MVerts/sec), Graphics flops (GFlops/sec), Graphics Bandwidth (GB/sec). Annotated values include 0.5 Mverts, 127 Gverts, 100 MHz, and 100 GHz.]

NVIDIA SLI Multi-GPU

Thank You.