Matrox Imaging Library (MIL) 9.0

-------------------------------------------------------------------------------
Matrox Imaging Library (MIL) 9.0.
MIL 9.0 Update 35 (GPU Processing) Release Notes (July 2011)
(c) Copyright Matrox Electronic Systems Ltd., 1992-2011
-------------------------------------------------------------------------------

Main table of contents

Section 1 : Differences between MIL 9.0 Update 35 and MIL 9.0 Update 30
Section 2 : Differences between MIL 9.0 Update 30 and MIL 9.0 Update 14
Section 3 : Differences between MIL 9.0 Update 14 and MIL 9.0 Update 3
Section 4 : Differences between MIL 9.0 Update 3 and MIL 9.0
Section 5 : MIL 9.0 GPU (Graphics Processing Unit) accelerations

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

Section 1: Differences between MIL 9.0 Update 35 and MIL 9.0 Update 30

Table of Contents for Section 1

1. Overview
2. GPU acceleration restrictions
3. New GPU functionalities and improvements
   3.1. GPU accelerated image processing operations
      3.1.1. MimHistogram
4. GPU specific examples
   4.1. MilInteropWithCUDA (Updated)
   4.2. MilInteropWithDX (Updated)
   4.3. MilInteropWithOpenCL (New)
5. Is my application running on the GPU?
   5.1. Deactivate MIL Host processing compensation
   5.2. Windows SysInternals Process Explorer v15.0 (c) tool
6. Do MIL GPU results and precision change between updates?
   6.1. MIL GPU algorithms
   6.2. Graphics card firmware
7. Fixed bugs
   7.1. All modules (DX9, DX10, DX11)
8. GPU boards
   8.1. List of tested GPUs
   8.2. GPU board limitations

-------------------------------------------------------------------------------

1. Overview

   - New DirectX 11 and DirectCompute support (shader model 5.0)
      - Minimum requirement: Microsoft Windows Vista with SP2 (or later),
        Windows 7
        Note: Make sure that Windows Update 971512 (Windows Graphics,
        Imaging, and XPS Library) is installed to use DirectX 11 and
        DirectCompute in Windows Vista with SP2.
        (See http://support.microsoft.com/kb/971512)
   - General performance improvements and bug fixes (DirectX 9 and 10)
   - Interoperability support with OpenCL (through DX10/OpenCL
     interoperability)
      - See MilInteropWithOpenCL specific example

2. GPU acceleration restrictions

   - Before MIL 9.0 Update 35, a monitor had to be connected to a graphics
     card to benefit from GPU acceleration. Installing MIL 9.0 Update 35
     allows GPU acceleration on a graphics card with or without its outputs
     connected to a monitor.
     Requirements:
      - DX10 or DX11
      - Windows Vista with SP2 or later, Windows 7
      - WDDM 1.1 compatible graphics card driver
     Note: Make sure that Windows Update 971512 (Windows Graphics, Imaging,
     and XPS Library) is installed to have WDDM 1.1 driver model support in
     Windows Vista with SP2.
     (See http://support.microsoft.com/kb/971512)

   - MIL GPU supports only the DirectX versions that every detected
     graphics adapter in your system supports. If, for example, your system
     is equipped with:
      - 1 Intel HD 2000 (Intel Core-i7 2600 integrated GPU)
      - 1 AMD Radeon HD 6970 (discrete graphics adapter)
      - 1 NVIDIA GeForce GTX 480 (discrete graphics adapter)
     DirectX 9 and 10 will be supported by MIL GPU, because the Intel
     integrated GPU does not support DirectX 11. To get DirectX 11 support,
     you would have to disable the Intel integrated GPU (through a BIOS
     option in this example).

   - The M_MAPPABLE buffer attribute is no longer supported. This flag was
     allowed for Host memory GPU buffers (M_HOST_MEMORY), but it now
     generates a MIL error.

3. New GPU functionalities and improvements

3.1. GPU accelerated image processing operations

3.1.1. MimHistogram (DX11)
   - Supports M_MONO8, M_MONO16, floating-point and packed-binary source
     buffers.
   - Includes a dedicated, very fast optimization for 8-bit histograms.

4. GPU specific examples

   The Visual Studio solutions that grouped all projects for the GPU
   specific examples were removed for compatibility purposes. Go to each
   example's folder to open its individual solution and project.

4.1. MilInteropWithCUDA

   - Updated from the original version in MIL 9.0 Update 14 to support
     Windows 7 through NVIDIA CUDA Toolkit 4.0
   - Support for Visual Studio 2003 has been removed (NVIDIA CUDA Toolkit 4.0
     restriction)
   - Support for Visual Studio 2008 has been added

   This example demonstrates how to apply custom CUDA kernels to MIL buffers
   when needed, while MIL handles everything else, from buffer allocation to
   onscreen display. The second part of the example shows the same
   processing done entirely with MIL.

   Requirements:
   - An NVIDIA CUDA-compatible GPU
   - Install latest GPU driver
   - Install latest Microsoft DirectX SDK
   - Install NVIDIA CUDA Toolkit 4.0

4.2. MilInteropWithDX

   - Updated from the original example MilInteropWithDX9 included in
     MIL 9.0 Update 14.

   This example demonstrates how to apply custom DirectX 9, DirectX 10, or
   DirectCompute shaders to MIL buffers when needed, while MIL handles
   everything else, from buffer allocation to onscreen display. The second
   part of the example shows the same processing done entirely with MIL.

   Requirements:
   - Install latest GPU driver
   - Install latest Microsoft DirectX SDK

   Note: Microsoft Visual Studio 2005 (or later) and a DirectX 10 compatible
   GPU are needed to run the DirectX 10 interoperability functionality.
   Note: Microsoft Visual Studio 2008 (or later) and a DirectX 11 compatible
   GPU are needed to run the DirectCompute interoperability functionality.

4.3. MilInteropWithOpenCL

   This new example demonstrates how to apply custom OpenCL kernels to MIL
   buffers when needed, while MIL handles everything else, from buffer
   allocation to onscreen display.
   The second part of the example shows the same processing done entirely
   with MIL.

   Requirements:
   - Windows Vista and later
   - An OpenCL-compatible GPU (OpenCL 1.1)
   - Install latest GPU driver
   - Install latest Microsoft DirectX SDK
   - Install NVIDIA CUDA Toolkit 4.0 (for NVIDIA GPUs)
     or
   - Install AMD APP SDK 2.5 (for AMD GPUs)

5. Is my application running on the GPU?

   MIL currently does not provide an explicit way to know whether a function
   was executed on the GPU or on the Host CPU. This information can only be
   obtained indirectly, in two ways: the first uses MIL itself, while the
   second requires a third-party tool.

5.1. Deactivate MIL Host processing compensation

   Add this call to your application to disable Host compensation:

      MappControl(M_PROCESSING, M_COMPENSATION_DISABLE);

   From then on, every processing call that cannot be performed by the GPU
   generates a MIL error instead of silently running on the Host. Refer to
   the MappControl documentation for more information on Host processing
   compensation.

5.2. Windows SysInternals Process Explorer v15.0 (c) tool
     (http://technet.microsoft.com/en-us/sysinternals/bb896653)

   Beginning with version 15.0, the Process Explorer (c) tool includes a GPU
   usage meter similar to the Windows Task Manager performance meters. This
   tool can be used to determine whether GPU usage increases when a specific
   MIL application or function is running. According to the tool
   documentation, the GPU usage meter is supported on Windows Vista and
   later only. Go to the Windows SysInternals website for download and
   documentation.

6. Do MIL GPU results and precision change between updates?

6.1. MIL GPU algorithms

   When GPU support for a function is added in a MIL GPU update, its results
   and precision may differ from those of its MIL Host counterpart.
   Identical results can hardly be guaranteed across different hardware
   (CPU, GPU, FPGA, ...).
   However, once GPU support for a function is added in a MIL GPU update,
   the results and precision of this function should not change in
   subsequent updates (unless a specific optimization or bug fix states
   otherwise).

6.2. Graphics card firmware

   Some GPU functionalities rely on firmware installed with your graphics
   card driver. While this firmware does not necessarily change with each
   driver version, some MIL GPU results could change between two graphics
   card driver versions.

7. Fixed bugs

7.1. All modules (DX9, DX10)

   - Fix: Now returning an explicit error when trying to allocate a remote
     GPU system while the DMIL server is configured as a Windows Service.
   - Fix: MbufTransfer(M_CLEAR) could fail if the destination buffer was
     already locked on a system different from the GPU.
   - Fix: Internal DLLs are now unloaded when freeing the last allocated GPU
     system (MsysFree).
   - Fix: MimResize was compensated on the Host if one of the M_FAST or
     M_REGULAR flags was added to the InterpolationMode parameter.
   - Fix: MimLutMap results were incorrect with a 16-bit monochrome source
     and a color LUT and destination.

8. GPU boards

8.1. List of tested GPUs

   See section 8.2 for known specific GPU issues and limitations.
   - OS: XP 32
   - OS: XP 64
   - OS: VISTA 32
   - OS: VISTA 64
   - OS: Windows 7 32
   - OS: Windows 7 64

   + ATI Radeon HD 4850      (latest tested driver: 11.5)
   + ATI Radeon HD 4870      (latest tested driver: 11.5)
   + ATI Radeon HD 5870      (latest tested driver: 11.5)
   + AMD Radeon HD 6970      (latest tested driver: 11.5)
   + AMD FirePro V7800
   + AMD FirePro V8800
   + NVIDIA GeForce 8500 GT  (latest tested driver: 196.21)
   + NVIDIA GeForce 8600 GTS (latest tested driver: 196.21)
   + NVIDIA GeForce 8800 GT  (latest tested driver: 196.21)
   + NVIDIA GeForce 8800 GTX (latest tested driver: 196.21)
   + NVIDIA GeForce GTX 280  (latest tested driver: 196.21)
   + NVIDIA GeForce GTX 285  (latest tested driver: 196.21)
   + NVIDIA GeForce GTS 450  (latest tested driver: 270.61)
   + NVIDIA GeForce GTX 480  (latest tested driver: 270.61)
   + NVIDIA Quadro 4000
   + NVIDIA Quadro 6000

8.2. GPU board limitations

   - NVIDIA driver family 197.xx has shown instability issues in DirectX 10
     (which is the default acceleration mode in Windows Vista and
     Windows 7). Revert to driver 196.21 if your MIL GPU application does
     not behave as expected. This issue does not affect DirectX 9 GPU
     acceleration.
   - Floating-point exceptions can occur in NVIDIA drivers for older cards
     (GeForce 8000 series) in the DirectX 10.0 version of some functions on
     64-bit Vista. These exceptions can only occur if floating-point
     exceptions, which are usually disabled by default, are enabled in your
     application.
   - Your graphics card driver might not start if the BIOS option that maps
     PCI resources above 4 GB (64-bit IO mapping) is enabled.
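
   Illustration of the adapter rule in section 2: MIL GPU supports only the
   DirectX versions that every detected graphics adapter supports, which
   amounts to a set intersection. The C++ sketch below illustrates that rule
   only. It is not MIL code and does not query real hardware: the Adapter
   struct, the CommonDirectXVersions and ExampleSystem helpers, and the
   per-adapter version lists are assumptions made for illustration, chosen
   to match the three-adapter example given in section 2.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one graphics adapter: a display name and the
// DirectX major versions it can run. This is not a MIL type.
struct Adapter {
    std::string name;
    std::set<int> directxVersions;
};

// Returns the DirectX versions common to every adapter, i.e. the
// intersection of all per-adapter version sets. An empty adapter list
// yields an empty result.
std::set<int> CommonDirectXVersions(const std::vector<Adapter>& adapters) {
    if (adapters.empty()) {
        return {};
    }
    std::set<int> common = adapters.front().directxVersions;
    for (std::size_t i = 1; i < adapters.size(); ++i) {
        std::set<int> next;
        std::set_intersection(common.begin(), common.end(),
                              adapters[i].directxVersions.begin(),
                              adapters[i].directxVersions.end(),
                              std::inserter(next, next.begin()));
        common = std::move(next);
    }
    return common;
}

// The three-adapter system from section 2. The version lists are assumed
// values consistent with that example, not values reported by a driver.
std::vector<Adapter> ExampleSystem() {
    return {
        {"Intel HD 2000", {9, 10}},
        {"AMD Radeon HD 6970", {9, 10, 11}},
        {"NVIDIA GeForce GTX 480", {9, 10, 11}},
    };
}
```

   With these assumed lists, CommonDirectXVersions(ExampleSystem()) contains
   9 and 10 only. Removing the Intel entry, which mirrors disabling the
   integrated GPU in the BIOS, leaves 9, 10 and 11.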