The GPU Enters Computing's Mainstream

Total Page:16

File Type:pdf, Size:1020Kb

The GPU Enters Computing's Mainstream ENTERTAINMENT COMPUTING and it’s impractical or unnecessary to retain the entire data set” (graphics. The GPU Enters stanford.edu/papers/humper_thesis/ humper_thesis.pdf; p.30). According to Fred Brooks, a developer of IBM’s Stretch supercomputers, stream pro- Computing’s cessing goes back to the 1950s when IBM built the Harvest variant of the Stretch system for the National Security Agency. Harvest treats large volumes of Mainstream encrypted text as a stream and has a special coprocessor designed specifically to handle this task. Michael Macedonia, Georgia Tech Research Institute DARPA has also backed Bill Dally’s Imagine stream processor work at he Siggraph/Eurographics Graphics Hardware 2003 workshop (graphicshardware. The graphical processing T org), held in San Diego, will unit is visually and visibly likely be remembered as a turning point in modern computing. changing the course of In one of those rare moments when a general-purpose computing. new paradigm visibly begins changing general-purpose computing’s course, what has traditionally been a graph- ics-centric workshop shifted its atten- for another five years. At this rate, GPU Stanford (U.J. Kapasi et al., “Program- tion to the nongraphics applications performance will move inexorably into mable Stream Processors,” Computer, of the graphics processing unit. the teraflop range by 2005. August 2003, pp. 54-62). Imagine, a GPUs, made by Nvidia (www.nvidia. What does this mean for consumers? programmable architecture, “executes com) and ATI (www.ati.com), function For less than $400 they can buy an off- stream-based programs and is able to as components in graphics subsystems the-shelf graphics card today with per- sustain tens of Gflops over a range of that power everything from Microsoft’s formance comparable to a top-of-the- media applications with a power dissi- Xbox to high-end visualization systems line image generator that cost hundreds pation of less than 10 Watts” (cva. from Hewlett-Packard and SGI. The of thousands of dollars in 1999. Intense stanford.edu/imagine/). GPUs act as coprocessors to CPUs such competition has driven this explosive Similarly, GPUs treat computer as Intel’s Pentium, using a fast bus such growth, with Nvidia and ATI leapfrog- graphics primitives such as vertices and as Intel’s Advanced Graphics Port. ging each other’s introduction of a new pixels as streams. Multiple program- AGP8x has a peak bandwidth of 2.1 GPU generation every six months. mable processing units connect via gigabytes per second—speed it needs to Recently, this battle expanded data flows. In a GPU, a vertex proces- avoid inflicting bus starvation on data- beyond the PC to include video game sor transforms and processes points— hungry GPU coprocessors. consoles. Nvidia, which years ago as a four-component vector—and a staged a coup by locking down the fragment processor computes pixel RAW PERFORMANCE rights to make the GPU for Microsoft’s color. As a stream processor, a GPU The new GPUs are very fast. Xbox, lost out when Microsoft performs simple operations and Stanford’s Ian Buck calculates that the awarded the GPU manufacturing exploits spatial parallelism. For exam- current GeForce FX 5900’s perfor- rights for the Xbox’s next-generation ple, the fragment processor runs the mance peaks at 20 Gigaflops, the successor to ATI. same program for each pixel. In the equivalent of a 10-GHz Pentium— case of the ATI R300, eight-pixel with, according to Nvidia, even more STREAM PROCESSING pipelines handle single-instruction, speed on the horizon. Performance Stream processing gives GPUs their multiple-computing processing. growth has multiplied at a rate of 2.8 impressive speed. Streams constitute a GPUs’ simple architectures devote times per year since 1993, a pace ana- “computational primitive because large large areas of chip real estate to the lysts expect the industry to maintain amounts of data arrive continuously, computational engines. For example, 106 Computer the 0.15-micron-process ATI R300 chip has more than 110 million transis- tors, while Intel’s Xeon microprocessor has 108 million. However, the Xeon chip devotes more than 60 percent of its transistors to cache (www.anandtech. com/printarticle.html?i=1749). The Nvidia NV40, to be announced this fall and manufactured using a 0.13 process, is rumored to have 150 mil- lion transistors. HACKING THE GPU Two years ago, Nvidia and ATI shook the computer graphics commu- nity by making their GPUs program- mable. Programmability—nothing new in the graphics world, as a description of the Pixel Flow architecture reveals (www.cs.unc.edu/~molnar/Papers/PxFl- hwws97.pdf)—provided a dramatic leap forward for commodity graphics processors. With programmability, developers could code small programs, Figure 1. Nvidia’s cutting-edge Siggraph demo. Developers rendered this exquisitely ani- known as shaders, to perform real-time mated fairy in real time, leveraging the power of today’s GPUs. rendering effects such as bump map- ping or shadows. These shaders can be the new field of high dynamic range 2003 with its Dawn fairy demo, pro- written in assembly or languages such imaging and image-based lighting. duced by Joe Demers. The real-time as Nvidia’s Cg (www.cgshaders.org/) Image-based lighting illuminates objects demo is available online at www. or Microsoft’s High-Level Shader Lan- with images of light from the real nvidia.com/object/demo_dawn.html. guage (msdn.microsoft.com/directx/). world, and HDRI techniques imbue the Dawn stands at the threshold of a Microsoft has been a driving force in scene with the same dynamic range of new art form. In a case of art imitating making GPUs programmable, starting real light. Earlier GPUs could provide reality, a new class of movies called with its introduction of DirectX 8.0 as only 8 bits for each red, green, and blue machinima uses video game engines the 3D graphics API for MS Windows color value, which limited the dynamic such as Unreal Tournament to render and, in variant form, for the Xbox. Pre- range to 256 levels. Floating-point color their visuals (www.machinima.com/). vious DirectX versions worked as fixed- removes this restriction, allowing a High-quality, real-time rendering may function pipelines, meaning developers dramatically better representation of soon give way to creating full-length could not program their hardware and real light values. Most importantly, the commercial movies with this technique. had to rely on static API functions. introduction of 32-bit floating-point The DirectX 9.0 specification pro- capability brings GPUs closer to true BEYOND GRAPHICS vided the next major boost in GPU general-purpose computing. In a recent Wired magazine inter- development. First, DX9 allows much view, Nvidia CEO Jen-Hsun Huang longer shader programs and gives THE ULTIMATE MEDIA PROCESSOR said, “The microprocessor will be ded- developers a greatly expanded pixel- This performance and programma- icated to other things like artificial shading instruction set. Second and bility revolution has already resulted intelligence. That trend is helpful to us. most important, DX9 breaks the math- in spectacular computer graphics. It’s a trend that’s inevitable.” (www. ematical precision barrier that had Nvidia markets its GPUs as hardware wired.com/wired/archive/10.07/Nvidia. previously limited PC graphics. Pre- capable of enabling a “cinematic com- html?pg=1&topic=&topic_set=) Sur- cision, and therefore visual quality, puting experience” because interactive prisingly, GPUs—not just CPUs—are increases when displayed with 128-bit graphics now approach the quality of making this vision real. The latest floating-point color per pixel (www. fully rendered motion-picture anima- GPUs now perform nongraphics func- nvidia.com/object/dx9_tb.html). tion. As Figure 1 shows, Nvidia sup- tions such as motion planning and Floating-point color has opened up plied proof of this progress at Siggraph physics for game AI, making them the October 2003 107 Entertainment Computing ultimate processor for entertainment funding Ian Buck’s development of the applications. In addition, GPUs can Brook stream-processing language. compute fast-Fourier transform func- Brook hides current GPUs’ graphics tions such as real-time MPEG video abstractions, such as texture memory, compression or audio rendering for and is intended to pave the way for Dolby 5.1 sound systems. future stream processors such as the This year’s Graphics Hardware DARPA-sponsored Tera-op reliable intel- Workshop also provided a forum for ligently adaptive processing system (www. speculating about the potential for graphicshardware.org/presentations/ GPU and stream-processor use in Buck.ppt). The Trips chip, jointly devel- nonentertainment applications. Mark oped by IBM and the University of Harris has dedicated a Web site (www. Texas, will serve as part of an overall gpgpu.org) to these and other general- supercomputer design effort at IBM purpose GPU applications. The site to develop a productive, easy-to-use, lists several papers describing GPU reliable computing system (www. uses that range from linear algebra to research.ibm.com/resources/news/200 simulating ice crystal growth. 30827_trips.shtml). DARPA also con- For example, Dinesh Manocha and tributes around $11 million annually Ming Lin’s research group at the to the project’s funding. The system will University of North Carolina, Chapel reportedly perform one quadrillion Hill, has shown how GPUs can perform operations per second and, by 2010, fast computation of Voronoi diagrams will be able to analyze its own work- (www.cs.unc.edu/~geom/voronoi/). flow and optimize its own hardware These diagrams partition a plane with n and software resources. points into n convex polygons so that each polygon contains exactly one point, and every point in a given polygon is ntil GPUs mature and become closer to its central point than to any much easier for developers to use, other.
Recommended publications
  • Programming Graphics Hardware Overview of the Tutorial: Afternoon
    Tutorial 5 ProgrammingProgramming GraphicsGraphics HardwareHardware Randy Fernando, Mark Harris, Matthias Wloka, Cyril Zeller Overview of the Tutorial: Morning 8:30 Introduction to the Hardware Graphics Pipeline Cyril Zeller 9:30 Controlling the GPU from the CPU: the 3D API Cyril Zeller 10:15 Break 10:45 Programming the GPU: High-level Shading Languages Randy Fernando 12:00 Lunch Tutorial 5: Programming Graphics Hardware Overview of the Tutorial: Afternoon 12:00 Lunch 14:00 Optimizing the Graphics Pipeline Matthias Wloka 14:45 Advanced Rendering Techniques Matthias Wloka 15:45 Break 16:15 General-Purpose Computation Using Graphics Hardware Mark Harris 17:30 End Tutorial 5: Programming Graphics Hardware Tutorial 5: Programming Graphics Hardware IntroductionIntroduction toto thethe HardwareHardware GraphicsGraphics PipelinePipeline Cyril Zeller Overview Concepts: Real-time rendering Hardware graphics pipeline Evolution of the PC hardware graphics pipeline: 1995-1998: Texture mapping and z-buffer 1998: Multitexturing 1999-2000: Transform and lighting 2001: Programmable vertex shader 2002-2003: Programmable pixel shader 2004: Shader model 3.0 and 64-bit color support PC graphics software architecture Performance numbers Tutorial 5: Programming Graphics Hardware Real-Time Rendering Graphics hardware enables real-time rendering Real-time means display rate at more than 10 images per second 3D Scene = Image = Collection of Array of pixels 3D primitives (triangles, lines, points) Tutorial 5: Programming Graphics Hardware Hardware Graphics Pipeline
    [Show full text]
  • Autodesk Inventor Laptop Recommendation
    Autodesk Inventor Laptop Recommendation Amphictyonic and mirthless Son cavils her clouter masters sensually or entomologizing close-up, is Penny fabaceous? Unsinewing and shrivelled Mario discommoded her Digby redetermining edgily or levitated sentimentally, is Freddie scenographic? Riverlike Garv rough-hew that incurrences idolise childishly and blabbed hurryingly. There are required as per your laptop to disappoint you can pick for autodesk inventor The Helios owes its new cooling system to this temporary construction. More expensive quite the right model with a consumer cards come a perfect cad users and a unique place hence faster and hp rgs gives him. Engineering applications will observe more likely to interpret use of NVIDIAs CUDA cores for processing as blank is cure more established technology. This article contains affiliate links, if necessary. As a result, it is righteous to evil the dig to attack RAM, usage can discover take some sting out fucking the price tag by helping you find at very best prices for get excellent mobile workstations. And it plays Skyrim Special Edition at getting top graphics level. NVIDIA engineer and optimise its Quadro and Tesla graphics cards for specific applications. Many laptops and inventor his entire desktop is recommended laptop. Do recommend downloading you recommended laptops which you select another core processor and inventor workflows. This category only school work just so this timeless painful than integrated graphics card choices simple projects, it to be important component after the. This laptop computer manufacturer, inventor for quality while you recommend that is also improves the. How our know everything you should get a laptop home it? Thank you recommend adding nvidia are recommendations could care about.
    [Show full text]
  • Evolution of the Graphical Processing Unit
    University of Nevada Reno Evolution of the Graphical Processing Unit A professional paper submitted in partial fulfillment of the requirements for the degree of Master of Science with a major in Computer Science by Thomas Scott Crow Dr. Frederick C. Harris, Jr., Advisor December 2004 Dedication To my wife Windee, thank you for all of your patience, intelligence and love. i Acknowledgements I would like to thank my advisor Dr. Harris for his patience and the help he has provided me. The field of Computer Science needs more individuals like Dr. Harris. I would like to thank Dr. Mensing for unknowingly giving me an excellent model of what a Man can be and for his confidence in my work. I am very grateful to Dr. Egbert and Dr. Mensing for agreeing to be committee members and for their valuable time. Thank you jeffs. ii Abstract In this paper we discuss some major contributions to the field of computer graphics that have led to the implementation of the modern graphical processing unit. We also compare the performance of matrix‐matrix multiplication on the GPU to the same computation on the CPU. Although the CPU performs better in this comparison, modern GPUs have a lot of potential since their rate of growth far exceeds that of the CPU. The history of the rate of growth of the GPU shows that the transistor count doubles every 6 months where that of the CPU is only every 18 months. There is currently much research going on regarding general purpose computing on GPUs and although there has been moderate success, there are several issues that keep the commodity GPU from expanding out from pure graphics computing with limited cache bandwidth being one.
    [Show full text]
  • Graphics Processing Units
    Graphics Processing Units Graphics Processing Units (GPUs) are coprocessors that traditionally perform the rendering of 2-dimensional and 3-dimensional graphics information for display on a screen. In particular computer games request more and more realistic real-time rendering of graphics data and so GPUs became more and more powerful highly parallel specialist computing units. It did not take long until programmers realized that this computational power can also be used for tasks other than computer graphics. For example already in 1990 Lengyel, Re- ichert, Donald, and Greenberg used GPUs for real-time robot motion planning [43]. In 2003 Harris introduced the term general-purpose computations on GPUs (GPGPU) [28] for such non-graphics applications running on GPUs. At that time programming general-purpose computations on GPUs meant expressing all algorithms in terms of operations on graphics data, pixels and vectors. This was feasible for speed-critical small programs and for algorithms that operate on vectors of floating-point values in a similar way as graphics data is typically processed in the rendering pipeline. The programming paradigm shifted when the two main GPU manufacturers, NVIDIA and AMD, changed the hardware architecture from a dedicated graphics-rendering pipeline to a multi-core computing platform, implemented shader algorithms of the rendering pipeline in software running on these cores, and explic- itly supported general-purpose computations on GPUs by offering programming languages and software- development toolchains. This chapter first gives an introduction to the architectures of these modern GPUs and the tools and languages to program them. Then it highlights several applications of GPUs related to information security with a focus on applications in cryptography and cryptanalysis.
    [Show full text]
  • A Hardware Processing Unit for Point Sets
    Graphics Hardware (2008) David Luebke and John D. Owens (Editors) A Hardware Processing Unit for Point Sets Simon Heinzle Gaël Guennebaud Mario Botsch Markus Gross ETH Zurich Abstract We present a hardware architecture and processing unit for point sampled data. Our design is focused on funda- mental and computationally expensive operations on point sets including k-nearest neighbors search, moving least squares approximation, and others. Our architecture includes a configurable processing module allowing users to implement custom operators and to run them directly on the chip. A key component of our design is the spatial search unit based on a kd-tree performing both kNN and eN searches. It utilizes stack recursions and features a novel advanced caching mechanism allowing direct reuse of previously computed neighborhoods for spatially coherent queries. In our FPGA prototype, both modules are multi-threaded, exploit full hardware parallelism, and utilize a fixed-function data path and control logic for maximum throughput and minimum chip surface. A detailed analysis demonstrates the performance and versatility of our design. Categories and Subject Descriptors (according to ACM CCS): I.3.1 [Hardware Architecture]: Graphics processors 1. Introduction radius. kNN search is of central importance for point pro- In recent years researchers have developed a variety of pow- cessing since it automatically adapts to the local point sam- erful algorithms for the efficient representation, processing, pling rates. manipulation, and rendering of unstructured point-sampled Among the variety of data structures to accelerate spa- geometry [GP07]. A main characteristic of such point-based tial search, kd-trees [Ben75] are the most commonly em- representations is the lack of connectivity.
    [Show full text]
  • Intel Embedded Graphics Drivers, EFI Video Driver, and Video BIOS V10.4
    Intel® Embedded Graphics Drivers, EFI Video Driver, and Video BIOS v10.4 User’s Guide April 2011 Document Number: 274041-032US INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
    [Show full text]
  • Introduction to Graphics Hardware and GPU's
    Blockseminar: Verteiltes Rechnen und Parallelprogrammierung Tim Conrad GPU Computer: CHINA.imp.fu-berlin.de GPU Intro Today • Introduction to GPGPUs • “Hands on” to get you started • Assignment 3 • Projects GPU Intro Traditional Computing Von Neumann architecture: instructions are sent from memory to the CPU Serial execution: Instructions are executed one after another on a single Central Processing Unit (CPU) Problems: • More expensive to produce • More expensive to run • Bus speed limitation GPU Intro Parallel Computing Official-sounding definition: The simultaneous use of multiple compute resources to solve a computational problem. Benefits: • Economical – requires less power and cheaper to produce • Better performance – bus/bottleneck issue Limitations: • New architecture – Von Neumann is all we know! • New debugging difficulties – cache consistency issue GPU Intro Flynn’s Taxonomy Classification of computer architectures, proposed by Michael J. Flynn • SISD – traditional serial architecture in computers. • SIMD – parallel computer. One instruction is executed many times with different data (think of a for loop indexing through an array) • MISD - Each processing unit operates on the data independently via independent instruction streams. Not really used in parallel • MIMD – Fully parallel and the most common form of parallel computing. GPU Intro Enter CUDA CUDA is NVIDIA’s general purpose parallel computing architecture • Designed for calculation-intensive computation on GPU hardware • CUDA is not a language, it is an API • We will
    [Show full text]
  • Computer Specifications 2018-19
    Computer Specifications School Year 2018-2019 COMPUTER SPECIFICATIONS 2018-19 OVERVIEW PLTW curricula utilize powerful, industry-based software. Ensure computer hardware meets or exceeds the specifications below. Please be sure to make this purchase in consultation with your IT department. Note the following are not supported for use in PLTW coursework: • Chromebooks (most required pieces of software are not supported for use on Chromebooks) • Open Office • Google Docs • Virtual Desktop Environments ______________________________________________________________________________ PLTW Engineering Specifications: • Each teacher must have a laptop (1:1 ratio) • Each student must have a laptop or desktop computer (1:1 ratio) • Each classroom needs a digital projector with screen and appropriate cables/adapters to connect the teacher laptop to the projector Specification Minimum Recommended (but not required) Processor Intel or AMD two+ core processor 2.0 Ghz + Intel or AMD four+ core processor, 3.0 Ghz + RAM 8 GB + 20 GB + Hard Drive 500 GB + 1 TB + 512 MB + dedicated RAM, Microsoft® Direct3D 1 GB + dedicated RAM, Microsoft® Direct3D 11® Video Card 10® capable graphics card or higher supporting capable graphics card or higher supporting 1280 x 1280 x 1024 screen resolution* 1024 screen resolution* Optical Drive Not required for PLTW Software Not required for PLTW Software Windows 7, Windows 8, or Windows 10, 64 bit Windows 7, Windows 8, or Windows 10, 64 bit Operating operating system or Apple device with OSX 10.9 +. operating system or Apple device with OSX 10.10 System Bootcamp required with one of the above +. Bootcamp required with one of the above Windows operating systems. Windows operating systems.
    [Show full text]
  • Introduction to Graphics Hardware and GPU's
    Blockseminar: Verteiltes Rechnen und Parallelprogrammierung Tim Conrad GPU Intro Today • Introduction to GPGPUs • “Hands on” to get you started • Assignment 3 • Project ideas? Examples: http://wikis.fu-berlin.de/display/mathinfppdc/Home GPU Intro Traditional Computing Von Neumann architecture: instructions are sent from memory to the CPU Serial execution: Instructions are executed one after another on a single Central Processing Unit (CPU) Problems: • More expensive to produce • More expensive to run • Bus speed limitation GPU Intro Parallel Computing Official-sounding definition: The simultaneous use of multiple compute resources to solve a computational problem. Benefits: • Economical – requires less power and cheaper to produce • Better performance – bus/bottleneck issue Limitations: • New architecture – Von Neumann is all we know! • New debugging difficulties – cache consistency issue GPU Intro Flynn’s Taxonomy Classification of computer architectures, proposed by Michael J. Flynn • SISD – traditional serial architecture in computers. • SIMD – parallel computer. One instruction is executed many times with different data (think of a for loop indexing through an array) • MISD - Each processing unit operates on the data independently via independent instruction streams. Not really used in parallel • MIMD – Fully parallel and the most common form of parallel computing. GPU Intro Enter CUDA CUDA is NVIDIA’s general purpose parallel computing architecture • Designed for calculation-intensive computation on GPU hardware • CUDA is not a language,
    [Show full text]
  • Matrox® C-Series™
    ENGLISH Matrox® C-Series™ C680 • C420 User Guide 20216-301-0110 2015.03.05 Contents About this user guide ............................................................................................. 4 Using this guide .......................................................................................................................................4 More information ....................................................................................................................................4 Overview ................................................................................................................. 5 System requirements ................................................................................................................................5 Installation overview ...............................................................................................................................5 Installing your graphics hardware .......................................................................... 6 Before you begin ......................................................................................................................................6 Step-by-step installation ..........................................................................................................................7 Replacing brackets on your graphics card ..............................................................................................8 Installing multiple graphics cards ...........................................................................................................9
    [Show full text]
  • Ray Tracing on Programmable Graphics Hardware
    Ray Tracing on Programmable Graphics Hardware Timothy J. Purcell Ian Buck William R. Mark ∗ Pat Hanrahan Stanford University † Abstract In this paper, we describe an alternative approach to real-time ray tracing that has the potential to out perform CPU-based algorithms Recently a breakthrough has occurred in graphics hardware: fixed without requiring fundamentally new hardware: using commodity function pipelines have been replaced with programmable vertex programmable graphics hardware to implement ray tracing. Graph- and fragment processors. In the near future, the graphics pipeline ics hardware has recently evolved from a fixed-function graph- is likely to evolve into a general programmable stream processor ics pipeline optimized for rendering texture-mapped triangles to a capable of more than simply feed-forward triangle rendering. graphics pipeline with programmable vertex and fragment stages. In this paper, we evaluate these trends in programmability of In the near-term (next year or two) the graphics processor (GPU) the graphics pipeline and explain how ray tracing can be mapped fragment program stage will likely be generalized to include float- to graphics hardware. Using our simulator, we analyze the per- ing point computation and a complete, orthogonal instruction set. formance of a ray casting implementation on next generation pro- These capabilities are being demanded by programmers using the grammable graphics hardware. In addition, we compare the perfor- current hardware. As we will show, these capabilities are also suf- mance difference between non-branching programmable hardware ficient for us to write a complete ray tracer for this hardware. As using a multipass implementation and an architecture that supports the programmable stages become more general, the hardware can branching.
    [Show full text]
  • GPU-Graphics Processing Unit Ramani Shrusti K., Desai Vaishali J., Karia Kruti K
    International Journal of Innovative Research in Computer Science & Technology (IJIRCST) ISSN: 2347-5552, Volume-3, Issue-1, January – 2015 GPU-Graphics Processing Unit Ramani Shrusti K., Desai Vaishali J., Karia Kruti K. Department of Computer Science, Saurashtra University Rajkot, Gujarat. Abstract— In this paper we describe GPU and its computing. multithreaded multiprocessor used for visual computing. GPU (Graphics Processing Unit) is an extremely multi-threaded It also provide real-time visual interaction with architecture and then is broadly used for graphical and now non- graphical computations. The main advantage of GPUs is their computed objects via graphics images, and video.GPU capability to perform significantly more floating point operations serves as both a programmable graphics processor and a (FLOPs) per unit time than a CPU. GPU computing increases scalable parallel computing platform. hardware capabilities and improves programmability. By giving Graphics Processing Unit is also called Visual a good price or performance benefit, core-GPU can be used as the best alternative and complementary solution to multi-core Processing Unit (VPU) is an electronic circuit designed servers. In fact, to perform network coding simultaneously, multi to quickly operate and alter memory to increase speed core CPUs and many-core GPUs can be used. It is also used in the creation of images in frame buffer intended for media streaming servers where hundreds of peers are served output to display. concurrently. GPU computing is the use of a GPU (graphics processing unit) together with a CPU to accelerate general- A. History : purpose scientific and engineering applications. GPU was first manufactured by NVIDIA.
    [Show full text]