GPU Data Structures for Graphics and Vision
Total Page:16
File Type:pdf, Size:1020Kb
Gernot Ziegler GPU Data Structures for Graphics and Vision - Ph.D. Thesis - Dissertation zur Erlangung des Grades Doktoringenieur (Dr.-Ing.) der Naturwissenschaftlich-Technischen Fakultät I der Universität des Saarlandes Max-Planck-Institut für Informatik Campus E1.4 66123 Saarbrücken Germany 2 Eingereicht am / Submitted on 20. 9. 2010 in Saarbrücken. Gernot Ziegler MPI Informatik - D4 Campus E1.4 D-66123 Saarbrücken Deutschland E-mail: [email protected] Betreuender Hochschullehrer - Supervisor Prof. Dr. Christian Theobalt, Max-Planck-Institut für Informatik, Saarbrücken, Deutschland. Prüfungsvorsitzender - Chairman of Examination Committee Prof. Dr. Christoph Weidenbach, Max-Planck-Institut für Informatik, Saarbrücken, Deutschland. Gutachter - Reviewers Prof. Dr. Hans-Peter Seidel, Max-Planck-Institut für Informatik, Saarbrücken, Deutschland. Prof. Dr-Ing. Marcus Magnor, TU Braunschweig, Braunschweig, Deutschland Prof. Dr. Christian Theobalt, Max-Planck-Institut für Informatik, Saarbrücken, Deutschland. Dekan - Dean Prof. Dr. Holger Hermanns, Universität des Saarlandes, Saarbrücken, Deutschland. Promovierter akademischer Mitarbeiter - Academic Member of the Faculty having a Doctorate Dr. Robert Herzog, Max-Planck-Institut für Informatik, Saarbrücken, Deutschland. Datum des Kolloquiums - Date of Defense 6. Mai 2011 - May 6th, 2011 3 Abstract Graphics hardware has in recent years become increasingly programmable, and its programming APIs use the stream processor model to expose massive parallelization to the programmer. Unfortunately, the inherent restrictions of the stream processor model, used by the GPU in order to maintain high performance, often pose a problem in porting CPU algorithms for both video and volume processing to graphics hardware. Serial data dependencies which accelerate CPU processing are counterproductive for the data-parallel GPU. This thesis demonstrates new ways for tackling well-known problems of large scale video/volume analysis. In some instances, we enable processing on the restricted hardware model by re- introducing algorithms from early computer graphics research. On other occasions, we use newly discovered, hierarchical data structures to circumvent the random-access read/fixed write restriction that had previously kept sophisticated analysis algorithms from running solely on graphics hardware. For 3D processing, we apply known game graphics concepts such as mip-maps, projective texturing, and dependent texture lookups to show how video/volume processing can benefit algorithmically from being implemented in a graphics API. The novel GPU data structures provide drastically increased processing speed, and lift processing heavy operations to real-time performance levels, paving the way for new and interactive vision/graphics applications. Kurzfassung Graphikhardware wurde in den letzen Jahren immer weiter programmierbar. Ihre APIs verwenden das Streamprozessor-Modell, um die massive Parallelisierung auch für den Programmierer verfügbar zu machen. Leider folgen aus dem strikten Streamprozessor-Modell, welches die GPU für ihre hohe Rechenleistung benötigt, auch Hindernisse in der Portierung von CPU-Algorithmen zur Video- und Volumenverarbeitung auf die GPU. Serielle Datenabhängigkeiten beschleunigen zwar CPU-Verarbeitung, sind aber für die daten-parallele GPU kontraproduktiv . Diese Arbeit präsentiert neue Herangehensweisen für bekannte Probleme der Video- und Volumensverarbeitung. Teilweise wird die Verarbeitung mit Hilfe von modifizierten Algorithmen aus der frühen Computergraphik-Forschung an das beschränkte Hardwaremodell angepasst. Anderswo helfen neu entdeckte, hierarchische Datenstrukturen beim Umgang mit den Schreibzugriff-Restriktionen die lange die Portierung von komplexeren Bildanalyseverfahren verhindert hatten. In der 3D-Verarbeitung nutzen wir bekannte Konzepte aus der Computerspielegraphik wie Mipmaps, projektive Texturierung, oder verkettete Texturzugriffe, und zeigen auf welche Vorteile die Video- und Volumenverarbeitung aus hardwarebeschleunigter Graphik-API-Implementation ziehen kann. Die präsentierten GPU-Datenstrukturen bieten drastisch schnellere Verarbeitung und heben rechenintensive Operationen auf Echtzeit-Niveau. Damit werden neue, interaktive Bildverarbeitungs- und Graphik-Anwendungen möglich. 4 Summary This doctoral thesis details how well-known tasks in large scale video/volume analysis, previously too complex for graphics hardware, can now be implemented using our new algorithmic concepts and data structures. The content begins with an introduction to graphics hardware and the graphics processing unit (GPU). We explain how graphics hardware works internally, how it differs from the CPU, and how its high performance is achieved. Afterwards we elaborate how the high performance design choices, namely parallelization and massive data processing, impose several restrictions on the programming model. Finally, we describe how general purpose computing on the GPU (GPGPU) emerged from the increasing flexibility of the graphics hardware, as well as which hardware/driver features are pivotal for GPGPU applications. Chapter 3 first explains the use of graphics hardware in Free Viewpoint Video (FVV) processing, which requires a merger of computer graphics and image processing to generate novel viewpoints from multi-view video streams. We describe how graphics hardware can not only assist in FVV playback but also in the compression of multi view video. In Chapter 4, we introduce how the camera/projector dualism can be implemented on graphics hardware. We demonstrate its usefulness in interactive camera calibration via proxy geometry, explain how depth projection can be easier represented in the OpenGL graphics API, and provide several approaches to fast reconstruction of novel views from depth video. Chapter 5 is dedicated to hierarchical image processing, which fits graphics hardware very well due to its regular structure. We explain how depth reconstruction via a plane-sweep approach benefits from graphics hardware acceleration. Then, the chapter continues on feature clustering based on reduction, which we use in the real-time detection of connected feature regions in images. The last section addresses how histograms can be computed in a data-parallel fashion, and how camera lens compensation can be accelerated via rapid histogram analysis of feature contour angles. Chapter 6 introduces a hierarchical data structure for GPU-based data compaction, the HistoPyramid, originally only a byproduct of fast histogram computation. We demonstrate how a traversal of HistoPyramids can circumvent many architectural restrictions of graphics hardware in the compaction of large data arrays to few selected input elements. The chapter continues with first applications: Real-time point cloud extraction and Streamline visualization for volume data, and concludes with future applications of the discovered data structure and algorithms. Chapter 7 demonstrates how HistoPyramids even prove useful in data expansion, a concept for replication and modification of input elements while writing an output array. After explaining the necessary modifications, we present two key applications: Real-time iso-surface extraction from volume data via the marching cubes algorithm, and Light Wavefront Simulation via adaptive ODEs. Chapter 8 introduces QuadPyramids, a descendant of HistoPyramids, which is used to cluster common regions in input data. With the help of these data structures, compact quadtrees can be extracted directly into graphics memory. We show first results from real-time video analysis, sketch relevant applications, and discuss how different methods of feature clustering affect the outcome. Chapter 9 presents the QuadPyramid´s logical extension to 3D, the OcPyramid. Beyond the promotion of the original algorithm to three dimensions, we explain how algorithmic extensions enable pointer octree generation on graphics hardware, and can thus contribute to a renaissance for the use of octrees in real-time computer graphics. Chapter 10 summarizes the contributions of the presented research and concludes the thesis with an outlook on the future development of GPU algorithms in the wider context of hardware evolution. 5 Contents 1. Overview................................................................................................15 1.1. Motivation and Overview........................................................................................15 1.2. Supervised student theses........................................................................................17 1.3. Publications.............................................................................................................17 2. The GPU - An Introduction.................................................................18 2.1. The History of Graphics Hardware.........................................................................18 2.2. General Purpose Computing on the GPU................................................................21 2.2.1. Early Studies .................................................................................................................21 2.2.2. Data-Parallel Architecture..............................................................................................22 2.2.3. GPU vs. CPU.................................................................................................................24 2.2.4. Related Architectures.....................................................................................................25