Polygon Rendering for Interactive Visualization on Multicomputers
Total Page:16
File Type:pdf, Size:1020Kb
POLYGON RENDERING FOR INTERACTIVE VISUALIZATION ON MULTICOMPUTERS by David Allan Ellsworth A Dissertation submitted to the faculty of The University of North Carolina at Chapel Hill in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Computer Science. Chapel Hill 1996 © 1996 David Ellsworth ALL RIGHTS RESERVED ii DAVID ALLAN ELLSWORTH. Polygon Rendering for Interactive Visualization on Multicomputers (Under the direction of Professor Henry Fuchs). ABSTRACT This dissertation identifies a class of parallel polygon rendering algorithms suitable for interactive use on multicomputers, and presents a methodology for designing efficient algorithms within that class. The methodology was used to design a new polygon rendering algorithm that uses the frame-to-frame coherence of the screen image to evenly partition the rasterization at reasonable cost. An implementation of the algorithm on the Intel Touchstone Delta at Caltech, the largest multicomputer at the time, renders 3.1 million triangles per second. The rate was measured using a 806,640 triangle model and 512 i860 processors, and includes back-facing triangles. A similar algorithm is used in Pixel-Planes 5, a system that has specialized rasterization processors, and which, when introduced, had a benchmark score for the SPEC Graphics Performance Characterization Group “head” benchmark that was nearly four times faster than commercial workstations. The algorithm design methodology also identified significant performance improvements for Pixel-Planes 5. All fully parallel polygon rendering algorithms have a sorting step to redistribute primitives or fragments according to their screen location. The algorithm class mentioned above is one of four classes of parallel rendering algorithms identified; the classes are differentiated by the type of data that is communicated between processors. The identified algorithm class, called sort-middle, sorts screen-space primitives between the transformation and rasterization. The design methodology uses simulations and performance models to help make the design decisions. The resulting algorithm partitions the screen during rasterization into adaptively sized regions with an average of four regions per processor. The region boundaries are only changed when necessary: when one region is the rasterization bottleneck. On smaller systems, the algorithm balances the loads by assigning regions to processors once per frame, using the assignments made during one frame in the next. However, when 128 or more processors are used at high frame rates, the load balancing may take too long, and so static load balancing should be used. Additionally, a new all-to-all communication method improves the algorithm’s performance on systems with more than 64 processors. iii ACKNOWLEDGEMENTS I am grateful for the advice and support given by my advisor, Henry Fuchs, and by my committee members, Frederick Brooks, Jr, Kevin Jeffay, Anselmo Lastra, and John Poulton. Anselmo Lastra deserves extra thanks for reading several drafts of the dissertation. I have also profited from many discussions with other members of the Pixel-Planes project team. Special thanks go to Steve Molnar, Bill Mark, and Victoria Interrante for their discussions, and Carl Mueller and Vincent Illiano for reading a draft of the dissertation. I doubt that I would have finished without the support from those mentioned above. I am also grateful for the support provided by the National Science Foundation and the Defense Advanced Research Projects Agency, and by an Office of Naval Research fellowship. This research was performed in part using the Intel Touchstone Delta System operated by Caltech on behalf of the Concurrent Supercomputing Consortium. Initial access to the Delta was provided by Steve Taylor and Paul Messina of Caltech; the bulk of the work was done with access provided by DARPA. The NSF/DARPA Science and Technology Center for Computer Graphics and Scientific Visualization provided support for travel to the California Institute of Technology for my initial work on the Delta. iv TABLE OF CONTENTS Page LIST OF TABLES ......................................................................................................................................ix LIST OF FIGURES ....................................................................................................................................xi LIST OF ABBREVIATIONS..................................................................................................................... xvii LIST OF SYMBOLS ..................................................................................................................................xix 1 INTRODUCTION ................................................................................................................................. 1 1.1 Overview.............................................................................................................................. 1 1.1.1 Targeted Parallel System Architecture................................................................ 1 1.1.2 Targeted Applications ......................................................................................... 2 1.2 Thesis Statements ................................................................................................................2 1.3 Analysis of Redistribution Choices ..................................................................................... 2 1.4 Sort-Middle Design Methodology....................................................................................... 4 1.4.1 Design Choices for the Overall Algorithm Structure.......................................... 4 1.4.2 Design Choices for Load-Balancing the Rasterization ....................................... 7 1.4.3 Geometry Processing Design Choices ................................................................8 1.4.4 Implementation to Validate the Design Choices................................................. 9 1.4.5 Systems with Specialized Rasterization Processors............................................ 9 1.5 Description of the Target System ........................................................................................9 1.6 Structure of the Dissertation ................................................................................................10 1.7 Expected Reader Background.............................................................................................. 10 2 BACKGROUND ................................................................................................................................... 11 2.1 The Graphics Pipeline.......................................................................................................... 11 2.2 Types of Parallelism ............................................................................................................13 2.3 Algorithms for General-Purpose Parallel Computers.......................................................... 13 2.3.1 Image-Parallel Polygon Algorithms.................................................................... 14 2.3.2 Object-Parallel Polygon Algorithms ................................................................... 17 2.3.3 Ray Tracing Algorithms...................................................................................... 19 2.4 Hardware Graphics Systems................................................................................................ 20 2.4.1 Image-Parallel Systems....................................................................................... 20 2.4.2 Object-Parallel Architectures ..............................................................................27 2.5 Summary.............................................................................................................................. 29 3 CLASSES OF PARALLEL POLYGON RENDERING ALGORITHMS ........................................... 30 3.1 Description of the Algorithm Classes.................................................................................. 30 3.1.1 Sort-First.............................................................................................................. 31 3.1.2 Sort-Middle ......................................................................................................... 32 3.1.3 Sort-Last ..............................................................................................................34 3.1.4 Other Algorithms ................................................................................................35 3.2 Models of the Algorithm Classes ........................................................................................35 3.2.1 Serial Algorithm Cost.......................................................................................... 36 3.2.2 Sort-First Overhead............................................................................................. 36 3.2.3 Sort-Middle Overhead......................................................................................... 37 3.2.4 Sort-Last-Sparse Overhead ................................................................................. 38 3.2.5 Sort-Last-Full Overhead...................................................................................... 38 3.3 Comparisons of the Classes................................................................................................. 38 3.3.1 Sort-First vs. Sort-Middle ..................................................................................