GENERAL-PURPOSE ARCHITECTURES FOR MEDIA PROCESSING AND DATABASE APPLICATIONS
Parthasarathy Ranganathan
Thesis: Doctor of Philosophy
Electrical and Computer Engineering H D
H D Rice University, Houston, Texas (August 2000) RICE UNIVERSITY
General-Purpose Architectures for Media Processing and Database Workloads by Parthasarathy Ranganathan
ATHESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE Doctor of Philosophy
APPROVED,THESIS COMMITTEE:
Sarita Adve, Chair Associate Professor in Electrical and Computer Engineering
Joseph R. Cavallaro Associate Professor in Electrical and Computer Engineering
Keith D. Cooper Professor of Computer Science
Norman P. Jouppi Compaq Western Research Laboratories
Willy E. Zwaenepoel Noah Harding Professor of Computer Science and Electrical and Computer Engineering Houston, Texas August, 2000 General-Purpose Architectures for Media Processing and Database Workloads
Parthasarathy Ranganathan
Abstract
Workloads on general-purpose computing systems have changed dramatically over the past few years, with greater emphasis on emerging compute-intensive applications such as me- dia processing and databases. However, until recently, most high performance computing studies have primarily focused on scientific and engineering workloads, potentially lead- ing to designs not suitable for these emerging workloads. This dissertation addresses this limitation. Our key contributions include (i) the first detailed quantitative simulation-based studies of the performance of media processing and database workloads on systems using state-of-the-art processors, and (ii) cost-effective architectural solutions targeted at achiev- ing the higher performance requirements of future systems running these workloads. The first part of the dissertation focuses on media processing workloads. We study the effectiveness of state-of-the-art features (techniques to extract instruction-level parallelism, media instruction-set extensions, software prefetching, and large caches). Our results iden- tify two key trends: (i) media workloads on current general-purpose systems are primarily compute-bound and (ii) current trends towards devoting a large fraction of on-chip tran- sistors (up to 80%) for caches can often be ineffective for media workloads. In response to these trends, we propose and evaluate a new cache organization, called reconfigurable caches. Reconfigurable caches allow the on-chip cache transistors to be dynamically di- vided into partitions that can be used for other activities (e.g., instruction memoization, application-controlled memory, and prefetching buffers), including optimizations that ad- dress the compute bottleneck. Our design of the reconfigurable cache requires relatively few modifications to existing cache structures and has small impact on cache access times. The second part of the dissertation evaluates the performance of database workloads like online transaction processing and decision support system on shared-memory mul- tiprocessor servers with state-of-the-art processors. Our main results show that the key performance-limiting characteristics of online transaction processing workloads are (i) large instruction footprints (leading to instruction cache misses) and (ii) frequent data com- munication (leading to cache-to-cache misses). We show that both these inefficiencies can be addressed with simple cost-effective optimizations. Additionally, our analysis of opti- mized memory consistency models with state-of-the-art processors suggest that the choice of the hardware consistency model of the system may not be a dominant factor for database workloads. Acknowledgments
If I have seen farther than others, it is because I was standing on the shoul- ders of giants. Albert Einstein, 1879-1955
This dissertation would not have been possible without the guidance, help, and support of several people.