Low Power Data-Dependent Transform Video and Still Image Coding

by

Thucydides Xanthopoulos

Bachelor of Science in Electrical Engineering, Massachusetts Institute of Technology, June 1992

Master of Science in Electrical Engineering and Computer Science, Massachusetts Institute of Technology, February 1995

Submitted to the Department of Electrical Engineering and Computer Science in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology

February 1999

© 1999 Massachusetts Institute of Technology. All rights reserved.

Signature of Author Department of Electrical Engineering and Computer Science February 1, 1999

Certified by Anantha Chandrakasan, Ph.D. Associate Professor of Electrical Engineering Thesis Supervisor

Accepted by Arthur Clarke Smith, Ph.D. Professor of Electrical Engineering Graduate Officer

Low Power Data-Dependent Transform Video and Still Image Coding

by

Thucydides Xanthopoulos

Submitted to the Department of Electrical Engineering and Computer Science on February 1, 1999 in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Science

Abstract

This work introduces the idea of data-dependent video coding for low power. Algorithms for the Discrete Cosine Transform (DCT) and its inverse are introduced which exploit statistical properties of the input data in both the space and spatial frequency domains in order to minimize the total number of arithmetic operations. Two VLSI chips implementing the DCT and its inverse (IDCT) have been built as a proof of concept of data-dependent processing. The IDCT core exploits the presence of a large number of zero-valued spectral coefficients in the input stream when stimulated with MPEG-compressed video sequences. A data-driven IDCT computation algorithm along with clock gating techniques is used to reduce the number of arithmetic operations for video inputs. The second chip is a DCT core processor that exhibits two innovative techniques for arithmetic operation reduction in the DCT computation context, along with standard voltage scaling techniques such as pipelining and parallelism. The first method reduces the bitwidth of arithmetic operations in the presence of data spatial correlation. The second method trades off power dissipation and quality (arithmetic precision). Both chips are fully functional and exhibit the lowest switched capacitance per sample among past DCT/IDCT chips reported in the literature. Their power dissipation profile shows a strong dependence on certain statistical properties of the data that they operate on, according to the design goal.

Thesis Supervisor: Anantha Chandrakasan
Title: Associate Professor of Electrical Engineering

Acknowledgments

I am very grateful to a number of people for help and contributions during my years as a student. I would like to thank my thesis advisor and mentor Anantha Chandrakasan for all his help and support. Anantha has been a constant source of stimulation and new research ideas and his impact on this work is immeasurable. I thank him for treating me as a peer and not as a subordinate, for providing me with the best possible (and expensive!) equipment for my work, for sending me to numerous conferences (as far as Hawaii and Switzerland) and for having a lot of faith in my abilities. Above all I wish to thank him for the unprecedented amount of time he has devoted to me and my work.

I wish to thank my thesis readers Tayo Akinwande and Ichiro Masaki for their time and for the very useful comments and suggestions. This work has been greatly improved because of their time and effort.

I am very grateful to my fellow graduate students in 38-107 for a number of reasons. Raj Amirtharajah has been a constant source of wit and interesting ideas (in addition to complaints!) and has been very helpful in clarifying various ideas regarding this work. He has probably had dinner with me more often in the little round table in 38-107 than he did with his girlfriend! Tom Simon, the brilliant designer, has been behind all the interesting circuits of the DCT chip such as the pads and the flip flops. His cerebral sarcasm and our long discussions about France, Mahler and crazy dictators have been excellent and fulfilling chip layout breaks! Jim Goodman, in addition to being an incessant source of coarse Canadian humor, has been instrumental in setting up the technology files and the libraries for the ADI . His help during the DCT chip tapeout has been invaluable. Scott Meninger, fellow Patriot fan, taught me how to throw the football and introduced me to Mississippi Mud in addition to being an excellent officemate.
He is probably the only one in 38-107 that can match my beer drinking abilities! Wendi Rabiner has been a great source of comments and suggestions both for my thesis and my area exam. Josh Bretz introduced me to the humor of the “Onion”, supplied me with great CDs and taught me never to bet against him especially when money is involved. SeongHwan Cho has been a great help with my area exam and has graciously added my name to all the VLD papers. Vadim Gutnik has been a source of interesting discussions and has been very helpful with Unix problems. Gangs Konduri has graciously agreed to help me with a very useful Perl script. I also thank him for the late night study breaks. I wish to thank Rex Min for doing a great job on the DCT/ IDCT demonstration board. I thank Dan McMahill, Don Hitko and

Shnidman for stimulating discussions over the past years. Dan has also helped me a great deal with the PCB software. I did not have much interaction with the newest members of our group Amit Sinha and Alice Wang due to my grueling schedule during the past nine months. Yet, I am well aware of their outstanding abilities and I wish them best of luck in their studies.

I wish to thank Elizabeth Chung and Margaret Flaherty who have been extremely helpful with administrative matters. Tom Lohman, Myron Freeman and Tom Wingard have been very helpful with the computing infrastructure.

I thank National Corporation for providing me with a fellowship during my last two years. I thank Michael Sampogna, Chuck Traylor and Hemant Bheda for all their help during my summer at National. I thank SHARP Corporation for their partial financial support for this research. I thank Yoshifumi Yaoi of SHARP for his help with the DAC paper and with the MPEG compression standard in general.

I wish to thank Rich Lethin for entrusting me with interesting projects and providing me with great opportunities. I thank Larry Dennison for his valuable advice in time of need and for his excellent SKILL tools that I used extensively throughout the development of the DCT chip. I would like to thank my Master thesis advisor Bill Dally for his help and support during my years at the MIT AI Laboratory. I also thank him for pointing me to the FMIDCT paper.

I wish to thank my uncle John Xanthopoulos and aunt Virginia Xanthopoulos for all their help and support during my years in the States. Their house has always been open for me and their love has been unconditional. I thank my grandmothers Georgia Xanthopoulos and Themelina Kapellas for their love and support from the day I was born. I thank my parents Nikos and Maria and my sister Georgia for their love and support. I know how much you have missed me during the past 10 years because I have missed you exactly the same if not more.
You have made me what I am and for this I am eternally grateful. My trips to Greece and our telephone conversations have been an incessant source of strength and courage for me to go on. I know that you want me to come back soon. I promise to you that I will come back sooner than you think.

Finally, I wish to thank my companion Sylvi Dogramatzian for her unconditional love and support from the day we met. Your love has made me a better person. I thank you for your patience and all your sacrifices. I promise that I will make up for all the lost ground. This work is as much yours as it is mine.

I wish to dedicate this work to the memory of both my grandfathers, Thucydides Xanthopoulos and Manolis Kapellas.

Contents

1 Introduction 17
1.1 A Case for Low Power Hardware Compression/Decompression Macrocells 18
1.2 Low Power Design Methodology ...... 19
1.2.1 Algorithm Optimizations ...... 19
1.2.2 Architectural Optimizations ...... 21
1.2.3 Circuit Optimizations ...... 21
1.3 Thesis Contributions and Overview ...... 21

2 Background 23
2.1 The Discrete Cosine Transform and its Inverse ...... 23
2.1.1 Fast DCT/IDCT Algorithms ...... 24
2.2 Distributed Arithmetic ...... 25
2.2.1 On the Successive Approximation Property of Distributed Arithmetic 27
2.3 Quantization ...... 29
2.3.1 The Lloyd-Max Optimum Quantizer ...... 29
2.4 Image Quality Measure ...... 31
2.5 The MPEG Video Compression Standard ...... 31

3 IDCT Core Processor 33
3.1 Background and Algorithmic Design ...... 33
3.2 Chip Architecture ...... 36
3.2.1 Clock Gating ...... 40
3.3 Circuit Design ...... 42
3.3.1 Arithmetic Circuits ...... 42
3.3.2 Flip-Flops ...... 45


3.3.3 I/O Pads ...... 46
3.4 Power Estimation Methodology ...... 46
3.4.1 Glitch Power Investigation ...... 49
3.4.2 Pipeline Power Savings Investigation ...... 49
3.5 Chip I/O Specification and Usage ...... 52
3.6 Chip Physical Design and Packaging ...... 53
3.7 Testing ...... 55
3.7.1 Test Setup ...... 55
3.7.2 Test Results ...... 57
3.7.3 Comparison with Past DCT/IDCT VLSI Implementations ...... 63
3.8 Chapter Summary and Conclusion ...... 66

4 Adaptive Transform Coding 67
4.1 Early Adaptive Schemes [TW71] [Gim75] ...... 67
4.2 Chen and Smith Adaptive Coder [CS77] ...... 68
4.3 Scene Adaptive Coder [CP84] ...... 70
4.4 Subjectively Adapted Image Coding for Progressive Transmission [Loh84] 71
4.5 Adaptive DCT in the Perceptual Domain [NLS89] ...... 72
4.6 Adaptive DCT Based on Coefficient Power Distribution Classification [KMO87] 73
4.7 Spectral Entropy-Activity Classification [MF92] ...... 74
4.8 Other Adaptive Methods ...... 75
4.9 Chapter Summary and Conclusion ...... 76

5 DCT Core Processor 79
5.1 DCT Algorithm and Chip Architecture ...... 79
5.1.1 MSB Rejection (MSBR) ...... 81
5.1.2 Row-Column Classification (RCC) ...... 83
5.2 Algorithm and Architectural Evaluation ...... 87
5.2.1 Row/Column Classification Investigation ...... 90
5.3 Circuit Design ...... 96
5.3.1 Flip-Flops ...... 96
5.3.2 Read-Only Memory ...... 103
5.3.3 ...... 105

5.3.4 MSB Rejection ...... 106
5.3.5 I/O Pads ...... 107
5.4 Power Estimation and Breakdown ...... 110
5.5 Chip I/O Specification and Usage ...... 114
5.5.1 JTAG Programming ...... 114
5.5.2 Chip Regular Operation ...... 117
5.5.3 Serial Output ...... 118
5.6 Chip Physical Design and Packaging ...... 118
5.7 Testing ...... 122
5.7.1 Test Setup ...... 122
5.7.2 Test Results ...... 124
5.8 DCT/IDCT Demonstration Board ...... 134
5.9 Chapter Summary and Conclusion ...... 134

6 Conclusion 135
6.1 Algorithmic Contributions ...... 135
6.2 Architectural/Circuit Contributions ...... 136
6.3 System-level Contributions ...... 137

Bibliography 139

A Design Tools and Methods 145
A.1 Design Flow ...... 145
A.2 Functional Chip Model ...... 147
A.3 Library Development ...... 147
A.3.1 Magic to Cadence Translation ...... 148
A.3.2 CMOS Cell Library Generator (layoutGen) ...... 152
A.4 Schematic Entry and Gate-Level Simulation ...... 153
A.5 Schematic Circuit Simulation ...... 155
A.5.1 Static Net Fanout Checking ...... 156
A.6 Physical Design ...... 158
A.6.1 Custom Standard Cell Place and Route Tool ...... 158
A.7 Extracted Layout Simulation ...... 163
A.7.1 Capacitance Extraction ...... 165
A.7.2 Netlist Renaming (rnmNetlist) ...... 166
A.8 Summary and Conclusion ...... 166

B DCT AC Coefficient Distributions: Test Image PEPPERS 167

C Chip Pinout Information 171

List of Figures

1.1 Histogram of Non-Zero DCT Coefficients in a Sample MPEG Stream . . . . 20

2.1 Distributed Arithmetic ROM and Accumulator (RAC) Structure ...... 27
2.2 MPEG Group of Pictures ...... 32

3.1 2D Chen Separable IDCT Algorithm ...... 34

3.2 Probabilities of Non-Zero Occurrence for 8 × 8 DCT Coefficient Blocks ...... 35
3.3 Histogram of Non-Zero DCT Coefficients in Sample MPEG Stream ...... 36
3.4 Number of Operations Comparison Between Chen and Data-Driven IDCT ...... 37
3.5 IDCT Chip Block Diagram ...... 37
3.6 Transposition Structure (TRAM) ...... 38
3.7 Pipelined Multiply-Accumulate (MAC) Structure ...... 39
3.8 Clock Gating Approach in a Pipeline ...... 41
3.9 Potential Race Conditions in Clock-Gated Pipelines ...... 41
3.10 Hybrid Adder Schematics ...... 43
3.11 Hspice Simulation of CMOS vs. Hybrid Adder ...... 44
3.12 Mirror Adder Schematics ...... 44
3.13 Basic TSPC Flop Used in the IDCT Chip ...... 45
3.14 TSPC Flop with Asynchronous Clear ...... 46
3.15 IDCT Chip Output Pad ...... 47
3.16 IDCT Chip Input Pad ...... 47
3.17 IDCT Chip Pythia Simulation Results ...... 48
3.18 Stage 1 NZ Coefficients per Block vs. Corresponding Stage 0 NZ Coefficients for a Typical MPEG Sequence ...... 50
3.19 Stage 1 Pythia Simulation Results ...... 51


3.20 Glitch Power Estimation ...... 51
3.21 Pipeline Power Estimation ...... 52
3.22 IDCT Chip Timing Diagram ...... 53
3.23 IDCT Chip Microphotograph ...... 54
3.24 IDCT Chip Test Board Block Diagram ...... 56
3.25 IDCT Chip Test Board Photograph ...... 56
3.26 IDCT Chip Schmoo Plot ...... 57
3.27 IDCT Chip Measured Power Results at 1.32V, 14 MHz ...... 58
3.28 Power Dissipation per Sequence ...... 59
3.29 Power Dissipation Profile for 6 MPEG Sequences ...... 60
3.30 Average Block Power Dissipation per MPEG Macroblock Type ...... 61
3.31 Average Power Dissipation per Block Position ...... 62

3.32 Power for Blocks Containing 1 NZ Coefficient of Magnitude  16 ...... 62
3.33 Comparison of Simulated and Experimental Power Results ...... 63

4.1 Chen-Smith Transform Adaptive Coding System [CS77] ...... 69 4.2 Chen-Pratt Scene Adaptive Coder [CP84] ...... 70

5.1 DCT Core Processor Block Diagram ...... 80
5.2 DCT ROM and Accumulator (RAC) ...... 82
5.3 MSB Equality Detection and Sign Extension Detection Example ...... 83
5.4 Traditional vs. Proposed Adaptive Transform Framework ...... 84
5.5 Correlation Between Proposed and Standard Classifier ...... 85
5.6 Image Row/Column Classification Threshold Determination ...... 86
5.7 Row/Column Classifier Implementation ...... 87
5.8 Images Used for Architectural Experiments ...... 88
5.9 Histogram of Dot Product Operations vs. Bitwidth ...... 89
5.10 Comparison of DCT Additions (no RCC) ...... 90
5.11 Additive Truncation/Quantization Noise Model ...... 91
5.12 Optimization Algorithm ...... 94
5.13 Optimization Algorithm Trace ...... 95
5.14 Optimization Algorithm Traces for all 11 Test Images ...... 97
5.15 Comparison of Image PSNRs ...... 98

5.16 Optimization Algorithm Traces With and Without Classification ...... 99
5.17 Edge-Triggered Static D Flip-Flop [Sim99] ...... 100
5.18 Edge-Triggered Static D Flip-Flop with Clear (Clock Low Only) ...... 101
5.19 Edge-Triggered Static D Flip-Flop with Set (Clock Low Only) ...... 101
5.20 Edge-Triggered Static D Flip-Flop with Clear ...... 102
5.21 Edge-Triggered Static D Flip-Flop with Set ...... 102
5.22 Fully Static Flop ...... 103
5.23 ROM Schematic ...... 104
5.24 ROM Access Time vs. Supply Voltage ...... 104
5.25 20-bit Carry-Bypass Adder ...... 105
5.26 Full Adder Used in the DCT Chip ...... 106
5.27 Carry-Bypass Adder Delay vs. Supply Voltage ...... 107
5.28 MSB Rejection Circuit ...... 108
5.29 Sign Extension Detection Circuit ...... 109
5.30 Circuit Used to Determine Driver Transistor Sizes ...... 109
5.31 Output Pad Schematic ...... 111
5.32 Output Pad Hspice Simulation ...... 111
5.33 Input Pad Schematic ...... 112
5.34 DCT Chip Power Estimation Results ...... 113
5.35 DCT Chip Power Estimation Results (No Interconnect) ...... 113
5.36 DCT Chip Stage 0 Power Estimation Results ...... 114

5.37 JTAG Threshold Registers ...... 116
5.38 CYCLEREGTOP{0,1} Register Bit Fields ...... 117
5.39 CYCLEREGBOT{0,1} Register Bit Fields ...... 118
5.40 DCT Chip Timing Diagram ...... 119
5.41 Input and Output Block Sequence ...... 119
5.42 DCT Chip Microphotograph ...... 121
5.43 DCT Chip Test Board Block Diagram ...... 122
5.44 DCT Chip Test Board Photograph ...... 123
5.45 DCT Chip Test Setup Photograph ...... 124
5.46 DCT Chip Schmoo Plot ...... 125
5.47 Power Dissipation of 11 Test Images at 1.56V, 14 MHz ...... 126

5.48 DCT Chip Power vs. Pel Standard Deviation (1) ...... 127
5.49 DCT Chip Power vs. Pel Standard Deviation (2) ...... 128
5.50 DCT Chip Power vs. Block Element Value ...... 128
5.51 DCT Chip Power Comparison for Different Stimuli ...... 129
5.52 DCT Chip Power vs. Computation Bitwidth ...... 130
5.53 DCT Chip Power vs. Compressed Image Quality ...... 131
5.54 Compressed Image Quality and Power ...... 132
5.55 DCT Chip Measured vs. Estimated Power ...... 133
5.56 DCT/IDCT Demonstration Board ...... 134

A.1 Design Flow ...... 146
A.2 Magic Layout Cell Description ...... 149
A.3 mag2skill Technology File ...... 150
A.4 Translation of a Magic Via Array to Cadence Layout ...... 151
A.5 SKILL Code Generated by mag2skill ...... 151
A.6 Translation of a Magic 2-stage Driver ...... 152
A.7 Standard Cells Produced by layoutGen ...... 154
A.8 Static Net Fanout Checker Entry Form ...... 156
A.9 Module Generator SKILL Source Code ...... 159
A.10 Placement, Routing Channel Identification and Power/Ground Routing ...... 160
A.11 Cell Instantiation Procedure ...... 161
A.12 Cell Placement Procedure ...... 162
A.13 Example Wiring Function ...... 163
A.14 Example Generated Traces ...... 164
A.15 Automatically Generated Random Logic Block ...... 165
A.16 Wiring Parasitic Capacitance Equivalence ...... 166

C.1 IDCT Chip Package Footprint ...... 173
C.2 DCT Chip Package Footprint ...... 175

List of Tables

2.1 Fast FDCT/IDCT Algorithms ...... 25

3.1 IEEE Standard 1180-1990 Compliance ...... 40
3.2 Rise and Fall Delays for Hybrid and CMOS Adders ...... 45
3.3 Process and IDCT Chip Specifications ...... 55
3.4 MPEG Sequences Used for Testing the IDCT Chip ...... 58

3.5 Mitsubishi Electric DCT/IDCT Processor [UIT+92] ...... 64

3.6 Toshiba 1994 DCT/IDCT Macrocell [MHS+94] ...... 64

3.7 Toshiba 1996 DCT/IDCT Macrocell [KFN+96] ...... 64
3.8 AT&T Bell Labs 1993 IDCT Processor [BH95] ...... 65
3.9 Energy Efficiency Comparison Among DCT/IDCT Chips ...... 65

4.1 Example Classification Using Lohscheller’s Visual Adaptive Scheme [Loh84] 72
4.2 DCT-based Adaptive Coders ...... 76

5.1 Sample Correlation of Proposed and Standard Classifier ...... 86
5.2 RAC Cycle Upper Limits Per Class ...... 86
5.3 Average RAC Cycles and Image PSNR ...... 92
5.4 PSNRs for Conventional DCT plus Quantization ...... 92
5.5 Luminance Quantization Step ...... 92
5.6 Chrominance Quantization Step Matrix ...... 93
5.7 DCT Chip JTAG Data Registers ...... 115
5.8 DCT Chip JTAG Fields ...... 115
5.9 Class Thresholds Used ...... 115
5.10 Allowable Clock Mask Values ...... 117
5.11 Bit Allocation per Coefficient Position ...... 119


5.12 Process and DCT Chip Specifications ...... 120
5.13 DCT Chip Area Breakdown ...... 120
5.14 Average Power Dissipation for All Test Images ...... 126
5.15 Energy Efficiency Comparison Among DCT/IDCT Chips ...... 133

C.1 IDCT Chip Pinout ...... 172
C.2 DCT Chip Pinout ...... 174

Chapter 1

Introduction

Transform coding has been a dominant method of video and still image compression. Transform coding works by compacting the energy of spatially correlated picture elements (pels) into a small number of spatial frequencies: fewer spatial frequencies are required to describe a still picture or video frame than space-domain pels. The reason for this compaction is that images on average exhibit spatial correlation, which results in high-magnitude low-spatial-frequency coefficients and low-magnitude high-frequency coefficients in the transform domain. Moreover, low-magnitude higher-frequency coefficients may be discarded during the compression process at the expense of virtually imperceptible visual artifacts. Since its conception in 1974, the Discrete Cosine Transform (DCT) [ANR74] has been the dominant transform of choice for video and still image compression applications. The DCT uses cosines as basis functions, as opposed to the complex exponentials used in the Discrete Fourier Transform (DFT). As a result, it exhibits almost a factor of 2 less computational complexity because it performs real as opposed to complex arithmetic. Low computational complexity is only one advantage of the DCT over other transforms. Another important advantage is that, due to its symmetry, the DCT does not exhibit the ringing artifacts (otherwise known as the Gibbs phenomenon) that arise when discontinuous signals undergo transformation and subsequent inverse transformation back to the spatial domain. This

results in much less conspicuous “blocking artifacts” than the DFT for the same compression factor when applied to disjoint n × n image pixel regions. The final advantage of the DCT is that it produces virtually optimally decorrelated frequency-domain coefficients. This property is only slightly surpassed by the theoretically optimal but computationally intensive Karhunen-Loeve Transform. Due to these most desirable properties, the DCT has been part of virtually every published video and still image compression standard (e.g. JPEG, MPEG-1, MPEG-2). Until recently, dedicated processing hardware was necessary for DCT/IDCT computation in real-time video applications. MPEG-2 compression/decompression for a

720 × 480 frame size at 30 frames per second (fps) 4:2:0 requires approximately 132 MOPS (Mega Operations Per Second) worst case, assuming that a fast computation algorithm is

used. Furthermore, 8 × 8 2D DCT/IDCT computation algorithms usually exhibit 8-way


fine-grain parallelism due to the lack of data dependencies at each computation stage in the signal flowgraph. Current 300 MHz microprocessors with instruction sets enhanced with multimedia extensions (i.e. Intel MMX) are capable of performing multiple instructions per cycle by partitioning a single 64-bit integer register into 2 32-bit, 4 16-bit or 8 8-bit independent integer units operating in parallel. Such instruction sets are ideal for the implementation of signal processing tasks that involve multiple independent low-precision integer operations. A 300 MHz Intel Pentium II is capable of 1.2 GOPS on 16-bit media data types. Given the above numbers, we observe that the DCT/IDCT occupies approximately 11% of the processing resources available in commodity microprocessors today in a real-time video application. The remaining 89% of the machine resources may be allocated to other video processing tasks such as motion compensation and entropy decoding, or even graphics computations for updating the display.

The analysis above only counts arithmetic operations as opposed to total machine instructions, and does not include machine cycles consumed by control instructions (such as conditional and unconditional branches) and data reorganization instructions (register moves, pack and unpack media instructions, as well as memory loads and stores). As a result, the resources required by a 2D DCT/IDCT computation are underestimated. A more accurate analysis of the computational requirements of the DCT on a general-purpose machine can be made using the following published facts: Intel has recently published an application note showcasing the benefits of MMX technology that exhibits an MMX implementation of the AAN [AAN88] IDCT algorithm. The machine code requires 240 cycles per 8 × 8 coefficient block. On a 300 MHz Pentium II, an 8 × 8 block therefore requires 0.8 µsec of processing. To sustain 720 × 480 real-time 4:2:0 video at 30 fps, the worst-case block processing time (including all other compression/decompression steps) must not exceed 4.1 µsec. Therefore, the DCT real-time computation requirements consume about 20% of the machine resources.
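The 4.1 µsec budget and the 20% utilization figure follow directly from the cited cycle count; a quick back-of-the-envelope check (a sketch, not code from the thesis):

```python
# A 720x480 4:2:0 frame carries 1.5 samples per luma pel position:
# one luma plane plus two quarter-size chroma planes.
pels_per_frame = 720 * 480 * 3 // 2        # 518400 pels
blocks_per_frame = pels_per_frame // 64    # 8100 8x8 blocks
fps = 30
cycles_per_block = 240                     # cited Intel MMX AAN IDCT figure
cpu_hz = 300e6

block_budget_us = 1e6 / (fps * blocks_per_frame)               # worst-case time per block
idct_share = blocks_per_frame * fps * cycles_per_block / cpu_hz
print(f"{block_budget_us:.1f} us per block, IDCT share {idct_share:.0%}")
# -> 4.1 us per block, IDCT share 19%
```

The result rounds to the "about 20%" quoted in the text.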

1.1 A Case for Low Power Hardware Compression/Decompression Macrocells

Although the real-time computation requirements of fast DCT/IDCT algorithms do not pose a challenge to contemporary general-purpose microprocessors, the power dissipation of such a software implementation can be substantial. According to Intel published specifications [Int98], a Tillamook-class mobile Pentium MMX (1.8V, 233 MHz) exhibits about 7 nF of switched capacitance per machine cycle, which corresponds to 5.3 Watts of power dissipation at 1.8V, 233 MHz. The DCT/IDCT software computation

required for a single 720 × 480 video frame corresponds to 9 mF of switched capacitance (1.3M machine cycles at 7 nF per cycle). A software application supporting real-time video at 30 fps will need 0.88 Watts of power just for the transform computation. This power level is unacceptable for portable battery-powered devices such as mobile video-playback and videoconferencing terminals, or information appliances with built-in web browsers. On the other hand, carefully optimized hardware macrocells can exhibit as little as 0.1

mF of switched capacitance per video frame, which amounts to < 5 mWatts of power dissipation for the real-time transform computation. Power savings over a software implementation (after process feature size normalization) exceed a factor of 200. At current process feature sizes and supply voltage levels, fixed-function processing units are required for real-time portable media appliances.

Low power multimedia processing macrocells have non-mobile applications as well. Current trends in general-purpose microprocessor design, fueled by the availability of more and more transistors, dictate the integration of various and diverse application-specific units on the same chip as the CPU. The main motivation is cost and ease of use. An example of such a design approach is the new Cyrix/National Semiconductor MediaGX microprocessor, which includes a graphics accelerator, an Extended Data Out (EDO) DRAM controller, a display controller with SVGA output and a PCI controller on the same chip as the compatible CPU. The next-generation MediaGX, scheduled to be released in the second half of 1998, will also integrate MMX support, an Accelerated Graphics Port (AGP) controller, Digital Versatile Disc (DVD) playback and a synchronous DRAM controller. The MediaGX CPU has pioneered the sub-$700 PC, which has achieved great popularity in the consumer market. Such levels of integration present acute problems to system designers and integrators. Excessive power dissipation can prevent function integration due to heat removal problems, or require the use of expensive packaging and cooling mechanisms, thus raising total system cost. The use of low power macrocells implementing common and frequently used tasks will help lower the system average and peak power dissipation and will enable even greater levels of integration and lower implementation costs.
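The power figures quoted above all follow from the dynamic-power relation P = C·V²·f, where C is the switched capacitance and f the rate at which it is switched. A small sketch reproducing the numbers (the 1.3 V hardware supply is an assumption for illustration, chosen near the measured chip operating points):

```python
def dynamic_power(c_switched, vdd, rate_hz):
    """Dynamic power P = C * Vdd^2 * f for capacitance C switched f times per second."""
    return c_switched * vdd**2 * rate_hz

# Mobile Pentium MMX: ~7 nF switched per cycle at 1.8 V, 233 MHz
cpu_w = dynamic_power(7e-9, 1.8, 233e6)    # ~5.3 W
# Software DCT/IDCT: ~9 mF switched per 720x480 frame, 30 frames/s
sw_dct_w = dynamic_power(9e-3, 1.8, 30)    # ~0.87 W
# Hardware macrocell: ~0.1 mF per frame at an assumed ~1.3 V supply
hw_dct_w = dynamic_power(0.1e-3, 1.3, 30)  # ~5 mW
print(cpu_w, sw_dct_w, hw_dct_w)
```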
It is the focus of this research to propose ultra low power hardware DCT/IDCT implementations suitable for use in portable media applications and for integration in future highly integrated systems-on-a-chip. This thesis will explore novel methods for power reduction that span both the algorithmic and architectural design space. The final product of this work will include a DCT/IDCT chipset exhibiting the lowest reported switched capacitance requirements per sample, when process parameters are normalized.

1.2 Low Power Design Methodology

Our methodology for power reduction involves three main steps:

1.2.1 Algorithm Optimizations

Since the DCT/IDCT macrocells are targeted for video and still image coding, the computation algorithm is optimized for the statistical characteristics of video and image data in both the spatial and frequency domains. The main characteristic of spatial image data is spatial correlation. This property can be used to reduce the number of bits required to describe such a dataset by appropriate level

shifting or other related linear operations. Moreover, adaptive bitwidth computation units can be designed that exhibit reduced activity in the presence of low-magnitude inputs. The overall effect will be reduced switching activity within the arithmetic units without affecting computation precision.

[Figure: histogram (histo.ps); x-axis: Number of Non-Zero Coefficients, 0 to 64; y-axis: Number of Blocks, 0 to 50000]
Figure 1.1: Histogram of Non-Zero DCT Coefficients in a Sample MPEG Stream
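The level-shifting idea above can be illustrated in a few lines (a sketch with made-up pel values, not code from the thesis): spatially correlated pels cluster around a common value, so subtracting the mid-gray level 128 shrinks their magnitudes and hence the bitwidth the datapath must switch.

```python
def signed_bits(x):
    """Two's-complement bits needed to represent the signed integer x."""
    return x.bit_length() + 1

row = [128, 131, 133, 130, 127, 125, 126, 129]       # correlated pels, range 0..255
raw_bits = max(signed_bits(p) for p in row)          # 9 bits as signed values
shifted = [p - 128 for p in row]                     # level shift around mid-gray
shifted_bits = max(signed_bits(p) for p in shifted)  # only 4 bits remain active
print(raw_bits, shifted_bits)   # -> 9 4
```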

An orthogonal method of reducing activity in the presence of correlated inputs can be devised based on a priori knowledge of the output data distribution and the corresponding output quantization matrix. Spatially correlated video inputs result in low-magnitude high spatial frequencies, which most often are quantized to zero. Adaptive thresholding mechanisms can be implemented that power down computation units and feed forward zeroes at the outputs upon examination and classification of intermediate results, thus resulting in further activity reduction.

Finally, there exists output coefficient precision information embedded within the output quantization matrix. Special hardware control mechanisms can use this a priori information and dynamically lower the computation precision of certain output frequencies, resulting in minimal and practically imperceptible PSNR degradation vs. a baseline coder. Such a technique will also result in reduced switching activity within the arithmetic units.
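To illustrate the idea (a hypothetical sketch, not the chip's actual control logic): in a bit-serial distributed-arithmetic datapath, a coefficient destined for quantization step q only needs to be resolved to within roughly q/2, so low-order serial cycles can be dropped.

```python
import math

def useful_serial_cycles(full_bits, q_step):
    """Bit-serial cycles that still affect the quantized result for step q_step.

    Computation errors smaller than the quantization step are invisible after
    quantization, so roughly log2(q_step) low-order cycles can be skipped.
    """
    skippable = max(int(math.log2(q_step)), 0)
    return max(full_bits - skippable, 1)

print(useful_serial_cycles(16, 1))    # fine step: all 16 cycles needed
print(useful_serial_cycles(16, 32))   # coarse step: ~11 cycles suffice
```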

On the other hand, the inputs to the IDCT unit exhibit rather different statistical properties. The input sequence is decorrelated but exhibits a distribution heavily skewed towards zero. An example of a typical MPEG sequence DCT coefficient histogram is shown in Figure 1.1.

An IDCT algorithm has been used that exploits the presence of zero coefficients by preventing them from entering the computation pipeline. This method has resulted in activity reduction proportional to the percentage of zero-valued coefficients within the video block.
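The effect can be sketched in a few lines (an illustrative 1-D model, not the chip's distributed-arithmetic datapath): only nonzero coefficients ever enter the accumulation, so the multiply-accumulate count scales with the nonzero fraction.

```python
import math

def idct_1d_data_driven(X):
    """1-D 8-point IDCT that skips zero-valued input coefficients."""
    c = lambda u: 1 / math.sqrt(2) if u == 0 else 1.0
    x, macs = [0.0] * 8, 0
    for u, coeff in enumerate(X):
        if coeff == 0:
            continue                      # zero coefficient: no arithmetic at all
        for i in range(8):
            x[i] += c(u) / 2 * coeff * math.cos((2 * i + 1) * u * math.pi / 16)
            macs += 1
    return x, macs

# A typical MPEG coefficient column is heavily skewed toward zero:
pels, macs = idct_1d_data_driven([80, 0, -12, 0, 0, 0, 0, 0])
print(macs)   # -> 16 multiply-accumulates instead of 64
```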

The collection of algorithmic optimizations used has reduced the total switching of the macrocells and is the main source of power savings.

1.2.2 Architectural Optimizations

Previously proposed architecture-driven voltage scaling techniques [CSB92] [CBB94] have been adapted and used for power minimization. Deep pipelining has been used to achieve voltage scaling at real-time data rates. Moreover, extensive clock gating has been used to capitalize on the adaptive utilization of the arithmetic units.

1.2.3 Circuit Optimizations

Special circuit techniques have been used to implement arithmetic circuits that exhibit good performance at low voltage, given the asymmetrical speed degradation of PMOS vs. NMOS transistors. Moreover, efforts have been made to design common circuits (i.e. flops and arithmetic circuits) that minimize switched capacitance and thus power dissipation.

1.3 Thesis Contributions and Overview

This dissertation introduces the idea of data-dependent video processing for low power and proposes two data-driven algorithms for the computation of the DCT and IDCT that minimize the average number of arithmetic operations required when operating on image or video data. In addition, it presents a proof-of-concept VLSI implementation of a DCT/IDCT chipset (chapters 3 and 5). The power dissipation profile of both chips depends on some statistical property of the input data, according to the design goal. Throughout the course of this work, a number of custom CAD tools have been developed and used during the design of the DCT and IDCT chips. Appendix A describes these tools and documents the design flow.

Chapter 2

Background

This chapter provides disjoint condensed treatments of various subjects directly related to this work in order to render this document self-contained. First, the mathematical definition of the discrete cosine transform and its inverse is presented along with a set of fast algorithms for its computation. A comprehensive presentation of distributed arithmetic follows. Distributed arithmetic is a bit-serial computation mechanization that is used extensively in the DCT core processor. We also present a concise definition of quantization along with a method for designing optimum quantizers (Lloyd-Max). Quantization is a post-DCT operation that reduces the precision of less visually significant spectral coefficients for compression purposes. We take advantage of this operation in the design of the DCT chip by folding the precision reduction inside the DCT computation and saving arithmetic operations. The chapter concludes with the definition of the peak signal-to-noise ratio (PSNR) as an image quality measure, which is used extensively in the DCT chapter (chapter 5), and a brief description of the MPEG compression standard, which is an important application for both our chips.

2.1 The Discrete Cosine Transform and its Inverse

The Forward Discrete Cosine Transform (FDCT) is the process of decomposing a set of image samples into a scaled set of discrete cosine basis functions. Conversely, the process of reconstructing a set of image samples from a set of scaled discrete cosine basis functions is called the Inverse Discrete Cosine Transform (IDCT). The 8-point 1-dimensional FDCT and the 8-point 1-dimensional IDCT are defined as follows:

X[u] = \frac{c[u]}{2} \sum_{i=0}^{7} x[i] \cos\frac{(2i+1)u\pi}{16}   (2.1)

x[i] = \sum_{u=0}^{7} \frac{c[u]}{2} X[u] \cos\frac{(2i+1)u\pi}{16}   (2.2)

where c[u] = 1/\sqrt{2} if u = 0 and 1 otherwise.
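As a sanity check on this transform pair, the following short Python sketch (illustrative only; the thesis hardware does not use this code) evaluates eqs. 2.1 and 2.2 directly and verifies that they invert each other:

```python
import math

def fdct_1d(x):
    # 8-point 1-D forward DCT per eq. 2.1
    X = []
    for u in range(8):
        c = 1 / math.sqrt(2) if u == 0 else 1.0
        s = sum(x[i] * math.cos((2 * i + 1) * u * math.pi / 16) for i in range(8))
        X.append(0.5 * c * s)
    return X

def idct_1d(X):
    # 8-point 1-D inverse DCT per eq. 2.2
    x = []
    for i in range(8):
        s = sum((1 / math.sqrt(2) if u == 0 else 1.0) / 2 * X[u]
                * math.cos((2 * i + 1) * u * math.pi / 16) for u in range(8))
        x.append(s)
    return x

samples = [52, 55, 61, 66, 70, 61, 64, 73]   # made-up 8-bit pel values
rec = idct_1d(fdct_1d(samples))
assert all(abs(a - b) < 1e-9 for a, b in zip(samples, rec))
```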

The 64-point 2-D FDCT and 2-D IDCT are defined as follows:

X[u,v] = \frac{c[u]\,c[v]}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} x[i,j] \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}   (2.3)

x[i,j] = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} c[u]\,c[v]\,X[u,v] \cos\frac{(2i+1)u\pi}{16} \cos\frac{(2j+1)v\pi}{16}   (2.4)

The FDCT/IDCT can also be defined in matrix form [RY90]. Let us define a matrix C such that:

[C]_{ij} = \frac{1}{2} c[j] \cos\frac{(2i+1)j\pi}{16}, \quad i, j = 0, 1, \ldots, 7   (2.5)

Matrix C is referred to in the literature as the DCT-II matrix [RY90]. The 2-dimensional FDCT of an 8x8 block of image samples [x_{ij}] can also be defined in block form as:

[X_{ij}] = [C_{ij}]^T [x_{ij}] [C_{ij}]   (2.6)

The 2-dimensional IDCT of an 8x8 block of DCT coefficients is obtained from eq. 2.6 by interchanging the roles of [C_{ij}] and [C_{ij}]^T and of [X_{ij}] and [x_{ij}]: [x_{ij}] = [C_{ij}] [X_{ij}] [C_{ij}]^T. Equation 2.6 indicates an important property of any separable transform such as the DCT, DFT, WHT and HT: a 2-D NxN-point DCT can be computed by means of 2N N-point DCTs as follows. First, the 1-D N-point DCT of each block row is computed; then the intermediate result is transposed, and the 1-D N-point DCT of each column is computed in an identical fashion. This property has been widely used in VLSI implementations of such transforms because of the reduced arithmetic complexity achieved: 2N N-point DCTs require substantially fewer arithmetic operations than one N^2-point DCT (N = 8).

2.1.1 Fast DCT/IDCT Algorithms

Numerous fast algorithms for both the FDCT and IDCT have appeared in the literature that attempt to minimize the number of arithmetic operations (multiplications and additions) required. Blind application of equation 2.6 results in 1024 multiplications and 896 additions per block. Table 2.1 lists some popular algorithms along with the number of operations that they require. Fast algorithms usually rely on symmetry properties of the cosine basis functions, on similarities with the Fast Discrete Fourier Transform computation, and on certain factorizations of matrix C in eq. 2.6.

Chen’s algorithm [CSF77] has been used extensively in VLSI chips [SCG89] [UIT+92] [MHS+94] due to its symmetry and direct implementation using Distributed Arithmetic (section 2.2). Feig’s algorithm is not popular among VLSI implementations due to its awkward data shuffling requirements. The AAN algorithm produces scaled results; the computation of the actual DCT coefficient values can be merged with the subsequent quantization step, which involves scaling anyway.

ALGORITHM                          MULTIPLICATIONS   ADDITIONS
Ligtenberg & Vetterli [LV86]       208               464
Chen [CSF77]                       256               416
Feig [FW92]                        54                462
Lee [CL92]                         112               472
Arai Agui Nakajima (AAN) [AAN88]   80                464

Table 2.1: Fast FDCT/IDCT Algorithms

The algorithms of Table 2.1 have been formulated in a way such that the computation takes a minimal yet constant number of operations, independent of the input data. McMillan and Westover [MW92] [MW93] [MW94] have proposed a direct realization of the 2-D IDCT whose number of operations is proportional to the number of non-zero DCT coefficients [XCSD96]. The worst-case performance of this algorithm varies significantly from the average performance. The McMillan-Westover Forward-Mapped IDCT (FMIDCT) is formulated as follows:

\begin{bmatrix} x_{0,0} \\ x_{0,1} \\ x_{0,2} \\ \vdots \\ x_{7,7} \end{bmatrix} =
X_{0,0} \begin{bmatrix} c_1^{0,0} \\ c_2^{0,0} \\ c_3^{0,0} \\ \vdots \\ c_{64}^{0,0} \end{bmatrix} +
X_{0,1} \begin{bmatrix} c_1^{0,1} \\ c_2^{0,1} \\ c_3^{0,1} \\ \vdots \\ c_{64}^{0,1} \end{bmatrix} + \cdots +
X_{7,7} \begin{bmatrix} c_1^{7,7} \\ c_2^{7,7} \\ c_3^{7,7} \\ \vdots \\ c_{64}^{7,7} \end{bmatrix}   (2.7)

where the x_{i,j} are the reconstructed pels, the X_{i,j} are the input DCT coefficients (mostly zeroes), and the column vectors [c_k^{i,j}] are constant reconstruction kernels easily derivable from the DCT-II matrix C (eq. 2.5). It is rather obvious that for compressed image and video streams exhibiting the frequency distribution of Figure 1.1, eq. 2.7 results in a small number of operations, since multiplication and accumulation with a zero constitutes a NOP. The IDCT chip presented in chapter 3 utilizes a similar algorithm redesigned for a row-column approach that results in minimal average activity for MPEG-compressed input data. On the other hand, the DCT chip discussed in chapter 5 cannot use a zero-dependent algorithm, simply because pel data does not exhibit a skewed-toward-zero distribution. A version of Chen’s algorithm is used instead, enhanced with adaptation techniques that minimize the total number of operations in the presence of a highly correlated input sequence.
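A minimal Python model of the forward-mapped idea (an illustration with made-up input values, not the chip’s row-column variant) accumulates a precomputed 64-element kernel for each non-zero coefficient only, so the operation count tracks block sparsity:

```python
import math

def kernel(u, v):
    # 64-element reconstruction kernel for coefficient (u, v): the contribution
    # of a unit X[u][v] to each reconstructed pel, per eqs. 2.4/2.7
    c = lambda w: 1 / math.sqrt(2) if w == 0 else 1.0
    return [c(u) * c(v) / 4
            * math.cos((2 * i + 1) * u * math.pi / 16)
            * math.cos((2 * j + 1) * v * math.pi / 16)
            for i in range(8) for j in range(8)]

KERNELS = {(u, v): kernel(u, v) for u in range(8) for v in range(8)}

def fmidct(X):
    # forward-mapped IDCT: accumulate kernels only for non-zero coefficients
    pels = [0.0] * 64
    ops = 0
    for u in range(8):
        for v in range(8):
            if X[u][v] != 0:               # zero coefficients cost nothing
                k = KERNELS[(u, v)]
                for n in range(64):
                    pels[n] += X[u][v] * k[n]
                ops += 64
    return pels, ops

# sparse block typical of MPEG: DC plus two low-frequency AC terms (made up)
X = [[0] * 8 for _ in range(8)]
X[0][0], X[0][1], X[1][0] = 240, -18, 7
pels, ops = fmidct(X)
print("multiply-accumulates:", ops)        # 3 non-zero terms x 64 = 192
```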

2.2 Distributed Arithmetic

Distributed Arithmetic (DA) [PL74] [Whi89] is a bit-serial operation that computes the inner product of two vectors (one of which is a constant) in parallel. Its main advantage is the efficiency of mechanization and the fact that no multiply operations are necessary. DA has an inherent bit-serial nature, but this disadvantage can be completely hidden if the number of bits in each variable vector coefficient is equal or similar to the number of elements in each vector. As an example of DA mechanization, let us consider the computation of the following inner (dot) product of M-dimensional vectors a and x, where a is a constant vector:

y = \sum_{k=0}^{M-1} a_k x_k   (2.8)

Let us further assume that each vector element x_k is an N-bit two's complement binary number and can be represented as

x_k = -b_{k(N-1)} 2^{N-1} + \sum_{n=0}^{N-2} b_{kn} 2^n   (2.9)

where b_{ki} \in \{0, 1\} is the ith bit of vector element x_k. Please note that b_{k0} is the least significant bit (LSB) of x_k and b_{k(N-1)} is the sign bit. Substituting eq. 2.9 in eq. 2.8 yields:

y = \sum_{k=0}^{M-1} a_k \left[ -b_{k(N-1)} 2^{N-1} + \sum_{n=0}^{N-2} b_{kn} 2^n \right]   (2.10)

y = -\sum_{k=0}^{M-1} a_k b_{k(N-1)} 2^{N-1} + \sum_{k=0}^{M-1} \sum_{n=0}^{N-2} a_k b_{kn} 2^n   (2.11)

y = -\sum_{k=0}^{M-1} a_k b_{k(N-1)} 2^{N-1} + \sum_{n=0}^{N-2} \left[ \sum_{k=0}^{M-1} a_k b_{kn} \right] 2^n   (2.12)

The interchange of the order of summations in eq. 2.12 is the crucial step which yields a distributed arithmetic computation. Let us consider the term in brackets:

q_n = \sum_{k=0}^{M-1} a_k b_{kn}   (2.13)

Because b_{kn} \in \{0, 1\}, q_n has only 2^M possible values. Such values can be precomputed and stored in a ROM of size 2^M. The bit-serial input data (\{b_{0n}, b_{1n}, \ldots, b_{(M-1)n}\} for n = 0, 1, \ldots, N-1) is used to form the ROM address, and the ROM contents can be placed in an accumulator structure to form the outer sum of eq. 2.12. Successive scalings with powers of 2 can be achieved with an arithmetic shifter in the accumulator feedback path. The first term of eq. 2.12 (\sum_{k=0}^{M-1} a_k b_{k(N-1)} 2^{N-1}) is also stored in the ROM, at address \{b_{0(N-1)}, b_{1(N-1)}, \ldots, b_{(M-1)(N-1)}\}. Some extra control circuitry is necessary to ensure that the accumulator subtracts (as opposed to adding) the partial sum to the total result at sign-bit time. After N cycles (as a reminder, N is the bitwidth of the x_k vector elements) the result y has converged to its final value within the accumulator. Figure 2.1 shows a detailed example of a Distributed Arithmetic computation. The structure shown computes the dot product of a 4-element vector X and a constant vector


Figure 2.1: Distributed Arithmetic ROM and Accumulator (RAC) Structure

A. All 16 possible linear combinations of the constant vector elements (A_i) are stored in a ROM. The variable vector X is repackaged to form the ROM address, most significant bit first. We have assumed that the X_i elements are 4-bit two's complement numbers (bit 3 is the sign bit). Every clock cycle, twice the value of the RESULT register (which is initially reset to zero) is added to the current ROM contents, and the sum is placed back into the RESULT register. Moreover, each cycle the 4 registers that hold the four elements of the X vector are shifted to the right. The sign timing pulse Ts is activated when the ROM is addressed by bit 3 of the vector elements (the sign bits). In this case the adder subtracts the current ROM contents from the accumulator state. After four cycles (the bitwidth of the X_i elements) the dot product has been produced within the RESULT register.
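The RAC of Figure 2.1 can be modelled behaviorally in a few lines of Python (example constants are made up; this is a sketch of the mechanization, not the chip circuit):

```python
# Behavioral model of the RAC in Figure 2.1: M = 4 constants, N = 4-bit
# two's complement inputs, MSB-first with a x2 feedback path.
A = [3, -5, 7, 2]                  # constant vector (example values)
X = [5, -3, 2, -8]                 # variable vector, each in [-8, 7]

M, N = 4, 4
# ROM: all 2^M linear combinations of the constants (eq. 2.13)
ROM = [sum(A[k] for k in range(M) if addr >> k & 1) for addr in range(2 ** M)]

def bit(x, n):                     # nth bit of x in N-bit two's complement
    return (x + 2 ** N if x < 0 else x) >> n & 1

acc = 0
for n in range(N - 1, -1, -1):     # MSB (sign bit) first
    addr = sum(bit(X[k], n) << k for k in range(M))
    q = ROM[addr]
    # at sign-bit time (Ts) the partial sum is subtracted, otherwise added;
    # the x2 shifter in the feedback path supplies the power-of-2 weighting
    acc = 2 * acc + (-q if n == N - 1 else q)

assert acc == sum(a * x for a, x in zip(A, X))
print(acc)                         # 28
```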

2.2.1 On the Successive Approximation Property of Distributed Arithmetic

The DCT core processor of chapter 5 exploits a significant property of Distributed Arith- metic for power minimization. This property is developed in the present section.

When the Distributed Arithmetic operation is performed MSB first, it exhibits stochastically monotonic successive approximation properties. In other words, each successive intermediate value is closer to the final value in a stochastic sense. An analytical derivation follows:

The ith intermediate result of an MSB-first DA computation (i > 0) is:

y_i = -q_{N-1} 2^{N-1} + \sum_{n=N-1-i}^{N-2} q_n 2^n   (2.14)

where

q_n = \sum_{k=0}^{M-1} a_k b_{kn}   (2.15)

Please note that when i = N-1, eq. 2.14 yields eq. 2.12. Let us define an error term e_i, i = 0, 1, \ldots, N-1, as the difference between each intermediate value y_i and the final value y:

e_i = y - y_i   (2.16)

e_i = \sum_{n=0}^{N-2-i} q_n 2^n   (2.17)

We model q_n as experimental values of a discrete random variable q. The underlying stochastic experiment is random accesses of the DA coefficient ROM in the presence of random inputs. The experimental values of q are the DA ROM contents. The first and second order statistics of the error term e_i are:

E[e_i] = E[q] \sum_{n=0}^{N-2-i} 2^n   (2.18)
       = \sum_{k=0}^{M-1} a_k E[b_{kn}] \sum_{n=0}^{N-2-i} 2^n   (2.19)
       = \frac{2^{N-1-i} - 1}{2} \sum_{k=0}^{M-1} a_k   (2.20)

\sigma^2_{e_i} = \sigma^2_q \left( 1 + 4 + \cdots + 2^{2(N-2-i)} \right)   (2.21)
              = \frac{2^{2(N-1-i)} - 1}{3} \sigma^2_q   (2.22)
              = \frac{2^{2(N-1-i)} - 1}{3} \mathrm{Var}\left[ \sum_{k=0}^{M-1} a_k b_{kn} \right]   (2.23)
              = \frac{2^{2(N-1-i)} - 1}{3} \sum_{k=0}^{M-1} a_k^2 \sigma^2_{b_{kn}}   (2.24)
              = \frac{2^{2(N-1-i)} - 1}{12} \sum_{k=0}^{M-1} a_k^2   (2.25)

where equations 2.20 and 2.25 have been computed under the assumption that the least significant bits b_{kn} (i large) are independent identically distributed random variables uniformly distributed between 0 and 1 (E[b_{kn}] = 1/2, \sigma^2_{b_{kn}} = 1/4). This is a valid assumption for input DSP data [LR95]. The fact that equations 2.20 and 2.25 are monotonically decreasing functions of i (RAC cycles) shows the successive approximation property (in probabilistic terms) of the Distributed Arithmetic mechanization.
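A quick Monte Carlo experiment (an illustrative sketch with arbitrary constants, not part of the thesis) confirms the trend: the mean squared error of the MSB-first intermediate results of eq. 2.14 shrinks with every RAC cycle:

```python
import random
random.seed(1)

M, N = 4, 8
A = [3, -5, 7, 2]                  # made-up constant vector

def bits(x):                       # N-bit two's complement, LSB first
    return [(x + 2 ** N if x < 0 else x) >> n & 1 for n in range(N)]

def intermediates(X):
    # y_i per eq. 2.14: MSB-first partial results at full final weight
    b = [bits(x) for x in X]
    q = [sum(A[k] * b[k][n] for k in range(M)) for n in range(N)]
    ys = []
    acc = -q[N - 1] * 2 ** (N - 1)     # sign-bit term enters first
    ys.append(acc)
    for n in range(N - 2, -1, -1):
        acc += q[n] * 2 ** n
        ys.append(acc)
    return ys                          # ys[-1] is the exact dot product

# average squared error of each intermediate over random inputs
trials = [intermediates([random.randint(-128, 127) for _ in range(M)])
          for _ in range(2000)]
mse = [sum((t[-1] - t[i]) ** 2 for t in trials) / len(trials)
       for i in range(N)]
assert all(mse[i + 1] <= mse[i] for i in range(N - 1))   # monotone decrease
```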

2.3 Quantization

The process of scaling the DCT coefficients and truncating them to integer values is called quantization. The process of rescaling to restore approximately the original coefficient magnitude is called inverse quantization. In the image and motion picture coding context, quantization is invariably uniform: the decision and the reconstruction intervals are identical and equal to the quantizer step size \Delta. Such quantization and subsequent inverse quantization can be described with the following operation:

\hat{x} = \left\lfloor \frac{x + 0.5\Delta}{\Delta} \right\rfloor \Delta   (2.26)

where x is the input value, \hat{x} is the quantized output value and \Delta is the quantizer step size. Uniform quantization is not necessarily optimum in terms of minimizing some measure of the quantization error. The following section describes an important method of designing minimum-distortion quantizers given knowledge of the statistical properties of the input data. This method is referenced widely in chapter 4 and we present it here for completeness.
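Eq. 2.26 can be stated in two lines of Python (an illustrative sketch; step size and inputs are arbitrary):

```python
import math

def quantize(x, delta):
    # uniform quantization followed by inverse quantization, eq. 2.26
    return math.floor((x + 0.5 * delta) / delta) * delta

assert quantize(37, 8) == 40    # 37/8 = 4.625 rounds to 5, restored as 40
assert quantize(35, 8) == 32    # 35/8 = 4.375 rounds to 4, restored as 32
assert quantize(-37, 8) == -40  # symmetric behavior for negative inputs
```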

2.3.1 The Lloyd-Max Optimum Quantizer

Max [Max60] and Lloyd [Llo82] have independently derived methodologies for constructing minimum-distortion quantizers given a fixed number of quantization intervals and a probability distribution for the input signal. Max has derived a framework for optimum quantization given any distortion criterion (where distortion D is defined as the expected value of any monotonically increasing function of the difference between the quantizer input and output), whereas Lloyd has limited his treatment to minimizing the mean square error (MSE). Both arrive at the same results when D = MSE. The problem of optimum quantization in the mean square error sense can be stated as

follows:

A quantization scheme consists of a class of sets \{Q_1, Q_2, \ldots, Q_N\} and a set of quanta \{q_1, q_2, \ldots, q_N\}. The ranges \{Q_a\} are disjoint and cover a continuous axis in its entirety. The quanta \{q_a\} are any N finite discrete values. Let x (the input to the quantizer) be a random variable (a sample of a stationary stochastic process) with probability density function f_x(x). The output of the quantizer must be a quantum q_a such that the quantization noise \mathcal{N} given by the following equation is minimized:

\mathcal{N} = \sum_{a=1}^{N} \int_{Q_a} (q_a - x)^2 f_x(x)\,dx   (2.27)

The problem of optimum quantization given a fixed number of ranges and quanta N and a fixed PDF f_x(x) is reduced to finding 2N-1 numbers

q_1 < x_1 < q_2 < x_2 < \cdots < q_{N-1} < x_{N-1} < q_N   (2.28)

where the \{x_a\} are the endpoints of the Q_a intervals and the \{q_a\} are the corresponding quanta, such that the following expression (the mean square quantization error) is minimized:

\int_{-\infty}^{x_1} (q_1 - x)^2 f_x(x)\,dx + \int_{x_1}^{x_2} (q_2 - x)^2 f_x(x)\,dx + \cdots + \int_{x_{N-1}}^{\infty} (q_N - x)^2 f_x(x)\,dx   (2.29)

Both Max and Lloyd prove that two necessary but not sufficient conditions for minimizing expression 2.29 are:

q_a = \frac{\int_{Q_a} x f_x(x)\,dx}{\int_{Q_a} f_x(x)\,dx}, \quad a = 1, 2, \ldots, N   (2.30)

x_a = \frac{q_a + q_{a+1}}{2}, \quad a = 1, 2, \ldots, N-1   (2.31)

Equations 2.30 and 2.31 are known as the Lloyd-Max equations and they make intuitive sense: eq. 2.30 implies that each quantum q_a must be the "center of mass" of its range Q_a. This is a classical result attributed to the minimization of the "moment of inertia" in elementary mechanics. Eq. 2.31 says that the interval endpoints must bisect the distances between quanta. This is also an intuitive result. Equations 2.30 and 2.31 suggest a trial-and-error iterative method for finding local noise minima. We choose a value for q_1 (q_1 < \int_{-\infty}^{\infty} x f_x(x)\,dx according to Lloyd [Llo82]) and then solve the following equation for x_1 under the condition that q_1 is the center of mass of Q_1:

q_1 = \frac{\int_{-\infty}^{x_1} x f_x(x)\,dx}{\int_{-\infty}^{x_1} f_x(x)\,dx}   (2.32)

Eq. 2.31 then suggests that q_2 can be determined as

q_2 = 2x_1 - q_1   (2.33)

If q_2 lies to the right of the center of mass of the interval (x_1, \infty), then the process stops and starts over again with a different trial value q_1. Otherwise x_2 is computed and the process continues:

q_2 = \frac{\int_{x_1}^{x_2} x f_x(x)\,dx}{\int_{x_1}^{x_2} f_x(x)\,dx}   (2.34)

q_3 = 2x_2 - q_2   (2.35)

\vdots

q_N = 2x_{N-1} - q_{N-1}   (2.36)

If at the end of this process q_N ends up being the center of mass of the interval (x_{N-1}, \infty),

q_N = \frac{\int_{x_{N-1}}^{\infty} x f_x(x)\,dx}{\int_{x_{N-1}}^{\infty} f_x(x)\,dx}   (2.37)

then q_1 has been chosen appropriately and the points q_a, x_a constitute a local noise minimum. However, in general eq. 2.37 won't be satisfied and the process must be restarted with a different starting value until the discrepancy is reduced to zero. Solutions of the Lloyd-Max equations are carried out exclusively on fast electronic computers. Improved methods for fast convergence of this algorithm have been suggested [NL86].
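The sketch below (illustrative only, not from the thesis) applies the commonly used alternating fixed-point form of the Lloyd-Max conditions — repeatedly enforcing eq. 2.31 and then eq. 2.30 — to a discretized Laplacian density, rather than the exact q_1-sweep described above; both approaches converge to quantizers satisfying eqs. 2.30 and 2.31:

```python
import math

# Lloyd-Max quantizer via the alternating iteration (a fixed-point form of
# eqs. 2.30/2.31) on a discretized zero-mean Laplacian density.
lam = 1.0
xs = [i * 0.01 for i in range(-1000, 1001)]          # support [-10, 10]
pdf = [lam / 2 * math.exp(-lam * abs(x)) for x in xs]

def centroid(lo, hi):
    # "center of mass" of the density over [lo, hi), eq. 2.30
    num = sum(x * p for x, p in zip(xs, pdf) if lo <= x < hi)
    den = sum(p for x, p in zip(xs, pdf) if lo <= x < hi)
    return num / den if den else (lo + hi) / 2

Nq = 4
q = [-3.0, -1.0, 1.0, 3.0]                           # initial quanta (arbitrary)
for _ in range(50):
    ends = [(q[a] + q[a + 1]) / 2 for a in range(Nq - 1)]        # eq. 2.31
    bounds = [-1e9] + ends + [1e9]
    q = [centroid(bounds[a], bounds[a + 1]) for a in range(Nq)]  # eq. 2.30

print([round(v, 3) for v in q])   # converged symmetric quanta
```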

2.4 Image Quality Measure

A very common measure of the quality of a processed (i.e. compressed) image vs. the original image is the peak SNR (PSNR), defined below:

\mathrm{PSNR} = 10 \log_{10} \frac{(2^n - 1)^2 N}{\sum_{i=1}^{N} \left( q[i] - q'[i] \right)^2}   (2.38)

where N is the total number of picture elements (pels) in the image, q[i] is the ith pel of the original image, q'[i] is the ith pel of the processed image, and n is the pel bitwidth. Although the PSNR is not perfectly correlated with visual image quality, it is the most widely accepted arithmetic measure and will be used extensively in this work. The best measure of image quality is human subject evaluation under well-defined viewing conditions (distance, angle and lighting).
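Eq. 2.38 translates directly into Python (the sample pel values are made up):

```python
import math

def psnr(orig, proc, n=8):
    # peak signal-to-noise ratio per eq. 2.38 for n-bit pels
    mse = sum((a - b) ** 2 for a, b in zip(orig, proc)) / len(orig)
    return float("inf") if mse == 0 else 10 * math.log10((2 ** n - 1) ** 2 / mse)

orig = [52, 55, 61, 66, 70, 61, 64, 73]
proc = [53, 55, 60, 66, 71, 61, 63, 73]
print(round(psnr(orig, proc), 2))   # 51.14
```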

2.5 The MPEG Video Compression Standard

The MPEG [ISO94] [LeG91] [MPFL97] video compression standard combines two of the methods described in this chapter (DCT and quantization) along with motion estimation and entropy (Huffman [Huf52]) coding. The DCT removes the spatial redundancy present in motion pictures and motion compensation removes the temporal redundancy. Quantization introduces extra compression through precision reduction at a minimal quality cost. Finally, entropy coding introduces a final step of additional lossless compression. A video sequence is divided into one or more groups of pictures (GOPs) and each GOP is composed of one or more pictures of three different types (I, P, and B). An I picture (intra-coded) is coded independently, without any reference to adjacent pictures. A P picture (predictive-coded) is compressed by coding the differences between the present picture and the I or P picture which temporally precedes it. A B picture (bidirectionally predictive-coded) is compressed by coding the differences between the present picture and the nearest preceding and/or upcoming I or P picture in the sequence. Disjoint regions of B pictures can use different predictors from previous and/or future pictures. Since the compression algorithm may sometimes use prediction information from future pictures, the coding order is not the same as the display order. An example group of pictures along with prediction dependencies is illustrated in Figure 2.2.

The basic unit of an MPEG picture is the macroblock. It is the unit where motion estimation/compensation is performed. It consists of a 16x16 array (four 8x8 blocks) of luminance samples (Y) plus two 8x8 blocks of chrominance samples (Cr, Cb). The entire macroblock represents a 16x16 array of pels within a single picture. The 8x8 block (either Y, Cr or Cb) is the unit where the DCT/quantization is performed. It is this basic unit that the chips of chapters 3 and 5 operate on. Motion-compensated prediction exploits the temporal redundancy of video sequences. The assumption is that the current picture can be locally modelled as a translation of a


Figure 2.2: MPEG Group of Pictures. Figure reproduced from [ISO94]

previous or future picture. Each macroblock in the MPEG bitstream carries a motion vector, which is a horizontal and vertical index of a particular 16x16 pel array (not necessarily on a macroblock boundary) in a past or future I or P image (P-coded pictures may only reference past I or P pictures). The macroblock data is differentially coded and added to the predictor as the final frame reconstruction step.

Chapter 3

IDCT Core Processor

In this chapter, we present the design of an IDCT macrocell suited for mobile applications [XC98] [XC99]. We present the inverse transform computation first because the implementation of the IDCT chip predated the implementation of the DCT processor. Section 3.1 reiterates some general background information and presents the IDCT algorithm used. Section 3.2 summarizes the chip architecture. Section 3.3 presents some interesting aspects of the IDCT chip circuit design. Section 3.4 discusses power estimation methodology issues and results. Sections 3.5 and 3.6 discuss chip usage and packaging. Section 3.7 presents laboratory test results. Finally, section 3.8 summarizes and concludes the chapter. Our strategy for reducing the chip power was two-fold. First, we selected an IDCT algorithm that minimizes activity by exploiting the relative occurrence of zero-valued DCT coefficients in compressed video. Past IDCT chips (e.g. [MHS+94], [BH95], [KFN+96]) have relied on conventional fast IDCT algorithms that perform a constant number of operations per block independent of the data distribution. Our selected algorithm, described in section 3.1, performs a variable number of operations that depends on the statistical properties of the input data. We implemented this algorithm in hardware through the use of extensive clock gating that minimizes the average switched capacitance per sample. Second, previously proposed architecture-driven voltage scaling techniques [CSB92] [CBB94] have been adapted and used for power minimization. Deep pipelining and appropriate circuit techniques have been employed so that the chip can produce 14 Msamples/sec (640x480, 30 fps, 4:2:0) at 1.3V (3.3V process) and meet the requirement for MPEG-2 MP@ML.

3.1 Background and Algorithmic Design

For convenience, we briefly repeat the IDCT definition that has appeared in chapter 2. The 8-point 1-D Inverse DCT [ANR74] of a coefficient vector X is given by the following equation:

x[n] = \sum_{k=0}^{7} \frac{c[k]}{2} X[k] \cos\frac{(2n+1)k\pi}{16}   (3.1)


Figure 3.1: 2D Chen Separable IDCT Algorithm

where c[k] = 1/\sqrt{2} if k = 0 and 1 otherwise. Since the DCT is a separable transform, the 2-D IDCT of an 8x8 coefficient block can be computed by applying equation 3.1 on each block row, transposing the intermediate result and reapplying equation 3.1 on each column. There have been numerous fast IDCT algorithms that minimize the number of multiplications and additions implied by equation 3.1 [CSF77] [FW92]. A popular algorithm which has been embedded in various VLSI implementations [SCG89] [UIT+92] is the Chen algorithm [CSF77] depicted in Figure 3.1. The Chen algorithm is based on a sparse matrix

factorization of the matrix form of equation 3.1. Letters A through G represent constant values and each dot in the flowgraph represents a dual multiply-and-sum operation. The Chen algorithm requires a constant number of operations per block: 256 multiplications and 416 additions. The statistical distribution of the input DCT coefficients possesses unique properties that may affect IDCT algorithm design. Previous work [RG83] [SR96] [NH95] has established that AC (non-zero spatial frequency) DCT coefficients of compressed images and

video exhibit a zero-mean Laplacian distribution:

f(x) = \frac{\lambda}{2} e^{-\lambda |x|}, \quad \lambda = \frac{\sqrt{2}}{\sigma_x}   (3.2)

Muller [Mul93] has proposed a slightly different statistical model based on Kolmogorov-Smirnov (KS) and \chi^2 fitness tests [KS67]:

f(x) = \frac{\nu\,\alpha(\nu)}{2\sigma\,\Gamma(1/\nu)} e^{-\left[ \alpha(\nu) |x| / \sigma \right]^{\nu}}   (3.3)


Figure 3.2: Probabilities of Non-Zero Occurrence for 8x8 DCT Coefficient Blocks

\alpha(\nu) = \sqrt{\frac{\Gamma(3/\nu)}{\Gamma(1/\nu)}}   (3.4)

where \Gamma(\cdot) denotes the gamma function and \nu and \sigma are positive constants. Equation 3.3 denotes a generalized Gaussian distribution. The choice of parameter \nu transforms equation 3.3 into a Laplacian or a Gaussian (\nu = 1 or 2 respectively). Lee et al. [LKRR93] have also concluded that the generalized Gaussian distribution (3.3) is valid for a class of medical images. Appendix B contains a complete set of histograms for all 63 AC DCT coefficients of the natural image PEPPERS plotted along with the corresponding zero-mean dual-sided Laplacian distribution of equal variance. Sharp zero-centered dual-sided Laplacian distributions for 63 out of 64 spectral coefficients imply a large number of zero-valued coefficients per 8x8 block for image and video data. Typically, DCT blocks of MPEG-compressed video sequences have only 5-6 non-zero coefficients, mainly located in the low spatial frequency positions. The three-dimensional histogram of Figure 3.2 plots the probability of occurrence of a non-zero spatial frequency for each block position for a typical video sequence. Higher probabilities are concentrated in the low spatial frequency range, which is consistent with the spatial low-pass characteristics of video sequences. The histogram of Figure 3.3 shows the frequency of block occurrence plotted vs. the number of non-zero coefficients for a typical MPEG sequence. The mode of such distributions is invariably blocks with a single non-zero spectral coefficient (typically the DC). Based on the information above on the statistical distribution of DCT coefficients, we

Figure 3.3: Histogram of Non-Zero DCT Coefficients in Sample MPEG Stream

decided to depart radically from conventional IDCT algorithms that perform a fixed number of operations per block. Given such input data statistics, we observe that direct application of equation 3.1 will result in a small average number of operations, since multiplication and accumulation with a zero-valued X[k] coefficient may constitute a NOP. Appropriate implementation of equation 3.1 can result in an algorithm with a variable number of operations per block, as opposed to standard fast algorithms that cannot exploit data properties. This idea of a coefficient-by-coefficient IDCT algorithm originated with McMillan and Westover [MW92] in the context of a software implementation. The McMillan and Westover forward-mapped IDCT (FMIDCT) algorithm is described in section 2.1.1. The IDCT chip presented here consists of two 1-D IDCT units implementing equation 3.1 and a transposition memory structure in between. Conventional architectures, on the other hand, perform a constant number of operations per sample and do not exploit data properties: butterfly-style operations absorb the zero-valued coefficients early in the signal path without affecting average switched capacitance. Figure 3.4 shows the instantaneous and average number of multiplications and additions required by the data-driven algorithm of equation 3.1 for 1000 blocks extracted from an MPEG video sequence. We observe that the data-driven algorithm requires a smaller number of operations per block compared to the conventional Chen algorithm. The potential for a low power implementation of this algorithm is evident due to the small arithmetic operation requirements.

3.2 Chip Architecture

A block diagram of the IDCT chip is shown in Figure 3.5. Incoming coefficients are scaled by a maximum of four constants at a time for a maximum of seven different constant scalings (C0-C6), due to the symmetry of the cosine basis functions. The scaled values are added to or subtracted from the state of 8 accumulators attached to the scaling multipliers through a full crossbar. The scaling constants and the crossbar setting are determined by the index k of each coefficient X[k] within the 8-point vector X. Every 8 cycles, the


Figure 3.4: Number of Operations Comparison Between Chen and Data-Driven IDCT


Figure 3.5: IDCT Chip Block Diagram


Figure 3.6: Transposition Structure (TRAM). Data is Transposed On-The-Fly by Changing the Shifting Direction (T→B or L→R). This scheme eliminates the need for ping-pong buffering.

1-D IDCT of a block row has converged within the 8 accumulators. The intermediate result is fed into the transposition structure (TRAM) of Figure 3.6. This structure is a 2-D array of shift registers that can shift data from left to right and top to bottom. The TRAM performs transposition on-the-fly and eliminates the need for double buffering, which would have been necessary had a static dual-addressed RAM been used. The second 1-D stage is identical to the first except for a change in the precision of the scaling constants. The chip input spectral coefficients are 12-bit two's complement integers. Before the multiply-accumulates of the first stage, the data is converted to 12-bit sign-magnitude representation. The scaling constants are 14-bit sign-magnitude. After the completion of the first stage computation, the intermediate data is 16-bit two's complement. Before entering the second stage the data is shifted and rounded to 14 bits and converted to sign-magnitude. The scaling constants of the second stage are 13-bit sign-magnitude. The final result is rounded to 10-bit two's complement representation before being shipped off-chip.
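The TRAM's on-the-fly transposition can be sketched behaviorally in Python (a simplified model, not the chip circuit: it ignores the interleaving of write and read shifts and the word-ordering details absorbed by the surrounding logic):

```python
# Behavioral sketch of the TRAM of Figure 3.6: an 8x8 shift-register array.
# A block is shifted in top-to-bottom, one row per cycle; it is then shifted
# out left-to-right, one column per cycle, so the words leaving the array
# are the columns of the stored block.
N = 8
tram = [[None] * N for _ in range(N)]

def shift_down(new_row):           # T -> B: every row moves down one slot
    tram.pop()                     # bottom row leaves the array
    tram.insert(0, list(new_row))

def shift_right():                 # L -> R: rightmost column leaves the array
    out = [r.pop() for r in tram]
    for r in tram:
        r.insert(0, None)          # freed slots can accept the next block
    return out

block = [[i * N + j for j in range(N)] for i in range(N)]
for row in block:
    shift_down(row)                # 8 cycles: block enters row by row
cols = [shift_right() for _ in range(N)]   # 8 cycles: block leaves column by column

# each output word is one column of the stored block (modulo the mirror-image
# ordering imposed by the fixed shift directions)
assert all(cols[k][i] == block[N - 1 - i][N - 1 - k]
           for k in range(N) for i in range(N))
```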

In order to meet the requirement of 14 Msamples/sec at VDD = 1.3V using a high-VT process, the multiply-accumulate operations must be pipelined by a factor of 4. Figure 3.7 shows the multiply-accumulate (MAC) structure used. For simplicity, the diagram does not show the crossbar switch between the multiplier and the adder structures. The figure depicts a 5x4 unit as opposed to the full 14x14 structure used; the level of pipelining is identical. The multiplier is a standard carry-save array multiplier.


Figure 3.7: Pipelined Multiply-Accumulate (MAC) Structure

Dataset        PMSE       OMSE       PME         OME
Limit          <0.06      <0.02      <0.015      <0.0015
[-256, 255]    0.015800   0.013597   0.002500    0.000153
[-5, 5]        0.011000   0.009136   -0.002400   -0.000005
[-300, 300]    0.014100   0.011886   0.002700    0.000180
[-256, 255]    0.016000   0.013578   -0.002500   -0.000122
[-5, 5]        0.011000   0.009156   0.002500    0.000016
[-300, 300]    0.014000   0.011878   -0.002900   -0.000122

Table 3.1: IEEE Standard 1180-1990 Compliance

The accumulator structure is a simple ripple-carry adder. The accumulator adds a single full adder delay to the critical path because its carry propagation coincides with the availability of the multiplier product bits. Although pipelining increases the total switched capacitance of the computation unit, it has the side benefit of reducing the propagation of spurious transitions within the MAC structure. The datapath widths of the arithmetic units have been optimized to meet IEEE Standard 1180-1990 for IDCT precision by a comfortable margin, while at the same time discarding unnecessary precision bits that would increase the power dissipation. Table 3.1 summarizes our fixed-point precision deviations. All datasets pass the peak pixel error requirement. The zero-input requirement is also passed.
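For reference, the four accuracy figures reported in Table 3.1 can be computed as follows. This is a sketch of the IEEE Std 1180-1990 definitions as we read them, not the standard's reference code; `errors` holds per-pixel differences between the fixed-point IDCT and a double-precision reference over many random 8×8 blocks.

```python
def ieee1180_metrics(errors):
    """errors: list of 8x8 blocks of per-pixel errors (test - reference).
    Returns (pmse, omse, pme, ome): peak/overall mean-square error and
    peak/overall mean error, per IEEE Std 1180-1990."""
    B = len(errors)
    pme = pmse = 0.0
    for i in range(8):
        for j in range(8):
            me = sum(e[i][j] for e in errors) / B        # mean error at (i,j)
            mse = sum(e[i][j] ** 2 for e in errors) / B  # mean-square error
            if abs(me) > abs(pme):
                pme = me            # peak (worst-position) mean error, signed
            pmse = max(pmse, mse)   # peak mean-square error
    flat = [e[i][j] for e in errors for i in range(8) for j in range(8)]
    ome = sum(flat) / len(flat)                  # overall mean error
    omse = sum(v * v for v in flat) / len(flat)  # overall mean-square error
    return pmse, omse, pme, ome
```

A compliant implementation must keep these figures below the limits in the first row of Table 3.1, in addition to the per-pixel peak error and zero-input requirements.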

3.2.1 Clock Gating

The presence of many zero-valued coefficients must be exploited in order to reduce the switching activity and reap the low power benefits of this particular algorithm. If the MAC structure of Figure 3.7 were not pipelined, conditional gating of the accumulator clock based on whether the incoming coefficient is zero would substantially reduce the switching activity and still produce the correct result. Pipelining, though, makes clock gating more complicated. If all pipeline stages are clocked with a single clock net, clock gating becomes impossible, since different pipeline stages are processing different and possibly non-zero DCT coefficients; powering down all the stages would produce erroneous results. Clock gating can be implemented if each pipeline stage uses a separate clock net gated by an appropriate qualifying pulse. The qualifying pulse propagates from stage to stage along with the non-zero coefficient that requires processing. If a zero-valued coefficient enters the pipeline, only the stage that corresponds to the zero is powered down; the other upstream and downstream stages remain unaffected. Our clock gating scheme is graphically depicted in Figure 3.8. The IDCT chip features ten separate clock nets in addition to the master clock for fully qualifying all steps of the entire pipeline formed by both 1-D stages. A clock-gated pipeline presents certain physical design challenges in order to avoid common race conditions. Race conditions may arise in the following situation, which is depicted in Figure 3.9(a): Let us assume that CLK0 arrives at the clocked element of stage 0

Figure 3.8: Clock Gating Approach in a Pipeline
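The qualifying-pulse scheme can be sketched behaviorally. This is our abstraction, with placeholder arithmetic standing in for the MAC partial results: each stage register updates only when the qualify bit accompanying its input is set, so a zero coefficient suppresses exactly one stage's clock per cycle without disturbing its neighbors.

```python
def run_gated_pipeline(coeffs, stages=4):
    """Model a clock-gated pipeline with a qualify bit per stage."""
    data = [0] * stages          # one register per pipeline stage
    qual = [False] * stages      # qualifying pulse travelling with the data
    gated = 0                    # stage-cycles whose clock was suppressed
    out = []
    for c in list(coeffs) + [0] * stages:    # trailing zeros flush the pipe
        out.append(data[-1] if qual[-1] else 0)  # unqualified output -> 0
        for s in reversed(range(1, stages)):     # update back to front
            if qual[s - 1]:
                data[s] = data[s - 1]        # stage clocked normally
            else:
                gated += 1                   # clock gated: register holds
            qual[s] = qual[s - 1]
        qual[0] = (c != 0)                   # qualify only non-zero inputs
        if qual[0]:
            data[0] = 2 * c                  # placeholder for one MAC step
        else:
            gated += 1
    return out[stages:], gated
```

Every zero in the input stream (and every flush cycle) shows up as gated stage-clocks rather than as switching activity, which is the power mechanism the chip exploits.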

Figure 3.9: Potential Race Conditions in Clock-Gated Pipelines. (a) Potential race condition; (b) no race condition.

at time t0 and CLK1 arrives at time t1. If t1 > t0 + tCLK→Q + tpd − thold, the wrong data will be sampled at stage 1. The expected value of ∆t = t1 − t0 can be greater than typical clock skew values because the clock nets are physically separate and are affected by more mismatch phenomena than distributed RC delays on a nominally equipotential surface. Moreover, in the IDCT chip all gated clock nets are global, spanning all the pipelines formed by the accumulators. The gate load seen by each clock net is quite different due to data skewing (Figure 3.7) in addition to different physical wire loading. This problem becomes very important when the intermediate logic is minimal or non-existent (i.e. back-to-back shift registers) and tpd → 0.

We solved this problem by careful buffering and physical design of the clock nets, and by multiple simulation iterations with accurately extracted parasitics. In cases of small timing margins (i.e. shift registers and least significant bits of adder stages), a negative level-sensitive latch clocked by the upstream pipeline clock was inserted, as shown in Figure 3.9(b). This ensured functional correctness with a minimal penalty (<2%) in terms of power dissipation and no effect on the system critical path.
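The skew budget is easy to check numerically. Assuming the usual two-register hold constraint (our formulation, with illustrative numbers), the downstream gated clock may not lag the upstream one by more than tCLK→Q + tpd − thold:

```python
def hold_violation(t0, t1, t_clk_to_q, t_pd, t_hold):
    """True when stage 1 samples the raced-through (wrong) data:
    its clock edge t1 lands after the new stage-0 output has already
    propagated through t_clk_to_q + t_pd, minus the hold window."""
    return t1 > t0 + t_clk_to_q + t_pd - t_hold

# Back-to-back shift registers (t_pd ~ 0): 0.4 ns of gated-clock skew
# already breaks a 0.2 ns hold time with a 0.3 ns clock-to-Q.
risky = hold_violation(0.0, 0.4, 0.3, 0.0, 0.2)
# The same skew is harmless when 2 ns of logic separates the stages.
safe = hold_violation(0.0, 0.4, 0.3, 2.0, 0.2)
```

All delay values here are hypothetical; the point is that with tpd → 0 any realistic skew between separately-buffered clock nets violates the inequality, which is exactly where the latch of Figure 3.9(b) is needed.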

3.3 Circuit Design

This section presents interesting aspects of the IDCT chip circuit design.

3.3.1 Arithmetic Circuits

The greatest circuit challenge presented was the design of the pipelined multiply-accumulate unit shown in Figure 3.7, given VGS − VT = 0.4V for PMOS devices and 0.6V for NMOS devices. The critical path for this operation is 7 full adder delays given a 14×14 pipelined multiplier. We imposed the following requirements on the full adder design used in the multipliers (Figure 3.10): First, since the sum output (S) propagation delay is also critical (in addition to COUT), we wanted S to be computed using a single gate, as a function of A, B and CIN. A second gate delay would increase the critical path substantially and would force us to raise the power supply above 1.3V. Second, the S gate should not employ more than 2 series PMOS pullup devices in the signal path. PMOS transistors see at most 0.4V above threshold across their gate and source terminals, and their mobility is three times smaller than that of the NMOS devices. Given these facts, we observed that more than two series PMOS devices in the sum circuit would be considerably slow in pulling up the sum output unless substantially oversized, which would considerably increase the total cell switched capacitance. Finally, we wanted the sum gate to be fully static. Although a dynamic implementation would remove weak PMOS devices from the signal path, the non-monotonicity of the S gate would introduce extra design complications and circuit overhead.

Given all the constraints above, we decided to use the hybrid full adder circuit of Figure 3.10. In this circuit, COUT is derived in a static CMOS fashion but S is derived using Cascode Voltage Switch Logic (CVSL). The advantage of using this logic family for the sum output is that S is pulled high indirectly through a stack of NMOS transistors overpowering a weak PMOS cross-coupled pair, as opposed to a stack of PMOS devices pulling high. The main disadvantages are that proper ratioing is needed for different supplies, that the complementary NMOS switch structures require complementary inputs, and the presence of small crossover currents. Since these full adders are used within an array multiplier, one of the complementary inputs is the S output of the previous row adder, which is already in complementary form; only COUT needs to be complemented in order to be used as an input in the next row of adders. It must be noted that the total device count has not been increased from the popular 24-transistor inverting full static CMOS adder implementation ("mirror adder" [WE93]). Although CVSL can result in inefficient implementations of native sum-of-products expressions, it is ideally suited for multiple-input XOR functions such as a sum computation.

Figure 3.10: Hybrid Adder Schematics (sizes in λ = 0.35 µm)

Figure 3.11 shows an Hspice simulation (S sum output) of both the mirror and the hybrid adder for typical load conditions at VDD=1.3V. The mirror adder employs a stack of 3 PMOS devices pulling S high when all three inputs are low. The circuit simulated is shown in Figure 3.12. Table 3.2 summarizes the simulated rise and fall delays. The small discrepancy between the rise and fall delays of the hybrid adder output is attributed to slight mismatches in the arrivals of the A and A̅ inputs in our simulation setup. The hybrid adder exhibits a slower fall delay because the NMOS pulldown stack needs to overpower the PMOS pullup. On the other hand, the rise time is faster, as expected, for a total worst-case speedup of 37%. At the same time, the transistor sizes are much smaller in the hybrid adder, for additional area and power savings. The use of the hybrid adder of Figure 3.10 helped us achieve real-time performance at 1.3V.

Figure 3.11: Hspice Simulation of CMOS vs. Hybrid Adder (S sum output voltage vs. time at VDD=1.3V)

Figure 3.12: Mirror Adder Schematics (sizes in λ = 0.35 µm)

Circuit        Fall Delay (ns)   Rise Delay (ns)
Static CMOS    0.90              2.93
Hybrid         1.53              1.85

Table 3.2: Rise and Fall Delays for Hybrid and CMOS Adders
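Two quick sanity checks on this discussion, using only the Table 3.2 numbers and the standard full adder equations (the code framing is ours): the sum gate is a 3-input XOR with a majority-function carry, and the 37% figure is the worst-case delay of the mirror adder (2.93 ns rise) versus that of the hybrid adder (1.85 ns rise).

```python
def full_adder(a, b, cin):
    """Gate-level behavior of the hybrid cell of Figure 3.10."""
    s = a ^ b ^ cin                         # CVSL tree computes this XOR
    cout = (a & b) | (a & cin) | (b & cin)  # static CMOS majority gate
    return s, cout

# Exhaustive check against plain binary addition (8 input combinations).
ok = all(full_adder(a, b, c) == ((a + b + c) & 1, (a + b + c) >> 1)
         for a in (0, 1) for b in (0, 1) for c in (0, 1))

# Worst-case delay comparison from Table 3.2 (ns).
speedup = (max(0.90, 2.93) - max(1.53, 1.85)) / max(0.90, 2.93)
```

The XOR structure is why CVSL pays off here: a multiple-input XOR maps naturally onto complementary NMOS switch trees, while its static CMOS pullup needs the slow 3-deep PMOS stack.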

Figure 3.13: Basic TSPC Flop Used in the IDCT Chip (sizes in λ = 0.35 µm).

3.3.2 Flip-Flops

The basic static edge-triggered flip-flop used in the IDCT chip is a version of the true single-phase clock (TSPC) flops presented in [YS89]. It is shown in Figure 3.13. Its basic advantage for low power is the fact that a clock complement is not necessary. When clock is low, transistors N0, P0, P1 form a transparent inverting latch and NET0 is precharged. When clock goes high, if data had been high, node NET0 stays precharged and the logic one propagates to the output q. If data had been low, node NET0 is discharged through N2, N1 and a logic 0 propagates to the output. Weak transistors P5 and N6 staticize the output node. Transistor P3 is made large enough to overpower the weak staticizer even at low supply voltages. This flop has an internal race condition, and care should be exercised before it is ported to different processes. When d is low and clock goes from low to high, the following race condition may occur: transistors N4 and N3 are both active at the same time and start pulling node q low. Then transistors N2 and N1 (which are both active too) start discharging node NET0, which brings node q back to logic one. The overall effect is a 0-1-0 glitch at the flip-flop output on the rising edge of clock when d is low. To avoid the glitch, the N2, N1 pulldown path must be made faster than the N4, N3 pulldown path by increasing the sizes of the N2, N1 transistors. In Figure 3.13 we avoid the glitch partly due to the staticizer, which slows down the N4, N3 discharge path considerably. Figure 3.14 shows a TSPC flop with asynchronous clear used in the chip. Transistors P6 and

 

Figure 3.14: TSPC Flop with Asynchronous Clear (sizes in λ = 0.35 µm).

N7 add the capability to set the staticized output to logic zero on assertion of the clr signal (active high).

3.3.3 I/O Pads

Figure 3.15 shows the level-converting output pad used in the IDCT chip. This design (both schematic and layout) has been adapted from [Gea96]. Two differential-amplifier-style level converters are used [CBB94] to raise the voltage from VDD (1.1-2.2V) to VHH (5V). A split-output exponential buffer drives the output bonding pad. Figure 3.16 shows the level-converting input pad. Primary ESD protection is provided by bipolar diodes D1 (N-diffusion to P-substrate), D2 (P-diffusion to N-well) and N-well resistor R. Diodes D1, D2 protect the input of the sensing inverter from straying more than 0.6V (in either direction) from its specified range (0-VHH).

3.4 Power Estimation Methodology

The IDCT chip is a reasonably large system (160K transistors) and needs to be simulated for a large number of cycles so that conclusions regarding the correlation of power dissipation vs. processing load can be drawn. Circuit simulators (e.g. Hspice, Powermill) would be slow to run and would provide insufficient information. For this purpose, we have developed Pythia [XYC97], a fast and accurate power estimation tool that works with a structural Verilog description. Pythia uses a collection of Verilog-PLI access and utility routines to traverse the circuit hierarchy, extract user-provided parameters and monitor signal transitions. Each net in the design is annotated with four different capacitance values:

1. A gate capacitance value indicating the total gate capacitance attached to the net.

Figure 3.15: IDCT Chip Output Pad (sizes in λ = 0.35 µm).

Figure 3.16: IDCT Chip Input Pad (sizes in λ = 0.35 µm).

Figure 3.17: IDCT Chip Pythia Simulation Results (Stage 1: 4.22 mW, 53%; MCLK buffer: 1.44 mW, 18%; Stage 0: 1.32 mW, 17%; TRAM: 0.33 mW, 4%; Control: 0.38 mW, 5%; Misc: 0.299 mW, 4%)

2. An NMOS drain capacitance value indicating the total n-drain to p-substrate or p-well junction capacitance attached to the net.

3. A PMOS drain capacitance value indicating the total p-drain to n-substrate or n-well junction capacitance attached to the net.

4. A routing capacitance indicating the total metal wiring capacitance of the net.

During each net transition, there is a different energy contribution from each capacitive component. Pythia performs numerical integration in order to account for the fact that gate capacitances are a function of voltage. Moreover, it uses the depletion approximation and a first-order analytical solution for the CV integral in the case of junction capacitances. Process parameter information is provided in the form of Spice model files. Finally, Pythia expands cell-internal capacitive nodes that are collapsed in a Verilog gate-level simulation so that every circuit capacitive node is accounted for. Pythia is very fast (5-7 times slower than Verilog) and reasonably accurate (typically within 5% of Hspice and Powermill). Feedback is provided in the form of a report with energy and power information for each hierarchical block in the design. Power library generation has been automated within the Cadence Design Framework. Each library cell requires two phases of annotation. During the first phase, cell terminal capacitances are added to the cell Verilog functional view. In the second phase, an internal model is generated which identifies the internal nodes (non-terminals) of each cell and describes the switching state of such nodes for each input state. In this way, all capacitive nodes are included in the power estimation.
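The per-net bookkeeping can be sketched as follows. This is a deliberate simplification: Pythia numerically integrates the voltage-dependent gate and junction capacitances, while this toy model lumps the four annotations into one constant C and charges it from the supply on every rising transition.

```python
VDD = 1.3  # volts

class Net:
    """One annotated net with the four capacitance components."""

    def __init__(self, c_gate, c_ndrain, c_pdrain, c_route):
        self.c = c_gate + c_ndrain + c_pdrain + c_route  # farads (lumped)
        self.value = 0
        self.energy = 0.0  # joules drawn from the supply

    def drive(self, new_value):
        # Energy is drawn from the supply only on 0 -> 1 transitions;
        # a constant-C net draws E = C * VDD^2 per rising edge.
        if self.value == 0 and new_value == 1:
            self.energy += self.c * VDD ** 2
        self.value = new_value

# Example: a 30 fF net toggled through two rising edges.
net = Net(c_gate=12e-15, c_ndrain=4e-15, c_pdrain=6e-15, c_route=8e-15)
for v in (1, 0, 1, 1, 0):
    net.drive(v)
```

Summing `energy` over all nets of a hierarchical block, per simulation interval, yields the kind of per-block power report described above.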

Figure 3.17 shows the IDCT chip power estimation results partitioned among the top-level modules of the design. The data was obtained by feeding the chip 1000 random blocks from the MPEG sequence Cheerleader (a more complete description of the test sequences used appears in section 3.7). The simulated clock frequency and power supply are 14 MHz and 1.3V respectively. The software tool includes an elementary interconnect capacitance estimation mechanism based on first-order process parameters and net fanout. Approximately 30% of the total chip power of 7.95 mW is attributed to interconnect capacitance. Stage 0 and Stage 1, the main computation stages, dissipate the bulk of the power (70%). The clock buffer of the master clock dissipates a substantial 18% of the total. This is the globally distributed clock that produces all derivative pipeline clocks; the final inverter of this exponential driver has a total gate width of 56 µm (W/L ratio of 93.3). Although the TRAM contains a large number of flip-flops, its duty cycle is low (clocked once every 8 cycles) and it requires only 4% of the total power. Global control circuitry requires 5% of the power, with the remaining 4% attributed to miscellaneous circuitry for block I/O and glue logic between stages. We note that stage 1 requires considerably more power than stage 0. This happens because more non-zero coefficients are processed on average by stage 1 due to the nature of equation 3.1. Figure 3.18 plots the number of non-zero coefficients per block encountered at stage 1 of the IDCT chip vs. the number of non-zero coefficients encountered at stage 0. Figure 3.19 shows the distribution of power within the stage 1 computational block. We observe that a substantial portion of the power (27%) is used for the clock generator (and buffer) for all pipeline stages. The rest is mainly computational power.

3.4.1 Glitch Power Investigation

Pythia can report the power due to spurious transitions (glitches). This is accomplished by keeping track of each signal pulse width and reporting separately pulses that have a width less than half the master clock period. Figure 3.20 shows the percentage of power due to glitches for a number of blocks in the design. The accumulators are the most glitch-prone blocks because all three inputs (A, B, CIN) of each full adder become valid at different times. The multipliers exhibit a smaller percentage of spurious transitions because our gate model assumes equal delays between the sum and carry outputs of our full adders. As a result, two out of three inputs of each full adder in the Braun array (Figure 3.7) become valid simultaneously and glitching is reduced. We tried to equalize these two delays in the actual full adder circuit design (Figure 3.10) for this purpose. The total simulated chip power due to glitching is 1.16 mW (14.59%).
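The pulse-width rule is easy to state in code. This is a counting sketch with our naming; Pythia actually books the energy of each narrow pulse, not merely its count.

```python
def glitch_fraction(transitions, t_clk):
    """transitions: chronological (time, value) pairs for one net.
    Any interval between consecutive transitions that is shorter than
    half the master clock period t_clk is booked as a glitch."""
    glitch = total = 0
    for (t0, _), (t1, _) in zip(transitions, transitions[1:]):
        total += 1
        if t1 - t0 < t_clk / 2:   # pulse narrower than half a cycle
            glitch += 1
    return glitch / total if total else 0.0

# A net at 14 MHz (71.4 ns period) with one 2 ns spurious pulse.
trace = [(0.0, 1), (70.0, 0), (72.0, 1), (140.0, 0)]
```

Applying this classifier net-by-net and weighting by per-transition energy gives the per-block glitch-power percentages of Figure 3.20.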

3.4.2 Pipeline Power Savings Investigation

The IDCT chip employs a substantial level of pipelining for the sole purpose of lowering the power supply at real-time video sample rates; it is not required for algorithm functionality. We wanted to identify the total power savings attributed to pipelining by assessing

Figure 3.18: Stage 1 NZ Coefficients per Block vs. Corresponding Stage 0 NZ Coefficients for a Typical MPEG Sequence

Figure 3.19: Stage 1 Pythia Simulation Results (Multipliers: 1.4 mW, 33%; Clock gen: 1.15 mW, 27%; Accumulators: 1.1 mW, 26%; Misc: 0.56 mW, 13%)

Figure 3.20: Glitch Power Estimation (percent of power due to glitching for the accumulator, multiplier, stage 0, stage 1 and full chip)

Figure 3.21: Pipeline Power Estimation (percent of power attributed to pipelining for stage 1, chip, stage 0, multiplier and accumulator)

the overhead required and adjusting for a higher supply voltage to maintain the sample rate. Figure 3.21 shows the percentage of power for major chip blocks which is attributed to pipelining. This power includes the two main pipeline clock generators (one for each stage) in addition to all the pipeline flip-flops in the signal path. The 33% total power cost of pipelining is exaggerated because a significant fraction of the pipeline flip-flops serve as the state registers of the accumulators, which would still be present (along with their appropriate clock buffer) in the absence of pipelining. If we perform this adjustment, we estimate the actual cost of pipelining at about 20% of the total power. In the absence of pipelining, the chip critical path grows by a factor of four. Testing the chip at various clock frequencies (section 3.7) indicated that the supply voltage would have to be raised to 2.2V in order to maintain a sample rate of 14 MS/sec. Subtracting the overhead and accounting for the increased supply, we estimate that the chip total average power would have been about 10 mW. Pipelining therefore accounts for more than 50% of the power savings.
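The estimate reduces to two lines of arithmetic. The numbers below are the chapter's own (4.65 mW measured with pipelining, roughly 20% of it pipeline overhead, and a 2.2 V supply needed for 14 MS/s without pipelining); the VDD² dependence is the usual first-order CMOS switching-power model at fixed frequency and switched capacitance.

```python
p_pipelined = 4.65                          # mW, measured at 1.32 V, 14 MHz
p_core = p_pipelined * (1 - 0.20)           # strip the pipeline overhead
p_unpipelined = p_core * (2.2 / 1.3) ** 2   # rescale to the 2.2 V supply
savings = 1 - p_pipelined / p_unpipelined   # fraction saved by pipelining
```

This lands at roughly 10.7 mW for the unpipelined design, matching the "about 10 mW" estimate and a savings fraction above 50%.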

3.5 Chip I/O Specification and Usage

The user provides the chip with DCT spectral coefficient values on the rising edge of clock. The beginning of a new coefficient block is indicated by the assertion of the StartBlockIn pulse when coefficient (0,0) is present on the input. After a delay of 90 cycles, the pel values start to come out of the chip output pins. The start of a block is indicated by the assertion of the StartBlockOut pulse when output pel block element (0,0) is present on the output bus. Figure 3.22 presents a timing diagram that summarizes the chip operation. We

Figure 3.22: IDCT Chip Timing Diagram

note that the output block comes out in transposed form. Figure 5.41 in the corresponding section of the next chapter shows the sequence in which the DCT coefficients must be presented to the chip and the sequence in which the video pels are produced by the chip. The negative clock edge is used on the chip for sampling the input data and the StartBlockIn strobe, so that the edges are sufficiently spread apart and there is sufficient setup and hold time if the data edge occurs close to the positive clock edge (in either direction).
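The interface protocol can be sketched at cycle level. The signal names come from the text; the framing below is our model, with a block being 64 coefficients, StartBlockIn marking element (0,0), and StartBlockOut following 90 cycles later.

```python
LATENCY = 90   # cycles from StartBlockIn to the matching StartBlockOut
BLOCK = 64     # coefficients per 8x8 block

def stimulus(blocks):
    """Yield one (start_block_in, coefficient) pair per clock cycle."""
    for block in blocks:
        for k, coeff in enumerate(block):
            yield (k == 0, coeff)   # strobe high only on element (0,0)

def start_block_out_cycles(n_blocks):
    """Cycle numbers (counted from the first input cycle) on which
    StartBlockOut is asserted for back-to-back input blocks."""
    return [LATENCY + BLOCK * b for b in range(n_blocks)]
```

A test driver built this way only has to align its output capture with `start_block_out_cycles` and undo the transposition noted above.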

3.6 Chip Physical Design and Packaging

The chip has been laid out using the Cadence Design Framework. Details about the design flow and methodology are presented in Appendix A. It has been fabricated in the Hewlett-Packard AMOS14TB triple-metal single-poly 0.5 µm 3.3V process available through the MOSIS Design Service. The MOSIS scalable design rules have been used (SCMOS with λ = 0.35 µm), and as a result the minimum drawn device length is 0.7 µm. The annotated chip microphotograph is shown in Figure 3.23. The core area of the IDCT chip (not including the I/O pad frame) measures 20.7 mm² and includes approximately 160K transistors. The chip specifications are summarized in Table 3.3.

The chip package is a ceramic PGA (11x11 grid, 84-pin) available through MOSIS (PGA84M, 350 mil cavity). The chip pinout appears in Table C.1 (Appendix C). The VHH supply is a TTL-compatible 5V supply for the pads and VDD is the core supply at 1.1-2.2V. VLL is the pad ground and may be shorted to GND on the board. The package PCB footprint (top view) is shown in Figure C.1 (Appendix C).

Figure 3.23: IDCT Chip Microphotograph (annotated regions: 1-D IDCT, TRAM, 1-D IDCT)

Process       0.5 µm CMOS (0.7 µm drawn), 3ML, 3.3V
VTN           0.66 V
VTP           -0.92 V
TOX           9.6 nm
Supply        1.1-1.9 V
Frequency     5-43 MHz
Power         4.65 mW @ 1.32 V, 14 MHz
Area          20.7 mm²
Transistors   160K

Table 3.3: Process and IDCT Chip Specifications

3.7 Testing

3.7.1 Test Setup

A printed circuit board has been built for testing and measurement. A dedicated PC interface through a standard ISA National Instruments 32-bit digital I/O card has been used for providing stimuli and reading results from the board under software control. The board contains SRAM buffers and control finite state machines to permit high-speed testing in spite of the slow PC ISA interface. Moreover, the IDCT chip core uses separate supply lines to permit accurate current measurements. The board block diagram is shown in Figure 3.24. A photograph of the board is shown in Figure 3.25. The output of the DIO32F connector has been terminated with 470-Ohm resistors to ground and has been passed through two inverting CMOS Schmitt triggers (74HC14) for additional conditioning. The board operates as follows: the user uploads 256 64-element DCT blocks into the input dual-port memory (Cypress CY7C006) and the input FSM (implemented in a Lattice GAL 22v10) is triggered. The input FSM produces appropriate control signals to read the blocks from the SRAM and stimulate the IDCT chip. The 256 blocks are cycled through the chip continuously so that power measurements can be obtained while the chip is being continuously stimulated. On the other hand, the output FSM captures the 256 output blocks in the output SRAM only once: the SRAM is activated only while the first batch of 256 pel blocks is being output from the chip and is deactivated for the remainder of the chip operation. The user can upload the results to the PC from the second SRAM and check the chip output for correctness. In addition to the test structures described above, the board contains two more sections. One section contains an IDCT chip directly attached to the DIO32F connector so that the user has direct access to all chip terminals through the PC; a mechanical switch (a set of 32 jumpers) directs the single DIO32F connector to either the fast or the slow test section of the board. This section has been used for preliminary slow-speed, functionality-only testing. The final section simply contains an IDCT chip with all relevant terminals brought out to vertical headers for last-resort testing using pattern generators and logic analyzers. The IDCT test board has a single, rather annoying bug: the
This section has been used for preliminary slow- speed functionality-only testing. The final section simply contains an IDCT chip with all relevant terminals brought out to vertical headers for last resort testing using pattern gen- erators and logic analyzers. The IDCT test board has a single rather annoying bug: The 56 CHAPTER 3. IDCT CORE PROCESSOR

Figure 3.24: IDCT Chip Test Board Block Diagram

Figure 3.25: IDCT Chip Test Board Photograph

Figure 3.26: IDCT Chip Schmoo Plot (supply 1.10-1.90 V vs. frequency 2.0-43.0 MHz; the maximum passing frequency increases with supply voltage)

VLL signal was not shorted to GND as originally intended but was accidentally left floating. The problem has been fixed with the addition of several wires on the board solder side.

3.7.2 Test Results

The chip is functional over a wide range of frequencies and power supplies, as indicated by the Schmoo plot of Figure 3.26. Operation at frequencies above 50 MHz at <2.5V has been observed using high-speed digital pattern generators. Over 2.8 million separate power measurements have been taken during several weeks while the chip was stimulated from 6 different MPEG-compressed sequences with bitrates ranging from 1 Mbit/sec to 5 Mbit/sec and with rather different content (Table 3.4). The entire chip pipeline was filled with data from a single 8×8 coefficient block repeated multiple times during each one of the measurements. The results of these measurements are plotted in Figure 3.27 vs. the number of non-zero coefficients within each DCT block. On the same graph the block non-zero content histogram is also shown. The average, as well as the 95% confidence interval, is plotted for each data point. Figure 3.28 shows the average power dissipation for each of the sequences of Table 3.4 along with the corresponding average number of non-zero coefficients per block. As expected, sequence FLOWER dissipates the most power because it contains global movement and a lot of high-frequency detail. Figure 3.29 shows the detailed power dissipation profile for all 6 test sequences. Figure 3.30 shows the average block power dissipation per MPEG macroblock type [MPFL97] and predictive picture type (I, P, B) for sequence SUSI. A brief description of I,

Sequence      Resolution   Bitrate   Content
BALLET        704x480      5M        Foreground movement, panning, scene change
BICYCLE       704x480      5M        Background movement, zooming
CHEERLEADER   704x480      5M        Foreground movement
FLOWER        352x240      1.5M      Panning
SUSI          352x288      1.5M      Head and shoulders, little foreground movement
TIGER         352x240      1M        Foreground/background movement, panning, scene change

Table 3.4: MPEG Sequences Used for Testing the IDCT Chip

Figure 3.27: IDCT Chip Measured Power Results at 1.32V, 14 MHz (average = 4.65 mW). This plot represents over 2.8 million separate power measurements.

Figure 3.28: IDCT Chip Measured Power Results at 1.32V, 14 MHz. Average Power Dissipation per Sequence.

P, and B pictures in an MPEG sequence has been presented in section 2.5. We observe that intra-coded macroblocks dissipate the most power. This is expected, since intra blocks have on average more non-zero coefficients due to the lack of differential coding. We note that intra macroblocks may appear in all three types of pictures, with the constraint that I pictures may only contain intra macroblocks. On the other hand, bidirectionally coded macroblocks dissipate the least power: such macroblocks have the most freedom in predictor picture usage (both past and future pictures), so they are typically coded most efficiently and exhibit a large number of zero-valued coefficients per block. Our power measurements have indicated that blocks with the same number of non-zero coefficients can exhibit a range of power dissipation. We have investigated this variation by exercising the chip with blocks that contained a single non-zero coefficient, exhaustively spanning the entire position-value space (2^18 measurements). Figure 3.31 shows the average power dissipation among all possible 12-bit values for each position within the 8×8 spectral block. We observe a maximum of 60% variation from position to position. This is mainly due to the different number of multiplications required for each spectral position within the block. The surface plot of Figure 3.32 shows power variation with respect to both block position and coefficient value. The two position axes of Figure 3.31 have been collapsed to a single 64-position axis. We observe a slight reduction in power with decreasing magnitude. This is due to the fact that all multipliers in the chip implement sign-magnitude arithmetic and exhibit reduced activity for small-magnitude numbers. Along the zero-value axis there is an abrupt reduction in power because the arithmetic units are not active at all and power is only due to master clock distribution, transposition memory operation and global control. Turning our focus back to Figure 3.27, we observe that power dissipation shows a strong correlation with the number of non-zero coefficients because of the data-dependent processing algorithm
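The stimulus set for this experiment is compact to describe. The generator below is a sketch with our naming; it enumerates every 12-bit two's-complement value at every one of the 64 spectral positions, giving 64 × 4096 = 2^18 measurement points (the zero-value axis included).

```python
def single_coeff_blocks():
    """Yield (row, col, value, block) for each measurement point:
    one 8x8 block with at most a single non-zero coefficient."""
    for pos in range(64):
        row, col = pos // 8, pos % 8
        for value in range(-2048, 2048):     # all 12-bit 2's-complement codes
            block = [0] * 64
            block[pos] = value
            yield row, col, value, block

count = sum(1 for _ in single_coeff_blocks())   # 2**18 stimulus blocks
```

Averaging the measured power over the value axis for each (row, col) yields the position surface of Figure 3.31; keeping both axes yields the surface of Figure 3.32.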

Figure 3.29: Power Dissipation Profile for 6 MPEG Sequences. IDCT Chip Measured Power Results at 1.32V, 14 MHz (per-sequence averages: Ballet 4.44 mW, Bicycle 4.08 mW, Cheerleader 4.65 mW, Flower 5.75 mW, Susi 4.66 mW, Tiger 4.57 mW).

[Figure 3.30: Average Block Power Dissipation per MPEG Macroblock Type (Sequence SUSI). Two panels (luminance and chrominance blocks) plot average power (mW) and block counts for five macroblock categories: no motion compensation (B, P pictures), forward-coded (P, B pictures), backward-coded (B pictures), bidirectionally coded (B pictures), and intra-coded (I, B, P pictures).]

[Figure 3.31: IDCT Chip Measured Power Results at 1.32 V, 14 MHz: Average Power Dissipation per Block Position Across All Possible 12-bit DCT Coefficient Values (1 NZ coefficient per block). Surface plot of power (0 to 1.5 mW) vs. block row i and block column j.]

[Figure 3.32: IDCT Chip Measured Power Results at 1.32 V, 14 MHz: Blocks Containing 1 NZ Coefficient of Magnitude up to 16. Surface plot of power (roughly 1.6 to 2.8 mW) vs. coefficient position (0 to 64) and coefficient value (-16 to 16).]

[Figure 3.33: Comparison of Simulated and Experimental Power Results. Measured average power and Pythia simulation results (with and without interconnect capacitance estimation) vs. number of non-zero coefficients.]

The average power dissipation for this data set is 4.65 mW at 1.32 V, 14 MHz. The power savings exhibit diminishing returns as the number of non-zero coefficients increases. This happens because the first 1-D IDCT stage produces many more non-zero values at its output, as mentioned in section 3.4. As we move along the x-axis in Figure 3.27, most of the gains come from stage 0 only; the gains from the second stage become less significant. At the average workload of 5.68 non-zero coefficients for this dataset, however, power savings from both stages are significant. Figure 3.33 compares the results obtained from our Pythia simulator for a set of 300 random blocks from the Cheerleader sequence with the corresponding measured averages of Figure 3.27. Two sets of simulation results are shown: one set includes interconnect capacitance estimation and the second was obtained with interconnect estimation off.
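The operation-skipping idea behind the data-driven computation can be sketched in a few lines. This is an illustrative model of skipping zero coefficients, not the chip's distributed-arithmetic datapath:

```python
import math

N = 8

def c(k):
    """Orthonormal 8-point DCT scale factor."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

# Precompute basis[k][n] = c(k) * cos((2n+1) k pi / 16)
BASIS = [[c(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
         for k in range(N)]

def idct_1d_data_driven(coeffs):
    """1-D IDCT that performs multiply-accumulates only for non-zero
    spectral coefficients, mirroring the operation-skipping idea."""
    out = [0.0] * N
    for k, F in enumerate(coeffs):
        if F == 0:          # no arithmetic at all for zero coefficients
            continue
        row = BASIS[k]
        for n in range(N):
            out[n] += F * row[n]
    return out
```

A column with a single non-zero coefficient costs 8 multiply-accumulates instead of 64, and an all-zero column costs none, which is why measured power tracks the non-zero coefficient count so closely.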

3.7.3 Comparison with Past DCT/IDCT VLSI Implementations

There have been numerous VLSI DCT/IDCT implementations in the literature. Tables 3.5 to 3.7 summarize the specifications of recent high performance DCT/IDCT chips that use distributed arithmetic and have no significant architectural differences.

The 1996 Toshiba macrocell [KFN+96] (Table 3.7) has very impressive power dissipation characteristics (10 mW @ 0.9 V, 150 MHz). There are two reasons for the low power dissipation: first, it is implemented in a much smaller feature size process (0.3 µm) than the other two chips; second, it is implemented in a triple-well low-VT process which permits aggressive voltage scaling to 0.9 V. Moreover, the triple well permits control over the threshold voltage value (through the source-to-bulk potential). Such control is used to raise VT in idle periods and reduce standby leakage current. The authors maintain that

Area: 21 mm², 0.8 µm 2ML process
Supply Voltage: 5 V
Transistors: 102K
Clock Rate / Sample Rate: 100 MHz / 100 Msamples/sec
Power Dissipation: N/A

Table 3.5 Mitsubishi Electric DCT/IDCT Processor [UIT+92]

Area: 13.3 mm², 0.6 µm 2ML process
Supply Voltage: 2 V / 3.3 V
Transistors: 120K
Clock Rate / Sample Rate: 200 MHz @ 3.3 V, 100 MHz @ 2 V
Power Dissipation: 0.35 W @ 3.3 V, 200 MHz / 0.15 W @ 2 V, 100 MHz

Table 3.6 Toshiba 1994 DCT/IDCT Macrocell [MHS+94]

Area: 4 mm², 0.3 µm 2ML process, triple well
Supply Voltage: 0.9 V (VT = 0.15 V ± 0.1 V)
Transistors: 120K
Clock Rate / Sample Rate: 150 MHz / 150 Msamples/sec
Power Dissipation: 10 mW @ 0.9 V, 150 MHz

Table 3.7 Toshiba 1996 DCT/IDCT Macrocell [KFN+96]

Area: 7 mm², 0.5 µm process
Supply Voltage: 3 V
Transistors: 69K
Clock Rate / Sample Rate: 58 MHz / 58 Msamples/sec
Power Dissipation: 250 mW @ 3 V, 58 MHz

Table 3.8 AT&T Bell Labs 1993 IDCT Processor [BH95]

Chip | Switched Capacitance per Sample
Matsui et al. [MHS+94] | 375 pF
Bhattacharya et al. [BH95] | 479 pF
Kuroda et al. [KFN+96] (scaled to same feature size) | 417 pF
Present Work | 190 pF

Table 3.9 Energy Efficiency Comparison Among DCT/IDCT Chips

if the process VT were 0.5 V, the chip supply would have to increase to 1.7 V and the power dissipation to 40 mW in order to maintain performance at 150 MHz. This macrocell is far more interesting in terms of circuit design and implementation than in terms of architecture. In fact, examination of the microphotographs of chips [MHS+94] and [KFN+96] reveals that they are remarkably similar. Both of these chips demonstrate similar switched-capacitance per sample metrics after process normalization.

The final reference that we will examine is an AT&T Bell Labs IDCT processor [BH95] that has a very different implementation from the previously described chips. Its specifications are listed in Table 3.8. It uses seven hardwired multipliers and exploits regularity in the IDCT matrix after some row/column reordering. This design has been used in an AT&T MPEG video decoder chip.

Table 3.9 shows the switched capacitance per sample for the past IDCT chips summarized above and compares the results with the current work. The specification for [KFN+96] has been scaled up to account for its much smaller process feature size (0.3 µm) and threshold voltage (VT in the 0.1 to 0.15 V range). The present work exhibits lower switched capacitance per sample than previous implementations by a factor of 2. The switched-capacitance per sample metric takes into account only algorithmic and high-level architectural design decisions that reduce the total activity; it factors out parameters such as clock frequency and supply voltage. It must be noted that the present chip employs a higher degree of pipelining than the other chips for the sole purpose of reducing the power supply, and the switched-capacitance metric penalizes such a design choice. Our algorithmic choice has therefore been justified, given that all past chips employ standard algorithms performing a fixed number of operations per block.

Our final observation is that Figure 3.27 indicates that the power dissipation of the IDCT chip improves significantly at lower bitrates (coarser quantization). This makes our approach ideal for emerging quality-on-demand compression protocols. At low rates, the present chip will exhibit much greater efficiency relative to other implementations than Table 3.9 indicates.
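The switched-capacitance metric follows from P = C_sw V² f: dividing measured power by the supply squared and the sample rate factors out voltage and frequency. A quick sketch reproduces the Table 3.9 entries from the measured numbers quoted above (assuming, for the present chip, one sample per 14 MHz clock):

```python
def switched_capacitance_per_sample(power_w, vdd_v, sample_rate_hz):
    """C_sw = P / (V^2 * f), the per-sample switched capacitance."""
    return power_w / (vdd_v ** 2 * sample_rate_hz)

# (power W, supply V, sample rate Hz) taken from the tables above
chips = {
    "Matsui et al. [MHS+94]":     (0.150, 2.0, 100e6),
    "Bhattacharya et al. [BH95]": (0.250, 3.0, 58e6),
    "Present work":               (4.65e-3, 1.32, 14e6),
}

for name, (p, v, f) in chips.items():
    pf = switched_capacitance_per_sample(p, v, f) * 1e12
    print(f"{name}: {pf:.0f} pF")
```

Running the sketch yields approximately 375, 479 and 190 pF respectively, matching Table 3.9 (the [KFN+96] entry additionally requires the process normalization described in the text).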

3.8 Chapter Summary and Conclusion

We have demonstrated an ultra low power IDCT chip designed in a widely available 0.7 µm high-VT process. The chip demonstrates the lowest switched capacitance per sample among past IDCT chips presented in the literature. Low power operation has been achieved through the selection of a data-dependent algorithm, aggressive voltage scaling, deep pipelining, extensive clock gating and appropriate transistor-level circuit techniques. The chip is fully functional and has undergone extensive testing. It dissipates 4.65 mW at 1.32 V, 14 MHz.

Chapter 4

Adaptive Transform Coding

The input data of the DCT chip has very different statistical properties from the spectral coefficients that the IDCT chip operates on. In order to reduce the duty cycle of the DCT core processor we have resorted to original adaptation techniques based on ideas suggested by previous investigators. Before proceeding with the description of the DCT core processor (chapter 5), we examine previous work in adaptive transform image and motion picture coding and survey various proposed image classifiers and adaptation techniques. A number of important ideas from this chapter have been embedded in the DCT chip (i.e. classification and adaptive precision) for power minimization.

4.1 Early Adaptive Schemes [TW71] [Gim75]

As early as 1971, Tasto and Wintz [TW71] had proposed an image compression strategy based on an adaptive block quantizer. They classify each block into one of L categories and apply a different Karhunen-Loeve Transform with different precision and quantization rules for each category. Their classification scheme defines three subimage categories:

1. Subpictures with much detail. 2. Subpictures with little detail and darker than average. 3. Subpictures with little detail and lighter than average.

This scheme is quite complicated because of the different transformations for each category. A much simpler scheme based on activity classes was proposed by Gimlett [Gim75], targeted to the Fourier, Hadamard, Haar, Slant and Karhunen-Loeve transforms. Gimlett suggests the following activity index to serve as a measure of the "busyness" of a subpicture of N pels:

A = \sum_{i=1}^{N} a_i |F_i|    (4.1)

or

A = \sum_{i=1}^{N} a_i F_i^2    (4.2)

The F_i are the transform coefficients and the a_i are weighting factors (e.g. unity, or inversely proportional to the variances of the F_i). Gimlett refrains from suggesting a specific coding scheme, but uses as an example a classification on the 25, 50 and 75 percent ordinates of the cumulative distribution of A and subsequent use of adaptive quantization and zonal sampling.
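Gimlett's index and the quartile-based class assignment can be sketched as follows; the function names are ours, and eq. 4.1 with unity weights is used:

```python
def activity(block_coeffs, weights=None):
    """Gimlett's activity index (eq. 4.1): A = sum_i a_i * |F_i|."""
    if weights is None:
        weights = [1.0] * len(block_coeffs)
    return sum(a * abs(F) for a, F in zip(weights, block_coeffs))

def classify_by_quartiles(activities):
    """Assign each block to one of 4 classes using the 25/50/75 percent
    ordinates of the cumulative distribution of A, as in Gimlett's example."""
    ranked = sorted(activities)
    n = len(ranked)
    cuts = [ranked[n // 4], ranked[n // 2], ranked[3 * n // 4]]
    return [sum(a > c for c in cuts) for a in activities]
```

Each block's class is simply the number of CDF ordinates its activity exceeds, yielding four roughly equally populated groups.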

4.2 Chen and Smith Adaptive Coder [CS77]

Chen and Smith [CS77] have reinvented Gimlett's results [Gim75] and presented a complete methodology for designing an adaptive image coder based on a simple statistical model of cosine transform samples. Although this work uses exactly the same classifier as [Gim75] (eq. 4.2 with a_i = 1), it enjoys seminal status in the adaptive image coding literature and has been widely referenced.

Chen and Smith assume that each image pixel f(j,k) is a sample of a random process with mean m. Thus, the mean and variance of the transform coefficients F(u,v) can be computed as follows:

E[F(u,v)] = \frac{4 m c(u) c(v)}{N^2} \sum_{j=0}^{N-1} \sum_{k=0}^{N-1} \cos\frac{(2j+1)u\pi}{2N} \cos\frac{(2k+1)v\pi}{2N}    (4.3)

c(u) = 1/\sqrt{2} if u = 0, and 1 otherwise    (4.4)

E[F(0,0)] = 2m    (4.5)

E[F(u,v)] = 0, \quad (u,v) \neq (0,0)    (4.6)

\sigma^2_{F(0,0)} = E[F^2(0,0)] - 4m^2    (4.7)

\sigma^2_{F(u,v)} = E[F^2(u,v)], \quad (u,v) \neq (0,0)    (4.8)

The authors justify an invocation of the central limit theorem because each DCT coefficient is a weighted sum of all the pixels in the original image. They conclude that the probability density functions for the DC and AC coefficients can separately be modelled by Gaussian PDFs:

p_{F(0,0)}(x) = \frac{1}{\sqrt{2\pi}\,\sigma_{F(0,0)}} \exp\left[-\frac{(x-2m)^2}{2\sigma^2_{F(0,0)}}\right]    (4.9)

p_{F(u,v)}(x) = \frac{1}{\sqrt{2\pi}\,\sigma_{F(u,v)}} \exp\left[-\frac{x^2}{2\sigma^2_{F(u,v)}}\right], \quad (u,v) \neq (0,0)    (4.10)

Given such a statistical model for the DCT coefficients, adaptivity lies in the assignment of more bits to the higher-variance subblocks and fewer bits to the lower-variance subblocks. This amounts to adaptive quantization (or normalization, in the authors' terminology). Eq. 4.8 links the AC coefficient variance to a sum of squares of AC DCT coefficients. Thus the classifier is defined as the AC energy in the (m,l)th subblock (assuming 16 x 16 subblocks):

E_{m,l} = \sum_{u=0}^{15} \sum_{v=0}^{15} F^2_{m,l}(u,v) - F^2_{m,l}(0,0)    (4.11)

[Figure 4.1: Chen-Smith Transform Adaptive Coding System [CS77]. Block diagram: DCT, AC-energy classification against the classifier CDF, per-class variance matrices, R(D)-based bit allocation, normalization factor and Lloyd-Max quantization with overhead information, coding, channel, and the inverse chain (decode, inverse quantization, inverse normalization, IDCT) at the receiver.]

Note that eq. 4.11 is equivalent to the Gimlett classifier (eq. 4.2) with a_i = 1. The real-time adaptive image coder works in two passes. During the first pass, the AC energy (eq. 4.11) of each subblock is computed and a cumulative distribution of the classifier is constructed. Next, each transform block is classified into one of four equally populated groups. The collective variance of the AC samples in each class is computed (\sigma^2_{k;F(u,v)}, k \in \{0,1,2,3\}, (u,v) \neq (0,0)) and the bit assignment N_{B_k}(u,v) is determined using a well known relation from Rate Distortion Theory [Dav72] [CT91] [NH95]:

N_{B_k}(u,v) = \frac{1}{2} \log_2 \frac{\sigma^2_{k;F(u,v)}}{D}, \quad k \in \{0,1,2,3\}, \quad (u,v) \neq (0,0)    (4.12)

Equation 4.12 is the rate-distortion function R(D) for a Gaussian source given a fixed distortion D, which in this particular case is the mean-square error between the source data and its binary representation. This function computes the minimum bit assignment N_{B_k}(u,v) for a fixed MSE distortion criterion and is used to determine bit allocation matrices for each subblock class. Distortion D is initialized and the sum of all bit assignments N_{B_k}(u,v) is iteratively computed until the desired bitrate for the image is achieved. DC coefficients are assigned 8 bits in each of the four classes. The final step of the first pass is to compute a common normalization factor to be applied before quantization. The authors use the maximum standard deviation of those elements that were assigned one bit by eq. 4.12.
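The first-pass allocation loop can be sketched as a search for the distortion D that makes the total of the eq. 4.12 assignments hit the target rate. The clamping of negative assignments to zero bits and the geometric bisection are our assumptions about the iteration, which the text leaves unspecified:

```python
import math

def bit_allocation(variances, target_bits):
    """Find D such that sum of N(u,v) = max(0, 0.5*log2(var/D)) (eq. 4.12)
    equals the desired total rate. total_bits(D) is decreasing in D, so a
    bisection on log D converges."""
    def total_bits(D):
        return sum(max(0.0, 0.5 * math.log2(v / D)) for v in variances if v > 0)

    lo, hi = 1e-12, max(variances)
    for _ in range(200):
        mid = math.sqrt(lo * hi)          # bisect in the log domain
        if total_bits(mid) > target_bits:
            lo = mid                      # too many bits: raise D
        else:
            hi = mid                      # too few bits: lower D
    D = math.sqrt(lo * hi)
    return D, [max(0.0, 0.5 * math.log2(v / D)) if v > 0 else 0.0
               for v in variances]
```

For two coefficients with variances 4 and 1 and a 1.5-bit budget, the search settles on D = 2^{-1/2} and assigns 1.25 and 0.25 bits, exactly the analytic solution of eq. 4.12.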

[Figure 4.2: Chen-Pratt Scene Adaptive Coder [CP84]. Block diagram: image, DCT F(u,v), threshold T giving F_T(u,v), normalization by D giving F_TN(u,v), quantization, coding, rate buffer, fixed-rate channel; the rate buffer state is fed back to set the normalization factor D.]

During the second pass, the coefficient samples are multiplied by the normalization factor and are optimally quantized using Max's scheme (section 2.3.1). Then the data is entropy-coded and overhead information is added to the bitstream. A block diagram of the Chen-Smith coder is shown in Figure 4.1. The authors report very good results at a 1 bit/pixel compression ratio for two test images (PSNRs of 33 and 36 dB).

4.3 Scene Adaptive Coder [CP84]

In 1984, Chen and Pratt [CP84] attempted to remove the complexity of the previous two-pass scheme and proposed a very simple single-pass coder that still maintains a level of adaptivity. The Scene Adaptive Coder is shown in Figure 4.2.

First, the input image undergoes a 16 x 16 two-dimensional DCT. The transform coefficients F(u,v), except for the DC coefficient F(0,0), undergo a thresholding process as follows:

F_T(u,v) = F(u,v) - T if F(u,v) > T, and 0 if F(u,v) \leq T    (4.13)

The authors demonstrate two test pictures with 90% of their DCT coefficient magnitudes below T = 3. The thresholding process (eq. 4.13) sets a major portion of the coefficients to zero and limits the portion of the coefficients to be quantized and transmitted. The threshold parameter T may vary with the global desired bitrate, or it can be varied locally on a subblock basis.

The processed coefficients F_T(u,v) are then scaled by a feedback normalization factor D that reflects the state of the rate buffer:

F_{TN}(u,v) = \frac{F_T(u,v)}{D}    (4.14)

Quantization is simply a floating-point to integer conversion. Essentially, the entire normalization/quantization process can be thought of as uniform quantization with a variable stepsize D. No decision and reconstruction levels are required:

\hat{F}_{TN}(u,v) = \lfloor F_{TN}(u,v) + 0.5 \rfloor    (4.15)

The quantization process (4.15) will transform all the coefficients of fractional value to zero and leave only a limited number of significant coefficients to be Huffman-coded and transmitted. The authors report PSNRs of 34.45 dB and 32.85 dB for lena and plane at 0.4 bits per pel.

The Scene Adaptive Coder can be thought of as similar to [CS77] with slightly different normalization/quantization techniques and a time-shifted and averaged classifier: the state of the rate buffer can be thought of as a measure of the AC coefficient energy in the previously coded blocks. As a result, the Scene Adaptive Coder must exhibit lower PSNRs at the same bitrate, given that the classifier does not depend on the currently coded block. No such comparison is performed by the authors. The main advantage of this scheme is that only one pass through the image is necessary.
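Eqs. 4.13 through 4.15 chain into a few lines. This literal sketch follows the equations as written; the handling of coefficient signs, which the text does not spell out, is ignored:

```python
import math

def scene_adaptive_quantize(block, T, D):
    """Threshold (eq. 4.13), normalize by the buffer-driven factor D
    (eq. 4.14), then round to the nearest integer (eq. 4.15). The DC
    coefficient (index 0) is normalized and quantized but not thresholded."""
    out = []
    for i, F in enumerate(block):
        if i == 0:
            Ft = F                 # DC is exempt from thresholding
        elif F > T:
            Ft = F - T             # eq. 4.13, upper branch
        else:
            Ft = 0.0               # eq. 4.13, lower branch
        out.append(math.floor(Ft / D + 0.5))   # eqs. 4.14-4.15
    return out
```

With T = 3 and D = 4, a block [100, 5, 2, 1] quantizes to [25, 1, 0, 0]: only the DC and one significant AC coefficient survive, which is exactly the sparsity the single-pass coder relies on.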

4.4 Subjectively Adapted Image Coding for Progressive Transmission [Loh84]

Lohscheller [Loh84] has suggested an adaptive image communication system using a classification algorithm adapted to the visual threshold performance of the human eye. This communication scheme is applied not to image compression but to progressive image transmission. The goal of this research is to reduce average image transmission rates by incorporating human interaction for the selection of specific areas for buildup, or for the rejection of additional image sharpening. Under this scheme, the image receiver is provided with a recognizable image using as few bits as possible. The image buildup from stage to stage is realized without transmission of redundant information, and the final image quality has no visual artifacts compared to the original. These goals are achieved through class-oriented determination of the number and sequence of the spectral coefficients to be transmitted.

Lohscheller defines the visibility threshold γ_uv of a spectral coefficient F(u,v) as the coefficient amplitude that results (after inverse transformation) in a just-visible stimulus for the human eye. Comprehensive results of extensive psychovisual experiments are cited to justify a certain set of minimum values for γ_uv. The block classifier is then based on the frequency distribution of the threshold exceedings of the spectral coefficients. Lohscheller's classification scheme has no influence on the quantization of the DCT coefficients (no adaptive quantization is performed as in [CS77] [CP84]). It only affects DCT coefficient transmission control, i.e. the number and transmission sequence of the coefficients. More specifically, the number and identity of the spectral coefficients transmitted are determined using visibility threshold exceeding frequency distributions, while the transmission sequence is determined using coefficient variance information. An example 8-way classification scheme computed for a set of 17 images is shown in Table 4.1.

To quantify the superiority of this algorithm, the author plots MSE vs. image transmission time for both the visually adapted progressive DCT and a "standard" DCT. As expected, the MSE rolls off much more quickly in the progressive DCT. This comparison is of little value since there are no means for rate control of the "standard" DCT. Lohscheller essentially compares a progressively transmitted image with an incomplete non-progressive

Class K1: 1 coefficient; sequence 1,1; 10 bits
Class K2: 2 coefficients; sequence 1,1 / 1,2; 17 bits
Class K3: 3 coefficients; sequence 1,1 / 1,2 / 2,1; 23 bits
Class K4: 4 coefficients; sequence 1,1 / 1,2 / 2,1 / 1,3; 28 bits
Class K5: 6 coefficients; sequence 1,1 / 1,2 / 2,1 / 2,2 / 1,3 / 3,1; 36 bits
Class K6: 8 coefficients; sequence 1,1 / 2,1 / 1,2 / 2,2 / 1,3 / 2,3 / 3,2 / 3,1; 43 bits
Class K7: 12 coefficients; sequence 1,1 / 2,1 / 2,2 / 3,1 / 3,2 / 1,2 / 1,3 / 4,1 / 2,3 / 3,3 / 1,4 / 1,5; 56 bits
Class K8: 16 coefficients; sequence 1,1 / 1,2 / 2,1 / 1,3 / 2,2 / 3,1 / 1,4 / 2,3 / 3,2 / 4,1 / 1,5 / 2,4 / 3,3 / 4,2 / 3,4 / 4,3; 67 bits

Table 4.1 Example Classification Using Lohscheller's Visual Adaptive Scheme [Loh84]

image. A comparison of the visually adapted progressive scheme vs. some other progressive DCT (e.g. AC coefficient magnitude prioritization [HDG92]) would have been much more effective.

4.5 Adaptive DCT in the Perceptual Domain [NLS89]

Ngan, Leong and Singh [NLS89] have proposed an adaptive cosine transform coding scheme which incorporates human visual system (HVS) properties and employs adaptive quantization and block distortion equalization. The authors exploit two aspects of the HVS in the design of the image coder. The first is the spatial bandpass characteristic of human vision. A generalized model of the HVS can be represented by the function

H(\omega) = (a + b\omega) e^{-c\omega}    (4.16)

where \omega is the radial spatial frequency in cycles per degree of angle subtended by the viewer, and a, b and c are modelling parameters determining the shape of the HVS radial frequency response.

Nill [Nil85] has posed the problem of the frequency-domain physical significance of the DCT and the intuitive meaning of its convolution-multiplication property. The use of the DCT in image coding assumes an even extension of the original scene (section 2.1). This causes loss of physical significance of the image representation in the DCT domain, because the altered scene is not viewable by a human observer. To overcome this problem, Nill introduces a correction function A(\omega) [Nil85]. The modified HVS function becomes:

\hat{H}(\omega) = |A(\omega)| H(\omega)    (4.17)

A two-dimensional circularly symmetric version of the HVS can be defined as:

\tilde{H}(u,v) = \hat{H}(\omega), \quad \omega = \sqrt{u^2 + v^2}    (4.18)

It is \tilde{H}(u,v) that is used to weigh the DCT coefficients in the perceptual domain:

\hat{F}(u,v) = \tilde{H}(u,v) F(u,v)    (4.19)

The second aspect of the HVS exploited is visual spatial masking. This effect causes noise to be less perceptible in regions of high activity than in regions of low activity. The authors estimate the activity C(m,n) of the (m,n)th block from the AC coefficients:

C(m,n) = \frac{1}{255} \left[ \sum_{u=0}^{15} \sum_{v=0}^{15} \hat{F}^2_{m,n}(u,v) - \hat{F}^2_{m,n}(0,0) \right]^{1/2}    (4.20)

where \hat{F}_{m,n}(0,0) is the DC coefficient and \hat{F}_{m,n}(u,v) are the AC coefficients of the (m,n)th block. The activity C(m,n) is used to control an adaptive quantizer. Although C(m,n) is very similar to previous classifiers [Gim75] [CS77], the authors use it in the opposite fashion from previous investigators: blocks that exhibit a high activity index are more coarsely quantized than blocks that exhibit low activity. The authors attribute this implementation decision to the effect of visual spatial masking.

Results are presented for LENA and PEPPERS. These images exhibit PSNRs of 33.5 and 32.5 dB respectively at 0.4 bpp. No details are given as to how the SNR is measured (before or after weighting in the perceptual domain). Moreover, it is expected that the classification method employed by the authors (coarse quantization in blocks of high activity as opposed to blocks of low activity) will actually decrease the SNR value relative to a traditional scheme. No treatment of this issue is offered. The authors' conclusion is that given a certain compression ratio, their visually weighted scheme produces subjectively superior results.

4.6 Adaptive DCT Based on Coefficient Power Distribution Classification [KMO87]

Kato, Mukawa and Okubo [KMO87] have proposed an adaptive DCT encoding method based on the coefficient power distribution (CPD), borrowing ideas from vector quantization image compression [NH95]. Their classifier attempts to capture two major characteristics of the DCT-coded subblocks:

- Positions of significant coefficients in the subblock.
- Power of significant coefficients.

Previous investigators up to this point had only been concerned with a single attribute (the power of the AC coefficients). The authors claim that a multi-dimensional classification can produce increased coding efficiency. First, the power distribution of the input DCT subblock is computed:

S(u,v) = F^2(u,v)    (4.21)

where F(u,v) denotes the (u,v) element of the input DCT subblock. S(u,v) represents the coefficient power distribution (CPD). A certain number of predetermined CPDs is at hand (S_i), and each subblock is assigned a category index i based on the closest match between S and S_i over all i. The matching criterion J_i is defined as follows:

J_i = \sum_u \sum_v \left[ g\{S(u,v)\} - g\{S_i(u,v)\} \right]^2    (4.22)

g(z) = \frac{1}{2} \log_2 z + a for z > 1, and 0 for z \leq 1    (4.23)

where S_i(u,v) represents the ith predetermined CPD. The function g(z) has been determined experimentally and is introduced on the grounds that the classification must be based on the entropy of signals with variance z. The predetermined CPDs are designed from a test sequence in a method similar to the LBG vector quantization codebook design algorithm [LBG80]. The adaptivity is provided by adaptive assignment of variable length codes (VLCs) based on the classification index i. The authors refrain from using adaptive quantization and bit allocation because a larger quantization error is introduced when the statistical characteristics of the DCT coefficients differ from those of the coefficients used when the bit allocation matrices were initially calculated.

The authors present simulation results comparing a 65-class motion picture encoding system with a system based on the Scene Adaptive Coder [CP84]. Their results indicate a 0.5-1.0 dB higher SNR than the Scene Adaptive Coder for the same bitrate. No information is provided about the training set used for calculation of the 65 predetermined CPDs. The main disadvantage of this adaptive scheme is that the determination of the classification index is a very computationally intensive operation, and that the quality of the compressed stream is highly dependent on the training set and the predetermined CPDs.
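The classification step (eqs. 4.21 to 4.23) can be sketched directly. The constant a in g(z) is a modelling parameter whose value here is an arbitrary placeholder, and the function names are ours:

```python
import math

def g(z, a=1.0):
    """Entropy-motivated weighting of eq. 4.23; a is a modelling parameter."""
    return 0.5 * math.log2(z) + a if z > 1 else 0.0

def cpd(block):
    """Coefficient power distribution (eq. 4.21): S(u,v) = F(u,v)^2."""
    return [F * F for F in block]

def classify(block, predetermined_cpds):
    """Return the index i of the predetermined CPD minimizing J_i (eq. 4.22)."""
    S = cpd(block)
    costs = [sum((g(s) - g(si)) ** 2 for s, si in zip(S, Si))
             for Si in predetermined_cpds]
    return min(range(len(costs)), key=costs.__getitem__)
```

A block [4, 0, 0, 0] has CPD [16, 0, 0, 0] and so matches a codebook entry [16, 0, 0, 0] (cost 0) rather than the all-zero entry, illustrating how both position and power of significant coefficients drive the match.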

4.7 Spectral Entropy-Activity Classification [MF92]

Mester and Franke [MF92] introduce a two-dimensional feature space for classification of image transform subblocks. The first feature is the block activity as suggested by Gimlett:

A = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |F(u,v)|, \quad (u,v) \neq (0,0)    (4.24)

Activity A is a measure of contrast and detail within a block. Yet, the authors maintain that adapting an image transform coding system on A alone is not sufficient. Although in most transform block spectra only a small number of AC coefficients contain most of the total AC energy and have relatively large magnitudes compared to the rest, this effect is only guaranteed on average. There may be blocks where the AC energy is partitioned among a large number of coefficients. Adaptive coding schemes based on activity alone will produce significant distortions for such blocks that exhibit non-average behavior. The distribution of the AC energy must also be considered by measuring the actual degree of spectral energy compaction. For this purpose, the authors introduce the spectral entropy, a quantity equivalent to the entropy of a discrete random variable [CT91], which measures the degree to which its probability mass function is equally distributed among the experimental values:

E = -\sum_{u=0}^{N-1} \sum_{v=0}^{N-1} a(u,v) \ln a(u,v), \quad (u,v) \neq (0,0)    (4.25)

a(u,v) = \frac{|F(u,v)|}{A}    (4.26)

where A has been defined in eq. 4.24. Mester and Franke maintain that the spectral entropy E is an excellent measure of the energy compaction achieved by the transform. Low entropy values occur if the energy is distributed among only a few coefficients, and vice versa.

The authors have designed a 100-rectangular-region 2D feature classification space from 10 separate values of A and E. The adaptive coder variable is threshold control, equivalent to [CP84]. Through subjective measurements, a threshold function t(A,E) has been determined. Threshold values grow with increasing activity A (visual masking in the presence of high frequencies [NLS89]) and with declining spectral entropy E.

In order to study the performance of this scheme, the authors coded (with no quantization) a set of 17 images using adaptive thresholding. The resulting images were reported to be of very high quality when observed under standard viewing conditions. In this set of images, the number of supra-threshold DCT coefficients per block varied between 2 and 7.5, with an average of four. No quantitative image quality results (in terms of PSNR or MSE) are reported.
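Both features of the Mester-Franke classifier follow directly from eqs. 4.24 to 4.26; a fully compacted block yields zero entropy while evenly spread AC energy maximizes it. A minimal sketch (function name is ours, DC excluded by the caller):

```python
import math

def activity_and_entropy(ac_coeffs):
    """Block activity A (eq. 4.24) and spectral entropy E (eqs. 4.25-4.26)
    computed over the AC coefficients only."""
    A = sum(abs(F) for F in ac_coeffs)
    if A == 0:
        return 0.0, 0.0
    E = 0.0
    for F in ac_coeffs:
        a = abs(F) / A          # normalized magnitude, eq. 4.26
        if a > 0:
            E -= a * math.log(a)
    return A, E
```

A block with a single non-zero AC coefficient gives E = 0 (perfect compaction), while four equal coefficients give E = ln 4, the maximum for four terms.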

4.8 Other Adaptive Methods

Similar treatment of high-magnitude DCT coefficients appears in [HDG92]. The authors have proposed a prioritized DCT (PDCT) for compression and progressive transmission of images. Images are DCT coded and the resulting coefficients are partially prioritized in decreasing magnitude order. The number of passes as well as the magnitude partition ranges are parametrised. Coded position information is also embedded with minimal overhead. The authors show that this scheme results in steep PSNR vs. bitrate curves, which confirms the large perceptual significance of high-magnitude DCT coefficients.

WORK | CLASSIFIER | ADAPTATION TECHNIQUE
[Gim75] | AC Energy | Adaptive Quantization
[CS77] | AC Energy | Adaptive Quantization
[CP84] | Rate Buffer Status [AC Energy of Previous Blocks] | Adaptive Threshold Control, Adaptive Quantization
[Loh84] | Frequency Distribution of Spectral Threshold Exceedings | Coefficient Transmission Control
[NLS89] | Visually Weighted Sum of AC Coefficients & Rate Buffer Status | Adaptive Quantization
[KMO87] | Coefficient Power Distribution | VLC Assignment
[MF92] | AC Energy & Spectral Entropy | Adaptive Threshold Control

Table 4.2 DCT-based Adaptive Coders

4.9 Chapter Summary and Conclusion

Table 4.2 summarizes the adaptive schemes presented so far in terms of the classifier and the adaptation technique used. References [Gim75][CS77][NLS89] and [MF92] agree that the AC Energy captures an important characteristic of the DCT subblock and use it as a classifier to modify compression parameters. Interestingly, previous investigators use the AC Energy in a very different fashion. [Gim75][CS77] allocate more bits to high-energy blocks and less bits to low-energy blocks in order to minimize mean square error. On the other hand, [NLS89] and [MF92] do exactly the opposite. They are not interested in mini- mizing MSE because it is not always a good indicator of compressed image quality. They take into account human visual system properties, one of which is visual masking. Visual masking results in much higher error tolerance in regions of high activity. For this reason, less bits are allocated for high-energy subblocks. The other feature used to characterize im- age subblocks is information about the position of the high-magnitude coefficients within the block. References [Loh84] [KMO87] and [MF92] have proposed different classifiers to capture the energy distribution within a DCT subblock. The most popular adaption technique is adaptive quantization employed by [Gim75] [CS77] [CP84] and [NLS89]. On the other hand, references [KMO87] and [MF92] use adap- tive VLC assignment and adaptive threshold control respectively. One disadvantage of adaptive quantization is that it can produce large quantization errors for subblocks that have substantially different statistical characteristics from the training set used to derive the bit allocation matrices. Advantages include simplicity, low overhead and probably better compression ratio. We observe that all classifiers examined use spectral domain information. Moreover, all adaptation techniques operate on post-DCT computation data. 
For a power-conscious image coding system it would be highly desirable to use classification information in order to save on energy-consuming arithmetic operations during the transformation procedure itself. One of the problems addressed in this thesis is the selection of an appropriate pel-domain classifier and adaptation technique to reduce arithmetic operations during the actual DCT computation. This is the subject of the next chapter.

Chapter 5

DCT Core Processor

This chapter describes the design, implementation and testing of a low power DCT core processor targeted at low power video (MPEG-2 MP@ML) and still image (JPEG [Wal91]) applications. The chip exhibits two innovative techniques for arithmetic operation reduction in the DCT computation context, along with standard voltage scaling techniques such as pipelining and parallelism. The first method exploits the fact that image pixels are locally well correlated and exhibit a certain number of common most significant bits. These bits constitute a common-mode DC offset that only affects the computation of the DC DCT coefficient and is irrelevant to the computation of the higher spectral coefficients. This observation follows directly from the linearity of the transform. The DCT chip has adaptive-bitwidth distributed-arithmetic computation units that reject common most significant bits for all AC coefficient computations, resulting in arithmetic operations with reduced-bitwidth operands and thus reduced switching activity. We call this method MSB rejection (MSBR).

The second method exploits the fact that in image and video compression applications, the DCT is always followed by a quantization step which essentially reduces the precision of the visually insignificant higher frequencies. The DCT chip allows the user to program the desired precision of each spectral coefficient on a row-by-row basis so that no unnecessary computation is performed if the precision will be lost anyway due to quantization. Additional adaptation in computation precision is provided by a row-column peak-to-peak detector that classifies each block row and column into one of four classes of computation precision, maximizing image peak SNR (PSNR) while minimizing the number of arithmetic operations. We call this method row-column classification (RCC). The chip user can define precisely all four classes in addition to the desired precision per coefficient per class.

5.1 DCT Algorithm and Chip Architecture

Our DCT unit implements a row-column Distributed Arithmetic version of the Chen [CSF77] fast DCT algorithm, enhanced with the activity-reduction methods outlined above, namely MSB rejection and row-column classification.


Figure 5.1: DCT Core Processor Block Diagram (two 8-point 1D DCT stages, each with input registers D0-D7, a butterfly stage of adders (D0-D3) and subtracters (D4-D7), and RAC0-RAC7; a transposition structure sits between the stages; the MSB rejection blocks are driven by MSB Equality Detection on the even half and Sign Extension Detection on the odd half, with a row classifier in stage 0 and a column classifier in stage 1)

The first step of the Chen algorithm is essentially a factorization of the DCT-II [RY90] matrix such that the subsequent computation of the even coefficients is fully separated from the computation of the odd coefficients. The 8-point 1-dimensional DCT can be expressed as follows:

\begin{equation}
\begin{bmatrix} X_0 \\ X_2 \\ X_4 \\ X_6 \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} A & A & A & A \\ B & C & -C & -B \\ A & -A & -A & A \\ C & -B & B & -C \end{bmatrix}
\begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix}
\tag{5.1}
\end{equation}

\begin{equation}
\begin{bmatrix} X_1 \\ X_3 \\ X_5 \\ X_7 \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} D & E & F & G \\ E & -G & -D & -F \\ F & -D & G & E \\ G & -F & E & -D \end{bmatrix}
\begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}
\tag{5.2}
\end{equation}

where

A = \cos\frac{\pi}{4},\quad B = \cos\frac{\pi}{8},\quad C = \sin\frac{\pi}{8},\quad D = \cos\frac{\pi}{16},\quad E = \cos\frac{3\pi}{16},\quad F = \sin\frac{3\pi}{16},\quad G = \sin\frac{\pi}{16} \tag{5.3}
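For concreteness, the factorization can be rendered directly in C (the language of the bit-level model used later in this chapter). This floating-point sketch is for illustration only: the chip evaluates the same dot products with fixed-point distributed arithmetic rather than explicit multiplications, and the function name is ours.

```c
#include <math.h>

#define PI 3.14159265358979323846

/* Floating-point sketch of the Chen 8-point DCT factorization (eqs. 5.1-5.3). */
void chen_dct_8(const double x[8], double X[8])
{
    const double A = cos(PI / 4), B = cos(PI / 8), C = sin(PI / 8);
    const double D = cos(PI / 16), E = cos(3 * PI / 16);
    const double F = sin(3 * PI / 16), G = sin(PI / 16);
    double s[4], d[4];

    for (int i = 0; i < 4; i++) {   /* butterfly: sums feed the even half, */
        s[i] = x[i] + x[7 - i];     /* differences feed the odd half       */
        d[i] = x[i] - x[7 - i];
    }
    X[0] = 0.5 * (A * s[0] + A * s[1] + A * s[2] + A * s[3]);  /* eq. 5.1 */
    X[2] = 0.5 * (B * s[0] + C * s[1] - C * s[2] - B * s[3]);
    X[4] = 0.5 * (A * s[0] - A * s[1] - A * s[2] + A * s[3]);
    X[6] = 0.5 * (C * s[0] - B * s[1] + B * s[2] - C * s[3]);
    X[1] = 0.5 * (D * d[0] + E * d[1] + F * d[2] + G * d[3]);  /* eq. 5.2 */
    X[3] = 0.5 * (E * d[0] - G * d[1] - D * d[2] - F * d[3]);
    X[5] = 0.5 * (F * d[0] - D * d[1] + G * d[2] + E * d[3]);
    X[7] = 0.5 * (G * d[0] - F * d[1] + E * d[2] - D * d[3]);
}
```

Note that a constant input produces only a DC term: all even-half rows except the top one sum to zero, and the differences vanish, which is the property MSB rejection exploits below.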

Figure 5.1 shows a block diagram of the system architecture. It consists of two 8-point 1D DCT stages with a transposition memory structure in between. The transposition memory used is identical to the one used in the IDCT chip (Figure 3.6).

We will first concentrate on the basic functionality of this subsystem, ignoring for the moment the row and column classifiers that reduce the computation precision and the control blocks named MSB rejection that implement a sliding computation window rejecting common most significant bits and sign extension bits for the core computations. Input pels are shifted into registers D0 through D7 of the first stage at a rate of one sample per clock. The subsequent column of adders and subtracters performs the first "butterfly" step contained in the last column vectors of equations 5.1 and 5.2. Data is then loaded into 8 shift registers that repackage the data into two 4-bit addresses that serially feed the ROM and ACcumulators (RAC0-RAC7) MSB first. Each RAC is a distributed arithmetic structure (Figure 2.1) that implements the dot product of the 4-element even (odd) processed pel vector and a row of the 4 × 4 coefficient matrix in eq. 5.1 (5.2). The ROM within each RAC contains linear combinations of the coefficient matrix row vector elements as eq. 2.13 suggests. Once every 8 cycles (given that pel sums and differences are rounded to 8 bits) the 8 dot products have assumed their final values within each RAC. The ROM within each RAC is a 16-entry by 10-bit wide structure, whereas the adder is 20 bits wide to accommodate continuously increasing operands due to the left logical shift in the accumulator feedback path. We should mention that there is no standard that defines minimum arithmetic precision requirements for the DCT (as opposed to the IDCT) and the implementation details are entirely left to the designer. Our choice of internal arithmetic bitwidths yields transformed images of very high quality, with PSNRs typically in excess of 45 dB.

A detailed diagram of the DCT RACs of Figure 5.1 is shown in Figure 5.2. The ROM contents are labelled as follows: if the row vector [A0 A1 A2 A3] is replaced with a row of either constant matrix in equations 5.1 and 5.2, the resulting RAC computes the vector element in the same row position on the left-hand side of equations 5.1 and 5.2. We defer the discussion of the "Qualifying Pulse" until section 5.1.1. Subsequent processing involves loading the results of the 8 distributed arithmetic computations into a transposition structure (1st stage) or into an 8-wide parallel-to-serial register that outputs the coefficient stream at a rate of one sample per clock (2nd stage). Incoming pels are 8-bit wide unsigned integers. The result of the butterfly step is 9-bit 2's complement and is rounded to 8 bits. The results of the 1st stage 8 distributed arithmetic computations are 20-bit 2's complement and are rounded to 11-bit 2's complement before being passed to the second stage. The result of the second stage butterfly step is 12-bit 2's complement rounded to 11-bit 2's complement. Second stage RACs 1-7 use only 8 significant bits of those 11 (using a sliding computation window as explained in section 5.1.1). RAC0 though uses the full 11 bits due to its high visual significance. The results of all RACs are 20-bit 2's complement rounded to 12-bit 2's complement before being shifted out of the chip.
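The MSB-first accumulation that each RAC performs can be modelled in software. The sketch below (function and parameter names are ours, not the chip's) precomputes the 16-entry ROM from four constant-vector elements, then performs one lookup-and-accumulate per bit-serial cycle, with the sign-bit cycle subtracting, as the +/- control of the accumulator in Figure 5.2 does:

```c
#define NBITS 8   /* bitwidth of the serial operands (pel sums/differences) */

/* Software model of one RAC: MSB-first distributed-arithmetic dot product
   of the 4-element constant vector c[] with four NBITS-wide two's-complement
   operands x[]. */
long rac_dot(const long c[4], const int x[4])
{
    long rom[16];
    for (int a = 0; a < 16; a++) {          /* precompute the 16-entry ROM:  */
        rom[a] = 0;                         /* rom[a] = sum of c[i] for each */
        for (int i = 0; i < 4; i++)         /* set bit i of address a        */
            if (a & (1 << i))
                rom[a] += c[i];
    }
    long acc = 0;
    for (int t = NBITS - 1; t >= 0; t--) {  /* one ROM lookup + add per cycle */
        int addr = 0;
        for (int i = 0; i < 4; i++)
            addr |= (((unsigned)x[i] >> t) & 1) << i;
        /* the sign-bit cycle subtracts (negative weight of the MSB) */
        acc = 2 * acc + (t == NBITS - 1 ? -rom[addr] : rom[addr]);
    }
    return acc;                             /* equals the exact dot product */
}
```

The ×2 accumulator feedback of Figure 5.2 corresponds to the `2 * acc` term; after NBITS cycles the accumulator holds the exact integer dot product.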

5.1.1 MSB Rejection (MSBR)

We now focus on the mechanics of MSB rejection. Let us focus on the top half of the block diagram (both stages) in Figure 5.1. Due to spatial correlation in natural images, the pel sums x0+x7, x1+x6, x2+x5, x3+x4 (equation 5.1) are likely to have a number of equal most significant bits. This statement will be quantified in section 5.2 where we evaluate the performance of this scheme.

Figure 5.2: DCT ROM and Accumulator (RAC) (a 16-entry ROM stores the partial sums 0, A0, A1, A1+A0, ..., A3+A2+A1+A0; the address shift registers feed a 4x16 address decoder; the accumulator has a qualified clock driven by the Qualifying Pulse, a sign timing signal Ts selecting add/subtract, and a x2 feedback path)

By definition (eq. 2.13) all Distributed Arithmetic ROMs store zero at address zero. Moreover, the ROMs within RAC1, RAC2, and RAC3 also store a zero at location 15 (0xF). Let us recall that location 0xF in a Distributed Arithmetic ROM stores the sum of all elements of the constant vector (eq. 2.13 with b_kn = 1). We observe from eq. 5.1 that all the rows of the 4 × 4 coefficient matrix except for the top row have elements that sum up to zero. Therefore, locations 0xF within those corresponding 3 ROMs store a zero. To summarize, we have just shown that locations 0x0 and 0xF within RAC1, RAC2 and RAC3 contain the value zero. This observation implies that equal most significant bits among the pel sums have no effect on the total outcome of RAC1-RAC3 and can therefore be discarded from the computation. The boxes labelled MSB Equality Detection detect the first differing bit among the pel sums, starting from the most significant bit position. This information is used to produce the Qualifying Pulse of Figure 5.2 and to enable RACs 1-3 only when the address at the corresponding ROM inputs refers to a non-zero-valued ROM location. An example of MSB rejection is shown in Figure 5.3(a).
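In software terms, MSB Equality Detection amounts to counting the leading bit positions at which all four pel sums agree (so that every ROM address formed from those positions is 0x0 or 0xF); the function name below is hypothetical:

```c
/* Count how many leading bits are identical across the four 8-bit pel sums.
   Those leading RAC cycles address ROM locations 0x0 or 0xF, which hold zero
   in RAC1-RAC3, so they can be skipped for those units. */
int common_msbs(const unsigned char s[4])
{
    int n = 0;
    for (int t = 7; t >= 0; t--) {
        int b0 = (s[0] >> t) & 1;
        if (((s[1] >> t) & 1) != b0 ||
            ((s[2] >> t) & 1) != b0 ||
            ((s[3] >> t) & 1) != b0)
            break;              /* first position where the sums disagree */
        n++;                    /* all four bits equal: cycle is skippable */
    }
    return n;                   /* RAC1-RAC3 then need only 8 - n cycles */
}
```

On the Figure 5.3(a) example (11010111, 11011010, 11011001, 11010001) the four leading bits agree, so four cycles are rejected.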

RACs 1-3 therefore need fewer than 8 cycles on average to compute their corresponding dot products. This reduced duty cycle is exploited with the mechanism outlined in Figure 5.2, which disables the RACs in question in the presence of irrelevant inputs. We note that RAC0 has nonzero content at location 0xF (4A according to eq. 5.1) and therefore does not participate in this scheme. It is always clocked.

Figure 5.3: MSB Equality Detection and Sign Extension Detection Example. (a) RACs 1-3 see only the bits after the first disagreement among the pel sums 11010111, 11011010, 11011001, 11010001; the equal leading (MSB-side) bits are rejected. (b) RACs 4-7 see only the bits after the common sign extension of the pel differences 11110111, 11110011, 11110110, 11110101; the leading sign-extension bits are rejected.

A similar method (but not 100% equivalent) has been applied to the pel differences x0-x7, x1-x6, x2-x5, x3-x4 (equation 5.2) at the bottom half of Figure 5.1 (both stages). Equation 5.2 implies that RACs 4-7 do not store zeroes at locations 0xF because the rows of the corresponding coefficient matrix do not sum to zero. Yet another important observation implies that MSB rejection can be applied. Differences of highly correlated samples are very likely to result in small positive or negative values. Obviously, sign extension bits (either 0 for positive or 1 for negative values) do not affect the Distributed Arithmetic final results. The boxes labelled Sign Extension Detection detect the first differing non-sign-extension bit among the four pel differences. This information is used to reduce the duty cycle of RACs 4-7 in a fashion equivalent to the method applied to the pel sums. An example of sign extension detection is shown in Figure 5.3(b). MSB rejection can be applied to all RACs on a row and column basis except for RAC0 (in both stages), which computes the DC coefficient among parts of other coefficients as well. Simulation experiments indicate that this method can reduce the duty cycles of the RAC units by a factor of 2 with no loss in arithmetic precision. This statement will be quantified in section 5.2.
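Sign Extension Detection can be sketched analogously: count, for each difference, the leading bits that merely repeat its sign bit, and take the minimum over the four differences. Since one sign bit must be retained to start the accumulation, one fewer than that count of leading cycles can be skipped. The function name is hypothetical:

```c
/* For four 8-bit two's-complement pel differences, return the minimum number
   of leading bits (including the sign bit itself) that merely replicate each
   value's sign.  All but one of those leading cycles carry no information
   and can be skipped for RACs 4-7. */
int sign_extension_bits(const unsigned char d[4])
{
    int min_ext = 8;
    for (int i = 0; i < 4; i++) {
        int sign = (d[i] >> 7) & 1;        /* sign bit of this difference */
        int ext = 1;                       /* the sign bit itself         */
        for (int t = 6; t >= 0; t--) {
            if (((d[i] >> t) & 1) != sign)
                break;                     /* first informative bit */
            ext++;
        }
        if (ext < min_ext)
            min_ext = ext;                 /* limited by the "widest" value */
    }
    return min_ext;
}
```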

5.1.2 Row-Column Classification (RCC)

As opposed to MSBR, Row-Column Classification (RCC) reduces the overall signal activity by introducing a small error in the arithmetic computation. RCC sets an upper bound on the number of clocks that each RAC in Figure 5.1 will use to compute its corresponding dot product. As described in section 2.2.1, the DA operation (MSB first) can be thought of as a successive approximation operation where each cycle an extra bit from each element of the data vector is used to refine the final result. The RAC cycle upper bounds are implemented by user-programmable state registers within the chip, coupled to the preexistent (for MSBR) accumulator clock gating mechanism. The state registers store a clock mask which is ANDed with the clock mask generated by the MSB rejection mechanism. We should note that the cycle upper bounds refer to the maximum number of clock cycles a RAC is clocked after MSB rejection has eliminated common most significant bits from the serial address registers. The user has serial access to the state registers through an IEEE Standard 1149.1-1990 Test Access Port (TAP) [IEE90b]. When the maximum number of cycles for a particular RAC has been reached and there are still data bits awaiting processing, the RAC simply powers down and uses its current dot product as an approximation of the final result. Note that due to our MSB-first implementation of the Distributed Arithmetic unit (Figure 5.2), stopping a RAC before it reaches its final value introduces scaling by a factor 2^n, where n is the number of remaining cycles required to complete the computation at full precision. An additional shift-only feedback path has been provided in each RAC (Figure 5.2) which engages neither the ROM nor the adder, in order to undo this scaling.

Figure 5.4: Traditional vs. Proposed Adaptive Transform Framework (traditional: the DCT output feeds an AC energy classifier which drives an adaptive quantizer; proposed: a pel-domain classifier drives the DCT computation itself, followed by a fixed quantizer)
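The interaction between a cycle bound and the MSB-first recurrence can be modelled by truncating the accumulation early and then doubling the accumulator once per skipped cycle, mirroring the shift-only feedback path. In this sketch (names hypothetical) rom[] holds the 16 precomputed partial sums of the constant vector, as eq. 2.13 prescribes:

```c
/* DA dot product truncated after `bound` cycles (RCC).  Stopping early
   leaves the accumulator short by 2^(skipped cycles); the trailing doubling
   loop models the shift-only feedback path that restores the scale without
   engaging the ROM or the adder. */
long rac_dot_bounded(const long rom[16], const int x[4], int nbits, int bound)
{
    int cycles = bound < nbits ? bound : nbits;
    long acc = 0;
    for (int t = nbits - 1; t >= nbits - cycles; t--) {
        int addr = 0;
        for (int i = 0; i < 4; i++)
            addr |= (((unsigned)x[i] >> t) & 1) << i;
        acc = 2 * acc + (t == nbits - 1 ? -rom[addr] : rom[addr]);
    }
    for (int k = 0; k < nbits - cycles; k++)
        acc *= 2;               /* one shift per skipped cycle */
    return acc;                 /* approximates the exact dot product */
}
```

With `bound == nbits` the result is exact; smaller bounds trade accuracy in the least significant bits for fewer ROM lookups and additions.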

We wish to provide additional adaptivity in the variable precision mechanism for further power and/or quality gains. In order to minimize the truncation noise introduced by the cycle upper bounds described above, we employ a row classifier in the first 1D stage and a column classifier in the second 1D stage. The idea is to have increased cycle upper bounds for rows (columns) that exhibit high "activity" (low correlation) and decreased bounds for rows (columns) exhibiting low activity (high correlation). We have verified that such a scheme reduces the mean square error introduced, for constant average RAC duty cycle. Quantitative simulation results will be presented in section 5.2.

Block classifiers proposed in the literature ([Gim75] [CS77] [MF92] etc.) cannot be applied in the present case because they all assume the availability of AC DCT coefficients. Figure 5.4 illustrates the difference between previously proposed classification schemes and the current requirements. The difference stems from the fact that we wish to use classification as a means to reduce computation while keeping quantization parameters constant, as opposed to using classification for adapting the quantization matrix. We have considered using a block classifier based on the spectral coefficients of the previous block. This option is not attractive because the previous block may have rather different content than the present one. An additional problem is that such a scheme would cause bubbles in the pipelined structure of the system, since the classifier would not be available in time for the next pipelined block.

Figure 5.5: Correlation Between Proposed and Standard Classifier (scatter plots of the absolute sum of AC DCT coefficients vs. peak-to-peak pixel amplitude for image PEPPERS, stages 0 and 1)

Our requirements for an appropriate classifier geared to DCT computation reduction are:

- The classifier must be a function of the space domain pels and not a function of the transform domain coefficients.

- The classifier must be substantially correlated to prior proven classifiers such as the absolute sum of AC coefficients.

- The classifier must be a simple and easily computed function of the space domain pels. It should not add substantially to the total computational load, otherwise energy savings won't be achieved.

We found that the peak-to-peak pixel amplitude (PPA) of each row has the potential to meet all three requirements. It is a function of the space domain pels (max(x_i) − min(x_i)) and is easy to compute. Furthermore, it exhibits substantial correlation with a widely used classifier:

\sum_{i=0}^{N} |X_i| \tag{5.4}

Figure 5.5 shows a scatter plot for the natural image PEPPERS that indicates substantial correlation between the proposed and standard classifiers for both DCT stages. Table 5.1 lists the sample correlation between the two classifiers for a set of 11 test images. Figure 5.6 shows the cumulative distribution function of the PPA classifier for both DCT stages. The data has been collected from the set of 11 images listed in Table 5.1. We choose to divide all rows into four different classes defined by the 25, 50 and 75% ordinates of the distribution, in agreement with [CS77]. Such classification results in equally populated classes. The PPA thresholds for each class are 6, 15, 37 for stage 0 and 5, 12, 29 for stage 1. The cycle bounds used for an initial design iteration are listed in Table 5.2. RAC0 in both stages has no cycle limit whatsoever and always carries its computation to full precision. Section 5.2.1 describes a systematic procedure for computing acceptable cycle bounds.

Image        Stage 0   Stage 1
CATHEDRAL    0.98763   0.98656
COUPLE       0.98429   0.98490
EUROPE       0.98423   0.98438
FLOWER       0.98123   0.97977
GIRL         0.97998   0.98307
LENA         0.98016   0.97932
MANDRILL     0.97808   0.98540
PEPPERS      0.97921   0.97519
PLANE        0.98647   0.98457
VAN GOGH     0.98624   0.98731
WATER        0.98431   0.98930

Table 5.1 Sample Correlation of PPA vs. AC DCT Coefficient Sum Classifier for 11 Test Images

Figure 5.6: Image Row/Column Classification Threshold Determination (cumulative distribution of the PPA classifier for stages 0 and 1; the 25, 50 and 75% ordinates define classes 0-3)

         RAC1  RAC2  RAC3  RAC4  RAC5  RAC6  RAC7
Class 0    8     6     6     4     4     3     2
Class 1    8     6     6     4     4     0     0
Class 2    6     4     4     0     0     0     0
Class 3    4     0     0     0     0     0     0

Table 5.2 RAC Cycle Upper Limits Per Class

Figure 5.7: Row/Column Classifier Implementation (registers D0 and D1 track the row/column maximum and minimum via two comparators; a subtracter forms PPA = MAX − MIN, which feeds the classifier)

RCC is a simple and elegant way of trading off power dissipation and image quality, with minimal circuit overhead (monitoring and clock gating mechanisms) and minimal image quality degradation in the mean square error sense. The PPA classifier is computed at each stage on a row (stage 0) or column (stage 1) basis using the circuit of Figure 5.7. This subsystem is contained within the row/column classifier blocks of Figure 5.1. The two comparators ensure that register D0 holds the row (column) maximum element and D1 holds the row (column) minimum element. The subtracter computes the maximum absolute difference (the PPA classifier). The classifier is compared downstream with three user-supplied thresholds and the row (column) class identifier (0-3) is computed. The class identifier is used to select the appropriate set of RAC cycle upper limits for the distributed arithmetic computations.
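A software model of this datapath might look as follows. Names are hypothetical, and the class numbering is our assumption, chosen to follow Table 5.2: class 0 receives the most cycles (high activity) and class 3 the fewest (low activity):

```c
/* Compute the peak-to-peak amplitude (PPA) of an 8-pel row or column and
   map it to a class 0-3 with three user-supplied thresholds (e.g. 6, 15, 37
   for stage 0, as derived from the CDF of Figure 5.6). */
int ppa_class(const int pel[8], const int thresh[3])
{
    int min = pel[0], max = pel[0];
    for (int i = 1; i < 8; i++) {           /* running max/min, as D0/D1 do */
        if (pel[i] > max) max = pel[i];
        if (pel[i] < min) min = pel[i];
    }
    int ppa = max - min;                    /* the PPA classifier */
    if (ppa <= thresh[0]) return 3;         /* lowest activity: fewest cycles */
    if (ppa <= thresh[1]) return 2;
    if (ppa <= thresh[2]) return 1;
    return 0;                               /* highest activity: most cycles */
}
```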

5.2 Algorithm and Architectural Evaluation

A C model that simulates the exact bit-level behavior of the DCT processor was written before the design was committed to schematics. This high-level model has been used as a testbed to quantify the effect of the original methods in this work. This section summarizes architectural-level simulation results. The 11 test images of Figure 5.8 have been used as input to the simulator. The first priority was to verify that there is indeed spatial correlation in typical images at the block row/column level and that the MSB rejection method is capable of reducing the average number of operations (RAC accumulations) by a significant amount. Figure 5.9 displays histograms of RAC dot product computations vs. arithmetic bitwidth, separately for each DCT stage. An arithmetic bitwidth of n indicates that a RAC unit has been clocked n cycles and thus has performed n ROM lookups and accumulations. This is equivalent to saying that the RAC in question has performed a dot product where the bitwidth of each element of the first (variable) vector was n bits. In the absence of MSB rejection we would not have observed variable-bitwidth dot

Figure 5.8: Images Used for Architectural Experiments (CATHEDRAL, COUPLE, EUROPE, FLOWER, GIRL, LENA, MANDRILL, PEPPERS, PLANE, VAN GOGH, WATER)

Figure 5.9: Histogram of Dot Product Operations vs. Bitwidth of Variable Vector in the Presence of MSB Rejection (number of operations vs. arithmetic bitwidth 0-11 for stages 0 and 1)

product operations, as the histograms indicate. The average dot product bitwidth would be 8 for stage 0 and 8.375 for stage 1. At this point, let us recall that RAC0 of stage 1 is always clocked 11 cycles (as opposed to 8) to achieve higher image quality. For this reason, the average for stage 1 is higher than 8 cycles. On the other hand, MSB rejection reduces the averages to 4.668 and 4.536 for stages 0 and 1 respectively. We conclude that MSB rejection can yield 40% savings in the number of operations required (RAC accumulations) and has the potential of being an interesting technique for power savings. The expected overhead is low: the duty cycle of the MSB detection mechanism is one out of 8 cycles, every time a new row (column) is loaded into stage 0 (stage 1) of the chip.

We now compare the total number of operations (additions) per block required by our DCT algorithm with that of the conventional Chen algorithm (distributed arithmetic implementation), which performs a constant number of operations per block. Figure 5.10 displays the number of additions per block required for the first 1000 blocks of image PEPPERS (Figure 5.8). The results plotted do not include computation savings due to RCC, which has been disabled for this simulation. The graph also displays the number of additions required for the distributed arithmetic version of the Chen algorithm [CSF77] (Table 2.1). We observe that our MSBR-enhanced DCT algorithm requires about 40% fewer additions than the standard Chen algorithm implemented in a number of VLSI chips [SCG89][UIT+92][MHS+94]. We note that both computation algorithms displayed in Figure 5.10 include an equal number of ROM accesses in addition to the required arithmetic operations. MSBR also reduces ROM accesses by the same percentage (40%).

The quantitative results of Figures 5.9 and 5.10 indicate that the MSBR-enhanced Chen DCT has substantial potential for low power and performs a smaller number of arithmetic operations when operating on image data than the most efficient DCT algorithms that

Figure 5.10: Comparison of DCT Additions (no RCC) (additions per block vs. block sequence number for image PEPPERS; the conventional distributed-arithmetic Chen algorithm is constant, while the proposed instantaneous and average counts are about 40% lower)

have been reported in the literature. The idea behind the algorithmic efficiency is identical to the one developed in chapter 3: we opt for algorithms that perform a variable number of operations per block depending on data statistical properties and result in an average lower than the most efficient fast algorithm that is not optimized for the particular input data distribution. Having established the potential of MSBR, we proceed to investigate the row/column classification method.

5.2.1 Row/Column Classification Investigation

It is important to realize that MSB rejection does not reduce the arithmetic precision of the computation, but row/column classification does. RCC essentially imposes a truncation on the input data in both 1D DCT stages. We can model the loss in precision as two additive truncation noise sources, one per stage. These sources appear as t0[n] and t1[n] in Figure 5.11(a). In a typical DCT-based compression system, the DCT computation is followed by quantization. The quantization process applies large thresholds to the visually less significant higher spatial frequencies and produces a lot of zeroes at its output for purposes of compression. The process of quantization and inverse quantization at the image receiver can be modelled as an additive noise source q[n] (Figure 5.11(b)) that results in image degradation (lossy compression). The main motivation for the RCC method is the fact that in almost all DCT applications, images incur quantization noise q[n]. If we could fold this process inside the computation and take advantage of the mandatory loss in precision, we could save computation and power while not affecting image quality. The quantization in a system such as the one in Figure 5.11(a) can be simply implemented as a right shift of the data, to get rid of the zeroes that have replaced the least significant bits of the computed coefficients due to premature computation inhibition.

Figure 5.11: Additive Truncation/Quantization Noise Model ((a) truncation noise sources t0[n] and t1[n] added after each 1D DCT stage; (b) quantization noise q[n] added after the complete 2D DCT)

The question we wish to answer at this point is what are the optimal cycle bounds for every row/column class so that the number of cycles that each RAC unit is clocked is

minimized and the image PSNR (eq. 2.38) is maximized. The DCT chip has 56 individual degrees of freedom (2 stages × 4 classes × 7 cycle bounds per class). An additional question is what savings we obtain (in terms of reduced cycles and increased PSNR) by having four separate classes, instead of not classifying the rows and columns at all and simply imposing the same computation inhibition limits on all block rows and columns. We should note that we use the terms "computation inhibition limits", "cycle bounds" and "cycle limits" interchangeably. The starting point for our investigation was the computation inhibition limits of Table 5.2, which were determined by trial and error and visual inspection of the resulting compressed images. Table 5.3 lists the average RAC cycles and PSNR that have been obtained from our simulator implementing MSBR and RCC with the cycle limits of Table 5.2 (same limits for both stages). As a reminder, RAC0 of both stages is clocked to completion and its computation is never inhibited. The compressed images of Table 5.3 are quite acceptable visually. RCC has produced images of good quality (similar to ones produced by a full precision DCT followed by quantization) at an additional 35-40% savings in arithmetic operations. For comparison purposes, in Table 5.4 we list the PSNRs obtained for our 11 test images when compressed with a conventional high precision DCT algorithm and quantized with one-half and one times the step size of the matrices of Tables 5.5 and 5.6 respectively. The quantization matrices in Tables 5.5 and 5.6 are listed as example luminance and chrominance matrices in the JPEG compression standard [PM92].

Image       Cycles  PSNR (dB)  Cycles (No RCC)  PSNR (dB) (No RCC)
CATHEDRAL    2.70   34.837      4.15             45.133
COUPLE       2.67   37.701      4.14             45.463
EUROPE       2.98   35.294      4.33             45.152
FLOWER       3.65   33.315      5.15             44.719
GIRL         2.86   36.326      4.38             44.976
LENA         2.93   35.853      4.49             44.807
MANDRILL     3.90   32.604      5.47             44.441
PEPPERS      3.07   34.476      4.63             44.857
PLANE        2.67   34.929      4.23             45.063
VAN GOGH     3.92   30.641      5.56             44.594
WATER        2.36   39.653      3.54             45.489

Table 5.3 Average RAC Cycles and Image PSNR for the Cycle Limits of Table 5.2

Image       PSNR (dB) (0.5 step)  PSNR (dB) (1 step)
CATHEDRAL   35.811                33.508
COUPLE      37.516                35.722
EUROPE      39.734                31.776
FLOWER      32.390                30.242
GIRL        36.269                34.762
LENA        36.809                35.290
MANDRILL    32.082                29.183
PEPPERS     35.263                33.927
PLANE       37.410                35.578
VAN GOGH    30.613                26.111
WATER       42.169                40.378

Table 5.4 PSNRs for Conventional DCT plus Quantization

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Table 5.5 Luminance Quantization Step Matrix

17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99

Table 5.6 Chrominance Quantization Step Matrix

Although the cycle limits (Table 5.2) that have been picked with trial and error perform very well, we should be able to do better with a systematic optimization procedure. An exhaustive (per image) search is computationally intractable, since it requires 9^56 iterations (56 variables, each taking values from 0 to 8), which we estimate would take 4.34 × 10^46 years on a Sparc Ultra 60.

A fast algorithm has been devised to compute cycle limits that minimize PSNR degradation. Our algorithm is illustrated in Figure 5.12. It begins by assigning all 56 cycle limits the maximum value (8). Each iteration reduces each of the 56 cycle limits by one, one at a time, and recomputes the image PSNR. When all limit decrements have been tried and evaluated, the algorithm picks the decrement that reduced image PSNR by the least amount and commits it. The next iteration follows the same procedure, but from a different starting point. The algorithm finishes after 448 iterations, when all 56 variable cycle limits are zero, and returns the resulting average RAC cycles and PSNR per iteration. The chip user can use this table to figure out how the chip should be programmed given a minimum tolerable PSNR or a maximum number of average RAC cycles. We observe that each one of the iterations is locally optimal, in the sense that the algorithm takes the best possible step at each point. We can make no claim however that this algorithm maximizes PSNR and minimizes cycles globally; it is not always the case that a collection of consecutive optimal steps reaches a globally optimum point. Nevertheless, the present optimization algorithm has on average produced better results (higher PSNR given the same average computation cycles per RAC) than what we have achieved manually, and provides good insight into the capabilities of the DCT chip.

Figure 5.13 shows the path that the algorithm traces while reducing the average number of RAC cycles for image PEPPERS.
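The greedy search can be sketched as follows. The per-candidate image PSNR evaluation is abstracted into a callback, since the real procedure re-runs the bit-level DCT model for every trial decrement; toy_psnr is a stand-in cost for illustration only, and all names are ours:

```c
#define NLIMITS 56   /* 2 stages x 4 classes x 7 RAC cycle limits */

/* Greedy search of Figure 5.12: each of the 448 iterations tentatively
   decrements every nonzero limit, evaluates the resulting PSNR, and commits
   the single decrement that hurts PSNR least. */
void greedy_cycle_limits(double (*psnr)(const int limits[NLIMITS]),
                         int limits[NLIMITS])
{
    for (int i = 0; i < NLIMITS; i++)
        limits[i] = 8;                       /* start from full precision */
    for (int iter = 0; iter < 8 * NLIMITS; iter++) {
        int best = -1;
        double best_psnr = 0;
        for (int i = 0; i < NLIMITS; i++) {
            if (limits[i] == 0) continue;    /* already exhausted */
            limits[i]--;                     /* tentative decrement */
            double p = psnr(limits);
            limits[i]++;                     /* undo */
            if (best < 0 || p > best_psnr) { best_psnr = p; best = i; }
        }
        limits[best]--;                      /* commit least-harmful step */
    }
}

/* Toy stand-in cost: each unit of truncation on limit i "costs" i+1 dB.
   For illustration only; the real metric is eq. 2.38 on a test image. */
static double toy_psnr(const int limits[NLIMITS])
{
    double p = 0;
    for (int i = 0; i < NLIMITS; i++)
        p -= (8 - limits[i]) * (double)(i + 1);
    return p;
}
```

Recording the (average cycles, PSNR) pair after each committed step yields the per-iteration table mentioned above.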
The fact that the PSNR is not a monotonic function of the number of iterations is attributed to finite precision in both the DCT and the corresponding IDCT procedure necessary for the PSNR calculation between the original and processed images. The two bottom contour plots indicate the RAC cycle limits (for RACs 1-7) for all algorithmic iterations for both transform stages. Figure 5.14 plots the achievable points that the optimization algorithm has yielded for all 11 test images. The optimization algorithm has produced better results than our intelligent guess for 8 out of 11 images. For images COUPLE, GIRL and LENA our intelligent

[Flowchart: set C0-C55 to the maximum value (8); on each iteration, tentatively decrement each limit Ci in turn and record PSNRi, undoing each decrement before trying the next; commit the decrement i with the maximum PSNRi; increment the iteration count and terminate after 448 iterations.]

Figure 5.12: Optimization Algorithm for Computation Inhibition Cycle Limits
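The greedy procedure of Figure 5.12 can be sketched in software as follows. This is an illustrative model, not the code used in the thesis; `compute_psnr` is a hypothetical callback standing in for the full limited-cycle DCT, IDCT and PSNR computation of the real experiment.

```python
def optimize_cycle_limits(compute_psnr, n_limits=56, max_limit=8):
    """Greedy cycle-limit search (sketch of the flow in Figure 5.12).

    compute_psnr is a hypothetical callback mapping a list of cycle
    limits to the resulting image PSNR (in the real experiment it runs
    the limited-cycle DCT, the IDCT, and compares with the original).
    Returns one (limits, psnr) entry per committed decrement.
    """
    limits = [max_limit] * n_limits            # start from the maximum (8)
    trace = []
    for _ in range(n_limits * max_limit):      # 56 * 8 = 448 iterations
        best_i, best_psnr = None, None
        for i in range(n_limits):
            if limits[i] == 0:
                continue                       # cannot go below zero
            limits[i] -= 1                     # tentative decrement
            p = compute_psnr(limits)
            limits[i] += 1                     # undo it
            if best_psnr is None or p > best_psnr:
                best_i, best_psnr = i, p
        limits[best_i] -= 1                    # commit the least-damaging step
        trace.append((list(limits), best_psnr))
    return trace
```

Each outer iteration evaluates at most 56 candidate decrements, so the whole search costs at most 448 x 56 PSNR evaluations, versus 9^56 for the exhaustive search.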

[Plot: average cycles per RAC and PSNR (dB) vs. algorithm iteration for test image PEPPERS; the two bottom contour plots show the cycle limits of RACs 1-7 over all iterations for transform stages 0 and 1.]

Figure 5.13: Optimization Algorithm Trace

guess has produced slightly better PSNRs than the optimization procedure, confirming our claim that this algorithm is not globally optimal. Figure 5.15 compares the PSNRs achieved by both procedures for the exact same number of cycles, and specifically the cycle limits listed in Table 5.3. The final question we wish to answer is the benefit of having four separate classes of computation inhibition cycle limits. Does it really yield higher PSNRs, as was our initial guess? We have answered this question by running the optimization algorithm described above with the restriction that there is no classification into four classes and there

are only 14 free variables (2 stages x 7 limits per stage), and compared the results with the traces of Figure 5.14. The results for 6 test images are plotted in Figure 5.16 along with the corresponding 4-class traces. We observe that classification can increase the PSNR by as much as 100% and is therefore clearly justified. The reader should note that for certain images (e.g. PEPPERS and LENA) the two curves cross. This is another indication that our optimization algorithm for the 4 classes does not always yield globally optimal results. We note that more than four classes would introduce substantial complexity in the chip and would require substantial additional area. On the other hand, fewer than four classes would considerably reduce the range of experiments we wished to perform.

5.3 Circuit Design

This section presents aspects of the DCT chip circuit design.

5.3.1 Flip-Flops

The core flip-flop design used in this chip is due to Thomas Simon [Sim99]. The author has made some small modifications to the original to add extra functionality where needed. Figure 5.17 shows the basic flop. Transistors P0, N0 form the master transparent latch. The slave latch is formed by transistors P1, P2, N1, N2. The output transistors P3, P4, P5, N3, N4, N5 invert the output and staticize the slave latch. This flop is static (slave only) and has no ratioed circuits. The staticizer feedback is broken when the slave latch is transparent. Since the slave latch is the only one staticized, it is only safe to gate the clock low. The clock inverter (P6, N6) could in principle be amortized over multiple parallel flops, although extreme care must be exercised by the designer: the master latch is non-inverting, and therefore this flop is susceptible to skew between the clock and its inverse. When both the clock and its inverse are high for a brief period of time, a logic one at the data input d may propagate all the way to the output through transistors N0, N1 and N2. The same scenario can occur when both the clock and its inverse are low and d is low. A local inverter (N6, P6) inside the cell minimizes such skew and guarantees correctness. The present flop does not have any significant advantage for low power. On the contrary, it presents a considerable gate capacitance load to the clock (5 transistors). We have

[Plots: PSNR (dB) vs. cycles per RAC unit for test images CATHEDRAL, COUPLE, EUROPE, FLOWER, GIRL, LENA, MANDRILL, PEPPERS, PLANE, VAN GOGH and WATER.]

Figure 5.14: Optimization Algorithm Traces for all 11 Test Images

[Bar chart: PSNR (dB) achieved by the intelligent guess vs. the optimization program for the 11 test images.]

Figure 5.15: Comparison of Image PSNRs

[Plots: PSNR (dB) vs. cycles per RAC unit with 4 classes and with no classification for test images PEPPERS, COUPLE, MANDRILL, LENA, GIRL and PLANE.]

Figure 5.16: Optimization Algorithm Traces With and Without Classification


Figure 5.17: Edge-Triggered Static D Flip-Flop [Sim99]

used it because its functionality is guaranteed over process variations, supply voltages and small deviations in transistor sizing. There are no transistor fights, and transistor sizing is not of critical importance.

Figures 5.18 and 5.19 show flops with clear and set control signals respectively. The clear and set signals can only be active while the clock is low, however. If the clock is high while the clear or set strobe is active, both flops experience contention and their state may be corrupted: in Figure 5.18, if the clock is high, clear is low and the data had been high while the clock was low, the input of the output inverter is simultaneously driven high and low by transistor P7 and transistors N1, N2 respectively. The same is true in Figure 5.19; this time the fight is between N7 and the series combination of P1 and P2.

In the DCT chip, where clock gating is abundant, such flops are widely used. On the rare occasion where a flip-flop must be preset and does not have a gated clock, the flops of Figures 5.20 and 5.21 are used. Transistors N8 (Figure 5.20) and P8 (Figure 5.21) ensure that contention never occurs while the preset pulse is active, no matter what the value of the clock is. Although these last flops contain only a single transistor more than the flops of Figures 5.18 and 5.19, this extra device makes a considerable difference in the flop layout and slows the clock-to-q propagation delay because it lies in the critical path.

The final type of flip-flop used in the DCT chip is a fully static one (both master and slave latch staticized) used in the JTAG instruction and data registers. This flop is shown in Figure 5.22. It has been constructed by cascading two static slave latches from the original flop of Figure 5.17. Both latches have been staticized to guarantee operation even at the slowest possible serial interface clock (TCK).

Figure 5.18: Edge-Triggered Static D Flip-Flop with Clear (Clock Low Only)

Figure 5.19: Edge-Triggered Static D Flip-Flop with Set (Clock Low Only)

Figure 5.20: Edge-Triggered Static D Flip-Flop with Clear

Figure 5.21: Edge-Triggered Static D Flip-Flop with Set

Figure 5.22: Fully Static Flop Used in JTAG Instruction and Data Registers

5.3.2 Read-Only Memory

A total of 17 separate 16x10 ROMs are used in the chip as part of the RAC units. The ROM circuit is shown in Figure 5.23. The 4-to-16 address decoder is implemented using 4-input NAND gates. As can be seen from the figure, the ROM is fully static (non-precharged). Bitlines are fully driven to the supply voltage through a 2.1/0.6 PMOS transistor and to ground through a 1.4/0.6 NMOS. This fully static scheme is rather fast and performs very well at low supplies, but has the additional overhead of requiring two wordlines (true and complement) for each row. Simulation has indicated that the average power dissipated on the wordlines is 7-10% of the total ROM power (depending on ROM contents), while the rest is dissipated by the bitlines. Therefore the additional overhead of dual wordlines is rather low, at less than 5% of the total ROM power.

The main reason that a non-precharged ROM was selected was the fact that it is part of the RAC unit which has fully gated clocks. The generation of a precharge pulse would require considerable design overhead and would increase the overall switching activity of the unit.

Figure 5.24 shows a plot of ROM access time vs. supply voltage. The data has been obtained from Hspice simulations using extracted layout (wiring parasitic capacitances included).

Figure 5.23: ROM Schematic

Figure 5.24: ROM Access Time vs. Supply Voltage

Figure 5.25: 20-bit Carry-Bypass Adder

5.3.3 Adder

The adders used throughout the DCT chip are all instances of the carry-bypass adder shown in Figure 5.25. The figure shows a 20-bit adder, the same one used in the RAC units (Figure 5.2). The bypass adder has been chosen because it has a short critical path at the expense of only a small increase in area. Its operation is very simple. The full adders are divided into cascaded groups of four. Although in a carry-bypass adder minimum delay is obtained when the groups have gradually increasing numbers of full adders, we decided not to implement this idea, mainly for layout considerations. When all the propagate signals within a certain full adder group are high, the carry from the previous group is forwarded instead of the locally generated one, without any change in I/O behavior. The critical path of the adder in Figure 5.25 is exercised when a carry is generated within the (A0, B0, S0) full adder and propagated all the way to the (A19, B19, S19) full adder. This critical path is approximately 8 full adder delays plus 4 multiplexer delays. Each additional 4 bits of arithmetic bitwidth adds a single multiplexer delay to the critical path. The full adder used is shown in Figure 5.26. It has been selected because the propagate (P) signal necessary for the carry-bypass operation is computed explicitly, and because of its very small carry-in to carry-out delay. While cascading the full adders to form the 20-bit adder of Figure 5.25, we exploit the fact that a full adder is fully complementary (input


Figure 5.26: Full Adder Used in the DCT Chip

complements produce output complements) to avoid unnecessary inversions in the carry propagation path. Figure 5.27 plots the maximum propagation delay of the 20-bit adder of Figure 5.25 vs. supply voltage.
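The group-of-four bypass behavior described above can be checked with a bit-level software model. This is a sketch of the adder's logical operation only (the bypass shortens the carry path without changing the result), not a model of the transistor-level circuit.

```python
def carry_bypass_add(a, b, width=20, group=4):
    """Bit-level model of a carry-bypass adder (groups of 4, as in
    Figure 5.25). Returns (sum mod 2^width, carry-out).

    When every propagate signal in a group is high, the group's
    carry-in is forwarded directly to the next group through the
    bypass multiplexer; the sum bits are unchanged, only the carry
    path is shortened."""
    mask = (1 << width) - 1
    a, b = a & mask, b & mask
    carry_in, result = 0, 0
    for g in range(0, width, group):
        bits = range(g, min(g + group, width))
        # group-wide propagate: all XOR (propagate) signals are high
        all_propagate = all(((a >> i) & 1) ^ ((b >> i) & 1) for i in bits)
        c = carry_in
        for i in bits:
            ai, bi = (a >> i) & 1, (b >> i) & 1
            result |= (ai ^ bi ^ c) << i
            c = (ai & bi) | (c & (ai ^ bi))   # ripple carry within the group
        # bypass multiplexer: forward carry_in when the group propagates
        carry_in = carry_in if all_propagate else c
    return result & mask, carry_in
```

Note that when a group propagates, no bit in it can generate a carry, so forwarding the incoming carry is exactly equivalent to the rippled value; this is why the bypass changes only delay, not I/O behavior.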

5.3.4 MSB Rejection

Figure 5.28 shows the circuit that detects common most significant bits and computes a clock mask that is used to disable RAC units 1, 2 and 3 of each stage when the ROM address inputs do not affect the computation. This circuit operates on the 4 sums (A7-0, B7-0, C7-0, D7-0) of the "butterfly" additions of each stage (Figure 5.1). Transistors N1, N2, N3, N4 and N5, N6, N7, N8 detect the presence of all ones or all zeroes at the input respectively. The cascading of the outputs through the NAND gates and inverters activates all the remaining downstream mask bits as soon as the first bit position is found where the 8-bit wide inputs A, B, C, D have different bits. The computed clock mask (CLKMASK) bits indicate when RAC units 1 through 3 should be clocked (1 in the corresponding bit position) and when not (0 in the corresponding bit position). Figure 5.29 shows the sign extension detection circuit that operates on the 4 differences after the "butterfly" subtraction step of both stages (Figure 5.1). This circuit produces a clock mask (MASK) that controls the clock of RAC units 4-7. The daisy-chained XNOR gates of Figure 5.29 detect the first non-sign bit of each of the four inputs. The cascaded AND gates activate all downstream partial mask bits from the position of the first non-sign bit. Then, the results of the four individual computations are merged with 4-input NAND gates to compute the first bit position at which none of the four inputs contains sign extension bits. Both circuits of Figures 5.28 and 5.29 have a low duty cycle because the inputs change once


Figure 5.27: Carry-Bypass Adder Delay vs. Supply Voltage

every eight cycles (the time required to bring a new row onto the chip). As a result, they are responsible for a small percentage of the total power (section 5.4).
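The MSB rejection function of Figure 5.28 can be summarized behaviorally: scanning from the MSB, positions where all four inputs carry the same bit value produce a 0 (clock gated), and from the first differing position downward every mask bit is a 1. The following is a functional sketch of that behavior, not the gate-level circuit.

```python
def msb_reject_mask(a, b, c, d, width=8):
    """Behavioral model of the MSB rejection clock mask (Figure 5.28).

    Scanning from the MSB, a bit position where the four inputs agree
    (all ones or all zeroes) yields a 0 in the mask (RAC clock gated);
    from the first position where any input differs, this and every
    less significant position yield a 1 (RAC clocked)."""
    mask = 0
    differ = False
    for i in reversed(range(width)):                  # MSB first
        bits = {(x >> i) & 1 for x in (a, b, c, d)}
        if not differ and len(bits) > 1:
            differ = True                             # first differing position
        if differ:
            mask |= 1 << i
    return mask
```

For example, four inputs sharing their six most significant bits produce a mask with only the two least significant bits set, so RACs 1-3 are clocked for at most two cycles of the eight.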

5.3.5 I/O Pads

The entire pad design (schematic and layout) was done by Thomas Simon [Sim99]. The author's contribution was the determination of the pad driver transistor sizes and the simulation and verification of the entire pad circuit. Both input and output pads are level converting (the core voltage can be 1-3V while the output voltage is TTL compatible at 5V).

Determining Driver Transistor Sizes for the Output Pad

The circuit used to determine driver transistor sizes is shown in Figure 5.30. The worst case capacitance and inductance values for the PGA package traces have been obtained from the MOSIS package documentation available online at www.mosis.com. The rise time of the OUT signal must be fast enough to accommodate the speed at which we wish to clock the chip, but at the same time slow enough to minimize L di/dt ringing noise on the supply rails. The transistor sizes indicated in Figure 5.30 result in rise and fall times on the order of 2-3 ns, which was deemed appropriate for our application. An Hspice simulation of the entire output pad is presented in the following section.

Figure 5.28: MSB Rejection Circuit

Figure 5.29: Sign Extension Detection Circuit

Figure 5.30: Circuit Used to Determine Driver Transistor Sizes

Output Pad

The entire schematic of the output level converting pad is shown in Figure 5.31. Devices P0 through P4, N0 through N4 and P9, N9 constitute a level converter from the chip core VDD (1.2V-3.0V) to the TTL-level output supply VHH (5V). The remaining transistors constitute an exponential driver that connects to the bonding pad. To understand the operation of the level converter, let us assume that the pad drives a zero, so that nodes IN and NET2 are at zero, nodes NET1 and NET4 are at VHH, and node NET3 is at VDD. Let us now assume that IN is driven to VDD. The series combination of N0 and N1 overpowers the weak transistor P1 and starts driving node NET4 to ground. P2 drives node NET2 to VHH, which turns competing transistor P1 off. At this point, the logic one propagates all the way to the pad output. At the same time, NET1 becomes low and NET4 is driven to VHH, preparing for the next input transition. When IN is driven back to ground, NET2 becomes zero and propagates to the output. The feedback path guarantees that nodes NET1 and NET4 become consistent with the new state, prepared for the next input transition. From the preceding discussion, it is evident that special attention should be paid to the relative sizing of transistors P1 and N0, N1, in addition to the sizing of transistors P4 and N2, N3. The NMOS pairs N0, N1 and N2, N3 must always be large enough to overpower P1 and P2 respectively, given the VHH-VDD difference in their gate-to-source voltages. The author sees no particular advantage in the level converter of Figure 5.31 compared to the level converter of Figure 3.15. Figure 5.32 shows an Hspice simulation of the output pad driving the load of Figure 5.30 at VDD = 1.3V and VHH = 5V. The rise and fall times are 1.73 ns and 1.42 ns respectively. The delay through the circuit is 7.9 ns.

Input Pad

The circuit diagram of the input pad is shown in Figure 5.33. Primary electrostatic discharge (ESD) protection is provided by bipolar diode D and resistor R. Bipolar diode D is an explicit N-diffusion to P-substrate diode. It protects the IN node from going lower than -0.6V or higher than its reverse breakdown voltage limit. Resistor R is approximately 3 kOhms and is formed by three squares of lightly doped N-well. Its purpose is to attenuate the off-chip signal before it reaches the gate of the sensing inverter (N1, P1). Transistor N0 is a large device providing secondary ESD protection as a second set of diodes that do not allow node IN to stray too far from its expected voltage range (0-5V). The diode to ground is formed by the source-to-P-substrate junction of transistor N0. The diode to VHH is the diode-connected MOS transistor itself. Voltage conversion is performed between inverters (N1, P1) and (N2, P2), which have separate supply voltages.

5.4 Power Estimation and Breakdown

We have used the Pythia [XYC97] power estimator to calculate the power dissipation of the DCT chip. A brief overview of the power estimation tool has been given in section 3.4.


Figure 5.31: Output Pad Schematic


Figure 5.32: Output Pad Hspice Simulation


Figure 5.33: Input Pad Schematic

Figure 5.34 shows the power dissipated by the DCT chip partitioned among the top-level design modules. The chip has been stimulated with the first 1000 blocks of test image PEPPERS using the cycle bounds of Table 5.2. The simulation includes estimated interconnect capacitance. The total power estimated for this particular data set is 7.4 mW at 1.56V, 14 MHz.

We observe that according to the power estimation results, the main clock buffer dissipates most of the chip power (almost 37%). Part of the reason this power is so high is that the Pythia interconnect estimation mechanism penalizes nets with high fanout considerably. Pythia assumes a square-law dependence between net interconnect (wiring) capacitance and net fanout. As a result, the clock net, which has the highest fanout, is allocated a large value of total wiring capacitance. Almost 60% of the total clock buffer power is due to wiring parasitics. Figure 5.35 shows the same power breakdown data obtained by Pythia with interconnect estimation turned off. We observe that the total clock buffer power has decreased to about 21% of the total. In addition to simulation artifacts, the clock buffer power is high due to the large inverters used to buffer long wires and the high duty cycle of the circuit. The last inverter of the driver chain has a 1.45 mm PMOS width and a 0.728 mm NMOS width.

The chip's pure computation power (stages 0 and 1) is estimated at about 42% of the total (Figure 5.34). The charts of Figures 5.34 and 5.35 provide an additional very useful and relevant piece of information: they quantify the additional power cost of the power saving techniques employed (MSB rejection and computation inhibition plus classification). We observe that the cost of the MSB rejection control logic is approximately 3-4% of the total (about half of the MSBR+Inhibition control sector), whereas the computation inhibition plus row/column classification is at 5-6% of the total (about half of the MSBR+Inhibition control sector plus the row/column classifiers). We will use this result in section 5.7, where we will quantify the power savings obtained by these methods.

Finally, Figure 5.36 shows the power dissipation allocation within stage 0 of the DCT chip. The ROM and Accumulator units are responsible for most of the power dissipation (78%), whereas the input shift registers, RAC address shift registers and "butterfly" adders and subtractors (Figure 5.1) dissipate the rest.

[Pie chart, with interconnect: clock buffer 2.71 mW (36.57%); stage 1 1.64 mW (22.22%); stage 0 1.42 mW (19.18%); MSBR+inhibition control 0.53 mW (7.16%); RAC clock buffering 0.349 mW (4.7%); global control 0.309 mW (4.17%); TRAM 0.27 mW (3.66%); classifiers 0.172 mW (2.32%).]

Figure 5.34: DCT Chip Power Estimation Results

[Pie chart, no interconnect: stage 1 1.58 mW (27.65%); stage 0 1.34 mW (23.5%); clock buffer 1.18 mW (20.73%); MSBR+inhibition control 0.519 mW (9.1%); RAC clock buffers 0.346 mW (6.07%); global control 0.285 mW (5%); TRAM 0.276 mW (4.85%); classifiers 0.177 mW (3.09%).]

Figure 5.35: DCT Chip Power Estimation Results (No Interconnect)

[Pie chart, stage 0: RACs 0-7 1.1 mW (78%); RAC address registers 0.13 mW (9.28%); input registers 0.12 mW (8.77%); misc 0.037 mW (2.65%); butterfly adders + subtractors 0.019 mW (1.3%).]

Figure 5.36: DCT Chip Stage 0 Power Estimation Results

5.5 Chip I/O Specification and Usage

The DCT chip is very easy to use in a PC board environment. It can be directly interfaced with 5V CMOS and TTL parts because its I/O pads are level converting. Three supply voltages are required: a ground connection (GND), a 5V pad supply connection (VHH) and a 1.2-3V core supply connection (VDD).

5.5.1 JTAG Programming

Before normal operation begins, the user needs to program a total of 6 internal chip JTAG registers. These registers are summarized in Table 5.7. Access to an internal data register is enabled after shifting the address shown in the third column of Table 5.7 into the JTAG instruction register LSB-first. The instruction register (IR) consists of 10 bits, whose significance is summarized in Table 5.8. The on-chip TAP controller enables the user to access the IR, or the data register (Table 5.7) whose address is currently loaded in the IR, through specified sequences on the TCK, TDI, and TMS pins. These sequences are standardized and documented in great detail in IEEE Standard 1149.1-1990 [IEE90b]. Figure 5.37 shows the bit field arrangement of both threshold data registers. Data is shifted serially LSB-first through the TDI pin. All of our experimental results have been obtained using the thresholds of Table 5.9. These values result in approximately equally populated classes, as shown in Figure 5.6, for our 11 test images.

Name         Length  IR Address  Description
THRESHOLDS0  24      0x004       Stage 0 classification thresholds (three 8-bit unsigned integers)
THRESHOLDS1  33      0x020       Stage 1 classification thresholds (three 11-bit 2's complement integers)
CYCLESTOP0   128     0x008       Cycle limits for stage 0 RACs 0-3 (16 8-bit clock masks)
CYCLESTOP1   128     0x040       Cycle limits for stage 1 RACs 0-3 (16 8-bit clock masks)
CYCLESBOT0   128     0x010       Cycle limits for stage 0 RACs 4-7 (16 8-bit clock masks)
CYCLESBOT1   128     0x080       Cycle limits for stage 1 RACs 4-7 (16 8-bit clock masks)

Table 5.7 DCT Chip JTAG Data Registers

Bit  Description
0    Must always be zero
1    Serial compressed output enable (1 active, 0 inactive)
2    Data register THRESHOLDS0
3    Data register CYCLESTOP0
4    Data register CYCLESBOT0
5    Data register THRESHOLDS1
6    Data register CYCLESTOP1
7    Data register CYCLESBOT1
8    Reserved
9    Reserved

Table 5.8 DCT Chip JTAG Instruction Register Fields

Stage 0: T0 = 6, T1 = 15, T2 = 37
Stage 1: T0 = 5, T1 = 12, T2 = 29

Table 5.9 Class Thresholds Used

[Bit fields: THRESHOLDS0 (stage 0) holds T2 in bits 23-16, T1 in bits 15-8 and T0 in bits 7-0; THRESHOLDS1 (stage 1) holds T2 in bits 32-22, T1 in bits 21-11 and T0 in bits 10-0. T0 is the class 0/class 1 threshold, T1 the class 1/class 2 threshold and T2 the class 2/class 3 threshold.]

Figure 5.37: JTAG Threshold Registers
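As a concrete illustration of the register layout and the LSB-first serial protocol, the following sketch packs the three stage-0 thresholds into the 24-bit THRESHOLDS0 word and produces the bit sequence that would be shifted in through TDI. The function name is ours, not part of any chip software.

```python
def pack_thresholds0(t0, t1, t2):
    """Pack the stage-0 class thresholds into the 24-bit THRESHOLDS0
    register (T2 in bits 23-16, T1 in bits 15-8, T0 in bits 7-0, per
    Figure 5.37) and return the LSB-first bit sequence for TDI."""
    for t in (t0, t1, t2):
        assert 0 <= t <= 0xFF, "thresholds are 8-bit unsigned integers"
    word = (t2 << 16) | (t1 << 8) | t0
    return [(word >> i) & 1 for i in range(24)]   # LSB shifted in first
```

With the Table 5.9 stage-0 values (T0 = 6, T1 = 15, T2 = 37) this yields 24 bits beginning with the LSBs of T0.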

[Bit fields: class 0 clock masks CMASK00-CMASK30 occupy bits 127-96, class 1 masks CMASK31-CMASK01 bits 95-64 (bit-reversed), class 2 masks CMASK02-CMASK32 bits 63-32, and class 3 masks CMASK33-CMASK03 bits 31-0. CMASKXY is the clock mask for RAC X, class Y.]

Figure 5.38: CYCLEREGTOP{0,1} Register Bit Fields

Figures 5.38 and 5.39 show the various fields of the CYCLESTOP and CYCLESBOT registers of both stages. Each CMASK field contains an 8-bit register clock mask that indicates the maximum number of cycles for which the corresponding RAC will be clocked. The 9 allowable values that such an 8-bit mask may take are listed in Table 5.10. The behavior of the DCT chip is unspecified and undocumented for all other mask values.

5.5.2 Chip Regular Operation

After the 6 JTAG data registers have been programmed with appropriate class threshold and cycle limit values, normal chip operation can be initiated. The user provides the chip with image pel values on the rising edge of the clock. The beginning of a new block is indicated

Mask Value  Chip Action
00000000    RAC is clocked a maximum of 0 cycles
10000000    RAC is clocked a maximum of 1 cycle
11000000    RAC is clocked a maximum of 2 cycles
11100000    RAC is clocked a maximum of 3 cycles
11110000    RAC is clocked a maximum of 4 cycles
11111000    RAC is clocked a maximum of 5 cycles
11111100    RAC is clocked a maximum of 6 cycles
11111110    RAC is clocked a maximum of 7 cycles
11111111    RAC is clocked a maximum of 8 cycles

Table 5.10 Clock Mask Values
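The nine legal mask values of Table 5.10 form a thermometer code: a limit of n cycles is n ones packed from the MSB. A one-line helper (ours, for illustration) generates them:

```python
def cycle_limit_mask(n):
    """8-bit clock mask allowing a RAC to be clocked at most n cycles
    (Table 5.10): n ones packed from the MSB, e.g. n=3 -> 11100000."""
    assert 0 <= n <= 8, "only limits of 0..8 cycles are defined"
    return (0xFF << (8 - n)) & 0xFF
```

Any other bit pattern (ones not contiguous from the MSB) is outside the table, and the chip's behavior for it is unspecified.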


Figure 5.39: CYCLEREGBOT{0,1} Register Bit Fields (CMASKXY is the clock mask for RAC X, class Y)

with the assertion of the StartBlockIn pulse when block element (0,0) is present on the input bus. After a delay of 97 cycles, the DCT coefficients start to come out of the chip output pins. The start of a block is indicated by the assertion of the StartBlockOut pulse when output block element (0,0) is present on the output bus. Figure 5.40 presents a timing diagram that summarizes the chip operation. We note that the output block comes out in transposed form. Figure 5.41 shows the sequence in which block pels must be presented to the chip and the sequence in which the DCT coefficients are produced by the chip. The negative clock edge is used for sampling the input data and the StartBlockIn strobe on the chip, so that the edges are sufficiently spread apart and there is sufficient setup and hold time if the data edge occurs close to the positive clock edge (in either direction).

5.5.3 Serial Output

The DCT chip contains a serial output (LPEout) that produces compressed blocks at 1 bit per pixel (8:1 compression). The serial output is enabled when IR bit 1 is set to logic one. The timing of the LPEout output is identical to the dout[7:0] timing of Figure 5.40. Compression is performed by selecting 64 bits out of the 64-element compressed block according to the bit allocation matrix of Table 5.11. Images of acceptable quality can be produced by reconstructing and dithering the DCT information provided by the serial output and computing the resulting IDCT using the available information.
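The 64-bit selection can be sketched from the Table 5.11 allocation: 8 bits each for the eight lowest-frequency coefficient positions, zero elsewhere (3x8 + 3x8 + 2x8 = 64 bits). The MSB-first bit ordering below is a hypothetical choice for illustration; the actual serial ordering follows the chip timing of Figure 5.40.

```python
# Bit allocation per coefficient position (Table 5.11): 8 bits for each
# of eight low-frequency coefficients, zero for all others.
ALLOC = [[8, 8, 8, 0, 0, 0, 0, 0],
         [8, 8, 8, 0, 0, 0, 0, 0],
         [8, 8, 0, 0, 0, 0, 0, 0]] + [[0] * 8] * 5

def serial_bits(block):
    """Select the 64 serial-output bits of an 8x8 coefficient block
    (8:1 compression -- 1 bit per pixel). block[r][c] holds the 8-bit
    representation of coefficient (r, c); bits are emitted MSB-first,
    an assumed ordering for illustration only."""
    bits = []
    for r in range(8):
        for c in range(8):
            n = ALLOC[r][c]
            bits.extend((block[r][c] >> (7 - i)) & 1 for i in range(n))
    return bits   # always exactly 64 bits
```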

5.6 Chip Physical Design and Packaging

The chip has been laid out using the Cadence Design Framework. Details of the design flow and methodology are presented in appendix A. The chip has been fabricated in the Analog


Figure 5.40: DCT Chip Timing Diagram

[Diagram panels: Input Block Element Sequence and Output Block Element Sequence, each traversing an 8x8 block (indices 0-7).]

Figure 5.41: Input and Output Block Element Sequence

8 8 8 0 0 0 0 0
8 8 8 0 0 0 0 0
8 8 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

Table 5.11 Bit Allocation per Coefficient Position for Producing the Serial Output
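As an illustration of the selection step (a sketch under the assumption that the allocated bits are each coefficient's most significant bits — the chip's exact bit selection may differ), the serialization can be modeled as:

```python
# Serialize an 8x8 coefficient block to 64 bits per Table 5.11's allocation.
# Assumption: the n allocated bits are the n MSBs of each 11-bit coefficient
# (non-negative integer coefficients in this sketch).
BIT_ALLOC = [
    [8, 8, 8, 0, 0, 0, 0, 0],
    [8, 8, 8, 0, 0, 0, 0, 0],
    [8, 8, 0, 0, 0, 0, 0, 0],
] + [[0] * 8] * 5                      # remaining rows carry no bits

def serialize_block(coeffs, width=11):
    bits = []
    for r in range(8):
        for c in range(8):
            for j in range(BIT_ALLOC[r][c]):          # take the n MSBs
                bits.append((coeffs[r][c] >> (width - 1 - j)) & 1)
    return bits                                       # always 64 bits

assert len(serialize_block([[0] * 8 for _ in range(8)])) == 64
```

The allocation sums to 64 bits (3x8 + 3x8 + 2x8), so each block serializes to exactly one bit per pel.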

Process      0.6µm CMOS (0.6µm drawn), 3ML, 5V
VTN          0.75 V
VTP          -0.82 V
TOX          14.8 nm
Supply       1.1-3 V
Frequency    2-43 MHz
Power        4.38 mW @ 1.56 V, 14 MHz
Area         14.5 mm2
Transistors  120K

Table 5.12 Process and DCT Chip Specifications

Block                                        Area (mm2)   Percentage
Stage 0                                      2            13.8
Stage 1                                      4.075        27.6
TRAM + wiring                                1.62         11.2
MSBR Control                                 1            6.9
Classification                               1.49         10.27
TAP Controller                               0.4          2.75
VDD/GND routing, wiring, bypass capacitors   3.9          27

Table 5.13 DCT Chip Area Breakdown

Devices (ADI) Dual-Poly Triple-Metal (DPTM) 0.6 µm 5V CMOS process. The annotated chip microphotograph is shown in Figure 5.42. The blocks labelled “MSB rejection” and “sign detection” include, in addition to the circuits of Figures 5.28 and 5.29, distributed control circuits as well as clock generators and buffers for the RAC units of both stages. Moreover, each contains a 128-bit JTAG register (visible as a dark rectangular box within each block) that stores cycle thresholds (four per class per block). These registers are all accessible through the IEEE 1149.1-1990 Test Access Port (TAP). The block labelled “JTAG” contains the IEEE 1149.1 Instruction Register and the finite state machine (TAP controller) which controls serial access to all JTAG-accessible registers. The core area of the DCT chip (not including the I/O pad frame) measures 14.5 mm2 and includes approximately 120K transistors. The chip specifications are summarized in Table 5.12. A chip area breakdown is shown in Table 5.13. The area penalty of the MSB rejection control is approximately 7% of the total chip area. This number includes buffering for all gated RAC clocks. The classification area penalty is higher, at 10.27%. This high area percentage is entirely due to the large JTAG registers required to store RAC cycle limits and class thresholds. Had a less convenient storage mechanism (e.g. SRAM) been chosen for this state, the area penalty would have been much smaller. The chip package is a Kyocera ceramic PGA (13x13 grid, 100-pin) available through Analog Devices. The chip pinout appears in Table C.2 (Appendix C). The VHH supply is a TTL-compatible 5V supply for the pads and VDD is the core supply at 1.2-3.0V.

[Annotated die photograph: STAGE 0, STAGE 1, TRAM, MSB REJECT, SIGN DETECT, JTAG, and CONTROL blocks; approximately 5.954 mm x 3.754 mm.]

Figure 5.42: DCT Chip Microphotograph

[Block diagram: National Instruments DIO32F digital I/O PC interface, input and output FSMs, JTAG programming, two dual-port SRAMs, and the DCT chip.]

Figure 5.43: DCT Chip Test Board Block Diagram

The package PCB footprint (top view) is shown in Figure C.2 (Appendix C).

5.7 Testing

5.7.1 Test Setup

A printed circuit board has been designed and built to test the DCT chip. The test structures are almost identical to the board built for the IDCT chip (section 3.7). The board block diagram is shown in Figure 5.43. The board operates as follows: First, the user accesses the internal chip JTAG registers directly through the PC interface to shift in the class thresholds and the class cycle limits. Then, four 64-element blocks are uploaded into the input dual-port memory (Cypress CY7C006) and the input FSM (implemented in a Lattice GAL 22v10) is triggered. The input FSM produces appropriate control signals to read the blocks from the SRAM and stimulate the DCT chip. The four blocks are cycled through the chip continuously so that power measurements can be obtained while the chip is being continuously stimulated. The output FSM, on the other hand, captures the four output blocks in the output SRAM only once: the SRAM is activated only when the first four blocks are being output from the chip and is deactivated for the remainder of the chip operation. The user can upload the results to the PC from the second SRAM and check the chip output for correctness. In addition to the test structures described above, the board contains two more sections: One section contains a DCT chip directly attached to a DIO32F connector so that the user has direct access to all chip terminals through the PC. This section has been used for preliminary slow-speed functionality-only testing. The final section simply contains a DCT chip with all relevant terminals brought out to vertical headers for last-resort testing using


Figure 5.44: DCT Chip Test Board Photograph

pattern generators and logic analyzers. This section was never used. A photograph of the DCT test board is shown in Figure 5.44. The output of the DIO32F connectors has been terminated with 470-Ohm resistors to ground and has been passed through two inverting CMOS Schmitt triggers (74HC14) for additional conditioning.

The DCT test setup shown in Figure 5.45 consists of the DCT test board, a PC running the control software, a Keithley 2400 sourcemeter for providing the supply and measuring the current of the DCT chip core, a Tektronix HFS 9003 high-speed digital waveform generator for providing a variable clock to the board, and a Tektronix power supply for providing a TTL supply for the board. Both the Keithley 2400 and the Tektronix HFS 9003 have an RS-232 interface for remote operation. Both instruments are under software control through the PC so that supply and clock frequency can be changed through C program instructions and current measurements can be read from the instrument under program control.
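The remote-control loop can be sketched as follows (a hedged illustration: the Keithley SCPI strings are from that instrument's standard command set, while the clock-generator command and the send()/query() serial-port callbacks are placeholders, since the thesis's actual C code is not reproduced here):

```python
def measure_power(send, query, supply_volts, clock_mhz):
    send(f":SOUR:VOLT {supply_volts:.2f}")   # program the core supply (Keithley 2400)
    send(f"FREQ {clock_mhz} MHZ")            # placeholder clock-generator command
    amps = float(query(":MEAS:CURR?"))       # read the core current
    return supply_volts * amps               # P = V * I, in watts

def sweep(send, query, voltages, freqs_mhz):
    # Characterize power over a grid of (supply, frequency) operating points
    return {(v, f): measure_power(send, query, v, f)
            for v in voltages for f in freqs_mhz}
```

With send/query bound to the RS-232 ports of the two instruments, a single call such as `sweep(send, query, [1.56], [14])` reproduces one operating-point measurement.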


Figure 5.45: DCT Chip Test Setup Photograph

5.7.2 Test Results

The DCT chip is functional over a wide range of frequencies and power supplies, as can be seen from the Schmoo plot of Figure 5.46. Over 150,000 separate power measurements on a block-by-block basis have been taken using the test images of Figure 5.8 in addition to artificially generated data in order to fully characterize the chip power dissipation. We begin by presenting the power dissipation results for each one of the 11 test images for three sets of cycle limit thresholds. The power results are plotted in Figure 5.47. The measurements have been taken at 1.56V, 14 MHz. The first set of bars are power measurements for maximum cycle limits (8) per RAC unit (no computation inhibition). This is the setting for maximum PSNR and power. The second and third sets of bars are for the cycle limits of Table 5.2 and the cycle limits computed by our optimization program yielding the PSNRs of Figure 5.15, respectively. As expected, more power is dissipated when the cycle limits are at their maximum value since the chip executes a larger number of operations to compute the DCT using the highest possible precision. We also observe that our initial intelligent guess dissipates less power than the cycle limits computed by our optimization program, although Figure 5.15 implies that both sets of cycle limits result in equal cycles per RAC. The explanation for this effect is a bit involved: The cycle limits of Table 5.2 have zeroes that are concentrated together in the high spatial frequency area. As a result, the first stage produces intermediate blocks that contain columns with a number of zero-valued elements. The second stage MSB rejection and sign extension mechanisms therefore see correlated data and reduce the effective bitwidth of the numbers that the RAC units operate on. As a result, the reduced cycles per RAC are a result of increased

[Schmoo plot: core supply 1.10-3.00 V vs. clock frequency 2.0-44.0 MHz (. = PASS, * = FAIL). The passing region widens with supply voltage, from only the lowest frequencies at 1.10 V to nearly the full 2-44 MHz range at 3.00 V.]

Figure 5.46: DCT Chip Schmoo Plot

MSB and sign extension detection in addition to computation inhibition. On the other hand, the cycle limits that have been computed by the optimization program do not contain so many zeroes close together, but the reduction in the cycle limit numbers is spread more evenly. As a result, the reduced cycles are due to computation inhibition rather than increased MSB/sign rejection. We ask the reader to recall from section 5.1.2 that stopping a RAC before it reaches its final value introduces scaling by a factor of 2^n, where n is the number of remaining cycles required to complete the computation using full precision. In order to account for this scaling, the shift-only feedback path of Figure 5.2 must be activated for the n remaining cycles. This shift is not necessary in the MSB rejection case, and it is this additional clocking that accounts for the increased power in the third set of bars in Figure 5.47. The average power dissipation for all 11 images is shown in Table 5.14. We quote 4.38 mW for the chip power dissipation at 1.56V, 14 MHz because at this supply and clock frequency the chip can produce compressed images of high quality and at a rate high enough for MPEG2 640x480 4:2:0 video compression.
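The early-termination behavior and its 2^n scale compensation can be sketched as a behavioral model (an illustration with simplifying assumptions — unsigned inputs and a software ROM — not the chip's circuit):

```python
def build_rom(coeffs):
    # ROM[addr] holds the sum of coefficients whose input bit is set in addr
    return [sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
            for addr in range(1 << len(coeffs))]

def rac(xs, rom, bitwidth=8, cycle_limit=8):
    # MSB-first bit-serial distributed arithmetic: acc = 2*acc + ROM[bits].
    # Stopping after 'cycle_limit' add cycles would leave the result scaled
    # down by 2^n (n remaining cycles); the shift-only feedback cycles
    # restore the scale without ROM accesses.
    acc = 0
    for cycle in range(bitwidth):
        if cycle < cycle_limit:
            j = bitwidth - 1 - cycle                       # current bit plane
            addr = sum(((x >> j) & 1) << k for k, x in enumerate(xs))
            acc = 2 * acc + rom[addr]                      # add-and-shift cycle
        else:
            acc = 2 * acc                                  # shift-only cycle
    return acc

rom = build_rom([3, 5])
assert rac([2, 1], rom) == 11                  # full precision: 3*2 + 5*1
assert rac([2, 1], rom, cycle_limit=7) == 6    # early stop: LSB plane dropped
```

With the full cycle count the model is exact; with a reduced cycle limit the dropped LSB planes appear as truncation error, mirroring the power/precision trade-off measured above.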

A final observation from Figure 5.47 is that the more “busy” images FLOWER, MANDRILL and VAN GOGH produce the highest power dissipation. This is expected because, according to Table 5.3, these three images require the most cycles per RAC unit. This is attributed to reduced MSB and sign extension rejection due to reduced correlation among pel values. On the other hand, image WATER, which consists of mostly blue pels of constant intensity, demands the least power from the DCT processor.

[Bar chart: power (mW, 0-6) for each of the 11 test images (GIRL, LENA, PLANE, WATER, COUPLE, EUROPE, FLOWER, PEPPERS, MANDRILL, VAN GOGH, CATHEDRAL), with three bars per image: no computation inhibition, intelligent guess, optimization program.]

Figure 5.47: Power Dissipation of 11 Test Images at 1.56V, 14 MHz

                           Power (mW)
No Computation Inhibition  5.08
Intelligent Guess          4.38
Optimization Program       4.63

Table 5.14 Average Power Dissipation for All Test Images

[Plot: power (mW, 2-7) with error bars and block-count histogram (0-6000) vs. block standard deviation (0-100); average = 5.08 mW.]

Figure 5.48: DCT Chip Power vs. Pel Standard Deviation. Chip Measured Power at 1.56V, 14 MHz. The cycle limits used are set at the maximum possible value (computation per- formed to full precision). This plot includes all blocks of all 11 test images.

We wish to investigate further the relationship between image pel correlation and power dissipation. The results of our power measurements (average ± 1 standard deviation) on a 64-element block basis are plotted in Figures 5.48 and 5.49 vs. the sample standard deviation of the 64 block elements. The image block frequency histogram vs. standard deviation is also shown on the same plot. We observe that, in accordance with the design goal, power dissipation shows a strong dependence on data correlation (more MSB rejection and less arithmetic activity). To stress this dependence even more, we have measured the chip power while operating on blocks that have zero variance and consist of 64 occurrences of the same 8-bit value (0-255). The results of these measurements are plotted in Figure 5.50 vs. the block value. We observe that the average power dissipated is much lower (2.86 mW and 2.91 mW with and without computation inhibition respectively) than the average power we obtain when the chip is stimulated with image data (4.38 mW). Moreover, we observe that the power dissipation does not depend on the actual block value, since hardly any computation is performed and the power is mainly due to control, clock distribution and the TRAM. Figure 5.51 compares the average power of the DCT chip for different types of stimuli ranging from fully correlated (constant) blocks to random blocks with maximum variance. The “IMAGE BLOCKS” bars are power averages when the chip is stimulated with all 11 test images. We observe that MSB rejection itself can be responsible for up to 55% power savings, with 22% being more typical for image data. The savings figure will be higher for differential video data (i.e. MPEG) due to increased correlation (more zeroes and small

[Plot: power (mW, 2-7) with error bars and block-count histogram (0-6000) vs. block standard deviation (0-100); average = 4.38 mW.]

Figure 5.49: DCT Chip Power vs. Pel Standard Deviation. Chip Measured Power at 1.56V, 14 MHz. The cycle limits used are shown in Table 5.2. This plot includes all blocks of all 11 test images.

[Plot: power (mW, 0-5) vs. block element value (0-256), with two curves: Max Cycle Limits and Cycle Limits of Initial Intelligent Guess.]

Figure 5.50: DCT Chip Power vs. Block Element Value. Chip Measured Power at 1.56V, 14 MHz. This plot contains measurements for blocks where all 64 elements have the same value.

[Two panels (Reduced Precision, Full Precision): average power (mW, 0-8) for Random Blocks, Image Blocks, and Constant Blocks.]

Figure 5.51: DCT Chip Average Power Comparison for Different Stimuli. Chip Measured Power at 1.56V, 14 MHz.

magnitude numbers present). Our power estimation results (section 5.4) indicate that the power cost of the additional control logic required to implement MSB rejection is approximately 3-4% of the chip total. Therefore, MSB rejection produces net power gains on the order of 18-19% for typical image data and up to 52% for fully correlated data.

In addition to establishing the strong dependence between chip power and block sample correlation, we also wish to establish the obvious relationship between chip power dissipation and the average number of times each RAC unit must be clocked per block row/column for the DCT computation (the computation bitwidth). We remind the reader that each RAC is clocked a maximum of 8 times, except for RAC0 of stage 0, which is always clocked 8 times, and RAC0 of stage 1, which is always clocked 11 times. Figure 5.52 plots our measurement data (averages only) vs. the computation bitwidth along with the corresponding average block PSNR. Plots (a) and (c) show power and PSNR vs. bitwidth with row-column classification disabled (no premature computation inhibition is performed). We observe that PSNR stays constant at well above 40 dB while power increases with bitwidth, as expected. Plots (b) and (d) show power and PSNR in the presence of reduced precision selected according to the row-column maximum absolute element difference. We observe that PSNR drops as the average bitwidth increases due to the approximation performed by the RAC units. This approximation is more coarse for higher bitwidth inputs and therefore PSNR drops with increased bitwidth.
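The effect of sign-extension rejection on the number of RAC clock cycles can be illustrated with a small behavioral sketch (an illustration assuming two's complement operands, not the chip's detection circuitry):

```python
def tc_bits(v):
    # Minimum two's complement width (sign bit included) that can hold v
    return (v.bit_length() if v >= 0 else (~v).bit_length()) + 1

def effective_bitwidth(values, width=8):
    # MSBs beyond the widest operand in a row/column are pure sign
    # extension; rejecting them leaves this many bit-serial cycles.
    return min(width, max(tc_bits(v) for v in values))

assert effective_bitwidth([1, -2, 3]) == 3     # highly correlated: few cycles
assert effective_bitwidth([0, 0, 0]) == 1      # constant row: one cycle
assert effective_bitwidth([-128, 127]) == 8    # full dynamic range: no savings
```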

Finally, we wish to establish the relationship between power and quality of image coding. We used the PEPPERS image PSNR trace (Figure 5.13) and picked 11 sets of cycle limits that span the entire PSNR range for this image (23.99 dB to 44.84 dB). We determined

[Plot vs. computation bitwidth (0-8): (a) power, full precision and (b) power, reduced precision (mW, 2-7); (c) PSNR, full precision and (d) PSNR, reduced precision (dB, 0-60).]

Figure 5.52: DCT Chip Power and Block PSNR vs. Computation Bitwidth. Chip Measured Power at 1.56V, 14 MHz.

the specific cycle limits from the contour plots of Figure 5.13. We measured the power that image PEPPERS dissipates for each one of the 11 limit sets and plotted the results vs. PSNR in Figure 5.53. Figure 5.54 displays the actual compressed images for each (power, PSNR) datapoint of Figure 5.53. Figures 5.53 and 5.54 establish our claim that the present chip trades off image quality and power dissipation. Finally, we wish to compare the experimental power results with the ones obtained from Pythia to validate the estimation results of section 5.4. Figure 5.55 plots experimental measurements against Pythia estimated power for the first 12 blocks of test image CATHEDRAL. Pythia results with and without interconnect estimation are shown. The matching is remarkably good and we claim that the estimated results of section 5.4 are rather accurate. We can summarize our conclusions from the experimental results as follows: We have observed that power savings due to MSBR can be as high as 55% (random blocks vs. fully correlated blocks), with 22% being more typical for still images. The additional power cost of MSBR is 3-4% of the total chip power. Differential video will result in higher savings due to the increased presence of zero-valued data. RCC adds an additional 15% of power savings for minimal PSNR degradation at the expense of 5-6% more power in additional control logic (section 5.4). Much higher power savings can be achieved if we are willing to tolerate more image degradation. Both MSBR and RCC account for about 40% of power savings for still images at an additional combined overhead of 10%, for a net power reduction of 30%. We expect higher power savings for differential video coding.

[Plot: power (mW, 3.0-6.0) vs. PSNR (dB, 20-50).]

Figure 5.53: DCT Chip Average Power vs. Compressed Image Quality. Chip Measured Power at 1.56V, 14 MHz. The power measurements are for test image PEPPERS.

23.99 dB, 3.07 mW    26.74 dB, 3.16 mW    28.33 dB, 4.07 mW
30.36 dB, 4.08 mW    32.46 dB, 4.30 mW    34.57 dB, 4.65 mW
36.73 dB, 4.75 mW    38.78 dB, 4.91 mW    40.80 dB, 4.94 mW
42.78 dB, 5.02 mW    44.84 dB, 5.11 mW

Figure 5.54: Compressed Image Quality and Power. The displayed images constitute the data points of Figure 5.53.

[Plot: power (mW, 2-6) vs. block sequence number (0-12), with three traces: Measured Power @ 1.56V, 14 MHz; Pythia + Interconnect; Pythia No Interconnect.]

Figure 5.55: DCT Chip Measured vs. Estimated Power

Chip                                                   Sw-Cap/sample
Matsui et al. [MHS+94]                                 375 pF
Bhattacharya et al. [BH95]                             479 pF
Kuroda et al. [KFN+96] (scaled to same feature size)   417 pF
IDCT chip (chapter 3)                                  190 pF
Present DCT chip                                       128 pF

Table 5.15 Energy Efficiency Comparison Among DCT/IDCT Chips

The DCT chip exhibits less switched capacitance per operation than the IDCT chip of chapter 3 (128 pF vs. 190 pF) and, from a switched capacitance perspective, is much more energy efficient than similar chips that have appeared in the literature. In Table 5.15 we reproduce the switched capacitance data of Table 3.9 along with an extra entry for the present DCT chip. We observe that the present chip exhibits less switched capacitance by a factor of three when compared to past DCT processors. The activity reduction methods account for a significant portion of this reduction. The remaining savings are attributed to optimised internal arithmetic bitwidths, additional clock gating and adder/subtractor input shielding in the “butterfly” stages. Additional energy savings are reaped from reduced on-chip latency. Chips [MHS+94] and [KFN+96] impose 112 cycles of latency from chip input to chip output. The latency of the present DCT chip is 97 cycles.
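The Sw-Cap/sample figure follows directly from the measured operating point via P = C V^2 f, under the assumption that one sample is processed per clock cycle:

```python
# Per-sample switched capacitance from the measured operating point
# (assumes one pel per clock cycle, as in the comparison of Table 5.15).
P = 4.38e-3                  # measured power (W) at 1.56 V, 14 MHz
V = 1.56                     # core supply (V)
f = 14e6                     # clock and sample rate (Hz)
C = P / (V ** 2 * f)         # switched capacitance per sample (F)
print(f"{C * 1e12:.0f} pF")  # about 129 pF, in line with the 128 pF entry
```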


Figure 5.56: DCT/IDCT Demonstration Board

5.8 DCT/IDCT Demonstration Board

Rex Min [Min99] has built a board that demonstrates the functionality of the DCT and IDCT chips working together in a still image coding/decoding environment. The board contains a DCT and an IDCT chip directly connected to each other and supporting circuitry that stimulates the DCT chip with still images and displays the output of the IDCT chip on a grayscale LCD display. A photograph of the demonstration board is shown in Figure 5.56. A new demonstration board is being designed which will be capable of digitizing NTSC motion video, coding it through the DCT chip, decoding it through the IDCT chip, and displaying it on the LCD screen at 30 frames per second.

5.9 Chapter Summary and Conclusion

This chapter has presented the design, implementation and testing of the DCT core processor. The chip is targeted at low power video (MPEG2 MP@ML) and still image (JPEG) applications. It exhibits two innovative techniques for arithmetic operation reduction in the DCT computation context along with standard voltage scaling techniques such as pipelining and parallelism. The first method reduces the bitwidth of arithmetic operations in the presence of data spatial correlation. The second method trades off power dissipation and image compression quality (arithmetic precision). The chip dissipates 4.38 mW at 14 MHz, 1.56V.

Chapter 6

Conclusion

The contributions of this work can be classified in three categories:

• Algorithmic Contributions

• Architecture/Circuit Contributions

• System-level Contributions

6.1 Algorithmic Contributions

This work has introduced the idea of designing ad hoc DSP algorithms whose number of arithmetic operations depends on some statistical property of the input data. Such algorithmic design can result in a small number of operations for the average case (but not for the worst case) and lead to very low power implementations. A data-dependent algorithm for the computation of the IDCT has been introduced. This algorithm requires a variable number of additions and multiplications per block depending on the number of zeroes present in the input data stream. When operating on MPEG-compressed video data, the data-dependent algorithm requires a smaller average number of operations per block than previously published fast algorithms. A similar forward-mapped IDCT algorithm has been proposed by McMillan and Westover [MW92] and has been the basis for our own implementation. As opposed to the McMillan-Westover FMIDCT, the present algorithm is a row-column implementation and is more suitable for low-area VLSI implementations. This work has also proposed an algorithm for the computation of the DCT. The Chen [CSF77] algorithm has been the basis for this implementation. We have made the simple observation that for the DCT computation (and other frequency-domain transforms such as the DFT), the DC offset of the input data is only relevant for the computation of the DC coefficient and does not alter the higher spectral coefficients. This can substantially reduce

the input bitwidth in the presence of correlated inputs and result in a considerable reduction in the number of arithmetic operations without loss in precision. An additional extension is an adaptive method for further operation reduction with minimal visual image quality degradation. There has been a substantial body of work in adaptive video coding using classification based on post-DCT spectral information (chapter 4). Most previous investigators suggest variable quantization as the adaptation parameter. In contrast, the present work demonstrates the use of a pre-DCT classifier (row/column peak-to-peak pel amplitude) to reduce arithmetic operations during the actual DCT computation.
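The DC-offset observation is easy to verify numerically. The sketch below (a floating-point illustration, not the chip's fixed-point datapath) subtracts a common offset from an 8-pel row and confirms that only the DC coefficient of an orthonormal DCT-II changes:

```python
import math

def dct_1d(x):
    # Orthonormal 8-point DCT-II
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        out.append((math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) * s)
    return out

pels = [130, 131, 129, 132, 130, 128, 131, 130]   # correlated row
offset = min(pels)                                 # remove the DC offset
shifted = [p - offset for p in pels]               # small-bitwidth operands

full, small = dct_1d(pels), dct_1d(shifted)
# AC coefficients are identical; only the DC term shifts, by offset*sqrt(N)
assert all(abs(a - b) < 1e-9 for a, b in zip(full[1:], small[1:]))
assert abs((full[0] - small[0]) - offset * math.sqrt(8)) < 1e-9
```

The shifted operands here span only a few bits instead of eight, which is exactly the bitwidth reduction the DCT chip exploits.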

6.2 Architectural/ Circuit Contributions

The VLSI implementation of the IDCT chip has demonstrated that clock gating can be taken to the limit. Clock gating can be implemented in a pipelined data path by having separate qualified clock nets for each pipeline stage. Skew problems can be dealt with at a small cost in terms of power and practically zero hit in performance. The VLSI implementation of the DCT chip has demonstrated the suitability of distributed arithmetic structures for low power. Past researchers have attempted to design arithmetic structures that can adapt to the dynamic range of the input data: Nielsen and Sparso [NS96] [NS98] have proposed a sliced data path for a digital hearing aid filter bank that exploits small magnitude arithmetic inputs for low power. The arithmetic data path has been partitioned into an MSB and an LSB slice. The MSB slice is only enabled when the input bitwidth requires it. Activation of the slices is performed by using special data tags that indicate the presence of sign extension bits in the MSB input slice. Additional circuit overhead is required for the computation and update of the tags. Moreover, dynamic bitwidth adaptation is very coarse and can only be performed on a slice basis. On the other hand, the distributed arithmetic structures used in the DCT chip can reject sign extension bits in a very elegant fashion using minimal control overhead and at a very fine granularity (chapter 5). An additional desirable property of distributed arithmetic structures is based on the successive approximation property (section 2.2.1). Recently, a number of researchers have resorted to approximate processing as a method for reducing average system power: Ludwig et al. [LNC96] have demonstrated an approximate filtering technique which dynamically reduces the filter order based on the input data characteristics. More specifically, the number of taps of a frequency-selective FIR filter is dynamically varied based on the estimated stopband energy of the input signal. The resulting stopband energy of the output signal is always kept under a predefined threshold. This technique results in power savings of a factor of 6 for speech inputs. Larsson and Nikol [LN97] [NLAO97] have demonstrated an adaptive scheme for dynamically reducing the input amplitude of a Booth-encoded multiplier to the lowest acceptable precision level in an adaptive digital equalizer. Their scheme simply involves an arithmetic shift (multiplication/division by a power of 2) of the multiplier input depending on the value of the error at the equalizer output. They report power savings of 20%.

The ROM and accumulator (RAC) arithmetic units can be used in an approximate processing framework with minimal extra hardware control overhead. Simply stopping the clock of a RAC before it completes its dot product computation produces an approximation of its final result, with almost linear power savings. The DCT chip has demonstrated this property by producing approximate results for non-visually-significant spectral coefficients and for low activity input data.

6.3 System-level Contributions

One of the author’s main motivations to embark on this research is the strong belief that the next step is billion-transistor chips which will substantially deviate from the general purpose microprocessors of our times. Such complex chips will contain special-purpose macrocells for common tasks (i.e. DCT/IDCT) and large complex macroinstructions in addition to standard general purpose ALUs and instruction sets. Migration of common computation tasks from software to specialized hardware will substantially lower the average chip power dissipation and will enable portability and higher levels of integration. This work has provided the future system-level integrator with two such low power macrocells for two very frequently used tasks.

Bibliography

[AAN88] Y. Arai, T. Agui, and M. Nakajima. A fast DCT-SQ scheme for images. Transactions of the IEICE, E71(11):1095–1097, November 1988.

[ANR74] N. Ahmed, T. Natarajan, and K.R. Rao. Discrete cosine transform. IEEE Transactions on Computers, C-23(1):90–93, January 1974.

[BH95] A. K. Bhattacharya and S. S. Haider. A VLSI implementation of the inverse discrete cosine transform. International Journal of Pattern Recognition and Artificial Intelligence, 9(2):303–314, 1995.

[Bur94] T. Burd. Low-power CMOS library design methodology. Master’s thesis, UC Berkeley, 1994.

[CBB94] A.P. Chandrakasan, A. Burstein, and R.W. Brodersen. A low-power chipset for a portable multimedia I/O terminal. IEEE Journal of Solid State Circuits, 29(12):1415–1428, December 1994.

[Cho96] P.L. Chou. Low power ROM generation. Master’s thesis, Massachusetts Institute of Technology, 1996.

[CL92] N. Cho and S. Lee. A fast 4x4 DCT algorithm for the recursive 2-D DCT. IEEE Transactions on Signal Processing, 40(9):2166–2172, September 1992.

[CP84] W. H. Chen and W. Pratt. Scene adaptive coder. IEEE Transactions on Communi- cations, COM-32(3):225–232, March 1984.

[CS77] W. H. Chen and H. Smith. Adaptive coding of monochrome and color images. IEEE Transactions on Communications, COM-25(11):1285–1292, November 1977.

[CSB92] A.P. Chandrakasan, S. Sheng, and R.W. Brodersen. Low-power CMOS digital design. IEEE Journal of Solid State Circuits, 27(4):473–484, April 1992.

[CSF77] W. H. Chen, C. H. Smith, and S.C. Fralick. A fast computational algorithm for the discrete cosine transform. IEEE Transactions on Communications, COM-25(9), September 1977.

[CT91] T.M. Cover and J.A. Thomas. Elements of Information Theory. J. Wiley, 1991.


[Dan96] A.P. Dancy. Power supplies for ultra low power applications. Master’s thesis, Massachusetts Institute of Technology, 1996.

[Dav72] L.D. Davisson. Rate-distortion theory and application. Proceedings of the IEEE, 60(7):800–808, July 1972.

[Den95] L. Dennison. The Reliable Router : An architecture for fault tolerant interconnect. PhD thesis, Massachusetts Institute of Technology, 1995.

[FW92] E. Feig and S. Winograd. Fast algorithms for the discrete cosine transform. IEEE Transactions on Signal Processing, 40(9):2174–2193, September 1992.

[Gea96] J. Gealow. An Integrated Computing Structure for Pixel-Parallel Image Processing. PhD thesis, Massachusetts Institute of Technology, 1996.

[Gim75] J. Gimlett. Use of ”activity” classes in adaptive transform image coding. IEEE Transactions on Communications, COM-23(7):785–786, July 1975.

[HDG92] Y. Huang, H. Dreizen, and N. Galatsanos. Prioritized DCT for compression and progressive transmission of images. IEEE Transactions on Image Processing, 1(4):477–487, October 1992.

[Huf52] D.A. Huffman. A method for the construction of minimum redundancy codes. Proc. IRE, 40(9):1098–1102, September 1952.

[IEE90a] Standards Committee IEEE. IEEE standard specifications for the implementation of 8x8 inverse discrete cosine transform. IEEE Standard 1180-1990, 1990.

[IEE90b] Standards Committee IEEE. IEEE standard test access port and boundary scan architecture. IEEE Standard 1149.1-1990, 1990.

[Int98] Intel Corp. Mobile Pentium Processor with MMX[tm] Technology on .25 micron. http://developer.intel.com/design/mobile/datashts/243468.htm, January 1998.

[ISO94] International Organization for Standardisation ISO. Generic coding of moving pictures and associated audio, Recommendation H.262. ISO/IEC 13818-2 Draft International Standard, 1994.

[KFN+96] T. Kuroda, T. Fujita, T. Nagamatu, S. Yoshioka, F. Sano, M. Norishima, M. Murota, M. Kako, M. Kinugawa, M. Kakumu, and T. Sakurai. A 0.9V, 150-MHz, 10-mW, 4 mm2, 2-D discrete cosine transform core processor with variable-threshold-voltage (VT) scheme. IEEE Journal of Solid State Circuits, 31(11):1770–1777, November 1996.

[KMO87] Y. Kato, N. Mukawa, and S. Okubo. A motion picture coding algorithm using adaptive DCT encoding based on coefficient power distribution classification. IEEE Journal on Selected Areas in Communication, SAC-5(7):1090–1099, August 1987.

[KS67] M.G. Kendall and A. Stuart. The Advanced Theory of Statistics. Charles Griffin, London, 1967.

[LBG80] Y. Linde, A. Buzo, and R.M. Gray. An algorithm for vector quantizer design. IEEE Transactions on Communications, COM-28(1):84–95, January 1980.

[LeG91] Didier LeGall. MPEG: a video compression standard for multimedia applica- tions. Communications of the ACM, 34(4):46–58, April 1991.

[LKRR93] H. Lee, Y. Kim, A.H. Rowberg, and E.A. Riskin. Statistical distributions of DCT coefficients and their application to an interframe compression algorithm for 3-D medical images. IEEE Transactions on Medical Imaging, 12(3):478–485, September 1993.

[Llo82] S.P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28(2):129–137, March 1982.

[LN97] P. Larsson and C.J. Nikol. Self-adjusting bit precision for low-power digital filters. In Symposium on VLSI Circuits, pages 123–124, June 1997.

[LNC96] J.T. Ludwig, S.H. Nawab, and A. Chandrakasan. Low-power digital filtering using approximate processing. IEEE Journal of Solid State Circuits, 31(3):395–400, March 1996.

[Loh84] H. Lohscheller. A subjectively adapted image communication system. IEEE Transactions on Communications, COM-32(12):1316–1322, December 1984.

[LR95] P.E. Landman and J.M. Rabaey. Architectural power analysis: The dual bit type method. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 3(2):173–187, June 1995.

[LV86] A. Ligtenberg and M. Vetterli. A discrete Fourier cosine transform chip. IEEE Journal on Selected Areas in Communication, SAC-4(1):49–61, January 1986.

[Max60] J. Max. Quantizing for minimum distortion. IRE Transactions on Information Theory, pages 7–12, March 1960.

[MF92] R. Mester and U. Franke. Spectral entropy-activity classification in adaptive transform coding. IEEE Journal on Selected Areas in Communication, 10(5):913–917, June 1992.

[MHS + 94] M. Matsui, H. Hara, K. Seta, Y. Uetani, L. Kim, T. Nagamatsu, T. Shimazawa, S. Mita, G. Otomo, T. Oto, Y. Watanabe, F. Sano, A. Chiba, K. Matsuda, and T. Sakurai. 200 MHz video compression macrocells using low-swing differential logic. In IEEE International Solid-State Circuits Conference, pages 76–77, February 1994.

[Min99] R. Min. Demonstration system for a low-power video coder and decoder. Master's thesis, Massachusetts Institute of Technology, In Preparation, 1999.

[MPFL97] J.L. Mitchell, W.B. Pennebaker, C.E. Fogg, and D.J. LeGall. MPEG Video Compression Standard. Chapman & Hall, New York, 1997.

[Mul93] F. Muller. Distribution shape of two-dimensional DCT coefficients of natural images. Electronics Letters, 29(22):1935–1936, October 1993.

[MW92] L. McMillan and L. A. Westover. A forward-mapping realization of the inverse discrete cosine transform. In Proceedings of the Data Compression Conference (DCC '92), pages 219–228. IEEE Computer Society Press, March 1992.

[MW93] L. McMillan and L. A. Westover. Method and apparatus for fast implementation of inverse discrete cosine transform in a digital image processing system using optimized lookup tables. United States Patent No. 5,224,062, June 1993.

[MW94] L. McMillan and L. A. Westover. Method and apparatus for fast implementation of inverse discrete cosine transform in a digital image processing system using low cost accumulators. United States Patent No. 5,301,136, April 1994.

[NH95] A.N. Netravali and B.G. Haskell. Digital Pictures: Representation, Compression and Standards, Sec. Ed. Plenum, 1995.

[Nil85] N.B. Nill. A visual model weighted cosine transform for image compression and quality assessment. IEEE Transactions on Communications, COM-33(6):551–557, June 1985.

[NL86] K.N. Ngan and K.S. Leong. A fast convergence method for Lloyd-Max quantizer design. Electronics Letters, 22:944–946, July 1986.

[NLAO97] C.J. Nikol, P. Larsson, K. Azadet, and N.H. O'Neill. A low-power 128-tap digital adaptive equalizer for broadband modems. IEEE Journal of Solid State Circuits, 32(11):1777–1789, November 1997.

[NLS89] K. N. Ngan, K. S. Leong, and H. Singh. Adaptive cosine transform coding of images in perceptual domain. IEEE Transactions on Acoustics, Speech and Signal Processing, 37(11):1743–1749, November 1989.

[NS96] L. Nielsen and J. Sparso. A low-power asynchronous data-path for an FIR filter bank. In Second International Symposium on Advanced Research in Asynchronous Circuits and Systems, pages 197–207, March 1996.

[NS98] L. Nielsen and J. Sparso. An 85 µW asynchronous filter bank for a digital hearing aid. In IEEE International Solid-State Circuits Conference, pages 108–109, February 1998.

[OHM+ 84] J.K. Ousterhout, G.T. Hamachi, R.N. Mayo, W.S. Scott, and G.S. Taylor. Magic: A VLSI layout system. In Proceedings of the 21st Design Automation Conference (DAC ’84), pages 152–159, June 1984.

[PL74] A. Peled and B. Liu. A new hardware realization of digital filters. IEEE Transactions on Acoustics Speech and Signal Processing, ASSP-22, December 1974.

[PM92] W.B. Pennebaker and J.L. Mitchell. JPEG: Still Image Data Compression Standard. Van Nostrand Reinhold, 1992.

[RG83] R.C. Reininger and J.D. Gibson. Distributions of the two-dimensional DCT coefficients for images. IEEE Transactions on Communications, COM-31(6):835–839, June 1983.

[RY90] K. R. Rao and P. Yip. Discrete Cosine Transform: Algorithms, Advantages, Applications. Academic Press, 1990.

[SCG89] M. T. Sun, T. C. Chen, and A. M. Gottlieb. VLSI implementation of a 16 × 16 discrete cosine transform. IEEE Transactions on Circuits and Systems, 36(4), April 1989.

[Sim99] T. Simon. A Low Power Video Compression Chip for Portable Applications. PhD thesis, Massachusetts Institute of Technology, In Preparation, 1999.

[SR96] S.R. Smoot and L.A. Rowe. Study of DCT coefficient distributions. In Proceedings of the SPIE - The International Society for Optical Engineering, volume 2657, pages 403–411, January 1996.

[TW71] M. Tasto and P. Wintz. Image coding by adaptive block quantization. IEEE Transactions on Communication Technology, COM-19(3):957–972, May 1971.

[UIT+92] S. Uramoto, Y. Inoue, A. Takabatake, J. Takeda, Y. Yamashita, H. Terane, and M. Yoshimoto. A 100 MHz 2-D discrete cosine transform core processor. IEEE Journal of Solid State Circuits, 27(4), April 1992.

[Wal91] G.K. Wallace. The JPEG still picture compression standard. Communications of the ACM, 34(4):30–44, April 1991.

[WE93] N. Weste and K. Eshraghian. Principles of CMOS VLSI Design, Sec. Ed. Addison- Wesley, 1993.

[Whi89] Stanley A. White. Applications of distributed arithmetic to digital signal processing: A tutorial review. IEEE ASSP Magazine, July 1989.

[XC98] Thucydides Xanthopoulos and Anantha Chandrakasan. A low-power IDCT macrocell for MPEG2 MP@ML exploiting data distribution properties for minimal activity. In 1998 Symposium on VLSI Circuits Digest of Technical Papers, pages 38–39, June 1998.

[XC99] Thucydides Xanthopoulos and Anantha Chandrakasan. A low-power IDCT macrocell for MPEG2 MP@ML exploiting data distribution properties for minimal activity. IEEE Journal of Solid State Circuits, 34(4), To Appear: April 1999.

[XCSD96] Thucydides Xanthopoulos, Anantha Chandrakasan, Charles G. Sodini, and William J. Dally. A data-driven IDCT architecture for low power video applications. In European Solid State Circuits Conference (ESSCIRC), September 1996.

[XYC97] Thucydides Xanthopoulos, Yoshifumi Yaoi, and Anantha Chandrakasan. Architectural exploration using Verilog-based power estimation: A case study of the IDCT. In Proceedings of the 34th Design Automation Conference (DAC '97), pages 415–420, June 1997.

[YS89] J. Yuan and C. Svensson. High-speed CMOS circuit technique. IEEE Journal of Solid State Circuits, 24(1):62–70, February 1989.

Appendix A

Design Tools and Methods

This appendix describes the tools and methodologies used for the physical design of the IDCT and DCT core processors of chapters 3 and 5 respectively. The design flow includes a number of custom tools developed by the author for library development, netlist processing, standard cell place and route, etc. This appendix is meant to serve as a source of documentation for such tools so that the expertise developed over the course of the last four years can be passed on to new students.

A.1 Design Flow

Figure A.1 illustrates the flow followed during the design and implementation of the DCT and IDCT chips of the previous chapters. This design flow skips some standard steps (e.g. Verilog behavioral modelling) but has been followed by the author over the course of two reasonably large designs (120K and 160K transistors) and has been found to produce working chips within a reasonable amount of time (approximately 6-8 months per chip, starting from a blank sheet of paper.)

At the top of the flow is the chip functional modelling using a high level systems language (e.g. C/C++). This step is essential for a proof-of-concept chip implementation in a short time and also serves as a "live" chip specification used to produce input/output vectors for subsequent verification steps. We proceed by entering gate-level schematics using standard cells from a previously developed library. Schematics are simulated using gate-level Verilog and verified against the C/C++ chip model. Before proceeding to chip physical design, schematics are simulated at the transistor level using circuit simulators such as Hspice and Timemill/Powermill. The next step is the system physical design (layout) in two dimensions. The final two steps are comparison of the schematic and extracted layout netlists (LVS) and circuit simulation of the extracted layout netlist, which includes extracted wiring parasitic capacitances. This last step is essential for correct chip operation at the specified supply voltage and clock frequency. After the completion of this step, manufacturing output (ASCII CIF or gdsII binary stream) is produced for the fabrication house.


Figure A.1: Design Flow. (The flow proceeds from a C/C++ functional model and standard cell library development in the Cadence design environment, through schematic entry in Cadence Composer, gate-level simulation in Verilog-XL, circuit simulation in Hspice/Timemill, and physical design in Cadence Virtuoso, to schematic/layout comparison with Cadence LVS, circuit simulation including interconnect parasitics, manufacturing output in CIF or gdsII, and chip fabrication. Vectors and timing extracted at earlier steps drive the later simulations.)

A.2 Functional Chip Model

A chip functional model using a high level system language is essential for three main reasons. First, it provides a quick proof-of-concept implementation of the system under consideration, and it is also an excellent form of documentation for the intended chip functionality. Second, it is a source of stimuli and verification vectors (digital inputs and outputs) that are used in subsequent flow steps (Figure A.1). Third, due to the execution speed of a system language, the functional model can be used as a testbed for implementing and evaluating design alternatives (e.g. the bitwidth of arithmetic units.)

The functional model has a notion of all the system functional blocks (usually implemented as separate function calls) and should produce bit-equivalent results at each functional block boundary. Exact timing information is not embedded in the code, but a high-level task sequence must be present and must be equivalent to the sequence of tasks that will be performed by the actual chip.

Two separate functional models in C have been developed, one for the DCT and one for the IDCT processor. Both have been used to determine the optimum arithmetic unit bitwidth for minimum power and acceptable precision [IEE90a]. Both have been combined to produce a rudimentary still image compression software package (tcode) emulating the behavior of the DCT/IDCT core chipset. The functional models have been appropriately instrumented so that a number of intermediate outputs can be extracted for purposes of incremental debugging.
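As an illustration of this modelling style, the C sketch below shows how a functional block can be written as a plain function and instrumented to dump bit-exact stimulus/response vectors at its boundary. The block, its 12-bit width, and all names are hypothetical; this is not the actual tcode source.

```c
#include <stdio.h>

#define N 8          /* block size of the hypothetical pipeline stage */

/* Clamp to a 12-bit signed range -- the kind of bitwidth decision the
 * functional model is used to evaluate (the width here is illustrative). */
static int sat12(int x)
{
    if (x >  2047) return  2047;
    if (x < -2048) return -2048;
    return x;
}

/* One hypothetical functional block: a butterfly-style addition stage.
 * The real models implement the actual DCT/IDCT block structure. */
static void stage(const int in[N], int out[N])
{
    for (int i = 0; i < N; i++)
        out[i] = sat12(in[i] + in[N - 1 - i]);
}

/* Dump the stimulus and the bit-exact response at the block boundary;
 * a Verilog testbench can replay these files in later flow steps. */
static void dump_vectors(FILE *fin, FILE *fout, const int in[N])
{
    int out[N];
    stage(in, out);
    for (int i = 0; i < N; i++) fprintf(fin,  "%03x\n", in[i]  & 0xfff);
    for (int i = 0; i < N; i++) fprintf(fout, "%03x\n", out[i] & 0xfff);
}
```

Each subsequent flow step replays the same vector files, so a mismatch pinpoints the first block boundary at which the hardware deviates from the model.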

A.3 Library Development

A standard cell library complete with cell symbol, functional (Verilog), schematic (transistor-level) and layout views is essential for speedy design entry. The author has developed two such libraries during the course of the chip designs: one for the HP CMOS14 (0.5 µm) process available through MOSIS, using SCMOS parametrised (λ-based, λ = 0.35 µm) design rules, and one for the Analog Devices (ADI) 0.6 µm DPTM process, using micron-based design rules. Raj Amirtharajah and Tom Simon have also made substantial contributions to the completion of the ADI library. In the Cadence design framework a standard cell has four required views:

Symbol The symbol view is created manually by the designer or copied from a template library provided by the CAD vendor. The symbol view is the main cell placeholder within a schematic drawing.

Functional The functional view is a Verilog module (preferably primitive) which func- tionally describes the cell. This view is also manually created by the designer and is used for Verilog gate-level simulations.

Schematic This view contains the transistor-level schematic of the cell along with transistor size information. This schematic is entered manually by the designer.

Layout This view contains the layout polygons. This view can be generated in one of three ways within our CAD framework:

1. The layout can be generated manually by the designer (an activity otherwise known as "polygon pushing"!). This is the most common form of layout design entry and the only available method for complex cells such as flip-flops and XOR gates.

2. A number of standard cell libraries (some of them of dubious quality) are freely available in the academic community (e.g. the Berkeley low-power standard cell library [Bur94]). Invariably, these libraries are available in Magic [OHM+ 84] format. The author has developed a software package (mag2skill) which converts such cells to Cadence layout given simple configuration information.

3. A layout generation package for simple static CMOS cell generation from schematics is also available. This package (layoutGen) was originally developed by Larry Dennison [Den95] but has been substantially modified and parametrised by the author and Raj Amirtharajah to produce high quality layout with enforceable parametrised rules. This package reads a cell schematic along with design rule information and cell dimensional specifications (e.g. cell height and width of power and ground rails) and produces layout of quite acceptable quality, complete with substrate and well contacts, obeying all furnished design rules.

Layout generation techniques 2 and 3 are expanded upon in the following sections.

A.3.1 Magic to Cadence Translation

The present location of the mag2skill package is under /jake/mag2skill. It has been written entirely in the C programming language. The program operates on .mag cell files. A section from such a layout description is reproduced in Figure A.2. The double angle brackets specify a layer change, whereas the "rect" declarations specify a rectangle instance. The four integers that follow specify the x and y coordinates of the rectangle diagonal.

Before running the executable, a technology file is necessary. A sample technology file included in the mag2skill distribution is reproduced in Figure A.3. This file provides certain design rule information necessary to translate Magic contacts and contact arrays to Cadence raw contact cuts and contact cut arrays. In addition to design rule information, the file provides a correspondence between the layer names assumed by each separate layout editor. The DERIVE_SELECT flag should be set to one if the translator is to derive the N-select and P-select layers. Such derivation is performed by expanding the N-diffusion and P-diffusion layers by one SELECT_OVERLAP unit (3 λ in this example).

Magic has a unique way of representing contacts (poly-metal, diffusion-metal and vias). All three physical layers (top contact layer, bottom contact layer and contact cut including contact surround) are represented with a single rectangle on the corresponding

. . .
<< nwell >>
rect -3 24 51 58
<< polysilicon >>
rect 11 54 29 56
rect 7 50 9 52
rect 11 50 13 54
. . .

Figure A.2: Magic Layout Cell Description

contact layer. Only during the extraction of manufacturing output are the underlying layers derived. As a result, the translation to a standard layout editor (Cadence Virtuoso) should perform this physical layer derivation. Figure A.4 demonstrates the translation of a Magic via array to a Cadence via array. The reader should note that the technology file in Figure A.3 specifies three separate Cadence layers for each Magic contact layer (e.g. "polycontact" translates to "poly", "metal1" and "cont" in Cadence).

Except for the contacts and the two transistor devices (poly overlapping diffusion), all other rectangles undergo a one-to-one translation. The mag2skill executable produces a SKILL source code file (SKILL is an interpreted LISP-like dialect that implements the procedural interface of the Cadence CAD package) which, when run in the Cadence Command Interpreter Window (CIW), produces a cell layout view. Figure A.5 contains the section of the generated SKILL code that produces the Magic rectangles of Figure A.2.
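The core of such a one-to-one translation loop can be quite small. The following C sketch illustrates the approach (it is not mag2skill's actual source; layer-name mapping and contact expansion are omitted): it tracks the current Magic layer and emits one dbCreateRect call per "rect" record, in the style of Figure A.5.

```c
#include <stdio.h>
#include <string.h>

/* Read .mag records from `in` and emit SKILL dbCreateRect calls to `out`.
 * "cv" is assumed to be the Cadence cellview variable, as in Figure A.5.
 * A "<< layer >>" record changes the current layer; a "rect x1 y1 x2 y2"
 * record produces one rectangle on that layer. */
static void mag_to_skill(FILE *in, FILE *out)
{
    char line[256], layer[64] = "";
    while (fgets(line, sizeof line, in)) {
        int x1, y1, x2, y2;
        if (sscanf(line, " << %63[^>] >>", layer) == 1) {
            /* strip trailing blanks captured before the closing ">>" */
            size_t n = strlen(layer);
            while (n && layer[n - 1] == ' ') layer[--n] = '\0';
        } else if (sscanf(line, " rect %d %d %d %d",
                          &x1, &y1, &x2, &y2) == 4) {
            fprintf(out,
                "(dbCreateRect cv (list \"%s\" \"drawing\")"
                " (list (list %d %d) (list %d %d)))\n",
                layer, x1, y1, x2, y2);
        }
    }
}
```

A real translator would additionally map Magic layer names to Cadence names through the technology file and expand contact rectangles into their constituent layers.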

An Example

In this section, a sample magic cell from the Berkeley low power library (drif301.mag, 2- stage driver) is translated to Cadence format. The following steps are followed:

1. The technology file (/jake/mag2skill/tech/m2s.tech, also shown in Figure A.3) must be copied in the present directory at the same level as the cell to be translated (drif301.mag).

2. The following command must be typed at the Unix prompt: "m2s drif301 stdcells". Note that the .mag extension is omitted. The second command argument ("stdcells") is the name of the Cadence library where the user wishes to place the newly translated cell.

3. After the previous step, a file called "drif301.il" has been created in the present directory. This file contains SKILL code that when run generates all the shapes that

# magic2skill Technology File
# D. Xanthopoulos 1996

# this must be set to 1 if the select mask must be included
DERIVE_SELECT 1

# Necessary Design Rules
CONTACT_SIZE 2
CONTACT_SPACING 2
CONTACT_OVERLAP 1
VIA_SIZE 2
VIA_SPACING 2
VIA_OVERLAP 1
SELECT_OVERLAP 3

#Local Name  Magic layer         Cadence layer(s)
PW           pwell               none
NW           nwell               nwell
POLY         polysilicon         poly
NDIFF        ndiffusion          ndiff
PDIFF        pdiffusion          pdiff
M1           metal1              metal1
M2           metal2              metal2
M3           metal3              metal3
NT           ntransistor         ndiff poly
PT           ptransistor         pdiff poly
PSUB         psubstratepdiff     psub
NSUB         nsubstratendiff     nsub
GLASS        glass               overgla
# VERY IMPORTANT!!!!
# Contacts must be specified as layer1 layer2 contact_cut
PC           polycontact         poly metal1 cont
NDC          ndcontact           ndiff metal1 cont_aa
PDC          pdcontact           pdiff metal1 cont_aa
M2C          m2contact           metal1 metal2 via
PSC          psubstratepcontact  psub metal1 cont_aa
NSC          nsubstratencontact  nsub metal1 cont_aa
end

Figure A.3: mag2skill Technology File. All integers represent λ units (MOSIS SCMOS design rules).
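The select derivation controlled by DERIVE_SELECT reduces to simple rectangle arithmetic: each diffusion rectangle is grown by SELECT_OVERLAP λ on every side. A C sketch of this step (the data layout is illustrative, not mag2skill's internal representation):

```c
/* A rectangle given by the two corners of its diagonal, in lambda,
 * matching the "rect x1 y1 x2 y2" records of a .mag file. */
struct rect { int x1, y1, x2, y2; };

/* Expansion amount from the sample technology file (3 lambda). */
#define SELECT_OVERLAP 3

/* Derive a select rectangle from a diffusion rectangle by expanding
 * it outward by SELECT_OVERLAP lambda on all four sides. */
static struct rect derive_select(struct rect diff)
{
    struct rect sel = {
        diff.x1 - SELECT_OVERLAP, diff.y1 - SELECT_OVERLAP,
        diff.x2 + SELECT_OVERLAP, diff.y2 + SELECT_OVERLAP
    };
    return sel;
}
```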


Figure A.4: Translation of a Magic Via Array to Cadence Layout
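The geometry of this translation can be captured with a little arithmetic: the number of cuts that fit along one edge of the Magic contact rectangle follows from VIA_SIZE, VIA_SPACING and VIA_OVERLAP. The sketch below illustrates the computation (it is not the translator's literal code):

```c
/* Design rule values from the sample technology file, in lambda. */
#define VIA_SIZE    2
#define VIA_SPACING 2
#define VIA_OVERLAP 1

/* Number of contact cuts that fit along one edge of a Magic contact
 * rectangle of the given extent: cuts are VIA_SIZE wide, VIA_SPACING
 * apart, and inset VIA_OVERLAP from each side of the enclosing metal,
 * as annotated in Figure A.4. */
static int cuts_along(int extent)
{
    int usable = extent - 2 * VIA_OVERLAP;   /* room left for cuts */
    if (usable < VIA_SIZE)
        return 0;                            /* too small for any cut */
    /* n cuts occupy n*VIA_SIZE + (n-1)*VIA_SPACING lambda */
    return (usable + VIA_SPACING) / (VIA_SIZE + VIA_SPACING);
}
```

With these rules, a 4 λ contact rectangle yields a single cut and an 8 λ rectangle a row of two, matching the array expansion shown in the figure.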

. . .
(dbCreateRect cv (list "nwell" "drawing") (list (list -3 24) (list 51 58)))
(dbCreateRect cv (list "poly" "drawing") (list (list 11 54) (list 29 56)))
(dbCreateRect cv (list "poly" "drawing") (list (list 7 50) (list 9 52)))
(dbCreateRect cv (list "poly" "drawing") (list (list 11 50) (list 13 54)))
. . .

Figure A.5: SKILL Code Generated by mag2skill Corresponding to the Rectangles of Figure A.2


Figure A.6: Translation of a Magic 2-stage Driver to Cadence Format

comprise the buffer. The user should type “load "drif301.il"” in the Cadence CIW window.

4. The user should create a Cadence cell named “drif301” within the “stdcells” Cadence library.

5. Finally, the user should type “lg” (Layout Generate) in the CIW. Under cell drif301, a layout view has been created that contains the same shapes (with appropriately translated contacts and contact arrays) as the Magic layout.

The Magic cell and the newly generated Cadence cell appear in Figure A.6. The reader's attention is directed to the translation of contacts in the figure.

The author has used certain Berkeley cells (TSPC flops), translated using m2s, in the design of the IDCT processor. Paul Chou [Cho96] used m2s extensively to develop a low power ROM generator in the Cadence environment using Berkeley Magic cell blocks. Abram Dancy [Dan96] extended m2s to enforce micron-based design rules. He used the translator to convert Berkeley Magic cells into the Cadence environment obeying ADI 0.6 µm micron-based rules.

A.3.2 CMOS Cell Library Generator (layoutGen)

A SKILL package originally developed by Larry Dennison [Den95] for MOSIS scalable CMOS (SCMOS) rules has been adapted by the author and Raj Amirtharajah for use with programmable design rules. The program (layoutGen) can produce very usable layout for basic static CMOS structures (inverters, and/nand/or/nor gates) by reading device sizes and connectivity from schematics. Usage is very simple: after the designer has finished entering the schematics for a standard cell, he/she needs to load the layoutGen package in the Cadence CIW window:

CIW> load ‘‘layoutGen.il’’

Then, he/she proceeds to create the cell layout:

CIW> (layoutGen ‘‘libname’’ ‘‘cellname’’)

After the execution of the layoutGen command, a layout cellview for the corresponding cell will have been created. Specification of design rules along with other cell parameters (e.g. cell height, width of supply rails) appears in a separate file ("layoutGlobals.il") which is read at runtime by layoutGen.

The layout generation program first places all transistors using an ad-hoc procedure to maximise contact sharing. Then the supply rails are drawn along with internal connectivity wires. Finally, the cell is scanned for possible locations for well and substrate contacts, and the maximum possible number of contacts is placed. During the last step, the cell terminals are also placed.

When a complex cell is given to layoutGen, the generator attempts a best-effort layout (device placement always completes) and exits after producing a cell with incomplete connectivity. The produced cell is usually an excellent starting point for hand layout. Figure A.7 shows example cell layouts produced by layoutGen for the ADI 0.6 µm process. All the displayed layouts have been produced by software only, with no intervention from the designer. All layouts pass the ADI design rules. Substantial thought has been put into a number of layout details. As an example, note how the generator does not center the N-diffusion and P-diffusion islands for the 2-input and the 4-input NAND gates but does so for the 5-input AND, so that the "jogged" input gate polysilicon still fits within the specified cell height.

Two optional command line options are available for layoutGen. They are invoked as follows:

CIW> (layoutGen ‘‘libname’’ ‘‘cellname’’ ?splitTrans t ?inv t)

The command line option "splitTrans" directs layoutGen to split large transistors and draw them with multiple fingers. If this option is not set, large transistors will be drawn with a single gate finger and the cell height will possibly exceed the spacing between supply rails. The "inv" option should be used when a multiple-finger inverter is to be generated. It directs layoutGen to short all gate poly fingers from one end of the cell to the other.

A.4 Schematic Entry and Gate-Level Simulation

Schematic entry has been the basic form of design entry during the development of both the DCT and IDCT chips. The reader will note that HDL-level modelling (i.e. behavioral Verilog/VHDL) is notably absent from this design flow. There are two main reasons why we concluded that this step could be skipped:

111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 11111111111111111 111111111111111111111111 111111111111111111111111

lgenannot.ps

NAND2 153  183 mm OR3

111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 111111111111111111111111 11111111111111111111111111111111 11111111111111111111111111111111

NAND4 AND5

Figure A.7: Standard Cells Produced by layoutGen A.5. SCHEMATIC CIRCUIT SIMULATION 155

1. The C-level functional model has already provided a proof-of-concept system implementation without detailed timing information. Yet, detailed circuit design may easily introduce new constraints which will affect the timing structure assumed during the behavioral modelling. In many cases, gate-level schematic entry can render behavioral modelling quite obsolete. On the other hand, the C-level model provides enough information for the designer to proceed directly to gate-level schematic entry. A time-consuming step can thus be saved.

2. Debugging at the HDL behavioral modelling level can be quite time consuming for a number of reasons. Hardware description languages are essentially parallel and deviate substantially from the sequential execution model of most programming languages. Moreover, the event-triggered character of such languages can lead to erroneous event loops which are difficult to identify by looking at textual program listings. On the other hand, schematic printouts provide very effective visual information that can easily summarize dependencies and timing and render debugging far more effective and less time consuming.

Every library cell is backed by a Verilog textual view that encapsulates its functionality. A schematic can be netlisted into a hierarchical Verilog circuit description. The Verilog netlist is simulated using logic vectors generated from the C-level model. When I/O equivalence is achieved for a large number of vectors, the chip schematic is considered validated at the gate level.

A.5 Schematic Circuit Simulation

The Verilog simulation guarantees logical correctness but cannot catch circuit-level bugs such as underdriven nets, charge sharing problems, race conditions, setup/hold time violations, etc. A circuit simulator such as Hspice or Timemill/Powermill is necessary for this step.

Hspice is a high quality circuit simulator but it cannot handle large circuits effectively. Hspice is ideal for tasks such as simulating individual cells and identifying the critical path, but it cannot handle an entire chip such as the DCT and IDCT chips. On the other hand, Timemill is capable of handling large systems at the expense of a lower-quality event-triggered simulation. The author has found that Timemill may produce erroneous results at low voltages (<1.5V), especially for non-static CMOS circuit styles. Nevertheless, it is currently the only tool on the market that can simulate large netlists at the circuit level. Powermill is essentially the same simulation engine with more precision and the ability to monitor supply currents for power measurements.

Both the DCT and IDCT chips have undergone extensive Timemill simulations and have been validated at the circuit schematic level.


Figure A.8: Static Net Fanout Checker Entry Form

A.5.1 Static Net Fanout Checking

An additional step that should come before circuit simulation and can save a lot of debugging time is static net fanout checking. The author has developed a fanout checker that can descend into a hierarchical circuit, compute the total gate load on each circuit node, trace the driving transistors, and report nets that are underdriven and will have large rise and fall times. The checker can identify such problematic nodes within a few seconds and save hours of erroneous circuit simulations.

The checker is very easy to use and has been fully integrated within the Cadence design framework. It has been added under a "Custom Tools" menu in the Cadence CIW window. Selection of the corresponding menu entry brings up the form of Figure A.8. A brief description of the form fields follows:

Run Directory Specifies the directory where the resulting file will be placed.

Library Name Library Name of the hierarchical cell on which we wish to run the checker.

Cell Name Cell Name of the cell in question.

pmos/nmos Device Cell Names Names of the PMOS and NMOS transistor library elements in the schematic. These are the stopping cells in the hierarchical check.

Skip Libraries Cells that belong to these libraries are also treated as stopping cells so that the output and run time of the program can be pruned. As an example, we typically do not wish for the checker to descend all the way into logic gates and check the fanout of local interconnect.

Driving Factor This number specifies the desired driving-strength multiplicative factor. Any net whose load exceeds this factor times its driving strength is reported as underdriven.

Minimum Inverter VDD/GND Drive Size, in microns, of the minimum inverter PMOS and NMOS widths. This field is used to determine an appropriate driving inverter for the loaded net. This inverter is suggested to the user as the minimum acceptable driver for any net that is found to be underdriven.

Collapse Buses? This option reports vector nets (buses) together if they share the same load-to-driving-strength ratio, instead of individually. It reduces the textual output and produces fewer, more useful warnings.

Skip Cells The checker does not descend into the cells listed in this field. This option is useful when the schematic design is not yet in its final form and contains empty cells that are not backed by schematic views; such cells should be listed here.

Execution of the fanout checker brings up a text window that contains a list of warnings about underdriven nodes. An example entry is shown below:

------ Cell: msbrclass_ctrl_bot0 Library: dct ------
Nets that are 100% underdriven:

Net ldmaskb is underdriven to Vdd (load = 222.60 drive = 33.60) Consider inv13.2x
Net ldmaskb is underdriven to Gnd (load = 222.60 drive = 16.80) Consider inv13.2x
Net mask is underdriven to Vdd (load = 207.20 drive = 33.60) Consider inv12.3x
Net mask is underdriven to Gnd (load = 207.20 drive = 16.80) Consider inv12.3x
Net zero_mask is underdriven to Gnd (load = 28.00 drive = 1.40) Consider inv1.7x

The drive reported is the gate width of the transistor pulling the node towards the VDD or GND supply. The load reported is the total gate width of all transistors whose gates are directly attached to the net. The checker does not take into account source/drain junction capacitances.
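The arithmetic behind the check is straightforward. The following Python fragment is an illustrative re-implementation only: the actual checker is a SKILL program running inside Cadence, and the net dictionary layout and the inverter-sizing rule shown here are simplifications invented for this sketch.

```python
# Illustrative re-implementation of the fanout check.  The actual checker
# is a SKILL program inside Cadence; the data structures and the
# inverter-sizing rule are assumptions made for this sketch.

def check_fanout(nets, driving_factor=5.0, min_pmos=4.2, min_nmos=2.1):
    """Report nets whose pull-up or pull-down drive is too weak.

    nets maps a net name to:
        load      -- total gate width attached to the net (microns)
        drive_vdd -- total PMOS width pulling the net toward VDD
        drive_gnd -- total NMOS width pulling the net toward GND
    A net is underdriven toward a supply when its load exceeds
    driving_factor times the corresponding drive.  For each offender a
    minimum driver is suggested, scaled up from a minimum-size inverter
    (min_pmos / min_nmos are assumed widths).
    """
    warnings = []
    for name, net in sorted(nets.items()):
        for rail, drive, min_w in (("Vdd", net["drive_vdd"], min_pmos),
                                   ("Gnd", net["drive_gnd"], min_nmos)):
            if net["load"] > driving_factor * drive:
                # Smallest inverter multiple that satisfies the ratio.
                scale = net["load"] / (driving_factor * min_w)
                warnings.append(
                    f"Net {name} is underdriven to {rail} "
                    f"(load = {net['load']:.2f} drive = {drive:.2f}) "
                    f"Consider inv{scale:.1f}x")
    return warnings
```

Note that a node with both a weak pull-up and a weak pull-down is reported twice, once per supply rail, matching the style of the warnings shown above.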

A.6 Physical Design

The physical design (layout) is by far the most time-consuming part of the design process. Both the DCT and IDCT chips contain a large number of full custom layout blocks. In addition, the DCT chip has large random-logic control blocks that have been automatically generated using an in-house standard cell place and route tool. This tool is described in the next section.

A.6.1 Custom Standard Cell Place and Route Tool

The place and route tool was originally developed by Larry Dennison [Den95]. The author has rewritten a number of sections in order to parametrize the design rules and render the tool process independent. Moreover, the router has been completely rewritten to allow more routing freedom and efficiency. The tool has been written entirely in SKILL. While describing the tool, we will use the terms “function” and “procedure” interchangeably to describe SKILL language functions. The starting point is a gate-level schematic. The user first creates a piece of SKILL source code that encapsulates the entire layout generation; we call this SKILL program the “generator”. The user then edits the generator, recording his or her preferred standard cell placement and making some adjustments to the wiring. Finally, the generator is run and the layout is produced. To begin, the user must load the relevant SKILL packages within the Cadence environment. The following must be typed in the Cadence CIW:

CIW> load "mgg3.il"
CIW> load "std.il"

Both SKILL files are located in the main SKILL repository on /yoda. The generator is created when the user types the following command in the Cadence CIW:

(mgg "libname" "cellname")

where “libname” and “cellname” are the library name and cell name of the schematic cell for which we wish to generate layout. The mgg program (the name stands for module generator generator) produces a SKILL source code file ([cellname].il) in the current working directory. This file contains the definition of the generator procedure. The generator starts by defining a toplevel SKILL function (mg) that encapsulates the entire layout process. The structure of the generator function is shown in Figure A.9. First, the layout cellview is created and bound to the local variable cv. Then we perform the leaf cell instantiation in the current cellview (a call to the “instantiate” procedure). Next, the “placement” procedure is called, which places all the previously instantiated cells. After the placement is done, the “channelHorzFind” procedure is called, which

(defun mg (@key (report_length 300.0))
  (let ()
    (setq cv (dbOpenCellViewByType "layout" "maskLayout" "w"))
    (instantiate)
    (placement)
    (setq channels (channelHorzFind cv
                     ?westExtend 13.2 ?eastExtend 13.2
                     ?topChannelHeight 50.0 ?botChannelHeight 50.0))
    (wireVddGnd cv channels)
    (wire_clk)
    (wire_reset)
    (wire_net0)
    ...
    (wire_net1000)))

Figure A.9: Module Generator SKILL Source Code

identifies all the rectangular routing channels given the placement that has just been performed. This procedure returns an array of rectangles (channels[]) that correspond to the channels available for routing tracks between cell rows (Figure A.10). The arguments to “channelHorzFind” are also illustrated in Figure A.10. The next function call (“wireVddGnd”) draws the power and ground rails (horizontal metal1 and vertical metal2) that short the supply inputs of all the cells together and produce a power/ground grid (Figure A.10). All the procedure calls that follow draw individual wires among the cells. There is one procedure call for each wire in the block.

The “instantiate”, “placement” and all the wiring procedures are defined in the same file as the generator function (mg). All other functions (e.g. “channelHorzFind”) are part of the place and route package and are defined once the first two “load” statements have been executed.


Figure A.10: Placement, Routing Channel Identification and Power/Ground Routing

(defun instantiate ()
  (let (nand2 nor2 inv2x inv and2)
    (setq nand2 (dbOpenCellViewByType "stdcells" "nand2" "layout" nil "r"))
    (setq nor2  (dbOpenCellViewByType "stdcells" "nor2"  "layout" nil "r"))
    (setq inv2x (dbOpenCellViewByType "stdcells" "inv2x" "layout" nil "r"))
    (setq inv   (dbOpenCellViewByType "stdcells" "inv"   "layout" nil "r"))
    (setq and2  (dbOpenCellViewByType "stdcells" "and2"  "layout" nil "r"))

    I158 = (dbCreateInst cv and2  "I158" (list 0.0 0.0) "R0")
    I189 = (dbCreateInst cv nor2  "I189" (list 0.0 0.0) "R0")
    I64  = (dbCreateInst cv nand2 "I64"  (list 0.0 0.0) "R0")
    I167 = (dbCreateInst cv inv2x "I167" (list 0.0 0.0) "R0")
    I193 = (dbCreateInst cv inv   "I193" (list 0.0 0.0) "R0")))

Figure A.11: Cell Instantiation Procedure

Before describing placement and routing, we note that the mg procedure may take a single optional argument (report_length [number]). This argument causes the wiring functions to report any routed net that is longer than [number] microns. This functionality is very useful for locating long wires where more buffering may be necessary.

Cell Instantiation

Figure A.11 shows the structure of an example “instantiate” procedure. The first 5 SKILL assignments open the master layout cells and bind them to local variables. The second set of 5 statements instantiates the master cells in the current cellview (cv) using the primitive SKILL dbCreateInst procedure. The “instantiate” function is always generated automatically by mgg and there is never any need for designer input. We should note that this procedure preserves instance names from schematic to layout.

(defun placement ()
  (let ()
    (instsPlaceRow (list I149 I150 I151 I193 I192 I194) 0.0 0.0)))

Figure A.12: Cell Placement Procedure

Cell Placement

The “placement” procedure is where most of the user input is concentrated. Initially, mgg produces a blank placement procedure. The easiest way to specify cell placement is by using the predefined “instsPlaceRow” procedure, as illustrated in Figure A.12. This procedure takes three arguments: a list of instances, an x-coordinate and a y-coordinate. It places all standard cell instances in a row at the specified coordinates. Multiple calls to “instsPlaceRow” produce multiple rows of standard cells, as shown in Figure A.10. Cell placement is the most important design decision in this phase and has a great impact on the subsequent wiring steps.
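The geometry manipulated by “instsPlaceRow” and “channelHorzFind” can be sketched as follows. This Python fragment is only an illustration of the idea: the real procedures operate on Cadence database objects, and the cell dimensions, fixed row height, and channel heights used here are made-up assumptions.

```python
# Illustrative geometry for row placement and horizontal channel finding.
# Cells are (name, width) pairs; all rows are assumed to share one height.

CELL_HEIGHT = 50.0  # assumed standard cell height in microns

def place_row(cells, x, y):
    """Abut cells left-to-right starting at (x, y); return placements."""
    placed, cur_x = [], x
    for name, width in cells:
        placed.append({"name": name, "x": cur_x, "y": y})
        cur_x += width
    return placed

def find_horz_channels(row_ys, bot=50.0, top=50.0):
    """Return the y-ranges of routing channels between cell rows, plus
    one channel below the bottom row and one above the top row (the
    analogue of ?botChannelHeight / ?topChannelHeight)."""
    ys = sorted(row_ys)
    channels = [(ys[0] - bot, ys[0])]                # channels[0]
    for lo, hi in zip(ys, ys[1:]):
        channels.append((lo + CELL_HEIGHT, hi))      # gap between rows
    channels.append((ys[-1] + CELL_HEIGHT,
                     ys[-1] + CELL_HEIGHT + top))    # topmost channel
    return channels
```

With three rows placed at y = 0, 120 and 240, this yields four channels, matching the channels[0] through channels[3] arrangement of Figure A.10.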

Routing

Figure A.13 shows an example net wiring function. The “wireMultiChannel” procedure draws the wiring trace when given the following arguments:

- Current layout cellview

- Routing channel array (computed by “channelHorzFind”)

- Number of routing channels (size of the channel array)

- List of output terminals connected by the current wire (each list item is another two-element list: the first element is the cell instance identifier and the second element is the cell terminal name)

- List of input terminals connected by the current wire (same format as above)

(defun wire_net101 ()
  (let ()
    (wireMultiChannel cv channels num_channels
      (list (list I4 "q"))
      (list (list I24 "A") (list I5 "A0"))
      ?netName "net101")))

Figure A.13: Example Wiring Function

Optional arguments include the net name and flags indicating whether the particular net is a toplevel terminal; if so, “wireMultiChannel” places a pin on the net. If all the cell terminals are found in the same wiring channel (wire0 in Figure A.14), the “wireMultiChannel” procedure finds an empty horizontal metal1 track in the channel using a greedy search, draws it, and also draws the relevant vertical metal2 or poly stubs to connect the cell terminals to the horizontal track. If, on the other hand, the cell terminals span multiple routing channels, the net is broken into multiple subnets that are connected together with vertical metal2 jumpers routed over the cells (wire1 in Figure A.14). The tool guarantees that the vertical metal2 jumpers never interfere with underlying metal2 inside the cells, and all relevant design rules are judiciously enforced. The routing algorithm implemented is rather simple. It is 100% greedy in the sense that when a wiring function is called, the best available location for the trace is found and the trace is placed immediately. The algorithm does not have a global view of all nets and does not try to produce a globally optimal trace allocation. Changing the order in which the wiring functions are called can change the total number of traces required per routing channel. We have found that a few iterations of changing cell placement and wiring order can produce quite acceptable results. Figure A.15 shows an example standard cell logic block generated using the process described. The block shown is part of the MSB rejection control logic of the DCT chip. The block of Figure A.15 was generated by a 3700-line SKILL program.
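The greedy track allocation within a single channel can be illustrated with the following Python sketch. The real router is written in SKILL and additionally draws the metal1 traces, the vertical stubs, and the over-the-cell jumpers; the interval representation of nets used here is an invented simplification.

```python
# Illustrative greedy track allocation for one routing channel.  Each net
# occupies a horizontal span (x1, x2) and is committed to the first track
# whose existing spans it does not overlap, mirroring the "place each
# wire as soon as its wiring function is called" behaviour.

def assign_track(tracks, span):
    """tracks: list of lists of (x1, x2) spans already on each track.
    Returns the index of the track the span was placed on."""
    x1, x2 = min(span), max(span)
    for i, occupied in enumerate(tracks):
        if all(x2 <= a or x1 >= b for a, b in occupied):
            occupied.append((x1, x2))
            return i                     # reuse an existing track
    tracks.append([(x1, x2)])
    return len(tracks) - 1               # open a new track

tracks = []
order_a = [(0, 10), (20, 30), (5, 25)]   # made-up net spans
used = [assign_track(tracks, s) for s in order_a]
```

Because each net is committed to a track the moment its wiring function runs, reordering the calls can change how spans pack onto tracks, which is why iterating on cell placement and wiring order pays off.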

A.7 Extracted Layout Simulation

Figure A.14: Example Generated Wire Traces

Figure A.15: Automatically Generated Random Logic Block

The final simulation phase before layout is a transistor-level simulation that includes wiring parasitic capacitances. This step is necessary to ensure that all nets have adequate driving strength to meet the timing requirements once the layout wiring capacitances are accounted for as well.

A.7.1 Capacitance Extraction

The Cadence design framework contains a sophisticated layout extraction tool that can calculate all possible parasitic capacitances among layout wires when given appropriate rules. The author has found that the best strategy for fastest simulation is to refer all wiring parasitic capacitances to ground. Figure A.16(a) shows the cross section of two parallel wires forming capacitors C1, C2, and C3. Figure A.16(b) shows the equivalent circuit used for simulation. Capacitor C2 is no longer a line-to-line capacitor but has been referred to ground with a Miller multiplicative factor of 2 to account for the worst case in which the two lines move in opposite directions. The reason we do not want line-to-line capacitances in circuit simulations is that capacitor loops (Figure A.16(a)) impose additional circuit constraints and slow down the simulation, sometimes causing convergence problems. It must be noted that an equivalent circuit such as the one shown in Figure A.16(b) cannot catch errors that occur due to line-to-line coupling (e.g. glitching due to coupling on a net that must be glitch free). The designer must identify sensitive wires (e.g. clocks, asynchronous resets) and make sure that they are laid out appropriately, minimizing line-to-line coupling with adjacent nets.
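The substitution of Figure A.16 amounts to a simple transformation on the extracted capacitor list. The Python sketch below is illustrative only; the node naming convention (ground as node "0") and the list format are assumptions, not the actual extractor output.

```python
# Illustrative conversion of extracted capacitances to a ground-referenced
# equivalent: ground caps pass through unchanged, while each line-to-line
# cap C is replaced by a cap of 2C to ground on each of its two nets
# (Miller factor of 2, worst case of the lines switching in opposition).

def ground_reference(caps):
    """caps: list of (node_a, node_b, value); node '0' is ground.
    Returns a dict mapping each node to its total equivalent
    capacitance to ground."""
    total = {}
    for a, b, c in caps:
        if a == "0" or b == "0":
            node = b if a == "0" else a
            total[node] = total.get(node, 0.0) + c
        else:
            total[a] = total.get(a, 0.0) + 2.0 * c
            total[b] = total.get(b, 0.0) + 2.0 * c
    return total
```

For the circuit of Figure A.16, the line-to-line capacitor C2 contributes 2C2 to ground on each of the two wires, while C1 and C3 pass through unchanged.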


Figure A.16: Wiring Parasitic Capacitance Equivalence

A.7.2 Netlist Renaming (rnmNetlist)

Layout extraction produces a flat transistor-level netlist without preserving the net names of the corresponding schematic view. This renaming is rather inconvenient when the designer simulates extracted layout, because net monitoring and error tracking become a problem. We have solved this problem with a perl script that reinstalls the schematic net names in an extracted layout Hspice netlist. The script (“rnmNetlist”) has been developed by Gangadhar Konduri with minor additions and modifications by the author. The script works as follows: the Layout-Versus-Schematic (LVS) verification procedure produces a net cross-reference table after successfully matching the layout and schematic netlists net-by-net and device-by-device. The script utilizes this information to produce an Hspice netlist that has the connectivity of the extracted layout and the original net names of the schematic view. Usage is as follows:

unix> rnmNetlist schem_netlist lay_netlist xref.out hsp_netlist

where schem_netlist and lay_netlist are the schematic and layout LVS netlists, xref.out is the cross-reference file produced by LVS, and hsp_netlist is the Hspice netlist produced from extracted layout that contains the random net names we wish to replace. The script writes the renamed Hspice netlist to the standard output.
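The core renaming step can be sketched in a few lines of Python. The actual rnmNetlist is a perl script, and building the cross-reference map from the LVS output involves parsing that this sketch takes as given; the sample netlist line below is invented for illustration.

```python
# Illustrative net renaming for an extracted netlist: given a map from
# layout (extracted) net names to schematic net names, rewrite each
# whitespace-separated token of the netlist text.  The real rnmNetlist
# additionally builds this map by parsing the LVS cross-reference file.

import re

def rename_nets(netlist_text, xref):
    """xref: dict mapping layout net name -> schematic net name."""
    def sub(match):
        tok = match.group(0)
        return xref.get(tok, tok)
    # Replace whole tokens only, so net '12' does not clobber '123'.
    return re.sub(r"\S+", sub, netlist_text)
```

Applied to a transistor card such as "M1 3 12 0 0 nmos" with a map {"3": "clk", "12": "din0"}, this yields "M1 clk din0 0 0 nmos".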

A.8 Summary and Conclusion

This appendix has described the design flow followed during the development of the DCT and IDCT chips. A number of custom tools have been developed to aid in the process. These tools include a Magic-to-Cadence cell translator, a standard cell layout generator, a net fanout checker, a custom standard cell place and route tool and a netlist processor. Short tutorials on the usage of these tools have been presented.

Appendix B

DCT AC Coefficient Distributions: Test Image PEPPERS

This appendix consists of 63 relative-frequency histograms (dct1.ps through dct63.ps), one for each of the 63 DCT AC coefficients of the test image PEPPERS.

Appendix C

Chip Pinout Information


Pin  Type    Signal Name       Pin  Type    Signal Name       Pin  Type    Signal Name
  1  -       NC                 29  Supply  VLL                57  Supply  VDD
  2  Supply  VLL                30  Supply  VDD                58  Supply  GND
  3  -       NC                 31  Supply  GND                59  Input   din7
  4  Supply  VHH                32  Supply  VHH                60  Supply  VHH
  5  Output  clkout             33  Supply  VLL                61  -       NC
  6  Supply  GND                34  Supply  VDD                62  Supply  VLL
  7  Supply  VDD                35  Supply  GND                63  -       NC
  8  Output  dout9              36  Supply  VHH                64  -       NC
  9  Output  dout8              37  Supply  VLL                65  Input   din8
 10  Output  dout7              38  Supply  VDD                66  -       NC
 11  Output  dout6              39  Supply  GND                67  Input   clkin
 12  Supply  VLL                40  -       NC                 68  Input   din9
 13  Supply  VHH                41  Input   StartBlockIn       69  Input   din10
 14  Output  dout5              42  -       NC                 70  Supply  GND
 15  Output  dout4              43  -       NC                 71  Supply  VDD
 16  Output  dout3              44  Supply  VDD                72  Input   din11
 17  Output  dout2              45  -       NC                 73  Supply  VLL
 18  Supply  GND                46  Supply  GND                74  Supply  VHH
 19  -       NC                 47  Input   reset              75  Supply  GND
 20  Supply  VDD                48  Input   din0               76  Supply  VDD
 21  -       NC                 49  Input   din1               77  Output  StartBlockOut
 22  -       NC                 50  Input   din2               78  Supply  GND
 23  Output  dout1              51  Supply  VHH                79  Supply  VDD
 24  -       NC                 52  Supply  VLL                80  Supply  VLL
 25  Output  dout0              53  Input   din3               81  Supply  VHH
 26  Supply  VDD                54  Input   din4               82  -       NC
 27  Supply  GND                55  Input   din5               83  Supply  GND
 28  Supply  VHH                56  Input   din6               84  -       NC

Table C.1: IDCT Chip Pinout

The 84 package pins are arranged in an 11 x 11 pin grid array (top view).

Figure C.1: IDCT Chip Package Footprint

Pin  Type    Signal Name       Pin  Type    Signal Name       Pin  Type    Signal Name
  1  Supply  GND                35  Supply  GND                69  Supply  GND
  2  Supply  VDD                36  Supply  VDD                70  Supply  VDD
  3  Supply  GND                37  Supply  GND                71  Supply  GND
  4  Supply  VHH                38  Input   TDI                72  Input   StartBlockIn
  5  Supply  GND                39  Input   TCK                73  Supply  GND
  6  Supply  VDD                40  Input   TMS                74  Supply  VDD
  7  Output  LPEout             41  Input   TRST               75  Supply  GND
  8  Output  dout0              42  Supply  VHH                76  -       NC
  9  Output  dout1              43  Supply  GND                77  -       NC
 10  Output  dout2              44  Supply  VDD                78  -       NC
 11  Output  dout3              45  Output  TDO                79  -       NC
 12  Output  dout4              46  -       NC                 80  -       NC
 13  Output  dout5              47  -       NC                 81  Input   clk
 14  Output  dout6              48  -       NC                 82  Input   reset
 15  Output  dout7              49  -       NC                 83  Output  StartBlockOut
 16  Supply  GND                50  -       NC                 84  Supply  GND
 17  Supply  VHH                51  Supply  VDD                85  Supply  VHH
 18  Output  dout8              52  Supply  GND                86  Supply  GND
 19  Output  dout9              53  Supply  GND                87  Supply  VDD
 20  Output  dout10             54  Supply  VHH                88  Supply  GND
 21  Output  dout11             55  Supply  GND                89  Supply  VHH
 22  Supply  GND                56  Supply  VDD                90  Supply  GND
 23  Supply  VHH                57  Supply  GND                91  Supply  VDD
 24  Supply  GND                58  Input   din0               92  Supply  GND
 25  Supply  VDD                59  Input   din1               93  Supply  VDD
 26  -       NC                 60  Input   din2               94  Supply  VDD
 27  -       NC                 61  Input   din3               95  -       NC
 28  -       NC                 62  Input   din4               96  -       NC
 29  -       NC                 63  Input   din5               97  -       NC
 30  -       NC                 64  Input   din6               98  -       NC
 31  -       NC                 65  Input   din7               99  -       NC
 32  Supply  VHH                66  Supply  VDD               100  -       NC
 33  Supply  GND                67  Supply  GND
 34  Supply  VDD                68  Supply  VHH

Table C.2: DCT Chip Pinout

The 100 package pins are arranged in a 13 x 13 pin grid array (top view).

Figure C.2: DCT Chip Package Footprint