Adaptive Distributed Architectures for Future Semiconductor Technologies

by Andrea Pellegrini

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Computer Science and Engineering) in The University of Michigan 2013

Doctoral Committee:
Associate Professor Valeria M. Bertacco, Chair
Professor Todd M. Austin
Professor Scott Mahlke
Adjunct Associate Professor Silvio Savarese

© Andrea Pellegrini
All Rights Reserved
2013

To my family and friends.

Acknowledgments

I would like to thank my advisor Professor Valeria Bertacco, who supported and guided me throughout most of my years at Michigan. I cherished the freedom she offered me to explore and develop new ideas. Moreover, my writing and presentation skills have grown significantly as a result of her mentoring and attention.

I am also grateful to my committee members. Professor Todd Austin has been a consistent source of inspiration throughout most of my research, and much credit goes to him for sowing the seeds of much of this dissertation. I would also like to acknowledge Professor Scott Mahlke for his technical insights and our friendly interactions. Last, but not least, I would like to thank Professor Silvio Savarese, who has been a dear friend for many years, and is leaving Ann Arbor at the same time as I am departing.

This dissertation would not have been possible without a number of people, who supported and helped me during my time in graduate school. I am thankful to have met some truly amazing people: Jason Clemons, Drew DeOrio, Joseph Greathouse, and Ilya Wagner. You have always been there, no matter the day or the time, whenever I was in need. Thank you so much for being such wonderful friends.

A special acknowledgement goes to all the students with whom I shared my office.
I feel lucky to have been able to share all my frustrations and successes with you: William Arthur, Adam Bauserman, Kai-hui Chang, Debapriya Chatterjee, Kypros Constantinides, Shamik Ganguly, Jeff Hao, Chang-Hong Hsu, Andrew Jones, Doowon Lee, Rawan Khalek, Shashank Mahajan, Biruk Mammo, Andreas Moustakas, Ritesh Parikh, Robert Perricone, David Ramos, Steve Plaza, Shobana Sudhakar, and Dan Zhang.

A number of people contributed to making my experience at Michigan wonderful. It is people like you that make the CSE department one of the best places in the world for doing research: Professor Kevin Compton, Professor John Hayes, Professor Benjamin Kuipers, Professor Igor Markov, Professor Trevor Mudge, Professor Marios Papaefthymiou, Professor Karem Sakallah, Professor Tom Wenisch, Denise DuPrie, Laura Fink, Dawn Freysinger, Lauri Johnson-Rafalski, Stephen Reger, Timur Alperovich, Bashar Al-Rawi, Matt Burgess, Michael Cieslak, Hector Garcia, Valeria Garro, Anthony Gutierrez, Jin Hu, Hadi Katebi, Anoushe Jamshidi, Daya Khudia, Dave Meisner, Steven Pelley, Joseph Pusdesris, Jarrod Roy, Korey Sewell, Sara Vinco, Chien-Chih Yu, and Ken Zick.

Life in Ann Arbor would have been a lot less fun without Luca Cian, who never failed to make our evenings unique – I will surely miss you, my friend. I cannot count the fabulous memories I shared with Matt Burgess and Rick Hollander; you guys will always be in my heart. Thank you, Scott DeOrio, for being an amazing person. Thanks to Joe DeMatio, Mike Dushane, and Donny Nordlicht for letting me share a piece of your motoring Nirvana. Thanks also to a number of dear friends for their support: Filippo Bozzato, Guido Bozzetto, Sandro Colussi, Enrico Furlanetto, Flavio Griggio, Eric Hall, Alberto Lago, Paolo Leschiutta, German Martinez, Jessica Morton, Zamir Pomare’, Ilario Scian, Scott Stuart, Fabio Tioni, Sever Tita, Silvia Tita, Mario Tomasini, Valeria Vavassori, and Massimiliano Zilli.
I would like to thank my family for all their love and encouragement: my mom Maria Assunta, my dad Adriano, my granddad Mario, my uncle Mario, my aunt Silva, my siblings Tommaso, Maria Diletta, Maria Grazia, and Giacomo. I would also like to thank three families that took pride in me and welcomed me as their own: the DeOrios, the Gileses, and the Nashleys. I hope one day I will be able to give to others as much as you gave to me.

Finally, I would like to thank Natalia Amari for her endless support in the best and worst of times. Her unconditional love and support helped me to overcome the most daunting adversities and challenges.

Preface

As silicon technology continues to scale down transistor size, fundamental characteristics of this logic component are dramatically changing. Computer architects have grown accustomed to relying on a limited number of robust transistors; hence, they have focused on building machines that could achieve the best performance within a given transistor budget. Unfortunately, transistor features are shifting and future manufacturing processes are expected to integrate a massive number of extremely fragile components. This change imposes challenges that current computer architectures cannot overcome. Indeed, current microprocessor designs are not fit to work in these new technology scenarios and are restricted by their inability to handle component failures, their inability to manage specialized hardware components, and their lack of design modularity. This thesis develops a number of solutions for reliable and adaptable computing, culminating in an original, distributed architecture that incorporates all these techniques to deliver better reliability and unlock increasing efficiency from the higher transistor density expected with future technologies.

The waning reliability of future transistors is the first concern.
The consequences of this phenomenon are twofold: lower production yields due to higher rates of manufacturing defects and failures in the field. Neglecting runtime hardware faults can have dangerous consequences, as they could lead to service disruptions and corrupt program output.

Since core counts continue to increase, despite the stagnating energy efficiency of processors, researchers foresee that soon only a small fraction of a chip will be able to operate at full throttle. This leads to another issue stemming from future semiconductor technologies: computer architectures must accommodate a large number of hardware functional units and automatically adapt software execution to the subset of components that can be powered up based on the performance requirements of the application.

Finally, yet another challenge is the scarce design modularity of modern microprocessors. Current processors are large, monolithic systems. To provide design flexibility, future designs should be composable, meaning that their components can be organized in various combinations to satisfy specific user requests.

Solutions enabling the development of such architectures can allow future systems to keep benefiting from the technological and economic advantages forecasted by Moore’s law. However, in order to maintain these advantages, and thus continue to improve digital systems, we need to revolutionize how computers are designed.

This thesis first investigates the limitations of current processor designs, with particular focus on analyzing the consequences of unreliable components. Performing accurate fault analysis with traditional techniques is a tedious and time-consuming task. We directly address this issue with a complete framework to efficiently evaluate hardware malfunctions. The studies performed with this infrastructure expose the weaknesses in handling runtime failures, not only of current designs, but of state-of-the-art research solutions as well.
These analyses also provide a deep understanding of how applications execute on hardware systems. Programs rarely stress all hardware units uniformly and microprocessor utilization varies greatly, even within the execution of a single application. Furthermore, long portions of a program often rely only on a few components. These observations guided the development of a low-cost adaptive reliability technique used to diagnose faulty components in a broad range of architectures, including those proposed in the latter part of this thesis.

We then propose to provide adaptability to hardware designs. An adaptable hardware system can alter itself to dynamically match software demands, environmental characteristics, and physical defects. To provide this characteristic, we develop a distributed infrastructure that can be integrated in modern multiprocessors, systems-on-chip and distributed systems alike. Chip components dynamically exchange information about their condition and utilization. This information is collected by a distributed software infrastructure, which reconfigures the design to match hardware functionalities and application needs without relying on a central manager.

Finally, to overcome the lack of design modularity, we propose a fully modular architecture. Our design organizes its hardware into a reconfigurable fabric of small, stateless modules. Each module can accomplish one or more services towards the execution of a portion of a program. Such a design greatly simplifies hardware organization, since each module is autonomous, and the number of available service providers does not affect the operations of the rest of the system. Thanks to its modularity and flexibility, this architecture can achieve unprecedented reliability and adaptability. In our experiments, we found that these advantages are available at a moderate performance and power cost. Hence, it has the potential to reduce overall engineering costs of a system while
