An Optimised and Gneralised Node for Fat Tree Classes-Adamantini

An optimised and generalised node for fat tree classes By Adamantini Peratikou This thesis is submitted in partial fulfilment of the requirements for the award of the degree of Doctor of Philosophy of the University of Portsmouth School of Computing University of Portsmouth Lion Terrace, Portsmouth, Hampshire PO1 3HE, United Kingdom Date: April 2014 ©Copyright 2014 by Adamantini Peratikou ©This copy of this thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with the author and no quotation from the thesis, nor any information derived there from, may be published without the author’s prior, written consent. DECLARATION BY CANDIDATE I hereby declare that whilst registered as a candidate for the above degree, I have not been registered for any other research award. The results and conclusions embodied in this thesis are the work of the named candidate and have not been submitted for any other academic award. Signature: A.Peratikou Date: 04/04/14 Word count: 31059 (Without appendices) Total word count: 37972 Abstract Fat tree topologies have been extensively used as interconnection networks for high performance parallel systems, with their most recent variants able to easily extend and scale to accommodate different system sizes and requirements. While each progressive and evolved fat-tree topology includes some extra advancements compared to the original ones, these topologies do not fully address or resolve the issues that large scales systems face. We propose an extended Zoned node, architecture as an alternative to conventional fat trees, and other variants. The extension relates to the provision of extra links to balance the number of routing switches, and hence increases the bisection bandwidth, and also to the extra layers that provide an inherent fault tolerance. In this work we emphasize on controlled power consumption, managed network complexity, faster message transmission, lower latency and higher throughput. These features are desirable for high performance parallel systems. We will show through semantics that our zoned node specifies most variants of the fat trees topologies such as k-ary n-tree and XGFT. We will also show that a replication of several zoned nodes into super nodes takes a broader view of complex interconnections such as dragonfly and PERCS. We propose a generic source routing algorithm that we call in this thesis “the single- sliced-addressing” that works for all the classes of the zoned nodes, and we will prove through analysis and simulation that it outperforms the previous routing scheme deployed in some variant fat tree topologies. Most variants of fat-tree topologies do not address optimisation issues. In this work, we develop an optimising system that identifies parameters and components of the zoned node that lead to an optimized architecture. The optimization process is achieved based on minimising the overall cost of the network that is directly related to its complexity and therefore proportional to the relative power, which serves as the objective function that is minimised based on the traffic constraints to maintain a lower delay and a higher throughput. The simulation results show that the extracted optimised zone node performs well under various load conditions and traffic patterns compared to non-optimised variants of fat tree topologies. i I dedicate this thesis to my parents Yiannos and Skevi , for their unconditional love and for guiding me to where I am today. ii Acknowledgements There are a number of people without whom this thesis might not have been written, and to whom I am greatly indebted. I apologize that I cannot acknowledge everyone by name here. Special thanks to my supervisor Dr. Mo Adda for the enormous help he offered me in order to overcome all the difficulties, which I faced during the accomplishment of this research. I admire his vast knowledge and enthusiasm in parallel processing field. Dr. Mo is the funniest advisor and one of the smartest people I know. I hope that I could be as lively, enthusiastic, and energetic as him. I would also like to thank him for his enormous patience, as I am a hard to handle stubborn individual. I am very grateful to my parents Yiannos and Skevi for their constant support and for and for teaching me to aim high. Without their love, encouragement and generous financial assistance i would not have been able to finish this work. I cannot find enough words to thank them and whatever i say will not be enough. I would also like to thank my boyfriend Marinos who helped me all these years in his own way, and for constantly cheering me up, my sister Mary who kept bugging me with her own problems and concerns, making me forget my stress. I am also thankful to Thanos for listening to me constantly complaining, and for the various insightful discussions we had. I would also like to thank my fellow colleagues for their advice and encouragement. I also appreciate all my friends and family members who stood by my side all these years. iii Table of Contents Abstract ................................................................................................................................................. i Acknowledgements .............................................................................................................................. iii Table of Contents ................................................................................................................................. iv List of Tables ..................................................................................................................................... viii List of figures ....................................................................................................................................... ix Abbreviations and Symbols ................................................................................................................ xii Disseminations ................................................................................................................................... xiv CHAPTER 1. INTRODUCTION .......................................................................................................... 1 1.1. Background and motivation ................................................................................................ 1 1.2. Approach, original contribution and benefits of this work ................................................. 3 1.3. Objectives and thesis outline .............................................................................................. 4 CHAPTER 2. LITERATURE REVIEW .................................................................................................. 6 2.1. Background information and terminology .......................................................................... 6 2.1.1. Parallel processing ...................................................................................................... 6 2.1.1.1. Instruction level parallelism .................................................................................... 7 2.1.1.2. Thread level parallelism .......................................................................................... 7 2.1.1.3. Processor types ....................................................................................................... 7 2.1.2. Parallel architectures .................................................................................................. 7 2.1.2.1. Shared memory ....................................................................................................... 7 2.1.2.2. Distributed memory ................................................................................................ 7 2.1.2.3. Hybrid-distributed shared memory ........................................................................ 8 2.1.2.4. Interconnection networks for parallelism .............................................................. 8 2.1.3. Network characteristics .............................................................................................. 9 2.1.3.1. Network diameter ................................................................................................... 9 2.1.3.2. Node degree ........................................................................................................... 9 2.1.3.3. Latency. ................................................................................................................... 9 2.1.3.4. Queuing delay. ........................................................................................................ 9 2.1.3.5. Processing delay. ................................................................................................... 10 2.1.3.6. Transmission delay. ............................................................................................... 10 2.1.3.7. Network congestion. ............................................................................................. 10 2.1.3.8. Throughput. .......................................................................................................... 10 2.1.3.9. Bisection width. .................................................................................................... 10 2.1.3.10. Path diversity. ................................................................................................... 10 2.1.3.11. Cost. .................................................................................................................

An Optimised and Gneralised Node for Fat Tree Classes-Adamantini

Appendix E Final-Draft.Fm Page 1 Friday, July 14, 2006 10:23 AM

Testing the Limits of Tapered Fat Tree Networks

On-Line Reconfigurable Extended Generalized Fat Tree Network-On-Chip for Multiprocessor System-On-Chip Circuits

Parallel Computing MIMD Machine (I)

Download File

Machine CM-5 Technical Summary

Energy-Efficient Interconnection Networks for High-Performance Computing

Chapter 1: Parallel Computing

Research Report Trace-Driven Co-Simulation Of

The RACE Network Architecture

Designing Hybrid Data Center Networks for High Performance and Fault Tolerance

Interconnection Networks for Parallel Computers 1613