Optimizing Hardware Granularity in Parallel Systems
Thomas Kelly
PhD The University of Edinburgh 1995
Li Declaration
This thesis has been composed by me and is, except where indicated, my own work.
Thomas Kelly. November, 1995 Abstract
In order for parallel architectures to be of any use at all in providing superior performance to uniprocessors, the benefits of splitting the workload among several processing elements must outweigh the overheads associated with this "divide and conquer" strategy.
Whether or not this is the case depends on the nature of the algorithm and on the cost: performance functions associated with the real computer hardware available at a given time.
This thesis is an investigation into the tradeoff of grain of hardware versus speed of hardware, in an attempt to show how the optimal hardware parallelism can be assessed.
A model is developed of the execution time T of an algorithm on a machine as a function of the number of nodes, N. The model is used to examine the degree to which it is possible to obtain an optimal value of N, corresponding to minimum execution time.
Specifically, the optimization is done assuming a particular base architecture, an algorithm or class thereof and an overall hardware cost.
Two base architectures and algorithm types are considered, corresponding to two common classes of parallel architectures: a shared memory multiprocessor and a message-passing multicomputer. The former is represented by a simple shared-bus multiprocessor in which each processing element performs operations on data stored in a global shared store. The second type is represented by a two- dimensional mesh-connected multicomputer. In this type of system all memory is considered private and data sharing is carried out using "messages" explicitly passed among the PEs. Acknowledgements
The pursuit of this research has introduced me to many people to whom I am indebted. First, I am grateful to my employer, Motorola Ltd., which supported this work in its entirety. In particular, I want to thank John Barr, Jim Ritchie and Chip Shanley; without their support I would not have been able to complete my work. Their encouragement, and patience, will always be appreciated.
In the Department of Computer Science at Edinburgh I found a good deal of support, and a rich atmosphere of study. To my supervisors Professor Roland Ibbet, and David Rees I am especially grateful. I also learned much from fellow students of whom Tim Harris, Neil MacDonald and Mike Norman were outstand- ing.
I also had the opportunity to spend time at the Department of Computing Sci- ence in the University of Glasgow. There, especially, did I begin the change of mind required to move from engineering to science. Advice and encouragement were gratefully received from John O'Donnell, Professor Simon Peyton-Jones, Mary Sheeran, Rob Sutherland, Pauline Haddow and Hans Stadtler. A particular place in my thanks is reserved for Mohamed Ould-Khaoua whose own persistence as well as encouragement and friendship were and are appreciated.
So to my family, closest to my heart, and the ultimate reason for any of my endeavours. They, my three girls, have encouraged, supported and sustained me most. And my parents; the work presented here rests upon the base with which they provided me.
Finally, I reserve the greatest thanks till last. For me, the period of study covered by this thesis was an apprenticeship in thought. Much as I am glad for the increase in knowledge I have gained in various aspects of Computing Science, far dearer to me was the opening of my eyes to thought itself, and the receipt of courage to ask, and ask, and ask. And in that respect I had the honour to learn from Lewis MacKenzie at the University of Glasgow. Through his mind, bested only by his humility, I finally learned really to begin to think. This thesis is dedicated to him, teacher and friend. For Lewis M. MacKenzie "The herring are flying high tonight"
Table of Contents