Lightweight Protocols for Distributed Systems Thesis for Submission For
Total Page:16
File Type:pdf, Size:1020Kb
-- -- Lightweight Protocols for Distributed Systems Thesis for Submission for PhD J.Crowcroft Department of Computer Science University College London, Gower Street, London WC1E 6BT. -- -- - 2 - ABSTRACT A methodology and architecture have been developed that contrast sharply with common interpretations of the Open Systems Interconnection concept of Layering. The thesis is that while many functionalities are required to support the wide range of services in a Distributed System, no more than two concurrent layers of protocol are required below the application. The rest of the functions required may be speci®ed and implemented as in-line code, or procedure calls. The proliferation of similar functions such as multiplexing and error recovery, through more than one or two layers in the OSI model, contradicts many basic rules of modularisation, and leads to inherent inef®ciency and lack of robustness. Furthermore, it makes the implementation of Open Systems in hardware almost intractable. A key design issue is how to construct the automatic adaptation mechanisms that will allow a protocol to match the service requirements of the user. The arguments in this thesis are based on the experience of the design and implementation of three major classes of protocol for distributed systems: a transaction protocol (ESP), a reliable multicast protocol (MSP), and a Stream Protocol (TCP). A number of performance measurements of these protocols, including the latter, the DoD Transmission Control Protocol, were carried out to support the thesis. Finally, the impact of these protocols and this architecture on the switching fabric are investigated. We examine mechanisms for congestion control and fair sharing of the bandwidth in a large internet connected by datagram routers and MAC Bridges, with a variety of link technologies ranging from terrestrial to satellite. The discussion of Interconnection techniques is included because it is the view of the author that understanding traf®c distribution algorithms is vital to understanding end to end protocol performance. -- -- - 3 - Acknowledgements I would like to thank everyone in my family who encouraged me to ®nally do some work. 1 Thanks are also due to Ben Bacarisse, Mark Riddoch, Karen Paliwoda, Steve Easterbrook, members of the SATNET2 measurements Task Force, Peter Kirstein, Soren Sorenson, Peter Lloyd and Robert Cole for assistance at various points during this research.3 Thesis: Aims and Goals The aim of this research was to develop protocols that are designed and engineered for scalable distributed systems support. To this end, it was a goal to produce an ef®cient transport protocol for one-to-one and one-to-many Remote Procedure Call support that can also be used for conventional (connection-oriented) applications. An important goal is that we understand limits to the performance of such protocols in the Local and Wide Area. Sub-goals included: possible hardware support for transport protocols; link and network level security implications of non-trivial size distributed systems; some understanding of possible formal speci®cation of aspects of the protocols developed. The work described in this thesis was carried out between 1985 and 1988. The author believes that despite the time elapsed since then and the presentation of this work, many of the ®ndings are still highly relevant, especially the protocol designs and performance results. ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ ¡ 1. Thanks especially to the members of the IAB's End-to-End Research Group for the alternative title - the Unbearable Lightness of Protocols (Craig Partridge after Milan Kundera). 2. This work was supported in part by DARPA under Contract Number N00014-86-0092. Any views expressed in this thesis are those of the author alone. 3. Some inspiration was drawn from the writings of Myles na Gopaleen concerning the Research Bureau in the Irish Times: Recently I referred brie¯y to a new type of telephone patented by the Research Bureau. It is designed to meet an urgent social requirement. Nearly everybody likes to have a telephone in the house, not so much for its utility (which is very dubious), but for the social standing it implies. A telephone on display in your house means that you have at least some 'friend' or 'friends' - that there is somebody in the world who thinks it worthwhile to communicate with you. It also suggests that you must 'keep in touch', that great business and municipal undertakings would collapse unless your advice could be obtained at a moments notice. With a telephone in your house you are 'important'. But a real telephone costs far too much and in any event would cause you endless annoyance. To remedy this stay tough affairs the Bureau has devised a selection of bogus telephones. They are entirely self contained, powered by dry batteries and where talk can be heard coming from the instrument, this is done by a tiny photoelectric mechanism. Each instrument is ®tted, of course, with a bogus wire which may be embedded in the wainscoating. The cheapest model is simply a dud instrument. Nothing happens if you pick up the receiver and this model can only be used safely if you are certain that none of the visitors to you house will say "Do you mind if I use your phone?" The next model has this same drawback but it has the advantage that at a given hour every night it begins to ring. Buzz- buzz, buzz-buzz, buzz-buzz. You must be quick to get it before an obliging visitor 'helps' you. You pick up the receiver and say "Who? The Taoiseach? (Irish Prime Minister) Oh, very well. Put him on." The 'conversation' that follows is up to yourself. The more expensive instruments ring and speak and they are designed on the basis that a visitor will answer. In one model a voice says, "Is this the Hammond Lane Foundry?". Your visitor will say no automatically and the instrument will immediately ring off. In another model an urgent female voice says "New York calling So-and-So" - mentioning your name - "Is So-and-So there?" and keeps mechanically repeating this formula after the heartless manner of telephone girls. You spring to the instrument as quickly as possible and close the deal for the purchase of a half share in General Motors. These latter instruments are a bit risky as few householders can be relied upon to avoid fantastically exaggerated conversations. The safest of the lot is a Model 2B, which gives an engaged tone no matter what number is dialled by the innocent visitor. This is dead safe. -- -- - 4 - Contents: Organisation and Layout The thesis is presented in 8 chapters. The core of the work is presented in chapters three, four and ®ve. Here we present a simple transport protocol for RPC support, extensions for one-to-many conversations, and performance measurements of a similar protocol. This meets the primary aim of the thesis. The requirement for such protocols has long been identi®ed, and references are given throughout the text in support of this. The outline of all the chapters is as follows: 1. Distributed Systems Protocol Architectures. In this chapter, we review the services provided by the transmission network and those required by the user. We then review protocols in distributed systems, and ®nally present our Distributed Systems Protocol Architecture, 2. Protocol Implementation Design Methodology In this chapter we review Protocol Design Methodologies, and examine relative layering strategies and overheads. We compare the layering overheads versus ¯exibility and examine Finite State Machine Speci®cations and direct implementation from them. We look at mapping End to End protocols to Bounded Buffers 3. Sequential Exchange Protocol In this chapter we present the design of the Sequential Exchange Protocol, a Transaction Protocol for RPC. We look at the need for Presentation Level Integration, the need to support recursive calls, and then look beyond simple RPC at Messaging and Streaming. This protocol is an alternative to VMTPChe86a and can be used as a suitable improvement on the psuedo transport protocol used on UDP for support of NFS/Sun RPC.Inc86a, Mic89a It is intended primarily as a transaction protocol for RPC, something which has been identi®ed as missing for a decade. We look at how there is a need for Presentation Level Integration. The protocol also supports recursive calls. This chapter also examines what is needed beyond RPC such as Messaging/Streaming used for supporting networked window systems. 4. Multicast. This chapter presents the design of a multicast protocol, which is an enhancement to the transaction protocol presented in Chapter 3. The aim of the protocol is to provide a service equivalent to a sequence of reliable sequential unicasts between a client and a number of servers, whilst using the broadcast nature of some networks to reduce both the order of the number of packets transmitted and the overall time needed to collect replies. Appendix 1 introduces a Distributed Conferencing Program that can make use of multicast, and Appendix 2 outlines possible OSI protocol support for the windowing system used in the conferencing application. 5. Protocol Performance. This chapter describes the tools and measurement mechanisms used to investigate the performance of the Transmission Control Protocol (TCP) over the Atlantic Packet Satellite Network. One of the main tools used was developed by Stephen Easterbrook at UCL, while the other was provided by Van Jacobson of LBL. The relevance of this work is that similar dynamic mechanisms to those used by TCP are also employed in the design of both the Sequential Exchange Protocol and the Multicast protocol described in the previous two chapters. It might be argued that TCP has largely been deployed successfully in workstations on LANs. However, historically it has been designed to function over a wider range of underlying media than any other protocol, and to this end is widely adaptable.