support for high-speed networking.

Item Type text; Dissertation-Reproduction (electronic)

Authors Druschel, Peter

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.


Link to Item http://hdl.handle.net/10150/186828

INFORMATION TO USERS

This manuscript has been reproduced from the microfilm master. UMI films the text directly from the original or copy submitted. Thus, some thesis and dissertation copies are in typewriter face, while others may be from any type of computer printer.

The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleedthrough, substandard margins, and improper alignment can adversely affect reproduction.

In the unlikely event that the author did not send UMI a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion.

Oversize materials (e.g., maps, drawings, charts) are reproduced by sectioning the original, beginning at the upper left-hand corner and continuing from left to right in equal sections with small overlaps. Each original is also photographed in one exposure and is included in reduced form at the back of the book.

Photographs included in the original manuscript have been reproduced xerographically in this copy. Higher quality 6" x 9" black and white photographic prints are available for any photographs or illustrations appearing in this copy for an additional charge. Contact UMI directly to order.

UMI, University Microfilms International, A Bell & Howell Information Company, 300 North Zeeb Road, Ann Arbor, MI 48106-1346 USA, 313/761-4700, 800/521-0600

Order Number 9506962

Operating system support for high-speed networking

Druschel, Peter, Ph.D.

The University of Arizona, 1994

Copyright © 1994 by Druschel, Peter. All rights reserved.

UMI, 300 N. Zeeb Rd., Ann Arbor, MI 48106

OPERATING SYSTEM SUPPORT FOR HIGH-SPEED NETWORKING

by

Peter Druschel

Copyright © Peter Druschel 1994

A Dissertation Submitted to the Faculty of the
DEPARTMENT OF COMPUTER SCIENCE
In Partial Fulfillment of the Requirements
For the Degree of
DOCTOR OF PHILOSOPHY
In the Graduate College
THE UNIVERSITY OF ARIZONA

1994

THE UNIVERSITY OF ARIZONA
GRADUATE COLLEGE

As members of the Final Examination Committee, we certify that we have read the dissertation prepared by Peter Druschel entitled Operating System Support for High-Speed Networking

and recommend that it be accepted as fulfilling the dissertation requirement for the Degree of Doctor of Philosophy


Final approval and acceptance of this dissertation is contingent upon the candidate's submission of the final copy of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

STATEMENT BY AUTHOR

This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under the rules of the Library. Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the copyright holder.

SIGNED: [signature]

ACKNOWLEDGMENTS

First and foremost, I wish to thank Larry Peterson. He has been the ideal advisor, one who allowed me to pursue my own directions, while providing me with the optimism, experience and insight I often lacked. I owe thanks to Norm Hutchinson for his support and the many exciting debates we had during and after his time at Arizona. Thanks also to my other committee members, Rick Schlichting and Greg Andrews, for their support and encouragement. The entire Computer Science department at Arizona deserves credit for creating and maintaining a superb environment in which to work and be productive. Special thanks to Cara Wallace and Margaret Newman for their patience, to Jana Zeff for getting me into the program six years ago, and to Wendy Swartz for helping me get out. I will miss the many enlightening discussions and Kippy's luncheons with Mark Abbott, Lawrence Brakmo, David Mosberger, Erich Nahum, Sean O'Malley, Michael Pagels, and Charlie Turner. I am thankful to Bruce Davie and Bellcore for making the OSIRIS boards available to me, and for the opportunity to engage in a fruitful and pleasant research cooperation. Special thanks to Bruce for his support and all the fun we had. As the real brain behind this work, my cat Nene deserves special consideration. His help with GigaMetrics was indispensable. This work was supported by the National Science Foundation under Grant CCR-9102040, by ARPA under Contract DABT63-91-C-0030, and by Sun Microsystems, Inc.

TABLE OF CONTENTS

LIST OF FIGURES

ABSTRACT

CHAPTER 1: Introduction
1.1 The Memory Bandwidth Bottleneck
1.2 Server-Based Operating System Structure
1.3 Thesis Statement and Contributions
1.4 Dissertation Overview

CHAPTER 2: The Operating System I/O Bottleneck
2.1 Memory Performance
2.2 Effectiveness of Cache Memories
2.3 Avoiding Main Memory Traffic
2.3.1 Device-Memory Transfers
2.3.2 Cross-Domain Data Transfers
2.3.3 Data Manipulations
2.3.4 Buffer Management
2.3.5 Application Programming Interface
2.4 End-to-End Design
2.4.1 Hardware Streaming
2.4.2 DMA-DMA Streaming
2.4.3 OS Kernel Streaming
2.4.4 User-Level Streaming
2.4.5 Need for Integration
2.5 A Buffer Abstraction for an Integrated Data Path
2.5.1 The IOData Abstract Data Type
2.5.2 Implementation
2.6 Concluding Remarks

CHAPTER 3: OS Support for a High-Speed Network Adaptor
3.1 Experimental Hardware and Software
3.2 Host/Board Communication
3.2.1 Shared Data Structure
3.2.2 Interrupts
3.3 Physical Buffer Fragmentation
3.4 Cache Coherence
3.5 Page Wiring
3.6 DMA Length
3.7 DMA versus PIO
3.8 New OS Mechanisms
3.8.1 Fast Buffers
3.8.2 Application Device Channels
3.9 Performance
3.10 Concluding Remarks

CHAPTER 4: Decoupling Modularity and Protection
4.1 Motivation
4.1.1 Why Separate Modularity and Protection?
4.1.2 How To Separate Modularity and Protection
4.2 Architectural Model
4.3 Object Invocation Mechanism
4.3.1 Design Issues
4.3.2 Object References and Local Invocation
4.3.3 Non-Local Invocation
4.4 Performance
4.5 Discussion
4.5.1 Modularity and Protection
4.5.2 Interface Design
4.5.3 Object Systems
4.6 Related Work
4.7 Concluding Remarks

CHAPTER 5: High-Bandwidth Cross-Domain Data Transfer
5.1 Motivation
5.2 Characterizing Network I/O
5.2.1 Networks and Buffers
5.2.2 Allocating Buffers
5.2.3 Accessing Buffers
5.2.4 Summary of Requirements
5.3 Related Work
5.3.1 Page Remapping
5.3.2 Shared Memory
5.4 Design
5.4.1 Basic Mechanism
5.4.2 Optimizations
5.5 Implementation
5.5.1 Allocation
5.5.2 Transfer
5.5.3 Deallocation
5.5.4 Domain Termination
5.5.5 Exceptional Cases
5.6 Performance
5.7 Discussion
5.7.1 How Many Domains?
5.7.2 Characteristics of Network I/O Revisited
5.7.3 Architectural Considerations
5.7.4 Relationship to Other VM Systems
5.8 Concluding Remarks

CHAPTER 6: Application Device Channels
6.1 Motivation
6.2 Design
6.3 Implementation
6.4 Related Work
6.5 Performance
6.6 Summary

CHAPTER 7: Conclusion
7.1 Contributions and Limitations
7.2 Future Directions

REFERENCES

LIST OF FIGURES

1.1 Block Diagram of a DEC 3000/600 AXP

2.1 Implementation of Messages in the x-kernel
2.2 Various Forms of Data Streaming
2.3 Dependencies among copy-avoiding techniques

3.1 Architectural Overview of OSIRIS
3.2 PDU Buffer Fragmentation
3.3 DEC 5000/200 UDP/IP/OSIRIS Receive Side Throughput
3.4 DEC 3000/600 UDP/IP/OSIRIS Receive Side Throughput
3.5 UDP/IP/OSIRIS Transmit Side Throughput

4.1 Lipto Architecture
4.2 Proxy Objects and RPC

5.1 Layers Distributed over Multiple Protection Domains
5.2 Aggregate Object
5.3 Throughput of a single domain boundary crossing
5.4 Throughput of a UDP/IP local loopback test
5.5 UDP/IP end-to-end throughput using cached, volatile fbufs
5.6 UDP/IP end-to-end throughput using uncached, non-volatile fbufs

6.1 Application-OS-Hardware Interaction
6.2 Application Device Channel

ABSTRACT

The advent of high-speed networks may soon increase the network bandwidth available to workstation-class computers by two orders of magnitude. Combined with the dramatic increase in microprocessor speed, these technological advances make possible new kinds of applications, such as multimedia and parallel computing on networks of workstations. At the same time, the operating system, in its role as mediator and multiplexor of computing resources, is threatening to become a bottleneck. The underlying cause is that main memory performance has not kept up with the growth of CPU and I/O speed, thus opening a bandwidth gap between CPU and main memory, while closing the old gap between main memory and I/O. Current operating systems fail to properly take into account the performance characteristics of the memory subsystem. The trend towards server-based operating systems exacerbates this problem, since a modular OS structure tends to increase pressure on the memory system. This dissertation is concerned with the I/O bottleneck in operating systems, with particular focus on high-speed networking. We start by identifying the causes of this bottleneck, which are rooted in a mismatch of operating system behavior with the performance characteristics of modern computer hardware. Then, traditional approaches to supporting I/O in operating systems are re-evaluated in light of current hardware performance tradeoffs. This re-evaluation gives rise to a set of novel techniques that eliminate the I/O bottleneck.

CHAPTER 1 Introduction

Emerging network technologies such as fiber-optic transmission facilities and Asynchronous Transfer Mode (ATM) hold the promise of delivering data rates approaching 1 Gb/s between individual workstations on local and wide-area networks¹ [IEE90, C+93b]. This order-of-magnitude increase in network capacity, combined with the explosive growth in microprocessor performance, will enable a range of innovative new applications of distributed computing. Distributed multimedia, including real-time audio and video, and supercomputing on clusters of workstations are examples of such emerging applications. One important factor that could dramatically influence the success of these new technologies is the degree to which operating systems can make these networking resources available to application programs. The role of an operating system (OS) is to mediate and multiplex the access of multiple application programs to the computing resources provided by the underlying hardware. Ideally, the operating system should not itself consume a significant share of these resources. Unfortunately, current operating systems are threatening to become the bottleneck in delivering input/output (I/O) data streams to application programs at high rates [CT90, DWB+93, ST93, Ram93]. In particular, data streams between applications on hosts connected by high-speed networks suffer bandwidth degradation and added latency due to the operating system running on the hosts. This dissertation is concerned with the I/O bottleneck in operating systems, with particular focus on high-speed networking. We start by identifying the causes of this bottleneck, which are rooted in a mismatch of operating system behavior with the performance characteristics of modern computer hardware. Then, traditional approaches to supporting I/O in operating systems are re-evaluated in light of current hardware performance tradeoffs. This re-evaluation gives rise to a set of novel techniques that eliminate the I/O bottleneck.

¹Throughout this work, we use the convention that 1 Gb and 1 Mb equal 10⁹ and 10⁶ bits, respectively, and 1 MB = 2²⁰ bytes.

1.1 The Memory Bandwidth Bottleneck

The root cause of the OS I/O bottleneck is that speed improvements of main memory have lagged behind those of the central processing unit (CPU) and I/O devices during the past decade [CW92, HP90]. In state-of-the-art computer systems, the bandwidth of main memory is orders of magnitude lower than the bandwidth of the CPU, and the bandwidths of the fastest I/O devices approach that of main memory². The previously existing gap between memory and I/O bandwidth has almost closed, and a wide gap has opened between CPU and memory bandwidth, leaving memory as a potential bottleneck. To bridge the gap between CPU and memory speed in modern computers, system designers employ sophisticated cache systems. A cache exploits locality of reference in memory accesses to reduce main memory traffic. Locality of reference is the property of a sequence of memory accesses to reference preferentially those memory locations that were either accessed recently (temporal locality), or that are close to recently accessed locations in the address space (spatial locality). A cache is a high-speed memory that holds a recently accessed subset of the data stored in main memory. When the CPU accesses a main memory location for which the cache holds a copy, no main memory access is necessary, and the operation can complete at the speed of the cache memory. A cache reduces main memory traffic and lowers the average memory access latency experienced by the CPU. The effectiveness of the cache in bridging the CPU/memory speed gap depends on the degree of locality in the memory accesses of the executed program. Figure 1.1 depicts the block diagram of a DEC 3000/600 AXP system. The performance characteristics of this machine are typical of currently available high-end desktop workstations. Observe that the available bandwidth on the path between CPU and main memory is less than twice the bandwidth of the I/O bus. Thus, in order to deliver the bandwidth of a high-speed I/O device to an application that consumes the data, care must be taken to ensure that the data does not travel the data path between the cache and main memory more than once. Stated differently, a desirable scenario is one in which the data received from a fast input device is brought into the cache once and remains there until it is consumed.

²We define CPU bandwidth as the maximal sustained rate, in bytes per second, at which the CPU can absorb data; memory bandwidth as the sustained rate at which the CPU can read data from main memory; and I/O bandwidth as the sustained rate at which data can be transferred to and from I/O devices.

[Figure 1.1 (block diagram): CPU, cache, main memory, and I/O bus; labeled bandwidths include 560 MB/s, 91 MB/s, and 114 MB/s (I/O bus)]

Figure 1.1: Block Diagram of a DEC 3000/600 AXP

The cache system can achieve this scenario as long as the set of CPU accesses to the data has sufficient locality, relative to the size and associativity of the cache. Unfortunately, in most current systems the accesses to I/O data buffers generated by the operating system and applications do not have sufficient locality to allow the cache system to minimize memory traffic. As a result, excess memory traffic causes a substantial drop in I/O performance, i.e., throughput and latency. This poor locality is caused by:

• data movements (copying),

• inappropriate scheduling of the various I/O processing steps, and

• a system structure that requires OS kernel involvement in all I/O activity.

Moving or copying data from one main memory location to another causes poor access locality and thus increased memory traffic. Unnecessary data copying occurs in current systems due to a lack of integration in the design of I/O adaptors, buffer management schemes, interfaces, and mechanisms for the transfer of data across protection domain boundaries. Locality can suffer further because various I/O processing steps and their associated data accesses occur in the context of multiple, separately scheduled threads of control

(e.g., interrupt handlers, kernel threads, application processes). In a multiprogrammed environment, these processing steps may not be scheduled to execute in strict succession. That is, the processing steps of a data unit may be interleaved with the execution of unrelated tasks, each with its own distinct set of memory references. Thus, accesses to a

particular I/O data unit are temporally separated, resulting in poor data access locality³. Finally, current systems require that the operating system kernel be involved in each individual I/O operation that an application initiates. Thus, I/O requires a processor reallocation to switch between application and operating system, and the transfer of data across the user/kernel protection boundary. Both entail a drop in memory access locality, which can limit I/O bandwidth and significantly contribute to I/O latency. In summary, limited memory bandwidth in modern computer systems is a potential source of performance problems. Cache systems can hide the slow speed of main memory only when the memory accesses generated by a program have good locality of reference. Accesses to I/O data generated by operating systems and applications tend to have poor locality, rendering the cache ineffective in avoiding the memory bottleneck in processing network I/O.
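As a rough quantification of this constraint (the numbers are only ballpark figures, drawn from Figure 1.1 and from the measurements reported in Chapter 2, and used here purely for illustration): if each byte of an I/O stream crosses the data path between cache and main memory k times, the bandwidth that can be delivered to the application is at most

    \[
      B_{\mathrm{delivered}} \;\le\; \frac{B_{\mathrm{mem}}}{k},
      \qquad \text{e.g.,} \qquad
      \frac{\approx 100\ \mathrm{MB/s}}{2} \;=\; 50\ \mathrm{MB/s},
    \]

so even a single extra pass over the data (k = 2, as caused by one copy) caps throughput well below the bandwidth of the I/O bus shown in Figure 1.1.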

1.2 Server-Based Operating System Structure

The problem of poor memory access locality during I/O operations is aggravated by the recent trend in operating system design towards a modular structure, where different parts of the operating system reside in separate protection domains [ABB+86, RAA+88, Hil92]. In these systems, I/O requests and associated data may have to cross additional protection boundaries and scheduling points between application programs and I/O devices. Studies have shown that reduced memory access locality in these systems has a negative impact on overall system performance [CB93], and can have a strong impact on networking performance [MB93b]. Thus, operating system designers are facing a dilemma. Growing concern for ease of distribution, validation, configuration, and maintenance in the face of increasingly complex operating systems strongly suggests more modular architectures, such as microkernel-based (server-based) systems. On the other hand, server-based operating systems appear to make it even more difficult to achieve high-performance I/O than traditional, monolithically structured systems.

³An additional problem can occur on shared-memory multiprocessors, when not all processing steps are scheduled to run on the same processor.

1.3 Thesis Statement and Contributions

The hypothesis of this dissertation is that the I/O bottleneck in current operating systems is caused by the lack of a coordinated design of OS software and device adaptor hardware that minimizes main memory traffic. This coordinated design must encompass the entire I/O data path, from application programming interfaces (API), to OS architecture, OS facilities for buffer management and data transfer across protection boundaries, to I/O device adaptors. The goal of this dissertation is to remove the I/O bottleneck, without sacrificing modularity in the structure of operating system and applications. Towards this goal, the contributions here are a set of novel techniques that are part of a coordinated design to minimize main memory traffic:

• A new structuring approach for operating systems that allows fine-grained modularity without the performance penalty of an equally fine-grained protection structure;

• a new OS facility for the management and transfer of I/O data buffers across protection domain boundaries;

• a new OS facility that allows applications to bypass the OS kernel in common I/O operations, while leaving control over the device in the hands of the operating system; and

• a set of techniques that reduce host processing overhead and achieve high performance in driving a high-speed network adaptor.

These new techniques result from a re-evaluation of traditional approaches to supporting I/O in operating systems. In particular, they are based on the following key observations.

Memory bandwidth is a scarce resource. The traditional approach to performance-oriented system design is to focus on minimizing the number of CPU instructions executed during an operation. The characteristics of modern computer systems with

their fast CPUs, limited memory bandwidth, and fast I/O devices make it necessary to focus instead on minimizing main memory traffic.

Communication is a common operation. Current operating system designs require OS kernel involvement in each I/O operation. This approach is well suited for applications that spend most of their time operating on data stored in main memory, and that perform occasional, coarse-grained I/O operations. The kinds of distributed applications that are enabled by high-speed networking are likely to perform frequent, fine-grained I/O, making communication a common operation. Thus, operating systems should be optimized accordingly and should implement common I/O operations without requiring kernel involvement.

This dissertation focuses on the primary problem of preserving the bandwidth and latency of high-speed networks at the application level. The advent of high-speed networking and the applications that it enables also pose a number of secondary problems. For example, live audio and video applications require that the operating system schedules resources in such a way that the real-time constraints of continuous media are met. This issue is not addressed here, but is receiving attention by a host of other researchers [AH91, JH93, J,f92, RR93, LKG92].

1.4 Dissertation Overview

Chapter 2 reviews the relevant performance characteristics of modern computer systems (i.e., memory and cache behavior) in more detail. It describes the various causes of excess memory traffic in current systems and discusses techniques found in the literature to avoid these inefficiencies. We show that these techniques, when applied naively, can fail to reduce memory traffic significantly. Instead, it is necessary to coordinate their application as part of an end-to-end design. The chapter concludes by proposing an abstract data type that encapsulates I/O data buffers along the entire data path through a host system. This abstract data type can serve as the centerpiece of a coordinated design for an efficient I/O data path. We begin Chapter 3 by describing the experimental hardware and software platforms used in the research that led to this dissertation. Then, we report on a number of low-level performance problems that were encountered in driving a very high-speed network

adaptor connected to a modern workstation, and we present novel techniques to deal with the problems. The chapter concludes with a detailed performance study of network performance achieved between pairs of workstations connected by an experimental 622 Mb/s fiber-optic network. Chapter 4 presents a new approach to structuring an operating system that affords fine-grained modularity without the performance cost of an equally fine-grained protection structure. In particular, this approach avoids the tradeoff between a modular OS architecture and high-performance I/O, which governs conventional server-based operating systems. The implementation of a prototype kernel based on this approach is described, and a performance study is presented. In Chapter 5 we present a novel OS facility, called fast buffers (fbufs), for the management and transfer of I/O data buffers across protection domain boundaries, both in monolithic and server-based operating systems. The design and implementation of this facility is described in detail. A performance study shows that fbufs allow the transfer of large network messages across multiple protection domain boundaries without a loss in throughput. Chapter 6 introduces an innovative OS facility that takes advantage of user-level network protocols and limited support from the network adaptor. It gives applications direct but controlled access to the network adaptor, which significantly reduces network message latency. This facility, called application device channels (ADCs), leaves control of the network adaptor in the hands of the operating system, thus allowing transparent sharing of the device among multiple, non-privileged application programs. Finally, Chapter 7 summarizes the contributions, points out limitations, and identifies future directions of research.

CHAPTER 2 The Operating System I/O Bottleneck

This chapter presents an overview of the hardware and software issues relevant to I/O performance in operating systems. We show that the memory bandwidth of modern, general-purpose computer systems is within the same order of magnitude as the bandwidth of emerging high-speed networks. Furthermore, we present evidence that this situation will persist for the foreseeable future, because cost considerations prevent the use of dramatically faster main memory and interconnect technology in this class of machine. As a consequence, operating systems must be designed to minimize main memory traffic that results from network and other I/O activity. Specifically, the number of trips that network data takes across the CPU-memory data path must be minimized in order to avoid a bottleneck. We proceed to survey several techniques from the literature that can be applied to this problem. An important lesson we learn from this survey is that naively applying these techniques is not sufficient for achieving good application-to-application throughput. Instead, it is necessary to integrate the entire data path, from a source device, through the operating system, application programs, and possibly to a sink device. The chapter concludes by outlining a fully integrated data path that optimizes end-to-end throughput. Throughout this chapter, we include in our analysis the most general case of a microkernel-based system, where device drivers, network protocols, and application software all potentially reside in different protection domains. This consideration is well justified in light of the advantages of server-based systems (e.g., configurability, distributability, portability, and so on), as well as the recent trend towards such systems in commercial offerings.

2.1 Memory Performance

This section analyzes the performance of main memory in modern, general-purpose computer systems. We present measurements of main memory bandwidth in several commercially available desktop workstations, and discuss the constraints faced by designers of main memory subsystems.

MB/s, sustained     Copy   Read   Write
IBM RS/6000 340       51     81      73
Sun SS10/40           26     42      39
HP 9000/735           31     71      52
DEC 3000/600          52    109     111

Table 2.1: CPU-Memory Bandwidth of several Workstations

The conclusion is that main memory bandwidth is likely to remain within the same order of magnitude as I/O bandwidth for the foreseeable future. Table 2.1 shows, for each of four desktop workstations, the results of several memory bandwidth measurements described below. The CPU/memory bandwidth numbers reported were obtained using a simple benchmark that measures the sustained bandwidth during a series of read, write, and copy operations. The benchmark was coded in C. No attempt was made to tune the program for a particular machine. The C code was compiled in each case with two different compilers: the vendor's compiler, and the GNU project's gcc compiler [Sta93]. All appropriate compiler optimizations were turned on. The read and write columns were measured by reading (writing) each element of an array. The copy column was measured using two methods: element-wise assignment of two arrays, and invoking the bcopy() library function. The benchmark used arrays of type int (32 bits) and double (64 bits). The numbers reported are the best that could be achieved in each case; i.e., GNU compiler versus vendor's compiler, int versus double for the read and write columns, and int versus double versus bcopy() for the copy column. Finally, the size and alignment of the arrays were chosen to eliminate effects of the data cache. The main result is that the measured memory bandwidth of current workstations is on the order of 100 MB/s. Note that somewhat better results could likely be obtained using machine-specific, hand-optimized machine code, since the compiler-generated machine code, and the vendor-provided implementation of the bcopy() function, may be less than optimal for the intended task of the benchmark program. We have deliberately not tried to eliminate this factor, since it equally affects real programs.
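For concreteness, the following is a minimal sketch of a benchmark of this kind; the array size, element type, and timing method are illustrative choices and not the exact program used to produce Table 2.1.

    /* Minimal sketch of a memory-bandwidth benchmark of the kind described
     * above; illustrative only.  Arrays are sized well beyond the data cache. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    #define N (8 * 1024 * 1024)            /* 8M ints = 32 MB */

    static double mbps(size_t bytes, clock_t ticks)
    {
        return (bytes / (1024.0 * 1024.0)) / ((double)ticks / CLOCKS_PER_SEC);
    }

    int main(void)
    {
        int *src = malloc(N * sizeof *src);
        int *dst = malloc(N * sizeof *dst);
        if (!src || !dst) return 1;
        for (size_t i = 0; i < N; i++) src[i] = (int)i;   /* touch all pages first */

        volatile unsigned sink = 0;        /* keeps the read loop from being optimized away */
        unsigned acc = 0;
        clock_t t;

        t = clock();                                       /* read: load every element */
        for (size_t i = 0; i < N; i++) acc += (unsigned)src[i];
        sink = acc;
        printf("read : %6.1f MB/s\n", mbps(N * sizeof *src, clock() - t));

        t = clock();                                       /* write: store every element */
        for (size_t i = 0; i < N; i++) dst[i] = (int)i;
        printf("write: %6.1f MB/s\n", mbps(N * sizeof *dst, clock() - t));

        t = clock();                                       /* copy: element-wise assignment */
        for (size_t i = 0; i < N; i++) dst[i] = src[i];
        printf("copy : %6.1f MB/s\n", mbps(N * sizeof *src, clock() - t));

        t = clock();                                       /* copy via the C library (bcopy/memcpy) */
        memcpy(dst, src, N * sizeof *src);
        printf("bcopy: %6.1f MB/s\n", mbps(N * sizeof *src, clock() - t));

        free(src); free(dst);
        return 0;
    }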

An inspection of the generated code for some of the machines convinced us that bandwidth improvements with hand-optimized code would not exceed 50%. An experimental ATM high-speed network adaptor described in Chapter 3 can simultaneously transmit and receive on a pair of 622 Mb/s fiber-optic links. Adjusting for ATM cell overhead, the adaptor can generate an aggregate I/O bandwidth of 134 MB/s (a reconstruction of this figure is sketched at the end of this section). Clearly, the memory bandwidth of current workstations is within the same order of magnitude as the bandwidth of this network adaptor. Network adaptors with such high bandwidth are not yet commercially available. One could conjecture that by the time such networks are on the market, main memory bandwidth will have improved to the point where it is no longer a bottleneck. In the following, we argue that main memory bandwidth in next-generation workstations is not likely to enjoy order-of-magnitude increases in bandwidth. Cost and density make Dynamic Random Access Memory (DRAM) the technology of choice for the main memory system in general-purpose computer systems. DRAM performance is mainly characterized by its access latency, i.e., the time it takes, once an address is presented to a DRAM chip, for the corresponding data to become available. While DRAM density has improved dramatically in recent years, access latency has improved at a rate of only 7% per year [HP90]. Data is transferred between cache and main memory using fixed-size cache blocks. The main memory bandwidth available to the CPU is bounded by the ratio of the cache block size and the time it takes to read or write a cache block. This time consists of the memory access latency, plus the transfer time for a cache block. No dramatic improvements in memory access latency can be expected in the near future, due to the slow speed improvements of DRAMs. Techniques exist to increase the transfer rate, but an increase in transfer rate alone cannot deliver major increases in memory bandwidth because latency would increasingly dominate the transfer time for a fixed-size cache block. Increases in data transfer rate must be combined with an increase in cache block size to be effective. Unfortunately, the cache block size cannot be increased arbitrarily without affecting the hit rate of the cache system [Prz90]. Several recently announced components integrate some form of a cache with a dynamic RAM to reduce the average access latency [CW92]. These integrated second-level caches use large cache lines and are connected to the DRAM using wide data paths. As for any cache, the hit rate of these components depends on locality of reference. As we shall discuss in Section 2.2, accesses to I/O data exhibit poor locality. Furthermore, it is unclear whether the industry will accept the higher cost of these components. While substantial improvements in main memory bandwidth can be expected in general-purpose computer systems, it is unlikely that these improvements will out-pace the increases in I/O bandwidth by an order of magnitude. Thus, we conclude that main memory bandwidth in this class of machines will remain within the same order of magnitude as network bandwidth for the foreseeable future.
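The 134 MB/s figure quoted earlier in this section can be reconstructed as follows; the accounting assumes the usual 48 data bytes per 53-byte ATM cell and the convention 1 MB = 2²⁰ bytes stated in Chapter 1:

    \[
      2 \times 622\ \mathrm{Mb/s} \times \frac{48}{53}
      \;\approx\; 1127 \times 10^{6}\ \mathrm{b/s}
      \;=\; \frac{1127 \times 10^{6}}{8 \times 2^{20}}\ \mathrm{MB/s}
      \;\approx\; 134\ \mathrm{MB/s}.
    \]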

2.2 Effectiveness of Cache Memories

Modern computer systems employ caches to bridge the gap between CPU and main memory speeds. The idea is to keep a recently accessed portion of the main memory's data in a fast memory. Whenever a data item referenced by the CPU is found in the cache, no main memory access is necessary and the operation can complete at the speed of the cache memory. A cache increases system performance by reducing the average access latency for data and instructions. It also reduces contention for main memory access between processor(s) and DMA-based I/O devices. The effectiveness of a cache depends on the size and organization of the cache, and on locality in the CPU's memory accesses. In general, an application program that processes I/O data causes the CPU to load from, and possibly store to, memory every word of a data unit, potentially multiple times. (The next section identifies reasons why these data accesses might occur.) This section argues that data caches are not effective in eliminating the corresponding main memory accesses. For the sake of this argument, we limit our attention to accesses of the actual I/O data, and ignore references to CPU instructions and other data that occur during an application's processing. For emerging high-bandwidth applications such as continuous media, accesses to I/O data arguably account for a significant portion of total execution time. In any case, we have already shown that unless care is taken, the I/O data accesses alone can cause a bottleneck. Consider the following factors, ordered roughly according to importance.

Processor Scheduling: CPU scheduling may cause the execution of other programs to be interleaved with the processing of a data unit. By the time processing resumes, cached portions of the data unit have most likely been replaced. On a multiprocessor, the processing may even be resumed by a different CPU with its own data cache. There are a number of situations where scheduling occurs during the processing of a data unit. When the data unit is passed on to another thread (i.e., queued), a processor must be scheduled to run that thread. Queuing typically occurs at the user/kernel boundary, in certain protocols, and between the device driver's interrupt handler and the top half of the driver. In the worst case, additional queuing may occur between protocol layers. Moreover, hardware interrupts and the events they signal trigger processor rescheduling.

Cache Size: Cache memories, particularly fast on-chip caches, are limited in size. In practice, their effective size is further reduced due to the limited associativity of direct-mapped and set-associative organizations [Prz90]. For data to remain cached during a data operation that involves loading every word and storing it in a different location, the cache must be at least twice the size of the data unit (a rough bound is sketched after these factors). In practice, cache size requirements are further increased by accesses to program variables during and between data manipulations.

Cache Organization: Caches that are virtually indexed and tagged do not require a virtual-to-physical address translation to access cached data. With this approach, cached data from virtually shared pages cannot remain valid across protection domain boundaries. In other words, data must be re-loaded after a switch into a different protection domain, even if the data buffers are shared between the domains. Physically tagged caches do not have this problem. They require, however, that a Translation Lookaside Buffer (TLB) entry be active for the page that contains the referenced data. Network data buffers tend to be scattered across memory pages, which increases the demand for TLB entries. The resulting contention for TLB entries can add substantial costs to the access of cached data [PDP93].

Cache Write Policy: Uniprocessor data caches often employ a write-through policy, meaning that every store operation requires a write to main memory. Write buffers are typically used with write-through caches to decouple the CPU from the timing of memory writes. However, many consecutive writes, such as would occur when reading and writing every word of a data unit, can cause the CPU to stall on store instructions.
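The cache-size requirement noted under Cache Size above can be stated as a rough bound; the working-set term W is an informal addition for illustration, not a figure from the text. A pass that loads a data unit of size U and stores it to a different location touches 2U bytes of distinct data, so it can remain cache-resident only if

    \[
      C_{\mathrm{effective}} \;\gtrsim\; 2U + W,
    \]

where C_effective is the cache size discounted for limited associativity and W accounts for program variables touched during processing; copying a 64 KB data unit, for example, already requires well over 128 KB of effective cache.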

To quantify the effectiveness of the data cache in handling network data, Pagels [PDP93] conducted experiments that measure the amount of network data resident in

the cache after OS processing. This cache residency value is a measure of the potential benefit obtained by a user process due to the caching of network data, thereby avoiding CPU/memory transfers. It provides an upper bound on the effectiveness of the cache when network data is processed without copying. Both monolithic and microkernel-based operating systems are included in this study. The results obtained in these measurements confirm that caches are not effective in eliminating main memory traffic associated with the accesses of network data. In summary, we have shown that the main memory bandwidth in modern, general-purpose computer systems is within the same order of magnitude as the bandwidth of emerging high-speed I/O devices such as ATM network adaptors. In order to sustain the bandwidth of these I/O devices, it is critical to minimize the main memory traffic caused by the processing of I/O data. Furthermore, the cache systems used in modern computers to bridge the gap between CPU and memory speed are not effective in eliminating main memory traffic caused by accesses to network data.

2.3 Avoiding Main Memory Traffic

The two previous sections document our argument that the CPU-memory bandwidth of workstations is within the same order of magnitude as the network bandwidth, and that data caches cannot eliminate main memory traffic caused by the processing of network I/O. It follows, therefore, that in order to preserve the bandwidth on the data path from a network device through the OS and application, and possibly to a sink device (e.g., a display), multiple accesses of the data stored in main memory must be avoided. Each of the following subsections identifies a potential main memory access along this path, briefly describes one or more techniques for efficiently handling or avoiding it, and discusses the assumptions and limitations of these techniques.

2.3.1 Device-Memory Transfers

Data must be moved between main memory and network/device adaptors. The techniques most commonly used are Direct Memory Access (DMA) and Programmed Input/Output (PIO). DMA allows an I/O adaptor to transfer data directly from/to main memory, without involving the CPU. PIO requires the processor to transfer individual data words (typically 32 bits) between main memory and the I/O adaptor in a programmed loop.

With DMA, it is generally possible to transfer large blocks of data in a single bus transaction, thereby achieving transfer rates close to the limits of main memory and I/O bus. The data transfer can proceed concurrently with activity by the processor(s), although contention for main memory access may induce processor stalls during periods of heavy DMA traffic. On the downside, DMA requires some complexity in the device adaptors. In many machines, data caches and main memory may not be coherent with respect to DMA transfers. That is, after a DMA transfer from an I/O adaptor to main memory (DMA write), the cache may contain stale data. Consequently, the appropriate portions of the data cache must be invalidated by system software. Conversely, a write-back cache may have to be explicitly flushed prior to a DMA read operation. With PIO, the CPU is occupied during transfers from/to a device. Often, only a small fraction of the peak I/O bandwidth is achieved, for the following reasons. The adaptor's control and data ports can either be mapped as cacheable or non-cacheable memory locations. In the non-cacheable case, each load/store instruction results in a short (i.e., one machine word) I/O bus transaction, resulting in poor sustained bandwidth. In the cacheable case, I/O bus transfers occur at cache line length. The resulting bandwidth is improved, but may still be far below the peak I/O bandwidth. Moreover, it is again necessary to flush the data cache to maintain consistency with the adaptor's ports. Nevertheless, there are situations where PIO can be preferable over DMA. First, computations on the data that occur in the kernel, such as checksum calculations, can sometimes be integrated with the PIO data movement, saving a main memory access. Second, after a programmed data movement from an I/O adaptor to main memory, the data is in the cache. This can result in reduced memory traffic if the data is accessed again while it remains in the cache. A scatter-gather capability in DMA-based I/O adaptors is important for reducing memory traffic. Scattering allows an incoming data unit to be placed into multiple, non-contiguous portions of main memory. Gathering allows an outgoing data unit to be collected from multiple, non-contiguous buffers. Scatter-gather allows DMA transfers from/to non-contiguous physical page frames, which greatly simplifies physical memory management in the operating system, and helps avoid copying data into contiguous storage. With PIO, scatter-gather can be trivially implemented in software. Network adaptors may support packet demultiplexing prior to data transfer to main memory, allowing filtering and selective placement of data units in memory.

In the simplest case, the adaptor allows the host to peek at a network packet's header. The host makes the demultiplexing decision and initiates the data transfer to the appropriate location in main memory, using DMA or PIO. More elaborate adaptors such as the ones described in Chapter 3 and in [KC88, BPP92] can be programmed by the host CPU to recognize network packets by their headers, and place them into appropriate memory locations using DMA without host intervention. This feature proves to be very useful in enabling OS facilities that can reduce main memory traffic (see Chapters 5 and 6).
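To make the scatter-gather support concrete, the following sketch builds a gather list for a data unit stored in two non-contiguous fragments; the descriptor layout and the virt_to_phys/cache_writeback helpers are hypothetical stand-ins, not the interface of any particular adaptor or operating system.

    /* Hypothetical gather-DMA setup over a fragmented data unit. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    struct frag { const void *base; size_t len; };   /* one physically contiguous piece */

    struct dma_desc {
        uint64_t phys;        /* physical address of the fragment */
        uint32_t len;         /* fragment length in bytes */
        uint32_t last;        /* set on the final fragment of the data unit */
    };

    /* Stand-ins for kernel services. */
    static uint64_t virt_to_phys(const void *va) { return (uint64_t)(uintptr_t)va; }
    static void cache_writeback(const void *va, size_t len) { (void)va; (void)len; }

    /* Build a gather list so the adaptor can collect the outgoing data unit
     * directly from its scattered fragments, with no software copy. */
    static size_t build_gather_list(struct dma_desc *d, const struct frag *f, size_t n)
    {
        for (size_t i = 0; i < n; i++) {
            cache_writeback(f[i].base, f[i].len);    /* flush before the DMA read */
            d[i].phys = virt_to_phys(f[i].base);
            d[i].len  = (uint32_t)f[i].len;
            d[i].last = (i + 1 == n);
        }
        return n;
    }

    int main(void)
    {
        static char hdr[40], payload[1460];          /* e.g., protocol headers + user data */
        struct frag frags[2] = { { hdr, sizeof hdr }, { payload, sizeof payload } };
        struct dma_desc ring[2];
        size_t n = build_gather_list(ring, frags, 2);
        printf("gather list: %zu descriptors, %u + %u bytes\n", n, ring[0].len, ring[1].len);
        return 0;
    }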

2.3.2 Cross-Domain Data Transfers

Protection in operating systems necessitates the transfer of data across protection domain (address space) boundaries. In the simplest case, an I/O data stream is used by a single application process running on top of a conventional monolithic operating system kernel. Here, I/O data must cross a single user/kernel boundary. In general, additional user processes, such as window managers and multimedia servers, and the OS servers of microkernel-based operating systems, may introduce additional domain boundary crossings into the I/O data path. Software data copying as a means of transferring data across domain boundaries exacerbates the memory bottleneck problem. A number of techniques exist that rely on the virtual memory system to provide copy-free cross-domain data transfer. Virtual page remapping [Che88, TA91] unmaps the pages containing data units from the sending domain and maps them into the receiving domain. With virtual copying (copy-on-write) [FR86], the sending and receiving domains share a single copy of the transferred pages, and physical copying occurs only when one of the domains attempts to modify the shared data unit. Shared virtual memory [SB90] employs buffers that are statically shared among two or more domains to avoid data transfers. Virtual page remapping has move rather than copy semantics, which limits its utility to situations where the sender needs no further access to the transferred data. Virtual copying has copy semantics, but it can only avoid physical copying when the data is not written by either the sender or the receiver after the transfer. Both techniques require careful implementation to achieve low latency. The time it takes to switch to supervisor mode, acquire necessary locks on VM data structures, change VM mappings (perhaps

at several levels) for each page, perform TLB/cache consistency actions, and return to user mode poses a limit on the achievable performance. Measurements we performed on a DECstation 5000/200 suggest that page remapping is not fast enough to sustain the bandwidth of a high-speed network adaptor (see Section 5.3.1). Another complication arises from the fact that both techniques work at the granularity of the VM page size. A mismatch between data unit size and VM page size implies that a portion of the last page overlapped by the data unit will remain unused. For reasons of data privacy, the kernel must clear (e.g., fill with zeroes) this portion of a newly allocated buffer page, which can incur a substantial cost, relative to the overall overhead of buffer management and cross-domain data transfer. A similar problem occurs if headers at the front of a received network packet contain sensitive data that must be hidden from user processes. The partial use of memory pages also requires more pages per amount of network data, resulting in increased physical memory consumption, page allocation and remap overhead, and demand for TLB entries. Shared virtual memory avoids data transfer and its associated costs altogether. However, its use may compromise protection and security between the sharing protection domains. Since sharing is static (a particular page is always accessible in the same set of domains), a priori knowledge of all the recipients of a data unit is required. A novel technique, called fbufs (introduced in Chapter 5), combines page remapping with shared virtual memory, and exploits locality in network traffic to overcome some of the shortcomings of either technique.

2.3.3 Data Manipulations

Data manipulations are computations that inspect and possibly modify every word of data in a network data unit. Examples include encryption, compression, error detection/correction, presentation conversion, and application-specific computations. Data manipulations can be performed in hardware or software. Hardware support for data manipulations can reduce CPU load, and when properly integrated, reduce memory traffic. For certain manipulations like video (de)compression, hardware support may be necessary in the short term, due to the computational complexity of the task. To rely on hardware for all data manipulations, however, seems too constraining for innovative high-bandwidth applications.

Software data manipulations typically access the data independently from each other, since they are generally part of distinct program modules that reside in separate protection domains (OS kernel, servers, application). The resulting memory references have poor locality, and thus require that data be loaded from (and possibly stored back to) main memory. The slow main memory accesses contribute significantly to the time required to complete the data manipulation, and thus limit the bandwidth. Memory references resulting from data manipulations can be minimized through Integrated Layer Processing (ILP) [CT90, AP93]. ILP is a technique for implementing communication software that avoids repeated memory references when several data manipulations are performed. The data manipulation steps from different protocol implementations are combined into a pipeline. A word of data is loaded into a register, then manipulated by multiple data manipulation stages while it remains in a register, then finally stored, all before the next word of data is processed. In this way, a combined series of data manipulations only references memory once, instead of potentially accessing memory once per distinct layer. A detailed performance study demonstrates that integration can have a significant impact on performance [AP93]. On the other hand, a major limitation of ILP is that data manipulations performed by program modules in different protection domains cannot be easily integrated.
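The following sketch illustrates the difference between a layered implementation, in which each data manipulation makes its own pass over the data, and an integrated loop in the spirit of ILP; the two manipulations chosen here, a simple XOR checksum and a 32-bit byte swap, are illustrative and not taken from the text.

    /* Layered versus integrated data manipulation (ILP-style), illustrative only. */
    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    #define WORDS (4u * 1024 * 1024)
    static uint32_t buf[WORDS];

    static uint32_t bswap32(uint32_t w)
    {
        return (w >> 24) | ((w >> 8) & 0xff00u) | ((w << 8) & 0xff0000u) | (w << 24);
    }

    /* Layered: the data is loaded from memory twice, once per manipulation. */
    static uint32_t layered(uint32_t *p, size_t n)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i < n; i++) sum ^= p[i];          /* pass 1: checksum   */
        for (size_t i = 0; i < n; i++) p[i] = bswap32(p[i]); /* pass 2: byte swap  */
        return sum;
    }

    /* Integrated: each word is loaded once, manipulated in registers, stored once. */
    static uint32_t integrated(uint32_t *p, size_t n)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i < n; i++) {
            uint32_t w = p[i];        /* single load                     */
            sum ^= w;                 /* stage 1, while w is in a register */
            p[i] = bswap32(w);        /* stage 2, single store            */
        }
        return sum;
    }

    int main(void)
    {
        for (size_t i = 0; i < WORDS; i++) buf[i] = (uint32_t)i * 2654435761u;
        uint32_t a = integrated(buf, WORDS);
        for (size_t i = 0; i < WORDS; i++) buf[i] = (uint32_t)i * 2654435761u;  /* reset */
        uint32_t b = layered(buf, WORDS);
        printf("results agree: %s\n", (a == b) ? "yes" : "no");
        return 0;
    }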

2.3.4 Buffer Management

Buffer editing, which we distinguish from data manipulations that require the inspection and/or modification of each word of data, can be expressed as a combination of operations to create, share, clip, split, concatenate, and destroy buffers. When naively implemented, these operations may require physical copying of the buffers. A buffer manager that employs lazy evaluation of buffers to implement the aforementioned primitives can facilitate copy-free buffer editing. The manager provides an abstract data type that represents the abstraction of a single, contiguous buffer. An instance of this abstract buffer type might be stored in memory as a sequence of not necessarily contiguous fragments. For example, the x-kernel [HP91] represents messages by a directed acyclic graph (DAG), where the leaves correspond to buffers, and the leftmost leaf can have information (e.g., headers) prepended to the front. Figure 2.1 depicts x-kernel messages.

Figure 2.1: Implementation of Messages in the x-kernel
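The following is a minimal sketch of such a buffer abstraction: a message is represented as a list of fragments, headers are prepended into reserved headroom, and concatenation links fragment lists instead of copying data. The flat fragment list, the function names, and the headroom scheme are simplifications invented for illustration; they are not the x-kernel's DAG implementation, and sharing (reference counts), clip, and split are omitted.

    /* Minimal sketch of a lazily evaluated buffer abstraction; illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct frag {
        char  *data;                 /* start of valid data within the underlying buffer */
        size_t len;                  /* number of valid bytes */
        struct frag *next;
    };

    struct msg { struct frag *head, *tail; };

    /* Allocate a payload fragment with 'headroom' spare bytes in front of the data. */
    static struct msg msg_new(const void *payload, size_t len, size_t headroom)
    {
        struct frag *f = malloc(sizeof *f);          /* error handling omitted for brevity */
        char *raw = malloc(headroom + len);
        f->data = raw + headroom;
        memcpy(f->data, payload, len);
        f->len  = len;
        f->next = NULL;
        return (struct msg){ f, f };
    }

    /* Prepend a header without copying the payload: grow into the headroom. */
    static void msg_push(struct msg *m, const void *hdr, size_t hlen)
    {
        m->head->data -= hlen;                       /* assumes sufficient headroom remains */
        m->head->len  += hlen;
        memcpy(m->head->data, hdr, hlen);
    }

    /* Concatenate two messages by linking fragment lists; no payload bytes move. */
    static void msg_join(struct msg *a, const struct msg *b)
    {
        a->tail->next = b->head;
        a->tail = b->tail;
    }

    int main(void)
    {
        struct msg m = msg_new("user data", 9, 64);  /* 64 bytes of headroom for headers */
        struct msg t = msg_new("trailer", 7, 0);
        msg_push(&m, "HDR:", 4);                     /* e.g., a transport header */
        msg_join(&m, &t);

        size_t total = 0, nfrags = 0;
        for (const struct frag *f = m.head; f; f = f->next) { total += f->len; nfrags++; }
        printf("message holds %zu bytes in %zu fragments\n", total, nfrags);
        return 0;
    }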

Since buffer editing occurs frequently in network protocol implementations, buffer managers are used in the network subsystem of many operating systems. The scope of these managers is restricted to a single protection domain, typically the kernel. In most systems, a software copy into a contiguous buffer is necessary when a data unit crosses a protection domain boundary.

2.3.5 Application Programming Interface

The application programming interface (API) defines the services and operations that an operating system provides to application programs. In particular, it defines the syntax and semantics of the operations (system calls) exported by the OS. I/O data buffers appear in this definition as arguments to I/O operations. The argument passing semantics for the I/O operations defined by this interface can have a significant impact on the efficiency of I/O data transfers between OS kernel and applications. Consider, for example, the UNIX read() and write() system calls [UNI89]. These operations specify a pointer argument that refers to the address of a contiguous buffer in the application's address space, and an integer argument that represents the size of the buffer. Application programs may choose an address anywhere within their populated address space, with no restrictions on the size and alignment of the buffer. The semantics of these operations specify that data is copied from and to the application buffer during the operation. That is, during a read system call, the application buffer is overwritten

with the input data. After a write operation completes, it is assumed that output data has been copied from the application buffer, and the application is free to reuse (modify) the buffer. The low-level representation of data buffers and the semantics of the read and write operations make it difficult for an implementation to avoid physical copying of data. First, all virtual memory (VM) based techniques for cross-domain data transfer operate at the granularity of a VM page. If the user buffer's first and last addresses are not page aligned, the system must copy portions of the first and last page overlapped by the buffer. Depending on the length of the user buffer, the portion of a buffer that must be copied can be significant. Second, the semantics of the write system call permit the user process to modify (reuse) the user buffer immediately after the call returns. If this happens, the system must either copy the affected page after all, or block the user process until the operating system is done with the output data. The latter approach may degrade the application's effective I/O bandwidth even though it avoids copying, because it prevents overlapping of I/O and application processing. Third, the read and write system calls specify a single, contiguous data buffer. Data that arrives from a network device is often scattered in main memory, due to the reassembly of fragmented network packets. If a read operation specifies an amount of data that spans several fragments, copying of data is unavoidable. In summary, there are three problems with the UNIX read and write system calls, as they relate to avoiding data copying. They allow data buffers with arbitrary alignment and length, they require contiguous data buffers, and they have copy semantics. A proposal for an API that lends itself to efficient data transfer is presented in Section 2.5.
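To make the alignment problem concrete, the following computes, for a hypothetical read() buffer, how many bytes could at best be transferred by page remapping and how many must still be copied because they fall in partially covered pages; the page size and buffer address are illustrative values.

    /* Partially covered pages under a VM-based transfer; illustrative values. */
    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SIZE 4096u

    int main(void)
    {
        uintptr_t addr = 0x1000F00;   /* hypothetical user buffer, not page aligned */
        size_t    len  = 32 * 1024;   /* 32 KB read() request */

        uintptr_t first_full = (addr + PAGE_SIZE - 1) & ~(uintptr_t)(PAGE_SIZE - 1);
        uintptr_t end        = addr + len;
        uintptr_t last_full  = end & ~(uintptr_t)(PAGE_SIZE - 1);

        size_t head = (size_t)(first_full - addr);   /* bytes before the first full page */
        size_t tail = (size_t)(end - last_full);     /* bytes after the last full page   */
        size_t remappable = (last_full > first_full) ? (size_t)(last_full - first_full) : 0;

        printf("copy %zu + %zu bytes, remap %zu bytes (%zu pages)\n",
               head, tail, remappable, remappable / PAGE_SIZE);
        return 0;
    }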

2.4 End-to-End Design

This section gives an overview of the design space for an I/O subsystem that minimizes main memory traffic along the entire data path from source to sink device. We focus on several representative sample points in the design space. For each sample point, we discuss tradeoffs, determine the optimal data path, and select appropriate implementation techniques to achieve the optimal data path. Throughout this section, we refer to the block diagrams in Figure 2.2 to depict the data flow from source to sink device. In the architectural model underlying these diagrams, the CPU and cache are connected to main memory by a dedicated memory bus. A bus converter ties the memory bus to a separate I/O bus, to which all devices are attached.

[Figure 2.2, panels (a), (b), and (c)]

Figure 2.2: Various Forms of Data Streaming

2.4.1 Hardware Streaming

One approach to avoiding the main memory bottleneck is to remove both the CPU and main memory from the data path. The data path is set up and controlled by software, but the data itself is transferred directly from source to sink device, bypassing main memory (Figure 2.2, (a)). Both device adaptors and the I/O bus must support peer-to-peer transfers, and the adaptors must be able to perform demultiplexing and any necessary data format conversions. This method can be characterized by a lack of integration with the host computer system. Since application programs have no access to the data, they are constrained by the functionality provided by the adaptor boards. The result is a tradeoff between complexity in the device adaptors on one hand, and flexibility and functionality on the other. Adaptors that support a fixed set of capabilities offer little room for innovation to applications. An alternative is for the adaptor to provide components that can be programmed by applications. This can take the form of special-purpose processors, or programmable circuitry. Unfortunately, this approach has problems of its own. First, it is difficult for applications to use programmable device adaptors in a portable way. Second, at any given

2.4.2 DMA-DMA Streaming

A second approach is for data to be transferred using DMA from a source device to buffers in main memory, where some buffer editing may occur under control of the operating system, and then, the data is transferred to a sink device using DMA (Figure 2.2, (b)). The CPU controls data transfers, may change the size of the data units and control information (headers), but remains removed from the data path. Unlike the previous approach, DMA-DMA streaming requires no special hardware other than DMA support on both the source and sink device adaptors. Consequently, generic devices (e.g., disks) can be used. Two DMA I/O bus trips are required by this approach. It follows that the throughput is bounded by one half of the I/O bus bandwidth. In practice, the sustainable throughput is lower, since main memory accesses caused by concurrent CPU activity compete with the DMA transfers for main memory access, even if the I/O bus and memory bus are separated. Two key techniques are required to keep the CPU removed from the data path in this approach: scatter-gather DMA support in the device adaptors, and an OS buffer manager that supports copy-free buffer editing. Support for gather DMA in the sink device adaptor is critical. Recall that a lazily evaluating buffer manager may cause a buffer to consist of multiple discontiguous fragments. In the absence of a gather capability, it is necessary to copy data units into contiguous space prior to a transfer to the sink device. 32

2.4.3 OS Kernel Streaming

Now consider systems where software data manipulations are performed, but all the manipulations are executed within the privileged kernel protection domain; no user-level programs are involved. Clearly, the data must pass through the CPU/data cache (Figure 2.2, (c)), implying that the achievable throughput is bounded by the system's copy bandwidth. The goal is to keep the resulting CPU-memory data traffic to the minimum, namely two transfers. The solution is to integrate data manipulations-if there is more than one-using, for example, ILP. Note that data manipulations may include data movement from and to devices via PIO. If a device supports both DMA and PIO, it may be beneficial to use PIO in this type of system, since programmed data movement can be integrated with other data manipulations. That is, instead of first using DMA and then loading/storing data to/from main memory, the CPU could directly load data from the source I/O adaptor, and store data to the sink I/O adaptor, bypassing main memory. This approach saves two DMA bus trips, which would otherwise compete with the CPU for memory access. However, it also trades memory accesses for potentially much slower load and store operations across the I/O bus. Which approach results in a more efficient data path depends on the relative performance of memory accesses and DMA transfers, versus PIO, on the target hardware. Unlike the previous methods, OS kernel streaming offers full programmability of the data path. However, all data manipulations must be performed in the kernel of the operating system. Applications and user-level servers are restricted to the set of data manipulations implemented by the kernel.
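As a concrete, if simplified, illustration of integrating data manipulations, the following C sketch folds an Internet-style checksum into the copy loop, so the data crosses the CPU-memory path once instead of twice. It is a generic example of the ILP idea, not code from any particular system.

    #include <stddef.h>
    #include <stdint.h>

    /* Copy 'len' bytes (assumed to be a multiple of 2 and 16-bit aligned)
     * while accumulating a one's-complement checksum in the same pass. */
    uint16_t copy_and_checksum(uint16_t *dst, const uint16_t *src, size_t len)
    {
        uint32_t sum = 0;
        for (size_t i = 0; i < len / 2; i++) {
            uint16_t w = src[i];
            dst[i] = w;          /* data movement */
            sum += w;            /* data manipulation, integrated with the copy */
        }
        while (sum >> 16)        /* fold carry bits */
            sum = (sum & 0xffff) + (sum >> 16);
        return (uint16_t)~sum;
    }

Performed separately, the copy and the checksum would each traverse the data once, doubling the CPU-memory traffic; integrated, the data is loaded into registers a single time.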

2.4.4 User-Level Streaming

Finally, consider systems where data passes through the kernel, plus one or more user-level protection domains. These user-level domains could be part of a microkernel-based operating system, implement third-party servers for windowing, or be part of an application. In addition to the issues discussed in the previous subsections, the I/O subsystem designer is faced with protection domain boundaries. Protection boundaries are an obstacle to the integration of data manipulations. We conclude, therefore, that designers should make every effort to locate all data manipulation functions in the same domain. Protection also requires an efficient method for transferring data between domains. Section 2.3.2 discussed several VM techniques for cross-domain data transfer. The use of such a facility is critical for data transfer between two user-level protection domains. Otherwise, two software copies are required for a user-to-user transfer on most systems: from the source user domain to a kernel buffer, and from the kernel buffer to the target user domain. Chapter 5 introduces a novel facility for cross-domain data transfer that performs better than existing techniques, and achieves an efficient end-to-end data path, both in monolithic and server-based operating systems. As a special case, it is possible to transfer data between a user domain and the operating system kernel without extra cost if the transfer is combined with the data movement between I/O adaptor and main memory. That is, data is transferred directly between the I/O adaptor and the user buffer [Jac90, DWB+93]. However, with this approach, a potentially huge amount of buffer memory is required in a high-speed network adaptor. On the receiving end, the data must be buffered in the adaptor until the receiving application asks for the data. On the transmitting side, transmitted data must be buffered in the network adaptor for potential retransmission, since no intermediate copy of the data exists in main memory. Due to the large bandwidth-delay product of high-speed networks, large amounts of data must be buffered for retransmission. Finally, this approach does not generalize to I/O data streams that are accessed in more than one user-level protection domain.

2.4.5 Need for Integration

The API, cross-domain data transfer facility, and buffer manager must be integrated in a manner that takes into account their subtle interactions. Consider, for example, a system where the buffer manager is restricted to the kernel domain, a virtual copy facility is used for cross-domain data transfer, and the OS supports a UNIX-style API. In this case, data units from the source device are placed in main memory buffers, and some buffer editing occurs as part of the in-kernel I/O processing; e.g., reassembly of network packets. When a data unit represented by a lazily evaluated buffer reaches the user/kernel boundary, it must be evaluated (copied into contiguous storage), despite the use of a virtual copy facility. The reason is that the interface defines data buffers to be contiguous. Since the API allows applications to specify an arbitrarily aligned buffer address and length, the buffer's first and last address may not be aligned with page boundaries. Consequently, the data transfer facility may be forced to copy the portion of the first and last page that is overlapped by the buffer. Once in the application domain, more buffer editing may need to be performed. Since the buffer management tool is not available at user level, the application must either perform data copying, or perform its own lazily evaluating buffer management. In the latter case, another copy is required when data crosses the next domain boundary. After data is transferred back to the kernel, the semantics of the API allow the user program to reuse the data buffer instantly, which is likely to force the virtual copy facility to copy parts of the buffer. We conclude that despite the use of copy-avoiding techniques, multiple copies occur along the data path, and these copies are an artifact of the poor integration of these techniques. One problem is that the implementation of the buffer manager is local to the kernel domain; a global implementation is necessary to maintain the lazily evaluated representation of data buffers along the entire data path. A global abstract buffer type has the additional benefit that all domains (including applications) can perform copy-free buffer editing. A second problem is the API, which commonly does not permit a non-contiguous representation of buffers, and as a consequence, stands in the way of efficient data transfer. A potential third problem is the cross-domain data transfer facility's inability to efficiently support the transfer of non-contiguous buffer fragments. Figure 2.3 depicts some of the dependencies among the copy-avoiding techniques. The boxes represent situations where copies may occur along the data path, and the bubbles correspond to the appropriate copy-avoiding techniques. Edges indicate dependencies among the techniques that must be satisfied to achieve an efficient data path. For example, VM techniques for data transfer, the API's data passing semantics, and the buffer manager must agree on a common buffer representation. The dashed lines indicate that it is unclear how to effectively integrate ILP with the API and data transfer facility, to permit integration of data manipulations across protection domain boundaries. In conclusion, in the interest of minimizing main memory traffic, it is necessary to maintain a lazily evaluated representation of data buffers along the entire data path. This implies that all programs must deal with this representation of buffers.
Consequently, a global buffer manager is needed that is integrated with the API and cross-domain transfer facility. The choice of a cross-domain data transfer method may further influence the design of a network adaptor. For example, shared virtual memory requires demultiplexing prior to the data transfer from adaptor to main memory.

"'" 1~r8'/ >,

Figure 2.3: Dependencies among copy-avoiding techniques design of a network adaptor. For example, shared virtual memory requires demultiplexing prior to the data transfer from adaptor to main memory.

2.5 A Buffer Abstraction for an Integrated Data Path

This section proposes an approach that permits the integration of the API, buffer manager, and cross-domain data transfer facility to achieve an efficient end-to-end data path. The approach hinges on an abstract data type for data buffers called IOData. This section gives an overview of the IOData type design, discusses its use, and briefly sketches how it can be implemented using different buffer management schemes and cross-domain data transfer facilities. In Section 2.3.5 we argued that the UNIX API has three problems with regard to the efficient transfer of I/O data. Its read and write system calls allow applications to specify

data buffers with arbitrary alignment and length, they require contiguous data buffers, and they have copy semantics. These problems can be solved by adding a set of new I/O operations to the API. These I/O operations take as arguments an abstract data type (ADT) that represents data buffers (instead of pointers), and these arguments are passed with handoff rather than copy semantics. The standard read and write system calls can be maintained for backward compatibility, and the new operations can be used by new applications that require high-performance I/O.

2.5.1 The IOData Abstract Data Type

An instance of the IOData abstract data type represents a data buffer of arbitrary length.

It encapsulates one or more physical buffers that contain the data. At any given time, the physical buffers may not be contiguous, mapped in the current domain, or even in main memory. The IOData type is immutable, i.e., once an instance is created with an initial data content, the content cannot be subsequently changed. IOData instances can be manipulated using a well-defined set of operations. An implementation of this abstract type-i.e., code that implements its operations-is included as part of a library in each protection domain. The exact form and syntax of IOData's operations depend on the programming language used, which may vary from domain to domain. The use of the IOData type for data buffers has important advantages. First, it ensures that a single buffer representation can be maintained along the entire data path, permitting lazy buffer evaluation. Second, it isolates applications, user-level servers, and large parts of the kernel from details of buffer management and cross-domain data transfer. This increases portability of both applications and operating system, and permits the use of the most efficient buffer management and data transfer techniques on different platforms. Third, the IOData type gives applications access to efficient buffer manipulation operations, and eliminates the need for separate application-level buffer management. The IOData type supports the following operations. An allocate operation creates a new instance of the requested size and allocates an appropriate number of physical buffers. During an initialization phase, the client is provided with a list of pointers to these physical buffers, for the purpose of initialization. A share operation creates a logical copy of an IOData instance; it does not actually copy the physical buffers. Clip, split, and concatenate operations implement the necessary buffer editing operations. A retrieve operation generates a list of pointers to the physical data buffer fragments, thereby allowing the client to access the data. A mutate operation is a combination of retrieve and allocate. It allows a client to read the data from an IOData instance, and store the perhaps modified data into a new IOData instance. The operation generates a list of pointer pairs, one referring to a fragment of the source, the other pointing to a physical buffer of the target. Finally, a deallocate operation destroys an IOData instance, and deallocates the physical buffers if no logical copies of the data remain.
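The following C-style declarations sketch what such an interface might look like. The names and signatures are illustrative only, are not taken from the dissertation's implementation, and omit the mutate operation for brevity.

    #include <stddef.h>

    typedef struct iodata IOData;            /* opaque handle for an instance */

    struct iodata_frag {                     /* one physical buffer fragment  */
        void  *base;
        size_t len;
    };

    /* Create an instance of 'size' bytes; *frags is filled with pointers to the
     * underlying physical buffers so the caller can initialize the contents. */
    IOData *iodata_allocate(size_t size, struct iodata_frag **frags, size_t *nfrags);

    IOData *iodata_share(IOData *d);                         /* logical copy   */
    IOData *iodata_clip(IOData *d, size_t off, size_t len);  /* buffer editing */
    IOData *iodata_split(IOData *d, size_t off, IOData **tail);
    IOData *iodata_concatenate(IOData *head, IOData *tail);

    /* Read access: returns the number of fragments and a pointer list to the
     * (immutable) data. */
    size_t  iodata_retrieve(IOData *d, struct iodata_frag **frags);

    /* Destroy an instance; the physical buffers are freed only when the last
     * logical copy is deallocated. */
    void    iodata_deallocate(IOData *d);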

2.5.2 Implementation

Consider an implementation of the IOData type. One key feature is that the implementation has complete control over the size, location, and alignment of physical buffers. Consequently, a variety of buffer management schemes are feasible. All buffers may be part of a system-wide pool, allocated autonomously by each domain, located in a shared VM region, or they may reside outside of main memory in an I/O adaptor. Physical buffers can be of a fixed size to simplify and speed allocation. The other key feature of the IOData type is its immutability. It allows the transparent use of page remapping, shared virtual memory, and other VM techniques for the cross-domain transfer of IOData instances. Virtual copying can be used with increased efficiency since physical buffers are guaranteed not to be written after a transfer. The implementation of the IOData abstract data type can also be based on fbufs, the I/O buffer management and cross-domain data transfer facility described in Chapter 5. It is possible to extend an existing API (such as that of UNIX) to include input/output operations based on the IOData type. New applications that depend on high bandwidth (such as multimedia) can use the new interface. The conventional interface can be maintained for backward compatibility, at the cost of copying the data into contiguous storage. Applications that use the new interface, and consume/modify input data-rather than merely perform buffer editing-must be able to deal with the non-contiguous storage of IOData instances. For the large class of applications that access input data sequentially, the additional program complexity and runtime cost are negligible. Typically, the array-indexing loop used to read data from a contiguous buffer must be nested in an additional loop that iterates over a list of buffers. Applications that require contiguous storage of data for efficiency must explicitly perform the copy (or use the old interface), thereby trading the copying cost for fast random access.
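To make the nested-loop point concrete, here is a hedged C sketch of sequential consumption of an IOData instance, using the hypothetical interface sketched in Section 2.5.1; the only change relative to a contiguous buffer is the outer loop over fragments.

    /* Sum all bytes of an IOData instance (a trivial stand-in for a sequential
     * consumer such as a checksum or a decoder). */
    unsigned long consume_sequentially(IOData *d)
    {
        struct iodata_frag *frags;
        size_t nfrags = iodata_retrieve(d, &frags);
        unsigned long total = 0;

        for (size_t f = 0; f < nfrags; f++) {            /* outer loop: fragments */
            const unsigned char *p = frags[f].base;
            for (size_t i = 0; i < frags[f].len; i++)    /* inner loop: bytes     */
                total += p[i];
        }
        return total;
    }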

2.6 Concluding Remarks

This chapter shows that the bandwidth of the CPU-memory data path in modern, general-purpose computer systems is within the same order of magnitude as the bandwidth of emerging high-speed I/O devices, and that this state of affairs is likely to persist for the foreseeable future. This makes it essential to minimize main memory traffic during the processing of I/O operations, in order to avoid a bottleneck in delivering high-speed network traffic to application programs. Moreover, cache memories do not significantly reduce main memory traffic during I/O operations, due to the poor locality of the corresponding memory accesses in current operating systems. We identify the main causes of main memory traffic during network I/O operations, namely data transfer between I/O device and main memory, data transfer across protection domain boundaries, data manipulations, buffer management, and interfaces. Techniques are presented that can help eliminate or reduce memory traffic in each case. From this discussion, we draw a number of conclusions. First, protection domain boundaries are a major obstacle in delivering high-speed I/O to applications. Techniques for cross-domain data transfer described in the literature and in current use are not efficient enough to satisfy the needs of high-speed I/O. And, protection boundaries prevent the integration of data manipulations, which results in poor locality of access and increased main memory traffic. Lastly, the techniques for avoiding main memory traffic during I/O processing need to be carefully integrated to be effective. The chapter concludes by proposing an abstract data type for I/O data buffers that permits an efficient and integrated design of application programming interface, buffer management, and cross-domain data transfer facility. The IOData abstract data type is complemented by the fbuf facility for buffer management and cross-domain data transfer introduced in Chapter 5. The IOData buffer abstraction implemented on top of the fbuf facility can eliminate main memory traffic caused by protection domain boundaries. Chapters 4 and 6 introduce complementary techniques that can remove protection domain boundaries from the I/O data path. Removing protection boundaries has the additional benefit of reducing I/O latency, and of allowing the integration of data manipulations, which results in better memory access locality.

CHAPTER 3 OS Support for a High-Speed Network Adaptor

This chapter describes our experiences in providing low-level operating system support for a high-speed network adaptor. We begin by describing the experimental hardware and software platform used throughout this work. Then, we describe a number of problems that were encountered while attempting to integrate a high-speed network device efficiently into a standard operating system environment, and we present novel techniques to solve these problems. Next, we highlight certain features provided by the OSIRIS adaptor that we were able to exploit in the OS; features that facilitated new techniques (covered in detail in later chapters) for making the OS more effective in delivering network data to application programs. We close with a detailed study of the network performance achieved between a pair of commercial workstations connected by an experimental high-speed network link. Note that this chapter does not address the problem of delivering high-speed networking services to application programs. Instead, it focuses on the efficient transport of data between the OS kernels running on hosts connected by a high-speed network. The issue of achieving application-to-application networking performance is treated in subsequent chapters.

3.1 Experimental Hardware and Software

With the emergence of high-speed network facilities, several research efforts are focusing on the design and implementation of network adaptors [C+91, BP93, B+93, Ram93, TS93]. We have used one of these adaptors in our research-the OSIRIS ATM board built for the AURORA Gigabit Testbed [C+93b, Dav93]. We consider this network adaptor from a software perspective, identifying the subtle interactions between the adaptor and the operating system software that drives it. The flexibility built into the OSIRIS board makes this interaction an especially interesting one to study.

The OSIRIS network adaptor was designed specifically to support software experimentation. Therefore, only the most critical, high-speed functions are implemented in hardware, and even these are primarily implemented in programmable logic.

Figure 3.1: Architectural Overview of OSIRIS

The architecture of the interface is depicted in Figure 3.1. It consists of two mostly independent halves--send and receive--each controlled by an Intel 80960 microprocessor. The adaptor board attaches to a TURBOchannel option slot provided by DEC workstations. We used two generations of DEC workstations in this work: the DECstation 5000/200 (25 MHz MIPS R3000), and the DEC 3000/600 (175 MHz Alpha). From the host's perspective, the adaptor looks like a 128KB region of memory. A combination of host software and software executing in the on-board microprocessors determines the detailed structure of this memory. In general, the memory is used to pass buffer descriptors between the host and the adaptor. Network data is not normally buffered in the dual-port memory; it is transferred directly from/to main memory buffers using DMA. In the transmit direction, the software running on the microprocessor writes commands to a DMA controller and an ATM cell generator. The general paradigm is that the host passes buffer descriptors to the microprocessor through the dual-port RAM, and the microprocessor executes a segmentation algorithm to determine the order in which cells are sent. For example, the host could queue a number of packets and the microprocessor could transmit one cell from each in turn. The microprocessor has the capability to interrupt the host.

In the receive direction, the microprocessor reads from a FIFO the virtual circuit identifier (VCI) and ATM adaptation layer (AAL) information [Bel92] that is stripped from cells as they are received. By examining this information, and using other information from the host (such as a list of reassembly buffers), the microprocessor determines the appropriate host memory address at which the payload of each received cell is to be stored. It then issues commands to another DMA controller; typically one command is issued for each ATM cell received. As part of the reassembly algorithm, the microprocessor decides when it is necessary to interrupt the host. The important point to understand from this brief description is that software running on the two 80960s controls the send/receive functionality of the adaptor, and perhaps just as importantly, this code effectively defines the software interface between the host and the adaptor. That is, the data structures, protocols, and events used for communication between host CPU and network adaptor are defined entirely by software. The other relevant piece of software, of course, is the OS running on the host. On the DECstation 5000/200, we used the Mach 3.0 operating system [ABB+86], retrofitted with a network subsystem based on the x-kernel [DAPP93, HP91]. On the DEC 3000/600, we used a version of the x-kernel that runs as a stand-alone (native) operating system kernel. There are two relevant things to note about the OS. First, because the x-kernel supports arbitrary protocols, our approach is protocol-independent; it is not tailored to TCP/IP. Second, because Mach is a microkernel-based system and the x-kernel allows the protocol graph to span multiple protection domains, our approach has to allow for the possibility that network data traverses multiple protection domains; it is not restricted to kernel-resident protocols. The following sections describe our experiences using the OSIRIS board, highlighting the problems it imposed on the software, and how we addressed them. For the most part, this discussion is limited to how we implemented the basic host-to-host functionality, both correctly and with the highest possible performance; Section 3.8 describes how certain features of the board were exploited to implement novel OS techniques that turn this host-to-host performance into equally good user-to-user performance.

3.2 Host/Board Communication

We begin by defining the software interface between the host's device driver and the processors on the OSIRIS board. The host CPU communicates with the on-board processors through shared data structures in the dual-port memory. In addition, each on-board processor can assert an interrupt to alert the host CPU of asynchronous events. The design of the shared data structures and the discipline for using interrupts was guided by the goals of minimizing packet delivery latency and host CPU load, and of achieving host-to-host throughput close to the capacity of the network link. Particular attention was paid to (1) minimizing the number of load and store operations required to communicate with the on-board processors (accesses to the dual-port memory across the TURBOchannel are expensive), (2) avoiding delays due to lock contention while accessing shared data structures in the dual-port memory, and (3) minimizing the number of interrupts, which place a significant load on the host CPU.

3.2.1 Shared Data Structure

As with any shared data structure, measures must be taken to ensure consistency in the presence of concurrent accesses. The dual-port memory itself guarantees atomicity of individual 32-bit load and store operations only. Each half of the board provides a test-and-set register that can be used to implement a simple spin-lock. The intended use is to enforce mutually exclusive access to the dual-port memory by mandating that a processor must first acquire the corresponding lock. This approach allows arbitrarily complex shared data structures, but it restricts concurrency between the host CPU and the on-board processors. As a result, both packet delivery latency and CPU load can suffer due to lock contention. To avoid this problem, we use simple lock-free data structures that rely only on the atomicity of load and store instructions. The basic data structure used in the dual-port memory is a simple, one-reader-one-writer FIFO queue used to pass buffers between the host and the adaptor. The queue consists of an array of buffer descriptors, a head pointer, and a tail pointer. The head pointer is only modified by the writer and the tail pointer is only modified by the reader. The processors determine the status of the queue by comparing the head and tail pointers, as follows:

head = tail ⇔ queue is empty

(head + 1) mod size = tail ⇔ queue is full
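In C, the queue and its status tests might look like the following sketch. The descriptor layout and queue size are illustrative, not the actual OSIRIS definitions; the essential property is that only the writer advances head and only the reader advances tail, so no lock is needed beyond atomic 32-bit loads and stores.

    #include <stdint.h>

    #define QSIZE 64                       /* illustrative queue length   */

    struct buf_desc {                      /* one buffer descriptor       */
        uint32_t phys_addr;                /* physical address of buffer  */
        uint32_t length;                   /* length in bytes             */
    };

    struct dp_queue {                      /* resides in dual-port memory */
        volatile uint32_t head;            /* written only by the writer  */
        volatile uint32_t tail;            /* written only by the reader  */
        struct buf_desc   desc[QSIZE];
    };

    static int queue_empty(const struct dp_queue *q)
    {
        return q->head == q->tail;                 /* head = tail  <=>  empty */
    }

    static int queue_full(const struct dp_queue *q)
    {
        return (q->head + 1) % QSIZE == q->tail;   /* (head+1) mod size = tail  <=>  full */
    }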

The simplicity of these lock-free queues maximizes concurrent access to the dual-port memory, and minimizes the number of load and store operations required to communicate. A single queue is used for communication between the host CPU and the transmit processor. Each queue element describes a single buffer in main memory by its physical address and length. To queue a buffer for transmission, the host CPU performs the following actions (xmitQueue[head] refers to the buffer descriptor referred to by the head pointer):

• wait until the transmit queue is not full

• de-allocate any previous buffer described by xmitQueue[head]

• queue the buffer using xmitQueue[head]

• increment the head pointer (modulo array size)

The transmit processor continuously performs the following actions.

• wait until the transmit queue is not empty

• read the descriptor at xmitQueue[tail]

• transmit the buffer

• increment the tail pointer (modulo array size)

Two queues are required for communication between the host and the receive processor. The first queue is used to supply buffers to the receive processor for storage of incoming PDUs; the second queue holds filled buffers waiting for processing by the host. (We use the term protocol data unit (PDU) to denote a packet processed by a protocol, where the protocol in question is generally given by the context; here, the PDU corresponds to the unit of data sent between device drivers.) Initially, the host fills the free buffer queue. When a PDU arrives, the receive processor removes a buffer from this queue, and stores incoming data into the buffer. When the buffer is filled, or the end of the incoming PDU is encountered, the processor adds the buffer to the receive queue. If the receive queue was previously empty, an interrupt is asserted to notify the host of the transition of the receive queue from the empty state to a non-empty state. The host's interrupt handler schedules a thread that repeatedly performs the following steps (sketched in code after the list) until the receive queue is found empty:

• remove a buffer from the receive queue

• add a free buffer to the free queue

• initiate processing of the received data
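Building on the queue sketch in Section 3.2.1, the host-side transmit and receive disciplines described above could look roughly as follows. The functions free_buffer, supply_free_buffer, and deliver_to_protocols are placeholders for driver internals, not actual OSIRIS code.

    extern void free_buffer(struct buf_desc *d);               /* placeholder */
    extern void supply_free_buffer(struct dp_queue *freeq);    /* placeholder */
    extern void deliver_to_protocols(const struct buf_desc *d);/* placeholder */

    /* Host side: queue one buffer for transmission. */
    void xmit_enqueue(struct dp_queue *xq, uint32_t phys, uint32_t len)
    {
        while (queue_full(xq))
            ;                                  /* in practice: suspend and wait for the
                                                  half-empty interrupt (Section 3.2.2) */
        free_buffer(&xq->desc[xq->head]);      /* de-allocate any previously transmitted buffer */
        xq->desc[xq->head].phys_addr = phys;
        xq->desc[xq->head].length    = len;
        xq->head = (xq->head + 1) % QSIZE;     /* publish the descriptor to the transmit processor */
    }

    /* Host side: drain the receive queue, typically after an interrupt. */
    void recv_service(struct dp_queue *rq, struct dp_queue *freeq)
    {
        while (!queue_empty(rq)) {
            struct buf_desc d = rq->desc[rq->tail];
            rq->tail = (rq->tail + 1) % QSIZE;
            supply_free_buffer(freeq);         /* replenish the free buffer queue   */
            deliver_to_protocols(&d);          /* initiate processing of the data   */
        }
    }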

3.2.2 Interrupts

Handling a host interrupt asserted by the OSIRIS board takes approximately 75 µs in Mach on a DECstation 5000/200. For comparison, the service time for a received UDP/IP PDU is 200 µs; this number includes protocol processing and driver overhead, but not interrupt handling. Given this high cost, minimizing the number of host interrupts during network communication is important to overall system performance. In our scheme, the completion of a PDU transmission, which is traditionally signalled to the host using an interrupt, is instead indicated by the advance of the transmit queue's tail pointer. The driver checks for this condition as part of other driver activity-for example, while queuing another PDU-and takes the appropriate action. Interrupts are used only in the relatively infrequent event of a full transmit queue. In this case, the host suspends its transmit activity, and the transmit processor asserts an interrupt as soon as the queue reaches the half-empty state. In the receiving direction, an interrupt is only asserted once for a burst of incoming PDUs. More specifically, whenever a buffer is queued before the host has dequeued the previous buffer, no interrupt is asserted. This approach achieves both low packet delivery latency for individually arriving packets, and high throughput for incoming packet trains. Note that in situations where high throughput is required (i.e., when packets arrive closely spaced), the number of interrupts is much lower than the traditional one-per-PDU.

3.3 Physical Buffer Fragmentation

The OSIRIS board relies on direct memory access (DMA) for the actual transfer of network data between main memory and network adaptor. The unit of data exchanged between host driver software and on-board processors is a physical buffer-a set of memory locations with contiguous physical addresses. The descriptors used in the transmit and receive queues contain the physical address and the length of a buffer. The on-board processors initiate DMA transfers based on the physical address of the buffers. The per-PDU processing cost in the host driver increases with the number of physical buffers used to hold the PDU. Thus, one would like to minimize the number of physical buffers occupied by a single PDU. However, this is made difficult by the fact that the contiguous virtual memory pages used to store a PDU are generally not contiguous in the physical address space. The reason for this lies at the heart of any page-based virtual memory system-the ability to map non-contiguous physical pages to contiguous virtual memory addresses, in order to avoid main memory fragmentation.

Figure 3.2: PDU Buffer Fragmentation (a PDU's header and body in the virtual address space, and their mapping to discontiguous pages in the physical address space)

Figure 3.2 depicts a PDU passed to the OSIRIS driver for transmission. The PDU consists of two parts-a header portion, which contains protocol headers, and the data portion. The header portion usually contributes one physical buffer. The data portion is typically not aligned with page boundaries, and may thus occupy ⌈(message data size − 1)/page size⌉ + 1 pages. When the physical pages occupied by the data portion are not contiguous, each page contributes a physical buffer. In practice, a PDU with a data portion of length n pages usually occupies n + 2 physical buffers. Message fragmentation at the protocol level can aggravate this proliferation of physical buffers. The problem is that unless the fragment boundaries in the original message coincide with page boundaries, each fragment may generate excess physical buffers in the driver. As an example of an extreme case, assume that a contiguous 16KB application message is transmitted using UDP/IP with a maximal transfer unit (MTU) of 4KB (keep in mind that the OSIRIS driver, not the hardware, defines the MTU; we are just using 4KB as an example), which is also the system's page size. The inclusion of the IP header reduces the data space available in each fragment to slightly less than 4KB. Consequently, the data portions of most fragments are not page-aligned, and occupy two physical pages. In addition, the IP header attached to each fragment occupies a separate page. As a result, the transmission of a single 16KB application message can result in the processing of up to 14 physical buffers in the driver. This compounding effect of IP fragmentation and buffer fragmentation can be avoided by ensuring page alignment of application messages, and by choosing an MTU size that is a multiple of the page size, plus the IP header size. This ensures that fragment boundaries align with page boundaries. A similar problem exists on the receive side. Recall that the host driver allocates receive buffers, and queues these buffers for use by the receive processor. Most operating systems do not support the dynamic allocation of physically contiguous pages. In this case, the size of the receive buffers is restricted to the system's memory page size, since it represents the largest unit of physically contiguous memory that the driver can allocate. This limit on the size of receive buffers causes the fragmentation of all incoming network packets larger than the page size. The proliferation of physical buffers is a potential source of performance loss in the OSIRIS driver. A general solution to this problem would require the use of physically contiguous memory for the storage of network data. In traditional operating systems, where network data is copied between application memory and kernel buffers, this can be achieved by statically allocating contiguous physical pages to the fixed set of kernel buffers. Unfortunately, this approach does not readily generalize to a copy-free data path (as outlined in Chapter 2, and detailed in Chapter 5), since applications generally cannot be allowed to hold buffers from a statically allocated pool. Several modern workstations, such as the IBM RISC System/6000 and DEC 3000 AXP systems, provide support for virtual address DMA through the use of a hardware virtual-to-physical translation buffer (scatter/gather map) [IBM90, D+92]. Host driver software must set up the map to contain appropriate mappings for all the fragments of a

buffer before a DMA transfer. When data is transferred directly from and to application buffers, it may be necessary to update the map for each individual message. As a result, physical buffer fragmentation is a potential performance concern even when virtual DMA is available.
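A small C sketch of the worst-case arithmetic discussed above; the page size and data length are parameters of the example, not fixed properties of the hardware.

    #include <stdio.h>

    /* Worst-case number of physical buffers occupied by a PDU: one for the
     * header portion, plus ceil((data_len - 1) / page_size) + 1 for an
     * arbitrarily aligned data portion, as in the formula above. */
    static unsigned pdu_phys_buffers(unsigned data_len, unsigned page_size)
    {
        unsigned data_pages = (data_len + page_size - 2) / page_size + 1;
        return 1 + data_pages;
    }

    int main(void)
    {
        /* A 16 KB data portion with 4 KB pages: 1 + 5 = 6 physical buffers in
         * the worst case, i.e., n + 2 with n = 4 data pages. */
        printf("%u\n", pdu_phys_buffers(16 * 1024, 4096));
        return 0;
    }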

3.4 Cache Coherence

The cache subsystem of the host we were originally using-the DECstation 5000/200-does not guarantee a coherent view of memory contents after a DMA transfer into main memory. That is, CPU reads from cached main memory locations that were overwritten by a DMA transfer may return stale data. To avoid this problem, the operating system normally executes explicit instructions to invalidate any cached contents of memory locations that were just overwritten by a DMA transfer. Unfortunately, partial invalidations of the data cache take approximately one CPU cycle per memory word (32 bits), plus the cost of subsequent cache misses caused by the invalidation of unrelated cached data.³ This cost has a significant impact on the attainable host-to-host throughput, as quantified in Section 3.9 (Figure 3.3). The key idea for avoiding this cost is to take a lazy approach to cache invalidation, and to rely on network transmission error handling mechanisms for detecting errors caused by stale cache data. When a data error is detected at some stage during the processing of a received message, the corresponding cache locations are invalidated, and the message is re-evaluated before it is considered in error. The feasibility of this approach depends on the following conditions.

1. The underlying network is not reliable, and therefore mechanisms for detecting or tolerating transmission errors are already in place.

2. The rate of errors introduced by stale cache data is low enough for the lazy approach to be effective.

3. Revealing stale data does not pose a security problem.

³The DECstation also supports a fast instruction that swaps the data and instruction cache, which amounts to an invalidation of the entire cache contents. However, the high cost of subsequent cache misses makes this an unattractive solution.

While the first condition is true for most networks, the second condition deserves some careful consideration. The OSIRIS driver employs a free buffer queue and a receive queue with a length of 64 buffers each, and a buffer size of 16KB. This implies that once a receive buffer is allocated and queued on the free buffer queue, normally 63 other buffers are processed by the host until that buffer re-appears at the top of the receive queue. In order to become stale, a cached data word from a particular buffer has to remain in the cache while 63 other receive buffers are being processed. During this time, the CPU typically reads the portion of the input buffers occupied by received data, as well as other data relating to protocol processing, application processing, and other activities unrelated to the reception of data. These accesses are likely to evict all previously cached data from the DECstation's 64KB data cache. Experimentally, we have seen no evidence of stale data at all while running our test applications. This suggests that the error rate should be low enough for this optimization to be very effective. It should be noted that lazy cache invalidation is not likely to scale to machines with much larger caches. Fortunately, hardware designers have recognized the high cost of software cache invalidation, and tend to provide support for cache coherence on these machines. For example, the DEC 3000 AXP workstation data cache is updated during DMA transfers into main memory. The third condition is satisfied whenever reliable protocols are used that detect data errors before the data is passed to an unprivileged application. However, with unreliable protocols, an application could access stale data from a previous use of the receive buffer, potentially violating the operating system's security policy. This problem can be solved by ensuring the reuse of receive buffers on the same data stream. In this way, stale data read by an application is guaranteed to originate from an earlier message received by that application, thus eliminating security problems. The reuse of receive buffers on the same data stream has other advantages, as described in Section 3.8.1.
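A hedged sketch of the lazy invalidation idea: only when a received message fails its integrity check are the cache lines covering its buffer invalidated and the check repeated. The function names protocol_checksum_ok and cache_invalidate_range are placeholders for protocol- and platform-specific primitives.

    #include <stddef.h>

    extern int  protocol_checksum_ok(const void *buf, size_t len);   /* placeholder */
    extern void cache_invalidate_range(const void *buf, size_t len); /* placeholder */

    /* Returns nonzero if the message is good.  The common case pays no
     * invalidation cost; a failed check triggers invalidation and one retry,
     * distinguishing stale cache lines from a genuine transmission error. */
    int verify_received(const void *buf, size_t len)
    {
        if (protocol_checksum_ok(buf, len))
            return 1;
        cache_invalidate_range(buf, len);       /* lazy invalidation, only on suspicion */
        return protocol_checksum_ok(buf, len);
    }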

3.5 Page Wiring

Whenever the address of a buffer is passed to the OSIRIS on-board processors for use in DMA transfers, the corresponding pages must be wired. Wiring, also referred to as pinning, refers to the marking of a page as being non-eligible for replacement by the operating system's paging daemon. Since changing the wiring status of a page occurs in

the driver's critical path, the performance of this operation is of concern. Our initial use of the Mach kernel's standard service for page wiring resulted in surprisingly high overhead. One problem is that Mach's implementation of page wiring provides stronger guarantees than are actually needed for DMA transfers. In particular, it prevents not only replacement of the page itself, but also of any pages containing page table entries that might be needed during an address translation for that page. We now use low-level functionality provided by the Mach kernel to prevent replacement of pages with acceptable performance.

3.6 DMA Length

The length of DMA transactions has a significant effect on performance. As mentioned above, DMA usually takes place one ATM cell at a time. This provides maximum flexibility in the transmit direction (e.g., to interleave several outgoing PDUs) and avoids the need for a reassembly buffer in the receive direction. The maximum data transfer speed that can be sustained with 44-byte transfers over the TURBOchannel on a DECstation 5000/200 is 367 Mbps in the transmit direction and 463 Mbps in the receive direction. These figures, which have been measured for brief periods on the actual hardware, can be derived simply by considering the minimal overhead for DMA transactions in each direction-8 cycles for writes, 13 cycles for reads. Thus, for example, the maximum throughput for transmission is 11/(11+13) x 800 = 367 Mbps. Clearly, it would be advantageous to increase the length of DMA transfers. In the transmit direction, the only penalty for increasing DMA length is an increase in the granularity of multiplexing. It has been argued that fine-grained multiplexing is advantageous for latency and switch performance reasons [Dav91]. However, when the adaptor is used in a mode where the goal is to maximize throughput to a single application, neither of these reasons is relevant. It is therefore reasonable, and straightforward, to perform DMA transactions longer than one ATM cell. Note that with transfers of 88 bytes at a time, the maximum rate that data could be moved across the bus would be 22/(22+13) x 800 = 503 Mbps. This is close to the 516 Mbps data bandwidth available in a 622 Mbps SONET/ATM link [BC89] when 44-byte cell payloads are used. In the receive direction, the primary advantage in doing single-cell DMAs is that it removes the need for a reassembly buffer on the adaptor; cells can be placed directly in

host memory as they arrive. Not only does this reduce the hardware complexity of the interface, but it also reduces the likelihood that inadequate reassembly space is available. In some circumstances, however, it is possible to preserve the advantages of not having a reassembly buffer on the adaptor while performing DMAs longer than one cell. The quantity that we really wish to optimize is the user-to-user throughput for a single application. In this case, as long as cells arrive in order, most successively received cells will contain data that is to be stored in contiguous regions of host memory, the only exception being at the end of a buffer. Since there is a small amount of FIFO buffering of cells on the adaptor, the microprocessor can look at two cell headers before deciding what to do with their associated payloads. If the header information suggests that the two payloads should be stored contiguously, then a single, 88-byte DMA can be initiated. Note that the biggest gain is achieved just by going to double-cell DMAs, since we have already driven the overhead down from 42% to 26%. With any further increase in DMA length the returns diminish. The measured performance of doing 88-byte DMAs is reported in Section 3.9.
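The bandwidth figures above follow from a simple model, reproduced in the sketch below: 800 Mbps is the peak rate of a 32-bit bus at 25 MHz, payloads are counted in 32-bit words (11 words per 44-byte cell payload), and 8 and 13 cycles are the per-transaction overheads quoted above. The function is the model, not a measurement.

    #include <stdio.h>

    /* Effective bus throughput for DMA bursts of 'words' 32-bit words with a
     * fixed per-transaction overhead of 'overhead' cycles, on an 800 Mbps bus. */
    static double dma_mbps(int words, int overhead)
    {
        return 800.0 * words / (words + overhead);
    }

    int main(void)
    {
        printf("44-byte DMA, 13-cycle overhead: %.0f Mbps\n", dma_mbps(11, 13)); /* ~367 */
        printf("44-byte DMA,  8-cycle overhead: %.0f Mbps\n", dma_mbps(11, 8));  /* ~463 */
        printf("88-byte DMA, 13-cycle overhead: %.0f Mbps\n", dma_mbps(22, 13)); /* ~503 */
        return 0;
    }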

3.7 DMA versus PIO

One of the most lively debates in network adaptor design is over the relative merits of DMA and programmed I/O (PIO) for data movement between the host and the adaptor. Both the literature on the subject (e.g., [Ram93, BP93, DWB+93]) and our own experience have led us to the conclusion that the preferable technique is highly machine-dependent. In the case of the DEC workstations we used, the low throughput achievable using PIO across the TURBOchannel ensures that, with well designed software (i.e., no unnecessary copies), DMA is preferable. We argue that the best way to compare DMA performance versus PIO is to determine how fast an application program can access the data in each case. For example, when data is DMAed into memory on a DECstation 5000/200, it will not be in the cache; an additional read of the main memory is necessary when the application accesses the data. On the DECstation, reading data into the cache causes a dramatic decrease in throughput from the pure DMA results, but the throughput remains above that which can be achieved by PIO simply because of the high performance penalty for word-sized reads across the TURBOchannel. On DEC's Alpha-based machines, a greatly improved

memory system with a crossbar switch that connects the TURBOchannel, main memory, and cache allows cache/memory transactions to occur concurrently with DMA transfers on the TURBOchannel. In addition, DMA writes to main memory update the second-level cache. On these machines, applications are able to access the data at the rate of, and concurrently with, its DMA transfer into main memory (see Section 3.9). In the PIO case, with carefully designed software, data can be read from the adaptor and written directly to the application's buffer in main memory, leaving the data in the cache [Jac90, DWB+93]. If the application reads the data soon after the PIO transfer, the data may still be in the cache. According to one study, the PIO transfer from adaptor to application buffer must be delayed until the application is scheduled for execution, in order to ensure sufficient proximity of data accesses for the data to remain cached under realistic system load conditions [PDP93]. Loading data into the cache too early is not only ineffective, but can actually decrease overall system performance by evicting live data from the cache. Unfortunately, delaying the transfer of data from adaptor to main memory until the receiving application is scheduled for execution requires a substantial amount of buffer space in the adaptor. With DMA, instead of using dedicated memory resources on the adaptor, incoming data can be buffered in main memory. Using main memory to buffer network data has the advantage that a single pool of memory resources is dynamically shared among applications, operating system, and network subsystem.

3.8 New OS Mechanisms

This section sketches two novel OS mechanisms-fast buffers (fbufs) and application device

channels (ADCs)-and highlights those features of the OSIRIS board that facilitated these mechanisms. The mechanisms themselves are described in detail in later chapters.

3.8.1 Fast Buffers

One of the key problems faced by the operating system, especially a microkernel-based system in which device drivers, network protocols, and application software might all reside in different protection domains, is how to move data across domain boundaries without sacrificing the bandwidth delivered by the network. The fbuf mechanism is designed to address this problem-it is a high-bandwidth cross-domain buffer transfer and management facility.

The fbuf mechanism combines two well-known techniques for transferring data across protection domains: page remapping and shared memory. It is equally correct to view fbufs as using shared memory (where page remapping is used to dynamically change the set of pages shared among a set of domains), or as using page remapping (where pages that have been mapped into a set of domains are cached for use by future transfers). Since fbufs are described in detail in Chapter 5, this section concentrates on the OSIRIS features that we were able to exploit. The effectiveness of fbufs depends on the ability of the adaptor to make an early demultiplexing decision. That is, the "data path" through the system that the incoming packet is going to traverse must be determined by the adaptor so that it can be stored in an appropriate buffer; one that is mapped into the right set of domains. We say that an fbuf that is already mapped into a particular set of domains is cached. Being able to use a cached fbuf, as opposed to an uncached fbuf that is not mapped into any domains, can mean an order of magnitude difference in how fast the data can be transferred across a domain boundary. In the case of the OSIRIS adaptor, the device driver employs the following strategy. It maintains queues of preallocated cached fbufs for the 16 most recently used data paths, plus a single queue of preallocated uncached fbufs. The adaptor performs reassembly of incoming packets by storing the ATM cell payloads into a buffer in main memory using DMA. When the adaptor needs a new reassembly buffer, it checks to see if there is a preallocated fbuf for the virtual circuit identifier (VCI) of the incoming packet. If not, it uses a buffer from the queue of uncached fbufs. One of the interesting aspects of this scheme is how we use VCIs. The x-kernel provides a mechanism for establishing a path through the protocol graph, where a path is given by the sequence of sessions that will process incoming and outgoing messages on behalf of a particular application-level connection. Each path is then bound to an unused VCI by the device driver. This means that we treat VCIs as a fairly abundant resource; each of the potentially hundreds of paths (connections) on a given host is bound to a VCI for the duration of the path (connection). This approach is not compatible with a regime that treats VCIs as a scarce resource, and in particular, a resource that the network charges for. The early demultiplexing capability, which facilitates fbufs, has advantages beyond that of enabling efficient delivery of data to applications. It is also the basis for the appropriate processing of prioritized network traffic under high receiver load [Fel90]. The threads that de-queue buffers from the various receive queues may be assigned priorities corresponding to the traffic priorities of the network stream they handle. During phases of receiver overload, lower-priority receive queues will become full before higher-priority ones, allowing the adaptor board to drop the lower-priority packets before they have consumed any processing resources on the host.
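The driver's buffer-selection strategy can be summarized in a few lines of hedged C; the structures and function names are illustrative placeholders, not the actual x-kernel/OSIRIS code.

    #define NCACHED_PATHS 16                 /* most recently used data paths */

    struct fbuf;                             /* opaque: a fast buffer */
    struct fbuf_queue { struct fbuf *head, *tail; };

    extern struct fbuf *dequeue_fbuf(struct fbuf_queue *q);    /* placeholder */
    extern int          fbuf_queue_empty(struct fbuf_queue *q);/* placeholder */
    extern int          lookup_cached_slot(unsigned vci);      /* VCI -> slot, or -1 */

    struct fbuf_queue cached[NCACHED_PATHS]; /* premapped along each path's domains */
    struct fbuf_queue uncached;              /* fallback, not yet mapped anywhere   */

    /* Pick a reassembly buffer for an incoming packet on virtual circuit 'vci'. */
    struct fbuf *alloc_reassembly_buffer(unsigned vci)
    {
        int slot = lookup_cached_slot(vci);
        if (slot >= 0 && !fbuf_queue_empty(&cached[slot]))
            return dequeue_fbuf(&cached[slot]);  /* cached fbuf: already mapped along the path */
        return dequeue_fbuf(&uncached);          /* uncached fbuf: must be mapped on demand    */
    }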

3.8.2 Application Device Channels

Fbufs take advantage of the OSIRIS demultiplexing capability to avoid costs associated with the transfer of data across protection domain boundaries on the end host. These costs would otherwise limit the attainable application-to-application throughput. Application device channels (ADCs) take the on-board demultiplexing approach a significant step further. An ADC gives an application program restricted but direct access to the OSIRIS network adaptor, bypassing the operating system kernel. This approach removes protection domain boundaries from both the control and data path between application and network adaptor, resulting in minimal application-to-application message latencies. While ADCs give application programs direct access to the network adaptor, several unprivileged applications can share the adaptor. The operating system remains in control of the network device, ensuring protection, security, and fairness. This is accomplished by moving the protection boundaries effectively into the network adaptor. Application device channels are described in more detail in Chapter 6. To support ADCs, the network adaptor must be able to support, in addition to packet demultiplexing, multiple transmit and receive buffer queues. Furthermore, it must be possible to map pairs of transmit/receive queues directly into applications' address spaces. With OSIRIS, the transmit and receive processor each share 64 KB of dual-ported memory with the host. Assuming a 4 KB virtual memory page size, the adaptor can support up to 16 transmit/receive queues, which can be mapped into different address spaces, including those of applications. Other functionality required for ADCs, namely validation of transmit requests and application buffer addresses, was implemented by programming the OSIRIS on-board processors accordingly. In summary, the OSIRIS features that facilitated ADCs are its dual-ported memory that spans multiple VM pages, and its programmable on-board

processors.

3.9 Performance

This section reports on several experiments designed to evaluate the network performance achieved with the OSIRIS board, and the impact of various optimizations described in earlier sections. All presented results refer to message exchanges between test programs linked into the OS kernel. Each data point is the average of 10 trials, where each trial consists of sending 100,000 messages, after an initial warm-up period. In all cases, the 90% confidence intervals were within 1% of the average. The workstations were running in single-user mode.

DEC model   Protocol   Message size (bytes)
                           1   1024   2048   4096
5000/200    ATM          353    417    486    778
            UDP/IP       598    659    725   1011
3000/600    ATM          154    215    283    449
            UDP/IP       316    376    446    619

Table 3.1: Round-Trip Latencies (µs)

Throughout this section, we report results obtained on two generations of workstations: the DECstation 5000/200 (25 MHz MIPS R3000 CPU, 64/64 KB I/D caches, 16 MB of main memory), and the DEC 3000/600 (175 MHz Alpha CPU, 8/8 KB I/D caches, 2 MB second-level cache, 64 MB of main memory). Table 3.1 shows the round-trip latencies achieved between a pair of workstations connected by a pair of OSIRIS boards linked back-to-back. The rows labeled "ATM" refer to the round-trip latency of PDUs exchanged

between test programs configured directly on top of the OSIRIS device driver. In the "UDP/IP" case, round-trip latency was measured between two test programs configured

on top of the UDP/IP protocol stack (our otherwise standard implementations of IP and UDP were modified to support message sizes larger than 64KB). IP was configured to use an MTU of 16KB, and UDP checksumming was turned off. The measured latency numbers for 1-byte messages are comparable to-and in fact, a bit better than-those obtained when using the machines'

Ethernet adaptors under otherwise identical conditions. This is a reassuring result, since it demonstrates that the greater complexity of the OSIRIS adaptor did not degrade the latency of short messages.

Figure 3.3: DEC 5000/200 UDP/IP/OSIRIS Receive Side Throughput (throughput in Mbps versus message size in KBytes, for double cell DMA, single cell DMA, and single cell DMA with cache invalidation)

The next set of measurements was designed to evaluate the network performance of the receiving host in isolation. For this purpose, the receive processor of the OSIRIS board was programmed to generate fictitious PDUs as fast as the receiving host could absorb them. Figure 3.3 shows the measured data throughput achieved on a DEC 5000/200 with the UDP/IP protocol stack, where the IP MTU was set to 16 KB. The graphs depict results measured with DMA transfer sizes of one and two ATM cell payloads, and with cache invalidation in the OSIRIS driver. We make the following observations. First, the maximal throughput achieved is 379 Mbps with double cell DMA, 340 Mbps with single cell DMA, and 250 Mbps with single cell DMA when the data cache is pessimistically invalidated after each DMA transfer. The last number shows the significant impact of cache invalidations on throughput. In the DECstation 5000/200, all memory transactions occupy the TURBOchannel and no part of a DMA transaction can overlap with the CPU accessing main memory. Thus, memory writes and cache fills that result from CPU activity reduce DMA performance.

Figure 3.4: DEC 3000/600 UDP/IP/OSIRIS Receive Side Throughput (throughput in Mbps versus message size in KBytes, for single and double cell DMA, each with and without UDP checksumming)

Conversely, DMA traffic increases the average memory access latency experienced by the CPU. The combined effect of DMA overhead and main memory contention results in a maximum throughput rate of 340 Mbps in the receive direction. Note that in this experiment, network data is never accessed by the CPU. In the case where the data is read by the CPU (e.g., to compute the UDP checksum), the maximal throughput decreases to 80 Mbps, due to the limited memory bandwidth on this machine. Figure 3.4 shows the corresponding results obtained using DEC 3000/600 workstations. This machine has a greatly improved memory system. A buffered crossbar allows DMA transactions and cache fills/cache write-backs to proceed concurrently, and hardware ensures cache coherence with respect to DMA. The experiment was run with single and double cell DMA transfers, and with UDP checksumming turned on and off. With double cell length DMA, the throughput now approaches the full link bandwidth of 516 Mbps for message sizes of 16KB and larger. With UDP checksumming turned on, the throughput decreases slightly to 438 Mbps. This is an important result; it implies that the network data can be read and checksummed at close to 90% of the network link speed. Also, the throughput for small messages has improved greatly, thanks to the reduced per-packet

software latencies on this faster machine.

Figure 3.5: UDP/IP/OSIRIS Transmit Side Throughput (throughput in Mbps versus message size in KBytes, for the DEC 3000/600 with and without UDP checksumming, and for the DEC 5000/200)

The final set of measurements evaluates the network performance on the transmit side. The results for both the DEC 5000/200 and the 3000/600 are shown in Figure 3.5. The maximal throughput achieved on the transmit side is currently 325 Mbps. This number is limited entirely by TURBOchannel contention due to the high overhead of single ATM cell payload sized DMA transfers. A hardware change to allow longer DMA transfers in this direction is underway, but was not completed at the time of this writing. With double cell DMA transfers on the transmit side, the host-to-host throughput attained is expected to fall between the graphs for single cell DMA and that for double cell DMA on the receive side (Figure 3.4).

3.10 Concluding Remarks

The material presented in this chapter demonstrates that high-speed networking traffic can be efficiently processed at the level of the operating system on current, general-purpose workstations. It identifies the main issues that must be addressed to support high-speed network I/O traffic in the OS. That is, minimizing host/adaptor communication costs, including interrupt overheads, and streamlining data transfer between main memory and adaptor, which includes issues such as cache coherence, buffer fragmentation, and virtual memory page wiring. The following chapters address the related issue of delivering high-speed networking traffic to application programs without loss of performance. Finally, given that the OSIRIS adaptor was designed to provide maximal flexibility, it contains many more features than one would include in a production board. Based on our experience, we have found the following two features to be important, and would recommend that they be considered in future board designs.

• The ability to make an early demultiplexing decision; treating VCIs as an abundant resource that represents end-to-end connections is a reasonable way to do this on an ATM network. This is used by both the fbuf and ADC mechanisms.

• The ability to support multiple transmit and receive queues, and map each of them directly into user-level protection domains. It was this feature that facilitated the ADC mechanism.

CHAPTER 4 Decoupling Modularity and Protection

This is the first of a series of three chapters that present novel techniques that help deliver the performance of high-speed networks to application programs. In Chapter 2, we have seen that the protection domain boundaries that occur between operating system kernel and application programs present an obstacle for high-performance I/O operations. I/O data must be transferred across domain boundaries at a potential loss in bandwidth and added latency. Furthermore, protection boundaries destroy memory access locality, thus exposing the main memory bandwidth bottleneck. The problem is a particularly pressing one in microkernel-based operating systems. In these systems, I/O data may have to cross multiple protection domain boundaries, and several server processes may have to be scheduled for execution during each I/O operation. This chapter presents a new approach for structuring an operating system that affords all the benefits of a modular OS structure, without the performance cost of additional protection boundaries and server processes.

4.1 Motivation

The benefits of a modular operating system design are well-known: The system is more easily configured, scaled, extended, ported, maintained, verified, and distributed across multiple processors. Growing support for modular OS design has recently popularized the idea of microkernel-based systems [ABB+86, RAA+88, Hil92]: a kernel that implements the most basic abstractions, and a small number of user-level servers that provide the functionality of a specific abstract machine. While microkernel-based systems are an improvement over monolithic operating systems like UNIX, they fall short of satisfying a number of demands placed on modern operating systems.

The problem with microkernel-based systems is that they tightly couple modularity and protection: servers are implemented in separate protection domains, and consequently, the communication mechanisms are designed for the cross-domain case. A number of limitations arise as a result of this coupling. First, the modularity supported by these systems is very coarse-grained. Since each module is implemented in a separate domain, concern for cross-domain communication costs prevents a fine-grained decomposition of the system.¹ For the same reason, it is difficult to extend the system vertically, i.e., through stackable services, because each layer adds communication overhead that degrades performance. Second, the partitioning of functionality into servers, and in particular determining what functionality should be provided by the kernel, is static and part of an early design decision. Consequently, it is difficult to reconfigure functionality among servers and between kernel and servers. Such reconfigurations are desirable to satisfy the needs of applications, to match the characteristics of a variety of hardware architectures, and to integrate new technology as the system evolves.

The solution we propose is to provide architectural support for modules that is independent of protection domains. The key is a location transparent invocation mechanism that (1) has semantics similar to a local procedure call, (2) delivers performance close to that of an ordinary procedure call when the invocation is between modules in the same domain, and (3) allows the use of the most appropriate IPC mechanism in the cases of cross-domain and cross-machine invocations. Given such support, the decomposition of a system into modules can be guided by sound software engineering principles, with the modules distributed across protection domains and machines at configuration time based on criteria such as trust, security, fault-tolerance and performance. Put another way, protection and modularity are decoupled.

¹Note that even light-weight RPC is an order of magnitude more costly than local procedure calls.

4.1.1 Why Separate Modularity and Protection?

The fundamental reasons for providing support for modularity that is independent of protection are that (1) it allows modular decomposition without concern for cross-domain communication costs, and (2) the partitioning of functions into protection domains becomes a matter of configuration rather than design. These two facts have several important consequences, which are explored below.

First, while configuring a given system, the granularity of protection can be adjusted from one resembling a capability system [WCC+74] (with each module in a separate domain), through the granularity found in microkernel-based systems, up to no protection at all (with all modules in a single domain). The partitioning of modules into domains can be adjusted according to their stage in the software life-cycle, and/or the requirements in a particular installation of the system. For instance, a subsystem consisting of a set of modules can be configured with each of its modules in a separate domain for ease of fault detection and debugging during its testing phase, and later, in the post-release phase, combined into a single domain (or even the kernel domain) for performance. In other words, during the validation phase, the chosen granularity of protection is such that encapsulation is enforced. Once in the post-release phase, the granularity of protection is reduced to the point where only modules with different levels of trust are separated by domain boundaries.

Second, determining the set of functions to be included in the kernel domain, the subject of an ongoing debate in the microkernel-based OS community, becomes a matter of configuration. The kernel module, which we call the nugget, can be reduced to include only functionality that must be provided in the privileged domain: management of hardware resources. All other services are implemented in separate modules which may or may not be configured into the kernel domain. This is important because new kinds of servers built on top of the kernel, new hardware platforms, and new OS technology will continue to shift the "right" set of functions to be included in the kernel domain.

Third, a system built in this manner can be extended vertically through layered services [GHM+90, HP91] without imposing cross-domain invocation costs at each layer. Moreover, since interfaces are provided at module boundaries rather than domain boundaries, a layered service can be accessed at each level. That is, applications can access a layered system service at the level of abstraction most appropriate to their needs. This eliminates the need for unstructured, extensible interfaces like the UNIX ioctl call, which are used to allow the access of low-level services through an abstract, high-level interface provided at the domain boundary.

Note that most of the advantages of microkernel-based systems (when compared to monolithic systems) are a result of the modularity, the dynamic binding of clients and servers, and the location transparent communication mechanism. Most of the overhead associated with microkernel-based design, on the other hand, is a result of the more fine-grained protection. The proposed approach offers the advantages of fine-grained modularity, dynamic binding and location transparency without the cost of equally fine-grained protection. Decoupling modularity and protection turns the tradeoff between modularity and performance (which governs microkernel design) into a tradeoff between protection and performance.

4.1.2 How To Separate Modularity and Protection

An architecture that decouples modularity and protection must provide location transparency and dynamic binding between modules instead of merely between protection domains. Moreover, the module invocation mechanisms must be designed both for efficient intra-domain invocation and for cross-domain communication.

The requirements for a suitable location transparent invocation mechanism are somewhat different from those found in microkernel-based systems. The key difference is that many more interfaces exist (because the system is decomposed at a finer grain), and as a consequence, the most common case is an invocation between modules in the same domain. A suitable invocation mechanism should provide semantics similar to a local procedure call and deliver performance similar to a local procedure call when the invocation is local. This is necessary in order to encourage and permit a fine-grained decomposition of the system.

One key issue is how object references are implemented. Most location transparent invocation systems, including those found in microkernel-based systems, use global identifiers, such as ports or capabilities, to refer to all objects. With this approach, local invocations require the translation of a global identifier into a local object pointer. This design is pessimistic; it assumes that most object invocations are non-local. In the proposed architecture, most object references are to local objects. Consequently, an optimistic design that uses local pointers to refer to all objects is more appropriate, where the local pointer is translated in the case of a non-local invocation.

Another important issue is how to approximate the semantics of a local procedure call. The general approach is to shield the client and server from the complex issues of remote invocation. For example, binding and authentication are performed implicitly during object reference creation. Our design, which is discussed in detail in Section 4.3, employs the technique of proxy objects [Sha86] to provide local object references and to shield client and server from the complexity of remote invocation. Finally, an appropriate invocation facility must offer a set of parameter passing modes that provide the efficiency of pass-by-reference in the case of a local (same domain) invocation, yet can be implemented in a straightforward and efficient way in the case of non-local invocations.

Figure 4.1: Lipto Architecture

4.2 Architectural Model

We have implemented and evaluated our proposed architecture in the context of an experimental operating system called Lipto. We focus here on those features of Lipto that pertain to the subject of this chapter. A more comprehensive description of Lipto's design and motivation can be found elsewhere [Dru93, DHP91, DPH91].

The components of Lipto's architecture are a fixed nugget, a module/object infrastructure, and a configurable collection of modules. The nugget consists of the lowest-level resource management, such as memory management, processor allocation and exception handling. The nugget is a truly minimal kernel; it includes functionality that must be executed with the machine in kernel mode, but not functionality that may be put into the kernel domain for performance. The module/object infrastructure (MOI) provides location transparency and dynamic binding at the granularity of modules. All functionality not found in either the nugget or the module/object infrastructure is implemented in the form of configurable modules, which can be linked into any protection domain, including the privileged kernel domain of each machine. The basic Lipto architecture is illustrated in Figure 4.1.

A module provides the implementation for one or more types of objects. For example, a module that implements the top layer of a file system might provide two object types: a service object of type file manager and a resource object of type file. A module is the unit of configuration and distribution; it provides implementations of object types that represent a service or resource. Objects are passive run-time entities; they represent services and resources. An object's code is executed by independent threads of control upon invocation of one of the object's operations. The architecture places three constraints on the implementation of objects: (1) an object must export a procedural interface consisting of a set of functions with well-defined signatures, (2) an object must not share data with any object implemented in a different module, and (3) the object invocation mechanism must conform to the architecture's specification. The module implementor is free to use any programming language and methodology, as long as these conditions are satisfied.

Encapsulation of objects with respect to objects implemented in other modules can be achieved in two ways. One is to use a programming language that enforces encapsulation through a safe type system. Unfortunately, such languages are currently not widely used in the implementation of operating system software. Without language support, encapsulation has to rely on convention. This has two consequences: first, each module should be configured into a separate protection domain during the testing phase to detect violations of encapsulation. Second, only modules that enjoy mutual trust can be configured into a common domain. Otherwise, an untrusted module could maliciously violate encapsulation to gain unauthorized access to information or resources.

The task of Lipto's module/object infrastructure is to provide location transparency. It consists of a service called the system directory that maps global object identifiers to object references, and a location transparent object invocation mechanism. The system directory allows an authorized client object to obtain a reference to a server object using a global name for that object. This reference can be used to invoke the operations defined for the server object's interface.
We omit a description of the mechanism for naming and locating objects, since it is not relevant to the aspects of Lipto's architecture that are the focus of this chapter. The design and implementation of the invocation mechanism is described in detail in the next section. Lipto includes a proxy compiler that takes as input an interface definition for a class of modules that export the same interface. It produces target language specific interface definition files and source code that implements proxy objects. Our proxy compiler currently supports C and C++ as target languages for the implementation of modules.
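As an illustration of what such a target-language interface definition amounts to, the file-system example above might correspond to C++ abstract classes along the following lines. This is only a sketch under assumed names (File, FileManager, and their operations are not taken from Lipto's actual service classes); both a module's real implementation and its compiler-generated proxies would export the same interface.

    // Hypothetical C++ rendering of a service-class interface of the kind the
    // proxy compiler emits for module implementors (all names are illustrative).
    class IOData;                       // system-defined container for raw I/O data

    class File {                        // resource object type
    public:
        virtual int  read(IOData **data, int nbytes) = 0;
        virtual int  write(IOData *data) = 0;
        virtual void close() = 0;
        virtual ~File() {}
    };

    class FileManager {                 // service object type
    public:
        virtual File *open(const char *name) = 0;
        virtual ~FileManager() {}
    };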

4.3 Object Invocation Mechanism

This section describes in some detail the design and implementation of Lipto's object invocation mechanism. We start with the overall design issues and proceed in the following subsections with a description of individual components.

4.3.1 Design Issues

The main goals in the design of Lipto's invocation mechanism are: (1) to provide location transparency while closely resembling the semantics of a local procedure call, and (2) to retain the efficiency of a local procedure call in the intra-domain case. As stated earlier, this is essential for decomposing the system at a fine grain. The technique of proxy objects permits the representation of all object references as local pointers, and thus a very efficient implementation of local invocations. This technique is "optimized" for local invocation, which is the most common case. It is the task of the proxy objects to handle the remote case, hiding the associated complexity from both caller and callee.

Several problems arise in the attempt to provide location transparency while retaining local procedure call semantics: remote procedure calls can fail; it is difficult to provide pass-by-reference parameter passing semantics; binding a caller and callee involves complex naming, locating, protection, and authentication issues; and the performance of local, same-machine, and cross-machine invocations each differs by an order of magnitude. Most existing remote invocation systems do not attempt to hide the differences between local and remote calls, and require the caller to explicitly handle all the issues of a remote invocation, such as binding, authentication and failure handling. This approach conflicts with our goals since it exposes the full complexity of remote invocation to the caller, which would discourage a fine-grained decomposition.

The complexity of certain mechanisms used in the module/object infrastructure, and their associated costs, depend on the scale of the client-server distribution. Binding, authentication and failure detection are relatively simple between domains on a shared-memory machine or between nodes in a tightly coupled multiprocessor. They are more complex among machines connected by a LAN, and require elaborate protocols when the machines are connected by a WAN. Note that in a fine-grained modular system, many modules need only be accessed within a limited scope, e.g., within the same machine. Only modules that represent entire subsystems must be accessible on a larger scale. Consequently, the most common cases can be handled using simple and efficient mechanisms. A module's access scope is determined when it is registered with the system directory, and is transparent to the module implementor. We proceed with a discussion of the issues in location transparency from the perspective of the module implementor.

4.3.1.1 Binding and Authentication

Before invoking any operations on an object, a client has to obtain a reference for that object. This allows binding and authentication to be hidden inside the mechanism for reference creation, thereby shielding the client from these issues. Further, the cost of binding, authentication and resource allocation is associated with reference creation, which allows a more efficient implementation of invocations because a "pseudo-connection" between client and server object is established. The implicit assumption is that a client that obtains a reference to an object will invoke several operations on that object before relinquishing the reference, so that the cost of reference creation is amortized over several calls. For cases where this is not a reasonable assumption, the proxy compiler can generate proxy object implementations for individual object types that delay connection establishment until the first invocation.

4.3.1.2 Parameter Passing

Lipto's invocation mechanism provides the parameter passing modes val, ref, in, out and inout. Val is the default mode for data types. Since it always causes parameters to be copied, it is only used for simple data types. Ref is the mode used for arguments of an object type. A reference for the object is passed and invocations of the object's operations through this reference are forwarded; the object is neither moved nor copied.

The only exception to this rule applies to the system-defined object type IOData (see Section 2.5). The IOData type encapsulates untyped, immutable data. When an instance of type IOData is passed as an argument/result, it is logically moved. Instances of type IOData are typically used as containers for raw data that is passed through the layers of the I/O subsystem.

In Lipto, we have found the parameter passing modes val (for simple data types) and ref (for object types) to be sufficient in almost all cases. This is because complex arguments are usually encapsulated in objects, and bulk data is passed in IOData objects.

However, to provide efficient passing modes for special situations, and to ease the transition of non-object-oriented interfaces, the additional passing modes in, out and inout are provided, which can be applied to arbitrary non-object data types. In the non-local case, these modes cause parameters to be copied in the usual way, but in the local case, they are implemented by passing a pointer. This eliminates parameter copying costs in the case of a local invocation, which could otherwise impose a performance penalty in the local case due to the lack of a by-reference parameter passing mode.

In order to ensure location transparency, modules that use in, out, or inout passing modes must obey three restrictions. In particular, modules must not (1) pass aliases as actuals to in, out, or inout formal parameters, (2) reference an out formal parameter before it is set, or (3) modify an in formal parameter. In languages where these constraints can be expressed, the proxy compiler generates appropriate code as part of the interface definition. For modules implemented in languages such as C, the programmer is responsible for verifying these constraints. Alternatively, one could devise language specific type checkers that warn about potentially location dependent code.
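To make the passing modes concrete, the following sketch shows how an operation declared with these modes might be rendered in C++ for the local case. The operation, its types, and the exact mapping are assumptions made for illustration; they are not Lipto's actual generated code.

    // Hypothetical mapping of passing modes onto a local-case C++ signature
    // for a declaration such as:
    //   status(val int flags, in Options opt, out Stats st, ref IOData data)
    struct Options { int verbosity; };          // illustrative non-object data types
    struct Stats   { long packets; long bytes; };
    class IOData;                               // system-defined type, logically moved

    class Endpoint {                            // hypothetical server object interface
    public:
        virtual int status(int flags,           // val: copied
                           const Options *opt,  // in:  passed by pointer, not modified
                           Stats *st,           // out: passed by pointer, set by callee
                           IOData *data) = 0;   // ref: an object reference is passed
        virtual ~Endpoint() {}
    };
    // In the non-local case, the generated proxies copy in/inout arguments to
    // the callee and out/inout results back, so callers see the same signature.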

4.3.1.3 Failure Handling

We next consider the issue of invocation failure handling. Our approach is to give the interface designer some flexibility. One choice is to reserve a special result value for each operation that indicates an invocation failure. The pitfall of this method is that the programmer has to include a test for failure after each invocation. Alternatively, a module interface specification can require each client object to support an upcall interface that includes an operation for failure notification. With this approach, clients can handle failures as exceptional conditions, and no explicit test is necessary after an invocation. Finally, the interface designer is free to provide no failure notification at all in an interface. When an invocation fails, the system takes the default action of terminating the client's protection domain. This method effectively eliminates the potential for independent failure of client and server, and thus obviates the need for failure handling. Note that this may be a reasonable approach for many modules whose access scope is limited to the same physical machine.

In general, location transparent invocation introduces complexity in failure handling, due to the potential for independent failure of modules. We believe that our design provides some flexibility, and minimizes the impact of this issue on the feasibility of fine-grained decomposition.
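A minimal sketch of the second option, assuming a hypothetical interface name and operation signature (the concrete upcall interfaces are not given in the text):

    // Hypothetical failure-notification upcall interface exported by a client
    // object.  The client proxy invokes it when an invocation on the bound
    // server object cannot be completed, so the client needs no per-call test.
    class FileClientUpcalls {
    public:
        virtual void invocation_failed(int operation_number) = 0;
        virtual ~FileClientUpcalls() {}
    };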

4.3.1.4 Performance

The final issue is that of non-local invocation performance. Our invocation mechanism relies on a configurable RPC service, which allows the dynamic substitution of the most appropriate RPC mechanism. As a consequence, RPC protocols can be used that take advantage of locality or special hardware support. In current hardware architectures, non-local invocations, even on the same machine, are an order of magnitude more costly than a procedure call. Thus, there is an inherent tradeoff between fine-grained protection and performance. Lipto's location transparent invocation system provides flexibility in dealing with this tradeoff. First, the dynamic configuration of RPC services allows the use of protocols that provide performance close to the architectural limits in each case. Second, proxy objects can employ caching mechanisms to reduce the performance hit of remote invocations. Third, by decoupling modularity and protection, the architecture allows the adjustment of the granularity of protection according to the needs of a particular installation and its users.

4.3.2 Object References and Local Invocation

Now consider the implementation of object references and local invocation. From the module/object implementor's perspective, all invocations are local; either the invocation is to a local server object, or to a proxy object. Thus, the implementation of object references and the conventions for local invocation entirely define the interface between module implementations and the module/object infrastructure.

An object reference is a pointer to a structure that represents the state of the referent object. This state structure is object implementation dependent. The architecture defines only the first member: a pointer to a table of <function pointer, offset> tuples, one for each operation that the object supports, indexed by the operation number. The function pointer refers to a function that implements the operation. The offset is added to the value of the object reference, and the resulting value is passed as a hidden first argument to the function. Object implementations are responsible for generating the operation table at compile time. To perform an invocation, the appropriate tuple is fetched from the table using the operation number as an index. Then, the offset is added to the value of the object reference and passed to the function, along with the operation's explicit arguments.

Our implementation of object references and invocations is identical to the way many C++ compilers implement object references and invocations. This is not accidental: since C++ is our primary implementation language for system modules, and we are using the GNU C++ compiler, it was convenient to choose our convention to conform to this compiler's implementation. This allows modules implemented in C++ to invoke objects implemented in other modules as if they were C++ objects.
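The following C++ sketch illustrates the convention just described. The concrete names and the treatment of explicit arguments (omitted here) are assumptions for illustration, not Lipto's actual declarations.

    #include <cstddef>

    // Operation implementations receive the adjusted object pointer as a hidden
    // first argument; explicit arguments are omitted from this sketch.
    typedef void (*OpFn)(void *self);

    struct OpEntry {                 // one <function pointer, offset> tuple
        OpFn           fn;           // function implementing the operation
        std::ptrdiff_t offset;       // added to the object reference before the call
    };

    struct ObjectState {             // first member mandated by the architecture
        const OpEntry *ops;          // operation table, indexed by operation number
        // ... implementation-dependent object state follows
    };

    // A local invocation: fetch the tuple, adjust the reference by the offset,
    // and call the function with the adjusted pointer as its hidden argument.
    // This mirrors the virtual-function dispatch of many C++ compilers.
    inline void invoke(ObjectState *ref, int op) {
        const OpEntry &e = ref->ops[op];
        e.fn(reinterpret_cast<char *>(ref) + e.offset);
    }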

4.3.3 Non-Local Invocation

Whenever a reference is created to a non-local object, a client proxy object is instantiated in the local domain, and a reference to the proxy object is returned to the client. This proxy object supports the appropriate interface, and is responsible for forwarding invocations to a server proxy object in the callee object's protection domain. Upon receiving an invocation, this server proxy in turn invokes the callee object and passes any results of the invoked operation back to the client proxy, which eventually returns control to the calling object. The invocations from caller to client proxy, and from server proxy to callee, are handled as local invocations, as described above. The caller is unaware of the fact that the callee is remote, just as the callee is unaware of the fact that it is being invoked by a remote object.

The pair of proxy objects forward invocations between client and server object using a remote procedure call service. The specific RPC service used depends on the location of the server object with respect to the client object. Figure 4.2 illustrates the basic components involved in a cross-domain call. Note that the "protection boundary" in this figure could correspond to a domain boundary as well as a machine boundary. The RPC service module may rely on a lower-level communication service to accomplish its task.

Figure 4.2: Proxy Objects and RPC

It is useful to isolate the orthogonal issues involved in location transparent object invocation. We have identified three such issues: (1) distribution hiding, (2) data transfer, and (3) control transfer. These three concerns are explicitly separated from one another in our implementation. First, distribution hiding is the responsibility of the proxy objects. Second, the data transfer mechanism is hidden inside the abstract data type IOData. The IOData data type encapsulates data as it is passed between the client proxy and server proxy. Finally, control transfer is handled by a set of highly specialized RPC services: a new protocol called user-kernel RPC (UKRPC) for the case of a user-level object invoking a kernel object; a new protocol called kernel-user RPC (KURPC) for the opposite case; LRPC and URPC protocols for invocations among user-level objects on the same machine; and Birrell-Nelson RPC for cross-machine invocations. The following subsections consider the three components of location transparent object invocation, in turn.

4.3.3.1 Distribution Hiding and Proxy Objects

The client proxy and server proxy hide distribution and communication issues from the client and server objects. The proxy pair must perform several tasks: (1) binding and authentication; (2) failure handling; (3) object reference management; (4) argument/result marshaling; (5) argument/result validation; (6) performance enhancement (caching); and (7) replication management. The proxy object implementations generated from an interface definition by the proxy compiler handle the first five tasks. Tasks (6) and (7) are related to performance and fault-tolerance; handling of these issues is optional. The compiler-generated proxy object implementations can be manually augmented to employ performance improving techniques such as caching, and reliability-related techniques such as transparent server replication based on an underlying group communication service.

Binding and authentication are performed during the instantiation of a pair of proxy objects; each proxy is passed an address object that refers to its peer. The address object encapsulates an RPC address and authentication credentials; it is used by the proxies to establish a connection.

Object reference management (3) is concerned with the mechanisms and policies used to translate an object reference passed as an argument/result into a reference that is meaningful in the recipient's domain. There are subtle issues concerning references and object identity, the details of which are omitted for the sake of brevity; for a discussion of these issues, see [JLHB88]. Lipto avoids overheads associated with reference translation by allowing more than one proxy object (in the same domain) to represent a given remote object. This approach makes it harder to determine object identity, but our experience suggests that the operations whose performance suffers from this are few and infrequently invoked. Alternatively, Lipto's proxy compiler could be extended to generate proxies that employ different policies for each reference type, according to information in the service class definition.

Tasks (4) and (5) require special attention because they do not have to be performed in all cases. For example, data presentation is not an issue when the invocation is between objects that reside on machines of the same architecture; argument/result validation can be omitted if they are supplied by an object in a trusted domain; and argument/result marshaling can be simplified in the case of an invocation between objects on the same machine. Our compiler-generated proxy objects take advantage of these optimizations and perform these functions only when required.
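The following sketch shows the general shape of such a proxy pair. All names (Counter, Message, RpcChannel) and the marshaling details are assumptions introduced for illustration; they stand in for whatever RPC service is configured and are not Lipto's actual interfaces.

    #include <cstdint>

    struct Message { uint32_t op; int32_t arg; int32_t result; bool ok; };

    class RpcChannel {                 // bound when the proxy pair is instantiated
    public:
        virtual Message call(const Message &m) = 0;   // UKRPC, KURPC, LRPC/URPC or network RPC
        virtual ~RpcChannel() {}
    };

    class Counter {                    // interface exported by server object and proxies alike
    public:
        virtual int add(int delta) = 0;
        virtual ~Counter() {}
    };

    // Client proxy: looks like a local Counter, forwards each invocation over RPC.
    class CounterClientProxy : public Counter {
    public:
        explicit CounterClientProxy(RpcChannel *ch) : channel(ch) {}
        int add(int delta) override {
            Message req{ /*op=*/0, static_cast<int32_t>(delta), 0, true };  // marshal op number and argument
            Message rep = channel->call(req);        // control transfer to the server proxy
            return rep.ok ? rep.result : -1;         // failure handling per the interface design
        }
    private:
        RpcChannel *channel;
    };

    // Server proxy: unmarshals the request in the callee's domain and performs a
    // plain local invocation on the real server object.
    class CounterServerProxy {
    public:
        explicit CounterServerProxy(Counter *obj) : target(obj) {}
        Message dispatch(const Message &req) {
            Message rep = req;
            rep.result = target->add(static_cast<int>(req.arg));
            rep.ok = true;
            return rep;
        }
    private:
        Counter *target;
    };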

4.3.3.2 Data Transfer and the IOData Type

The data transfer component of object invocation is handled in the implementation of the abstract data type IOData, which was introduced in Section 2.5. An IOData instance contains raw (untyped) data; its implementation avoids physical copying of data whenever possible. Abstractly, an IOData object is a string of untyped, immutable data. It supports a number of operations such as concatenate, fragment, truncate, prepend and append, to facilitate copy-free processing of the data by network protocols. Users of an IOData object cannot access its data directly; data must be explicitly embedded into and retrieved from the object. The implementation of the IOData type is integrated with the fbuf facility described in Chapter 5, which permits efficient data transfer across protection domain boundaries.
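The interface implied by this description might look roughly as follows in C++. The signatures, and the choice to have the editing operations return new logical buffers, are assumptions for illustration; the actual IOData class is not reproduced in the text.

    #include <cstddef>

    // Sketch of an abstract IOData interface: immutable, untyped data that can
    // only be touched by explicitly embedding and retrieving bytes, plus the
    // copy-free editing operations used by network protocols.  The editing
    // operations are assumed to yield new logical buffers without copying or
    // modifying the underlying bytes.
    class IOData {
    public:
        virtual void        embed(const void *src, std::size_t len) = 0;         // put data in
        virtual std::size_t retrieve(void *dst, std::size_t len) const = 0;      // get data out

        virtual IOData *concatenate(IOData *other) = 0;                    // join two buffers
        virtual IOData *fragment(std::size_t offset, std::size_t len) = 0; // carve out a PDU
        virtual IOData *truncate(std::size_t new_length) = 0;
        virtual IOData *prepend(const void *hdr, std::size_t len) = 0;     // e.g., add a header
        virtual IOData *append(const void *trl, std::size_t len) = 0;      // e.g., add a trailer
        virtual std::size_t length() const = 0;
        virtual ~IOData() {}
    };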

4.3.3.3 Control Transfer

We now turn our attention to the various RPC protocols used to transfer control across protection boundaries. UKRPC implements the case of an object in a user-level address space calling an object in the kernel address space; it is a simple system call disguised as an RPC service. KURPC is responsible for handling calls from the kernel address space to an object in a user-level address space. Thus, it implements what is commonly called a user-space upcall, again disguised as an RPC service. What has become known as light-weight RPC (LRPC) [BALL90] is in our implementation simply a combination of UKRPC and KURPC.

For cross-machine invocations, we use a decomposed version of the conventional Birrell-Nelson RPC protocol [BN84, OP92]. This RPC service is implemented as three independent modules. Notice that this service, while it is used as a communication service for object invocation, could itself be configured with a domain boundary between any two component modules. In fact, the communication service used between two proxy objects can be arbitrarily complex. Consequently, the client and server proxy can transparently span heterogeneous systems and networks.

4.4 Performance

We ran a series of tests to evaluate the performance of our object invocation mechanisms. In each test, we measured the elapsed time for 1,000,000 object invocations performed in a tight loop. From this number, we subtracted the elapsed time for 1,000,000 iterations of an empty loop to compensate for the cost of the loop, and divided the result by 1,000,000. Most of the tests were run on a Sun 3/75, which uses an MC68020 microprocessor running at 16.67 MHz. The processor has a 256 byte on-chip instruction cache. Our measurements suggested that in the test case for the invocation of an object within the same address space, all instructions of the test loop were held in the instruction cache (i-cache) after the first iteration. In all other test cases, the instructions of the test loop exceeded the capacity of the i-cache. In order to make the numbers comparable, we included code in the test loop that ensures that every iteration starts with a cold i-cache.

Although Lipto has not been ported to RISC-based machines, we have measured individual components of our invocation mechanisms on both a Sun SPARCstation 1 (SPARC processor at 20 MHz) and an IRIS 4D/25 (MIPS R2000 at 20 MHz). We then calculated the expected performance of the simplest invocation cases from these measurements. Note that the measurements on the RISC machines were made with warm caches, which seems reasonable given their larger instruction caches.

The following reports on four separate experiments. First, we measured the round-trip cost of a local (intra-domain) object invocation, which corresponds to a C++ virtual function call. Table 4.1 quantifies the overhead imposed by Lipto's location transparent object invocation in the local case by comparing a local invocation to an ordinary function call. The numbers presented in this table are in microseconds, and were measured using the GNU C++ compiler, version 1.37.1.

Table 4.1: Local Call Performance (μsec)

                                    MC68020   SPARC   MIPS R2000
  System Call                          26      15.2        9
  No Arguments                         57      22         15
  One IOData Argument                  51      19         13
  Two int Arguments, int result        69      24         16

Table 4.2: UKRPC Performance (μsec)

The overhead is mainly due to the indirection caused by the late binding, and the fact that the object pointer is passed as an argument. Note that our version of the GNU compiler does not use register windows on the SPARC. Using register windows, we suspect that the invocation cost on the SPARC would be within twice the cost of a function call, as is the case on the other processors. This cost, paid for location transparency and late binding, is small enough not to stand in the way of a fine-grained decomposition of the system.

Second, we measured the round-trip cost for an object in a user address space calling an object in the kernel address space (UKRPC). The results for various arguments and results are given in Table 4.2. The row labeled System Call shows the cost of a kernel trap plus invocation of an empty C function in the kernel address space for each of the machines. It provides a reference value to determine the added overhead of a UKRPC invocation. Note that an invocation with one IOData argument is faster than a call with no arguments. This is because no argument message has to be created when one of the arguments is of type IOData; this IOData object is used as the argument. Also note that the cost of such an invocation is independent of the size of the IOData object. This is because no data is being copied. We found that on the MC68020, a user-to-kernel object invocation costs approximately twice as much as a simple system call, but on a RISC processor, the additional penalty is on the order of only 33 to 50 percent.

Third, we measured the round-trip cost of a kernel-to-user object invocation with various arguments and results. The results are presented in Table 4.3. One observation is that upcalls are an order of magnitude more expensive than their counterpart downcalls. This is in part because of the lack of hardware support (i.e., a counterpart to a system trap) available on the Sun 3/75. Note that by combining UKRPC with KURPC, one gets a user-to-user RPC (LRPC) time of approximately 244 μsec in the null argument case, which is comparable to LRPC times reported elsewhere in the literature [BALL90].

                                    MC68020
  No Arguments                         187
  One IOData (128 bytes) Argument      240
  Two int Arguments, int result        247

Table 4.3: KURPC Performance (μsec)

Finally, we measured the round-trip performance of Lipto's network RPC service suite. This is interesting for two reasons. First, this suite implements one of the three basic RPC cases identified in Section 3. Second, our network RPC is in fact implemented by a stack of modules that invoke each other using Lipto's location transparent invocation mechanism.

We have explicitly compared the performance of this service suite with a corresponding suite implemented in the x-kernel [HP91], which does not provide a location transparent invocation mechanism. As illustrated in Table 4.4, the performance of this suite in the two systems is almost identical. Note that KURPC and UKRPC were used to cross the user/kernel boundary, which in this example was between an application object and the topmost module of the RPC protocol stack. The numbers reported are in milliseconds.

               Lipto RPC   x-kernel
  0 Bytes         2.6        2.6
  128 Bytes       2.9        2.9
  1k Bytes        5.2        5.1
  2k Bytes       11.8       11.7
  3k Bytes       15.0       14.8
  4k Bytes       17.0       16.7
  8k Bytes       29.2       29.2
  15k Bytes      50.0       50.7

Table 4.4: NRPC Performance (msec)

In summary, Lipto's location transparent invocation mechanism is at most twice as expensive as a statically bound invocation mechanism. Moreover, a "macro" experiment comparing Lipto to the x-kernel suggests that this increased cost of the invocation mechanism has little effect on the performance of the system as a whole.

4.5 Discussion

The distinction between modularity and protection has often been blurred in operating system design. This implicit merging of modularity and protection, in turn, puts stress on how interfaces get designed, and how they evolve over time. This section looks at modularity and protection from a historical perspective, makes several comments about interface design, and relates our work to other object systems.

4.5.1 Modularity and Protection

A review of previous operating system architectures indicates that modularity and protection have always been implemented by a single mechanism, that is, they have always been tightly coupled. At one end of the spectrum, several systems have employed capabilities as their modularity/protection mechanism. Capabilities may either be implemented in hardware as in the Cambridge CAP [WN80, Yud83] and the Intel 432, or in software on top of standard paged virtual memory hardware as in Hydra on C.mmp [WCC+74, WLH81]. Whether implemented by hardware or software, such systems offer fine-grained modularity/protection, but limited flexibility since a single mechanism is used for both functions. Interfaces may be defined by the programmer, but every invocation through an interface requires crossing a protection boundary as well, resulting in poor performance.

At the other end of the spectrum, UNIX [RT74] is an example of a monolithic system. Both modularity and protection are supported only at a very coarse grain by the distinction of kernel and user contexts, where a single fixed invocation interface provides access to all system services. All services provided by the operating system are implemented in a single protection domain, and thus cannot be protected from each other. Moreover, the code contained in this single address space uses shared/global data, and thus provides almost no encapsulation. In such systems, both modularity and protection are sacrificed for speed.

The trend towards the use of UNIX as a standard operating system platform has stressed its design from two directions. First, new application domains have created a constant demand for new, specialized system services. Second, the increasing variety of hardware platforms to which the system is ported has made it necessary to modify and extend the system in major ways. This has caused a dramatic increase in the size and complexity of the system, while at the same time its portability, reliability and efficiency have suffered.

In reaction to the maintenance problems caused by monolithic systems, and in order to provide more flexibility for implementing new kinds of services, several server-based systems are being implemented. These systems are characterized by a small kernel or microkernel which provides only basic services, and multiple servers which provide all other services. Examples of such systems include Chorus, Topaz, and Version 3 of Mach [ABB+86, RAA+88, TSS88]. Server-based systems attempt to find a balance between modularity and performance by supporting coarse-grained protection and modularity, with the granularity at the level of the address space. Each service is allowed to define its own interfaces, which are implemented on top of a small, fixed set of interfaces to the kernel's primitive communication services. However, crossing an interface always requires crossing a protection boundary. As a result, the granularity of modules is rather coarse. While the system can be extended horizontally, i.e., by adding servers which rely on the kernel, it cannot readily be extended vertically, i.e., by adding servers which rely on the services provided by other servers, because communication costs are a concern. Finally, the kernel defines a fixed set of functions which are implemented in the kernel domain. Designers of these systems have been debating for years in an effort to determine this set, striking a balance between concerns to keep the kernel small, and concerns for efficiency, which call for the inclusion of more functionality into the kernel domain. We believe that a fixed set of functions which will satisfy all applications and allow the system to perform well on all hardware platforms does not exist. The problem is again that the idea of a module, the microkernel in this case, is associated with a protection domain, the kernel domain.

While modularity and protection have been coupled in previous operating system architectures, this coupling is not essential. From the modular decomposition of a system we hope to obtain software engineering advantages by limiting the knowledge that each module has of its surroundings. Such advantages include configurability, location transparency, and failure localization. In order to support modularity it is sufficient to be able to separate the interface of a module from its implementation, and to encapsulate modules so that they do not depend on shared data. Language support for this encapsulation is desirable, but not necessary. Encapsulation can rely on a weakly typed language, as is the case with Lipto, which uses C++.² In fact, encapsulation can be based entirely upon programming conventions, as long as it is possible to reconfigure a given module into an isolated address space so that its encapsulation is enforced in hardware, if that is necessary.

Protection provides for failure isolation so that a failure in one component of a system cannot adversely affect the operation of another correct component. Such failure isolation is desirable whenever two components of a system enjoy different levels of trust, such as between two user components or between user and system components of a general-purpose operating system. The recent trend to server-based systems provides evidence that such failure isolation may also be desired between system components. Protection requires implementing domains (contexts, address spaces) that are separate from each other, as well as safe mechanisms for cross-domain communication.

²By weakly typed, we mean that type safety is not strictly enforced in all cases.

A long-held myth claims: "there is a fundamental tradeoff between modularity and performance." The programming language community has enjoyed some success in dispelling this myth with the introduction of languages that explicitly support modularity, either in the form of abstract data types as in CLU or Ada, or objects as in C++ or Eiffel. One key to the success of these languages is the observation that modularity can be enforced at compile time instead of at run time. In operating systems, since modularity and protection have always been implemented using a single mechanism, the myth has been more difficult to dispel. As long as crossing a module boundary implies crossing a protection boundary, it will be true that modularity can be obtained only at the expense of performance. A major contribution of this work is the observation that the separation of modularity and protection allows their enforcement to be similarly separated.

4.5.2 Interface Design

In traditional operating system architectures, all operating system services are accessed through a single, general, static system call interface. We believe that traditional system call interface designs have three major flaws. First, they are only available at protection boundaries. This implies that it is difficult to layer new services atop existing ones. Second, it is difficult to modify the interface as the system evolves. Additions to the interface can be handled in a backwards compatible way, but modifications to the interface, or the introduction of alternate implementations of a service (e.g., signal(2) vs. sigset(2)) are problematic. The addition of new hardware to the system, which would most naturally be expressed by the addition of a new service interface, must be handled by less structured means such as the UNIX ioctl call. Third, it is difficult to provide direct access to lower level services. Naive applications require insulation from the details of the hardware on which they are executing, but sophisticated applications (such as databases and real-time audio and video) require low level access to the hardware in order to achieve acceptable performance.

An implementor of a database system, for example, does not want the underlying filesystem to perform buffering and prefetching, because these mechanisms are likely to interact negatively with the database's own caching mechanisms. Consequently, one would like the filesystem to be accessible through a low-level interface for use by sophisticated applications, in addition to the standard, high-level interface provided for naive applications. Similarly, databases often do not perform well on top of the virtual memory/process abstraction; they could benefit from a low-level interface to the memory and process management system that allows them to participate in page replacement and thread scheduling decisions [Duc89].

One of the innovations of the UNIX operating system is its uniform treatment of files and devices; a general, uniform interface (open, close, read, write, seek) is provided for such access. Together with the convention for the usage of the file descriptors stdin, stdout and stderr, it is the key to the composability of simple filter-like applications. This UNIX file interface is an example of what would be called a service class in Lipto. The problem, however, is that in UNIX, lower level file system features and features that are specific to certain devices are only accessible through awkward, unstructured interfaces such as the ioctl and fcntl system calls. The BSD UNIX socket interface, on the other hand, was designed as a general interface for access to network services at different levels. Its generality, however, not only makes it difficult to use, but also leads to paradoxical cases where the use of a lower level network service is more expensive than the use of a higher level service. As an example, in SunOS 4.0, the protocol suite UDP/IP/ETH performs better than the suite IP/ETH [HP91].

The separation of protection from modularity in Lipto encourages the modular decomposition of the system services, allowing and requiring the definition of many more interfaces. Each kind of service (File service, Network communication service), and each level of service of the same kind (Datagram, Stream, Remote procedure call), defines its own set of interfaces in the form of a service class. An application simply accesses a service at the desired level, using the appropriate service class. Thus, a naive application that has very general requirements for the type of services it is using uses a general, high-level service class. A sophisticated application, on the other hand, accesses its services using a lower-level, more specific service class.

It is not sufficient, however, to merely define and implement the mechanism that allows the addition of new interfaces. First, some conventions are required that prevent a proliferation of interfaces. Note that the software engineering benefits that come with a modular, composable system are maximized when the number of interfaces is relatively small, and the number of modules that support the same interface is large. A decision to define a new interface must be based on the determination that no existing interface can be used to access a new service in a natural way. This should only happen when the new service actually is of a new kind, or makes use of a new technology. Second, the architecture must provide support for managing the interface space. Although Lipto initially defines a flat set of service classes, the architecture is general enough to allow the definition of a hierarchy of service classes based on the subtyping principle. Thus, we can support the case where future technologies require the addition of new operations to the interfaces of an existing service class, while backward compatibility is required with respect to existing services and applications that use the service class.

4.5.3 Object Systems

Note that although Lipto's module/object infrastructure and currently almost all of its system modules are implemented in C++, module implementors can in principle use any implementation language; i.e., Lipto is language independent from the module implementor's perspective. This is different from language-level object systems such as Emerald and Hermes [JLHB88, Str90], which support location-transparent objects in a single-language environment.

In the case where other object-oriented languages are used, our design does not impose an object model for language level objects. Moreover, Lipto does not provide an object management layer that supports the creation, migration and storage of such objects. Instead, it is our policy that distributed applications should either be implemented in a language that supports distributed and persistent objects, or use an object-support layer on top of the basic system services which provides these facilities. This is an important feature that makes Lipto different from other object-oriented systems, such as SOS [SGH+89]. In other words, Lipto is an operating system that uses an object-oriented structuring technique internally; it is not an object-support system.

4.6 Related Work

The FLEX project [C+93a] provides support for decomposing existing operating system implementations using the approach of decoupling modularity and protection. The Spring operating system [HK93, HPM93] uses a location-transparent object model. However, object invocations are more complex and less efficient than Lipto's. Consequently, Spring's objects are more coarse-grained than Lipto's modules. Psyche [SLM90a, SLM+90b] has separate abstractions for modularity and protection to facilitate parallel programming in multiple models.

The Chorus operating system allows its servers (actors) to be collocated with the microkernel in the same protection domain [RAA+88, RAA+92]. This results in increased system performance due to reduced communication costs. The same capability was added to a version of the Mach 3 kernel [LHFL93]. There are no provisions for collocating servers in user-level protection domains, and the coarse-grained servers comprise entire subsystems, as is the case in conventional microkernel-based systems.

Software fault isolation [WLAG93] has been proposed as an alternative to hardware-enforced protection between mutually distrusting software modules. It provides fault isolation, but not data privacy, at a moderate performance cost. This technique is complementary to Lipto's decoupling of modularity and protection.

Clouds is a distributed operating system that is, like Lipto, based on the object-thread model [DLJAR91]. However, a Clouds object is persistent and resides in its own protection domain. Consequently, Clouds objects are "heavy-weight" and they do not support fine-grained decomposition. Choices is an object-oriented operating system [CRJ87]. It has a fine-grained, modular structure based on the encapsulation, subtyping, and inheritance mechanisms of its implementation language, C++. However, the entire system is contained in the kernel domain. The system's application interface is exported to user-level domains using proxy objects, but the internal interfaces are strictly local.

4.7 Concluding Remarks

This chapter makes three contributions. First, it articulates the principle of decoupling modularity and protection domains. We believe this principle is critical to the successful realization of modular operating systems. In particular, it allows increased modularity at a smaller cost, improves scalability, portability and distributability, and permits vertical extensions of the system through layering of services. Second, it describes how modularity and protection are decoupled in Lipto, an experimental operating system. Third, it describes in some detail the invocation mechanism used by Lipto. Our analysis of this mechanism suggests that efficient domain-independent invocations are possible.

At a more detailed level, the invocation mechanism is optimistic, i.e., it is tailored to local (intra-domain) invocation, which is the most common case. Consequently, it uses local object references and closely resembles procedure call semantics. The mechanism is partitioned into three distinct concerns: distribution hiding, data transfer, and control transfer. Proxy objects implement distribution hiding, independent of the underlying mechanisms for data and control transfer. Control transfer is handled by an appropriate RPC protocol. The data transfer mechanism is hidden inside the IOData abstract data type, which is used to pass invocation arguments and results between client proxy and server proxy. The appropriate invocation mechanism for each situation can be dynamically configured by choosing an appropriate RPC protocol for control transfer.

CHAPTER 5 High-Bandwidth Cross-Domain Data Transfer

The previous chapter introduced a novel operating system structuring approach that achieves fine-grained modularity without the cost of additional protection boundaries and server processes. However, in any operating system that supports protection and security, some protection boundaries necessarily remain. In current microkernel-based operating systems, many protection boundaries may occur in the I/O data path. As outlined in Chapter 2, supporting high-speed I/O requires an efficient means of transferring I/O data across protection domain boundaries. Moreover, an appropriate transfer facility must be integrated with the API and buffer manager to be effective. This chapter presents a novel OS facility for the efficient management and transfer of I/O buffers across protection domain boundaries.

5.1 Motivation

Optimizing operations that cross protection domain boundaries has received a great deal of attention recently [Kar89, BALL90, BALL91]. This is because an efficient cross-domain invocation facility enables a more modular operating system design. For the most part, this earlier work focuses on lowering control transfer latency; it assumes that the arguments transferred by the cross-domain call are small enough to be copied from one domain to another. This work considers the complementary issue of increasing data transfer throughput; we are interested in I/O intensive applications that require significant amounts of data to be moved across protection boundaries.

Focusing more specifically on network I/O, we observe that on the one hand emerging network technology will soon offer sustained data rates approaching one gigabit per second to the end host, while on the other hand, the trend towards microkernel-based operating systems leads to a situation where the I/O data path may intersect multiple protection domains. The challenge is to turn good network bandwidth into good application-to-application bandwidth, without compromising the OS structure. Since in a microkernel-based system one might find device drivers, network protocols, and application software all residing in different protection domains, an important problem is moving data across domain boundaries as efficiently as possible. As outlined in Chapter 2, this task is made difficult by the limitations of the memory architecture, most notably the CPU-memory bandwidth. As network bandwidth approaches memory bandwidth, copying data from one domain to another simply cannot keep up with improved network performance.

This chapter introduces a high-bandwidth cross-domain transfer and buffer management facility, called fast buffers (fbufs), and shows how it can be optimized to support data that originates and/or terminates at an I/O device, potentially traversing multiple protection domains. Fbufs combine two well-known techniques for transferring data across protection domains: page remapping and shared memory. It is equally correct to view fbufs as using shared memory (where page remapping is used to dynamically change the set of pages shared among a set of domains), or as using page remapping (where pages that have been mapped into a set of domains are cached for use by future transfers).

5.2 Characterizing Network I/O

This section outlines the requirements for a buffer management and cross-domain data transfer facility by examining the relevant characteristics of network I/O. The discussion appeals to the reader's intuitive notion of a data buffer; Section 5.4 defines a specific representation. We are interested in the situation where I/O data is processed by a sequence of software layers-device drivers, network protocols, and application programs-that may be distributed across multiple protection domains. Figure 5.1 depicts this abstractly: data is generated by a source module, passed through one or more software layers, and consumed by a sink module. As the data is passed from one module to another, it traverses a sequence of protection domains. The data source is said to run in the originator domain, and the other modules run in receiver domains. Note that although this section considers the general case of multiple protection domains, the discussion applies equally well to systems in which only two domains are involved: kernel and user. Section 5.6 shows how different transfer mechanisms perform in the two-domain case, and Section 5.7 discusses the larger issue of how many domains one might expect in practice.


Figure 5.1: Layers Distributed over Multiple Protection Domains

5.2.1 Networks and Buffers

On the input side, the network adaptor delivers data to the host at the granularity of a protocol data unit (PDU), where each arriving PDU is received into a buffer.¹ Higher level protocols may reassemble a collection of PDUs into a larger application data unit (ADU). Thus, an incoming ADU is typically stored as a sequence of non-contiguous, PDU-sized buffers. On the output side, an ADU is often stored in a single contiguous buffer, and then fragmented into a set of smaller PDUs by lower level protocols. Fragmentation need not disturb the original buffer holding the ADU; each fragment can be represented by an offset/length into the original buffer. PDU sizes are network dependent, while ADU sizes are application dependent. Control overhead imposes a practical lower bound on both. For example, a 1 Gbps link with 4 KByte PDUs results in more than 30,500 PDUs per second. On the other hand, network latency concerns place an upper bound on PDU size, particularly when PDUs are sent over the network without further fragmentation. Similarly, ADU size is limited by application-specific latency requirements, and by physical memory limitations.

¹PDU size may be larger than the network packet size, as is likely in an ATM network. PDUs are the appropriate unit to consider because they are what the end host sees.
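As a quick check on the figure quoted above (a restatement of the text's own numbers, not an additional claim), the PDU rate follows directly from the link rate and the PDU size:

    \[ \frac{10^{9}\ \text{bits/s}}{4096 \times 8\ \text{bits/PDU}} \approx 30{,}518\ \text{PDUs per second} \]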

5.2.2 Allocating Buffers

At the time a buffer is allocated, we assume it is known that the buffer is to be used for I/O data. This is certainly the case for a device driver that allocates buffers to hold incoming packets, and it is a reasonable expectation to place on application programs. Note that it is not strictly necessary for the application to know that the buffer will eventually find its way to an I/O device, but only that it might transfer the buffer to another domain.

The situation depicted in Figure 5.1 is oversimplified in that it implies that there exists a single, linear path through the I/O subsystem. In general, data may traverse a number of different paths through the software layers, and as a consequence, visit different sequences of protection domains. We call such a path an I/O data path, and say that a buffer belongs to a particular I/O data path. We further assume that all data that originates from (terminates at) a particular communication endpoint (e.g., a socket or port) travels the same I/O data path. An application can therefore easily identify the I/O data path of a buffer at the time of allocation by referring to the communication endpoint it intends to use. In the case of incoming PDUs, the I/O data path to which the PDU (buffer) belongs can often be determined, either by the network adaptor (e.g., by interpreting an ATM cell's VCI and/or adaptation layer info) or by having the device driver inspect the headers of the arriving PDU prior to the transfer of the PDU into main memory.

Locality in network communication [Mog92] implies that if there is traffic on a particular I/O data path, then more traffic can be expected on the same path in the near future. Consequently, it is likely that a buffer that was used for a particular I/O data path can be reused soon for that same data path.

5.2.3 Accessing Buffers

We now consider how buffers are accessed by the various software layers along the I/O data path. The layer that allocates a buffer initially writes to it. For example, a device driver allocates a buffer to hold an incoming PDU, while an application program fills a newly allocated buffer with data to be transmitted. Subsequent layers require only read access to the buffer. An intermediate layer that needs to modify the data in the buffer instead allocates and writes to a new buffer. Similarly, an intermediate layer that prepends or appends new data to a buffer-e.g., a protocol that attaches a header-instead allocates a new buffer and logically concatenates it with the original buffer using the same buffer aggregation mechanism that is used to join a set of PDUs into a reassembled ADU. We therefore restrict I/O buffers to be immutable-they are created with an initial data content and may not be subsequently changed. The immutability of buffers implies that the originator domain needs write permission for a newly allocated buffer, but it does not need write access after transferring the buffer. Receiver domains need read access to buffers that are passed to them.

Buffers can be transferred from one layer to another with either move or copy semantics. Move semantics are sufficient when the passing layer has no future need for the buffer's data. Copy semantics are required when the passing layer needs to retain access to the buffer, for example, because it may need to retransmit it sometime in the future. Note that there are no performance advantages in providing move rather than copy semantics since buffers are immutable. This is because with immutable buffers, copy semantics can be achieved by simply sharing buffers.

Consider the case where a buffer is passed out of the originator domain. As described above, there is no reason for a correct and well behaved originator to write to the buffer after the transfer. However, protection and security needs generally require that the buffer/transfer facility enforce the buffer's immutability. This is done by reducing the originator's access permissions to read only. Suppose the system does not enforce immutability; such a buffer is said to be volatile. If the originator is a trusted domain-e.g., the kernel that allocated a buffer for an incoming PDU-then the buffer's immutability clearly need not be enforced. If the originator of the buffer is not trusted, then it is most likely an application that generated the data. A receiver of such a buffer could fail (crash) while interpreting the data if the buffer is modified by a malicious or faulty application. Note, however, that layers of the I/O subsystem generally do not interpret outgoing data. Thus, an application would merely interfere with its own output operation by modifying the buffer asynchronously. The result may be no different if the application had put incorrect data in the buffer to begin with. There are thus two approaches. One is to enforce immutability of a buffer; i.e., the originator loses its write access to the buffer upon transferring it to another domain. The second is to simply assume that the buffer is volatile, in which case a receiver that wishes to interpret the data must first request that the system raise the protection on the buffer in the originator domain. This is a no-op if the originator is a trusted domain.

Finally, consider the issue of how long a particular domain might keep a reference to a buffer.
Since a buffer can be passed to an untrusted application, and this domain may retain its reference for an arbitrarily long time, it is necessary that buffers be pageable. In other words, the cross-domain transfer facility must operate on pageable, rather than wired (pinned-down), buffers.
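To make the immutability argument concrete, the following C sketch (hypothetical types and names, not the facility defined later in this chapter) shows why copy semantics cost no more than move semantics for immutable buffers: a logical copy is just a reference-count increment, and prepending a header allocates a new buffer that is chained to the original instead of modifying it.

    #include <stdlib.h>
    #include <string.h>

    /* An immutable, reference-counted I/O buffer (illustrative only). */
    typedef struct Buf {
        struct Buf *next;      /* chain link used for logical concatenation */
        size_t      len;
        int         refcnt;
        char        data[];    /* never modified after creation             */
    } Buf;

    static Buf *buf_create(const void *src, size_t len) {
        Buf *b = malloc(sizeof(Buf) + len);    /* error handling elided     */
        b->next = NULL; b->len = len; b->refcnt = 1;
        memcpy(b->data, src, len);             /* the only copy ever made   */
        return b;
    }

    /* "Copy" semantics: since contents cannot change, sharing suffices.    */
    static Buf *buf_copy(Buf *b) { b->refcnt++; return b; }

    /* Release a reference; freeing the head drops its reference on the chain. */
    static void buf_free(Buf *b) {
        while (b && --b->refcnt == 0) { Buf *n = b->next; free(b); b = n; }
    }

    /* Prepend a header by allocating a new buffer and chaining it to the body,
       leaving the original (perhaps still held by a retransmission queue)
       untouched. */
    static Buf *buf_prepend(const void *hdr, size_t hlen, Buf *body) {
        Buf *h = buf_create(hdr, hlen);
        h->next = buf_copy(body);
        return h;
    }

    int main(void) {
        Buf *payload = buf_create("application data", 16);
        Buf *pdu = buf_prepend("HDR", 3, payload);  /* e.g., a protocol header  */
        buf_free(payload);                          /* sender drops its reference */
        buf_free(pdu);                              /* frees header, then payload */
        return 0;
    }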

5.2.4 Summary of Requirements

In summary, by examining how buffers are used by the network subsystem, we are able to identify the following set of requirements on the buffer management/transfer system, or conversely, a set of restrictions that can reasonably be placed on the users of the transfer facility.

• The transfer facility should support both single, contiguous buffers, and non-contiguous aggregates of buffers.

• It is reasonable to require the use of a special buffer allocator for I/O data.

• At the time of allocation, the I/O data path that a buffer will traverse is often known. For such cases, the transfer facility can employ a data path specific allocator.

• The I/O subsystem can be designed to use only immutable buffers. Consequently, providing only copy semantics is reasonable.

• The transfer facility can support two mechanisms to protect against asynchronous modification of a buffer by the originator domain: eagerly enforce immutability by raising the protection on a buffer when the originator transfers it, or lazily raise the protection upon request by a receiver.

• Buffers should be pageable.

Section 5.4 gives the design of a cross-domain transfer facility that supports (exploits) these requirements (restrictions).
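Read as an interface, the requirements above suggest roughly the following set of operations. This header is purely a hypothetical sketch (all names are invented for illustration); the actual design follows in Section 5.4.

    #include <stddef.h>

    typedef struct IOBuf       IOBuf;       /* a single contiguous, pageable buffer  */
    typedef struct IOAggregate IOAggregate; /* a non-contiguous aggregate of buffers */
    typedef struct Endpoint    Endpoint;    /* a communication endpoint, which       */
                                            /* identifies an I/O data path           */

    /* A special allocator for I/O data; the data path hint, when available,
       lets the facility hand out buffers specific to that path. */
    IOBuf *iobuf_alloc(size_t len, Endpoint *path_hint /* may be NULL */);

    /* Aggregation: join single buffers into larger, possibly non-contiguous units. */
    IOAggregate *ioagg_from_buf(IOBuf *b);
    IOAggregate *ioagg_join(IOAggregate *a, IOAggregate *b);

    /* Transfer with copy semantics only; buffers are immutable once created. */
    int ioagg_send(Endpoint *ep, IOAggregate *msg);

    /* Protection against asynchronous modification by the originator: called
       eagerly at transfer time, or lazily by a receiver that interprets the data. */
    int ioagg_secure(IOAggregate *msg);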

5.3 Related Work

Before describing a facility that achieves the goals outlined in the previous section, we first review related work on the subject of cross-domain data transfer.

5.3.1 Page Remapping

Several operating systems provide various forms of virtual memory (VM) support for transferring data from one domain to another. For example, the V kernel and DASH [Che88, TA91] support page remapping, while Accent and Mach support copy-on-write (COW) [FR86, ABB+86]. Page remapping has move rather than copy semantics, which limits its utility to situations where the sender needs no further access to the transferred data. Copy-on-write has copy semantics, but it can only avoid physical copying when the data is not written by either the sender or the receiver after the transfer. Both techniques require careful implementation to achieve good performance. The time it takes to switch to supervisor mode, acquire the necessary locks on VM data structures, change VM mappings-perhaps at several levels-for each page, perform TLB/cache consistency actions, and return to user mode poses a limit to the achievable performance. We consider two of the more highly tuned implementations in more detail.

First, Tzou and Anderson evaluate the remap facility in the DASH operating system [TA91]. The paper reports an incremental overhead of 208 µsecs/page on a Sun 3/50. However, because it measures a ping-pong test case-the same page is remapped back and forth between a pair of processes-it does not include the cost of allocating and deallocating pages. In practice, high-bandwidth data flows in one direction through an I/O data path, requiring the source to continually allocate new buffers and the sink to deallocate them. The authors also fail to consider the cost of clearing (e.g., filling with zeros) newly allocated pages, which may be required for security reasons. So as to update the Tzou/Anderson results, and to quantify the impact of these limitations, we have implemented a similar remap facility on a modern machine (DecStation 5000/200). Our measurements show that it is possible to achieve an incremental overhead of 22 µsecs/page in the ping-pong test, but that one would expect an incremental overhead of somewhere between 42 and 99 µsecs/page when considering the costs of allocating, clearing, and deallocating buffers, depending on what percentage of each page needed to be cleared.

The improvement from 208 µsecs/page (Sun 3/50) to 22 µsecs/page (DecStation 5000/200) might be taken as evidence that page remapping will continue to become faster at the same rate as processors become faster. We doubt that this extrapolation is correct. Of the 22 µsecs required to remap another page, we found that the CPU was stalled waiting for cache fills approximately half of the time. The operation is likely to become more memory bound as the gap between CPU and memory speeds widens.

Second, the Peregrine RPC system [JZ93] reduces RPC latency by remapping a single kernel page containing the request packet into the server's address space, to serve as the server thread's runtime stack. The authors report a cost of only 4 µsecs for this operation on a Sun 3/60. We suspect the reason for this surprisingly low number is that Peregrine can remap a page merely by modifying the corresponding page table entry. This is because in the V system-upon which Peregrine is based-all VM state is encoded only in the Sun's physical page tables. Portability concerns have caused virtually all modern operating systems to employ a two-level virtual memory system. In these systems, mapping changes require the modification of both low-level, machine-dependent page tables, and high-level, machine-independent data structures. Moreover, unlike the Sun 3/60, most modern architectures (including the DecStation) require the flushing of the corresponding TLB entries after a change of mappings. Both DASH and the fbuf mechanism described in the next section are implemented in a two-level VM system.

Peregrine overlaps the receipt of one PDU from the network with copying the previous PDU across the user/kernel boundary. However, this strategy does not scale to either high-speed networks or microkernel-based systems. As network bandwidth approaches memory bandwidth, contention for main memory no longer allows concurrent reception and copying-possibly more than once-at network speeds.

5.3.2 Shared Memory

Another approach is to statically share virtual memory among two or more domains, and to use this memory to transfer data. For example, the DEC Firefly RPC facility uses a pool of buffers that is globally and permanently shared among all domains [SB90]. Since all domains have read and write access permissions to the entire pool, protection and security are compromised. Data is copied between the shared buffer pool and an application's private memory. As another example, LRPC [BALL90] uses argument stacks that are pairwise shared between communicating protection domains. Arguments must generally be copied into and out of the argument stack.

Both techniques reduce the number of copies required, rather than eliminating copying. This is sufficient to improve the latency of RPC calls that carry relatively small amounts of data, and to preserve the relatively low bandwidth of Ethernet LANs. The fbuf mechanism has the different goal of preserving the bandwidth afforded by high-speed networks at the user level. Fbufs complement a low-latency RPC mechanism. The bottom line is that using statically shared memory to eliminate all copying poses problems: globally shared memory compromises security, pairwise shared memory requires copying when data is either not immediately consumed or is forwarded to a third domain, and group-wise shared memory requires that the data path of a buffer is always known at the time of allocation. All forms of shared memory may compromise protection between the sharing domains.

Several recent systems attempt to avoid data copying by transferring data directly between UNIX application buffers and the network interface [Jac90, DWB+93]. This approach works when data is accessed only in a single application domain. A substantial amount of memory may be required in the network adaptor when interfacing to high-bandwidth, high-latency networks. Moreover, this memory is a limited resource dedicated to network buffering. With fbufs, on the other hand, network data is buffered in main memory; the network subsystem can share physical memory dynamically with other subsystems, applications, and file caches.

5.4 Design

This section describes the design of an integrated buffer and data transfer mechanism. It begins by introducing a basic mechanism, and then evolves the design with a series of optimizations. Some of the optimizations can be applied independently, giving rise to a set of implementations with different restrictions and costs.

5.4.1 Basic Mechanism

I/O data is stored in buffers called fbufs, each of which consists of one or more contiguous virtual memory pages. A protection domain gains access to an fbuf either by explicitly allocating the fbuf, or implicitly by receiving the fbuf via IPC. In the former case, the domain is called the originator of the fbuf; in the latter case, the domain is a receiver of the fbuf.

An abstract data type is typically layered on top of fbufs to support buffer aggregation. Such abstractions typically provide operations to logically join one or more buffers into an aggregate, split an aggregate into separate buffers, clip data from one end of an aggregate, and so on. Examples of such aggregation abstractions include the IOData type introduced in Section 2.5, the x-kernel messages [HP91], and BSD Unix mbufs [LMKQ89]. For the purpose of the following discussion, we refer to such an abstraction as an aggregate object, and we use the x-kernel's directed acyclic graph (DAG) representation depicted in Figure 5.2.


Figure 5.2: Aggregate Object
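The following C fragment sketches one concrete shape such a DAG could take: leaf nodes reference a contiguous region of fbuf storage by offset and length, while interior nodes express logical concatenation. These declarations are illustrative assumptions, not the x-kernel's actual message representation.

    #include <stddef.h>

    typedef enum { NODE_LEAF, NODE_CONCAT } NodeKind;

    typedef struct DagNode {
        NodeKind kind;
        union {
            struct {                        /* NODE_LEAF                          */
                char  *fbuf_base;           /* start of the underlying fbuf       */
                size_t offset, length;      /* clipping without copying           */
            } leaf;
            struct {                        /* NODE_CONCAT: left data, then right */
                struct DagNode *left, *right;
            } concat;
        } u;
    } DagNode;

    /* Total number of data bytes represented by an aggregate. */
    static size_t dag_length(const DagNode *n) {
        if (n->kind == NODE_LEAF)
            return n->u.leaf.length;
        return dag_length(n->u.concat.left) + dag_length(n->u.concat.right);
    }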

A virtual page remapping facility logically copies or moves a set of virtual memory pages between protection domains by modifying virtual memory mappings. We use a conventional remap facility with copy semantics as the baseline for our design. The use of such a facility to transfer an aggregate object involves the following steps.

1. Allocate an Aggregate Object (Originator)

(a) Find and allocate a free virtual address range in the originator (per-fbuf)

(b) Allocate physical memory pages and clear contents (per-page)

(c) Update physical page tables (per-page)

2. Send Aggregate Object (Originator)

(a) Generate a list of fbufs from the aggregate object (per-fbuf)

(b) Raise protection in originator (read only or no access) (per-fbuf)

(c) Update physical page tables, ensure TLB/cache consistency (per-page)

3. Receive Aggregate Object (Receiver)

(a) Find and reserve a free virtual address range in the receiver (per-fbuf)

(b) Update physical page tables (per-page)

(c) Construct an aggregate object from the list of fbufs (per-fbuf)

4. Free an Aggregate Object (Originator, Receiver)

(a) Deallocate virtual address range (per-fbuf)

(b) Update physical page tables, ensure TLB/cache consistency (per-page)

(c) Free physical memory pages if there are no more references (per-page)

Note that a receiver that forwards an aggregate object to another domain would also perform the actions in step 2. Even in a careful implementation, these actions can result in substantial overhead. For example, a simple data path with two domain crossings requires six physical page table updates for each page, three of which may require TLB/cache consistency actions. Moreover, each allocated physical page may need to be cleared-i.e., filled with zeroes-for security reasons.

5.4.2 Optimizations

The following optimizations are designed to eliminate the per-page and per-fbuf costs associated with the base remapping mechanism.

5.4.2.1 Restricted Dynamic Read Sharing

The first optimization places two functional restrictions on data transfer. First, only pages from a limited range of virtual addresses can be remapped. This address range, called the fbuf region, is globally shared among all domains. Note that sharing of an address range does not imply unrestricted access to the memory that is mapped into that range. Second, write accesses to an fbuf by either a receiver, or by the originator while a receiver is holding a reference to the fbuf, are illegal and result in a memory access violation exception.

The first restriction implies that an fbuf is mapped at the same virtual address in the originator and all receivers. This eliminates the need for action (3a) during transfer. Note that the DASH remap facility uses a similar optimization. Shared mapping at the same virtual address also precludes virtual address aliasing, which simplifies and speeds up the management of virtually tagged caches, in machines that employ such caches. The second restriction eliminates the need for a copy-on-write mechanism. These restrictions require a special buffer allocator and immutable buffers.

5.4.2.2 Fbuf Caching

This optimization takes advantage of locality in interprocess communication. Specifically, we exploit the fact that once a PDU or ADU has followed a certain data path-i.e., visited a certain sequence of protection domains-more PDUs or ADUs can be expected to travel the same path soon. Consider what happens when a PDU arrives from the network. An fbuf is allocated in the kernel, filled, and then transferred one or more times until the data is consumed by the destination domain. At this point, the fbuf is mapped with read only permission into the set of domains that participate in an I/O data path. Ordinarily, the fbuf would now be unmapped from these domains, and the physical pages returned to a free memory pool. Instead, write permissions are returned to the originator, and the fbuf is placed on a free list associated with the I/O data path. When another packet arrives for the same data path, the fbuf can be reused. In this case, no clearing of the buffers is required, and the appropriate mappings already exist. Fbuf caching eliminates actions (1a-c), (3a-b), and (4a-c) in the common case where fbufs can be reused. It reduces the number of page table updates required to two, irrespective of the number of transfers. Moreover, it eliminates expensive clearing of pages, and increases locality of reference at the level of the TLB, cache, and main memory. The optimization requires that the originator is able to determine the I/O data path at the time of fbuf allocation.

5.4.2.3 Integrated Buffer Management/Transfer

Recall that the aggregate object abstraction is layered on top of fbufs. However, the transfer facility described up to this point transfers fbufs, not aggregate objects, across protection boundaries. That is, an aggregate object has to be translated into a list of fbufs in the sending domain (2a), this list is then passed to the kernel to effect a transfer, and the aggregate object is rebuilt on the receiving side (3c). Note that in this case, any internal data structures maintained by the aggregate object (e.g., interior DAG nodes) are stored in memory that is private to each domain. One consequence of this design is that the fbufs can be used to transfer any data across a domain boundary, and that a different representation for aggregated data can be used on either side of the boundary.

Consider now an optimization that incorporates knowledge about the aggregate object into the transfer facility, thereby eliminating steps (2a) and (3c). The optimization integrates the buffer management and cross-domain data transfer facilities by placing the entire aggregate object into fbufs. Since the fbuf region is mapped at the same virtual address in all domains, no internal pointer translations are required. During a send operation, a reference to the root node of the aggregate object is passed to the kernel. The kernel inspects the aggregate and transfers all fbufs in which reachable nodes reside, unless shared mappings already exist. The receiving domain receives a reference to the root node of the aggregate object. Steps (2a) and (3c) have therefore been eliminated.

5.4.2.4 Volatile fbufs

Under the previous optimizations, the transport of an fbuf from the originator to a receiver still requires two physical page table updates per page: one to remove write permission from the originator when the fbuf is transferred, and one to return write permissions to the originator after the fbuf has been freed by all the receivers. The need for removing write permissions from the originator can be eliminated in many cases by defining fbufs to be volatile by default. That is, a receiver must assume that the contents of a received fbuf may change asynchronously unless it explicitly requests that the fbuf be secured, that is, that write permissions be removed from the originator. As argued in Section 5.2.3, removing write permissions is unnecessary in many cases.

When the volatile fbuf optimization is applied in conjunction with integrated buffer management, an additional problem arises. Since the aggregate object (e.g., a DAG) is stored in fbufs, and a receiving domain must traverse the DAG to access the data, the receiver may be vulnerable to asynchronous changes of the DAG. For example, a bad pointer could cause a protocol in the kernel domain to fail while traversing the DAG in order to compute a checksum. The problem is solved in the following way. First, receivers verify that DAG pointers reference locations within the fbuf region (this involves a simple range check). Second, receivers check for cycles during DAG traversals to avoid infinite loops. Third, read accesses by a receiver to an address within the fbuf region for which the receiver has no permissions are handled as follows. The VM system maps a page at the appropriate location in the offending domain, initializes the page with a leaf node that contains no data, and allows the read to complete. Thus, invalid DAG references appear to the receiver as the absence of data.
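A minimal sketch of the receiver-side checks just described, assuming a simplified node layout along the lines of the DAG sketched after Figure 5.2 and an illustrative placement of the fbuf region: every pointer is range-checked before it is dereferenced, and a depth bound (one simple way to guarantee termination) stands in for cycle detection.

    #include <stddef.h>
    #include <stdint.h>

    /* Illustrative bounds of the globally shared fbuf region. */
    #define FBUF_REGION_BASE ((uintptr_t)0x40000000u)
    #define FBUF_REGION_SIZE ((size_t)(256u * 1024u * 1024u))
    #define MAX_DAG_DEPTH    64   /* bounds the traversal, so a cycle cannot loop */

    typedef struct Node {             /* simplified DAG node                      */
        int  is_leaf;
        struct Node *left, *right;    /* used when is_leaf == 0                   */
        const char *data; size_t len; /* used when is_leaf == 1                   */
    } Node;

    static int in_fbuf_region(const void *p, size_t len) {
        uintptr_t a = (uintptr_t)p;
        return a >= FBUF_REGION_BASE &&
               len <= FBUF_REGION_SIZE &&
               a - FBUF_REGION_BASE <= FBUF_REGION_SIZE - len;
    }

    /* Returns the number of reachable data bytes, or -1 if the DAG is malformed;
       a receiver treats a malformed DAG as the absence of data. */
    static long checked_length(const Node *n, int depth) {
        if (depth > MAX_DAG_DEPTH) return -1;            /* possible cycle          */
        if (!in_fbuf_region(n, sizeof *n)) return -1;    /* node outside the region */
        if (n->is_leaf)
            return in_fbuf_region(n->data, n->len) ? (long)n->len : -1;
        long l = checked_length(n->left,  depth + 1);
        long r = checked_length(n->right, depth + 1);
        return (l < 0 || r < 0) ? -1 : l + r;
    }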

5.4.2.5 Summary

The optimizations described above eliminate all per-page and per-fbuf costs associated with cross-domain data transfer in the common case-when the data path can be identified at fbuf allocation time, an appropriate fbuf is already cached, and removing write permissions from the originator is unnecessary. Moreover, in the common case, no kernel involvement is required during cross-domain data transfer. Our facility is therefore well suited for use with user-level IPC facilities such as URPC [BALL91], and other highly optimized IPC mechanisms such as MMS [GA91].

5.5 Implementation

We have implemented and evaluated fbufs using an experimental platform consisting of CMU's Mach 3.0 microkernel [ABB+86], augmented with a network subsystem based on the University of Arizona's x-kernel [HP91]. Mach is a portable, multiprocessor-capable microkernel that provides virtual memory, local message passing facilities, and device access. The x-kernel is a framework for the implementation of modular, yet efficient communication protocols. The x-kernel based network subsystem consists of a protocol graph that can span multiple protection domains, including the microkernel. Proxy objects are used in the x-kernel to forward cross-domain invocations using the Mach IPC facilities. The x-kernel supports an abstract data type that employs lazily evaluated buffer management; it uses a DAG-based representation similar to the one illustrated in Figure 5.2. The IOData abstract data type described in Section 2.5 is an extended version of the x-kernel's buffer ADT. All applications, network protocols, and device drivers deal with network data in terms of the IOData abstraction.

Incorporating fbufs into this software environment required the following changes: the Mach microkernel was modified to provide virtual memory support for fbufs, the x-kernel buffer ADT was replaced with the IOData abstraction, implemented on top of fbufs, the proxies were modified to use the fbuf data transfer facility, and minor modifications were made to device drivers and other software modules that allocate IOData buffer instances. These changes resulted in a fully functional system.

5.5.1 Allocation

A two-level allocation scheme with per-domain allocators ensures that most fbuf allocations can be satisfied without kernel involvement. A range of virtual addresses, the fbuf region, is reserved in each protection domain, including the kernel. User domains initially have no access permissions to the fbuf region, while the kernel has permanent unrestricted access to the entire region. Upon request, the kernel hands out ownership of fixed-sized chunks of the fbuf region to user-level protection domains, and grants read/write access for the chunk to the owning domains. The fbuf region is pageable like ordinary virtual memory, with physical memory allocated lazily upon access.

Fbuf allocation requests are fielded by fbuf allocators locally in each domain. There is a generic allocator that allocates uncached fbufs, and per-datapath allocators that allocate cached fbufs. These allocators satisfy their space needs by requesting chunks from the kernel as needed. Deallocated fbufs are placed on the appropriate allocator's free list, which is maintained in LIFO order. Consequently, most fbuf allocations can be satisfied without kernel involvement.

Since fbufs are pageable, the amount of physical memory allocated to fbufs depends on the level of I/O traffic relative to other system activity. Similarly, the amount of physical memory allocated to a particular datapath's fbufs is determined by its recent traffic. The LIFO ordering ensures that fbufs at the front of the free list are most likely to have physical memory mapped to them, which minimizes paging activity. When the kernel reclaims the physical memory of an fbuf that is on a free list, it discards the fbuf's contents; it does not have to page it out to secondary storage.

The IOData abstract data type is implemented using a DAG structure, with interior nodes and buffer fragments placed into fbufs obtained from the appropriate fbuf allocator. The nodes and buffer fragments of a single IOData instance are clustered onto fbufs to reduce the number of pages occupied by a single DAG.
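The allocation scheme can be illustrated with the following simplified C sketch, in which each per-data-path (or generic) allocator keeps a LIFO free list and falls back to a kernel request, shown here as a stub, when the list is empty. The names, the chunk size, and the one-page fbuf size are assumptions made for the example.

    #include <stddef.h>
    #include <stdlib.h>

    #define FBUF_SIZE  4096          /* one VM page per fbuf, for simplicity      */
    #define CHUNK_SIZE (64 * FBUF_SIZE)

    typedef struct Fbuf { struct Fbuf *next; /* data follows */ } Fbuf;

    typedef struct {
        Fbuf *free_list;             /* LIFO: the most recently freed fbufs are   */
                                     /* most likely to still have physical pages  */
                                     /* and established mappings                  */
    } FbufAllocator;                 /* one per data path, plus a generic one     */

    /* Stub standing in for the kernel call that hands out a chunk of the fbuf
       region and grants the calling domain read/write access to it. */
    static void *kernel_grant_chunk(size_t size) { return malloc(size); }

    static Fbuf *fbuf_alloc(FbufAllocator *a) {
        if (a->free_list == NULL) {                     /* slow path: new chunk   */
            char *chunk = kernel_grant_chunk(CHUNK_SIZE);  /* error handling elided */
            for (size_t off = 0; off < CHUNK_SIZE; off += FBUF_SIZE) {
                Fbuf *f = (Fbuf *)(chunk + off);
                f->next = a->free_list;
                a->free_list = f;
            }
        }
        Fbuf *f = a->free_list;                         /* fast path: no kernel   */
        a->free_list = f->next;                         /* involvement            */
        return f;
    }

    static void fbuf_dealloc(FbufAllocator *a, Fbuf *f) {
        f->next = a->free_list;                         /* LIFO order             */
        a->free_list = f;
    }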

5.5.2 Transfer

The fbuf transfer mechanism comes into play when an IOData buffer is passed as an argument in a cross-domain invocation. The proxy in the calling domain inspects the buffer's fbuf allocator. If the allocator produces cached fbufs and fbufs from this allocator have been passed to the destination domain before, then no virtual memory mapping changes are necessary to transfer the fbuf DAG that represents the buffer. In this case, a reference to the root node is sent in a simple Mach IPC message. Otherwise, the Mach IPC message is specially tagged as carrying an fbuf DAG reference. The Mach kernel's IPC mechanism intercepts such tagged messages and invokes fbuf-specific virtual memory extensions to perform the appropriate mapping changes. If the IOData's fbufs are cached, then the mappings are changed for all of the allocator's fbufs. This ensures that all fbufs maintained by a cached fbuf allocator always have the same mappings. Consequently, inspecting an fbuf's allocator suffices to determine whether mapping changes are required to transfer the fbuf. For uncached fbufs, mappings are only changed for the fbufs reachable in the transferred fbuf DAG. The proxy in the receiving domain creates a new instance of the IOData buffer type. This instance is initialized to refer to the fbuf DAG received from the calling domain.
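In outline, the sending proxy's decision reduces to the following test. The types, the fixed domain table, and the two IPC stubs are hypothetical; the real mechanism uses Mach IPC messages, with tagged messages intercepted by the kernel as described above.

    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_DOMAINS 16

    typedef struct Domain Domain;
    typedef struct {
        bool cached;                       /* produced by a per-data-path allocator */
        bool mapped_into[MAX_DOMAINS];     /* has this allocator's fbuf set been    */
                                           /* mapped into a given domain before?    */
    } Allocator;
    typedef struct { Allocator *alloc; void *dag_root; } IOData;

    /* Stubs standing in for the two flavours of IPC send. */
    static void ipc_send_reference(Domain *dst, void *dag_root) { (void)dst; (void)dag_root; }
    static void ipc_send_tagged(Domain *dst, void *dag_root)    { (void)dst; (void)dag_root; }

    static void proxy_send(Domain *dst, int dst_id, IOData *msg) {
        Allocator *a = msg->alloc;
        if (a->cached && a->mapped_into[dst_id]) {
            /* Fast path: the destination already has the mappings, so a bare
               reference to the DAG root is enough; no kernel involvement. */
            ipc_send_reference(dst, msg->dag_root);
        } else {
            /* Slow path: tag the message so the kernel performs the mapping
               changes (for all of a cached allocator's fbufs, or only for the
               reachable fbufs of an uncached one). */
            ipc_send_tagged(dst, msg->dag_root);
            if (a->cached) a->mapped_into[dst_id] = true;
        }
    }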

5.5.3 Deallocation

When a message is deallocated and the corresponding fbufs are owned by a different domain, the reference is put on a list of deallocated external references. When an RPC call from the owning domain occurs, the reply message is used to carry deallocation notices from this list. When too many freed references have accumulated, an explicit message must be sent notifying the owning domain of the deallocations. In practice, it is rarely necessary to send additional messages for the purpose of deallocation. Deallocation notices that refer to uncached fbufs are tagged, since mappings need to be removed by the VM system as part of the deallocation.
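A simplified sketch of this bookkeeping: deallocation notices destined for the owning domain accumulate in a per-domain list, are normally piggybacked on the next RPC reply to that domain, and trigger an explicit message only when the list fills. The names and the threshold are illustrative.

    #include <stddef.h>

    #define MAX_PENDING 128            /* flush threshold (illustrative)          */

    typedef struct { void *fbuf_ref; int uncached; } DeallocNotice;

    typedef struct {
        DeallocNotice pending[MAX_PENDING];
        size_t        count;
    } DeallocList;                     /* one list per owning domain              */

    /* Stub: send an explicit deallocation message to the owning domain. */
    static void send_dealloc_message(int owner, DeallocNotice *n, size_t count) {
        (void)owner; (void)n; (void)count;
    }

    static void note_deallocation(int owner, DeallocList *l, void *ref, int uncached) {
        l->pending[l->count].fbuf_ref = ref;
        l->pending[l->count].uncached = uncached;   /* uncached: VM must unmap too */
        l->count++;
        if (l->count == MAX_PENDING) {              /* rarely reached in practice  */
            send_dealloc_message(owner, l->pending, l->count);
            l->count = 0;
        }
    }

    /* Called when composing an RPC reply to the owning domain: piggyback the
       accumulated notices and clear the list. */
    static size_t piggyback_notices(DeallocList *l, DeallocNotice *out) {
        size_t n = l->count;
        for (size_t i = 0; i < n; i++) out[i] = l->pending[i];
        l->count = 0;
        return n;
    }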

5.5.4 Domain Termination

When a domain terminates, it may hold references to fbufs it has received. In the case of an abnormal termination, the domain may not properly relinquish those references. Note, however, that the domain is part of an I/O data path. Thus, its termination will cause the destruction of a communication endpoint, which will in turn cause the deallocation of all associated fbufs. A terminating domain may also be the originator of fbufs for which other domains hold references. The kernel will retain chunks of the fbuf region owned by the terminating domain until all external references are relinquished.

5.5.5 Exceptional Cases

In response to a read access to a location of the fbuf region for which the offending domain has no access permission, the virtual memory system's fault handler maps a page that contains an empty message leaf node at the appropriate address. This is necessary to protect receivers of volatile fbufs from asynchronous modifications of an fbuf DAG by the originator, as discussed in Section 5.4.2.4. Upon an illegal write access within the fbuf region, an access violation exception is raised in the offending domain. An incorrect or malicious domain may fail to deallocate fbufs it receives. This would eventually cause the exhaustion of the fbuf region's virtual address range. To prevent this, the kernel limits the number of fbuf chunks that can be allocated to any per-datapath fbuf allocator.

                            incremental per-page    asymptotic
                            cost (µsecs)            throughput (Mbps)
    fbufs, cached/volatile          3                   10,922
    fbufs, volatile                21                    1,560
    fbufs, cached                  29                    1,130
    fbufs                          37                      886
    Mach COW                      144                      228
    Copy                          316                      104

Table 5.1: Incremental per-page costs
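The asymptotic throughput column is derived from the per-page cost, assuming the DecStation's 4 KByte page size; for the cached/volatile case, for example:

    \[ \text{asymptotic throughput} \;=\; \frac{4096 \times 8\ \text{bits/page}}{\text{per-page cost}} \;=\; \frac{32{,}768\ \text{bits}}{3\ \mu\text{s}} \;\approx\; 10{,}922\ \text{Mbps}. \]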

5.6 Performance

This section reports on several experiments designed to evaluate the performance of fbufs. The software platform used in these experiments consists of CMU's Mach 3.0 microkernel (MK74) [ABB+86], augmented with a network subsystem based on the University of Arizona's x-kernel (Version 3.2) [HP91]. The hardware platform consists of a pair of DecStation 5000/200 workstations (25 MHz MIPS R3000 CPU, 64/64 KB I/D caches, 16 MB of main memory), each of which was attached to a prototype ATM network interface board, called OSIRIS, designed by Bellcore for the Aurora Gigabit testbed [Dav91]. The OSIRIS boards were connected by a null modem, and they support a link speed of 622 Mbps. For all results reported in this section, each data point is the average of 10 trials, where each trial consists of sending 100,000 messages, after an initial warm-up period. In all cases, the 90% confidence intervals were within 5% of the average. The workstations were running in single-user mode.

The first experiment quantifies the performance of fbuf transfers across a single protection boundary. A test protocol in the originator domain repeatedly allocates an IOData buffer, writes one word in each VM page of the associated fbuf, and passes the message to a dummy protocol in the receiver domain. The dummy protocol touches (reads) one word in each page of the received buffer, deallocates the buffer, and returns. Table 5.1 shows the incremental per-page costs-independent of IPC latency-and the calculated asymptotic bandwidths. There are four things to notice about these numbers.

First, the cached/volatile case performs an order of magnitude better than the uncached or non-volatile cases. It also performs an order of magnitude better than the Tzou/Anderson page remapping mechanism re-implemented on the same hardware. Second, the per-page overhead of cached/volatile fbufs is due to TLB misses caused by the accesses in the test and dummy protocols. TLB misses are handled in software in the MIPS architecture. Third, the cost for clearing pages in the uncached case is not included in the table. Filling a page with zeros takes 57 µsecs on the DecStation. Fourth, the relatively high per-page overhead for the Mach COW facility is partly due to its lazy update strategy for physical page tables, which causes two page faults for each transfer.

Figure 5.3: Throughput of a single domain boundary crossing (throughput in Mbps vs. message size in bytes; curves for Mach, cached/volatile, uncached/volatile, cached/non-volatile, and uncached/non-volatile fbufs)

The second experiment measures throughput as a function of message size. The results are shown in Figure 5.3. Unlike Table 5.1, the throughput rates shown for small messages in these graphs are strongly influenced by the control transfer latency of the IPC mechanism; this latency is not intrinsic to the buffer transfer facility. As before, Mach's native transfer facility has been included for comparison; it uses data copying for message sizes of less than 2 KBytes, and COW otherwise.

For small messages-under 4 KB-the performance breakdown is as follows. For message sizes under 2 KB, Mach's native data transfer facility is slightly faster than uncached or non-volatile fbufs; this is due to the latency associated with invoking the virtual memory system, which we have not optimized in our current implementation. However, cached/volatile fbufs outperform Mach's transfer facility even for very small message sizes. Consequently, no special-casing is necessary to efficiently transfer small messages.


Figure 5.4: Throughput of a UDP/IP local loopback test (throughput in Mbps vs. message size in bytes; curves for a single domain, 3 domains with cached fbufs, and 3 domains with uncached fbufs)

The third experiment demonstrates the impact of fbufs on network throughput by taking protocol processing overhead into account. It is also valuable because it is a macro experiment; i.e., it more accurately reflects the effects of the processor's instruction and data caches. A test protocol in the originator domain repeatedly creates an IOData buffer, and sends it using a UDP/IP protocol stack that resides in a network server domain.² IP fragments large messages into PDUs of 4 KBytes. A local loopback protocol is configured below IP; it turns PDUs around and sends them back up the protocol stack. Finally, IP reassembles the message on the way back up, and sends it to a receiver domain that contains the dummy protocol from the first experiment. The use of a loopback protocol rather than a real device driver simulates an infinitely fast network. Thus, the experiment ignores the effects of limited I/O bus bandwidth and network bandwidth in currently available commercial hardware. For comparison purposes, we have performed the same experiment with all components configured into a single protection domain, rather than three domains. By comparing the results we can quantify the impact of domain boundaries on network throughput.

Figure 5.4 shows the measured throughput in each case. The anomaly in the single domain graph is caused by a fixed fragmentation overhead that sets in for messages larger than 4 KBytes. This cost is gradually amortized for messages much larger than 4 KBytes. Moreover, this peak does not occur in the multiple domain cases due to the dominance of cross-domain latency for 4 KByte transfers. There are four things to observe about these results. First, the use of cached fbufs leads to a more than twofold improvement in throughput over uncached fbufs for the entire range of message sizes. This is significant since the performance of uncached fbufs is competitive with the fastest page remapping schemes. Second, we considered only a single domain crossing in either direction; this corresponds to the structure of a monolithic system. In a microkernel-based system, it is possible that additional domain crossings would occur. Third, for message sizes of 64 KBytes and larger, the cached fbuf throughput is more than 90% of the throughput for a data path that involves no domain boundary crossings. This is of practical importance since large messages are common with high-bandwidth applications. Fourth, because the test is run in loopback mode, the throughput achieved is roughly half of what one would expect end-to-end, since the host performs both send-side and receive-side processing for every message.

²UDP and IP have been slightly modified to support messages larger than 64 KBytes.

Figure 5.5: UDP/IP end-to-end throughput using cached, volatile fbufs (throughput in Mbps vs. message size in KBytes; curves for kernel-kernel, user-user, and user-netserver-user)

First, the maximal end-to-end throughput achieved is 285 Mbps, or 55% of the net bandwidth supported by the network link.³ This limitation is due to the capacity of the DecStation's TURBOchannel bus, not software overheads. The TURBOchannel has a peak bandwidth of 800 Mbps, but DMA startup latencies reduce the effective throughput. The OSIRIS board currently initiates a DMA transfer for each ATM cell payload, limiting the maximal throughput to 367 Mbps. Bus contention due to CPU/memory traffic further reduces the attainable I/O throughput to 285 Mbps.

Second, domain crossings have virtually no effect on end-to-end throughput for large messages (> 256 KB) when cached/volatile fbufs are used. For medium sized messages (8-64 KB), Mach IPC latencies result in a significant throughput penalty per domain crossing. The throughput for small messages (< 8 KB) is mainly limited by driver and protocol processing overheads. For medium sized messages, the throughput penalty for a second domain crossing is much larger than the penalty for the first crossing. The difference is too large to be explained by the different latency of kernel/user and user/user crossings. We attribute this penalty to the exhaustion of cache and TLB capacity when a third domain is added to the data path.

³The net bandwidth of 516 Mbps is derived from the link bandwidth (622 Mbps) minus ATM cell overhead.

Because our version of Mach/Unix does not support shared libraries, the program text that implements the x-kernel infrastructure is duplicated in each domain. This duplication reduces instruction access locality and reduces hit rates in both the TLB and the instruction cache. The use of shared libraries should help mitigate this effect.

Figure 5.6: UDP/IP end-to-end throughput using uncached, non-volatile fbufs (throughput in Mbps vs. message size in KBytes; curves for kernel-kernel, user-user, and user-netserver-user)

Figure 5.6 shows the measured end-to-end throughput when uncached/non-volatile fbufs are used.⁴ As in the loopback experiment, the significance of the uncached/non-volatile case is that it is comparable to the best one can achieve with page remapping. The kernel-kernel graph is once again included as a baseline for comparison. The maximal user-user throughput is 225 Mbps. Thus, the use of uncached fbufs leads to a throughput degradation of 21% when one boundary crossing occurs on each host. The throughput achieved in the user-netserver-user case is only marginally lower. The reason is that UDP-which resides in the netserver domain-does not access the message's body. Thus, there is no need to ever map the corresponding pages into the netserver domain. Consequently, the additional cost for using uncached fbufs in this case is small.

Note that the maximal throughput achieved with uncached fbufs is CPU bound, while throughput is I/O bound with cached fbufs. Thus, the throughput figure does not fully reflect the benefit of using cached fbufs. In our test, part of the benefit takes the form of a reduction of CPU load. Specifically, the CPU load on the receiving host during the reception of 1 MByte packets is 88% when cached fbufs are used, while the CPU is saturated when uncached fbufs are used.⁵ One can shift this effect by setting IP's PDU size to 32 KBytes, which cuts protocol processing overheads roughly in half, thereby freeing CPU resources. In this case, the test becomes I/O bound even when uncached fbufs are used, i.e., the uncached throughput approaches the cached throughput for large messages. However, the CPU is still saturated during the reception of 1 MByte messages with uncached fbufs, while the CPU load is only 55% when cached fbufs are used. Here, the use of cached fbufs leads entirely to a reduction of CPU load. On the other hand, a hypothetical system with much higher I/O bandwidth would make throughput CPU bound in both the cached and uncached fbuf cases. The local loopback test (which simulates infinite I/O bandwidth) has demonstrated that the use of cached fbufs leads to a twofold improvement in throughput over uncached fbufs in this case. Thus, on the DecStation, the use of cached fbufs can reduce CPU load by up to 45%, or increase throughput by up to a factor of two, when compared to uncached fbufs in the case where a single user-kernel domain crossing occurs.

⁴The use of non-volatile fbufs has a cost only in the transmitting host; this is because the kernel is the originator of all fbufs in the receiving host. For similar reasons, uncached fbufs incur additional cost only in the receiving host. In our test, the cost for non-volatile fbufs is hidden by the larger cost of uncached fbufs.

⁵CPU load was derived from the rate of a counter that is updated by a low-priority background thread.

5.7 Discussion

In this section, we discuss several issues relating to the applicability and effectiveness of fbufs in various circumstances.

5.7.1 How Many Domains?

An important question not yet answered is how many domains a data path might intersect in practice. On the one hand, there is a trend towards microkernel-based systems, the motivation being that systems structured in this way are easier to configure, extend, debug, and distribute. In a system with a user-level networking server, there are at least two domain crossings; a third-party window or multimedia server would add additional domain crossings. On the other hand, even a microkernel-based system does not necessarily imply that multiple domain crossings are required. For example, recent work suggests that it is possible to implement the TCP/IP protocol suite using application libraries, thus requiring only a single user/kernel crossing in the common case [MB93a, TNML93].

There are three responses to this question. First, server-based systems have undeniable advantages; the server approach is a general technique that makes it possible to transparently add new services and entire OS personalities without requiring modification/rebuilding of the kernel and applications. It is not yet clear whether the application-library approach can achieve the same effect, or even generalize beyond TCP/IP. Second, our work shows how to avoid the negative impact of domain crossings on end-to-end throughput for large messages. This is significant because many applications that demand high throughput generate/consume large data units. Such applications include continuous media, data visualization, and scientific programs. For these applications, minimizing domain crossings may therefore not be as critical. Third, as demonstrated in the previous section, fbufs are also well suited for situations where only a single kernel/user domain crossing occurs in the data path.

5.7.2 Characteristics of Network I/O Revisited

Fbufs gain efficiency partly by placing certain restrictions on the use of I/O buffers, as described in Section 5.2. Nevertheless, fbufs can be transparently integrated with network subsystems that are written to an immutable buffer abstraction, as demonstrated by our x-kernel based implementation. The necessary modifications are restricted to software modules that allocate buffers based on cached fbufs, and to modules that interpret I/O data.

In the interest of preserving user-level throughput, it is necessary to transfer buffers between application programs and the operating system as efficiently as between modules of the operating system. Unfortunately, the semantics of the UNIX read/write interface make it difficult to use fbufs (or any other VM-based technique). This is because the UNIX interface has copy semantics, and it allows the application to specify an unaligned buffer address anywhere in its address space.

We therefore propose the addition of an interface for high-bandwidth I/O that uses immutable buffer aggregates (see Section 2.5).

New high-bandwidth applications can use this interface; existing applications can continue to use the old interface, which requires copying. The use of such an interface requires applications to use an abstract data type that encapsulates buffer aggregates. This implies that an application that reads input data must be prepared to deal with the potentially non-contiguous storage of buffers, unless it is willing to pay the performance penalty of copying the data into contiguous storage. To minimize inconvenience to application programmers, our proposed interface supports a generator-like operation that retrieves data from a buffer aggregate at the granularity of an application-defined data unit, such as a structure or a line of text. Copying only occurs when a data unit crosses a buffer fragment boundary.

Since fbufs are immutable, data modifications require the use of a new buffer. Within the network subsystem, this does not incur a performance penalty, since data manipulations are either applied to the entire data (presentation conversions, encryption), or they are localized to the header/trailer. In the latter case, the buffer editing functions-e.g., join, split, clip-on the aggregate object can be used to logically concatenate a new header with the remaining, unchanged buffer. The same is true for application data manipulations, as long as manipulations on part of the data are localized enough to warrant the small overhead of buffer editing. We cannot imagine an application where this is a problem.

Cached fbufs require that the I/O data path of an fbuf be identified at the time the fbuf is allocated. In those cases where the I/O data path cannot be determined, a default allocator is used. This allocator returns uncached fbufs, and as a consequence, VM map manipulations are necessary for each domain transfer. The driver for the OSIRIS network interface used in our experiments employs the following strategy. The driver maintains queues of preallocated cached fbufs for the 16 most recently used data paths, plus a single queue of preallocated uncached fbufs. The adaptor board performs reassembly of incoming PDUs from ATM cells by storing the cell payloads into a buffer in main memory using DMA. When the adaptor board needs a new reassembly buffer, it checks to see if there is a preallocated fbuf for the virtual circuit identifier (VCI) of the incoming PDU. If not, it uses a buffer from the queue of uncached fbufs. Note that the use of cached fbufs requires a demultiplexing capability in the network adaptor, or it must at least permit the host CPU to inspect the packet header prior to the transfer of data into main memory. While most low-bandwidth (Ethernet) network adaptors do not have this capability, network adaptors for high-speed networks are still the subject of research. Two prototypes of such interfaces we are familiar with (the OSIRIS board and the HP Afterburner board [DWB+93]) do have adequate support.
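The generator-like operation proposed above might look as follows in outline: the iterator returns a pointer directly into a fragment whenever the next application data unit lies entirely within it, and copies into a caller-supplied staging area only when the unit straddles a fragment boundary. This is an illustrative sketch, not the interface actually defined in Section 2.5.

    #include <stddef.h>
    #include <string.h>

    typedef struct { const char *data; size_t len; } Fragment;

    typedef struct {
        const Fragment *frags;   /* non-contiguous storage of the aggregate      */
        size_t nfrags;
        size_t frag;             /* current fragment index                       */
        size_t off;              /* offset within the current fragment           */
    } AggCursor;

    /* Return a pointer to the next data unit of 'unit' bytes, or NULL at end.
       'staging' must hold at least 'unit' bytes; it is used only when the unit
       crosses a fragment boundary. */
    static const char *next_unit(AggCursor *c, size_t unit, char *staging) {
        /* Skip exhausted fragments. */
        while (c->frag < c->nfrags && c->off == c->frags[c->frag].len) {
            c->frag++; c->off = 0;
        }
        if (c->frag >= c->nfrags) return NULL;

        const Fragment *f = &c->frags[c->frag];
        if (f->len - c->off >= unit) {               /* common case: no copy      */
            const char *p = f->data + c->off;
            c->off += unit;
            return p;
        }
        /* Unit straddles a boundary: gather it into the staging buffer. */
        size_t copied = 0;
        while (copied < unit && c->frag < c->nfrags) {
            f = &c->frags[c->frag];
            size_t avail = f->len - c->off;
            size_t take = (unit - copied < avail) ? unit - copied : avail;
            memcpy(staging + copied, f->data + c->off, take);
            copied += take; c->off += take;
            if (c->off == f->len) { c->frag++; c->off = 0; }
        }
        return (copied == unit) ? staging : NULL;    /* NULL: not enough data     */
    }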

5.7.3 Architectural Considerations

As observed in Section 5.6, the performance of cached fbufs for large messages is limited by TLB miss handling overhead.⁶ In many modern architectures (including the MIPS), TLB entries are tagged with a domain identifier. This organization penalizes sharing in a global address space, since a separate TLB entry is required in each domain for a particular shared page, even if the address mapping and protection information are identical. Several modern processors permit the use of a single TLB entry for the mapping of several physical pages (HP-PA, MIPS R4000). This facility can be used to reduce the TLB overhead for large fbufs. However, the physical pages mapped with a single TLB entry must be physically contiguous. This requires a form of physical memory management currently not present in many operating systems.

Choosing the size of the fbuf region involves a tradeoff. The region must be large enough to accommodate the I/O buffering needs of both the kernel and all user domains. On the other hand, a large region reduces the size of the private address spaces of the kernel and user domains. The trend towards machines with 64-bit wide virtual addresses should make this less of an issue.

⁶Our implementation already clusters DAG nodes to reduce the number of pages occupied by a single fbuf aggregate.

5.7.4 Relationship to Other VM Systems

This chapter describes an integrated implementation of fbufs based on modifying/extending the Mach kernel. We now briefly discuss ways to layer fbufs on top of existing VM systems. Note that in each case, kernel modifications are still required to give in-kernel software modules (e.g., device drivers) access to the fbuf facility.

Several modern VM systems-e.g., those provided by the Mach and Chorus microkernels-export an external pager interface. This interface allows a user process to determine the semantics of a virtual memory object that can be mapped into other protection domains. We have designed an fbuf implementation that uses a Mach external pager and does not require modifications of the VM system. In this implementation, an fbuf transfer that requires VM mapping changes must involve the external pager, which requires communication between the sender of the fbuf and the pager, and subsequently between the pager and the kernel. Consequently, the penalty for using uncached and/or non-volatile fbufs is expected to be quite high.

A shared memory facility such as that provided by System V UNIX could presumably be used to implement fbufs. However, it is unclear how the semantics for read accesses to protected locations in the fbuf region (Section 5.4.2.4) can be achieved. Also, many implementations of System V shared memory have rather severe limitations with regard to the number and size of shared memory segments.

5.8 Concluding Remarks

This chapter presents an integrated buffer management/transfer mechanism that is optimized for high-bandwidth I/O. The mechanism, called fbufs, exploits locality in I/O traffic to achieve high throughput without compromising protection, security, or modularity. Fbufs combine the page remapping technique with dynamically mapped, group-wise shared virtual memory. In the worst case, the mechanism performs as well as the fastest page remapping facilities described in the literature. Moreover, it offers the even better performance of shared memory in the common case where the data path of a buffer is known at the time of allocation. Fbufs do not compromise protection and security. This is achieved through a combination of group-wise sharing, read-only sharing, and a weakening of the semantics of buffer transfers (volatile buffers).

A micro experiment shows that fbufs offer an order of magnitude better throughput than page remapping for a single domain crossing. Macro experiments involving the UDP/IP protocol stack show that when cached/volatile fbufs are used, domain crossings have virtually no impact on end-to-end throughput for large messages.

CHAPTER 6 Application Device Channels

The previous chapter described a technique that can avoid bandwidth degradation due to protection boundary crossings along the I/O data path. Protection boundaries also have the effect of adding latency to I/O operations, due to the necessary processor reallocation, scheduling, and validation, and the resulting drop in memory access locality. As shown in Section 5.6, fbufs cannot eliminate the added latency caused by protection domain boundaries. In this chapter, we briefly describe a new approach that gives applications direct access to a network device for common I/O operations, thus bypassing the OS kernel and removing all protection boundaries from the network I/O data path.

6.1 Motivation

Consider how an application program generally interacts with the operating system and the underlying hardware. Conceptually, the OS is layered between the application program and the hardware. This layering, however, is not strictly preserved in the implementation. For efficiency reasons, application programs are allowed to directly access main memory and to directly execute instructions on the CPU and the floating-point co-processor (FPU). The operating system is only involved in relatively infrequent, "coarse-grained" operations such as resource allocation, scheduling, and I/O (see Figure 6.1).

Even though applications have direct access to hardware operations, a protected operating system maintains control of the machine in order to ensure a fair allocation of resources and to isolate application software faults. This is accomplished by limiting the set of memory locations that an application can access, by disallowing user access to devices, and by restricting the CPU instruction set available to applications. This is depicted by the shaded areas in Figure 6.1. For example, applications are not allowed to mask interrupts, since doing so could prevent the OS from re-gaining control of the machine when a user's time-slice has expired. By restricting an application's capabilities through hardware assist, the OS can maintain protection, safety, and fairness, while still allowing applications to perform the most common operations (i.e., computations in main memory) directly and without OS intervention. This strategy reflects a sound system design that isolates, and optimizes for, the common case.

Resource (de:)aL[oClltio,!'

Application

Figure 6.1: Application-DS-Hardware Interaction allowing applications to perform the most common operations (i.e., computations in main memory) directly and without OS intervention. This strategy reflects a sound system design that isolates, and optimizes for, the common case. Traditionally, I/O operations had. inherently long latencies due to slow hardware de­ vices (disks, slow network links). Additional software latencies due to OS kernel involve­ ment where therefore acceptable. With high-speed local area networks, 011 the other hand, OS software latencies can easily dominate the end-to-end communication delays experienced by applications, For example, parallel programming systems implemented on workstation clusters are very communication-intensive [KDCZ94, TMP94j. The per­ formance and scalability of such systems can suffer from added communication latencies caused by the lack of direct access to the network hardware. The design 'of application device channels recognizes communication as a fine-grained, performance-critical operation, and allows applications to bypass the operating system kernel during network send and receive operations. The OS is normally only involved in the establishment and termination of network connections. Protection, safety, and fairness are maintained, because the network adaptor validates send and received requests from application programs based on information provided by the OS during connection 113

establishment.

6.2 Design

The basic approach taken in designing ADCs is depicted in Figure 6.2. First, instead of centralizing the network communication software inside the operating system, a copy of this software is placed into each user domain as part of the standard library that is linked with application programs. This user-level network software supports the standard application programming interface. Thus, the use of ADCs is transparent to application programs, except for performance.
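The dissertation does not show the user-level library code itself; the fragment below is only a sketch of the idea that the standard programming interface is preserved while the transport underneath it changes. All names (net_send, adc_assigned, adc_library_send, kernel_send_syscall) are hypothetical stand-ins, not part of the actual implementation.

/*
 * Illustrative only: the standard API entry point dispatches either to the
 * user-level protocol stack backed by an ADC, or falls back to the normal
 * system-call path when no ADC is assigned to this connection.
 */
#include <stddef.h>

extern int adc_assigned(int conn);                               /* hypothetical */
extern int adc_library_send(int conn, const void *buf, size_t len);
extern int kernel_send_syscall(int conn, const void *buf, size_t len);

int
net_send(int conn, const void *buf, size_t len)
{
    if (adc_assigned(conn))
        return adc_library_send(conn, buf, len);   /* ADC path: no kernel trap */
    return kernel_send_syscall(conn, buf, len);    /* traditional path through the OS */
}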

Figure 6.2: Application Device Channel

Second, the network software is granted direct access to a restricted set of functions provided by the network adaptor. This set of functions is sufficient to support common network send and receive operations without involving the OS kernel. As a result, the OS kernel is removed from the critical network send/receive path. An application domain communicates with the network adaptor through an application device channel, which consists of a set of data structures that is shared between network adaptor and the network software linked with the application. These data structures include queues of data buffer 114

descriptors fOf transmission and reception of network messages. When an application opens a network connection, the operating system informs the network adaptor about the mapping between network connection and ADC, creates the associated shared data structures, and grants the applicatioTI domain access to these data

structures. ThE~ network adaptor passes subsequently arriving network messages to the appropriate ADC, and transmits outgoing messages queued on an ADC by an application using the appropriate network connection. An application cannot gain access to network messages destined for another application, nor can it transmit messages other than through network connections opened on its behalf by the as.
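The exact layout of these shared data structures is defined by the adaptor firmware and is not reproduced here; the following C sketch is purely illustrative. It assumes a simple single-producer/single-consumer descriptor queue living in the shared dual-port memory pages; all names (adc_desc, adc_queue, adc, adc_send) and field layouts are hypothetical.

/*
 * Illustrative sketch only: the actual descriptor and queue formats are
 * defined by the OSIRIS firmware; all names and layouts here are assumed.
 */
#include <stdint.h>

#define ADC_QUEUE_SLOTS 64      /* assumed queue depth */

/* One buffer descriptor, as it might appear in the shared page. */
struct adc_desc {
    uint32_t vaddr;             /* application virtual address of the buffer */
    uint32_t len;               /* length of message data in bytes */
    uint32_t vci;               /* ATM virtual circuit identifier */
};

/* A single-producer/single-consumer circular descriptor queue. */
struct adc_queue {
    volatile uint32_t head;     /* next slot the producer will fill */
    volatile uint32_t tail;     /* next slot the consumer will drain */
    struct adc_desc  slot[ADC_QUEUE_SLOTS];
};

/* The pair of dual-port memory pages mapped into the application. */
struct adc {
    struct adc_queue *xmit;     /* transmit queue (transmit page) */
    struct adc_queue *freebuf;  /* free-buffer queue (receive page) */
    struct adc_queue *recv;     /* receive queue (receive page) */
};

/* User-level transmit: queue a buffer descriptor without a kernel trap.
 * A real implementation would also need a write barrier before publishing
 * the updated head index to the adaptor. */
static int
adc_send(struct adc *ch, uint32_t vaddr, uint32_t len, uint32_t vci)
{
    struct adc_queue *q = ch->xmit;
    uint32_t next = (q->head + 1) % ADC_QUEUE_SLOTS;

    if (next == q->tail)
        return -1;              /* queue full; the caller must back off */

    q->slot[q->head].vaddr = vaddr;
    q->slot[q->head].len   = len;
    q->slot[q->head].vci   = vci;
    q->head = next;             /* publish the descriptor to the adaptor */
    return 0;
}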

The use of user-level network software in place of a centralized, trusted network subsystem inside the OS kernel poses several problems. These problems and their solutions are described elsewhere in the literature [TNML93, MB93a]. The main issue is that many network protocols have state that must be maintained on a per-host basis, such as the TCP/UDP port space. Fortunately, maintaining this state inside the OS is acceptable, since access to it is not required as part of common send and receive operations.

There are two main advantages to the use of application device channels. First, network send and receive operations bypass the OS kernel. This eliminates protection domain boundary crossings, which would otherwise entail data transfer operations, processor reallocation, scheduling, and the associated drop in memory access locality. In traditional implementations, these costs account for a significant portion of the application-to-application latency experienced on a high-speed local area network.

Second, since application device channels give application domains low-level network access, it is possible to use customized network protocols and software. This flexibility can lead to further performance improvements, since it allows the use of application-specific knowledge to optimize communications performance. For example, a parallel programming system implemented on a workstation cluster can gain efficiency by using specialized message-passing protocols and buffering strategies instead of generic TCP/IP network connections.

6.3 Implementation

We have implemented ADCs in a Mach 3.0/x-kernel environment, using the OSIRIS ATM network adaptor described in detail in Chapter 3. Here, the OSIRIS transmit dual-port memory is divided into sixteen 4KB pages, each of which contains a separate transmit queue. The receive dual-port memory is similarly partitioned so that each page contains a separate free buffer queue and receive queue. One transmit queue and one pair of free/receive queues are used by the operating system in the usual way. The remaining pages are grouped in pairs of one transmit and one receive page. When an application opens a network connection, the operating system maps one pair of pages into the application's address space to form an application device channel. Linked with the application is an ADC channel driver, which performs essentially the same functions as the in-kernel OSIRIS device driver. Also linked with the application is a replicated, user-level implementation of the network protocol software.

As part of connection establishment, the operating system assigns a set of ATM virtual circuit identifiers (VCIs) and a priority to the ADC. The OSIRIS receive processor queues an incoming message on the receive queue of an ADC if the VCI of the message is in the set of VCIs assigned to that ADC. Applications can send messages through an ADC only on the VCIs assigned to that ADC. Therefore, applications can only receive and transmit messages on connections that they have opened with the authorization of the operating system. The ADC priority is used by the OSIRIS transmit processor to determine the order of transmissions from the various ADCs' transmit queues. Using these priorities, the OS can enforce network resource allocation policies.
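The firmware that performs this demultiplexing and validation is not shown in the dissertation; the sketch below only illustrates the kind of per-ADC VCI check described above. All names (adc_state, adc_owns_vci, demux_incoming, validate_outgoing) and table sizes other than the limit of 15 ADCs are assumptions made for illustration.

/*
 * Illustrative sketch of the demultiplexing/validation step performed by the
 * OSIRIS on-board processors; the structures and function names are hypothetical.
 */
#include <stdint.h>
#include <stdbool.h>

#define MAX_ADC      15         /* ADCs available in our implementation */
#define VCIS_PER_ADC  8         /* assumed size of the per-ADC VCI set */

struct adc_state {
    uint32_t vci[VCIS_PER_ADC]; /* VCIs the OS assigned at connection setup */
    int      nvci;
    int      priority;          /* used to order transmissions across ADCs */
};

static struct adc_state adc_table[MAX_ADC];

/* Does this ADC own the given VCI? */
static bool
adc_owns_vci(int adc, uint32_t vci)
{
    for (int i = 0; i < adc_table[adc].nvci; i++)
        if (adc_table[adc].vci[i] == vci)
            return true;
    return false;
}

/* Receive side: deliver to the owning ADC's receive queue, else to the OS. */
static int
demux_incoming(uint32_t vci)
{
    for (int adc = 0; adc < MAX_ADC; adc++)
        if (adc_owns_vci(adc, vci))
            return adc;         /* queue on this ADC's receive queue */
    return -1;                  /* no ADC owns it: use the kernel's queue */
}

/* Transmit side: refuse descriptors that name a VCI the ADC does not own. */
static bool
validate_outgoing(int adc, uint32_t vci)
{
    return adc_owns_vci(adc, vci);
}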

For each ADC, the OSIRIS on-board processors maintain a virtual-to-physical address translation table. This table provides address translation for the application's virtual buffer addresses (needed for DMA) and ensures memory access protection. When an application queues a buffer with an address not contained in this table, the on-board processor asserts an interrupt. The operating system's interrupt handler in turn looks up the address in the application's page table. If a valid mapping exists, the kernel provides the appropriate translation information to the network adaptor; otherwise, an access violation exception is raised in the offending application process.

All host interrupts are fielded by the OS kernel's interrupt handler. If the interrupt indicates an event affecting an ADC, such as the transition of an ADC's receive queue from an empty to a non-empty state, the interrupt handler directly signals a thread in the ADC channel driver linked with the application. A full context switch to an OS kernel thread does not occur in this case. Note also that due to the use of interrupt-reducing techniques, the average number of host interrupts per network message is less than one (see Section 3.2.2).

The number of application processes with open network connections can exceed the maximum number of ADCs (15 in our implementation). In this case, only a subset of the processes can use ADCs, and the remaining processes must use the normal I/O path through the OS kernel. For best performance, the OS tries to assign the available ADCs to the processes with the most network traffic. An ADC can be reassigned from one application to another. To do this, the OS deactivates the ADC channel driver in one application, causing its subsequent I/O requests to follow the normal path through the kernel. Then, the ADC channel driver in another application is activated, causing that application's further network traffic to use the ADC. ADC reassignment is transparent to application programs, except for performance.
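As a rough illustration of the translation-table miss handling just described, the following sketch shows the host-side path under assumed names; pmap_lookup, osiris_load_xlation, and raise_access_violation are hypothetical stand-ins for the kernel's page-table lookup, the adaptor's table-update operation, and the exception delivery, respectively.

/*
 * Illustrative sketch of host-side handling of a translation-table miss
 * reported by the adaptor; the helper functions are hypothetical names for
 * the control flow described in the text.
 */
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t vaddr_t;
typedef uint32_t paddr_t;

extern bool pmap_lookup(int adc, vaddr_t va, paddr_t *pa);   /* page-table lookup */
extern void osiris_load_xlation(int adc, vaddr_t va, paddr_t pa);
extern void raise_access_violation(int adc);

/*
 * Called from the kernel's interrupt handler when the on-board processor
 * reports a buffer address that is not in the ADC's translation table.
 */
void
adc_xlation_miss(int adc, vaddr_t va)
{
    paddr_t pa;

    if (pmap_lookup(adc, va, &pa)) {
        /* Valid mapping: hand the translation to the network adaptor so it
         * can DMA directly to and from the application buffer. */
        osiris_load_xlation(adc, va, pa);
    } else {
        /* No valid mapping: the application queued a bad address. */
        raise_access_violation(adc);
    }
}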

6.4 Related Work

At first glance, ADCs may appear similar to the mapped devices used in Mach [RH91] and other microkernel-based systems. In these systems, the user-level UNIX server is granted direct access to, and control of, the network device. However, application device channels are different from mapped device drivers in two important ways. First, the OS kernel remains in control of the device in the case of ADCs; only certain kinds of access are granted to the application domain. Second, the device can be fairly shared among and directly accessed by a number of untrusted applications; the device is not mapped into, and therefore controlled by, a single domain, as is the case with mapped devices. With ADCs, the device can be shared by multiple application domains, rather than a single network server domain.

The network interface in the SHRIMP multicomputer [BLA+94] allows application access without sacrificing protection, as do ADCs. This complex, custom-designed interface supports communication based on remote virtual memory mapping, and it connects to a dedicated multicomputer interconnection network. An ADC, on the other hand, is a software mechanism implemented with minimal hardware assistance from a general-purpose network adaptor (Section 3.8.2 discusses the requirements for a network adaptor to support ADCs). As such, it can support general TCP/IP internetwork access along with highly efficient message-passing traffic for parallel computing on workstation clusters.

The TMC CM-5 Connection Machine supports application-level communication through a memory-mapped network interface [L+92]. Unlike ADCs, the CM-5 maintains protection only between computing nodes in separate partitions, not within a partition (a partition is a subset of compute nodes configured to communicate).

6.5 Performance

To assess the performance of application device channels, we have repeated the experiments described in Section 3.9, except that the test programs were user-level applications, and the Mach kernel was modified to use ADCs. All results we obtained were within the error margins of the results reported in that section for the case of kernel-to-kernel message exchanges. That is, the latency and bandwidth of messages exchanged between application programs on separate hosts using ADCs were identical to those obtained when messages were exchanged between the OS kernels running on the hosts. This is significant, since it shows that ADCs can eliminate the performance penalty normally associated with crossing the user-kernel protection domain boundary.

6.6 Summary

The design of application device channels recognizes communication as a performance-critical operation for emerging distributed applications. ADCs allow applications to bypass the OS kernel during network send and receive operations. Operating system intervention is normally limited to connection establishment and termination. ADCs deliver network performance between application programs that is otherwise achievable only between OS kernels. They permit further performance improvements due to the use of application-specific, customized network communications software.

CHAPTER 7 Conclusion

The objective of the work presented in this dissertation is to avoid the I/O bottleneck in operating systems. This bottleneck results from the lack of main memory bandwidth in modern computer systems, combined with the poor memory access locality that current operating systems display during I/O activity. This dissertation introduces a set of new techniques that facilitate high-performance I/O without sacrificing the advantages of a modular OS structure.

7.1 Contributions and Limitations

The contribution of this dissertation is a set of novel techniques that facilitate high-performance I/O in both modular and monolithic operating systems.

• A new structuring approach for operating systems that allows fine-grained modularity without the performance penalty of an equally fine-grained protection structure;

• the fbuf facility for the management and transfer of I/O data buffers across protection domain boundaries;

• application device channels, which allow applications to bypass the OS kernel in common I/O operations, while leaving control over the device in the hands of the operating system; and

• a set of techniques that reduce host processing overhead and achieve high performance in driving a high-speed network adaptor.

These new techniques result from a re-evaluation of traditional approaches to supporting I/O in operating systems. In particular, they are based on the following key observations.

Memory bandwidth is a scarce resource. The traditional approach to performance-oriented system design is to focus on minimizing the number of CPU instructions executed during an operation. The characteristics of modern computer systems, with their fast CPUs, limited memory bandwidth, and fast I/O devices, make it necessary to focus instead on minimizing main memory traffic.

Communication is a common operation. Current operating system designs require OS kernel involvement in each I/O operation. This approach is well suited for applications that spend most of their time operating on data stored in main memory and perform occasional, coarse-grained I/O operations. The kinds of distributed applications that are enabled by high-speed networking are likely to perform frequent, fine-grained I/O, making communication a common operation. Thus, operating systems should be optimized accordingly, and implement common I/O operations without requiring kernel involvement.

This dissertation focuses on the primary problem of preserving the bandwidth and latency of high-speed networks at the application level. The advent of high-speed networking, and the applications it enables, also poses a number of secondary problems. For example, live audio and video applications require that the operating system schedule resources in such a way that the real-time constraints of continuous media are met. This work does not directly address these issues, but complements efforts in that area [AH91, JH93, Jef92, RR93, LKG92].

The experiments used to evaluate the techniques developed in this work use network communication as the only form of I/O. Network traffic is arguably the most challenging form of I/O, since incoming data arrives at network devices asynchronously. Thus, we believe that, with the exception of application device channels, the presented techniques readily generalize to other forms of I/O, such as disk traffic.

7.2 Future Directions

The techniques introduced in this dissertation resulted from a re-evaluation of traditional approaches to supporting I/O in operating systems. This re-evaluation was necessary in light of technological changes in computer and network hardware. The most important of these technological advances, the advent of high-speed networks, is likely to enable an exciting new class of I/O-intensive distributed applications. For these applications, communication is as common as computations on data in main memory. This differs significantly from traditional applications, where I/O operations are interspersed with long periods of computation on data in main memory. This change in the nature of computing, from compute-centric, centralized applications to communication-centric, distributed applications, suggests a fundamental rethinking of the design and implementation of operating systems. Our long-term research agenda is to re-examine all the basic abstractions, services, and mechanisms provided by operating systems in light of a communication-centric view of computation.

REFERENCES

[ABB+86] M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young. Mach: A new kernel foundation for Unix development. In Proceedings of the USENIX Summer '86 Conference, July 1986.

[AH91] David P. Anderson and George Homsy. A continuous media I/O server and its synchronization mechanism. IEEE Computer, 24(10):51-57, October 1991.

[AP93] Mark B. Abbott and Larry L. Peterson. Increasing network throughput by integrating protocol layers. IEEE/ACM Transactions on Networking, 1(5):600-610, October 1993.

[B+93] G. Blair et al. A network interface unit to support continuous media. IEEE Journal on Selected Areas in Communications, 11(2):264-275, February 1993.

[BALL90] Brian Bershad, Tom Anderson, Ed Lazowska, and Hank Levy. Lightweight remote procedure call. ACM Transactions on Computer Systems, 8(1):37-55, February 1990.

[BALL91] Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy. User-level interprocess communication for shared memory multiprocessors. ACM Transactions on Computer Systems, 9(2):175-198, May 1991.

[BC89] R. Ballart and Y. C. Ching. SONET: Now it's standard optical network. IEEE Communications Magazine, 29(3):8-15, March 1989.

[Bel92] Bellcore, Morristown, NJ. Asynchronous Transfer Mode (ATM) and ATM Adaptation Layer (AAL) Protocols Generic Requirements. Technical Advisory TA-NWT-001113, August 1992.

[BLA+94] Matthias A. Blumrich, Kai Li, Richard Alpert, Cezary Dubnicki, Edward W. Felten, and Jonathan Sandberg. Virtual memory mapped network interface for the SHRIMP multicomputer. In Proceedings of the 21st Annual International Symposium on Computer Architecture, pages 142-153, April 1994.

[BN84] Andrew Birrell and Bruce Nelson. Implementing remote procedure calls. ACM Transactions on Computer Systems, 2(1):39-59, February 1984.

[BP93] D. Banks and M. Prudence. A high-performance network architecture for a PA-RISC workstation. IEEE Journal on Selected Areas in Communications, 11(2):191-202, February 1993.

[BPP92] Mary L. Bailey, Michael A. Pagels, and Larry L. Peterson. The x-chip: An experiment in hardware multiplexing. In IEEE Workshop on the Architecture and Implementation of High Performance Communication Subsystems, February 1992.

[C+91] Eric Cooper et al. Host interface design for ATM LANs. In Proc. 16th Conf. on Local Computer Networks, Minneapolis, MN, October 1991.

[C+93a] John B. Carter et al. FLEX: A tool for building efficient and flexible systems. In Fourth Workshop on Workstation Operating Systems, pages 198-202, Napa, CA, October 1993.

[C+93b] D. D. Clark et al. The AURORA gigabit testbed. Computer Networks and ISDN Systems, 25(6):599-621, January 1993.

[CB93] J. B. Chen and Brian Bershad. The impact of operating system structure on memory system performance. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, December 1993.

[Che88] David R. Cheriton. The V distributed system. Communications of the ACM, 31(3):314-333, March 1988.

[CRJ87] Roy H. Campbell, Vincent Russo, and Gary Johnston. Choices: The design of a multiprocessor operating system. In Proceedings of the USENIX C++ Workshop, pages 109-123, November 1987.

[CT90] David D. Clark and David L. Tennenhouse. Architectural considerations for a new generation of protocols. In Proceedings of the SIGCOMM '90 Symposium, pages 200-208, September 1990.

[CW92] Richard Comerford and George F. Watson. Memory catches up. IEEE Spectrum, 29(10):34-57, October 1992.

[D+92] Todd A. Dutton et al. The design of the DEC 3000 AXP systems, two high-performance workstations. Digital Technical Journal, 4(4):66-81, 1992.

[DAPP93] Peter Druschel, Mark B. Abbott, Michael Pagels, and Larry L. Peterson. Network subsystem design. IEEE Network (Special Issue on End-System Support for High Speed Networks), 7(4), July 1993.

[Dav91] Bruce S. Davie. A host-network interface architecture for ATM. In Proceedings of the SIGCOMM '91 Conference, pages 307-315, Zurich, Switzerland, September 1991.

[Dav93] B. S. Davie. The architecture and implementation of a high-speed host interface. IEEE Journal on Selected Areas in Communications, 11(2):228-239, February 1993.

[DHP91] Peter Druschel, Norman C. Hutchinson, and Larry L. Peterson. Service composition in Lipto. In Proceedings of the 1991 International Workshop on Object-Orientation in Operating Systems, pages 108-111, October 1991.

[DLJAR91] Partha Dasgupta, Richard J. LeBlanc Jr., Mustaque Ahamad, and Umakishore Ramachandran. The Clouds distributed operating system. IEEE Computer, 24(11):34-, November 1991.

[DPH91] Peter Druschel, Larry L. Peterson, and Norman C. Hutchinson. Lipto: A dynamically configurable object-oriented kernel. In Newsletter of the IEEE Computer Society Technical Committee on Operating Systems and Application Environments, volume 5, pages 11-16. IEEE, 1991.

[Dru93] Peter Druschel. Efficient support for incremental customization of OS services. In Proceedings of the Third International Workshop on Object Orientation in Operating Systems, pages 186-190, Asheville, NC, December 1993.

[Duc89] Dan Duchamp. Analysis of transaction management performance. In Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pages 177-190, December 1989.

[DWB+93] Chris Dalton, Greg Watson, David Banks, Costas Calamvokis, Aled Edwards, and John Lumley. Afterburner. IEEE Network, 7(4):36-43, July 1993.

[Fel90] David C. Feldmeier. Multiplexing issues in communication system design. In Proc. ACM SIGCOMM '90, pages 209-219, Philadelphia, PA, September 1990.

[FR86] Robert Fitzgerald and Richard F. Rashid. The integration of virtual memory management and interprocess communication in Accent. ACM Transactions on Computer Systems, 4(2):147-177, May 1986.

[GA91] Ramesh Govindan and David P. Anderson. Scheduling and IPC mechanisms for continuous media. In Proceedings of the 13th ACM Symposium on Operating Systems Principles, pages 68-80, October 1991.

[GHM+90] Richard G. Guy, John S. Heidemann, Wai Mak, Thomas W. Page, Gerald J. Popek, and Dieter Rothmeier. Implementation of the Ficus replicated file system. In Proceedings of the USENIX Summer '90 Conference, pages 63-, June 1990.

[Hil92] Dan Hildebrand. An architectural overview of QNX. In Proceedings of the Usenix Workshop on Micro-Kernels and Other Kernel Architectures, April 1992.

[HK93] Graham Hamilton and Panos Kougiouris. The Spring nucleus: A microkernel for objects. In Proc. of the 1993 Summer Usenix Conference, pages 147-159, Cincinnati, OH, June 1993.

[HP90] John L. Hennessy and David A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, Inc., Palo Alto, California, 1990.

[HP91] Norman C. Hutchinson and Larry L. Peterson. The x-Kernel: An architecture for implementing network protocols. IEEE Transactions on Software Engineering, 17(1):64-76, January 1991.

[HPM93] Graham Hamilton, Michael L. Powell, and James J. Mitchell. Subcontract: A flexible base for distributed programming. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, pages 69-79, 1993.

[IBM90] IBM Corporation. IBM RISC System/6000 POWERstation and POWERserver: Hardware Technical Reference, General Information Manual. IBM Order Number SA23-2643-00, 1990.

[IEE90] Gigabit network testbeds. IEEE Computer, pages 77-80, September 1990.

[Jac90] Van Jacobson. Efficient protocol implementation. In ACM SIGCOMM '90 tutorial, September 1990.

[Jef92] Kevin Jeffay. On kernel support for real-time multimedia applications. In Proceedings of the Third IEEE Workshop on Workstation Operating Systems, pages 39-46, Key Biscayne, FL, April 1992.

[JH93] Alan Jones and Andrew Hopper. Handling audio and video streams in a distributed environment. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, pages 231-243, 1993.

[JLHB88] Eric Jul, Henry Levy, Norman Hutchinson, and Andrew Black. Fine-grained mobility in the Emerald system. ACM Transactions on Computer Systems, 6(1):109-133, February 1988.

[JZ93] David B. Johnson and Willy Zwaenepoel. The Peregrine high-performance RPC system. Software-Practice and Experience, 23(2):201-221, February 1993.

[Kar89] Paul A. Karger. Using registers to optimize cross-domain call performance. In Third Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-III), pages 194-204, Boston, Massachusetts (USA), April 1989. ACM.

[KC88] Hemant Kanakia and David R. Cheriton. The VMP network adapter board (NAB): High-performance network communication for multiprocessors. In Proceedings of the SIGCOMM '88 Symposium, pages 175-187, August 1988.

[KDCZ94] Peter Keleher, Sandhya Dwarkadas, Alan Cox, and Willy Zwaenepoel. TreadMarks: Distributed shared memory on standard workstations and operating systems. In Proceedings of the 1994 Winter Usenix Conference, pages 115-131, January 1994.

[L+92] Charles E. Leiserson et al. The network architecture of the Connection Machine CM-5. In Proceedings of the 4th ACM Symposium on Parallel Algorithms and Architectures, pages 272-285, June 1992.

[LHFL93] Jay Lepreau, Mike Hibler, Bryan Ford, and Jeffrey Law. In-kernel servers on Mach 3.0: Implementation and performance. In Proc. of the Third Usenix Mach Symposium, pages 39-55, April 1993.

[LKG92] Li Li, A. Karmouch, and N. D. Georganas. Real-time synchronization control in multimedia distributed systems. ACM Computer Communications Review, 22(3):79-86, 1992.

[LMKQ89] Samuel J. Leffler, Marshall K. McKusick, Michael J. Karels, and John S. Quarterman. The Design and Implementation of the 4.3BSD UNIX Operating System. Addison-Wesley Publishing Company, 1989.

[MB93a] Chris Maeda and Brian Bershad. Protocol service decomposition for high-performance networking. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, December 1993.

[MB93b] Chris Maeda and Brian N. Bershad. Networking performance for microkernels. In Proceedings of the Third International Workshop on Object Orientation in Operating Systems, pages 154-159, Asheville, NC, December 1993.

[Mog92] Jeffrey C. Mogul. Network locality at the scale of processes. ACM Transactions on Computer Systems, 10(2):81-109, May 1992.

[OP92] Sean W. O'Malley and Larry L. Peterson. A dynamic network architecture. ACM Transactions on Computer Systems, 10(2):110-143, May 1992.

[PDP93] Michael A. Pagels, Peter Druschel, and Larry L. Peterson. Cache and TLB effectiveness in the processing of network data. Technical Report TR 93-4, Department of Computer Science, University of Arizona, Tucson, Ariz., 1993.

[Prz90] Steven A. Przybylski. Cache and Memory Hierarchy Design: A Performance-Directed Approach. Morgan Kaufmann, San Mateo, CA, 1990.

[RAA+88] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. Guillemont, F. Herrmann, C. Kaiser, S. Langlois, P. Leonard, and W. Neuhauser. Chorus distributed operating systems. Computing Systems Journal, 1(4):305-370, December 1988.

[RAA+92] M. Rozier, V. Abrossimov, F. Armand, I. Boule, M. Gien, M. Guillemont, F. Herrmann, C. Kaiser, S. Langlois, P. Leonard, and W. Neuhauser. Overview of the Chorus distributed operating system. In Workshop on Micro-Kernels and Other Kernel Architectures, pages 39-70, Seattle, WA (USA), April 1992. Usenix.

[Ram93] K. K. Ramakrishnan. Performance considerations in designing network interfaces. IEEE Journal on Selected Areas in Communications, 11(2):203-219, February 1993.

[RH91] F. Reynolds and J. Heller. Kernel support for network protocol servers. In Proceedings of the USENIX Mach Symposium, pages 149-162, Monterey, Calif., November 1991.

[RR93] S. Ramanathan and P. V. Rangan. Adaptive feedback techniques for synchronized multimedia retrieval over integrated networks. IEEE/ACM Transactions on Networking, 1(2), April 1993.

[RT74] Dennis M. Ritchie and Ken Thompson. The Unix time-sharing system. Communications of the ACM, 17(7):365-375, July 1974.

[SB90] Michael D. Schroeder and Michael Burrows. Performance of Firefly RPC. ACM Transactions on Computer Systems, 8(1):1-17, February 1990.

[SGH+89] Marc Shapiro, Yvon Gourhant, Sabine Habert, Laurence Mosseri, Michel Ruffin, and Celine Valot. SOS: An object-oriented operating system - assessment and perspectives. Computer Systems, 2(4):287-338, December 1989.

[Sha86] Marc Shapiro. Structure and encapsulation in distributed systems: The proxy principle. In Proceedings of the Sixth International Conference on Distributed Computing Systems, pages 198-204, Boston, Mass., May 1986.

[SLM90a] Michael L. Scott, Thomas J. LeBlanc, and Brian D. Marsh. Multi-model parallel programming in Psyche. In Proc. 2nd Annual ACM SIGPLAN Symp. on Principles and Practice of Parallel Programming, pages 70-78, Seattle, WA (USA), March 1990. ACM.

[SLM+90b] Michael L. Scott, Thomas J. LeBlanc, Brian D. Marsh, Timothy G. Becker, Cezary Dubnicki, and Evangelos P. Markatos. Implementation issues for the Psyche multiprocessor operating system. Computing Systems, 3(1):101-137, 1990.

[ST93] Jonathan M. Smith and C. Brendan S. Traw. Giving applications access to Gb/s networking. IEEE Network, 7(4):44-52, July 1993.

[Sta93] Richard Stallman. Using and porting GNU CC. Free Software Foundation, Cambridge, MA, June 1993.

[Str90] Robert E. Strom. Hermes: An integrated language and system for distributed programming. In 1990 Workshop on Experimental Distributed Systems, October 1990.

[TA91] Shin-Yuan Tzou and David P. Anderson. The performance of message-passing using restricted virtual memory remapping. Software-Practice and Experience, 21:251-267, March 1991.

[TMP94] Charles J. Turner, David Mosberger, and Larry L. Peterson. Cluster-C*: Understanding the performance limits. In Proceedings of the 1994 Scalable High Performance Computing Conference. IEEE, May 1994.

[TNML93] C. Thekkath, T. Nguyen, E. Moy, and E. Lazowska. Implementing network protocols at user level. In Proceedings of the SIGCOMM '93 Symposium, September 1993.

[TS93] C. B. S. Traw and J. M. Smith. Hardware/software organization of a high-performance ATM host interface. IEEE Journal on Selected Areas in Communications, 11(2):240-253, February 1993.

[TSS88] C. P. Thacker, L. C. Stewart, and E. H. Satterthwaite. Firefly: A multiprocessor workstation. IEEE Transactions on Computers, 37(8):909-920, August 1988.

[UNI89] UNIX System V Release 3.2 Programmer's Reference Manual. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[WCC+74] W. Wulf, E. Cohen, W. Corwin, A. Jones, R. Levin, C. Pierson, and F. Pollack. Hydra: The kernel of a multiprocessor operating system. Communications of the ACM, 17(6):337-345, June 1974.

[WLAG93] Robert Wahbe, Steven Lucco, Thomas E. Anderson, and Susan L. Graham. Efficient software-based fault isolation. In Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles, pages 203-216, 1993.

[WLH81] William A. Wulf, Roy Levin, and Samuel P. Harbison. Hydra/C.mmp: An experimental computer system, 1981.

[WN80] Maurice V. Wilkes and Roger M. Needham. The Cambridge model distributed system. Operating Systems Review, 14(1):21-29, January 1980.

[Yud83] Mark Yudkin. Resource management in a distributed system. In Proceedings of the Eighth Data Communication Symposium, pages 221-226, October 1983.