<<

TOWARDS PRECISE NETWORK MEASUREMENTS

A Dissertation

Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by

Ki Suh Lee
January 2017

© 2017 Ki Suh Lee
ALL RIGHTS RESERVED

TOWARDS PRECISE NETWORK MEASUREMENTS

Ki Suh Lee, Ph.D.
Cornell University 2017

This dissertation investigates the question: How do we precisely access and control time in a network of computer systems? Time is fundamental for network measurements. It is fundamental in measuring one-way delay and round trip times, which are important for network research, monitoring, and applications. Further, measuring such metrics requires precise timestamps, control of time gaps between messages, and synchronized clocks. However, as the speed of computer networks increases and the processing delays of network devices decrease, it is challenging to perform network measurements precisely.

The key approach that this dissertation explores for controlling time and achieving precise network measurements is to use the physical layer of the network stack. It allows the exploitation of two observations: First, when two physical layers are connected via a cable, each physical layer always generates either data or special characters to maintain the link connectivity. Second, such continuous generation allows two physical layers to be synchronized for clock and bit recovery. As a result, the precision of timestamping can be improved by counting the number of special characters between messages in the physical layer. Further, the precision of pacing can be improved by controlling the number of special characters between messages in the physical layer. Moreover, the precision of synchronized clocks can be improved by running a protocol inside the physical layer, extending its bit-level synchronization.

Subsequently, we make three contributions embodied in the design, implementation and evaluation of our approaches. First, we present how to improve the precision of timestamping and pacing via access to the physical layer at sub-nanosecond scale. SoNIC implements the physical layer in software and allows users to control and access every bit in the physical layer. Second, we demonstrate that precise timestamping and pacing can improve the performance of network applications with two examples: Covert timing channels and available bandwidth estimation. A covert timing channel, Chupja, is high-bandwidth and robust and can deliver hidden messages while avoiding detection. An available bandwidth estimation algorithm, MinProbe, can accurately estimate the available bandwidth in a high-speed network. Finally, we present how to improve the precision of synchronized clocks via access to the physical layer. DTP, the Datacenter Time Protocol, extends the physical layer's link-level synchronization and implements a peer-to-peer protocol with bounded nanosecond precision. Together, these systems and approaches represent important steps towards precise network measurements.

BIOGRAPHICAL SKETCH

Ki Suh Lee earned his Bachelor of Science degree in Computer Science and Engineering from Seoul National University in Korea in 2007, and his Master of Science degree in Computer Science from Columbia University in 2009. He then joined the doctoral program in Computer Science at Cornell University, where he worked with Hakim Weatherspoon to build many systems using field-programmable gate array boards. His research interests broadly cover networking, operating systems, and computer architectures.

To Kayoung, Ethan, and my parents.

ACKNOWLEDGEMENTS

I would like to thank the faculty members at Cornell. I would like to thank my advisor, Hakim Weatherspoon, who showed and taught me how to become a researcher in computer systems. He always guided me to build a system and at the same time not to forget the fundamental research questions that I was trying to answer by building the system. I will not forget the time we spent together writing papers until late at night. Kevin Tang showed me how to do research based on math and theory in computer networks. Further, I am sincerely grateful to

Fred Schneider and Nate Foster, who graciously accepted to be members of my special committee and provided me with invaluable advice to finish my dissertation. I would also like to thank my research collaborators. What could I have done without Han Wang? From the earlier years of my study, Han was always by my side and helped me in every aspect of my research. I thank Tudor, who guided me to become a Linux kernel programmer in my earlier years. I thank Chiun

Lin, Erluo Li and Vishal Shrivastav for their collaborations on many projects. I also thank many undergraduate students who helped my research: Ethan Kao, Jason Zhang, and Favian Contreras. I would also like to thank my colleagues in the Computer Science department at Cornell and members of Systems Lab and Networking Lab: Ji-Yong Shin, Zhiyuan Teo, Zhiming Shen, Qin Jia, Praveen Kumar, Robert Escriva, Hussam Abu-Libdeh, Dan Williams, Qi Huang, Joonsuk

(Jon) Park, and Moontae Lee. I also thank Ed Kiefer and Eric Cronise from CIT who helped me set up network testbeds. Outside the Systems Lab and Networking Lab, I had the pleasure of serving

God in a praise team where I played acoustic / bass guitar, mixed the sounds, and served as a Bible study leader. I thank all members of the team and am especially grateful to Jayhun Lee and Hyunjung Kim for so many reasons. I

would not forget the enjoyable moments we had together as a team. I cannot thank my family enough for their support. I thank my lovely wife Kayoung, who was always at my side and happily endured seven long years of life in Ithaca. Her endless love and support made me who I am now. Further, I had the great joy of having a son, Ethan, during my study. He made me happy and made me laugh. I thank my parents for their endless support and prayer.

Thank God for everything.

TABLE OF CONTENTS

Biographical Sketch ...... iii
Dedication ...... iv
Acknowledgements ...... v
Table of Contents ...... vii
List of Tables ...... x
List of Figures ...... xi

1 Introduction 1
  1.1 Background ...... 3
    1.1.1 Terminology ...... 3
    1.1.2 Example: Estimating Available Bandwidth ...... 4
  1.2 Challenges for Precisely Accessing Time ...... 7
    1.2.1 Lack of Precise Pacing ...... 7
    1.2.2 Lack of Precise Timestamping ...... 8
    1.2.3 Lack of Precise Clock Synchronization ...... 10
  1.3 Contributions ...... 12
  1.4 Organization ...... 13

2 Scope and Methodology 15
  2.1 Scope ...... 15
    2.1.1 IEEE 802.3 standard ...... 16
    2.1.2 Precise Timestamping and Pacing Need Access to PHY ...... 20
    2.1.3 Precise Clock Synchronization Needs Access to PHY ...... 23
  2.2 Methodology: Experimental Environments ...... 29
    2.2.1 Hardware ...... 30
    2.2.2 Network Topologies ...... 32
  2.3 Summary ...... 37

3 Towards Precise Timestamping and Pacing: SoNIC 38
  3.1 Design ...... 40
    3.1.1 Access to the PHY in software ...... 40
    3.1.2 Realtime Capability ...... 41
    3.1.3 Scalability and Efficiency ...... 43
    3.1.4 Precision ...... 44
    3.1.5 User Interface ...... 45
    3.1.6 Discussion ...... 47
  3.2 Implementation ...... 48
    3.2.1 Software Optimizations ...... 48
    3.2.2 Hardware Optimizations ...... 53
  3.3 Evaluation ...... 56
    3.3.1 Packet Generator (Packet Pacing) ...... 58
    3.3.2 Packet Capturer (Packet Timestamping) ...... 60
    3.3.3 Profiler ...... 62
  3.4 Application 1: Covert Timing Channel ...... 63
    3.4.1 Design ...... 66
    3.4.2 Evaluation ...... 71
    3.4.3 Summary ...... 89
  3.5 Application 2: Estimating Available Bandwidth ...... 89
    3.5.1 Design ...... 91
    3.5.2 Evaluation ...... 92
    3.5.3 Summary ...... 95
  3.6 Summary ...... 96

4 Towards Precise Clock Synchronization: DTP 97
  4.1 Design ...... 99
    4.1.1 Assumptions ...... 99
    4.1.2 Protocol ...... 100
    4.1.3 Analysis ...... 105
  4.2 Implementation ...... 108
    4.2.1 DTP-enabled PHY ...... 108
    4.2.2 DTP-enabled network device ...... 110
    4.2.3 Protocol messages ...... 111
    4.2.4 DTP Software Clock ...... 112
  4.3 Evaluation ...... 114
    4.3.1 Methodology ...... 115
    4.3.2 Results ...... 116
  4.4 Discussion ...... 120
    4.4.1 External Synchronization ...... 120
    4.4.2 Incremental Deployment ...... 121
    4.4.3 Following The Fastest Clock ...... 122
    4.4.4 What about 1G, 40G or 100G? ...... 123
  4.5 Summary ...... 124

5 Related Work 125
  5.1 Programmable Network Hardware ...... 125
  5.2 Hardware Timestamping ...... 126
  5.3 Software Defined Radio ...... 127
  5.4 Software Router ...... 127
  5.5 Clock Synchronization ...... 128

6 Future Work 136
  6.1 Synchronous DTP ...... 136
  6.2 Packet Scheduler ...... 137
  6.3 SoNIC, DTP, and P4 ...... 138

7 Conclusion 140

A PHY Tutorial 142
  A.1 Physical layer: Encoder and Scrambler ...... 142
    A.1.1 Introduction ...... 142
    A.1.2 Encoding ...... 143
    A.1.3 Scrambler ...... 145
    A.1.4 Task ...... 147
  A.2 Physical Layer: Decoder and Descrambler ...... 148
    A.2.1 Introduction ...... 148
    A.2.2 Task ...... 149
  A.3 Pseudo C code for 64/66b Encoder and Decoder ...... 151

B Fast CRC algorithm 157
  B.1 Pseudo assembly code for Fast CRC algorithm ...... 158

C Optimizing Scrambler 162

D Glossary of Terms 165

Bibliography 171

LIST OF TABLES

2.1 List of servers used for development and experiments ...... 30
2.2 List of FPGA boards and NICs used for development and experiments ...... 31
2.3 List of evaluated network switches. "SF" is store-and-forward and "CT" is cut-through. ...... 31

3.1 DMA throughput. The numbers are averages over eight runs. The delta in measurements was within 1% or less. ...... 55
3.2 IPD and IPG of homogeneous packet streams. ...... 71
3.3 Evaluated ε values in the number of /I/s and their corresponding time values in nanoseconds. ...... 71
3.4 ε and associated BER with (1518B, 1Gbps) ...... 80
3.5 Characteristics of CAIDA traces, and measured I90 and BER ...... 82
3.6 I90 values in nanoseconds with cross traffic. ...... 83
3.7 Relation between ε, I90, and max(I90+, I90−) over different networks with (1518B, 1Gbps). Values are in nanoseconds. ...... 84
3.8 Parameter setting for existing algorithms. G is the gap between packet trains. R is the rate of probe. N is the number of probe packets in each sub-train. D is the gap between each sub-train. ...... 92

4.1 Specifications of the PHY at different speeds ...... 123

5.1 Comparison between NTP, PTP, GPS, and DTP ...... 128

LIST OF FIGURES

1.1 Collected interpacket delays between two servers. ...... 6
1.2 PTP precision ...... 11

2.1 IEEE 802.3 10 Gigabit Ethernet Network stack. ...... 17
2.2 IEEE 802.3 64b/66b block format ...... 19
2.3 Common approach to measure offset and RTT. ...... 24
2.4 Clock domains of two peers. The same color represents the same clock domain. ...... 27
2.5 FPGA development boards used for our research. ...... 30
2.6 Simple evaluation setup ...... 33
2.7 A small network. Thick solid lines are 10G connections while dotted lines are 1G connections. ...... 34
2.8 Our path on the National Lambda Rail ...... 35
2.9 Evaluation Setup for DTP ...... 36

3.1 Example usages of SoNIC ...... 43
3.2 Example C code of SoNIC Packet Generator and Capturer ...... 45
3.3 SoNIC architecture ...... 47
3.4 Throughput of packing ...... 51
3.5 Throughput of different CRC algorithms ...... 51
3.6 Throughput of packet generator and capturer ...... 57
3.7 Comparison of packet generation at 9 Gbps ...... 58
3.8 Comparison of timestamping ...... 60
3.9 IPDs of Cisco 4948 and IBM G8264. 1518B packets at 9 Gbps ...... 61
3.10 Chupja encoding and decoding. ...... 68
3.11 Maximum capacity of PHY timing channel ...... 69
3.12 BER of Chupja over a small network and NLR. X-Y-Z means that the workload of cross traffic is X (H-heavy, M-medium, or L-light), and the size of packet and data rate of the overt channel is Y (B-big=1518B or S-small=64B) and Z (1, 3, or 6G). ...... 73
3.13 Comparison of IPDs after switches process a homogeneous packet stream (1518B, 1Gbps) ...... 76
3.14 I90 comparison between various switches ...... 77
3.15 Comparison of homogeneous streams and covert channel streams of (1518B, 1Gbps) ...... 78
3.16 BER over multiple hops of SW1 with various ε values with (1518B, 1Gbps) ...... 81
3.17 Packet size distributions of CAIDA traces ...... 81
3.18 Comparison of various timestamping methods. Each line is a covert channel stream of (1518B, 1Gbps) with a different ε value. ...... 85
3.19 Kernel timestamping with (1518B, 1Gbps). ...... 87
3.20 Hardware timestamping with (64B, 1Gbps) ...... 88

3.21 Generalized probe train model ...... 91
3.22 Bandwidth estimation of CAIDA trace; the figure on the top is the raw data trace, the figure on the bottom is the moving average data. ...... 93
3.23 Measurement result in NLR. ...... 94

4.1 Low layers of a 10 GbE network stack. Grayed rectangles are DTP sublayers, and the circle represents a synchronization FIFO. ...... 109
4.2 DTP enabled four-port device. ...... 110
4.3 Precision of DTP and PTP. A tick is 6.4 nanoseconds. ...... 117
4.4 Precision of DTP daemon. ...... 118

A.1 10G Network Stack ...... 143
A.2 66b block format ...... 145
A.3 Example ...... 145
A.4 Scrambler ...... 146
A.5 Descrambler ...... 149

CHAPTER 1
INTRODUCTION

Time is "a finite extent or stretch of continued existence, as the interval separating two successive events or actions, or the period during which an action, condition, or state continues" [35]. In other words, we use the term time to specify the moment or duration of a particular event. We also use time to serialize the ordering of events. Time is inseparable from our daily lives and has been an interesting research topic in many different areas including physics and philosophy. For example, in physics, Einstein defined time as what a clock at a particular location reads to explore the concept of relativity [56]. When one wants to compare the time of two events from two different locations, the clocks from the two locations must be synchronized to provide a common time. There are many commonly used standard times including Coordinated Universal Time (UTC).

In computer systems, system time is used to specify a moment of creation, update, or removal of an object, to measure the duration of an event, and to order a series of events. System time is measured by a system clock, which normally keeps track of the number of seconds since the Epoch (00:00:00 UTC, January 1, 1970) [33]. The synchronization between a system clock and the reference time (UTC) can be achieved via a time protocol such as the Network Time Protocol (NTP) [98] or the Precision Time Protocol (PTP) [18]. The time difference between a system clock and a reference time can vary from tens of nanoseconds to any number of seconds.

Time is also essential for many network applications and network management. For example, many network applications use timestamps of messages, i.e., marking the time at which the message is sent or received. A timestamp can be generated by reading the system clock or the hardware clock of the network interface card (NIC) that sends or receives the particular message. Then, a process can measure a round trip time (RTT) to another process in a network by comparing two timestamps: One is generated before sending a request message and the other is generated upon receiving a response message. Similarly, a process can measure a one-way delay (OWD) to another process by sending a probe message with its current time. Then, the receiving process can compute the OWD by subtracting the received time from its own time. Note that measuring OWD requires the clocks of the two processes to be synchronized.
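To make the distinction concrete, the sketch below measures an RTT from two local timestamps and computes an OWD from a timestamp carried in a probe message. It is a minimal illustration only, assuming hypothetical send_request(), wait_response(), and message-delivery routines rather than any particular system described in this dissertation.

    #include <stdint.h>
    #include <time.h>

    extern void send_request(void);    /* hypothetical transmit routine  */
    extern void wait_response(void);   /* hypothetical receive routine   */

    /* Read the system clock in nanoseconds (CLOCK_REALTIME is UTC-based). */
    static uint64_t now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_REALTIME, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    }

    /* RTT: both timestamps come from the same local clock, so no clock
     * synchronization is required. */
    uint64_t measure_rtt_ns(void)
    {
        uint64_t t_send = now_ns();
        send_request();
        wait_response();
        return now_ns() - t_send;
    }

    /* OWD: the probe carries the sender's timestamp, so the result is only
     * meaningful if the sender's and receiver's clocks are synchronized. */
    int64_t measure_owd_ns(uint64_t sender_timestamp_ns)
    {
        return (int64_t)(now_ns() - sender_timestamp_ns);
    }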

In this dissertation, we focus on three fundamental research problems that are important for network measurements with respect to time: Timestamping messages, controlling the timing of messages (pacing), and clock synchronization. Precisely timestamping and pacing messages and synchronizing clocks in a network is challenging, especially as the speed of the network continues to increase while at the same time the processing delay of network devices continues to decrease over time. For instance, a packet can arrive every 64 nanoseconds in 10 Gigabit Ethernet (GbE), and even more frequently in faster networks such as 40 or 100 GbE. On the other hand, it takes nearly a microsecond for a network switch to process a message. As a result, it is necessary to have the capability of timestamping and pacing messages at nanosecond scale and to synchronize clocks at similar granularity. Therefore, we seek to investigate the following research question in this dissertation: How do we precisely access and control time in a network of computer systems?

In the course of exploring the above research question, we devised an augmented physical layer of the network protocol stack that provides timing information for network measurements. The augmented physical layer allows users to control and access every single bit over the medium for precise timestamping and pacing. Further, the augmented physical layers in a network provide tightly synchronized clocks with bounded precision. In this dissertation, we investigate how the physical layer can provide precise time and how it improves the performance of network applications.

1.1 Background

1.1.1 Terminology

A clock c of a process p¹ is a function that returns a local clock counter given a real time t, i.e., c_p(t) = local clock counter. Note that a clock is a discrete function that returns an integer, which we call the clock counter throughout the dissertation.

A clock changes its counter at every clock cycle (or tick). If clocks c_i for all i are synchronized, they will satisfy

    ∀i, j, t : |c_i(t) − c_j(t)| ≤ ε    (1.1)

where ε is the level of precision to which clocks are synchronized [116].
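To make Equation 1.1 concrete, the small sketch below (an illustration only, not part of any system in this dissertation) checks whether a set of clock counters read at the same real time t satisfies the precision bound ε by computing the largest pairwise difference.

    #include <stdint.h>
    #include <stdlib.h>

    /* Returns 1 if all pairwise differences |c_i(t) - c_j(t)| are at most eps. */
    int clocks_within_precision(const int64_t *counters, size_t n, int64_t eps)
    {
        int64_t min = counters[0], max = counters[0];
        for (size_t i = 1; i < n; i++) {
            if (counters[i] < min) min = counters[i];
            if (counters[i] > max) max = counters[i];
        }
        /* The largest pairwise difference is max - min. */
        return (max - min) <= eps;
    }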

Each clock is usually driven by a quartz oscillator, which oscillates at a given frequency. Oscillators with the same nominal frequency may run at different rates due to frequency variations caused by external factors such as temperature. As a result, clocks that have been previously synchronized will have clock counters that differ more and more as time progresses. The difference

¹We will use the term process to denote not only a program that is being executed on a processor but also a program running inside a network interface card.

between two clock counters is called the offset or skew, which tends to increase over time if not resynchronized. Therefore, clock synchronization protocols periodically adjust offsets between clocks (offset synchronization) and/or frequencies of clocks (frequency synchronization) so that they remain close to each other [116].

If a process attempts to synchronize its clock to true time by accessing an external clock source such as an atomic clock or a satellite, it is called external synchronization. If a process attempts to synchronize with another (peer) process with or without regard to true time, it is called internal synchronization. Thus, externally synchronized clocks are also internally synchronized, but not vice versa [50]. In many cases, monotonically increasing and internally synchronized clocks are sufficient. For example, measuring one-way delay and processing time or ordering global events does not need true time.

1.1.2 Example: Estimating Available Bandwidth

In order to illustrate the challenges of precisely accessing and controlling time, we will use an example of estimating available bandwidth, the maximum data rate that a process can send to another process without going over the network capacity between the two. Many algorithms for estimating available bandwidth use OWD [68, 111, 120].

Suppose a process p wants to estimate the available bandwidth between itself and a process q. The process p uses a train of probe messages (m_1, ..., m_n) to estimate available bandwidth. The process p controls the time gap between messages to be a pre-computed interval: Let d_i^p be the time gap between m_i and m_{i+1}. Each message m_i carries the timestamp t_i^p, which is the time when the message was sent by p. The process q then timestamps incoming messages (t_i^q) and computes the OWD of each message (= t_i^q − t_i^p). Available bandwidth estimation algorithms then examine and detect any changes in measured OWDs, which are good indicators of buffering, queuing, or congestion in the network. Then the algorithms infer the available bandwidth.
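The sketch below illustrates this receiver-side computation. It is a simplified illustration (not MinProbe itself), assuming the probe timestamps t_i^p and receive timestamps t_i^q are already available in nanoseconds.

    #include <stdint.h>
    #include <stdio.h>

    /* Compute per-probe one-way delays and report how much the OWD grows
     * across the train; a rising trend suggests queuing, i.e., the probe
     * rate exceeded the available bandwidth. */
    void analyze_probe_train(const int64_t *t_send_ns,   /* t_i^p at sender   */
                             const int64_t *t_recv_ns,   /* t_i^q at receiver */
                             size_t n)
    {
        int64_t first_owd = t_recv_ns[0] - t_send_ns[0];
        int64_t last_owd  = t_recv_ns[n - 1] - t_send_ns[n - 1];

        for (size_t i = 0; i < n; i++)
            printf("probe %zu: OWD = %lld ns\n",
                   i, (long long)(t_recv_ns[i] - t_send_ns[i]));

        /* Absolute OWD values are only meaningful if p's and q's clocks are
         * synchronized; the trend across the train is meaningful as long as
         * both clocks run at (nearly) the same rate. */
        printf("OWD increase across train: %lld ns\n",
               (long long)(last_owd - first_owd));
    }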

There are a few potential places which can contribute to inaccuracy in the estimated available bandwidth. Accuracy means the closeness of an estimated bandwidth to the actual bandwidth available. First, the timestamp t_i^p of the probe messages could be imprecise, mainly due to the time gap between the moment p timestamps a message² and the moment the message is transmitted over the wire. The timestamp of a message is precise when the timestamp is close to the actual time when the message was sent or received by hardware. Similarly, the timestamp t_i^q could also be imprecise. Second, the time gap d_i^p might not be the actual time gap between m_i and m_{i+1}. The sender normally controls the time gap by sleeping for d_i^p seconds after sending m_i. However, sleeping, waking up and sending a message can introduce random delays, and thus can make the actual gap d_i'^p larger than d_i^p. Pacing, or controlling the time gap, is precise when an intended time gap d_i^p between two packets is close to the actual gap d_i'^p between them after they are transmitted by hardware. Lastly, there is uncertainty in the degree of synchronization of the clocks of p and q, which could vary from nanoseconds to any number of seconds. We explore each contributor in the following section.

Note that OWD is often estimated to be RTT/2 as synchronizing clocks is a difficult problem. However, such estimation is prone to errors mainly due to path asymmetry: The path that a request message takes might be different from the path that a response message takes, and the time that a request message takes might also be different from the time that a response message takes.

²We will assume that reading a clock and embedding time in a message can happen almost at the same time or with a very small time gap, for simplicity.

Figure 1.1: Collected interpacket delays between two servers. (a) From sender; (b) Difference of interpacket delays.

Also note that time gaps between messages could be very small when the rate of probe messages is high. For example, time gaps of probe messages at 9 Gigabits per second (Gbps) are smaller than 1.5 microseconds in 10 Gigabit networks. As a result, accurate available bandwidth estimation requires at least microsecond-level precision of pacing and timestamping messages and synchronized clocks in a 10 Gigabit network. Without fine-grained control of time gaps and precise timestamping, the accuracy of available bandwidth estimation algorithms could dramatically decrease [125].
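For reference, this bound follows from simple arithmetic (using 1518-byte frames as an illustrative assumption; smaller frames make the spacing tighter still). At an average rate of 9 Gbps, consecutive 1518-byte frames are spaced

    (1518 × 8 bits) / (9 × 10⁹ bits/s) ≈ 1.35 µs

apart, of which the frame itself occupies about 1.21 µs of wire time on a 10 Gbps link, leaving an idle gap of only about 0.14 µs between frames.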

1.2 Challenges for Precisely Accessing Time

1.2.1 Lack of Precise Pacing

Pacing messages is to insert arbitrary delays between messages, i.e., pacing controls the time gap between packets. Pacing is often used to improve the performance of network protocols [126, 135]. There are mainly three places where time gaps can be controlled: Userspace, kernel space, and hardware (NIC). Userspace applications can control the time gap between two packets by waiting or sleeping for a certain amount of time after sending the first packet. However, controlling the time gap with fine granularity at microsecond or nanosecond scale is challenging because the scheduler of the system can introduce random delays into the time gap when the process waits or sleeps. In kernel space, controlling time gaps at microsecond granularity is possible with a high-resolution timer [121]. Pacing in kernel space is often limited to supporting transport protocols [121, 110], and other network applications not implemented in the kernel cannot normally use the high-resolution timer to control time gaps. Commodity network interface cards normally do not provide an interface for controlling time gaps, and hardware that supports pacing only supports the Transmission Control Protocol [129].

In order to demonstrate how hard it is to precisely pace messages, we conducted a simple experiment. We used two servers, each with a NIC with hardware timestamping (discussed below in Section 1.2.2) capability on both transmit and receive paths. The NICs were identical and directly connected via a three-meter cable. The sender periodically sent a message and waited for one microsecond before sending the next. The sender recorded timestamps of outgoing messages reported by the hardware. Then, we computed time gaps between two successive messages and plotted them in Figure 1.1(a). Figure 1.1(a) shows that the actual time gap between messages is around 65 microseconds, which is much larger than the intended time gap, one microsecond. This simple experiment demonstrates the challenge of precisely controlling time gaps. As a result, we investigate the following research question in this thesis: What is required to precisely pace messages?
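For reference, a sender loop of this kind looks roughly like the sketch below (a simplified stand-in, not the actual experiment code, with a hypothetical send_one_message() routine). The random scheduler and wake-up latencies around nanosleep() are what inflate the one-microsecond target gap.

    #include <stdint.h>
    #include <time.h>

    extern void send_one_message(void);   /* hypothetical transmit routine */

    /* Send n messages with an intended gap of 1 microsecond between them.
     * In practice the observed gap is far larger and highly variable,
     * because nanosleep() only guarantees a *minimum* sleep time and the
     * OS scheduler adds its own latency on wake-up. */
    void paced_sender(int n)
    {
        struct timespec gap = { .tv_sec = 0, .tv_nsec = 1000 };   /* 1 us */
        for (int i = 0; i < n; i++) {
            send_one_message();
            nanosleep(&gap, NULL);
        }
    }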

In this dissertation, we explore an alternative way of packet pacing. Instead of waiting / sleeping, or relying on a high-resolution timer, we attempt to directly control the number of bits between messages in the physical layer. We mainly exploit the fact that Ethernet uses one bit per symbol over the medium for data communication. For example, it takes about 100 picoseconds to deliver one symbol over the wire in 10 Gigabit Ethernet (GbE). As a result, controlling bits in the physical layer allows precise pacing on the transmit path, which is not readily available in commodity NICs. We present SoNIC in Chapter 3, which achieves precise pacing by implementing the physical layer of a network stack in software.
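To give a feel for the arithmetic, the sketch below converts a desired interpacket gap into a number of idle (/I/) characters. It is an illustrative calculation only, assuming the 10 GbE line rate of 10.3125 Gbaud described in Chapter 2 (roughly 97 ps per bit, so about 0.78 ns per 8-bit idle character) and the standard's minimum of twelve /I/s between frames.

    #include <stdint.h>

    #define LINE_RATE_BAUD  10312500000.0                 /* 10 GbE: 10.3125 Gbaud */
    #define PS_PER_BIT      (1e12 / LINE_RATE_BAUD)       /* ~96.97 ps per bit     */

    /* Number of 8-bit /I/ characters that approximates a desired gap.
     * (The standard requires at least twelve /I/s in an interpacket gap.) */
    uint64_t idles_for_gap_ps(uint64_t gap_ps)
    {
        uint64_t idles = (uint64_t)((double)gap_ps / (8.0 * PS_PER_BIT));
        return idles < 12 ? 12 : idles;
    }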

1.2.2 Lack of Precise Timestamping

Messages can also be timestamped at three places: Kernel-space, userspace, and hardware. On the receive (RX) path, kernel records arrival times of packets upon receiving them from hardware. Since so many factors are involved during the delivery of a packet to kernel space, such as DMA transaction, interrupt routines, and scheduler, kernel timestamping can be imprecise in a high-speed

network. As a result, userspace timestamping will also be imprecise because of delays added due to interaction between kernel and userspace.

Modern commodity NICs provide hardware timestamping. A NIC uses an additional quartz oscillator inside the hardware to timestamp incoming and outgoing packets at a very early (late) stage of the receive (transmit) path, normally between the Media Access Control (MAC) layer and the physical layer, to achieve better precision. The oscillator can be synchronized to an external clock or be free-running. As a result, the precision of timestamping can be significantly improved.
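On Linux, such NIC timestamps are typically exposed through the SO_TIMESTAMPING socket option. The sketch below shows the receive side in rough outline; it is a hedged example of the general mechanism, not the code used in this dissertation, and it omits the device-level ioctl that many drivers additionally require to enable hardware timestamping.

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>
    #include <linux/net_tstamp.h>   /* SOF_TIMESTAMPING_* flags */
    #include <linux/errqueue.h>     /* struct scm_timestamping  */

    /* Enable raw hardware RX timestamps on an already-created socket. */
    int enable_hw_rx_timestamps(int sock)
    {
        int flags = SOF_TIMESTAMPING_RX_HARDWARE |
                    SOF_TIMESTAMPING_RAW_HARDWARE;
        return setsockopt(sock, SOL_SOCKET, SO_TIMESTAMPING,
                          &flags, sizeof(flags));
    }

    /* Receive one packet and print its hardware timestamp, if present. */
    void recv_with_hw_timestamp(int sock)
    {
        char data[2048], ctrl[512];
        struct iovec iov = { .iov_base = data, .iov_len = sizeof(data) };
        struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                              .msg_control = ctrl,
                              .msg_controllen = sizeof(ctrl) };

        if (recvmsg(sock, &msg, 0) < 0)
            return;

        for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
            if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_TIMESTAMPING) {
                struct scm_timestamping ts;
                memcpy(&ts, CMSG_DATA(c), sizeof(ts));
                /* ts.ts[2] holds the raw hardware timestamp. */
                printf("hw ts: %lld.%09ld\n",
                       (long long)ts.ts[2].tv_sec, ts.ts[2].tv_nsec);
            }
        }
    }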

Unfortunately, hardware timestamping is still not precise. With the same experiment setup discussed in Section 1.2.1, we also captured timestamps of messages from the receiver and computed interpacket delays. Then, we compared the computed delays from the sender and the receiver. Figure 1.1(b) illustrates the difference between corresponding interpacket delays in nanoseconds. As there is nothing in the middle between two identical NICs except a cable, we would expect that corresponding interpacket delays from the sender and the receiver would match or be close to each other, assuming that timestamps are precise. However, Figure 1.1(b) shows that the interpacket delays from the sender and the receiver could differ by four or five hundred nanoseconds most of the time and in the worst case could differ by ten microseconds (which is not shown in the graph).

Therefore, we investigate the following research question: What is required to improve the precision of timestamping?

In this dissertation, we explore an alternative way of hardware timestamping. Instead of timestamping messages between the Media Access Control (MAC) layer and the physical layer (PHY), we attempt to timestamp messages inside the PHY. In particular, rather than using an extra clock to timestamp messages outside the physical layer, we use the clock that generates bits over the wire for timestamping. By doing so, we can use the number of bits between any two messages inside the physical layer as a base for precise timestamping. As noted in Section 1.2.1, it takes about 100 picoseconds to deliver one symbol over the wire in 10 GbE. As a result, the precision of timestamping can significantly improve. We discuss in Chapter 3 how SoNIC, which implements the physical layer in software, can also improve the precision of timestamping.

1.2.3 Lack of Precise Clock Synchronization

Synchronizing clocks is a challenging problem. It is challenging due to the problem of measuring round trip times (RTT) accurately. RTT is used by many clock synchronization protocols to compute the clock offset (skew), the time difference between two system clocks. RTTs are prone to variation due to characteristics of packet-switching networks: Network jitter, packet buffering and scheduling, asymmetric paths, and network stack overhead. As a result, any protocol that relies on RTTs must carefully handle measurement errors.

There are many time synchronization protocols that provide different levels of precision, including NTP [98] and PTP [18]. NTP normally provides millisecond to microsecond precision in a Local Area Network (LAN), and PTP provides microsecond to tens of nanosecond precision in a LAN when properly configured. PTP employs many techniques to remove uncertainties in measured RTTs. For example, hardware timestamping is commonly used and PTP-enabled switches are deployed to minimize network jitter.

Figure 1.2: PTP precision. (a) PTP: Idle network; (b) PTP: Medium loaded.

Nonetheless, it is not possible to completely remove the network jitter and non-deterministic delays. Figure 1.2 illustrates the problem. Figure 1.2(a) shows the performance of PTP when the network is idle, and Figure 1.2(b) shows the performance of PTP when the network is loaded with network traffic. The detailed test environment is described in Section 2.2. The takeaway is that the performance of PTP can be as good as hundreds of nanoseconds, and can degrade to tens of microseconds depending on network conditions. Therefore, we investigate the following research question: How can we minimize non-deterministic delays from the network to improve the precision of time synchronization protocols?

In this dissertation, we developed a new time synchronization protocol called the Datacenter Time Protocol (DTP) that can provide tens of nanosecond precision in a datacenter environment. The protocol runs inside the physical layer in order to eliminate many non-deterministic delay errors from the network. In particular, by running the protocol in the physical layer in a peer-to-peer fashion, it can minimize the network jitter and easily scale up to a larger network. Chapter 4 discusses the design and evaluation of the system.

1.3 Contributions

Traditionally, the physical layer of a network stack in a wired network has been considered a black box and excluded from network research opportunities. However, we demonstrate that an augmented physical layer with timing information can improve many fundamental capabilities that are important for network measurements, including packet pacing, packet capturing, and clock synchronization.

Our research is based on the fact that the physical layer of a network device is already synchronized with the physical layer of its directly connected device. Further, they send a continuous stream of symbols to each other in order to maintain the link between them. Our systems extend and utilize such symbol-level synchronization to achieve fine-grained timestamping, pacing and highly precise clock synchronization. In particular, we present the following three contributions in this dissertation.

Precise timestamping and pacing Accessing individual symbols on the wire requires complete access to the very bottom of the network protocol stack, the physical layer, where symbols are transmitted over or received from the wire. Unfortunately, accessing the inside of the physical layer is not easy: The physical layer is often implemented as a physical chip with a limited or no interface. As a result, complete access to the physical layer requires a different approach: We demonstrate a new approach via SoNIC, a Software-defined Network Interface Card, where the entire physical layer is accessible from software in a way that a user application can control and capture the timing of physical layer bits and achieve bit-level precision.

Covert timing channel and Estimating available bandwidth In order to demonstrate how precise timestamping and pacing offered by SoNIC can improve network applications, we implemented a covert timing channel, Chupja, and an available bandwidth estimation algorithm, MinProbe. Chupja is created by precisely modulating the timings of messages and can deliver hundreds of thousands of bits covertly without being detected. MinProbe generates a train of probe messages with pre-defined intervals, measures the received intervals, and can accurately estimate the available bandwidth.

Precise clock synchronization Achieving high precision in a clock synchronization protocol is not easy due to uncertainties from the network stack and the network itself. We present a clock synchronization protocol that runs in the physical layer. DTP, the Datacenter Time Protocol, is a decentralized protocol that eliminates many non-deterministic elements from the network. As a result, it provides tens of nanosecond bounded precision in a datacenter network. We demonstrate that DTP advances the state of the art.

1.4 Organization

The rest of this dissertation is organized as follows. Chapter 2 describes the scope of the problem and methodology, including the details of how the physical layer works. We present SoNIC, which provides precise timestamping and pacing, in Chapter 3. Chapter 3 also covers two network applications that benefit from precise timestamping and pacing: Chupja in Section 3.4 and MinProbe in Section 3.5. Then, we describe DTP, which provides precise clock synchronization, in Chapter 4. We survey related work in Chapter 5, which is followed by future work in Chapter 6 and the conclusion in Chapter 7.

CHAPTER 2
SCOPE AND METHODOLOGY

In this chapter, we discuss the scope of the problem and the methodology we used for investigation. Our research focuses on how to improve the precision of timestamping, pacing and clock synchronization for network measurements. We use the physical layer to advance the state-of-the-art systems. As a result, we first describe how the physical layer of 10 GbE operates and then describe the scope and methodology.

2.1 Scope

The physical layer of the network stack can provide timing information necessary for precise timestamping, pacing and clock synchronization. In particular, when two physical layers are connected via a cable (we call them peers), each physical layer always generates a continuous bitstream that consists of either special characters or Ethernet frames to maintain the link connectivity. Further, each physical layer recovers the clock from the received bitstream generated by the peer's physical layer. In other words, two physical layers are already synchronized for reliable and robust communication. As a result, if we can control and access every single bit in the physical layer, we can achieve precise timestamping and pacing. Similarly, if we can extend the synchronization between two physical layers, we can achieve precise clock synchronization. Therefore, understanding how the physical layer operates is important for addressing the research questions we investigate in this dissertation.

In this section, we first discuss how the physical layer operates in Section 2.1.1. We mainly focus on 10 Gigabit Ethernet (GbE), which our systems are based on. Other standards such as 40 and 100 GbE share many details, and the same techniques we use in 10 GbE can also be applied to them. In Section 2.1.2, we discuss how accessing the physical layer allows precise timestamping and pacing, and what the challenges are in doing so in software in realtime. In Section 2.1.3, we describe why it is hard to precisely synchronize clocks and why implementing a protocol in the physical layer can improve the precision.

2.1.1 IEEE 802.3 standard

According to the IEEE 802.3 standard [19], the physical layer (PHY) of 10 GbE consists of three sublayers: the Physical Coding Sublayer (PCS), the Physical Medium Attachment (PMA) sublayer, and the Physical Medium Dependent (PMD) sublayer (see Figure 2.1). The PMD sublayer is responsible for transmitting the outgoing symbolstream over the physical medium and receiving the incoming symbolstream from the medium. The PMA sublayer is responsible for clock recovery and (de-)serializing the bitstream. The PCS performs the blocksync and gearbox (we call this PCS1), scramble/descramble (PCS2), and encode/decode (PCS3) operations on every block. The IEEE 802.3 Clause 49 explains the PCS sublayer in further detail, but we will summarize below.

Figure 2.1: IEEE 802.3 10 Gigabit Ethernet Network stack. (Application (L5), Transport (L4), and Network (L3) sit above the Data Link layer (L2), which comprises the LLC (Logical Link Control), MAC (Media Access Control), and RS (Reconciliation Sublayer). The Physical layer (L1) comprises the PCS (Physical Coding Sublayer), with PCS3 Encoder/Decoder, PCS2 Scrambler/Descrambler, and PCS1 Gearbox/Blocksync on the TX and RX paths, above the PMA (Physical Medium Attachment) and PMD (Physical Medium Dependent) sublayers.)

When Ethernet frames are passed from the data link layer to the PHY, they are reformatted before being sent across the physical medium. On the transmit (TX) path, the PCS performs 64b/66b encoding and encodes every 64 bits of an Ethernet frame into a 66-bit block (PCS3), which consists of a two-bit synchronization header (syncheader) and a 64-bit payload. As a result, a 10 GbE link actually operates at 10.3125 Gbaud (10 G × 66/64). Syncheaders are used for block synchronization by the remote RX path.

There are two types of blocks: Data blocks and control blocks. Both are 66 bits in length. The data block (/D/) is shown in the first row of Figure 2.2 and represents data characters from Ethernet frames. All other blocks in Figure 2.2 are control blocks, which contain a combination of data and control characters.

Each rectangle labeled D_i represents an 8-bit data character, and C_i represents a 7-bit control character. An idle character (/I/) is a control character that is used to fill the gap between two Ethernet frames. Given an Ethernet frame, the PCS first encodes a Start control block (S0 and S4 in Figure 2.2), followed by multiple data blocks. At the end of the frame, the PCS encodes a Terminate control block (T0 to T7) to indicate the end of the Ethernet frame. Note that one of the Start control blocks (S4) and most of the Terminate control blocks (T0 to T7) have one to seven control characters. These control characters are normally filled with idle characters (zeros).

There is a 66-bit Control block (/E/), which encodes eight 7-bit idle characters (/I/). As the standard requires at least twelve /I/s in an interpacket gap, it is guaranteed to have at least one /E/ block preceding any Ethernet frame.¹ Moreover, when there is no Ethernet frame, there are always /E/ blocks: 10 GbE is always sending at 10 Gbps and sends /E/ blocks continuously if there are no Ethernet frames to send.

The PCS scrambles each encoded 66-bit block (PCS2) to maintain Direct Current (DC) balance² and adapts the 66-bit width of the block to the 16-bit width of the PMA interface (PCS1; the gearbox converts the bit width from 66 to 16 bits) before passing it down the network stack. The entire 66-bit block is transmitted as a continuous stream of symbols which a 10 GbE network transmits over a physical medium (PMA & PMD). On the receive (RX) path, the PCS performs block synchronization based on two-bit syncheaders (PCS1) and descrambles each 66-bit block (PCS2) before decoding it (PCS3).
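As a rough illustration of how software can interpret this stream, the sketch below classifies a 66-bit block using its syncheader and block type field, following the block formats listed in Figure 2.2. It is a simplified sketch, not SoNIC's decoder, and it assumes the syncheader and block type field have already been extracted into separate variables.

    #include <stdint.h>

    enum pcs_block { BLK_DATA, BLK_IDLE, BLK_START, BLK_TERMINATE, BLK_OTHER, BLK_INVALID };

    /* Classify a PCS block given its two-bit syncheader (01 for data, 10 for
     * control) and, for control blocks, the 8-bit block type field. */
    enum pcs_block classify_block(uint8_t syncheader, uint8_t block_type)
    {
        if (syncheader == 0x1)                    /* 01: data block /D/       */
            return BLK_DATA;
        if (syncheader != 0x2)                    /* only 01 and 10 are valid */
            return BLK_INVALID;
        switch (block_type) {                     /* 10: control block        */
        case 0x1e: return BLK_IDLE;               /* /E/: eight idle characters */
        case 0x78: case 0x33: return BLK_START;   /* /S0/, /S4/: start of frame */
        case 0x87: case 0x99: case 0xaa: case 0xb4:
        case 0xcc: case 0xd2: case 0xe1: case 0xff:
                   return BLK_TERMINATE;          /* /T0/../T7/: end of frame */
        default:   return BLK_OTHER;              /* other control block types */
        }
    }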

¹Full-duplex Ethernet standards such as 1, 10, 40, and 100 GbE send at least twelve /I/s (at least one /E/) between every Ethernet frame.
²DC balance ensures a mix of 1's and 0's is sent.

Block  Sync  Type   Payload
/D/    01    --     D0 D1 D2 D3 D4 D5 D6 D7
/E/    10    0x1e   C0 C1 C2 C3 C4 C5 C6 C7
/S0/   10    0x78   D1 D2 D3 D4 D5 D6 D7
/S4/   10    0x33   C0 C1 C2 C3 D5 D6 D7
/T0/   10    0x87   C1 C2 C3 C4 C5 C6 C7
/T1/   10    0x99   D0 C2 C3 C4 C5 C6 C7
/T2/   10    0xaa   D0 D1 C3 C4 C5 C6 C7
/T3/   10    0xb4   D0 D1 D2 C4 C5 C6 C7
/T4/   10    0xcc   D0 D1 D2 D3 C5 C6 C7
/T5/   10    0xd2   D0 D1 D2 D3 D4 C6 C7
/T6/   10    0xe1   D0 D1 D2 D3 D4 D5 C7
/T7/   10    0xff   D0 D1 D2 D3 D4 D5 D6

Figure 2.2: IEEE 802.3 64b/66b block format (bit positions 0 to 65; the two-bit syncheader is followed by the 64-bit payload, whose first byte is the block type field for control blocks).

Above the PHY are the Media Access Control (MAC) sublayer and the Reconciliation Sublayer (RS). The 10 GbE MAC operates in full duplex mode; it does not handle collisions. Consequently, it only performs data encapsulation/decapsulation and media access management. Data encapsulation includes framing as well as error detection. A Cyclic Redundancy Check (CRC) is used to detect bit corruption. Media access management inserts at least 96 bits (twelve /I/ characters) between two Ethernet frames. The RS is responsible for additionally inserting or removing idle characters (/I/s) between Ethernet frames to maintain the link speed. The channel between the RS and the PHY is called the 10 gigabit media independent interface (XGMII) and is four bytes wide. Data is transferred at every rising and falling edge. As a result, the channel operates at 156.25 MHz (= 10 Gbps / 32 bits / 2).

On the transmit path, upon receiving a layer 3 packet, the MAC prepends a preamble, a start frame delimiter (SFD), and an Ethernet header to the beginning of the frame. It also pads the Ethernet payload to satisfy a minimum frame-size requirement (64 bytes), computes a CRC value, and places the value in the Frame Check Sequence (FCS) field. On the receive path, the MAC checks the CRC value and passes the Ethernet header and payload to higher layers while discarding the preamble and SFD.

2.1.2 Precise Timestamping and Pacing Need Access to PHY

Accessing the PHY provides the ability to study networks and the network stack at a heretofore inaccessible level: It can help improve the precision of network measurements by orders of magnitude [58]. A 10 GbE network uses one bit per symbol. Since a 10 GbE link operates at 10.3125 Gbaud, each and every symbol is 97 picoseconds wide (= 1/(10.3125 × 10⁹) seconds). Knowing the number of bits can then translate into having a precise measure of time at sub-nanosecond granularity. In particular, depending on the combination of data and control characters in the PCS block (Figure 2.2), the number of bits between data frames is not necessarily a multiple of eight. Therefore, on the RX path, we can tell the exact distance between Ethernet frames in bits by counting every bit. On the TX path, we can control time gaps by controlling the number of bits (idle characters) between frames.
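The sketch below shows the corresponding timestamp arithmetic on the receive side: given the number of bits counted between two frames, it returns the gap in picoseconds. It is a minimal illustration under the 10.3125 Gbaud figure above, not SoNIC's actual implementation.

    #include <stdint.h>

    /* At 10.3125 Gbaud each bit occupies 1e12 / 10.3125e9 ~= 96.97 ps. */
    static double picoseconds_per_bit(void)
    {
        return 1e12 / 10.3125e9;
    }

    /* Interpacket gap in picoseconds, given the number of bits counted in
     * the PHY between the end of one frame and the start of the next. */
    double gap_ps_from_bits(uint64_t bits_between_frames)
    {
        return (double)bits_between_frames * picoseconds_per_bit();
    }

For example, the minimum interpacket gap of twelve /I/ characters (96 bits) corresponds to roughly 9.3 nanoseconds.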

Unfortunately, current commodity network interface cards do not provide any Application Program Interface (API) for accessing and controlling every bit in the PHY. One way of doing so is to use physics equipment such as an oscilloscope for capturing signals and a laser modulator for transmitting signals, as shown in BiFocals [58]. However, BiFocals is not a realtime tool: It can only transmit pre-generated symbols and must perform extensive offline computation to recover symbols from the signals captured by the oscilloscope. As a result, we take a different approach to access and control bits in the PHY: Implement the PHY in software.

The fundamental challenge in performing the PHY functionality in software is maintaining synchronization with hardware while efficiently using system resources. Some important areas of consideration when addressing this challenge include hardware support, realtime capability, scalability and efficiency, and a usable interface.

Hardware support

The hardware must be able to transfer raw symbols from the wire to software at high speeds. This requirement can be broken down into four parts: a) Converting optical signals to digital signals (PMD), b) Clock recovery for bit detection (PMA), and c) Transferring large amounts of bits to software through a high-bandwidth interface. Additionally, d) the hardware should leave recovered bits (both control and data characters in the PHY) intact until they are transferred and consumed by the software. Commercial optical transceivers are available for a). However, hardware that simultaneously satisfies b), c) and d) is not common since it is difficult to handle 10.3125 Giga symbols in transit every second.

NetFPGA 10G [91] does not provide software access to the PHY. In particular, NetFPGA pushes not only layers 1-2 (the physical and data link layers) into hardware, but potentially layer 3 as well. Furthermore, it is not possible to easily undo this design since it uses an on-board chip to implement the PHY, which prevents direct access to the PCS sublayer. As a result, we need a new hardware platform to support software access to the PHY.

Realtime Capability

Both hardware and software must be able to process 10.3125 Gigabits per second (Gbps) continuously. The IEEE 802.3 standard [19] requires the 10 GbE PHY to generate a continuous bitstream. However, synchronization between hardware and software and between multiple pipelined cores is non-trivial. The overheads of interrupt handlers and OS schedulers can cause a discontinuous bitstream, which can subsequently incur packet loss and broken links. Moreover, it is difficult to parallelize the PCS sublayer onto multiple cores. This is because the (de-)scrambler relies on state to recover bits. In particular, the (de-)scrambling of one bit relies upon the 58 bits preceding it. This fine-grained dependency makes it hard to parallelize the PCS sublayer. The key takeaway here is that everything must be efficiently pipelined and well-optimized in order to implement the PHY in software while minimizing synchronization overheads.
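For concreteness, a bit-serial model of the self-synchronizing scrambler used by 10 GbE (polynomial 1 + x³⁹ + x⁵⁸, IEEE 802.3 Clause 49) looks like the sketch below. This is an illustrative, unoptimized version (SoNIC's optimized scrambler is discussed in Chapter 3 and Appendix C), and it shows why each output bit depends on state built from the bits that precede it.

    #include <stdint.h>

    /* Scramble one bit: out = in XOR s[38] XOR s[57], where s holds the
     * previously produced scrambled bits (taps of 1 + x^39 + x^58). */
    static inline uint8_t scramble_bit(uint8_t in, uint64_t *state)
    {
        uint8_t out = in ^ ((*state >> 38) & 1) ^ ((*state >> 57) & 1);
        *state = (*state << 1) | out;   /* shift the new output bit into the state */
        return out;
    }

    /* Scramble the 64-bit payload of a 66-bit block, least significant bit
     * first (the two-bit syncheader is not scrambled). */
    uint64_t scramble_payload(uint64_t payload, uint64_t *state)
    {
        uint64_t scrambled = 0;
        for (int i = 0; i < 64; i++)
            scrambled |= (uint64_t)scramble_bit((payload >> i) & 1, state) << i;
        return scrambled;
    }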

Scalability and Efficiency

The software must scale to process multiple 10 GbE bitstreams while efficiently utilizing resources. Intense computation is required to implement the PHY and MAC layers in software. (De-)scrambling every bit and computing the CRC value of an Ethernet frame are especially intensive. A functional solution would require multiple duplex channels that independently perform the CRC, encode/decode, and scramble/descramble computations at 10.3125 Gbps. The building blocks for the PCS and MAC layers will therefore consume many CPU cores. In order to achieve a scalable system that can handle multiple 10 GbE bitstreams, resources such as the PCIe, memory bus, Quick Path Interconnect (QPI), cache, CPU cores, and memory must be efficiently utilized.

User Interface

Users must be able to easily access and control the PHY. Many resources from software to hardware must be tightly coupled to allow realtime access to the PHY. Thus, an interface that allows fine-grained control over them is necessary. The interface must also implement an I/O channel through which users can retrieve data such as the count of bits for precise timing information.

2.1.3 Precise Clock Synchronization Needs Access to PHY

The common mechanism of synchronizing two clocks is similar across different algorithms and protocols: A process reads a different process’s current clock counter and computes an offset, adjusting its own clock frequency or clock counter by the offset.

In more detail, a process p sends a time request message with its current local clock counter (t_a in Figure 2.3) to a process q (q reads p's clock). Then, process q responds with a time response message with its local clock counter and p's original clock counter (p reads q's clock). Next, process p computes the offset between its local clock counter and the remote clock counter (q's) and the round trip time (RTT) of the messages upon receiving the response at time t_d. Finally, p adjusts its clock counter or the rate of its clock to remain close to q's clock.

In order to improve precision, q can respond with two clock counters to remove the internal delay of processing the time request message: One upon receiving the time request (t_b) and the other before sending the time response (t_c). See Figure 2.3. For example, in NTP, the process p computes the RTT δ and offset θ as follows [98]:

    δ = (t_d − t_a) − (t_c − t_b)
    θ = (t_b + t_c)/2 − (t_a + t_d)/2

Figure 2.3: Common approach to measure offset and RTT. (The client records t_a when sending the request and t_d when receiving the response; on the wire these appear as t'_a and t'_d. The server records t_b upon receiving the request and t_c before sending the response; on the wire these appear as t'_b and t'_c.)

Then, p applies these values to adjust its local clock.
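A direct transcription of these two formulas, with a made-up numeric example, is shown below (timestamps in nanoseconds; the numbers are illustrative only).

    #include <stdio.h>
    #include <stdint.h>

    /* NTP-style delay (delta) and offset (theta) from the four timestamps. */
    void ntp_delay_offset(int64_t ta, int64_t tb, int64_t tc, int64_t td,
                          int64_t *delta, int64_t *theta)
    {
        *delta = (td - ta) - (tc - tb);
        *theta = (tb + tc) / 2 - (ta + td) / 2;
    }

    int main(void)
    {
        int64_t delta, theta;
        /* Hypothetical numbers: the server's clock is ahead by ~1000 ns and
         * the one-way wire delay is ~500 ns in each direction. */
        ntp_delay_offset(/*ta=*/0, /*tb=*/1500, /*tc=*/1600, /*td=*/1100,
                         &delta, &theta);
        printf("delta = %lld ns, theta = %lld ns\n",
               (long long)delta, (long long)theta);   /* delta = 1000, theta = 1000 */
        return 0;
    }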

Problems of Clock synchronization

Precision of a clock synchronization protocol is a function of clock skew, errors in reading remote clocks, and the interval between resynchronizations [50, 63, 75]. We discuss these factors in turn below and how they contribute to (reduced) precision in clock synchronization protocols.

Problems with Oscillator Skew Many factors such as temperature and the quality of an oscillator can affect oscillator skew. Unfortunately, we often do not have control over these factors to the degree necessary to prevent reduced precision. As a result, even though oscillators may have been designed with the same nominal frequency, they may actually run at slightly different rates, causing clock counters to diverge over time and requiring synchronization.

Problems with Reading Remote Clocks There are many opportunities where reading clocks can be inaccurate and lead to reduced precision. In particular, reading remote clocks can be broken down into multiple steps (enumerated below) where each step can introduce random delay errors that can affect the precision of clock synchronization.

1. Preparing a time request (reply) message

2. Transmitting a time request (reply) message

3. Packet traversing through a network

4. Receiving a time request (reply) message

5. Processing a time request (reply) message

Specifically, there are three points where precision is adversely affected: (a) Precision of timestamping affects steps 1 and 5, (b) the software network stack can introduce errors in steps 2 and 4, and (c) network jitter can contribute errors in step 3. We discuss each one further.

First, precise timestamping is not trivial. Before transmitting a message, a process timestamps the message to embed its own local counter value. Similarly, after receiving a message, a process timestamps it for further processing (i.e., computing RTT). Timestamping is often imprecise in commodity systems [81], which is a problem. It can add random delay errors which can prevent the nanosecond-level timestamping required for 10 Gigabit Ethernet (10 GbE), where minimum-sized packets (64 bytes) arriving at line speed can arrive every 68 nanoseconds. Improved timestamping with nanosecond resolution via new NICs is becoming more accessible [24]. However, random jitter can still be introduced due to the issues discussed below.

Second, transmitting and receiving messages involve a software network stack (e.g., between t_a and t'_a in Figure 2.3). Most clock synchronization protocols (e.g., NTP and PTP) run in a time daemon, which periodically sends and receives UDP packets to and from a remote process (or a time server). Unfortunately, the overhead of system calls, buffering in the kernel and network interfaces, and direct memory access transactions can all contribute to errors in delay [53, 58, 81]. To minimize the impact of measurement errors, a daemon can run in kernel space or kernel bypassing can be employed. Nonetheless, non-deterministic delay errors cannot be completely removed when a protocol involves a network stack.

Third, packet propagation time can vary since it is prone to network jitter (e.g., between t'_a and t'_b or between t'_c and t'_d in Figure 2.3). Two processes are typically multiple hops away from each other, and the delay between them can vary over time depending on network conditions and external traffic. Further, time requests and responses can be routed through asymmetric paths, or they may suffer different network conditions even when they are routed through symmetric paths. As a result, the measured delay, which is often computed by dividing the RTT by two, can be inaccurate.

Problems with Resynchronization Frequency

The more frequent the resynchronizations, the more precisely clocks can be synchronized to each other. However, frequent resynchronization requires increased message communication, which adds overhead to the network, especially in a datacenter network where hundreds of thousands of servers exist. The interval between resynchronizations can be configured. It is typically configured to resynchronize over a period of once per second [18], which will keep network overhead low, but on the flip side, will also adversely affect the precision of clock synchronization.

Figure 2.4: Clock domains of two peers. The same color represents the same clock domain. (Each device's TX PHY is driven by its own TX clock, while its RX PHY recovers the clock of the peer's TX PHY.)

Why clock synchronization in the PHY?

Our goal is to synchronize clocks with nanosecond-level precision, with scalability in a datacenter network, and without any network overhead. We achieve this goal by running a decentralized protocol in the PHY.

We exploit the fact that two peers³ are already synchronized in the PHY in order to transmit and receive bitstreams reliably and robustly. In particular, the receive path (RX) of a peer's physical layer recovers the clock from the physical medium signal generated by the transmit path (TX) of the sending peer's PHY. As a result, although there are two physical clocks in the two network devices, they are virtually in the same circuit (Figure 2.4).

Further, a commodity switch often uses one clock oscillator to feed the sole switching chip in the switch [7], i.e., all TX paths of a switch use the same clock source. Given a switch and N network devices connected to it, there are N + 1 physical oscillators to synchronize, and all of them are virtually in the same circuit.

³Two peers are two ports physically connected via a cable.

As delay errors from network jitter and a software network stack can be minimized by running the protocol in the lowest level of a system [116], the PHY is the best place to reduce those sources of errors. In particular, we give three reasons why clock synchronization in the PHY addresses the problems discussed previously.

First, the PHY allows precise timestamping at sub-nanosecond scale, which can provide enough fidelity for nanosecond-level precision. Timestamping [58, 81] in the PHY achieves high precision by counting the number of bits between and within packets. Timestamping in the PHY relies on the clock oscillator that generates bits in the PHY, and, as a result, it is possible to read and embed clock counters with a deterministic number of clock cycles in the PHY.

Second, a software network stack is not involved in the protocol. As the PHY is the lowest layer of a network protocol stack, there is always a deterministic delay between timestamping a packet and transmitting it. In addition, it is always possible to avoid buffering in a network device because protocol messages can always be transmitted when there is no other packet to send.

Lastly, there is little to no variation in delay between two peers in the PHY. The only element in the middle of two physically communicating devices is the wire that connects them. As a result, when there is no packet in transit, the delay in the PHY measured between two physically connected devices will be the time to transmit bits over the wire, a few clock cycles required to process bits in the PHY (which can be deterministic), and a clock domain crossing (CDC), which can add additional random delay. A CDC is necessary for passing data between two clock domains, namely between the transmit and receive paths. Synchronization First-In-First-Out (FIFO) queues are commonly used for a CDC. In a synchronization FIFO, a signal from one clock domain goes through multiple

flip-flops in order to avoid meta-stability from the other clock domain. As a result, one random delay could be added until the signal is stable to read.

Operating a clock synchronization protocol in the PHY layer not only pro- vides the benefits of zero to little delay errors, but also zero overhead to a net- work: There is no need for injection of packets to implement a clock synchro- nization protocol. As mentioned in Section 2.1.1 a network interface continu- ously generates either Ethernet frames or special characters (idle characters) to maintain a link connection to its peer. If we use these idle characters to de- liver protocol messages (and revert them back to idle characters), no additional packets will be required. Further, we can send protocol messages between every Ethernet frame without degrading the bandwidth of Ethernet and for different

Ethernet speeds.

2.2 Methodology: Experimental Environments

We used various types of hardware and network topologies throughout this dissertation to evaluate our approach. We illustrate them in the following sub- sections.

Figure 2.5: FPGA development boards used for our research. (a) HiTech Global FPGA board; (b) Terasic DE5-Net board.

Name       CPU                         # of cores   L3 Cache   Memory   NIC
SoNIC      Two Xeon X5670, 2.93GHz     6            12 MB      12 GB    HiTech with SoNIC
ALT10G     Two Xeon X5670, 2.93GHz     6            12 MB      12 GB    HiTech with 10 GbE
Client     Two Xeon X5670, 2.93GHz     6            12 MB      12 GB    Myricom dual 10G port
Adversary  Two Xeon X5670, 2.93GHz     6            12 MB      12 GB    Myricom dual 10G port
Fractus    Two Xeon E5-2690, 2.90GHz   8            20 MB      96 GB    Terasic DE5 with DTP; Mellanox dual 10G port

Table 2.1: List of servers used for development and experiments

2.2.1 Hardware

For developing and evaluating SoNIC, Chupja and MinProbe, Dell Precision

T7500 workstations and Dell T710 servers were used. Each machine was a dual socket, 2.93 GHz six core Xeon X5670 (Westmere [21]) with 12 MB of shared Level 3 (L3) cache and 12 GB of RAM, 6 GB connected to each of the two CPU sockets. We differentiated hardware configurations of the T7500 workstations for various purposes; a machine with an HiTech Global FPGA board [15] (Fig- ure 2.5(a)) with SoNIC firmware (we call this SoNIC server), a machine with an

FPGA board with an Altera 10G Ethernet design [1] (we call this ALT10G), and a machine with a Myricom 10G-PCIE2-8B2-2S dual 10G port NIC (we call this Client). Similarly, we also differentiated hardware configurations of the T710 servers for various purposes: We used one 10 GbE Dual-port NIC for receiving

packets for evaluating Chupja (we call this the adversary) and used an HiTech Global FPGA board with SoNIC firmware for evaluating MinProbe (we also call this SoNIC server).

Name                       # of 10 GbE ports   Note
HiTech                     2                   Stratix IV FPGA
DE5-NET                    4                   Stratix V FPGA
Myricom 10G-PCIE2-8B2-2S   2                   HW timestamping
Mellanox ConnectX-3        2                   HW timestamping, PTP-enabled

Table 2.2: List of FPGA boards and NICs used for development and experiments

Name           Type   40G   10G   1G   Full bandwidth   Forwarding   Note
SW1            Core   0     8     0    160 Gbps         SF
SW2            ToR    4     48    0    1280 Gbps        CT
SW3            ToR    0     2     48   136 Gbps         SF
SW4            ToR    0     2     24   105.6 Gbps       SF
Cisco 6900     Core   0     8     0    160 Gbps         SF
Cisco 4948     ToR    0     2     48   136 Gbps         SF
IBM G8264      ToR    4     48    0    1280 Gbps        CT           PTP-enabled
Dell Force10   ToR    4     48    0    1280 Gbps        CT

Table 2.3: List of evaluated network switches. “SF” is store-and-forward and “CT” is cut-through.

For evaluating DTP, a cluster of twelve Dell R720 servers was used (we call them Fractus nodes). Each server was a dual socket, 2.90 GHz eight-core Xeon E5-2690 with 20 MB of shared L3 cache and 96 GB of RAM, 48 GB connected to each of the two CPU sockets. The machines had seven PCIe Gen 3.0 slots, where DTP hardware was plugged in: We used DE5-Net boards from Terasic [9] (Figure 2.5(b)). A DE5-Net board is an FPGA development board with an Altera

Stratix V FPGA [5] and four Small Form-factor Pluggable (SFP+) modules. Each server also had a Mellanox ConnectX-3 MCX312A 10G NIC, which supports hardware timestamping for incoming and outgoing packets.

Table 2.3 summarizes the commercial switches that we used for evaluating our systems. SW1 was a core / aggregate router with multiple 10 GbE ports, and we installed two modules with four 10 GbE ports. SW2 was a high-bandwidth

10 GbE top-of-rack (ToR) switch that was able to support forty-eight 10 GbE ports at line speed. Moreover, it was a cut-through switch whose latency for forwarding a packet was only a few microseconds. SW3 and SW4 were 1 GbE ToR switches with two 10 GbE uplinks. Other than SW2, all switches were store-and-forward switches. Further, we used Cisco 6500, Cisco 4948, IBM G8264, and Dell Force10 switches for the evaluation of SoNIC, DTP, and MinProbe. The IBM G8264 switch is a PTP-enabled cut-through switch.

2.2.2 Network Topologies

Simple Setup

In order to evaluate pacing and timestamping capabilities of SoNIC, we used two FPGA boards. In particular, we connected the SoNIC board and the ALT10G board directly via optic fibers (Figure 2.6(a)). For evaluating SoNIC’s pacing capability, we used SoNIC to generate packets to ALT10G. ALT10G pro- vided detailed statistics such as the number of valid/invalid Ethernet frames, and frames with CRC errors. We compared these numbers from ALT10G with statistics from SoNIC to verify the correctness of SoNIC. Similarly, for evaluat- ing SoNIC’s timestamping capability, we used ALT10G for packet generation. ALT10G allowed us to generate random packets of any length and with the min- imum interpacket gap. We verified if SoNIC was able to correctly receive and timestamp all packets from ALT10G.

Figure 2.6: Simple evaluation setup. (a) Evaluation setup for SoNIC; (b) Simple topology for evaluating SoNIC; (c) Evaluation setup for Chupja.

Further, we created a simple topology to evaluate SoNIC: We used port 0 of SoNIC server to generate packets to the Client server via an arbitrary network, and split the signal with a fiber optic splitter so that the same stream can be directed to both the Client and port 1 of the SoNIC server capturing packets (Figure 2.6(b)). We used various network topologies composed of Cisco 4948 [8] and IBM BNT G8264 [17] switches for the network between the SoNIC server and the Client.

For experiments of covert channels called Chupja in Chapter 3.4, we de- ployed two SoNIC servers each equipped with two 10 GbE ports to connect fiber optic cables. We used one SoNIC server (SoNIC1) to generate packets of the sender destined to a server (the adversary) via a network. We placed a fiber optic splitter at the adversary which mirrored packets to SoNIC1 for capture

Figure 2.7: A small network. Thick solid lines are 10G connections while dotted lines are 1G connections.

(i.e. SoNIC1 was both the sender and receiver). SoNIC2 was used to generate cross traffic flows when necessary (Figure 2.6(c)). We placed none or multiple commercial switches between the sender and the adversary (the cloud within Figure 2.6(c)). A similar topology was used to evaluate available bandwidth estimation called MinProbe in Chapter 3.5. We placed two network switches between the sender and the receiver.

Small Network in a Lab

We created our own network for evaluating Chupja by connecting six switches, and four servers (See Figure 2.7). The topology resembled a typical network where core routers (SW1) were in the middle and 1 GbE ToR switches (SW3 and SW4) were leaf nodes. Then, SoNIC1 (the sender) generated packets to SW3 via one 10 GbE uplink, which forwarded packets to the receiver which was connected to SW4 via one 10 GbE uplink. Therefore, it was a seven-hop network with 0.154 ms round trip time delay on average. Then, we used four servers (Server1 to 4) to generate cross traffic. Each server had four 1 GbE and two 10 GbE ports. Server1 (Server4) was connected to SW3 (SW4) via three 1

Figure 2.8: Our path on the National Lambda Rail.

GbE links, and Server2 (Server3) was connected to SW3 via two 10 GbE links. These servers generated traffic across the network with Linux pktgen [106].

The bandwidth of cross traffic over each link between switches is illustrated in Figure 2.7: 1 GbE links were utilized with flows at α Gbps and 10 GbE links at β Gbps.

National Lambda Rail

National Lambda Rail (NLR) was a wide-area network designed for research and had significant cross traffic [26]. We set up a path from Cornell university to NLR over nine routing hops and 2500 miles one-way (Figure 2.8). All the routers in NLR were Cisco 6500 routers. The average round trip time of the path was

67.6 ms and there was always cross traffic. In particular, many links on our path were utilized with 1∼4 Gbps cross traffic during the experiment. Cross traffic was not under our control, however we received regular measurements of traffic on external interfaces of all routers.

Fractus Cluster

Figure 2.9: Evaluation Setup for DTP.

We created a DTP network as shown in Figure 2.9: A tree topology with a height of two, i.e. the maximum number of hops between any two leaf servers was four. The DE5-Net boards of the root node, S0, and the intermediate nodes, S1 ∼ S3, were configured as DTP switches, and those of the leaves (S4 ∼ S11) were configured as DTP NICs. We used 10-meter Cisco copper twinax cables to connect the DE5-Net boards' SFP+ modules.

We also created a PTP network with the same servers as shown in Figure 2.9 (PTP used Mellanox NICs). A VelaSync timeserver from Spectracom was de- ployed as a PTP grandmaster clock. An IBM G8264 cut-through switch was used to connect the servers including the timeserver. As a result, the number of hops between any two servers in the PTP network was always two. Cut- through switches are known to work well in PTP networks [132]. We deployed a commercial PTP solution (Timekeeper [31]) in order to achieve the best preci- sion in 10 GbE.

The timeserver multicasted PTP timing information every second (sync message), i.e. the synchronization rate was once per second, which was the recommended sync rate by the provider. Further, we enabled PTP UNICAST capability, which allowed the server to send unicast sync messages to individ- ual PTP clients once per second in addition to multicast sync messages. In our

configuration, a client sent two Delay Req messages per 1.5 seconds.

The PTP network was mostly idle except when we introduced network congestion. Since PTP used UDP datagrams for time synchronization, the precision of PTP could vary depending on network workloads. As a result, we introduced network workloads between servers using iperf [22]. Each server occasionally generated MTU-sized UDP packets destined for other servers so that PTP messages could be dropped or arbitrarily delayed.

2.3 Summary

In this chapter, we discussed the challenges of accessing the PHY for precise timestamping and pacing and the challenges of precisely synchronizing clocks.

Accessing the PHY in software requires transmitting raw bits from the wire to software (and vice versa) in realtime. Careful design of hardware and software is essential in achieving the goal. Precisely synchronizing clocks is generally difficult due to errors from measuring round trip time. RTTs can be inaccurate because of imprecise timestamping, network stack overhead, and network jitter. Implementing the protocol in the PHY can eliminate non-deterministic errors from measuring round trip time. We also described the servers and switches we used for evaluating our systems and network topologies we created for evalua- tions in this chapter.

CHAPTER 3
TOWARDS PRECISE TIMESTAMPING AND PACING: SONIC

Precise timestamping and pacing are fundamental for network measurements and many network applications, such as available bandwidth estimation algorithms [87, 88, 120], network traffic characterization [69, 82, 127], and creation and detection of covert timing channels [44, 89, 90]. As a result, in order to improve the performance of network applications, it is important to improve the precision of timestamping and pacing messages. In this chapter, we demonstrate that the precision of timestamping and pacing can significantly improve via access to the physical layer (PHY). We also demonstrate that the precision provided by access to the PHY can improve the performance of network applications such as covert timing channels and available bandwidth estimation.

Accessing the PHY is not easy, mainly because NICs do not provide any APIs. As a result, we present a new approach for accessing the PHY from software. In particular, we present SoNIC, Software-defined Network Interface

Card which implemented the PHY in software and as a result improved the precision of timestamping and pacing messages. In essence, in SoNIC, all of the functionality in the PHY that manipulate bits were implemented in soft- ware. SoNIC consisted of commodity off-the-shelf multi-core processors and a field-programmable gate array (FPGA) development board with peripheral component interconnect express (PCIe) Gen 2.0 bus. High-bandwidth PCIe interfaces and powerful FPGAs could support full bidirectional data transfer for two 10 GbE ports. Further, we created and implemented optimized tech- niques to achieve not only high-performance packet processing, but also high- performance 10 GbE bitstream control in software. Parallelism and optimiza-

tions allowed SoNIC to process multiple 10 GbE bitstreams at line-speed.

With software access to the PHY, SoNIC provided two important capabili- ties for network research applications: Precise packet pacing and precise packet timestamps. First, as a powerful network measurement tool, SoNIC could gen- erate packets at full data rate with minimal interpacket delay. It also provided fine-grained control over interpacket delays; it could inject packets with no variance in interpacket delays. Second, SoNIC accurately captured incoming packets at any data rate including the maximum, while simultaneously times- tamping each packet with sub-nanosecond granularity. In other words, SoNIC could capture exactly what was sent. Further, this precise timestamping and pacing could improve the accuracy of research based on interpacket delay. For example, SoNIC could be used to profile network components. It created tim- ing channels that were undetectable from software application and accurately estimated available bandwidth between two endhosts.

In this chapter, we make four contributions:

• We provide precise timestamping and pacing via a new approach enabling software access to the PHY.

• We designed SoNIC with commodity components such as multi-core pro- cessors and a PCIe pluggable board and present a prototype of SoNIC.

• We demonstrate that SoNIC could enable flexible, precise, and realtime network research applications. SoNIC increased the flexibility of packet

pacing and the precision of packet timestamping.

• We also demonstrate that network research studies based on interpacket

delay such as covert timing channels and estimating available bandwidth

could be significantly improved via precise pacing and timestamping offered by SoNIC.

3.1 Design

The design goals of SoNIC were to provide 1) access to the PHY in software,

2) realtime capability, 3) scalability and efficiency, 4) precision, and 5) user in- terface. As a result, SoNIC must allow users realtime access to the PHY in software, provide an interface to applications, process incoming packets at line- speed, and be scalable. Our ultimate goal was to achieve the same flexibility and control of the entire network stack for a wired network, as a software-defined radio [122] did for a wireless network, while maintaining the same level of pre- cision as BiFocals [58]. Access to the PHY could then enhance the accuracy of timestamping and pacing in network research based on interpacket delay. In this section, we discuss the design of SoNIC and how it addressed the chal- lenges presented in Section 2.1.2.

3.1.1 Access to the PHY in software

An application must be able to access the PHY in software. Thus, our solution must implement the bit generation and manipulation functionality of the PHY in software. The transmission and reception of bits could be handled by hard- ware. We carefully examined the PHY to determine an optimal partitioning of functionality between hardware and software.

As discussed in Section 2.1.1 and shown in Figure 2.1, the PMD and PMA

sublayers of the PHY do not modify any bits or change the clock rate. They simply forward the symbolstream/bitstream to other layers. Similarly, PCS1 only converts the bit width (gearbox) or identifies the beginning of a new 64/66 bit block

(blocksync). Therefore, the PMD, PMA, and PCS1 were all implemented in hardware as a forwarding module between the physical medium and SoNIC’s software component (See Figure 2.1). Conversely, PCS2 (scramble/descram- ble) and PCS3 (encode/decode) actually manipulate bits in the bitstream and so they were implemented in SoNIC’s software component. SoNIC provided full access to the PHY in software; as a result, all of the functionality in the PHY that manipulate bits (PCS2 and PCS3) were implemented in software.

For this partitioning between hardware and software, we chose an Altera

Stratix IV FPGA [4] development board from HiTechGlobal [15] as our hard- ware platform. The board included a PCIe Gen 2 interface (=32 Gbps) to the host PC and was equipped with two SFP+ (Small Form-factor Pluggable) ports

(Figure 2.5(a)). The FPGA was equipped with 11.3 Gbps transceivers which could perform the 10 GbE PMA at line-speed. Once symbols were delivered to a transceiver on the FPGA they were converted to bits (PMA) and then trans- mitted to the host via PCIe by direct memory access (DMA). This board satisfied all the requirements discussed in the previous Section 2.1.2.

3.1.2 Realtime Capability

To achieve realtime, it was important to reduce any synchronization overheads between hardware and software, and between multiple pipelined cores. In

SoNIC, the hardware did not generate interrupts when receiving or transmit-

ting. Instead, the software decided when to initiate a DMA transaction by polling a value from a shared data memory structure where only the hardware wrote. This approach was called pointer polling and was better than interrupts because there was always data to transfer due to the nature of continuous bitstreams in 10 GbE.
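As a rough illustration only (the structure and function names below are illustrative, not SoNIC's actual definitions), pointer polling amounts to software spinning on an index that only the hardware advances:

#include <stdint.h>

/* Sketch of pointer polling: hardware advances a producer index in a
 * shared, DMA-coherent structure; software spins on it instead of
 * taking an interrupt. */
struct ring_status {
    volatile uint32_t hw_head;  /* written only by the hardware */
    uint32_t sw_tail;           /* written only by the software */
};

static void rx_poll(struct ring_status *st)
{
    while (st->hw_head == st->sw_tail)
        ;                       /* spin: new data is always imminent in 10 GbE */
    /* consume pages [sw_tail, hw_head) here, then advance the tail */
    st->sw_tail = st->hw_head;
}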

In order to synchronize multiple pipelined cores, a chasing-pointer FIFO from Sora [122] was used, which supported low-latency pipelining. The chasing-pointer FIFO removed the need for a shared synchronization variable and instead used an additional flag for each entry to indicate whether the entry was available, reducing synchronization overheads. In our implementation, we improved the FIFO by avoiding memory operations as well. Memory allocation and page faults were expensive and had to be avoided to meet the realtime capability. Therefore, each FIFO entry in SoNIC was pre-allocated during initialization. In addition, the number of entries in a FIFO was kept small so that the amount of memory required for a port could fit into the shared L3 cache.
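A minimal sketch of the idea, with illustrative names rather than Sora's or SoNIC's actual definitions: each preallocated entry carries its own availability flag, so the producer and consumer never touch a shared head/tail variable.

#include <stdint.h>

struct fifo_entry {
    volatile int full;   /* 1: written by producer, 0: consumed */
    void *data;          /* preallocated buffer for one batch of blocks */
};

struct chasing_fifo {
    struct fifo_entry *entries;          /* preallocated at init */
    unsigned int size, head, tail;       /* head: producer, tail: consumer */
};

/* Producer: wait for its slot to drain, then publish. The consumer
 * mirrors this on entries[tail], clearing 'full' when done. */
static void fifo_put(struct chasing_fifo *f, void *data)
{
    struct fifo_entry *e = &f->entries[f->head];
    while (e->full)
        ;                                /* consumer is chasing this slot */
    e->data = data;
    e->full = 1;
    f->head = (f->head + 1) % f->size;
}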

We used the Intel Westmere processor to achieve high performance. Intel

Westmere is a Non-Uniform Memory Access (NUMA) architecture that was efficient for implementing packet processing applications [54, 65, 95, 112]. It was further enhanced by an instruction, PCLMULQDQ, that performs carry-less multiplication. We used the instruction to implement a fast Cyclic Redundancy Check (CRC) algorithm [61] that the Media Access Control (MAC) requires, making it possible to implement a CRC engine that could process 10 GbE bits at line-speed on a single core.

Figure 3.1: Example usages of SoNIC. (a) Packet Generator; (b) Packet Capturer.

3.1.3 Scalability and Efficiency

The FPGA board we used was equipped with two physical 10 GbE ports and a PCIe interface that could support up to 32 Gbps. Our design goal was to support two physical ports per board. Consequently, the number of CPU cores and the amount of memory required for one port had to be bounded. Further, considering the intense computation required for the PCS and MAC and that processors came with four to six or even eight cores per socket, our goal was to limit the number of CPU cores required per port to the number of cores available in a socket. As a result, for one port we implemented four dedicated kernel threads, each running on a different CPU core. We used a PCS thread and a MAC thread on both the transmit and receive paths. We called our threads TX PCS, RX PCS, TX MAC and RX MAC. Interrupt requests (IRQs) were re-routed to unused cores so that SoNIC threads did not give up the CPU and could meet the realtime requirements.
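The per-core thread layout can be sketched with standard kernel threading primitives; the stage function and core numbers below are illustrative, not SoNIC's actual code:

#include <linux/kthread.h>
#include <linux/sched.h>
#include <linux/err.h>

/* Hypothetical per-stage loop; the real SoNIC threads process 66-bit
 * blocks to/from the DMA rings on every iteration. */
static int sonic_stage_loop(void *port_state)
{
    while (!kthread_should_stop())
        ;   /* process one batch of blocks per iteration */
    return 0;
}

/* Create one dedicated kernel thread for a pipeline stage and pin it
 * to its own CPU core so it never migrates between cores. */
static struct task_struct *start_stage(void *port_state, int cpu)
{
    struct task_struct *t;

    t = kthread_create(sonic_stage_loop, port_state, "sonic_stage/%d", cpu);
    if (IS_ERR(t))
        return t;
    kthread_bind(t, cpu);
    wake_up_process(t);
    return t;
}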

Additionally, we were careful to use memory efficiently: DMA buffers were preallocated and reused and data structures were kept small to fit in the shared

L3 cache. Further, by utilizing memory efficiently, dedicating threads to cores, and using multi-processor QPI support, we could linearly increase the number

of ports with the number of processors. QPI provides enough bandwidth to transfer data between sockets at a very fast data rate (> 100 Gbps).

A significant design issue still remained: Communication and CPU core utilization. The way we pipelined CPUs, i.e. how FIFOs were shared, depended on the application. In particular, we pipelined CPUs differently depending on the application to reduce the number of active CPUs; unnecessary CPUs were returned to the OS. Further, we enhanced communication with a general rule of thumb: Taking advantage of the NUMA architecture and L3 cache and placing closely related threads on the same CPU socket.

Figure 3.1 illustrates examples of how to share FIFOs among CPUs. An ar- row was a shared FIFO. For example, a packet generator only required TX ele- ments (Figure 3.1(a)); RX PCS simply received and discarded bitstreams, which was required to keep a link active. On the contrary, a packet capturer required RX elements (Figure 3.1(b)) to receive and capture packets. TX PCS was required to establish and maintain a link to the other end by sending /I/s. To create a network profiling application, both the packet generator and packet capturer could run on different sockets simultaneously.

3.1.4 Precision

As discussed in Section 3.1.1, the PCS2 and PCS3 shown in Figure 2.1 were implemented in software. Consequently, the software received the entire raw bitstream from the hardware. While performing PCS2 and PCS3 functionalities, a PCS thread recorded the number of bits in between and within each Ethernet frame. This information could later be retrieved by a user application. Moreover, SoNIC allowed users to precisely control the number of bits in between frames when transmitting packets, and could even change the value of any bits.

 1: #include "sonic.h"
 2:
 3: struct sonic_pkt_gen_info info = {
 4:     .pkt_num  = 1000000000UL,
 5:     .pkt_len  = 1518,
 6:     .mac_src  = "00:11:22:33:44:55",
 7:     .mac_dst  = "aa:bb:cc:dd:ee:ff",
 8:     .ip_src   = "192.168.0.1",
 9:     .ip_dst   = "192.168.0.2",
10:     .port_src = 5000,
11:     .port_dst = 5000,
12:     .idle     = 12, };
13:
14: fd1 = open(SONIC_CONTROL_PATH, O_RDWR);
15: fd2 = open(SONIC_PORT1_PATH, O_RDONLY);
16:
17: ioctl(fd1, SONIC_IOC_RESET);
18: ioctl(fd1, SONIC_IOC_SET_MODE, SONIC_PKT_GEN_CAP);
19: ioctl(fd1, SONIC_IOC_PORT0_INFO_SET, &info);
20: ioctl(fd1, SONIC_IOC_RUN, 10);
21:
22: while ((ret = read(fd2, buf, 65536)) > 0) {
23:     /* process data */ }
24:
25: close(fd1);
26: close(fd2);

Figure 3.2: Example C code of SoNIC Packet Generator and Capturer

For example, we used this capability to give users fine-grained control over packet generators and could create virtually undetectable covert channels.

3.1.5 User Interface

SoNIC exposed fine-grained control over the path that a bitstream travels in software. SoNIC used the ioctl system call for control and the character de-

vice interface to transfer information when a user application needed to retrieve data. Moreover, users could assign which CPU cores or socket each thread ran on to optimize the path.

To allow further flexibility, SoNIC allowed additional application-specific threads, called APP threads, to be pipelined with other threads. A character device was used to communicate with these APP threads from userspace. For instance, users could implement a logging thread pipelined with receive path threads (RX PCS and/or RX MAC). Then the APP thread delivered packet in- formation along with precise timing information to userspace via a character device interface. There were two constraints that an APP thread must always meet: Performance and pipelining. First, whatever functionality was imple- mented in an APP thread, it must be able to perform it faster than 10.3125 Gbps for any given packet stream in order to meet the realtime capability. Second, an APP thread must be properly pipelined with other threads, i.e. input/output

FIFO must be properly set. SoNIC supported one APP thread per port.

Figure 3.2 illustrates the source code of an example use of SoNIC as a packet generator and capturer that exhibit precise pacing and timestamping capabili- ties. After SONIC IOC SET MODE was called (line 18), threads were pipelined as illustrated in Figure 3.1(a) and 3.1(b). After SONIC IOC RUN command (line 20), port 0 started generating packets given the information from info (line

3-12) for 10 seconds (line 20) while port 1 started capturing packets with very precise timing information. Captured information was retrieved with read sys- tem calls (line 22-23) via a character device. As a packet generator, users could set the desired number of /I/s between packets (line 12) for precise pacing. For example, twelve /I/ characters achieved the maximum data rate. Increasing

the number of /I/ characters decreased the data rate.

Figure 3.3: SoNIC architecture

3.1.6 Discussion

We have implemented SoNIC to achieve the design goals described above, namely, software access to the PHY, realtime capability, scalability, high pre- cision, and an interactive user interface that enable precise timestamping and pacing. Figure 3.3 shows the major components of our implementation. From top to bottom, user applications, software as a loadable Linux kernel module, hardware as a firmware in FPGA, and a SFP+ optical transceiver. Although Fig- ure 3.3 only illustrates one physical port, there were two physical ports avail- able in SoNIC. SoNIC software consisted of about 6k lines of kernel module code, and SoNIC hardware consisted of 6k lines of Verilog code excluding auto-

generated source code by Altera Quartus [3] with which we developed SoNIC’s hardware modules.

The idea of accessing the PHY in software can be applied to other physical layers with different speeds. The 1 GbE and 40 GbE PHYs are similar to the 10 GbE PHY in that they run in full duplex mode and maintain continuous bitstreams. In particular, the 40 GbE PCS employs four PCS lanes that implement 64B/66B encoding as in the 10 GbE PHY. Therefore, it is possible to access these PHYs with appropriate clock rates and hardware support. However, it might not be possible to implement a four-times-faster scrambler with current CPUs.

3.2 Implementation

Performance was paramount for SoNIC to achieve its goals and to allow soft- ware access to the entire network stack. In this section we discuss the software

(Section 3.2.1) and hardware (Section 3.2.2) optimizations that we employed to enable SoNIC. Further, we evaluate each optimization (Sections 3.2.1 and 3.2.2) and demonstrate that they helped to enable SoNIC and network research appli- cations (Section 3.3) with high performance.

3.2.1 Software Optimizations

MAC Thread Optimizations As stated in Section 3.1.2, we used the PCLMULQDQ instruction, which performs carry-less multiplication of two 64-bit quadwords [62], to implement the fast CRC algorithm [61]. The algorithm folds a large chunk of

data into a smaller chunk using the PCLMULQDQ instruction to efficiently reduce the size of data. We adapted this algorithm and implemented it using inline assembly with optimizations for small packets. See Appendix B for an implementation of the fast CRC algorithm using the PCLMULQDQ instruction.
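As a hedged illustration of the folding idea (the constant 'k' and the function name are placeholders, not the values or code from Appendix B), one folding step combines two carry-less multiplications of the running remainder with the next 16 bytes of input:

#include <stdint.h>
#include <wmmintrin.h>   /* _mm_clmulepi64_si128 (PCLMULQDQ) */
#include <emmintrin.h>   /* _mm_xor_si128 */

/* One folding step: multiply the two 64-bit halves of the running
 * 128-bit remainder by precomputed constants (packed into 'k') and
 * fold the result into the next 128 bits of data. */
static inline __m128i crc_fold_step(__m128i rem, __m128i next, __m128i k)
{
    __m128i lo = _mm_clmulepi64_si128(rem, k, 0x00); /* rem.lo * k.lo */
    __m128i hi = _mm_clmulepi64_si128(rem, k, 0x11); /* rem.hi * k.hi */
    return _mm_xor_si128(_mm_xor_si128(lo, hi), next);
}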

PCS Thread Optimizations Considering there are 156 million 66-bit blocks per second, the PCS must process each block in less than 6.4 nanoseconds. Our optimized (de-)scrambler processed each block in 3.06 nanoseconds, which even gave enough time to implement decode/encode and DMA transactions within a single thread.

In particular, the PCS thread needed to implement the (de-)scrambler function, G(x) = 1 + x^39 + x^58, to ensure that a mix of 1's and 0's was always sent (DC balance). The (de-)scrambler function could be implemented with Algorithm 1, which was very computationally expensive [58], taking 320 shift and 128 xor operations (5 shift operations and 2 xors per iteration times 64 iterations). In fact, our original implementation of Algorithm 1 performed at 436 Mbps, which was not sufficient and became the bottleneck for the PCS thread. We optimized and reduced the scrambler algorithm to a total of 4 shift and 4 xor operations

(Algorithm 2) by carefully examining how hardware implements the scrambler function [124]. Both Algorithm 1 and 2 are equivalent, but Algorithm 2 ran 50 times faster (around 21 Gbps). See Appendix C for a more detailed explanation of Algorithm 2.
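Algorithm 2 (shown below) translates almost line-for-line into C; a minimal sketch, assuming the 58-bit scrambler state is kept in a 64-bit word:

#include <stdint.h>

/* Parallel scrambler for G(x) = 1 + x^39 + x^58 (Algorithm 2).
 * 'state' holds the scrambler state; 'd' is the 64-bit block payload.
 * This mirrors the pseudocode only; the real PCS thread also handles
 * the 2-bit syncheaders and 64B/66B encoding/decoding around it. */
static inline uint64_t scramble64(uint64_t *state, uint64_t d)
{
    uint64_t s = *state;
    uint64_t r = (s >> 6) ^ (s >> 25) ^ d;
    r = r ^ (r << 39) ^ (r << 58);
    *state = r;
    return r;
}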

Algorithm 1: Scrambler
    s ← state
    d ← data
    for i = 0 → 63 do
        in ← (d >> i) & 1
        out ← (in ⊕ (s >> 38) ⊕ (s >> 57)) & 1
        s ← (s << 1) | out
        r ← r | (out << i)
        state ← s
    end for

Algorithm 2: Parallel Scrambler
    s ← state
    d ← data
    r ← (s >> 6) ⊕ (s >> 25) ⊕ d
    r ← r ⊕ (r << 39) ⊕ (r << 58)
    state ← r

Memory Optimizations We used packing to further improve performance. Instead of maintaining an array of data structures that each contained metadata and a pointer to the packet payload, we packed as much data as possible into a preallocated memory space: Each packet structure contained metadata, packet payload, and an offset to the next packet structure in the buffer. This packing helped to reduce the number of page faults, and allowed SoNIC to process small packets faster. Further, to reap the benefits of the PCLMULQDQ instruction, the first byte of each packet was always 16-byte aligned.
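A sketch of what such a packed entry might look like (field names are illustrative, not SoNIC's actual definitions):

#include <stdint.h>

/* One packed entry in a preallocated buffer: metadata, the packet
 * payload, and the byte offset of the next entry. Keeping payloads
 * 16-byte aligned lets the PCLMULQDQ-based CRC read aligned data. */
struct packed_pkt {
    uint32_t len;          /* payload length in bytes */
    uint32_t next_offset;  /* offset of the next entry in the buffer */
    uint64_t idle_count;   /* /I/ characters preceding this packet */
    uint8_t  payload[];    /* packet bytes, starting 16-byte aligned */
};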

Evaluation We evaluated the performance of the TX MAC thread when computing CRC values to assess the performance of the fast CRC algorithm and packing packets we implemented relative to batching an array of packets. For comparison, we computed the theoretical maximum throughput (Reference throughput) in packets per second (pps) for any given packet length (i.e. the pps necessary to achieve the maximum throughput of 10 Gbps less any protocol overhead).

Figure 3.4: Throughput of packing (throughput in pps versus Ethernet frame size in bytes)

Figure 3.5: Throughput of different CRC algorithms (throughput in pps versus Ethernet frame size in bytes)

If only one packet was packed in the buffer, packing would perform the same as batching since the two were essentially the same in this case. We doubled the factor of packing from 1 to 32 and assessed the performance of packing each time, i.e. we doubled the number of packets written to a single buffer. Figure 3.4 shows that packing by a factor of 2 or more always outperformed the Reference throughput and was able to achieve the max throughput for small packets while batching did not.

Next, we compared our fast CRC algorithm against two CRC algorithms that

the Linux Kernel provided. One of the Linux CRC algorithms was a naive bit computation and the other was a table lookup algorithm. Figure 3.5 illustrates the results of our comparisons. The x-axis is the length of packets tested while the y-axis is the throughput. The Reference line represents the maximum possible throughput given the 10 GbE standard. Packet lengths ranged across the spectrum of sizes allowed by the 10 GbE standard, from 64 bytes to 1518 bytes. For this evaluation, we allocated 16 pages packed with packets of the same length and computed CRC values with different algorithms for 1 second. As we can see from Figure 3.5, the throughput of the table lookup closely followed the Reference line; however, for several packet lengths, it underperformed the Reference line and was unable to achieve the maximum throughput. The fast CRC algorithm, on the other hand, outperformed the Reference line and target throughput for all packet sizes.

Lastly, we evaluated the performance of pipelining and using multiple threads on the TX and RX paths. We tested a full path of SoNIC to assess the performance as packets travel from the TX MAC to the TX PCS for transmission and up the reverse path for receiving from the RX PCS to the RX MAC and to the APP (as a logging thread). All threads performed better than the Reference target throughput. The overhead of FIFO was negligible when we compared the throughputs of individual threads to the throughput when all threads were pipelined together. Moreover, when using two ports simultaneously (two full instances of receive and transmit SoNIC paths), the throughput for both ports achieved the Reference target maximum throughput.

3.2.2 Hardware Optimizations

DMA Controller Optimizations Given our desire to transfer large amounts of data (more than 20 Gbps) over the PCIe, we implemented a high performance DMA controller. There were two key factors that influenced our design of the DMA controller. First, because the incoming bitstream was a continuous 10.3125

Gbps, there must be enough buffering inside the FPGA to compensate for a transfer latency. Our implementation allocated four rings in the FPGA for two ports (Figure 3.3 shows two of the rings for one port). The maximum size of each ring was 256 KB with the size being limited by the amount of SRAM available on hardware.

The second key factor we needed to consider was the efficient utilization of bus bandwidth. The DMA controller operated at a data width of 128 bits. If we sent a 66-bit data block over the 128-bit bus every clock cycle, we would waste 49% of the bandwidth, which was not acceptable. To achieve more efficient use of the bus, we created a sonic_dma_page data structure and separated the syncheader from the packet payload before storing a 66-bit block in the data structure. Sixteen two-bit syncheaders were concatenated together to create a 32-bit integer and stored in the syncheaders field of the data structure. The 64-bit packet payloads associated with these syncheaders were stored in the payloads field of the data structure. For example, the i-th 66-bit PCS block from a DMA page consisted of the two-bit sync header from syncheaders[i/16] and the 64-bit payload from payloads[i]. With this data structure there was a 32-bit overhead for every page; however, it did not impact the overall performance.
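A sketch of this layout (the block count per page and the bit ordering within each 32-bit word are assumptions; the real sonic_dma_page definition may differ):

#include <stdint.h>

#define BLOCKS_PER_PAGE 496   /* assumed number of 66-bit blocks per DMA page */

/* Syncheaders are packed sixteen per 32-bit word; the 64-bit payloads
 * follow. The i-th PCS block is rebuilt from syncheaders[i/16] and
 * payloads[i], as described above. */
struct sonic_dma_page_sketch {
    uint32_t syncheaders[BLOCKS_PER_PAGE / 16];
    uint64_t payloads[BLOCKS_PER_PAGE];
};

static inline void get_block(const struct sonic_dma_page_sketch *p, int i,
                             uint8_t *sync, uint64_t *payload)
{
    *sync    = (p->syncheaders[i / 16] >> ((i % 16) * 2)) & 0x3;
    *payload = p->payloads[i];
}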

PCI Express Engine Optimizations When SoNIC was first designed, it only

supported a single port. As we scaled SoNIC to support multiple ports simultaneously, the need for multiplexing traffic among ports over the single PCIe link became a significant issue. To solve this issue, we employed a two-level arbitration scheme to provide fair arbitration among ports. A lower level arbiter was a fixed-priority arbiter that worked within a single port and arbitrated between four basic Transaction Layer Packet (TLP) types: Memory, I/O, configuration, and message. The TLPs were assigned fixed priority in favor of the write transaction towards the host. The second level arbiter implemented a virtual channel, where the Traffic Class (TC) fields of TLPs were used as demultiplexing keys. We implemented our own virtual channel mechanism in SoNIC instead of using the one available in the PCIe stack since virtual channel support was an optional feature for vendors to comply with. In fact, most chipsets on the market at the time of this dissertation did not support the virtual channel mechanism. By implementing virtual channel support in SoNIC, we achieved better portability since we did not rely on chip vendors to enable PCIe arbitration.

Evaluation

We examined the maximum throughput for DMA between SoNIC hardware and SoNIC software to evaluate our hardware optimizations. It was important that the bidirectional data rate of each port of SoNIC was greater than 10.3125

Gbps. For this evaluation, we created a DMA descriptor table with one entry, and changed the size of memory for each DMA transaction from one page (4K) to sixteen pages (64KB), doubling the number of pages each time. We evaluated the throughput of a single RX or TX transaction, dual RX or TX transactions, and full bidirectional RX and TX transactions with both one and two ports (see

54 Throughput (RX) Throughput (TX) Configuration Same Socket? # pages # pages Realtime? Port 0 Port 1 Port 0 Port 1 Single RX 16 25.7851 Yes 16 13.9339 13.899 Dual RX No 8 14.2215 13.134 Single TX 16 23.7437 Yes 16 14.0082 14.048 Dual TX No 16 13.8211 13.8389 Single RX/TX 16 21.0448 16 22.8166 4 10.7486 10.8011 8 10.6344 10.7171 No 4 11.2392 11.2381 16 12.384 12.408 Yes Yes 8 13.9144 13.9483 8 9.1895 9.1439 Yes 8 14.1109 14.1107 16 10.6715 10.6731 Yes Dual RX/TX 4 10.5976 10.183 8 10.3703 10.1866 No 4 10.9155 10.231 16 12.1131 11.7583 Yes No 8 13.4345 13.1123 8 8.3939 8.8432 Yes 8 13.4781 13.3387 16 9.6137 10.952 Yes

Table 3.1: DMA throughput. The numbers are averages over eight runs; the delta in measurements was within 1% or less.

the rows of Table 3.1). We also measured the throughput when traffic was sent to one or two CPU sockets.

Table 3.1 shows the DMA throughputs of the transactions described above.

We first measured the DMA without using pointer polling (see Section 3.1.2) to obtain the maximum throughputs of the DMA module. For single RX and TX transactions, the maximum throughput was close to 25 Gbps. This was less than the theoretical maximum throughput of 29.6 Gbps for the x8 PCIe interface, but closely matched the reported maximum throughput of 27.5 Gbps [2] from Altera design. Dual RX or TX transactions also resulted in throughputs similar to the reference throughputs of Altera design.

Next, we measured the full bidirectional DMA transactions for both ports varying the number of pages again. As shown in the bottom half of Table 3.1, we had multiple configurations that supported throughputs greater than 10.3125 Gbps for full bidirections. However, there were a few configurations in which the TX throughput was less than 10.3125 Gbps. That was because the TX di-

rection required a small fraction of RX bandwidth to fetch the DMA descriptor. If RX ran at maximum throughput, there was little room for the TX descriptor request to get through. However, as the last column on the right indicates, these configurations were still able to support the realtime capability, i.e. consistently running at 10.3125 Gbps, when pointer polling was enabled. This was because the RX direction only needed to run at 10.3125 Gbps, less than the theoretical maximum throughput (14.8 Gbps), and thus gave more room to TX. On the other hand, two configurations where both RX and TX ran faster than 10.3125 Gbps for full bidirection were not able to support the realtime capability. For the rest of the chapter, we used 8 pages for RX DMA and 16 pages for TX DMA.

3.3 Evaluation

How can SoNIC enable flexible, precise and novel network research applica- tions? Specifically, what unique value does software access to the PHY buy?

SoNIC can literally count the number of bits between and within packets, which can be used for timestamping at the sub-nanosecond granularity (again each bit is 97 ps wide, or about ∼0.1 ns). At the same time, access to the PHY allows users control over the number of idles (/I/s) between packets when gener- ating packets. This fine-grained control over the /I/s means we can precisely control the data rate and the distribution of interpacket gaps. For example, the data rate of a 64B packet stream with uniform 168 /I/s is 3 Gbps. When this precise packet generation is combined with exact packet capture, also enabled by SoNIC, we can improve the accuracy of any research based on interpacket delays [44, 69, 82, 87, 88, 89, 90, 120, 127, 129].
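For reference, converting a PHY bit count into time is a one-line calculation (a sketch of the arithmetic only):

#include <stdint.h>

/* 10 GbE runs at 10.3125 Gbaud after 64b/66b encoding, i.e. 10.3125
 * bits per nanosecond, so each bit on the wire lasts about 97 ps. */
static inline double phy_bits_to_ns(uint64_t bits)
{
    return (double)bits / 10.3125;
}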

Figure 3.6: Throughput of packet generator and capturer (throughput in pps versus Ethernet frame size)

In this section, we demonstrate SoNIC’s accurate packet generation (packet pacing) capability in Section 3.3.1 and packet capture capability in Section 3.3.2, which were unique contributions and could enable unique network research in and of themselves given both the flexibility, control, and precision. Further, we demonstrate that SoNIC could precisely and flexibly characterize and profile commodity network components like routers, switches, and NICs. Section 3.3.3 discusses the profiling capability enabled by SoNIC.

We define interpacket delay (IPD) as the time difference between the first bits of two successive packets, and interpacket gap (IPG) as the time difference between the last bit of the first packet and the first bit of the next packet. Thus, an interpacket delay between two packets is equal to the sum of the transmission time of the first packet and the interpacket gap between the two (i.e. IPD = IPG + packet size). A homogeneous packet stream consists of packets that have the same destination, the same size and the same IPGs (IPDs) between them. Furthermore, the variance of IPGs and IPDs of a homogeneous packet stream is always zero.

Figure 3.7: Comparison of packet generation at 9 Gbps (CDF of interpacket delays in ns)

3.3.1 Packet Generator (Packet Pacing)

Packet generation is important for network research. It can stress test end-hosts, switches/routers, or a network itself. Moreover, packet generation can be used for replaying a trace, studying distributed denial of service (DDoS) attacks, or probing firewalls.

In order to claim that a packet generator is accurate, packets need to be crafted with fine-grained precision (minimum deviations in IPD) at the maxi- mum data rate. However, this fine-grained control was not usually exposed to users. Further, commodity servers equipped with a commodity NIC often did not handle small packets efficiently and require batching [54, 65, 95, 112]. Thus, the sending capability of servers/software-routers were determined by the net- work interface devices. Myricom Sniffer 10G [25] provided line-rate packet in- jection capability, but did not provide fine-grained control of interpacket gaps (IPG). Hardware based packet generators such as ALT10G did not provide any interface for users to flexibly control IPGs.

We evaluated SoNIC as a packet generator (Figure 2.6(a) and 3.1(a)). Fig- ure 3.7 compares the performance of SoNIC to that of Sniffer 10G. Note, we did

not include ALT10G in this evaluation since we could not generate packets at 9 Gbps. We used two servers with Sniffer 10G enabled devices to generate 1518B packets at 9 Gbps between them. We split the stream so that SoNIC could capture the packet stream in the middle (we describe this capture capability in the following section). As the graph shows, Sniffer 10G allowed users to generate packets at a desired data rate; however, it did not give control over the IPD; that is, 85.65% of packets were sent in a burst (instantaneous 9.8 Gbps and minimum IPG (14 /I/s)). SoNIC, on the other hand, could generate packets with a uniform distribution. In particular, SoNIC generated packets with no variance for the IPD (i.e. a single point on the CDF, represented as a triangle). Moreover, the maximum throughput perfectly matched the Reference throughput (Figure 3.6) while the TX PCS consistently ran at 10.3125 Gbps (which is not shown). In addition, we observed no packet loss, bit errors, or CRC errors during our experiments.

The SoNIC packet generator could easily achieve the maximum data rate, and allowed users to precisely control the number of /I/s to set the data rate of a packet stream. Moreover, with SoNIC, it was possible to inject fewer /I/s than the standard requires. For example, we could achieve 9 Gbps with 64B packets by inserting only eight /I/s between packets. This capability was not possible with any other (software) platform. In addition, if the APP thread was carefully designed, users could flexibly inject a random number of /I/s between packets, or the number of /I/s from captured data.

Figure 3.8: Comparison of timestamping (CDF of interpacket delays in ns for SoNIC, kernel, userspace, and Sniffer 10G timestamping)

3.3.2 Packet Capturer (Packet Timestamping)

A packet capturer (a.k.a. packet sniffer, or packet analyzer) plays an important role in network research; it is the opposite side of the same coin as a packet gen- erator. It can record and log traffic over a network which can later be analyzed to improve the performance and security of networks. In addition, capturing pack- ets with precise timestamping is important for High Frequency Trading [74, 115] or latency sensitive applications.

Similar to the sending capability, the receiving capability of servers and software routers is inherently limited by the network adapters they use; it has been shown that some NICs are not able to receive packets at line speed for certain packet sizes [112]. Furthermore, if batching was used, timestamping was significantly perturbed when done in the kernel or userspace [58]. High-performance devices such as Myricom Sniffer10G [25, 72] provided sustained capture of 10 GbE by bypassing the kernel network stack. It also provided timestamping at

500 ns resolution for captured packets. SoNIC, on the other hand, could receive packets of any length at line-speed with precise timestamping.

Figure 3.9: IPDs of Cisco 4948 and IBM G8264; 1518B packets at 9 Gbps

Putting it all together, when we used SoNIC as a packet capturer (Figure 2.6(a) and 3.1(b)), we were able to receive at the full Reference data rate

(Figure 3.6). For the APP thread, we implemented a simple logging application which captured the first 48 bytes of each packet along with the number of /I/s and bits between packets. Because of the relatively slow speed of disk writes, we stored the captured information in memory. This required about 900MB to capture a stream of 64 byte packets for 1 second, and 50 MB for 1518 byte packets. We used ALT10G to generate packets for 1 second and compared the number of packets received by SoNIC to the number of packets generated.

SoNIC had perfect packet capture capabilities with flexible control in soft- ware. In particular, Figure 3.8 shows that given a 9 Gbps generated traffic with uniform IPD (average IPD=1357.224ns, stdev=0), SoNIC captured what was sent; this is shown as a single triangle at (1357.224, 1). All the other packet capture methods within userspace, kernel or a mixture of hardware timestamp- ing in userspace (Sniffer 10G) failed to accurately capture what was sent. We received similar results at lower bandwidths as well.

3.3.3 Profiler

Interpacket delays are a common metric for network research. They can be used to estimate available bandwidth [87, 88, 120], increase TCP throughput [129], characterize network traffic [69, 82, 127], and detect and prevent covert timing channels [44, 89, 90]. Many metrics in these areas are based on IPDs.

SoNIC increased the accuracy of those applications because of its precise con- trol and capture of IPDs. In particular, when the SoNIC packet generator and capturer were combined, i.e. one port transmitted packets while the other port captured, SoNIC became a flexible platform for various studies. As an example, we demonstrate how SoNIC was used to profile network switches.

Switches can be generally divided into two categories: store-and-forward and cut-through switches. Store-and-forward switches decode incoming packets and buffer them before making a routing decision. On the other hand, cut-through switches route incoming packets before entire packets are decoded to reduce the routing latency. We generated 1518B packets with a uniform 1357.19 ns IPD (= 9 Gbps) to a Cisco 4948 (store-and-forward) switch and an IBM BNT

G8264 (cut-through) switch. These switches showed different characteristics as shown in Figure 3.9. The x-axis is the interpacket delay; the y-axis is the cu- mulative distribution function. The long dashed vertical line on the left is the original IPD injected to the packet stream.

There are several takeaways from this experiment. First, the IPD for generated packets had no variance; none. The generated IPD produced by SoNIC was always the same. Second, the cut-through switch introduced IPD variance (stdev=31.6413 ns), but less than the store-and-forward switch (stdev=161.669 ns). Finally, the average IPD was the same for both switches since the data rate was the same: 1356.82 ns (cut-through) and 1356.83 ns (store-and-forward). This style of experiment can be used to profile and fingerprint network components as different models show different packet distributions.

3.4 Application 1: Covert Timing Channel

In this chapter, we demonstrated that access to the PHY enabled significantly improved precision for timestamping and pacing messages. In this section, we demonstrate that the improved precision of timestamping and pacing could also improve the performance of network applications such as covert channels. We designed and implemented a covert timing channel which relied on the preci- sion of timestamping and pacing of the PHY and advances the state of the art.

Covert channels are defined as channels that are not intended for informa- tion transfer, but can leak sensitive information [79]. In essence, covert channels provide the ability to hide the transmission of data within established network protocols [130], thus hiding their existence. Covert channels are typically clas- sified into two categories: Storage and timing channels. In storage channels, a sender modulates the value of a storage location to send a message. In timing channels, on the other hand, a sender modulates system resources over time to send a message [32].

Network covert channels send hidden messages over legitimate packets by modifying packet headers (storage channels) or by modulating interpacket de- lays (timing channels). Because network covert channels can deliver sensitive messages across a network to a receiver multiple-hops away, they impose seri- ous threats to the security of systems.

In a network covert channel, the sender has secret information that she tries to send to a receiver over the Internet. The sender has control of some part of a network stack including a network interface (L1∼2), kernel network stack (L3∼4) and/or user application (L5 and above). Thus, the sender can modify protocol headers, checksum values, or control the timing of transmission of packets. The sender can either use packets from other applications of the system or generate its own packets. Although it is also possible that the sender can use packet payloads to directly embed or encrypt messages, we do not consider this case because it is against the purpose of a covert channel: hiding the existence of the channel. The adversary (or warden), on the other hand, wants to detect and prevent covert channels. A passive adversary monitors packet information to detect covert channels while an active adversary employs network appliances such as network jammers to reduce the possibility of covert channels.

In network storage channels, the sender changes the values of packets to secretly encode messages, which is examined by the receiver to decode the message. This can be easily achieved by using unused bits or fields of pro- tocol headers. The IP Identification field, the IP Fragment Offset, the TCP

Sequence Number field, and TCP timestamps are good places to embed mes- sages [76, 102, 113, 114]. As with the easiness of embedding messages in packet headers, it is just as easy to detect and prevent such storage channels. The adver- sary can easily monitor specific fields of packet headers for detection, or sanitize those fields for prevention [57, 66, 93].

In network timing channels, the sender controls the timing of transmission of packets to deliver hidden messages. The simplest form of this channel is to send or not send packets in a pre-arranged interval [44, 107]. Because interpacket de-

64 lays are perturbed with noise from a network, synchronizing the sender and receiver is a major challenge in these on/off timing channels. However, syn- chronization can be avoided when each interpacket delay conveys information, i.e. a long delay is zero, and a short delay is one [41]. JitterBugs encodes bits in a similar fashion, and uses the remainder of modulo operation of interpacket delays for encoding and decoding [117]. These timing channels naturally create patterns of interpacket delays which can be analyzed with statistical tests for de- tection. For example, regularity tests [41, 44], shape tests [109], or entropy tests [60] are widely used for covert timing channel detection. On the other hand, to avoid detection from such statistical tests, timing channels can mimic patterns of le- gitimate traffic, or use random interpacket delays. Liu et al., demonstrated that with spreading codes and a shared key, a timing channel can be robust against known statistical tests [89]. They further developed a method to use indepen- dent and identically distributed (i.i.d) random interpacket delays to make the channel less detectable [90].

In this section, we demonstrate that a sophisticated covert timing channel can be created via precise pacing and timestamping and that such a channel is high-bandwidth, robust against cross traffic, and undetectable by software end- hosts. The channel effectively delivers 81 kilobits per second with less than 10% errors over nine routing hops, and thousands of miles over the National

Lambda Rail (NLR). Chupja advances the state-of-the-art covert timing chan- nels which can only deliver thousands of bits per second while avoiding detec- tion. We empirically demonstrate that we could create such a timing channel by modulating interpacket gaps at sub-microsecond scale: A scale at which sent information was preserved through multiple routing hops, but statistical tests could not differentiate the channel from legitimate traffic. The idea was to con-

65 trol (count) the number of /I/s to encode (decode) messages exhibiting packet pacing capabilities, i.e. to modulate interpacket gaps in nanosecond resolution.

3.4.1 Design

Design Goal

The design goal of our timing channel, called Chupja, was to achieve high-bandwidth, robustness and undetectability by precisely pacing and timestamping packets. By high-bandwidth, we mean a covert rate of many tens or hundreds of thousands of bits per second. Robustness is how to deliver messages with minimum errors. In particular, we set as our goal for robustness a bit error rate (BER) of less than 10%, an error rate small enough to be compensated for with forward error correction such as a Hamming code, or a spreading code [89]. We define BER as the ratio of the number of bits incorrectly delivered to the number of bits transmitted. Undetectability is how to hide the existence of the channel.

In order to achieve these goals, we precisely paced packets by modulating the number of /I/s between packets in the PHY. If the modulation of /I/s was large, the channel could effectively send messages in spite of noise or perturba- tions from a network (robustness). At the same time, if the modulation of /I/s was small, an adversary would not be able to detect regularities (undetectabil- ity). Further, Chupja embedded one timing channel bit per interpacket gap to achieve high-bandwidth. Thus, higher overt packet rates would achieve higher covert timing channel rates. We focused on finding an optimal modulation of interpacket gaps to achieve high-bandwidth, robustness, and undetectability (Section 3.4.2).

Model

In our model, the sender of Chupja had control over a network interface card or a switch1 with access to and control of the PHY. In other words, the sender could easily control the number of /I/ characters of outgoing packets. The receiver was in a network multiple hops away, and tapped/sniffed on its network with access to the PHY. Then, the sender modulated the number of /I/s between packets destined to the receiver’s network to embed secret messages.

Our model included an adversary who ran a network monitoring applica- tion and was in the same network with the sender and receiver. We assumed that the adversary was built from a commodity server and commodity NIC. As a result, the adversary did not have direct access to the PHY. Since a commodity NIC discards /I/s before delivering packets to the host, the adversary could not monitor the number of /I/s to detect the possibility of covert timing channels.

Instead, it ran statistical tests with captured interpacket delays.

Encoding and Decoding

Chupja embedded covert bits into interpacket gaps of a homogeneous packet stream of an overt channel. In order to create Chupja, the sender and receiver must share two parameters: G and W. G was the number of /I/s in the IPG that was used to encode and decode hidden messages, and W (Wait time) helped the sender and receiver synchronize (Note that interpacket delay D = G + packet size). Figure 3.10 illustrates our design. Recall that IPGs of a homogeneous packet stream are all the same (=G, Figure 3.10(a)). For example, the IPG of a homogeneous stream with 1518 byte packets at 1 Gbps is always 13738 /I/s; the variance is zero.

1 We use the term switch to denote both bridge and router.

[Figure 3.10: Chupja encoding and decoding. (a) Homogeneous packet stream; (b) Timing channel: Sender; (c) Timing channel: Receiver.]

To embed a secret bit sequence {bi, bi+1, ···}, the sender encoded 'one' ('zero') by increasing (decreasing) the IPG (G) by ε /I/s (Figure 3.10(b)):

Gi = G − ε   if bi = 0
Gi = G + ε   if bi = 1

where Gi was the i-th interpacket gap, between packet i and i + 1. When Gi was less than the minimum interpacket gap (12 /I/ characters), it was set to twelve to meet the standard requirement.

Interpacket gaps (and delays) are perturbed as packets go through a number of switches. However, as we will see in Section 3.4.2, many switches did not significantly change interpacket gaps. Thus, we could expect that if ε was large enough, encoded messages would be preserved along the path. At the same time, ε must be small enough to avoid detection by an adversary. We evaluated how big ε must be with and without cross traffic and over multiple hops of switches over thousands of miles in a network path (Section 3.4.2).

[Figure 3.11: Maximum capacity of the PHY timing channel: covert channel capacity (bps) versus overt channel throughput (Gbps) for 64B, 512B, 1024B, and 1518B packets.]

Upon receiving packet pairs, the receiver decoded bit information as fol- lows:

b′i = 1 if Gi ≥ G
b′i = 0 if Gi < G

b′i might not be equal to bi because of network noise. We used BER to evaluate the performance of Chupja (Section 3.4.2).

Because each IPG corresponded to a signal, there was no need for synchro- nization between the sender and the receiver [41]. However, the sender occa- sionally needed to pause until the next covert packet was available. W was used when there was a pause between signals. The receiver considered an IPG that was larger than W as a pause, and used the next IPG to decode the next signal.
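To make the encoding and decoding rules concrete, the following is a minimal sketch in Python of the scheme described above. The parameter values and helper names (encode, decode) are illustrative only; Chupja itself runs in the PHY via SoNIC, not in software like this.

```python
# A sketch of Chupja's encoding/decoding of covert bits into interpacket gaps.
# G, EPSILON, and W follow Section 3.4.1; the concrete values are examples only.

MIN_IPG = 12            # minimum interpacket gap in /I/ characters (IEEE 802.3)
G = 13738               # shared IPG of the homogeneous (1518B, 1Gbps) overt stream
EPSILON = 128           # modulation in /I/ characters
W = 10 * G              # gaps larger than W are treated as pauses, not signals

def encode(bits):
    """Map each covert bit onto an interpacket gap (in /I/ characters)."""
    gaps = []
    for b in bits:
        gap = G + EPSILON if b == 1 else G - EPSILON
        gaps.append(max(gap, MIN_IPG))   # never go below the standard minimum
    return gaps

def decode(observed_gaps):
    """Recover covert bits from observed gaps; gaps larger than W are pauses."""
    bits = []
    for gap in observed_gaps:
        if gap > W:
            continue                     # the sender paused; decode the next gap
        bits.append(1 if gap >= G else 0)
    return bits

secret = [1, 0, 1, 1, 0]
assert decode(encode(secret)) == secret  # lossless on a noise-free link
```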

The capacity of this PHY timing channel was equal to the number of packets being transmitted from the sender when there was no pause. Given a packet size, the maximum capacity of the channel is illustrated in Figure 3.11. For example, if an overt channel sent at 1 Gbps with 1518 byte packets, the maximum capacity of the covert channel was 81,913 bits per second (bps). We demonstrate in Section 3.4.2 that Chupja delivered 81 kilobits per second (kbps) with less than 10% BER over nine routing hops and thousands of miles over the National Lambda Rail (NLR), where many links on our path were utilized with 1∼4 Gbps cross traffic.
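As a quick sanity check on the capacity figure: one covert bit per overt packet means the covert rate equals the overt packet rate, which follows directly from the IPD in Table 3.2 (small differences from the 81,913 bps figure are due to rounding of the IPD).

```python
# One covert bit per packet: covert capacity equals the overt packet rate (1/IPD).
ipd_ns = 12211.2                # IPD of (1518B, 1Gbps) from Table 3.2
capacity_bps = 1e9 / ipd_ns     # packets per second = covert bits per second
print(round(capacity_bps))      # ~81,900 bps, i.e. roughly 81 kbps
```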

Discussion

Chupja used homogeneous packet streams to encode messages, which created a regular pattern of IPGs. Fortunately, as we discuss in the following section, the adversary was not able to accurately timestamp incoming packets when the data rate was high. This means that it did not matter what patterns of IPGs were used for encoding above a certain data rate. Therefore, we chose the simplest form of encoding for Chupja. The fact that the PHY timing channel worked over multiple hops means that a non-homogeneous timing channel would work as well. For instance, if we treat the output after one routing hop as the sender's stream, then the PHY timing channel worked with a non-homogeneous packet stream.

If, on the other hand, the sender wanted to use other patterns for encoding and decoding, other approaches could easily be applied [44, 89, 90, 117]. For example, if the sender wanted to create a pattern that looked more random, we could also use a shared secret key and generate random IPGs for encoding and decoding [90]. However, the focus of this chapter is to demonstrate that even this simplest form of timing channel can be a serious threat to a system and not easily be detected.

3.4.2 Evaluation

In this section, we evaluated Chupja over real networks. We investigated the following questions.

• How robust was Chupja? How effectively could it send secret messages over the Internet?

• Why was Chupja robust? What properties of a network did it exploit?

• How undetectable was Chupja? Why was it hard to detect, and what was required to do so?

Evaluation Setup

Packet size [Bytes]   Data Rate [Gbps]   Packet Rate [pps]   IPD [ns]   IPG [/I/]
1518                  9                  737028              1356.8     170
1518                  6                  491352              2035.2     1018
1518                  3                  245676              4070.4     3562
1518                  1                  81913               12211.2    13738
64                    6                  10416666            96.0       48
64                    3                  5208333             192.0      168
64                    1                  1736111             576.0      648

Table 3.2: IPD and IPG of homogeneous packet streams.

ε (/I/s)   16     32     64     128     256     512     1024    2048     4096
ns         12.8   25.6   51.2   102.4   204.8   409.6   819.2   1638.4   3276.8

Table 3.3: Evaluated ε values in the number of /I/s and their corresponding time values in nanoseconds.

For most of the evaluation, we used 1518 byte and 64 byte packets for simplicity. We define packet size as the number of bytes from the first byte of the Ethernet header to the last byte of the Ethernet frame check sequence (FCS) field (i.e. we exclude the seven preamble bytes and the start frame delimiter byte from packet size). Then, the largest packet allowed by Ethernet is 1518 bytes (14 byte header, 1500 byte payload, and 4 byte FCS), and the smallest is 64 bytes. In this section, the data rate refers to the data rate of the overt channel in which Chupja was embedded. Interpacket delays (IPDs) and interpacket gaps (IPGs) of homogeneous packet streams at different data rates and with different packet sizes are summarized in Table 3.2. Table 3.3 shows the number of /I/s (ε) we modulated to create Chupja and their corresponding time values in nanoseconds. We set ε starting from 16 /I/s (= 12.8 ns), doubling the number of /I/s up to 4096 /I/s (= 3276.8 ns). We use a tuple (s, r) to denote a packet stream with s byte packets running at r Gbps. For example, a homogeneous stream with (1518B, 1Gbps) is a packet stream with 1518 byte packets at 1 Gbps.
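The sketch below shows how the IPD and IPG columns of Table 3.2 relate to each other, assuming (per Table 3.3) that one /I/ character or byte takes 0.8 ns at 10 Gbps and that the preamble and start frame delimiter add 8 bytes per packet on the wire; the function names are ours.

```python
# How the IPD and IPG columns of Table 3.2 relate, assuming 0.8 ns per byte or /I/
# character at 10 Gbps (Table 3.3) and 8 preamble/SFD bytes per packet on the wire.

BYTE_NS = 0.8          # one byte (or /I/ character) takes 0.8 ns at 10 Gbps
PREAMBLE = 8           # preamble + start frame delimiter, excluded from "packet size"

def ipd_from_ipg(packet_size, ipg_chars):
    """Interpacket delay (ns): packet + preamble + idle characters on the wire."""
    return (packet_size + PREAMBLE + ipg_chars) * BYTE_NS

def ipg_from_ipd(packet_size, ipd_ns):
    """Number of /I/ characters needed to hit a target interpacket delay."""
    return round(ipd_ns / BYTE_NS) - (packet_size + PREAMBLE)

assert abs(ipd_from_ipg(1518, 13738) - 12211.2) < 0.01   # (1518B, 1Gbps) row of Table 3.2
assert ipg_from_ipd(64, 96.0) == 48                      # (64B, 6Gbps) row of Table 3.2
```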

Efficiency of Chupja

The goal of a covert timing channel is to send secret messages to the receiver with minimum errors (robustness). As a result, the bit error rate (BER) and the achieved covert bandwidth are the most important metrics to evaluate a timing channel. Our goal was to achieve a BER of less than 10% over a network with high bandwidth. Accordingly, we evaluated Chupja over two networks, a small network and the Internet (NLR), focusing on the relation between BER and the number of /I/s being modulated (ε).

A small network Before considering cross traffic, we first measured BER with no cross traffic in our small network (Figure 2.7). Figure 3.12(a) illustrates the result. The x-axis is ε, modulated in the number of idle (/I/) characters (see Table 3.3 to relate /I/s to time), and the y-axis is BER. Figure 3.12(a) clearly illustrates that the larger ε, the smaller the BER. In particular, modulating 128 /I/s (=102.4 ns) was enough to achieve BER=7.7% with (1518B, 1Gbps) (the filled round dots in Figure 3.12(a)). All the other cases also achieved the goal BER except (64B, 6Gbps) and (64B, 3Gbps). Recall that Table 3.2 gives the capacity of the covert channel.

[Figure 3.12: BER of Chupja over a small network and NLR: (a) without cross traffic over a small network; (b) with cross traffic over a small network; (c) over National Lambda Rail. X-Y-Z means that the workload of cross traffic is X (H-heavy, M-medium, or L-light), and the packet size and data rate of the overt channel are Y (B-big=1518B or S-small=64B) and Z (1, 3, or 6G).]

The takeaway is that when there was no cross traffic, modulating a small number of /I/s (128 /I/s, or 102.4 ns) was sufficient to create a timing channel. In addition, it was more efficient with large packets.

We also evaluated Chupja with cross traffic. We created three workloads where (α, β) = (0.333, 0.333), (0.9, 0.9), and (0.9, 3.7) (Figure 2.7), and we call them Light, Medium and Heavy workloads. Packets of cross traffic were always maximum transmission unit (MTU) sized. Then SoNIC1 generated timing channel packets at 1, 3, and 6 Gbps with 1518 and 64 byte packets. Figure 3.12(b) illustrates the result. At a glance, because of the existence of cross traffic, ε must be larger to transmit bits correctly compared to the case without cross traffic. There are a few takeaways. First, regardless of the size of workloads, timing channels with (1518B, 1Gbps) and (1518B, 3Gbps) worked quite well, achieving the goal BER of less than 10% with ε ≥ 1024. On the other hand, channels at data rates of 6 Gbps and higher were not efficient. In particular, ε = 4096 was not sufficient to achieve the goal BER with (1518B, 6Gbps). Second, creating timing channels with small packets was more difficult. Generally, BER was quite high even with ε = 4096, except in the H-S-1G case (BER=9%).

National Lambda Rail Figure 3.12(c) illustrates the results from NLR. Again, we changed the size and the data rate of overt packets. In NLR, it was more difficult to create a timing channel. In particular, only (1518B, 1Gbps) achieved a BER of less than 10%, when ε was larger than 2048 (8.9%). All the other cases had higher BERs than our desired goal, although BERs were less than 30% when ε was 4096. Creating a channel with 64 byte packets was no longer possible in NLR. This was because more than 98% of IPGs were minimum interpacket gaps, i.e. most of the bit information was discarded because of packet train effects [58].

Sensitivity Analysis

Network devices change interpacket gaps while forwarding packets; switches add randomness to interpacket gaps. We discuss how Chupja could deliver secret messages via a PHY timing channel in spite of the randomness added from a network. In particular, we discuss the following observations.

• A single switch did not add significant perturbations to IPDs when there was no cross traffic.

• A single switch treated IPDs of a timing channel's encoded 'zero' bit and those of an encoded 'one' bit as uncorrelated distributions, ultimately allowing a PHY timing channel receiver to distinguish an encoded 'zero' from an encoded 'one'.

• The first and second observations above held for multiple switches and cross traffic.

In other words, we demonstrate that timing channels could encode bits by modulating IPGs by a small number of /I/ characters (hundreds of nanosec- onds) and these small modulations could be effectively delivered to a receiver over multiple routing hops with cross traffic.

In order to understand and appreciate these observations, we must first define a few terms. We denote the interpacket delay between packet i and i + 1 with the random variable Di. We use a superscript on the variables to denote the number of routers. For example, D_i^1 is the interpacket delay between packet i and i + 1 after being processed by one router, and D_i^0 is the interpacket delay before packets i and i + 1 are processed by any router. Given a distribution of D and the average interpacket delay µ, we define I90 as the smallest ε that satisfies P(µ − ε ≤ D ≤ µ + ε) ≥ 0.90. In other words, I90 is the interval around the average interpacket delay, µ, which contains 90% of D (i.e. at least 90% of the distribution D is within µ ± I90). For example, the I90 of a homogeneous stream (a delta function, which has no variance) that leaves the sender and enters the first router is zero; i.e. D^0 has I90 = 0 and P(µ − 0 ≤ D ≤ µ + 0) = 1 since there is no variance in the IPD of a homogeneous stream. We use I90 in this section to quantify perturbations added by a network device or a network itself. Recall that the goal of Chupja was to achieve a BER of less than 10%, and, as a result, we were interested in the range within which 90% of the D observed by a timing channel receiver was contained.

[Figure 3.13: Comparison of IPDs after switches process a homogeneous packet stream (1518B, 1Gbps): CDF of received interpacket delay (us) for SW1-SW4.]
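Before turning to Figure 3.13, the sketch below shows one way to compute I90 from a captured sample of IPDs, directly following the definition above; it is an illustration, not the measurement code used in the experiments.

```python
# Computing I90 from a sample of interpacket delays, following the definition above:
# the smallest ε such that at least 90% of D falls within µ ± ε.

def i90(ipds):
    mu = sum(ipds) / len(ipds)                       # average interpacket delay µ
    deviations = sorted(abs(d - mu) for d in ipds)   # |D - µ| for every sample
    k = -(-len(ipds) * 9 // 10)                      # ceil(0.9 * n) samples must fit
    return deviations[k - 1]                         # smallest ε covering k samples

# A nearly homogeneous stream has an I90 close to zero:
print(i90([12211.2] * 8 + [12210.5, 12211.9]))       # ≈ 0.7 (ns)
```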

First, switches did not add significant perturbations to IPDs when there was no cross traffic. In particular, when a homogeneous packet stream was processed by a switch, I90 was always less than a few hundred nanoseconds, i.e. 90% of the received IPDs were within a few hundred nanoseconds of the IPD originally sent. Figure 3.13 displays the received IPD distribution measured after packets were processed and forwarded by one switch: The x-axis is the received IPD and the y-axis is the cumulative distribution function (CDF). Further, different lines represent different switches from Table 2.3. The original sender's homogeneous packet stream had a data rate of 1 Gbps with 1518B packets, or (1518B, 1Gbps) for short, which resulted in an average IPD of 12.2 us (i.e. µ = 12.2 us). We can see in Figure 3.13 that when µ was 12.2 us and ε was 0.1 us, P(12.2 − 0.1 < D < 12.2 + 0.1) ≥ 0.9 was true for all switches. In general, the range of received IPDs was always bounded by a few hundred nanoseconds around the original IPD, regardless of the type of switch.

[Figure 3.14: I90 (ns) comparison between various switches (SW1-SW4 at 1, 3, and 9 Gbps) as a function of the number of hops.]

Moreover, when a packet stream was processed by the same switch type, but for multiple hops, I90 increased linearly. Each packet suffered some pertur- bation, but the range of perturbation was roughly constant at every hop over different packet sizes [85] resulting in a linear increase in I90. In Figure 3.14, we illustrate I90 for different switches, at different data rates (1, 3, and 9G), and as we increased the number of hops: The x-axis is the number of routing hops, y-axis is measured I90, and each line is a different type of switch with a differ- ent packet stream data rate. Packet size was 1518B for all test configurations.

One important takeaway from the graph is that I90 for the same switch showed similar patterns regardless of data rate, except for SW3 at 9 Gbps. In particular, the degree of perturbation added by a switch was not related to the data rate (or the average interpacket delay). Instead, IPD perturbation was related to the number of hops and the packet size. A second takeaway is that I90 values after one hop were all less than 100 ns, except for SW3 at 9 Gbps, and still less than 300 ns after fifteen hops; 300 ns is slightly greater than 256 /I/s (Table 3.3). Unfortunately, we did not have a definitive explanation for why the I90 of SW3 at 9 Gbps was larger than in any other case, but it was likely related to how SW3 handled high data rates.

[Figure 3.15: Comparison of homogeneous streams and covert channel streams of (1518B, 1Gbps). PDFs of interpacket delay: (a) HOM after one SW1; (b) ε = 256 after one SW1; (c) HOM after fifteen SW1; (d) ε = 256 after fifteen SW1; (e) HOM after one SW1 with cross traffic (64B, 1Gbps); (f) ε = 256 after one SW1 with cross traffic (64B, 1Gbps); (g) HOM over NLR; (h) ε = 2048 over NLR.]

Second, switches treated IPDs of an encoded 'zero' bit and those of an encoded 'one' bit as uncorrelated distributions. After encoding bits in a timing channel, there will be only two distinct IPD values leaving the sender: µ + ε for 'one', and µ − ε for 'zero'. Let a random variable D+ be the IPDs of signal 'one', and D− be those of signal 'zero'. We observed that the encoded distributions after one routing hop, D1+ and D1−, looked similar to the unencoded distribution after one routing hop, D1. The similarity was likely due to the fact that at the sender the encoded distributions, D0+ and D0−, were each homogeneous packet streams (i.e. D0, D0+, and D0− are all delta functions, which have no variance). For instance, using switch SW1 from Table 2.3, Figure 3.15(a) illustrates D1 (the unencoded distribution of IPDs after one routing hop) and Figure 3.15(b) illustrates D1+ and D1− (the encoded distributions after one routing hop). The data rate and packet size were 1 Gbps and 1518B, respectively, with ε = 256 /I/s for the encoded packet stream. We encoded the same number of 'zeros' and 'ones' randomly into the packet stream. Note the similarity in distributions between D1 in Figure 3.15(a) and D1+ and D1− in Figure 3.15(b). We observed a similarity in distributions among D1, D1+, and D1− throughout different data rates and switches. We conjectured that D+ and D− were uncorrelated because the computed correlation coefficient between D+ and D− was always very close to zero.

Because the distributions of D1+ and D1− were uncorrelated, we could effectively deliver bit information with appropriate ε values for one hop. If ε was greater than the I90 of D1, then 90% of the IPDs of D1+ and D1− will not overlap. For example, when I90 is 64 ns and ε is 256 /I/s (=204.8 ns), the two distributions D1+ and D1− were clearly separated from the original IPD (Figure 3.15(b)). On the other hand, if ε was less than the I90 of D1, then many IPDs overlapped, and thus the BER increased. For instance, Table 3.4 summarizes how the BER of the timing channel varied with different ε values. From Table 3.4, we can see that when ε was greater than 64 /I/s, BER was always less than 10%. The key takeaway is that BER was not related to the data rate of the overt channel; rather, it was related to I90.

      ε (/I/s)   16     32     64     128     256           512      1024
BER   1G         0.35   0.24   0.08   0.003   10^-6         0        0
      3G         0.37   0.25   0.10   0.005   10^-5         0        0
      6G         0.35   0.24   0.08   0.005   0.8 × 10^-6   0        0
      9G         0.34   0.24   0.07   0.005   0.0005        0.0004   0.0005

Table 3.4: ε and associated BER for 1518B packets at different overt data rates.

Third, switches treated IPDs of an encoded 'zero' bit and those of an encoded 'one' bit as uncorrelated distributions over multiple switches and with cross traffic. In particular, distributions Dn+ and Dn− were uncorrelated regardless of the number of hops and the existence of cross traffic. However, I90 became larger as packets went through multiple routers with cross traffic. Figures 3.15(c) and 3.15(d) show the distributions of D15, D15+, and D15− without cross traffic (note that the y-axis is log-scale). The data rate and packet size were 1 Gbps and 1518B, respectively, with ε = 256 /I/s for the encoded packet stream. We conjectured that the distributions were still uncorrelated without cross traffic and after multiple hops of routers: The computed correlation coefficient was close to zero.

[Figure 3.16: BER over multiple hops of SW1 with various ε values with (1518B, 1Gbps).]

[Figure 3.17: Packet size distributions (CDF) of the CAIDA traces CAIDA1-CAIDA4.]

Further, the same observation was true for the other switches from Table 2.3. Figure 3.16 shows BER over multiple hops of SW1. When ε was greater than 256 /I/s (=204.8 ns), BER was always less than 10% after fifteen hops. Recall that the I90 of D after 15 hops of SW1 was 236 ns (Figure 3.14).

Figures 3.15(e) and 3.15(f) show the distributions after one routing hop when there was cross traffic: Distributions D1, D1+, and D1−, using an overt data rate and packet size of 1 Gbps and 1518B, respectively. The cross traffic was (64B, 1Gbps).

          Data Rate [Gbps]   Packet Rate [pps]   I90 [ns]   BER (ε = 512)   BER (ε = 1024)
CAIDA1    2.11               418k                736.0      0.10            0.041
CAIDA2    3.29               724k                848.1      0.148           0.055
CAIDA3    4.27               723k                912.1      0.184           0.071
CAIDA4    5.12               798k                934.4      0.21            0.08

Table 3.5: Characteristics of CAIDA traces, and measured I90 and BER.

We can see in the figures that there was still a similarity between D1+ and D1− even with cross traffic. However, I90 became larger with cross traffic compared to without cross traffic. Table 3.6 summarizes how I90 changed with cross traffic. We used five different patterns of cross traffic for this evaluation: 10-clustered (10C), 100-clustered (100C), one homogeneous stream (HOM), two homogeneous streams (HOM2), and a random IPD stream (RND). An N-clustered packet stream consisted of multiple clusters of N packets with the minimum interpacket gap (96 bits, which is 12 /I/ characters) allowed by the standard [19] and a large gap between clusters. Note that a larger N meant burstier cross traffic. For the RND stream, we used a geometric distribution for IPDs to create bursty traffic. In addition, in order to understand how the distribution of packet sizes affected I90, we used four CAIDA traces [30] at different data rates to generate cross traffic (Table 3.5). With packet size and timestamp information from the traces, we reconstructed a packet stream for cross traffic with SoNIC. In the CAIDA traces, the distribution of packet sizes was normally bimodal, with a peak at the lowest packet size and a peak at the highest packet size (Figure 3.17).
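The sketch below renders the cross-traffic patterns described above as sequences of interpacket gaps; the cluster sizes, mean gap, and function names are our own illustrative choices, not the exact generator used in the experiments.

```python
# The cross-traffic gap patterns of Table 3.6 rendered as /I/ gap sequences
# (cluster sizes, the mean gap, and function names are illustrative).

import math
import random

MIN_IPG = 12   # minimum interpacket gap in /I/ characters [19]

def clustered(n_packets, cluster_size, big_gap):
    """N-clustered traffic (10C, 100C): clusters of packets at the minimum gap,
    separated by a large gap between clusters."""
    return [MIN_IPG if i % cluster_size else big_gap for i in range(1, n_packets)]

def homogeneous(n_packets, gap):
    """HOM: every interpacket gap is identical."""
    return [gap] * (n_packets - 1)

def random_gaps(n_packets, mean_gap):
    """RND: bursty traffic with geometrically distributed gaps."""
    p = 1.0 / mean_gap
    draw = lambda: 1 + int(math.log(1.0 - random.random()) / math.log(1.0 - p))
    return [max(MIN_IPG, draw()) for _ in range(n_packets - 1)]

print(clustered(22, 10, 10000)[:12])   # nine minimum gaps, one large gap, and so on
```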

We observe that I90 increased with cross traffic (Table 3.6). In particular, bursty cross traffic at higher data rates significantly impacted I90, although the values were still less than one microsecond except in the 100C case. The same observation was also true using the CAIDA traces with different data rates (Table 3.5). As a result, in order to send encoded timing channel bits effectively, ε must increase as well. Figures 3.15(g) and 3.15(h) show the distributions of IPDs over the NLR. They demonstrate that with external traffic and over multiple routing hops, a sufficiently large ε could create a timing channel.

Data Rate [Gbps]   Packet Size [Byte]   10C     100C     HOM      HOM2    RND
0.5                64                   79.9    76.8     166.45   185.6   76.8
0.5                512                  79.9    79.9     83.2     121.6   86.3
0.5                1024                 76.8    76.8     80.1     115.2   76.8
0.5                1518                 111.9   76.8     128.0    604.7   83.2
1                  64                   111.9   108.8    236.8    211.2   99.3
1                  512                  115.2   934.4    140.8    172.8   188.9
1                  1024                 111.9   713.5    124.9    207.9   329.5
1                  1518                 688.1   1321.5   64.0     67.1    963.3

Table 3.6: I90 values in nanoseconds with cross traffic.

Summarizing our sensitivity analysis results, I90 was determined by the characteristics of switches, cross traffic, and the number of routing hops. Further, I90 could be used to create a PHY timing channel like Chupja. In particular, we can refine the relation between I90 and ε* (the minimum ε measured to achieve a BER of less than 10%). Let I90+ be the minimum ε that satisfies P(D > µ − ε) ≥ 0.90 and let I90− be the minimum ε that satisfies P(D < µ + ε) ≥ 0.90, given the average interpacket delay µ. Table 3.7 summarizes the relationship between ε*, I90, and max(I90+, I90−) over the NLR (Figure 2.8) and our small network (Figure 2.7).

In our small network, BER was always less than 10% when ε was greater than max(I90+, I90−). On the other hand, we were able to achieve our goal BER over the NLR when ε* was slightly less than max(I90+, I90−). Because we did not have control over cross traffic in the NLR, I90 varied across our experiments.

Network          Workload   I90      max(I90+, I90−)   ε* (ns)
Small network    Light      1065.7   755.2             819.2
Small network    Medium     1241.6   1046.3            1638.4
Small network    Heavy      1824.0   1443.1            1638.4
NLR                         2240.0   1843.2            1638.4

Table 3.7: Relation between ε*, I90, and max(I90+, I90−) over different networks with (1518B, 1Gbps). Values are in nanoseconds.

Detection

In order to detect timing channels, applying statistical tests to captured IPDs is widely used. For example, the adversary can use regularity, similarity, shape, and entropy tests of IPDs in order to detect potential timing channels [41, 44, 60, 109]. The same strategies could be applied to Chupja. Since our traffic was very regular, those algorithms could easily be applied to detect Chupja. However, we argue that none of these algorithms will work if the IPG modulations cannot be observed at all. In particular, endhost timestamping was too inaccurate to observe fine-grained IPG modulations, whereas Chupja modulated IPGs by hundreds of nanoseconds to create a timing channel. In fact, the accuracy of endhost timestamping was at most microsecond resolution. Specialized NICs can provide a resolution of a few hundred nanoseconds. In this section, we demonstrate that neither endhost nor hardware timestamping was sufficient to detect Chupja timing channels. We focused on measuring and comparing an endhost's ability to accurately timestamp arrival times (i.e. accurately measure IPDs), since the ability to detect a PHY timing channel was dependent upon the ability to accurately timestamp the arrival time of packets. As a result, we do not discuss statistical approaches further.

[Figure 3.18: Comparison of various timestamping methods: (a) kernel timestamping; (b) zero-copy timestamping; (c) hardware timestamping. Each line is a covert channel stream of (1518B, 1Gbps) with a different ε value.]

As discussed in Section 1.2.2, timestamping messages in kernel or userspace is inaccurate mainly due to the overhead of a network stack. To reduce the overhead of a network stack between kernel and userspace and between hardware and kernel, a technique called zero-copy could be employed to improve the performance of userspace network applications. An example of a zero-copy implementation was Netmap [112]. In Netmap, packets were delivered from a NIC directly to a memory region which was shared by a userspace application. This zero-copy removed expensive memory operations and bypassed the kernel network stack. As a result, Netmap was able to inject and capture packets at line speed in a 10 GbE network with a single CPU. Therefore, detection algorithms could exploit a platform similar to Netmap to improve the performance of network monitoring applications. We call this zero-copy timestamping. We also included in our evaluation a specialized NIC with hardware timestamping capability, the Sniffer 10G [25], which provided 500 ns resolution for timestamping.

In order to compare kernel, zero-copy, and hardware timestamping, we con- nected a SoNIC server and a network monitor directly via an optical fiber ca- ble, generated and transmitted timing channel packets to a NIC installed in the network monitor, and collected IPDs using different timestamping methods.

The network monitor was a passive adversary built from a commodity server. Further, we installed Netmap in the network monitor. Netmap originally used do_gettimeofday for timestamping packets in kernel space, which provided only microsecond resolution. We modified the Netmap driver to support nanosecond resolution instead. For this evaluation, we always generated ten thousand packets for comparison because some of the approaches discarded packets when more than ten thousand packets were delivered at high data rates.

Figure 3.18 illustrates the results. Figure 3.18(a) demonstrates the effectiveness of kernel timestamping of a timing channel with various IPG modulation (ε) values. The data rate of the overt channel was 1 Gbps and the packet size was 1518 bytes. The x-axis is interpacket delays (IPDs) in microseconds and the y-axis is a cumulative distribution function (CDF). The vertical line in the middle is the original IPD (=12.2 us) of Chupja. In order to detect Chupja, the timestamping CDF would need to be centered around the vertical line at ≈12.2 us. Instead, as can be seen from the graph, all measured kernel timestamps were nowhere near the vertical line regardless of ε (ε varied from 0 [HOM] to 4096 /I/s). As a result, kernel timestamping could not distinguish a PHY covert channel like Chupja. In fact, even an i.i.d. random packet stream was inseparable from the other streams (Figure 3.19). Unfortunately, zero-copy timestamping did not help the situation either (Figure 3.18(b)). Netmap did not timestamp every packet, but assigned the same timestamp value to packets that were delivered in one DMA transaction (or polling). This is why there were packets with zero IPD. Nonetheless, Netmap still depended on the underlying system's timestamping capability, which was not sufficient.

[Figure 3.19: Kernel timestamping with (1518B, 1Gbps): CDF of interpacket delay for HOM, 10C, 100C, and i.i.d streams.]

On the other hand, hardware timestamping using the Sniffer 10G demonstrated enough fidelity to detect Chupja when modulation (ε) values were larger than 128 /I/s (Figure 3.18(c)). However, hardware timestamping still could not detect smaller changes in IPDs (i.e. modulation values smaller than 128 /I/s), which was clear with a timing channel with smaller packets. A timing channel with 64 byte packets at 1 Gbps was not detectable by hardware timestamping (Figure 3.20). This was because packets arrived much faster with smaller packets, making IPGs too small for the resolution of the hardware to accurately detect small IPG modulations.

[Figure 3.20: Hardware timestamping with (64B, 1Gbps): CDF of interpacket delay for ε values from 0 (HOM) to 4096 /I/s.]

The takeaway is that to improve the possibility of detecting Chupja, which modulated IPGs by a few hundreds of nanoseconds, a network monitor (passive adversary) must employ hardware timestamping for analysis. However, using better hardware (more expensive and sophisticated NICs) still might not be sufficient, i.e. for much finer timing channels. Therefore, we concluded that a PHY timing channel such as Chupja was invisible to a software endhost. However, a hardware-based solution with fine-grained capability [10] might be able to detect Chupja.
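A simple way to see why endhost timestamping fails is to quantize the IPDs of an encoded stream at different timestamping resolutions, as sketched below; the sketch ignores software jitter entirely (which only makes real detection harder), and the resolutions are the approximate figures quoted above.

```python
# Quantizing the IPDs of an encoded (1518B, 1Gbps) stream at different timestamping
# resolutions; software jitter is ignored, which only makes real detection harder.

import random

IPD_NS = 12211.2           # original IPD of the overt stream
EPSILON_NS = 102.4         # 128 /I/ modulation
bits = [random.randint(0, 1) for _ in range(10000)]
ipds = [IPD_NS + EPSILON_NS if b else IPD_NS - EPSILON_NS for b in bits]

for resolution_ns in (1000.0, 500.0, 0.8):       # ~kernel, Sniffer 10G, counting /I/s
    observed = {round(d / resolution_ns) * resolution_ns for d in ipds}
    print(resolution_ns, "ns resolution ->", len(observed), "distinct IPD value(s)")
# At ~1 us resolution every gap collapses to one value: the stream looks perfectly regular.
```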

3.4.3 Summary

In this section, we demonstrated how precise pacing and timestamping could create a covert timing channel that was high-bandwidth, robust and undetectable. The covert timing channel embedded secret messages into interpacket gaps in the physical layer by pacing packets at sub-microsecond scale. We empirically demonstrated that our channel could effectively deliver 81 kilobits per second over nine routing hops and thousands of miles over the Internet, with a BER of less than 10%. As a result, a Chupja timing channel worked in practice and was undetectable by software endhosts, since they were not capable of the precise timestamping needed to detect the small modulations in interpacket gaps employed by Chupja.

3.5 Application 2: Estimating Available Bandwidth

Another network application that relies on the precision of timestamping and pacing is available bandwidth estimation: The sender must carefully control the time gaps of probe packets. Similarly, the receiver must carefully examine the time gaps of probe packets to infer the available bandwidth between two pro- cesses. As a result, the accuracy of available bandwidth estimation algorithms depends on the precision of timestamping and pacing. In this section, we dis- cuss how precise pacing and timestamping via access to the PHY can improve the performance of available bandwidth estimation algorithms.

Available bandwidth estimation is motivated by a simple problem: What is the maximum data rate that a user could send down a given network path without going over the available capacity? This data rate is equal to the available bandwidth on the bottleneck link, which has the minimum available bandwidth among all links on the network path.

An available bandwidth estimation algorithm typically sends probe packets with a predefined interval along the path under measurement (pacing), and observes changes in certain packet characteristics at the receiver to infer the amount of cross traffic in the network by timestamping probe packets. For instance, when the probing rate exceeds the available bandwidth, the observed packet characteristics undergo a significant change. Estimation algorithms infer the available bandwidth at the receiving end by examining timestamps of probe packets [68, 111, 120, 97]. As a result, the precision of pacing and timestamping is the key to accurately estimating the available bandwidth.

In this section, we present MinProbe, which performed available bandwidth estimation with high fidelity for high-speed networks, while maintaining the flexibility of userspace control and compatibility with existing bandwidth estimation algorithms. MinProbe greatly increased measurement fidelity via precise timestamping and pacing provided by SoNIC. We evaluated MinProbe on a testbed that consisted of multiple 10 GbE switches and on the National Lambda Rail (NLR). It achieved high accuracy: Results illustrated that the difference between the estimated available bandwidth and the actual available bandwidth was typically no more than 0.4 Gbps in a 10 Gbps network. MinProbe advances the state-of-the-art algorithms, which do not perform well in high speed networks such as 10 Gbps networks.

3.5.1 Design

The key insight and capability of how MinProbe achieved high-fidelity is from precise pacing and timestamping via its direct access to the /I/ characters in the physical layer. In particular, MinProbe measured (counted) and generated (inserted or removed) an exact number of /I/ characters between each sub- sequent probe packet to measure the relative time elapsed between packets or generate a desired interpacket gap.

[Figure 3.21: Generalized probe train model: trains of N probe packets sent at rate R, separated by gap G, with distance D between measurement samples.]

Figure 3.21 shows a generalized probe train model we developed to emulate a number of existing bandwidth estimation algorithms; it is used for the rest of the section. The horizontal dimension is time. The vertical dimension is the corresponding (instantaneous) probe rate. Each pulse contains multiple probe packets sent at a particular rate, as depicted by the height. In particular, the parameter N represents the number of packets in each train and the parameter R represents the (instantaneous) probe rate of the train (i.e. we are able to change the interpacket gap between N packets to match a target probe rate R). Packet sizes are considered in computing the probe rate. For a mixed-size packet train, MinProbe was able to adjust the space between all adjacent packets within the train to achieve the desired probe rate R. The gaps between successive probe trains are specified with the parameter G (gap). Finally, each measurement sample consists of a set of probe trains with increasing probe rate. The distance in time between each measurement sample is identified with the parameter D (distance). With these four parameters, we could emulate most probe traffic used in prior work, as shown in Table 3.8. For example, to emulate Spruce probes, we set the parameters (N, R, G, D) to the values (2, 500Mbps, 1.2ns, 48us), which would generate a pair of probe packets every 48 us with the minimal interpacket gap (12 /I/ characters or 1.2ns) at 500 Mbps (5% of the capacity of a 10 Gbps link), as intended by the original Spruce algorithm [120]. Similarly, to reproduce the IGI experiment results [67], we set the parameters (N, R, G, D) to the values (60, [0.1:0.1:9.6]Gbps, 30s, 30s), where [a : b : c] is a range of numbers from a to c with increment b. In summary, most types of existing probe trains can be emulated with the generalized model we developed for MinProbe, where different points in the parameterization space can represent different points in the entire design space covering prior art and possibly new algorithms.

Algorithm   N    R                    G                      D
Pathload    20   [0.1:0.1:9.6] Gbps   Variable               Variable
Pathchirp   1    [0.1:0.1:9.6] Gbps   Exponential decrease   Variable
Spruce      2    500 Mbps             1.2 ns                 48 us
IGI         60   [0.1:0.1:9.6] Gbps   30 s                   30 s

Table 3.8: Parameter settings for existing algorithms. N is the number of probe packets in each sub-train. R is the probe rate. G is the gap between packet trains. D is the gap between each sub-train.
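As an illustration of how such parameters translate into wire-level pacing, the sketch below converts one (N, R, G, D) setting into a list of /I/ gaps. It assumes the probe rate is computed over the packet plus its 8 preamble bytes (our assumption) and reuses the 0.8 ns byte time of 10 GbE; function and parameter names are ours.

```python
# Turning one (N, R, G, D) probe-train setting into wire-level /I/ gaps, assuming the
# probe rate is computed over the packet plus its 8 preamble bytes (our assumption).

BYTE_NS = 0.8                 # one byte or /I/ character at 10 Gbps
PREAMBLE = 8

def probe_train_gaps(n, rate_gbps, train_gap_ns, packet_size=792):
    """Gaps (in /I/s) within one n-packet train at rate_gbps, then the train gap G."""
    on_wire = packet_size + PREAMBLE
    ipd_ns = on_wire * 8 / rate_gbps                   # packet spacing for the target rate
    ipg = max(12, round(ipd_ns / BYTE_NS) - on_wire)   # never below the 12-/I/ minimum
    return [ipg] * (n - 1) + [round(train_gap_ns / BYTE_NS)]

# One measurement sample similar to Section 3.5.2: 20-packet trains at 0.1..9.6 Gbps,
# with 10 us between trains.
sample = [probe_train_gaps(20, rate / 10, 10_000) for rate in range(1, 97)]
print(sample[9][:3])   # gaps within the 1 Gbps train
```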

3.5.2 Evaluation

We evaluated MinProbe over our simple network topology and the National Lambda Rail (NLR). For our experiments, we generated probe packets using the parameters (N, R, G, D) = (20, [0.1 : 0.1 : 9.6] Gbps, 10 us, 4 ms). We used 792 bytes as the packet size, which was close to the average packet size observed in the Internet [30]. We adjusted the traffic data rate by varying interpacket gaps (IPGs). In particular, we inserted a specific number of /I/s between packets to generate a specific data rate.

[Figure 3.22: Bandwidth estimation of a CAIDA trace: queuing delay variance versus probe rate (Gbps). (a) Raw measurement; (b) after moving average.]

CAIDA Trace First, we evaluated MinProbe with traces extracted from the CAIDA anonymized OC192 dataset [30], recorded using a DAG card [10] with nanosecond scale timestamps, and replayed by SoNIC as a traffic generator. Figure 3.22(a) shows an example of the raw data captured by MinProbe. The x-axis is the rate (R) at which the probe trains were sent. The y-axis is the variance of the queuing delay, which was computed from the one-way delay (OWD) experienced by the packets within the same probe train. The available bandwidth was estimated to be the turning point where the delay variance showed a significant increasing trend. We observed that the traffic traces were bursty. To compensate, we used a standard exponential moving average (EMA) to smooth out the data in Figure 3.22(a): Each data point was replaced by the weighted average of the five data points preceding it. Figure 3.22(b) shows the result. The estimated available bandwidth was typically within 0.4 Gbps of the actual available bandwidth.

[Figure 3.23: Measurement result in NLR: estimated cross traffic (Gb/s) over time, showing raw measurement samples and moving averages over 250 ms and 1 s.]
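The sketch below illustrates the smoothing and turning-point step described above for the CAIDA-trace experiment; the exact weights and the threshold used to declare a "significant increase" are illustrative, not MinProbe's actual values.

```python
# Smoothing the per-train delay-variance samples and picking the turning point
# (weights and threshold are illustrative, not MinProbe's actual values).

def smooth(samples, alpha=0.4):
    """Exponentially weighted average over the preceding samples."""
    out, acc = [], samples[0]
    for s in samples:
        acc = alpha * s + (1 - alpha) * acc
        out.append(acc)
    return out

def estimate_available_bw(probe_rates_gbps, delay_variances, jump=2.0):
    """Return the first probe rate where the smoothed variance rises sharply."""
    v = smooth(delay_variances)
    for i in range(1, len(v)):
        if v[i] > jump * max(v[i - 1], 1e-9):
            return probe_rates_gbps[i]
    return probe_rates_gbps[-1]

rates = [r / 10 for r in range(1, 97)]                              # 0.1 .. 9.6 Gbps
variances = [100.0] * 70 + [100.0 * 4 ** i for i in range(1, 27)]   # synthetic example
print(estimate_available_bw(rates, variances))   # 7.1: where the synthetic variance climbs
```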

NLR Second, we evaluated MinProbe over NLR. The traffic was routed through the path shown in Figure 2.8. Since it was difficult to obtain accurate readings of cross traffic at each hop, we relied on the router port statistics to obtain a 30-second average of cross traffic. We observed that the link between Cleveland and NYC experienced the most cross traffic compared to other links, and was therefore the bottleneck (tight) link of the path. The amount of cross traffic was 1.87 Gbps on average, with a maximum of 3.58 Gbps during our experiments.

Figure 3.23 shows the result of MinProbe in NLR. We highlight two impor- tant takeaways: First, the average of our estimation was close to 2 Gbps, which was consistent with the router port statistics collected every 30 seconds. Sec- ond, we observed that the readings were very bursty, which means the actual cross traffic on NLR exhibited a good deal of burstiness. The available band- width estimation by MinProbe agreed with the link utilization computed from switch port statistical counters: The mean of the MinProbe estimation was 1.7

Gbps, which was close to the 30-second average 1.8 Gbps computed from switch port statistical counters. Moreover, MinProbe estimated available bandwidth at higher sample rate and with finer resolution.

3.5.3 Summary

In this section, we presented MinProbe which accurately estimated available bandwidth in 10 Gbps networks via precise pacing and timestamping. Min- Probe performed well under bursty traffic, and the estimation was typically within 0.4 Gbps of the true value in a 10 Gbps network.

3.6 Summary

Precise pacing and timestamping is important for many network research appli- cations. In this chapter, we demonstrated that access to the physical layer could improve the precision of pacing and timestamping. In particular, SoNIC im- plements the physical layer in software and allows users to precisely pace and timestamp packets. At its heart, SoNIC utilizes commodity-off-the-shelf multi- core processors to implement part of the physical layer in software and employs an FPGA board to transmit optical signal over the wire. As a result, SoNIC al- lows cross-network-layer research explorations by systems programmers.

We also presented Chupja and MinProbe as examples of a covert channel and an available bandwidth estimation algorithm that benefit from precise pacing and timestamping. Chupja is a PHY covert timing channel that is high-bandwidth, robust, and undetectable. Chupja was created by precisely pacing packets to control interpacket gaps at sub-microsecond scale in the physical layer.

MinProbe is an available bandwidth estimation algorithm that accurately measures available bandwidth by precisely pacing probe packets and examining their timestamps. As a result, the approach discussed in this chapter demonstrates an advance in the state of the art in network measurements via improved precision of pacing and timestamps.

CHAPTER 4
TOWARDS PRECISE CLOCK SYNCHRONIZATION: DTP

Precisely synchronized clocks are essential for many network and distributed applications. Importantly, an order of magnitude improvement in synchro- nized precision can improve performance. Synchronized clocks with 100 ns precision allow packet level scheduling of minimum sized packets at a finer granularity, which can minimize congestion in rack-scale systems [49] and in datacenter networks [110]. Moreover, taking a snapshot of forwarding tables in a network requires synchronized clocks [134]. In software-defined networks (SDN), synchronized clocks with microsecond level of precision can be used for coordinated network updates with less packet loss [100] and for real-time synchronous data streams [55]. In distributed systems, consensus protocols like Spanner can increase throughput with tighter synchronization precision bounds on TrueTime [48]. In this chapter, we demonstrate that the precision of synchro- nized clocks can significantly improve via access to the physical layer.

Synchronizing clocks with nanosecond level precision is a difficult problem. It is challenging due to the problem of measuring round trip times (RTT) ac- curately, which many clock synchronization protocols use to compute the time difference between a timeserver and a client. RTTs are prone to variation due to characteristics of packet switching networks: Network jitter, packet buffering and scheduling, asymmetric paths, and network stack overhead. As a result, any protocol that relies on RTTs must carefully handle measurement errors.

In this chapter, we present DTP, the Datacenter Time Protocol, which improves the precision of synchronized clocks by running the protocol in the physical layer. In particular, DTP provides nanosecond precision in hardware and tens of nanoseconds of precision in software, at virtually no cost to the datacenter network (i.e. no protocol message overhead). DTP achieved better precision than other protocols and provided strong bounds on precision: By running in the physical layer of a network stack, it eliminated non-determinism from measuring RTTs and it introduced zero Ethernet packets on the network. It was decentralized and synchronized the clocks of every network device in a network, including network interfaces and switches.

In practice, in a 10 Gbps network, DTP achieved a bounded precision of

25.6 nanoseconds between any directly connected nodes, and 153.6 nanosec- onds within an entire datacenter network with six hops at most between any two nodes, which is the longest distance in a Fat-tree [36] (i.e. no two nodes [clocks] would differ by more than 153.6 nanoseconds). In software, a DTP daemon could access its DTP clock with usually better than 4T nanosecond precision resulting in an end-to-end precision better than 4TD + 8T nanoseconds where

D was the longest distance between any two servers in a network in terms of number of hops and T was the period of the fastest clock (≈ 6.4ns). DTP’s ap- proach applies to full-duplex Ethernet standards such as 1, 10, 40, 100 Gigabit

Ethernet. It did require replacing network devices to support DTP running in the physical layer of the network. But, it could be incrementally deployed via DTP-enabled racks and switches. Further, incrementally deployed DTP-enabled racks and switches could work together and enhance other synchronization pro- tocols such as Precise Time Protocol (PTP) [18] and Global Positioning System (GPS) by distributing time with bounded nanosecond precision within a rack or set of racks without any load on the network.

In this chapter, we make three contributions:

• We provide precise clock synchronization at nanosecond resolution with bounded precision in hardware and tens of nanosecond precision in software via access to the PHY.

• We demonstrate that DTP works in practice. DTP can synchronize all devices in a datacenter network.

• We evaluated PTP as a comparison. PTP did not provide bounded precision and was affected by configuration, implementation, and network characteristics such as load and congestion.

4.1 Design

In this section, we present the DTP, Datacenter Time Protocol: Assumptions, protocol, and analysis. The design goals for the protocol were the following:

• Internal synchronization with nanosecond precision.

• No network overhead: No packets are required for the synchronization protocol.

4.1.1 Assumptions

We assume, in a 10 Gigabit Ethernet (10 GbE) network, all network devices are driven by oscillators that run at slightly different rates due to oscillator skew, but operate within a range defined by the IEEE 802.3 standard. The standard requires that the clock frequency fp be in the range [f − 0.0001f, f + 0.0001f]1, where f is 156.25 MHz in 10 GbE (See Section 2.1.1).

1 This is ±100 parts per million (ppm).

We assume that there are no “two-faced” clocks [77] or Byzantine failures which can report different clock counters to different peers.

We further assume that the length of Ethernet cables is bounded and, thus, network propagation delay is bounded. The propagation delay of optic fiber is about 5 nanoseconds per meter (light travels through fiber at about 2/3 of the speed of light; in a vacuum the delay would be 3.3 nanoseconds per meter) [71]. In particular, we assume the longest optic fiber inside a datacenter is 1000 meters, and as a result the maximum propagation delay is at most 5 us. Most cables inside a datacenter are 1 to 10 meters, as they are typically used to connect rack servers to a Top-of-Rack (ToR) switch; 5 to 50 nanoseconds would be the more common delay.

4.1.2 Protocol

In DTP, every network port (of a network interface or a switch) has a local counter in the physical layer that increments at every clock tick. DTP oper- ates via protocol messages between peer network ports: A network port sends a DTP message timestamped with its current local counter to its peer and adjusts its local clock upon receiving a remote counter value from its peer. We show that given the bounded delay and frequent resynchronizations, local counters of two peers can be precisely synchronized in Section 4.1.3.

Since DTP operates and maintains local counters in the physical layer, switches play an important role in scaling up the number of network devices synchronized by the protocol. As a result, synchronizing across all the network ports of a switch (or a network device with a multi-port network interface) requires an extra step: DTP needs to synchronize the local counters of all local ports. Specifically, DTP maintains a global counter that increments every clock tick, but also always picks the maximum counter value between it and all of the local counters.

Algorithm 3: DTP inside a network port

STATE:
    gc : global counter, from Algorithm 4
    lc ← 0 : local counter, increments at every clock tick
    d ← 0 : measured one-way delay to peer p

TRANSITION:
    T0: After the link is established with p
        lc ← gc
        Send (Init, lc)
    T1: After receiving (Init, c) from p
        Send (Init-Ack, c)
    T2: After receiving (Init-Ack, c) from p
        d ← (lc − c − α)/2
    T3: After a timeout
        Send (Beacon, gc)
    T4: After receiving (Beacon, c) from p
        lc ← max(lc, c + d)

DTP follows Algorithm 3 to synchronize the local counters between two peers. The protocol runs in two phases: INIT and BEACON phases.

INIT phase The purpose of the INIT phase is to measure the one-way delay between two peers. The phase begins when two ports are physically connected and start communicating, i.e. when the link between them is established. Each peer measures the one-way delay by measuring the time between sending an INIT message and receiving the associated INIT-ACK message, i.e. measuring the RTT, and then dividing the measured RTT by two (T0, T1, and T2 in Algorithm 3).

As the delay measurement is processed in the physical layer, the RTT con- sists of a few clock cycles to send / receive the message, the propagation delays of the wire, and the clock domain crossing (CDC) delays between the receive and transmit paths. Given the clock frequency assumption, and the length of the wire, the only non-deterministic part is the CDC. We analyze how they af- fect the accuracy of the measured delay in Section 4.1.3. Note that α in Transition

2 in Algorithm 3 is there to control the non-deterministic variance added by the CDC (See Section 4.1.3).

BEACON phase During the BEACON phase, two ports periodically exchange their local counters for resynchronization (T3 and T4 in Algorithm 3). Due to oscillator skew, the offset between two local counters will increase over time. A port adjusts its local counter by selecting the maximum of the local and remote counters upon receiving a BEACON message from its peer. Since BEACON mes- sages are exchanged frequently, hundreds of thousands of times a second (every few microseconds), the offset can be kept to a minimum.
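A toy model of this resynchronization loop is sketched below: it simulates two free-running counters at the worst-case relative skew of 200 ppm and applies the BEACON adjustment of Algorithm 3, ignoring propagation delay and the INIT phase; it is a simplified illustration, not the hardware implementation.

```python
# Toy model of the BEACON phase between two directly connected ports: q's oscillator is
# 200 ppm slower than p's (the worst relative skew allowed by the standard), beacons are
# sent every 3000 of p's ticks, and propagation delay / the INIT phase are ignored.

TICK_P = 6.4                # ns
TICK_Q = 6.4 * 1.0002       # ns, 200 ppm slower
BEACON = 3000               # beacon interval in p's ticks (< 5000, see the analysis)

jump = 0                    # total adjustment q has applied so far
max_offset = 0
for k in range(1, 500_000):
    t = k * TICK_P                        # real time of p's k-th tick
    lc_p = k                              # p's local counter
    lc_q = int(t / TICK_Q) + jump         # q's free-running counter plus adjustments
    if k % BEACON == 0:                   # q receives (Beacon, lc_p) and applies T4
        jump += max(0, lc_p - lc_q)
        lc_q = max(lc_q, lc_p)
    max_offset = max(max_offset, lc_p - lc_q)

print("worst offset:", max_offset, "ticks")   # a couple of ticks, consistent with Section 4.1.3
```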

Scalability and multi hops Switches and multi-port network interfaces have two to ninety-six ports in a single device that need to be synchronized within the device2. As a result, DTP always picks the maximum of all local counters {lci} as the value for a global counter gc (T5 in Algorithm 4). Then, each port transmits the global counter gc in a BEACON message (T3 in Algorithm 3).

2 Local counters of a multi-port device will not always be the same because remote clocks run at different rates. As a result, a multi-port device must synchronize local counters.

Algorithm 4: DTP inside a network device / switch

STATE:
    gc : global counter
    {lci} : local counters

TRANSITION:
    T5: at every clock tick
        gc ← max(gc + 1, {lci})

Choosing the maximum allows any counter to increase monotonically at the same rate and allows DTP to scale: The maximum counter value propagates to all network devices via BEACON messages, and frequent BEACON messages keep global counters closely synchronized (Section 4.1.3).

Network dynamics When a device is turned on, the local and global counters of a network device are set to zero. The global counter starts incrementing when one of the local counters starts incrementing (i.e., a peer is connected), and continuously increments as long as one of the local counters is incrementing. However, the global counter is set to zero when all ports become inactive. Thus, the local and global counters of a newly joining device are always less than those of other network devices in a DTP network. We use a special BEACON JOIN message in order to make large adjustments to a local counter. This message is communicated after an INIT ACK message in order for peers to agree on the maximum counter value between their two local counters. When a network device with multiple ports receives a BEACON JOIN message from one of its ports, it adjusts its global clock and propagates BEACON JOIN messages with its new global counter to other ports. Similarly, if a network is partitioned and later restored, two subnets will have different global counters. When the link between them is re-established, BEACON JOIN messages allow the two subnets to agree on the same (maximum) clock counter.

Handling failures There are mainly two types of failures that need to be handled appropriately: Bit errors and faulty devices. The IEEE 802.3 standard supports a Bit Error Rate (BER) objective of 10^-12 [19], which means one bit error could happen every 100 seconds in 10 GbE. However, it is possible that a corrupted bit coincides with a DTP message and could result in a big difference between local and remote counters. As a result, DTP ignores messages that contain remote counters off by more than eight (See Section 4.1.3), or bit errors not in the three least significant bits (LSB). Further, in order to prevent bit errors in LSBs, each message could include a parity bit that is computed using the three LSBs. As BEACON messages are communicated very frequently, ignoring messages with bit errors does not affect the precision.

Similarly, if one node makes too many jumps (i.e. adjusting local counters upon receiving BEACON messages) in a short period of time, it assumes the con- nected peer is faulty. Given the latency, the interval of BEACON messages, and maximum oscillator skew between two peers, one can estimate the maximum offset between two clocks and the maximum number of jumps. If a port receives a remote counter outside the estimated offset too often, it considers the peer to be faulty and stops synchronizing with the faulty device.

4.1.3 Analysis

The precision of clock synchronization is determined by oscillator skew, interval between resynchronizations, and errors in reading remote clocks [50, 63, 75]. In this section, we analyze DTP to understand its precision in regards to the above factors. In particular, we analyze the bounds on precision (clock offsets) and show the following:

• Bound of two tick errors due to measuring the one-way delay (OWD) during the INIT phase.

• Bound of two tick errors due to the BEACON interval. The offset of two synchronized peers can be up to two clock ticks if the interval of BEACON messages is less than 5000 ticks.

• As a result, the offset of two peers is bounded by four clock ticks, or 4T, where T is 6.4 ns. In 10 GbE the offset of two peers is bounded by 25.6 ns.

• Multi-hop precision. As each link can add up to four tick errors, the precision is bounded by 4TD, where 4 is the bound for the clock offset between directly connected peers, T is the clock period, and D is the longest distance in terms of the number of hops (a worked example of these bounds follows the list).

For simplicity, we use two peers p and q, and use Tp (fp) and Tq (fq) to denote the period (frequency) of p's and q's oscillators. For the analysis we assume p's oscillator runs faster than q's, i.e. Tp < Tq (or fp > fq).

Two tick errors due to OWD. In DTP, the one-way delay (OWD) between two peers, measured during the INIT phase, is assumed to be stable, constant, and symmetric in both directions. In practice, however, the measured delay can differ depending on when it is measured, due to oscillator skew and how the synchronization FIFO between the receive and transmit paths interacts. Further, the OWD of one path (from p to q) and that of the other (from q to p) might not be symmetric for the same reasons. We show that DTP still works with good precision despite any errors introduced by measuring the OWD.

Suppose p sends an INIT message to q at time t, and the delay between p and q is d clock cycles. Given the assumption that the length of cables is bounded and that oscillator skew is bounded, the delay is d cycles in both directions. The message arrives at q at t + Tp·d (i.e. the elapsed time is Tp·d). Since the message can arrive in the middle of a clock cycle of q's clock, it can wait up to Tq before q processes it. Further, passing data from the receive path to the transmit path requires a synchronization FIFO between two clock domains, which can add one more cycle randomly, i.e. the message could spend an additional Tq before it is received. Then, the INIT-ACK message from q takes Tq·d time to arrive at p, and it could wait up to 2Tp before p processes it. As a result, it takes up to a total of Tp·d + 2Tq + Tq·d + 2Tp time to receive the INIT-ACK message after sending an INIT message. Thus, the measured OWD, dp, at p is

dp ≤ ⌊(Tp·d + 2Tq + Tq·d + 2Tp) / Tp⌋ / 2 = d + 2

In other words, dp could be one of d, d + 1, or d + 2 clock cycles depending on when it is measured. As q's clock is slower than p's, the clock counter of q cannot be larger than p's. However, if the measured OWD, dp, is larger than the actual OWD, d, then p will think q is faster and adjust its offset more frequently than necessary (see Transition T4 in Algorithm 3). This, in turn, causes the global counter of the network to go faster than necessary. As a result, α in T2 of Algorithm 3 is introduced.

Setting α to 3 ensures that dp is never larger than d. In particular, dp will be d − 1 or d; however, dq will be d − 2 or d − 1. Fortunately, a measured delay of d − 2 at q does not make the global counter go faster, but it can increase the offset between p and q to two clock ticks most of the time, which results in q adjusting its counter by one only when the actual offset is two.

Two tick errors due to the BEACON interval. The BEACON interval, the period of resynchronization, plays a significant role in bounding the precision. We show that a BEACON interval of less than 5000 clock ticks can bound the clock offset between peers to two ticks.

Let Cp(X) be a clock that returns a real time t at which cp(t) changes to X.

Note that the clock is a discrete function. Then, cp(t) = X means that the value of the clock is stably X at least after t − Tp, i.e. t − Tp < Cp(X) ≤ t.

Suppose p and q are synchronized at time t1, i.e. cp(t1) = cq(t1) = X. Also suppose cp(t2) = X + ∆P and cq(t2) = X + ∆Q at time t2, where ∆P is the difference between the two counter values of clock p at times t1 and t2 (and similarly for ∆Q). Then,

t2 − Tp < Cp(X + ∆P) = Cp(X) + ∆P·Tp ≤ t2

t2 − Tq < Cq(X + ∆Q) = Cq(X) + ∆Q·Tq ≤ t2

Then, the offset between two clocks at t2 is,

∆t(fp − fq) − 2 < ∆P − ∆Q < ∆t(fp − fq) + 2

where ∆t = t2 − t1.

Since the maximum frequency of a NIC clock oscillator is 1.0001f and the minimum frequency is 0.9999f, ∆t(fp − fq) is always smaller than 1 if ∆t is less than 32 us. As a result, ∆P − ∆Q is always less than or equal to 2 if the interval of resynchronization (∆t) is less than 32 us (≈ 5000 ticks). Considering that the maximum latency of the cable is less than 5 us (≈ 800 ticks), a BEACON interval of less than 25 us (≈ 4000 ticks) is sufficient for any two peers to synchronize with 12.8 ns (= 2 ticks) precision.
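For concreteness, the 32 us figure follows directly from the oscillator tolerance (the ±100 ppm implied by 1.0001f and 0.9999f) and the 10 GbE clock parameters quoted above; the following is just the arithmetic behind the bound, not an additional assumption:

    \[ f_p - f_q \le (1.0001 - 0.9999)f = 0.0002 \times 156.25\,\text{MHz} = 31.25\,\text{kHz} \]
    \[ \Delta t < \frac{1}{0.0002 f} = 32\,\mu\text{s} \;\Rightarrow\; \Delta t\,(f_p - f_q) < 1, \qquad \frac{32\,\mu\text{s}}{6.4\,\text{ns}} = 5000\ \text{ticks} \]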

Multi-hop precision. Note that DTP always picks the maximum clock counter of all nodes as the global counter. All clocks will always be synchronized to the fastest clock in the network, and the global counter always increases monotonically. Then, the maximum offset between any two clocks in a network is between the fastest and the slowest clock. As discussed above, any link between them can add at most two ticks of offset error from the measured delay and two ticks from the BEACON interval. Therefore, the maximum offset within a DTP-enabled network is bounded by 4TD, where D is the longest distance between any two nodes in the network in terms of the number of hops, and T is the period of the clock as defined in the IEEE 802.3 standard (≈ 6.4 ns).

4.2 Implementation

4.2.1 DTP-enabled PHY

The control logic of DTP in a network port consists of Algorithm 3 from Section 4.1 and a local counter. The local counter is a 106-bit integer (2 × 53 bits) that increments at every clock tick (6.4 ns = 1/156.25 MHz), or is adjusted based on received BEACON messages. Note that the same oscillator drives all modules in the PCS sublayer on the transmit path and the control logic that increments the local counter, i.e. they are in the same clock domain. As a result, the DTP sublayer can easily insert the local clock counter into a protocol message with no delay.

Figure 4.1: Low layers of a 10 GbE network stack (MAC, Reconciliation Sublayer (RS), XGMII at 32 bits / 156.25 MHz, PCS, XSBI at 16 bits / 644.53125 MHz, PMA, and PMD). Grayed rectangles are DTP sublayers, and the circle represents a synchronization FIFO.

The DTP-enabled PHY is illustrated in Figure 4.1. It is exactly the same as the PCS from the standard, except that it adds the DTP control, TX DTP, and RX DTP sublayers shaded in gray. Specifically, on the transmit path, the TX DTP sublayer inserts protocol messages, while, on the receive path, the RX DTP sublayer processes incoming protocol messages and forwards them to the control logic through a synchronization FIFO. After the RX DTP sublayer receives and uses a DTP protocol message from the Control block (/E/), it replaces the DTP message with idle characters (/I/s, all 0's) as required by the standard, such that higher network layers do not know about the existence of the DTP sublayer. Lastly, when an Ethernet frame is being processed in the PCS sublayer, DTP simply forwards the blocks of the Ethernet frame unaltered between the PCS sublayers.

Figure 4.2: DTP-enabled four-port device. The device-wide global counter is the maximum of the local counters of all ports: Global Counter = Max(LC0, LC1, LC2, LC3).

4.2.2 DTP-enabled network device

A DTP-enabled device (Figure 4.2) can be implemented with additional logic on top of the DTP-enabled ports. The logic maintains the 106-bit global counter as shown in Algorithm 4, which computes the maximum of the local counters of all ports in the device. The computation can be optimized with a tree-structured circuit to reduce latency, and can be performed in a deterministic number of cycles. When a switch port tries to send a BEACON message, it inserts the global counter into the message, instead of the local counter. Consequently, all switch ports are synchronized to the same global counter value.
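A minimal C sketch of that maximum computation over four ports is shown below (a two-level reduction tree, as a deterministic comparator circuit would compute it); this is only an illustration, not the Bluespec logic used in the prototype, and the 106-bit counter is represented as a high/low pair for convenience.

    #include <stdint.h>

    /* Illustrative representation of a 106-bit DTP counter as a high/low pair. */
    struct dtp_counter {
        uint64_t hi;   /* upper 53 bits */
        uint64_t lo;   /* lower 53 bits */
    };

    static struct dtp_counter max2(struct dtp_counter a, struct dtp_counter b) {
        if (a.hi != b.hi)
            return (a.hi > b.hi) ? a : b;
        return (a.lo > b.lo) ? a : b;
    }

    /* Two-level reduction over four ports: max(max(LC0, LC1), max(LC2, LC3)).
     * In hardware this corresponds to a tree of comparators that finishes in a
     * deterministic number of cycles. */
    static struct dtp_counter global_counter(const struct dtp_counter lc[4]) {
        return max2(max2(lc[0], lc[1]), max2(lc[2], lc[3]));
    }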

4.2.3 Protocol messages

DTP uses /I/s in the /E/ control block to deliver protocol messages. There are eight seven-bit /I/s in an /E/ control block, and, as a result, 56 bits in total are available for a DTP protocol message per /E/ control block. Modifying control blocks to deliver DTP messages does not affect the physics of a network interface, since the bits are scrambled to maintain DC balance before being sent on the wire (see the scrambler/descrambler in Figure 4.1). Moreover, using /E/ blocks does not affect higher layers, since DTP replaces /E/ blocks with the required /I/s (zeros) upon processing them.

A DTP message consists of a three-bit message type and a 53-bit payload. There are five different message types in DTP: INIT, INIT-ACK, BEACON, BEACON-JOIN, and BEACON-MSB. As a result, three bits are sufficient to encode all possible message types. The payload of a DTP message contains the local (global) counter of the sender. Since the local counter is a 106-bit integer and there are only 53 bits available in the payload, each DTP message carries the 53 least significant bits of the counter. In 10 GbE, a clock counter increments every 6.4 ns (= 1/156.25 MHz), and it takes about 667 days to overflow 53 bits. DTP occasionally transmits the 53 most significant bits in a BEACON-MSB message in order to prevent overflow.
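A sketch of how such a message could be packed into the 56 bits carried by one /E/ block (a three-bit type plus the 53 low-order counter bits); the enum values and helper names are illustrative and not necessarily the exact encoding used by the prototype.

    #include <stdint.h>

    enum dtp_type { DTP_INIT, DTP_INIT_ACK, DTP_BEACON, DTP_BEACON_JOIN, DTP_BEACON_MSB };

    #define DTP_PAYLOAD_BITS 53
    #define DTP_PAYLOAD_MASK ((1ULL << DTP_PAYLOAD_BITS) - 1)

    /* Pack a 3-bit type and the 53 least significant bits of the counter into
     * the 56 bits available in one /E/ block (eight 7-bit /I/ characters). */
    static uint64_t dtp_pack(enum dtp_type type, uint64_t counter) {
        return ((uint64_t)type & 0x7) | ((counter & DTP_PAYLOAD_MASK) << 3);
    }

    static enum dtp_type dtp_type_of(uint64_t msg)    { return (enum dtp_type)(msg & 0x7); }
    static uint64_t      dtp_counter_of(uint64_t msg) { return (msg >> 3) & DTP_PAYLOAD_MASK; }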

As mentioned in Section 2.1.1, it is always possible to transmit one protocol message before or after an Ethernet frame is transmitted. This means that even when the link is fully saturated with Ethernet frames, DTP can send a BEACON message every 200 clock cycles (≈ 1280 ns) for MTU-sized (1522 B) frames3 and, at worst, every 1200 clock cycles (≈ 7680 ns) for jumbo-sized (≈ 9 kB) frames. The PHY requires about 191 66-bit blocks and 1,129 66-bit blocks to transmit an MTU-sized or a jumbo-sized frame, respectively. This is more than sufficient to precisely synchronize clocks, as analyzed in Section 4.1.3 and evaluated in Section 4.3. Further, DTP communicates frequently when there are no Ethernet frames, e.g., every 200 clock cycles, or 1280 ns: The PHY continuously sends /E/ blocks when there are no Ethernet frames to send.

3It includes an 8-byte preamble, an Ethernet header, a 1500-byte payload, and a checksum value.

4.2.4 DTP Software Clock

The DTP daemon implements a software clock through which applications can access the DTP counter. The value of a software clock should not change dramatically (i.e., be discontinuous) or go back in time. As a result, a DTP daemon adjusts the rate of the DTP software clock carefully in order not to introduce discontinuities. A DTP daemon regularly (e.g., once per second) reads the DTP counter of a network interface card via memory-mapped IO in order to minimize errors in reading the counter. Further, Time Stamp Counters (TSCs) are employed to estimate the frequency of the DTP counter. The TSC is a reliable and stable source for implementing software clocks [108, 123, 53]. Modern systems support invariant TSCs that are not affected by CPU power states [20].

The top part of Algorithm 5 illustrates the polling. In order to read both the TSC and the DTP counter reliably, the daemon reads the counters with interrupts disabled. Then, the daemon computes the frequency ratio between the two counters (tmp in Algorithm 5). Note that although the latency of reading the TSC is small and does not vary much [16], the latency of reading the DTP counter could vary a lot, mainly because of data communication over PCIe.

Algorithm 5: DTP Daemon

procedure POLLING
    disable interrupts
    tsc ← READ TSC
    dtpc ← READ DTP
    tmp ← (dtpc − pdtpc) / (tsc − ptsc)
    r ← tmp ∗ β + r ∗ (1 − β)
    ptsc ← tsc
    pdtpc ← dtpc
    enable interrupts
end procedure

function GET DTP COUNTER
    t ← READ TSC
    return pdtpc + (t − ptsc) ∗ r + delay
end function

As a result, the actual ratio the daemon uses for interpolation is smoothed. It is a weighted sum of the previously computed ratio and the newly computed ratio (r in Algorithm 5).

Then, applications can read the DTP clock via a get DTP counter API that interpolates the DTP counter at any moment using the TSC and the estimated ratio r (the bottom part of Algorithm 5). Similar techniques are used to implement gettimeofday(). Note that the latency of reading the DTP counter is not negligible; it can be hundreds of nanoseconds to a few microseconds [43]. As a result, the read counter (dtpc) is actually some number of cycles behind the counter in hardware. The latency between the daemon and the hardware must be taken into account when get DTP counter returns an interpolated value, which is shown as delay in Algorithm 5. A DTP daemon approximates delay using a special send instruction that DTP hardware provides. In particular, a DTP daemon periodically sends its estimated DTP time, t1, to the hardware, which timestamps it (t2) and sends it to its peer via a DTP protocol message in the physical layer. Upon receiving the message, the receiver daemon computes the offset t2 − t1, computes a weighted sum of the previously computed offset and the newly computed offset (i.e., applies a moving average to smooth large changes), and reports it back to the original sender. By doing so, the reported delay from the receiver does not change much, which allows a DTP daemon to precisely estimate the offset between software and hardware. We demonstrate in the following section that a moving average over 10 samples is sufficient to smooth out sudden changes in delay. Although delay was stable and did not change often in our deployment, it could still introduce a discontinuity. One approach to prevent a discontinuity is to take delay into account when computing the rate of the DTP clock.
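A minimal user-space sketch of the polling and interpolation logic described above and in Algorithm 5; read_dtp_counter() stands in for the memory-mapped read of the NIC's counter, and the smoothing constant beta is illustrative.

    #include <stdint.h>
    #include <x86intrin.h>            /* __rdtsc() */

    extern uint64_t read_dtp_counter(void);  /* hypothetical MMIO read of the NIC counter */

    static uint64_t ptsc, pdtpc;      /* TSC and DTP values from the previous poll */
    static double   r = 1.0;          /* smoothed DTP-ticks-per-TSC-tick ratio */
    static int64_t  delay;            /* software-to-hardware offset, in DTP ticks */

    /* Called roughly once per second (with interrupts disabled in the kernel
     * implementation) to re-estimate the frequency ratio between the counters. */
    void dtp_poll(void) {
        const double beta = 0.1;                 /* illustrative smoothing weight */
        uint64_t tsc  = __rdtsc();
        uint64_t dtpc = read_dtp_counter();
        if (ptsc != 0) {
            double tmp = (double)(dtpc - pdtpc) / (double)(tsc - ptsc);
            r = tmp * beta + r * (1.0 - beta);   /* weighted sum of old and new ratio */
        }
        ptsc  = tsc;
        pdtpc = dtpc;
    }

    /* Interpolate the current DTP counter from the TSC, as in get DTP counter. */
    uint64_t get_dtp_counter(void) {
        uint64_t t = __rdtsc();
        return pdtpc + (uint64_t)((double)(t - ptsc) * r) + delay;
    }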

Note that the DTP counters of all NICs in a DTP-enabled network run at the same rate and, as a result, the software clocks that DTP daemons implement are also tightly synchronized.

4.3 Evaluation

In this section, we attempt to answer the following questions:

• Precision: In Section 4.1.3, we showed that the precision of DTP is bounded by 4TD, where D is the longest distance between any two nodes in terms of the number of hops. In this section, we demonstrate, using a prototype and a deployed system, that the measured precision was indeed within the 4TD bound.

• Scalability: We demonstrate that DTP scaled as the number of hops in the network increased.

Further, we measured the precision of accessing DTP from software and compared DTP against PTP. The DTP sublayer and the 10 GbE PHY were implemented using the Bluespec language [6] and the Connectal framework [73].

4.3.1 Methodology

Measuring offsets at nanosecond scale is a very challenging problem. One approach is to let hardware generate pulse-per-second (PPS) signals and compare them using an oscilloscope. Another approach, which we used, is to measure the precision directly in the PHY. Since we were mainly interested in the clock counters of network devices, we developed a logging mechanism in the PHY.

Each leaf node generated and sent a 106-bit log message twice per second to its peer, a DTP switch. DTP switches also generated log messages between each other twice per second. A log message contained a 53-bit estimate of the DTP counter generated by the DTP daemon, t0, which was then timestamped in the DTP layer with the lower 53 bits of the global counter (or the local counter if the sender was a NIC). The 53-bit timestamp, t1, was appended to the original message generated by the DTP daemon and, as a result, a 106-bit message was generated by the sender. Upon arriving at an intermediate DTP switch, the log message was timestamped again, t2, in the DTP layer with the receiver's global counter.

Then, the original 53-bit log message (t0) and the two timestamps (t1 from the sender and t2 from the receiver) were delivered to a DTP daemon running on the receiver. By computing offsethw = t2 − t1 − OWD, where OWD was the one-way delay measured in the INIT phase, we could estimate the precision between two peers. Similarly, by computing offsetsw = t1 − t0, we could estimate the precision of a DTP daemon. Note that offsethw included the non-deterministic variance from the synchronization FIFO, and offsetsw included the non-deterministic variance from the PCIe bus. We could accurately approximate both offsethw and offsetsw with this method.
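In code form, the two offset computations could look like the following sketch; the wrap-around handling for the 53-bit timestamps and the names are illustrative, and owd is the delay measured during the INIT phase, in ticks.

    #include <stdint.h>

    /* Timestamps are the lower 53 bits of the counter; differences are taken
     * modulo 2^53 and re-centered so that small negative offsets stay negative. */
    static int64_t wrap53(int64_t x) {
        const int64_t MOD = 1LL << 53;
        x &= MOD - 1;
        return (x >= MOD / 2) ? x - MOD : x;
    }

    /* Precision between two directly connected peers (hardware offset). */
    static int64_t offset_hw(int64_t t1, int64_t t2, int64_t owd) {
        return wrap53(t2 - t1) - owd;
    }

    /* Precision of the DTP daemon relative to the hardware counter. */
    static int64_t offset_sw(int64_t t0, int64_t t1) {
        return wrap53(t1 - t0);
    }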

For PTP, the Timekeeper provided a tool that reported the measured offsets between the timeserver and all PTP clients. Note that our Mellanox NICs had PTP hardware clocks (PHCs). For a fair comparison against DTP, which synchronized the clocks of NICs, we used the precision numbers measured from a PHC. Also note that a Mellanox NIC timestamped both incoming and outgoing PTP packets in the NIC.

To measure how DTP responded to varying network conditions, we changed the BEACON interval during experiments from 200 to 1200 cycles, which corresponds to changing the Ethernet frame size from 1.5 kB to 9 kB. Recall that when a link is fully saturated with MTU-sized (Jumbo) packets, the minimum possible BEACON interval is 200 (1200) cycles.

4.3.2 Results

Figures 4.3 and 4.4 show the results: We measured the precision of DTP in Figures 4.3(a), (c), and (e), of PTP in Figures 4.3(b), (d), and (f), and of the DTP daemon in Figure 4.4.

Figure 4.3: Precision of DTP and PTP; a tick is 6.4 nanoseconds. (a) DTP: BEACON interval = 200, heavily loaded with MTU packets. (b) PTP: idle network. (c) DTP: BEACON interval = 1200, heavily loaded with Jumbo packets. (d) PTP: medium load. (e) DTP: offset distribution from S3 (BEACON interval = 1200 cycles). (f) PTP: heavily loaded.

For all results, we continuously synchronized clocks and measured the precision (clock offsets) over at least a two-day period in Figure 4.3 and at least a few-hour period in Figure 4.4.

Figures 4.3(a) and 4.3(c) demonstrate that the clock offsets between any two directly connected nodes in DTP never differed by more than four clock ticks; i.e. offsets never differed by more than 25.6 nanoseconds (4TD = 4 × 6.4 × 1 = 25.6): Figures 4.3(a) and (c) show three minutes out of a two-day measurement period, and Figure 4.3(e) shows the distribution of the measured offsets with node S3 for the entire two-day period.

Figure 4.4: Precision of the DTP daemon. (a) Before smoothing: raw offsetsw. (b) After smoothing: window size = 10.

The network was always under heavy load, and we varied the Ethernet frame size by varying the BEACON interval between 200 cycles in Figure 4.3(a) and 1200 cycles in Figure 4.3(c). DTP performed similarly under idle and medium load. Since we measured all pairs of nodes and no offset was ever greater than four ticks, the results support that precision was bounded by 4TD for nodes D hops away from each other. Figure 4.4 shows the precision of accessing a DTP counter via a DTP daemon: Figure 4.4(a) shows the raw offsetsw and Figure 4.4(b) shows offsetsw after applying a moving average algorithm with a window size of 10. We applied the moving average algorithm to smooth the effect of the non-determinism of the PCIe bus, which appears as occasional spikes. The offset between a DTP daemon in software and the DTP counter in hardware was usually no more than 16 clock ticks (≈ 102.4 ns) before smoothing, and usually no more than 4 clock ticks (≈ 25.6 ns) after smoothing.

Figures 4.3(b), (d), and (f) show the measured clock offsets between each node and the grandmaster timeserver using PTP. Each figure shows minutes to hours of a multi-day measurement period, enough to illustrate the precision trends. We varied the load of the network from idle (Figure 4.3(b)), to medium load where five nodes transmitted and received at 4 Gbps (Figure 4.3(d)), to heavy load where the receive and transmit paths of all links except S11 were fully saturated at 9 Gbps (Figure 4.3(f)). When the network was idle, Figure 4.3(b) shows that PTP often provided hundreds of nanoseconds of precision, which matched the literature [14, 34]. When the network was under medium load, Figure 4.3(d) shows that the offsets of S4 ∼ S8 became unstable and reached up to 50 microseconds. Finally, when the network was under heavy load, Figure 4.3(f) shows that the maximum offset degraded to hundreds of microseconds. Note that we measured, but do not report, the numbers from the PTP daemon, ptpd, because the precision with the daemon was the same as the precision with the hardware clock, the PHC. Also note that all reported PTP measurements included smoothing and filtering algorithms.

There are multiple takeaways from these results.

1. DTP synchronized clocks more tightly than PTP.

2. The precision of DTP was not affected by network workloads. The maximum offset observed in DTP did not change when either the load or the Ethernet frame size (the BEACON interval) changed. PTP, on the other hand, was greatly affected by network workloads, and its precision varied from hundreds of nanoseconds to hundreds of microseconds depending on the network load.

3. DTP scales. The precision of DTP only depended on the number of hops between any two nodes in the network. The results show that the precision (clock offsets) was always bounded by 4TD.

4. DTP daemons could access DTP counters with tens of nanoseconds of precision.

5. DTP synchronized clocks in a short period of time, within two BEACON intervals. PTP, however, took about 10 minutes for a client to have an offset below one microsecond. This was likely because PTP needed history to apply filtering and smoothing effectively.

6. PTP's performance was dependent upon network conditions, configuration such as transparent clocks, and implementation.

4.4 Discussion

4.4.1 External Synchronization

We discuss one simple approach that extends DTP to support external synchronization, although there could be many other approaches. One server (either a timeserver or a commodity server that uses PTP or NTP) periodically (e.g., once per second) broadcasts a (DTP counter, universal time (UTC)) pair to other servers. Upon receiving consecutive broadcast messages, each DTP daemon estimates the frequency ratio between the received DTP counters and UTC values. Next, applications can read UTC by interpolating the current DTP counter with the frequency ratio, in a fashion similar to the method discussed in Section 4.2.4.
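A sketch of that interpolation, assuming each broadcast carries a (DTP counter, UTC in nanoseconds) pair; the names and the unsmoothed ratio estimate are illustrative only.

    #include <stdint.h>

    /* Last (DTP counter, UTC) broadcast pair received by this server. */
    static uint64_t prev_dtp, prev_utc_ns;   /* illustrative state */
    static double   ns_per_tick = 6.4;       /* refined from consecutive broadcasts */

    /* Called on every broadcast message: refine the ratio between UTC and the
     * DTP counter from two consecutive (counter, UTC) pairs. */
    void on_broadcast(uint64_t dtp, uint64_t utc_ns) {
        if (prev_dtp != 0)
            ns_per_tick = (double)(utc_ns - prev_utc_ns) / (double)(dtp - prev_dtp);
        prev_dtp = dtp;
        prev_utc_ns = utc_ns;
    }

    /* Interpolate UTC from the current DTP counter (e.g., get_dtp_counter()). */
    uint64_t utc_now_ns(uint64_t current_dtp) {
        return prev_utc_ns + (uint64_t)((double)(current_dtp - prev_dtp) * ns_per_tick);
    }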

Again, the DTP counters of all NICs run at the same rate and, as a result, UTC at each server can also be tightly synchronized, with some loss of precision due to errors in reading system clocks. It is also possible to combine DTP and PTP to further improve the precision of external synchronization: A timeserver timestamps sync messages with DTP counters, and delays between the timeserver and clients are measured using DTP counters.

4.4.2 Incremental Deployment

DTP requires the physical layer to be modified. As a result, in order to deploy DTP, network devices must be modified. As there is usually a single switching chip inside a network device [7], the best strategy to deploy DTP is to implement it inside the switching chip. Network devices with DTP-enabled switching chips can then create a DTP-enabled network. This would require updating the firmware, or possibly replacing the switching chip. PTP uses a similar approach in order to improve precision: PTP-enabled switches have dedicated logic inside the switching chip for processing PTP packets, and PTP-enabled NICs have hardware timestamping capabilities and PTP hardware clocks (PHCs). Therefore, the cost of achieving the best configuration of PTP is essentially the same as the cost of deploying DTP, as both require replacing NICs and switches.

An alternative way to deploy DTP is to use FPGA-based devices. FPGA-based NICs and switches [11, 101] have more flexibility in updating firmware. Further, customized PHYs can be easily implemented and deployed with modern FPGAs that are equipped with high-speed transceivers.

One of the limitations of DTP is that it is not possible to deploy DTP on routers or network devices with multiple line cards without sacrificing precision. Network ports on separate line cards typically communicate via a bus interface. As a result, it is not possible to maintain a single global counter with high precision over a shared bus, although each line card can have its own separate global counter. Fortunately, as long as all switches and line cards form a connected graph, synchronization can be maintained.

Replacing or updating all switches and NICs in a datacenter at once is not possible due to both cost and availability. Importantly, DTP can be incrementally deployed: NICs and a ToR switch within the same rack are updated at the same time, and aggregate and core switches are updated incrementally from the lower levels of the network topology. Each DTP-enabled rack elects one server to work as a master for PTP / NTP. Then, servers within the same rack will be tightly synchronized, while servers from different racks are less tightly synchronized, depending on the performance of PTP / NTP. When two independently DTP-enabled racks start communicating via a DTP-enabled switch, servers from the two racks will be tightly synchronized both internally and externally after communicating BEACON JOIN messages.

4.4.3 Following The Fastest Clock

DTP assumes that the oscillators of DTP-enabled devices operate within the range defined by the IEEE 802.3 standard (Section 4.1.1). In practice, however, this assumption can be broken, and an oscillator in a network could run at a frequency outside the range specified in the standard. This could lead to many jumps at devices with slower oscillators. More importantly, the maximum offset between two devices could be larger than 4TD. One approach to address the problem is to choose a network device with a reliable and stable oscillator as a master node. Then, through DTP daemons, it is possible to construct a DTP spanning tree using the master node as the root. This is similar to PTP's best master clock algorithm. Next, at each level of the tree, a node uses the remote counter of its parent node as the global counter. If the oscillator of a child node runs faster than that of its parent, the local counter of the child should stall occasionally in order to keep the local counter monotonically increasing. We leave this design as future work.

Data Rate    Encoding    Data Width    Frequency       Period      ∆
1G           8b/10b      8 bit         125 MHz         8 ns        25
10G          64b/66b     32 bit        156.25 MHz      6.4 ns      20
40G          64b/66b     64 bit        625 MHz         1.6 ns      5
100G         64b/66b     64 bit        1562.5 MHz      0.64 ns     2

Table 4.1: Specifications of the PHY at different speeds

4.4.4 What about 1G, 40G or 100G?

In this chapter we discussed and demonstrated how we can implement and deploy DTP over a datacenter, focusing on 10 GbE links. However, the capacity of links in a datacenter is not homogeneous. Servers can be connected to Top-of-Rack switches via 1 Gbps links, and uplinks between switches and routers can be 40 or 100 Gbps. Nonetheless, DTP is still applicable to these cases because the fundamental fact still holds: Two physically connected devices in high-speed Ethernet (1G and beyond) are already synchronized to transmit and receive bitstreams. The question is how to modify DTP to support thousands of devices with different link capacities.

DTP can be extended to support 40 GbE and 100 GbE in a straightforward manner. The clock frequency required to operate 40 or 100 Gbps is a multiple of that of 10 Gbps (Table 4.1). In fact, switches that support 10 Gbps and beyond normally use a clock oscillator running at 156.25 MHz to support all ports [27]. As a result, incrementing clock counters by different values depending on the link speed is sufficient. In particular, as the last column of Table 4.1 shows, if a counter tick represents 0.32 nanoseconds, then DTP will work at 10, 40, and 100 GbE by adjusting the counter increment to match the corresponding clock period (i.e. 20 × 0.32 = 6.4 ns, 5 × 0.32 = 1.6 ns, and 2 × 0.32 = 0.64 ns, respectively).

Similarly, DTP can be made to work with 1 GbE by incrementing the counter of a 1 GbE port by 25 at every tick (see the last column of Table 4.1). However, the 1 Gbps PHY is different: it uses an 8b/10b encoding instead of a 64b/66b encoding, so we would need to adapt DTP to send clock counter values with the different encoding.
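Illustrating the idea in code, if one counter unit is 0.32 ns, each port simply adds a per-speed increment on every local clock edge; the increments below are taken from the last column of Table 4.1, and the names are illustrative.

    #include <stdint.h>

    /* Counter increment per local clock tick when one counter unit is 0.32 ns. */
    enum link_speed { LINK_1G, LINK_10G, LINK_40G, LINK_100G };

    static unsigned ticks_per_clock(enum link_speed speed) {
        switch (speed) {
        case LINK_1G:   return 25;   /* 25 x 0.32 ns = 8 ns    (125 MHz)     */
        case LINK_10G:  return 20;   /* 20 x 0.32 ns = 6.4 ns  (156.25 MHz)  */
        case LINK_40G:  return 5;    /*  5 x 0.32 ns = 1.6 ns  (625 MHz)     */
        case LINK_100G: return 2;    /*  2 x 0.32 ns = 0.64 ns (1562.5 MHz)  */
        }
        return 20;
    }

    /* Called on every clock edge of the port's transmit clock. */
    static void on_clock_edge(uint64_t *local_counter, enum link_speed speed) {
        *local_counter += ticks_per_clock(speed);
    }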

4.5 Summary

Precisely synchronizing clocks is not trivial. In this chapter, we demonstrated that access to the physical layer can improve the precision of synchronized clocks. In particular, DTP exploited the fundamental fact that two physically connected devices are already synchronized to transmit and receive bitstreams. DTP synchronized the clocks of network components at tens of nanoseconds of precision, scaled up to synchronize an entire datacenter network, and was accessed from software with usually better than twenty-five nanosecond precision. As a result, the end-to-end precision was the precision from DTP in the network (i.e. 25.6 nanoseconds for directly connected nodes and 153.6 nanoseconds for a datacenter with six hops) plus fifty nanoseconds of precision from software. The approach demonstrated in this chapter for precisely synchronized clocks represents an advance in the state-of-the-art.

CHAPTER 5 RELATED WORK

In this dissertation, we first investigated how to improve the precision of timestamping and pacing via an augmented physical layer that provides timing information. Related work that used physics equipment and FPGA boards for network measurements inspired precise timestamping and pacing. Although there are other efforts to provide network programmability or precise timestamping, their approaches do not allow the physical layer to be programmable or provide reliable packet timestamps and pacing. We also researched how to precisely synchronize clocks in a network. Existing clock synchronization protocols do not provide the same level of precision as DTP, which provides bounded precision regardless of network conditions. In this chapter, we discuss work related to our contributions.

5.1 Programmable Network Hardware

Programmable network hardware allows for the experimentation of novel network system architectures. Previous studies on reconfigurable NICs [128] showed that programmable network hardware is useful for exploring new I/O virtualization techniques in VMMs. NetFPGA [91, 136] allows users to experiment with FPGA-based routers and switches for research in new network protocols and intrusion detection [39, 45, 103, 133]. Further, there are specialized NICs that are programmable and support the P4 language [28], a dataplane programming language. In P4, forwarding elements such as switches or NICs perform user-defined actions, such as modifying headers, discarding, or forwarding, on packets whose header fields match flow tables. Netronome's FlowNIC [13] and Xilinx's SDNet [29] support P4. While programmable NICs allow users to access layer 2 and above, SoNIC allows users to access the PHY. This means that users can access the entire network stack in software using SoNIC.

5.2 Hardware Timestamping

The importance of timestamping has long been established in the network measurement community. Prior work either does not provide precise enough timestamping, or requires special devices. Packet timestamping in user space or the kernel suffers from the imprecision introduced by the OS layer [52]. Many commodity NICs support hardware timestamping with various levels of accuracy, ranging from nanoseconds [10, 24, 11] to hundreds of nanoseconds [25]. Further, NICs can be combined with a GPS receiver or PTP capability to use reference time for timestamping. However, as we discussed in Section 2.1.2, hardware timestamping could still be imprecise.

Although BiFocals [58] is able to provide exact timestamping, it has limitations that prevented it from being a portable and realtime tool. BiFocals can store and analyze only a few milliseconds' worth of a bitstream at a time due to the small memory of an oscilloscope. Furthermore, it requires thousands of CPU hours to convert raw optical waveforms into packets. Lastly, the physics equipment used by BiFocals is expensive and not easily portable. Its limitations motivated us to design SoNIC to achieve exact-precision timestamping in real time. Note that both BiFocals and SoNIC only support delta timestamping. However, combining SoNIC and DTP can achieve fine-grained timestamping that is synchronized with other network devices.

5.3 Software Defined Radio

Software Defined Radio (SDR) allows easy and rapid prototyping of wireless networks in software. Open-access platforms such as Rice University's WARP [37] allow researchers to program both the physical and network layers on a single platform. Sora [122] presented the first SDR platform that fully implements IEEE 802.11b/g on a commodity PC. AirFPGA [133] implemented an SDR platform on NetFPGA, focusing on building a chain of signal processing engines using commodity machines. SoNIC is similar to Sora in that it allows users to access and modify the PHY and MAC layers. The difference is that SoNIC must process multiple 10 Gbps channels, which is much more computationally intensive than handling the data rates of wireless channels. Moreover, it is harder to synchronize hardware and software because a 10 GbE link runs in full-duplex mode, unlike a wireless network.

5.4 Software Router

Although SoNIC is orthogonal to software routers, it is worth mentioning software routers because they share common techniques. SoNIC pre-allocates buffers to reduce memory overhead [65, 112], polls large chunks of data from hardware to minimize interrupt overhead [54, 65], and packs packets, in a fashion similar to batching, to improve performance [54, 65, 95, 112]. Software routers normally focus on scalability and hence exploit multi-core processors and the multi-queue support of NICs to distribute packets to different cores for packet processing. On the other hand, SoNIC pipelines multiple CPUs to handle continuous bitstreams.

       Precision   Scalability   Overhead (pckts)   Extra hardware
NTP    us          Good          Moderate           None
PTP    sub-us      Good          Moderate           PTP-enabled devices
GPS    ns          Bad           None               Timing signal receivers, cables
DTP    ns          Good          None               DTP-enabled devices

Table 5.1: Comparison between NTP, PTP, GPS, and DTP

5.5 Clock Synchronization

In this section, we describe the widely used clock synchronization protocols NTP, PTP, and GPS, as well as some other protocols.

Network Time Protocol (NTP)

The most commonly used time synchronization protocol is the Network Time Protocol (NTP) [98]. NTP provides millisecond precision in a wide area network (WAN) and microsecond precision in a local area network (LAN). In NTP, time servers construct a tree, and top-level servers (or stratum 1) are connected to a reliable external time source (stratum 0) such as satellites (through a GPS receiver) or atomic clocks. A client communicates with one of the time servers via UDP packets. As mentioned in Section 2.1.3, four timestamps are used to account for processing time in the time server.

NTP is not adequate for a datacenter. It is prone to errors that reduce precision in clock synchronization: inaccurate timestamping, the software network stack (a UDP daemon), and network jitter. Furthermore, NTP assumes symmetric paths for time request and response messages, which is often not true in reality. NTP attempts to reduce precision errors via statistical approaches applied to network jitter and asymmetric paths. Nonetheless, the precision of NTP is still low.

Precision Time Protocol (PTP)

The IEEE 1588 Precision Time Protocol (PTP) [18]1 is an emerging time synchronization protocol that can provide tens to hundreds of nanoseconds of precision in a LAN when properly configured. PTP picks the most accurate clock in a network to be the grandmaster via the best master clock algorithm, and the other clocks synchronize to it. The grandmaster could be connected to an external clock source such as a GPS receiver or an atomic clock. Network devices, including PTP-enabled switches, form a tree with the grandmaster as the root. Then, at each level of the tree, a server or switch behaves as a slave to its parent and a master to its children. When PTP is combined with Synchronous Ethernet, which syntonizes the frequency of clocks (SyncE, see below), PTP can achieve sub-nanosecond precision in a carefully configured environment [86], or hundreds of nanoseconds over tens of hops in back-haul networks [84].

The protocol normally runs as follows: The grandmaster periodically sends timing information (Sync) with IP multicast packets. Upon receiving a Sync message, which contains time t0, each client sends a Delay Req message to the timeserver, which replies with a Delay Resp message. The mechanism of communicating Delay Req and Delay Resp messages is similar to NTP and Figure 2.3. Then, a client computes the offset and adjusts its clock or frequency. If the timeserver is not able to accurately embed t0 in the Sync message, it emits a Follow Up message with t0, after the Sync message, to everyone.

1We use PTPv2 in this discussion.

To improve the precision, PTP employs a few techniques. First, PTP-enabled network switches can participate in the protocol as Transparent clocks or Boundary clocks in order to eliminate switching delays. Transparent clocks timestamp incoming and outgoing packets and correct the time in Sync or Follow Up messages to reflect the switching delay. Boundary clocks are synchronized to the timeserver and work as masters to other PTP clients, and thus provide scalability to PTP networks. Second, PTP uses hardware timestamping in order to eliminate the overhead of the network stack. Modern PTP-enabled NICs timestamp both incoming and outgoing PTP messages [24]. Third, a PTP-enabled NIC has a PTP hardware clock (PHC) in the NIC, which is synchronized to the timeserver. Then, a PTP daemon is synchronized to the PHC [47, 105] to minimize network delays and jitter. Lastly, PTP uses smoothing and filtering algorithms to carefully measure one-way delays.

As we demonstrate in Section 4.3, the precision provided by PTP is about a few hundred nanoseconds at best in a 10 GbE environment, and it can change (decrease) over time even if the network is in an idle state. Moreover, the precision could be affected by network conditions, i.e. variable and/or asymmetric latency can significantly impact the precision of PTP, even when cut-through switches with priority flow control are employed [131, 132]. Lastly, it is not easy to scale the number of PTP clients. This is mainly due to the fact that a timeserver can only process a limited number of Delay Req messages per second [18]. Boundary and Transparent clocks can potentially solve this scalability problem. However, precision errors from Boundary clocks can be cascaded to low-level components of the timing hierarchy tree, and can significantly impact the overall precision [70]. Further, it has been shown that Transparent clocks often do not perform well under network congestion [132], although a correct implementation of Transparent clocks should not degrade the performance under network congestion.

Global Positioning System (GPS)

In order to achieve nanosecond-level precision, GPS can be employed [10, 48].

GPS provides about 100 nanoseconds of precision in practice [83]. Each server can have a dedicated GPS receiver or can be connected to a distribution server through a dedicated link. As each device is directly synchronized to satellites (or atomic clocks) or is connected via a dedicated timing network, network jitter and the software network stack are not an issue.

Unfortunately, GPS-based solutions are not realistic for an entire datacenter. They are neither cost-effective nor scalable, because of the extra cables and GPS receivers required for time signals. Further, GPS signals are not always available in a datacenter, as GPS antennas must be installed on a roof with a clear view of the sky. However, GPS is often used in concert with other protocols such as NTP and PTP, and also DTP.

Other Clock Synchronization Protocols

Because NTP normally does not provide precise clock synchronization in a local area network (LAN), much of the literature has focused on improving NTP without extra hardware. One line of work used the TSC instruction to implement a precise software clock called TSCclock, later renamed RADclock [53, 108, 123]. It was designed to replace ntpd and ptpd (the daemons that run NTP or PTP) and provide sub-microsecond precision without any extra hardware support. Other software clocks include the Server Time Protocol (STP) [104], Coordinated Cluster Time (CCT) [59], AP2P [118], and skewless clock synchronization [94], which provide microsecond precision.

Implementing clock synchronization in hardware has been demonstrated by Fibre Channel (FC) [12] and discussed by Kopetz and Ochsenreiter [75]. FC embeds protocol messages into interpacket gaps, similarly to DTP. However, it is not a decentralized protocol, and the network fabric simply forwards protocol messages between a server and a client using physical layer encodings. As a result, it does not eliminate non-deterministic delays in delivering protocol messages.

Frequency Synchronization

Synchronous optical networking (SONET/SDH) is a standard that transmits multiple bitstreams (such as voice, Ethernet, and TCP/IP) over optical fiber. In order to reduce the buffering of data between network elements, SONET requires precise frequency synchronization (i.e., syntonization). An atomic clock is commonly deployed as a Primary Reference Clock (PRC), and other network elements are synchronized to it either by external timing signals or by recovering clock signals from incoming data. DTP does not synchronize the frequency of clocks, but the values of clock counters.

Synchronous Ethernet (SyncE) [23] was introduced for reliable data transfer between synchronous networks (e.g. SONET/SDH) and asynchronous networks (e.g. Ethernet). Like SONET, it synchronizes the frequency of nodes in a network, not their clocks (i.e. syntonization). It aims to provide a synchronization signal to all Ethernet network devices. The idea is to use the recovered clock from the receive (RX) path to drive the transmit (TX) path, such that both the RX and TX paths run at the same clock frequency. As a result, each Ethernet device uses a phase-locked loop to regenerate the synchronous signal. As SyncE itself does not synchronize clocks in a network, PTP is often employed along with SyncE to provide tight clock synchronization. One such example is White Rabbit, which we discuss below.

Frequency and Clock Synchronization

White Rabbit [101, 80, 86] has by far the best precision in packet-based networks. The goal of White Rabbit (WR) [101] was to synchronize up to 1000 nodes with sub-nanosecond precision. It uses SyncE to syntonize the frequency of the clocks of network devices and WR-enabled PTP [80] to embed the phase difference between a master and a slave into PTP packets. WR demonstrated that the precision of a non-disturbed system was 0.517 ns [86]. WR also requires WR-enabled switches and synchronizes slaves that are up to four hops away from the timeserver. WR works on a network with a tree topology and with a limited number of levels and servers. Furthermore, it supports only 1 Gigabit Ethernet, and it is not clear how WR behaves under heavy network loads as it uses PTP packets. DTP does not rely on any specific network topology, and can be extended to protocols with higher speeds.

Similarly, BroadSync [42] and ChinaMobile [84] also combine SyncE and PTP to provide hundreds of nanoseconds of precision. The Data Over Cable Service Interface Specification (DOCSIS) is a frequency-synchronized network designed to time-divide data transfers between multiple cable modems (CM) and a cable modem termination system (CMTS). The DOCSIS time protocol [46] extends DOCSIS to synchronize time by approximating the internal delay from the PHY and the asymmetrical path delays between a reference CM and the CMTS. We expect that combining DTP with frequency synchronization (SyncE) would also improve the precision of DTP to sub-nanosecond levels, as it becomes possible to minimize or remove the variance of the synchronization FIFO between the DTP TX and RX paths.

Fault Tolerant Clock Synchronization

Synchronizing clocks in the presence of failures has also been studied in distributed systems [51, 64, 77, 78, 92, 116, 119], where processes could fail or show Byzantine behaviors, including reporting different clock values to different processes ("two-faced" clocks). These protocols can tolerate some number of faulty processes, as long as there are enough non-faulty processes, by periodically communicating their time with each other and, as a result, converging to a common time. These protocols assume an upper limit on message transmission delays when reading other processes' clocks, use different convergence functions, and thus provide different levels of precision. There are also some other protocols without upper bounds on message delivery that use timeouts to detect failures [63, 96]. Cristian [50] discussed external synchronization between a time server and multiple clients and used the characteristics of the distribution of delays to deal with network jitter.

CHAPTER 6 FUTURE WORK

In this dissertation, we described two systems that provide three fundamental capabilities for network measurements: precise timestamping, pacing, and clock synchronization. Further, we illustrated how they can improve network applications, such as creating and detecting covert timing channels and estimating available bandwidth. In the future, we plan to perform network measurements to understand the behavior of NICs, switches, routers, and distributed systems, and to improve the performance of network applications that can benefit from precise timestamping, pacing, and clock synchronization.

Further, there are many interesting research directions. First, it is possible to combine DTP with Synchronous Ethernet (SyncE) to achieve better precision. Second, DTP can be used to implement a packet scheduler that achieves high-throughput and low-latency packet flows. Lastly, by combining SoNIC, DTP, and a programmable dataplane (P4), we can build a network measurement tool with customizable packet processing algorithms. We explore each idea in this chapter.

6.1 Synchronous DTP

As briefly discussed in Chapter 5, the goal of SyncE is to synchronize the frequency of clocks, and the goal of DTP and PTP is to synchronize the time of clocks. White Rabbit combines SyncE and PTP and achieves sub-nanosecond precision in a carefully configured environment. Similarly, it is possible to combine DTP with SyncE. Using SyncE essentially means that the synchronization FIFO between the DTP RX path and the DTP TX path is no longer necessary, and it would allow clocks to be synchronized precisely at sub-nanosecond scale, as in White Rabbit. Implementing SyncE on an FPGA board requires a high-performance Phase Locked Loop (PLL) for jitter attenuation, which is not readily available in current FPGA development boards. As a result, combining SyncE and DTP requires carefully designing a new FPGA board.

6.2 Packet Scheduler

DTP provides a globally synchronized time among network devices, including NICs and switches, and SoNIC provides precise timestamping and pacing. Further, DTP can also provide a precise RTT and a lower bound on the OWD between two nodes in a network. The properties that DTP and SoNIC provide can be used to implement various types of packet schedulers in a network. First, a centralized controller can control when an end-host can send a packet through the network. Fastpass [110] showed that this approach could improve the throughput of TCP streams and reduce switch queue sizes in a datacenter network. Fastpass relies on PTP and a software-based pacing algorithm [121], which provides microsecond-scale precision while pacing packets. We expect that using DTP and SoNIC can potentially allow more fine-grained control and achieve better throughput. Second, coordination-free time-division scheduling can be implemented, where the physical layer of each participating node can transmit one packet to one destination per time slot. This type of scheduling could be useful for rack-scale systems where thousands of micro-servers are connected via simple switches without buffers. Lastly, a precise RTT could lead to high throughput and low latency in transport layer protocols (TCP) in a datacenter network [99]. TIMELY uses RTT as an indicator of congestion in a network and exploits hardware support from NICs to measure it. Similarly, TCP can use the OWD offered by DTP and the pacing offered by SoNIC for congestion control.

6.3 SoNIC, DTP, and P4

Testing network devices or network protocols is not easy. For example, measuring the packet processing latency of a switch under stress requires a multi-port packet generator that is able to send traffic at line speed with arbitrary packet sizes, and a multi-port packet capturer that is able to capture traffic at line speed. The generator and capturer could be a single machine or multiple machines that are synchronized. Further, generating protocol-specific packets requires programmability in the packet generator. Unfortunately, commercial network testers are expensive and are not readily available to network researchers. The Open Source Network Tester (OSNT) [38] is an effort to provide an open-source packet generator and capturer for testing various network components using NetFPGA 10G cards. Multiple OSNT devices could also be synchronized via GPS signals or PTP.

Similarly, we can implement a network tester that combines DTP, SoNIC, and P4 [28]. In particular, the physical layer of the tester combines DTP and SoNIC. Note that SoNIC is a software implementation of the physical layer with complete control over every bit. However, achieving the same level of control and access in a hardware implementation of the physical layer is not trivial. It is necessary to define a new set of APIs that allow users to control interpacket gaps by specifying the number of /I/s, and to capture interpacket gaps and delays. By doing so, arbitrary traffic patterns could be generated and captured. Further, P4 provides programmability in the dataplane. Many protocols that current network switches provide can be expressed using the P4 language. As a result, integrating P4 into the network tester can also provide flexibility for testing various protocols.
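A purely hypothetical sketch of what such an interpacket-gap API could look like; none of these functions exist in SoNIC or DTP today, and the names and signatures are illustrative only.

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical handle to one port of a combined SoNIC/DTP tester. */
    struct tester_port;

    /* Transmit a packet followed by exactly `idles` /I/ characters before the
     * next packet, giving bit-level control over the interpacket gap. */
    int tester_send(struct tester_port *p, const void *pkt, size_t len,
                    uint64_t idles);

    /* Capture the next packet together with the number of /I/ characters that
     * preceded it and the DTP timestamp at which its first bit arrived. */
    int tester_recv(struct tester_port *p, void *buf, size_t buflen,
                    uint64_t *idles_before, uint64_t *dtp_timestamp);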

CHAPTER 7 CONCLUSION

Precise network measurements are fundamental for network research, monitoring, and applications. For instance, if no clock differs by more than 100 nanoseconds, as opposed to 1 microsecond, one-way delay can be measured much more precisely owing to the tighter synchronization. Unfortunately, it is challenging to perform network measurements precisely, because it is difficult to precisely access and control time in a network of computer systems.

In this dissertation, we investigated how to improve the precision of timestamping, pacing, and clock synchronization. In particular, we discussed how to augment the physical layer of a network stack to provide precise timing information. The principal observation was that when two physical layers are connected, each physical layer continuously generates a sequence of bits for clock and bit recovery and for maintaining link connectivity. Then, controlling and accessing every single bit in the physical layer allows precise timestamping and pacing. Further, precise timestamping and pacing can improve the performance of network applications. Similarly, exploiting the bit-level synchronization in the physical layer allows precisely synchronized clocks with bounded precision. Our approach is an advance in the state-of-the-art in network measurements.

In order to validate the idea of the augmented physical layer, we presented the design and implementation of multiple systems that provide and exploit precise time. SoNIC provides precise timestamping and pacing by implementing the physical layer in software. A covert timing channel, Chupja, and an available bandwidth estimation algorithm, MinProbe, are two network applications that demonstrate how precise timestamping and pacing can improve the performance of network applications. Chupja can deliver hundreds of thousands of bits per second while avoiding detection, and MinProbe can accurately estimate available bandwidth in a high-speed network. DTP, the Datacenter Time Protocol, provides precise clock synchronization by running a clock synchronization protocol in the physical layer. DTP can synchronize clocks in a datacenter network with bounded precision at hundreds-of-nanoseconds scale. Precise access to and control of time via the approaches exemplified in this dissertation mark an important step towards precise network measurements.

APPENDIX A PHY TUTORIAL

In this dissertation, we demonstrated that access to the physical layer of a network stack could improve the performance of network measurements. Yet the physical layer is often considered a black box, and how it works is generally unknown. The tutorial below was designed to help undergraduate students in computer science understand how the physical layer works by implementing the physical layer in software.

A.1 Physical layer: Encoder and Scrambler

A.1.1 Introduction

According to the IEEE 802.3 standard, the PHY layer of 10 GbE consists of three sublayers: the Physical Coding Sublayer (PCS), the Physical Medium Attachment (PMA) sublayer, and the Physical Medium Dependent (PMD) sublayer (see Figure A.1). The PMD sublayer is responsible for transmitting the outgoing symbolstream over the physical medium and receiving the incoming symbolstream from the medium. The PMA sublayer is responsible for clock recovery and (de-)serializing the bitstream. The PCS performs the blocksync and gearbox (we call this PCS1), scramble/descramble (PCS2), and encode/decode (PCS3) operations on every Ethernet frame.

When Ethernet frames are passed to the PHY layer, they are reformatted before being sent across the physical medium. On the transmit (TX) path, the PCS encodes every 64 bits of an Ethernet frame into a 66-bit block (PCS3), which consists of a two-bit synchronization header (syncheader) and a 64-bit payload. As a result, a 10 GbE link actually operates at 10.3125 Gbaud (10 G × 66/64). The PCS also scrambles each block (PCS2) and adapts the 66-bit width of the block to the 16-bit width of the PMA interface (PCS1) before passing it down the network stack to be transmitted. The entire 66-bit block is transmitted as a continuous stream of symbols which a 10 GbE network transmits over a physical medium (PMA & PMD). On the receive (RX) path, the PCS performs block synchronization based on the two-bit syncheaders (PCS1) and descrambles each 66-bit block (PCS2) before decoding it (PCS3).

We will implement the TX path of the PCS, specifically the encoder and scrambler, today, and the RX path of the PCS (the decoder and descrambler) tomorrow. Let us illustrate how the PCS sublayer works in detail.

Figure A.1: 10G Network Stack

A.1.2 Encoding

In this section, for a better understanding of how 64b/66b encoding works, we will use two sample packets: "Welcome to Ithaca!!!" (P1) and "SoNIC Workshop" (P2). For simplicity, we will not discuss preambles, Ethernet headers, and checksums.

Once the PCS receives a packet from higher network layers, the packet is reformatted into a sequence of one /S/ block (start of an Ethernet frame), multiple data blocks, and one /T/ block (end of an Ethernet frame). The PCS inserts /E/ blocks (idle frames) between packets if necessary. /S/, /T/, and /E/ blocks are called control blocks, and their syncheaders are always '0x1', while the syncheader of data blocks is always '0x2'. A control block includes an 8-bit block type field that indicates the type of the block, such as /S/, /T/, or /E/. /E/ consists of eight idle characters, /I/, which are used to fill the inter-packet gaps. The standard requires that there be at least twelve /I/s between any two packets. Figure A.2 illustrates all possible blocktypes.

Let’s assume we picked the fifth row of the table for our /S/ of P1, i.e. /S/ with the blocktype 0x78. Then, the control block will be constructed as in the first row of Figure A.3. Notice that the byte ordering is reversed to follow the little-endian format of x86 processors. A data block solely consists of eight data bytes (64 bits) with a 0x2 syncheader. Therefore, a data block of “ to Itha” will be constructed as in the second row of Figure A.3. Finally, as we now have five data bytes left, we can use the eleventh row of Figure A.2 for our /T/ block, i.e. /T/ with the blocktype 0xd2. The rest of the block is filled with /I/s; an /I/ is a 7-bit control character whose value is zero. In our case, we can insert two /I/s in the /T/ block (the third row of Figure A.3).

Now, we have to insert at least twelve /I/s between the two packets to conform to the IEEE 802.3 standard. Interestingly, the blocktype field of a /T/ block is also counted as one /I/ character. Therefore, as we have inserted three /I/s so far, we just need to insert nine more /I/s. A special control block /E/ can be used to insert eight /I/s (the third row of Figure A.2 and the fourth row of Figure A.3). Since we now have to insert at least one more /I/, we can pick /S/ with 0x33 (the fourth row of Figure A.2) to encode the next packet P2, since this /S/ block contains four /I/s. The final sequence of 66-bit blocks will look like Figure A.3.

Figure A.2: 66b block format

Figure A.3: Example
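As a concrete illustration of the byte ordering above, the sketch below packs the first seven payload bytes of P1 into the 64-bit payload of an /S/ block with blocktype 0x78, mirroring the shift-and-OR steps of the encode() pseudo code in Appendix A.3. The helper name make_start_block is made up for this example and is not part of the workshop skeleton; a little-endian x86 host is assumed.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Pack the first 7 bytes of a packet into the payload of an /S/ block with
     * blocktype 0x78: data bytes occupy bits 8..63, the blocktype bits 0..7. */
    static uint64_t make_start_block(const uint8_t *p)
    {
        uint64_t payload = 0;
        memcpy(&payload, p, 7);          /* little-endian host: byte 0 lands in bits 0..7 */
        return (payload << 8) | 0x78;    /* shift data up, blocktype into the lowest byte */
    }

    int main(void)
    {
        const uint8_t *p1 = (const uint8_t *)"Welcome to Ithaca!!!";
        /* On a little-endian machine this prints 0x656d6f636c655778:
         * the bytes of "Welcome" read back-to-front, then the blocktype 0x78. */
        printf("/S/ block payload: 0x%016llx\n",
               (unsigned long long)make_start_block(p1));
        return 0;
    }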

A.1.3 Scrambler

The scrambler runs continuously over all block payloads, while the syncheader bypasses the scrambler. The payload of each block (i.e., the 64 data bits) is scrambled with the following equation:

$G(x) = 1 + x^{39} + x^{58}$

where $x^n$ is the $n$-th least significant bit (LSB) of the internal state of the scrambler, the ‘1’ term represents the input bit, and $+$ is the XOR operation.

Figure A.4: Scrambler

To understand what this equation means, let’s assume that the initial state of the scrambler is 0x03ff ffff ffff ffff and the input block is 0xdead beef dead beed. Notice that both $x^{58}$ and $x^{39}$ of the scrambler are ‘1’. Now, starting from the LSB of the input block, the scrambler takes one bit at a time to calculate the corresponding output bit. For example, the LSB of the input block, ‘1’, is XORed with $x^{58} = 1$ and $x^{39} = 1$ to generate ‘1’, the LSB of the output block. Then, all bits of the scrambler state shift to the left by one bit (0x07ff ffff ffff fffe), and the state is XORed with the output of the previous computation, ‘1’ (0x07ff ffff ffff ffff). $x^{58}$ and $x^{39}$ are again ‘1’ and are XORed with the second LSB of the input block, ‘0’, to produce the second LSB of the output block, ‘0’. The scrambler continues this process until the entire input block is scrambled.
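The walk-through above translates directly into the bit-serial scrambler() pseudo code of Appendix A.3. The small stand-alone C program below is a sketch of that procedure (the function and variable names are ours); it reproduces the example by starting from the state 0x03ffffffffffffff and scrambling the block 0xdeadbeefdeadbeed.

    #include <stdint.h>
    #include <stdio.h>

    /* Bit-serial 64b/66b scrambler: one payload bit at a time, LSB first.
     * Bits 38 and 57 of the state hold the 39th and 58th previously produced
     * output bits, which are XORed with the incoming bit. */
    static uint64_t scramble64(uint64_t *state, uint64_t payload)
    {
        uint64_t scrambled = 0;
        for (int i = 0; i < 64; i++) {
            uint64_t in_bit  = (payload >> i) & 0x1;
            uint64_t out_bit = (in_bit ^ (*state >> 38) ^ (*state >> 57)) & 0x1;
            *state = (*state << 1) | out_bit;   /* newest output bit enters the state */
            scrambled |= out_bit << i;
        }
        return scrambled;
    }

    int main(void)
    {
        uint64_t state = 0x03ffffffffffffffULL;        /* 58 state bits, all ones */
        uint64_t out = scramble64(&state, 0xdeadbeefdeadbeedULL);
        printf("scrambled = 0x%016llx\n", (unsigned long long)out);
        return 0;
    }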

A.1.4 Task

You will implement a black box that functions as a PCS encoder/scrambler today; the black box encodes a sequence of packets given as input and generates an array of scrambled 66-bit blocks. By default, the black box inserts 12 Idle characters between packets, which you can change with the "-g" option.

For example,

$> ./encode -i packets
10, 0x0123456789abcdef

...

We will provide skeleton source code, and you are required to fill in at least two functions, scrambler and encode, with your partner. In addition, a decoder and sample packets are provided to check the correctness of your implementation.

Scrambler

First, you will need to implement the scrambler. You can check the correctness of your scrambler by running:

$> ./encode -d 01, a1fe788405060708

01, 60a77dbee226551e state = 78aa64477dbee506

If your output matches the above, your scrambler is working correctly.

Encoder

Now, you need to implement the Encoder. For each packet, generate /S/, data blocks, /T/, and /E/ if necessary, in sequence. Make sure to insert the correct number of /I/s between packets (if you cannot meet the exact number of /I/s, keep the number of extra /I/s to a minimum). After implementing the Encoder, try to run the following:

$> make fun

which will display something on your screen. If you can recognize it, you have implemented the encoder correctly.

Note: Think carefully about the byte/bit ordering.

A.2 Physical Layer: Decoder and Descrambler

A.2.1 Introduction

Today, we will implement the Decoder and the Descrambler.

A.2.2 Task

Descrambler

First, you will need to implement the descrambler. The descrambler is implemented as illustrated in Figure A.5.

Figure A.5: Descrambler

How does it differ from the scrambler?

To check the correctness of your descrambler, run the following:

$> ./decode -d

0102030405060708 090a0b0c0d0e0f00 state = 78aa64477dbee506

If your output matches the above, your descrambler is correct.

Decoder

The decoder does two things: it recovers the original Ethernet frames and counts the number of /I/s between frames. After descrambling a block, the decoder should check the syncheader first to distinguish data blocks from control blocks. For control blocks, your decoder must perform the appropriate actions based on the block type. As a reminder, an Ethernet frame starts at an /S/ block and ends at a /T/ block. In addition, the decoder must count the number of /I/s preceding an Ethernet frame (i.e., start counting when you see /T/ and stop when /S/ appears). Call print_decoded once the Ethernet frame is recovered.

After implementing the Decoder, try to run the following:

$> make fun

Discussion

Try to run the following:

$> make fun IDLE=100

What happened? The command above changes the number of /I/s between packets to 100. You can actually see the /I/s by running the following command, which changes idle characters to ‘x’.

$> make fun IDLE=100 IDLEC="x"

Run the above command again changing 100 to different values (larger or smaller). What does this imply in terms of throughput?
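As a hint for the discussion, ignoring preambles and headers (as this tutorial does), each /I/ character occupies roughly one byte-time on the wire, so the fraction of the link rate that carries frame data is approximately

$\text{goodput} \approx 10\,\text{Gbps} \times \frac{L}{L + G}$

where $L$ is the frame length in bytes and $G$ is the number of /I/ characters between frames. With IDLE=100 and the short frames used here, most of the line time is spent on idle characters, so throughput drops accordingly.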

A.3 Pseudo C code for 64b/66b Encoder and Decoder

  1  struct Frame {
  2      uint32_t idles;    // # of idles preceding this frame
  3      uint8_t *data;
  4      uint32_t len;
  5  };
  6
  7  uint64_t scrambler (uint64_t *state, uint8_t sh, uint64_t payload)
  8  {
  9      for ( i = 0 ; i < 64 ; i ++) {
 10          in_bit = (payload >> i) & 0x1;
 11          out_bit = (in_bit ^ (*state >> 38) ^ (*state >> 57)) & 0x1;
 12          *state = (*state << 1) | out_bit;
 13          scrambled |= (out_bit << i);
 14      }
 15
 16      // pass syncheader and scrambled to pma
 17      return scrambled;
 18  }
 19
 20  uint64_t scrambler_opt (uint64_t *state, uint8_t sh, uint64_t payload)
 21  {
 22      scrambled = (*state >> 6) ^ (*state >> 25) ^ payload;
 23      *state = scrambled ^ (scrambled << 39) ^ (scrambled << 58);
 24
 25      // pass syncheader and scrambled to pma
 26      return scrambled;
 27  }
 28
 29  // Encode an Ethernet frame
 30  void encode (struct Frame *frame, uint64_t *state)
 31  {
 32      terminal[8] = {0x87, 0x99, 0xaa, 0xb4, 0xcc, 0xd2, 0xe1, 0xff};
 33      data = frame->data;
 34      len = frame->len;
 35      idles = frame->idles;
 36
 37      /* /E/ */
 38      while (idles >= 8) {
 39          scrambler(state, 0x1, e_frame);
 40          idles -= 8;
 41      }
 42
 43      /* /S/ */
 44      tmp = 0;
 45      if (idles > 0 && idles <= 4) {
 46          block_type = 0x33;
 47          tmp = *(uint64_t *) data;
 48          tmp <<= 40;
 49          tmp |= block_type;
 50          data += 3;
 51          len -= 3;
 52      } else {
 53          if (idles != 0)
 54              scrambler(state, 0x1, e_frame);
 55
 56          block_type = 0x78;
 57          tmp = *(uint64_t *) data;
 58          tmp <<= 8;
 59          tmp |= block_type;
 60          data += 7;
 61          len -= 7;
 62      }
 63      state = scrambler(state, 0x1, tmp);
 64
 65      /* /D/ */
 66      while ( len >= 8) {
 67          tmp = *(uint64_t *) data;
 68          state = scrambler(state, 0x2, tmp);
 69          data += 8;
 70          len -= 8;
 71      }
 72
 73      /* /T/ */
 74      block_type = terminal[len];
 75      tmp = 0;
 76      if (len != 0) {
 77          tmp = *(uint64_t *) data;
 78          tmp <<= (8-len) * 8;
 79          tmp >>= (7-len) * 8;
 80      }
 81      tmp |= block_type;
 82
 83      state = scrambler(state, 0x1, tmp);
 84  }
 85
 86  struct block66b {
 87      uint8_t syncheader;
 88      uint64_t payload;
 89  };
 90
 91  uint64_t descrambler (uint64_t *state, uint64_t payload)
 92  {
 93      for ( i = 0 ; i < 64 ; i ++ ) {
 94          in_bit = (payload >> i) & 0x1;
 95          out_bit = (in_bit ^ (*state >> 38) ^ (*state >> 57)) & 0x1;
 96          *state = (*state << 1) | in_bit;
 97          descrambled |= (out_bit << i);
 98      }
 99
100      return descrambled;
101  }
102
103  uint64_t descrambler_opt (uint64_t *state, uint64_t payload)
104  {
105      descrambled = (*state >> 6) ^ (*state >> 25) ^ payload;
106      *state = descrambled ^ (payload << 39) ^ (payload << 58);
107
108      return descrambled;
109  }
110
111  void decode(struct block66b block, struct Frame *frame, uint64_t *state)
112  {
113      descrambled = descrambler(state, block.payload);
114
115      p = frame->data;
116      tail = 0;
117
118      /* /D/ */
119      if (block.sync_header != 1) {
120          * (uint64_t *) p = descrambled;
121          p += 8;
122          frame->len += 8;
123      /* control block */
124      } else {
125          switch(descrambled & 0xff) {
126          /* /S/ */
127          case 0x33:
128              frame->idles += 4;
129              descrambled >>= 40;
130              * (uint64_t *) p = descrambled;
131              p += 3;
132              frame->len += 3;
133              break;
134          case 0x78:
135              descrambled >>= 8;
136              * (uint64_t *) p = descrambled;
137              p += 7;
138              frame->len += 7;
139              break;
140          /* /T/ */
141          case 0xff:
142              tail ++;
143          case 0xe1:
144              tail ++;
145          case 0xd2:
146              tail ++;
147          case 0xcc:
148              tail ++;
149          case 0xb4:
150              tail ++;
151          case 0xaa:
152              tail ++;
153          case 0x99:
154              tail ++;
155          case 0x87:
156              descrambled >>= 8;
157              * (uint64_t *) p = descrambled;
158              frame->len += tail;
159          /* /E/ */
160          case 0x1e:
161              frame->idles += 8;
162              break;
163          }
164      }
165  }

APPENDIX B
FAST CRC ALGORITHM

A cyclic redundancy check (CRC) is widely used for detecting bit errors. For example, the Media Access Control (MAC) layer of the network stack computes a CRC value of each Ethernet frame and appends it to the end of each frame so that the receiver can detect any bit errors by re-computing the CRC value of the received frame. In this section, we discuss how to improve the performance of CRC computation in order to meet the performance requirements for implementing SoNIC.

Mathematically, the CRC value of an Ethernet frame is the remainder of the binary polynomial of the frame divided by a CRC polynomial. A polynomial here is defined over the Galois field GF(2) (a field with a finite number of elements), where the first bit of a frame corresponds to the $x^{n-1}$ term and the last bit to the $x^0$ term. The IEEE 802.3 standard defines the 32-bit CRC value of an Ethernet frame MSG as

$\mathrm{CRC} = \mathrm{MSG} \cdot x^{32} \;\%\; P$

where ‘·’ is carry-less multiplication, % is the modulo operator, and P is 0x104C11DB7 [19]. A CRC engine is simple to implement in hardware, but difficult to make efficient in software. Table lookup algorithms and parallel updates of 8 bits are commonly used to compute a CRC value in software. The Linux kernel implements both of these algorithms, but neither of them is fast enough for 10 GbE, as we showed in Section 3.2.1, especially for small packets.
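The definition above can be computed directly by bit-serial polynomial division. The sketch below (our own illustration, not SoNIC code) feeds message bits MSB first and then appends 32 zero bits for the $x^{32}$ factor; it shows only the polynomial arithmetic, while the IEEE 802.3 frame CRC additionally complements and bit-reverses data, which is omitted here.

    #include <stdint.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Remainder of MSG * x^32 divided by P over GF(2), with the first bit of
     * the message as the highest-degree term. P = 0x104C11DB7; its x^32 term
     * is implicit, so only the low 32 bits (0x04C11DB7) are XORed in. */
    static uint32_t crc32_poly_div(const uint8_t *msg, size_t len)
    {
        uint32_t rem = 0;
        for (size_t i = 0; i < len; i++) {
            for (int b = 7; b >= 0; b--) {           /* MSB of each byte first */
                uint32_t top = rem >> 31;
                rem = (rem << 1) | ((msg[i] >> b) & 1);
                if (top)
                    rem ^= 0x04C11DB7u;              /* subtract (XOR) P       */
            }
        }
        for (int b = 0; b < 32; b++) {               /* multiply by x^32       */
            uint32_t top = rem >> 31;
            rem <<= 1;
            if (top)
                rem ^= 0x04C11DB7u;
        }
        return rem;
    }

    int main(void)
    {
        const uint8_t msg[] = "SoNIC";
        printf("remainder per the definition above: 0x%08x\n",
               crc32_poly_div(msg, sizeof(msg) - 1));
        return 0;
    }

Table-driven and PCLMULQDQ-based methods, including the Fast CRC algorithm described next, compute the same remainder but process many bits per step.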

In order to improve the performance of software-based CRC implementations, the Fast CRC algorithm [61] from Intel uses the PCLMULQDQ instruction, which performs carry-less multiplication of two 64-bit quadwords [62]. The algorithm folds a large chunk of data into a smaller chunk using the PCLMULQDQ instruction to efficiently reduce the size of the data. Here we only discuss the basic idea of how the algorithm works.

Let $H(x)$, $L(x)$, and $G(x)$ be the most significant 64-bit, second most significant 64-bit, and remaining $T$-bit ($T \ge 128$) chunks of MSG:

$\mathrm{MSG} = H(x) \cdot x^{T+64} + L(x) \cdot x^{T} + G(x)$

where $+$ is the XOR operation. Then, 128 bits can be folded while maintaining the same CRC value because of the following fact:

$\mathrm{MSG} \;\%\; P = \left[ \{H(x) \cdot (x^{T+64} \,\%\, P)\} + \{L(x) \cdot (x^{T} \,\%\, P)\} + G(x) \right] \%\; P$

If $T$ is known, the 32-bit constants $c_1 = x^{T+64} \,\%\, P$ and $c_2 = x^{T} \,\%\, P$ can be pre-computed. Then, the results of $H(x) \cdot c_1$ and $L(x) \cdot c_2$ are both 96-bit numbers. With this idea, the algorithm iteratively folds the message by 512 bits or 128 bits into a small 64-bit chunk, which is then passed to the Barrett reduction algorithm [40] for the final CRC computation. We adapted this algorithm and implemented it using inline assembly in C, with a few optimizations for smaller packets. We present the pseudocode implementation of the algorithm below; the details of the algorithm are discussed in [61].

B.1 Pseudo assembly code for the Fast CRC algorithm

static uint64_t k[] = {
    0x653d9822, 0x111a288ce,    // k1, k2
    0x65673b46, 0x140d44a2e,    // k3, k4
    0xccaa009e, 0x163cd6124,    // k5, k6
    0x1f7011641, 0x1db710641,   // u, p
};

// len must be >= 64
uint32_t fast_crc(uint8_t *data, int len)
{
    uint64_t tmp[6];

    __asm__ (
        movdqa xmm15, k[0];    // k1 | k2
        movdqa xmm14, k[2];    // k3 | k4
        movdqa xmm13, k[4];    // k5 | k6
        movdqa xmm12, k[6];    // u | p
        movq rcx, data;
        movl ebx, len;

        // first 32bit needs to be complemented
        movl ecx, $0xffffffff;
        movq xmm4, rcx4;
        movdqa xmm2, [rcx];
        pxor xmm4, xmm2;

        movdqa xmm5, [rcx+16];
        movdqa xmm6, [rcx+32];
        movdqa xmm7, [rcx+48];

        add rcx, 64;
        sub rbx, 64;
        cmp rbx, 64;
        jl fold_by_4_done;

        // while (len >= 128) folding by 4
    fold_by_4_begin:
        movdqa xmm0, xmm4;
        movdqa xmm8, [rcx];
        pclmulqdq xmm4, xmm15, 0x00;
        pclmulqdq xmm0, xmm15, 0x11;
        pxor xmm4, xmm0;
        pslldq xmm4, 4;
        pxor xmm4, xmm8;
        movdqa xmm1, xmm5;
        movdqa xmm9, [rcx+16];
        pclmulqdq xmm5, xmm15, 0x00;
        pclmulqdq xmm1, xmm15, 0x11;
        pxor xmm5, xmm1;
        pslldq xmm5, 4;
        pxor xmm5, xmm9;
        movdqa xmm2, xmm6;
        movdqa xmm10, [rcx+32];
        pclmulqdq xmm6, xmm15, 0x00;
        pclmulqdq xmm2, xmm15, 0x11;
        pxor xmm7, xmm2;
        pslldq xmm6, 4;
        pxor xmm6, xmm10;
        movdqa xmm3, xmm7;
        movdqa xmm11, [rcx+48];
        pclmulqdq xmm7, xmm15, 0x00;
        pclmulqdq xmm3, xmm15, 0x11;
        pxor xmm8, xmm3;
        pslldq xmm7, 4;
        pxor xmm7, xmm11;

        sub rbx, 64;
        add rcx, 64;
        cmp rbx, 64;
        jge fold_by_4_begin;

    fold_by_4_done:
        movdqa xmm2, xmm4;
        pclmulqdq xmm4, xmm14, 0x00;
        pclmulqdq xmm2, xmm14, 0x11;
        pxor xmm4, xmm2;
        pslldq xmm4, 4;
        pxor xmm4, xmm5;
        movdqa xmm2, xmm4;
        pclmulqdq xmm4, xmm14, 0x00;
        pclmulqdq xmm2, xmm14, 0x11;
        pxor xmm4, xmm2;
        pslldq xmm4, 4;
        pxor xmm4, xmm6;
        movdqa xmm2, xmm4;
        pclmulqdq xmm4, xmm14, 0x00;
        pclmulqdq xmm2, xmm14, 0x11;
        pxor xmm4, xmm2;
        pslldq xmm4, 4;
        pxor xmm4, xmm7;

        cmp rbx, 16;
        jl fold_by_1_done;

    fold_by_1_begin:
        movdqa xmm2, xmm4;
        movdqa xmm1, [rcx];
        pclmulqdq xmm4, xmm14, 0x00;
        pclmulqdq xmm2, xmm14, 0x11;
        pxor xmm4, xmm2;
        pslldq xmm4, 4;
        pxor xmm4, xmm1;

        sub rbx, 16;
        add rcx, 16;
        cmp rbx, 16;
        jge fold_by_1_begin;

    fold_by_1_done:
        cmp rbx, 0;
        je final_reduction;

        movdqa xmm1, [rcx];
        pxor xmm3, xmm3;
        movdqu [tmp], xmm3;
        movdqu [tmp+16], xmm3;
        movdqu [tmp+32], xmm3;
        movdqu [tmp+32], xmm1;
        movdqu [tmp+16], xmm4;

        movdqu xmm1, [tmp+rbx];
        movdqu xmm2, [tmp+rbx+16];

        movaps xmm4, xmm1;
        pclmulqdq xmm1, xmm14, 0x00;
        pclmulqdq xmm4, xmm14, 0x11;
        pxor xmm4, xmm1;
        pslldq xmm4, 4;
        pxor xmm4, xmm2;

    final_reduction:
        movdqa xmm2, xmm4;
        pxor xmm3, xmm3;
        punpckldq xmm4, xmm3;
        movdqa xmm1, xmm4;
        pclmulqdq xmm4, xmm13, 0x00;
        pclmulqdq xmm1, xmm13, 0x11;
        psrldq xmm2, 8;
        pxor xmm4, xmm2;
        pxor xmm4, xmm1;

        // Barrett Reduction algorithm
        movdqa xmm2, xmm4;
        pslldq xmm4, 4;
        pclmulqdq xmm4, xmm12, 0x00;
        pclmulqdq xmm4, xmm12, 0x10;
        psrldq xmm4, 4;
        pxor xmm4, xmm2;
        pextrd crc, xmm4, 1;
    );
}
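For readers who prefer compiler intrinsics to raw assembly, the core fold-by-1 step above can be sketched as follows. This is only an illustration of one folding step, not a drop-in replacement for the full routine; the function and parameter names are ours, and fold_consts stands for the k3|k4 constant pair that the listing keeps in xmm14.

    #include <stdint.h>
    #include <emmintrin.h>
    #include <wmmintrin.h>   /* _mm_clmulepi64_si128; compile with -msse2 -mpclmul */

    /* One fold-by-1 step, mirroring fold_by_1_begin above: carry-less multiply
     * the two 64-bit halves of the accumulator by the fold constants, combine
     * them, shift by four bytes, and XOR in the next 16 bytes of the message. */
    static inline __m128i fold_by_1(__m128i acc, __m128i fold_consts,
                                    const uint8_t *next16)
    {
        __m128i next = _mm_loadu_si128((const __m128i *)next16);
        __m128i lo = _mm_clmulepi64_si128(acc, fold_consts, 0x00);  /* low  x low  */
        __m128i hi = _mm_clmulepi64_si128(acc, fold_consts, 0x11);  /* high x high */
        acc = _mm_xor_si128(lo, hi);      /* pxor   xmm4, xmm2 */
        acc = _mm_slli_si128(acc, 4);     /* pslldq xmm4, 4    */
        return _mm_xor_si128(acc, next);  /* pxor   xmm4, xmm1 */
    }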

APPENDIX C
OPTIMIZING SCRAMBLER

The scrambler is used in the physical coding sublayer of the physical layer of the network stack. It randomizes bits to eliminate long sequences of the same symbol before they are transmitted. In this section, we discuss how we optimized the scrambler in order to meet the performance requirements for implementing SoNIC.

The 64b/66b encoder implements a self-synchronous scrambler in which the prior 58 bits of the transmitted data form the state of the scrambler. The following binary polynomial, in which the most recent bit is the lowest-order term by convention, implements the scrambler of the 64b/66b encoder:

$G(x) = 1 + x^{39} + x^{58}$   (C.1)

where the polynomial is defined over the Galois field, GF(2), and $+$ is the XOR operation. Considering that $x^{39}$ is equal to the 39th output bit preceding the current one, and $x^{58}$ to the 58th output bit, we can rewrite the equation as

$y_n = x_n + y_{n-39} + y_{n-58}$

where $y_i$ is the $i$-th output bit of a 64-bit payload, and $x_i$ is the $i$-th input bit. Both $x$ and $y$ are 64-bit data. Let $s_i$ denote $y_{i-64}$; then the equation can be rewritten more generally as

$y_n = x_n + s_{n+25} + s_{n+6}$

For an input block $x$ where $0 \le n < 64$, the output block $y$ can be divided into three small blocks:

$y_{0\cdots38} = x_{0\cdots38} + s_{25\cdots63} + s_{6\cdots44}$   (C.2)
$y_{39\cdots57} = x_{39\cdots57} + s_{64\cdots82} + s_{45\cdots63} = x_{39\cdots57} + y_{0\cdots18} + s_{45\cdots63}$
$y_{58\cdots63} = x_{58\cdots63} + s_{83\cdots88} + s_{64\cdots69} = x_{58\cdots63} + y_{19\cdots24} + y_{0\cdots5}$

Let $s'_i$ be $s_{i+6}$ (a 6-bit shift to the right), $s''_i$ be $s_{i+25}$ (a 25-bit shift to the right), $y'_i$ be $y_{i-39}$ (a 39-bit shift to the left), and $y''_i$ be $y_{i-58}$ (a 58-bit shift to the left). Note that $y'$ and $y''$ can be known after Equation C.2 is computed. Then the above equations become

$y_{0\cdots38} = x_{0\cdots38} + s''_{0\cdots38} + s'_{0\cdots38}$
$y_{39\cdots57} = x_{39\cdots57} + y'_{39\cdots57} + s'_{39\cdots57}$
$y_{58\cdots63} = x_{58\cdots63} + y'_{58\cdots63} + y''_{58\cdots63}$

Finally, considering that the shift operations pad zeros, $s'_{58\cdots63}$, $s''_{39\cdots63}$, $y'_{0\cdots38}$, and $y''_{0\cdots57}$ are all zeros. This fact leads to the following equations:

$y_{0\cdots38} = x_{0\cdots38} + s''_{0\cdots38} + s'_{0\cdots38}$
$y_{39\cdots57} = x_{39\cdots57} + (s''_{39\cdots57}) + s'_{39\cdots57} + y'_{39\cdots57} + (y''_{39\cdots57})$
$y_{58\cdots63} = x_{58\cdots63} + (s''_{58\cdots63} + s'_{58\cdots63}) + y'_{58\cdots63} + y''_{58\cdots63}$

where the values in parentheses are zeros. The first three terms on the right-hand side of each equation ($x$, $s'$, and $s''$) can be computed all together, and then $y'$ and $y''$ can be XORed in to produce the final result. In other words,

$t = x + s'' + s' = x + (s \gg 25) + (s \gg 6)$
$y = t + y' + y'' = t + (t \ll 39) + (t \ll 58)$

The result $y$ is the final output block and becomes $s$ for the next block computation. The actual implementation in C is shown in Lines 20∼27 of Appendix A.3.
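As a quick sanity check of the derivation (this check is ours, not part of the dissertation), the short program below compares a bit-serial scrambler against the block-at-a-time form $t = x + (s \gg 25) + (s \gg 6)$, $y = t + (t \ll 39) + (t \ll 58)$ over a few consecutive blocks, starting both from an all-ones state (i.e., assuming the previous 64 output bits were all ones). Note that, following the derivation, the optimized form returns $y$ and also stores it as the next state.

    #include <stdint.h>
    #include <stdio.h>

    /* Bit-serial reference: taps are the 39th and 58th previously produced output bits. */
    static uint64_t scramble_ref(uint64_t *state, uint64_t x)
    {
        uint64_t y = 0;
        for (int i = 0; i < 64; i++) {
            uint64_t out = ((x >> i) ^ (*state >> 38) ^ (*state >> 57)) & 0x1;
            *state = (*state << 1) | out;
            y |= out << i;
        }
        return y;
    }

    /* Block-at-a-time form derived above: t = x + (s >> 25) + (s >> 6),
     * y = t + (t << 39) + (t << 58); y becomes s for the next block. */
    static uint64_t scramble_opt(uint64_t *state, uint64_t x)
    {
        uint64_t t = x ^ (*state >> 25) ^ (*state >> 6);
        uint64_t y = t ^ (t << 39) ^ (t << 58);
        *state = y;
        return y;
    }

    int main(void)
    {
        uint64_t s1 = ~0ULL, s2 = ~0ULL;       /* previous output bits: all ones */
        for (int i = 0; i < 4; i++) {
            uint64_t x = 0xdeadbeefdeadbeedULL + (uint64_t)i;
            uint64_t a = scramble_ref(&s1, x);
            uint64_t b = scramble_opt(&s2, x);
            printf("block %d: %016llx %016llx %s\n", i,
                   (unsigned long long)a, (unsigned long long)b,
                   a == b ? "match" : "MISMATCH");
        }
        return 0;
    }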

APPENDIX D
GLOSSARY OF TERMS

accuracy: Closeness of a measured value to a standard or known value. In clock synchronization, it is the closeness of a local time to a reference time. In available bandwidth estimation, it is the closeness of an estimated bandwidth to the actual bandwidth available. See also clock synchronization and available bandwidth.

available bandwidth: The maximum data rate that a system can send down a network path to another system without going over the capacity between the two.

bit error rate (BER): The ratio of the number of bits incorrectly delivered to the number of bits transmitted.

bitstream: A sequence of bits.

clock domain crossing (CDC): Delivering a signal from one clock domain into another in a digital circuit.

clock skew: The time difference between two clocks. See also offset.

clock synchronization: Synchronizing the clocks of multiple systems to a common time.

clock recovery: In high-speed data communications such as Ethernet, bitstreams do not carry clock signals. As a result, the receiving device recovers the clock from the transitions in the received bitstream. There must be enough transitions (one to zero or zero to one) for easier clock synchronization, which is achieved by the scrambler in Ethernet.

computer network: A collection of computers and network devices that communicate data via data links.

covert channel: A channel that is not intended for information transfer, but can leak sensitive information.

covert storage channel: A covert channel whose sender modulates the value of a storage location to send a message.

covert timing channel: A covert channel whose sender modulates system resources over time to send a message.

cyclic redundancy check (CRC): A check value that is computed from data and sent along with the data so that the receiver can detect data corruption.

datacenter: A facility that consists of racks of servers and a network connecting the servers, along with the physical infrastructure and power.

device driver: A computer program that controls a hardware device attached to a system.

direct memory access (DMA): Transfer of data in and out of main memory without involving the central processing unit.

Ethernet: A family of standards specified in IEEE 802.3 that is used for data communication between network components in local area networks.

Ethernet frame: A unit of data in Ethernet. See also Ethernet.

field-programmable gate array (FPGA): An integrated circuit that can be configured and programmed after manufacture.

first in first out (FIFO): A data structure where the oldest entry is processed first.

hardware timestamping: Timestamping performed by network interface cards before transmitting a packet to a network or after receiving a packet from a network. See also timestamp.

homogeneous packet stream: A stream of packets that have the same destination, the same size, and the same IPGs (IPDs) between them. See also interpacket delay, interpacket gap, and network packet.

interpacket delay (IPD): The time difference between the first bits of two successive packets.

interpacket gap (IPG): The time difference between the last bit of one packet and the first bit of the following packet.

kernel: The core component of an operating system, managing system resources and hardware devices and providing an interface for userspace programs to access system resources.

meta-stability: An unstable state in a digital circuit where the state does not settle into ‘0’ or ‘1’ within a clock cycle.

network: See computer network.

network application: A program running on a host that is communicating with other programs running on other machines over a network.

network device: See network component.

network component: Network interface cards, switches, routers, or any other devices that build a network and send, receive, forward, or process network packets. See also computer network and network packet.

network covert channel: A covert channel that sends hidden messages over legitimate packets by modifying packet headers or by modulating interpacket delays. See also covert channel.

network interface card (NIC): A device attached to a computer that connects the host computer to a network.

network measurement: Measuring the amount and type of network traffic on a network.

network node: See network component.

network packet: A unit of data being transferred in a network.

network switch: A multi-port network device that forwards network packets to other network devices.

cut-through switch: A switch that starts forwarding a frame once the destination port is known, without checking the CRC value. It does not store an entire frame before forwarding, so switching latency is lower than that of store-and-forward switches. See also cyclic redundancy check.

store-and-forward switch: A switch that stores an entire frame, verifies its CRC value, and then forwards it to the destination port. See also cyclic redundancy check.

network traffic: Data being moved in a network.

offset: In clock synchronization, the time difference between two clocks. See also clock skew.

oscillator: A circuit or device that generates a periodically oscillating signal.

one way delay (OWD): The time it takes for a message to travel across a network from source to destination.

operating system (OS): Software that manages hardware and software resources and provides abstractions and services for programs.

pacing: Controlling the time gaps between network packets.

packet: See network packet.

peripheral device: A hardware device that is attached to a computer.

peripheral component interconnect express (PCIe): A standard for a serial communication bus between a computer and peripheral devices.

precision: Closeness of two or more measurements to each other. In clock synchronization, it is the degree to which clocks are synchronized, or the maximum offset between any two clocks. In timestamping packets, it is the closeness of the timestamp to the actual time the packet was received. In pacing packets, it is the closeness of an intended time gap to the actual time gap between two messages. See also clock synchronization, offset, pacing, and timestamp.

process: An instance of a program that is being executed.

robustness: In covert channels, delivering messages with minimum errors.

round trip time (RTT): The sum of the length of time it takes for a request to be sent and the length of time it takes for a response to be received.

router: See network switch.

symbol: A pulse in the communication channel that persists for a fixed period of time and that represents some number of bits.

symbol stream: A sequence of symbols. See also bitstream and symbol.

synchronization FIFO: A special FIFO that delivers data between two clock domains. See also clock domain crossing and first in first out.

system clock: The number of seconds since the epoch (1 January 1970 00:00:00 UTC).

system time: What a system clock reads. See also system clock.

undetectability: The ability to hide the existence of a covert channel.

userspace: All code that runs outside the operating system's kernel. It includes programs and libraries that provide an interface to the operating system. See also operating system and kernel.

time synchronization protocol: A protocol that achieves clock synchronization in a network. See also clock synchronization.

timeserver: A server that reads the reference time from an atomic clock or a GPS receiver and distributes the time to other systems in a network.

timestamp: The time at which an event is recorded, such as when a packet is transmitted or received.

timestamp counter (TSC): A register in Intel processors that counts the number of cycles since reset or boot.

transceiver: A device that can both transmit and receive signals.

BIBLIOGRAPHY

[1] Altera. 10-Gbps Ethernet Reference Design. http://www.altera. com/literature/ug/10G_ethernet_user_guide.pdf.

[2] Altera. PCI Express High Performance Reference Design. http://www. altera.com/literature/an/an456.pdf.

[3] Altera Quartus II. http://www.altera.com/products/software/ quartus-ii/subscription-edition.

[4] Altera. Stratix IV FPGA. http://www.altera.com/products/ devices/stratix-fpgas/stratix-iv/stxiv-index.jsp.

[5] Altera. Stratix V FPGA. http://www.altera.com/devices/fpga/ stratix-fpgas/stratix-v/stxv-index.jsp.

[6] Bluespec. www.bluespec.com.

[7] Broadcom. http://http://www.broadcom.com/products/ Switching/Data-Center.

[8] Cisco Catalyst 4948 Switch. http://www.cisco.com/c/en/us/ products/switches/catalyst-4948-switch/index.html.

[9] DE5-Net FPGA development kit. http://de5-net.terasic.com. tw.

[10] Endace DAG network cards. http://www.endace.com/ endace-dag-high-speed-packet-capture-cards.html.

[11] Exablaze. https://exablaze.com/.

[12] . http://fibrechannel.org.

[13] FlowNIC. http://netronome.com/product/flownics/.

[14] Highly accurate time synchronization with ConnectX-3 and Timekeeper. http://www.mellanox.com/pdf/whitepapers/WP_Highly_ Accurate_Time_Synchronization.pdf.

[15] HitechGlobal. http://hitechglobal.com.

[16] How to benchmark code execution times on Intel IA-32 and IA-64 instruction set architectures. http://www.intel.com/ content/dam/www/public/us/en/documents/white-papers/ ia-32-ia-64-benchmark-code-execution-paper.pdf.

[17] IBM RackSwitch G8264. http://www.lenovo.com/images/ products/system-x/pdfs/datasheets/rackswitch_g8264_ ds.pdf.

[18] IEEE Standard 1588-2008. http://ieeexplore.ieee.org/xpl/ mostRecentIssue.jsp?punumber=4579757.

[19] IEEE Standard 802.3-2008. http://standards.ieee.org/about/ get/802/802.3.html.

[20] Intel 64 and IA-32 architectures software developer manuals. http://www.intel.com/content/www/us/en/processors/ architectures-software-developer-manuals.html.

[21] Intel Westmere. http://ark.intel.com/products/codename/ 33174/Westmere-EP.

[22] iperf. https://iperf.fr.

[23] ITU-T Rec. G.8262. http://www.itu.int/rec/T-REC-G.8262.

[24] Mellanox. www.mellanox.com.

[25] Myricom Sniffer10G. http://www.myricom.com/sniffer.html.

[26] National Lambda Rail. http://www.nlr.net/.

[27] Open compute project. http://www.opencompute.org.

[28] P4. http://p4.org.

[29] SDNet. http://www.xilinx.com/products/design-tools/ software-zone/sdnet.html.

[30] The CAIDA UCSD Anonymized Internet Traces. http://www.caida.org/datasets/.

[31] Timekeeper. http://www.fsmlabs.com/timekeeper.

[32] Trusted computer system evaluation criteria. Technical Report DOD 5200.28-STD, National Computer Security Center, December 1985.

[33] gettimeofday(2) Linux Programmer’s manual, 2012.

[34] IEEE 1588 PTP and Analytics on the Cisco Nexus 3548 Switch. http: //www.cisco.com/c/en/us/products/collateral/switches/ nexus-3000-series-switches/white-paper-c11-731501. html, 2014.

[35] Oxford English Dictionary Online. Oxford University Press, June 2016.

[36] Mohammad Al-Fares, Alexander Loukissas, and Amin Vahdat. A scal- able, commodity data center network architecture. In Proceedings of the ACM SIGCOMM Conference on Data Communication, 2008.

[37] Kiarash Amiri, Yang Sun, Patrick Murphy, Chris Hunter, Joseph R. Caval- laro, and Ashutosh Sabharwal. WARP, a unified wireless network testbed for education and research. In Proceedings of the 2007 IEEE International Conference on Microelectronic Systems Education, 2007.

[38] Gianni Antichi, Muhammad Shahbaz, Yilong Geng, Noa Zilberman, Adam Covington, Marc Bruyere, Nick McKeown, Nick Feamster, Bob Felderman, Michaela Blott, Andrew W. Moore, and Philippe Owezarski. OSNT: Open source network tester. IEEE Network, 28(5):6–12, 2014.

[39] Muhammad Bilal Anwer, Murtaza Motiwala, Mukarram bin Tariq, and Nick Feamster. SwitchBlade: A platform for rapid deployment of network protocols on programmable hardware. In Proceedings of the ACM conference on Special Interest Group on Data Communication, 2010.

[40] Paul D. Barrett. Implementing the Rivest Shamir and Adleman public key encryption algorithm on a standard digital signal processor. In Proceedings of the International Cryptology Conference, 1986.

[41] Vincent Berk, Annarita Giani, and George Cybenko. Detection of covert channel encoding in network packet delays. Technical Report TR2005-536, Department of Computer Science, Dartmouth College, November 2005.

[42] Broadcom. Ethernet time synchronization. http://www.broadcom. com/collateral/wp/StrataXGSIV-WP100-R.pdf.

[43] Timothy Broomhead, Julien Ridoux, and Darryl Veitch. Counter availabil- ity and characteristics for feed-forward based synchronization. In 2009 International Symposium on Precision Clock Synchronization for Measurement, Control and Communication, 2010.

[44] Serdar Cabuk, Carla E. Brodley, and Clay Shields. IP covert timing chan- nels: Design and detection. In Proceedings of the 11th ACM conference on Computer and communications security, 2004.

[45] Martin Casado. Reconfigurable networking hardware: A classroom tool. In Proceedings of Hot Interconnects 13, 2005.

[46] John T. Chapman, Rakesh Chopra, and Laurent Montini. The DOCSIS timing protocol (DTP) generating precision timing services from a DOC- SIS system. In Proceedings of the Spring Technical Forum, 2011.

[47] Richard Cochran, Cristian Marinescu, and Christian Riesch. Synchroniz- ing the Linux system time to a PTP hardware clock. In Proceedings of the International IEEE Symposium on Precision Clock Synchronization for Mea- surement Control and Communication, 2011.

[48] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christo- pher Frost, J. J. Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Ko- gan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, and Dale Woodford. Span- ner: Google’s globally-distributed database. In Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation, 2012.

[49] Paolo Costa, Hitesh Ballani, Kaveh Razavi, and Ian Kash. R2C2: A net- work stack for rack-scale computers. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, 2015.

[50] Flaviu Cristian. Probabilistic clock synchronization. Distributed Comput- ing, 3:146–158, September 1989.

[51] Flaviu Cristian, Houtan Aghili, and Ray Strong. Clock synchronization in the presence of omission and performance failures, and processor joins. In Proceedings of the 16th International Symposium on Fault-Tolerant Computing Systems, 1992.

[52] Mark Crovella and Balachander Krishnamurthy. Internet Measurement: In- frastructure, Traffic and Applications. John Wiley and Sons, Inc, 2006.

[53] Matthew Davis, Benjamin Villain, Julien Ridoux, Anne-Cecile Orgerie, and Darryl Veitch. An IEEE-1588 Compatible RADclock. In Proceedings of International IEEE Symposium on Precision Clock Synchronization for Mea- surement, Control and Communication, 2012.

[54] Mihai Dobrescu, Norbert Egi, Katerina Argyraki, Byung-Gon Chun, Kevin Fall, Gianluca Iannaccone, Allan Knies, Maziar Manesh, and Sylvia Ratnasamy. RouteBricks: Exploiting parallelism to scale software routers. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, 2009.

[55] Thomas G. Edwards and Warren Belkin. Using SDN to facilitate precisely timed actions on real-time data streams. In Proceedings of the Third Work- shop on Hot Topics in Software Defined Networking, 2014.

[56] Albert Einstein. Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17(10):891–921, 1905. Translated by W. Perrett and G.B. Jeffery.

[57] Gina Fisk, Mike Fisk, Christos Papadopoulos, and Joshua Neil. Elimi- nating steganography in Internet traffic with active wardens. In Revised Papers from the 5th International Workshop on Information Hiding, 2003.

[58] Daniel A. Freedman, Tudor Marian, Jennifer H. Lee, Ken Birman, Hakim Weatherspoon, and Chris Xu. Exact temporal characterization of 10 Gbps optical wide-area network. In Proceedings of the 10th ACM SIGCOMM con- ference on Internet measurement, 2010.

[59] Steven Froehlich, Michel Hack, Xiaoqiao Meng, and Li Zhang. Achieving precise coordinated cluster time in a cluster environment. In Proceedings of International IEEE Symposium on Precision Clock Synchronization for Mea- surement, Control and Communication, 2008.

[60] Steven Gianvecchio and Haining Wang. Detecting covert timing channels: An entropy-based approach. In Proceedings of the 14th ACM Conference on Computer and Communications Security, 2007.

[61] Vinodh Gopal, Erdinc Ozturk, Jim Guilford, Gil Wolrich, Wajdi Feghali, Martin Dixon, and Deniz Karakoyunlu. Fast CRC computation for generic polynomials using PCLMULQDQ instruction. White paper, Intel, http://download.intel.com/design/intarch/papers/323102.pdf, December 2009.

[62] Shay Gueron and Michael E. Kounavis. Intel carry-less multiplication in- struction and its usage for computing the GCM mode. White paper, Intel, http://software.intel.com/file/24918, January 2010.

[63] Riccardo Gusella and Stefano Zatti. The accuracy of the clock synchro- nization achieved by TEMPO in Berkeley UNIX 4.3BSD. IEEE Transactions on Software Engineering, 15(7):847–853, July 1989.

[64] Joseph Y. Halpern, Barbara Simons, Ray Strong, and Danny Dolev. Fault- tolerant clock synchronization. In Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, PODC ’84, 1984.

[65] Sangjin Han, Keon Jang, KyoungSoo Park, and Sue Moon. PacketShader: a GPU-accelerated software router. In Proceedings of the ACM conference on Special Interest Group on Data Communication, 2010.

[66] Mark Handley, Christian Kreibich, and Vern Paxson. Network intrusion detection: Evasion, traffic normalization, and end-to-end protocol seman- tics. In Proceedings of the 10th USENIX Security Symposium, 2001.

[67] Ningning Hu, Li Erran Li, Zhuoqing Morley Mao, Peter Steenkiste, and Jia Wang. Locating Internet bottlenecks: Algorithms, measurements, and implications. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, 2004.

[68] Manish Jain and Constantinos Dovrolis. Pathload: A measurement tool for end-to-end available bandwidth. In Proceedings of Passive and Active Measurements Workshop, 2002.

[69] Raj Jain and Shawn A. Routhier. Packet trains: Measurements and a new model for computer network traffic. IEEE Journal On Selected Areas in Com- munications, 4:986–995, 1986.

[70] Jurgen Jasperneite, Khaled Shehab, and Karl Weber. Enhancements to the time synchronization standard IEEE-1588 for a system of cascaded bridges. In Proceedings of the IEEE International Workshop in Factory Com- munication Systems, 2004.

[71] Christoforos Kachris, Keren Bergman, and Ioannis Tomkos. Optical Interconnects for Future Data Center Networks. Springer, 2013.

[72] Rishi Kapoor, George Porter, Malveeka Tewari, Geoffrey M. Voelker, and Amin Vahdat. Chronos: Predictable low latency for data center applica- tions. In Proceedings of the ACM Symposium on Cloud Computing, 2012.

[73] Myron King, Jamey Hicks, and John Ankcorn. Software-driven hardware development. In Proceedings of the 2015 ACM/SIGDA International Sympo- sium on Field-Programmable Gate Arrays, 2015.

[74] Ramana Rao Kompella, Kirill Levchenko, Alex C. Snoeren, and George Varghese. Every microsecond counts: Tracking fine-grain latencies with a lossy difference aggregator. In Proceedings of the ACM SIGCOMM 2009 conference on Data communication, 2009.

[75] Hermann Kopetz and Wilhelm Ochsenreiter. Clock synchronization in distributed real-time systems. IEEE Transactions on Computers, C-36:933– 940, Aug 1987.

[76] Deepa Kundur and Kamran Ahsan. Practical internet steganography: Data hiding in ip. In Proceedings of Texas workshop on Security of Information Systems, 2003.

[77] Leslie Lamport and P. M. Melliar-Smith. Byzantine clock synchroniza- tion. In Proceedings of the Third Annual ACM Symposium on Principles of Distributed Computing, 1984.

[78] Leslie Lamport and P. M. Melliar-Smith. Synchronizing clocks in the presence of faults. Journal of the ACM, 32(1):52–78, January 1985.

[79] Butler W. Lampson. A note on the confinement problem. Communications of the ACM, 16(10):613–615, October 1973.

[80] Maciej Lapinski, Thomasz Wlostowki, Javier Serrano, and Pablo Alvarez. White Rabbit: a PTP application for robust sub-nanosecond synchroniza- tion. In Proceedings of the International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication, 2011.

[81] Ki Suh Lee, Han Wang, and Hakim Weatherspoon. SoNIC: Precise real-time software access and control of wired networks. In Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, 2013.

[82] Will E. Leland, Murad S. Taqqu, Walter Willinger, and Daniel V. Wil- son. On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Transaction on Networking, 2(1), February 1994.

[83] W. Lewandowski, Jacques Azoubib, and William J. Klepczynski. GPS: primary tool for time transfer. Proceedings of the IEEE, 87:163–172, January 1999.

[84] Han Li. IEEE 1588 time synchronization deployment for mobile backhaul in China Mobile, 2014. Keynote speech in the International IEEE Sympo- sium on Precision Clock Synchronization for Measurement Control and Communication.

[85] Chiun Lin Lim, Ki Suh Lee, Han Wang, Hakim Weatherspoon, and Ao Tang. Packet clustering introduced by routers: Modeling, analysis and experiments. In Proceedings of the 48th Annual Conference on Informa- tion Sciences and Systems, 2014.

[86] Maciej Lipinski, Tomasz Wlostowski, Javier Serrano, Pablo Alvarez, Juan David Gonzalez Cobas, Alessandro Rubini, and Pedro Moreira. Perfor- mance results of the first White Rabbit installation for CNGS time trans- fer. In Proceedings of the International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication, 2012.

[87] Xiliang Liu, Kaliappa Ravindran, Benyuan Liu, and Dmitri Logu- inov. Single-hop probing asymptotics in available bandwidth estimation: sample-path analysis. In Proceedings of the 4th ACM SIGCOMM conference on Internet measurement, 2004.

[88] Xiliang Liu, Kaliappa Ravindran, and Dmitri Loguinov. Multi-hop prob- ing asymptotics in available bandwidth estimation: Stochastic analysis. In Proceedings of the 5th ACM SIGCOMM conference on Internet Measurement, 2005.

[89] Yali Liu, Dipak Ghosal, Frederik Armknecht, Ahmad-Reza Sadeghi, Stef- fen Schulz, and Stefan Katzenbeisser. Hide and seek in time: Robust covert timing channels. In Proceedings of the 14th European conference on Research in computer security, 2009.

[90] Yali Liu, Dipak Ghosal, Frederik Armknecht, Ahmad-Reza Sadeghi, Steffen Schulz, and Stefan Katzenbeisser. Robust and undetectable steganographic timing channels for i.i.d. traffic. In Proceedings of the 12th International Conference on Information Hiding, 2010.

[91] John W. Lockwood, Nick McKeown, Greg Watson, Glen Gibb, Paul Hartke, Jad Naous, Ramanan Raghuraman, and Jianying Luo. NetFPGA– an open platform for gigabit-rate network switching and routing. In Pro- ceedings of Microelectronics Systems Education, 2007.

[92] Jennifer Lundelius and Nancy Lynch. An upper and lower bound for clock synchronization. Information and Control, 62:190–204, 1984.

[93] G. Robert Malan, David Watson, Farnam Jahanian, and Paul Howell. Transport and application protocol scrubbing. In Proceedings of IEEE Con- ference on Computer Communications, 2000.

[94] Enrique Mallada, Xiaoqiao Meng, Michel Hack, Li Zhang, and Ao Tang. Skewless network clock synchronization. In Proceedings of the 21st IEEE International Conference on Network Protocols, 2013.

[95] Tudor Marian, Ki Suh Lee, and Hakim Weatherspoon. Netslices: Scalable multi-core packet processing in user-space. In Proceedings of the Eighth ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2012.

[96] Keith Marzullo and Susan Owicki. Maintaining the time in a distributed system. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing, 1983.

[97] Bob Melander, Mats Bjorkman, and Per Gunningberg. A new end-to-end probing and analysis method for estimating bandwidth bottlenecks. In Proceedings of the IEEE Global Telecommunications Conference, 2000.

[98] David L. Mills. Internet time synchronization: The network time protocol. IEEE transactions on Communications, 39:1482–1493, October 1991.

[99] Radhika Mittal, Vinh The Lam, Nandita Dukkipati, Emily Blem, Hassan Wassel, Monia Ghobadi, Amin Vahdat, Yaogong Wang, David Wetherall, and David Zats. TIMELY: RTT-based congestion control for the datacen- ter. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, 2015.

[100] Tal Mizrahi and Yoram Moses. Software Defined Networks: It’s about time. In Proceedings of the IEEE International Conference on Computer Communications, 2016.

[101] Pedro Moreira, Javier Serrano, Tomasz Wlostowski, Patrick Loschmidt, and Georg Gaderer. White Rabbit: Sub-nanosecond timing distribution over Ethernet. In Proceedings of the International IEEE Symposium on Pre- cision Clock Synchronization for Measurement Control and Communication, 2009.

[102] Steven J. Murdoch and Stephen Lewis. Embedding covert channels into TCP/IP. In Proceedings of 7th Information Hiding workshop, 2005.

[103] Jad Naous, David Erickson, G. Adam Covington, Guido Appenzeller, and Nick McKeown. Implementing an OpenFlow switch on the NetFPGA platform. In Proceedings of the 4th ACM/IEEE Symposium on Architectures for Networking and Communications Systems, 2008.

[104] Bill Ogden, Jose Fadel, and Bill White. IBM system Z9 109 technical intro- duction. July 2005.

[105] Patrick Ohly, David N. Lombard, and Kevin B. Stanton. Hardware as- sisted . Design and case study. In Proceedings of the 9th LCI International Conference on High-Performance Clustered Computing, 2008.

[106] Robert Olsson. pktgen the Linux packet generator. In Proceeding of the Linux symposium, 2005.

[107] M. A. Padlipsky, D. W. Snow, and P. A. Karger. Limitations of end-to-end encryption in secure computer networks. Technical Report ESD-TR-78- 158, Mitre Corporation, August 1978.

[108] Attila Pásztor and Darryl Veitch. PC based precision timing without GPS. In Proceedings of the ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2002.

[109] Pai Peng, Peng Ning, and Douglas S. Reeves. On the secrecy of timing- based active watermarking trace-back techniques. In Proceedings of the 2006 IEEE Symposium on Security and Privacy, 2006.

[110] Jonathan Perry, Amy Ousterhout, Hari Balakrishnan, Devavrat Shah, and Hans Fugal. Fastpass: A centralized "zero-queue" datacenter network. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, 2014.

[111] Vinay Ribeiro, Rudolf Riedi, Richard Baraniuk, Jiri Navratil, and Les Cot- trell. pathchirp: Efficient available bandwidth estimation for network paths. In Proceedings of Passive and Active Measurements Workshop, 2003.

[112] Luigi Rizzo. Netmap: A novel framework for fast packet I/O. In Proceed- ings of the 2012 USENIX conference on Annual Technical Conference, 2012.

[113] Craig H. Rowland. Covert channels in the TCP/IP protocol suite. First Monday, July 1997.

[114] Joanna Rutkowska. The implementation of passive covert channels in the Linux kernel. In Proceedings of Chaos Communication Congress, 2004.

[115] David Schneider. The microsecond market. IEEE Spectrum, 49(6):66–81, 2012.

[116] Fred B. Schneider. Understanding protocols for Byzantine clock synchro- nization. Technical Report TR87-859, Cornell University, August 1987.

[117] Gaurav Shah, Andres Molina, and Matt Blaze. Keyboards and covert channels. In Proceedings of the 15th conference on USENIX Security Sym- posium, 2006.

[118] Ahmed Sobeih, Michel Hack, Zhen Liu, and Li Zhang. Almost peer-to- peer clock synchronization. In Proceedings of IEEE International Parallel and Distributed Processing Symposium, 2007.

[119] T. K. Srikanth and Sam Toueg. Optimal clock synchronization. Journal of the ACM, 34(3):626–645, July 1987.

[120] Jacob Strauss, Dina Katabi, and Frans Kaashoek. A measurement study of available bandwidth estimation tools. In Proceedings of the ACM Conference on Special Interest Group on Data Communication, 2003.

[121] Ryousei Takano, Tomohiro Kudoh, Yuetsu Kodama, and Fumihiro Okazaki. High-resolution timer-based packet pacing mechanism on the Linux operating system. IEICE Transactions on Communications, 94(8):2199–2207, 2011.

[122] Kun Tan, Jiansong Zhang, Ji Fang, He Liu, Yusheng Ye, Shen Wang, Yongguang Zhang, Haitao Wu, Wei Wang, and Geoffrey M. Voelker. Sora: high performance software radio using general purpose multi-core processors. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation, 2009.

[123] Darryl Veitch, Satish Babu, and Attila Pásztor. Robust synchronization of software clocks across the Internet. In Proceedings of the 4th ACM SIGCOMM Conference on Internet Measurement, 2004.

[124] Rick Walker, Birdy Amrutur, and Tom Knotts. 64b/66b coding update. grouper.ieee.org/groups/802/3/ae/public/mar00/ walker_1_0300.pdf.

[125] Han Wang, Ki Suh Lee, Erluo Li, Chiun Lin Lim, Ao Tang, and Hakim Weatherspoon. Timing is everything: Accurate, minimum overhead, available bandwidth estimation in high-speed wired networks. In Pro- ceedings of the 2014 Conference on Internet Measurement, 2014.

[126] David X. Wei, Pei Cao, and Steven H. Low. TCP pacing revisited. In Proceedings of IEEE International Conference on Computer Communications, 2006.

[127] Walter Willinger, Murad S. Taqqu, Robert Sherman, and Daniel V. Wil- son. Self-similarity through high-variability: Statistical analysis of Ether- net LAN traffic at the source level. In Proceedings of the conference on Appli- cations, technologies, architectures, and protocols for computer communication, 1995.

[128] Paul Willmann, Jeffrey Shafer, David Carr, Aravind Menon, Scott Rixner, Alan L. Cox, and Willy Zwaenepoel. Concurrent direct network access for virtual machine monitors. In Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture, 2007.

[129] Takeshi Yoshino, Yutaka Sugawara, Katsushi Inagami, Junji Tamatsukuri, Mary Inaba, and Kei Hiraki. Performance optimization of TCP/IP over 10 gigabit Ethernet by precise instrumentation. In Proceedings of the 2008 ACM/IEEE conference on Supercomputing, 2008.

[130] Sebastian Zander, Grenville Armitage, and Philip Branch. A survey of covert channels and countermeasures in computer network protocols. IEEE Communications Surveys and Tutorials, 9(3):44–57, 2007.

[131] Ryan Zarick, Mikkel Hagen, and Radim Bartos. The impact of network latency on the synchronization of real-world IEEE 1588-2008 devices. In Proceedings of the International IEEE Symposium on Precision Clock Synchronization for Measurement Control and Communication, 2010.

[132] Ryan Zarick, Mikkel Hagen, and Radim Bartos. Transparent clocks vs. enterprise ethernet switches. In Proceedings of the International IEEE Sym- posium on Precision Clock Synchronization for Measurement, Control and Com- munication, 2011.

[133] Hongyi Zeng, John W. Lockwood, G. Adam Covington, and Alexander Tudor. AirFPGA: A software defined radio platform based on NetFPGA. In NetFPGA Developers Workshop, 2009.

[134] Hongyi Zeng, Shidong Zhang, Fei Ye, Vimalkumar Jeyakumar, Mickey Ju, Junda Liu, Nick McKeown, and Amin Vahdat. Libra: Divide and conquer to verify forwarding tables in huge networks. In Proceedings of the 11th USENIX Symposium on Networked Systems Design and Implementation, 2014.

[135] Lixia Zhang, Scott Shenker, and David D Clark. Observations on the dy- namics of a congestion control algorithm: The effects of two-way traffic. ACM SIGCOMM Computer Communication Review, 21(4):133–147, 1991.

[136] N. Zilberman, Y. Audzevich, G. A. Covington, and A. W. Moore. NetFPGA SUME: Toward research commodity 100Gb/s. IEEE Micro, 34:32–41, July 2014.
