An Application of Active Congestion Control in TCP-Vegas

Abdollah Aghassi

A thesis submitted in conformity with the requirements for the degree of Master of Applied Science, Graduate Department of Electrical and Computer Engineering, University of Toronto

© Copyright by Abdollah Aghassi 2000



An Application of Active Congestion Control in TCP-Vegas

Abdollah Aghassi

Master of Applied Science, 2000
Department of Electrical and Computer Engineering
University of Toronto

Abstract

TCP-Vegas, with its innovative congestion avoidance technique, achieves better link utilization than other TCP variants. However, it fails to fairly allocate the available bandwidth among the users. In particular, it penalizes old connections in favor of new connections. Furthermore, TCP-Vegas has the potential to induce persistent congestion in the network. These drawbacks all stem from the problem of over-estimation of propagation delay. To improve the fairness performance of TCP-Vegas, this thesis proposes a probing mechanism based on the active networking approach. The mechanism requires support from the active network nodes to immediately forward the probe packets. The acknowledgments of probe packets at the sender provide a better estimate of the propagation delays. Our simulation results show that our proposed active TCP-Vegas fairly allocates the bandwidth among the connections. In addition, the better estimation of propagation delays leads to a reduced packet backlog in the buffer, and hence results in a lower level of network congestion than the original TCP-Vegas.

Acknowledgement

I would like to express my sincere gratitude and appreciation to my supervisor, Professor A. Leon-Garcia, for his invaluable guidance, patience, advice and encouragement throughout the course of my M.A.Sc. program and the preparation of this thesis. I would also like to thank Professor I. F. Blake, Professor J. Choe, Professor E. Law and Professor S. W. Davies for their advice and comments during my final oral presentation. I wish to acknowledge in particular my fellows and friends in the Communications group for their fruitful discussions and proof-reading of the thesis. The financial support from ITRC is gratefully acknowledged.

Table of Contents

1. INTRODUCTION
   1.1 Motivation
   1.2 Objective
   1.3 Thesis Organization

2. TCP CONGESTION CONTROL OVERVIEW
   2.1 Introduction
   2.2 Problems with Current TCP Implementation
   2.3 Proactive Congestion Control
       2.3.1 Vegas Congestion Avoidance Mechanism
   2.4 Major Drawback of Delay-Based Congestion Control Algorithms
       2.4.1 Errors in Round Trip Delay Measurement
   2.5 TCP-Vegas Problems
       2.5.1 Unfairness
       2.5.2 Persistent Congestion

3. ACTIVE NETWORKING AND CONGESTION
   3.1 Introduction
   3.2 Exploring the Network by Smart Packets
       3.2.1 In-Band and Out-Band Monitor Packets
   3.3 Major Drawbacks of the Active Networking Approach

4. ACTIVE CONGESTION CONTROL
   4.1 Introduction
   4.2 Enhanced TCP-Vegas
       4.2.1 Fairness Evaluation of Enhanced Vegas
   4.3 Vegas and DRR at the Gateway
   4.4 Active Vegas
       4.4.1 Preliminary Simulations
       4.4.2 Single Bottleneck Network Simulations
             4.4.2.1 Round Trip Time Measurement
             4.4.2.2 Fairness Evaluation
             4.4.2.3 Buffer Occupancy
             4.4.2.4 Summary of Simulation Results
       4.4.3 Multi-hop Network Simulation

5.1 Discussion
   5.2 Drawbacks
   5.3 Future Research Directions
   5.4 Research Contributions

List of Figures

Figure 2.1: General pattern of round trip time vs. window size
Figure 2.2: Window algorithm of TCP-Vegas
Figure 2.3: Network model
Figure 2.4: Simulation results when C = 0.5 Mbps
Figure 2.5: Network topology after the new connection joins the network
Figure 2.6: Simulation results after the new connection joins the network
Figure 2.7: Congestion window for each connection
Figure 3.1: Traditional passive network paradigm
Figure 3.2: Active network pattern
Figure 4.1: Connection goodputs and instantaneous queue lengths for enhanced Vegas
Figure 4.2: Connection goodputs and congestion windows for old and new connections
Figure 4.3: Packets in transit for active Vegas
Figure 4.4: Priority queue at the gateway
Figure 4.5: Network model
Figure 4.6: Connection goodputs for 3 active Vegas sources
Figure 4.7: Connection goodputs for 4 active Vegas sources
Figure 4.8: Network topology
Figure 4.9: Goodput vs. connection index for 4 different scenarios
Figure 4.10: Buffer occupancies for the different scenarios
Figure 4.11: Mean goodput of each connection under different Vegas algorithms
Figure 4.12: Buffer occupancy for original Vegas (infinite buffer size)
Figure 4.13: Connection BaseRTT for all scenarios
Figure 4.14: Fairness vs. bottleneck link bandwidth (link delay = 20 msec)
Figure 4.15: Fairness vs. bottleneck link propagation delay (link capacity = 4 Mbps)
Figure 4.16: Multi-hop network model
Figure 4.17: Average congestion windows for different connections
Figure 4.18: Goodput vs. connection index for active and original Vegas

List of Tables

Table 4.1: Connection throughput for 3 different versions of Vegas
Table 4.2: Buffer occupancies for 4 different implementations of Vegas
Table 4.3: Summary of simulation results
Table 4.4: BaseRTT estimation for active and original Vegas

List of Acronyms

AACC     Adaptive Admission Congestion Control
ACC      Active networking Congestion Control
ATM      Asynchronous Transfer Mode
CARD     Congestion Avoidance using Round-trip Delay
DARPA    Defense Advanced Research Projects Agency
DiffServ Differentiated Services
DRR      Deficit Round Robin
ECN      Explicit Congestion Notification
FIFO     First In First Out
FTP      File Transfer Protocol
HTTP     HyperText Transfer Protocol
IntServ  Integrated Services
IP       Internet Protocol
MTU      Maximum Transmission Unit
RED      Random Early Detection
RSVP     Resource ReserVation Protocol
RTT      Round Trip Time
SACK     Selective ACKnowledgement
TCP      Transmission Control Protocol
Tri-S    Slow Start and Search
VC       Virtual Circuit
WFQ      Weighted Fair Queuing
WRR      Weighted Round Robin

Chapter 1

Introduction

The coexistence of new and old technologies with different link capacities in computer networks leads to congestion. When the network load is light, the network is able to provide the requested throughput. As the load increases beyond the network capacity, the buffers start building up and the throughput stops increasing. This is the point at which congestion starts to occur. Any attempt to increase the load beyond this point will result in further loss and less throughput. To remedy these problems, a congestion detection/control algorithm is required. Moreover, an increase/decrease algorithm in the sender provides a mechanism for limiting congestion once it is detected in its early stages.

Congestion detection/control, which regulates the amount of traffic, is a distributed algorithm. It can be divided into two sub-algorithms: the link algorithm, which is executed inside the network to detect congestion and return this information to the sender; and the source algorithm, which is executed at the edge of the network to adjust the sending rate based on the congestion condition inside the network. Simply put, the role of congestion detection/control is to balance supply and demand. As a supplier, when demand increases, the link algorithm increases the link price; as a consumer, when price increases, the source algorithm decreases demand. Therefore, in a limited time interval, the system will converge to a settling region (a toy numeric illustration of this loop is given at the end of this introduction).

In the congestion detection/control algorithm, the link price represents a congestion measure inside the network. How can the congestion measure be returned to the sender? In other words, how can the source algorithm estimate the available supply? Ideally, one can design the source and link algorithms jointly so that they can communicate and transfer information back and forth to move the level of congestion inside the network to a desirable operating point. However, in the current Internet, there is no explicit feedback from the network, and there is no direct communication between the source and link algorithms. There are only some implicit signs in the network that imply the congestion measure, the status of the congestion in the network. In other words, the source algorithm should estimate the available bandwidth, the supply, from these implicit indicators. For example, packet drop is one of these indicators: when the network gets congested, packets start dropping. Another sign is round trip delay: when the network gets congested, packets experience longer round trip delays. It is worth mentioning that packet drop and round trip delay correlate directly with the type of link algorithm used.

Apart from providing other services, the transport layer is responsible for regulating the traffic flow. This prevents a fast host from over-running the network capacity, and plays the role of the source algorithm in the congestion control mechanism. The transmission control protocol (TCP) is one of the main protocols in the transport layer of the current Internet. Many of the popular Internet services, including HTTP, FTP, and so on, rely on the services provided by TCP. There are several varieties of TCP, including TCP Tahoe, Reno, New-Reno, SACK, and Vegas. All TCP variants, except Vegas, estimate the link price with packet drop¹. Packet drop is an essential part of their congestion control mechanisms; as stated previously, packet drop is a measure of congestion and link price.
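As a toy numeric illustration of this supply-and-demand loop (the numbers and update rules here are invented for illustration and are not part of the thesis), consider a link whose price tracks excess demand and a source whose demand falls linearly with price:

```python
# Hypothetical price/demand dynamics for one link and one source.
price = 0.0
capacity = 6.0                  # link "supply", packets per unit time
for _ in range(60):
    demand = max(1.0, 10.0 - 2.0 * price)                # source algorithm
    price = max(0.0, price + 0.1 * (demand - capacity))  # link algorithm
print(round(demand, 2))         # -> 6.0: demand settles at the link capacity
```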
Packet drop is equivalent to an increase in link price that should result in a decrease in demand, which is represented by the congestion window. With a drop tail buffer, however, Reno continually cycles between sinking into congestion

¹ We include explicit congestion notification (ECN) marking in the category of packet drop.

and recovering from it. To remove this condition from Reno and its variants, random early detection (RED), as a link algorithm, tries to create early packet drops to signal the presence of congestion to Reno. Packet drop results in an inefficient use of the available bandwidth, due to the many retransmissions of the same packet after drops occur. In addition, since Reno will eventually force all connections to have the same congestion window size, it significantly discriminates against connections with longer propagation delays [1].

In contrast to Reno and its variants, TCP-Vegas estimates the level of congestion by monitoring the difference between the rate it is expecting to see and the rate it is actually achieving. To calculate the expected rate, the actual rate, and their difference, Vegas needs to measure the round trip delay, including the propagation delay. In other words, round trip delay is a measure of congestion and link price. An increase in round trip delay can be interpreted as an increase in link price. Again, based on the supply and demand rule, the congestion window, which represents the demand, should be decreased.

1.1 Motivation

TCP-Vegas does not introduce any additional losses in the network when it estimates the available bandwidth; therefore, it utilizes bandwidth more efficiently than Reno, due to fewer retransmissions. Experimental results presented in [2] and [3] show that Vegas achieves between 37 and 71 percent better throughput than Reno. With this improvement in throughput, TCP-Vegas is a possible solution for the future Internet. But at least two concerns remain unresolved: does TCP-Vegas achieve a fair allocation of resources, and does Vegas lead to persistent congestion?

In particular, Vegas discriminates against old connections when a new connection joins the network. The main reason for this discrimination lies in the calculation of the expected rate. The expected rate is the rate that can be achieved when there is no congestion in the network (no queuing delay component in the round trip delay). When a new connection joins an already congested network, it includes the corresponding queuing delay in its round trip delay measurement, which in turn introduces an error into its expected rate. Briefly stated, the new connection over-estimates its minimum round trip delay and therefore under-estimates its backlog. This results in an unfair distribution of resources at the expense of the old connections.

Furthermore, Vegas' strategy is to adjust the congestion window in an attempt to keep the difference between the expected and actual rates in a limited interval. This is equivalent to keeping a small number of packets buffered in the routers along the path, e.g., between 1 and 3 packets for each connection. The over-estimation of the propagation delay, or minimum round trip delay, causes an under-estimation of the number of packets backlogged in the buffers; the connection therefore pushes more packets into the buffers and creates persistent congestion.

1.2 Objective

We believe that, for TCP-Vegas to be deployed in the future Internet, it must be modified to remove the unfairness among different users. To improve the overall performance of Vegas, either the source algorithm should be changed, or a compatible link algorithm should be investigated, or both.

The objective of this research is to improve the fairness performance of TCP-Vegas. As mentioned previously, the unfair behavior of Vegas has its root in delay measurement errors. The measurement of a round trip delay that does not include any queuing delay is our main concern. The most straightforward kind of mechanism is to send a packet along the route that is forwarded without experiencing any queuing delay as it goes; we call these active mechanisms. This monitor/control packet experiences the minimum round trip delay. Active mechanisms require support from the gateways; each gateway along the route must forward the active packet with high priority.

The aim of this research is to investigate the idea of active probing to improve the performance of a transport protocol, in particular its fairness. To achieve this goal: first, the necessary changes to the source algorithm, TCP-Vegas, should be added to enable the protocol to send active packets; second, the necessary active mechanism in the gateways, the link algorithm, should be implemented. To demonstrate the validity of our proposal, we implement our active Vegas protocol in the ns-2 simulator [15]. To support fast forwarding, a priority drop tail buffer is also implemented in the buffers. The specific types of measurements we would like to perform are:

- Fairness Performance. Evaluate the fairness performance of active Vegas, and show the level of improvement.
- Persistent Congestion. Demonstrate the effectiveness of active Vegas in decreasing the number of buffered packets compared to original Vegas.

In particular, single bottleneck and multi-hop networks will be considered.

1.3 Thesis Organization

The remainder of this thesis is arranged in the following order. In Chapter 2, an overview of several methods of TCP congestion control is presented, and the important problems of the current TCP implementation, TCP-Reno, are explained. Then we focus on congestion avoidance methods based on round trip delay, in particular TCP-Vegas. We also show that over-estimation of round trip delay causes a major problem in TCP-Vegas that leads to an unfair treatment of old connections, and to persistent congestion in the network.

Since our proposal is based on a probing mechanism with active packets, in Chapter 3 we describe how a monitor/control packet can improve the performance of a large complex network by obtaining information from the core of the network and using this information to update the end point entity parameters, namely the congestion control parameters. In this chapter, we describe and enumerate the advantages of using monitor packets for exploring the network and summarize the applications of the in-band and out-band approaches.

In Chapter 4, we introduce our own methodology, Active Vegas, to resolve the unfairness of TCP-Vegas. It begins by examining performance issues of the current enhancements to the TCP-Vegas proposal that aim to overcome its failure. Then, based on simulation results, it discusses the limitations of these enhancements. To remedy these limitations, this chapter presents a probing mechanism to estimate the minimum round trip time accurately. We investigate the effectiveness of active Vegas in providing a fair allocation of resources to all connections. Furthermore, we show that active Vegas backlogs fewer packets in the buffers compared to the original Vegas.

Finally, in Chapter 5, we discuss the advantages and disadvantages of our proposal. There are also many other open issues concerning the congestion control mechanism in TCP-Vegas, including rerouting and the format of the control packet, that are explained in the future research directions of this chapter. We finally enumerate the research contributions of this thesis.

Chapter 2

TCP Congestion Control Overview

The transmission control protocol (TCP) provides a highly reliable end-to-end transport service over a packet-switched network. To achieve reliable transmission, a handshaking mechanism between the sender and the receiver is required. RFC 793 is one of the early documents that specifies this handshaking mechanism. One of the main duties of the TCP protocol is flow control; that is, TCP controls the amount of data sent out by the source. In other words, the role of TCP, apart from ensuring a reliable delivery of packets, is to adapt the sending rate of the source to the capacity of the network and the destination. Usually, the part of the flow control mechanism that deals with the capacity inside the network is called congestion control.

In this chapter, an overview of several methods of TCP congestion control is given, and then the important problems of the current TCP implementation (TCP-Reno) are explained. We focus mainly on congestion avoidance methods based on round trip delay, in particular TCP-Vegas. We also show that over-estimation of round trip delay causes a major problem in TCP-Vegas that leads to an unfair treatment of old connections, and persistent congestion in the network.

2.1 Introduction

In order to use network bandwidth efficiently, TCP controls its flow rate. To do so, TCP needs to estimate the available bandwidth in the network using a bandwidth estimation scheme. Bandwidth estimation is difficult, since there is no explicit feedback from the network; most current implementations assume the network is a black box with no explicit feedback. For this reason, they are called implicit feedback schemes.

Flow control in TCP is based on a window mechanism, which consists of limiting the number of outstanding packets. The value of the window size determines the source throughput. Without flow control, the source can saturate one or several routers inside the network, which can introduce many losses and retransmissions, resulting in a low goodput (the goodput is the rate of useful data delivered by the network). Figure 2.1 shows the general patterns of round trip time (RTT) and goodput as a function of window size.


Figure 2.1 General pattern of round trip time and goodput as a function of window size

Unfortunately, due to the heterogeneity of the network, the optimum value W* of the window size is not known beforehand. TCP must choose the optimum value of the window size based on its estimation of the current available bandwidth (a rough numeric sketch of this optimum appears at the end of this section). How can the available bandwidth inside the network be estimated? Usually, there are some implicit signs in the network that imply the congestion measure, the status of the congestion in the network. For example, packet drop is one of these signs: when the network gets congested, packets start dropping. Another sign is round trip delay: when the network gets congested, packets experience longer round trip delays. Furthermore, packet drop and round trip delay depend on the queue management and service scheduling in the network. In other words, how packets are treated in the network strongly affects the packet drop and round trip delay. Therefore, bandwidth estimation by the TCP mechanism is tightly correlated with the link mechanism that provides these implicit signs. For example, random early detection (RED), as a link mechanism, provides the necessary drop signals for TCP-Reno to estimate the available bandwidth. It is worth noting that bandwidth estimation in Reno is based on packet drop. Now imagine a TCP mechanism that does not use packet drop as a congestion measure; it will not be compatible with a link algorithm like RED. An important point is therefore the compatibility between the source algorithm, the TCP mechanism in this study, and the link algorithm, the scheduling and queue management.

Regarding the available bandwidth estimation methods, TCP congestion control schemes can be classified into two categories: proactive and reactive methods [4]. Reactive methods come into play after congestion has developed. Based on the feedback from the congested network, they take some action that drives the network out of this state. The Jacobson congestion avoidance mechanism [5] is one of the reactive methods; it is activated mainly by packet drop, which happens after the congestion occurs. Proactive methods come into play before congestion develops, to avoid and reduce the possibility of network congestion. Proactive methods try to prevent congestion from occurring. Admission control and traffic shaping can be categorized as proactive methods. However, in this research, we do not aim to compare the performance of proactive and reactive methods. It is clear that proactive methods perform better than reactive methods; when a network is congested, packets are lost, bandwidth is wasted, and more.
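Returning to the optimum window W* above: a standard rule of thumb (not a result from the thesis) is that W* is roughly the bandwidth-delay product of the path. The numbers below are assumptions for illustration, echoing the 0.5 Mbps bottleneck and 1 Kbyte packets used later in Section 2.5.1:

```python
# W* as the bandwidth-delay product: the window that fills the pipe
# without building a queue. All numbers here are assumed, not measured.
bottleneck_bps = 0.5e6              # bottleneck link rate, bits/sec
packet_bits = 8 * 1024              # 1 Kbyte packets
base_rtt = 0.070                    # no-queuing round trip delay, seconds

C = bottleneck_bps / packet_bits    # bottleneck rate in packets/sec
w_star = C * base_rtt               # packets in flight that just fill the pipe
print(round(w_star, 1))             # ~4.3 packets for these example numbers
```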

2.2 Problems with Current TCP Implementation

The current implementation of TCP, TCP-Reno, has no mechanism to detect congestion at its onset, because it needs to create losses to estimate the available bandwidth of the connection [6]. In other words, Reno's mechanism for estimating the available bandwidth is to continually increase the window size, using up the buffers along the connection's path, until it makes the network congested and packets are lost. Therefore, Reno is continually congesting the network and creating its own losses to estimate the current available bandwidth. In this regard, TCP-Reno is reactive rather than proactive.

The Jacobson method uses the additive increase and multiplicative decrease algorithm as a means of probing the network and estimating the available bandwidth. Due to the nature of additive increase and multiplicative decrease of the window size, there is an inherent oscillation problem in the Jacobson method, and consequently in TCP-Reno. In other words, in Reno the window size oscillates between the maximum window size (the largest window that does not introduce any drop) and half the maximum window size (a toy trace of this sawtooth appears at the end of this section). Consequently, oscillation in window size leads to oscillation in queue length, and this oscillation in queue length leads to oscillation in the round trip delay of packets (this is clear in Figure 2.1). Furthermore, variation in the queue length also results in larger delay jitter (we are not interested in the jitter variation, since TCP does not carry real time traffic that is sensitive to high jitter variation). In addition, such a probing mechanism is very inefficient in utilizing the available bandwidth, because of the many retransmissions of the same packet after a packet drop occurs.

Another major problem with the Jacobson method is its unfairness to users with longer round trip delays. In other words, TCP-Reno significantly discriminates against connections with longer propagation delays [1]. This is again caused by the inherent characteristics of the additive increase and multiplicative decrease algorithm. Because it will eventually force all of the users to have the same window size, it results in lower throughputs for connections with larger propagation delays.
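The sawtooth just described can be traced with a toy additive-increase/multiplicative-decrease loop; the loss point and starting window below are assumed values, not simulation output:

```python
# Toy AIMD sawtooth: additive increase of one segment per RTT,
# multiplicative decrease (halving) at the assumed loss point w_max.
w, w_max = 8, 16
trace = []
for _ in range(30):
    trace.append(w)
    w = w // 2 if w >= w_max else w + 1
print(trace)  # ..., 14, 15, 16, 8, 9, ...: the window cycles between w_max and w_max/2
```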

2.3 Proactive Congestion Control

There are several proposed approaches to proactive congestion control. Since congestion control consists of several phases (e.g., slow start, congestion avoidance, ...), by being proactive we focus mainly on the congestion avoidance phase. One of the methods reported in the literature is based on round trip delay variation. In other words, instead of approaching the optimal value of the window size by introducing packet drops in the network through the additive increase and multiplicative decrease operation, it attempts to use the round trip delay variation to estimate the available bandwidth.

Jain et al [7], [8] propose a mechanism called "Congestion Avoidance using Round-trip Delay" or CARD. The whole idea in this proposal is very simple. Going back to Figure 2.1, when the source window size increases, the queues in each node will build up and, consequently, the round trip time will increase. Since increasing the window size is equivalent to an increase in load, this means that an increase in load will eventually lead to an increase in round trip delay. In other words, the optimum value of the window size leads to maximum throughput and minimum round trip delay. They define power as the ratio of throughput to round trip delay; by maximizing power, they try to achieve the optimum value of the window size. Moreover, for window controlled networks, the connection throughput is approximately equal to the ratio of the number of outstanding packets to the round trip delay, which is the ratio of window size to round trip delay. If the network is not congested and the load is low, changes in window size do not result in changes in round trip delay. On the other hand, when the network is congested, any small increase in window size will lead to a large increase in round trip delay. In this way, the gradient of the round trip delay versus the window size can be used as a good measure to estimate the available bandwidth and to adjust the window size very close to its optimal value (a small sketch of this rule is given at the end of this section).

Wang and Crowcroft [9] propose another method called "Slow Start and Search" or Tri-S. The main idea in Tri-S is similar to CARD, namely the flattening of the throughput when the network is congested. Tri-S uses the current throughput gradient to search for the optimum value of the window size, instead of using the delay gradient as in CARD. In addition, similar to CARD, Tri-S still depends on round trip delay measurement to calculate the changes in the throughput slopes. Brakmo et al [2], [10] propose another congestion avoidance mechanism based on round trip delay measurement, called TCP-Vegas. In the next part we explain the congestion avoidance mechanism of Vegas in detail.
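Below is a minimal sketch of the CARD decision rule described above; the function name, signature, and exact normalization are our illustrative assumptions rather than code from [7], [8]:

```python
# Sketch of the CARD idea: compare the normalized RTT gradient against zero.
# If RTT no longer grows with the window, there is still spare bandwidth.
def card_direction(w_prev, rtt_prev, w_curr, rtt_curr):
    """Return +1 to additively grow the window, -1 to shrink it.
    Assumes the window actually changed between the two samples."""
    # normalized delay gradient: relative change in RTT per relative change in window
    gradient = ((rtt_curr - rtt_prev) / (rtt_curr + rtt_prev)) / \
               ((w_curr - w_prev) / (w_curr + w_prev))
    return +1 if gradient <= 0 else -1

# e.g. RTT stayed flat while the window grew: keep increasing the window
print(card_direction(10, 0.100, 11, 0.100))  # -> 1
```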

2.3.1 Vegas Congestion Avoidance Mechanism

The main idea behind the congestion avoidance mechanism in Vegas is to measure and control the amount of data in transit by looking at changes in throughput. When the network is not congested, the actual throughput will be very close to the expected throughput. In contrast, when the network is congested, the actual throughput will be smaller than the expected throughput. Vegas uses this difference in throughput as a measure for window adjustment, instead of using changes in the throughput slope as in Tri-S. In the congestion avoidance phase, the connection is not being overflowed; therefore, the expected throughput is given by:

Expected = WindowSize / BaseRTT

where WindowSize is the size of the current congestion window, which in the congestion avoidance phase with no packet drop is equal to the number of packets in transit, and BaseRTT is the minimum of all measured round trip delays. The maximum value of the expected throughput is achieved when there is no congestion in the network. That means a good estimate of BaseRTT should not include any queuing delay; theoretically, it will be equal to twice the propagation delay plus the transmission delay. The calculation of the actual throughput can be done by recording two values, the sending time of a specific segment and the number of packets sent until the acknowledgment of that specific segment arrives, and dividing the number of packets transmitted by the actual round trip delay of that specific segment. The number of segments in transit cannot be greater than the window size, and it cannot be far less than the window size if no packets drop. Therefore, we can approximate the actual throughput as:

Actual = WindowSize / rtt

where rtt is the actual round trip delay of a segment. Based on the value of diff = (Expected - Actual) * BaseRTT, the congestion window will increase by one segment, decrease by one segment, or remain unchanged. diff is the estimated backlogged queue size, provided BaseRTT is measured accurately and does not include any queuing delay (no drop in the congestion avoidance phase). Vegas tries to keep this backlogged queue size in a limited interval. The formal algorithm for the window update is as follows:

WindowSize = WindowSize + 1    if diff < α
WindowSize = WindowSize - 1    if diff > β
WindowSize unchanged           otherwise        (2.1)

The window algorithm can also be illustrated graphically, as shown in Figure 2.2.


Figure 2.2 Window algorithm of TCP-Vegas
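The update rule above fits in a few lines of code. The following is a minimal sketch, not the ns-2 or any production TCP implementation; the variable names and the once-per-round-trip invocation are our assumptions:

```python
def vegas_update(cwnd, base_rtt, sample_rtt, alpha=1, beta=3):
    """Return the new congestion window (segments) after one round trip."""
    expected = cwnd / base_rtt                # rate achievable with no queuing
    actual = cwnd / sample_rtt                # rate observed over this round trip
    diff = (expected - actual) * base_rtt     # estimated packets backlogged in queues

    if diff < alpha:        # too little data buffered: claim more bandwidth
        return cwnd + 1
    if diff > beta:         # too much data buffered: back off
        return cwnd - 1
    return cwnd             # backlog within [alpha, beta]: hold steady
```

For example, with the connection 3 values quoted later in Section 2.5.2 (cwnd = 6, base_rtt = 76 msec, sample_rtt = 140 msec), vegas_update(6, 0.076, 0.140) estimates a backlog of about 2.7 packets, which lies between α = 1 and β = 3, so the window is left unchanged.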

As mentioned before, without congesting the network, Vegas tries to be proactive in its estimation of the available bandwidth. This mechanism is totally different from Reno's congestion avoidance mechanism. That means it does not need to introduce losses in the network to estimate the available bandwidth; therefore, it utilizes the bandwidth more efficiently than TCP-Reno. The main result reported in [2] is that Vegas can achieve between 37 and 71 percent better throughput than Reno. Here we are not trying to evaluate those results or compare the performance of Vegas and Reno; for a more detailed comparison and evaluation see [3], [11], [12].

2.4 Major Drawback of Delay-Based Congestion Control Algorithms

A congestion control scheme must maintain the balance of demand and supply in the network to achieve two basic objectives: the link capacity is fully utilized, and the resources are shared among users in a fair manner. Although TCP-Vegas aims to provide a better utilization of the link capacity and less variation in delay jitter, it suffers from one main disadvantage, which is not limited to TCP-Vegas but is shared by almost all of the delay-based approaches to congestion control. Since delay-based congestion control algorithms are heavily dependent on the measurement and estimation of round trip delays, any over- or under-estimation of round trip delays will affect their performance and degrade the proposed improvements.

2.4.1 Errors in Round Trip Delay Measurement

In this part, we investigate why and when there will be an error in round trip delay estimation. End-to-end delay can be divided into several components: transmission delay, propagation delay, packet processing delay, and queuing delay. Transmission delay is the time required for a packet to be sent physically; in other words, it is the time from when the first bit in a packet is sent until the last bit in the packet is sent. Packet processing delay is the time required for a router to forward a packet, i.e., the time difference between picking up the packet and sending it on the attached link; it does not include the queuing delay. In our theoretical analysis, we assume that the processing time is negligible and the packet processing delay can be ignored. Furthermore, we assume that the capacities of the access links are large enough that they do not incur any transmission delay. Hence, the transmission delay for each packet will be 1/C seconds, where C is the bottleneck link bandwidth (in packets per second). Therefore, the round trip delay can be divided into two parts, propagation delay and queuing delay, and each of these two parts can create errors in the round trip delay measurement.

Since the topology of the network changes from time to time, due to link failures and changes in the route of the segments (rerouting), the propagation delay for an already established connection can vary because of changes in the path of the connection. Consequently, the propagation delay at time t is not necessarily equal to the propagation delay at time t + Δ (Δ > 0). Rerouting therefore changes the propagation delay and may create an error in the round trip delay. In our research, we did not take the possibility of route changes into consideration. Moreover, since the state of the congestion in the network changes all the time, the queue length in every router will vary. Depending on the traffic characteristics, the queuing delay will also vary, which may create an error in the round trip delay.
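As a quick numeric illustration of the 1/C term (using the 0.5 Mbps bottleneck and 1 Kbyte packets from the simulations of Section 2.5.1; the exact bit count per packet is our assumption):

```python
# Transmission delay 1/C at the bottleneck, expressed per packet.
bottleneck_bps = 0.5e6          # bottleneck link rate, bits/sec
packet_bits = 8 * 1024          # 1 Kbyte packets, assumed 8192 bits each
C = bottleneck_bps / packet_bits    # ~61 packets/sec
print(1.0 / C)                  # ~0.0164 s: each backlogged packet ahead of
                                # us adds ~16 msec to the round trip delay
```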

2.5 TCP-Vegas Problems

As we saw in the previous section, there can be errors in round trip delay estimation due to rerouting and congestion. Since BaseRTT has an important role in the congestion detection and congestion avoidance mechanisms of TCP-Vegas, it is crucial to have a good estimate of BaseRTT. In the following sections, we show how an inaccurate estimation of the round trip delay can lead to an unfair share of the total bandwidth [13]. Furthermore, the over-estimation of the propagation delay pushes up the buffer backlogs, which leads to persistent congestion [14].

2.5.1 Unfairness

In order to show that Vegas fails to achieve the second goal of congestion control, i.e., fairness, we define a scenario and show both analytically and experimentally that TCP-Vegas treats old connections unfairly.

Figure 2.3 Network model

The network model that we use in this scenario is described in Figure 2.3. The model consists of three sources (S1, S2, S3), three destinations (D1, D2, D3), two intermediate routers (R1, R2), and the links connecting those elements. Suppose the bottleneck link is between R1 and R2 (all other queues are empty, so there is no queuing delay associated with them). Therefore, the only place where a queue may build up is R1. Let the bottleneck link capacity be equal to C (segments/second) and the propagation delay between S_i and D_i be equal to τ_i (sec). Suppose sources S1 and S2 establish connections to destinations D1 and D2 respectively at the same time, and source S3 establishes its connection to destination D3 afterwards. Queue management in R1 is Drop Tail with enough buffer space, so no drops are introduced. Before connection 3 joins, we can deduce from Equation (2.1) that the S1 and S2 congestion windows reach a settling point such that α < diff_i < β (i = 1, 2). Therefore we have:

α < (WindowSize_i / BaseRTT_i - WindowSize_i / rtt_i) * BaseRTT_i < β        (2.2)

Since there is no queuing delay involved before connection 3 joins, BaseRTT_i is measured accurately as 2τ_i + 1/C (assuming a high access link bandwidth). Consequently, we have:

α * rtt_i / (rtt_i - 2τ_i - 1/C) < WindowSize_i < β * rtt_i / (rtt_i - 2τ_i - 1/C)        (2.3)

From Equation (2.3), we can observe one of the advantages of Vegas over Reno: Vegas does not force all of the connections to have the same window size. As mentioned before, Reno eventually forces all of the users to have the same window size, which results in lower throughputs for connections with larger propagation delays. Here, in Vegas, the window size is approximately proportional to the propagation delay, and connections with longer propagation delays will have larger window sizes, which results in a fair capacity allocation to all of the connections. But there is an important requirement: the measurement of BaseRTT, on which the window mechanism of Vegas is based, should be perfect. Unfortunately, Vegas fails to meet this requirement most of the time, particularly when a connection is established in an already congested network.

To see how Vegas treats old connections when a new connection is established, suppose connection 3 joins the network. Let μ_before and μ_after be the mean queue size in router R1 before and after connection 3 joins the network, respectively. For connections 1 and 2, Equation (2.3) still holds, and WindowSize_i (i = 1, 2) can be calculated as:

α * (2τ_i C + 1 + μ_after) / μ_after < WindowSize_i        (2.5a)

and

WindowSize_i < β * (2τ_i C + 1 + μ_after) / μ_after        (2.5b)

where rtt_i = 2τ_i + (1 + μ_after)/C is the round trip delay seen by connections 1 and 2 after connection 3 joins.

Since connection 3 does not have a reasonable approximation of the propagation delay, it will set its BaseRTT to:

BaseRTT_3 = 2τ_3 + (1 + μ_before)/C        (2.6)

and its measured round trip delay will be:

rtt_3 = 2τ_3 + (1 + μ_after)/C        (2.7)

Therefore, after reaching the stable condition, the size of the congestion window for connection 3 will satisfy:

α * (2τ_3 C + 1 + μ_after) / (μ_after - μ_before) < WindowSize_3 < β * (2τ_3 C + 1 + μ_after) / (μ_after - μ_before)        (2.8)

If we consider the same propagation delay for all of the connections (τ_i = τ for i = 1, 2, 3) and compare Equations (2.8) and (2.5b), we observe that the congestion window for connection 3 converges to a larger value than the congestion windows for connections 1 and 2, since the denominator μ_after - μ_before is smaller than μ_after. Thus the new connection can achieve higher bandwidth than the old connections.

To confirm the above analysis, we ran two simulations with the Network Simulator (ns-2) [15]. The network model used in the first simulation was shown in Figure 2.3. The following parameters were used: all link bandwidths and propagation delays except those of the bottleneck link are 1 Mbps and 5 msec respectively, the bottleneck link propagation delay is 20 msec, and the Vegas congestion control parameters are set to α = 1 and β = 3. Packet sizes for all connections are the same and equal to 1 Kbyte. Each simulation lasted 60 seconds. Sources S1 and S2 establish their connections after 1 second, and connection 3 joins the network after 10 seconds.

Figure 2.4 Simulation results when C = 0.5 Mbps. (a) Connection goodputs. (b) μ_before and μ_after.

As can be seen from Figure 2.4a, after connection 3 joins the network, the goodputs of the previously established connections decrease considerably, compared to the higher goodput of the last connection. Figure 2.4b shows 4 packets backlogged in the bottleneck queue before connection 3 joins. The reason is that Vegas attempts to keep at least α packets but no more than β packets in the queues. Due to the queuing delay corresponding to this backlog, the last connection over-estimates BaseRTT, which leads it to open its window size beyond its fair share. This scenario will happen every time a new connection joins the network, and it may lead to persistent congestion. In order to validate this, another simulation was run, based on Figure 2.5. In this scenario we let another connection, connection 4, join the network after 20 seconds. All other parameters are the same as in the previous simulation. As expected, the later connection establishment of S4 leads to a very unfair allocation of bandwidth, particularly for connections 1 and 2.


Figure 2.5 Network topology after the new connection joins the network

Figure 2.6 Simulation results when C = 0.5 Mbps. (a) Connection goodputs. (b) Instantaneous queue length.

As expected, the settling point of the congestion window for each new connection is larger than its fair share, and the reason is essentially the over-estimation of BaseRTT.

2.5.2 Persistent Congestion

It can be observed from Figure 2.6b that every time a new TCP-Vegas connection joins the network, the queue size jumps to a much higher value, which is not desirable and is the major drawback of TCP-Vegas. Corresponding to the parameter setting in our simulation, we expect each Vegas connection to keep between one and three packets in the gateway. For example, in Figure 2.4b, four packets build up in the buffer, i.e., two packets for each connection, before the third connection joins the network. Therefore, we expect that after connection 3 joins the network, the queue size should be no more than seven packets. However, Figure 2.4b shows that the number of packets in the buffer is 11. The same result can also be seen in Figure 2.6b: before connection 3 joins the network, the queue size is four packets; after connection 3 joins, the queue size is 11 packets; and after connection 4 joins the network, the queue size is 15 packets. In other words, the queue length in the buffer is larger than it should be and will lead to persistent congestion in the network. Again, the reason behind this phenomenon is the error in the minimum round trip delay estimation, which causes the new connections to keep more packets in the buffer.

We learned that Vegas estimates the backlog in the buffer by diff = (Expected - Actual) * BaseRTT and tries to keep this backlog between α and β. In the following, we show that any over-estimation of BaseRTT results in an under-estimation of the backlog; in other words, the over-estimation of BaseRTT leads to a greater backlog. Let us assume that Vegas measures its BaseRTT as:

BaseRTT = real_min_RTT + ε        (2.9)

where real_min_RTT is the minimum possible value of the round trip delay when there is no congestion, and ε is the error corresponding to the queuing delay. Consequently, the backlog estimate will be:

diff = (WindowSize - WindowSize * real_min_RTT / rtt) - WindowSize * ε / rtt        (2.10)

Equation (2.10) shows an under-estimation of WindowSize * ε / rtt in the backlog. It means that Vegas will actually keep the backlog between α + WindowSize * ε / rtt and β + WindowSize * ε / rtt.

To confirm this with a simulation result, let us go back to Figure 2.6b. Because of the error in BaseRTT for connection 3, the number of backlogged packets for connection 3 is 6. To calculate the backlog estimation error, the congestion window size is shown in Figure 2.7. The congestion window, ε, real_min_RTT, and rtt for connection 3 are equal to 6 packets, 64 msec, 76 msec, and 140 msec respectively. Therefore, WindowSize * ε / rtt is approximately equal to 3 packets, which exactly matches our simulation result shown in Figure 2.6b.
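A one-line check of that arithmetic, using the connection 3 numbers quoted above:

```python
# Backlog under-estimation WindowSize * epsilon / rtt for connection 3.
cwnd = 6            # congestion window, packets
epsilon = 0.064     # BaseRTT over-estimation, seconds
rtt = 0.140         # measured round trip delay, seconds
print(round(cwnd * epsilon / rtt, 2))  # ~2.74, i.e. ~3 extra packets in the buffer
```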


Figure 2.7 Congestion window for each connection

From the above analysis and observations, it can be concluded that the over-estimation of BaseRTT also creates a large number of packets backlogged in the buffer, which will lead to persistent congestion. Our expectation of Vegas is that it maintain at most β packets per source in the buffer. Therefore, with n active connections, the queue length should be no more than nβ, but our simulation results show that the current implementation of Vegas accumulates more than nβ packets in the buffer. The growth of the backlogged queue length with an increasing number of active connections remains one of the weak points of Vegas' performance.

Chapter 3

Active Networking and Congestion

Active networking is a revolutionary networking technology that offers a change in the traditional network paradigm by introducing programmability into the basic infrastructure of the network. Exploring and justifying this change in the network architecture requires understanding the advantages, disadvantages, and trade-offs of the new approach. There are two active network architectures: the programmable router approach and the capsule approach. In the first part of this chapter, we briefly overview the active network concept and discuss the trade-off between flexibility and efficiency. We note that congestion control is a prime candidate for active networking, since it is essentially an intra-network event and is potentially far removed from the applications. In the second part of this chapter, we describe how smart packets improve the performance of a large complex network by obtaining information from the core of the network and using this information to update the end point entity parameters, namely the congestion control parameters. In the last part of this chapter, the major drawbacks of the active network approach are explained.

Current networks face many problems, e.g., difficulty of integrating new technologies, poor performance, and difficulty of accommodating new services within the existing architectural model. There are two main bottlenecks to the solutions of these problems: the Internet itself is hard to change, and current network evolution is limited by the speed of standardization. The standardization process, which needs to be agreed on by a large number of vendors and manufacturers, in general with different points of view, is lengthy; thus it delays the implementation of new services on a large scale. As a consequence, the introduction of new network services at this level is very slow. For example, there was a span of five or more years from the time that RSVP was proposed to the time that it was deployed, even in a very limited manner. Active networking is intended to be a remedy to these problems. Although the idea of using active or smart packets is not new and has been previously proposed in the literature [4], [16], [17], [18], the concept of active networking resulted from discussions within DARPA in 1994 and 1995 on the future direction of networking systems [19].

Traditional networks transparently transfer data from one end to the other; i.e., the network is not sensitive to the packets it carries, and they are transported between the end systems without any modification. In other words, the packets are static, the network nodes simply use routing information to forward the packets², and all packets are treated similarly.


Figure 3.1 Traditional passive network paradigm

² With the implementation of DiffServ, IntServ, and other complicated services, the network no longer only forwards packets.

Active networks introduce programmability into the basic infrastructure of the network. Each user can inject programs contained in messages into a network that is capable of performing computation and manipulation on behalf of the user. If we think of the IP header in traditional networks as input data to a virtual machine, we can imagine a packet in an active network containing methods as well as a data payload. In other words, the user specifies how the packet should be processed and forwarded. Figure 3.2 describes the active network paradigm.


Figure 3.2 Active network pattern

The network can be "active" in two ways:

- Discrete, or programmable switch/router: switches perform computation on the user data flowing through them. The injection of the methods into the network is separated from the processing of messages.
- Integrated, or capsule: individuals inject programs into the network. Each packet contains both a program fragment and a piece of data.

These two active network architectures sit at the opposite extreme ends of the design space. The programmable router approach introduces a little flexibility into the network, while maintaining the efficiency of packet forwarding. In contrast, the capsule approach achieves almost full flexibility, but packet forwarding efficiency is sacrificed. Hence there is a trade-off between programmability and efficiency.

There is also a paradox in using active networks, especially in the capsule approach. On the one hand, the execution of each packet program increases the amount of processing in the network; as a consequence, the packets experience longer per-hop latency. On the other hand, active networks are expected to improve network performance measures such as per-hop latency and end-to-end delay. Although longer per-hop latency and other effects of active networking may appear to degrade the network performance, they may actually lead to a better overall performance, perhaps by reducing the requested bandwidth, creating less congestion, and so on. The important point is that with active networks, the network services are there to support the user's requirements. Although some applications of active networking have shown great promise, the advantages of the active network approach are still not obvious. More concrete results are needed to answer the following questions: What existing important network services will be significantly improved by active networks? What useful new services will be enabled?

The answers to these questions fall under two categories:

Some network services can be improved or supported using information that is only available inside the network. In other words, the network may have information that is required by the applications and the users to fully optimize performance; the timely use of that information can significantly improve the performance. Examples of these services are:
- Accessing accurate information about the current state of congestion in the network [20].
- Locating the hot spots in the network where requests for an object are high, in order to perform caching in network nodes [21].
- Moving management decision points closer to the node being managed [22].
- Finding the location of packet loss in multicast distribution trees [23].

To improve the performance of some other services, the network needs information from the users and the applications. Examples of this type of information include:
- How important is the user data unit? In many applications, retransmission of a lost packet degrades the network performance.
- Usability of cached data, instead of real-time data.

In summary, the performance of an active network should be evaluated in terms of application-specific metrics.

3.2 Exploring the Network by Smart Packets

As mentioned in the introduction, congestion control and resource management are prime candidates for active networking, since they are specifically intra-network events and are potentially far removed from the applications. There are two approaches to applying active networking technology to congestion control and resource management: exploring the state of the network through reports from entities inside the network, and learning the network condition through special packets generated by the end points. These two methods are tightly coupled, because without active entities inside the network, sending monitor packets is not beneficial.

Having some congestion control policy implemented by entities inside the network can have several advantages [24]:
- Easy collection of the necessary information from strategically placed network entities for making congestion control decisions.
- Removing the corresponding round trip delay in accessing the relevant information, and hence a faster response to changes.
- In contrast with traditional congestion control mechanisms, entities in the network can give explicit feedback to the end points, while the end points remain responsible for eventually enforcing the congestion control decision.

Examples of active processing by entities inside the network can be found in several current congestion control mechanisms. We describe several proposals based on active entities inside the network. Early packet drop, proposed in [17] and [18], is a type of active network processing that improves the performance of ATM networks. In an ATM network, packets are fragmented into cells. Transmitting the remaining cells of a fragmented packet when one of its cells has already been lost in the switch wastes bandwidth. In [17], Romanow and Floyd propose an aggressive dropping mechanism that drops entire packets when the congestion threshold is reached. Along the same line, Floyd and Jacobson in [18] propose the random early detection (RED) mechanism, which randomly drops or marks packets when the average queue length exceeds a threshold. In both proposals, no smart packets are used; instead, active entities implicitly return the congestion status to the end points.

Active networking congestion control (ACC), proposed in [25], is a system that uses entities in each router to react immediately to congestion. ACC replaces the conventional end-to-end congestion control with a new hop-by-hop congestion control. In times of congestion, each router reports the state of congestion back to the end point and actively slows down the transmission rate, as if the end point had instantly reduced its sending rate. This mechanism has two advantages: first, the end points are informed of the congestion much earlier, because the feedback delay is less than a round trip delay; and second, the internal network nodes beyond the congested router do not experience the congestion. This can improve the aggregate throughput in the core network.

The second method for exploring the network is to generate some type of special packet that is treated in a particular way in the network, to gather information about the state of the network. These special packets have been given several names in the literature: sample packet, smart packet, control packet, monitor packet, scout packet, sampling packet, probe packet, and so on. We use these names interchangeably. This approach has the extra advantage that it is easier to implement and more compatible with the current Internet architecture.
It can also be integrated with other congestion control means, resulting in an overall superior performance. The idea of using smart packets in a computer network is not new and has already been proposed several times to improve network performance. We describe several proposals based on the smart packet idea. It should be noted that some active entities are still required inside the network to update the information carried by the smart packets.

In [16], Kent and Mogul enumerate the disadvantages of packet fragmentation in the gateway. To avoid fragmentation, they propose a probing mechanism to discover the minimum MTU (maximum transmission unit) along the path the datagram will follow. In this mechanism, the packet collects MTU information along its route. We use a similar mechanism in our proposal to improve the performance of TCP-Vegas. As mentioned earlier, sending probe packets without having active entities inside the network would be futile; therefore, each router must update the probe according to the MTU of the hop it is about to take. Furthermore, collaboration between the two end systems is also required, since the MTU information should be returned to the sender.

In [4], Haas proposes a mechanism, adaptive admission congestion control (AACC), to monitor the congestion status of the network using a periodic transmission of time-stamped sampling packets through the network. The monitor/sample packet travels through the network like a regular packet. At the other end, the receiver calculates the virtual delay, which is the difference between the time-stamp and the value of the local clock. This virtual delay is returned to the sender via the acknowledgement packet. The sender then compares the values of the virtual delay for the last two consecutive acknowledgment packets, which equals the difference between the actual delays. Based on this difference, the sender adjusts its congestion parameters; a larger difference means a higher level of congestion in the network (a small sketch of this comparison appears at the end of this section).

In [26], Hicks et al describe a routing method using scout packets which is very similar to ATM VC switching. Users periodically send scout packets that explore the network to search for a better route and direct the flow of the regular data packets. Furthermore, the scout packets communicate with each other by leaving part of their state at each router they visit. This state consists of some metric representing the best path so far. If a scout packet reaches the destination with the best metric, it announces the best path by returning along its route to the sender and installing an entry in the routing table of each router on the path. As a result, the regular data packets are sent along the path just set up by the returning scout packets.
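The AACC virtual-delay comparison is easy to sketch. The function below is our own illustration (names and signature are assumptions, not code from [4]); note that the sender and receiver clocks need not be synchronized, since any constant offset cancels in the difference:

```python
# Sketch of the AACC idea: compare virtual delays of consecutive sampling packets.
def congestion_trend(stamp_prev, recv_prev, stamp_curr, recv_curr):
    """Positive result: queuing delay is growing, so congestion is building."""
    v_prev = recv_prev - stamp_prev   # virtual delay = local clock - time-stamp
    v_curr = recv_curr - stamp_curr   # (any clock offset cancels in the difference)
    return v_curr - v_prev

# e.g. the second sample spent ~30 msec longer in queues than the first
print(congestion_trend(0.000, 0.080, 1.000, 1.110))  # ~0.03
```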

3.2.1 In-Band and Out-Band Monitor Packets

In the previous section, we described and enumerated the advantages of using monitor packets to explore the network. In this section, we consider how these monitor packets are processed inside the network. There are two approaches for monitor packets: in-band and out-band.

In the in-band approach, monitor packets are treated the same way as regular data packets. They receive the same level of forwarding, routing, and other services as other data packets. Therefore, both the data and the monitor packets are subject to being delayed or even dropped when a node is congested. This might be an advantage or a disadvantage, depending on the application. For example, if we are interested in measuring and analyzing the packet delay distribution, the in-band approach is a suitable choice. Furthermore, by examining the arrival of the monitor packets, one can also obtain a sample of the current congestion level of the data-forwarding path, as proposed in [4]. On the other hand, if propagation delay measurement is important, the in-band approach may not be appropriate. Simply put, for control algorithms that are sensitive to queuing delay, the in-band approach cannot be used. However, this approach is simple to implement and has less overhead compared to the out-band method.

In the out-band approach, the control packets receive special, usually better, service than the regular data traffic at the interior nodes. It requires special management in the forwarding module of an interior node, such as quick forwarding of the out-band monitor packets. For example, in the resource reservation protocol (RSVP) [27], the PATH, RESV, and other control messages receive special treatment to reserve resources along a path inside the network; the special treatment for a PATH message originating from a new source requesting a reservation is to forward it accordingly. With out-band packets, it is possible to schedule different flows through the interior node so that any kind of specific requirement can be met. As mentioned above, the out-band approach with a priority scheduling mechanism is more appropriate for a round-trip-delay sensitive control algorithm; a minimal sketch of such a priority buffer follows below. However, there could be a scaling problem at the active network node, since potentially there may be millions of monitor packets waiting to receive special treatment. This problem also affects in-band monitor packets when they collect interior node status for management and other purposes. We will summarize the major drawbacks of the active network approach in the next section.
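The following is a minimal sketch of such an out-band priority drop-tail buffer, of the kind our active Vegas proposal assumes at the gateways in Chapter 4; the class and method names are illustrative and are not the ns-2 implementation:

```python
from collections import deque

# Two-class drop-tail buffer: out-band monitor packets bypass the data queue,
# so they see (almost) no queuing delay; data packets are dropped at the tail.
class PriorityDropTail:
    def __init__(self, capacity):
        self.capacity = capacity    # data-queue limit, in packets
        self.probes = deque()       # monitor/probe packets, always served first
        self.data = deque()         # regular data packets

    def enqueue(self, pkt, is_probe=False):
        if is_probe:
            self.probes.append(pkt)
        elif len(self.data) < self.capacity:
            self.data.append(pkt)
        # else: drop tail (the arriving data packet is discarded)

    def dequeue(self):
        if self.probes:
            return self.probes.popleft()
        return self.data.popleft() if self.data else None
```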

It has been demonstrated that active networking technology can provide better performance and enable new services, but what is the price for these improvements? What are the problems created by this new proposal? What does this new technology degrade in order to improve its strengths? In this section, we briefly answer these questions.

Processing power
Perhaps the primary role of the active network is communication, for which resource management and congestion control are a tremendous win for the active network approach. In addition, active networks allow computation to move from the edge of the network into the interior nodes, where the important information can be accessed in a timely manner. Computation inside the network requires extra processing power in the network interior nodes, which is not necessary for simple forwarding in the traditional network. Therefore, the major drawback of the active networking approach is the extra processing overhead it introduces. In the active network paradigm, not only is bandwidth valuable, but the processing power also has a price. Najafi and Leon-Garcia [28] propose a model that includes the cost of processing as well as bandwidth. The new cost model can be used to find the optimum operating point for the trade-off between processing power and bandwidth.

Safety and security
Another difficulty associated with the active network approach is safety and security. As an illustration, if an active node is to allow any arbitrary C program to be executed in the node, then it will be a vulnerable point for malicious attacks. Implementing safety and security in an active network requires run-time checking mechanisms. For example, in the smart packet paradigm, the authentication of the packet should be checked to verify the identity of the user and to limit resource access. This in turn adds more burden on the processing requirements.

Scalability
Scalability is another bottleneck in active network technology. Generally speaking, the active network is not scalable because there may be a huge number of active packets from thousands of users requesting special services and treatment in the active nodes. For example, integrated services IP with the RSVP control mechanism is not scalable because a path needs to be set up for every flow, and keeping the state of each flow inside the network is not possible when the number of users is extremely large.

Interoperability
To be deployed in the current IP network, active networking technology should perform in a heterogeneous environment. To achieve this goal, active network proposals should minimize the amount of global agreement required.

To conclude this chapter, it is worth mentioning that in this research we focus mainly on the idea of smart/monitor/scout packets to explore the network state; the other extreme of active network technology, the capsule approach, is not our main interest. In Chapter 4, we describe our proposed active TCP-Vegas, which takes advantage of monitor packets to obtain a better estimation of the propagation delay of the connection path.

Chapter 4

Active Congestion Control

The main conclusion from Chapter 2 is that TCP-Vegas is unfair to old connections and creates persistent congestion in the buffer. The reason for these drawbacks lies basically in the over-estimation of the minimum round trip delay. In the previous chapters, it has been shown that the over-estimation of the propagation delay for a given connection leads to an increase in its rate and results in unfairness. Furthermore, over-estimation causes more packets to be accumulated in the buffer. Perhaps the main weakness of TCP-Vegas is its propagation delay estimation. Hasegawa et al. [13] propose enhanced TCP-Vegas to overcome this failure. In the first part of this chapter, we review their proposal and, based on simulation results, describe the major weakness of this method. Other researchers [29], as well as Hasegawa [30], propose DRR (Deficit Round Robin) as a mechanism to increase the fairness of TCP-Vegas. In the second part of this chapter, we briefly overview this proposal and enumerate its problems. In the third part of this chapter, we introduce our own methodology, Active Vegas, to resolve the unfairness of TCP-Vegas.

4.1 Introduction

Flow control, which regulates the amount of traffic, is a distributed algorithm. It can be divided into two parts: the link algorithm, which is executed inside the network, and the source algorithm, which is executed at the edge of the network. These two algorithms are tightly related to each other, and together they provide the required framework for flow control. Flow control can be seen as a general mechanism that covers most of the control mechanisms that maintain good system performance. For example, in a differentiated services network, there are control mechanisms on the boundary of the network, e.g., marking, policing and shaping (source algorithms), and there are other control mechanisms in the core network, e.g., queuing and scheduling (link algorithms). The main duties of the link algorithm are congestion detection inside the network and returning this information to the source algorithm. It is worth mentioning that in the current Internet, there is no explicit way for the link algorithm and the source algorithm to communicate with each other. After implicitly receiving congestion information from the link algorithm, the source algorithm adjusts the rate at which it sends traffic. As an illustration, FIFO, RED, and RED with ECN are link algorithms. RED detects congestion by setting a threshold on the average queue size and informs the sender implicitly by dropping a packet randomly when the average queue size is greater than the threshold. RED with ECN detects congestion the same way as RED does, but instead of dropping the packet randomly, it marks the packet randomly to explicitly inform the sender. As far as the source algorithm is concerned, there are many TCP flow control algorithms that adjust the rate at which traffic is injected into the network in response to congestion feedback from the link algorithm. Coordination between the link algorithm and the source algorithm is therefore crucial. For example, RED and TCP-Reno can work together because Reno's rate adjustment is based on packet drops, and RED detects congestion and notifies Reno by dropping packets. In contrast, RED is not compatible with TCP-Vegas because Vegas' rate adjustment is based not only on packet drops but also on round trip delay measurements. Therefore, to improve the overall performance of Vegas, either the source algorithm should be changed, or a compatible link algorithm should be investigated, or both.
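As an illustration of the RED link algorithm just described [18], here is a minimal sketch of the average-queue computation and the drop/mark decision; the weight and threshold values are illustrative defaults of ours, and details of full RED (idle-time adjustment, inter-drop count) are omitted.

```python
import random

class RedQueue:
    """Sketch of RED [18]: drop (or, with ECN, mark) arriving packets with a
    probability that grows as the EWMA of the queue size exceeds a threshold."""

    def __init__(self, w=0.002, min_th=5, max_th=15, max_p=0.1, ecn=False):
        self.w, self.min_th, self.max_th, self.max_p = w, min_th, max_th, max_p
        self.ecn = ecn          # if True, mark instead of dropping
        self.avg = 0.0          # exponentially weighted average queue size

    def on_arrival(self, queue_len):
        """Decide the fate of an arriving packet given the current queue length."""
        self.avg = (1 - self.w) * self.avg + self.w * queue_len
        if self.avg < self.min_th:
            return "enqueue"
        if self.avg >= self.max_th:
            return "mark" if self.ecn else "drop"
        # drop/mark probability ramps up linearly between the two thresholds
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        if random.random() < p:
            return "mark" if self.ecn else "drop"
        return "enqueue"
```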
4.2 Enhanced TCP-Vegas
Probably the most innovative feature of TCP-Vegas is its congestion detection mechanism during congestion avoidance, which leads to a stable window size that does not oscillate dramatically like TCP-Reno's. Hasegawa [13] claims that this condition of unchanging window sizes is responsible for most of TCP-Vegas' drawbacks, particularly the unfairness to old connections and the creation of persistent congestion inside the network. Hasegawa [13] proposes an algorithm, Enhanced TCP-Vegas, to intentionally force the window size to oscillate around the appropriate value and prevent the convergence of the window size to a stable point. Their algorithm is based on setting α equal to β in TCP-Vegas. Intuitively, the idea comes from the fact that TCP-Reno and its ancestor, TCP-Tahoe, do not penalize the old connections after a new connection is established. Therefore, making the window size behavior closer to TCP-Reno's can be helpful. Referring to Equation (2.1), making α equal to β introduces more oscillation in the congestion window size. As previously mentioned, Vegas tries to keep the backlogged queue size in a limited interval, namely between α and β. When the backlogged queue size is less than α, the congestion window size is increased by one, and when the backlogged queue size is greater than β, the congestion window size is decreased by one. Having set α equal to β, the congestion window will oscillate around the convergence point.

4.2.1 Fairness Evaluation of Enhanced Vegas

The Enhanced Vegas proposal focused only on a simple network topology: a single-hop network with only two homogeneous connections. Homogeneous connections have the same propagation delays. Returning to Equation (2.8), it can be seen that any over-estimation of the propagation delay for a connection still leads to an increase in window size. In the next part, we will see from the simulation results that the congestion window size oscillation proposed by Hasegawa is not able to compensate for this effect. Here, we conducted two sets of simulations with different values of α and β to evaluate the performance of the Enhanced Vegas algorithm. The network model is shown in Figure 2.3. The following parameters were used: all link bandwidths and propagation delays, except the bottleneck link's, are 1Mbps and 5ms, respectively. The bottleneck link capacity and propagation delay are 0.5Mbps and 20msec, respectively. Packet sizes for all connections are the same and equal to 1Kbyte. Each simulation lasted for 50 seconds. Sources S1 and S2 establish their connections after 1 second, and connection 3 joins the network after 10 seconds. As can be seen in Figure 4.1, the bandwidth allocated to the new connection is almost three times greater than that of the old connections. Furthermore, before the third connection joins the network, the congestion windows of the old connections oscillate, but afterwards the queue size reaches steady state and there is no longer enough oscillation in the congestion window sizes. This is not the behavior expected from the algorithm.

Figure 4.1 Connection goodputs and instantaneous queue lengths for enhanced Vegas

Why is the congestion window not oscillating in our simulation results? The reason is that when diff is equal to α, the congestion window remains unchanged (Equation (2.1)). To remove this stable point from the window size dynamics, this condition should be removed from the window update algorithm (the equality condition should be included either in the increasing or in the decreasing section). To that end, the enhanced Vegas algorithm was changed and implemented in ns-2 as follows:

$$
WindowSize = \begin{cases} WindowSize + 1, & diff < \alpha \\ WindowSize - 1, & diff \geq \alpha \end{cases} \qquad (4.1)
$$
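In code form, the two update rules can be sketched as follows; this is a minimal illustration of Equations (2.1) and (4.1), with function names of our choosing, not the actual ns-2 source.

```python
# diff is Vegas' estimated per-connection backlog, (Expected - Actual) x BaseRTT,
# as defined in Chapter 2; cwnd is the congestion window in packets.

def vegas_update(cwnd, diff, alpha, beta):
    """Original Vegas congestion avoidance (Equation (2.1)):
    keep the backlog between alpha and beta."""
    if diff < alpha:
        return cwnd + 1     # too little backlog: increase the window
    if diff > beta:
        return cwnd - 1     # too much backlog: decrease the window
    return cwnd             # alpha <= diff <= beta: stable point

def enhanced_vegas_update(cwnd, diff, alpha):
    """Modified update with alpha == beta (Equation (4.1)): the equality case
    is folded into the decrease branch, so no stable point exists and the
    window oscillates around the operating point."""
    return cwnd + 1 if diff < alpha else cwnd - 1
```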

Part (b) of the above simulation (α = β = 2) was repeated with this new algorithm, based on Equation (4.1), and the results are shown in Figure 4.2. Despite the oscillation in the window size, the last connection still receives more bandwidth than its fair share, due to the over-estimation of BaseRTT. Even worse, the actual throughput is slightly smaller than that of the original Vegas implementation with different thresholds on the window size, due to the window size oscillation. Lower throughput for enhanced Vegas is to be expected, because its oscillating window size behavior is very similar to TCP-Reno, which has less goodput than Vegas. The main conclusion from this section is that oscillation in the window size cannot compensate for the over-estimation of BaseRTT.


Figure 4.2 Connection goodputs and congestion windows for old and new connections

4.3 Vegas and DRR at the Gateway

Perhaps one of the simplest link algorithms for providing isolation between flows, to achieve fairness in terms of throughput, is weighted round robin (WRR). The buffer at the router is divided into multiple queues and each queue is served in round robin order. To be as fair as possible, the WRR server must know the average packet size beforehand. This is not practical when the packet size is variable. To eliminate this drawback, deficit round robin (DRR) [31] keeps state for each queue to measure the "deficit", or past unfairness, using a deficit counter per queue. The queuing discipline within each queue is FIFO. A newer algorithm proposed in [30], DRR+, combines RED and DRR; in other words, the queue management in each logical queue is RED instead of FIFO. Deploying Vegas and choosing a better scheduling algorithm at the gateway might be a remedy for the intrinsic unfairness of Vegas. In this way, by using DRR gateways to provide the necessary isolation among different users, the unfairness of Vegas can be resolved [29], [30]. To provide perfect isolation among the users, a separate logical queue is required for each user, which is almost impossible for a large number of different users. In other words, as the number of connections increases, DRR cannot assign a separate logical queue to each connection; therefore, several connections must share one logical DRR queue. Eventually, we return to the same problem as original Vegas with a shared FIFO buffer, which leads to an unfair allocation of bandwidth among those connections sharing a single queue. Moreover, to provide fairness, DRR needs to keep the state of each logical queue, which incurs extra processing overhead. In a nutshell, using complicated scheduling algorithms, e.g., weighted fair queuing (WFQ) and DRR, usually ends up in scalability problems. Using DRR+ at the gateway does not help to improve the fairness problem of Vegas because of the incompatibility between RED and TCP-Vegas.
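To make the deficit-counter idea concrete, the following is a minimal sketch of DRR's service loop in the spirit of [31]; the quantum value and class names are illustrative assumptions, not taken from the cited work.

```python
from collections import deque

class DRRScheduler:
    """Minimal Deficit Round Robin sketch: each flow has a FIFO queue and a
    deficit counter measuring unused service from previous rounds."""

    def __init__(self, quantum=1500):
        self.quantum = quantum      # bytes credited to each active queue per round
        self.queues = {}            # flow id -> deque of packet sizes (bytes)
        self.deficit = {}           # flow id -> accumulated deficit (bytes)

    def enqueue(self, flow, pkt_size):
        self.queues.setdefault(flow, deque()).append(pkt_size)
        self.deficit.setdefault(flow, 0)

    def dequeue_round(self):
        """One round-robin pass; returns the list of (flow, pkt_size) served."""
        sent = []
        for flow, q in self.queues.items():
            if not q:
                self.deficit[flow] = 0   # idle flows accumulate no deficit
                continue
            self.deficit[flow] += self.quantum
            # serve head-of-line packets while the deficit covers them
            while q and q[0] <= self.deficit[flow]:
                pkt = q.popleft()
                self.deficit[flow] -= pkt
                sent.append((flow, pkt))
        return sent
```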
4.4 Active Vegas
As noted several times in the literature, Vegas achieves better throughput than TCP-Reno, but it fails to maintain fairness among different users and creates persistent congestion. In the previous sections, we enumerated recent proposals to overcome this problem and explained their weaknesses. We believe that, for TCP-Vegas to be deployed in the future Internet, it must be modified to improve fairness among different users. In this section, we introduce our own algorithm, called "Active Vegas". From the previous simulations and proposals, it can be observed that a better estimation of BaseRTT is essential. On the other hand, using a separate queue for each flow to achieve fairness is not practical from the point of view of scalability. A better estimation of the propagation delay can be achieved if the queuing delay can be removed from the BaseRTT measurement. To remove the delay corresponding to the backlogged queue in the gateway from the measurement, we propose a new algorithm. It is based on the active network approach described in Chapter 3, and combines a modification to Vegas with priority queuing in the gateway. In other words, our proposal is based on out-band monitor packets. The active extension of Vegas allows us to employ a strategy in which the gateway is enhanced dynamically to improve the fairness of Vegas. In the default implementation, there is a single Drop Tail queue shared by all of the connections. On receiving an active/monitor packet, the gateway changes its queuing mechanism to a priority queue. To provide an out-band path, it creates a fast route for the monitor packet. The buffer at the gateway is logically divided into two separate queues, one with higher priority than the other. In this way, high priority packets are forwarded without experiencing any queuing delay at the gateway. Every time TCP-Vegas enters the congestion avoidance phase (congestion window greater than the slow start threshold), it transmits the first packet as a high priority packet, an out-band urgent packet, to monitor the network. Its acknowledgement provides a better estimation of the propagation delay. Further, we assume that no delayed acknowledgment is installed at the receiver. The expedited packet can be either a regular data packet or a separate monitor/control packet. If a regular data packet is used to perform the monitoring operation, it has the advantage of not consuming extra bandwidth for a separate monitor packet. This can mitigate the overhead of the active Vegas scheme. Moreover, if the high priority packet is lost, we presume that active Vegas is intelligent enough to send another monitor packet until it receives its acknowledgement. Since there is no packet loss in our simulations, we did not implement this retransmission mechanism.
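The sender-side behavior can be summarized by the following sketch, under the assumptions just stated (no probe loss, no delayed acknowledgments); the class, field and function names are hypothetical, not those of our ns-2 implementation.

```python
import time

def now():
    """Wall-clock stand-in; a simulator would use virtual time."""
    return time.monotonic()

HIGH_PRIORITY = 1   # hypothetical one-bit marking carried in the IP header

class ActiveVegasSender:
    """Sketch of the sender-side probing in active Vegas."""

    def __init__(self):
        self.base_rtt = float("inf")   # smallest round trip delay seen so far

    def on_enter_congestion_avoidance(self, packet):
        # cwnd has just exceeded ssthresh: the first packet becomes the probe
        packet.priority = HIGH_PRIORITY
        packet.send_time = now()

    def on_ack(self, packet):
        rtt = now() - packet.send_time
        if packet.priority == HIGH_PRIORITY:
            # the probe saw (almost) no queuing delay, so its round trip
            # time is a trustworthy BaseRTT sample
            self.base_rtt = min(self.base_rtt, rtt)
```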

Figure 4.3 Packets in transit for active Vegas

As mentioned in Chapter 2, every time a new connection is established, the last connection over-estimates BaseRTT due to the delay caused by the backlogged queue, which results in opening its window size beyond its share. Now, with Active Vegas, there is no (or only very minor) queuing delay for the marked expedited packets. Therefore, there is only a very small over-estimation of BaseRTT, and more fairness is achieved.


Figure 4.4 Priority queue at the gateway

Figure 4.4 shows our dynamic queue modification in response to an active Vegas monitor packet. It is also possible to send urgent packets periodically to check whether the routing path has changed; we discuss this in the future research section. In the next part, we run the first preliminary simulations to identify the algorithm's problems and weaknesses.
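Correspondingly, the gateway-side change amounts to a strict two-level priority queue, sketched below under the assumption of a one-bit priority marking in the packet header (names are again illustrative):

```python
from collections import deque

HIGH_PRIORITY = 1   # same hypothetical marking as on the sender side

class PriorityGateway:
    """Two logical FIFO queues carved out of the gateway buffer; the
    high-priority queue (probes) is always served first."""

    def __init__(self):
        self.high = deque()   # out-band monitor/probe packets
        self.low = deque()    # regular data packets

    def enqueue(self, pkt):
        queue = self.high if getattr(pkt, "priority", 0) == HIGH_PRIORITY else self.low
        queue.append(pkt)

    def dequeue(self):
        # strict priority: probes bypass the data backlog entirely
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None
```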

4.4.1 Simulations

In the first part of this section, we use the network topology shown in Figure 4.5 to examine the performance of Active Vegas. Instead of drop tail, priority packet scheduling has been implemented in R1, and all sources send priority packets when they enter the congestion avoidance phase after slow start. It is worth mentioning that α and β are set to 1 and 3, respectively. The propagation delay for all links is 5msec and the packet size is 1Kbyte. The first and second connections are established after one second and the third connection joins the network after 10 seconds.

Figure 4.5 Network model

It can be seen from Figure 4.6 that fairness has been achieved in this simple simulation. However, extensive simulation is required to evaluate this proposal. Therefore, as a first step, in the next part we run the simulation with 4 Active Vegas sources.

Figure 4.6 Connection goodputs for 3 Active Vegas sources

In Table 4.1, three algorithms (original Vegas, enhanced Vegas and active Vegas) are compared in terms of achieved average goodput (packets/second, packet size = 1KByte). The network model and simulation scenario are the same for all three algorithms.

                   Original Vegas   Enhanced Vegas   Active Vegas
                   α=1, β=3         α=β=2            α=1, β=3
1st connection     20.0667          19.933           21.65
2nd connection     20.1333          19.1             22.45
3rd connection     25.52            26.86            20.8

Table 4.1 Connection throughput for 3 different versions of Vegas

In the next part, a fourth connection joins the network after 20 seconds. Recall from the simulation in Chapter 2 (Figure 2.6) that the later connection establishment of S4 leads to a very unfair allocation of bandwidth, particularly for connections 1 and 2. The results of the active Vegas simulation are shown in Figure 4.7. As can be observed, not only has fairness been improved, but the number of packets backlogged in the gateway buffer has also decreased to 12 packets, the number we expected: each connection keeps at most 3 packets in the buffer, so 12 packets is the maximum number of packets in the buffer.

Figure 4.7 Connection goodputs and instantaneous queue size for 4 Active Vegas sources

It is worth mentioning that the buffer size is large enough to avoid any drops. From Figure 4.7, it can be seen that better fairness has been achieved with Active Vegas compared to the original Vegas implementation.

4.4.2 Single Bottleneck Network Simulations

In this section, we simulate our proposal with many senders and receivers. The network model simulated in this section is the one described in Figure 4.8. It consists of a single bottleneck shared by 30 point-to-point connections. For all cases, the shared link has 1Mbps bandwidth. Also, a 1Mbps access link connects each host to the router, such that a flow is set up from host Sx to host Dx and is identified by flow id x. The packet size for all of the connections is 1Kbyte. The propagation delays of the access links and the shared link are 5ms and 20ms, respectively. Six sets of simulations were conducted on this topology with different TCP algorithms at the senders and different packet scheduling algorithms at the router. To compare some of the simulation results, different buffer sizes were also investigated. Successive connections (i = 1, ..., 30) join the network one second apart. All of the senders turn off after 100 seconds and each simulation lasted for 110 seconds.

Figure 4.8 Network topology (shared link bandwidth = 1Mbps; access link bandwidth = 1Mbps, delay = 5msec)

In the first four runs of the simulation, the buffer size is set to 100 packets and the following scenarios are simulated:
Scenario 1: Active Vegas, α = 1 and β = 3
Scenario 2: Active Vegas, α = β = 2
Scenario 3: Original Vegas with Drop Tail, α = 1 and β = 3
Scenario 4: Original Vegas with Drop Tail, α = β = 2
In Figure 4.9, the average and the standard deviation of each connection's goodput are shown. Each panel corresponds to a different scenario. The x-axis and y-axis represent the index of the connection and the corresponding goodput, respectively. The stars and vertical bars correspond to the mean and the standard deviation of the rate oscillation. The number of packets received in one second is calculated as the connection goodput.

Figure 4.9 Goodput vs. connection index for 4 different scenarios

Figure 4.9 illustrates that active Vegas provides better fairness among the connections. Setting equal values for α and β gives even better fairness than two different values for α and β. However, it produces more variation in the window size and hence in the throughput. On the other hand, with α ≠ β the window converges to a settling point very quickly and therefore shows less oscillation in the window size and the throughput. The performance of original Vegas with Drop Tail at the gateway is very poor and, surprisingly, it does not behave as we expected. From the previous simulations, we expected original Vegas with Drop Tail at the gateway to allocate more bandwidth to the new connections than to the old connections. This is not the case in the current simulation. To find out the reason, the buffer occupancies of active Vegas and original Vegas with Drop Tail are shown in Figure 4.10.

Figure 4.10 Buffer occupancies for the different scenarios (buffer size 100 packets)

Active Vegas not only improves the fairness but also leads to a smaller number of packets backlogged in the buffer. Figure 4.10 answers our question about the behavior of original Vegas with a Drop Tail buffer at the gateway. Potentially, original Vegas induces persistent congestion and creates large queues at the gateway, up to the point where Reno-like behavior kicks in: the buffer overflows, packets drop, and the window size is halved. Since a 100-packet buffer is not enough for original Vegas, after 30 seconds packets start to drop, and the performance becomes similar to Reno. As mentioned in [11], [2], if the buffer is not sufficiently large, equilibrium cannot be achieved, loss cannot be avoided, and Vegas will fall back to Reno. It is worth mentioning that creating less congestion is another advantage of active Vegas compared to original Vegas. To get a better view of the two algorithms under the same conditions, we repeated the last two scenarios of the simulation with a buffer large enough to avoid any loss, which resulted in no packet drops. To remove the start-up transient from the calculations, the mean goodputs for all connections under the different scenarios were calculated over the last 60 seconds of simulation; they are shown in Figure 4.11.


Figure 4.11 Mean goodput of each connection under different Vegas algorithms

Figure 4.11 confirms our previous observation that active Vegas with equal α and β results in a fair allocation among the connections. To investigate the number of backlogged packets for active Vegas and original Vegas, we conducted the simulation again for original Vegas with an infinite buffer size. The buffer occupancy for original Vegas is shown in Figure 4.12. Comparing Figure 4.12 and Figure 4.10, it can be observed that active Vegas creates less congestion in the buffer compared to original Vegas.

Figure 4.12 Buffer occupancy for original Vegas (infinite buffer size)

After all connections join the network, original Vegas backlogs 170 packets in the buffer, while the active Vegas backlog queue size is about 65 packets, roughly two and a half times less than original Vegas. One of the features of TCP-Vegas is that it attempts to keep at least α packets but no more than β packets in the queues. Active Vegas, with its monitor packets and priority queuing, can support this feature to a good approximation. From the simulation with 30 connections, we expect the backlog queue size to be less than 90 packets (30 × β, with β = 3). This was satisfied by active Vegas. In contrast, as can be observed, original Vegas cannot achieve this feature and creates more congestion in the network.

4.4.2.1 Round Trip Time Measurement

To compare the BaseRTT measurement and estimation in both active Vegas and original Vegas, we show the values of BaseRTT measured by the different connections in Figure 4.13. Again, the x-axis represents the connection index.


Figure 4.13 BaseRTT of connections for all scenarios

The main point behind Figure 4.13 is that original Vegas cannot estimate the BaseRTT properly. End-to-end delay can be calculated as follows:

(End-to-end delay) = (Transmission delay) + (Propagation delay) + (Packet processing delay) + (Queuing delay)

Considering the access link transmission delays in our calculation, the BaseRTT should be around 80 to 100msec. Since the packet size is 1Kbyte and the link capacities are 1Mbps, the transmission delays corresponding to the bottleneck link and the access links are around 8msec each. Moreover, the one-way propagation delay is 30msec; hence, adding the three transmission delays and the round trip propagation delay together, the BaseRTT will be about 84msec. As can be seen in Figure 4.13, active Vegas gives a very accurate value (88msec) of BaseRTT for all connections. In contrast, the original Vegas BaseRTT estimates become worse and worse as new connections join the network, such that the BaseRTT estimate for the last connection is more than 1 second.
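Spelling out this arithmetic (the 1040-byte on-wire packet size — 1Kbyte of payload plus headers — and the negligible ACK transmission time are assumptions of ours, chosen because they reproduce the 84.96msec minimum reported in Table 4.3):

$$
t_{tx} = \frac{1040 \times 8\ \text{bits}}{1\ \text{Mbps}} \approx 8.32\ \text{ms}, \qquad
BaseRTT \approx \underbrace{3 \times 8.32\ \text{ms}}_{\text{access + bottleneck + access}} + \underbrace{2 \times (5 + 20 + 5)\ \text{ms}}_{\text{round trip propagation}} = 84.96\ \text{ms}
$$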

4.4.2.2 Fairness Evaluation

The fairness criterion has been widely studied in the literature, but no single metric has been commonly accepted [32]. In this research, we consider a "fair share per link" metric: when n users share a bottleneck link, each flow has the right to 1/n-th of the capacity of the bottleneck link. Jain's fairness index [33] is applicable in this context. For n flows, with flow i allocated a rate x_i on the bottleneck link, the fairness index is defined as follows:

$$
Fairness = \frac{\left( \sum_{i=1}^{n} x_i \right)^{2}}{n \sum_{i=1}^{n} x_i^{2}}
$$
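As a quick illustration of how the metric behaves (the rate values below are made up for illustration, not simulation results):

```python
def jain_fairness(rates):
    """Jain's fairness index [33]: equals 1.0 for a perfectly equal
    allocation and approaches 1/n as one flow takes everything."""
    n = len(rates)
    return sum(rates) ** 2 / (n * sum(r * r for r in rates))

print(jain_fairness([10, 10, 10]))   # 1.0   -- equal shares
print(jain_fairness([25, 10, 10]))   # ~0.82 -- one flow over-served
```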

Figure 4.15 Fairness vs. bottleneck link propagation delay (link capacity = 4Mbps)

Figure 4.15 reveals that the fairness of original Vegas increases as the propagation delay increases. In other words, original Vegas performs better on long TCP connections. The reason is that for a long TCP connection, the share of the queuing delay in the total end-to-end delay becomes smaller. As a result, the over-estimation error in BaseRTT becomes smaller, and consequently the unfairness is reduced.

4.4.2.3 Buffer Occupancy

Although active Vegas does not offer a way to completely eliminate persistent congestion, it mitigates the level of congestion in the network. As we discussed in Section 2.5.2, the over-estimation of BaseRTT results in a larger backlog. The simulation results with 30 sources and destinations also confirm that original Vegas under-estimates the number of packets in the buffer, due to the over-estimation of BaseRTT for new connections. On the other hand, active Vegas achieves the expected number of packets in the buffer, which is nβ for n active connections. Comparing Figure 4.10 and Figure 4.12 reveals that active Vegas mitigates the level of congestion in the network with a smaller number of backlogged packets. Table 4.2 summarizes the buffer occupancies.

                             Active Vegas   Active Vegas   Original Vegas   Original Vegas
                             α=1, β=3       α=β=2          α=1, β=3         α=β=2
Behavior                     Stable         Oscillatory    Stable           Oscillatory
Buffer occupancy (packets)   68             64             170              162

Table 4.2 Buffer occupancies for 4 different implementations of Vegas

4.4.2.4 Summary of Simulation Results

In this section, we summarize our simulation results for the single bottleneck network with 30 sources and destinations in tabular form, shown in Table 4.3.

                             Active Vegas   Active Vegas   Original Vegas   Original Vegas
                             α=1, β=3       α=β=2          α=1, β=3         α=β=2
Maximum fairness metric      1              0.9805         0.9106           0.8333
Minimum fairness metric      0.9981         0.9166         0.785            0.6673
Maximum BaseRTT (msec)       92.64          92.64          1180.64          1120
Minimum BaseRTT (msec)       84.96          84.96          84.96            84.96
Maximum mean throughput      6.42           4.26           9.1              10.54
(packets/sec)
Minimum mean throughput      -              -              -                -
(packets/sec)
Buffer occupancy (packets)   68             64             170              162

Table 4.3 Summary of simulation results

4.4.3 Multi-hop Network Simulations

To further study the fairness performance of active Vegas in situations where connections experience different round trip delays and traverse different numbers of gateways, a multi-hop network topology, as illustrated in Figure 4.16, was implemented in ns-2.

Figure 4.16 Multi-hop network model (access links: bandwidth = 10Mbps, delay = 5msec; core links: bandwidth = 1Mbps, delay = 50msec)

It consists of 8 connections. The shared links have 1Mbps capacities. Also, a 10Mbps access link connects each host to the router. A flow is set up from host Sx to host Dx and is identified by flow id x. The packet size for all of the connections is 1Kbyte. The propagation delays of the access links and shared links are 5msec and 50msec, respectively. Two sets of simulations were conducted on this topology, with active Vegas (α = β = 2) and original Vegas (α = β = 2), with no packet drops. Successive connections (i = 8, 7, ..., 1) join the network two seconds apart. All of the senders turn off after 100 seconds and each simulation lasted for 110 seconds. First of all, let us look at the average congestion windows during the congestion avoidance phase for the different connections under active Vegas and original Vegas, shown in Figure 4.17.

Figure 4.17 Average congestion windows for different connections

Since the first four connections are homogeneous, they should have the same window size to share the bandwidth equally. Figure 4.17 illustrates that original Vegas cannot achieve this: the average congestion window sizes are 11.21 and 18.39 for connection 4 and connection 2 respectively, a difference of 7 packets. On the other hand, active Vegas can force all of the homogeneous connections to have the same window size; e.g., the average congestion window sizes are 11.47 and 11.84 for connection 4 and connection 1 respectively. Another important thing to note, common to both active and original Vegas, is that Vegas in general does not penalize long TCP connections. As can be seen in Figure 4.17, short TCP connections have smaller window sizes than long TCP connections (connection 8 has the smallest propagation delay, and connections 1, 2, 3, and 4 have the longest propagation delays). This results in a fair allocation of bandwidth among all of the connections. To give a quantitative idea, Figure 4.18 shows the goodput of each connection.

Figure 4.18 Goodput vs. connection index for active and original Vegas

As with the single bottleneck topology, the average and the standard deviation of each connection's goodput are illustrated in Figure 4.18. To remove the start-up transient from the calculation, the mean goodputs were averaged over the last 80 seconds of the simulation. Again, active Vegas performs better and allocates the bandwidth more fairly than original Vegas. The differences between the maximum and minimum mean connection goodputs are 2.46 and 11.77 packets/sec for active and original Vegas respectively, which shows better fairness with active Vegas. As mentioned before, active Vegas does not penalize long TCP connections; i.e., connections 1, 2, 3 and 4, with longer propagation delays, receive the same share as connection 8, with the smallest propagation delay. To compare the BaseRTT measurement and estimation in both active Vegas and original Vegas, we show the values of BaseRTT in milliseconds measured by the different connections in Table 4.4. In this table, a dash means that BaseRTT was not updated during the simulation, because the first measurement had the smallest value.

             First measurement         Last update
Connection   Active      Original      Active      Original
1            573.632     665.024       -           -
2            565.632     721.024       -           672
3            571.008     633.024       568         -
4            571.008     585.024       568         -
5            462.703     526.704       -           480
6            348.384     364.384       346.642     360
7            240.08      256.08        240         340
8            129.984     129.983       -           -

Table 4.4 BaseRTT estimation for active and original Vegas (msec)

For the first 4 homogeneous connections, the minimum possible value of BaseRTT is theoretically 561.6msec: the transmission delay at each of R1, R2, R3, R4, and R5 is about 8msec, at R6 it is 0.8msec, and the propagation delay is 260msec each way. As can be seen in Table 4.4, active Vegas estimates BaseRTT with much better accuracy than original Vegas: BaseRTT measured by active Vegas is 571.008msec, while BaseRTT measured by original Vegas is 721.024msec. The error for original Vegas is more than 100msec, which leads to a very unfair bandwidth allocation.

Chapter 5

Conclusion

In this thesis, we first focused on evaluating the essential characteristics of TCP, mainly TCP-Vegas. We found that TCP-Vegas sometimes fails to maintain fairness among connections. From those results, we proposed the active version of TCP-Vegas, which has better fairness than the original Vegas. This proposal is based on a probe mechanism to better estimate the propagation delay and measure the smallest round trip delay. To what extent is the network performance improved? What is the price of this improvement? What are the problems created by this new proposal? This chapter answers these questions, and concludes the thesis with a summary of areas for future work and research contributions.

5.1 Discussion

We noted that the measurement of the propagation delay in TCP-Vegas is not robust. The primary goal of this thesis is to reduce the effect of error in round trip delay estimation in the TCP-Vegas algorithm (a delay-based congestion control mechanism). In the general case, where the route remains the same for the lifetime of the connection, the propagation delay can only be over-estimated, because of the queuing delay. We observed that over-estimation of the propagation delay leads to unfair resource allocation against the old connections. As stated previously, TCP-Vegas intends to keep between α and β packets in the buffers. However, with the over-estimation of BaseRTT, although it believes that it has between α and β packets in transit, in fact it has many more packets in transit. In other words, a biased BaseRTT causes more packets to be backlogged in the buffers. This induces persistent congestion in the network. Throughout the course of this research, we learned that a probing mechanism is useful for discovering path characteristics, such as delay, queue length and bandwidth. Along the same line, we propose active TCP-Vegas, which consists of an extension to original Vegas. This change requires support both from the gateways and from the host IP layers. Assuming no changes in the connection route, we have demonstrated that sending a high priority probe packet from TCP-Vegas, together with fast forwarding mechanisms at the gateways, resolves the problem of BaseRTT over-estimation. Our simulation results with a single bottleneck and multi-hop networks show that this simple enhancement to TCP-Vegas improves the fairness and reduces the number of packets in transit, and hence the level of congestion in the network. The most innovative feature of TCP-Vegas is its window control mechanism based on delay measurement, which results in a stable window size and higher throughput. However, a packet loss would still lead to a large reduction of the congestion window. Therefore, to achieve the improvements proposed by TCP-Vegas, packet drops should be prevented. Otherwise, the Reno behavior will kick in: the congestion window will oscillate, many retransmissions will occur, and the throughput will decrease. Since the performance improvement depends heavily on the probe packet acknowledgement, we further assume that probe packets and their acknowledgments are not dropped or lost. Moreover, we assume that probe packets carry the same TOS and IP security options as the regular data packets; in a nutshell, we assume the same route for data and probe packets. We believe that, with a proper incentive, we can encourage TCP-Reno users to switch to Active TCP-Vegas by using a simple priority drop tail scheduling mechanism at the gateways to provide better throughput and lower delay.
The only concern is the incompatibility of Vegas and Reno, as stated in the literature [12]. In other words, Reno and any variant of Vegas, enhanced or active, cannot share the same buffer. To remedy this problem, a DRR scheduling mechanism can be proposed to provide the necessary isolation between the Reno users and the Vegas users.

5.2 Drawbacks

The important drawbacks of our proposal are as follows:

Coordination and support between Active TCP-Vegas and the IP layer is required, since the active Vegas probe packet should be marked in the IP header (a sketch of this marking step is given after this list). For IPv4, this can be done using the IP type of service (TOS) octet, although six bits of this byte have already been assigned to the differentiated services code point (DSCP). In IPv6, there is a priority field that can be used for marking the active Vegas probe packet.

Support from the gateways must be provided for probe packets. The major drawback of our proposal is that a priority drop tail scheduling algorithm should be implemented at every gateway in the network. This condition might not always be fulfilled.

One of the advantages of our proposal is that no changes in the current receiver implementation are required. However, receivers cannot use delayed acknowledgments. This condition is not limited to our proposal and is common among all delay-based congestion control mechanisms.
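As an illustration of the IPv4 marking step mentioned above, a sender could set the TOS octet on its probe socket as sketched below; the particular TOS value, and whether the routers along the path honor it, are assumptions of ours rather than part of the proposal.

```python
import socket

# Hypothetical TOS value for active Vegas probes; the routers along the path
# would have to map this value onto the high-priority queue.
PROBE_TOS = 0x10

probe_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# IP_TOS sets the IPv4 type-of-service octet on outgoing packets
probe_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, PROBE_TOS)
probe_sock.sendto(b"probe", ("192.0.2.1", 9))   # placeholder address and port
```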
5.3 Future Research Directions
Although this research has addressed an important problem of delay-based congestion control mechanisms, particularly TCP-Vegas, several issues remain for further investigation. The following lists some of the future research directions.

I. Rerouting also causes errors in propagation delay estimation. When the route changes during the connection lifetime, the propagation delay may increase. The sender believes that this increase in round trip delay is a result of congestion in the network, and consequently decreases its transmission rate. This results in poor utilization of the network. With rerouting, we face an under-estimation of BaseRTT. One solution to this problem is to send probe packets periodically. One of the issues that needs further investigation is the effect of the frequency of probe packets.

II. We did not consider the effect of other traffic on the performance of active TCP-Vegas. What types of mechanisms are required to isolate the active Vegas flows from other traffic? Can active Vegas and Reno share the same buffer? If so, what type of queue management and scheduling is appropriate in the buffer? These are questions that need to be addressed.

III. Although the in-band approach is simple to implement and creates less overhead, in our proposal probe packets are out-band. It is also possible that, instead of fast forwarding the probe packets (out-band approach), they are forwarded like regular data packets (in-band approach). In the in-band approach, the gateways along the route must update the probe packet with the queuing delay it experiences. Furthermore, the in-band mechanism also requires support from the peer host: once the probe packet reaches the end of its route, the queuing delay information it has collected must be piggybacked on the acknowledgement.

IV. Since TCP-Vegas has already been implemented as a patch to the LINUX kernel, the active Vegas extension can also be implemented in the LINUX kernel to further study its performance in a real network.

5.4 Research Contributions

This thesis has addressed a number of issues regarding the congestion control mechanism of TCP-Vegas. Through simulation analysis, this thesis has presented several drawbacks of TCP-Vegas, which have their root in round trip delay estimation. To improve the fairness performance of TCP-Vegas, this thesis proposed a mechanism based on the active networking approach. We developed "Active TCP-Vegas", which efficiently estimates the path propagation delay. The simulation results from our work lead to the following observations and conclusions.

Fairness Performance. We investigated the fairness performance of active TCP-Vegas and compared it with original TCP-Vegas. We experimented with both a single bottleneck and a multi-hop network. Our results show that active Vegas fairly allocates the bandwidth among the connections, and does not penalize the old connections in favor of the new connections.

Persistent Congestion. We examined the buffer occupancy of active Vegas. We show that better estimation of BaseRTT leads to a smaller number of packets backlogged in the buffers compared to original Vegas. This decreases the level of congestion in the network.

References

[1] T. V. Lakshman, U. Madhow, "Performance Analysis of Window-Based Flow Control using TCP/IP: The Effect of High Bandwidth-Delay Products and Random Loss", Proceedings of HPN'94, 1994.
[2] L. S. Brakmo, L. L. Peterson, "TCP Vegas: End-to-End Congestion Avoidance on a Global Internet", IEEE Journal on Selected Areas in Communications, Oct. 1995, pp. 1465-1480.
[3] J. S. Ahn, P. B. Danzig, Z. Liu and L. Yan, "Evaluation of TCP Vegas: Emulation and Experiment", Computer Communication Review, ACM SIGCOMM, 1995, pp. 185-195.
[4] Z. Haas, "Adaptive Admission Congestion Control", Computer Communications Review, ACM SIGCOMM, Jan. 1991, pp. 58-67.
[5] V. Jacobson, "Congestion Avoidance and Control", SIGCOMM Symposium on Communications Architectures and Protocols, 1988, pp. 314-329.
[6] V. Jacobson, "Modified TCP Congestion Avoidance Algorithm", Technical report, Apr. 1990, ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt.
[7] R. Jain, "A Delay-Based Approach for Congestion Avoidance in Interconnected Heterogeneous Computer Networks", Computer Communications Review, ACM SIGCOMM, Oct. 1989, pp. 56-71.
[8] K. K. Ramakrishnan, R. Jain, "An Explicit Binary Feedback Scheme for Congestion Avoidance in Computer Networks with a Connectionless Network Layer", Proc. ACM SIGCOMM, Aug. 1988, pp. 303-313.
[9] Z. Wang, J. Crowcroft, "A New Congestion Control Scheme: Slow Start and Search (Tri-S)", Computer Communication Review, ACM SIGCOMM, 1990.
[10] L. S. Brakmo, S. W. O'Malley and L. L. Peterson, "TCP Vegas: New Techniques for Congestion Detection and Avoidance", Proc. of ACM SIGCOMM'94, Oct. 1994, pp. 24-35.
[11] T. Bonald, "Comparison of TCP Reno and TCP Vegas via Fluid Approximation", Technical Report, INRIA, 1998. Available at http://www.dmi.ens.fr/%7Emistral/tcpworkshop.html.
[12] J. Mo, R. J. La, V. Anantharam and J. Walrand, "Analysis and Comparison of TCP Reno and Vegas", Globecom '99.
[13] G. Hasegawa, M. Murata and H. Miyahara, "Fairness and Stability of Congestion Control Mechanism of TCP", 11th ITC Specialist Seminar, Oct. 1998, pp. 255-262.
[14] S. H. Low, L. Peterson, and L. Wang, "Understanding TCP Vegas: Theory and Practice", TR 616-00, Feb. 2000. Available at http://www.ee.mu.oz.au/staff/slow/papers/vegas.ps.
[15] ns-2 Simulator. Available at http://www-mash.cs.berkeley.edu/ns/.
[16] C. Kent and J. Mogul, "Fragmentation Considered Harmful", ACM SIGCOMM '87, Aug. 1987.
[17] A. Romanow and S. Floyd, "Dynamics of TCP Traffic over ATM Networks", IEEE Journal on Selected Areas in Communications, May 1995.
[18] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance", IEEE/ACM Transactions on Networking, Aug. 1993, pp. 397-413.
[19] D. L. Tennenhouse and D. J. Wetherall, "Towards an Active Network Architecture", ACM Computer Communication Review, Apr. 1996, pp. 5-18.
[20] S. Bhattacharjee, K. Calvert and E. Zegura, "An Architecture for Active Networking", Proc. High Performance Networking '97, 1997.
[21] D. L. Wetherall, U. Legedza and J. Guttag, "Introducing New Internet Services: Why and How", IEEE Network Magazine, July/August 1998.
[22] B. Schwartz, A. W. Jackson, W. T. Strayer, W. Zhou, D. Rockwell and C. Partridge, "Smart Packets for Active Networks", OPENARCH '99, Mar. 1999.
[23] S. Kasera, S. Bhattacharyya, M. Keaton, D. Kiwior, J. Kurose, D. Towsley, and S. Zabele, "Scalable Fair Reliable Multicast Using Active Services", IEEE Network Magazine (Special Issue on Multicast), January/February 2000.
[24] E. Takahashi, P. Steenkiste, J. Gao, A. Fisher, "A Programming Interface for Network Resource Management", OPENARCH '99, Mar. 1999.
[25] T. Faber, "ACC: Using Active Networking to Enhance Feedback Congestion Control Mechanisms", IEEE Network Magazine, May/June 1998.
[26] M. Hicks et al., "PLANet: An Active Internetwork", INFOCOM '99, Mar. 1999.
[27] L. Zhang, S. Deering, D. Estrin, S. Shenker and D. Zappala, "RSVP: A New Resource ReSerVation Protocol", IEEE Network Magazine, Sept. 1993, pp. 8-18.
[28] K. Najafi, A. Leon-Garcia, "A Novel Cost Model for Active Networks", Proceedings of Int. Conf. on Communication Technologies, World Computer Congress 2000.
[29] R. J. La and V. Anantharam, "A Case Study for TCP Vegas and Deficit Round-Robin Gateways", Dec. 1998. Available at http://www.path.berkeley.edu/~hyongla.
[30] G. Hasegawa, T. Matsuo, M. Murata and H. Miyahara, "Comparison of Packet Scheduling Algorithms for Fair Service among Connections on the Internet", Proceedings of IWS '99, Feb. 1999, pp. 193-200.
[31] M. Shreedhar and G. Varghese, "Efficient Fair Queuing Using Deficit Round Robin", Proceedings of ACM SIGCOMM '95, Sept. 1995.
[32] S. Floyd, "Connections with Multiple Congested Gateways in Packet-Switched Networks, Part 1: One-way Traffic", ACM Computer Communications Review, Oct. 1991, pp. 30-47.
[33] D. Chiu and R. Jain, "Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks", Computer Networks and ISDN Systems, 17 (1989), pp. 1-14.

Appendix

In this appendix, the simulation tool and the experimental setup are described (most of the information in this part is taken from the ns web page [15]).

A.1 ns Network Simulator

One of the goals of this thesis was to implement our active Vegas proposal in the ns network simulation environment. ns is an event-driven network simulator. An extensible simulation engine is implemented in C++, which uses MIT's Object Tcl (OTcl) as the command and configuration interface. The interface to ns must be written in Tcl, so to run the simulator one needs to be able to program in Tcl. It is usually convenient to plot the results with xgraph; however, some graphs were produced with MATLAB. A network topology is realized using three primitive building blocks: nodes, links and agents. The Simulator class has methods to create and configure each of these building blocks. Agents are the objects that actively drive the simulation. Agents can be thought of as the processes and/or transport entities that run on nodes, which may be hosts or routers. Traffic sources and sinks are all examples of agents.

A.1.1 Links

Links are created between nodes to form a network topology. In this project, we used the duplex-link method, which sets up a bi-directional link; the link bandwidth and propagation delay can be set. Since we are investigating a priority queuing scheduling algorithm, PQ objects are attached to each link. PQ objects are a subclass of Queue objects that implement priority-based queuing. The only parameter that can be set is the queue size in packets; its default value is 50.

A.1.2 Source and Traffic Types

Traffic objects create data for a transport protocol to send. In our simulations, we used TCP as the transport protocol, and we implemented active Vegas as a subclass of the TCP agent. All TCP parameters are set to their defaults (e.g., the window size is 20 packets). For the TCP protocol, FTP and TELNET source objects can be used; FTP objects produce bulk data for a TCP object to send.

A.1.3 Trace and Monitoring

There are a number of ways of collecting output or trace data from a simulation. Generally, trace data is either displayed directly during execution of the simulation, or (more commonly) stored in a file to be post-processed and analyzed. The trace records each individual packet as it arrives at, departs from, or is dropped at a link or queue. We can also open files in the program to read or write any particular variable.

A.2 Measured Statistics

In this thesis, we mainly focused on two parameters: the end-to-end delay per flow for each source (since all sources transport one type of flow, this can be taken as the end-to-end delay for each source) and the allocated bandwidth for each source, i.e., the source goodput. As explained earlier, trace data is stored in a file. This file includes almost all the information from the simulation. Since the amount of data is very large, it is impractical to inspect the records without some tools; therefore, several scripts for processing the records were developed in AWK. The raw trace records are first sorted by the unique id of each packet. From this sorted file we extract the records belonging to each flow, and we can then calculate the end-to-end delay by subtracting the enqueue time at the source node from the receive time at the destination node. After extracting the necessary data from the trace records, this information is ported to MATLAB to find the delay histogram and the mean and variance of the corresponding distribution.
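For concreteness, the post-processing logic can be sketched as follows; this is equivalent logic in Python rather than our actual AWK scripts, and it assumes the standard ns-2 trace layout (event, time, from-node, to-node, type, size, flags, flow id, source, destination, sequence, unique id) with hypothetical node ids.

```python
from collections import defaultdict

SRC_NODE = "0"   # hypothetical node ids; adjust to the simulated topology
DST_NODE = "5"

send_time = {}                # packet unique id -> enqueue time at the source
delays = defaultdict(list)    # flow id -> end-to-end delays in seconds

with open("out.tr") as trace:
    for line in trace:
        f = line.split()
        event, t = f[0], float(f[1])
        from_node, to_node, fid, uid = f[2], f[3], f[7], f[11]
        if event == "+" and from_node == SRC_NODE:
            send_time[uid] = t    # "+" is the enqueue event at a node
        elif event == "r" and to_node == DST_NODE and uid in send_time:
            delays[fid].append(t - send_time[uid])

for fid, d in sorted(delays.items()):
    mean = sum(d) / len(d)
    var = sum((x - mean) ** 2 for x in d) / len(d)
    print(f"flow {fid}: mean delay {mean:.4f} s, variance {var:.6f}")
```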