Towards an Explanatory Model for Network Traffic

Towards An Explanatory Model for Network Traffic Jorge Gonzalez Joshua Clymer Chad Bollmann Dept. of Mathematical Sciences SEAP Intern Dept. of Electrical & Computer Engineering Florida Atlantic University Naval Postgraduate School Naval Postgraduate School Boca Raton, FL, USA Monterey, CA, USA Monterey, CA, USA [email protected] [email protected] Abstract—This work presents two explanatory mathematical tailed distributions per the Generalized Central Limit Theorem models explaining how network traffic features that display Gaus- (GCLT). sian tendencies in single devices and small networks aggregate The primary contribution of this work is outlining our to alpha-stable processes in larger networks. The first model shows how self-similarity originates from an impulsive-noise- explanation of device traffic aggregation to an alpha-stable based representation of individual processes. A second model or Gaussian result. We use this theory to develop a basic uses renewal processes to justify impulsive process aggregation aggregation model that leads to alpha-stable distributed traffic. to alpha-stable or Gaussian end states and permits estimating network traffic alpha-stable rates of convergence. We develop II. THE IMPULSE MODEL a model based on this first method to empirically validate this aggregation approach. The goal of this model is to explain the aggregation of traffic Index Terms—alpha-stable, computer network traffic model, by looking inside the sub-windows where aggregations occur. heavy-tail, self-similar We define an impulse as a group of time stamps (packets) within a sub-window that are related by a unique source and I. INTRODUCTION destination IP pair (ip0; ip1). Different impulses are assumed We propose explanatory, complementary models for ag- to be independent. gregated network traffic, starting from the device level. Our Impulses are ordered in such a way that P(Yi = a) does purpose is to understand how network traffic aggregates from not depend on i, where Yi is the volume of the ith impulse. individual sources and investigate the tendency of certain The total volume of traffic in a sub-window is given by network traffic features including packet rate to trend towards the sum of the impulses within it, the number of which is alpha-stable [1], [2] in larger networks, while the same features described by a distribution E. V denotes the distribution of exhibit non-parametric or Gaussian distributions in smaller the volumes of all impulses. Traffic in a generic sub-window networks. is given by S = Y1 + Y2 + ::: + Ye, where e sampled The central theme of our approach is an impulse-based from E. The Yi’s are assumed to be independent since they model of traffic from individual processes on a single device. represent the volume of different processes within a sub- We observe that these traffic processes frequently are relatively window and are also identically distributed by construction. periodic and consistent in terms of volume, as illustrated in Figure 1, a rate plot of traffic to a single, centrally- managed device. The impulse characterization of traffic is 9000 justified by the fact that even long transmissions (e.g., a video on YouTube) are sent as short bursts of traffic rather than 7500 constant-rate, lengthier transmissions. Intuitively, the aggregation and perturbation of these traffic 6000 impulses from hundreds of devices leads to the frequently- observed self-similarity in aggregated network traces [3], [4]. Rate [packets/s] Rate 4500 Present throughout the trace: Music Buffering These models have historically relied on Levy´ processes Network and Windows Backups 3600-5200 pps 80, 200 pps fXt : t ≥ 0g, including Poisson and fractional Brownian motion [5]; we propose an alternative Levy´ process, the alpha- 3000 stable, due to its deep theoretical connections with well- YouTube Buffering 600 - 1200 pps Local Routines documented characteristics of traffic. More specifically, Levy´ 1500 1200 and 200 pps processes are SS if and only if they are alpha-stable [6] and are the domain of attraction of random variables with heavy- 0 750 900 0 150 300 Time [s] 450 600 This work was funded by the NPS Naval Research Program under proposal NRP-18-039A. Joshua Clymer is with the ASEE Science and Engineering Figure 1. Rate plot of several minutes of traffic received at local host with Apprenticeship Program. periods of music and video streaming. Their common distribution is approximated by V . We provide and inactivity periods are shadowed by Power distributions empirical evidence that V is well approximated by heavy- for a considerably long period of time, nevertheless a shard tailed functions belonging to the domain of attraction of alpha deviation from this trend is clearly expected at some point. stable distributions. The asymptotic behavior of the aggregation S can now be studied using the GCLT. Specifically, we can estimate the convergence rate under the stronger assumption that Y1 lies Activity Periods in the strong domain of attraction of a stable distribution [7]. 102 Inactivity Periods The convergence rate serves to estimate the expected error 101 in the stable fits. 0 A. Aggregation from sub-windows to window: Ergodicity 10 Intuitively, we think of a window as a set of consecutive Frequency 10−1 samples of sub-windows (typically between 600 and 1000). −2 The distribution of the aggregation of the random variables 10 defined above determines the outcome of randomly selecting 10−3 sub-windows within a stationary trace. 10−2 10−1 100 When interpreted as a Bernoulli scheme the model inherits Duration (seconds) the stronger condition of ergodicity, which implies that the distribution of a large enough window approximates the sample Figure 2. The probability distributions of packet flow duration and the distribution of the aggregation S. interruption periods between packet flows related to the same process. Although the classical argument using moments is unavail- able, we can justify this convergence using several other A packet flow is a string of packets related by (ip0; ip1) methods. The property is also demonstrated empirically in the possibly extending several sub-windows (i.e a consecutive longer manuscript. group of impulses). Two packets belong to the same flow if they are less than two sub-windows apart. We also define the B. Aggregation using Renewal processes random variables Sk and Ek given by We can also model traffic as the aggregation of renewal k−1 k−1 processes per Taqqu and Levy´ in [8] Specifically, they look at X X S = S + U + F k ≥ 1 processes of the form k 0 j j j=0 j=0 (2) T M ∗ X X Ek = Sk + Uk k ≥ 0 X (T;M) = Xm;t (1) t=1 m=1 representing the start-time and end-time of a packet flow respectively. Ik = (Sk;Ek] denotes the kth ON interval. where for each t, the random variables Xm;t : 1 ≤ m ≤ M are i.i.d. copies of a renewal process, and are interpreted as as a set Finally, we define the random variables of similar processes (say streaming from many different users), ( 1 wk \ Ik 6= ; whereas we interpret the index t as ranging across different δk = 0 otherwise processes or network activities in the distribution sense. ∗ The accumulation X (T;M) approaches a Gaussian frac- where wk refers to the kth sub-window in the trace. tional Brownian motion when T << M, and a stable process A signal is now expressed as when T >> M. See [8] for a quick note on how these two 1 self-similar processes differ. We can assume that a certain X X = W δ (3) process is identically distributed across devices if we restrict t k k k=0 ourselves to relatively short windows. PT For a given k ≥ 0 we think of Wk as an independent copy We now interpret the expression t=1 Xt as the superposi- of the random variable of number of packets over time rates tion of the volume of several processes at a sub-window. The with common distribution R, which we now assume to be sum of m copies of X(t) in a given sub-window suggests the truncated (Wk are assumed to posses finite second moments). traffic of m ”similar” processes. We expect that the relation Uk represents an independent copy of the packet flow duration T >> M is satisfied in large networks and in traces captured with distribution U and at busy nodes due to the increased effect of noise due to similarly we now introduce Fk denotes the OFF period the multi-tasking nature of a device using multiple network duration with distribution F . Both Uk and Fk will be assumed sockets. to satisfy the same conditions on Uk as in [8], namely they are i.i.d. and have finite variance or belong to the domain of C. Simulations attraction of a stable distribution with 1 ≤ α ≤ 2. In addition, These simple models capture some of the main properties Wk is independent of Uk and Fk. Figure 2 shows how activity of real network traffic which depends on a great number of hard to quantify factors by describing it in the following way: 1 d ks distance =0.053 Real Traffic(x1; :::) = Toy Model1(V; E) + error1 0.5 Empirical CDF d Simulated CDF = Toy Model (R; T; M; U; F ) + error2 0 2 200 300 400 500 600 700 800 900 1000 1100 1200 1 where both error1 and error1 go to zero asymptotically. Notice ks distance = 0.095 that these models are very much related. We think of them in 0.5 Empirical CDF Simulated CDF 0 terms of the relation: 100 200 300 400 500 600 700 800 900 1000 d 1 Toy Model2(R; T; M; U; F ) = Toy Model1(V; E)+error(U; F ): ks distance = 0.104 0.5 Empirical CDF Model 2 is analogous to the M=G=1 construction due to Cumulative Probability Simulated CDF 0 Cox .

Towards an Explanatory Model for Network Traffic

Introduction to Lévy Processes

A Mathematical Theory of Network Interference and Its Applications

Introduction to Lévy Processes

Fclts for the Quadratic Variation of a CTRW and for Certain Stochastic Integrals

Levy Processes

Financial Modeling with L´Evy Processes and Applying L

Models Beyond the Dirichlet Process

Final Report (PDF)

Modern Bayesian Nonparametrics

Multifractal Analysis of Superprocesses with Stable

BEJ 22 4.Pdf

Random Walks at Random Times: Convergence to Iterated Lévy