
ACCEPTED FROM OPEN CALL

Machine Learning for Networking: Workflow, Advances and Opportunities

Mowei Wang, Yong Cui, Xin Wang, Shihan Xiao, and Junchen Jiang

Abstract

Recently, machine learning has been used in every possible field to leverage its amazing power. For a long time, the networking and distributed computing system has been the key infrastructure to provide efficient computational resources for machine learning. Networking itself can also benefit from this promising technology. This article focuses on the application of machine learning for networking (MLN), which can not only help solve the intractable old network questions but also stimulate new network applications. In this article, we summarize the basic workflow to explain how to apply machine learning technology in the networking domain. Then we provide a selective survey of the latest representative advances with explanations of their design principles and benefits. These advances are divided into several network design objectives, and the detailed information of how they perform in each step of the MLN workflow is presented. Finally, we shed light on the new opportunities in networking design and community building of this new inter-discipline. Our goal is to provide a broad research guideline on networking with machine learning to help motivate researchers to develop innovative algorithms, standards and frameworks.

Mowei Wang, Yong Cui, and Shihan Xiao are with Tsinghua University. Yong Cui is the corresponding author. Xin Wang is with Stony Brook University. Junchen Jiang is with Carnegie Mellon University. Digital Object Identifier: 10.1109/MNET.2017.1700200

Introduction

With the prosperous development of the Internet, networking research has attracted a lot of attention in the past several decades, both in academia and industry. Researchers and network operators can face various types of networks (e.g., wired or wireless) and applications (e.g., network security and live streaming [1]). Each network application also has its own features and performance requirements, which may change dynamically with time and space. Because of the diversity and complexity of networks, specific algorithms are often built for different network scenarios based on the network characteristics and demands. Developing efficient algorithms and systems to deal with complex problems in different network scenarios is a challenging task.

Recently, machine learning (ML) techniques have made breakthroughs in a variety of application areas, such as image recognition, speech recognition and computer vision. Machine learning tries to construct algorithms and models that can learn to make decisions directly from data without following pre-defined rules. Existing machine learning algorithms generally fall into three categories: supervised learning (SL), unsupervised learning (USL) and reinforcement learning (RL). More specifically, SL algorithms learn to conduct classification or regression tasks from labeled data, while USL algorithms focus on classifying the sample sets into different groups (i.e., clusters) with unlabeled data. In RL algorithms, agents learn to find the best action series to maximize the cumulated reward (i.e., objective function) by interacting with the environment. The latest breakthroughs, including deep learning (DL), transfer learning and generative adversarial networks (GAN), also provide potential research and application directions in an unimaginable fashion.

Dealing with complex problems is one of the most important advantages of machine learning. For some tasks requiring classification, regression and decision making, machine learning may perform close to or even better than human beings; facial recognition and speech recognition are two examples. Since the network field often sees complex problems that demand efficient solutions, it is promising to bring machine learning algorithms into the network domain to leverage the powerful ML abilities for higher network performance. The incorporation of machine learning into network design and management also provides the possibility of generating new network applications. Actually, ML techniques have been used in the network field for a long time. However, existing studies are limited to the use of traditional ML attributes, such as prediction and classification. The recent development of infrastructures (e.g., computational devices like the GPU and TPU, ML libraries like TensorFlow and Scikit-Learn) and distributed data processing frameworks (e.g., Hadoop and Spark) provides a good opportunity to unleash the power of machine learning for pursuing the new potential in network systems.

Specifically, machine learning for networking (MLN) is suitable and efficient for the following reasons. First, as the best known capabilities of ML, classification and prediction play basic but important roles in network problems such as intrusion detection and performance prediction [1]. In addition, machine learning can also help decision making, which can facilitate network routing [2] and parameter adaptation [3, 4] according to the current states of the environment. Second, many network problems need to interact with complicated system environments. It is not easy to build accurate or analytic models to represent complex system behaviors, such as the load changing patterns of CDNs [5] and throughput characteristics [1]. Machine learning can provide an estimated model of these systems with acceptable accuracy. Finally, each network scenario may have different characteristics (e.g., traffic patterns and network states), and researchers often need to solve the problem for each scenario independently. Machine learning may provide new possibilities to construct a generalized model via a uniform training method [3, 4]. Among efforts in MLN, deep learning has also been investigated and applied to provide end-to-end solutions. The latest work in [6] conducts a comprehensive survey on previous efforts that apply deep learning technology in network related areas.

In this article, we investigate how machine learning technology can benefit network design and optimization. Specifically, we summarize the typical workflows and requirements for applying machine learning techniques in the network domain, which could provide a basic but practical guideline for researchers to have a quick start in the area of MLN. Then we provide a selective survey of the important networking advances with machine learning technology, most of which have been published in the last three years. We group these advances into several typical networking fields and explain how these prior efforts perform at each step of the MLN workflow. Then we discuss the opportunities of this emerging inter-discipline area. We hope our studies can serve as a guide for potential future research directions.

Basic Workflow for MLN

FIGURE 1. The typical workflow of machine learning for networking: Step 1, problem formulation (prediction, regression, clustering, or decision making); Step 2, data collection (e.g., traffic traces, performance logs); Step 3, data analysis (preprocessing, feature extraction); Step 4, model construction (offline training and tuning); Step 5, model validation (cross validation, error analysis), returning to the earlier steps if the requirements are not met; Step 6, deployment and inference (trading off speed, memory, stability and accuracy of inference).

Figure 1 shows the baseline workflow for applying machine learning in the network field, including problem formulation, data collection, data analysis, model construction, model validation, and deployment and inference. These stages are not independent but have inner relationships. This workflow is very similar to the traditional workflow for machine learning, as network problems are still applications in which machine learning can play a role. In this section, we explain each step of the MLN workflow with representative cases.

Problem Formulation: Since the training process of machine learning is often time consuming and involves high cost, it is important to correctly abstract and formulate the problem at the first step of MLN. A target problem can be classified into one of the machine learning categories, such as classification, clustering and decision making. This helps decide what kind and amount of data to collect and which learning model to select. An improper problem abstraction may lead to an unsuitable learning model, which can result in unsatisfactory learning performance. For example, it is better to cast the optimization of quality of experience (QoE) for live streaming as a real-time exploration-exploitation process rather than as a prediction-based problem [7], to well match the application characteristics.

Data Collection: The goal of this step is to collect a large amount of representative network data without bias. The network data (e.g., traffic traces and session logs with performance metrics) are recorded from different network layers according to the application needs. For example, the traffic classification problem often requires the support of datasets containing packet-level traces labeled with corresponding application classes [8]. In the context of MLN, data are often collected in two phases. In the offline phase, collecting enough high-quality historical data is important for data analysis and model training. In the online phase, real-time network state and performance information are often used as inputs or feedback for the learning model. The newly collected data can also be stored to update the historical data pool for model adaption.

Data Analysis: Every network problem has its own characteristics and is impacted by many factors, but only several factors (i.e., features) have the most effect on the target performance metric. For instance, the RTT and the inter-arrival time of ACKs may be the critical features in choosing the best size of the TCP congestion window [3]. In the learning paradigm, finding proper features is the key to fully unleashing the potential of data. This step attempts to extract the effective features of a network problem by analyzing the historical data samples, which can be regarded as a feature engineering process in the machine learning community. Before feature extraction, it is important to preprocess and clean the raw data through processes such as normalization, discretization, and missing value completion. Extracting features from cleaned data often needs domain-specific knowledge and insights into the target network problem [5], which is not only difficult but time-consuming. Thus, in some cases deep learning can be a good choice to help automate feature extraction [2, 6].
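To make this step concrete, the snippet below sketches the cleaning and normalization just described with scikit-learn. It is a minimal illustration, assuming hypothetical flow features (rtt_ms, ack_interarrival_ms, loss_rate) and toy samples rather than data from any surveyed system.

```python
# A minimal sketch of the data-analysis step: completing missing values and
# normalizing raw measurements before feature extraction and training.
import numpy as np
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Toy historical flow records; NaN marks a missing measurement.
raw = pd.DataFrame({
    "rtt_ms":              [12.0, 15.5, np.nan, 40.2, 38.9],
    "ack_interarrival_ms": [1.1, 0.9, 1.4, np.nan, 3.2],
    "loss_rate":           [0.00, 0.01, 0.00, 0.05, 0.04],
})

preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # missing value completion
    ("scale", StandardScaler()),                   # normalization
])

features = preprocess.fit_transform(raw)
print(features.shape)  # one row of cleaned features per flow sample
```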
Model Construction: Model construction involves model selection, training and tuning. A suitable learning model or algorithm needs to be selected according to the size of the dataset, the typical characteristics of the network scenario, the problem category, and so on. For example, accurate throughput prediction can improve the bitrate adaption of Internet video, and a Hidden-Markov Model may be selected for prediction due to the dynamic patterns of stateful throughput [1]. Then the historical data will be used to train a model with hyper-parameter tuning, which can take a long period of time in the offline phase. The parameter tuning process still lacks theoretical guidance, and often involves a search in a large space to find acceptable parameters, or tuning by personal experience.
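As a sketch of what offline training and tuning can look like, the snippet below searches a small hyper-parameter grid with cross validation. The model choice, the grid, and the synthetic data are illustrative assumptions, not the setups used in the surveyed works.

```python
# A minimal sketch of model construction: pick a model, then search the
# hyper-parameter space with cross-validated grid search.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))       # cleaned feature matrix from the last step
y = 5.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.5, size=500)  # e.g., throughput

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 200], "max_depth": [4, 8, None]},
    cv=5,                           # cross validation guards against overfitting
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)
```

In practice the search space is much larger than this toy grid, which is why this phase dominates the offline time budget.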

Model Validation: Offline validation is an indispensable step in the MLN workflow to evaluate whether the learning algorithm works well enough. During this step, cross validation is usually used to test the overall accuracy of the model in order to show whether the model is overfitting or under-fitting. This provides good guidance on how to optimize the model, e.g., increasing the data volume or reducing model complexity when there is overfitting. Analyzing wrongly predicted samples helps find the sources of errors and determine whether the model and the features are proper, or whether the data are representative enough for the problem [5, 8]. The procedures in the previous steps may need to be re-taken based on the error sources.

Deployment and Inference: When implementing the learning model in an operational network environment, some practical issues should be considered. Since there are often limitations on computation or energy resources and requirements on the response time, the tradeoff between accuracy and overhead is important for the performance of the practical network system [7]. In addition, machine learning often works in a best-effort way and does not provide any performance guarantee, which requires system designers to consider fault tolerance. Finally, practical applications often require the learning system to take real-time input, draw the inference, and output the corresponding policy online.
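One practical way to weigh this tradeoff before deployment is to benchmark the per-query inference time of candidate models, as in the sketch below; the two models and the synthetic data are illustrative placeholders.

```python
# A minimal sketch of checking the accuracy/overhead tradeoff: time
# single-sample inference for a cheap model and a heavier one.
import time
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=1000)

for model in (LinearRegression(), RandomForestRegressor(n_estimators=200, random_state=0)):
    model.fit(X, y)
    sample, n = X[:1], 1000
    start = time.perf_counter()
    for _ in range(n):
        model.predict(sample)          # one online inference per query
    ms_per_query = (time.perf_counter() - start) / n * 1000.0
    print(f"{type(model).__name__}: {ms_per_query:.3f} ms/prediction")
```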

TABLE 1. Relationships between latest advances and MLN workflow. Each entry lists the problem formulation, data collection (offline; online), data analysis, offline model construction, and deployment and online inference.

- Information cognition: Sibyl [11] (route measurement). Problem formulation: SL, prediction with RuleFit. Data collection: offline, combine data of platforms with a few powerful VPs in homogeneous deployment and with many limited VPs around the world; online, take users' queries as input round by round. Offline model construction: construct a RuleFit model to assign confidence to each predicted query. Deployment and online inference: optimize the measurement budget in each round to get the best coverage.

- Traffic prediction: Ref [9] (traffic volume prediction). Problem formulation: SL, prediction with a Hidden-Markov Model (HMM). Data collection: offline, synthetic and real traffic traces with flow statistics; online, only the flow statistics are observed. Data analysis: the flow count and the traffic volume have significant correlation. Offline model construction: train the HMM with the Kernel Bayes Rule and a recurrent neural network with Long Short-Term Memory units. Deployment and online inference: take flow statistics as input and output the predicted traffic volume.

- Traffic classification: RTC [8] (traffic classification). Problem formulation: SL and USL, clustering and classification. Data collection: labeled and unlabeled traffic traces. Data analysis: flow statistical features extracted from traffic flows; zero-day applications exist and may degrade the classification accuracy. Offline model construction: find the zero-day application class and train the classifier. Deployment and online inference: infer with the trained model to output the classification results.

- Resource management: DeepRM [13] (job scheduling). Problem formulation: RL, decision making with deep RL. Data collection: offline, synthetic workloads with different patterns used for training; online, the real-time resource demands of arriving jobs. Data analysis: the action space is too large and there may be conflicts between actions. Offline model construction: offline training to update the policy network. Deployment and online inference: directly schedule arriving jobs with the trained model.

- Network adaption: Ref [2] (routing strategy). Problem formulation: SL, decision making with Deep Belief Architectures (DBA). Data collection: offline, periodically record the traffic patterns in each routing path computed by the OSPF protocol; online, the traffic patterns in each router. Data analysis: it is difficult to characterize input and output patterns that reflect the dynamic nature of large-scale heterogeneous networks. Offline model construction: layer-wise training to initialize the DBA structure and backpropagation to fine-tune it. Deployment and online inference: label online traffic patterns and obtain the routing nodes from the DBAs.

- Network adaption: Pytheas [7] (general QoE optimization). Problem formulation: RL, decision making with a variant of UCB. Data collection: offline, session quality information with general features on a large time scale; online, session quality information within each group on a small time scale. Data analysis: application sessions with similar features can be grouped. Offline model construction: the backend cluster determines the session groups using CFA [5] on a long time scale. Deployment and online inference: the frontend performs the group-based exploration-exploitation strategy in real time.

- Network adaption: Remy [3] (TCP congestion control). Problem formulation: RL, decision making with a tabular method. Data collection: offline, collect experience from a network simulator; online, calculate network state variables from ACKs. Data analysis: select the most influential metrics as state variables. Offline model construction: given the network assumptions, the generated algorithm interacts with the simulator to learn the best action for each state. Deployment and online inference: directly implement the Remy-generated algorithm in the corresponding network environment.

- Network adaption: PCC [4] (TCP congestion control). Problem formulation: RL, decision making with online learning. Data collection: online, calculate the utility function according to the received SACKs. Data analysis: TCP's assumptions are often violated; the directly observed performance is a better guide. Deployment and online inference: take trials with different sending rates and find the best rate according to the feedback utility function.

- Performance prediction: CFA [5] (video QoE optimization). Problem formulation: USL, clustering with a self-designed algorithm. Data collection: offline, datasets consisting of quality measurements collected from public CDNs; online, take session features (e.g., bitrate, CDN, player) as input. Data analysis: similar sessions, as determined by critical features, have similar quality. Offline model construction: critical feature learning on a scale of minutes and quality estimation on a scale of tens of seconds. Deployment and online inference: look up the feature-quality table to respond to real-time queries.

- Performance prediction: CS2P [1] (throughput prediction). Problem formulation: SL, prediction with HMMs. Data collection: offline, datasets of HTTP throughput measurements from iQIYI; online, take users' session features as input. Data analysis: sessions with similar features tend to behave in similar patterns. Offline model construction: find the set of critical features and learn an HMM for each cluster of similar sessions. Deployment and online inference: map a new session to the most similar session cluster and use the corresponding HMM to predict its throughput.

- Configuration extrapolation: CherryPick [15] (cloud configurations). Problem formulation: SL, parameter searching with Bayesian Optimization. Data collection: online, take the performance under the current configuration as input. Data analysis: the configuration space is large and the applications are heterogeneous. Deployment and online inference: take trials with different configurations and decide the next trial direction with the Bayesian optimization model.

Overview of Recent Advances

Recent breakthroughs in deep learning and other promising machine learning techniques have a non-negligible influence on new attempts of the network community. Existing efforts have led to several considerable advances in different subfields of networking. To illustrate the relationship between these up-to-date advances and the MLN workflow, in Table 1 we divide the literature into several application scenarios and show how the studies perform at each step of the MLN workflow.

Without ML techniques, the typical solutions for these advances involve time-series analytics [1, 9], statistical methods [1, 5, 7, 8] and rule-based heuristic algorithms [2-5, 10], which are often more interpretable and easier to implement. However, ML-based methods have a stronger ability to provide a fine-grained strategy and can achieve higher prediction accuracy by extracting hidden information from historical data. As a big challenge of ML-based solutions, the feasibility problem is also discussed in this section.

Information Cognition

Since data are the fundamental resource for MLN, information (data) cognition with high efficiency is critical to capture the network characteristics and monitor network performance. However, due to the complex nature of existing networks and the limitations of measurement tools and architectures, it is still not easy to access some types of data (e.g., trace route and traffic matrix data) within acceptable granularity and cost. With its capability for prediction, machine learning can help evaluate network reliability or the likelihood of a certain network state. As the first example, Internet route measurements help monitor network running states and troubleshoot performance problems. However, due to insufficient usable vantage points (VPs) and a limited probing budget, it is impossible to execute each route query, because the query may not match any previously measured path, or the path may have changed. Sibyl [11] attempts to predict the unseen paths and assign confidence to them by using a supervised machine learning technique called RuleFit.

The learning relies on data acquisition, and MLN also requires a new scheme of data cognition. MLN often needs to maintain an up-to-date global network state and perform real-time responses to demands, which requires measuring and collecting information in the core network. In order to enable the network to perform diagnostics and make decisions by itself with the help of machine learning or cognitive algorithms, a different network construct, the Knowledge Plane [12], was presented that can achieve automatic information cognition; it has inspired the following efforts that leverage ML or data-driven methods to enhance network performance.
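As a hedged illustration of the Sibyl-style idea of assigning confidence to predicted queries and spending a limited probing budget accordingly, the sketch below uses a gradient-boosted classifier as a stand-in, since RuleFit is not available in scikit-learn; the features and data are hypothetical.

```python
# A sketch of supervised confidence assignment for route queries, loosely
# inspired by Sibyl [11]; not Sibyl's actual RuleFit model or features.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
# Hypothetical features of a candidate (VP, query) pair: path staleness,
# hop overlap with previously measured paths, VP probing load.
X = rng.normal(size=(400, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # 1 = probe likely answers query

clf = GradientBoostingClassifier().fit(X, y)
candidates = rng.normal(size=(10, 3))
confidence = clf.predict_proba(candidates)[:, 1]

budget = 3                                      # limited probing budget
chosen = np.argsort(confidence)[::-1][:budget]  # probe most confident first
print(chosen, confidence[chosen])
```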
Traffic Prediction and Classification

Traffic prediction and classification are two of the earliest machine learning applications in the networking field. Because of the well formulated problem descriptions and the demands from various subfields of networking, studies of the two topics always maintain a certain degree of popularity.

Traffic Prediction: As an important research problem, the accurate estimation of traffic volume (e.g., the traffic matrix) is beneficial to congestion control, resource allocation, network routing, and even high-level live streaming applications. There are mainly two directions of research, time series analysis and indirect estimation, which can be simply distinguished by whether the prediction is conducted with direct traffic observations or not. However, it is expensive to directly measure traffic volume, especially in a large-scale, high-speed network environment.

Many existing studies focus on reducing the measurement cost by using indirect metrics, rather than only trying different ML algorithms. There are two methods to handle this problem. One is to take more human effort to develop sophisticated algorithms by exploring domain-specific knowledge and undiscovered data patterns. As an example, the work in [9] attempts to predict traffic volume according to the dependence between flow counts and flow volume. The other method is inspired by the end-to-end deep learning approach. It takes some easily obtained information (e.g., the bits of the first few packets of a flow) as direct input and extracts features automatically with the help of the learning model [10].

Traffic Classification: As a fundamental function in network management and security systems, traffic classification matches network applications and protocols with the corresponding traffic flows. The traditional traffic classification methods include the port-based approach and the payload-based approach. The port-based approach has been proved ineffective due to unfixed or reused port assignments, while the payload-based approach suffers from the problems caused by deep packet inspection, which can even fail in the presence of encrypted traffic. As a result, machine learning approaches based on statistical features have been extensively studied in recent years, especially in the network security domain. However, it is not easy to consider machine learning an omnipotent solution and deploy it into a real-world operational environment. For instance, unlike the traditional machine learning application of identifying whether a figure shows a cat or not, a misclassification creates a big cost in the context of network security. Generally, these studies range from all-known classification scenarios to a more realistic situation with unknown traffic (e.g., zero-day application traffic [8]). This research roadmap is very similar to the machine learning technology that evolves from supervised learning to unsupervised and semi-supervised learning, and it can be treated as a pioneer paradigm for importing machine learning into networking fields.
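The sketch below conveys the flavor of classification from flow statistical features, with a simple confidence threshold to flag possibly unknown (zero-day-like) flows. It is an invented toy, not the actual robust traffic classification scheme of [8].

```python
# A toy flow classifier over statistical features, flagging low-confidence
# flows as unknown instead of forcing them into a known class.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
# Hypothetical flow features: mean packet size, flow duration, bytes/s.
X = rng.normal(size=(600, 3))
y = (X[:, 0] > 0).astype(int) + (X[:, 1] > 0).astype(int)  # toy classes 0/1/2

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

flows = rng.normal(size=(5, 3))
proba = clf.predict_proba(flows)
labels = np.where(proba.max(axis=1) < 0.5,   # unsure about every known class
                  -1,                        # -1 marks suspected unknown traffic
                  proba.argmax(axis=1))
print(labels)
```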

FIGURE 2. Remy's mechanism illustration [3]. Prior knowledge consists of a traffic model (Web traffic, video conferencing, batch processing, or a mixture), network assumptions (ranges of the bottleneck link speeds, non-queueing delays, queue sizes, and degrees of multiplexing), and the reward, i.e., the objective function. The agent (a tabular method with greedy search) interacts with the environment (the NS-2 simulator): the state is a set of network variables computed from the feedback signals (ACK, RTT), and the action sets the cwnd parameters. The result, RemyCC, is a state-action mapping (network state to cwnd parameters) given the traffic model and network assumptions.

Resource Management and Network Adaption

Efficient resource management and network adaption are the keys to improving network system performance. Some example issues to address are traffic scheduling, routing [2], and TCP congestion control [3, 4]. All these issues can be formulated as decision-making problems [13]. However, it is challenging to solve these problems with a rule-based heuristic algorithm, due to the complexity of diverse system environments, noisy inputs, and the difficulty of optimizing the tail performance [13]. Specifically, arbitrary parameter assignments based on experience and actions taken following predetermined rules often result in a scheduling algorithm that is understood by people but far from optimal.

Deep learning is a promising solution due to its ability to characterize the inherent relationships between the inputs and outputs of network systems without human involvement. In order to meet the requirements of changing network environments, previous efforts in [2, 14] design a traffic control system with the support of deep learning techniques. Reconsidering router architectures and strategies, it takes the traffic pattern in each router as input and outputs the next nodes in the routing path with Deep Belief Architectures. These advancements unleash the potential of the DL-based strategy in network routing and scheduling. Harnessing the powerful representational ability of deep neural networks, deep reinforcement learning achieves great results in many AI problems. DeepRM [13] is the first work that applies a deep RL algorithm to cluster resource scheduling. Its performance is comparable to state-of-the-art heuristic algorithms, but with less cost.

The QoE optimization problem can also benefit from the RL methodology. Unlike previous efforts, Pytheas [7] regards this problem as an exploration-exploitation-based problem rather than a prediction-based problem. As a result, Pytheas outperforms state-of-the-art prediction-based systems by lessening the prediction bias and delayed response. From this perspective, machine learning may help achieve the closed loop of "sensing-analysis-decision," especially in wireless sensor networks, where the three actions are separated from each other at present.

Several attempts have been made to optimize the TCP congestion control algorithm using the reinforcement learning approach, due to the difficulty of designing a congestion control algorithm that fits all network states. To make the algorithm self-adaptive, Remy [3] takes the target network assumptions and traffic model as prior knowledge to automatically generate the specific algorithm, which achieves an amazing performance gain in many circumstances. In the offline phase, Remy tries to learn a mapping, i.e., RemyCC, between the network state and the corresponding parameters of the congestion window (cwnd) by interacting with the network simulator. In the online phase, whenever an ACK is received, RemyCC looks up its mapping table and changes its cwnd behavior according to the current network state. The mechanism of Remy is illustrated in Fig. 2. Without the specific network assumptions, a performance-oriented attempt, PCC [4], can benefit from its online-learning nature. Although these TCP-related efforts still focus on decision making, they take the first important step toward automated protocol design.
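To make the offline/online split concrete, here is a minimal sketch of an online table lookup in the style of RemyCC. The state discretization, the actions, and the table contents are invented for illustration; Remy learns the real mapping offline with greedy search against a simulator [3].

```python
# A toy RemyCC-style lookup: discretized network state -> cwnd action.
from dataclasses import dataclass

@dataclass
class Action:
    cwnd_multiplier: float   # multiply the current congestion window
    cwnd_increment: int      # then add this many segments

# Invented table, as if produced by Remy's offline search.
TABLE = {
    (0, 0): Action(1.0, 2),  # low RTT, slow ACKs: probe gently
    (0, 1): Action(1.5, 1),  # low RTT, fast ACKs: grow aggressively
    (1, 0): Action(0.5, 0),  # high RTT, slow ACKs: back off
    (1, 1): Action(1.0, 0),  # high RTT, fast ACKs: hold
}

def bucketize(rtt_ms: float, acks_per_ms: float) -> tuple:
    return (int(rtt_ms > 100.0), int(acks_per_ms > 1.0))

def on_ack(cwnd: float, rtt_ms: float, acks_per_ms: float) -> float:
    act = TABLE[bucketize(rtt_ms, acks_per_ms)]  # online: lookup per ACK
    return cwnd * act.cwnd_multiplier + act.cwnd_increment

print(on_ack(cwnd=10.0, rtt_ms=30.0, acks_per_ms=2.0))  # -> 16.0
```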
Performance Prediction and Configuration Extrapolation

Performance prediction can guide decision making. Some example applications are video QoE prediction, CDN location selection, best wireless channel selection, and performance extrapolation under different configurations. Machine learning is a natural approach to predict system states for better decision making.

Typically, there are two general prediction scenarios. In the first, the system owner has the ability to gather varied and plentiful historical data, but it is non-trivial to build a complex prediction model and update it in real time, which requires a new approach exploiting domain-specific knowledge to simplify the problem (e.g., CFA [5] for video QoE optimization). In prior work, CS2P [1] aims to improve video bitrate selection with accurate throughput prediction. From data analysis, it finds that sessions with similar key features may have more related throughput behavior. CS2P learns to cluster similar sessions offline and trains a different Hidden-Markov Model for each cluster to predict the corresponding throughput given the current session information. CS2P reinforces the correlation of similar sessions in the training process, which outperforms approaches with one single model. This is very similar to the above mentioned traffic prediction problem, since they both passively fit the runtime ground truth with a certain metric.

In the second prediction scenario, little historical data exist, and it is infeasible to obtain representative data by conducting performance tests due to high trial costs in real network systems. To deal with this dilemma, CherryPick [15] leverages the Bayesian optimization algorithm to minimize pre-run rounds with a directional guidance to collect representative runtime data of workloads under different configurations.
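The sketch below captures the high-level loop of such a search under stated assumptions: model the cost of tried configurations with a Gaussian process and let an acquisition score pick the next trial. The acquisition rule, the configuration grid, and the synthetic workload are illustrative; CherryPick's actual design differs in detail [15].

```python
# A toy Bayesian-optimization loop over a small cloud-configuration grid.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

configs = np.array([[c, m] for c in (2, 4, 8, 16) for m in (4, 8, 16, 32)],
                   dtype=float)                    # (vCPUs, GB RAM) grid

def run_workload(cfg):      # stand-in for an actual benchmark run
    cpus, mem = cfg
    return 100.0 / cpus + 50.0 / mem + np.random.normal(scale=0.5)

tried, costs = [0], [run_workload(configs[0])]     # seed with one trial
for _ in range(5):                                 # a few pre-run rounds
    gp = GaussianProcessRegressor().fit(configs[tried], costs)
    mean, std = gp.predict(configs, return_std=True)
    score = mean - std                             # optimistic lower bound
    score[tried] = np.inf                          # do not repeat trials
    nxt = int(np.argmin(score))
    tried.append(nxt)
    costs.append(run_workload(configs[nxt]))

best = tried[int(np.argmin(costs))]
print("best configuration:", configs[best], "cost:", min(costs))
```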

TABLE 2. Processing time of selective advances.

- Network adaption: Ref [2] (routing strategy). Offline: training on 100,000 samples takes ~100,000 s on an Intel i7-6900K CPU and ~1,000 s on an Nvidia Titan X Pascal GPU. Online: <1 ms per decision with fewer than 400 routers; >100 ms with 1,000 routers.
- Network adaption: Pytheas [7] (general QoE optimization). Offline: session grouping finds 200 groups per minute over 8.5 million sessions (2.4 GHz, 8 cores, 64 GB RAM). Online: not mentioned.
- Network adaption: Remy [3] (TCP congestion control). Offline: a few hours on Amazon EC2 (80-core and 48-core machines). Online: not mentioned.
- Performance prediction: CFA [5] (video QoE optimization). Offline: critical feature learning takes ~30.1 min every 30–60 min. Online: quality estimation takes ~30.7 s every 1–5 min, and query response takes ~0.66 ms per query (two clusters of 32 cores).
- Performance prediction: CS2P [1] (throughput prediction). Offline: not mentioned. Online: ~150 predictions per second on the server side (Intel i7, 2.2 GHz, 16 GB RAM, Mac OS X 10.11) and <10 ms per prediction on the client side (Intel i7, 2.8 GHz, 8 GB RAM, Mac OS X 10.9).

Feasibility Discussion

One big challenge faced by ML-based methods is their feasibility. Since many networking applications are delay-sensitive, it is non-trivial to design a real-time system with heavy computation loads. To make it practical, a common solution is to train the model with global information over a long period of time and incrementally update the model with local information on a small time scale [5, 7], which trades off between the computation overhead and information staleness. In the online phase, the common case is to look up the result table or draw the inference with a trained model to make real-time decisions. The processing times of selected advances are listed in Table 2, which shows that ML has practical value when the system is well designed. In addition, the robustness and generalization of a design are also important for feasibility and are discussed later.

From these perspectives, ML in its current state is not suitable for all networking problems. The network problems solved with ML techniques so far are more or less related to prediction, classification and decision making, while it is difficult to apply machine learning to other types of problems. Other reasons that prevent the application of ML techniques include the lack of labeled data, high system dynamics, and the high cost brought by learning errors.
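The global-offline, local-incremental pattern above can be sketched with any model that supports incremental updates; below, scikit-learn's partial_fit serves as an illustrative stand-in, with synthetic data in place of real traces.

```python
# Long-period offline training on the historical pool, then cheap
# small-time-scale refreshes with freshly collected local samples.
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(3)
w = np.array([1.0, -2.0, 0.5, 0.0])                # hidden "true" relationship
X_hist = rng.normal(size=(5000, 4))                # global historical pool
y_hist = X_hist @ w + rng.normal(scale=0.1, size=5000)

model = SGDRegressor()
model.fit(X_hist, y_hist)                          # offline phase

for _ in range(10):                                # online phase, small time scale
    X_new = rng.normal(size=(50, 4))               # freshly collected samples
    y_new = X_new @ w + rng.normal(scale=0.1, size=50)
    model.partial_fit(X_new, y_new)                # incremental model refresh
```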
Opportunities for MLN

The prior efforts mostly focus on the generalized concepts of prediction and classification, and few get out of this scope to explore other possible applications. However, with the latest breakthroughs in machine learning and its infrastructures, new potential demands may appear in network disciplines. Some opportunities are introduced as follows.

Open Datasets for the Networking Community

Collecting a large amount of high quality data that contain both network profiles and performance metrics is one of the most critical issues for MLN. However, acquiring enough labeled data is still expensive and labor intensive, even in today's machine learning community. For many reasons, it is not easy for researchers to acquire enough real trace data, even if there are many existing open datasets in the networking domain.

This reality drives us to learn from the machine learning community and put much more effort into constructing open datasets like ImageNet. With unified open datasets, performance benchmarks are an inevitable outcome, providing a standard platform for researchers to compare their new algorithms or architectures with state-of-the-art ones. This can reduce unrepresentative repeated experiments and have a positive effect on academic loyalty. In addition, it has been proved in the machine learning domain that learning with a simulator rather than in a real environment is more effective and has lower cost in RL scenarios [3]. In the networking domain, due to the limited accessibility and high test cost of large-scale network systems, simulators with sufficient fidelity and high running speed are also required. These items contribute to both MLN and the further development of the networking domain, and public resources also make it possible for the community to conduct research.

Automated Network Protocol and Architecture Design

With a deeper understanding of the network, researchers gradually find that the existing network has many limitations. The network system is totally created by human beings. The current network components are likely to have been added based on people's understanding at a time instant rather than as a paragon of engineering. There is still enough room for us to improve network performance and efficiency by redesigning the network protocol and architecture.

It is still quite difficult to design a protocol or architecture automatically today. However, the machine learning community has made some of the simplest attempts in this direction and has achieved some amazing results, such as letting agents communicate with each other to finish a task cooperatively. Other new achievements, e.g., GAN, have also shown that the machine learning model has the ability to generate elements existing in the real world and to create strategies people have not discovered (e.g., AlphaGo). However, these generated results are still far from the possibility of protocol design. There is great potential and possibility to create new feasible network components without human involvement, which may refresh our understanding of network systems and propose some currently unacceptable destructive-reconstruction frameworks.

Automated Network Resource Scheduling and Decision Making

It is hard to conduct online scheduling with a principle-based heuristic algorithm due to the uncertainty and dynamics of network conditions. In the machine learning community, it has been proved that reinforcement learning has a strong capability to deal with decision making problems. The recent breakthrough in the game of Go also proves that ML can make not only coarse but also precise decisions, which is beyond people's common sense. Although it is not easy to directly apply an exploration-exploitation strategy in highly varying network environments, reinforcement learning can be a candidate to replace the adaptive algorithms of present network systems. Related efforts can be found in [3, 4, 7, 13]. In addition, reinforcement learning is highly suitable for problems where several undetermined parameters need to be assigned adaptively according to the network state. However, these methods introduce new complexity and uncertainty into the network system itself, while stability, reliability and repeatability are always the goals of network design.

Moreover, network scheduling with RL also provides a new opportunity to support flexible objective functions and cross-layer optimization. It is very convenient to change the optimization goal just by changing the reward function in the learning model, which is impossible with a traditional heuristic algorithm. Also, the system may perceive high-level application behaviors or QoE metrics as a reward, which may enable adaptive cross-layer optimization without a network model. In practice, it is nontrivial to design an effective reward function. The simplest reward design principle is to set the direct goal that needs to be maximized as the reward. However, it is often difficult to capture the exact optimization objective, and as a result we end up with an imperfect but easily obtained metric instead. In most cases this works well, but sometimes it leads to faulty reward functions that may result in undesired or even dangerous behavior.
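A small sketch of this flexibility, with invented utility shapes loosely in the spirit of PCC-style utilities [4]: the scheduler's goal changes by swapping only the reward function.

```python
# Swap the optimization goal of a greedy rate picker by changing its reward.
def reward_throughput(goodput, loss, rtt_ms):
    return goodput                                 # goal 1: raw goodput

def reward_low_latency(goodput, loss, rtt_ms):
    return goodput - 0.6 * rtt_ms                  # goal 2: also penalize delay

def measure(rate_mbps):                            # stand-in for real feedback
    loss = max(0.0, (rate_mbps - 80.0) / 200.0)    # toy congestion model
    rtt = 20.0 + max(0.0, rate_mbps - 60.0)
    return rate_mbps * (1.0 - loss), loss, rtt

def pick_rate(rates, reward):
    """Greedy one-step decision: try each rate, keep the best reward."""
    return max(rates, key=lambda r: reward(*measure(r)))

rates = [20, 40, 60, 80, 100]
print(pick_rate(rates, reward_throughput))   # favors the highest goodput (100)
print(pick_rate(rates, reward_low_latency))  # backs off to protect latency (80)
```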
Improving the Comprehension of Network Systems

Network behavior is quite complex due to the end-to-end network design principle, which generates various protocols that take simple actions at the end hosts but cause nontrivial in-network behavior. From this perspective, it is not easy to figure out which factors directly affect a certain network metric and which can be simplified during algorithm design, even in a mature network research domain like TCP congestion control. However, with the help of machine learning methods, people can analyze the output of learning algorithms through a posterior approach to find useful insights into how the network behaves and how to design a high performance algorithm.

For a detailed explanation, DeepRM [13], a resource management framework, is a good example. To understand why DeepRM performs better, the authors find that DeepRM is not work-conserving but decides to reserve room for yet-to-arrive small jobs, which eventually contributes to reducing the job waiting time. For other evidence, refer to CFA [5] and Remy [3] and their follow-up works, which provide insights into the key influence factors in video QoE optimization and TCP congestion control, respectively.

Promoting the Development of Machine Learning

When applying machine learning to networking fields, due to the specific requirements of network systems and practical implementation problems, some inherent limitations and other emerging problems of machine learning can be pushed forward to a new understanding stage with the joint efforts of the two research communities.

Typically, there are several problems that are expected to be resolved. First, the robustness of machine learning algorithms is a key challenge for applications (e.g., self-driving cars and network operation) in real-world environments where learning errors could lead to high costs. The networking situation often requires hard constraints on the algorithm output and a worst-case performance guarantee. Second, a model with high generalization ability that can adapt to high-variance and dynamic traffic circumstances is needed, since it is unacceptable to retrain the model every time the characteristics of network traffic change.

7 IEEE Network • Accepted for publication show that the model trained under a specific net- work environment can, to some degree, achieve Due to the heterogeneity of networking systems, it is imperative to embrace machine learning good performance in other environments [3], it techniques in the networking domain for potential breakthroughs. However, it is not easy for is still not easy because most machine learning algorithms assume that the data follow the same networking researchers to take it into practice due to the lack of machine learning related distribution, which is not practical in networking experiences and insufficient directions. environments. In addition, the accountability and interpretability [3] of machine learning algorithms [10] P. Poupart et al., “Online Flow Size Prediction for Improved create big obstacles in practical implementations, Network Routing,” Proc. IEEE 24th Int’l. Conf. Network Pro- since many learning models, especially for deep tocols (ICNP), 2016, pp. 1–6. learning, are still black box. People do not know [11] I. Cunha et al., “Sibyl: A Practical Internet Route Oracle,” why and how it behaves, hence people cannot Proc. NSDI 2016, pp. 325–44. [12] D. D. Clark et al., “A Knowledge Plane for the Internet,” interfere with the policy. Proc. SIGCOMM 2003, ACM, pp. 3–10. [13] H. Mao et al., “Resource Management with Deep Rein- Conclusions forcement Learning,” Proc. HotNets 2016, pp. 50–56. Due to the heterogeneity of networking systems, [14] N. Kato et al., “The Deep Learning Vision for Heteroge- neous Network Traffic Control: Proposal, Challenges, and it is imperative to embrace machine learning tech- Future Perspective,” IEEE Wireless Commun., 2016. niques in the networking domain for potential [15] O. Alipourfard et al., “Cherrypick: Adaptively Unearthing breakthroughs. However, it is not easy for net- the Best Cloud Configurations for Big Data Analytics,” Proc. working researchers to take it into practice due to NSDI 2017, pp. 469–82. the lack of machine learning related experiences Biographies and insufficient directions. In this article, we pres- Mowei Wang received the B.Eng. degree in ent a basic workflow to provide researchers with a engineering from Beijing University of Posts and Telecommuni- cations, Beijing, China, in 2017. He is currently working toward practical guideline to explore new machine learn- his Ph.D. degree in the Department of and ing paradigms for future networking research. For Technology, Tsinghua University, Beijing, China. His research a deeper comprehension, we summarize the lat- interests are in the areas of networks and machine est advances in machine learning for networking, learning. which covers multiple important network tech- Yong Cui received the B.E. degree and the Ph.D. degree, both niques, including measurement, prediction and in computer science and engineering, from Tsinghua University, scheduling. Moreover, numerous issues are still China, in 1999 and 2004, respectively. He is currently a full pro- open and we shed light on the opportunities that fessor in the Computer Science Department in Tsinghua Univer- sity. He has published over 100 papers in refereed conferences need further research effort from both the net- and journals with several Best Paper Awards. He has co-authored working and machine learning perspectives. seven Internet standard documents (RFC) for his proposal on IPv6 . His major research interests include mobile cloud Acknowledgment computing and network architecture. 
He served or serves on the editorial boards of IEEE TPDS, IEEE TCC and IEEE Internet Com- This work is supported by NSFC (no. 61422206), puting. He is currently a working group co-chair in IETF. TNList and the “863” Program of China (no. 2015AA016101). We would also like to thank Xin Wang received the B.S. and M.S. degrees in telecommuni- Keith Winstein from Stanford University for his cations engineering and wireless engineering, respectively, from Beijing University of Posts and Telecommu- helpful suggestions to improve this article. nications, Beijing, China, and the Ph.D. degree in electrical and from Columbia University, New York, References NY. She is currently an associate professor in the Department [1] Y. Sun et al., “CS2P: Improving Video Bitrate Selection and of Electrical and Computer Engineering, State University of New Adaptation with Data-Driven Throughput Prediction,” Proc. York at Stony Brook, Stony Brook, NY. Before joining Stony SIGCOMM 2016, ACM, pp. 272–85. Brook, she was a member of technical staff in the area of mobile [2] B. Mao et al., “Routing or Computing? The Paradigm Shift and wireless networking at Research, Technol- Towards Intelligent Computer Transmission ogies, New Jersey, and an assistant professor in the Department Based on Deep Learning,” IEEE Trans. , 2017. of Computer Science and Engineering, State University of New [3] K. Winstein and H. Balakrishnan, “TCP Ex Machina: Comput- York at Buffalo, Buffalo, NY. Her research interests include algo- er-Generated Congestion Control,” Proc. ACM SIGCOMM rithm and protocol design in wireless networks and communica- Computer Commun. Rev., vol. 43, no. 4, ACM, 2013, pp. tions, mobile and , and networked sensing 123–34. and detection. She has served on the executive committee and [4] M. Dong et al., “PCC: Re-Architecting Congestion Control technical committee of numerous conferences and funding for Consistent High Performance,” Proc. NSDI 2015, pp. review panels, and serves as an associate editor for IEEE Trans- 395–408. actions on Mobile Computing. She achieved the NSF CAREER [5] J. Jiang et al., “CFA: A Practical Prediction System for Video Award in 2005 and the ONR Challenge Award in 2010. QoE Optimization,” Proc. NSDI 2016, pp. 137–50. [6] Z. Fadlullah et al., “State-of-the-Art Deep Learning: Evolv- Shihan Xiao received the B.Eng. degree in electronic and ing Machine Intelligence Toward Tomorrow’s Intelligent information engineering from Beijing University of Posts and Network Traffic Control Systems,”IEEE Commun. Surveys & , Beijing, China, in 2012. He is currently Tutorials, 2017. working toward his Ph.D. degree in the Department of Comput- [7] J. Jiang et al., “Pytheas: Enabling Data-Driven Quality of er Science and Technology, Tsinghua University, Beijing, China. Experience Optimization Using Group-Based Exploration-Ex- His research interests are in the areas of wireless networking ploitation,” Proc. NSDI 2017, pp. 393–406. and . [8] J. Zhang et al., “Robust Network Traffic Classification,”IEEE/ ACM Trans. Networking (TON), vol. 23, no. 4, 2015, pp. Junchen Jiang is a Ph.D. candidate in the Computer Science 1257–70. Department at Carnegie Mellon University, Pittsburgh, PA, USA, [9] Z. Chen, J. Wen, and Y. Geng, “Predicting Future Traffic where he is advised by Prof. Hui Zhang and Prof. Vyas Sekar. Using Hidden Markov Models,” Proc. IEEE 24th Int’l. Conf. He received the Bachelor’s degree in computer science and Network Protocols (ICNP) 2016, pp. 1–6. 
technology from Tsinghua University, Beijing, China, in 2011.
