Applications of Machine Learning in Cable Access Networks Karthik Sundaresan, Nicolas Metts, Greg White; Albert Cabellos-Aparicio Cablelabs; UPC Barcelonatech
Total Page:16
File Type:pdf, Size:1020Kb
Applications of Machine Learning in Cable Access Networks Karthik Sundaresan, Nicolas Metts, Greg White; Albert Cabellos-Aparicio CableLabs; UPC BarcelonaTech. Abstract proven to be powerful components in the Machine Learning space. Recent advances in Machine Learning algorithms have resulted in an explosion of Potential applications for Machine real-world applications, including self-driving Learning abound in the areas of network cars, face and emotion recognition, automatic technology and applications. It can be used to image captioning, real-time speech intelligently learn the various environments of translation and drug discovery. networks and react to dynamic situations better than a fixed algorithm. When it Potential applications for Machine becomes mature, it would greatly accelerate Learning abound in the areas of network the development of autonomic networking. technology, and there are various problem areas in the cable access network which can There are various problem areas in the benefit from Machine Learning techniques. cable access network which can benefit from This paper provides an overview of Machine Machine Learning techniques. For example, Learning algorithms, and how they could be the Proactive Network Maintenance data applied to the applications of interest to cable which represents network conditions obtained operators and equipment suppliers. from CMTS and CM can be processed through Machine Learning algorithms to enable automatic recognition of issues in the cable plant and initiating the needed INTRODUCTION corrective actions. Patterns in network traffic, equipment failures, device reboots or link Recent advances in Machine Learning, and failures can be used to identify the root cause in particular, deep learning, have resulted in of various network problems. Video channel an explosion of real-world applications, usage data could be used to optimize the including self-driving cars, face and emotion multicast channel lineups in a network, and be recognition, automatic image captioning, real- used to predict which programs to record time speech translation and drug discovery. automatically in a DVR. Powerful new learning algorithms, harnessed via open-source toolkits and running on This paper provides an overview of general-purpose GPUs (even cloud instances) Machine Learning algorithms with focus on are the key enablers for this explosion. the applications of interest to cable operators and equipment suppliers. The paper discusses Machine Learning technologies can learn several learning toolkits and computational from historical data, and make predictions or platforms, and hypothesizes some of the most decisions, rather than following strictly static compelling near-term uses. It will explore program instructions. They can dynamically how to apply learning algorithms to various adapt to a changing situation and enhance challenges faced by operators in managing their own intelligence by learning from new network devices and services. In addition, the data. Statistical pattern recognition, data paper covers best practices for achieving mining, clustering and deep learning have optimal results with Machine Learning. 2016 Spring Technical Forum Proceedings produce accurate answers for input points for OVERVIEW OF MACHINE LEARNING which it has not been trained explicitly, provided that the initial training set was This section provides a very brief representative enough. This is accomplished introduction to Machine Learning. The reader by interpolation, that is estimating an unseen familiar with such techniques may safely skip prediction point between two already seen test this section. Machine Learning (ML) is a field points and extrapolation, estimation of the of study of Artificial Intelligence that point beyond the observed training set. provides computers with learning techniques without being explicitly programmed. Traditional programming requires manually coding an algorithm and combining it with input data to produce the resulting output data. This model is fundamentally different in ML: a learning algorithm (e.g., a Neural Network) is trained with the input data and the corresponding output data, which is known as training set, to automatically produce the algorithm, which will be used to predict future results. But how do computers actually learn such an algorithm? Figure 1. An overview of Machine learning Systems can be always described by a mathematical function and in many cases such a function is multi-dimensional (i.e., has many In what follows, we provide a summary of input and output variables, maybe hundreds) the two most relevant techniques in Machine and is non-linear. If we have such a function, Learning: supervised and unsupervised we have a complete model of the system. For learning. instance, the function that describes a network may have the form: function (topology, load, SUPERVISED LEARNING configuration) = performance. That is, the function links the topology of the network, the In supervised learning, there is a given data traffic (load) and configuration (e.g., routing) set and the knowledge on what the correct with the resulting performance (e.g., delay). output looks like. The idea is that there is a Analytical modeling provides tools to obtain relationship between the input and the output. such a function (e.g., Queueing theory in networking). However, such techniques often The input data set is called training data use simplifications and cannot be applied and has a known label or result, such as when the network is too complex. cat/not-cat (for identifying cat photos) or a stock price at a time (for predicting stock Machine Learning provides a different way prices). Network examples could include to obtain such a function. As shown in Figure classifying traffic patterns for intrusion 1 below, a learning algorithm fed with a detection as harmful-traffic vs. normal-traffic, representative training set will fit the function, or predicting the usage of a certain network even if such function has hundreds of link based on various parameters. A model is variables or it is complex and non-linear. The prepared through a training process where it is algorithm produced by a Machine Learning required to make predictions and is corrected approach is the function itself and can also when those predictions are wrong. The 2016 Spring Technical Forum Proceedings training process continues until the model K-Nearest Neighbors Algorithm (kNN) achieves a desired level of accuracy on the training data. kNN makes predictions for a new data point by assigning it the value/label of its K Supervised learning problems are nearest neighbors. For regression this might categorized into "regression" and be the mean output variable, in classification "classification" problems. In a regression this might be the mode (or most common) problem, the goal is to predict results within a class value. continuous output, essentially trying to map the input variables to some continuous In kNN classification, an object is function. In a classification problem, the goal classified by a majority vote of its neighbors, is to predict results in a discrete output, trying with the object being assigned to the class to map the input variables into discrete output most common among its k nearest neighbors. categories. In kNN regression, the output value is the average of the values of its k nearest Example classification algorithms include neighbors. Logistic Regression and the Back Propagation Neural Network, SVMs, kNN, etc. The simplest technique is to use the Euclidean distance between neighbors. kNN Support Vector Machine (SVM) performs a calculation (or learns) when a prediction is needed, just in time. The training SVM is type of supervised Machine instances can be updated or curated over time Learning algorithm. Given a set of training to keep predictions accurate. examples, each marked as belonging to one of two classes, an SVM training algorithm builds UNSUPERVISED LEARNING a model that assigns new examples into one class or the other. A hyperplane splits the Unsupervised learning attacks problems input variable space. In SVM, a hyperplane is with little or no idea what the results should selected to best separate the points in the input look like. The idea is to derive structure from variable space by the class to which they data where there is no apriori knowledge of belong. The SVM learning algorithm finds the the effect of the variables. This structure is coefficients that results in the best separation derived by clustering the data based on of the classes by the hyperplane. New relationships among the variables in the data. examples are then mapped into that same With unsupervised learning there is no space and predicted to belong to a category feedback based on the prediction results. The based on which side of the gap they fall. input data set is not labelled and does not have a known result. The distance between the hyperplane and the closest data points is referred to as the A model is prepared by deducing margin. The best or optimal hyperplane that structures present in the input data. This may can separate the two classes is the line that has be to extract general rules. It may be through the largest margin. Only these points are a mathematical process to systematically relevant in defining the hyperplane and in the reduce redundancy, or it may be to organize construction of the classifier. These points are data by similarity. called the support vectors as they support or define the hyperplane.