The generalization ability of Artificial Neural Networks in

forecasting TCP/IP network traffic trends.

By

Vusumuzi Moyo

Supervisor

Dr K. Sibanda

A thesis submitted in fulfilment of the requirements of the Degree of Master of Science in Computer Science at the University of Fort Hare.

August 2014

Declaration

I hereby declare that “The generalization ability of Artificial Neural Networks in forecasting TCP/IP network traffic trends” is my original work and it has not been submitted before for any degree or examination at any other University. All sources I have used, consulted or quoted are duly indicated and acknowledged herein.

------August 2014


Publications

Part of the research work presented in this thesis has been published or has been submitted for publication or review in the following papers:

V. Moyo and K. Sibanda, The generalization ability of Artificial Neural Networks in forecasting TCP/IP network traffic trends. In Proceedings of the 2013 Southern African Telecommunications Networks and Applications Conference, Spier Wine Estate, Stellenbosch, Western Cape, South Africa.

V. Moyo and K. Sibanda, On the optimal Artificial Neural Network architecture for forecasting TCP/IP network traffic trends. In Proceedings of the 2014 Southern African Telecommunications Networks and Applications Conference, Port Elizabeth, Eastern Cape, South Africa.

V. Moyo and K. Sibanda, On the optimal learning rate size for the generalization ability of Artificial Neural Networks in forecasting TCP/IP traffic trends [Accepted to be published in International Journal of Computer Applications].

V. Moyo and K. Sibanda, The learning rate and momentum: How much does each matter in forecasting TCP/IP network traffic trends? [Accepted to be published in International Journal of Computer Science Applications].

V. Moyo and K. Sibanda, The effects of the training set size on the generalization ability of Artificial Neural Networks in forecasting TCP/IP network traffic trends [Submitted for review to the International Journal of Computers Communications and Control].

V. Moyo and K. Sibanda, The generalization ability and convergence of Artificial Neural Networks in forecasting TCP/IP network traffic trends: Which algorithm is most effective? [Submitted for review to the South African Computer Journal].


Dedication

To my parents, with utmost gratitude


Acknowledgements

First and foremost I would like to thank the Almighty, whose invisible hand continues to lead me through life. I owe special gratitude to the research supervisor Dr K. Sibanda for insight, guidance, support and constructive criticism during the course of this research. He continually and realistically conveyed a spirit of adventure about research, and an excitement regarding teaching. Without his guidance and persistent help, this thesis would not have been possible. A special thank you to the staff of the Department of Computer Science at UFH, to the H.O.D Mr. S. Scott for his continual jokes and support, Professor M. Thinyane for his advice and concern throughout this research, Ms N. Moorosi for her valuable suggestions in conducting the research, Mr S. Ngwenya and many others I did not mention. To the Masters’ class of 2014, thank you guys for making it worthwhile, life would never have been the same without you guys. To Caro, Munya and Pride, thank you for the priceless moments. I would like to thank my parents Mr. and Mrs. K. Moyo for being there and believing in me in all things. I am eternally grateful to you mum and dad. I thank my entire family, to my sister Mamo, thank you for your continued support. Gracious, words can never express how I love you guys. This work is based on the research undertaken within the Telkom CoE in ICTD supported in part by Telkom SA, Tellabs, Saab Grintek Technologies, Easttel, Khula Holdings, THRIP and National Research Foundation of South Africa (UID: 90434). The opinions, findings and conclusions or recommendations expressed here are those of the authors and none of the above sponsors accepts any liability whatsoever in this regard.


Abstract

Artificial Neural Networks (ANNs) have been used in many fields for a variety of applications and have proved to be reliable. They have proved to be one of the most powerful tools in the domain of forecasting and analysis of various time series. The forecasting of TCP/IP network traffic is an important issue receiving growing attention from the computer networking community. By improving upon this task, efficient network traffic engineering and anomaly detection tools can be created, resulting in economic gains from better resource management. The use of ANNs requires some critical decisions on the part of the user. These decisions, which are mainly concerned with the determination of the components of the network structure and the parameters defined for the learning algorithm, can significantly affect the ability of the ANN to generalize, i.e. to have the outputs of the ANN approximate target values given inputs that are not in the training set. This in turn affects the quality of the forecasts produced by the ANN. Although there are some discussions in the literature regarding the issues that affect network generalization ability, there is no standard method or approach that is universally accepted for determining the optimum values of these parameters for a particular problem. This research examined the impact a selection of key design features has on the generalization ability of ANNs. We examined how the size and composition of the network architecture, the size of the training samples, the choice of learning algorithm, the training schedule and the size of the learning rate, both individually and collectively, affect the ability of an ANN to learn the training data and to generalize well to novel data. To investigate this matter, we empirically conducted several experiments in forecasting a real world TCP/IP network traffic time series, and the network performance was validated using an independent test set. MATLAB version 7.4.0.287's Neural Network toolbox version 5.0.2 (R2007a) was used for our experiments. The results are promising in terms of ease of design and use of ANNs. Our results indicate that, in contrast to Occam's razor principle, for a single hidden layer an increase in the number of hidden neurons produces a corresponding increase in the generalization ability of ANNs; however, larger networks do not always improve the generalization ability of ANNs, even though an increase in the number of hidden neurons brings a concomitant rise in generalization. Also, contradicting commonly accepted guidelines, networks trained with a larger representation of the data exhibit better generalization than networks trained on smaller representations, even though the larger networks have a significantly

greater capacity. Furthermore, the results obtained indicate that the learning rate, momentum, training schedule and choice of learning algorithm have an equally significant effect on ANN generalization ability. A number of conclusions were drawn from the results and later used to generate a comprehensive set of guidelines that will facilitate the process of design and use of ANNs in TCP/IP network traffic forecasting. The main contribution of this research lies in the identification of optimal strategies for the use of ANNs in forecasting TCP/IP network traffic trends. Although the information obtained from the tests carried out in this research is specific to the problem considered, it provides users of back-propagation networks with a valuable guide on the behaviour of networks under a wide range of operating conditions. It is important to note that the guidelines accrued from this research are of an assistive and not necessarily restrictive nature to potential ANN modellers.


Table of Contents

Dedication ...... iii
Table of Figures ...... x
Table of Tables ...... xii
List of Acronyms ...... xiii
Chapter 1: Introduction ...... 1
1.1 Research Background ...... 1
1.2 Statement of problem ...... 5
1.3 Research Goals ...... 5
1.4 Contribution of Study ...... 6
1.5 Scope of the research ...... 7
1.6 Overview of the Dissertation ...... 8
1.7 Conclusion ...... 9
Chapter 2: Artificial Neural Networks and applications in Time Series Forecasting of TCP/IP network traffic ...... 10
2.1 General overview of Artificial Neural Networks ...... 10
2.1.1 Biological inspiration ...... 10
2.1.2 What are ANNs? ...... 11
2.2 Classes of ANN applications ...... 14
2.3 The strengths and weaknesses of ANNs ...... 15
2.4 The architectures of ANNs ...... 15
2.4.1 Feedforward Neural Networks ...... 16
2.5 Learning processes ...... 22
2.5.1 Learning paradigms ...... 23
2.5.2 Modes of learning ...... 24
2.5.3 Learning algorithm ...... 25
2.6 Time series forecasting ...... 30
2.6.1 TCP/IP network forecasting using ANNs ...... 30
2.6.2 Sliding Time Window Forecasting Model ...... 31
2.7 Conclusion ...... 32
Chapter 3: Generalization of Artificial Neural Networks ...... 33
3.1 Generalization overview ...... 33


3.2 Bias, variance dilemma and generalization errors ...... 35
3.3 Generalization: architecture and training set size ...... 36
3.3.1 Size of network architecture ...... 37
3.3.2 Training set size ...... 45
3.3.3 Summary ...... 51
3.4 Generalization: Other factors ...... 52
3.4.1 Learning rate ...... 52
3.4.2 Learning algorithm ...... 56
3.4.3 Training schedule ...... 58
3.5 Conclusion ...... 62
Chapter 4: Methodology and Data Techniques ...... 64
4.1 The case study ...... 64
4.2 The software ...... 65
4.3 Data collection ...... 66
4.4 Data pre-processing ...... 67
4.4.1 Linear interpolation ...... 68
4.4.2 Identification of periodic components ...... 69
4.4.3 Network inputs and targets ...... 69
4.4.4 Normalization ...... 72
4.5 Artificial Neural Network training program ...... 74
4.5.1 Experimental strategies and procedures ...... 76
4.6 Performance evaluation ...... 79
4.7 Conclusion ...... 81
Chapter 5: Results and Discussion ...... 83
5.1 The Results ...... 84
5.1.1 Training scenario 1: Network architecture ...... 84
5.1.2 Training scenario 2: Training set size ...... 90
5.1.3 Training scenario 3: Learning algorithm ...... 93
5.1.4 Training scenario 4: Learning rate ...... 99
5.1.5 Training scenario 5: Training schedule ...... 106
5.2 Discussion of results ...... 108
5.2.1 Size of network architecture ...... 108


5.2.2 Training set size ...... 113
5.2.3 Learning algorithm ...... 116
5.2.4 Learning rate ...... 118
5.2.5 Training schedule ...... 121
5.3 Conclusion ...... 123
Chapter 6: Conclusions and Future work ...... 125
6.1 Specific conclusions ...... 125
6.1.1 First objective of the research ...... 125
6.1.2 Second objective of the research ...... 126
6.1.3 Third objective of the research ...... 127
6.1.4 Fourth objective of the research ...... 129
6.2 Guidelines for effective use of ANNs in forecasting TCP/IP network traffic trends ...... 130
6.3 Future work and recommendations ...... 133
6.4 Overall conclusions ...... 134
References ...... 136
Appendix A. Multilayer perceptron program ...... 151
Appendix B. Multi Router Grapher software ...... 160
Appendix C. Snapshot of time series ...... 161
Appendix D. Snapshot of Artificial Neural Network during training ...... 162
Appendix E. Snapshot of performance plot ...... 163
Appendix F. Results of the different sliding time windows ...... 164


Table of Figures

Figure 2.1: The Biological neuron (Control 2010) ...... 11
Figure 2.2: An Artificial Neuron (Svozil et al. 1997) ...... 13
Figure 2.3: Single layer perceptron (Badri 2010) ...... 16
Figure 2.4: Multilayer perceptron (Cortez et al. 2006) ...... 18
Figure 2.5: Classification to illustrate linear separability (Tsilo 2008) ...... 19
Figure 2.6: The XOR problem (Tsilo 2008) ...... 19
Figure 2.7: The Logistic sigmoid (Haykin 1999) ...... 21
Figure 2.8: The Linear activation function (Haykin 1999) ...... 21
Figure 2.9: Unsupervised learning (Tsai & Lee 2011) ...... 23
Figure 2.10: Supervised learning (Tsai & Lee 2011) ...... 24
Figure 2.11: Backpropagation (Eberhart et al. 1990) ...... 28
Figure 2.12: minimum attained (Eberhart et al. 1990) ...... 29
Figure 2.13: A FFNN with time window as a non-linear model (Moustafa et al. 2011) ...... 31
Figure 3.1: (a) Properly fitted data (good generalization). (b) Overfitted data (poor generalization) (Fearn 2004) ...... 34
Figure 3.2: The optimal trade-off between bias and variance (Renals et al. 1992) ...... 35
Figure 4.1: Training program for an ANN flowchart (Mi et al. 2005) ...... 75
Figure 5.1: One hidden layer: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 85
Figure 5.2: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 87
Figure 5.3: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 88
Figure 5.4: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 89
Figure 5.5: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 90
Figure 5.6: Results for different training set sizes for 60HN ANN: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 91
Figure 5.7: Results for different training set sizes for 5:35HN ANN: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 92
Figure 5.8: Results for different heuristics of training set size on 60HN ANN: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 93
Figure 5.9: Generalization errors (RMSE) for the different BP algorithms ...... 94
Figure 5.10: Results of traincgp algorithm ...... 96
Figure 5.11: Results of trainrp algorithm ...... 97
Figure 5.12: Results of traincgf algorithm ...... 97
Figure 5.13: Results of trainscg algorithm ...... 98
Figure 5.14: Results of traingdm algorithm ...... 98
Figure 5.15: Generalization errors (RMSE) for the different learning rates at various network architectures ...... 99


Figure 5.16: Generalization errors (RMSE) for the different learning rates at various training iterations ...... 101
Figure 5.17: Generalization errors (RMSE) for different learning rates at various training iterations on (5:35) HN ANN ...... 103
Figure 5.18: Generalization errors (RMSE) for different learning rates and momentum combinations at various training iterations on (5:35) HN ANN ...... 104
Figure 5.19: Results for different training iterations on a 60HN network: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 107
Figure 5.20: Results for different training iterations on a (5:35) HN network: Generalization errors in RMSE (top); R and for train and test sets (bottom) ...... 107


Table of Tables

Table 3.1: Results from experiments conducted by Gorman & Sejnowski (1998) ...... 39
Table 3.2: Results on generalization ability of Phoneme recognition ANNs (Morgan & Bourlard 1989) ...... 41
Table 3.3: Heuristics for selecting the optimal number of hidden neurons in ANNs for optimal generalization ability ...... 43
Table 3.4: Heuristics on the selection of optimal training set size ...... 48
Table 3.5: Heuristics for optimum learning rate and momentum term ...... 55
Table 4.1: Annotations ...... 67
Table 5.1: R and results for the different BP algorithms for train and test sets ...... 95
Table 5.2: R and on train and test sets for different learning rates at various network architectures ...... 100
Table 5.3: R and on train and test sets for different learning rates at various training iterations ...... 102
Table 5.4: R and on train and test sets for different learning rates at various training iterations on (5:35) HN ANN ...... 103
Table 5.5: R and on train and test sets for different learning rates and momentum combinations at various training iterations on (5:35) HN ANN ...... 105


List of Acronyms

ANN Artificial Neural Networks

ARIMA Auto-Regressive Integrated Moving Average

BP Backpropagation

CSV Comma separated values

DoS Denial of Service

FFNN Feedforward Neural Network

GUI Graphical User Interface

IP Internet Protocol

ISP Internet Service Provider

Lr Learning rate

Mc Momentum

MLP Multilayer perceptron

MRTG Multi Router Grapher

MSE Mean Square Error

NN Neural Network

NNE Neural Network Ensemble

QoS Quality of Service

RMSE Root Mean Square Error

SLP Single layer Perceptron


STWNN Sliding Time Window Neural Network

TCP Transmission Control Protocol

TENET South African Tertiary Institution Network

UFH University of Fort Hare

VC Vapnik Chervonenkis


Chapter 1: Introduction

This chapter serves as an introduction to the research and to the chapters that follow. A comprehensive background to the research is given. Thereafter, the problem statement is provided, followed by the research goals that guide the investigation. This is followed by a brief section that presents and clarifies the scope of the research, detailing the methodology used for investigating the research problem highlighted. The final section is devoted to the description of the thesis structure and the conclusion of the chapter.

1.1 Research Background

Time Series Forecasting (TSF) deals with the prediction of a chronologically ordered variable (Cortez et al. 2006). As more applications vital to today's society migrate to TCP/IP networks, it is essential to develop techniques that better understand and predict the behaviour of these systems. TCP/IP network traffic forecasting is vital for the day-to-day running of large- and medium-scale organizations (Cortez et al. 2010). By improving upon this task, network providers can optimize resources (e.g. adaptive congestion control and proactive network management), allowing an overall better Quality of Service (QoS). TCP/IP forecasting also helps to detect anomalies in the network. Security attacks like Denial-of-Service (DoS) or even an irregular amount of SPAM can be detected by comparing the real traffic with the values predicted by forecasting algorithms, resulting in economic gains from better resource management. Two of the most important discoveries about the statistics of Internet traffic over the last ten years are that it exhibits self-similarity and non-linear behaviour (Chabaa 2010). This nature of network traffic makes highly accurate prediction difficult. Several TSF methods have been proposed, such as Auto-Regressive Integrated Moving Average (ARIMA) models, the Box-Jenkins model, Holt-Winters and Artificial Neural Networks (ANNs). The literature indicates that, unlike other methods, ANNs can approximate almost any function regardless of its degree of nonlinearity and without prior knowledge of its functional form (Klevecka 2009; Zhang & Kline 2007). This positions them as good candidates for modelling non-linear and self-similar time series such as TCP/IP network traffic. Currently, there are several journals devoted to ANN research and a considerable number of textbooks published illustrating their applications in a diversity of fields.


One of the earliest studies discussing the use of ANN techniques for forecasting TCP/IP network traffic is reported by Jiang and Apavasillou (2004), following which a number of researchers have adopted different ANN models for forecasting and modelling network traffic (Wang et al. 2008; Allacorn-Quiono & Arria 2006). One typical example is Cortez et al. (2010), where a Neural Network Ensemble (NNE) for the prediction of TCP/IP traffic from a TSF point of view was presented. The NNE approach was compared with other TSF methods (e.g. Holt-Winters and ARIMA) and was found to compete favourably with them. Although the use of ANNs is widespread and well documented, and despite their promising prospects, ANNs suffer from several deficiencies, mostly related to problems in the selection of an optimal network architecture and learning parameters for good generalization ability (Uwahuro 2008; Halenár & Libošvárová 2012). These deficiencies restrict their general acceptability, particularly in the forecasting community (Baum & Haussler 1989).

Work done by Zhani and Elbiaze (2009) in comparing the performance of ANNs to other methods in forecasting TCP/IP traffic showed that whilst ANNs were the best at making short-term forecasts, they fared poorly in long-term forecasts and were computationally expensive to set up compared to the other methods. Equally, some authors such as Conway (1998) have argued that although ANNs greatly outperform multiple linear regression models in prediction tasks, their inability to generalize well means they cannot compete with algorithmic or other sophisticated models. Consequently, several techniques for improving generalization have been studied, such as fuzzification of the input vector, regularization, result feedback and Optimal Brain Surgeon (Canuto 2001). Although these methods are reported to improve the generalization ability of ANNs to some extent, the problem of ANN generalization is still not solved, or at best only partially solved (Walleed et al. 2009; Baum & Haussler 1989; Neeharika 1996). Furthermore, issues that have an effect on the generalization ability of ANNs, such as the appropriate choice of features for input to the network, the training methodology, and the best network topology, have all been identified, but completely satisfactory solutions have not been offered for any of these problems (Richards 1991; Maier & Dandy 1998; Kavzoglu 1999).


While most researchers find that ANNs can outperform linear methods under a variety of situations, the conclusions are not consistent. In any case, most of these studies have been conducted on synthetic datasets (e.g. the Mackey-Glass time series), making the solutions thereof difficult to apply to real world problems. This clearly leaves the generalization ability of ANNs an open research area. The main aim of this research is to investigate the effects of the size of the network architecture, the size of the training set, the size of the learning rate, the choice of learning algorithm and the schedule of training on the generalization ability of ANNs in forecasting a real world TCP/IP network traffic time series.

Reports on the relationship between network architecture and ANN generalization ability indicate that whilst the input and output layers are important for learning, it is the specification of the size of the hidden layer(s) that is critical for the ANN's capabilities of learning and generalization (Wijayasekara et al. 2011; Teixeira & Fernandes 2012). Despite the fact that the effects of employing too small or too large network architectures are known in a general sense, the exact nature of the impact of the size of the hidden layers on ANN generalization has not been fully investigated. Although several heuristics have been proposed, none is universally accepted for estimating the optimal number of hidden layers and/or neurons for a particular problem. Consideration of previous research results also indicates that the size of the training set is crucial insofar as the generalization ability of ANNs is concerned (Chakraborty & Pal 1997). Whilst it is important for the ANN to have a sufficient architecture, this same ANN cannot be expected to generalize well from those examples which it does not memorize unless they contain sufficient information regarding the similarities which must be extracted in order for good generalization to occur (Richards 1991). An appropriate number of training samples has to be determined to ensure a correct presentation of the forecasting problem to the ANN. As to the exact size of the training set that is required for an ANN to generalize well to new examples, we are uncertain. The literature indicates that whilst some authors are in favour of smaller training sets, others prefer the complete opposite, and to further add to the quagmire, some suggest it is largely dependent on the network architecture and the level of complexity introduced by the problem (Chakraborty & Pal 1997; Lange & Manner 1994; Karelson et al. 2006). To date, it is clear that the training set is crucial for any successful ANN implementation; however, the exact number of training samples

required to ensure good generalization ability of ANNs for a particular problem remains largely subjective.

Research also indicates that the size of the learning rate, the training schedule and the choice of learning algorithm are frequently under-emphasized factors in the pantheon of factors that affect ANN generalization ability (Wilson & Martinez 2001; Mahmoud et al. 2007; Kiran et al. 2010). Defining an optimum size of the learning rate, selecting the best learning algorithm and choosing an appropriate training schedule are among the biggest difficulties encountered in the design of ANNs. Foody et al. (1996) describe these three parameters as factors that the analyst may have control over and that strongly influence ANN performance, especially in terms of speed and accuracy. They also state that a major limitation associated with ANNs is that they are semantically poor. On the other hand, Richards (1991) describes them as factors which have a profound influence on the generalization ability of ANNs. Although individual studies have been conducted and some heuristics provided for the selection of these parameters, none has had the theoretical rigour to reveal optimal or at least near-optimal solutions. In any case, most of these studies have been conducted on synthetic datasets (e.g. the Mackey-Glass time series), making the solutions thereof difficult to apply to real world problems. In fact, until a number of experiments have been done, it is unknown which values of the aforementioned parameters will provide optimum solutions. Hence new users of ANNs usually blindly employ trial-and-error strategies to determine the optimal values for these parameters without any substantive guidelines. This adds more time to the already slow process of training an ANN. To date, we are not aware of any research study that has attempted to consider the influence these and other factors have on the generalization ability of ANNs in the context of forecasting TCP/IP network traffic trends. This has created a huge gap as far as the successful design and implementation of ANNs in this domain is concerned. Motivated by this fact, the present study was conducted in order to gain some insights into the behaviour of ANNs under different operating conditions, thus facilitating the steps of ANN design and parameter setting. It is hoped that this research will help to disprove, to some extent, the statement that ANNs are semantically poor.


1.2 Statement of problem

Generalization in ANNs is the acquisition of connection strengths which reflect similarities extracted from the representations used in training the network (Kavzoglu 1999). Several factors have been reported in the literature as having an influence on the ability of ANNs to generalize (Chakraborty & Pal 1997; Lange & Manner 1994). These are: the size of the network architecture, the size of the training set, the size of the learning rate, the learning algorithm and the training schedule. As yet we do not understand the exact conditions an ANN requires in order to do a good job of extracting similarities from the representations used in training the network. Research has shown that in forecasting TCP/IP network traffic trends, given an appropriate architecture, sufficient training data, a reasonably sized learning rate, a suitable learning algorithm, and an appropriate training schedule, an ANN can generally find a function that will memorize the training set; however, what remains unknown is whether that function will permit the ANN to generalize well to data it has never seen before (Richards 1991; Fearn 2004).

1.3 Research Goals

ANNs represent a field that is rooted in many disciplines. They have been used in many fields to solve a number of problems by virtue of an important property: the ability to learn from input data with or without a teacher. For this research it is essential that a detailed overview of the many facets of ANN learning is undertaken so as to better comprehend the structure and functionality of ANNs, in particular the Multilayer Perceptron (MLP), which has found significance in many real world applications of ANNs.

Therefore, the first objective of the research is:

 To determine the state of the art in TSF by reviewing ANNs.

Currently there exist many methods of TSF. Each of these has its relative strengths and weaknesses. It is essential that a thorough investigation is done of the state of the art in TSF, highlighting the current methods of TSF and the relative limitations and advantages of ANNs over conventional statistical methods in TSF.

Therefore, the second objective of the research is:

 To determine the current methods of TSF.


The third objective of the research is:

 To investigate how the size of the network architecture, size of training set and the size of the learning rate both individually and collectively affect the generalization ability of ANNs.

The fourth objective of the research is:

 To determine how much influence the learning algorithm and the training schedule both individually and collectively have on the generalization ability of ANNs.

1.4 Contribution of Study

In situations where ANN modelling approaches are adopted, modellers are faced with a number of alternatives at each stage of the model development process. It is certainly one of the user's expectations that ANNs should be easy to use by means of understanding each design step taken. These steps include the choice of learning algorithm, the sizes of the training set, architecture and learning rate, as well as the appropriate training schedule. It is essential that the options available to ANN modellers at each of these steps are discussed and the issues that should be considered highlighted. Some guidelines with sound foundations are required, describing the possible effects of various configurations of ANN geometry and parameter settings. Therefore the major contributions of this research are:

 To provide sound and practical guidelines on the efficient design of ANNs to new and novice users. Through empirically conducted experiments we have explored the effects of certain ANN design parameters on the generalization ability of ANNs. The results of all the experiments are discussed in detail and used to form the basis of further work, directed at continuing to refine the processes involved during the design of ANN models. It is common knowledge that a well-generalized ANN avoids overfitting the data and yields models with better predictability (Staufer & Fischer 1997). Several general guidelines have been established, and these are presented. They will provide the new or inexperienced user with clear instructions on the design of ANNs, particularly in the context of forecasting TCP/IP network traffic.

 To contribute theoretically to the school of knowledge on TSF, in particular that of TCP/IP network traffic engineering.


1.5 Scope of the research

To achieve the research goals mentioned in Section 1.3, the following methodology is employed:

 A technical study of ANNs, particularly in the domain of TSF, is essential, accompanied by wide research on case studies pertaining to ANN generalization ability. The literature and case studies are reviewed to provide the background knowledge necessary for understanding, developing and investigating the process of designing efficient ANNs, particularly in the domain of TSF.

 The main aim of the research is the investigation of the question: how do the size of the network architecture, the size of the training set, the size of the learning rate, the choice of learning algorithm and the training schedule, both individually and collectively, affect the ability of an ANN not just to learn the training data but to generalize well to previously unseen data? The case study under consideration is the forecasting of a real world TCP/IP network traffic time series. To answer this question, several simulation experiments are conducted whereby each of the aforementioned factors is investigated by varying certain ANN parameters in a controlled manner. MATLAB programs (codes) for conducting these experiments are written, and the Neural Network toolbox version 5.0.2 (R2007a) is used for this task.

 Preparation and pre-processing of the data is done in steps. Firstly, linear interpolation is done to fill in missing values or observations; secondly, identification of seasonal components is done to identify seasonality and outliers in the data; thirdly, the ANN inputs and targets are computed; fourthly, normalization of the data is done to scale the ANN inputs and targets within the range of the activation function; and finally the time series is partitioned into train and test sets.

 The primary evaluation criterion is the Root Mean Square Error (RMSE). To assess the generalization ability of the ANNs, the RMSE on the independent test set is computed. The RMSE is given by the formula:

$$\mathrm{RMSE} = \sqrt{\frac{1}{NM}\sum_{p=1}^{N}\sum_{k=1}^{M}\left(t_{pk} - y_{pk}\right)^{2}} \qquad (1.1)$$

where $N$ is the total number of training patterns or observations, $M$ is the number of training outputs, $t_{pk}$ is the target output of the $k$-th output unit for the $p$-th pattern, and $y_{pk}$ is the actual output generated by the $k$-th output unit for the $p$-th pattern. An ANN with good generalization ability should consistently show low RMSE values on the test set.
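As a concrete illustration, the following minimal MATLAB sketch shows how the test-set RMSE of Equation 1.1 could be computed; the variable names and values are hypothetical and are not taken from the programs in Appendix A.

% Hypothetical target outputs T and ANN outputs Y (M outputs x N test patterns).
T = [0.42 0.51 0.48 0.60];                  % target outputs
Y = [0.40 0.55 0.47 0.58];                  % network outputs on the test set
[M, N] = size(T);                           % M outputs, N patterns
rmse = sqrt(sum(sum((T - Y).^2)) / (N*M));  % Equation 1.1
fprintf('Test-set RMSE = %.4f\n', rmse);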

1.6 Overview of the Dissertation

The thesis consists of six chapters, including this introductory chapter, in order to achieve the aims and objectives defined above. Chapter 1 introduces the research by presenting the background to the study and briefly reviewing related work in relevant areas. It provides the underlying reasons for the research, presents the research problem and the approach used to reach the corresponding solutions. Chapter 2 gives a detailed literature review of ANNs, in particular the Multilayer Perceptron (MLP). The chapter gives the reader insight into the theoretical background of MLPs and their application in the domain of TSF. The chapter also reviews past work done on TSF using ANNs and presents theory on some important and relevant aspects of ANN learning. A special section is devoted to the problems encountered in the use of ANNs in TSF. These problems are the main concern of later chapters dealing with the analysis of the effects of certain ANN design parameters. Chapter 3 concentrates on the problems encountered in the design and use of ANNs which this research has identified. Whilst the design of ANNs is mainly related to the size of the network architecture, the effective use of ANNs is associated with the selection of appropriate values for the learning parameters such as the learning rate, learning algorithm, training set size and training schedule. After providing an extensive literature review on these issues, discussions and comparisons are carried out of the heuristics (or rules of thumb) recommended by different researchers for the selection of these parameters, pointing out certain flaws and strengths of each case study. Chapter 4 presents the methodology and data techniques utilised in this research. Chapter 5 describes the results and presents the evaluations of the research. The goals of the experiments are stated and described, and the results are analysed. It discusses the performance of the ANNs with experiment configurations done under varying training parameter scenarios. Chapter 6 presents the conclusions drawn from the research. It specifically comprises a number of guidelines for the use and design of ANNs extracted from the results produced and experience

gained during the research. Finally, this chapter gives suggestions for further research that could be conducted on this specific research topic. In addition to the chapters outlined above, Appendix A gives some of the programs (M-Files) used in the research. A snapshot of the Multi Router Traffic Grapher (MRTG) software, which monitors traffic load on network links from various institutions in South Africa, is available in Appendix B. A plot of the TCP/IP network traffic time series is available in Appendix C. Available in Appendix D is a snapshot of one of the ANN models during one of the many training sessions in the Neural Network toolbox version 5.0.2 (R2007a). A snapshot of a performance plot curve of one of the ANNs is available in Appendix E. Graphs depicting the results of the different sizes of the sliding time windows are shown in Appendix F.

1.7 Conclusion

This chapter introduced the research study by presenting background to the study, stating the research goals and addressing the methodology followed in accomplishing the research work. The chapter highlighted the contribution of the research and concluded with the structure of the thesis. An extensive literature review of ANNs and their applications in the domain of time series forecasting follows in the next chapter.


Chapter 2: Artificial Neural Networks and applications in Time Series Forecasting of TCP/IP network traffic

This chapter provides an extensive review of ANNs. It focuses on a particular structure of ANNs, the MLP, which is the most popular and widely-used paradigm in many applications of ANNs, including forecasting. This chapter provides a review of the literature concerning the areas of MLP analysis, learning and the Backpropagation (BP) algorithm, as well as the Logistic sigmoid and Linear activation functions. It also discusses an important theorem in ANNs: the Universal approximation property. The chapter concludes by briefly highlighting important aspects of the application of ANNs in forecasting TCP/IP network traffic. The intent is to give the general background of each topic area, and to highlight any aspects of these subjects applicable to this research.

2.1 General overview of Artificial Neural Networks

2.1.1 Biological inspiration

The puzzle of the human brain is as old as human history itself. Making machines that mimic the behaviour of the human brain has been a dream for centuries. The study of ANNs has been inspired in part by the observation that biological learning systems are built of very complex webs of interconnected neurons (Shavlik et al. 1991). In a rough analogy, ANNs are built out of a densely interconnected set of simple neurons, where each neuron takes a number of real-valued inputs (possibly the outputs of other neurons) and produces a single real-valued output, which may become input to other neurons (Batbayar 2008). To develop a feel for this analogy, let us consider a few facts from neurobiology. The neuron is the basic unit of the brain that processes and transmits information. A neuron has the following structural components: synapses, dendrites, cell body, and axon (Lange et al. 1997). The biological neuron receives inputs along the dendrites and adds them up (‘summates’ them). The dendrites are connected to the outputs of other neurons via special junctions known as synapses. The synapses alter the effectiveness with which the signal is transmitted. The signals are accumulated in the cell body and when their quantity reaches a certain level (usually referred to as the threshold), the new signal is released and carried to other neurons through the axons (Balestrassi et al. 2009).


One motivation for ANN systems is to capture this kind of highly parallel computation based on distributed representations. While ANNs are loosely motivated by biological neural systems, there are many complexities to biological neural systems that are not modelled by ANNs, and many features of ANNs are known to be inconsistent with biological systems (Lange et al. 1997).

Figure 2.1: The Biological neuron (Control 2010).

2.1.2 What are ANNs?

ANNs have attracted increasing attention from researchers in many fields during the last decade. There have been several attempts to mimic the human brain, dating back to the early 1930s, 1940s and 1950s and the work of Alan Turing, Warren McCulloch, Walter Pitts, Donald Hebb and John von Neumann. The first ANN model was produced in 1943 by Warren McCulloch and Walter Pitts, following which a great deal of research has been devoted to ANNs (Kiran et al. 2010). ANNs go by many names, such as Neural Networks, Parallel distributed processing models, Connectionist models, Adaptive systems, Self-organizing systems, Neurocomputing, or Neuromorphic systems (Barabas et al. 2011). Each name may mean something slightly different; in this work we use the term Artificial Neural Network, or simply ANN.


A number of attempts have been made by various authors to define an ANN. Zhang (2001) defines ANNs as "physical systems which can acquire, store and utilize experimental knowledge". Hartono and Hashimoto (2007) define ANNs as "a massively parallel distributed processor that has a natural propensity for storing experimental knowledge and making it available for use". Haykin (1999) defines neural computing as "the study of adaptable nodes which, through a process of learning from task examples, store experimental knowledge and make it available for use". One can conclude from these definitions that an ANN:

 consists of several simple processing elements called neurons;
 is well suited for parallel computations, with each neuron operating independently of other neurons;
 contains a high degree of interconnections between neurons;
 contains links between neurons, each with a weight (scalar value) associated with it;
 has adaptable weights that can be modified during training;
 can be trained to do a specific task, i.e. produce desired output for given input.

2.1.2.1 A model of an Artificial Neuron

A neuron is an information processing element that is fundamental to the operation of an ANN. A neuron consists of three basic elements (Haykin 1999):

 A set of synapses or connecting links, each with its own strength or weight. Specifically, a signal $x_j$ at the input of synapse $j$ connected to neuron $k$ is multiplied by the synaptic weight $w_{kj}$. The first subscript refers to the neuron in question and the second subscript refers to the input end of the synapse to which the weight refers.
 An accumulator Σ that computes the weighted sum of the signals.
 An activation function for limiting the amplitude of the output of a neuron. The activation function is also referred to by Haykin (1999) as a squashing function in that it squashes (limits) the permissible amplitude range of the output signal to some finite value. Typically, the normalized amplitude range of the output of a neuron is written as the closed unit interval [0, 1] or alternatively [-1, 1].


Figure 2.2: An Artificial Neuron (Svozil et al. 1997).

Figure 2.2 shows the basic structure of an artificial neuron. The inputs are denoted by $x_1, x_2, \dots, x_n$ and the weights are denoted by $w_{k1}, w_{k2}, \dots, w_{kn}$. The inputs are sent through the connections to the accumulator, and the connections carry the weights which are applied to the inputs. These weights are applied by multiplying each input $x_j$ by its corresponding weight $w_{kj}$. The products are then added, so the information sent to the accumulator is of the form:

$$v_k = \sum_{j=1}^{n} w_{kj} x_j \qquad (2.1)$$

The accumulated sum of the inputs is compared to a value called the threshold, such that if the sum is greater than the threshold, then the output is 1, and if less, it is 0. The threshold adds a charge known as the bias $b_k$ to the sum of the signals. A bias is added to the hidden and output neurons only. The output produced is then given as:

$$y_k = f(v_k + b_k) \qquad (2.2)$$

where $f$ is the activation function and $y_k$ is the output signal.
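To make Equations 2.1 and 2.2 concrete, the following minimal MATLAB sketch evaluates a single artificial neuron with a logistic activation; the input, weight and bias values are purely illustrative assumptions.

x = [0.2; 0.7; 0.1];            % inputs x1..xn
w = [0.5; -0.3; 0.8];           % synaptic weights w_k1..w_kn
b = 0.1;                        % bias b_k
v = w' * x;                     % Equation 2.1: weighted sum of the inputs
y = 1 / (1 + exp(-(v + b)));    % Equation 2.2 with a logistic activation f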


2.2 Classes of ANN applications

ANNs have been successfully applied to a number of problems. We briefly review the different classes of ANN applications that can be found.

 Clustering: The objective of clustering is to group similar patterns into groups, or clusters. This is mainly achieved using the Euclidean distance between patterns (Kvale 2010). Clustering is done by the ANN through inspection of the data for regularities (Uwahuro 2008). It has been applied to document classification to enhance information retrieval (Kvale 2010).

 Control: There have been a number of successful applications to control systems. Applications of ANNs in this field range from process control, robotics and industrial manufacturing to aerospace applications and automobile control (Moustafa et al. 2011). The basic objective of control is to provide the appropriate input signal to a given physical process to yield its desired response. ANNs for control were developed by Wei et al. (2004) and Werbos (1989).

 Optimization: The objective of ANNs in optimization applications is to optimize certain cost functions (Haykin 1999). ANNs have been successfully applied to optimization problems such as job-shop scheduling. Problems that are simpler but which belong to the same group of optimization tasks include scheduling classrooms to classes, hospital patients to beds, etc. (Zurada 1994).

 Association: In association each training pattern is associated with an image stored in the ANN (Kvale 2010). Association can be subdivided into autoassociation and heteroassociation. In autoassociation an ANN is repeatedly presented with a set of patterns to be stored by the ANN. After training, a partial description of the original pattern is presented to the ANN; the task is then to retrieve the original pattern. In heteroassociation an arbitrary set of patterns is paired with another arbitrary set of patterns. After training, when a partial description of the original pattern of the first set is presented to the ANN, the task is to retrieve the pattern paired off with the original pattern (Hussain & Browse 1998). Applications include the 'Human Face Detection Network' of Rowley and Shumeet (1996) and the NETtalk ANN of Bartlett et al. (2002) that produced phonetic strings which specified pronunciation for English text.


 Approximation: Approximation requires an ANN to approximate a non-linear function or time series given a set of patterns in the form of input and desired (target) output pairs (Svozil et al. 1997). Once trained, the ANN is used to calculate an output for patterns not used in training (i.e. the ANN interpolates). An example of an application of approximation is TCP/IP network traffic forecasting, which is the case study for this research; a sketch of how a forecasting problem is cast as an approximation task follows this list.
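The MATLAB sketch below (with a hypothetical random series standing in for real traffic data) illustrates the idea referred to above: a sliding time window of past observations forms each input pattern and the next observation forms the desired output, turning forecasting into a function approximation task.

series = cumsum(randn(1, 200));       % hypothetical traffic time series
k = 5;                                % sliding time window size
n = length(series) - k;
P = zeros(k, n);                      % input patterns (k past values each)
T = zeros(1, n);                      % desired outputs (next value)
for i = 1:n
    P(:, i) = series(i:i+k-1)';       % window of the k most recent values
    T(i)    = series(i+k);            % the value to be forecast
end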

2.3 The strengths and weaknesses of ANNs

According to Gonzalez (2000) the power of ANN techniques rests in their unique advantages that may be listed as follows:

 High tolerance to noisy data.

 Ability to classify untrained patterns.

 Well-suited for continuous valued inputs and outputs.

 Successful on a wide array of real world data.

 Adaptive learning: Ability to adapt its weights to changing environments.

However, they are prone to limitations such as:

 Long training time.

 Require a number of parameters typically best determined empirically, e.g. network topology or structure.

 Poor interpretability: Difficult to interpret the symbolic meaning behind the learned weights of hidden neurons in the network.

 ANNs are not very good at performing symbolic computations. They cannot be used effectively for rule-based reasoning and arithmetic operations.

2.4 The architectures of ANNs

ANNs can be divided into Recurrent Neural Networks (RNN) and Feedforward Neural Networks (FFNN) (Zhang 2001).

RNN architecture employs feedback connections in order to learn temporal characteristics of data presented for learning. The feedback connections allow the network to produce complex

time varying outputs in response to simple static input (EngelBretch et al. 1995). RNNs exhibit properties very similar to short-term memory in human beings (Churchland & Sejnowski 2005). There are different types of RNNs, e.g. Jordan and Elman RNNs.

In FFNN architectures, data flows strictly from the input layer to the output layer. There is neither feedback connection in the whole network, nor a connection between neurons in a single layer (Zhou et al. 2006). A FFNN has no memory and the output is solely determined by the current input and weights values (Gers 2001). The output of each layer serves as input to the next layer. The absence of recurrences makes FFNNs much easier to train, but also less powerful (Zhou et al. 2006). A Feed-Forward structure makes things simpler because there is a simple flow of computation through the ANN (Gers 2001). This research concentrates on FFNNs, and studies network learning in FFNNs as well as problems associated with learning in FFNNs.

2.4.1 Feedforward Neural Networks

There are mainly two types of FFNNs: the Single Layer Perceptron (SLP) and the Multilayer Perceptron (MLP) (Haykin 1999).

2.4.1.1 SLP

In a SLP there are only input and output neurons (Sarle & Warren 1991). The input layer of neurons is not counted as a layer since no computation is performed in this layer. An example of a SLP network is the Perceptron Neural Network. These networks use step functions as their threshold activation functions (Zurada 1994). SLP networks are the simplest kind of ANNs, and can only solve linear or mildly non-linear problems (Sarle & Warren 1991).

Figure 2.3: Single layer perceptron (Badri 2010)


Badri (2010) sets out some of the limitations of SLP networks. He notes that the output values of a SLP are limited as the ANN can take only one of two values (true or false). He further suggests that since SLPs can only classify linearly separable sets of vectors, a straight line or plane can be drawn to separate the input vectors into their correct categories; and provided the input vectors are linearly separable the perceptron will find the solution. This is further supported by Balestrassi et al. (2009), who state, “When two classes are linearly separable, the perceptron training procedure will be guaranteed to converge to an answer, however, if the classes are not linearly separable, the procedure will not converge and will keep on cycling through the data forever.” We further explore the concept of linear separability in subsequent sections.

2.4.1.2 MLP

A MLP contains at least one hidden layer situated between the input and output layers (Zhang & Fukushige 2002). MLPs are trained by the standard Backpropagation (BP) algorithm. They are supervised networks, so they require a desired response to be trained (Abbas et al. 2013). MLPs contain a set of inputs connected to hidden layers whose function is to accept the input data samples and process them. There may be more than one hidden layer used to process the data to produce a network output similar to the desired output (Hagan & Menhaj 1994). In this type of ANN there are no links between neurons in the same layer, and thus no computational dependencies between neurons in the same layer (Churchland & Sejnowski 2005). This allows the outputs of these neurons to be computed in parallel. The neurons in the input layer receive signals from the environment and distribute the signals to the next layer in the ANN (Churchland & Sejnowski 2005). MLPs can model functions of almost any arbitrary complexity, with the number of layers, and the number of neurons in each layer, determining the function complexity (Cortez et al. 2006). The major difference between the SLP and the MLP is in their activation functions. The former uses only Threshold functions for activation, while the latter uses a combination of Sigmoid and/or Linear activation functions (Tsilo 2008). This research uses the MLP architecture in the implementation of the ANN models.


Figure 2.4: Multilayer perceptron (Cortez et al. 2006).
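For illustration, the following MATLAB sketch shows one way a small MLP of this kind could be created and trained with the R2007a-era Neural Network toolbox interface used in this research (newff, train and sim); the layer sizes, training parameters and random data are illustrative assumptions rather than the settings used in the experiments.

P = rand(5, 100);                       % 5 inputs x 100 training patterns (hypothetical)
T = rand(1, 100);                       % one target value per pattern (hypothetical)
net = newff(minmax(P), [10 1], {'logsig', 'purelin'}, 'traingdm');
net.trainParam.lr     = 0.1;            % learning rate
net.trainParam.mc     = 0.9;            % momentum
net.trainParam.epochs = 500;            % maximum training iterations
net = train(net, P, T);                 % supervised BP training
Y   = sim(net, P);                      % network outputs after training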

Tsilo (2008) states that MLPs have two properties which make them unique:

 Ability to classify linearly inseparable variables.
 Universal approximation property.

2.4.1.2.1 Ability to classify linearly inseparable problems

The SLP has wide applications for linearly separable problems. Linearly separable problems are cases in which there exists a hyperplane that separates the inputs into classes. The SLP finds weights and biases to achieve such a classification. An example of a linearly separable problem is illustrated in Figure 2.5.


Figure 2.5: Classification to illustrate linear separability (Tsilo 2008).

Consider a situation in which there are inputs belonging to classes A and B, as illustrated in Figure 2.5. Notice that these classes can be separated by a single straight line; they are known as linearly separable patterns. The processing neuron of a SLP is able to categorize a set of patterns into two classes, as the Threshold function defines their linear separability. An example of a problem that is not linearly separable, however, is the XOR problem (Figure 2.6).

Figure 2.6: The XOR problem (Tsilo 2008).

Figure 2.6 illustrates the XOR problem. Suppose there are inputs belonging to two classes, C1 and C2, as illustrated in Figure 2.6. In this example the vectors x = (0, 0) and x = (1, 1) belong to C1, while x = (0, 1) and x = (1, 0) belong to C2. A straight line w cannot in this case be drawn to separate the C1 points from the C2 points. This is an example of a linearly inseparable problem. A SLP cannot correctly classify such a problem. This is largely due to its use of the Threshold activation function. The use of this function removes information vital to the ANN. Tsilo (2008) states that a correct approach would be to modify the activation function to give details of the weights so that learning can take place. An example of such an activation function is the Logistic sigmoid, which is the activation function used in MLPs.
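As a small illustration of this inseparability, the plain MATLAB sketch below searches a coarse, hypothetical grid of weights and biases for a single threshold neuron and finds none that reproduces the XOR targets; it is only a demonstration over that grid, not a proof.

X = [0 0; 0 1; 1 0; 1 1];                             % the four XOR input patterns
t = [0; 1; 1; 0];                                     % XOR targets
found = false;
for w1 = -2:0.5:2
    for w2 = -2:0.5:2
        for b = -2:0.5:2
            y = double(X(:,1)*w1 + X(:,2)*w2 + b > 0);  % threshold neuron output
            if isequal(y, t)
                found = true;                         % a separating line was found
            end
        end
    end
end
fprintf('Threshold neuron reproduces XOR: %d\n', found);   % prints 0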

2.4.1.2.1.1 Activation functions

The firing of any neuron is governed by its activation function (Joy 2011). The activation functions used in ANNs are classified into Threshold, Linear and nonlinear activation functions. Loosely speaking, almost any function can qualify as an activation function (Chen & Miikkulainen 2001). In MLPs, hidden layers should have nonlinear activation functions, while the output layer can have either Linear or nonlinear activation functions (Uwahuro 2008). We review the Logistic sigmoid and the Linear activation functions, which have been widely used by researchers in the design of ANN models for forecasting.

2.4.1.2.1.1.1 The Logistic sigmoid activation function

Several activation functions are in use today; however, the Logistic sigmoid activation function is the one most favoured by a host of researchers (Kim 2005). This is largely attributed to its real-valued and continuously differentiable properties (Cardoso et al. 2011). The Logistic sigmoid function produces outputs in the range [0, 1] to emulate the brain. Within the brain, each nerve cell can respond to as many as 200,000 inputs, although 1,000 to 10,000 inputs are more typical. The inputs either 'fire' or 'do not fire', similar to a 0-1 variable (Nelson et al. 1999). The Logistic sigmoid function is defined by the following expression:

$$f(z) = \frac{1}{1 + e^{-\alpha z}} \qquad (2.3)$$

where $z$ is the net input of the units in the ANN and $\alpha$ is the gradient parameter of the Logistic sigmoid activation function.


Figure 2.7: The Logistic sigmoid activation function (Haykin 1999).

2.4.1.2.1.1.2 The Linear activation function

The Linear activation function is a common activation function for ANNs. It produces output values over the entire real number range. It is the most useful function for the output layer of a predictive ANN because it can output continuous target values (EngelBretch et al. 1995). The Linear activation function is given by the expression:

$$f(z) = z \qquad (2.4)$$

where $z$ is the net input of the units in the ANN.

Figure 2.8: The Linear activation function (Haykin 1999).
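The short MATLAB sketch below simply evaluates Equations 2.3 and 2.4 over a range of net inputs to make the contrast visible; the value of the gradient parameter alpha is an assumption for illustration.

z        = -5:0.1:5;                       % range of net inputs
alpha    = 1;                              % assumed gradient parameter
logistic = 1 ./ (1 + exp(-alpha .* z));    % Equation 2.3: outputs bounded in [0, 1]
linear   = z;                              % Equation 2.4: unbounded output
plot(z, logistic, z, linear);
legend('Logistic sigmoid', 'Linear');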

2.4.1.2.2 The Universal Approximation property of MLPs

Cybenko (1989) proved that a MLP with a single hidden layer containing a sufficient number of hidden neurons of the Logistic sigmoidal activation type, and a single linear output neuron, is capable of approximating any continuous function. He formulated a theorem called the Universal theorem of Approximation, which states:

THEOREM 2.1. Given any measurable function, there exists a feed-forward network with one hidden layer which can approximate that function to any desired accuracy.

Also, Zhang (2001) dealt with either one or two hidden layers in the multilayer network. With one hidden layer, he noted that the multilayer network can "implement most decision surfaces, and can closely approximate any decision surface". Regarding the three layer network which includes two hidden layers, Zhang (2001) lists its attributes as being able to "implement any separating decision surface when sufficient hidden units are represented in the two layers". Zhang (2001) commends the more capable network with two hidden layers, concluding that two hidden layers are sufficient and that "additional layers do not add any representational power to the discrimination".

2.5 Learning processes Among the many interesting properties of ANNs is the ability of the ANN to learn from its environment and to improve its performance through learning. An ANN learns about its environment through an iterative process of adjustments applied to its synaptic weights and thresholds. Zhang and Kline (2007) define learning in the context of ANNs as: "a process by which the free parameters of a neural network are adapted through a continuing process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place". From this definition one can conclude that the learning process implies the following sequence of events:

- the ANN is stimulated by an environment;
- the ANN undergoes changes as a result of this stimulation;
- the ANN responds in a new way to the environment because of the changes that have occurred in its internal structure.


2.5.1 Learning paradigms There are basically two learning paradigms: Supervised, and Unsupervised learning (Wei et al. 2004).

2.5.1.1 Unsupervised learning Unsupervised learning, also referred to as self-organization, requires no target or desired outputs and relies only upon local information during the entire learning process (Wei et al. 2004). The task of Unsupervised learning is to learn to group together patterns that are similar within a given training set (Wan et al. 2009). In this mode of learning, the ANN must discover for itself any possibly existing patterns, regularities, separating properties, etc. (Wan et al. 2009). While discovering these, the ANN undergoes changes to its parameters, which is referred to as self-organization. A major disadvantage of this type of learning is that error information cannot be used to improve ANN behaviour, since the desired response is not known (Tsai & Lee 2011). With no information available as to the correctness or incorrectness of responses, learning must somehow be accomplished based on observations of responses to inputs about which the ANN has little or no knowledge. Examples of unsupervised learning are Kohonen's self-organizing feature maps and Hebbian learning (Tsai & Lee 2011).

Figure 2.9: Unsupervised learning (Tsai & Lee 2011).

2.5.1.2 Supervised learning In Supervised learning, a supervisor (or teacher) provides the ANN with an input pattern and the associated target, or desired response (Eberhart et al. 1990). The difference between the actual output of the ANN and the target output serves as an error measure and is used in correcting synaptic weights (Maier & Dandy 1997). The weights are adjusted gradually, by updating them at each step of the learning process so that the error between the ANN’s output and corresponding desired output is reduced.


This adjustment is carried out iteratively in a step-by-step fashion with the aim of eventually making the ANN emulate the teacher. Since adjustable weights are assumed, the teacher may implement a reward and punishment scheme to adapt the ANN’s weights (Canuto 2001). Supervised learning rewards accurate classifications or associations and punishes those that yield inaccurate responses (Piotrowski & Napiorkowski 2013). The reward or punishment is based on the teacher’s estimate of the negative error gradient direction. An example of Supervised learning is error-correction learning, of which gradient-descent by back-propagation is an example. In this work we follow the Supervised model of learning.

Figure 2.10: Supervised learning (Tsai & Lee 2011).

2.5.2 Modes of learning Mode of learning refers to the type of weight adjustment implemented during training. Weights can be updated in two ways, namely Batch and On-line modes (Aamodt 2010).

- Batch learning/Off-line learning: In this type of learning, weight changes are made only after the entire training set has been presented to the ANN. Weight changes for each presented pattern are accumulated and applied after each epoch (i.e. one complete presentation of the entire training set during the training process) (Badri 2010). In Off-line learning, once the ANN has been trained and enters recall mode (i.e. when the ANN is in operation) the weights are fixed and not modified at all (Abbas et al. 2013).


Off-line training provides a more accurate estimate of the gradient vector, even though it requires more storage space than On-line training (Balestrassi et al. 2009).

- On-line learning: In this type of learning, the weights are adjusted after each pattern is presented to the ANN (Abbas et al. 2013). Once the ANN has been trained and enters recall mode, the weights are fixed and not modified at all. The advantage of On-line learning is that it requires less storage space than Batch training (Churchland & Sejnowski 2005).

This work assumes On-line learning, since the literature indicates that on most problems of practical interest Batch learning requires more training time than On-line training without any corresponding improvement in accuracy.
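As an illustration of the two modes of learning, the hedged Python sketch below contrasts a Batch (per-epoch) update with an On-line (per-pattern) update for a single linear neuron trained by gradient descent. The data, learning rate and model are invented purely for illustration and are not those used in this research.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 3))            # 8 training patterns, 3 inputs
    t = X @ np.array([0.5, -1.0, 2.0])     # targets generated by a known linear rule
    eta = 0.05                             # assumed learning rate

    def batch_epoch(w):
        # Accumulate the gradient over the whole training set, then update once.
        grad = np.zeros_like(w)
        for x, target in zip(X, t):
            grad += -(target - x @ w) * x
        return w - eta * grad

    def online_epoch(w):
        # Update the weights immediately after every pattern.
        for x, target in zip(X, t):
            w = w - eta * -(target - x @ w) * x
        return w

    w_batch = np.zeros(3)
    w_online = np.zeros(3)
    for _ in range(50):
        w_batch = batch_epoch(w_batch)
        w_online = online_epoch(w_online)
    print(w_batch, w_online)   # both approach [0.5, -1.0, 2.0]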

2.5.3 Learning algorithm There are many training strategies developed for different ANN models. For training FFNNs the most popular technique is the Backpropagation (BP) algorithm introduced by Rummelhart (1967). The BP algorithm has been used in about 70% of ANN applications (Hillberg et al. 2009). It is appropriate for problems with the following characteristics (EngelBretch et al. 1995):

- Instances are represented by many attribute-value pairs: The target function to be learned is defined over instances that can be described by a vector of predefined features, such as pixel values. These input attributes may be highly correlated or independent of one another.
- The target function output may be discrete-valued, real-valued, or a vector of several real- or discrete-valued attributes.
- The training examples may contain errors: ANN learning methods are quite robust to noise in the training data.
- Long training times are acceptable: ANN learning algorithms typically require longer training times than decision tree learning algorithms. Training times can range from a few seconds to many hours, depending on factors such as the number of weights in the network, the number of training examples considered, and the settings of various learning algorithm parameters.
- Fast evaluation of the learned target function may be required: Although ANN learning times are relatively long, evaluating the learned network, in order to apply it to a subsequent instance, is typically very fast.


- The ability of humans to understand the learned target function is not important: The weights learned by ANNs are often difficult for humans to interpret.

We now review the mathematics behind the Backpropagation learning algorithm.

2.5.3.1 Backpropagation learning algorithm The BP algorithm consists of two distinct phases: (a) the forward phase and (b) the backward phase (Churchland & Sejnowski 2005). For the forward pass, during training, a sample is presented to the ANN as input. For each layer, the output from the previous layer is used as an input to the next hidden layer until the output layer is reached and the output is produced. The output response is then compared to the known target output. Based on the value of the error, the connection weights are adjusted. In the backward pass weights are adapted to ensure that the minimum error between the targets and the actual outputs is achieved. The error is then propagated backwards.

2.5.3.1.1 Forward phase In the forward phase, a pattern is presented to the ANN via the input layer and propagated to the hidden layer. The Logistic sigmoid activation signal is computed for each layer and passed on until it reaches the output layer. The output layer provides the response of the network for a given input pattern. The error E in the output layer is then computed as

E = \frac{1}{2} \sum_{k} (t_k - y_k)^2    (2.5)

where y_k is the ANN output and t_k is the targeted output from the ANN.

2.5.3.1.2 Backward phase In the backward pass, which starts at the output layer, the error computed in the forward pass is propagated backwards through the ANN, layer by layer, and the local error (or gradient) \delta for each neuron is computed recursively. To minimize the error it is essential that the weights are updated in the direction opposite to the gradient of the error function, i.e. \partial E / \partial w_{kj} has to be computed. The chain rule is used to decompose this derivative,

\frac{\partial E}{\partial w_{kj}} = \frac{\partial E}{\partial y_k} \cdot \frac{\partial y_k}{\partial z_k} \cdot \frac{\partial z_k}{\partial w_{kj}}    (2.6)

Each of the terms in equation 2.6 is computed as

\frac{\partial E}{\partial y_k} = -(t_k - y_k)    (2.7)

\frac{\partial y_k}{\partial z_k} = y_k (1 - y_k)    (2.8)

\frac{\partial z_k}{\partial w_{kj}} = y_j    (2.9)

Assuming a learning rate of \eta and substituting the formulae in equations (2.7, 2.8, 2.9), the weight w_{kj} is updated by the vector \Delta w_{kj} so that

\Delta w_{kj} = \eta (t_k - y_k) y_k (1 - y_k) y_j = \eta \delta_k y_j    (2.10)

where \delta_k = (t_k - y_k) y_k (1 - y_k) is the local gradient of output neuron k. To compute the update for the hidden layers we need to adjust or minimize the summed error in the ANN,

E = \frac{1}{2} \sum_{k} (t_k - y_k)^2    (2.11)

by adjusting the connection weight w_{ji}. This gives

\frac{\partial E}{\partial w_{ji}} = \frac{\partial E}{\partial y_j} \cdot \frac{\partial y_j}{\partial z_j} \cdot \frac{\partial z_j}{\partial w_{ji}}    (2.12)

The components can be computed as in equations (2.7, 2.8 and 2.9),

\frac{\partial E}{\partial y_j} = -\sum_{k} \delta_k w_{kj}    (2.13)

\frac{\partial y_j}{\partial z_j} = y_j (1 - y_j)    (2.14)

\frac{\partial z_j}{\partial w_{ji}} = y_i    (2.15)

\delta_j = y_j (1 - y_j) \sum_{k} \delta_k w_{kj}    (2.16)

\Delta w_{ji} = \eta \delta_j y_i    (2.17)

The overall weight update algorithm is given by

w(t+1) = w(t) + \eta \delta y    (2.18)

where \eta is the learning rate, y_j and y_i are the output signals from neuron j and neuron i, y_k is the ANN output, t_k is the target output from the network, w_{ji} is the weight of the connection from hidden neuron j to input neuron i and w_{kj} is the weight of the connection from output neuron k to hidden neuron j. To better illustrate the principle behind the BP algorithm consider Figures 2.11 and 2.12.

Figure 2.11: Backpropagation (Eberhart et al. 1990).

Figure 2.11 shows the relation between the behaviour of E (the error function) and one weight w (in most ANNs there are many weights). To decrease the value of E, we must always move in the direction opposite to the gradient (slope). If the gradient of E is negative, we must increase w to move towards the minimum. If the gradient of E is positive, we must decrease w to move towards the minimum. By repeating this over and over, we move downhill in E until we reach a minimum, where \partial E / \partial w = 0, so that no further progress is possible, as illustrated in Figure 2.12.


Figure 2.12: Gradient descent minimum attained (Eberhart et al. 1990).
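To complement the derivation above, the following Python sketch performs one forward and one backward pass of the BP algorithm for a single-hidden-layer network with Logistic sigmoid hidden units and a linear output neuron, in the On-line mode assumed in this work. Because the output unit is linear (as recommended for forecasting in Section 2.4.1.2.1.1.2), the sigmoid derivative of equation 2.8 drops out at the output layer. The layer sizes, learning rate and data are assumptions made purely for illustration and do not correspond to the models built later in this thesis.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(1)
    n_in, n_hid = 4, 3                               # assumed layer sizes
    W1 = rng.normal(scale=0.1, size=(n_hid, n_in))   # input -> hidden weights (w_ji)
    W2 = rng.normal(scale=0.1, size=n_hid)           # hidden -> output weights (w_kj)
    eta = 0.1                                        # assumed learning rate

    def train_pattern(x, t):
        # One On-line BP step for a single (pattern, target) pair.
        global W1, W2
        # Forward phase
        y_hidden = sigmoid(W1 @ x)                   # hidden activations y_j
        y_out = W2 @ y_hidden                        # linear output y_k
        # Backward phase
        delta_out = (t - y_out)                      # output local gradient (linear unit)
        delta_hidden = y_hidden * (1 - y_hidden) * (W2 * delta_out)   # equation 2.16
        W2 += eta * delta_out * y_hidden             # analogue of equation 2.10
        W1 += eta * np.outer(delta_hidden, x)        # equation 2.17
        return 0.5 * (t - y_out) ** 2                # squared error (equation 2.5) before the update

    x = np.array([0.2, -0.1, 0.4, 0.05])
    print(train_pattern(x, t=0.3))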

There are many improvements to the BP algorithm, such as the momentum term and weight decay, described in the appropriate literature (Haykin 1999; Control 2010). Cortez et al. (2006) praise the BP algorithm as an "easy and elegant way of performing on-line (or stochastic) gradient descent to train neural networks". Also, according to Haykin (1999), the BP algorithm is popular for two primary reasons: the algorithm is easy to use and it provides effective solutions to large and difficult problems. However, Tsilo (2008) lists several potential problems with the BP algorithm. He calls it a 'user's nightmare'. The potential problems he highlights include the following: choosing the appropriate learning rate and momentum term, choosing to train by epoch or by pattern (case), finding an effective initial random starting state, and deciding when to stop. He states that changing any of these parameters may have a 'major' impact upon the results. Literature from other researchers also indicates problems associated with BP learning strategies, which include:

- Local minima in the error function surface: In practice, the error function can have, apart from the global minimum, multiple local minima. There is a danger of the algorithm landing in one of the local minima and thus not being able to reduce the error to the greatest extent possible by reaching the global minimum (Coulibaly et al. 2000).
- Insufficient search direction (Zurada 1994).
- For a step size that is too large, the gradient moves around the minimum but does not converge to it (Zurada 1994).
- Slow convergence in the case of a small step size (Hessami et al. 2004).
- Relatively bad generalization in certain instances (Zurada 1994).

2.6 Time series forecasting Forecasting is the art of predicting the future. Such activities are critical in many areas including macroeconomics, finance, production and facilities planning, sales and inventory control. In general, a forecast is often required when a decision is to be made regarding an uncertain future. To handle the increasing variety and complexity of forecasting problems, many new techniques have been developed. Each has its special use, and care must be taken to select the correct technique for a particular application. The selection of a method depends on many factors, including the context of the forecast, the degree of accuracy desired, the relevance and availability of historical data, the time period to be forecast, and the analyst’s time available for performing the analysis (Control 2010).

2.6.1 TCP/IP network forecasting using ANNs Various application domains exist for forecasting and analysing time series; however, for the purposes of this research we concentrate solely on forecasting TCP/IP network traffic. Internet traffic prediction plays a fundamental role in network design, management, control, and optimization. Two of the most significant findings about the statistics of internet traffic over the last ten years are that it exhibits self-similarity and non-linearity (Kvale 2010; Choudhury et al. 2012). This self-similar and non-linear nature of network traffic makes highly accurate prediction difficult (Cortez et al. 2006). In the past several years, many methods have been proposed for TCP/IP network traffic prediction. Statistical models such as moving average, exponential smoothing methods, autoregressive (AR) models, linear regression models, autoregressive moving average (ARMA) models, and models such as Kalman filtering and Box-Jenkins have been widely used in practice for time series analysis and prediction of TCP/IP network traffic (Zhang et al. 1998). While some of these methods do improve the prediction performance for self-similar time series, most of them have proved to be both time-consuming and largely ineffective. The literature indicates that, unlike other methods, ANNs can approximate almost any function regardless of its degree of nonlinearity. This positions them as good candidates for modelling non-linear and self-similar time series such as network traffic (Chaoba 2009). The literature also shows their performance and success in predicting


TCP/IP network traffic trends increasing in accuracy from 64.3% in earlier approaches (Werbos 1989) to 70% and better (Cortez et al. 2010).

Researchers have applied many different ANN models to explore the traffic modelling task. Models, such as the Time Delay Neural Network (Waibel et al. 1989), Radial Basis Neural Network (Rivas et al. 2004) and Hopfield Neural Networks (Guang 2013) have been effectively utilised by several authors. The Sliding Time Window Neural Network (STWNN) model (Zhang 2001) which takes as inputs the time lags used to build a forecast, has also featured prominently as one of the most widely used ANN models, mainly due to its simplicity and the fact that it optimises the network performance by providing a method of predicting data that exhibit nonlinear time varying behaviour (Eberhart et al. 1990). In our work we adopt the STWNN approach for the forecasting models. We review it accordingly.

2.6.2 Sliding Time Window Forecasting Model In this type of model data are presented to the ANN as a sliding window over the time series history. The ANN tries to learn the underlying data generation process during training, so that valid forecasts are made when new input values are provided (Moustafa et al. 2011). Figure 2.13 shows a STWNN forecasting model.

Figure 2.13: A FFNN with time window as a non-linear model (Moustafa et al. 2011).

Consider Figure 2.13, and let y_{i,t} denote a multivariate series, where y_{i,t} is the chronological observation t on variable i, and k is the number of distinct time variables. Then:

\hat{y}_{i,t} = f(y_{i,t-k_1}, y_{i,t-k_2}, \ldots, y_{i,t-k_n})    (2.19)

e_{i,t} = y_{i,t} - \hat{y}_{i,t}    (2.20)

where \hat{y}_{i,t} denotes the estimated value for variable i at time t, f is the activation function of the forecasting model, \{k_1, k_2, \ldots, k_n\} is the set of time lags forming the sliding window, and e_{i,t} is the error. The overall model is given by the expression

\hat{y}_{i,t} = w_{o,0} + \sum_{j} f\left( \sum_{n} y_{i,t-k_n} w_{j,n} + w_{j,0} \right) w_{o,j}    (2.21)

where w_{j,n} is the weight of the connection from node n to node j (if n = 0 then it is a bias connection), o denotes the output node and f is the Logistic sigmoidal activation function. Examples of ANN models of this type used for time series prediction purposes can be found in Singh et al. (2011) and Cortez et al. (2006).
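A hedged Python sketch of the sliding time window idea is given below: it turns a univariate series into (input window, target) pairs of the form used in equations 2.19-2.21 and fits a small MLP to them. The window of three consecutive lags, the synthetic series and the use of scikit-learn's MLPRegressor are assumptions made for illustration only; they are not the data or the tool chain used in this research.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Synthetic "traffic" series standing in for real TCP/IP measurements.
    t = np.arange(300)
    series = 50 + 10 * np.sin(2 * np.pi * t / 24) + np.random.default_rng(2).normal(0, 1, t.size)
    series = (series - series.mean()) / series.std()   # scale inputs for the sigmoid units

    def sliding_window(y, lags=(1, 2, 3)):
        # Build (inputs, target) pairs: inputs are y[t-k] for each lag k, target is y[t].
        start = max(lags)
        X = np.column_stack([y[start - k: len(y) - k] for k in lags])
        return X, y[start:]

    X, y = sliding_window(series)
    model = MLPRegressor(hidden_layer_sizes=(5,), activation='logistic',
                         solver='sgd', learning_rate_init=0.05,
                         max_iter=2000, random_state=0)
    model.fit(X[:-24], y[:-24])               # train on all but the last day
    print(model.predict(X[-24:])[:5])         # one-step-ahead forecasts on held-out points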

2.7 Conclusion This chapter gave a detailed review of the literature and theoretical background of the fundamentals of ANNs. MLPs in particular were extensively reviewed; their basic architecture and their relative weaknesses and strengths compared to SLPs were discussed in detail. The types of learning associated with ANNs, as well as the Logistic sigmoid and Linear activation functions, whose application in the domain of ANN learning is well established, were also reviewed. This chapter also gave a detailed overview of the Backpropagation learning rule, which is the algorithm used to train the ANN models in this research; the mathematics behind this algorithm was presented and explained. Finally, the chapter ended by discussing in detail the Sliding Time Window Neural Network forecasting model, which is the model we adopt for the implementation of the ANNs in this research. This chapter emphasized that, although the ANN approach can give considerably better results than conventional statistical techniques in many real world domains, it has some handicaps (or problems) that should be taken into consideration. The information given in this chapter is important in order to understand the concepts to be discussed in the following chapters. Chapter 3 will deal with the generalization ability of ANNs.


Chapter 3: Generalization of Artificial Neural Networks To design and train ANNs that generalize well to new examples after having been trained on a sufficient set of training examples is a major goal of ANN learning. Several factors have been reported to affect the generalization ability of ANNs; considerable interest has therefore been directed at understanding the conditions under which good generalization occurs in ANNs. In this chapter we review some of the existing research into the generalization ability of ANNs and describe some of the limitations and strengths of that research. We lay particular emphasis on the five factors whose relationship to the generalization exhibited by an ANN this research investigates. Special attention is paid to the components of the network structure (i.e. input, hidden and output layers) and the learning parameters (i.e. training set size, learning algorithm, training schedule and the learning rate). We carefully examine previous work done by academia and industry in various case studies that looked promising at the time of this writing. The intent is to point out the value each case study offers, and then draw conclusions in the final section of this chapter.

3.1 Generalization overview The essence of intelligence is plasticity of learning. Learning includes both memorization of the concepts which are required to be learned and generalization to new concepts which have not yet been encountered. Richards (1991) defines memorization as "a search for a function that successfully maps or relates training inputs to their corresponding outputs". Kavzoglu (1999) defines generalization as "the ability to map novel inputs to correct corresponding outputs based upon properties discovered in the training set". The power of an ANN depends on how well it describes new data after completing the training process (Haykin 1999). Generalization is a measure of how well the ANN performs on the actual problem once training is complete. Once the ANN can generalize well, it is capable of dealing with new situations such as a new problem or a new point on the curve or surface (Mi et al. 2005). Figure 3.1 illustrates how generalization may occur in a hypothetical network. The points labelled Training data are the nonlinear input-output mapping computed by the ANN as a result of learning. The points labelled Generalization are the result of interpolation performed by the ANN.


Figure 3.1:(a) Properly fitted data (good generalization). (b) Overfitted data (poor generalization) (Fearn 2004).

Figure 3.1a shows a hypothetical performance of an ANN that generalizes well, as indicated by the smooth fit between the Training data points and the Generalization point. Figure 3.1b shows the output of an ANN trained on a dataset similar to that of Figure 3.1a. In this case, however, the Training data points and the Generalization point appear not to be well correlated. This is a typical example of an ANN with poor generalization. Two phenomena lead ANNs to fail to generalize well: overfitting and underfitting (Schneider & Bishof 1992). The overfitting phenomenon occurs when the ANN has learned the noisy or imprecise data during the training phase, while the underfitting problem refers to the output of the ANN being far away from the actual underlying function of a given data set. Generally, these two phenomena identify the bias and variance dilemma which is described in detail in the next section. Research on the generalization ability of ANNs can be classified into three categories: (1) theoretical prediction of expected generalization; (2) structural adaptation of the network before, after, or during training to improve generalization ability; and (3) empirical estimation of the generalization ability of a trained network. All three research directions are important since they complement each other. A better theoretical understanding provides methods to implement better systems, a good estimation of the generalization ability of a network provides methods of assessing the accuracy of a new theory, and experimental results are often useful to form a hypothesis for establishing new theories. In this research we place particular emphasis on the third aspect, which is the empirical estimation of the generalization ability of trained ANNs. For a survey of the literature for the first two research categories refer to Tsai and Lee (2011) and the references therein.

3.2 Bias, variance dilemma and generalization errors The overall generalization error comes from two terms: bias and variance (Gonzalez 2000). Variance grows when the ANN tries to fit every data point, including the noise, while bias measures how far the smoothed, approximated model departs from the actual underlying function that generated the training data (Wan et al. 2009). The overfitting problem occurs when the model has a small bias and a large variance. Underfitting occurs when a model has a large bias and a small variance. There is always a trade-off between bias and variance. Bias and variance are complementary quantities and the best generalization performance will occur when a model achieves the optimal trade-off between the two, as indicated in Figure 3.2 (Teixeira & Fernandes 2012).

Figure 3.2: The optimal trade-off between bias and variance (Renals et al. 1992).
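The bias-variance trade-off can be illustrated empirically. The hedged Python sketch below repeatedly fits models of increasing complexity to noisy samples of a known function and estimates the squared bias and the variance at a test point. Polynomial degree is used here as a stand-in for model complexity (for an ANN the analogous knob would be the number of hidden neurons), and the function, noise level and degrees are assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(3)
    true_f = lambda x: np.sin(2 * np.pi * x)   # underlying function generating the data
    x_train = np.linspace(0, 1, 20)
    x_test = 0.37                              # single test point for the estimate

    for degree in (1, 3, 9):                   # increasing model complexity
        preds = []
        for _ in range(200):                   # 200 independent noisy training sets
            y_noisy = true_f(x_train) + rng.normal(0, 0.3, x_train.size)
            coeffs = np.polyfit(x_train, y_noisy, degree)
            preds.append(np.polyval(coeffs, x_test))
        preds = np.array(preds)
        bias_sq = (preds.mean() - true_f(x_test)) ** 2   # squared bias at the test point
        variance = preds.var()                           # variance across training sets
        print(f"degree {degree}: bias^2 = {bias_sq:.4f}, variance = {variance:.4f}")

Under this setup the simplest model shows the largest squared bias and the most complex one the largest variance, which is the trade-off sketched in Figure 3.2.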


It has been experimentally shown that the degree to which a model overfits or underfits the data is related to five factors (Wan et al. 2009; Swingler 1996):

- the size of the network architecture,
- the size of the training set,
- the learning algorithm,
- the choice of training schedule, and
- the size of the learning rate.

The choice of these parameters has an immense bearing on the performance of ANNs; most pertinent, however, is the extent to which each of these factors affects the ability of ANNs to generalize to new data. Research to date is inconclusive on this crucial aspect of ANN learning. With a plethora of research already published on this subject, we review some of the most crucial contributions.

3.3 Generalization: architecture and training set size By far the greatest amount of research into the ANN generalization problem has concentrated on exploring the impact of network architecture and training set size on ANN training and generalization. Network architecture means the set of input neurons, hidden neurons and output neurons together with the connections between neurons and the neuron groupings which combine to form a network of computing elements (Kavzoglu 1999). The training set, on the other hand, can be defined simply as all the data used to train the ANN. Haykin (1999) suggests that we should view the issue of generalization from two different perspectives:

- The architecture of the ANN is fixed (hopefully in accordance with the physical complexity of the underlying problem), and the issue to be resolved is that of determining the size of the training set needed for a good generalization to occur.
- The size of the training set is fixed, and the issue of interest is that of determining the best architecture of the ANN for achieving a good generalization.

Both viewpoints are valid in their own individual respects since they complement each other quite extensively. Several empirical results have been reported concerning the appropriate training set size for a given architecture, or concerning reducing the capacity of an ANN to accommodate a fixed-size training set. We undertake a detailed analysis of each of these perspectives.


3.3.1 Size of network architecture For a given data set there may be an infinite number of network structures capable of learning the characteristics of the data (Fearn 2004). The question however remains: "What size of network will be best for a specific data set?" Unfortunately, this is not an easy question to answer. The network architecture is of great importance as it dictates the number of free parameters available during training. It also affects learning time, but most importantly as motivation for this work, it is professed to affect the generalization capabilities of the network (Fearn 2004). Due to the difficulties encountered in analysing the behaviour of ANNs using nonlinear activation functions, theoretical work on the relationship between the size of the network architecture and generalization has been greatly limited (Richards 1991). The layers of input neurons, hidden neurons and output neurons are the most important attributes of a network (Nakamura 2005). Each of these affects the performance of an ANN in different ways; we examine each in turn.

3.3.1.1 The number of input neurons The sole role of the input layer of neurons is to relay the external inputs to the neurons of the hidden layer (Lange et al. 1997). Selecting the most appropriate number of input neurons for any task is transparent and relatively easy to do. In a time series forecasting problem, the number of input neurons corresponds to the number of lagged observations used to discover the underlying pattern in a time series and to make forecasts for future values (Conway 1998). It has been experimentally shown that a linear relationship exists between the number of input neurons and the time necessary to train a network (Zhang et al. 1999; Gopal & Woodcock 1996; Grimm & Yarnold 1997). More input neurons in the ANN require more time for training. The input layer can be expanded simply by adding new data sources as additional neurons, but this increases the computation time (Becker & Cun 1986). Therefore Wijayasekara et al. (2011) suggest that new network input sets should be added only if they contribute to a significantly improved performance. Every input neuron should represent some independent variable that has an influence on the output of the ANN (Teixeira & Fernandes 2012).

3.3.1.2 The number of output neurons The number of output neurons is also relatively easy to specify as it is directly related to the problem under study (Poh et al. 1998). For a time series forecasting problem, the number of

output neurons often corresponds to the forecasting horizon (Bebis & Georgiopoulos 1997). There is a broad consensus amongst many ANN practitioners that the overall effect of the output neurons on the performance of ANNs is relatively negligible (Gallagher & Downs 1997; Karsoliya 2012). The only significant deviation from this viewpoint comes from Zhang and Kline (2007), who, in the task of forecasting various quarterly seasonal time series, resolved that the greater the number of output neurons, the more complex the nature of the problem to be solved, due to the separation of the input space into more specific regions. They advise that it is important to choose the optimum number of output neurons to avoid unnecessary training.

3.3.1.3 The number of hidden neurons and layers The hidden layer is the collection of neurons to which an activation function is applied; it provides an intermediate layer between the input layer and the output layer. The numbers of hidden neurons and layers play very important roles in many applications of ANNs (Baum & Haussler 1989; Cherkassky & Zhong 2001; Kavzoglu 1999). It is the hidden neurons in the hidden layers that allow ANNs to detect features, to capture the pattern in the data, and to perform complicated nonlinear mapping between input and output variables. It is the hidden neurons in the hidden layers that define the complexity and power of the ANN model used to delineate the underlying relationships and structures inherent in a particular dataset (Fearn 2004). More importantly, it is the hidden neurons in the hidden layers that determine the generalization ability of ANNs (Richards 1991; Weigend et al. 1990; Swingler 1996).

A major issue in the design of ANN models is the determination of the optimum number of hidden neurons. A common sentiment shared by many ANN practitioners is that the number of neurons in the hidden layer(s) should be large enough for the correct representation of the problem, but low enough to have adequate generalization capabilities (Chester 1990; Rasmussen 1993). Most of these authors tend to apply some form of Occam's razor to their networks. The Occam's razor principle attributed to the 14th century logician and Franciscan friar William of Ockham, states “For a given dataset unnecessarily complex models should not be preferred to simpler ones” (Rasmussen 1993). This approach, although favoured by some in the ANN research community, is emphatically rejected by others. Several researchers have investigated systems which vary the number of hidden units in the ANN in an attempt to determine the best generalization ability of a network for a given training set size. One such example is Sietsma and


Dow (1988). These workers developed a technique to test the hypothesis that ANNs with few hidden neurons on the first hidden layer generalize better than ANNs with many hidden neurons on the first hidden layer. They trained an ANN to classify sine waves of different frequencies. Two test sets were generated: one consisted of sine waves of different frequencies with different phase shifts from the training set, while the second consisted of similar sine waves corrupted with random noise added to the test patterns. The initial network architecture had 64 input neurons, 20 hidden neurons on the first hidden layer, 8 neurons on the second hidden layer and 3 output neurons. Results were obtained for training the ANN with several different initial conditions. The ANN was then re-trained while unused or non-contributing neurons were pruned from the ANN. At the end of their experiments, they resolved that reducing the ANN to the smallest size capable of classifying the training set degraded the generalization capability of the network. They state that "in some instances networks with many hidden neurons generalize better than networks with few hidden neurons."

Another set of interesting findings comes from Gorman and Sejnowski (1998), who trained ANNs of 60 input neurons and 2 output neurons. They varied the number of hidden neurons from 0 to 24. We provide a summary of their results in Table 3.1.

Table 3.1: Results from experiments conducted by Gorman & Sejnowski (1998).

                    Average performance
Hidden neurons    Training (%)    Testing (%)
0                 79.3            73.1
2                 96.2            85.7
3                 98.1            87.6
6                 99.4            89.3
12                99.8            90.4
24                100.0           89.2

From Table 3.1 note that as the number of hidden neurons increases from 0 to 12, both the training recognition and testing generalization increase. Note also, however, that at 24 hidden

neurons the training recognition continued to rise while the testing generalization exhibited a very slight decrease. Gorman and Sejnowski (1998) conclude that the generalization ability of their ANNs increased with an increase in the number of hidden neurons. However, careful analysis of their results points to a rather flawed conclusion. They seem to notice only the increase in generalization from 0 to 12 hidden neurons, without offering any explanation for the essentially similar generalization exhibited across the range of 6 to 24 hidden neurons, an omission that could potentially render their conclusions questionable. Both studies, from Gorman and Sejnowski (1998) and Sietsma and Dow (1988), are a far cry from the theorem enunciated by the Franciscan friar William in his Occam's razor principle. Furthermore, Lawrence et al. (1996) carried out a large-scale numerical study on the optimum network size. They found that networks with sizes larger than the optimal size can sometimes generalize better. They caution, however, that this does not necessarily contradict the Occam's razor principle, as the quality of the different networks should be taken into consideration before any conclusions could be reached on the validity of the Occam's razor principle. Lawrence et al. (1996) share some form of an "Occam's Razor" bias and, at least to some degree, prefer simpler models.
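Empirical studies of this kind can be reproduced in outline with very little code. The hedged Python sketch below sweeps the number of hidden neurons of a one-hidden-layer network on a synthetic classification task and records training and test accuracy, mirroring the style of Tables 3.1 and 3.2. The dataset, the hidden-layer sizes and the use of scikit-learn are assumptions made for illustration and are unrelated to the experiments reported in this thesis.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    # Synthetic two-class problem standing in for the data of the cited studies.
    X, y = make_classification(n_samples=400, n_features=20, n_informative=8, random_state=4)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=4)

    for n_hidden in (2, 3, 6, 12, 24):
        clf = MLPClassifier(hidden_layer_sizes=(n_hidden,), activation='logistic',
                            max_iter=2000, random_state=4)
        clf.fit(X_tr, y_tr)
        print(f"{n_hidden:>3} hidden: train {clf.score(X_tr, y_tr):.3f}, "
              f"test {clf.score(X_te, y_te):.3f}")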

On a rather contrary note, Weigend et al. (1990) trained an ANN with one hidden layer, having a nonlinear activation in the hidden layer and a Linear activation at the output layer, to predict the time series of sunspot activities. They assumed that the best generalization occurs when the smallest network is still able to fit the training data. They began with a large ANN and gradually pruned it. At the end of their investigation they conclude that as the ANN is gradually pruned its generalization ability increases to a certain point, beyond which it begins to decrease. The authors however do not offer any rationale as to how they selected their ANNs or what they meant by the term 'large', as in some cases 'large' could be deemed small by certain standards. Similar observations to those of Weigend et al. (1990) were made by Hinton (1988) during experiments with ANNs. Also, Morgan and Bourlard (1989) explored the impact of the number of hidden neurons on the generalization ability of ANNs in phoneme recognition. The number of hidden layer neurons was varied from 20 to 200, performance on the test set was monitored, and training was discontinued when test set performance showed no further improvement. They report that as the number of hidden neurons increased

from 20 to 200, performance on the training set increased from 75.7% to 86.7%, although the corresponding performance on the generalization test set was found to decrease from 62.7% to 59.6%. We provide a summary of their results in Table 3.2.

Table 3.2: Results on the generalization ability of phoneme recognition ANNs (Morgan & Bourlard 1989)

                    Phoneme recognition
Hidden neurons    Training (%)    Testing (%)
5                 62.8            54.2
20                75.7            62.7
50                73.7            60.6
200               86.7            59.6

From Table 3.2, note that as the number of hidden neurons increases from 5 to 200 there is a corresponding increase in training recognition; however, this is accompanied by a decrease in testing generalization. These results may nevertheless be regarded as somewhat limited because of the restricted range of hidden neurons the authors chose to investigate. For an ANN with 1888 input neurons they only explored a single hidden layer with 5 to 200 hidden neurons. It is also quite possible that the ANNs could exhibit some intermediate behaviour within this range; for example, training recognition could rise beyond 86.7% and then fall back at points that were not tested. It is not clear why Morgan and Bourlard (1989) did not study ANNs having more than 200 hidden neurons, since the 200-hidden-neuron network exhibited the poorest generalization ability. Their findings are nevertheless consistent with the Occam's razor principle, the essence of which is that, all things being equal, the simplest solution tends to work best.

Recent studies in the literature seem to suggest that the optimal network architecture for a good generalization ability of ANNs depends solely on the number of training patterns presented to the system, introducing yet another dimension to this debate. Martin and Pittmann (1990) used real world data in their examinations of the generalization problem in ANNs. In exploring the effect of network architecture on ANN generalization ability, they manipulated the network

architecture in various ways: varying the number of hidden neurons, limiting the connectivity between layers, distributing the hidden neurons across both one and two layers, and sharing connection weights between hidden neurons. After completing their investigations, it turned out that using ANNs with fewer hidden neurons did not improve generalization, and neither did limiting the connectivity between layers. They state: "The fact that we find no advantage to reducing the number of connections conflicts with Baum's and Haussler's estimates and the underlying assumption that network architecture plays a strong role in determining generalization." They conclude their investigations by stating: "Given an architecture that enables relatively high training performance, we find only small effects of network architecture on generalization performance. It is probably better to devote limited resources to collecting a very large, representative training set than to extensive experimentation with different net architectures. The variations in network capacity that we have examined do not sufficiently affect generalization performance for sufficiently large training sets." The results from Martin and Pittmann (1990) attest to the divergence in opinion among researchers as to the influence the network architecture has on the generalization ability of ANNs. Commensurate with the findings of Martin and Pittmann (1990), Lange et al. (1997) obtained an optimal generalization performance for a 10-dimensional input, 2-class output and 208 training patterns at only 8 hidden neurons, whilst Lippmann (1987) used only 8 hidden neurons to obtain good results for a 2-dimensional input, 2-class output and 200 input training patterns, highlighting the negligible effects of the network architecture on the generalization ability of their ANNs.

In the absence of an exact paradigm to estimate an optimally or near-optimally performing ANN architecture, literature has been inundated with several strategies and heuristics to estimate the optimum number of hidden layer neurons for an ideal generalization ability of ANNs. However, none of these methods has had the theoretical rigor of revealing optimal or at least near-optimal solutions. We give some examples of these heuristics in Table 3.3.


Table 3.3: Heuristics for selecting the optimal number of hidden neurons in ANNs for optimal generalization ability.

Heuristic researcher

Kanellopoulos and Wilkinson (1997)

Hush et al. (1992)

Hecht-Nielsen (1987) Wang and Zhou (2005)

Ripley (1993)

Stathakis (2009)

The heuristics are expressed in terms of the number of input layer neurons, the number of output layer neurons, and the number of training samples (or patterns).

Several systematic approaches have also been proposed to obtain automatically the optimal number of hidden neurons in an ANN. These approaches fall largely into two groups. The first is to start with a small ANN and iteratively increase the number of neurons in the hidden layer(s) until satisfactory learning is achieved. Techniques based on this approach are called Constructive techniques (Hirose et al. 1991). One typical example is the Adaptive method of architecture determination suggested by Hashash et al. (2004). This technique begins with an arbitrary, but small, number of neurons in the hidden layers during training, and as the ANN approaches its capacity, new neurons are added to the hidden layers and new connection weights are generated. Training is continued immediately after the new hidden neurons are added to allow the new connection weights to acquire the portion of the knowledge base which was not stored in the old connection weights. However, Bebis and Georgiopoulos (1997) raised the objection that this approach could be problem prone, as small ANNs are sensitive to initial conditions and learning parameters. These ANNs are also more likely to become trapped in a local minimum, as the error surface of a smaller ANN is more complicated and includes more local minima than the error surface of a larger ANN. As a consequence of the algorithm, a number of ANNs must be trained to find the optimum network structure, which takes a long time. The second approach is to begin with a larger ANN and make it smaller by iteratively eliminating neurons in the hidden layer(s) or interconnections between neurons. These types of algorithms are called Pruning techniques. Optimum brain damage, Optimum brain surgeon, and Skeletonization are the major pruning techniques in use. After an ANN is trained to a desired solution with the training data, units (hidden layer neurons or interconnections) are analysed to find out which ones are not participating in the solution, and they are then eliminated (Kavzoglu 1999). However, these techniques have also been found lacking in many respects and ultimately the selection of the most appropriate number of hidden neurons comes down to trial and error.

Equally important is the number of hidden layers. Also available in the literature are heuristics for selecting the optimal number of hidden layers, but unfortunately they remain largely subjective and inconclusive. Reports from Lippmann (1987) and Cybenko (1989) indicate that a single hidden layer is generally sufficient for most ANNs to achieve a good generalization capability, particularly for forecasting tasks. Hecht-Nielsen (1987) provides a proof that a single hidden layer of neurons, operating with a sigmoidal activation function, is sufficient to model any solution surface of practical interest. On the contrary, Flood and Kartam (1994) state that many solution surfaces are extremely difficult to model using a sigmoidal network with one hidden layer. This assertion is further supported by Renals et al. (1992) and Srinivasan and Liew (1994), who insist that a second hidden layer does offer some added performance benefits. In fact, Srinivasan and Liew (1994), in one of their experiments, used two hidden layers in their network. They claim this resulted in a more compact architecture, which achieved a higher generalization performance than the single hidden layer network. Lapedes and Faber (1987) provide more practical proof that two hidden layers are sufficient, and according to Chester (1990), the first hidden layer is used to extract the local features of the input patterns while the second hidden layer is useful for extracting the global features of the training patterns. However, Masters (1993) states that using more than one hidden layer often slows the training process dramatically and increases the chance of getting trapped in local minima.


The relationship between the network architecture and the generalization ability exhibited by an ANN has always been a cause of concern to ANN users. The lack of clear-cut guidelines on the selection of the appropriate network architecture for good generalization ability poses a major difficulty and obstacle for new users, thus undermining the popularity of ANNs. It is critical that this issue is explored further in a much more empirical and systematic manner. A wider range of ANNs of different sizes, trained on a fixed training set, should be investigated so as to ascertain their generalization capabilities. In performing these investigations, it is important that manipulation of the network structure is not confined to the hidden neurons but extends to the hidden layers as well, so as to arrive at more conclusive findings. If we train an Ensemble of ANNs with different numbers of hidden neurons, but with the same validation error, Occam's razor indicates that the smaller ANNs would generalize better than the larger ANNs during testing. More systematic simulations on some benchmark network architectures should be conducted in order to ascertain whether this trend is apparent.

3.3.2 Training set size The size of the training set employed at the learning stage has a significant impact on the performance of ANNs. The training set plays a major role in adjusting weights during the ANN learning process (Eskander et al. 2007). A general consensus amongst ANN practitioners is that if the number of training samples is not sufficient, the ANN cannot correctly learn the actual input-output relation of the system. On the other hand if the number of training samples is too large it may cause the ANN to overfit the data and increase training time (Chakraborty & Pal 1997; Karelson et al. 2006; Mencer & Parisi 1992). The training set must therefore contain enough information to learn the task. That then presents a fundamental problem in model selection: that of the selection of a concise training set. Without prior knowledge about the learning task, it is very difficult to obtain a representative training set.

There have been several attempts in the literature to estimate the optimum size of the training samples in relation to the network size and the generalization level desired of an ANN. Indications from most of these studies point to a conflict of opinions and findings on the part of the researchers. On one hand, some authors have reported a correlative increase in generalization ability of their ANNs with an increase in training set size, whilst on the other hand the opposite is the norm.


One typical example is provided by Leung and Zue (1989), who explored the impact of training set size on the generalization ability of an ANN having a fixed architecture in classifying speech data. The ANN was trained with sets consisting of 80 to over 8000 training samples. With an increase in training set size, training recognition results reached an asymptote of about 80%, while speaker independent testing results show that the generalization performance increased monotonically with training set size from 30% to 54%, with the greatest increase in accuracy occurring between 80 and over 8000 tokens.

Another group of authors to make known their findings are Chakraborty and Pal (2003) in forecasting electric load using ANNs. Their results show that with an increase in training set size, the generalization ability of the ANN increased from a mere 20% to over 75%. Similar sentiments were echoed by Lange and Manner (1994) in testing the effect of different training sample sizes on general forecasting tasks. They conclude that as the training sample size increased, the ANN forecaster performed better, with improved generalization ability. Furthermore, Hush et al. (1992), after carrying out a series of experiments on ANNs with fixed architectures and varying the training set size, conclude that the more training samples there are, the more incorrect functions the ANN is able to reject, and the more likely it is to find the correct function, leading to better generalization ability. Rivas et al. (2004), who did extensive work in forecasting currency exchange rates, indicate that larger quantities (periods of time) of data will produce more comprehensive models and better generalization.

Contrary to that, Ahmad and Tesauro (1998) trained an ANN to learn a linearly separable problem in order to study the relationship between the size of the network, the number of training samples, and the generalization exhibited by the ANN. Their results indicate that for a given input the generalization ability of the ANN decreased exponentially with the number of training samples. They also discovered that the number of training patterns required to achieve a fixed performance level increased linearly with the network size. Also, Zhuang and Hindi (1990) explored the size of the training set required by ANN to classify features. They conclude that using only a small percentage of about 5-10% of the data was more than adequate to train the ANN for a good generalization. They assert that any greater size of the training set would have been effectively detrimental to generalization ability of their model. Both studies are a stark

contradiction to the previous studies, which had presumptuously concluded that the generalization ability of ANNs tends to increase as the training set size is enlarged. Also, experimental results by Lange et al. (1997) show that more training examples do not necessarily improve generalization. In their paper "Neural networks for pattern recognition", Lange et al. (1997) introduce the notion of a critical training set size. Through experimentation they found that examples beyond this critical size do not improve generalization, illustrating that an excess of patterns offers no real gain. They further state that this critical training set size is problem dependent.

In light of the dilemma of selecting an optimum training set size, several authors have proposed some rules of thumb for determining the appropriate training set size vis-a-vis network architecture and generalization thereof. However none of these heuristics has been universally accepted. Wei et al. (2004) suggest that one should never use more hidden layer neurons than training set samples. They state that the number of hidden layer neurons should always be much smaller than the size of training samples, otherwise the ANN will memorize the training samples, which leads to failure in generalization of new and unseen data.

Garson (1998) goes further to suggest that the number of training samples should be at least 10 times the number of input and middle (hidden) layer neurons in the ANN for an appropriate generalization ability. Tong (1983) and Tong et al. (2005) suggest that for nonlinear forecasting models, to achieve good generalization, at least 80% of any data sample should be reserved for training. Flood and Kartam (1994) suggest using all the available data for training and using so-called synthetic time series for testing, so as to reduce the data requirement in building ANN forecasters. They suggest this approach would be more sensible in cases where there is a deficiency of data, which is usually the case in many real world problems. This, they claim, will result in optimal generalization performance of any ANN. Some of the heuristics provided by various authors on the determination of an optimal training set size are summarized in Table 3.4, expressed in terms of the number of input neurons and the total number of weights in the ANN.


Table 3.4: Heuristics on the selection of optimal training set size.

Heuristic researcher

Klimasauskas (1993)

Baum and Haussler (1989)

Hush (1989)

Hush and Horne (1993)

Garson (1998)

A more elaborate proposal comes from Vapnik and Cervonenkis (1968). They insist that the best way of determining the size of the training set is to consider the generalization error of the network, which is defined as the difference between the generalization on the training data and the generalization on the actual problem. In most cases, it is found that the difference between the two generalizations can be bounded, and by increasing the number of training samples this bound can be made arbitrarily small (Pissarenko 2002). This bound can be established when the number of training samples exceeds the Vapnik-Chervonenkis dimension (VC dimension). The VC dimension is a measure of the capability of the network (Haykin 1999). The VC dimension of a one-hidden-layer ANN with full connectivity between the layers lies in the range (Zhang & Fukushige 2002):

2 \left\lfloor \frac{N_h}{2} \right\rfloor N_i \;\le\; d_{VC} \;\le\; 2 N_w \log_2 (e N_n)    (3.1)

where N_h is the number of hidden neurons, N_i is the number of input neurons, N_w is the total number of weights in the ANN, and N_n is the total number of neurons in the ANN. The upper bound holds irrespective of the number of layers and the connectivity (Mitchell 1997). This approach, whilst heavy on the technical aspect, has generated more questions than answers, with divergent views being expressed by authors. Sontag (1992) suggests that the training set size should be at least twice as large as the VC dimension. On the contrary, Baum and Haussler (1989) propose that if a generalization level of 90% is desired, the number of training samples should be about 10 times the VC dimension, or the number of weights in the ANN. However, not

to be outdone, Jain (1996), in one of his tutorial notes, states that the VC dimension is largely pessimistic. He goes on to give reasons as to how he arrived at this conclusion. He states that computing an upper bound on the expected risk is not practical in various situations and that the VC dimension cannot be accurately estimated for nonlinear models such as ANNs. Furthermore, he claims that the VC dimension may be infinite, requiring an infinite amount of data, and that the upper bound may sometimes be trivial. However, one area of weakness in his conclusions is that they lack a practical basis, as they are mainly based on complex mathematical equations. His claims are nevertheless further supported by Hasegawa et al. (1996), who in their various investigations concluded that the VC dimension provides overly pessimistic bounds on the number of training examples, often leading to an overestimation of the required training set. Furthermore, experimental results have shown that acceptable generalization performances can be obtained with training set sizes much smaller than those specified by the VC dimension (Cruz 1989).
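For concreteness, the short Python sketch below evaluates the bound of equation 3.1 (as reconstructed above) together with the rules of thumb of Sontag (1992) and Baum and Haussler (1989) for a hypothetical fully connected 10-5-1 network. The network size is an assumption chosen purely for illustration, and the upper bound is used as a conservative figure for the VC dimension in Sontag's rule.

    import math

    n_in, n_hidden, n_out = 10, 5, 1                              # assumed layer sizes
    n_weights = (n_in + 1) * n_hidden + (n_hidden + 1) * n_out    # weights including biases
    n_neurons = n_hidden + n_out

    vc_lower = 2 * (n_hidden // 2) * n_in                         # lower bound of equation 3.1
    vc_upper = 2 * n_weights * math.log2(math.e * n_neurons)      # upper bound of equation 3.1

    print(f"weights = {n_weights}, VC dimension between {vc_lower} and {vc_upper:.0f}")
    print(f"Sontag's rule of thumb: at least {2 * vc_upper:.0f} training samples")
    print(f"Baum and Haussler's 90% rule: about {10 * n_weights} training samples")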

On a more recent note, some researchers have come up with the idea of Active learning (also referred to in the literature as example selection, sequential learning, query-based learning) (Engelbreetch 2007). An active learning strategy allows the learner to dynamically select training examples, during training, from a candidate training set as received from the teacher (supervisor). The learner capitalizes on current attained knowledge to select examples from the candidate training set that are most likely to solve the problem, or that will lead to a maximum decrease in error. Rather than passively accepting training examples from the teacher, the ANN is allowed to use its current knowledge about the problem to have some deterministic control over which training examples to accept, and to guide the search for informative patterns. By adding this functionality to an ANN, the network changes from a passive learner to an active learner. This approach has led to shorter training times and better generalization, provided that the added complexity of the example selection method does not exceed the reduction in training computations (Conway 1998).

Authors such as Bretscher and Cohn (1970) show through average case analysis that the expected generalization performance of active learning is significantly better than passive learning. Similar sentiments are echoed by Seung et al. (1992) and Xie et al. (2002). In fact results presented by Seung et al. (1992), indicate that generalization error decreases more rapidly

for active learning than for passive learning. In spite of its advantages, Active learning is not without flaws. Seung et al. (1992), after performing extensive experiments on selecting training data sets, conclude that active learning is time-consuming and is most effective for Ensemble networks. David and MacKay (2007) state that the technique of Active learning is extremely computationally expensive. Fukumizu (1992) concludes that one drawback of active learning algorithms is that they rely on the inversion of an information matrix; if the information matrix is singular, its inverse may not exist.

There is no doubt that the size of the training set plays a crucial role in the performance of ANNs; the intrinsic relation between the size of the training set, the network architecture and the resulting generalization ability cannot be overemphasized. Research on the selection of an optimal training set size for ANNs has been nothing short of abundant, but questions remain unanswered, fundamentally: for a given network architecture, how big should the training set be for good generalization? The answer to that question is to date largely an academic affair. All of the heuristics provided presently remain contentious and, as such, render the ultimate process of selecting an appropriate training set size for ANN models a matter of trial and error. ANN researchers who have built ANN forecasting models for TCP/IP network traffic have typically utilized training data from one year, e.g. Cortez et al. (2006), to sixteen years, e.g. Cortez et al. (2010), including various training set sizes in between the two extremes, e.g. Chabaa (2010). However, once researchers have obtained their training data, they typically use all the data in building the ANN forecasting model, with no attempt at comparing the effect of data quantity on the quality of the produced forecasting models. To optimize the performance of ANNs in forecasting TCP/IP traffic trends, robust strategies need to be employed in an attempt to determine the optimal size of the training set that will guarantee good generalization. One potential direction would be to explore empirically the generalization ability obtained with various sizes of training set against different network architectures. The results from such experiments may then be used to define a critical training set size for ANN applications, in particular those dealing with TCP/IP forecasting problems.


3.3.3 Summary

Literature on the effects of network architecture and training set size on the generalization ability of ANNs indicates a plethora of views and opinions, most of them contradictory. The various views point to one of the following possibilities:

• Smaller ANNs tend to generalize better than larger ones, in accordance with Occam's razor principle (Sietsma & Dow 1988).
• Larger ANNs generalize better than smaller ones, contradicting Occam's razor principle (Gorman & Sejnowski 1998).
• The generalization ability of ANNs is directly proportional to the network size up to a certain point, beyond which the relationship becomes inverse (Weigend et al. 1990).
• The effects of network architecture on the generalization of ANNs are negligible (Martin & Pittmann 1990).
• Larger training sets result in better generalization of ANNs (Leung & Zue 1989).
• Smaller training sets result in better generalization of ANNs (Zhuang & Hindi 1990).

We have presented an array of research work dedicated to answering some of the most crucial questions with regard to the relationship between the network architecture, the training set size and the generalization ability of ANNs. We critically analysed each work, pointing out its flaws and limitations: for instance, we questioned the validity of Morgan and Bourlard's (1989) results after they conducted an investigation on a limited number of hidden neurons. We queried the work of Weigend et al. (1990), who did not give any rationale as to how they selected their ANNs or what they meant by the term "large" network. We also questioned the validity of conclusions made by Jain et al. (1996), which were purely based on mathematical computations without any real applications. However, some of the work was thoroughly done. For instance Leung and Zue (1989), in their quest to determine the appropriate training set size for good generalization, went out of their way to conduct training with up to 8000 samples. Also, the investigation by Martin and Pittmann (1990) into the appropriate network size for optimal generalization ability of ANNs was a fairly thorough job. They used real world data and manipulated the network architecture in various ways: varying the number of hidden neurons, limiting the connectivity between layers, distributing the hidden neurons across both 1 and 2 layers, and sharing connection weights between hidden neurons. The only downside of all these studies is that none of them attempted a truly systematic investigation of the relationship between network architecture and training set size on the generalization ability of ANNs. They are also silent on a host of other factors which could potentially affect the generalization ability of the networks, such as the learning rate, momentum, initial weights and learning algorithm. They merely focus on the training set size and network architecture without paying attention to the aforementioned factors. In our view this is a huge omission, as these factors could influence the way in which the network architecture and training set affect the generalization ability of the ANN. More insight into this subject, taking these and other factors into account, is a necessity.

3.4 Generalization: Other factors

There is no doubt that a huge portion of the research on the generalization ability of ANNs has concentrated mainly on the network architecture and the size of the training set. There is, however, an extensive body of literature reporting on a variety of other factors which also influence the ability of ANNs to generalize, namely: the size of the learning rate, the learning algorithm and the training schedule (Neeharika 1996; Richards 1991). During the course of our reviews it has emerged that only a handful of studies have adequately explored the effects these factors have on the generalization ability of ANNs. It is also worth pointing out that in most of these studies the authors do not provide a comprehensive set of results on the generalization ability of the ANNs. Results are usually focused on other aspects such as speed of training, computational cost and probability of successful convergence (the ability of the ANN to converge to specified error levels). In this section we review some of the most pertinent work and literature on the relationship between these additional factors and the generalization exhibited by an ANN.

3.4.1 Learning rate

The learning rate, also referred to as the step size parameter, determines how much the weights can change in response to an observed error on the training set. It is considered a key parameter for a successful ANN application because it controls the size of each step toward the minimum of the objective function (Attoh-Okine 1999). The learning rate is usually a value chosen between 0 and 1.
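To make the role of the step size concrete, the following Matlab fragment is a minimal illustrative sketch (not taken from the thesis code; the quadratic error surface and all values are assumed purely for illustration) of how the learning rate scales each weight update in plain gradient descent:

    % Minimal sketch: gradient descent on the toy error E(w) = (w - 3)^2,
    % showing how the learning rate eta scales each weight update.
    eta = 0.1;                    % learning rate (step size), chosen in (0, 1)
    w   = 0;                      % initial weight
    for k = 1:50
        grad = 2*(w - 3);         % dE/dw for E(w) = (w - 3)^2
        w    = w - eta*grad;      % step against the gradient, scaled by eta
    end
    fprintf('Final weight: %.4f (the minimum is at w = 3)\n', w);
    % A much larger eta (e.g. 1.1) overshoots and diverges, while a much
    % smaller one (e.g. 0.001) converges but needs far more iterations.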


A general consensus amongst ANN practitioners is that a learning rate that is too large often moves the weights too far in the correct direction, overshooting a valley or minimum in the error surface. This inevitably leads to longer training times, because the network continually overshoots its objective and unlearns what it has learned, leading to poor generalization ability. On the other hand, a learning rate that is too small also increases training time, as the ANN takes many more steps than necessary to reach the goal and is more likely to become trapped in a local minimum, or on a plateau of the error surface, again resulting in poor generalization ability (Conway 1998; Plaut et al. 1986). The best learning rate for good generalization is therefore far from obvious, and the selection of an appropriate learning rate has been a persistent problem for many ANN users. Studies pertaining to this aspect of ANN learning, though few, have presented mixed conclusions.

Wilson and Martinez (2001) conducted experiments on speech digit recognition. They conclude that a learning rate that is too large often hurts generalization accuracy and also slows down training. They further state that, once the learning rate is small enough, further reductions in size waste computational resources without any further improvement in generalization ability. Richards (1991) also did extensive studies on the relationship between the learning rate and the generalization ability of ANNs in phoneme recognition. The conclusion was that ANNs trained with higher learning rates tend to have better generalization ability than those trained with lower learning rates. However, the author states that this assertion holds true only for ANNs with a single hidden layer; when the hidden layers are doubled, the opposite is true. The author does not offer much explanation for this phenomenon except to suggest that it is an area that needs further insight. Another group of authors, Mohammad and Luis (2011), using different learning rates, tested the performance of ANNs on several time series which had previously been reported to yield poor results with ANNs. They conclude that high learning rates are good for less complex data and that low learning rates should be used for more complex time series data.

In light of the uncertainty surrounding the selection of the most appropriate learning rate for ideal generalization ability of ANNs, several authors have attempted to offer some heuristics. In practice these heuristics are frequently used as points of departure for a subsequent search by trial and error. Plaut et al. (1986) proposed that the learning rate should be inversely proportional to the fan-in of a neuron. This approach has been theoretically justified through an analysis of the eigenvalue distribution of the Hessian matrix of the objective function (Becker & Cun 1986). Reports from Swingler (1996) suggest that starting with a large learning rate of 0.75 and reducing it to 0.25, and then to 0.1 as the network starts to oscillate, is a good way of reaching the global minimum of the error, leading to higher generalization ability of ANNs. Accounts from Becker and Cun (1986) suggest that, for a given neuron, the learning rate should be inversely proportional to the square root of the number of synaptic connections made to that neuron for optimal generalization ability to be attained. Gonzalez (2000) reports that one- and two-layered networks with a learning rate of 0.2 and a momentum of 0.3 yield the best combination for good generalization ability; unfortunately, this study was only based on simulated data. Richards (1991) shows that for fixed parameters, almost optimal generalization requires the learning rate to go to zero at an appropriate rate; however, Katidiotis et al. (2000) contend that a more nearly constant learning rate might well be preferable if the parameters are time varying. Traditionally, learning rates remain fixed during training, but some authors have proposed dynamically adjusting the learning rate during training. One proposal comes from Jacobs (1998), who suggests that each weight should be given its own learning rate. The following rule is then applied to each weight before that weight is updated: if the direction in which the error decreases for this weight change is the same as the direction in which it has been decreasing recently, then the learning rate for that weight is increased; if not, it is decreased. The direction in which the error decreases is determined by the sign of the partial derivative of the objective function with respect to the weight. An alternative is to use an annealing schedule to gradually reduce a large learning rate to a smaller value (Vogl et al. 1988). This allows for large steps early in training and finer steps in the region of the minimum. However, Shavlik et al. (1991) discourage this approach by stating that these methods require the selection of parameters which determine the rate at which the learning rate is to be adjusted. They also state that the optimization of these parameters is highly problem-dependent and "… may lead to over adjustment of the weights, resulting in dramatic divergence".
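The sign-agreement rule described above can be sketched as follows; this is an illustrative Matlab fragment built on a toy quadratic error, with all constants assumed for illustration rather than taken from the cited scheme:

    % Sketch of a per-weight adaptive learning rate: each weight keeps its own
    % rate, increased while the gradient keeps its sign and cut when it flips.
    target   = [1; -2; 0.5];          % minimiser of the toy error 0.5*||w - target||^2
    w        = zeros(3, 1);
    eta      = 0.05*ones(3, 1);       % one learning rate per weight
    prevGrad = zeros(3, 1);
    for epoch = 1:200
        grad  = w - target;                    % gradient of the toy error
        agree = sign(grad) == sign(prevGrad);  % has the descent direction persisted?
        eta(agree)  = eta(agree) + 0.001;      % persistent direction: raise the rate
        eta(~agree) = eta(~agree)*0.5;         % sign flip: cut the rate
        w        = w - eta.*grad;
        prevGrad = grad;
    end
    disp(w.');                                 % approaches the target values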

It is evident that the debate on the most appropriate learning rate for optimal generalization ability of ANNs is largely ongoing. Analysis of the existing studies shows a lack of thorough probing on the part of researchers. For instance, there is a widely held perception that the momentum term, which smooths bumps in the error surface by incorporating a fraction of the previous weight change, is closely linked to the learning rate (Attoh-Okine 1999). If this assertion is even remotely true, as indeed shown by Attoh-Okine (1999) in predicting pavement conditions and by Tsai and Lee (2011) in a hand gesture recognition system, then any investigation of the learning rate cannot exclude the momentum term. None of the authors mentioned thus far has conducted fully systematic investigations in that direction. Although some heuristics on selecting the optimum combination of learning rate and momentum are found in the literature and presented in Table 3.5, most of them have not been universally accepted, as they are largely based on toy problems with no relevance to the real world.

Table 3.5: Heuristics for optimum learning rate and momentum term.

Learning rate, momentum    Author
0.01, 0.00005              Paola and Schowengerdt (1997)
0.05, 0.5                  Wilson and Martinez (2001)
0.1, 0.9                   Foddy et al. (1996)
0.5, 0.9                   Hara et al. (1994)
0.8, 0.2                   Staufer and Fischer (1997)
0.15, 0.075                Eberhart et al. (1990)
0.05, 0.5                  Yates et al. (1996)
0.2, 0.6                   Govy et al. (1996)
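For illustration, a combination such as those in Table 3.5 would be set in the Matlab Neural Network Toolbox roughly as in the sketch below; this uses the classic newff syntax, and the toy data, layer sizes and the 0.2/0.6 pair are assumptions rather than recommendations of this thesis:

    % Sketch: configuring a learning rate / momentum pair for gradient descent
    % with momentum (traingdm) in the Neural Network Toolbox.
    p = rand(3, 100);                    % toy inputs: 3 variables, 100 patterns
    t = sum(p) + 0.05*randn(1, 100);     % toy targets
    net = newff(minmax(p), [10 1], {'logsig', 'purelin'}, 'traingdm');
    net.trainParam.lr     = 0.2;         % learning rate
    net.trainParam.mc     = 0.6;         % momentum constant
    net.trainParam.epochs = 500;
    net = train(net, p, t);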

At present we are not aware of any major investigations into the effects of the learning rate on the generalization ability of ANNs that take into account the possible influence of a momentum term. Most data used in previous research were simulated, making it difficult to generalize the conclusions of these studies. Furthermore, these studies tend to focus on network convergence without paying much attention to generalization ability, which is an equally important aspect of ANN learning. The only rigorous study we are aware of which has come close to addressing this issue is that conducted by Maier and Dandy (1999), who experimented with various learning rate and momentum combinations in forecasting salinity in the Riviere River, Australia. They empirically found that a learning rate of 0.6 and a momentum of 0.2 ensured reasonably good convergence and generalization ability of their networks. Richards (1991) underscored a critical observation on how the sensitivity of networks to learning rates largely depends on the architecture of the networks: he found that multi-hidden-layer networks are more sensitive to the learning rate than single-hidden-layer networks, although he investigated only two learning rates, 0.1 and 0.01. This is an area which, if fully investigated, has the potential for significant strides insofar as the generalization ability of ANNs is concerned. Continued research on this crucial aspect of ANN design therefore remains a pressing necessity.

3.4.2 Learning algorithm

The choice of learning algorithm has a significant impact on the performance of ANNs (Shavlik et al. 1991). The Backpropagation (BP) algorithm has been widely used and well investigated, and is one of the most popular classes of learning algorithm for ANNs. Its effectiveness and flaws have been well documented in various publications, for example an online BP learning algorithm for time-varying inputs (Mahmoud et al. 2007), an adaptive learning algorithm with reduced complexity (Wang & Zhou 2005), a fast-learning algorithm based on the gradient descent of neuron space (Mencer & Parisi 1992), and a general BP learning algorithm for FFMLPs (Hara et al. 1994). For a detailed review of the mathematics behind the BP algorithm the reader is advised to consult Section 2.4.

Many variations of the BP learning algorithm are provided in the Matlab Neural Network Toolbox and other neural network software. These include batch gradient descent (traingd), batch gradient descent with momentum (traingdm), variable learning rate backpropagation (traingda), resilient backpropagation (trainrp), conjugate gradient methods (traincgf, traincgp, traincgb, trainscg), quasi-Newton methods (trainbfg, trainoss) and Levenberg-Marquardt (trainlm). Although the BP algorithm has been a significant milestone in ANN research, it is known as an algorithm with a very poor convergence rate, which increases the chances of being trapped in local minima (Lahmiri 2011). Many attempts have been made to speed up the BP algorithm. Commonly known heuristic approaches such as the momentum term (Jacobs 1998), a variable learning rate (Hagan & Menhaj 1994), or stochastic learning (Barry 2000), have only led to slight improvements in that regard. Although many of the studies on BP algorithms have mainly focused on optimizing the speed of convergence, a select few have gone further to study how the generalization exhibited by an ANN responds to a change in learning algorithm. One notable example is Neeharika (1996). He did extensive studies on the generalization ability of ANNs for pattern recognition tasks; he states that, due to the slow convergence rate of the BP algorithm, it was difficult to train the ANN on a large number of training examples as the time required to train the network became very large. He concludes that this had a huge impact on the generalization ability of the ANNs, and suggests methods of accelerating the learning algorithm in an effort to train the network with a large number of examples in finite time. Also, Richards (1991), while studying the generalization ability of ANNs, made a very profound contribution to the study of Connectionist generalization. After performing extensive comparative investigations on the generalization ability of two ANNs trained with the same network architecture and on similar training set sizes, one using the Gradient Descent algorithm and the other the Conjugate Gradient learning algorithm, he concludes, "Different training algorithms may facilitate or hinder the development of generalization."
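As an illustration of how such comparisons can be set up, the sketch below trains the same toy problem with several of the toolbox training functions listed above and reports the training RMSE of each; the data, network size and epoch budget are assumptions, not the experimental settings of this research:

    % Sketch: comparing several Backpropagation variants on identical data.
    p = rand(3, 200);  t = sum(p) + 0.05*randn(1, 200);   % toy data
    algs = {'traingd', 'traingdm', 'trainrp', 'trainscg', 'trainlm'};
    for i = 1:numel(algs)
        net = newff(minmax(p), [10 1], {'logsig', 'purelin'}, algs{i});
        net.trainParam.epochs = 300;
        net.trainParam.show   = NaN;                      % suppress progress output
        net = train(net, p, t);
        y   = sim(net, p);
        fprintf('%-9s  training RMSE = %.4f\n', algs{i}, sqrt(mean((y - t).^2)));
    end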

Another interesting set of results is reported by Lahmiri (2011) from a comparative study of BP algorithms in financial prediction. The findings indicate that the BFGS and Levenberg-Marquardt (trainlm) algorithms provide the best generalization accuracy according to the RMSE. Also, Mahmoud et al. (2007) performed a comparative investigation into the effect of using different learning algorithms on the performance of BP-trained ANNs in image coding and compression. Based on the obtained results, the Gradient Descent algorithm took less time during training than the Conjugate Gradient and quasi-Newton methods. However, quasi-Newton performed better in terms of generalization ability, such that the image compressed by the quasi-Newton network had a higher quality. They also observed that, in general, the one-hidden-layer network had better generalization performance than the three-hidden-layer network. Piotrowski and Napiorkowski (2011) compared the Levenberg-Marquardt and Conjugate Gradient algorithms for stream-flow forecasting and the determination of lateral stress in cohesionless soils. They found that the Levenberg-Marquardt algorithm was faster and achieved better convergence and generalization performance than the other algorithm during training. Esugasini et al. (2005) considered the problem of breast cancer diagnosis and compared the generalization ability of the standard Steepest Descent algorithm against that of Gradient Descent with momentum and adaptive learning, resilient BP, quasi-Newton and the Levenberg-Marquardt algorithm. The simulations show that the ANN using the Levenberg-Marquardt algorithm achieved the best generalization performance.

In their research, Ahmad et al. (2008) employed three ANNs with different algorithms to the problem of intrusion detection in computer and network systems. The learning algorithms considered by the authors were the Standard, the Batch, and the Resilient BP algorithms. They conclude that the Resilient algorithm had a better generalization performance. Finally, Lahmiri (2011) compared the performance of the standard BP and Levenberg-Marquardt algorithm in the prediction of a radio network planning tool. They found that the standard BP algorithm achieved the minimum error and outperformed the Levenberg-Marquardt algorithm in terms of speed of convergence and generalization accuracy.

In the context of TCP/IP network traffic forecasting, BP is indeed the most widely employed algorithm for training ANNs (Chabaa 2010; Nguyen & Chan 2004). However, among the studies conducted thus far, comparative research on the generalization ability of the various BP learning algorithms is conspicuously absent. Nor has special mention been made of the accuracy and the relative learning times of the different algorithms on novel examples. More insight is therefore needed to address these concerns and to shed more light on the relative merits of the different approaches to Backpropagation learning.

3.4.3 Training schedule

The effects of overtraining on ANNs have drawn significant attention, with sometimes apparently conflicting results, even about the mere presence or absence of the effect. An operational definition of overtraining is that the out-of-sample error starts to increase with training time after having gone through a minimum (Piotrowski & Napiorkowski 2013). A standard interpretation of this is that the ANN at this point starts to pick up idiosyncrasies of the training set that do not generalize to the test set. The overtraining phenomenon is well documented in the ANN literature (Badri 2010), and its effects on the generalization ability of ANNs have been receiving growing attention from ANN practitioners. Pissarenko (2002) states that a major objective of training in ANNs is to find a set of weights between the neurons that determines the global minimum of the error function. He insists that, unless the model is overtrained, this set of weights should provide good generalization. Zhang et al. (1998) further affirm these observations by stating, "A network is overtrained when it conforms exactly to the training data rather than generalizing it. This produces an optimal network with very low approximation errors, but only for the training dataset as when data with subtle differences are introduced to such a network, the errors start to dramatically increase". Morgan and Bourlard (1989) performed an empirical study of the relationship between the number of weights in an ANN and the ability of the network to generalize well to new examples. They used both simulated data and speech data to train the network. They conclude, "While both studies show the expected effects of overtraining, which are poor generalization and sensitivity to overtraining in the presence of noise, perhaps the most significant result is that it was possible to greatly reduce the sensitivity to the choice of network size by directly observing the network performance on an independent test set during the course of learning".

Although some researchers agree that overtraining is synonymous with bad generalization, not all of them are of that mind. One such example is Richards (1991). He studied the generalization ability of ANNs in phoneme recognition. He trained his two-hidden-layer networks with a learning rate of 0.01 for a hefty 128000 epochs, and at the end of his exploration he concludes, "Contrary to the results obtained by Bourlard my networks did not seem to exhibit any loss in generalization due to extensive training". Such a statement is evidence of the conflicting sentiments shared by the various researchers on this subject and a clarion call for continued research on this issue.

Closely linked to the overtraining phenomenon is the training schedule. It is a crucial element of ANN training as it determines the level of training to which a particular ANN model is subjected. In fact, one of the major difficulties in the use of ANNs is determining the appropriate training schedule that will ensure the learning process is terminated at a point where good generalization ability is achieved. This is an important yet relatively unexplored area. Several suggestions have been made by various authors to help determine the actual point at which the learning process should be halted. These suggestions largely fall into three categories: (1) a user-defined error level, (2) early stopping based on a validation set, and (3) a fixed number of iterations. Which of these approaches is more effective as far as the generalization ability of ANNs is concerned is largely subjective. ANN practitioners have raised contrasting opinions with regard to each of these techniques. Some researchers, such as Wilson and Martinez (2001), suggest the first approach is suitable for extremely large datasets only, whilst others, such as Rasmussen (1993), suggest the second approach may halt the BP algorithm before its full learning capability is attained. On the other hand, Lahmiri (2011) suggests the third approach limits the potential of the BP algorithm to learn further and to reduce its error rate. Although each of these perspectives is credible and admissible of further investigation, for the purposes of this research we focus on the third approach. We briefly review some of the pertinent literature on this subject. For literature on the other two stopping criteria, the reader is advised to consult Mitchell (1997).

3.4.3.1 Fixed iteration

One of the most important issues in ANN learning is the adjustment of weights. The number of training iterations (or epochs) is a core parameter in weight adjustment (Abbas et al. 2013). An epoch is defined here as one weight update or training iteration. The epoch size used can be equal to the entire training set or to a subset of the training set, usually selected randomly (Richards 1991). For each epoch, the BP learning algorithm builds a different model, i.e. an ANN with a different set of weights. If an ANN is trained for up to 1000 epochs, the learning algorithm has to move through 1000 different models to reach an optimal solution. The choice of this parameter is therefore extremely important for ANN training. If the number of training iterations is set too low, the ANN may not reach a sufficient level of learning, resulting in underfitting, whilst too many iterations can cause the ANN to overfit the data; both cases lead to reduced generalization ability (Paola & Schowengerdt 1997).
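In the Matlab toolbox, the fixed-iteration schedule studied in this research is expressed simply by capping the epoch count, as in the hedged sketch below; the toy data and parameter values are illustrative only:

    % Sketch: fixed-iteration stopping -- training halts once the epoch budget
    % is exhausted (unless another built-in criterion fires first).
    p = rand(3, 100);  t = sum(p) + 0.05*randn(1, 100);   % toy data
    net = newff(minmax(p), [10 1], {'logsig', 'purelin'}, 'trainlm');
    net.trainParam.epochs = 1000;       % fixed iteration budget
    net.trainParam.goal   = 0;          % an unreachable goal, so the epoch limit decides
    [net, tr] = train(net, p, t);
    plot(tr.epoch, tr.perf);            % training error over the fixed schedule
    xlabel('Epoch');  ylabel('Performance (MSE)');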

A common method of stopping the training process of an ANN is to train the network for a pre-defined number of iterations, in the hope that the network will have reached the global minimum of the error. Zhang et al. (1999) employed this strategy to terminate the training process and recorded good performance and accuracy for their ANNs. Kaastra and Boyd (1996) did extensive research in forecasting financial markets. They state that it is difficult to determine a general value for the maximum number of training iterations required for optimal generalization ability, because training is affected by many parameters such as the choice of learning rate, momentum values, and proprietary improvements to the BP algorithm, which vary from study to study.


Some of the ANN studies that mention the number of training iterations report reaching maximum generalization and convergence for their ANNs after between 85 and 5000 training epochs. This range, however, can be as wide as 50000 to 191400 training iterations, resulting in training times as long as 60 hours (Klimasauskas 1993).

Balestrassi et al. (2009) state that the only appeal of this method as a stopping criterion is its simplicity, as it does not use any information or feedback from the system before or during training. They conclude, "When the number of iterations is capped at a predefined value, there is no guarantee that the learning machine has found coefficients that are close to the optimal values." Also, Maier and Dandy (1998), during their investigations into the effects of internal parameters in forecasting salinity in the Riviere River, Australia, draw three conclusions: (1) there was no significant difference between the best forecasts obtained when different epoch sizes were used; (2) learning speed was much greater for the ANNs with smaller epoch sizes; and (3) there was a steady decrease in the RMSE prediction error for all ANNs until a local minimum 'plateau' in the error surface was reached. Maier and Dandy (1998) give an overall conclusion by stating quite succinctly, "For the data set used, there was no advantage in using larger epoch sizes. The number of normalized weight updates required to obtain the best result was independent of epoch size. As the time taken to perform one weight update increases with increasing epoch size, learning is much faster with smaller epoch sizes. The generalization ability of the network was found to be unaffected by epoch size." However, Tsai and Lee (2011), in their study of parameter effects on the performance of an ANN for a hand gesture recognition system, state that "…epoch size affected the learning ability of the network". These findings were in contrast to those of Maier and Dandy (1998) and perhaps lend weight to the claim made by Kaastra and Boyd (1996) that the selection of an appropriate number of training epochs depends largely on the problem. Tsai and Lee (2011) conclude by stating "…..when the epoch was in the 1000 to 3500 range, training set error was minimal. However, to avoid over fitting, the number of the epochs was limited to 3000." Further to that, Lodwich et al. (2009), in their experiments on an analysis of stopping criteria in ANNs, state that using a maximum of 600 epochs deprived their network of sufficient training: "…. However, the total absolute error on the testing set appeared to decrease slightly even after epoch 600." They suggest starting off with a large number of epochs and gradually reducing the number until the true minimum RMSE on the test set is achieved.

The number of training iterations is a common phrase in the ANN vocabulary, but little effort has been devoted to a thorough analysis of the influence this design factor has on the efficient training of ANNs. Most studies on the number of training iterations have focused mainly on the convergence rate, with little mention of the generalization ability of the ANNs, which is a crucial part of a successful ANN implementation. This could effectively be one of the areas overlooked by many researchers which has the potential to steer the discourse on the generalization ability of ANNs in an altogether different trajectory.

3.5 Conclusion

This chapter has discussed the major literature as far as studies pertaining to the generalization ability of ANNs are concerned. While not exhaustive, it has reviewed and acknowledged a number of current case studies intended to resolve the fundamental question of the generalization ability of ANNs. The main purpose of this chapter was to provide both the theoretical knowledge available to date, by referring to recent studies, and the heuristics developed by researchers as a result of their experience. Some of the studies conducted were thorough, but some were lacking in many respects. Of note are the experiments conducted by Martin and Pittmann (1990) in a clear and articulate attempt to explore the appropriate network size for optimal generalization ability of ANNs. They used real world data and empirically manipulated the network architecture in various ways. Yet others, such as Jain (1996), in an attempt to solve a similar problem, resorted to the use of purely mathematically-based simulations without any real applications. Needless to say, a similar pattern was noted for most of the case studies reviewed on each of the five factors whose relationship to the generalization ability of ANNs this research investigates. The literature on most of the case studies reviewed indicates a contrast of opinions and findings on the part of the researchers concerned. However, it is vital to note that these projects have been done in different countries under different experimental conditions, hence the diversity of results. One striking omission in the literature is studies pertaining to the generalization ability of ANNs in time series forecasting of TCP/IP network traffic trends. If the forecasting of TCP/IP network traffic is to be optimized by any standard, it is imperative that studies on the generalization ability of ANNs be directed towards that end. Perhaps the most important conclusion gathered in this chapter is the fact that, although several heuristics exist for determining the various parameters governing the generalization ability of ANNs, most of these have not been universally accepted and/or are largely based on toy problems with no real significance in the real world. Hence, to date, for a given problem the ultimate method of determining appropriate ANN learning parameters remains that of trial and error. Chapter 4 presents the methodology and data techniques of the research.


Chapter 4: Methodology and Data Techniques

This chapter discusses the methods and data techniques used in this study for investigating the generalization ability of ANNs in forecasting TCP/IP network traffic trends. It covers the critical decisions and techniques employed in this evaluation. The chapter is divided into 7 sections. Each section describes an aspect of the ANN model development process, highlighting the choices made during development. Section 4.1 describes the case study under consideration. Section 4.2 describes the software and computer specifications used for performing the investigations. Section 4.3 gives an account of the data and its composition including the source. Section 4.4 describes various techniques employed for pre-processing the data prior to conducting the experiments. Section 4.5 gives a description of the various experiments conducted during ANN training, as well as the experimental strategic and tactical decisions highlighting the rationale behind them. Section 4.6 relates the measures of performance and the various evaluation criteria used to judge the performance of the ANN models in our research, followed by a conclusion and summary of the chapter in Section 4.7.

4.1 The case study

Prior to discussing the materials, methods and experimental procedures related to our research, we first turn to how the most appropriate problem domain for our investigations was selected. To rigorously conduct our investigations by manipulating the structure of an ANN in various ways and observing the resultant effect upon generalization, we needed a domain in which meaningful variations in ANN structure could be explored. An easy route would have been to develop an artificial domain in which to carry out our investigations. To do so would have required making assumptions about the underlying structures of that domain; moreover, using an artificial domain could well have exposed our research to several objections. The domain could have been dismissed as a "toy" domain, criticism could have been raised concerning the underlying justifications for the task requirements, and it could have been claimed that the results of such an investigation simply might not scale up to larger ANNs using larger representations of collected experimental data. Using a real domain for our investigations was therefore the best alternative, as it avoids these objections. In addition, it eliminated the time and effort involved in attempting to develop an appropriate artificial domain: such time and effort was better spent investigating the question that forms the focus of our research. We had to look for a real domain which would be easy to manipulate and well formulated to address our research objectives. The forecasting of TCP/IP network traffic at UFH was the most viable option, for several reasons. Firstly, the ease of accessibility of the data was quite crucial: apart from the network security concerns raised by the Systems Administrator, TCP/IP network traffic data was readily available for collection. Secondly, we needed a domain in which the task carried out, as well as the representations appropriate to the domain, would assist in the development of alternative network architectures appropriate for use in our investigations. We also needed a domain which would permit more than one quality or level of information content of the data, to allow several investigations to be conducted. TCP/IP network traffic forecasting satisfied both these requirements, hence it was chosen as the problem domain for this research. To date, network administrators at the University of Fort Hare (UFH) have had to contend with a great deal of unexpected traffic congestion, relying simply on intuition to predict it. Through the efficient forecasting of TCP/IP network traffic, the task of traffic engineering at UFH could be handled in a more economical manner.

4.2 The software

The software used for the purposes of this research is Matlab Version 7.4.0.287 (R2007a). Matlab is an application software package and programming language with interfaces to Java, C/C++ and FORTRAN. It has many toolboxes dedicated to areas such as aerospace technology and bioinformatics. In this study, Matlab provides an environment for creating programs with built-in functions for performance metrics and forecasting using its Neural Network Toolbox Version 5.0.2 (R2007a). The toolbox provides comprehensive support for many proven paradigms, as well as Graphical User Interfaces (GUIs) that enable one to design and manage ANNs. The modular, open and extensible design of the toolbox simplifies the creation of customized functions and networks. Some of its features include:

• A comprehensive set of training and learning functions.

• A modular network representation that enables an unlimited number of input sets, layers and network interconnections, and a graphical view of network training architectures.

• Pre-processing and post-processing functions and Simulink blocks for improving ANN training and assessing ANN performance.


• Visualization functions and GUIs for viewing ANN performance and monitoring the training process.

The computer used to conduct this research is an Intel(R) Core(TM) [email protected]. Some of the programs (M-files) used in this study can be found in Appendix A, together with a summary of what each piece of code is intended to achieve. The codes were created to assist in the importation of the data into Matlab and the pre-/post-processing of the time series; some create the ANN inputs and their respective targets, some carry out training, and others deal specifically with evaluating the performance of the ANNs.

4.3 Data collection

Data collection is a significant part of designing an ANN. Kaastra and Boyd (1996) made the following observations on the factors that must be considered when collecting data for use in ANN training:

• Time and cost – the data must be collected within the shortest time possible and at a reasonable cost.
• Data significance – the data should comprise only those factors which might influence the predicted value the most.

• The source of data must be reputable and its databases must be constantly updated.

In line with these guidelines, the data for this research were collected from the South African Tertiary Institutions Network (TENET) website (www.TENET.ac.za). TENET is a major academic network backbone which utilizes the Multi Router Traffic Grapher (MRTG) software to monitor traffic load on network links from various institutions in South Africa. MRTG is software that quantifies the traffic passing through every network interface with reasonable accuracy. It is widely deployed by every Internet Service Provider (ISP) in South Africa, such that the collection of these data did not induce any extra traffic on the network. MRTG also generates HTML pages containing PNG images which provide a live visual representation of network traffic. Snapshots of this software are available in Appendix B. This study analysed network traffic data comprising inbound data (in bits/sec) from the University of Fort Hare VC Alice Boardroom 101 – Fa 0/1 router. The data spanned the period from 02:00 hours on the 1st of March 2010 to 02:00 hours on the 21st of September 2012 in daily intervals, equating to 730 observations.


Matlab code (see Appendix A) was written to compute the mean, standard deviation and median of the time series data, which are vital summary statistics. Table 4.1 shows the respective values.

Table 4.1: Summary statistics of the traffic time series.

Statistic              Value
Mean                   9.9e+003
Standard Deviation     3.3e+006
Median                 9.5e+005

The data were downloaded in Comma Separated Values (CSV) format and subsequently converted to MS Excel format, which is compatible with the Matlab software. Code (see Appendix A) was written to import the data into Matlab from MS Excel.
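A minimal sketch of this import and of the summary statistics in Table 4.1 is shown below; the file name and single-column layout are assumptions, since the actual M-files reside in Appendix A:

    % Sketch: importing the Excel-converted traffic series and computing the
    % statistics reported in Table 4.1.
    traffic = xlsread('traffic_inbound.xls');   % hypothetical file name; inbound bits/sec
    traffic = traffic(:);                       % force a column vector
    fprintf('Mean:               %g\n', mean(traffic));
    fprintf('Standard deviation: %g\n', std(traffic));
    fprintf('Median:             %g\n', median(traffic));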

4.4 Data pre-processing

As in most ANN applications, data pre-processing is a crucial step for achieving good prediction performance when applying ANNs to forecasting tasks (Atiya & Ji 1997). The input and target variables are rarely fed into the ANN model in raw form. At the very least, the data must be scaled between the upper and lower bounds of the activation function (Shavlik et al. 1991). According to Kaastra and Boyd (1996), pre-processing the data before it is used in training the ANN is important for three reasons:

• Reducing the dimensionality of the data is crucial; high dimensional data is difficult to work with, and more features increase the likelihood of noise, which prevents a good approximation of the underlying function by the ANN. A reduction of data dimensions leads to an efficient pruning of the data space.
• It enables the relevant parts of the data to be identified and isolated from the irrelevant features so that they can be amplified.
• It improves the efficiency of ANN training, giving rise to faster training times and more precise solutions.

Due to the self-similar and cyclical nature of TCP/IP network traffic, a number of pre-processing methods and algorithms have been adopted by various authors. Some of these methods include Fourier Series Analysis (Krzyovt 2008), Principal Component Analysis (Baldi & Hornik 1989) and Tukey's 53H Procedure (Ahmad & Tesauro 1998). Following recommendations from Kaastra and Boyd (1996), who have done extensive work in designing ANN models for forecasting purposes, we performed the following pre- and post-processing techniques: linear interpolation, identification of seasonal components and normalization of the data.

4.4.1 Linear interpolation

As in all practical applications, the data suffered from several deficiencies which needed to be remedied before training. During the transmission of data through an unreliable network such as the Internet, packets can be corrupted or prevented from reaching their destinations, causing many problems. Most notable is the issue of missing or null values, of which our dataset contained 7 such recordings. Various heuristics have been proposed for dealing with missing data in regression and forecasting problems. It is common simply to discard the patterns with missing values from the dataset; however, in our case this approach would have been highly precarious considering that we had roughly 100 weeks of observations, which by normal standards is a slightly limited amount of data for training an ANN (Nguyen & Chan 2004). Discarding any values would effectively have modified the data distribution, which would have had a detrimental effect on training. It was therefore necessary to make full use of the information which was potentially available from the incomplete values. It is also worth mentioning that within this domain it is difficult to collect many months of data, since network servers often reboot or pass through update maintenance changes.

Another approach to dealing with missing data suggested by Hong et al. (2010) is to fill in the missing values first and then train the ANN using some standard method. Each missing value is typically replaced by the mean of the corresponding variable over those patterns for which its value is available. Such an approach can however lead to poor results as reported by Nakamura (2005) and Tan et al. (2012).

A more elaborate technique favoured by the majority of authors is linear interpolation (Liyong et al. 2007; Wagner & Gross 1996). According to David and MacKay (2007), linear interpolation is fairly robust against departures from the normality assumption. It expresses any variable which has missing values in terms of a regression over the other variables using the available data. The regression function is then used to fill in the missing values. Linear interpolation between two known points is defined by the formula:

y = y0 + (y1 - y0)(x - x0) / (x1 - x0)        (4.1)

where y is the y co-ordinate of the point being estimated on the straight line joining the known points (x0, y0) and (x1, y1), and x is the corresponding x co-ordinate. We performed linear interpolation on the data to fill in the missing or null values. Code for this procedure is available in Appendix A.
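A sketch of this interpolation step using Matlab's interp1 function is given below; the variable name and the stand-in series are assumptions, with the missing recordings represented by NaN entries:

    % Sketch: filling missing (NaN) observations by linear interpolation.
    traffic = [10 12 NaN 15 NaN 18 20]';            % stand-in series with gaps
    idx     = (1:numel(traffic))';                  % time index of each observation
    missing = isnan(traffic);                       % locate the null values
    traffic(missing) = interp1(idx(~missing), traffic(~missing), idx(missing), 'linear');
    % Interior gaps are filled on the straight line between their neighbours,
    % which is equation (4.1) applied point by point.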

4.4.2 Identification of periodic components

In most instances TCP/IP traffic data comprise two periodic components, namely seasonal and cyclical periods (Zhang & Kline 2007). The seasonal period, often referred to as seasonality, is a characteristic of a time series in which the data experience regular and predictable changes recurring every calendar year. Seasonal effects are different from cyclical effects, which can span time periods shorter or longer than one calendar year (Zhang & Kline 2007). According to Cortez et al. (2010), an effective method of identifying periodic components in a dataset is a simple plot of the time series. A plot of the time series (see Appendix C) was produced in order to get a visual indication of any seasonal effects, trends and changes in variance, and to enable the detection of outliers. The code used for plotting the time series is available in Appendix A. Visual inspection of the plot indicates a highly seasonal time series, with visible monthly, weekly and daily patterns. There is also a yearly pattern that is difficult to model due to the lack of sufficient data. The time series has some apparent outliers as well: unusual traffic behaviour was observed in the last 1.5 weeks of the year (around time observation 250), and the last two weeks of the year (time observations 247-252) show aberrant behaviour. This could well be associated with the effect of the Christmas and New Year's Eve holidays, with few students being resident on campus. Also apparent is a progressive increase in network traffic over the period in question, although the trend is not clearly linear. To remove this component and enhance smoothness, a constant de-trend was run on the time series. Matlab code for this process is also available in Appendix A.
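The inspection and de-trending steps can be sketched as follows; the variable name and the stand-in series replace the interpolated data held by the Appendix A code:

    % Sketch: visual inspection of the series followed by a constant de-trend.
    traffic = cumsum(randn(730, 1)) + 50;        % stand-in for the interpolated series
    figure;
    plot(traffic);                               % check for seasonality, trend, outliers
    xlabel('Observation (days)');  ylabel('Inbound traffic (bits/sec)');
    trafficDT = detrend(traffic, 'constant');    % remove the constant (mean) component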

4.4.3 Network inputs and targets

One of the most challenging tasks in developing a forecasting model is to determine appropriate model inputs, beyond which values of the input time series have no significant effect on the output time series (Lange et al. 1997). For the ANN to be used as a tool for prediction, input-output patterns should be constructed such that ANN training reflects the relationship existing between the input and output parameters. When developing models for forecasting, the order of the inputs can be determined using empirical and/or analytical approaches (Maier & Dandy 1998). In the empirical approach, a priori knowledge about the processes that generated the time series is used to determine the lags of the inputs. The aim of the analytical procedures is to determine the strength of the relationship between a series and itself at various lags. The lags that indicate strong correspondence between the series and itself can then be used as an indicator of the important inputs for the model (Maier & Dandy 1998). However, analytical approaches are generally not used to determine the inputs for ANN models. The main reason for this is that ANNs belong to the class of data-driven approaches, whereas conventional statistical methods are model-driven (Piotrowski & Napiorkowski 2013). In model-driven approaches, the structure of the model has to be determined first, which is done with the aid of empirical or analytical approaches before the unknown model parameters can be estimated. Data-driven approaches, on the other hand, have the ability to determine which model inputs are critical, so there is no need for a priori rationalization about relationships between variables. However, presenting a large number of inputs to ANN models and relying on the network to determine the critical model inputs usually has a number of disadvantages, such as increasing training time, increasing the amount of data required to efficiently estimate the connection weights, and increasing the number of local minima in the error surface, which makes it more difficult to obtain the near-optimal combination of weights for the problem under consideration (Piotrowski & Napiorkowski 2013). Consequently, there are distinct advantages in using an analytical technique to help determine the inputs for ANN models.

Research suggests two different methodologies for selecting the ANN inputs and targets. The first is the Haugh and Box method, which determines the inputs to the ANN model using correlation analysis to establish the strength of the relationship between the input time series and the output time series at various lags (Maier & Dandy 1997). However, when this analysis is used, the time series needs to be pre-whitened to obtain the true correlation relationship (Zhang 2007). A second approach, which we adopted for this work, uses stepwise linear regression to identify the significant time lags in a time series and uses those as the input vectors (Zhang 2007). Under this setup a stepwise model is fitted to the data and the significant time lags are used as inputs to the ANN. For each ANN model, a maximum lag k (the time window) is chosen, and values at lags 1, 2, 3, …, k are used as model inputs. If a priori knowledge about the relationship between the input and output time series is available, k is chosen so that lags of the input time series that exceed k are not suspected to have any significant effect on the output time series. If no a priori knowledge is available, k has to be chosen empirically (Maier & Dandy 1998).

A sliding time window of size 150 was empirically chosen as the input for all the ANN models in this research. This value was selected based on a series of preliminary experiments conducted with Professor Mike Burton at Rhodes University using a similar time series dataset, in which a single hidden layer ANN of 90 hidden neurons was trained while experimentally varying the size of the sliding time window. The sliding time window sizes considered were 20, 40, 60, 80, 100, 150, 170 and 200. For these experiments, the learning rate and momentum were set at zero, the size of the training set was set at 620 samples and the test set at 200 samples. The Logistic sigmoid and Linear activation functions were used in the hidden and output layers respectively, whilst the Levenberg-Marquardt Backpropagation algorithm was the learning algorithm. Training was stopped after 1000 iterations, and a plot of the activations (predicted values) versus the targets (actual values) on the train and test sets was produced. The best-performing sliding time window would result in the activations trailing the targets reasonably well in both the train and test sets. After extensive experiments, a sliding time window of size 150 yielded the best results. Appendix F shows the results of those preliminary investigations. The code for conducting the training of the ANN is available in Appendix A. Since we did not have any significant knowledge of the underlying process which generated the time series pattern, a supervising script (see Appendix A) was written which organized the ANN input patterns to be the columns of a matrix p and the corresponding targets to be the columns of a matrix t. The supervising script performs regression by varying the window length and keeps a record of the networks and their success, measured by correlation. Once it has done that, it converts the network inputs from linear to cyclic. Each day number dn in the time series was replaced by a suitable pair (dnc, dns), where

dnc = cos(2π dn / 365)        (4.3)

dns = sin(2π dn / 365)        (4.4)

thereby placing the days on the parameter of a circle whose circumference is of length 365, ensuring that the inputs become cyclic rather than linear for proper training of the ANNs. This approach has been used in the literature with promising results (Uwahuro 2008).
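The construction of the windowed input/target matrices and the cyclic day-number encoding of equations (4.3)-(4.4) can be sketched as below; the stand-in series and the matrix names p and t mirror the description above, while the details of the actual supervising script remain those of Appendix A:

    % Sketch: sliding-window inputs/targets plus the cyclic day-number encoding.
    trafficDT = randn(730, 1);                   % stand-in for the de-trended series
    win = 150;  n = numel(trafficDT);
    p = zeros(win, n - win);  t = zeros(1, n - win);
    for j = 1:(n - win)
        p(:, j) = trafficDT(j : j + win - 1);    % one window of past values per column
        t(j)    = trafficDT(j + win);            % the value to be forecast
    end
    dn  = (1:n)';                                % day number of each observation
    dnc = cos(2*pi*dn/365);                      % equation (4.3)
    dns = sin(2*pi*dn/365);                      % equation (4.4)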

4.4.4 Normalization

Due to the volatility of the time series under study, the measured values fluctuated sharply over short periods, and it was therefore necessary to scale the inputs and targets to a particular range. Normalization brings the range of the data within the limits of the ANN (Sola & Sevilla 1997). In the Matlab environment it is advised that the data be normalized so that the ANN can perform significantly better (Hudson et al. 2014). The literature reports four methods of normalization (Ahmad & Tesauro 1998):

• Along channel normalization: a channel is defined as the set of elements in the same position over all input vectors in the training or test set. Each channel can be thought of as an independent input variable. Along channel normalization is performed column by column if the input vectors are put into a matrix; in other words, it normalizes each input variable.
• Across channel normalization: this type of normalization is performed for each input vector independently, i.e. normalization is across all the elements in a data pattern.
• Mixed channel normalization: this method uses some combination of Along and Across channel normalization.
• External normalization: all the training data are normalized into a specific range. It is defined by the formula:

xn = x / Δmax        (4.5)

where xn and x represent the normalized and original data respectively, and Δmax is the maximum deviation along the columns or rows. Before using the data for ANN training, External normalization of the ANN inputs and targets was carried out to keep the connection weights from becoming too large and swamping the network during training. The Matlab Neural Network Toolbox has a built-in function, mapminmax, to carry out this operation. It automatically scales the data down to a fixed range (by default [-1, 1]) before processing and then applies the inverse mapping after processing, so that the outputs are returned on the original scale.
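A hedged sketch of this scaling step with mapminmax, and of reversing it on network outputs, is shown below; the stand-in matrices replace those built in Section 4.4.3:

    % Sketch: scaling inputs and targets with mapminmax and undoing the scaling.
    p = rand(150, 580);  t = rand(1, 580);      % stand-in input/target matrices
    [pn, ps] = mapminmax(p);                    % each input row scaled to [-1, 1]
    [tn, ts] = mapminmax(t);                    % targets scaled the same way
    % ... the network is trained on pn and tn ...
    % y  = sim(net, pn);                        % outputs in normalized units
    % yo = mapminmax('reverse', y, ts);         % back to the original bits/sec scale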


The partitioning of the time series data into training and testing sets is typically done at this stage. The columns of the input pattern matrix, p, and the target matrix, t, need to be partitioned into a training set, ptrain and ttrain, and a test set, ptest and ttest. The training set error is used for computing the gradient and updating the network weights and biases, while the test set is used to verify the prediction performance. The test error is representative of the error which we can expect from absolutely new data for the same problem (David & MacKay 2007). In certain circumstances, usually where the dataset is very large, a third set called the validation set is used. The validation set comprises samples used to find the best ANN configuration and training parameters; however, in many practical situations the available data are small enough to be solely devoted to model training and testing, and collecting additional data for validation is therefore difficult. The decision on how to partition the data is normally left to Matlab. The default data division function in Matlab is dividerand, which randomly partitions the data into 60% for training, 20% for testing and 20% for validation. This can, however, be changed by specifying the index data division function, divideind.

In an attempt to find the optimal proportion of the data to use for training, testing and validation, Atiya and Ji (1997) investigated the impact of the proportion of data used in the various subsets on ANN performance for a case study of settlement prediction of shallow foundations, and found that there is no clear relationship between the proportion of data used for training, testing and validation and model performance. They did find, however, that the best result was obtained when 20% of the data were used for validation and the remaining data were divided into 70% for training and 10% for testing. Barry (2000) addresses the concept of sufficiency of data. He notes that a lack of data is more often the limiting factor. Given this potential dearth of data, he recommends splitting the data in the ratio of 2 for the training set to 1 for the testing set. In the majority of ANN applications in TCP/IP network traffic forecasting, the data are divided into subsets on an arbitrary basis. However, recent studies have found that the way the data are divided can have a significant impact on the results obtained. Chabaa (2010), in predicting and forecasting internet traffic, used a ratio of train to test sets of 3:1 with reasonable success. Piotrowski and Napiorkowski (2013) used a similar ratio for the same task and report that their networks did not suffer from any overfitting. Following a similar convention to Piotrowski and Napiorkowski (2013), we allocated 75% of the data to the training set and 25% to the test set (unless stated otherwise). Generally, there were two ways to treat the training versus the testing data. First, both could be drawn from one larger data set from which the training and testing exemplars were sampled (Zhou et al. 2006). Alternatively, the study could have used distinct training and testing sets (Hagan & Menhaj 1994). For purposes of simplicity we chose the former approach. Before training, the data were randomly split into train and test sets to avoid the training results becoming biased towards a particular section of the database. Matlab code for the normalization and partitioning of the data is available in Appendix A.
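The random 75/25 split described above can be sketched as follows; the stand-in matrices are placeholders, and the actual partitioning code remains that of Appendix A:

    % Sketch: random 75/25 partition of the input and target columns.
    p = rand(150, 580);  t = rand(1, 580);           % stand-in pattern/target matrices
    n      = size(p, 2);
    order  = randperm(n);                            % random shuffle of the columns
    nTrain = round(0.75*n);
    ptrain = p(:, order(1:nTrain));      ttrain = t(:, order(1:nTrain));
    ptest  = p(:, order(nTrain+1:end));  ttest  = t(:, order(nTrain+1:end));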

4.5 Artificial Neural Network training program Training an ANN to learn patterns in the data plays a major role in the successful application of ANNs. In section 2.5, learning in terms of ANNs is defined as “a process by which the free parameters of a neural network are adapted through a continuing process of stimulation by the environment in which the network is embedded.” (Balestrassi et al. 2009). Training can therefore be defined as the “procedure by which the ANN learns” (Shavlik et al. 1991). The major objective of training an ANN is to find the set of weights between the neurons that determine the global minimum of the error function (Kaastra & Boyd 1996). During training, the ANN goes through an iterative process, each of the connection weights, w, and biases, b, of the network is adjusted and the most appropriate set of weights is found to represent the model. The ANN training program has to go through a series of steps in order to adequately train an ANN. Firstly the program begins by initializing the weights connecting both the input layer and hidden layer and the hidden layer and output layer. Next an example problem from the training data is presented to the ANN. Each training example presented to the ANN may be represented by a certain subset of the total set of functions available to the ANN. The program then computes the activation of the hidden layer and output layers, after which the activations of the output layer are compared to the desired (known) output and the appropriate weights updated. After all the training vectors have been presented to the ANN, the weights are held constant and the validation set (if present) is presented to the ANN (Haykin 1999). The error rate of the validation set (in conjunction with the error rate of the training set) is used as an indicator of the performance of the ANN. Each presentation of the set of training vectors is defined as an epoch (Engelbreetch 2007). By collecting the error rate for the entire test set at the end of each training epoch, the program constructs an error curve and a minimum error is observed somewhere

along the curve. During the training phase, the difference between the guess made by the ANN and the correct value for the output is assessed by the program, and the weights are changed in order to minimize the error. The program checks whether an acceptable error level has been attained; once the acceptable error has been achieved, the program stops training and presents the test set to evaluate the network on examples unseen during training (Engelbrecht 2007). This process is repeated over the entire set of training data. For a flow chart description of the ANN training program which was followed during the training of the ANNs in this research, see Figure 4.1.

Figure 4.1: Training program for an ANN flowchart (Mi et al. 2005).

Training for the ANNs in our research was carried out by invoking the train function in the Matlab Neural Network software. The structure of the overall code for the training of ANNs in this research is available in Appendix A. The code

 Loads and organizes the data in the Neural Network toolbox.
 Partitions the data into training and test sets.
 Automatically scales the data down before processing and scales it up afterwards.
 Has a supervising script which organizes the network input patterns to be columns of a matrix and the corresponding targets to be the columns of a matrix, and converts them from linear to cyclic.
 Configures an ANN.
 Trains the ANN.
 Simulates the ANN performance on training data.
 Tests the performance of the ANN on unseen test data.
Available in Appendix D is a snapshot of an ANN during one of the many training sessions in the Neural Network toolbox Version 5.0.2 (R2007a).
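The full script is given in Appendix A. Purely as a sketch of how the core of such a training run can be structured with the older newff interface of Neural Network Toolbox 5.0.2 (the variable names p, t, ptest and ttest for the partitioned input and target matrices, and the 60-neuron hidden layer, are illustrative assumptions), the pipeline reduces to:

% Sketch of the core training pipeline (newff interface of Neural Network Toolbox 5.0.2;
% p and t are the scaled training inputs and targets, ptest and ttest the held-out test patterns).
net = newff(minmax(p), [60 1], {'logsig','purelin'}, 'trainlm');  % one hidden layer of 60 neurons
net.trainParam.epochs = 1000;               % maximum number of training epochs
net.trainParam.goal   = 0.001;              % performance error goal
net = train(net, p, t);                     % train on the training set

ytrain   = sim(net, p);                     % simulate performance on training data
ytest    = sim(net, ptest);                 % test on unseen test data
rmseTest = sqrt(mean((ttest - ytest).^2));  % generalization error on the test set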

4.5.1 Experimental strategies and procedures
The five independent variables whose relationships to each other, as well as to the generalization ability exhibited by an ANN, this research studies are: the size of the network architecture, the size of the training set, the size of the learning rate, the learning algorithm and the training schedule. These variables are not all equally easy to manipulate; it was therefore important to make certain strategic and tactical decisions before conducting our investigations and experiments. In our approach we used the experimental method, a proven paradigm for testing and exploring cause-and-effect relationships. The experimental method is a systematic and scientific approach to research in which the researcher manipulates one or more variables and controls and measures any change in other variables. An experiment is typically carried out by manipulating a variable, called the independent variable, affecting the experimental group; the effect that the researcher is interested in, the dependent variable(s), is measured. Identifying and controlling non-experimental factors which the researcher does not want to influence the effects is crucial to drawing a valid conclusion. The benefit of using this method is that it allows the control of variables, thereby enabling the isolation of a particular variable so that its effects on other variables can be observed; this in turn guides the conclusions drawn about cause and effect. Moore and McCabe (1993) describe this approach as the best method of establishing causation in which the effect of possible lurking variables is controlled. Also, Gay (1992) states, “The experimental method is the only method of research that can truly test hypotheses concerning cause-and-effect relationships. To experiment means to actively change x and to observe the response in y”.


We resolved to conduct our investigations in five separate training sessions, each comprising several experiments. This was to ensure that each of the five independent variables identified by this research could be adequately and independently investigated. The first training session investigated the effect of the size of the network architecture on the ability of ANNs to generalize to new data. The second training session investigated the generalization ability of ANNs as a function of the amount of training data. The third training session investigated the effect of the choice of learning algorithm on the generalization ability of ANNs. The fourth training session investigated the generalization ability of ANNs under different learning rates. The fifth training session explored the most appropriate training schedule for maximum generalization ability of ANNs. For comparability of results, all experiments used an input layer with a single neuron and 150 inputs corresponding to the maximum length of our sliding time window, and an output layer of a single neuron, automatically determined by the Matlab software. Since the number of hidden neurons and layers can significantly affect the training time and performance of ANNs, the general practice in the literature of having one hidden layer was favoured for this research (Bebis & Georgiopoulos, 1997; Teixeira & Fernandes, 2012). The decision of how many neurons to place in the hidden layer was more or less arbitrary. The Logistic sigmoid, due to its real-valued and continuously differentiable properties, was used in the hidden layers, and the Linear activation function, due to its ability to output continuous target values, was used in the output layer of each of the ANNs in this research. Because the initial random choice of weights can affect the performance of the Backpropagation algorithm, in all experiments the weights were randomly initialized between [-0.5, 0.5]. The Backpropagation Levenberg-Marquardt algorithm (trainlm) was the standard learning algorithm throughout our investigations (unless stated otherwise).
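As an illustration of the weight initialization just described (a sketch only, for a network created as in the earlier sketch; the exact initialization code used is that in Appendix A), the connection weights and biases of a newff network can be overwritten with uniform random values in [-0.5, 0.5]:

% Sketch: overwrite the toolbox's default initialization with uniform random
% weights and biases in [-0.5, 0.5], for a network created as in the earlier sketch.
net.IW{1,1} = rand(size(net.IW{1,1})) - 0.5;   % input-to-hidden weights
net.LW{2,1} = rand(size(net.LW{2,1})) - 0.5;   % hidden-to-output weights
net.b{1}    = rand(size(net.b{1})) - 0.5;      % hidden layer biases
net.b{2}    = rand(size(net.b{2})) - 0.5;      % output layer bias
% For gradient-descent rules such as traingdm, the learning rate and momentum
% discussed below would additionally be set via net.trainParam.lr and
% net.trainParam.mc; trainlm itself exposes parameters such as epochs, goal and mu.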

As already mentioned in sub-section 4.15, prior to performing the experiments the data were partitioned into training and testing sets. We designated 547 of the 730 samples for training and 182 samples for the test set. Reports from Zhang et al. (1998) suggest that a training set comprising 75% of the total data is sufficiently large to avoid overfitting. Following this guideline, the results for 547 training samples provide a legitimate standard of comparison for the different generalization results without overfitting the data, as this number safely matches Zhang et al.'s (1998) recommendations. The training and testing sets were presented in a random fashion to the

networks to avoid possible bias in the presentation order of the sample patterns to the ANNs. The software package enables the connection weights to be saved at pre-determined intervals (i.e. at particular learn counts); consequently, the training and testing samples were presented to the networks and the progress of the training procedure checked after every 50 epochs. The current literature does not provide a consensus on the optimal value of the learning rate, hence we had to rely on the experience of adept researchers such as Maier and Dandy (1997), who carried out a similar case study in which they assessed the effect of different internal parameters and network geometries on network performance in forecasting salinity in the Murray River. They found that the size of the steps taken in weight space during training had a significant effect on learning speed. A learning rate of 0.1 in conjunction with a momentum value of 0.2 was found to give good results, and hence was used for all their models. In our research we adopted a similar approach, and a learning rate of 0.1 and momentum of 0.2 were used in all our models (unless stated otherwise). In fact, in most experiments we avoided the use of momentum completely to avoid local minima problems, and in those experiments where momentum was used we were careful that it did not exceed a value of 0.2, which is deemed to be fairly safe against overshooting of global minima (Jacobs 1998; Maier & Dandy 1997). With the exception of experiments where the learning rate was the variable of concern, training for all experiments was accomplished with limited manipulation of the learning rate. Training was stopped whenever one of the following stopping criteria was satisfied: either the performance error goal was achieved or the maximum allowable number of training epochs was reached. The number of training epochs did not exceed 1000, neither was training allowed to go beyond a performance error goal of 0.001. To reduce statistical fluctuations and to negate the effects of random weights, we ran each experiment over two training and testing sessions and averaged the results. Most importantly, for each of the five independent variables that we sought to investigate, we ensured that during experimentation all other variables that could potentially influence the results remained constant: one parameter was varied and all other parameters were kept constant in order to assess the effect due to changes in that parameter. For example, during the investigation of the effects of the size of the network architecture on generalization ability, we ensured that the size of the training set, the learning rate, the learning algorithm and the number of training iterations remained constant throughout the investigations. This was done so as to avoid any external factors which could potentially compromise the quality of the results. A similar approach was

followed for the other four variables. The training procedure was conducted iteratively, covering in-sample learning and out-of-sample testing. The experimental conditions outlined above were subject to change depending on the variable being investigated. We now turn to the subject of performance evaluation, where we describe the evaluation criteria and measures which formed the basis on which we arrived at any given conclusions with regard to the main research objective.

4.6 Performance evaluation
A critical question to be answered is: how does one measure the generalization ability of an ANN? Testing the performance of an ANN on a test set is generally considered to be a global measure of its generalization ability (Choudhury et al. 2012). The ANN is biased towards the training set, so the test set must be used to determine generalization (Fearn 2004). Given a limited amount of available data, it is important to hold aside a certain subset during the training process. After the ANN has been trained, the errors made by the trained network are computed on this test set. The test set errors then give an indication of how the ANN will perform in the future; they are a measure of the generalization capability of the ANN (Walleed et al. 2009). According to Walleed et al. (2009), in order for the test set to be a valid indicator of generalization capability, there are two important things to keep in mind. First, the test set must never be used in any way to train the ANN, or even to select one network from a group of candidate ANNs; the test set should only be used after all training and selection is complete. Second, the test set must be representative of all situations for which the ANN will be used. The literature suggests two schools of thought regarding this issue. The first approach measures the ability of the ANN to generalize to new data drawn from the same dataset encountered during training; this is the kind of test usually done to measure the ability of an ANN to generalize well to novel inputs from the same domain. The second approach measures the ability of an ANN to generalize to new data from a dataset never encountered during training; this is a test of the ANN's ability to generalize to novel inputs from a similar but different domain (Richards 1991). Whilst both of these viewpoints are valid in their own respect, for the purposes of this research we adopted the former approach, simply because we are of the opinion that analysing the generalization performance of an ANN trained and tested on data from the same domain is far easier to accomplish than for an ANN trained on one domain and tested on another, albeit similar, domain.


The main aim of our analysis was to evaluate all the ANN models in our research and determine the model(s) that could simulate the data with the highest generalization ability. The test set was removed from the data set before training began; it was then used at the completion of training to measure generalization capability by calculating the RMSE between the network activations (predicted values) and the targets (actual values) (see Appendix A for the Matlab code). RMSE is a frequently used measure of the difference between the values predicted by a model and the values actually observed from the environment being modelled. These individual differences are also called residuals, and the RMSE serves to aggregate them into a single measure of predictive power. The RMSE is directly interpretable in terms of measurement units, and as such is a better measure of performance than other criteria (Nakayama et al. 2004). It is defined by the formula:

\mathrm{RMSE} = \sqrt{\frac{1}{NK}\sum_{p=1}^{N}\sum_{k=1}^{K}\left(t_{pk}-o_{pk}\right)^{2}} \qquad (4.6)

where N is the total number of training patterns or observations, K is the number of network outputs, t_{pk} is the target value of the k-th output for the p-th pattern, and o_{pk} is the actual value generated by the k-th output neuron for the p-th pattern. An ANN with good generalization ability should consistently show low RMSE values for the test set.
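With the targets and network outputs held as matrices of matching size, equation (4.6) reduces to a one-line computation; a sketch (variable names t and y are illustrative, and the code actually used is listed in Appendix A):

% Sketch: RMSE between target values t and network outputs y (equation 4.6).
% t and y are K-by-N matrices (K outputs, N patterns); here K = 1.
rmse = sqrt(mean((t(:) - y(:)).^2));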

To further evaluate the models' performance, two other criteria were used. The Correlation statistic was selected to measure the linear correlation between the predicted values and the actual values of the ANNs under consideration. Correlation indicates the strength and direction of a linear relationship between two variables (Cawley et al. 1993). A number of different correlation coefficients are used for different situations. The best known is the Pearson product-moment correlation coefficient (also called the Pearson correlation coefficient), which is obtained by dividing the covariance of the two variables by the product of their standard deviations. It is given by the formula:

R = \frac{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}}\,\sqrt{\sum_{i=1}^{n}\left(y_{i}-\bar{y}\right)^{2}}} \qquad (4.7)

where x_i and y_i stand for the two time series (the actual and predicted values) and \bar{x} and \bar{y} for the series' averages. The correlation is 1 in the case of a perfect increasing linear relationship and -1 in the case of a perfect decreasing linear relationship, and values in between indicate the degree of linear relationship between the predicted values and the actual values. A correlation coefficient of 0 means that there is no linear relationship between the variables. The square of the Pearson correlation coefficient (R²), known as the coefficient of determination, describes how much of the variance between the two variables is described by the linear fit; the optimum value is unity. In this research an R and R² value on the train or test set smaller than 0.7 was assumed to be problematic. Codes for computing the R and R² are available in Appendix A.
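A minimal sketch of how R and R² can be obtained in Matlab (variable names as in the RMSE sketch above; the scripts actually used are those in Appendix A):

% Sketch: Pearson correlation R (equation 4.7) and coefficient of determination R^2
% between the target series t and the predicted series y.
C  = corrcoef(t(:), y(:));   % 2-by-2 correlation matrix
R  = C(1,2);                 % Pearson correlation coefficient
R2 = R^2;                    % coefficient of determination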

While the ANN trains, the figure window for the Neural Network Training Tool (nntraintool) pops up and provides information on how well the ANN is performing. The Performance plot shows the RMSE on the training set as training progresses, and the slope of the training curve can be a useful indicator of the effects of overfitting. We observed the Performance plot as training progressed. If the curve went flat for many hundreds of epochs, it was likely that the ANN was being trained with little improvement in performance; this may well indicate that the ANN was overfitting the data, and training therefore had to be stopped at that point. The position at which the flatness started was noted and training was subsequently halted just beyond that point. A snapshot of a Performance plot during training of one of the ANN models investigated in this research is available in Appendix E.
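Equivalently, the training record returned by train can be inspected after the fact; a sketch (assuming the default mse performance function, so that the square root of the recorded performance gives the RMSE per epoch):

% Sketch: inspect the training record to see how the training error evolved,
% e.g. to spot the epoch at which the performance curve flattens out.
[net, tr] = train(net, p, t);          % tr is the training record
plot(tr.epoch, sqrt(tr.perf));         % RMSE on the training set per epoch
xlabel('Epoch'); ylabel('Training RMSE');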

4.7 Conclusion
This chapter has presented an extensive discussion of the methods and data techniques used in the study. The problem domain in which our investigations were conducted was discussed, along with the data collection and pre/post-processing techniques employed. The hardware and software used in this research, and the training methodology for the ANNs, were also presented. The experimental setup and procedures for the various experiments conducted in this study were clearly articulated, and the methodological approach for the research, which is the Experimental method, was detailed. The chapter ended by presenting the performance evaluation criteria used to judge the performance of the ANNs. In summary, this chapter has described a detailed methodology for the development of an STWNN-based prediction model for investigating the generalization

ability of ANNs in forecasting TCP/IP network traffic trends. Chapter 5 presents the results of the experiments conducted, and the relevant discussions and findings.


Chapter 5: Results and Discussion
This chapter presents and discusses the results of the experiments described in Chapter 4. The chapter is divided into three sections: section 5.1 presents the results of the various experiments conducted, section 5.2 interprets and analyses those results and section 5.3 concludes the chapter. By presenting, interpreting and analysing a series of graphs from the experiments, the chapter explores how the generalization ability of an ANN (empirically computed using the RMSE on the test set) is affected by the size of the network geometry and a host of other design parameters in forecasting TCP/IP network traffic trends. Also of interest is the performance of the ANNs in terms of R and R² on the train and test sets. To put the results in proper context, the initial ANN settings and experimental conditions are first enumerated. The experiments were conducted according to the following three steps:
 Informal preliminary tests - an experimental search for an optimal network design. The goal in this step was to select a set of ANNs for further detailed, repetitive tests. Several combinations of training parameters (learning rate, momentum, number of training cycles, etc.) were also tested in order to bring the range of network parameters to be investigated in the various experiments close to optimal.
 Detailed tests - the main part of the experiments. Pre-selected ANNs were trained with the appropriate parameters and their generalization ability tested.
 Additional repetitive tests - a limited number of extra tests aimed at tracing out the intrinsic limits of the models and at negating the random effects of network weights on the results.
While experimental results cannot cover all practical situations, our results do help to explain common behaviour which does not agree with some theoretical expectations. The chapter enables deductions to be made on some of the most critical components of designing ANN models.


5.1 The Results
In this section we present the results of the experiments conducted. Discussion and interpretation of the results follows in section 5.2. In order to maintain consistency and to avoid repetition, the results of the preliminary experiments are not included in this thesis.

5.1.1 Training scenario 1: Network architecture
In the first training scenario, the focus was on investigating the effect of network architecture on the generalization ability of ANNs trained to forecast TCP/IP network traffic trends. We sought to analyse the generalization ability of different network architectures and to study how this property changes with network size. Since the literature has shown that the input and output layers of an ANN have a negligible effect on its generalization ability, the relationship between ANN generalization and network architecture was investigated by manipulating the hidden layer structure of an ANN trained on a fixed dataset. The aim was to determine the architectural design which would exhibit the best generalization ability for each of the network topologies examined. Richards (1991) states two basic guidelines for the design of ANN models:
 The overfitting guideline: If an ANN can learn a problem, then the fewer the number of free parameters in the network, the better the ANN is likely to generalize.
 The dataset size guideline: The larger the number of free parameters in an ANN, the more data needed to train the ANN.

Taking these guidelines into consideration, and in keeping with the Universal Theorem of Approximation (section 2.11), we saw fit to start off by allocating the smallest number of hidden units (neurons and layers) which would enable the ANN to learn the training set, and then to incrementally add more hidden units as training progressed. This enabled us to determine the smallest possible network architecture able to adequately capture the relationship to be modelled with optimal generalization ability. Fixing the size of the training set at 547 samples and the test set at 182 samples, randomly setting the network weights in the range [-0.5, 0.5], with the Logistic sigmoid and Linear activation functions in the hidden and output layers respectively and the BP Levenberg-Marquardt (trainlm) algorithm as the weight update rule, ANNs with a single input and output neuron were trained on several network architecture configurations. The tested architectures ranged from 5 hidden neurons in a single hidden layer to 80 hidden neurons in 3 hidden layers. This range was selected for testing because

preliminary experiments conducted using fewer hidden neurons produced large RMSE values (on both the train and test sets), whilst architectures with more neurons in more layers were too computationally intensive for practical use. In order not to interfere with the training parameters, the learning rate and momentum were kept at zero. Training for each experiment was conducted for 1000 iterations and the performance of the models tested on the independent test set of 182 samples. Each experiment was run twice, using different initial weights for training and testing, and the results averaged. Following conventional guidelines from Maier and Dandy (1997), who performed similar investigations in forecasting salinity in the Murray River in Australia, we performed the following experiments:

5.1.1.1 Single hidden layer networks
The first set of experiments examined the architecture of an ANN with a single hidden layer. In this series of tests, the generalization ability of an ANN with a single hidden layer was analysed and investigated. We trained the ANN using different numbers of hidden neurons on separate occasions. From the results of the preliminary experiments, the numbers of hidden neurons considered in this analysis were 10, 20, 30, 40, 50, 60, 70, 80 and 90. The results for the experiments conducted are shown in Figure 5.1.

Figure 5.1: One hidden layer: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).
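A hedged sketch of how such a sweep over single hidden layer sizes might be scripted (the hidden sizes are those listed above; p, t, ptest and ttest are the partitioned data assumed in the earlier sketches, and in practice each run would be repeated twice and the results averaged):

% Sketch: train one network per hidden layer size and record its test-set RMSE.
hiddenSizes = [10 20 30 40 50 60 70 80 90];
rmseTest = zeros(size(hiddenSizes));
for i = 1:length(hiddenSizes)
    net = newff(minmax(p), [hiddenSizes(i) 1], {'logsig','purelin'}, 'trainlm');
    net.trainParam.epochs = 1000;              % 1000 training iterations, as in the text
    net = train(net, p, t);
    y = sim(net, ptest);                       % forecasts on the independent test set
    rmseTest(i) = sqrt(mean((ttest - y).^2));  % generalization error for this architecture
end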


5.1.1.2 Two hidden layer networks
It has been suggested by certain authors that an ANN having many hidden layers is capable of generalizing better than an ANN having a single hidden layer (Kavzoglu 1999; Richards 1991). The rationale behind this school of thought is that the added layers provide the ANN with additional dimensions in which to reorganize the material to be learned. The relationship between ANN generalization and network architecture for multi-hidden-layer models has never been adequately explored; this is doubtless due to the multitude of potential network architectures that can be specified once one opens the door to the possibility of having more than one hidden layer. One approach to this dilemma is to fix the number of hidden neurons in a network and then distribute them equally over the multiple hidden layers, so that each layer has the same number of hidden neurons. Another approach is to use different ratios of first to second hidden layer neurons. We explored each of these approaches.
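In the toolbox used, a two hidden layer architecture of the form (x:y) is specified simply by listing both hidden layer sizes before the output layer; a one-line sketch for one of the configurations examined below (transfer functions as stated in section 4.5.1):

% Sketch: a two hidden layer (15:15) architecture with Logistic sigmoid hidden
% layers and a Linear output neuron, trained with the trainlm rule.
net = newff(minmax(p), [15 15 1], {'logsig','logsig','purelin'}, 'trainlm');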

5.1.1.2.1 Two hidden layers (equal number of first to second hidden neurons)
The second set of experiments explored the generalization ability of an ANN with two hidden layers. In this series of experiments the number of neurons was distributed equally over the 2 hidden layers. From the preliminary experiments the following network architectures were considered for examination: (5:5), (10:10), (15:15), (20:20), (30:30), (35:35), (40:40) and (45:45), where (x:y) denotes x hidden neurons in the first hidden layer and y hidden neurons in the second hidden layer. The results for the experiments conducted are shown in Figure 5.2.


Figure 5.2: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

5.1.1.2.2 Two hidden layers (different ratios of first to second hidden neurons)
In this series of tests, the effect of using different ratios of first to second hidden layer neurons on the generalization ability of ANNs was assessed. Under this configuration, we conducted two sets of experiments. In the first set we kept the number of neurons in the second hidden layer constant at 5 whilst we varied the number of neurons in the first hidden layer. The network architectures explored were thus (5:5), (10:5), (15:5), (20:5), (25:5), (30:5), (35:5), (40:5), (45:5) and (50:5). The results for the experiments are shown in Figure 5.3.


Figure 5.3: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

We then reversed the previous trend and kept the number of neurons in the first hidden layer constant at 5 neurons whilst we varied the number of neurons in the second hidden layer. The architectures explored were thus (5:5), (5:10), (5:15), (5:20), (5:25), (5:30), (5:35), (5:40), (5:45) and (5:50). The results for the experiments are shown in Figure 5.4.


Figure 5.4: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

5.1.1.3 Three hidden layers
Although the literature largely indicates no theoretical or practical basis for using a third hidden layer, out of curiosity we decided to examine how a third hidden layer could possibly impact the performance of an ANN trained to predict trends in TCP/IP network traffic. This curiosity was prompted by the results of a study by Philip (2001), which revealed that an ANN with three hidden layers had a generalization accuracy of 62.3% while one with a single hidden layer achieved a generalization accuracy of 59.3%. Similarly, Kwon (1991) reports that an ANN with three hidden layers reached a generalization accuracy of 62.5%, while an ANN with a single hidden layer reached 62.7%. In this series of experiments we conducted a limited investigation into several three hidden layer architectures; in addition, in order to get an overall picture of how the size of the network architecture affects the generalization ability of ANNs, we compared the performance of 1, 2 and 3 hidden layer network architectures. We kept an equal ratio of first to second (in the case of 2 layers), and first to second to third (in the case of 3 layers) hidden layer neurons. From the results of our preliminary investigations it was decided to examine the following network architectures: (20), (40), (60), (80), (20*2), (40*2), (60*2), (80*2), (20*3), (40*3), (60*3) and (80*3), where (x*y) denotes x hidden neurons in y hidden layers. Figure 5.5 shows the results from the experiments.


Figure 5.5: Results for different numbers of hidden layer structures: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

5.1.2 Training scenario 2: Training set size
Of the 730 samples, 547 were allocated to the training set. If we could reduce the size of the training set and still achieve equivalent generalization ability, it would be possible to decrease the amount of time required to train an ANN. In the second training scenario we therefore investigated the impact of training set size on the generalization ability of ANNs. The ANNs were trained to forecast TCP/IP network traffic trends using either the full training set of 547 samples or reduced variations of the training set. In order to conduct this investigation, unlike Kavzoglu (1999), who used two separate datasets, it was necessary to select a single dataset large enough to be separated into subsets. The reason behind this decision was that comparing the test results of two completely different datasets was likely to introduce the added complexity of determining whether the datasets themselves were in fact suitably comparable; breaking a single dataset down into subsets at least ensured maximum compatibility within the dataset itself. Ten different training sets were generated as subsets chosen from the full training dataset. These subsets were created at 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100% of the full training set of 547 samples, as these were deemed appropriate sizes for subset selection. The choice of only ten subsets was made due to time constraints, as testing a larger number of subsets was highly unlikely to produce more telling results.
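As an illustration only, such subsets can be cut from the 547-sample training set as sketched below (the selection of the first n samples is an assumption made for the sketch; the actual subset-generation code is that in Appendix A):

% Sketch: build training subsets at 10%, 20%, ..., 100% of the full 547-sample training set.
fullTrain = trainSet;                          % the 547 scaled training samples
fractions = 0.1:0.1:1.0;
subsets   = cell(1, length(fractions));
for i = 1:length(fractions)
    n = round(fractions(i) * length(fullTrain));
    subsets{i} = fullTrain(1:n);               % first n samples (illustrative selection rule)
end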


Since general and purely experimental approaches to finding an optimal network architecture can be extremely time consuming, our analysis was constrained to a priori selected network architectures; hence the numbers of hidden layers and neurons in this investigation were limited to a predefined set of values. We conducted several sets of experiments in which we examined the performance of both single and double hidden layer ANNs. During training the network weights were randomly initialized in the range [-0.5, 0.5], training for each experiment was halted after 1000 epochs, and thereafter the independent test set was presented to the ANNs to assess generalization ability. The Backpropagation Levenberg-Marquardt training rule was used to update the weights, and the Logistic sigmoid and Linear activation functions were used in the hidden and output layers of the ANN models respectively. For each experiment, two training-testing runs per simulation were done to refine the accuracy of the results.

In the first phase of these investigations we examined a single hidden layer ANN of 60 neurons. We chose this network architecture because it exhibited the best generalization amongst all the architectures examined in the previous experiments conducted in sub-section (5.1.1.1). We trained the ANN using both the full training set of 547 samples and reduced variations of it on different runs. The results of the experiments are shown in Figure 5.6.

Figure 5.6: Results for different training set sizes for the 60HN ANN: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).


Secondly, we examined the performance of a two hidden layer ANN with network architecture (5:35), i.e. 5 neurons in the first hidden layer and 35 neurons in the second hidden layer. As in the previous case, this network architecture was chosen because it exhibited substantially better generalization performance than the other two hidden layer architectures examined in sub-section 5.1.1.2. Furthermore, Maier and Dandy (1997) carried out a similar investigation using an almost identical network architecture of (8:30) with some measure of success. We trained the ANN using both the full training set of 547 samples and reduced variations of it on different runs. The results of the experiments are shown in Figure 5.7.

Figure 5.7: Results for different training set sizes for the (5:35) HN ANN: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

Finally, we performed a comparative analysis of some of the heuristics given in Table 3.4 (sub-section 3.3.2) with respect to their effectiveness and reliability in determining the optimum training set size for the generalization ability of ANNs in forecasting TCP/IP network traffic. For these investigations we once again trained a single hidden layer ANN of 60 hidden neurons on the different training set sizes suggested by the different heuristics. The results of the investigation are shown in Figure 5.8.


Figure 5.8: Results for different heuristics of training set size on the 60HN ANN: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

5.1.3 Training scenario 3: Learning algorithm
There are many variations of the Backpropagation (BP) learning algorithm; some provide a faster speed of convergence and better generalization ability, while others have smaller memory requirements. In the third training scenario, 5 BP learning algorithms were evaluated primarily in terms of their generalization ability; in addition, a limited investigation was also conducted in terms of their relative speed of convergence and accuracy of forecasts. The Conjugate Gradient with Polak-Ribiere updates (traincgp), Resilient Backpropagation (trainrp), Scaled Conjugate Gradient (trainscg), Conjugate Gradient with Fletcher-Reeves updates (traincgf) and Gradient Descent with momentum (traingdm) algorithms were considered in the task of forecasting a TCP/IP network traffic time series.

In the first phase of the investigations, several ANNs with one input neuron, one hidden layer with a varying number of hidden neurons and one output neuron were trained using the five different BP algorithms on separate occasions. The number of neurons in the hidden layer was arbitrarily varied as 20, 40, 60 and 80. The Logistic sigmoid and Linear activation functions were used in the hidden and output layers respectively. Throughout the duration of the experiments the training set was maintained at 547 samples, whilst the weights were randomly set in the range [-0.5, 0.5]. To make the comparison more precise, all the learning algorithms were trained for the same number of iterations (1000 epochs) and the learning parameters were

held constant during the training process. The learning rate was fixed at 0.1 and the momentum at 0.4 throughout the training. The choice of learning rate and momentum was adopted from a similar empirical study conducted by Attoh-Okine (1999) on predicting the inversion of pavement deflection, in which a learning rate of 0.1 and momentum of 0.4 yielded the best results. Training was terminated when the number of epochs reached 1000. The performance of the ANNs was then evaluated on the 182 independent test patterns. To negate the effects of random weights, each training-testing run was repeated twice and the results of the different runs averaged. Figure 5.9 shows the generalization errors for the different algorithms and Table 5.1 shows the R and R² results on both the train and test sets for the experiments conducted.

Figure 5.9: Generalization errors (RMSE) for the different BP algorithms.


Table 5.1: R and R² results for the different BP algorithms for train and test sets.

H.N   traincgp                                    trainrp
      R train   R test   R2 train   R2 test       R train   R test   R2 train   R2 test
20    1.0000    0.9902   0.9961     0.6535        1.0000    0.9947   0.9949     0.8819
40    1.0000    0.9775   0.9970     0.7366        1.0000    0.9129   0.9936     0.8526
60    1.0000    0.9333   0.9957     0.7056        1.0000    0.9414   0.9960     0.8468
80    1.0000    0.9817   0.9967     0.8823        1.0000    0.8144   0.9968     0.9058

H.N   traincgf                                    trainscg
      R train   R test   R2 train   R2 test       R train   R test   R2 train   R2 test
20    1.0000    0.9985   0.9939     0.9261        1.0000    0.9909   0.9951     0.8725
40    1.0000    0.9928   0.9951     0.8718        1.0000    0.9840   0.9934     0.8579
60    1.0000    1.0000   0.9948     0.8279        1.0000    0.9925   0.9905     0.9335
80    1.0000    0.9172   0.9956     0.8107        1.0000    0.9980   0.9924     0.9235

H.N   traingdm
      R train   R test   R2 train   R2 test
20    1.0000    0.9959   0.8282     0.2399
40    1.0000    0.9943   0.8413     0.1290
60    1.0000    0.9870   0.7807     0.0553
80    1.0000    0.9961   0.8070     0.3775


In the second phase of the investigations, in order to compare the speed of convergence and accuracy of the different algorithms in terms of the minimum Mean Square Error (MSE), a two hidden layer ANN with a network configuration of (5:35), having the Logistic sigmoid and Linear activation functions in the hidden and output layers respectively, was trained using each of the five different BP algorithms on separate occasions. The speed of convergence and accuracy of ANNs are very important because in the real world training a huge and complex set of data can take hours, even days, to attain results; studying the conditions which decrease training time is therefore beneficial. The MSE is defined as the square of the RMSE. As in the previous experiments, the learning rate was fixed at 0.1 and the momentum at 0.4 throughout the duration of the training. The size of the training set was maintained at 547 samples and the weights were randomly initialized in the range [-0.5, 0.5]. Training was terminated when the network converged (i.e. every pattern learned) to a tolerance of the 0.001 error goal or when the number of epochs reached 1000, whichever came first.
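A hedged sketch of these comparative runs (the algorithm identifiers are those used in Figures 5.10 to 5.14; p and t are the training data assumed in the earlier sketches, and in the toolbox the learning rate and momentum are exposed as trainParam fields only for the gradient-descent rule in this list):

% Sketch: train the same (5:35) architecture with each of the five BP variants
% and report how long each took to converge and its final MSE.
algs = {'traincgp', 'trainrp', 'traincgf', 'trainscg', 'traingdm'};
for i = 1:length(algs)
    net = newff(minmax(p), [5 35 1], {'logsig','logsig','purelin'}, algs{i});
    net.trainParam.epochs = 1000;      % maximum number of epochs
    net.trainParam.goal   = 0.001;     % MSE error goal (whichever is reached first)
    if strcmp(algs{i}, 'traingdm')     % traingdm exposes lr and mc directly
        net.trainParam.lr = 0.1;
        net.trainParam.mc = 0.4;
    end
    [net, tr] = train(net, p, t);
    fprintf('%s: stopped after %d epochs, final training MSE %.4f\n', ...
            algs{i}, tr.epoch(end), tr.perf(end));
end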

Figure 5.10: Results of traincgp algorithm.


Figure 5.11: Results of trainrp algorithm.

Figure 5.12: Results of traincgf algorithm.


Figure 5.13: Results of trainscg algorithm.

Figure 5.14: Results of traingdm algorithm.


5.1.4 Training scenario 4: Learning rate
To investigate the effect of the learning rate on the generalization ability of ANNs in predicting TCP/IP network traffic trends, various layered, fully connected ANNs with a single input neuron, the Logistic sigmoid activation function in the hidden layers and a Linear output neuron were examined. We investigated both single and double hidden layer ANNs; the architectures chosen were those that exhibited the best generalization performance in sub-sections 5.1.1.1 and 5.1.1.2. The first set of experiments involved several single hidden layer ANNs trained with various learning rates on separate occasions. The ANNs selected for examination had 20, 40, 60 and 80 hidden neurons; these were selected based on their stability during the preliminary investigations. The learning rates were varied as 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 and 1, a choice made on a purely arbitrary basis. Fixing the size of the training set at 547 samples and of the test set at 182 samples, randomly initialising the network weights in the range [-0.5, 0.5], with the BP (trainlm) training rule as the weight updating algorithm and a fixed momentum of 0.2, we systematically trained the ANNs. Training was stopped after 1000 training cycles and the generalization ability of the ANNs tested on the independent test set of 182 samples. Figure 5.15 shows the outcome of the experiments in terms of the generalization errors incurred for each learning rate on the various ANNs, and Table 5.2 shows the R and R² results on both the train and test sets for the experiments conducted.
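A sketch of such a learning rate sweep is given below. Note that it uses traingdm purely as a stand-in, since that routine exposes the learning rate and momentum directly as net.trainParam.lr and net.trainParam.mc; it is not necessarily the exact routine used in these runs, and p, t, ptest and ttest are the assumed partitioned data from the earlier sketches.

% Sketch: sweep the learning rate for a 60-hidden-neuron network (traingdm used
% here only because it exposes lr and mc as training parameters).
lrs = 0.1:0.1:1.0;
rmseTest = zeros(size(lrs));
for i = 1:length(lrs)
    net = newff(minmax(p), [60 1], {'logsig','purelin'}, 'traingdm');
    net.trainParam.lr     = lrs(i);   % learning rate under investigation
    net.trainParam.mc     = 0.2;      % fixed momentum, as in the text
    net.trainParam.epochs = 1000;     % 1000 training cycles
    net = train(net, p, t);
    y = sim(net, ptest);
    rmseTest(i) = sqrt(mean((ttest - y).^2));
end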

Figure 5.15: Generalization errors (RMSE) for the different learning rates at various network architectures.


Table 5.2: R and R² on train and test sets for different learning rates at various network architectures.

Learning rate (R train values)
HN    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
20    1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
40    1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
60    1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000
80    1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000   1.0000

Learning rate (R test values)
HN    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
20    0.9842   0.9969   0.9938   0.9999   0.9991   1.0000   0.9982   0.9989   0.9965   0.9952
40    1.0000   0.9896   0.9982   0.9686   0.9979   0.9992   0.9997   0.9984   0.9983   0.9997
60    0.9765   0.9977   0.9980   0.9999   0.9990   0.9975   0.9996   0.9947   0.9997   0.9889
80    0.9874   0.9979   0.9937   0.9951   0.9979   0.9869   0.9911   0.9995   0.9951   0.9989

Learning rate (R2 train values)
HN    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
20    0.9952   0.9909   0.9942   0.9953   0.9920   0.9926   0.9939   0.9924   0.9932   0.9922
40    0.9921   0.9951   0.9916   0.9942   0.9953   0.9916   0.9911   0.9948   0.9948   0.9928
60    0.9940   0.9914   0.9914   0.9922   0.9920   0.9913   0.9914   0.9919   0.9927   0.9972
80    0.9928   0.9915   0.9932   0.9925   0.9923   0.9929   0.9900   0.9927   0.9928   0.9921

Learning rate (R2 test values)
HN    0.1      0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9      1
20    0.8046   0.9266   0.8885   0.8897   0.9409   0.9086   0.9096   0.9331   0.8528   0.9265
40    0.9406   0.8374   0.9292   0.8406   0.8536   0.9391   0.9437   0.8449   0.9132   0.9396
60    0.7695   0.9341   0.9438   0.9264   0.9260   0.9240   0.9139   0.9292   0.9288   0.8071
80    0.8724   0.9470   0.9107   0.9015   0.9076   0.8921   0.9203   0.9341   0.8812   0.9188


Secondly, we trained a single hidden layer ANN of 60 hidden neurons at learning rates of 0.2, 0.4, 0.6 and 0.8 for different numbers of epochs, to ascertain how the impact of the learning rate on generalization ability varies with an increase in the number of training iterations. The choice of learning rate values was motivated by a previous similar case study by Attoh-Okine (1999), who experimented with the same values. An ANN with 60 hidden neurons was selected for investigation because it exhibited the best generalization ability amongst the single hidden layer architectures examined in the experiments in sub-section 5.1.1.1. The size of the training set was fixed at 547 samples and that of the test set at 182 samples, the weights were randomly initialised in the range [-0.5, 0.5], the BP (trainlm) training rule was the weight updating algorithm and the momentum was fixed at 0.2. Training was stopped after 1000 training cycles for each training run and the performance of the ANN tested on the test set of 182 samples. Figure 5.16 shows the generalization errors from the experiment, and Table 5.3 shows the R and R² results on both the train and test sets for the experiments conducted.

Figure 5.16: Generalization errors (RMSE) for the different learning rates at various training iterations.


Table 5.3: R and R² on train and test sets for different learning rates at various training iterations.

Number of   Learning rate (R train values)        Learning rate (R test values)
epochs      0.1      0.2      0.3      0.4        0.1      0.2      0.3      0.4
250         1.0000   1.0000   1.0000   1.0000     0.9922   0.9998   0.9995   0.9957
500         1.0000   1.0000   1.0000   1.0000     0.9977   0.9968   0.9973   0.9974
750         1.0000   1.0000   1.0000   1.0000     0.9977   0.9990   0.9990   0.9952
1000        1.0000   1.0000   1.0000   1.0000     0.9998   0.9981   0.9977   0.9966

Number of   Learning rate (R2 train values)       Learning rate (R2 test values)
epochs      0.1      0.2      0.3      0.4        0.1      0.2      0.3      0.4
250         0.9808   0.9843   0.9847   0.9875     0.8944   0.9292   0.9406   0.9280
500         0.9905   0.9901   0.9878   0.9908     0.9420   0.9288   0.9414   0.9276
750         0.9905   0.9918   0.9903   0.9911     0.9420   0.9347   0.9437   0.9348
1000        0.9915   0.9928   0.9916   0.9922     0.9182   0.9156   0.9235   0.9191

We then conducted another batch of experiments which examined a two hidden layer network architecture. After conducting several experiments on the various network architectures of sub-section 5.1.1.2, a two hidden layer ANN with network architecture (5:35) was chosen as the basis for our investigations, since this architecture exhibited better generalization ability than all the other two hidden layer architectures examined in that sub-section. We trained the ANN with two different learning rates, 0.1 and 0.01, per run. The sizes of the training and test sets were maintained at 547 and 182 samples respectively, and a momentum value of 0.2 was maintained throughout the experiments. The generalization errors are shown in Figure 5.17, and Table 5.4 shows the R and R² results on both the train and test sets for the experiments conducted.


Figure 5.17: Generalization errors (RMSE) for different learning rates at various training iterations on (5:35) HN ANN.

Table 5.4: R and R² on train and test sets for different learning rates at various training iterations on the (5:35) HN ANN.

Number of   R train values       R test values        R2 train values      R2 test values
epochs      0.1      0.01        0.1      0.01        0.1      0.01        0.1      0.01
250         1.0000   1.0000      0.9999   0.9280      0.9877   0.9911      0.9519   0.9694
500         1.0000   1.0000      0.9999   1.0000      0.9877   0.9875      0.9519   0.9694
750         1.0000   1.0000      1.0000   0.9999      0.9900   0.9899      0.9618   0.9572
1000        1.0000   1.0000      1.0000   0.9999      0.9913   0.9908      0.9427   0.9527


Finally, in order to reach fair conclusions, additional experiments were conducted using different combinations of learning rates and momentum values so as to assess whether the effect of different learning rates is the same regardless of the momentum, since these two parameters have been suggested to be quite closely related (Attoh-Okine 1999). As in the previous case, we trained an ANN with network architecture (5:35) using the different heuristics for the learning rate and momentum provided by some of the authors listed in Table 3.5. Likewise, training began with random initial weights in [-0.5, 0.5] and was halted after 1000 training cycles, after which the generalization performance of the ANN was tested on the test set of 182 samples. The generalization errors, as well as the R and R² values for the train and test sets, are shown in Figure 5.18 and Table 5.5 respectively.

Figure 5.18: Generalization errors (RMSE) for different learning rates and momentum combinations at various training iterations on (5:35) HN ANN.


Table 5.5: R and R² on train and test sets for different learning rate and momentum combinations at various training iterations on the (5:35) HN ANN.

R train values for different learning rates and momentum
Number of epochs   Paulo    Yates    Ardo     Foddy    Govy     Swingler   Gopal
250                1.0000   1.0000   1.0000   1.0000   1.0000   1.0000     1.0000
500                1.0000   1.0000   1.0000   1.0000   1.0000   1.0000     1.0000
750                1.0000   1.0000   1.0000   1.0000   1.0000   1.0000     1.0000
1000               1.0000   1.0000   1.0000   1.0000   1.0000   1.0000     1.0000

R test values for different learning rates and momentum
Number of epochs   Paulo    Yates    Ardo     Foddy    Govy     Swingler   Gopal
250                0.9994   1.0000   1.0000   1.0000   0.9996   1.0000     0.9996
500                1.0000   0.9999   0.9999   0.9996   0.9999   1.0000     0.9999
750                0.9998   1.0000   0.9996   1.0000   0.9998   1.0000     0.9998
1000               0.9999   0.9999   0.9984   1.0000   0.9997   1.0000     1.0000

R2 train values for different learning rates and momentum
Number of epochs   Paulo    Yates    Ardo     Foddy    Govy     Swingler   Gopal
250                0.9802   0.9838   0.9800   0.9769   0.9835   0.9798     0.9858
500                0.9896   0.9895   0.9897   0.9892   0.9879   0.9878     0.9900
750                0.9893   0.9912   0.9893   0.9912   0.9885   0.9906     0.9909
1000               0.9894   0.9902   0.9904   0.9898   0.9915   0.9914     0.9898

R2 test values for different learning rates and momentum
Number of epochs   Paulo    Yates    Ardo     Foddy    Govy     Swingler   Gopal
250                0.9556   0.9416   0.9472   0.9080   0.9507   0.9391     0.9514
500                0.9651   0.9572   0.9600   0.9586   0.9551   0.9623     0.9560
750                0.9584   0.9564   0.9629   0.9497   0.9598   0.9657     0.9567
1000               0.9661   0.9615   0.9465   0.9594   0.9530   0.9436     0.9455


5.1.5 Training scenario 5: Training schedule
In the fifth training scenario the focus was on determining the most appropriate training schedule for ANNs in forecasting TCP/IP network traffic trends. In this series of experiments we sought to establish how different stopping criteria impact the generalization ability of ANNs; furthermore, we intended to examine the relationship between the generalization ability and the overtraining of ANNs. As usual, 547 of the 730 samples were allocated to the training set, whilst 182 samples were allocated to testing the performance of the trained ANNs. Once again, an ANN of architecture (5:35) and an ANN with 60 hidden neurons were trained for different numbers of iterations. Learning epochs of 100, 300, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000 and 10000 were selected for examination, largely on an arbitrary basis. On training, the weights were randomly initialized in the range [-0.5, 0.5], and the learning rate and momentum were fixed at 0.2 and 0.6 respectively. The popular BP (trainlm) algorithm was used to update the weights, and the Logistic sigmoid and Linear activation functions were used in the hidden and output layers respectively. The resulting ANNs were then assessed on the independent test set of 182 samples. As noted, the majority of the ANNs investigated thus far were trained for up to 1000 epochs; to investigate the effects of the overtraining phenomenon on the generalization ability of ANNs, we trained the ANNs for up to 10 000 epochs while the performance window plot was monitored as training progressed. If the curve went flat for many hundreds of epochs it was likely that the ANNs were overfitting the data, and training would be stopped immediately. To negate the effects of random initial weights, each training-testing run was performed twice and the results averaged. The results of the experiments are shown in Figures 5.19 and 5.20 respectively.
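A sketch of how such an epoch schedule can be swept for the (5:35) architecture is given below (the epoch values are those listed above; p, t, ptest and ttest are the assumed partitioned data from the earlier sketches, and in practice each run would be repeated twice and the results averaged):

% Sketch: train for an increasing number of epochs and track the test-set RMSE,
% to expose any overtraining effect on generalization ability.
epochSchedule = [100 300 500 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000];
rmseTest = zeros(size(epochSchedule));
for i = 1:length(epochSchedule)
    net = newff(minmax(p), [5 35 1], {'logsig','logsig','purelin'}, 'trainlm');
    net.trainParam.epochs = epochSchedule(i);
    net = train(net, p, t);
    y = sim(net, ptest);
    rmseTest(i) = sqrt(mean((ttest - y).^2));
end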


Figure 5.19: Results for different training iterations on a 60HN network: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).

Figure 5.20: Results for different training iterations on a (5:35) HN network: Generalization errors in RMSE (top); R and R² for train and test sets (bottom).


5.2 Discussion of results
In this section the results presented in section 5.1 are discussed in detail. Where applicable we compare our findings with previous studies conducted within similar research frameworks. It must be noted, however, that in any analysis of these results, attempting to completely disentangle the effects on ANN performance of each of the five parameters under investigation in this research is practically impossible. In so far as possible, in order to ensure that the results presented are indicative of the effects due to changes in the particular parameter under observation, in each set of experiments one parameter was varied and all other parameters were kept constant. Where there was an interaction of parameters, for instance in the results of ANNs trained with various learning rates at different training epochs, we discuss the results in terms of what we believe is the primary factor effecting changes in ANN performance and generalization ability, whilst noting the possibility of effects from other factors. The reader is, however, entitled to disagree as to which is the most influential parameter in a particular situation: this, of course, is the essence of research. Also, as in any practical situation, there are bound to be some erratic or unusual behaviours; where possible, and to the best of our understanding, we try to explain any such behaviour exhibited by the ANNs under investigation in this research.

5.2.1 Size of network architecture
We begin with the results of the effects of the size of the network architecture on the ability of ANNs to generalize well to novel data in the task of forecasting trends in a TCP/IP network traffic time series. Firstly we examine the results exhibited by a single hidden layer ANN architecture. The results in Figure 5.1 show that when the number of hidden neurons is in the range of 10 to 20 in a single hidden layer ANN, the generalization errors increase as more hidden neurons are added; however, this trend is suddenly reversed after 20 hidden neurons, where a precipitous decrease in generalization errors takes over. Figure 5.1 illustrates that, to a certain extent and in general, an increase in the number of hidden neurons is accompanied by a corresponding increase in the generalization ability of the ANN. This trend is confirmed by the gradual decrease in generalization errors as more hidden neurons were added beyond 20 hidden neurons, with the lowest errors attained in the range of 60 to 70 hidden neurons.


Quite interestingly, our findings are in sharp contrast to Occam's razor principle and to the overfitting guideline, which would have us believe that the network with 10 hidden neurons, by virtue of being the smallest, is bound to have better generalization ability than the rest. It is apparent from the results in Figure 5.1 that as good or better generalization ability in the task of forecasting TCP/IP network traffic occurs when there are more hidden neurons than this in a single hidden layer ANN. Our results reveal that an ANN with 50 hidden neurons exhibits much better generalization ability than an ANN with 20 hidden neurons, and an ANN with 70 hidden neurons exhibits much better generalization than an ANN with 50 hidden neurons. However, we note that when the number of hidden neurons exceeds a threshold of 70, the generalization errors begin to increase drastically again. This comes as no surprise, as the literature from Teixeira and Fernandes (2012) indicates that an increase in network size results in a more complex ANN that tends to overshoot the decision boundaries, causing generalization problems. The poor network performance between 10 and 20 hidden neurons can probably be attributed to the fact that this small number of hidden neurons did not have enough power to model and learn the data. Our results in Figure 5.1 are similar in principle to the findings of Morgan and Bourlard (1989), who recorded an increase in the generalization ability of their ANNs from 5 to 50 hidden neurons, after which the generalization abilities of their ANNs began to drop drastically, reaching an asymptote at 250 hidden neurons. They gave no definitive reason as to why the ANNs behaved in such a manner, apart from highlighting the need for further research. Also interesting to note is that there appear to be two regions along the generalization error curve in Figure 5.1 where the ANN's performance seems to be oblivious to changes in the size of the network. This pattern is apparent between 30 and 50 hidden neurons and between 60 and 70 hidden neurons. Observe that at 30 hidden neurons the ANN has an RMSE of 0.0221; adding a further 20 hidden neurons did not have any significant effect in terms of RMSE. In a strikingly similar fashion, at 60 hidden neurons the network has an RMSE of 0.186, and adding a further 10 hidden neurons does not appear to effect any changes in that regard. We suspect this may be due entirely to the initial random weights in those two ranges: since all other parameters were constant, the initial weights could perhaps have been oscillating about a fixed starting point. Analysis of the statistical measures in Figure 5.1 shows relatively good performance of the ANN for all the architectures examined. Values of R and R² are above 0.8 on both the train and test sets, indicating a good linear correlation and a

positive fit between the activations (predicted values) and targets (actual values). The ANN did not show any signs of overfitting during the course of these investigations.

Turning to the results produced by the two hidden layer architectures, we carefully scrutinize the results in Figure 5.2. These results reveal that using an equal ratio of first to second hidden layer neurons does not have any significant impact on the generalization ability of ANNs. On examining Figure 5.2, there is some variation in the generalization errors for the various network architectures examined, but no clear trend can be observed to warrant substantial conclusions as regards their ability to generalize to new data. For instance, the generalization errors seem to be on an increasing trajectory from network architecture (5:5) to network architecture (15:15); they then begin decreasing from network architecture (15:15), reaching a minimum at network architecture (25:25), where once again the errors begin to increase until network architecture (35:35), after which a successive decrease in errors is noted until network architecture (45:45). It appears that the network performance fluctuates after every successive 10 hidden neurons. Whilst these results are not exclusively unique to us, as Teixeira and Fernandes (2012) report a similar occurrence in their experiments, we could not find any substantial reason for this kind of erratic network behaviour, either in the literature or through our own analysis; however, once again we cannot completely rule out the possibility of random effects of the network weights. Judging from the nature of our results in Figures 5.1 and 5.2, it is safe to say that the generalization ability of a single hidden layer ANN with any number of hidden neurons is better than that of a two hidden layer model with an equal number of neurons distributed over the two hidden layers. We reached this conclusion based on the following observations: in Figure 5.1, the generalization error of an ANN with 60 hidden neurons in a single hidden layer is an RMSE of 0.185, whereas in Figure 5.2 the generalization error of an ANN with 60 hidden neurons distributed equally over 2 hidden layers is an excessive RMSE value of 0.35, almost a doubling of the generalization error. Suffice it to say, this trend is apparent in all the architectures examined, whose results are shown in Figures 5.1 and 5.2. These empirical results add weight to the theoretical findings reported by Judd (1990): "for a given number of hidden neurons, it is better to contain those hidden neurons in a single hidden layer than to distribute them over two or more hidden layers".


Analysis of the statistical measures from Figure 5.2 indicates a relatively good fit and a positive linear correlation between the targets and activations for most of the network architectures examined. The only notable exception was the ANN trained on a network architecture of (35:35), which attained an R² value of less than 0.7 on the test set, indicating perhaps that the size of the sliding time window was not sufficient to make accurate forecasts at that particular architecture.

We now turn to the results where different ratios of first to second hidden neurons were explored. Results from Figure 5.3 indicate that when the number of second hidden layer neurons is fixed, an increase in the number of first hidden layer neurons is generally accompanied by a decrease in the generalization ability of the ANNs. Figure 5.3 shows an increase in generalization errors as the size of the network is increased through the addition of more first hidden layer neurons. We note that from network architecture (5:5) to network architecture (20:5) there is a linear increase in generalization errors as the size of the network increases, but from network architecture (20:5) to (35:5), although an increase in generalization errors can be noted, it is rather less linear and bordering on constant. A sudden dip in generalization errors occurs at network architecture (40:5), corresponding to a ratio of first to second hidden neurons of 8:1. After this the generalization errors continue on their upward incline. This trend is most probably attributable to the increase in computation as the size of the ANN increases. Evaluation of the R values on both the train and test sets demonstrates a fairly reasonable linear correlation for the network architectures concerned, as any R value above 0.8 is indicative of a positive relationship between the targets and activations; however, a gradual decrease in the goodness of fit is noted as the size of the ANN increases. Note that R² values on the test set range from 0.4 to 0.6 for larger networks, i.e. network architectures (20:5) to (50:5). However, in terms of generalization ability, an ANN of architecture (40:5) performed exceptionally well, above the nominal average.

Moving on to the results illustrated in Figure 5.4, we note that when the previous trend is reversed and the number of first hidden layer neurons is instead fixed, an increase in the number of second hidden layer neurons is typically accompanied by an increase in generalization ability. However, as in most cases, this trend holds only up to a certain point. Careful analysis of the generalization error curve shown in Figure 5.4 reveals that whilst the generalization errors seem to increase from network architecture (5:5) to (5:15), an immediate but unsteady decrease follows for the greater part of the curve, culminating in a minimum at network architecture (5:35), which plateaus right up to network architecture (5:40). A temporary increase in error is noted at network architecture (5:45), followed by a mild dip in errors at network architecture (5:50). In Figure 5.4 we note that fixing the number of first hidden layer neurons and varying the second hidden layer neurons caused the generalization errors to fluctuate quite extensively. This perhaps suggests that the first hidden layer neurons are important in adding some stability to the ANN’s performance. Analysis of the statistical measures in Figure 5.4 shows comparatively good performance for the network architectures examined. Values of R and R² are above 0.8 on both the train and test sets, indicating a positive linear correlation and a good fit between the activations and targets. Our results from Figures 5.3 and 5.4 are in stark contrast to the empirical findings of Kudryci (1988), who concluded that in forecasting tasks a ratio of 3:1 between the first and second hidden layer neurons would warrant a good generalization ability. In our case this was not so, as the best generalization performance was attained when the ratio of first to second hidden neurons was 1:8, which is a far cry from the heuristic provided by Kudryci (1988).
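
The three two hidden layer experiments discussed above differ only in how the pair of hidden layer sizes is generated. The sketch below, under the same assumptions and placeholder variables as before, enumerates the (first:second) architectures of Figures 5.2 to 5.4 and reports a test RMSE for each.

    % Architecture lists for the two hidden layer experiments (Figures 5.2 - 5.4).
    equalRatio  = [(5:5:45)' (5:5:45)'];      % (5:5) ... (45:45)
    fixedSecond = [(5:5:50)' 5*ones(10,1)];   % (5:5) ... (50:5)
    fixedFirst  = [5*ones(10,1) (5:5:50)'];   % (5:5) ... (5:50)
    archs = [equalRatio; fixedSecond; fixedFirst];
    for i = 1:size(archs, 1)
        net = newff(xTrain, tTrain, archs(i, :));   % two hidden layers [n1 n2]
        net = train(net, xTrain, tTrain);
        yTest = sim(net, xTest);
        fprintf('(%d:%d)  test RMSE = %.4f\n', archs(i,1), archs(i,2), ...
                sqrt(mean((yTest - tTest).^2)));
    end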

The final set of results in this sub-section summarizes the effects of different numbers of layers and neurons on the generalization performance of ANNs in forecasting TCP/IP network traffic trends. Results from Figure 5.5 indicate that, based on the generalization errors, almost all two hidden layer and three hidden layer models did not perform significantly better than the one hidden layer model. As Figure 5.5 shows, the generalization errors achieved by one hidden layer architectures were in most instances better than those of two and three hidden layer architectures, considering the low number of neurons. This confirms the findings of researchers such as Zhang et al. (1998), who proposed that simple ANN models are often adequate for forecasting linear time series. Zhang et al. (1998) claim that one hidden neuron was sufficient for optimal generalization and accurate forecasting. This is despite the fact that the number of hidden neurons in their study ranged from 1 to 10, which is rather limited compared to our study, which explored architectures as large as 80 hidden neurons in as many as three hidden layers. In Figure 5.5, the best generalized architecture was a single hidden layer of 60 hidden neurons. A further enlargement of the ANN by way of a second or third hidden layer of neurons did not have a significant impact on network performance. The use of two or three hidden layers did not in any significant way improve the generalization ability of the ANN; in fact, in some cases it degraded it. However, for some strange reason, using a three hidden layer ANN gave a slightly better generalization performance than two hidden layers, especially when the two hidden layer model had either too few neurons or too many. It is also interesting to note that the best generalization performance for a two hidden layer ANN, of network architecture (40*2), was worse than that of the least generalized single hidden layer architecture of 20 hidden neurons; equally, the best generalized three hidden layer model, i.e. (40*3), performed in an almost similar fashion to the worst generalized single hidden layer architecture. These findings echo claims made by researchers such as Cybenko (1989) that two hidden layers are sufficient to approximate any function and that additional layers would be computationally redundant. Insofar as the statistical measures in Figure 5.5 are concerned, it is safe to say that all network architectures indicated a good fit and a positively correlated linear relationship between the activations and targets, judging by the R and R² values, all of which are above 0.7 and in some cases above 0.9; however, all single hidden layer models performed marginally better than the other models. It is important also to note that for all the network architectures considered there were no visible signs of overfitting.

5.2.2 Training set size

In this sub-section, we turn our attention to the results showing the relationship between the size of the training set and the generalization exhibited by an ANN in forecasting TCP/IP network traffic trends. We begin with the results given in Figure 5.6, which show the generalization errors versus the size of the training set for a single hidden layer ANN with 60 hidden neurons. On examining Figure 5.6, one cannot ignore the exponential decrease in generalization errors as the size of the training set is enlarged. This trend continues uninterrupted until a minimum RMSE is attained at the full training set of 547 samples. Our results are similar to those of Kavzoglu (1999), who, in conducting empirical experiments on ANNs, recorded an exponential decrease in generalization errors as the size of the training set increased. Quite evidently from the results in Figure 5.6, the generalization ability of a single hidden layer ANN trained on a reduced training set is significantly degraded compared with an ANN trained on the full training set of 547 samples. Perhaps one possible reason for this is that the full training set was more representative of the problem space than the altered variations of it. This undoubtedly adds another dimension to the whole debate, that of the quality of the training set. Our findings are contrary to those of Zhang et al. (1998), who suggested that a training set size of only 5% of the total data is sufficient for a good generalization ability, especially for a single hidden layer ANN. Also, Lange et al. (1997) introduced the idea of a critical training set size. Through experimentation they found that examples beyond this critical size do not improve generalization, illustrating that an excess of training patterns brings no real gain. They state that this critical training set size is problem-dependent. Although we are not entirely sympathetic to this viewpoint, it appears that the argument raised by Lange et al. (1997) is quite valid, judging by the nature of the generalization error curve. At the full training set of 547 samples the generalization errors seem to plateau, raising doubts as to whether any increase in training set size beyond that point would have further decreased the errors. We are, however, reluctant to draw substantial conclusions in that regard because we did not train the networks beyond the full allocated training set size of 547 samples. On analysis of the statistical measures in Figure 5.6, we note that at lower proportions of the training set the R² values on the test set were largely lower than at higher proportions of the training set. In fact, at 10% of the training set a negative R² value on the test set was recorded, indicating a very bad fit between the activations and targets. Generally the ANN exhibited poor performance on both R and R² for training set sizes below 80% of the full training set.
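
The training set size experiment of Figure 5.6 can be reproduced in outline as follows. This is a sketch only: xTrainFull and tTrainFull stand for the full 547-sample training set (with samples as columns), the test variables are as before, and the random subsampling shown here is an assumption about how the reduced sets were drawn.

    % Train a 60-hidden-neuron network on increasing fractions of the training set.
    fractions = 0.1:0.1:1.0;
    nFull = size(xTrainFull, 2);                    % 547 samples in the full set
    for i = 1:numel(fractions)
        idx = randperm(nFull);
        idx = idx(1:round(fractions(i) * nFull));   % random subset of the samples
        net = newff(xTrainFull(:, idx), tTrainFull(:, idx), 60);
        net = train(net, xTrainFull(:, idx), tTrainFull(:, idx));
        yTest = sim(net, xTest);
        fprintf('%3.0f%% of training set: test RMSE = %.4f\n', ...
                100 * fractions(i), sqrt(mean((yTest - tTest).^2)));
    end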

The next set of results is that exhibited by a two hidden layer ANN with a network architecture of (5:35). Once again the trend in Figure 5.7 indicates a decrease in generalization errors as the size of the training set is increased. However, this time the trend is not as exponentially smooth as in the previous case. This is evidenced by the huge fluctuations in generalization errors witnessed from 10% to 50% of the full training set of 547 samples, after which the generalization error curve gradually assumes a smooth exponential decline in errors, reaching a minimum at the full training set. Whilst we are not exactly certain as to the causes of the slightly erratic behaviour of the ANN between 10% and 50% of the full training set of 547 samples, one possible reason could be that within this range particular instances within the data set are a better representation of the problem space than others. If these instances are randomly spread throughout the data set then the likelihood of their being selected for training is equally random. Another possible cause of the erratic behaviour of the ANN in the concerned range of training samples could have been the addition of an extra hidden layer; perhaps the addition of a second hidden layer caused the ANN to be unstable. Analysis of the statistical measures indicates that at 60% of the full training set and below, the R² values on the test sets range mostly between −0.8 and 0.6, revealing a poor fit between the activations and targets. As largely expected, the best performances in terms of R and R² were recorded at sample sizes larger than 70% of the training set. The generalization results show that the best generalization performance of the single hidden layer architecture in Figure 5.6 is attained at a RMSE of approximately 0.25, whilst for the double hidden layer architecture in Figure 5.7 the same is attained at a comparably similar error level. These RMSE values were both attained at the same percentage of the training set, i.e. at the full 547 samples. What is perhaps most striking about these results is the comparative equivalence of the generalization errors in Figures 5.6 and 5.7, despite the different network architectures. Our results seem to contradict the empirical findings of Baum and Haussler (1989) and Weigend et al. (1990), who state that more training data are required to achieve good generalization ability for larger ANNs. For this research, generalization ability was not affected by using larger ANNs, despite the fact that limited training data were available. This is also in stark contrast to the Dataset size guideline of Richard (1991).

Moving on, Figure 5.8 shows the generalization performances of different heuristics provided by various authors for selecting the optimal training set size for the generalization ability of ANNs. It can be noted again that as the number of training samples increases, there is a gradual, almost logarithmic decrease in generalization errors. For the case study considered, the heuristic proposed by Hush et al. (1992) suggests an insufficiently small number of training samples, whilst on the other hand the heuristics recommended by Klimasauskas (1993) and Baum and Haussler (1989) indicate training set samples that are relatively close to the optimum solution. R and R² values at or above 0.8 are statistically significant; of the results in Figure 5.8, only the Baum and Haussler (1989) heuristic satisfies that condition. None of the networks examined showed any signs of overfitting.


5.2.3 Learning algorithm

We now turn the subject of our discussion to the experiments that examined the effect the choice of learning algorithm has on the generalization ability of ANNs in forecasting TCP/IP network traffic trends. Figure 5.9 shows the generalization errors of several single hidden layer ANNs trained on each of the five different BP algorithms. We note that the difference between the generalization errors due to the various algorithms became more pronounced as the size of the networks increased. This effect is clearly illustrated in Figure 5.9. Note that at 20 hidden neurons ANNs trained on the Conjugate gradient backpropagation with Fletcher-Reeves updates (traincgf), Scaled conjugate gradient backpropagation (trainscg) and Resilient backpropagation (trainrp) algorithms performed best in terms of generalization ability, as they consistently displayed the lowest generalization errors. An ANN trained on the Gradient descent with momentum backpropagation (traingdm) algorithm was the least generalized. This pattern of results remains virtually undiminished up to 60 hidden neurons, at which point the generalization errors steadily become divergent. The Conjugate gradient backpropagation with Polak-Ribiére updates (traincgp) algorithm, which was previously performing relatively below the Conjugate gradient backpropagation with Fletcher-Reeves updates algorithm, gradually outperforms the latter as more hidden neurons are added. Further, the Gradient descent with momentum backpropagation algorithm, which was previously the least performing algorithm, suddenly outperforms the Resilient backpropagation algorithm from 60 hidden neurons to 80 hidden neurons. Results also show that the performance of the Conjugate gradient backpropagation with Powell-Beale restarts (traincgb) algorithm was least affected by the size of the network in comparison with the other learning algorithms. Note also that at 80 hidden neurons the Conjugate gradient backpropagation with Powell-Beale restarts algorithm performs marginally better than the Conjugate gradient backpropagation with Fletcher-Reeves updates algorithm.
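
The algorithm comparison can be scripted by changing only the network's training function. The Matlab function names below correspond to the algorithms named in this sub-section; the rest of the sketch reuses the assumed placeholder data and the newff/train/sim interface.

    % Compare the Backpropagation variants across increasing network sizes (cf. Figure 5.9).
    algs = {'traincgf','traincgp','traincgb','trainscg','trainrp','traingdm'};
    hiddenSizes = 20:10:80;
    for a = 1:numel(algs)
        for h = 1:numel(hiddenSizes)
            net = newff(xTrain, tTrain, hiddenSizes(h));
            net.trainFcn = algs{a};                   % select the learning algorithm
            net = train(net, xTrain, tTrain);
            yTest = sim(net, xTest);
            fprintf('%-8s %2d neurons: test RMSE = %.4f\n', algs{a}, ...
                    hiddenSizes(h), sqrt(mean((yTest - tTest).^2)));
        end
    end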

The simulation results show strong evidence of the superiority of the Conjugate gradient backpropagation with Polak-Ribiére updates and Conjugate gradient backpropagation with Fletcher-Reeves updates algorithms in terms of generalization. On the other hand, the Resilient backpropagation and Gradient descent with momentum backpropagation algorithms were the worst performing algorithms. What is not surprising, though, is the fact that all the Conjugate family of algorithms, i.e. Conjugate gradient backpropagation with Powell-Beale restarts, Conjugate gradient backpropagation with Polak-Ribiére updates and Conjugate gradient backpropagation with Fletcher-Reeves updates, performed much better than all the other algorithms used in the study, a fact widely verified in literature (Jacobs 1998; Joy 2011). Our results in Figure 5.9 show that Backpropagation Neural Networks (BPNN) trained on the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm exhibit the best generalization ability in forecasting TCP/IP traffic trends. Analysis of the statistical measures in Table 5.1 shows that for the ANNs trained on 4 of the 5 algorithms there was a positive linear relationship between the targets and activations, as indicated by the train and test R values, which are all above 0.8. The results also show that for the same networks a good fit between the predicted values and actual values was realized, judging by the R² values on the train and test sets, which are above 0.8. The most obvious exceptions were the ANNs trained on the Resilient backpropagation and Gradient descent with momentum backpropagation algorithms, which consistently exhibited extremely low R² values, below 0.2, on the test set. Also, none of the ANNs examined showed any signs of overfitting.

On a rather different note, limited investigations into the speed of convergence and the accuracy of the different learning algorithms were also conducted. A two hidden layer ANN with a network architecture of (5:35) was trained on the five various algorithms and the performance of the learning algorithms assessed. We begin by examining the results exhibited by the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm. From Figure 5.10, we note that an ANN trained on the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm achieved a minimum prediction MSE of 0.00099956, and convergence was achieved after 697 epochs. These results indicate that the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm has a fast convergence. Figure 5.11 shows the results of the performance of the Resilient backpropagation algorithm. Note that the minimum MSE attained is 0.00099956, and convergence was achieved after 942 epochs. The Resilient backpropagation algorithm achieved the same level of accuracy as the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm but was much slower, as shown by the fact that convergence was attained after 942 epochs, compared to the latter's 697 epochs.
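
Convergence speed and accuracy can be read from the training record returned by train. The sketch below assumes an MSE goal of 0.001 (inferred from the reported minimum MSE values, so it is an assumption) and the (5:35) architecture used in these tests.

    % Measure epochs to convergence and final training MSE for one algorithm.
    net = newff(xTrain, tTrain, [5 35]);
    net.trainFcn = 'traincgp';               % Polak-Ribiere conjugate gradient
    net.trainParam.goal   = 1e-3;            % stop once the training MSE reaches 0.001
    net.trainParam.epochs = 1000;            % upper bound on training iterations
    [net, tr] = train(net, xTrain, tTrain);  % tr is the training record
    fprintf('Stopped after %d epochs, final MSE = %.8f\n', tr.epoch(end), tr.perf(end));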


Figure 5.12 shows the results of the performance of the Conjugate gradient backpropagation with Fletcher-Reeves updates algorithm. This implementation shows that this algorithm converges faster than the previous two. We note that the curve stabilizes earlier than in the previous algorithms; from Figure 5.12 we find that the minimum MSE is 0.00099997, and convergence was attained after 615 epochs. Therefore the results indicate that whilst the Fletcher-Reeves algorithm is faster than the previous algorithms, its shortfall is a marginally lower accuracy. Figure 5.13 shows the performance of the Scaled conjugate gradient backpropagation algorithm. From the results we note that the minimum MSE is 0.00099961, and convergence was reached after 816 epochs. From the training results, we deduce that the Scaled conjugate gradient backpropagation algorithm is generally faster than the Resilient backpropagation algorithm, but slightly less accurate. Figure 5.14 shows the training result of the Gradient descent with momentum backpropagation algorithm. The minimum MSE is 0.029197, and the results show that the algorithm failed to converge after 1000 epochs. This means that for the case study under consideration, the Gradient descent with momentum backpropagation algorithm was the slowest and least accurate.

In summary, judging by the quality of results, if we are to make a fair assessment in terms of a balance between the speed of convergence and accuracy of results, the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm had the best performance. This is most probably due to the fact that in the Conjugate family of algorithms the weights are adjusted along conjugate directions. That generally produces faster convergence and better accuracy than the rest of the BP algorithms. It seems to have a very efficient Matlab implementation; however we commit our findings to further scrutiny due to the limited nature of these investigations.

5.2.4 Learning rate

In this sub-section we discuss the empirical results of the experiments regarding the relationship between ANN generalization and the learning rate. We begin our analysis by examining the results in Figure 5.15, which show the generalization errors of various single hidden layer ANNs trained with various learning rates. From Figure 5.15 it appears that selecting the optimum learning rate for a given problem is indeed a complex task. From the results, we note that the best performing ANN was a 60 hidden neuron architecture trained with a learning rate of 0.9. We note that for many of the architectures examined the best generalization errors occurred at lower learning rates, mostly between 0.2 and 0.7; however, it is difficult to pinpoint a single universal value that we can safely conclude to be effective in all cases. In many cases, though, a learning rate of 0.6 seemed to give the most reasonable generalization errors. Our results indicate that the smaller ANNs performed badly in these experiments, especially the 20 hidden neuron and 40 hidden neuron architectures. These two architectures also displayed the most erratic behaviour, whether trained with smaller or larger learning rates, indicating the intrinsic relationship between the learning rate and the network architecture, and emphasizing the sensitivity of network performance to the learning rate. It was also noted that small learning rates produced consistent and better results, whereas large learning rates appeared to cause oscillations and inconsistent results. The statistical measures in Table 5.2 show relatively good performance of the ANN for all the architectures concerned. Values of R and R² are above 0.8 on both the train and test sets, indicating a good linear correlation and a positive fit between the activations (predicted values) and targets (actual values). The ANNs examined did not show any signs of overfitting.
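
A learning rate sweep of the kind underlying Figure 5.15 only requires setting trainParam.lr before training. The sketch below assumes the standard gradient descent training function (for which the learning rate parameter is defined) and a 60-hidden-neuron network; the data variables are the usual placeholders.

    % Sweep the learning rate for a fixed single hidden layer architecture.
    rates = 0.1:0.1:0.9;
    for i = 1:numel(rates)
        net = newff(xTrain, tTrain, 60);
        net.trainFcn = 'traingd';            % plain gradient descent Backpropagation
        net.trainParam.lr = rates(i);        % learning rate under test
        net.trainParam.epochs = 1000;
        net = train(net, xTrain, tTrain);
        yTest = sim(net, xTest);
        fprintf('lr = %.1f : test RMSE = %.4f\n', rates(i), ...
                sqrt(mean((yTest - tTest).^2)));
    end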

In the same vein, we trained an ANN having 60 hidden neurons at learning rates of 0.2, 0.4, 0.6 and 0.8 for different numbers of epochs in order to ascertain how the impact of the learning rate on generalization ability varies with an increase in the number of training iterations. The results of that endeavour are shown in Figure 5.16. From Figure 5.16 we note that the best generalization performance was attained by the ANN when trained with a learning rate of 0.6. It is evident from the results that the generalization ability of the ANN increased as the size of the learning rate increased; however, this was only true up to a certain threshold, beyond which any further increase in the learning rate had adverse effects. This is evident from the generalization errors incurred. From Figure 5.16 we note that the ANN performed worst and was most erratic at a learning rate of 0.2. When the learning rate was increased to 0.4 the generalization ability of the ANN dramatically improved. At a learning rate of 0.6 the ANN was at its best in terms of generalization performance; however, when a learning rate of 0.8 was used, instead of continuing this trend, the generalization ability of the ANN decreased. It is also interesting to note that for all the learning rate and iteration combinations the ANN seemed to perform best in terms of generalization ability at 750 epochs. Additionally, training time was fast for learning rates of 0.2 and 0.4, and reasonably fast for rates of 0.6 and 0.8. We noted that larger learning rates took many epochs to reach their maximum generalization accuracy, which was ultimately poor anyway; the small learning rates took longer to reach the same accuracy, but yielded no further improvement in accuracy. Those seeking simply the learning rate that produces the fastest convergence would probably settle on a learning rate of 0.4. However, doing so would mean sacrificing generalization accuracy, which could be more efficiently achieved by using a larger learning rate: striking a balance in that trade-off is a necessity. For these experiments, however, we did not require the training process to converge; rather, the training process was used to perform a direct search for a model with superior generalization performance. Once again the statistical measures in Table 5.3 show relatively good performance of the ANN during the course of the investigations. Values of R and R² are above 0.8 on both the train and test sets, indicating a good linear correlation and a positive fit between the activations (predicted values) and targets (actual values). The ANN also did not show any signs of overfitting.

We now turn our discussion to the results shown in Figure 5.17, which exhibit the generalization performance of a two hidden layer ANN of architecture (5:35). The results in Figure 5.17 indicate that for forecasting a TCP/IP network traffic time series, the ANN was more sensitive to smaller learning rates than to larger ones. The generalization errors show that the ANN had significantly better generalization ability when trained with a learning rate of 0.01 than with 0.1. This was more pronounced at 500 epochs. Although a learning rate of 0.1 outperformed the 0.01 learning rate between 750 and 800 epochs, we did not read much into this as it was short-lived. The statistical measures in Table 5.4 show extremely good results for the R and R² values, with values above 0.8 for both learning rates examined.

To conclude the discussion in this sub-section, we examine the results of different combinations of learning rate and momentum heuristics given by various authors. The reason for this was to ascertain whether the effect of different learning rates on generalization ability is the same regardless of momentum. The results of this assessment are given in Figure 5.18. Several conclusions can be deduced from these results regarding the effect of different learning rate and momentum combinations on the generalization ability of the ANNs. From the results we note that the 0.1-0.9 learning rate-momentum combination given by Foody et al. (1996) failed to produce satisfactory generalization performance. We note extensive oscillatory behaviour in terms of generalization errors at those values of learning rate and momentum. As noted by Kavzoglu (1999), this could perhaps be attributed to the fact that the use of a large momentum term increases the effect of oscillations by extending the steps taken in a faulty direction. Perhaps the ANN could have been stuck in a local minimum resulting from the selection of a large momentum term. It can be seen from Figure 5.18 that the best generalization results were overall produced by the combinations that used small learning rate-momentum values, such as the 0.25-0.2 of Swinger (1996) and the 0.1-0.3 of Ardö et al. (1997). In fact, for both cases, consistently good generalization results were produced. One other important observation we made was that the addition of the momentum term considerably slows down the learning process. Although the selection of an appropriate learning rate-momentum combination is a mammoth task, it appears that a very small learning rate, roughly 0.1, and a moderately high momentum term between 0.2 and 0.4, provided near-optimal solutions for the task considered.
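
The learning rate-momentum combinations cited above translate directly into the lr and mc training parameters of gradient descent with momentum. The sketch below is illustrative only; the (5:35) architecture is assumed here, and the data variables are placeholders as before.

    % Learning rate / momentum pairs reported by the cited authors (cf. Figure 5.18).
    combos = [0.10 0.9;    % Foody et al. (1996)
              0.25 0.2;    % Swinger (1996)
              0.10 0.3];   % Ardo et al. (1997)
    for i = 1:size(combos, 1)
        net = newff(xTrain, tTrain, [5 35]);
        net.trainFcn = 'traingdm';            % gradient descent with momentum
        net.trainParam.lr = combos(i, 1);     % learning rate
        net.trainParam.mc = combos(i, 2);     % momentum constant
        net = train(net, xTrain, tTrain);
        yTest = sim(net, xTest);
        fprintf('lr = %.2f, mc = %.2f : test RMSE = %.4f\n', ...
                combos(i,1), combos(i,2), sqrt(mean((yTest - tTest).^2)));
    end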

Now to answer the fundamental question: how does the use of a momentum term affect the performance of the learning rate? From Figure 5.18, in general, the results obtained in these tests were in agreement with those obtained when a momentum value of 0.2 was used in previous experiments. However, the behaviour of the ANNs was less controlled with increasing momentum, as a result of the larger steps taken in weight space. In fact, when a momentum value of 0.8 was used in conjunction with a learning rate of 0.2, and when a momentum value of 0.9 was used in conjunction with a learning rate of 0.1, the steps taken in weight space were too large and divergent behaviour occurred during training. The size of the training epochs as a function of the network structure, and its effect on the learning rate and momentum terms, needs further investigation so that a more generalized result can be proposed. The statistical measures shown in Table 5.5 display R and R² values above 0.8, indicating that the inputs were sufficient for mapping the outputs. None of the ANNs examined in these experiments showed any signs of overfitting.

5.2.5 Training schedule

Finally we turn the subject of our discussion to the results regarding the effects the training schedule has on the generalization ability of ANNs in the task of forecasting a TCP/IP network traffic time series. We begin by discussing the results exhibited by a single hidden layer ANN having 60 hidden neurons, which was trained for various numbers of epochs. These results are shown in Figure 5.19. The errors on the generalization curve reveal a general increase in generalization ability as the ANN is trained with an increasing number of iterations. This pattern, however, is valid only up to 1000 iterations, after which the generalization performance takes on a negative trajectory. One can observe that the generalization errors begin to increase immediately after 1000 epochs, become constant between 4000 and 5000, marginally drop at 6000, and increase again after 6000, only to marginally dip at 9000 epochs. This indicates that after 1000 epochs the generalization errors were largely unstable and fluctuated quite extensively as the ANN struggled to find the best set of weights to approximate the model training parameters. Another interesting observation is that the generalization error generated at 3000 epochs is nearly equal to the generalization error incurred at 6000 epochs, illustrating that the addition of more training iterations did not assist the ANN to learn any better; in fact, the latter only resulted in a more computationally expensive and time-consuming operation. Our results indicate that 1000 training iterations were sufficient to allow the model to reach an optimal solution, which is in agreement with the empirical findings of Attoh-Okine (1999), who conducted similar investigations in forecasting stock prices and arrived at the same conclusions. Analysis of the statistical measures in Figure 5.19 shows that for up to 3000 epochs the ANN exhibited a positive linear relationship between the targets and activations, as indicated by the train and test R values, which were all above 0.7. The results also show that for the same number of epochs a good fit between the predicted values and actual values was realized, judging by the R² values on the train and test sets, which are above 0.8. After 3000 epochs, we note that although the R values on the train and test sets and the R² values on the train set remained sufficiently large, the R² values on the test set suffered hugely as a result of the additional training iterations. This resulted in the ANN attaining negative R² values for networks trained for 8000, 9000 and 10 000 iterations. Suffice it to say, contrary to the findings of Morgan and Bourlard (1989), the ANN did not seem to show any signs of overfitting even though it was “overtrained” for a considerable number of epochs - as many as 10 000. Although Morgan and Bourlard (1989) trained their ANNs for up to 280 000 epochs, we believe 10 000 epochs is reasonably sufficient to warrant conclusions with regard to the relationship between ANN generalization and overtraining, especially considering that Richards (1991), in his quest to investigate the same question on the relationship between overtraining and ANN generalization, trained his ANNs for up to 50 000 epochs, reaching an almost similar conclusion to ours. This leads us to the question: is there such a thing as overtraining?
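
The training schedule experiment amounts to varying the epoch limit while holding everything else fixed. A minimal sketch follows, with an illustrative list of epoch sizes; the error goal is set to zero so that training is not stopped early, and the data variables remain placeholders.

    % Vary the number of training epochs for a 60-hidden-neuron network (cf. Figure 5.19).
    epochList = [100 300 500 750 1000 2000 3000 5000 7000 10000];
    for i = 1:numel(epochList)
        net = newff(xTrain, tTrain, 60);
        net.trainParam.epochs = epochList(i);   % training schedule under test
        net.trainParam.goal = 0;                % never stop early on an error goal
        net = train(net, xTrain, tTrain);
        yTest = sim(net, xTest);
        fprintf('%6d epochs : test RMSE = %.4f\n', epochList(i), ...
                sqrt(mean((yTest - tTest).^2)));
    end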

Turning to the results of a two hidden layer model with a network architecture of (5:35), careful analysis of the generalization curve in Figure 5.20 shows the generalization errors gradually decreasing as the epochs are increased from 100 to 300. The generalization errors are then constant from 300 to 2000 epochs, after which a mild fluctuation in errors can be noted up to 7000. The errors then begin to rise drastically, reaching a peak at 9000 epochs and suddenly dropping at 10 000. This kind of behaviour, such as that exhibited by the ANN from 7000 to 10 000 epochs, is nothing short of erratic, and whilst we cannot be certain as to why the ANN behaved in the manner it did, one thing we are certain of is that it was not due to any form of overfitting, as the ANN did not show any signs of overfitting throughout the course of these investigations. We also note that, although the generalization errors between 300 and 2000 epochs remain almost constant, one cannot ignore the marginal difference in generalization errors accrued by the network between 300 and 7000 epochs. It is safe to say that, within that range, the addition of more training iterations really did not have a significant impact on the generalization ability of the ANN. Perhaps Ockham was right in his assessment that “it is pointless to do with more what can be done with less” (Zurada 1994). For further analysis of the model’s prediction performance we examined the statistical measures. The statistical measures in Figure 5.20, on the other hand, tell a completely different story. Apart from the training conducted at 9000 epochs, which resulted in an R² value on the test set below the nominal value of 0.7, the network exhibited a good linear correlation and a positive fit between the network activations (predicted values) and network targets (actual values). In conclusion, our results indicate that in forecasting TCP/IP traffic the epoch size chosen should be a function of the relative importance of training speed and generalization ability. An epoch size larger than 1000 is unlikely to improve generalization ability, but it will, on average, certainly increase training time.

5.3 Conclusion

The major issues that this research has identified to be of primary importance for the generalization ability of ANNs have been investigated in this chapter. The chapter has presented ANN performance under several real world experimental configurations. A comprehensive description and analysis of each experiment was conducted and the results presented and analysed. The experiments were done under different settings and shared the same variables of measurement. The results were presented comprehensively in 20 conclusive graphs. Although a number of conclusions and observations can be deduced from the results of the experiments, they were not enumerated in this chapter in order to avoid possible repetition. Instead, they will be given in the next chapter, Chapter VI, since that chapter is intended to present some guidelines that will be largely extracted from the results given in this chapter. It is hoped that such guidelines, derived from a vast number of experiments, will be useful and beneficial for a wide variety of ANN applications in time series forecasting of TCP/IP network traffic.


Chapter 6: Conclusions and Future work

In research, just as in life, hindsight is always better than foresight. Our research has been an empirical investigation of a complex question: how do the size of the network architecture, the size of the training set, the size of the learning rate, the choice of Backpropagation learning algorithm and the schedule of training, both individually and collectively, affect the ability of an ANN to learn the training data and to generalize well to previously unseen data? Utilizing a real world TCP/IP network traffic time series, we explored this question in depth by empirically conducting several experiments. This chapter presents a discussion of the findings of the research in parallel with the research objectives presented in the first chapter. It states the extent to which the research statement has been fulfilled by providing the main achievements and conclusions of the research. A rundown of the thesis is presented and a comprehensive summary of the important results of the research is offered. In addition, the chapter concludes by describing potential subjects of interest for future research in this area.

6.1 Specific conclusions

In this section we detail the specific conclusions with regard to our research objectives; the overall conclusions follow at the end of the chapter. The research goals provide the structure of this section: each research objective as stated in Chapter 1 is recalled, answered and discussed accordingly. As a whole, the research done and discussed in this thesis pertains to the research topic “The generalization ability of Artificial Neural Networks in forecasting TCP/IP network traffic trends”. All six chapters of this thesis explain in detail how the research was undertaken in providing comprehensive analysis of, and deductions about, the research problem. The research findings have achieved the research goals and contributed to the existing body of knowledge on TSF using ANNs by providing comprehensive conclusions in response to the problem statement defined in Chapter 1. The research goals, outlined in section 1.3, are revisited and followed in the same order according to the full scope of this research. Addressing the individual research goals, conclusions can be made for each one.

6.1.1 First objective of the research

• To determine the state of the art in TSF by reviewing ANNs.


The research produced an extensive review of ANNs. The structure of MLPs, as well as their advantages and limitations over SLPs, was reviewed in Chapter 2. The Backpropagation algorithm, which is the most common learning algorithm for training ANNs, was equally reviewed. The research came up with the following general conclusions on ANNs:

• they consist of several simple processing elements called neurons;
• they are well suited to parallel computation, with each neuron operating independently of the other neurons;
• they contain a high degree of interconnection between neurons;
• they contain links between neurons, each with an associated weight (scalar value);
• they have adaptable weights that can be modified during training;
• they can be trained to perform specific tasks, i.e. to produce the desired output for a given input;
• the most popular ANN architecture is the MLP;
• the most popular algorithm for training ANNs is Backpropagation;
• they have been used in a number of fields with a great deal of success.

6.1.2 Second objective of the research

• To determine the current methods of TSF

The research looked at the state of the art in time series analysis and forecasting. A number of different time series forecasting methods were reviewed, together with their advantages and limitations, mostly in comparison with ANNs. The use of ANNs for this work was then justified by considering their strengths over the traditional and conventional methods of TSF. The research came up with the following conclusions on TSF methods, both traditional and current:

• Time series methods are better suited to short-term forecasts (i.e. less than a year).
• TSF relies on sufficient past data being available, and on that data being of high quality and truly representative.
• Time series methods are best suited to relatively stable situations. Where substantial fluctuations are common and underlying conditions are subject to extreme change, time series methods may give relatively poor results.
• ANNs, unlike other methods, can model any function without prior knowledge of the data generating process.


6.1.3 Third objective of the research

• The third objective of this research was to investigate how the generalization ability of an ANN is affected, both collectively and individually, by the following factors:

6.1.3.1 Size of network architecture

The experimental results regarding the relationship between the size of the network architecture and the generalization exhibited by an ANN were discussed in section 5.2.1. Each of the ANNs, when trained with a fixed training set, exhibited comparable levels of performance and generalization across a broad range of hidden units. ANNs having fewer hidden neurons generally did not generalize better than ANNs having more hidden neurons; indeed, an increase in hidden neurons was generally accompanied by an increase in the generalization ability of the ANN, up to a point. Also, ANNs having two or three hidden layers did not perform significantly better than ANNs having one hidden layer. The following conclusions can be drawn from the results:

• Small ANNs learn tasks more quickly, but not necessarily better.
• Increasing the number of neurons improves the performance of the system. Nevertheless, too many neurons decrease the generalization ability of the model due to over-memorizing. This means that having too many or too few neurons in the model is detrimental to the generalization ability of ANNs.
• Although the size of the ANN affects its generalization ability, it has little or no influence on the goodness of fit and linear correlation of the models. In our experiments the R and R² values for the train and test sets indicate a good fit and linear correlation between the targets and activations regardless of the size of the network examined.
• When the number of neurons in the models is the same or fewer, the generalization ability of a one hidden layer ANN is better than that of two and three hidden layer models.
• Hecht-Nielsen (1987) provided a proof that a single hidden layer of neurons, operating a Sigmoidal activation function, is sufficient to model any solution surface of practical interest. Our results seem to support this conclusion, as one hidden layer was sufficient for good generalization for our ANNs, further reaffirming the assertion made by Hush et al. (1992) regarding the sensitivity of ANNs to the number of hidden layers.
• Optimal network architecture is highly problem-dependent.


• Perhaps the most important conclusion derived from this study is that large networks do not always improve the generalization ability of ANNs.

6.1.3.2 Training set size

The experimental results regarding the relationship between the size of the training set and the generalization exhibited by an ANN are discussed in section 5.2.2. The results indicate that altering the size of the dataset used to train an ANN has an obvious impact on its generalization ability. Evidence has been presented that contradicts some of the current ANN design schools of thought, which imply that smaller quantities of training data necessarily produce better-quality forecasting models. The empirical results from this research indicate that frequently a larger quantity of training data will produce a better-generalized Backpropagation ANN model. Stathakis (2009), who arrived at conclusions similar to our own, proposed that this trend is apparent mostly due to one of two reasons: 1) the ANN is suffering from overfitting when training on small data sets, or 2) the ANN is really being affected by the size of the data. After carefully observing the performance plot during training of our ANNs, it was determined that the ANNs in our research were in fact not suffering from the effects of overfitting and that it was instead the size of the training set that was causing the ANNs to perform poorly. We concluded therefore that the poor performance on small training sample sizes could be attributed to one of two possibilities: 1) the data set was so diverse that the ANNs could not represent it accurately without seeing 100% of the instances during training, or 2) particular instances within the data set are a better representation of the problem space than others, and if these instances are spread throughout the data set, the likelihood of their being selected for training is proportionate to the size of the subset. Heuristics proposed to estimate the optimum number of training samples were compared; the results show that for the case study considered the heuristics proposed by Baum and Haussler (1989) and Klimasauskas (1993) are frequently good choices.

6.1.3.3 Size of learning rate

The experimental results regarding the relationship between the learning rate and ANN generalization were discussed in section 5.2.4. We discovered that for a single hidden layer ANN, a learning rate of 0.6 gave the most reasonable generalization errors, particularly at an epoch size of 750, and that a double hidden layer ANN was more sensitive to larger learning rates than to smaller ones. We also noted that smaller learning rates decreased the training time quite significantly.

With regard to the impact the momentum term has on the learning rate and the generalization ability of ANNs, we discovered that a very small learning rate, roughly 0.1, and a moderately high momentum term between 0.2 and 0.4, provided near-optimal solutions for the task considered. However, the behaviour of the ANNs was less controlled with increasing momentum, as a result of the larger steps taken in weight space. We conclude that the learning rate and the momentum affect the model to different degrees, as stated by Wilson and Martinez (2001). The learning rate is more powerful than the momentum: a large learning rate combined with a small momentum produces more precise results than the opposite combination.

6.1.4 Fourth objective of the research

• The fourth objective of this research was to determine how much influence each of the following factors, both individually and collectively, has on the generalization ability of ANNs:

6.1.4.1 Learning algorithm

With respect to the results regarding the impact the choice of Backpropagation learning algorithm has on the generalization ability of ANNs, which are discussed in section 5.2.3, we reached several conclusions, including the fact that the Conjugate gradient backpropagation with Polak-Ribiére updates was the best-performing algorithm in terms of the generalization ability of the ANNs. This trend became more pronounced as the size of the ANNs increased. We also discovered that all the Conjugate family of algorithms, i.e. Conjugate gradient backpropagation with Powell-Beale restarts, Conjugate gradient backpropagation with Polak-Ribiére updates and Conjugate gradient backpropagation with Fletcher-Reeves updates, performed much better than all the other algorithms used in the study, a fact widely verified in literature (Joy 2011; Jacobs 1998).

As regards the limited investigation into the rate of convergence and accuracy of prediction of the different algorithms, we discovered that the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm was also the fastest and most accurate. This was most probably due to the fact that in the Conjugate family of algorithms the weights are adjusted along conjugate directions. That generally produces faster convergence and better accuracy than the rest of the BP algorithms (Karelson et al. 2006). We were careful not to draw extensive conclusions from these experiments due to the limited nature of our investigations.

6.1.4.2 Training schedule

The experimental results on the relationship between the training schedule and the generalization performance of ANNs were discussed in section 5.2.5. From the results we conclude that for the case study considered, the epoch size affected model performance. Intuitively, more training iterations yielded a better generalization result. However, too many training iterations had a detrimental effect on ANN generalization. An epoch size of 1000 was found to be sufficient for a single hidden layer ANN to attain good generalization, whilst for a double hidden layer model an epoch size of 2000 seemed equally adequate to achieve the same. For the latter case, however, we noted that 300 epochs was in fact sufficient to warrant good generalization. With regard to the training schedule we also discovered that our ANNs did not suffer from any overfitting effects despite being “overtrained” for 10 000 epochs. This led us to question the whole concept of network “overtraining”: does it really exist?

6.2 Guidelines for effective use of ANNs in forecasting TCP/IP network traffic trends

The conclusions produced and the experience gained during this study can be used to formulate a number of guidelines that can greatly facilitate the process of designing ANNs, especially for new users. These guidelines are particularly useful for defining the network structure and configuring the various learning parameters. It should be noted that these guidelines are specifically intended for a similar dataset and for problems associated with forecasting TCP/IP network traffic trends; however, general users of ANNs may also find them useful. Some suggestions are also made for post-processing to improve the generalization capabilities of networks. The list of guidelines is given as follows; a consolidated configuration sketch illustrating these settings follows the list:

• Size of network architecture: An ANN large enough to learn the characteristics of the data is usually sufficient; under this setup we recommend a single hidden layer with roughly 60 hidden neurons as being sufficient from a generalization viewpoint. We recommend that for TCP/IP forecasting tasks one should start with a small number of hidden neurons, preferably 20, and gradually increase the size of the network until no further improvement in the generalization ability of the ANN is achieved. Although this number is not ideal for all situations, it can be used as a starting point for the search towards the optimum number of hidden layer neurons. We are not really keen on the idea of a second hidden layer, but should one for any reason feel they should use a second hidden layer, we suggest a ratio of first to second hidden neurons of 1:8, as this would warrant reasonably good generalization ability. We refute the idea of using three hidden layers: it does not improve generalization performance by any standards and it is a total computational liability. We support the assertion made by Judd (1990) that for a given number of hidden units it is better to contain the units in a single hidden layer than to distribute them over two or more layers. In our view keeping the number of neurons to a minimum has several other advantages, as it (a) reduces the computational time needed for training, (b) helps avoid overfitting, and (c) allows the trained ANN to be analysed easily. As regards the input and output layers, we advise defining the number of input and output layer neurons by considering the nature of the problem and the availability of ground reference data.

• Training set size: Although it may not be possible to give a theoretical recommendation on the size of the training sample for all practical problems, our experimental results suggest that larger training sample sizes perform significantly better than smaller ones as far as generalization ability is concerned. This may, however, have important practical implications, as smaller ensembles require less computational effort (Becker & Cun 1986). We found from both simulation experiments and literature that a good training sample size can be obtained by selecting training samples randomly, in numbers that depend on the difficulty of the problem under consideration.

• Size of learning rate: Set the learning rate to 0.1 for the standard Backpropagation algorithm, and to either 0.1 or 0.2 if used in conjunction with a momentum term of 0.5 or 0.6. Do not set the momentum term too large, as it would cause the ANN to be greatly unstable. If possible we advise minimal use of the momentum term, as it greatly interferes with the training process of ANNs.


• Learning algorithm: If one is interested in the speed of convergence, using the Conjugate gradient backpropagation with Polak-Ribiére updates algorithm is advisable; however, if one is interested in an algorithm with good generalization ability, any of the Conjugate family of Backpropagation algorithms would be equal to the task.

• Training epochs: A training epoch size of between 1000 and 2000 training iterations is sufficient for the model to discover an appropriate set of weights.

• Initial weights: Set the initial weights to a small range of [-0.5, 0.5]. This range is quite effective for safeguarding against the effects of overfitting. Our ANNs did not show any signs of overfitting during the course of our investigations; this we suspect was largely attributable to a good choice of initial weights. If the range of initial weights is set too small, training may be paralyzed. On the other hand, if the range is too large, premature saturation of the neurons may occur, which in turn will slow down training and result in the cessation of training at suboptimal levels (Hara et al. 1994).

• Test set: Employ an independent test set to evaluate the generalization performance of the ANN. Generalization will assist in indicating how the ANN performs on data it has not seen. It is vital that this test set should not have been used as part of the training process in any capacity. Depending on the size of the data sample, we advise using a test set of not less than 5% of the total data.

• Size of the sliding time window: Careful selection of the sliding time window is necessary for the efficient training of the ANN. Depending on the nature of the data under consideration, it is wise to select time windows which are indicative of periods which could have a profound influence on the activity of consecutive observations. The sliding time window should not be too large or too small. In our study we used information from the first 150 days to predict the next day, iteratively. We also advise the use of a supervising script, as it helps keep a record of the networks and their success.

• Check for trends in the time series: Until recently, the issue of stationarity has rarely been considered in the development of ANN models. However, there are good reasons why the removal of deterministic components in the data (i.e. trends, variance, seasonal and cyclic components) should be considered. It is generally accepted that ANNs cannot extrapolate beyond the range of the training data (Nguyen & Chan 2004; Plaut et al. 1986). Consequently, it is unlikely that ANNs can account for trends and heteroscedasticity in the data. One way to deal with this problem is to remove any deterministic components using methods commonly used in time series modelling, such as classical decomposition or differencing. Matlab version 7 has an option in the Neural Network toolbox, ‘Detrend data’, which helps to conduct this task with ease.

• Normalization of the data: Always normalize your data so that the ANN does not give unnecessarily high values of RMSE. Normalization is important in order to ensure that all variables receive equal attention during the training process. In addition, the variables have to be scaled in such a way as to be commensurate with the limits of the activation functions used in the output layer. If using software like Matlab version 7, this will be done automatically for you.

• Check for the effects of overfitting: This is necessary so that training may be stopped when the ANN begins to overfit the data. In our study this was done by monitoring the performance plot curve on the train set. We believe this is a better indicator of the effects of overfitting than a cross validation set, which might deplete already insufficient data.
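
The consolidated sketch below gathers the recommended settings into one Matlab fragment. It is illustrative only: makeWindows is a hypothetical helper that would build the 150-day sliding-window inputs and next-day targets from a normalized, detrended series, and the weight initialization shown is simply one way of drawing values in [-0.5, 0.5].

    % Illustrative configuration reflecting the guidelines above.
    [x, t] = makeWindows(series, 150);        % hypothetical: 150 past days -> next day
    net = newff(x, t, 60);                    % single hidden layer, ~60 hidden neurons
    net.trainFcn = 'traincgp';                % conjugate gradient (Polak-Ribiere)
    net.trainParam.epochs = 1000;             % 1000-2000 iterations is sufficient
    net.IW{1,1} = rand(size(net.IW{1,1})) - 0.5;   % small initial weights in [-0.5, 0.5]
    net.LW{2,1} = rand(size(net.LW{2,1})) - 0.5;
    net.b{1}    = rand(size(net.b{1})) - 0.5;
    net.b{2}    = rand(size(net.b{2})) - 0.5;
    net = train(net, x, t);
    % For standard BP instead: net.trainFcn = 'traingdm';
    %                          net.trainParam.lr = 0.1;  net.trainParam.mc = 0.5;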

6.3 Future work and recommendations

As with almost any area of research, progress leads to more questions, and the research carried out in this study suggests considerable potential for future work. We plan to extend our investigations to new self-similar and chaotic time series and to other ANN models and learning parameters. In addition, more testing is needed to evaluate the applicability of our guidelines to other datasets, in order to make claims about their robustness and to validate the conclusions reached in this research. To improve and extend the investigations reported here, the use of adaptive learning rate strategies, in addition to constant learning rates, could be examined and their generalization performance compared with that of their constant-rate counterparts. Another issue worth examining is the use of a variable momentum value; variable momentum is currently being researched and its impact on the generalization error is as yet unknown. The effect of employing different transfer functions in the learning process, such as the Gaussian basis and hyperbolic tangent functions, which are also reported to have a significant effect on ANN performance, needs investigating (Sibi et al. 2013). Throughout these investigations we did not have a validation data set, owing to the limited size of our time series data; it would therefore be worthwhile to examine the effectiveness of Early stopping and of other techniques such as Bayesian regularization, which have been widely reported to improve the generalization ability of ANNs (Pissarenko 2002; Noriega 2005). A brief toolbox sketch of these options is given at the end of this section. Also, Richards (1991) suggests that although the size of the training set is of considerable importance, the characteristics and distributions of the data, as well as the sampling strategy used, are crucial. Simply stated, not only the quantity but also the quality of the training set is crucial for a successful ANN application. Any future investigation of the training set should take this into consideration, as the quality of the training set could be a game changer in many respects; we intend to examine this area extensively in our future explorations. Finally, as this study was limited to Feed-forward ANN learning problems with the Backpropagation learning algorithm, it would also be beneficial to investigate the effects of the network structure and the learning parameters on the performance and generalization ability of other ANN models, including Self Organizing Maps (SOM) and Learning Vector Quantization (LVQ), with the aim of deriving general conclusions that can be used to construct design guidelines for users of these particular network models.
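By way of illustration only, the sketch below shows how some of these extensions could be configured in the Neural Network Toolbox used in this study, assuming the normalised input matrix pn from the train script in Appendix A; the training functions (traingda, traingdx, trainbr, trainscg) and transfer functions (tansig, radbas) are standard toolbox options, while the specific parameter values are assumptions that were not evaluated in this work.

% Illustrative configurations for the extensions discussed above (parameter values are assumptions)
net_adapt = newff(minmax(pn), [60 1], {'logsig','purelin'}, 'traingda');  % adaptive learning rate
net_adapt.trainParam.lr     = 0.01;    % starting learning rate
net_adapt.trainParam.lr_inc = 1.05;    % factor used to increase the learning rate
net_adapt.trainParam.lr_dec = 0.7;     % factor used to decrease the learning rate

net_mom = newff(minmax(pn), [60 1], {'logsig','purelin'}, 'traingdx');    % adaptive learning rate with momentum
net_mom.trainParam.mc = 0.9;           % momentum constant (a truly variable momentum would need a custom training function)

net_tansig = newff(minmax(pn), [60 1], {'tansig','purelin'}, 'trainscg'); % hyperbolic tangent hidden layer
net_radbas = newff(minmax(pn), [60 1], {'radbas','purelin'}, 'trainscg'); % Gaussian (radial basis) hidden layer

net_bayes = newff(minmax(pn), [60 1], {'logsig','purelin'}, 'trainbr');   % Bayesian regularization

net_early = newff(minmax(pn), [60 1], {'logsig','purelin'}, 'trainscg');  % Early stopping on a validation split
net_early.divideFcn = 'dividerand';
net_early.divideParam.trainRatio = 0.70;
net_early.divideParam.valRatio   = 0.15;   % validation subset monitors the generalization error
net_early.divideParam.testRatio  = 0.15;
net_early.trainParam.max_fail    = 6;      % stop after 6 consecutive increases in validation error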

6.4 Overall conclusions

ANNs have been widely used in an extensive number of applications; the focus of this study, however, has been the forecasting of a real-world TCP/IP network traffic time series. ANNs are more robust than conventional statistical techniques and are therefore of great importance for forecasting studies. It is believed that ANNs will continue to maintain their importance and validity for time series forecasting problems in the future, despite the advent of new and sophisticated methods such as decision trees and genetic algorithms. This research sought to strengthen that importance by providing extensive analyses of the effect of the size of the network architecture, the size of the training set, the learning algorithm, the training schedule and the size of the learning rate on the ability of ANNs to perform and generalize well on new data. At the beginning of this research, all previous experience and research seemed to indicate that the most significant effect on network generalization would be that of the size of the architecture and of the training set. At the end of the project, and after extensive exploration, it is clear that the choice of learning algorithm, the size of the learning rate and the training schedule have just as much effect on the ability of ANNs to generalize to novel data. It is hoped that this study makes some contribution to the understanding of the role of ANNs in TCP/IP traffic forecasting and will be beneficial for their design and use. By applying the suggestions made in this research, more accurate forecasting results and better generalization ability can be produced. Finally, by evaluating the impact of the choices of network architecture, training set size, learning algorithm, training schedule and learning rate, it is hoped that users of ANNs will have a clearer idea of the way these networks function, so that they are no longer regarded as 'black boxes'.


References

Aamodt, R., 2010. Using Artificial Neural Networks To Forecast Financial Time Series. Norwegian university of science and technology.

Abbas, Q., Haider Bangyal, W. & Ahmad, J., 2013. The Impact of Training Iterations on ANN Applications Using BPNN Algorithm. International Journal of Future Computer and Communication, 2(6), pp.567–569. Available at: http://www.ijfcc.org/index.php?m=content&c=index&a=show&catid=43&id=492 [Accessed July 18, 2014].

Ahmad, I., Ansari, M.A. & Mohsin, S., 2008. Performance Comparison between Backpropagation Algorithms Applied to Intrusion Detection in Computer Network Systems. , pp.231–236.

Ahmad, S. & Tesauro, G., 1998. Scaling and Generalization in Neural Networks. In International Conference on Neural Information Processing Systems. pp. 177–185. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=134256.

Alarcon-Aquino, V. & Barria, J.A., 2006. Multi-resolution FIR neural-network-based learning algorithm applied to network traffic prediction. IEEE Transactions on Systems, Man and Cybernetics – Part C, 3(6), pp.208–220.

Ardö, J., Pilesjö, P. & Skidmore, A., 1997. Neural Networks, multitemporal Landsat Thematic Mapper data and topographic data to classify forest damages in the Czech Republic. Canadian Journal of Remote Sensing, 23(2), pp.217–229.

Atiya, A. & Ji, C., 1997. How initial conditions affect generalization performance in large networks. IEEE Transactions on Neural Networks, 8(2), pp.448–451. Available at: http://www.ncbi.nlm.nih.gov/pubmed/18255648.

Attoh-Okine, N.O., 1999. Analysis of learning rate and momentum term in backpropagation Neural Network algorithm trained to predict pavement performance. Advances in Engineering Software, 30(4), pp.291–302. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0965997898000714.

Badri, L., 2010. Development of Neural Networks for Noise Reduction. International Arab Journal of Information Technology, 7(3), pp.289–294.

Baldi, P. & Hornik, K., 1989. Determining the weights of a Fourier series neural network on the basis of the multidimensional discrete fourier transform. Journal of Neural Networks, 2(2), pp.53–58.


Balestrassi, P.P. et al., 2009. Design of experiments on Neural Network’s training for nonlinear time series forecasting. Neurocomputing, 72(4-6), pp.1160–1178. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0925231208001513 [Accessed July 18, 2014].

Barabas, M. et al., 2011. Evaluation of network traffic prediction based on neural networks with multi-task learning and multiresolution decomposition. In 2011 IEEE 7th International Conference on Intelligent Computer Communication and Processing. Ieee, pp. 95–102. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6047849.

Barry, R., 2000. Artificial Neural Network prediction of wavelet sub bands. University of Manitoba.

Bartlett, M.S., Movellan, J.R. & Sejnowski, T.J., 2002. Component Analysis. In IEEE Transactions of neural networks. pp. 1450–1464.

Batbayar, B., 2008. Improving Time Efficiency of Feedforward Neural Network Learning. National University of Mongolia.

Baum, E. & Haussler, D., 1989. What size net gives valid generalization. Neural Computing, 1(1), pp.159–161.

Bebis, G. & Georgiopoulos, M., 1997. Optimal feed-forward Neural Network architectures. In IEEE Potentials. pp. 27–31.

Becker, S. & Cun, L., 1986. Improving the Convergence of Back- Propagation Learning with Second Order Methods. In Proceedings of the 1988 Connectionist Summer School. pp. 67– 56.

Bretscher, P. & Cohn, M., 1970. A Theory of Self-Non-self Discrimination. Science Journal, 1(69), pp.1042–1049.

Canuto, D.P., 2001. Combining Neural Networks and Fuzzy logic for applications in character recognition. University of Kent.

Cardoso, D.O. et al., 2011. Clustering data streams with weightless Neural Networks. In European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. pp. 27–29.

Cawley, G.C., Edgington, M. & Heath, M., 1993. Generalization in Neural speech synthesis. In Proceedings of the world Conference on Neural Networks. Oregon,USA, pp. 122–125.

Chabaa, S., 2010. Identification and Prediction of Internet Traffic Using Artificial Neural Networks. Journal of Intelligent Learning Systems and Applications, 02(03), pp.147–155. Available at: http://www.scirp.org/journal/PaperDownload.aspx?DOI=10.4236/jilsa.2010.23018 [Accessed July 18, 2014].


Chakraborty, D. & Pal, N.R., 1997. Expanding the Training Set for Better Generalization in MLP. Electronics and Communications, 3(1), pp.6–9.

Chaoba, A., 2009. Internet traffic modelling and forecasting. Kansas state university.

Chen, C.J. & Miikkulainen, R., 2001. Creating Melodies with Evolving Recurrent Neural Networks. In Proceedings of the 2001 International Joint Conference on Neural Networks, 2(2), pp.20–60.

Cherkassky, V. & Zhong, S., 2001. Factors controlling generalization ability of MLP networks. In International Joint Conference on Neural Networks. Proceedings. Ieee, pp. 625–630. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=831571.

Chester, D., 1990. Why two hidden layers are better than one. In Proceedings of the International Joint Conference on Neural Networks. Washington, USA, pp. 265–268.

Choudhury, T. a., Hosseinzadeh, N. & Berndt, C.C., 2012. Improving the Generalization Ability of an Artificial Neural Network in Predicting In-Flight Particle Characteristics of an Atmospheric Plasma Spray Process. Journal of Thermal Spray Technology, 21(5), pp.935– 949. Available at: http://link.springer.com/10.1007/s11666-012-9775-9 [Accessed July 18, 2014].

Churchland, P.S. & Sejnowski, T.J., 2005. Neural representation and Neural computation. Philosophical perspectives, 4(1990), pp.343–382.

Control, E., 2010. DSMS for monitoring home electricity change. Arab Open university.

Conway, A.J., 1998. Time series , Neural Networks and the future of the Sun. New Astronomy Reviews, 42(June), pp.343–394.

Cortez, P. et al., 2006. Internet Traffic Forecasting using Neural Networks. IEEE International Joint Conference on Neural Network Proceedings, 2(september), pp.2635–2642. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1716452.

Cortez, P. et al., 2010. Multi-scale Internet traffic forecasting using Neural Networks and time series methods. Expert Systems, 29(2), p.no–no. Available at: http://doi.wiley.com/10.1111/j.1468-0394.2010.00568.x [Accessed July 18, 2014].

Coulibaly, P., Anctil, F. & Bobe, B., 2000. Daily reservoir inflow forecasting using artificial neural networks with stopped training approach. Journal of Hydrology, 230(January- February), pp.244–257.

Cruz, S., 1989. Learnability and the Vapnik-Chervonenkis Dimension. Journal of , 36(4), pp.929–965.


Cybenko, G., 1989. Approximation by superpositions of a sigmoidal function. Maths control and signal systems, 2(2), pp.303–314.

David, J. & MacKay, C., 2007. Bayesian interpolation. Systems Computation and Neural, 2(3), pp.139–174.

Eberhart, R., Kennedy, J. & Dobbins, R.W., 1990. Neural Network PC Tools - A Practical Guide, London,GB: London: Academic Press, Inc.

Engelbrecht, A., 2007. Computational Intelligence, pp.60–67.

Engelbrecht, P., Geldenhuys, J. & Cloete, I., 1995. An investigation of Neural Networks for linear time-series forecasting. Automatic scaling in Artificial Neural Networks, pp.1–5.

Eskander, G.S. et al., 2007. Round Trip Time Prediction Using the Symbolic Function Network Approach. 2007 International Symposium on Information Technology Convergence (ISITC 2007), 3(4), pp.3–7. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4410595.

Esugasini, S. et al., 2005. Using Various Back Propagation Algorithms. Neural computing, 2(2), pp.123–130.

Fearn, C., 2004. The generalization ability of Neural Networks. New York,USA: Elsevier B.V.

Flood, I. & Kartam, N., 1994. Neural networks in civil engineering. In Principles and understanding. J Comp civil engineers, pp. 669–682.

Foody, G., Lucas, R. & Curran, M., 1996. Estimation of the areal extent of land cover classes that only occur at a sub-pixel level. Canadian Journal of Remote Sensing, 22(4), pp.428– 432.

Fukumizu, K., 1992. Active Learning in Multilayer Perceptrons. Advances in Neural Information Processing Systems, 8(2), pp.295–301.

Gallagher, M. & Downs, T., 1997. Visualisation of learning in Neural Networks using principal component analysis. In Proceedings of International Conference on Computational Intelligence and Multimedia Applications. pp. 327–331.

Garson, G., 1998. Neural Networks: An Introductory Guide for Social Scientists, SAGE Publications).

Gay, L.R., 1992. Educational research 4th ed., New York,United states of America: Merrill.

Gers, F., 2001. Long Short-Term Memory in Recurrent Neural Networks. Ecole Polytechnique federale de la Usanne.


Gonzalez, S., 2000. Neural Networks for Macroeconomic Forecasting., Canada.

Gopal, S. & Woodcock, C., 1996. Remote sensing of forest change using Artificial Neural Networks. In IEEE Transactions on Geoscience and Remote Sensing. pp. 398–403.

Gorman, P. & Sejnowski, T., 1998. Analysis of hidden units in a layered network trained to classify sonar targets. Journal of Neural computing, 1(1), pp.75–89.

Govy, P., Pu, R. & Chen, J., 1996. Mapping ecological land systems and classification uncertainties from digital elevation and forest-cover data using Neural Networks. Photogrammetric Engineering and Remote Sensing, 6(2), pp.124–129.

Grimm, G. & Yarnold, P., 1997. Reading and Understanding Multivariate Statistics. Canadian Journal of Remote Sensing, 4(2), pp.143–148.

Guang, S., 2013. Network Traffic Prediction Based on the Wavelet Analysis and Hopfield Neural Network. , 2(2), pp.101–105.

Hagan, M.T. & Menhaj, M.B., 1994. Training Feedforward Networks. In IEEE Transactions on Neural Networks. pp. 989–992.

Halenár, I. & Libošvárová, A., 2012. The Impact of the Neural Network Structure by the Detection of Undesirable Network Packets. In Proceedings of the world conference on Engineering and Computer sciences.

Hara, Y. et al., 1994. Application of Neural Networks to radar image classification. In IEEE Transactions on Geoscience and Remote Sensing. pp. 100–109.

Hartono, P. & Hashimoto, S., 2007. Learning from imperfect data. Applied Soft Computing, 7(1), pp.353–363. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1568494605000773 [Accessed July 18, 2014].

Hasegawa, A., Itoh, K. & Ichioka, Y., 1996. Generalization of shift invariant Neural Networks: Image processing of corneal endothelium. Neural Networks, 9(2), pp.345–356. Available at: http://linkinghub.elsevier.com/retrieve/pii/0893608095000542.

Hashash, Y., Jung, S. & Ghabousi, J., 2004. Numerical Implementation of a neural network based material model in finite element anslysis. International Journal for numerical Methods in Engineering, 59(7), pp.989–1005.

Haykin, S., 1999. Neural Networks: A comprehensive foundation second. Pearson, ed., Pearson.


Hecht-Nielsen, R., 1987. Kolmogorov’s mapping neural network existence theorem. In Proceedings of the First IEEE International Conference on Neural Networks. San Diego, CA, USA, pp. 11–14.

Hessami, M., Anctil, F. & Viau, A.A., 2004. Selection of an Artificial Neural Network model for the Post-calibration of Weather Radar Rainfall Estimation. Journal of Data Science, 2(November), pp.107–124.

Hillberg, P.A., Sengupta, S. & Til, R.P. Van, 2009. A Comparative Study of Three Predictive Tools for Forecasting a Transfer Line ’ s Throughput. , 16(1), pp.32–40.

Hinton, G., 1988. Connectionist learning procedures. Journal of Artificial intelligence, 3(2).

Hirose, Y., Yamashita, K. & Hijiya, K., 1991. Back-propagation algorithm which varies the number of hidden units. Neural Networks, 4(2), pp.61–67.

Hong, X., Panchi, L. & MAJON, C., 2010. Process Neural Network Algorithm Based on Piecewise Linear Interpolation Function. In 2nd international conference on information technology and technology. pp. 1–4.

Hudson, M., Hag, M. & Demuth, H., 2014. Neural Network ToolboxTM User’s Guide 8.2 ed., New York,United states of America: The MathWorks, Inc.

Hush, D., 1989. Classification with Neural Networks. In Proceedings of the IEEE International Conference on Systems Engineering. Dayton, Ohio, USA, pp. 50–57.

Hush, D. & Horne, B., 1993. Progress in supervised Neural Networks. IEEE Signal Processing Magazine, pp.8–39.

Hush, D.R., Horne, B. & Salas, J.M., 1992. Error surfaces for multilayer perceptrons. In IEEE Transactions on Systems, Man and Cybernetics. Ohio,USA.

Hussain, T.S. & Browse, R.A., 1998. Genetic Encoding of Neural Networks using Attribute Grammars. In Proceedings of the 2010 Interservice/Industry Training, Simulation and Education Conference. pp. 1550–1558.

Jacobs, R., 1998. Increased rates of convergence through learning rate adaptation. Neural Networks, 1(4), pp.295–308. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=938431.

Jain, A., 1996. Artificial Neural Networks:A tutorial. Journal of Neural Computing, 2(2), pp.400–409.

Jiang, J. & Apavasillou, S., 2004. Detecting network attacks in the internet via statistical network traffic normality prediction. Journal of Network and Systems Management, 1(2), pp.51–72.


Joy, C.U., 2011. Comparing the Performance of Backpropagation Algorithm and Genetic Algorithms in Pattern Recognition Problems. International Journal of Computer Information Systems, 2(5), pp.7–12.

Judd, S., 1990. Neural Network design and the complexity of learning, Massachusetts, USA: The MIT Press.

Kaastra, L. & Boyd, M., 1996. Designing a Neural Network for forecasting financial and economic time series. Neurocomputing, 10(2), pp.215–236.

Kanellopoulos, I. & Wilkinson, J., 1997. Strategies and best practice for Neural Network image classification. International Journal of Remote Sensing, 18(4), pp.711–725.

Karelson, M. et al., 2006. Neural networks convergence using physicochemical data. Journal of chemical information and modeling, 46(5), pp.1891–7. Available at: http://www.ncbi.nlm.nih.gov/pubmed/16995718.

Karsoliya, S., 2012. Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture. International Journal of Engineering Trends and Technology, 3(6), pp.714–717.

Katidiotis, A., Tsagkaris, K. & Demestichas, P., 2000. Performance Evaluation of Artificial Neural Network - based Learning Schemes for Cognitive Radio Systems,

Kavzoglu, T., 1999. Determining Optimum Structure for Artificial Neural Networks. In Proceedings of the 25th Annual Technical Conference and Exhibition of the Remote Sensing Society. Cardiff, UK, pp. 675–682.

Kim, D., 2005. Forecasting Solar Cycle 24 using Neural Networks. Journal of the Eastern Asia Society for Transportation Studies,, 6(2), pp.2629–2638.

Kiran, N.V.N.I., Devi, M.P. & Lakshmi, G.V., 2010. Training Multilayered Perceptrons for Pattern Recognition: A Comparative Study of Three Training Algorithms. International Journal of Information Technology and Management, 2(July-December), pp.579–584.

Klevecka, I., 2009. Forecasting Network Traffic: A Comparison of Neural Networks and Linear Models. In Proceedings of the 9th International Conference "Reliability and Statistics in Transportation and Communication." pp. 170–177.

Klimasauskas, C., 1993. Applying Neural Networks R. Turban & E. Trippi, eds.,

Krzyovt, H., 2008.Determining the weights of a Fourier series Neural Network on the basis of a tri Fourier analysis. Journal of applications, 18(3), pp.369–375.

Kudryci, T., 1988. Neural Network implementation of a medical diagnosis system. Cincinatti University.


Kvale, D.T., 2010. Artificial Neural Networks Based Approaches for Modelling Radiated Emissions from Printed Circuit Board Structures and Shields. The University of Toledo.

Kwon, A., 1991. Investigation of a Neural Network Methodology to predict Transient Performance in FMS. Oklahoma State University.

Lahmiri, S., 2011. A comparative study of Backpropagation algorithms in financial prediction. International Journal of Computer Science, Engineering and Applications, 1(4), pp.15–21.

Lange, N., Bishop, C.M. & Ripley, B.D., 1997. Neural Networks for Pattern Recognition. Journal of the American Statistical Association, 92(440), p.1642. Available at: http://www.jstor.org/stable/2965437?origin=crossref.

Lange, R. & Manner, R., 1994. Quantifying a critical training set size for generalization and over fitting using teacher Neural Networks. In Proceedings of the international conference on Artificial Neural Networks. London,GB, pp. 497–500.

Lapedes, A. & Faber, R., 1987. Non-Linear Signal Processing Using Neural Networks: Prediction and System Modelling, Los Alamos National Laboratory, USA,.

Lawrence, S., Giles, C.L. & Tsoi, A.C., 1996. What Size Neural Network Gives Optimal Generalization ? Convergence Properties of Backpropagation. Neural computing, 2(2), pp.1–35.

Leung, H. & Zue, W., 1989. On the Generalization Capability of Multi-Layered Networks in the Extraction of Speech Properties. In International Conference on Acoustics,Speech and Signal Processing. New York,USA, pp. 422–425.

Lippmann, R., 1987. An introduction to computing with Neural Nets. IEEE ASSP Magazine, pp.4–22.

Liyong, S., Shen, Y. & Ma, J., 2007. Local spatial properties based image interpolation using Neural Networks. In 4th International Symposium on Neural Networks. pp. 875–878.

Lodwich, A., Rangoni, Y. & Breuel, T., 2009. Evaluation of robustness and performance of Early Stopping Rules with Multi Layer Perceptrons. In Proceedings International Joint Conference on Neural Networks, 2(4), pp.1877–1884. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5178626.

Mahmoud, O., Anwar, F. & Salami, M.J.E., 2007. On Multilayer feedforward Artificial Neural Networks performance. Journal of Engineering Science and Technology, 2(2), pp.188–199.

Maier, H. & Dandy, G., 1998. The effect of internal parameters and geometry on the performance of back-propagation neural networks: an empirical study. Environmental Modelling and Software, 13(2), pp.193–209. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1364815298000206.


Maier, H.R. & Dandy, C., 1999. Empirical comparison of various methods for training feed- forward Neural Networks for salinity forecasting in the River. Water Resources Journal, 35(8), pp.2591–2596.

Maier, H.R. & Dandy, G.C., 1997. Determining Inputs for Neural Network Models of Multivariate Time Series. Computer-Aided Civil and Infrastructure Engineering, 12(5), pp.353–368. Available at: http://doi.wiley.com/10.1111/0885-9507.00069.

Martin, G. & Pittmann, J., 1990. Recognizing hand printed letters and digits. In advances in Neural information processing systems., 2(2), pp.405–415.

Masters, T., 1993. Practical Neural Network Recipes in C++, Toronto, ON: Academic Press, Inc.

Mencer, F. & Parisi, D., 1992. Genetic evolution of the topology and weight distribution of neural networks. Biological Cybernetics, 6(6), pp.238–289.

Mi, X. et al., 2005. Testing the generalization of Artificial Neural Networks with cross-validation and independent-validation in modelling rice tillering dynamics. Ecological Modelling, 181(4), pp.493–508. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0304380004003746 [Accessed July 13, 2014].

Mitchell, T., 1997. Machine learning E. Munson, ed., McGraw Hill Publishers.

Mohammad, O. & Luis, J., 2011. Performance evaluation of communication networks and systems. Journal of Networks, 6(4), pp.533–540.

Moore, D. & McCabe, D., 1993. Introduction to the practice of statistics., New York,United states of America: Freeman.

Morgan, N. & Bourlard, H., 1989. Generalization and Parameter Estimation in Feedforward Nets: Some Experiments. In Advances in Neural Information Processing Systems. Berkeley, USA, pp. 630–638.

Moustafa, A.A. et al., 2011. Performance Evaluation of Artificial Neural Networks for Spatial Data Analysis. WSEAS Transactions on Computers, 10(4), pp.115–124.

Nakamura, E., 2005. Inflation forecasting using a Neural Network. Economics Letters, 86(3), pp.373–378. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0165176504003088.

Nakayama, K., Hirano, A. & Fukumura, K. -i., 2004. On generalization of Multilayer Neural Network applied to predicting protein secondary structure. In 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No.04CH37541). Ieee, pp. 1209–1213. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1380114.

Neeharika, A., 1996. Generalization capability of feedforward Neural Networks for pattern recognition tasks. Indian Institute of Technology.


Nelson, M., Hill, T. & Remus, W., 1999. Time series forecasting using Neural Networks: should the data be deseasonalized first? Journal of forecasting, 18(5), pp.359–367.

Nguyen, H.H. & Chan, C.W., 2004. Multiple Neural Networks for a long term time series forecast. Neural Networks, 3(4), pp.90–98.

Noriega, L., 2005. Multilayer Perceptron Tutorial. , pp.1–12.

Paola, J.D. & Schowengerdt, R.A., 1997. The Effect of Neural-Network Structure on a Classification. Neural computing, 85721(2), pp.1–7.

Philip, N., 2001. Studies in Artificial Neural Network Modelling. Cochin University of science and Technology.

Piotrowski, A.P. & Napiorkowski, J.J., 2013. A comparison of methods to avoid overfitting in Neural Networks training in the case of catchment runoff modelling. Journal of Hydrology, 476, pp.97–111. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0022169412008931 [Accessed July 18, 2014].

Piotrowski, A.P. & Napiorkowski, J.J., 2011. Optimizing Neural Networks for river flow forecasting – Evolutionary Computation methods versus the Levenberg–Marquardt approach. Journal of Hydrology, 407(1-4), pp.12–27. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0022169411004501 [Accessed July 18, 2014].

Pissarenko, D., 2002. Neural Networks For Financial Time Series Prediction. University of London.

Plaut, D., Nowlan, S. & Hinton, G., 1986. Experiments on Learning by Back Propagation,

Poh, H., Yao, J. & Jas, T., 1998. Neural Networks for the Analysis and Forecasting of Advertising and. International Journal of intelligent systems in accounting,Finance and management, 7(June), pp.253–268.

Rasmussen, C.E., 1993. Generalization in Neural Networks, Denmark.

Reed, R., 1993. Pruning and Regularization Techniques for Feed Forward Nets applied on a Real World Data Base. In IEEE Transactions on Neural Networks. pp. 201–215.

Renals, S., Morgan, N. & Cohen, M., 1992. Connectionist probability estimation in the DECIPHER speech recognition system. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. pp. 601–604.

Richards, E., 1991. Generalization in Neural Networks: Experiments in Speech Recognition. University of Colorado.


Ripley, B., 1993. Statistical aspects of Neural Networks O. Barndorff-Nielsen, J. Jensen, & W. Kendall, eds.,

Rivas, V.M., Merelo, J.J. & Castillo, P.A., 2004. Evolving RBF neural networks for time-series forecasting with EvRBF. IEEE Transactions on acoustics,signal and hearing, 165(September 2003), pp.207–220.

Rowley, H.A. & Shumeet, B., 1996. Neural Network-Based Face Detection. Computer Vision and Pattern Recognition, 2(November), pp.1670–1674.

Rumelhart, D., 1967. The effects of inter-presentation intervals on performance in a continua. Stanford University.

Sarle, S. & Warren, T., 1991. Neural Networks and Statistical Models. In Proceedings of the 1991 Connectionist Models Summer School. San Mateo, California, pp. 1538–1550.

Schneider, W. & Bishof, H., 1992. Improving neural network generalisation. 1995 International Geoscience and Remote Sensing Symposium, IGARSS ’95. Quantitative Remote Sensing for Science and Applications, 30(3), pp.1255–1257. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=521718.

Seung, H., Opper, M. & Sompolinsky, H., 1992. Query by Committee. In Proceedings of the Fifth Annual ACM Workshop on Computational Learning Theory. pp. 287–299.

Shavlik, J.W., Mooney, R.J. & Towell, G.G., 1991. Symbolic and neural learning algorithms: An experimental comparison. Machine Learning, 6(2), pp.111–143. Available at: http://link.springer.com/10.1007/BF00114160.

Sibi, P., Jones, S. & Siddarth, P., 2013. Analysis of different activation functions using back propagation neural networks. Journal of theoretical and applied Information Technology, 47(3), pp.1264–1268.

Sietsma, J. & Dow, R., 1988. Neural net pruning - why and how ? In Proceedings of IEEE International Conference on Neural Networks. San Diego, CA, USA, pp. 325–333.

Singh, S., Bhambri, P. & Gill, J., 2011. Time Series based Temperature Prediction using Back Propagation with Genetic Algorithm Technique. International Journal of Computer science, 8(5), pp.28–32.

Sola, J. & Sevilla, J., 1997. Importance of input data normalization for the application of neural networks to complex industrial problems. In IEEE Transactions on Nuclear Science. pp. 1464 – 1468.

Sontag, E., 1992. Feedback stabilization using two hidden layer nets. IEEE Trans.Neural Networks, 3(6), pp.34–60.


Srinivasan, D. & Liew, A., 1994. A Neural Network short term load forecaster. Electrical power systems research, 4(28), pp.227–234.

Stathakis, D., 2009. How many hidden layers and nodes? International Journal of Remote Sensing, 30(8), pp.2133–2147. Available at: http://www.tandfonline.com/doi/abs/10.1080/01431160802549278 [Accessed July 15, 2014].

Staufer, P. & Fischer, M.M., 1997. Spectral pattern recognition by a two-layer perceptron: effects of training set size. I. Kanellopoulos & G. Wilkinson, eds., London, GB: Springer.

Svozil, D., Kvasnieka, V. & Pospichal, J., 1997. Introduction to Multi-layer feed-forward neural networks. Introduction to multi layer feed forward Neural Networks, 39(October-June), pp.43–62.

Swingler, K., 1996. Financial prediction. Journal of Neural Computing and Applications, 4(4), pp.192–197.

Swingler, K., 1996. Applying Neural Networks:A practical Guide, New York,USA: Harcourt Brace and Company.

Tan, J.Y.B., Bong, D.B.L. & Rigit, A.R.H., 2012. Time Series Prediction using Backpropagation Network Optimized by Hybrid K-means-Greedy Algorithm. Engineering Letters, 3(2), pp.18–20.

Teixeira, J.P. & Fernandes, P.O., 2012. Comparison of Artificial Neural Network architectures in the Task of Tourism Time Series Forecast. World academy of science,Engineering and Technology, 66(5), pp.978–983.

Tong, H. et al., 2005. Internet Traffic Prediction by W-Boost : Classification and Regression . Neural computing, 2(973), pp.397–402.

Tong, H., 1983. Threshold Models in Non-Linear Time Series Analysis, New York,USA: Springer-Verlag.

Tsai, C.-Y. & Lee, Y.-H., 2011. The parameters effect on performance in ANN for hand gesture recognition system. Expert Systems with Applications, 38(7), pp.7980–7983. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0957417410014491 [Accessed July 18, 2014].

Tsilo, L.C., 2008. Protein secondary structure prediction using Neural Networks. Rhodes University.

Uwahuro, J., 2008. Forecasting Solar Cycle 24 using Neural Networks. Rhodes University.


Vapnik, V. & Chervonenkis, A., 1968. The uniform convergence of frequencies of the appearance of events to their probabilities, Russia: Nauk SSSR.

Vogl, T. et al., 1988. Accelerating the Convergence of the Back-Propagation Method. Biological Cybernetics, 59, pp.257–263.

Wagner, D. & Gross, E.., 1996. KD trees and Delaunay-based linear interpolation for function learning: a comparison to neural networks with error backpropagation. In IEEE Transactions on control systems. pp. 649–693.

Waibel, A., Hanzawa, T. & Koyihuro, S., 1989. Phoneme recognition using time-delay neural networks. In IEEE Transactions on Acoustics, Speech and Signal Processing. Osaka, Japan, pp. 328–350.

Rasheed, W.A.J. & Ali, H.A., 2009. Generalization Aspect of Neural Networks on Upgrading Assimilation Structure into Accommodating Scheme. Journal of Computer Sciences, 5(3), pp.177–183.

Wan, W. et al., 2009. Enhancing the generalization ability of Neural Networks through controlling the hidden layers. Applied Soft Computing, 9(1), pp.404–414. Available at: http://linkinghub.elsevier.com/retrieve/pii/S1568494608000902 [Accessed July 18, 2014].

Wang, C., Hang, Z. & Heng, L., 2008. An internet traffic forecasting model adopting radial basis function Neural Network optimized by genetic algorithm. In Proceedings of the IEEE Workshop on Knowledge Discovery and Data Mining (WKDD08). Adelaide, Australia, pp. 367–370.

Wang, L. & Zhou, N., 2005. Optimal Size of a Feedforward Neural Network : How Much does it Matter ? In Proceedings of the joint International conference on Autonomous and Automatic Systems and international Conference on networking and Services. pp. 300–306.

Wei, H., Yoshiteru, N. & Shouyang, W., 2004. A general approach based on autocorrelation to determine input variables of Neural Networks for Time Series Forecasting. Journal of Systems Science and Complexity, 17(3), pp.297–305.

Weigend, A.., Rumelhart, D.. & Huberman, B.., 1990. Predicting the future: A connectionist approach. International Journal of Neural Systems, 1(3), pp.193–209.

Werbos, P.J., 1989. Backpropagation through time. In Proceedings of the IEEE conference. Washington DC, pp. 1549–1550.

Wijayasekara, D. et al., 2011. Optimal Artificial Neural Network architecture selection for performance prediction of compact heat exchanger with the EBaLM-OTR technique. Nuclear Engineering and Design, 241(7), pp.2549–2557. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0029549311003530 [Accessed July 18, 2014].


Wilson, D.R. & Martinez, T.R., 2001. The need for small learning rates on large problems. IJCNN’01. International Joint Conference on Neural Networks. In Proceedings (Cat. No.01CH37222), 1, pp.115–119. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=939002.

Xie, X., Zhang, W. & Yang, Z., 2002. A Dissipative Particle Swarm Optimization. In Proceedings of the IEEE Congress on Evolutionary Computation. pp. 1456–1461.

Yates, W.., Patridge, G. & PAOLA, G., 1996. Replicability of Neural computing experiments. Complex Systems,, 10, pp.257–258.

Zhang, G. et al., 1999. Artificial Neural Networks in bankruptcy prediction: General framework and cross-validation analysis. European Journal of Operational Research, 116, pp.16–32.

Zhang, G. & Kline, D., 2007. Quarterly Time-Series Forecasting With Neural Networks. IEEE Transactions on Neural Networks, 18(6), pp.1800–1814. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4359174.

Zhang, G., Patuwo, B.E. & Hu, M.Y., 1998. Forecasting with Artificial Neural Networks : The state of the art. International Journal of forecasting, 14(July), pp.35–62.

Zhang, G.P., 2007. A Neural Network ensemble method with jittered training data for time series forecasting. Information Sciences, 177(23), pp.5329–5346. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0020025507003003 [Accessed July 18, 2014].

Zhang, G.P., 2001. An investigation of Neural Networks for linear time-series forecasting. Computers & Operations Research, 28(12), pp.1183–1202. Available at: http://linkinghub.elsevier.com/retrieve/pii/S0305054800000332.

Zhang, T. & Fukushige, A., 2002. Forecasting time series by Bayesian Neural Networks. In Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat. No.02CH37290), 2(November), pp.382–387. Available at: http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1005502.

Zhani, M.F. & Elbiaze, H., 2009. Analysis and Prediction of Real Network Traffic. Journal of Networks, 4(9), pp.855–865.

Zhou, Z., Member, S. & Liu, X., 2006. Training Cost-Sensitive Neural Networks with Methods Addressing the Class Imbalance Problem. In IEEE Transactions on knowledge and data engineering. pp. 1–14.

Zhuang, L. & Hindi, K., 1990. Mean Value analysis for multiclass closed queueing network models of flexible manufacturing systems with limited buffers. European Journal of Operational Research, 4(6), pp.366–379.


Zurada, J., 1994. Computational Mechanism Design. Computational intelligence imitating life. New York: IEEE Press, pp. 62–82.


Appendix A. Multilayer perceptron program

We would like to acknowledge Professor Mike Burton of the Rhodes University Department of Applied Mathematics for his invaluable assistance during the formulation of this code.

The code available in this Appendix covers:
• Importing the time series into the Matlab workspace.
• Linearly interpolating and detrending the time series, and computing its descriptive statistics (standard deviation, mean and median).
• A train script which does the following:
  - loads the data into the Matlab Neural Network toolbox from the Matlab workspace;
  - partitions the data into training and test sets;
  - automatically scales the data down before processing and scales it up afterwards;
  - constructs an ANN;
  - trains the ANN and saves the variables in a file named VUSI9.mat.
• A test script which does the following:
  - loads the variables saved by the train script;
  - simulates the network activations and targets on the train and test data;
  - tests the performance of the ANN on unseen test data.
• A supervising script which keeps a record of ANN performance, inputs and targets, and converts the ANN inputs and targets from linear to cyclic.
• A correlation function which computes R and R^2 on the train and test sets.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This script imports and saves the time series data into the Matlab workspace.
function importfile(fileToRead1)
%IMPORTFILE(FILETOREAD1)
%  Imports data from the specified file
%  FILETOREAD1: file to read

% Import the file
sheetName = 'Sheet1';
[numbers, strings] = xlsread(fileToRead1, sheetName);
if ~isempty(numbers)
    newData1.data = numbers;
end
if ~isempty(strings)
    newData1.textdata = strings;
end

% Create new variables in the base workspace from those fields
vars = fieldnames(newData1);
for i = 1:length(vars)
    assignin('base', vars{i}, newData1.(vars{i}));
end
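A hypothetical usage example of this function (the file name is an assumption): after it runs, the numeric sheet contents are available in the base workspace as the variable data, and any text cells as textdata.

% Example usage (file name is illustrative)
importfile('Book.xls')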


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This script interpolates the missing data, detrends the time series and
% computes the standard deviation, mean and median of the time series.

% Interpolate the missing data
data = resample(data, data.Time);

% Remove rows with missing data
Irowexcl = isnan(data.Data(:,1));
data = delsample(data, 'Index', find(Irowexcl));

% Detrend the time series
data = detrend(data, 'constant', [1]);

% Calculate the standard deviation, mean and median of the time series
ts_std = std(data);
ts_mn  = mean(data);
ts_med = median(data);


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This script loads and organises the data in the Neural Network Toolbox,
% constructs the neural network, trains it and saves the variables to a
% file named VUSI9.mat

x = xlsread('Book.xls', 'd2:d731');
myfigure
plot(x)
xlabel('time periods (days)')
ylabel('network traffic (bits/sec)')

% Define the size of the sliding time window
[p, t] = delay(x, 150);
m = size(p, 2);

% Normalise all data, network targets and activations
[pn, ps] = mapminmax(p);
[tn, ts] = mapminmax(t);

% Partition the data: define indices for the test and train sets
s = 150;                 % size of the test set
tti = [m-s:m];           % target test index
pti = [m-s:m];           % pattern test index

% Test set: last s patterns and targets
ptestn = pn(:, pti);
ttestn = tn(:, tti);
ptest  = p(:, pti);
ttest  = t(:, tti);

% Training indices
ttri = [1:m-s-1];
ptri = [1:m-s-1];

% Train set patterns and targets
ptrainn = pn(:, ptri);
ttrainn = tn(:, ttri);

% Define the network parameters
s1 = 60;
s2 = 1;
net = newff(minmax(pn), [s1, s2], {'logsig', 'purelin'}, 'trainscg');
% net = feedforwardnet([s1], 'trainscg');
net.divideFcn = '';

% Network training parameters
net.trainParam.epochs = 1000;
% net.performFcn = 'msereg';
% net.performParam.ratio = .7;
% net.trainParam.max_fail = 5000;

% Initialise the weights and biases
net = init(net);

% Train the net
% myfigure
net = train(net, ptrainn, ttrainn);

% Save the variables in a file named VUSI9.mat
VUSI9net = net;
save VUSI9.mat


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This script tests the generalization performance of the trained network on
% new, unseen data. It loads the variables stored in VUSI9.mat, simulates the
% network on the training and test data, and assesses the performance of the
% network on the unseen test data.

close all; clear all; clc
load VUSI9.mat

% Simulate the activations on the training set
atrainn = sim(VUSI9net, ptrainn);

% Post-process the activations on the training set
atrain = mapminmax('reverse', atrainn, ts);
ttrain = mapminmax('reverse', ttrainn, ts);

% Print the values of R and R^2 on the train set
[r2 r] = correlation(atrain, ttrain);
fprintf('Train: r %5.4f r2 %5.4f\n', r, r2)

% Simulate and post-process the activations on the test set
atestn = sim(VUSI9net, ptestn);
atest = mapminmax('reverse', atestn, ts);
[r2 r] = correlation(atest, ttest);
fprintf('Test: r %5.4f r2 %5.4f\n', r, r2)

% Plot a figure of the network activations and targets on the test set
myfigure
% myfigureposition(pos1)
plot([1:length(atest)], atest, [1:length(ttest)], ttest, 'o')
set(gca, 'xticklabel', [m+2-s:m+2])
title('testing')
xlabel(sprintf('targets %d to %d\n', m+2-s, m+2))

% Simulate the network on the full data set
an = sim(VUSI9net, pn);
a = mapminmax('reverse', an, ts);
myfigure
% myfigureposition(pos2)
plot([1:length(a)], a, [1:length(a)], t, 'o')
title('all')

% Calculate the error measures
sse_train  = sum((ttrain - atrain).^2)      % sum of squared errors, train set
mse_train  = sse_train / length(ttrain)     % mean square error, train set
rmse_train = sqrt(mse_train);               % root mean square error, train set
sse_test   = sum((ttest - atest).^2)        % sum of squared errors, test set
mse_test   = sse_test / length(ttest)       % mean square error, test set
rmse_test  = sqrt(mse_test);                % root mean square error, test set


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This supervising script defines the function delay, which is used to compute
% the pattern of inputs and outputs from the sliding time window.
function [p t] = delay(x, r)
% Produces a moving window of length r from the sequence x1 x2 x3 ... xn:
%
%   p = [ x1      x2       ...  x(n-r)
%         x2      x3       ...  x(n-r+1)
%         ...
%         xr      x(r+1)   ...  x(n-1) ]
%
%   t = [ x(r+1)  x(r+2)   ...  xn ]
%
% There are n-r input patterns p and targets t
n = length(x);
for k = 1:n-r
    p(:, k) = x([k:k+r-1]);
    t(k) = x(k+r);
end
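For clarity, a small worked example of delay on an illustrative six-point series with a window of length three: each column of p is a window of three consecutive values, and the corresponding entry of t is the value that immediately follows it.

% Illustrative example: six observations, window length r = 3
x = [1 2 3 4 5 6]';
[p, t] = delay(x, 3);
% p = [1 2 3       t = [4 5 6]
%      2 3 4
%      3 4 5]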


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% This function computes the R^2 statistic and the correlation coefficient R for
%   a: activation vector
%   t: target vector
function [r2 r] = correlation(a, t)
% Calculate R^2
tbar = mean(t, 2);
abar = mean(a, 2);
tBar = repmat(tbar, 1, length(t));
ss     = sum((a - t).^2, 2) / length(t);
sstBar = sum((tBar - t).^2, 2) / length(t);
r2 = 1 - ss ./ sstBar;

% Calculate the correlation coefficient R
et = t - repmat(tbar, 1, length(t));
ea = a - repmat(abar, 1, length(a));
if nargout > 1
    for i = 1:size(et, 1)
        norms(i) = norm(et(i,:)) * norm(ea(i,:));
    end
    norms = norms(:);
    r = sum(et .* ea, 2) ./ norms;
end

% for i = 1:size(t,1)
%     figure
%     hold on
%     plot(t(i,:), 0, 'o')
%     plot(t(i,:), t(i,:))
%     plot(t(i,:), a(i,:), '*')
%     hold off
%     title(sprintf('%s component %d\n', 'name', i))
%     xlabel('targets')
%     ylabel('activation')
% end
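A brief illustrative call to the correlation function, using hypothetical activation and target vectors, displays R^2 and R for the fit:

% Illustrative usage with hypothetical vectors
a = [2.1 3.9 6.2];            % network activations
t = [2.0 4.0 6.0];            % corresponding targets
[r2, r] = correlation(a, t)   % returns R^2 and the correlation coefficient R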


Appendix B. Multi Router Traffic Grapher software

Snapshot of the Multi Router Traffic Grapher (MRTG) software, which monitors the traffic load on network links at various institutions in South Africa.


Appendix C. Snapshot of time series

Snapshot of the time series as imported into Matlab.


Appendix D. Snapshot of Artificial Neural Network during training

A snapshot of the Neural Network during one of the many training sessions in the Neural Network toolbox Version 5.0.2 (R2007a).


Appendix E. Snapshot of performance plot

A snapshot of a performance plot during the training of one of the ANNs.


Appendix F. Results of the different sliding time windows

Results of sliding time window size 20

Results of sliding time window size 40


Results of sliding time window size 60

Results of sliding time window size 80


Results of sliding time window size 100

Results of sliding time window size 150


Results of sliding time window size 170

Results of sliding time window size 200
