University of Nevada, Reno

Wireless Management Using Predictive Analytics

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering

by

Alisha Thapaliya

Dr. Shamik Sengupta/Thesis Advisor

December, 2018

THE GRADUATE SCHOOL

We recommend that the thesis prepared under our supervision by

ALISHA THAPALIYA

Entitled

Wireless Network Congestion Management Using Predictive Analytics

be accepted in partial fulfillment of the requirements for the degree of

MASTER OF SCIENCE

Shamik Sengupta, Ph.D., Advisor

Lei Yang, Ph.D., Committee Member

Hanif Livani, Ph.D., Graduate School Representative

David W. Zeh, Ph.D., Dean, Graduate School

December, 2018

Abstract

Wi-Fi Access Points (APs) deployed publicly are facing serious demands due to the proliferation of Wi-Fi enabled devices. This becomes more prominent when the user crowd moves dynamically in space, creating a sporadic usage pattern. In order to cater to the dynamically changing spectrum demands, we need to identify areas with high spectrum usage that need better Wi-Fi coverage. In this thesis, we aim to understand the dynamic spectrum usage over space, time and channels. The temporal and spatial analysis helps us to identify places that are highly congested at any given time. The channel usage pattern determines the channels that are over utilized and under utilized in the congested areas.

The usage data from user devices can be analyzed to answer a number of possible questions in regards to congestion, access point load balancing, user mobility trends and efficient channel allocation. Using this data, we attempt to identify Wi-Fi usage trends in a dynamic environment and use them to further predict the congestion in various locations. To accomplish this, we have used the University of Nevada, Reno (UNR) as the site of our experiments, where we use various supervised learning algorithms to find the existing patterns in spectrum usage inside UNR. Using these patterns, we predict the values for certain key attributes that directly correlate to the congestion status of any location. Finally, we apply unsupervised learning algorithms to these predicted data instances to cluster them into different groups. Each group determines the level of congestion for any building at any time of any day. This way, we will be able to ascertain whether or not any place at any time in the future might require additional resources to be able to deliver wireless services efficiently.

In an attempt to deliver wireless services in a resourceful manner, we also discuss self-coexistence among networks, where secondary networks can access licensed bands without interfering with the primary networks. This technology, referred to as dynamic spectrum access, allows underutilized frequency bands to be used, avoiding the need for additional resources. With all the secondary networks trying to access an available channel, there arises a game theoretic competition in which each network wants to get a channel for itself while incurring as little cost/time as possible. We implement a predictive strategy in the networks for them to land on an available channel in the shortest time possible, minimizing the collisions among themselves. Thus, we investigate various predictive algorithms and observe how a self-learning approach can be helpful in maximizing the utilities of the players in comparison to traditional game theoretic approaches.

Acknowledgements

I would like to offer my sincere thanks to my advisor, Dr. Shamik Sengupta. Without his support, this thesis would not have been possible. He has always provided me with insightful suggestions and guidance throughout the journey of my Master's education. I would also like to thank my other committee members, Dr. Lei Yang and Dr. Hanif Livani, for giving me proper feedback and support. My thanks also go to the developers of the Python library scikit-learn and the WEKA tool, with which I was able to successfully conduct my research with valuable results. I acknowledge, from the bottom of my heart, the support of my research by the National Science Foundation (NSF), Award #1346600, Award #1516724, and Award #1723814. Lastly, thanks to all of my friends and family who have been a constant support in my life.

Contents

Abstract
Acknowledgements

1 Introduction
  1.1 Network Congestion
  1.2 Research Problem
    1.2.1 Aerial Access Points
    1.2.2 Optimized Resource Allocation
    1.2.3 Dynamic Spectrum Access
  1.3 Thesis Organization

2 Related Works
  2.1 Wi-Fi Data Analysis
  2.2 Network Self-Coexistence

3 Wi-Fi Spectrum Usage Analytics
  3.1 Methodology
    3.1.1 Data Collection
    3.1.2 Data Preprocessing
    3.1.3 Data Analysis
  3.2 Results

4 Predicting Congestion Level in Wireless Networks
  4.1 Methodology
    4.1.1 Data Collection
    4.1.2 Dataset Creation and Pre-processing
    4.1.3 Supervised Learning
    4.1.4 Unsupervised Learning
  4.2 Results
    4.2.1 SVR Prediction Model
    4.2.2 EM clustering
    4.2.3 Final output

5 Incorporating Machine Learning in a Game Theoretic Environment for Dynamic Spectrum Access
  5.1 System Model
    5.1.1 Challenges
  5.2 Self-learning in the game
    5.2.1 Linear Regression
    5.2.2 Support Vector Regression
    5.2.3 Elastic Net
  5.3 Proposed Mechanism
  5.4 Results

6 Conclusion and Future Works

Bibliography

List of Tables

3.1 Mean and standard deviation of the group of users associated with each WiFi channel in ABB
3.2 Total number of users and the channels with the highest number of connected users in the 2.4 GHz and 5 GHz frequency range on March 15, Wednesday at 12:05 p.m.
3.3 Total number of users and the channels with the highest number of connected users in the 2.4 GHz and 5 GHz frequency range on March 28, Tuesday at 8:30 p.m.
3.4 Total number of users and the channels with the highest number of connected users in the 2.4 GHz and 5 GHz frequency range on April 01, Saturday at 4:00 p.m.
4.1 SVR Output Prediction (MSE in parenthesis)
4.2 Analysis of data in cluster2, cluster0 and cluster1
5.1 Strategic-form minority game with network x and y
5.2 Comparison between experimentally calculated and predicted probabilities
5.3 Mean square errors of predictive algorithms
5.4 Comparison between the time taken to reach equilibrium (in time units) when using different strategies

List of Figures

3.1 JCSU Wi-Fi Usage
3.2 ABB Wi-Fi Usage
3.3 Channel Load in ABB
3.4 Heatmap: March 15, Wednesday, 12:05 pm
3.5 Heatmap: March 28, Tuesday, 8:30 pm
3.6 Heatmap: April 01, Saturday, 4:00 pm
4.1 Hourly Clients, Throughput, Frame Retry and Frame Error
4.2 Data used for EM clustering
4.3 The 3 clusters generated by the EM algorithm
4.4 Evaluation of the EM clustering model
4.5 Predicted data used in the EM clustering model
4.6 Clustering Model Output
5.1 Networks and Channels a) at the beginning of the game; b) after the first stage when Network 2 got a channel; c) after the second stage when Networks 1, 3 and 4 got channels; d) at the last stage when equilibrium is achieved
5.2 Various stages of the game divided into time slots
5.3 Time to reach equilibrium (in time units) and corresponding optimal probability to switch with varying number of channels
5.4 Channel switching probability with varying number of channels
5.5 Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 10
5.6 Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 20
5.7 Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 30
5.8 Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 40

Chapter 1

Introduction

1.1 Network Congestion

The ubiquity of Wi-Fi enabled devices has placed a serious load on Access Points (APs), especially in public places, because of the high density of people accessing the internet. This load varies over space, time and channels. The crowds in public places such as universities, cafeterias and seminar halls move randomly, thus generating different levels of congestion at different places across different times. Additionally, this load is also unevenly distributed among the Wi-Fi channels employed by these APs. Studies conducted in this area have shown that many frequencies in the Wi-Fi spectrum are under utilized while some of the frequencies are seriously over utilized [1].

We need continuous access to wireless networks with greater capacity, performance, and throughput because there is a growing number of clients in wireless networks with more demand for a high quality internet connection [2]. This widely spreading need for wireless services calls for the deployment of additional resources when demand reaches its peak. This is highly probable in scenarios where it is difficult to identify the Wi-Fi usage pattern, mostly in dynamic public places where the spectrum usage is both time and space dependent.

1.2 Research Problem

In this thesis, we discuss different approaches to tackling the growing congestion. Our ideas are divided into separate research topics. Each topic is handled individually, and we identify new solutions using simulations and experiments. The main goal is to use these solutions, either separately or in a combined manner, to optimize wireless networks.

1.2.1 Aerial Access Points

The sheer increase in the ever changing technological demands of today's era has necessitated that these demands be fulfilled in an autonomous and dynamic manner. The concept of a floating AP is important to achieve load balancing dynamically in a predefined manner, which can be applied if we know the spectrum utilization patterns in Wi-Fi bands over space, time and channels. UAVs have been dominantly used for dynamic service provisioning in many such cases, be it in disaster recovery [3][4], wildlife monitoring [5], helping soldiers on the battlefield [6], package delivery [7] and so on. A remarkable contribution that can be made by UAVs is to provide a better wireless service to the users of a concerned region when operated as floating APs. The stability, agility and autonomy of these UAVs make them very efficient in roaming around the network to better serve user demands.

While such a network may seem fantastical, the concept of an airborne-based network isn't all that far-fetched. Large-scale companies like Google and Facebook have already taken initiatives to provide internet access to remote locations via aerial methods. Google calls it "Project Loon" and for Facebook it is "Internet.org" [8][9]. These devices are meant to be operational at the stratospheric level at very high altitudes using power from the sun and wind [10][11]. Therefore, for commercial and small-scale industries, such techniques are nowhere near affordable. However, similar concepts can be implemented to provide better internet access by scaling down the whole mechanism to a more practical level.

In this research, we are trying to understand the feasibility of proactive deployment of floating APs inside a university where the Wi-Fi spectrum usage varies highly across different buildings depending upon the time of the day. For this purpose we have conducted our study at the University of Nevada, Reno (UNR). Since our area of research is a university, the user base is mobile and moves to different locations depending on lunch breaks, class times, etc. However, mobile users tend to geographically cluster around several specific APs [12]. This calls for the dynamic assignment of APs according to user demands in several places. Human behaviour can be highly predictable [13]. If we narrow our focus to some of the buildings with high user density and analyze the spectrum usage statistics, we can discover patterns in the channel usage behaviour among the users at different periods of time. The mobility pattern can be detected by analyzing historic user behavior [14], [15]. This pattern among the users can help us identify the locations and the peak hours when some of the channels go through the highest level of congestion while some are still under utilized. To balance this load, the dynamic floating APs can be deployed proactively to serve user demands by moving around in the hot zones when the user crowd is at its maximum.

1.2.2 Optimized Resource Allocation

With the disproportionate and time varying usage of wireless services, studies show that some Wi-Fi resources are under-utilized in some locations while they are in high demand at other locations [1]. Therefore, instead of adding new resources every time the congestion level increases, it would be more efficient to utilize resources from areas where the congestion level is relatively low. This leads to an optimized resource allocation problem, which can be highly useful in a modern day setting where it is impossible to continuously deploy wireless APs as the number of wireless users varies.

According to the Cisco report, traffic from wireless and mobile devices will exceed traffic from wired devices by 2018. It states that in 2018, Wi-Fi and mobile devices will account for 61 percent of IP traffic [2]. Higher traffic equates to more data in terms of traffic load, number of users, quality of wireless signal, etc., which can be utilized for network management, optimization and fine-tuning [16]. In this thesis, we attempt to determine the congestion level at various locations inside the University of Nevada, Reno (UNR) for a given day and time using the data collected from the wireless activity of users. We investigate the patterns in user data and then apply both supervised and unsupervised machine learning algorithms to predict congestion in the future.

We believe this integrated Wi-Fi congestion prediction model can be used to control an autonomous resource allocation program. Ideally, the supervised prediction model, which will be fed real time data in order to create a more accurate prediction, will predict hours or even days ahead of time and send those predictions to the clustering algorithm. When the clustering algorithm predicts the future level of congestion, a resource allocator can see that at any time t, building y will have low congestion while building z will have high congestion. The resource allocator can then take resources from building y (for the duration of time that it has low congestion) and assign them to building z in order to improve the quality of signal in building z. This autonomous Wi-Fi resource allocator can then act as a load balancer for all buildings that have access to Wi-Fi data.
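As a minimal sketch of this allocation loop (all names here are hypothetical; `predict_congestion` stands in for the integrated prediction pipeline developed in Chapter 4), the allocator could pair low-congestion donor buildings with high-congestion receivers at a given time:

```python
# Hypothetical sketch of the autonomous resource allocator described above.
# predict_congestion(building, t) stands in for the supervised-prediction +
# clustering pipeline of Chapter 4 and returns "low", "medium" or "high".

def reallocate(buildings, t, predict_congestion):
    """Pair buildings with idle capacity with congested buildings at time t."""
    levels = {b: predict_congestion(b, t) for b in buildings}
    donors = [b for b, lvl in levels.items() if lvl == "low"]
    receivers = [b for b, lvl in levels.items() if lvl == "high"]
    # Each (donor, receiver) pair marks a resource transfer for this time slot.
    return list(zip(donors, receivers))
```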

1.2.3 Dynamic Spectrum Access

The key enabling technology in dynamic spectrum access is Cognitive Radio, which allows unlicensed secondary users to access licensed bands without causing any interference to the primary users. Cognitive Radios can sense the wireless environment, identify the wireless channels not being used by primary users, and access them dynamically; this technique is called Dynamic Spectrum Access [1]. As the unoccupied spectrum bands are identified, the networks start looking for a channel for themselves. Since all the networks are searching for channels in a greedy manner, naturally, each network wants to find an available channel in the least possible time. The networks have to find a way to self-coexist so that they would not cause any harm to others or themselves. When the networks start looking for channels randomly, there is a high possibility of collision among themselves, thereby disrupting the QoS, especially when the numbers of networks and channels are nearly equal. To avoid such collisions, the networks have to decide whether to switch to another channel or stay where they are. We look into this problem through a game theoretic lens: the players are networks, their strategy is to switch or stay, and their utility is a function of the time taken to reach system convergence, i.e., when all the networks have acquired an available channel for transmission.

However, the process of finding an available spectrum costs time, which they want to minimize. That is why it is not possible to sense and search too many channels every time [17]. The networks have to find a way to look for as few channels as possible and get an available channel for transmission. It could be beneficial if the networks learn to identify the best strategy for themselves and learn to model their behavior with the changing environmental components. In this thesis, we deal with the self-coexistence of networks using a self-learning approach. We investigate various predictive algorithms, namely Linear Regression, Support Vector Regression and Elastic Net, and compare them with other traditional non-predictive game theoretic mechanisms. We measure the accuracy of these algorithms in terms of the time taken to reach system convergence. We also observe how a self-learning approach can be helpful in maximizing the utilities of the players in comparison to traditional game theoretic approaches.
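To make the switch-or-stay dynamic concrete, the following toy simulation (an illustrative sketch under simplifying assumptions, not the thesis's exact game) lets each colliding network switch to a random free channel with a fixed probability and counts the time slots until every network holds a channel alone:

```python
import random

def time_to_convergence(n_networks, n_channels, p_switch):
    """Toy switch/stay game: contending networks pick channels; a network
    alone on a free channel keeps it, colliders switch w.p. p_switch."""
    assert n_channels >= n_networks
    free = set(range(n_channels))              # channels nobody has acquired yet
    choice = {i: random.choice(tuple(free)) for i in range(n_networks)}
    t = 0
    while choice:                              # some networks still contending
        t += 1
        occupancy = {}
        for i, c in choice.items():
            occupancy.setdefault(c, []).append(i)
        for c, nets in occupancy.items():
            if len(nets) == 1 and c in free:   # alone on a free channel: acquired
                free.discard(c)
                del choice[nets[0]]
        for i in choice:                       # remaining colliders switch or stay
            if random.random() < p_switch:
                choice[i] = random.choice(tuple(free))
    return t

print(time_to_convergence(10, 10, 0.5))        # e.g., N = M = 10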

1.3 Thesis Organization

The rest of this thesis is organized as follows. In Chapter 2, we discuss some of the related research that has been done in the wireless networking community. Chapter 3 provides ideas on how we can use floating access points to solve the problem of network congestion by performing spectrum usage analysis within UNR. In Chapter 4, we predict congestion at various locations in the future using an integrated approach of supervised and unsupervised learning models to tackle the growing congestion through an optimized resource allocation technique. In Chapter 5, we show how implementing a predictive strategy in a game theoretic setting can help achieve dynamic spectrum access. Finally, Chapter 6 concludes the thesis, providing insights on how this work can be further improved upon.

Chapter 2

Related Works

2.1 Wi-Fi Data Analysis

The analysis of data obtained from user activity in the wireless environment has gathered much attention in the networking community in recent times. A lot of relevant work has already been done in this regard.

In [18], the authors have focused on the performance of Wi-Fi based wireless networks to achieve network optimization. In this paper, an analysis of the Dartmouth WiFi campus network, composed of 476 access points, has been conducted. The data collected spanned a duration of 2 months and was collected via SNMP polling and syslog messaging. Furthermore, backend traffic was captured using passive sniffers and added to the database. The usage data from all of these sources was analyzed to extract information regarding various aspects of the wireless networks, such as network traffic (traffic load the network can handle, traffic per card, traffic across weekdays vs weekends), user mobility, card activity, access point activity, etc. By observing the high variance in the activity of buildings, access points and cards, network designers can implement new solutions to optimize wireless networks in a more resourceful manner.

A similar study related to understanding the trends in the use of academic Wi-Fi networks has been done in [19]. This paper analyzes data collected with syslog. The usage activity patterns on a daily and weekly basis demonstrated that the number of active devices increases in the morning and evening, and is considerably lower in the afternoon. The main finding is that users in a small area are more frequent and recurring than users in a larger area, where the population is more heterogeneous and the main difference among buildings depends on the users who actively access the internet in those locations.

The authors in [20] have collected Wi-Fi data within the MIT network, with 3000 access points, to perform a spatial and temporal analysis of the traffic flowing through the access points. In this paper, clustering techniques have been used to classify location dependent network behaviour by determining the number of connected users per access point over time.

In [21], the authors collected data over a month by polling 177 Access Points every 5 minutes via SNMP. They maintain that it is important to understand network usage characteristics as it would lead to a more effective deployment of wireless network components. Their major finding in this paper is that the average user transfer-rates follow a power law, meaning load is unevenly distributed across access points and is influenced more by which users are present than by the number of users. So they propose a clustering approach for the users based on two attributes, prevalence and persistence.

[22] analyzes the performance characteristics of the Google WiFi network in Mountain View (CA), with 500 access points. The proposed analysis has addressed three main categories: per-user traffic distributions, user classification based on their pattern of network usage and generated traffic, and user mobility in terms of travelled distance.

In [16], the authors analyze the use of a Wi-Fi network deployed in a large-scale technical university with data logs collected over three weeks. They present a spatio-temporal analysis of the given network, and search for distinctive user behaviours based on certain situations, such as whether students attending a lecture use wireless networks differently than users not attending the lecture. The authors of this paper believe that the analysis of such network behaviour can give fundamental insights on how to optimize and manage the network, and it can also reveal patterns in how the end users behave.

2.2 Network Self-Coexistence

The problems regarding self-coexistence among networks in such a competitive environment have also been dealt with before. In [17], the problem was solved using tools from Modified Minority Game (MMG) theory, where cooperation is self-imposed in a noncooperative game and a mixed strategy was identified for the networks to achieve Nash Equilibrium. The paper describes how the challenges of dynamic spectrum access can be resolved in Wireless Regional Area Networks (WRANs), where IEEE 802.22 networks based on cognitive radios are competing to get an available channel. The mathematical formulation to calculate the optimal probability to switch was derived using game-theoretic techniques from a modified minority strategic and noncooperative game. This probability depends on the number of competing networks, the number of available channels and the cost to switch to another channel. In this paper, however, the networks play the game in a static manner, without considering the history of their moves, giving them no basis to learn further in the game.

The authors in [23] used the potential of communication among Base Stations (BS) and Consumer Premise Equipments (CPE) in the IEEE 802.22 standard to guarantee self-coexistence among Wireless Regional Area Networks. They proposed a game-theoretic model with the BSs as players, the choices of a channel as strategies, and preferences associated with the quality of the channel. Their primary focus was on the use of two utility functions to maximize the spatial reuse and minimize the interference. They established that even though a rational player would choose to minimize the interference, it might not be the optimal choice when the players are selfish and resources are scarce. In such cases, if a BS chooses its channel for maximum spatial reuse, it helps them achieve self-coexistence in a better manner. This model relies on the ability of CPEs to sense the spectrum and detect the level of interference so that the WRANs can change their tolerable amount of interference accordingly.

In [24], another non-cooperative game theoretic model has been proposed to solve the problem of self-coexistence among Wireless Regional Area Networks. The authors in this paper have considered the same set of binary strategies, i.e., switch or stay, when the networks are faced with collision. The primary focus of this paper is to minimize the step cost at every stage within the game. The utility function at each step is dependent on the probability to switch chosen by the networks. In each step, the optimal strategy is to find an optimal probability to switch so as to maximize the utility function. The paper focuses on how to select a strategy that makes the expected cost of staying in the same channel equal to the cost of switching to another channel, but does not further analyze the expected cost in the game.

In [23], the self-coexistence problem has been addressed using two frameworks: (i) a multi-player non-cooperative repeated game (NoRa) and (ii) a hedonic coalitional formation game (HeCtor). In NoRa, the players have to choose from a finite set of strategies and their payoffs depend on the number of other players choosing the same strategy. In HeCtor, the tradeoffs between cooperation advantages and cooperation costs are discussed.
The authors established that cooperation among WRANs leads to a higher throughput, but at the cost of higher computational complexity, causing significant loss of throughput when rapid changes occur in the channel occupancy. However, in a non-cooperative environment, the networks adapt faster to rapid changes and are successful in attaining the same throughput. Even though this paper is about self-coexistence, the main motive is to evaluate the environment in terms of cooperation, to see whether a cooperative or a non-cooperative environment facilitates the networks in achieving a stable solution to the problem.

Chapter 3

Wi-Fi Spectrum Usage Analytics

In this chapter, we are trying to understand the feasibility of proactive deployment of floating APs inside a university where the WiFi spectrum usage varies highly across different buildings depending upon the time of the day. For this purpose, we have conducted our study at the University of Nevada, Reno (UNR) by collecting the spectrum usage data across almost all the buildings in UNR over a period of four weeks. For this whole idea of an aerial AP to be successful, we typically need answers to the following questions: 1) What are the buildings/locations with the maximum number of users accessing the WiFi channels? 2) At what time during the day does the spectrum usage reach its peak? 3) What are the channels that are highly over utilized and highly under utilized during those hours? To answer these questions, we measure various analyzable data from WiFi bands in the 2.4 GHz and 5 GHz range at different locations and understand how they vary across different days of the week and different times of the day. The results provide various insights into when and where to best deploy the floating APs in a proactive manner and how to reconfigure them to make use of the under utilized bands so that the load is efficiently distributed among all WiFi channels.

Spectrum monitoring is needed to continuously survey the radio spectrum, process the collected data and infer information about spectrum usage [25]. This study is important because it tells us how the spectrum usage statistics change across different places, times and channels. The goal is to generate a heat map that dynamically determines the hot zones within the campus and deploy floating APs in those zones. Since network resource usage statistics can be estimated by predicting a traffic map [26], [15], these APs can be deployed in a pre-defined manner. Additionally, the continuous analysis of Wi-Fi data also helps in risk management from a security standpoint. It helps us track any suspicious activities and alerts us ahead of time, thus allowing the network administrators to prevent or prepare for attacks beforehand.

3.1 Methodology

3.1.1 Data Collection

The results presented in this chapter are based on the analysis of spectrum usage data collected across almost all buildings at the University of Nevada, Reno. UNR spans an area of 290 acres and has over 80 buildings. At any point of time, there are approximately 1275 Aruba APs active throughout the university. Many of these APs are deployed in outdoor spaces as well. The APs allow the user devices to connect in either the 2.4 GHz or 5 GHz frequency range depending on whether these devices support the 802.11 b/g/n standard or the 802.11 a/c standard. The non-overlapping channels that the clients can use in the 2.4 GHz frequency range are 1, 6 and 11. The non-overlapping channels in the 5 GHz frequency range employed by our university APs are 36, 40, 44, 48, 149, 157 and 163. The data that we have used for this study extends from March 14, 2017 to April 14, 2017.

We obtained our data by querying the Aruba controller that handles all of the Access Points within UNR. We made this query every 5 minutes so as not to overload the server with requests. The query explicitly asks the server to show all the APs active at the point in time when the query is made. The information that we get from the response consists of the names of the APs, their locations, their IP addresses, the number of clients in the 2.4 GHz and 5 GHz range connected to each specific AP and the channels they are using.

3.1.2 Data Preprocessing

The raw data that we collected needed to be preprocessed before further analysis. From the vast array of unstructured data, we extracted the attributes most relevant to our study. Our preprocessed data consists of the following attributes: timestamp, AP name, AP location, number of clients in the 2.4 GHz and 5 GHz range, and the channels in the 2.4 GHz and 5 GHz range being used by those clients. From this database, the first obvious result that we can gather is the total number of users at any location at a specified time or during a specified time interval. With this analysis, we identified the seven most crowded locations in UNR for each day of the week, namely: MIKC, JCSU, ABB, DMSC, SEM, WRB and PSAC. Here, MIKC is the library, JCSU is the student union, and the rest of the buildings are equipped with classes and labs. After determining the highly congested buildings that require our attention, we analyzed the channel usage statistics in those buildings at different times to figure out the time intervals and channels that suffer from the highest level of interference in those buildings. This provides us with a basic idea of the congestion scenario in terms of space, time and channels.

3.1.3 Data Analysis

After identifying the buildings with a very large user base, it becomes important to determine the times when the congestion reaches its peak. In order to be able to better serve user demands, we first need to know the time periods that require our utmost attention. To achieve this, we computed location-wise hourly averages of our datasets for each day of the week. The results obtained from this analysis point out the hours of the day when the network congestion is at its maximum, and the channels being accessed during those hours. On weekends, the user mass drops significantly compared to weekdays. During weekdays, the data fluctuates from hour to hour. This fluctuation also depends on the type of the building. For instance, the buildings with classes have a high load during class hours, while the buildings with a food court have a high load during lunch times, and so on. For the time being, we will take into account some specific buildings where the deployment of floating APs will provide maximum benefit in terms of better WiFi coverage and user satisfaction.
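As a sketch of this aggregation step (the file and column names below are assumptions for illustration; the actual schema is the preprocessed dataset of Section 3.1.2), the busiest buildings and the location-wise hourly averages per weekday can be computed with pandas:

```python
import pandas as pd

# Load the preprocessed records; column names here are illustrative.
df = pd.read_csv("unr_wifi.csv", parse_dates=["timestamp"])
df["users"] = df["clients_24ghz"] + df["clients_5ghz"]

# Total users per building: identifies the seven most crowded locations.
busiest = df.groupby("ap_location")["users"].sum().nlargest(7)

# Location-wise hourly average for each day of the week.
hourly = (df.assign(weekday=df["timestamp"].dt.day_name(),
                    hour=df["timestamp"].dt.hour)
            .groupby(["ap_location", "weekday", "hour"])["users"]
            .mean())
```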

3.2 Results

We now present the findings of our study by comparing the spectrum usage analytics in our two candidate buildings: JCSU and ABB. JCSU is the Student Union of UNR. It houses the food court, Starbucks, a bank, the graduate student lounge, a theatre and so on. It is basically the place where students like to spend their leisure time. ABB is the Ansari Business Building. This building consists of classes and labs for all the students in the business department. We observe how the usage behavior in these two buildings changes over time, which channels are being highly utilized, and what fundamental differences we can establish in the channel usage pattern between these buildings.

FIGURE 3.1: Number of users in JCSU across different hours on Tuesdays and Saturdays

In Figure 3.1, we have the data from three Tuesdays, March 14, March 28 and April 04, obtained from JCSU. From this plot, we can easily deduce that on all Tuesdays, the user load remains similar during specific hours of the day. This plot helps us identify the time period when the load on that particular building is at its maximum. From Figure 3.1, it is readily visible that the user load is minimal until 7 a.m., after which it starts increasing noticeably until it reaches its peak between 12 p.m. and 1 p.m. Then the load slowly starts decreasing for the rest of the hours. With the assumption that 12 p.m. - 1 p.m. is usually the lunch hour for many students, the crowd in JCSU is naturally at its maximum at this point.

FIGURE 3.2: Number of users in ABB across different hours on Tuesdays and Saturdays

Figure 3.1 also shows the user flow in JCSU on Saturdays, April 01 and April 08. Compared to the weekdays, the number of users in this plot has dropped significantly, but it remains relatively high during the noon and afternoon hours, even though the user behavior for the two Saturdays is not quite the same.

We now present the same sets of results for ABB. Since ABB has class hours, the user rate changes according to those class hours. When there are classes, there is a high number of users, and vice-versa. Naturally, this causes the plots for different days to differ. As we can see from Figure 3.2, ABB has the highest number of users between 1 p.m. and 2 p.m. on Tuesdays (March 14, March 28 and April 04). As there are no classes on Saturdays, the plot depicts minimal user activity for April 01 and April 08 in ABB.

The same concept can be applied to all the buildings in UNR and to all days of the week. Since the users may have a trend in spectrum usage from a spatial and temporal perspective, we can effectively identify when the buildings are going to experience the maximum load. Now that we know the hot zones and the timestamps when the usage reaches its peak, we need to know the channels that are over utilized and under utilized in the corresponding buildings and hours so that we can effectively manage the spectrum load. Our next set of findings includes the number of users accessing a particular channel in the congested areas during the peak time, to identify the channels with the highest load that affect the services of the users accessing those channels. After a thorough analysis of different congested time periods on Tuesdays in ABB, we calculated the mean and standard deviation of the data points in each channel. A low standard deviation implies that the data points are close to each other.

TABLE 3.1: Mean and standard deviation of the group of users associated with each WiFi channel in ABB

Channel   Mean   Standard deviation   Range of Users
1           50                    8          42 - 57
6           38                    8          30 - 46
11          47                   10          37 - 57
36          91                   35          56 - 126
40          60                   26          35 - 86
44          50                   31          19 - 80
48         110                   51          60 - 161
149         71                   15          56 - 87
153         61                   13          48 - 74
157         28                   28           0 - 56
161        112                   26          86 - 139

In our case, this means that during the congested hours of a particular location on a particular day, the numbers of users using the same channel do not drastically differ from each other. The statistics obtained help us identify the channels with a higher user load and the channels with a lower user load. The mean and standard deviation of the number of users on each channel are shown in Table 3.1. This table depicts the mean and standard deviation values of the number of users seen on each channel employed by the APs of ABB during the peak load hours on Tuesdays. From these values, we obtain the range of users that use those channels. Numbers of users not falling within this range are considered to be outliers and are not taken into account. If the number of users lying within this range, as seen from our dataset, is averaged for each of the channels, we can determine the load on every channel during congested periods of time in the Ansari Business Building. Figure 3.3 shows the mean number of users obtained from Table 3.1 on each channel during the peak hours of Tuesdays in ABB.

FIGURE 3.3: Average users on each channel during the peak hours of Tuesdays in ABB
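The ranges in Table 3.1 are the mean plus or minus one standard deviation, so the per-channel load underlying Figure 3.3 can be sketched as follows (a minimal illustration of the outlier filtering described above):

```python
import numpy as np

def channel_load(users_per_sample):
    """Average load on one channel after discarding samples outside
    mean +/- one standard deviation (the ranges listed in Table 3.1)."""
    x = np.asarray(users_per_sample, dtype=float)
    mu, sigma = x.mean(), x.std()
    in_range = x[(x >= mu - sigma) & (x <= mu + sigma)]
    return in_range.mean()
```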

In Figure 3.3, we can see that Channel 1 in the 2.4 GHz range and Channel 161 in the 5 GHz range suffer from the highest user load. Since WiFi is contention based, when the number of users using a certain channel doubles, the signal quality in that frequency range deteriorates exponentially. That is why it becomes essential that the users be distributed among all available channels in an unbiased manner. In the 2.4 GHz range, Channels 1 and 11 seem to share the same level of load, whereas Channel 6 still has potential for utilization. However, in the 5 GHz range, the distribution is very drastic. As we can see, Channels 36, 48 and 161 are being accessed by a large population. On the contrary, Channel 157 has a very small user base. We already know that ABB is congested on Tuesdays from 10 a.m. to 11 a.m. Now we also know that Channel 161 withstands the highest number of users, while Channel 157 has very few users associated with it. After accumulating all this information, we can accurately deploy the floating APs on Tuesdays in ABB while making use of the under-utilized channels so that congestion is accommodated to the optimal level.

Using all of the above results and applying similar techniques to the other locations, we now create a heat map to identify hot zones within UNR. The heat map consists of a section of the geographical map of UNR (due to space constraints) that indicates, at any time of the day, the locations with the largest number of users and the channels that are being used the most in the 2.4 GHz and 5 GHz frequency range. In these maps, the size of the circles indicates the size of the user base, i.e., a greater radius implies a larger number of users. The channels are color-coded and the index is shown on the upper left side of the map. At each location, we have displayed the most congested channel from each of the 2.4 GHz and 5 GHz ranges. That is why there are two concentric circles for each location in the map, where the color indicates the channel name in one of the two frequency ranges. Also, the channel range represented by the color of the outer circle has more users than the channel range represented by the color of the inner circle. These heat maps were generated for every 5 minutes of an entire day.
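A minimal sketch of how one such frame could be rendered follows; the field names, color mapping and marker scaling are illustrative assumptions, not the exact tooling used in the study (the example values come from the JCSU and ABB rows of Table 3.2):

```python
import matplotlib.pyplot as plt

# Illustrative channel-to-color mapping; the thesis shows its index on the map.
CHANNEL_COLORS = {1: "tab:red", 6: "tab:green", 11: "tab:blue",
                  36: "tab:orange", 40: "tab:pink", 44: "tab:olive",
                  48: "tab:purple", 149: "tab:cyan", 153: "tab:gray",
                  157: "gold", 161: "tab:brown"}

def draw_frame(ax, buildings):
    """One 5-minute frame: circle size tracks the user count; the outer circle
    is colored by the busiest channel of the band with more users."""
    for b in buildings:  # b: dict with keys x, y, n24, n5, ch24, ch5 (assumed)
        outer, inner = (("ch5", "ch24") if b["n5"] >= b["n24"]
                        else ("ch24", "ch5"))
        area = 5.0 * (b["n24"] + b["n5"])        # marker area grows with users
        ax.scatter(b["x"], b["y"], s=area,
                   color=CHANNEL_COLORS[b[outer]], alpha=0.6)
        ax.scatter(b["x"], b["y"], s=area / 3,
                   color=CHANNEL_COLORS[b[inner]])

fig, ax = plt.subplots()
draw_frame(ax, [{"x": 0, "y": 0, "n24": 260, "n5": 517, "ch24": 11, "ch5": 48},
                {"x": 3, "y": 1, "n24": 85, "n5": 207, "ch24": 6, "ch5": 161}])
plt.show()
```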

We then created time lapse videos for a number of separate days using these maps to provide a visual clarification of how the user crowd and the channel usage change over time and space. These videos are available at Spectrum Usage Timelapse Video [27]. In this chapter, we have presented three of these maps, taken from different days at different times, in Figures 3.4, 3.5 and 3.6. All of the information that we can gather from these figures is then exhibited in Tables 3.2, 3.3 and 3.4, respectively. One important thing to note here is that MIKC, being the library, always has the highest user load compared to other locations irrespective of the day and time. Therefore, we will exclude MIKC from our further comparisons with other buildings where the user load varies with time.

FIGURE 3.4: Heat Map indicating the building load and the corresponding most utilized channels in a section of UNR on March 15, Wednesday at 12:05 p.m.

FIGURE 3.5: Heat Map indicating the building load and the corresponding most utilized channels in a section of UNR on March 28, Tuesday at 8:30 p.m.

Figure 3.4 shows the heat map on March 15, Wednesday at 12:05 p.m., and the corresponding building and channel loads are depicted in Table 3.2. As mentioned earlier, we expect the crowd at JCSU to be very high at this point, considering 12 - 1 p.m. as the lunch time for most students. From the table, we see that there are 260 users in the 2.4 GHz range and 517 users in the 5 GHz range. Mainly the buildings with classes, such as EJCH, WRB, SEM, DMSC, ABB and PSAC, depict a considerable user load, while the rest of the buildings have very little spectrum utilization. In terms of channel load, MIKC and PSAC have the most user connectivity in Channels 1 and 161. Since the outer circle (Channel 161) represents a channel in the 5 GHz range, it has more users than those connected in the 2.4 GHz range. From the table, it is clear that in MIKC there are a total of 300 users in the 2.4 GHz range, out of which 145 are using Channel 1 alone, and there are a total of 863 users in the 5 GHz range, out of which 208 are connected to Channel 161, making these two channels highly congested.

In Figure 3.5, we present the UNR wide spectrum usage on March 28 at 8:30 p.m. The reason behind choosing this time is to analyze whether the user activity drops significantly at night compared to the afternoon, which is exactly what this picture depicts. There is very little user connectivity in all the buildings except for MIKC and JCSU. Even in these buildings, the user load is less than it was in the afternoon. The total load in all of the buildings and the corresponding channel load in the most utilized channels are found in Table 3.3.

FIGURE 3.6: Heat Map indicating the building load and the corresponding most utilized channels in a section of UNR on April 01, Saturday at 4:00 p.m.

The load on these buildings decreased even more during the weekend, as shown in Figure 3.6, the heat map obtained from April 01, Saturday at 4:00 p.m. Comparing it with Figure 3.5, we see that the only building that has a slight increase in user count is JCSU, while the rest have very low spectrum usage. The exact values of user load over space and channels for this point in time are shown in Table 3.4.

We are now able to visually represent the WiFi spectrum usage inside UNR, which becomes very handy for deploying the flying drones serving as APs. A quick glance through these three maps tells us that in all of the buildings, the most utilized channels never remain the same; they change according to the day of the week and also the time of the day. Through the time lapse videos in Spectrum Usage Timelapse Video [27], we can very easily analyze the changes in user load happening across an entire day in different locations and how different channels are being highly utilized even within the same building.

TABLE 3.2: Total number of users and the channels with the highest number of connected users in 2.4 GHz and 5 GHz frequency range on March 15, Wednesday at 12:05 p.m.

          ------- 2.4 GHz range -------   -------- 5 GHz range --------
Building  Total   Busiest     Channel     Total   Busiest     Channel
code      load    channel     load        load    channel     load
JCSU      260     11          114         517     48          148
ARF       19      11          7           32      36          10
MIKC      300     1           145         863     161         208
NJC       28      11          10          11      153         8
WRB       93      1           41          297     153         64
EJCH      100     1           54          307     44          67
HREL      25      11          2           11      36          2
LP        29      1           19          31      36          10
CB        59      11          22          42      149         17
SLH       41      6           26          153     40          67
RSJ       32      1           18          75      40          37
MSS       41      11          21          204     48          43
PSAC      77      1           32          555     161         118
ABB       85      6           32          207     161         54
SEM       114     11          49          325     48          102
DMSC      96      11          44          307     149         68
LMR       26      1           11          62      161         20
LME       27      1           16          32      40          15
MM        49      1           25          172     36          38
RH        22      1           9           17      36          7
TB        3       11          2           0       None        0

TABLE 3.3: Total number of users and the channels with the highest number of connected users in 2.4 GHz and 5 GHz frequency range on March 28, Tuesday at 8:30 p.m.

          ------- 2.4 GHz range -------   -------- 5 GHz range --------
Building  Total   Busiest     Channel     Total   Busiest     Channel
code      load    channel     load        load    channel     load
JCSU      58      11          24          125     36          32
ARF       15      11          8           10      36          4
MIKC      164     11          66          669     40          141
NJC       6       1           4           2       40          1
WRB       18      11          11          97      40          42
EJCH      23      11          11          46      149         15
HREL      7       1           4           1       48          1
LP        7       11          3           12      36          6
CB        18      1           8           9       40          4
SLH       1       11          1           2       48          2
RSJ       0       None        0           12      36          3
MSS       5       11          14          29      161         14
PSAC      12      6           6           74      161         20
ABB       16      11          8           113     36          32
SEM       30      11          15          60      149         19
DMSC      15      11          11          43      44          21
LMR       11      1           5           13      161         7
LME       9       11          7           17      48          7
MM        8       11          5           37      161         10
RH        0       None        0           2       153         1
TB        0       None        0           1       44          1

TABLE 3.4: Total number of users and the channels with the highest number of connected users in 2.4 GHz and 5 GHz frequency range on April 01, Saturday at 4:00 p.m.

          ------- 2.4 GHz range -------   -------- 5 GHz range --------
Building  Total   Busiest     Channel     Total   Busiest     Channel
code      load    channel     load        load    channel     load
JCSU      67      6           34          185     161         37
ARF       9       6           5           7       36          3
MIKC      109     1           41          367     36          118
NJC       2       11          2           0       None        0
WRB       9       11          7           42      153         11
EJCH      3       1           2           10      153         4
HREL      10      11          5           4       40          4
LP        6       1           4           8       44          4
CB        12      11          8           7       40          3
SLH       0       None        0           0       None        0
RSJ       0       None        0           3       48          2
MSS       5       6           4           24      153         7
PSAC      7       11          3           31      161         12
ABB       10      1           5           55      48          21
SEM       12      1           7           22      149         12
DMSC      1       11          1           6       149         2
LMR       10      11          4           17      161         11
LME       2       1           1           8       48          5
MM        5       6           2           36      44          8
RH        1       6           1           0       None        0
TB        1       1           1           1       157         1

Chapter 4

Predicting Congestion Level in Wireless Networks

In our previous study, we had some constraints on the type of data that we analyzed and the conclusions that we reached. We attributed the AP load at a certain location directly to the number of users present at that location and the way they were distributed between the 2.4 GHz and 5 GHz frequency ranges. We also did not implement any sophisticated machine learning algorithms, but relied on mathematical models and plots to identify the trends because of the simplistic nature of the data we had. However, that study was crucial, as it led us to believe that a trend does in fact exist and that we can dig deeper into it with complex algorithms to find hidden patterns, gaining insights from more advanced tools that collect variations of data which can help us make informed decisions in regards to network congestion.

In this study, we attempt to determine the congestion level at various locations inside the University of Nevada, Reno (UNR) based on a certain day and time.

This prediction will lead us to identify the areas that are suffering from heavy congestion versus other areas where the congestion is relatively low at the same time. This identification can pave the path for optimized resource allocation, where resources can be pulled from areas with low congestion to alleviate the situation in high congestion areas, without having to add new resources every time congestion occurs. There is a myriad of information that we can gather from the wireless activity of clients and use to conduct this experiment. It is highly crucial to this study that we select the data types relevant to our research [28]. For this study, we have chosen four data attributes: number of clients, throughput, frame retry rate and frame error rate. These attributes help us answer the following questions: 1) Is there a pattern to how the users take advantage of wireless networks? 2) When does the client activity increase significantly? 3) How can we relate these four attributes to congestion?

In this study, we investigate patterns in usage data in terms of the number of clients, throughput, frame error rate and frame retry rate. Furthermore, we apply machine learning algorithms to predict the future values of these attributes based on the pattern. For instance, is the change in throughput values for one Tuesday similar to the change in throughput values for another Tuesday throughout the entire day? The existence of these trends allows us to apply supervised learning algorithms to predict the values [29] of the four attributes given the date and time. However, we still do not know what these predictions mean and how they relate to congestion. That is where unsupervised learning comes into play, with the main idea being to draw inferences from data sets consisting of input data without labeled responses and to find hidden groupings in the data [30].

Using this mechanism, we can efficiently classify data items with the four attributes into different groups, where each group defines a certain level of congestion. We use the rationale that if there are more clients connected to an access point, then the probability of a higher level of congestion is greater. We first pick the day and time in the future for which to determine the congestion status, after which we predict the number of clients, throughput, frame retry rate and frame error rate for those inputs. Finally, we feed these predicted values to a clustering mechanism which identifies whether these values refer to a high, medium or low level of congestion.

To gain a better understanding of how users are taking advantage of wireless networks, we examined the user activity data over a period of 5 weeks inside UNR. We polled the access points using SNMP commands [31] to extract the required information in terms of clients, throughput, packet drop and retry rates. The data was collected every 5 minutes so as not to overload the server with queries. The data consists of the following attributes: Timestamp, Location, Clients, Throughput, Frame Retry Rate, Frame Error Rate. We preprocessed the data into a suitable format to be stored in and accessed from a database. Afterwards, we averaged the data instances on an hourly basis to get a cleaner representation, and the analysis led us to believe that there exists a trend in how users use Wi-Fi services across different hours of a day. Following this trend, we implemented Support Vector Regression and Polynomial Regression to predict the values of Clients, Throughput, Frame Retry and Frame Error in the future for a certain location, day and time. To understand how congestion can be identified with these attributes, we created a clustering model using the EM algorithm from the averaged data instances for the 5 weeks with the 4 attributes. The model successfully grouped these data instances into 3 clusters, each cluster determining a congestion level of low, medium or high. Finally, integrating the whole idea, the predicted values of the 4 attributes for a certain location, day and time given by our supervised learning algorithms were fed to the clustering model, with the result specifying the congestion level corresponding to those inputs.
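A condensed sketch of this integrated pipeline follows; every interface here is a hypothetical placeholder for the models described in the following sections:

```python
ATTRIBUTES = ("clients", "throughput", "frame_retry", "frame_error")

def congestion_level(location, day, hour, predictors, clusterer, labels):
    """Predict the four attributes for a location/day/hour with the trained
    supervised models, then map the resulting point to a congestion cluster."""
    point = [predictors[a].predict(location, day, hour) for a in ATTRIBUTES]
    cluster_id = clusterer.predict([point])[0]
    return labels[cluster_id]   # e.g., {0: "medium", 1: "high", 2: "low"}
```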

4.1 Methodology

4.1.1 Data Collection

The results presented in this chapter are based on the analysis of spectrum usage data collected across a section of UNR. UNR spans an area of 290 acres and comprises more than 80 buildings. Two central controllers manage a total of approximately 1275 active access points at any point of time throughout the university. Many of these APs are deployed in outdoor spaces as well. To facilitate the data collection process, we have used the SNMP network protocol, which allows different devices on a network to share information with one another in a consistent and reliable manner [18]. We polled the APs with different queries (snmpwalk commands) via the controllers that manage these APs. We poll every 5 minutes to obtain information reasonably frequently, within the limits of the computation and bandwidth available on our two polling workstations [32].

At any point of time, separate queries were made to these APs, explicitly asking for each of the following pieces of information:

• number of clients connected to the AP

• location of the AP

• mac address of that AP

• AP bssid for 2.4 GHz and 5 GHz radio

• channels operating in that AP

• throughput of the AP

• frame retry rate of the AP (The number of retry packets as a percentage of the total packets transmitted and received by this AP)

• frame error rate of the AP (The number of error packets as a percentage of the total packets received on this AP)

Due to time constraints, we have limited our study to a geographical section of UNR consisting of 5 co-located buildings: JCSU, MIKC, ARF, WFC, and NJC. The data spans a period of 5 weeks, from December 1, 2017 to January 5, 2018. There are a total of 233 APs within these 5 buildings and the outside premises, from which we have extracted all the information relevant to this research.
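As an illustration of this polling loop, the sketch below wraps the snmpwalk command-line tool from Python. The controller hostnames, SNMP community string and OID are placeholders: the actual Aruba MIB OIDs used in the study are not listed in the text.

```python
import subprocess
import time

CONTROLLERS = ["controller1.example.edu", "controller2.example.edu"]  # placeholders
CLIENT_COUNT_OID = "1.3.6.1.4.1.EXAMPLE"  # hypothetical OID standing in for the
                                          # per-AP client-count object in the Aruba MIB

def poll_once(host, community="public"):
    """Run one snmpwalk query against a controller and return its output lines."""
    result = subprocess.run(
        ["snmpwalk", "-v2c", "-c", community, host, CLIENT_COUNT_OID],
        capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

while True:                               # one round of queries every 5 minutes
    for host in CONTROLLERS:
        records = poll_once(host)
        # ... parse the records and insert them into the database ...
    time.sleep(300)
```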

4.1.2 Dataset Creation and Pre-processing

Once the data is gathered using the process described in Section 4.1.1, we create individual CSV files that hold the average clients per AP_MAC address, average throughput per AP_MAC, average frame retry per AP_MAC, and average frame error per AP_MAC. We compute and predict averages instead of totals in order to avoid any bias introduced by one hour having more data than another. If one hour has more data than the others, the totals could show high congestion when actually there is just a difference in the number of data points resulting in higher totals. Each individual CSV corresponds to a building on campus as well as a day of the week. Using python2.7 and the MySQLdb library, we write SQL queries and manipulate the results with python before exporting the data to a CSV. Our database contains two tables, namely DATA_TABLE and LOCATION_TABLE. DATA_TABLE contains TIMESTAMP, AP_MAC, CLIENT, THROUGHPUT, FRAME_RETRY, and FRAME_ERROR. LOCATION_TABLE contains AP_MAC and the corresponding location code AP_LOCATION. Since we want to filter our queries based on location, an inner join on AP_MAC is executed. From the inner join, aliased as S, we return the per record averages of CLIENTS, THROUGHPUT, FRAME_RETRY, and FRAME_ERROR. This is done 24 times via a for loop in order to get averages for every hour (00-23). This process is repeated for every date in the database. To compute the averages we use the built-in SQL aggregate AVG. Since we want to compute averages per MAC and not per record, we multiply the result returned by AVG(clients) by four, since there are four BSSIDs per MAC. This results in an average number of clients per MAC address.

For the three other attributes, we take the result returned by AVG(attribute) and multiply it by two, since there are duplicate values for each BSSID in a channel. This gives us the average throughput, frame retry, and frame error per MAC address on every iteration of the for loop (every hour). From there we export them to a CSV before the next iteration of the for loop. This process of exporting data to a CSV is done for every date in the database for the five buildings on the UNR campus. The dates corresponding to the same day of the week (Monday - Friday) are aggregated into one CSV that holds all the data for a particular weekday and location. The CSV is named according to the location and day of the week. MIKCmonday.csv, for example, holds all Monday data at location MIKC.
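A minimal sketch of this export step, assuming the schema described above (connection credentials, the example date and the location code are placeholders):

```python
import csv
import MySQLdb  # the study used python2.7 with MySQLdb; this sketch is 2/3-compatible

db = MySQLdb.connect(host="localhost", user="user", passwd="secret", db="wifi")
cur = db.cursor()

QUERY = """
    SELECT AVG(S.CLIENT) * 4,        -- four BSSIDs per MAC: per-MAC clients
           AVG(S.THROUGHPUT) * 2,    -- duplicate value per BSSID in a channel
           AVG(S.FRAME_RETRY) * 2,
           AVG(S.FRAME_ERROR) * 2
    FROM (SELECT d.* FROM DATA_TABLE d
          INNER JOIN LOCATION_TABLE l ON d.AP_MAC = l.AP_MAC
          WHERE l.AP_LOCATION = %s) AS S
    WHERE DATE(S.TIMESTAMP) = %s AND HOUR(S.TIMESTAMP) = %s
"""

with open("MIKCmonday.csv", "a") as f:        # naming pattern from the text
    writer = csv.writer(f)
    for hour in range(24):                    # hourly averages, hours 00-23
        cur.execute(QUERY, ("MIKC", "2017-12-04", hour))  # one example Monday
        writer.writerow((hour,) + cur.fetchone())
```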

4.1.3 Supervised Learning

To predict the average clients, throughput, frame retry and frame error per MAC address of an AP, we first plot each attribute as a function of time (hours) in order to identify trends before fitting an algorithm to the data. We found that the data points follow a normal distribution for all attributes. This normal distribution can be seen in Figure 4.1, where the average number of clients is plotted on an hourly basis. Based on the trend in the charts, we were able to identify peak congestion periods ranging from about 10 a.m. to 2 p.m. on average across all days and locations. The charts were created using the python3 library Matplotlib [33].

Supervised learning is a subsection of machine learning where the user feeds the model training data which contains labels. The model learns to classify or predict the labels based on the training data that is associated with it. Once training is complete, the model uses its prior knowledge of the training data to predict labels on new data. In this instance, we feed our model training data of Wi-Fi AP attributes (clients, throughput, frame retry and frame error) per hour, and it predicts the attribute at a given hour based on the training data.

After identifying the trends in the data, we decided to fit the data with support vector regression (SVR) because it does not take into account the outliers in the dataset and fits to the bulk of the data. SVR creates a regression line by splitting the difference between the two closest data points. Using the Gaussian kernel, we can create nonlinear regression lines by mapping the data points to a higher dimension. In addition to the SVR, we also fit a polynomial regression of the 2nd degree, which takes into account outliers in the dataset. This was done in order to compare our results to the SVR. Lastly, we create a weighted combination of both the SVR and the polynomial regression in hopes of finding an algorithm that allows the outliers some weight in the prediction, but not so much as to throw off the entire prediction. The SVR and polynomial regression were created and run using the Scikit-learn python3 library [34].
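A minimal sketch of these three fits with scikit-learn, on synthetic bell-shaped hourly data standing in for the real averages (the kernel parameters and the 0.7/0.3 mixing weights are illustrative assumptions, not the exact values used in the study):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
hours = np.arange(24)
X = hours.reshape(-1, 1)
# Synthetic stand-in for hourly client averages, peaking around midday.
y = 50 * np.exp(-((hours - 13) ** 2) / 18) + rng.normal(0, 2, 24)

svr = SVR(kernel="rbf", C=100, gamma=0.1).fit(X, y)           # Gaussian kernel
poly = make_pipeline(PolynomialFeatures(2), LinearRegression()).fit(X, y)

pred_svr, pred_poly = svr.predict(X), poly.predict(X)
pred_mix = 0.7 * pred_svr + 0.3 * pred_poly   # illustrative weighted combination

for name, pred in [("SVR", pred_svr), ("poly", pred_poly), ("mix", pred_mix)]:
    print(name, mean_squared_error(y, pred))
```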

The formula for MSE is MSE = (1/N) ∑_{i=1}^{N} (f_i − y_i)², where N is the number of data points, f_i is the predicted value, and y_i is the actual value. On average, SVR had a lower MSE than the polynomial and weighted regressions.
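As a sketch of this fitting-and-comparison step (the hourly values below are placeholders, and the 0.5/0.5 weighting of the combined model is an assumption, since the exact weights are not stated):

import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

hours = np.arange(24).reshape(-1, 1)                # hour of day, 00-23
clients = np.random.poisson(8, 24).astype(float)    # placeholder hourly averages

# SVR with a Gaussian (RBF) kernel fits the bulk of the data
svr_pred = SVR(kernel="rbf").fit(hours, clients).predict(hours)

# 2nd-degree polynomial regression, which does weight outliers
poly = PolynomialFeatures(degree=2)
lin = LinearRegression().fit(poly.fit_transform(hours), clients)
poly_pred = lin.predict(poly.transform(hours))

# Weighted combination of the two predictions
combo_pred = 0.5 * svr_pred + 0.5 * poly_pred

for name, pred in [("SVR", svr_pred), ("Polynomial", poly_pred),
                   ("Weighted", combo_pred)]:
    print(name, mean_squared_error(clients, pred))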

Once we found SVR to be the best-performing regression algorithm for this data, we moved on to unsupervised learning to cluster the data points and define levels of AP congestion.

FIGURE 4.1: Hourly Clients, Throughput, Frame Retry and Frame Error (panels a-d)

4.1.4 Unsupervised Learning

Clustering, an unsupervised learning technique, takes a set of points in n-dimensional space and finds coherent subsets, each consisting of points that are grouped together. The advantages of clustering algorithms are the ability to categorize data instances automatically and the ability to find groupings that we might not otherwise find [35]. After predicting the values of average clients, throughput, frame retry and frame error per AP MAC address for a specified date and time, we need to relate these values to a certain congestion level. However, there is no definitive way of assigning the predicted values to groups, so we used the EM clustering algorithm for this purpose. EM is a soft-clustering technique: it assigns each instance a probability distribution that indicates the probability of the instance belonging to each of the clusters [36]. Mixture models are a probabilistically sound way to do soft clustering. We assume our data is sampled from K different sources (probability distributions). The expectation maximization (EM) algorithm allows us to discover the parameters of these distributions and, at the same time, figure out which source each point comes from [37]. EM is a method to find the means and variances of a mixture of Gaussian distributions. We specifically chose EM because it allowed us to assign a probability for any data instance to fall into any cluster (congestion level). This is important because we are not making a hard determination of congestion for the predicted values; instead, we are exploring the degree to which those instances can be correlated to congestion. EM is also useful when the range of values differs widely between dimensions [38], which is true in our case, as the range of values for throughput is very large compared to the other attributes. Using the EM algorithm, we clustered the hourly averaged data instances per

AP MAC address for the entire 5 weeks using just 4 attributes: Clients, Throughput, Frame Retry and Frame Error. There were a total of 863 instances. We chose to divide these data points into three groups, and the algorithm was able to separate them into 3 clean clusters. As we analyzed the clusters, we found that each cluster corresponds to a certain level of congestion. One cluster contained data items with very low values for all the attributes; we identified this as the low congestion cluster. Another cluster held data items with moderate attribute values, which we named the medium congestion cluster. Finally, the data instances in the last cluster had very high values for almost all attributes, so we designated it the high congestion cluster. We used WEKA, a collection of machine learning algorithms for data mining tasks, to perform the EM clustering. The software, written in Java, contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization [39]. The accuracy of EM is measured using log-likelihood. Since all the data points are assigned to their respective clusters on the basis of probability, the objective is to maximize the likelihood of the data instances belonging to their assigned clusters [40].

Let D = {x_1, x_2, ..., x_n} be n observed data vectors and let Z = {z_1, z_2, ..., z_n} be n values of hidden variables (i.e. the cluster labels). The log-likelihood of the observed data given the model is

L(θ) = log p(D|θ) = log ∑_Z p(D, Z|θ)    (4.1)

where p is the probability of a data vector x_i belonging to a cluster in Z, θ denotes the parameters (mean and variance), and both θ and Z are unknown [41].

4.2 Results

We now present our machine learning models' predictions of whether a certain location will be congested at a future time. We demonstrate how the data in this research is fitted to the algorithms that we chose, what type of outputs are received, and how they are processed and analyzed.

4.2.1 SVR Prediction Model

In order to get a future prediction of the congestion level, the first step was to predict the individual attributes: average number of clients, throughput, frame retry, and frame error. We took the Support Vector Regression (SVR) from section 4.1.3 and created a basic user interface that allows the user to enter input parameters such as day of week, building, and hour of day. Based on those input parameters, the program queries the SVR model and outputs a prediction of the aforementioned attributes of Wi-Fi congestion. The output of the program can be seen in table 4.1. For building MIKC at 4 pm, the SVR predicted 14 clients per AP, a throughput of 1742 per AP, a frame retry of 20 per AP, and a frame error of 1 per AP. It is important to note that these values are averages across all APs in the MIKC building. The MSE for clients tells us the prediction differs from the actual values by about 2.45 clients on average; to find this average difference, we take the square root of the MSE for any particular attribute. From here, the predicted values are sent to the unsupervised EM clustering algorithm to get the resulting predicted congestion level.
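As a quick check of that figure, using the clients MSE from table 4.1:

import math

mse_clients = 6.03                        # clients MSE (MIKC, Monday, hour 16)
print(round(math.sqrt(mse_clients), 2))   # -> 2.46, the ~2.45 quoted above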

TABLE 4.1: SVR Output Prediction (MSE in parentheses)

Day       Hour  Location  Clients    Throughput       Frame retry  Frame error
Monday    16    MIKC      14 (6.03)  1742 (68078.43)  20 (6.69)    1 (0.04)
Thursday  05    JCSU      0 (0.49)   60 (7009.18)     7 (0.07)     0 (0.43)
Thursday  08    MIKC      5 (4.37)   515 (74291.53)   7 (10.98)    0 (0.75)

4.2.2 EM clustering

To perform EM clustering, we have a dataset with 863 instances and 4 attributes, as shown in figure 4.2. The data is in .arff format, which is the standard in WEKA. The 4 attributes are Clients, Throughput, Frame retry and Frame error, each with the data type 'numeric', meaning real numbers. For each data object, the comma-separated numbers denote values corresponding to the attributes in the attribute section. For instance, in line 9 of figure 4.2, 0 is the client number, 59 is the throughput value, 0 is the frame retry value and 2 is the frame error value. This data is obtained by averaging the 5-minute-interval values for all these attributes on an hourly basis per AP MAC address, so each data point refers to the number of clients, throughput, packet retry rate and packet drop rate on an AP for a certain hour of a certain day in a certain location. We then upload the .arff data file into WEKA and apply the EM clustering algorithm, specifying the number of clusters (k) as 3. The default number of clusters generated by this algorithm is 4, but we found that the clusters are more clearly separated and of more value when we set k to 3. The algorithm groups data instances into clusters based on the maximum likelihood estimates of the parameters (the mean and variance of each attribute) and assigns each instance to the cluster to which it has the highest probability of belonging.
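For illustration, such a file might be laid out as below; the relation name is hypothetical, the first data row is the instance cited above, and the other two rows are taken from table 4.2:

@relation wifi_congestion

@attribute Clients numeric
@attribute Throughput numeric
@attribute FrameRetry numeric
@attribute FrameError numeric

@data
0,59,0,2
13,1660,21,3
2,258,2,2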

FIGURE 4.2: Data used for EM clustering

Figure 4.3 shows the clustering output as produced by the EM algorithm. The three clusters are identified by the colors blue, red and green, with the auto-generated names cluster0, cluster1 and cluster2 respectively. In this plot, the x-axis is the instance number and the y-axis is the number-of-clients attribute. Due to space constraints, we present only the plot with the client attribute on the y-axis; the plots for the remaining attributes displayed similar results. To understand how the clusters are formed, we select differently colored points in the plot and compare them. The results are shown in table 4.2. Here, the three data points in cluster2 have very high values for all the attributes. Compared to that, the data points in cluster0 have lower attribute values but are still significant.

FIGURE 4.3: The 3 clusters generated by the EM algorithm

TABLE 4.2: Analysis of data in cluster2, cluster0 and cluster1

Cluster   Instance number  Clients  Throughput  Frame retry  Frame error
cluster2  84               13       1660        21           3
cluster2  85               13       1789        19           3
cluster2  86               13       1675        19           3
cluster0  493              2        258         2            2
cluster0  494              2        207         2            3
cluster0  495              2        258         2            3
cluster1  364              0        0           0            1
cluster1  366              0        4           0            1
cluster1  368              0        44          0            2

Finally, in cluster1, the attribute values for all the data points are very low and almost negligible. We analyzed other points in all the clusters and obtained similar results. Based on this analysis, we labeled the data points in cluster2, cluster0 and cluster1 as highly congested, moderately congested and less congested respectively, with the cluster labels high for cluster2, medium for cluster0 and low for cluster1 in terms of congestion. Figure 4.4 shows the accuracy of our EM clustering model using the log-likelihood, whose value is -5.56. Since the log-likelihood is the logarithm of a probability, and probability is always between 0 and 1, the log value is negative. The algorithm took 4 iterations to converge to this result. Figure 4.4 also shows the mean and standard deviation of each attribute for each cluster, which gives the range of attribute values within each cluster and an intuitive idea of the level of congestion in each cluster.
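The clustering itself was run in WEKA; as a rough stand-in, scikit-learn's GaussianMixture implements the same EM procedure and reports both the soft cluster probabilities and the average per-sample log-likelihood used above (the rows here are only the table 4.2 excerpts, not the full 863-instance dataset):

import numpy as np
from sklearn.mixture import GaussianMixture

# (Clients, Throughput, Frame retry, Frame error) rows from table 4.2
X = np.array([[13, 1660, 21, 3], [13, 1789, 19, 3], [13, 1675, 19, 3],
              [2, 258, 2, 2], [2, 207, 2, 3], [2, 258, 2, 3],
              [0, 0, 0, 1], [0, 4, 0, 1], [0, 44, 0, 2]], dtype=float)

gm = GaussianMixture(n_components=3, covariance_type="diag",
                     random_state=0).fit(X)
print(gm.predict(X))         # hard cluster assignment per instance
print(gm.predict_proba(X))   # soft per-cluster probabilities (EM's E-step)
print(gm.score(X))           # average log-likelihood of the data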

4.2.3 Final output

Using the clustering model generated in section 4.2.2, we predict the congestion level for the inputs used in section 4.2.1, based on the values of the 4 attributes predicted in table 4.1. Figure 4.5 shows the data, generated by the prediction model, that we used as test data for our clustering model. We want to determine which congestion-level group each of these instances belongs to. As evident from figure 4.6, the first data instance belongs to cluster2 (high congestion), corresponding to MIKC, Monday at 4 p.m. The second data instance belongs to cluster0

(medium congestion), corresponding to MIKC, Thursday at 8 a.m. The third data instance belongs to cluster1 (low congestion), corresponding to JCSU, Thursday at 5 a.m. In this way, we can predict the congestion level for any location, day and time in the future.

FIGURE 4.4: Evaluation of the EM clustering model

FIGURE 4.5: Predicted data used in the EM clustering model

FIGURE 4.6: Output of the clustering model showing the assigned cluster for each data instance

Chapter 5

Incorporating Machine Learning in a Game Theoretic Environment for Dynamic Spectrum Access

In the last two chapters, we conducted feasibility studies to detect and predict congestion beforehand and deploy optimized solutions to handle it efficiently. The leading cause of this congestion is the increasing demand for wireless spectrum, which leaves the networks facing a serious shortage of frequency bands, even though a large portion of these bands is underutilized. Under such a scenario, efficient sharing of the available spectrum becomes inevitable, and it is yet another way to tackle the growing congestion. Cognitive radios can sense the wireless environment, identify the wireless channels not being used by primary users, and access them dynamically. In any situation where a number of secondary networks are trying to get an available channel, there arises a game theoretic competition in which each wants to get a channel for itself while incurring as little cost as possible. The increase in cost is equivalent to the increase in time caused by the need to search for an available channel. This process could be sped up if the networks had a predictive mechanism to determine the optimal strategy. In this chapter, we investigate various predictive algorithms (Linear Regression, Support Vector Regression and Elastic Net) and compare them with traditional non-predictive game theoretic mechanisms. We measure the performance of these algorithms in terms of the time taken to reach system convergence, and we observe how a self-learning approach can help maximize the players' utilities in comparison to traditional game theoretic approaches. Since we are imposing learning on games, we need a strong set of data for the networks to learn from. We simulated an environment with a random number of competing networks and a random number of available channels, both of which change dynamically. We then calculated the optimal probability for the networks to switch when there are N networks and M channels; M and N change dynamically as the networks start capturing available channels. We did this for a set of M and N values and stored the corresponding probabilities in a database. This became our training dataset, from which the networks predict the switch/stay probability that best suits them for any number of competing networks and available channels. As the networks predict, these predictions become part of the training dataset, making it easier for the networks to predict in the future; we have thus implemented online learning in our game environment.

To implement learning, we used three different learning algorithms that are proven to be successful in estimating predictions: Linear Regression (LR), Support Vector Regression (SVR) and Elastic Net. Since we are dealing with a continuous variable, the probability, we used regression algorithms to make predictions. We divided our dataset into training and testing datasets, and the same training dataset is used by all the algorithms. Traditional game theoretic works investigating self-coexistence do not apply machine learning, but rather stick to game-theoretic tools to derive strategies that lead to a Nash Equilibrium. Our main purpose is to see how a machine learning approach to self-coexistence among networks gives better results. Through this study, we try to establish the feasibility of incorporating machine learning algorithms in a game theoretic environment. The networks will play this incomplete-information game, without knowledge of the other players' strategies, using the self-learning approach, and we will find out how the time to access an available channel is minimized by opting for this strategy.

5.1 System Model

Let us assume a game environment with dynamically changing components. There are N networks and M channels, where the networks are contending for available channels. For simplicity, we assume that each network acquires only one channel for this game. While searching for available channels, if two or more networks land on the same channel, a collision occurs, resulting in failure of transmission. The networks are then faced with two choices: either switch to another channel or stay on the same channel, assuming the others will leave. Both choices carry risks, and it is the networks' job to determine the proper strategy as the situation demands. We consider the networks to be rational entities: each network tries to maximize its own utility, i.e. minimize the time it takes to acquire an available channel. For simplicity, we consider the number of available channels to always be equal to or greater than the number of competing networks. This way, equilibrium is always assured, as all the networks will get a channel at some point in time, leading to system convergence. So, if M is the number of available channels and N is the number of networks seeking a channel at any given point in time, M ≥ N. In this chapter, we have not considered the case where M < N, so as not to complicate the equilibrium scenario: with more networks than channels, some networks will always be left without a channel, and it then becomes difficult to define the system convergence point. So, we model a dynamic game in a non-cooperative environment where the networks have to determine an optimal strategy to either switch or stay, and co-exist among themselves to attain equilibrium in the least time possible.

5.1.1 Challenges

By an available channel, we mean one that is not being used by any of the networks. Using cognitive radio, it is possible to identify whether there is any primary network activity. However, the purpose of this game is to see how quickly a network can get a channel that no other secondary network is trying to access. If two or more networks try to grab the same channel, they collide with each other, resulting in a garbled transmission. In such a situation, they each have a choice to either switch to another channel or stay, assuming the others will leave. We will see which of these strategies works better for the networks. Let us illustrate the game as a two-player prisoners'-dilemma problem. We assume only two networks as players of the game, each with the strategy pair (switch, stay). We consider the cost of switching to another channel to be C, while the cost of staying on the same channel remains 0. The players x and y play this game in a pure strategy space, i.e. each of them either switches or stays with a probability of 1 or 0. In table 5.1, this game is expressed in strategic form:

x\y     Switch  Stay
Switch  (C, C)  (C, 0)
Stay    (0, C)  (0, 0)

TABLE 5.1: Strategic-form minority game with networks x and y

Each cell in the table contains a pair that represents the costs to players x and y when they choose the corresponding strategies. This form of game can be solved using Iterated Elimination of Dominated Strategies (IEDS). We find that IEDS gives us (Stay, Stay) as the Nash Equilibrium solution of this game, as it dominates all the other combinations of strategies. However, if both players only choose to stay, the game never proceeds and equilibrium is never reached. This special two-player case can easily be generalized to an N-player game, and we would still obtain the same result. What we can conclude is that always choosing to stay with a probability of 1 is never a good strategy for the players. Now, let us consider a scenario where some networks always choose to switch in case of collision. Switching to a different channel every time is not the ideal solution either: it is highly likely that the collided networks will again land on a channel that others are trying to access. When the competition is very high, all the networks continuously struggle for an available channel, and if all of them switch at every instance, the players may collide very frequently. We illustrate this scenario with figure 5.1, where squares represent the channels and numbered circles represent different networks. Figure 5.1 shows the game proceeding through various stages when there are 6 networks and 10 channels. In fig 5.1(a), the game starts with the networks randomly choosing channels. Network 1 is successful in getting a channel at this stage, and the rest are in conflict, left with the choice to either stick to the same channel or find another. The game then moves to the next stage, shown in fig 5.1(b). At this stage, Networks 5 and 6 switch but land on the same channel and collide again. In the next stage, shown in fig 5.1(c), they are able to acquire available channels when Network 5 stays and Network 6 switches. Finally, the game ends, as shown in fig 5.1(d), when there are no more competing networks.

FIGURE 5.1: Networks and Channels a) at the beginning of the game; b) after the first stage when Network 2 got a channel; c) after the second stage when Networks 1, 3 and 4 got channels; d) at the last stage when equilibrium is achieved

Now we see that if the networks only ever decide to switch, they are not guaranteed a channel; rather, they remain open to collisions and incur the additional cost of switching to another channel. We have established that the networks cannot switch or stay with a probability of 0 or 1. Thus, the networks need to identify the situations where switching would be beneficial and the situations where staying would be beneficial. This is achieved by moving from the pure strategy space to a mixed strategy space, in which probabilities are assigned to each of the two strategies. If p denotes the probability to switch at any given time, the probability to stay at that time is 1 − p. It is then the job of the networks to identify the optimal strategy tuple (p, 1 − p) depending on the environment. Since this is a dynamic game, the environment changes at every instance: the networks are continuously searching for an available channel, and some of them succeed at the very first instance while others may take a while. Under such circumstances, we have to develop a way for the networks to adapt to their surroundings by learning to change their strategies accordingly. A minimal simulation sketch of this switch/stay dynamic follows.
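The sketch below simulates one run of this game under the stated M ≥ N assumption; the uniform random choice of a new channel on a switch and a single common switch probability p are simplifications, not the thesis's exact simulator:

import random

def time_to_equilibrium(N, M, p, rng=None):
    # Time slots until each of the N networks holds its own channel (M >= N).
    rng = rng or random.Random(0)
    contenders = [rng.randrange(M) for _ in range(N)]   # initial random picks
    taken = set()                                       # channels already won
    t = 0
    while contenders:
        t += 1
        counts = {}
        for ch in contenders:
            counts[ch] = counts.get(ch, 0) + 1
        remaining = []
        for ch in contenders:
            if ch not in taken and counts[ch] == 1:
                taken.add(ch)              # alone on a free channel: success
            else:
                if rng.random() < p:       # switch to a not-yet-won channel
                    ch = rng.choice([c for c in range(M) if c not in taken])
                remaining.append(ch)       # contend again next slot
        contenders = remaining
    return t

print(time_to_equilibrium(N=6, M=10, p=0.6))   # e.g. the setting of figure 5.1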

5.2 Self-learning in the game

Machine learning is the study of algorithms that identify complex patterns and hidden relationships in large amounts of data and facilitate intelligent decision-making [42]. Machine learning is similar to data mining in the sense that both processes look for patterns in data. However, the information obtained from machine learning is used not just for human comprehension but to adjust programs so that they become intelligent enough to make rational decisions. There exist many machine learning techniques: classification, clustering, regression and rule-based methods are some of them. It covers a wide range of applications such as medical diagnosis, prediction, pattern recognition, recommender systems and forecasting. For this particular problem, we apply regression to perform prediction. Regression analysis is a statistical technique for investigating and modeling the relationship between variables [43]. In this chapter, we model the relationship between the networks and channels to predict the probability to switch. The dynamicity of this game is handled using machine learning algorithms. Since the game components change very frequently, the networks need a strategy that also changes with the environment. That is why we have implemented learning techniques for these networks. These techniques involve a predictive mechanism through which the networks identify the probability at any given point in time; the predictions depend upon the number of competing networks and the number of available channels. The networks need to learn before they make these predictions, so we train them using a dataset in which we provide the optimal probability for each combination of networks and channels. Based on this dataset, the networks learn to predict probabilities at any point in time. As they proceed in the game, the predicted values are incorporated back into the database to help the networks make further predictions. Before we proceed, let us briefly discuss the algorithms that we used for prediction. The following methods are intended for regression in which the target value is expected to be a combination of the input variables. In mathematical notation, given an input vector x where x_1, x_2, ..., x_n are the features or independent variables, and y is the predicted value or dependent variable, then

y(w, x) = b + w_1 x_1 + w_2 x_2 + ... + w_n x_n    (5.1)

where w = (w_1, w_2, ..., w_n) is the coefficient vector and b is the intercept. A regression model is stated in terms of a connection between the predictors x and the response y [44].

5.2.1 Linear Regression

Linear regression is used to study and identify the relationships between dependent variables (y) and independent variables (x) in a dataset. When there is only one independent variable, it is called simple linear regression; when there is more than one, it is called multiple linear regression. This relationship between variables is defined using a linear model, as described

by equation (5.1), with coefficients w = (w_1, w_2, ..., w_n) chosen to minimize the residual sum of squares between the actual values in the data and the values predicted by the linear model. Mathematically, it solves a problem of the form:

minimize ∑_{i=1}^{m} (y_i − ŷ_i)² = ∑_{i=1}^{m} (y_i − (w · x_i + b))²    (5.2)

where y_i is the actual output as obtained from the training dataset and ŷ_i is the predicted output from the linear approximation. Linear regression is typically used in predictive analysis. In [45], linear regression experiments were used to predict the age of a text's author based on content features and stylistic features.

5.2.2 Support Vector Regression

Support Vector Regression shares the same idea as the Support Vector Machine: find a hyper-plane that separates the data cleanly in multidimensional space, with as much separation between the data points and the hyper-plane as possible. If the data is not clearly separable, it is transformed by one of the kernels into a higher-dimensional (n+1) space and is then partitioned by an (n+1)-dimensional hyper-plane [46][47]. In SV regression, our goal is to find a function f(x) that has at most ε deviation from the actually obtained targets y_i for all the training data, and at the same time is as flat as possible. In other words, errors are acceptable as long as they are less than ε, but any deviation larger than this has to be accounted for. We have a linear function of the form

f(x) = w · x + b    (5.3)

Flatness in the case of (5.3) means that one seeks a small w. One way to ensure this is to minimize ||w||² = w · w (dot product) [47]. The problem can be written as a convex optimization problem:

minimize (1/2) ||w||²
subject to
    y_i − w · x_i − b ≤ ε
    w · x_i + b − y_i ≤ ε    (5.4)

Support Vector Regression (SVR) is a computational tool that has recently received much attention in the system identification literature, especially because of its successes in building nonlinear black-box models. The main feature of the algorithm is the use of a nonlinear kernel transformation to map the input variables into a feature space so that their relationship with the output variable becomes linear in the transformed space. This method has excellent generalization capabilities for high-dimensional nonlinear problems due to the use of kernels such as radial basis functions, which have good approximation capabilities. Another attractive feature of the method is its convex optimization formulation, which eliminates the problem of local minima while identifying nonlinear models [48].

5.2.3 Elastic Net

Elastic Net is a regularization method for regression that combines the penalties of both LASSO and Ridge regression; it performs at least as well as either of them and generally gives better results than both. LASSO occasionally achieves poor results when there is a high degree of collinearity among the features, since it selects one of them at random. Ridge regression combats overfitting and performs better when features are collinear, but it only scales the coefficients down without discarding them. Elastic net is a hybrid approach that linearly combines the two norms: the L1 (Manhattan distance) of LASSO and the L2 (Euclidean distance) of Ridge. The naive elastic net finds its estimators in two stages: first it finds the ridge regression coefficients, and then it applies the lasso shrinkage to remove the less correlated coefficients. Elastic net is computationally more expensive than LASSO or ridge, as the relative weight of the LASSO and ridge penalties has to be selected using cross-validation. Elastic net is useful when multiple features are correlated with one another: LASSO is likely to pick one of them at random, while elastic net is likely to pick both. The objective function to minimize for Elastic Net is:

minimize ∑_{i=1}^{m} (y_i − w · x_i − b)² + αρ ∑_{j=1}^{n} |w_j| + (α(1 − ρ)/2) ∑_{j=1}^{n} w_j²    (5.5)

Similar to the LASSO, the elastic net simultaneously performs automatic variable selection and continuous shrinkage, and it can select groups of correlated variables. It is like a stretchable fishing net that retains 'all the big fish'. Simulation studies and real data examples show that the elastic net often outperforms the LASSO in terms of prediction accuracy. Elastic Net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors is much bigger than the number of observations [49].
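As an illustration, the following sketch fits all three predictors to (networks, channels) to probability rows; the first four targets are the experimental values reported later in table 5.2, the last two rows are placeholders, and the Elastic Net alpha is an assumption:

import numpy as np
from sklearn.linear_model import LinearRegression, ElasticNet
from sklearn.svm import SVR

# (N networks, M channels) -> optimal switch probability
X = np.array([[10, 10], [10, 20], [10, 30], [10, 40],
              [20, 30], [30, 40]], dtype=float)
y = np.array([0.522, 0.765, 0.858, 0.879, 0.70, 0.65])  # last two placeholders

models = {"Linear Regression": LinearRegression(),
          "SVR": SVR(kernel="rbf"),
          "Elastic Net": ElasticNet(alpha=0.01)}
for name, model in models.items():
    model.fit(X, y)
    p = float(model.predict(np.array([[10.0, 25.0]]))[0])  # unseen (N, M) pair
    print(name, round(p, 3))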

5.3 Proposed Mechanism

Since this is a dynamic game, it proceeds through various stages before equilibrium is achieved. We divide the game into time slots to mark the different stages; the game starts at time t = 0, and t increases at every stage. In this study, t is measured in time units. When the game begins, the networks start competing among themselves for available channels. Some of these networks succeed in acquiring a channel for themselves, while others collide. These networks have the choice to either switch to another channel or stay on the same channel, assuming that the other conflicted networks will leave. When a network switches, it needs more time to sense, search and find another available channel; when it stays, it only loses the time slot until the next transmission attempt. So, switching costs these networks more time than staying. After deploying their strategy to either switch or stay with probability p, the conflicted networks and the remaining available channels move on to the next time slot. This process is repeated over successive time slots until every network has a channel, which we define as the equilibrium point. Our objective is to minimize the number of time slots taken to achieve equilibrium. This scenario is illustrated in figure 5.2, where M, N and P denote the number of channels, the number of networks and the optimal probability to switch, respectively. The game starts at time t = 0, where 40 networks are competing for an available channel. Of these, 10 networks are successful in accessing available channels on the first attempt and do not compete further. The conflicted networks either switch or stay depending on their value of P and move on to the next time slot with the remaining available channels. The game thus proceeds through n stages until equilibrium is achieved at time T'.

FIGURE 5.2: Various stages of the game divided into time slots

We provide four different alternatives for the mixed strategy probability p and study each outcome in terms of the time taken to reach equilibrium: (i) using a random p at every stage of the game; (ii) staying neutral and using p = 0.5 throughout all the stages of the game; (iii) using the p obtained through the traditional game theoretic approach; (iv) using the p obtained through prediction in a machine-learning-based environment. Using a random probability means that the networks randomly choose the probability to switch at every stage of the game, avoiding the overhead of prediction but lacking a proper strategy. The second alternative is to stay neutral: neutrality, in this case, is the assumption that the networks are unbiased between switching and staying and choose a probability of 0.5 throughout the game. The third mechanism gives us the optimal probability at the Nash Equilibrium point after solving the game with traditional game theoretic tools; we use this probability throughout the game and observe the time taken to reach system convergence. Finally, our last alternative is to incorporate a learning model into the game, where the networks themselves predict their optimal strategy p. None of the first three mechanisms uses a machine learning approach to solving the game. We will see that a machine-learning-based environment facilitates self-coexistence at a faster rate than any of the other three mechanisms. Learning requires an initial dataset for the networks to learn from. This dataset is simply a collection of the networks' past actions and the strategies that gave them maximum utility in each case. After enough trials, the networks are able to identify the proper strategy for a situation and make predictions accordingly. To train the networks, we created our own database consisting of a number of combinations of networks (N) and channels (M). For each combination, we played the game multiple times using all possible probabilities and calculated the time taken to reach convergence in each case. The probability giving the least time to achieve equilibrium is then considered the optimal probability for that combination. We did this with N ranging from 2-50 and M ranging from 2-50 where M ≥ N, giving us a total of 1176 data points; a sketch of this sweep appears after figure 5.3. In figure 5.3, we present the graphs used to calculate the optimal probabilities when there are 10 networks and 10, 20, 30 and 40 channels respectively. Since each is a graph of probability to switch vs. time to reach equilibrium, the least time taken corresponds to the optimal probability. This is how we calculated the optimal probabilities required for our training dataset. One interesting thing to note is that for the cases with a higher number of channels, the system convergence time decreases, implying that it takes less time to reach equilibrium when the competition is lower.

FIGURE 5.3: Time to reach equilibrium (in time units) and corresponding optimal probability to switch with varying number of channels
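As referenced above, each training row can be produced by sweeping p and keeping the best value. This sketch reuses the time_to_equilibrium function from the sketch in section 5.1.1; the probability grid and trial count are assumptions:

import random

def optimal_switch_probability(N, M, trials=200):
    # Sweep p over a grid; return the value minimizing mean convergence time.
    rng = random.Random(42)
    best_p, best_time = None, float("inf")
    for step in range(1, 20):                       # p = 0.05, 0.10, ..., 0.95
        p = step / 20.0
        avg = sum(time_to_equilibrium(N, M, p, rng)
                  for _ in range(trials)) / float(trials)
        if avg < best_time:
            best_p, best_time = p, avg
    return best_p

# Training rows (N, M, P), with M >= N as in the text:
# rows = [(N, M, optimal_switch_probability(N, M))
#         for N in range(2, 51) for M in range(N, 51)]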

To evaluate our mechanism, the dataset obtained for learning is divided into training and testing datasets. The training dataset is used to train the networks so that they can predict the probability, and the testing dataset is used to check the accuracy of their predictions. Of the 1176 data points in total, 966 are used for training and the rest for testing. We store the training data in a database with 3 columns: Network, Channel and Probability. We discuss the flow of the program with the algorithm below, where N denotes the number of competing networks, M denotes the number of available channels and P denotes the optimal probability to switch at any time during the program.

input : M, N
output: Time

Time ← 0
Equilibrium ← False
play the game with (M, N)
while Equilibrium not True do
    find the (M, N) row in the database
    if the (M, N) row is found then
        use P from the database
    else
        predict P using the machine learning algorithm
        augment the database with the predicted P
    end
    M ← number of remaining channels
    N ← number of remaining networks
    play the game with (M, N)
end
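A runnable rendering of this loop, as a sketch: here database maps (M, N) pairs to stored probabilities, model is one of the fitted regressors from section 5.2, and play_stage is assumed to resolve one time slot and return the remaining channels and networks (passing P = None for the first, purely random, slot is an assumption of this sketch):

def run_game(M, N, database, model, play_stage):
    # Returns the number of time slots until equilibrium (N reaches 0).
    time = 1
    M, N = play_stage(M, N, None)                   # first slot: random picks
    while N > 0:
        if (M, N) in database:
            P = database[(M, N)]                    # reuse the stored optimal P
        else:
            P = float(model.predict([[N, M]])[0])   # predict P for this stage
            database[(M, N)] = P                    # online learning: grow table
        M, N = play_stage(M, N, P)
        time += 1
    return time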

5.4 Results

We simulated a game environment with varying numbers of networks and channels and observed the results when using a predictive strategy vs. a non-predictive strategy. For the non-predictive strategy, we used three different mechanisms: random probability, neutral probability and the traditional game-theoretic probability. In the predictive strategy, the probability is predicted by the networks at every stage of the game, depending upon the number of networks and channels at that particular stage; we call this the optimal probability. For training purposes, we provided an initial set of data, also calculated through our simulation experiment. The source code for this program was written in Python, making use of the SciPy and NumPy libraries for scientific and numerical calculations. We assume that N networks are competing for M channels, where M ≥ N. All the networks have a mixed strategy to either switch or stay when faced with conflict, and each is trying to access an available channel in as little time as possible. The networks reach the equilibrium state when every network has found an available channel. We used three different machine learning algorithms for prediction: Linear Regression, Support Vector Regression and Elastic Net. It is important to measure the accuracy of these algorithms before relying on them to predict our strategy for the game. In table 5.2, we present the experimentally calculated optimal probabilities that incur minimum cost when there are 10 networks and 10, 20, 30 and 40 channels respectively, together with the optimal probabilities calculated for the same cases by the prediction algorithms. It can be seen that the prediction results closely match the results obtained from the game simulation. Some predictions are more accurate than others, but overall the error is negligible, justifying the use of the predictive mechanism as an efficient way of finding the mixed strategy probability for the game. One interesting thing to note is that as the gap between the number of networks and channels increases, the probability also increases, although at a small rate. What we understand from this result is that switching is more beneficial for the networks than staying when the competition is lower.

10 competing networks

Number of        Experimental  Linear      Support Vector  Elastic
available bands  calculation   Regression  Regression      Net
10               0.522         0.631       0.622           0.641
20               0.765         0.689       0.711           0.692
30               0.858         0.746       0.777           0.743
40               0.879         0.804       0.779           0.794

TABLE 5.2: Comparison between experimentally calculated and predicted probabilities

Next, in figure 5.4, we examine the accuracy of these algorithms more closely with the graph for 10 competing networks. The minimal difference in the prediction results among these algorithms shows that any of them can be used interchangeably to implement machine learning in the game. In table 5.3, we present the Mean Square Error (MSE) of all three algorithms, calculated using our test dataset. From the table, we can see that, overall, Elastic Net gives us the most accurate results. All of these algorithms have a small error, further confirming their accuracy in the system.

FIGURE 5.4: Channel switching probability with varying number of channels

Finally, in figures 5.5, 5.6, 5.7 and 5.8, we present the major finding of this study: the use of a learning strategy at every stage of the game gives better results than every other scenario. We calculate the time taken to reach equilibrium in each of the four scenarios: (i) the networks use random probabilities at every stage of the game; (ii) the networks use a probability of 0.5 throughout the game; (iii) the networks use the optimal probability obtained through traditional game theoretic tools and keep it constant through all the stages of the game; (iv) the networks predict the optimal probability at every stage of the game using the three learning algorithms: Linear Regression,

Algorithm                  Mean Square Error
Linear Regression          0.03345
Support Vector Regression  0.181004
Elastic Net                0.025036

TABLE 5.3: Mean square errors of predictive algorithms

Support Vector Regression and Elastic Net. This is done for 10 competing networks with 10, 20, 30 and 40 available channels, and we compare the time taken by each strategy in each case against the others. Figures 5.5, 5.6, 5.7 and 5.8 show the time taken to reach equilibrium when playing the game with the different strategies in the case of 10 competing networks and 10, 20, 30 and 40 available channels respectively. As seen from these figures, even at different levels of competition, the predictive strategy gives better results than all the other strategies: the time taken to reach equilibrium decreases when the networks use machine learning algorithms to identify the proper strategy. This gain in performance increases for even larger game environments with more players. Since we are constantly growing the database with new combinations of N and M, the networks learn to adapt their strategies to such large environments. These results give us a new lead: rather than sticking to traditional game theoretic tools, a machine-learning-based environment has a more positive effect on these networks and helps them achieve self-coexistence in a better way. In table 5.4, we compare the time taken by all the non-predictive strategies against the time taken by the three learning algorithms when there are 10 networks with 10, 20, 30 and 40 channels respectively.

FIGURE 5.5: Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 10

10 competing networks

Number of        Random    Neutral   Game theoretic  Linear      Support Vector  Elastic
available bands  strategy  strategy  strategy        Regression  Regression      Net
10               30.11     21.45     17.83           16.46       17.23           17.32
20               18.69     10.25     8.27            7.21        6.24            6.29
30               10.71     6.48      8.11            8.35        20.8            17.3
40               8.29      7.27      6.34            3.55        6.23            6.21

TABLE 5.4: Comparison between the time taken to reach equilibrium (in time units) when using different strategies

FIGURE 5.6: Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 20

FIGURE 5.7: Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 30

FIGURE 5.8: Time taken to reach equilibrium (in time units) with different strategies when N = 10 and M = 40

Chapter 6

Conclusion and Future Work

The ultimate goal of these measurements and this study of Wi-Fi networks is to provide future users with good connectivity despite differences in network infrastructure and fluctuations in spectrum usage. Before investing in technologies that assist in load balancing or dynamic access point provisioning, it is crucial to have a deeper understanding of current and future Wi-Fi usage trends, which can be achieved through the measurements conducted in our study. In a world where everything is dynamic, the applications of UAVs are skyrocketing. While many major enterprises are opting for aerial mechanisms for provisioning WiFi service in remote areas, we have tried to adapt the concept by using UAVs for better WiFi coverage for small-scale industries with much less investment. In this thesis, we conducted a feasibility study on how to deploy floating APs in locations with a very high user base and dynamically changing demands. For this purpose, we analyzed the WiFi spectrum usage at the University of Nevada, Reno over different locations and at different times, and observed the impact on all the WiFi channels. We found that users tend to form clusters at different locations during different hours of the day, and that each building has its own busy hours depending on the day of the week. This trend in spectrum usage gave us crucial information regarding the congested locations and the busiest hours at those locations. We also analyzed the channels used the most during those peak hours and concluded that the users are almost always associated with the same channel, creating a heavy load on that channel. Thus, the deployed APs can not only serve the crowded locations but also make use of the underutilized channels to bring the congestion down to a minimum level. The first step in dealing with Wi-Fi network issues is identifying congestion, and when and where it occurs. In this thesis, we have accurately predicted the congestion level at certain locations for specified dates and times inside UNR, a public university equipped with thousands of access points and an even larger daily user base. After realizing that there is in fact a pattern in the network usage across the same days of different weeks, we were able to successfully predict the values of 4 attributes that correlate to congestion (clients, throughput, frame retry and frame error) using SVR based on their historical values. We then applied an EM clustering model to segregate the original training dataset into clusters, each defining a certain level of congestion: low, medium or high. The predicted attribute values for different locations, days and times are then used as test data for this model, which identifies whether each data instance corresponds to low, moderate or high congestion. In this way, we have integrated two machine learning approaches, supervised and unsupervised learning, to determine the congestion level in a wireless network. We also modeled dynamic spectrum access using a novel game-theoretic approach. We incorporated self-coexistence among cognitive-radio-based networks trying to access an available channel in a non-cooperative environment. This was done by making the networks intelligent: they analyze their surroundings using various machine learning algorithms so that they can learn their optimal strategy.
This means they can predict the mixed strategy probability depending on the number of competing networks and available channels. We also measured the accuracy of the learning algorithms that we used: Linear Regression, Support Vector Regression and Elastic Net. We found that the networks reach equilibrium more quickly when they opt for the prediction mechanism, thus minimizing their cost and maximizing their utility. The results obtained through this study open up a vast number of possibilities for the wireless research community. As discussed, in this thesis we queried an Aruba controller to obtain measurements such as the number of clients, AP name, location, channel etc. To explore the yet-undiscovered horizons and find ideas that can help create better network management, these kinds of studies involving measurements of Wi-Fi attributes are a must. To this end, we have identified new measurement perspectives, such as advanced SNMP queries to the controller and syslogs, for obtaining more detailed information on additional Wi-Fi variables, including channel throughput, Signal-to-Noise Ratio, number of transmitted packets/bytes of data, packet drop rate etc., which help depict a very clear picture of network congestion and bring forth new solutions for tackling it. Moreover, this study is a stepping stone towards security-aware cyber infrastructure analysis and design: a deeper analysis of the data collected in this study will unveil robust network design solutions that provide users not only with a quality service but also with a secure environment. This study is mainly important for network designers, helping them understand the activities of mobile users in the network, such as network congestion, variance in network activity across space, time and channels, and crowd mobility, and use this information to better plan and extend network infrastructure: predicting congestion and proactively implementing dynamic AP provisioning services, reconfiguring APs to balance the load across all channels, using mobility patterns to identify locations that require better service, and so on. The study of network congestion combined with user mobility trends can further research into efficient route calculation for deploying floating APs, minimizing the flight time of these APs, and providing better coverage with fewer additional APs through strategic channel allocation. We believe that this study can be scaled to bigger networks that exhibit similar trends. It serves as a tool for system administrators who constantly monitor wireless networks, helping them better optimize resources and infrastructure and allowing them to proactively avoid situations that might lead to a massive network load-balancing issue.

Bibliography

[1] Chengqi Song and Qian Zhang. “Intelligent dynamic spectrum access assisted by channel usage prediction”. In: INFOCOM IEEE Conference on Computer Communications Workshops, 2010. IEEE. 2010, pp. 1–6.

[2] Anna Kamińska-Chuchmała. “Performance analysis of access points of university wireless network”. In: Rynek Energii 1.122 (2016), pp. 122–124.

[3] Stuart M Adams and Carol J Friedland. “A survey of unmanned aerial vehicle (UAV) usage for imagery collection in disaster research and management”. In: 9th International Workshop on Remote Sensing for Disaster Response. 2011, p. 8.

[4] Gurkan Tuna, Bilel Nefzi, and Gianpaolo Conte. “Unmanned aerial vehicle-aided communications system for disaster recovery”. In: Journal of Network and Computer Applications 41 (2014), pp. 27–36.

[5] Luis F Gonzalez et al. “Unmanned Aerial Vehicles (UAVs) and artificial intelligence revolutionizing wildlife monitoring and conservation”. In: Sensors 16.1 (2016), p. 97.

[6] Noel Sharkey. “The automation and proliferation of military drones and the protection of civilians”. In: Law, Innovation and Technology 3.2 (2011), pp. 229–240.

[7] Amazon wins patent for a flying warehouse that will deploy drones to deliver parcels in minutes. https://www.cnbc.com/2016/12/29/amazon-flying-warehouse-deploy-delivery-drones-patent.html. 2016.

[8] Project Loon. https://loon.co/.

[9] Facebook's Giant Internet-Beaming Drone Finally Takes Flight. https://www.wired.com/2016/07/facebooks-giant-internet-beaming-drone-finally-takes-flight/. 2016.

[10] Kanchan Kamnani and Chaitali Suratkar. “A review paper on Google Loon technique”. In: International Journal of Research In Science & Engineering 1.1 (2015), pp. 167–171.

[11] Facebook's solar-powered drone makes first full test flight. https://www.engadget.com/2016/07/21/facebooks-solar-powered-drone-makes-first-full-test-flight/. 2016.

[12] Huazhi Gong and JongWon Kim. “Dynamic load balancing through association control of mobile users in WiFi networks”. In: IEEE Transactions on Consumer Electronics 54.2 (2008).

[13] Chaoming Song et al. “Limits of predictability in human mobility”. In: Science 327.5968 (2010), pp. 1018–1021.

[14] Jon Froehlich and John Krumm. Route prediction from trip observations. Tech. rep. SAE Technical Paper, 2008.

[15] Apollinaire Nadembega, Abdelhakim Hafid, and Tarik Taleb. “A destination and mobility path prediction scheme for mobile networks”. In: IEEE Transactions on Vehicular Technology 64.6 (2015), pp. 2577–2590.

[16] Alessandro E Redondi et al. “Understanding the WiFi usage of university students”. In: 7th IEEE International Workshop on TRaffic Analysis and Characterization (TRAC). 2016, pp. 44–49.

[17] Shamik Sengupta et al. “A game theoretic framework for distributed self-coexistence among IEEE 802.22 networks”. In: Global Telecommunications Conference, 2008. IEEE GLOBECOM 2008. IEEE. 2008, pp. 1–6.

[18] David Kotz and Kobby Essien. “Analysis of a campus-wide wireless network”. In: Wireless Networks 11.1-2 (2005), pp. 115–133.

[19] Enrica Zola and Francisco Barcelo-Arroyo. “A comparative analysis of the user behavior in academic WiFi networks”. In: Proceedings of the 6th ACM workshop on Performance monitoring and measurement of heterogeneous wireless and wired networks. ACM. 2011, pp. 59–66.

[20] Francesco Calabrese, Jonathan Reades, and Carlo Ratti. “Eigenplaces: segmenting space through digital signatures”. In: IEEE Pervasive Computing 9.1 (2010), pp. 78–84.

[21] Magdalena Balazinska and Paul Castro. “Characterizing mobility and network usage in a corporate wireless local-area network”. In: Proceedings of the 1st international conference on Mobile systems, applications and services. ACM. 2003, pp. 303–316.

[22] Mikhail Afanasyev et al. “Usage patterns in an urban WiFi network”. In: IEEE/ACM Transactions on Networking (TON) 18.5 (2010), pp. 1359–1372.

[23] Vanessa Gardellin, Sajal K Das, and Luciano Lenzini. “A fully distributed game theoretic approach to guarantee self-coexistence among WRANs”. In: INFOCOM IEEE Conference on Computer Communications Workshops, 2010. IEEE. 2010, pp. 1–6.

[24] Dong Huang et al. “A game theory approach for self-coexistence analysis among IEEE 802.22 networks”. In: Information, Communications and Signal Processing, 2009. ICICS 2009. 7th International Conference on. IEEE. 2009, pp. 1–5.

[25] Abdallah Abdallah et al. “Detecting the impact of human mega-events on spectrum usage”. In: Consumer Communications & Networking Conference (CCNC), 2016 13th IEEE Annual. IEEE. 2016, pp. 523–529.

[26] Yunjuan Zang et al. “Wavelet transform processing for cellular traffic prediction in machine learning networks”. In: Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on. IEEE. 2015, pp. 458–462.

[27] Spectrum Usage Timelapse Video. https://goo.gl/TpAEkG.

[28] Minkyong Kim, David Kotz, and Songkuk Kim. “Extracting a mobility model from real user traces”. In: INFOCOM 2006. 25th IEEE International Conference on Computer Communications. Proceedings. IEEE. 2006, pp. 1–13.

[29] Libo Song et al. “Evaluating location predictors with extensive Wi-Fi mobility data”. In: INFOCOM 2004. Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies. Vol. 2. IEEE. 2004, pp. 1414–1424.

[30] Unsupervised Learning. https://www.mathworks.com/discovery/unsupervised-learning.html.

[31] Jeffrey D Case et al. Simple Network Management Protocol (SNMP). Tech. rep. 1990.

[32] Network Basics: What Is SNMP and How Does It Work? https://www.auvik.com/media/blog/network-basics-what-is-snmp/.

[33] John D Hunter. “Matplotlib: A 2D graphics environment”. In: Computing in Science & Engineering 9.3 (2007), pp. 90–95.

[34] Fabian Pedregosa et al. “Scikit-learn: Machine learning in Python”. In: Journal of Machine Learning Research 12.Oct (2011), pp. 2825–2830.

[35] Diane Tang and Mary Baker. “Analysis of a metropolitan-area wireless network”. In: Wireless Networks 8.2/3 (2002), pp. 107–120.

[36] http://weka.sourceforge.net/doc.dev/weka/clusterers/EM.html.

[37] EM Algorithm: How it works. https://www.youtube.com/watch?v=REypj2sy_5U&t=1s. 2014.

[38] Arthur P Dempster, Nan M Laird, and Donald B Rubin. “Maximum likelihood from incomplete data via the EM algorithm”. In: Journal of the Royal Statistical Society. Series B (Methodological) (1977), pp. 1–38.

[39] Weka 3: Data Mining Software in Java. https://www.cs.waikato.ac.nz/ml/weka/.

[40] Miin-Shen Yang, Chien-Yo Lai, and Chih-Ying Lin. “A robust EM clustering algorithm for Gaussian mixture models”. In: Pattern Recognition 45.11 (2012), pp. 3950–3961.

[41] Clustering and the EM algorithm. https://www2.cs.duke.edu/courses/fall07/cps271/EM.pdf.

[42] Shijun Wang and Ronald M Summers. “Machine learning and radiology”. In: Medical Image Analysis 16.5 (2012), pp. 933–951.

[43] Douglas C Montgomery, Elizabeth A Peck, and G Geoffrey Vining. Introduction to Linear Regression Analysis. Vol. 821. John Wiley & Sons, 2012.

[44] Frank E Harrell. “Ordinal logistic regression”. In: Regression Modeling Strategies. Springer, 2015, pp. 311–325.

[45] Dong Nguyen, Noah A Smith, and Carolyn P Rosé. “Author age prediction from text using linear regression”. In: Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities. Association for Computational Linguistics. 2011, pp. 115–123.

[46] Olivier Chapelle and Vladimir Vapnik. “Model selection for support vector machines”. In: Advances in Neural Information Processing Systems. 2000, pp. 230–236.

[47] Alex J Smola and Bernhard Schölkopf. “A tutorial on support vector regression”. In: Statistics and Computing 14.3 (2004), pp. 199–222.

[48] Saneej B Chitralekha and Sirish L Shah. “Application of support vector regression for developing soft sensors for nonlinear processes”. In: The Canadian Journal of Chemical Engineering 88.5 (2010), pp. 696–709.

[49] Hui Zou and Trevor Hastie. “Regularization and variable selection via the elastic net”. In: Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67.2 (2005), pp. 301–320.