
University of Massachusetts Amherst
ScholarWorks@UMass Amherst
Masters Theses
Dissertations and Theses
July 2019

An Empirical Analysis of Network Traffic: Device Profiling and Classification
Mythili Vishalini Anbazhagan
University of Massachusetts Amherst

Follow this and additional works at: https://scholarworks.umass.edu/masters_theses_2

Recommended Citation
Anbazhagan, Mythili Vishalini, "An Empirical Analysis of Network Traffic: Device Profiling and Classification" (2019). Masters Theses. 756.
https://doi.org/10.7275/6sh3-za20
https://scholarworks.umass.edu/masters_theses_2/756

This Open Access Thesis is brought to you for free and open access by the Dissertations and Theses at ScholarWorks@UMass Amherst. It has been accepted for inclusion in Masters Theses by an authorized administrator of ScholarWorks@UMass Amherst. For more information, please contact [email protected].

AN EMPIRICAL ANALYSIS OF NETWORK TRAFFIC: DEVICE PROFILING AND CLASSIFICATION

A Thesis Presented by
MYTHILI VISHALINI ANBAZHAGAN

Submitted to the Graduate School of the University of Massachusetts Amherst in partial fulfillment of the requirements for the degree of
MASTER OF SCIENCE
May 2019
Department of Electrical and Computer Engineering

Approved as to style and content by:
David Irwin, Chair
Daniel Holcomb, Member
Michael Zink, Member
Robert Jackson, Department Head, Department of Electrical and Computer Engineering

ABSTRACT

MAY 2019
MYTHILI VISHALINI ANBAZHAGAN
B.E., KUMARAGURU COLLEGE OF TECHNOLOGY
MSECE, UNIVERSITY OF MASSACHUSETTS AMHERST
Directed by: Professor David Irwin

Time and again we have seen the Internet grow and evolve at an unprecedented scale.
The number of online users in 1995 was 40 million, but by 2020 the number of online devices is predicted to reach 50 billion, roughly 7 times the human population on Earth. Until now, the revolution was in the digital world. Now it is happening in the physical world we live in: IoT devices are deployed in all sorts of environments, such as homes, hospitals, industrial spaces, and nuclear plants. Because many of these environments are mission-critical or even life-critical, the security and reliability of IoT devices are of paramount importance; compromising them can lead to grave consequences.

IoT devices are, by nature, different from conventional Internet-connected devices such as laptops and smartphones. They have small memory, limited storage, and low processing power, and they operate with little to no human intervention. Hence it is important to understand IoT devices better. How do they behave in a network? How different are they from traditional Internet-connected devices? Can they be identified from their network traffic? Is it possible for anyone to identify them just by looking at the network data that leaks outside the network, without even joining the network? Answering these questions is the aim of this thesis.

To the best of our knowledge, no study has collected data from outside the network, without joining the network, with the intention of finding out whether IoT devices can be identified from this data. We also identify parameters that classify devices as IoT or non-IoT. We then group similar devices manually, and afterwards automatically using clustering algorithms. This helps group devices of a similar nature and create a profile for each kind of device.

TABLE OF CONTENTS

ABSTRACT
LIST OF TABLES
LIST OF FIGURES
CHAPTER

1. INTRODUCTION
2. MOTIVATION
    2.1 Privacy Risks in IoT
    2.2 Security Risks in IoT
3. RELATED WORK
4. DESIGN AND IMPLEMENTATION
    4.1 Capturing the Network Data
    4.2 Points of Traffic Observation
    4.3 Modes of Operation of the NIC
    4.4 Places and Number of Traffic Captures
5. ANALYSIS OF DATA CAPTURED FROM OUTSIDE THE NETWORK
    5.1 Data Encapsulation
    5.2 Identifying the various Networks present in the vicinity and their degree of proximity
    5.3 Identifying the Devices belonging to a Network
    5.4 Identifying the Type of Device and Vendor using OUI
    5.5 Degree of Activeness of a Device
    5.6 How often do Devices change states?
    5.7 Traffic associated with a device
        5.7.1 Content Type of Packets associated with a Device
        5.7.2 How much Traffic do Devices Send and Receive?
        5.7.3 Do Devices consistently send the same amount of traffic everyday?
        5.7.4 Do Devices only send data or receive data or both?
    5.8 Conclusion
6. ANALYSIS OF DATA CAPTURED FROM INSIDE THE NETWORK
    6.1 Data Encapsulation
    6.2 Network Protocols used by a Device
        6.2.1 Wired Medium
        6.2.2 Wireless Medium
    6.3 Network Ports used for communication
        6.3.1 Wired Medium
        6.3.2 Wireless Medium
    6.4 How many external servers do devices contact each day?
        6.4.1 Wired Medium
        6.4.2 Wireless Medium
    6.5 Destination Devices Contacted within the network
        6.5.1 Wired Medium
        6.5.2 Wireless Medium
    6.6 Traffic Associated with a device
        6.6.1 How much Traffic do Devices Send and Receive?
            6.6.1.1 Wired Medium
            6.6.1.2 Wireless Medium
        6.6.2 Do Devices consistently send the same amount of traffic everyday?
            6.6.2.1 Wired Interface
            6.6.2.2 Wireless Interface
    6.7 Do Devices only send data or receive data or both?
        6.7.1 Wired Interface
        6.7.2 Wireless Interface
    6.8 Conclusion
7. CLUSTERING
    7.1 Defining Network Signature
        7.1.1 Quantifying Signature Strength
        7.1.2 Identifying Parameters Unique to a device
    7.2 Input Data to the Clustering Algorithm
        7.2.1 External Wireless
        7.2.2 Internal Wired and Wireless
    7.3 Cluster Progression - External vs. Internal
    7.4 Internal Wired
    7.5 Internal Wireless
    7.6 External Wireless
    7.7 External vs. Internal clustering
    7.8 Conclusion
8. FUTURE WORK
BIBLIOGRAPHY

LIST OF TABLES

Table
5.1 A subset of fields present in the IEEE 802.11 Header
5.2 Location A - List of Networks
5.3 Location B - List of Networks
5.4 IEEE 802.11 - Values in Address Fields based on the DS Value
5.5 Example List of OUIs and Assignees
5.6 Location A - Subset of List of devices after OUI resolution
5.7 Location A - Network 'HostNetwork' - Comparison of One Day's Duration of Sleep and Active States of an Apple TV, Nest Thermostat and Access Point
5.8 Location A - Network 'HostNetworkGuest' - Comparison of One Day's Duration of Sleep and Active States of devices Apple 4a:aa:13, Apple 3e:39:83 and Access Point
5.9 Location A - Network 'HostNetwork' - Inference - Mean of Number of Transitions per hour and their fluctuation over 120+ days
5.10 Location A - Network 'HostNetworkGuest' - Inference - Mean of Number of Transitions per hour and their fluctuation over 120+ days
5.11 Location A - Network 'HostNetwork' - Average Percentage of Types of Packets transmitted and received in a day
5.12 Location A - Network 'HostNetworkGuest' - Average Percentage of Types of Packets transmitted and received in a day
5.13 Location A - Other Networks - Average Percentage of Types of Packets transmitted and received in a day
5.14 Traffic Volume Limits for Traffic Categorization
5.15 Location A - Network 'HostNetwork' - Mean of Average Traffic generated in a minute over all days
5.16 Location A - Network 'HostNetworkGuest' - Mean of Average Traffic generated in a minute over all days
5.17 Location A - Outside Networks - Mean of Average Traffic generated in a minute over all days
5.18 Standard Deviation Limits for Traffic Fluctuation Category
5.19 Location A - Network 'HostNetwork' - Fluctuations in Traffic over all days
5.20 Location A - Network 'HostNetworkGuest' - Fluctuations in Traffic over all days
...