Degree project

Converting Network Media Data into Human Readable Form: A study on deep packet inspection with real-time visualization.

Author: Steffen-Marc Förderer
Date: 2011-05-25
Subject: Computer Science
Level: Bachelor
Course code: 2DV00E

by © Steffen-Marc Förderer

A thesis submitted to the School of Graduate Studies in partial fulfilment of the requirements for the degree of Bachelor of Computer Science

Matematiska och systemtekniska institutionen, Linnéuniversitetet

May 2011

Växjö, Sverige

Abstract

A proof of concept study into the workings of network media capture and visualization through the use of packet capture in real time. An application was developed that is able to capture TCP network packets and to identify and display images in raw HTTP network traffic through the use of search, sort, error detection and timeout failsafe algorithms in real time. The application was designed for network administrators to visualize raw network media content together with its relevant network source and address identifiers. Different approaches were tried and tested, such as using Perl with GTK+ and Visual Studio C#/.NET. Furthermore, two different types of image identification methods were used: raw magic string identification in pure TCP network traffic and HTTP MIME type identification, the latter being more accurate and faster. C# proved vastly superior in both speed of prototyping and final performance evaluation. The study presents a novel way of monitoring networks on the basis of their media content through deep packet inspection.

Keywords

TCP/IP Monitoring, Network Sniffing, Network Monitoring, Image Reconstruction, Packet Data Reassembly, Real-time Visualization

Acronyms and Abbreviations

ACK - Acknowledgement Packet
API - Application Programming Interface
ARP - Address Resolution Protocol
ASCII - American Standard Code for Information Interchange
BMP - Bitmap image file / Image Type
CPU - Central Processing Unit
DNS - Domain Name System
GIF - Graphics Interchange Format / Image Type
GTK - GIMP Toolkit
GZIP - GNU Zip / file compression
HEX - Hexadecimal / Base 16
HTTP - Hypertext Transfer Protocol
ICMP - Internet Control Message Protocol
IP - Internet Protocol
JPEG - Joint Photographic Experts Group / Image Type
LAN - Local Area Network
MAC - Media Access Control
MIME - Multipurpose Internet Mail Extensions / Content Types
OSI Model - Open Systems Interconnection Model
PCAP - Packet Capture API
PNG - Portable Network Graphics / Image Type
POE - Perl Object Environment
RAM - Random Access Memory
SSH - Secure Shell
SYN - Synchronize Packet
TCP - Transmission Control Protocol
TTL - Time To Live
URL - Uniform Resource Locator
XML - Extensible Markup Language

Acknowledgements

Many thanks to Welf Löwe and Mathias Hedenborg for their patience and help. Most of all I want to thank my family, who supported me throughout my degree.

“Too much technology, in too little time. And little by little ... we went insane.” — Anonymous

Contents

1 Introduction
  1.1 The Problem
  1.2 Background
  1.3 Research Objective
  1.4 Preliminary Requirements
  1.5 Thesis Overview

2 Background and Related Work
  2.1 Theoretical Framework
  2.2 Background
    2.2.1 Anatomy of an HTTP Request
  2.3 Related Work

3 Design Requirements
  3.1 Performance Requirements
  3.2 Functional Requirements
  3.3 Usability Requirements
  3.4 Textual Use Case Scenarios
    3.4.1 Scenario 1
    3.4.2 Scenario 2
    3.4.3 Scenario 3

4 Implementation
  4.1 Live Capture
  4.2 Non-Live Packet Capture and Processing
  4.3 Mechanics of Pcap
  4.4 Sorting Packets into Connections
  4.5 Image Verification
  4.6 Preventing Connection Buffer Overflow
  4.7 Finding the Image
    4.7.1 Using Magic Numbers
    4.7.2 Using HTTP Headers (MIME Types)

5 Getting the Network Data
  5.1 Switched/Non-Switched Networks
  5.2 The Real-Time Problem

6 The Graphical User Interface
  6.1 Perl, Eclipse and GTK+
  6.2 C#, Visual Studio and .NET
    6.2.1 Implemented Functionality
  6.3 Final Application Layout

7 Evaluation
  7.1 Evaluation and Testing
    7.1.1 Usability Evaluation
    7.1.2 Functional Evaluation
    7.1.3 Performance Evaluation

8 Conclusions and Future Work
  8.1 Conclusion
  8.2 Future Work

Bibliography

List of Figures

2.1 HTTP Get Request Header
2.2 HTTP Response with Header and Start of Image
2.3 York 1.55 for Windows
2.4 Driftnet 0.1.6 Linux/Solaris
2.5 EtherPEG 1.3a1 MacOS X

4.1 Live Packet Capture Procedure
4.2 Initialize and Start Pcap
4.3 Connection Arrays
4.4 JPEG Beginning-of-File Identifier

5.1 Simple Ethernet Hub
5.2 Simple Ethernet Switch
5.3 Ethernet Switch with MITM Attack

6.1 Perl Prototype Application Overview
6.2 C# Application Overview
6.3 Image Scroll Buffers
6.4 Original Image Window
6.5 C# Application with Non-Scrolling Image Display
6.6 C# Variable Image and Processing Speed Information
6.7 UML Use Case Diagram

7.1 Application Used to Simulate Browsing Behaviour
7.2 Image Loss (Live vs Non-Live)
7.3 Image Loss (By Image Type)
7.4 Raw Test Data
7.5 Raw Test Data Cont.

1 Introduction

This chapter discusses the basic problems and issues that network administrators are faced with in terms of network content detection and classification. It introduces the objectives that this thesis aims to fulfil along with a set of requirements.

1.1 The Problem

Network administrators today are faced with the complex challenge of monitoring, maintaining and developing better ways to analyse their networks' traffic. Often the tools used for this task consist of packet capturing programs that enable the administrator to diagnose numerous problems. These modern tools all support a variety of options to inspect and analyse traffic down to the basics of packet type and structure. The way these packet capturing programs display their information to the administrator has, however, remained the same since the beginnings of the ARPA network: simple text (ASCII - American Standard Code for Information Interchange) or hexadecimal output. Although powerful, these programs lack a simple feature when it comes to network content inspection. The fact that there are times when a network administrator is concerned not with the type of a data packet but rather with its actual payload has been largely ignored by the development community. This is where all network monitoring programs fail: they are unable to analyse the packet payload to the extent that an administrator does not have to look at an ASCII conversion of the binary content in the packet. Take this scenario for example: a network administrator has been notified that illicit content is being transferred over the network and is asked to find out where the data is coming from and where it is being sent. This is very often the case for Internet Service Providers. The theoretically ideal way to go about this would be to employ firewalls with deep packet inspection that can recognize illegal/copyrighted data and prevent it from being forwarded through the network. Although great advances have been made in image recognition, no hardware or software has been created so far that can recognize whether an image or video fits a certain content criterion. One still relies on a human to analyse and make choices in this scenario.
The problem, however, remains unsolved as long as no packet capturing program can convert a packet's payload into human readable/recognizable form when it comes to web media. The problem, therefore, is that no packet capturing tools exist that are able to capture packet payloads, convert them into their original form and display this data to the user. This thesis aims to solve this issue by reassembling network data into its original form, which in our case is image data, and presenting it to the user in an interactive manner.

1.2 Background

The author of "Ethereal Packet Sniffing", Angela Orebaugh, explains the importance of network analysis: "... is the key to maintaining an optimized network and detecting security issues. Proactive management can help find issues before they turn into serious problems and cause network downtime or compromise confidential data. In addition to identifying attacks and suspicious activity, you can use your network analyser data to identify security vulnerabilities and weaknesses and enforce your company's security policy. Sniffer logs can be correlated with IDS, firewall, and router logs to provide evidence for forensics and incident handling. A network analyser allows you to capture data from the network, packet by packet, decode the information, and view it in an easy to understand format. Network analysers are easy to find, often free, and easy to use; they are a key part of any administrator's toolbox." (Orebaugh, 2004). Orebaugh shows the versatility of network analysers as tools to manage, secure, and monitor a network. Her statement "decode the information, and view it in an easy to understand format" is, however, only partly true. As discussed above, an easy to understand format is not a representation of an image packet's contents in ASCII or hexadecimal but one that actually lets the user view the original data as it was intended. In the case of image data this of course means displaying the actual image. It seems this critical aspect has been largely ignored in the networking industry. TCP network traffic is notoriously complex to monitor based on the type of data it contains. As there is a vast number of applications that utilize TCP connectivity to communicate, the data they send is just as diverse. When speaking about internet traffic we will focus on media web traffic, which mainly consists of HTTP traffic. The focus in this document lies on HTTP traffic and how it transfers image media through a network.
The idea is to be able to monitor this traffic and reassemble the individual packets so as to re-create the original data to be viewed by the person doing the monitoring. It is a proof of concept application that shows how network data can be processed by an individual who is neither the intended recipient of the data nor the initiator of the actual request for the HTTP data. It is important to clarify that HTTP is an application layer protocol, meaning it resides at the top of the OSI model (Open Systems Interconnection Model). The difficulty in monitoring HTTP traffic lies in that one needs to assemble and identify the data as the intended recipient's HTTP browser would do, without knowing exactly what type of data was requested.

1.3 Research Objective

This research is intended to show a method of implementing packet capturing software that can reassemble raw network data into its original form without the use of the original application layer software stack. It will give a clear insight into how best to handle raw network data and convert it to its original format. It will show the problems encountered and how to circumvent them. The research will also give a good understanding of the scope of work needed to implement such a program.

Further, it will reveal the efficacy of programming such an application in two different languages. Finally it will enable end users to better monitor their networks without interfering with end user systems. The following sections serve as an overview and background to the overall process and are treated in more detail in Chapter 2.

• Design: The objective is to devise a method by which HTTP image data and HTTP GET requests can be filtered out of common network data and reassembled in such a way that the original image or request can be displayed to the user. The complexity lies in filtering only packets with image or request data out of a stream of various TCP connections in a fast and effective manner. Once filtered out, the packets must be sorted and reassembled without including retransmissions or duplicates. The resulting image should be displayed to the user with information about the source and destination of all captured images. Preferably a graphical user interface would be used that lets the user monitor an HTTP stream in an automated fashion. Lastly, the application should keep a record of all images captured and all HTTP GET requests.

• Implementation: Being the practical part of the investigation, the design is implemented. Two applications will be written, one using Perl with the Net::Pcap module and the other using Microsoft's Visual C# with the WinPcap library. This will enable an evaluation of which language is more suitable for such a task. The applications must have a graphical user interface that displays the images as they are captured from the network stream. They will also keep track of image origin and destination details such as IP, port and MAC addresses.

• Research significance: A major benefit of the research to the information system community is that it can be used as a guideline for writing packet capturing programs that convert raw data into its original form, regardless of data type.
One problem with computer system engineering projects such as the one this research focuses on is that no real working examples exist that are open source and able to handle large amounts of network data without using up significant system resources such as RAM (Random Access Memory) or CPU (Central Processing Unit) time.
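The sort-and-reassemble step named in the design objective above can be sketched in a few lines. The thesis implementations used Perl (Net::Pcap) and C# (WinPcap); the following is an illustrative Python sketch only, and the function name `reassemble` and the `(sequence, payload)` pair format are invented for this example rather than taken from the actual application.

```python
def reassemble(segments):
    """Reassemble a TCP payload from (sequence_number, payload) pairs.

    Retransmissions carry a sequence number already seen and are
    dropped; the remaining segments are concatenated in sequence order.
    """
    unique = {}
    for seq, payload in segments:
        unique.setdefault(seq, payload)  # keep first copy, ignore retransmits
    return b"".join(unique[seq] for seq in sorted(unique))

# Out-of-order arrival with one retransmission:
parts = [(2000, b"world"), (1000, b"hello "), (2000, b"world")]
assert reassemble(parts) == b"hello world"
```

Sorting by sequence number restores the original byte order even when packets arrive out of order, and keeping only the first copy of each sequence number discards duplicates.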

1.4 Preliminary Requirements

The most important goal of this research is to create an application that meets a set of minimum requirements in terms of the following types of requirement questions:

• Usability: How easy to use is the application? Are menus and options clearly displayed to the user? Can different use case scenarios be performed in a fast and efficient manner by a user unfamiliar with the graphical user interface? Does the application's interaction with a user meet the expectations set by similar network sniffing applications?

• Functionality: What types of network traffic, images and HTTP GET requests can the application monitor? Is promiscuous sniffing mode supported? What types of network interfaces can be monitored? How many false positives of images or HTTP requests are found in a typical use case scenario? Is information stored in a manner that enables later evaluation?

• Performance: How stable and responsive is the application during different network load scenarios? Does the application stay reliable even during long usage? How efficiently are the CPU and RAM used? Are there any memory leaks or endless loop situations? How much hard drive space is used? Is the application able to handle high network loads without missing packets (real-time scenario)? In other words, can images be found and processed in a timely manner so as to prevent the need for queuing to RAM for later processing? For more details refer to Chapter 2.

1.5 Thesis Overview

The following chapters lay out the scope and present different approaches towards network data acquisition and filtering. First we present fundamental background knowledge together with examples of existing software in the domain of research. With this information in mind we then present the software requirement set in detail and formally describe real world use case scenarios. The actual inner workings of the thesis software are revealed in Chapter 4, "Implementation". To elaborate on the practice of data acquisition in different network topology environments, Chapter 5 explains how "getting the data" can be done in switched and non-switched Ethernet networks using man-in-the-middle attacks. The issue of real-time image capture and presentation is also discussed there. Chapter 6 compares the programming complexities in C# and Perl. With the graphical user interface playing a large role in the research, special focus is given to GTK+ and Windows Forms. The final implemented functionality of the application is listed in this chapter along with a use-case UML diagram. Finally, Chapters 7 and 8 present the results from numerous tests and compare them with the set of requirements. Chapter 8 concludes the findings and ties together the results of the research, leaving room for future work ideas and considerations.

2 Background and Related Work

This chapter presents the fundamental knowledge required to understand how HTTP traffic and compression relate to network media acquisition. It lists three network media capture programs that aim to address the problem presented in the introduction and shows how they succeed or fail in execution. It starts out by presenting the scope of the thesis.

2.1 Theoretical Framework

The research method follows along the lines of "Software System Development" (Britton and Doake, 2003) and the Rational Unified Process (Gibbs, 2006). The testing was designed to measure the three requirement objectives (usability, functionality and performance). In the case of usability this was achieved by creating preset tasks that required a user to make certain adjustments and obtain requested data from the application. These ranged from opening the application and selecting working/log directories to setting up specialized image searches for given scenarios. Each task was timed, with the tester noting down the user's actions along with the user's opinion on how he or she would optimize the graphical layout or image scan process. As no comparable alternatives to our application currently exist, it was not possible to compare the obtained values to those of existing applications. The functional requirements were a set of predefined features and actions that the application must be able to fulfil; these were all manually tested using specialized testing software to generate high-load web traffic. Features such as the minimum image types supported (JPEG, GIF and PNG), the accuracy of HTTP GET requests found, and log data acquisition and storage, to name a few, were the basis of functionality testing. Performance requirements were measured using iterative network traffic loads and duration times. The application had to be able to capture at least 90% of image traffic in live mode and 95% in non-live mode, tested on a system with limited hardware resources such as CPU, RAM and hard drive space.

2.2 Background

HTTP: In order to understand the mechanics on which we build our application to extract media data from TCP streams, we first have to understand how data is transferred between two hosts. "The application layer contains a variety of protocols that are commonly needed by users. One widely-used application protocol is HTTP (HyperText Transfer Protocol), which is the basis for the World Wide Web. When a browser wants a Web page, it sends the name of the page it wants to the server using HTTP. The server then sends the page back. Other application protocols are used for file transfer, electronic mail, and network news." (Tanenbaum,

2002). As we are primarily interested in HTTP network traffic, the TCP protocol serves as our starting point.

2.2.1 Anatomy of an HTTP Request

We start out with the client sending a SYN (Synchronize) packet with a random starting sequence number to the web server, wanting to establish a TCP connection. The server responds with a SYN packet of its own and uses the client's starting sequence number plus one in its reply. Once the client has received the SYN it sends an ACK (Acknowledgement) packet, again adding one to the sequence number. The TCP handshake is now complete and the client can begin requesting data from the web server. The sequence number plays an important role as it enables a client to read the packets in the correct sequence. The nature of the internet allows packets to take different routes, and thus some packets may arrive earlier than others even if they were sent out later. Sequencing is therefore a critical component that we have to consider when writing an application for the purpose of reassembling data into human readable form. As a second goal we are interested not only in image data but in what websites or services users are requesting. For example, as a user types www.yahoo.com into his or her browser, a request is sent out to the server of yahoo.com to fetch the main webpage. For the sake of our application we ignore the interaction of DNS (Domain Name System) lookups. The client sends an HTTP "GET" request with the path to the site; furthermore, the client informs the server with an HTTP header what kind of data it can accept and display. As each GET request is always accompanied by a Host header detailing the requested URL and server, we can potentially extract this information. An example HTTP GET request can be seen in figure 2.1. The figure shows the request packet that is sent out when a user types "http://www.yahoo.com/finance". After the string "GET" comes the path to the entity we request, "/finance", and the string "Host:" is followed by "www.yahoo.com".
Therefore all we have to do is search for packets that contain a "GET" and a "Host:" in the header to identify websites a user on the network is trying to visit.
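As a sketch of this search, the following fragment pulls the requested path and host out of a raw packet payload. It is illustrative Python only — the thesis applications were written in Perl and C#, and the function name `extract_request` is invented for this example.

```python
import re


def extract_request(payload: bytes):
    """Return (path, host) if the payload looks like an HTTP GET request."""
    m = re.match(rb"GET (\S+) HTTP/1\.[01]\r\n", payload)
    if not m:
        return None  # not a GET request packet
    h = re.search(rb"\r\nHost: ([^\r\n]+)", payload)
    return (m.group(1).decode(), h.group(1).decode() if h else None)


# The yahoo.com/finance request from figure 2.1:
pkt = b"GET /finance HTTP/1.1\r\nHost: www.yahoo.com\r\n\r\n"
assert extract_request(pkt) == ("/finance", "www.yahoo.com")
```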

Image Web Traffic: The way images are transferred from web server to client dictates the approach we must take to capture them. When a client sends a request for an image to a server, the server responds with a data packet that contains an HTTP response header along with the beginning of the requested image. It is best to visualize the response packet as seen in figure 2.2. Because each image served always comes with a header that details information about the image (its MIME type), we can use this data to reassemble the image. Take special note of section (A): here the header defines the content type as an image and further specifies the type of image, in this case JPEG. Furthermore, the header (section B) tells us how large the image is (540255 bytes). Given this information we know that in order to find images in a network stream we must search for these HTTP headers. If we

Figure 2.1: HTTP Get Request Header

Figure 2.2: HTTP Response with Header and Start of Image

find a packet that contains an HTTP header with Content-Type "image", we know that an image is being transferred. By looking at its size we also know how many packets we have to capture to rebuild the image. The exact procedure is explained later in this document.
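A minimal sketch of this header inspection, again in illustrative Python rather than the thesis's Perl/C# (the helper name `image_info` is invented for the example):

```python
def image_info(response: bytes):
    """Inspect an HTTP response; if it carries an image, return
    (image_subtype, total_size_in_bytes, first_chunk_of_image_data)."""
    header, sep, body = response.partition(b"\r\n\r\n")
    if not sep:
        return None  # no complete header in this packet
    ctype = clen = None
    for line in header.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.lower() == b"content-type":
            ctype = value.strip()            # section (A) in figure 2.2
        elif name.lower() == b"content-length":
            clen = int(value)                # section (B) in figure 2.2
    if ctype is None or not ctype.startswith(b"image/"):
        return None
    return (ctype.split(b"/")[1].decode(), clen, body)


# Response resembling figure 2.2 (JPEG magic bytes start the body):
resp = (b"HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n"
        b"Content-Length: 540255\r\n\r\n\xff\xd8\xff\xe0")
assert image_info(resp)[:2] == ("jpeg", 540255)
```

The Content-Length value tells the capture loop how many further payload bytes to collect from the same connection before the image is complete.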

Gzip: A pitfall encountered early on in this research was the issue of client or server transfer encoding. We have to be aware of how the data that is sent from the server to the client is encoded, or vice versa. The idea of encoding HTTP data is to decrease the

data size by compressing it, or to segment the data by chunking to enable continuously, dynamically generated pages to load. If we were seeking to identify packet payloads that are compressed, then we could not simply look for byte sequence identifiers that would indicate the start of a packet or sequence of packets containing information we are interested in. It would make the process vastly more complex, as each packet would have to be decompressed to view its raw data. For the sake of completeness, and for those wanting to build on this research by extracting data that is not of an image or HTTP GET request nature, we briefly highlight the issue here. Compression using the LZ77 and LZ78 algorithms is commonly referred to as Gzip compression. It is one of the most common compression algorithms utilized by servers. If a client informs the server that it accepts content encoding of type "gzip", the server (if compression is enabled) will send certain data back to the client in compressed form. This, in the case of HTML or XML (Extensible Markup Language) files, can save a massive amount of bandwidth: tests show that up to 73% less data needs to be transferred (Pierzchala, 2004). While very useful, it creates added problems when wanting to reassemble captured packets whose payloads have undergone gzip compression. Unless one inspects each HTTP header from the web server, checks whether the payload is compressed and then unpacks it, one cannot simply search through packet payload data to identify a packet's contents. This is not an issue for images that are already compressed such as JPEG, GIF and PNG. It is however an issue when wanting to review HTML or text data returned from a server, as these benefit the most in terms of bandwidth reduction and compression is therefore applied to them much more often.
It should be noted that the BMP image format does benefit from compression; however, most web servers either do not host BMP image files (large size compared to other formats) or compression is not enabled for them. As the main focus of this thesis is on image identification and re-assembly from HTTP streams, compression is not an issue here. It is however important to be aware of when re-assembling non-image HTTP data.
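For readers extending this work to compressed non-image payloads, the decompression step itself is simple. A hedged Python sketch (the `decode_body` helper and its dict-of-headers argument are invented for this example) using the standard library's gzip module:

```python
import gzip


def decode_body(headers: dict, body: bytes) -> bytes:
    """Undo gzip content encoding before searching the payload bytes."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        return gzip.decompress(body)
    return body


# A gzip-encoded HTML body must be unpacked before its contents can
# be searched; an unencoded body passes through unchanged.
html = b"<html>...</html>"
assert decode_body({"Content-Encoding": "gzip"}, gzip.compress(html)) == html
assert decode_body({}, html) == html
```

The cost is that every compressed response must be fully reassembled and decompressed before any byte-sequence search can run, which is exactly the overhead the thesis avoids by restricting itself to already-compressed image formats.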

2.3 Related Work

The application we intend to create surpasses the functionality, performance and usability of similar software. We shall briefly highlight three different applications and compare them to our requirements.

• York 1.55 (York, 2011): The application comes with many features, such as the ability to select a client and follow his clicks in your browser. Furthermore, it has the ability to sniff for HTTP, FTP, POP3, SMTP, SMB and VNC passwords along with capturing multiple media types. In terms of performance the image loss was significantly greater than that of our completed application; see figure 7.5. All tests were done with the same Pcap (Packet Capture API) filter and promiscuous-mode settings as for our application. Usability was lacking, as no persistent history of captured images and their network identifiers is displayed

to the user during a live capture. Furthermore, the "slide show" view of captured images seems to scroll at the same speed regardless of the image queue size. Images cannot be clicked as they scroll across the screen to quickly identify an image's network details. Overall it gives the impression of an application that has many options but does not focus in detail on any of them, making it an alternative application for home users. A screenshot example of the application can be seen in figure 2.3.

Figure 2.3: York 1.55 for Windows

• Driftnet 0.1.6 (Lightfoot, 2011): This application is written in C and was compiled using GCC 4.3.6. The program is very simple, as it only displays captured images on screen by scrolling thumbnails of the images down the screen. It was identified that a serious memory leak was present, and the application would crash after 10 minutes of use. Therefore the performance in terms of image loss could not be measured. Usability was nearly non-existent, since the application does not keep logs or show network details of found images to the user. The author himself writes: "Driftnet is in a rather early stage of development. Translation: you may not be able to make it compile, and, if you do, it probably won't run quite right." Lastly, the functionality is lacking, as it is only able to identify JPEG images. A screenshot example of the application can be seen in figure 2.4.

• EtherPEG 1.3a1 (Cheshire, 2011): An OS X application that only sniffs JPEG and GIF data. It works on the basis of magic numbers. As the author had no access to an OS X machine along with the non-free compiler "CodeWarrior",

Figure 2.4: Driftnet 0.1.6 Linux/Solaris

this application could not be evaluated in terms of performance or functionality. However, as explained later, the method of magic number identification in raw network traffic is very CPU intensive, and as such the conclusion can be made that its image loss during high bandwidth loads is expected to be very high. Screenshots of the application found on Google image search reveal an unorganized and cluttered view of the captured images, without any identification of an image's network origin or destination being possible. A screenshot example of the application can be seen in figure 2.5.

Figure 2.5: EtherPEG 1.3a1 MacOS X

3 Design Requirements

This chapter lists a set of predetermined requirements that the research program aims to fulfil. The requirements have been split up into sections pertaining to their scope of comparison: performance, functional and usability requirements. Lastly, several use case scenarios are given to represent the requirement goals in real world examples.

3.1 Performance Requirements

The application must be able to perform continuously for at least 24 hours on a 2 GHz Intel/AMD processor with 1 gigabyte of RAM and at least 10 gigabytes of free hard drive space, using a standard network connection of 10/100 Mbit. Due to the concern of the processor not being able to keep up with high traffic volume, this criterion applies to non-live capture. Non-live capture is where incoming packets are stored on the computer's hard drive and processed at a later time, as compared to live capture where captured packets are processed immediately and displayed to the user. The performance criteria for live packet capture are the same as for non-live capture, with a reduced network connection of 10 Mbit and a run time of 1 hour. The application may fail to identify a maximum of 10 percent of images in live capture and 5 percent in non-live capture. Identified images are required to be visually recognizable as their originals. This does not mean they are byte-for-byte identical or do not contain bytes that are considered network noise, i.e. packet data contained in an image that is not part of the image.

3.2 Functional Requirements

The application shall be able to monitor all network interfaces/devices that the Microsoft Windows WinPcap or GNU/Linux libpcap driver is able to access. This takes into consideration interfaces that do not allow promiscuous traffic captures, such as most consumer wireless interface hardware. It must be able to filter out images of type JPEG, GIF and PNG. The application should detect and display HTTP GET requests in captured packets. A history that details an image's source and destination IP and port along with MAC addresses must be made available to the user outside the application through the use of a text file that is saved in the folder where images are saved. Lastly, a log of all HTTP GET requests will also be written to a file for further reference. This is done to enable users of the application to keep a record of all images/requests found along with their associated network statistics, accessible without needing to run the application.

3.3 Usability Requirements

The application should have a clear and simple user interface that presents the captured images and HTTP GET requests in a simple and interactive manner. The goal is to present incoming packets and images to the user in real time, i.e. displaying the data in the graphical user interface as soon as the packet is captured. In order to initiate the packet capture the user is presented with a list of network interfaces detected by the Pcap driver with the option of selecting one. The option to narrow down the amount of captured packets will be enabled by letting the user define a capture filter according to libpcap syntax. Promiscuous or normal capture must be selectable by the user through a check-box or radio button before each capture. It is proposed to have two lists in text form where incoming packet details are displayed to the user; for packets identified by the application to contain image data, the GUI is updated to show the source and destination IP address, source and destination MAC address, sequence number and image type. For HTTP GET requests the GUI shall display the source and destination IP and the URL that was requested. Images that are found shall be displayed on screen in a scrolling manner, either left to right or top to bottom. As images can be very large (bigger than the application window), all images have to be converted to thumbnails. The thumbnail scroll speed needs to be dynamic depending on the user's viewing preferences and can be changed by a simple slider. Lastly, images displayed in thumbnail form should be clickable and open a new window with the full sized image together with connection details. The same applies to the clickable text list containing image packet details mentioned above. The HTTP GET request list must also be clickable, where each clicked request line opens the system's default browser at the URL contained within.

3.4 Textual Use Case Scenarios

In this section we present several use case scenarios of how the application could be used in real-world situations. It is intended to give an overview of the potential problem situations that can be mitigated through traffic sniffing with media reassembly capabilities.

3.4.1 Scenario 1.

The actor (user), in this scenario a network administrator, becomes aware that data is being transferred over his or her network in violation of the company's acceptable-use policy for employee internet access. The administrator configures a network router to copy all network traffic to his computer. He then points our application at the interface connected to the router, with an optional capture filter to narrow down the search (i.e. filtering out non-HTTP packets such as ICMP (Internet Control Message Protocol), ARP (Address Resolution Protocol) and SSH (Secure Shell) traffic). He then begins a non-live capture, letting the application run for a couple of hours. When done he stops the capture and lets the application process the captured packets for images and HTTP GET requests. He can then browse to the folder where the images have been reassembled and visually inspect all image data. Should he find an image relevant to his search (non-company-compliant media), he can look it up in a log file to identify the source and destination IP and MAC addresses, along with the exact time and date the image was transferred over his network. With this information he is able to identify the workstation from which the image was requested. Furthermore, he has a history of all web requests initiated by each computer connected to the router.

3.4.2 Scenario 2.

A network security specialist is tasked with assessing a company's network security risk by demonstrating the data a hacker could obtain if network vulnerabilities were exploited. He finds that an employee has set up an unsecured wireless access point. By launching the application and selecting the live-capture option, he can present the seriousness of this security risk in a simple and visual manner to the non-technical board of directors, using the real-time view of the images and website requests found.

3.4.3 Scenario 3.

A website designer is asked by a client, who runs a photography website, for a live view of user interaction with her web pages. The client has several websites containing a vast portfolio of images, from flowers to industrial equipment, that are for sale. To streamline the user experience she needs to know which images users are drawn to first and which ones are rarely clicked. She asks a network administrator for a live view of the web pages users are most interested in. The network administrator sets up our application to monitor GET requests and to show all requested images in live mode. Watching the requested images scroll by, she realizes that what she had thought of as popular images and pages are hardly viewed, and that most users come to her website to view only a certain portion of her portfolio. With this knowledge she can focus her photography on popular themes to maximise her profits.

4 Implementation

The following chapter explains the programming approach taken, along with the mechanics of the packet capture API (Application Programming Interface). It details the algorithms needed to reassemble network data into its original form and presents two different approaches to identifying image data in network packets. Image verification, buffer overflow prevention and network transfer errors are discussed, along with the approach taken to deal with these issues.

4.1 Live Capture

A key aspect of our application is to capture packets and display the images contained in these packets as soon as they become available (live capture). This is in contrast to non-live packet capture, where all captured packets are written to the hard disk and processed later to identify the images contained within. A live packet capture is very CPU intensive, because for each captured packet the program must run through a sequence of steps: extracting the packet's source and destination IP addresses and ports, sorting, searching for image identifiers, assembling packets into an image, saving the image, creating a thumbnail, and finally displaying it. All the while, new packets are arriving that need exactly the same processing. Non-live packet capture allows all the CPU power to be used simply to capture packets and write them to disk. Non-live capture is therefore expected to allow a more thorough search for images, as packets are less likely to be dropped due to CPU processing limitations. The process of live packet capture can be represented by the pseudo-code steps shown in figure 4.1.

1  Start packet capture, registering a callback procedure
2  On new packet: check if a Connection exists to which the packet belongs
3    IF TRUE:  add the packet to the Connection
4    IF FALSE: check if the packet contains an image header identifier
5      IF TRUE:  create a new Connection, adding the packet to it
6      IF FALSE: check if the packet contains an HTTP GET request
7        IF TRUE:  extract the requested URL
8        IF FALSE: discard the packet
9  IF the Connection has all packets needed to create the image:
10   Sort all packets by sequence number
11   IF the Connection contains duplicates or wrong-sequence packets:
12     Discard the Connection
13   ELSE:
14     Assemble the image
15     Save the image

Figure 4.1: Live Packet Capture Procedure
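The decision procedure of figure 4.1 can also be sketched in code. The following Python fragment is an illustration only (the thesis application is written in C#); the connection store, the (src IP, src port, dst IP, dst port) key, and the helper functions are assumptions made for this example.

```python
# Illustrative sketch of the per-packet decision steps of figure 4.1.
JPEG_MAGIC = bytes.fromhex("FFD8FF")   # JPEG start-of-image identifier

def is_image_header(payload):
    return JPEG_MAGIC in payload

def is_http_get(payload):
    return payload.startswith(b"GET ")

def handle_packet(connections, key, payload, seq):
    """key is assumed to be (src_ip, src_port, dst_ip, dst_port)."""
    if key in connections:                      # steps 2-3: known connection
        connections[key].append((seq, payload))
    elif is_image_header(payload):              # steps 4-5: new image stream
        connections[key] = [(seq, payload)]
    elif is_http_get(payload):                  # steps 6-7: extract the URL
        return payload.split(b" ")[1].decode()
    # step 8: any other packet is discarded
    return None
```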

4.2 Non-Live Packet Capture and Processing

As mentioned above, a non-live packet capture allows the capturing of packets to be done separately from their processing (the search for images). The user may want to use all available computer resources to capture packets and thus minimize the chance of packet loss caused by the CPU load of processing. Packets are written directly to disk and saved in the packet capture format used by numerous pcap-based applications. When the user is ready to process the captured packet file for images, the same method as in live capture is used, except that packets are read from disk instead of from a network adapter. An issue encountered early on with this method is that processing packets read from disk can easily consume all CPU resources. This is due to the loop that occurs when a packet has been processed and the next packet is automatically extracted from the capture file: fetching and processing normally run without delay, so large amounts of packets are processed in a short time, but CPU resources potentially needed by other applications are used up. To prevent this, we attempted to use a timer tick in C# to fetch a new packet for processing once a given amount of time had elapsed. A C# timer cannot tick faster than once per millisecond, which turns out to be too slow per packet. The solution is to process a user-set number of packets on each tick (1 ms). Using this method, processing can be throttled or sped up depending on the user's needs.
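The batching scheme described above (a fixed number of packets per timer tick) can be sketched as follows. This is an illustrative Python fragment, not the C# implementation; `read_next_packet` and `process` are placeholder callables, since the real application reads from a pcap dump file via SharpPcap.

```python
import time

def process_dump(read_next_packet, process, packets_per_tick=50, tick=0.001):
    """Process a bounded batch of packets, then yield the CPU for one tick."""
    while True:
        for _ in range(packets_per_tick):
            pkt = read_next_packet()
            if pkt is None:        # end of the capture file
                return
            process(pkt)
        time.sleep(tick)           # throttle: free the CPU between batches
```

Raising `packets_per_tick` speeds processing up at the cost of CPU load; lowering it leaves resources for other applications, mirroring the user-set value described above.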

4.3 Mechanics of Pcap

The Pcap capture library is commonly used for low-level capturing of network traffic. It is well known for providing the core capture capabilities of programs such as Wireshark (formerly Ethereal) and tcpdump. The library provides various APIs to capture, filter and analyse network traffic to suit network programmers' needs. The specific library we use for our C# program is the SharpPcap (SharpPcap, 2011) assembly; for the Perl version of our program we used the Net::Pcap 0.16 (Carnut, 2011) module. After including the module via "use Net::Pcap" or "using SharpPcap", we can identify the various network devices and select the one we want to capture packets from; see lines 1 and 2 in the code sample given in figure 4.2. Once we have selected the device, we set up an event handler that is executed for each packet detected by the Pcap device. The process is quite simple; special note should be given to device.Filter. It contains a string of characters that tells the capture device which packets to keep and which to discard. The stricter the capture filter, the less processing we need to do. For example, in our scenario we are interested in HTTP images and HTTP GET requests. We know that most standard web servers run on port 80, and we are not interested in packets from any other port, as they most probably do not contain HTTP data. Furthermore, we are looking for packets with a payload, not the SYN/FIN/ACK packets that TCP uses to create, close and acknowledge individual packets and connections. We therefore set device.Filter to the specific syntax that

1  devices = LivePcapDeviceList.Instance;
2  device = devices[comboBox_capDevices.SelectedIndex];
3  device.OnPacketArrival += new PacketArrivalEventHandler(device_OnPacketArrival);
4
5  if (promiscuous)
6      device.Open(DeviceMode.Promiscuous, readTimeoutMilliseconds);
7  else
8      device.Open(DeviceMode.Normal, readTimeoutMilliseconds);
9
10 device.Filter = "tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)";
11
12 if (capture_live) {
13     device.StartCapture();
14 } else {
15     device.DumpOpen(dumpfile);
16     device.StartCapture();
17 }

Figure 4.2: Initialize and Start Pcap

WinPcap uses to filter out unwanted packets.

4.4 Sorting Packets into Connections

When one captures packets on a busy network switch or router, one can expect to see a large number of packets with different sources (users) and destinations (servers). Say we find a packet containing the beginning of an image, coming from server (A) and going to client (B). In order to capture the full image we need to capture all packets that come from server (A) and go to client (B). The problem, however, is that client (B) may, while receiving the image, also be receiving other packets that have nothing to do with it. For example, client (B) may be downloading a binary file from port 21 via FTP. If we only captured packets coming from the IP of (A) and going to the IP of (B), we would end up with a mix of HTTP and FTP packets without knowing which belong to HTTP (what we are interested in) and which to FTP. Therefore we have to go a step further and sort not only by IP but also by port. Looking at the packet whose HTTP header data indicates an image, we see that the data goes from port 80 (HTTP) to port 3213 on the client; if we find another packet coming from port 21 and going to port 2523, we know it cannot be part of the HTTP image stream. Knowing which packets belong to which user and which application is essential to sorting out only the packets that contain image data.

The application must be able to sort incoming traffic into connection-based containers. For each captured packet, the program checks whether a similar packet (same source and destination IP and port) already exists. If not, the packet is sorted into a new connection array identified by its source and destination IP and port. If a captured packet matches an already existing connection, it is appended to that connection array, and the array is sorted by sequence number.
This allows the application to keep a history of connections, where each connection contains only the packets of a specific client-to-host IP set. For a simpler visualization of this method, see the connection sample in figure 4.3. Lines 3, 8 and 13 are connection arrays, each containing only packets that match the

1
2  Connection Array[Connection Number][Packet]
3  -> Connection 0
4     -> Packet Src IP: 192.168.1.1 Port: 21 Dst IP: 192.168.1.19 Port: 63123 Seq Number: 1000
5     -> Packet Src IP: 192.168.1.1 Port: 21 Dst IP: 192.168.1.19 Port: 63123 Seq Number: 2000
6     -> Packet Src IP: 192.168.1.1 Port: 21 Dst IP: 192.168.1.19 Port: 63123 Seq Number: 3000
7     -> Packet Src IP: 192.168.1.1 Port: 21 Dst IP: 192.168.1.19 Port: 63123 Seq Number: 4000
8  -> Connection 1
9     -> Packet Src IP: 88.21.51.12 Port: 80 Dst IP: 192.168.1.1 Port: 21234 Seq Number: 5100
10    -> Packet Src IP: 88.21.51.12 Port: 80 Dst IP: 192.168.1.1 Port: 21234 Seq Number: 5200
11    -> Packet Src IP: 88.21.51.12 Port: 80 Dst IP: 192.168.1.1 Port: 21234 Seq Number: 5300
12    -> Packet Src IP: 88.21.51.12 Port: 80 Dst IP: 192.168.1.1 Port: 21234 Seq Number: 5400
13 -> Connection 2
14    -> Packet Src IP: 31.22.234.51 Port: 333 Dst IP: 192.168.1.1 Port: 4123 Seq Number: 34121
15    -> Packet Src IP: 31.22.234.51 Port: 333 Dst IP: 192.168.1.1 Port: 4123 Seq Number: 73212
16

Figure 4.3: Connection Arrays

connection's first packet source and destination IPs. Should a packet be captured with source IP 88.21.51.12 port 80 and destination 192.168.1.1 port 21234, the application searches through all connections, looking at the first packet in each for a match. In our example the match would be found at line 8, and the program then adds the packet to this connection. Similarly, if a captured packet presents itself as source IP 192.168.1.39 port 2211, destination IP 41.85.123.5 port 5232, the program will not find an existing connection and creates a new array for the packet, called a connection. Incoming packets are automatically checked for data that identifies them as belonging to an image. The image types the program should be able to capture are JPEG, PNG, GIF and BMP. Furthermore, the application scans each packet for an HTTP GET request; if one is found, the packet is kept. Packets that contain neither image data nor HTTP GET requests are discarded.
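As a rough sketch of this sorting step (in Python for brevity; the field names are assumptions, not the thesis's C# code), the connections can be held in a dictionary keyed by the source/destination IP-and-port 4-tuple, with each packet list kept ordered by sequence number as packets arrive:

```python
import bisect

def sort_into_connection(connections, src, sport, dst, dport, seq, payload):
    """Append a packet to its connection "array", creating one if needed."""
    key = (src, sport, dst, dport)          # identifies one client-to-host flow
    packets = connections.setdefault(key, [])
    bisect.insort(packets, (seq, payload))  # keep sorted by sequence number
    return key
```

Using a dictionary makes the lookup for an existing connection a hash access rather than the linear scan over all connections described above; the observable behaviour (one array per flow, sorted by sequence number) is the same.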

4.5 Image verification

As with any raw network data, packets can become lost, duplicated or delivered out of order. To combat this problem, packet sequence numbers are used together with image size indicators to verify that all packets in a connection are present and in the correct order. As with all TCP sequence numbers, correct order can be checked by adding the data payload size to the last sequence number and verifying that the next packet in the connection carries this number. Should there be a discrepancy between the expected and the actual sequence number, the image is deemed corrupt and the connection is removed.
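This continuity check can be sketched as follows (an illustrative Python fragment, assuming packets are stored as (sequence number, payload) pairs already sorted by sequence number):

```python
def connection_is_complete(packets):
    """True if each sequence number equals the previous one plus the
    previous payload length, i.e. no gaps, duplicates or reordering."""
    for (seq, payload), (next_seq, _) in zip(packets, packets[1:]):
        if seq + len(payload) != next_seq:
            return False   # discrepancy: image deemed corrupt
    return True
```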

4.6 Preventing Connection Buffer Overflow

In terms of our application, a connection buffer overflow is the accumulation of connection arrays to the point where more memory is being used than is freed by deleting connections. For example, if many packets from different sources arrive with image headers, a new connection array is created for each source. This process takes up memory, and for each added connection the application must loop through all connections in order to sort further incoming packets into the correct array. Should a server, for some reason, stop sending packets with image payload data before the rest of an image is received, the application would keep waiting for the remainder. While this happening a few times is not a problem, it becomes an issue when the application runs for a long time. To prevent unfinished connections from sitting idle, a timer is used to keep track of how long each connection has existed. This connection time to live (TTL) can be set by the user depending on the network and traffic flow. The solution is therefore to record the time when a connection is established and to remove the connection once a preset time limit is reached or all packets needed for an image have been received. The use of timeouts is especially important when dealing with networks that have high packet loss.
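A minimal sketch of this timeout mechanism (Python for illustration; the thesis implements it in C# with a timer, and the data structures here are assumptions):

```python
import time

def purge_stale(connections, created_at, ttl_seconds, now=None):
    """Remove connections older than the user-set TTL.

    connections: dict of connection arrays keyed by flow tuple.
    created_at:  dict mapping the same keys to creation timestamps.
    """
    now = time.monotonic() if now is None else now
    for key in [k for k, t in created_at.items() if now - t > ttl_seconds]:
        connections.pop(key, None)   # free the half-finished image's memory
        del created_at[key]
```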

4.7 Finding the image

During implementation of the image identification class, two different methods were tested against each other to see which one was less CPU taxing and more accurate. We briefly show both methods in this section and give an overview of the benefits and drawbacks of each.

4.7.1 Using Magic Numbers

The first method tested used magic numbers, or string identifiers, to indicate where an image begins and where it ends (Wikipedia-Magic, 2011). In order to identify an image in a set of TCP packets, we had to identify the beginning and end byte sequences that delimit the image. Regardless of whether it is a JPEG, GIF or PNG, each image begins with a byte sequence specific to that image type. If we know the sequence of bytes an image begins with, we can scan a packet's payload to check whether the sequence exists. For example, the commonly used JPEG image always begins with "FFD8FF" in hexadecimal form. Below is an example of a small JPEG image viewed in a hex reader.

Image Type   Begin of File        End of File
JPEG         FFD8FF               FFD9
PNG          89504E470D0A1A0A     49454E44AE426082
GIF          47494638             003B

We first search each incoming packet for byte sequences such as "FFD8FF" and record the location in the byte array of the packet payload. We then continue to search each further packet that is part of the same connection until we find "FFD9". Every packet found after the packet containing "FFD8FF" and before "FFD9" should be part of the image. Assembling the bytes of these packets often reveals the original image. The term "often" is used here because these magic strings

Figure 4.4: JPEG Beginning of file identifier

are not long enough to guarantee that they belong to an image. For example, a binary file of another type may contain "FFD8FF" without actually being an image. Searching for PNG images returns better results, as the magic string is longer and the chance of unrelated data containing the 8-byte sequence "89504E470D0A1A0A" (hex) is much smaller. The method is also very CPU intensive, as each and every packet has to be scanned in its entirety to find a start-of-image or end-of-image marker. Furthermore, the packet with the end-of-image magic string may have been lost or dropped by the client; in that case, without timeouts or a limit on the maximum number of packets per connection/image, the application would keep scanning incoming packets until an "FFD9" end-of-image marker is found. This process does, however, have its advantages: since it does not rely on HTTP header MIME types to identify the start of an image and its size (as shown in the next section), we can potentially search a network stream for any data whose beginning and end magic strings we know.
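The magic-number method can be sketched as follows for the JPEG case (an illustrative Python fragment; the real application works per packet over a connection array and is written in C#):

```python
# Scan a connection's payloads for the JPEG start marker FFD8FF, then
# take everything up to and including the end marker FFD9.
SOI = bytes.fromhex("FFD8FF")   # start-of-image magic number
EOI = bytes.fromhex("FFD9")     # end-of-image magic number

def extract_jpeg(payloads):
    data = b"".join(payloads)
    start = data.find(SOI)
    if start < 0:
        return None                     # no image header seen yet
    end = data.find(EOI, start + len(SOI))
    if end < 0:
        return None                     # end marker lost or not yet captured
    return data[start:end + len(EOI)]
```

Note the failure mode described above: if the packet carrying `FFD9` is lost, `extract_jpeg` never returns, which is why the TTL timeouts of section 4.6 are needed.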

4.7.2 Using HTTP Headers (MIME Types)

As stated in the Background section of this document, HTTP headers are very useful when trying to identify an image in a TCP stream. If an HTTP header contains a MIME type we are interested in, we only need to read the content size from the HTTP header to identify where the image ends. (Tanenbaum, 2002)

Image Type   Cleartext of Mime Type      HEX Mime Type String
JPEG         Content-Type: image/jpeg    436f6e74656e742d547970653a20696d6167652f6a706567
PNG          Content-Type: image/png     436f6e74656e742d547970653a20696d6167652f706e67
GIF          Content-Type: image/gif     436f6e74656e742d547970653a20696d6167652f676966

The long magic strings of the MIME types have very little chance of showing up in packets that do not belong to the HTTP stream of an image. We can therefore

deduce the type of image from the header, which is useful when we are only interested in certain types of media such as images, or more specifically JPEG, PNG and GIF. The HTTP header also contains the size of the payload (the image). All we then have to do is find where the HTTP header ends, which is also the start of the image, and read all packets belonging to the connection until we have reached the size specified by the "Content-Length: " field.

Identifier Type                 Cleartext             HEX
End of HTTP Header indicator    "\r\n\r\n"            0d0a0d0a
Start of Image Size             "Content-Length: "    436f6e74656e742d4c656e6774683a20
End of Image Size               "\r\n"                0d0a
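Putting these identifiers together, the header-based extraction might be sketched like this (illustrative Python, not the thesis's C# implementation; chunked transfer and GZIP-encoded bodies are ignored for simplicity):

```python
def extract_http_image(stream):
    """Return (mime_type, body) once Content-Length bytes have arrived."""
    sep = stream.find(b"\r\n\r\n")          # end-of-header marker 0d0a0d0a
    if sep < 0:
        return None                          # header not complete yet
    header = stream[:sep].decode("latin-1")
    ctype = clen = None
    for line in header.split("\r\n"):
        if line.lower().startswith("content-type:"):
            ctype = line.split(":", 1)[1].strip()
        elif line.lower().startswith("content-length:"):
            clen = int(line.split(":", 1)[1])
    if ctype not in ("image/jpeg", "image/png", "image/gif") or clen is None:
        return None                          # not a media type we want
    body = stream[sep + 4:sep + 4 + clen]    # image starts right after header
    return (ctype, body) if len(body) == clen else None  # still incomplete
```

Unlike the magic-number scan, only the header has to be searched; the body is simply counted off, which is why the thesis found this method both faster and more accurate.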

5 Getting the Network data

This chapter gives a practical analysis of the steps needed to obtain network traffic in different network topologies. The logic of switched versus non-switched (hub) networks is presented. Finally, the real-time problem is discussed with test examples.

5.1 Switched/Non Switched Networks

In modern switched TCP/IP networks it is not easy to capture network traffic that is not intended for oneself without having access to the network hardware. This is in contrast to simple networks (often home networks) that rely on a network hub. Methods do exist that allow network sniffing in switched networks, but they require some knowledge of network routing principles. Before we get into sniffing switched network traffic, let us examine the simple hub network.

Figure 5.1: Simple Ethernet Hub

In the example of figure 5.1, client (A) is connected directly to the hub, which in turn is connected to an internet router (B). Client (C) is also connected to this hub and is attempting to sniff the network traffic between user (A) and the internet router. Client (A) sends an HTTP request to view http://example.com/image.gif. Client (A)'s machine holds a default gateway of 192.168.1.1, which is the IP address of the internet router (B). It is important to note that the HTTP request is first sent over the hub, and the hub forwards the request not only to the intended target (B) but also to every other machine connected to the hub, including (C). Standard network cards are configured to drop packets that are not intended for them by looking at the media access control (MAC) address; therefore, in a normal environment, client (C)'s machine would drop the packet while router (B) would accept it and forward the request. However, client (C) has the option of instructing the network interface on his machine not to drop the network packets from (A); this is most often referred to as promiscuous mode. It allows viewing of network traffic that is not intended for oneself. Similarly, when internet router (B) gets data back from example.com, it sends it to the hub, which forwards the data to all connected clients, in our case clients (A) and (C). Client (A) and internet router (B) have no idea that client (C) is listening in on their data exchange.

Figure 5.2: Simple Ethernet Switch

As stated previously, a switched network is more complex in terms of capturing data not intended for oneself. Man-in-the-middle attacks such as ARP poisoning, where the Address Resolution Protocol used for address resolution is abused, can be employed on switched networks to sniff traffic. Switches work at the data link layer of the OSI Reference Model; a switch relies on a machine's MAC address as its identifier. A simple switch has a table that tells it which MAC addresses reside on which port. When User (A) sends an HTTP request to the default gateway, Router (B) in our scenario, User (A)'s machine only knows the IP of the default gateway; since the switch has no concept of IPs, it cannot forward a packet based on the packet's destination IP. If User (A)'s machine has not sent any data to Router (B) before, it first has to learn the MAC address of Router (B). This is done by User (A)'s machine sending an ARP request broadcast ("Who has IP 192.168.1.1?") to every machine on the network. Each machine connected to the switch receives this ARP request and checks whether it is the one with IP 192.168.1.1; if not, it ignores the request. When Router (B) gets the request, it answers with an ARP reply: "Yes, I am 192.168.1.1, my MAC address is XX-XX-XX-XX-XX-XX". Now User (A)'s machine knows Router (B)'s MAC address and sends its HTTP request, with Router (B)'s MAC address in its destination header, to the network switch. The switch keeps a MAC address table that tells it on which port Router (B) is connected and forwards the HTTP request to that port.

While this method is very simple and therefore stable, it is not very secure. The Address Resolution Protocol does not use any sort of authentication that would prevent fake or spoofed ARP replies from being sent. Furthermore, a machine can send an ARP reply to any machine without the receiving machine ever having sent an ARP request; the machine will accept the ARP reply and note the IP-to-MAC mapping in its ARP table. Say User (C) 192.168.1.3:XX-XX-XX-XX-XX-AC wants to listen in on the network traffic between User (A) 192.168.1.2:XX-XX-XX-XX-XX-AA and Router (B) 192.168.1.1:XX-XX-XX-XX-XX-AB. User (C) first sends an ARP reply to Router (B) stating a fake IP address and its own MAC address: "I am 192.168.1.2:XX-XX-XX-XX-XX-AC". Now Router (B) believes that, to send data to User (A) at 192.168.1.2, it must use XX-XX-XX-XX-XX-AC, which is in fact not User (A)'s MAC address at all but User (C)'s. This already gives User (C) all the data intended for User (A), but it also prevents User (A)'s machine from receiving any data from Router (B). To complete the attack, User (C) sends an ARP reply to User (A)'s machine stating "I am 192.168.1.1:XX-XX-XX-XX-XX-AC". Data intended for Router (B) at 192.168.1.1:XX-XX-XX-XX-XX-AB is now instead sent to 192.168.1.1:XX-XX-XX-XX-XX-AC, which is the MAC address of User (C). All User (C) now has to do is forward the traffic accordingly: when it gets traffic from User (A) it forwards it to Router (B), and vice versa. User (A) and Router (B) think they are communicating directly with each other, but User (C) is actually the "man in the middle", forwarding and sniffing the traffic between the two.
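For illustration only, such a spoofed ARP reply can be assembled byte by byte with nothing but the standard library (a hypothetical sketch; actually transmitting the frame would require a raw socket and administrative privileges, and is deliberately omitted):

```python
import struct

def arp_reply(sender_mac, sender_ip, target_mac, target_ip):
    """Build an Ethernet frame carrying an ARP reply (opcode 2)."""
    def mac(s): return bytes.fromhex(s.replace(":", "").replace("-", ""))
    def ip(s):  return bytes(int(octet) for octet in s.split("."))
    # Ethernet header: destination MAC, source MAC, EtherType 0x0806 (ARP)
    eth = mac(target_mac) + mac(sender_mac) + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, 6-byte MACs, 4-byte IPs, opcode 2 = reply
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    arp += mac(sender_mac) + ip(sender_ip) + mac(target_mac) + ip(target_ip)
    return eth + arp
```

In the attack above, User (C) would place its own MAC as `sender_mac` while claiming User (A)'s or Router (B)'s IP as `sender_ip`, which is exactly the unauthenticated binding the victims then cache.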

Figure 5.3: Ethernet Switch with MITM attack

5.2 The Real-time problem

The ideal goal of an application that displays network data visually is to present the data to the user as it is captured. Several issues need to be addressed in order to display network images on screen as close as possible to the moment the network interface captures them. First, the amount of network data that needs to be processed can vary greatly with network traffic load. Furthermore, the actual implementation of the packet capture dialogue needs to capture a high percentage of packets without dropping any. This can be tested by running network traffic monitoring software alongside the packet capture dialogue and comparing the results. For a controlled testing environment it is best to use a local area network that is not connected to any third-party tier such as the internet. Preliminary tests were done with an internet connection rated at 700 kbps. The machine used was an Intel Core Duo rated at 2.33 GHz per core, running Ubuntu Linux, with a standard 10/100 Mbit Ethernet NIC connected to an internet router set to filter no traffic. A simple Perl script was written to display the total size and number of packets captured. This script utilized the Perl Net::Pcap module for the capture API. It was run against the open source tools tcpdump and IPTraf. A 200 MB test file was downloaded from a fast server in Germany to ensure the connection was saturated with traffic. Interestingly enough, both the Perl script and tcpdump showed the same number of captured packets. This was not the case for IPTraf, which missed 2 packets; the reason for this is beyond the scope of this research. It is important to note that no actual processing was done on the packet contents, keeping the processor load to a minimum. In theory, therefore, for the above test setup one can expect Net::Pcap to capture 100% of packets. Since 700 kbps of network traffic showed no packet loss, the next step was to test Net::Pcap on a local area network using a 10/100 Mbit connection.

Capture Program    No. of Packets    Lost Packets    Size      Connection
Perl Net::Pcap     191157            0               185 MB    10/100
SharpPcap          191157            0               185 MB    10/100
IPTraf             191153            4               185 MB    10/100
tcpdump            191157            0               185 MB    10/100

The above data shows that no packets were dropped when transferring packets at 10.9 MB/s. The next step is to use Net::Pcap to write the packets to file for later processing. A Perl script was written to capture packets and write them to a packet file using the pcap dump function. A file transfer was again done over Samba with a 185 MB file on a 10/100 LAN (Local Area Network) connection. Comparing the results with tcpdump, no packet loss was detected. This result leads one to conclude that the Net::Pcap module is able to handle at least 10.9 MB/s. As data transfer speeds above 10.9 MB/s could not be tested, one cannot assume that Net::Pcap can handle faster speeds. To present the real-time problem effectively, one has to actually attempt processing on a received packet as soon as it is captured by Net::Pcap. While one may be tempted to use buffers to prevent Net::Pcap from missing incoming packets, this method is not advised if the goal is to visualize packet contents in real time. The faster method is storing all incoming packets in memory until they are ready to be processed. While this saves time, it does pose a problem when limited RAM is available to store the packets.
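The memory trade-off mentioned above can be illustrated with a bounded queue (a Python sketch under the assumption that capture and processing are decoupled; the maxlen parameter stands in for the available RAM):

```python
from collections import deque

# Bounded in-memory packet buffer: appending is O(1) and never blocks the
# capture path, but once the capacity (i.e. RAM) is exhausted the oldest
# unprocessed packets are silently discarded.
buffer = deque(maxlen=100_000)

def on_capture(pkt):
    buffer.append(pkt)            # called from the capture callback

def drain(process):
    """Process everything currently buffered, oldest first."""
    while buffer:
        process(buffer.popleft())
```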

6 The Graphical User Interface

The graphical user interface chapter compares two different programming architectures. First, the programming language Perl, using the Eclipse IDE with a GTK+ GUI front end, is discussed with its benefits and drawbacks. The same is done for C# in Visual Studio 2010. The implemented end functionality is then presented.

6.1 Perl, Eclipse and GTK+

The first basic version of the application was implemented in Perl using the GTK+ framework. This was seen as favourable, as it can be used on both Linux and Microsoft Windows operating systems. Using Perl integration with the Eclipse IDE (Epic, 2011), implementing the fundamental logic of packet capture and assembly was straightforward. It was only when the graphical user interface was being developed that the process started to falter. Although Perl does offer a library of bindings to Gtk2, it soon became clear that writing advanced user interfaces in Perl using GTK+ is a tedious process. First there is the issue of threading: threading has largely been abandoned in Perl in favour of what is known as POE, the Perl Object Environment (POE, 2011). The issue lies in the way Perl Gtk2 is event driven; while this is not a problem by itself, it becomes one when threads are poorly implemented and therefore unusable. Perl Gtk2 uses an event loop that remains idle until a signal is emitted, e.g. the click of a button. If we connect this signal to a callback, the callback gets executed, and while this execution occurs the GUI is frozen. The standard way of handling this is to use threads, one for the Gtk2 loop and another for any processing we need to do; once the processing is done we can update the GUI with the new information. Instead of threads many people recommend POE, but it adds another level of complexity to the code and is not intuitive. Programmers who have never used POE will find the learning curve very steep and the POE-Gtk2 documentation lacking. Writing a Gtk2 interface by hand, i.e. without a visual editor such as Visual Studio, is extremely time consuming: the code for each button, list, menu etc. must be written by hand.

A prototype was therefore developed as two separate Perl files: one that could be run from the command line to capture and scan for images, with the ability to be executed by a GUI front-end application. Should the user want a GUI, the front-end application would spawn a separate process in which image capture and search were done. Although rudimentary, this design had no need for threads or POE: one process would capture, identify, sort and write images to disk while the other would display them. A working prototype was achieved, but the result showed that Perl was never meant for developing complex GUIs without a good threading implementation. Key aspects of the program, such as HTTP GET request identification, were not implemented. Furthermore, packet information such as source and destination, along with other components listed in the Usability Requirements, was abandoned in favour of developing the application in C# using Visual Studio .Net.

Figure 6.1: Perl Prototype Application Overview

6.2 C#, Visual Studio and .Net Upon implementing the first version of the application in C# it quickly became evident that Visual Studio was a far better IDE than Eclipse. The user interface is much more intuitive, and the documentation of the .Net library is excellent, with code examples for any number of tasks. Being able to drag and drop GUI elements onto a form and manipulate their appearance without the need to write code saved a lot of time. An overall screenshot of the main application window can be seen in figure 6.2. It is split up into separate GroupBox sections to aid navigation of application options and settings. The elements in the CaptureBar at the top are laid out from left to right so that the most used settings and buttons can naturally be configured before each capture.

Figure 6.2: C# Application Overview

A key element implemented to achieve our usability requirements was the ability to view images found on the network in an interactive way. In figure 6.3 we have integrated two image buffers that display found images as thumbnails. The images travel from right to left at a variable speed specified by the user. This method enables a large number of images to be viewed in a short time. The application automatically scales each full sized image down to a thumbnail that fills the image buffer's height, even when the application window is resized, see figure 6.4. If an image is clicked, the original full sized image is shown in a new window together with its network information. Similarly, if an HTTP GET request row is double clicked, the application opens the standard internet browser at the requested URL.
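The thumbnail scaling described above, fitting every full sized image to the buffer height while preserving its aspect ratio, reduces to a single scale factor. A minimal sketch, not the thesis's C# code; the function name is our own:

```python
def thumbnail_size(img_w: int, img_h: int, buffer_h: int):
    """Scale an image so its height exactly fills the buffer height,
    preserving the aspect ratio; the resulting width varies per image."""
    scale = buffer_h / img_h
    return round(img_w * scale), buffer_h

# e.g. an 800x600 image in a 120px-high buffer becomes 160x120,
# while a 600x800 (portrait) image becomes 90x120.
```

Recomputing this on every window resize is what keeps the thumbnails filling the buffer height as described.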

Figure 6.3: Image Scroll Buffers

Figure 6.4: Original Image Window

Although the method of visualising data by having images scroll from right to left does indeed work in principle, C# .Net renders the moving images with flickering and tearing. The amount of CPU time taken to redraw an image after it has moved one pixel is significant, compromising the real-time ability of the underlying network sort and search packet algorithms. The approach taken was to use a timer with a 1ms tick that fires an event to move each thumbnail on the GUI; while for one image alone this method would be somewhat acceptable, moving numerous images at the same time becomes impractical. It was then attempted to use the .Net double buffering functionality, which improved the flickering and tearing but not to an acceptable extent (Meterman, 2011). Therefore the only remaining options were to implement OpenGL or DirectX hardware acceleration for displaying the images, or not to scroll at all but rather have an image skip from one image frame to the next, as seen in figure 6.5.
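The difference between per-pixel scrolling and the frame-to-frame skipping of figure 6.5 can be made concrete with a small position calculation. This is an illustrative sketch, not code from the application; the function names and tick counts are our own assumptions:

```python
def scroll_x(slot: int, thumb_w: int, tick: int, px_per_tick: int = 1) -> int:
    """Per-pixel scrolling: every timer tick shifts each thumbnail by a
    pixel, forcing a redraw of all images on every tick (the cause of
    the observed flicker, tearing and CPU load)."""
    return slot * thumb_w - tick * px_per_tick

def skip_x(slot: int, thumb_w: int, tick: int, ticks_per_slot: int = 100) -> int:
    """Frame skipping: a thumbnail only moves once enough ticks have
    elapsed to jump a whole slot, so redraws happen rarely."""
    return (slot - tick // ticks_per_slot) * thumb_w
```

With per-pixel scrolling the position (and hence the redraw) changes on every tick, while with slot skipping it changes only once per `ticks_per_slot` ticks, which is why the skipping variant avoids the redraw cost.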

Another issue that was only identified late in the production phase was a problem with the SharpPcap API library. When prototyping started, version 3.4.0 was used, which at the time did not support AirPcap (wireless) or LibPcap (Linux based) devices. Being the only choice that integrates seamlessly with C# .Net, it was used nonetheless. As of version 4.0.0, a major milestone with significant speed improvements, these features were implemented, meaning the C# .Net SharpPcap integration code written for version 3.4.0 had to be modified to allow use of them (SourceForge.net, 2011).

Figure 6.5: C# Application with Non Scrolling Image Display

Finally, to give the user a better understanding of the image scroll speed and offline processing speed sliders, text markers were added that show how many images can be viewed per minute as a slider is moved, or similarly the number of packets processed in offline capture mode (figure 6.6).

Figure 6.6: C# Variable Image and Processing Speed Information

6.2.1 Implemented Functionality This section shows the implemented features and functionality in the final version of the software. It only presents an overview of the most important items relevant to this thesis, and not standard usability features such as window resize capability etc., which can be seen in the application's source code. • List and selection of network interfaces to sniff

• Pcap packet filter input option

• Promiscuous mode toggle

• Live/Non-live capture toggle

• Process Button to process Pcap capture files

• Live counter information for: Packets Captured, Images Sniffed, HTTP Requests Sniffed, Active Connections, Connection TTL Exceeded, Packet Sequence Number Errors

• Image Scroll Buffers to show thumbnail samples of captured images (click thumbnail to view full size image)

• Image type to search for (JPG, PNG, GIF)

• Image scroll speed adjustment

• Set upper limit of Image size to search

• Non-Live process speed adjustment

• Connection Time To Live adjustment in seconds

• Packet Monitor list of captured Images with relevant network information (click to view full size image)

• HTTP Request list showing Source, Host, and requested URL (click to open url in Browser)

• Debug Box that shows relevant application information

• Selection of location to save images and packet captures

• Image Reference File automatically saved containing: Time, Image Name, Relevant Network Information

• HTTP Get Reference File automatically saved containing: Request time, Request Source, Host, URL
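The "Packet Sequence Number Errors" counter listed above implies tracking the expected TCP sequence number per connection and flagging gaps. A simplified sketch of that bookkeeping (illustrative only; it ignores 32-bit sequence wrap-around and retransmissions, which a real implementation must handle):

```python
def check_sequence(expected_seq: int, seq: int, payload_len: int):
    """Return (next_expected_seq, error) for one incoming TCP segment.
    A segment whose sequence number differs from the expected one
    indicates lost or out-of-order data, which would corrupt a
    reassembled image and is therefore counted as an error."""
    error = seq != expected_seq
    # Advance past the furthest byte seen so far.
    next_expected = max(expected_seq, seq + payload_len)
    return next_expected, error
```

An in-order segment simply advances the expected sequence number; any jump is counted on the live error counter and the image in that stream can be discarded.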

6.3 Final Application Layout The figure below shows how actors initiate the different paths of action that the application handles. In the figure we see that the traffic generating user generates traffic that is captured by the traffic capture component, which has previously been set up by the application user. The traffic is then processed for images and HTTP GET requests. If images are found they are displayed to the application user; similarly, HTTP GET requests can also be accessed by the actor "Application user". The use case diagram gives an overview of the boundaries and actor interaction with the system.

Figure 6.7: UML Use Case Diagram

7 Evaluation

7.1 Evaluation and Testing The purpose of this section is to investigate the final application in terms of the preset requirements. Should the application not meet the initial goals, further analysis will be undertaken to identify the reasons for the shortcomings. Depending on the type of shortcoming, the program will be revised to meet the requirements or recommendations will be made on how to solve the issue. The Future Work section details what improvements could be made or features added. A usability evaluation test is done by observing a user complete a set of tasks while noting the time each task takes to complete successfully and obtaining a perceived usability score, where 0 means "Not usable, complete redesign needed." and 100 means "Excellent, no improvements necessary." Furthermore, the tester asks the user to comment on the reason for giving a certain usability score and what improvements might help to increase it. Each test scenario was undertaken by four different users and the mean time and usability scores were calculated.

7.1.1 Usability Evaluation To evaluate the usability of the application a series of tests was undertaken. Each test specified a list of objectives a user had to accomplish while the user was observed and notes were taken. Test Scenario A: The objective is to do a standard live image capture on a network interface connected to a bridge in order to obtain an overview of image traffic and HTTP requests on the network. The user is asked to enable promiscuous mode on the network interface and to only obtain images that are not larger than 2000KB in size.

Task Type                      Time Taken x̄   Usability Score x̄   Comments Ref.
Open App. & Set Def. Folders   35sec          91                  A1
Select Correct Interface       136sec         86                  A2
Set Promiscuous Mode           5sec           100                 A3
Set Live Capture Mode          8sec           50                  A4
Set Image Size Limit           48sec          85                  A5

Comment References A:
A1: None.
A2: Due to unfamiliarity with the hardware, finding the correct network adapter was trial and error. The user attempted each interface until one with traffic was found.
A3: None.
A4: The function of live vs. non-live packet capture was not clear. Attributed to user unfamiliarity with the set task.
A5: The user took a long time searching for the input field to set the maximum image size limit. The Image Size Limit input box could be placed in the main capture bar.

Test Scenario B: The objective is to do an image and HTTP request capture over a period of 2 hours whereby the captured packets are written to a file (non-live mode). The user is asked to specify a location where the capture file is to be written, select the network interface that connects to the network bridge, set the network device to promiscuous mode and disable live capture mode. After completing the capture, the user opens the saved packet data in the external program Wireshark/Ethereal and confirms that the number of captured packets shown by the capture application matches the number Wireshark presents.

Task Type                                            Time Taken x̄   Usability Score x̄   Comments Ref.
Select Correct Interface                             11sec          87                  B1
Turn Off Promiscuous Mode                            3sec           100                 B2
Select Packet Filter to TCP Port 80                  230sec         35                  B3
Select File Save Location for Non-live Capture       40sec          100                 B4
Start Capture                                        5sec           100                 B5
Leave Application Running for 120 Minutes            N/A            N/A                 B6
Stop Capture                                         5sec           100                 B7
Open Wireshark and Import the 2 Hour Packet Capture  180sec         84                  B8
Compare Packet Count of Capture Application with
that of Wireshark                                    37sec          90                  B9

Comment References B:
B1: Due to familiarity gained with the application through Test A, the users all had no issues selecting the correct interface. However, interface naming is still not intuitive.
B2: None.
B3: All test subjects took a long time trying to figure out whether the default packet filter string needed to be changed. Users said a document detailing the packet filter string syntax is needed. Once the user knows what string to input, the usage of the capture filter option is simple.
B4: One user was unsure what location to save data to, as this was not specified; upon being informed that the choice of location was his, the user selected the Desktop path. All other users used the "My Documents" folder under Microsoft Windows XP.
B5: None.
B6: None.
B7: None.
B8: Users were unfamiliar with the Wireshark application but were ultimately able to locate their saved packet captures and import them into Wireshark via the "File->Open" option. One user forgot the previous file save location and utilized the inbuilt "Open->Packet Directory" feature as a reference.
B9: The packet count in the capture application was easily identifiable and matched the packet count shown in Wireshark.

Test Scenario C: The objective is to process the packets obtained in Test Scenario B in order to get an overview of the images and HTTP requests contained in the two hour capture. The user is asked to set the Connection TTL option to 20 seconds. Furthermore, the user should be able to change the image scroll speed and adjust the speed of processing to achieve optimum performance of the application. The user should then select an image in the image scroll area to inspect its source and destination properties. Next, a random HTTP request should be selected, noting the server and URL. The user is then tasked with exiting the program, finding the appropriate image in the file system and comparing its previously obtained attributes with those presented in the log file. Finally, the user is asked to identify the exact time of the selected HTTP request by looking it up in the HTTP Request log file.

Task Type                                            Time Taken x̄   Usability Score x̄   Comments Ref.
Open Application and Set TTL Option to 20 Seconds    13sec          100                 C1
Start Process                                        3sec           100                 C2
When Requested Select Offline File to Process        30sec          86                  C3
During Process, Change Image Scroll Speed            7sec           100                 C4
During Process, Change Processing Speed              8sec           100                 C5
During Process, Click an Image and Note Down
Network Attributes                                   45sec          96                  C6
During Process, Click an HTTP Request and Note
Down Network Attributes                              38sec          91                  C7
Close Application and Find Previously Selected
Image on File System                                 61sec          88                  C8
Compare Previously Selected Image Network
Attributes with Log File                             61sec          90                  C9
Find Previously Selected HTTP Request in Log File
and Identify Capture Time                            342sec         65                  C10

Comment References C:
C1: None.
C2: None.
C3: Although the default file location browser window opens automatically, users did have to remember the name of the last saved capture file. This could be modified to always show the last capture file automatically and only let the user change directories or file selection upon request.
C4: No issues; however, one user remarked that the scroll speed is arbitrary or not intuitive in the absence of a defined scale on the scroll slider. Changing the scroll interval to images per minute might be a viable option.
C5: Same issue as above; the user has no real visual feedback without a live representation of the number of packets being processed per second.
C6: None.
C7: Two users mentioned that the same window view used for the full sized images could be reused but instead incorporate a sample of the URL with relevant network attributes. For example, if a user clicks an HTTP Request row, a window should pop up showing the network attributes together with a built in browser that points to the URL.
C8: None.
C9: None.
C10: Finding the correct URL in a large list of URLs was time consuming. One user used the search function of the text editor the log file was displayed in to locate the correct instance faster.

7.1.2 Functional Evaluation The functionality was tested by the program author by subjecting the capture software to simulated real world conditions. The program was run on a computer connected to a network bridge in order to monitor traffic between all hosts connected to the bridge. Due to privacy and security issues a real world test could not be undertaken. Instead, a separate piece of software was written in C# to simulate the web browsing behaviour of a low latency network, saturating a 700Kbps internet connection. It was decided early on during the rapid prototyping phase that it would be impractical for a proof of concept thesis to go above the available 700Kbps downstream limit by, for example, setting up an internal Gigabit Ethernet web server to host the requested media. Furthermore, a more accurate evaluation is obtained by issuing requests to actual internet web servers, as only under these conditions can one expect high latency, packet loss, jitter, out of order packets and other undesirable network transfer occurrences. The traffic generating software, which is also available together with the capture application, was designed to read a list of URLs from a file and request each URL while keeping track of timeouts and web server response times. Due to the nature of web server latency, the initial method of sequential requests was quickly scrapped in favour of a multi-threaded version. To clarify, the issue encountered was that when a URL is requested it is unwise to wait for a response before sending the request for the next URL in line. Instead, threading was used to split the list of URLs into groups, where a thread is created for each group to handle its batch of URLs. By using this method a saturation of the 700Kbps internet connection could be guaranteed.
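The batching strategy described above, splitting the URL list into groups with one thread per group, can be sketched as follows. This is an illustrative Python version of the idea, not the C# traffic generator; the fetch function is injected so that a real HTTP GET with a timeout can be swapped in, and all names are our own:

```python
import threading

def split_into_groups(urls, n_groups):
    """Round-robin split of the URL list into n_groups batches."""
    groups = [[] for _ in range(n_groups)]
    for i, url in enumerate(urls):
        groups[i % n_groups].append(url)
    return groups

def fetch_all(urls, fetch, n_groups=4):
    """Fetch every URL using one worker thread per group, collecting a
    per-URL outcome (e.g. response size, or a timeout marker)."""
    results = {}
    lock = threading.Lock()

    def worker(batch):
        for url in batch:
            outcome = fetch(url)  # real code: HTTP GET with a timeout
            with lock:
                results[url] = outcome

    threads = [threading.Thread(target=worker, args=(g,))
               for g in split_into_groups(urls, n_groups)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each group's slow or timed-out servers only stall their own thread, the other threads keep issuing requests, which is what allows the link to stay saturated.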

In figure 7.1 we can see a screenshot of the traffic generating program written to simulate high load user browsing behaviour. It keeps track of timeouts, the total size of images found, the number of different image types and run types. This data can then be compared with the number of images found by the capture software.

Figure 7.1: Application used to simulate browsing behaviour.

The application meets all specified functional requirements, including further improvements such as the ability to modify the number of packets processed in offline mode, manipulation of the image scroll speed, and the ability to open captured URLs in an external browser. Using the prototype method of non MIME-type image identification, this level of functionality was not possible: image identification accuracy was impeded by many false positives stemming from non image related magic numbers in packet data. Below is a list of the functional requirements that have been met.

• Ability to monitor all network interfaces/devices via WinPcap.

• Ability to utilize Promiscuous capture mode on supported devices. (Not the case on most wireless network devices.)

• Ability to find images of type: JPEG, GIF and PNG.

• Images are saved when the entire image data has been captured, or discarded otherwise.

• Tests have shown that 90.5% of images transferred over a network are detected and captured.

• Tests have shown that 99% of HTTP GET requests transferred over a network are detected and captured.

• Ability to detect HTTP Get Requests.

• Ability to keep a history of image source/destination IP, PORT, MAC and time in application.

• Ability to keep a history of image source/destination IP, PORT, MAC and time outside application in log/history text document.

• Ability to keep a history of HTTP Get Request source/destination IP, PORT, MAC and time in application.

• Ability to keep a history of HTTP Get Request source/destination IP, PORT, MAC and time outside application in log/history text document.
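The two identification methods evaluated in this thesis, raw magic numbers in the packet payload versus the HTTP Content-Type (MIME) header, can be sketched as follows. The byte signatures are the standard JPEG, PNG and GIF magic numbers; the helper names are our own, and this is an illustration rather than the application's C# code:

```python
# Standard magic-number prefixes for the three supported image types.
MAGIC = {
    b"\xff\xd8\xff": "JPEG",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}

MIME = {
    "image/jpeg": "JPEG",
    "image/png": "PNG",
    "image/gif": "GIF",
}

def identify_by_magic(payload: bytes):
    """Scan raw TCP payload for an image signature. Simple, but prone to
    false positives: any payload may happen to contain these bytes."""
    for sig, kind in MAGIC.items():
        if sig in payload:
            return kind
    return None

def identify_by_mime(http_headers: bytes):
    """Read the Content-Type header of an HTTP response; the server
    states the type explicitly, avoiding magic-number false positives."""
    for line in http_headers.split(b"\r\n"):
        if line.lower().startswith(b"content-type:"):
            ctype = line.split(b":", 1)[1].strip().split(b";")[0].decode()
            return MIME.get(ctype)
    return None
```

The MIME variant also gives the Content-Length for free from the same header block, which is why the thesis found it both more accurate and faster than scanning every payload byte for signatures.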

7.1.3 Performance Evaluation The performance results showed that although the application was optimized for speed, packet loss did indeed occur, at a rate of about 10% at 700Kbps. Several tests were done using different settings for ConnTTL (connection time to live) and maximum image size. The tests revealed no clear differences between the two, although JPEG images were missed with higher frequency than GIF or PNG images. As seen in figure 7.2, the data clearly showed that for a maximum image capture rate the non-live mode was best suited, with an image loss of less than 1%. RAM consumption was stable, with a maximum usage of 150MB in live mode and 30 to 40MB in non-live mode. The hard drive storage needed was directly dependent on the number of images found and their individual sizes. As the image types obtained are already compressed, no storage space could have been saved by adding a further compression feature to the application.

Figure 7.2: Image Loss (Live vs Non-live)

Figure 7.3: Image Loss (By Image Type)

Figure 7.4: Raw Test Data

Figure 7.5: Raw Test Data Cont.

8 Conclusions and Future Work

The goal of this concluding chapter is to summarize the results and findings of the research. It highlights the main aspects of the two programming approaches in terms of the programming language and search processes used. We review the benefits and drawbacks of live versus non-live capture while presenting improvements and ideas relevant to future work.

8.1 Conclusion This thesis has evaluated the feasibility of an image capturing and visualization technique that is stable and fast. As the performance evaluation results (Chapter 7) and the evaluation of currently available applications (Chapter 2) have shown, such an application must be written to handle large network data loads while being able to detect images and HTTP GET requests in a real-time and reliable manner. These goals were achieved using the SharpPcap capture API along with Visual Studio C#. Smooth scrolling of images on the GUI was discarded in favour of image translation from one image frame to another, thus avoiding flicker, tearing and high CPU loads. Perl with GTK+ was found to lack a proper threading implementation, instead opting for the more complex and time consuming Perl Object Environment. Lastly, the best results were achieved using MIME types to identify images along with their data length. Furthermore, detecting sequence errors, timeouts and multiple images within one network frame greatly increased detection accuracy. As live capture is CPU intensive and impractical for long network data analysis, the non-live capture method works well in environments where packet loss due to hardware limitations is not acceptable. A maximum image loss of 10% in live mode and less than 1% in non-live mode were key performance requirements that were met (as seen in figure 7.2 on page 39). Use of RAM was observed to range from 20 to 150MB depending on the running time of the application. Overall, the different requirements were met, providing a stable, usable and functional application with further features added for better usability and functionality. Media acquisition from TCP streams proved to be a complex task with many issues to consider, such as network data transfer errors and the speed, type and amount of data to be monitored; fundamentally, however, each issue was researched and through trial and error overcome.
The final solution, a working application together with this thesis covering the issue of network media monitoring, represents a good example of the type of issues that need consideration and of how best to tackle them in a fast and efficient manner.

8.2 Future Work Naturally, as is the case with most software, improvements and features can be added. An example of such a feature would be support for more image formats or media types. The difficulty in doing this should be minimal, as the groundwork of capture, sort and reassembly has been done. Different views could be implemented that allow for monitoring more images at a glance, along with graphical network statistics. Besides the already existing HTTP GET request detection, HTTP POST detection could be added. Finally, with SharpPcap v4.0.0, remote capturing of packets through the SharpPcap/LibPcap/WinPcap remote sniffing API would be useful to capture network traffic on adapter devices that are remote from the machine running the application. An example of this would be a powerful dedicated non-mobile machine running the LiquidIce application while the actual network interface the data is captured from only runs a daemon that forwards the traffic. At the time of writing this document the remote API is still termed "highly experimental" (WinPcap, 2011).

Bibliography

Britton, Carol and Doake, Jill. Software System Development: A Gentle Introduction. Mcgraw Hill Book Co Ltd, subsequent edition, 2003.

Carnut, Marco. Net::Pcap - search.cpan.org, 2011. URL http://search.cpan.org/~kcarnut/Net-Pcap-0.05/Pcap.pm. (Accessed 05/2011).

Cheshire, Stuart. EtherPEG, 2011. URL http://www.etherpeg.org/.

Epic. EPIC - Eclipse Perl Integration, 2011. URL http://www.epic-ide.org/. (Accessed 11/2011).

Gibbs, R. Dennis. Project Management with the IBM Rational Unified Process: Lessons From The Trenches. IBM Press, 1 edition, August 2006.

Lightfoot, Chris. Driftnet, 2011. URL http://www.ex-parrot.com/~chris/driftnet/. (Accessed 01/2012).

Meterman. Anti flicker graphics using double buffering and how to make simple graphic moves - CodeProject, 2011. URL http://www.codeproject.com/Articles/7033/Anti-Flicker-Graphics-using-Double-Buffering-and-h. (Accessed 07/2011).

Orebaugh, Angela. Ethereal Packet Sniffing. Syngress, Rockland, MA, 1 edition, April 2004.

Pierzchala, Stephen. Compressing web content with mod_gzip and mod_deflate - Linux Journal, 2004. URL http://www.linuxjournal.com/article/6802. (Accessed 07/2011).

POE. POE: Perl Object Environment, 2011. URL http://poe.perl.org/. (Accessed 02/2011).

SharpPcap. SharpPcap, 2011. URL http://sourceforge.net/projects/sharppcap/. (Accessed 09/2011).

SourceForge.net. SourceForge.net: SharpPcap, 2011. URL http://sourceforge.net/apps/mediawiki/sharppcap/index.php?title=Main_Page. (Accessed 01/2012).

Tanenbaum, Andrew S. Computer Networks. Prentice Hall, 4 edition, August 2002.

Wikipedia-Magic. Magic number (programming) - Wikipedia, the free encyclopedia, 2011. URL http://en.wikipedia.org/wiki/Magic_number_(programming). (Accessed 05/2011).

WinPcap. WinPcap.org: group remote, 2011. URL http://www.winpcap.org/docs/docs_40_2/html/group__remote.html. (Accessed 05/2011).

York. York::Log all network traffic - the SZ, 2011. URL http://thesz.diecru.eu/content/york.php. (Accessed 01/2012).


SE-391 82 Kalmar / SE-351 95 Växjö Tel +46 (0)772-28 80 00 [email protected] Lnu.se/dfm