Reverse Engineering a NIT That Unmasks Tor Users
2016 Annual ADFSL Conference on Digital Forensics, Security and Law
Proceedings, May 25th, 1:00 PM
CDFSL Proceedings 2016, (c) ADFSL

Matthew Miller, University of Nebraska at Kearney
Joshua Stroschein, Dakota State University
Ashley Podhradsky, Dakota State University

Scholarly Commons Citation: Miller, Matthew; Stroschein, Joshua; and Podhradsky, Ashley, "Reverse Engineering a Nit That Unmasks Tor Users" (2016). Annual ADFSL Conference on Digital Forensics, Security and Law. 10. https://commons.erau.edu/adfsl/2016/wednesday/10

This Peer Reviewed Paper is brought to you for free and open access by the Conferences at Scholarly Commons. It has been accepted for inclusion in Annual ADFSL Conference on Digital Forensics, Security and Law by an authorized administrator of Scholarly Commons.

ABSTRACT

This paper is a case study of a forensic investigation of a Network Investigative Technique (NIT) used by the FBI to deanonymize users of a Tor (The Onion Router) Hidden Service. The forensic investigators were hired by the defense to determine how the NIT worked. The defendant was accused of using a browser to access illegal information.
The authors analyzed the source code, binary files and logs that were used by the NIT. The analysis was used to validate that the NIT collected only necessary and legally authorized information. This paper outlines the publicly available case details, how the NIT logged data, and how the NIT utilized a capability in flash to deanonymize a Tor user. The challenges with the investigation and concerns about the NIT will also be discussed.

Keywords: Tor, NIT, deanonymization, Tor Hidden Services, flash

1. INTRODUCTION

The FBI was given access to a group of computers that were running a Tor (The Onion Router) Hidden Service that hosted illegal content. The FBI then requested and received a warrant to investigate the individuals who accessed the illegal content on those servers. When content is accessed via the Tor network [1], the IP address of the computer requesting the content is hidden. The FBI developed a Network Investigative Technique (NIT) that would deanonymize the users of the Tor Hidden Service. For a short period of time, the FBI ran a server that supplied a flash object (NIT) to a user's browser, which in turn would deanonymize those users. The data gathered from the NIT was used as probable cause to acquire search warrants. This story was reported in many outlets [2] [5] [6] [9] [10]. Our team was retained to investigate the NIT to ensure that it only collected the information specified in the original warrant. Our team's original report has been published for public viewing [8].

This investigation is one of the first where law enforcement has actively modified a Tor hidden service website to deanonymize the users. In a traditional website, the Internet Protocol (IP) address of a client is used as a component of the IP networking layer. Thus an IP address is required for a website to operate properly. To thwart authorities, users of illegal content moved to using proxy addresses to access websites. The proxy service makes the connection for the client and masks the identity of the user. To deanonymize proxy users, law enforcement can get a court order to seize and modify the proxy servers. With the advent of Tor, users can bounce their encrypted traffic through the Tor network to achieve anonymity. The movement to Tor has pushed law enforcement to use more technically advanced methods of deanonymization.

2. NIT FRAMEWORK

The NIT consisted of several different systems that worked in concert to deanonymize users of the Tor Hidden Service. This section is broken down into subsections based on the different systems that were analyzed. A high level overview of the NIT is shown in Figure 1. Section 2.1 describes how a Session Identifier (SessionID) was generated and logged. Section 2.2 describes the flash application that makes socket connections. Section 2.3 describes how the FBI decrypted and logged data gathered from DNS queries. Section 2.4 describes how the socket connection was logged. Section 2.5 discusses how the data from the logs on different servers were correlated. Section 2.6 discusses our testing of the reliability and reproducibility of the NIT. Section 2.7 discusses some of the issues related to the log correlation and data validity.

Figure 1. NIT High Level Overview (diagram labels: User's Browser; Flash; TOR Network; TOR Hidden Service Server (FBI Controlled); gallery.swf; Cookie Logger; cornhusker; FBI DNS and Logging Server; numbered steps 1 to 3 covering the hidden service request, the hidden service response that includes the ECID, and the socket connection)

Figure 2. NIT Source Code Overview (diagram boxes: gallery.php, gallery.swf, cornhusker; annotations: Line 15: generate random SID; Lines 18-38: determine method of decloaking; Lines 1680-1692: use GALLERY_API_KEY to encrypt (server, SID, type), which creates the ECID; Line 73: include ECID as id in gallery.swf; Lines 46-48: resolve domain that includes ECID; Line 49: make a socket connection to cornhusker; Lines 29-34: send operating system, architecture, and ECID to cornhusker)

2.1 Server Side Code

The main goal of the NIT was to deanonymize users of a Tor Hidden Service by revealing their public IP address. To accomplish this, the FBI generated an identifier for users of the Tor Hidden Service. This code needed to be dynamic and change for each user of the webpage, and thus the FBI used PHP for the server side scripting. When the user visited one of the pages that was being tracked, a page named gallery.php would be executed. Figure 3 shows one of these pages that included gallery.php.

The server side code in gallery.php created a unique SessionID each time a tracked webpage was loaded. The data flow is shown in Figure 2. This SessionID is a random number generated using the mcrypt_create_iv function with MCRYPT_DEV_URANDOM as its random source, both of which belong to PHP's standard mcrypt extension. The random number generation is shown in Figure 4 on Line 15. After the SessionID was generated, the FBI would log the data, which included IDs for the uri, the discussion board, the thread, and whether the user was a moderator (mod). This is done on Line 16 and is shown in Figure 4. The code for the logging is shown in Figure 5. This code simply logs to a table named 'visitors' in a SQL database.

After the SessionID was created, the FBI's code would determine which decloaking method to use. This is shown in Figure 6. The browser that we tested was the Rekonq browser (as per the logs and FBI report). The Rekonq browser would cause the 3rd case (Lines 33-36) to be executed. The code then delivers the correct method of decloaking based on the variables with the prefix display.
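The SessionID generation and 'visitors' logging described in Section 2.1 can be sketched as follows. This is a minimal illustration in Python, not the FBI's PHP code: the function names, the column names, the 16-byte identifier length, and the use of an in-memory SQLite database are all assumptions made for the example. Only the overall pattern (draw a random identifier from the operating system's random source, then log it with the uri, board, thread and moderator fields to a table named 'visitors') comes from the text.

```python
import secrets
import sqlite3


def new_session_id(num_bytes: int = 16) -> str:
    """Return a random hex identifier (length is an assumed value)."""
    # secrets.token_hex draws from the operating system's CSPRNG,
    # similar in spirit to reading from /dev/urandom.
    return secrets.token_hex(num_bytes)


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical 'visitors' schema."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE visitors ("
        "session_id TEXT PRIMARY KEY, uri_id INTEGER, "
        "board_id INTEGER, thread_id INTEGER, is_mod INTEGER)"
    )
    return db


def log_visit(db, session_id, uri_id, board_id, thread_id, is_mod):
    """Record one page load; a parameterized query avoids SQL injection."""
    db.execute(
        "INSERT INTO visitors "
        "(session_id, uri_id, board_id, thread_id, is_mod) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, uri_id, board_id, thread_id, int(is_mod)),
    )
    db.commit()


if __name__ == "__main__":
    db = open_demo_db()
    sid = new_session_id()
    # The numeric IDs below are made-up illustration values.
    log_visit(db, sid, uri_id=1, board_id=2, thread_id=3, is_mod=False)
    print(db.execute("SELECT * FROM visitors").fetchall())
```

Because a fresh identifier is drawn on every page load, each row in the table corresponds to one visit rather than to one user, which is why the later log correlation steps matter.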
gallery.swf and it would pass an id to the ash The gallery.php page is included in an iframe object. This id is created using the FBI's gen- and it has a size of 1 pixel by 1 pixel. erate_cookie function. The result of the gener- Page 196 c 2016 ADFSL Reverse Engineering a NIT that unmasks TOR users CDFSL Proceedings 2016 Figure 3. Tracked Webpage 1481.html Figure 4. PHP Variables in gallery.php ate_cookie function will be known as the En- 2.2 Flash Application and Socket crypted Session Identier (ECID). The gener- Server ate_cookie function uses the shared key named GALLERY _AP I_KEY , sets method to 'swf' The above section describes how the SessionID and uses the random SessionID that was gener- and ECID are generated. The ECID is passed to ated above. the ash object through the parameter named id. The FBI did not provide the source code for the ash application for our analy- Figure 8 shows the code that was used to gen- sis. This will be discussed in Section 4. To erate the ECID. The data stored in the ECID understand the functionality of the applica- is a commercial at (@) delimited data structure, tion, we reverse engineered the source code and data is terminated with a $. The num- for the ash application using the JPEXS de- ber 2 at the beginning represents which server compiler [7]. The reverse engineered source generated the cookie. There were three dier- is shown in Figure 9. In this ash code ent servers, each one had a dierent hardcoded the function is called shortly after value. The method is 'swf' and the $session_id loadGallery the ash application is loaded by the browser. is the SessionID from above. This data is en- This function loads the ECID on Line 42. If crypted with a random initialization vector (iv) that value is not null, the ash application and using the GALLERY _AP I_KEY .