
2018 IEEE 39th Sarnoff Symposium

Modern Network Security Practices: Using Rainbow Tables to Solve Organizational Issues

Christopher McMahon
Computer Science Dept., College of Staten Island, CUNY
Staten Island, NY 11314, U.S.A.
Email: [email protected]

Xiaowen Zhang
Computer Science Dept., College of Staten Island, CUNY
Staten Island, NY 11314, U.S.A.
Email: [email protected]

Abstract: The purpose of this case study analysis is to examine a non-traditional method of identifying weak passwords within a large hospital organization. The process of using rainbow tables to crack passwords and ensure compliance is discussed, and specific examples are provided within this paper. This process emphasizes the notion that network security-related problems tend to be organization-specific and require creative approaches. The goal is to establish a practical use for rainbow tables within an organization as a means of enhancing network security.

Keywords: password crack; time-memory trade-off; network security; hash function; reduction function.

I. INTRODUCTION

A major issue facing network security teams of large organizations is ensuring that organization members are compliant with network security procedures. Password complexity is a basic but critical component of establishing a secure network. However, keeping track of members whose passwords do not meet complexity requirements can be a challenging task, depending on the circumstances.

This was a recent problem at the large North American hospital discussed in this paper. Our network security team is unable to mandate regular password changes because of the large, diverse population of close to 12,000 users. Many users in patient care never log in to a computer directly, logging in only to their applications, and many users rarely check their e-mail accounts. Additionally, password complexity did not become a requirement until 2010. Since passwords are stored as hashes of constant length, it is not possible to determine easily from Active Directory whether a password meets the complexity requirements, resulting in possibly thousands of noncompliant passwords.

To address this issue, the idea was proposed to use rainbow tables to identify which passwords were noncompliant. A rainbow table is an application of Hellman's Time-Memory Trade-Off (TMTO) attack. Because it is an attack method used for password cracking, it is typically used for illicit purposes and very rarely utilized in an organizational environment. In this scenario, it would allow the network security team to identify only those passwords that did not meet complexity requirements, so that a much smaller subset of users would be required to change their passwords.

The rest of the paper is organized as follows. In Section II, we briefly introduce some preliminary background on the time-memory trade-off attack, reduction and hash functions, and rainbow tables. In Section III, we describe the four-step method used to crack passwords. We show experimental results in Section IV and conclude the paper in Section V.

II. TMTO AND RAINBOW TABLES

To understand how rainbow tables work, we must first discuss Hellman's Time-Memory Trade-Off (TMTO), which is the basis for how rainbow tables function.

A. Hellman's Time-Memory Trade-Off (TMTO) Attack

Assume $f$ is a random function (permutation) $f : \{1, 2, \ldots, N\} \to \{1, 2, \ldots, N\}$ such that $f$ has a huge cycle covering all $N$ values. Let $f_i(x) = f(x) \oplus i$ be a small tweak of $f(x)$, $i = 1, 2, \ldots, t$, with $t = \sqrt[3]{N}$. Hellman's TMTO attack [1], [2], [3] consists of two phases:

Pre-computation phase:
For each of the $t$ functions $f_i$, choose $m$ random start points (SPs), where $m = \sqrt[3]{N}$, compute chains of length $t$, and store the $m$ value pairs (end point, start point) in a table, i.e., each table contains $m$ pairs of (EP, SP). Each (EP, SP) pair represents a chain that renders $t$ values, because EP is the value after $t$ iterations of $f_i$ with SP as the start point. Thus each table covers $mt$ values. Memory-wise, each table takes $m$ entries (blocks) of space.

There are $t$ functions $f_i$, therefore we build $t$ tables. Because we want to cover the entire space of $O(N)$ values with all tables, we have $mt \times t = mt^2 = N$.

Because each table has $m$ entries (blocks) and there are $t$ such tables, the total memory used is $M = m \times t = mt$.

On-line phase:
We try to compute $f_i^j(y)$ for every $i = 1, \ldots, t$ and $j = 1, \ldots, t$ until one of the end points is hit. Then we use the corresponding SP to find the predecessor $x$ of $y$ such that $y = f(x)$. Therefore the number of operations is $t^2$, which gives the time complexity $T = t^2$.

From the pre-computation phase, we know that $mt^2 = N$, so $(mt^2)^2 = N^2$, i.e., $(mt)^2 t^2 = N^2$. Because

978-1-5386-6154-3/18/$31.00 ©2018 IEEE

the memory requirement is $M = mt$ and the time requirement is $T = t^2$, we have $TM^2 = N^2$. A common point on the curve is $M = T = N^{2/3}$. It can be verified as $N^{2/3}(N^{2/3})^2 = N^{2/3} N^{4/3} = N^{6/3} = N^2$.

Therefore, Hellman's TMTO attack needs memory $M = N^{2/3}$ and time $T = N^{2/3}$. This is a dramatic improvement over exhaustive search, in which pre-computation $= 0$ and memory $M = 0$, but time $T = N$. It also requires less memory than table/dictionary search, in which pre-computation $= N$, memory $M = N$, and time $T = 1$. Hellman's TMTO therefore trades memory for time.

B. Reduction Function

Why do we need a reduction function? In order to build rainbow tables, we have to create an iterative function $f$ whose domain size equals its range size. For a real hash function $H$, however, there is a size discrepancy between its domain and range spaces. Suppose the hash function $H$ uses the Message Digest 5 algorithm (MD5), and further suppose the domain space is all possible passwords $P$ of 8 American Standard Code for Information Interchange (ASCII) characters (with each ASCII character encoded in 8 bits), and the range of $H$ is 128-bit hexadecimal values. The domain size is $2^{64} = 1.84 \times 10^{19}$, but the range size is $2^{128} = 3.40 \times 10^{38}$. There is a huge difference.

We need to define a reduction function $R$ that maps a 128-bit hash value in $H$ back to a 64-bit value in $P$. After that we can apply the hash function $H$ again to keep the iteration going: $p_i \xrightarrow{H(p_i)} h_i \xrightarrow{R(h_i)} p_{i+1}$, see Figure 1 for an illustration. Put together, we define $f(p_i) = R(H(p_i))$, so that $f$ has the same domain and range spaces (both 64-bit). We can therefore iterate $f$ from one password $p_i$ to generate the next password $p_{i+1}$, then apply $f$ again to generate another password $p_{i+2}$. As an equation, this is written $p_i \xrightarrow{f} p_{i+1} \xrightarrow{f} p_{i+2} \xrightarrow{f} \cdots$. Given an initial password $p_1$, we iterate $f$ $t$ times to get a $(p_t, p_1)$ pair, which is the (EP, SP) pair stored in the table.

Figure 1: Reduction function

Under the above assumption, we can define a reduction function $R$ as XORing the left 64 bits of the hash value with the right 64 bits to get a 64-bit output, which lies in the $P$ space. As an example, let $p_1$ = "Z8&6dh$n", with MD5 hash value
$h_1$ = MD5($p_1$) = 9bef715e662cc300796c1cfefd4f8913. Then
$R(h_1)$ = 9bef715e662cc300 ⊕ 796c1cfefd4f8913 = e2836da09b634a13 → 62236d203b634a33 = "b#m ;cJ3" = $p_2$.
Note: for every byte generated by $R(h_1)$, we set the 7th bit (i.e., the most significant bit) to 0; furthermore, if both the 6th and 5th bits are 0, we set the 5th bit to 1.

C. Hash Function

We also need to understand the hash functions used to store users' passwords. LM (LAN Manager) hash is an outdated password hashing method, developed by Microsoft in cooperation with 3Com Corporation, that is considered particularly weak. It is based on the Data Encryption Standard (DES) and is no longer commonly used.

NTLM (NT LAN Manager) is the successor to LM and is a suite of multiple protocols, developed solely by Microsoft. Though it is not recommended as an authentication protocol, it is still widely used to maintain compatibility for older systems and has been included in Kerberos, which is currently the Microsoft recommended authentication protocol. NTLMv2, which is the most common NTLM protocol, uses the HMAC-MD5 authentication code. This code uses the MD5 hash algorithm, which is the algorithm that was used in the password cracking project described in this paper. The full text of the MD5 hash algorithm can be found in RFC 1321 [4].

D. Rainbow Table

A rainbow table [5] is a type of hash lookup table, built using TMTO, that is generated to reverse cryptographic hash functions as a means to crack password hashes. It differs from a standard hash lookup table in that it requires more processing time per hash lookup but uses much less storage. Standard hash tables (see Table I for an example) can grow to be very large, as they are essentially a list of all possible passwords in a space and their corresponding hashes.

Rainbow tables approach this problem by constructing chains that alternate hash and reduction functions (see Figure 2 for an example). In each chain, everything is then thrown away except for the first input and the last hash. When performing a hash lookup, these chains are regenerated until the hash is found. This greatly improves storage efficiency, but more processing power is required to perform the hash lookup.

Table I: Standard Hash Lookup Table Example

Plain Text Password    MD5 Hash Value
1234                   81dc9bdb52d04dc20036dbd8313ed055
abc123                 e99a18c428cb38d5f260853678922e03
elephants              b240a1eeb3453e0b08a2f86b3d025b17
qwerty                 d8578edf8458ce06fbc5bb76a58c5ca4
Password               dc647eb65e6711e155375218212b3964
Pa$$w0rd               3cc31cd246149aec68079241e71e98f6
RainbowTables          808d982e6312d229c5ccbf2c78ee443b

Figure 2: Rainbow Chain Example

Table II: rtgen command options

hash_algorithm      A rainbow table is hash algorithm specific. A rainbow table for a certain hash algorithm only helps to crack hashes of that type. rtgen supports lm, ntlm, md5, sha1, mysqlsha1, halflmchall, ntlmchall, oracle-SYSTEM and md5-half. Here we generate md5 rainbow tables.
charset             All possible characters for the plaintext. "loweralpha-numeric" stands for "abcdefghijklmnopqrstuvwxyz0123456789", which is defined in the configuration file charset.txt.
plaintext_len_min,
plaintext_len_max   These limit the plaintext length range of the rainbow table. Here the length range is 1 to 7, so plaintexts "1" and "1234567" are likely contained in the generated rainbow table, while "12345678" is not.
table_index         Selects the reduction function. Rainbow tables with different table_index parameters use different reduction functions.
chain_len           A longer rainbow chain stores more plaintexts and requires a longer time to generate.
chain_num           Number of rainbow chains to generate. A rainbow table is simply an array of rainbow chains. The size of each rainbow chain is 16 bytes.
part_index          Allows a large rainbow table to be stored in many smaller files. Use a different number in this parameter for each part and keep all other parameters identical.

Size comparison

A set of 10 MD5 rainbow tables that has 99.9% accuracy for passwords that contain all alphanumeric characters and have a length of 7 characters has a total size of approximately 43GB, with a smaller total size as more tables are added.¹

In comparison, a theoretical MD5 hash lookup table with the same password criteria would contain approximately 3.5 trillion entries:

$3.5 \times 10^{12}$ entries $\times$ (7 B per password + 16 B per hash) $= 8.05 \times 10^{13}$ bytes $\approx$ 80TB
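The storage figure above, and the reason a chain needs so much less space, can be illustrated with a minimal Python sketch. It assumes the XOR-and-mask reduction function from the worked example in Section II-B and uses the standard hashlib module for MD5. The names reduce_hash, chain_end and lookup_table_bytes are illustrative only and are not part of RainbowCrack, whose own reduction functions (selected by table_index) are different.

```python
import hashlib


def reduce_hash(digest: bytes) -> str:
    """Map a 128-bit digest back to an 8-character candidate password.

    Follows the worked example in Section II-B: XOR the left and right
    64-bit halves, clear bit 7 of every byte, and set bit 5 when bits 6
    and 5 are both 0, so that every byte lies in 0x20..0x7f.
    """
    if len(digest) != 16:
        raise ValueError("expected a 128-bit (16-byte) digest")
    out = bytearray(a ^ b for a, b in zip(digest[:8], digest[8:]))
    for i, byte in enumerate(out):
        byte &= 0x7F            # clear the most significant bit
        if byte & 0x60 == 0:    # bits 6 and 5 are both 0
            byte |= 0x20        # set bit 5
        out[i] = byte
    return out.decode("ascii")


def chain_end(start: str, length: int) -> str:
    """Apply f = R(H(p)) `length` times; only (end, start) would be stored."""
    p = start
    for _ in range(length):
        p = reduce_hash(hashlib.md5(p.encode("ascii")).digest())
    return p


def lookup_table_bytes(entries: int, password_len: int, hash_len: int) -> int:
    """Size of a full lookup table that stores every (password, hash) pair."""
    return entries * (password_len + hash_len)


if __name__ == "__main__":
    # Digest value taken from the worked example in Section II-B.
    h1 = bytes.fromhex("9bef715e662cc300796c1cfefd4f8913")
    print(repr(reduce_hash(h1)))
    # A toy chain: 1000 passwords are visited, but only two are kept.
    print(repr(chain_end("Z8&6dh$n", 1000)))
    # Full lookup table from the size comparison: 7-byte passwords, 16-byte hashes.
    print(lookup_table_bytes(3_500_000_000_000, 7, 16), "bytes")
```

The sketch iterates a single function f, as in Section II-B. A chain of length t stands in for t table rows while storing only its two endpoints, which is where the storage saving comes from; the full lookup table computed by the last call is the one discussed in the size comparison.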
Storing this table is obviously not a realistic option. Therefore rainbow tables, though they require more processing power, are a much more viable solution for hash lookups.

III. METHOD

The objective in our case was to crack passwords that were seven characters in length or less and did not contain any special characters. Given that Active Directory in our organization stores hashes using the NTLM algorithm, our team needed to generate rainbow tables for this specific hash type. Next, we needed to extract a table of all user accounts and corresponding NTLM password hashes. Our final step in the process was to compare the list of extracted hashes against our generated rainbow tables.

The product that we used to generate the rainbow tables is called RainbowCrack.² It is a suite of tools for the generation, sorting, and merging of rainbow tables, together with a lookup tool for passwords inside the rainbow tables. Given the functionality of this suite, it was ideal for our project because it allowed us to create batch files automating some of the processes that, for our needs, took about 120 hours. This allowed us to run these processes 24 hours a day without supervision, so that our objectives could be accomplished much faster. The process consists of the following four steps:

1) Generating the tables with rtgen. For this project, we generated 50 tables of all passwords 7 characters or less.
2) Sorting the tables using rtsort. This sorts each table by the end point of each rainbow chain to make binary search possible.
3) Extracting all user accounts and password hashes from Active Directory. This was accomplished with a PowerShell script that uses the DSInternals PowerShell module.
4) Running a comparison of the extracted password hashes against the generated rainbow tables using rcrack.

Step-1: Generating the tables with rtgen

Generating the tables uses an application within RainbowCrack called rtgen. Prior to generating the tables, it is important to understand the data required by the application. Refer to Table II, which gives an in-depth explanation from RainbowCrack of each component. The syntax for rtgen is as follows:

rtgen hash_algorithm charset plaintext_len_min plaintext_len_max table_index chain_len chain_num part_index

and the command line options³ are explained in Table II.

To meet our objective, we created a batch script to automate the process of generating the set of rainbow tables. The following 50 command lines were used. The script took approximately 80 hours to complete and, once finished, had generated 50 files that were each a distinct table of the set.

¹ https://tobtu.com/rtcalc.php
² http://project-rainbowcrack.com
³ http://project-rainbowcrack.com/generate.htm
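Because the 50 invocations differ only in their final part_index argument, the batch file contents can be produced programmatically. The following is a minimal sketch in Python, not the batch script used in the project; the function name rtgen_commands and its parameter names are illustrative, with default values taken from the command lines used in this paper.

```python
def rtgen_commands(hash_algorithm="ntlm", charset="mixalpha-numeric",
                   len_min=1, len_max=7, table_index=0,
                   chain_len=10000, chain_num=53124803, parts=50):
    """Return one rtgen command line per part_index, zero-padded to 2 digits."""
    return [
        f"rtgen {hash_algorithm} {charset} {len_min} {len_max} "
        f"{table_index} {chain_len} {chain_num} {part:02d}"
        for part in range(parts)
    ]


if __name__ == "__main__":
    # Print the lines so they can be redirected into a batch file.
    for line in rtgen_commands():
        print(line)
```

Keeping every argument except part_index fixed matches the description of part_index in Table II.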

rtgen ntlm mixalpha-numeric 1 7 0 10000 53124803 00
rtgen ntlm mixalpha-numeric 1 7 0 10000 53124803 01
...
rtgen ntlm mixalpha-numeric 1 7 0 10000 53124803 49

Step-2: Sorting the tables using rtsort

The next step in preparing the rainbow tables for our project was to sort them. Each table contains an array of rainbow chains, and each rainbow chain has a start point and an end point. The rainbow chains must be sorted by end point to make binary search possible. This is done with the rtsort application in the RainbowCrack suite, by running the command

rtsort .

from the directory that contains the rainbow tables. Once sorted, the full set of rainbow tables for our organization totaled 37.9GB.

Step-3: Extracting all user accounts & password hashes

With our rainbow tables ready to be used, we then had to extract all user accounts and password hashes from Active Directory to be used as the target for the password lookup. This was accomplished with a PowerShell script that uses the DSInternals PowerShell module (https://github.com/MichaelGrafnetter/DSInternals). The following code is an abbreviated version of the script that was used. It writes all account names and password hashes contained in the directory specified by the user to a text file titled hashreport.txt.

# Requires the ActiveDirectory and DSInternals PowerShell modules
Import-Module ActiveDirectory, DSInternals

# hashreport.txt will be the finalized report,
# hashtemp.txt will be used for parsing
$reportFile = "C:\temp\hashreport.txt"
$tempFile = "C:\temp\hashtemp.txt"

# Specify AD information (placeholder: location of users in Active Directory)
$sam = @(Get-ADUser -Filter * -SearchBase "(Location of users in Active Directory)" | Select-Object SamAccountName)

# This is the report header
"SamAccountName,NTHash,LMHash" > $reportFile

# Variables used by the while loop to run as many times as
# the number of account names stored
$count = 0
$samcount = $sam.Count - 1

# Loop gets the SamAccountName for that iteration, stores the account
# details in hashtemp.txt, parses out the NTHash and the LMHash, and
# writes the results to hashreport.txt before moving to the next iteration
While ($count -le $samcount) {
    Write-Host $sam[$count].SamAccountName

    # Remove the scratch file left over from the previous iteration
    If (Test-Path $tempFile) {
        Remove-Item $tempFile
    }

    # Placeholders: replace "(Domain Name)" and "(Domain Controller)"
    $hash = Get-ADReplAccount -SamAccountName $sam[$count].SamAccountName -Domain "(Domain Name)" -Server "(Domain Controller)" -Protocol TCP
    $hash > $tempFile
    $hashtext = @(Get-Content $tempFile)

    $linetrim = ""
    $linetrim2 = ""
    $hashcount = 0
    While ($hashcount -lt $hashtext.Count) {
        # Strip the label and surrounding whitespace, keeping only the hash
        If ($hashtext[$hashcount] -like '*NTHash:*') {
            $linetrim = $hashtext[$hashcount] -replace '^\s*NTHash:\s*', ''
        }
        If ($hashtext[$hashcount] -like '*LMHash:*') {
            $linetrim2 = $hashtext[$hashcount] -replace '^\s*LMHash:\s*', ''
        }
        $hashcount++
    }

    $sam[$count].SamAccountName + "," + $linetrim + "," + $linetrim2 >> $reportFile
    $count++
}

Step-4: Running a comparison

The final step in the process of cracking password hashes for our organization was to run a comparison of the extracted password hashes against our rainbow tables. This was achieved with another application from the RainbowCrack suite, called rcrack. The application includes a graphical user interface that is simple to use. Once the password hashes are loaded and the rainbow tables are specified, the hash cracking starts automatically. The amount of time required to complete the hash lookup process depends on the number of password hashes loaded. For our project, this took approximately 10 hours to complete. After this process, the users' passwords that did not meet our complexity and length requirements had been recovered as plaintext passwords.

IV. RESULTS

The results of the password cracking project are shown in Table III. Additionally, a detailed breakdown of the

passwords discovered is shown in Tables IV and V.

Table III: Results

Description                                      Count
Total number of password hashes                  11980
Unique password hashes                           10995
Total number of password hashes cracked          2249
Unique password hashes cracked                   1273
Percentage of total password hashes cracked      18.77%
Percentage of unique password hashes cracked     11.57%

Table IV: Breakdown by Password Length

Description                          Count
Passwords containing 7 characters    1961
Passwords containing 6 characters    136
Passwords containing 5 characters    131
Passwords containing 4 characters    20
Passwords containing 3 characters    1

Table V: Additional Account Information

Description                                                                                            # of accounts
Unused accounts using the same temporary password "Summer1"                                            973 (1 unique hash)
Multiple accounts of varying privileges belonging to the same user using a cracked password hash       8 (4 unique hashes)
Multiple accounts of varying privileges belonging to the same user using an uncracked password hash    18 (9 unique hashes)

The total number of password hashes is 11980. Of these, 2249 passwords (18.77%) were cracked, while 9731 passwords (81.23%) were not.

With the results of our project, we were able to implement new changes and procedures in our organization. New password complexity requirements were put in place, now requiring a minimum of eight characters. Once this was in place, a list was compiled of all accounts that had been compromised. This list was then sent to the I.T. help desk, which was instructed to set temporary passwords for every user on the list and then contact each user to have them set a new password that complies with the new complexity requirements.

The results also led to changes that had not been foreseen before this project was started. It became apparent that the temporary password ("Summer1") that the help desk had been using when creating new accounts was not strong enough. A script was then run to take the list of accounts using this temporary password and change it to a much stronger temporary password consisting of 16 characters and utilizing all 4 character types. The other unexpected finding was that numerous users with multiple accounts (for administration purposes) had been using the same password for all of their accounts. Even where we had not been able to crack their password hash, having a matching password hash made this obvious. This is a security risk because if their standard account credentials were ever stolen, their administrator accounts could also be compromised. Fine-grained password policies were then put in place to ensure much stronger complexity on administrator-level accounts, and these users were then forced to change their passwords.

V. CONCLUSIONS

By utilizing rainbow tables to address this issue, our team was able to identify better practices for ensuring password complexity and protecting user accounts. In addition to enhancing our password complexity requirements, our organization has also enhanced the communication and encryption protocols used inside our user domains. We now require the Kerberos authentication protocol and no longer store LM password hashes, which use an outdated hash algorithm that makes them easier to crack. The efficacy of our new procedures is reflected in the fact that attempting to use rainbow tables in our current environment would prove to be much more difficult. Our specific method of using rainbow tables has allowed us to solve our issue and has given us a unique perspective on the weaknesses within our user domains. As is evident from the results of this case study, effective network security administration sometimes requires unconventional techniques and strong problem-solving skills to resolve challenging organizational issues.

REFERENCES

[1] O. Dunkelman. Hash functions – generic attacks, Cryptanalysis of Hash Functions Seminar, University of Haifa, 2012.
[2] M. Hellman. A cryptanalytic time memory trade off. IEEE Transactions on Information Theory, IT-26, 1980, pp. 401-406.
[3] P. Oechslin. Making a Faster Cryptanalytic Time-Memory Trade-Off. Proceedings of CRYPTO 2003, LNCS 2729, pp. 617-630.
[4] R. Rivest. The MD5 Message-Digest Algorithm, Internet Engineering Task Force, April 1992, available at https://tools.ietf.org/html/rfc1321
[5] S. Thomas. Advanced RT Calculator, 2017, available at https://tobtu.com/rtcalc.php