Security Review of the SHA-1 and MD5 Cryptographic Hash Algorithms

Recent Advances in Automatic Control, Information and Communications
ISBN: 978-960-474-316-2

ROMAN JAŠEK¹, LIBOR SARGA², RADEK BENDA²
¹Department of Informatics and Artificial Intelligence
²Department of Statistics and Quantitative Methods
Tomas Bata University in Zlin
¹Nad Stranemi 4511, 760 05 Zlin
²Mostni 5139, 760 01 Zlin
CZECH REPUBLIC

Abstract: - Database breaches and reverse engineering of hashed passwords have accentuated the lackluster approach of administrators maintaining sensitive data. We therefore evaluate the most commonly utilized hashing algorithms, SHA-1 and MD5, as to their ability to withstand various threat scenarios. A comprehensive literature review will be presented which supports the hypothesis that security considerations when implementing cryptographic hash functions are sidelined in favor of backward-compatible procedures with a provably lower level of resilience in the face of deliberate attempts at obtaining sensitive data by unauthorized third parties. Also described will be a way to improve the current predicament to better protect confidential data using cryptographic salts.

Key-Words: security, hashing, algorithm, sha-1, md5, salt

1 Introduction

Sensitive data protection has been in the focus of security researchers for a long time. While extensive academic coverage focused on analysis of existing and proposed cryptographic hash algorithms exists, corporations and governments are slow to adopt proposed changes due to inertia and compatibility issues, and increased hardware requirements together with deployment costs. When the benefits of these changes are not clearly communicated, the necessity to keep cryptographic systems up-to-date is deprioritized due to lacking technical background.

Website frontends are frequently vulnerable to one or more techniques such as SQL injection, null byte injection, buffer overflow, directory traversal, and uncontrolled format strings. Relying solely on network perimeter security elements should not constitute a basis for leaving critical portions of data storages unencrypted. However, widely used data encryption schemes do not guarantee an adequate level of security. To detect changes in databases counting millions of records, various mathematical fingerprinting techniques titled cryptographic hash functions were devised, which provide a computationally efficient way to generate, store, and manipulate (compare, move, delete) such output data strings with marginal time requirements.

The definition of sensitive data varies. Among legal incentives, the Payment Card Industry Data Security Standard codifies proper handling and storage of financial data [1]. The European Union's Data Protection Directive 95/46/EC was enacted in 1995 [2], and in 2012, planning started for a major reform titled the General Data Protection Regulation to streamline protection and sharing of personally identifiable data of all member states' citizens.

Unless noted, sensitive data will refer to any confidential electronic information a user willingly disclosed which may compromise his electronic identity or integrity if obtained by an unauthorized third party. To preclude such situations, data may be converted to a fixed-size output using a hash function.

Because advances in chip design and transistor integration follow Moore's Law [3], it can be predicted that the computational resources available to any potentially malicious third party will increase over time. Periodic revisions must thus be made as to what cryptographic hash algorithms are sufficient to protect sensitive data, because a parallelizing approach to brute-force and dictionary attacks allows attackers to enumerate billions of combinations per unit of time, rendering the hash scheme ineffective if implemented incorrectly.

The article describes two popular hashing algorithms currently in use: MD5 and SHA-1. While some of their variants have been proven computationally insecure, they are nevertheless still widely deployed as alternatives to comparably more secure schemes for compatibility or legacy reasons.

2 Cryptographic Hash Functions

A cryptographic hash function, commonly abbreviated as a hash, is a "…function, mathematical or otherwise, that takes a variable-length input string (called a pre-image) and converts it to a fixed-length (generally smaller) output string (called a hash value)" [4]. A hash may alternatively be titled a checksum, although a checksum validates homogeneity and consistency of a data block while a hash serves a multitude of functions apart from integrity checking, such as authentication, watermarking, digital signature schemes, or MACs (Message Authentication Codes).

Important properties of any hash are fixed length in bits, irreversibility, and ease of computation. Once the textual input is processed into a digest, no operation is theoretically capable of producing the pre-image. However, as any hash is of fixed size, a brute-force attack can be mounted against it during which all candidate pre-images are converted into hashes and compared to the original fingerprint. If a match is made, the input constitutes either the original pre-image, or a different one which hashes to the same value, a collision. The attack is extremely time- and resource-intensive and was considered impractical when cryptographic hash functions were devised.

Another feature is that a single-bit change in the pre-image changes, on average, about 50% of the bits in the hash, a phenomenon known as the avalanche effect [5]. The avalanche effect allows a simple fingerprint check to reveal whether the data has been tampered with. Hashing is a lossy process; input source information content is not preserved. The functions are thus unusable as a storage option and serve only to ensure the pre-image's validity through comparison.

Hashes have become a widely utilized means of validating any type of data irrespective of contents, with the only requirement being binary input form. Software libraries are usually provided by a database vendor out of the box with the option of purchasing additional packages. However, as most corporations nowadays limit expenditures on information technology, it is not reasonable to assume database management systems (DBMSs) will be enhanced in such a way. Therefore, only the hashing functionality included by default in many instances of DBMSs will be considered: MD5 and SHA-1.

2.1 MD5

The MD5 Message-Digest Algorithm is a 128-bit, four-round function proposed by Ronald Rivest [6]. A successor to MD2 and MD4, it was designed as an industry standard and sanctioned by the Internet Engineering Task Force (IETF) to be a part of the Internet protocol suite. Its digest is represented by a 32-character hexadecimal string.

As with every cryptographic scheme seeing widespread adoption, MD5 was heavily scrutinized by security researchers as well as academia. From the properties of any hash, it was known the function is vulnerable to hash collisions, during which the attacker searches for a pre-image that hashes to the same product. If found, collisions may be exploited to impersonate a legitimate user and invalidate the authentication procedure. It was further shown that it is possible to find a colliding hash after 15 minutes of computations on a supercomputer setup [7], which led to recommendations that MD5 not be used when generating digital certificates.

In 2004, the first practical attack on MD5, its predecessor MD4, as well as several others was successfully performed [8]. Further improvements were made based on these findings. In 2005, a pair of digital certificates compliant with the X.509 Public Key Infrastructure (PKI) standard was produced, proving the attack was feasible in real-world applications [9]. Two modifications of the flaw were demonstrated, allowing identical signatures to be produced on a single node, with the former capable of performing the operation within several hours using a consumer-grade notebook and the latter achieving the same goal within 60 seconds on an equivalent machine [10].

Research was also conducted which tested the propensity to collision attacks in the PKI model. The resulting certificate corroborated that "[it] allows us to impersonate any website on the Internet, including banking and e-commerce sites secured using the HTTPS [Hypertext Transfer Protocol Secure]" [11].

In 2008, the United States Computer Emergency Readiness Team (US-CERT) announced that "[s]oftware developers, Certification Authorities, website owners, and users should avoid using the MD5 algorithm in any capacity. As previous research has demonstrated, it should be considered cryptographically broken and unsuitable for further use" [12]. In 2012, a new attack purportedly demonstrated MD5's susceptibility to single-block collisions, enabling the attacker to forge 64-byte messages with arbitrary hash value [13]. The cryptographic community recommends migration to SHA-1.

2.2 SHA-1

The Secure Hash Algorithm 1 was designed by the United States National Security Agency (NSA) in 1995 as a successor to SHA-0 from 1993. With the 160-bit digest iterated for 80 rounds, it was used for protecting sensitive unclassified information as well as in Internet …

… measurements, heat emissions, cycle counts) is known. Independent of SHA-2, known attack vectors are unusable.

Two theoretical vectors against SHA-3 were proposed: a zero-sum attack applicable to the 9-round reduced version with no effect on the …
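The hash properties discussed in Section 2 (fixed digest length and the avalanche effect) can be observed with a minimal Python sketch built on the standard hashlib module. The helper names hex_digest, differing_bits, and avalanche_ratio, as well as the sample message, are illustrative assumptions and do not come from the reviewed literature; on some hardened systems (for example those running in FIPS mode) MD5 may be unavailable.

```python
import hashlib


def hex_digest(algorithm: str, data: bytes) -> str:
    """Return the hexadecimal digest of data under the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_ratio(algorithm: str, data: bytes) -> float:
    """Flip the lowest bit of the first input byte and return the
    fraction of digest bits that change as a result."""
    flipped = bytes([data[0] ^ 0x01]) + data[1:]
    original = hashlib.new(algorithm, data).digest()
    modified = hashlib.new(algorithm, flipped).digest()
    return differing_bits(original, modified) / (len(original) * 8)


if __name__ == "__main__":
    message = b"sensitive data"  # hypothetical pre-image, for illustration only
    for name in ("md5", "sha1"):
        digest = hex_digest(name, message)
        # Each hexadecimal character encodes 4 bits of the digest.
        print(name, "digest:", len(digest), "hex characters =", len(digest) * 4, "bits")
        print(name, "fraction of digest bits changed by a 1-bit flip:",
              round(avalanche_ratio(name, message), 3))
```

The digest length stays constant regardless of input size (32 hexadecimal characters for MD5, 40 for SHA-1), while the fraction of changed bits typically lands near one half, which is why a single fingerprint comparison suffices to detect modification.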

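The brute-force attack described in Section 2, in which candidate pre-images are hashed and compared with the original fingerprint, can be sketched as follows. This is a toy illustration under stated assumptions: the function name brute_force, the lowercase-only alphabet, and the four-character length limit are hypothetical choices made here for brevity, not parameters taken from the article or from any real attack tool.

```python
import hashlib
from itertools import product
from string import ascii_lowercase
from typing import Optional


def brute_force(target_hex: str, algorithm: str = "md5",
                max_length: int = 4) -> Optional[str]:
    """Enumerate lowercase candidates of length 1..max_length and return
    the first one whose digest equals target_hex, or None if none matches.

    A returned candidate is either the original pre-image or a different
    input that happens to hash to the same value (a collision).
    """
    for length in range(1, max_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            candidate = "".join(letters)
            digest = hashlib.new(algorithm, candidate.encode("utf-8")).hexdigest()
            if digest == target_hex:
                return candidate
    return None


if __name__ == "__main__":
    # Hypothetical leaked fingerprint of a short, unsalted password.
    leaked = hashlib.md5(b"abc").hexdigest()
    print("recovered candidate:", brute_force(leaked))
```

The search space grows exponentially with candidate length and alphabet size, which is why the attack was once considered impractical; parallel hardware shifts that boundary, so short or predictable inputs offer little protection when hashed without further measures.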