6.857 Computer and Network Security Fall Term, 1997
Lecture 9 : Octob er 2, 1997
Lecturer: Ron Rivest Scribe: Jesse Kornblum
Topics
Hashing in Digital Signatures
Hashing Security
MD5
BirthdayAttacks
SHA
CvHP
MACs and Hashing
1 Digital Signatures
When messages get very long, encrypting them to use as a digital signature with
algorithms such as RSA can be very exp ensive. We'd like to sign our messages
with something much shorter, a \ ngerprint" of the message, also called a \message
digest."
To create the ngerprint, we use a function ngerprint function, MD-function, cryp-
tographic hash function, etc. that compresses messages of arbitrary length into a
preset set. This function, hx, normally pro duces a ngerprint either 128 or 160 bits
long. This shorter value is much easier to sign than the full message. Our goal is
that:
The function h is ecient
The length of hx is short
The function hx is secure i.e. it has strong collision-resistance, de ned b elow 1
2 3 MESSAGE DIGEST 5 MD5
2 Hashing Security
The security of hash functions is measured by the existence of col lisions. A collision
is when two di erent messages pro duce the same hash. i.e. hm = hm Here
1 2
are some terms used in describing the security hash functions:
0
Strong Collision-Resistance: It is infeasible for an attacker to nd x and x forming
0
a collision. i.e. hx=hx
0
Weak Collision-Resistance: Given an x and hx, it is infeasible to nd x such
0 0
that hx = hx . This implies that given hx, it is infeasible to nd any x such
0
that hx=hx. i.e. That h is a one-way function.
3 Message Digest 5 MD5
MD5 is a hashing function develop ed by Ron Rivest. Amuch more detailed descrip-
1
tion can b e found in handout 8 or RFC 1321.
MD5 works by rst padding the message until it is a multiple of 512 bits long. Padding
is done as follows:
1. App end a '1' bit to the message.
2. App end '0' bits until the message is 64 bits shorter than amultiple of 512 bits
3. App end a 64-bit representation of the message's original length
The state of MD5 is kept in four 32-bit words, A, B , C , and D , all of which are
initialized to magic constant values. MD5 pro cesses the message in 512-bit blo cks.
As we pro cess the ith blo ck of message, we up date A , B , C , and D to A ,
i 1 i 1 i 1 i 1 i
B , C , and D . The output of MD5, a 128 bit value, is the nal state of A, B , C ,
i i i
and D concatenated.
For each blo ck of message, we have four rounds of up dates. Each round up dates one
of the four 32-bit words A, B , C ,orD four times. For a total of sixteen up dates p er
blo ck of message. Initially on each round, A A , B B , etc. Each of the
i i 1 i i 1
up dates is something similar to A B +A +F B ;C ;D +M +T <<< s, where
i i i i i i i i
1
Which actually comes from Applied Cryptography byB.Schneier
3
2
F is a function, M is the ith blo ck of the message, and T and s are magic constants.
i i
The symbol <<< means \rotate left". At the end of each round, we nish by
up dating all of the values one last time, namely: A A + A , B B + B ,
i i i 1 i i i 1
etc.
4 Secure Hash Algoritm SHA
SHA is a hash function develop ed by the US government. It's similar to MD5, but
uses ve 32-bit words and ve rounds per message blo ck to pro duce a 160-bit hash,
using linear transformations of the message in each blo ck.
5 Birthday Attacks
A birthday attack on a hash function attempts to use the birthday paradox to nd
collisions in a hash function. i.e. creating random messages, taking their hash
value, and checking if that hash value has b een encountered b efore. For MD5, as an
64
example, an attacker could exp ect to nd collisions after trying 2 messages. Given
to day's computing power, this is a dicult, but not imp ossible problem. Hence the
move to stronger hashing algorithms, like SHA.
6 CvHP
Yet another hash function is CvHP. It's full name is the Chaum-vanHeijst-P tzmann
hash function. First, we need to cho ose prime numb ers p and q such that p =2q+1.
q
Next, we need an and a that are b oth an element of order q i.e. 1 mo d p
and that 6= . For security, it must b e dicult to compute log .
m
1
The hash function splits the message into two parts m and m . hm ;m =
1 2 1 2
m
2
modp. It is necessary that 1 jm j;jm j q.
1 2
The heart of CvHP is that it's infeasible to nd and . As an example, let's say
that we have a collision:
m m m m
1 2 3 4
= modp
2
As an example, F a; b; c= a^b_a ^ c
4 7 MACS AND HASH FUNCTIONS
We can transform this to b e
m m m m
1 3 4 2
= modp
If we set r = m m and s = m m , we get
1 3 4 2
r s
= modp