Internet Security Protocols


Survey of Internet Security Protocols

Kent Tse

School of Computer Science, McGill University, Montreal, Quebec, July 1997

A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES AND RESEARCH IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE DEGREE OF MASTER OF SCIENCE.

Copyright © Kent Tse, 1997

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Acknowledgements

I would like to acknowledge the School of Computer Science, the administrative staff, and the technical support staff for their assistance in all aspects of my research. I would like to thank my supervisor, David Avis, for assisting me in my thesis. His constructive comments and suggestions made this thesis possible. Special thanks to Bibi Ali for understanding, encouragement, support and giving me time to work at my own pace; Luc Boulianne for all his inspiration and guidance to aid in the research and completion of this work; Matthew Sams and the rest of the technical support staff for picking up the extra work that I may have caused by changing my work schedule to accommodate my research. The efforts of Jacques Desmarais for his translation work and of Sandro Mazzucato for his concern and encouragement during my work on my thesis are greatly appreciated. Finally, I would like to thank all my proofreaders.

Abstract

The Internet is becoming a widely used medium for communications and electronic commerce. However, in the current state of the Internet, user applications need to be installed to ensure that these kinds of communications are kept confidential and private. The long-term solution is to use IPv6, the new Internet Protocol, which incorporates authentication and encryption in the lower part of the network layers, but until then there are higher-level protocols such as SSL and SSH, to name a few. In this thesis, a general problem that can be used to "hack" into computer systems is outlined. This is usually a first step before packet sniffing is used to collect more passwords from other communications. The IPv6 Authentication Header and Encapsulating Security Payload are introduced as the long-term solution to combat both of the above problems. Finally, two current protocols are described that can authenticate and encrypt packets during communications on the Internet over TCP/IP without the need for the security features in IP (versions 4 and 6).


Contents

1 Introduction
2 The Internet
  2.1 TCP/IP Protocol Suite
    2.1.1 The TCP Level
    2.1.2 The IP Level
  2.2 The Applications Layer
  2.3 UDP and ICMP
3 Cryptography
  3.1 Introduction
    3.1.1 Basic Terminology
    3.1.2 Basic Cryptographic Algorithms
    3.1.3 Digital Signatures
    3.1.4 Strength of Cryptosystems and Attacks on Cryptosystems
  3.2 Block Cipher Modes
  3.3 Some Block Algorithms
  3.4 Data Encryption Standard (DES)
    3.4.1 RC2 and RC4
    3.4.2 IDEA
  3.5 One Way Hash Functions
    3.5.1 MD5
  3.6 Public-Key Algorithms
    3.6.1 Diffie-Hellman
    3.6.2 RSA
4 IP Spoofing
  4.1 Introduction
  4.2 Passive (Blind) IP Spoofing
    4.2.1 TCP Connection and Sequence Numbers
    4.2.2 Trusted Host and the SYN Attack
    4.2.3 IP-Spoofing Example
  4.3 Active IP Spoofing
    4.3.1 Desynchronization and the Attack
  4.4 Preventative Measures
5 IP Layer Security Architecture
  5.1 Security Associations
  5.2 The IP Authentication Header
  5.3 The IP Encapsulating Security Payload
  5.4 Key Management
  5.5 Combination of Security Mechanisms
6 Weaknesses in the IP Layer Security Architecture
  6.1 Introduction
  6.2 The Attacks
    6.2.1 Reading Encrypted Data
    6.2.2 Session Hijacking
  6.3 Preventative Measures
7 Secure Sockets Layer (SSL)
  7.1 Introduction
  7.2 The Secure Sockets Layer Protocol
    7.2.1 SSL Record Layer
    7.2.2 SSL Handshake Protocol
8 The Secure Shell (SSH)
  8.1 Introduction
  8.2 The Secure Shell Protocol
    8.2.1 TCP/IP Port Number and Some Options
    8.2.2 Protocol Version Identification
    8.2.3 The Binary Packet Protocol
    8.2.4 Key Exchange and Server Host Authentication
    8.2.5 Packet Encryption
    8.2.6 Authentication
    8.2.7 Data Exchange
9 Conclusion
A The Internet Protocol
  A.1 IP Versions 4 and 6
  A.2 IPv4 Datagram Format
    A.2.1 IPv4 Options
  A.3 IPv6 Datagram Format
    A.3.1 IPv6 Extensions
    A.3.2 IPv6 Security

List of Figures

Conceptual Organization of TCP/IP Layers
TCP Header
A Typical Mail Message
SMTP Protocol Conversation
RSH Protocol Conversation
Round i of DES
TCP 3-way Handshake
TCP 3-way Spoofed Handshake
RSH Protocol Conversation
TCP ACK Storm
Security Gateway and the Trusted Subnetwork
Virtual Private Network
Example of an Authenticated IPv6 Header
IP Authentication Header Format
IPv6 Datagram using ESP
IP Encapsulating Security Payload Format
Cut and Paste Attack
Session Hijacking Attack
Secure Sockets Layer
SSH Binary Packet Format
Conceptual Organization of TCP/IP Layers
IP Datagram
IPv4 Datagram Format
Subfields of the Service Type in an IPv4 Datagram
Option Code Octet
IPv6 Datagram Format

List of Tables

Major Differences Between IPsec, SSL and SSH
SSL Error Alerts
SSH Packet Types
SSH Packet Types (cont)
SSH Packet Types (cont)
SSH Encryption Types
SSH Authentication Methods
IP Options by Class and Number

1 Introduction

A common problem on the current Internet is the growing number of intrusions being reported by the media. A common technique used to break into computers is to get trusted access to a computer and then use well-known security holes to "hack" the root account. Once the super-user account is attained, packet sniffers can be set up to get more passwords, as most communications are sent across the wire with no encryption. The users do not have any protection against this kind of attack because the first flaw lies in the Internet Protocol, over which they have no control, and their communications are sent in the clear by current application programs.

The current version of the Internet Protocol (IPv4) is being replaced with a new version for the next generation (IPng, which will be version IPv6). This new version of the protocol includes packet-level authentication and encryption (which will be required), which will solve many of the current security problems and outdate several security-enhancing tools for IPv4 (where the security headers are optional and are currently unused).

The use of firewalls has been increasing as the Internet and the use of electronic commerce grow. Computer networks and computer systems have become potential targets. Many intrusions are merely teenage pranks or fun and games, but recently these intrusions have become more sinister. They compromise system integrity by rendering computer systems and networks inoperable, destroying or altering data, replacing system software with versions that have "trap doors" in place, and enabling espionage (copying proprietary information). IPv6 will reduce the need for firewalls and/or change their role on the Internet, but will add cost and reduce the performance of computer systems. This penalty should decrease in time with improvements in software and hardware and a more secure infrastructure.

Confidentiality, data integrity, and protection from traffic analysis are not currently provided under IPv4. However, there are several existing tools which can be installed for this use. These tools reside in the transport and application layers and they provide confidentiality through encryption. The Encapsulating Security Payload (ESP) will provide data encryption and integrity, which will protect users from traffic analysis at the internet layer rather than in the transport or application layer. In other words, this protocol will be available without any extra work to ensure that security-enhancing applications are installed on the systems. Although there will be a marked increase in computer security when the IP security protocols are introduced, they are not the perfect or only solution to this problem.

IPv6 is currently being tested on the Internet using tunneling to work out the bugs before deployment. This process could take up to two years and even then there will still be many computers on the Internet which will not be upgraded for various reasons (e.g. old computers without support). The need for confidentiality, data integrity, and protection will require the use of existing applications and tools that work with IPv4. There are several such applications and tools currently deployed on the Internet which provide the security that IPv6 will provide, but at different layers of the conceptual model of the network. Some of the existing applications that are used for this purpose are Kerberos, STEL, SSH, and SSL.

Kerberos is a complete user management system developed and used at MIT. It is a ticket granting service which allows entities communicating over insecure networks to prove their identity to each other while preventing others from eavesdropping or mounting replay attacks. These tickets are used to identify the principals (users or services) to each other and are embedded in virtually any network protocol, thereby allowing the processes implementing the Kerberos protocol to be sure of the identity of the principals involved. The main disadvantage of this package is that it is large and complicated to install properly.

STEL (Secure TELnet) is a telnetd, rlogind, rcmd or rshd replacement package. Its protocol is similar to the SSH protocol in that all transmissions of data are encrypted using one of the available encryption algorithms: DES, 3DES, and IDEA.

The currently emerging standards for encryption and authentication of communications are the Secure Sockets Layer (SSL) and the Secure Shell (SSH). The main difference between these two protocols is that SSL is currently being used for HTTP traffic and SSH is being used for rlogind, rshd, rcmd, and X11 communications. Theoretically, both of these protocols have been designed to allow either of them to be used for the different levels of traffic. The current version of the SSL protocol is 3 and it can be seen conceptually to exist between the transport and application layers. The current version of SSH is 1.5 and it is in the application layer. However, version 2.0 will move the protocol closer to the transport layer to allow for other protocols (HTTP, FTP, ...) to be encrypted by the SSH protocol.

                        IP security        SSL                    SSH
Layer                   Internet Layer     Transport Layer        Application Layer
Connection Types        host-to-host       host-to-host           host-to-host
                        host-to-net        -                      -
                        subnet-to-subnet   -                      -
Default Authentication  keyed MD5          MD5                    RSA
Default Encryption      DES-CBC            DES-CBC                IDEA
Packets Encrypted       all                HTTP, NNTP, SMTP       SSH, X11, TCP/IP
Key Management          manual             public-key mechanisms  public-key mechanisms

Table 1: Major Differences Between IPsec, SSL and SSH

Although the protocols discussed in this thesis (IP security, SSH, SSL) are located in different layers (internet, application, and transport layers respectively), they share the goal of providing authentication and confidentiality of the data. The major differences are summarized in Table 1.

Any protocol that provides authentication and confidentiality needs to attain certain goals:

Cryptographic Security: A secure connection can be established on an insecure network.

Interoperability: Independent implementations can exchange data (cryptographically) without knowledge of one another's implementation details.

Extensibility: The authentication and encryption methods can be reused.

Relative Efficiency: The protocol should not have so much overhead that it becomes too expensive to use.

However, the protocols which provide authentication and confidentiality at the application or transport layers need to protect services at the port level. This means that each port needs to be individually protected, because at the transport layer each service is defined by its port number. Each service (ftp, gopher, http, rlogin, rsh, smtp, and so on) that may need to be protected requires its own setup. We can protect ftp but leave http alone, because we define which services will be protected by these application and transport layer protocols. Although this provides more flexibility to define which services are protected, placing the security mechanisms down in the IP layer secures all services automatically. There will be no need to protect each port against IP-spoofing mechanisms.

This thesis will describe the Internet Protocol (versions 4 and 6), the Authentication Header (AH), and the Encapsulating Security Payload (ESP). The protocol specifications can be found in [Atk95a], [Atk95b], [Atk95c], and [Atk95d]. An introduction is given to cryptography and the different algorithms currently being used in several authentication and encryption systems. IP spoofing will be discussed with examples to demonstrate why there is a need for the Authentication Header and the Encapsulating Security Payload. The proposed standards for the IP security headers will be introduced and some weaknesses that can be found in these implementations will be demonstrated.

Finally, two systems (SSL [FKK96] and SSH [Ylo95]) that provide authentication and link encryption will be discussed at length, along with the protocols that are used to ensure that the communication is secure. These systems can be deployed now to provide secure communications.

This thesis consists of an overview of the emerging Internet security standards and the significance that they will have against security problems on the current Internet. The point of view taken is that eavesdropping and IP spoofing are major problems plaguing the Internet today. The opinions expressed in this thesis about how these lower-level protocols will help are those of the author.

2 The Internet

The Internet is a collection of networks consisting of several military and government networks, local networks at universities and research institutions, regional networks such as RISQnet (Quebec regional network) and CA*net (Canadian regional network), and several other networks used by commercial companies conducting their businesses on the Internet. The underlying protocols used to provide the "low-level" functions needed by all network applications are called the TCP/IP protocol suite (also known as the Internet protocol suite), which includes IP, TCP, and UDP. The traditional TCP/IP services are:

file transfer: FTP (file transfer protocol) allows a user to retrieve files from a remote computer, [PR85];

remote login: TELNET (the network terminal protocol) allows a user to log in to a computer on a network remotely, [PR83a] and [PR83b]; and

electronic mail: SMTP (simple mail transfer protocol) allows a user to send messages to other users on other computers, [Pos82], [Cro82], and [BPC+85].

Modern computer systems use other services such as:

network file system: Network file systems allow files from a remote computer to be accessed as if the files were on the local disk. Typical implementations are NetBIOS over TCP (PC-oriented) and Sun's NFS (workstations), [Aue87a] and [Aue87b];

remote printing: Access to printers connected to remote computers (LPR on BSD, lp on SYS V systems);

remote execution: Execution of a program or functions remotely. The most common programs are rsh and rexec, derived from Berkeley Unix. There are several remote procedure call mechanisms which have been developed by several organizations (Xerox's Courier, Sun's RPC);

name service: Allows for the management of collections of names. The most common use is DNS (Domain Name System), which is a database of names and network addresses of computers. Sun's Network Information Service (NIS, formerly Yellow Pages (YP)) was designed to be a general mechanism to distribute databases such as username/passwords and groups;

network-oriented window systems: Allow a program to use a display on a different computer. The most widely implemented window system is X, developed at MIT.

Most of the protocols described above were designed by Sun, Berkeley and other organizations and are not part of the Internet protocol suite, but they are implemented using TCP/IP.

2.1 TCP/IP Protocol Suite

The TCP/IP protocol suite is a layered set of protocols consisting of 5 layers as shown in Figure 1. On the lowest layer, we have the physical hardware, followed by the network interface which the operating system kernel uses to communicate with the underlying hardware. The following layer is where IP lives, providing the connectionless packet delivery service. The TCP protocols provide the reliable transport services and finally the application services are the programs and applications (such as rlogin, telnet, etc.) which use the lower-level services.

    Application
    Transport
    Internet
    Network Interface
    Network Hardware

Figure 1: Conceptual Organization of TCP/IP Layers

TCP is responsible for making sure that the packets get through to the other end. It keeps track of what is sent and retransmits anything that does not reach the other end of the connection. If the message is too large for a datagram, TCP will split the message into several datagrams and make sure all of the datagrams reach the destination successfully. Since many applications need these functions, TCP is a separate protocol whose functions the applications use. However, not all applications need the services of TCP, but they do need the services of IP to route the datagrams to the destination. Therefore, these routing functions are grouped in IP and TCP calls on these services.

For example, electronic mail has a protocol (SMTP) which defines how to communicate with the mail server. Its commands specify the sender, the recipient, and the text of the message. However, this protocol relies on the ability to communicate reliably between two computers. TCP provides the reliability and IP provides the communications. The Internet consists of several networks and the electronic mail can be sent through multiple networks before it reaches the ultimate destination.

Information is transferred as a sequence of datagrams and these datagrams are sent through the network independently, i.e. each datagram may take a different route. For example, if a 10000 octet file is being transferred, the protocols will break up this file into several datagrams because most networks cannot handle 10000 octet datagrams. Suppose it is broken up into 20 datagrams of 500 octets each. Each of these datagrams will be sent to the destination separately. While these datagrams are in transit, the network itself does not know that they belong to the same file. It is possible that packet 8 arrives at the destination before packet 7. Another scenario is that a packet does not get through at all and therefore must be resent; in such a case the packet may be duplicated if the problem on the network is fixed after we resend the packet.
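The splitting, reordering and duplication described above can be illustrated with a short Python sketch. This is a minimal model under stated assumptions, not an implementation of TCP: each datagram is tagged with its octet offset (playing the role of a sequence number), and the function names `split_into_datagrams` and `reassemble` are hypothetical.

```python
import random

def split_into_datagrams(message: bytes, size: int):
    """Return (offset, payload) pairs of at most `size` octets each."""
    return [(offset, message[offset:offset + size])
            for offset in range(0, len(message), size)]

def reassemble(datagrams):
    """Rebuild the message from datagrams that may be reordered or duplicated."""
    received = {}
    for offset, payload in datagrams:
        received.setdefault(offset, payload)  # a repeated offset is ignored
    return b"".join(received[offset] for offset in sorted(received))

message = bytes(i % 256 for i in range(10000))   # a 10000 octet "file"
datagrams = split_into_datagrams(message, 500)   # 20 datagrams of 500 octets

in_transit = datagrams + [datagrams[7]]  # one datagram is resent and arrives twice
random.shuffle(in_transit)               # the network may deliver in any order
restored = reassemble(in_transit)
```

Because every datagram carries its own offset, the receiver can restore the original order and discard duplicates without any help from the network.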

2.1.1 The TCP Level

TCP (transmission control protocol) is responsible for breaking up the message into datagrams, reassembling the datagrams at the other end, resending any datagram that gets lost, and putting things back together in the right order. To keep track of multiple connections to a given system, TCP places a header on each datagram containing a source and destination port and a sequence number. The ports are used to keep track of different conversations and range between 0 and 65535 (2^16 - 1). The assignment of a few port numbers is governed by a central authority so that certain software packages (e.g. e-mail systems using sendmail) are written using these specific numbers. The port numbers which are "assigned" are within the range of 0 to 1023 and the ability to use one of these ports (for transmission or reception) requires special privileges (super-user access). Suppose we have two people transferring files to the same computer. TCP will assign a port number to each connection, say 1000 and 1001; these are known as the source ports. On the other end, two other ports are assigned for the users, 1010 and 1011; these are known as the destination ports. This way TCP knows where to send the datagrams and which datagram belongs to which file. The TCP header has the form shown in Figure 2.

    Source Port | Destination Port
    Sequence Number
    Acknowledgment Number
    Data Offset | Reserved | Window
    Urgent Pointer
    DATA

Figure 2: TCP Header

The Acknowledgment Number is used to confirm the arrival of a packet. For example, if the sender receives a datagram with the acknowledgment number 1500, all data up to octet number 1500 have been received.

If the acknowledgment is not received in a reasonable amount of time, the datagram is resent. The Window field allows the destination host to indicate how much new data it is prepared to absorb. This way the sending host can send as much as possible as long as the window is greater than zero. When the window reaches zero, the sender must wait until the receiver increases the window. The Urgent Pointer field is used to skip ahead in the processing to a particular octet. The notation that will be used in subsequent sections for TCP connections is as follows:

    C -> S: SYN(ISNc)
    S -> C: ACK(ISNc)
    C -> S: DATA(data)
    C -> S: DATA(data), SRC = X

The first statement means that the client program C sends a message to the server S with an initial sequence number, ISNc. This ISN is the sequence number that belongs in the TCP header. In the second statement, the server S sends a message to the client C, acknowledging the receipt of the messages up to ISNc. The third statement shows the client sending data to the server. Finally, the last statement shows the client C sending data to server S, but with the source IP address changed to X.
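To make the header fields discussed in this section concrete, the following Python sketch packs and unpacks a fixed 20-octet TCP header with the standard `struct` module. It is a sketch under stated assumptions: the layout follows RFC 793, which also contains a flags field and a checksum not discussed in the text above; the checksum is left at zero rather than computed, and the helper names and example values are hypothetical.

```python
import struct

# Source port, destination port, sequence number, acknowledgment number,
# data offset/reserved/flags, window, checksum, urgent pointer (RFC 793).
TCP_HEADER = struct.Struct("!HHIIHHHH")  # 20 octets, network byte order

SYN = 0x02
ACK = 0x10

def pack_tcp_header(src_port, dst_port, seq, ack, flags, window, urgent=0):
    data_offset = 5  # header length in 32-bit words (no options)
    offset_and_flags = (data_offset << 12) | (flags & 0x3F)
    checksum = 0     # assumption: not computed in this sketch
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, checksum, urgent)

def unpack_tcp_header(raw):
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent) = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "data_offset": offset_and_flags >> 12,
        "flags": offset_and_flags & 0x3F,
        "window": window, "checksum": checksum, "urgent": urgent,
    }

# C -> S: SYN(ISNc), with an illustrative initial sequence number of 1000
syn = pack_tcp_header(src_port=1001, dst_port=21, seq=1000, ack=0,
                      flags=SYN, window=4096)
fields = unpack_tcp_header(syn)
```

Every field is a plain integer in a fixed position; nothing in the header itself proves which host generated it.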

2.1.2 The IP Level

IP (internet protocol) is responsible for routing each datagram. This may involve crossing many different media (ethernet, serial lines, phone lines). TCP simply hands IP a datagram with a destination address and IP tries to get it to the destination without the knowledge that this datagram may be related to a previous datagram. A full discussion of the format of the IP header is given in Appendix A.

2.2 The Applications Layer

Application protocols are placed at the application layer and they run "on top" of TCP/IP. When the application wants to send a message, it gives it to TCP, which makes sure it gets delivered to the other end using IP. The application protocols do not have to treat the underlying network connection differently whether it is a phone line, terminal or full network connection.

Application programs normally use random port numbers to open their connections to the remote server. However, the destination port is usually a "well-known" port which is unique for each service. For example, to open an FTP connection, the ftp program will choose a local port, say 1001, and specify the destination port of 21 (this is the well-known port for all ftp servers). Each connection can be described uniquely by a set of 4 numbers: source IP address, source TCP port, destination IP address, and destination TCP port. TCP and IP take care of all the "hard" parts and the application program can communicate with the remote computer as if it was connected by one single wire. The application protocol needs to specify what commands are understood and the format of the data to be sent.
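The role of the 4 numbers can be sketched in Python as a table keyed by that tuple. This is only an illustration: the `Connection` type, the `open_connection` helper and the choice of hosts and ports are assumptions made for the example.

```python
from collections import namedtuple

Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")

connections = {}

def open_connection(src_ip, src_port, dst_ip, dst_port, label):
    """Register a connection; the 4-tuple must be unique."""
    key = Connection(src_ip, src_port, dst_ip, dst_port)
    if key in connections:
        raise ValueError("connection already exists: %r" % (key,))
    connections[key] = label
    return key

# Two FTP sessions from the same client to the same server and the same
# well-known port differ only in the source port chosen by the client.
first = open_connection("132.206.3.3", 1001, "132.206.51.241", 21, "ftp session 1")
second = open_connection("132.206.3.3", 1002, "132.206.51.241", 21, "ftp session 2")
```

Changing any one of the four values yields a different connection, which is why two users on the same machine can talk to the same service at the same time.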

An example of an application protocol is SMTP (simple mail transfer protocol), which delivers electronic mail. In our example, user 'bob' is sending mail from opus.cs.mcgill.ca to 'jane' at po-box.cs.mcgill.ca. The content of the message is shown in Figure 3.

    Date: Sun, 18 May 1997 12:41:10 -5000 ()
    From: bob@opus.cs.mcgill.ca
    To: jane@po-box.cs.mcgill.ca
    Subject: Hello

    How are you doing? Hope to hear from you soon.

Figure 3: A Typical Mail Message

The program sending the mail sends the name server several queries on how to get the mail to PO-BOX.CS.MCGILL.CA. First, it looks to see who handles the receiving of mail for PO-BOX.CS.MCGILL.CA. Because it handles its own mail, the second query is for the IP address of PO-BOX.CS.MCGILL.CA, which is 132.206.51.241.

The program then opens a TCP connection to 132.206.51.241 on port 25 (this is the well-known port for SMTP) and when the connection is established the exchange of commands can begin. A typical conversation is shown in Figure 4. The labels on the left indicate the party sending the message.

    PO-BOX  220 lisa.CS.mcgill.ca ESMTP Sendmail 8.8.5/8.8.4; Mon, 19 May 1997 21:16:47 -0400 (EDT)
    OPUS    EHLO opus.cs.mcgill.ca
    PO-BOX  250-lisa.cs.mcgill.ca Hello bob@opus.CS.McGill.CA [132.206.3.3], pleased to meet you
    PO-BOX  250-EXPN
    PO-BOX  250-VERB
    PO-BOX  250-8BITMIME
    PO-BOX  250-SIZE
    PO-BOX  250-DSN
    PO-BOX  250-ONEX
    PO-BOX  250-ETRN
    PO-BOX  250-XUSR
    PO-BOX  250 HELP
    OPUS    MAIL From:<bob@opus.cs.mcgill.ca> SIZE=xx
    PO-BOX  250 ... Sender ok
    OPUS    RCPT To:<jane@po-box.cs.mcgill.ca>
    PO-BOX  250 ... Recipient ok
    OPUS    DATA
    PO-BOX  354 Enter mail, end with "." on a line by itself
    OPUS    Date: Sun, 18 May 1997 22:41:10 -5000 ()
    OPUS    From: bob@opus.cs.mcgill.ca
    OPUS    To: jane@po-box.cs.mcgill.ca
    OPUS    Subject: Hello
    OPUS
    OPUS    How are you doing? Hope to hear from you soon.
    OPUS    .
    PO-BOX  250 VU13361 Message accepted for delivery
    OPUS    QUIT
    PO-BOX  221 lisa.cs.mcgill.ca closing connection

Figure 4: SMTP Protocol Conversation

Another example of this is the authentication method of the rsh protocol. We have the user "bob@opus" logging in to the account "bob@lisa". To use the rsh mechanisms, Bob needs to have a file named ".rhosts" in his home directory on "lisa" so that bob@opus can be authenticated and log in without supplying a password. The one line in the .rhosts file consists of the hostname followed by the user: opus.cs.mcgill.ca bob. Whenever bob uses rsh from opus.cs.mcgill.ca to lisa.cs.mcgill.ca, he does not need to supply a password. Using the same format as in Figure 4, the rsh protocol would take the form shown in Figure 5. The REMOTEHOST parameter is taken from the IP layer and the authentication mechanism uses this parameter.

    OPUS  USER=bob, HOST=opus.cs.mcgill.ca
    LISA  check .rhosts, opus.cs.mcgill.ca bob
          REMOTEUSER=bob, REMOTEHOST=opus.cs.mcgill.ca
          match found, remote shell granted
    OPUS  CMD=date
    LISA  execute CMD

Figure 5: RSH Protocol Conversation
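The .rhosts check described above amounts to a simple lookup, which the following Python sketch illustrates. It is a deliberately simplified model, not the code of any real rshd: the function name `rhosts_allows` is hypothetical, and only the plain "hostname user" entry form from the example is handled.

```python
def rhosts_allows(rhosts_text, remote_host, remote_user):
    """Return True if a 'hostname user' line matches the remote party."""
    for line in rhosts_text.splitlines():
        fields = line.split()
        if (len(fields) == 2
                and fields[0].lower() == remote_host.lower()
                and fields[1] == remote_user):
            return True
    return False

rhosts = "opus.cs.mcgill.ca bob\n"

# REMOTEHOST is derived from the source address seen at the IP layer,
# so the check is only as trustworthy as that address.
granted = rhosts_allows(rhosts, "opus.cs.mcgill.ca", "bob")
denied = rhosts_allows(rhosts, "other.cs.mcgill.ca", "bob")
```

No secret is exchanged at any point: whoever can make their packets appear to come from opus.cs.mcgill.ca passes the check.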

There is a general pattern to the responses in both of these protocols. The protocol defines the specific set of responses that can be sent as answers to any given command.
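For SMTP, this pattern is visible in the reply lines: a three-digit code, followed by "-" when more lines of the same reply follow or by a space on the last line. The Python sketch below parses one such reply; the function name `parse_smtp_reply` is hypothetical and the parser is a minimal illustration rather than a complete SMTP client.

```python
def parse_smtp_reply(lines):
    """Return (code, [text, ...]) for one, possibly multi-line, SMTP reply."""
    code = None
    texts = []
    for line in lines:
        line_code, separator, text = line[:3], line[3:4], line[4:]
        if code is None:
            code = int(line_code)
        elif int(line_code) != code:
            raise ValueError("reply code changed within one reply")
        texts.append(text)
        if separator != "-":   # a space (or nothing) marks the final line
            return code, texts
    raise ValueError("reply ended without a final line")

reply = [
    "250-lisa.cs.mcgill.ca Hello",
    "250-EXPN",
    "250 HELP",
]
code, texts = parse_smtp_reply(reply)
```

Since the set of codes and their format is fixed by the protocol, a party that cannot see the replies can still guess what the other side is sending.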

The attacks that are shown in Section 4 rely on the predictability of packets being sent back and forth between the client and server.

2.3 UDP and ICMP

In cases where the message is always small enough to fit in one single datagram, the complexity of using TCP is not needed. This eliminates the extra overhead for the protocol to check if the message needs to be split. Because we only have one datagram to send, if an answer is not returned in a few seconds, the datagram can simply be sent again. Applications like this use UDP (user datagram protocol), which is designed for applications where sequences of datagrams are not needed. As in TCP, there is a UDP header for all the datagrams. UDP sends the data to IP, which adds the IP header. UDP does not split the data into multiple datagrams and does not keep track of what it has sent. UDP port numbers are used in the same way TCP port numbers are used. There are well-known port numbers for servers using UDP. The UDP header is smaller because it only carries the source and destination port numbers and a checksum. One such protocol that uses UDP is DNS (Domain Name System).
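The small size of the UDP header can be seen in the following Python sketch, which builds a datagram with the standard `struct` module. It assumes the RFC 768 layout, which carries a length field in addition to the ports and checksum mentioned above; the checksum is left at zero (meaning "not computed" for UDP over IPv4), and the helper name and payload are hypothetical.

```python
import struct

# Source port, destination port, length, checksum (RFC 768): 8 octets in total.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(src_port, dst_port, payload):
    length = UDP_HEADER.size + len(payload)  # header plus data, in octets
    checksum = 0                             # assumption: checksum not computed
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

# A client on an arbitrary local port sending to the well-known DNS port 53.
datagram = build_udp_datagram(1001, 53, b"query")
header_fields = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
```

There is no sequence number, acknowledgment number or window, which reflects the fact that UDP does not track what it has sent.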

Another alternative protocol is ICMP (Internet control message protocol), which is used for error messages and other messages destined for the TCP/IP software itself, rather than the user program. For example, when connecting to a host, the remote host may be down and an ICMP message saying "host unreachable" will be received.

ICMP is similar to UDP because it handles messages that fit in one datagram, but it is much simpler because it does not use port numbers in the headers. The messages are interpreted by the network software itself, so the port number is never needed.

3 Cryptography

3.1 Introduction

Cryptography is used to protect information of value from certain people. In the past, strong encryption was used by major governments and multinational corporations, but society is slowly moving towards an information society where these same tools will be used to maintain privacy and confidentiality. This section will discuss the basic terminology used in cryptography, basic cryptographic algorithms, digital signatures, and cryptographic hash functions. The following sections will contain a general overview of block cipher algorithms, a few algorithms such as DES and IDEA, some one-way hash functions and some public-key algorithms. This is only a basic introduction to this subject matter and more information can be found in several other texts such as [Sch94] and [Sti96].

3.1.1 Basic Terminology

A message to be sent is cleartext or plaintext, which will be encoded in such a way as to hide the information from outsiders using encryption. The resulting encrypted message is called the ciphertext, which is sent to the receiver. Upon receiving the ciphertext, decryption of the ciphertext is achieved by using a key and the coding method (cryptographic algorithm).

Cryptography is the science of keeping such messages secret and is practiced by cryptographers. On the other side of the fence, cryptanalysis is the art of breaking ciphers without the use of the proper key and is practiced by cryptanalysts. Cryptology is a branch of mathematics that studies the mathematical foundations of these cryptographic methods.

A method of encryption/decryption is called a cipher. Keeping the algorithm itself secret was common in the past (as with Caesar's cipher). Most methods now do not rely on secrecy of the algorithm, but on the intrinsic mathematical difficulty of its cryptanalysis.

3.1.2 Basic Cryptographic Algorithms

There are two classes of key-based cryptographic algorithms: symmetric (or secret-key) and asymmetric (or public-key) algorithms. The main difference is that symmetric algorithms use the same key for both encryption and decryption, whereas asymmetric algorithms use different keys for encryption and decryption, and the decryption key cannot be efficiently derived from the encryption key.

Symmetric algorithms can be divided into two types, stream ciphers and block ciphers. A stream cipher can encrypt a single byte of plaintext at a time, whereas block ciphers operate on a specific number of bits (usually 64 bits) and encrypt them as a single unit. Some of these algorithms are discussed in subsequent sections.

Asymmetric algorithms allow the encryption key to be made public, allowing anyone to encrypt messages with that key. However, only the proper recipient can decrypt the message. The encryption key is called the public key and the decryption key is known as the private key (or secret key).

3.1.3 Digital Signatures

Some public-key algorithms can be used to generate digital signatures. A digital signature is a block of data created using a secret key. Verification that the data was generated with the secret key can be done using the corresponding public key. These public keys need to be trusted, and there are two different methods to "sign" keys. One method is a centralized key infrastructure, in which a universally trusted organization signs all the keys. The other method is a distributed system in which there is no universally trusted organization, but each user has one or many trusted "root" users. This method is also known as the web of trust concept, which is used by PGP.

Cryptographic hash functions are typically used to compute a message digest by compressing the bits of a message to a fixed-size hash value in such a way that it is unlikely that anyone can come up with another message that hashes to the same value. Typically, a hash value of 128 or more bits is common.

3.1.4 Strength of Cryptosystems and Attacks on Cryptosystems

A good cryptographic system should always be designed so that it is computationally difficult to break. In theory, all cryptographic methods with a key can be broken by trying all possible keys in sequence (brute force). In practice, however, the required computing power increases exponentially with the length of the key. Brute force is not the only way to break a cryptographic method. Cryptanalysis is the art of deciphering encrypted messages without knowing the proper keys. Some of the common methods are summarized below:

ciphertext-only attack: works on the ciphertext only. The cryptanalyst makes guesses about the plaintext. This works well for messages that have a fixed-format header.

known-plaintext attack: works with the ciphertext and parts of the plaintext. The task is to decrypt the remaining part of the ciphertext blocks with this information, or to determine the key.

chosen-plaintext attack: works on the ciphertext and the plaintext. The cryptanalyst can have any text encrypted with the unknown key. The task is to deduce the secret key from the ciphertext/plaintext pairs.

man-in-the-middle attack: works against cryptographic communication and key exchange protocols. The attacker places himself between the two parties on the communication line and exchanges separate keys with each party. The messages pass to the attacker, who can decrypt them and then re-encrypt them with the new key for the second party.

timing attack: works on the exact execution times of modular exponentiation operations.

3.2 Block Cipher Modes

A block encryption algorithm operates on blocks of plaintext or ciphertext of a fixed size (usually 64 bits). The simplest way to do this is to encrypt each block of plaintext into a block of ciphertext. This method is known as the Electronic Codebook (ECB) mode where each block of plaintext always encrypts to the same block of ciphertext.

The advantage of this mode is that a file does not need to be encrypted linearly; you can encrypt blocks in the middle of the file before the first few blocks because each block is independent of the others. This feature is desirable for randomly accessed files such as databases. However, the main disadvantage is that cryptanalysts can compile their own codebook from known plaintext and ciphertext pairs without knowing the key.

Another mode uses a feedback mechanism; the results of the encryption of previous blocks are fed back into the encryption of the current block. Each ciphertext block is dependent on the corresponding plaintext block and all previous plaintext blocks. Cipher Block Chaining (CBC) mode takes the plaintext block and XORs it with the previous block of ciphertext before encrypting it to form the ciphertext of this block. Decryption applies the same operations in reverse order: each ciphertext block is decrypted and then XORed with the previous ciphertext block. An Initialization Vector (IV) of random data is used in place of the "previous ciphertext" for the first block so that each message is unique; without it, identical messages would encrypt to the same ciphertext.
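The difference between ECB and CBC can be illustrated with a small sketch. The block transform below (`toy_encrypt_block`) is a made-up, insecure stand-in for a real block cipher, and all function names are illustrative assumptions; only the chaining logic mirrors the description above.

```python
BLOCK = 8  # 64-bit blocks, as in the text


def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Toy invertible transform (NOT secure): XOR with key, rotate left 3 bytes."""
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[3:] + mixed[:3]


def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    """Inverse of toy_encrypt_block: rotate right 3 bytes, then XOR with key."""
    mixed = block[-3:] + block[:-3]
    return bytes(b ^ k for b, k in zip(mixed, key))


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data: bytes) -> list[bytes]:
    assert len(data) % BLOCK == 0, "this sketch skips padding"
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # every block is encrypted independently
    return b"".join(toy_encrypt_block(b, key) for b in split_blocks(plaintext))


def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    previous, out = iv, []
    for block in split_blocks(plaintext):
        # XOR with the previous ciphertext block (the IV for the first block)
        previous = toy_encrypt_block(xor(block, previous), key)
        out.append(previous)
    return b"".join(out)


def cbc_decrypt(ciphertext: bytes, key: bytes, iv: bytes) -> bytes:
    previous, out = iv, []
    for block in split_blocks(ciphertext):
        # decrypt, then XOR with the previous ciphertext block
        out.append(xor(toy_decrypt_block(block, key), previous))
        previous = block
    return b"".join(out)
```

With two identical plaintext blocks, `ecb_encrypt` yields two identical ciphertext blocks, which is exactly what lets an attacker build a codebook, whereas `cbc_encrypt` yields different blocks.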

In the Cipher Feedback (CFB) mode, the data is encrypted in units smaller than the block size used in CBC mode. This allows encryption to begin before an entire block has been received, which CBC mode requires.

In Output Feedback (OFB) mode, the feedback mechanism is independent of both the plaintext and ciphertext streams. This mode is similar to the CFB mode, except that part of the previous output block is moved into the right-most positions of the queue.

3.3 Some Block Algorithms

There are several different block algorithms which have been proposed over the years, and information is readily available in any good cryptography textbook. The following sections discuss some block algorithms that are used in the SSL, SSH and IP security protocols.

3.4 Data Encryption Standard (DES)

For over 15 years, the Data Encryption Standard (DES) has been the worldwide standard. The standard is beginning to show signs of old age even though it has held up well against years of cryptanalysis. The algorithm, a block cipher, works by encrypting data in 64-bit blocks. The decryption of the ciphertext uses the same algorithm with differences in the key schedule. The length of the key is 56 bits.

DES operates on a 64-bit block of plaintext and after the first permutation, the block is broken to two halves of 32 bits (left and right). Sixteen rounds of identical operations are performed using a function f, in which the data is combined with the key. After the sixteenth round, the two halves are rejoined and the inverse permutation of the first permutation is taken to finish the algorithm.

For each round, the key bits are shifted and then 48 of the 56 bits are selected. The right half of the data is expanded to 48 bits through an expansion permutation and is exclusive-ORed with the 48-bit round key. This result is put through a substitution algorithm and permuted again. The output of the function f is combined with the left half with an exclusive-OR. The result becomes the new right half and the old right half becomes the new left half, and the operation is repeated another 15 times (16 times in total). This process is shown for round i in Figure 6.

Permutations and the substitution functions follow predetermined tables which transpose bits of the input block or the DES key. These tables have been left out as they are readily available in various texts such as [Sch94].

Figure 6: Round i of DES

The decryption algorithm is the same as the encryption algorithm except that the keys are used in reverse order. That is, if the encryption keys for each round are k1, k2, k3, ..., k16, then the decryption keys will be in the reverse order: k16, k15, k14, ..., k1.
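The property that decryption is the same procedure with the round keys reversed comes from the two-halves (Feistel) structure rather than from the details of f. The following is a minimal sketch under stated assumptions: `round_function` is a hash-based placeholder and not the DES function f, the initial and final permutations are omitted, and the round keys are arbitrary integers rather than the output of the DES key schedule.

```python
import hashlib

MASK32 = 0xFFFFFFFF


def round_function(half: int, round_key: int) -> int:
    """Placeholder for f: mixes a 32-bit half with a round key (not the DES f)."""
    data = half.to_bytes(4, "big") + round_key.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def feistel(block: int, round_keys: list[int]) -> int:
    """Run a 64-bit block through one Feistel round per key."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for key in round_keys:
        # new left = old right; new right = old left XOR f(old right, key)
        left, right = right, left ^ round_function(right, key)
    # swap the halves at the end so the same routine also decrypts
    return (right << 32) | left


keys = list(range(1, 17))            # sixteen illustrative round keys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, keys)
recovered = feistel(ciphertext, keys[::-1])  # same routine, keys reversed
```

Here `recovered` equals `plaintext` even though `round_function` itself is never inverted.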

There are four modes of operation of this algorithm: Electronic Codebook (ECB), Cipher Block Chaining (CBC), Output Feedback (OFB), and Cipher Feedback (CFB). The ANSI banking standards specify the use of ECB and CBC for encryption and CBC and n-bit CFB for authentication.

3.4.1 RC2 and RC4

RC2 and RC4 are variable key size encryption algorithms designed by Ron Rivest of RSA Data Security, Inc. (RSADI). The algorithms are proprietary and the details are not published (which does not mean that they are more secure). The RC2 algorithm was designed as a replacement for DES; it encrypts data in 64-bit blocks and does not use S-boxes. According to RSADI, the software implementation of the algorithm is three times faster than DES. RC4 is a variable key size stream cipher which is reportedly ten times faster than DES. Although there have not been reports of scientists analyzing the algorithms, RSADI's chief scientist claims that the algorithms are not vulnerable to differential cryptanalysis. However, these algorithms have not survived 20 years of intense cryptanalysis in the same way DES has.

3.4.2 IDEA

The International Data Encryption Algorithm (IDEA) is believed to be a strong and secure block algorithm. It operates on 64-bit plaintext blocks with a key of size 128 bits. The same algorithm is used for encryption as well as decryption. The design philosophy behind this algorithm is to mix operations from different algebraic groups.

The algebraic operations that IDEA uses are: XOR, addition modulo 2^16 (addition, ignoring overflow), and multiplication modulo 2^16 + 1 (where the all-zero sub-block is treated as 2^16). These operations are easily implemented in both hardware and software and they operate on 16-bit sub-blocks, which makes the algorithm efficient on 16-bit processors as well.

The 64-bit block is subdivided into four 16-bit sub-blocks (X1, X2, X3, X4). In each of the eight rounds, these sub-blocks are XORed, added and multiplied with one another and with six 16-bit sub-blocks of key material. Between rounds, the second and third sub-blocks are swapped. The exact sequence of events for each round is as follows:

1. multiply X1 and the first key sub-block,

2. add X2 and the second key sub-block,

3. add X3 and the third key sub-block,

4. multiply X4 and the fourth key sub-block,

5. XOR the results of steps 1 and 3,

6. XOR the results of steps 2 and 4,

7. multiply the result of step 5 and the fifth key sub-block,

8. add the results of steps 6 and 7,

9. multiply the result of step 8 and the sixth key sub-block,

10. add the results of steps 7 and 9,

11. XOR the results of steps 1 and 9,

12. XOR the results of steps 3 and 9,

13. XOR the results of steps 2 and 10,

14. XOR the results of steps 4 and 10.

The output of the round is the results of steps 11, 12, 13, and 14, which become the input of the next round after swapping the two inner blocks (X1 = step 11, X2 = step 13, X3 = step 12, X4 = step 14).

After the eighth round, a final transformation is applied:

1. multiply X1 and the first key sub-block,

2. add X2 and the second key sub-block,

3. add X3 and the third key sub-block,

4. multiply X4 and the fourth key sub-block.

The four results are then reattached to produce the ciphertext.
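The fourteen steps of a round and the final transformation can be sketched as follows. This is an illustration only: the function names are assumptions, and multiplication is taken modulo 2^16 + 1 with the all-zero sub-block standing for 2^16, as in the published IDEA description.

```python
MOD_ADD = 1 << 16         # 2^16
MOD_MUL = (1 << 16) + 1   # 2^16 + 1


def mul(a: int, b: int) -> int:
    """Multiply modulo 2^16 + 1; a sub-block of 0 stands for 2^16."""
    a = a or MOD_ADD
    b = b or MOD_ADD
    return (a * b) % MOD_MUL % MOD_ADD  # a product of 2^16 is stored as 0


def add(a: int, b: int) -> int:
    """Add modulo 2^16 (overflow ignored)."""
    return (a + b) % MOD_ADD


def idea_round(x, k):
    """One round on sub-blocks x = (X1..X4) with six key sub-blocks k."""
    x1, x2, x3, x4 = x
    k1, k2, k3, k4, k5, k6 = k
    s1 = mul(x1, k1)
    s2 = add(x2, k2)
    s3 = add(x3, k3)
    s4 = mul(x4, k4)
    s5 = s1 ^ s3
    s6 = s2 ^ s4
    s7 = mul(s5, k5)
    s8 = add(s6, s7)
    s9 = mul(s8, k6)
    s10 = add(s7, s9)
    s11 = s1 ^ s9
    s12 = s3 ^ s9
    s13 = s2 ^ s10
    s14 = s4 ^ s10
    # the two inner blocks are swapped before the next round
    return (s11, s13, s12, s14)


def final_transform(x, k):
    """Output transformation applied after the eighth round (four key sub-blocks)."""
    x1, x2, x3, x4 = x
    k1, k2, k3, k4 = k
    return (mul(x1, k1), add(x2, k2), add(x3, k3), mul(x4, k4))
```

For instance, with the sub-blocks (1, 2, 3, 4) and the key sub-blocks (1, 0, 0, 1, 1, 1), `idea_round` returns (9, 8, 11, 14).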

There are 52 key sub-blocks (six for each round and four for the final transformation). The 128-bit key is divided into eight 16-bit subkeys. The first six subkeys are used for the first round and the remaining two are used as the first two subkeys of the second round. Whenever more subkeys are needed, the 128-bit key is rotated 25 bits to the left and again divided into eight subkeys. The first four of these are used in the second round and the remaining four are used for round three, and so on until the end of the algorithm.

Decryption is exactly the same as the encryption process except that the key sub-blocks are reversed and slightly different. The decryption key sub-blocks are either additive or multiplicative inverses of the encryption key sub-blocks (the multiplicative inverse of 0 is 0 for IDEA). The algorithm can work in any block cipher mode discussed in Section 3.2.
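The generation of the 52 encryption sub-blocks can be sketched as below. The function name and the choice of passing the 128-bit key as a Python integer are assumptions made for illustration; the decryption sub-blocks (the inverses mentioned above) are not computed here.

```python
def idea_subkeys(key: int) -> list[int]:
    """Expand a 128-bit key (as an int) into 52 16-bit encryption sub-blocks."""
    mask128 = (1 << 128) - 1
    subkeys = []
    while len(subkeys) < 52:
        # cut the current 128 bits into eight 16-bit subkeys, leftmost first
        for shift in range(112, -1, -16):
            subkeys.append((key >> shift) & 0xFFFF)
        # rotate the whole key 25 bits to the left for the next batch
        key = ((key << 25) | (key >> 103)) & mask128
    return subkeys[:52]  # the last batch is only partly used


example = idea_subkeys(0x00010002000300040005000600070008)
```

For this example key, the first eight sub-blocks are 1 through 8 and the ninth, the first one taken after the rotation, is 0x0400.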

3.5 One-Way Hash Functions

One-way hash functions map an arbitrary-length message to a fixed-length hash value. There are many functions that have this property, but one-way hash functions have additional properties:

• it is simple to compute the hash value,

• given the hash value, it is difficult to compute the message, and

• given the message, it is hard to find another message that hashes to the same value.

The last property does not mean that there is no other message that will hash to the same value, but that, depending on the security requirements, it would take a large number of operations (usually on the order of 2^m, where m is the length of the hash value in bits) before such a message can be found.

Attacks on these kinds of algorithms come in two forms. The first is to take a hash value and create a new message that hashes to that same value. The second attack is more subtle: we find two random messages that hash to the same value. This attack is known as the birthday attack.
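The birthday attack can be demonstrated on a deliberately weakened hash. In this sketch, `tiny_hash` is a hypothetical 16-bit hash obtained by truncating SHA-256 (chosen here only because it is in the Python standard library); a real 128-bit hash would need far more work, but the search is the same.

```python
import hashlib
from itertools import count


def tiny_hash(message: bytes, bits: int = 16) -> int:
    """Illustrative weak hash: the top `bits` bits of SHA-256."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def birthday_collision(bits: int = 16):
    """Hash distinct messages until any two of them collide."""
    seen = {}
    for attempts in count(1):
        message = str(attempts).encode()
        h = tiny_hash(message, bits)
        if h in seen:
            return seen[h], message, attempts
        seen[h] = message


first, second, attempts = birthday_collision(16)
```

Because any pair of messages may collide, a collision for an m-bit hash is expected after roughly 2^(m/2) attempts, whereas matching one given hash value takes on the order of 2^m attempts.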

Most one-way hash functions operate on two inputs, a block of text and the hash of the previous block of text, and create one output. The output is always of fixed length no matter what the size of the input.

MD5 processes the input text in 512-bit blocks, each of which is initially divided into sixteen 32-bit sub-blocks. The output of the function is four 32-bit blocks which are concatenated to form a 128-bit hash value.

The message is first padded so that its length is 64 bits short of being a multiple of 512. The padding consists of a single 1 bit followed by as many 0 bits as needed. A 64-bit representation of the length of the message (before padding) is then appended, which brings the total length to a multiple of 512 bits. This last step makes sure that different messages will not look the same after the padding is added.
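The padding step can be sketched for byte-oriented messages as follows. The function name is an assumption, and the little-endian encoding of the 64-bit length follows the MD5 specification (RFC 1321) rather than anything stated in the text above.

```python
def md5_pad(message: bytes) -> bytes:
    """Pad a message as MD5 does, so its length is a multiple of 64 bytes."""
    bit_length = (8 * len(message)) % (1 << 64)
    padded = message + b"\x80"                      # a single 1 bit, then 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # stop 64 bits short of a multiple of 512
    return padded + bit_length.to_bytes(8, "little")  # append the original length


padded = md5_pad(b"abc")
```

A 3-byte message is padded to exactly one 512-bit block, while a 56-byte message no longer fits with its length field and grows to two blocks.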

There are 4 rounds, each consisting of 16 operations. Each operation performs a nonlinear function on three of the chaining variables (A, B, C, D) and adds that result to the fourth chaining variable. The result is rotated to the left by a variable number of bits and added to one of the chaining variables. Finally, the result replaces one of the chaining variables. The exact details of the functions can be found in any cryptography text.

3.6 Public-Key Algorithms

Public-key cryptography was invented by Whitfield Diffie and Martin Hellman and independently by Ralph Merkle. The concept behind public-key cryptography is the notion of keys coming in pairs (encryption and decryption keys) and that the keys are independent, so that it would be infeasible to compute one key from the other. There are numerous public-key cryptography algorithms, but many are insecure or impractical because the key is large or the ciphertext is much larger than the plaintext.

Only a few algorithms are considered to be both secure and practical. Most of these are suitable for either key distribution or encryption, but not both. Only two algorithms are suitable for both encryption and digital signatures: RSA and ElGamal, but these algorithms are slow.

The first public-key algorithm invented was Diffie-Hellman, which is based on the difficulty of calculating discrete logarithms as compared to computing the exponentiation in the same field. This algorithm is suitable for key distribution, but not for encryption and decryption. The algorithm works as follows: Alice and Bob agree on two large integers g and n, such that g is less than n but greater than 1. These integers need not be kept secret, so they can be sent over insecure channels. The protocol is as follows:

1. Alice chooses a large integer x and computes X = g^x mod n,

2. Bob chooses a large integer y and computes Y = g^y mod n,

3. Alice and Bob exchange X and Y while keeping x and y secret,

4. Alice computes k = Y^x mod n, and

5. Bob computes k' = X^y mod n.

The value that Alice and Bob each compute is equal to g^(xy) mod n. Alice and Bob now hold k and k', which are the same secret key, computed independently.
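The exchange can be traced with deliberately tiny numbers. The values n = 23, g = 5, x = 4 and y = 3 are illustrative assumptions; real use requires very large integers.

```python
def dh_public(g: int, secret: int, n: int) -> int:
    """Value sent over the insecure channel: g^secret mod n."""
    return pow(g, secret, n)


def dh_shared(other_public: int, secret: int, n: int) -> int:
    """Key derived from the other party's public value and one's own secret."""
    return pow(other_public, secret, n)


n, g = 23, 5           # public parameters (toy sizes)
x, y = 4, 3            # Alice's and Bob's secrets
X = dh_public(g, x, n)         # Alice sends X
Y = dh_public(g, y, n)         # Bob sends Y
k = dh_shared(Y, x, n)         # Alice's key
k_prime = dh_shared(X, y, n)   # Bob's key
```

Both `k` and `k_prime` come out as 18, which is also g^(xy) mod n, although neither secret was ever transmitted.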

3.6.2 RSA

The RSA algorithm was invented by Ron Rivest, Adi Shamir and Leonard Adleman. The algorithm has withstood years (since 1978) of extensive cryptanalysis. The security of RSA is based on the difficulty of factoring large numbers that are the product of two large (100-200 digits or even larger) primes. The public and private keys are computed as a function of two large prime numbers (p and q) whose product n = p * q is used as the modulus. The encryption key (e) is chosen to be relatively prime to (p - 1) * (q - 1). The decryption key (d) is computed using the extended Euclidean algorithm such that e * d = 1 (mod (p - 1) * (q - 1)). The public key consists of e and n and the private key is d. The two primes are no longer needed and can be discarded, but they must never be revealed.
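The key computation, together with the modular exponentiation used for encryption and decryption, can be sketched with toy primes. The primes 61 and 53 and the exponent 17 are illustrative assumptions (far too small for real use), the function names are hypothetical, and the three-argument `pow` with a negative exponent (Python 3.8 or later) stands in for the extended Euclidean algorithm.

```python
from math import gcd


def rsa_keypair(p: int, q: int, e: int):
    """Return ((e, n), d) for primes p, q and a chosen encryption exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be relatively prime to (p - 1) * (q - 1)")
    d = pow(e, -1, phi)   # modular inverse: e * d = 1 (mod phi)
    return (e, n), d


def rsa_encrypt(m: int, public) -> int:
    e, n = public
    return pow(m, e, n)   # c = m^e mod n


def rsa_decrypt(c: int, d: int, n: int) -> int:
    return pow(c, d, n)   # m = c^d mod n


public, d = rsa_keypair(61, 53, 17)
c = rsa_encrypt(65, public)
m = rsa_decrypt(c, d, public[1])
```

With these values n = 3233 and d = 2753; the message block 65 encrypts to 2790 and decrypts back to 65.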

To encrypt a message m, it is divided into numerical blocks m_i such that each block has a unique representation modulo n. The encrypted message c will be made up of similarly sized blocks c_i, computed using the formula c_i = m_i^e (mod n); the decryption formula is m_i = c_i^d (mod n). An application of the Euler-Fermat theorem shows that these formulae are correct.

4 IP Spoofing

4.1 Introduction

The TCP/IP protocol suite is widely used by several client-server based protocols. The following attacks take advantage of TCP's reliability and the source addressing in IP, which is used for authentication by several client-server programs.

The Transport Control Protocol (TCP) is the connection-oriented, reliable transport protocol in the TCP/IP suite. Hosts participating in a conversation must establish a connection before any data is exchanged. Using the TCP protocol, the initial connection consists of a three-way handshake. Reliability is achieved in several ways, such as sequencing and acknowledgment. Each segment is assigned a sequence number and is always acknowledged by the other side of the connection. This reliability makes TCP harder to fool, but because many hosts rely for authentication on the source address stored in the IP header, flaws exist and there are simple ways to take advantage of them to gain access.

The Internet Protocol (IP) is the connectionless, unreliable network protocol in the TCP/IP suite. It holds the source and the destination address (32-bit header fields) and it is IP's job to get these packets routed on the network.

4.2 Passive (Blind) IP Spoofing

In the passive IP spoofing attack, the attacker takes over the identity of a host which is "trusted" by the server in order to subvert the security of the target host. It is a blind attack because the attacker never sees the server's replies. The trusted host is disabled by some method (one is described in Section 4.2.2) and the attacking host forges packets to pretend to the server that it is the trusted host, and can then carry on a conversation. The forged packets will reach the server because IP is a connectionless protocol, but the packets sent from the server to the trusted host will be dropped (because the trusted host is disabled). The attacker never sees these packets come back and must guess (correctly) what was sent to the trusted host. The attack uses the host-based trust mechanism of the Berkeley r* commands (rlogin, rsh, rcp), and because the attacker can predict what the server will send in response to what it receives, the attacking host is able to work blind.

4.2.1 TCP Connection and Sequence Numbers

A TCP connection is established by using a three-way handshake, shown in Figure 7. The client (C) selects an initial sequence number (ISNc) and transmits it to the server (S) in step 1. The server acknowledges the receipt of the initial sequence number and sends its own initial sequence number (ISNs). Finally, the client acknowledges the server's sequence number and data transmission may take place between the client and server.

Figure 7: TCP 3-way Handshake

The attack lies in the assumption that one can predict ISNs and therefore impersonate the real client. In Figure 8 the attacker (F) impersonates the client (C) and carries on the handshake with the server (S). During the second step, the server sends its acknowledgment to the client rather than to the impersonator (F), but the impersonator can predict the contents and therefore F can still send the next packet.

Figure 8: TCP 3-way Spoofed Handshake

The impersonator must render the real client inoperative (either by crashing it or SYN flooding it) to ensure that the client (C) will not receive the ACK(ISNF), because on seeing it the client would send a RST to the server, closing the handshake.

This attack succeeds because the initial sequence number can be easily computed. When a computer is bootstrapped, the ISN is initialized to 1. It is subsequently incremented by a fixed amount (say X) every second. Each time a connection is made, the ISN is incremented by half of X. The reasoning behind this scheme is that the ISN counter is a 32-bit number which will wrap every Y hours (if X is 128000, then Y is 9.32 hours) if there are no connections, and this minimizes the chance of receiving data from an older connection that used the same sequence numbers, i.e. the chances are that a packet from a connection 9.32 hours earlier will have been processed properly by this time.
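The wrap-around figure and the predictability of the counter can be checked with a few lines of arithmetic. The function names are illustrative assumptions, and the model is only the simple scheme described above (a fixed increment per second plus half that increment per connection), not any particular TCP implementation.

```python
ISN_SPACE = 2 ** 32  # the ISN counter is 32 bits wide


def wrap_hours(increment_per_second: int) -> float:
    """Hours until the counter wraps if no connections are made."""
    return ISN_SPACE / increment_per_second / 3600


def predict_isn(sampled_isn: int, seconds_elapsed: int,
                increment_per_second: int, connections: int = 0) -> int:
    """Estimate the ISN some seconds after a sampled value."""
    step = increment_per_second * seconds_elapsed
    step += (increment_per_second // 2) * connections  # half of X per connection
    return (sampled_isn + step) % ISN_SPACE


hours = wrap_hours(128000)
```

Here `hours` is about 9.32, matching the figure in the text, and `predict_isn` shows why one sampled ISN plus an estimate of elapsed time is enough for an attacker to guess the next one.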

To get an idea of where the server's sequence number currently lies, the attacker needs to connect to one of the server's ports and complete a three-way handshake. By completing the handshake, the server's ISN (ISNs) can be recorded, and typically this is done several times to get a good sampling of the server's ISNs. The round-trip time also needs to be calculated, to estimate how much time passes between the sampling of the server's ISN by the probe and the attack itself, and thus what the increment should be. At this point the attacker has an idea of what the ISN will be and can initiate an attack by sending the faked sequence. The server will take an action depending on the value of the predicted ISN and the true ISNs. If the predicted ISN is equal to ISNs, the packet is placed on the receive buffer and the attack can continue. If it is less than ISNs, the packet is considered to be a retransmission and is therefore discarded. If it is greater than ISNs, but still within the bounds of the receive window, the packet is considered to contain future bytes and is stored until the missing bytes arrive. If it is outside the bounds of the receive window, the packet is dropped and TCP will send the expected sequence number to the real host.
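The four outcomes can be written as a small decision function. This is a simplified model, not a TCP implementation: the function name and return labels are assumptions, and treating any value in the half of the sequence space "behind" the expected number as a retransmission is one way of handling 32-bit wrap-around.

```python
SEQ_SPACE = 2 ** 32


def classify_sequence(seq: int, expected: int, window: int) -> str:
    """Decide what the receiver does with a segment carrying `seq`."""
    offset = (seq - expected) % SEQ_SPACE
    if offset == 0:
        return "accept"            # placed on the receive buffer
    if offset >= SEQ_SPACE // 2:
        return "retransmission"    # behind the expected number: discarded
    if offset < window:
        return "hold"              # future bytes, kept until the gap is filled
    return "drop"                  # outside the window: expected number is re-sent
```

For example, with an expected number of 1000 and a window of 4096, a guess of 2000 is held while a guess of 5096 is dropped.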

4.2.2 Trusted Host and the SYN Attack

The trusted host is trusted because of the mechanisms in several of the Berkeley r* commands, which have become de facto standards for many vendors (and not just on UNIX systems). A trust relationship is set up between the two machines to allow a user on host A to log in to host B without having to type in a password. The authentication is done by using rhosts authentication; that is, host/user pairs can be listed in a .rhosts file, or hosts in /etc/hosts.equiv. The authentication therefore relies on the source IP address, which can be easily forged.

Pretending to be the trusted host can be achieved by rendering that host unusable so that it cannot answer any TCP transactions. The most common way to disable a host is to use a denial of service attack called TCP SYN flooding. There is a limit to how many concurrent SYN requests TCP can process for a given socket because these incoming requests are held in a queue of limited length. This queue limit applies to both complete and incomplete connections, and once it is reached, TCP will silently drop all new incoming SYN requests until the pending connections are dealt with. While the host is dropping packets, the attack can continue and subsequent acknowledgments sent to the trusted host will not be noticed; therefore no RST (reset) message will be sent to the server.

If we take the example of the rsh protocol shown in Figure 5, we can have host calvin.cs.mcgill.ca executing the spoof on lisa. First, opus must be rendered inoperable by using the TCP SYN attack. Once that has been accomplished, the conversation will take the form shown in Figure 9.

CALVIN   USER=bob, HOST=opus.cs.mcgill.ca
LISA     check .rhosts (opus.cs.mcgill.ca bob): REMOTEUSER=bob, REMOTEHOST=opus.cs.mcgill.ca; match found, remote shell granted
CALVIN   CMD=rm -R *
LISA     execute CMD

Figure 9: RSH Protocol Conversation

On the second line, lisa believes that the connection came from opus.cs.mcgill.ca because calvin set up the parameters in the IP layer. Even though lisa sends the message that the remote shell is granted to opus, calvin assumes that this step was taken (since the protocol specifies the set of responses that can be sent) and can issue the command to remove all the files. Lisa will take the command, still assuming that the connection came in legitimately, and happily execute it, removing all of Bob's files. Other examples of IP spoofing can be found in [Dae96].

4.3 Active IP Spoofing

The active IP spoofing attack is a simple variation of the passive attack in which the attacking host creates a desynchronized state on both ends of a TCP connection and then creates acceptable packets which mimic packets that could have been sent by either side. A desynchronized state is one in which both ends of the connection are in the established state, but no data is being exchanged. This happens when sequence numbers are not what is expected. As was shown in Figure 7, the server and client need to acknowledge each packet (and sequence number). If the server sends a sequence number which the client does not expect, or vice versa, the packet will either be held until the missing packets are received, if the sequence number is within the receive window, or be dropped, if it is outside of the window.

4.3.1 Desynchronization and the Attack

During the desynchronization, the client machine will be sending packets to the server, but if the sequence number is outside of the server's receive window the packets will be dropped. The attacker takes advantage of this and sends the same packet, but changes the sequence numbers (and checksums) to match those the server expects. As this is acceptable to the server, the data is processed. This attack relies on the fact that the attacking host can listen in on all packets exchanged between the client and the server and that the attacking host can forge any IP packet to pretend to be the client or the server. It was shown in Section 4.2 that IP packets can be easily spoofed.

Desynchronization can be achieved when the initial TCP connection is being made or by sending large amounts of data to the client and the server. In the first case the attacker listens for the SYN(ISNs), ACK(ISNc) packet from the server. At this point the client receives the packet and puts itself in the established state. The attacking machine sends a RST packet to the server and reopens a connection with its own ISNF. The server closes the first connection and reopens a new connection with the same parameters except for the new sequence number. The server will send a new SYN(ISNs), ACK(ISNF) to the client machine and the attacker sends the ACK packet, placing the server into the established state. The client and server are now in the desynchronized state and the attacker can forge packets claiming to be from either host. The second method consists of the attacker sending a large amount of data to the server and to the client. This will change the sequence numbers on the client and server and place the two machines in a desynchronized state.

This attack creates a TCP ACK storm (the generation of a large number of TCP ACK packets, shown in Figure 10) because a host will send, to the sender of the bad packet, the correct (expected) sequence number (Line 2 in Figure 10). This packet is itself unacceptable and therefore the host receiving it will send its own expected sequence number (Line 3 in Figure 10). In principle this creates an endless loop for every data packet sent. Fortunately, TCP uses IP, which is an unreliable protocol, and it may drop these packets. As these packets do not contain data, they are not retransmitted, so for every dropped packet a loop is broken.

C -> S : SYN(ISNc)
S -> C : SYN(ISNs), ACK(ISNc, Expected ISNc)
C -> S : SYN(ISNc), ACK(ISNs, Expected ISNs)
S -> C : SYN(ISNs), ACK(ISNc, Expected ISNc)

Figure 10: TCP ACK Storm

4.4 Preventative Measures

The key to both attacks is the rate of increment for the sequence number. Bellovin suggests that the increment be randomized by a real random number generator rather than a pseudo-random number generator which is often easily invertible.

These attacks rely on the fact that the attacker has a super user account on the attacking machine, which is not trusted by the attacked machine. The super user is a special user which has the ability to access all files and peripherals on a computer. This super user account is typically used for maintenance of the computer and therefore it has the ability to view and modify all data on its disks and on the network it is connected to.

Forging packets for programs such as sendmail and rlogin requires the ability to open an "assigned port" (ports 25 and 513 respectively) on the attacking machine, and this can only be achieved using the super user account. The attacking host can be on the local network or anywhere on the connected Internet.

Most of these attacks are attempted from outside of a subnet because most machines in the local subnet are controlled and can be trusted. Therefore filtering packets at the gateway can reduce the chances of the attacks succeeding. The gateway should automatically reject all packets arriving from outside which claim to come from inside the network. This works because a legitimate inside packet never needs to travel to the outside and come back in to reach a host on the inside. However, this is of no value against attacks initiated from the inside.
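The gateway rule can be sketched as a single predicate. The function name, its parameters and the example addresses are assumptions made for illustration; a real gateway would express this in its own packet-filter configuration.

```python
import ipaddress


def reject_at_gateway(source: str, internal_network: str,
                      arrived_from_outside: bool) -> bool:
    """True if a packet should be rejected as a likely spoof.

    A packet arriving on the external side must not carry a source
    address belonging to the internal network.
    """
    src = ipaddress.ip_address(source)
    net = ipaddress.ip_network(internal_network)
    return arrived_from_outside and src in net
```

With an internal network of 192.0.2.0/24, a packet from 192.0.2.7 arriving from outside is rejected, whereas the same source address seen on the inside is allowed, which is why this filter does nothing against inside attackers.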

The solution for internal attacks seems to be to use some form of authentication and encryption. A solution at the internet layer will become available once IPv6 is deployed, through the Authentication Header and Encapsulating Security Payload. In the current Internet, only application layer programs offer this kind of authentication and encryption. This level of authentication and encryption needs to be set up for each port that we want to protect. For example, if we needed to protect sendmail, rlogin, or ftp, we would need to set up each of these separately and we would also require special programs to be able to communicate properly with the new protected service. Two such packages are discussed in the following sections on the Secure Shell and the Secure Sockets Layer (SSL).

5 IP Layer Security Architecture

The security mechanisms provided by the IP security architecture can be applied to either version of the Internet Protocol (4 or 6), as an option or as an extension header, respectively. Even though version 4 can use this extra security in the IP layer, it will cost more because of the way options are implemented in IPv4. Each host that must examine the IPv4 datagram must process all options, whereas under IPv6, the extension headers are examined only by hosts that "need" the extra information. The two headers that are used to provide security services at the IP layer are the Authentication Header and the IP Encapsulating Security Payload. Typical uses of these headers are discussed in the following paragraphs, but this is not an exhaustive list. The two headers can be used either independently or in conjunction with each other.

The IP Authentication Header (AH) provides integrity and authentication without confidentiality, which allows its implementation to be widely accepted on the worldwide Internet, even in locations where the import, export, and use of encryption is regulated. This extension can be implemented between two or more hosts, between two or more gateways, and between a host or gateway and a set of hosts or gateways.

One use is to have a "security gateway" (Figure 11) which establishes security associations between hosts or gateways in a subnetwork, where these hosts are trusted not to engage in passive or active attacks and the underlying hardware is not being attacked, and external untrusted systems. In this case only the gateway needs to implement the authentication header, since the associations are made to the gateway and not to the trusted hosts. The trusted hosts may take advantage of the services provided by the security gateway. This is analogous to the use of firewalls in the current Internet. The security gateway must perform address-based IP packet filtering on unauthenticated packets claiming to come from a system known to use IP security.

Figure 11: Security Gateway and the Trusted Subnetwork

The IP Encapsulating Security Payload (ESP) is designed to provide integrity, authentication and confidentiality to IP datagrams. The header can be implemented in the same way as the Authentication Header. The authentication and encryption/decryption can be performed between two hosts, between two or more gateways, or between a host or gateway and a set of hosts or gateways. A "security gateway" can be implemented to provide services for the trusted hosts. Gateway-to-gateway encryption is probably the most valuable for building virtual private networks (Figure 12) across untrusted backbones. Packets from subnetwork 0 can be encrypted at security gateway 0 and forwarded to security gateway 1, where they can be decrypted and forwarded to the host(s) in subnetwork 1. Even though this is a good way of excluding outsiders, it is not a substitute for host-to-host encryption, as there is no guarantee that the trusted subnetwork will always be trustworthy. To ensure authentication, this header must be used together with the IP Authentication Header, which provides end-to-end integrity. If there are no security gateways in the path of the connection, the two hosts that implement the ESP may use it to encrypt user data (TCP or UDP) during their communications.

Figure 12: Virtual Private Network

5.1 Security Associations

The fundamental concept of "Security Associations" is needed by both headers to function properly. The combination of the destination address and an opaque index, called the Security Parameters Index (SPI), uniquely identifies a particular "Security Association". Other parameters can be added to the "Security Association" by including other indices or parameters, but only the SPI and destination address are required to identify the association. Some parameters that may need to be included in most implementations (if not all) are:

• the authentication algorithm and mode used with the IP Authentication Header;

• the key(s) used with the authentication algorithm (AH);

• the encryption algorithm, algorithm mode, and transform being used with the IP Encapsulating Security Payload;

• the key(s) used with the encryption algorithm (ESP);

• the presence or absence and size of a cryptographic synchronization or initialization vector for the encryption algorithm (ESP);

• the authentication algorithm used with the ESP transform (if used);

• the key(s) for the authentication algorithm used with the ESP transform (if used);

• the lifetime of the key or time when regeneration of a new key should occur (ESP and AH);

• the lifetime of the Security Association (ESP and AH);

• the source address(es) of the Security Association (ESP and AH); and

• the sensitivity level (eg. secret or unclassified) of the protected data.

The first five parameters must be specified for the implementation of either header. Of the remaining six parameters, only the last must be specified, and then only by systems claiming to provide multi-level security. The other parameters are only recommended, but the full implications must be understood before omitting them.

When a connection is made, the source host uses the source user id and destination address to select an appropriate Security Association. The destination host uses the SPI value and destination address to identify the association. An association is one-way, so there are usually two SPIs in a communication, one for each direction. The value of the SPI is normally chosen by the destination side of the communication, so for unicast traffic there is no potential for conflicts between manually configured Security Associations and automatically configured (eg. through a key management protocol) associations.
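
The receiver-side lookup described above can be pictured as a table keyed by the pair (SPI, destination address). The following is a minimal sketch under that assumption; the class and field names are hypothetical and only two of the parameters listed earlier are stored.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass(frozen=True)
class SecurityAssociation:
    spi: int              # 32-bit Security Parameters Index
    destination: str      # destination address of this one-way association
    auth_algorithm: str   # e.g. "keyed-md5"
    auth_key: bytes

class SATable:
    """Associations indexed by (SPI, destination), the identifying pair."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, str], SecurityAssociation] = {}

    def add(self, sa: SecurityAssociation) -> None:
        if not 0 <= sa.spi <= 0xFFFFFFFF:
            raise ValueError("SPI must fit in 32 bits")
        self._entries[(sa.spi, sa.destination)] = sa

    def lookup(self, spi: int, destination: str) -> Optional[SecurityAssociation]:
        # Returns None when no association exists for this pair.
        return self._entries.get((spi, destination))
```

Because each association is one-way, a conversation between two hosts would normally hold two entries, one per direction, each with the SPI chosen by its own destination.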

In the case of multicast traffic, the destination system is the destination multicast group, and the value of the SPI must be communicated to the members of the group through some means. If a single SPI value is shared by all the senders of the multicast group, there is no way to authenticate the individual sender, but the receiver knows that the data came from a system that knows the values of the Security Association. However, if sender authentication is needed, each of the senders can have a separate Security Association (as well as the group Security Association).

5.2 The IP Authentication Header

The authentication header holds the authentication information for its IP datagram. Using a secret key and a cryptographic function over the IP datagram, the packet can be authenticated. When the sender needs to send an authenticated IP packet, the authentication data is computed before fragmentation takes place. Certain fields in the header, such as the TTL (IPv4) and Hop Limit (IPv6), must change during transit and are therefore excluded from the calculation. An authenticated IPv6 header is shown in Figure 13.

Figure 13: Example of an Authenticated IPv6 Header

The authentication header (shown in Figure 14) consists of 5 fields. The NEXT HEADER field (8 bits) identifies the next extension header after the authentication payload. The LENGTH field (8 bits) gives the length, in 32-bit words, of the AUTHENTICATION DATA field. If this field is set to zero words, it indicates the use of a "null" authentication algorithm (this is the degenerate case). The 16-bit RESERVED field is for future use and in the current implementation will be zeroed. It is still included in the calculation of the authentication data, but is otherwise unused at the receiver's end. The SECURITY PARAMETER INDEX identifies the security association for this datagram. The values 1-255 are reserved for future use by the Internet Assigned Numbers Authority (IANA). The value of 0 indicates that no security association exists. The AUTHENTICATION DATA field is of variable length (as indicated by the LENGTH field) and its contents depend on the algorithm used for the authentication.

Figure 14: IP Authentication Header Format
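
The field layout just described can be made concrete with a short packing and parsing sketch. It assumes the fields appear in the order given in the text (next header, length, reserved, SPI, authentication data) in network byte order; the function names are illustrative only.

```python
import struct

# Next header (8 bits), length in 32-bit words (8 bits),
# reserved (16 bits), SPI (32 bits), all in network byte order.
AH_FIXED = struct.Struct("!BBHI")

def build_ah(next_header: int, spi: int, auth_data: bytes) -> bytes:
    if len(auth_data) % 4 or len(auth_data) // 4 > 255:
        raise ValueError("authentication data must be 0..255 whole 32-bit words")
    # The reserved field is zeroed, as the text requires.
    return AH_FIXED.pack(next_header, len(auth_data) // 4, 0, spi) + auth_data

def parse_ah(raw: bytes):
    next_header, length, _reserved, spi = AH_FIXED.unpack_from(raw)
    auth_data = raw[AH_FIXED.size:AH_FIXED.size + 4 * length]
    if len(auth_data) != 4 * length:
        raise ValueError("truncated authentication data")
    return next_header, spi, auth_data
```

With a 128-bit digest the LENGTH field is 4, and a LENGTH of 0 leaves only the eight fixed bytes, which corresponds to the "null" algorithm case.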

The authentication header increases the processing costs for systems participating in the authenticated communication, and it increases the latency between the sender and receiver. The latency is due to the sender calculating the authentication data and the receiver verifying it. The steps for the sender to calculate the authentication data are as follows:

• the security association is chosen according to the security policy;

• fields modified in transit (ie. TTL, hop limit) are replaced by the value 0;

• the authentication data is computed according to the chosen algorithm;

• the authentication data is placed into the datagram.

When the datagram is received by the destination host, the following steps are taken to authenticate the datagram:

• the security policy of the incoming datagram is checked;

• using the SPI value, the security association is determined;

• fields modified in transit (ie. TTL, hop limit) are replaced by the value 0;

• the authentication data is computed according to the chosen algorithm;

• the computed authentication data is compared to the value carried in the datagram's authentication data field.
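
The sender and receiver steps above can be sketched as follows. This is a simplified illustration, not the standardized transform: it assumes a digest of the form MD5(key, datagram, key) without the padding rules of the real keyed MD5 specification, it carries the authentication data separately from the datagram, and the parameter `mutable_offsets` (byte positions of fields that change in transit) is a name introduced here.

```python
import hashlib
import hmac
from typing import Iterable

def auth_data(key: bytes, datagram: bytes, mutable_offsets: Iterable[int]) -> bytes:
    """Keyed MD5 over the datagram with in-transit fields replaced by 0."""
    buf = bytearray(datagram)
    for offset in mutable_offsets:
        buf[offset] = 0
    # Simplified keyed digest (assumption): key, data, key with no padding.
    return hashlib.md5(key + bytes(buf) + key).digest()

def verify(key: bytes, datagram: bytes, mutable_offsets: Iterable[int],
           received: bytes) -> bool:
    """Receiver side: recompute and compare in constant time."""
    return hmac.compare_digest(auth_data(key, datagram, mutable_offsets), received)
```

Zeroing the TTL or hop limit on both sides is what lets routers decrement it without invalidating the authentication data, while any change to the remaining bytes is detected.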

The default algorithm used to calculate the authentication data is the keyed MD5 algorithm with a 128-bit key. The authentication header is a requirement of IPv6, but optional in IPv4.

Non-repudiation is not provided by the keyed MD5 algorithm; if this is an issue, other algorithms should be used in addition to the MD5 algorithm.

This header will have to be included in all implementations of IPv6 and must be used by all IPv6 capable hosts. However, under IPv4 this header is still optional and there are currently no implementations of the authentication header in use. The use of this header will help eliminate several network attacks such as IP spoofing or host masquerade (see section 4). Because the header is in the Internet layer, it will be available to any layer above it, such as the transport layer or the application layer. Under IPv4, where the header is not in general use, applications that provide authentication have to be located above the Internet layer and must use their own algorithms and implementations.

The authentication header does not provide confidentiality to the datagrams; therefore, the Encapsulating Security Payload must be used to provide this functionality.

5.3 The IP Encapsulating Security Payload

The Encapsulating Security Payload (ESP) provides integrity, authentication and confidentiality to IP datagrams by encrypting the data of the IP datagram with a specified algorithm. The ESP may appear anywhere after the IP header and before the transport-layer protocol, as shown in Figure 15.

Figure 15: IPv6 Datagram using ESP

There are two modes in which the ESP can be used: tunnel mode and transport mode. Tunnel mode ESP encapsulates the entire IP datagram in the ESP. Transport mode ESP encapsulates the upper layer protocol (eg. TCP, UDP) in the ESP and then prepends a clear text IPv6 header, so that most of the ESP contents are encrypted.

ESP can be implemented between two hosts, between a host and a security gateway, or between security gateways. The use of ESP between security gateways allows the construction of a virtual private network in which data is encrypted only while it crosses the untrusted parts of the network and remains unencrypted on the local trusted network, thus saving on the performance and monetary costs of encryption. Transport mode ESP should be used when two hosts are communicating securely, to reduce the bandwidth usage and cost of protocol processing for users who do not need the entire IP datagram to be kept confidential.

The encapsulating security payload consists of two fields (as shown in Figure 16). The first field, which is 32 bits wide, identifies the security association for the datagram and is generally known as the SPI. This is the only field which is unencrypted and transform independent. The transform used for the encryption dictates the contents of the second field, the Opaque Transform Data.

Security Association Identifier (SPI) (32 bits)

Opaque Transform Data (variable length)

Figure 16: IP Encapsulating Security Payload Format

The processing costs of the ESP can have a noticeable impact on network performance between participating systems, but should not adversely affect routers or intermediate systems which are not participating in the security association. In general the use of the ESP will require more processing power and more complex protocol processing, which will lead to increased communication latency. This latency can be attributed to the encrypting and decrypting of each IP datagram containing the ESP. The cost of this extra latency will depend on the encryption algorithm, the key size, and other factors which are specific to the implementation. Therefore, hardware encryption would be a cost-effective solution for systems requiring high throughput.

The steps that the sender must follow for tunnel mode ESP are as follows:

• the security association is chosen according to the security policy;

• the original datagram is encapsulated into the ESP;

• the appropriate encryption transform is applied to the ESP;

• the encrypted ESP is encapsulated in a clear text IP datagram.

On receiving the datagram, the destination system follows these steps:

• the security policy of the incoming datagram is checked;

• using the SPI value, the security association is determined;

• the clear text IP header and optional payloads are stripped off;

• the appropriate decryption transform is applied to the encrypted ESP.

In transport mode, the steps are taken in a similar fashion with some minor modifications. The steps are as follows:

• the security association is chosen according to the security policy;

• the original transport layer frame is encapsulated into the ESP;

• the appropriate encryption transform is applied to the ESP;

• the encrypted ESP is encapsulated in a clear text IP datagram.

On receiving the datagram, the destination system follows these steps:

• the security policy of the incoming datagram is checked;

• using the SPI value, the security association is determined;

• the appropriate decryption transform is applied to the encrypted ESP;

• if the decryption succeeds, the transport-layer frame is removed from the ESP.
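
The difference between the two modes can be seen in a small sketch of the steps listed above. Several things here are assumptions made only for illustration: IP headers are treated as opaque byte strings, `toy_transform` is an insecure repeating-key XOR standing in for the real encryption transform (it is not DES-CBC), and the two-byte length prefix used to separate the inner header from its payload is a framing convenience that a real implementation would not need.

```python
import struct
from dataclasses import dataclass
from typing import Dict

@dataclass
class Datagram:
    header: bytes    # clear text IP header, treated as opaque bytes
    payload: bytes   # transport-layer frame, or an ESP

def toy_transform(key: bytes, data: bytes) -> bytes:
    """Placeholder transform: repeating-key XOR (insecure, self-inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def esp_wrap(spi: int, key: bytes, cleartext: bytes) -> bytes:
    # The SPI stays unencrypted; the rest is opaque transform data.
    return struct.pack("!I", spi) + toy_transform(key, cleartext)

def esp_unwrap(keys: Dict[int, bytes], esp: bytes) -> bytes:
    (spi,) = struct.unpack_from("!I", esp)
    return toy_transform(keys[spi], esp[4:])  # KeyError: no association

def tunnel_send(d: Datagram, outer_header: bytes, spi: int, key: bytes) -> Datagram:
    # The whole original datagram, header included, goes inside the ESP.
    inner = struct.pack("!H", len(d.header)) + d.header + d.payload
    return Datagram(outer_header, esp_wrap(spi, key, inner))

def tunnel_receive(d: Datagram, keys: Dict[int, bytes]) -> Datagram:
    inner = esp_unwrap(keys, d.payload)
    (hlen,) = struct.unpack_from("!H", inner)
    return Datagram(inner[2:2 + hlen], inner[2 + hlen:])

def transport_send(d: Datagram, spi: int, key: bytes) -> Datagram:
    # Only the transport-layer frame is protected; the header stays clear.
    return Datagram(d.header, esp_wrap(spi, key, d.payload))

def transport_receive(d: Datagram, keys: Dict[int, bytes]) -> Datagram:
    return Datagram(d.header, esp_unwrap(keys, d.payload))
```

In tunnel mode the original header travels inside the protected data, so an observer on the path sees only the outer header, whereas in transport mode the original header remains visible and less data has to be processed.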

The DES-CBC algorithm is the default transform that all ESP implementations must include, but additional transforms may be used. Several other transforms are specified for use with the ESP, such as Triple DES, DES-CBC plus MD5, and a replay prevention security transform.

5.4 Key Management

There is no Internet Standard for a key management protocol at this time, and therefore the key management protocol that will be used with the IP layer security has not been specified. Because key management is coupled to the Authentication Header and Encapsulating Security Payload only through the Security Parameters Index, it can be completely decoupled from the specification of the IP layer security. The benefit of this design is that the key management part can be replaced with newer or better key management protocols when they are specified.

Several methods can be used for the key management system, including manual key configuration and public-key systems. Work is ongoing within the IETF to specify an Internet standard key management protocol which will work with IP layer security.

One form of key management is manual key management, where someone manually configures each system with its own key and the keys of the systems which will be communicating with it. This technique can only be used in a small and static environment, because it becomes difficult and time consuming to maintain on a larger scale. This is the system that will be used in the near future during the beta testing phase of IPv6, but with widespread deployment of the security mechanisms, an Internet standard scalable key management protocol will be needed.

The Simple Key-Management for Internet Protocols (SKIP) is being proposed as this Internet standard. SKIP was designed to work with the IP security layer. More information on this standard can be found in [AMPg?]. The security of the keys is a large issue that needs to be addressed. One suggestion is that the keys reside in hardware and that the OS refer to each key by an index into the hardware.

5.5 Combination of Security Mechanisms

The two mechanisms provide for authentication, integrity and confidentiality, but these are different services. The Authentication Header guarantees that the datagram comes from the right origin and that it has not been tampered with during delivery. The Encapsulating Security Payload guarantees that the message is not revealed to other parties. The DES-CBC plus MD5 transform provides all of these services; if such a transform is not available, the two security mechanisms can be used together to provide them.

The authentication header always provides integrity and authentication, but non-repudiation can only be provided if another algorithm, such as RSA, is used. Adding the AH to an IP datagram prior to encapsulating the datagram with the ESP can therefore provide authentication, integrity and confidentiality.

6 Weaknesses in the IP Layer Security Architecture

6.1 Introduction

Although the use of these mechanisms to provide authentication and encryption will increase the level of Internet security, there are still a number of attacks against various versions and/or implementations of the protocols. Using these attacks, confidentiality may be compromised and attackers can transmit phony data, rendering these mechanisms useless. Most of the attacks outlined in this section exploit intrinsic properties of the encryption modes used, the lack of integrity checking in some of the security transforms, and the use of host-pair keys. The main issue is that decrypting a message with the wrong key does not necessarily produce garbage. Integrity checking is needed to make sure that these attacks cannot be used.

When using the encapsulating security payload, the ciphers used fall into two categories: block ciphers and stream ciphers. The block cipher used is DES in cipher block chaining (CBC) mode, which operates by encrypting the exclusive-OR of a plaintext block with the previous ciphertext block. The first block is encrypted using an initialization vector (IV), which is determined by both parties before the communication is initiated. The initialization vector can be constant or non-constant, but a non-constant IV can help disguise common prefixes. However, in the case of the ESP, the first encrypted block will usually be a TCP, UDP, or IP header, which will almost always vary enough that the choice of IV is inconsequential. Decryption of the ciphertext is the inverse operation of encryption.

A property of CBC mode encryption is that the prefix of a CBC encryption is the encryption of the prefix. For example, for a stream of ciphertext $\langle C_1, \ldots, C_i, \ldots, C_n \rangle$, the sequence $\langle C_1, \ldots, C_i \rangle$ is the encryption of $\langle P_1, \ldots, P_i \rangle$. An attacker can therefore launch an attack using a truncated stream of ciphertext. Generalizing this property, any section $\langle C_i, \ldots, C_j \rangle$ of the ciphertext is a valid CBC encryption of $\langle P_i, \ldots, P_j \rangle$ as long as the IV can be set to $C_{i-1}$. This allows the attacker to take any section of the encrypted message. The last property that we can take advantage of is that CBC has limited error propagation. That is, if a cipher block is corrupted (either deliberately or accidentally), only this block and the following block are damaged. Using these three properties together, a cut-and-paste operation can be used to attack a host. Cipher blocks from different messages encrypted under the same key can be combined to make a new message, and only the block immediately following the splice point will be garbled upon decryption.
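
The three CBC properties can be checked with a small experiment. Since the properties depend only on the chaining mode, the sketch below substitutes a toy 8-byte Feistel cipher built from SHA-256 for DES; this cipher, the key, and the messages are assumptions made for illustration and are not suitable for real use. Block indices in the code start at 0.

```python
import hashlib

BLOCK = 8  # block size in bytes, as in DES

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = encrypt_block(key, _xor(p, prev))
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(_xor(decrypt_block(key, c), prev))
        prev = c
    return out

key, iv = b"host-pair key", bytes(BLOCK)
msg_a = [b"AAAAAAA%d" % i for i in range(5)]
msg_b = [b"BBBBBBB%d" % i for i in range(5)]
ct_a = cbc_encrypt(key, iv, msg_a)
ct_b = cbc_encrypt(key, iv, msg_b)

# Prefix property: a truncated ciphertext decrypts to the plaintext prefix.
prefix = cbc_decrypt(key, iv, ct_a[:3])
# Section property: blocks 2..3 decrypt correctly when block 1 is the IV.
section = cbc_decrypt(key, ct_a[1], ct_a[2:4])
# Cut and paste: start of message A spliced onto the end of message B.
spliced = cbc_decrypt(key, iv, ct_a[:2] + ct_b[2:])
```

In the spliced result the first two blocks come from message A, the block at the splice point is garbled, and the remaining blocks are the intact plaintext of message B, which is the behaviour the attacks in the next section rely on.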

The stream cipher operates on one byte at a time, by generating a stream of key bytes which are exclusive-ORed with the plaintext. Any section of the ciphertext bytes can be decrypted independently of the rest of the message as long as a correct starting point is used. Error propagation does not exist, i.e. if a ciphertext byte is damaged, only this byte is damaged, in a predictable way, and the rest of the bytes remain intact.
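
The predictable damage mentioned above means that an attacker who knows or guesses a plaintext byte can change it to any value without knowing the key. The sketch below illustrates this with a toy keystream generator built from SHA-256; the generator, key, and message are assumptions chosen for the example and do not correspond to any cipher named in this thesis.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 of the key and a block counter (illustrative only)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the keystream is its own inverse."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))

key = b"shared secret"
ciphertext = bytearray(stream_xor(key, b"PAY 100 TO BOB"))

# The attacker flips only the bits that turn '1' into '9' at offset 4,
# without knowing the key or the keystream.
ciphertext[4] ^= ord("1") ^ ord("9")
tampered = stream_xor(key, bytes(ciphertext))
```

Only the targeted byte changes after decryption, so without an integrity check the receiver has no way to notice the modification.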

6.2 The Attacks

Most of the following attacks rely on the use of ESP without the use of AH. The key used in the communications between two hosts is always the same (host-pair keying). The attacker may have a legitimate user account on one or both computers involved in the attack, but does not have privileged (superuser) access. The attacker also has the ability to read, modify, delete, or inject packets.

6.2.1 Reading Encrypted Data

The ability to read other people's messages can be achieved in the following manner. A legitimate message is sent from user $D_A$ to user $E_B$ on machines A and B, respectively. The attacker X sends a message using UDP from machine A to B, between the accounts denoted $X_A$ and $X_B$ in Figure 17. The legitimate message is taken, its unencrypted part is discarded, and the remaining blocks, together with the padding necessary to make the lengths match, are inserted into the forged message (see Figure 17). The legitimate data is represented in the bold boxes and the dashed boxes denote encrypted data. The body of the message will be readable by the attacker because UDP checksums can be turned off in IPv4, and under IPv6 only $2^{16}$ tries are needed on average to fool the checksum (which is small given current computer speeds).

Figure 17: Cut and Paste Attack (panels: Legitimate Monitored Data; Reinjected Forged Data)
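
The small number of tries needed to fool the checksum follows from its width: the Internet checksum used by UDP is only 16 bits, so a forged packet has a chance of about 1 in 65536 of passing by accident. A sketch of the standard one's complement checksum is given below; it covers only the bytes passed in and omits the pseudo-header that a real UDP implementation also includes.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:           # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF
```

Since the result can take only 65536 distinct values, an attacker who can vary some bytes of the forged packet can simply keep trying until the checksum matches.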

6.2.2 Session Hijacking

A similar technique can be used to inject new data into a legitimate user's encrypted session. Again the attacker monitors a legitimate message between $D_A$ and $E_B$ and creates his own UDP message with data that he wants to insert into the legitimate user's stream (Figure 18). If the legitimate packet is deleted, the new data can be substituted for the payload; otherwise, a new packet can be constructed and reinjected into the stream. The use of TCP aids this attack because TCP does not have a length field and it will accept a packet that repeats data already received together with a new section. The CBC pad is used to ensure that the CBC decryption does not damage the spoofed data. A related attack involves text being sent which the attacker can cut at the proper CBC boundary and reinject onto the network with a new IP header. The last part of the data (ckfk in the figure) can be used to restore the user's prompt to the known state before the session hijacking is attempted.

Figure 18: Session Hijacking Attack (panels: Legitimate Monitored Data; Reinjected Forged Data)

6.3 Preventative Measures

The previous section outlined a couple of attacks, but this is not an exhaustive list. However, measures can be taken to ensure that these attacks cannot be used against users. If the integrity of the message is properly checked, the message cannot be cut apart undetected. Integrity checking of the message is the key defense against these attacks, and this checking should use acceptably strong cryptographic techniques. A key assumption made above was that the authentication header was not used; the authentication header uses transforms which protect the integrity of the message.

The second assumption we made was the use of host-pair keying. Using a connection-based keying system, we can avoid reusing keying material for more than one connection. This improves the security because each connection has a different key, so the attacker can no longer use his own connection to obtain data that can be inserted into the legitimate connection.

It is clear from the attacks that encryption without integrity checking is useless. For the security protocols to be of any use to the Internet community, the two security mechanisms should therefore be used in combination.

7 Secure Sockets Layer (SSL)

7.1 Introduction

The Secure Sockets Layer (SSL) is a security protocol aimed at preventing eavesdropping, tampering, or message forgery over the Internet. This protocol is currently an Internet Draft (a working document of the Internet Engineering Task Force). The primary goal of this protocol is to provide privacy and reliability between two communicating applications. The protocol consists of two layers, the SSL Record Protocol and the SSL Handshake Protocol. The SSL Record Protocol is a lower level protocol which is layered on top of some reliable transport protocol. It is used to encapsulate other higher level protocols, one of which is the SSL Handshake Protocol. The SSL Handshake Protocol allows the client and the server to authenticate each other and then negotiate an encryption method (encryption algorithm and cryptographic keys) before the application begins its communications. Conceptually, the SSL Protocol sits between the transport layer and the application layer (shown in Figure 19).

Figure 19: Secure Sockets Layer

The SSL protocol initially defines a secret key which is used with a symmetric cryptographic algorithm (DES, RC4, ...) to provide for privacy. The peer is authenticated by using an asymmetric, or public key, cryptographic algorithm (RSA, DSS, ...). The integrity of the message is guaranteed by using a keyed message authentication code (MAC) (SHA, MD5, ...).

7.2 The Secure Sockets Layer Protocol

7.2.1 SSL Record Layer

The SSL Record Layer provides confidentiality, authenticity, and replay protection, and communicates over a connection-oriented reliable transport protocol, such as TCP. The main function of the record layer is to receive uninterpreted data from the higher layers, operate on it, and finally transmit it to the other end of the connection. The data is fragmented into records of $2^{14}$ bytes (or less). The record layer then compresses/uncompresses and encrypts/decrypts this data.

All records are compressed using the compression algorithm of the current state of the protocol. A compression algorithm is always defined in all states, although the initial algorithm is the null algorithm. The compression must be lossless and may not increase the content length by more than 1024 bytes. On the decompression side, an error can be flagged on receiving a record which decompresses to more than $2^{14}$ bytes.

Algorithms for encryption and message authentication codes are defined by the cipher specification of the current state of the protocol. These algorithms are used to protect all the records that are sent between the client and server. The SSL Handshake Protocol is used to establish the shared secrets which are used to encrypt records and to compute message authentication codes on the contents of the records. The encryption and MAC algorithms operate on the compressed data from the previous step. To protect the transmissions from extra messages, altered messages, or missing messages, a sequence number is included. Decryption is processed in the reverse order to yield the message in the compressed format.
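
The record layer pipeline described above (fragment, compress, authenticate with a sequence number) can be sketched as follows. This is an illustration under stated assumptions rather than the SSL record format: HMAC-SHA1 stands in for the protocol's own MAC construction, zlib stands in for a negotiated compression algorithm, the sequence number is taken from the record's position instead of being transmitted, and the encryption step is omitted.

```python
import hashlib
import hmac
import zlib
from typing import List, Tuple

MAX_FRAGMENT = 2 ** 14  # maximum record payload in bytes

def _mac(mac_key: bytes, seq: int, body: bytes) -> bytes:
    return hmac.new(mac_key, seq.to_bytes(8, "big") + body, hashlib.sha1).digest()

def protect(data: bytes, mac_key: bytes, compress: bool = False) -> List[Tuple[bytes, bytes]]:
    """Fragment, optionally compress, and MAC the data (no encryption here)."""
    records = []
    for seq, start in enumerate(range(0, len(data), MAX_FRAGMENT)):
        fragment = data[start:start + MAX_FRAGMENT]
        body = zlib.compress(fragment) if compress else fragment
        if len(body) > len(fragment) + 1024:
            raise ValueError("compression expanded the record too much")
        records.append((body, _mac(mac_key, seq, body)))
    return records

def unprotect(records: List[Tuple[bytes, bytes]], mac_key: bytes,
              compress: bool = False) -> bytes:
    out = bytearray()
    for seq, (body, tag) in enumerate(records):
        if not hmac.compare_digest(tag, _mac(mac_key, seq, body)):
            raise ValueError("bad_record_mac")
        fragment = zlib.decompress(body) if compress else body
        if len(fragment) > MAX_FRAGMENT:
            raise ValueError("decompression_failure")
        out += fragment
    return bytes(out)
```

Because the sequence number is part of the MAC input, a record that is dropped, duplicated, or moved to a different position fails verification even when its contents are unchanged.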

Transitions in ciphering strategies are implemented by the Change Cipher Spec Protocol. This protocol allows the client and server to use the current cipher specification (not the new pending cipher specification) to agree on the new cipher specification. The client and the server each send a message to signal the change of cipher specification, and subsequent messages are sent using the new specification. At this point, the client will initiate the handshake key exchange and, if necessary, the certificate verification messages.

Error Alert Message      Description                                                    Severity
unexpected_message       Inappropriate message was received.                            FATAL
bad_record_mac           Record received with an incorrect MAC.                         FATAL
decompression_failure    Decompression of improper input.                               FATAL
handshake_failure        Unable to negotiate an acceptable set of security parameters.  FATAL
no_certificate           No appropriate certificate is available.                       NON-FATAL
bad_certificate          Corrupt or bad verification.                                   NON-FATAL
unsupported_certificate  Unsupported certificate.                                       NON-FATAL
certificate_revoked      Revoked by signer.                                             NON-FATAL
certificate_expired      Expired or invalid certificate.                                NON-FATAL
certificate_unknown      Something wrong with certificate other than above reasons.     NON-FATAL
illegal_parameter        Field in the handshake is inconsistent or out of range.        FATAL

Table 2: SSL Error Alerts

For unexpected messages, a protocol exists to alert the other party of the possibility of foul play. An alert message is sent to the other party with a level of severity and a description of the alert. Whenever an alert with a severity level of "fatal" is sent, the connection is immediately terminated. One case of a fatal alert is when an unexpected change cipher spec message is received. Another alert is for closure, which helps to avoid truncation attacks. The client and server must share knowledge that the connection is ending, and this alert notifies the recipient that the sender will not send any more messages on the current connection. Error handling in the SSL Handshake Protocol uses error alert messages, which the party detecting an error sends to the other party. On receiving a fatal alert message, the client and server are required to forget any session identifier, keys, and secrets associated with this connection. The error alerts that are currently defined are summarized in Table 2.
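
The handling rule for alerts can be sketched with the severities of Table 2. The `Connection` class and its attributes are hypothetical names introduced for this illustration and do not correspond to any particular SSL implementation.

```python
FATAL, NON_FATAL = "FATAL", "NON-FATAL"

# Severities as listed in Table 2.
ALERTS = {
    "unexpected_message": FATAL,
    "bad_record_mac": FATAL,
    "decompression_failure": FATAL,
    "handshake_failure": FATAL,
    "no_certificate": NON_FATAL,
    "bad_certificate": NON_FATAL,
    "unsupported_certificate": NON_FATAL,
    "certificate_revoked": NON_FATAL,
    "certificate_expired": NON_FATAL,
    "certificate_unknown": NON_FATAL,
    "illegal_parameter": FATAL,
}

class Connection:
    """Hypothetical per-connection state kept by a client or server."""

    def __init__(self, session_id: bytes, keys: bytes) -> None:
        self.session_id = session_id
        self.keys = keys
        self.is_open = True

    def handle_alert(self, name: str) -> str:
        level = ALERTS[name]
        if level == FATAL:
            # Forget the session identifier and secrets, then terminate.
            self.session_id = None
            self.keys = None
            self.is_open = False
        return level
```

Discarding the session identifier on a fatal alert means the failed connection cannot later be resumed with the same secrets.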

7.2.2 SSL Handshake Protocol

The SSL Handshake Protocol produces the cryptographic parameters of the session state. The protocol is layered on top of the SSL Record Layer. The initial handshake consists of agreeing on a protocol version (currently version 3), choosing a cryptographic algorithm, optionally authenticating each other, and generating shared secrets using some form of public-key encryption. The initial message sent by the client is the "client hello", which the server acknowledges with the "server hello" message. If this server message is not sent, the connection is closed. In these hello messages, the following attributes are exchanged: protocol version, session ID, cipher suite, use of compression, and some random numbers. If authentication is required, the server sends its certificate to the client and, depending on the cipher suite being used, the server will request a certificate from the client. The end of the hello phase of the handshake is indicated by the "server hello done" message. The client will then initiate a key exchange with the server using the public-key algorithm chosen during the hello handshake. The client and server can now exchange keys, algorithms and secrets. Once these have been exchanged, the client and server may begin to exchange application layer data via the SSL Record Layer.

8 The Secure Shell (SSH)

8.1 Introduction

The telnet and rlogin protocols allow a user to log into another computer over a net-

work (or the Internet), but this is insecure because the current version of the Intemet

Protocol does not implement authentication and encryption security mechanisms. The user's password is left in clear text when passed over the network through these protocols, which are susceptible to several security problems (Pspoofing, DNS spoofing, routing attacks) .

The Secure Shell (SSH)was designed and implemented by T. Monen and is currently an experimental protocol which allows a user to log into another computer over a network. This protocol provides strong authentication and secure communications over channels that are insecure. SSH closes several of these security holes and provides for two new authentication methods based on the use of .rhosts combined with RSA based authentication and pure RSA authentication. Spoofed packets are irrelevant because a.U communications are automatidy and transparently encrypted.

Another feature of the protocol is to allow for redirecting other connections (arbitrary TCP/IP ports) through this secure channel. For example, X11 display connections can be passed through the secure channel, therefore making the X11 session secure as well. On every connection the server machine is RSA-authenticated by the client to prevent trojan horses (routing or DNS spoofing) and man-in-the-middle attacks. In the same way the server RSA-authenticates the client before accepting .rhosts or /etc/hosts.equiv authentication (this prevents DNS spoofing, routing, or IP spoofing).

8.2 The Secure Shell Protocol

A server side program runs on a server machine and listens on a well-defined port, waiting for connections to be made by the client program from the client machine. These machines (client and server) are connected by an insecure IPv4 network (one that can be monitored, spoofed, and/or tampered with). The client program always initiates the connection, which the server accepts and responds to by sending back its version identification string in human-readable form. The client parses the identification and sends its own version identification. This is used to declare the protocol version and software version (for debugging purposes), and to validate that the correct port is being used. If either side of the connection fails to understand this information, the connection is closed.

After the protocol identification phase has passed, the client and server switch to a binary packet protocol. The server sends the client its host key, server key, and other relevant information. The client generates a 256 bit session key, which is encrypted with both RSA keys (see subsection ), and sends this encrypted key, the selected cipher type to be used in the communication, and other relevant information. Using the key and cipher type, the client and server switch to encrypted communications. The server acknowledges by sending a confirmation encrypted with the information received from the client. The client then authenticates itself using a valid authentication method (.rhosts, /etc/hosts.equiv, RSA-host based authentication, RSA authentication, or password authentication).

After successful authentication, the client makes requests to prepare the session. The requests include allocating a pseudo tty, starting port forwarding (X11 or TCP/IP), executing a shell or a command, and authentication agent forwarding. The interactive session mode is started when a command or a shell is executed. Data in the interactive session mode is passed bidirectionally and new forwarded connections may be opened.

8.2.1 TCP/IP Port Number and Some Options

TCP port number 22 has been reserved for the server, where it listens for connections. There is no limitation on the port used by the client to connect to the server, but when using any form of authentication involving the .rhosts or /etc/hosts.equiv file, the client needs to make the connection to the server from a privileged port (in the case of UNIX, a port number less than 1024).

There are recommendations in the RFC on how to label the 'IP Type of Service' field in the IP protocol. For interactive sessions it is recommended that IPTOS_LOWDELAY is used; otherwise, using IPTOS_THROUGHPUT is sufficient.

Keepalives (messages sent between the communicating hosts to keep the connection from being closed) are to be used to ensure that the server will notice when the client machine's connection is cut off, for example when the computer reboots.

8.2.2 Protocol Version Identification

When the socket is opened, the server program sends an identification string in the form of "SSH-major.minor-version". The major and minor numbers are integers specifying the protocol version to be used and not the software distribution version. The version number is the software distribution version of the server, which has a maximum of forty characters. The version number is not currently used by the client, but may be used for debugging purposes.

The client parses this string to see if the protocol version is supported by the client software. If there are incompatibilities, the connection is closed by the client and no connection can be made. Otherwise the client sends its own identification string in the same form. The server parses this string and closes the connection if the versions are not compatible. However, if they are compatible, the server will send the next (first) packet using the binary packet protocol.
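
The version check described above can be sketched as follows. This is a minimal illustration, assuming the identification string has the layout SSH-major.minor-version with a software version of at most forty characters; the helper names and the compatibility rule (equal major numbers) are hypothetical and are not taken from the protocol specification.

```python
import re

# Assumed layout: "SSH-" major "." minor "-" software version (max 40 chars).
ID_PATTERN = re.compile(r"^SSH-(\d+)\.(\d+)-([^\n]{1,40})\n?$")


def parse_identification(line: str):
    """Split an identification string into (major, minor, software_version)."""
    match = ID_PATTERN.match(line)
    if match is None:
        # Either side closes the connection if it cannot understand the string.
        raise ValueError("not an SSH identification string")
    major, minor, version = match.groups()
    return int(major), int(minor), version


def is_compatible(local, remote) -> bool:
    """Hypothetical policy: the protocol major numbers must be equal."""
    return local[0] == remote[0]
```

A client would call parse_identification on the server's string, close the connection on a ValueError or an incompatible version, and otherwise reply with its own string.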

8.2.3 The Binary Packet Protocol

The binary packet protocol is initiated after the initial handshake has passed. During subsequent phases, this protocol is used to communicate between the client and server.

The layout of each packet is shown in Figure 20.

| Packet Length | Padding | Packet Type | Data | Check Bytes |

Figure 20: SSH Binary Packet Format

The Packet Length field gives the length of the packet, not including this field and the padding. It is a 32 bit unsigned integer stored most significant byte first. This part of the packet is not encrypted. The maximum length of a packet is 262144 bytes.

Following the packet length, the Padding field holds random data. If there is no encryption, this field is left as zeros. It is used to make known plaintext attacks more difficult. The number of padding bytes is chosen from the packet length by the formula (8 - (Packet_Length mod 8)), so at least 1 byte and at most 8 bytes will be used for this field.
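
As a small worked example of the padding formula, the sketch below computes the number of padding bytes for a given packet length and shows that the padded total is always a multiple of 8. The function names are illustrative only.

```python
MAX_PACKET_LENGTH = 262144  # maximum packet length in bytes, as stated above


def padding_length(packet_length: int) -> int:
    """Number of padding bytes: 8 - (packet_length mod 8), i.e. 1 to 8."""
    if not 0 <= packet_length <= MAX_PACKET_LENGTH:
        raise ValueError("packet length out of range")
    return 8 - (packet_length % 8)


def padded_size(packet_length: int) -> int:
    """Padding plus the bytes counted by the length field."""
    return padding_length(packet_length) + packet_length
```

Note that a packet whose length is already a multiple of 8 still receives a full 8 bytes of padding, which is why the field is never empty.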

The Packet Type field is used to distinguish the packets. The current implementation of SSH uses 35 different packet types, which are summarized in Tables 3, 4 and 5.

Packet Value                         Notes
SSH_MSG_NONE                         Reserved. Message is never sent
SSH_MSG_DISCONNECT                   Immediate disconnection; the string is passed for display
SSH_SMSG_PUBLIC_KEY                  First message sent by the server, which contains an anti-spoofing cookie, the server's host key, server key, protocol flags, supported ciphers mask, supported authentications mask
SSH_CMSG_SESSION_KEY                 Client selects cipher to use and sends the encrypted session key and the anti-spoofing cookie
SSH_CMSG_USER                        For authentication purposes, login name to use on the server
SSH_CMSG_AUTH_RHOSTS                 Client requests rhosts authentication
SSH_CMSG_AUTH_RSA                    Client requests pure RSA authentication
SSH_SMSG_AUTH_RSA_CHALLENGE          RSA challenge for the client
SSH_CMSG_AUTH_RSA_RESPONSE           Client's response to the RSA challenge
SSH_CMSG_AUTH_PASSWORD               Requests password authentication; plain text password is sent in this message
SSH_CMSG_REQUEST_PTY                 Request for a pty to be allocated
SSH_CMSG_WINDOW_SIZE                 Change in the size of the client's window

Table 3: SSH Packet Types

Packet Value                         Notes
SSH_CMSG_EXEC_SHELL                  Start a shell and enter interactive session mode
SSH_CMSG_EXEC_CMD                    Start executing the given command and enter interactive session mode
SSH_SMSG_SUCCESS                     Success for session key, authentication request, or completed preparatory operation
SSH_SMSG_FAILURE                     Failure for an authentication method or preparatory request
SSH_CMSG_STDIN_DATA                  Deliver data from the client to the server during interactive session mode
SSH_SMSG_STDOUT_DATA                 Deliver data from the server (from stdout of the shell)
SSH_SMSG_STDERR_DATA                 Deliver data from the server (from stderr of the shell)
SSH_CMSG_EOF                         EOF reached on input
SSH_SMSG_EXITSTATUS                  Exit status of the shell
SSH_MSG_CHANNEL_OPEN_CONFIRMATION    Successful channel opening
SSH_MSG_CHANNEL_OPEN_FAILURE         Failed channel opening

Table 4: SSH Packet Types (cont)

Packet Value                         Notes
SSH_MSG_CHANNEL_DATA                 Bidirectional exchange of data for a channel
SSH_MSG_CHANNEL_CLOSE                Close a channel
SSH_MSG_CHANNEL_CLOSE_CONFIRMATION   Confirm closing of channel
SSH_CMSG_X11_REQUEST_FORWARDING      Client requests X11 forwarding
SSH_SMSG_X11_OPEN                    Open X11 channel
SSH_CMSG_PORT_FORWARD_REQUEST        Request for a port to be forwarded through the protocol
SSH_MSG_PORT_OPEN                    Open a forwarded port
SSH_CMSG_AGENT_REQUEST_FORWARDING    Request a connection for an authentication agent
SSH_SMSG_AGENT_OPEN                  [...] authentication agent connection
SSH_MSG_IGNORE                       Reduce chance of [...] currently used
SSH_CMSG_EXIT_CONFIRMATION           Confirm exit status message
SSH_CMSG_X11_FWD_WITH_AUTH_SPOOFING  Create a fake X11 display for X11 authentication
SSH_CMSG_AUTH_RHOSTS_RSA             Request rhosts and RSA authentication
SSH_MSG_DEBUG                        Debugging messages

Table 5: SSH Packet Types (cont)

8.2.4 Key Exchange and Server Host Authentication

The first message sent by the server to the client using the binary packet protocol is SSH_SMSG_PUBLIC_KEY. At this point the server's host key, server public key, a list of the supported ciphers, and a list of valid authentication methods are sent to the client. Two extra pieces of information are sent: the first is the protocol flags and the second, which tries to make IP spoofing more difficult, is a randomly generated 64 bit cookie. This cookie has to be sent back to the server by the client when it replies to this first message. During this phase the entire packet is sent in the clear; that is, there is no encryption.

A session id must be computed by the server and the client in the following manner. The modulus of the server public key is interpreted as a byte string (with the most significant byte first). The modulus of the server's host key, interpreted the same way, is prepended to this string. Both the server and the client then compute the MD5 digest of the result, and this 128 bit value is the session id.
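
The session id computation can be sketched as below. This follows only the description given in this section (host key modulus followed by server key modulus, hashed with MD5); the function names are hypothetical, and any further inputs an actual implementation may mix into the hash are not modeled here.

```python
import hashlib


def modulus_bytes(n: int) -> bytes:
    """Encode an RSA modulus as a byte string, most significant byte first."""
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


def session_id(host_key_modulus: int, server_key_modulus: int) -> bytes:
    """MD5 over the host key modulus followed by the server key modulus."""
    data = modulus_bytes(host_key_modulus) + modulus_bytes(server_key_modulus)
    return hashlib.md5(data).digest()  # 16 bytes = 128 bits
```

Because both sides hold the same two public keys after the first server message, both arrive at the same 128 bit value without exchanging it.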

The client generates a 256 bit session key and, before it is encrypted, XORs its first 16 bytes with the session id. The client responds to the SSH_SMSG_PUBLIC_KEY packet with a packet of type SSH_CMSG_SESSION_KEY, which contains the requested cipher type, the 64 bit cookie that was generated by the server, protocol flags, and the session key. Only the session key is encrypted, using the server's host key and server key; again, the packet itself is not encrypted and is sent in clear text. The result is a string which is encrypted with two different keys differing in length by at least 128 bits.

Cipher Type       Notes
SSH_CIPHER_NONE   No encryption, clear text
SSH_CIPHER_IDEA   IDEA algorithm in CFB mode
SSH_CIPHER_DES    DES algorithm in CBC mode
SSH_CIPHER_3DES   Triple-DES algorithm in CBC mode
SSH_CIPHER_TSS    Experimental stream cipher algorithm
SSH_CIPHER_RC4    RC4 algorithm

Table 6: SSH Encryption Types

8.2.5 Packet Encryption

During the session initialization, the server sends a bitmask of all supported encryption types (Table 6). The client can choose any of these methods and then generates a session key (which is 256 bits) which it sends to the server. The protocol as defined in the RFC must support DES and triple-DES encryption. Most of the other encryption types specified in Table 6 are recommended and the experimental stream cipher is optional.

The cipher stream has a length that is a multiple of 8 bytes, and the data being passed in the same direction (client to server or server to client) will be encrypted as if it were a continuous buffer. This means that all initialization vectors are passed from the current packet to the next packet. This also means that data in the different directions are encrypted independently from each other.

When using the DES cipher, the key is taken from the first 8 bytes of the session key, excluding the least significant bit of each byte. This gives a 56 bit key, and the initialization vector is taken as all zeros. The mode used for DES and Triple-DES is CBC mode. The Triple-DES cipher is actually a variant where there are three independent ciphers using DES in CBC mode. Each stage of the cipher uses separate and independent initialization vectors which are initially set to zeros. The first and last stages are used to encrypt the data and the second stage decrypts; that is, the data stream is encrypted through the first DES cipher, then this result is decrypted using the second DES cipher, and finally this new result is encrypted with the last DES cipher. Each key is chosen from 8 consecutive bytes of the session key.

The IDEA cipher uses a 16 byte key taken from the first 16 bytes of the session key and is used in CFB mode. The initialization vectors are set to all zeros. The TSS cipher uses all 32 bytes of the session key for the key. This algorithm is only documented in a sample implementation in source code format. Although the cipher is quite fast, there is no guarantee that the cipher is secure. For the RC4 cipher, the session key is split into two blocks of 16 bytes. The first block is used by the server as the key when communicating to the client and the client uses the second block for its key in the client to server direction. This implementation is based on a Usenet post in 1995 claiming that it may be the original RSADSI RC4 cipher, which is a very fast algorithm.
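
To make the key selection concrete, the sketch below slices key material for each cipher out of a 32 byte value. It treats that value as the 256 bit session key (an assumption, since that is the only 32 byte secret described), and it assumes the three Triple-DES keys are the first three consecutive 8 byte blocks. The function names are hypothetical and no cipher is actually run.

```python
def des_effective_key(session_key: bytes) -> int:
    """Return the 56 bit DES key as an integer: the upper 7 bits of each of
    the first 8 bytes, dropping each byte's least significant bit."""
    key = 0
    for byte in session_key[:8]:
        key = (key << 7) | (byte >> 1)
    return key


def split_session_key(session_key: bytes) -> dict:
    """Slice a 32 byte session key into per-cipher key material."""
    if len(session_key) != 32:
        raise ValueError("session key must be 256 bits (32 bytes)")
    return {
        "des": session_key[:8],
        # Assumption: three consecutive 8 byte keys starting at byte 0.
        "3des": (session_key[0:8], session_key[8:16], session_key[16:24]),
        "idea": session_key[:16],
        "tss": session_key,
        # RC4 uses a different half of the key in each direction.
        "rc4_server_to_client": session_key[:16],
        "rc4_client_to_server": session_key[16:],
    }
```

Using separate halves for the two RC4 directions avoids encrypting both data streams with the same keystream.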

Authentication Method   Notes
SSH_AUTH_RHOSTS         Uses .rhosts, .shosts, /etc/hosts.equiv or /etc/shosts.equiv
SSH_AUTH_RSA            Pure RSA authentication
SSH_AUTH_PASSWORD       Password authentication
SSH_AUTH_RHOSTS_RSA     Rhosts with RSA authentication

Table 7: SSH Authentication Methods

8.2.6 Authentication

There are 4 methods of authentication (Table 7) which are accepted by the SSH server. The client may only try one at a time, but may try them all in succession as long as the server doesn't close the connection due to exceeding the timeout. When the client requests an authentication method, the server answers with SSH_SMSG_SUCCESS or SSH_SMSG_FAILURE based on whether the authentication is accepted or not (a failure can also be because the requested method is not recognized). This phase of the protocol is terminated by the server if an authentication method succeeds or the timeout (recommended to be set at 5 minutes) is reached.

SSH_AUTH_RHOSTS is the same authentication method used by the rlogin and rsh protocols. This method (on UNIX systems) consists of checking /etc/hosts.equiv and then .rhosts in the user's home directory. The authentication method can only succeed if the client connects from a privileged port and there is a line in one of these files matching the hostname and username of the client initiating the connection. Some extra checks are done by the server to make sure that the client's claims are real, by checking certain IP options (source routing is turned off) and by checking the client host name by reverse-mapping it and then forward-mapping the result to ensure that the IP address is valid. Reverse-mapping is the process of taking an IP address and asking a domain name server for the hostname. Forward-mapping is the process of taking the hostname and asking a domain name server for the IP address. This step is used to check that the findings are consistent, i.e. host.com maps to 192.168.1.1 and 192.168.1.1 maps to host.com. In this method a lot of trust is placed on the remote host's administration (root can pretend to be any user) and the name service (when reverse-mapping and forward-mapping). Trust is also placed in the network because some IP-spoofing can still be achieved. However, blind IP-spoofing is prevented in this protocol whereas it is possible under the rlogin protocol.
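
The reverse-mapping and forward-mapping consistency check can be sketched as follows. This is an illustrative helper, not code from any SSH implementation: the function name is hypothetical, and the lookup functions are passed in as parameters so the logic can be exercised without a live name server. By default it falls back to the standard library resolver calls.

```python
import socket


def hostname_is_consistent(ip_address, reverse_lookup=None, forward_lookup=None):
    """Reverse-map an IP address to a hostname, forward-map that hostname,
    and accept only if the original address is among the results."""
    if reverse_lookup is None:
        def reverse_lookup(ip):
            return socket.gethostbyaddr(ip)[0]
    if forward_lookup is None:
        def forward_lookup(name):
            return socket.gethostbyname_ex(name)[2]
    try:
        hostname = reverse_lookup(ip_address)
        addresses = forward_lookup(hostname)
    except OSError:
        # A failed lookup is treated as an inconsistent mapping.
        return False
    return ip_address in addresses
```

An attacker who controls only the reverse zone for an address can make it claim to be host.com, but the forward lookup of host.com will not return the attacker's address, so the check fails.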

SSH_AUTH_RHOSTS_RSA starts with the same authentication method as SSH_AUTH_RHOSTS, with an extra step for ensuring that the client machine is the real client machine: the server requires the client to be authenticated by RSA. The initial challenge is to be authenticated by the /etc/hosts.equiv or .rhosts files. The server then checks to see if it knows the client's host key and, if it does, it encrypts a random message of 256 bits using RSA with the client's public key. The encrypted message has the form of 32 8-bit bytes where the first byte is left as zero, the next byte contains the value 2, the last byte contains a value of zero, and the remaining bytes contain the random values. The client decrypts the message using its private host key, concatenates the session id, and computes an MD5 checksum. The new length is 48 bytes and is sent with the 16 byte checksum to the server. The checksum is required to deter a chosen plaintext attack against the RSA algorithm and the session id is required to bind it to a specific session. The server checks the checksum and the decrypted message against the original message and if they match the authentication is accepted. This method still trusts the remote host's administration, but the other trust mechanisms are reduced because the name service can only be used with certain hosts where the private key is already known. This is because the second part of the trust comes from the private client host key.
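
The checksum step of the RSA challenge can be sketched as below, assuming a 32 byte decrypted challenge and a 16 byte session id, which together give the 48 bytes mentioned above. The RSA encryption and decryption themselves are left out, the function names are hypothetical, and the sketch models only the checksum comparison, not exactly which bytes travel on the wire.

```python
import hashlib
import hmac


def challenge_response(decrypted_challenge: bytes, session_id: bytes) -> bytes:
    """MD5 over the 32 byte challenge followed by the 16 byte session id."""
    if len(decrypted_challenge) != 32 or len(session_id) != 16:
        raise ValueError("expected a 32 byte challenge and a 16 byte session id")
    return hashlib.md5(decrypted_challenge + session_id).digest()


def server_accepts(original_challenge: bytes, session_id: bytes, response: bytes) -> bool:
    """Server side check: recompute the checksum and compare."""
    expected = challenge_response(original_challenge, session_id)
    return hmac.compare_digest(expected, response)
```

Including the session id means a response captured in one session does not verify in another, even if the same challenge were reused.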

The SSH_AUTH_RSA method implements only the second part of the SSH_AUTH_RHOSTS_RSA authentication method; that is, it does not implement the rhosts authentication method. This authentication method does not trust the remote host, the network, or the name service. The trust is placed solely on private identification keys, so anyone in possession of these keys may log on.

The last method (SSH_AUTH_PASSWORD) authenticates the user using a password, as in a normal login. The client sends the password, which is typed in by the user, in a message in plain text. As long as the session is encrypted (which is normally the case) the password will be encrypted too. The difference between this authentication method and rlogin/rsh password authentication is that the client program reads the password and not the server's login program. This method puts complete trust in the password; therefore anyone in possession of the password may log on.

8.2.7 Data Exchange

After the authentication phase the server waits for the client to send a request. The server tries to honour all requests and responds with SSH_SMSG_SUCCESS or SSH_SMSG_FAILURE based on the action taken. Failures may be due to unknown requests or the inability to honour a request. If the request is SSH_CMSG_EXEC_SHELL, an interactive shell is requested and the interactive session is invoked.

During the interactive session, exchanges between the server and client are performed asynchronously. Acknowledgments are not needed because TCP/IP will ensure reliable transport and the SSH binary packet protocol protects against tampering or IP spoofing. This phase of the protocol ends either when the client sends an SSH_CMSG_EOF message and the server responds with an SSH_SMSG_EXITSTATUS message, which is the normal end of the protocol, or when either side sends an SSH_MSG_DISCONNECT.

9 Conclusion

In this thesis various security problems on the Internet have been described which can affect the communications of several million users. The solution to IP spoofing and packet sniffing can be provided by various protocols such as SSL and SSH, and by the introduction of IPv6. Until IPv6 is fully implemented on the Internet, protocols such as SSL and SSH will need to be used to provide secure communications. A marked increase in security will begin when IPv6 is deployed on the Internet.

The main advantage that IPv6 has over higher level protocols (such as SSL and SSH) is that the location of IPv6 (the internet layer) allows it to provide its services to all packets above this layer. SSL and SSH are located in the transport and application layers and therefore can only provide confidentiality and authentication to protocols using defined port numbers.

All of these protocols have been shown to provide cryptographic security, to be extensible, and to be relatively efficient. Interoperability of SSL has been shown already, but IPv6 and SSH are still in the research, development and testing stages. There is a lot of confidence that they will provide this interoperability once a few more implementations are put forward.

Work is still continuing on all these protocols to make them more efficient and more interoperable. The SSH protocol is being redefined to move to a lower layer, which will allow for more packets to be encrypted and authenticated.

Appendices

A The Internet Protocol

The most fundamental internet service consists of a packet delivery system which is defined to be unreliable, best-effort, and connectionless. It is unreliable because delivery is not guaranteed: the packet may be duplicated, lost, delayed, or delivered out of order, and this is not detected and fixed by this layer of the service. The protocol uses a best-effort delivery system because this internet service tries its best to deliver the packets. Failures arise only from exhausted resources or failures in the underlying network and not because the protocol discards packets capriciously. This service is a connectionless service because each packet is treated independently from the other packets. Each packet may travel over different paths and different routers, and packets may even be lost in the attempt to deliver them.

The Internet Protocol (IP) defines this connectionless, unreliable, best-effort delivery service. It is one of the major protocols in internetworking. Figure 21 shows the major protocols and how they fit together conceptually. On the lowest layer, we have the physical hardware, followed by the network interface which the operating system kernel uses to communicate with the underlying hardware. The following layer is where IP lives, providing the connectionless packet delivery service. The TCP protocols provide the reliable transport services and finally the application services are the programs and applications (such as rlogin, telnet, etc.) which use the lower level services.

Application

Transport

Internet

Network Interface

Network Hardware

Figure 21: Conceptual Organization of TCP/IP Layers

The Internet Protocol defines the basic unit of data that passes across the TCP/IP Internet and specifies the format of all data which will pass through it. The protocol chooses the best route for any packet to be delivered; this is called the routing function. It defines rules on how and when errors should be generated, when packets should be discarded, and how hosts and gateways should process these packets. Each unit that is transferred through the network is called an Internet datagram (or IP datagram or just datagram). The structure of the datagram (shown in Figure 22) consists of a header containing the source and destination addresses and a type field identifying the contents of the datagram.

| Header | Datagram Data |

Figure 22: IP Datagram

A.1 IP Versions 4 and 6

The current version of the IP protocol deployed on the Internet is 4 and the new proposed version will be 6. The following sections will describe the format of the datagram in both versions in detail. The major differences between IPv6 and IPv4 are:

Expanded Routing and Addressing Capabilities: The address size is increased from 32 bits to 128 bits;

Anycast Addresses: New type of address to identify sets of nodes, where a packet sent to the anycast address is delivered to one of the hosts in the group;

Header Format Simplification: Some IPv4 header fields have been dropped or made optional to reduce the common-case packet processing time and to keep the bandwidth cost as low as possible even though the new addresses are 4 times larger (the header size is only doubled);

Improved Support for Options: IP options are encoded differently for more efficient forwarding, less stringent limits on the length of the options and greater flexibility for introducing new options in the future;

Quality-of-Service Capabilities: Ability to label packets belonging to a particular traffic "flow", useful for "real-time" services (video, audio); and

Authentication and Privacy Capabilities: Support for authentication, data integrity, and confidentiality, which will be included in all implementations (this is optional in IPv4 and it is not currently implemented).

A.2 IPv4 Datagram Format

The general structure of an IP datagram can be split up into separate fields as shown in Figure 23. The format is not constrained by the underlying hardware because it is processed in the internet layer.

Word 1: VERSION (bits 0-3) | HEADER LENGTH (bits 4-7) | SERVICE TYPE (bits 8-15) | TOTAL LENGTH (bits 16-31)
Word 2: IDENTIFICATION (bits 0-15) | FLAGS (bits 16-18) | FRAGMENT OFFSET (bits 19-31)
Word 3: TIME TO LIVE (bits 0-7) | PROTOCOL (bits 8-15) | HEADER CHECKSUM (bits 16-31)
Word 4: SOURCE IP ADDRESS (bits 0-31)
Word 5: DESTINATION IP ADDRESS (bits 0-31)
IP OPTIONS (if any) | PADDING
DATA ...

Figure 23: IPv4 Datagram Format

The VERSION field (4 bits) is used to verify that the sender, the gateways that the datagram may cross, and the receiver agree on the format of the datagram. The datagram is rejected if the version information does not match what the software understands, to prevent the software from misinterpreting the datagram contents. The HEADER LENGTH field is a 4 bit field containing the header length in 32 bit words. The most common header has a header length of 5 (20 octets), when the IP options and padding fields are not present. The 8 bit SERVICE TYPE field specifies how the datagram should be handled and can be broken down into subfields as shown in Figure 24.

PRECEDENCE (bits 0-2) | D (bit 3) | T (bit 4) | R (bit 5) | UNUSED (bits 6-7)

Figure 24: Subfields of the Service Type in an IPv4 datagram

The PRECEDENCE bits allow the sender to tag each datagram according to its importance. The value ranges from 0 (normal) to 7 (network control). Most hosts and gateways ignore this subfield, but it could be used to implement congestion control algorithms. The D, T, and R bits specify the suggested type of transport: low delay, high throughput, and high reliability, respectively. Since an internet cannot guarantee this level of service, this field is only used as a hint to the routing protocols so that a gateway can choose a route, if it has more than one, based on these bits.

Most interactive traffic (rlogin, telnet) will have the D bit set because it should be delivered as quickly as possible, whereas non-interactive traffic (FTP transfers) requires high throughput and has the T bit set.
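
As an illustration of the SERVICE TYPE subfields, the sketch below decodes the octet into its precedence value and the D, T and R bits. It assumes the standard IPv4 layout in which bit 0 is the most significant bit of the octet, so PRECEDENCE occupies the top three bits; the function name is illustrative.

```python
def decode_service_type(octet: int) -> dict:
    """Split the 8 bit SERVICE TYPE field into its subfields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("service type must fit in one octet")
    return {
        "precedence": (octet >> 5) & 0x7,         # bits 0-2
        "low_delay": bool(octet & 0x10),          # D bit (bit 3)
        "high_throughput": bool(octet & 0x08),    # T bit (bit 4)
        "high_reliability": bool(octet & 0x04),   # R bit (bit 5)
    }
```

For example, an interactive session that sets only the D bit sends the value 0x10, while a bulk transfer that sets only the T bit sends 0x08.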

The TOTAL LENGTH field gives the length, in octets, of the entire datagram, including the header and data. Therefore the size of the data is computed by subtracting the header length in octets (HEADER LENGTH times 4) from TOTAL LENGTH. The maximum possible size of an IP datagram is 2^16 - 1 (65535) octets because it is a 16 bit field. Currently, this is not a limitation, but it may be in the future when higher speed networks are used that can handle datagrams larger than this.

Datagrams are encapsulated into a network frame as they travel to the destination host. The network frame (also known as the physical frame) consists of a header and a data area where the datagram is stored, and is used by the underlying network (e.g. ethernet). In the ideal case, the entire IP datagram fits into one physical frame, but to achieve this, a maximum datagram size must be chosen. Unfortunately, the datagram may travel over several different types of physical networks until it reaches its destination. The limitation given by the physical network is the maximum transfer unit (MTU), which varies between different media. The purpose of internet design was to hide the underlying network technologies and provide convenience for the user. Therefore under TCP/IP, an initial datagram size is chosen, together with a way to divide the datagram to fit into frames with a smaller MTU (fragmentation). As the datagram moves between networks, the fragments do not get reassembled until they reach the final destination, even if they pass through networks with large MTUs. Three fields in the datagram header, IDENTIFICATION, FLAGS, and FRAGMENT OFFSET, control this fragmentation and reassembly process.

The TIME TO LIVE (TTL) field specifies how long, in seconds, the datagram is valid in the internet system. As the datagram is processed by a gateway or host, the TTL is reduced as time passes. The datagram is removed when the time expires and an error message is sent back to the source. Estimating the elapsed time is difficult because gateways do not know the transit times, so each time a datagram passes a gateway the TTL field is decremented by 1, and if it remains at a gateway the TTL is reduced by the time spent waiting for service.

The PROTOCOL field specifies which high-level protocol was used to create the data in the data fields. That can be viewed as the field that controls what the format of the data area looks like. This field is managed by a central authority to ensure that the protocol number is unique across the Internet.

The HEADER CHECKSUM is used to ensure the integrity of the header values. Using network byte ordering, the checksum is computed by taking the 16-bit words of the header, adding them in one's complement arithmetic, and then taking the one's complement of the result. Because the HEADER CHECKSUM field itself is part of the header (and its value has not yet been computed), this field is assumed to be zero for the computation. This checksum only applies to the header and not to the data in the remaining fields.
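
The checksum computation can be sketched as below. The function name and the sample header bytes are illustrative; the sample is a 20 octet header with its checksum field set to zero.

```python
def header_checksum(header: bytes) -> int:
    """One's complement of the one's complement sum of the 16-bit words."""
    if len(header) % 2:
        raise ValueError("header length must be a whole number of 16-bit words")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]  # network byte order
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Sample header with the HEADER CHECKSUM field (octets 10-11) set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = header_checksum(sample)  # 0xB861 for this sample
```

A receiver can verify a header by running the same function over the header as received, with the checksum field filled in: the result is zero when the header is intact.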

The SOURCE IP ADDRESS and the DESTINATION IP ADDRESS contain the 32-bit IP addresses of the sender and the intended recipient. These fields should not be changed even though the datagram is passed between gateways. They indicate the original source of the datagram and the ultimate destination.
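
Putting the header fields together, the following sketch unpacks the fixed 20 octet part of an IPv4 header and derives the data size from TOTAL LENGTH and HEADER LENGTH. The function and dictionary key names are illustrative, the split of the 16 bits after IDENTIFICATION into 3 flag bits and a 13 bit offset follows the standard IPv4 layout, and options are not parsed.

```python
import ipaddress
import struct


def parse_ipv4_header(datagram: bytes) -> dict:
    """Unpack the fixed part of an IPv4 header (network byte order)."""
    if len(datagram) < 20:
        raise ValueError("an IPv4 header is at least 20 octets")
    (ver_hlen, service_type, total_length, identification, flags_fragment,
     ttl, protocol, checksum, source, destination) = struct.unpack(
        "!BBHHHBBH4s4s", datagram[:20])
    header_octets = (ver_hlen & 0x0F) * 4  # HEADER LENGTH is in 32 bit words
    return {
        "version": ver_hlen >> 4,
        "header_length": ver_hlen & 0x0F,
        "service_type": service_type,
        "total_length": total_length,
        "data_length": total_length - header_octets,
        "identification": identification,
        "flags": flags_fragment >> 13,
        "fragment_offset": flags_fragment & 0x1FFF,
        "time_to_live": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        "source": str(ipaddress.IPv4Address(source)),
        "destination": str(ipaddress.IPv4Address(destination)),
    }
```

Software would normally check the version field first and reject the datagram if it is not 4, as described for the VERSION field.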

A.2.1 IPv4 Options

The IP OPTIONS field is an optional field which is typically used for network testing and debugging. When options are present, they appear contiguously without any special separator. These options can be variable length or fixed length (typically 1 octet).

The octet, which is called the option code, can be divided into fields (as shown in Figure 25).

COPY (bit 0) | OPTION CLASS (bits 1-2) | OPTION NUMBER (bits 3-7)

Figure 25: Option Code Octet

The COPY flag controls whether gateways should copy the option to all fragments (set to 1) or only to the first fragment (set to 0). The OPTION CLASS specifies the general class of the option. Currently, only classes 0 and 2 are used, for datagram or network control and for debugging and measurement respectively. Classes 1 and 3 are reserved for future use. Depending on the OPTION CLASS, the OPTION NUMBER indicates the possible options (summarized in Table 8).

Option Class  Option Number  Length    Description
0             0              -         End of option list
0             1              -         No operation (for alignment)
0             2              11        Security and handling restrictions
0             3              variable  Loose source routing
0             7              variable  Record route
0             8              4         Stream identifier
0             9              variable  Strict source routing
2             4              variable  Internet timestamp

Table 8: IP Options by Class and Number

The record route option (class 0, number 7) and timestamp option (class 2, number 4) are used as a way to monitor and control how internet gateways route datagrams. The record route option allocates space for IP addresses and as the datagram passes a gateway, that gateway records its address in the available space. It is the responsibility of the source to allocate enough space for all the gateways; otherwise, once the space is all used, the gateways will not record their addresses. This is typically used for debugging purposes and therefore most programs will ignore this option. The timestamp option works in a similar fashion, allocating space for IP addresses and timestamps, which are recorded in milliseconds since midnight Greenwich Mean Time. In this option the source can ask to omit the IP addresses, or, if it is used with the source routing options, the IP addresses are already entered and the timestamp is recorded only if the gateway address matches the next IP address in the list.

Source routing allows the sender to dictate a path that the datagram should travel until it reaches its final destination. A list of IP addresses is recorded in the option, and the datagram must pass through all of them, whether strict or loose source routing is used. The difference is that in strict source routing the packet must follow the successive IP addresses and is not allowed to pass through other gateways in between, whereas loose source routing does not have this restriction.
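The difference between the two modes can be expressed as a check on the path a datagram actually took. The sketch below is only an illustration under simplifying assumptions: the function name is hypothetical, and both the route and the path are modelled as plain lists of address strings.

```python
from typing import Sequence


def follows_source_route(route: Sequence[str], path: Sequence[str],
                         strict: bool) -> bool:
    """Return True if `path` (the gateways actually visited, in order)
    is allowed by the source route `route`."""
    if strict:
        # Strict: exactly the listed addresses, with nothing in between.
        return list(path) == list(route)
    # Loose: the listed addresses must appear in order, but other
    # gateways may be visited between them (subsequence test).
    hops = iter(path)
    return all(address in hops for address in route)
```

With `route = ["A", "B"]`, the path `["A", "X", "B"]` satisfies loose source routing but not strict source routing.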

A.3 IPv6 Datagram Format

The basic header of IPv6 is shown in Figure 26.

Figure 26: IPv6 Datagram Format (bit positions 0 to 31; fields in order: Version, Priority, Flow Label; Payload Length, Next Header, Hop Limit; Source IP Address; Destination IP Address)

The 4 bit VERSION field is used in the same way as in IPv4, but this is always set to 6. The 4 bit PRIORITY field enables a source to identify the desired delivery priority of its datagrams relative to other datagrams being sent from the same source. Values of 0 through 7 are used when the source can provide congestion control (TCP traffic where the source can "back off" if the network is congested) and values 8 through 15 specify datagrams where the source cannot back off in response to congestion ("real-time" datagrams being sent at a constant rate, such as live video). The lowest priority level is 0 for congestion-controlled traffic and 8 for non-congestion-controlled traffic, but there is no relative ordering between the two ranges.

The 24 bit FLOW LABEL field is set by the source on datagrams which need special attention during processing by IPv6 routers. This is still an experimental field and subject to change as the requirements become clearer. The PAYLOAD LENGTH field is a 16 bit unsigned integer which gives the length of the remaining part of the packet following the header. The NEXT HEADER field identifies the type of header following this header in the same way the PROTOCOL field was used in IPv4. The HOP LIMIT field is almost exactly like the TTL field: the value is decremented by one by each host that processes the datagram. It was renamed to reflect how this field is actually used.
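As an illustration of the field layout described in this section, the following sketch unpacks the fixed IPv6 header. It assumes the RFC 1883 layout (4 bit version, 4 bit priority, 24 bit flow label, followed by 128 bit source and destination addresses, 40 octets in total); the class and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class IPv6Header(NamedTuple):
    version: int
    priority: int
    flow_label: int
    payload_length: int
    next_header: int
    hop_limit: int
    source: bytes       # 16 octets
    destination: bytes  # 16 octets

    @property
    def congestion_controlled(self) -> bool:
        # Priorities 0-7: the source can back off; 8-15: it cannot.
        return self.priority <= 7


def parse_ipv6_header(data: bytes) -> IPv6Header:
    """Parse the 40 octet fixed IPv6 header (RFC 1883 layout assumed)."""
    if len(data) < 40:
        raise ValueError("IPv6 header needs at least 40 octets")
    first_word, payload_length, next_header, hop_limit = struct.unpack(
        "!IHBB", data[:8])
    version = first_word >> 28
    if version != 6:
        raise ValueError("VERSION field is not 6")
    return IPv6Header(
        version=version,
        priority=(first_word >> 24) & 0xF,
        flow_label=first_word & 0xFFFFFF,
        payload_length=payload_length,
        next_header=next_header,
        hop_limit=hop_limit,
        source=data[8:24],
        destination=data[24:40],
    )
```

The `congestion_controlled` property only reports which of the two priority ranges a datagram falls in; since the ranges have no relative ordering, it does not attempt to compare priorities across them.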

A.3.1 IPv6 Extensions

In IPv4 the options, when present, were always examined by the routers, so in IPv6 the options were made into extension headers that are located between the IPv6 header and the transport layer header in a packet. Most of these extension headers are not examined or processed by the routers, but are only examined when the packet reaches its final destination. This reduces the amount of time packets need to be held at a router to process the options. The other major difference is that these extension headers do not have a fixed length, but can be of arbitrary length. Because of these two changes it is now more practical to include various options/extensions, most notably the Authentication and Security Encapsulation options. To retain alignment, all options are an integer multiple of 8 octets, which improves performance when processing subsequent option headers.
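The 8 octet alignment rule amounts to simple arithmetic, sketched below. The helper names are hypothetical, and padding with zero octets is a simplification chosen for illustration rather than a statement of how any particular extension header encodes its padding.

```python
def padding_needed(length: int, multiple: int = 8) -> int:
    """Number of octets to add so that `length` becomes a multiple of 8."""
    return -length % multiple


def pad_extension(body: bytes) -> bytes:
    """Pad an extension header body to the next 8 octet boundary."""
    return body + bytes(padding_needed(len(body)))
```

A 13 octet body, for instance, needs 3 octets of padding to reach 16 octets, while a body that is already 8 octets long needs none.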

The available options for IPv6 are:

Routing: Similar to IPv4 routing with support for powerful new routing functionality, such as provider selection to choose the route based on policy, performance, cost, etc., host mobility to allow routes to nomadic computers to be specified, and auto-readdressing to allow for routes to new addresses;

Fragmentation: For fragmentation and reassembly;

Authentication: To authenticate the datagram;

Encapsulation: Encryption to provide confidentiality;

Hop-by-Hop: Options that need to be processed at each hop; and

Destination: Optional information to be examined by the destination host.
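Because each extension header names the header that follows it, a receiver can walk the chain from the basic header to the transport layer header. The sketch below is a minimal illustration, assuming the NEXT HEADER values and length encoding of RFC 1883 (0 hop-by-hop, 43 routing, 44 fragment, 60 destination options; the second octet gives the length in 8 octet units beyond the first 8, except for the fixed 8 octet fragment header). The authentication and encapsulation headers are deliberately left out, and the function name is hypothetical.

```python
from typing import List, Tuple

# NEXT HEADER values assumed from RFC 1883.
EXTENSION_NAMES = {
    0: "Hop-by-Hop Options",
    43: "Routing",
    44: "Fragment",
    60: "Destination Options",
}


def walk_extension_headers(first_next_header: int,
                           data: bytes) -> Tuple[List[str], int, int]:
    """Walk the extension headers that follow the basic IPv6 header.

    Returns the header names seen, the NEXT HEADER value of the first
    header this sketch does not handle (e.g. 6 for TCP), and its offset.
    """
    names: List[str] = []
    next_header, offset = first_next_header, 0
    while next_header in EXTENSION_NAMES:
        if offset + 8 > len(data):
            raise ValueError("truncated extension header")
        names.append(EXTENSION_NAMES[next_header])
        following = data[offset]
        if next_header == 44:
            length = 8  # the fragment header has a fixed size
        else:
            length = (data[offset + 1] + 1) * 8
        if offset + length > len(data):
            raise ValueError("truncated extension header")
        next_header = following
        offset += length
    return names, next_header, offset
```

Stopping at the first unrecognised value keeps the sketch short; a fuller parser would also need the length rules of the authentication header before it could step past it.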

The "authentication header" is an extension header which provides authentication and integrity (without confidentiality) to datagrams. The default algorithm that will be used to ensure interoperability within the worldwide Internet will be keyed MD5. The use of this header will eliminate a large class of network attacks which will be discussed in subsequent sections. The "encapsulating security header" will provide integrity and confidentiality to datagrams. The default algorithm will be DES in CBC mode.
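To give a feel for how a keyed MD5 digest authenticates a datagram, here is a rough sketch. It follows one reading of RFC 1828 (the key is filled to a 512 bit boundary using MD5-style padding, followed by the datagram, followed by the key again) and omits the zeroing of header fields that change in transit, so it should be taken as an illustration rather than a conforming implementation. All function names are hypothetical.

```python
import hashlib
import hmac
import struct


def md5_fill(data: bytes) -> bytes:
    """Pad `data` to a 64 octet boundary in the style of MD5 padding:
    a 0x80 octet, zero octets, then the 64 bit little-endian bit length."""
    bit_length = len(data) * 8
    padded = data + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    return padded + struct.pack("<Q", bit_length)


def keyed_md5(key: bytes, datagram: bytes) -> bytes:
    """Digest over filled key, datagram, and the key again (assumed layout)."""
    return hashlib.md5(md5_fill(key) + datagram + key).digest()


def verify(key: bytes, datagram: bytes, digest: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    return hmac.compare_digest(keyed_md5(key, datagram), digest)
```

A receiver holding the same key recomputes the digest; any change to the datagram, or the use of a different key, makes `verify` return `False`.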

The placement of these headers at the internet layer will help provide host origin authentication to the upper layer protocols and services. Currently on the worldwide Internet, this kind of authentication is handled mostly in the application layer and is not readily available everywhere without operator intervention. A few methods will be discussed in subsequent sections.
