TCP/IP Traffic Classification Based on Port Numbers
Total Page:16
File Type:pdf, Size:1020Kb
TCP/IP Traffic Classification Based on Port Numbers Patrick Schneider Division of Applied Sciences, Harvard University, 29 Oxford Street, Cambridge, MA 02138, USA [email protected] Abstract 1.1.2. Registered Port Numbers A new approach called TCP/IP Traffic Classification In addition to the well-known ports below 1024 there are Based on Port Numbers is presented here. This approach is more port numbers assigned to applications but are located primarily based on the correlation between the transport anywhere from 1024 to 65535. While the IANA can not layer’s port number and the corresponding application or control uses of these ports it does register or list uses of group of applications. We further suggest that advanced these ports in RFC1700 [5] as a convenience to the commu- classification models are used in combination with the port nity. On most systems the registered ports can also be used number based traffic classification. Future routers can by ordinary user processes or programs executed by ordi- provide different quality of service (QoS) to flows depending nary users. on the port number or range of port numbers they are using. 1.1.3. Reserved Port Numbers A combination of multiple characteristics enables a router to select real-time flows and handle them with a higher A port number in the range of 1 to 1023 is called a priority. The hit-rate can be increased by using additional Reserved Port Number. In UNIX operating systems only a characteristics such as packet interarrival time distribution process with superuser privileges can assign itself a and packet size distribution in addition to the port numbers. reserved port. Some applications, notably Rlogin, Klogin, Measurements of real-world traffic show the percentage of Login and SSH, use the concept of reserved ports as part of Internet traffic. their authentication between the client and server (server on well-known port while client on reserved port). 1. Introduction 1.1.4. Ephemeral Port Numbers 1.1. Port Numbers Ephemeral port numbers are for short lived connections, as used usually by clients. This is because a client typically exists only as long as the user running the client needs its While IP addresses determine the physical endpoints of a service, while servers typically run as long as the host is up. network connection, port numbers [1] determine the logical In Unix ephemeral ports are in the range from 1024 to 5000. endpoints of the connection. Port numbers are 16-bit inte- The Solaris 2.2 operating system represents a notable gers with a useful range from 1 to 65535. exception. By default it places ephemeral ports in the range 1.1.1. Well-Known Port Numbers from 32768 to 65535 (the physical limit of the 16 bit port number range). In a client-server application, the server usually provides Most PC based TCP/IP cards provide their ephemeral its service on a well-known port number. Well-known port ports in the range of 1024 to 5000, since their software is numbers are a subset of the numbers which are assigned to based on the freely available and usable FreeBSD source applications. According to RFC1700 [5], well-known port code [6]. numbers are managed by the Internet Assigned Numbers Authority (IANA). They used to be in the range from 1 to 1.2. Classifying Traffic 255, but in 1992 the range was increased up to 1023. Each Unix system holds a list containing all the well- In order to provide optimal quality of service (QoS) three known port numbers known to the hosts. This list is stored basic types of traffic can be distinguished. Interactive data, in the file /etc/services. bulk data and data for real time applications. The former two classes are considered as elastic data. While real-time Page 1 of 6 applications do not wait for late data to arrive, elastic appli- It is hard to set an upper limit for the buffer that is still cations will always wait for data to arrive. It is not that these considered real time data. We will call traffic that produces applications are insensitive to delay; to the contrary, signifi- a continuous output (audio or video) but allows any amount cantly increasing the delay of a packet will often harm the of buffering bulk data rather than real-time data. Namely application’s performance. An example of classifying flows audio transmissions over HTTP are considered bulk data. is given in [4], definig six classes of traffic flow. 2. Selected TCP/IP Applications 1.2.1. Interactive Data Interactive data is of a bursty character with small 2.1. The File Transfer Protocol (FTP) packets, single or in small groups, being transmitted sporad- ically. Since each bit of interactive data is of high value in The commonly used application FTP is the Internet stan- order to achieve correct performance, the reliable TCP dard for file transfer. The FTP TCP protocol differs from connection is usually preferred over the simpler UDP most other applications because it uses two TCP connec- protocol. The delay added to interactive data by the trans- tions to transfer a file. mission must be as low as possible. A control connection is established on well-known TCP port 21 (FTP-control) for commands from the client to the 1.2.2. Bulk Data server and for the server’s replies. A dedicated data connec- If the flow is bulk data, large amount of data - in the tion is created each time bulk data is transferred between range of kilobytes to typically several megabytes - are trans- two points. Two different modes can be entered alterna- ferred. The delay requirements are rather lax for asynchro- tively. In the default port mode, the client does a passive nous bulk transfer such as electronic mail (unattended data open on an ephemeral port and communicates the port transfer) or news (‘filler’ traffic), and of intermediate impor- number via the control connection to the server. The server tance for interactive bulk transfer such as FTP data establishes the data connection by performing an active (attended data transfer). However, in order to achieve high open on well known TCP port 20 (FTP-data) to the client’s performance, a high throughput should be aspired to. It is ephemeral port. crucial for most bulk data applications to provide a reliable The PASV command toggles to the passive mode service. requesting the server to listen on an ephemeral port and to wait for a connection to be established by the client. The 1.2.3. Real-Time Data server tells the client its ephemeral port number. In the In a real-time application, the source takes some signal, passive mode, both port numbers are ephemeral. packetizes it, and then transmits the packets over the network. The network inevitably introduces some variation Client Server in the delay of the delivered packets. The receiver depack- Eph 21 1 PASV etizes the data and then attempts to faithfully play back the Client Server signal. This is done by buffering the incoming data and then 21 Eph1 Port Eph 21 replaying the signal. 2 21 2 Port Eph 21 The performance of a playback application is measured Eph1 along two dimensions: latency and fidelity. Some playback Eph 2 20 Eph3 Eph applications, in particular those that involve interaction Data Data 2 Eph between the two ends of a connection such as a phone call, 2 Eph2 are rather sensitive to the latency; other playback applica- Passive Off Passive On tions, such as transmitting a movie, are not. Similarly, appli- cations exhibit a wide range of sensitivity to loss of fidelity. Figure 1: Simplified model for FTP using default Intolerant applications require an absolutely faithful play- port mode (left) and passive mode (right). back, while tolerant applications can tolerate some loss of fidelity. It is expected that the vast bulk of audio and video 2.2. Remote Procedure Calls (RPC) applications will be tolerant. If the data buffer exceeds a certain size (in seconds of The Sun RPC protocol is based on the remote procedure material played back in real time) the real-time character of call model, which is similar to the local procedure call the data turns into bulk character. A large buffer enables a model. One could think of the RPC protocol as the jump-to- real-time application to wait for late packets (within the subroutine instruction (“JSR”) of a network. limits given by the buffer size) which is considered non RPC servers typically bind an ephemeral port and some a real-time behavior. reserved one. After the kernel has allocated a port, it is Page 2 of 6 registered at the port mapper in order to keep track of which but according to RFC 1945 [5] other ports can be used. RPC programs are using which ports. The port mapper Ports different from 80 are sometimes chosen in order to provides its services on the well-known port 111 (UDP and allow more than one HTTP server to run at the same time on TCP). Section 29.4 of [1] describes the port mapper in the same host. Port 81 is a ‘hot standby’ port since it is not detail. When a client wishes to make an RPC call to a given assigned to a different server. Multiple concurrent servers service, it will first contact the port mapper daemon on the on the same host are sometimes considered necessary in server machine to determine the port number to which RPC order to improve the overall performance for hefty messages should be sent. The port mapper makes dynamic frequented hosts.