Algorithm and Code


The Technion - Israel Institute of Technology
Electrical Engineering
Computer Networks Laboratory

Project Report Subject:

Double FTP Client

Author: Jonathan Charbit

Project Supervisor: Ilan Hazan

Summer 5759

Abstract

The project specification, as defined by the Technion’s Network & Communication lab, was to design an FTP client that can retrieve a file from different servers, or even from the same server using two different paths.

The idea is to fetch different parts of the same file using the FTP command “REST”, as specified in RFC 959.

We implemented a very simple version in C for a UNIX environment: two processes run in parallel, one retrieving the first part of the selected file and the other the second part.

Index

Abstract

Index

Introduction

Theoretical background

1.The File Transfer Protocol

2.Terminology

User guide

1.First mode of operation

2.Second mode of operation

Program design

A. Routine description

1.Main routines

2.Other routines

B. Data types

Conclusion and suggestions

Introduction

The basic idea is to establish two FTP connections, each retrieving one of the two parts of the same file.

We chose to collect all the information about the second server when the program is launched: on the command line, the user specifies where the second part of the file is to be retrieved from, together with the details of the second connection.

At the beginning of the program, a struct variable is set up that is later used to run the second process (when the 2give command is used, see Program design). The second process is independent: it has its own data and control connections.

Theoretical background

1.The File Transfer Protocol

The File Transfer Protocol (FTP) is the primary Internet standard for file transfer, either from server to client (retrieve) or from client to server (put). FTP was designed for computers running TCP/IP, and it uses two TCP connections at the same time:

• a control connection, which carries FTP commands (from client to server) and FTP replies (from server to client)
• a data connection, which carries the data itself (a file, or the list of files in the current directory)

The objectives of FTP are to promote sharing of files (computer programs or data), to encourage indirect or implicit use of remote computers, to shield a user from variations in file storage systems among hosts, and to transfer data reliably and efficiently.

Though usable directly by a user at a terminal, FTP was designed mainly for use by programs. It supports several commands that allow bidirectional transfer of both binary and text files between computers, where the requesting computer acts as the client and the other one as the server.

A user account is required on the remote machine, although some servers allow anonymous connections. The user protocol interpreter (user-PI) initiates the control connection, which follows the Telnet protocol. The control connection is a TCP connection from the user to the standard FTP server port (21). Once it is established, the user-PI sends FTP commands to the server process over it, and the server process sends FTP replies back to the user-PI. The user-PI is thus responsible for sending FTP commands and interpreting the server's replies.

When the user wants to retrieve a file, it must tell the server which free local port to use for the data connection. This is done with the FTP command “PORT”, sent on the control connection like all other FTP commands; its argument encodes the client's address and the chosen port number. The user then listens on that port and waits for the server to connect and transfer the data. When the transfer ends, the user-PI closes the data connection. If the user wants to stop before the whole file has been sent (as the first process in our program does, because it retrieves only the first part of the file), the user-PI closes the data connection and sends the FTP command “ABOR” (abort) on the control connection.

Unlike the control connection, the data connection is not permanent: a new data connection must be opened for each transfer of data (a file or a listing of the current directory).

At the end of the FTP session, the user is responsible for requesting that the control connection be closed (with the FTP command “QUIT”), but it is the server that actually closes it.

2.Terminology

ASCII In FTP, ASCII characters are defined to be the lower half of an eight-bit code set (i.e. the most significant bit is zero).

Control Connection The communication path between the USER-PI and SERVER-PI for the exchange of FTP commands and replies. This connection follows the Telnet protocol.

Data Connection A full duplex connection over which data is transferred, in a specified mode and type. The data transferred may be a part of a file, an entire file or a number of files (the list of files in the current directory is also sent over the data connection). The path may be between a server-DTP and a user-DTP, or between two server-DTPs.

Data Port The passive data transfer process “listens” on the data port for a connection from the active transfer process in order to open data connection.

DTP The data transfer process establishes and manages the data connection. The DTP can be passive or active.

EOF The end-of-file condition that defines the end of a file being transferred.

FTP commands A set of commands that make up the control information flowing from the user-FTP process to the server-FTP process.

File An ordered set of computer data (including programs), of arbitrary length, uniquely identified by a pathname.

Pathname Pathname is defined to be the character string which must be input to a file system by a user in order to identify a file. A pathname normally contains device and/or directory names, and a file name specification. FTP does not yet specify a standard pathname convention. Each user must follow the file naming conventions of the file systems involved in the transfer.

PI The protocol-interpreter; the user and server sides of the protocol have distinct roles implemented in a user-PI and a server-PI.

Reply A reply is an acknowledgment (positive or negative) sent from server to user in response to FTP command. The general form of a reply is a completion code (including error codes) followed by a text string. The codes are for use by programs and the text is usually intended for human users.

Server-DTP The data transfer process, in its normal “active” state, establishes the data connection with the “listening” data port. It sets up parameters for transfer and storage and transfers data on command from its PI. The DTP can be placed in a “passive” state to listen for, rather than initiate a connection on the data port.

Server-FTP process A process or set of processes which perform the function of file transfer in cooperation with a user-FTP process and possibly another server. The functions consist of a protocol interpreter (PI) and a data transfer process (DTP).

Server-PI The server protocol interpreter “listens” on the FTP port for a connection from a user-PI and establishes a control connection. It receives standard FTP commands from the user-PI, sends replies, and governs the server-DTP.

Type The data representation type used for data transfer and storage. Type implies certain transformations between the time of data storage and data transfer.

User A person or a process on behalf of a person wishing to obtain file transfer service. The human user may interact directly with a server-FTP process, but use of a user-FTP process is preferred since the protocol design is weighted towards automata.

User-DTP The data transfer process “listens” on the data port for a connection from a server-FTP process. If two servers are transferring data between them, the user-DTP is inactive.

User-FTP process A set of functions including a protocol interpreter, a data transfer process and a user interface which together perform the function of file transfer in cooperation with one or more server-FTP processes. The user interface allows a local language to be used in the command-reply dialogue with the user.

User-PI The user protocol interpreter initiates the control connection from its FTP port to the server-FTP process, initiates FTP commands, and governs the user-DTP if that process is part of the file transfer.
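The reply format described under “Reply” can be illustrated with a short sketch. RFC 959 groups the three-digit codes by their first digit: 1, 2 and 3 are positive replies (preliminary, completion, intermediate), while 4 and 5 are negative ones (transient, permanent). The function names reply_code and reply_is_positive are hypothetical and are not taken from the project's source.

```c
#include <ctype.h>

/* Hypothetical helper, not part of the project: extracts the three-digit
   code from the start of a reply line. Returns -1 if the line does not
   begin with three digits. */
int reply_code(const char *line)
{
    int i, code = 0;

    for (i = 0; i < 3; i++) {
        if (!isdigit((unsigned char)line[i]))
            return -1;               /* also stops at the end of a short line */
        code = code * 10 + (line[i] - '0');
    }
    return code;
}

/* Per RFC 959, a first digit of 1, 2 or 3 means a positive reply;
   4 and 5 mean a negative one. */
int reply_is_positive(int code)
{
    return code >= 100 && code < 400;
}
```

A program would typically branch on the code and show only the text part to a human user.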

User guide

Two modes of operation are available:

• simple FTP
• multiple FTP

1.First mode of operation

In the first mode of operation, the program runs as a simple FTP client that implements the following commands:

• curdir: prints the remote current directory
• dir: prints the list of files in the remote current directory
• giveme file_name: retrieves the file file_name from the server
• lcd directory: changes the local current directory to directory
• ldir: prints the list of files in the local current directory
• mkdir new_dir: creates a directory named new_dir in the local file system
• quit: stops the program
• rhelp: prints the list of FTP commands supported by the server
• type data_type: changes the data representation type to data_type

To run in first mode, just type:

client server_name

All the commands are case sensitive and should be typed in lowercase.

2.Second mode of operation

In the second mode, a file can be transferred in parallel from two servers.

To run in second mode, all the information about the second server must be given on the command line:

client first_server_name second_server_name username password path file_name file_size

When the “mftp>” prompt appears, type 2give to start the two processes.

Program design

A. Routine description

1.Main routines:

The program starts in the main() function, in the file my_client.c.

If the user specified a second server, the struct variable file_location (see Data types) is set up.

Then, ConnectToServer() is called.

ConnectToServer (server_name):

server_name: string argument, name of the server to connect to.

• creates a socket (only the control connection is created)
• connects to the server (and gets a reply)
• calls Login() (with NULL as the login and pass arguments)
• on success, calls GetAndInterpret() in a loop until the user types the quit command
• closes the control connection

GetandInterpret(sock):

sock: struct my_socket * argument, see Data types

This is the main function of the program, where the command typed by the user is processed.

The command line is split into a command name and its arguments by calling GetCmdandParam().

Here is the list of all the commands; the main ones are described in detail below:

• curdir: prints the remote current directory
• dir: prints the list of files in the remote current directory
• giveme file_name: retrieves the file file_name from the server
• lcd directory: changes the local current directory to directory
• ldir: prints the list of files in the local current directory
• mkdir new_dir: creates a directory named new_dir in the local file system
• quit: stops the program
• rhelp: prints the list of FTP commands supported by the server
• type data_type: changes the data representation type to data_type
• 2give: retrieves two parts of the file in two different processes

curdir:

Sends the FTP command “PWD” (Print Working Directory) on the control connection and prints the reply on screen (by calling ReadAndPrint()).

dir:

• opens the data connection by calling OpenDataConnection() with the struct sock
• sends the port number for the data connection on the control connection by calling SendPort()
• sends the FTP command “LIST” on the control connection
• reads from the data connection and prints the file list by calling WriteList()

giveme file_name:

Calls GetFile() with the file name taken from the command line as second parameter, 0 as third parameter (the transfer begins at the first byte of the file), and ILLIMITED as last parameter (the transfer ends only at the end of the file).

lcd directory:

Executes the system call chdir with parameter directory.

ldir:

Executes the shell command ls -l.

mkdir new_dir:

Uses system calls to create new_dir in the local working directory.

quit:

• sends the FTP command “QUIT” to the server
• prints the reply
• returns the QUIT value

rhelp:

• sends the FTP command “HELP” to the server
• prints the reply, which contains the command list (the server sends the command list as a reply on the control connection)

type data_type:

• sends the FTP command “TYPE” with the appropriate parameter to the server
• prints the reply

2give:

• splits into two processes
• father process: calls GetFile() with the appropriate parameters. The file name and size are taken from the struct file_location * my_file (see Data types), set up at the beginning of the program. The third parameter is 0, meaning that the transfer begins at the first byte of the file. The fourth parameter is the number of blocks to transfer; it is calculated so that no “hole” is left between the two processes (which may sometimes cause some bytes to be copied twice):

Blocks_to_transfer = file_size/(2*FILE_BLOCK_SIZE) + 2

The “+2” prevents “holes”: Blocks_to_transfer is an integer variable, so the division truncates.
• son process: calls ConnectToNewServer() to create new sockets for the control and data connections and retrieve the second part of the file.
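The split arithmetic can be checked with a short sketch. It combines the father's block count given above with the son's starting offset, (size+FILE_BLOCK_SIZE)/2, given in the ConnectToNewServer() description. The value 1024 for FILE_BLOCK_SIZE, the use of long, and the function names are assumptions made for illustration; the report does not state them.

```c
/* Assumed value for illustration; the report does not give the real one. */
#define FILE_BLOCK_SIZE 1024

/* Number of blocks fetched by the father process (formula from the report). */
long father_blocks(long file_size)
{
    return file_size / (2 * FILE_BLOCK_SIZE) + 2;
}

/* Byte offset at which the son process restarts (formula from the report). */
long son_offset(long file_size)
{
    return (file_size + FILE_BLOCK_SIZE) / 2;
}

/* Returns 1 when the father's range [0, father_end) reaches the son's
   starting offset, i.e. when no "hole" is left between the two parts. */
int halves_meet(long file_size)
{
    long father_end = father_blocks(file_size) * FILE_BLOCK_SIZE;
    return father_end >= son_offset(file_size);
}
```

Under these assumptions, a 10000-byte file gives 6 blocks for the father (bytes 0 to 6143) and a son offset of 5512, so 632 bytes are copied twice and no byte is missed.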

Now we continue the main routine description.

GetFile(sock, file, beginning, blocks_to_transfer):

sock: my_socket * argument, see Data types
file: string argument, name of the file to transfer
beginning: number of bytes to restart from (0 to transfer from the beginning of the file)
blocks_to_transfer: number of blocks to transfer, each block being FILE_BLOCK_SIZE bytes (ILLIMITED to transfer until the end of the file)

• calls OpenDataConnection()
• creates the file on the local file system (or opens it if it already exists)
• sends the FTP command “REST” (restart) with the value of beginning
• moves the local file pointer to the right position
• reads from the data connection and writes to the file in a loop (each iteration transfers one block) until blocks_to_transfer blocks have been transferred or the end of the file is reached
• prints the total number of bytes transferred
• if it stops before EOF, sends the FTP command “ABOR” (abort)
• closes the file pointer and the data connection socket
• returns the number of bytes transferred
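The seek-and-copy part of these steps could look roughly like the sketch below. It is not the project's GetFile(): the name copy_blocks, the value of FILE_BLOCK_SIZE and the encoding of ILLIMITED as -1 are assumptions. Note also that a read on a TCP socket may return fewer bytes than requested, so a “block” here means one read of at most FILE_BLOCK_SIZE bytes.

```c
#include <stdio.h>
#include <unistd.h>

#define FILE_BLOCK_SIZE 1024   /* assumed value, not given in the report */
#define ILLIMITED (-1L)        /* assumed encoding of "no block limit" */

/* Hypothetical sketch: copies data from descriptor data_fd into the local
   file out, starting at byte offset beginning, for at most blocks reads of
   up to FILE_BLOCK_SIZE bytes each. Returns the number of bytes written,
   or -1 on error. */
long copy_blocks(int data_fd, FILE *out, long beginning, long blocks)
{
    char buf[FILE_BLOCK_SIZE];
    long total = 0, done = 0;
    ssize_t n;

    if (fseek(out, beginning, SEEK_SET) != 0)
        return -1;
    while (blocks == ILLIMITED || done < blocks) {
        n = read(data_fd, buf, sizeof buf);
        if (n < 0)
            return -1;
        if (n == 0)                  /* sender closed the connection: EOF */
            break;
        if (fwrite(buf, 1, (size_t)n, out) != (size_t)n)
            return -1;
        total += n;
        done++;
    }
    return total;
}
```

In the two-process scheme, the father would call such a routine with beginning set to 0 and a finite block count, and the son with its restart offset and ILLIMITED.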

ConnectToNewServer(my_file):

my_file: file_location * argument, see Data types

• creates a socket for the new control connection
• connects to the new server, using the username and password from my_file (set up at the beginning of the program)
• calls GetFile() with the appropriate parameters: the file name and size are taken from the struct file_location * my_file (see Data types), set up at the beginning of the program. The third argument is (size+FILE_BLOCK_SIZE)/2. The fourth argument is ILLIMITED.

2.Other routines:

ReadAndPrint(sd, gen_code):

sd: integer argument, descriptor of the control connection socket
gen_code: integer variable, set to FIRST_ITERATION on the first call; needed for the recursive calls

ReadAndPrint() is a recursive function. Each call reads one reply line from the server and prints it. The number of lines the server is going to send is not known when the first one arrives, but the FTP protocol specifies that the server puts a “marker” at the beginning of the last line (in RFC 959, the three-digit reply code followed by a space rather than a hyphen). ReadAndPrint() calls itself until it sees this marker at the beginning of a line.

To identify the first call, gen_code is set to FIRST_ITERATION.
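The end-of-reply test that drives the recursion can be sketched as follows. In RFC 959, the first line of a multi-line reply has a hyphen right after the three-digit code, and the last line has a space there. The function name is hypothetical, and the check is simplified: it does not verify that the code on the last line equals the code on the first line, as the RFC requires.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if line ends an FTP reply, i.e. it starts
   with three digits followed by a space. A hyphen in the fourth position
   marks the first line of a multi-line reply instead. */
int is_last_reply_line(const char *line)
{
    /* && short-circuits, so a short line is never read past its end */
    return isdigit((unsigned char)line[0]) &&
           isdigit((unsigned char)line[1]) &&
           isdigit((unsigned char)line[2]) &&
           line[3] == ' ';
}
```

A loop that reads lines until this function returns 1 would be an iterative alternative to the recursive design described above.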

GetCmdandParam(cmd, param, cmd_line):

cmd: string that contains, at the end of the routine, the name of the command typed by the user
param: array of strings that contains, at the end of the routine, the parameters of the command line
cmd_line: command line typed by the user

GetCmdandParam() splits cmd_line into words: the first word goes into cmd, and the following words into param.
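One possible way to perform this decomposition is with strtok(), as in the sketch below. The name split_command, the MAX_PARAMS limit and the choice to split in place are assumptions; the report does not show how GetCmdandParam() is actually written.

```c
#include <string.h>

#define MAX_PARAMS 8   /* assumed limit, not taken from the report */

/* Hypothetical sketch: splits cmd_line in place on spaces, tabs and line
   ends. The first word is stored in *cmd and the following ones in param[].
   Returns the number of parameters found (at most MAX_PARAMS). */
int split_command(char *cmd_line, char **cmd, char *param[MAX_PARAMS])
{
    int count = 0;
    char *word;

    *cmd = strtok(cmd_line, " \t\r\n");
    if (*cmd == NULL)                /* empty command line */
        return 0;
    while (count < MAX_PARAMS &&
           (word = strtok(NULL, " \t\r\n")) != NULL)
        param[count++] = word;
    return count;
}
```

Because strtok() writes into its input, the command line must be a modifiable buffer, for example one filled by fgets().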

Login(sd, login, pass):

sd: integer argument, descriptor of the control connection socket
login: string argument, username for the connection (used for the second connection)
pass: string argument, password for the connection (used for the second connection)

• if login and pass are NULL, gets the username from the user and sends the USER command on the control connection, then gets the password from the user and sends the PASS command
• if login and pass are not NULL, they are sent to the server instead

OpenDataConnection(sock):

sock: my_socket * argument, see Data types

• creates a socket for the data connection
• chooses a port for the data connection at random
• listens on this port
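A minimal sketch of these three steps is shown below. It differs from the description in one respect: instead of drawing a random port number itself, it binds to port 0 and lets the kernel pick a free port, then reads the choice back with getsockname(). The function name and the IPv4-only setup are assumptions.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical sketch: opens a listening TCP socket on a port picked by the
   kernel and stores that port, in host byte order, in *port.
   Returns the socket descriptor, or -1 on error. */
int open_listening_socket(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int sd = socket(AF_INET, SOCK_STREAM, 0);

    if (sd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);        /* 0 = let the kernel choose a free port */
    if (bind(sd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(sd, 1) < 0 ||
        getsockname(sd, (struct sockaddr *)&addr, &len) < 0) {
        close(sd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return sd;
}
```

Letting the kernel choose avoids retrying when a randomly drawn port is already in use.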

SendPort(sock):

sock: my_socket * argument, see Data types

• gets the data connection port number
• sends the FTP command “PORT” with the appropriate number
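In RFC 959, the argument of PORT is six decimal numbers separated by commas: the four bytes of the client's IPv4 address followed by the high and low bytes of the port. The sketch below builds such a command line; the function name and parameter layout are hypothetical and are not taken from SendPort().

```c
#include <stdio.h>

/* Hypothetical helper: writes "PORT h1,h2,h3,h4,p1,p2\r\n" into buf.
   ip[] holds the four bytes of the client's IPv4 address and port is in
   host byte order. Returns the value returned by snprintf(). */
int format_port_command(char *buf, size_t size,
                        const unsigned char ip[4], unsigned short port)
{
    return snprintf(buf, size, "PORT %u,%u,%u,%u,%u,%u\r\n",
                    (unsigned)ip[0], (unsigned)ip[1],
                    (unsigned)ip[2], (unsigned)ip[3],
                    (unsigned)(port / 256),   /* high byte of the port */
                    (unsigned)(port % 256));  /* low byte of the port */
}
```

For address 192.168.1.10 and port 5001, this produces “PORT 192,168,1,10,19,137”, since 19*256 + 137 = 5001.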

SendRest(sock, num):

sock: my_socket * argument, see Data types
num: number of bytes to restart from

• sends the FTP command “REST” with the appropriate number
• prints the reply
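The command line sent by this routine could be built as in the sketch below; the helper name is hypothetical. According to RFC 959, the server answers REST with a 350 reply, and the restart marker applies to the transfer command (such as RETR) that immediately follows.

```c
#include <stdio.h>

/* Hypothetical helper: writes "REST <num>\r\n" into buf, where num is the
   byte offset to restart from. Returns the value returned by snprintf(). */
int format_rest_command(char *buf, size_t size, long num)
{
    return snprintf(buf, size, "REST %ld\r\n", num);
}
```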

WriteList(sock):

sock: my_socket * argument, see Data types

• reads from the data connection and prints to the screen in a loop until the server sends no more data
• prints the reply (sent by the server on the control connection)

B. Data types:

A struct has been defined to hold the socket ids:

struct my_socket {
    int ctl;   /* socket id for the control connection */
    int data;  /* socket id for the data connection */
};

A second struct holds the information about the file to retrieve with the “2give” command (which runs two processes in parallel):

struct file_location {
    char* server_name;  /* server name for the second connection */
    char* file_name;    /* name of the file to retrieve */
    char* login;        /* username for the second connection */
    char* password;     /* password for the second connection */
    char* path;         /* path of the file to retrieve */
    int size;           /* size of the file to retrieve */
};
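As an illustration of how struct file_location could be filled at startup, the sketch below maps the second-mode command line (client first_server_name second_server_name username password path file_name file_size) onto its fields. The function name, the strict argument count check and the use of atoi() are assumptions; the report does not show this part of main().

```c
#include <stdlib.h>

struct file_location {
    char *server_name;  /* server name for the second connection */
    char *file_name;    /* name of the file to retrieve */
    char *login;        /* username for the second connection */
    char *password;     /* password for the second connection */
    char *path;         /* path of the file to retrieve */
    int   size;         /* size of the file to retrieve */
};

/* Hypothetical sketch: fills *loc from the second-mode arguments.
   argv[1] is the first server; argv[2] to argv[7] describe the second one.
   Returns 0 on success, -1 if the argument count is wrong. */
int fill_file_location(struct file_location *loc, int argc, char *argv[])
{
    if (argc != 8)
        return -1;
    loc->server_name = argv[2];
    loc->login       = argv[3];
    loc->password    = argv[4];
    loc->path        = argv[5];
    loc->file_name   = argv[6];
    loc->size        = atoi(argv[7]);
    return 0;
}
```

The fields point directly into argv, which stays valid for the lifetime of the program, so no copies are needed in this sketch.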

Conclusion and Suggestions

This project is only a first step in the implementation of a much broader idea: parallel transfer of different parts of the same file.

Many more advanced implementations could be designed in the future:

• increasing the number of parallel connections
• using threads instead of separate processes
• adding time measurements to compare performance
• instead of assigning each connection a part of the file before run time, one could design a program in which each connection handles one “File block” and, once that whole “File block” has been transferred, moves on to the next one

Example with N connections and a File block size of 1000 Kbytes (ranges in Kbytes):

Connection #1: 1-1000
Connection #2: 1001-2000
. . .
Connection #N: (N-1)*1000+1 - N*1000

The first of the N connections to complete its “File block” then transfers the (N+1)th “File block”, N*1000+1 - (N+1)*1000, and so on.

• File block size could be defined differently and dynamically for each connection, according to the connection's performance
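The block-by-block suggestion can be sketched with two small helpers: one that turns a block index into a byte range, and one that hands out the next unassigned index. Both names are hypothetical. The sketch uses 0-based offsets with an exclusive end, unlike the 1-based ranges in the example, and it ignores synchronization: with threads the counter would need a lock, and with separate processes it would have to live in shared memory or be served by a coordinating process.

```c
/* Hypothetical sketch: computes the byte range [*start, *end) of the
   "File block" with the given 0-based index. Returns 0 if the block lies
   entirely beyond the end of the file, 1 otherwise. */
int block_range(long index, long block_size, long file_size,
                long *start, long *end)
{
    *start = index * block_size;
    if (*start >= file_size)
        return 0;
    *end = *start + block_size;
    if (*end > file_size)            /* the last block may be shorter */
        *end = file_size;
    return 1;
}

/* Returns the next unassigned block index and advances the counter.
   Not safe for concurrent use as written (see the note above). */
long take_next_block(long *next_index)
{
    return (*next_index)++;
}
```

Each connection would repeat "take an index, compute its range, transfer it" until block_range() returns 0.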
