Network Programming and Protocol Design
What Is A Socket?
• Sockets (also known as Berkeley sockets) are an application programming interface (API) to the operating system’s TCP/IP stack
• Used on all Unixes
• Windows Sockets are similar
• Some Unixes also have XTI (X/Open Transport Interface) and TLI (Transport Layer Interface) (not covered here)
• Can be used for protocol stacks other than TCP/IP (not covered here)

Socket Functions
• Socket functions:
  – Allocating resources: socket, bind
  – Opening connections: accept, connect
  – Closing connections: close, shutdown
  – Sending and receiving: read, write, recv, send, recvfrom, sendto, recvmsg, sendmsg
  – Other: getsockopt, setsockopt, getsockname, getpeername
• Utilities:
  – Network information (host names, addresses)
  – Byte order

Byte Order
• Addresses and port numbers must be in network byte order
  – Also known as big-endian: most significant byte first
• Conversion routines for 16-bit and 32-bit integers:
  – htons, htonl ("host to network short/long")
  – ntohs, ntohl ("network to host short/long")
• On big-endian systems these are null macros ("#define htons(x) (x)")

Address Structures
    struct in_addr {
        in_addr_t s_addr;             /* IPv4 address, network byte order */
    };

    struct sockaddr_in {
        sa_family_t    sin_family;    /* AF_INET */
        in_port_t      sin_port;      /* 16-bit port, network byte order */
        struct in_addr sin_addr;      /* IPv4 address */
        char           sin_zero[8];   /* always zero */
    };

Address Functions
• From string to address: in_addr_t inet_addr(const char *cp)
  – returns INADDR_NONE (-1) on error, which cannot be told apart from the valid address 255.255.255.255
• From address to string: char *inet_ntoa(struct in_addr in)
  – returns a pointer to a statically allocated buffer
  – thread-safe only on some Unixes (those that keep the buffer in thread-specific data)
• inet_aton (not available on all Unixes)
  – from ASCII "194.197.118.20" to struct in_addr
  – int inet_aton(const char *cp, struct in_addr *inp);

Creating A Socket
• int socket(int domain, int type, int protocol)
• Domain is
  – AF_INET for TCP/IP protocols
  – AF_UNIX (AF_LOCAL) for Unix domain sockets
  – others
• Type is
  – SOCK_STREAM (TCP)
  – SOCK_DGRAM (UDP)
  – SOCK_RAW (raw IPv4)
  – others
• Protocol is usually zero
• Returns a new socket descriptor, or -1 on error

A Typical TCP Client
socket -> connect -> read/write -> close

Connecting
• int connect(int sockfd, const struct sockaddr *serv_addr, socklen_t addrlen);
• Establishes a connection to the server
• Returns 0 on success, -1 on error (errno is set accordingly)
• Typical errno values:
  – ECONNREFUSED: host is up, but no server is listening on the port
  – ETIMEDOUT: host or network may be down

Example: Simple HTTP Client (1/4)
    /*
     * Simple HTTP client program, version 1.
     * Written by [email protected].
     */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

Example: Simple HTTP Client (2/4)
    /* Print an error message and terminate. */
    void die(const char *message)
    {
        fprintf(stderr, "%s\n", message);
        exit(1);
    }

    int main(int argc, char *argv[])
    {
        int sockfd, n;
        struct sockaddr_in addr;
        unsigned char buffer[4096];

        if (argc != 4)
            die("usage: geturl ip-address port local-url");

Example: Simple HTTP Client (3/4)
        /* Open socket */
        if ((sockfd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
            die("socket error");

        /* Parse address */
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_port = htons(atoi(argv[2]));
        addr.sin_addr.s_addr = inet_addr(argv[1]);
        if (addr.sin_addr.s_addr == INADDR_NONE)
            die("bad address");

        /* Connect to remote host */
        if (connect(sockfd, (struct sockaddr *) &addr, sizeof(addr)) == -1)
            die("connect error");

Example: Simple HTTP Client (4/4)
        /* Send HTTP request (return values of write are not checked here) */
        write(sockfd, "GET ", 4);
        write(sockfd, argv[3], strlen(argv[3]));
        write(sockfd, " HTTP/1.0\r\n\r\n", 13);

        /* Read response until the server closes the connection */
        while ((n = read(sockfd, buffer, sizeof(buffer))) > 0)
            write(STDOUT_FILENO, buffer, n);

        /* Close and exit */
        close(sockfd);
        return 0;
    }

Simple HTTP Client In Action
    $ ./httpclient1 192.168.3.4 80 /
    HTTP/1.1 200 OK
    Date: Sat, 24 Apr 1999 17:08:25 GMT
    Server: Apache/1.3.4
    Last-Modified: Fri, 26 Feb 1999 15:28:20 GMT
    Connection: close
    Content-Type: text/html

    <html><head><title>Example Inc.</title></head>
    <body>
    <h1>Welcome to Example Inc’s web server!</h1>
    ...
    $

What Is A Server
• Background process
• No user interface
• Handles service requests from the network
• Can also send requests, for example DNS and NTP
• Typically must handle many requests concurrently -> some kind of multitasking needed

What Is Special In A Server
• Concurrency & network I/O
• Protocol encoding/decoding
• Application-specific logic
• System interaction: Unix daemons, Windows NT services
• Logging
• Security

Network I/O On Unix
• Sockets are file descriptors
• Important I/O operations for TCP sockets:
  – accept
  – connect
  – read, write
  – close, shutdown
• Important I/O operations for UDP sockets:
  – sendto
  – recvfrom

Binding To Specific Port
• int bind(int sockfd, const struct sockaddr *my_addr, socklen_t addrlen)
• bind assigns a specific local address and port to the socket
  – normally used for servers (where the port must be known)
  – the address is a specific IP address or INADDR_ANY
• In clients, you don’t usually need bind
  – a random "ephemeral" port is chosen automatically
  – the correct interface and IP address are chosen automatically

bind() example
• Bind socket to local port 80 (for HTTP server)
    struct sockaddr_in my_addr;

    memset(&my_addr, 0, sizeof(my_addr));
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(80);
    my_addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sockfd, (struct sockaddr *) &my_addr, sizeof(my_addr)) == -1)
        die("bind failed");

bind() notes
• bind can also bind to a specific IP address!
  – Most hosts have at least two interfaces: loopback and Ethernet
  – One physical interface can have multiple addresses ("IP aliasing")
  – With IP aliasing, web servers need to bind to a specific IP address
  – INADDR_ANY binds to all addresses
• Binding to ports 1-1023 requires root privileges on Unix
  – Traditionally used for security: do not trust this!
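
The point about binding to a specific IP address can be sketched as follows. This is a minimal illustration, not part of the original slides: the helper names bind_to_address and local_port are hypothetical, and inet_pton is used in place of inet_addr because its error return is unambiguous. Passing port 0 asks the system to pick an ephemeral port, which avoids the root requirement for ports 1-1023.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to one local IPv4 address.
 * Returns the socket descriptor, or -1 on any error. */
int bind_to_address(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    int fd;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);        /* port 0: let the system choose */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                      /* not a dotted-quad IPv4 address */

    if ((fd = socket(AF_INET, SOCK_STREAM, 0)) == -1)
        return -1;
    if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1) {
        close(fd);                      /* do not leak the descriptor */
        return -1;
    }
    return fd;
}

/* Hypothetical helper: local port of a bound socket in host byte order,
 * or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    if (getsockname(fd, (struct sockaddr *) &addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}
```

A caller might use bind_to_address("127.0.0.1", 0) to listen only on the loopback interface and then call local_port() to find out which port was assigned. A server with a well-known port would pass that port number instead.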
Getting Remote Host Name
• gethostbyaddr() converts an IP address to a domain name
• Not all addresses have domain names
• Not secure: the owner of the IP address can return any name they want!
• Partial solution: double DNS lookup
  – gethostbyaddr(IP_ADDRESS) -> NAME
  – gethostbyname(NAME) -> NAME_ADDRESSES (0...N)
  – check that IP_ADDRESS is one of NAME_ADDRESSES

TCP Connections
• When the server calls accept() it gets:
  – a file descriptor for reading/writing data
  – remote IP + port (from getpeername)
  – local IP + port (from getsockname)

UDP "Connections"
• UDP is not really connected
• When the server calls recvfrom() it gets:
  – packet data
  – remote IP + port

Iterative UDP Server
initialize -> wait for packet -> process request -> send reply -> wait for packet ...
• Single thread of execution
• If processing doesn’t take long, works well!
  – Otherwise the service is blocked
• Simple, easy to coordinate access to resources
• Must be careful not to use blocking operations:
  – CPU-intensive tasks
  – SQL database queries
  – gethostbyname, gethostbyaddr

Example: radiusd
• Potentially lots of requests
• Uses UDP
• Very little processing per request
• Solution: single-threaded UDP server

Process-per-connection TCP Server
parent: initialize -> wait for connection -> fork -> wait for connection ...
child (after fork): receive data -> process request -> send reply -> close connection & exit
• New process started for each connection
• Good sides:
  – Easy to use, works!
  – Reliable: if one process dies, others continue
• Problems:
  – Starting new processes is slow
  – Co-operation between processes is limited or difficult
  – Access to shared resources (log files, etc.) needs to be coordinated

Example: telnetd
• Each connection lasts quite long, so process startup overhead is not a problem
• Asynchronous I/O would be very difficult
• Solution: process-per-connection TCP server

Process Pre-allocation
• Since starting processes is slow, start all processes at the beginning
• Memory used by unused processes is wasted.
• Only a limited number of connections can be served concurrently.
• Used very successfully!

Threads
• Creating threads is much faster than creating processes
• All modern Unixes and Windows NT have threads
• Shared memory makes co-operation easy
• Access to shared memory needs to be coordinated

Asynchronous TCP Server
initialize -> wait for events -> one of:
  – accept new connection
  – receive data and process
  – send more reply data
then back to wait for events
• Single thread of execution
• Event multiplexing using poll() or select()
• Easy to coordinate access to resources
• Must avoid blocking operations

Example: BIND DNS Server
• Lots of requests
• Needs to be very fast
• Most requests are UDP, but some are TCP
• Very little CPU processing
• In-memory database and cache
  – (hard to share between processes)
• Needs to be portable to legacy systems -> no threads
• Solution: asynchronous I/O for both UDP and TCP

Concurrency In Clients
• Typically clients have a user interface, etc.
• Using separate processes is difficult, since the network process and the user interface need to communicate.
• Solution: threads.

Distributed Computing
• Generally: a view of shared computing and data resources, transparent communication between programs, and access to objects located on other hosts
• Sun RPC was the first popular protocol
  – CORBA is currently somewhat popular
  – Both are based on an abstraction layer that hides the network
• Web Services and .NET take a slightly different approach
  – XML is used to represent all kinds of objects
• Advantages are access to shared resources, transparent communication and flexibility
• Disadvantages are added complexity and security concerns
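
Returning to the asynchronous server pattern described earlier (single thread, event multiplexing with poll()), one pass of such an event loop might look roughly like the sketch below. It is an illustration under assumptions, not code from the original slides: the function name echo_ready is hypothetical, the "processing" is a plain echo, and the descriptors are assumed to be already connected sockets.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: one pass of a single-threaded event loop.
 * Waits up to timeout_ms for any descriptor in fds to become readable,
 * then echoes whatever arrived back on the same descriptor.
 * Returns the number of descriptors serviced, 0 on timeout, -1 on error. */
int echo_ready(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    char buffer[4096];
    int serviced = 0;
    nfds_t i;

    /* Ask poll() to report readability on every descriptor. */
    for (i = 0; i < nfds; i++) {
        fds[i].events = POLLIN;
        fds[i].revents = 0;
    }
    if (poll(fds, nfds, timeout_ms) == -1)
        return -1;

    /* Service only the descriptors that poll() marked readable. */
    for (i = 0; i < nfds; i++) {
        ssize_t n;

        if (!(fds[i].revents & POLLIN))
            continue;
        n = read(fds[i].fd, buffer, sizeof(buffer));
        if (n > 0 && write(fds[i].fd, buffer, (size_t) n) == n)
            serviced++;
    }
    return serviced;
}
```

A server would call this in a loop, adding newly accepted connections to the array and removing closed ones. The write() here can still block if the peer is slow, so a real server would likely set the descriptors to non-blocking mode and queue unsent reply data, in line with the "must avoid blocking operations" rule.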