HTTP (Hypertext Transfer Protocol)

Nan Niu
CSC309 -- Summer 2008

What’s a protocol?

[Figure: a human protocol and a network protocol side by side, with time running downward.]

Human protocol:
– hi
– hello
– how much is this?
– $50
– I’ll give you $40

Network protocol:
– TCP connection request
– TCP connection reply
– Get http://www.toronto.edu/index.html

What’s a protocol?

Human Protocols:
• “thank you … you’re welcome”
• “hello … hi … my name is … pleased to meet you”
• Price haggling
… specific msgs sent
… specific actions taken when msgs received
… may be context or culture sensitive

Network protocols:
• drive device, rather than human, interaction
• all communication activity in Internet is governed by protocols

Protocols define format & order of messages sent and received among network entities, and actions taken on message transmission, receipt.

Protocol Stacks

[Figure: Host A and Host B each have application, transport, network, data link, and physical layers; the network, data link, and physical layers also appear at the nodes between them.]

• Application: read/write from/to socket; application service across abstract network. Examples: HTTP, FTP, telnet, email
• Transport: ordered byte stream; reliable, efficient end user service. Examples: TCP/UDP
• Network (routers): routing, flow control, congestion control. Example: IP
• Data link (bridges): framing, error recovery, media access
• Physical (repeaters): modulate raw bits onto media (light/electrical/radio pulses). Examples: SONET, DSL

Application and Application-Layer Protocols

Application
– running in network hosts in “user space”
– exchange messages to implement app.
– e.g., email, file transfer, the Web

Application-layer protocols
– one “piece” of an app.
– define messages exchanged by apps and actions taken
– connectivity provided by lower layer protocols

[Figure: hosts with application, transport, network, data link, and physical layers.]

Protocol Layering

Protocols provide specialized services by building on services provided by other protocols.

[Figure: the Internet protocol stack, top to bottom.]
• Application (FTP, Telnet, WWW, email)
• Transmission Control Protocol (TCP): reliable in-order byte stream delivery (process-process); UDP: unreliable best effort datagram delivery (process-process)
• Internet Protocol (IP): unreliable best effort end-end datagram delivery (host-host)
• Network Interface (Ethernet, ATM): point-point datagram delivery (network interface-interface)
• Hardware (fiber, twisted-pair copper, coax, radio): physical network connection

Client-Server Paradigm

Client:
• initiates contact with server (“speaks first”)
• typically requests service from server
• for Web, client is implemented in browser; for e-mail, in mail reader

Server:
• provides requested service to client
• e.g., Web server sends requested Web page, mail server delivers e-mail

[Figure: a PC running Explorer and a Sun running Navigator each send an HTTP request to an Apache Web server and receive an HTTP response.]

HTTP (Hypertext Transfer Protocol)

• Created by Tim Berners-Lee at CERN
  – Defined 1989-1991
• Standardized and much expanded by the IETF
• Rides on top of TCP protocol
  – TCP provides: reliable, bi-directional, in-order byte stream
• Goal: transfer objects between systems
  – Do not confuse with other WWW concepts:
    • HTTP is not page layout language (that is HTML)
    • HTTP is not object naming scheme (that is URLs)
• Text-based protocol
  – Human readable

The Web: HTTP Protocol

HTTP
• Web’s application layer protocol
• client/server model
  – client: browser that requests, receives, and displays Web objects
  – server: Web server sends objects in response to requests
• http1.0: RFC 1945
• http1.1: RFC 2068

[Figure: a PC running Explorer and a Sun running Navigator each send an HTTP request to an Apache Web server and receive an HTTP response.]

HTTP in Operation

Suppose user enters URL www.toronto.edu/cs/index.html (containing text and references to 10 jpeg images)

1a. http client initiates TCP connection to http server (process) at www.toronto.edu. Port 80 is default for http server.
1b. http server at host www.toronto.edu waiting for TCP connection at port 80. “accepts” connection, notifying client
2. http client sends http request message (containing URL /cs/index.html) into TCP connection socket
3. http server receives request message, forms response message containing requested object (/cs/index.html), sends message into socket

HTTP in Operation (Cont’d)

4. http server closes TCP connection.
5. http client receives response message containing html file, displays html. Parsing html file, finds 10 referenced jpeg objects
6. Steps 1-5 repeated for each of 10 jpeg objects

HTTP Request Message: General Format

request line:   HTTP method sp URL sp HTTP version cr lf
header lines:   header field name : field value cr lf
                …
                header field name : field value cr lf
blank line:     cr lf
Request Body
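The request format and the numbered steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full client: `build_request` and `fetch` are hypothetical helper names, the `Host` header is an addition that HTTP/1.0 does not require, and the `connect` parameter exists only so that the connection can be swapped out for testing.

```python
import socket

CRLF = "\r\n"


def build_request(method, url, headers=None, body=b"", version="HTTP/1.0"):
    """Assemble request line, header lines, blank line, and optional body."""
    lines = [f"{method} {url} {version}"]              # request line
    for name, value in (headers or {}).items():        # header lines
        lines.append(f"{name}: {value}")
    head = CRLF.join(lines) + CRLF + CRLF              # blank line ends the headers
    return head.encode("ascii") + body


def fetch(host, path, port=80, connect=socket.create_connection):
    """Fetch one object over a non-persistent connection (steps 1-5)."""
    with connect((host, port)) as conn:                # step 1: open TCP connection
        # step 2: send the request message into the socket
        conn.sendall(build_request("GET", path, {"Host": host}))
        chunks = []
        while True:                                    # steps 3-5: read the response
            data = conn.recv(4096)
            if not data:                               # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("www.toronto.edu", "/cs/index.html")` would perform one round of steps 1-5; under step 6 the same call would be repeated once per referenced image.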

Request Line

HTTP method sp URL sp HTTP version cr lf

– HTTP method
  • GET – return content of specified document
  • HEAD – return headers only of GET response
  • POST – execute specified doc with enclosed data
– URL (path portion only, without scheme or host)
  • e.g. /headlines/ for http://www.toronto.edu/headlines/
– HTTP version
  • HTTP/1.1
  • HTTP/1.0

HTTP Request Header

• Header fields
    header field name : field value
• Examples:
  – Accept: text/html
  – Accept: image/jpeg
  – Accept-language: en, en-US, fr
  – If-Modified-Since: 11 June 2008
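To make the request line and header fields concrete, here is a small parsing sketch. The function name `parse_request` is hypothetical, and the code makes simplifying assumptions: header names are lower-cased, and a repeated header name overwrites the earlier value.

```python
def parse_request(raw: bytes):
    """Split a raw request into method, URL, version, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")        # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    method, url, version = lines[0].split(" ")         # request line: three fields
    headers = {}
    for line in lines[1:]:                             # header field name : field value
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, url, version, headers, body


raw = (b"GET /headlines/ HTTP/1.1\r\n"
       b"Accept: text/html\r\n"
       b"Accept-language: en, en-US, fr\r\n"
       b"\r\n")
method, url, version, headers, body = parse_request(raw)
```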

HTTP Request Example

request line (GET, POST, HEAD commands):
    GET /somedir/page.html HTTP/1.0
header lines:
    User-agent: Mozilla/4.0
    Accept: text/html, image/gif,image/jpeg
    Accept-language:fr
(extra carriage return, line feed)

Carriage return, line feed indicates end of message

HTTP Response Message: General Format

response line:  HTTP version sp status code sp status phrase cr lf
header lines:   header field name : field value cr lf
                …
                header field name : field value cr lf
blank line:     cr lf
Response Body
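The response format mirrors the request format, so a parser for it looks much the same. The sketch below uses a hypothetical function name, `parse_response`, and a made-up sample message; note that the status phrase may contain spaces, so the response line is split at most twice.

```python
def parse_response(raw: bytes):
    """Split a raw response into version, status code, phrase, headers, body."""
    head, _, body = raw.partition(b"\r\n\r\n")         # blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    version, code, phrase = lines[0].split(" ", 2)     # phrase may contain spaces
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body


sample = (b"HTTP/1.0 404 Not Found\r\n"
          b"Content-Type: text/html\r\n"
          b"Content-Length: 9\r\n"
          b"\r\n"
          b"not here!")
version, code, phrase, headers, body = parse_response(sample)
```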

Response Line

HTTP version sp status code sp status phrase cr lf

– HTTP version
– Status code (3-digit), 1st digit representing:
  • 1XX – informational
  • 2XX – success
  • 3XX – redirection
  • 4XX – client error
  • 5XX – server error
– Status phrase
  • Brief text explanation of status code (e.g. OK)

Response Status Code

• A few example codes
200 OK
  – request succeeded, requested object later in this message
301 Moved Permanently
  – requested object moved, new location specified later in this message (Location:)
400 Bad Request
  – request message not understood by server
404 Not Found
  – requested document not found on this server
505 HTTP Version Not Supported
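Since the first digit of the status code carries the class, a client can classify a response without knowing every individual code. A minimal sketch, with `status_class` as a hypothetical helper name:

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Map a 3-digit status code to its class via the first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For example, `status_class(301)` gives "redirection" and `status_class(505)` gives "server error".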

HTTP Response Header

• Header fields
    header field name : field value
• Examples:
  – Content-Type: text/html
  – Content-Length: 4028
  – Content-Language: en
  – Last-Modified: 11 June 2008

HTTP Response Example

status line (protocol, status code, status phrase):
    HTTP/1.0 200 OK
header lines:
    Date: Wed, 11 June 2008 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Wed, 11 June 2008 …...
    Content-Length: 6821
    Content-Type: text/html
(carriage return, line feed)
    ... data, e.g., requested html file

Carriage return, line feed indicates end of header lines

Conditional GET

• Goal: don’t send object if client has up-to-date copy (cached)
• client: specify date of cached copy in http request
    If-Modified-Since: <date>
• server: response contains no object if cached copy is up-to-date:
    HTTP/1.0 304 Not Modified

[Figure: client/server timeline with two cases. Object not modified: http request msg with If-Modified-Since:, http response HTTP/1.0 304 Not Modified. Object modified: http request msg with If-Modified-Since:, http response HTTP/1.1 200 OK …]

GET vs. POST: Typical Usage

• GET (static pages)
  – User clicks a link
  – Browser sends GET request to server
  – Server responds with XHTML page to browser
• POST (client can send information to server)
  – User types in a form & submits
  – Browser POSTs form data to server
  – Server replies to the POST request
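The server side of a conditional GET comes down to one date comparison. The sketch below is an illustration under stated assumptions: `conditional_get` is a hypothetical function, `last_modified` is assumed to be a timezone-aware UTC `datetime`, and an unparsable date is treated as if the header were absent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(request_headers, last_modified, body):
    """Return (status_line, headers, body) for a possibly conditional GET."""
    since = request_headers.get("If-Modified-Since")
    if since is not None:
        try:
            cached = parsedate_to_datetime(since)
        except (TypeError, ValueError):                # unparsable date: ignore it
            cached = None
        if cached is not None:
            if cached.tzinfo is None:                  # treat naive dates as UTC
                cached = cached.replace(tzinfo=timezone.utc)
            if last_modified <= cached:                # cached copy is up to date
                return "HTTP/1.0 304 Not Modified", {}, b""
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Content-Length": str(len(body)),
    }
    return "HTTP/1.0 200 OK", headers, body


modified = datetime(2008, 6, 11, 12, 0, 15, tzinfo=timezone.utc)
fresh = conditional_get({"If-Modified-Since": "Wed, 11 Jun 2008 12:00:15 GMT"}, modified, b"<html>")
stale = conditional_get({"If-Modified-Since": "Tue, 10 Jun 2008 12:00:15 GMT"}, modified, b"<html>")
```

Here `fresh` is the 304 case with an empty body, while `stale` carries the full object again.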

HTTP is Stateless and Anonymous

• State information is NOT maintained across multiple HTTP requests
  – When the connection is over, it is over
  – There are protocols that maintain “state”
    • FTP (File Transfer Protocol)
    • IMAP (Internet Message Access Protocol)
  – Problem
    • How to relate two requests? (e.g., buy 2+ items)
• Challenges
  – Creating a stateful user experience on top of a fundamentally stateless protocol
  – Keeping track of user identity

Session Tracking

• URL override
  – Visit: http://somesite/doc_x
  – Links on that page all have a session id: http://somesite/doc_x/session_id_assigned
• Hidden variables
• Cookies
  – Hold small amount of data
  – Stored on the client-side (generated & updated by the server)
  – Attached to each HTTP request
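The URL override technique can be illustrated with two small helpers. This is only a sketch of the idea: the function names are hypothetical, the session id is appended as the last path segment as in the slide's example URL, and real sites often use a query parameter instead.

```python
import secrets


def new_session_id() -> str:
    """Generate a hard-to-guess session id (16 hex characters)."""
    return secrets.token_hex(8)


def add_session(url: str, session_id: str) -> str:
    """Rewrite a link so that it carries the session id as its last segment."""
    return url.rstrip("/") + "/" + session_id


def split_session(url: str):
    """Recover (original URL, session id) from a rewritten link."""
    base, _, session_id = url.rpartition("/")
    return base, session_id


link = add_session("http://somesite/doc_x", "session_id_assigned")
```

Every link the server emits would go through `add_session`, and every incoming request through `split_session`, which lets the server relate two otherwise independent requests.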

Cookies

• server sends “cookie” to client in response msg
    Set-cookie: 1678453
• client presents cookie in later requests
    cookie: 1678453
• server matches presented-cookie with server-stored info
  – authentication
  – remembering user preferences, previous choices

[Figure: client/server timeline. Normal http request msg; normal http response + Set-cookie: #; normal http request msg + cookie: #; cookie-specific action at server; normal http response.]

Performance Enhancement in HTTP 1.1

• In HTTP 1.0
  – Connection closes immediately after response is returned
  – Causing slow performance if the content contains multiple images
    • Each image must start a new request and a new connection
• In HTTP 1.1
  – Connection is kept open a bit longer after the response is returned
  – Allow multiple requests on a single connection
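The cookie exchange from the Cookies slide can be mimicked with a small in-memory server object. This is a toy model under stated assumptions: `CookieServer` is a hypothetical class, the bare-number header values follow the slide's simplified notation (real cookies are name=value pairs), and the stored "visits" counter stands in for whatever per-user information a site keeps.

```python
import itertools


class CookieServer:
    """Toy server that issues a cookie and recognises it on later requests."""

    def __init__(self):
        self._ids = itertools.count(1678453)           # first id matches the slide
        self._store = {}                               # cookie value -> stored info

    def handle(self, request_headers):
        """Return (response_headers, stored_info) for one request."""
        cookie = request_headers.get("cookie")
        if cookie in self._store:                      # cookie-specific action
            self._store[cookie]["visits"] += 1
            return {}, self._store[cookie]
        new_id = str(next(self._ids))                  # unknown client: issue a cookie
        self._store[new_id] = {"visits": 1}
        return {"Set-cookie": new_id}, self._store[new_id]


server = CookieServer()
first_headers, _ = server.handle({})                               # no cookie yet
second_headers, info = server.handle({"cookie": first_headers["Set-cookie"]})
```

The second request is recognised because the client echoed the value it was given, which is how state is layered on top of the stateless protocol.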

Non-persistent and Persistent Connections

Non-persistent
• HTTP/1.0
• server parses request, responds, and closes TCP connection
• 2 RTTs to fetch each object
• Each object transfer suffers from slow start
But most 1.0 browsers use parallel TCP connections.

Persistent
• default for HTTP/1.1
• on same TCP connection: server parses request, responds, parses new request, ...
• Client sends requests for all referenced objects as soon as it receives base HTML.
• Fewer RTTs and less slow start.

Connection Length

• When does the data end?
  – Without persistent connections, when connection closes.
  – With persistent connections, reply header includes content length.
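Both answers to "when does the data end?" fit in one reader function. The sketch below is a simplified illustration: `read_response` is a hypothetical name, the stream is any binary file-like object (for a socket, one obtained from `sock.makefile("rb")`), and other framing mechanisms such as chunked transfer encoding are ignored.

```python
import io


def read_response(stream):
    """Read one response from a binary stream: (status_line, headers, body)."""
    status_line = stream.readline().rstrip(b"\r\n").decode("iso-8859-1")
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b"\n", b""):              # blank line ends the headers
            break
        name, _, value = line.decode("iso-8859-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    if "content-length" in headers:                    # persistent: length is known
        body = stream.read(int(headers["content-length"]))
    else:                                              # non-persistent: read to close
        body = stream.read()
    return status_line, headers, body


# Two responses back to back on one "connection", then one ended by closing.
wire = io.BytesIO(
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\nbye"
    b"HTTP/1.0 200 OK\r\n\r\nrest of the stream"
)
first = read_response(wire)
second = read_response(wire)
third = read_response(wire)
```

Without the Content-Length header the first call would have swallowed all three messages, which is why a persistent connection needs it.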

HTTP 1.0 vs. HTTP 1.1

• Benefits greatest for small objects.
• Serialized requests do not improve response time.
• Pipelining requests can result in large win.
• Server resource utilization reduced due to fewer connection establishments and fewer active connections.
• TCP behavior improved.
  – Longer connections help adaptation to available bandwidth.
  – Larger congestion window improves loss recovery.

Persistent Connection Performance

• Venkata N. Padmanabhan and Jeffrey C. Mogul, "Improving HTTP Latency," in Proceedings of the 2nd International WWW Conference, Chicago, IL, USA, Oct 1994
• Compared download latency for HTML documents with varying number of embedded images
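A back-of-the-envelope count shows why pipelining on a persistent connection can be a large win for a page like the earlier example (base HTML plus 10 images). The model below is a deliberate simplification and not a result from the cited paper: it counts round trips only and ignores transmission time, slow start, server processing, and the parallel connections that HTTP/1.0 browsers open. Both function names are hypothetical.

```python
def rtts_non_persistent(num_objects: int) -> int:
    """2 RTTs per object: one for TCP setup, one for request/response."""
    return 2 * num_objects


def rtts_persistent_pipelined(num_referenced: int) -> int:
    """One TCP setup, one RTT for the base HTML, one for all pipelined requests."""
    setup = 1
    base_html = 1
    referenced = 1 if num_referenced > 0 else 0
    return setup + base_html + referenced


serial_http10 = rtts_non_persistent(1 + 10)            # base page + 10 images
pipelined_http11 = rtts_persistent_pipelined(10)
```

Under these assumptions the page costs 22 round trips over serial non-persistent connections and 3 over one pipelined persistent connection; measured gains depend on the factors the model leaves out.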
