05 4547 ch03 1/24/03 8:44 AM Page 29

3 HTTP Transactions

HTTP TRAFFIC CONSISTS OF REQUESTS AND RESPONSES. All HTTP traffic can be associated with the task of requesting content or responding to those requests. Every HTTP message sent from a Web browser to a Web server is classified as an HTTP request, whereas every message sent from a Web server to a Web browser is classified as an HTTP response.

HTTP is often referred to as a stateless protocol. Although this is accurate, it does little to explain the nature of the Web. All it means is that each transaction is atomic, and nothing in HTTP associates one request with another. A transaction refers to a single HTTP request and the corresponding HTTP response. Another fundamental topic related to the nature of the Web is connections.

Connections

When I speak of a connection in HTTP, I refer to a TCP connection. As illustrated in Figure 3.1, a TCP connection requires three separate messages.

[Figure 3.1 diagram: three messages in sequence: TCP SYN, TCP SYN + ACK, TCP ACK]

Figure 3.1 A TCP connection requires three messages.

SYN and ACK are two flags within the TCP segment of a packet. Because TCP is so commonly used as the transport layer protocol with IP, an IP packet containing a TCP segment is sometimes called a TCP/IP packet, even though it would best be described as a packet within a packet.

From this example, you can see that a connection is unlike what you might otherwise expect. After this exchange, both computers simply consider themselves connected. In terms of HTTP, this simply means the server is ready to receive requests from this specific client. There is no real active connection in the traditional sense. It is better described as an understanding between the two computers that they are connected.

A two-way radio is a good analogy for this type of connection. If you and a friend both have two-way radios, you can establish a similar method for ensuring that you are both able to send and receive messages properly. To do this, you send a message (by talking into the radio) asking to establish a connection. Your friend sends back a confirmation message acknowledging your request and agreeing to the connection. At this point, you feel confident that each of you can both send and receive messages, but your friend cannot be assured of this without knowing whether you received the confirmation. You send back a final message acknowledging the receipt of your friend's confirmation. At this point, you both have confidence in your ability to communicate with these radios. This series of events is very similar to a TCP connection.
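The three-message exchange is carried out by the operating system, so application code never sees the SYN and ACK flags directly. The following is a minimal sketch in Python, assuming a throwaway listener on the loopback address stands in for a Web server; the function name open_local_connection is hypothetical and chosen only for illustration.

```python
import socket


def open_local_connection():
    """Open a TCP connection to a throwaway listener on the loopback address."""
    # Hypothetical stand-in for a Web server: listen on 127.0.0.1, any free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    host, port = listener.getsockname()

    # create_connection() returns once the operating system has carried out
    # the SYN, SYN + ACK, ACK exchange on our behalf.
    client = socket.create_connection((host, port), timeout=5)
    server_side, _client_address = listener.accept()

    # Each side now simply "considers itself connected" to the other.
    connected = client.getpeername() == (host, port)
    peer_matches = server_side.getpeername() == client.getsockname()

    for sock in (client, server_side, listener):
        sock.close()
    return connected, peer_matches
```

Both values come back true when each end of the connection names the other as its peer, which is all that being connected amounts to here.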

Note

A single connection can support multiple HTTP transactions. In many cases, multiple HTTP transactions are required to properly render a URL in a Web browser due to images and other associated content.
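To see several transactions share one connection, the sketch below sends more than one request over a single connection using Python's standard http.client and http.server modules. The local test server, the HelloHandler class, the fetch_over_one_connection function, and the requested paths are all assumptions made for illustration; they are not taken from the examples in this chapter.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = f"you asked for {self.path}\n".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_over_one_connection(paths):
    """Request each path in turn over a single connection to a local server."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    results, local_ports = [], []
    try:
        for path in paths:
            conn.request("GET", path)
            # The local port only changes if a new TCP connection was opened.
            local_ports.append(conn.sock.getsockname()[1])
            response = conn.getresponse()
            results.append((response.status, response.read().decode("ascii")))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results, local_ports
```

Calling fetch_over_one_connection(["/", "/logo.png"]) performs two transactions, and the recorded local port stays the same for both, which indicates that the same TCP connection carried them.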

Example HTTP Request

It is probably easiest to get an idea about what HTTP is by looking at a few examples. Using my Galeon 1.2.0 browser, I type in the URL http://127.0.0.1/ and press Enter. This is actually a request to the Web server running on my local computer (127.0.0.1 is a special IP address called the loopback address). The request that my browser sends is as follows:

    GET / HTTP/1.1
    Host: 127.0.0.1
    User-Agent: Mozilla/5.0 Galeon/1.2.0 (X11; Linux i686; U;) Gecko/20020326
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,
        text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,
        text/css,*/*;q=0.1
    Accept-Language: en
    Accept-Encoding: gzip, deflate, compress;q=0.9
    Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
    Keep-Alive: 300
    Connection: keep-alive
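A request like this one is plain text: a request line, one header per line, and an empty line to mark the end of the headers, with each line terminated by a carriage return and line feed. As a rough sketch, the helper below assembles such a message; the function name build_request is hypothetical, and only a few of the headers from the example are included to keep it short.

```python
def build_request(method, path, headers, version="HTTP/1.1"):
    """Assemble the bytes of an HTTP request that has no body."""
    lines = [f"{method} {path} {version}"]            # the request line
    lines += [f"{name}: {value}" for name, value in headers]
    # Lines end with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "GET",
    "/",
    [
        ("Host", "127.0.0.1"),
        ("Accept-Language", "en"),
        ("Connection", "keep-alive"),
    ],
)
```

Printing request.decode("ascii") shows the same layout as the request sent by the browser, only with fewer headers.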


Example HTTP Response

In this example, my Web server gives the following response:

    HTTP/1.1 200 OK
    Date: Tue, 21 May 2002 12:34:56 GMT
    Server: Apache/1.3.22 (Unix) (Red-Hat/Linux) mod_python/2.7.8 Python/1.5.2
        mod_ssl/2.8.5 OpenSSL/0.9.6b DAV/1.0.2 PHP/4.0.6 mod_perl/1.26
        mod_throttle/3.1.2
    Last-Modified: Thu, 01 Nov 2001 20:51:45 GMT
    ETag: "df6b0-b4a-3be1b5e1"
    Accept-Ranges: bytes
    Content-Length: 2890
    Connection: close
    Content-Type: text/html

    Test Page for the Apache Web Server on Red Hat Linux (...)

The majority of the response is HTML (omitted for brevity). Only the first few lines are HTTP. Thus, as intended, HTTP does not have much overhead. Lower-level protocols such as TCP and IP have even less overhead than HTTP, however, due mostly to the fact that HTTP is intentionally readable. This makes it easy to study and comprehend.
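Because the HTTP portion of a response is readable text, separating it from the content takes very little code. The following is a minimal sketch, assuming the raw bytes of a complete response are already in hand; the function name parse_response and the shortened sample message are illustrative and are not part of the server output shown above.

```python
def parse_response(raw):
    """Split raw response bytes into status line parts, headers, and body."""
    # The first empty line separates the HTTP headers from the content.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, status_code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return version, int(status_code), reason, headers, body


# A shortened, hypothetical response used only to exercise the function.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Connection: close\r\n"
    b"Content-Type: text/html\r\n"
    b"\r\n"
    b"(body omitted)"
)
version, status_code, reason, headers, body = parse_response(sample)
```

This sketch ignores folded header lines and repeated header names, so it is a reading aid rather than a complete parser.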

Example Transaction

A good example transaction to review is a search on Google. Because Google is one of the most popular sites on the Web, most people have interacted with it at one time or another. When performing a search on HTTP (see Figure 3.2), you enter HTTP into the form field and click the button labeled Google Search. When using my Web browser to perform this search, the following HTTP request is sent when I click the button:

    GET /search?hl=en&q=HTTP&btnG=Google+Search HTTP/1.1
    Host: www.google.com
    User-Agent: Mozilla/5.0 Galeon/1.2.0 (X11; Linux i686; U;) Gecko/20020326
    Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,
        text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,
        text/css,*/*;q=0.1
    Accept-Language: en
    Accept-Encoding: gzip, deflate, compress;q=0.9
    Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
    Keep-Alive: 300
    Connection: keep-alive

Figure 3.2 Searching Google for the term “HTTP”.
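The path and query string in the request line of this search are built from the form fields. As a small sketch of that encoding step, Python's standard urllib.parse module can produce the same string; the variable names are illustrative, and treating these three fields as the whole form is an assumption based only on the request shown in this example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Form field names and values as they appear in the example request.
params = [("hl", "en"), ("q", "HTTP"), ("btnG", "Google Search")]

# urlencode() joins the pairs with "&" and encodes the space as "+".
request_target = "/search?" + urlencode(params)

# Decoding the query string recovers the original field values.
decoded = parse_qs(urlsplit(request_target).query)
```

Here request_target holds /search?hl=en&q=HTTP&btnG=Google+Search, and decoding it turns the + back into a space.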

Google's Web site responds:

    HTTP/1.1 200 OK
    Server: GWS/2.0
    Date: Tue, 21 May 2002 12:34:56 GMT
    Transfer-Encoding: chunked
    Content-Encoding: gzip
    Content-Type: text/html
    Cache-control: private
    Set-Cookie: PREF=ID=58c005a7065c0996:TM=1021283456:LM=1021283456:S=OLJcXi3RhSE;
        domain=.google.com; path=/; expires=Sun, 17-Jan-2038 19:14:07 GMT

    (Web content compressed with gzip)

Of interest in this response is that the Web content is binary and cannot be printed. Because my browser specified in its request that it accepts gzip (GNU zip) encoding, Google chose to encode the response with gzip. This is a popular compression format that allows for a quicker transfer due to the smaller size of the HTTP response. My browser decompresses the content to reveal the HTML it needs to render the Web page (the results of my search).
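The effect of gzip encoding can be tried with Python's standard gzip module. This is only a sketch: the HTML below is made-up, repetitive sample content rather than Google's actual response, so the size reduction it shows is not representative of real pages.

```python
import gzip

# Made-up sample content standing in for the HTML of a results page.
html = (
    "<html><body>"
    + "<p>HTTP search result</p>" * 200
    + "</body></html>"
).encode("utf-8")

# What a server does before sending with Content-Encoding: gzip.
compressed = gzip.compress(html)

# What the browser does before rendering the page.
restored = gzip.decompress(compressed)

print(len(html), "bytes before,", len(compressed), "bytes after compression")
```

The compressed bytes are binary data, which is why the content of such a response cannot be printed until it has been decompressed.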


Summary

This should give you a good introduction to the basics of HTTP transactions. It is important to remember that an HTTP transaction consists of a single HTTP request and the corresponding HTTP response, but that multiple transactions are sometimes required for a single Web page to be ultimately rendered in your Web browser. In the next chapter, I explain some Web development techniques that leverage knowledge of HTTP to make Web development easier and of higher quality.