Hypertext Transfer Protocol
Hypertext Transfer Protocol
Ing. Pierluigi Gallo HTTP
Hypertext Transfer Protocol
used in the WWW protocol used for communication between web browsers and web servers client-server paradigm
TCP port 80
RFC 1945
Ing. Pierluigi Gallo Introduction to HTTP
80% of Internet flows are HTTP connections Early protocol is HTTP 0.9 read only Today we use HTTP 1.0 read, input, delete, ... New version: HTTP 1.1 performance optimizations
Ing. Pierluigi Gallo HTTP Overview
Client (browser) sends HTTP request to server
Request specifies affected URL
Request specifies desired operation
Server performs operation on URL
Server sends response
Request and reply headers are in pure text
Ing. Pierluigi Gallo Static Content and HTML
Most static web content is written in HTML
HTML allows Text formatting commands Embedded objects Links to other objects
Server need not understand or interpret HTML
Ing. Pierluigi Gallo URI,URN,URL
RFC 3305
Uniform Resource Identifier Identifies a resource
Uniform Resource Name The name of the resource with in a namespace like a person’s name
Uniform Resource Locator How to find the resource, a URI that says how to find the resource like a person street address
URI concept is more general than its use in web pages (XML, …)
Ing. Pierluigi Gallo HTTP - URLs
URL Uniform Resource Locator protocol (http, ftp, news) host name (name.domain name) port (80, 8080, …) directory path to the resource resource name
absolute relative http://www.tti.unipa.it/~pg/pg/Teaching.html http://xxx.myplace.com:80/cgi-bin/t.exe
Ing. Pierluigi Gallo URI examples
http://example.org/absolute/URI/with/absolute/ path/to/resource.txt ftp://example.org/resource.txt urn:issn:1535-3613
/relative/URI/with/absolute/path/to/resource.txt relative/path/to/resource.txt ../../../resource.txt
./resource.txt#frag01
Ing. Pierluigi Gallo HTTP - methods
Methods GET retrieve a URL from the server simple page request depending on the requested page: run a CGI program run a CGI with arguments attached to the URL POST preferred method for forms processing run a CGI program parameterized data in sysin more secure and private
Ing. Pierluigi Gallo HTTP - methods
Methods (cont.) PUT Used to transfer a file from the client to the server HEAD requests URLs status header only used for conditional URL handling for performance enhancement schemes retrieve URL only if not in local cache or date is more recent than cached copy DELETE deletes page from server
Ing. Pierluigi Gallo req-resp approach
client-server network protocol
in use by the World-Wide Web since 1990
request-response HTTP request messages for HTML pages, images, scripts and styles sheets. Web servers handle these requests by returning response messages that contain the requested resource.
Ing. Pierluigi Gallo Example of an HTTP Exchange
Client Server GET www.cs.virginia.edu
Retrieve Data From Disk
Ing. Pierluigi Gallo Fetching Multiple Objects
Most web-pages contain embedded objects (e.g., images, backgrounds, etc)
Browser requests HTML page
Server sends HTML file
Browser parses file and requests embedded objects
Server sends requested objects
Ing. Pierluigi Gallo Fetching Embedded Objects
Client Server GET www.cs.virginia.edu
Retrieve Data From Disk
GET image.gif
Retrieve Image From Disk
Ing. Pierluigi Gallo HTTP Request Packets Sent from client to server
Consists of HTTP header header is hidden in browser environment contains: content type / mime type content length user agent - browser issuing request content types user agent can handle HTTP 1.1 is the and a URL latest version GET /simtec/httpgallery/introduction/ HTTP/1.1 Accept:*/* Accept-Language: en-gb Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 6.0) Host: www.httpwatch.com Ing. PierluigiConnection: Gallo Keep-Alive HTTP Request Headers
Precede HTTP Method requests
headers are terminated by a blank line
Header Fields: From Accept Accept-Encoding Accept Language
Ing. Pierluigi Gallo HTTP 1.1 vs 1.0
Additional Methods (PUT, DELETE, TRACE, CONNECT + GET, HEAD, POST)
Additional Headers
Transfer Coding (chunk encoding)
Persistent Connections (content-length matters)
Request Pipelining
Ing. Pierluigi Gallo HTTP 1.0
Client opens a separate TCP connection for each requested object Object is served and connection is closed Advantages maximum concurrency Limitations TCP connection setup/tear-down overhead TCP slow start overhead
Ing. Pierluigi Gallo HTTP 1.0
connect() Client SYN Server SYN, ACK write() ACK, GET www.cs.virginia.edu Retrieve Data From Disk close() connect() SYN SYN, ACK write() ACK, GET image.gif Retrieve Image From Disk Ing. Pierluigiclose() Gallo HTTP 1.1
To avoid a connection per object model, HTTP 1.1 supports persistent connections Client opens TCP connection to server All requests use same connection Problems Less concurrency Server does not know when to close idle connections
Ing. Pierluigi Gallo HTTP 1.1
connect() Client SYN Server SYN, ACK write() ACK, GET www.cs.virginia.edu Retrieve Data From Disk write() GET image.gif Retrieve Image From Disk
close()
Ing. Pierluigi Gallo Server Side Close()
Client Server connect() SYN Set timeout SYN, ACK Reset write() ACK, GET www.cs.virginia.edu timeout Retrieve Data From Disk GET image.gif write() Retrieve Image From Disk
Timeout! Ing. Pierluigi Gallo close() CGI Scripts
Common Gateway Interface web server software can delegate the generation of web pages to a stand-alone application, an executable file. CGI scripts are URLs with a .cgi extension The script is a program (e.g., C, JAVA, …) When the URL is requested, server invokes the named script, passing to it client info Script outputs HTML page to standard output (redirected to server) Server sends page to client
Ing. Pierluigi Gallo CGI Execution
fork()
CGI Server Script
Send page
Request Response
Ing. Pierluigi Gallo Modified-Since:
Used with GET to make a conditional GET
if requested document has not been modified since specified date a Modified 304 header is sent back to client instead of document client can then display cached version
Ing. Pierluigi Gallo Status Header
“HTTP/1.0 sp code”
Codes: 1xx - reserved for future use 2xx - successful, understood and accepted 3xx - further action needed to complete 4xx - bad syntax in client request 5xx - server can’t fulfill good request
Ing. Pierluigi Gallo HTTP Response Headers
Sent by server to client browser
Status Header Entities Content-Encoding: Content-Length: Content-Type: Expires: Last-Modified: extension-header
Body – content (usually html)
Ing. Pierluigi Gallo Status Codes
200 OK 401 unauthorized
201 created 403 forbidden
202 accepted 404 not found
204 no content 500 int. server error
301 moved perm. 501 not impl.
302 moved temp 502 bad gateway
304 not modified 503 svc not avail
400 bad request
Ing. Pierluigi Gallo Statelessness
Because of the Connect, Request, Response, Disconnect nature of HTTP it is said to be a stateless protocol i.e. from one web page to the next there is nothing in the protocol that allows a web program to maintain program “state” (like a desktop program). “state” can be maintained by “witchery” or “trickery” if it is needed
Ing. Pierluigi Gallo Maintaining program “state”
Hidden variables (
Sessions Special header tags interpreted by the server Used by ASP, PHP, JSP Implemented at the language api level
Ing. Pierluigi Gallo HTTPS Secure Hypertext Transport Protocol Ing. Pierluigi Gallo SSL vs S-HTTP
Secure Sockets Layer (SSL) protocol
has become the Internet’s key “secure protocol”
supports security across a variety of Internet transfer protocols (FTP, HTTP, IRC, etc…)
Support public keys and private key.
Ing. Pierluigi Gallo Introduction S-HTTP Secure Hypertext Transport Protocols (S-HTTP) - Is a modified version of the Hypertext Transport Protocols (HTTP). - Encryption for Web documents. - Provides the client (browser) the ability to verify message by using a Message Authentication Code (MAC). - Primary purpose is to enable commercial transactions within a wide range of applications. Ing. Pierluigi Gallo S-HTTP
Methods Signature Encryption Message sender Authenticity
Ing. Pierluigi Gallo S-HTTP vs HTTP
HTTP message S-HTTP message message body encrypted message message header body message header
Ing. Pierluigi Gallo 3 steps how server create S-HTTP message
Server Encrypt Methods List
KPCS-7 RSA Diffie- Hellman Encrypt Method KPCS-7
Client Encrypt Methods List
KPCS-7 RSA Diffie- Hellman
Server compares encryption lists and selection. Ing. Pierluigi Gallo from the plain text to the encrypted message
Public-Key Cryptography Standards (PKCS)
S-http header Client’s PKCS-7 !@##$$%%* PKCS-7 This is This is !@##$$%%* @***&^&^%$ my PKCS-7 @***&^&^%$ @#$$@@!&^ my Session message @#$$@@!&^ Key message
RFC 2315. Used to sign and/or encrypt messages under a PKI. Used also for certificate dissemination (for instance as a response to a PKCS#10 message). Formed the basis for S/MIME, which is as of 2010 based on RFC 5652, an updated Cryptographic Message Syntax Standard (CMS). Often used for single sign-on. Ing.PKCS Pierluigi Gallo Cryptographic Algorithm and digital signature modes for S-HTTP
S-HTTP provides message protection in 3 ways:
Digital signature
Message authentication
Message encryption
Ing. Pierluigi Gallo Further readings
Hypertext Transfer Protocol -- HTTP/1.1 RFC 2616
Hypertext Transfer Protocol -- HTTP/1.0 RFC 1945
Upgrading to TLS Within HTTP/1.1 RFC 2817
HTTP Over TLS RFC 2818
HTTP Authentication: Basic and Digest Access Authentication RFC 2617
HTTP State Management Mechanism (Cookies) RFC 2109
HTTP State Management Mechanism (Cookie2) RFC 2965
Ing. Pierluigi Gallo HTTP proxy clients servers Reply Req. Req. proxy Reply
The proxy sits between the client and the server. In the simplest case, instead of sending requests directly to the server the client sends all its requests to the proxy. The proxy then opens a connection to the server, and passes on the client's request. The proxy receives the reply from the server, and then sends that reply back to the client the proxy is acting like HTTP client (to the remote server) Ing. Pierluigi Gallo HTTP server (to the initial client) how the proxy works
Client send requests to the proxy. If the requested document is in its cache, the proxy serves the request from its cache. Otherwise, the proxy forward the request to the server. Server replies the request through the proxy (proxy keep a copy of the requested document).
Ing. Pierluigi Gallo proxy caching
performance improvement
Reduce the user-perceived latency associated with obtaining Web documents.
Lower the network traffic from the Web servers.
Reduce the service demands on content providers.
Ing. Pierluigi Gallo Proxy and privacy
The proxy can: inspect the requested URL and selectively block access to certain domains reformat web pages (for instances, by stripping out images to make a page easier to display on a handheld or other limited-resource client), perform other transformations and filtering.
Normally, web servers log all incoming requests for resources. client IP address the browser or other client program that they are using (called the User- Agent date and time the requested file.
a proxy hides client personally identifiable information,
All requests coming from clients using the same proxy appear to come from the IP address and User-Agent of the proxy itself
Ing. Pierluigi Gallo Other ideas (taken from papers)
Web Proxy Servers and Off-peak Prefetching Remote Clients Web Servers Web Proxy Server
(Cache) Fire wall Peak Level Bandwidth Bandwidth
0 Ing. Pierluigi Gallo Day 1 Day 2 Caching, proxing, filtering
Content delivery networks Codeen
Content Filtering Dansguardian, squidguard, …
http proxy squid, tinyproxy, Apache Traffic Server, …
firewall iptables, firehol, …
Ing. Pierluigi Gallo get your hand dirty
/usr/local/etc/dansguardian.conf
/usr/local/etc/dansguardian/lists/bannedsitelist /usr/local/etc/tinyproxy.con
browser configuration (to use proxy) run a web server
in our lab we already have a proxy! We need to instruct ourtesting proxy to contact the ‘official’ one.
http://alien.slackbook.org/dokuwiki/doku.php?id=slackware:parentalcontrol
Ing. Pierluigi Gallo