
Hypertext Transfer Protocol (HTTP) Fundamentals
Péter Jeszenszky
Faculty of Informatics, University of Debrecen
Last modified: September 10, 2021

Contents
● Introduction, concepts
● Messages
● Methods
● Status codes
● Content negotiation
● Web server software

Hypertext Transfer Protocol
● A stateless application-level protocol for distributed, collaborative, hypertext information systems.
● Was developed as a joint effort between IETF and W3C.
● See: IETF HTTP Working Group https://httpwg.org/

Characteristics
● A request/response protocol based on the client-server model.
● Stateless
  – I.e., subsequent requests are treated independently of each other.
● Extensible
  – E.g., methods, status codes, header fields.
● General-purpose
  – Although mainly used for communication between clients and web servers, it can, in principle, be used for any other purpose.

History (1)
● The first documented version:
  – HTTP/0.9 (Tim Berners-Lee) https://www.w3.org/Protocols/HTTP/AsImplemented.html
    ● Very simple: supports only GET requests, to which an HTML document consisting of ASCII characters is sent back as a response.
● HTTP/1.0:
  – Tim Berners-Lee, Roy T. Fielding, Henrik Frystyk Nielsen, Hypertext Transfer Protocol -- HTTP/1.0, RFC 1945, May 1996. https://www.rfc-editor.org/rfc/rfc1945
    ● Uses MIME-like messages that also contain meta-information about the enclosed content.
      – Supports the transmission not only of HTML documents but also of any other media type.
    ● Supports multiple methods (GET, HEAD, POST, PUT, DELETE, LINK, UNLINK).
    ● Authentication (basic authentication)
    ● …

History (2)
● HTTP/1.1:
  – Roy T. Fielding, James Gettys, Jeffrey C. Mogul, Henrik Frystyk Nielsen, Tim Berners-Lee, Hypertext Transfer Protocol -- HTTP/1.1, RFC 2068, January 1997. https://www.rfc-editor.org/rfc/rfc2068
    ● New features: persistent connections, content negotiation, more sophisticated caching, range requests, …
  – Roy T. Fielding, James Gettys, Jeffrey C. Mogul, Henrik Frystyk Nielsen, Larry Masinter, Paul J. Leach, Tim Berners-Lee, Hypertext Transfer Protocol -- HTTP/1.1, RFC 2616, June 1999. https://www.rfc-editor.org/rfc/rfc2616
    ● An update to RFC 2068.

Current Standard
● Roy T. Fielding (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, RFC 7230, June 2014. https://www.rfc-editor.org/rfc/rfc7230
● Roy T. Fielding (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, RFC 7231, June 2014. https://www.rfc-editor.org/rfc/rfc7231
● Roy T. Fielding (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests, RFC 7232, June 2014. https://www.rfc-editor.org/rfc/rfc7232
● Roy T. Fielding (ed.), Yves Lafon (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Range Requests, RFC 7233, June 2014. https://www.rfc-editor.org/rfc/rfc7233
● Roy T. Fielding (ed.), Mark Nottingham (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Caching, RFC 7234, June 2014. https://www.rfc-editor.org/rfc/rfc7234
● Roy T. Fielding (ed.), Julian F. Reschke (ed.), Hypertext Transfer Protocol (HTTP/1.1): Authentication, RFC 7235, June 2014. https://www.rfc-editor.org/rfc/rfc7235

Secure HTTP
● Eric Rescorla, HTTP Over TLS, RFC 2818, May 2000. https://www.rfc-editor.org/rfc/rfc2818
  – Originally, this specification defined the https URI scheme, which is now defined by RFC 7230.
● Rohit Khare, Scott Lawrence, Upgrading to TLS Within HTTP/1.1, RFC 2817, May 2000. https://www.rfc-editor.org/rfc/rfc2817
● Eric Rescorla, The Transport Layer Security (TLS) Protocol Version 1.3, RFC 8446, August 2018. https://www.rfc-editor.org/rfc/rfc8446

HTTP/2
● The next major version of HTTP after HTTP/1.1.
● Web page: https://http2.github.io/
● Specifications:
  – Mike Belshe, Roberto Peon, Martin Thomson (ed.), Hypertext Transfer Protocol Version 2 (HTTP/2), RFC 7540, May 2015. https://www.rfc-editor.org/rfc/rfc7540
  – Roberto Peon, Herve Ruellan, HPACK: Header Compression for HTTP/2, RFC 7541, May 2015. https://www.rfc-editor.org/rfc/rfc7541

Sessions
● A session is a sequence of requests and responses between a client and a server.
● HTTP is stateless by nature and does not provide support for session management.
● Session management can be implemented with the help of cookies.

How it Works
Request:
  GET /index.html HTTP/1.1
  User-Agent: Browser
  Host: www.example.com
  Accept: */*
Response:
  HTTP/1.1 200 OK
  Date: Fri, 23 Aug 2019 13:15:42 GMT
  Content-Type: text/html
  Content-Length: 1024

  <!DOCTYPE html>
  <html>
  <head>
  <title>Hello, world!</title>
  ...

curl
● Command line tool (curl) and library (libcurl) for transferring data that supports a number of protocols. https://curl.se/ https://github.com/curl/curl
  – Written in: C
  – Platform: Linux, macOS, Windows, …
  – License: X11 License
● Supported protocols: FTP, HTTP, HTTPS, SCP, SFTP, …

HTTPie
● Command line HTTP client. https://httpie.io/ https://github.com/jakubroztocil/httpie/
  – Written in: Python
  – Platform: Linux, macOS, Windows
  – License: New BSD License

Web Developer Tools
● Chromium, Google Chrome, Opera:
  – Chrome Developer Tools (DevTools) https://developer.chrome.com/devtools
● Firefox:
  – Firefox Developer Tools https://developer.mozilla.org/docs/Tools
● Safari:
  – Web Development Tools https://developer.apple.com/safari/tools/ https://support.apple.com/guide/safari-developer
● Chromium-based Edge:
  – Microsoft Edge (Chromium) Developer Tools https://docs.microsoft.com/en-us/microsoft-edge/devtools-guide-chromium

Further Tools
● Postman https://www.postman.com/
  – Available as a native application.
    ● Platform: macOS, Linux, Windows
    ● License: non-free
    ● Further information: Postman Learning Center https://learning.postman.com/

Terminology (1)
● Resource: The target of an HTTP request, identified by a URI.
● Representation:
  – Information that is intended to reflect a past, current, or desired state of a given resource.
  – Can be readily communicated via the protocol.
  – Consists of a set of representation metadata and a potentially unbounded stream of representation data.
● Content negotiation:
  – An origin server might be provided with, or be capable of generating, multiple representations that are each intended to reflect the current state of a target resource.
  – Content negotiation is a mechanism for selecting the most appropriate representation for a given request.
  – This representation is called the selected representation.

Terminology (2)
● Message: The basic unit of HTTP communication.
● Payload:
  – A representation transmitted in a message.

Terminology (3)
● The terms client and server refer only to the roles that programs perform for a particular connection. The same program might act as a client on some connections and a server on others.
  – Client: A program that establishes a connection to a server for the purpose of sending one or more HTTP requests.
  – Server: A program that accepts connections in order to service HTTP requests by sending HTTP responses.

Terminology (4)
● User agent: A client program that initiates an HTTP request.
  – E.g., web browser, web crawler, command line tool (curl, wget), custom application, …
● Origin server: A program that can originate authoritative responses for a given target resource.
● Sender/recipient: A program that sends or receives a given message, respectively.

Terminology (5)
● Intermediary: Allows requests to be satisfied through a chain of connections.
  – There are three types of intermediaries: proxy, gateway, tunnel.

Intermediaries
(Figure: a chain of connections from the user agent through two intermediaries to the origin server.)

Implementation Diversity
● Both user agents and origin servers can be of many kinds.
  – User agents: general-purpose browsers, household appliances, entertainment devices, command line tools, mobile apps, …
  – Origin servers: web servers, configurable networking components, office machines, autonomous robots, traffic cameras, …

The http and https URI Schemes (1)
● Defined for the purpose of identifying resources on a potential origin server listening for connections on a given TCP port.
  – https uses a TLS-secured connection for communication.
● Syntax:
  – 'http://' host [':' port] [path] ['?' query]
    ● If the port subcomponent is not given, TCP port 80 is the default.
  – 'https://' host [':' port] [path] ['?' query]
    ● If the port subcomponent is not given, TCP port 443 is the default.
● The path must start with a '/' character or must be empty.

The http and https URI Schemes (2)
● The origin server for a URI is identified by the host component and the optional port component.
  – The path and the optional query component identify a potential target resource within that origin server's namespace.
● Note that the presence of a URI does not imply that there is always an HTTP server listening for connections on the given host and port.

The http and https URI Schemes (3)
● URI comparison:
  – An empty path component is equivalent to a path of '/'.
  – The scheme and host components are case-insensitive and normally provided in lowercase. All other components are compared in a case-sensitive manner.
  – Characters other than those in the “reserved” set are equivalent to their percent-encoded octets.
● For example, the URIs within each of the following groups are equivalent:
  – http://www.inf.unideb.hu/, http://www.inf.unideb.hu:80/, http://www.inf.unideb.hu, http://www.inf.unideb.hu:80
  – http://www.inf.unideb.hu/~jeszy/, http://www.inf.unideb.hu/%7Ejeszy/, HTTP://www.INF.UNIDEB.hu/~jeszy/

Messages
● There are two types of messages:
  – Request
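
The URI comparison rules given for the http and https schemes can be illustrated with a small normalization routine: two URIs are treated as equivalent when their normalized forms are equal. This is only a minimal sketch under stated assumptions, not a complete implementation of URI normalization. The function names `normalize_http_uri` and `_decode_unreserved` are hypothetical, any userinfo component is dropped, and percent-decoding is limited to the "unreserved" characters (letters, digits, '-', '.', '_', '~'), which is a conservative reading of the third rule.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Default TCP ports of the two schemes, as stated on the slides.
DEFAULT_PORTS = {"http": 80, "https": 443}

# Assumption: only these "unreserved" characters are decoded from %XX form.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def _decode_unreserved(text):
    """Replace %XX escapes that stand for an unreserved character."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        # Leave every other escape exactly as it was written.
        return char if char in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def normalize_http_uri(uri):
    """Return a normalized form of an http or https URI (hypothetical helper)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()            # the scheme is case-insensitive
    if scheme not in DEFAULT_PORTS:
        raise ValueError("not an http or https URI: " + uri)
    host = (parts.hostname or "").lower()    # the host is case-insensitive
    if ":" in host:                          # restore brackets of IPv6 literals
        host = "[" + host + "]"
    port = parts.port
    # An explicit default port is equivalent to an omitted port.
    if port is None or port == DEFAULT_PORTS[scheme]:
        netloc = host
    else:
        netloc = "{}:{}".format(host, port)
    # An empty path is equivalent to '/'.
    path = _decode_unreserved(parts.path) or "/"
    query = _decode_unreserved(parts.query)
    return urlunsplit((scheme, netloc, path, query, ""))


if __name__ == "__main__":
    group = [
        "http://www.inf.unideb.hu/~jeszy/",
        "http://www.inf.unideb.hu/%7Ejeszy/",
        "HTTP://www.INF.UNIDEB.hu/~jeszy/",
    ]
    for uri in group:
        print(uri, "->", normalize_http_uri(uri))
```

Comparing normalized strings keeps the equivalence test a plain equality check. The path and query are never case-folded, because the slides state that all components other than the scheme and host are compared case-sensitively.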