Content Distribution Networks
COS 418: Distributed Systems
Lecture 24
Kyle Jamieson
[Selected content adapted from M. Freedman, B. Maggs and S. Shenker]

Today
1. Domain Name System (DNS) primer
2. The Web: HTTP, hosting, and caching
3. Content distribution networks (CDNs)

DNS hostname versus IP address
• DNS host name (e.g. www.cs.princeton.edu)
  – Mnemonic name appreciated by humans
  – Variable length, full alphabet of characters
  – Provides little (if any) information about location
• IP address (e.g. 128.112.136.35)
  – Numerical address appreciated by routers
  – Fixed length, decimal number
  – Hierarchical address space, related to host location

Many uses of DNS
• Hostname to IP address translation
  – IP address to hostname translation (reverse lookup)
• Host name aliasing: other DNS names for a host
  – Alias host names point to canonical hostname
• Email: Lookup domain’s mail server by domain name

Original design of the DNS
• Per-host file named /etc/hosts
  – Flat namespace: each line = IP address & DNS name
  – SRI (Menlo Park, California) kept the master copy
  – Everyone else downloads regularly
• But, a single server doesn’t scale
  – Traffic implosion (lookups and updates)
  – Single point of failure
• Need a distributed, hierarchical collection of servers

DNS: Goals and non-goals
• A wide-area distributed database
• Goals:
  – Scalability; decentralized maintenance
  – Robustness
  – Global scope
    • Names mean the same thing everywhere
  – Distributed updates/queries
  – Good performance
• But don’t need strong consistency properties

Domain Name System (DNS)
• Hierarchical name space divided into contiguous sections called zones
  – Zones are distributed over a collection of DNS servers
• Hierarchy of DNS servers:
  – Root servers (identity hardwired into other servers)
  – Top-level domain (TLD) servers
  – Authoritative DNS servers
• Performing the translations:
  – Local DNS servers located near clients
  – Resolver software running on clients

The DNS namespace is hierarchical
[Figure: namespace tree. Root: "."; TLDs: com., gov., edu.; below the TLDs: fcc.gov., princeton.edu., nyu.edu.; below princeton.edu.: cs.princeton.edu.]
• Hierarchy of namespace matches hierarchy of servers
• Set of nameservers answers queries for names within zone
• Nameservers store names and links to other servers in tree

DNS root nameservers
• 13 root servers. Does this scale?
• Each server is really a cluster of servers (some geographically distributed), replicated via IP anycast
  A  Verisign, Dulles, VA
  B  USC-ISI, Marina del Rey, CA
  C  Cogent, Herndon, VA (also Los Angeles, NY, Chicago)
  D  U Maryland, College Park, MD
  E  NASA, Mt View, CA
  F  Internet Software Consortium, Palo Alto, CA (and 37 other locations)
  G  US DoD, Vienna, VA
  H  ARL, Aberdeen, MD
  I  Autonomica, Stockholm (plus 29 other locations)
  J  Verisign (21 locations)
  K  RIPE, London (plus 16 other locations)
  L  ICANN, Los Angeles, CA
  M  WIDE, Tokyo (plus Seoul, Paris, San Francisco)

TLD and Authoritative Servers
• Top-level domain (TLD) servers
  – Responsible for com, org, net, edu, etc, and all top-level country domains: uk, fr, ca, jp
  – Network Solutions maintains servers for com TLD
  – Educause non-profit for edu TLD
• Authoritative DNS servers
  – An organization’s DNS servers, providing authoritative information for that organization
  – May be maintained by organization itself, or ISP

Local name servers
• Do not strictly belong to hierarchy
• Each ISP (or company, or university) has one
  – Also called default or caching name server
• When host makes DNS query, query is sent to its local DNS server
  – Acts as proxy, forwards query into hierarchy
  – Does work for the client
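
The way the namespace hierarchy lines up with the server hierarchy can be illustrated with a minimal sketch. It is not how any real resolver is implemented: the function names `ancestor_zones` and `responsible_zone` and the `ZONE_SERVERS` table are hypothetical, and the table simply reuses names from the namespace tree above.

```python
# Toy illustration of the hierarchical DNS namespace.
# ZONE_SERVERS is a hypothetical delegation table, not real DNS data.
ZONE_SERVERS = {
    ".": "root servers",
    "edu.": "edu TLD servers",
    "gov.": "gov TLD servers",
    "princeton.edu.": "princeton.edu authoritative servers",
}


def ancestor_zones(name: str) -> list[str]:
    """Return every domain from the root down to `name`, most general first."""
    labels = [label for label in name.lower().rstrip(".").split(".") if label]
    chain = ["."]
    # Build suffixes from the right: "edu.", then "princeton.edu.", and so on.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


def responsible_zone(name: str, zones: dict[str, str] = ZONE_SERVERS) -> str:
    """Return the deepest zone in `zones` whose subtree contains `name`."""
    for candidate in reversed(ancestor_zones(name)):
        if candidate in zones:
            return candidate
    raise LookupError(f"no zone covers {name!r}")


if __name__ == "__main__":
    print(ancestor_zones("www.cs.princeton.edu"))
    zone = responsible_zone("www.cs.princeton.edu")
    print(zone, "is served by", ZONE_SERVERS[zone])
```

In this toy table cs.princeton.edu. has no servers of its own, so the name falls inside the princeton.edu. zone; adding an entry for cs.princeton.edu. would model a further delegation.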
DNS resource records
• DNS is a distributed database storing resource records
• Resource record includes: (name, type, value, time-to-live)
• Type = A (address)
  – name = hostname
  – value is IP address
• Type = CNAME
  – name = alias for some “canonical” (real) name
  – value is canonical name
• Type = NS (name server)
  – name = domain (e.g. princeton.edu)
  – value is hostname of authoritative name server for this domain
• Type = MX (mail exchange)
  – name = domain
  – value is name of mail server for that domain

DNS in operation
• Most queries and responses are UDP datagrams
  – Two types of queries:
• Recursive: Nameserver responds with answer or error
  Client to nameserver: www.princeton.edu?
  Nameserver to client: Answer: www.princeton.edu A 140.180.223.42
• Iterative: Nameserver may respond with a referral
  Client to nameserver: www.princeton.edu?
  Nameserver to client: Referral: .edu NS a.edu-servers.net.

A recursive DNS lookup
[Figure: the client sends www.princeton.edu? to its local nameserver, which asks each authority below in turn.]
• . (root) authority, 198.41.0.4
  – Records: edu.: NS 192.5.6.30; com.: NS 158.38.8.133; io.: NS 156.154.100.3
  – Reply to www.princeton.edu?: Contact 192.5.6.30 for edu.
• edu. authority, 192.5.6.30
  – Records: princeton.edu.: NS 66.28.0.14; pedantic.edu.: NS 19.31.1.1
  – Reply to www.princeton.edu?: Contact 66.28.0.14 for princeton.edu.
• princeton.edu. authority, 66.28.0.14
  – Records: www.princeton.edu.: A 140.180.223.42
  – Reply to www.princeton.edu?: www.princeton.edu A 140.180.223.42
• Local nameserver
  – Records: . (root): NS 198.41.0.4; edu.: NS 192.5.6.30; princeton.edu.: NS 66.28.0.14; www.princeton.edu.: A 140.180.223.42

Recursive versus iterative queries
• Recursive query
  – Less burden on entity initiating the query
  – More burden on nameserver (has to return an answer to the query)
  – Most root and TLD servers won’t answer (shed load)
  – Local name server answers recursive query
• Iterative query
  – More burden on query initiator
  – Less burden on nameserver (simply refers the query to another server)

$ dig @a.root-servers.net www.freebsd.org +norecurse
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57494
;; QUERY: 1, ANSWER: 0, AUTHORITY: 2, ADDITIONAL: 2
;; QUESTION SECTION:
;www.freebsd.org.           IN  A
;; AUTHORITY SECTION:
org.                172800  IN  NS  b0.org.afilias-nst.org.
org.                172800  IN  NS  d0.org.afilias-nst.org.
;; ADDITIONAL SECTION:
b0.org.afilias-nst.org.  172800  IN  A  199.19.54.1
d0.org.afilias-nst.org.  172800  IN  A  199.19.57.1
Glue records: the A records in the ADDITIONAL SECTION
[Output edited for clarity]

(authoritative for org.)
$ dig @199.19.54.1 www.freebsd.org +norecurse
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39912
;; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 0
;; QUESTION SECTION:
;www.freebsd.org.           IN  A
;; AUTHORITY SECTION:
freebsd.org.        86400   IN  NS  ns1.isc-sns.net.
freebsd.org.        86400   IN  NS  ns2.isc-sns.com.
freebsd.org.        86400   IN  NS  ns3.isc-sns.info.
[Output edited for clarity]

(authoritative for freebsd.org.)
$ dig @ns1.isc-sns.net www.freebsd.org +norecurse
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17037
;; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 3
;; QUESTION SECTION:
;www.freebsd.org.           IN  A
;; ANSWER SECTION:
www.freebsd.org.    3600    IN  A   69.147.83.33
;; AUTHORITY SECTION:
freebsd.org.        3600    IN  NS  ns2.isc-sns.com.
freebsd.org.        3600    IN  NS  ns1.isc-sns.net.
freebsd.org.        3600    IN  NS  ns3.isc-sns.info.
;; ADDITIONAL SECTION:
ns1.isc-sns.net.    3600    IN  A   72.52.71.1
ns2.isc-sns.com.    3600    IN  A   38.103.2.1
ns3.isc-sns.info.   3600    IN  A   63.243.194.1
[Output edited for clarity]

DNS caching
• Performing all these queries takes time
  – And all this before actual communication takes place
• Caching can greatly reduce overhead
  – The top-level servers very rarely change
• Popular sites visited often
  – Local DNS server often has the information cached
• How DNS caching works
  – All DNS servers cache responses to queries
  – Responses include a time-to-live (TTL) field
• Server deletes cached entry after TTL expires
Plays a key role in CDN (Akamai) load balancing

Today
1. Domain Name System (DNS) primer
2. The Web: HTTP, hosting, and caching
3. Content distribution networks (CDNs)

Anatomy of an HTTP/1.0 web page fetch
• Web page = HTML file + embedded images/objects
• Stop-and-wait at the granularity of objects:
  – Close then open new TCP connection for each object
• Incurs a TCP round-trip-time delay each time
  – Each TCP connection may stay in “slow start”
[Figure: client/server timeline fetching the web page, then its objects]

HTTP/1.0 webpage fetch: Timeline
[Figure: bytes received versus time (milliseconds), showing object downloads and the HTTP/1.0 finish]
• Fetch 8.5 Kbyte page with 10 objects, most < 10 Kbyte

Letting the TCP connection persist
• Known as HTTP keepalive
• Still stop-and-wait at the granularity of objects, at the application layer
  – HTTP response fully received before next HTTP GET dispatched
• ≥ 1 RTT per object
[Figure: client/server timeline]

HTTP Keepalive avoids TCP slow starts
[Figure: bytes received versus time (milliseconds), showing the HTTP/1.0 finish and the Keep-alive finish]
Incur one slow start, but stop-and-wait to issue next request

Pipelining within HTTP
• Idea: Pipeline HTTP GETs and their responses
• Main benefits:
  1. Amortizes the RTT across multiple objects retrieved
  2. Reduces overhead of HTTP requests, packing multiple requests into one packet
• Implemented in HTTP/1.1
[Figure: client/server timeline]

Pipelined HTTP requests overlap RTTs
[Figure: bytes received versus time (milliseconds), showing the HTTP/1.0 finish, the Keep-alive finish, and the HTTP/1.1 finish]
• Many HTTP requests and TCP connections at once
• Overlaps RTTs of all requests

Today
1. Domain Name System (DNS) primer
2. The Web: HTTP, hosting, and caching
  – Handling heavy loads
3. Content distribution networks (CDNs)

Hosting: Multiple machines per site
• Problem: Overloaded popular web site
  – Replicate the site across multiple machines
• Helps to handle the load
• Want to direct client to a particular replica.

Hosting: Load-balancer approach
• Solution #2: Single IP address, multiple machines
  – Run multiple machines behind a single IP address