Link Prefetching in Mozilla: a Server-Driven Approach ∗
Total Page:16
File Type:pdf, Size:1020Kb
Link Prefetching in Mozilla: A Server-Driven Approach ∗ Darin Fisher Gagan Saksena [email protected] [email protected] Abstract siderably between products. For example, some of these products analyze the contents of pages or the This paper provides a synopsis of a server-driven user's browsing history to predict what the user link prefetching mechanism that we have designed might visit next. Indeed, a large body of research and implemented for the Mozilla web browser, a covering a variety of approaches to prefetching exist popular Open Source web browser. The mecha- on this topic [10, 11, 12, 13, 15, 16]. Other appli- nism depends on the origin server or an interme- cations rely on some cooperation between browser diate proxy server determining the best set of doc- and server. uments for the browser to prefetch. The browser follows prefetch directives provided by the server, In addition, content authors have devised ad-hoc either embedded in an HTML document using the JavaScript-based techniques to prefetch content. <LINK> tag or specified via Link HTTP response These generally involve loading content into hid- headers. The browser determines when best to pre- den <IMG> or <IFRAME> HTML elements so as to fetch the specified URLs based on its own heuristics. pre-populate the browser's document cache. While In this paper, we describe the mechanism and dis- these JavaScript-based approaches have the bene- cuss some of the practical issues that impacted its fit of working with most browsers, they often re- design and implementation. sult in complex JavaScript code that is difficult to maintain. With such techniques, the browser is not 1 Introduction able to prioritize normal requests above prefetch re- quests. As a result prefetch requests may end up Web caching is a common optimization used by web competing with normal requests for network band- browsers to reduce latency for the user when they width. Additionally, JavaScript-based approaches re-visit a web page. For narrow-band Internet con- afford the server little opportunity to control traf- nections, the performance gain can be dramatic. It fic and limit server load resulting from prefetch re- is of interest therefore to consider mechanisms that quests. would enable the web browser to pre-populate its cache of documents, such that when the user ini- In this paper we describe a link prefetching mech- tially views a web page, the latency will be nearly anism that involves modifications to both the reduced to that of loading the web page from the browser and the server. It is designed to pro- local cache. This technology is commonly referred vide a solution to prefetching that is \friendlier" to as prefetching and has been shown to be effective to both client and server, specifically addressing in reducing latency in the Web [14]. some of the shortcomings of existing ad-hoc meth- ods. The browser follows special directives from Despite the potential benefit of prefetching, it has the web server (or proxy server) that instruct it to not been widely deployed at this time. In addi- prefetch specific documents. The server can inject tion to the family of Mozilla-based browsers, only these directives into an HTML document or into a few production web browsers (e.g., WebTV and the HTTP response headers. The browser responds iCab) and personal web proxies (e.g., NetSonic Pro, to the prefetch directives after it finishes rendering Naviscope, and Wcol) support some form of link the document (or HTTP response) containing the prefetching. The methods of prefetching vary con- directives. This mechanism allows servers to control precisely what is prefetched by the browser, and it ∗This work was performed at Netscape/AOL during 2002. allows the browser to determine when best to pre- [The \next" link relation type] refers to the fetch documents based on local network inactivity, next document in a linear sequence of doc- browser idle time, etc. uments. User agents may choose to pre-load the \next" document, to reduce the perceived We chose a server-driven approach to link prefetch- load time. ing because we believe that servers (including in- termediate proxy servers) can better predict what We have implemented this mechanism in the users are likely to visit next. Since pages often con- Mozilla browser along with some variations on this tain numerous hyperlinks that the user is unlikely same approach, which are described below. The to follow, it is generally difficult for the browser to WebTV browser also follows this recommendation determine the best subset of hyperlinks to prefetch. [1]. (Our prior approach involved extending all For example, an origin server or content author may linked tags with a prefetch marker attribute [2] know that the user is browsing through a series of and using pathfiles [3] to prefetch dynamic content. large images or documents (e.g., an image slide- That technique had its limitations and has since show or pages in a book) in which case it would be evolved into our current implementation.) advantageous for the browser to prefetch the next image or document in the series. Or, while the user Motivated by the fact that the \next" relation is is confronted with a login prompt, the site might intended to represent the \next" document in a like to have the browser prefetch the portions of the linear sequence of documents, and given that web site that do not require user authentication. In such pages should be able to specify multiple documents cases there is a static relationship between the pre- to be prefetched, we chose to have Mozilla addi- vious and the next document, and static prefetch tionally recognize the link relation type \prefetch" directives can be encoded into the documents to in- to indicate a document that should be prefetched. dicate what should be prefetched next. In addition This link relation type currently is not standard- to static prefetch directives, a server might dynami- ized; however, the XHTML 2.0 draft specification cally inject prefetch directives into outbound HTTP [9] declares the \prefetch" link relation type. responses for the benefit of a link prefetching en- Examples of these two types of link prefetching abled browser. For example, the server may want HTML directives follows: to instruct the browser to prefetch the most popular URLs on a given day. Because this technique works <LINK rel="next" href="next.html"> with HTTP headers, a proxy can very well generate <LINK rel="prefetch" href="big.jpg"> such dynamically determined directives for popular Suppose next.html contains an <IMG> tag, which links. would cause big.jpg to be loaded. These tags What follows is a description of the link prefetching might be found in a document in which next.html mechanism along with a number of practical issues is likely to be the next document visited by the user. that shaped our design and implementation of link As a result of these tags, the browser would silently prefetching for the Mozilla browser, a popular Open prefetch the two documents while the user is per- Source web browser. We added our implementation haps busy reading through the current document. to version 1.2 and 1.0.2 of the Mozilla browser. It Link prefetching directives specified solely in the is also included in Netscape 7.01 and other recent body of a document have limited utility in solving Mozilla-based browsers. problems in which content to be prefetched might be 2 Prefetching Directives dynamically determined by the origin server or an intermediate proxy server. To support such config- The HTML 4.01 specification [7] briefly mentions urations, Mozilla additionally recognizes link pre- an opportunity for browsers to prefetch URLs indi- fetching directives specified via the Link HTTP cated by content authors via the <LINK> tag. The header. The Link HTTP header appears in sec- description of the \next" link relation type (from tion 19.6.2.4 of RFC 2068 [6]; however, the updated section 6.12 of [7]) says the following: version of the HTTP 1.1 standard, RFC 2616 [8] does not define it. Use of this header, for example to enable servers to apply style sheets to a group page, directs the browser to load something else. of documents, is described in section 14.6 of the However, this approach ignores the problem of mul- HTML 4.01 specification [7]. The link prefetching tiple applications on the user's system competing example from above could be equivalently written for network bandwidth. To minimize this problem as follows: the Mozilla browser prefetches documents sequen- Link: <next.html>; rel="next" tially, deferring the next prefetch until after the Link: <big.jpg>; rel="prefetch" former has completed. While this is not an ideal solution, it minimizes the problem by reducing the Note that as with many HTTP response head- prefetching application's demand on the operating ers, the Link header may also appear in a <META> system's networking subsystem. HTML tag via the http-equiv attribute. This is described in section 7.4.4 of the HTML 4.01 speci- Given that web pages are commonly composed fication [7]. of multiple HTML frames, the browser may see link prefetching directives from one or more of the We recommend the HTTP-level Link header mech- frames. Therefore, the Mozilla browser does not anism above the other mechanisms since it allows begin prefetching until all frames, and any in-line proxy servers to more easily intercept prefetch di- content they contain, have finished loading. This rectives and, for example, process them on behalf approximately corresponds to the time at which the of a browser that does not support link prefetch- DOM's onLoad event handler for the HTML <BODY> ing.