Adaptive Website Design Using Caching Algorithms


Justin Brickell
Department of Computer Sciences, University of Texas at Austin, Austin, TX, USA

Inderjit S. Dhillon
Department of Computer Sciences, University of Texas at Austin, Austin, TX, USA

Dharmendra S. Modha
IBM Almaden Research Center, San Jose, CA, USA

ABSTRACT

Visitors enter a website through a variety of means, including web searches, links from other sites, and personal bookmarks. In some cases the first page loaded satisfies the visitor’s needs and no additional navigation is necessary. In other cases, however, the visitor is better served by content located elsewhere on the site found by navigating links. If the path between a user’s current location and his eventual goal is circuitous, then the user may never reach that goal or will have to exert considerable effort to reach it. By mining site access logs, we can draw conclusions of the form “users who load page p are likely to later load page q.” If there is no direct link from p to q, then it would be advantageous to provide one. The process of providing links to users’ eventual goals while skipping over the in-between pages is called shortcutting. Existing algorithms for shortcutting require substantial offline training, which make them unable to adapt when access patterns change between training sessions. We present improved online algorithms for shortcut link selection that are based on a novel analogy drawn between shortcutting and caching. In the same way that cache algorithms predict which memory pages will be accessed in the future, our algorithms predict which web pages will be accessed in the future. Our algorithms are very efficient and are able to consider accesses over a long period of time, but give extra weight to recent accesses. Our experiments show significant improvement in the utility of shortcut links selected by our algorithm as compared to those selected by existing algorithms.

Categories and Subject Descriptors

I.5.2 [Pattern Recognition]: Design Methodology; H.2.8 [Database Management]: Database Applications - Data Mining

General Terms

Algorithms, Experimentation

Keywords

Data Mining, Adaptive Web Sites, Caching, Pattern Mining, Access Logs, Shortcutting

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. WEBKDD’06, August 20, 2006, Philadelphia, Pennsylvania, USA. Copyright 2006 ACM 1-59593-444-8 ...$5.00.

1. INTRODUCTION

As websites increase in complexity, they run headfirst into a fundamental tradeoff: the more information that is available on the website, the more difficult it is for visitors to pinpoint the specific information that they are looking for. A well-designed website limits the impact of this tradeoff, so that even if the amount of information is increased significantly, locating that information becomes only marginally more difficult. Typically, site designers ease information overload by organizing the site content into a hierarchy of topics, and then providing a navigational tree that allows visitors to descend into the hierarchy and find the information they are looking for. In their paper on adaptive website design [12], Perkowitz and Etzioni describe these static, master-designed websites as “fossils cast in HTML.” They claim that a site designer’s a priori expectations for how a site will be used and navigated are likely to inaccurately reflect actual usage patterns, especially as the site adds new content over time. As it is infeasible for even the most dedicated site designer to understand the goals and access patterns of all site visitors, Perkowitz and Etzioni proposed building websites that mine their own access logs in order to automatically determine helpful self-modifications.

One example of a helpful modification is shortcutting, in which links are added between unlinked pages in order to allow visitors to reach their intended destinations with fewer clicks. Typically a limit N is imposed on the maximum number of outgoing shortcuts on any one particular page. The shortcutting problem can then be thought of as an optimization problem to choose the N shortcuts per page that minimize the number of clicks needed for future visitors to reach their goal pages. These shortcuts may be modified at any time based on past accesses in order to account for anticipated changes in the access patterns of future visitors. Finding an optimal solution to this problem would require an exact knowledge of the future and a precise way of determining each user’s goal. However, shortcutting algorithms must provide shortcuts in an on-line framework, so the shortcuts must be chosen without knowledge of future accesses. Rather than solving the optimization problem exactly, shortcutting algorithms use heuristics and analyze past accesses in order to provide good shortcuts.

In this paper, we draw a novel analogy between shortcutting algorithms, which maintain an active set of shortcuts on each page, and caching algorithms, which maintain an active set of items in cache. The goal of caching algorithms (maximizing the fraction of future accesses for items in the active set) is analogous to the goal of shortcutting algorithms. The main contribution of this paper is the CacheCut algorithm for shortcutting. By using replacement policies developed for caching applications, CacheCut is able to run with less memory than other shortcutting algorithms, while producing better results. A second contribution is the FrontCache algorithm, which uses the same caching techniques in order to select pages to promote on the front page with direct links.

The remainder of this paper is organized as follows. In Section 2 we discuss related work in adaptive website problems. In Section 3 we give definitions for terms that are used throughout the remainder of the paper. Section 4 gives a formulation of the shortcutting problem and presents two shortcutting algorithms from existing literature. In Section 5 we detail our CacheCut algorithm for shortcutting, and also describe the FrontCache algorithm for promoting pages with links on the front page. Section 6 describes our experimental setup and the results of our experiments. Finally, in Section 7, we offer some concluding thoughts and suggest directions for future work.

2. RELATED WORK

Perkowitz and Etzioni [11] issued the original challenge to the AI community to build adaptive web sites that learn visitor access patterns from the access log in order to automatically improve their organization and presentation. Their follow-up paper [12] presented several global adaptations that affect the presentation of the website to all users. One adaptation from their paper is “index page synthesis,” in which new pages are created containing collections of links to related but currently unlinked pages. In his thesis [10], Perkowitz presents the shortcutting problem as a global adaptive problem, in which links are added to each page to ease the browsing experience of all site visitors. Ramakrishnan et al. [13] have also done work in global adaptation; they observe that frustrated users who cannot find the content they are looking for are apt to [...]

[...] tion session. Eirinaki and Vazirgiannis [6] give a survey of the use of web mining for personalization.

Operating at a level between global adaptations and individual adaptations are group adaptations. The mixture-model variants of the MinPath algorithm [1] are examples of group-based shortcutting algorithms. When suggesting shortcuts to a website visitor, they first classify that visitor based on browsing behavior, and then provide shortcuts that are thought to be useful to that class of visitors. Classifying users requires examining the “trails” or “clickstreams” in the access log, which are the sequences of pages accessed by individual visitors. Other researchers have investigated trails without the intention of adapting a website. Banerjee and Ghosh [3] use trails to cluster users. Cooley et al. [5] discover association rules to find correlations such as “60% of clients who accessed page A also accessed page B.” Yang et al. [15] conduct temporal event prediction, in which they also estimate when the client is likely to access B.

Our work follows the global model of shortcutting seen in [10], in which shortcutting is viewed as a global adaptation that adds links to each page that are the same for every visitor. Like Perkowitz’ algorithm, when choosing shortcuts for a page p we pay close attention to the number of times other pages q are accessed after p within a trail; however, our algorithm provides improvements in the form of reduced memory requirements and higher-quality shortcuts. A related work by Yang and Zhang [16] sought to create an improved replacement policy for website caching by analyzing the access log. In contrast, our work uses existing caching policies to create an improved website.

3. DEFINITIONS

Before describing CacheCut and other shortcutting algorithms from the literature, we provide some definitions that we will use throughout the paper.

Site Graph. The site graph of a website with n unique pages is a directed n-node graph $G = (V, E)$ where $e_{pq} \in E$ if and only if there is a link from page p to page q.

Shortcut. A shortcut is a directed connection between web pages p and q that were not linked in the original site graph, i.e., $e_{pq} \notin E$.

Shortcut Set.
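
As an illustration of the site graph and shortcut definitions, and of the idea from Section 2 that shortcut selection depends on how often a page q is accessed after a page p within a trail, the following Python sketch counts such later accesses and proposes up to N unlinked pages per source page. It is a simple frequency heuristic, not the CacheCut algorithm, whose details lie outside this excerpt. The function names, the per-trail counting rule, and the alphabetical tie-breaking are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_site_graph(links):
    """Return adjacency sets: graph[p] holds every page q with a link p -> q."""
    graph = defaultdict(set)
    for p, q in links:
        graph[p].add(q)
    return graph


def is_shortcut(graph, p, q):
    """A shortcut p -> q joins two distinct pages with no edge e_pq in the site graph."""
    return p != q and q not in graph.get(p, set())


def count_later_accesses(trails):
    """counts[p][q] = number of trails in which q is loaded at some point after p.

    Assumption: each (p, q) pair is counted at most once per trail.
    """
    counts = defaultdict(Counter)
    for trail in trails:
        seen_pairs = set()
        for i, p in enumerate(trail):
            for q in trail[i + 1:]:
                if p != q and (p, q) not in seen_pairs:
                    seen_pairs.add((p, q))
                    counts[p][q] += 1
    return counts


def top_shortcuts(graph, trails, n):
    """For each page p, pick up to n unlinked pages q most often loaded after p."""
    counts = count_later_accesses(trails)
    result = {}
    for p, later in counts.items():
        candidates = [(q, c) for q, c in later.items() if is_shortcut(graph, p, q)]
        # Highest count first; ties broken alphabetically (an arbitrary choice).
        candidates.sort(key=lambda qc: (-qc[1], qc[0]))
        result[p] = [q for q, _ in candidates[:n]]
    return result
```

Unlike the online setting the paper targets, this sketch rescans all trails on every call and keeps a counter for every page pair, so it shows the selection criterion but says nothing about memory use or adaptation over time.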