USOO8943 043B2

(12) United States Patent (10) Patent No.: US 8,943,043 B2 Lymberopoulos et al. (45) Date of Patent: Jan. 27, 2015

(54) DYNAMIC COMMUNITY-BASED CACHE 2007/0100650 A1* 5/2007 Ramer et al...... 705/1 FORMOBILE SEARCH 2007/014.3255 A1 6/2007 Mathur et al. 2007/0214131 A1 9, 2007 Cucerzan et al. 2008.OOO5695 A1 1/2008 OZZie et al. (75) Inventors: Dimitrios Lymberopoulos, Bellevue, 2008.007 1776 A1 3/2008 Cho et al. WA (US); Emmanouil Koukoumidis, 2008, 0071988 A1 3/2008 Schloter et al. 2008, 0207182 A1 8/2008 Maharajh et al. Athens (GR); Jie Liu, Medina, WA 2008/0243776 A1 10, 2008 Brunner et al. (US); Feng Zhao, Issaquah, WA (US); 2008/0270379 A1 10, 2008 Ramakrishna Douglas Christopher Burger, 2009,0287684 A1 11/2009 Bennett Redmond, WA (US) OTHER PUBLICATIONS (73) Assignee: Microsoft Corporation, Redmond, WA Ku et al., “Location-Based Spatial Query Processing in Wireless (US) Broadcast Environments', IEEE Transactions on Mobile Comput (*) Notice: Subject to any disclaimer, the term of this ing, vol. 7, No. 6, Jun. 2008.* patent is extended or adjusted under 35 (Continued) U.S.C. 154(b) by 378 days. Primary Examiner — James Trujillo (21) Appl. No.: 12/692,626 Assistant Examiner — Mohsen Almani (74) Attorney, Agent, or Firm — Steve Wight; Judy Yee: (22) Filed: Jan. 24, 2010 Micky Minhas (65) Prior Publication Data (57) ABSTRACT US 2011 FO184936A1 Jul. 28, 2011 A “Community-Based Mobile Search Cache' provides vari ous techniques for maximizing the number of query results (51) Int. Cl. served from a local “query cache', thereby significantly lim G06F I7/00 (2006.01) iting the need to connect to the Internet or cloud using or G06F 7/30 (2006.01) other wireless links to service search queries. The query (52) U.S. Cl. cache is constructed remotely and downloaded to mobile CPC ...... G06F 17/30902 (2013.01) devices. Contents of the query cache are determined by min USPC ...... 707/721 ing popular queries from mobile search logs, either globally (58) Field of Classification Search or based on queries of one or more groups or Subgroups of USPC ...... 707/767 users. In various embodiments, searching and browsing See application file for complete search history. behaviors of individual users are evaluated to customize the query cache for particular users or user groups. The content of (56) References Cited web pages related to popular queries may also be included in the query cache. This allows cached web pages to be dis U.S. PATENT DOCUMENTS played without first displaying cached search results when a 7,548,915 B2 6, 2009 Ramer et al. corresponding search result has a Sufficiently high click 2003/0225885 A1* 12/2003 Rochberger et al...... TO9,226 through probability. 2006, OOO2320 A1 1/2006 Costa-Requena et al. 2007/0061244 A1 3/2007 Ramer et al. 19 Claims, 7 Drawing Sheets

QUERY CACHE CREATON MOBILE COMMUNITY - ',

w

5 140 QUERYiLINKPAIR

EXTRACTIONMODULE

sorTED UERY. QUERY CACHE LINK, woME CONSTRUCTION Triples MODULE

: INCREMENTAL TRANSMISSION CACHEMODULE UPDATE - MODULE aaaaaaaaaaaaaaaa.

wes say AdveRTISEMENT . ENT --> (ACTRS650s MOBILE ...E.--- -suss sussesssss device (SEE Fig.2) US 8,943,043 B2 Page 2

(56) References Cited kamvarpdf?key1=1378046&key2=8765414621&coll= GUIDE&d=GUIDE&CFID=72436322&CFTOKEN= OTHER PUBLICATIONS 24847593>>, Proceedings of the 9th international conference on Human computer interaction with mobile devices and services, Sep. Hosseini et al., “Mining Search Engine Query Log for Evaluating 9-12, 2007, pp. 8. Content and Structure of a Web Site’, 2007 IEEE/WIC/ACM Inter Kamvar, et al., “Query Suggestions for Mobile Search: Understand national Conference on Web Intelligence.* ing Usage Patterns'. Retrieved at < http://www.esprockets.com/ Ozcan et al., “Space Efficient Caching of Query Results in Search papers/chi2008.pdf>>, CHI 2008, Apr. 5-10, 2008, pp. 4. Engines”, 2008 IEEE.* Kamvar, et al., "Computers and and mobile phones, oh my'. Hopfner, et al., “Towards Trie-Based Query Caching in Mobile Retrieved at << http://www2009.org/proceedings/pdf p801.pdf>. DBS”. Retrieved at << http://subs.emis.de/LNI Proceedings/Pro Proceedings of the 18th international conference on Worldwide web ceedings43/GI-Proceedings.43-11.pdf>>, pp. 106-121. 2009, Apr. 20-24, 2009, pp. 801-810. Said, et al., “Data Prefetching Algorithm in Mobile Environments', Markatos, et al., “A Top-10 Approach to Prefetching on the Web”, Retrieved at < http://www.eurojournals.com/ejsr 28 3 14. Retrieved at < http://www.ics.forth.gr/carv/r-d-activities/ww pdf>>, European Journal of Scientific Research, vol. 28, No. 3, 2009, pp. 478-491. wPerf INET98 prefetch/>>, Technical Report 173, ICS-FORTH, Agarwal, Sachin, “Intelligent Content Caching for Mobile Devices'. Aug. 1996, pp. 1-10. Retrieved at < http://www.deutsche-telekom-laboratories. “Mobile Local Search Ad Revenues to Reach $1.3B by 2013”, def-agarwals/publications/ict2006.pdf>>, Proceedings of The 13th Retrieved at << http://www.marketingcharts.com/interactive/mo International Conference on Telecommunications, May 9-12, 2006, bile-local-search-ad-revenues-to-reach-13b-by-2013>>. The Kelsey pp. 4. Group's Mobile Local Media Forecast (2008-2013), Feb. 25, 2009, Sorrel, Charlie, “2TB Memory Cards Coming Soon'. Retrieved at < pp. 1-2. http://www.wired.com/gadgetlab/2009/01/two-terabyte-sd), Gad "Monsoon Solutions'. Retrieved at < http://www.msoon.com/ get Lab, Hardware News and Review, Jan. 12, 2009, pp. 1. LabEquipment/PowerMonitors), Jan. 22, 2010, pp. 1. Balasubramanian, et al., “Enhancing Interactive Web Applications in Nanopoulos, et al., “A Data Mining Algorithm for Generalized Web Hybrid Networks'. Retrieved at <. Proceedings of the 14th ACM international con stamp.jsp?tp=&isnumber=&arnumber=1232270&tag=1>>, IEEE ference on Mobile computing and networking, Sep. 14-19, 2008, pp. Transactions on Knowledge and Data Engineering, vol. 15, No. 5, 11. Sep./Oct. 2003, pp. 1155-1169. Church, et al., “A Large Scale Study of European Mobile Search Padmanabhan, et al., “Using Predictive Prefetching to Improve Behaviour'. Retrieved at << http://delivery.acm.org/10.1145/ World WideWeb Latency”. Retrieved at <>, ACM 2834414621&col=GUIDE&d=GUIDE&CFID= SIGCOMM Computer Communication Review, vol. 26, No. 3, Jul. 72554973&CFTOKEN=14719436>, Proceedings of the 10th inter 1996, pp. 15. national conference on Human computer interaction with mobile Pitkow, et al., “Mining Longest Repeating Subsequences to Predict devices and services, Sep. 2-5, 2008, pp. 13-22. World Wide Web Surfing'. Retrieved at << https://eprints.kfupm. Church, et al., “Mobile Information Access: A Study of Emerging edusa/50895/1/50895.pdf>, Proceedings of USITS 99: The 2nd Search Behavior on the Mobile Internet”. Retrieved at < http:// USENIX Symposium on Internet Technologies & Systems Boulder, delivery.acm.org/10.1145/1240000/1232726/a4-church. Oct. 11-14, 1999, pp. 13. pdf?key1=1232726&key2=6084414621&col=GUIDE&dl= “Sony Ericsson Xperia Xla '. Retrieved at , ACM www.Sonyericsson.com/x 15, Jan. 22, 2010, pp. 1-2. Transactions on the Web, vol. 1, No. 1, May 2007, pp. 1-38. Teevan, et al. “Information Re-Retrieval: Repeat Queries in Yahoo’s Guha, et al., “Serving Ads from Localhost for Performance, Privacy, Logs”. Retrieved at << http://delivery.acm.org/10. 1145/1280000/ and Profit”. Retrieved at << http://adresearch.mpi-Sws.org/privad 1277770, p151-teevan.pdf?key1=1277770&key2= workshop.pdf>. In Proceedings of Hot Topics in Networking 5285414621&col=GUIDE&d=GUIDE&CFID= (HotNets), Oct. 2009, pp. 6. 74245115&CFTOKEN=57614160>. Proceedings of the 30th Kamvar, et al., “A Large Scale Study of Wireless Search Behavior: annual international ACM SIGIR conference on Research and devel Google Mobile Search”. Retrieved at <, CHI 2006 Proceedings • en-us/um/people?sdumais sigir2005-personalized search.pdf>. Search & Navigation: Mobiles & Audio, Apr. 22-27, 2006, pp. 701 Proceedings of the 28th annual international ACM SIGIR conference 709. on Research and development in information retrieval 2005, Aug. Kamvar, et al., “Deciphering Trends in Mobile Search”. Retrieved at 15-19, 2005, pp. 8. << http://www.maryamkamvar.com/publications/ Yi, et al., “Deciphering Mobile Search Patterns: A Study of Yahoo! KamvarBalujaComputerMagazine.pdf>>, Computer, vol. 40, No. 8, Mobile Search Queries'. Retrieved at << http://www2008.org/pa Aug. 2007, pp. 36-40. pers/pdf fp846-yipdf>>, WWW 2008, Apr. 21-25, 2008, pp. 257 Kamvar, et al., “The Role of Context in Query Input: Using Contex 266. tual Signals to Complete Queries on Mobile Devices'. Retrieved at << http://delivery.acm.org/10.1145/1380000/1378046/p405 * cited by examiner U.S. Patent Jan. 27, 2015 Sheet 1 of 7 US 8,943,043 B2

QUERY CACHE ------CREATION A. MOBILE COMMUNITY We

125 130 135 MOBILE MOBILE ... DEVICE DEVICE s ess a a e 105 as 120 es e COMMUNITY MOBILE SEARCHENGINE(S) a SEARCHLOGS

100 115 140

QUERY/LINK PAIR SEARCHLOG MINING EXTRACTION MODULE INTERNET CLOUD || MODULE WOLUMESORTING NETWORKO MODULE 110 150 145 -155 sess

OUERY CACHE CACHE SIZE SORTED QUERY, CONSTRUCTION -- SPECIFICATION: LINK, VOLUME MODULE MODULE TRIPLETS 165

175 170 ------Y------4 160

: INCREMENTAL CACHE CACHE UPDATE - TRANSMISSION MODULE QUERY CACHE(S) MODULE ......

......

190 185 a A------Y------

------ADVERTISEMENT ADVERTISEMENT ---> STORE CACHING MODULE : MOBILE DEVICE 180 (SEE FIG.2) FIG. 1 U.S. Patent Jan. 27, 2015 Sheet 2 of 7 US 8,943,043 B2

------1 215 Y QUERY CACHE USE CACHE UPDATE ONMOBILE DEVICE

MODULE

> PERSONAIZATION: MODULE.------V270

RE-RANKING 210 OUERY CACHE -----> MODULE firstreatist: (WITHOPTIONAL PAGE---, ------

MODULE DOCUMENT CACHE) - ERyoffs

CREATIONMODULE a 220 225 s 280 QUERY INPUT CACHE SEARCH MODULE MODULE (QUERYTERMIN CACHE) 23 240 ADa so INSERT voo AND REPORT-1- MODULE MOBILE COMMS OUERY OUERY ...:-- DISPLAY MODULE (3G, SEARCH RESULT DEVICE WiFi, ETC.) MODULE MODULE RESULT DISPLAY MODULE

LINK LINK INTERNET CLOUD DISPLAY (-- SELECT . P NETWORK MODULE MODULE ------265-1 t 180 FIG 2 U.S. Patent Jan. 27, 2015 Sheet 3 of 7 US 8,943,043 B2

AGGREGATE OUERY VOLUMEASA PERCENTAGE OF TOTAL OUERIES

s D - O c H a CD Y CD CD <

1OOOO

UNIOUE OUERIES

FIG. 3

AGGREGATE LINKVOLUMEASA PERCENTAGE OF TOTAL SELECTEDLINKS

T

5000 1OOOO 15000 20000 25000

UNICRUE LINKS

FIG. 4 U.S. Patent Jan. 27, 2015 Sheet 4 of 7 US 8,943,043 B2

CUMULATIVE OUERY-LINKVOLUMEAS A FUNCTION OF THE MOST POPULAR OUERY-LINK PARS ts "SATURATION /..noPOINT |

10000 2OOOO 30000 40000 50000 60000 70000

UNIQUE QUERY-LINK PAIRS

FIG. 5 U.S. Patent Jan. 27, 2015 Sheet 5 Of 7 US 8,943,043 B2

EXEMPLARY HASH QUERY LINKPAIR HAS BEEN ACCESSED TABLE ARCHITECTURE

OUERY HASH LINK 1 LINK #2 N FLAGs

HASH ("youtube",0) gag (O8761A49, 0.4) (98BA4311, 0.35) 0xFFFEFFFF HASH ("youtube", 1) 954.31A4A (A614311B 0.2) (CF432E1B, 0.1) 0xFFFFFFFF

HASH ("videos", 0) 1E51C65A (18EE1A49, OR) (08761A49, 0.01) 0xFFFFFFFF

HASH ("yotube", 0) 51BD34A3 (O8761A49, 0.3) VI (98BA4311, 0.01) 0xFFFFFFFF

HASH ("http://www.youtube.com”) = 08761A49

FIG. 6 U.S. Patent Jan. 27, 2015 Sheet 6 of 7 US 8,943,043 B2

EXTRACTING SEARCHRESULTS FROM THE LOCAL QUERY CACHE IN RESPONSE TO AUSERENTERED LOCAL QUERY (FOR N = 32"LINKS' FILES) OUERY CACHE

740 G5 ("http://m.facebook.com") = 51 BD34AF G saw %. 32 750

1E51C65A, 1000 . . . 51BD34AF,1400 N. ------

Welcome to Facebook Facebook Facebook is a social utility that connects people with friends and others who work, ...
m.facebook.com

Facebook Mobile Facebook is a social utility that connects people with friends and others who work, ... m.facebook.com FIG. 7 U.S. Patent Jan. 27, 2015 Sheet 7 Of 7 US 8,943,043 B2

GENERAL PURPOSE COMPUTER

DISPLAY DEVICE

SYSTEM MEMORY (RAM/FLASH/ETC.)

INPUT OUTPUT COMMUNICATIONS DEVICE(S) DEVICE(S) INTERFACE US 8,943,043 B2 1. 2 DYNAMIC COMMUNITY-BASED CACHE cache. This conventional Scheme has two drawbacks. First, FORMOBILE SEARCH connecting and fetching data every time the user enters a character might not always be feasible due to the high latency BACKGROUND of the 3G link. This can be especially true for mobile devices equipped with physical keyboards that enable the user to enter 1. Technical Field characters quickly. Second, this scheme might actually A “Community-Based Mobile Search Cache' provides increase power consumption by requiring increased usage of various techniques for maximizing the number of query the data link (3G, WiFi, etc.) to retrieve search results based results served from a remotely constructed local “query on numerous predicted queries. cache' downloaded to mobile communications devices Such 10 A somewhat related scheme addresses the idea of “data as cellphones, PDAs, media players, etc., thereby limiting the prefetching in mobile environments for reducing search need to connect to the Internet or cloud using 3G or other latencies. In general, this scheme determines the user's cur wireless links to service search queries entered by the user. rent location and automatically downloads location depen 2. Related Art dent data to the mobile device (e.g., restaurants, stores, phone The availability of Internet connections on mobile devices 15 numbers, etc.) within the general area Surrounding the user's such as cell phones, PDAs, netbooks, notebooks, etc., allow current location. As the user moves or changes location, this users to easily access the web and their personal files stored in information is updated. In particular, information related to the cloud. However, Internet access using mobile devices the user's current area is prefetched first, with closely related often uses relatively expensive links such as 3G or other data being grouped together. Further, the user's direction of radio-frequency based networks, which often charge for that travel is considered such that if the user moves to the north, data link on a per-byte or per-time data transfer basis. Further, location-based data to the south is not prefetched since it is the limited resources (CPU, memory, bandwidth, etc.) that assumed that the user will not need the data related to areas are often available on the mobile device together with the behind him (i.e., south). As with the prior scheme, while this huge amount of information stored on the web/cloud makes conventional Scheme can reduce query result latencies on the the process of searching and quickly displaying the right 25 mobile device, it will likely increase power consumption by piece of information on mobile devices a significant chal requiring increased usage of the data link (3G, WiFi, etc.) to lenge. Consequently, a user entering a search query on her “prefetch” location-based data as the user moves about. mobile device may desire a quicker and more focused search Another conventional scheme provides content caching for result than when entering the same query on that user's desk mobile devices by preemptively downloading popular multi top computer with a high-speed wired connection to the Inter 30 media content to those devices. In general, this scheme uses a net. “content selection algorithm' for push-based multimedia Many current state-of-the-art approaches to addressing content caching scenarios. This scheme attempts to improve Such issues generally use the mobile device as a text input cache hit rates by exploiting correlations between users who device for entering search queries. Users Submit these search request similar content (based on each user's multimedia queries on their devices, which then transfer the query to the 35 content viewing history), with that information being used to cloud/Internet through the 3G link (or other radio link). The select cache content on an individual user basis. Simpler cloud is then responsible for serving the queries and sending versions of Such schemes simply push the same set of the the results back to the mobile device through the 3G link. most popular multimedia content onto all connected mobile Unfortunately, this typical search architecture has at least two devices, with the limit on the amount of that content typically major drawbacks, including latency and power consumption. 40 based on the available memory of the mobile device. In particular, when compared to a typical local network connection, the 3G or other radio link is slow. Therefore, SUMMARY establishing a connection with the cloud and exchanging information over the radio link takes time that negatively This Summary is provided to introduce a selection of con impacts user experience. In fact, mobile users are often on the 45 cepts in a simplified form that are further described below in go, and they both need and expect the results of their query as the Detailed Description. This Summary is not intended to soon as possible. Further, on typical mobile devices, the radio identify key features or essential features of the claimed sub link is one of the most power hungry components. Conse ject matter, nor is it intended to be used as an aid in determin quently, frequent or prolonged use of the radio link leads to ing the scope of the claimed Subject matter. faster draining of the mobile device's battery. Additionally, in 50 Users of mobile communications devices such as cell a typical mobile environment, a radio link is not always phones, PDAs, netbooks, notebooks, etc., often use these available to the mobile device, thereby preventing the user mobile devices to search information while on the go by from entering search queries for data or information in the accessing one or more remote search engines or query search cloud. services (e.g., BingTM, Google(R), Yahoo.(R), etc.) using a wire Several schemes have attempted to address some of these 55 less link to the Internet, or some other cloud or network. In issues. For example, one conventional mobile search scheme searching this information, the user generally enters a tar attempts to reduce search time, or latencies, resulting from geted and/or contextual search query into the mobile device. relatively slow data links used by many mobile devices by In general, a “Community-Based Mobile Search Cache', as evaluating characters of a search term while they are being described herein, provides various techniques for maximiz entered by the user. This mobile search scheme then continu 60 ing the number of query results served from a local “query ously attempts to predict what the user query will be based on cache', thereby significantly limiting the need to use 3G, the characters that the user has entered so far, and uses each of WiFi, or other wireless radio links to connect the mobile the resulting predicted queries to download a predetermined device to the Internet, cloud, or network when entering search number of search results to a local cache while the user is queries. entering the query. Once the complete search term has been 65 More specifically, in contrast to conventional mobile entered by the user, search results corresponding to the com search schemes, the Community-Based MobileSearch Cache plete search term are displayed on the mobile device from the provides a local query cache of search results and/or related US 8,943,043 B2 3 4 web pages for “popular queries based on an evaluation of considered in various embodiments to re-rank Search results historical community queries. The content of the local query returned by the Community-Based Mobile Search Cache. In cache is determined by mining popular search queries from general, re-ranking of search results ensures that information mobile search logs of a broader community of mobile that may be more relevant to the user will be ranked higher, devices. This data mining is periodically performed either on 5 and thus displayed higher on the list of search results returned a global basis (i.e., using the search history of all users of by the Community-Based Mobile Search Cache in response mobile devices), or based on the search queries of one or more to the user query. groups or Subgroups of users (typically numbering in the Some of the advantages of the local query cache provided thousands or millions) for the purpose of periodically updat by the Community-Based Mobile Search Cache are that ing the query cache. 10 Deterministic or probabilistic caching of popular results search accuracy and search result rankings are preserved and/or related web pages based on this data mining allows the (relative to either the original search results being mined or to Community-Based Mobile Search Cache to immediately the community “popularity of those results based on aggre return search results and/or web pages to the user from the gate link Volume), thereby precluding the need to maintain a local query cache without requiring the mobile device to 15 search index on the mobile device. In fact, in most embodi access the Internet or cloud using its 3G or other wireless ments, ranking does not generally run on the mobile device, connection. It is also important to understand that the query since when results are presented to the user, they are simply cache is generally not being searched to identify matches to presented in the order in which they would have been returned user-entered queries, but rather that that the query cache is from the original search index (retrieved by mining commu already indexed, complete with rankings for the search results nity search logs, as noted above). However, if those results of a large set search terms at the time the query cache is have been re-ranked, as noted above, the results are instead created. However, in various embodiments, some or all of the displayed based on their new rankings. search results are re-ranked based on a variety of factors as In view of the above summary, it is clear that the Commu discussed in further detail herein. nity-Based Mobile Search Cache described herein provides The set of popular queries and web pages mined from the 25 various techniques for maximizing the number of query broader community of mobile devices changes over time. results served from a remotely constructed local “query Consequently, in various embodiments, the local query cache cache', thereby significantly limiting the need to connect to is periodically updated to make Sure that the most up-to-date the Internet or cloud using 3G or other wireless links to information is available locally on the mobile device. The service search queries. In addition to the just described ben Community-Based Mobile Search Cache will periodically 30 efits, other advantages of the Community-Based Mobile (e.g., hourly, daily, weekly, etc.) mine the search logs to Search Cache will become apparent from the detailed identify the latest set of popular queries and web pages. The description that follows hereinafter when taken in conjunc periodically updated local query cache is downloaded to tion with the accompanying drawing figures. mobile devices on demand, preferably, though not necessar ily, at a time that will reduce or eliminate the need for the 35 DESCRIPTION OF THE DRAWINGS mobile device to use 3G, WiFi, or other wireless links during the download process. For example, in various embodiments, The specific features, aspects, and advantages of the the local query cache can be downloaded to the user's desk claimed subject matter will become better understood with top, laptop, notebook, or netbook computer via a typical regard to the following description, appended claims, and Internet or network connection, which is then “synched' or 40 accompanying drawings where: otherwise copied to the mobile device to transfer or update a FIG. 1 illustrates a general flow diagram for constructing current copy of the query cache on the mobile device. and providing a community-based query cache to one or more Over time, in various embodiments, the Community mobile devices for use in implementing a “Community Based Mobile Search Cache locally monitors the queries Based Mobile Search Cache, as described herein. entered as well as the search results clicked by the individual 45 FIG. 2 provides an exemplary flow diagram that illustrates user, and performs two discrete tasks. First, it expands the program modules for implementing various embodiments of local query cache to include all those queries and search receiving, using, and updating the community-based query results accessed by the user that do not initially exist in the cache of FIG. 1 on an individual mobile device, as described community part of the local query cache. In that way, the local herein. query cache can take advantage of the repeatability of the 50 FIG. 3 provides an example of a graph of real world data queries Submitted by the mobile users to serve as many que showing aggregate query Volume as a percentage of total ries as possible locally on the mobile device. Second, it col queries, derived from a record of queries and search results of lects information about user clicks, such as when and how a community of mobile devices, as described herein. many times the user clicks on a link after a query is Submitted, FIG. 4 provides an example of a graph of real world data to customize ranking of search results to the user's click 55 showing aggregate link Volume as a percentage of total history. selected links, derived from a record of queries and search In various embodiments, the local query cache down results of a community of mobile devices, as described loaded to the mobile devices of specific users, or to particular herein. 'groups' or “communities of users, is customized based on FIG. 5 provides an example of aggregate query-link Vol any of a variety of factors. These factors include, but are not 60 ume as a function of most popular query-link pairs derived limited to, geographic region of the user, age and/or gender of from a record of queries and selected search results from a the user, language of the user, user browsing or search histo community of mobile devices, as described herein ries, the contents of electronic calendars or other electronic FIG. 6 illustrates an exemplary hash table data structure for documents maintained on a particular mobile device, etc. In linking queries to search results, as described herein. various embodiments, these factors are used in at least two 65 FIG. 7 illustrates an example of a sequence of steps for different ways. First, to identify which queries should be extracting 'search results' from the local query cache in included in the local query cache. Second, these factors are response to a user entered query, as described herein. US 8,943,043 B2 5 6 FIG. 8 is a general system diagram depicting a simplified 1.1 System Overview: general-purpose computing device having simplified com As noted above, the “Community-Based Mobile Search puting and I/O capabilities for use in implementing various Cache.” provides various techniques for various techniques embodiments of the Community-Based Mobile Search for maximizing the number of query results served from a Cache, as described herein. 5 remotely constructed local “query cache', thereby signifi cantly limiting the need to connect to the Internet or cloud DETAILED DESCRIPTION OF THE using 3G or other wireless links to service search queries. The EMBODIMENTS processes Summarized above are illustrated by the general system diagram of FIG. 1. In the following description of the embodiments of the 10 In particular, the system diagram of FIG. 1 illustrates the claimed Subject matter, reference is made to the accompany interrelationships between program modules for implement ing drawings, which form a parthereof, and in which is shown ing various embodiments of the Community-Based Mobile by way of illustration specific embodiments in which the Search Cache, as described herein. Furthermore, while the claimed subject matter may be practiced. It should be under system diagram of FIG. 1 illustrates a high-level view of 15 various embodiments of the Community-Based Mobile stood that other embodiments may be utilized and structural Search Cache, FIG. 1 is not intended to provide an exhaustive changes may be made without departing from the scope of the or complete illustration of every possible embodiment of the presently claimed Subject matter. Community-Based Mobile Search Cache as described 1.0 Introduction: throughout this document. In general, a “Community-Based Mobile Search Cache.” In addition, it should be noted that any boxes and intercon as described herein, provides various techniques for maxi nections between boxes that may be represented by broken or mizing the number of query results served from a local “query dashed lines in FIG. 1 represent alternate embodiments of the cache', thereby significantly limiting the need for mobile Community-Based Mobile Search Cache described herein, communications devices to use 3G, WiFi, or other wireless and that any or all of these alternate embodiments, as radio links to connect to the Internet, cloud, or other network 25 described below, may be used in combination with other when entering search queries. In other words, the Commu alternate embodiments that are described throughout this nity-Based Mobile Search Cache uses a cache-based archi document. tecture to locally serve search queries by using the query In general, as illustrated by FIG. 1, the processes enabled cache on mobile communications devices such as cell by the Community-Based Mobile Search Cache begin opera phones, PDAs, netbooks, notebooks, etc. 30 tion by using a search log mining module 100 to extract data For example, users of mobile communications devices from one or more community mobile search logs 105. As often use these devices to search information while on the go discussed in further detail in Section 2.1 and Section 2.2, the by accessing one or more remote search engines or query community mobile search logs 105 include records of search search services (e.g., BingTM, Google(R), Yahoo.(R), etc.) using queries with corresponding search results and user link selec 35 tion information derived from one or more search engines 115 a wireless link to the Internet, or some other cloud or network. interacting with a community of mobile devices (i.e., mobile In searching this information, the user often enters a targeted community 120). In other words, the mobile community 120 and/or contextual search query into the mobile device. In includes a plurality of mobile devices (e.g., mobile devices contrast to conventional mobile search Schemes, the Commu 125, 130, 135), typically numbering in the thousands or mil nity-Based Mobile Search Cache provides a local query 40 lions of users, that Submit queries to the search engines 115 cache of search results and/or related web pages for popular which interact with the Internet, cloud, or network 110 to queries based on an evaluation of historical community que retrieve or return search results. These queries along with the ries. The content of the local query cache is determined by results of those queries (i.e., returned links and user selection mining popular search queries from mobile search logs of a of one or more of those links) are recorded or stored in one or broader community of mobile devices. Periodic updating of 45 more databases referred to hereinas the aforementioned com the query cache is achieved by periodically performing the munity mobile search logs 105. data mining, either on a global basis (i.e., search history of all As discussed in further detail in Section 2.1 and Section users of mobile devices), or based on the search queries of one 2.2, mining of the community mobile search logs 105 by the or more groups or subgroups of users (typically numbering in search log mining module 100 involves examining the infor the thousands or millions). In various embodiments, search 50 mation contained in those logs using a query/link pair extrac queries Submitted by individual users are monitored and used tion module 140 to extract pairs of queries and corresponding to personalize the query cache for those users. links returned by the search. For example, using the BingTM The size of the query cache provides a space constraint that search engine to enter the query term 'space shuttle' returns is used when mining the mobile search logs to decide which a large number of links, including, for example, "www.na part of the search logs is the most appropriate to be stored so 55 sa.gov/mission pages/shuttle/main/index.html. 'en.wiki that the number of search queries that are locally served is pedia.org/wiki/space shuttle'. "www.space.com/space maximized. Note that in various embodiments, query cache shuttle', etc. Thus, the query term “space shuttle' when size can be determined on an individual user basis, can be set coupled with any of these returned links represents a “query/ at user specified levels, or can be constructed in various sizes link pair. Once the query/link pair extraction module 140 has that are either automatically selected or selected by the user 60 extracted the available query/link pairs from the community for download to the users’ mobile device. In various embodi mobile search logs, a Volume sorting module 145 sorts these ments, whenevera search query is associated with Sufficiently query/link pairs based on the aggregate Volume associated high probability to a specific web page contained in the query with each query/link pair to construct a set of query, link, cache, the Community-Based Mobile Search Cache auto volume triplets 150 (i.e., query/link pairs with eitheran asso matically displays that web page to the user immediately 65 ciated click-through volume or a volume-based “score”). In without first displaying the search results in response to the other words, the “volume” of these triplets 150 represents the query input by the user. total number of instances in the mobile community 120 that a US 8,943,043 B2 7 8 particular link was selected in response to a particular query. device 180 on demand, the query cache is preferably, though Note that any particular user might search for the same thing not necessarily, transmitted to the mobile device at a time that more than once. See Table 1 in Section 2.2 for an example of will reduce or eliminate the need for the mobile device to use query, link, Volume triplets. 3G, WiFi, or other wireless links during the download pro Next, a query cache construction module 155 evaluates the CCSS, sorted query link, volume triplets 150 to determine which part For example, in various embodiments, the query cache 160 of the community mobile search logs 105 is the most appro is downloaded to the user's desktop, laptop, notebook, or priate to be stored so that the number of search queries that are netbook computer via a typical Internet or network connec locally served is maximized. In particular, as discussed in tion. The query cache 160 is then “synched' or otherwise further detail in Section 2.2, the query cache construction 10 copied to the mobile device to transfer or update a current module 155 selects a subset of links having the highest aggre copy of the query cache on the mobile device. The mobile gate Volume along with the query terms associated with the device 180 is then capable of performing local “searches' corresponding triplets to construct one or more query caches using the downloaded copy of the query cache 160. However, 160. Note that as described in Section 2, in various embodi it should also be understood that the mobile device 180 can ments, the Community-Based Mobile Search Cache makes 15 still perform conventional searches and the like by accessing use of a hash table data structure for linking queries to search one or more remote search engines 115 or other remote data results. In this case, the hash table (not shown) is constructed bases (not shown) to retrieve information from the Internet, along with the query cache by the query cache construction cloud, or network 110. module 155 and is then maintained locally (on each mobile Further, in various embodiments, an incremental cache device) separate from the query cache (also maintained update module 175 is used to provide incremental update locally on each mobile device). information to the mobile device 180 via the cache transmis Note that it is expected that different mobile devices (or sion module 170. As with the query cache 160, in various user requirements or specifications) will allow for query embodiments, the incremental update information can also be caches 160 of different sizes (i.e., available memory or stor downloaded to the user's desktop, laptop, notebook, or net age allocated to the query cache). Therefore, in various 25 book computer via a typical Internet or network connection, embodiments, a cachesize construction module 165 is used to which is then “synched' or otherwise copied to the mobile determine the number of query/link pairs that are included in device 180 to update the copy of the local query cache on the the query cache for particular mobile devices or users or mobile device. groups of mobile devices or users. In other words, depending Finally, in various embodiments where advertisements are upon the space available, the query cache construction mod 30 cached and stored with, or accessible with, the query cache ule will select the first n unique links (and paired queries) that 160, an advertisement caching module 185 is used to select a fit in the cache (given the space allocated for the query cache set of advertisements from an advertisement store 190 based 160) from the sorted query link, volume triplets 150. Again, as on any desired advertisement selection and/or advertisement noted above, the “volume” portion of these triplets is either billing mechanism. Selected advertisements are then actual aggregate Volume, or a Volume-based "score” (i.e., a 35 included in or with the query cache, as noted above. As “ranking score') as discussed in Section 2.2. The selected discussed below, one or more cached advertisements are then query/link pairs are then used to construct the query cache presented to the user in various embodiments based on vari 160. ous conventional techniques for matching particular adver Note also that as discussed in further detail in Section 2.1.1, tisements to user entered queries. in various embodiments, various versions of the query cache 40 FIG. 2 illustrates the interrelationships between program 160 are constructed based on various criteria, including com modules for implementing various embodiments of the Com munity mobile search logs 105 of particular users or particu munity-Based Mobile Search Cache within a particular lar groups of users, logs derived from particular geographic mobile device, as described herein. Furthermore, while the regions or languages, etc. As such, it should be understood system diagram of FIG. 2 illustrates a high-level view of that the query cache construction module 155 is capable of 45 various embodiments of the Community-Based Mobile creating a plurality of different versions of the query cache Search Cache, FIG. 2 is not intended to provide an exhaustive 160 that can then be transmitted to, or selected by, individual or complete illustration of every possible embodiment of the users or mobile devices. Community-Based Mobile Search Cache as described In addition, as discussed in further detail in Section 2.4, in throughout this document. In addition, it should be noted that various embodiments, additional content, Such as, for 50 any boxes and interconnections between boxes that may be example, individual documents, advertisements, web pages, represented by broken or dashed lines in FIG. 2 represent or even entire web sites, are cached and stored with, or acces alternate embodiments of the Community-Based Mobile sible with, the query cache 160. Since web pages or docu Search Cache described herein, and that any or all of these ments generally require more storage space than the Sorted alternate embodiments, as described herein, may be used in query link, volume triplets 150, the amount of this additional 55 combination with other alternate embodiments that are content is also limited by the available storage space, as described throughout this document. defined or specified by the cache size specification module In general, the system diagram of FIG. 2 builds on FIG. 1 165. by illustrating various embodiments showing how the query A query cache transmission module 170 then transmits the cache 160 of FIG. 1 is received, used, and/or updated within query cache 160 to individual mobile devices 180 for local 60 an individual mobile device 180. Note that the query cache use by those devices. Note that while only one mobile device 160 generated as illustrated by FIG. 1 may be locally modi 180 is illustrated in FIG. 1 for purposes of explanation, in fied or updated due to various local conditions or actions on operation of the Community-Based Mobile Search Cache, it the mobile device 180. Consequently, the query cache 160, is assumed that many thousands or millions of mobile devices once received by the mobile device of FIG. 2, is referred to as will receive a copy of a query cache 160 (though not neces 65 a “local query cache 200' to differentiate the local copy of the sarily the same copy) for local use. Note also that while the query cache from the remote copy of the query cache 160 query cache 160 can be directly transmitted to the mobile constructed as illustrated by FIG. 1. US 8,943,043 B2 10 More specifically, as illustrated by FIG. 2, the mobile Cache can display one or more relevant ads to the user as he device 180 receives the query cache 160 (or query cache types a query by using the ad insert and report module 243 to update information, as described above with respect to FIG. select relevant advertisements (either cached or from one or 1) transmitted by the cache transmission module 170 of FIG. more remote sources) and incorporate them into the search 1. This information is received by the mobile device 180 via results page constructed by the query result module 230. For a cache receipt module 205 using conventional communica example, for every possible keyword that the user might enter, tions protocols and methods. The mobile device 180 then the Community-Based Mobile Search Cache can instantly stores the received copy of the query cache 160 as the local retrieve and display an advertisement about a business near query cache 200. In the case of receiving cache update infor the location of the user that is related to this keyword. How mation rather than a complete query cache, a cache update 10 ever, it should be understood that criteria other than user module 215 is used to integrate that information into the local location can also be used for selection of advertisements to be query cache 200. displayed to the user (see Section 2.5). Further, as discussed above with respect to FIG.1, multiple Finally, in various embodiments, the ad insert and report versions of the query cache 160 may be created based on module 243 also keeps track of advertisements that are either various criteria, including community mobile search logs 105 15 presented to the user and/or selected by the user. This adver of particular users or particular groups of users, logs derived tisement presentation and selection information is then peri from particular geographic regions or languages, etc. As such, odically reported back to a remote server by the ad insert and in various embodiments, a cache selection module 210 is report module 243 for advertiser billing purposes. In other provided on the mobile device 180 to allow the user (or words, this periodically reported information allows adver mobile device) to select one or more of the different versions tisers to be billed on a per placement, per impression and/or of the query cache 160 for use as the local query cache 200. A per-selection basis for advertisements presented to the userby simple example of this concept is that a user in Paris, France the Community-Based Mobile Search Cache. that intends to travel to New York City may want to select a It should also be understood that the use of the local query copy of the query cache 160 derived from community mobile cache 200 by the mobile device180 is generally transparent to search logs 105 generated by mobile devices in New York 25 the user. In other words, the result page(s) presented to the City for use as the local query cache 200 on that user's mobile user on the display device 240 will appear to the user to be the device 180. same as if the user had entered the query via a conventional Regardless of whether a particular query cache (element search engine that then polled or accessed a remote database 160 of FIG. 1) is selected for the local query cache 200, once to return results to the user. In fact, in various embodiments, the selected query cache has been received by the mobile 30 the search engine being used on the mobile device 180 is device 180, user queries entered on the mobile device are simply modified to access the local query cache 200 directly serviced locally using the local query cache. More specifi instead of transmitting the query string for remote handling as cally, a query input module 220 is used to enter a query string with conventional search engine usage. Alternately, in vari or search term (similar to the entry of a query term in any ous related embodiments, the operating system includes an conventional browser). A cache search module 225 then 35 interrupt or the like that watches for a user query string examines the local query cache 200 to determine whether the entered into any conventional search engine, and then causes Subject query term exists in the local query cache. If that the query string to be handled locally via the aforementioned query term does exist in the local query cache 200, a query cache search module 225, as described above, with the results result module 230 synthesizes or constructs one or more then being returned to that search engine. Further, in various results pages (see Section 2.6) based on the content of the 40 embodiments, the results page(s) synthesized by the query local query cache with respect to the query term. result module 230 are identified as having been synthesized In other words, as discussed in further detail throughout from the local query cache 200 so that the user is fully aware Section 2 of this document, the local query cache 200 does not of where those results came from. include copies of actual search results pages, but rather, it In the case that the query term entered via the query input stores a ranked list of links and associated query terms mined 45 module 220 does not exist in the local query cache 200, a from the results of the original community mobile search logs query search module 245 simply performs a conventional (element 105 of FIG. 1). Note that in various embodiments, search by using a mobile communications module 250 to the ranking information is maintained in a separate hash table access the internet, cloud, or network 110 to perform that (not shown), as discussed in Section 2. These links and asso search. The results of the conventional search are then ciated query terms, when retrieved from the local query cache 50 returned to the query search module 245. The query search 200 are then automatically formatted by the query result module 245 then simply passes those results to the result module 230 and used to locally (i.e., on the mobile device) display module 235, which then presents the downloaded synthesize or construct one or more 'search result pages. A search results to the user via the conventional display device result display module 235 then presents the synthesized 240. search results page(s) to the user via a conventional display 55 In either case, once the search results have been presented device 240 that is integral (or coupled to) the mobile device to the user via the display device 240, the user can browse or 180. select any of the links presented in those results in the same Further, as noted above, in various embodiments, adver manner as with a results page returned by any conventional tisements may also be cached in or along with the query search engine. In particular, a link select module 260 allows cache. In the case of advertisements, as discussed in further 60 the user to select any of the links being displayed on the detail in Section 2.5, various advertisements that are related to display device 240. As noted above, in various embodiments, the content in the local query cache are included in, or in the Community-Based Mobile Search Cache stores or caches conjunction with, the local query cache 200. In addition, in additional content, such as, for example, individual docu various embodiments, an ad insert and report module 243 is ments, web pages, or even entire web sites, either in the local capable of contacting a remote server to download one or 65 query cache 200, or in a separate file or database (not shown more relevant advertisements in real-time for presentation to in FIG. 2) that is accessible to the Community-Based Mobile the user. In either case, the Community-Based Mobile Search Search Cache. US 8,943,043 B2 11 12 Therefore, in various embodiments, once a link has been Finally, in yet another embodiment, the local query cache selected, the link select module 260 first checks the local 200 is periodically evaluated by a “favorites’ creation module query cache 200 (or the associated file or database of addi 280 to extract a set of the highest probability links to auto tional content) to determine whether the page or document matically create a list of favorite links that are presented to the corresponding to the selected link is stored locally. If that user as simply a list of favorites or the like. For example, while information is stored locally, then the link select module 260 the list of favorites can be any length desired, some number forwards that information to a link display module 265, which (e.g., 15 links) of the most frequently selected unique links then presents the information corresponding to the selected (based on the aforementioned query, link, Volume triplets) are link to the user via the display device 240. If the information simply presented to the user as a user selectable list. This list (i.e., web page, document, etc.) corresponding to the selected 10 is presented to the user via the display device 240, with user link is not stored locally, then the link select module 260 selection of these links being handled as described above. In passes the selected link to the mobile communications mod various embodiments, this list of favorites is user editable, has ule 250. The mobile communications module 250 then a user selectable size, and can be displayed to the user on retrieves the information corresponding to the selected link demand via user selection of a favorites button, a favorites from the internet, cloud, or network 110. The retrieved infor 15 menu item, etc. mation is then returned to the link select module 260. The link 2.0 Operation of the Community-Based Mobile Search select module 260 then passes the information to the link Cache: display module 265, which then presents the information The above-described program modules are employed for corresponding to the selected link to the user via the display implementing various embodiments of the Community device 240. Based Mobile Search Cache. As summarized above, the Further, in various related embodiments where the infor Community-Based Mobile Search Cache provides various mation (i.e., web page, document, etc.) corresponding to the techniques for maximizing the number of query results selected link is stored locally (on mobile device 180), when served from a remotely constructed local “query cache', ever a search query is associated with a Sufficiently high thereby significantly limiting the need for mobile devices to probability of click-through to a specific web page (based on 25 connect to the Internet or cloud using 3G or other wireless the aforementioned data mining), the corresponding web links to service search queries. page is immediately displayed to user without first displaying The following sections provide a detailed discussion of the the search results via the result display module 235. operation of various embodiments of the Community-Based As noted above, in various embodiments, the local query Mobile Search Cache, and of exemplary methods for imple cache 200 is locally customized, modified, or updated due to 30 menting the program modules described in Section 1 with various local conditions or actions on the mobile device 180, respect to FIG. 1 and FIG. 2. In particular, the following including, but not limited to user browsing or search histories, sections provide examples and operational details of various the contents of electronic calendars or other electronic docu embodiments of the Community-Based Mobile Search ments maintained on a particular mobile device, etc. (see Cache, including: search log mining; constructing the query Section 2.3 and 2.4). This type of personalization or customi 35 cache; result page construction; customizing the query cache; zation is performed by a personalization module 270. In synthesizing search results pages; re-ranking search results in general, the personalization module adds, removes, or re the query cache; and automatic bookmark or “favorites' cre scores links in the local query cache 200 in response to user ation. interaction with the local query cache overtime. For example, 2.1 Search Log Mining: in various embodiments, a local data store accessible to the 40 As noted above, the Community-Based Mobile Search mobile device 180 including one or more documents, emails, Cache makes use of a query cache that is derived from mobile calendars, etc., is scanned to extract names, dates, places, or search logs of a broader community of mobile devices. In other information corresponding to one or more query terms examining typical mobile search logs from large numbers of in the query cache. Links to these documents emails, calen users, it has been observed that a relatively small number of dars, etc., are then included in the local cache 200. Then, 45 queries and links from a set including millions of actual when the user enters a corresponding query term, links to the mobile searches can be considered as being "highly popular. locally available information will be included in the results In other words, a relatively small set of mobile searches presented to the user on the display device 240, as described correspond to a significant percentage of the most common above. mobile searches. In partially related embodiments, similar factors are con 50 For example, in a tested embodiment of the Community sidered in various embodiments to re-rank search results Based Mobile Search Cache, a database of actual mobile returned by the Community-Based Mobile Search Cache (see search logs comprising about 100 million queries to a con Section 2.3.1 for a discussion of personalized re-ranking). In ventional search engine was recorded over an approximately general, a re-ranking module 275 re-ranks search results in three month period. This database of community-based the local query cache 200 to ensure that information that may 55 mobile search logs was analyzed to determine query, link, and be more relevant to the user of the mobile device 180 will be Volume triplets (i.e., query/link pairs with associated Volume) ranked higher. More specifically, as discussed in Section for every mobile query processed and recorded by the search 2.3.1, by monitoring user queries and clicks over time, the engine. As illustrated by FIG. 3, it was observed that more Community-Based Mobile Search Cache is aware of when than 60% of the aggregate mobile search query Volume dur and how many times the user selects a link after a given query 60 ing this period corresponded to as few as 10,000 unique is submitted. The Community-Based Mobile Search Cache queries. Similarly, as illustrated by FIG. 4, it was observed uses this information to incrementally update the ranking that more than 60% of the aggregate link Volume (e.g., click score of the cached search results to offer a personalized through from search results, entering hyperlinks directly, etc.) search experience to individual users. As such, results that are during this same period corresponded to as few as 6,000 more relevant are displayed higher on the list of search results 65 unique hyperlinks or web pages. displayed on the display device 240 by the result display Consequently, in view of these examples of real-world module 235 in response to the user query. data, it was determined that high cache hit rates could be US 8,943,043 B2 13 14 achieved by using a local cache of a relatively small subset of cantly from the most popular mobile searches in Houston, the total community-based mobile search logs. Therefore, the Tex. Therefore, if the location of the mobile device can be Community-Based Mobile Search Cache mines the mobile located within a particular region using GPS, user input, or search logs to decide which part of the search logs is the most other conventional localization techniques, including using appropriate to be stored so that the number of search queries the locations of cell towers or WiFi sites that the mobile that are locally served is maximized. Note that the query device is connected to, a local query cache corresponding to cache size can be a function of available or allocated space popular searches in that region can be uploaded to the mobile (e.g., available local memory or storage) for particular mobile device. Note also that since the query caches are constructed devices, and that different query cache sizes can be used for from relatively large data sets (i.e., many queries and many particular mobile devices. Further, it should also be clear that 10 in view of FIG.3 and FIG. 4, beyond some easily determin users), a user interface is provided in various embodiments to able point, relatively little will be gained in terms of cache hit allow the user to select one or more particular local query rates by increasing the size of the query cache (with respect to caches for download to the mobile device. In other words, in caching queries and links). In fact, various tested embodi various embodiments, the user is provided with the capability ments of the Community-Based Mobile Search Cache 15 to select one or more query caches from a set of available showed consistent positive cache hit rates when using the query caches corresponding to particular regions, languages, query cache on the order of about 60% when using a query groups, etc. cache of about 32 MB in size. Therefore, since typical mobile In the case of per-device customization, one important devices, such as cell phones, PDAs, netbooks, notebooks, consideration is memory allocated or available for the local etc., generally include many gigabytes of local storage, the query cache. Specifically, as the available memory is query cache generally requires a negligible amount of that increased, more information, including search results, hyper available storage. links, web pages, images, etc., can be included in the local However, while relatively large increases in the storage of query cache. Consequently, in various embodiments, mul additional queries and links in the query cache may not sig tiple query caches are constructed using a range of maximum nificantly increase cache hit rates beyond some point (as can 25 memory requirements, with a query cache of the appropriate be seen by FIG. 3 and FIG. 4), additional storage space, if size then being downloaded to particular mobile devices. available for the query cache, can be used to cache docu Note that in various embodiments, the user is provided with ments, web pages or even entire web sites. For example, over the capability to set maximum and/or minimum sizes for the relatively short time periods, e.g., hours, days or even weeks, query cache. many highly popular search results can be considered 30 2.2 Constructing the Query Cache: “static'. In other words, much of the content of the Internet is As noted above, the query cache is constructed by mining relatively unchanging or consistent, even for time-sensitive popular search queries from mobile search logs gathered and dynamic web content, when considered in the short term. from the query history of a broader community of mobile Clearly, this will not be the case for real-time content such as devices. In general, every query in these logs includes a live stock quotes. However, some content (e.g., web pages or 35 generally very large set of ranked matches (generally in the documents) Such as, for example, daily news reports, blogs, form of an ordered list of hyperlinks that typically include weather reports, historical stock quotes or charts, maps, res brief content Summaries) for the particular user query that taurant guides, etc., can be considered “static' within the spawned the corresponding log entry. Further, user selection lifetime of the query cache (i.e., the period between updates or entry of one or more of those hyperlinks is also recorded of the query cache). 40 along with the original query. This provides a measure of Therefore, in various embodiments, as discussed in further query/link pairs that is used to generate a ranked list of links detail in Section 2.4, a set of one or more relatively “static' for each query term based on actual link selection by the web pages (selected on the basis on their aggregate link community (i.e., aggregate link Volume for a particular query Volume determined by mining the mobile search logs) are term, as illustrated by FIG.3 and FIG. 4) rather than merely included in the query cache (or in a separate data set locally 45 the rankings provided by the search engine. available to the Community-Based Mobile Search Cache). In As is well known to those skilled in the art, query result other words, in various embodiments, a set of one or more pages and corresponding user selections are typically auto web pages (or documents) having the highest probability of matically provided, and permanently recorded, as an inherent being viewed by the user are included in the local query cache part of the search process, regardless of the particular search (or a separate database or data set accessible to the Commu 50 engine being used (e.g., BingTM, Google(R), Yahoo.(R), etc.). By nity-Based Mobile Search Cache. These documents or web recording all of this information for all mobile searches (or for pages are then served locally in response to user selection of particular groups or Subgroups of users of mobile devices), a corresponding link. As such, the mobile device is not Volume measurements for both query terms and hyperlink required to connect to the Internet or cloud using 3G or other selection are also inherently available via mining of the wireless links to service corresponding search queries entered 55 mobile search logs. by the user. Therefore, as noted above, the mobile search logs can be 2.1.1 Customized Mining for Individual Users or Groups: mined to extract a subset of the available information for use In various embodiments, the query cache is customized on in constructing the query cache. Note that in many cases, a per-user, per-device, or per-group basis. For example, in the search results returned from a particular query point to the case of per-user query cache customization, various factors 60 same web page or document in the case that the same web can be considered Such as user locality or geographic region, page was found or indexed by different paths. Therefore, in user language, prior user search logs, user'community', etc. order to further reduce the size of the query cache, in various Given these types of considerations, the local query cache for embodiments of the Community-Based Mobile Search individual users is then customized based on “groups of Cache, these “duplicate' results are removed when construct similar users. 65 ing the query cache. However, it should be understood that For example, in the case of geographic region, the most where different mobile users selected different search results popular mobile searches in New York City may differ signifi pointing to the same web page or document, those individual US 8,943,043 B2 15 16 selections will still be counted for purposes of aggregate link cated to local query cache. As noted above, various mobile volume for the particular link. devices will have differing memory capacities (either physi For example, in a tested embodiment, the Community cal or allocated limits). These memory capacities can be set Based Mobile Search extracted query, link, volume triplets by either the mobile device itself based on its available from the mobile search logs and sorted them in descending memory resources, or by the user depending on how much order (of aggregate Volume). Given this information, the first storage space and memory the user is willing to allocate to the in unique links that fit in the cache (given the space allocated local query cache. In either case, the copy of the local query for the query cache) are then used to construct the query cache cache transmitted to any particular mobile user will not along with a set of the most popular query terms associated exceed the memory threshold for that user. Assuming a rela with those links. 10 tively consistent size for query-link pairs and links, the More specifically, the search results stored in the query memory and storage requirements can be easily computed for cache are extracted directly from the mobile search logs in a a given set of entries (such as those seen in Table 1) based on data-driven way that acts to identify the most popular queries the number of unique queries and links they contain. and search results that are of interest to the whole mobile Next, with respect to the aforementioned “cache saturation 15 threshold, this concept is used populate the local query cache community. Table 1 shows a hypothetical example of the type by starting from the top entry in Table 1, then running down of information extracted from the search logs. A set of triplets though its entries and continuously adding query-link pairs of the form: is generated and Sorted until a query-link pair is reached with a “normalized volume’ based on Volume. The term query corresponds to the exact lower than a predetermined threshold V. The “normalized query string Submitted to the search engine, the term link Volume of a query-link pair is generated by dividing the corresponds to the search result that was selected by a mobile pair's volume by the total volume of all query-link pairs in the user after entering the particular query, and the term Volume search logs. For instance, the normalized Volume of the first represents the number of times in the search logs that the query-link pair in Table 1 is equal to: 10/10=0.01. Since the specific link was selected after entering the query string entries in Table 1 are sorted based on their volume, the nor query. For instance, the first data row in Table 1 can be 25 malized Volume of query-link pairs is monotonically decreas interpreted as follows: Over the period that the mobile search ing. Note that V, can be set at any desired level. However, as logs were recorded, there were 1 million searches where the V, is decreased, the size of the cache will tend to increase link “http://mobile.myspace.com” was selected after the since more query-link pairs will have a normalized volume query “myspace' was Submitted. greater than V, 30 One advantage of using the cache Saturation threshold can TABLE 1. be observed by the graph shown in FIG. 5. In particular, FIG. 5 illustrates an example of aggregate query-link volume as a Example of Query-Link-Volume Triplets function of most popular query-link pairs derived from a Query Link Volume record of queries and selected search results from a commu 35 myspace http://mobile.myspace.com 1,000,000 nity of mobile devices. From an empirical CDF of the volume myspace http://www.myspace.com 950,000 of all query-link pairs in the search logs, it is apparent that the facebook http://m.facebook.com 900,000 value of adding query-link pairs to the local query cache quickly diminishes past some point. This point is referred to youtube http://m.youtube.com 500,000 yotube http://m.youtube.com 300,000 herein as a “saturation point' which can be used to set the 40 aforementioned cache Saturation threshold. For example, weather http://m.veather.com 200,000 using the example provided by the graph of FIG. 5, after having cached approximately 20000 query-link pairs, even marginally increasing the aggregate Volume requires a large Total Volume 100,000,000 number of query-link pairs. For instance, slightly increasing 45 the aggregate volume from 58% to 62% requires doubling the Deciding which entries to store is straightforward. To amount of query-link pairs from 20000 to 40000. maximize the query volume that can be served by the local In tested embodiments of the Community-Based Mobile query cache, the most popular query-link pairs are stored. In Search Cache, evaluation of various mobile search logs has other words, the selected entries represent a subset drawn shown that the cache saturation threshold will be quickly from the top entries of Table 1 (based on volume). The higher 50 reached long before the local query cache begins to approach the volume of a query-link pair, the higher the probability that the memory or storage resources available on the mobile a given user will try to access this query-link pair on his device. In fact, as illustrated by FIG. 5, the saturation point of mobile device. Deciding how many of the most popular the cache (using the exemplary data illustrated by the graph) query-link pairs to store is a more complicated process. In this is quickly reached when the most popular query-link pairs case, the number of query-link pairs to cache is selected based 55 that correspond to approximately 55% of the cumulative on eithera “memory threshold' (i.e., a maximum or allowable query-link Volume have been cached. At this point, the local memory or storage space) or "cache Saturation threshold'. query cache (and associated hash table) generally requires First, with respect to the use of a memory threshold, in less than 2 MB of memory on the mobile device. With various embodiments, the local query cache is held in local memory on typical mobile devices generally being measured memory (i.e., local flash or RAM memory). In this case, a 60 in Gigabytes, it should be clear that the local query cache can memory threshold (M) limits the number of query-link pairs be easily accommodated within the storage resources avail that are to be cached. For example, in this case, starting from able on typical mobile devices. the top entry in Table 1, the Community-Based Mobile Search Independent of which threshold is used (i.e., memory or Cache simply scans down through the entries and continu cache Saturation), the methodology described above identi ously adds query-link pairs to the local query cache until a 65 fies then top entries in the Sorted query-link data (e.g., Table specific flash or RAM memory threshold M, is reached. This 1). However, before using these entries to build the local threshold represents the maximum memory that can be allo query cache, in various embodiments, the Community-Based US 8,943,043 B2 17 18 Mobile Search Cache first computes a volume-based “rank In embodiments of the Community-Based Mobile Search ing score' from the volume information associated with every Cache that make use of a hash table data structure for linking query-link pair. In general, the ranking score for each query queries to search results, the hash table is maintained locally link pair is produced by normalizing the Volume across all (on each mobile device) separate from the query cache (also links that correspond to the query. For example, in the case of 5 maintained locally on each mobile device). While both the query “myspace' in Table 1 (covering the top two rows of hash table and the query cache can be maintained in the same Table 1) has a total volume of 1,950,000 (i.e., 1,000,000+950, storage if desired, there are performance advantages achieved 000). Therefore, the ranking score for the link “http:// by placing the hash table in System memory (which has sig mobile.myspace.com' is equal to 1,000,000/1,950, nificantly faster access times than flash or hard drive-based 10 storage). 000–0.513, while the ranking score for the link “http://ww In particular, in various embodiments, performance of w.myspace.com” is 950,000/1,950,000-0.487. The gener Community-Based Mobile Search Cache is improved on the ated triplets are also used in various embodiments to 15 table being to link queries to search results. This linkage is build the local query cache. done in three steps. First, given a hash of a Submitted query There are a number of inherent advantages in the approach the hash table is used to quickly determine whether there are described above. For example, not only does this approach cached search results for the query (i.e., matching hash in the store the most popular queries across all mobile users, but it hash table) and thus, if there is a corresponding cache hit or a also stores only the most popular search results for every cache miss in the query cache. Second, in the case of a cache query. Even though there might be tens or even hundreds of hit, the hash table provides pointers to the database (i.e., the search results available for a given query, only those search query cache) where the search results for the Submitted query results that are popular across all mobile users are cached. are located. Third, along with each search result pointer, the Again, this popularity is evidenced, and measured, by a large hash table provides its ranking score, enabling the Commu number of user's selecting a particular link for a particular 25 nity-Based Mobile Search Cache to properly rank (and thus query. Advantageously, this approach constrains the amount order) search results on the phone before displaying them to of memory resources required to build the query cache. For the user. instance, in a tested embodiment of the Community-Based The database of search results representing the query cache Mobile Search Cache, it was observed that an average of 1.5 is placed in flash memory or any other desired storage of the search results per query were cached based on the use of the 30 mobile device. The role of the query cache is to store the above-described cache saturation threshold. This can result available search results. Ideally, the query cache is stored in a into 3 to 5 times lower memory overhead when compared to format that that occupies the least possible storage space a simpler approach where some fixed number, M, of the top while allowing results to be quickly retrieved in order to search results for every popular query are cached. improve local performance of the Community-Based Mobile Another advantage is that each query and search result pair 35 Search Cache. However, any desired storage and access for extracted from the search logs has a known Volume (i.e., mat can be used for the query cache. The information stored aggregate link selection count by the community of mobile in the database for each search result includes the necessary users). By caching the Volume (or the normalized volume information for generating the same or similar user experi across all pairs in the cache) of every query-link pair, the ence that the user would see when directly accessing a con mobile device can rank search results for a given query locally 40 ventional search engine using conventional wireless commu without the need to reach a search engine over the 3G or other nications techniques. In other words, as with a conventional wireless link. search engine, the user will be presented with a set of one or In addition, by processing the mobile search logs in the more links including actual web address, a short description manner described above, the Community-Based Mobile of the web-site and the human readable form of the web Search Cache automatically discovers the most common mis 45 address. spellings of popular queries. In particular, given the cumber Further, in various embodiments, over time and as the user Some text input interfaces on phones or other mobile devices, Submits queries and clicks on search results, the Community mobile users, as a community, frequently misspell their que Based Mobile Search Cache updates both the hash table and ries. These frequent misspellings appear in the search logs as the database of search results. First, every time the user clicks popular queries, enabling the Community-Based Mobile 50 on a search result, its ranking score is increased in the hash Search Cache to cache search results for all these query mis table for that particular user. In addition, if a new query or a spellings. As a result, queries such as "yotube”, “facebok. new search result is selected that does not exist in the query “utube', and "yaho' will be automatically included in the cache (i.e., the Community-Based Mobile Search Cache local query cache without specifically considering whether accesses a remote search engine to retrieve results), both the those terms are misspellings of a desired query term. Fortu 55 hashtable and the database of that user are updated so that this nately, since most search engines are adapted to automatically query and search result can be retrieved from the cache in the correct for these types of spellings errors, the resulting query future. link pairs will generally be valid for the search that was 2.2.2 Hash Table Architecture: actually intended by the user. The following paragraphs describe a hash table-based 2.2.1 Query Cache Storage Architecture: 60 architecture for mapping queries to search results contained The set of search logs. Clearly, this information can be ated to the search result (short description, web address etc.) stored in many different formats without departing from the from the query cache. The second number (e.g., “0.4) cor concepts described herein for providing local cache-based responds to the ranking score of the search result. The ranking search capabilities on mobile devices. As such, it should be 30 understood that the precise form of link storage described score of every query-link pair is represented by its normalized below for the query cache is provided only for purposes of volume, as described above. The last field of each entry in the explanation, and that the concepts described herein are not hash table is a 64-bit number that is used to log information intended to be limited to the precise form of the query cache about the two search results in this entry (the 32 most signifi storage techniques that are described for purposes of expla cantly bits correspond to the first search result while the 32 35 nation. least significant bits correspond to the second search result). For example, in a tested embodiment of the Community However, in a tested embodiment of the query cache, only one Based Mobile Search Cache, each search result the query bit for each link was used to indicate if the user has ever cache was stored by storing its title (which serves as the link accessed the specific query-link pair. The rest of the flag bits to the landing page), a short description of the landing page are reserved for future purposes. As such, it should be clear 40 and the human readable form of the link. A formatted that the number of bits in the flag is not of particular relevance. example of this information is illustrated at step S6 (790) of For example, consider the first entry in the example hash FIG. 7, which shows the following formatted search result: table shown in FIG. 6. This entry corresponds to the query Facebook Mobile “youtube' and thus, its first field contains the hash value of the Facebook is a social utility that connects people with string “youtube' (i.e., “95431 A49). The most popular 45 friends and others . . . search result for this query (most selected in the mobile com m.facebook.com munity) points to the web address"http://www.youtube.com As such, it should be clear that the Community-Based Mobile and therefore the second field of this entry contains the hash Search Cache displays search results in the same or similar value of this web address along with its ranking score (i.e., way as a conventional search engine would in order to provide “08761A49, 0.4”). Similarly, the third field contains the hash 50 a transparent user query/search experience. and score information for the next most popular search result In a tested embodiment of the Community-Based Mobile for this query (i.e., “98BA4311, 0.35'). Further, from the Search Cache, the amount of memory used to store this infor value of the bit flags, it can be seen that the user has already mation for a search result in the query cache is on average 500 submitted the query youtube' on his phone and he has only bytes. However, the actual memory space required can vary clicked on the first search result immediately after. Note also 55 significantly depending on the internal structure offlash chips that there is a second entry for the term “youtube' in FIG. 6. used in the mobile device. For example, flash memories are While this will not be the case for every query term, there can generally organized in blocks of fixed size that are usually be multiple entries for each query term depending upon the equal to 2 KB, 4 KB or 8 KB depending on the memory score associated with the links corresponding to that query density of the chip. This means that even small files with size term. 60 less than the size of a flash block, will appear to occupy a full In general, given a set of