Relationship Analysis between User's Contexts and Real Input Words through Twitter

Yutaka Arakawa, Shigeaki Tagashira and Akira Fukuda
Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
744, Motooka, Nishi-ku, Fukuoka, Fukuoka, JAPAN, 819-0395
Email: arakawa, shigeaki, [email protected]

Abstract: In this paper, we propose a method for evaluating the effectiveness of our proposed context-aware text entry by using Twitter. We focus on "geo-tagged" public tweets because they include important user contexts: real location and time. We also focus on TV program listings because 50% of iPhone traffic in Japan is generated at home, where users often tweet while watching TV. We propose a cyclical collecting system based on the Streaming API and Search API of Twitter to gather the target tweets efficiently. To find the relationship between a user's contexts and the words actually used, we compare the words actually tweeted with words obtained from the Local Search API of Yahoo! Japan, which is used for our context-aware text entry, and with words obtained from TV program listings. We analyze 471,274 tweets collected from 15 December 2009 to 10 June 2010 to determine their relationship to landmark information and TV programs. As a result, we show that 5.1% of tweets include landmark words and 9% of tweets include TV program words. Additionally, we show that there are location-dependent words and time-dependent words.

I. INTRODUCTION

A recent study in Japan [1] found that over fifty percent of Internet users access the Internet from mobile devices. Among them, over eighty percent reach information not through the hierarchical menus of official sites but through search engines such as Google. Moreover, current mobile devices can use not only text messaging but also web mail such as Gmail, which means that users have occasion to input long texts. The increase of text input on mobile devices drives the demand for improved text input methods.

Recent mobile phones are generally equipped with sophisticated text entry that offers predictive transform. This function consists of dictionaries, syntactic parsing, and learning. When a user inputs "a", it picks up the words that start with "a" from a dictionary and recommends candidate words that seem appropriate according to syntactic parsing. In addition, it sorts the candidates based on the user's input history, such as frequency or the time stamp of latest use. These days, iWnn [2], a Japanese text entry system adopted in many kinds of mobile phones, suggests more appropriate words according to the current season, the time, the body of a received e-mail, and the relationship between superior and subordinate. As another approach that differs from text entry, "Google Suggest" provides frequently used keyword combinations for optimizing search terms and reducing keystrokes.

Meanwhile, we have proposed context-aware text entry [3], which can suggest useful words based on the user's context, such as location, presence and time. Among the above-mentioned three components of predictive transform, we mainly focus on how to build the dictionary dynamically. The reason is that a word not included in the dictionary cannot be recommended, and that we are not specialists in philological syntactic parsing. In our proposed system, the dictionary in the mobile phone is updated periodically, based on the user's current contexts, in cooperation with a dictionary creation server on the Internet, which generates the user's current dictionary dynamically by using several public Web APIs. In our current prototype, "location" and "time" are adopted as the user's context: landmark names surrounding the user are added based on the user's location, and TV program titles and performers' names are added based on "time". The reason why we use TV programs is that 50% of iPhone traffic in Japan is transmitted through home WiFi networks. As a result, if you input "H" near the venue of Globecom2010, the system may suggest "Hyatt Regency Miami" as one of the candidates. If you input "J" or "K" while watching 24, the system may return "Jack Bauer" or "Kiefer Sutherland" respectively. We have already constructed an OpenWnn-based prototype system on an Android terminal [4]. We have also tried several system architectures and have confirmed that one of them can achieve a sufficient response time [5]. In addition, we are evaluating our system through demonstrations and questionnaires with several people.

Although our system looks effective, an important issue remains. Since we started this research with the assumption that this kind of system would be convenient for us, there is no evidence or quantitative evaluation that demonstrates its effectiveness. It is hard to gather a large amount of results through questionnaire-based evaluation. In addition, if we log the sentences actually input on a mobile phone, we must consider privacy protection in relation to personal data.

In this paper, we propose a method for evaluating context-aware text entry by using Twitter. Twitter is a microblogging service on the Internet, where a short message of up to 140 characters, called a tweet, can be posted. These tweets are generally open to the public. We focus on Twitter for two reasons: 1) we can obtain a huge amount of public strings from various users, and 2) the Geotagging API released in November 2009 enables users to add a geocode to each tweet. This means that we can extensively and publicly collect real input sentences that include the users' real locations. Therefore, we think that analyzing the collected data clarifies the relationship between a user's contexts and real input words. As a result, it can show the effectiveness of our proposed text entry quantitatively.

First, we construct a tweet collecting system that obtains Japanese tweets with geocodes, in which we combine two Twitter APIs, the Streaming API and the Search API, to gather a huge amount of tweets. Our system has gathered half a million tweets since 15 December 2009. Next, we analyze the collected data by comparing it with data obtained from other APIs. In this paper, we use the "Yahoo! Local search API [6]" to obtain landmark information, and a "TV program listing on the Internet" to obtain TV program titles and performers' names. These are the same sources used for building the dictionary in our context-aware text entry. In our relationship analysis, both sets of data are split into words by using the "Yahoo! Japanese language morphological analysis API" and the "Yahoo! Key phrase extraction API".

As a result, we show that 5.1% of tweets include landmark words and 9% of tweets include TV program words. Additionally, we show that there are location-dependent words and time-dependent words. The rest of the paper is organized as follows. We present our previously proposed context-aware input method editor in Section 2. In Section 3, we explain Twitter. The following section explains the relationship analysis. Finally, results are shown in Section 5.

II. CONTEXT-AWARE TEXT ENTRY FOR MOBILE PHONE

Fig. 1 shows typical service examples in which our proposed context-aware text entry works effectively. It indicates that the importance of words varies with location (i.e., the user's context). For example, nearby station names are used at stations, landmark names are used at new places, and product names are used at bookstores and electronics retail stores.

Fig. 1. Typical effective examples of context-aware text entry (train transit, e-mail, and map applications).

... processes cyclically, the words related to a certain place are suggested there, and normal words are suggested in other places.

A. System architecture

The architecture of the prototype system is shown in Figure 2. It is composed of three parts: the local device, our server on the Internet, and general web services on the Internet. The local device has various sensors, such as GPS and acceleration sensors. In our prototype system, we use a PC as the local device and adopt the Google Maps API in place of a GPS sensor for setting the user's location visually.

Fig. 2. The architecture of the prototype system (local device, internal server, external server).

The internal server in the center of Figure 2 is the main part of our proposed system. It collects information, estimates the user's context, creates the dynamic dictionary, and suggests words by utilizing the user's context. These functions could be built on the local device. However, we place them on a server on the global network because not only the accuracy of the estimation algorithm but also processing speed is important. In addition, we design the system so that collecting sensor information and inputting text on the local device run asynchronously. The dictionary is updated whenever the location changes. As a result, the local device only searches a pre-constructed database
