778 JOURNAL OF COMPUTERS, VOL. 4, NO. 8, AUGUST 2009

A Smart Targeting System for

Weihui Dai School of Management, Fudan University, Shanghai 200433, P.R.China Email: [email protected]

Xingyun Dai and Tao Sun School of Management, Fudan University, Shanghai 200433, P.R.China Email: {xingyundai, stephensun2004}@163.com

Abstract—With the rapid increase of online advertisements in marketing and the growing scarcity of users' attention, targeting techniques are applied to improve the efficiency and effectiveness of online advertising by delivering the right advertisement to the right audience, at the right time and situation, and with the right method. The focus of current targeting has shifted from content targeting, frequency targeting, time targeting and geographical targeting to extended targeting based on user characteristics, using data mining and behavior modeling techniques to analyze users. The protection of privacy has been a debated issue along with the application of these techniques.

This paper presents a smart targeting system with high efficiency, effectiveness and protection of privacy. It uses Web content mining and Web usage mining techniques to track and mine the user behaviors hidden in historical and current user sessions, and designs interfaces for maintaining the advertising rules, which control the targeting system so that it automatically delivers personalized advertisements. All the related knowledge and rules are integrated in one unified vector space model. The targeting process does not require any sensitive data to be input by users, so their privacy is protected.

Index Terms—online advertising, web advertising, personalization, behavioral targeting, web mining

Manuscript received Feb 20, 2009; revised April 30, 2009. This research was supported by National High-tech R & D Program (863 Program) of China (No.2008AA04Z127), National Natural Science Foundation of China (No.70401010), and Shanghai Leading Academic Discipline Project (No.B210).

I. INTRODUCTION

The emergence and development of online advertisements are subject to three direct influences: the quantity of websites, the quantity of online customer services and the promotion of broadband. At present, these three conditions have matured with the rapid adoption of the Internet. According to iResearch's prediction, for example, China's online advertising market will have 30% compound growth in the next few years, and reach 21.6 billion in 2009 [1]. With the growing proportion of online advertisements in marketing and the growing scarcity of users' attention, people are eager to rely on advanced technology to solve problems that cannot be solved by traditional advertising, such as “I know I have wasted half of the money in advertising, but I don’t know which half”. If we send advertisements to the right audience through online advertisement targeting technology, we can improve the efficiency of the online advertisement platform, turn visitors into buyers, save advertisers' costs and improve customer loyalty. This is of great value to today's companies, which operate in short-life, highly complicated and competitive business environments [2] [3]. To reach this goal, the system needs to understand the user's current interests thoroughly, establish a proper system architecture, and protect the user's privacy at the same time.

Scholars have carried out a large number of quantitative and modeling studies on online advertising targeting systems. Aggarwal C.C. (1998) pointed out that most online advertising systems used a user-based targeting method for banner advertisements, and introduced a number of statistics, optimization and scheduling models [4]. Mobasher B. (2001) proposed the use of association rule mining based on user behaviors to provide an effective and extendable customized webpage technology [5]. Milani A. (2004, 2005) presented the use of a fuzzy similarity algorithm [6] [7] and gave a general adaptive online service architecture based on an evolutionary genetic algorithm [8]. Scholars researching personalized network services argued that targeting systems need to move from content targeting, frequency targeting, time targeting and geographical targeting to extended targeting based on user characteristics [9]-[12]. Recently, scholars have expressed their concern about user privacy issues in personalized recommendation systems [13]-[16]. Kazienko P. and Adamski M. (2004) introduced the AD ROSA system, which integrates user behavior and content mining technologies to reduce the user's input and protect privacy [16].

Therefore, this paper addresses further research on the ST (Smart Targeting) system to meet the current development trend of online advertisements, displaying advertisements to the right audience and protecting their privacy efficiently and effectively. It uses Web content mining and Web usage mining techniques to track and mine the user behaviors hidden in historical and current user sessions, and designs

© 2009 ACADEMY PUBLISHER

interfaces for maintaining the advertising rules, which control the targeting system so that it automatically delivers personalized advertisements. All the related knowledge and rules are integrated in one unified vector space model. The targeting process does not require any sensitive data to be input by users, so their privacy is protected.

II. TARGETING SYSTEM FOR ONLINE ADVERTISING

A. Concept and Operation

In order to realize targeting for advertising, the smart targeting (ST) system presented here must have some basic functions: collection of webpage and user related data, web content modeling and model updating, user modeling and model updating, and targeting [25]. The concept of ST system is shown in Fig.1.

Fig.1 Concept of ST System

After a user's first request for a webpage, a new session starts and the system collects every step of the user's access behavior, including URL, time and clicks. From these we can learn the content that he/she is interested in and the access mode he/she prefers. These factors provide the parameters needed to select the right advertisements for the user.

Fig.2 shows the operation of ST system. The following steps are used to reach the goal:

1) Data Collection
Web data mining technology is used to extract webpage content and the history of user session data, and to track and mine the current user's behaviors, so that we can obtain the user's interests and preferences.

2) Pattern Recognition
After keyword extraction, we obtain the typical behavior of a group of users by clustering the keywords and the user sessions.

3) User Modeling
For each group of users who have similar access behaviors, we establish their short-run and long-run user models.

4) User Matching
Each user is matched to the proper user model based on the user's current interest or long-run interest.

5) Targeting
We set different advertisement options for users with different user models, using appropriate advertisement rules, to achieve the purposes of targeting.

Fig.2 Operation of ST System
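The session collection described above (a session starts at the first request, then every step is recorded with its URL, time and click behavior) can be sketched as follows. This is a minimal illustration and not the ST system's own code: the class and field names are hypothetical, and the time spent on a page is approximated, as an assumption, by the gap until the next request in the same session.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """One logged step of a session (field names are illustrative)."""
    url: str
    timestamp: float                   # seconds, any monotonic clock
    clicked_ad: Optional[str] = None   # advertisement ID if the step was an ad click


@dataclass
class Session:
    sid: str
    events: List[Event] = field(default_factory=list)

    def record(self, url: str, timestamp: float, clicked_ad: Optional[str] = None) -> None:
        self.events.append(Event(url, timestamp, clicked_ad))

    def time_per_page(self) -> Dict[str, float]:
        """Approximate time on each page as the gap to the next request.

        The last page of a session has no following request, so it only
        gets an entry of 0.0 if it was not seen earlier in the session.
        """
        totals: Dict[str, float] = {}
        for cur, nxt in zip(self.events, self.events[1:]):
            totals[cur.url] = totals.get(cur.url, 0.0) + (nxt.timestamp - cur.timestamp)
        if self.events:
            totals.setdefault(self.events[-1].url, 0.0)
        return totals


class SessionTracker:
    """Starts a session on the first request and appends later steps to it."""

    def __init__(self) -> None:
        self.sessions: Dict[str, Session] = {}

    def on_request(self, sid: str, url: str, timestamp: float,
                   clicked_ad: Optional[str] = None) -> Session:
        session = self.sessions.setdefault(sid, Session(sid))
        session.record(url, timestamp, clicked_ad)
        return session
```

Keeping raw events and deriving the per-page times on demand leaves the later modeling steps free to choose their own normalization.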


B. Data Collection

ST system uses a web log format that is compatible with the W3C format [17]. The format is as follows:

ST system uses sid (SessionID) to record and recognize the user's session [18], and uid to record and recognize the user's unique ID. Referer is the path from which the current page was reached. From referer and uri we can determine the time the user spent on different pages. For the other parameters, please refer to the document [17].

The display and click of the advertisements have the same log format:

Type is used to distinguish a display from a click. Time, c-ip and referer can be determined by the advertisement engine, while the other parameters are transferred by links. Referer represents the parent link, advid represents the ID of the advertisement and camid represents the ID of the advertisement activity.

C. Pattern Recognition

At this stage, we obtain all the keywords and then cluster them. Here we use a word segmentation method to get the keywords. Firstly, we get the segmentation result from both sides based on dictionary processing. Secondly, some rules are used to disambiguate based on statistics processing. Finally, the preliminary keywords $d = \{t_1, t_2, \ldots, t_Q\}$ from the web space $w = \{p_1, p_2, \ldots, p_L\}$ are saved to the database.

Next, the preliminary keywords are clustered. We suppose the page vector of keywords is

$$
tp = \begin{bmatrix}
w^{tp}_{11} & w^{tp}_{12} & \cdots & w^{tp}_{1L} \\
w^{tp}_{21} & w^{tp}_{22} & \cdots & w^{tp}_{2L} \\
\vdots & \vdots & \ddots & \vdots \\
w^{tp}_{Q1} & w^{tp}_{Q2} & \cdots & w^{tp}_{QL}
\end{bmatrix}
$$

$L$ is the quantity of webpages in web space $w$. From information retrieval theory [18,19], we have the coordinate:

$$
w^{tp}_{ji} = \left(tf^{b}_{ji} + \alpha\, tf^{t}_{ji} + \beta\, tf^{d}_{ji} + \gamma\, tf^{k}_{ji}\right) \cdot \log\!\left(\frac{L}{n_{t_j}}\right), \quad j \in [1, Q],\ i \in [1, L]
$$

$w^{tp}_{ji}$ represents the weight of keyword $t_j$ in page $p_i$, and $tf^{b}_{ji}$, $tf^{t}_{ji}$, $tf^{d}_{ji}$ and $tf^{k}_{ji}$ represent the frequency with which keyword $t_j$ appears in the body, title, description and keywords respectively. $\alpha$, $\beta$ and $\gamma$ are the keyword's importance coefficients in the title, description and keywords. $n_{t_j}$ is the quantity of webpages that include the keyword.

A method named HACM can be used to find the groups, with the Jaccard coefficient as the similarity measure:

$$
sim(tp_i, tp_j) = \frac{\sum_{k=1}^{L} w^{tp}_{ik}\, w^{tp}_{jk}}{\sum_{k=1}^{L} \left(w^{tp}_{ik}\right)^2 + \sum_{k=1}^{L} \left(w^{tp}_{jk}\right)^2 - \sum_{k=1}^{L} w^{tp}_{ik}\, w^{tp}_{jk}}, \quad i, j \in [1, Q]
$$

The keywords are divided into $K$ groups, and the mean keyword web vector

$$
ctp_k = \frac{1}{n_k} \sum_{i=1}^{n_k} tp_{ik}
$$

represents the feature of group $k$, where $n_k$ is the number of keywords in group $k$. Similarly, we get the mean keyword advertisement vector

$$
cta_k = \frac{1}{n_k} \sum_{i=1}^{n_k} ta_{ik},
$$

where $ta_j = \{w^{ta}_{j1}, w^{ta}_{j2}, \ldots, w^{ta}_{jM}\}$ is the advertisement vector of keyword $t_j$ and $ta_{ik}$ is the advertisement vector of the $i$-th keyword belonging to group $k$.

Similarly, the users' sessions can be clustered. $u = \{u_1, u_2, \ldots, u_N\}$ represents all the users, and $sa_i = \{w^{sa}_{i1}, w^{sa}_{i2}, \ldots, w^{sa}_{iM}\}$ is the session vector over advertisements, where $w^{sa}_{ij}$ represents the number of times that advertisement $j$ is requested during session $i$ ($w^{sa}_{ij}$ can be normalized to $[0,1]$). The webpage access vector $sp_i$ of a session is defined analogously over the $L$ pages.

After clustering with HACM, we get the Jaccard coefficient, the group mean user session vector and the mean advertisement access vector as follows:

$$
sim(sp_i, sp_j) = \frac{\sum_{k=1}^{L} w^{sp}_{ik}\, w^{sp}_{jk}}{\sum_{k=1}^{L} \left(w^{sp}_{ik}\right)^2 + \sum_{k=1}^{L} \left(w^{sp}_{jk}\right)^2 - \sum_{k=1}^{L} w^{sp}_{ik}\, w^{sp}_{jk}},
$$

$$
csp_{k'} = \frac{1}{n_{k'}} \sum_{i=1}^{n_{k'}} sp_{ik'}, \qquad csa_{k'} = \frac{1}{n_{k'}} \sum_{i=1}^{n_{k'}} sa_{ik'}.
$$

D. User Modeling

In ST system, there are two kinds of user model. One is the short-run user model and the other is the long-run user model.

1) Short-run user model

In the short-run user model, the coordinate of the webpage access vector $asp_i = \{w^{asp}_{i1}, w^{asp}_{i2}, \ldots, w^{asp}_{iL}\}$ is as follows:

$$
w^{asp}_{ij} = \begin{cases}
1, & \text{if } p_j \in w(1), \\
\delta \cdot w^{asp'}_{ij}, & \text{if } p_j \in w(2), \\
0, & \text{if } p_j \in w(0).
\end{cases}
$$

$\delta \in [0,1]$ is constant, $w(1)$ is the webpage collection which has just been accessed, $w(2)$ is the webpage collection which was accessed before, and $w(0)$ is the


webpage collection which has not been accessed yet. We can assume $\delta = 0.7$; $w^{asp'}_{ij}$ denotes the value of $w^{asp}_{ij}$ before the update.

The vector of session time, $aspt_i = \{w^{aspt}_{i1}, w^{aspt}_{i2}, \ldots, w^{aspt}_{iL}\}$, is computed by summing the time the user spends on each page and is then normalized so that $w^{aspt}_{ij} \in [0,1]$, $j \in [1, L]$.

The coordinate of the advertisement display vector $asa_i = \{w^{asa}_{i1}, w^{asa}_{i2}, \ldots, w^{asa}_{iM}\}$ is as follows:

$$
w^{asa}_{ij} = \begin{cases}
1, & \text{if } a_j \in a(1), \\
\mu \cdot w^{asa'}_{ij}, & \text{if } a_j \in a(2), \\
0, & \text{if } a_j \in a(0).
\end{cases}
$$

$\mu \in [0,1]$ is constant. Here we assume $\mu = 0.8$. $a(1)$ is the advertisement collection which has just been displayed, $a(2)$ is the advertisement collection which was displayed before, and $a(0)$ is the advertisement collection which has not been displayed yet. $w^{asa'}_{ij}$ denotes the value of $w^{asa}_{ij}$ before the update.

The coordinate of the user's search behavior vector during the current session, $q_i = \{w^{q}_{i1}, w^{q}_{i2}, \ldots, w^{q}_{iQ}\}$, is as follows:

$$
w^{q}_{ij} = \begin{cases}
1, & \text{if } t_j \in k(1), \\
\rho \cdot w^{q'}_{ij}, & \text{if } t_j \in k(2), \\
0, & \text{if } t_j \in k(0).
\end{cases}
$$

$\rho \in [0,1]$ is constant. Here we assume $\rho = 0.85$. $k(1)$ is the keyword collection which has just been searched, $k(2)$ is the keyword collection which was searched before, and $k(0)$ is the keyword collection which has not been searched yet. $w^{q'}_{ij}$ denotes the value of $w^{q}_{ij}$ before the update.

2) Long-run user model

In the long-run user model, all the history session vectors of user $u_i \in u$ can be recorded because of the unique ID in ST system, including the session vector $sp_{u_i}$ that represents the user's webpage accesses, $sa_{u_i}$ that represents the user's advertisement accesses, $q_{u_i}$ that represents the keywords of the user's searches and $spt_{u_i}$ that represents the duration of the sessions. Then we have

$$
sp_{u_i} = \langle s^{1}p_{u_i}, s^{2}p_{u_i}, \ldots, s^{n_{u_i}}p_{u_i} \rangle = \{w^{sp}_{u_i 1}, w^{sp}_{u_i 2}, \ldots, w^{sp}_{u_i L}\}
$$

$s^{j}p_{u_i}$ represents session $j$ of user $u_i$, and $w^{sp}_{u_i y}$ represents the long-run weight of the interest of user $u_i$ in page $y$. Similarly, we have $sa_{u_i}$, $q_{u_i}$ and $spt_{u_i}$.

$sp_{u_i}$, $sa_{u_i}$ and $q_{u_i}$ can be calculated as follows:

$$
sp_{u_i} = \sum_{j=1}^{n_{u_i}} l^{\,j-1} \cdot s^{j}p_{u_i},
$$

$$
sa_{u_i} = \sum_{j=1}^{n_{u_i}} h^{\,j-1} \cdot s^{j}a_{u_i},
$$

$$
q_{u_i} = \sum_{j=1}^{n_{u_i}} D^{\,j-1} \cdot q^{j}_{u_i},
$$

$l \in [0,1]$, $h \in [0,1]$ and $D \in [0,1]$ are the discount coefficients. We assume $l = 0.6$, $h = 0.65$ and $D = 0.7$. $j$ is smaller the nearer the session is to the present, and the $j$ of the user's current session is 1.

As in the short-run user model, we consider the time that the user spends on each webpage. Similarly, we normalize the vector $spt_{u_i} = \sum_{j=1}^{n_{u_i}} s^{j}pt_{u_i}$ until the coordinates satisfy $w^{spt}_{u_i j} \in [0,1]$, $j \in [1, L]$.

E. User Matching

There are also two ways to match the user to the proper user model. One is based on the user's current interest and the other is based on the user's long-run interest.

1) Matching based on current interest

To match the user to the proper user model, we should find the groups $k$ and $k'$ with the largest $\cos(asi_i, ctp_k)$ and $\cos(asp_i, csp_{k'})$. $asi_i$ is the vector that represents the user's interest ($asi_i = asp_i \otimes aspt_i$, where the symbol $\otimes$ means that the corresponding coordinates of the current session vector $asp_i$ are multiplied by those of the webpage time vector $aspt_i$). The larger the cosine value, the higher the similarity. We can calculate the cosine values as follows:

$$
\cos(asi_i, ctp_k) = \frac{\sum_{j=1}^{L} w^{asi}_{ij}\, w^{ctp}_{kj}}{\sqrt{\sum_{j=1}^{L} \left(w^{asi}_{ij}\right)^2} \cdot \sqrt{\sum_{j=1}^{L} \left(w^{ctp}_{kj}\right)^2}},
$$

$$
\cos(asp_i, csp_{k'}) = \frac{\sum_{j=1}^{L} w^{asp}_{ij}\, w^{csp}_{k'j}}{\sqrt{\sum_{j=1}^{L} \left(w^{asp}_{ij}\right)^2} \cdot \sqrt{\sum_{j=1}^{L} \left(w^{csp}_{k'j}\right)^2}}.
$$

As to the vector $q_i$, we have another vector $aqa_i = \{w^{aqa}_{i1}, w^{aqa}_{i2}, \ldots, w^{aqa}_{iM}\}$ that represents the influence of the user's searching behavior on the weight of each advertisement. Then we get

$$
aqa_i = \frac{\sum_{j=1}^{Q} w^{q}_{ij} \cdot ta_j}{\sum_{j=1}^{Q} w^{q}_{ij}}.
$$

2) Matching based on long-run interest


There are two methods for matching based on long-run interest. The first one considers not only the current access behavior of the user but also the session history. In this case we use $sp_{u_i}$, $q_{u_i}$ and $spt_{u_i}$ instead of $asp_i$, $q_i$ and $aspt_i$. Similar to matching based on current interest, we can get $asi_{u_i}$, $cta_k$, $csa_{k'}$ and $aqa_{u_i}$. The second one considers the user's interest reflected by the history. In this case, assume $j = 0$ when calculating $sp_{u_i}$, $sa_{u_i}$, $aspt_i$ and $q_{u_i}$. Therefore, $asi_{u_i}$, $cta_k$ and $aqa_{u_i}$ are all we need.

F. Targeting

After the user's user model is known, it is time to choose the right advertisement for him/her with some advertisement rules. Nowadays, some new rules have appeared along with the development of online advertising, such as CPI (Cost Per Impression), CPM (Cost Per Mille, i.e. per thousand impressions), CPC (Cost Per Click), CPS (Cost Per Sale) and CPA (Cost Per Action). Now we discuss the method for targeting.

1) Targeting with the current interest

After getting all the relevant vectors, we can create a sort vector $rank_i$ for personalized advertisement. The vector can be calculated as follows:

$$
rank_i = (1 - sa_i) \otimes (1 - asa_i) \otimes ep_i \otimes ap \otimes (\xi\, cta_k + \psi\, aqa_i + \zeta\, csa_{k'}),
$$

$ep_i = \{w^{ep}_{i1}, w^{ep}_{i2}, \ldots, w^{ep}_{iM}\}$ represents whether each advertisement is allowed to be displayed during session $i$ of the user:

$$
w^{ep}_{ij} = \begin{cases}
1, & \text{if } w^{epu_i}_{j} > w^{eq}_{ij}, \\
0, & \text{otherwise.}
\end{cases}
$$

The coefficients $\xi$, $\psi$ and $\zeta$ represent the importance of the vectors $cta_k$, $aqa_i$ and $csa_{k'}$ in $rank_i$. Here we assume $\xi = 1$, $\psi = 1.2$ and $\zeta = 0.9$.

$rank_i$ includes nearly all the necessary information. For example, $(1 - sa_i)$ gives the advertisements that the user has already clicked a smaller weight in later targeting, $(1 - asa_i)$ gives the most recently displayed advertisements a smaller weight, $ep_i$ and $ap$ are parts of the advertisement rules, and $cta_k$, $aqa_i$ and $csa_{k'}$ reflect the content and the access behavior mode that the user is interested in. Every HTTP request updates $rank_i$, which therefore reflects the characteristics of the user in time.

2) Targeting with the long-run interest

For the condition that considers both the current situation and the history, $rank_i$ will also reflect the session history, and the recent period will get the smaller weight. In this situation, we have:

$$
rank_{u_i} = (1 - sa_{u_i}) \otimes (1 - asa_i) \otimes ep_i \otimes ap \otimes (\xi\, cta_k + \psi\, aqa_{u_i} + \zeta\, csa_{k'})
$$

$asa_i$ and $ep_i$ are only associated with the current situation, but the others are associated with the history.

For the condition that only considers the history, we have:

$$
rank_{u_i} = (1 - sa_{u_i}) \otimes (1 - asa_i) \otimes ep_i \otimes ap \otimes (\xi\, cta_k + \psi\, aqa_{u_i})
$$

III. SYSTEM IMPLEMENTATION AND EVALUATION

A. Multi-agent Architecture

A well designed architecture ensures good system performance as well as flexibility and expansibility, so that updates and transfers of the system can be implemented with high efficiency. ST system uses a multi-agent architecture to achieve this goal, as shown in Fig.3.

Fig.3 Multi-agent Architecture

Ten agents are used in this system and form the multi-agent architecture. The functions of each agent are as follows:

Agent_P1: preprocessing agent for visiting sessions. It processes the log file collected by the system, and forms a visiting vector of sessions on advertisements.

Agent_P2: preprocessing agent for history sessions. It processes the log file collected by the system, and forms a visiting vector of sessions on web pages.

Agent_D1: mining agent for advertisement visitation. It analyzes the visiting vector of sessions on advertisements, and forms a visiting pattern.

Agent_D2: mining agent for web page visitation. It analyzes the visiting vector of sessions on web pages, and forms a visiting pattern.

Agent_M: monitoring agent for sessions. It detects and manages sessions and users' behaviors.


Agent_C: processing agent for integration of information. It processes information from legacy data and external data, and integrates them into the central database.

Agent_W1: downloading and renewing agent for web pages. It downloads and renews the relevant web pages.

Agent_W2: preprocessing agent for web contents. It extracts keywords and forms the keyword vector of web contents.

Agent_W3: mining agent for web contents. It uses the vector of web contents to form clusters of keywords, and establishes the concept space of web contents.

Agent_R: processing agent for advertisement rule management. It controls advertising and targeting by adjusting the rules.

B. Evaluation Test

We evaluated the functions of this system with a test session. This session included 9 steps, as shown in Fig.4. Table 1 provides its visiting pathway, visiting pattern, interested topics, displayed advertisements, and the clicks on these advertisements.

Table.1 Process of Test Session
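Each step of such a test session exercises the update, matching and ranking rules of Sections II.D to II.F. The sketch below illustrates those rules on plain Python lists. It is an illustration under stated assumptions rather than the ST system's implementation: all function names are hypothetical, the vectors are assumed to be non-empty and of equal length, and the closest group is taken to be the one with the largest cosine, which is the usual convention for cosine similarity.

```python
import math
from typing import List, Sequence, Set


def decay_update(prev: Sequence[float], just_seen: Set[int], factor: float) -> List[float]:
    """Short-run update: 1 for items just accessed, factor * old weight otherwise.

    Items never accessed have an old weight of 0 and therefore stay at 0.
    """
    return [1.0 if j in just_seen else factor * w for j, w in enumerate(prev)]


def discounted_history(sessions: Sequence[Sequence[float]], discount: float) -> List[float]:
    """Long-run vector: sum of discount**(j-1) * session_j, with sessions[0]
    being the current session (j = 1). Assumes at least one session."""
    total = [0.0] * len(sessions[0])
    for age, vec in enumerate(sessions):        # age = j - 1
        weight = discount ** age
        total = [t + weight * v for t, v in zip(total, vec)]
    return total


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_group(vec: Sequence[float], centroids: Sequence[Sequence[float]]) -> int:
    """Index of the group centroid with the largest cosine to vec."""
    return max(range(len(centroids)), key=lambda k: cosine(vec, centroids[k]))


def rank(sa, asa, ep, ap, cta_k, aqa, csa_k, xi=1.0, psi=1.2, zeta=0.9) -> List[float]:
    """Coordinate-wise (1 - sa) * (1 - asa) * ep * ap * (xi*cta_k + psi*aqa + zeta*csa_k)."""
    return [
        (1 - s) * (1 - d) * e * p * (xi * c + psi * q + zeta * g)
        for s, d, e, p, c, q, g in zip(sa, asa, ep, ap, cta_k, aqa, csa_k)
    ]


if __name__ == "__main__":
    # Toy data: 3 pages, 2 keyword groups, 2 advertisements (all values made up).
    asp = decay_update([1.0, 0.0, 0.0], just_seen={1}, factor=0.7)
    aspt = [0.2, 0.8, 0.0]
    asi = [a * t for a, t in zip(asp, aspt)]    # the coordinate-wise product
    k = best_group(asi, [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
    scores = rank(sa=[0.0, 1.0], asa=[0.8, 0.0], ep=[1, 1], ap=[1, 1],
                  cta_k=[0.5, 0.5], aqa=[0.0, 0.0], csa_k=[0.0, 0.0])
    print(k, scores)
```

The default coefficients are the values assumed in the text (1, 1.2 and 0.9); the advertisement with the highest score would be the one to display next.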

First, the user visited the homepage. At the beginning of this session, ST system found the unique user ID from his cookies, and matched his visiting behavior with Pattern No.2. According to the keywords of the visited web pages, ST system then matched his interested topics with Topic No.8. This is an entertainment topic, so advertisement A from the music base and advertisement B from iPod were recommended to this user. In the second step, the user clicked and was linked to a music web page. He was still matched with Pattern No.2 and Topic No.8, so advertisement C and advertisement D from the similar entertainment advertisement base were recommended. But in the third step, he visited the entertainment web page, and his visiting behavior pattern changed and was matched with Pattern No.14. This is a pattern of young movie fans, so a famous movie of advertisement D was recommended. Proceeding in this way, ST system recommended 16 advertisements in this session. By matching and tracing the visiting behavior pattern and interested topics of users, ST system can make a recommendation strategy in real time.

Fig.4 Test Session and Advertisement Targeting

To evaluate the efficiency and effectiveness of ST system, we carried out a real test of 24 days. We applied this system to a real website. Advertisements displayed on this website were recommended by manual operation in the first 8 days, and by ST system in the last 16 days.

C. Evaluation Indexes and Contrastive Results

The evaluation of online advertisement is different from that of traditional advertisement. It considers the visitor's interactive response to a specified advertisement. Fig.5 outlines the transformation of advertisement effects on the Internet.

Fig.5 Transformation of Advertisement Effects


Advertising attracts visitors' attention and creates an impression on these visitors, which leads to some clicks and further behaviors by the interested visitors. In this process, the perceived impression and the percentage of clicks by visitors are key factors in the transformation of advertisement effects [25].

A series of indexes known as CP(X) are usually applied to evaluate online advertisement and to decide how its cost is calculated. In 1995, InfoSeek and Netscape first used CPM in their online advertisement. In 1996, Yahoo and P&G presented CPC. Thereafter, Hoffman and Novak established the exposure metrics and the interactivity metrics [26]. Up to now, CP(X) has been widely used by Google and most famous websites as the representative of interactivity metrics. CP(X) includes the following important indexes:

CPT - Cost Per Time,
CPI - Cost Per Impression,
CPM - Cost Per Mille (per thousand impressions),
CPC - Cost Per Click,
CPS - Cost Per Sale,
CPA - Cost Per Action.

In CPC, the important index CTR (Click Through Rate) is usually applied to evaluate the advertising effect:

$$
\mathrm{CTR} = \frac{\text{Clicks}}{\text{Impressions}}
$$

In our evaluation, the well-known indexes CTR (Click Through Rate), CPC (Cost Per Click) and CPA (Cost Per Action) were adopted to give the contrastive results in Table.2 and Table.3.

The contrastive results of the dynamic changes over the 24 test days are shown in Fig.6 and Fig.7.

Fig.6 Change of CTR and CPC

By simple analysis of the statistics, ST system has improved the efficiency and effectiveness of advertisements. In this system, users are not required to input any sensitive data, so their privacy is well protected.
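As a worked illustration of the three indexes used in the evaluation, the sketch below computes CTR from the definition given above and, as an assumption, computes CPC and CPA with their usual definitions (total cost divided by clicks and by actions respectively). The function names and the figures in the usage example are hypothetical and are not the test data of this paper.

```python
import math


def ctr(clicks: int, impressions: int) -> float:
    """Click Through Rate = Clicks / Impressions."""
    return clicks / impressions if impressions else 0.0


def cpc(cost: float, clicks: int) -> float:
    """Cost Per Click, assumed to be total cost / clicks."""
    return cost / clicks if clicks else math.inf


def cpa(cost: float, actions: int) -> float:
    """Cost Per Action, assumed to be total cost / actions."""
    return cost / actions if actions else math.inf


def percent_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Made-up daily averages for a manual period and a targeted period.
    manual = {"impressions": 10000, "clicks": 100, "actions": 10, "cost": 200.0}
    targeted = {"impressions": 10000, "clicks": 200, "actions": 20, "cost": 200.0}
    print(percent_change(ctr(manual["clicks"], manual["impressions"]),
                         ctr(targeted["clicks"], targeted["impressions"])))
    print(percent_change(cpc(manual["cost"], manual["clicks"]),
                         cpc(targeted["cost"], targeted["clicks"])))
    print(percent_change(cpa(manual["cost"], manual["actions"]),
                         cpa(targeted["cost"], targeted["actions"])))
```

A rise in CTR at constant cost and impressions lowers CPC, which is why the two indexes are reported together.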

Table.2 Results of Statistics

Fig.7 Change of Actions and CPA

Table.3 Squares and F Test

From Table.3, we can find that the F parameters of CTR, CPC and CPA are 245.358, 402.841 and 62.861 respectively, while P < 0.001. This indicates that ST system has a significant impact on recommendation. From Table.2, we can find that the average CTR increased by 106.81%, while CPC and CPA decreased by 48.36% and 48.15% respectively.

IV. SYSTEM APPLICATION AND PROSPECT

The main motives for online advertising may differ, such as brand promotion, increasing the number of visitors to a website, direct sales of products, and support services for other distribution [25]. ST system can adjust its targeting strategies to satisfy the demands of different motives.

ST system has the potential to be applied in the following online advertising modes:

(1) Port Advertising
In this mode, online advertising is carried out on some website ports. ST system can help website ports select advertisements and match online advertising to most visitors' behaviors.

(2) Agency Advertising
In this mode, the agency has usually ordered some specified advertising schedules from websites, and should decide the solution that maximizes its profits. ST system


can help the agency match schedules to an optimized solution.

(3) Owner Advertising
In this mode, online advertising is carried out on the owner's website. ST system can help the website match advertisements to individual visitors by the targeting technique.

V. CONCLUSION

Targeting techniques are applied to improve the efficiency and effectiveness of online advertising. The focus of current targeting has shifted from content targeting, frequency targeting, time targeting and geographical targeting to extended targeting based on user characteristics, with high efficiency, effectiveness and protection of privacy.

This paper presented a new smart targeting system for online advertising. It uses Web content mining and Web usage mining techniques to track and mine the user behaviors hidden in historical and current user sessions, and designs interfaces for maintaining the advertising rules, which control the targeting system so that it automatically delivers personalized advertisements. All the related knowledge and rules are integrated in one unified vector space model. The targeting process does not require any sensitive data to be input by users, so their privacy is protected. This system has the potential to be applied in several online advertising modes, such as Port Advertising, Agency Advertising and Owner Advertising.

However, this system still needs to be improved and perfected in practical application. When it is applied to a large website, the scale of the vectors in pattern recognition will become huge, so optimization of data processing and fast algorithms are further research work on this system. Complex adaptive technology and rules, such as ANN (Artificial Neural Network) technology and multi-objective optimization rules, are also worth exploring in this work.

ACKNOWLEDGMENT

This research was supported by National High-tech R & D Program (863 Program) of China (No.2008AA04Z127), National Natural Science Foundation of China (No.70401010) and Shanghai Leading Academic Discipline Project (No.B210).

REFERENCES

[1] iResearch, China Online Advertising Research Report 2008-2009, http://www.iresearch.com.cn, Mar. 18 2009.
[2] Fink J. and Kobsa A., “A review and analysis of commercial user modeling servers for personalization on the world wide web,” User Modeling and User-Adapted Interaction, vol. 10(3/4), pp. 209-249, 2000.
[3] Allen C., Kania D., and Yaeckel B., Internet World Guide to One-To-One Web Marketing, New York: John Wiley and Sons, 1998.
[4] Aggarwal C. C., Wolf J. L., and Yu P. S., “A framework for the optimizing of WWW advertising,” LNCS, vol. 1402, pp. 1-10, 1998.
[5] Mobasher B., Dai H., Luo T., et al., “Effective personalization based on association rule discovery from Web usage data,” in Proceedings of WIDM 2001, Chiang H. L. and Lim E. P., Eds. Atlanta: ACM, 2001, pp. 9-15.
[6] Milani A., “Minimal knowledge anonymous user profiling for personalized services,” LNAI, vol. 3533, pp. 709-711, 2005.
[7] Milani A., Morici C., and Niewiadomski R., “Fuzzy matching of user profiles for a banner engine,” LNCS, vol. 3045, pp. 433-443, 2004.
[8] Milani A. and Marcugini S., “An architecture for evolutionary adaptive web systems,” LNCS, vol. 3828, pp. 444-454, 2005.
[9] Zeng Chun, Xing Chun-xiao, and Zhou Li-zhu, “A survey of personalization technology,” Journal of Software, vol. 13(10), pp. 1952-1961, Oct. 2002.
[10] Yan Ying, Wang Daling, and Yu Ge, “Association rules mining algorithm of webpage views for personalized recommendation,” Computer Engineering, vol. 31(1), pp. 79-81, Jan. 2005.
[11] Zhang Bingqi, “A collaborative filtering recommendation algorithm based on domain knowledge,” Computer Engineering, vol. 31(21), pp. 7-10, Nov. 2005.
[12] Wu Lihua and Liu Lu, “User profiling for personalized recommending systems—A review,” Journal of The China Society For Scientific and Technical Information, vol. 25(1), pp. 55-62, Jan. 2006.
[13] Baudisch P. and Leopold D., “Attention, indifference, dislike, action: web advertising involving users,” Netnomics, vol. 2(1), pp. 75-83, 2000.
[14] Juels A., “... and privacy too,” LNCS, vol. 2020, pp. 408-424, 2001.
[15] Berendt B. and Teltzrow M., “Addressing users’ privacy concerns for improving personalization quality: towards an integration of user studies and algorithm evaluation,” LNAI, vol. 3169, pp. 69-88, 2005.
[16] Kazienko P. and Adamski M., “Personalized web advertising method,” LNCS, vol. 3137, pp. 146-155, 2004.
[17] Hallam-Baker P. M. and Behlendorf B., “W3C working draft,” http://www.w3.org/pub/WWW/TR/WD-logfile-960323.html, Mar. 20 2007.
[18] Salton G., Automatic Text Processing-The Transformation, Analysis, and Retrieval of Information by Computer, MA: Addison-Wesley, 1989.
[19] Kazienko P., Clustering of Hypertext Documents Based on Flow Equivalent Trees, Wroclaw: Wroclaw University of Technology, 2000.
[20] Schwarzkopf E., “An adaptive web site for the UM2001 conference,” in Proceedings of the UM2001, Gmytrasiewicz and Vassileva J., Eds. Berlin Heidelberg: Springer-Verlag, 2001, pp. 77-86.
[21] Han J. W. and Kamber M., Data Mining: Concepts and Techniques, MA: Morgan Kaufmann Publishers, 2001.
[22] Zhu T., “Using markov chains for structural link prediction in adaptive web sites,” LNAI, vol. 2109, pp. 298-300, 2001.
[23] Salton G., Automatic Text Processing-The Transformation, Analysis, and Retrieval of Information by Computer, MA: Addison-Wesley, 1989.
[24] Salton G., Wong A., Yang C. S., “A vector space model for automatic indexing,” Communications of the ACM, vol. 18(5), pp. 613-620, 1975.
[25] Sun Tao, Research on User Behavior Targeting for Online Advertising [D]. Shanghai: Fudan University, 2007.
[26] Hoffman D. L., Novak T. P., “Advertising pricing models for the World Wide Web,” in Internet Publishing and Beyond: the Economics of Digital Information and Intellectual Property, Hurley D., Kahin B. and Varian H., Eds., Cambridge, UK: MIT Press, 2000.

Weihui Dai received his BS degree in Automation Engineering in 1987, his MS degree in Automobile Electronics in 1992, and his PhD in Biomedical Engineering in 1996, all from Zhejiang University, China. He is currently an Associate Professor at the Department of Information Management and Information Systems, School of Management, Fudan University, China. He has published more than 100 journal papers and conference articles in Software Engineering, Information Systems, Service Management, Net-ecology, e-Business, etc. Dr. Dai became a member of IEEE in 2003, a senior member of China Computer Society in 2004, and a senior member of China Society of Technology Economics in 2004.

Xingyun Dai received her BS degree in Software Engineering in 2008 from East China Normal University, China. She is currently a master student at School of Management, Fudan University, China.

Tao Sun received his BS degree in Information Management and Information Systems in 2004 from Zhejiang University, China, and his MS degree in Information Management and Information Systems in 2007 from Fudan University, China.
