Identifying and Utilizing Contextual Information for Banner Scoring in Display Advertising
Eindhoven University of Technology

MASTER'S THESIS

Author: Vitaly Kliger (0755759)
Supervisors: Dr. Jeroen de Knijf, Dr. Mykola Pechenizkiy

Submitted in partial fulfillment of the requirements for the degree of Master of Science at Eindhoven University of Technology

Eindhoven, 2012

Abstract

The goal of this project is to find a way of incorporating contextual information into web analytics. Web analytics is a relatively new area that does not have an established definition of context. Hence, we conducted a literature survey of state-of-the-art techniques, both in web analytics and in the related areas of data mining and recommendation systems.
We propose a definition of context that allows flexibility in defining contextual features from a management perspective and in evaluating them with data mining tools. We then propose a framework for studying contextual features and applying them in web applications. The framework represents a seamless data processing pipeline with website logs as input and a model as output.

The major part of the work is devoted to a subgroup discovery algorithm suitable for web analytics tasks. The algorithm builds a model that describes significant regions of the weblog data in which behavior is unusual with respect to a property of interest.

The case study is devoted to Kliknieuws.nl, a Dutch online local news company whose business model relies on generating revenue by publishing banners on its web pages. Currently, Kliknieuws.nl charges advertisers for impressions only. However, it intends to adjust its business model and charge for clicks too. Hence, the goal of the subgroup discovery algorithm developed here is to find sets of settings, relevant both to the website itself and to the external environment, under which the probability of a click on a banner is higher or lower. Kliknieuws' banner placement algorithm will then be adjusted so that pay-per-click banners are shown to visitors when they are more likely to be clicked, and pay-per-impression banners are shown in the opposite situation.

We propose some improvements relevant to the performance and quality of the model, as well as an evaluation framework. We also discuss and evaluate possible explanations for why people adopt certain behavior with respect to clicking on banners in particular situations.

Table of Contents

Abstract
1. Introduction
  1.1. Motivation
  1.2. Research Questions
  1.3. Thesis Structure
2. Preliminaries
  2.1. Context
  2.2. Web Analytics
    2.2.1. Overview
    2.2.2. Data Collection
    2.2.3. Data Presentation
    2.2.4. Metrics
  2.3. Data Mining
    2.3.1. Recommendation Systems
    2.3.2. Expanding Recommendation Systems with Context
      2.3.2.1. Computer Science View
      2.3.2.2. Business View
  2.4. Taxonomy of Contextual Features
3. Methodology
  3.1. Subgroup Discovery Algorithm
    3.1.1. Introduction
    3.1.2. Main Elements
    3.1.3. Challenges
    3.1.4. Specifics of the Dataset and Task
    3.1.5. Formal Notations
    3.1.6. Quality Measures
    3.1.7. Search Strategies
    3.1.8. Pruning Strategies
    3.1.9. Redundancy Removing Strategies
    3.1.10. Pseudo-code of the Algorithm
    3.1.11. Applying the Rules
    3.1.12. Summary of Settings Used
  3.2. Implementation
  3.3. Evaluation Framework
4. Case Study
  4.1. Kliknieuws Business Case
  4.2. Dataset Description
  4.3. Defining View on the Data
5. Experiments
  5.1. Testbed Setup
  5.2. Results