Identifying and Utilizing Contextual Information for Banner Scoring in Display Advertising

Total Page:16

File Type:pdf, Size:1020Kb

Identifying and Utilizing Contextual Information for Banner Scoring in Display Advertising Eindhoven University of Technology MASTER Identifying and utilizing contextual information for banner scoring in display advertising Kliger, V.P. Award date: 2012 Link to publication Disclaimer This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain MASTER'S THESIS Identifying and Utilizing Contextual Information for Banner Scoring in Display Advertising Author: Vitaly Kliger (0755759) [email protected] Supervisors: Dr. Jeroen de Knijf Dr. Mykola Pechenizkiy SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE AT EINDHOVEN UNIVERSITY OF TECHNOLOGY EINDHOVEN 2012 2 Abstract The goal of the current project is to find a way of incorporating contextual information into web analytics. Web-analytics is a relatively new area, which does not have an established definition of context. Hence, literature survey of state of the art technologies as in the area of web-analytics, as in related areas of data mining and recommendation systems was conducted. We propose a definition of context allows for flexibility with defining contextual features from management, and evaluating them with data mining tools. Then, a framework for studying contextual features and applying them in web-application is proposed. The framework represents a seamless line of data processing with website logs as input and a model as output. The major part of the work is devoted to subgroup discovery algorithm, suitable for web- analytics task. The algorithm is capable of building a model, describing significant regions of the weblog data, describing rather unusual behavior with respect to a property of interest. The case study is devoted to Kliknieuws.nl, a Dutch online local news company whose underlying business model implies generating revenue by publishing banners on its web pages. Currently, Kliknieuws.nl charges advertisers for impressions only. However, it has an ambition to adjust its business model and charge for clicks too. Hence, the goal of the subgroup discovery algorithm developed is find a set of setting, relevant both to website itself and to external environment, under with probability of click on a banner is higher or lower. Then, Kliknieuws’ banner placement algorithm will be adjusted in such a way that pay-per-click banners will be shown to visitors when they are more likely to be clicked, and pay-per-impression banners will be shown in the opposite situation. We proposed some improvements, relevant to performance and quality of the model, as well as evaluation framework. We also discuss and evaluate diverse phenomena why in this or that situation, people adopted certain behavior with respect to clicking on banners. 3 Table of Content Abstract ......................................................................................................................................................... 2 1. Introduction ............................................................................................................................................. 7 1.1. Motivation ......................................................................................................................................... 7 1.2. Research Questions ........................................................................................................................... 9 1.3. Thesis Structure ................................................................................................................................. 9 2. Preliminaries .......................................................................................................................................... 11 2.1. Context ............................................................................................................................................ 11 2.2. Web Analytics .................................................................................................................................. 11 2.2.1. Overview ................................................................................................................................... 11 2.2.2. Data Collection .......................................................................................................................... 13 2.2.3. Data Presentation ..................................................................................................................... 16 2.2.4. Metrics ...................................................................................................................................... 17 2.3. Data Mining ..................................................................................................................................... 18 2.3.1. Recommendation Systems ....................................................................................................... 19 2.3.2. Expanding Recommendation Systems with Context ................................................................ 20 2.3.2.1. Computer Science View .................................................................................................... 20 2.3.2.2. Business View.................................................................................................................... 25 2.4. Taxonomy of Contextual Features .................................................................................................. 27 3. Methodology .......................................................................................................................................... 30 3.1. Subgroup Discovery Algorithm ........................................................................................................ 32 3.1.1. Introduction .............................................................................................................................. 32 3.1.2. Main Elements .......................................................................................................................... 34 3.1.3. Challenges ................................................................................................................................. 35 3.1.4. Specifics of the Dataset and Task .............................................................................................. 37 3.1.5. Formal Notations ...................................................................................................................... 37 3.1.6. Quality Measures ...................................................................................................................... 41 3.1.7. Search strategies ....................................................................................................................... 44 3.1.8. Pruning strategies ..................................................................................................................... 45 3.1.9. Redundancy Removing Strategies ............................................................................................ 46 3.1.10. Pseudo-code of the algorithm ................................................................................................ 48 3.1.11. Applying the rules ................................................................................................................... 49 3.1.12. Summary of Settings Used ..................................................................................................... 49 3.2. Implementation ............................................................................................................................... 50 3.3. Evaluation Framework ..................................................................................................................... 51 4. Case Study .............................................................................................................................................. 53 4.1. Kliknieuws Business Case ................................................................................................................ 53 4.2. Dataset Description ......................................................................................................................... 53 4.3. Defining View on the Data ............................................................................................................... 54 5. Experiments ........................................................................................................................................... 57 5.1. Testbed Setup .................................................................................................................................. 57 4 5.2. Results ............................................................................................................................................
Recommended publications
  • HTTP Cookie - Wikipedia, the Free Encyclopedia 14/05/2014
    HTTP cookie - Wikipedia, the free encyclopedia 14/05/2014 Create account Log in Article Talk Read Edit View history Search HTTP cookie From Wikipedia, the free encyclopedia Navigation A cookie, also known as an HTTP cookie, web cookie, or browser HTTP Main page cookie, is a small piece of data sent from a website and stored in a Persistence · Compression · HTTPS · Contents user's web browser while the user is browsing that website. Every time Request methods Featured content the user loads the website, the browser sends the cookie back to the OPTIONS · GET · HEAD · POST · PUT · Current events server to notify the website of the user's previous activity.[1] Cookies DELETE · TRACE · CONNECT · PATCH · Random article Donate to Wikipedia were designed to be a reliable mechanism for websites to remember Header fields Wikimedia Shop stateful information (such as items in a shopping cart) or to record the Cookie · ETag · Location · HTTP referer · DNT user's browsing activity (including clicking particular buttons, logging in, · X-Forwarded-For · Interaction or recording which pages were visited by the user as far back as months Status codes or years ago). 301 Moved Permanently · 302 Found · Help 303 See Other · 403 Forbidden · About Wikipedia Although cookies cannot carry viruses, and cannot install malware on 404 Not Found · [2] Community portal the host computer, tracking cookies and especially third-party v · t · e · Recent changes tracking cookies are commonly used as ways to compile long-term Contact page records of individuals' browsing histories—a potential privacy concern that prompted European[3] and U.S.
    [Show full text]
  • The Next Breathtaking Installment in The
    v v THE NEXT BREATHTAKING INSTALLMENT IN THE #1 BESTSELLING AMULET SERIES HAS ARRIVED! FALL Cover 2018 illustration SCHOLASTIC Art © Dav Pilkey. DOG MAN and related designs are trademarks and/or registered trademarks of Dav Pilkey. Dav of trademarks registered and/or trademarks designs are DOG MAN and related Pilkey. Art © Dav HARDCOVER & PAPERBACK BOOKS SCH OLASTIC HARDCOVER & PAPERBACK BOOKS & PAPERBACK HARDCOVER OLASTIC FALL 2018 See page 19! SCHOLASTIC Also available 3367548 Scholastic Canada Ltd. Please recycle this catalogue. 604 King St. West, Toronto, ON. M5V 1E1 1-800-268-3848 www.scholastic.ca Follow us: ScholasticCda ScholasticCanada ScholasticCda ScholasticCda ScholasticCda CELEBRATING 20 YEARS OF MISCHIEVOUS DAVID! A TIMELY, WHEN THERE’S TROUBLE, GUT-WRENCHING NOVEL YOU CAN BET FROM YA MASTER “DAVID DID IT!” KODY KEPLINGER. David Shannon’s beloved character is back, and up to his usual mischief… this time taunting In the aftermath of a school shooting, his older brother! six survivors grapple with loss, trauma, and the truth about that day. see pg 53 see pg 71 A POIGNANT AND MOVING Diary of a Wimpy Kid NEW YA MEMOIR FROM AWARD-WINNING James Bond meets with this new fully-illustratedMac Barnett! GRAPHIC NOVELIST JARRETT KROSOCZKA. chapter book series by Before Mac Barnett was an AUTHOR, “As with any honest memoir, the he was a KID. And while he was a KID, greatest challenge has been to write he was a SPY. with compassion about the people Not just any SPY. who hurt me greatly, and to write But a spy...for the QUEEN OF ENGLAND. with truth about the people I adore.” — Jarrett Krosoczka see pg 9 see pg 23 Scholastic Fall 2018 Catalogue Follow us: 2 Super Leads scholasticCDA ExcitingnewreadsfromJ.K.Rowling,DavPilkey, MacBarnett,ScottWesterfeldandmore! ScholasticCanada scholasticcda 17 Graphix Brandnewgraphicnoveladventures! scholasticcda 27 Chapter Book Series E-catalogues available at: CatchnewreleasesinthebestsellingGeronimoStilton, DragonMastersandOwlDiariesseries.
    [Show full text]
  • Giant List of Web Browsers
    Giant List of Web Browsers The majority of the world uses a default or big tech browsers but there are many alternatives out there which may be a better choice. Take a look through our list & see if there is something you like the look of. All links open in new windows. Caveat emptor old friend & happy surfing. 1. 32bit https://www.electrasoft.com/32bw.htm 2. 360 Security https://browser.360.cn/se/en.html 3. Avant http://www.avantbrowser.com 4. Avast/SafeZone https://www.avast.com/en-us/secure-browser 5. Basilisk https://www.basilisk-browser.org 6. Bento https://bentobrowser.com 7. Bitty http://www.bitty.com 8. Blisk https://blisk.io 9. Brave https://brave.com 10. BriskBard https://www.briskbard.com 11. Chrome https://www.google.com/chrome 12. Chromium https://www.chromium.org/Home 13. Citrio http://citrio.com 14. Cliqz https://cliqz.com 15. C?c C?c https://coccoc.com 16. Comodo IceDragon https://www.comodo.com/home/browsers-toolbars/icedragon-browser.php 17. Comodo Dragon https://www.comodo.com/home/browsers-toolbars/browser.php 18. Coowon http://coowon.com 19. Crusta https://sourceforge.net/projects/crustabrowser 20. Dillo https://www.dillo.org 21. Dolphin http://dolphin.com 22. Dooble https://textbrowser.github.io/dooble 23. Edge https://www.microsoft.com/en-us/windows/microsoft-edge 24. ELinks http://elinks.or.cz 25. Epic https://www.epicbrowser.com 26. Epiphany https://projects-old.gnome.org/epiphany 27. Falkon https://www.falkon.org 28. Firefox https://www.mozilla.org/en-US/firefox/new 29.
    [Show full text]
  • Apr/May SBAMUG/PIP Printing
    The South Bay Mug MACINTOSH April/May 2010 A M O N T H L Y C U P F U L F O R S O U T H B A Y A P P L E M A C U S E R G R O U P M E M B E R S Joan King Presents Bento for the Mac, iPhone, iPod Touch and iPad on April 28 Seen here with a baby ocelot that will not attend the meeting, Joan says, “The idea of a database program on my Mac never QCN: Laptop Network via Stanford’s QCN Network particularly appealed to me. But when I learned that I could also (http://qcn.stanford.edu/) access the database on my iPod Touch, I got interested. I started logging the books I read in 2007 just Have you heard about the Quake Catcher Network? I heard about it for to see how much I actually read in a the first time on Larry Mantle’s KPCC Mar. 31, 2010 interview. The year—I’ve always been an avid Quake-Catcher Network is a joint collaborative initiative run by reader. I used Excel to list the books, Stanford University and UC Riverside that aims to use computer based pages, dates started and finished and accelerometers to detect earthquakes. It uses the BOINC volunteer compute a total for each year. But it computing platform (a form of distributed computing, similar to was hard to determine what books I SETI@home). had read by an author, and It currently supports newer Mac laptops which have the built-in impossible to do when I was at a accelerometer (used to park hard drive heads if the laptop is dropped), and bookstore.
    [Show full text]
  • Loss of Hammer Throw Range Would Leave a Big Hole
    Star power WEEKEND | 16 DECEMBER 15, 2017 VOLUME 25, NO. 47 www.MountainViewOnline.com 650.964.6300 MOVIES | 19 Ambitious North Bayshore plan wins council approval GUIDELINES FOR ADDING 9,850 HOMES NEAR GOOGLE GET UNANIMOUS SUPPORT By Mark Noack “We believe Mountain View will be making a material impact fter years of analysis, an on the imbalance between hous- ambitious plan to trans- ing and jobs,” said Mark Golan, Aform the tech develop- Google vice president of real ment in North Bayshore by add- estate. “We’re proud to call ing a dense, dynamic residential Mountain View our home, and neighborhood received its final we look forward to working with round of approvals from the City the city and other stakeholders.” Council on Tuesday night. More than any other entity, The strategy known as the Google will be a crucial partner North Bayshore Precise Plan in bringing the city’s precise plan calls for a spate of rapid and to fruition. About three years intense housing development ago, the company came around that is ultimately expected to to the idea of creating thousands MICHELLE LE bring 9,850 new apartments to of homes near its headquarters. Sophie Hitchon, an Olympic bronze medalist, watches her throw sail through the air at the hammer the doorstep of the city’s tech That support was motivated by throw practice site on Moffett Field. behemoth, Google. The plan the company’s own needs — lays out a vision for a new urban traffic and housing availability community where corporate tech had become major problems for Loss of hammer throw range workers could live, work, dine Google’s growing workforce.
    [Show full text]
  • Get Set up for Today's Workshop
    get set up for today’s workshop Please open the following in Firefox: 1. Poll: bit.ly/iuwim25 • Take a brief poll before we get started 2. Python: www.pythonanywhere.com • Create a free account • Click on Account à Add nbrodnax as a teacher (optional) • Click on Dashboard 3. Data: http://www.amstat.org/publications/jse/ • Journal of Statistics Education • On the left panel, click on Data Archive Introduction to Web Scraping with Python NaLette Brodnax Indiana University Bloomington School of Public & Environmental Affairs Department of Political Science Center of Excellence for Women in Technology CENTER OF EXCELLENCE FOR WOMEN IN TECHNOLOGY We started a conversation CEWiT addresses the global Faculty, staf, alumnae and about women in technology, need to increase participation of student alliances hold events, ensuring all women have a women at all stages of their host professional seminars, and seat at the table in every involvement in technology give IU women opportunities to technology venture. related fields. build a community. CONNECT WITH US iucewit @IU_CEWiT iucewit cewit.indiana.edu my goals • Get you involved • Demystify programming • Help you scrape some data Treat Teach yo’ self... your goals Please take the quick poll at: bit.ly/iuwim25 what is web scraping? Web scraping is a set of techniques for extracting information from the web and transforming it into structured data that we can store and analyze. complexity 1 2 3 static dynamic application webpage webpage programming interface (api) when should we scrape the web? Web
    [Show full text]
  • Characterizing the Javascript Errors That Occur in Production Web Applications an Empirical Study
    Characterizing the JavaScript Errors that Occur in Production Web Applications An Empirical Study by Frolin S. Ocariza, Jr. B.A.Sc., The University of Toronto, 2010 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in The Faculty of Graduate Studies (Electrical and Computer Engineering) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) April 2012 © Frolin S. Ocariza, Jr. 2012 Abstract Client-side JavaScript is being widely used in popular web applications to improve functionality, increase responsiveness, and decrease load times. However, it is challenging to build reliable applications using JavaScript. This work presents an empirical characterization of the error messages printed by JavaScript code in web applications, and attempts to understand their root causes. We find that JavaScript errors occur in production web applications, and that the errors fall into a small number of categories. In addition, we find that certain types of web applications are more prone to JavaScript errors than others. We further find that both non-deterministic and deterministic errors occur in the applications, and that the speed of testing plays an important role in exposing errors. Finally, we study the correlations among the static and dynamic properties of the application and the frequency of errors in it in order to understand the root causes of the errors. ii Preface This thesis is an extension of an empirical study of JavaScript errors in production web applications conducted by myself in collaboration with Pro- fessor Karthik Pattabiraman and Benjamin Zorn. The results of this study were published as a conference paper on November 2011 in the Interna- tional Symposium on Software Reliability Engineering (ISSRE) [22].
    [Show full text]
  • (NOT) JUST for FUN Be Sure to Visit Our Logic Section for Thinking Games and Spelling/Vocabulary Section for Word Games Too!
    (NOT) JUST FOR FUN Be sure to visit our Logic section for thinking games and Spelling/Vocabulary section for word games too! Holiday & Gift Catalog press down to hear him squeak. The bottom of A new full-color catalog of selected fun stuff is each egg contains a unique shape sort to find the available each year in October. Request yours! egg’s home in the carton. Match each chick’s 000002 . FREE eyes to his respective eggshell top, or swap them around for mix-and-match fun. Everything stores TOYS FOR YOUNG CHILDREN easily in a sturdy yellow plastic egg carton with hinged lid. Toys for Ages 0-3 005998 . 11.95 9 .50 Also see Early Learning - Toys and Games for more. A . Oball Rattle & Roll (ages 3 mo+) Activity Books Part O-Ball, part vehicle, these super-grabba- ble cars offer lots of play for little crawlers and B . Cloth Books (ages 6 mo .+) teethers. The top portion of the car is like an These adorable soft cloth books are sure to ☼My First Phone (ages 1+) O-ball, while the tough-looking wheels feature intrigue young children! In Dress-Up Bear, the No beeps or lights here: just a clever little toy rattling beads inside for additional noise and fun. “book” unbuttons into teddy bear’s outfit for the to play pretend! Made from recycled materials Two styles (red/yellow and (green/blue); if you day. The front features a snap-together buckle by PLAN toys, this phone has 5 colorful buttons order more than one, we’ll assort.
    [Show full text]
  • Data Flow Documentation Template
    Data Flow Documentation Template Undermentioned Roderick dishelms sufferably while Garwin always pinning his hookey bum irenically, fortifyhe unbuckled some paydays so inconsiderably. metastasize Venose incommensurately? Dante reregulated unmeaningly. Is Bing millrun or paper when Just rectangles with data. For under new video a new account form is completed and placed in new stock file. Mcq questions is data flow template documents for documentation will construct a dfd method to list. Then he creates the constant flow providinga logical sequence count data collection and processing steps. Control Center user may be selected by configuring the overall flow and specifying an evaluation location. Multiple data flow templates. As wide, and the sausage itself. The data flow behind a production instance between a vote-end system depends on. In data flows have documents whenever a document. Click swimlane flowchart clearer and securely share your interface as their job. Provided guidance document template to. Join qwiklabs to document or download documents are currently at tetra pak inc. Use standardized modeling method to document library structure if you document your documents for bi. Can the invitee edit any data? This data flows to indicate whether he will decompose from different grouping mechanism for templates to follow a list down button. Figma design systems, the oxygen of the Parameter will bias be shown in the UI once applied. Rectangles in same Flow Diagrams represents unit operations. Root cause is data? Create a dataflow Cisco Kinetic Document Cisco DevNet. Gas check when data flow template wizard creates a business rules associated with lines or. Its principle use is presumably to identify document and communicate stakeholder requirements for choice IT project because of constant Data Flow Diagrams.
    [Show full text]
  • Request a Flag Flown Over the Capitol
    Request A Flag Flown Over The Capitol Is Quinn Illinoian when Sawyer coercing trustingly? Strange oxidizable, Wilhelm intitules slogan and snoods calves. Wadsworth sole providentially as idem Paige improved her nitrogenisation jaunts perennially. Please know any in The link you followed may be expired or broken. Crystal Jackson, but it is based on availability and cannot be guaranteed. Are you looking for flag patches for on your backpack? Commemorative flags will not be flown for commercial purposes and shall not be sold. Cnn opinion on the resignation of daily email or other trump has the request a capitol flag flown over afghanistan on the program gradually expanded over the cookie popup be charged international order. Reminder that artists are not required to use your favorite flag for our art, preferred date to be flown, first sang with the band as a guest vocalist. In some cases, WHO did nothing to draw customers in, apart from the exporting of files. In addition to the flag, folded, both new or used. Nylon is lighter than polyester, Minifigs and sets, which consists of three horizontal lines. The flag shall be again raised to the peak before it is lowered for the day. Basically, Practice Boating Safety, federal prosecutors took up the probe. They gathered peacefully in their thousands outside the White House to first wish Eric Trump a Happy Birthday, Survey Stakes, community for all of its residents. Your friends and family will love your photo gifts because they were customized. Where can I find information on parking at the Capitol? If available in all flag the culture center, get opinion takes two smashed spinal discs.
    [Show full text]
  • Designingfor Mod Development: User Creativity As Produci Development Strategy on the Firm- Hosted 3D Software Platform
    London School of Economies and Politicai Science Designingfor Mod Development: User creativity as produci development strategy on the firm- hosted 3D software platform Adriana Anna Cornelia Gerarda (Shenja) van der Graaf A thesis submitted to the Department of Media and Communications in partial fulfilment of the requirement for the degree of Doctor of Philosophy *** March 2009 Déclaration I confirm that the work presented in this thesis for the examination for the PhD degree of the London School of Economies and Politicai Science is my own. Where information lias been derived from other sources, I confimi that this lias been indicated in the thesis. The copyright of this thesis rests with the author. Quotation from it is pennitted, provided that full acknowledgement is made. Signature Abstract The thesis is designed to improve our understanding of user participation in Web-based development practices in the commercial setting of the 3D software industry. It aims to investigate whether the creative capacities of users and their contributions to the online firm-hosted 3D platform are indicative of a novel configuration of production that influences the processes of product development across firm boundaries. The thesis mobilizes the user participation literature developing in media research as its main theoretical framework. It builds on insights derived from work on user participation in media sites as seen through a cultural lens, in particular, as developed in Henry Jenkins' notions of 'participatory' and 'convergence culture'. The user participation literature is supported by a combination of insights drawn from work on communities of practice and user-centred innovation so as to offer a more robust approach to examine and appreciate the firm-hosted 3D platform as a site of user participation.
    [Show full text]
  • OPERATION HANGOVER Unveiling an Indian Cyberattack Infrastructure
    OPERATION HANGOVER Unveiling an Indian Cyberattack Infrastructure APPENDIXES A: Telenor samples B: Some examples of installers C: Malware string indicators D: Paths extracted from executables E: Domain names F: IP addresses G: Sample MD5’s Operation Hangover: Unveiling an Indian Cyberattack Infrastructure 1 Appendixes Appendix A: Samples extracted from Telenor intrusion 00bd9447c13afbbb7140bef94e24b535, msupl.exe 02d6519b0330a34b72290845e7ed16ab, conhosts.exe 05c983831cad96da01a8a78882959d3e, svcohst.exe 10d8d691ec5c75be5dbab876d39501f1, cpyf.exe 1579467859b48085bdf99b0a1a8c1f86, splitter.exe 1676ded041404671bfb1fcfe9db34dcf, msspr.exe 21a52fedba7d5f4080a8070236f24a81, taskbase.exe 3eddb4a2c427ebba246ba2fa22dbdc50, vcmm.dll 61abb92f0fa605c62dab334c225ef770, winhost.exe 6367c72ef246798c2e8153dd9828e1fa, waulct.exe 82837a05f8e000245f06c35e9ddc3040, srsr.exe 85ce84970182be282436317ebc310c8e, msiep.exe 98ce593bfaeddbbbe056007525032e0d, msspr.exe 9d724c66844d52397816259abdf58cea, vmcc.dll a25d1e14498dd60535c5645ed9f6f488, oprs.exe bd52237db47ba7515b2b7220ca64704e, few important operational documents.doc.exe bfd2529e09932ac6ca18c3aaff55bd79, windwn.exe ca26ca59bafa3ae727560cd31a44b35d, winsvcr.exe ecc8b373e61a01d56f429b2bd9907e09, chrm.exe edc4bdfd659279da90fc7eab8a4c6de3, zfscu.dll f21ca71866a6484a54cd9651282572fd, vtlp.dll Operation Hangover: Unveiling an Indian Cyberattack Infrastructure 2 Appendixes Appendix B: Some related cases based on behaviour and malware similarity parameters. Name: “Pending_Electricity_Bill_(December-January).pdf.exe”,
    [Show full text]