Implementation and Evaluation of a Framework to Calculate Impact Measures for Wikipedia Authors
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Causal Inference in the Presence of Interference in Sponsored Search Advertising
Causal Inference in the Presence of Interference in Sponsored Search Advertising Razieh Nabi∗ Joel Pfeiffer Murat Ali Bayir Johns Hopkins University Microsoft Bing Ads Microsoft Bing Ads [email protected] [email protected] [email protected] Denis Charles Emre Kıcıman Microsoft Bing Ads Microsoft Research [email protected] [email protected] Abstract In classical causal inference, inferring cause-effect relations from data relies on the assumption that units are independent and identically distributed. This assumption is violated in settings where units are related through a network of dependencies. An example of such a setting is ad placement in sponsored search advertising, where the clickability of a particular ad is potentially influenced by where it is placed and where other ads are placed on the search result page. In such scenarios, confounding arises due to not only the individual ad-level covariates but also the placements and covariates of other ads in the system. In this paper, we leverage the language of causal inference in the presence of interference to model interactions among the ads. Quantification of such interactions allows us to better understand the click behavior of users, which in turn impacts the revenue of the host search engine and enhances user satisfaction. We illustrate the utility of our formalization through experiments carried out on the ad placement system of the Bing search engine. 1 Introduction In recent years, advertisers have increasingly shifted their ad expenditures online. One of the most effective platforms for online advertising is search engine result pages. Given a user query, the search engine allocates a few ad slots (e.g., above or below its organic search results) and runs an auction among advertisers who are bidding and competing for these slots. -
Understanding Attention Metrics
Understanding Attention Metrics sponsored by: 01 introduction For years and years we’ve been told that if the click isn’t on its deathbed, it should be shivved to death immediately. While early display ads in the mid-90s could boast CTRs above 40%, the likely number you’ll see attached to a campaign these days is below 0.1%. But for some reason, even in 2015, click-through rate remains as vital to digital ad measurement although its importance is widely ridiculed. Reliance on the click means delivering unfathomable amounts of ad impressions to users, which has hyper-inflated the value of the pageview. This has led to a broken advertising system that rewards quantity over quality. Instead of building audiences, digital publishers are chasing traffic, trying to lure users to sites via “clickbait” headlines where the user is assaulted with an array of intrusive ad units. The end effect is overwhelming, ineffective ads adjacent to increasingly shoddy content. However, the rise of digital video and viewability have injected linear TV measurement capabilities into the digital space. By including time in its calculation, viewability has opened a conversation into the value of user attention. It has introduced metrics with the potential to assist premium publishers in building attractive audiences; better quantify exposure for advertisers; and ultimately financially reward publishers with more engaged audiences. This is a new and fast developing area, but one where a little education can go a long way. In addition to describing current attention metrics, this playbook will dive into what you should look for in an advanced metrics partner and how to use these metrics in selling inventory and optimizing campaign performance. -
Initiativen Roger Cloes Infobrief
Wissenschaftliche Dienste Deutscher Bundestag Infobrief Entwicklung und Bedeutung der im Internet ehrenamtlich eingestell- ten Wissensangebote insbesondere im Hinblick auf die Wiki- Initiativen Roger Cloes WD 10 - 3010 - 074/11 Wissenschaftliche Dienste Infobrief Seite 2 WD 10 - 3010 - 074/11 Entwicklung und Bedeutung der ehrenamtlich im Internet eingestellten Wissensangebote insbe- sondere im Hinblick auf die Wiki-Initiativen Verfasser: Dr. Roger Cloes / Tim Moritz Hector (Praktikant) Aktenzeichen: WD 10 - 3010 - 074/11 Abschluss der Arbeit: 1. Juli 2011 Fachbereich: WD 10: Kultur, Medien und Sport Ausarbeitungen und andere Informationsangebote der Wissenschaftlichen Dienste geben nicht die Auffassung des Deutschen Bundestages, eines seiner Organe oder der Bundestagsverwaltung wieder. Vielmehr liegen sie in der fachlichen Verantwortung der Verfasserinnen und Verfasser sowie der Fachbereichsleitung. Der Deutsche Bundestag behält sich die Rechte der Veröffentlichung und Verbreitung vor. Beides bedarf der Zustimmung der Leitung der Abteilung W, Platz der Republik 1, 11011 Berlin. Wissenschaftliche Dienste Infobrief Seite 3 WD 10 - 3010 - 074/11 Zusammenfassung Ehrenamtlich ins Internet eingestelltes Wissen spielt in Zeiten des sogenannten „Mitmachweb“ eine zunehmend herausgehobene Rolle. Vor allem Wikis und insbesondere Wikipedia sind ein nicht mehr wegzudenkendes Element des Internets. Keine anderen vergleichbaren Wissensange- bote im Internet, auch wenn sie zum freien Abruf eingestellt sind und mit Steuern, Gebühren, Werbeeinnahmen finanziert oder als Gratisproben im Rahmen von Geschäftsmodellen verschenkt werden, erreichen die Zugriffszahlen von Wikipedia. Das ist ein Phänomen, das in seiner Dimension vor dem Hintergrund der urheberrechtlichen Dis- kussion und der Begründung von staatlichem Schutz als Voraussetzung für die Schaffung von geistigen Gütern kaum Beachtung findet. Relativ niedrige Verbreitungskosten im Internet und geringe oder keine Erfordernisse an Kapitalinvestitionen begünstigen diese Entwicklung. -
WTF IS NATIVE ADVERTISING? Introduction
SPONSORED BY IS NATIVE ADVERTISING? Table of Contents Introduction WTF is Native 3 14 Programmatic? Nomenclature The Native Ad 4 16 Triumvirate 5 Decision tree 17 The Ad Man 18 The Publisher Issues Still Plaguing 19 The Platform 6 Native Advertising Glossary 7 Scale 20 8 Metrics 10 Labelling 11 The Church/State Divide 12 Credibility 3 / WTF IS NATIVE ADVERTISING? Introduction In 2013, native advertising galloped onto the scene like a masked hero, poised to hoist publishers atop a white horse, rescuing them from the twin menaces of programmatic advertising and sagging CPMs. But who’s really there when you peel back the mask? Native advertising is a murky business. Ad executives may not consider it advertising. Editorial departments certainly don’t consider it editorial. Even among its practitioners there is debate — is it a format or is it a function? Publishers who have invested in the studio model position native advertising as the perfect storm of context, creative capital and digital strategy. For platforms, it may be the same old banner advertising refitted for the social stream. Digiday created the WTF series to parse murky digital marketing concepts just like these. WTF is Native Advertising? Keep reading to find out..... DIGIDAY 4 / WTF IS NATIVE ADVERTISING? Nomenclature Native advertising An advertising message designed Branded content Content created to promote a Content-recommendation widgets Another form to mimic the form and function of its environment brand’s products or values. Branded content can take of native advertising often used by publishers, these a variety of formats, not all of them technically “native.” appear to consumers most often at the bottom of a web Content marketing Any marketing messages that do Branded content placed on third-party publishing page with lines like “From around the web,” or “You not fit within traditional formats like TV and radio spots, sites or platforms can be considered native advertising, may also like.” print ads or banner messaging. -
Google Analytics Education and Skills for the Professional Advertiser
AN INDUSTRY GUIDE TO Google Analytics Education and Skills For The Professional Advertiser MODULE 5 American Advertising Federation The Unifying Voice of Advertising Google Analytics is a freemium service that records what is happening at a website. CONTENTS Tracking the actions of a website is the essence of Google MODULE 5 Google Analytics Analytics. Google Analytics delivers information on who is visiting a Learning Objectives .....................................................................................1 website, what visitors are doing on that website, and what Key Definitions .......................................................................................... 2-3 actions are carried out at the specific URL. Audience Tracking ................................................................................... 4-5 Information is the essence of informed decision-making. The The A,B,C Cycle .......................................................................................6-16 website traffic activity data of Google Analytics delivers a full Acquisition understanding of customer/visitor journeys. Behavior Focus Google Analytics instruction on four(4) Conversions basic tenets: Strategy: Attribution Model ............................................................17-18 Audience- The people who visit a website Case History ........................................................................................... 19-20 Review .............................................................................................................. -
Reliability of Wikipedia 1 Reliability of Wikipedia
Reliability of Wikipedia 1 Reliability of Wikipedia The reliability of Wikipedia (primarily of the English language version), compared to other encyclopedias and more specialized sources, is assessed in many ways, including statistically, through comparative review, analysis of the historical patterns, and strengths and weaknesses inherent in the editing process unique to Wikipedia. [1] Because Wikipedia is open to anonymous and collaborative editing, assessments of its Vandalism of a Wikipedia article reliability usually include examinations of how quickly false or misleading information is removed. An early study conducted by IBM researchers in 2003—two years following Wikipedia's establishment—found that "vandalism is usually repaired extremely quickly — so quickly that most users will never see its effects"[2] and concluded that Wikipedia had "surprisingly effective self-healing capabilities".[3] A 2007 peer-reviewed study stated that "42% of damage is repaired almost immediately... Nonetheless, there are still hundreds of millions of damaged views."[4] Several studies have been done to assess the reliability of Wikipedia. A notable early study in the journal Nature suggested that in 2005, Wikipedia scientific articles came close to the level of accuracy in Encyclopædia Britannica and had a similar rate of "serious errors".[5] This study was disputed by Encyclopædia Britannica.[6] By 2010 reviewers in medical and scientific fields such as toxicology, cancer research and drug information reviewing Wikipedia against professional and peer-reviewed sources found that Wikipedia's depth and coverage were of a very high standard, often comparable in coverage to physician databases and considerably better than well known reputable national media outlets. -
Auditing Wikipedia's Hyperlinks Network on Polarizing Topics
Auditing Wikipedia’s Hyperlinks Network on Polarizing Topics Cristina Menghini Aris Anagnostopoulos Eli Upfal DIAG DIAG Dept. of Computer Science Sapienza University Sapienza University Brown University Rome, Italy Rome, Italy Providence, RI, USA [email protected] [email protected] [email protected] ABSTRACT readers to deepen their understanding of a topic by conveniently ac- People eager to learn about a topic can access Wikipedia to form cessing other articles." Consequently, while reading an article, users a preliminary opinion. Despite the solid revision process behind are directly exposed to its content and indirectly exposed to the the encyclopedia’s articles, the users’ exploration process is still content of the pages it points to. influenced by the hyperlinks’ network. In this paper, we shed light Wikipedia’s pages are the result of collaborative efforts of a com- on this overlooked phenomenon by investigating how articles de- mitted community that, following policies and guidelines [4, 20], scribing complementary subjects of a topic interconnect, and thus generates and maintains up-to-date and high-quality content [28, may shape readers’ exposure to diverging content. To quantify 40]. Even though tools support the community for curating pages this, we introduce the exposure to diverse information, a metric that and adding links, it lacks a systematic way to contextualize the captures how users’ exposure to multiple subjects of a topic varies pages within the more general articles’ network. Indeed, it is im- click-after-click by leveraging navigation models. portant to stress that having access to high-quality pages does not For the experiments, we collected six topic-induced networks imply a comprehensive exposure to an argument, especially for a about polarizing topics and analyzed the extent to which their broader or polarizing topic. -
Towards Content-Driven Reputation for Collaborative Code Repositories
University of Pennsylvania ScholarlyCommons Departmental Papers (CIS) Department of Computer & Information Science 8-28-2012 Towards Content-Driven Reputation for Collaborative Code Repositories Andrew G. West University of Pennsylvania, [email protected] Insup Lee University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/cis_papers Part of the Databases and Information Systems Commons, Graphics and Human Computer Interfaces Commons, and the Software Engineering Commons Recommended Citation Andrew G. West and Insup Lee, "Towards Content-Driven Reputation for Collaborative Code Repositories", 8th International Symposium on Wikis and Open Collaboration (WikiSym '12) , 13.1-13.4. August 2012. http://dx.doi.org/10.1145/2462932.2462950 Eighth International Symposium on Wikis and Open Collaboration, Linz, Austria, August 2012. This paper is posted at ScholarlyCommons. https://repository.upenn.edu/cis_papers/750 For more information, please contact [email protected]. Towards Content-Driven Reputation for Collaborative Code Repositories Abstract As evidenced by SourceForge and GitHub, code repositories now integrate Web 2.0 functionality that enables global participation with minimal barriers-to-entry. To prevent detrimental contributions enabled by crowdsourcing, reputation is one proposed solution. Fortunately this is an issue that has been addressed in analogous version control systems such as the *wiki* for natural language content. The WikiTrust algorithm ("content-driven reputation"), while developed and evaluated in wiki environments operates under a possibly shared collaborative assumption: actions that "survive" subsequent edits are reflective of good authorship. In this paper we examine WikiTrust's ability to measure author quality in collaborative code development. We first define a mapping omfr repositories to wiki environments and use it to evaluate a production SVN repository with 92,000 updates. -
Chapter 3. Event Expressions: Filtering the Event Stream
Chapter 3. Event Expressions: Filtering the Event Stream As we discussed in Chapter 1, the Live Web is uses a data model that is the dual of the data model that has prevailed on the static Web. The static Web is built around interactive Web sites that leverage pools of static data held in databases. Interactive Web sites are built with languages like PHP and Ruby on Rails that are designed for taking user interaction and formulating SQL queries against databases. The static Web works by making dynamic queries against static data. In contrast, the Live Web is based on static queries against dynamic data. Streams of real-time data are becoming more and more common online. Years ago, such data streams were just a trickle, but the advent of technologies like RSS and Atom as well as the appearance of services like Twitter and Facebook has turned this trickle into a raging torrent. And this is just the beginning. The only way that we can hope to make use of all this data is if we can filter it and automate our responses to it as much as possible. As we’ll discuss in detail in Chapter 4, the task is greater than mere filtering, the task is correlating these events contextually. Correlating events provides power well beyond merely using filters to tame the data torrent. This chapter introduces in detail the pattern language, called “event expressions1,” that we will use to match against streams of events. Event expressions are an important way that we match events with context. -
Checkuser and Editing Patterns
CheckUser and Editing Patterns Balancing privacy and accountability on Wikimedia projects Wikimania 2008, Alexandria, July 18, 2008 HaeB [[de:Benutzer:HaeB]], [[en:User:HaeB]] Please don't take photos during this talk. What are sockpuppets? ● Wikipedia and sister projects rely on being open: Anyone can edit, anyone can get an account. ● No ID control when registering (not even email address required) ● Many legitimate uses for multiple accounts ● “Sockpuppet” often implies deceptive intention, think ventriloquist What is the problem with sockpuppets? ● Ballot stuffing (some decision processes rely on voting, such as request for adminships and WMF board elections) ● “Dr Jekyll/Mr Hyde”: Carry out evil or controversial actions with a sockpuppet, such that the main account remains in good standing. E.g. trolling (actions intended to provoke adversive reactions and disrupt the community), or strawman accounts (putting the adversarial position in bad light) ● Artificial majorities in content disputes (useful if the wiki's culture values majority consensus over references and arguments), especially circumventing “three-revert-rule” ● Ban evasion What is the problem with sockpuppets? (cont'd) ● Newbies get bitten as a result of the possibility of ban evasion: ● Friedman and Resnick (The Social Cost of Cheap Pseudonyms, Journal of Economics and Management Strategy 2001): Proved in a game-theoretic model that the possibility of creating sockpuppets leads to distrust and discrimination against newcomers (if the community is otherwise successfully -
The Case of Wikipedia Jansn
UvA-DARE (Digital Academic Repository) Wisdom of the crowd or technicity of content? Wikipedia as a sociotechnical system Niederer, S.; van Dijck, J. DOI 10.1177/1461444810365297 Publication date 2010 Document Version Submitted manuscript Published in New Media & Society Link to publication Citation for published version (APA): Niederer, S., & van Dijck, J. (2010). Wisdom of the crowd or technicity of content? Wikipedia as a sociotechnical system. New Media & Society, 12(8), 1368-1387. https://doi.org/10.1177/1461444810365297 General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) Download date:27 Sep 2021 Full Title: Wisdom of the Crowd or Technicity of Content? Wikipedia as a socio-technical system Authors: Sabine Niederer and José van Dijck Sabine Niederer, University of Amsterdam, Turfdraagsterpad 9, 1012 XT Amsterdam, The Netherlands [email protected] José van Dijck, University of Amsterdam, Spuistraat 210, 1012 VT Amsterdam, The Netherlands [email protected] Authors’ Biographies Sabine Niederer is PhD candidate in Media Studies at the University of Amsterdam, and member of the Digital Methods Initiative, Amsterdam. -
The New World of Content Measurement Why Existing Metrics Are Flawed (And How to Fix Them)
The New World of Content Measurement Why existing metrics are flawed (and how to fix them) Copyright © 2014 Contently. All rights reserved. contently.com Carta Marina, Olaus Magnus Death to the pageview. — Paul Fredrich VP of Product, Contently GAME OF METRICS: OVERCOMING THE MEASUREMENT CRISIS IN THE CONTENT MARKETING KINGDOM Editor’s Note At Contently, three things have been a constant topic of discussion over the past six months: basketball, cult comedies from the early 1990s, and the disastrous state of content measurement in the marketing and media industries. Simply put, everyone was using flawed, superficial metrics to measure success—like a doctor using x-ray glasses instead of an x-ray machine. Even as forward-thinking content analytics platforms like Chartbeat began to make headway towards measuring content in more sophisticated ways, we saw brand publishers we worked with getting left out. That’s because brands face a fundamentally different set of content measurement challenges than traditional publishing companies. In pursuit of a solution, we did a lot of research and hard thinking, and even built an analytics tool of our own. This report is the result of our findings. We hope you find it useful, and that it empowers you to firmly tie your content to business results. — Joe Lazauskas Contently Editor in Chief 3 CONTENTLY GAME OF METRICS: OVERCOMING THE MEASUREMENT CRISIS IN THE CONTENT MARKETING KINGDOM Table of Contents I. Introduction 5 II. History 6 III. The Present Day 7 IV. Flawed Proxy Metrics 9 V. The Key to the Content Marketing Kingdom 11 VI. The Future 13 4 CONTENTLY GAME OF METRICS: OVERCOMING THE MEASUREMENT CRISIS IN THE CONTENT MARKETING KINGDOM You’re measuring the success of your content all wrong.