POWER TO THE PEOPLE:

Why Panels Are Essential for Counting People and Providing Valid Demographics for Hybrid Online Services

JANUARY 2009

© 2009 comScore, Inc. POWER TO THE PEOPLE: COMSCORE METHODOLOGY WHITE PAPER JANUARY 2009

Successful online marketers don’t advertise to computers -- they understand the need to advertise to people. Decades ago, advertisers eschewed household ratings in favor of demographically targeted ratings. Why would online advertisers want anything less than accurate demographic data if they aim to move dollars from TV to the Internet?

Online marketers understand that they need an accurate count of the number of people comprising their target audiences at every site, along with an accurate measure of their age, gender and income, if they and their agencies are to plan and execute effective campaigns. So it’s no coincidence that agencies, publishers and advertisers -- in TV, radio, print and the Internet alike -- rely on panel-based services for buying and selling hundreds of billions of dollars of advertising every year. Across all media, panels are the gold standard for measuring people-based audiences.

On the Internet, some suppliers -- including comScore and Quantcast -- have recently introduced measurement services that use server-side data. Since server-side data alone cannot provide unique visitor counts or visitor demographics, which of these services does the best job of translating server data into people data?

Let’s take a look.

To begin, we’ll review the recently announced IAB and MRC guidelines for measuring online audiences.

The IAB and MRC Guidelines for Audience Reach Measurement

The IAB and MRC stress the importance of people-based measurements. On December 8, 2008, the IAB and the MRC released their audience reach measurement guidelines. This is a major industry-wide initiative that provides clear, consistent definitions of metrics and sets standards for how to measure unique audiences across different methodologies. The foundational principles of the guidelines for counting unique visitors to a Web site -- or unique people exposed to an online ad campaign -- stressed the following:

• Client-initiated counting is crucial. These guidelines rely on the central concept that counting should occur on the client (i.e., Internet user) side, not the server side, and that counting should occur as closely as possible to the final delivery of an ad to the client.

• In order to report a Unique User, the measurement company must utilize in its identification and attribution processes underlying data that is, at least in a reasonable proportion, attributed directly to a person.

• In no instance may a measurement organization report Unique Users purely through algorithms or modeling that are not at least partially traceable to information obtained directly from people, as opposed to browsers, computers, or any other non-human element.

• Transparency to data users is a paramount goal of these guidelines. Appropriate disclosures must be made to users concerning the measurement methodologies employed.

Since both the IAB and MRC recognize an imperative need for client-side data “that is, at least in a reasonable proportion, attributed directly to a person,” it follows that panel-centric hybrid measurement is potentially well suited to combining server-side measurement with people-level data for audience measurement.

The comScore People Panel

comScore designed its industry-leading online audience measurement service in a manner that is compliant with the new IAB and MRC guidelines.

• To paraphrase the guidelines: Panel-based syndicated measurement organizations have complex methodologies for selecting, recruiting and maintaining panels; collecting data from panelists; editing, projecting and weighting data; and reporting audience activity. Strengths of these organizations are the ability to filter out non-user-generated traffic and to attribute audience activity to users, and the known demographics of users in the panel. This information is gathered through a combination of manual and automated techniques, some of which can involve direct contact with panelists and some of which involve the use of software, metering techniques or other devices.

• Client-side counting of people at comScore. The foundation of comScore’s approach is a 2-million-person panel that includes -- as one of its core design criteria -- an accurate and continuous identification of the behavior of individuals. Using its panel, comScore developed and validated UDR2, a patented biometric technology based on individual Internet users’ unique mouse and typing behavior, to identify which member of the household is using the computer at any given moment. This is critically important because census data show that fully two-thirds of Internet users utilize multiple-user machines. As a result, without knowing precisely who is on the computer at any point in time, it’s not possible to accurately count the number of unique site visitors or the number of unique people exposed to online ads, or to know their demographics. Some services are unable to identify who is using the computer at any given moment -- a critical flaw in a syndicated measurement system. In contrast, comScore’s patented tracking of individuals, coupled with validated demographics, provides the quality audience data demanded by media professionals.

comScore’s Panel-Centric Hybrid Audience Measurement Systems

comScore recently introduced new services for tracking streaming video and distributed media content using a panel-centric hybrid audience measurement system. (Distributed content is editorial content, including video, that is carried via widgets and social networking applications and published on a web page other than where it originated.) comScore uses server-side beaconing and tagging to ensure that content is accurately tracked as it’s distributed across the Web and that niche audiences are appropriately counted. But the comScore panel and its patented tracking of individual users remain central to the measurement of all people-metrics, as dictated by the IAB and MRC guidelines.

The essence of comScore’s hybrid methodology is to combine ‘usage quantity’ from census server-side data, thus benefiting from census level accuracy, with person-level demographics and person-normalized usage rates measured directly from the panel, in order to provide the most accurate Total Usage, Unique Visitor counts, and Audience Demographics. Person-level metrics provided by the comScore panel are the key foundation, along with census measurement, on which the entire system rests.
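The arithmetic behind such a combination can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not comScore’s production method, which this paper does not spell out: it assumes that census tags supply a total page-view count, that the panel supplies weighted per-person page views and a demographic cell for each visiting panelist, and that dividing the census total by the panel’s per-person rate gives a people count. The record layout, function name and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PanelVisitor:
    """One panelist who visited the site (hypothetical record layout)."""
    demo: str          # demographic cell, e.g. "M 18-34"
    page_views: int    # page views by this person in the period
    weight: float      # projection weight (people this panelist represents)


def hybrid_estimate(census_page_views, panel):
    """Combine a census page-view total with panel per-person rates."""
    total_weight = sum(p.weight for p in panel)
    # Person-normalized usage rate, measured from the panel.
    views_per_person = sum(p.page_views * p.weight for p in panel) / total_weight
    # Census volume divided by the per-person rate gives a people count.
    unique_visitors = census_page_views / views_per_person
    # Split the people count by the panel's weighted demographic shares.
    by_demo = {}
    for p in panel:
        share = p.weight / total_weight
        by_demo[p.demo] = by_demo.get(p.demo, 0.0) + unique_visitors * share
    return unique_visitors, by_demo


# Invented figures: two panelists and 2,000,000 census page views.
panel = [
    PanelVisitor("M 18-34", page_views=10, weight=1.0),
    PanelVisitor("F 35-54", page_views=30, weight=1.0),
]
visitors, demos = hybrid_estimate(census_page_views=2_000_000, panel=panel)
print(visitors, demos)
```

In this sketch the volume comes entirely from the census count, while everything that turns volume into people (the per-person rate and the demographic split) comes from the panel.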

Without a Valid Panel, Server Data Can’t Provide Accurate Demographics or Unique Visitor Counts

Along with comScore, Quantcast has introduced a service that uses site-server data. A significant concern arises, however, regarding how Quantcast can provide an accurate measure of the people who are visiting sites and their demographics without a panel providing a direct measure of people. In a December 8, 2008 post on Quantcast’s blog titled “Why Quantcast doesn’t use a panel,” Quantcast makes it clear that they don’t use panels:

“Panel-based measurement solutions lack the scale, flexibility and immediacy to deliver actionable insights for an increasingly complex and fragmented marketplace.”

http://blog.quantcast.com/quantcast/2008/12/why-quantcast-doesnt-use-a-panel.html

The Quantcast approach begins with the tagging of server data from participating sites. (It is important to note, however, that according to Quantcast’s Web site, only about 5% of the top 100,000 sites are willing to cooperate with Quantcast, raising troubling questions about how accurate the Quantcast data can be across the many sites that are not cooperating.) Quantcast refers to the server data from participating sites as “Direct Measurement” and claims that this allows them to provide Unique Visitor counts based on cookie counts adjusted for “cookie deletion”.

Quantcast readily acknowledges some of the challenges of translating cookies into people, with a particular focus on cookie rejection and cookie deletion:

“Direct measurement utilizes cookies, small text files recording a visit of an internet browser to a particular web site, as the atomic measurement unit. Cookies are blind to the person who is doing the internet browsing and in isolation do not necessarily accurately reflect the precise number of people visiting a given web destination. The complicating issues include cookie deletion, non acceptance of cookies by a browser, multiple machine/device use and multiple people using the same machine.

For example, when web users delete Internet cookies, the same browser can generate multiple cookies for a given site. As a result the number of cookies is greater than the number of people. Additionally, the many-to-many relationship between people and internet accessible machines adds complexity to media measurement: multiple people in a household may use one machine (and access the same sites), and individual users may use multiple machines to access a given site. The varied way in which all these factors may combine, results in unique differences between cookies and people for every media property.” http://www.quantcast.com/white-papers/quantcast-cookie-corrected-audience-white-paper.pdf

However, any cookie deletion adjustment, even assuming it is accurate and assuming it properly filters out non-user-requested activity, aims at measuring what amounts to “permanent” cookies. Unfortunately, “permanent” cookies are still cookies and not people. Even if we ignore cookie deletion, the relationship between cookies and people is a tortuous and inconsistent one. Specifically, a user visiting a site from both home and work is represented by two different cookies. The same is true of someone who uses Internet Explorer and Firefox on the same machine. Conversely, two users sharing a machine are represented by one cookie. Furthermore, the ratio between the number of cookies and the number of people depends on the site and changes constantly over time. A cookie for a niche site may represent one person on a multi-user machine, whereas the cookie for a more popular site may represent multiple persons if more than one user visited the same site from the same machine. Ultimately, “Direct Measurement” leads to “permanent cookie” counts with no consistent relationship to the true, people-based audience counts.
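A toy example makes the inconsistency concrete. The sketch below uses an invented visit log and assumes, for simplicity, exactly one persistent cookie per machine-and-browser pair per site and no cookie deletion at all. The site names, people and machines are hypothetical.

```python
from collections import defaultdict

# Hypothetical visit log: (person, machine, browser, site).
visits = [
    # Popular site: Alice visits from home (two browsers) and from work;
    # Bob shares the home machine and browser with her.
    ("alice", "home-pc", "ie", "news.example"),
    ("alice", "home-pc", "firefox", "news.example"),
    ("alice", "work-pc", "ie", "news.example"),
    ("bob", "home-pc", "ie", "news.example"),
    # Second site: three people, all on the same shared machine and browser.
    ("alice", "home-pc", "ie", "niche.example"),
    ("bob", "home-pc", "ie", "niche.example"),
    ("carol", "home-pc", "ie", "niche.example"),
]


def cookies_and_people(visits):
    """Count distinct cookies and distinct people per site."""
    cookies = defaultdict(set)
    people = defaultdict(set)
    for person, machine, browser, site in visits:
        # Assumption: one persistent cookie per (machine, browser) per site.
        cookies[site].add((machine, browser))
        people[site].add(person)
    return {site: (len(cookies[site]), len(people[site])) for site in cookies}


counts = cookies_and_people(visits)
for site, (n_cookies, n_people) in counts.items():
    print(f"{site}: {n_cookies} cookies, {n_people} people")
```

Under these assumptions the first site shows more cookies than people and the second shows fewer, so no single correction factor converts cookie counts into people counts for both sites.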

Quantcast’s Direct Measurement is supplemented with “Multiple Reference Points,” which they say include:

“Data sets comprising click-stream and non-PII (Non-Personally Identifiable Information) user data. These data sets are sourced from multiple parties including companies, ISPs (broadband and dial-up) and toolbar vendors” http://www.quantcast.com/white-papers/quantcast-methodology.pdf

This method is also adopted by some other measurement companies, such as Compete. However, none of these companies identifies the source of any of these “multiple reference data points”.

Quantcast’s lack of transparency as to the source of its ISP data has led some to conclude that the data are obtained from United Online / NetZero, one of the few ISPs willing to sell its clickstream data to third parties. If Quantcast does indeed use data from United Online / NetZero, it is important to note that this ISP’s subscriber base consists predominantly of dial-up users, who can hardly be considered representative of the Internet user population without the inclusion of the subscribers of AT&T, Comcast, TWC, Verizon, other telecom and cable companies, or even AOL’s narrowband users (all of whom refuse to sell their data to third parties). Even more important, the non-personally identifiable nature of both toolbar and ISP data presents insurmountable challenges in trying to accurately identify who is using a computer at any point in time and determine their demographics -- because the majority of Internet users in the U.S. utilize multiple-user computers. In fact, 65% of Internet users in the U.S. use a machine shared by other people in the household to access the Internet. At best, ISP or toolbar data boil down to (roughly) counting machines and not people. The relationship between machines and people has the same problems as the one between “permanent” cookies and people, for almost exactly the same reasons.

So, if the unique audience counts from ISP or toolbar data are problematic, what about the demographics?

The manner in which Quantcast obtains its demographic data is referenced, but not made clear, in Quantcast’s published documents:

“Multiple Reference Points supplement the directly measured data within the visit graph. Reference sources include data sets that would traditionally be used in panel approaches comprising click-stream and non-PII (Personally Identifiable Information) user data. These data sets are sourced from multiple parties including market research companies, ISPs (broadband and dial-up) and toolbar vendors and cover in excess of 2 million individuals (1.5 million in the US). Quantcast also provides support for Quantified Publishers to share non-PII data via the direct measurement solution and this provides cookie-level data on many tens of millions of individuals.” http://www.quantcast.com/white-papers/quantcast-methodology.pdf

One could infer that Quantcast potentially uses two sources of demographic data: site registration data and/or ZIP+4 data (provided by the ISP). Both are problematic. First, there is an obvious concern about the accuracy of site registration data. Debbie Williamson (social network advertising analyst at eMarketer) believes site registration data are often falsified. Second, ZIP+4 data do not describe the characteristics of the people actually using the computer, but rather simply represent the demographics of people living in the same block groups as the owner(s) of the machines in the database. Neither source can be considered as providing accurate demographic data on the people actually using the computer.

Whatever the source of Quantcast’s demographic data, it would appear that the Quantcast methodology matches cookies to the demo data: http://www.quantcast.com/white-papers/quantcast-cookie-corrected-audience-white-paper.pdf

As such, there are two fundamental problems with Quantcast’s approach:

1. As was noted earlier, fully two-thirds of Internet users utilize multiple-user machines. As a result, a cookie containing demographic data placed on a computer is unable to distinguish between the various individuals using the computer at any point in time. In fact, using the cookie approach, it is entirely possible that a 25-year-old male seeing an online ad could be classified as a 50-year-old female.

2. comScore’s research has shown that cookies are deleted at varying rates according to the age and gender of the machine’s users, thereby creating additional inaccuracies in the demographic characteristics of site visitors or online ad recipients if measured through cookies.
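The second problem comes down to simple arithmetic, which the sketch below illustrates. The figures are invented and are not comScore research results: it assumes a site whose true audience is split evenly between two demographic groups, where members of one group clear their cookies more often and therefore each appear as more cookies over the measurement period. The group labels and function name are hypothetical.

```python
def people_vs_cookie_shares(people, cookies_per_person):
    """Return each group's (share of people, share of cookies)."""
    total_people = sum(people.values())
    # Each person appears as several cookies if they delete cookies often.
    cookies = {g: people[g] * cookies_per_person[g] for g in people}
    total_cookies = sum(cookies.values())
    return {
        g: (people[g] / total_people, cookies[g] / total_cookies)
        for g in people
    }


# Hypothetical audience: 500 people in each group.
people = {"group A": 500, "group B": 500}
# Assumed deletion behavior: group A members appear as 3 cookies each,
# group B members as 1 cookie each.
cookies_per_person = {"group A": 3, "group B": 1}

shares = people_vs_cookie_shares(people, cookies_per_person)
for group, (people_share, cookie_share) in shares.items():
    print(f"{group}: {people_share:.0%} of people, {cookie_share:.0%} of cookies")
```

With these assumed rates, a profile built from cookies would overstate the group that deletes more often (75% of cookies against 50% of people), even though the two groups are the same size.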

That said, how exactly does Quantcast translate the cookie measures derived from server data to the people-based measures demanded by the IAB and MRC? You can look long and hard but not find a clear, logical answer in any of Quantcast’s published documents. Quantcast’s “inference based engine” remains opaque at best, not transparent as required by the IAB and MRC guidelines.

What is clear from public documents is that some users of Quantcast services have encountered vexing problems with the accuracy of Quantcast’s demographic data.

“Someone explain to us why every variable on Quantcast’s demographic chart regarding the Tribble Advertising Agency has flipped? Evidently we now cater to the only the highest educated, highest income individuals on earth, with a larger percentage of African American followers.” http://www.tribbleagency.com/?p=3825

Quantcast’s response:

“Thank you for joining the Quantified publisher program and for having your site directly measured. If I look at your profile here http://www.quantcast.com/tribbleagency.com, I can see that you just recently placed our tags live. When you “Quantify” we are able to provide a much more accurate and granular understanding of your audience and the demo’s are NOT built upon any panel or reference point. Quantcast is also NOT providing a pixel/panel intersection of your audience, but instead relies on a large mathematical inference model to determine the demographics.

In some cases a site will have very similar demographics whether they are Quantified or estimated through Quantcast. In other cases these numbers can drastically change. In your case, the directly measured demo’s did change, and appear to be much better.” http://www.tribbleagency.com/?p=3825

Clearly, Quantcast’s demographic description of a site can change drastically depending on whether the site is cooperating with Quantcast and providing access to their server data. However, since only 5% of sites are cooperating, it would appear that Quantcast is itself acknowledging the uncertainty of the accuracy of the majority of their site demographic data.

Quantcast’s commentary also confirms that their demographic methodology does not meet the IAB and MRC guidelines, which stipulate:

“In no instance may a census measurement organization report Unique Users purely through algorithms or modeling that are not at least partially traceable to information obtained directly from people, as opposed to browsers, computers, or any other non-human element.”

Demographic data for sites cooperating with Quantcast are not attributed directly to a person. Instead, as noted above, Quantcast relies purely on algorithms:

“Quantcast is also NOT providing a pixel/panel intersection of your audience, but instead relies on a large mathematical inference model to determine the demographics.”

Regarding MRC accreditation, Quantcast’s published documents state that:

“Industry acceptance of Quantcast’s ground-breaking media measurement solution is our top priority and we have initiated the MRC audit process. Our pre-audit is complete and we anticipate completion of the first full phase of the audit in 2009.” http://www.quantcast.com/white-papers/quantcast-methodology.pdf

However, a quick check with the MRC reveals that Quantcast has submitted for accreditation only their methodology for tagging server-side data, leaving unaddressed (and unaccredited) the major issue of how they derive people data and demographics from cookies. Thus, there appears to be a gaping hole in the explanation of how Quantcast purports to provide valid demographics for their audience measurement reports.

Summary

It’s necessary for advertisers and their agencies to exercise great care when selecting a partner to provide audience measurement data for their important media planning decisions. The IAB and MRC guidelines make it clear that people-based counting should, at least in part, occur on the client side, even if other metrics are measured from the server side. This, in turn, requires that any audience measurement service seeking to be accredited by the MRC meet these guidelines. comScore’s Panel-Centric Hybrid Measurement Services clearly do so.

One might say: “Power to the people.”
