The News at 404: Archiving and accessing online news content

by Katie Raso

B.A. (Hons.), Simon Fraser University, 2011

Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Arts

in the School of Communication Faculty of Communication, Art and Technology

© Katie Raso 2018 SIMON FRASER UNIVERSITY Summer 2018

Copyright in this work rests with the author. Please ensure that any reproduction or re-use is done in accordance with the relevant national copyright legislation.

Approval

Name: Katie Raso
Degree: Master of Arts
Title: The News at 404: Archiving and accessing online news content
Examining Committee:
Chair: Christopher Jeschelnik, Lab Instructor
Rick Gruneau, Senior Supervisor, Professor
Enda Brophy, Supervisor, Associate Professor

Date Defended/Approved: May 25, 2018

Abstract

This project documents the rates at which Canadian newspapers archive digital articles. In addition, this project presents the variation in rates at which national news articles are archived by the secondary archiving services Canadian Newsstream, Factiva, and LexisNexis. A quantitative content analysis was conducted to identify potential trends in how and why some articles are excluded from archives. The study finds significant article loss in the Canadian Newsstream, Factiva, and LexisNexis archives, with rates of missing articles three to five times higher than those of the news media sites.

Several factors impact the archiving rates, including the use of video and the reliance on wire stories, which are not archived at the same rate as content generated in-house. At its root, this project raises questions about long-term access to information. As the news media transitions further into the digital realm, the ability of individuals to access content becomes less certain.

Keywords: digital news; Canadian news media; political economy of news; online archiving; digital memory

Dedication

In memory of my mother, who was excited to see what my research would reveal.

Acknowledgements

I am profoundly grateful to my SFU community for their support during what was an unconventional education and writing process. Thanks to Jan Marontate and Jason Congdon for helping me navigate my coursework while I was caring for my mother, and to Shane Gunster and Enda Brophy for guiding my independent study. This project would not have been possible without the support and invaluable input of my supervisor, Rick Gruneau.

Additional thanks to Karen, Brandy, and Joel who carried me through the most difficult chapter of my life; to my sister who continued to push me; and to my partner, Nick, who kept me heartened, housed, and fed while I wrote.

My research and analysis have been greatly informed, challenged, and improved by my academic community, to whom I am indebted. My thanks to Kathi Cross, Donald Gutstein, Bob Hackett, Stuart Poyntz, Patrick McCurdy, and Josh Tabish. Special thanks to my colleague and collaborator, Bob Neubauer, whose unwavering belief in me carried me through.

Thank you to Mike De Souza for providing me with a journalist’s perspective and for his fearless contributions to the Canadian news media.

I am grateful to Dr. Claude Juneau and the community of Shannon, Quebec. I hope that this thesis helps to preserve their decade of struggle for justice.

Finally, this research was supported by the Social Sciences and Humanities Research Council of Canada.

Table of Contents

Approval ...... ii
Abstract ...... iii
Dedication ...... iv
Acknowledgements ...... v
Table of Contents ...... vi
List of Tables ...... vii
List of Figures ...... viii

Introduction ...... 1

Chapter 1. News in The Information Age ...... 4

Chapter 2. Investigating Digital News Archives ...... 20

Chapter 3. Results and Discussion ...... 31

Chapter 4. Conclusion ...... 44

References ...... 47

Appendix A. Field-test Coding Protocol ...... 57

Appendix B. Final Coding Protocol...... 62

Appendix C. Images ...... 66

Appendix D. Additional Figures and Tables ...... 74

List of Tables

Table 1: Archive rates by date, p.31
Table 2: Archiving rates by secondary archive, p.32
Table 3: Summary of findings, p.34-35

List of Figures

Figure 1: Archiving totals by site with news sites aggregated for comparison, p.33
Figure 2: Percentage of news stories from wire, p.40
Figure 3: Wire stories by source, p.41

Introduction

In 2007, Superior Court Judge Bernard Godbout ruled that the residents of Shannon, Quebec could “proceed with a class-action lawsuit against the Department of National Defense and SNC-Tech over contaminated drinking water in the community” (CBC News, 2007, para.1). This decision followed the admission by the Department of National Defence in 2000 that decades of improper disposal of trichloroethylene (TCE) at CFB Valcartier—located three kilometres north of Shannon—had resulted in the polluting of local water sources (CBC News, 2007). This contamination, which occurred over a period of thirty years, is blamed by residents for abnormally high rates of cancer in the community (CBC News, 2007).

The residents' class action lawsuit commenced in January, 2011 and continued for eleven months (Charles Veilleux and Associés, n.d.). Throughout the trial, Canadian newspapers from all major agencies published articles detailing expert testimony and scientific reports which confirmed the water’s contamination and its potential health effects. However, by December, 2011, all Postmedia articles about the trial were no longer accessible through the publications' websites.

“Dead” story links, or stubs, are common on news sites as content is shifted from the front page into website archives (Shewchuk & Mietkiewicz, 2009). However, the disappearance of the Shannon stories extends beyond dead stubs—the stories cannot be found anywhere on Postmedia’s news sites. Some of the articles were preserved by secondary archiving services. For example, LexisNexis archived 61 articles about the trial from Canadian newspapers (see Appendix Image 1). Canadian Newsstand archived 24 Postmedia stories about the lawsuit (see Appendix Image 2). Despite being described by Postmedia as “Canada’s most significant information database” ( Inc., 2010, para 1), at the time of its operation as an archive, FPInfomart yielded no search results related to the Shannon, Quebec TCE lawsuit (see Appendix Image 3, Image 4).

The disappearance of the 24 Postmedia stories might be partially explained by the limited access to FPInfomart that pay-per-view users had. Without a subscription, user access was limited to 183 of FPInfomart's news titles (Infomart, 2012). Full subscribers, by contrast, had access to 1434 news titles (Infomart, 2012). It is possible that behind their premium pay-wall, FPInfomart provided access to the Shannon trial articles. What is certain is that, following FPInfomart’s rebrand as Infomart, any articles that may have been archived behind the wall were gone.

This thesis explores a somewhat unexpected question: what happened to the coverage of the Shannon, Quebec trial? During an earlier media monitoring project, I was responsible for recording news stories about the class action lawsuit brought forward in 2011 by the residents of Shannon against SNC Lavalin and the Department of National Defence. Over the course of the trial there was considerable expert testimony, details of which were published in Postmedia news outlets across Canada. In my role as a media monitoring researcher, I became familiar with the case and recorded much of this expert testimony. I was therefore struck by how, within a matter of months, articles detailing the trial’s events had vanished from the Postmedia websites. Because the project I was working on had a limited scope, there was no way of knowing if this loss of articles was specific to this trial or if it represented a broader problem of selectiveness associated with the archiving of digital news stories.

As news outlets focus their resources on producing more digital content—in some cases ceasing print production entirely—I believe the questions of if and how online news is archived have become ever more pressing. In the time since I began research for the thesis, news organizations have intensified their online strategies: the Globe and Mail produces Globe Unlimited content which comprises online articles that are only available to subscribers; the Globe and Mail, Star, CBC News, and have produced smartphone news apps; and the New York Times, Washington Post, and Wall Street Journal have all developed daily channels for Snapchat Discover (Spangler, 2017). As a result of these changes, there is a growing body of news content that exists only in a digital form.

While methods of delivering news content online have developed extensively over the past decade, strategies for preserving this content remain unclear and, therefore, leave the content vulnerable to erasure. Moreover, the study of these archiving efforts remains in its infancy. This thesis aims to contribute to a broad understanding of the processes, policies, and perhaps politics, of news archiving in the digital era by examining the extent to which online news stories are preserved and by assessing whether there is a pattern to erasures of Canadian online news content. To do so, this project captures a sample of news stories from national Canadian news websites and monitors which stories are archived by their original publication and by secondary archiving services. The sampled stories were revisited five years after being recorded to assess the capacity of both news websites and secondary archives to preserve aging news content.

The thesis is organized into four chapters. The first chapter sets the context by describing some key political economic dynamics of news organizations and news production in the ‘information age.’ The second chapter provides a literature review of what research has been done to assess the state of digital news archiving and provides a methodology for this project informed by this landscape. More extensive methodological details are included in several Appendices. Chapter three provides the results of this project and discusses the significance and implications of these findings. The final chapter comprises concluding remarks and suggestions for future research on this topic.

Chapter One News in the Information Age: Digital Memory, Neoliberalism and Problems of Access

A central challenge experienced by Internet researchers and archivists is how to accurately document a dynamic subject that is forever changing. Describing his experience of website archiving, Niels Brügger (2005) recounts, “[t]he website was—as a whole—not the same as when I had started; it had changed in the time it took to archive it” (p.23). The fluid nature of websites does not only result in changing content. A consequence of this dynamism is erasure of older digital texts as new content supplants them (Eveland, Marton, & Seo, 2004; Rosenzweig, 2003).

News websites present problems of dynamic content and erasure in a unique way. Stuart Allan (2006) suggests that, given their fluid nature, online news sites should be considered a third broadcast medium, after radio and television. While Allan rightly identifies that digital news sites have attributes that differ from those of a traditional , these attributes similarly differ from those of traditional broadcast media. Unlike traditional broadcast media, which “ten[d] to be ephemeral, zipping through the air and then gone” (Shewchuk & Mietkiewicz, 2009, p.44), news websites have the capacity to archive their content in ways that are more immediately accessible to users than a radio or television archive. As such, the creation of permanent online news records offers “incredible potential for depth on the web” (Shewchuk & Mietkiewicz, 2009, p.45).

The Challenge of Preserving Digital Content

Yet this potential is not realized nor is it explored uniformly across news platforms or, at times, within the same news organization. Blair Shewchuk and Mark Mietkiewicz (2009) report that the CBC—for whom their book was commissioned—believes in the preservation of online content. However, the CBC website explicitly states that it does not commit to preserving online video and radio content because the website’s purpose is “not to set up to provide archives, unless when explicitly stated” (CBC.ca, n.d., para.2). The CBC limits the scope of its preservation activities to its online text-based news content, provided that the content does not include inaccurate information, pose legal problems, or jeopardize any person's safety (Shewchuk & Mietkiewicz, 2009). This text-based content remains publicly accessible through the CBC News site for years after it is originally published. This policy is problematic for a multimedia news agency like the CBC which produces content in a variety of formats. Further, it does not account for the growing importance of video content to news sites. Because video is a highly shareable, easily consumed news format, news organizations are investing more resources in the production of online news videos (Kalogeropoulos, Cherubini, & Newman, 2016). Given this trend, the CBC’s intention to preserve online content while not archiving digital video shows an incongruence between their policy and practice.

At least one instance in recent history demonstrates that the CBC's policy to preserve content is not infallible. In 2009, a controversial story about the Conservative party was swiftly removed from the CBC Radio One website without a retraction (Wikileaks, 2009). In the absence of a retraction or explanation, one cannot know why this content was erased. What this erasure does highlight is the vulnerability of digital news content to disappearance.

While the CBC has an explicit—albeit incomplete—policy on archiving digital news content, Canada’s two national newspapers are much further behind. Both have experimented with different methods of preserving and restricting access to digital content. At the start of this project, the Globe and Mail utilized paywalls for older articles. In 2012, the news site cautioned its readers that after thirty days access to articles could be limited to users who had purchased Globe Plus subscriptions (The Globe and Mail, 2012). The National Post—Postmedia's flagship newspaper—offered its readers access to articles for four months (National Post, n.d.a). After this period, users were directed to Postmedia's gated archive, FPInfomart.

Over the past five years, both sites have changed their access and archiving policies. The news organizations have transitioned to global subscription systems that encompass all of their online content, including access to older articles. On the National Post (2017) site, non-subscribers can access up to 10 articles each month. Globe and Mail readers can now access three articles in a month, or six if they register for a user account (Globe and Mail, n.d.). Both sites offer subscription packages that enable users to access unlimited content. National Post now houses its archived content within the news website. The Postmedia archive, FPInfomart, has been rebranded as Infomart and is now focused on providing a media monitoring and social listening service to corporate clients (Postmedia Network Inc., 2011a; Stein, 2012). This change of focus for FPInfomart demonstrates that news organizations are attempting to find new mechanisms for profit generation in digital spaces. In doing so, Postmedia has given lower priority to a general archiving structure in favour of creating packages of curated content for paying clients. While Canada's public broadcaster sees the universal benefit of preserving text-based accounts of recent history, private newspapers continue to experiment with the best strategy for extracting profit from aging content.

The drive to monetize online news content is not unique to the National Post and Globe and Mail. News organizations are still struggling to find a way to generate revenue through their online content delivery. Newspapers Canada (2016) reports that, by the end of 2015, of the 103 daily newspapers in Canada, 37 employed paywalls or metered access on their sites. By contrast, the removed their paywall in September of that year, experimenting with a free tablet edition modeled after La Presse’s online system (Newspapers Canada, 2016). However, by the time this thesis was finished, announced that they were abandoning the La Presse model and reintroducing a universal subscription/paywall system (Jackson, 2018).

Attempts to monetize online content demonstrate an intervention by news agencies to reconfigure their business model to one that is financially viable in a digital landscape with increased competition. The spread of Craigslist and Kijiji to cities across Canada in the mid-2000s had a monumental impact on newspapers’ revenues. Prior to the arrival of online classified services, newspapers derived upwards of 50% of their revenues from classified ad sales (Stross, 2005). The next significant impact on newspapers’ revenues was the loss of advertising revenue, described later in this chapter, as advertisers switched to advertising on Google and Facebook. The scarcity mindset engendered by this two-step revenue crisis is not conducive to creating systems to preserve news content. More than ever, news organizations are less concerned with preserving public access as they attempt to right their financial course.

Herein lies the problem with online news sites: despite Allan's excited declaration of their broadcasting potential, news websites are not simply adding a third component to the landscape shared by television and radio news. Certainly, when used to transition traditional broadcast media online, news websites have the potential to remove the temporal constraint of broadcast news, increasing a user’s ability to access content beyond the original time of broadcast. While the capacity of traditional broadcast media may expand with online platforms, the opposite appears to be true for print newspapers that are migrating online. Transitioning a physical artifact to a digital format creates a product that is impermanent and to which access is uncertain.

Jeff Rothenberg's quip that “digital documents last forever—or five years, whichever comes first” encapsulates the vulnerability of content created for and stored on newspaper websites (quoted in Rosenzweig, 2003, p. 740). The digital versions of newspapers cannot be archived in the same ways as their paper counterparts (Rosenzweig, 2003; Maurantonio, 2014). Unlike print news, which is purchased, digital content is licensed. As such, librarians are unable to compile archives with these loaned materials (Rosenzweig, 2003). News aggregation sites such as Factiva, LexisNexis, and ProQuest's Canadian Newsstream act as a conduit between libraries and digital news services, offering subscription-based access to archived digital content. However, this access is not guaranteed and can be eliminated when the aggregator opts to end its subscription to a news provider, or when the publisher decides that it no longer wishes to make its back issues publicly available (Deacon, 2007; Western Libraries, 2011).

A recent example of this is Quebecor Media Company’s 2011 decision to limit access to their publications. As a result, ProQuest and LexisNexis lost access to Quebecor articles, while Factiva's access was limited to articles from the past three months (Western Libraries, 2011). At the time of this decision, the loss of Quebecor publications was profound as they published “43 daily newspapers and over 250 community weekly newspapers” (Quebecor, 2012, para.1). By 2015, Quebecor had sold their holdings to Postmedia (Evans, 2014), reducing their news publications to Le Journal de Montréal, Le Journal de Québec, and Montréal (Quebecor, 2017). While the sale of Sun Media to Postmedia signaled the reintroduction of these papers into the archives, it also points to the troublingly precarious and vulnerable nature of access to digital news content.

News and the Neoliberal Turn

To understand the precarious nature of digital content access, one must consider the political economic environment in which the content exists. Thirty years of neoliberal economic policies in Canada have reshaped many things including the country’s news media landscape. While the changes began under the Mulroney government, they came to fruition through a series of policy and funding changes introduced by Chrétien’s cabinet (Waddell, 2012).

Stephen McBride and Kathleen McNutt (2007) argue that, unlike the American experience of neoliberal reordering which began quite emphatically in the 1970s, the Canadian reordering has been a longer and subtler process. The history of financial cuts to the Canadian Broadcasting Corporation (CBC) provides an apt example of this gradual transition. The CBC was created in 1936 in an attempt to preserve Canadian culture as the country's airwaves became overrun by American programming (Francoli, 2003; Nerberg, 1999). Since then, the CBC has expanded from a radio network capable of reaching half of the Canadian population to a multimedia network. By 1990, the network’s “number of permanent staff exceeded 10,000, expenditures totaled $1.4 billion and 99 percent of the country was within broadcast reach” of the CBC (Nerberg, 1999, para 3). However, a series of cuts would soon cripple the ability of the broadcaster to maintain this scope of production. The first serious blow came in December 1990 when budget cuts resulted in the layoff of 1,000 employees (Nerberg, 1999). In 1995, then-finance minister Paul Martin cut $400 million from the CBC's operating budget, a 40% reduction to their funding (Francoli, 2003; Nerberg, 1999). Despite a temporary reprieve from cuts in the early 2000s (Friends of Canadian Broadcasting, 2014), the broadcaster faced an additional $115 million reduction in its government funding in 2012 (CBC News, 2012).

The funding reductions at the CBC dovetailed with a change in focus by the organization’s leadership. During his tenure as Chief Executive Officer of the CBC, Perrin Beatty (1996) declared “At the CBC, we live and die on our Canadian content—Canadian content for TV and radio, in both languages, available coast to coast” (para 9). Less than a decade later, Beatty's successor, Robert Rabinovitch, recognized that the funding cuts made it increasingly difficult for the CBC to fulfill its mandate to produce such content (Francoli, 2003). To address the cuts incurred during his term as President and CEO, Rabinovitch supplanted notions of the broadcaster as providing a public good with a discourse about the need for the CBC to function like a private broadcaster with an increased focus on creating profitable products (Francoli, 2003). While the neoliberal transition did not result in a total defunding of the CBC, it resulted in a substantial decrease in the broadcaster's capacity to create content and reordered its priorities to emphasize cost-savings. This reordering was perhaps best articulated by Rabinovitch's decision to lock out 5,500 unionized CBC workers—or 90% of the broadcaster's workforce—in 2005 following a contract dispute (CBC News, 2005).

While the CBC’s experience under neoliberalism demonstrates changes to funding, the experiences of private media organizations demonstrate the legislative changes which occurred during the same time period. In a word, the experiences of the latter group during this period were a result of financialization. A by-product of neoliberal economics, financialization is the process by which financial institutions have become central to myriad aspects of the economy (Harvey, 2010). “[T]he growing reliance of the economy on the financial sector [is a] response to general economic stagnation and overproduction” inherent in neoliberal economics (Winseck, quoting Foster & Magnoff, 2010, p. 375). In the United States, this has meant that the financial sector—which accounted for 11-12% of American GNP in the 1980s—came to represent 20-21% by 2005 (Winseck, 2010). In Canada, one of the many impacts that financialization has had is a constant shuffling and shrinking of the news media landscape as news organizations are traded and sold to influence stock prices and shareholder payouts.

Financialization and media concentration are trends that developed concurrently in Canada with the former frequently influencing the latter. Financialization has been made possible by the steady liberalization of financial markets and ownership regulations. At the same time, the concentration of ownership was aided by the Chrétien government, which “eliminated restrictions that had prevented the same organization from owning newspapers and TV stations in the same market” (Waddell, 2012, p.116). The regulatory body tasked with monitoring media ownership, the Canadian Radio-Television and Telecommunications Commission (CRTC), adopted a laissez-faire approach to regulating these transactions. In a 1986 ruling, the CRTC expressed that “preventing ownership concentration was not its primary focus” (Edge, 2013, p.6). It was not until BCE's 2012 attempt to take control of Astral Media that the CRTC intervened in any of the mergers and acquisitions that occurred in this liberalized environment (Edge, 2013). As a result, Canada’s media ownership is some of the most concentrated in the world (Mosco, 2004; Winseck, 2008).

The shifting ownership of Canadian media outlets over the past twenty-five years clearly demonstrates this trend to financialization. The history of the now defunct Guelph Mercury perfectly encapsulates this trend, as it was bundled with other newspapers and changed hands five times in four years in the late 1990s (Dornan, 2003). Sales like those experienced by the Mercury contributed to the growing financialization of the media sector. As Winseck (2010) notes, “media transactions alone in 2000 ($7.1 billion) were more than eight times greater than five years earlier” (p. 375). These transactions included Global’s purchase of the National Post and the Southam newspaper chain from Conrad Black and Hollinger Inc for $3.2 billion (Waddell, 2012). That same year, Bell Canada Enterprises bought CTV, then merged with the Thomson family's holdings—which included the Globe and Mail—to create Bell Globemedia (Waddell, 2012). Quebecor concurrently purchased Vidéotron, TVA, and Sun Media for $7.4 billion (Winseck, 2010). By 2003, the top five Canadian news organizations' circulation share reached 79% of the national total (Winseck, 2008). The concentration continued in 2005 when the Toronto Star—which had already acquired both the Kitchener-Waterloo Record and the Hamilton Spectator in 1998-99—purchased a 20% interest in CTVGlobeMedia (Waddell, 2012).

The Canadian media landscape in 2010 was markedly different from that of even a decade earlier, and its ownership was increasingly concentrated. The Thomson family’s holding company became the sole owner of the Globe and Mail, while remaining invested in CTVGlobeMedia with Torstar and Bell (Bradshaw, 2015a). The same year, Postmedia (2011) reported a national circulation share of 31%. This market share increased further with their 2014 purchase of 175 Sun Media publications from Quebecor (Evans, 2014). As a result, of the 103 newspapers published in Canada in 2015, Postmedia owned 45 (Newspapers Canada, 2016). Changes in ownership not only leave newsrooms vulnerable to downsizing but also threaten the access to archived content. In the case of the Quebecor sale of Sun Media to Postmedia, this acquisition expanded access to archived materials; however, this expanded access is not guaranteed in perpetuity. With Postmedia’s (2016) debt totaling $653.1 million in 2016, the repackaging and reselling of these media assets is entirely plausible. As long as financialization remains a shaping influence of the Canadian media system, such instability will remain a defining attribute of the landscape.

While the Guelph Mercury’s history, and shuttering by Metroland in 2016 (Houpt, 2017), demonstrates the experience of individual news outlets that are traded in this new landscape, there is another experience occurring at the conglomerate level. These larger organizations are being restructured and refinanced to trade and consolidate media holdings. This process is perhaps best exemplified by Postmedia’s recent history. When CanWest Global filed for creditor protection in 2009, Postmedia grew out of the media company’s repackaged assets (Gutstein, 2014). Along with the new company came a new major shareholder, GoldenTree Asset Management LP (Gutstein, 2014). At the time of the reordering, the debt owed to GoldenTree was converted into a 35% ownership of Postmedia (Gutstein, 2014). Gutstein (2014) notes that the interest in Postmedia by American hedge funds like GoldenTree reveals a grim future for the media company. “Hedge funds—aptly referred to as vulture funds—specialize not in building companies but in profiting from their distress through corporate bankruptcies, restructurings and financial liquidations. They profit from lending money to companies ‘at or near the end of their present financial ropes’” (Gutstein, 2014, para 13). Postmedia president and CEO admits that GoldenTree’s interest in Postmedia is one focused on business, not on news production (Bradshaw & Kildaze, 2016).

That interest has only intensified in recent years. Since 2009, GoldenTree has increased their stake in Postmedia. In their 2016 annual report, Postmedia confirmed that “as at August 31, 2016, [GoldenTree] owned 146,694,259, or 53%, of our Variable Voting Shares” (Postmedia Network Canada Corp., 2016, p. 27). Despite an attempt earlier in 2016 to sell part of their stake in the company (Bradshaw & Kildaze, 2016), GoldenTree remains invested in the Postmedia operation. They are not alone. Following a major debt restructuring in 2016, Postmedia's second lien debt holders exchanged their $345 million of debt owed for 98% of the company’s shares (Canadian Press, 2016).

Gutstein (2014) notes that the “profit enhancement” priority of these investors is largely achieved by reducing operating costs through layoffs, salary reductions, and cutting employee benefits. It is unsurprising, then, that in 2013 Postmedia announced a three-year “Transformation Program” aimed at significantly reducing infrastructure costs (Postmedia Network Canada Corp., 2016). The first manifestation of this plan was a series of layoffs in 2014 (Baluja, 2014). In their fiscal year ending August 31, 2015, Postmedia reported a successful Transformation Program which included

the shutdown of Postmedia News, our breaking news service, the centralizing of editorial production services through Postmedia Editorial Services in Hamilton, the streamlining of advertiser flyer insert operations, the cancellation of Sunday editions in three markets due to unprofitability, the outsourcing of our classified call centre and the outsourcing of production in certain markets. (Postmedia Network Canada Corp., 2015, p.38)

Before the Transformation Program was even finished, Postmedia announced a new three-year Transformation Program with the intention of reducing costs by an additional $50 million. By January 2016, the cost reduction target was increased to $80 million, which Postmedia intended to achieve through “a combination of acquisition synergies and further reorganization of our operations” (Postmedia Network Canada Corp., 2016, p.9). Following the purchase of Sun Media, the company announced a new “integration initiative” designed to once again reduce staff costs. The first act of this plan saw the merging of Postmedia and Sun newsrooms in , , , and , laying off 90 staff (Bradshaw, 2016). This reorganization also includes a company-wide voluntary buyout program introduced in 2017 which aims to reduce Postmedia’s compensation costs by 20% (Craig, 2016; Postmedia Network Canada Corp, 2016).

At Globemedia, staff have faced similar, though not as severe, cuts. In September 2016, Globe and Mail staff were asked to take voluntary buyouts with the goal of reducing the newspaper’s staff by 40 positions (Craig, 2016). On November 18, 2016, 20 editorial staff and 10 advertising staff members accepted their voluntary buyouts and left the paper. An additional five advertising staff were laid off to reach the cost reduction targets (Watson, 2016). This was the third series of layoffs at the Globe since 2013 (Craig, 2016). The layoffs have included senior editors of the newspaper’s website and Report on Business, as well as members of the photography department and senior correspondents (Baluja, 2014a; Watson, 2016).

The experiences at Globemedia and Postmedia are indicative of an overall downsizing trend in Canadian news agencies, particularly print agencies. The Canadian Media Guild (2013) estimates that, between 2008 and 2013, 10,000 jobs were cut from Canadian media firms. The CMG estimates that 6,000 of these jobs were positions in print media (Wong, 2013). Layoffs have shown no sign of stopping.

The cost cutting extends beyond individual writers to entire departments. Within a financialization model, priorities for resource allocation shift. Chris Waddell (2012) identifies the removal of parliamentary reporters by the majority of regional Canadian news outlets as one of the most problematic decisions of this new epoch. Starting in the 1990s, reporters began to cover issues in Ottawa from their home newsrooms, relying heavily on newswires to provide details about activities on the Hill. This trend to shutter parliamentary bureaus has only continued since then. In 2016, the Parliamentary Press Gallery was the smallest it has been since this trend began in 1994 (Britneff, 2016).

At the end of last year, there were 14 fewer print outlets in the Gallery than there were in 1994. Sun News—owned by Postmedia since 2015—is down to one reporter from six. The and the each had one correspondent on Parliament Hill, which they lost after 2004 and 2009, respectively. The had parliamentary reporters in 1994 and cut its presence on the Hill by 2009. (Britneff, 2016, para 10-11)

Likely as part of their Transformation Program, in 2014, Postmedia closed their Parliamentary Bureau (Baluja, 2014b).

As a result of the cost-cutting exodus from Ottawa, Canadian political coverage has become largely homogenized. Specific accounts of local MPs’ actions in parliament have been replaced by generic overviews of issues that were expected to be of the most interest to the greatest number of readers across the country (Waddell, 2012). This homogenization has been intensified by cuts to the wire services that newsrooms now rely on for parliamentary coverage. has reduced its staff in the gallery from 26 reporters in 1994 to 18 in 2016 (Britneff, 2016). In addition to these cuts, the number of wire services operating in Canada has declined at the hands of Postmedia, which, in 2012, closed its wire service, CanWest News Service (Bradshaw, 2015a). After acquiring QMI Agency from Quebecor in 2015, Postmedia also closed this news wire service (Bradshaw, 2015a).

These cost-saving measures reveal a great deal about Canadian news organizations and their priorities. The focus on financialization reveals the shifting priorities of private media organizations in Canada.

Instead of investing in cutting-edge network infrastructure and adapting to new media forms, incumbent media and telecom firms have mostly spent the past decade-and-a-half amalgamating and subsequently retrenching under the weight of fairy-tale levels of capitalization, enormous debt, and dubious business strategies. (Winseck, 2010, p.374)

Does Digitization Compound or Obscure Neoliberalism's Effects?

Media companies including Globemedia and Postmedia are quick to point to the rise of digital activity as a cause of lost revenue from declines in both ad sales and readership. There is certainly some truth behind these claims. The widespread adoption of internet-connected handheld devices has rapidly changed the ways that people access information. But are these changes wholly to blame for layoffs and closures at news organizations or do they give opportunity for organizations to achieve neoliberal goals of restructuring, reselling, and downsizing?

Despite the increased competition for audiences that newspapers face in the digital age, they remain important sources of information for Canadians. In 2016, eight out of ten Canadian adults reported reading a newspaper at least once each week (Vividata, 2016). Readers of all ages engage with newspaper brands. Over the average week, daily newspapers reach 74% of millennials (18-34) with 65% of readers in this study using a device to access news content (Vividata, 2016). For Postmedia, these numbers translate into 8.3 million readers each week for their newspapers (Standing Committee on Canadian Heritage, 2016) while their digital content reaches 6.1 million readers in the same period (Craig, 2016b). The Globe and Mail has the highest online readership, reaching 4.5 million Canadians each week (Craig, 2016a). However, the newspaper’s weekday print circulation continues to fall, declining from 290,000 in 2013 to 230,000 in 2015. Despite readers continuing to engage with Canadian newspaper content, this interaction is not translating into revenues in the same way that print subscriptions once did. The introduction of a paywall to the Globe’s site has “meant that readers contributed 40 per cent of its revenue, up from 25 per cent a decade earlier - or, in other words, a shift away from advertisers” (Craig, 2016a, p. FP2). Despite this increase, Sean Craig (2016a) cautions that digital subscriptions “[have] not been nearly enough to make up for lost revenues in newspaper circulation and advertising sales” (p. FP2).

Advertising revenue is a critical revenue stream for news outlets. Postmedia reports that the “daily newspaper industry’s revenue was $3.1 billion in 2010, with 74% of that revenue derived from print and online advertising and the balance from circulation” (Postmedia Network Canada Corp., 2011, p.14). For Postmedia, “print advertising revenue was $159.6 million and $356.9 million for the three and six months ended February 28, 2011, representing 66% and 67% of total revenue for such periods, respectively” (Postmedia Network Canada Corp., 2011, p.2). Although advertising remains their primary revenue source, Postmedia reports declining advertising revenues for their newspapers, a loss attributed to a weak economy post-2008 and marketers' increased interest in targeting online audiences (Postmedia Network Canada Corp., 2011). While marketers have transitioned significant portions of their campaigns to digital platforms, they have not brought these digital ad contracts to newspapers’ websites.

Both Globemedia and Postmedia correctly identify that this shift to digital advertising has resulted in declining ad revenue for their outlets. Despite being the primary news source for approximately a third of internet users in North America, news websites receive a much smaller portion of online ad revenue. Between 2005 and 2009, newspapers' share of digital ad revenue fell from 16.2% to 11.4% (Ives, 2010). In 2017, Google and Facebook captured “an estimated 72% of the $5.5 billion internet advertising market in Canada…In fact, Facebook and Google’s internet revenue in Canada are five and ten times those of the entire newspaper industry’s online and mobile advertising revenue (i.e. $258.4 million)” (The Canadian Media Concentration Research Project, 2017, p.54).

To address this loss of revenue, the Public Policy Forum (2017) proposes revising Section 19 of the Income Tax Act to extend the distinction between Canadian and non-Canadian media to web-based platforms, incentivizing advertisers to place their advertising with Canadian outlets. The revised Section 19 would “introduc[e] a 10 percent withholding tax on advertising expenditures in non-qualifying media” (Public Policy Forum, 2017, p.83). This tax, which would be used to finance the “Future of Journalism and Democracy Fund,” is anticipated to generate $300 to $400 million annually and would subsidize media organizations in Canada (Public Policy Forum, 2017). In addition to creating this fund, the PPF suggests that “the Government of Canada should advertise only in media that qualify under Section 19 provisions” (Public Policy Forum, 2017, p.83).

The PPF’s recommendations are remarkably similar to requests that Postmedia’s Paul Godfrey made during his presentation to the Standing Committee on Canadian Heritage. During his speech, Godfrey asked for both an incentive program to bring advertisers to Canadian news outlets and specifically for the government to place their advertisements with Canadian news agencies. He also requested two incentive programs that would help fund Postmedia’s various newspapers (Standing Committee on Canadian Heritage, 2016).

Of course, there is important context to how digital ad revenue is distributed and why newspapers’ share of online ad sales is not equal to their audience share. Canadian news websites sell advertising space at three to four times the standard rate. Ad space on websites is sold by one of two values: cost per 1000 impressions (CPM) or cost per click (CPC). An analysis of $100 million of ad revenue in the third quarter of 2016 found that the average CPM was $7.13 (AdEspresso, 2016). By contrast, the National Post sells ad space with a CPM of $20-30, depending on the ad placement (nationalpost.com, n.d.) and the Globe and Mail’s CPM ranges from $18-35 (The Globe and Mail, 2017a).

The $7.13 average CPM rate demonstrates the shifting advertising landscape that news organizations find themselves in. By contrast, a printed banner ad costs between $6,841 and $8,557 in the Globe and Mail (The Globe and Mail, 2017b) and $4,575 in the National Post (National Post, n.d.b). It remains unclear if the CPM prices set by these organizations are willful ignorance of the market rate or a failure to recalibrate to this new marketplace where the organizations’ audience commodity is no longer sold for a premium.
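The gap between these rates can be made concrete with a little arithmetic. The following is a minimal sketch in Python, using only the CPM figures quoted above; the function names and the dictionary layout are illustrative assumptions, not part of any ad platform's tooling.

```python
# CPM figures as quoted in the text; all names below are illustrative.
AVERAGE_CPM = 7.13  # average cost per 1000 impressions (AdEspresso, 2016)

NEWSPAPER_CPM_RANGES = {
    "National Post": (20.0, 30.0),
    "Globe and Mail": (18.0, 35.0),
}


def cpm_multiple(cpm: float, baseline: float = AVERAGE_CPM) -> float:
    """Return how many times larger a CPM is than the baseline rate."""
    return cpm / baseline


def campaign_cost(impressions: int, cpm: float) -> float:
    """Return the cost of a campaign billed per 1000 impressions."""
    return impressions / 1000 * cpm


for outlet, (low, high) in NEWSPAPER_CPM_RANGES.items():
    print(
        f"{outlet}: {cpm_multiple(low):.1f}x to {cpm_multiple(high):.1f}x "
        "the average CPM"
    )
```

With the quoted figures, the multiples run from roughly 2.5 to 4.9 times the average rate, depending on the outlet and placement.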

Declines in revenues and subscriptions are cited by Globemedia and Postmedia as the reasons driving their layoffs (Baluja, 2014a; Craig, 2016a; Craig, 2016b). They are not alone. These decisions mirror a growing trend amongst modern news organizations. Aeron Davis (2000b) notes that news media corporations have responded to declining profits with “[d]e-unionization, the use of freelancers and short-term contracts, the 'pooling of journalists', 'multi-skilling' and merging of sister papers” (p.44). Postmedia's 2011 Prospectus confirms that the company has undertaken many of these initiatives, as well as reducing employee benefit and pension plans. As a result, “[c]ompensation expenses, which are comprised of payroll and contractor expenses, decreased $12.9 million or 11%, to $104.5 million for the three months ended February 28, 2011 as compared to $117.4 million for the three months ended February 28, 2010” (Postmedia Network Canada Corp. 2011b, p.42).

Internet enthusiasts might argue that the reduced capacity of newspapers to provide and archive content is increasingly irrelevant as more and more digital news organizations join the online news landscape. Such optimism about the capacity of the internet to usher in an era of citizen journalists providing real time content is not new and echoes a broader optimism about the Information Age. As Robert Neubauer (2011) aptly identified, such narratives have served the neoliberal reordering. It is true that the digital news landscape is more diverse and densely populated than its print counterpart. In their 2013 media review, “the Pew Research Centre estimate[d] that there were 438 small digital news organizations in the US in 2013, most of which are digital-first startups” (Boss & Broussard, 2017, p.2). However, the health of the newspaper system remains critical to the health of the entire news media landscape as

[n]ewspapers play a dominant role in the gathering and dissemination of the news. Their newsrooms are far better staffed than those of the other media, and each day, in most of the communities they serve, they cover more events than their competitors do, and often in greater depth. (Sauvageau, 2012, p.30)

Florian Sauvageau (2012) notes that, in a sample collected by the Pew Research Centre, 95% of articles and reports reviewed contained information from traditional media, primarily newspapers. Despite transformations to the news landscape, news agencies remain critical sources of information.

This role as a source of information is not naturally congruent with news organizations’ focus on profitability. Moreover, cost-saving measures threaten longer-term access to digital content as the drive to reduce costs exacerbates publishers’ pre-existing aversion to archiving materials. Rosenzweig (2003) explains, “publishers have not traditionally assumed preservation responsibility since there is no obvious profit to be made in ensuring that something will be available or readable in a hundred years when it is in the public domain and can't be sold or licensed” (6). Online newspapers archive only the content that they believe can generate revenue by being accessible in the future.

This disinterest in archiving is captured by the recent history of Postmedia’s online service, Infomart. Postmedia's description of their archive on their website focuses on FPInfomart's capacity to access “more than 60 million articles, blog entries and news clips” (Postmedia Network Inc., 2010, para. 1). Here the emphasis is on Postmedia's ability to resurrect past content. However, by 2011, Postmedia announced FPInfomart's partnership with social network monitoring company, Social Media Group. The redesigned Infomart was described as “a one stop resource for media monitoring, financial and corporate data and offers a complete, online media monitoring product … an indispensable resource for information professionals, business communicators, knowledge workers, researchers” (Postmedia Network Inc., 2011a, para. 7). Over the course of this project, Infomart has fully transitioned from Postmedia’s archive to a social listening service, offering a service that looked more like Meltwater’s widget-driven social media tracking platform than Factiva’s article archiving service. Over five years, Infomart came to reflect the priorities outlined in Postmedia's 2011 annual report, and also reflect the disincentive to archive unprofitable content as identified by Rosenzweig. At the time of writing this paper, Postmedia had finalized the sale of Infomart to Meltwater for $38 million (Hasselback, 2017).

In the aftermath of the Infomart sale, neither Globemedia nor Postmedia has a public-facing online archive for their articles. This means that articles are preserved precariously on the websites with no guarantee that they will remain there. In this precarious system, what are the criteria that determine an article’s fate?


Chapter 2 Investigating Digital News Archives: Existing Research and Methods Used in This Study

Understanding the disappearance of the Shannon class action lawsuit articles is hindered by an overall lack of published research on the uneven archiving of online news. At the start of this project, very little had been written about the rates of archiving of digital news. As Kathleen A. Hansen and Nora Paul (2015) note, “the landscape of digital news preservation is still mostly uncharted” (pp.290-291).

Existing Research

Perhaps the most notable research on digital news archives available at the time was David Deacon's (2007) study of LexisNexis' archiving of British news articles. Deacon (2007) assessed the intra- and inter-archive reliability of LexisNexis' records of content from ten British print newspapers. As part of this project, Deacon (2007) “selected three random days distributed five months apart and checked each item published in the hard copies of each of the UK national daily press to see whether it was present in the Lexis-Nexis archive” (p.20). This portion of Deacon’s research revealed two major absences. Both gaps occurred on April 1, 2006. The first gap was a loss of 18% of The Times' total stories for that day, while the second gap resulted in 12% of the Daily Mirror's total news space not being archived (Deacon, 2007). “No systematic pattern was evident in the omitted material” but Deacon (2007) is quick to caution that “these tests do not completely rule out the possibility that there may be areas of the archive where exclusions are both patterned and considerable” (p.20). The lack of comparable digital news archive studies results in an incomplete understanding of how and why certain content is not included.

In the ten years since Deacon published his research, the digital news landscape has changed considerably due largely to the arrival of digital-first and born digital news services. These changes have attracted the interest of academics and researchers, and not a moment too soon. “Four decades after news organizations began creating and storing the news digitally, the questions surrounding preservation and archiving this content are just starting to be asked outside of a small circle of ‘insiders’” (Hansen & Paul, 2015, p.290).

Over the past five years, the digital-first and born digital news landscape has grown exponentially. Katherine Boss and Meredith Broussard (2017) argue that the arrival of born digital news content (including blogs, videos, and interactive components) has brought with it an era of crisis in news archiving. But perhaps it is more accurate to suggest that the introduction of born digital content has not created a new crisis but, rather, has highlighted the vulnerability and ephemerality of digital news as a whole. A telephone survey of news workers conducted by researchers with the Donald W. Reynolds Journalism Institute speaks to a broader crisis. Many of the survey participants reported a lack of archiving policies at their news agencies (Carner, McCain, & Zarndt, 2014). Of the 476 interviews, the majority (406) were with journalists from traditional print newspapers with websites (called hybrid), while a smaller group of participants (70) represented online-only news organizations (called online-only). Among this group, 64% of representatives of hybrid sources and 70% of representatives of online-only sources reported not having official archiving policies regarding the preservation of born digital news content (Carner, McCain, & Zarndt, 2014). Further, 27% of the hybrid news sources reported experiencing significant content loss, as did 17% of their online-only counterparts (McCain, 2015). Given these numbers, it is unsurprising that, in 2013, the National Digital Stewardship Alliance deemed born digital news content as at-risk, likely to be lost from publicly accessible channels (Moore & Bonnet, 2015).

“One paradox about the virtual presence of online journalism is that content can often be revised or deleted by the original publisher with little effort and, in most cases, a large amount of success” (Shewchuk & Mietkiewicz, 2009, p.46). While these deletions reduce publishers’ costs, they prove incredibly detrimental for researchers. Lynne Cooke's (2005) analysis of the past forty years of news media design is an example of a study that was impacted by this absence of data. “Unlike newspapers and television news programs, there is no comprehensive archive of news websites” (Cooke, 2005, p.28). For Cooke (2005), this lack of data “resulted in a purposive sample that was largely based on availability” (p.28). Thus, what research manages to be conducted on this topic must have methods configured to the shortage of available data.

Certain types of content are more vulnerable to erasure than others. Traditionally, images tend not to be included in news archive databases, as news archiving services like LexisNexis are text-based services. Deacon (2007) notes that the resulting loss of “the visual dimension of the news” that occurs when articles are inputted into text-only databases is a “significant omission” that limits how researchers can engage with and analyze the articles (p.10). In her investigation into archived news stories of the 1985 MOVE bombing in Philadelphia, Nicole Maurantonio (2014) found that the exclusion of photographs from text-based news archives presented an incomplete account of the event’s coverage. “In lacking access to photographs in digitized newspapers, I was unable to see, quite literally, the ways in which photography was used as a mode of suggesting clarity to a narrative that was largely fragmented” (Maurantonio, 2014, p.97). The loss of news content’s visual attributes in the archiving process changes the experience of the text.

The limitations of text-based archives in fully capturing news content are exacerbated by the growth of video-based, interactive, and live content on news sites. This is particularly troubling as the amount of born digital and digital-first content increases. “Unlike content digitized from analog media, born digital often has no physical surrogate to serve as an effective fallback” (McCain, 2015, p.338). While photographs used in print newspapers can potentially be found on microfilm, dynamic digital content has no failsafe backup.

Recognizing this threat, libraries and archives are struggling to find ways of archiving these emerging types of news content (Moore & Bonnet, 2015). However, they are not met with enthusiasm or co-operation from their news organization counterparts. While the amount of digital news content grows, the capacity and impetus of news organizations to archive it has dwindled. Revenue generation is the top priority for news organizations, not archiving born digital content (Moore & Bonnet, 2015). As a result, news organizations have scaled back or shuttered both private and public news archives (Hansen & Paul, 2015; McCain, 2015). As previously mentioned, although falling profits are pointed to as the reason for these cuts, researchers have identified a separate driving force for the changes. While interviewing news agencies, Hansen “was told that there was no need to archive the online [news] because it was a service, not a publication” (Hansen & Paul, 2015, p.292). Despite the popularly held view of news organizations as keepers of the public record, it is not a view shared by the organizations (Hansen & Paul, 2015). As such, McCain (2015) quips, “The problem [of archiving born digital news content] is made more difficult by misaligned goals of major players” (p.339).

Given the uncertainty of access over time to digital news content, several researchers have raised questions about how these practices will affect future research. While the digital turn promised to increase accessibility to texts by unbridling them from their spatial-temporal constraints (Maurantonio, 2014), this techno-optimism has run up against neoliberal economic realities. “Increasingly technologies, digital and analog, often govern the realm of possible research. The types of research projects on which scholars embark typically reflect the situational constraints confronting them” (Maurantonio, 2014, p.89). Hansen and Paul (2015) echo the concerns raised by Maurantonio: “Research questions that can now be answered through the use of print newspaper archives will be much harder to answer in the future” (Hansen & Paul, 2015, p.297). The authors give the example of a future historical analysis of media coverage related to climate change. If there isn’t a reliable repository of news articles, will researchers be able to track when the public became aware of climate change? Hansen and Paul (2015) suggest that the born digital content that has been created about this topic has likely already been lost and future content will face a similar fate. They conclude that, even in the digital age, the best bet a researcher in 2045 would have at finding articles from 2015 would be microfilm.

My project’s consideration of the effects of online newspapers on archiving processes is not an argument supporting displacement theory which proposes that “once a new medium enters the arena, its users must reallocate their limited amount of time” from the media which they previously accessed (De Waal & Schoenbach, 2010, p.479). Certainly, print newspapers still exist, and continue to be accessed by readers. As De Waal and Schoenbach (2010) note, changes in media use patterns are complex and the re-allotment of time to new platforms is difficult to accurately trace. The Newspaper Audience Databank released data in March 2012 which identifies that the majority of Canadians still prefer to read print newspapers over their online counterparts (Ladurantaye, 2012). Yet the same study also finds that subscriptions to the National Post and Toronto Star continue to decline (Ladurantaye, 2012). To understand the effects that digital news is having on print newspapers (and therefore on story archiving), it is necessary to explore beyond the consumption patterns of users.

Methods Used in This Thesis

This thesis is an attempt to provide such an exploration. I am specifically interested in examining the rates at which online Canadian national newspapers archive and permanently delete articles. I employ Blair Shewchuk and Mark Mietkiewicz's (2009) definition of permanent deletion as “removing all traces of a story so that it is unavailable through links to the [news] website or the [news site's] search engine” (p.244). In addition to monitoring archiving and permanent deletion trends, this project will document the variation in rates at which national news articles are archived by secondary archiving services Canadian Newsstream, Factiva, and LexisNexis.
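The definition of permanent deletion quoted above can be read as a simple decision rule: a story counts as permanently deleted only when it can be reached neither through its link nor through the site's own search. The following is a minimal sketch of that reading in Python. The record layout and every name in it are hypothetical and are not drawn from the coding protocol used in this project.

```python
from dataclasses import dataclass
from enum import Enum


class ArticleStatus(Enum):
    AVAILABLE = "available"
    PERMANENTLY_DELETED = "permanently deleted"


@dataclass(frozen=True)
class Observation:
    """One follow-up check of a sampled article (hypothetical layout)."""

    link_resolves: bool         # the recorded story link still loads the story
    found_in_site_search: bool  # the news site's own search returns the story


def classify(obs: Observation) -> ArticleStatus:
    """Treat a story as permanently deleted only if no trace remains."""
    if obs.link_resolves or obs.found_in_site_search:
        return ArticleStatus.AVAILABLE
    return ArticleStatus.PERMANENTLY_DELETED


# A story whose link is dead but which still appears in search is not
# counted as permanently deleted under this reading.
print(classify(Observation(link_resolves=False, found_in_site_search=True)).value)
```

Requiring both checks to fail keeps the category narrow, which matches the “all traces” wording of the definition.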

The project is divided into three sections: collection and coding; observation and recoding; analysis and evaluation. To assess potential trends in deletion rates, I employ quantitative content analysis. Content analysis is useful for identifying article attributes and potential correlations between these attributes and rates of deletion.

When designing a content analysis project, the quality criteria for the sampling method and coding protocol are reliability and validity (Titscher et al., 2000). According to Krippendorff: “A content analysis is valid if the inferences drawn from the available texts withstand the test of independently available evidence, of new observations, or competing theories or interpretations, or of being able to inform successful actions” (Krippendorff, 2004, p.313). Extending beyond the traditional categories of internal and external validity, Krippendorff (2004) identifies three categories of validity: face, social, and empirical. The last category breaks down further into three levels of sub-categories. Face and social validity assess the relationship between a given research project and its sociocultural context while empirical validity refers to the effective design of project components and the relationships of these components to similar research (Krippendorff, 2004). Within empirical validity, Krippendorff (2004) identifies the importance of the following types of validity: sampling, semantic, structural, functional, predictive, and correlative, which divides into the subsections of convergent and discriminant validity. Each of these types of validity highlights the importance of project components' ability to measure, represent, model, assess, and/or make predictions about their subject of inquiry.

Reliability relates to validity insofar as a high degree of the former is considered a necessary condition for the latter (Titscher, Meyer, Wodak, & Vetter, 2000). “A research procedure is reliable when it responds to the same phenomena in the same way regardless of the circumstances of its implementation” (Krippendorff, 2004, p.211). Krippendorff (2004) identifies three levels of reliability with which to verify the usefulness of a content analysis project: stability, reproducibility, and accuracy. Stability is the weakest form of reliability, referring only to the ability of researchers to achieve consistent results when retesting with their coding protocol (Krippendorff, 2004). Reproducibility extends one step further, as this type of reliability requires the successful replication of a test by a researcher independent from the original project (interobserver or intercoder reliability). Accuracy is the extent to which a project measures its intended subject within the specifications it originally identified (Krippendorff, 2004).

Both the reliability and validity considerations outlined above informed the creation of this project’s methodology. Wherever possible, I implemented safeguards to ensure the highest level of both reliability and validity. These safeguards include field-testing the sampling method and coding protocol; having colleagues review the coding protocol to ensure its clarity; and the addition of a re-coding to the final methodology (see appendix: final coding protocol).
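One way to picture a stability check is to compare the codes assigned to the same articles in an initial coding pass and a later re-coding pass. The following is a minimal sketch in Python, assuming nominal codes and two passes over the same items. It computes simple percent agreement and Cohen's kappa, which are swapped in here for illustration because they are easy to state; Krippendorff (2004) proposes his own alpha coefficient, which is not implemented below. The function names and the sample codes are hypothetical.

```python
from collections import Counter
from typing import Sequence


def percent_agreement(first: Sequence[str], second: Sequence[str]) -> float:
    """Share of items that received the same code in both passes."""
    if len(first) != len(second) or not first:
        raise ValueError("both passes must code the same non-empty set of items")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohens_kappa(first: Sequence[str], second: Sequence[str]) -> float:
    """Agreement between two passes, corrected for chance agreement."""
    observed = percent_agreement(first, second)
    n = len(first)
    first_counts = Counter(first)
    second_counts = Counter(second)
    # Chance agreement: probability both passes pick the same category
    # if each pass assigned codes independently at its observed rates.
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both passes used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for five articles in the initial pass and the re-coding.
initial = ["wire", "staff", "staff", "wire", "staff"]
recoded = ["wire", "staff", "wire", "wire", "staff"]
print(percent_agreement(initial, recoded))
print(cohens_kappa(initial, recoded))
```

Because both passes here would be made by the same researcher, a comparison of this kind speaks only to stability; it says nothing about reproducibility, which requires an independent coder.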

In his keynote address to the “Dodging the Memory Hole: Saving born digital News Content” Forum, Clifford Lynch cautioned that “it seems very reasonable to talk about preserving the web, until you realize that today in most cases, everybody sees a different web—every time they visit” (quoted in McCain, 2015, p.345). Recognizing the prevalence of tracking cookies and their role in shaping user experience, both ad-blocking and auto-deleting cookie extensions were added to the browser used to load the news sites for collection. This ensured that any other activity on the computer would not inform, alter, or shape the order in which the news sites presented articles to the researcher.

Field-Test

The ability of the project’s sampling method and coding protocol to accurately collect and codify articles was evaluated using a four-day field-test. Originally, the sample for this project was collected over four weeks on Wednesdays. For this initial sample all of the front page stories of CBC News.ca, the Globe and Mail, and the National Post were included. Each of these front pages hosts approximately 100 stories (see Appendix Images 8-10). During the initial capture, the only information recorded about the stories was the collection date; the original date of publication; the source, or news website; the story headline; and the story link. Additional story attributes were coded after the sample was collected.
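The five fields recorded during the initial capture map naturally onto a small record type. The following is a minimal sketch in Python of how such a capture log might be represented. The class name, field names, helper function, and sample values are all hypothetical, and the duplicate filter is an illustrative convenience rather than a step taken from this project's protocol.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List


@dataclass(frozen=True)
class CapturedStory:
    """One front-page story as logged at capture time (hypothetical layout)."""

    collection_date: date   # day the front page was sampled
    publication_date: date  # original date of publication
    source: str             # news website the story was collected from
    headline: str
    link: str


def drop_repeated_links(stories: Iterable[CapturedStory]) -> List[CapturedStory]:
    """Keep the first record for each (source, link) pair, in capture order."""
    seen = set()
    unique = []
    for story in stories:
        key = (story.source, story.link)
        if key not in seen:
            seen.add(key)
            unique.append(story)
    return unique


# Placeholder values only; these are not records from the actual sample.
first = CapturedStory(date(2016, 1, 6), date(2016, 1, 5), "Example Daily",
                      "Example headline", "https://example.com/story-1")
repeat = CapturedStory(date(2016, 1, 6), date(2016, 1, 5), "Example Daily",
                       "Example headline (repeated lower on the page)",
                       "https://example.com/story-1")
print(len(drop_repeated_links([first, repeat])))
```

Keeping the capture record this small separates what must be logged at the moment of collection from the story attributes that can be coded afterwards.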

For the preliminary sample, screenshots were taken of each website's front page prior to starting the documentation of individual articles. Screenshots acted as an important guide during sample collection as they offered a visual record of a page’s content at the moment of recording. This record prevented the sampling process from encountering problems similar to those identified by Brügger (2005) in his attempts to document an ever-changing website. During the field-test, however, all three of the screenshots were taken prior to collecting the sample articles. This meant that screenshots for each site were taken at the start of the day’s collection process. Due to the large sample from each page, the collection process took over an hour per page. As a result, by the time stories were collected from the first site, the content on the other two sites no longer reflected the screenshot taken at the start of collection.

Using the front page in its entirety as a sample proved a daunting task with little methodological merit. As the majority of online news readers only read “above the fold” (the content that fits on their screen when the website loads; Chu, Paul, & Ruel, 2009), many of the stories lower on the page were repeated content from the first quarter of the page, along with content pulled from YouTube and user-generated content. In short, content published below the virtual fold had a substantial amount of duplicated content and filler.

The initial intention of spreading the collection out over four weeks was to prevent the sample from being overloaded with one story focus—a risk when collecting content for consecutive days. The decision to only capture content on Wednesdays resulted in the inability of the sample to collect different content features published throughout the week, such as weekly columnists.

To field-test the coding protocol, fifteen stories from the sample were selected. Stories with varying attributes and subjects were purposely chosen to evaluate the protocol's ability to code the attributes of a wide array of articles. As Matthew Reason and Beatríz Garcia (2007) found, it is “not possible to field-test a systematically representative sample, merely field-test an intelligently selected sample”. From these tests, the code was revised four times. During this time, the protocol was also assessed by two colleagues. These evaluations were sought in order to assess the protocol's reliability. As I was going to complete the project as a single researcher, such safeguards were necessary to ensure that the protocol was reliable, as its reproducibility would not be verified during its use.

The coding protocol for this project changed considerably from its initial design (see Appendix A for the original protocol, Appendix B for the final protocol). The original protocol followed the design utilized by Shane Gunster (2010) for his study of climate change-related news stories. Gunster's (2010) coding guide provided useful criteria with which to code structural aspects of the stories. These criteria were refined during the field-test to better measure the attributes of the stories in this sample.

The original coding system attempted to capture journalistic attributes using the protocol proposed by Reason and Garcia (2007). In their study of media coverage of Glasgow’s Year of Culture, the authors created what seemed to be useful systems for identifying story attitude and position (Reason & Garcia, 2007). However, two field-tests of their system proved that these mechanisms for assessing position based on the percentage of news space devoted to a subject had low stability. While this system may have worked for its authors, for this project it lacked reliability. The criteria for assessing attitude based on an article's descriptive or analytical text (also based on Reason and Garcia's model) seemed logical prior to the field-tests. As with the mechanism for determining position, the assessment of attitude failed during its retest. The assignment of attitude proved arbitrary, and therefore, problematic.

Controversy, of specific importance to this project, was originally coded based on the topic of the controversy: legal, legislative, environmental, etc. During the initial field-test, this system proved capable only of producing vague and inaccurate results. The second attempt at coding controversy involved the creation of categories for the actors involved in the controversy: who alleged what about whom. This system also proved inadequate, as controversy is rarely presented so succinctly and the presence of multiple parties alleging multiple controversies rendered this mechanism powerless to code story content. Coding to indicate controversy was eventually removed from the study’s purview given this difficulty. Future attempts to code controversial stories would likely benefit from an open coding approach that allows the researcher to capture the stories’ attributes more accurately.

The Final Method

Stage One: Collection and Coding

This project’s sample comprises stories collected from online Canadian national news sources. The news websites included in this project are the National Post (the Post), the Globe and Mail (the Globe), and the CBC News (the CBC) websites.

For the sample, stories were captured using a constructed week sampling method. This method is critical to capturing an accurate sample of online news, accounting for the varying types of content available on different weekdays (Hester & Dougall, 2007). The constructed week sample for this project comprises six captures collected over a six-week period. While originally conceived of as a seven-day capture, it was reduced to six days to reflect the six-day print week followed by both the Globe and Mail and the National Post and to prevent the collection of duplicate content. At the start of each week during the six-week collection period, a number from one to six was chosen randomly using number generation software to determine which day of the week to capture. Content from all three news sites was then collected for the selected day.
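The weekly draw described above can be sketched in a few lines of Python. This is only an illustration, not the software used for the project. It assumes that the numbers one to six map onto a Monday-to-Saturday print week and that a day already drawn is not drawn again, so that each day of the constructed week is represented once. Neither assumption is stated in the text, and the function name and seed are hypothetical.

```python
import random

# Assumption: 1-6 correspond to Monday-Saturday (the six-day print week).
# The thesis does not state this mapping explicitly.
PRINT_WEEK = {
    1: "Monday", 2: "Tuesday", 3: "Wednesday",
    4: "Thursday", 5: "Friday", 6: "Saturday",
}

def draw_constructed_week(seed=None):
    """Return one capture day for each of the six collection weeks.

    Assumption: days are drawn without repeats, so the six captures
    together cover every day of the print week exactly once.
    """
    rng = random.Random(seed)
    numbers = rng.sample(range(1, 7), k=6)  # one number per week, no repeats
    return [PRINT_WEEK[n] for n in numbers]

schedule = draw_constructed_week(seed=1)
for week, day in enumerate(schedule, start=1):
    print(f"Week {week}: capture on {day}")
```

Fixing the seed makes the schedule reproducible, which may be useful if the collection plan needs to be documented before the capture begins.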

For each capture, five stories were collected from each of the following sections for each website each day, providing a 690 story sample:

1. News (or Front Page)
2. Commentary (not applicable for CBC News)
3. Politics
4. Business
5. Health & Lifestyle
6. Sports
7. Arts & Entertainment
8. Technology & Science

N = days × sections × stories × sources = (6 × 8 × 5 × 2) + (6 × 7 × 5 × 1) = 480 + 210 = 690
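As a quick check of the arithmetic above, the expected sample size can be recomputed per source. The sketch below uses only the figures given in the text (six capture days, five stories per section, eight sections for the two newspapers, and seven for CBC News). The variable names are illustrative.

```python
DAYS = 6                  # capture days in the constructed week
STORIES_PER_SECTION = 5   # above-the-fold stories taken from each section

# Sections sampled per source; CBC News has no Commentary section.
SECTIONS = {"Globe and Mail": 8, "National Post": 8, "CBC News": 7}

expected = {
    source: DAYS * n_sections * STORIES_PER_SECTION
    for source, n_sections in SECTIONS.items()
}
total = sum(expected.values())

for source, n in expected.items():
    print(f"{source}: {n}")
print(f"Total: {total}")
```

The per-source figures give a baseline against which the number of stories actually collected from each site can be compared.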

A screenshot of each section was taken before the sample from that section was collected. These screenshots acted as a safeguard. If the page changed or was updated during the collection, the screenshot acted as a guide for which stories to include.

The five stories coded for each section represented one of the five above-the-fold spaces on the web pages (see coding attribute 8 in the attached coding protocol for a breakdown of the page). The ranking of space on the news sites is based on the research of Cooke (2005); Chu, Paul, and Ruel (2009); and Eveland, Marton, and Seo (2004).

Units of Analysis

The sample unit for this study is the individual news story. Once collected, these articles were coded for 13 recording units. During the collection, articles were assigned a unique identifying alphanumeric code and coded for eight structural attributes, including title, section, placement on page, and author (see Appendix B for the coding protocol). The remaining four attributes, which were temporal rather than structural, were coded during the observation stage.

Stage Two: Observation and Coding

During the second stage, the final four recording units for each story were documented. These units detailed the rates of archiving and deletion for each article. The collected stories were coded for remaining on or disappearing from their original web pages, as well as for whether or not they were archived externally by Canadian Newsstream, Factiva, and LexisNexis.

To ensure an accurate report of the Globe and Mail and National Post archives, subscriptions to both services were purchased to allow full access to the papers' digital archives. Based on the published policies at the time of collection (The Globe and Mail, 2012; National Post, n.d.a), it was anticipated that the majority of news stories from both national newspaper websites would be removed by the twenty-fourth week of observation. If they remained on the website, observation would continue up until the five year mark.

Stage Three: Assessment

Once the stories’ temporal attributes were recorded, these trends were cross tabulated with key structural attributes. Of specific interest were inter-archive comparisons between the news organizations and intra-secondary archiving comparisons to assess differences in the rates at which each of the CBC, Globe and Mail, and National Post was archived by the secondary services. Of the thirteen recording units documented for each story, eight can be cross tabulated to assess potential correlations. The hypotheses to be tested in these cross tabulations are as follows: H0: the two variables are independent (no relation between article source and the likelihood of its erasure); H1: the two variables are dependent (article source affects the likelihood of its erasure). These hypotheses will be tested with the formula $\chi^2 = \sum (O - E)^2 / E$, where O is the observed count and E is the expected count for each cell.
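For reference, the test statistic above can be written out in full for a cross tabulation with r rows and c columns. This is the standard Pearson chi-square test of independence. The symbols R_i (row total), C_j (column total), N (grand total), r, and c are introduced here for illustration and do not appear in the original protocol.

```latex
E_{ij} = \frac{R_i \, C_j}{N},
\qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
df = (r - 1)(c - 1)
```

Under H0, each expected count depends only on the row and column totals, so large gaps between observed and expected counts count as evidence against independence.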

Chapter 3 Results and Discussion

The constructed week sample comprised six days of articles captured from the CBC News (CBC), Globe and Mail (Globe), and National Post (Post) websites. While originally anticipated to include 690 units, the final sample included 688 units. This variation resulted from one day of coverage in the Post which devoted the entire front page of the Politics section to one story (see Appendix: Image 11). Of the 688 total, 210 stories were from CBC, 240 were from the Globe, and 238 were from the Post. After the original 24-week observation period, the vast majority of the articles remained available so, as per the methodology, the observation period was extended to the full five-year period. At the end of this extended observation period, 584 of the original 688 articles were still available on the CBC, Globe, and Post websites. An additional 24 articles were still available, but had undergone substantial edits to their text (see images 12-13 for an example). Twenty-five of the stories were duplicates of other stories in the sample, demonstrating the reliance of online news channels on repeated content to fill space. Finally, 55 of the original 688 stories were permanently deleted from the news sites.

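Table 1. Archive rates by date

| Date | Archived | Changed | Missing | Duplicate | Row Totals |
| --- | --- | --- | --- | --- | --- |
| June 6 | 100 | 0 | 11 | 4 | 115 |
| June 17 | 103 | 3 | 7 | 2 | 115 |
| June 22 | 98 | 3 | 7 | 5 | 113 |
| June 28 | 93 | 8 | 9 | 5 | 115 |
| July 3 | 91 | 8 | 10 | 6 | 115 |
| July 9 | 99 | 2 | 11 | 3 | 115 |
| Column Totals | 584 | 24 | 55 | 25 | 688 |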

Of the three news sites, the Post had the best rate of retention for its articles with only eleven missing and none changed.

Each of the archived stories from the Globe and Post was then searched for in the Canadian Newsstream, Factiva, and LexisNexis news archives. CBC stories are only archived in LexisNexis, so searches for these articles were limited to this archive. While the archiving rates for collected stories on the original websites were fairly consistent, gaps in the news archiving process became evident during the search of these secondary archives.

Searches of the secondary archives were conducted by date, searching the day of news collection plus two days on either side. For example, when searching for stories collected on June 22, the secondary archives were searched for articles published between June 20 and June 24. Articles in the sample that were originally published outside of the study’s range were searched individually by title and author. Any stories that were not included in the original date-based search were also searched for individually by title, author, and keywords.
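The date window used for these searches can be expressed as a small helper. The sketch below is illustrative only: the function names are hypothetical, and the year 2012 is an assumption inferred from the dates of the sources cited for the collection period rather than a value stated alongside the capture dates.

```python
from datetime import date, timedelta

def search_window(collection_day, margin_days=2):
    """Return the first and last publication dates to search for a capture."""
    margin = timedelta(days=margin_days)
    return collection_day - margin, collection_day + margin

def in_window(published, collection_day, margin_days=2):
    """True if an article's publication date falls inside the search window."""
    start, end = search_window(collection_day, margin_days)
    return start <= published <= end

# Assumption: the captures took place in 2012.
start, end = search_window(date(2012, 6, 22))
print(start.isoformat(), "to", end.isoformat())
```

Articles published outside this window would fall through to the individual title, author, and keyword searches described above.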

Table 2: Archiving rates for secondary archives (Original numbers for news sources in parentheses)

| Archive | News Source | Articles Archived | Articles Changed | Articles Missing |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | Globe and Mail | 86 (185) | 18 (21) | 128 (26) |
| Canadian Newsstream | National Post | 76 (224) | 8 (0) | 144 (3) |
| Factiva | Globe and Mail | 132 (185) | 8 (21) | 92 (26) |
| Factiva | National Post | 81 (224) | 17 (0) | 130 (3) |
| LexisNexis | CBC | 79 (175) | 2 (3) | 123 (26) |
| LexisNexis | Globe and Mail | 124 (185) | 9 (21) | 99 (26) |
| LexisNexis | National Post | 15 (224) | 7 (0) | 206 (3) |

The totals for rates of archiving were analyzed using chi-square calculations to assess the variation in archiving rates between news sources and between archives. Comparisons of article preservation by news sites (NS) were conducted using a 4 column by 3 row contingency table comparing numbers of Archived (A), Changed (C), Missing (M), and Duplicate (D) articles for each of CBC, Globe and Mail (GM), and National Post (NP). Comparisons involving the secondary archives of Canadian Newsstream (CN), Factiva (FC), and LexisNexis (LN) examined differences between Archived (A), Changed (C), and Missing (M) articles with Duplicates (D) excluded from the analysis. Where outliers presented, secondary calculations were conducted with these groups of values removed to assess the significance of variance between the remaining groups. Specific details of these calculations are included in Appendix D.
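To illustrate the kind of calculation described above, the following sketch computes a Pearson chi-square statistic for one comparison: the Archived, Changed, and Missing counts for Globe and Mail and National Post articles in Factiva, taken from Table 2. It is a minimal re-implementation for illustration and not the procedure documented in Appendix D. The function name is hypothetical, and the closed-form p-value used here is valid only for two degrees of freedom.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df

# Factiva counts from Table 2: Archived, Changed, Missing.
factiva = [
    [132, 8, 92],    # Globe and Mail
    [81, 17, 130],   # National Post
]

stat, df = chi_square(factiva)

# With exactly 2 degrees of freedom, the upper-tail probability of the
# chi-square distribution has the closed form exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.6f}")
```

With these inputs the statistic should come out near 21.9 on two degrees of freedom, with a p-value close to the .000017 reported for this comparison in Table 3. For tables with other dimensions, a statistical library would be needed to obtain the p-value.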

Figure 1: Archiving totals by site with news sites aggregated for comparison

Table 3: Summary of findings

| Table(s) | Comparison | P Value | Conclusion |
| --- | --- | --- | --- |
| 1.1, 1.2 | Difference in ACMD rates | <0.0001 | Significant differences between ACMD rates for CBC, GM, and NP samples. |
| 2.1, 2.2 | Between archives and news sites: ACM comparison | <0.0001 | Significant differences between ACM rates for News Sites (total), CN, FC, and LN. With News Sites (total) removed, significant differences remained between ACM rates for CN, FC, and LN. |
| 2.3 | Between CN and FC: ACM comparison | .0025 | Significant differences between ACM rates for CN and FC. |
| 3 | Intra-archive: CN ACM comparison for GM and NP articles | .0682 | No significant difference in ACM rates for GM and NP articles archived by CN. |
| 4 | Intra-archive: FC ACM comparison for GM and NP articles | .000017 | Significant difference in ACM rates for GM and NP articles archived by FC. |
| 5.1 | Intra-archive: LN ACM comparison for CBC, GM, and NP articles | <0.00001 | Significant difference in ACM rates for CBC, GM, and NP articles archived by LN. |
| 5.2 | Intra-archive: LN ACM comparison for CBC and GM articles | 0.0005 | Significant difference in ACM rates for CBC and GM articles archived by LN. |
| 6 | Interarchive: CBC articles ACM comparison between archives (CBC-NS and LN) | <0.00001 | Significant difference in ACM rates for CBC articles in CBC-NS and LN archives. |
| 7.1 | Interarchive: GM articles ACM comparison between archives (GM-NS, CN, FC, and LN) | <0.00001 | Significant difference in ACM rates for GM articles in GM-NS, CN, FC, and LN archives. |
| 7.2 | Interarchive: GM articles ACM comparison between archives (CN, FC, and LN) | <0.00001 | Significant difference in ACM rates for GM articles in CN, FC, and LN archives. |
| 7.3 | Interarchive: GM articles ACM comparison between archives (FC and LN) | 0.7538 | No significant difference in ACM rates for GM articles in FC and LN archives. |
| 8.1 | Interarchive: NP articles ACM comparison between archives (NP-NS, CN, FC, and LN) | <0.00001 | Significant difference in ACM rates for NP articles in NP-NS, CN, FC, and LN archives. |
| 8.2 | Interarchive: NP articles ACM comparison between archives (CN, FC, and LN) | <0.00001 | Significant difference in ACM rates for NP articles in CN, FC, and LN archives. |
| 8.3 | Interarchive: NP articles ACM comparison between archives (CN and FC) | 0.1278 | No significant difference in ACM rates for NP articles in CN and FC archives. |
| 9 | AM comparison by date | 0.0571 | No significant difference in total AM rates by date. |

Discussion

Despite Jeff Rothenberg’s observation about the impermanence of digital content (‘Forever or Five years’), and the unexplained disappearance of the Shannon articles, the stories collected for this study had a high rate of retention on their original news sites. The CBC, Globe and Mail, and National Post news sites maintained the majority of the stories collected for this study. Overall, the CBC preserved 83% of the articles in this sample, the Globe kept 77%, and the Post had the highest article retention with 94% of the articles available five years after they were recorded for this study. However, my research also revealed several major pitfalls facing digital news archiving. During the coding and analysis process, several factors impacting the ACM rates for articles in the samples became apparent. They include: the National Post versus the Financial Post; Globe and Mail Breaking News; the use of video content on news websites; and the inclusion of wire stories.
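The retention percentages quoted above can be reproduced from the sample sizes reported for each site and the original archived counts shown in parentheses in Table 2. The short sketch below is an illustrative recalculation only, with hypothetical variable names.

```python
# Stories collected per site (final sample) and stories still archived on
# the original site (parenthetical "original numbers" in Table 2).
collected = {"CBC": 210, "Globe and Mail": 240, "National Post": 238}
archived = {"CBC": 175, "Globe and Mail": 185, "National Post": 224}

retention = {
    site: round(100 * archived[site] / collected[site])
    for site in collected
}

for site, pct in retention.items():
    print(f"{site}: {pct}% retained")
```

Rounded to the nearest whole number, these ratios correspond to the 83%, 77%, and 94% figures given for the CBC, the Globe, and the Post.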

National Post versus Financial Post

While LexisNexis claims to have access to National Post articles, the only articles from this study that were included in their archive were stories from the Post’s business section, the Financial Post. The A and M values for National Post stories in LexisNexis are outliers as a result of this limited access.

Globe and Mail Breaking News

Stories generated for, and captured from, the Globe website are not necessarily replicated in the print version of the newspaper. While this variance between online and print news is not unique, it impacted the capacity of secondary archives to record Globe articles. The only articles available under the Globe and Mail in Canadian Newsstream, Factiva, and LexisNexis are those that have a print equivalent. There is a second collection of born digital Globe articles that are archived under the title Globe and Mail Breaking News (GMBN). Factiva and LexisNexis have access to GMBN which provided archived versions of many of the articles in this study. However, Canadian Newsstream does not have access to GMBN and, as such, had a very different Archived (A) to Missing (M) ratio from the other two archives.

Bob Nicholson cautions that “the creation of a digital newspaper does not simply produce what archivists term a ‘surrogate’, or a stand-in, for the original… Though the digital text may look familiar, it is not the same source” (in Maurantonio, 2014, p.89). The absence of digital-only Globe and Mail stories from secondary archives exemplified Nicholson’s point. While technically the same source, the digital-only articles from this source are excluded from the Globe and Mail title in Canadian Newsstream, Factiva, and LexisNexis. While Globe and Mail Breaking News captures many of these digital stories, access to this title is not included with access to the Globe’s title, resulting in a sizable absence of Globe articles from Canadian Newsstream.

It is also important to consider the significantly higher proportion of stories in the Globe and Mail sample that were changed after they were originally posted. The images below capture an article which detailed a police raid on Nicolas Sarkozy’s offices. Following its original publication, this article was completely replaced. The link that originally linked to an Agence France-Presse article now links to a different article, written by two Globe writers in a very different tone from the original piece. The lead paragraph originally read “French police searched Nicolas Sarkozy’s offices and home Tuesday as part of their probe into claims the former president was involved in illegal political campaign financing, his lawyer said” (Agence France-Presse, 2012, para 1).

The replacement took a much lighter tone with Sarkozy and the investigation into his activities:

“Former French president Nicolas Sarkozy and his wife, Carla Bruni-Sarkozy, enjoyed a taste of Canadian cottage life in the Laurentians last week, a respite from public attention that has been renewed by a French police raid on Mr. Sarkozy's offices and Ms. Bruni-Sarkozy's mansion as part of a probe into alleged secret campaign financing.” (Ha & Perreaux, 2012, para. 1)

In fact, the entire replacement article softened the allegations against Sarkozy, burying them in a story that was as interested in his vacationing history as it was in the reasons for the raids (see Images 12-13 in the Appendices for an additional example of substantial edits that changed the tone of the published article).

As a result, the stories archived under the Globe and Mail title in Factiva and LexisNexis are markedly different from the stories archived under the Globe and Mail Breaking News title. But more importantly, the articles that once existed on the Globe and Mail’s news site have been replaced by new articles occupying the same URLs with no acknowledgement that markedly different content was originally housed at that address. In the absence of Globe and Mail Breaking News, there would be no available record of the original article save for the screenshot taken for this project.

The absence of these digital-first articles from secondary archives signals a growing problem facing digital news archiving. As the drive to create an increasing amount of digital content to engage online users grows, the amount of online news that does not have a print corollary will likely grow. Already, this study has revealed a difference between what the Globe and Mail publishes online and what it prints. The online, born digital content is not archived in the same way that the print news is. This is particularly troubling because “unlike content digitized from analog media, born digital often has no physical surrogate to serve as an effective fallback” (McCain, 2015, p. 338). The Globe and Mail Breaking News issue highlights a problem that hybrid news producers are having as they come online: the ways that these organizations are distributing content are changing but their archiving practices have not adapted to capture this new content. This presents a particularly pressing issue for archiving Globe and Mail content because, in a bid to attract more online subscribers, the news organization has launched an entire section of content housed online behind a paywall, Globe Unlimited (Stackhouse, 2012). Given the pattern of exclusion identified by this project, it is plausible that this extra digital content is vulnerable to being lost because it does not have a print equivalent and is, therefore, excluded from the Globe and Mail archive accessed by secondary archives.

The challenges facing the Globe’s digital content exemplify the vulnerability of digital content identified by the Reynolds Journalism Institute / Journalism Digital News Archive (JDNA). In 2014, the JDNA surveyed 476 born digital and hybrid (traditional news with online presence) news organizations about their management of born digital news content. The survey revealed that “the vast majority of both types of organizations did not have written policies for archiving and preserving those resources” (Hansen & Paul, 2015, p.292). Unsurprisingly then, of those surveyed, “27 percent of hybrid news producers reported significant content loss” (McCain, 2015, p.338).

Wire Stories

An unexpected influence on archiving rates was the reliance on wire stories by all three news sites. Wire stories refer to articles posted on the CBC, Globe, and Post websites that were written by a third party service such as the Associated Press or Reuters. The reliance on third party content was substantial, representing between 15% and 42.5% of stories collected each day. Within this study’s sample, the Globe had the lowest mean percentage of articles from a third party service, 18.9%, while the Post had the highest, 28%.

Figure 2: Percentage of news stories from wire

The third parties whose content was sourced also varied by news site. The CBC’s sourced articles came almost exclusively from the Canadian Press and the Associated Press. The Globe fairly evenly sourced materials from the three major outlets as well as posting content from third parties including Agence France-Presse, MidnightTrader.com, and The Street. The majority of the Post’s third party articles came from sources other than the Canadian Press, Associated Press, or Reuters. Their third party sources included Agence France-Presse, Postmedia News, Bloomberg News, and Business Insider.

Figure 3: Number of wire stories by source

The reliance on third party articles is considered a threat to media democracy because it results in a singular account being reprinted across myriad news channels. For this study, however, third party articles presented an entirely new challenge. Because content that originates with a third party site is not owned by the news agency that publishes it, it is not archived alongside original news content when said content is captured by archiving services like Factiva and LexisNexis. While this rented content could be microfiched if it appeared in print, the digital equivalent is not preserved in the same way. As such, the presence of third party content impacted archiving rates by Factiva, Canadian Newsstream, and LexisNexis for all three news sites. The use of wire stories directly impacts the ability of secondary archives to provide holistic captures of news content.

While the majority of wire stories remain available on the news sites, they were not archived by Canadian Newsstream, Factiva, and LexisNexis. The news sites’ reliance on third party content is responsible for much of the variance between their M values and those of the secondary archives. This lack of access by secondary archives to sourced content results in an incomplete snapshot of the news published on any given day.

Video files

During the sample window for this study, both the CBC and Globe were integrating video stories into their websites. Among the 210 CBC stories collected, seven were video-based or included video in the article. Of the 240 Globe stories, 15 were video-based or included video in the article. Stories that were entirely video-based were archived on neither the news site nor the secondary websites. LexisNexis archived the text portion (or a rewritten version of the text) from the three CBC stories that included both video and text.

The subsection of the study’s sample that comprises video-based articles offers the first glimpse into the crisis in digital news archiving identified by Boss and Broussard (2017). The presence of video-based news content in this study’s sample speaks to the growing importance of video as a digital news medium. While video content accounted for 3% of this study’s sample, the use of this medium in digital news content has only grown since 2012 (Kalogeropoulos, Cherubini, & Newman, 2016). When they announced their web and print redesign in 2010, the Globe and Mail named increased amounts of video content as a key component of their new online presence (Stackhouse, 2010).

Although the use of video by news agencies is increasing, it isn’t driven by on-site audience demand. The Reuters Institute for the Study of Journalism examined the growth of video use by news outlets and found that, while 6.5% of news pages contained video, “on average, video accounted for 2.5% of total time spent [by users] on these websites” (Kalogeropoulos, Cherubini, & Newman, 2016, p.13).

The increased interest in producing video-based news content is likely less driven by on-site user behaviour and more related to the developing trend of online audiences utilizing social media, particularly Facebook, as a news aggregator. Recent data from the Pew Research Centre (2017) suggests that 35% of online news consumers get their news from social media, while 36% use news websites to access articles. Social posts that feature video outperform all other types of shared content for the number of impressions and engagements that they receive (Ross, 2015). The significant portion of online news audiences who use social platforms for their news content combined with the traction of video posts on social networks helps explain why news agencies are increasing their investment in this medium (Kalogeropoulos, Cherubini, & Newman, 2016). This shift may be, in part, responsible for the increase of video-based news content as news media companies are incentivized to generate content that translates well to off-site posting and sharing (Kalogeropoulos, Cherubini, & Newman, 2016).

While this storytelling medium is helping news agencies engage off-site audiences, the introduction of video-based articles creates new challenges for archiving. While the sample of video-based stories included in this study was small, the inability of the news sites or secondary archives to archive any of these stories suggests a troubling trend that replicates Hansen and Paul’s (2015) findings: “Archiving of multimedia elements for [news] websites is spotty or nonexistent. Even if the ‘frame’ is there, most of the functionality does not work any more.” (p.296). While links remained live on both the CBC and Globe websites, the video content that was once housed on these pages was no longer accessible and had no secondary archive equivalent.

It is important to note that, while the risk of lost articles is potentially growing for news agencies that are seeking to gain a foothold in off-site posting, the risks are much higher for the born digital news organizations that have proliferated since this study began. Born digital news organizations produce the types of news media that could not have existed in print format including data-driven knowledge translation, smartphone applications, blogs, videos, livestreams, and interactive components (Boss & Broussard, 2017). “The Pew Research Centre estimates that there were 438 small digital news organizations in the US in 2013, most of which are digital-first startups” (Boss & Broussard, 2017, p.2). While these new organizations are at the forefront of innovating new ways to communicate information, they haven’t tended to focus on innovative ways to archive these new types of content (Boss & Broussard, 2017; Moore & Bonnet, 2015). As such, the National Digital Stewardship Alliance has labeled born digital news as “at-risk” of significant content loss (Moore & Bonnet, 2015).

Chapter 4 Conclusion

This project began as an attempt to explain what happened to the news stories about the Shannon, Quebec trial. While it remains unclear why that collection of articles disappeared from the Postmedia archives, what this study has detailed are the many gaps that currently exist within digital news media archives. The news sources in this study retained between 77% and 94% of the stories sampled. There was further loss, however, when the articles were captured by the secondary archives Canadian Newsstream, Factiva, and LexisNexis. Part of this loss was related to how these secondary archives access digital content: LexisNexis was unable to capture the majority of National Post articles as their access was seemingly limited to the Financial Post. Globe and Mail articles that only appeared in digital form were not recorded by the archives which did not have access to Globe and Mail Breaking News. Finally, wire stories that were captured in the news samples were not recorded by the secondary archives, likely due to copyright restrictions.

Over the course of this study, the Canadian news media have changed markedly, with closures and mergers reshaping (and largely reducing) the landscape. The challenge of increasing profitability in a digital environment has many news organizations looking to minimize operational costs and create new revenue streams. This focus will likely lead to an increased reliance on wire stories and possibly the repackaging of archiving rights, both of which this study demonstrated limit the ability of researchers to reliably access archived digital content. Combined with the emergence of digital-first content that has no print equivalent, the digital news space is volatile and currently not designed to prioritize or guarantee access to archived content.

Further work

While this project was able to track and identify problems in the archiving of digital news content, it was unable to track specific controversial articles. The driving forces behind content loss in this project were third-party content and the inclusion of rich media. As such, it does not help explain what happened to the Shannon, Quebec articles. Throughout the constructed week sample, there were no Shannon trial equivalents being covered in the mainstream Canadian news media. The limitations of this study should not discount the overall importance of answering this question. It is my belief that the methodology of this project could be adapted for the collection and monitoring of issue-specific coverage in the future.

Further work is also required to track how neoliberal sensibilities and a general disinterest in archiving content among news organizations are shaping long-term access to digital content. Beyond the threats identified by this study, Hansen and Paul (2015) identify additional challenges to archiving digital news content. “The legacy publications pick and choose which types of content are delivered through mobile devices, and not one of them is generating an archive of that set of decisions and content choices” (Hansen & Paul, 2015, p.296). Content shared through social channels and mobile platforms is at high risk of being lost as it is not being actively archived by news agencies (Boss & Broussard, 2017; Hansen & Paul, 2015; Moore & Bonnet, 2015).

Questions of access

While the question that ignited this project focused on the loss of articles about the Shannon, Quebec trial, both this research and that original question are, at their root, questions about access. As the news media transition further into the digital realm, the ability of individuals to access content becomes less certain. This potentially impacts community memory and the ability of individuals to access their history through media, and it reduces the capacity of researchers to conduct news media-based historical analyses.

As the work of Maurantonio and of Hansen and Paul detailed, uncertain or decreasing access to news archives has significant ramifications for researchers and communities. Threats to future access to Canadian digital news media are threats to myriad forms of research that rely on news articles for historical information. What this thesis revealed is that there are many gaps in the systems that archive online news content. These gaps do not impact the access of all audiences equally. An academic researcher, for example, has a greater ability to piece together the fragments of archived news by searching for content across multiple databases. A community organizer, however, does not have the same access to secondary archives’ databases and is, therefore, reliant on the incomplete newspaper archives.

The concept of standing on the shoulders of giants is well established in academia: we look to the work of our colleagues and our mentors to shape our own. News archives perform a similar function for communities beyond academia: they share information between communities that are otherwise separated spatially and/or temporally. Accessing information about the actions taken by communities located downstream from industry, for example, could provide other communities an understanding of how to begin the process of advocating for themselves.

This community benefit, however, is not likely to be reflected in the priorities of Canada’s news organizations, particularly in an era of cost cutting. As news content is adapted to rich formats for news platforms, archiving practices must adapt to collect these often ephemeral forms. With one of Canada’s national news agencies controlled by an American hedge fund and cuts to the national broadcaster’s federal funding, it is clear that there is little will to undertake this work at the news organization level. If we are to safeguard access to digital news archives, academics and librarians who understand its importance will need to invest in this undertaking.


Reference List

AdEspresso (2016). The complete resource to understanding facebook ad cost – 2016 Q3 results. Retrieved from https://adespresso.com/academy/blog/facebook-ads-cost/

Agence France-Presse (2012, July 3). Police raid Sarkozy’s office, home as former French president visits Canada. The Globe and Mail.

Akhavan-Majid, R., Rife, A. & Gopinath, S. (1991). Chain ownership and editorial independence: A case study of Gannett Newspapers. Journalism Quarterly, 68(1/2), 59-66.

Allan, S. (2006). Online news. New York, NY: Open University Press.

Baluja, T. (2014a, January 13). Updated: Layoffs announced at Postmedia and The Globe and Mail. J Source. Retrieved from http://j-source.ca/article/updated-layoffs-announced-postmedia-and-globe-and-mail

Baluja, T. (2014b, February 4). Updated: Postmedia eliminates parliamentary bureau. J Source. Retrieved from http://www.j-source.ca/article/updated-postmedia-eliminates-parliamentary-bureau

Beatty, P. (1996). The new, slimmer all-Canadian CBC (speech). Canadian Speeches, Issues of the Day, 10, 28-32. Retrieved from http://search.proquest.com.proxy.lib.sfu.ca/docview/222266803?accountid=13800

Boss, K. & Broussard, M. (2017). Challenges of archiving and preserving born-digital news applications. International Federation of Library Associations and Institutions, 1-8. doi:10.1177/0340035216686355. Retrieved from http://journals.sagepub.com/doi/abs/10.1177/0340035216686355

Bradshaw, J. (2015a, June 16). Postmedia closes news wire service QMI Agency. The Globe and Mail. Retrieved from https://beta.theglobeandmail.com/report-on-business/postmedia-closes-news-wire-service-qmi-agency/article24987192/?ref=http://www.theglobeandmail.com&

Bradshaw, J. (2015b, August 14). Woodbridge acquires full ownership of The Globe and Mail in deal with BCE. The Globe and Mail. Retrieved from http://www.theglobeandmail.com/report-on-business/woodbridge-buys-bces-15-stake-in-the-globe-and-mail/article25973828/

Bradshaw, J. (2016, January 19). Postmedia merges newsrooms, cuts 90 jobs in response to financial woes. The Globe and Mail. Retrieved from http://www.theglobeandmail.com/report-on-business/postmedia-cuts-90-jobs-merges-newsrooms-in-four-cities/article28257456/

Bradshaw, J. & Kildaze, T. (2016, March 14). U.S. hedge fund GoldenTree seeks buyer for stake in Postmedia. The Globe and Mail. Retrieved from http://www.theglobeandmail.com/report-on-business/goldentree-in-talks-to-sell-stake-in-postmedia-report/article29219247/

Britneff, B. (2016, December 8). Parliamentary press gallery now the smallest it's been in 22 years. iPolitics. Retrieved from http://ipolitics.ca/2016/12/08/parliamentary-press-gallery-now-the-smallest-its-been-in-22-years/

Brügger, N. (2005). Archiving websites. General considerations and strategies. S. Cozart & P. Lunddahl (Trans.). Retrieved from the Centre for Internet Research: http://cfi.au.dk/fileadmin/www.cfi.au.dk/publikationer/cfis_b__ger/nb_archiving.pdf

The Canadian Media Concentration Research Project (2017). The growth of the network media economy in Canada, 1984-2016. Retrieved from http://www.cmcrp.org/the-growth-of-the-network-media-economy-in-canada-1984-2016/

The Canadian Media Guild (2013). Job cuts in the broadcast industry in Canada, Nov. 2008 – Aug. 2013. Retrieved from http://www.cmg.ca/en/wp-content/uploads/2013/11/Preliminary-numbers-Broadcast-Job-cuts-between-2008-2013-CMG.pdf

Canadian Press (2016, September 7). Postmedia is now 98 per cent owned by debt holders. Huffington Post. Retrieved from http://www.huffingtonpost.ca/2016/09/07/postmedia-ownership-debt-holders_n_11893844.html

Carner, D., McCain, E., & Zarndt, F. (2014). Missing links: The digital news preservation discontinuity. Retrieved from the International Federation of Library Associations and Institutions (IFLA) website: https://www.ifla.org/files/assets/newspapers/Geneva_2014/s6-carner-en.pdf

CBC News (2005, October 27). Lockout 'last resort' to reach deal: CBC president. Retrieved from http://www.cbc.ca/news/canada/lockout-last-resort-to-reach-deal-cbc-president-1.520365

CBC News (2007, March 28). Quebec town can sue DND over tainted water: court. Retrieved from the CBC News website: http://www.cbc.ca/news/canada/montreal/story/2007/03/28/qc-shannonwater20070328.html

CBC News (2012, March 29). CBC budget cut by $115m over 3 years. Retrieved from http://www.cbc.ca/news/politics/cbc-budget-cut-by-115m-over-3-years-1.1147096

CBC News (2017, April 18). Transcontinental selling 93 newspapers in Ontario and Quebec. Retrieved from http://www.cbc.ca/news/business/transcontinental-newspapers-sale-1.4073735

CBC.ca (n.d.). Policies. Retrieved from the CBC.ca website: http://www.cbc.ca/aboutcbc/discover/policies.html

Charles Veilleux and Associés (n.d.). The Shannon class action lawsuit. Retrieved from the Charles Veilleux and Associés website: http://www.cva-juris.com/en/shannon/

Chu, S., Paul, N., & Ruel, L. (2009). Using eye tracking technology to examine the effectiveness of design elements on news websites. Information Design Journal, 17(1), 31-43. doi:10.1075/idj.17.1.04chu

Cooke, L. (2005). A visual convergence of print, television, and the internet: Charting 40 years of design change in news presentation. New Media & Society, 7(1), 22-46.

Craig, S. (2016a, September 9). Globe aims to cut 40 jobs as revenue sinks; Layoffs might ensue if not enough go freely. National Post, pp. FP2. Retrieved from http://proxy.lib.sfu.ca/login?url=http://search.proquest.com.proxy.lib.sfu.ca/docview/1818045527?accountid=13800

Craig, S. (2016b, October 21). Postmedia plans to cut staffing costs; Print ad revenue falls, quarterly loss deepens. National Post, pp. FP2. Retrieved from http://www.pressreader.com/canada/national-post-latest-edition/20161021/282286729800414

Davis, A. (2000). Public relations, business news and the reproduction of corporate elite power. Journalism, 1(3), 282-304.

Davis, A. (2000). Public relations, news production and changing patterns of source access in the British national media. Media, Culture & Society, 22(1), 39-59.

De Waal, E. & Schoenbach, K. (2010). News sites’ position in the mediascape: uses, evaluations and media displacement effects over time. New Media & Society, 12(3), 477-496.


Deacon, D. (2007). Yesterday’s papers and today’s technology. European Journal of Communication, 22(1), 5-25.

Dornan, C. (2003). Printed matter: Canadian newspapers. In D. Taras, F. Pannekoek, & M. Bakardjieva (Eds.), How Canadians communicate (pp. 97-120). Calgary, AB: University of Calgary Press.

Edge, M. (2013). Public benefits or private? The case of the Canadian Media Research Consortium. Canadian Journal of Communication, 38(1), 5-34.

Evans, P. (2014, October 6). Quebecor sells 175 Sun Media newspapers and websites to Postmedia. CBC News. Retrieved from http://www.cbc.ca/news/business/quebecor-sells-175-sun-media-newspapers-and-websites-to-postmedia-1.2788693

Eveland, W.P., Marton, K., & Seo, M. (2004). Moving beyond “just the facts”: The influence of online news on the content and structure of public affairs knowledge. Communication Research, 31(1), 82-108. doi:10.1177/0093650203260203

Francoli, P. (2003, December 8). CBC facing its worst funding crisis: Rabinovitch: 'you are not going to find much fat in the organisation,' says CBC president. The Hill Times, 0-17. Retrieved from http://search.proquest.com.proxy.lib.sfu.ca/docview/208542140?accountid=13800

Friends of Canadian Broadcasting (2014, April 10). Change in parliamentary appropriation to the CBC (in 2014 $). Retrieved from http://www.friends.ca/files/PDF/cbcgrant-2014update.pdf

The Globe and Mail (n.d.). How many articles will I be able to read before reaching the monthly limit? Retrieved from http://feedback.theglobeandmail.com/knowledgebase/articles/122248-how-many-articles-will-i-be-able-to-read-before-re

The Globe and Mail (2012). Help: Searching and submitting articles. Retrieved from the Wayback Machine: http://web.archive.org/web/20120323202937/http://www.theglobeandmail.com/help/

The Globe and Mail (2017a). Globe digital. Retrieved from http://globelink.ca/wp-content/uploads/2016/01/GlobeDigital-MediaKit-2017.pdf


The Globe and Mail (2017b). The Globe and Mail: National Media Kit 2017. Retrieved from http://globelink.ca/wp-content/uploads/2016/01/Globe-Newspaper-MediaKit-2017-Q2-a.pdf

Goho, A. (2004). News that's fit to print – and preserve. Science News, 165, 24-25. Retrieved from http://proxy.lib.sfu.ca/login?url=http://search.proquest.com.proxy.lib.sfu.ca/docview/197507377?accountid=13800

Gottfried, J. & Shearer, E. (2016). News use across social media platforms 2016. Pew Research Center. Retrieved from http://www.journalism.org/2016/05/26/news-use-across-social-media-platforms-2016/

Gunster, S. (2010). Media and climate justice – Spring 2010 final coding protocol – Version V. Retrieved from https://webdav.sfu.ca/web/cmns/courses/2012/801/1-Readings/Gunster%20Coding%20Scheme/Climate%20Justice%20Coding%20Protocol%205%20%28Mar%209,1pm%29.pdf

Gutstein, D. (2014). Follow the money, part 4 – who owns the National Post? Rabble. Retrieved from http://rabble.ca/blogs/bloggers/donald-gutstein/2014/04/follow-money-part-4-who-owns-national-post

Ha, T.T. & Perreaux, L. (2012, July 3). Sarkozy holes up in Canadian cottage country as l'affaire Bettencourt rages. The Globe and Mail. Retrieved from https://beta.theglobeandmail.com/news/world/sarkozy-holes-up-in-canadian-cottage-country-as-laffaire-bettencourt-rages/article4386007/?ref=http://www.theglobeandmail.com&

Hansen, K.A. & Paul, N. (2015). Newspaper archives reveal major gaps in digital age. Newspaper Research Journal, 36(3), 290-298.

Harvey, D. (2010). The enigma of capital. New York, NY: Oxford University Press.

Hasselback, D. (2017, June 22). Postmedia agrees to sell its Infomart business to Meltwater for $38 million. Financial Post. Retrieved from http://business.financialpost.com/news/postmedia-agrees-to-sell-its-infomart-business-to-meltwater-for-38-million

Hester, B. & Dougall, E. (2007). The efficiency of constructed week sampling for content analysis of online news. J&MC Quarterly, 84(4), 811-824.

Hicks, R.G. & Featherston, J.S. (1978). Duplication of newspaper content in contrasting ownership situations. Journalism & Mass Communication Quarterly, 55(3), 549-553.

Houpt, S. (2017, July 21). Guelph's post-Mercury blues: How an Ontario city is coping without its local newspaper. Globe and Mail. Retrieved from https://beta.theglobeandmail.com/news/national/guelph-mercury-ontario/article35731429/?ref=http://www.theglobeandmail.com&

Infomart (2012). Help – news sources. Retrieved from the Infomart website: http://www.fpinfomart.ca/help/help_dbs.php

Infomart (n.d.). Financial Post Mergers & Acquisitions in Canada: The Globe and Mail Inc. Retrieved from http://fpadvisor.infomart.com.proxy.lib.sfu.ca/doc/doc_display.php?key=fp|fpma|29502

Ives, M. (2010). Mounting web woes pummel newspapers. Advertising Age, 81(26), 6.

Jackson, E. (2018, May 9). Torstar to move to a subscription model, charge readers for online news. The Financial Post. Retrieved from https://business.financialpost.com/telecom/media/do-it-on-our-own-torstar-corp-reports-loss-as-it-begins-transformation-strategy

Kalogeropoulos, A., Cherubini, F. & Newman, N. (2016). The future of online news video. Reuters Institute for the Study of Journalism. Retrieved from http://reutersinstitute.politics.ox.ac.uk/sites/default/files/The%20Future%20of%20Online%20News%20Video.pdf?utm_source=digitalnewsreport.org&utm_medium=referral

Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Thousand Oaks, CA: Sage Publications, Inc.

Ladurantaye, S. (2012, March 30). Most Canadians still prefer to get their news hit from newspapers. The Globe and Mail. Retrieved from http://www.lexisnexis.com.proxy.lib.sfu.ca/hottopics/lnacademic

Lewis, J., Williams, A. & Franklin, B. (2008). A compromised fourth estate? Journalism Studies, 9(1), 1-20.

Mahon, A. (2009). What lies beyond the gates: Media gatekeepers and the framing of welfare news in Canada. Retrieved from http://digitool.Library.McGill.CA:80/R/-?func=dbin-jump-full&object_id=86987&silo_library=GEN01

Mahon, Q., Lawlor, A. & Soroka, S. (2014). The mass media and welfare policy framing: A study in policy definition. In A. Marland, T. Giasson, & T.A. Small (Eds.), Political communication in Canada (pp. 160-175). Vancouver, BC: UBC Press.

Maurantonio, N. (2014). Archiving the visual. Media History, 20(1), 88-102.

McCain, E. (2015). Plans to save born-digital news content examined. Newspaper Research Journal, 36(3), 337-347.

McBride, S. & McNutt, K. (2007). Devolution and neoliberalism in the Canadian welfare state. Global Social Policy, 7(2), 177-201.

Moore, J.E. & Bonnet, J.L. (2015). Survey finds differences on preserving born-digital news. Newspaper Research Journal, 36(3), 348-362.

Mosco, V. (2004). The digital sublime: Myth, power, and cyberspace. Cambridge, MA: MIT Press.

National Post (n.d.a). Today's paper and archive. Retrieved from http://www.nationalpost.com/todays-paper/index.html

National Post (n.d.b). National rate card. Retrieved from http://mediakit.nationalpost.com/wp-content/rates/np-rates-national.pdf

National Post (2017). Subscription FAQ: Digital access. Retrieved from http://www.nationalpost.com/subscribe/faq/index.html#DigitalAccess4

Nationalpost.com (n.d.). Rates – Digital advertising. Retrieved from http://mediakit.nationalpost.com/wp-content/rates/np-rates-digital.pdf

Nerberg, S. (1999). Death by a thousand cuts? The Ryerson Review of Journalism. Retrieved from http://rrj.ca/death-by-a-thousand-cuts/

Neubauer, R. (2011). Neoliberalism in the information age, or vice versa? TripleC, 9(2), 195-230.

Newspapers Canada (2016). Circulation report: Daily newspapers 2015. Retrieved from the Newspapers Canada website: http://newspaperscanada.ca/wp-content/uploads/2016/06/2015-Daily-Newspaper-Circulation-Report-REPORT_FINAL.pdf

Pew Research Centre (2017). How Americans encounter, recall and act upon digital news. Retrieved from http://assets.pewresearch.org/wp-content/uploads/sites/13/2017/02/08183209/PJ_2017.02.09_Experiential_FINAL.pdf

Postmedia Network Inc. (2010). Media services: Gateway to the world. Retrieved from http://www.canada.com/postmedianews/index.html

Postmedia Network Inc. (2011a, October 25). FPinfomart takes social media research to new heights by teaming up with Social Media Group. Retrieved from the Postmedia website: http://www.postmedia.com/2011/10/25/fpinfomart-takes-social-media-research-to-new-heights-by-teaming-up-with-social-media-group/

Postmedia Network Canada Corp. (2011b, October 27). 2011 annual report. Retrieved from http://www.postmedia.com/wp-content/uploads/2011/12/Final-PDF-of-Annual-Report-for-posting.pdf

Postmedia Network Canada Corp. (2015, October 21). 2015 annual report. Retrieved from http://www.postmedia.com/wp-content/uploads/2015/12/Postmedia-Annual-Report-F2015_web-2.pdf

Postmedia Network Canada Corp. (2016, October 20). 2016 annual report. Retrieved from www.postmedia.com/wp-content/uploads/2016/11/2016-Annual-Report-FINAL.pdf

Public Policy Forum (2017). The shattered mirror. Retrieved from https://shatteredmirror.ca/wp-content/uploads/theShatteredMirror.pdf

Quebecor (2012). News media: Canada's top newspaper chain. Retrieved from the Wayback Machine website: https://web.archive.org/web/20120621081331/http://www.quebecor.com/en/content/canadas-top-newspaper-chain

Quebecor (2017). Le Journal de Montréal and le Journal de Québec: The most-read dailies in Québec. Retrieved from the Quebecor website: http://www.quebecor.com/en/content/canadas-top-newspaper-chain

Reason, M. & Garcia, B. (2007). Approaches to the newspaper archive: Content analysis and press coverage of Glasgow’s Year of Culture. Media, Culture & Society, 29(2), 304-331.

Rosenzweig, R. (2003). Scarcity or abundance? Preserving the past in a digital era. The American Historical Review, 108(3). Retrieved from https://chnm.gmu.edu/digitalhistory/links/pdf/introduction/0.6b.pdf

Ross, P. (2015, February 17). Native Facebook videos get more reach than any other type of post. Retrieved from Socialbakers: https://www.socialbakers.com/blog/2367-native-facebook-videos-get-more-reach-than-any-other-type-of-post

Savageau, F. (2012). The uncertain future of the news. In D. Taras & C. Waddell (Eds.), How Canadians communicate IV: Media and politics (pp. 29-44). Edmonton, AB: AU Press.

Séguin, R. (2009, January 31). Quebeckers launch class action over cancer cluster near military base. Globe and Mail, pp. A9. Retrieved from ProQuest Historical Newspapers: http://proxy.lib.sfu.ca/login?url=http://search.proquest.com.proxy.lib.sfu.ca/docview/1412324893?accountid=13800

Shewchuk, B. & Mietkiewicz, M. (2009). Online news fundamentals: An introduction to journalism on CBCNews.ca. Toronto, ON: CBC News.

Sotiron, M. (1997). From politics to profit: The commercialization of Canadian daily newspapers, 1890-1920. Montreal, QC: McGill-Queen’s University Press.

Spangler, T. (2017, February 2). Snapchat stacks New York Times on media pile. Variety. Retrieved from variety.com/2017/digital/news/snapchat-new-york-times-1201976408/

Stackhouse, J. (2010, September 30). A new Globe - in print and online. The Globe and Mail. Retrieved from http://www.theglobeandmail.com/community/digital-lab/a-new-globe-in-print-and-online/article1735935/

Stackhouse, J. (2012, October 15). New subscription plan combines best of print and digital journalism. Globe and Mail, pp. A2. Retrieved from ProQuest Historical Newspapers: http://proxy.lib.sfu.ca/login?url=http://search.proquest.com.proxy.lib.sfu.ca/docview/1695854790?accountid=13800

Standing Committee on Canadian Heritage. (2016, May 12). CHPC-15 Evidence: Godfrey, Paul 9:15. Retrieved from the House of Commons website: http://www.ourcommons.ca/DocumentViewer/en/42-1/CHPC/meeting-15/evidence

Stein, J. (2012, March 8). This blog has moved! The Official FPinfomart Blog. Retrieved from https://fpinfomart.wordpress.com/

Stross, R. (2005, June 5). What eBay could learn from craigslist. New York Times. Retrieved from http://proxy.lib.sfu.ca/login?url=https://search-proquest-com.proxy.lib.sfu.ca/docview/92892599?accountid=13800

Titscher, S., Meyer, M., Wodak, R., & Vetter, E. (2000). Methods of text and discourse analysis. Thousand Oaks, CA: Sage Publications.

van Leeuwen, T. (2008). Discourse and practice. New York, NY: Oxford University Press.


Vividata (2016). Vividata releases study of 40,000 Canadian consumers. Retrieved from https://vividata.ca/news/press-release/

Waddell, C. (2012). Berry’d alive: The media, technology, and the death of political coverage. In D. Taras & C. Waddell (Eds.), How Canadians communicate IV: Media and politics (pp. 110-128). Edmonton, AB: AU Press.

Watson, H.G. (2016, November 29). Here are the confirmed Globe and Mail staff who took buyouts. J Source. Retrieved from http://www.j-source.ca/article/here-are-confirmed-globe-and-mail-staff-who-took-buyouts

Wayback Machine (2017). FPinfomart search results. Retrieved from http://web.archive.org/web/20070627205206/http://www.fpinfomart.ca/home/home.php

Western Libraries (2011, October 27). Canadian newspapers – a thing of the past? C.B. “Bud” Johnston Library News. Retrieved from the Western Libraries website: http://www.lib.uwo.ca/news/business/2011/10/27/canadiannewspapersathingofthepast.html

Wikileaks (2009, March 24). CBC Radio One report on Canadian Conservative campus front groups, 20 Mar 2009. Retrieved from the Wikileaks website: http://wikileaks.org/wiki/CBC_Radio_One_report_on_Canadian_Conservative_campus_front_groups,_20_Mar_2009

Winseck, D. (2008). The state of media ownership and media markets: Competition or concentration and why should we care? Sociology Compass, 2(1), 34-47.

Winseck, D. (2010). Financialization and the "crisis of the media": The rise and fall of (some) media conglomerates in Canada. Canadian Journal of Communication, 35(3), 365-393.

Wong, J. (2013, November 19). Canadian Media Guild data shows 10,000 job losses in past five years. Retrieved from The Canadian Journalism Project website: http://j-source.ca/article/canadian-media-guild-data-shows-10000-job-losses-past-five-years


Appendix A: Field-test Coding Protocol

Media Selection: The news stories included in this project will be from one of three sources: The Globe and Mail, the National Post, and CBC News. National newspapers and the CBC were chosen for this project because of their cross-country reach. Also, the increased resources available to national news organizations should allow for a greater capacity to archive news stories than local and independent news organizations might have.

In addition to recording stories from the websites of the three aforementioned sources, hard copies of the National Post and Globe and Mail will be catalogued as part of the project.

Date and Time Selection: Stories will be collected over a six-week period, capturing two constructed weeks of news coverage.

Due to the ever-changing content on news websites, each session of story recording will begin with taking screen captures of each section. Stories will be recorded from each website in one session to prevent confusion from content change. Stories will be collected from the sites between 12:00 and 3:00 p.m.

Story Selection: Five stories will be selected from each of the following sections for each website each day, providing a 500-story sample:
1. News (or Front Page)
2. Commentary (not applicable for CBC News)
3. Politics
4. Business
5. Health & Lifestyle
6. Sports
7. Arts & Entertainment
8. Technology & Science

Unit of Analysis: The unit of analysis for this study is the individual news story. Hyperlinks to secondary content within news stories, as is common practice for all three news websites, will be considered as separate items. While the presence of hyperlinks will be noted, the linked story may not be included in the content capture.

Coding Information:

1) Story Date: DD/MM/YYYY. Online stories include a 24-hour timestamp.

2) Story Source:
1. CBC News
2. Globe and Mail
3. National Post

3) Story Accessed:
1. Online
2. Offline

4) Story Title:

5) Story Hyperlink:

6) Story Type:
1. News report: in-house
2. News report: wire service
3. Editorial/analysis/commentary
4. Regular columnist
5. Guest columnist/editor
6. Press release
7. Product feature
8. Transcribed interview
9. Letter to the editor

7) Story Word Count:

8) Story Primary Geographic Subject:
1. Canadian city
2. Canadian province
3. Canadian region
4. Canada
5. America
6. Mexico
7. Central America
8. South America
9. Western Europe
10. Eastern Europe
11. Russia
12. China
13. India
14. Asia
15. Middle East
16. Africa
17. Oceania
18. Antarctica

9) Story Actor Focus:
0. None
1. Governmental
2. Corporate
3. Nongovernmental/Nonprofit
4. Entertainment
5. Academic
6. Athletic
7. Public: Group
8. Public: Individual
9. Animal

10) Story Controversy:
0. None
1. Legal: Criminal
2. Legal: Human Rights
3. Legal: Corporate
4. Legal: Environmental
5. Legislative
6. Financial
7. Environmental
8. Health
9. Protest: Union/Labour
10. Protest: Political
11. Protest: Environmental
12. Protest: Other

11) Story Position: Recognizing that one story may take a position on several groups/individuals, the position will be coded based on the position which is of central focus in the article. Central focus is that to which 60% of the article is dedicated.
0. None
1. Pro-Canadian federal government
2. Pro-Canadian provincial government: specify
3. Pro-Canadian municipal government: specify
4. Pro-Other nation’s government: specify
5. Pro-International body (WHO, UN)
6. Pro-Business
7. Pro-Nongovernmental organization
8. Pro-Other group or individual: specify
9. Anti-Canadian federal government
10. Anti-Canadian provincial government: specify
11. Anti-Canadian municipal government: specify
12. Anti-Other nation’s government: specify
13. Anti-International body (WHO, UN)
14. Anti-Business
15. Anti-Nongovernmental organization
16. Anti-Other group or individual: specify

12) Story Attitude:
1. Neutral: no discernable attitude
2. Negative, descriptive: reports details in a negative tone
3. Negative, analytical: offers an analysis of the subject which leads to a negative conclusion
4. Positive, descriptive: reports details in a positive tone
5. Positive, analytical: offers an analysis of the subject which leads to a positive conclusion

Appendix B: Final Coding Protocol

Media selection: The news stories included in this project will be from one of three sources: The Globe and Mail, the National Post, and CBC News. National newspapers and the CBC were chosen for this project because of their cross-country reach. Also, the increased resources available to national news organizations should allow for a greater capacity to archive news stories than local and independent news organizations might have.

Date and time selection: Stories will be collected over a six-week period, capturing one six-day constructed week of news coverage. During the same period, a sample of controversial stories will be collected. The rates of erasure and archiving of the stories in these samples will be compared after a minimum of 24 weeks following their collection.

Due to the ever-changing content on news websites, each session of story recording will begin with taking screen captures of each section. Stories will be recorded from each website in one session to prevent confusion from content change. Stories will be collected from the sites between 12:00 and 3:00 p.m.
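The constructed-week idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protocol does not say how collection days were assigned to weeks, so the random draw below, the function name `constructed_week`, and the start date are all hypothetical. The six weekdays match those listed in the identifying-code table later in this appendix.

```python
import random
from datetime import date, timedelta


def constructed_week(start, weeks, weekdays, seed=None):
    """Pick one calendar date per requested weekday, each from a randomly chosen week.

    weekdays uses the datetime convention: 0 = Monday ... 6 = Sunday.
    """
    rng = random.Random(seed)
    chosen = {}
    for weekday in weekdays:
        # First occurrence of this weekday on or after the start date.
        offset = (weekday - start.weekday()) % 7
        first_occurrence = start + timedelta(days=offset)
        # Assumption: each weekday is drawn independently from one of the weeks.
        chosen[weekday] = first_occurrence + timedelta(weeks=rng.randrange(weeks))
    return chosen


# Hypothetical start date; the actual collection dates are not given here.
collection_start = date(2024, 1, 7)
# Sunday, Monday, Tuesday, Wednesday, Thursday, Saturday.
protocol_weekdays = [6, 0, 1, 2, 3, 5]
sample_days = constructed_week(collection_start, weeks=6,
                               weekdays=protocol_weekdays, seed=1)
for weekday, day in sorted(sample_days.items(), key=lambda item: item[1]):
    print(day.isoformat(), day.strftime("%A"))
```

Drawing each weekday from a different point in the six-week window is what keeps a single unusual news week from dominating the sample.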

Story selection: Five stories will be selected from each of the following sections for each website each day, providing a 690-story sample:
1. News (or Front Page)
2. Commentary (not applicable for CBC News)
3. Politics
4. Business
5. Health & Lifestyle
6. Sports
7. Arts & Entertainment
8. Technology & Science

Stories in each section will be selected for their position on the webpage (see coding attribute 8 for a breakdown of the page).
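The 690-story figure follows from the selection rules above, and the arithmetic can be checked with a short Python sketch. The variable names are illustrative; the section counts assume eight sections for the two newspapers and seven for CBC News, since Commentary is not applicable there.

```python
# Sections sampled per site (Commentary is not applicable for CBC News).
SECTIONS_PER_SITE = {"CBC News": 7, "Globe and Mail": 8, "National Post": 8}
STORIES_PER_SECTION = 5
COLLECTION_DAYS = 6  # one six-day constructed week

stories_per_day = STORIES_PER_SECTION * sum(SECTIONS_PER_SITE.values())
sample_size = stories_per_day * COLLECTION_DAYS

# Planned number of stories per site over the whole constructed week.
planned_per_site = {
    site: sections * STORIES_PER_SECTION * COLLECTION_DAYS
    for site, sections in SECTIONS_PER_SITE.items()
}

print(stories_per_day, sample_size)
print(planned_per_site)
```

Under these assumptions the plan yields 115 stories per collection day and 690 in total.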

Unit of analysis: The sample unit for this study is the individual news story. Hyperlinks to secondary content within news stories, as is common practice for all three news websites, will be considered as separate items. While the presence of hyperlinks will be noted, the linked story may not be included in the content capture.

Coding protocol:

1) Story date: MM/DD/YYYY. Online stories include a 24-hour timestamp. Include the original publication date if the story was reposted.

2) Story source:
1. CBC News
2. Globe and Mail
3. National Post

3) Story title:

4) Story slug/URL:

6) Author:

7) Wire Story: Identify source, where applicable


8) Location of story on website:

9) Unique Identifying Code: Number to denote day of the week; letter to denote site; letter to denote section; number to denote story number in sample

Day            Site                Section              Number
1: Sunday      C: CBC              N: News/Front Page   1
2: Monday      G: Globe and Mail   C: Comment           2
3: Tuesday     N: National Post    P: Politics          3
4: Wednesday                       B: Business          4
5: Thursday                        H: Health            5
6: Saturday                        S: Sports            6
                                   A: Arts
                                   T: Tech

Example: 2NB5 would be a story from Monday, collected from the National Post’s Business section, in fifth position.
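The identifying code can be decoded mechanically. The following Python sketch is an illustration only: the lookup tables copy the day, site, and section values listed for this attribute, while the function name `decode_story_code` and the assumption that the position is a single digit are mine.

```python
# Lookup tables transcribed from the identifying-code scheme above.
DAYS = {"1": "Sunday", "2": "Monday", "3": "Tuesday",
        "4": "Wednesday", "5": "Thursday", "6": "Saturday"}
SITES = {"C": "CBC", "G": "Globe and Mail", "N": "National Post"}
SECTIONS = {"N": "News/Front Page", "C": "Comment", "P": "Politics",
            "B": "Business", "H": "Health", "S": "Sports",
            "A": "Arts", "T": "Tech"}


def decode_story_code(code):
    """Split a four-character code such as '2NB5' into its labelled parts."""
    if len(code) != 4:
        raise ValueError("expected four characters: day, site, section, position")
    day, site, section, position = code
    if day not in DAYS or site not in SITES or section not in SECTIONS:
        raise ValueError(f"unrecognised code: {code!r}")
    if not position.isdigit():
        # Assumption: the story position is a single digit.
        raise ValueError(f"position must be a digit: {code!r}")
    return {
        "day": DAYS[day],
        "site": SITES[site],
        "section": SECTIONS[section],
        "position": int(position),
    }


print(decode_story_code("2NB5"))
```

Because the letter N denotes both a site and a section, the decoder relies on character position rather than on the letter alone.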

10) Story is still posted on the news site 24 weeks after publication:
0. No
1. Yes

11) Story is available through LexisNexis:
0. No
1. Yes

12) Story is available through Canadian Newsstream:
2. No
3. Yes

13) Story is available through Factiva:
4. No
5. Yes


Appendix C: Images

Images 1-7: Search results for the Shannon, Quebec trial


Images 8-10: Methodology & field-test


Images 12-13: Results


Images 12-13: Story “2GN1” original version (left), changed version (right)


Appendix D: Additional Figure and Tables


Chi-Square Analyses

Table legend: first number = observed count; (second number) = expected count; [third number] = contribution to χ²
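The three quantities in the legend can be recomputed from the observed counts alone. The Python sketch below is illustrative rather than the software actually used for these analyses; the function name `chi_square_table` is hypothetical, and the counts are the Canadian Newsstream and Factiva figures reported in comparison 2.3 of this appendix.

```python
def chi_square_table(observed):
    """Return (expected counts, per-cell contributions, chi-square statistic)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    # Contribution = (observed - expected)^2 / expected.
    # Note: this fails if any expected count is zero.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    statistic = sum(sum(row) for row in contributions)
    return expected, contributions, statistic


# Archived, Changed, Missing counts from comparison 2.3.
observed = [
    [162, 26, 272],  # Canadian Newsstream
    [213, 25, 222],  # Factiva
]
expected, contributions, statistic = chi_square_table(observed)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(expected[0][0], 2), round(statistic, 3), degrees_of_freedom)
```

For these counts the sketch gives an expected archived count of 187.5 and a statistic of about 12.016 with 2 degrees of freedom, in line with the values reported for that comparison.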

1.1 Between news sites: Archived (A), Changed (C), Missing (M), Duplicate (D) comparison

News Site        Archived              Changed             Missing             Duplicate          Row Totals
CBC              175 (178.30) [0.06]   3 (7.31) [2.55]     26 (16.76) [5.09]   6 (7.62) [0.34]    210
Globe and Mail   185 (203.77) [1.73]   21 (8.36) [19.11]   26 (19.16) [2.44]   8 (8.71) [0.06]    240
National Post    225 (202.92) [2.40]   0 (8.33) [8.33]     3 (19.08) [13.55]   11 (8.67) [0.62]   239
Column Totals    585                   24                  55                  25                 689

The chi-square statistic is 56.284. The p-value is < 0.00001. The result is significant at p < .05. χ² = 56.284, df = 6, χ²/df = 9.38, P(χ² > 56.284) < 0.00001.

1.2 Between news sites: AM comparison

News Site        Archived              Missing              Row Totals
CBC              175 (183.73) [0.41]   26 (17.27) [4.41]    201
Globe and Mail   185 (192.87) [0.32]   26 (18.13) [3.41]    211
National Post    225 (208.41) [1.32]   3 (19.59) [14.05]    228
Column Totals    585                   55                   640

The chi-square statistic is 23.932. The p-value is < 0.00001. The result is significant at p < .05. χ² = 23.932, df = 2, χ²/df = 11.97, P(χ² > 23.932) < 0.00001.

2.1 Between archives and news sites: ACM comparison

Site                  Archived                Changed             Missing                 Row Totals
Canadian Newsstream   162 (240.85) [25.82]    26 (19.05) [2.54]   272 (200.10) [25.84]    460
Factiva               213 (240.85) [3.22]     25 (19.05) [1.86]   222 (200.10) [2.40]     460
LexisNexis            217 (347.15) [48.79]    18 (27.45) [3.25]   428 (288.40) [67.57]    663
News Sites (Total)    584 (347.15) [161.60]   24 (27.45) [0.43]   55 (288.40) [188.89]    663
Column Totals         1176                    93                  977                     2246

The chi-square statistic is 532.218. The p-value is < 0.00001. The result is significant at p < .05. χ² = 532.218, df = 6, χ²/df = 88.70, P(χ² > 532.218) < 0.00001.

2.2 Between archives: ACM comparison

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | 162 (172.03) [0.58] | 26 (20.05) [1.77] | 272 (267.92) [0.06] | 460 |
| Factiva | 213 (172.03) [9.76] | 25 (20.05) [1.22] | 222 (267.92) [7.87] | 460 |
| LexisNexis | 217 (247.94) [3.86] | 18 (28.90) [4.11] | 428 (386.16) [4.53] | 663 |
| Column Totals | 592 | 69 | 922 | 1583 |

The chi-square statistic is 33.770. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 33.770, df = 4, χ2/df = 8.44, P(χ2 > 33.770) = 0.0000.

2.3 Between Canadian Newsstream and Factiva: ACM comparison

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | 162 (187.50) [3.47] | 26 (25.50) [0.01] | 272 (247.00) [2.53] | 460 |
| Factiva | 213 (187.50) [3.47] | 25 (25.50) [0.01] | 222 (247.00) [2.53] | 460 |
| Column Totals | 375 | 51 | 494 | 920 |

The chi-square statistic is 12.016. The p-value is .0025. The result is significant at p < .05. χ2 = 12.016, df = 2, χ2/df = 6.01, P(χ2 > 12.016) = 0.0025.
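The reported p-values can be checked by hand. For an even number of degrees of freedom, the upper-tail probability of the chi-square distribution has a closed form: P(χ2 > x) = exp(−x/2) × Σ (x/2)^k / k!, summed over k = 0 to df/2 − 1. For df = 2 this reduces to exp(−x/2). The Python sketch below implements that formula; the function name `chi_square_p_value_even_df` is hypothetical and is not part of the original analysis. It deliberately does not cover odd df (such as the comparison by date in section 9, where df = 5), for which a statistics library would be the more practical choice.

```python
import math


def chi_square_p_value_even_df(statistic, df):
    """Upper-tail p-value P(chi2 > statistic) for an even, positive df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    Hypothetical helper for illustration only.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to even, positive df")

    half = statistic / 2
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0  # running sum of the terms
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Example: the Canadian Newsstream vs. Factiva comparison (df = 2).
p_value = chi_square_p_value_even_df(12.016, 2)
print(f"p = {p_value:.4f}")
```

For the comparison above (χ2 = 12.016, df = 2), this gives roughly 0.0025, which agrees with the reported value.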

3. Intra-archive: Canadian Newsstream ACM comparison

| Article Source | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Globe and Mail | 86 (81.70) [0.23] | 18 (13.11) [1.82] | 128 (137.18) [0.61] | 232 |
| National Post | 76 (80.30) [0.23] | 8 (12.89) [1.85] | 144 (134.82) [0.63] | 228 |
| Column Totals | 162 | 26 | 272 | 460 |

The chi-square statistic is 5.370. The p-value is .0682. The result is not significant at p < .05. χ2 = 5.370, df = 2, χ2/df = 2.69, P(χ2 > 5.370) = 0.0682.

4. Intra-archive: Factiva ACM comparison

| Article Source | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Globe and Mail | 132 (107.43) [5.62] | 8 (12.61) [1.68] | 92 (111.97) [3.56] | 232 |
| National Post | 81 (105.57) [5.72] | 17 (12.39) [1.71] | 130 (110.03) [3.62] | 228 |
| Column Totals | 213 | 25 | 222 | 460 |

The chi-square statistic is 21.923. The p-value is .000017. The result is significant at p < .05. χ2 = 21.923, df = 2, χ2/df = 10.96, P(χ2 > 21.923) = 0.000017.

5.1 Intra-archive: LexisNexis ACM comparison

| Article Source | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| CBC | 79 (66.77) [2.24] | 2 (5.54) [2.26] | 123 (131.69) [0.57] | 204 |
| Globe and Mail | 123 (75.61) [29.71] | 9 (6.27) [1.19] | 99 (149.12) [16.85] | 231 |
| National Post | 15 (74.62) [47.64] | 7 (6.19) [0.11] | 206 (147.19) [23.50] | 228 |
| Column Totals | 217 | 18 | 428 | 663 |

The chi-square statistic is 124.065. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 124.065, df = 4, χ2/df = 31.02, P(χ2 > 124.065) = 0.0000

5.2 Intra-archive: LexisNexis ACM comparison, National Post removed

| Article Source | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| CBC | 79 (94.73) [2.61] | 2 (5.16) [1.93] | 123 (104.11) [3.43] | 204 |
| Globe and Mail | 123 (107.27) [2.31] | 9 (5.84) [1.71] | 99 (117.89) [3.03] | 231 |
| Column Totals | 202 | 11 | 222 | 435 |

The chi-square statistic is 15.015. The p-value is 0.0005. The result is significant at p < .05. χ2 =15.015, df = 2, χ2/df = 7.51, P(χ2 > 15.015) = 0.0005

6. Interarchive: CBC articles ACM comparison between archives

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| CBC News Site | 175 (127.00) [18.14] | 3 (2.50) [0.10] | 26 (74.50) [31.57] | 204 |
| LexisNexis | 79 (127.00) [18.14] | 2 (2.50) [0.10] | 123 (74.50) [31.57] | 204 |
| Column Totals | 254 | 5 | 149 | 408 |

The chi-square statistic is 99.631. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 99.631, df = 2, χ2/df = 49.82, P(χ2 > 99.631) = 0.0000

7.1 Interarchive: Globe and Mail articles ACM comparison between archives

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Globe and Mail News Site | 185 (131.75) [21.52] | 21 (14.00) [3.50] | 26 (86.25) [42.09] | 232 |
| Canadian Newsstream | 86 (131.75) [15.89] | 18 (14.00) [1.14] | 128 (86.25) [20.21] | 232 |
| Factiva | 132 (131.75) [0.00] | 8 (14.00) [2.57] | 92 (86.25) [0.38] | 232 |
| LexisNexis | 124 (131.75) [0.46] | 9 (14.00) [1.79] | 99 (86.25) [1.88] | 232 |
| Column Totals | 527 | 56 | 345 | 928 |

The chi-square statistic is 111.430. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 111.430, df = 6, χ2/df = 18.57, P(χ2 > 111.430) = 0.0000

7.2 Interarchive: Globe and Mail articles ACM comparison between archives, News Site removed

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | 86 (114.00) [6.88] | 18 (11.67) [3.44] | 128 (106.33) [4.41] | 232 |
| Factiva | 132 (114.00) [2.84] | 8 (11.67) [1.15] | 92 (106.33) [1.93] | 232 |
| LexisNexis | 124 (114.00) [0.88] | 9 (11.67) [0.61] | 99 (106.33) [0.51] | 232 |
| Column Totals | 342 | 35 | 319 | 696 |

The chi-square statistic is 22.649. The p-value is 0.0001. The result is significant at p < .05. χ2 = 22.649, df = 4, χ2/df = 5.66, P(χ2 > 22.649) = 0.0001

7.3 Interarchive: Globe and Mail articles ACM comparison between Factiva and LexisNexis

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Factiva | 132 (128.00) [0.12] | 8 (8.50) [0.03] | 92 (95.50) [0.13] | 232 |
| LexisNexis | 124 (128.00) [0.12] | 9 (8.50) [0.03] | 99 (95.50) [0.13] | 232 |
| Column Totals | 256 | 17 | 191 | 464 |

The chi-square statistic is 0.565. The p-value is 0.7538. The result is not significant at p < .05. χ2 = 0.565, df = 2, χ2/df = 0.28, P(χ2 > 0.565) = 0.7538

8.1 Interarchive: National Post articles ACM comparison between archives

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| National Post News Site | 225 (99.25) [159.33] | 0 (8.00) [8.00] | 3 (120.75) [114.82] | 228 |
| Canadian Newsstream | 76 (99.25) [5.45] | 8 (8.00) [0.00] | 144 (120.75) [4.48] | 228 |
| Factiva | 81 (99.25) [3.36] | 17 (8.00) [10.12] | 130 (120.75) [0.71] | 228 |
| LexisNexis | 15 (99.25) [71.52] | 7 (8.00) [0.12] | 206 (120.75) [60.19] | 228 |
| Column Totals | 397 | 32 | 483 | 912 |

The chi-square statistic is 438.092. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 438.092, df = 6, χ2/df = 73.02, P(χ2 > 438.092) = 0.0000

8.2 Interarchive: National Post articles ACM comparison between archives, News Site removed

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | 76 (57.33) [6.08] | 8 (10.67) [0.67] | 144 (160.00) [1.60] | 228 |
| Factiva | 81 (57.33) [9.77] | 17 (10.67) [3.76] | 130 (160.00) [5.62] | 228 |
| LexisNexis | 15 (57.33) [31.26] | 7 (10.67) [1.26] | 206 (160.00) [13.22] | 228 |
| Column Totals | 172 | 32 | 480 | 684 |

The chi-square statistic is 73.242. The p-value is < 0.00001. The result is significant at p < .05. χ2 = 73.242, df = 4, χ2/df = 18.31, P(χ2 > 73.242) = 0.0000

8.3 Interarchive: National Post articles ACM comparison between Canadian Newsstream and Factiva

| Site | Archived | Changed | Missing | Row Totals |
| --- | --- | --- | --- | --- |
| Canadian Newsstream | 76 (78.50) [0.08] | 8 (12.50) [1.62] | 144 (137.00) [0.36] | 228 |
| Factiva | 81 (78.50) [0.08] | 17 (12.50) [1.62] | 130 (137.00) [0.36] | 228 |
| Column Totals | 157 | 25 | 274 | 456 |

The chi-square statistic is 4.115. The p-value is 0.1278. The result is not significant at p < .05. χ2 = 4.115, df = 2, χ2/df = 2.06, P(χ2 > 4.115) = 0.1278

9. AM comparison by date

| Date | Archived | Missing | Row Totals |
| --- | --- | --- | --- |
| June 6 | 200 (194.60) [0.15] | 156 (161.40) [0.18] | 356 |
| June 17 | 218 (195.70) [2.54] | 140 (162.30) [3.07] | 358 |
| June 22 | 185 (193.51) [0.37] | 169 (160.49) [0.45] | 354 |
| June 28 | 182 (200.62) [1.73] | 185 (166.38) [2.08] | 367 |
| July 3 | 189 (191.87) [0.04] | 162 (159.13) [0.05] | 351 |
| July 9 | 204 (201.71) [0.03] | 165 (167.29) [0.03] | 369 |
| Column Totals | 1178 | 977 | 2155 |

The chi-square statistic is 10.725. The p-value is 0.0571. The result is not significant at p < .05. χ2 = 10.725, df = 5, χ2/df = 2.14, P(χ2 > 10.725) = 0.0571
