Web Content Analysis

Workshop in Advanced Techniques for Political Communication Research: Web Content Analysis Session
Professor Rachel Gibson, University of Manchester

Aims of the Session
• Defining and understanding the Web as an object of study, and the challenges of analysing web content
• Defining the shift from Web 1.0 to Web 2.0 or ‘social media’
• Approaches to analysing web content, particularly in relation to election campaign research and party homepages
• Beyond websites? Examples of how Web 2.0 campaigns are being studied: key research questions and methodologies

The Web as an object of study
• Inter-linkage
• Non-linearity
• Interactivity
• Multi-media
• Global reach
• Ephemerality
Mitra, A. and Cohen, E. (1999) ‘Analyzing the Web: Directions and Challenges’ in Jones, S. (ed.) Doing Internet Research. Sage.

From Web 1.0 to Web 2.0
Tim Berners-Lee: “Web 1.0 was all about connecting people. It was an interactive space, and I think Web 2.0 is of course a piece of jargon, nobody even knows what it means. If Web 2.0 for you is blogs and wikis, then that is people to people. But that was what the Web was supposed to be all along. And in fact you know, this ‘Web 2.0’, it means using the standards which have been produced by all these people working on Web 1.0,” (from Anderson, 2007: 5)
• Web 1.0: the “read only” web
• Web 2.0: the “read/write” web (Tim O’Reilly, 2005, ‘What is Web 2.0’)

Web 2.0
• Technological definition
  – Web as platform, supplanting the desktop PC
  – Inter-operability: through the browser, users access a wide range of programmes that fulfil a variety of online tasks, e.g. managing a home page, communicating with friends, sharing/publishing pictures, receiving news
• Social understanding
  – Web 2.0 is based around social networking activities; it relies on and is built through ‘social or participatory’ software
  – Users create, distribute, and add value to online content through blogs, Facebook profiles, online video, wikis, tagging tools
  – ‘Produsers’ (Bruns, 2007) or ‘prosumers’: the distinction between producers and users/consumers of content disappears
  – The hallmark of these applications is the way in which they devolve creative and classificatory power to ‘ordinary’ users

Web 1.0 & Web 2.0 applications – ‘social media’
Web 1.0
• Websites
• Email
• Discussion forums/chat rooms/IM
Web 2.0 or ‘social media’
• Blogs
• Social network sites (Facebook)
• Online video sharing/posting (YouTube)
• Online photo sharing/posting (Flickr)
• Microblogging sites (Twitter): @, #, RT
• Online link sharing/posting (Del.icio.us, Digg)

Methodologies used for analysing online campaigns (supply side)
Adapted from offline: ‘digitized methods’ (Rodgers, 2010), Web 1.0
• Online interviews, elite surveys
• Online discourse analysis
• Online content/feature analysis
  – ‘Classic’ or offline approaches
  – ‘Web specific’ – mixed mode
Online specific: ‘digital methods’, Web 2.0
• Hyperlinking and webspheres: VOSON, Issue Crawler, SocSciBot, NodeXL
• Blog analysis toolkits and search engines: Technorati, BAT (Blog Analysis Toolkit)
• Twitter and Facebook scrapers: Infoscape Lab, 140kit.com, Twapperkeeper, Discovertext
• YouTube scrapers: Tubemogul, Tubekit
• Wikipedia: Wikiscanner

Methodologies for analysing online campaigns (‘demand’ side)
User focused (demand)
• Focus groups: Price and Capella (2001); Stromer-Galley and Foot (2002)
• Surveys: Bimber and Davis (2003); Gibson & McAllister (2007)
• Experiments: Iyengar (2002); Lupia & Philpott (2005)
• User statistics: Hitwise, Sitemeter, Alexa, Google Analytics
• Computer tracking devices: Phorm
• Online SNA software: NodeXL

Discourse analysis to study web content
• Focus of the analysis is on the web as socio-cultural ‘texts’.
• Web content is described and explored from a qualitative perspective to uncover underlying meaning and significance.
• Examples of studies:
  – Markham (1998); Howard (2002); Hakken (1999); Hine (2000)
  – Benoit and Benoit (2000) ‘The Virtual Campaign: Presidential Primary Websites in Campaign 2000’ American Communication Journal
  – Warnick (1998) ‘Appearance or Reality? Political Parody on the Web in Campaign 96’ Critical Studies in Mass Communication

Web specific content/feature analysis
• Focus is on the web page/site as unit of analysis
• Structural-functional & quantitative: a systematic set of criteria is developed and applied to sites to yield largely quantitative measures of content, functionality, usability, and design features
  – Quantitative automated: Bauer and Scharl (2000)
  – Quantitative semi-automated
    • Web 1.0
    • Web 1.5?
    • Action Centers
• Heuristic evaluation: Nielsen & Molich (1990) ‘Heuristic evaluation of user interfaces’; Collings and Pearce (2002); Yates (2006)

Bauer and Scharl (2000) Internet Research

Quantitative semi-automated
1. Web 1.0 static/fixed homepages
  – Party and campaign sites: Gibson and Ward (2000; 2002); Foot & Schneider (2002)
  – E-government field: see Baker (2009); Pina et al. (2009); Panopoulou et al. (2008); West (2007); Henriksson et al. (2006); Holzer and Kim (2005); Garcia et al. (2005) http://www.insidepolitics.org/egovt06us.pdf
2. Web 1.5? Updated Web 1.0 schemes to incorporate Web 2.0 elements
  – Party and campaign home pages: Gulati & Williams (2006); Foot and Schneider (2006)
  – NSM: Stein (2009)
3. ‘Action Centers’: MyBO, Membersnet, MyConservatives, LibDemACT (Gibson, 2010; Lilleker and Jackson, 2010)

Gibson & Ward (2000) SSCORE
Gibson & Ward (2002) AJPS
Stein, L. 2009 NM&S
Gibson, R. 2010 APSA

Hyperlinking and webspheres
• Focus is on the inter-relational/textual element of websites and the wider context. The unit of analysis is the web network rather than individual sites.
• A thematic set of websites is identified and ‘captured’/archived over time, and the linkage patterns are explored as well as the content.
  – Method pioneered by Schneider and Foot (2004) ‘Web Sphere analysis’.
  – They define a Web Sphere as a ‘hyperlinked set of dynamically defined digital resources spanning multiple Web sites and deemed relevant or related to a central theme or ‘object’.’ The borders of the sphere are defined by its relationship to the central theme/object and a given time period.
See Library of Congress, Internet Archive http://www.archive.org/index.php & WebArchivist.org http://webarchivist.org/

Hyperlink analysis
More sophisticated/automated methods have developed for academic research since around 2003.
  – Adaptations of search engine technology: the simplest form of online link analysis.
  – Hindman et al. (2003) pioneered the term ‘googlearchy’, referring to a ‘power law’ operating in regard to the prominence of sites.
  – Adamic and Glance (2005) examined linkage between left- and right-wing blogs in the U.S. http://www.blogpulse.com/papers/2005/AdamicGlanceBlogWWW.pdf
  – Ackland and Gibson (2007): comparative analysis of political party system linkage, using the VOSON crawler (http://anu.voson.edu.au) to examine the outbound and inbound linkages among political parties in 6 different democracies. Hyperlinks = networked communication: inclusivity, identity, opponent dismissal, force multiplication. Tested via numbers of links to other parties, target of links by TLD, and inter-linkage within party groups.
  – For tools see http://www.issuecrawler.net ; http://www.touchgraph.com ; http://socscibot.wlv.ac.uk/

Inter-linkage between Parties by ideological family
Table 4: Inter-linkage between political parties by party type

Column key: (1) Far Left; (2) Left; (3) Centre; (4) Right; (5) Far Right; (6) Ecol.; (7) Reg.; (8) Total no. of parties linking to other parties; (9) Total no. of links made to other parties; (10) % of links within party family

Party type     (1)  (2)  (3)  (4)  (5)  (6)  (7)  (8)  (9) (10)
Ecologist        0    0    2    1    0    6    0    6   14   79
Far Left        11    7    0    2    1    3    1   12   71   46
Left             3    4    3    2    0    2    0    2   22   50
Centre           0    4    2    2    0    1    0    5   12   25
Right            1    3    1    8    1    2    1    9   20   55
Far Right        0    0    0    1    2    0    0    3    7   71
Regionalist      0    0    0    0    0    0    1    1    1  100
Total                                                39  147

Note: The numbers in columns 1-7 are counts of parties (e.g. 11 far left parties linked to other far left parties). The same party can appear more than once in a given row in columns 1-7, which is why column 8 (showing the total number of parties that have linked to another seed site) is not equal to the sum of the preceding columns.

Beyond websites…? Blog analysis – key studies
Types, Content, Structure
• Sweetster-Trammel, Kaye D. 2007. ‘Candidate Campaign Blogs: Directly Reaching Out to the Youth Vote.’ The American Behavioral Scientist 50(9): 1255-1264.
• Xifra, J. and A. Huerta. 2008. ‘Blogging PR: An Exploratory Analysis of Public Relations’ Public Relations Review 34: 265-275.
• Herring, S. et al. 2005. ‘Weblogs as a bridging genre’ Information Technology and People 18(2): 142-171.
Influence
• Karpf, D. 2008. ‘Measuring Influence in the Political Blogosphere’ Politics and Technology Review, Institute for Politics, Democracy and the Internet: 33-41. http://www.the4dgroup.com/BAI/articles/PoliTechArticle.pdf
• Etling, B. et al. 2010. ‘Mapping the Arabic blogosphere: politics and dissent online’ 12: 1225-1243.
• Elmer, G.; Ryan, P.M.; Devereaux, Z.; Langlois, G.; Redden, J. and McKelvey, F. 2007. ‘Election Bloggers: Methods for Determining Political Influence’ First Monday 12(4).
• Drezner and Farrell. 2008. ‘The Power and Politics of Blogs’ Public Choice 134: 15-30.

Beyond websites…? Twitter analysis
• Boyd, D. et al. 2010. ‘Tweet Tweet Retweet’ Working paper. http://www.danah.org/papers/TweetTweetRetweet.pdf
• Tumasjan et al. 2010. ‘Predicting Elections with Twitter’ Association for the Advancement of Artificial Intelligence. Social Science Computer Review http://ssc.sagepub.com/content/early/2010/09/24/0894439310386557.full.pdf+html
• Zuckerman, E. ‘Studying Twitter and the Moldovan Protests’ www.Ethanzuckermann.com/blog/

Beyond websites…? YouTube analysis
• Shah, C. 2010. ‘Supporting Research Data Collection from YouTube with Tubekit’ Journal of Information Technology & Politics 7(2&3): 226-240
• Wallsten, K. 2010. ‘‘Yes We Can’: How Online Viewership, Blog Discussion, Campaign Statement, and Mainstream Media Coverage Produced by a Viral Video Phenomenon.’ Journal of Information Technology & Politics 7(2&3): 163-181
• Zink, M. Suh, K and J.
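
Worked sketch: share of links within party family
The hyperlink analysis slides report, for each party family, the percentage of outbound links that stay within the family (column 10 of Table 4). The sketch below shows one way such a measure might be computed from a crawled list of links. It is an illustration under stated assumptions, not the procedure used by Ackland and Gibson (2007): the party names, the family mapping, and the link list are all hypothetical, and a real crawl (e.g. from VOSON or Issue Crawler) would need cleaning before this step.

```python
from collections import Counter

# Hypothetical mapping of party sites to ideological families (illustrative only).
FAMILY = {
    "GreenA": "Ecologist",
    "GreenB": "Ecologist",
    "LeftA": "Left",
    "LeftB": "Left",
    "RightA": "Right",
}

# Hypothetical directed hyperlinks: (source party site, target party site).
LINKS = [
    ("GreenA", "GreenB"),
    ("GreenA", "LeftA"),
    ("GreenB", "GreenA"),
    ("LeftA", "LeftB"),
    ("LeftA", "GreenA"),
    ("LeftA", "RightA"),
    ("LeftB", "LeftA"),
    ("RightA", "LeftA"),
]


def within_family_share(links, family):
    """Return, per source family, the % of outbound links that target the same family."""
    total = Counter()   # all outbound links made by parties in each family
    within = Counter()  # outbound links whose target is in the same family
    for source, target in links:
        src_family = family[source]
        total[src_family] += 1
        if family[target] == src_family:
            within[src_family] += 1
    # Rounded to whole percentages, as in the table's final column.
    return {fam: round(100 * within[fam] / total[fam]) for fam in total}


shares = within_family_share(LINKS, FAMILY)
for fam, pct in sorted(shares.items()):
    print(f"{fam}: {pct}% of links within family")
```

Counting links rather than parties matters here: as the table note explains, one party can link to several families, so party counts (columns 1-7) cannot be turned into link shares without the underlying link list.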
