Using the Internet for Social Science Research
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Web Content Analysis
Web Content Analysis Workshop in Advanced Techniques for Political Communication Research: Web Content Analysis Session Professor Rachel Gibson, University of Manchester Aims of the Session • Defining and understanding the Web as on object of study & challenges to analysing web content • Defining the shift from Web 1.0 to Web 2.0 or ‘social media’ • Approaches to analysing web content, particularly in relation to election campaign research and party homepages. • Beyond websites? Examples of how web 2.0 campaigns are being studied – key research questions and methodologies. The Web as an object of study • Inter-linkage • Non-linearity • Interactivity • Multi-media • Global reach • Ephemerality Mitra, A and Cohen, E. ‘Analyzing the Web: Directions and Challenges’ in Jones, S. (ed) Doing Internet Research 1999 Sage From Web 1.0 to Web 2.0 Tim Berners-Lee “Web 1.0 was all about connecting people. It was an interactive space, and I think Web 2.0 is of course a piece of jargon, nobody even knows what it means. If Web 2.0 for you is blogs and wikis, then that is people to people. But that was what the Web was supposed to be all along. And in fact you know, this ‘Web 2.0’, it means using the standards which have been produced by all these people working on Web 1.0,” (from Anderson, 2007: 5) Web 1.0 The “read only” web Web 2.0 – ‘What is Web 2.0’ Tim O’Reilly (2005) “The “read/write” web Web 2.0 • Technological definition – Web as platform, supplanting desktop pc – Inter-operability , through browser access a wide range of programmes that fulfil a variety of online tasks – i.e. -
Legality and Ethics of Web Scraping
Murray State's Digital Commons Faculty & Staff Research and Creative Activity 12-15-2020 Tutorial: Legality and Ethics of Web Scraping Vlad Krotov Leigh Johnson Murray State University Leiser Silva University of Houston Follow this and additional works at: https://digitalcommons.murraystate.edu/faculty Recommended Citation Krotov, V., Johnson, L., & Silva, L. (2020). Tutorial: Legality and Ethics of Web Scraping. Communications of the Association for Information Systems, 47, pp-pp. https://doi.org/10.17705/1CAIS.04724 This Peer Reviewed/Refereed Publication is brought to you for free and open access by Murray State's Digital Commons. It has been accepted for inclusion in Faculty & Staff Research and Creative Activity by an authorized administrator of Murray State's Digital Commons. For more information, please contact [email protected]. See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/343555462 Legality and Ethics of Web Scraping, Communications of the Association for Information Systems (forthcoming) Article in Communications of the Association for Information Systems · August 2020 CITATIONS READS 0 388 3 authors, including: Vlad Krotov Murray State University 42 PUBLICATIONS 374 CITATIONS SEE PROFILE Some of the authors of this publication are also working on these related projects: Addressing barriers to big data View project Web Scraping Framework: An Integrated Approach to Retrieving Big Qualitative Data from the Web View project All content following this -
Digital Citizenship Education
Internet Research Ethics: Digital Citizenship Education Internet Research Ethics: Digital Citizenship Education by Tohid Moradi Sheykhjan Ph.D. Scholar in Education University of Kerala Paper Presented at Seminar on New Perspectives in Research Organized by Department of Education, University of Kerala Venue Seminar Hall, Department of Education, University of Kerala, Thycaud, Thiruvananthapuram, Kerala, India 17th - 18th November, 2017 0 Internet Research Ethics: Digital Citizenship Education Internet Research Ethics: Digital Citizenship Education Tohid Moradi Sheykhjan PhD. Scholar in Education, University of Kerala Email: [email protected] Abstract Our goal for this paper discusses the main research ethical concerns that arise in internet research and reviews existing research ethical guidance in relation to its application for educating digital citizens. In recent years we have witnessed a revolution in Information and Communication Technologies (ICTs) that has transformed every field of knowledge. Education has not remained detached from this revolution. Ethical questions in relation to technology encompass a wide range of topics, including privacy, neutrality, the digital divide, cybercrime, and transparency. In a growing digital society, digital citizens recognize and value the rights, responsibilities and opportunities of living, learning and working in an interconnected digital world, and they engage in safe, legal and ethical behaviors. Too often we see technology users misuse and abuse technology because they are unaware of -
D10.3: Description of Internet Science Curriculum
D10.3: Description of Internet Science Curriculum Sorana Cimpan, Kavé Salamatian, Thomas Plagemann, Meryem Marzouki, Žiga Turk, Jonathan Cave, Tamas David-Barrett, Marco Prandini To cite this version: Sorana Cimpan, Kavé Salamatian, Thomas Plagemann, Meryem Marzouki, Žiga Turk, et al.. D10.3: Description of Internet Science Curriculum. [Research Report] European Commission. 2014. hal- 01222260 HAL Id: hal-01222260 https://hal.archives-ouvertes.fr/hal-01222260 Submitted on 12 Apr 2016 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives| 4.0 International License Description of Internet Science Curriculum ICT - Information and Communication Technologies FP7-288021 Network of Excellence in Internet Science D10.3: DESCRIPTION OF INTERNET SCIENCE CURRICULUM Due Date of Deliverable: 31/05/2014 Actual Submission Date: 12/07/2014 Revision: FINAL Start date of project: December 1st 2011 Duration: 42 months Organisation name of lead contractor for this deliverable: UoS Authors: Sorana -
Methodologies for Crawler Based Web Surveys
Mike Thelwall Page 1 of 19 Methodologies for Crawler Based Web Surveys Mike Thelwall1 School of Computing, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1SB, UK [email protected] Abstract There have been many attempts to study the content of the web, either through human or automatic agents. Five different previously used web survey methodologies are described and analysed, each justifiable in its own right, but a simple experiment is presented that demonstrates concrete differences between them. The concept of crawling the web also bears further inspection, including the scope of the pages to crawl, the method used to access and index each page, and the algorithm for the identification of duplicate pages. The issues involved here will be well-known to many computer scientists but, with the increasing use of crawlers and search engines in other disciplines, they now require a public discussion in the wider research community. This paper concludes that any scientific attempt to crawl the web must make available the parameters under which it is operating so that researchers can, in principle, replicate experiments or be aware of and take into account differences between methodologies. A new hybrid random page selection methodology is also introduced. Keywords Web surveys, search engine, indexing, random walks 1 Introduction The current importance of the web has spawned many attempts to analyse sets of web pages in order to derive general conclusions about its properties and patterns of use (Crowston and Williams, 1997; Hooi-Im et al, 1998; Ingwerson, 1998; Lawrence and Giles, 1998a; Miller and Mather, 1998; Henzinger et al., 1999; Koehler, 1999; Lawrence and Giles, 1999a; Smith, 1999; Snyder and Rosenbaum, 1999; Henzinger et al., 2000; Thelwall, 2000a; Thelwall, 2000b; Thelwall, 2000d). -
A History of Social Media
2 A HISTORY OF SOCIAL MEDIA 02_KOZINETS_3E_CH_02.indd 33 25/09/2019 4:32:13 PM CHAPTER OVERVIEW This chapter and the next will explore the history of social media and cultural approaches to its study. For our purposes, the history of social media can be split into three rough-hewn temporal divisions or ages: the Age of Electronic Communications (late 1960s to early 1990s), the Age of Virtual Community (early 1990s to early 2000s), and the Age of Social Media (early 2000s to date). This chapter examines the first two ages. Beginning with the Arpanet and the early years of (mostly corpo- rate) networked communications, our history takes us into the world of private American online services such as CompuServe, Prodigy, and GEnie that rose to prominence in the 1980s. We also explore the internationalization of online services that happened with the publicly- owned European organizations such as Minitel in France. From there, we can see how the growth and richness of participation on the Usenet system helped inspire the work of early ethnographers of the Internet. As well, the Bulletin Board System was another popular and similar form of connection that proved amenable to an ethnographic approach. The research intensified and developed through the 1990s as the next age, the Age of Virtual Community, advanced. Corporate and news sites like Amazon, Netflix, TripAdvisor, and Salon.com all became recogniz- able hosts for peer-to-peer contact and conversation. In business and academia, a growing emphasis on ‘community’ began to hold sway, with the conception of ‘virtual community’ crystallizing the tendency. -
The Internet
•• TEN THINGS SHOULD KNOW LAWYERS ABOUT THE INTERNET .. The COMMONS Initiative: Cooperative Measurement and Modeling of Open Networked Systems KIMBERLY CLAFFY .. CAIDA: Cooperative Association for Internet Data Analysis BASED AT THE SAN DIEGO SUPERCOMPUTER, CENTER AT UCSD .. table of contents .. .... about the author 01 07 Table of contents, Point #6 -08 Author biography How data is KC Claffy being used Kimberly Claffy Point #1 02 Point #7 08 received her Ph. D. Updating legal -10 in Computer Science frameworks Normal regulatory responses doomed from UCSD. She is Director and a prin- Point #2 02 -03 10 cipal investigator Obstacles to progress Point #8 -16 Problematic responses for CAIDA, and an Point #3 03 Adjunct Professor Available data: -05 Point #9 16 of Computer Science a dire picture The news is -19 and Engineering at UCSD. Kimberly’s not all bad research interests include Internet Point #4 05 measurements, data analysis and vi- -06 The problem is not so 20 sualization, particularly with respect new Point #10 Solutions will cross -23 to cooperation and sharing of Internet data. She advocates the use of quanti- Point #5 06 boundaries An absurd situation tative analysis to objectively inform Sponsors, Credits 24 public Internet policy discussions. Her email is [email protected]. Adapted from: http://www.caida.org/publications/papers/2008/10_things_Lawyers_Should_Know_About_The_Internet 1 2 Last year Kevin Werbach invited me to his Supernova 2007 conference to give a 15- minute vignette on3 the challenge of getting empirical data to inform telecom4 policy. They posted5 the video of my talk last year, and my favorite tech podcast ITConversations , posted the mp3 as an episode last week. -
Characterizing Search-Engine Traffic to Internet Research Agency Web Properties
Characterizing Search-Engine Traffic to Internet Research Agency Web Properties Alexander Spangher Gireeja Ranade Besmira Nushi Information Sciences Institute, University of California, Berkeley and Microsoft Research University of Southern California Microsoft Research Adam Fourney Eric Horvitz Microsoft Research Microsoft Research ABSTRACT The Russia-based Internet Research Agency (IRA) carried out a broad information campaign in the U.S. before and after the 2016 presidential election. The organization created an expansive set of internet properties: web domains, Facebook pages, and Twitter bots, which received traffic via purchased Facebook ads, tweets, and search engines indexing their domains. In this paper, we focus on IRA activities that received exposure through search engines, by joining data from Facebook and Twitter with logs from the Internet Explorer 11 and Edge browsers and the Bing.com search engine. We find that a substantial volume of Russian content was apolit- ical and emotionally-neutral in nature. Our observations demon- strate that such content gave IRA web-properties considerable ex- posure through search-engines and brought readers to websites hosting inflammatory content and engagement hooks. Our findings show that, like social media, web search also directed traffic to IRA Figure 1: IRA campaign structure. Illustration of the struc- generated web content, and the resultant traffic patterns are distinct ture of IRA sponsored content on the web. Content spread from those of social media. via a combination of paid promotions (Facebook ads), un- ACM Reference Format: paid promotions (tweets and Facebook posts), and search re- Alexander Spangher, Gireeja Ranade, Besmira Nushi, Adam Fourney, ferrals (organic search and recommendations). These path- and Eric Horvitz. -
Network Science, Web Science and Internet Science: Comparing Interdisciplinary Areas
Network Science, Web Science and Internet Science: comparing interdisciplinary areas Thanassis Tiropanis1, Wendy Hall1, Jon Crowcroft2, Noshir Contractor3, Leandros Tassiulas4 1 University of Southampton 2 University of Cambridge 3 Northwestern University 4 University of Thessaly ABSTRACT In this paper, we examine the fields of Network Science, Web Science and Internet Science. All three areas are interdisciplinary, and since the Web is based on the Internet, and both the Web and the Internet are networks, there is perhaps confusion about the relationship between them. We study the extent of overlap and ask whether one includes the others, or whether they are all part of the same larger domain. This paper provides an account of the emergence of each of these areas and outlines a framework for comparison. Based on this framework, we discuss these overlaps and propose directions for harmonization of research activities. INTRODUCTION The observation of patterns that characterise networks, from biological to technological and social, and the impact of the Web and the Internet on society and business have motivated interdisciplinary research to advance our understanding of these systems. Their study has been the subject of Network Science research for a number of years. However, more recently we have witnessed the emergence of two new interdisciplinary areas: Web Science and Internet Science. Network Science Network science can be traced to its mathematical origins dating back to Leonard Euler’s seminal work on graph theory (Euler, 1741) in the 18th century and to its social scientific origins two centuries later by the psychiatrist Jacob Moreno’s (1953) efforts to develop “sociometry.” Soon thereafter, the mathematical framework offered by graph theory was also picked up by psychologists (Bavelas, 1950), anthropologists (Mitchell, 1956), and other social scientists to create an interdiscipline called Social Networks. -
Downloading a Webpage
UCLA UCLA Electronic Theses and Dissertations Title Time Constructs: The Origins of a Future Internet Permalink https://escholarship.org/uc/item/133926p6 Author Paris, Britt Publication Date 2018 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA Los Angeles Time Constructs: The Origins of a Future Internet A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Information Studies by Brittany Paris 2018 © Copyright by Brittany Paris 2018 ABSTRACT OF THE DISSERTATION Time Constructs: The Origins of a Future Internet by Brittany Paris Doctor of Philosophy in Information Studies University of California, Los Angeles, 2018 Professor Leah A. Lievrouw, Chair Technological time has been a topic of much theorization and dread, as both intellectuals and laypeople fear that human life is increasingly becoming secondary to the technological world. Feelings of despair and nihilism, perhaps attributable to social, political and economic upheavals brought by the synchronization of human life with technology, have been theorized by numerous scholars in a plethora of overlapping disciplines. What is left undertheorized is how technology develops in ways that might or might not actually foster these sensations of synchronicity, or speed. Technological development includes patterns of social coordination and consumption, as well as individual use and goals, that all relate to a sense of lived time. But what of the ways that technical design fosters these relations? What is the discourse of time in technological projects? This dissertation investigates the aforementioned questions in the context of NSF-funded Future Internet Architecture (FIA) projects—Named Data Networking (NDN), eXpressive Internet Architecture (XIA), and Mobility First (MF)—which are currently underway. -
Evaluating Internet Research Sources
Evaluating Internet Research Sources By Robert Harris Version Date: June 15, 2007 Introduction: The Diversity of Information Information is a Think about the magazine section in your local grocery store. If you reach out Commodity with your eyes closed and grab the first magazine you touch, you are about as Available in likely to get a supermarket tabloid as you are a respected journal (actually more Many Flavors likely, since many respected journals don't fare well in grocery stores). Now imagine that your grocer is so accommodating that he lets anyone in town print up a magazine and put it in the magazine section. Now if you reach out blindly, you might get the Elvis Lives with Aliens Gazette just as easily as Atlantic Monthly or Time. Welcome to the Internet. As I hope my analogy makes clear, there is an extremely wide variety of material on the Internet ranging in its accuracy, reliability, and value. Unlike most traditional information media (books, magazines, organizational documents), no one has to approve the content before it is made public. It's your job as a searcher, then, to evaluate what you locate in order to determine whether it suits your needs. Information Information is everywhere on the Internet, existing in large quantities and Exists on a continuously being created and revised. This information exists in a large Continuum of variety of kinds (facts, opinions, stories, interpretations, statistics) and is Reliability and created for many purposes (to inform, to persuade, to sell, to present a Quality viewpoint, and to create or change an attitude or belief). -
2016 Conference Report
2016 Conference Report www.igf-usa.org #igfusa2016 Table of Contents About the 2016 IGF-USA ........................................................................................................................................ 2 Key Metrics ................................................................................................................................................................ 3 Sessions ....................................................................................................................................................................... 4 Opening Plenary: Beyond Mere Access To Enhanced Connectivity For the Next Billion Online ............... 4 Session Report ....................................................................................................................................................................... 5 Morning Breakout Sessions .................................................................................................................................. 9 Expanding Access, Adoption, And Digital Literacy Through Technology And Local Solutions .................. 9 Session Report .................................................................................................................................................................... 10 Content and Conduct: Countering Violent Extremism and Promoting Human Rights Online ................. 12 Session Report ...................................................................................................................................................................