Publication Bias: Identification of the Internet Community UNCOVER Project Deliverable D3.1 Part A
UNCOVER is an FP7-funded project under Contract No 282574

02/2013
AIT, DUK, UNC

Joachim Klerx

Publication Bias: Identification of the Internet Community
UNCOVER project deliverable D3.1 Part A

This report should be cited as follows:
Klerx, J. Deliverable D3.1 (Part A) of the UNCOVER FP7-funded project under contract number 282574: Publication Bias: Identification of the Internet Community.

The main goal of task 3.1 (Part A) was to identify the ‘publication bias’ community on the internet by means of social network analysis. Measures were established to identify organisations with key positions in this community network and, from these, the specific roles of the community's member organisations. The analysis revealed that new and sometimes unconventional types of organisations are currently gaining key positions on the internet. Besides blogs, we identified numerous e-journals, social networks, video platforms such as YouTube (with videos of conference presentations), discussion forums and other new services, indicating a structural change in the science and publication infrastructure.

This deliverable was prepared for the UNCOVER project consortium:

AIT Austrian Institute of Technology
Department for Foresight & Policy Development
Coordinating partner
Vienna, Austria

DUK Danube University Krems
Department for Evidence-based Medicine and Clinical Epidemiology
Krems, Austria

UNC University of North Carolina at Chapel Hill
Gillings School of Global Public Health
Chapel Hill, North Carolina, United States

Contact
Dr. Manuela KIENEGGER
Technology Management
Department for Foresight & Policy Development
AIT Austrian Institute of Technology GmbH
Donau-City-Straße 1
A-1220 Vienna
Austria
T +43(0) 50550-4530
F +43(0) 50550-4599
[email protected]
www.ait.ac.at

This deliverable was prepared by: Joachim Klerx
Other contributions to the deliverable by: Eva Buchinger, Dirk Holste, Manuela Kienegger, Petra Wagner-Luptacik

Content

1 Executive Summary
2 Introduction
  2.1 Background
  2.2 Objectives
  2.3 Organisation of This Report
3 Methods
  3.1 Development of Search Strategy
4 Results from internet search statistics
5 Results from web crawling and social network analysis
6 Conclusion and implications
7 Appendix
  Appendix A: List of authorities in the publication bias community
  Appendix B: List of hubs in the publication bias community
  Appendix C: List of identified domains with organisation names
  Appendix D: List of domains with organisation type
  Appendix E: List of identified literature

Table Index

Table 1: Google search statistics, related in some kind to the topic ‘publication bias’
Table 2: List of database optimized search strategies from WP 2, applied to internet search
Table 3: Top 10 list of authorities in the publication bias community on the internet
Table 4: Top 10 list of hubs in the publication bias community on the internet
Table 5: Top 10 sites from Google, Bing and Yahoo
Table 6: Classification of organisation according to their web service
Table 7: List of authorities in the publication bias community on the internet
Table 8: List of hubs in the publication bias community on the internet
Table 9: List of identified domains with organisation names
Table 10: List of identified domains with organisation type

Figure Index

Figure 1: System architecture of the CIA agent
Figure 2: Search statistics for ‘publication bias’ over the time, Index.
Figure 3: Geographic distribution of search statistics for ‘publication bias’, Index.
Figure 4: Search statistics for ‘clinical trial’, over time, Index.
Figure 5: Geographic distribution of search statistics for ‘clinical trial’, Index.
Figure 6: Search statistics for ‘evidence-based medicine’, over time, Index.
Figure 7: Geographic distribution of search statistics for ‘evidence-based medicine’, Index.
Figure 8: Network of publication bias community sites.
Figure 9: Network of domains
Figure 10: Web usage statistics for cochrane.org.
Figure 11: Web usage statistics for: www.ncbi.nlm.nih.gov.
Figure 12: Timeline of publication bias literature identified in the WWW

1 Executive Summary

The main goal of task 3.1 (Part A) was to define the ‘publication bias’ community on the internet by means of social network analysis. Measures were established to identify the key positions and roles of organisations within the community network and, from these, its prominent member organisations. In this context, the term ‘organisation’ refers to the responsible entity behind a website.

A community identification agent (CIA) was used to systematically locate and archive content and activities relating to ‘publication bias’ on publicly accessible sites, such as conferences, pressure group sites, standardization organisations, public forums and blogs. The overall aim was to gain deep insight into the structure of the publication bias community on the internet through social network analysis, web crawling and site statistics.

About 220,000 internet sites were scanned for ‘publication bias’ and related content.
About 17,000 sites were identified as being part of the ‘publication bias’ community network on the internet. These sites are operated by 483 website providers. Based on these findings, a network analysis was carried out to explore the community structure. Finally, organisation types were classified according to the services they offer on their internet sites.

This classification revealed that new and sometimes unconventional types of organisations are currently gaining importance in the ‘publication bias’ community. Blogs, e-journals, social networks, video platforms such as YouTube (with videos of conference presentations), discussion forums and other new services can be considered potential sources of information about the current discussion on publication bias.

To address the discussion on publication bias in new media, new forms of information management are necessary. The automatic identification of epistemic communities is only the first step towards automatic knowledge management. Using automatic issue management systems, with issue identification, issue tracking, weak signal detection for emerging issues and other functions, will be a competitive advantage in the publication bias community. Although automatic issue management cannot replace manual research, it can support researchers in their information management.

One of the main results of task 3.1 (Part A) is a list of organisations (identified by domain name) and their respective network positions. The individual position of an organisation can be seen as an indicator of whether the organisation tends to be an information hub for the community (a site with numerous outbound links) or is accepted as an authority by the community (a site with numerous