Seventh Framework Programme Fp7-Ict-2009-6

SEVENTH FRAMEWORK PROGRAMME FP7-ICT-2009-6 BlogForever Grant agreement no.: 269963 BlogForever: D2.1 Survey Implementation Report Editor: Silvia Arango-Docio Revision: Draft V1.0 Dissemination Level: WP2 Author(s): S. Arango-Docio, P. Sleeman, H. Kalb Due date of deliverable: 31/08/2011 Actual submission date: Start date of project: 01 March 2011 Duration: 30 months Lead Beneficiary name: University of London Abstract: This report outlines a principal study aiming to inform the development of preservation and dissemination solutions for blogs. To achieve this, the study encompassed: [i] a user survey exploring the aspects of blog preservation and blogging practices in general; [ii] an investigation into the use of tools and technologies within the Blogosphere; and finally [iii] an inquiry into the recent theoretical and technological advances for analysing blogs and their networks. The results of the study, as summarised in this report, enable addressing the objectives pursued as part of the D2.1 deliverable of the BlogForever project. More specifically, this report comments on: [a] common weblog authoring practices; [b] important aspects and types of blog data that should be preserved; [c] the patterns in weblogs structure and data; [d] the technology adopted by current blogs; and finally [e] the developments and prospects for analysing blog networks and weblog dynamics. As an account for the conducted work this report includes implementation details and adopted ethical guidelines. The results of the inquiry are discussed within the context of BlogForever – offering directions for further development of the project. Project co-funded by the European Commission within the Seventh Framework Programme (2007-2013) BlogForever_D2_120110808 31 August 2011 The BlogForever Consortium consists of: Aristotle University of Thessaloniki (AUTH) Greece European Organisation for Nuclear Research (CERN) Switzerland University of Glasgow (UG) UK The University of Warwick (UW) UK University of London (UL) UK Technische Universitat Berlin (TUB) Germany Cyberwatcher Norway SRDC Yazilim Arastrirma ve Gelistrirme ve Danismanlik Ticaret Limited Sirketi (SRDC) Turkey Tero Ltd (Tero) Greece Mokono GMBH Germany Phaistos SA (Phaistos) Greece Altec Software Development S.A. (Altec) Greece BlogForever Consortium Page 2 of 142 BlogForever_D2_120110808 31 August 2011 History Version Date Modification reason Modified by 0.1 18 June 2011 First Draft S.Arango-Docio 0.2 24 July 2011 Updated First Draft S. Arango-Docio, P. Sleeman 0.3 27 July 2011 Table of Contents S. Arango-Docio 0.4 27 July 2011 Comments received S. Arango-Docio 0.5 30 July 2011 Second Draft S. Arango-Docio, H. Kalb 0.5.1 03 August 2011 Adaptations to the format, Addition of literature H. Kalb references, Extension of the SmartPLS chapter 0.5.2 03 August 2011 Comments on second draft P. Sleeman 0.5.3 08 August 2011 Merged versions 0.5.1 & 0.5.2 S. Arango-Docio 0.5.4 18 August 2011 Updates to 0.5.3 S. Arango-Docio 0.5.5 24 August 2011 Inclusion of the chapter 4.1.1 and 4.3.3 H. Kalb 0.5.6 24 August 2011 Integration of the Technical Survey K. Stepanyan 0.5.7 25 August 2011 Expansion, review, copy-editing and commenting M. Joy, K Stepanyan 1.0 27 August 2011 Pre-final formatting H. Kalb, K. Stepanyan 1.1 30 August 2011 Review and editing M. Joy BlogForever Consortium Page 3 of 142 BlogForever_D2_120110808 31 August 2011 Table of Contents 1 EXECUTIVE SUMMARY ......................................................................................................................... 8 2 INTRODUCTION ................................................................................................................................. 10 3 BLOG PRESERVATION AND STUDIES REVIEW ..................................................................................... 11 4 BLOGFOREVER SURVEY ...................................................................................................................... 13 4.1 QUESTIONNAIRE DEVELOPMENT ............................................................................................................. 13 4.1.1 Influence Factors for Author and Reader Behaviour .................................................................. 14 4.1.1.1 Blog author intention to contribute to a blog archive ............................................................................. 14 4.1.1.2 Influence factors for the search strategy in the archive .......................................................................... 17 4.1.2 Survey Pilot Testing .................................................................................................................... 18 4.1.3 Survey Translations .................................................................................................................... 20 4.2 SURVEY IMPLEMENTATION ..................................................................................................................... 21 4.2.1 Survey Population and Sampling................................................................................................ 21 4.2.1.1 Sampling strategy ..................................................................................................................................... 21 4.2.1.2 Internet sampling ...................................................................................................................................... 22 4.2.1.3 Sample size ............................................................................................................................................... 25 4.2.2 Survey software.......................................................................................................................... 26 4.2.3 Survey Implementation and Promotion ..................................................................................... 31 4.3 DATA ANALYSES AND RESULTS ................................................................................................................ 33 4.3.1 From IProbe to SPSS and SmartPLS ............................................................................................ 33 4.3.2 SPSS and Excel Analyses ............................................................................................................. 34 4.3.2.1 Authors survey respondents summary ...................................................................................................... 34 4.3.2.2 Common blog authoring practices ............................................................................................................ 37 4.3.2.3 Patterns in blog structure and data ............................................................................................................ 39 4.3.2.4 Network-based metrics ............................................................................................................................. 41 4.3.2.5 Blog lifecycle examination ....................................................................................................................... 43 4.3.2.4 Aspects of blogs for preservation ............................................................................................................. 45 4.3.3 Blog Author Intentions for Contributing to Blog Archives .......................................................... 47 4.3.3.1 Measurement model ................................................................................................................................. 48 4.3.3.2 Structural model ....................................................................................................................................... 49 4.3.4 Influence Factors for Search Strategies within Archives ............................................................ 50 5 TECHNOLOGY USED BY CURRENT BLOGS ........................................................................................... 53 5.1 OBJECTIVES, DATA COLLECTION METHODS AND DATASETS .......................................................................... 53 5.1.1 Data Collection Methods and Datasets ...................................................................................... 54 5.1.2 Evaluation Method ..................................................................................................................... 55 5.2 EVALUATION RESULTS ........................................................................................................................... 56 5.2.1 Platforms and Software Used .................................................................................................... 56 5.2.2 Document Character Sets ........................................................................................................... 58 5.2.3 Use of CSS, Images, HTML5 and Flash ....................................................................................... 60 5.2.4 Semantic Markup: Microformats, Microdata and Metadata .................................................... 62 5.2.5 RSS and Atom Feeds ................................................................................................................... 65 5.2.6 APIs and Libraries ....................................................................................................................... 65 5.2.7 Social Media ............................................................................................................................... 67 5.2.8 Media Types and Common File Formats ...................................................................................

Seventh Framework Programme Fp7-Ict-2009-6

CHAPTER 12 Making Your Web Site Mashable

N8ek Kf N?8Kêj L

Life Sciences and the Web: a New Era for Collaboration

Quality Spine Care

Integrating with Blogs

Trackback Spam: Abuse and Prevention

Elie Bursztein, Baptiste Gourdin, John Mitchell Stanford University & LSV-ENS Cachan

Information Retrieval and Mining in Distributed Environments Studies in Computational Intelligence,Volume 324 Editor-In-Chief Prof

In the Spirit of Georgeʼs Request for Feedback from The

Set-Up E-Jurnal Menggunakan Open Journal Systems (Ojs)

Web Crawling, Analysis and Archiving

Critical Point of View: a Wikipedia Reader