D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

SEVENTH FRAMEWORK PROGRAMME THEME ICT 2009.7.3 ICT for Governance and Policy Modelling

Project acronym: WeGov

Project full title: Where eGovernment meets the eSociety

Grant agreement no.: 248512

D5.3 Evaluation of the final WeGov Toolbox

Deliverable Id: D5.3
Deliverable Name: Evaluation of the final WeGov Toolbox
Status: Final
Dissemination Level: Public
Due date of deliverable: 30 September 2012 (Project month 33)
Actual submission date: 19 October 2012 (Project month 34)
Work Package: WP5 Scenarios, Testbeds and Evaluation
Organisation name of lead contractor for this deliverable: GESIS – Leibniz Institute for the Social Sciences
Author(s): Timo Wandhöfer, Catherine Van Eeckhaute, Beccy Allen, Steve Taylor, Paul Walland
Partner(s) contributing: Gov2u, Hansard Society, IT Innovation Centre

Abstract: This document is the evaluation report of the final WeGov toolbox. It gives an overview of the WeGov prototype versions, describes the evaluation methodology with a process model and an evaluation model, and considers end users’ and expert advisers’ engagement. Complementary in-depth experiments were carried out to validate the accuracy and reliability of toolbox 3.0. The deliverable also covers exemplary usage, user guidance including a manual and Q&A, and privacy considerations.

 WeGov Consortium

Page 1 of 217

History

Version Date Modification reason Modified by
0.1 30.08.2012 Initial draft Timo Wandhöfer
0.2 30.09.2012 Improved version Timo Wandhöfer, Catherine Van Eeckhaute
0.3 05.10.2012 WeGov Privacy Considerations Steve Taylor, Paul Walland
0.4 06.10.2012 HeadsUp experiment Beccy Allen
0.5 08.10.2012 1st version for internal review Timo Wandhöfer, Catherine Van Eeckhaute
0.6 09.10.2012 Modified structure Timo Wandhöfer, Mark Thamm
0.7 11.10.2012 Review Beccy Allen
0.8 15.10.2012 2nd version for internal review Timo Wandhöfer, Catherine Van Eeckhaute
0.9 17.10.2012 Review Steve Taylor, Paul Walland, Peter Mutschke
1.0 19.10.2012 Final version Timo Wandhöfer, Catherine Van Eeckhaute


Table of contents

HISTORY ...... 2
TABLE OF CONTENTS ...... 3
LIST OF FIGURES ...... 7
LIST OF TABLES ...... 9
LIST OF ABBREVIATIONS ...... 10
GLOSSARY ...... 12
EXECUTIVE SUMMARY ...... 14
1 INTRODUCTION ...... 16
1.1 WEGOV WORK PACKAGE 5 DELIVERABLES ...... 18
1.2 HOW TO READ THE DOCUMENT ...... 19
2 BACKGROUND ...... 20
2.1 METHODOLOGY ...... 20
2.1.1 PROCESS MODEL ...... 20
2.1.2 APPLIED PROCESS MODEL ...... 22
2.1.3 EVALUATION MODEL ...... 23
2.1.4 QUALITATIVE RESEARCH ...... 24
2.1.5 APPLIED EVALUATION MODEL ...... 25
2.2 END USERS ...... 27
2.2.1 EU-PARLIAMENT ...... 29
2.2.2 GERMAN PARLIAMENT ...... 31
2.2.3 STATE PARLIAMENT ...... 31
2.2.4 LOCAL GOVERNMENT ...... 31
2.2.5 CITY ...... 32
2.2.6 PARLIAMENTARY PARTY ...... 32
2.2.7 NGO ...... 32
2.3 EXPERT ADVICE ...... 33
2.3.1 ADVISORY BOARD ...... 33
2.3.2 WORKSHOPS ...... 34
2.3.3 EVENTS ...... 34
2.3.4 EU COMMISSION ...... 34
2.4 CHRONOLOGICAL OVERVIEW OF STAKEHOLDER ENGAGEMENT CONTACTS ...... 35
3 EVALUATION OF THE PENULTIMATE TOOLBOX ...... 38
3.1 INTRODUCTION ...... 38
3.2 TOOLBOX 2.5 ...... 38
3.2.1 CHARACTERISTICS ...... 38
3.2.2 END USER FEEDBACK ...... 42
3.2.3 RESULTS ...... 42
3.2.4 ADVISORY BOARD FEEDBACK ...... 48
3.3 TOOLBOX 2.6 ...... 50
3.3.1 CHARACTERISTICS ...... 50
3.4 CONCLUSION OF THE EVALUATION OF THE PENULTIMATE TOOLBOX ...... 52
4 EVALUATION OF THE FINAL TOOLBOX ...... 54
4.1 TOOLBOX 3.0 ...... 54
4.2 CHARACTERISTICS PRE VERSION ...... 54
4.3 VALIDATION OF ANALYSIS RESULTS WITH END USERS ...... 56
4.3.1 INTRODUCTION ...... 56
4.3.2 AIMS OF THE EVALUATION ...... 56
4.3.3 METHODOLOGY AND SPECIFICATIONS ...... 57
4.3.4 RESULTS ...... 65
4.3.5 DISCUSSION ...... 80
4.3.6 CONCLUSION FOR VALIDATION OF ANALYSIS RESULTS WITH END USERS ...... 82
4.4 EVALUATION WITH THE EU-PARLIAMENT ...... 86
4.5 WORKSHOPS WITH SCIENTISTS, END USERS AND PRACTITIONERS ...... 89
4.5.1 INTRODUCTION ...... 89
4.5.2 2ND WEGOV WORKSHOP DURING EGOV12 CONFERENCE ...... 89
4.5.3 WORKSHOP WITH THE ...... 92
4.5.4 WORKSHOP WITH PRACTITIONERS FROM THE POLITCAMP12 ...... 93
4.5.5 ADVISORY BOARD INVOLVEMENT ...... 95
4.6 HEADSUP: TOPIC OPINION EVALUATION ...... 97
4.6.1 INTRODUCTION ...... 97
4.6.2 TOPIC ANALYSIS ...... 104
4.6.3 IDENTIFYING KEY USERS AND POSTS ...... 109
4.6.4 SENTIMENT ANALYSIS ...... 111
4.6.5 CONCLUSION OF HEADSUP: TOPIC OPINION EVALUATION ...... 118
4.7 CONCLUSION OF THE EVALUATION OF THE FINAL TOOLBOX ...... 121
5 END USER GUIDANCE AND CONDITIONS ...... 123
5.1 INTRODUCTION ...... 123
5.2 EXEMPLARY USAGE ...... 123
5.2.1 WHAT TALKS ABOUT THE CITY ON FACEBOOK? ...... 124
5.2.2 WHAT DISCUSSION IS TAKING PLACE ABOUT DAILY NEWS TOPICS ON THE SOCIAL WEB? ...... 125
5.2.3 HOW CAN E-PARTICIPATION PORTALS BE SUPPORTED? ...... 125
5.2.4 WHICH TOPICS ARISE ON A FACEBOOK PAGE? ...... 126
5.2.5 WHICH TOPICS ARISE WITHIN A FACEBOOK POST? ...... 127
5.2.6 QUICK CATCH UP ON A TOPIC VIA TWITTER ...... 127
5.2.7 TIME-BASED COLLECTION: LAUNCHING AND PUSHING TOPICS ...... 128
5.2.8 MONITORING OF POLITICAL SNS CAMPAIGNS ...... 129
5.2.9 INCREASED ALERTNESS DURING PRE-ELECTORAL PERIODS ...... 129
5.3 USER MANUAL ...... 129
5.4 QUESTIONS & ANSWERS ...... 129
5.5 WEGOV PRIVACY CONSIDERATIONS ...... 130
5.5.1 BACKGROUND ...... 130
5.5.2 KEY ISSUES ...... 131
5.5.3 CONCLUSIONS FROM KEY ISSUES ...... 137
5.5.4 IMPLEMENTATION RECOMMENDATIONS ...... 138
5.5.5 POLICIES ...... 145
5.5.6 CONCLUSION OF WEGOV PRIVACY CONSIDERATIONS ...... 150
6 GENERAL CONCLUSION ...... 151
REFERENCES ...... 154
A. PRE-TESTING – TOOLBOX 2.5 ...... 158
B. MATERIAL – TOOLBOX 2.5 ...... 161
C. PRE-TESTING – TOOLBOX 2.6 ...... 171
D. PRE-TESTING – TOOLBOX 3.0 ...... 173
E. MATERIAL – VALIDATION OF ANALYSIS RESULTS WITH END USERS ...... 175
F. MATERIAL – HEADSUP: TOPIC ANALYSIS EVALUATION ...... 204
G. MATERIAL – WEGOV PRIVACY CONSIDERATIONS ...... 212


List of figures

Figure 1: WeGov process model ...... 21
Figure 2: Socio-technical plan ...... 23
Figure 3: Evaluation and data evaluation model ...... 26
Figure 4: Toolbox 2.5 - evaluation cycle external evaluation ...... 42
Figure 5: Toolbox 3.0 - evaluation cycle validation of toolbox results ...... 56
Figure 6: Sample for an individual data profile ...... 59
Figure 7: Toolbox 3.0 - UI for Facebook monitoring ...... 59
Figure 8: Toolbox 3.0 - UI for Twitter monitoring ...... 60
Figure 9: Analysis process for the analysis report ...... 61
Figure 10: Querying Twitter for #PC12 ...... 82
Figure 11: Toolbox 3.0 - evaluation cycle topic opinion validation ...... 97
Figure 12: HeadsUp Analyser - forum selector ...... 101
Figure 13: HeadsUp Analyser - analysis results ...... 102
Figure 14: HeadsUp Analyser - topic group results ...... 103
Figure 15: examples of short positive posts ...... 113
Figure 16: post showing incorrect analysis of sentiment ...... 114
Figure 17: posts showing correct analysis of sentiment ...... 115
Figure 18: Facebook pages of all 23 districts in Vienna ...... 124
Figure 19: e-participation portal for the budget of Cologne ...... 125
Figure 20: Topic analysis for BBC website ...... 126
Figure 21: Topic analysis on BBC Facebook page ...... 127
Figure 22: Twitter dialogue with ...... 128
Figure 23: Roles associated with WeGov ...... 130
Figure 24: Toolbox 2.5 - evaluation cycle pre-testing ...... 158
Figure 25: Toolbox 2.6 - evaluation cycle pre-testing ...... 171
Figure 26: Toolbox 3.0 - evaluation cycle pre-testing ...... 173
Figure 27: posts showing incorrect analysis of sentiment ...... 206
Figure 28: posts showing correct analysis of sentiment ...... 206
Figure 29: examples of short positive posts ...... 210


List of tables

Table 1: Toolbox prototypes in a nutshell ...... 18
Table 2: End user engagement and methodology ...... 29
Table 3: Expert advice and engagement ...... 33
Table 4: Chronological Overview of Stakeholder Engagement Contacts ...... 37
Table 5: Toolbox 2.5 – characteristic ...... 41
Table 6: Toolbox 2.5 - evaluation results ...... 48
Table 7: Toolbox 2.6 - characteristic ...... 52
Table 8: Toolbox 3.0 – characteristic ...... 55
Table 9: Questionnaire page 3/6 – topics ...... 67
Table 10: Questionnaire page 4/6 - key users and posts ...... 69
Table 11: Questionnaire page 5/6 - posts to watch ...... 73
Table 12: Questionnaire page 5/6 - users to watch ...... 75
Table 13: Questionnaire page 6/6 - discussion activity ...... 77
Table 14: comparison of debate themes - report versus toolkit ...... 105
Table 15: comparison of keyword search versus toolkit ...... 106
Table 16: comparison of different analyses on the same data ...... 107
Table 17: comparing the number of posts excluded from analysis results ...... 108
Table 18: comparing quotes from the reports and key users from the analysis results ...... 110
Table 19: sentiment analysis of topic groups ...... 112
Table 20: percentage accuracy of sentiment analysis ...... 114
Table 21: Toolbox 2.5 - pre evaluation results ...... 160
Table 22: Toolbox 2.6 – results ...... 172
Table 23: Toolbox 2.6 – results ...... 172
Table 24: Locations (constituencies and areas) ...... 177


List of abbreviations

Abbreviation Explanation

CDU Christlich Demokratische Union (German)

CeDEM Conference for e-democracy

D5.1 WeGov deliverable D5.1 - Scenario definition, advisory board and legal/ethical review

D5.2 WeGov deliverable D5.2 – Initial Evaluation of the WeGov Toolbox

EC European Commission

e-citizens Citizens on the web

e-democracy Electronic Democracy

e-forum Electronic Forum

e-government Electronic Government

e-participation Electronic Participation

EPP European People’s Party

EU European Union

FP7 Seventh Framework Programme

Gov2u Government to you

ICT Information and Communication Technology

IMCO Internal Market and Consumer Protection Committee

IT Information Technology

MEP Member of European Parliament

MP Member of Parliament


MS Member States

PM Policy Maker

NGO Non Governmental Organization

NRW Nordrhein-Westfalen (German)

SNS Social Networking Sites

UK United Kingdom

URL Uniform Resource Locator

WeGov Where eGovernment meets the eSociety


Glossary

Term Explanation

Comments/posts Messages written by HeadsUp users in the forums and subsequently analysed by the WeGov toolkit.

Controversy How polarised the comments are in the debate. This ranges from 0 (no controversy – total agreement) to 10 (complete controversy – no agreement).

End Users External organisations and people who represent the potential users of WeGov – examples are members of the Bundestag, EC, and local governments we interacted with.

End User Partner WeGov project partners that are responsible for user scenarios and end user engagement. These are GESIS, The Hansard Society and Gov2u.

Key posts The posts that the toolkit deems to be most important to the debate.

Key users The users that the toolkit deems to be most important to the debate.

Policy Maker The governmental representative that is the target user of WeGov. They may be an MP, a local government employee – anyone who represents government in any form, and wishes to interact with citizens on the social web.

Relevance How closely related the individual posts are to the key words in a group. This ranges from 0 (no relevance) to 1 (the best match possible). Only posts greater than 0.5 are shown.

Report The manual analysis of the HeadsUp forums that we are using to judge the toolkit against. This was created by a human analyst at the time the forum was run.

Sentiment The degree of positivity or negativity displayed in comments or posts. Sentiment ranges from +10 (most positive) to -10 (most negative).


Stakeholder A stakeholder is a person or body of people who may influence decision-making. In the WeGov project we focussed on stakeholders in the field of government and NGOs. (Cp. 2.2 End Users)

Themes The key areas of discussion as documented in the HeadsUp reports.

Toolkit/WeGov/HeadsUp Analyser Names for the algorithms and user interface built to analyse the data from the HeadsUp forums.

Topic group The group of posts selected by the analyser as being most related to one another. These are characterised by keywords from the posts that represent the theme of the group.
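The scoring conventions defined above (relevance in the range 0 to 1 with a 0.5 display threshold, sentiment from -10 to +10) can be illustrated with a minimal sketch. The data layout, example posts and summary logic here are hypothetical illustrations of the glossary's definitions, not code from the actual WeGov/HeadsUp Analyser:

```python
# Illustrative sketch of the glossary's scoring conventions.
# The post tuples, threshold constant and averaging step are
# hypothetical - they only mirror the ranges defined in the glossary.

RELEVANCE_THRESHOLD = 0.5  # glossary: only posts with relevance > 0.5 are shown

posts = [
    # (text, relevance to topic-group keywords [0..1], sentiment [-10..+10])
    ("School meals should be free", 0.9, 6),
    ("Totally disagree, waste of money", 0.7, -5),
    ("Off-topic chatter", 0.2, 1),
]

# Keep only posts relevant enough to the topic group to be displayed.
shown = [p for p in posts if p[1] > RELEVANCE_THRESHOLD]

# Average sentiment of the displayed posts, on the -10..+10 scale.
avg_sentiment = sum(p[2] for p in shown) / len(shown)

print(len(shown))      # 2 posts pass the relevance threshold
print(avg_sentiment)   # 0.5 -> mildly positive overall
```

The point of the sketch is only that the two scales are independent: relevance decides whether a post is displayed at all, while sentiment describes the tone of whatever passes that filter.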


Executive summary

This document is the evaluation report of the final WeGov toolbox. The WeGov project developed five prototypes in total, instead of the initially planned two (a first and a final prototype). A three-month extension to the project duration allowed the consortium to update its strategy by creating more software iterations for stakeholder engagement, with increased opportunities for user feedback to be incorporated. The prototypes developed in the project are:

• prototype 1.0 implemented the use case of injecting posts into a Facebook group and analysing users’ feedback afterwards;

• version 2.0 introduced the concept of creating workflows, using this functionality for quicker search and analysis;

• toolbox 2.5 enabled multiple long-term searches on geographically restricted information from social networking sites;

• the 2.6 prototype implemented the functionality for analysing multiple long-term searches;

• and the final toolbox, version 3.0, combined the functionalities that were highlighted by stakeholders.

Prior to making the prototypes available to stakeholders, their functional capability was checked within the consortium and improved where necessary. The external stakeholders engaged range from the European Parliament, the German Bundestag, the State Parliament of NRW, German states, municipalities and political parties in Germany and Belgium, to a number of specific experts. The evaluation approach included face-to-face engagement, such as interviews, workshops and hands-on demos, as well as experiments to validate the accuracy and reliability of the WeGov analysis results against external data sets. The diverse end users and the experiments delivered valuable feedback on the toolbox in general, its usability, its functionality and the analysis output, and they suggested a number of use cases for integrating the WeGov functionality into the workflow of policy makers or their supporting staff. The socio-technical process of stakeholder engagement allowed the consortium to better understand and shape how the WeGov toolbox will be used; this included best-practice advice and a strategy relating to legal and ethical issues. Overall, the end users saw the WeGov tools as a valuable complement to existing social media tracking tools, responding to specific concerns related to the exploitation of social media for policy purposes.
Although the WeGov tools evolved considerably during this evaluation phase, some gaps remain with regard to general usability and to the transparency, visualisation and accuracy of the analysis results, mainly the topic-opinion analysis. A considerable number of end users expressed their interest in remaining engaged after the end of the WeGov project. The consortium has therefore taken a number of measures, for instance allowing them to continue using the tools for a certain period. They will also be considered priority stakeholders for any new evolution of WeGov.


1 Introduction

WeGov has been funded with support from the European Commission under the Seventh Framework Programme, theme ICT 2009.7.3 ICT for Governance and Policy Modelling. The project aims to provide the tools and techniques for closing the loop between policy makers and citizens, taking advantage of the plethora of well-established social networking sites. These tools will enable policy makers to move away from the limitations of the current practice of government-hosted websites and instead to make use of the high levels of participation and rich discussions that already take place in existing social networking communities.

The aim of work package WP5 – Stakeholder Engagement and Evaluation of the WeGov Toolbox – was to provide a final analysis of the tools developed by the technical partners against the requirements of the WeGov end users, supporting their engagement with citizens on the social web. Stakeholders and a methodology were therefore needed to turn a toolbox (topic-opinion analysis and user behaviour analysis) into an application that can be used in the everyday life of a policy maker. The methodology we used aimed to engage stakeholders with real-life problems and behavioural patterns on social networking sites, providing suggestions on how to fit the toolkit into the daily workflow of a policy maker.

This is the third and last deliverable in relation to WeGov user engagement. Where the first deliverable addressed scenarios and the second covered the evaluation of the first WeGov prototype, this deliverable includes feedback on functionality that was previously evaluated by stakeholders. The different methods used to evaluate the toolbox and the parties involved in the evaluation are detailed within this deliverable. When the WeGov project began in January 2010 there was no prototype providing analysis tools and no stakeholders who would provide suggestions on how to use the analysis components in a real-life environment.
The following table documents the prototypes that have been developed throughout the project.

Version "0"
  Functionality/use case: Show possible scenarios
  Level of engagement: Paper mock-ups and screenshots
  Launch for external evaluation: ---
  Groups participating: End user partner; Advisory Board; End users
  Feedback for the next development iteration: Policy makers inject posts on SNS to receive feedback on their statements
  Report: D5.1

Version 1
  Functionality/use case: Injection on Facebook and analysing the users’ comments
  Level of engagement: 1st demo version
  Launch for external evaluation: 31/03/2011
  Groups participating: End user partner; Review Meeting; End users
  Feedback for the next development iteration: PMs’ posts do not receive enough comments; PMs would rather use their mobiles for injection; PMs do not use multiple injections; the constituency is important
  Report: D5.2, D4.2

Version 2
  Functionality/use case: Tools created once can be run many times
  Level of engagement: 2nd demo version
  Launch for external evaluation: 25/10/2011
  Groups participating: End user partners; Scientific community (eChallenges2011); End users
  Feedback for the next development iteration: Workflows need to be improved; geographically restricted searches/analysis; more data for the analysis process
  Report: D5.3

Version 2.5
  Functionality/use case: Multiple long-term searches and analysis; geographical restriction on SNS
  Level of engagement: 1st “hands-on” version
  Launch for external evaluation: 14/03/2012
  Groups participating: End user partners; German Advisory Board; End users
  Feedback for the next development iteration: Multiple searches into one analysis; bigger or smaller radius than 10 km for local Twitter searches; improvement of widgets, more widgets that are easier to use
  Report: D5.3

Version 2.6
  Functionality/use case: Multiple long-term analysis development and geographical restriction on SNS
  Level of engagement: 2nd “hands-on” version
  Launch for external evaluation: 30/07/2012
  Groups participating: End user partners
  Feedback for the next development iteration: More improvement needed to functionality than usability -> richer results for the validation of analysis results
  Report: D5.3

Version 3
  Functionality/use case: Long-term analysis of multiple inputs, easy-to-use apps on the landing page, and recommendations of users and posts to reply to
  Level of engagement: Final “hands-on” version
  Launch for external evaluation: 27/08/2012
  Groups participating: End user partners; Scientific community (EGOV12); End users
  Feedback for the next development iteration: Recommendations on usability, use cases, dissemination, accuracy and quality of data (cp. this deliverable)
  Report: D5.3

1 The internal testing with the pre version of the toolbox started approximately two weeks beforehand.

Table 1: Toolbox prototypes in a nutshell

The first column shows the version numbers of the prototypes. The second column describes the main functionality for which feedback from end users was needed. The third column shows the level of engagement, starting with mock-ups describing scenarios and ending with clickable functionality. The fourth column shows the date when the prototype was launched, and the fifth column shows the groups that participated in the evaluation phase. The most important feedback from the participating groups is shown in column six. The last column shows the names of the three WeGov deliverables that cover the evaluation of the different prototype versions.

1.1 WeGov Work Package 5 deliverables

• D5.1 – Scenario definition, advisory board and legal/ethical review describes the first six months, in which scenarios were used as starting points to discuss the WeGov approach with stakeholders. For this we used mock-ups and screenshots, which is why this prototype is called version zero. [6] Results are also published in Addis et al. (2010) and Wandhöfer et al. (2011a).

• D5.2 – Initial Evaluation of the WeGov Toolbox describes month six until month 18. This deliverable covers the first WeGov demo version, which was kicked off with a workshop at the German Bundestag, and its use cases were evaluated with the method of semi-structured interviews. Since the deliverable is an internal one, we published its results in Joshi et al. (2011), Wandhöfer et al. (2012b), Geana et al. (2012), and Fernandez et al. (2012).

• D5.3 – Final Evaluation of the WeGov Toolbox is the present report and the last deliverable describing stakeholder engagement and the evaluation of toolbox versions that could be tested by stakeholders. Here we focus on the penultimate toolbox 2.5 and the final toolbox 3.0. First results and evaluation strategies are published in Wandhöfer et al. (2012a,c) and Joshi et al. (2012).


1.2 How to read the document

In the next section, we discuss the background to the evaluation process we used – why we chose it, who we engaged with, and so on. We then discuss the evaluation of the penultimate and final toolboxes (evaluation of earlier toolboxes is covered in previous deliverables). Each evaluation section contains the outcome of the evaluation, and we show how the feedback from evaluation has fed into the development of the next iteration. The next two sections concentrate on user guidance and best practice. First, we present a “user manual” for the WeGov toolbox, aimed at helping a prospective end user to get the most out of the toolbox. After this, we present our analysis of the privacy protection of social network users from the perspective of the end user policy maker. Finally, we draw general conclusions.


2 Background

2.1 Methodology

On the one hand, WeGov is a research project that has developed a web application to support policy makers in engaging with citizens on SNS. On the other hand, the WeGov toolbox is also a feasibility study for the use of automatic analysis components to engage with data from SNS. The challenge was to reconcile the politicians’ needs with the technical feasibility of the analysis components that were developed in the project. It was therefore necessary to engage policy makers from the beginning of the process, specifically in the design of the analysis tools. The development process needed to be continuous, with new iterations combining policy makers’ requirements with the technical feasibility of analysis tool development, as well as presenting and discussing software prototypes throughout. This process was accompanied by internal and external evaluations and validations that were conducted during events and several types of experiments.

2.1.1 Process Model

A long-term research and development project of course runs the risk of losing its stakeholders' interest if the engagement process is not managed properly. A further risk for WeGov was the fact that our end users are policy makers and members of parliaments who are extremely busy. Shifts in the political climate of Europe and regional/local elections can make it all the more challenging to sustain engagement with the same group of people throughout the project's lifetime. We therefore built into our methodology a process for stakeholder engagement that would provide a viable model in response to the constraints mentioned above. This model of engagement sustains interest from the stakeholders because it stresses the need for frequent reporting to them on project evolution, hands-on demonstrations, and the arrangement of face-to-face and virtual conferences or symposia where project findings could be debated with the immediate and wider stakeholder group. The rationale behind this was to encourage involvement from the stakeholders whose participation we sought within the project. This would also enable us to feed back to them how their suggestions, comments and views were integrated in the evolving prototype of the toolkit.

As a result of this approach, we succeeded in keeping a loyal core user group engaged during the full project duration. A number of initial candidates did drop out during the process for various reasons, but this was offset by a significant number of new stakeholders joining the trials during the project. Overall, WeGov ended the project with a substantial net increase in its trial user population.


Finally, a key concern that our stakeholders shared with us was the question: “what happens after the end of the project?” To address this fear of a 'pilot effect', it was essential within our engagement framework to brief our stakeholders in a clear and transparent manner about issues ranging from the IP (intellectual property) status of tools and resources developed within the project, to the project’s built-in sustainability and continuity measures that would allow them to exploit project outcomes long after the end of the project. In particular, we believe that the knowledge transfer and engagement with the WeGov toolkit would enable the end users to integrate social networks and citizen-centric policy making into their everyday work.

Figure 1: WeGov process model

The WeGov stakeholder engagement model considered the good stakeholder engagement principles of transparency, meaningful dialogue, expectation-management, feedback and analysis within its practical execution.2 This iterative engagement with stakeholders on the project’s evolution, progress and outcomes, allowed the final results to be firmly grounded and externally verified by the policy makers, meeting their needs and expectations.

2.1.2 Applied Process Model

The development plan of the WeGov toolbox provided five iterations, each producing a software version of the WeGov toolbox: one initial prototype, three improved prototypes and the final version of the toolbox. For each iteration, the prototype was presented to the end users and feedback was sought to improve the next version. The original project plan included two main stakeholder iterations for the development process, but with the launch of the initial prototype the consortium decided to present further prototype versions to end users before launching the final toolbox. The addition of three iterations allowed more effective engagement and increased the chance of developing functionality that fits into the policy maker’s everyday life, ensuring the toolbox is useful and effective.

Our set of end users was in effect self-selecting: of those approached, the ones that wished to engage with us were those with an interest in social networking. During the first period of the project, members of parliament were mostly considered for validating the toolbox, because they are directly or indirectly elected by the citizens to represent their interests within the decision-making process. Dialogue between citizens and members of parliament therefore seemed to be of common interest. However, our end user group included members from different kinds of parliaments as well as their office staff responsible for public relations and press work. These stakeholders came from the EU Parliament, the German Bundestag and from the State Parliament of Nordrhein-Westfalen in Germany. In addition, the group included stakeholders from parliamentary parties, cities and other public organizations. Most of the stakeholders were recruited at public meetings and conferences where the WeGov project was presented and discussed with participants. For example, the PolitCamp3 and the Open Government Camp4, which took place in Germany, were two excellent events for this purpose because their overall aim was to connect stakeholders in the economy and sciences.

Stakeholder engagement was therefore an ongoing process in which requirements were identified and the progress of software versions needed to be examined. Hence an open semi-structured expert interview was designed that allowed us both to verify the underlying assumptions about the stakeholders' daily work and to consider how well the current software

2 Good Stakeholder Engagement - Key Components of Stakeholder Engagement. URL: http://goo.gl/hoq2L (Retrieved May 2012)

3 URL: http://politcamp.org/ (Retrieved May 2012)

4 Open Government Camp. URL: http://goo.gl/BVYYT (Retrieved May 2012)


satisfied the previously defined requirements. For each interview, about 30 minutes were scheduled, and interviewees did not need to do any preparation. One part of the interview was designed to assess what efforts were already being made to use social network sites. In addition, the requirements for using SNS and the drawbacks of using these sites had to be determined. After getting a better idea of the current online engagement of politicians and the problems they faced, the WeGov solution could be made more consistent with real needs. Another part of the interview was a demonstration of the current prototype’s basic functionality. This part was mainly to demonstrate what kind of functionality was possible – we found that once people had an idea of what was possible they would ask for improvements and imagine new use cases – all useful feedback for our development.

2.1.3 Evaluation Model

The aim of the WeGov project is a toolbox that enables policy makers to better engage with citizens and civil society and to close the gap between them. This work package therefore developed a three-step model showing what the socio-technical process looked like.

Figure 2: Socio-technical plan

Within this process it was necessary to identify and understand the behavioural patterns: how policy makers interact with SNS, and what their expectations of and difficulties with this technology are. These patterns are generalized to start the socio-technical process of designing helpful tools for interacting with SNS users. The next step is to show policy makers the use cases, including the system's functionality and its usability, and allow them to evaluate it. This process is a kind of socio-technical harmonization to better understand behaviour and requirements and to improve the real-life functionality. The third step focuses on validating the analysis results that are provided by the software. This step is important in order to get feedback on the usefulness of the tool's results for the everyday work of a policy maker and to identify the parameters that influence this usefulness, allowing them to be fed into further improvements.


2.1.4 Qualitative Research

While qualitative methods are based on the evaluation and analysis of non-standardized data, quantitative methods are applied to the evaluation and analysis of standardized data samples. For Wilson (1989) the literature offers no answer as to which approach is best; it is rather a question of choosing the right method and how that method fits the needs of the task. Within the WeGov project we applied both qualitative and quantitative methods, and for the validation of analysis results we mixed the two approaches (cp. Chapter 4.3). The reason for this mix of methods is that politicians are difficult to reach; to increase the amount of stakeholder engagement we let the policy makers choose how to participate, thereby making participation easy for them. We found this to be more effective than attempting to dictate the method of engagement ourselves.

2.1.4.1 Expert interviews

The semi-structured interview is a method from the field of qualitative research. Using closed assessment questions, nearly every potential feature for the system can be polled. Whatever the answer, positive or negative, the interviewer can add an open question to understand the reasons for the interviewee's decision. Within the WeGov project, semi-structured interviews were therefore used in all steps: (1) the requirements gathering process, (2) the feedback process and (3) the validation process. Langer (1985) argues that verbally driven methods in qualitative research are especially strong, because the interviewees can describe their own viewpoint. The interviewees engaged within the WeGov project are all 'experts' – this means they are familiar with decision-making in the field of politics. That is why such interviews are often called expert interviews. Another common label is elite interviews.
Here Littig (2008) argues that these are nearly the same methods, because there are more similarities than dissimilarities between them. Mayring (1990) describes three classifications of interview formats:

• Open vs. closed interviews: In an open interview the interviewee has the freedom to answer without selecting an answer from a list. In a closed interview the interviewee must select an answer from a given list.

• Structured vs. unstructured interviews: The characteristic of an unstructured interview is that the interviewer can select the questions himself. In contrast, a structured interview instructs the interviewer which questions to ask.

• Qualitative vs. quantitative interviews: This classification depends on how the interview is to be analysed.


Within the WeGov project we chose open instead of closed interviews, unstructured rather than structured interviews, and qualitative instead of quantitative interviews. However, we had a set of well-defined objectives for the interviews – for example to elicit feedback on the current prototype and to find possible new use cases. We therefore labelled our interview method the “semi-structured interview”. We documented the interviews by recording them with a voice recorder or by writing bullet points with a follow-up summary. After the interviews we used the abstracting method described by Mayring (1990), in which the quantity of text is reduced by summarizing the major points. After this step of text reduction we used the method of building types (cp. Kluge, 2000) to model the most important end user requirements.

2.1.4.2 Workshops

The workshop is not a method from the field of qualitative research, but we used it in a way similar to the group discussion described by Mayring (1990): the group discussion is an additional method for engaging more than one interviewee in the evaluation process. The strength of this method is that individuals are often bound by their social environment, and discussion within a group helps to break down these boundaries and feed higher-level discussion. The following steps, described by Mayring (1990) for a group discussion, were used for the WeGov workshops:

• Questions: One part of the preparation is the design of questions that feed the open discussion and elicit the desired feedback.

• Discussion starting points: As background for a fruitful discussion, the workshop starts with a short presentation of the project, some theses on the workshop topic and a live demonstration. The live demonstration is used to lead into the open discussion.

• Open discussion: In the open discussion the attendees react by commenting on the live demonstration or on other comments from the audience. If there are no more comments from the audience, the chair asks one of the prepared questions or shows a new, provoking use case.

• Meta discussion: After the open discussion the chair concludes the discussion and starts a closing discussion on a higher level. After the workshop the results are summarized and shared with the attendees.

2.1.5 Applied Evaluation Model

While figure 2 shows the process that the WeGov project used to develop sustainable software whilst engaging end users, figure 3 shows the more tangible version of evaluating and validating this approach. Again this figure considers three steps that are important for the WeGov


evaluation process: (1) understanding the SNS behaviour and the interaction of stakeholders with users, (2) simulating use cases that might help stakeholders and getting feedback, and (3) validating the WeGov analysis results with the HeadsUp experiment and the control group.

Figure 3: Evaluation and data evaluation model

The idea behind this approach is to identify potential validation strategies that can be used as a “gold standard” for validating the WeGov software and its analysis results. The problem WeGov faced is that the data analysed was in flux, driven by current topics and the users on SNS. In addition, the effectiveness of the WeGov toolkit is strongly dependent on the quality of the SNS data. Nearly all stakeholders engaged during the project were very familiar with SNS interaction and especially with the data that is available throughout the social web. Their daily engagement with the social web meant they highly prioritized an understanding of the characteristics of users' behaviour within SNS and its subsequent influence on topics and data. For instance, the “controversy rating” in WeGov's topic analysis regularly categorises posts as neutral. While a non-SNS user might argue that there is a technical problem with the algorithm, five of the WeGov end user experts argued that this is what they would expect. The reason for less controversial discussions is related to user behaviour – many high-profile topics are


unemotional and uncontroversial because opinion leaders (e.g. the press) are very influential and lead discussions in only one direction. In addition to the validation of the toolbox by stakeholders, further experiments were conducted to validate the accuracy of the toolbox's results. While the stakeholders validated the tool with live SNS data, the HeadsUp experiment makes use of a pre-analysed data set to compare results and assess accuracy. This deliverable focuses on steps two and three with the software prototype versions 2.5 and 3.0. The first step was already completed with the first WeGov prototype (version 1.0); the methodology and results are published in Joshi et al. (2010) and Wandhöfer et al. (2011).
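To make the notion of a controversy rating concrete, the sketch below shows one simple way such a score could be computed from per-post sentiment labels. This is purely an illustrative toy measure under our own assumptions, not the algorithm used by the WeGov toolbox: a topic whose posts lean overwhelmingly one way gets a low score and is labelled neutral, matching the behaviour the end user experts described for press-led, one-directional discussions.

```python
from collections import Counter

def controversy_score(sentiments):
    """Toy controversy measure: 1.0 when positive and negative posts are
    perfectly balanced, 0.0 when all posts lean one way (or are neutral).
    `sentiments` is a list of labels: 'pos', 'neg' or 'neu'.
    Illustrative sketch only, not the WeGov algorithm."""
    counts = Counter(sentiments)
    pos, neg = counts["pos"], counts["neg"]
    if pos + neg == 0:
        return 0.0
    return 1.0 - abs(pos - neg) / (pos + neg)

def label(sentiments, threshold=0.5):
    """Classify a discussion by comparing its score to a threshold."""
    return "controversial" if controversy_score(sentiments) >= threshold else "neutral"

# A press-led topic where almost everyone agrees is rated neutral:
print(label(["pos"] * 18 + ["neg"] * 2))   # → neutral
# An evenly split debate is rated controversial:
print(label(["pos"] * 10 + ["neg"] * 9))   # → controversial
```

Under this toy measure, a high-profile topic dominated by one opinion direction is labelled neutral even though it is heavily discussed, which is consistent with the experts' expectation rather than a defect of the analysis.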

2.2 End Users

The stakeholders who were engaged during the WeGov project and shaped the development of the toolkit are summarized in the table below. Here we use N to show the total number of stakeholders that participated in the interview or in the questionnaire for one of the three toolbox evaluations. For workshops, N shows the number of workshops instead of the number of participants.

Level | Participant | Toolbox 1.0 | Toolbox 2.5 | Toolbox 3.0

EU Parliament | MP | Interview (N=1) | Interview (N=1) | Interview (N=1)

EU Parliament | MP's staff | Interview (N=4) | Interview (N=3) | Interview (N=3)
(One initially engaged MEP was elected in his own country; instead, the head of the web communications department of the DG Communications within the EU Parliament was engaged.)

German Parliament | MP | --- | --- | Interview (N=2), Questionnaire (N=2)

German Parliament | MP's staff | Workshop (N=1), Interview (N=11) | Workshop (N=1), Interview (N=7) | Workshop (N=1), Interview (N=8), Questionnaire (N=7)
(One office skipped participation because of illness.)

State Parliament | MP | Interview (N=1) | --- | Interview (N=2), Questionnaire (N=1)

State Parliament | MP's staff | --- | --- | Interview (N=1)

Local Government | State Chancellery | --- | --- | Interview (N=1), Questionnaire (N=1)
(One further state chancellery confirmed participation, but did not engage in time.)

Big city⁵ | Department for e-government | --- | --- | Interview (N=2), Questionnaire (N=2)
(One city confirmed participation, but did not engage in the end.)

Mid-size city | Department for e-government | --- | Interview (N=1) | One city confirmed participation, but did not engage in the end

Small city | Department for e-government | --- | --- | Interview (N=1), Questionnaire (N=1)

Parliamentary Party | Department for public affairs | Interview (N=1) | Interview (N=1) | New accounts were delivered on 2.10.2012⁶

NGO | Organizer / Department for public affairs | Interview (N=1) | --- | New accounts were delivered on 2.10.2012⁷

5 WeGov considers three sizes of city: a big city has more than one million citizens, a mid-size city has between one hundred thousand and one million citizens, and a small city has less than one hundred thousand citizens.

Table 2: End user engagement and methodology

The first column shows seven different governmental levels, including: the EU Parliament, a federal parliament (the German Bundestag), a state parliament (the State Parliament NRW), local government (Germany), three different sizes of cities, political parties, and an NGO.

The people WeGov has engaged with are represented in the second column. For instance, the parliament levels include the Member of Parliament and the Member of Parliament's staff. While the MP is the figurehead, the office employees typically interact with the social web using labels such as #team or #office rather than their own identity (cp. Wandhoefer et al., 2010). With respect to day-to-day interaction on the social web for press work and public relations, the office employees also engage with citizens, and the monitoring and exploitation of social networks is generally their responsibility. This is why they are important to engage as stakeholders within the WeGov project. Columns three, four and five show the methodology (e.g. interview) and the number of participants who influenced the three prototypes (versions 1.0, 2.5 and 3.0) that were shown to end users. While the end user partner Hansard Society acted mainly as an NGO and provided important feedback, the end user partners Gov2u and GESIS arranged the stakeholder participation. The different colours in the table show which stakeholders were contacted by Gov2u and which by GESIS.

2.2.1 EU-Parliament

The WeGov consortium has focused on the Internal Market and Consumer Protection Committee (IMCO) within the European Parliament. Within the EU Parliament, IMCO is the “Committee responsible for:

6 A party central office of one of the biggest people's parties in Germany; one parliamentary party of the State Parliament NRW

7 We have some very recent new external end users: A church organization in the city of Cologne; a blogger in the city of Dortmund; a music school located close to the city of Hamburg


1. coordination at Community level of national legislation in the sphere of the internal market and for the customs union, in particular: a. the free movement of goods including the harmonisation of technical standards, b. the right of establishment, c. the freedom to provide services except in the financial and postal sectors;

2. measures aiming at the identification and removal of potential obstacles to the functioning of the internal market; 3. the promotion and protection of the economic interests of consumers, except for public health and food safety issues, in the context of the establishment of the internal market.” 8

Three MEPs from the EPP (European People’s Party) Group showed an interest in being involved in WeGov. One MEP was personally involved in the interviews, whereas the others had their staff members engaged. “The EPP Group has been the largest political group in the European Parliament since July 1999. 271 Members of the European Parliament (MEPs) elected from the lists of EPP member-parties, comprise the EPP Group which represents some 36% of the total seats. The Group strives to resist the political priorities of in Europe and to advance the goal of a more competitive and democratic Europe, closer to its citizens.” 9 As additional stakeholders in this evaluation phase WeGov also engaged:

• The recently appointed social media coordinator of the EPP Group in the EU Parliament. He was a useful and interesting candidate: EPP members are not necessarily the ones that communicate most on social media, but their party seems to be the one that takes the potential of social media most seriously, having decided to invest in monitoring them and in finding the ways to best exploit them, among other purposes in preparation for the 2014 election.

• The Web Communications department within the Communications Directorate General of the EP. In contrast to MEPs, who use social media for personal outreach under their own responsibility, this team exploits social media on behalf of the institution of the European Parliament, whose social media accounts are among the most supported within Europe, considering all types of users.

8 URL: http://www.europarl.europa.eu/committees/en/imco/home.html (Retrieved on 1.10.2012)

9 URL: http://www.epp.eu/parliament.asp?z=5C5D (Retrieved on 1.10.2012)


2.2.2 German Parliament

The WeGov Consortium engaged two MPs and 14 MPs' office employees from the German Bundestag as end users at the level of federal parliamentary decision-making.

“The German Bundestag is elected by the German people and is the forum where differing opinions about the policies the country should be pursuing are formulated and discussed. The most important tasks performed by the Bundestag are the legislative process and the parliamentary scrutiny of the government and its work. The Members of the German Bundestag also decide on the federal budget and deployments of the Bundeswehr (Federal Armed Forces) outside Germany. Another important function performed by the Bundestag is the election of the German Federal Chancellor.”10 “The 17th German Bundestag has 620 Members, eleven more than in the last electoral term. The largest parliamentary group is the CDU/CSU with 237 seats, of which 22 are so-called ‘overhang mandates’, ahead of the SPD with 146 seats, the FDP with 93 seats, the Left Party with 76 seats and Alliance 90/The Greens with 68 seats.”11

2.2.3 State Parliament

The stakeholder chosen for the level of a state parliament is the State Parliament NRW, because Nordrhein-Westfalen is the biggest federal state in Germany. Here the engagement is with two MPs and an MP's office employee. “The Landtag of North Rhine-Westphalia is the state diet of the German federal state of North Rhine-Westphalia. It convenes in Düsseldorf and currently consists of 237 members from five parties. The current parties of government are a coalition of the Social Democratic Party (SPD) and the Alliance ‘90/The Greens (Die Grünen), supporting the cabinet of minister-president Hannelore Kraft.”12

2.2.4 Local Government

In response to requirements from the German Bundestag and the EU Parliament to improve locally restricted search and analysis functionality, GESIS extended the stakeholder reach to the level of local government in Germany. Here the engagement is with the State Chancellery of the Saarland13.

10 URL: http://www.bundestag.de/htdocs_e/bundestag/members17/index.html (Retrieved 17/09/2012)

11 URL: http://www.bundestag.de/htdocs_e/bundestag/function/index.html (Retrieved 17/09/2012)

12 URL: http://en.wikipedia.org/wiki/Landtag_of_North_Rhine-Westphalia (Retrieved 17/09/2012)

13 The official web page of the Saarland: http://www.saarland.de/ (Retrieved 2012/10/18)


“Germany is made up of sixteen Länder (singular Land, colloquially called Bundesland, for "federated state") which are partly sovereign constituent states of the Federal Republic of Germany. Land literally translates as "country", and constitutionally speaking, they are constituent countries. Often referred to in English by German speakers as "states". Berlin, Hamburg and Bremen are frequently called Stadtstaaten (city-states). The remaining 13 states are called Flächenländer (literally: area states).”14

2.2.5 City

Similar to the justification for engaging state chancelleries, to validate geographically restricted search and analysis results GESIS and Gov2u involved cities as end user partners for the WeGov project. The engagement covers cities of three different sizes: a small city with less than 100,000 citizens (Kempten15 in Germany), a mid-size city with more than 100,000 citizens (Gent16 in Belgium), and a big city with more than 1,000,000 citizens (Cologne17 in Germany). Here the engagement is with departments of e-government in Germany and Belgium.

2.2.6 Parliamentary Party

In parallel to members of parliaments, GESIS and Gov2u engaged one parliamentary party in Germany and one in Belgium. Here the engagement is with the head of the department responsible for online communication and public relations. “A parliamentary group, parliamentary party, or parliamentary caucus is a group consisting of members of the same political party or electoral fusion of parties in a legislative assembly such as a parliament or a city council. Parliamentary groups correspond to party caucuses and conferences in the United States Congress. A parliamentary group is sometimes called the parliamentary wing of a party, as distinct from its organisational wing.”18

2.2.7 NGO

At an early stage of the project the WeGov consortium interviewed a European association for the protection of consumer interests to assess their engagement with social networks and their interest in the WeGov toolbox. The organisation pursued their engagement

14 URL: http://en.wikipedia.org/wiki/States_of_Germany (Retrieved 17/09/2012)

15 URL for the official web page of the city of Kempten: http://www.kempten.de/ (Retrieved 2012/10/18)

16 URL for the official web page of the city of Gent: http://www.gent.be/? (retrieved 2012/10/18)

17 URL for the official web page of the city of Cologne: http://www.cologne.de/ (Retrieved 2012/10/18)

18 URL: http://en.wikipedia.org/wiki/Parliamentary_party (Retrieved 18/09/2012)


in WeGov by participating in the Advisory Board, representing the perspective of the consumer, or the citizen in general. Through their involvement in the internal validation steps of the toolbox, the project partners Hansard Society and Gov2u, both NGOs, gave feedback on how the toolbox can be exploited in the everyday work of civil society actors. The HeadsUp experiment (see section 4.6) is in itself a perfect illustration of the integration of the WeGov toolbox into their core business function.

2.3 Expert Advice

In addition to the stakeholders mentioned above, the WeGov project engaged the following groups, which provided advisory feedback. The different colours show whether Gov2u, GESIS, the Hansard Society or the whole WeGov consortium engaged an end user. Here we use N to show the total number of participants that engaged in the questionnaire for one of the five toolbox evaluations. For workshops and seminars, N shows the number of workshops and seminars instead of the number of participants.

Level | Status | Toolbox 1.0 | Toolbox 2.0 | Toolbox 2.5 | Toolbox 2.6 | Toolbox 3.0

Advisory Board | Members | Workshop (N=1) | --- | Workshop (N=1) | --- | Questionnaire (N=1)

Conference | Scientists | --- | Workshop (N=1) | --- | --- | Workshop (N=1)

Events | Practitioner | --- | --- | --- | --- | Workshop (N=1), Seminar (N=1)

EU Commission | Project Officer / Reviewers | Full day presentation | Full day presentation | --- | --- | Scheduled for November 2012

Table 3: Expert advice and engagement

2.3.1 Advisory Board

The WeGov consortium set up an Advisory Board, including the following people with complementary backgrounds and areas of expertise:

● Peter Winstanley - Information Management Unit, the Scottish Government


● Michael Dauderstädt - Economic and Social Policy, Friedrich-Ebert-Stiftung
● Thanassis Kountzeris - Observatory for the Greek Information Society
● Kostas Rossoglou – BEUC, The European Consumer Organisation
● Jan Linhart – Echo-Logic

Engagement was sought from the Advisory Board when toolbox 1.0, toolbox 2.5 and toolbox 3.0 were released. The AB was most involved in the middle of the project – before this we did not have much to show them, and in the latter stages of the project its involvement lessened again. The reduction of AB engagement towards the end of the project has not had a significant impact on the project's developments, as it is too late to make changes as a result of their recommendations, so their involvement in the middle of the project was well-timed. In addition, we have had large amounts of external engagement and feedback from end users, and this has offset any reduction in AB engagement.

2.3.2 Workshops

The WeGov toolbox was presented in two special sessions at conferences. The sessions were opportunities to get feedback, comments and questions about the project's philosophy and progress.

• The first WeGov workshop with the toolbox 2.0 was organized by Gov2u during the eChallenges conference 2011, in Florence (Italy).

• The second WeGov workshop with the toolbox 3.0 took place during the EGOV conference 2012, in Kristiansand (Norway).

2.3.3 Events

During September 2012 the final WeGov toolbox 3.0 was presented at two different events.

• The PolitCamp12 event (Berlin) was a workshop with practitioners, stakeholders and citizens.

• The Houses of Parliament event (London) was a discussion with MPs.

2.3.4 EU Commission

As part of the project setup, WeGov was asked to attend two intermediary project reviews by a Project Officer from the European Commission and three independent external experts: Dr. Keri Facer, Professor of Educational and Social Futures at the University of Bristol and visiting Professor at the Graduate School of Education, Exeter University; Mr Robert Link, Research Programme Officer at the University of Graz; and Mr Robert Woitsch, Head of the Innovation Group at BOC Asset Management. Their advice was taken into consideration when fine-tuning the user involvement method and extending the user engagement groups to a more diverse


range of policy makers, not only elected officials from parliaments, but also decision makers from regional and local authorities and political parties. To maintain their independence, they preferred to refrain from evaluating the toolbox prototype outside the specified review meetings.

2.4 Chronological Overview of Stakeholder Engagement Contacts

Table 4 presents a complete chronological recapitulation of all end user contacts that were set up to collect feedback from the beginning to the end of the WeGov project.

Year | Month | Event / End User | Method | Partner | Prototype

2010 | March | Presentation of Dr. Marcus Menzel (Abstractor of a MP of the Bundestag) at GESIS, in Mannheim (Germany) | Discussion | GESIS |
2010 | March | Event PolitCamp10, in Berlin (Germany) | Presentation & discussion | GESIS |
2010 | Aug | Meeting with two offices of MPs and Fritz Rudolf Körper (German Bundestag), in Berlin (Germany) | Presentation & discussion | GESIS |
2010 | Sep | Event Government 2.0 Camp, in Berlin (Germany) | Presentation & discussion | GESIS |
2010 | Sep | Meeting with the Deputy Head of Unit of Unit A.1: "Institutional Relations & Communication", DG Health and Consumers in the European Commission | Discussion | Gov2u |
2010 | Oct | Research organisation Q| Agentur für Forschung, in Mannheim (Germany) | Presentation & discussion | GESIS |
2010 | Oct | Presentation of Young Germans at GESIS, in Bonn (Germany) | Discussion | GESIS |
2011 | March | German Bundestag, in Berlin (Germany) | Workshop | GESIS | Toolbox 1.0
2011 | March | Meeting with the Spokesman of the European Parliament, and the Head of Web Communications of the DG Communications within the EU Parliament | Discussion | Gov2u |
2011 | June | State Parliament NRW - MP Stefan Engstfeld | Interview | GESIS |
2011 | June | German Bundestag - six MPs' offices | Interview | GESIS |
2011 | June | Parliamentary Party | Interview | GESIS |
2011 | July | Think tank Co:llaboratory, in Berlin (Germany) | Presentation & discussion | GESIS |
2011 | July | Meeting with the Chairman and the Head of Unit of the Secretariat of the European Parliament IMCO Committee to identify the potential end user MEPs | Discussion | Gov2u |
2011 | July | Meeting with 3 MEPs and/or their staff | Presentation & discussion | Gov2u |
2011 | Sep | Open Government Camp | Presentation & discussion | GESIS |
2011 | Oct | Conference eChallenges, in Florence (Italy) | 1st WeGov workshop | Gov2u | Toolbox 2.0
2011 | Dec | German Bundestag | Evaluation planning | GESIS |
2012 | March | German Bundestag | Workshop | | Toolbox 2.5
2012 | April | Meeting with MEP and/or their staff | Discussion and demo | Gov2u |
2012 | April | Meeting with the Social Media Coordinator of the EPP Group in the European Parliament | Discussion and demo | Gov2u |
2012 | May | Meeting with the Citizen Participation Coordinator and the Communications department of the Belgian city of Gent | Discussion and demo | Gov2u |
2012 | May | Meeting with the Communications Director and the Social Media Coordinator of the Green Party in Wallonia - Belgium | Discussion and demo | Gov2u |
2012 | June | German Bundestag - 7 MP offices | Presentation and interviews | GESIS |
2012 | July | | | | Toolbox 2.6
2012 | Aug | City of Cologne | Presentation, questionnaire, interview | GESIS | Toolbox 3.0
2012 | Aug | State Parliament NRW - MP Stefan Engstfeld | Questionnaire, interview | GESIS |
2012 | Sep | City of Kempten | Questionnaire, interview | GESIS |
2012 | Sep | State Parliament NRW - MP Matthi Bolte | Questionnaire, interview | GESIS |
2012 | Sep | Conference EGOV, in Kristiansand (Norway) | 2nd WeGov workshop | GESIS |
2012 | Sep | Event Westminster, in London (UK) | Discussion | Hansard Society |
2012 | Sep | Event PolitCamp12, in Berlin (Germany) | Workshop | GESIS |
2012 | Sep | Meeting with MEP and their staff | Discussion and demo | Gov2u |
2012 | Sep | Meeting with the Social Media Coordinator of the EPP Group in the European Parliament | Discussion and demo | Gov2u |
2012 | Sep | Meeting with the web communications department of the DG Communications within the EU Parliament | Discussion and demo | Gov2u |
2012 | Oct | Local Government - Saarland | Questionnaire, interview | GESIS |

Table 4: Chronological Overview of Stakeholder Engagement Contacts


3 Evaluation of the penultimate Toolbox

3.1 Introduction

The aim of having three successive software prototypes (2.5, 2.6 and 3.0) was to iteratively improve the WeGov toolbox so that it addressed the needs of the target end users (the policy makers) well and fitted into their daily workflow. The WeGov partners (mainly GESIS, Gov2u and Hansard Society) therefore tested the system as a first step. The benefit of this “pre-evaluation” is that the end-user project partners became familiar with the prototype, found and reported bugs, and identified important gaps, so that end users would not focus on known technical limitations in their feedback – we wanted them to concentrate on functionality and on exploring the uses of the system. Hence, the pre-evaluation was a very important process that increased the quality of the system before it was shown to external users, and therefore the quality and usefulness of end user feedback was higher than it would have been without this step. The pre-evaluation started approximately two weeks before the main evaluations with end users were conducted. During the pre-evaluation GESIS and Gov2u started the preparation for the end user evaluation, while IT Innovation fixed the technical bugs. The structure of the following evaluation report therefore includes:

1. a pre-evaluation for identifying the technical bugs and usability problems;
2. the preparation of the evaluation with external end users;
3. the main evaluation itself, with results from the end users.

3.2 Toolbox 2.5

Prototype 2.5 of the toolbox was delivered for the second evaluation round with external users, and was used for demonstrations and hands-on assessment.

3.2.1 Characteristics

Version: 2.5 (for internal use only)

Launch: 24.02.2012

Pre-evaluation: 24.02.2012 - 14.03.2012

Features:

Technical restrictions:

• The toolbox is optimized for the following browsers: Firefox, Chrome, Safari.
• A maximum of 100 tweets and 1700 Facebook posts can be retrieved in a single

 WeGov Consortium

Page 38 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

search  To query Twitter data, no registration is required. For Facebook queries it is necessary to be logged in with a Facebook account. WeGov will automatically detect the connection to Facebook and use this to query data from Facebook. Login Page: The site allows switching between English and German interface.

Home page (initially called “dashboard”)

The user settings dialog allows name, organisation and password to be amended.

The home page is composed of a number of widgets that allow the user to personalise the page and configure a number of criteria for frequent searches. It permanently displays updates of the related analysis results.

These widgets can be configured (e.g. change the search word), hidden, deleted or duplicated in order to compare results. The list of widgets is:

● Current Location: shows the user’s current location which is determined automatically based on his IP address and Google maps ● My saved Locations: allows the user to add other locations to restrict search to specific locations ● Main Topics: topic analysis on Twitter, based on search terms ● Main local Topics: geographically restricted topic analysis on Twitter, based on search terms ● Recent local Posts: displays the latest tweets about a specified key term released near your current location. Recent local search results can be viewed on a single page by clicking at the name of the widget. That page contains all collected posts, some extracted information and history below which will display previous searches from that widget. ● Facebook Posts for User or Group: shows the 25 most recent posts on a specified Facebook page or open group ● Comments on Facebook Post: displays all comments on a specified Facebook post ID ● Facebook Posts Topics: analysis of topics discussed on a specified Facebook post ID with indication of key users for each topic ● Facebook Topics for Latest Post: analysis of topics discussed on the latest post of a specified Facebook page or group ID ● User Roles: pie chart showing allocation of different roles of users after a search on Twitter, according to their posting behaviour ● Users for a certain role: allows users to be found that fulfill a particular role Some widgets allow integrated analysis from other websites:

● Trending now: Twitter-generated topics, that Twitter users are currently discussing in a limited set of locations are displayed ● Peer Index for a given person. These features are illustrated in Table 4 below.
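The User Roles widget above classifies Twitter users by their posting behaviour. The report does not document the classification model itself, so the following is a purely illustrative sketch: the role names and thresholds are invented assumptions, not WeGov's actual behaviour analysis.

```python
from collections import Counter

def classify_role(tweets_sent, replies_sent, retweets_sent):
    """Assign a coarse behaviour role to a user.

    The role names and thresholds are illustrative only; the actual
    WeGov behaviour-analysis model is not specified in this report.
    """
    total = tweets_sent + replies_sent + retweets_sent
    if total == 0:
        return "lurker"
    if retweets_sent / total > 0.5:
        return "amplifier"      # mostly forwards others' content
    if replies_sent / total > 0.5:
        return "discussant"     # mostly engages in conversations
    return "broadcaster"        # mostly posts original content

def role_distribution(users):
    """Aggregate roles for (tweets, replies, retweets) tuples, e.g. to
    feed a pie chart like the User Roles widget."""
    return Counter(classify_role(*u) for u in users)
```

A pie-chart widget would then simply plot the counts returned by `role_distribution`.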


Search page

Ability to search and run an analysis based on a search term:

● everywhere, or restricted to the user’s own location or one of the saved locations
● on Twitter only
● topic analysis only uses English-language models
● only on-the-fly searches are possible

The results of the search run are displayed in 3 tabs:

● Search results: displays the list of retrieved posts
● Topic analysis: sorting of comments and users into the different topic groups; for each topic group 3 key users and about 3 key posts are displayed
● Behaviour analysis: discussion activity (graphical presentation of the analysed tweets over a timeframe); display of the Top 5 Users and Top 5 Posts to watch; classification of users according to their behaviour and interactions within social networks

Display of search activity history
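The discussion-activity view plots the analysed tweets over a timeframe. Below is a minimal sketch of the time-binning such a chart needs, assuming each post carries a datetime timestamp; the function name and bucket choices are illustrative, not taken from the WeGov code.

```python
from collections import Counter
from datetime import datetime

def activity_histogram(timestamps, bucket="hour"):
    """Bin post timestamps so they can be plotted as a discussion-activity
    curve. `timestamps` is an iterable of datetime objects; `bucket`
    selects the resolution ("hour" or "day")."""
    fmt = "%Y-%m-%d %H:00" if bucket == "hour" else "%Y-%m-%d"
    counts = Counter(ts.strftime(fmt) for ts in timestamps)
    # Sort chronologically so the bins can be drawn left to right.
    return dict(sorted(counts.items()))
```

The resulting dictionary maps each time bucket to a post count, which is exactly the data series behind an activity chart.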



Table 5: Toolbox 2.5 – characteristics

During the 3 weeks prior to the external release, the user engagement partners performed in-depth tests to eliminate a number of technical shortcomings and to ensure that the potential of the technical features was fully understandable to policy makers. The user engagement partners could anticipate many of the external users’ potential remarks, mainly on usability aspects and on missing functionalities that were still under development. Although not all concerns expressed by the end user partners could be solved immediately by the development team, a number of fixes were introduced, for instance by creating some additional widgets or by making the titles of widgets clickable to drill down to the underlying retrieved data. It was also decided to address the general usability issue by writing a detailed offline user guide document. More detailed information about these tests can be found in Appendix A of this document.

3.2.2 End User Feedback

3.2.2.1 Introduction

Figure 4: Toolbox 2.5 - evaluation cycle external evaluation

Activities

The end users were involved either through collective workshops or through semi-structured interviews and hands-on training. If the end users were sufficiently interested, they were also given access rights to further test the toolbox on their own.

Workshops: presentation at the German Bundestag with a software demo, followed by an open discussion.

Semi-structured interviews with:
● 1 Member of the European Parliament (MEP)
● staff members of 2 MEPs
● the SNS coordinator of the EPP Group in the European Parliament
● 7 staff members of elected officials in the German Bundestag
● 2 representatives of the Belgian city of Gent (approximately 250,000 inhabitants)
● 2 representatives of the French-speaking Belgian green party (Ecolo)

The interviews were based on a common standard questionnaire, and mainly addressed:
● the usability of the toolbox in general
● the ease of understanding the offered functions and analyses
● the benefits of the offered features and analyses for their everyday workflow
● their perception of the usefulness and reliability of analysis results
● features that were not available that they would like to see in a toolbox like WeGov.

3.2.3 Results

Overall, the top 5 perceived unique advantages of WeGov were:
● the toolbox concept (search page vs. widgets)
● Facebook monitoring, as not many existing tools allow this
● the ability to perform a local search, allowing the user to concentrate on their own constituency
● quick visualisation of the user types, with the role information explanation
● the activity diagram, provided it can be applied to long-term analysis

The main concerns related to the overall usability of the toolbox, the understanding of how the analyses function and the interpretation and integration of the analysis results.

Below is the consolidated overview of the end user feedback:

Each entry gives the toolbox functionality concerned, the end user’s concern, and its integration in the WeGov development process.

● Overall concern: User interface needs to be presented in several languages. Response: The WeGov interface is made available in 2 languages, to show that it is possible to deploy more languages at a further stage.

● Overall concern: Interactivity of WeGov – is it possible to react to influential posts or users directly from WeGov? Response: Version 3.0 provides the option to answer tweets or retweet directly from the WeGov interface, based on recommendations from the output of analysis.

● Overall concern: Would like to monitor more SNS, like YouTube, Wer-Kennt-Wen or blogs. Response: WeGov is limited to 2 SNS to demonstrate the concept, but there is the potential to add more in the future.

● Overall concern: Retrieval of 100 tweets in 1 run is insufficient; the user should have the option to determine the number of retrieved tweets. Response: Prototype 2.6 allows the user to determine the number of tweets, with a maximum of 1500 per run. In addition, schedules can be set up that collect many thousands of posts over time.

● Overall concern: Concern about the need to log into a Facebook account to activate Facebook search. Response: The FAQ must clearly explain that WeGov does not affect the Facebook account that is used to trigger the search.

● Overall concern: How to find “hot” discussions or groups on Twitter and Facebook. Response: The development team has not found a way to do general local Twitter searches without a search term, but this is not deemed a priority as other solutions already exist, such as http://trendsmap.com/. The external “Trending now” widget offers only a partial solution, as it is not available for all European cities. Not being based on key words, the Facebook topic analysis is supposed to identify the hot discussions.

● Overall concern: Need for benchmarking and monitoring the MP’s activities and impact within the social web – what are people saying about the policy maker, beyond just their official Twitter tag. Response: Can be addressed by using the “Recent local Posts” widget with the policy maker’s name as search term, and with the Peer Index widget.

● Overall concern: Geographically restricted search is not precise enough; there needs to be a better correspondence between the search area and the policy maker’s constituency territory. Local authorities request monitoring of SNS by city district. Response: Prototype 2.6 allows the user to determine a radius for search around a certain location.

● Overall concern: Mobile and tablet interface for WeGov. Response: As it soon became clear that WeGov in its current conception is rather a back-office tool, we did not consider mobility a priority for this project. It might however be an option worth analysing for further product evolution.

● Overall concern: The search results page should better indicate how many posts have been retrieved and during what period. Response: As of prototype 2.6 the result pages show a number of parameters, like the number of retrieved posts, the collection period, hashtag lists for tweets, etc.

● Overall concern: The ability for a policy maker to inform his friends and followers that he is using WeGov. Response: This is not considered a priority for WeGov.

● Overall concern: Can the trial users continue to use the tools at the end of the WeGov project? Response: The consortium will host the tool for at least 1 year after the end of the project, so the users can continue to use it.

● Overall concern: Can WeGov foresee some kind of alert function on certain themes (like Google Alerts)? Response: This is not addressed as a priority for the next versions of the prototype.

● Overall concern: The users are concerned about data security and about privacy laws. Response: The FAQ document will explain how the user’s data is stored in a personal account without WeGov being able to access it. It also summarises the main WeGov recommendations for privacy protection and refers to the specific Privacy Protection document created by the WeGov consortium.

● Home page: Instability in finding the current location. Response: This works correctly as from prototype 2.6.

● Home page: Ability to handle all location parameters in a single widget. Response: In later prototype versions both existing location widgets will be merged into a single widget.

● Home page: Widgets are not intuitive, their number is confusing and their labels are not clear enough. Response: Prototype 2.6 offers several answers: a search widget can be made the input to an analysis widget; widgets with different functions are presented in different colours; widgets are easier to configure, create and duplicate to compare results; widgets of different sizes are allowed.

● Home page: More combination possibilities for Facebook analysis, like running topic-opinion analysis on the comments of one post. Response: This widget has been released within prototype 2.5.

● Advanced search: The users want to have the same analysis possibilities for Facebook as for Twitter. Response: This is made possible from version 2.6 onwards.

● Advanced search: The users want to know where and for how long the search and analysis data can be stored. Response: Version 2.6 will store all search history in the user’s account, and allow them to reload any historical search results or analysis.

● Advanced search: Need to do long-term analysis on large data sets. Response: Version 2.6 uses a background process (scheduled activity, automatic collection) that collects posts over a specified period of time. The new versions of the prototype allow previous runs to be displayed and reloaded to combine analyses on different runs. Scheduled SNS collection makes it necessary to cope with possible collection of duplicate posts.

● Advanced search: The different panes in the result page are too small. Response: They will be enlarged in the next versions to show more results on one page.

● Advanced search: Results and scores need better description and labelling; the general usability of the topic analysis is poor. Response: This will be addressed in the next versions through better labelling of the results, a more visual or graphical presentation of some scores, and additional explanations in an FAQ section. In future versions WeGov will also display all posts in a topic and not just the top 3, and provide the possibility to sort by different criteria, e.g. relevance, positivity. Scores from topic analysis will be presented in a graphical format.

● Advanced search (behaviour analysis, topic analysis): To trust the analyses, users want to understand the algorithms that lead to the results. Response: A user guide and an FAQ section explain the underlying criteria and mechanisms.

● Advanced search (topic analysis): Requests for more flexibility in presenting topics through the choice of word types – avoiding stems, meaningless words or repetition of the search term in the topic; nouns or nouns and adjectives only, etc. Response: Stems will be replaced by the most frequent words including that stem.

● Advanced search (topic analysis): Requests for more flexibility in presenting topics through a user-determined number of topics. Response: Prototype 2.6 will allow the number of topics to be configured by the user; this will later be replaced by the display of as many topics as the analysis decides.

● Advanced search (behaviour analysis, topic analysis): Advanced analysis in the user’s language. Response: As it takes a long time to retrain behaviour models in another language, prototype 2.6 will only enable the topic analysis in German, which demonstrates that WeGov can be multilingual. The exploitation plan contains an analysis of the time and effort required to train for another language.

● Advanced search (behaviour analysis, activity diagram): Is it possible to click through to the underlying post from dots on the activity diagram? Response: This is not a priority for WeGov.

● Advanced search (behaviour analysis, Top 5 Users and Top 5 Posts): What are the exact criteria to determine influential posts and users? Response: A user guide and an FAQ section explain the underlying criteria and mechanisms.

● Advanced search (behaviour analysis, Top 5 Users and Top 5 Posts): What is the meaning of the scores generated for Top 5 Users and Top 5 Posts to watch? Response: A user guide and an FAQ section explain the underlying criteria and mechanisms.

● Overall concern: Combination of several Facebook pages into a single analysis. Response: This will be made possible as from prototype 3.0 by checking multiple runs from the search history list.

● Advanced search (behaviour analysis, user roles): The user role pie chart must allow the user to retrieve the underlying tweets. Response: Prototype 3.0 will supply a detailed list of the users in each role.

Table 6: Toolbox 2.5 - evaluation results

Additionally, the trial users expressed some concerns about the origin of posts and the content value of SNS conversations:
● Friends on Facebook are often members of the same party, politicians, organisations and journalists, or citizens from the MP’s constituency
● The Twitter space is mainly occupied by professional journalists and policy makers, rather than by citizens
● What is the representativeness of the SNS-savvy citizen?
● Citizens sometimes react to policy makers’ posts via other channels like telephone, email etc.
● Finding “relevant” content (groups, comments, pictures) on Facebook is difficult
● How should users cope with abusive and irrelevant posts?
● A typical reaction from a local authority: should the same service level be given to citizen interaction on SNS as to interactions through more traditional media?

3.2.4 Advisory Board Feedback

Prototype 2.5 was presented during a 2-hour online Advisory Board meeting in April, with the following agenda:
● demonstration of the UI and functionality of the current version of the WeGov toolbox (prototype 2.5)
● overview of the plans for the development of the next version of the WeGov toolbox (prototype 2.6, due at the end of June 2012)
● Q&A discussion, feedback on the WeGov prototype 2.5 and proposals for the development of prototype 2.6

The following questions were submitted to the members of the AB:
● feedback and comments on the usability of the toolbox
● feedback on our general direction and the approach of develop, release, evaluate, and iterate
● specific comments on the possible applications you could see the toolkit being used for
● any ideas for new widgets and ways of displaying data

Due to time zone problems and unexpected travel for some members, only one AB member managed to attend the call. His major feedback was in line with the results of the internal and external stakeholder evaluation; it mainly concerned documentation of the analyses performed and ways to make the rather cryptic analysis results exploitable.


3.3 Toolbox 2.6

3.3.1 Characteristics

Version: 2.6 (project-internal only)

Launch: 4.7.2012

Pre-evaluation: 4.7.2012 - 30.7.2012

Features: this is the list of WeGov 2.6 improvements over version 2.5.

Home page

● Widgets can be more easily created and configured from the widget page
  o Previously a widget had to be “duplicated”
  o Also, in 2.5, if a widget was deleted, there was no way to create a new one of the same type
  o The size of the widget is variable

● The widgets are now colour-coded
  o A blue border indicates a “search” widget
  o An orange border indicates an “analysis” widget
  o A white border indicates a configuration widget
  o The colour of the centre panel of a search widget is configurable by the user, and any analysis widgets that take their input from that search widget inherit the search widget’s centre panel colour, so the user can easily see how searches and analyses are related.

● All widget data is now stored in the WeGov database, and data is refreshed (i.e. collected from social networks) manually by the user.
  o Previously, widgets collected new data from Facebook or Twitter every time the widget page was refreshed. This was wasteful in terms of bandwidth and queries on the social networks. Twitter places limits on the number of queries per hour, and this limit can be reached fairly quickly if a number of widgets are on the main page.

● Analysis widgets’ input can be the results of a search widget
  o The user can create analysis widgets of different types and configure them so they all take input from the same search.
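Storing widget data in the database and refreshing only on explicit user request is what keeps the toolbox inside Twitter's hourly query limits. A minimal sketch of that cache-and-manual-refresh pattern follows; the class and method names are invented for illustration and do not come from the WeGov codebase.

```python
import time

class CachedSearchWidget:
    """Sketch of the 2.6 widget behaviour: results are kept locally and
    only re-fetched when the user explicitly refreshes, instead of on
    every page load. `fetch_fn` stands in for the social-network query."""

    def __init__(self, fetch_fn):
        self.fetch_fn = fetch_fn
        self.results = None
        self.fetched_at = None

    def render(self):
        # A page reload reuses the cached data: no network query is issued.
        return self.results

    def refresh(self):
        # Only an explicit user action spends a query against the
        # social network's rate limit.
        self.results = self.fetch_fn()
        self.fetched_at = time.time()
        return self.results
```

Any number of analysis widgets can then read `widget.results` without triggering further queries.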


Search has become “Advanced Search”

● The advanced search tool and UI have been further developed to support collection of more posts:
  o Collects multiple pages of results within a single search (configurable by the user)
  o Results are now stored in raw JSON format (less restrictive than a structured schema, and matching the storage used by widgets)
  o Repeated searches can avoid duplicate results
  o Improved reliability and performance
  o Re-integrated with the Coordinator, Quartz scheduling, etc.

● Advanced search now collects seed posts in a Facebook group and their comments:
  o Repeating searches can be scheduled so the group’s activity is collected automatically.
  o The user can instruct the tool to avoid collecting duplicate comments, so each iteration of a repeated search only collects new comments.
  o The user has the option to configure a “collection window” for seed posts and comments. This means that the user can specify the N most recent seed posts to collect comments from (currently this defaults to 10). The window also helps to avoid unnecessary requests, as older posts generally receive few (or no) new comments as time progresses.
  o By adjusting the collection schedule parameters (in particular the period between collections) and the N most recent seed posts to collect comments from, the user can balance collection efficiency and coverage of comments.
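The seed-post collection window and duplicate avoidance described above can be sketched as follows. The class and the `fetch_seed_posts` / `fetch_comments` callables are hypothetical stand-ins for the real Facebook queries, not WeGov's actual implementation.

```python
class GroupCollector:
    """Sketch of repeated Facebook-group collection with a seed-post
    window and duplicate avoidance, as described for prototype 2.6."""

    def __init__(self, fetch_seed_posts, fetch_comments, window=10):
        self.fetch_seed_posts = fetch_seed_posts
        self.fetch_comments = fetch_comments
        self.window = window            # N most recent seed posts to revisit
        self.seen_comment_ids = set()   # duplicate avoidance across runs

    def run_once(self):
        """One scheduled iteration: returns only comments not yet stored."""
        new_comments = []
        for post in self.fetch_seed_posts()[: self.window]:
            for comment in self.fetch_comments(post["id"]):
                if comment["id"] not in self.seen_comment_ids:
                    self.seen_comment_ids.add(comment["id"])
                    new_comments.append(comment)
        return new_comments
```

Calling `run_once` on a schedule yields only the comments that appeared since the previous run, which is the balance between efficiency and coverage the text describes.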

The advanced search page has been further improved:
  o Client-based searches replaced by execution via the search tool
  o Results displayed using data from the database
  o Automatic execution of analysis (topics, behaviour) using results from the database
  o Scheduling options implemented, e.g.
    § repeated execution over a time interval in minutes, hours or days
    § an avoid-duplicates option
  o Pop-up to display currently scheduled activities and their status (includes a delete-schedule option)
  o Search history improved, to show details of activities and runs
  o Automatic (and manual) refresh of the table is available
  o Facebook collections now supported (previously only client Twitter search was available)
    § Dynamic update of available parameters to reflect the selected SNS


    § Look-up of the entered Facebook group/page id, displaying user-friendly details
    § Use of the user’s Facebook login to authorise the search
  o Selection of English or German language (used by Twitter only)

Topic analysis has been updated to support new features:
  o The number of posts in a topic is displayed
  o The distance of each topic from other topics (so that related topics can be grouped)
  o Sentiment is now included – the average positivity / negativity of posts is indicated
  o A “controversy” metric is included. This is the spread of sentiment for the posts in a group – if the posts are all positive or all negative, controversy is low, but if there is a significant spread of sentiment, controversy is said to be high.
  o The topic analysis now has a “Further Analysis” page, which shows the topics and lets the user sort by number of posts, sentiment, and controversy. This is currently work in progress and the final version is aimed for version 3.0.
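The controversy metric is described only qualitatively as the spread of sentiment within a topic. The sketch below shows one way such a summary could be computed; using the population standard deviation as the spread measure is an assumption of this example, not WeGov's published formula.

```python
from statistics import mean, pstdev

def topic_sentiment_summary(post_sentiments):
    """Summarise a topic's posts given per-post sentiment scores in
    [-1.0, 1.0]. The average positivity/negativity is the mean; the
    report describes controversy only as the spread of sentiment, so
    the population standard deviation used here is an illustrative
    choice, not WeGov's actual formula."""
    return {
        "sentiment": mean(post_sentiments),       # average positivity/negativity
        "controversy": pstdev(post_sentiments),   # low if uniform, high if split
    }
```

With this choice, a topic whose posts are all positive scores zero controversy, while an evenly split positive/negative topic scores the maximum, matching the qualitative description above.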

Table 7: Toolbox 2.6 - characteristics

As the final toolbox would be issued 4 weeks after the delivery of prototype 2.6, it was decided to submit it to internal evaluation only. Much experimentation took place on the ideal presentation of the results of the topic-opinion analysis, such as the ideal number of topics to be displayed, how to show the distance between topics, and how to display controversy and sentiment. Some search parameters were further refined, like the possibility to specify a radius around the current location. The toolbox was also subjected to stress tests to assess performance when collecting and analysing many posts. More detailed information about these tests can be found in Appendix B of this document.

3.4 Conclusion of the Evaluation of the penultimate Toolbox

The availability of the toolbox as a working web application, ready for demos and hands-on experience, represented a major breakthrough in user engagement. The toolbox evolved tremendously in the period from the end of February to the end of July 2012, thanks to an efficient iterative process during which the feedback of many different parties (consortium partners, end users, experts and the AB) was constantly integrated.

This evaluation round was mainly focused on the usability and usefulness of the toolbox’s features for the policy maker. The reliability of the analysis results and the robustness of the


application received less focus, as these elements would be tested more thoroughly in the next evaluation round.

The evaluators were quickly able to see the potential of the tool. The tool was highly appreciated for its simple structure, based on a customizable dashboard on the one hand and ad hoc or advanced analysis on the other. The most welcome features were the geographically restricted search, the Facebook monitoring possibilities, the fast identification of user roles, key users and posts to watch, and the visualization of post activity over the collection period.

Version 2.5 gave rise to concerns about user friendliness. The limited number of posts that could be retrieved did not provide meaningful analyses. The users found it difficult to understand the analysis results, mainly with respect to the topic analysis. Both the topic groups and the scores resulting from the topic-opinion analyses were considered cryptic, and trial users raised the question of how to understand and exploit the information.

Many major fixes took place during the months that followed the introduction of prototype 2.5. The transparency of the tools was greatly improved by visualising search results in a more structured way and identifying all underlying posts individually. Widgets were made easier to manipulate, their structure was simplified by creating search widgets that can be used as input for analysis widgets, and colours showed the different types of widget at a glance as well as the links between widgets. The number of posts collected in one round was increased, and the possibility of scheduling search and analysis on combined runs was introduced.

Much experimentation went into the optimization of the topic-opinion analysis. The terms resulting from topic analysis were rationalized and made easier to understand, the number of topics was made more dynamic, and the distances between topics were shown in a matrix. Relevance and sentiment analysis scores were presented on sliding scales with easy-to-understand colour indications. Topic analysis in the German language demonstrated the potential of the tool to cope with multilingual environments. Finally, the system’s reliability and performance were greatly improved through more efficient storage techniques.

Prototype 2.6 was only evaluated internally, as a number of remaining concerns would be addressed in version 3.0, which was available a couple of weeks after the delivery of prototype 2.6.
These related to increased user flexibility by introducing more configurable parameters, the easier management of scheduled search and combined analysis, and a more user friendly search function on Facebook pages and groups.


4 Evaluation of the final Toolbox

4.1 Toolbox 3.0

Toolbox 3.0 was the final version developed within the WeGov project. On the basis of the WeGov “three-step” evaluation model, this phase focuses on the validation of analysis results and their usefulness for the policy maker’s everyday work. In addition, this phase considered the evaluation of the system as a whole and how the different end user groups may use the tool. In comparison to the previous evaluations, the strategy included the preparation of customized analysis reports for each end user, based on their specific thematic and geographic interests. The purpose was to show end users more concrete results related to how the tool may support them. Previous evaluations had shown that stakeholders were not willing to spend the time on the tool that was necessary to obtain in-depth analysis results.

4.2 Characteristics of the Pre-Version

Version: pre 3.0 (for internal use only)

Launch: 18.8.2012

Pre-evaluation: 18.8.2012 - 27.8.2012

Features: since the last update of version 2.6, WeGov focused on the following developments:
● Integration of topic and behaviour analysis code, including initial support for the German language
● Analysis now runs as a tool, storing data in the database, which can be viewed in the UI (previously all analysis was run on an ad hoc basis)
● Multiple search result sets can now be fed into analysis
● Topic summary table now integrated into the advanced search page, including expansion of a topic to display the contained posts

Improvements to the advanced search page, including:
● New tabbed layout (mainly because the topics summary needs the full width of the page)
● Improved search history and new analysis history. These are context sensitive, e.g.
  ○ click the Twitter radio button to see only Twitter results
  ○ click the Facebook radio button to see only Facebook results
  ○ click the Topics Analysis tab to view the corresponding results


  ○ click the Behaviour Analysis tab to view the corresponding results
● Selection of multiple search activities (all runs) to feed into topics or behaviour analysis
● Setting of the analysis language (currently via the Language option in the search parameters)
● Users decide on the number of topics returned (-1 is the default, which lets the analysis decide)
● Improved selection of runs (e.g. it used to be possible to select runs in different searches, which was confusing)
● Search results now contain location info (where available) (N.B. this feature seems to work for only a few posts, so there may be a display problem)
● Summary of the search also presented on the right-hand side
● Improved automatic location detection (this is now consistent with the location determined by the widgets page)
● Radius added to the search activity name

Table 8: Toolbox 3.0 – characteristics


4.3 Validation of Analysis Results with End Users

4.3.1 Introduction

This evaluation was conducted with:

• two members of the German Bundestag,

• eight employees who work directly for a member of the German Bundestag,

• two members of the State Parliament of NRW,

• one employee who works directly for a member of the State Parliament of NRW,

• one small German city,

• one big German city,

• and one local government.

In total this evaluation consisted of 16 semi-structured interviews and 11 questionnaires. The questionnaires and the semi-structured interviews were based on an individual analysis report that was created from four weeks of data collected from Facebook and Twitter using the WeGov search tools and the scheduler. Figure 5 shows an overview of the applied steps. This evaluation assesses the final WeGov toolbox and its analysis results applied to three different use cases. The evaluation outcome will provide suggestions for further work in this area.

Figure 5: Toolbox 3.0 - evaluation cycle validation of toolbox results

4.3.2 Aims of the evaluation

Until now the WeGov evaluation considered the usefulness of the use cases and the usability of the toolkit for the everyday work of policy makers. With the launch of toolbox 3.0 the toolkit analyses live SNS data, including up-to-date topics. Hence this evaluation is conducted in order to validate the usefulness of the analysis components of toolbox 3.0. Within this context we address the following questions:

1. Can we group local Facebook pages to show the topics and arguments that people are discussing locally?


2. Can we monitor topics locally on Twitter to provide potential starting points for dialogues with the e-society?
3. What are the differences between monitoring Twitter locally or globally?
4. Can the analysis results help to reduce the gap between the e-government and the e-society?
5. How beneficial are the WeGov results for policy makers, and what are the important parameters?

4.3.3 Methodology and Specifications

To address the aims above, we applied three experimental use cases that considered the end user requirements for long-term analysis and local monitoring of Twitter and Facebook. We configured the WeGov toolkit to collect data relevant to our proposed interviewees – we created user accounts for them, and set up automatic scheduled searches that were relevant to them. This enabled us to demonstrate and evaluate the analysis components with the external users on subject matter they were interested in. Our reasoning was that if they were interested, they would be better engaged, and therefore the quality of feedback would be better than if we had used generic searches. From the previous feedback, local or constituency-based searches were of high importance to MPs, so these featured strongly in the searches we set up.

4.3.3.1 Facebook Analysis of Local Pages

The intention was to monitor a sample of at least ten local Facebook pages to represent a geographical area like the MP’s constituency. Here the topic-opinion analysis (cf. section Topic Analysis) was applied to the sample of Facebook pages to extract the topics that people discussed on the pages. Each topic is a combination of words that represents a focused part of the discussion. Every topic comes with key users and key comments.

4.3.3.2 Monitoring Topics on Twitter

The second use case within this part of the evaluation starts with search queries (e.g. climate change) on Twitter. Here we collected data from Twitter three times a day for approximately five topics. The maximum number of tweets for a single search is 1500. All the tweets were used as input for the topic analysis, the discussion activity analysis and the behaviour analysis.

4.3.3.3 Monitoring Topics on Twitter Locally

The third use case within the experiment used search queries similar to the second. The difference is that the search queries in this case are geographically restricted to a local area representing the MP’s constituency.

4.3.3.4 Unique Data Profile


As toolbox 3.0 is not self-explanatory, and the end users did not have time for extensive testing of multiple long-term searches and analyses, our strategy with a selection of the German end users was to prepare an individual analysis report for each of them. Each analysis report had the same structure, but with personalised data. Figure 6 shows a sample for the MP Matthi Bolte19. Each data profile included

(1) approx. ten Facebook pages related to the local area,

(2) approx. five topics to monitor on Twitter, and

(3) approx. five topics to monitor on Twitter locally.

The five topics that were monitored on Twitter locally were geographically restricted to the constituency under consideration. To deliver these results, the WeGov toolbox requires a location (the pin on the Google map) and a maximum radius within which tweets will be collected.
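The shape of such a data profile, together with the location and radius restriction, can be sketched as follows. This is a minimal illustration in Python: the field names and the haversine-based radius check are assumptions for illustration, not the toolbox's actual schema or geo-filtering code.

```python
import math

# Hypothetical data profile for one end user (field names are assumptions,
# not the actual WeGov schema): approx. ten Facebook pages, five Twitter
# topics, five locally restricted Twitter topics, plus the map pin and radius.
profile = {
    "facebook_pages": ["page-%d" % i for i in range(1, 11)],
    "twitter_topics": ["topic-%d" % i for i in range(1, 6)],
    "twitter_topics_local": ["local-topic-%d" % i for i in range(1, 6)],
    "location": {"lat": 52.02, "lon": 8.53},  # pin on the Google map
    "radius_km": 30.0,                        # maximum collection radius
}

def within_radius(lat, lon, profile):
    """Haversine great-circle check: does a tweet's position fall inside
    the profile's collection radius?"""
    lat1 = math.radians(profile["location"]["lat"])
    lon1 = math.radians(profile["location"]["lon"])
    lat2, lon2 = math.radians(lat), math.radians(lon)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    distance_km = 2 * 6371.0 * math.asin(math.sqrt(a))
    return distance_km <= profile["radius_km"]
```

A tweet geotagged a few kilometres from the pin passes the check, while one from a distant city does not.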

The map in Figure 6 shows the constituency on a Google Map, including the radius selected for the data collection. This data profile was initially created by GESIS and was updated in several iterations based on the feedback end users provided concerning their profile. For the collection from Facebook pages, we used the Facebook search tool, querying the constituency and the names of cities and towns within it, starting with the biggest cities and towns. Pages with more likes, posts and comments were selected before those that displayed less public engagement. If the MP had "liked" one of the selected pages, this was noted. The pages represent a selection of the available pages related to or managed by cities, public institutions, associations, local associations, arts and culture, politics, tourism and the local press.

19 Official website – URL: http://www.matthi-bolte.de/ (Retrieved on 4.10.2012)


Figure 6: Sample for an individual data profile

4.3.3.5 Monitoring unique Data

The following figure shows the parameters that were used for scheduling a daily search for a four-week duration.

Figure 7: Toolbox 3.0 - UI for Facebook monitoring

Twitter searches were scheduled every eight hours for the duration of four weeks. Figure 8 shows, on the left hand side of the screenshot, the settings for both the local Twitter search and the non-geographically restricted Twitter search. The options on the right hand side allow users


to run a search for "everywhere" or for a geographically restricted area via a radius selection button.

Figure 8: Toolbox 3.0 - UI for Twitter monitoring
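The collection cadence described above (daily Facebook searches, Twitter searches every eight hours, both over four weeks) can be checked with a small sketch. The helper below is illustrative, not the toolbox's internal scheduler:

```python
from datetime import datetime, timedelta

def collection_times(start, every_hours, weeks=4):
    """List the run times of a recurring scheduled search over a fixed
    duration (illustrative sketch, not the toolbox's scheduler)."""
    end = start + timedelta(weeks=weeks)
    runs = []
    t = start
    while t < end:
        runs.append(t)
        t += timedelta(hours=every_hours)
    return runs

start = datetime(2012, 6, 1, 8, 0)       # arbitrary example start date
twitter_runs = collection_times(start, every_hours=8)    # every 8 hours
facebook_runs = collection_times(start, every_hours=24)  # daily

# Four weeks = 28 days, so 28 * 3 = 84 Twitter collections and
# 28 Facebook collections per scheduled search.
```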

4.3.3.6 Analysis Report

After four weeks of data collection, the data was analysed within the three use cases mentioned above. Figure 9 shows the analysis process. There are three columns, one for each use case, where the input data from the monitoring process was analysed. The table shows the WeGov analysis components that were used for the analysis report. The results for the Facebook pages comprise a topic analysis with key users and key posts. The Twitter analysis results are the same, but additionally include user roles, the post frequency, top posts to watch and top users to watch. The reason for the different analyses is that the user role analysis, post frequency, top posts to watch and top users to watch are not available for Facebook. Instead of providing these analyses for both Facebook and Twitter, KMI prioritised making their components work with German text, as German end users felt this was a necessary feature to make the toolkit most useful to them.


Figure 9: Analysis process for the analysis report

The analysis outcome of Figure 9 is summarized in a PDF report. The sample report for Matthi Bolte, one of 15 analysis reports, can be found in the appendix at the end of this document (cp. appendix E.2). The analysis reports include a description of the evaluation strategy and the results at a glance, on one page where possible. The analysis reports were sent to end users approx. two weeks before the interviews to allow them time to prepare their comments and feedback.

4.3.3.7 Questionnaire

In addition to the analysis report, the participants received a questionnaire that covered concrete examples from their analysis report. The examples included one Facebook and one Twitter example of geographically restricted data. The sample questionnaire for Matthi Bolte, one of 15 questionnaires, can be found in the appendix (cp. appendix E.3). All questionnaires contained the same information and the same questions; the only difference was the sample of analysis results, which was selected manually.

Questionnaire outline

• Page 1


o Cover sheet with information stating the aims of the questionnaire

• Page 2

o Background information with respect to the project and the analysis components

• Page 3

o A list of ten topics, each composed of five words. For each topic the stakeholders were asked to answer the following four questions:

. Question 1: Is the topic clear? (The aggregated number of answers is summarized in Table 9, in the following section on results, column 3)

. Question 2: What is the label for this topic? (The aggregated number of answers is summarized in Table 9, column 5)

. Question 3: Do you know the topic from press work? (The aggregated number of answers is summarized in Table 9, column 8)

. Question 4: Is this an interesting topic? (The aggregated number of answers is summarized in Table 9, column 9)

• Page 4

o This page included three sample posts and the names of the users who wrote them. These are the most relevant posts for the first topic on page 3. The stakeholders were asked the following question:

. Question 1: What label would you choose for these posts? (The aggregated number of answers is summarized in Table 10, column 2)

• Page 5

o On this page the stakeholders were shown five influential posts (posts to watch) and five influential users (users to watch). The stakeholders were given information about the query term and when the monitoring took place. For the five posts to watch the stakeholders were asked the following question:

. Question 1: Would you react to this tweet? (The aggregated number of answers is summarized in Table 11, column 2)

o Concerning the users to watch the stakeholders were asked the following questions:

. Question 1: Are you following this user? (The aggregated number of answers is summarized in Table 12, column 2)


. Question 2: Would you follow this user? (The aggregated number of answers is summarized in Table 12, column 3)

• Page 6

o The last page of the questionnaire showed the graph of discussion activity for the analysis parameters on the previous page. Two questions were asked of stakeholders:

. Question 1: Based on the graph's shape showing the discussion activity for the topic, would you like to get more information on this topic? (The aggregated number of answers is summarized in Table 13, column 2)

. Question 2: Why? (The aggregated number of answers is summarized in Table 13, column 2)

4.3.3.8 Follow-up Interview

The purpose of the follow-up interviews was to obtain more in-depth assessments of the analysis results provided within the analysis report and the questionnaire. The interviews focused on the reasons why stakeholders answered the questionnaires the way they did. The complete interview guide is included in the appendix in German. Below are the interview guidelines that provided the sample questions for the 20-minute interview with stakeholders:

Questions concerning the questionnaire

• Introducing/ describing questionnaire: „Two weeks ago we sent you a questionnaire. With this questionnaire we would like to estimate the quality and the application range of the results of the analysis …“ (2-3 sentences).

• Did you have the possibility to fill in the questionnaire?

Yes
• Did any problems arise with respect to the questions or the contents?

o Questions: Incomprehensible formulation?

o Contents: Incomprehensible results?

No
• Explain the questionnaire briefly and take up questions …

General questions

• Did you have the possibility to look at the results of the analysis?


Yes
• Are the results clear?
• Does the use case „local Facebook monitoring” interest you?
o Do the subject fields have a realistic relation to …?
o Does the analysis reflect real discussion in your region?
o Are the comments and users assigned correctly?
o Would you use opinions / information for your work? Is e-participation possible?
o Is public dialogue possible at this level?
o Is this use case suitable for use in your everyday life? Scale 1-10.
• Is the use case „(local) subject analysis on Twitter“ interesting for you?
o Which analysis is of most use to you?
o Would you use opinions / information for your work (e-participation)? For which?
o Is public dialogue possible at this level?
o Is this use case suitable for your everyday use? Scale 1-10.
• Are the results helpful?
• Do the results reflect your expectations?
• Are there any surprises?

No
• Show the results and give explanations …
• Proceed with the above questions afterwards

Focused questions

• Which of these concrete results is of value to you?

Yes
• What represents this added value?
• Graphics, topics, single comments, user profiles?

No
• Are the results not understandable?
• Do the results not meet your expectations?
• What are your expectations?
• Do you use the Social Web?
• Is it of any value to you?
• How could WeGov support you?


4.3.4 Results

This chapter is separated into the different parts of the WeGov analysis components. Each part summarizes the questionnaire results and the follow-up interviews concerning the individual analysis reports of end users. Each section starts with the questionnaire results and proceeds with the follow-up interview results. A general feedback section at the end covers feedback that relates to all analysis components.

4.3.4.1 How to read the following Tables?

This section shows five tables that summarize the answers of the participants who assessed a sample of their individual analysis data in the questionnaire. Each table has a description in the top left corner, highlighted in yellow, showing the name of the analysis component that was assessed (e.g. "Assessment for topics"). All tables have the same number of rows, because the rows show the aggregated answers of the participants. The number of columns differs, because the columns relate to the questions of the questionnaire.

4.3.4.2 Topic Analysis with key users and posts

Summary questionnaire page 3

The following two tables show the aggregated numbers of answers indicating how the participants assessed the topic analysis with key users and posts. Table 9 summarizes the assessment of ten topics that were created by analysing posts and comments from at least ten local Facebook pages, while Table 10 covers the assessment of the three most relevant comments and users for the first topic. In addition to the first column naming the participants, Table 9 has ten more columns showing how the local Facebook sample was assessed:

• Column 1 shows the number of Facebook pages selected as input data to create a list of ten topics (each topic is composed of five words). The number of Facebook pages ranges between ten and 52 for one analysis.

• Column 2 shows whether the monitored geographical area is local or part of a city area.

• Column 3 shows the number of topics that were clear to the participant – in other words, the participant could imagine what event the words stand for. In this column, a larger number is better.

• In contrast to column 3, column 4 shows the number of topics that were not understandable to the participant, i.e. the participant could not imagine what the words stand for. In this column, a smaller number is better.

• Column 5 shows how many topics the user was able to give labels to. For instance, for the topic "weekend, people, U2, late, expensive" the user was able to give the label "concert". The criterion for a "good" topic is therefore whether the user was able to give it a label. The higher the number, the clearer the topics; the maximum number is ten, meaning all topics were clear to the user.

• Where the user was able to give a label to one of the ten topics, we compared the similarity of label and topic. For instance, if the user chose the label "U2" for the topic "weekend, people, U2, late, expensive", we increased the number in column 6; if the user chose the label "concert" for that topic, we increased the number in column 7.

• Column 8 shows the total number of topics that were already known to the participant (e.g. from press work). This is a test of whether the analysis surfaces topics that are new to the participant.

• Column 9 shows the aggregated number of topics that are of general interest for the participant’s work.

• The last column, 10, shows whether the participant assessed an additional sample of the analysis report.
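The similar/different bookkeeping for columns 6 and 7 can be sketched as follows. This uses a simple exact-keyword match; the manual comparison in the evaluation may have been more lenient:

```python
def classify_label(label, topic_words):
    """Count a suggested label as SIMILAR (column 6) if it matches one of
    the topic's keywords, otherwise as DIFFERENT (column 7).
    Illustrative sketch of the manual comparison, not project code."""
    keywords = {w.lower() for w in topic_words}
    return "similar" if label.lower() in keywords else "different"

topic = ["weekend", "people", "U2", "late", "expensive"]
first = classify_label("U2", topic)        # "similar"  -> column 6
second = classify_label("concert", topic)  # "different" -> column 7
```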

Assessment for topics (page 3 of the questionnaire)

Participant | 1 Unique Facebook pages | 2 Monitored area | 3 Topics understandable | 4 Topics NOT understandable | 5 Topics given labels | 6 Titles SIMILAR to topic | 7 Titles DIFFERENT to topic | 8 Topics known from presswork | 9 Topics interesting | 10 Additional data set validated?
Bundestag 1 (Office) | 20 | Local | 6/10 | 4/10 | 1/10 | 0/1 | 1/1 | 6/10 | 6/10 | Yes
Bundestag 2 (Office) | 10 | Local | 7/10 | 3/10 | 7/10 | 0/7 | 7/7 | 3/10 | 2/10 | Yes
Bundestag 3 (MP) | 20 | Local | 0/10 | 10/10 | 4/10 | 0/4 | 4/4 | 0/10 | 0/10 | Yes
Bundestag 4 (Office) | 10 | Local | 0/10 | 10/10 | 0/10 | --- | --- | 0/10 | 0/10 | No
Bundestag 5 (Office) | 10 | City | 1/10 | 9/10 | 1/10 | 1/1 | 0/1 | 0/10 | 0/10 | Yes
Bundestag 6 (MP) | 10 | Local | 4/10 | 6/10 | 4/10 | 0/4 | 4/4 | 2/10 | 0/10 | No
State Parl. 1 (MP) | 10 | City | 7/10 | 3/10 | 7/10 | 0/7 | 7/7 | 6/10 | 6/10 | Yes
Local Government | 10 | Local + City | 3/10 | 7/10 | 5/10 | 2/5 | 3/5 | 0/10 | 1/10 | Yes
City of Cologne 1 | 52 | City | 7/10 | 3/10 | 7/10 | 1/7 | 6/7 | 6/10 | 4/10 | Yes
City of Cologne 2 | 52 | City | 6/10 | 4/10 | 6/10 | 0/6 | 6/6 | 5/10 | 4/10 | Yes
City of Kempten | 26 | City | 6/10 | 4/10 | 4/10 | 0/4 | 4/4 | 4/10 | 8/10 | Yes
Total number | --- | --- | 47/110 | 63/110 | 46/110 | 4/46 | 42/46 | 32/110 | 31/110 | ---
Average | --- | --- | 43% | 57% | 41% | 9% | 91% | 29% | 28% | ---

Table 9: Questionnaire page 3/6 – topics
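The "Total number" and "Average" rows aggregate the per-participant fractions. For example, for column 3 ("topics understandable"):

```python
def aggregate(cells):
    """Sum x/y cells and derive the percentage, mirroring the 'Total
    number' and 'Average' rows of Table 9."""
    num = sum(int(c.split("/")[0]) for c in cells)
    den = sum(int(c.split("/")[1]) for c in cells)
    return num, den, round(100 * num / den)

# Column 3 values for the eleven participants, as listed in Table 9.
understandable = ["6/10", "7/10", "0/10", "0/10", "1/10", "4/10",
                  "7/10", "3/10", "7/10", "6/10", "6/10"]

total_num, total_den, pct = aggregate(understandable)  # 47 of 110, i.e. 43%
```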

Results questionnaire page 3

• In total, 110 topics were shown to the participants; 47 were assessed as understandable and 63 as not understandable. The average share of understandable topics is therefore 43% – nearly every second topic was understandable to a stakeholder. However, the number of understandable topics per participant varied from 0 to 7 out of ten: three stakeholders, across the local and city levels, identified zero or only one topic, while three participants identified seven topics. (Cp. columns 2+3+4)

• For the total of 110 topics, the participants suggested 46 titles, i.e. a title was given for 41% of topics – nearly every second topic. The interpretation is that from the keywords the policy maker finds it easy to formulate a label summarising the topic. The number of suggested titles per participant ranged from zero to seven; three participants gave zero or one title, and three participants suggested seven titles. Comparing suggested titles and topics, the words are 9% similar and 91% different. This can be interpreted as an indicator of how focused a topic is: if there is one clear label for a topic based on its keywords, this is good. If there is none, this is bad, because the user cannot understand the topic. If there is more than one label, the topic is less focused and may contain more than one theme. (Cp. columns 5+6+7)

• From the list of 110 topics, the participants knew 32% from their daily presswork and were generally interested in 28% of the topics. (Cp. columns 8+9)

Summary questionnaire page 4

On page 4 of the questionnaire (aggregated in Table 10), the stakeholders were shown the first three key comments and key users belonging to the first topic from page 3 of the questionnaire.

• Column 1 shows whether a title was given for the topic in the first round (page 3).

• Column 2 shows whether a title was given in the second round, after reading the key comments.

• Column 3 shows whether the titles given in the first and second round match.

• We did not ask in the questionnaire whether users would react to a comment or whether the comments are relevant, but some users provided this information. Column 4 therefore shows numbers where participants rated the comments.

Assessment for comments (page 4 of the questionnaire)

Participant | 1 Title given in first round? | 2 Title given in second round? | 3 Second round title equates to first round? | 4 Potential to start a dialogue on comment/user level
Bundestag 1 (Office) | Yes | Yes | No | 0
Bundestag 2 (Office) | --- | --- | --- | ---
Bundestag 3 (MP) | Yes | No | --- | ---
Bundestag 4 (Office) | No | Yes | --- | 0
Bundestag 5 (Office) | --- | --- | --- | ---
Bundestag 6 (MP) | No | Yes | --- | ---
State Parl. 1 (MP) | No | Yes | --- | 0
Local Government | No | Yes | --- | ---
City of Cologne 1 | Yes | Yes | Yes | 1
City of Cologne 2 | Yes | Yes | Yes | ---
City of Kempten | No | Yes | --- | 0
Total number | 4/9 | 8/9 | 2/3 | ---
Average | 44% | 88% | 66% | ---

Table 10: Questionnaire page 4/6 – key users and posts

Results questionnaire page 4

• 44% of the participants suggested a title for the first topic in the first round. In the second round the participants read the three most relevant comments and gave a title for the topic again: One user chose the same title; one user chose a more specific title; and one participant chose a different title than before. (Cp. columns 1+2+3)

• Four of the six users (66%) who participated in the assessment of relevant comments would not reply to any of the comments; two of the six participants (33%) would reply to one of the comments. (Cp. column 4)

Results follow-up interviews

Expected topics

WeGov produces results that the stakeholders expect. All topics that were assessed as understandable were known beforehand. The reason: stakeholders are well informed about topics that arise or are discussed within their constituency. Stakeholders follow local SNS channels and are part of SNS discussions – so they are 'aware of the public area'. Regarding the further Twitter analysis results (local and global), the assessment was often the same: stakeholders monitor topics locally and globally – therefore they are 'aware of the public area' and of which subtopics are being discussed. Within the samples shown to the interviewees, the subtopics were identified and the topic of discussion was clear to them. The analysis is therefore able to provide the topics that are relevant for the queried search on Twitter, provided there are enough tweets. Concerning Facebook, the topics were more understandable and more helpful for the interviewees when they were extracted from Facebook pages with high discussion activity (e.g. pages of the local press).

Quality of topics

When comparing the three use cases 'Facebook topics', 'local Twitter topics' and 'global Twitter topics', the best results came from Twitter. Comparing the local and the global Twitter analysis, the topics were better for the global analysis. None of the 'Facebook topics' interested the policy makers, and there were no topics that the policy makers did not know about before – so this use case could provide no 'unknown' topics.

Political content within Facebook comments


The comments and users that were provided for the Facebook use case were not useful at all. The interviewees mentioned the following reasons:

• “The political extract within discussions is low”. Predominantly, public Facebook discussions were driven by non-political comments.

• “There is no local discussion culture”. Facebook users did not regularly discuss local topics on Facebook.

• “The context is missing”. Even if an interesting comment was identified it is important to be able to follow the whole discussion to understand the comment in the right way.

Different meanings for topics

All interviewees mentioned that the combination of five words for one topic can have multiple meanings. It is often the case that two or three words fit together while another word gives the group of words as a whole a completely different meaning. Another problem is that single words can also have different meanings. Take the German word "liebe": one interviewee mentioned that it was not clear to him whether this word means the noun (love) or the salutation (dear). Depending on the meaning of the single word, the combination with other words can have different meanings.

Less clear topics

All interviewees observed that the topics are often unclear and provide no benefit for them. The reason why 43% of the 110 topics were assessed as understandable is that policy makers know what is happening in the area of their electorate. The interviewees confirmed that this figure from the questionnaire is very optimistic, because they often guessed what the meaning of a topic could be. Most of the topics were clear to them because they know the 'real world' case and can therefore infer the topic. All interviewees further confirmed that this background information is necessary for most of the provided topics.

Redundant words

Words like http, https or dear appear within the word lists of topics but are redundant for understanding the topic. These words should be removed from the lists.

Strangely occurring words

Some combinations of words sounded very strange to the interviewees. For instance, the combination of "bacterium", "Facebook" and "Google" sounded unrealistic to one interviewee.

Many tweets with relevance = 1


Whilst looking at relevant posts with interviewees, it was often the case that many 'key posts' had a relevance of 1. The interviewees tried to understand why so many posts had the highest relevance, but no one could explain it with the information provided on the dashboard.

Key words not matching the key posts

Interviewees did not understand why key posts are relevant if their words do not match the topic group's key words.
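The redundant-word issue raised above suggests a simple stop-list filter before topics are displayed. The sketch below is illustrative; the stop list is an assumption based on the interviewees' examples, not the toolbox's actual preprocessing:

```python
# Tokens interviewees flagged as redundant in topic word lists
# (the exact stop list is an assumption for illustration).
REDUNDANT = {"http", "https", "dear"}

def clean_topic(words):
    """Drop redundant tokens from a topic's keyword list before display."""
    return [w for w in words if w.lower() not in REDUNDANT]

cleaned = clean_topic(["http", "Schule", "dear", "Unterricht", "Eltern"])
# -> ["Schule", "Unterricht", "Eltern"]
```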

Number of words per topic

Each topic comes with five words that compose exactly one topic. All interviewees confirmed that it is unusual for all five words to be necessary to understand a topic. Sometimes one word is enough; more often two or at most three words are needed. A combination of four words being needed to understand one topic is also very unusual.

4.3.4.3 Sentiment and Controversy Analysis

Results follow-up interviews

The validation of the sentiment analysis was not part of the questionnaire, but the analysis report covered at least one example for each of the three use cases, which were discussed during the expert interviews. Below is the interviewees' assessment of the sentiment analysis:

• Most of the interviewees can guess at the meaning of ‘sentiment’ and ‘controversy’ within the WeGov toolbox. Therefore end users look at these indicators to choose a topic, and to read the posts contained within the group. But:

o It’s not ‘clear’ to them why a discussion is either positive or negative as the visualization provides only one scale. For instance it may help to show the total number of both - the number of positive and also the number of negative comments.

o When combined with 'controversy', the 'sentiment' is less clear. End users have difficulties understanding what the discussion looks like when seeing only the two scales.

o The 'controversy' scale, when viewed separately, is easier to understand. In general the UI needs further improvement to give end users a better understanding of its parameters.

• One abnormality within nearly all of the sentiment/controversy analyses was the low peaks on the scales: generally the scales were very neutral. There were, however, differences between Facebook and Twitter posts; peaks occurred most often in the Twitter samples. Four interviewees mentioned that:


o The reason for the neutral scales could be due to the ‘lower quality’ of the algorithm, but the interviewees didn’t examine all of the comments to confirm their statement.

o Another issue is the strength of opinion leaders on Twitter. In the case of the MP Patrick Schnieder, who tried to start a discussion opposing the point of view of the chief editor of a famous German newspaper, his controversial tweet was answered with a tirade of comments from users that drove the discussion.

o This feedback suggests room for improvement on the clear presentation of information, greater transparency regarding the criteria that the tools use to analyse information, and a better understanding of SNS behaviour in general.

o Interviewees argued that WeGov is a tool that is between them and the large amount of SNS data. Therefore the project needs to consider that the behaviour of the SNS may change frequently – for instance through new privacy settings on Facebook or the way that political parties in Germany have revolutionised discussions on SNS using open and transparent methods.

• When comparing the results of the sentiment analysis between the different use cases, there were more peaks in the Twitter sample than in the Facebook sample.

4.3.4.4 Posts to watch

Summary questionnaire page 5

The following table aggregates the participants' answers concerning the validation of posts to watch. Five posts were presented to each participant:

• Column 1 shows the term queried on Twitter, whether the search was geographically restricted, and the number of tweets collected as input for the analysis.

• Column 2 shows the total number of tweets the participants would react to.

• Column 3 shows the number of distinct authors – if the number is five, all tweets are from different authors; if the number is one, all tweets were written by only one author.

• Column 4 shows the number of duplicates in the result list.

Assessment for posts to watch (page 5 of the questionnaire)

Participant | 1 Search term / number of tweets | 2 Tweets the participant would react to | 3 Unique authors | 4 Duplicates
Bundestag 1 (Office) | 401 local tweets for Euro | 0/5 | 5/5 | 0/5
Bundestag 2 (Office) | Approx. 1000 local tweets for Bundeswehr (Eng. Federal Armed Forces) | 3/5 | 1/5 | 0/5
Bundestag 3 (MP) | Approx. 1000 global tweets for Bürgerbeteiligung (Eng. civic participation) | 1/5 | 5/5 | 0/5
Bundestag 4 (Office) | Approx. 300 global tweets for Fracking | 0/5 | 5/5 | 0/5
Bundestag 5 (Office) | 49 local tweets for Netzpolitik (Eng. internet policy) | --- | --- | ---
Bundestag 6 (MP) | 49 local tweets for Datenschutz (Eng. data privacy) | 4/5 | 5/5 | 0/5
State Parl. 1 (MP) | Approx. 500 local tweets for F95 (name of a German soccer club) | 2/5 | 1/5 | 0/5
Local Government | Approx. 10,000 local tweets for Saarland (geographical region) | 1/5 | 1/5 | 0/5
City of Cologne 1 | Approx. 500 local tweets for Schule (Eng. school) | 3/5 | 4/5 | 0/5
City of Cologne 2 | Approx. 500 local tweets for Schule (Eng. school) | 1/5 | 4/5 | 0/5
City of Kempten | 37 local tweets for Festwoche (Eng. festival) | 0/5 | 2/5 | 3/5
Total number | --- | 15/50 | 33/50 | 3/50
Average | --- | 30% | 66% | 6%

Table 11: Questionnaire page 5/6 – posts to watch
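Columns 3 and 4 of Table 11 can be derived from a result list as sketched below. The `author`/`text` field names are assumed for illustration, not the toolbox's data model:

```python
from collections import Counter

def post_stats(posts):
    """Number of unique authors among the posts to watch, and the number
    of posts that duplicate an earlier post's text (illustrative sketch)."""
    unique_authors = len({p["author"] for p in posts})
    text_counts = Counter(p["text"] for p in posts)
    duplicates = sum(n - 1 for n in text_counts.values() if n > 1)
    return unique_authors, duplicates

sample = [
    {"author": "a", "text": "t1"},
    {"author": "a", "text": "t1"},  # same author, duplicated text
    {"author": "b", "text": "t2"},
]
authors, dups = post_stats(sample)  # -> (2, 1)
```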

Results questionnaire page 5

• The total number of tweets participants would react to is 15, i.e. 30% on average. Nearly every third tweet was of enough interest to the participants that they would react to it. (Cp. column 2)


• The number of authors who wrote the posts to watch ranged between one and five. In 30% of the cases only one author had written all the posts to watch, and in 40% of the cases all authors were unique. In two of the cases where the tweets were written by only one author, that author was a journalist. (Cp. column 3)

• Almost all posts to watch shown to the end users were unique – only in the case of the City of Kempten did the list include three duplicates. (Cp. column 4)

Results follow-up interviews

In total there were three cases (30%) where participants would not react to any of the posts. What was the reason for this?

• In the case of the city of Kempten there is less Twitter activity in the area of the city and only one provider for local news that is also active on Twitter. Therefore the interviewee expected the results beforehand.

• For the two other cases, the interviewees argued that the provided posts to watch did not include any interesting information.

4.3.4.5 Users to watch

Summary questionnaire page 5

The following table aggregates the participants' answers concerning the validation of users to watch. Five users were presented to each participant:

• Column 1 shows the term queried on Twitter, whether the search was geographically restricted, and the number of tweets collected as input for the analysis.

• Column 2 shows the total number of proposed Twitter users the participant was already following.

• Column 3 shows the number of proposed Twitter users the participant would follow on Twitter.

Assessment for users to watch (page 5 of the questionnaire)

Participant | 1 Search term / number of tweets | 2 Users the participant is following | 3 Users the participant would follow
Bundestag 1 (Office) | 401 local tweets for Euro | 0/5 | 4/5
Bundestag 2 (Office) | Approx. 1000 local tweets for Bundeswehr (Eng. Federal Armed Forces) | 3/5 | 2/5
Bundestag 3 (MP) | Approx. 1000 global tweets for Bürgerbeteiligung (Eng. civic participation) | 0/5 | 0/5
Bundestag 4 (Office) | Approx. 300 global tweets for Fracking | 0/5 | 0/5
Bundestag 5 (Office) | 49 local tweets for Netzpolitik (Eng. internet policy) | --- | ---
Bundestag 6 (MP) | 49 local tweets for Datenschutz (Eng. data privacy) | 1/5 | 0/5
State Parl. 1 (MP) | Approx. 500 local tweets for F95 (name of a German soccer club) | 3/5 | 0/5
State Parl. 2 (MP) | Approx. 200 local tweets for OWL (acronym for a local region) | --- | ---
Local Government | Approx. 10,000 local tweets for Saarland (geographical region) | 4/5 | 1/5
City of Cologne 1 | Approx. 500 local tweets for Schule (Eng. school) | 0/5 | 1/5
City of Cologne 2 | Approx. 500 local tweets for Schule (Eng. school) | 0/5 | 0/5
City of Kempten | 37 local tweets for Festwoche (Eng. festival) | 2/5 | 2/5
Total number | --- | 13/50 | 10/50
Average | --- | 26% | 20%

Table 12: Questionnaire page 5/6 – users to watch

Results questionnaire page 5

From the viewpoint of the questionnaire, this analysis provided in most cases interesting users for stakeholders to watch:

• Column 2 shows that five of the participants were following at least one of the proposed users, while the other five were not following any of them. The maximum number of 'users to watch' already being followed was four. On average, 26% of the proposed users were already followed by the participants.


• Column 3 shows that four stakeholders might follow at least one of the proposed users. Two of these are already following at least one user, and two of them were not interested in following any of the proposed users.

Results follow-up interviews

In the interviews the participants mentioned good examples showing that this kind of analysis is very helpful to them. For instance, Matthi Bolte identified a user who is very active in the field of technology. The user is an opinion leader and acts in the role of an NGO. This example shows that the analysis identifies not only opinion leaders from the fields of politics, the press and the economy; there is also potential to identify NGOs and similar opinion leaders.

4.3.4.6 Discussion Activity

Summary questionnaire page 6

Table 13 summarizes the participants' answers regarding the assessment of discussion activity. Here each participant was shown a figure visualizing a graph of the number of posts in relation to time:

• Column 1 shows whether the graph in the figure has a significant characteristic that might be of interest to the participant, for instance a very high peak on one day.

• Column 2 shows the participants' answers as to whether the visualization is interesting to them, and column 3 shows the reason for this interest where the participants provided an explanation.

Assessment for discussion activity (page 6 of the questionnaire):

| Participant | Characteristic of the chart | Is the chart interesting for engaging? | Why? |
|---|---|---|---|
| Bundestag 1 (Office) | --- | No | --- |
| Bundestag 2 (Office) | --- | --- | --- |
| Bundestag 3 (MP) | One peak | Yes | Topic getting more relevant |
| Bundestag 4 (Office) | Mountain: chart initially increases and decreases until the end | Yes | For comparison of two areas |
| Bundestag 5 (Office) | --- | --- | --- |
| Bundestag 6 (MP) | --- | No | Not enough posts |
| State Parl. 1 (MP) | Base line with two peaks | Yes | Peaks |
| State Parl. 2 (MP) | --- | --- | --- |
| Local Government | Graph is increasing | Yes | High activity and the graph is increasing |
| City of Cologne 1 | Base line with two peaks very close | Yes | What happens between the peaks? |
| City of Cologne 2 | Base line with two peaks very close | Yes | Peaks |
| City of Kempten | --- | No | --- |

Table 13: Questionnaire page 6/6 - discussion activity

Results questionnaire page 6

• Column 2 shows that the discussion activity diagram was interesting to six participants. Each of these diagrams had a special characteristic, for instance at least one peak or an increasing or decreasing trend. (Cp. column 1)

• Participants suggested they might look at the chart to see the peaks and potentially detect trends. In the cases where the analysis did not show much movement, this kind of visualization was not that interesting to the stakeholders.

Results follow-up interviews

• Within the interviews the stakeholders described the graph as an interesting tool for long-term analysis. The most interesting part is the right-hand side of the graph: if the graph increases, the visualization indicates a trend showing how the topic may evolve. In combination with an alert function this tool would be beneficial.

• However, the interviewees were disappointed that it was not possible to view the tweets directly from a peak on the graph, or the tweets between two peaks. The stakeholders were interested in the users' behaviour and arguments before, during and after a peak. If the number of tweets decreased dramatically, it would be particularly beneficial to understand the reason for this.

4.3.4.7 User Roles (Pie Chart)

Results follow-up interviews

The user roles were not part of the questionnaire, but they were part of the analysis report for each Twitter search term. The following feedback is therefore drawn directly from the interviews.

• The five user roles are meaningful, and it is interesting to compare the roles over time and across different topics. However, the way the user roles are currently translated into the dashboard's pie chart is virtually useless, because the users themselves are not available in combination with the pie chart. We saw this as an important point to address, so it has been fixed in a patch available on our development server: the pie chart now has lists of users for each role.

• The labels for the different user roles are not clear, and it takes a lot of time to explain the different roles to the interviewees. The roles therefore need labels from which end users can get an explanation. For instance, a 'broadcaster' was understood as the press by the interviewees.

• Examples may help users to understand the roles correctly.

• The role ‘Information Source’ was an interesting role for policy makers.

• The stakeholders who were categorised within the user roles all fell into the role of daily users. Some stakeholders asked for the criteria they can influence themselves, assuming that it might be beneficial to become an 'Information Source' instead of a 'daily user'. A number showing how far they are from the next role would be helpful.

• For a better understanding of the roles, the stakeholders need more information on the classification process. It is not only important for them to know the criteria; it is also interesting for them to know which criteria are most important for increasing their own score.

4.3.4.8 General Feedback

Results follow-up interviews

Support for e-participation portals

E-participation portals like Cologne's budgeting portal provide useful topics which are important for citizens. Nevertheless, the assumption is that more in-depth discussions take place on the social web, which is why the city of Cologne operates social media channels to engage with its citizens. For a better awareness of the topics that citizens are discussing online, analysis software like WeGov is necessary.

Social media activity

In the area of Cologne, with approx. one million citizens, it is assumed that rich discussions on political topics take place on the social web and that it is beneficial for the city to be aware of them. In contrast, the city of Kempten has approx. sixty thousand citizens, and analysis tools like WeGov are less important there because of the very low social media activity in the public areas of SNS. This feedback is very similar to that of the interviewees from parliaments: social media activity within small communities is very low, simply because of the small number of people in those communities. Social media activities are more prevalent in places with a high population density.


Social media quality

The most important factor besides social media activity is the 'quality' of SNS. Interviewees confirmed that rather low-quality political discussions are taking place within the public areas of SNS. From their personal point of view, most political discussions do not happen in public. For instance, on Facebook there is more reaction to 'social' posts (e.g. a personal photograph) than to a politics-driven post. On platforms like Twitter every tweet is public and the users' behaviour is therefore different, so the WeGov toolkit needs to take this into account.

'Pseudo' dialogue

Another issue is the strength of opinion leaders on Twitter, as described in the example of the MP Patrick Schnieder (4.3.4.2).

Data security and ethical guidelines

Data security and ethical guidelines are very important. For public authorities, a social media policy is necessary. WeGov has investigated this issue and our conclusions are given in section 5.5.

Importance of ethical guidelines for public authorities

In contrast to commercial tools (such as the ones surveyed in [7]), public authorities need to consider ethical guidelines while using analysis tools such as WeGov. While companies use market analysis tools to explore the status of their brand within the social web, a recent example involving the state chancellery of the German state of Sachsen indicates a real gap of open questions around applying tools like WeGov. The state of Sachsen published a call for bids for a project very similar to WeGov: the project was to explore the opportunity of using analysis software to monitor SNS in Sachsen in order to engage better with citizens by identifying their needs. The politician Johannes Lichdi20 criticized the call in an open letter, with statements like: "Opinion research is not a main issue for a State…", whereupon the state chancellery of Sachsen withdrew the call. [36]

Within the WeGov project we engaged different levels of end users who deal with this issue in different ways. For instance, the city of Cologne argued: "There is no report from the EU or the federal government on the usage of social web analysis tools by public authorities. But public authorities are responsible for providing a report when using social web analysis tools. The general data security guidelines provided by the EU are not enough. There is a lack of a clear guideline from the highest agencies."

20 Official website of the politician Johannes Lichdi. URL: http://www.johannes-lichdi.de/pm+M50b55fa2d5e.html (Retrieved 2012-10-09)


E-participation

WeGov is not a typical e-participation tool, because its aim is improved dialogue between the eGovernment and the eSociety. The analysis components therefore identify users and opinions: the KMI tools provide opinion leaders, while the topic analysis provides users with information regarding the content of posts. The topic analysis component thus seems to support e-participation more effectively than the user behaviour component does.

• Interviewees mentioned that the comments provided in the 'local Facebook' use case are useless – therefore e-participation is not realistic. The reasons mentioned were the low quality of the comments and their incoherence with respect to the discussion thread.

• With respect to the 'local Twitter' and 'global Twitter' use cases, the topic analysis provided much more realistic comments for e-participation across all analysis components. Even though the level of the comments provided by the topic analysis was not validated, the interviewees reported a generally better universal set.

• Concerning the behaviour analysis, the components provide useful comments and users, but these generally include opinion leaders instead of 'citizen opinions'. The interviewees argued that many of the identified comments and users are NGOs, and this group at least represents the 'society's opinion'. On this level e-participation seems more realistic, because these opinions can influence decision-makers.

4.3.5 Discussion

General analysis approach

This evaluation approach was very effective with respect to the quality and usefulness of the end users' feedback. However, it was very time consuming, because the analysis reports and the extracted sample for the questionnaire needed current and personalized data: Facebook pages and Twitter topics, on the global and local level, of interest to the stakeholder. This approach therefore required research time on the social web and continuous coordination with the end users to design an individual data report. Including all the steps necessary to run this study, about one week was needed for each end user. In the case of the Bundestag one end user cancelled their participation because of illness.

Technical limitation

We found a technical limitation of the hosting system: when there were more than about 25,000 posts, the analysis components ran out of memory. This is not a limitation of the software, but of the system it was executed on. However, it did mean that we did not use some search results because they were too large. The solution is simply to increase the memory, but time and cost prevented us from doing this.

Number of topics


Within the analysis reports we presented ten topics for the topic analysis. There are two reasons for this: it is easier if the structure is always the same, and ten topics can be viewed at a glance. The downside is that the topics might be of better quality if the algorithm 'decides' itself how many topics are identified and shown on the dashboard. In the case of one of the searches with more than 23,000 tweets, the analysis was run via the console instead of the dashboard; it took 115 minutes to analyse and listed over 600 topics. When we specified ten topics, the analysis of the same data took 4 minutes! Even if the topics could provide better results this way, it is unrealistic for a stakeholder to wait 115 minutes for an analysis to finish and then read more than 600 topics (each composed of five words). This is why we decided to limit the result list to ten topics. We decided to give the user control over the number of topics in the user interface, and have advised them to begin with a limited number of topics and to adjust up and down as they see fit, for example increasing the number of topics by 10 and re-running the analysis until the number of topics becomes intractable, the analysis takes too long, or repetition is seen. The run time of the analysis is related to the number of topics, so selecting a small number at the beginning should give a quick run time and an overview of the themes, and the user can then dig deeper if they wish.

Global vs. local analysis

While the questionnaire considered a local sample for analysis, the analysis report also covered samples on the global level. The difference between local and global analysis results is that the number of activities is usually lower on the local level. This experiment did not consider whether user behaviour is also different on the local level and whether the analysis components need to take this into account.
Right number of input posts

Particularly on the local level we often applied the analysis to a very small number of posts. In some cases we provided an analysis on the local and the global level for the same search term. We did not compare both results within each report, but the results are different. In general the tools seem to work both with few posts and with large numbers of posts. Further tests should be carried out to compare the differences between the results.

Possible amount of local tweets

For Twitter we used the local search functionality, which restricts tweets by using geo-location information provided by the SNS user. There is no concrete number for how many posts can be detected this way. However, it seems that a high number of tweets cannot be collected this way, because users do not provide their location data. During the PolitCamp12 event, which took place in Berlin in September 2012, we ran a test to get an impression of the total number of tweets returned when querying local areas with WeGov. We collected tweets with the event's hashtag (#PC12) to see the difference between the numbers for local (Cp. Figure 10, left hand side) and non-local querying (Cp. Figure 10, right hand side). The first setting restricted Twitter to a 5-kilometre radius around the event's location, and the second setting was without any geographical restriction.

[Figure 10 consists of two activity charts. Left: local search on Twitter for the hashtag #PC12 with a radius of 5 kilometres around the event's location21; highest peak: 120 tweets on 22 Sep 2012. Right: global search on Twitter for the hashtag #PC12; highest peak: 2630 tweets on 22 Sep 2012.]

Figure 10: Querying Twitter for #PC12

People who attended PolitCamp12 may confirm that most of the tweets were written directly at the event location. Even with a conservative estimate that 50% were written directly from the event location, the number of such tweets is 1315. The number of tweets identified with the local search is 120, which is less than ten per cent. This example is not representative, but it gives an impression of how many tweets are available via local search.
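The effect described here can be sketched client-side: a local search only sees posts that actually carry coordinates. This is a minimal illustration using the haversine distance; the tweet structure, coordinates and 5 km radius are hypothetical and not WeGov's actual implementation (WeGov relies on the SNS's own geo filtering):

```python
# Sketch only: why a geo-restricted search misses tweets without coordinates.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def local_filter(posts, centre, radius_km):
    """Keep only posts whose coordinates lie within radius_km of centre.
    Posts without coordinates are dropped, which is why local searches
    return far fewer tweets than global ones."""
    return [p for p in posts
            if p.get("coords") is not None
            and haversine_km(*centre, *p["coords"]) <= radius_km]

# Hypothetical tweets: only the first carries coordinates near the venue.
tweets = [
    {"text": "#PC12 great panel", "coords": (52.512, 13.431)},
    {"text": "#PC12 watching the stream", "coords": None},
    {"text": "#PC12 greetings from Munich", "coords": (48.137, 11.575)},
]
venue = (52.514, 13.428)
print([t["text"] for t in local_filter(tweets, venue, 5)])
```

The second tweet is invisible to the local search even though it may well have been written at the event, matching the under-ten-per-cent observation above.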

4.3.6 Conclusion for Validation of Analysis Results with End Users

At the beginning of this section we posed five leading questions for this validation with end users. The conclusions regarding these questions follow:

1. Can we bunch local Facebook pages to show the topics and arguments that people are discussing locally?

21 URL: https://maps.google.de/maps?q=Holzmarktstra%C3%9Fe+33,+Friedrichshain- Kreuzberg+10243+Berlin&hl=en&ie=UTF8&sll=51.238455,6.81435&sspn=0.272555,0.696945&geocode=F Wo_IQMd-ufMAA&hnear=Holzmarktstra%C3%9Fe+33,+10243+Berlin&t=m&z=16&iwloc=A (Retrieved 2012-10-15)


• The local Facebook monitoring use case, which bunches Facebook pages for the topic analysis, is interesting for the interviewees because it provides topics in combination with relevant comments and users. The additional information on sentiment and controversy for each topic provides added value for the interviewees.

• Even though only 42% of the topics within the questionnaire sample were clear to the participants, the interviewees commented on the quality of the analysis component. The interviewees recommended further toolbox improvements (especially to the analysis component) and empirical evaluations (e.g. changing the number of topics).

2. Can we monitor topics locally on Twitter to provide potential starting points for dialogues with the e-society?

• The use case strongly depends on the number of tweets that can be collected. The number of tweets was predominantly under 50. The reasons mentioned for this were low Twitter communication (e.g. in the city of Kempten) and technical issues with the geo information, which needs to be activated in the user account. The analysis results were therefore not really relevant on the local level.

• But in the cases with many tweets, the analysis results generally become more interesting and more useful for the interviewees.

• In general the use case is of significant interest to the interviewees, but it needs further improvement of the analysis component and technical means to collect more tweets. The key challenge is to acquire more tweets about the same subject from a local area; this may be achieved by running additional local Twitter searches for terms similar to the one entered by the user. GESIS have a search term recommender that provides search terms related to the user's input, and it would be a good experiment to incorporate this into WeGov. In addition, the interviewees expect that the social web will grow and that the use case will become more important for them in the future. Nevertheless the interviewees recommended supporting further SNS.

3. What are the differences between monitoring Twitter locally or globally?

• The study with stakeholders has shown that the analysis components are more effective on global searches, because the amount of input data is larger.

• The topics provided by the topic analysis were clearer to the interviewees on Twitter data than on Facebook data.

4. Can the analysis results help to reduce the gap between the e-government and the e-society?


• The analysis results provide starting points for policy makers to engage with citizens. Users who represent the e-society were detected and interesting content was identified. The gap can therefore be reduced.

• One MP's statement was: "The social web doesn't really provide political discussions, but it reduces the gap between policy makers and citizens concerning social aspects and is therefore necessary for politics."

• WeGov therefore already provides important results for reducing the gap between e-government and e-society, but it needs further improvement to identify the e-society rather than opinion leaders.

5. How beneficial are the WeGov results for policy makers and what are important parameters?

• The overall conclusion here is that the WeGov local and global monitoring use cases are relevant to policy makers and provide them with useful results. There is currently no concrete number from 1 to 10 indicating how relevant the results are and whether they affect policy-making – the relevance and usefulness strongly depend on the individual case (e.g. the searches and how much data they can return).

• There are important parameters that affect the analysis results. Factors influencing the toolbox results include:

o the quality of political discussions,
o the quantity of input data,
o the combination of input data,
o the behaviour of opinion leaders in discussions,
o different SNS for different discussions,
o different user roles,
o frequently changing data security and privacy settings,
o identifying citizens and controversial opinions.

• From the viewpoint of the user, important features of the WeGov toolbox include:
o the usability of the tools,
o the processing of analysis results for further analysis,
o the transparency of the analysis,
o what the analysis criteria are,
o examples and descriptions of components,


o the accuracy and credibility of results.

• A critical factor determining the quality of the results is the quantity and quality of the input data. The more data returned from searches, the better, and the fact that WeGov can combine search results into a single analysis enables the user to conduct multiple related searches in order to obtain a reasonably large data set for analysis.
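Combining several related searches into one analysis input can be sketched as a simple merge that drops duplicate posts. The post structure and the `post_id` field here are illustrative assumptions, not WeGov's actual data model:

```python
# Sketch: merging overlapping search results into one deduplicated data set
# before analysis, so related searches enlarge the input without double
# counting posts. The post dictionaries are hypothetical.
def combine_searches(*result_sets):
    """Merge posts from several searches, dropping duplicates by id."""
    seen, combined = set(), []
    for results in result_sets:
        for post in results:
            if post["post_id"] not in seen:
                seen.add(post["post_id"])
                combined.append(post)
    return combined

# Two related searches ("Schule", "Bildung") with one overlapping post.
schule = [{"post_id": 1, "text": "Schule saniert"},
          {"post_id": 2, "text": "Bildung und Schule"}]
bildung = [{"post_id": 2, "text": "Bildung und Schule"},
           {"post_id": 3, "text": "Bildungspolitik"}]
print(len(combine_searches(schule, bildung)))  # 3 unique posts
```

Deduplication matters here because related search terms are deliberately chosen to overlap, and duplicated posts would otherwise skew topic and activity statistics.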


4.4 Evaluation with the EU-Parliament

The evaluation with members and staff of the European Parliament focused fully on the usability and usefulness of the toolbox and its functions in a policy-making environment. The user engagement was carried out through a semi-structured interview, a demo and an accompanied hands-on trial. The dashboard had been prepared in advance to deal with their major areas of interest, and the interviewees could try out a number of features on the spot. Due to the high workload and heavy travelling schedules of this stakeholder group, it was only possible to have one evaluation session for each of the prototype versions. This allowed us to capture their on-the-fly feedback and their perception of the evolution of the toolbox from one version to the next.

We met MEPs and their front-line staff, as well as representatives of departments supporting the MEPs. Although the latter had more resources to allocate to social media monitoring, they all mentioned that finding the human and financial resources to keep track of the impact of policy makers on social networks, and to exploit the social networks as a barometer of citizens' opinions and concerns, remains a major challenge.

The evaluators immediately recognized the progress made since the previous prototype, but it was clear that a new or infrequent user needs more guidance on the general structure and the logic behind the application. When accessing the application, it is not immediately clear that it represents a workflow, starting with a search followed by different types of analysis. Users also had trouble initially understanding how a search is initiated differently on Twitter and Facebook. Once these concepts were understood, the users appreciated the new way the search results were displayed, with many familiar references to social network features, which seemed to greatly increase their trust in the toolbox.
As was the case with all trial users so far, many questions were raised about the algorithms used for topic and behaviour analysis. The topic analysis results from a number of Twitter and Facebook searches tested on the fly looked relevant to them. To avoid confusion, an effort was made to mainly use English-language key terms and to concentrate on analysis results with English-language posts, which in the European context is a simplification of reality. As expected, these trial users were very concerned about how a tool like WeGov could handle posts in different languages in one analysis. The analysis of combined search results was highly appreciated, but the users recommended making the tools more flexible in terms of the management of stored search and analysis data, such as the possibility of renaming and exporting them. Provided that the concerns mentioned above are addressed, the trial users saw good potential in the tools, especially for obtaining a very fast view of influential users and posts and of user behaviour. The major use cases they identified included following up discussions on the Facebook pages of the main newspapers and, during a pre-electoral period, keeping abreast of discussions taking place on the Facebook pages of all competing political parties. They also considered


WeGov as a potential tool to quickly follow and analyse the impact of a "Twitter fight" and of specific social media campaigns.

While the toolbox functionality was not always easy to understand and assess for an individual policy maker with a small staff of limited availability, its full potential became clearer when meeting supporting entities specialized in social media, as was the case for the last two stakeholders met. This was an opportunity to meet dedicated social media professionals specialised in policy making or institutional communication at a European level, with a good knowledge of the European public sphere. It was important to hear the perspective of staff highly familiar with social networks, who need to cope with substantial data volumes and have some experience with commercial or open-source social media monitoring tools.

As a complement to the elected Members of the European Parliament, who use social media on their own initiative to reach out to their constituency, we also met the Directorate General Communication of the European Parliament. This DG has a Web Communication Unit that actively uses social media to communicate on behalf of the European Parliament as an institution. With almost 500,000 likes, the posts on the EP Facebook page also generate many comments. Currently the comments are read one by one by the staff, but their content is not really exploited or analysed. They state that today they measure (rather than analyse) the activity on social media about the European Parliament and the themes related to its competence areas. Their ultimate objective is to set up live monitoring of what is said about their institution on social media and to identify with a high level of accuracy where intervening in conversations can make a difference.
With their highly specialized background in social media communications and related concerns, the seven participants in the demo quickly understood the structure and functionality of the WeGov tools. The widgets had been preconfigured with search terms related to the most important EP topic of the week and with the recent comments on the EP's Facebook page. At first glance, with their existing knowledge of what is being said on their own page and on the search terms reflecting their hot themes, but without an in-depth examination of the topic and behaviour analysis results, the participants in the demo perceived WeGov's functionality and results as certainly relevant. In the Twitter behaviour analysis results they even recognized themselves as an "Information Source" user on one of the topics. Their preliminary conclusion was that tools like WeGov could help them towards their objectives in terms of social media exploitation, especially in closing the gaps in Facebook monitoring and analysis. An area about which they are somewhat more sceptical is automated sentiment analysis: some recent research done with a PhD student produced very positive sentiment scores for a number of posts by one of the most ironic anti-Europe journalists. Irony and sarcasm present a challenge for many lexical analyses (see for example [44]).

As a conclusion of the engagement with EP members, staff and supporting departments, those who remained engaged during the whole project duration were sufficiently intrigued by the

tools to insist on the opportunity to test them further and to be informed about future evolutions of the WeGov concept and features. Understandably, the major interest and the highest potential for the tool were expressed by the Web Communication Unit of DG Communication. WeGov could clearly bridge a number of gaps they encounter with their existing support tools, though they need some further validation of the reliability and accuracy of the WeGov analyses. They showed readiness to be engaged in possible next steps of WeGov. They also introduced WeGov to the personal communications advisor to the President of the European Parliament and thought it would be highly valuable to meet the social media coordinator of the European Commission Directorate General Communication.


4.5 Workshops with Scientists, End Users and Practitioners

4.5.1 Introduction

The aim of the WeGov workshops was to elicit the engagement of scientists, users and practitioners in the design process of the toolbox. The strength of the workshops was that the group commented on and discussed the toolbox from an expert viewpoint, yet with different focuses. To feed the discussion and receive feedback on use cases, functionality, usability, dissemination and exploitation, we started each workshop with a presentation or a live demo.

4.5.2 2nd WeGov Workshop during EGOV12 Conference

Conference: IFIP EGOV 2012, Kristiansand (Norway)
Session: Workshop D – 5th September 2012, 16:00-18:00
Presenters: Timo Wandhöfer, Paul Walland
Chairs: Prof. Maria Wimmer, Paul Walland

Workshop Agenda

• Project and background (Paul Walland)

• Stakeholder engagement and results (Timo Wandhöfer)

• Live demonstration and hands-on session (Timo Wandhöfer)

• Login page – how to log on to the toolbox

• Landing page – overview of widgets to see the analysis results at a glance

• Advanced search – sample use cases to see its strengths

• Use case 1: The end user is located in Kristiansand, Norway. He is interested in the activities on the topic “eGov” on Twitter within a radius of 100 kilometers.

• Use case 2: An end user located in Kristiansand wants to know the sub-topics that Twitter users are discussing concerning “Kristiansand”.

• Use case 3: The end user wants to know which sub-topics Twitter users are currently discussing in London concerning the Paralympics.

• Use case 4: The end user is a member of the German Parliament. He has identified 10 Facebook pages that are important within his constituency (he likes the pages). He started the monitoring four weeks ago and wants to see if there are emotional topics that he is not aware of.


• Discussion with audience (Maria Wimmer, Paul Walland, Timo Wandhöfer)

Discussion Topics

Participation or monitoring

Audience: As far as it goes, WeGov provides more monitoring functionality than participation functionality. The workshop presentation focused on search and analysis.
WeGov: The policy maker uses the search and analysis functionality as a first step, to identify topics and opinions. The latest version of the toolbox provides reply functionality with which policy makers may start or join a dialogue.

Support campaigning

Audience: Campaigning is important for politicians, but campaigning does not mean engaging with citizens. The assumption is that policy makers will use the tool as a campaigning tool rather than for engaging with citizens.

Supporting day-to-day parliamentary life

Audience: How does WeGov support parliamentary life on a day-to-day basis?
WeGov: Preliminary interviews with parliamentarians and their employees have shown that policy makers engage with their constituents on social networking sites. WeGov has implemented functionality enabling the analysis of local areas on Twitter.

Incomprehensible topics (topic analysis)

Audience: The topics identified by the topic analysis are not understandable. The policy maker might not understand the meaning of the words that represent one topic.

Sentiment and Controversy (topic analysis)

Audience: The sentiment and controversy values do not match the real values; the expected values are higher than those shown by WeGov. A validation is needed here, to test the accuracy of the tools or to test whether the problem lies in the poor quality of the data on social networking sites.

Accuracy of analysis results

Audience: Good functionality and supporting use cases are important, but in the end the results need to be accurate to make WeGov an “everyday use” tool. What efforts have been made to validate the results?
WeGov: Experiments like the HeadsUp case study validate the accuracy of the topic analysis components. HeadsUp is a website with approx. 10,000 comments from discussions between politicians and students. These discussion threads were manually summarized and will be compared with the output of the WeGov topic analysis component.


In parallel, the results of analyses based on four weeks’ intensive monitoring of policy areas and Facebook pages selected by German policy makers were assessed against their existing knowledge of the content discussed and of user behaviour in their respective public spheres.

All opinions within a discussion

Audience: WeGov shows “samples” of the social web concerning a search topic or a Facebook page. Do the samples really represent the full discussion?
WeGov: The WeGov analysis tools do not show all the diverse opinions. The approach is rather to highlight those comments and users that influence the social web and that are relevant within a discussion. The topic analysis provides the “sub-discussions” related to a discussion topic.

Automatic language selection

Audience: The social web is not restricted to one language. Users may use different languages within one discussion or on a single Facebook page. The analysis tools should automatically identify the language of the author and use this language for the analysis. During the hands-on session there were examples of Arabic and Japanese texts.

Authentication of tweeters

Audience: Not all social network users use their real name, and an MP’s staff often write in the name of the politician. How does WeGov deal with this issue of authentication?
WeGov: The toolbox does not verify the real identity of users. That is a decision of the users themselves, and of how they interact with the social web.

Combining Facebook and Twitter

Audience: The end user should be able to start with one search term and get all available analysis results on one screen.

State of the art

Audience: There are many analysis tools on the market. Why should policy makers use WeGov? What makes it more beneficial?
WeGov: WeGov was developed in response to the requirements of policy makers from different levels, and especially to their reviews of the prototype versions. This means the WeGov toolbox concept and its functionality address real needs and respect data privacy and ethical issues as well. WeGov therefore placed value on transparency, as this is what our stakeholders requested.

Is it still too complicated?

Audience: The toolbox provides rich functionality, but more development work on usability is needed to make the toolbox easier to use.


Tracking rather than analysing

Audience: The WeGov approach is to analyse social networking sites and reply to posts/tweets. Another interesting concept would be to track the responses to Facebook posts made by a politician.

Labelling

Audience: There is no description of the labels (e.g. sentiment and controversy). This is important because there are several ways to interpret these labels.

Easy to use functionality

Audience: The toolbox provides a lot of functionality. The advanced search has several settings to choose from, but it should be more self-explanatory, and the functionality should be included within widgets.

Data protection

Audience: Data protection is a very sensitive issue for politicians. How does WeGov deal with privacy issues?
WeGov: The programming interfaces of Twitter and Facebook allow a huge amount of data to be collected. On Facebook, the interface may provide posts and comments whose authors do not know that these messages are publicly accessible. The toolbox therefore limits itself to collecting data from public pages and groups.

Conclusion

The discussion with the audience of the 2nd WeGov workshop shows that the toolbox provides potentially useful functionality in several respects, but that taking it to a production or commercial quality level will require further improvement. The toolbox needs improvements to its usability, the implementation of more general use cases as easy-to-use workflows, improvement of the analysis components (e.g. the words that make up a topic) and further validation of the results. Furthermore, the concept of the toolbox allows other analysis components to run in combination with the search and schedule component.

4.5.3 Workshop with the Bundestag

Discussion Topics

Audience: How does the opinion analysis work?
Audience: How can ‘sentiment’ be quantified? How is the calculation of ‘sentiment’ affected?
Audience: How can irony be taken into account?
Audience: What are the criteria for the roles in the pie chart? When can a ‘daily user’ be regarded as an ‘Information Source’? Which factors are important here, and which should a politician consider?
Audience: User roles on Facebook are also interesting.


Audience: Prediction of issues is very helpful.

Audience: WeGov is comparable to a press cutting service – it reduces the field of information to a number of subjects.
Audience: WeGov is conceivable as a service offered to the German Bundestag. In this case, a firmly established service should be set up.
Audience: Other platforms are also interesting.
WeGov: The project has, of necessity, invested many resources in data protection, technical interfaces and language support, which has made usability less of a priority for the project.
Audience: What will happen next with WeGov?
Audience: Twitter in one region: an electoral district does not correspond exactly to the shape of a circle, so either not all relevant tweets are captured, or too many are. Twitter allows searching along a boundary.

Conclusion

The state of the WeGov toolkit made a good impression on those at the event. The concept of widgets that can be created by the user in the toolbox is considered very innovative. Further development must concentrate on usability and the analysis components. A business case for operating the toolkit as a service is recommended. The participants felt that the results cannot be rated or estimated yet, as it is not clear whether the input data is of high enough quality; the question remains how good the quality of social media data is in general. The ideal way to validate the toolkit would therefore be to provide tech-savvy users with issues and areas that are familiar to them, enabling testing with the best possible data.

4.5.4 Workshop with practitioners from the PolitCamp12

Discussion Topics

Audience: Isn’t the user role ‘Rare Poster’ of most interest with respect to dialogue with citizens?
Audience: What are the criteria by which the role ‘Rare Poster’ is assigned? Can this role be understood as representing a citizen?
Audience: What is the incentive for the future use of WeGov in politics?
Audience: Does any potential for abuse exist?
Audience: What about data protection?
Audience: Is the quality of the data sufficient for data protection policies?
Audience: Do my settings indicate to friends that I am using WeGov?


Audience: Where are the boundaries between user roles? Which factors count the most?

Audience: How well does the sentiment analysis function? In percentage terms?
Audience: How do the analyses behave with different issues and different input data?
Audience: Who reads the posts written by politicians? Can/will the availability be assessed?
Audience: The transfer of the functionality to the UI must be improved. The benefits are not yet intuitively recognizable, although they do exist.
Audience: How are different languages to be handled?

Audience: Can trends be detected? Can discussions be detected?
Audience: In the analysis of issues, the context of the discussion is lacking at the comment level.
Audience: Will socio-demographic data also be evaluated? Women, men, age groups?
Audience: The benefit of the analyses cannot be determined globally – the analyses must rather serve a specific application. For this purpose, the politician must know exactly what he would like to analyse in the social web and which data are important to him. This requires sensitization in dealing with the social web – users without an affinity for the social web are therefore not the target group of WeGov.
Audience: Who should use WeGov? What does WeGov cost?
Audience: Which analysis works best of all?
Audience: Does Twitter or Facebook deliver better results?
Audience: Can other networks be integrated?
Audience: What is the recommendation for how often WeGov should be used? Daily?
Audience: How should the different network practices of users be dealt with, and how should changes be responded to?
Audience: A shopping-basket function should allow, for example, a selection of identified opinion leaders in order to monitor them. A topic analysis would be interesting here, to determine the topics and sub-topics that opinion leaders post about.
Audience: If a user is assigned to a role, how can the user check how the analysis arrived at this? This also applies to all the other analyses.
Audience: The relation between controversy and sentiment is not always clear. Graphic examples must illustrate all conceivable possibilities. Why is a scale of 1–10 applied? This is not meaningful: either a discussion is positive or it is not. Percentage figures should be used instead, making it clearer. Relating different figures and analyses to one another helps to assess the results.
Audience: Top users and top comments should be referenced in the pie chart.


Audience: How is the number of listed elements determined in WeGov? E.g. 5 roles, 5 terms for a subject field, 10 subject fields.
Audience: Facebook and Twitter rate the number of likes and retweets. These are social ratings. Where does one find these in WeGov?

4.5.5 Advisory Board involvement

In this final stage, the Advisory Board members’ input was solicited through unaccompanied remote hands-on testing. They were asked to reply to three questions:

WeGov: What is your feeling about the general usability of the tools?

Advisory Board:

• The tool was found to be basic, but adequate

• There were requests to support recent Internet Explorer versions, as these are the most widespread

• Problems were reported in getting ‘my location’: the location found on the basis of the IP address was wrong

• A quick validation compared a Twitter search carried out through a widget with the same search carried out immediately on the Twitter site, and found differences

• Problems with making a widget reappear after hiding it

• It wasn’t clear initially that the tabs in the advanced search actually represent a workflow, nor was it immediately apparent that the topic and behaviour analyses require the topic search to have been performed first. One suggestion was that the controls be shaded out at points where they are out of scope and that some tool tips be provided, or else that a wizard-like interface be offered for the analysis workflow for new users

• The time-scheduling tools will facilitate the monitoring of trends, but the difficulty is that the person has to initiate the search with search terms – perhaps part of the toolkit (or a preliminary step in the workflow) should focus on optimizing the initial search terms and discovering any mutation of them during the lifetime of the topic

WeGov: How relevant are the functionalities offered by the toolbox, and how useful do you consider they will be to the policy maker?

Advisory Board:

• The general direction of the functionality (search followed by topic and behaviour analyses) is a good way to go, but they think there will be a learning curve that might be challenging to someone with a background in the more closed-world questioning of a traditional consultation. They also think that there will be challenges


for reducing the results of this type of ‘survey’ to a quantitative level – something that those who work in policy are more familiar with.

WeGov: What are the specific use cases for WeGov that you could imagine for a policy maker?

• The WeGov approach might be very helpful in doing some of the preparatory work in advance of a more formal ‘consultation’ exercise.

• It will definitely help promote the adoption of more novel methods of surveying attitudes, in that the toolkit condenses what would otherwise be a difficult and unmanageable piece of work.

• General monitoring of trends

WeGov: What are your major concerns?

• Validation of the methods. The end user might need to be convinced that it is appropriate for their system, and so might need to be able to check this. One option would be to provide something as simple as a word cloud as an alternative to the widget showing selected topics

• Is there stop-word removal, lemmatisation, or other pre-analysis reduction? Is there any tf-idf or similar weighting?

• Ensuring that the tools actually pick up the real results and don’t obscure the detection of other things that are going on that might be discovered serendipitously by people looking at raw feeds. When scanning a list of words the user can perhaps pick up on something that is a weak signal and so might not be picked up by the software but might have meaning to the user and it might then alter their search strategy. This might be particularly relevant as the language used in social media develops (neologisms, abbreviations etc), but could also just be the chance sighting of something of potential interest that piques the curiosity of the user.
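The pre-analysis steps the Advisory Board asks about (stop-word removal and tf-idf weighting) can be illustrated with a minimal sketch. This is an illustration of the standard technique, not WeGov’s actual pipeline; the stop-word list and example documents are invented for the illustration:

```python
import math
from collections import Counter

# Illustrative stop-word list; a real pipeline would use a curated list
# and typically add lemmatisation before weighting (omitted here).
STOP_WORDS = {"the", "a", "is", "are", "to", "of", "and", "in", "that"}

def preprocess(text):
    """Tokenize, lowercase, strip punctuation, drop stop words."""
    tokens = (w.strip(".,!?").lower() for w in text.split())
    return [t for t in tokens if t and t not in STOP_WORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document (tf * log idf)."""
    tokenized = [preprocess(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    weights = []
    for doc in tokenized:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

docs = ["voting is important to politics",
        "sport in schools is important",
        "politics and voting in schools"]
w = tf_idf(docs)
# "important" occurs in two of the three documents, so it receives a
# lower idf (and weight) than "sport", which is unique to one document.
```

The effect of the weighting is that terms spread evenly across all documents are down-weighted, while terms characteristic of a single document rise to the top – the property that makes the scheme useful for surfacing distinctive key words.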


4.6 HeadsUp: Topic Opinion Evaluation

4.6.1 Introduction

Figure 11: Toolbox 3.0 - evaluation cycle topic opinion validation

HeadsUp is a forum hosted by the Hansard Society and was chosen by the project as a validation and evaluation case study. It represents a valuable opportunity both because of its relevance to the project and because it is a real-world case involving real data and analysis. The objective of this evaluation is twofold:

1. to assess the usefulness of the toolkit in analysing online engagement within a civil society context;

2. to assess how accurate and reliable the analysis of the data is when compared to human analysis.

This evaluation is self-contained and independent of the other trials managed by Gov2u and GESIS, but its outcomes will also be fed into the specification and development of the WeGov toolkit prototypes.

4.6.1.1 Background

HeadsUp (www.headsup.org.uk) was launched in June 2003 to promote political awareness and participation amongst young people. It is an online debating space for 11-18 year olds that gives them the opportunity to debate political issues with their peers, elected representatives and other decision-makers. Five three-week debates take place each year, fitting around both the school and parliamentary calendars. The forum discussions are based around political topics of interest to young people, as well as those related to key political events, issues of debate in Parliament and the media, and current government policy. Each forum is supported by background materials and teaching resources to ensure that the discussions are of high quality. The discussions are analysed by the Hansard Society and summarised in a report, which is disseminated widely. The report contains the key themes of the debate with direct quotes from participants, other information about the forum, and the political context at the time the debate happened.


The core reason for analysing the forums and distributing the report is to allow young people to have their voices heard by those who make decisions on their behalf, and to highlight that their perspectives are often different from those of adults. This is a vital aspect of HeadsUp: the report provides a channel to feed information from the forums back to policy-makers, politicians and journalists, thereby allowing young people’s perspectives to inform a wide audience of those with the power to effect change.

4.6.1.2 Evaluation

The fact that each debate has been analysed and recorded in the forum reports provides a good basis for evaluating the WeGov toolkit. Each report is written just after the forum has taken place and the findings are based on a purely human analysis of the discussions. As all comments are pre-moderated, the human analysts have a good understanding of the content of the forums, but those discussions that have hundreds of comments can prove a challenge to analyse manually.

It is important to note that the purpose of the forum and reports is not to provide a ‘pure’ research tool. The forums are primarily used for educational purposes and to give decision-makers a way to understand young people’s opinions on a whole range of issues. The methodology used to compile the reports may therefore not be as research-oriented as other parts of the WeGov evaluations. However, the historical forum data and the pre-existing set of manually analysed reports, most of them written before the WeGov project was formed, constitute an independent data set that provides a useful comparison point for testing the accuracy of the algorithms. This evaluation deals only with the University of Koblenz’s analysis components. It was decided that an evaluation of the KMI behaviour analysis components would not be effective, because users’ behaviours are not a feature of the HeadsUp reports, which are compiled primarily to understand the themes of debate. Furthermore, the behaviour analysis components from KMI were not compatible with the HeadsUp data, as the relationships between posts are not recorded on the forums, meaning that a significant outcome from analysing this data was unlikely.

4.6.1.3 Aims of the Evaluation

• To compare the similarity of the analysis results between the WeGov toolkit and the forums that have been analysed manually;

• To confirm the accuracy of the toolkit and explore how well it interprets post data, allocating comments to topic groups and understanding positive or negative sentiment;

• To identify improvements that could be made to the usability of the toolkit and to explore how understandable the current results are to an ordinary user.

4.6.1.4 Methodology


Each HeadsUp forum is accompanied by a report, created shortly after the forum finished, which highlights the key themes of the debate in order to show policy-makers and politicians which issues were of most interest to the young people taking part. These reports were created by a human analyst and formed the benchmark that we used to assess the accuracy of the analysis carried out by the WeGov toolkit. This evaluation focused on three different-sized forums:

• one small (fewer than 100 posts): Sex Education – Do you get enough? (36 posts)

• one medium (fewer than 400 posts): Youth Citizenship Commission: are young people allergic to politics? (317 posts)

• and one large (800+ posts): How equal is Britain? (1,186 posts)

This allowed us to test the accuracy of the topic analysis with small numbers of comments, where human analysis is capable of understanding the entirety of the debate. Progressively larger forums were chosen to explore how the toolkit dealt with larger amounts of data, which present more of a challenge for human interpretation.

4.6.1.4.1 Topic analysis

The chosen forums were run through the HeadsUp analyser, and the topics that were returned were compared to those highlighted in the report (representing the human analysis of that forum). These questions were addressed:

a) How many topics/sub-topics appear in both the analyser and the report?

b) How many topics/sub-topics appear in the report that do not appear in the analyser?

c) Is further analysis or drilling down into the data set required to understand the key topics presented in the toolkit?

d) How well does the analyser pick out ‘hot topics’ or the most popular key words from the data?

e) How useful are the results from the analyser that do not match the report? Do they tell a different story?

4.6.1.4.2 Identifying key users and posts

Quotes are also used in the report to support the conclusions made about topics of importance in the forum. To investigate whether the same users and posts were highlighted in the toolkit and the report, the questions below were addressed:

a) How many quotes that feature in the toolkit as key posts are used in the report?

b) How many quotes are used in the report that do not appear in the analyser?

4.6.1.4.3 Sentiment analysis


Understanding sentiment within large amounts of data is incredibly challenging – the WeGov toolkit could be useful for understanding how posters felt about an issue. To test this we addressed the following points:

a) Looking through the top positive and top negative comments identified by the analyser – from a human perspective, are they truly positive or negative?

b) Is it possible to identify what it is that comments are positive or negative towards?

c) Is it clear from what is presented in the sentiment analysis which specific issues are agreed or disagreed with?

d) If not, how could agreement or disagreement be conveyed through the data available, if at all?

4.6.1.4.4 Usability and usefulness

Did the HeadsUp analyser make the analysis of key topics easier? Were there issues with:

a) The amount of data returned – is there too much or too little for the outcome to be understandable?

b) Usability – how easy is the toolkit to use?

c) Interface – does the interface allow decisions to be made about the key topics quickly and accurately?

d) What options would be needed to make the toolkit flexible and suitable for numerous data subjects?

e) How easy is it to identify topics within topics (sub-topics)?

4.6.1.5 Preparing the Data

Only moderated posts that had been analysed in the original reports were used. Non-student users, e.g. decision-makers and moderators, were removed to ensure that only the posts included in the report were also in the HeadsUp analyser, so that we were comparing like with like. Any identifiers such as school or email address were removed to ensure the posts were anonymous.

4.6.1.6 Introduction to the HeadsUp Analyser

The HeadsUp analyser uses the same algorithms as the WeGov toolkit but is built to display information from a forum rather than from social networks.


Figure 12: HeadsUp Analyser – forum selector

The number of comments within each forum and each individual thread is shown. All the data available can be selected for analysis, as can individual forums, threads or any combination of forums and/or threads. The threads and further information about the number of posts can be hidden if required. All the available data can be selected or cleared using the buttons at the top of the forum list. Once the selected data has been analysed an overview of the topics is presented, with five key words that indicate what that topic group is about.


Figure 13: HeadsUp Analyser - analysis results

Other information displayed is:

• the number of posts contained in that topic group22,

• the average sentiment in the group,

• the level of controversy between the posts within the topic group.

22 All posts are related to some extent to all topic groups, but some are more strongly related to one group than another. Posts are therefore relevant to multiple topic groups, though usually most strongly to one – this is what the relevance score reflects. The HeadsUp analyser is set to allocate a post to a group if it has a relevance score of more than 0.50 for that topic group.
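The allocation rule described in this footnote can be sketched as follows. The posts and relevance scores here are invented for illustration; the real analyser derives scores from its topic model rather than taking them as input:

```python
# Each post carries a relevance score per topic group (invented values).
RELEVANCE_THRESHOLD = 0.50

posts = {
    "post_1": {"equality": 0.82, "sport": 0.31},
    "post_2": {"equality": 0.55, "sport": 0.61},  # clears two thresholds
    "post_3": {"equality": 0.40, "sport": 0.45},  # clears none
}

def allocate(posts, threshold=RELEVANCE_THRESHOLD):
    """Assign each post to every topic group it clears the threshold for."""
    groups = {}
    for post_id, scores in posts.items():
        for topic, score in scores.items():
            if score > threshold:
                groups.setdefault(topic, []).append(post_id)
    return groups

groups = allocate(posts)
# post_3 clears no threshold, so it appears in no group's result list,
# matching the analyser's behaviour of excluding weakly related posts.
```

Because a post can clear the threshold for more than one group (post_2 above), the per-group post counts can sum to more than the number of posts analysed.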


Figure 14: HeadsUp analyser - topic group results

Topic groups can be sorted by the number of posts within the topic group or by the level of sentiment or controversy found within the group.


By expanding a topic group, the top three key users (the posters whose comments are most relevant to the topic group) and all posts within the group can be viewed. Posts can be viewed in the toolkit, but not all posts appear in the analysis results. This is because the analyser selects the posts it considers most relevant and excludes the less relevant ones. Posts that are not sufficiently related to other posts, or that do not fit well into one of the topic groups, are not contained in the analysis results, although they are part of the initial analysis.

4.6.2 Topic Analysis

4.6.2.1 Introduction

Each forum consists of between three and five threads, or sub-topics, designed to allow a range of sub-issues to be discussed whilst keeping the debate manageable. These threads are selected by the Hansard Society as key areas for debate. They are chosen specifically to reflect current debate in the media and Parliament, as well as focusing on issues that are relevant to young people. As these threads are not chosen by the young people themselves, the forum reports aim to highlight the actual topics that young people focused on most during the debate. This ensures that the reports place the emphasis on what the young people thought rather than on the artificially pre-determined debate topics. The reports generally have a similar number of themes (around 5-10), in order to keep the report understandable to the reader, regardless of how many comments the forum had. Users can select how many topic groups they would like the analyser to return; for this evaluation the number of topics returned was set to match the corresponding report, so we were able to compare like with like. As topics can be sorted by how many comments are in each group, it is possible to get an idea of the level of support each topic has within a forum. To ensure that the experiment was consistent in testing the different aspects of the toolkit, the analysis was run once for each test forum and the same results were used to test all elements of the toolkit. More detailed explanations of the experiments conducted and their findings for each forum can be found in the Appendix. Where relevant, these findings are summarised in the main report.

4.6.2.2 Experiment Results

The toolkit returns on average one topic group for every 30 posts when the number of topic groups is not set manually by the user. However, the comments are not distributed equally across the topic groups. Without the ability to set the number of topic groups manually, the results were very hard to understand. With a medium-sized forum of around 300 posts the outcome may be understandable, but with smaller or larger forums the topic groups are either not refined enough, or there are so many topic groups that patterns are hard to see or too many similar topic groups are returned.


Allowing users to define how many topic groups are returned was an early improvement made to the HeadsUp analyser. As can be seen from the table below, a forum of 1,000+ comments returns 39 topic groups without any user input; this number would be difficult for a user to analyse effectively.

4.6.2.2.1 Debate themes

The table below shows how closely the results from the toolkit and the human analysis match. The number of topics returned in the analysis was set to match the corresponding report so that we were able to compare like with like. The numbers in brackets indicate how many topic groups the toolkit returned automatically without any user input. In all instances at least half of the topic groups matched the themes in the report.

| | Sex Education | Citizenship Commission | Equality |
| How many themes appear in the report?23 | 7 (2) | 6 (4) | 10 (39) |
| How many similar topics appear in both the analyser & report? | 6 | 3 | 7 |
| How many topics appear in the report but not the analyser? | 0 | 3 | 3 |

Table 14: comparison of debate themes - report versus toolkit

The Youth Citizenship Commission forum did not fare as well as the others, due mainly to the lower quality of the data. Many of the posts in this forum were short and repeated very similar sentiments, and there were some misspelled words that could have affected the analysis, e.g. ’polititions‘ instead of ’politicians‘ and ’polotics‘ instead of ’politics‘. In this experiment the biggest data set, the Equality forum, came out best in terms of the output being most understandable to a user, with the key themes coming through clearly. Although it missed some of the smaller themes detailed in the report, it highlighted different parts of larger topics, such as the popular discussion on women’s sport. The toolkit separated the discussion about women’s sport into two distinct groups. One group of comments focused on the conversation around equality in sport, both genders being able to compete on a level playing field, and whether mixed sport is desirable. Although there was some overlap, the second group of comments focused much more on the types of sports men and women play and whether this unequal division between sports meant that these sports are sexist. These related but nuanced parts of the debate were mentioned in the report but not

23 The number in brackets shows the number of topics automatically returned by the analyser if a number is not manually specified.

 WeGov Consortium

Page 105 of 217

highlighted explicitly. This certainly gave an alternative but still truthful perspective on the debate. The analysis of this part of the forum was surprisingly nuanced.

4.6.2.2.2 Key words

Across the forums the WeGov analyser was fairly accurate in picking out key words, although how understandable these key words are in the context of the topic groupings is variable and depends to some extent on the type and quality of the input data. Medium-length comments, good spelling, a wide variety of words and larger data sets seem to produce the best results when analysing topics. This suggests that the topic analysis component might be more accurate and effective in analysing Facebook posts, forum or blog comments than Tweets. In order to see how WeGov compared to simply picking out the most used key words in a forum, the source data was analysed at http://textalyser.net. This showed the most frequently appearing key words in the data and allowed these to be compared with the key words appearing in the topic groups.

Useful key words     Sex Education   Youth Citizenship Commission   Equality

Key word search      17              9                              13

HeadsUp Analyser     23              19                             28

Table 15: comparison of keyword search versus toolkit
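To make the contrast concrete, the following sketch counts key words after stop-word filtering and crude stemming. It is illustrative only: the stop-word list and suffix rules are simplified stand-ins, not the lists WeGov actually uses.

```python
import re
from collections import Counter

# Illustrative stand-ins for the full stop-word list and stemming
# algorithm a real analyser would use.
STOP_WORDS = {"that", "what", "when", "the", "a", "is", "are", "and", "to", "of", "we"}

def crude_stem(word):
    # Strip a few common English suffixes so that variants such as
    # "sports"/"sport" or "politicians"/"politician" are counted together.
    for suffix in ("ing", "ers", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

def key_word_counts(posts):
    words = re.findall(r"[a-z']+", " ".join(posts).lower())
    return Counter(crude_stem(w) for w in words if w not in STOP_WORDS)

posts = [
    "What about the sports that women play?",
    "Mixed sports are what matters when we talk equality.",
]
print(key_word_counts(posts).most_common(3))  # 'sport' tops the list with a count of 2
```

With stop words left in, words like 'what' and 'the' would dominate the count, and 'sports' and 'sport' would be tallied separately, which is exactly the weakness of a raw frequency count.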

The text analyser is a basic tool and includes many 'stop words'24 that WeGov would exclude ('that', 'what', 'when' etc.). Also, the text analyser does not use stemming25, meaning different variations of words are counted separately. When looking at a key word list it is very hard to see how the words relate to one another. This relationship between words is a key feature of the WeGov toolkit, as the words are given meaning and context by their association with one another. The toolkit's grouping of key words gave a much better overview of the issues being discussed than the isolated key words did. This was apparent in the Sex Education forum; the words 'age' and 'consent' don't mean much individually, but when grouped together we understand that the discussion was about the 'age of consent', the legal age at which individuals are allowed to give their consent to sex. The topic groups meant that in most instances it was much easier, at a glance, to understand what the debate was about.

4.6.2.2.3 Issues with topic analysis

24 http://en.wikipedia.org/wiki/Stop_words

25 http://en.wikipedia.org/wiki/Stemming



1. Words that skew the outcome of the topic groups.

There were consistent issues with the allocation of posts to some topic groups. In the examples tested there were a number of words that skewed the outcome of the topic groups, and occasionally meant that posts were allocated to a topic group that was unrelated except for a few not particularly useful key words, e.g. 'think', 'agree', 'believe'. These words do not tell the user very much about the issues being discussed and were in some instances the only words common to all posts in a topic group.

2. Differences in the analysis results when conducted on the same data.

One of the difficulties for users with the topic analysis, and in conducting this evaluation, is that the results vary slightly when repeating the analysis on the same information. Topic groups contain different key words, in different orders, and therefore the information they convey is not entirely consistent. Although these variations are quite minor, from a user's perspective this is disconcerting. Most people would expect the same analysis of the same set of data to have the same outcome every time. The table below uses the example of the Equality forum to show the differences between three analyses run on the same data, with the same number of topics specified to be returned each time. All the input options and the data were exactly the same.

Analysis run ID number                             19                 26                   27

Number of posts in largest topic group             166                225                  167

Number of posts in smallest topic group            67                 47                   51

Words appearing only in one of the analysis runs   grammer, speech,   hoodies, tutoring,   don't, right, different,
                                                   compete            reading, totally     education, munchkin,
                                                                                           work, change

Number of topics that match themes in the report   7                  6                    7
(out of 10 topics)

Table 16: comparison of different analyses on the same data
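A plausible explanation for these variations, offered here as an assumption since the toolkit's internals are not documented in this report, is that topic-modelling algorithms are usually initialised at random, so repeated runs can settle on slightly different groupings unless the random seed is fixed. The toy below illustrates only the seed dependence, not the actual analysis:

```python
import random

def toy_topic_assignment(posts, n_topics, seed=None):
    # Random initial assignment stands in for the random start of a real
    # topic-modelling algorithm; the point is only the seed dependence.
    rng = random.Random(seed)
    return [rng.randrange(n_topics) for _ in posts]

posts = ["post %d" % i for i in range(20)]

run_a = toy_topic_assignment(posts, 3, seed=19)
run_b = toy_topic_assignment(posts, 3, seed=19)
run_c = toy_topic_assignment(posts, 3, seed=26)

print(run_a == run_b)  # True: fixing the seed makes the run repeatable
print(run_a == run_c)  # almost certainly False: a different seed gives a different grouping
```

If the toolkit fixed (or let users fix) the seed, the same data would yield the same groups on every run, which would address the consistency concern raised above.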

The words appear in different orders and with different frequencies within the topic groups. As each topic group only has five key words to help users understand what that debate is about, changing two or three of these key words could make a big difference to a user's understanding of the issues contained within the topic group. Users are unlikely to have the time or inclination to check these differences to see how the outcome varies, and may be concerned that the same data yields different results.

3. Posts in the topic group contain none of the key words.



Although this didn't happen very often during the experiments, there were instances where posts within a topic group contained none of the key words in the topic overview. This happened in the Sex Education forum: in the topic analysis the word 'abortion' appears among the key words of one topic group, but the posts contained in that group do not include the word. The word only appeared once in the analyser, in a different group discussing teen pregnancy. It may be that the analyser is using very sophisticated algorithms to understand and present the data, or that some of the posts that are part of the initial analysis are excluded from the analysis results (see point 4 below). However, for a user, posts that do not contain any of the key words would be concerning and need some explanation.

4. Posts are excluded from the analysis

Some posts do not appear in the analysis. This is a problem for HeadsUp, as the reports aim to present the views of the broadest range of users to decision-makers. Potentially relevant posts that do not fit particularly well with the topic groups that WeGov has selected may be excluded, and this could limit the voice of our users. Of course a manual analysis cannot include all users' views in the report either, but all posts are at least part of the initial analysis.

                                            Sex Education   Youth Citizenship Commission   Equality

Number of posts excluded from results       14              31                             124

Percentage of posts excluded from results   39%             10%                            10%

Table 17: comparing the number of posts excluded from analysis results

The table above shows how many posts were excluded from our test debates. Excluding 39% of posts from the analysis (as in the case of the Sex Education forum) seems a very high percentage and may mean that interesting and relevant comments were hidden from the user. The results for the larger forums seem to suggest that the algorithm works better with more data.

4.6.2.3 Future improvements

Explaining why there are different analysis results on the same data.

This issue would make many users wary about the consistency and accuracy of the data. If there is a good reason for these different analysis results it needs to be explained clearly. If not, this is very likely to stop users relying on the toolkit and trusting the results.

Excluding commonly used words from the topic groups.

Words like 'think', 'agree', 'believe' do not add anything to the analysis. These words may be different for each user, so keeping a list of words to exclude manually from the topic group key


words would be helpful. They may, however, be useful for the sentiment analysis, as they are likely to provide an indication of how users felt about the issues dealt with in the topic groups.

Including the relationships between posts.

One of the problems with the topic analyser is that it looks at each post individually and does not see the relationship between posts, for example which are replies to other posts. Users may not repeat the words that frame the debate, as to them the issue under discussion is clear, e.g. posts that are about the riots do not always refer to them explicitly. If the toolkit understood the hierarchy of comments it would have a better understanding of the context and the thread of discussion, making it better able to analyse these discussions.

Including more key words in the topic group overview.

Including more key words would make it clearer what the differences between the topic groups are. This would also be helpful when key words are repeated.

Explaining the significance of the ordering of key words.

Including an explanation of how the toolkit decides on the order of key words is important to better understand the differences between the topic groups. Some groups may contain similar or repeated key words, but the order may clarify which words were most important for that group.

Teaching the algorithm to more accurately understand and sort the data.

If a post appears within a topic group but, according to the user, does not belong there, it would be desirable if the analyser could learn from this post being reallocated to a different group. This would make future analysis better and more specific to the data being used. This may not be useful in all situations, but when writing reports from forum data like HeadsUp this would help to allocate posts to the most relevant topic group.

Allowing users to select posts that are to be excluded from the analysis.

When attempting to analyse the themes and major topics of a discussion, short posts such as 'I agree with x' are not often helpful. As has been seen, these posts can skew the results by focusing the analysis on words that hold little meaning, such as 'think', 'agree', 'believe'. Allowing the user to exclude some posts may allow the analysis to be more refined and precisely related to that data set.

4.6.3 Identifying key users and posts

4.6.3.1 Introduction

'Key users' are the authors of the 'key posts' that have been suggested by the toolkit as being particularly central to the debate. The analyser shows 'key users' and 'key posts' within each topic group, so some users are duplicated across the topic groups. However, a key user being


identified in more than one topic group could highlight these users as being very important to the debate. Currently the HeadsUp reports do not focus explicitly on who the key users in a debate were; the important element is to understand the broad themes of the forum rather than to highlight certain users. However, as a community-building tool, knowing more about who the key users are in each debate would be helpful. It would allow them to be rewarded with prizes or badges for their central role in the debate.
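If the per-group key posts were exposed as data, spotting users who are key in more than one topic group would be straightforward. The sketch below assumes a hypothetical structure mapping each topic group to (author, post) pairs; it is not part of the toolkit:

```python
from collections import Counter

def cross_group_key_users(key_posts_by_group):
    # Count how often each author of a "key post" appears across topic
    # groups; authors appearing in more than one group are candidates
    # for being central to the whole debate.
    counts = Counter(
        author
        for group in key_posts_by_group.values()
        for author, _post in group
    )
    return [author for author, n in counts.most_common() if n > 1]

key_posts_by_group = {
    "sport": [("user_a", "..."), ("user_b", "...")],
    "dress": [("user_a", "..."), ("user_c", "...")],
}
print(cross_group_key_users(key_posts_by_group))  # ['user_a']
```

If exposed in the interface, such a count could support the prizes-or-badges idea directly.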

Users are not mentioned by name in the reports, but their quotes are used widely, and this is how we compared the key users from the reports with those in the analyser. By comparing the number of times 'key posts' appear both in the report and the analyser we can understand how well the toolkit highlights the most important posts and users in the debate. It is important to note that the quotes highlighted in the report focus on showing the range of comments rather than the users who were the most engaged in, or central to, the debate. It would be surprising if the key users highlighted by the analyser matched exactly those in the report. The analysis should, however, give an indication of the users who were particularly engaged with the debate, and should highlight similar comments to those used in the report.

4.6.3.2 Experiment results

In the larger forums very few of the key users' posts that were highlighted in the analyser actually appeared in the report. However, this is not surprising when relatively few posts from the forums appeared in the original reports. In the smallest forum, a much larger number of key posts were common to both the analyser and the report. The table below shows the comparison between the number of times key users' posts appear in the reports and in the WeGov analyser.

                                                   Sex Education   Citizenship Commission   Equality

How many quotes appear in the report?              40              38                       48

How many key users26 does the analyser identify?   19              18                       30

How many times do key users' posts appear in
the report?27                                      23              2                        4

Table 18: comparing quotes from the reports and key users from the analysis results

26 These may be repeated as they are specific to the topic groups that their posts appear in.

27 Comments may be used in the report more than once due to snippets of longer posts being used in different sections. Each time a post is used this will be counted.



In the larger forums the key posts that were highlighted in the analyser were very similar to the quotes that appeared in the report, even if the authors weren’t the same. For example, a quote from the equality report:

It's wrong to tell people how to dress, this country should be run on the basis of equality, so if you're muslim, christian or any other religion you should be allowed to wear what you want.

A key user post from WeGov:

I may not be a muslim but I dont think it is fair that muslim women are not allowed to cover their faces and heads in a head scarf. It is their choice to wear what they want and I dont think anyone should be able to stop that no matter how powerful they are.

This element of the analyser performed very well. WeGov consistently highlighted the most in-depth and thought-provoking posts, and the actual content of the key posts closely matched the quotes in the report.

4.6.3.3 Future improvements

Including the interactions between key users when analysing the key posts and key users.

It would be useful to highlight users who were central to the debate, who answered others' comments and responded with further questions, particularly when trying to reward users for community-building behaviour28.

Explanation of how key users and posts were selected.

Although the key post and key user analysis performed well when compared to the report, it is unclear why the key posts selected by the algorithm were favoured over others.

Increasing the number of key posts and key users in each topic group.

For larger forums it would be useful to be able to see more key users in each topic group. Having only three key posts per topic group across a forum with 1,000+ comments is not enough and may exclude other interesting and useful posts. The number of key users to be identified could be set across a forum or across topic groups.

4.6.4 Sentiment analysis

4.6.4.1 Introduction

To assess how accurate the sentiment analysis was, the output of the analyser was checked at report level to see whether the sentiment of the topic groups agreed with the report overall. A selection of individual posts was also checked for accuracy, to see whether a human agreed with the positive, neutral or negative analysis of the statements.

28 This kind of user behaviour analysis formed part of the WeGov toolkit created by KMI, but evaluating this analysis component was outside the scope of this experiment.



It is worth mentioning that measuring sentiment within a post, or across a number of posts, is a difficult challenge for both humans and computer programs! This part of the evaluation is much more a subjective judgement than other elements of this report. One of the problems with judging the sentiment of forum data, compared to a 140-character Tweet, is that the greater detail and depth of forum posts means that users may have made both positive and negative statements within their posts. How the toolkit deals with this challenge, and how the issue can be mitigated, is considered below.

4.6.4.2 Experiment results

4.6.4.2.1 Report level

The analyser was compared with the reports for the overall sentiment of each of our test debates. The sentiment varied (as expected) between the different forums, with the analyser suggesting that the Youth Citizenship Commission debate and the Sex Education forum were the most negative. The Equality forum appeared very positive compared to the others.

To test the sentiment, the positivity or negativity was recorded for each of the topic groups and compared to the sentiment in the report. Below is a table showing how WeGov categorised each topic group in terms of sentiment.

Forum                          Positive   Neutral   Negative

Equality                       9          0         1

Youth Citizenship Commission   3          0         3

Sex Education                  3          0         4

Table 19: sentiment analysis of topic groups

Although the toolkit did accurately categorise the forums in terms of whether they were positive or negative, the Youth Citizenship Commission seemed more negative than the Sex Education forum from a human standpoint. It does appear, however, that there was a problem across the WeGov analysis components with the Youth Citizenship Commission forum as the data was of lower quality (shorter, repeated sentiments, with poor spelling) than the other forums.

The Youth Citizenship Commission forum was very negative in terms of young people’s reactions to politics and politicians, as this quote from the report shows:

HeadsUp users overwhelming said that they found politics boring and too complicated…Politicians drew many criticisms from young people for being out of touch, not listening, not speaking simply enough, setting a bad example, not visiting young people and blaming them for all society’s ills!

As the analyser only rates half the topics as negative (three out of six), and the average sentiment scores for those topics are only -0.03, -0.10 and -0.31, this does not compare


that well with the report. By contrast, the most negative topic score across the forums tested was -3.11, in the Sex Education forum, which had the most negative sentiment scores overall.

Figure 15: examples of short positive posts

One of the problems that the toolkit had to deal with in the Equality forum was that there were a large number of short comments simply supporting others' points of view; see above for examples. Although the interaction between participants was certainly positive, it is not clear that they were so positive about the topic itself. These comments appeared to skew the results by swamping the toolkit with short positive statements, which were accurately allocated by the toolkit but did not form a substantial or useful part of the debate around the issue of equality.

4.6.4.2.2 Individual post level

To test the accuracy of the sentiment scores for individual posts, the most positive and most negative topic groups within each forum were examined. The first 50 posts in each of these topic groups (or the whole topic group if it contained fewer than 50 posts) were checked for accuracy by recording whether a human interpreter agreed with the sentiment analysis for each post. The degree of negativity or positivity was also taken into account: for example, if the toolkit rates a post as -10, there must be no ambiguity about its strongly negative sentiment for the rating to be counted as accurate.



A percentage accuracy rating for each forum was calculated, showing how accurate the sentiment analysis was and highlighting any problems or inconsistencies between the forums.

Forum                          Percentage accuracy

Equality                       81%

Youth Citizenship Commission   72%

Sex Education                  70%

Table 20: percentage accuracy of sentiment analysis

This shows that, like much of the other analysis, the toolkit is more accurate when given more data to work with. The accuracy level seems quite high, particularly, and most importantly, in the larger forums that are least likely to be analysed effectively by a human. It would be useful for users to know what information the toolkit uses to decide the degree of sentiment: is it the number of times positive or negative words appear, where they appear in the comments, or how they relate to other words?

Figure 16: post showing incorrect analysis of sentiment



Above is an example of the toolkit getting the sentiment wrong. It is not clear why it has allocated this post as negative; more explanation about how the toolkit decides would make it easier for users to see when there has been an error and why the toolkit made a different judgement from a human. The human analysis of the words highlights 24 positive words and 19 negative words, so the comment should be significantly more positive than the toolkit suggests. Even if the toolkit does not understand the context of the individual's words, the balance of positive words over negative words suggests that the comment should have been categorised as positive.

Most comments are not at the extreme edges of debate, so the degree to which the toolkit sees comments as positive or negative is also important for judging accuracy. The toolkit was good at analysing the degree of sentiment when posts were of a medium length, were focused around one issue and when the spelling was good.
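The word-counting check described above can be reproduced with a naive lexicon-based scorer. This is purely illustrative: the word lists are invented, and the toolkit's actual sentiment method is not documented here.

```python
# Tiny, invented sentiment lexicons (illustrative only).
POSITIVE = {"good", "great", "fair", "agree", "love", "helpful"}
NEGATIVE = {"bad", "boring", "unfair", "rude", "sexist", "worse"}

def naive_sentiment(post):
    # Count positive and negative words and report the balance: a positive
    # score means more positive words than negative ones, and vice versa.
    words = post.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

print(naive_sentiment("sport is great and fair"))    # 2
print(naive_sentiment("this is unfair and boring"))  # -2
```

On this simple scheme, a post with 24 positive words and 19 negative words would score +5, i.e. it would come out positive, which is the intuition behind the human check above.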

Below is an example of where the toolkit got this complexity right:

Figure 17: posts showing correct analysis of sentiment

However, the toolkit does not always have enough knowledge of context to get the sentiment analysis right. An example is the debate on women's sport in the Equality forum, where the


toolkit either got the sentiment of the word 'sexist' wrong or did not understand the negative context of the comments containing it. The women's sport debate, probably the most discussed issue on this forum, should have appeared as negative. The word 'sexist' was used 78 times in the discussions about sport, yet it did not appear to register as a negative word in the toolkit. In some instances posters were saying that they didn't think sports were sexist, but in the majority of cases the word was used in a negative context, along with words such as 'rude' and 'unfair'. Only one of the posts containing this word was analysed as negative, whilst 23 were categorised as neutral and an overwhelming 54 as positive. It is possible that the toolkit did not understand the word 'sexist', or misunderstood it as a derivative of 'sex' or 'sexy', which might be categorised as positive by the algorithm.

There was almost unanimous agreement that the world of sports favours men over women. Posters felt that men’s sports got more attention, were broadcast more often, and that sportsmen were higher paid than sportswomen… Stereotypes about gender were agreed to play a role in discrimination… A number of posters said they found these stereotypes frustrating, and said that people should be able to play any sport regardless of sex.

Although this word may be an anomaly that is skewing the results in this particular forum, it shows how difficult it can be for the algorithm to understand what the sentiment, whether positive or negative, is referring to.

4.6.4.2.3 Issues with sentiment analysis

1. Short positive statements in agreement with others skew the sentiment data, making the analysis appear more positive.

There were issues with the analysis of sentiment of short positive posts that simply agreed with the posts of others. Although the sentiment was allocated correctly in most cases, the number of these posts in the largest forum made the overall sentiment appear more positive than it should have done.

2. The toolkit is particularly sensitive to different data lengths and quality when analysing sentiment.

Longer posts that deal with contradictory ideas or are inconsistent in sentiment are not easy for the toolkit to categorise correctly. Likewise, poor quality posts that are short, repetitive or badly spelled pose a similar challenge.

3. Dealing with negative words when the context is positive.

The toolkit does not deal well with posters making suggestions that include negative words such as 'not', 'worse', 'afraid' etc., even if the context for using them is positive. Posters often used a negating pre-word rather than the self-contained negative version of the word, e.g. 'not fair' rather than 'unfair'. This seems to be a problem for the toolkit in some cases.
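The 'not fair' versus 'unfair' problem could in principle be mitigated by treating a negator as flipping the polarity of the word that follows it. The sketch below is an assumption about one possible fix, not a description of the toolkit, and the word lists are invented:

```python
NEGATORS = {"not", "no", "never"}
POSITIVE = {"fair", "good", "helpful"}
NEGATIVE = {"unfair", "bad", "rude"}

def sentiment_with_negation(post):
    # A negator flips the polarity of the word that immediately follows it,
    # so "not fair" scores the same as "unfair".
    words = post.lower().split()
    score = 0
    for i, w in enumerate(words):
        word_polarity = (w in POSITIVE) - (w in NEGATIVE)
        if i > 0 and words[i - 1] in NEGATORS:
            word_polarity = -word_polarity
        score += word_polarity
    return score

print(sentiment_with_negation("that is not fair"))  # -1, same as "unfair"
print(sentiment_with_negation("that is unfair"))    # -1
```

Real negation handling is more involved (negators can affect whole phrases, and double negatives exist), but even this one-word window would catch the common 'not fair' pattern noted above.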



4.6.4.3 Future improvements

Sentiment should be shown on a sliding scale.

To make the spectrum of sentiment understandable to the user, seeing the parameters of sentiment is important. Knowing that the sentiment is -3 out of a possible -10 allows a much better understanding of the results.

Isolating words such as 'agree' or 'disagree' from the sentiment analysis.

Rather than analysing these short agree/disagree posts with the other comments, they could be isolated and a separate counter could record the number of times these words appeared in each topic group. This would give a better indication of the level of support in each topic group and, because these posts would not be part of the main analysis, it would not skew the results. This would, however, require the relationships between the posts to be included in the data, as otherwise it could be very difficult to know what the agreement or disagreement relates to.

Allowing users to select posts that are not to be included in the analysis.

As with the topic analysis, there are always some posts that are not sufficiently on-topic or are too short to be relevant to the debate. Allowing users to exclude posts would allow the sentiment of a debate to be measured more accurately. This is another option for addressing the short agree/disagree posts mentioned above.

Splitting longer posts into sections that are then analysed separately.

This could enable a more accurate understanding of posts that contain conflicting sentiment or discuss more than one issue. In longer posts sentiment can conflict, as the poster sometimes writes about different issues in one comment, and this makes it hard for the toolkit to understand what the positive and negative sentiment relates to. For example, when the words analysed change from four in a row being positive to four in a row being negative, the comment could be split into parts (a) and (b) and the sentiment logged separately, making the analysis more accurate.

Explanation of how sentiment was calculated.
More information is needed about how sentiment is decided by the toolkit, to ensure the user can accurately interpret the results.

Highlighting the words included in the analysis.

To ensure that users understand why comments are being categorised as they are, colours could be used to show the words that WeGov deems positive or negative, giving the user the option of a quick double-check. It would also be clearer which words the positivity and negativity relate to.
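The 'splitting longer posts' improvement suggested above could be sketched as follows, splitting a post where a run of sentiment-bearing words of one polarity is followed by a run of the opposite polarity. The run length and word lists here are hypothetical, chosen only to keep the example short:

```python
POSITIVE = {"good", "great", "fair", "love"}
NEGATIVE = {"bad", "unfair", "boring", "rude"}

def polarity(word):
    # +1 for a positive word, -1 for a negative word, 0 otherwise.
    return (word in POSITIVE) - (word in NEGATIVE)

def split_on_sentiment_flip(words, run_len=2):
    # Find a point where `run_len` sentiment words of one sign are followed
    # by `run_len` of the opposite sign, and split the post there.
    sent = [(i, polarity(w)) for i, w in enumerate(words) if polarity(w) != 0]
    for k in range(run_len, len(sent) - run_len + 1):
        before = [p for _, p in sent[k - run_len:k]]
        after = [p for _, p in sent[k:k + run_len]]
        if len(set(before)) == 1 and len(set(after)) == 1 and before[0] != after[0]:
            cut = sent[k][0]
            return [words[:cut], words[cut:]]
    return [words]

parts = split_on_sentiment_flip("great fair day but unfair rude ending".split())
# Scores each segment separately: one positive part (+2) and one negative part (-2).
print([(p, sum(polarity(w) for w in p)) for p in parts])
```

Scoring the two parts independently, rather than netting them against each other, is exactly the benefit the improvement above is after.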


4.6.5 Conclusion of HeadsUp: Topic Opinion Evaluation

4.6.5.1 Introduction

Although WeGov was primarily conceived as a project focusing on the analysis of political conversations on social media, it also has applications for forums and blogs. Most websites now support comments, and sites such as the BBC or Daily Mail regularly have hundreds of comments on each article. Civil society groups also run forums and blogs to connect with their members and supporters. Analysing the themes of these discussions is often beyond the resources these organisations have. WeGov could play an important role in helping small not-for-profit organisations, larger media organisations, as well as politicians and policy-makers, to understand feedback across a range of communication channels.

4.6.5.2 Usefulness

In the case of HeadsUp, the WeGov toolkit could be helpful in analysing forum data, particularly the larger forums with hundreds or thousands of comments. The WeGov toolkit takes seconds to analyse hundreds of comments, whereas human analysis takes days to produce similar results. The interface is also beneficial independently of the analysis, because it means that posts can be viewed and sorted in a number of different ways. Without this interface, a spreadsheet is used to sort, record and analyse comments, which is very time-consuming. However, it is important to note that the data being tested on the toolkit had already been analysed manually, so there was already an understanding of what the debates were about; discussions that were previously unseen may be more challenging for a user to understand. As the toolkit has been shown throughout this evaluation to work best when dealing with larger quantities of data, it provides a useful tool for situations where the human brain cannot grasp the entirety of a debate.
As the toolkit performs well on relatively in-depth data, it lends itself well to digital channels such as blogs and forums that encourage more considered and less immediate responses. The toolkit also performed well in showing the nuances between different elements of a wider debate, such as the women's sport debate, which is encouraging. This was useful as it provided a counter-perspective on the major issues of importance within a debate. The HeadsUp analyser is certainly an application the Hansard Society would use again, although we would be wary of relying on it entirely due to some of the issues flagged up in this evaluation. However, the issues noted appear to be very dependent on the situation, the data being used and the context of the discussion. Following are some suggestions about how some of these difficulties could be overcome and how the toolkit could be applied to a broader range of situations.

4.6.5.3 Usability



4.6.5.3.1 Explaining how the algorithm understands and processes data

It is not clear to the user how the algorithms work, and this is very important to ensure that users understand what the results are showing. There needs to be an easy-to-understand explanation of the mechanics behind the toolkit for each of the analysis components: topic analysis, sentiment analysis, and highlighting the key users and posts. There should also be an explanation of the irregularities in the toolkit, clarifying the results that users would perhaps not expect, such as:

• why the same data yields different results

• why keywords appear in the order and frequency that they do

• why some posts are excluded from the analysis

• what the underlying principles are for excluding certain posts

• why posts that do not contain key words are allocated to a topic group

• how the toolkit decides which words are the most important to include in the analysis

• what the toolkit uses to judge degrees of sentiment in posts

Although the way the algorithm processes and selects information to be included in the results is probably quite complicated, users will be far less likely to trust the results unless there is sufficient explanation of how it uses the information. A plain English explanation of the behind-the-scenes process is therefore essential.

4.6.5.3.2 Presentation of the results

Once a user has a better explanation of how and why the toolkit works as it does, revealing the workings of the toolkit would be helpful. There are many areas where surfacing the hidden information could help a user to better understand the results they are presented with, and would also mean results could more easily be checked manually for accuracy. These suggestions are:

• Showing more key words in each topic group and their relative importance in the analysis - perhaps through using a word cloud.

• Highlighting, in individual posts, the words that were part of the sentiment analysis. Perhaps using red for negative and green for positive.

• Showing all the excluded posts in a separate group so they are not removed entirely from the users’ view.

• Showing sentiment on a sliding scale to make it clear where a post lies in relation to the extremes of sentiment.

4.6.5.3.3 Greater options for the user to refine the data
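The red/green highlighting suggestion could be prototyped along the following lines. This is a sketch only: the word lists are illustrative assumptions, not the toolkit's actual sentiment lexicon.

```python
# Sketch of the suggested presentation aid: wrap sentiment-bearing words in
# coloured HTML spans (green for positive, red for negative).
# The POSITIVE/NEGATIVE word lists are illustrative assumptions.
POSITIVE = {"good", "great", "fair"}
NEGATIVE = {"bad", "poor", "unfair"}

def highlight(post):
    """Return the post as HTML with sentiment words colour-coded."""
    out = []
    for word in post.split():
        key = word.lower().strip(".,!?")  # ignore trailing punctuation
        if key in POSITIVE:
            out.append('<span style="color:green">%s</span>' % word)
        elif key in NEGATIVE:
            out.append('<span style="color:red">%s</span>' % word)
        else:
            out.append(word)
    return " ".join(out)
```

Such mark-up would let users see at a glance which words drove the sentiment classification of each post.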


To make the toolkit applicable to a greater number of users and situations, it would be very useful to offer more options that let the user select the data to be input or the results to be shown. User-defined options could include:

• Selecting the number of key users and posts required within different topic groups.

• Allowing users to exclude posts from the analysis.

• Allowing users to exclude certain common words from the analysis.

• Splitting longer posts into sections with different sentiment or that deal with different issues.

4.6.5.3.4 Optimum conditions for the WeGov toolkit

It has become clear throughout this evaluation that there are optimum conditions under which the WeGov analysis works best.

The results are the most accurate and understandable when:

• Comments are medium length (3 or 4 sentences)

• Comments focus on one issue

• The spelling is good and few abbreviations are used

• Varied words and phrases are used to express similar points of view

The WeGov analysis is less useful or accurate when:

• Negative words are used to modify a positive sentiment, e.g. 'not fair' rather than 'unfair'

• Comments are very short (one sentence) or very long, touching on multiple issues in a single post

• There are many short posts that repeat the same words or phrases

• Smaller, less useful words are used in numerous posts, e.g. 'think'

Clearly the data available, whether from forums, blogs, Twitter or Facebook, is not going to change, so the toolkit will need to evolve to deal with the variety of data it may encounter. The suggestions above may go some way towards mitigating these issues, but there may be other ways to ensure the toolkit can be used in as wide a range of situations as possible.
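The negation difficulty noted above ('not fair' vs. 'unfair') can be illustrated with a minimal lexicon-based sentiment sketch. The word lists and the one-token negation rule are assumptions made for illustration; they are not WeGov's actual sentiment algorithm.

```python
# Minimal lexicon-based sentiment sketch (illustrative only).
POSITIVE = {"fair", "good", "great"}
NEGATIVE = {"unfair", "bad", "poor"}
NEGATORS = {"not", "never", "no"}

def naive_score(text):
    """Count positive/negative words, ignoring negation entirely."""
    score = 0
    for word in text.lower().split():
        if word in POSITIVE:
            score += 1
        elif word in NEGATIVE:
            score -= 1
    return score

def negation_aware_score(text):
    """Flip a word's polarity when the immediately preceding token negates it."""
    words = text.lower().split()
    score = 0
    for i, word in enumerate(words):
        polarity = 1 if word in POSITIVE else -1 if word in NEGATIVE else 0
        if polarity and i > 0 and words[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
    return score
```

Here the naive scorer rates "this is not fair" as positive (it only sees "fair"), while the negation-aware version flips the polarity, matching the behaviour a user would expect.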


4.7 Conclusion of the evaluation of the final toolbox

All stakeholders involved in this last evaluation round recognized the important progress made since the beginning of the project, much of it the result of integrating their feedback on previous versions.

One of the major challenges of this evaluation phase was time management. The delivery of the last prototype in the middle of the summer break made seamless planning difficult, so we were obliged to meet stakeholders right up to the end of the project.

The major improvements in the last prototype of the WeGov project (version 3.0) were made in terms of creating a logical workflow, all the way from the initial search, through the presentation of a clear overview of storable search results, to the initiation of analyses and the presentation of their results, and on to the combination of analyses across different search-run results. Significant effort went into the presentation aspects of the tools: colours are used to distinguish the functions and relationships of widgets and to show sentiment analysis results. The user can configure many parameters to guide searches and to trade off the level of detail of topic analysis against system performance. The geographically restricted search was better adapted to fit the policy maker's constituency. The introduction of German into the analysis components demonstrated the ability to cope with multilingualism, an especially important issue for policy makers at the European level.

The last evaluation phase was by far the richest through its combined application of different methods. Workshops and direct interviews with stakeholders as diverse as policy makers from elected assemblies and institutions, supporting staff and experts allowed a continuous focus on usability and functionality. Complementary in-depth experiments were carried out to validate accuracy and reliability. In the HeadsUp experiment, this was done by comparing WeGov topic-opinion analysis results to a control group of manually pre-analysed data sets. In parallel, the results of analyses based on four weeks' intensive monitoring of policy areas and Facebook pages selected by German policy makers were assessed against those policy makers' existing experience of the content discussed and user behaviour in their respective public spheres. Additionally, the HeadsUp experiment concluded that the WeGov tools are also applicable to forums.
Both experiments showed a relative match between the compared data sets but concluded that additional investigation is needed, especially to make the topic-opinion analysis results more relevant, reliable and credible. The key result from these experiments is that the WeGov analysis tools' results are only as good as the quality of the input data. Much concern was indeed expressed by the stakeholders about the relative weakness of the data, or the lack of sufficiently nuanced opinions available on social media, for governance purposes. As WeGov will never be able to control this reality, it needs to offer the best possible response to it. Strategies including additional related searching to generate more input data and

cleaning of data to remove obvious or useless words (e.g. "Facebook", "Twitter", "http") and duplicate postings should be investigated.

The combined user feedback showed the toolkit is not yet a market-ready product: it needs further development to address the usability issues, to create more transparency regarding the algorithms used in the different analyses, and to give guidance on the optimal integration of the offered functionality into the policy maker's everyday environment. Nevertheless, this evaluation round has yielded many interesting use cases suggested by the stakeholders.
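The suggested cleaning strategy, removing obvious noise words and duplicate postings before analysis, could be sketched as follows. The noise-word list is an assumption for illustration.

```python
# Sketch of a pre-analysis cleaning step: drop platform noise words
# and exact duplicate postings. NOISE_WORDS is an illustrative assumption.
NOISE_WORDS = {"facebook", "twitter", "http", "https", "rt"}

def clean_posts(posts):
    """Return posts with noise tokens removed and duplicates dropped."""
    seen = set()
    cleaned = []
    for post in posts:
        tokens = [t for t in post.split() if t.lower() not in NOISE_WORDS]
        text = " ".join(tokens)
        key = text.lower()
        if text and key not in seen:  # skip empty and duplicate postings
            seen.add(key)
            cleaned.append(text)
    return cleaned
```

A step like this would run between data collection and topic analysis, so that frequent but meaningless tokens do not dominate the keyword extraction.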


5 End User Guidance and Conditions

5.1 Introduction

The 33-month end user engagement process described in the description of work for WeGov work package 5 (end user engagement and evaluation) is now complete. This means the project has developed an iteration of the toolbox that is final with respect to the runtime of the project. The end users have influenced this version's functional and technical scope, and the previous sections of this report provide further details of that process.

To provide further opportunities for the toolbox to be improved and applied to a greater number of situations, this section gathers the lessons learned from the end user engagement and provides best-practice information for future users of the toolkit. It details: exemplary usage, user guidance with a manual and questions & answers, and the ethical approach of the project. In future, practitioners could use the best-practice guidelines to suggest a business case for the toolbox, and perhaps a greater range of uses than has been identified so far. The research community could use the toolbox to support further research in the fields of e-governance, e-government and e-participation.

In addition, IT Innovation has decided to keep their server available for one year after the WeGov project ends, allowing the users who asked for the necessary access credentials to experiment further with the toolbox. This hosting commitment enables further stakeholder engagement and research opportunities that can feed into the next steps of the WeGov exploitation track. At the time of writing, six further WeGov accounts had been set up for German end users from NGOs, a church organization, a federal agency, and parliamentary parties. This interest in testing the toolbox arose during the PolitCamp12 workshop and indicates the need for public agencies, communities, policy makers and NGOs to find a way of understanding the social web using analysis software developed specifically for their needs.

5.2 Exemplary Usage

The WeGov toolbox is the primary outcome of the EU research project WeGov. However, the toolbox was more than this: it enabled a socio-technical exchange between the project partners and the end users, with WeGov as the platform for this learning. The more the functionality of the WeGov toolbox improved, the more useful the end users' feedback on real-life uses became; the final toolbox version 3.0 is therefore the most advanced. While the scenarios and use cases were driven by the end user partners at the beginning of the project, as the project progressed the end users were able to see increasingly tangible benefits of the toolkit and how it would fit into their daily work. Following are


some example use cases that resulted from this process of effective end user engagement and that have the potential to showcase best-practice use of WeGov.

5.2.1 Who is talking about the city on Facebook?

The city of Vienna has 23 city districts, each with its own Facebook page; at the time of writing, these pages had between 707 and 4,500 likes each. The WeGov toolbox provides the opportunity to monitor a number of Facebook pages, and the toolkit can, at a glance, analyse all the topics discussed on them.

Figure 18: Facebook pages of all 23 districts in Vienna

The city of Gent also expressed an interest in following the specific discussions going on in its different city districts, which all have their own concerns due to different population characteristics and socio-economic situations.


5.2.2 What Discussion is taking place about daily News Topics on the Social Web?

The scientific service of the German Bundestag provides five topics every day, extracted from German newspapers. The first step is publishing the topics on the Bundestag's intranet; the service then provides abstracts of the original newspaper articles and the articles themselves. However, no comments from outside the media are provided. The 'social' discussion of these topics is available on Twitter and other SNS, so WeGov could provide additional analysis of the topics highlighted by the scientific service using the social web. During the last Bundestag workshop the audience discussed 'WeGov as a service': the idea is that WeGov could automatically provide daily analyses of the five daily topics. The functionality is currently available manually.

5.2.3 How can e-participation Portals be supported?

E-participation portals such as Cologne's participatory budgeting portal surface useful topics that are important to citizens. Nevertheless, the assumption is that more in-depth discussions are taking place on the social web, which is why the city of Cologne operates social media channels to engage with its citizens. To improve their awareness of the topics citizens are discussing online, analysis software like WeGov is needed.

Figure 19: e-participation portal for the budget of Cologne


5.2.4 Which Topics arise on a Facebook Page?

Operating a Facebook page is one way of sharing information and discussing its content with other users. One example is the BBC's page, where BBC employees post information daily that is then commented on by hundreds of users. This use case provides the functionality to get a summary of the topics that have been discussed. Figure 20 shows the outcome of the WeGov topic analysis for the latest twenty posts and their 480 comments; the analysis was run on 8 October 2012 (09:36 CET). For each of the ten identified topics, the figure shows the keywords, the number of posts, the sentiment and the controversy:

• Keywords: Five keywords characterize each of the ten identified discussion topics.

• Number of posts: This shows how many posts/comments are assigned to each of the ten topics.

• Sentiment: This shows whether the topic was discussed positively (green dot) or negatively (red dot).

• Controversy: This scale shows the ratio of positive to negative posts/comments; the higher the value, the more controversial the discussion, i.e. the more evenly it is split between positive and negative posts/comments.
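One way such a controversy scale could be computed is sketched below. This is an assumption about the general idea of a controversy measure, not the toolbox's published formula.

```python
# Illustrative controversy measure (an assumption, not WeGov's formula):
# 0.0 when all posts agree in sentiment, 1.0 when positive and negative
# posts are evenly split.
def controversy(positive, negative):
    """Controversy score from counts of positive and negative posts."""
    total = positive + negative
    if total == 0:
        return 0.0
    return 1.0 - abs(positive - negative) / total
```

Under this sketch a topic with 10 positive and 10 negative comments scores 1.0 (maximally controversial), while a topic with only positive comments scores 0.0.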

Figure 20: Topic analysis for BBC website


This use case is possible with Facebook pages and Facebook groups as input data. The toolbox supports additional functionality for analysing a set of Facebook pages and groups, and the analysis can also run over a longer period (e.g. four weeks). See section 4.3 for evaluation results.

5.2.5 Which Topics arise within a Facebook Post?

Some posts on Facebook attract many comments. WeGov provides its users with functionality that analyses such a discussion by its key words. The left-hand side of Figure 21 shows a BBC Facebook post from 8 October 2012 that attracted 152 comments; the right-hand side shows two different topics identified by the topic analysis.

Figure 21: Topic analysis on BBC Facebook page

This functionality is available in the toolbox as a widget and can be applied to a Facebook post.

5.2.6 Quick catch-up on a Topic via Twitter

Twitter is an important information source for catching up on a topic briefly and easily. In the German Bundestag the scientific service informs MPs about specific topics; the information is very rich, but it takes time to obtain. A faster way to get a first impression of a particular topic is Twitter, where the user gets the up-to-date discussion together with subtopics, links, figures and users. In the Bundestag it is often the case that MPs need to talk about topics that are relatively new to them, so Twitter is one of the commonly used tools. In addition to the WeGov interviewees, the German politician Volker Beck mentioned exactly this use case during PolitCamp12, an event that links practitioners, policy makers, citizens, NGOs and scientists to improve the engagement between politics and society. Figure 22 shows the dialogue on Twitter in which the author of this report asked Volker Beck what sample topics he was referring to:

Timo Wandhöfer: “Mr. @Volker_Beck during the #pc12 you mentioned twitter as a “quicker” information source than the scientific service. What are your examples for topics? THX”


Volker Beck: “@timoWandhoefer Query categories are technology/rights/politics and web. Concerning #zensursula29 I have learned a lot on technical (im-) possibility”

Figure 22: Twitter dialogue with Volker Beck

The WeGov toolbox allows searches on a query and provides its end users with additional analysis.

5.2.7 Time-based Collection: Launching and pushing Topics

In the German Bundestag, representatives are responsible for specific policy areas. These topics can be important at the local, state, national, EU or global level; they may change over time, new topics may arise, or a topic may become interesting for one geographically restricted area. For instance, wind energy plants and biomass energy are currently hot topics in nearly all parts of Germany. Before a topic peaks, parliamentarians are interested in its evolution and in how their activities affect it, so there is a real need for monitoring and analysing topics over time. It is helpful to see whether there are peaks in discussion activity, particularly if the parliamentarian started the activity. Another interest is user roles and how they change over time. This kind of analysis may indicate the level of interest around a topic, for instance if the number of broadcasters increases, and may support conclusions about successful citizen integration in the discussion or about a topic losing citizens' interest.

29 Zensursula is the nickname of a German politician: a combination of the German word for censorship (Zensur) and her first name. Cf. http://en.wikipedia.org/wiki/Zensursula (retrieved 2012-10-08)

5.2.8 Monitoring of political SNS Campaigns

On the 20th anniversary of the single market, the EPP group within the European Parliament launched a campaign conducted exclusively on social networks. During this period of general gloom and negative perception of the EU by citizens, they felt it was important to raise awareness of the major achievements of the European Union and its benefits for the daily lives of citizens. The initiators of this campaign needed tools to monitor the effects of such campaigns and wanted to test whether WeGov could give them an adequate response.

5.2.9 Increased alertness during pre-electoral Periods

During the pre-election period, policy makers are likely to be especially interested in the close monitoring of discussions taking place on the Facebook pages of the different parties involved in the election.

5.3 User Manual

The user manual is a document that guides end users through the toolbox and helps them to explore its many functions. The document includes an overview, background information on the project and explains the toolbox functionality step by step.

Please find the following versions of the user manual online:

● User manual for the toolbox 3.0 in German: http://tiny.cc/l2xjlw (Retrieved 2.10.2012)

● User manual for the toolbox 2.5 in English: http://tiny.cc/rpyjlw (Retrieved 2.10.2012)

5.4 Questions & Answers

The Questions & Answers are accessible online as a Google Docs document. The aim of the document is to help end users while using the toolbox and to provide a first level of support. End users often ask similar questions that need answering to explore the benefits of the software, but the answers cannot always be integrated into the user interface; the document therefore covers some of the questions initially asked by end users. Another reason for providing this guidance externally is that the Questions & Answers document is a "living" document that can be modified by the WeGov consortium at any time.

Please find the online document here: http://tiny.cc/7u2tkw (Retrieved 2.10.2012)


5.5 WeGov Privacy Considerations

5.5.1 Background

WeGov collects user-generated content from social networking sites, along with a small amount of information about the social network-using citizens, in order to provide input for analyses that can answer questions such as "what are people talking about in my local area?". In developing tools for soliciting, harvesting, processing and storing citizens' political opinions from the social networking sites (SNS), WeGov must address significant legal and ethical considerations. During the project we have evolved a position on these considerations (described in this document), aimed primarily at respecting the privacy of social network users. The basis for our position is the comprehensive analysis [1] conducted by the University of Southampton's Internet law group, ILAWS, during the first year of the project, which we have used to determine our current practical legal and ethical position. The different types of people or organisations directly or indirectly involved in WeGov are shown in Figure 23.

Figure 23: Roles associated with WeGov

There are three main roles:

• The citizen using social networking sites (the citizen or the social network user). The citizen is not directly involved in WeGov, but their posts are collected and analysed by WeGov.


• The hoster or operator of a WeGov toolkit (the WeGov operator). This is the person or organisation that installs and runs the WeGov toolkit. They may do this for themselves, or as a service for other people.

• The user of the WeGov toolkit (the WeGov user). This is typically a governmental policy maker or researcher using the toolkit.

A key principle we are working to is that the citizens' privacy must be respected. Privacy is a fundamental human right (see e.g. Article 12 of the Universal Declaration of Human Rights, described in [2]), and respect for this is the overall guiding principle for this document. The target audience for this document is those directly involved in the WeGov toolkit: the operator and the user.

Our overall approach to data privacy is that it is the responsibility of a future operator or user of the WeGov toolkit to ensure they are collecting, storing and processing information in a fair and responsible manner. This document is intended to provide guidelines that inform them of the issues and assist them in using the toolkit whilst respecting privacy. Our position has evolved during the project based on what is practical and has been found possible, but also on the changing landscape of what is deemed acceptable in the ever-evolving field of social networking and the data protection considerations within it. This document should therefore be considered a snapshot of the project's position as it comes to an end, rather than a definitive position that can simply be used without question.

5.5.2 Key Issues

This section discusses the issues we have determined are important from the perspective of processing data from social networking sites for the purpose of political engagement. Each subsection describes an issue, its impact on the project's purpose, and our conclusion regarding it.

5.5.2.1 Personal Data

Social networking is geared towards people interacting with each other, so there is a significant probability that personal information is included in postings made on social networking sites; the processing of personal data must therefore be considered.

The UK Information Commissioner’s Office (UK ICO) defines personal data as follows: “Personal data means data which relate to a living individual who can be identified – (a) from those data, or (b) from those data and other information which is in the possession of, or is likely to come into the possession of, the data controller, and includes any expression of opinion about the individual and any indication of the intentions of the data controller or any other person in respect of the individual.”


From [3]

A working simplification that errs towards safety is: any data that can identify a person is regarded as personal data and is subject to data protection regulation. Personal data must be stored and processed in accordance with the rules specified in the data protection legislation of the country where the data is acquired and processed. At the beginning of the WeGov project, pseudo-anonymisation was investigated as a means of removing the "personal" status of data (i.e. processing it so that it is impossible to identify a person from it), but when used for our application of analysing social networking posts, this technique carries several problems:

• It is very difficult to guarantee complete anonymisation – some data may contain personal information that automated anonymisation techniques can miss. To assume the result is anonymous is potentially disastrous.

• The process of anonymisation strips out a lot of potentially useful information, sometimes to the extent of rendering the resultant data useless.

• Doing the anonymisation itself means that the data being processed is personal data, so the organisation performing the anonymisation must be compliant with data protection legislation.

Given these problems, we concluded that pseudo-anonymisation is not realistic. As a consequence, we also concluded that we must accept we are processing personal data and must therefore comply with data protection legislation. The conditions for processing personal data are defined by [3] as:

“[…] at least one of the following conditions must be met whenever you process personal data:

• The individual who the personal data is about has consented to the processing.

• The processing is necessary:

o in relation to a contract which the individual has entered into; or o because the individual has asked for something to be done so they can enter into a contract.

• The processing is necessary because of a legal obligation that applies to you (except an obligation imposed by a contract).

• The processing is necessary to protect the individual’s “vital interests”. This condition only applies in cases of life or death, such as where an individual’s medical history is disclosed to a hospital’s A&E department treating them after a serious road accident.

 WeGov Consortium

Page 132 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

• The processing is necessary for administering justice, or for exercising statutory, governmental, or other public functions.

• The processing is in accordance with the “legitimate interests” condition.”

From [3] (emphasis has been added here)

The issue of whether to acquire consent from the data subjects (the users of social networks) was initially considered by the project, but given that the project aims to extract many thousands of posts from social networks, we concluded that acquiring explicit consent from each post's author would prove intractable. We therefore investigated the other conditions for processing, and it is the last clause, "legitimate interests", that is most likely to be applicable for WeGov's purposes, because it provides the most flexibility. Legitimate interests are subject to requirements, as described by the UK ICO:

“The Data Protection Act recognises that you may have legitimate reasons for processing personal data that the other conditions for processing do not specifically deal with. The “legitimate interests” condition is intended to permit such processing, provided you meet certain requirements. The first requirement is that you must need to process the information for the purposes of your legitimate interests or for those of a third party to whom you disclose it. The second requirement, once the first has been established, is that these interests must be balanced against the interests of the individual(s) concerned. The “legitimate interests” condition will not be met if the processing is unwarranted because of its prejudicial effect on the rights and freedoms, or legitimate interests, of the individual. Your legitimate interests do not need to be in harmony with those of the individual for the condition to be met. However, where there is a serious mismatch between competing interests, the individual’s legitimate interests will come first. Finally, the processing of information under the legitimate interests condition must be fair and lawful and must comply with all the data protection principles.”

From [4] (emphasis has been added here)

In the UK ICO specialist guidelines for data protection [5], the phrase “legitimate interests” is further discussed:

“The Commissioner takes a wide view of the legitimate interests condition and recommends that two tests be applied to establish whether this condition may be appropriate in any particular case. The first is the establishment of the legitimacy of the interests pursued by the data controller or the third party to whom the data are to be disclosed and the second is whether the processing is unwarranted in any particular case by reason of prejudice to the rights and freedoms or legitimate interests of the data


subject whose interests override those of the data controller. The fact that the processing of the personal data may prejudice a particular data subject does not necessarily render the whole processing operation prejudicial to all the data subjects.”

From [5] (emphasis has been added here)

The conclusion from this is that as long as the processing qualifies as “legitimate interests”, i.e. it is lawful, legitimate and does not unduly prejudice the data subject (in our case the social network user), then processing personal data is permitted.

How we arrived at these conclusions is discussed in greater detail in [13], which is in turn derived from [1]. The conclusions are based on careful analysis of what data we are processing, together with the potential for severe legal consequences and reputational damage should anyone fail to take adequate data protection measures.

5.5.2.2 Sensitive Personal Data

Sensitive personal data may be relevant to users and operators of WeGov. The UK ICO defines sensitive personal data as follows:

“Sensitive personal data means personal data consisting of information as to - (a) the racial or ethnic origin of the data subject, (b) his political opinions, (c) his religious beliefs or other beliefs of a similar nature, (d) whether he is a member of a trade union (within the meaning of the Trade Union and Labour Relations (Consolidation) Act 1992), (e) his physical or mental health or condition, (f) his sexual life, (g) the commission or alleged commission by him of any offence, or (h) any proceedings for any offence committed or alleged to have been committed by him, the disposal of such proceedings or the sentence of any court in such proceedings.”

From [3] (emphasis has been added here)

It is clear from the above that sensitive data may be relevant to WeGov-type processing due to item (b): the data may contain political opinions. As a consequence, we concluded that the safest position is to assume we are processing sensitive personal data and to abide by the corresponding conditions for processing. Processing sensitive personal data imposes conditions over and above those for personal data. The ICO defines the conditions for processing sensitive personal data as follows [4]:


“[…] if the information is sensitive personal data, at least one of several other conditions must also be met before the processing can comply with the first data protection principle. These other conditions are as follows:

• The individual who the sensitive personal data is about has given explicit consent to the processing.

• The processing is necessary so that you can comply with employment law.

• The processing is necessary to protect the vital interests of:

o the individual (in a case where the individual’s consent cannot be given or reasonably obtained), or

o another person (in a case where the individual’s consent has been unreasonably withheld).

• The processing is carried out by a not-for-profit organisation and does not involve disclosing personal data to a third party, unless the individual consents. Extra limitations apply to this condition.

• The individual has deliberately made the information public.

• The processing is necessary in relation to legal proceedings; for obtaining legal advice; or otherwise for establishing, exercising or defending legal rights.

• The processing is necessary for administering justice, or for exercising statutory or governmental functions.

• The processing is necessary for medical purposes, and is undertaken by a health professional or by someone who is subject to an equivalent duty of confidentiality.

• The processing is necessary for monitoring equality of opportunity, and is carried out with appropriate safeguards for the rights of individuals.” From [4] (emphasis has been added here)

Most of these conditions are not applicable for our purposes; the key condition is where the individual has deliberately made their information public. This condition is related to the user’s “expectation of privacy”, discussed next.

5.5.2.3 Expectation of Privacy

In addition to the question as to whether data is regarded as personal, it is important to distinguish between those situations where citizens have a reasonable expectation of privacy, and those situations where data is clearly made publicly available and there is no expectation of privacy. There is a spectrum of privacy in many human situations that is governed by citizens’ expectation of privacy: for example, people moderate their speech depending on the situation –


a person may not say the same things when interviewed on national television that they might at home with their family. We assert that this spectrum of privacy applies equally to social networking: public Tweets are at one end of the spectrum and can be freely accessed by anyone, whereas posts on a private Facebook wall are subject to private, vetted membership and password authentication. In this second case there is clearly an expectation of privacy on the part of the Facebook user making the posts. The conclusion we draw is that we should only collect data from sources where there is no expectation of privacy on the part of the social network user, i.e. we should only collect data from sources that are obviously public.

5.5.2.4 Precedent of Related Services

The pattern WeGov uses (collection and analysis of postings on social networking sites) is already widely used by marketing departments and advertising agencies to support commercial brand development (e.g. “what is the social network buzz on the new training shoe we just launched?”). Numerous SNS management and analytics services are targeted at this sector (see for example [7], [8]). Mainstream brands are happy to use and be publicly associated with these services: as an example, one of the leading SNS analytics providers, Crimson Hexagon, advertises that Johnson & Johnson, Microsoft and Ogilvy are among its clients.

5.5.2.5 Sensitivity, Perception & Reputation

A novel feature of WeGov is the application of social network management and analytics functions to government policy-making. Even given the precedent of using social networking data for marketing purposes, political discussion carries greater inherent sensitivity, even though many of the underlying privacy and data protection issues are shared with commercial analytics. Governments have a particular need to be open and honest about their handling of public data and to avoid any impression that they are placing citizens under surveillance. It is important that users of the toolkit recognise the danger of negative publicity, negative perception and reputational damage. The major principle to observe in order to minimise this danger is to collect and process data “by the book” (mainly, to observe correct data protection procedures) and to be as open as possible about the collection and processing of data. Privacy policies promote openness in the processing of personal data: data protection procedures require privacy policies to inform the data subjects about how their data will be processed and stored.
A future operator of a WeGov service must create privacy policies and determine means by which they can be distributed or made available to the data subjects (the social network users).

5.5.2.6 Intellectual Property Rights

The intellectual property rights of most relevance to the WeGov project are likely to be copyright and the database right. There are numerous EC directives dealing with certain aspects of


copyright. In England and Wales the right is protected by the Copyright, Designs and Patents Act 1988 (CDPA).

5.5.2.6.1 Copyright

SNS postings may be regarded as copyrighted works owned by their authors (this is currently undecided in the case of tweets and comments, due to their brevity). However, it is safest to assume that copyright in posts vests with the author. The terms of use of SNS generally provide that the user grants the SNS a non-exclusive, royalty-free licence to use, copy, reproduce, process, adapt, modify, publish etc. the user’s content. Whether third parties are granted a right to use this content depends on the SNS concerned, but both Twitter and Facebook provide APIs for third parties to access the publicly-accessible postings of their users, subject to terms and conditions. The key recommendation is to check carefully the terms of the social network in question, and to confirm that collection of user-generated content is permitted and does not violate the copyright of the user that created it.

5.5.2.6.2 Database Right

In addition to the protection afforded to original databases under the UK CDPA 1988, the EC Database Directive (Directive 96/9/EC) distinguishes between original databases and non-original databases. Original databases satisfying the higher test of ‘author’s intellectual creation’ are protected as copyright works, but non-original databases may also be protected by the standalone (sui generis) database right; databases containing user-generated content, such as social networks, fall into the non-original category. The major implication of this is that any operator of WeGov must only access SNS through their official APIs and in accordance with the terms and conditions for API access. Any other automated, systematic data collection technique, such as the use of bots, spiders or screen scraping, is prohibited.
The use of SNS is bound by terms and conditions of use, typically embodied in an End User License Agreement (EULA) and often extended where access is automatic and through a software API. These terms and conditions may impose further restrictions on the collection and processing of data from SNS sites and in all cases need careful review in the context of what the policy-maker is seeking to achieve.

5.5.3 Conclusions from Key Issues

The main conclusions from the discussion above are summarised as follows.

• Collect data only from sources that are obviously public. This means that the creators of the data (e.g. social network postings) have no expectation of privacy.

• Consider that some data collected, even though it is public, may contain sensitive personal information, and therefore comply with data protection regulations for processing sensitive personal data.


• Provide privacy policies that describe to the data subject what data about them is collected, how it will be processed and how long it will be kept for.

• For each social network that is a potential data source, carefully check its API terms and confirm that collection of user-generated content is permitted and does not violate the copyright of the user that created it.

• Access the social networks through their official APIs, and comply with their terms and conditions for access.
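To illustrate the “official APIs only” conclusion, the sketch below builds a request against a documented public search endpoint rather than scraping pages. This is an assumption-laden illustration, not part of the toolbox: the endpoint and parameter names follow Twitter’s public Search API of the period, and any real deployment must verify them against the network’s current API documentation and terms.

```python
from urllib.parse import urlencode

# Assumed endpoint: Twitter's documented Search API (v1.1-era).
# Check the current API documentation before relying on this.
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"

def build_search_url(keywords, count=100):
    """Return a search URL for publicly posted tweets matching any keyword."""
    params = {
        "q": " OR ".join(keywords),   # keyword query, e.g. policy topics
        "count": count,               # results per page
        "result_type": "recent",      # most recent public posts
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)

url = build_search_url(["eGovernment", "eParticipation"])
```

Because the request goes through the documented API, collection stays within the network’s terms and conditions and only reaches content the network itself designates as public.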

5.5.4 Implementation Recommendations

This section provides recommendations on how to address the conclusions of the previous section. The recommendations broadly follow the order of the conclusions above, with the exception of compliance with social networks’ terms and conditions, which is omitted because it is specific to actual deployment situations. This section contains recommendations only, and reflects the situation at the time of writing, so they cannot be regarded as definitive or complete: the onus is on the future operator or user of a WeGov service to determine that they are behaving legitimately, legally and fairly.

5.5.4.1 Publicly Available Data

We believe that there is no requirement to obtain explicit consent from the creator of SNS content, provided that the content is posted in an obviously public place, and that the creator of the content has access to information telling them that the place they are posting to is indeed public. Therefore, each social network that is a potential data source must be examined and a judgement made about which elements of it are “obviously public” to its users. Some details and specific examples addressing the question of whether a source is “obviously public” follow.

• Comments made on public areas of social networks that are clearly intended for a mass public audience are not considered private and we believe collecting these messages for analysis is acceptable, provided that collection is via the official API and in full compliance with the API’s terms and conditions.

• The Facebook Data Use Policy [11] and the Twitter Privacy Policy [12] both clearly state the conditions under which users’ content is public. For Facebook, this is mainly when the user chooses to make some content public, or posts into a public area. For Twitter, the general situation is that content (i.e. Tweets etc.) is public.

• In the absence of a clarifying legal ruling, Twitter messages are considered to be broadcast in the public domain and we believe collecting these for analysis is acceptable. (This is consistent with the position taken by commercial SNS analytics vendors).


• Private Facebook posts, and private Facebook Groups, Pages or Walls, all carry an expectation of privacy on the users’ part, and therefore data can only be collected with the prior explicit consent of all users of the Group. This has considerable practical implications. For example, all users of an existing group must consent to the collection and processing of the group’s data, and if one user refuses, the group cannot be monitored. A newly-created group may require consent as a condition of joining, but this may limit the potential number of group members. In general, for the practical reasons concerning acquiring explicit consent, it is recommended that non-public areas of Facebook are avoided; if there is a strong case for monitoring a non-public area, explicit consent of all existing and prospective group users must be assured before proceeding.

• We believe collecting posts from public Facebook Groups, Pages or Walls is acceptable, provided that it is clear to the Facebook users that the Group, Page or Wall is indeed public. A typical case is collecting comments on a politician’s Wall, and most politicians’ Walls are obviously visible to all. For example, a Google search on “David Cameron Facebook” returns his public Wall, and the posts and comments can be viewed without logging into Facebook.

5.5.4.2 Data Protection

5.5.4.2.1 Responsible Party

Data protection requires that a legal entity takes responsibility for the processing of personal data; this entity is named the “data controller”. The ILAWS report [1] describes the data controller as follows.

“A data controller is defined within the Data Protection Directive as ‘the natural or legal person, public authority, agency or any other body which alone or jointly with others determines the purposes and means of the processing of personal data’. When considering what is meant by the term ‘determines’, factual influences should take precedence over formal requirements (i.e. contractual allocation of responsibility), as it may be the case that the contract does not reflect reality. The following questions should be asked:

• Why is the processing taking place?

• Who initiated it?

The opinion considers that the ‘purposes’ and ‘means’ can be equated to the ‘why’ and ‘how’ of the processing.” From [1]

The data controller is thus the legal person who decides that data should be processed, why it should be processed and how it should be processed. The actual processing may be done by


another party (named the “data processor”) under instruction from the data controller, but the data controller always bears responsibility for the processing. The data controller is also responsible for ensuring compliance with data protection legislation. In the WeGov situation the data controller may vary, depending on the deployment situation. There are two major situations corresponding to likely exploitation plans for WeGov:

• that where WeGov is deployed, hosted and operated by a service provider and users login and use the service (here the WeGov operator is likely to be the data controller), and

• that where the user deploys and runs the WeGov toolkit themselves (here the user is the data controller).

5.5.4.2.2 Data Protection Principles & Compliance

This section expands on material taken from [1], the ILAWS report on legal analysis, which is a publicly available version of an appendix to the WeGov deliverable D5.1 [6]. We begin by quoting the principles, and then each is discussed in turn.

The Data Protection Directive’s obligations are contained within Articles 6, 8, 10, 11, 12-15, 17 and Article 25. These have been implemented within the Data Protection Act 1998, forming the eight data protection principles. These are:

1. Personal data shall be processed fairly and lawfully and, in particular, shall not be processed unless -

a. at least one of the conditions in Schedule 2 is met, and

b. in the case of sensitive data, at least one of the conditions in Schedule 3 is also met.

2. Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.

3. Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed.

4. Personal data shall be accurate and, where necessary, kept up to date.

5. Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or purposes.

6. Personal data shall be processed in accordance with the rights of data subjects under this Act.


7. Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.

8. Personal data shall not be transferred to a country or territory outside the EEA unless that country or territory ensures an adequate level of protection for the rights and freedoms of data subjects in relation to the processing of personal data.

From [1]

The relevance of each of these principles to our application is now discussed in turn.

5.5.4.2.3 Data shall be processed lawfully, fairly, and only under permissible conditions

This principle concerns the legality and legitimacy of processing personal data, and the conditions under which it is permissible. These have already been discussed: the main factors are determining that the processing is legitimate and lawful, together with the question of whether explicit consent is required. Our conclusion is that acquiring explicit consent from all data subjects (i.e. all social networking users who make posts we want to collect) is intractable, so we must assume there is no explicit consent. In this case, we have determined that we must only collect from obviously public sources.

5.5.4.2.4 The personal data shall be collected for a stated purpose, and not processed for additional purposes

The implications for our application are:

a) The purpose for which the personal data is processed needs to be decided before the data is collected.

b) The data subjects should be informed of the data collections and the purposes of the data collection. The operator of WeGov should create and publish a privacy policy stating what information is collected, what it will be used for, what security precautions are taken, how information is maintained and removed, and a means for users to request the information stored about them. WeGov-specific samples of privacy policies are given later; guidance on writing a privacy policy is given in [10], and more general examples are given in [14].

c) Once the purpose for data processing has been determined and advertised in a privacy policy, data collected for that purpose cannot be processed for another purpose.

d) Even when there is no expectation of privacy, informing social network users about data collections is encouraged.
This can be in the form of engagement – for example if the comments to a politician’s post are analysed, the politician can post a response. This should encourage engagement and debate.

 WeGov Consortium

Page 141 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

e) WeGov provides tools to enable the following processing types; those required should be stated by the data controller in the privacy policy.

i. The themes of conversations on social networks

ii. The ebb and flow of debates on social networks (e.g. what is trending)

iii. The sentiments of citizens

iv. The key (important, influential) citizens

v. What people are saying in a local area (e.g. an MP’s constituency)

vi. Trends in themes and sentiment over the short, medium and long term.

5.5.4.2.5 The personal data shall be adequate and not excessive for the purpose of processing

This means the data controller must only collect the data required for the stated purpose, and nothing more.

a) For the WeGov analyses (given above) the data required is:

a. Social network posts, e.g. a comment on Facebook or a Tweet;

b. Basic user information, e.g. name, home area, number of followers etc.

b) As described above, the data controller should collect only where there is no expectation of privacy, so all the data collected should be publicly available.

5.5.4.2.6 Personal data shall be accurate and where necessary kept up to date

Accuracy of post information is important. Any operator or user of WeGov must not alter the content of the information collected by WeGov. Apart from being contrary to the terms of data protection, doing so would misrepresent what people are saying. That said, there should be no reason why a user of WeGov would want to alter the content of the information collected for them: to do so would result in misleading conclusions. Keeping the information up to date is not relevant for our application, because we are collecting public domain information that represents a particular moment in time. For example, we may be collecting post information that represents social network users’ opinions about a subject. The social network users have made the comments in a public forum, and we are collecting information that represents an historical snapshot. We may repeatedly search for the same subjects, so if a user modifies their opinion, this will be reflected in subsequent searches.

5.5.4.2.7 Personal data shall not be kept for longer than the needs of the stated purpose

Our major reason for data collection is to provide input to analyses. Many of these analyses are short-term, so the source data should be deleted once the analysis has run.
Other analyses may be longer term (e.g. collecting data for months or years to determine trends over long timescales), and the analyses may be run multiple times on the data, so the data will need to be kept after an analysis has been run. At the current time it is the decision of the WeGov user when to delete


data stored in the toolkit, but this decision must be consistent with the purpose for which the data has been collected. The purposes for which the data has been collected need to be stated in the privacy policy, and for each purpose a retention time or trigger event should be specified. For example, a short-term analysis of an MP’s popularity may have a two-week retention period, while a longer-term analysis of public opinion on a policy decision may require longer-term storage. In all cases, the stated purpose must describe when the data should be deleted, and this must be adhered to.

Our processing is mainly concerned with making recommendations – selecting key users or posts to watch for the WeGov user. As a result of this, it should be well understood by the WeGov operator and user that the results of WeGov analysis are likely to contain personal data (even though it is public domain). Therefore the retention policies will need to apply not only to data collected from social networks, but also to the output of analyses, unless the analyses’ output does not contain any personal information.
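One way to make such purpose-bound retention operational is to tag every data collection with its stated purpose and a deletion deadline, and to purge expired collections automatically. The sketch below illustrates this; the purpose names and retention periods are illustrative assumptions, not fixed by this document, and a real deployment would take them from its own privacy policy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Illustrative retention periods per stated purpose (assumed values).
RETENTION = {
    "short-term": timedelta(days=14),    # e.g. snapshot of an MP's popularity
    "long-term": timedelta(days=365),    # e.g. opinion on a policy decision
}

@dataclass
class Collection:
    purpose: str                         # must match a purpose in the privacy policy
    collected_at: datetime
    posts: list = field(default_factory=list)

    def is_expired(self, now):
        """True once the retention period for this purpose has passed."""
        return now - self.collected_at > RETENTION[self.purpose]

def purge_expired(collections, now):
    """Keep only collections still within their retention period."""
    return [c for c in collections if not c.is_expired(now)]
```

The same tagging would apply to analysis output that contains personal data, so that derived results are purged on the same schedule as the source posts.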

• A technical solution to assist users with data retention could be as follows. Each data collection is tagged with a purpose and a retention policy (e.g. a time limit from collection, or expiry after a certain event, such as an analysis run, has occurred). Rules or database triggers can then delete the data according to these policies.

5.5.4.2.8 Personal data shall be processed in accordance with the data subjects’ rights

The UK ICO describes the rights of data subjects as follows [9]:

• “a right of access to a copy of the information comprised in their personal data;

• a right to object to processing that is likely to cause or is causing damage or distress;

• a right to prevent processing for direct marketing;

• a right to object to decisions being taken by automated means;

• a right in certain circumstances to have inaccurate personal data rectified, blocked, erased or destroyed; and

• a right to claim compensation for damages caused by a breach of the Act.” From [9]

The requirements on the data controller are therefore to:

• Provide a clear means of contacting them to enable the processing of information access requests, objections etc.

• Provide information about a data subject when the data subject asks for it. Note: many companies charge a nominal fee for this (in the UK it is currently limited to a maximum of £10); this is to prevent companies being deluged by information access requests, which can create a lot of work and hinder their normal working practices.

• Stop processing when the data subject submits an objection.


• Correct or update data when the data subject makes a request to do so.

5.5.4.2.9 Appropriate measures shall be taken to prevent unauthorised access, altering and processing of the personal data

It is the responsibility of the data controller to determine adequate security measures, both technical and operational. WeGov is implemented and deployed using a number of standard security techniques to protect against data compromise or leakage, and to protect its users. These include:

a) The channel between the user and WeGov is secured using Secure Sockets Layer encryption (commonly known as HTTPS). This means that anything sent over this link is not visible to anyone but the user and WeGov.

b) Users are registered by a human who vets their application and makes a judgement as to the suitability and level of trust of a prospective user.

c) Users are authenticated with a username and a password that is not stored on the server. (A hash is generated and used to confirm the password when the user logs in.)

d) Each WeGov user has their own secure space. Any data collected for one user is not visible to any other WeGov user.

e) Each user’s activity is recorded at the server, to protect other users and to provide an audit trail in the event of misbehaviour.

f) The server is deployed in a de-militarised zone (DMZ) behind a firewall.

g) The server is kept patched and up to date, with frequent virus scanning.

h) The server is segregated from other servers.

Any future deployment of WeGov must include these measures as a minimum security level. Any organisation hosting a future WeGov system will have to do its own risk and threat analysis, determine its own security requirements, and determine countermeasures in addition to the standard ones above.

5.5.4.2.10 Personal data must not be transferred outside Europe unless the recipient country has equivalent protection and policies for privacy protection

It is the responsibility of the data controller to ensure this is adhered to. The first consideration is where the servers storing and hosting the WeGov system are located. Once this is determined, the question is whether to transmit the data to any third parties for processing. In many cases, the simplest solution is to process everything at the WeGov server, so there is no external transmission.
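The password handling described in item (c) of the security measures above (storing only a hash, never the password) could be implemented along the following lines. This is a generic salted-hash sketch using standard-library primitives, not the toolbox’s actual code.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=100_000):
    """Derive a salted hash; only (salt, digest) is stored on the server."""
    salt = salt or os.urandom(16)               # random per-user salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password, salt, stored_digest, iterations=100_000):
    """Re-derive the hash at login and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, stored_digest)

salt, digest = hash_password("correct horse")
```

Because only the salt and digest are stored, a compromise of the server database does not directly reveal users’ passwords, which matters here since WeGov users may reuse credentials across systems.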


5.5.5 Policies

This section provides some example privacy policies. There may in fact be two privacy policies required, because we are possibly collecting personal information from two classes of data subject:

• the social network user (the citizen), by searching on social networks; and

• the WeGov user, by providing them with user accounts and recording their searches etc.

It is recommended that the operator of WeGov creates an acceptable use policy (AUP) describing what WeGov users can and cannot do using the toolkit; each prospective user must sign up to the AUP before they are given access to the toolkit. An example is provided in this section. These are sample policies only: the onus is on the WeGov data controller to ensure that the operational policies are complete and correct.

5.5.5.1 Social Network Users’ Privacy Policy

This privacy policy sets out how the WeGov data controller uses and protects any information collected from social networks that is created by their users. WeGov is committed to ensuring that social network users’ privacy is protected. Only information that is publicly available is collected from social networks. Social network users whose data is collected by WeGov are assured that it will only be used in accordance with this privacy statement. WeGov may change this policy from time to time by updating this page. You should check this page from time to time to ensure that you are happy with any changes. This policy is effective from 19 March 2012.

5.5.5.1.1 What we collect

We may collect the following information, but only if it is in the public domain:

• Social network “post” data. Examples of a post are Tweets, Facebook posts or comments. They typically include:

o The username of the user who created the post;

o The content of the post (e.g. the text of a comment or tweet);

o The date of the post;

o Geographical information.

• Social network “user” data. This comprises the publicly-available user information as supplied by the social network. It may contain:

o The user’s ID from the social network;

o The user’s real name (this is deemed public by Facebook’s data use policy [11]);

o Gender, date of birth and home geographical location;


o Social graph information, e.g. number of followers, number following.

5.5.5.1.2 What we do with the information we gather

We require this information to understand what citizens are saying on social networks about various issues, and to determine who is being listened to. In particular, we collect data to support the WeGov user. The purposes for which we collect data may include:

• To provide the WeGov user with a keyword search facility that collects relevant comments from social network users.

• To provide the WeGov user with a tool for monitoring public Facebook groups – to collect seed posts and comments on those posts in the group.

• To provide analyses of the social network data to summarise the key themes and sentiments of the comments.

• To show key posts and key users in a debate.

• To provide analyses of user profiles to determine who are the influential users in a debate.

• To recommend places on social networks where the WeGov user may wish to make postings.

• To recommend users from social networks whom the WeGov user may wish to contact, retweet, reply to or follow.

5.5.5.1.3 Security

WeGov is committed to ensuring that social network users’ information is secure. In order to prevent unauthorised access or disclosure, we have put in place suitable physical, electronic and managerial procedures to safeguard and secure the information we collect online.

5.5.5.1.4 How we use cookies

Cookies are not used to record any information created by or about social network users in WeGov.

5.5.5.1.5 Data Retention

We will not keep post data for longer than it is required for processing and providing results to policy-makers. We have determined three retention classes for data, related to the intended purpose for which the data are collected. The WeGov user will specify the intended purpose before they collect data, and this in turn determines the retention period. After the retention period has expired, the data will be deleted from our server. The retention classes are:

• Short-term analysis. The main purpose of analyses in this class is to provide quick answers to the policy-maker about hot issues of the moment, so there is no requirement to keep the data for long periods. The data will be deleted after a period of 30 days from the date of collection.

• Medium-term analysis. The purpose of analyses in this class is to collect and analyse data over a medium term to determine trends, or to track a long-running story for example. The data in this class will be kept for a period of one year from the date of collection, and after this will be deleted. Any data derived from this that does not identify individuals may be kept indefinitely.

• Long-term analysis. This is for when the data is collected and analysed over a long period. We will retain this data for a period of five years after the collection date, after which it will be deleted. Any data derived from this that does not identify individuals may be kept indefinitely.

• Any data derived from analysis results that does not identify individuals (e.g. by name or username) may be kept indefinitely. This may be, for example, statistical summary data.

5.5.5.1.6 Controlling your personal information

We will not sell, distribute or lease your personal information to third parties unless we have your permission or are required by law to do so. You may request details of the personal information which we hold about you under the Data Protection Act 1998. A small fee will be payable. If you would like a copy of the information held on you, please write to [address] or email [email address]. If you believe that any information we are holding on you is incorrect or incomplete, please write to or email us as soon as possible at the above address. We will promptly correct any information found to be incorrect.

5.5.5.2 WeGov Users’ Privacy Policy
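The rule above, that derived data may be retained indefinitely only if it does not identify individuals, could be enforced by stripping identifying fields from analysis output before long-term storage. A minimal sketch follows; the field names are illustrative, not the toolbox’s actual schema.

```python
# Assumed field names for identifying data in an analysis record.
IDENTIFYING_FIELDS = {"user_id", "real_name", "username"}

def deidentify(record):
    """Return a copy of an analysis record with identifying fields removed."""
    return {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}

record = {"username": "@someone", "sentiment": "positive", "topic": "transport"}
summary = deidentify(record)
# 'summary' now holds only non-identifying data, so it falls into the
# "may be kept indefinitely" class; 'record' remains subject to the
# retention periods above.
```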

This privacy policy sets out how WeGov uses and protects any information that you give us when you use this website. WeGov is committed to ensuring that your privacy is protected. Should we ask you to provide certain information by which you can be identified when using this website, you can be assured that it will only be used in accordance with this privacy statement. WeGov may change this policy from time to time by updating this page. You should check this page from time to time to ensure that you are happy with any changes. This policy is effective from 19 March 2012.

5.5.5.2.1 What we collect
We may collect the following information:

• name and job title


• contact information in the form of email address

• information about your use of the WeGov toolkit, e.g. the searches you perform

5.5.5.2.2 What we do with the information we gather
We require this information to understand your needs and provide you with a better service, and in particular for the following reasons:

• Internal record keeping.

• We will provide you with login details that are linked to your user account and this will enable you to use the system.

• All your configuration of the system (e.g. searches you make, configuration of tools and widgets, etc.) will be recorded so that when you next login, you can pick up where you left off and / or view new data collected by the tools since your last login.

• Your search history is recorded so you can view previous searches and results (search results are subject to the data retention policy whereby old or out of date data is deleted).

• We may use the information to improve our products and services.

• Audit tracking: your use of WeGov (e.g. your searches, postings and analysis configurations) will be recorded for audit purposes in order to protect the users of social networking sites whose postings are collected as a result of your searches and analyses.

5.5.5.2.3 Security
We are committed to ensuring that your information is secure. In order to prevent unauthorised access or disclosure, we have put in place suitable physical, electronic and managerial procedures to safeguard and secure the information we collect online.

5.5.5.2.4 How we use cookies
A cookie is a small file which asks permission to be placed on your computer's hard drive. Once you agree, the file is added and the cookie helps analyse web traffic or lets you know when you visit a particular site. Cookies allow web applications to respond to you as an individual. The web application can tailor its operations to your needs, likes and dislikes by gathering and remembering information about your preferences. We use traffic log cookies to identify which pages are being used. This helps us analyse data about web page traffic and improve our website in order to tailor it to customer needs. We only use this information for statistical analysis purposes, and then the data is removed from the system. Overall, cookies help us provide you with a better website by enabling us to monitor which pages you find useful and which you do not. A cookie in no way gives us access to your computer or any information about you, other than the data you choose to share with us.


You can choose to accept or decline cookies. Most web browsers automatically accept cookies, but you can usually modify your browser setting to decline cookies if you prefer. This may prevent you from taking full advantage of the website.

5.5.5.2.5 Links to other websites
Our website may contain links to other websites of interest. However, once you have used these links to leave our site, you should note that we do not have any control over that other website. Therefore, we cannot be responsible for the protection and privacy of any information which you provide whilst visiting such sites, and such sites are not governed by this privacy statement. You should exercise caution and look at the privacy statement applicable to the website in question.

5.5.5.2.6 Controlling your personal information
We will not sell, distribute or lease your personal, account or audit information to third parties unless we have your permission or are required by law to do so.

You may request details of personal information which we hold about you under the Data Protection Act 1998. A small fee will be payable. If you would like a copy of the information held on you, please write to [address] or email [email address]. If you believe that any information we are holding on you is incorrect or incomplete, please write to or email us as soon as possible, at the above address. We will promptly correct any information found to be incorrect. If you wish to close your WeGov account, you may request that it be closed in writing to the administrator at [address]. All audit information regarding your account will be retained for a period of three years commencing from receipt of the request to close the account.

5.5.5.3 WeGov Users’ Acceptable Use Policy
This policy regulates what users of the WeGov system (the WeGov users) may do when they use WeGov.

• WeGov’s purpose is to enable the engagement between citizens who use social networks and governmental policy makers for their mutual benefit. All use of WeGov must be consistent with the spirit of WeGov’s purpose.

• The user shall not subvert the WeGov methods of data collection. The WeGov data collections use officially published APIs of the social networks, and the WeGov operator has ensured compliance with their terms and conditions. WeGov users shall not use any other means of data collection.

• The WeGov user understands that their activity using WeGov is recorded for audit purposes. Therefore WeGov users are encouraged to behave responsibly using WeGov.


• Any postings into social networks by the WeGov user using WeGov must be made using the WeGov user’s own account.

• Collection of social network data is only permitted from data sources that are publicly accessible. For example, this means that collection from private Facebook groups is not permitted.

• Altering the content information collected by WeGov from social networks is prohibited.

• Any unlawful use of WeGov is prohibited.

• The WeGov user shall not use the WeGov tools for the purposes of any marketing.

• The WeGov user shall not use the WeGov tools for any purpose that is detrimental to social network users, e.g. harassment, victimisation, defamation, etc.

• The WeGov user shall not attempt to attack the WeGov servers, e.g. by using hacking practices.

• The WeGov user shall not disclose their WeGov password to any third party. If the WeGov user believes their password to be compromised, they will notify the WeGov operator immediately.

5.5.6 Conclusion of WeGov Privacy Considerations
This document has described the key privacy and data protection issues facing a future operator and user of the WeGov toolkit. It has determined some major principles arising from these issues, and provided some recommendations for the implementation of an action plan to satisfy these principles.

6 General Conclusion

Continuous engagement with stakeholders involved in, or familiar with, policy making was an inherent and essential part of the WeGov project methodology. Initially, the end user partners decided to concentrate on elected policy makers at European, national and regional competence level, based on their motivation to respect their constituency’s interests within the decision-making process. Despite the busy agenda and potential volatility of this target group, WeGov managed the end-to-end involvement of a loyal core group. As the project progressed, the trial user group involved in evaluating WeGov was extended to include policy makers and administrations of local governments, political parties and NGOs. From this we can conclude that the consortium succeeded in mobilizing a diverse group of stakeholders that were able to evaluate and enrich the WeGov concepts and tools from different perspectives.

The availability of a working online prototype from the end of February 2012, as well as a 3-month extension of the project, allowed stakeholder involvement to intensify in the latter part of the project. An iterative development approach delivered three successive versions of the prototype - one of which was for internal evaluation only - that progressively integrated the stakeholders’ feedback. The process worked as a virtuous circle: the more functionality was added to the WeGov toolbox, the more visible its potential became, and the more precise and innovative the feedback was.

The end user engagement process combined different approaches. An open semi-structured interview was designed to verify the underlying assumptions made about the stakeholders’ daily work, as well as to assess how the available version of the software satisfied the previously defined requirements. Hands-on demos with relevant live data allowed more effective assessments of the usability and usefulness of the tools to take place.
As a complement to face-to-face interviews and a number of dedicated workshops, the evaluation process included several experiments to validate the reliability and accuracy of the analysis results. Firstly, the HeadsUp forum for political discussion, hosted by the Hansard Society, was chosen as an interesting case study, as it made use of a pre-existing, manually analysed data set to provide a comparison with the WeGov results. Secondly, analyses based on four weeks of intensive searches on relevant Facebook pages (e.g. leading news media) and Twitter conversations on specific themes were run on behalf of German policy makers. These analyses were then assessed against the politicians’ existing knowledge of the discussions taking place on social media, and their expectations of how people behave there.

During the course of the project the role of social media in politics increased considerably, with the Twitter explosion during the first Obama-Romney debate a case in point. However, the question of how effective social media is in gauging citizens’ opinions in the public sphere remains. Some WeGov stakeholders (e.g. MP Patrick Schnieder from the Bundestag) stated that, when it comes to the discussion of political topics, Twitter is mainly the sphere of professional journalists and politicians. During the trials, the stakeholders were indeed surprised by the low

 WeGov Consortium

Page 151 of 217

level of controversy on social media, as opinion leaders seemed to be very strong and led discussions in only one direction. Moreover, the supporters on policy makers’ social media channels are often members of the same party. Questions were therefore raised about the representativeness of social media in general, and of the social-media-savvy citizen in particular; activity on social media is, for instance, found to be more intense in urban areas.

The exponential growth of social media usage since the start of the WeGov project gave rise to an increased offering of social media monitoring tools, which made some of the trial users more aware of the unique value of WeGov and how it could address a number of gaps. Finally, some trial users, mainly public authorities, expressed the need for better data security and ethical guidelines to support them in their exploitation of social media.

As mentioned previously, the iterative development process applied between February and July 2012 was a steep learning curve for the WeGov Consortium, during which the prototype evolved considerably. The general structure of the tools was improved: specifically, the navigation structure, the categorization and configuration possibilities of widgets, and the context-sensitive presentation of search histories, to name just a few. The search volume was increased from a few hundred tweets to 1,500 Twitter posts in a single run, with the option to dramatically increase the volume through scheduled searches. The presentation of the search results increased the users’ trust, as they now see more contextual data and because it is closer to the interface that users are familiar with from popular social networking sites. The new way of presenting search results also made the tool much more dynamic, allowing social media interactions and specific analyses to be initiated from a single identified post.
The long-term searches and combined search result analysis options were refined throughout the project. The local search became more precise, to enable a good match with the territory of the policy maker’s constituency. The opportunities for multilingualism have been demonstrated: the implementation of a second language on the user interface, the automatic selection of posts from SNS, and the provision of topic and behaviour analysis in German as well as English. Additionally, the robustness of the toolkit and the storage of search and analysis data within WeGov have been continuously enhanced.

In parallel, the WeGov consortium investigated the legal and, particularly, the ethical dimension of exploiting social media data for political engagement. Personal data collected from SNS must be stored and processed in accordance with the applicable data protection legislation. Therefore, this document described the key privacy and data protection issues facing a future operator and user of the WeGov toolkit. It determined some key principles arising from the investigation of data protection and privacy issues, and provided some recommendations for the implementation of an action plan to satisfy these principles.

We are conscious that a number of gaps remain. The toolbox needs serious improvement regarding usability and user guidance, to ensure users understand the logic behind the toolbox workflow. For instance, the description of the labels (e.g. sentiment and controversy) will be very important to avoid misinterpretation. A major effort needs to be made with respect to improved guidance and explanation of the tools to ensure users trust them; this should be

Page 152 of 217

addressed primarily through greater transparency regarding the workings of the algorithms used for analyses. Users want to better understand why a post or a user is ‘key’ and why it is worth paying more attention to than others. Further in-depth empirical investigation should be carried out to validate the accuracy of the analysis components, such as the words that build a topic, the posts matching topic keywords, and the consistency of repeated analyses. Further research is also needed to assess whether the sentiment and controversy scores match real values. The toolbox is still weak in identifying trends and ‘hot discussions’, and it is not clear how these can be highlighted to users, for instance through alert functions.

The accuracy of the WeGov analysis will always remain dependent on the type and quality of the input data. Medium-length comments, good spelling, a wide variety of words used and larger data sets seem to produce the best results when analysing topics. This suggests that the topic analysis components might be more accurate and effective in analysing Facebook posts, forum or blog comments than tweets. But as we have no control over the quality of the content within social media, WeGov must find ways to cope with the specific realities and cultures of a variety of social networks.

The main recommendations that came out of the stakeholder involvement were better documentation of the features, the metrics, and how the algorithms understand and process the data. A number of analysis results could be shown in more attractive visualizations, such as word clouds or sliding scales. The engaged stakeholders also requested more freedom to configure parameters, and greater options to refine the data, such as including or excluding elements from certain analyses.
Even if WeGov is not yet a market-ready product, many potential use cases have evolved, which are described in this document, and these indicate that the WeGov concept and technology have a significant part to play in the future. The HeadsUp evaluation shows that WeGov also has applications for forums and blogs. Even though there are many tools currently available on the market to measure and analyse social media activity, policy makers still see a role for the WeGov tools in addressing existing gaps. The most important benefits are the WeGov data policy, the long-term monitoring functionality, the geographical restriction of searches on SNS, and the opportunity of creating widgets. Trial users also mentioned the benefit that the WeGov toolkit takes only a few seconds to analyse hundreds of comments, whereas human analysis can take days to produce similar results.

The tools will most likely be used by staff or supporting departments, as many political parties and assemblies appoint dedicated social media teams. As analysing social media discussions is often beyond the resources of individual politicians, small administrations and non-profit organisations, WeGov could play an important role in helping them to understand feedback across a range of communication channels. The best proof of the potential of WeGov is that users expressed a strong desire to have further access to the tools after the end of the project, and that they wished to be kept informed about, or even to be actively engaged in, the further evolution of WeGov.

 WeGov Consortium

Page 153 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012 References

[1] ILAWS – WEGOV REPORT - Appendix A to WeGov Deliverable D5.1. Available from http://www.wegov-project.eu (retrieved 2012-10-08). [2] http://en.wikipedia.org/wiki/Universal_Declaration_of_Human_Rights (retrieved 2012-08- 29). [3] UK ICO Key Definitions: http://www.ico.gov.uk/for_organisations/data_protection/the_guide/key_definitions.asp x (retrieved 2012-08-29). [4] UK ICO Conditions for Processing: http://www.ico.gov.uk/for_organisations/data_protection/the_guide/conditions_for_pro cessing.aspx (retrieved 2012-08-30). [5] UK ICO Legal Guidance of UK Data Protection Act: http://www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_gu ides/data_protection_act_legal_guidance.pdf (retrieved 2012-08-30). [6] Joshi, Somya; Karamagioli, Evika; Wandhöfer, Timo; Mutschke, Peter; Fallon, Freddy; Fletscher, Rachel; Wilson, Caroline; Nasser, Bassem I. (2010): D5.1 Scenario definition, advisory board and legal/ethical review. Also includes an asset and risk / threat and countermeasures analysis. [7] WeGov Technote T13 - SNS Management and Analytics Review. [Currently project confidential.] [8] Directory of Companies in Social Media Analysis: http://socialmediaanalysis.com/directory/ (retrieved 2012-10-08). [9] UK ICO Citizens’ Rights: http://www.ico.gov.uk/for_organisations/data_protection/the_guide/principle_6.aspx (retrieved 2012-08-30). [10] http://www.businesslink.gov.uk/Growth_and_Innovation_files/Sample3_privacy_policy.d oc (retrieved 2012-10-08). [11] Facebook Data Use Policy: http://www.facebook.com/full_data_use_policy (retrieved 2012-08-30). [12] Twitter Terms of Service: https://twitter.com/tos (retrieved 2012-08-30). [13] Beales, Richard; Taylor, Steve; Walland, Paul (2011): SNS-Based e-Participation and Cloud Computing – a Consideration of the Issues Raised, ePart 2011. [14] ICO Privacy Notices Code of Practice: http://www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_gu

 WeGov Consortium

Page 154 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

ides/privacy_notices_cop_final.pdf (retrieved 2012-08-30).

[15] Claes, Alex; Sizov, Sergej; Angeletou, Sofia; Reynolds, John; Taylor, Steven; Wandhöfer, Timo (2010): D4.2 Initial WeGov toolbox. [16] Wandhöfer, Timo; Thamm, Mark; Mutschke, Peter (2011): Extracting a basic use case to let policy-makers interact with citizens on Social Networking Sites: A report on initial results. In: Parycek, Peter; Kripp, Manuel J.; Edelmann, Noella (Eds.): CeDEM11: proceedings of the international conference on e-democracy and open government; 5-6 May 2011, Danube University Krems, Austria, Krems: Ed. Donau-Univ. Krems, S. 355-358. URL: http://works.bepress.com/cgi/viewcontent.cgi?article=1006&context=timo_wandhoefer (Retrieved 2012-10-08). [17] Wandhöfer, Timo; Thamm, Mark; Joshi, Somya (2011): “Politician2.0 on Facebook: Information Behaviour and Dissemination on Social Networking Sites – Gaps and Best- Practice. Evaluation Results of a novel eParticipation toolbox to let politicians engage with citizens online.” In: JeDEM - eJournal of eDemocracy and Open Government, 2011 / Vol 3, No 2, S. 207-215. URL: http://www.jedem.org/article/view/78 (Retrieved 2012-10-08) [18] Williamson, A. (2009) MPs Online: Connecting with Constituents (London: Hansard Society). [19] Williamson, Andy; Miller, Laura; Fallon, Freddy (2010) Behind the Digital Campaign: An exploration of the use, impact and regulation of digital campaigning (London: Hansard Society). [20] Atteslander, Peter (2003): Methoden der empirischen Sozialforschung. 10. Aufl. Walter de Gruyter, Berlin 2003, S. 215-249. [21] Hegner, Marcus (2003): Methoden zur Evaluation von Software, IZ-Arbeitsbericht Nr. 29. (Hrsg.) Informationszentrum Sozialwissenschaften der Arbeitsgemeinschaft Sozialwissenschaftlicher Institute e.V. 
(ASI) URL: http://www.gesis.org/fileadmin/upload/forschung/publikationen/gesis_reihen/iz_arbeits berichte/ab_29.pdf (retrieved 2012-10-08) [22] Joshi, Somya; Wandhöfer, Timo; Thamm, Mark; Mathiak, Brigitte; Van Eeckhaute, Catherine (2011): “Rethinking governance via social networking: the case of direct vs. indirect stakeholder injection.” In: Estevez, Elsa; Janssen, Marijn; United Nations University; Delft University of Technology (Eds.): ICEGOV 2011: Proceedings of 5th International conference on theory and practice of electronic governance. ACM international conference proceedings series, New York : ACM Press, p. 429. [23] Mochmann, Ekkehard (1994): “Inhaltsanalyse” In: Kriz, Jürgen/Nohlen, D./Schultze, R.O. (Hrsg.): Lexikon der Politik. Band 2: Politikwissenschaftliche Methoden. München 1994, Verlag C.H. Beck, ISBN 3406369049; Seite 184-187

 WeGov Consortium

Page 155 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

[24] Nielsen, Jakob (2003): Usability-Engineering, Morgan Kaufmann Publishers Inc. San Francisco, CA, USA ©1993. ISBN:0125184050 [25] Nielsen; Mack (1994): Usability Inspection methods. In: Proceeding CHI '94 Conference companion on Human factors in computing systems. ACM New York, NY, USA ©1994. ISBN:0-89791-651-4. doi: 10.1145/259963.260531 [26] Rowe, M., Angeletou, S. and Alani, H. (2011) Anticipating Discussion Activity on Community Forums, 3rd IEEE International Conference on Social Computing, Boston, USA.

[27] Rowe, M., Angeletou, S. and Alani, H. (2011) Predicting Discussions on the Social Semantic Web, Extended Semantic Web Conference 2011, Heraklion, Crete. [28] Angeletou, S., Rowe, M. and Alani, H. (2011) Modelling and Analysis of User Behaviour in Online Communities, 10th International Semantic Web Conference (ISWC 2011), Bonn, Germany. [29] Wilson (1982): "Teil 3: Aus dem Leben der Forschung. Qualitative "oder" quantitative Methoden in der Sozialforschung". In: Kölner Zeitschrift für Soziologie und Sozialpsychologie, Jg. 34, 1982, S. 487-508. [30] Langer (1985): Das persönliche Gespräch als Weg in der psychologischen Forschung. In: Zeitschrift für personenzentrierte Psychologie und Psychotherapie, 4, S. 447-457 [31] Littig (2008): Interviews mit Eliten - Interviews mit ExpertInnen: Gibt es Unterschiede? [37 Absätze]. Forum Qualitative Sozialforschung / Forum: Qualitative Sozialforschung Research, 9(3), Art. 16, http://nbn-resolving.de/urn:nbn:de:0114-fqs0803161 (Abgerufen im Juli 2012) [32] Mayring, Philipp (1990): Einführung in die qualitative Sozialforschung. ISBN 3-621-27095- 7. Psychologie Verlagsunion [33] Mayring, Lamnek (1988): Qualitative Sozialforschung. Band 1 Methodologie. Psychologie Verlags Union, München und Weinheim, 1988. ISBN 3-621-27055-8 [34] Kluge (2000): Empirisch begründete Typenbildung in der qualitativen Sozialforschung. In: Forum Qualitative Sozialforschung. Vol. 1, Nr. 1, Art. 14. PID: http://nbn- resolving.de/urn:nbn:de:0114-fqs0001145 Abgerufen im Juli 2012) [35] Wandhöfer, Timo (2012a): Approaches for validating automatic Analytic Tool results on social networking data for its Exploitation within Politicians' everyday Workflow. General Online Research 2012 - GOR 2012. Mannheim, 05.-07.03. 2012. URL: http://tiny.cc/i2uygw (Abgerufen im Juli 2012) [36] Website heise news. Article: State Sachsen wants to monitor social neworks. 
URL: http://www.heise.de/newsticker/meldung/Sachsen-will-Software-zur-Beobachtung- sozialer-Netze-einsetzen-1663567.html (Retrieved 2012-10-09) [37] Fernandez, Miriam; Wandhöfer, Timo (2012): WeGov: where eGovernment meets the

 WeGov Consortium

Page 156 of 217

D5.3 Evaluation of the final WeGov Toolbox 19 October 2012

eSociety. In: Electronic Journal of E-Government – EJEG, Vol. 9, Issue 2.

[38] Joshi, Somya; Wandhöfer, Timo; Koulolias, Vasilis; Van Eeckhaute, Catherine; Allen, Beccy; Taylor, Steve (2012): Paradox of Proximity – Trust & Provenance within the context of Social Networks & Policy. In: Proceedings of The 4th International Conference on Social Informatics, 5–7 December 2012, p. 14. [39] Geana, Ruxandra; Taylor, Steve; Wandhöfer, Timo (2012 - erscheint): Bringing citizens' opinions to Members of Parliament: the Newspaper Story. In: CeDEM12 Conference Proceedings "Die Zukunft der digitalen Gesellschaft " [40] Wandhöfer, Timo; Taylor, Steve; Alani, Harith; Joshi, Somya; Sizov, Sergej; Walland, Paul; Thamm, Mark; Bleier, Arnim; Mutschke, Peter (2012c): Engaging politicians with citizens on social networking sites: the WeGov Toolbox. In: International Journal of Electronic Government Research (IJEGR), Vol. 8/ No. 3, S. 22-43. [41] Wandhöfer, Timo; Thamm, Mark; Joshi, Somya (2011b): Politician2.0 on Facebook: information behaviour and dissemination on social networking sites – gaps and Best- practice; evaluation results of a novel eParticipation toolbox to let politicians engage with citizens online. In: JeDEM - eJournal of eDemocracy and Open Government, Vol 3, No 2, S. 207-215. [42] Wandhöfer, Timo; Van Eeckhaute, Catherine; Taylor, Steve; Fernandez, Miriam (2012b): WeGov analysis tools to connect policy makers with citizens online. In: Proceedings of the tGov Conference, May 8th – 9th 2012, Brunel University, University Kingdom, p. 7. [43] Wandhöfer, Timo; Thamm, Mark; Mutschke, Peter (2011a): Extracting a basic use case to let policy makers interact with citizens on Social Networking Sites: a report on initial results. In: Parycek, Peter; Kripp, Manuel J.; Edelmann, Noella (Hrsg.): CeDEM11 : proceedings of the international conference on e-democracy and open government ; 5-6 May 2011, Danube University Krems, Austria, Krems: Ed. Donau-Univ. Krems, S. 355-358. 
[44] González-Ibáñez, Roberto; Muresan , Smaranda; Wacholder Nina (2011): Identifying Sarcasm in Twitter: A Closer Look, Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics:shortpapers, pages 581–586, Portland, Oregon, June 19-24, 2011.

 WeGov Consortium

Page 157 of 217

A. Pre-Testing – Toolbox 2.5

During the three weeks prior to the external release, the user engagement partners performed in-depth tests to eliminate a number of technical shortcomings and to ensure that the potential of the technical features was fully understandable to policy makers.

Figure 24: Toolbox 2.5 - evaluation cycle pre-testing

Results

The general wish list that resulted from the previous end user engagement phase was to have:

• A focus on something really local (Locations, Recent Local Posts, Main Local Topics, Trending Now)

• Possibilities of personal tracking (Peerindex, any search or analysis for a name of the policymaker, event or place relevant to the policymaker)

• Something that helps make a decision (Main Topics, User Roles, User with the Role, Trending Now, Main Local Topics, Peerindex)

• Something that helps find people to contact, issues to address, which discussions to join, etc.

The major improvements that resulted from internal testing were the following:

Toolbox part: Overall concern
User partner’s concern: Having the possibility to drill down from the general analysis results to the originating posts.
Improvements made to the toolbox: Hash tags and links are highlighted and made clickable everywhere. A number of widgets receive a clickable title (recent local search on Twitter, all Facebook widgets), giving access to an underlying results page showing all retrieved posts. A new widget has been added that helps to find all individual users with a particular role (based on the behaviour analysis user roles) after a search on Twitter - for example, Broadcasters for "politics" or Information Sources on "government".

Toolbox part: Home page
User partner’s concern: The Facebook widgets rely on the user knowing in advance the correct ID for the user that they are interested in, but it is pretty difficult (or impossible) for users to find the ID of the user account that they want to monitor. Even if you search on Facebook, you can see the individuals, but the IDs are not displayed.
Improvements made to the toolbox: Facebook post IDs that are required for some Facebook widgets to work will be displayed on the results page for Facebook posts.

Toolbox part: Home page
User partner’s concern: Possibility to search recent local posts without a specific search term, to understand main local concerns at a certain point in time.
Improvements made to the toolbox: The development team has not found a way to do general local Twitter searches without a search term. However, a new widget has been added that shows local trends on Twitter for a location. The locations that return trending topics are unfortunately very limited because of the way Twitter implemented this feature.

Toolbox part: Home page
User partner’s concern: Give the policy maker additional tools that help to track their online reputation.
Improvements made to the toolbox: The dashboard has the possibility to connect and use services which are not part of the project; Peerindex is one of them.

Toolbox part: Home page
User partner’s concern: Users might have trouble making a distinction between the roles and functions of the different types of widgets offered.
Improvements made to the toolbox: A differentiation in the display of widgets according to their role will be implemented in prototype 2.6.

Toolbox part: Search page - Behaviour analysis tab
User partner’s concern: The activity diagram as part of the behaviour analysis is very useful, but the results are not interesting given the current restriction of retrieving 99 tweets in a single search.
Improvements made to the toolbox: The maximum number of retrieved posts will be substantially increased in prototype 2.6.

Toolbox part: Search page
User partner’s concern: There is a need to collect data on a topic over time - a process (scheduled activity, automatic collection) that collects posts for a specified period of time (such as a week) - and to run the activity diagram within the widget to show whether a topic is getting hotter or colder.
Improvements made to the toolbox: Scheduled runs will be more flexible in prototype 2.6.

Toolbox part: Search page - Topic analysis tab
User partner’s concern: Results need to be explained and presented a lot better; word stems are difficult to understand.
Improvements made to the toolbox: The proposed fixes to be integrated in the next versions: replace stems with the most common keyword or topic having that stem; make the number of terms per topic variable, either by user configuration or by the tool automatically determining the most relevant number.

Toolbox part: Search page - Topic analysis and Behaviour analysis tabs
User partner’s concern: The “key posts” and top users resulting from the topic analysis, and the “Top 5 users to watch” and “Top 5 posts to watch” resulting from the behaviour analysis, are characterised by scores, but the user has no explanation of the method used to compute why some users or posts deserve more attention than others.
Improvements made to the toolbox: A user guide and a Frequently Asked Questions document provide the user with an easy-to-understand explanation of the algorithms used to compute the scores.

Table 21: Toolbox 2.5 - pre evaluation results


B. Material – Toolbox 2.5

B.1 Semi-structured Interview


B.2 Hand-out for the German Bundestag

Evaluation of the WeGov Toolbox Version 2.5

Aim of the WeGov project

The aim of the WeGov project1 is to connect politics with citizens on social networks. The WeGov Toolbox is a website offering analysis capabilities for social networks to support policy makers, and it provides new opportunities:

 to use the strengths of social networks for the political decision-making process;
 to connect with citizens through the social web and gain insight into a range of opinions;
 to identify citizens' opinions that can influence the decision-making process.

Aim of the evaluation

The aim of this evaluation is to have users assess the current analysis components (page 3 ff.) as well as their integration within the WeGov Toolbox. All feedback feeds both into the software development process and into the final assessment of the feasibility study within the WeGov research project. WeGov is in its final phase and will present the final software at the end of the project in September.

Next step: assessment of the analysis results

Besides the WeGov Toolbox as a whole, we would like to assess the quality and possible uses of the analysis results in detail. The idea is: WeGov monitors Facebook groups and Twitter topics for you over a period of time, and you then assess the relevance of the results. Possibilities for comparison are shown on the following pages, for example:

 topics from a Facebook group relevant to your constituency;
 Twitter profiles matching your field (e.g. internet policy).

Contact person

Dipl.-Inf. Timo Wandhöfer
GESIS – Leibniz Institute for the Social Sciences
Unter Sachsenhausen 6-8, 50667 Cologne
Phone: +49 (0)221 47694-544
E-mail: [email protected]

1 http://www.wegov-project.eu (retrieved June 2012)


WeGov Toolbox

The WeGov Toolbox is divided into a home page with personalised analysis widgets for a "quick" overview, and a search page with more detailed search and analysis options. The home page (cf. Figure 1) contains various function widgets, such as Google Maps for geographically constraining search results, WeGov analysis components, and the ability to integrate third-party functions. The WeGov search page (cf. Figure 1) currently allows searching on Twitter as well as analysing the search results.

 Which function widgets are helpful to you? Why?
 Which function widgets are missing?

Figure 1: WeGov Toolbox with home page (left) and search page (right)

The WeGov Toolbox currently uses Facebook and Twitter as sources for analysis. Figure 2 shows the fan page of Angela Merkel (left) and a Twitter search query for Energiewende (right). This content is used for the analyses on the following pages.

Figure 2: Facebook fan page of Angela Merkel (left), Twitter search for Energiewende (right)

 Which content do you use on social networks?
 How and for what purpose do you use social networks?
 Which criteria are important to you?


Topic Detection Analysis

Within online discussions a wide variety of topics are discussed, which can easily be identified with the help of topic detection. Figure 3 shows topics identified for the fan page of Angela Merkel, and Figure 4 the topics discussed within tweets on Energiewende. Annex A shows a detailed view of the analysis.

Possible applications on Facebook

Figure 3: Analysis of all posts of the Angela Merkel fan page (left), analysis of a selected post (right)

Possible applications on Twitter

Figure 4: Analysis of tweets on Energiewende (left), analysis of the latest local tweets on Energiewende (right)

Usage scenario
 For what purpose would you use the analysis results?
 Are the results suitable for "closeness to citizens", "dialogue" and "citizen participation"?
 What problems do you have when using social networks? What do you find particularly annoying or time-consuming?
 How can we support your work with this analysis?
 Do you currently carry out similar analyses on social networks? Which ones?

Quality assurance
 Can your own analysis results be compared with the WeGov results?
 Which Facebook groups or topics would you assess?


Discussion Activity Analysis

The Toolbox identifies tweets and users that are expected to trigger the greatest reaction in the future. Figure 5 shows, for the search query Energiewende, the five most relevant tweets and profiles on Twitter.

Possible applications on Twitter

Figure 5: Tweets (left) and Twitter profiles (right) with great influence – Energiewende analysis

Usage scenario
 For what purpose would you use the analysis results?
 Are the results suitable for "closeness to citizens", "dialogue" or "citizen participation"?
 How can we support your work?
 Do you currently carry out similar analyses on social networks? Which ones?

Quality assurance
 Can your own analysis results be compared with the WeGov results?
 Which topics would you assess?


User Behaviour Analysis

Within social networks, actors take on different roles, and the Toolbox identifies users according to their respective roles. One important role, for example, is users with a large reach. Figure 6 shows a pie chart for the Twitter search query Energiewende, displaying five roles; the proportionally largest role is users who tweet daily. The right-hand part of Figure 6 shows three users with a large reach for the Energiewende analysis.

Possible applications on Twitter

Figure 6: Analysis of the set of tweets on Energiewende

Usage scenario
 Do you use Twitter to gather or disseminate information?
 For what purpose would you use the analysis results?
 Are the results suitable for "closeness to citizens", "dialogue" or "citizen participation"?
 What would your ideal results look like?

Quality assurance
 Can your own analysis results be compared with the WeGov results?
 Which topics would you assess?


Content Relevant for Analysis

Which Facebook pages are relevant? e.g. https://www.facebook.com/GegenAtomkraft

Which topics are relevant? Which region is relevant? e.g. Energiewende (region: Dormagen)

What other content is relevant? e.g. a YouTube video


C. Pre-Testing – Toolbox 2.6

Figure 25: Toolbox 2.6 - evaluation cycle pre-testing

The pre-evaluation consisted of thorough testing of each functionality.

Results

The usability and functionality test had the following results:

Toolbox part | User partner's concern | Improvements made to the toolbox

Advanced search | Facebook pages with ä, ö, ü do not work | Prototype 2.6 public is improved to handle German characters

Home page | User guidance remains poor. If the home page is not pre-configured it is very empty and users do not know how to start. | User tips are not addressed as a priority in the development of the next toolbox version; they are addressed by a standalone FAQ list.

Home page | It remains hard for end users to find the ID needed to feed the Facebook post widget | –

Advanced analysis | Computed scores (valence, controversy, score for key posts, topic distances) should be documented. What do they conceptually mean and how are they computed? | This is addressed in the FAQ document, and in prototype 3.0 more emphasis will be put on a graphical scale presentation, which will improve readability

Advanced analysis | Key users and key posts retrieved from topic analysis and behaviour analysis should be represented as links to allow retrieval of the original post | Realised in prototype 3.0

Advanced search | Topic groups and their scores on the "Further analysis" page do not correspond to those displayed in the main topic analysis page | This will be solved by a different presentation of the topic analysis result in version 3.0

Topic analysis | Impossible to navigate back to the main topic analysis page from the "Further analysis" page | –

Table 22: Toolbox 2.6 – results

Overall, this could be seen as a perfectly acceptable outcome.

Version 2.6

Launch 30.7.2012

Evaluation This version has not been evaluated by external end users because the final toolbox was launched four weeks later.

Features Updates and improvements made in response to internal evaluation, including:

● Removed hard-wired number of topics; this was set to 5 – now displays as many as the analysis decides
● Added radius parameter for location-based searches
● Improved robustness/reliability of both Facebook and Twitter searches
● Facebook search now limited to max 5000 posts (some pages can return >40k status messages, for example)
● Re-organised search page parameters
  ○ SNS selection at top
  ○ Location/language not displayed for Facebook
  ○ posts option removed
  ○ new "What / how much to collect" panel
● Only show "Get comments in new widget" link if there are >0 comments for a post
● Handling of Facebook group URLs with German characters (e.g. http://www.facebook.com/pages/BILD-D%C3%BCsseldorf/191500460889083)

Table 23: Toolbox 2.6 – version overview


D. Pre-Testing – Toolbox 3.0

The pre-evaluation took place from 18.8.2012 to 27.8.2012, mainly with the WeGov end-user partners. The following table lists the characteristics of the pre-release version 3.0 and the features that were developed after the launch of toolbox 2.6.

Figure 26: Toolbox 3.0 - evaluation cycle pre-testing

Reported Bugs

The gaps that have been identified are listed within the following table.

Location | Reported | Description | Fixed

analysis | – | There is a maximum number of tweets the analysis can handle. Searches with approx. 23,000 tweets are too big to run the analysis. | NO

widgets | 29.8.2012 | Widgets don't appear (when attempting to use any Local Tweets widgets) | 6.9.2012

widget | 29.8.2012 | The "change" option for the current location (in My Locations widget) has been removed | 6.9.2012

search history | 29.8.2012 | Strange display problem with the paging options (e.g. bottom of Searches History) | fixed 6.9.2012

analysis | 22.8.2012 | NICE TO HAVE: Each analysis is based on a set of posts/comments/tweets. It would be very helpful to show this input number of documents, because it helps the end user to assess the results. | NO

search history | 22.8.2012 | The search history shows 10 items per page. These items can be bunched for the analysis. It is often the case that end users want to bunch items from several pages; for instance, if the end user has scheduled 30 searches there are 3 pages of list items. Quick win: can we show all searches on one page? | 12.9.2012


topic analysis | 22.8.2012 | Where have the key users gone? They are very important at the level of single documents, so we should show the top 3 key users again. | YES

analysis | 22.8.2012 | The items within the analysis history are difficult to distinguish. Might it be possible for the user to label the items manually? | NO

analysis | 22.8.2012 | NICE TO HAVE: The main users of the toolbox are the researchers doing analysis for the MP. It would be very beneficial if the researcher could send the MP a link that refers to an analysis within the analysis history. | NO

discussion activity | 20.8.2012 | The following screenshot shows four dots where the graph is plotted. Why is the number only 4, and what are the criteria for how many dots are selected? Within a long-term collection I would expect more than 4 dots. | YES


E. Material – Validation of Analysis Results with End Users

E.1 Constituencies and Locations

Participant | Constituency | Map

Bundestag 1 (Office) | Heimweiler (Wahlkreis 202 Kreuznach, Rheinland-Pfalz, radius 40 km) | Source30

Bundestag 2 (Office) | Visselhövede (Wahlkreis 036 Rotenburg 1 – Soltau-Fallingbostel, Niedersachsen, radius 50 km; radius increased to 80 km) | Source31

Bundestag 3 (MP) | Neidenbach (Wahlkreis 203 Bitburg, Rheinland-Pfalz, radius 50 km) | Source32

30 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=202 (Retrieved on 31.07.2012)

31 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=036 (Retrieved on 31.07.2012)

32 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=203 (Retrieved on 31.07.2012)


Bundestag 4 (Office) | Kirchworbis (Wahlkreis 189 Eichsfeld – Nordhausen – Unstrut-Hainich-Kreis 1, Thüringen, radius 40 km) | Source33

Bundestag 5 (Office) | Friedrichshain-Kreuzberg (Wahlkreis 084 Berlin-Friedrichshain – Kreuzberg – Prenzlauer Berg Ost, Berlin, radius 6 km) | Source34

Bundestag 6 (MP) | Groß-Gerau (Wahlkreis 184 Groß-Gerau, Hessen, radius 20 km) | Source35

State Parl. 1 (MP) | Düsseldorf (Wahlkreis 42 – Düsseldorf 3, NRW, radius 10 km)

33 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=189 (Retrieved on 31.07.2012)

34 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=084 (Retrieved on 31.07.2012)

35 URL: http://www.bundestag.de/bundestag/wahlen/wahlkreise09/index.html?wknr=184 (Retrieved on 31.07.2012)


State Parl. 2 (MP) | Bielefeld (Wahlkreis 93 Brackwede, Heepen, Stieghorst, Sennestadt, Senne, radius 15 km)

City of Cologne 1 | Köln (radius 20 km)

City of Cologne 2 |

City of Kempten | Kempten (radius 10 km)

Table 24: Locations (constituencies and areas)


E.2 Analysis Report – Sample


E.3 Questionnaire – Sample


E.4 Semi-structured Interview


F. Material – HeadsUp: Topic Analysis Evaluation

F.1 Sex Education report – run 124

Topics

In this debate there were a small number of quite lengthy and detailed comments. This potentially meant that each post dealt with multiple issues, making it harder to group them into distinct topic groups. The report highlighted quite nuanced themes, for example many different elements of sex education such as who should teach it and what information young people need to know. The toolkit matched the themes in the report well, but the differences between the topics were not always apparent just by looking at the key words, particularly as many of the key words were repeated across the topic groups. Further investigation of the posts within the groups was necessary to gain a better understanding of the discussion.

Only one theme from the report was not closely matched in the analyser – this is because it was the issue of sex and relationship education in schools, which as a theme was mentioned throughout the topic groups. The toolkit highlighted an extra topic group about pornography and the degradation of women that in the report had been included in a different theme (the sexualisation of society). The two posts contained in this group were very focused on this specific issue and were quite distinct from the others, which could have placed a different focus on the analysis.

The grouping of the key words is really important when looking for themes. Simply seeing a list of the most frequently appearing key words does not give the user enough context or show how the words are related. Some words are given meaning they would not have without other key words in the group, e.g. the words 'age' and 'consent' do not mean much individually, but together we better understand what is being discussed.

In this forum the analyser excluded 39% of posts from the analysis – there were 36 posts in total but only 22 appeared in the topic analysis. This is quite a high percentage of posts to be excluded from the final analysis and therefore from the report. This may also have had the effect of including key words in topic groups that were not contained in the posts below, e.g. in this analysis the word 'abortion' appears among the key words of one topic group but the posts contained in that group do not include that word. The word only appears twice in the dataset and once in the analyser, in a group discussing teen pregnancy.

Key users

The key users and key posts in the analyser matched very well with the quotes from the report. Most of the posts appearing in this small forum also appeared in the report.


Sentiment – report level

The topic groups that the analyser highlighted as negative included discussions on teenage pregnancy, sex education in school, the age of consent and the degrading effect of pornography on women. The sentiment analysis at report level worked well for this forum. The analyser differentiated between the discussions of current sex education provision being poor and the positive suggestions for improvements. This matched the human interpretation of the debate well, as the quotes from the report show:

There were few posters who were satisfied with the current state of content and delivery of SRE [Sex and Relationship Education] as it stands in the UK.

The strategies of how to deliver SRE [Sex and Relationship Education] varied but there were lots of suggestions about how it could be improved.

Sentiment – post level

With this forum all posts were tested for accuracy as the forum was so small. Of the 23 posts that appeared in the sentiment analysis, 16 were judged as accurate and 7 as allocated inaccurately. This gives an accuracy rate of 70%. The toolkit does not deal well with posters making suggestions that include negative words such as 'not', 'worse', 'afraid' etc., even if the context in which they are used is positive.


Figure 27: posts showing incorrect analysis of sentiment

Above is an example of the toolkit getting the sentiment wrong. The human analysis of the words highlights 24 positive words and 19 negative words, so the comment should be significantly more positive than the toolkit suggests. Even if the toolkit does not understand the context of the individual's words, the number of positive words versus negative words suggests that the comment should have been categorised as positive. Why hasn't it?

It is very difficult to judge whether comments are positive or negative; most are not at the extreme edges of debate, so the degree to which the toolkit sees comments as positive or negative is also important for judging accuracy. Below is an example of where the toolkit got this complexity right:

Figure 28: posts showing correct analysis of sentiment

The examples here perhaps show the difficulty that the toolkit has with longer posts that deal with multiple issues and changes in sentiment within a single comment. The shorter, more focused posts (above) were sorted very accurately and the toolkit was good in this instance at understanding degrees of sentiment. The first post above is much more definite and passionately argued than the second which is less entrenched in the negative opinion.
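The word-counting comparison above can be made concrete with a toy lexicon-based scorer. This is an illustrative sketch only – WeGov's actual sentiment component is not documented here, and all of the word lists below are invented – but it reproduces the failure mode described: a constructive suggestion phrased with words like 'not' and 'boring' comes out negative.

```python
import re

# Invented toy lexicons for illustration; not WeGov's actual word lists.
POSITIVE = {"good", "improve", "better", "helpful", "support"}
NEGATIVE = {"bad", "worse", "afraid", "poor", "boring"}
NEGATIONS = {"not", "no", "never"}  # counted as negative, context ignored

def naive_sentiment(text: str) -> float:
    """(#positive - #negative) / #sentiment-bearing words, ignoring context."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE or w in NEGATIONS for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total

# A constructive suggestion is mis-scored because 'not' and 'boring'
# outweigh 'improve', even though the intent is positive:
print(naive_sentiment("Lessons should not be boring; teachers can improve them"))
```

A scorer of this kind cannot distinguish "not boring" from "boring", which matches the observation that suggestions containing 'not', 'worse' or 'afraid' were allocated inaccurately even when the context was positive.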


F.2 Youth Citizenship report – run 127

Topics

This forum fared less well than the other two that have been analysed, with only half the topics appearing in both the analyser and the report. One of the issues in this forum was the quality of the data. Many of the posts were short, repeated very similar sentiments, and there were some misspelled words that could have affected the analysis, e.g. 'polititions' instead of 'politicians' and 'polotics' instead of 'politics'.

Many of the key words in the topic groups are repeated or are words which do not help us to understand the nuances of the debate, e.g. 'think', 'say'. Words like 'boring' or 'politicians' or 'MP', which might help us to understand what the forum discussion was about (and were clearly mentioned many times from looking at the posts and key words), do not appear in the key words. More investigation of the individual posts is necessary with this forum, compared to the others analysed, to get an overview of the debate.

Some of the topics that the toolkit highlighted that were not focused on in the report were still interesting and relevant, providing a different perspective on the debate. For example, the age at which people should be allowed to start voting in elections was discussed widely but did not appear as a separate theme in the report. Another example that did not feature in the report, perhaps because it was too local to be relevant to a national debate, was the lack of youth clubs and activities for young people in the London borough of Harrow.

In this forum the key words from the frequency search were not that helpful, as a small number of words appeared so often; the words 'politics', 'young', 'people', 'think' appeared a total of 926 times throughout the comments. The other words were mostly short connecting words that did not give a flavour of the debate at all. In this respect the analyser, although not matching exactly with the report, provided a much more easily understandable overview of the topics discussed within this debate.

In this forum the analyser excluded 10% of posts from the analysis – there were 317 posts in total with 286 appearing in the topic analysis. As the posts were shorter and less diverse than in some of the forums, this may mean that more of them fitted easily into a topic group.

Key users

The analyser had more difficulty with this forum than most, perhaps due to the quality of the data (short posts, poor diversity of views, poor spelling). Most of the key users and key posts brought up in the analyser did not match the report, although it consistently highlighted some of the better, more in-depth and thoughtful posts from this forum.

Sentiment – report level

The Youth Citizenship Commission forum was very negative in terms of young people's reactions to politics and politicians, as this quote from the report shows:


HeadsUp users overwhelmingly said that they found politics boring and too complicated… Politicians drew many criticisms from young people for being out of touch, not listening, not speaking simply enough, setting a bad example, not visiting young people and blaming them for all society's ills!

As the analyser only rates half the topics as negative (three out of six), and the average sentiment scores for those topics are only -0.03, -0.10 and -0.31, this does not seem to compare well with the report. One of the more positive themes in this debate was the issue of volunteering, which the toolkit rates as the most positive, which is accurate. However, the next most positive topic group is characterised by the words 'politics', 'people', 'think', 'young', 'don' (probably meaning 'don't'). Politics and how it relates to young people was one of the areas where much of the negativity was focused in the forum, and a cursory look through the messages suggests that the comments were not particularly positive. A number seemed to have been categorised as neutral when they should have been negative. Most of the comments in this group were longer and more detailed than the average in the forum – so poor quality data does not appear to account for the inaccuracies of sentiment in this topic group.

Sentiment – individual level

The sentiment was more accurate when dealing with the most positive topic group, about volunteering. However, the most negative topic group contained mostly posts that were a sentence or shorter in length, whereas the positive topic group contained better quality data. The spelling was quite poor across this forum, but the negative topic group contained worse spelling and more text speak than the positive topic group. Of the 82 posts that were analysed, 59 were judged as accurate and 23 as allocated inaccurately. This gives an accuracy rate of 72%.

F.3 Equality report – run 126

Topics

The equality forum report was quite closely matched with the analyser, but there were three themes from the report that did not appear in the analyser: women's representation in Parliament, university fees and Dale Farm. Almost all the topic groups were understandable at first glance and were sufficiently different from one another to identify separate themes. Considering how much data there was, the analyser made it easier to see the themes – this report (as the others do) does not just pick up on the areas of most debate but the range of issues debated. This is why a topic like Dale Farm may not have been highlighted in the analyser


– it was included in the report partly because it was topical and not necessarily because it was the most hotly debated issue in the forum.

The topics that the analyser highlighted independently of the report were still useful. It split a debate on women's sport into sub-themes that were more detailed than the explanation given in the report. The debate on women's sport was made up of at least 300 comments, so it is understandable that the analyser split it into different groups. One group of comments highlights the conversation around equality in sport, both genders being able to compete on a level playing field and whether mixed sport is desirable. Although there was some overlap, the second group of comments focused much more on the types of sports men and women play and whether this unequal division between sports means that they are sexist.

In this forum the key words from the frequency search were not that helpful, as so many small and joining words appeared so often: 'think', 'you', 'should', 'agree' etc. The analyser provided a good overview of the topics discussed within this debate. This is particularly helpful when there are so many comments, as it is very difficult to analyse them manually. It is likely that the sheer number of comments made these results much more effective as an overview of the debate.

However, in this forum there were many short comments (one sentence or less) simply agreeing with or supporting another poster's point of view. As the analyser does not take into consideration the relationship between posts (in terms of their being replies or new posts), these kinds of comments could actually hinder the understanding of the topics being analysed. Excluding such short comments would have made the analysis more understandable and would be helpful where there are large numbers of comments that are unlikely to all be read and analysed manually.

In this forum the analyser excluded 10% of posts from the analysis – there were 1186 posts in total with 1062 appearing in the topic analysis. As there were many short agreement/supportive posts, it would be interesting to know whether it was these, rather than more substantive posts, that were excluded. As the toolkit clearly excludes a number of posts from each analysis, it would be interesting to see what options could be offered to the user to give the most useful results for a given data set. Could a user select posts to be excluded, and thereby ensure that the most relevant posts were included?
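The suggestion above – excluding very short agreement posts before analysis – could be implemented as a simple pre-filter. A minimal sketch, assuming a purely length-based rule (the threshold and example posts are invented; WeGov offers no such option):

```python
# Drop posts too short to carry a distinct topic, e.g. bare agreements.
MIN_WORDS = 5  # invented threshold; would need tuning per forum

def filter_short_posts(posts, min_words=MIN_WORDS):
    """Keep only posts long enough to contribute a topic of their own."""
    return [p for p in posts if len(p.split()) >= min_words]

posts = ["I agree!",
         "good point",
         "Mixed sport would let both genders compete on a level playing field"]
print(filter_short_posts(posts))  # only the substantive third post survives
```

A refinement suggested by the text itself would be to treat replies differently from new posts, since the analyser currently ignores that relationship.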

Key users

Most of the key users and key posts brought up in the analyser did not match the report, although it consistently highlighted relevant posts from this forum. There were many posts in WeGov that dealt with similar issues, so the posts highlighted in the report and the analyser, although rarely written by exactly the same users, were not all that different in substance. For example, a quote from the report:


It's wrong to tell people how to dress, this country should be run on the basis of equality, so if you're muslim, christian or any other religion you should be allowed to wear what you want.

A key user post from WeGov:

I may not be a muslim but I dont think it is fair that muslim women are not allowed to cover their faces and heads in a head scarf. It is their choice to wear what they want and I dont think anyone should be able to stop that no matter how powerful they are.

One issue noticed was that the most relevant key users were not always listed first; they often appeared only fourth or fifth in the list.

Sentiment – report level

This forum was interpreted by WeGov as the most positive of our test forums, but does this stand up to comparison with the report? The only topic group with a negative sentiment score is the one containing the key words ‘bullying’, ‘people’, ‘think’, ‘riots’, ‘feel’. Clearly ‘bullying’ and ‘riots’ are not positive words, yet the sentiment score for this topic group is only -0.20; from these key words we would expect a more negative analysis. Although this forum was very positive in the sense that there were many positive suggestions about how to improve the equality of different groups within society, there were also many negative responses to the status quo.
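One plausible explanation for the mildly negative -0.20 score is dilution: if a topic group's score is an aggregate of per-post scores, a few strongly negative posts are easily offset by many neutral or positive ones. The sketch below illustrates this effect under the assumption that the group score is a simple mean of per-post scores on a -1 to +1 scale; the toolbox's actual aggregation method is not documented here.

```python
# Illustrative sketch: a topic group's sentiment score modelled as the mean
# of per-post scores in [-1, 1]. This shows how a group containing clearly
# negative posts can still score close to neutral when most posts are
# neutral or mildly positive. The scale and the averaging are assumptions.

def group_sentiment(scores):
    """Mean sentiment of a topic group; 0.0 for an empty group."""
    return sum(scores) / len(scores) if scores else 0.0

# Two strongly negative posts diluted by eight neutral/positive ones:
scores = [-1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.6]
print(round(group_sentiment(scores), 2))  # -0.1
```

If the toolbox aggregates in a similar way, many short positive posts in a group would pull its score towards neutral even when the substantive debate is negative.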

Figure 29: Examples of short positive posts

One of the problems the toolkit had to deal with in this forum was the large number of short comments simply supporting others’ points of view. See above for examples.


These short positive statements (which were accurately classified by the toolkit most of the time) may have skewed the results by swamping the comments, even though they did not form a substantial part of the debate. An example of a discussion that should have appeared more negative is the debate on women’s sport, which was probably the most discussed issue on this forum. The word ‘sexist’ was used 78 times in the discussions about sport, yet it did not appear to register as a negative word in the toolkit. In some instances posters were saying that they did not think sports were sexist, but in the majority of cases the word ‘sexist’ was used in a negative context along with words such as ‘rude’ and ‘unfair’. Only one of the posts containing this word was analysed as negative, whilst 23 were categorised as neutral and an overwhelming 54 as positive. Perhaps the toolkit did not recognise the word ‘sexist’, or misinterpreted it as a derivative of ‘sex’ or ‘sexy’, which might be categorised as positive by the algorithm.
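The manual check described above, counting how the sentiment labels break down among posts containing a given word, is straightforward to automate. The sketch below is an illustration only; the `text` and `sentiment` fields are assumed names for the example, not the toolbox's data model.

```python
# Illustrative sketch: tally the sentiment labels assigned to posts that
# mention a keyword, as was done manually above for 'sexist' (1 negative,
# 23 neutral, 54 positive). Field names are assumptions for the example.
from collections import Counter

def keyword_sentiment_breakdown(posts, keyword):
    """Count sentiment labels among posts mentioning the keyword."""
    kw = keyword.lower()
    return Counter(p["sentiment"] for p in posts if kw in p["text"].lower())

posts = [
    {"text": "This rule is sexist and unfair.", "sentiment": "negative"},
    {"text": "I don't think the sport is sexist.", "sentiment": "neutral"},
    {"text": "Great match yesterday!", "sentiment": "positive"},
]
breakdown = keyword_sentiment_breakdown(posts, "sexist")
# Two posts mention 'sexist': one labelled negative, one neutral.
```

A breakdown like this, exposed in the toolkit, would let an analyst spot at a glance when a clearly loaded word such as ‘sexist’ is being classified in an unexpected direction.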

There was almost unanimous agreement that the world of sports favours men over women. Posters felt that men’s sports got more attention, were broadcast more often, and that sportsmen were higher paid than sportswomen… Stereotypes about gender were agreed to play a role in discrimination… A number of posters said they found these stereotypes frustrating, and said that people should be able to play any sport regardless of sex.

Although this word may be an anomaly skewing the results in this particular forum, it shows how difficult it can be, for the algorithm or for a human, to understand what a sentiment, whether positive or negative, refers to explicitly.

Sentiment – individual level

The algorithm had its highest level of accuracy when allocating sentiment on this forum. This is perhaps because there was a large amount of data to analyse and, although there were some long comments, the debate was very fluid, with posters having real-time conversations, meaning that posts were shorter and more focused on specific areas of the debate. This suggests that the sentiment analysis might work well with social media, as conversations there are often quicker and more responsive than on forums, particularly pre-moderated forums. The important caveat is that the algorithm would need to know the relationships between the posts and understand the discussion thread to be able to do this in a way that can be understood.
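The caveat about knowing the relationships between posts amounts to reconstructing discussion threads before analysis. The following sketch groups posts under the top-level post they ultimately reply to; the `in_reply_to` field is an assumption about how a data source might expose reply links, not a description of the WeGov collectors.

```python
# Illustrative sketch: group posts into threads via a reply link, so that
# short replies such as "Agreed!" can be interpreted in the context of the
# post they answer. The 'in_reply_to' field is an assumed data-source field.
from collections import defaultdict

def build_threads(posts):
    """Map each top-level post id to the list of posts in its thread."""
    parent_of = {p["id"]: p.get("in_reply_to") for p in posts}

    def root(pid):
        # Walk up the reply chain to the thread's top-level post.
        while parent_of.get(pid) is not None:
            pid = parent_of[pid]
        return pid

    threads = defaultdict(list)
    for p in posts:
        threads[root(p["id"])].append(p)
    return dict(threads)

posts = [
    {"id": 1, "in_reply_to": None, "text": "Women's sport deserves equal coverage."},
    {"id": 2, "in_reply_to": 1, "text": "Agreed!"},
    {"id": 3, "in_reply_to": 2, "text": "Me too."},
]
threads = build_threads(posts)
# All three posts land in the thread rooted at post 1, so the two short
# agreements can inherit the topic and stance of the opening post.
```

With threads reconstructed in this way, short agreement posts could either be excluded from topic analysis or counted as reinforcing the sentiment of their parent post, rather than being analysed in isolation.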


G. Material – WeGov Privacy Considerations

G.1 Appendix 1: Obligations of Data Controllers

This section is taken from [1], the ILAWS report on legal analysis. This is a publicly available version of an appendix to the WeGov deliverable D5.1 [6]. The Data Protection Directive’s obligations are contained within Articles 6, 8, 10, 11, 12-15, 17 and Article 25. These have been implemented within the Data Protection Act 1998, forming the eight data protection principles. These are:

1. Personal data shall be processed fairly and lawfully and, in particular, shall not be processed unless— (a) at least one of the conditions in Schedule 2 is met, and (b) in the case of sensitive data, at least one of the conditions in Schedule 3 is also met.
2. Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.
3. Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed.
4. Personal data shall be accurate and, where necessary, kept up to date.
5. Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or purposes.
6. Personal data shall be processed in accordance with the rights of data subjects under this Act.
7. Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.
8. Personal data shall not be transferred to a country or territory outside the EEA unless that country or territory ensures an adequate level of protection for the rights and freedoms of data subjects in relation to the processing of personal data.

SCHEDULES 2 & 3


Data Protection Act 1998

SCHEDULE 2: Conditions relevant for purposes of the first principle: processing of any personal data

1. The data subject has given his consent to the processing.
2. The processing is necessary—
(a) for the performance of a contract to which the data subject is a party, or
(b) for the taking of steps at the request of the data subject with a view to entering into a contract.
3. The processing is necessary for compliance with any legal obligation to which the data controller is subject, other than an obligation imposed by contract.
4. The processing is necessary in order to protect the vital interests of the data subject.
5. The processing is necessary—
(a) for the administration of justice,
(b) for the exercise of any functions conferred on any person by or under any enactment,
(c) for the exercise of any functions of the Crown, a Minister of the Crown or a government department, or
(d) for the exercise of any other functions of a public nature exercised in the public interest by any person.
6. (1) The processing is necessary for the purposes of legitimate interests pursued by the data controller or by the third party or parties to whom the data are disclosed, except where the processing is unwarranted in any particular case by reason of prejudice to the rights and freedoms or legitimate interests of the data subject.
(2) The Secretary of State may by order specify particular circumstances in which this condition is, or is not, to be taken to be satisfied.

SCHEDULE 3: Conditions relevant for purposes of the first principle: processing of sensitive personal data

1. The data subject has given his explicit consent to the processing of the personal data.
2. (1) The processing is necessary for the purposes of exercising or performing any right or obligation which is conferred or imposed by law on the data controller in connection with employment.


(2) The Secretary of State may by order—
(a) exclude the application of sub-paragraph (1) in such cases as may be specified, or
(b) provide that, in such cases as may be specified, the condition in sub-paragraph (1) is not to be regarded as satisfied unless such further conditions as may be specified in the order are also satisfied.
3. The processing is necessary—
(a) in order to protect the vital interests of the data subject or another person, in a case where—
(i) consent cannot be given by or on behalf of the data subject, or
(ii) the data controller cannot reasonably be expected to obtain the consent of the data subject, or
(b) in order to protect the vital interests of another person, in a case where consent by or on behalf of the data subject has been unreasonably withheld.
4. The processing—
(a) is carried out in the course of its legitimate activities by any body or association which—
(i) is not established or conducted for profit, and
(ii) exists for political, philosophical, religious or trade-union purposes,
(b) is carried out with appropriate safeguards for the rights and freedoms of data subjects,
(c) relates only to individuals who either are members of the body or association or have regular contact with it in connection with its purposes, and
(d) does not involve disclosure of the personal data to a third party without the consent of the data subject.
5. The information contained in the personal data has been made public as a result of steps deliberately taken by the data subject.
6. The processing—
(a) is necessary for the purpose of, or in connection with, any legal proceedings (including prospective legal proceedings),
(b) is necessary for the purpose of obtaining legal advice, or
(c) is otherwise necessary for the purposes of establishing, exercising or defending legal rights.
7. (1) The processing is necessary—
(a) for the administration of justice,
(b) for the exercise of any functions conferred on any person by or under an enactment, or


(c) for the exercise of any functions of the Crown, a Minister of the Crown or a government department.
(2) The Secretary of State may by order—
(a) exclude the application of sub-paragraph (1) in such cases as may be specified, or
(b) provide that, in such cases as may be specified, the condition in sub-paragraph (1) is not to be regarded as satisfied unless such further conditions as may be specified in the order are also satisfied.
8. (1) The processing is necessary for medical purposes and is undertaken by—
(a) a health professional, or
(b) a person who in the circumstances owes a duty of confidentiality which is equivalent to that which would arise if that person were a health professional.
(2) In this paragraph “medical purposes” includes the purposes of preventative medicine, medical diagnosis, medical research, the provision of care and treatment and the management of healthcare services.
9. (1) The processing—
(a) is of sensitive personal data consisting of information as to racial or ethnic origin,
(b) is necessary for the purpose of identifying or keeping under review the existence or absence of equality of opportunity or treatment between persons of different racial or ethnic origins, with a view to enabling such equality to be promoted or maintained, and
(c) is carried out with appropriate safeguards for the rights and freedoms of data subjects.
(2) The Secretary of State may by order specify circumstances in which processing falling within sub-paragraph (1)(a) and (b) is, or is not, to be taken for the purposes of sub-paragraph (1)(c) to be carried out with appropriate safeguards for the rights and freedoms of data subjects.
10. The personal data are processed in circumstances specified in an order made by the Secretary of State for the purposes of this paragraph.

G.2 Appendix 2: Privacy Policy Template

This privacy policy sets out how [business name] uses and protects any information that you give [business name] when you use this website. [business name] is committed to ensuring that your privacy is protected. Should we ask you to provide certain information by which you can be identified when using this website, then you can be assured that it will only be used in accordance with this privacy statement.


[business name] may change this policy from time to time by updating this page. You should check this page from time to time to ensure that you are happy with any changes. This policy is effective from [date].

What we collect

We may collect the following information:

• name and job title

• contact information including email address

• demographic information such as postcode, preferences and interests

• other information relevant to customer surveys and/or offers

What we do with the information we gather

We require this information to understand your needs and provide you with a better service, and in particular for the following reasons:

• Internal record keeping.

• We may use the information to improve our products and services.

• We may periodically send promotional emails about new products, special offers or other information which we think you may find interesting using the email address which you have provided.

• From time to time, we may also use your information to contact you for market research purposes. We may contact you by email, phone, fax or mail. We may use the information to customise the website according to your interests.

Security

We are committed to ensuring that your information is secure. In order to prevent unauthorised access or disclosure, we have put in place suitable physical, electronic and managerial procedures to safeguard and secure the information we collect online.

How we use cookies

A cookie is a small file which asks permission to be placed on your computer's hard drive. Once you agree, the file is added and the cookie helps analyse web traffic or lets you know when you visit a particular site. Cookies allow web applications to respond to you as an individual. The web application can tailor its operations to your needs, likes and dislikes by gathering and remembering information about your preferences. We use traffic log cookies to identify which pages are being used. This helps us analyse data about web page traffic and improve our website in order to tailor it to customer needs. We only use this information for statistical analysis purposes, and then the data is removed from the system.


Overall, cookies help us provide you with a better website, by enabling us to monitor which pages you find useful and which you do not. A cookie in no way gives us access to your computer or any information about you, other than the data you choose to share with us. You can choose to accept or decline cookies. Most web browsers automatically accept cookies, but you can usually modify your browser setting to decline cookies if you prefer. This may prevent you from taking full advantage of the website.

Links to other websites

Our website may contain links to other websites of interest. However, once you have used these links to leave our site, you should note that we do not have any control over that other website. Therefore, we cannot be responsible for the protection and privacy of any information which you provide whilst visiting such sites and such sites are not governed by this privacy statement. You should exercise caution and look at the privacy statement applicable to the website in question.

Controlling your personal information

You may choose to restrict the collection or use of your personal information in the following ways:

• whenever you are asked to fill in a form on the website, look for the box that you can click to indicate that you do not want the information to be used by anybody for direct marketing purposes

• if you have previously agreed to us using your personal information for direct marketing purposes, you may change your mind at any time by writing to or emailing us at [email address]

We will not sell, distribute or lease your personal information to third parties unless we have your permission or are required by law to do so. We may use your personal information to send you promotional information about third parties which we think you may find interesting if you tell us that you wish this to happen.

You may request details of personal information which we hold about you under the Data Protection Act 1998. A small fee will be payable. If you would like a copy of the information held on you, please write to [address]. If you believe that any information we are holding on you is incorrect or incomplete, please write to or email us as soon as possible at the above address. We will promptly correct any information found to be incorrect.
