Objective ICT-2009.4.3: Intelligent Information Management

S E R E N D I P I T I

SEnsoR ENricheD Information Prediction and InTegratIon

- Network of Excellence -

Date of Preparation: October 2009

Coordinator name: Patricia Ho-Hune
Coordinator organization: ERCIM
Coordinator email: [email protected]

Scientific Coordinator name: Prof Alan Smeaton
Scientific Coordinator organization: Dublin City University
Scientific Coordinator email: [email protected]

List of Participants

Participant No. | Participant Name | Short Name | Country
1 | European Research Consortium for Informatics and Mathematics (Coordinator) | ERCIM | F
2 | Dublin City University (Scientific Coordinator) | DCU | IR
3 | Glasgow University | GLA | UK
4 | University of Amsterdam | UvA | NL
5 | National University of Ireland, Galway | NUIG | IR
6 | University of Economics, Prague | UEP | CZ
7 | Queen Mary, University of London | QMUL | UK

Industrial Advisory Board (IAB)

IAB Member | Company / Organisation | Country
Dr. Kenneth Wood | Microsoft Research, Cambridge | UK
Dr. Paulo Villegas | Telefonica | Spain

Data Providers Group

Data Providers Group Member | Company / Organisation | Country

User Group

User Group Member | Company / Organisation | Country

Project Abstract

SERENDIPITI (Sensor Enriched Information Prediction and Integration) will integrate Europe’s leading research groups in the area of managing the output from sensor networks and create a sustainable critical mass of innovative activity among what is currently a fragmented group of researchers and research topics. SERENDIPITI has the following objectives:

• To integrate the research activities of a large number of researchers across different research topics with a view to enabling new kinds of applications and interactions with aggregated sensor data;

• To foster creation and development of long-term relationships between established national research groups leading to the development of a virtual centre of excellence in the new field of semantic urban computing;

• To influence the research agendas on the European and world stages in several key aspects of managing sensor network data.

SERENDIPITI will be based on collaborative research among partners both within and outside the core consortium, and all will contribute towards a common platform for managing aggregated sensor data. This platform will allow semantic enrichment of raw sensor data on a large scale. This creates challenges in the semantic interpretation and management of large volumes of noisy, error-prone and sometimes uncalibrated sensor data, and SERENDIPITI will also address the challenges of predicting events based on mining sensor data in real time. To constrain the project and make it feasible, SERENDIPITI will focus on sensor data from both the real and the online worlds, restricted to two city areas, Dublin and Amsterdam, but in a way where porting to other areas would be straightforward.

SERENDIPITI can only achieve its goals through sharing of resources and skills across research topics and across research groups, outreach to several other targeted projects, and the sharing and movement of people both within and outside the project.

Table of Contents

1.1 Vision, concept, objectives
1.1.1 Vision Imperative
1.1.2 Main Objectives
1.1.3 Relationship to the topic of the call

1.2 Long Term Integration
1.2.1 Long-term Research Agenda
1.2.2 Virtual Centre of Excellence
1.2.3 SERENDIPITI Platform

1.3 Joint Programme of Activities
1.3.1 Joint Programme on Management and Quality Assurance
1.3.2 Joint Programme on Integration and Sustainability
1.3.3 Joint Programme on Cooperative Research
1.3.4 Workpackage List
1.3.5 List of Deliverables
1.3.6 List of Milestones
1.3.7 Work Package Description Tables

2. Implementation

2.1 Management structure and procedures
2.1.1 Network management organisation
2.1.2 Management Activities

2.2 Individual Participants
European Research Consortium for Informatics and Mathematics – ERCIM
Dublin City University (DCU)
University of Glasgow (GLA)
University of Amsterdam (UvA)
National University of Ireland, Galway (NUIG)
University of Economics, Prague (UEP)
Queen Mary, University of London (QMUL)

2.3 Consortium as a whole (Joemon to do this)
2.3.1 Synergies and track record on successful previous cooperation
- need to do 1-2 pages why we make up a good, complementary team and on who has worked together with who, and when, and what for, citing examples
Describe how the participants collectively constitute a consortium capable of achieving the network's objectives, and how they are suited and are committed to the tasks assigned to them. Demonstrate that the participants have made a mutual commitment towards a deep and durable integration continuing beyond the period of Community financial support (for example, by attaching letters of commitment from the executive bodies of the organisations)
2.3.2 Sub-contracting
Sub-contracting is not planned in this network
2.3.3 New Contractors
It is not planned to add any new contractors during the course of this Network of Excellence. However a main goal of SERENDIPITI is to achieve Pan-European integration, thus researchers from new affiliated partners will be included in the NoE activities, though funded through the founding partners
2.3.4 Other Countries
None of the SERENDIPITI contractors are from non-EU countries

2.4 Resources to be committed
2.4.1 Mobilization of Critical Mass of Human Resources
2.4.2 Partner Contributions
2.4.3 EC Funding Management

Section 3. Impact

3.1 Expected impact as listed in the Work Programme

3.2 Spreading Excellence, Exploiting Results, Disseminating Knowledge
3.2.1 Talent Boosting
3.2.2 Dissemination Activities
3.2.3 Exploitation by SERENDIPITI Partners (2-3 pages total, Alan to do)

Section 4. Ethical Issues

Annex 1: Letters of commitment from executives of the 7 partners
Annex 2: Letters of support and commitment from the Industrial Advisory Board
Annex 3: Letters of support and commitment from Data Providers
Annex 4: Letters of support and commitment from Users Group
Annex 5: Sample information sources for SERENDIPITI platform (Dublin)

1.1 VISION, CONCEPT, OBJECTIVES

The SERENDIPITI vision is based on asking what happens when we aggregate partial observations to create a grand overview that is greater than the sum of its parts, in this case the activities happening in an entire city. This notion of aggregating partial observations has in fact been around for centuries. When the earliest explorers set out in ships in the 16th century there was a need to create maps and charts, and cartographers based their drawings on partial observations from ship captains. Many such observations were needed, and they were potentially conflicting and error-prone. Slowly, over the centuries, there emerged a map of the world based on this raw “dirty” data, until satellite imaging made it all too easy and gave us the maps we have today. Of course, gathering data from multiple observations in order to validate a scientific theory is the basis of good science. The question we pose here is – what if we apply this to an entire city, to learn more about the places in which we live and uncover information about the things that are happening around us?

The aim of the SERENDIPITI Network of Excellence (NoE) is to break new ground in an area we term “large-scale, semantic urban computing”. To this end, we combine complementary expertise from a range of partners in such a way that it not only integrates existing research activities but also facilitates progress in our individual research areas. The mechanism we use to achieve this is to have a central goal towards which the consortium strives and upon which the entire work programme is based. This goal is the establishment of a Virtual Centre of Excellence (VCE) in the area of semantic urban computing that will spearhead and coalesce EU research activity in this field. All planned activities within the SERENDIPITI NoE revolve around putting in place the necessary mechanisms, supports and collaborations to ensure the instantiation and long-term maintenance of a VCE in this area.

1.1.1 Vision Imperative

How well do you really know the city in which you live? How do you keep yourself informed of the multiplicity of different social, economic, cultural and environmental events that happen around your city every day? Many of us may pride ourselves on being in touch with what's happening in our environs, on having our finger on the pulse of our “home town”, but many more would perhaps like to be better informed, and there are always things going on that you don’t know of, or things that happen in an unanticipated, unscheduled way. Others, such as journalists, city planners and essential service providers (the police force, fire and rescue emergency services), have a crucial vested interest in having early sight of events in a city as well as the ability to track these events as they evolve. Furthermore, for such agencies it would be useful to record key characteristics of events after the fact, from factual details as to their effect on city infrastructural resources to less tangible impacts such as public response/feeling. It is important to be able to record such factors either for posterity or in order to better equip our city and its inhabitants to respond appropriately the next time such phenomena (or similar phenomena) occur.

Traditionally, the role of informing the wider populace of the occurrence and impact of key events in an urban setting has fallen to a small minority of the population. These tend to be acknowledged experts, or at least respected voices in the community – reporters, columnists, critics, pundits, social commentators. Their role has been to assimilate and digest information about the origin, occurrence and aftermath of events, whether it be the Bloomsday celebrations in Dublin, the Edinburgh Fringe Festival programme, the riots at the G20 summit in London or an Ajax championship game in the Amsterdam Arena, and feed their perspectives back to society, usually through traditional media channels like print and TV. This traditional model of an expert minority is now evolving due to the advent of social networking, the blogosphere/twittersphere, and content sharing sites that mean that nowadays everyone has ready access to a media channel through which to provide their own feedback on events to the broader population. Each of us can now assume the role of critic, commentator and reporter, publishing our observations and perspectives on an event through Twitter as the event happens, and subsequently via the photos and videos that we upload to Flickr or YouTube. In parallel, we can notice that more and more data streams from sensors in the physical world are finding their way online, often without human mediation, and many of these are freely available. These include regularly updated sources such as traffic flow information, web-cams in key city locations, weather conditions, parking space and city bicycle availability, to name but a few[1].

The SERENDIPITI vision contends that combining physical world sensor data streams with their online counterparts[2] and linking them with traditional media channels, with alignment and synchronisation performed via time and geo-spatial references, can provide a more holistic view of city events than has been possible heretofore. Beyond supporting a more comprehensive overview and analysis of events, both in real time and retrospectively for prediction, this holistic approach is vital in potentially uncovering inaccessible and hidden information about events that might never otherwise come to light. In the spirit of our analogy with sea captains, cartography and aggregating partial observations, it should be remembered that Christopher Columbus was actually looking for a new way to India in 1492 and his landing in The Americas is considered one of the most important serendipitous incidents in the history of exploration! We believe that there is real progress to be made in the context of urban computing and characterising urban events by accidentally discovering something of interest as a natural by-product of aggregation of online and physical partial observations of the events.
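As a rough illustration of alignment via time and geo-spatial references, the following Python sketch groups observations from physical and online sources into shared buckets keyed by a time window and a coarse geographic grid cell. It is a minimal sketch under stated assumptions and not the SERENDIPITI platform design: the `Observation` fields, the 15-minute window, the 0.01-degree cell size and the `physical`/`online` labels are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Observation:
    """One partial observation (all field names are illustrative assumptions)."""
    source: str          # e.g. "traffic-sensor" or "twitter"
    kind: str            # "physical" or "online"
    timestamp: datetime  # timezone-aware time of the observation
    lat: float
    lon: float
    payload: str


def bucket_key(obs, window_minutes=15, cell_degrees=0.01):
    """Map an observation to a (time window index, grid cell) key."""
    minutes = int(obs.timestamp.timestamp() // 60)
    window = minutes // window_minutes
    cell = (math.floor(obs.lat / cell_degrees),
            math.floor(obs.lon / cell_degrees))
    return window, cell


def align(observations, window_minutes=15, cell_degrees=0.01):
    """Group observations that share a time window and a grid cell."""
    buckets = defaultdict(list)
    for obs in observations:
        buckets[bucket_key(obs, window_minutes, cell_degrees)].append(obs)
    return dict(buckets)


def cross_source_buckets(buckets):
    """Keep only buckets seen by both physical and online sources."""
    return {key: group for key, group in buckets.items()
            if {"physical", "online"} <= {o.kind for o in group}}
```

Fixed buckets are the simplest possible choice: two observations that are close together but straddle a window or cell boundary end up in different buckets, so a fuller treatment would likely need overlapping windows or a distance-based match.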

1 See Annex 5 for an extensive list of freely available sources relating to Dublin, for example.

2 Note that we accept that in this context the terms ‘physical’ and ‘online’ data sources are not antonyms – some physical sources such as CCTV and web cams are also strictly online sources. We simply use these terms as convenient broad identifiers here, with the key distinction being that physical sources arise autonomously out of a pre-installed sensor base, whereas online sources are human generated or mediated.

Figure: Combining physical world and online data sources for event prediction and tracking for semantic urban computing use cases

This vision naturally leads to the SERENDIPITI proposition of large-scale urban semantic computing whereby both autonomous and human-generated data sources can be harnessed together for both emerging novel delivery mechanisms and more traditional media channels to the benefit of multiple stakeholders in an urban environment. For example, consider the case of city planning in anticipation of a large event that requires city-wide mobilisation of urban resources – a make-or-break Republic of Ireland World Cup qualifier match in Croke Park, for example. By aggregating partial observations from multiple sources, we could infer that the unseasonable inclement weather[3], coupled with 83,000 people descending on one area in Dublin to watch a mid-week evening football match on a normal working day, coupled with a lack of public parking[4], led to traffic chaos[5] that was widely reported in the media, and this drove strong negative sentiment towards the handling of such events. Whilst such analysis is a useful forensic tool for understanding “what went wrong” after the event, it is also important to record the partial observations and the inferences made so that when the next large event occurs in Croke Park (albeit perhaps of a different nature, such as a U2 homecoming concert) city planners can better prepare based on a solid understanding of the issues that arose in previous, similar events.
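The Croke Park example can be sketched as a small rule-based aggregation that combines partial observations into an inference and keeps the contributing evidence alongside it, so that the record is available when a similar event is planned. This is only an illustrative sketch: the factor names, thresholds, source labels and the three-level risk scale are hypothetical assumptions, not values taken from the project.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PartialObservation:
    source: str   # where the observation came from (illustrative label)
    factor: str   # what was observed
    value: float  # the observed quantity


@dataclass
class Inference:
    level: str                                    # "low", "moderate" or "high"
    evidence: list = field(default_factory=list)  # observations that triggered a rule


# Hypothetical rules: each maps a factor to a test that marks it as a risk signal.
RISK_RULES = {
    "rainfall_mm": lambda v: v >= 5.0,
    "expected_attendance": lambda v: v >= 50_000,
    "free_parking_spaces": lambda v: v <= 200,
}


def infer_congestion_risk(observations):
    """Combine partial observations into a risk level plus supporting evidence."""
    evidence = [o for o in observations
                if o.factor in RISK_RULES and RISK_RULES[o.factor](o.value)]
    triggered = {o.factor for o in evidence}
    # Three distinct signals give "high", two give "moderate", otherwise "low".
    level = {3: "high", 2: "moderate"}.get(len(triggered), "low")
    return Inference(level=level, evidence=evidence)
```

Returning the evidence together with the level reflects the point made above that the partial observations and the inferences drawn from them should both be recorded, not just the conclusion.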

3 Available from: MET Eireann, All Weather information relating to the area including; forecast, temperature, radiation, pressure, wind, sunshine, pollen count, soil temperature, humidity, rainfall, http://www.met.ie/ OR AskMoby, Weather forecast for Dublin, http://www.askmoby.com/weather/get-forecast-pc

4 Available from: Dublin City Council, Number of spaces available at car parking locations around Dublin, http://www.dublincity.ie/dublintraffic/carparks.htm

5 As reported by AARoadwatch, Upcoming events which may affect traffic, http://www.aaroadwatch.ie/dublin/

Motivated by this, city planning is a key use case scenario targeted by SERENDIPITI to demonstrate the benefits of aggregating partial observations from the physical and online worlds, just as cartographers did with observations from ship captains all those centuries ago. Of course, the underlying technology platforms that support such aggregation could also be used in other scenarios that leverage different sensor feeds. Consider the anti-globalisation and climate change protests that led to the storming of banks and general public disturbance at the G20 summit in London in April 2009. Before, during and after the event, police and security forces were on high alert, with more than 2,500 officers mobilised in an operation costing up to £10 million[6]. A large scale urban computing platform that supports real-time aggregated analysis of the London CCTV network, vehicular traffic flow, Twitter tweets from the general population[7] and real-time updates from the traditional media sources could be an invaluable tool for police and security services in managing such incidents. City tourism services could also use similar analysis tools to help characterise the “spirit of Amsterdam” during Queensday celebrations, with a view to providing multiple perspectives on this cultural event thereby helping to promote it to a broader constituency.

1.1.2 Main Objectives

SERENDIPITI aims at stimulating joint research so that the European research community becomes a primary feeding ground for industrial innovation in the field of semantic urban computing, with its own European world class self-sustaining scientific forum. The concrete realisation of this will be the establishment of the Virtual Centre of Excellence (VCE) that will provide mechanisms to assist integration of organisations, people and research resources whilst facilitating technology transfer to industry. To attract and stimulate interest in the VCE, a modular open framework, including a large repository of data sources, analysis tools and ground truth, will be made available to the community and will be administered and maintained by the VCE itself.

Integration of organisations and people: SERENDIPITI brings together leading European research teams to create lasting integration of the currently fragmented research efforts that individually address only aspects of the project’s vision. The foundations for an enduring VCE will be laid by fostering the creation of sustainable and lasting relationships between existing national research groups. The NoE’s Joint Programme on Integration and Sustainability constitutes the suite of initiatives used to drive integration and ensure lasting impact.

Technical integration: The SERENDIPITI NoE will build an open and expandable framework for collaborative research on semantic urban computing – the SERENDIPITI Platform. The framework will be based on both a large repository of data (and associated ground truth) and a common distributed system made up of flexible, modular and interconnected technologies for accessing the repository. The purpose of the NoE’s Joint Programme of Cooperative Research is to stimulate joint research effort, a key by-product of which will be the population of the SERENDIPITI platform with appropriate data and tools. The programme is designed to interlink clusters of the complementary technologies and expertise that are required to address the SERENDIPITI vision, but that to date have independently co-existed with little crossover or cross-fertilisation of ideas. The SERENDIPITI platform will be pushed out to the broader community with an annual grand challenge[8] organised to help ensure take-up – the SERENDIPITI Platform Challenge.

6 G20 summit: “London prepares for lockdown”, Telegraph, 25th March 2009

7 Two people were arrested during the G20 protests in Pittsburgh in Sep 2009 for scanning police radios and using Twitter to advise protesters on police movements, as reported in “Queens 'terror' raid hits G-20 anarchist”, New York Post, 3rd Oct 2009

Pan-European integration: Pan-European integration is addressed by dedicated activities for sharing lab facilities, technological developments and resources. The NoE key objective in a Pan-European context is twofold: (i) to enable exchange of research personnel and sharing of available resources not already widely available; (ii) to reach out to other European institutions with specific consideration given to the 12 new EU member states.

Dissemination and outreach: Lasting impact will be ensured by implementing an effective plan for training, joint research, technology transfer, dissemination and exploitation that will continue beyond the funded life of the project under the auspices of the VCE. This plan includes:

• disseminating the technical developments of the network across the broader research community in the EU and beyond;

• providing the means to bring project and research results directly to the general public via a specific use case targeting the EU citizen;

• providing a forum for key stakeholders to contribute to shaping the project’s vision, achieved via the establishment of an External Advisory Chamber bringing together interest groups of end users and data providers;

• influencing and contributing to key standardisation activities related to the NoE’s vision.

Industrial innovation: Close cooperation and engagement with industry is vital to ensure that SERENDIPITI and VCE research outputs transfer smoothly to commercial innovations. To this end, an Industrial Advisory Board (IAB) has been established featuring an initial core of representatives from complementary industrial sectors in order to provide multiple perspectives on the NoE’s work. The IAB will be expanded over the lifetime of the project, its mission being to assist the network in defining research directions and to provide advice on potential exploitation of research results as they emerge. The IAB provides a mechanism to ensure that research does not become unfocused and self-serving but rather acts as a pipeline for future products and services.

An inclusive forum for all stakeholders: At the heart of SERENDIPITI is the vision of aggregating freely available data sources to drive novel semantically enriched applications and services. It is thus crucial to bring together the public bodies or companies that are providing the data with the eventual end users of the envisaged novel technology. In fact, in some cases the data providers are also potential end users. Consider, for example, the Roads and Traffic Division of Dublin City Council (DCC), which currently operates a large traffic control system that extends throughout the Greater Dublin area along with specific sites for collection of HGV classification information as well as video detection. This information is currently provided to the public but this is done independently of other relevant data sources, such as information on weather or scheduled events in the city. By considering all sources, SERENDIPITI tools will enrich the data that DCC can feed back to the general public. Thus, this local authority is both a data contributor to the project and a consumer of the technological outputs. The External Advisory Chamber brings together a User Group and a Data Providers Group to help guide the NoE in the development of the SERENDIPITI platform and to inform and help guide the evolution of the VCE. These forums will avail of the latest social networking advances to ensure that they are well connected and constantly updated.

8 The grand challenge concept is a proven approach to stimulating research activity in a given domain. It is best achieved by presenting the research community with a difficult problem to solve and by providing researchers with the means, usually software tools and data, to investigate potential solutions (e.g. TREC, CLEF, INEX, TAC).

1.1.3 Relationship to the topic of the call

SERENDIPITI as an NoE: SERENDIPITI is perfectly aligned to the scope and objectives of the NoE instrument as follows:

• A committed core of institutions with the ambition to integrate and coordinate their research efforts in the novel area of large scale semantic computing. The consortium have an excellent scientific reputation, have shown national and international leadership, and are highly experienced in fostering and leading collaborative teams of researchers in the areas of real-time large-scale data analysis, semantic information fusion and enrichment, association and correlation across information sources, pattern analysis and advanced information retrieval (IR) models. The consortium are also key players in relevant standardisation activities, e.g. W3C Semantic Annotations for Web Services Description Language and XML Schema Working Group (W3C SAWSDL WG), OASIS Semantic Execution Environment Technical Committee (OASIS SEE TC), Ontology Management Working Group (OMWG), and take leading roles in a variety of world-wide benchmarking activities (TREC, TREC-VID, INEX, CLEF, OAEI etc.).

• The creation of an internationally renowned virtual research centre is at the heart of this proposal with both the Joint Programme on Integration and Sustainability and Joint Programme of Cooperative Research built around this common goal. For the former, WP2 (Integration of People & Organisations), WP6 (Infrastructure Integration & Sharing) and WP7 (Outreach and Spreading Excellence) are designed to put in place the necessary pre-requisites for the existence of a VCE, as well as instantiating the processes and mechanisms that will provide for continued access to VCE resources and running of VCE processes beyond the lifetime of the project. In the case of the latter, three technical R&D work packages are designed to stimulate inter-disciplinary research in the areas of real-time large-scale data analysis (WP3), semantic information fusion (WP4) and inference and event prediction (WP5) to ensure that the VCE has a generous supply of technology.

As an NoE, SERENDIPITI targets world-class research organizations that wish to combine and integrate a substantial part of their activities and capacities in a given field, with a view to creating a sustainable virtual centre of excellence.

Match with strategic objectives of Objective ICT-2009.4.3: Intelligent Information Management: This proposal addresses all target outcomes of the strategic objective, as illustrated via the following quoted aspects of the work programme:

Target Outcome (a) Capturing tractable information:

• “technologies to acquire, analyse and categorise extremely large, rapidly evolving and potentially conflicting and incomplete amounts of information”

• The SERENDIPITI vision is built around this very notion applied to freely available city-wide data sources.

Target Outcome (b) Delivering pertinent information:

• “proactive diagnoses of information gaps and triggering goal-dependent search, acquisition, structuring and aggregation of relevant local, remote and streaming resources”

• SERENDIPITI use case scenarios target detection, analysis and tracking of city events in real-time from multiple sources as well as contextualised search of historical archives of this information.

Target Outcome (c) Collaboration and decision support:

• “problem solving and decision support systems for critical, information-bound domains”

• The SERENDIPITI use cases target city planners and essential service providers – the police force, fire and rescue emergency services.

Target Outcome (d) Personal sphere:

• “provision of personalised and context-dependent information from multiple sources and services”

• All use cases envisage demonstrations of context-dependent IR from multiple information sources, and the third SERENDIPITI use case specifically targets the EU citizen as end user.

Target Outcome (e) Impact and S&T leadership:

• “initiatives designed to link technology suppliers, integrators and leading user organisations”

• SERENDIPITI’s External Advisory Chamber includes an Industry Advisory Board, a User Group and a Data Providers Group.

1.2 LONG TERM INTEGRATION

Various research institutions across Europe are targeting different aspects of the SERENDIPITI vision. Individually, however, they cannot critically influence the direction of technological development in such a complex field as large-scale urban computing. A considered and well-defined plan for long-term integration can help ensure that Europe as a whole is at the forefront of overseeing the evolution of this new multidisciplinary research area. Thus, the SERENDIPITI plan for long-term integration is based around the following core principles: a) it must promote a critical mass of research excellence in the relevant research areas within the European Community; b) it should foster convergence and synergy on technologies being developed so far independently by distinct research communities; and c) it should achieve resource optimization when targeting important multidisciplinary aspects of semantic urban computing. More specifically, the lasting impact and long-term integration rest on two main innovations:

1. The creation of an open and expandable framework for collaborative research based on a common distributed system and associated data repository that is promoted and made available within the research community.

2. The establishment of a Virtual Centre of Excellence that can take over the processes and mechanisms that are initiated during the funded project life-time in order to ‘boot-strap’ long-term integration.

1.2.1 Long-term Research Agenda

The SERENDIPITI vision of aggregating partial observations of an event in the context of large-scale semantic urban computing brings many technical challenges given the sheer volume and nature of data sources that are noisy, dynamic, unaligned and unstructured. The envisaged urban computing platform requires real-time large-scale data analysis targeting multiple different modalities. The analysis output needs to be semantically enriched, followed by the integration (fusion) of the extracted information using a combination of scalable and real-time semantic, statistical, NLP (for text) and even image/video processing (for CCTV and web cams) techniques in order to ensure machine-readable knowledge. This knowledge then needs to be combined with inferencing based on heterogeneous information provided by diverse sources along with the ability to predict the occurrence of events based on past observations and inferences. Finally, it is crucial to be able to archive all data sources and/or analysis results and provide access to this rich multi-modal repository via content-based indexing that supports summarization, browsing, search and retrieval. These constitute technological challenges that will require long-term multi-disciplinary research. The SERENDIPITI consortium includes highly experienced teams with strong expertise in each of these areas that will work together to integrate tools and capabilities towards an open software platform that can be used as the basis for future research in the field of semantic urban computing for many years to come.
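To make the shape of this processing chain concrete, the Python sketch below strings together three of the stages for the text modality only: analysis (tokenising), semantic enrichment (mapping terms to concept labels) and content-based indexing with search. It is a toy sketch under stated assumptions: the `CONCEPTS` vocabulary, the function names and the item identifiers are all hypothetical, and fusion across modalities, inferencing and prediction are deliberately left out.

```python
from collections import defaultdict

# Hypothetical vocabulary mapping surface terms to concept labels.
CONCEPTS = {
    "rain": "weather", "storm": "weather",
    "jam": "traffic", "gridlock": "traffic",
    "match": "event", "concert": "event",
}


def analyse(raw_text):
    """Analysis stage: reduce a raw text item to lowercase word tokens."""
    return [token.strip(".,!?#").lower() for token in raw_text.split()]


def enrich(tokens):
    """Enrichment stage: annotate tokens with the concepts they mention."""
    return {CONCEPTS[token] for token in tokens if token in CONCEPTS}


def build_index(items):
    """Indexing stage: build an inverted index from concept to item ids."""
    index = defaultdict(set)
    for item_id, raw_text in items.items():
        for concept in enrich(analyse(raw_text)):
            index[concept].add(item_id)
    return index


def search(index, *concepts):
    """Return ids of items annotated with every requested concept."""
    matches = [index.get(concept, set()) for concept in concepts]
    return set.intersection(*matches) if matches else set()
```

Indexing on concepts instead of raw terms is what lets a query for "weather" retrieve an item that only mentions rain. A platform of the kind described would need far richer annotation and would have to handle images, video and sensor readings as well as text.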

1.2.2 Virtual Centre of Excellence

However, we firmly believe that the breadth of the SERENDIPITI vision requires more than technical integration between academic partners with expertise in specific technical areas. Rather, what is truly required is promotion of the SERENDIPITI agenda towards a longer-term initiative that mobilises an entire research community, and several of the project tasks directly address this. For this reason, the key objective of the network, besides technical validation of a vision of large-scale urban computing, is the establishment of such a research community, and furthermore one that includes not only academia but also anyone with a vested interest in the SERENDIPITI proposition. To this end, we will put in place a Virtual Centre of Excellence (VCE) in the field of semantic urban computing that will bring together technical researchers within a virtual laboratory, end users and data providers in the form of special interest groups, and a forum for sociological and ethical debate on the issues surrounding urban computing. The VCE will be overseen by an executive management group drawn from all mentioned constituencies, whose key mission will be to look to ongoing integration as facilitated by the VCE.

1.2.3 SERENDIPITI Platform

Of course, the simple process of establishing a VCE will not be sufficient to ensure its success. We believe that it is necessary to provide added value to the different communities in order to attract their participation. We believe that this is best started with the research community as they are more likely to see the early benefits of participation in such an undertaking. For this reason SERENDIPITI will put in place a number of activities as key components of the network's work programme that draw the community together, that run in parallel to the establishment of the VCE and that will ultimately come under the auspices of the VCE; these are the pillars in the long-term integration diagram above. The most important pillar in this respect is the establishment and maintenance of an extensive repository of physical and online data sources, as well as semantically marked-up analysis of these data streams. The existence, promotion and adoption of such a data set will be supported by additional pillars corresponding to researcher mobility and talent boosting targeting pan-European integration, as well as promotion of the SERENDIPITI network itself. To help ensure the long-term viability of the VCE, the SERENDIPITI repository will be a live repository and not just a static data set put in place during the course of the funded project’s lifetime. Rather, a key VCE activity will be to oversee its evolution and growth as ever-new forms of relevant data find their way online, with access to the data set ensured via the platform, tools and APIs developed and made available during the project.

The pillars supporting the development and promotion of the SERENDIPITI Platform for Urban Computing constitute a key “unique selling point” of the network, and subsequently of the VCE. The platform, designed and developed with input from the technical research packages (WP3, WP4 and WP5), will provide open APIs for the community to access the SERENDIPITI data repository and add their own analysis and Semantic Web tools and inferencing engines and strategies, even after the lifetime of the project. The platform will also be available for use in developing other applications or validating additional use case scenarios beyond those explicitly targeted by the network. Indeed, one of the work tasks in SERENDIPITI will be to pose a challenge to the research community to develop such applications: the SERENDIPITI Platform Challenge.

1.3 JOINT PROGRAM OF ACTIVITIES

The work programme for SERENDIPITI is made up of 7 work packages arranged into 3 joint work programmes. The first of these, the Programme on Management and Quality Assurance, has one work package (with the same name) but also involves input from the External Advisory Chamber. This chamber is made up of an Industrial Advisory Board (IAB) whose role is to provide advice, guidance, and feedback on the performance of the network, in a non-statutory but functionally important role. The IAB is augmented by a Data Providers Group and a User Group, representing those organisations who have agreed to provide data sources for use within SERENDIPITI and have expressed their willingness in their support letters in Annex 3, or who will form part of the target case scenarios for the SERENDIPITI platform and have likewise expressed their support through their letters in Annex 4. The Data Providers Group and the User Group are constituted as user groups with a less rigorous functional role than the IAB, which is a board, though all three constitute the External Advisory Chamber and this strengthens the quality assurance throughout the network.

Following the management and quality assurance, the Joint Programme on Integration and Sustainability holds the main mission of the network: to foster and facilitate collaboration across partners and across domains, and in doing so to create something which would not otherwise be achievable. The integration and sustainability programme is made up of three work packages on Integration of Organisations and People (WP2), Infrastructure Integration and Sharing (WP6) and Outreach and Spreading Excellence (WP7). This joint programme is where the SERENDIPITI network moves from being a collection of independent research activities to being a force in the development of the area of semantic urban computing. This is complemented by the third work programme, the Joint Programme of Cooperative Research, where the basic research developments which enable semantic urban computing take place. This also has three work packages, on Real-Time Large Scale Data Analysis (WP3), Semantic Information Fusion (WP4) and Inference and Event Prediction (WP5).

The groupings of work packages into joint programmes, each work programme with its constituent activities, and the interactions among the joint programmes, are shown in Figure 1.3.1. We now present each of the Joint Programmes in turn.

Figure 1.3.1: Interaction between Joint Programmes and Work Packages

1.3.1 Joint Programme on Management and Quality Assurance

This joint programme is coordinated by Ms. Patricia Ho-Hune from ERCIM.

The management structure is conceived to optimally support the efficient and timely organization and delivery of the SERENDIPITI work plan. It is designed to ensure a coherent, scientific, multi-disciplinary, administrative and financial coordination of SERENDIPITI, while providing the participants with the support and tools required for the achievement of the NoE’s objectives. In particular, the management will:

• Establish a democratic yet reliable overall organisation supporting the completion of the project activities;
• Support the integration of both research teams and activities, and ensure in particular the interaction among the different SERENDIPITI work packages and activities;
• Assess the quality of work achieved and take appropriate measures if needed, in particular to supervise and review the completion of the milestones and deliverables;
• Promote the visibility of the SERENDIPITI network;
• Provide expertise to address Intellectual Property Rights issues.

Running through all of the management activities is a responsibility for quality assurance which is embedded in all aspects of the network. To achieve the correct blend of organisational rigour and scientific innovation, the work package on management is divided into two components, one for organisational management (WP1-A) and one for scientific coordination of the network’s activities (WP1-B). The first is coordinated by ERCIM and the project coordinator, the second is coordinated by DCU and the scientific coordinator. The two will work closely together.

1.3.2 Joint Programme on Integration and Sustainability

This joint programme is made up of three work packages on Integration of Organisations and People (WP2), Infrastructure Integration and Sharing (WP6) and Outreach and Spreading Excellence (WP7). The work package on Infrastructure Integration and Sharing is quite a technical work package and is classified as RTD, but we place it in the Programme on Integration and Sustainability because its goal is to bring together the separate strands of work in WP3, WP4 and WP5. The work package on Integration of Organisations and People consists mostly of organisational activities, while the work package on Outreach and Spreading Excellence has a variety of activities, from societal impacts to joint publications, and from a web portal to running a benchmarking activity.

1.3.2.1 Integration of Organisations and People (WP2)

This WP is coordinated by Dr. Vojtěch Svátek of University of Economics, Prague (UEP).

Human capital is the most important asset of any network and SERENDIPITI will pursue a range of activities towards boosting the quality of researcher talent both within and outside the network. This will be particularly aimed at young researchers and will involve researchers from the range of disciplines which make up what we refer to as semantic urban computing. The integration will be achieved at the level of on-site joint research (fellowships and exchanges), visiting lectures, web conferencing, summer schools, joint PhD supervision, sharing of educational resources, involvement in joint publications and involvement in using the SERENDIPITI platform described in activity A6.5 through the SERENDIPITI platform challenge, described in activity A7.5. In activity A2.5, care will be taken to extend the impact of the Network to countries other than those directly represented by project partners. The ultimate goal, reflected from the very beginning, is to form a Virtual Centre of Excellence that will persist beyond the completion of the SERENDIPITI project.

The work package is broken into five varied activities described as follows.

Activity 2.1 – Towards A Virtual Centre of Excellence

Leader: QMUL

One of the primary tasks for the Network is to lay the foundations of a Virtual Centre of Excellence (VCE) on semantic urban computing that would assure persistence of the achieved accumulation of people and know-how. Initially composed of and steered by the core partners, the virtual centre will enlarge its membership base by integrating new partners in its governing body, establishing a research council and an associate membership program for academics. Clear priority will be given to enlargement with partners from new member states, subject to the constraint of avoiding over-representation of a single country. The virtual centre will also rely on its own industrial board for the purpose of continuous market awareness.

This activity targets planning and execution of the program towards the VCE on semantic urban computing. Specifically, the following sub-activities will be undertaken.

Planning of Membership and Constitution: In this activity the plan, principles, problems and benefits of creating the envisaged centre will be analysed. This analysis will be used to produce well-defined terms of reference, membership and “mission statement” of the VCE. The results of this initial analysis will be weighed carefully against alternatives such as affiliation to existing professional bodies that will inevitably evolve to better represent the importance of semantic urban computing.

Strategic plan for sustainability: A carefully thought out, realistic and achievable strategic planning process is required to achieve sustainability. This planning process should address among other issues, legal issues pertaining to the establishment of the centre, financial sustainability, centre identity, links to existing national centres, programmes and networks, etc.

This activity will draw up and execute a plan and run sub-activities, analysis and discussions related to the sustainability of the envisaged VCE. It will be led by QMUL but will involve all partners.

Activity 2.2 – Researcher Mobility Programme

Leader: ERCIM

The Network will encourage the mobility of junior as well as senior researchers within the Network and will allocate substantial funding to it. The mobility program will have two schemes differing in both the organisational setting and their typical duration. The Researcher Exchange Program (REP) will be open to members of the SERENDIPITI Institutes and the Fellowship Program (FP) will be reserved for PhD holders from all over the world to allow young scientists to join research centres within the Network. This Fellowship program is similar to the one already running within GEIE ERCIM and will benefit from its 20-year experience.

The Mobility program and procedure will adhere to specific rules described in the deliverable “Definition of the Mobility Program”. In addition the status of this activity will be described in the annual progress reports.

• SERENDIPITI Fellowships (FP): This program will offer 12-18 month positions under a selection process managed by the Project Management Board. This program will aim to recruit excellent research fellows with respect to a particular topic addressed by the SERENDIPITI research objectives. In addition, contacts will systematically be established with external academic institutes within activity A2.5 (Pan-European Integration) as described below. The research fellow will be hosted by two SERENDIPITI members for a 6-9 month period each, and the fellowship will include a further short visit to at least one more SERENDIPITI institute.

The process will follow 3 steps:

1- Launch of the call for proposals: the call for SERENDIPITI fellowship will be kept open for the first two years of the project;

2- Review and selection of best proposals will be conducted by the management Board on a monthly basis;

3- Recruitment of candidates will be carried out through extensive advertising at all SERENDIPITI member establishments and via the ERCIM channels, i.e. through publication on the ERCIM web site and announcements in ERCIM News magazine (approximately 11,500 copies in over 100 countries worldwide).

Available funding: the Network-budget guarantees four fellowships but may increase this number depending upon interest.

• SERENDIPITI Researcher Exchange Program (REP): The mobility of researchers will consist of scientific visits and research stays between members of different SERENDIPITI institutions, for example those involved in the same WP research activities. Typical stays will be of one month, although shorter or longer durations will be considered depending on the category of researcher concerned. For each stay a joint proposal from the visitor and hosting institutions should be prepared, giving a project summary and the interest of the stay. Upon review by the Management Board, the exchanges will be validated and refunded by a weekly lump sum to cover travel, accommodation and subsistence costs. This program is open to junior and senior researchers from project members. To this aim the terms of the exchange will be adapted as follows:

- PhD Exchanges and Internships: PhD students as well as recently graduated post-docs from within the Network will have the possibility to apply for a ‘talent-boosting’ internship within another partner. The typical duration of the internship will be 1 to 6 months. The applications will be granted through a review process managed by the Management Board, rating both the qualification of the applicant and the overlap of interest between the applicant and the hosting institutes.

- Visits of researchers (incl. senior), as well as short visits of external scientists, will be funded to reinforce personal relations and understanding among scientists of the Network. In order to comply with its goal of integration within the scientific community, the Network will support short visits between a founding member (namely one of the partners of this proposal) and a non-founding or outside member, provided that the joint research program is relevant to the project activities and the visited institute is a founding member of the Network.

Available funding: the Network-budget forecasts more than 200 weeks for the REP but may increase this number if there is demand. In general and whenever it is appropriate, the Management board will consider shifting some budget between the two SERENDIPITI Mobility schemes to fit as much as possible with the needs of the researchers of the Network.

Finally, we will explore the industrial placement of SERENDIPITI researchers engaged in their final year of PhD studies. This will enable the transfer of technologies from SERENDIPITI to allied industrial domains and to actively strengthen industrial relationships relevant to the SERENDIPITI objectives.

Activity 2.3 – Cooperation in Education, Teaching Materials and PhD Formation

Leader: GLA

Six of the SERENDIPITI partners are leading research-oriented academic institutions and many of the senior researchers involved in SERENDIPITI teach courses at their universities and give research talks, invited talks, tutorials at conferences etc. In order to consolidate such a vast experience and form a critical mass in the SERENDIPITI domain, the network will organise a set of activities aimed at co-operation in Summer School teaching, sharing of teaching materials and the development of PhD formation in multi-disciplinary areas.

We have specified the following sub-activities:

• The Network will, in its field of research interest, collect new teaching materials (lecture notes, video lectures, talks by SERENDIPITI researchers) and form links to relevant items in already existing repositories such as those from previous IST projects (Knowledge Web – REASE, K-Space, Reasoning Web etc.). The result will be a publicly available database of teaching resources.
• The Network members will actively create commented summaries of these resources, which can be seen as proto-syllabi for new, dynamically built courses focussing e.g. on large scale and/or semantic urban computing.
• The Network will establish long-term cooperation in education of undergraduate students as well as in joint supervision of PhD theses in its fields of interest. In fact, all the SERENDIPITI-supported PhD students will be jointly supervised by at least 2 partners of the consortium.
• The interaction among the PhD students involved in the network will be fostered by organisation of an annual workshop, SerendiPhDi, where leading-edge research results will be semi-informally presented. The organisation of the workshop will be, for the most part, carried out by the PhD students themselves to encourage and improve the organization skills of the students.
• SERENDIPITI project meetings at partner institutions will also be scheduled to allow senior SERENDIPITI researchers to deliver guest lectures to ‘local’ students during these meetings and to contribute to local teaching in that way, so that students from each university can benefit from the consortium-wide expertise. In addition, this will enhance our teaching repositories.
• Aside from such face-to-face events, there will be joint web conferences during which the PhD students and researchers will present their contributions to the hot research topics of the Network.
• We will exploit avenues like Erasmus programmes to strengthen interaction and academic cooperation between partners.

Activity 2.4 – Summer Schools and Social Networking

Leader: NUIG

An annual Summer School lasting one week and consisting of 5 days x 6 hours of teaching material will be created and will attract between 50 and 80 students, mostly from outside the network, to the topic of semantic urban computing. Lectures will be given by senior SERENDIPITI researchers and by guest lecturers and will cover the whole range of SERENDIPITI activities, from sensor technology, data communications, semantic web technologies, databases, data mining and machine learning, and software development and interfaces, to the societal impacts of new technologies.

While one goal of summer schools is to gather people and to foster long-term collaborations between young researchers, the interactions between students generally end when the summer school ends. In order to go further and to ensure long-term relationships between summer school attendees, leading to new collaboration opportunities and bootstrapping new research fields, we will provide access to an online social networking application built around SERENDIPITI so that attendees are encouraged to stay in touch with each other, have the ability to interact with the school lecturers after the summer school, and build upon their summer school experience around the SERENDIPITI platform. It will also include facilities to exchange messages and share publications, and will be built on an existing social networking platform.

Activity 2.5 – Pan-European Integration

Leader: UEP

While the Network partners are based in 5 EU countries, the integrating activities of the network will extend to relevant parties in the whole European space. In particular, PhD students and researchers will be systematically invited to Summer Schools and to the mobility programme; joint activities will be planned with related EU projects; public bodies in different European cities will be involved in the dissemination and the like. The effort will have the following three foci.

• General infrastructure. The outcomes of this activity will materialise in the form of a contacts database, which will serve the purposes of other activities and WPs. In parallel, Activity 2.5 will continuously monitor the activities of the whole network in terms of keeping their benefits, as much as possible, accessible to parties from countries not represented in the consortium.
• New EU member states. Particular attention will be paid to the situation in these states, in which the implementation of urban computing applications is rather novel but often progressing fast. The specific situation in these countries will be analysed in connection with prospective application of technologies developed or advertised within SERENDIPITI. Researchers from these states will be invited for presentations and discussions.
• Industrial contacts. Dedicated effort will be devoted to establishing contact with additional industrial subjects (aside from those present in the IAB) and to settling initial negotiations on technology transfer, which will subsequently continue within Activity A7.3.

1.3.2.2 Infrastructure Integration and Sharing (WP6)

This WP is coordinated by Prof. Joemon Jose of Glasgow University (GLA).

SERENDIPITI’s objective is to integrate people and research from diverse fields like multimedia processing/computer vision, information retrieval, semantics, text processing and formal models of data integration into this new emerging area of semantic urban computing. The joint programme on cooperative research (WP3-5) is aimed at progressing research beyond the current state of the art and we will present details of this later. We will carry out the joint programme of cooperative research activities based on a set of case studies aimed at demonstrating our SERENDIPITI vision and research results, and these case studies and the platform on which they run form part of this joint programme on Integration and Sustainability. The intention is not to focus on such demonstrators, but to use them as a creative guiding force that directs the research and integration activities. For this purpose, we have defined three different but related case studies: city planning; police; and citizen/journalist. Such case studies require integration of data from multiple sources that are incomplete and noisy by nature. These case studies form a basis for integrating our research activities and facilitating user-centred evaluation. In addition, we also define a set of activities which extend the developments within SERENDIPITI by exploiting the results outside the consortium.

The results of the activities of this work package include a publicly available SERENDIPITI platform (A6.5); 3 demonstrators (A6.4); development of data sets for experimentation (A6.3); design of an experimentation methodology (A6.2); and mechanisms for making the research results accessible to the rest of the world. The latter includes the creation of a virtual laboratory (A6.1) and the provision of a SERENDIPITI application platform (A6.5).

The objectives of this activity are to facilitate the exploitation of SERENDIPITI research results both within the consortium, for effective integration, and outside the consortium, to integrate with other stakeholders.

The work package is divided into a series of five inter-related tasks, described as follows.

Activity 6.1 Virtual Laboratory

Leader: QMUL

The objective of this activity is to form a SERENDIPITI Virtual Laboratory in the area known as semantic urban computing. The concept of a virtual laboratory is to extend the SERENDIPITI research activities beyond the consortium and to allow them to flourish even after the end of the project. The Virtual Laboratory (VL), a component of our proposed Virtual Centre of Excellence (A2.1), will host the SERENDIPITI infrastructure platform (A6.5), experimental data sets (A6.3), and tools developed in WP3-5. The VL will work along with other aligned societies (for example, CEPIS and the SMART Society).

Activity 6.2 Evaluation Methodology

Leader: GLA

Evaluation of scientific results is very important and SERENDIPITI takes this as a core activity. Proper evaluation allows the quality of the research results to be characterized. However, no mature evaluation methodology has been developed so far for data integration and prediction. Most evaluation activities are scattered and depend on the research domain, for example the information retrieval methodology used in TRECVid and CLEF.

Given the integrative nature of SERENDIPITI research activities, it is important to address the issue of evaluation methodology. Based on the scientific traditions so far, we will develop an evaluation framework for benchmarking SERENDIPITI research activities. This involves developing procedures for data collection, specification of experimentation and measures for comparison. In addition, we will develop a framework for user-centred evaluation of SERENDIPITI tools.

Activity 6.3 Data Sets & Ground Truth Generation

Leader: DCU

The objective of this activity is the assimilation of data sources for experimentation and development of ground truth information. The activities of SERENDIPITI involve integrating noisy information from heterogeneous data sources. We have made arrangements for collecting such data (e.g., TRECVid Surveillance data set; data collection from publicly accessible web cams, WikiMedia, Sound and Vision data sets etc.). However, due to technological developments and the sophistication of infrastructures, newer forms of data are made available quite often. The collection and exploitation of such data sets will be a continuous activity, with data collected as and when they are made available. In addition, we will endeavour to create ground truth data sets from such collections for experimentation. This activity will interact closely with the Data Providers Group. The consortium will make a concerted effort to make such collections publicly available through our virtual laboratory.

Activity 6.4 Case Studies

Leader: DCU

The case studies form a basis for integrating research and people within the consortium. Three case studies are planned, namely city planning (policy), citizen/journalist and police.

Activity 6.4.1 City Planning (Policy)

City planning requires inputs from multiple sources: feedback from citizens, identifying the current needs, predicting the future needs, lessons from past events, and so on. The objective of this case study is to explore the development of a city planning system for a given area by integrating traffic-flow information (web cams, CCTV), blog, Twitter and RSS feed information from various stakeholders, and other available information from environmental agencies etc. A number of tools are needed: an event visualizer (analysing the effect of various events on the planning); mechanisms for incident retrieval for a given context; a timeline of events etc. These tools, in addition, should provide an incident predictor mechanism as and when a new stream of information becomes available (e.g., traffic flow, weather information). Such tools will be evaluated in a real user setting (involving city planners). The deliverable from this will be a Street Planner demonstrator, in month 18 and later on at month 34 with a refined and enhanced version.

Activity 6.4.2 Citizen Journalist

Journalists working at a news desk need to access and integrate data from multiple sources: web feeds; Twitter feeds; multimedia streams etc. Ordinary citizens are also starting to use online sources of information to control their own, independent access to information, and are increasingly using the same information sources as journalists. The objective of this case study is to develop a demonstrator for the modern journalist and the ordinary citizen. This demonstrator will integrate data from offline data resources and online real-time data streams and create journalistic alerts. In addition to responding to a set of events, a journalist needs to collect past events and analysis. This involves searching through a number of diverse data sources. SERENDIPITI tools will provide mechanisms for searching through large data archives of a diverse nature. In addition, they will provide incident summaries based on real-time stream information like tweets, blogs, and other comments from the general public. The deliverable from this will be a Journalist news alert demonstrator, in month 18 and later on at month 34.

Activity 6.4.3 Police

Finally, events like the G20 meeting in London in April 2009 involve coordinating data from many sources. Many cities have data streams available from CCTV feeds, web cam feeds, and traffic information. In addition, more and more people are blogging and tweeting about events. The objective of this case study is to develop a police incident alert and predictor demonstrator for major events. The alert system will collect data from all available data streams and generate appropriate real-time alerts for police personnel so that they can respond instantly to developments. In addition, given more information about the activity, crowd flow and other data streams, it is possible to predict events, which will be useful for police event management. These tools will be evaluated in a user setting and, as with the other demonstrators, the deliverable will be a Police event alert and predictor, in month 18 and later on at month 34.

Activity 6.5 SERENDIPITI Platform

Leader: UvA

The objectives of this activity are to facilitate the exploitation of SERENDIPITI research results in the area of semantic urban computing both within the consortium, for effective integration, and outside the consortium, to integrate with other stakeholders. We envisage 2 major activities. We will develop a SERENDIPITI Application Platform, which will be used for experimentation within the consortium, and mature forms of it will be made available to other researchers through a Virtual Laboratory (see activity A6.1). Based on our experience so far we will create two SERENDIPITI platforms: one based on Semantic Web technologies for providing semantic mash-ups, and a second one based on the Hadoop/MapReduce framework for exploiting the parallel processing framework available to SERENDIPITI partners and described in Section 2.4 of this proposal.

Activity 6.5.1 Semantic Mash-up Platform (Coordinated by NUIG)

We will exploit the infrastructures developed in past and present projects like OKKAM, LarKC etc. for developing the SERENDIPITI application platform for the semantic integration of data from various sources and facilitating real-time applications. Based on expertise gained at NUIG with Sindice, the Semantic Web index (partially funded by the EU projects OKKAM and Romulus, as well as by Science Foundation Ireland), with sig.ma, a real-time mash-up of Semantic Web data, and with Webstar (a cluster of 500+ cores, 1TB RAM and 400TB disk space), together with compute clusters from University of Amsterdam and from DCU, we will deploy a semantic mash-up platform in order to let SERENDIPITI members build their own applications using semantically organized data as gathered, extracted and integrated in the context of WP3, WP4 and WP5. We will ensure a 24/7 quality of service thanks to a clustered infrastructure providing a scalable solution for developing such mash-ups. This platform will become an important component of the virtual laboratory, described in activity A6.1.

Activity 6.5.2 Hadoop/MapReduce Platform (Coordinated by UvA)

The objective of this activity is to exploit the publicly available distributed computing facilities for developing real-time computationally intensive applications. The MapReduce programming model provides a framework for organizing distributed computations that exhibits good "scale out" characteristics on clusters of commodity machines. With the release of the open-source Hadoop implementation, researchers now have ready access to a cost-effective tool for tackling Web-scale problems. We will develop computational frameworks based on the MapReduce/Hadoop framework, which will facilitate development and distribution of our platform and will also form part of the virtual laboratory, described in activity A6.1.
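
To make the MapReduce programming model concrete, the following is a minimal single-process sketch in Python. It is an illustration of the model only and not a description of the SERENDIPITI platform: the observation records, the field layout and all function names are hypothetical, and on a Hadoop cluster the map and reduce functions would be distributed across machines, with the framework performing the grouping step.

```python
from collections import defaultdict

# Hypothetical sensor observations: (sensor_type, location, count).
OBSERVATIONS = [
    ("traffic_cam", "city_centre", 42),
    ("traffic_cam", "harbour", 17),
    ("air_quality", "city_centre", 3),
    ("traffic_cam", "city_centre", 58),
]


def map_observation(observation):
    """Map step: emit one (key, value) pair per observation."""
    sensor_type, location, count = observation
    yield (sensor_type, location), count


def reduce_counts(key, values):
    """Reduce step: combine all values that share a key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """Run map, group-by-key (shuffle) and reduce in one process."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


totals = run_mapreduce(OBSERVATIONS, map_observation, reduce_counts)
for key, total in sorted(totals.items()):
    print(key, total)
```

Because each map call sees one record and each reduce call sees one key, both steps can in principle be run in parallel on separate machines, which is the "scale out" property referred to above.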

1.3.2.3 Outreach and Spreading Excellence (WP7)

This WP is coordinated by Prof Noel O’Connor of Dublin City University (DCU).

SERENDIPITI has a very broad range of stakeholders to whom outreach from the Network of Excellence is possible and relevant. These include the following:

1 The community of researchers within and outside the network (both academic and industry), including students, postdoctoral researchers, academic Faculty, industry researchers, and interns, who all have an interest in the challenges of leveraging sensor data for novel applications.

2 Those organisations who are providers of data for public consumption (see WP3 and WP6).

3 Those who work within our three identified application areas (planners and those involved in scheduled events in cities, those involved in all kinds of journalism, though not including ordinary citizens, and those involved in security), for whom there is likely benefit from access to applications developed on the SERENDIPITI platform (activity A6.5).

4 Application developers who would avail of the opportunity to use the SERENDIPITI platform in software development. These could be other academic or industry researchers, but the needs of this group will be specific and different to those of researchers interested in the research challenges of sensor data integration mentioned in (1) above (WP X).

5 Standards groups such as the W3C Semantic Sensor Network Incubator Group and others.

6 Other EU and national projects. SERENDIPITI has already identified several projects at EU and at national levels with whom we should have at least dialogue and probably some real interaction. These include the Kno.e.sis Citizen Sensing initiative, the MIT SENSEable City Laboratory, the US projects ‘This We Know’ and ‘Common Sense’, the EU-funded projects ‘WeKnowIt’, OKKAM and LarKC, and national projects such as the Irish-funded project ‘SmartBay’.

7 Ordinary EU citizens, to whom SERENDIPITI is immediately comprehensible and who have an opportunity to benefit directly and indirectly from the SERENDIPITI activity through access to the applications developed on the SERENDIPITI platform.

With such a broad church of constituents for interaction, and to ensure that outreach and the spreading of excellence reach this wide community, a comprehensive evaluation plan for the SERENDIPITI activities in this area, including targets and milestones, will be developed early in the project and delivered at M3. This evaluation plan will be revisited twice during the lifetime of the network, at M12 and M24, when the targets will be re-assessed and, if necessary, revised with justifications. The performance of the network against the targets will be presented at M12, M24 and M36.

The objectives of this WP are broadly defined as:

• To disseminate the activities and achievements of the SERENDIPITI network to a wide and diverse range of interested parties in a timely and efficient manner, from fellow researchers to ordinary citizens of the EU;
• To enhance the research experience and expertise of researchers working in the area of semantic urban computing, from both within the network partners as well as from partners outside the core group;
• To contribute to the development of standards in this emerging and important area.

Because WP 7 is one of the WPs which focuses on joint integration and is part of the Joint programme on Integration and Sustainability, it is one of the larger WPs in the SERENDIPITI proposal and is divided into 6 broad activities, described as follows.

A7.1 Web portal & Promotional materials

Leader: ERCIM, Participants: All

The SERENDIPITI online portal (with both public and restricted areas) will be used for the exchange of information, for the coordination of applications by researchers (1) to the SERENDIPITI mobility programme (A2.2), and for dissemination and registration facilities for SERENDIPITI events (most activities in WP2). It will be a central point for the consortium and for organizations, bodies and research communities, as well as for public awareness on the topic of aggregated sensor information. It will also support access to the SERENDIPITI platform (A6.5) for aggregated sensor data for researchers, for application developers outside the network, for other EU and national projects (6) and, through the platform applications, for ordinary EU citizens (7). End users in the three targeted application areas (3) for SERENDIPITI (city event planning, journalism and security/police) will also use the online portal for access to applications developed on the SERENDIPITI platform, while application developers who use the SERENDIPITI platform (4) will avail of the online portal for documentation and support as well as progress updates and news. Data providers (2) may also use the online portal to see how their own data has been used, which may incentivise them to become users of the outputs of the SERENDIPITI platform.

Within this activity, the SERENDIPITI project will produce a range of promotional materials, both in online form for general distribution and on paper for distribution at relevant conferences, workshops and other events. A quarterly electronic newsletter starting at M3 will be used for regular information exchange among the network and among the research community at large, and will include information on the major activities ongoing within the network, including research exchanges and visits, publications, status updates on the SERENDIPITI platforms, publicity and media coverage, theses and SERENDIPITI events. All partners will actively contribute to the design of dissemination materials, to local press activities and to the selection of media partners in Europe for greater coverage of organised events. Promotion will also involve finding synergies between our field of sensor data aggregation, or semantic urban computing, and ongoing related projects and programmes in Europe, in order to raise awareness and exploit already existing interested audiences from the research and industry sectors.

In addition, in order to extend the reach of the network's research and its outcomes beyond the academic world, we will adopt a global communication strategy using Web 2.0 services. This will include creating accounts and posting content on services such as (among others) Flickr (pictures of SERENDIPITI events) and Twitter (real-time information about the network and interactivity with people interested in SERENDIPITI), as well as creating a Facebook page so that people can become “fans” of the SERENDIPITI project.

A7.2 – Joint Publications

Leader: QMUL

Tangible evidence of real research collaboration within the network is the number of joint publications produced by SERENDIPITI partners. This kind of evidence also applies to demonstrating collaboration between the Network partners and partners or collaborators outside the network. This activity will promote and coordinate the production of joint papers for journals, conferences, workshops and book chapters, based on work carried out within SERENDIPITI. Special emphasis and importance will be given to publications which span partner sites as well as to publication targets which underscore the collaborative and multi-disciplinary nature of the SERENDIPITI workplan.

In addition to producing collaborative research outputs in the Joint Programme on Research Cooperation (WP3, WP4, WP5), the SERENDIPITI network will also take a lead in promoting publication in its novel niche area of semantic urban computing, through sponsoring special sessions at relevant conferences and leading special issues of relevant journals. Because of the wide impact of the SERENDIPITI work, the target venues will span a wide range, including computing and engineering, sensor networks, social sciences, architecture and planning, and event management.

Already identified journal targets include, but are not limited to, SENSORS, IEEE Sensors Journal and the ACM Transactions on Sensor Networks (TOSN).

Relevant conferences include the ACM Conference on Embedded Networked Sensor Systems (SenSys), the Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), the IEEE International Workshop on Sensor Networks and Systems for Pervasive Computing (PerSeNS), the International Semantic Web Conference (ISWC), the Extended (formerly European) Semantic Web Conference (ESWC), and the World Wide Web Conference (WWW).

From among the wide community with likely involvement in SERENDIPITI, A7.2 will directly impact researchers, both academic and industry (1), those working in standards groups (5), and those working in other related national and EU projects (6).

A7.3 – Standards and Technology Transfer

Leader: NUIG

Successful and scalable aggregation of various sensor outputs depends heavily on conformance and agreement to standards. These standards should be international, and open in nature since standards provide interoperability while still allowing for competition between sensor equipment and services. The SERENDIPITI partners will not only follow the existing standards but will drive them, especially by being involved in W3C standardization activities. SERENDIPITI will work towards a platform which allows easy plug-in of data taken from new sensor sources, easy semantic markup and integration of that new information into the shared sensor base, and straightforward use of the sensor base and mash-up of its derived outputs via applications running on the SERENDIPITI platform. Technology transfer, raising awareness of SERENDIPITI success and opportunities for exploitation which are described in Section 3.2.3 later in this document, will take place in the context of the operating SERENDIPITI platform which will act as a showcase and demonstrator of SERENDIPITI network capabilities.

To achieve these aims, SERENDIPITI will execute the following activities on technology translation:

• Identify internal (SERENDIPITI) and external (related research areas, big players, SMEs) technology and knowledge of relevance to the network’s activities.
• Establish knowledge about IP and patents within the SERENDIPITI team of researchers. Each of the 6 academic partners in the network already has support units in place to promote IP protection and exploitation, and we will leverage these in the operation of SERENDIPITI.
• Investigate the requirements of projects external to the SERENDIPITI research. With an appropriate work force in place, we will gather user requirements from external researchers and companies.
• Organise and participate in EU-related activities targeting the Future Internet. To assist in this, ERCIM occasionally runs strategic seminars to focus EU attention on new and emerging topics. SERENDIPITI will organize a strategic seminar on the sensor web and sensor web applications, with an invited audience of industry and academic researchers, probably to be held in Brussels or Luxembourg.
• Become proactively involved in the emergence of standards in the area of the Semantic Web and the Sensor Web. One of the SERENDIPITI partners, NUIG, is involved in various W3C groups, notably co-chairing two Semantic Web related working groups (SPARQL WG and RDB2RDF WG). In addition, NUIG is currently actively involved in the Semantic Sensor Network Incubator Group. This group will end in March 2010 and is expected to be followed by a Working Group on the topic, as is usual in W3C. SERENDIPITI, via NUIG, will then be represented in this WG to drive the standardization process on the Semantic Sensor Web.

As a whole, A7.3 will impact researchers (1), application developers (4), standards groups (5), and other EU and national projects (6).

A7.4 Societal Impacts

Leader: DCU

New applications and services which are genuinely innovative, rather than replacing or updating older ones, often have unpredicted impacts. These can affect how people work with the new services and how much faith and trust they place in their outputs. How long does it take to build up trust, and what are the ethical and legal issues associated with aggregating sensor readings from publicly available sources for the purposes of personalising the data for specialist or generic purposes? In this activity we shall study these issues in a direct and hands-on way. DCU will lead this activity and all partners will be involved; its progress and outputs will be disseminated across the project consortium.

A7.5 – SERENDIPITI Platform Challenge

Leader: UvA

To help promote use of the SERENDIPITI platform to develop applications other than the three case studies in A6.1, this activity will coordinate an end-of-project challenge, open to researchers inside and outside the network. The challenge will run during the final year of the network, when the SERENDIPITI platform is most mature and stable, and will be embedded as a track or ‘lab’ within the annual CLEF benchmarking activity. As of 2010, the well-known CLEF benchmarking event will have an open “call for tracks”, or “labs” as they will be called. Prof Maarten de Rijke from University of Amsterdam is general chair of CLEF 2010 and Prof. Alan Smeaton of Dublin City University is Program Co-Chair of CLEF 2010, so the consortium is well connected to this activity. The challenge will feature an activity to develop the best new application to run on the SERENDIPITI platform, using data (video, text, images) shared with CLEF, augmented with other SERENDIPITI sources. Assessment criteria will include novelty, usefulness of the application, technical difficulty and user feedback. Input from the Industrial Advisory Board will be sought in the judging process and a small prize will be awarded at an event such as the CLEF workshop meeting. This activity will involve researchers (1), data providers (2) and application developers (4) from among the SERENDIPITI stakeholders.

The following table summarises how each of the activities in WP7 impacts the different categories of SERENDIPITI stakeholders.


Columns: T7.1 Web Portal and Promotional Materials; T7.2 Joint Publications; T7.3 Standards and Technology Transfer; T7.4 Societal Impacts; T7.5 SERENDIPITI Platform Challenge.

Stakeholder category                       T7.1   T7.2   T7.3   T7.4   T7.5
1. Researchers (academic and industry)      √      √      √      √      √
2. Data providers                           √      -             √      √
3. Niche application areas                  √      -             √      -
4. Application developers                   √      -      √      -      √
5. Standards groups                         -      √      √      -      -
6. Other EU and national projects           √      √      √      -      -
7. EU citizens                              √      -      -      √      -

1.3.3 Joint Programme of Cooperative Research

In Figure 1.3.1 we introduced the structural relationships between work packages and joint programmes. This Joint Programme of Cooperative Research is where the developments in basic research which enable semantic urban computing take place. There are three work packages, on Real-Time Large-Scale Data Analysis (WP3), Semantic Information Fusion (WP4) and Inference and Event Prediction (WP5). These three are so closely intertwined that it is difficult to pull them apart. We introduce these Work Packages in turn.

1.3.3.1 Real-Time Large-scale Data Analysis (WP3)

This WP is coordinated by Prof. Ebroul Izquierdo of Queen Mary, University of London (QMUL).

This WP aims to contribute towards building an integrated platform for semantic urban computing by developing tools for large-scale data analysis in real time. The targeted tools implement low-level analysis algorithms that, together with the outcomes of WP4 and WP5 in this Joint Programme of Cooperative Research, will encompass the technology needed to realise the SERENDIPITI software framework. Specifically, multimodal analysis will be performed on the output of physical sensors such as CCTV cameras, combined with available information extracted from on-line and social communities such as blogs, Twitter, Flickr, YouTube, etc. According to the use case described earlier, for effective navigation of a user around a new place it is critical to provide the user with “real-time” intelligence on varied aspects of an environment, such as traffic and weather conditions, to name a few. In addition to interpreting the real-time changes, it is also critical to provide users with alternative solutions which closely correspond to “human cognition”. To achieve this ambitious objective, several innovative research methodologies extending beyond the state of the art will be developed. The general objectives of this WP are listed below:

• Large-scale information discovery from online resources and physical environments in real time.
• Real-time multimodal data analysis from varied information sources such as the real world (e.g. CCTV) and online resources (e.g. blogs, YouTube).
• Multimodal data synchronisation of information from physical and online worlds.
• Extraction of knowledge regarding an environment (e.g. a city such as London, Milan or Dublin) by analysing and indexing cartographs.

The information provided by this WP will be used in WP4 and WP5. Using results of WP3, analysis algorithms, suitable descriptors and other appropriate description models will be developed. Potential description schemes will be proposed to the appropriate standardisation bodies within WP7 “Spreading excellence”, specifically activity A7.3.

The WP consists of six integrative activities and will be coordinated by QMUL.

A3.1 Distributed intelligent sensing

Leader: QMUL

This activity is dedicated to gathering and filtering all information measured by a dense multimodal sensor network, together with available information extracted from on-line and social communities such as blogs, Twitter, Flickr and YouTube. The sensor network covers a wide range of modalities, such as CCTV cameras, simple state/change indicators taking a binary (on/off) value (e.g. door open/shut in large indoor city scenarios), measurements of continuous variables (e.g. temperature, noise level) and complex sensors from meteorological stations when available. Information extraction from on-line and social communities will be concerned with the application of state-of-the-art methods in information extraction (detection and classification of named entities, relations, events) and text mining (detection of recurring patterns of language use), with a special emphasis on scalability and real-time issues. The trade-off between the semantic level of extraction and robustness and speed will therefore be an important aspect of research in this task. Semantic analysis of extracted text segments will be based on ontologies as much as possible, i.e. as defined by the set of underlying ontologies in the Linked Open Data cloud (e.g. Geonames, …), by the social-semantic ontologies to be used in the project (e.g. SIOC) and by a specifically constructed event ontology (see WP5). A further level of semantic text analysis will be based on the use of text mining methods for identifying qualifying aspects of extracted facts, such as sentiment (‘excellent’, ‘terrible’, etc.), timing (‘today’, ‘recently’, etc.) and space (‘here’, ‘a few miles away’, etc.).

Activity A3.1 will deal with the system intelligence in its most elementary form. The approach is hierarchical as in the overall system. Analysis is first confined to specific amounts of data, e.g., the data sensed at the moment when the analysis is conducted. Then the resulting information is combined with information from the past to build a second hierarchical analysis level. Finally, distributed information from several sites is put together to infer the output information of the distributed sensing subsystem.

From a technical point of view, the main objective of this activity is to devise and apply techniques to transform raw data, in the form of multiple time series and on-line sources, into meaningful information using a hierarchical approach. For simple sensors this will simply be a binary state variable as a function of time. For more complex sensors, such as video and audio recorders, this could be a hypothesis identifying a highlight happening at given space-time coordinates.
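The three hierarchical levels described for A3.1 can be sketched as follows. This is a toy illustration under stated assumptions, not the techniques the activity will develop: the threshold of 80, the majority vote over a three-sample window, the "at least half of the sites" fusion rule, and all names and readings are hypothetical.

```python
from collections import deque


def instantaneous_state(value, threshold):
    """Level 1: turn one raw reading into a binary state."""
    return 1 if value >= threshold else 0


class TemporalSmoother:
    """Level 2: combine the current state with recent past states."""

    def __init__(self, window=3):
        self.history = deque(maxlen=window)

    def update(self, state):
        self.history.append(state)
        # Strict majority vote over the states currently in the window.
        return 1 if 2 * sum(self.history) > len(self.history) else 0


def fuse_sites(site_states):
    """Level 3: report an event when at least half of the sites agree."""
    return 2 * sum(site_states) >= len(site_states)


# Hypothetical noise levels from two sites; the threshold is an assumption.
streams = {"site_a": [50, 82, 85, 90], "site_b": [55, 60, 88, 91]}
smoothers = {site: TemporalSmoother(window=3) for site in streams}
fused = []
for step in range(4):
    states = [
        smoothers[site].update(instantaneous_state(values[step], threshold=80))
        for site, values in streams.items()
    ]
    fused.append(fuse_sites(states))
```

Each level only consumes the output of the level below it, so a different per-sensor detector or fusion rule can be substituted without changing the rest of the pipeline.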

A3.2 – Cartograph Analysis of a City

Leader: QMUL

This activity focuses on the analysis of city maps and generates an index of “interesting aspects” of a city. The interesting aspects could vary from a street to a national monument, and this information will be used as prior knowledge by the SERENDIPITI platform. The cartograph analysis will exploit the availability of GPS information for precisely localising the movement of users within the city (to within a few metres; e.g. Google Latitude provides 3m precision in motion and 500m precision when resting). This activity consists of the following sub-activities.

• A3.2.1 – Indexing cartograph
• A3.2.2 – Mapping GPS from user to the Cartograph
• A3.2.3 – Real-time tracking of user movement
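One simple way to picture the mapping of a user's GPS fix onto an indexed cartograph is a nearest-feature lookup using great-circle distance. The sketch below is only an assumption about how such a lookup might look: the index contents, the approximate coordinates, the 500 m cut-off and the function names are all hypothetical, and a linear scan would need to be replaced by a spatial index for a city-scale cartograph.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres

# Hypothetical cartograph index: feature name -> approximate (lat, lon).
CARTOGRAPH_INDEX = {
    "O'Connell Bridge": (53.3472, -6.2592),
    "Trinity College": (53.3438, -6.2546),
    "St Stephen's Green": (53.3382, -6.2591),
}


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_feature(lat, lon, index=CARTOGRAPH_INDEX, max_distance_m=500.0):
    """Return (name, distance) of the closest feature, or None if too far."""
    name, (flat, flon) = min(
        index.items(), key=lambda item: haversine_m(lat, lon, *item[1])
    )
    distance = haversine_m(lat, lon, flat, flon)
    return (name, distance) if distance <= max_distance_m else None
```

Calling `nearest_feature` on each successive GPS fix gives a crude form of the real-time tracking in A3.2.3; the `max_distance_m` cut-off reflects the fact that GPS precision varies, as noted above.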

A3.3 – Traffic and Crowd Monitoring from Real-Time Multimedia sources

Leader: QMUL

In this activity, research methodologies for detecting “semantic events” such as traffic jams and accidents, and for monitoring crowd behaviour, will be developed. These tools will enable machines to mimic human-like cognition in predicting the future consequences of a specific event by continuously monitoring past events. The presence of CCTV in urban environments will be used as a primary source of information for the real-time extraction of semantic events. The activity consists of the following sub-activities:

– Tracking and Tracing
– Traffic monitoring for semantic events
– Crowd behaviour monitoring
– Real-time reasoning following a semantic event
– Interpretation of consequences of a semantic event

A3.4: Sequential, anytime and on-line learning for Real-time semantic analysis

Leader: QMUL

This activity deals with algorithmic issues central to machine learning and knowledge discovery. One objective is to provide a coherent perspective on resource-constrained algorithms that are fundamentally designed to handle limited bandwidth, limited computing and storage capabilities, limited battery power, and specific real-time network-communication protocols. In the targeted application scenarios which make up the new domain of semantic urban computing, the process generating the data is not strictly stationary. In many cases, there is a need to extract some sort of knowledge from a continuous stream of data. Examples include Twitter records and 24-hour CCTV streams. These sources are called data streams. Learning from data streams is an incremental activity that requires incremental algorithms which take drift into account.

Important properties of such algorithms are that they incrementally incorporate new data as it arrives, cope with dynamic environments and changes in the data-generating distribution, and process each example in constant time and memory. The goal of this activity is to adapt sequential learning, anytime learning, real-time learning, online learning from data streams and related techniques so as to achieve processing in real time.
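To make these properties concrete, the sketch below shows one of the simplest incremental estimators that tolerates drift: an exponentially weighted mean. It is offered only as an illustration of constant time and memory per example and of forgetting old data; the class name `StreamingMean`, the value `alpha=0.5` and the sample stream are assumptions, and the activity itself targets far richer learning algorithms.

```python
class StreamingMean:
    """Exponentially weighted mean: O(1) time and memory per example."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # weight of the newest example (assumed value)
        self.mean = None    # no estimate until the first example arrives

    def update(self, x):
        if self.mean is None:
            self.mean = float(x)
        else:
            # Older examples are down-weighted geometrically, so the
            # estimate follows a drifting data-generating distribution.
            self.mean += self.alpha * (x - self.mean)
        return self.mean


# Hypothetical stream whose level shifts from 10 to 20 half-way through.
tracker = StreamingMean(alpha=0.5)
estimates = [tracker.update(x) for x in [10, 10, 20, 20]]
```

A larger `alpha` adapts faster to drift but is noisier; a smaller one is smoother but slower to react, which is the usual trade-off when the stream is not stationary.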

The knowledge generated from this activity will subsequently be made available to image analysis tools as prior knowledge to aid real-time image classification using kernel or biologically inspired algorithms. In addition, this activity will extract a list of “to-be experienced” aspects of a city and make dynamic recommendations available when the user is in the vicinity of these places. This activity will closely collaborate with A3.6 for interfacing with the user.

A3.5 - Real-time coding and streaming

Leader: QMUL

A real-time multimedia stream from different physical or online sources is an important aspect of the SERENDIPITI platform introduced in activity A6.5. The provisioning of multimedia services over heterogeneous environments such as SERENDIPITI is still a challenge, especially if the multimedia services have to be consumed anywhere, at any time and on any type of device. Scalable media coding efficiently supports the adaptation of multimedia resources to changing usage environments and to profile information (user and terminal profiles). It yields self-adaptable, information-rich media that can accommodate streaming and interaction functionalities. This activity embraces three different areas of work: fully scalable media coding; streaming of scalably coded media in heterogeneous environments; and Quality of Experience of multimedia coding. In this context, the objectives of the activity can be identified as:

• The provision of a comprehensive evaluation report on the use of scalable media coding and multiple description coding techniques and on the state of the art as regards scalable media coding techniques.
• Selection of the appropriate algorithms and mechanisms for scalable and adaptive media stream coding.
• Definition of the framework for interworking between the application (encoding/decoding) modules and the media transport layer, in order to make use of the adaptive media transport capabilities and flow control capabilities of the selected transport protocols.
• Implementation of a scalable coding/decoding scheme to be used for real-time transmission of media streams over congestion-aware and adaptive flow control protocols.

A3.6 – Real-time Multimodal Data Synchronisation

Leader: QMUL

This activity will collaborate closely with the previously mentioned activities in aligning and synchronising information obtained from multiple modalities and multiple sources with respect to time and geo-spatial preferences. The outcome of this activity will (partially) reflect real-time dynamic updates on events such as traffic and crowd conditions, thereby allowing alternative routes to the same location to be suggested according to user preferences or user models.
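A minimal sketch of what aligning two sources in time and space could look like is given below. It assumes, purely for illustration, that both sources have already been reduced to `(timestamp in seconds, place label, payload)` tuples and that a fixed 60-second tolerance is acceptable; the function name `synchronise` and the sample events are hypothetical.

```python
def synchronise(sensor_events, online_events, max_gap_s=60):
    """Pair each sensor event with online events at the same place whose
    timestamps differ by at most max_gap_s seconds."""
    pairs = []
    for s_time, s_place, s_label in sensor_events:
        for o_time, o_place, o_text in online_events:
            if s_place == o_place and abs(s_time - o_time) <= max_gap_s:
                pairs.append((s_label, o_text))
    return pairs


# Hypothetical events from a physical sensor and from an online source.
SENSOR_EVENTS = [(1000, "bridge", "congestion"), (5000, "square", "crowd")]
ONLINE_EVENTS = [
    (1030, "bridge", "stuck in traffic"),
    (1040, "square", "nice day"),
    (9000, "square", "concert over"),
]

aligned = synchronise(SENSOR_EVENTS, ONLINE_EVENTS)
```

The nested loop compares every pair of events, which is adequate for a sketch; for real-time streams one would typically keep both inputs sorted by time and slide a window over them instead.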

1.3.3.2 Semantic Information Fusion (WP4)

This WP is coordinated by Dr. Paul Buitelaar of National University of Ireland, Galway (NUIG).

This Work Package focuses on (1) semantically enriching and then (2) interlinking and fusing large amounts of real-time information both from Social Web applications and from the physical world via Sensor-enabled devices. SERENDIPITI will advance the state of the art in information integration from these two domains by bridging the efforts of these two communities and by presenting the work of the network at venues that can bring the attention of one community to the other, such as WWW, SemTech, Percom, etc.

More specifically, our activities will concentrate on:

• Modeling real-time streamed information using Semantic Web technologies and Linked Data principles, especially relying on a set of agreed vocabularies and reference datasets;
• Representing sensor information in a similar way, in order to bridge the current gap between sensor data and Web data thanks to the Linked Data principles;
• Integration (fusion) of extracted information using a combination of scalable and real-time semantic, statistical and NLP techniques in order to enable a complete network of machine-readable knowledge, in combination with WP3.

In this way, we will provide real-time semantic and interlinked knowledge that can be used in various SERENDIPITI applications of semantic urban computing, both within the network and outside. As an example, a geolocated picture, taken from a sensor-enabled cell-phone, could be linked to the latest tweets mentioning the same location, since both will ultimately be represented using the same models and linked to the same geolocation identifiers. In addition, we will focus on ubiquitous interfaces providing integrated views of data from these different sources, offering the end-users of SERENDIPITI an integrated and semantically enriched view on the topic of their choice, using scalable solutions.
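The picture-and-tweets example can be sketched as a join on a shared location identifier. The records, the field names and the `http://example.org/geo/...` URIs below are placeholders invented for this illustration (they are not real Geonames identifiers), and the function `link_by_location` is hypothetical; the point is only that once both kinds of item carry the same identifier, linking them needs no text or coordinate matching.

```python
from collections import defaultdict

# Hypothetical records; the location URIs are placeholders.
PHOTOS = [{"id": "photo-1", "location": "http://example.org/geo/dublin"}]
TWEETS = [
    {"id": "tweet-1", "location": "http://example.org/geo/dublin", "time": 10},
    {"id": "tweet-2", "location": "http://example.org/geo/galway", "time": 20},
    {"id": "tweet-3", "location": "http://example.org/geo/dublin", "time": 30},
]


def link_by_location(photos, tweets, latest=2):
    """Link each photo to the most recent tweets sharing its location URI."""
    by_location = defaultdict(list)
    for tweet in tweets:
        by_location[tweet["location"]].append(tweet)
    links = {}
    for photo in photos:
        matches = sorted(
            by_location[photo["location"]],
            key=lambda tweet: tweet["time"],
            reverse=True,  # newest first
        )
        links[photo["id"]] = [tweet["id"] for tweet in matches[:latest]]
    return links


links = link_by_location(PHOTOS, TWEETS)
```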

The work package consists of the following five integrative activities and is coordinated by NUIG.

Activity 4.1 Structuring information using common semantics

Leader: NUIG

This activity will focus on identifying common vocabularies to represent information from the Social Web and from Sensor-enabled devices using Semantic Web technologies. Specifically, the activities of the network will include:

• A report on existing lightweight and scalable ontologies to represent dynamic information both from the Social Web and from Sensor-enabled devices, in terms of meta-data (time, provenance, etc.);
• From this report, the partners will agree on a set of common ontologies such as FOAF and SIOC (to be extended if needed) in order to provide a common framework to expose structured information from existing sources both from the Social Web and from Sensor-enabled devices;
• Developing applications, web-services and APIs to enable delivery of information using these ontologies from existing applications (RSS feeds, tweets, sensor data, etc.).

In addition, the partners of the Network will deliver their own data (from their institutes, universities, laboratories) using these practices and will encourage students and partners to do the same, through outreach via lectures and tutorials, including in business-oriented venues such as Enterprise Data World and SemTech, in order to go beyond academia and achieve greater impact.

Activity 4.2 Interlinking real-time Knowledge

Leader: UvA

Structuring information using common representation formats and vocabularies (as done in activity A4.1) is a first step in addressing the issue of information discovery in the SERENDIPITI network. Yet information discovery also requires the use of common and shared identifiers to represent topics and geolocation information from both online and sensor data. This activity will:

• Take advantage of the current outcomes of the Linking Open Data project, which already provides billions of identifiers for such topics via the LOD cloud;
• Link data sources from online and from Sensor-enabled devices to existing datasets, such as Geonames for geolocation information and DBpedia for other topics;
• Research methods for scalable and real-time integration (fusion) and interlinking of extracted information using a combination of scalable and real-time semantic, statistical and NLP techniques;
• Ensure that geolocation information from sensor-enabled devices is represented using similar identifiers so that sensor information can be widely integrated with online data.

With these activities, SERENDIPITI will create a unique setting for bringing together and integrating the NLP, Social Web and Sensor web communities within this specific activity, working around the topic of semantic urban computing.

Figure 1: Current state of the Linking Open Data Cloud

Activity 4.3 Integrated browsing capabilities

Leader: DCU

The partners of the network will concentrate on the outcomes of the two previous activities to enable fusion of semantically-enhanced information and provide integrated browsing capabilities to SERENDIPITI applications, notably by:

• Advancing the state of the art in object-consolidation techniques in order to accurately identify that various sources of information are related to the same topic (event, location, etc.), in order to fuse them;
• Identifying strategies for provenance management, enabling tracking of each piece of information about a particular topic even after fusion from various sources, both from online and from Sensor-enabled devices;
• Providing new ubiquitous interfaces capable of absorbing and rendering large amounts of structured data around a particular topic and delivering APIs to let users of SERENDIPITI build their own mash-ups based on this data.

This activity will thus enable the development of end-user interfaces to manage the large amounts of data considered by the partners of the network, notably benefiting from the two previous work packages (WP2, WP3) in order to make sense of it.

Activity 4.4 Respecting privacy and ensuring trust

Leader: NUIG

Two more topics must be addressed when it comes to information fusion, especially using end-user information: privacy and trust. The partners of the network will:

• Deliver a comprehensive State of the Art of existing strategies to ensure privacy and trust on Semantic Web data, especially considering the use of policy languages;
• Identify scalable and real-time solutions applicable to the SERENDIPITI platform and provide a set of use-cases where these solutions could be used within SERENDIPITI applications;
• Consider the use of privacy-filtering capabilities in order to protect end-user information when it comes to information fusion from both online and sensor data;
• Develop a set of plug-ins for visualization interfaces to assert the trustworthiness of the information that can be browsed by end-users.

Thanks to this last activity and also to activity A7.4 on societal impacts, the partners of the Network will ensure that the information captured and delivered within SERENDIPITI applications will neither infringe the privacy rights of users nor deliver wrong information. In fact there is a lot of synergy between this activity and A7.4, the main difference between them being that A4.4 comes from a system viewpoint, being part of the Joint Programme of Cooperative Research whereas A7.4 is part of the Joint Programme on Integration and Sustainability.

Activity 4.5 Adaptive Search Models

Leader: GLA

The activities in WP3 and WP4 generate a large amount of semantically annotated information of heterogeneous and multimodal data types. The objective of this activity is to make advances in the development of adaptive retrieval models integrating semantic as well as statistical information about the content. Current state-of-the-art models in information retrieval, including web search engines, exploit statistical information. In SERENDIPITI, however, the research activities enrich the collection with additional information (for example, object information and event descriptions) and also generate large archives of cross-media data sets. The use case scenarios envisaged in this proposal also deal with the information needs of an individual or a group of individuals (for example, updating information about a specific event). In this activity, we will:

• Develop adaptive retrieval models integrating semantic and statistical information;
• Develop profiling techniques to gather the interests of users or groups of users;
• Investigate adaptive and personalised delivery of information.

1.3.3.1 Inference and Event Prediction (WP5)

This WP is coordinated by Prof. Maarten de Rijke of the University of Amsterdam (UvA).

Building on the signal processing activities A3.2, A3.3 and A3.4 in WP3, and the information extraction activities A4.1 and A4.2 in WP4, this WP seeks to enable making inferences about the current state of affairs of an urban environment based on heterogeneous information provided from diverse sources and to predict the occurrence of events based on past observations and inferences. This WP addresses two main objectives: to establish normal patterns of information flow and event occurrences in urban environments and to predict expected events and recognize anomalous events.

In order to establish normal patterns of information flow and event occurrences in urban environments, multiple sources of urban information will be tracked and aligned. Associations and correlations across information sources need to be established, using heterogeneous categorical data and mixed categorical and numerical data. Urban developments (in news sources, user generated sources as well as sensory sources) need to be identified and cross-linked, so as to support a range of signalling and alert services. Networks in which diverse entities are connected in highly dynamic and uncertain ways will be identified and maintained to reflect dynamically changing environments. To be able to support the generation of hypotheses concerning ongoing or future developments, the WP will use both bottom-up data-driven methods and top-down knowledge-intensive methods, for which an ontology of events will be developed. Based on this ontology, urban events will be predicted.

We view urban environments as complex systems. A feature shared by many complex systems is that they can be modelled as networks or graphs. Many complex systems consist of a large number of elementary components that give rise to emergent behavior through sparse, often non-linear interactions. What are the key features of the dynamics of complex systems such as urban environments? How does adaptive behavior in such environments at the microscopic level (e.g., individuals) affect aggregate behavior at the macroscopic level of crowds or communities? Predictability deals with the behavior of complex systems and the extent to which this behavior can be predicted or controlled on the basis of (partial, possibly conflicting) knowledge and observations of its workings. Predictions of the behavior of complex real-world processes are increasingly based on explicit models and simulations. Being predictive means that one is able to tackle both the deterministic and statistical problems in a quantitative manner. How can we simultaneously consider randomness, stability and deterministic fluctuations? Characteristic of a “complex systems” perspective is that the specific details of the systems are less relevant than the topological structure of the networks that they form. This allows one to apply knowledge acquired within one application (for instance, city planning) to another application (for instance, news analysis or safety and security). To achieve its objectives, this WP is structured in six activities that break down the urban event prediction task into a sequence of natural processing steps.

Specifically, the workplan which delivers the advanced processing and analysis needed to realise our vision of semantic urban computing is structured into six integrative activities, described below.

Activity 5.1 Association and correlation across information sources

Leader: UEP

Activity 5.1 is devoted to determining associations and correlations across information sources and to detecting anomalous patterns. Building on the structuring and inter-linking activities (A4.1 and A4.2) in WP4, examples include correlations between urban news and (micro-) blogging sources, between urban weather data and sensory traffic data, between political entities (people, organizations, …) in urban archives and in urban news.
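For the purely numerical case, such as weather data against traffic data, a first measure of association is the Pearson correlation coefficient. The sketch below is only an illustration: the series names and values are invented, both series are assumed to be already aligned in time, and categorical or mixed data would need other association measures.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)

# Invented hourly readings: rainfall and mean traffic speed on one road segment.
rainfall_mm = [0.0, 1.0, 2.0, 3.0, 4.0]
mean_speed_kmh = [50.0, 48.0, 46.0, 44.0, 42.0]
r = pearson(rainfall_mm, mean_speed_kmh)
```

In this toy data the speed drops linearly as rainfall increases, so the coefficient is at the negative extreme of its range.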

Activity 5.2 Anomalous pattern analysis

Leader: UEP

For anomaly detection, this activity will work with an expanding population of entities in a growing collection of urban information to support the discovery of deviations from the associations and correlations established in Activity 5.1. An important requirement here is that processing times match the timescales of need. The availability of multiple parallel sources will be used to contextualize observed anomalies and support the assessment of abnormal behavior. Examples include the volume of chatter (in blogs or micro-blogs and discussion forums) around an urban topic: significant increases can likely be linked to urban news events by observing the language usage around the topic in both user generated and edited sources.
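A very simple baseline for spotting a significant increase in chatter volume is to compare each new count with the mean and standard deviation of a trailing window. The sketch below assumes hourly post counts for one topic; the window length, the threshold of three standard deviations and the data are illustrative assumptions, not choices made by the network.

```python
from statistics import mean, stdev

def find_spikes(counts, window=5, threshold=3.0):
    """Return indices whose count exceeds the trailing-window mean by more than
    `threshold` standard deviations."""
    spikes = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Skip flat histories, where the z-score is undefined.
        if sigma > 0 and (counts[i] - mu) / sigma > threshold:
            spikes.append(i)
    return spikes

# Invented hourly post counts for one urban topic.
hourly_posts = [10, 12, 11, 9, 10, 11, 60, 12, 10]
spikes = find_spikes(hourly_posts)
```

Because only a short trailing window is kept, the check can run incrementally on streamed data, in line with the requirement that processing times match the timescales of need.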

Activity 5.3 Tracking urban developments

Leader: UvA

Activity 5.3 is aimed at developing methods for tracking issues, stakeholders and sentiments in urban environments. Examples include linking calendars of major sports events to news, traffic and crowd sensing data. The activity will extend methods developed for the microscopic level of individual entities to support inference about the macroscopic level of organizations, groupings and networks.

Activity 5.4 Dynamic network analysis

Leader: GLA

Activity 5.4 aims to extend existing network analysis techniques to cater for multiple and structured dynamic networks of people, locations and issues to support exploratory search and discovery processes. A key requirement is being able to support highly uncertain data, resulting in possibly very weak network links. We will benefit from the outcomes of WP4, especially activity A4.2, to use semantically-rich and structured information in order to provide such new Social Network Analysis techniques. This will allow the SERENDIPITI network to mine and analyze networks based on both physical and Web information, for instance identifying a “rugby community” from data composed of sensor-information (a mobile device located in the stadium “Croke Park” during a particular day/time) and Web data, such as tweets using the #rugby hashtag. Moreover, the activity will consider the use of information provided in the LOD cloud (such as a taxonomy of topics in DBpedia or geolocation hierarchies in Geonames) to provide network analysis at different levels of granularity. In addition, this activity will focus on real-time streamed data, thus considering time and context as important elements for these analyses: the network of people located in Croke Park can be a subset of rugby fans at one time, but of rock fans of the band “U2” a week later.
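The role of time in such networks can be illustrated with a small co-presence graph built over a time window: the same place yields different networks depending on the window chosen. The observation tuples, the integer time slots and the names are hypothetical, and edge weights are plain co-occurrence counts, so a weight of 1 stands for a weak, uncertain link.

```python
from collections import Counter
from itertools import combinations

def copresence_network(observations, start, end):
    """Count how often pairs of people were observed at the same place and
    time slot within the window [start, end)."""
    people_at = {}
    for person, place, timestamp in observations:
        if start <= timestamp < end:
            people_at.setdefault((place, timestamp), set()).add(person)
    edges = Counter()
    for people in people_at.values():
        for a, b in combinations(sorted(people), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical (person, place, time slot) observations from mobile devices.
observations = [
    ("alice", "croke-park", 1), ("bob", "croke-park", 1),
    ("alice", "croke-park", 2), ("bob", "croke-park", 2), ("carol", "croke-park", 2),
    ("alice", "croke-park", 9), ("dave", "croke-park", 9),
]
match_day = copresence_network(observations, start=0, end=5)
concert_day = copresence_network(observations, start=5, end=10)
```

Here the first window links alice, bob and carol, while the later window at the same place links alice with dave only.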

Activity 5.5 Event ontology for hypothesis generation

Leader: UEP

Activity 5.5 deals with knowledge structures aimed at supporting the formulation of hypotheses that explain unusual behavior. Redundancy, and the fact that multiple sources tend to provide a set of “parallel” views on the data being analyzed, can be used to contextualize observations in a data source (Balog et al., 2006; see footnote 10), but they do not provide a semantic characterization of the reasons underlying deviations from expected patterns of behavior. To support event prediction, an ontology of events will be created that will support the formulation of scenarios and the generation of hypotheses in the face of urban developments.

Activity 5.6 Inference and prediction

Leader: UvA

The final activity in this WP, Activity 5.6, deals with the actual urban event inference and prediction tasks. The techniques to be developed need to be able to work with heterogeneous categorical data as well as mixed categorical and numerical data. The major expected result is a model of urban environments as a complex system, derived from sensed information from the online and physical worlds, where the interaction of a large population of heterogeneous individuals may lead to large swings in sentiment, attitude and behavior triggered by events in an urban environment, and strongly reinforced by group behavior and social interactions.

Inference and event prediction is a high-level topic that builds on many other WPs. The low- to mid-level information extracted by WP3 and WP4 will be used to gain high-level knowledge. WP5 will feed into WP6 and deliver core technology to support the planned case studies (A6.1); it also will build on the Joint Programme on Integration and Sustainability through A6.2 (Infrastructure sharing).

1.3.4 Workpackage List

[Open question: does the column on person-months have the requested funding, or the total number of SMs?]

WP Number | WP title | Type of Activity | Lead Participant Number | Lead Participant Short Name | Person Months | Start Month | End Month
WP1 | Management | MGT | 1 | ERCIM | ? | 1 | 36
WP2 | Integration of Organisations and People | OTHER | 6 | UEP | ? | 1 | 36
WP3 | Real-Time Large-scale Data Analysis | RTD | 7 | QMUL | ? | 1 | 36
WP4 | Semantic Information Fusion | RTD | 5 | NUIG | ? | 1 | 36
WP5 | Inference and Event Prediction | RTD | 4 | UvA | ? | 1 | 36
WP6 | Infrastructure Integration and Sharing | RTD | 3 | GLA | ? | 1 | 36
WP7 | Outreach and Spreading Excellence | OTHER | 2 | DCU | ? | 1 | 36
Total | | | | | ??? | |

10 K. Balog, G. Mishne, and M. de Rijke. Why Are They Excited? Identifying and Explaining Spikes in Blog Mood Levels. In: Proceedings of the 11th Meeting of the European Chapter of the Association for Computational Linguistics (EACL 2006), April 2006.

1.3.5 List of Deliverables

Del. No | Deliverable Name | WP No. | Nature | Dissemination Level | Date (month)
D1.2 A great demo of something 8 PU R

Main Deliverables

The following table lists a selection of ?? key deliverables which will each be peer-reviewed by external experts in the field and used through the project to further assess the effectiveness of ongoing activities. These selected deliverables are also included in the Gantt charts given in the next few sections.

Del. No | Deliverable Name | WP No. | Nature | Dissemination Level | Deliverable Date (month)
D1.2 A great demo of something 8 PU R

1.3.6 List of Milestones

Milestone No | Milestone Name | WP No. | Exp. Date | Means of Verification
D1.2 A great demo of something 8 X year project review Online repository is enabled Contribution to Dx.x Call for fellowships etc. … etc.

Main Milestones

The following table lists a selection of ?? critical milestones. These will be used as “special markers” to guide project activities. These selected milestones are also included in the Gantt charts given in the next few sections.

Milestone No | Milestone Name | WP No. | Exp. Date | Means of Verification
D1.2 A great demo of something 8 X year project review Online repository is enabled Contribution to Dx.x Call for fellowships etc. … etc.

1.3.7 Work Package Description Tables

Workpackage 1: Management and Quality Assurance

Workpackage number: 1A
Start date or starting event: M1
Workpackage title: Management
Activity type: MGT
Participant number | 1 | 2 | 3 | 4 | 5 | 6 | 7
Participant short name | ERCIM | DCU | GLA | UvA | NUIG | UEP | QMUL
Person-months per participant | x | x | x | x | x | x | x

Objectives The objectives of this work package are:

• To carry out the entire administrative and financial coordination of the project.
• To monitor and assess the technical work and progress across work packages and teams.
• To ensure timely submission of all contractual deliverables.
• To elaborate and enforce quality procedures and provide a reliable IPR framework.

Description of work The management will handle all the administrative and financial tasks connected with the activities of the consortium: management of human resources, periodic reports, contract amendments, preparation of financial reports, financial supervision, funding redistribution, planning and monitoring of activities, and reporting. The management structure of SERENDIPITI is constructed to optimally support the efficient and timely execution of the project plan. This project will be headed by the Project Coordinator (PC), who will handle the day-to-day scientific and administrative management of the network. The PC will head a Project office to ensure the continuity and consistency of the administrative management.

The work is subdivided into 3 tasks as detailed below.

Task 1.1 – Overall management Task leader: ERCIM

• Establish and implement the overall organisation for the project within the ICT Program requirements.
• Ensure efficient communication flow among partners and among work packages, which will include all the necessary elements for the management of the consortium.
• Formalise and update the Project’s database: main information features, reports, working documents.
• Ensure achievement of the working plan throughout the entire project.
• Handle all the administrative tasks connected with the NoE’s activities: management of human and material resources involved in the Project, planning and monitoring of activities, record keeping, reporting, administrative organisation of the meetings, project reviews, progress reports…
• Handle all the financial tasks connected with the project activities.

Task 1.2 – Quality Assurance Task leader: ERCIM

The Steering Committee will define a quality assurance plan that states precisely the standard procedures for conducting the Project. The plan:

• Encourages and verifies that standards, procedures and metrics are defined, applied and evaluated.
• Defines a procedure for identifying, estimating, treating and monitoring risks.

Task 1.3 – Management of intellectual property rights Task leader: ERCIM

Provide a global Intellectual Property Rights (IPR) framework for all participants, establishing precise information on: the pre-existing know-how among the partners, the knowledge generated by each partner, and which results should be subject to protection.

Deliverables D1.1 – Human resources allocation (M2, R)

This document will inform the other beneficiaries and the Commission of the persons who shall manage and monitor the work of the project, and their contact details.

D1.2 – Quality Assurance Plan (M4, R)

This plan defines: (i) Quality processes (e.g., deliverable preparation, review preparation and post-review follow-up, activity-specific processes, etc.), tools and metrics, (ii) Tracking of progress.

D1.3a-c – Periodic management report (M12, M24, M36, R)

This report comprises an explanation of the use of the resources, and a financial statement (Form C), from each beneficiary together with a summary financial report consolidating the claimed Community contribution of all the beneficiaries.

Workpackage number: 1B
Start date or starting event: M1
Workpackage title: Scientific and Technical Coordination
Activity type: MGT
Participant number | 1 | 2 | 3 | 4 | 5 | 6 | 7
Participant short name | ERCIM | DCU | GLA | UvA | NUIG | UEP | QMUL
Person-months per participant | x | x | x | x | x | x | x

Objectives

The objective of this work package is to ensure adequate scientific co-ordination and monitoring of the NoE. This translates into a number of secondary objectives:

• Scientific planning
• Monitoring of scientific and technical activities
• Ensuring self-assessment and scientific quality assurance
• Coordination between and beyond scientific and technological (S&T) work packages

Description of work The work subdivides into two tasks, as detailed below.

Task 1.4 – Scientific Planning Task leader: DCU

• Strategic planning of main scientific activities and dissemination events (e.g. workshops, etc.)
• Risk analysis and contingency plans
• Raising public participation and awareness by organizing public lectures and participating in EC events.

Task 1.5 – Monitoring and coordination of S&T Activities Task leader: DCU

• Monitoring of the S&T work packages
• Supervision of integration between different S&T work packages
• Scientific quality assessment
• Compilation of periodic scientific activity reports into annual report

Deliverables D1.4a-c – Six-monthly progress reports (M6, M18, M30, R)

This report provides the technical progress made by the consortium in each work package within a six-month period.

D1.5a-c – Periodic Scientific Activity Reports (M12, M24, M36, R)

Workpackage 2: Integration of Organisations and People

Workpackage number: 2
Start date or starting event: M1
Workpackage title: Integration of Organisations and People
Activity type: OTHER
Participant number | 1 | 2 | 3 | 4 | 5 | 6 | 7
Participant short name | ERCIM | DCU | GLA | UvA | NUIG | UEP | QMUL
Person-months per participant:

Objectives  To lay the foundations for the Virtual Centre of Excellence in the area of large-scale semantic urban computing  To support the research synergy by a mobility programme  To support talent boosting among junior researchers inside and outside the network partners  To accumulate and interlink resources suitable for education  To extend the impact of the network to countries other than those of the partners, in particular to the new EU member states

Description of work

Activity 2.1 – Towards the Virtual Centre of Excellence
Task leader: QMUL
This task will lay the foundations of a virtual centre of excellence in the area of semantic urban computing. It will establish a governing organisation, determine membership criteria, and develop a strategic plan for sustainability.

Activity 2.2 – Researcher Mobility Program
Task leader: ERCIM
This activity will put in place mechanisms to support the mobility of junior as well as senior researchers.

Activity 2.3 – Cooperation in Education, teaching materials and PhD Formation
Task leader: GLA
This activity will collect teaching materials, establish collaboration at undergraduate levels, and facilitate opportunities for senior researchers to give guest lectures when visiting SERENDIPITI partners.

Activity 2.4 – Summer Schools
Task leader: DERI
This task will be coordinated by DERI and will involve organising and delivering a one-week Summer School each year.

Activity 2.5 – Pan-European Integration
Task leader: UEP
A pan-European contact database for the field of the project will be developed and exploited, and the state of urban computing in new EU member states will specifically be analysed.

Deliverables:
D2.1 – VCE Implementation Plan (M3, R)
D2.2.a – Definition of the Mobility program: procedures (M2)
D2.2.b – Annual Reports on Researcher Mobility (M12, M24, M36, R)
D2.3 – QMUL Roadmap towards a VCE (M12, P)
D2.4 – Annual reports on Cooperation in Education, teaching materials and PhD formation (M12, M24, M36, R)
D2.5 – Reports on Summer School activity held that year (M12, M24, M36, R)
D2.6 – Mid-term report on postgraduate course and teaching activities (M18, R)
D2.7 – Report on State of Urban Computing in New EU Member States (M15, R)

Milestones:
MS2.1 – Establishment of the council for the SERENDIPITI Virtual Centre of Excellence; SERENDIPITI Summer School implemented (M12)
MS2.2 – Launch of the Fellowship program (M2)
MS2.3 – Statistics table of the Researchers’ exchanges in both schemes, FP and REP (M6, M18, M30)
MS2.4 – Pan-European contact database with broad coverage (M9)

Workpackage 3: Real-time Large-scale Data Analysis

Workpackage number: 3
Start date or starting event: M1
Workpackage title: Real-time Large-scale Data Analysis
Activity type: RTD
Participant number | 1 | 2 | 3 | 4 | 5 | 6 | 7
Participant short name | ERCIM | DCU | GLA | UvA | NUIG | UEP | QMUL
Person-months per participant:

Objectives WP3 aims at contributing towards building an integrated platform for semantic urban computing by developing tools for large-scale data analysis in real-time. The targeted tools implement low-level analysis algorithms that together with the outcomes of WP4 and WP5 will encompass the technology needed to realise the SERENDIPITI software framework.

Description of work

To meet the needs of the above objectives, the following activities have been identified in the context of WP3:

A3.1 – Distributed intelligent sensing
Task leader: NUIG
This activity is dedicated to gathering and filtering all information measured by a dense multimodal sensor network, together with available information extracted from online and social communities such as blogs, Twitter, Flickr and YouTube. The sensor network covers a wide range of modalities, such as CCTV cameras; simple state/change indicators taking a binary (on/off) value, e.g. door open/shut in large indoor city scenarios; measurements of continuous variables, e.g. temperature and noise level; and complex sensors from meteorological stations when available.

A3.2 – Cartograph Analysis of a City
Task leader: QMUL
This activity focuses on the analysis of city maps and generates an index of “interesting aspects” of a city.

• A3.2.1 – Indexing cartograph
• A3.2.2 – Mapping GPS from user to the Cartograph
• A3.2.3 – Real-time tracking of user movement

A3.3 – Traffic and Crowd Monitoring from Real-Time Multimedia sources
Leader: DCU

• Tracking and Tracing
• Traffic monitoring for semantic events
• Crowd behaviour monitoring
• Real-time reasoning following a semantic event
• Interpretation of consequences of a semantic event

A3.4 – Sequential, anytime and on-line learning for Real-time semantic analysis

Leader: QMUL

The main objective of this activity is to provide a coherent perspective on resource constrained algorithms that are fundamentally designed to handle limited bandwidth, limited computing and storage capabilities, limited battery power, and specific real-time network-communication protocols. In the targeted application scenario the process of generating the data is not strictly stationary.

A3.5 – Real-time coding and streaming
Leader: QMUL
This activity covers three areas of work:

• Fully scalable media coding
• Real-time streaming of scalable media coding
• Quality of Experience of multimedia coding

A3.6 – Real-time Multimodal Data Synchronisation
Leader: GLA
This activity will collaborate closely with the previously mentioned activities in aligning and synchronising information obtained from multiple modalities and multiple sources with respect to time and geo-spatial preferences. The outcome of this activity will (partially) reflect real-time dynamic updates on events such as traffic and crowd conditions, making it possible to suggest alternative routes to the same location according to user preferences or user models.
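One basic building block for temporal synchronisation is to pair each item of a reference stream with the nearest item of another stream, provided the two are close enough in time. The sketch below covers the time dimension only; the stream names, the timestamps in seconds and the 2-second tolerance are illustrative assumptions, and geo-spatial alignment would need an additional distance check.

```python
from bisect import bisect_left

def align_nearest(reference_times, other_times, tolerance):
    """Pair each reference timestamp with the nearest timestamp of the other
    stream, or with None when nothing lies within `tolerance`."""
    other_sorted = sorted(other_times)
    pairs = []
    for t in reference_times:
        pos = bisect_left(other_sorted, t)
        # Only the neighbours just before and just after t can be the nearest.
        candidates = other_sorted[max(pos - 1, 0):pos + 1]
        best = min(candidates, key=lambda c: abs(c - t), default=None)
        if best is not None and abs(best - t) <= tolerance:
            pairs.append((t, best))
        else:
            pairs.append((t, None))
    return pairs

# Invented timestamps (seconds) for camera frames and GPS fixes.
camera_frames = [0.0, 10.0, 20.0, 30.0]
gps_fixes = [1.5, 9.0, 26.0]
aligned = align_nearest(camera_frames, gps_fixes, tolerance=2.0)
```

Items left unpaired (`None`) make gaps in one modality explicit rather than silently matching readings that are too far apart in time.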

Deliverables
D3.1 – Real-time analysis algorithms for Cross-Media Analysis and Annotation (M18, R)
D3.2 – State-of-the-art report on current multimodal techniques (M12, P)
D3.3 – Evaluation of real-time scalable and multiple description coding in heterogeneous environment (M24, P)
D3.4 – Report on multimodal data synchronisation (M30, P)

Milestones
MS3.1 – Initial report on traffic and crowd monitoring in real-time multimedia analysis (M12)
MS3.2 – Real-time knowledge extraction by analysing and indexing cartographs (M16)
MS3.3 – Prototype of scalable multimedia content streaming framework (M30)

Workpackage 4: Semantic Information Fusion

Workpackage number: 4
Start date or starting event: M1
Workpackage title: Semantic Information Fusion
Activity type: RTD
Participant number | 1 | 2 | 3 | 4 | 5 | 6 | 7
Participant short name | ERCIM | DCU | GLA | UvA | NUIG | UEP | QMUL
Person-months per participant:

Objectives

This Work Package focuses on (1) semantically enriching and then (2) interlinking and fusing large amounts of real-time information both from online applications and from the physical world via sensor-enabled devices. We will advance the State of the Art in information integration from these two domains by bridging the efforts of these two communities. In addition, we will focus on ubiquitous interfaces providing integrated views of data from these different sources restricted to two city environments, offering to SERENDIPITI platform applications an integrated and semantically enriched view on the topic of their choice, using scalable solutions. That way, we will provide real-time semantic and interlinked urban knowledge that can be used in the various SERENDIPITI applications.

Description of work

Activity 4.1 – Structuring information using common semantics

Leader: NUIG

This activity will focus on identifying common vocabularies to represent information from online and from sensor-enabled devices using Semantic Web technologies. In particular, based on the state of the art for existing lightweight and scalable ontologies to represent dynamic information in terms of shared meta-data (time, provenance, etc.), the partners will agree on a set of ontologies that will be used in SERENDIPITI to expose structured information from existing sources both from online and from sensor-enabled devices. We will also develop applications, web-services and APIs to enable delivery of information using these ontologies from existing data (RSS feeds, tweets, sensor data, etc.) through the SERENDIPITI platform.

Activity 4.2 – Interlinking real-time Knowledge

Leader: UvA

This activity will take advantage of the current outcomes of the Linking Open Data project, which already provides billions of identifiers for such topics via the LOD cloud, in order to link data sources from the Web and from sensor-enabled devices to existing datasets, such as Geonames for geolocation information and DBpedia for other topics. To do so, we will research methods for scalable and real-time integration (fusion) and interlinking of extracted information using a combination of scalable and real-time semantic, statistical and NLP techniques. The task will also ensure that geolocation information from sensor-enabled devices is represented using similar identifiers so that sensor information can be widely integrated with the Web data. In that way, the network will bring together the NLP, Social Web and Sensor web communities within this activity.

Activity 4.3 – Integrated browsing capabilities

Leader: DCU

This task will concentrate on the outcomes of the two previous tasks, T4.1 and T4.2, to enable fusion of semantically enhanced information and provide integrated browsing capabilities to applications on the SERENDIPITI platform. We will (1) advance the state of the art in object-consolidation techniques in order to accurately identify that various sources of information are related to the same topic (event, location, etc.) in order to fuse them; (2) identify strategies for provenance management, both from online and from sensor-enabled devices; and (3) provide new ubiquitous interfaces capable of absorbing and rendering large amounts of structured data around a particular topic and delivering APIs to let users of the SERENDIPITI platform build their own mash-ups based on this data.

Activity 4.4 – Respecting privacy and ensuring trust

Leader: NUIG

The topics of privacy and trust need to be addressed when it comes to information fusion, especially using end-user information. Task 4.4 will deliver a comprehensive report on existing strategies to ensure privacy and trust on Semantic Web data, especially considering the use of policy languages. We will identify scalable and real-time solutions and provide a set of use-cases where these solutions could be used within the SERENDIPITI platform and applications. We will also develop a set of plug-ins for the previous visualization interfaces to assert the trustworthiness of the information that can be seen by those applications and to protect end-user information when it comes to information fusion from both Web data and sensor data. This task will ensure that the information captured and delivered within the SERENDIPITI platform and applications will neither infringe the privacy rights of the users nor deliver wrong information.

Activity 4.5 – Adaptive Search Models

Leader: GLA

The activities in WP3 and WP4 generate a large amount of semantically annotated information of heterogeneous and multimodal data types. The objective of this activity is to make advances in the development of adaptive retrieval models integrating semantic as well as statistical information about the content. We will develop adaptive retrieval models integrating semantic and statistical information, develop profiling techniques to gather the interests of users or groups of users, and investigate adaptive and personalised delivery of information.

Deliverables

D4.1 – Report on lightweight ontologies for representing social data on the Semantic Web (M3, P)
D4.2 – Framework for structuring information using Semantic Web technologies (M6, M15, M24, P)
D4.3 – Scalable framework for real-time information fusion and interlinking with Linked Open Data (M9, M18, M27, P)
D4.4 – Browsing and mash-up framework for SERENDIPITI (M12, M21, M30, P)
D4.5 – Report on privacy and trust on the Semantic Web (M6, P)
D4.6 – A framework for privacy and trust on the Semantic Web (M12, M24, P)
D4.7 – Report on Semantic User Modelling (M12, P)
D4.8 – Report on Adaptive Search Models with experimental results (M24)
D4.9 – Implementation of Adaptive and personalised delivery of information (M36)

Milestones

M4.1 – First iteration of a framework for structuring information using Semantic Web technologies – M6
M4.2 – Second iteration of a framework for structuring information using Semantic Web technologies – M15
M4.3 – Third and final iteration of a framework for structuring information using Semantic Web technologies – M24
M4.4 – First iteration of a framework for real-time information fusion and interlinking with Linked Open Data – M9
M4.5 – Second iteration of a framework for real-time information fusion and interlinking with Linked Open Data – M18
M4.6 – Third and final iteration of a framework for real-time information fusion and interlinking with Linked Open Data – M27
M4.7 – First iteration of a framework for privacy and trust on the Semantic Web – M12
M4.8 – Second and final iteration of a framework for privacy and trust on the Semantic Web – M24

Workpackage 5: Inference and Event Prediction

Workpackage number 5 Start date or starting event: M1
Workpackage title: Inference and Event Prediction
Activity type: RTD
Participant number 1 2 3 4 5 6 7
Participant short name ERCIM DCU GLA UvA NUIG UEP QMUL
Person-months per participant:

Objectives

The main aim of this WP is to make inferences about the current state of an environment based on heterogeneous information provided by diverse sources, and to predict the occurrence of events based on past observations and inferences. This WP addresses two main objectives: to establish normal patterns of information flow and event occurrence in urban environments, and to predict expected events and recognize anomalous ones.

Description of work

Activity 5.1 – Association and correlation across information sources

Leader: UEP

To develop techniques for expressing and handling associations and correlations in multiple sources of heterogeneous urban data. To work with an expanding population of entities in a growing collection of urban information to support the discovery of deviations from associations and correlations.

Activity 5.2 – Anomalous pattern analysis

Leader: UEP

To detect anomalies in the associations and correlations found by A5.1.
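The proposal does not fix a method for A5.2. As a minimal sketch of one possible reading, the code below compares the Pearson correlation of two data streams in consecutive windows against a baseline correlation (of the kind A5.1 might establish) and flags windows that deviate by more than a tolerance. The function names, the windowing scheme and the tolerance parameter are illustrative assumptions, not part of the work plan.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant series; treating it as 0.0
        # is an assumption made for this sketch only.
        return 0.0
    return cov / sqrt(var_x * var_y)


def anomalous_windows(xs, ys, window, baseline, tolerance):
    """Return (start_index, correlation) for each non-overlapping window
    whose correlation differs from the baseline by more than the tolerance."""
    flagged = []
    for start in range(0, len(xs) - window + 1, window):
        r = pearson(xs[start:start + window], ys[start:start + window])
        if abs(r - baseline) > tolerance:
            flagged.append((start, r))
    return flagged
```

A real system would need to handle heterogeneous and categorical sources, which this numeric sketch does not attempt.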

Activity 5.3 – Tracking urban developments

Leader: UvA

To develop methods for tracking issues, stakeholders and sentiments in urban information. Extend methods developed for the microscopic level to support inference about the macroscopic level.

Activity 5.4 – Dynamic network analysis

Leader: GLA

Extend existing network analysis techniques to cater for multiple and structured dynamic networks of people, locations and issues to support exploratory search and discovery processes.

Activity 5.5 – Event ontology for hypothesis generation

Leader: UEP

Assess the utility and feasibility of developing an ontology of core urban events for scenario and hypothesis generation. Subsequent inference and prediction activities will be centered around events in this ontology.

Activity 5.6 – Inference and prediction

Leader: UvA

To develop methods to predict urban events as identified by the ontology of events produced by A5.5. Work with heterogeneous categorical data as well as mixed categorical and numerical data.

Deliverables

D5.1 – Report on association and correlation mining – (M3, P)
D5.2 – Framework for discovering association, correlations and anomalies – (M6, M15, M24, P)
D5.3 – Framework for tracking urban developments – (M9, M18, M27, P)
D5.4 – Report on dynamic network analysis – (M6, P)
D5.5 – Framework for dynamic network analysis – (M12, M21, M30, P)
D5.6 – Definition of event ontology – (M12, P)
D5.7 – Report on inference and prediction – (M9, )
D5.8 – Framework for inference and prediction in urban environments – (M15, M24, M33, P)

Milestones

M5.1 – Event ontology – M12
M5.2 – Second iteration of urban development tracking toolkit – M18
M5.3 – Second iteration of dynamic network analysis toolkit – M21
M5.4 – Third and final iteration of inference and prediction toolkit – M33

Workpackage 6: Applications and Infrastructure Sharing

Workpackage number 6 Start date or starting event: M1
Workpackage title: Infrastructure Integration & Sharing
Activity type: RTD
Participant number 1 2 3 4 5 6 7
Participant short name ERCIM DCU GLA UvA NUIG UEP QMUL
Person-months per participant:

Objectives

The objective of this work package is to provide mechanisms for integrating SERENDIPITI research results both within and outside the consortium. For integration within the consortium we envisage two major activities: the SERENDIPITI platforms and three major case studies for building demonstrators. In addition, to increase the impact of our activities, we will develop evaluation methodologies and create large ground-truth data sets. The final set of activities is aimed at sharing the SERENDIPITI infrastructure with other stakeholders through the provision of a virtual laboratory.

Description of work

Task 6.1 – Virtual Laboratory

Leader: QMUL

This activity will focus on creating a virtual laboratory for disseminating and exploiting SERENDIPITI activities and results, even after the end of the project. It will host a set of data resources, the SERENDIPITI application platforms, and the tools developed. This task will be coordinated by QMUL and involves all other partners.

Task 6.2 – Evaluation Methodology

Leader: GLA

This activity focuses on developing a set of specifications for forming data sets (a gold standard for experimentation), procedures for experimentation and a set of measures for comparison. This task will be coordinated by GLA and involves all other partners.

Task 6.3 – Data Set Creation and Ground Truth Generation

Leader: DCU

It is important to have large collections of data sources for validating experimental results. In this task, we will identify and collect such data sources, clean them up for experimentation and develop ground-truth data. This will be a continuous activity, since more and more types of data sources are made publicly available as time and technology progress. These data sets form the basis of our experimentation and will also be used for the SERENDIPITI Platform Challenge (A7.5). This activity is coordinated by DCU.

Task 6.4 – Case Studies

Leader: DCU

We will conduct three case studies integrating our technological results.

• Task 6.4.1 City Planning: planning for an event by exploiting multi-sensor data sources and large archives of prior data.
• Task 6.4.2 Journalist / Citizen: creating a report or programme based on multi-sensor data sources and large archives of prior data.
• Task 6.4.3 Police: planning an event based on multi-sensor data sources.

Task 6.5 – SERENDIPITI Platforms

Leader: UvA

We will experiment with two different technologies to create two versions of the SERENDIPITI platform. These will be used for the SERENDIPITI Platform Challenge (A7.5). This activity will be coordinated by UvA.

• Task 6.5.1 Semantic Mash-Up Platform: we will create a Semantic Mash-up platform for integrating data from various sources and facilitating real-time applications, exploiting the results of past projects.
• Task 6.5.2 Hadoop/MapReduce Framework: we will exploit this widely used framework to realise and exploit a SERENDIPITI platform.

Deliverables

D6.1 – Report on the formation of the Virtual Laboratory detailing legal and ethical issues (M12, P)
D6.2 – Report on Evaluation Methodology (M12, P)
D6.3 – First set of data sources for experimentation (M12, P)
D6.4 – Initial versions of the 3 Demonstrators based on the case studies (M24, P)
D6.5 – Initial versions of the 2 SERENDIPITI Platforms (M24, P)
D6.6 – Report on Virtual Laboratory (M36, P)
D6.7 – 3 prototypes for the 3 case studies (M36, P)
D6.8 – 2 SERENDIPITI Platforms (M36, P)
D6.9 – SERENDIPITI data sources (M36, P)

Milestones

M6.1 Implementation of Virtual Laboratory (M18)
M6.2 First implementation of the demonstrators (M12)
M6.3 First implementation of the SERENDIPITI Platforms (M12)

Workpackage 7: Outreach and Spreading Excellence

Workpackage number 7 Start date or starting event: M1

Workpackage title: Outreach and Spreading Excellence

Activity type OTHER

Participant number 1 2 3 4 5 6 7

Participant short name ERCIM DCU GLA UvA NUIG UEP QMUL

Person-months per participant:

Objectives

• To disseminate the activities and achievements of the SERENDIPITI network to a wide and diverse range of interested parties in a timely and efficient manner, from fellow researchers to ordinary citizens of the EU;
• To enhance the research experience and expertise of researchers working in the area of semantic urban computing, both within the network partners and among partners outside the core group;
• To contribute to the development of standards in this emerging and important area;
• To study the societal impacts of new applications and services based on sensor data.

Description of work

A7.1 – Web portal & Promotional materials

Leader: ERCIM

This activity will be coordinated by ERCIM with assistance from all partners and will involve the preparation and dissemination of many kinds of electronic and print media. The activity will reach out to 6 of the 7 relevant stakeholder categories identified in SERENDIPITI.

A7.2 – Joint Publications

The activity on joint publications is coordinated by QMUL but will involve all partners in the network. Of particular importance will be the coordination and production of publications which span multiple partner sites. Several publication initiatives will be undertaken, including special sessions at conferences and workshops and special issues of targeted journals.

A7.3 – Standards and Technology Transfer

Leader: NUIG

The activity on standards and technology transfer will involve SERENDIPITI taking a proactive role in the development of standards, such as W3C standards, applicable in the sensor data domain. Led by NUIG, the activity will also involve helping to set the agenda for future research in this area through strategic seminars and raising awareness of IP issues relating to sensor data.

A7.4 – Societal Impacts

Leader: DCU

DCU will lead the investigation into the impacts that new applications built upon the SERENDIPITI platform will have. This will include an assessment of the trust and belief that people, both specialist users and ordinary EU citizens, have in the aggregated sensor content. This will involve inputs from researchers (1), data providers (2), users in niche application areas (3) and ordinary EU citizens (7).

A7.5 – SERENDIPITI Platform Challenge

Leader: UvA

In the final year of the network, SERENDIPITI will run an end-of-project challenge in the area of semantic urban computing, open to researchers inside and outside the network and embedded as part of the annual CLEF benchmarking activity. This will feature an activity to develop the best new application to run on the SERENDIPITI platform developed in A6.5, using data shared with CLEF. Assessment criteria will include novelty, usefulness of the application, technical difficulty and user feedback. Input from the Industrial Advisory Board will be sought in the judging process. This activity will involve researchers (1), data providers (2) and application developers (4).

Deliverables

D7.1 Outreach and Spreading Excellence Evaluation Plan - Action plan for the first year of work package – (M3, P)
D7.2 Outreach and Spreading Excellence Evaluation Plan (year 2) - Action plan for the second year of work package – (M12, P)
D7.3 Societal Impact - preliminary report of the assessment of societal impact of SERENDIPITI – (M18, P)
D7.4 Outreach and Spreading Excellence Evaluation Plan (year 3) - Action plan for the third year of the work package – (M24, P)
D7.5 Final Societal Impact Report - final report of the assessment of societal impact of SERENDIPITI – (M36, P)
D7.6 Report on A7.5, SERENDIPITI Platform Challenge – (M36, P)

Milestones

MS7.1 Web portal and promotional materials in place – M6
MS7.2 10 joint publications spanning two or more partners submitted or published – M24
MS7.3 Special seminar/workshop on the sensor web – M24
MS7.4 W3C Working Group on Semantic Sensor Network formed – M24
MS7.5 SERENDIPITI Platform Challenge completed – M33

2. IMPLEMENTATION

2.1 MANAGEMENT STRUCTURE AND PROCEDURES

The project management is designed to support the implementation of the network’s work plan and the achievement of its objectives, as well as the completion of its contractual obligations vis-à-vis the European Commission, in compliance with detailed rules and procedures.

Network coordination is split between an Administrative and Financial Coordinator and a Scientific Coordinator. The former directs network management and administration, including the fulfilment of EU reporting and contractual obligations, and defines mechanisms for conflict resolution. The latter coordinates the overall scientific and technical activities of the network, supervising the network across all activities by coordinating interactions, monitoring the time schedule and recommending appropriate actions. The complementarity of the two roles allows each person to focus on achieving the network work plan from a different perspective. The Network Coordinator, Ms. Patricia Ho-Hune, has experience in coordinating a former successful Network of Excellence, namely MUSCLE. The Scientific Coordinator, Prof. Alan Smeaton, has extensive experience in scientific coordination at national, European, and worldwide levels.

To support the scientific and administrative management of the network, the following dedicated structures have been set up: a Management Board (MB), a Members General Assembly (MGA), an Industrial Advisory Board (IAB), a User Group (UG) and a Data Providers Group (DPG). These are discussed in detail in the following sections.

2.1.1 Network management organisation

The managerial bodies are independent democratic organs, yet linked to each other to ensure the overall coherence of their activities:

• Scientific Coordinator (SC)
• Project Coordinator (PC)
• Management Board (MB)
• Members General Assembly (MGA)
• Industrial Advisory Board (IAB)
• User Group (UG)
• Data Providers Group (DPG)

All partners will be represented in the Members General Assembly ensuring a democratic and transparent management of the network activities.

The overall network managerial organisation is presented in Figure 2.1 below.

The implementation of the work is structured in 7 work packages that synchronise and supervise the actual activities. As described in detail in Section 2.1, these WPs are headed by the corresponding WP-leaders who constitute the Management Board (MB) which is the most important decision-making body in the network.

[Figure 2.1: Network organisation. Elements shown: EC liaison; Scientific Coordinator; Project Coordinator; Project Office; Management Board (WP leaders, Scientific Coordinator, Project Coordinator); Data Provider Group; User Group; General Assembly; Industrial Advisory Board; Workpackages.]

As a consequence there is a direct link between WPs (that perform the actual work) and the central decision-making body that determines the direction in which the network is headed and this provides for integration across work packages and across partners. This short feedback loop will prove to be a crucial factor in the successful synchronization of the network’s goals.

Finally a moderated discussion list will provide a forum to be used by all the partners to initiate discussions and express opinions regarding both research and dissemination activities. This list is a virtual assembly of all the integrating network participants, which will be complemented by regular audio conferences and face to face meetings.

In joining this initiative, each of the partners has explicitly agreed to commit to this project.

2.1.1.1 Scientific Coordinator

The Scientific Coordinator has the responsibility of ensuring the overall scientific and technical coordination of the network, in line with the strategic decisions of the MB. Within this Network of Excellence, Prof. Alan Smeaton from Dublin City University will act as the Scientific Coordinator and will chair the Management Board.

Alan Smeaton has a long involvement in EU research funding going back to the first round of ESPRIT projects in 1984. He has been involved in most of the successive framework programmes throughout these years and has experience of involvement in Integrated Projects, Networks of Excellence, STREPs and Coordinated Actions. Most recently he has been a partner coordinator in the FP6 Network of Excellence K-Space. He has extensive experience of scientific coordination through his role as co-founder and coordinator of TRECVid, an annual benchmarking activity which has run since 2001 and which, in 2008 for example, involved almost 80 participating research groups from North America, Europe, Asia and Australia. He also has experience of direct management of scientific activities through his role as founding director of the Centre for Digital Video Processing at Dublin City University, a research centre with 45 funded researchers. Alan Smeaton is currently Deputy Director of CLARITY, the Centre for Sensor Web Technologies, which brings together the research activities of 80 researchers across 3 University sites in Ireland.

The Scientific Coordinator will monitor the technical work at the project and network level (e.g. detecting potential incompatibilities and technical problems that can delay the progress of the overall project), and will provide suggestions as to how problems can be solved among the work packages. He will also keep track of the technical integration work and will be responsible for every update of the systems in use. Furthermore, he will organize technical meetings and training activities, propose the Agenda for the technical meetings, and agree on the deliverables produced by the work packages. He is also responsible for liaising with related projects and initiatives.

The Scientific Coordinator is responsible for the quality of the deliverables produced.

Among his duties, the SC supports the PC in the continuous risk analysis and contingency planning tasks within the network.

2.1.1.2 Project Coordinator

ERCIM, acting as the Project Coordinator, will rely on a dedicated team with a thorough knowledge of network coordination. Patricia Ho-Hune, the Project Coordinator, already has 3 years of experience in this position and will be responsible for all administrative and financial coordination within and across work packages. She has already managed the MUSCLE Network of Excellence and is thus familiar with integration activities as well as with spreading excellence.

The Project coordinator shall report and be accountable to the Management Board.

Specific responsibilities include:

• Official point of contact with the European Commission and contract signatory with the EC
• Deputy Chair of the Management Board (including agenda setting, production of minutes)
• Collation and preparation of periodic reports and deliverables
• Monitoring of the Quality Assurance scheme
• Management of the delivery of major milestones
• Preparation of financial reports
• Processing and reimbursement of member costs
• Establishing efficient internal management and budget control procedures
• Organisation of periodic meetings
• Partnership and consortium modifications
• Intellectual property rights guidance
• Preparation of European Commission and closure reports
• Point of contact for conflict resolution
• Contractual amendments to the contract and to the consortium agreement

To ensure the continuity and consistency of the administrative management, the coordinating partner will operate a Project Office for the entire duration of the Network. This office will:

• Maintain a central archive of all documents produced within the network.
• Manage the distribution of information inside and outside the network.
• Maintain the work plan and produce consolidated reports on efforts, results, and resource consumption.
• Ensure the financial coordination of the network.

The Project Coordinator works closely with the Scientific Coordinator and the Management Board in carrying out these activities.

2.1.1.3 Management Board

The Management Board (MB) is the executive decision-making board of the consortium. It is in charge of setting policies and strategic decision-making and acts as the executive and supervisory body for the network, reporting and accountable to network members. To be operational and yield sufficient executive power, the management board shall be made up of all WP leaders. In this network, the following members will sit on the MB:

• the work package (WP) leaders, to ensure efficient steering of the network’s underlying activities,
• the network Scientific Coordinator, who also chairs the Management Board,
• the Network Coordinator, who is also Deputy Chair of the Management Board.

Management Board Composition

Role | Name | Institution
Scientific coordinator and WP1B Leader | Prof. Alan Smeaton | DCU
Project coordinator and WP1A Leader | Ms. Patricia Ho-Hune | ERCIM
WP2 Leader | Prof. Vojtěch Svátek | UEP
WP3 Leader | Prof. Ebroul Izquierdo | QMUL
WP4 Leader | Dr. Paul Buitelaar | NUIG
WP5 Leader | Prof. Maarten de Rijke | U Amsterdam
WP6 Leader | Prof. Joemon Jose | U Glasgow
WP7 Leader | Prof. Noel O’Connor | DCU

The Management Board is responsible for coordinating the network and is in charge of all operational management aspects of the network.

Meetings will be held at least twice, and probably three times, a year, and dedicated audio-conferences will be organised monthly, allowing for close monitoring of the network’s day-to-day activities. If necessary, the Management Board members can call for additional live meetings and audio-conferences to address particular issues.

A majority of two thirds is required for decisions, based on one vote per member. The Consortium Agreement will document those decisions where a greater percentage in favour is required.
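The two-thirds rule can be illustrated with a minimal sketch. The proposal does not say whether the majority is counted over all members or only those voting, nor how abstentions are treated; the function below assumes the threshold is applied to the members casting a vote, and its name and parameters are illustrative only.

```python
from fractions import Fraction


def motion_carries(votes_in_favour, members_voting, threshold=Fraction(2, 3)):
    """Return True if the share of votes in favour reaches the threshold.

    Assumes one vote per member and that the threshold applies to the
    members who vote (an assumption; the Consortium Agreement may differ).
    """
    if members_voting <= 0:
        raise ValueError("at least one member must vote")
    if not 0 <= votes_in_favour <= members_voting:
        raise ValueError("votes in favour must lie between 0 and members voting")
    # Exact rational arithmetic avoids rounding problems at the boundary.
    return Fraction(votes_in_favour, members_voting) >= threshold
```

Under these assumptions, a board of eight voting members would need six votes in favour, since five of eight falls short of two thirds. Decisions that require a greater percentage can be checked by passing a different threshold.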

2.1.1.4 Members General Assembly

The Members General Assembly (MGA) is chaired by the Project Coordinator. The MGA will exist through a general mailing list and an annual plenary meeting, with telephone conferences convened as relevant issues arise. The annual plenary meeting will be the opportunity to gather all consortium players, to promote cooperation and integration, provide the Management Board with a broad audience to present the main network orientations, and give the different network bodies the opportunity to meet.

The MGA will support one plenary meeting per year, to which all SERENDIPITI participants are invited, and possibly also external guests.

2.1.1.5 Work package Leaders

Work package Leaders will ensure the monitoring and coordination of all the activities ongoing in their work package. They are responsible to the Management Board.

The work package Leaders will organize work package meetings as required, making extensive use of dedicated tools ranging from mailing lists to audio-conferencing services.

Given the multidisciplinary nature of the SERENDIPITI network, establishing communication across work packages will be essential. Audio-conferencing services on-demand will be made available and use of cooperative workspaces and internet-based tools strongly encouraged.

To ensure that contributors to work packages work and cooperate in an integrated manner, all deliverables produced by a given work package will be reviewed by at least two other work packages, under the responsibility of the WP leaders and of the Scientific coordinator. The objective is to link all WP activities closely in order to maintain constant communication across activities, so as to ensure the early identification of problems such as interoperability issues or delays.
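The cross-review rule above (each deliverable reviewed by at least two other work packages) can be sketched as a simple rotation. The proposal does not prescribe how reviewers are chosen; the round-robin scheme, function name and work package labels below are assumptions made for illustration.

```python
def assign_reviewers(work_packages, reviewers_per_deliverable=2):
    """Map each work package to the work packages that review its deliverables.

    Uses a round-robin rotation: each work package is reviewed by the next
    ones in the list, wrapping around, so no work package reviews itself
    and the review load is spread evenly.
    """
    n = len(work_packages)
    if reviewers_per_deliverable >= n:
        raise ValueError("not enough other work packages to act as reviewers")
    return {
        wp: [work_packages[(i + k) % n]
             for k in range(1, reviewers_per_deliverable + 1)]
        for i, wp in enumerate(work_packages)
    }
```

With seven work packages labelled WP1 to WP7, this rotation would have deliverables from WP7 reviewed by WP1 and WP2. In practice the WP leaders and the Scientific Coordinator may prefer to match reviewers to topics rather than rotate mechanically.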

2.1.1.6 Industrial Advisory Board

The Industrial Advisory Board (IAB) is a panel of external industrial experts and academic researchers who have a strong track record in fields related to SERENDIPITI. Their role will be to advise on network strategy and complex technical decisions. The IAB will comprise:

• Dr. Kenneth Wood, Deputy Managing Director of Microsoft Research, Cambridge, UK
• Dr. Paulo Villegas, Telefonica I+D, Madrid, Spain
• Dr. Gábor Prószéky, MorphoLogic, Budapest, Hungary
• Name 4
• Name 5

These are all senior researchers with extensive experience and leadership track records and we are delighted that all have agreed to join the SERENDIPITI IAB. Their letters of support can be seen in Annex 2. The IAB members will be requested to sign a confidentiality agreement in order to be registered on the project document repository for limited access to network reports and deliverables. These external industrial experts will meet at least once per year with the Project Management Board at a dedicated annual meeting.

2.1.1.7 Data Providers Group

The Data Providers Group will be constituted as a grouping of online sensor information sources which will be harvested and aggregated by the SERENDIPITI platform. It is an open-ended grouping: we expect that throughout the lifetime of the network, and beyond, the number and diversity of information sources we use will increase. Defining the Data Providers Group at this time would therefore be impossible, or at best damaging to the network’s plans. Instead we have recruited some key players, and Annex 3 includes letters of support from the Irish Meteorological Office (the Irish state weather bureau), the Irish Marine Institute, the blog site boards.ie, Dublin City Council, Prague City Hall, Sound and Vision, and others.

The grouping is not constituted as a “board” per se in that it does not have a formal role in the management of the SERENDIPITI network. Participation in the Data Providers Group is seen as a kind of reward for these organisations’ commitment to SERENDIPITI: the group benefits by getting first sight of technical and demonstration developments within the network. This gives data providers an early view of opportunities to leverage and use the SERENDIPITI platform, and possibly to move from being just a data provider to also being a user. This is shown in Figure 2.2, where the arrow connecting the Data Provider Group to the User Group shows an ideal migration from data provider to actual user.

2.1.1.8 User group

The SERENDIPITI user group is constituted as a grouping of key users for whom applications on the SERENDIPITI platform will be developed during the lifetime of the network, and beyond. Initially this will be composed of key users in the city event planning, journalist, and security areas.

As can be seen in the letters of support in Annex 4, we already have commitment from Dublin City Council (event planning), the Korps Landelijke Politiediensten (KLPD, Dutch National Police Service) (event planning and security), Amsterdamse Innovatie Motor, and others.

Like the Data Provider Group, the User Group is not constituted as a formal board and does not have any formal role in the management of the SERENDIPITI network. Participation in the user group is seen as an opportunity for these organisations to get first sight of technical and demonstration developments on the SERENDIPITI platform, as well as an opportunity for the end users to network together.

[Figure 2.2: Information Flow within SERENDIPITI Management Structure. Groups shown: User Group (City Event Planning, Journalist/Citizen, Police, others); Data Provider Group (Wikimedia Foundation, Boards.ie, Twitter, Sound and Vision, EPA/MI/Met, others); Industrial Advisory Board (Ken Wood, Paulo Villegas, Gábor Prószéky, others).]

2.1.2 Management Activities

For the management to coordinate the SERENDIPITI network efficiently, several underlying activities will have to be carried out throughout the network’s duration. The scope of these managerial activities, ranging from “Quality Control” and “Reporting” to “Technology Watch” and “Risk Assessment”, is described in the following sections.

Quality Control

Quality Control will be part of the Quality Assurance System within the Project Management Guidelines which will be developed for the implementation of the network. Processes and procedures within the Quality Assurance System will have to demonstrate compliance with the general principles of standards, law and certifications.

More specifically the Quality System will be part of the Quality Assurance Plan which will be a deliverable of the network (D1.2). This document will be used internally by the consortium to describe the guidelines adopted by the network on documentation of network activities, periodic reporting, preparation of financial statements, approval and submission of deliverables, and risk management.

The development of the Quality System, in order to facilitate the implementation of the quality management, will include the following phases:

• Identification of the procedures needed
• Planning, design and development of the procedures and the forms to be implemented
• Development of an implementation guide for all the partners

Management Reporting

The Project Office (PO) will set up standard management reporting templates for the management of the programme. All work packages will agree Network Milestones with the PO. Each milestone should provide assurance on timely achievement of a major work package deliverable.

All work packages will provide the PO with an updated one page report on a monthly basis. The report will cover:

• Activities during the reporting period
• Major achievement of milestones against plan
• Project and work package meetings
• Deviations from plan, risks and issues affecting timely delivery, cost or quality
• Next steps and foreseen actions

In addition, network monitoring will be completed with a six-monthly report, validating the major progress of the network against the contractual milestones and presenting the resources utilised versus the resources budgeted in terms of person-months for every team across the different work packages.
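The comparison of resources utilised against resources budgeted can be sketched as follows. The person-month allocations in this proposal are not filled in, so the data layout (a mapping from partner and work package to person-months), the function name and any figures used with it are hypothetical.

```python
def effort_report(budgeted, used):
    """Compare used and budgeted person-months per (partner, work package).

    Both arguments map (partner, work_package) tuples to person-months.
    Entries missing from `used` are treated as zero effort so far.
    """
    rows = []
    for partner, work_package in sorted(budgeted):
        planned = budgeted[(partner, work_package)]
        spent = used.get((partner, work_package), 0.0)
        rows.append({
            "partner": partner,
            "work_package": work_package,
            "budgeted": planned,
            "used": spent,
            "remaining": planned - spent,
            # Share of the budget consumed; undefined for a zero budget.
            "share_used": spent / planned if planned else None,
        })
    return rows
```

For example, a hypothetical budget of 12 person-months for DCU in WP4 with 3 already used would yield 9 remaining and a used share of 0.25. The Project Office could feed such rows into the six-monthly report alongside the milestone status.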

The Project Office will merge these reports into a single consolidated report, which will be submitted to the Management Board for validation and reviewed in detail with the Project Coordinator.

Finally, completion of work package activities and submission of deliverables after a careful internal assessment will be ultimately validated by the Scientific Coordinator and the Project Coordinator.

Management Tools

A project management reporting tool will be used by the PC to produce standard reports. A web-server will be set up (as part of the network’s web portal) with private access for consortium members, offering online reporting and monitoring, a project library containing all current documents relating to the network, a news board and a management information exchange. This web-based tool will simplify the collection of information across different teams and work packages.

In addition, a collaborative working platform (e.g. BSCW) will also be used to support efficient collaboration between the partners. The ERCIM BSCW licence will be of benefit to all partners and will act as an internal document repository, not only for the exchange of documents in progress, but also for storing final and validated reports or deliverables.

Information Flow

A key success factor in project management is to ensure that information circulates rapidly and efficiently to all of the network’s actors and stakeholders.

To this end, the management will rely on a wide array of communication support tools. First, dedicated mailing lists will be created and archived (one for each work package, for the entire network and one for every managerial body of the network). In addition, the network will rely extensively on free audio-conferencing for addressing technical or managerial issues. The use of voice over IP tools will also be encouraged. Periodic technical and management meetings will also be organised to support exchanges and discussions within the network. If necessary, collaborative workspace licences can be acquired to further support collaboration among research teams. Ultimately, all efforts will be made by the management to support fluid information flow and to avoid information bottlenecks.

Consortium Agreement

Before the network starts, the consortium members will sign a formal Consortium Agreement in which roles, responsibilities and mutual obligations will be defined. These will include the sensitive questions of intellectual property rights (IPR), and the structure and organization of the network. It will adopt the recommended guidelines laid down by the Commission and will include:

• Specific arrangements concerning intellectual property rights to be applied among the participants and their affiliates, in compliance with the general arrangements stipulated in the contract.
• Management of knowledge generated by the network, and rules for knowledge transfer.
• Internal organization of the consortium, its governance structure, decision-making processes, reporting mechanisms, controls, penalties and management arrangements.
• Arrangements for the distribution of the community contribution among participants and among activities.
• Rules for partners joining and leaving the consortium.
• Provisions for the settlement of disputes within the partnership.

In addition, the Consortium Agreement will define rules to distribute knowledge in proportion to the effort leading to the generation of such knowledge. In case of debate, the Management Board will have the final say, and any conflicts will be resolved using specific voting mechanisms defined in the Consortium Agreement.

The owner of knowledge must provide adequate and effective protection for knowledge that is capable of industrial or commercial application. The consortium participants may publish information on knowledge arising from the network provided this does not affect the protection of that knowledge. So before any knowledge dissemination takes place, the matter must be agreed with the Management Board.

Participants will also be able to use knowledge which they own arising from the network, in accordance with the provisions agreed amongst them in the Consortium Agreement. When using knowledge, the consortium partners will make every effort to ensure confidentiality and to safeguard the interests of the consortium partners, especially their intellectual property rights.

Finally the Consortium Agreement will document in detail the treatment of intellectual property rights, including:

• Protection of knowledge
• Access rights for use in the network
• Access rights for using knowledge in subsequent research activities
• Access rights for sub-contractors
• Access rights for parties joining or leaving the network
• Access rights for third parties
• Specific provisions for access rights to software
• Royalties resulting from substantial commercial benefits

Conflict Resolution and Relationship Breakdown

The consortium decision-making process is aimed at building consensus throughout the network with the activities of one partner not having adverse effects on the activities of another partner.

In the event that disputes or differences arise that cannot be resolved, the following process shall be followed:

Disputes within a work package that cannot be resolved internally by the work package leader should be referred to the Project Coordinator who will attempt to reconcile differences.

If this does not resolve the dispute, the Project Coordinator will table the issue for discussion at the earliest opportunity with the Management Board.

In case the dispute remains after discussion with the Management Board, the conflict will be presented to the Member General Assembly.

The final settlement of outstanding disputes will be managed through arbitration in Brussels under the rules of arbitration of the International Chamber of Commerce by an arbitration panel appointed under those rules. The award of the arbitration panel will be final and binding upon the partners concerned.

Where the dispute concerns intellectual property, the dispute can be elevated to the Management Board that can request the assistance of the European Commission IPR helpdesk or require the creation of an IPR External Advisory Panel to provide counsel and advice. The decisions of an IPR Strategic Task Force in such matters are binding for all partners.

2.2 INDIVIDUAL PARTICIPANTS

European Research Consortium for Informatics and Mathematics – ERCIM

ERCIM, founded in 1988, is an EEIG which provides simple and effective management for joint projects between its members. ERCIM has its central office located in France and acts as a front-end to access the scientific expertise of its members. ERCIM members are all research organisations, or national consortia of research organisations and universities, with strong activity in I.T. research, namely: AARIT (Austria), CCLRC (UK), CRCIM (Czech Republic), CWI (The Netherlands), CNR (Italy), FNR (Luxembourg), FNRS/FWO (Belgium), FORTH (Greece), FhG (Germany), INRIA (France), NTNU (Norway), SARIT (Switzerland), SICS (Sweden), SPARCIM (Spain), SRCIM (Slovakia), SZTAKI (Hungary), TCD (Ireland), VTT (Finland), PLERCIM (Poland), PEG (Portugal). ERCIM also became the European host for the W3C on January 1, 2003.

The central ERCIM Office in Sophia Antipolis, France will coordinate the project. Since 1990, the ERCIM Office team has developed a proven expertise in managing research projects and large networks, and has been involved in over 30 projects funded by various European Community programmes. Community-funded projects under the Seventh Framework Programme include the D4Science project and Coordination and Support Actions Digital World Forum and MobiWeb2.0. Sixth Framework Programme projects funded by the EC and managed by ERCIM Office include the Networks of Excellence DELOS, MUSCLE and CoreGRID, Specific Support Action GRID@ASIA, Coordination Action EchoGRID, STREP GridCOMP and Net-WMS, and Integrated Projects DILIGENT, ACGT and VITALAS. The ERCIM Office has also a very valuable competence in results dissemination, as part of its assets rest with customized web design, set-up and assistance, and the edition of the ERCIM News magazine (over 11500 copies distributed worldwide). This, combined with the nineteen research organizations disseminated across Europe makes ERCIM a key player in
European IT research and development, and a reliable foothold for international cooperation.

Patricia Ho-Hune graduated from the University of Paris XI and obtained her PhD in chemistry at Loughborough University (UK) in 2002. She worked as a research chemist for a pharmaceutical company where she was involved in a FP5 European project. In 2005 she joined the European Polysaccharide Network of Excellence (EPNoE) and was in charge of the administrative project management as well as the organisation of research activities. She joined ERCIM in October 2006 and has been managing several successful projects such as the MUSCLE NoE, the GridCOMP STREP, the EchoGRID SSA and the Interlink CA.

Celine Bitoune is ERCIM Projects group financial coordinator responsible for optimizing W3C and ERCIM human and financial resources across all European project activities. In addition she is in charge of the financial and administrative management of all World Wide Web Consortium (W3C) funded projects, among which, WAI-AGE and MobiWeb2.0 (two Coordination and Support Actions). She has written two financial guidelines to European Commission financial rules for participation in funded projects (Frameworks 6 and 7), widely distributed within and beyond the ERCIM consortium. Ms. Bitoune joined ERCIM in 2004 as Assistant and as Task Leader of the Mobility Programme to the CoreGRID Network of Excellence involving 42 partners. Prior to ERCIM, she worked in EURECOM as financial and administrative coordinator of the institute’s European and national funded projects.

The project Office will be in charge of the Administrative and Financial management of the Network and will also lead tasks in the Work packages dedicated to integration and dissemination activities, namely the management of the Researcher Mobility Program, the set-up of project website and the preparation of promotional materials. It should be noted that GEIE ERCIM has its own International Fellowship Programme and Internal Mobility scheme. It also organises prospective workshops (like the EU-NSF ones), seminars and conferences. The project Office will bring its experience in disseminating the project results with the support of its Web team (web design, set-up and assistance) and with its Communication Manager who will provide assistance for designing the brochures and other materials. This, combined with the twenty research organisations disseminated across Europe makes GEIE ERCIM a reliable partner for international cooperation.

Dublin City University (DCU)

CLARITY: The Centre for Sensor Web Technologies at Dublin City University (DCU) is an Irish national research centre that focuses on the intersection between two important areas, Adaptive Sensing and Information Discovery. It is funded by Science Foundation Ireland and directly by industry partners including IBM, Vodafone, Amdocs, Episensor, and others. CLARITY develops innovative new technologies towards improving the quality of life of people in areas such as personal health, digital media and management of our environment. CLARITY is a partnership between University College Dublin and Dublin City University, supported by research at the Tyndall National Institute (TNI) Cork. The overarching theme of CLARITY's research programme – bringing information to life – refers to the harvesting and harnessing of large volumes of sensed information, from both the physical world in which we live, and the digital world of modern communications & computing. CLARITY brings the expertise of more than 80 researchers plus a management and administration team in areas as diverse as new sensor technologies, wireless networking, multi-modal imaging and analysis, content analysis and access, distributed artificial intelligence and multi-agent systems. The group has taken part in previous and current EU projects including the FP7 network 3DLife, the FP6 Integrated Project aceMedia and FP7 NoE K-Space as well as further projects going back several years.

Alan Smeaton is Professor of Computing in DCU and Deputy Director of CLARITY. He is a founding coordinator of TRECVid which, since 2001, has been an annual world-wide initiative to benchmark the effectiveness of information retrieval from digital video libraries coordinated by NIST and funded by the US Dept. of Commerce. His early work focussed on text-based information retrieval then moved to information retrieval of images and then video information. He has graduated more than 25 PhD and M.Sc. research students and currently leads a team of 16 researchers at postdoctoral and PhD levels. He is a member of the editorial boards of 5 journals and has published almost 300 refereed papers/book chapters/proceedings. He has been program chair or co-chair for nearly 10 international conferences and is on the program committees of between 10 and 20 conferences each year. He has 6 patents and has won significant grant income from national and international funding agencies, and from industry.

Prof. Noel E. O’Connor is an Associate Professor in DCU and a Principal Investigator (PI) in CLARITY, with responsibility for the research strand on Contextual Content Analysis. His research is focused on multi-modal analysis for knowledge extraction from a variety of sensor data sources. His research is funded under Science Foundation Ireland and Enterprise Ireland, EU framework projects and industry contracts. Since July 2000, he has generated over €7.9M in funding, edited 3 books of proceedings and published over 140 peer-reviewed papers in journals and conferences. He is a reviewer for Signal Processing: Image Communication, IEEE Trans. on Circuits Systems and Video Technology, IEEE Trans. on Image Processing, Pattern Recognition and Computer Vision and Image Understanding.

In 2008, Bert Gordijn was appointed to the Chair of Ethics at DCU where he now leads DCU’s Ethics Institute. Previously, Dr Gordijn held the post of Lecturer and Clinical Ethicist within the Department of Ethics, Philosophy & History of Medicine at the Radboud University, Nijmegen (Netherlands). He has been appointed to the Scientific Advisory Board of the European Patent Office, the External Science Advisory Panel of the European Chemical Industry Council, and the UNESCO expert committee on ethics and nanotechnology. He holds a number of editorial positions in key international ethics publications including Editor-in-Chief of the International Library of Ethics, Law and Technology, published by Springer.

Prof. Barry Smyth holds the Digital Chair of Computer Science in University College Dublin and is a visiting Professor at Dublin City University. He is the Director of CLARITY, an ECCAI Fellow and a co-founder, director, and Chief Scientist of ChangingWorlds Ltd., recently acquired by Amdocs. His research covers a broad set of topics within Artificial Intelligence including Case-Based Reasoning, Machine Learning, User Modeling and Planning with particular focus on Personalization techniques, which look at ways of combining ideas from these areas to develop information systems that automatically learn about, and adapt to, the needs of individual users. He has published over 300 refereed articles and has several best paper awards.

Apart from providing scientific leadership of the network, DCU will coordinate work package 7, Outreach and Spreading Excellence, which will involve all partners. Most of DCU’s efforts are based in work package 3 where they lead the activity on Traffic/Crowd Monitoring from Real Time Multimedia Sources (A3.3) and DCU’s expertise in video and image processing from the Centre for Digital Video Processing gives a strong background and skillset which can be leveraged. DCU already has ongoing funded projects as part of CLARITY in the area of real time video processing from sports footage and this will prove useful for this work package. DCU is also heavily involved in work package 6 on Infrastructure Integration and Sharing, where they lead tasks on Data Sets and Ground Truth Generation (A6.3) and on Case Studies (A6.4). The background in formal, worldwide benchmarking activities (TREC and TRECVid), as well as the direct contact with user groups and data providers in Dublin, makes DCU a natural choice for involvement in these activities. Finally, DCU also leads the activity on Societal Impacts in A7.4 where the Institute of Ethics in DCU will have a strong presence in ensuring rigorous and thorough examination of the impact of SERENDIPITI work on the general public in terms of privacy, identity, trust and accessibility.

University of Glasgow (GLA)

The University of Glasgow (GLA, UK) dates from 1451, when the Scottish King James II persuaded Pope Nicholas V to grant a bull authorising the founding of a university in the city. Modelled on the University of Bologna, Glasgow was, and has remained, a University in the European tradition and is the fourth oldest university in the English-speaking world. Today, the University of Glasgow is one of the top 100 universities in the world with an international reputation for its research and teaching. The University of Glasgow has more than 6,000 staff, including 2,500 researchers. Glasgow is a member of the prestigious Russell Group of 20 leading UK research universities, a founder member of Universitas 21, an international grouping of universities dedicated to setting worldwide standards for higher education, and a member of IRUN (International Research Universities Network), an international network of broad-based research universities. The University's activities are organised into 9 faculties containing 45 specialist departments between them. One of the larger specialist departments is the Department of Computing Science (DCS), with 31 full-time academic staff, which houses the Information Retrieval Group as one of its 7 research laboratories. In the recent UK research assessment exercise, 80% of DCS researchers were categorised as internationally leading.

The IR Group in Glasgow has six senior academics, headed by Professor Keith van Rijsbergen, fifteen post-doctoral research fellows and fifteen PhD students. The Information Retrieval Group has a vigorous programme of research, based on both theory and experiment, aimed at developing novel, effective, and efficient retrieval approaches for all types of information. The group plays a leading role in the international information retrieval community and has set trends in many aspects of IR research. The group has a long and strong research history in a wide area of information retrieval research from theoretical modelling of the retrieval process to large-scale text retrieval systems building and to the interactive evaluation of multimedia and multimodal information retrieval systems. The group maintains strong links with researchers in Machine Learning and Human-Computer Interaction, as well as with industry through knowledge and technology transfer. Members of the group have also been extensively involved in organising major conferences (ECIR 2002, 2007, CIKM 2010), workshops and summer schools in the area of information retrieval. It has been a partner in the EU FP6 SEMEDIA, SIMAP, MIAUCE, SALERO and K-Space projects.

Joemon M. Jose is a Professor at the DCS. He is a fellow of the BCS, a chartered information technology professional (CITP), and a member of the ACM, IEEE, and IET professional bodies. He has a well-established reputation in research on multimedia information retrieval, developing advanced retrieval models, studying the role of emotion in search, personalization and adaptive retrieval. He has published over 110 journal and conference articles and leads a team of 6 PhD students and 12 post-doctoral researchers. He currently holds 2 research grants from the EU-IST programme (SALERO and MIAUCE) on multimedia retrieval and multi-modal interaction. He was also involved in the SEMEDIA, K-SPACE and IP-RACINE projects. For the academic year 2004-05, he was a visiting research scientist at the School of Computer Science at Carnegie Mellon University, Pittsburgh. He has strong links with industry and has held an industrial case grant supported by Sharp Laboratories Europe Ltd. He was the recipient of a short-term research fellowship (STRF 2003) from BT Exact Laboratories on adaptive retrieval. He has organised conferences and events related to multimedia information retrieval including AIR 2008, SSMS 2007, AIR 2006, AMR 2005, IRiX 2005 & IRFEST 2005 and was guest editor for the Information Processing and Management on Adaptive Retrieval.

C. J. (Keith) van Rijsbergen is an Emeritus Professor at the DCS and his research in Information Retrieval covers both theoretical and experimental aspects. He has written two books on information retrieval and is co-author of "Information Retrieval: Uncertainty and Logics". His research interests are in the mathematical foundations of information retrieval and he is currently developing an IR model based on quantum theory. He is a fellow of the Royal Academy of Engineering, the Royal Society of Edinburgh, the IET, the BCS and the ACM. He is the chair of the RAE assessment panel for computing science and of the Information Retrieval Faculty advisory board. van Rijsbergen was the recipient of the Tony Kent Strix Award in 2004 and the Gerald Salton Award in 2006.

Mark Girolami has been Professor of Computing & Inferential Science since 2006 at the University of Glasgow, where he holds a joint appointment in the Department of Computing Science and the Department of Statistics. He was awarded an EPSRC Advanced Research Fellowship (EP/E052029/1) in 2007 to undertake a five year study related to the Synthesis of Probabilistic Prediction and Mechanistic Modelling within Systems Biology through which he has pioneered the exploitation of Bayes factors in evidentially ranking mechanistic models of biological systems. At Glasgow he leads the cross-disciplinary Inference Research Group (www.dcs.gla.ac.uk/inference) currently consisting of 17 researchers (8 Post Doctoral RAs and 9 PhD students) where the main focus of research is methodological development in statistical inference and machine learning with major applications in computational and systems biology. His recent successful applications include statistical modeling of telecom fraud and counterfeit currency detection. Related to this proposal is a ‘Fast Breaking’ paper, according to Thomson Reuters’ ScienceWatch, which places it among the top 1% of highly cited computing science papers since 2008. His PhD was awarded in 1998 for a thesis on Independent Component Analysis and there are currently in excess of 1000 citations to the papers published during his PhD studies. His research is funded by current grants from EPSRC, Microsoft Research Europe and various industrial collaborators. Along with UK based industrial collaborators he is an author of three international patents that have emerged from his research work. Mark is a member of the EPSRC Peer Review College (2005-2010), serves on the Medical Research Council (MRC) Bioinformatics Training Panel (2005-2009), and is an associate-editor for IEEE Trans Neural Networks and Pattern Recognition Letters.

GLA will coordinate work package 6, Infrastructure Integration and Sharing, which will involve all partners. In this work package they will coordinate the Activity 6.2 on Evaluation methodology bringing in the expertise from the IR and HCI fields. In addition, GLA will coordinate Activity 2.3 on cooperation in education, Teaching Materials and PhD formation by bringing in experience from similar activities in the KSpace NoE. They will also lead RTD activities on Real-time Multimodal Data Synchronisation (A3.6), Adaptive Search Models (A4.5), Dynamic network analysis (A5.4) by utilising their prior expertise in multimedia retrieval, development of advanced retrieval models like Logic based retrieval, Divergence from randomness model, effective user profiling techniques, and
machine learning. Specifically, they will bring in their expertise from the EPSRC funded ADAPT project and IST funded MIAUCE, SEMEDIA, SALERO and K-Space projects.

University of Amsterdam (UvA)

The University of Amsterdam joins with researchers from the Informatics Institute (Faculty of Science). The Intelligent Systems Laboratory Amsterdam (ISLA) within the Informatics Institute focuses on processing information in pictorial, auditory and/or textual form and the consequences such information has for subsequent actions. ISLA members are interested in pictorial databases, learning from text and pictures, computer vision, multimedia and multi-modal information integration, link discovery, and agent technology. These topics are covered from theory to practice, from basic principles to applications. ISLA has gained very substantial experience in many areas relevant for the project, especially WP3, 4 and 5: Web Search Engines, Access to Cultural Heritage, XML Retrieval, Web Mining, Language Technology, Machine Translation, Log analysis, User generated Content and Social Media Analysis, Semantic Search. ISLA is involved or has been involved in many national and EU funded projects, and collaborates with a large number of research groups, both nationally and internationally. Of special relevance for SERENDIPITI are the MultimediaN project on the development of multimedia information technology for usage in high-end applications (in media, intelligence, and heritage), VIDI-Video (video retrieval), DuOMAn (online media analysis), TNT (news monitoring and impact prediction), and the AID project on semantic access to scientific information within a virtual lab environment for e-science.

Maarten de Rijke is full professor of Information Processing and Internet in the Informatics Institute at the University of Amsterdam. In 2001 he was awarded a Pionier grant, the highest personal award in the Netherlands on the basis of a proposal. He leads the Information and Language Processing Systems group (with 25 researchers), largely funded from external sources. A (past) co-coordinator of evaluation efforts at CLEF, INEX, and TREC, his current research focus is on intelligent web information access, with projects on urban and vertical search engines, semi-structured documents, cross-channel mining, media analysis, user generated content, entity retrieval, and multilingual information. He has published over 400 papers and books and is the director of ISLA (with 65 staff), of UvA’s Information Science bachelor program and of its newly founded Center for Content, Creation and Technology (CCCT), a collaboration between UvA’s faculties of science, behavioural science, and humanities in which news and user generated content are focus areas.

Christof Monz is Assistant Professor of Natural Language Processing within the Information and Language Processing Systems group. Previously, he was an assistant professor in computer science at Queen Mary, University of London, UK, and a post-doctoral research fellow at the University of Maryland Institute for Advanced Computer Studies (UMIACS), USA. He holds a PhD in Computer Science from the University of Amsterdam, and a degree in Computational Linguistics (summa cum laude) from the Institute for Natural Language Processing, University of Stuttgart, Germany. He has co-authored more than 40 refereed publications and served as member of over a dozen conference program committees. Having led several large-scale software implementation efforts, he leads a project on statistical machine translation funded by a UK governmental end-user and CoSyne, an FP7 project on machine translation that involves urban and user generated content.

Arnold W.M. Smeulders graduated from Technical University of Delft in physics in 1977 (M.Sc.) and in 1982 from Leyden University in medicine (Ph.D.) on the topic of visual pattern analysis. In 1994, he became full professor in multimedia information analysis at the UvA. He has an interest in cognitive vision, content-based image retrieval, multi-modal analysis as well as in systems for the analysis of video. He has written over 250 papers in refereed journals and conferences. In 2000, he was elected fellow of the International Association of Pattern Recognition. He was associate editor of IEEE Transactions on PAMI. Currently he is associate editor of the International Journal of Computer Vision as well as the IEEE Transactions on Multimedia. He is also scientific director of the MultimediaN national public-private partnership of 30 institutions and companies, and of the national research school ASCI. He is principal investigator of VIDI-Video, an FP6 project on video retrieval. He has graduated over 30 PhD students.

UvA will coordinate WP5, Inference and Event Prediction, which will involve all partners except ERCIM. Most of UvA's efforts are based in WP4 (where they lead the activity on interlinking real-time knowledge (A4.2)) and WP5, where UvA's broad expertise in, and range of ongoing projects on, mining and prediction for video, text and sensory data provides a solid springboard. UvA is also heavily involved in WP3 and in WP6, where they lead the activity on the SERENDIPITI platform (A6.5). UvA's long-running and significant investments in infrastructure for textual, video and sensory analysis and search make it a natural choice for involvement in the platform-related activities. Finally, UvA also leads the activity on the SERENDIPITI Platform Challenge (A7.5), where UvA's close involvement in CLEF and other formal, worldwide benchmarking activities (INEX, TAC, TREC) will ensure proper methodological founding and organisational set-up.

National University of Ireland, Galway (NUIG)

The Digital Enterprise Research Institute (DERI), part of the National University of Ireland, Galway is a leading research institute in Web Science, networked knowledge and semantic technologies. DERI has developed an institute strategy around the “Networked Knowledge” topic consisting of Semantic Reality and Social Semantic Information Spaces as main components and performs research on the Semantic Web, social networks, sensor network platforms, and semantic web services. Research results are applied to solve integration problems in areas such as eLearning, eGovernment, eBusiness, and eHealth. DERI develops advanced Semantic Web infrastructures, such as platforms for running large-scale, data-intensive experiments, which facilitate collaborative social working environments, scalable storage and reasoning engines, distributed computing, and ontology development (WebStar), models for information integration, e.g. SIOC (Semantically Interlinked Online Communities), platforms for distributed social semantic desktops (NEPOMUK), and semantic search engines (SWSE, Sindice, SIG.MA). In Sensor Networks, DERI is co-developer (together with EPFL, Switzerland) of the Global Sensor Networks (GSN) platform which provides a flexible middleware layer that abstracts from the underlying, heterogeneous sensor network technologies and supports fast and simple deployment and integration of sensor networks on a large scale. In Semantic Web Services, DERI extensively contributes to the framework of SWS technology in the context of the Web Service Modelling Ontology (WSMO), the Web Service Modelling Language (WSML) and the corresponding architecture and Web Service Execution environment (WSMX).

DERI has been very successful in bringing EU funding from the FP6 programme into Ireland in the last 5 years. To date DERI has participated in 6 IPs, 8 STREPs, 2 NoEs and 1 CA, with total funding of €8 million since 2004. DERI actively participates in and leads research funded by the EU FP7 programme (FAST, Romulus, Okkam, CONET, PECES, iMP, MONNET and Net2 Marie Curie IRSES) and projects in other complementary work programmes (EATRAIN LLP, WAVE CIP eParticipation PSP, Rural inclusion CIP ICT PSP, eCAALYX AAL). Additionally, DERI has been successful in attracting over €3M from industry-oriented funding through Enterprise Ireland. The DERI ecosystems, known as DERI LAB and DERI LAND, have both augmented the presence of existing multinational corporations and have started to attract new companies to Ireland, while also providing the environment for spin-out opportunities.

DERI members actively participate in a number of standardisation activities such as the Wireless Sensors Enterprise Led Network (Ireland), the W3C Semantic Annotations for Web Services Description Language and XML Schema Working Group (W3C SAWSDL WG), OASIS Semantic Execution Environment Technical Committee (OASIS SEE TC), Semantic Web Services Challenge (SWS Challenge), Semantic Web Services Initiative (SWSI), and the Ontology Management Working Group (OMWG). Semantically Interlinked Online Communities (SIOC) is a member submission to the W3C.

Dr. Paul Buitelaar (PhD 1998, Computer Science, Brandeis University, USA) is a senior research fellow and head of the newly established DERI Unit for Natural Language Processing. Before joining DERI in 2009, he was a senior researcher at the DFKI Language Technology Lab and co-head of the DFKI Competence Center Semantic Web in Saarbrücken, Germany. His main research interests are in language technology for semantic-based information access. He has been a researcher and/or project leader on a number of national and international funded projects, e.g. on concept-based and cross-lingual information retrieval (MuchMore), semantic navigation (VIeWs), ontology-based information extraction and ontology learning (SmartWeb, Theseus-MEDICO), semantic-based multimedia analysis (K-Space). See also http://www.paulbuitelaar.net/

Dr. Alexandre Passant is a postdoctoral researcher at the Digital Enterprise Research Institute (DERI), National University of Ireland, Galway, where he co-leads the Social Software Unit. His research activities focus around the Semantic Web and Social Software: in particular, how these fields can interact with and benefit from each other in order to provide a socially-enabled machine-readable Web, leading to new services and paradigms for end-users. Prior to joining DERI, he was a PhD student at Université Paris-Sorbonne and carried out applied research work on "Semantic Web technologies for Enterprise 2.0" at Electricité De France, defended summa cum laude in June 2009. He is the co-author of SIOC, a model to represent the activities of online communities on the Semantic Web, the author of MOAT, a framework to let people tag their content using Semantic Web technologies, and is also involved in various related applications as well as standardization activities in W3C. He authored about 50 papers on the topic of Social Semantic Web technologies and co-authored a book on the topic (Springer) as well as organizing workshops and giving tutorials in venues such as WWW, ISWC, ESWC and SemTech. He is a member of IEEE and ACM. More at http://apassant.net

Dr. Giovanni Tummarello has a Laurea and a Ph.D Degree in Electronic Engineering, with background in audio signal processing and computational intelligence. His works include Semantic technologies with special interest on Multimedia Semantics. He has created several libraries related to the MPEG-7 and participated in the Multimedia Semantic W3C Incubator Group. Dr. Tummarello is currently leader of the Data Intensive Infrastructures within NUIG-DERI where he coordinates several projects dealing with general-purpose Semantic Web technologies. He is the creator of the Sindice linked data search engine (http://www.sindice.com), the Semantic Web search engine and mash-up platform http://sig.ma and the Semantic Web Pipes project. He is the creator and Chair of the Semantic Web applications and perspective conference Series (Ancona 2004, Trento 2005, Pisa 2006, Bari 2007). He was co-chair of the Identification Identities and Identifiers workshop at the World Wide Web conference (2007) and European Semantic Web conference (2008).

For the important network objective of integrating people, DERI/NUIG will coordinate activity 2.4 (Summer Schools and Social Networking), based on experience in the organization of summer schools in previous years (summer schools co-organized by DERI for the ReasoningWeb and Nepomuk EU-funded projects). In the context of WP6, DERI expertise in semantic search and semantic mash-ups will be central in activity 6.5.1 (Semantic Mash-up Platform). The longstanding involvement of DERI in standardization efforts, in particular in the context of W3C, will make DERI well positioned to lead efforts in activity 7.3 (Standards and Technology Transfer). On research efforts, DERI will coordinate WP4 on Semantic Information Fusion, which will involve all partners. In this work package DERI will coordinate activity 4.1 (Structuring information using common semantics), bringing in expertise on knowledge representation using lightweight semantics, and activity 4.4 (Respecting privacy and ensuring trust), on which topic DERI recently co-organised the first international workshop (“Trust and Privacy on the Social and Semantic Web” at the European Semantic Web Conference 2009). A further main research contribution by DERI will be in the context of WP3 on activity 3.3 (Distributed intelligent sensing), which will leverage its expertise in semantic text analysis.

University of Economics, Prague (UEP)

UEP (http://www.vse.cz) is the sixth largest Czech university, founded in 1953. It consists of six faculties with about 19 thousand students. The Department of Information and Knowledge Engineering (DIKE) within the Faculty of Informatics and Statistics was founded in 1990. The Knowledge Engineering Group within DIKE is recognised for its research and educational activities in knowledge discovery from databases, web/text/multimedia mining, web engineering and knowledge-based systems. It participated as a funded partner in eight EU projects: in the multimedia area (6FP NoE K-Space), in the KDD area (5FP projects Sol-Eu-Net and MiningMart and COST project TARSKI), in medical informatics (4FP project MGT and DG SANCO project MedIEQ), in the TEL area (6FP IP KP-Lab), and in the digital libraries area (eContent M-CAST). The group was involved in multiple EU network projects such as PetaMedia, Knowledge Web, Ontoweb, KDnet, or EUNITE, and its members participate in several W3C working groups. The group also hosted top-class international conferences such as ECML (1997), PKDD (1999), EKAW (2006) and ISMIS (2009).

UEP, as leader of WP2 (Integration) and in particular of A2.5 (Pan-European Integration), will benefit from its existing network of contacts, including those in the new EU member states. The scientific expertise of UEP with respect to SERENDIPITI is concentrated in WP4 and WP5, in particular in association discovery from large datasets including urban ones (for A5.1 and A5.2), web mining and information extraction (for A4.2), as well as in ontological engineering (for A5.5) and the semantic web (for A4.1 and A4.3). Association discovery will draw on experience gained in the national project LISp-Miner (supported by two CSF grants) and in the EU projects Sol-Eu-Net, MiningMart and TARSKI. A mature tool for association mining, LISp-Miner, is available for exploring the associations in the SERENDIPITI resources. The tool is highly optimised for performance on large datasets, and it is going to be grid-enabled in the near future. Web mining and information extraction were thoroughly studied in the EU MedIEQ project and in the national (CSF-funded) project Rainbow. In the EU K-Space project, in turn, UEP was the leader of the task devoted to mining complementary resources to multimedia, and contributed to the development of the Core Ontology for Multimedia (COMM) and of the distributed RDF querying architecture.

Dr. Vojtěch Svátek obtained the PhD in Informatics from the UEP in 1998 and became Associate Professor in 2007. His main research domains are data/text/web/multimedia mining and ontological engineering. Local contact person in EU-funded projects K-Space and MedIEQ, co-ordinator of two grants of the Czech Science Foundation. Program Co-Chair of the EKAW 2006 conference, PC member of about ten other relevant conferences (ECML/PKDD, ISWC, ESWC, ASWC, ISMIS, SAMT etc.); co-organiser of workshops held at ECML/PKDD, ISWC, ESWC and MIE. Member of the W3C OWL Working Group.

Dr. Jan Rauch obtained the PhD in Mathematical Logic from the Mathematical Institute of the Czechoslovak Academy of Sciences in 1987, and became Associate Professor in 1999. His interest is in various forms of association discovery, in the logical foundations of KDD, and in the links between KDD and document engineering. Among other things, he has carried out KDD projects on urban social climate and on vehicle monitoring. Local coordinator in EU-funded projects Sol-Eu-Net and TARSKI. Coordinator of several national projects, currently of a five-year (2008-2012) CSF project “Application of knowledge engineering methods in knowledge discovery from databases”, and contact person in bilateral projects (with USA, France, Slovenia and Finland). Conference Chair of ISMIS-09, Member of the Steering Committee of PKDD, Member of the Advisory Committee of ECML and PKDD 2002, Program Co-Chair of PKDD-99, Workshops Chair of PKDD 2000, Tutorials Chair of ECML/PKDD 2007.

Dr. Milan Šimůnek obtained the PhD in Informatics from the UEP in 2004. He has been the lead developer in numerous large software projects in the fields related to SERENDIPITI such as KDD (the LISp-Miner toolbox), air pollution and energy consumption prediction, and 4D multimedia simulation of urban settings (the Praha4D project).

Queen Mary, University of London (QMUL)

Queen Mary, University of London (QMUL) is one of the UK's leading research-focused higher education institutions. Amongst the largest of the colleges of the University of London, QMUL's 3,000 staff deliver world class research across a wide range of subjects in Science and Engineering, Medicine and Dentistry, and Humanities. QMUL is ranked among the top 11 research universities in the UK according to the 2008 Research Assessment Exercise (RAE) and fourth amongst University of London multi-faculty colleges. With a budget of £260 million per annum and a yearly impact on the UK economy of some £600 million, Queen Mary has made a strategic commitment to the highest quality of research.

QMUL hosts one of the UK’s leading research groups in multimedia signal processing, computer vision, security and intelligent systems. The Multimedia and Vision group (MMV) enjoys a distinguished reputation for innovation, receiving direct funding from overseas organisations such as Nokia, Philips, Nortel, the Department of Defence and the EU. The MMV group has 36 members, including four members of academic staff. The group has participated in and coordinated many EU funded projects including RACE MAVT; ACTS MOMUSYS, PANORAMA and Custom TV; Esprit UNITE; Basic research DRUMS; IST SAMBITS, IMPACT, MARINER, SHUFFLE, CRUMPET, EDEN, SADEGUARD, SCHEMA, BUSMAN and several others. It coordinated the IST NoE K-Space and the COST292 Action. It was one of the main contributors to and a steering member of the FP6 IST Integrated Projects aceMedia and MESH, co-ordinator of the STReP EASAIER and a partner in RUSHES. In FP7 ICT, the group is a partner in Papyrus and APIDIS, and a core partner of the NoE PetaMedia. In the last 3 years alone, the group has published over 80 journal papers, most of them in the IEE and IEEE Transactions in the field, and over 300 refereed conference papers, and has secured over £4 million in grant funding from various sources.

The scientific expertise of QMUL in large scale data analysis will help it to lead WP3. QMUL will take the lead in several activities that are closely related to its current research themes, in particular data sensing and analysis (A3.1 and A3.2), real-time semantic event detection and analysis (A3.3 and A3.4), scalable coding and streaming of multimedia (A3.5), and multimodal data synchronization (A3.6). In addition, QMUL will use its experience from NoE K-Space and NoE PetaMedia to lay out the plan towards the sustainable VCE (A2.1) and virtual laboratory (A6.1). QMUL will also lead the dissemination activity A7.2 to produce joint publications based on concrete joint research efforts in the SERENDIPITI platform.

Prof. Ebroul Izquierdo holds the Chair of Multimedia and Vision and is head of the MMV group. He is a Chartered Engineer, a Fellow of the Institution of Engineering and Technology (IET), chairman of the Visual Information Engineering professional network of the IET, a senior member of the IEEE, and a member of the British Machine Vision Association. He is an associate editor of the IEEE Transactions on Circuits and Systems for Video Technology and has been guest editor of numerous journals. Prof. Izquierdo coordinated the European project BUSMAN. He also coordinated and chaired the steering committee of the European research network COST292, involving 38 institutions world-wide, and of K-Space, the network of excellence on semantic inference for automatic annotation and retrieval of multimedia content, involving 14 key European research institutions and industrial players. He is also a member of the steering committee of the Networked Electronic Media platform NEM. Prof. Izquierdo has published over 300 technical papers and chapters.

Dr. Alan Pearmain is a senior lecturer in the Department. He joined the department in 1979 after previously working at University College, Dublin and Brookhaven National Laboratory, New York. He has worked in several RACE and ACTS projects (RACE R-1083: PARASOL, RACE R-2083: MAVT - Mobile Audio Visual Terminal, ACTS AC-098 MoMuSys - Mobile Audio-Video Systems) and on RACE AC-360 CustomTV. He was workgroup leader for the hardware development workgroup in MoMuSys and is currently working in the IST project SAMBITS. He has published many papers and sections of books. Recent publications have been on multimedia systems and VLSI design tools. He has BSc (Eng) and PhD degrees from Southampton University, England.

2.3 CONSORTIUM AS A WHOLE (JOEMON TO DO THIS)

2.3.1 Synergies and track record on successful previous cooperation

- need to do 1-2 pages why we make up a good, complementary team and on who has worked together with who, and when, and what for, citing examples

Describe how the participants collectively constitute a consortium capable of achieving the network's objectives, and how they are suited and are committed to the tasks assigned to them. Demonstrate that the participants have made a mutual commitment towards a deep and durable integration continuing beyond the period of Community financial support (for example, by attaching letters of commitment from the executive bodies of the organisations).

2.3.2 Sub-contracting

Sub-contracting is not planned in this network.

2.3.3 New Contractors

It is not planned to add any new contractors during the course of this Network of Excellence. However a main goal of SERENDIPITI is to achieve Pan-European integration, thus researchers from new affiliated partners will be included in the NoE activities, though funded through the founding partners.

2.3.4 Other Countries

None of the SERENDIPITI contractors are from non-EU countries.

2.4 RESOURCES TO BE COMMITTED

2.4.1 Mobilization of Critical Mass of Human Resources This will be a section of text which will highlight the leveraging that EU funding will bring to the existing funding which partners bring to SERENDIPITI – full of graphs, pie charts and a huge table of the number of SMs allocated to each activity, both funded and unfunded, for each partner. Ebroul says we can then take this table away from the proposal document when it turns into a “description of work” document when we’re funded ;-) Anyway, can’t do this until detailed budget is done.

2.4.2 Partner Contributions

Each of the partners in SERENDIPITI brings a range of complementary skills, resources, and already-funded activities to this Network of Excellence which, taken together, present a compelling case. In this section we outline the equipment and infrastructural resources and indicate the other funding sources which we bring to the proposal.

Dublin City University

The DCU CLARITY centre has access to a computational cluster consisting of 56 nodes, each with two Intel Xeon E5430 Quad-Core processors. These are rated at 2.66 GHz, with 2x6 MB cache and 8 GB RAM each. This gives us 448 cores in total. Storage is via external RAID, consisting of DELL MD3000 disks which give a total of 12 TBytes. DCU also has a second cluster consisting of 12 machines, each with two Intel Xeon E5420 Quad-Core processors rated at 2.5 GHz, with 2x6 MB cache and 16 GBytes RAM each, giving 96 cores in total. This is accompanied by 18 TBytes of RAID storage. Finally, if the need arises, through the Irish Centre for High End Computing, DCU and NUIG can access a 479-node cluster with 958 cores in total and appropriate amounts of main memory and RAID storage. It is anticipated that this equipment would be used in SERENDIPITI to run machine learning as part of real-time large-scale data analysis in WP3.

CLARITY itself is a Centre for Science Engineering and Technology (CSET) funded by Science Foundation Ireland which aims at creating, and managing, information for the sensor web. It is this funding which will be leveraged for SERENDIPITI activities. CLARITY employs almost 50 researchers at Dublin City University including postdoctoral researchers, PhD students and administrative and support staff. CLARITY’s current funding is augmented by funding from other Irish funding agencies including Enterprise Ireland, the Marine Institute, the Environmental Protection Agency and the National Digital Research Centre, as well as from industry partners including Vodafone, Foster-Miller, Disney, Adidas, Amdocs, and others.

Glasgow University

The Department of Computing Science (DCS), Glasgow University brings in expertise through its Information Retrieval group and its Inference group. Together these groups consist of over 50 researchers, funded by a number of grants from industry, UK funding bodies and EU IST programmes. Both groups have large computational facilities, and the University also provides a number of GRID computing facilities. Recently, GLA purchased 52 quad core Opteron computational servers for research purposes. In addition, GLA has a back-up server with 16 TB of storage.

Specifically, GLA will bring in the following funded projects through its IR and Inference research groups: Foundations research in information retrieval inspired by quantum theory (EPSRC, UK); Towards context-sensitive information retrieval based on quantum theory: with applications to cross-media search and structured document access (EPSRC, UK); Puppy-IR (EU-IST); Cross Disciplinary Account - Computational Statistics and Cognitive Neuroscience (EPSRC, UK); Mathematical & Statistical Modelling of Cytokine Receptor Cross-Regulation by Cyclic AMP (BHF, UK); LSLR: Large Scale Logical Retrieval (Matrixware, Austria).

University of Amsterdam

The ISLA group has its own internal cluster of 200+ cores and 100+ TB storage. In addition it has access to, and frequently uses, clusters provided by the national computing center Sara (3976 CPUs, “infinite” storage) as well as the DAS-3 cluster (350+ cores, 25+ TB storage). Together with Sara, ISLA is currently conducting cloud computing experiments. This equipment will be used in SERENDIPITI for the WP5 activities.

ISLA brings together world-class research in search and analytics for textual, image, video and sensory data. ISLA employs around 65 research staff, with approximately 85% of the required funding coming from non-university sources: the Netherlands Organization for Scientific Research (NWO), the Technology Foundation (STW), FP6 and FP7 projects, large-scale public-private projects (MultimediaN), and a broad range of projects with governmental organizations (intelligence, security, political and others) and commercial partners including a large number of SMEs as well as Philips, Elsevier, Unilever and others.

NUI Galway

Knowledge storage and retrieval in the context of the SERENDIPITI platform for semantic mash-ups will be implemented on the WebStar infrastructure available at DERI/NUIG, which provides a large interconnected cluster of machines for web science experiments, i.e. a cluster of 500+ cores, 1 TB RAM and 400 TB disk space. Based on expertise gained at DERI/NUIG with Sindice, the Semantic Web index (partially funded by the EU projects OKKAM and Romulus, as well as by Science Foundation Ireland) and sig.ma, a real-time mash-up of Semantic Web data, WebStar will be used to deploy a semantic mash-up platform for building applications that will use semantically gathered, extracted and integrated data. DERI/NUIG will ensure a 24/7 quality of service for WebStar thanks to a clustered infrastructure, providing scalable solutions.

The Digital Enterprise Research Institute is a Centre for Science Engineering and Technology (CSET) funded by Science Foundation Ireland which aims to be the excellence centre for Web Science, with a special focus on the Semantic Web and Semantic Reality. It is this funding which will be leveraged for SERENDIPITI activities. DERI employs about 120 persons at the National University of Ireland, Galway including administrative and support staff, M.Sc. and Ph.D. students as well as postdoctoral and senior researchers. In addition to the CSET, DERI’s funding is augmented from various EU projects in FP6 and FP7, national funding agencies such as Enterprise Ireland as well as funding from DERI’s industrial partners including Cisco, Nortel, Ericsson and others.

University of Economics, Prague

UEP is equipped with about 400 dual core machines primarily serving educational purposes. Some of them can be integrated into a computational cluster; experiments are already under way regarding the use of a smaller cluster for distributed data mining. If necessary, the grid infrastructure of the CESNET association, consisting of about 1400 CPUs (both MIPS-based SMP machines and clusters), could also be exploited.

The human resources available at UEP for SERENDIPITI include both senior and junior researchers at two closely interlinked academic units: the Department of Information and Knowledge Engineering and the Laboratory for Intelligent Systems, altogether comprising about 30 research and administrative staff. The majority of them have extensive experience from EU projects such as K-Space, MedIEQ, PetaMedia, KP-Lab, Sol-Eu-Net, MiningMart or M-CAST, as well as national projects such as LISp-Miner or Rainbow. Part of the group’s budget also comes from the private sector, e.g. from Seznam.cz, the most widely used Czech search portal. Finally, the most direct co-funding to SERENDIPITI will be drawn from the long-term research project of the Ministry of Education, MSM 6138439910 (2007-2013), focusing on knowledge discovery methods, with an annual budget of around 400 kEUR.

Queen Mary, University of London

The QMUL MMV group has full access to a computational cluster consisting of 200 dual core (>2 GHz) nodes with 4 GB RAM. As QMUL leads WP3 on “real-time large-scale data analysis”, it is anticipated that this equipment would be used in SERENDIPITI to run machine learning processes.

The MMV group enjoys a distinguished reputation for innovation, receiving direct funding from overseas organisations such as Nokia, Philips, Nortel, the Department of Defence and the EU. The group is funded to develop event detection, scalable coding and streaming of multimedia, web mining and information extraction, and multimedia information retrieval. This expertise and the techniques developed will be integrated in the SERENDIPITI activities. The MMV group has 36 members, including four members of academic staff. The group has participated in and coordinated many EU funded projects including RACE MAVT; ACTS MOMUSYS, PANORAMA and Custom TV; Esprit UNITE; Basic research DRUMS; IST SAMBITS, IMPACT, MARINER, SHUFFLE, CRUMPET, EDEN, SADEGUARD, SCHEMA, BUSMAN and several others. It coordinated the IST NoE K-Space and the COST292 Action. It was one of the main contributors to and a steering member of the FP6 IST Integrated Projects aceMedia and MESH, co-ordinator of the STReP EASAIER and a partner in RUSHES. In FP7 ICT, the group is a partner in Papyrus and APIDIS, and a core partner of the NoE PetaMedia.

2.4.3 EC Funding Management Short half-page on how funding will be managed by coordinator.

SECTION 3. IMPACT

3.1 EXPECTED IMPACT AS LISTED IN THE WORK PROGRAMME

The NoE SERENDIPITI is expected to greatly improve integration of fragmented research in the field of large-scale information management and boost related technological developments. As a consequence, the NoE will have a major impact on R&D in the new field of semantic urban computing. It will also have a significant impact on European services for information access, management and planning, and will enhance social cohesion by leveraging important aspects of user interaction with other citizens and planning authorities. At the technical level, the SERENDIPITI vision ties together several aspects of information management, exploiting the complementarity and integration of physical world data, knowledge extraction, online learning and geo-spatial information. It will offer a platform and services to predict events (for city planning) based on real-time data and semantic analysis.

More specifically, and in the context of objective ICT-2009.4.3: Intelligent Information Management, we describe below the expected impact of the NoE SERENDIPITI for each item listed in the corresponding work programme.

• Better leveraging of human skills, improved quality and quantity of output and reduced time and cost allowing users to concentrate on more creative and innovative activities.

SERENDIPITI integrates research activities of a large number of researchers across different fields of work with focus on large-scale data analysis for semantic information mining and events prediction. SERENDIPITI aims at providing a more holistic view of city events by combining physical world sensor data streams with their online counterparts and linking with traditional media channels. Beyond this, a more comprehensive overview and analysis of events both in real-time and retrospectively for enhanced prediction is addressed. SERENDIPITI’s holistic approach is vital in uncovering inaccessible information about similar or related events in an automated way enabling users to concentrate on more creative and innovative activities. By aggregating partial observations from multiple sources, SERENDIPITI will support the organization of city events and prevent accidents based on understanding of the issues that arose in previous events. Thus, it will enable improved quality and quantity of output with reduced time and cost.

• Increased ability to identify and respond appropriately to evolving conditions (e.g. in finance, epidemiology, environmental crises …) faster and more effectively.

As stated before, the SERENDIPITI vision of aggregating partial observations of an event in the context of large-scale semantic urban computing brings many technical challenges. A huge amount of sensory data and complementary resources available is not actually usable and is deemed to be lost due to many reasons including:

• The amount of available sensory and semantic information is too large, and the data has to be viewed in linear time. There are some approaches to define summaries, but they currently are very limited.
• Multimedia content is not truly scalable, searchable, and streamed in real-time.

By addressing these two aspects, the SERENDIPITI project will help to speed up the creation of a future generation of technology and services, in which large scale data from online resources and the physical environment can be analysed in real time. It will also enable the creation of effective summaries from vast amounts of data. The analysed output will be semantically enriched, followed by the fusion of the extracted information from sensory data and online resources in order to identify events and respond appropriately to evolving conditions. The obtained knowledge will be combined with inferencing based on heterogeneous information provided by diverse sources, along with the ability to predict the occurrence of events (e.g. city planning, crisis, accidents) in real time, to keep the citizen informed faster and more effectively.

• Reinforced ability to collaboratively evolve large-scale, multi-dimensional models from the integration of independently developed datasets.

SERENDIPITI will provide access to heterogeneous large-scale information mined from various archives of both physical and online information. This vision naturally leads to the SERENDIPITI proposition of large-scale urban semantic computing that supports real-time aggregated analysis of multidimensional data. All access and analysis mechanisms will be provided over the SERENDIPITI system, which will allow stable incremental inclusion of a variety of content sources, including multi-format and multi-media elements. Thus, the SERENDIPITI approach fully addresses this specific “expected impact from the work programme”.

• Higher levels of information portability and reuse by creating ecology of systems and services that are dynamic, interoperable, trustworthy and accountable by design.

SERENDIPITI's provision of long term access to digital materials enables cross-disciplinary collaboration. International scientific collaborations will benefit from the portability and reuse of data repositories (e.g. city maps). An increasing number of investigations and inferences depend on re-using data collected from earlier predictions and observations. Thus analyzing the data in new ways and in conjunction with other sources of information will help to provide services that are dynamic and interoperable.

The main economic benefits depend on the re-use of stored digital information that is impossible to reproduce or far too costly to regenerate. SERENDIPITI will provide effective and affordable digital strategies and tools which will underpin the move from an industrial to a knowledge economy. SERENDIPITI will support social cohesion by enabling trustworthy and accountable citizen participation through blogs or social networking. SERENDIPITI will also investigate the full potential of content protection and address solutions for semantic urban communities that are trustworthy and accountable by design.

• Increased EU competitiveness in the global knowledge economy by fostering standards-based integration and exploitation of information resources and services across domains and organisational boundaries.

SERENDIPITI has the potential to impact global standards and European IPRs. The technology addressed in SERENDIPITI will influence several sectors of the multimedia and data mining industry and related markets, specifically multimedia, Internet service providers, security services and city event planners. As such, the range of standards that can be influenced by SERENDIPITI contributions is broad. The integrative nature of the SERENDIPITI research concept allows support for innovation and uptake of revolutionary ideas. It will also help the creation of additional European IPRs and coherent research roadmaps. Indeed, SERENDIPITI, with its core of academic institutions, is in the best position to support and contribute to the development of coherent research roadmaps across domains and organisational boundaries.

• Strengthened EU leadership at every step of the computer-aided information and knowledge management lifecycle, creating the conditions for the rapid deployment of innovative products and applications based on high quality content.

As argued before, the successful completion of SERENDIPITI is important for city event planning, journalism and reporting, police and security applications. However, the technical challenges are significant. Therefore, the careful combination of expertise and research institutes in the SERENDIPITI consortium provides an effective way to face these challenges. Indeed, this NoE is unique in its aim to push new paradigms related to real-time large scale semantic urban computing in a real world application. This effort yields a radically different perspective on the required offline urban computing for event prediction leading to novel promising applications and thus helping to guarantee EU leadership in the field.

The SERENDIPITI project will also help to strengthen European competitiveness at several steps of the computer-aided information and knowledge management lifecycle. It will also help to create the conditions for the rapid deployment of innovative products. In an era of worldwide technological competition, the European IT industry needs to make an effort to increase European technological capacities. Research results can be directly acquired by advanced technological markets (such as Japan and the USA), and developed nations are starting to subcontract and transfer technology development (and therefore knowledge discovery) to countries with lower costs such as India and China. The SERENDIPITI project will help to reinforce European competitiveness in the innovative field of the addressed thematic priority of large-scale semantic urban computing systems. It will bring together partners in multimedia services and infrastructure with key research partners in semantic-based knowledge and multimodal synchronisation of physical sensor and online resources, to integrate complementary competence and to reach the critical mass needed to achieve technological success and market competitiveness. Through the positioning of the NoE partners at the very source of radical changes in semantic urban computing, the envisioned Virtual Centre of Excellence will become a place from which numerous scientific activities will start in completely new directions. With its large mass stemming from the combined EU and international research partners, and with extensive activities towards changing the research agendas (see our Spreading Excellence plan, Section 3.2), the Virtual Centre of Excellence in semantic urban computing is very likely to gain the momentum to become a key player in the world, fostering new developments in the field of semantic urban computing.

3.2 SPREADING EXCELLENCE, EXPLOITING RESULTS, DISSEMINATING KNOWLEDGE

3.2.1 Talent Boosting

Talent Boosting in SERENDIPITI is mainly conducted in WP2 on “Integration of People and Organisations”. In order to promote talent boosting, the core pillar of SERENDIPITI will be a Virtual Centre of Excellence (activity A2.1, coordinated by QMUL) in the area of semantic urban computing that will ensure the persistence of the achieved accumulation of people and know-how. It will integrate new partners in its governing body by establishing a research council and an associate membership program for academics. The network will also promote a Research Mobility Program through SERENDIPITI Fellowships (A2.2, coordinated by ERCIM), offering 12-18 month positions in which research fellows will be hosted by two member institutes for a 6-9 month period each, with an additional short visit to at least one more institute in the network. The NoE will also concentrate on Cooperation in Education, Teaching Materials and PhD Formation (A2.3, coordinated by GLA) through shared teaching material (lecture notes, video lectures, etc.), invited talks in the various SERENDIPITI partner institutions, as well as joint supervision of Ph.D. students by at least two members of the consortium. To boost the talent of young researchers and students, including those from other institutions, the NoE will organize a series of summer schools on Semantic Urban Computing (A2.4, co-coordinated by NUIG). In order to let young researchers attending the summer school stay in touch on a long-term basis (ensuring joint-project opportunities), the NoE will promote the use of online social networking services in addition to the usual summer school activities. Finally, SERENDIPITI will also concentrate on the pan-European integration of young researchers, in particular from new EU member states (A2.5, coordinated by UEP).

3.2.2 Dissemination Activities

Dissemination activities in SERENDIPITI are spread between WP6 on “Infrastructure Integration and Sharing” and WP7 on “Outreach and Spreading Excellence”.

In the context of WP6, the emphasis on dissemination will be in activities A6.1 on the development of a “Virtual Laboratory”, A6.2 on “Evaluation Methodology” and A6.4 on “Case Studies”. The SERENDIPITI Virtual Laboratory will be made available to researchers outside the consortium so that they can experiment with semantic integration of data from various sources and develop real-time, computationally intensive applications in semantic urban computing. In connection with this, an evaluation methodology for benchmarking SERENDIPITI research activities will be set up. It will draw in researchers from outside the consortium through a series of user-centred SERENDIPITI challenges (see also A7.5), involving the development of procedures for data collection, the specification of experiments and measures for comparison. Finally, the set of SERENDIPITI case studies will ensure dissemination of research activities to a wider range of stakeholders in the further development of semantic urban computing, i.e. the general public and press (A6.4.2), city councils (A6.4.1) and local authorities such as the metropolitan police (A6.4.3).

Building on the activities of WP6, we will implement core dissemination activities in the context of WP7, involving a broad range of stakeholders for whom outreach from the SERENDIPITI NoE is of interest, including: i) academic and industry researchers with an interest in the challenges of real-time analysis and semantic integration of sensor and online web data; ii) public data providers such as city councils, meteorological offices, the Environmental Protection Agency, etc.; iii) application areas such as urban planning, events organization, journalism, security, etc.; iv) application developers in academia and industry with interests in novel applications that leverage real-time analysis and semantic integration of sensor and online web data; v) standardization groups such as the W3C Semantic Sensor Network Incubator Group and others; vi) related research and application projects such as the Kno.e.sis Citizen Sensing initiative, the MIT SENSEable City Laboratory, the US projects ‘This We Know’ and ‘Common Sense’, the EU-funded projects ‘WeKnowIt’, OKKAM and LarKC, and national projects such as the Irish-funded project ‘SmartBay’, all mentioned earlier.

In order to properly organize the planned SERENDIPITI dissemination activities in all of the directions mentioned above, a comprehensive dissemination work plan with targets and milestones will be developed early in the project (M3), to be revised twice during the lifetime of the network (M12, M24).

Dissemination activities in WP7 will be in A7.1 on the development of a “Web portal & Promotional materials”, in A7.2 on “Joint Publications”, in A7.3 on “Standards and Technology Transfer”, in A7.4 on “Societal Impacts” and in A7.5 on development of the “SERENDIPITI Platform Challenge”.

The web portal will be used for information exchange within and beyond the consortium, for coordination of the mobility programme, and for access to the application platform, data sets and use case applications. Promotional materials will include hard-copy brochures, a quarterly electronic newsletter and the use of Web 2.0 services such as Flickr, Twitter, Facebook and others to bring the project to a wider audience. With regard to scientific dissemination, the emphasis will be on producing joint publications that span consortium partners and underscore the collaborative and multi-disciplinary nature of the activity plan. Additionally, the NoE will promote wider scientific publication in the specific area of semantic urban computing by sponsoring special sessions at conferences and special issues of journals. Technological development in semantic urban computing will be supported by the active involvement of consortium partners in standardization activities, e.g. on semantic sensor networks, whereas technology transfer will be ensured by identifying potential industry connections, by relevant IP protection and exploitation, and by active participation in EU-related activities on the Future Internet.

From a very different perspective, the NoE will also address the potential societal impacts of future applications in semantic urban computing, e.g. by examining levels of trust in these applications and the ethical and legal issues in sensor analysis and integration. Finally, to help promote the NoE activities we will implement the SERENDIPITI Platform Challenge, an open call to researchers and developers inside and outside the consortium to develop novel applications using the SERENDIPITI platform and associated data sets. The challenge will run during the final year of the network, when the SERENDIPITI platform is most mature and stable, and will be embedded as part of the annual CLEF benchmarking activity. Input from the Industrial Advisory Board will be sought in the judging process and a (small) prize will be awarded at the event sponsored by the NoE.

3.2.3 Exploitation by SERENDIPITI Partners

[Draft note: 2-3 pages in total, to be completed by Alan. For this last part we need 1/3 page from each partner on how they plan to exploit results and on what structures are available in their institution to support commercialisation activities. These have been sent to Alan by email, to be integrated here.]

SECTION 4. ETHICAL ISSUES

SERENDIPITI partners are fully aware of the threats and vulnerabilities which face our privacy, our unique identity, the trust we place in systems, and the security issues they raise, as well as the role that we as individuals play in this rapidly-changing world. More and more we find that technological developments either impinge on these fundamental rights or challenge the boundaries of what is right and what is wrong, what is legal and what is not.

The SERENDIPITI Network of Excellence will have its integration activities based around the emerging area of semantic urban computing. Although we do not challenge the privacy, identity, trust or security of individual people, we aggregate sensed information which is ultimately derived from the activities of individuals. We therefore need to consider these aspects when it comes to crowds or groups, and we need to ensure that we follow best practice in protecting the rights of individuals.

The SERENDIPITI consortium explicitly addresses the many and varied issues raised by these topics through the inclusion in the consortium of Prof Bert Gordjin from Dublin City University, who will lead activity A7.4 on “Societal Impact”. Prof Gordjin holds the Chair of Ethics at DCU, where he also leads the newly-formed Ethics Institute. He is a Clinical Ethicist with a proven track record in studies of the impact on people of new technology, especially technology based around ambient assisted living. His recent work has covered the development of structured methodologies for analysing world scenarios in which ambient intelligence and pervasive computing abound; from these scenarios his methodologies can be used to propose safeguards that address threats and vulnerabilities. In a way this corresponds to a SWOT analysis (strengths, weaknesses, opportunities and threats) of a world in which rich sensing of our real physical world, as well as of our virtual online worlds, exists. This is the space in which SERENDIPITI will operate.

Prof Gordjin will lead the specialist activity A7.4, in which all partners will have some input and role. His experience on the Scientific Advisory Board of the European Patent Office, the External Science Advisory Panel of the European Chemical Industry Council, and the UNESCO expert committee on ethics and nanotechnology means that he holds an industry-aware view on issues of privacy, identity, trust and accessibility. Particularly relevant to his role in SERENDIPITI is the role he played in the project SWAMI (Safeguards in a World of Ambient Intelligence), funded under the EC’s Sixth Framework Programme (FP6), which identified and analysed the social, economic, legal, technological and ethical issues related to identity, privacy and security in the forecasted but not then deployed Ambient Intelligence (AmI) environment. This project recently led to a book published by Springer in 2009, of which Prof. Gordjin was the series editor. Also relevant to our treatment of societal issues in SERENDIPITI is the ISTAG vision of an AmI future, published in 2003 (see footnote 11), which is “sunny” according to the SWAMI project conclusions. Even ISTAG, however, warned that enabling technologies can facilitate monitoring, surveillance, data searches and data mining, which should make us aware of issues of privacy, trust, identity, etc.

So while there has been much discussion among experts, and recommendations of caution concerning how the work we propose in SERENDIPITI might infringe on the rights of individuals, few people have actually done anything to address this. The insights that this previous work brings to the area of research in which SERENDIPITI will operate, combined with the expertise of Bert Gordjin, mean that within SERENDIPITI we can now explicitly address ethical concerns (in terms of the research, its conduct and its outcomes) in actual practice. Below we outline how ethical issues raised by the work in SERENDIPITI will be handled.

11 IST Advisory Group (ISTAG), Ambient Intelligence: From Vision to Reality, Office for Official Publications of the European Communities, Luxembourg, 2003

Privacy

Information stored in digital format, even if anonymous, is subject to privacy regulations which may vary from one jurisdiction to another, but whose underlying principles are the same. These principles require us to respect the right of individuals not to have their privacy violated, and to provide means whereby people whose information has been included in some system can access that information in order to check its accuracy. Privacy invasion has different facets, including identity theft, the “little brother” phenomenon, data laundering, disclosure of personal data, surveillance, and risks from profiling resulting from personalization. All of these will be catered for within the operation of SERENDIPITI.

Because SERENDIPITI targets the harvesting of publicly available information for the purposes of aggregation and semantic enrichment, there may sometimes be issues of ownership of, and rights of access to, the original sensed data. In the majority of cases the information we will harvest will not have such constraints imposed, but many public bodies do place conditions on access: for example, meteorological offices which provide weather information, city councils which provide information on vehicular and pedestrian traffic, and news sources which provide information on events. To address this we have contacted several organisations as examples, to gauge their reaction to the SERENDIPITI proposal and to the idea of re-purposing and aggregating their information for public use. These include the Meteorological Office in Ireland, Dublin City Council, the Irish Marine Institute, the Irish-based bulletin board “boards.ie”, and others. In all cases the organisations were supportive of the SERENDIPITI proposal and have written letters of support, which we include as Annex 3. We stress that this is just a sample and a non-exhaustive list of possible sensor sources for SERENDIPITI, but we include these letters to illustrate the support we have from data providers. A more thorough list of target information sources on Dublin, Ireland, is included as Annex 5 to this proposal, where we list each information source and its URL. Once again we point out that this list is just a starting point and is based on a brief, cursory information-gathering exercise.

Usage-based research, which involves users, whether users we have recruited explicitly or the general public, requires us to keep user data so that we can learn user preferences. Each of the partners in SERENDIPITI already has well-established guidelines within its institution on how such usage data is to be anonymised, kept secure, and made subject to restricted access. These include informing users of the existence of the stored data and of their right to inspect what is stored about them, and guaranteeing that such data will not be transmitted to third parties or used for other purposes. SERENDIPITI will follow European legal regulations regarding privacy at all times. This is at policy level and will be monitored and reinforced by the SERENDIPITI Administration Office in ERCIM, and its legal department.

Identity

It is both desirable and undesirable to maintain the identity of individuals, as we are developing a suite of applications which aggregate individuals into crowds and then sense crowd or community behaviour from both the real and online worlds. The desirable aspects of maintaining identity include the fact that people can always relate more easily to a new technology once they can personalise it or attune it to their individual tastes, which allows them to configure and arrange things in a way that is unique to them. We see this in how we configure our desktops (both our physical office desks and our on-screen computer desktops), how we parameterise devices, how we use ‘skins’ on different applications, etc. In a SERENDIPITI setting we can and will do this, but we have a unique scenario in which the identity of the individual should be anonymised and preserved, while at the same time the information sensed from the physical and online worlds will have been contributed partly by those individuals whose anonymity among crowd behaviour is what we seek to preserve. This creates a quandary: how to balance individual identity against identity or belonging within a crowd or community. A similar but far less complicated scenario exists in rating products for an online catalogue, where we want the individual not to stand out or dominate but to lose their individuality and behave as part of the crowd, while at the same time we want the individual to feel unique, wanted and cherished. How to balance the need for homogenisation of behaviour with the need for uniqueness and identity is a problem that has arisen elsewhere in the development of modern technology, albeit on a lesser scale and with less importance. The techniques behind personalization balance issues of identity and uniqueness with crowd wisdom, and do so successfully by finding the “sweet spot” that works for users. In SERENDIPITI we face similar but more complex versions of the same problem, compounded by the fact that our sensed information comes from both the real and virtual/online worlds, and that our applications need to operate in real time.

Issues of identity and ethics have various components, including information related to legal identity, identification, authentication of users, and user preferences. All of these have some bearing on the work within SERENDIPITI, and will be addressed to some level in activity A7.4 on societal impacts.
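As an illustration only of the balance discussed above between individual anonymity and crowd-level sensing, the following minimal Python sketch pseudonymises individual identifiers and releases only crowd counts that reach a minimum size. It is not a description of the SERENDIPITI platform: the observation format, the names `pseudonymise` and `crowd_counts`, the secret key and the threshold of 5 are all assumptions made for this example.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret used to pseudonymise identifiers (an assumption for this
# sketch); in practice it would be stored separately from the sensed data.
SECRET_KEY = b"replace-with-a-real-secret"

# Hypothetical minimum crowd size below which an aggregate is withheld.
MIN_CROWD_SIZE = 5


def pseudonymise(user_id: str) -> str:
    """Replace a raw identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def crowd_counts(observations, min_size=MIN_CROWD_SIZE):
    """Count distinct pseudonymised individuals per (zone, hour) cell.

    Cells with fewer than `min_size` distinct individuals are dropped, so that
    only crowd-level information leaves this function.
    """
    cells = defaultdict(set)
    for user_id, zone, hour in observations:
        cells[(zone, hour)].add(pseudonymise(user_id))
    return {cell: len(people) for cell, people in cells.items() if len(people) >= min_size}


# Six distinct people in one cell (one of them observed twice), two in another.
observations = [(f"user{i}", "city-centre", 18) for i in range(6)]
observations.append(("user1", "city-centre", 18))
observations += [("user7", "docklands", 18), ("user8", "docklands", 18)]

print(crowd_counts(observations))  # {('city-centre', 18): 6}
```

The small "docklands" cell is suppressed because two people do not form a crowd in which an individual can remain anonymous. The choice of threshold is exactly the kind of “sweet spot” trade-off described above, and a keyed hash alone should be regarded as pseudonymisation rather than full anonymisation.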

Trust and Security

The notions of trust and security have two kinds of interpretation in the world of sensing. The first is a technical sense, which covers issues of trust in data transfer, accuracy of sensor readings, security in storage, encryption and so on. The second is a social, cultural and legal sense, which covers issues such as a user’s trust and confidence in a system, the mistrust which arises from experiencing a loss of control or a misunderstanding, and a general confidence that the information a user provides is not deliberately erroneous for malicious or other reasons. The technical interpretations are well covered by other projects and activities, but activity A7.4 will address how user perception of, belief in, and trust in systems can be earned and kept, and also how it can be lost.

Accessibility

The ethical and social issues that concern privacy, identity, trust and security can be extended to include the trustworthiness of any information generated for public consumption, as well as general overall accessibility to that information, which is why we include a specific thread of activity on accessibility within the SERENDIPITI network.

ETHICAL ISSUES TABLE

Issue | YES | Page

Informed consent
Does the proposal involve children? | |
Does the proposal involve patients or persons not able to give consent? | |
Does the proposal involve adult healthy volunteers? | |
Does the proposal involve Human Genetic Material? | |
Does the proposal involve Human biological samples? | |
Does the proposal involve Human data collection? | |

Research on Human embryo/foetus
Does the proposal involve Human Embryos? | |
Does the proposal involve Human Foetal Tissue / Cells? | |
Does the proposal involve Human Embryonic Stem Cells? | |

Privacy
Does the proposal involve processing of genetic information or personal data (e.g. health, sexual lifestyle, ethnicity, political opinion, religious or philosophical conviction)? | |
Does the proposal involve tracking the location or observation of people? | |

Research on Animals
Does the proposal involve research on animals? | |
Are those animals transgenic small laboratory animals? | |
Are those animals transgenic farm animals? | |
Are those animals cloned farm animals? | |
Are those animals non-human primates? | |

Research Involving Developing Countries
Use of local resources (genetic, animal, plant etc) | |
Benefit to local community (capacity building i.e. access to healthcare, education etc) | |

Dual Use
Research having direct military application | |
Research having the potential for terrorist abuse | |

ICT Implants
Does the proposal involve clinical trials of ICT implants? | |

I CONFIRM THAT NONE OF THE ABOVE ISSUES APPLY TO MY PROPOSAL | |

ANNEX 1: LETTERS OF COMMITMENT FROM EXECUTIVES OF THE 7 PARTNERS

* ERCIM

* DCU

* GLA

* UvA

* NUIG

* UEP

* QMUL

ANNEX 2: LETTERS OF SUPPORT AND COMMITMENT FROM THE INDUSTRIAL ADVISORY BOARD

# Paulo Villegas

# Ken Wood (version with signature to arrive by post)

# Gabor Proszeky

# Amit Sheth

ANNEX 3: LETTERS OF SUPPORT AND COMMITMENT FROM DATA PROVIDERS

# Marine Institute PDF to be used and text of the letter in DOC

# boards.ie (pdf)

# Meteorological Office (on the way)

# Environmental Protection Agency

# Dublin City Council PDF and DOC

# ESA

# Sound and Vision: PDF and DOC

# Prague City Hall PDF

ANNEX 4: LETTERS OF SUPPORT AND COMMITMENT FROM USERS GROUP

* Amsterdamse Innovatie Motor (AIM) (promised to Alan); phone call and letter follow-up by MdR on Oct 19

* KLPD (National Police): PDF

ANNEX 5: SAMPLE INFORMATION SOURCES FOR SERENDIPITI PLATFORM (DUBLIN)

This annex contains a cursory list of freely available information sources on the city of Dublin, which could form a starting point for the SERENDIPITI platform. The list does not yet cover vehicular traffic flow, web cameras for traffic/pedestrian analysis, mobile phone usage, event planning requests and many other sources. It is included purely as an indication of the range of material available.

Description | URL

Blog

Financial and economic blog by journalist | http://adammaguire.com/blog/
Blog from crime reporter in Ireland | http://irishcrimereporter.blogspot.com/
Blog of Irish economist and journalist | http://www.davidmcwilliams.ie/
Round up of articles from a range of Irish blogs | http://irishblogs.ie/
Blog looking at the cost of consumer goods | http://www.irishtimes.com/blogs/pricewatch/
Blog covering Irish art and entertainment | http://www.irishtimes.com/blogs/pursuedbyabear/
Blog about activities in Dublin | http://dublin.metblogs.com/
Dublin-based Music blog | http://guesslist.ie/
Dublin-based Music blog | http://www.nialler9.com/
Large blog site and bulletin board for Irish issues | http://www.boards.ie/

Environmental Data

Weather forecast for Dublin | http://www.askmoby.com/weather/get-forecast-pc
Index of odour issues from Dublin’s Wastewater Treatment Works | http://www.dublincity.ie/WaterWasteEnvironment/WasteWater/Ringsend%20Waste%20Water%20Treatment/Pages/RingsendWasteWaterTreatment.aspx
Satellite images providing sea surface temperature in Dublin bay | http://envisat.esa.int/instruments/aatsr/
Satellite images providing indicator of water quality in Dublin bay | http://envisat.esa.int/instruments/meris/
Water level data from the Tolka river | http://hydronet.epa.ie/stat_121088.htm?entryparakey=W
Swell height, direction and period, wind strength and direction and basic weather details for the Dublin Bay area | http://magicseaweed.com/Dublin-Area-Surf-Report/694/
All weather information relating to the area, including forecast, temperature, radiation, pressure, wind, sunshine, pollen count, soil temperature, humidity, rainfall | http://www.met.ie/
Rainfall radar information from Met Eireann and rainfall prediction | http://www.met.ie/latest/rainfall_radar.asp
Satellite images providing information on greenhouse gases | http://www.iup.uni-bremen.de/sciamachy/index.html
Temperature, water level and atmospheric pressure for Dublin | http://www.marine.ie/home/services/operational/oceanography/TideGauge.htm
Water quality information including Dublin city rivers | http://www.wfdireland.ie/
Meteorological and wave info from the M2 buoy in Dublin bay | http://www.marine.ie/home/publicationsdata/data/buoys/
Weather information and predictions for the Dublin area | http://www.windguru.com/int/index.php?sc=4862
Energy consumption information for Dublin-based universities | http://www.e3.ie/index.php
Energy consumption for Dublin Universities | http://energy32.ucd.ie/live_data_dcu.shtml
Realtime information on air quality | http://www.epa.ie/whatwedo/monitoring/air/data/
Energy usage data | http://www.cso.ie/px/SEI/DATABASE/SEI/sei.asp

News / Media

Regular news updates and announcements of events | http://www.newstalk.ie
Breaking Irish news stories | http://www.breakingnews.ie/
Website of the Irish Times newspaper | http://www.irishtimes.com/
Website of the Irish Independent newspaper | http://www.independent.ie/
News Website of national broadcaster | http://www.rte.ie/news/index.html

General Online Information

Social Networking site with information on activities and events in Dublin | www.facebook.com
Online photo repository with 277,621 geotagged photos of Dublin | www.flickr.com
Maps with geotagged photos, videos, wikipedia and web camera information | http://maps.google.com/
Lottery totals and winner information | http://www.lotto.ie/Prizes-and-Results/
Microblogging website offering opinions and links in small messages | www.twitter.com
House loans paid and approved by quarter, commencement notices, etc. | http://www.cso.ie/px/Doehlg/Database/Doehlg/Doehlg.asp
Trending and detailed information on houses and buildings for rent and sale | http://www.daft.ie / http://www.myhome.ie
Information source for crime events | http://www.garda.ie/
Irish Stockmarket prices | http://www.ise.ie/index.asp?docID=-1&locID=5
List of part-time jobs for young people | http://www.nixers.ie
Upcoming events which may affect traffic | http://www.aaroadwatch.ie/dublin/
Info on all events including music, clubs, theatres, comedy, dance and ballet, talks, conferences and exhibitions | http://entertainment.ie/going-out/
News, reviews and reports of various music and arts events | http://www.hotpress.com
Information on all events including music, clubs, theatres, comedy, readings and art | http://www.irishtimes.com/theticket/

Transport and Crowd Movement

Regular traffic updates from Automobile Association | http://www.aaroadwatch.ie/dublin/
Regularly updated traffic camera images from across the city | http://www.dublincity.ie/dublintraffic/
Number of spaces available in car parks in Dublin | http://www.dublincity.ie/dublintraffic/carparks.htm
Announcements of possible disruptions due to road works | http://www.dublincity.ie/ROADSANDTRAFFIC/SCHEDULEDDISRUPTIONS/Pages/ScheduledDisruptionsHome.aspx
Number of available rental bikes/bike spaces | http://dublinbikes.mobi/
Bus timetables and real time updates | http://www.dublinbus.ie/
Traffic updates between 7-10am and 4-7pm daily | http://dublincityfm.ie/
Daily list of traffic accidents | http://www.garda.ie/Controller.aspx?Page=138
Fixed charge traffic offences | http://www.garda.ie/Controller.aspx?Page=138
Latest news and alerts regarding trains and light rail services | http://www.irishrail.ie/news_centre/travel_alerts.asp
Timetable, bus connections and travel alerts for the Dublin light rail train system | http://www.luas.ie
Information on ferry sailings and traffic conditions | http://www.stenaline.ie/ferry/latest-sailing-information/dublin-port-holyhead
Web cameras of Dublin on the Internet, e.g. O'Connell Street Dublin | http://images.ireland.com/webcam/liveview.jpg
Airline prices | http://www.skyscanner.ie/
Real time details of flights to/from Dublin airport | www.dublinairport.com/

Scheduled Events including Sports

BBC TV and Radio Programmes, available as Linked Data | http://www.bbc.co.uk/programmes
A structured extraction of Wikipedia information, available as Linked Data | http://dbpedia.org
A social bookmarking service interlinked with DBPedia, available as Linked Data | http://faviki.com
A large-scale database of upcoming events (worldwide) | http://upcoming.yahoo.com/
Free guides (worldwide), user-driven, available as Linked Data |
Website of major ticket seller for upcoming events | http://www.ticketmaster.ie
Details of upcoming music events from major music promoter | http://www.mcd.ie
Details of upcoming music events from major music promoter | http://www.aikenpromotions.com/
Details of Gaelic Athletic Association sporting events | http://www.gaa.ie
Details of Soccer sporting events | http://www.fai.ie
Dublin-based Music blog | http://guesslist.ie/
Dublin-based Music blog | http://www.nialler9.com/
Calendar for Irish schools | http://www.schooldays.ie/articles/school-calendar
