The Opportunities and Challenges of Open Data

Professor Sir @Nigel_Shadbolt FREng

CILIP 2016, Brighton

scarcity to abundance…

powers of 10

Kryder’s Law

the data spectrum

open data: nothing new…

the power of open…

Open Licences
Open Source
Open Standards
Open Data
Open Participation

open data and insurance…
open data and spending
open data and crime
open data and transport
open data and the environment
open data and cultural heritage
open data and research

UK Open Government Data

http://theodi.org

Use evidence for investment
Share knowledge
Create ideas together
Communicate impact
Build demonstrator

http://prescribinganalytics.com/

http://opencorporates.com/viz/financial/index.html#

open data benefits…

Political
• Transparency
• Accountability
• Participation

Economic
• Efficiency
• Innovation
• Growth

Social
• Inclusion
• Poverty
• Diversity

Research Media Data

• Dissemination
• Data Journalism
• Engagement
• Innovation
• Improvement
• Data Literacy
• Acquisition

linked data

The four micro principles of the Semantic Web

1. All entities of interest, such as information resources, real-world objects, and vocabulary terms, should be identified by URIs.
2. URI references should be dereferenceable, meaning that an application can look up a URI over the HTTP protocol and retrieve RDF data about the identified resource.
3. Data should be provided using the RDF/XML syntax.
4. Data should be interlinked with other data.

data on the web - the LOD

Linked 5★ Gov Data – towards a National Data Infrastructure
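The second principle above (dereferenceable URIs) can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not a description of any particular publisher's service: the helper names `build_rdf_request` and `dereference`, the `Accept` header preference order, and the `example.org` identifier are all hypothetical. It uses HTTP content negotiation to ask for RDF instead of an HTML page.

```python
import urllib.request
from typing import Tuple

# Media types for RDF serialisations; RDF/XML is listed first because the
# third principle names it. Turtle as a fallback is an assumption.
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.9"


def build_rdf_request(uri: str) -> urllib.request.Request:
    """Build an HTTP GET request that asks the server for RDF rather than HTML."""
    return urllib.request.Request(uri, headers={"Accept": RDF_ACCEPT})


def dereference(uri: str, timeout: float = 10.0) -> Tuple[str, bytes]:
    """Look up a URI over HTTP and return (content type, raw response body).

    Whether RDF actually comes back depends on how the publisher's
    server is configured; this function does not parse the result.
    """
    with urllib.request.urlopen(build_rdf_request(uri), timeout=timeout) as resp:
        return resp.headers.get("Content-Type", ""), resp.read()


if __name__ == "__main__":
    # Hypothetical identifier for illustration only; not a real resource.
    example_uri = "http://example.org/id/bus-stop/123"
    req = build_rdf_request(example_uri)
    # Show what would be sent, without touching the network.
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `dereference` on a real linked-data URI would perform the lookup; the returned content type indicates whether the server honoured the request for RDF.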

•With data.gov.uk a national digital infrastructure is being built

•URIs for schools, roads, bus stops, post codes, properties, companies, ...

•Some of the data links across and connects other data together

•Key data link points exist

Open Linked Data and Local Government - http://opendatacommunities.org/

Open Linked Data and Property - http://landregistry.data.gov.uk/

Open Linked Data and Companies - http://www.companieshouse.gov.uk/toolsToHelp/dataProducts.shtml

open data infrastructures: local, national, international

Energy Contracts Spending Consumption Tariffs Suppliers Providers

Administrative Geography Properties Timetables Fares

Performance Companies

“data hugging” excuses…

• We know the data is wrong
• We know the data is wrong, and people will tell us where it is wrong
• We know the data is wrong, and we will waste valuable resources inputting the corrections people send us
• People will draw superficial conclusions from the data without understanding the wider picture
• People will construct league tables from it
• It will generate more Freedom of Information requests
• It might be combined with other data to identify individuals/sensitive information
• It will cost too much to put it into a standard format
• Our IT suppliers will charge us a fortune to do an ad hoc extract

“data hugging” excuses…

• All these sometimes have some truth in them
• Often they rationalise the official fear of the unknown
• Data owners need to be helped through
• Example & precedent are your friends
• Data curators and data scientists on the up…

challenges: boundaries…

challenges: computation and storage…

• There will be a very great deal of data…
• Who bears the cost of supporting it?
• Can we really treat the Web as a large decentralised ?
• Dotsam and netsam?

challenges: curation…

– Interoperability
– Endurance
– Formats
– Proprietary
– Migration

challenges: quality…

• A typical example
• NaPTAN (National Public Transport Access Nodes)
• Includes 360,000 bus stops
• Around 18,000 errors

challenges: coverage…

• A typical example
• NaPTAN (National Public Transport Access Nodes)
• Includes 360,000 bus stops
• Around 18,000 errors
• Which can be improved by crowdsourcing

challenges: data literacy…

• Can’t trust people with the data
• They might interpret it incorrectly
• Do they have the skills?
• New levels of data literacy

opportunities for you

• Everyone depends on data/information/knowledge…
• Data literacy will be a new essential skill…
• Your knowledge and insights, methods and techniques are more important, more central, more required than ever…
  – Indexing, organisation, context, sampling, anonymisation…
• You are essential to making data capture, publication, analysis and interpretation business as normal…
• Everyone is a data/information professional now…

an open world…

“by making the demand for openness a paramount issue, quite new possibilities would be created, which, if purposefully followed up, might bring humanity a long way forward towards …co-operation on the progress of civilization”