Open Data Licensing: More Than Meets the Eye
Mashael Khayyat
Trinity College Dublin & King Abdulaziz University, Jeddah

Frank Bannister
Trinity College Dublin

Abstract

In discussions of open government data (hereafter simply open data or OGD), the question of how such data should be licensed, or whether they need to be licensed at all, has to date received only limited attention, at least in the academic literature. A common assumption, at least in the public sphere, is that a large proportion of the data collected and held by governments can and should be released free of any constraints or restrictions for all citizens, communities and organizations to access and use as they wish. However, even for data that do not fall within the ambit of personal privacy or the security of the state, and that are not otherwise sensitive, it is far from obvious that this should be so; different forms of formal licensing may be appropriate in some cases and necessary in others. A libertarian, free-for-all approach to open government data is just one of a number of licensing options from which governments can choose.

This paper will explore the various dimensions of open data licensing. Starting from a definition of what a licence is, it will first look at the debate(s) that have surrounded licensing in the worlds of open systems, freeware, shareware and open source.
It will then examine and critique a number of existing or proposed open data licences, including various international and national licensing frameworks. The Creative Commons and Open Database (ODbL) licences will be critically examined, and possible problems with the concepts underlying various licences will be explored. The question of what may be suitable for standard public licences and what may require bespoke or customised licensing will be analysed. Other questions to be investigated include policing and conformance, as well as the implications of modern analytics and the mashing up of large data sets from different sources.

1. Introduction

The modest, but growing, body of research into the barriers to the release of data collected and held by governments consistently includes a discussion of legal issues. Several obstacles fall under this general heading, including existing legislative requirements such as data protection acts, intellectual property rights, risks of consequential harm, commercial sensitivities, concerns about modern data analysis technology and individual privacy rights (Barry & Bannister 2014; Janssen 2012; Bertot et al 2010). Some scholars and theorists argue that, in such a legally complex situation, a well-designed licensing regime is not only necessary but critical to the success of open data initiatives. Creating a legal environment in which citizens, communities and corporates can use such data with clarity and confidence about their rights and obligations is essential if societies are to make the most of these data. The absence of such an environment is likely to hinder creativity and the economic, social and political benefits that are widely expected to ensue (Korn and Oppenheim 2011). According to Korn and Oppenheim, an understanding of open data licensing is important for establishing “which and how” data can be re-used.
It is important to understand not only the legal issues that may arise in the context of licensing open data, but also the different types of licences that are available and the implications that they carry with them. This paper examines open government data licensing and explores a number of its dimensions. Its objectives are to highlight the complexities surrounding this topic, to examine some of these and to critique the current approach to open government data (OGD) licensing.

This paper is organised as follows. Section two looks at the background to open licensing, starting with a brief review of the Open Source movement and the approach to open licensing of software. Section three examines current OGD/open data (OD) licences and different approaches to licensing. Section four is a critique of the concept of OGD and includes some reflections on how the question of OGD licensing might evolve. Section five is a brief conclusion and contains some recommendations for future research.

2.0 Background

2.1 The Open Software Movement and Copyleft

In the world of information and communications technology (ICT) the term ‘licence’ is traditionally associated with software, though licensing is also used for other types of intellectual property such as methodologies. Within the Information Systems (IS) literature and community there has been, and continues to be, much discussion about the relative merits and demerits of open software and open source. Over several decades, the Open Source movement has proposed or developed a number of business models which are designed to offer users various forms of freedom to modify software and pass it on, partially encumbered or unencumbered, to others. The foundational principle of the movement is that software should be free in the sense of free of restrictions on use and modification rather than free of charge (which is a separate issue).
Unsurprisingly, some complicated legal problems can arise once one starts exploring the question of software licences in any depth.

Non-proprietary software comes in a number of flavours. One key distinction is whether or not the source code is available. Freeware is the term generally applied to applications which anybody can use for free and without a licence, but which the user cannot modify or sell on to a third party. Many PC games and utilities, for example, fall into this category, as broadly do some smartphone apps. Another variation is shareware, where the user needs a licence or permit and there may or may not be a charge for use.

More complicated problems arise when source code is made available. This means that the user can modify the code but, while the original source code may be free, a user may feel entitled to charge for his modifications. Thus a developer may take some open source code, modify it and charge for the enhanced product. The latter may not matter provided he supplies the source code of his modification to other users to do what they want with it without further conditions or cost, even though this will limit his ability to make money from his enhancement. Developers have sometimes tried to circumvent this problem by embedding proprietary code in, or attaching proprietary add-ons to, open source code. Others have tried to ‘claim jump’ and hijack the free source code. This problem has led to what is called the Open Source Definition, which sets out a number of criteria for open source software, namely:

- There must be free redistribution: no royalties or fees;
- Distribution must include the source code;
- Derived works are allowed: modifications must be permitted;
- The integrity of the author's source code must be maintained;
- There can be no discrimination against persons or groups;
- There can be no discrimination against fields of endeavour;
- Distribution of licence: one licence covers all;
- A licence must not be specific (tied) to a product;
- A licence must not restrict other software;
- A licence must be technology-neutral.

(Open Source Initiative 2014).

Not all of these apply, or can be adapted to apply, to data, but some can. An attempt to create a similar set of principles for open data is discussed in section four.

An obvious question about open source is this: in such a world, how does a software developer make a living? Various attempts have been made to address this problem and, as a result, there are currently over 100 free and/or open source licences available, including one from the EU, namely the European Union Public Licence (Joinup 2014; Schm…). The most important and influential of these licences is probably the General Public Licence (GPL), which incorporates the concept of copyleft. Under the GPL, a developer who modifies open source code cannot impose any conditions on a user’s use or further modification of the modified product, although he can charge for the modifications that he has made (Free Software Foundation 2014).

A full discussion of this is beyond the scope of this paper. This summary is presented because several of the problems and issues in open data have parallels in open source software, although some issues that arise with data are not a problem with software and vice versa. Nonetheless, given that the open source movement has been around for several decades, there are likely to be useful lessons which can be drawn from the accumulated knowledge in this field. As will be seen, the principle of copyleft has been adapted and applied to data in the Creative Commons Licence.

2.2 The Legislative Context

There are a number of critical laws surrounding the data that governments use and the way that governments are allowed to use such data.
Even without considering other factors (of which there are many; see below), existing legislation has multiple implications when it comes to licensing both software and data. Central to any discussion of data licensing are two types of act: data protection acts and freedom of information (FoI) acts, though other legislation and quasi-legislation (such as privacy rights and official secrets acts) also bear on licence design.