CORD TASK FORCE, JUNE 2007

Document 126

Guidelines on Writing and Publishing Software as Open Source

1. Introduction

Open source software is software whose author (the 'licensor') grants a number of fundamental freedoms to the user (the 'licensee') through a license agreement. These freedoms include the ability to study how the program works, to adapt the code to specific needs, to improve the program, to run it for any purpose on any number of machines, and to redistribute copies to other users¹. Note that merely allowing access to the source code of a software system does not make it open source.

The open source software development model differs from the closed source, or proprietary, model. Differences include the way the software is bundled or packaged and the roles played by participants. Both models nevertheless share common issues such as security and quality.

The best-known attempt to define an open source process informally is Eric Raymond’s paper ‘The Cathedral & the Bazaar’ (Raymond, 2000), which describes some common principles underlying the process. The best-known principles are ‘release early and release often’ and ‘given enough eyeballs, all bugs are shallow’. These two principles indicate the power of open source: (a) rapid evolution, so that many users and programmers get the opportunity to use the new system and modify it; and (b) many programmers working on the same problem at the same time, which increases both the probability that it is solved and the quality of the solution.

The following paragraphs explore aspects of the open source software development process from the perspective of the developer. Open source combines two important properties: source code that is accessible to all, and the right to freely use, modify, and redistribute the software. Both properties affect the software products and the development process, in positive as well as negative ways.

1 http://ec.europa.eu/idabc/en/chapter/468

2. How to develop and publish OSS

The elements of a software engineering process are generally the following:

• Identify the requirements, and the ‘requirers’.
• Design a solution that meets the requirements.
• Modularize the design and plan the implementation.
• Build it; test it; deliver it; support it.

No element of this process should begin before the earlier ones are substantially complete, and whenever a change is made to one element, all dependent elements should be reviewed.

An Open Source project should include all of the above elements. The following paragraphs present a set of guidelines for developing and publishing open source software. The guidelines are derived from material published on web sites dealing with the Open Source process, and they try to present in plain form the software engineering process, from its very beginning until the software is published and reviewed. The steps that should make up an Open Source Software development and publishing process are:

1. Start a new project: The hardest part of launching an open source software project is transforming a private vision into a public one. Before starting an open source project, a search should be conducted to see whether an existing project already meets the project requirements, or comes so close that it makes more sense to join that project and add functionality than to start from scratch.

2. Choose a Good Name: The name of the project should give some idea of what the project does, or at least be related to it in an obvious way, and should be easy to remember. Care should be taken to avoid a name that is the same as another project's name or that infringes any trademark. Finally, if possible, the project name should be available as a domain name in the .com, .net, and .org top-level domains.

3. Create a website for publishing the software: The web site is primarily a centralized, one-way conduit of information from the project out to the public.
The website may also serve as an administrative interface for other project tools, such as software documentation and software downloads, so that users can visit the website and find all software information and releases.

4. Have a Clear Mission Statement: Present potential users with a short description of the project.

5. State That the Project is Free: When the software is published, the front page of the project website must make it unambiguously clear that it is open source software.

6. Software Requirements Specification: A requirements analysis helps identify the problems the software system should address and the form solutions might take. The requirements specification therefore provides an initial mapping of problems to system-based solutions. In an open source process it should be specified what the software should and should not do, and who will take responsibility for contributing new or modified system functionality.

7. Features List and Software Requirements List: A brief list of the features the software supports should be presented to potential users. If something is not completed yet, it can still be listed, with a "planned" or "in progress" flag next to it. The computing environment required to run the software should also be specified.

8. Development Status: People always want to know how a project is doing. For new projects, they want to know the gap between the project's promise and current reality. For mature projects, they want to know how actively the project is maintained, how often it puts out new releases, how responsive it is likely to be to bug reports, and so on.

9. Downloads and software distributions: Open source software may be obtained in different forms, depending on the point in the chain of development and distribution at which one accesses it. Different individuals need these different forms. Developers need the software as source code. End users may only wish to obtain and deploy a compiled version of the source code for their platform (known as a binary). The important thing to note is that the open source licence applies to the software regardless of the form in which it is obtained. The software should be downloadable as source code in standard formats. When a project is first getting started, binary (executable) packages are not necessary, unless the software has such complicated build requirements or dependencies that it is not easy for all users to get it to run. The following should be made available to users, either online on the website or through other means:
   a. Software distributions, meaning sets of compatible software packages, or applications, that have been gathered together to form a complete usable system. A distribution is usually delivered on an installation CD-ROM or DVD.
   b. Software releases, i.e. numbered (and possibly named) versions of packages taken from the version control system and tested for deployment. Released versions can be expected to work reliably. The release number gives an indication of the stage the developers believe they have reached. Since open source projects tend to create numerous incremental releases, it is usually safe to wait until major changes have been added, often marked by an even-numbered release. The exception is a release triggered by a security update.
   c. Software packages. These are self-contained software units produced by projects.
Often a project will have several packages: one containing software libraries, another containing programs, and others containing, for example, fonts, sample files and graphics. The most successful packages are tightly focused on performing a narrow task very well. This allows them to be used and built upon easily by other packages. Such small and well-defined packages are also much easier to test after changes have been made.

10. Version Control and Bug Tracker Access: A version control system is an automated system that tracks who makes changes to the software, when, and why. It enables developers to manage code changes conveniently, including reverting and "change porting", and allows everyone to watch what is happening to the code. It maintains the current version of the software, which is then packaged into releases. Downloading source packages is fine for those who just want to install and use the software, but it is not enough for those who want to debug it or add new features. The presence of anonymously accessible version-controlled sources is a sign, to both users and developers, that the project is making an effort to give people what they need to participate.

Software version control tools such as the Concurrent Versions System (CVS) are widely used in OSS development. Tools such as CVS serve both as a centralized mechanism for coordinating FOSS development and as a means of controlling which software enhancements, extensions, or upgrades are checked in to the archive. Once checked in, these updates become available to the community as part of the alpha, beta, candidate, or official released versions, as well as the daily-build release of the software. Software version control requires coordination, but it allows dispersed and otherwise largely invisible development to be stabilized and synchronized. This coordination is necessary because decentralized code contributors and reviewers might independently contribute software updates or reviews that overlap, conflict, or generate unwanted side effects.

A bug tracking system, in turn, enables developers to keep track of what they are working on, coordinate with each other, and plan releases. It enables everyone to query the status of bugs and to record information (e.g., reproduction recipes) about particular bugs. It can be used for tracking not only bugs but also tasks, releases, new features, etc. If version control or a bug tracker cannot be offered for the project right away, a notice should be posted saying that it will be set up soon.

11. Communications Channels: The project should offer communication channels such as mailing lists, which are usually the most active communications forum in a project and its "medium of record", and real-time chat, a place for quick, lightweight discussions and question-and-answer exchanges. Visitors usually want to know how to reach the human beings involved with the project. Therefore, the addresses of mailing lists, online discussion forums, chat rooms, IRC channels, and any other forums where those involved with the software can be reached should be made accessible to the public. It should be made clear that all authors of the project are subscribed to these mailing lists, so that people see there is a way to give feedback that will reach the developers.

12. Developer Guidelines: Anyone considering contributing to the project will look for developer guidelines. Developer guidelines are not so much technical as social: they explain how the developers interact with each other and with the users, and ultimately how things get done. The basic elements of developer guidelines are:
   a. pointers to online discussion forums or mailing lists for interaction with other developers
   b. instructions on how to report bugs and submit patches
   c. some indication of how development is usually done

13. Documentation: Many open source projects face serious challenges in generating and maintaining high-quality end-user documentation.
Documentation is essential. There needs to be something for people to read, even if it is rudimentary and incomplete. The most important documentation for initial users covers the basics: how to set up the software quickly, an overview of how it works, and perhaps some guides to common tasks. Once written, the documentation should be tried out on typical new users to test its quality. A simple, easy-to-edit format should be used, such as HTML, plain text, Texinfo, or some variant of XML: something that is convenient for lightweight, quick improvements on the spur of the moment. This removes overhead that might keep the original writers from making incremental improvements, and it also helps those who join the project later and want to work on the documentation. One way to ensure that basic initial documentation gets done is to limit its scope in advance. That way, writing it at least will not feel like an open-ended task. A good rule of thumb is that it should meet the following minimal criteria:
   a. Tell the reader clearly how much technical expertise they are expected to have.
   b. Describe clearly and thoroughly how to set up the software and, somewhere near the beginning of the documentation, tell the user how to run some sort of diagnostic test or simple command to confirm that they have set things up correctly. Startup documentation is in some ways more important than actual usage documentation.
   c. Give one tutorial-style example of how to do a common task, i.e. pick one task and walk through it thoroughly. Once someone sees that the software can be used for one thing, they will start to explore on their own what else it can do.
   d. Label the areas where the documentation is known to be incomplete.

This last point is of wider importance and applies to the entire project, not just the documentation. An accurate accounting of known deficiencies is the norm in the open source world.

14. Availability of documentation: Documentation should be available from two places: online (directly from the web site), and in the downloadable distribution of the software. It needs to be online, in browsable form, because people often read documentation before downloading software for the first time, as a way of helping them decide whether to download at all. But it should also accompany the software, on the principle that downloading should supply (i.e., make locally accessible) everything one needs to use the package. Making mailing lists, chat logs, bug reports and as much other project information as possible accessible to search engines and bookmarking systems is another way of spreading information about a project, and it allows users to find already-answered questions easily.

15. Developer documentation: Developer documentation is written to help programmers understand the code, so that they can repair and extend it. This is somewhat different from the developer guidelines discussed earlier, which are more social than technical. Developer guidelines tell programmers how to get along with each other; developer documentation tells them how to get along with the code itself. The two are often packaged together in one document for convenience, but they do not have to be.

16. Software quality: An important requirement for open source code is that it should be ‘modular, self-contained and self explanatory’, to allow development at remote sites and to facilitate future maintenance of the open product. The software should therefore be built on smart data structures, and the source code should be readable and self-descriptive, with adequate explanations and comments.
All functions and processes should be clearly specified, along with their inputs and outputs. They should be logically independent and highly cohesive. Furthermore, open source code should be testable, to allow rapid evolution, and simple enough to allow frequent modifications and extensions. In conclusion, to achieve high-quality software, all software components should be evaluated (throughout the implementation and before each release) against four basic criteria: testability, simplicity, readability and self-descriptiveness. These criteria are taken from the international standard on software quality characteristics published by ISO (the International Organization for Standardization), which is considered an important milestone in the development of software quality measurement.

17. Example Output and Screenshots: If the project involves a graphical user interface, or if it produces graphical or otherwise distinctive output, some samples can be put up on the project web site (inarguable proof that the software works). Screenshots, how-to guides, and perhaps a list of frequently asked questions will help convey system-use scenarios. Software bug reports could also appear in newsgroup messages, on bug-reporting Web pages, or in bug databases describing what is not working as expected.

18. Choosing a License and Applying It: OSS is software distributed under a license that conforms to the Open Source Definition. The best known of these licenses are the GNU General Public License (GPL) and the Berkeley Software Distribution (BSD) license. Several other license agreements have also been published for use with open source software. All of these licenses have some common features, most notably making software free to users, both in the sense of having no cost and in the sense of minimizing restrictions on use and redistribution. If possible, one of the existing open source licenses should be used, or modified to meet the project's needs; some licenses work better than others for particular business models. Possible license choices include:
   a. No license at all (that is, releasing the software into the public domain)
   b. The BSD (Berkeley Software Distribution) License, which places relatively few constraints on what a developer can do (including creating proprietary versions of open source products)
   c. The GNU General Public License (GPL) and variants, which attempt to prevent developers from “hoarding” code, that is, making changes to open source products and keeping those changes proprietary, for commercial or other reasons, rather than contributing them back to the developer community
   d. The Artistic License, which modifies several of the more controversial aspects of the GPL
   e. The Mozilla Public License (MozPL) and variants (including the Netscape Public License or NPL), which go further than the BSD-like licenses in discouraging software hoarding, but allow developers to create proprietary add-ons if they wish.

On the other hand, the EUPL (European Union Public Licence) could be used for software developed for the European Community. This licence has been drawn up in the framework of IDABC, a European Community programme managed by the European Commission's Enterprise and Industry Directorate General, which aims to promote the Interoperable Delivery of European eGovernment Services to public Administrations, Business and Citizens. Several software tools have been developed within the IDABC programme or its predecessor, IDA. Some tools developed under IDA or IDABC are used by public administrations outside the European Institutions, under a licence granted by the European Commission, which is the institution acting on behalf of the European Communities when the work is under copyright of the Communities.
For some time, interest has been growing in publishing software source code under a licence that does not limit access to, or modification of, that source code. The original EUPL was created for such software, in line with IDABC objectives and following a preliminary legal study to assess its conformity with European law. The Licence is written in general terms and could therefore be used for derivative works, for other works and by other licensors. The purpose of this Licence is to reinforce legal interoperability in the pooling of public sector software, by adopting a common framework rather than multiple specific or national licences. This licence could be extended in order to conform to the Open Source Definition.

Once a license is chosen, it should be stated on the project's front page. There is no need to include the actual text of the license there; the name of the license, linked to the full license text on another page, is enough. This tells the public what license the software will be released under, but it is not sufficient for legal purposes. For that, the software itself must contain the license. The standard way to do this is to put the full license text in a file called COPYING (or LICENSE), and then to put a short notice at the top of each source file, naming the copyright date, holder, and license, and saying where to find the full text of the license.
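As an illustration of the short per-file notice described in step 18, the following Python sketch builds such a notice and prepends it to source files that do not yet carry it. It is a minimal sketch under stated assumptions, not a prescribed tool: the copyright year, holder and license name are placeholder values, the "#" comment prefix and the "*.py" file pattern assume Python sources, and the function names build_notice, add_notice and apply_to_tree are invented for this example. Only the file name COPYING is taken from the guideline itself.

```python
from pathlib import Path

# Hypothetical values for illustration only; substitute the project's own.
COPYRIGHT_YEAR = "2007"
COPYRIGHT_HOLDER = "Example Project Authors"
LICENSE_NAME = "GNU General Public License, version 2"
LICENSE_FILE = "COPYING"  # file name suggested by the guideline


def build_notice(comment_prefix: str = "#") -> str:
    """Return the short notice: copyright date, holder, license, pointer."""
    lines = [
        f"Copyright (C) {COPYRIGHT_YEAR} {COPYRIGHT_HOLDER}",
        f"Licensed under the {LICENSE_NAME}.",
        f"See the file {LICENSE_FILE} in the top-level directory for the full text.",
    ]
    return "".join(f"{comment_prefix} {line}\n" for line in lines)


def add_notice(source: str, notice: str) -> str:
    """Prepend the notice unless the source already contains it.

    A leading "#!" line stays first so that scripts remain executable.
    """
    if notice in source:
        return source
    if source.startswith("#!"):
        shebang, _, rest = source.partition("\n")
        return f"{shebang}\n{notice}\n{rest}"
    return f"{notice}\n{source}"


def apply_to_tree(root: Path, pattern: str = "*.py") -> list[Path]:
    """Add the notice to every matching file under root; return changed files."""
    notice = build_notice()
    changed = []
    for path in sorted(root.rglob(pattern)):
        original = path.read_text(encoding="utf-8")
        updated = add_notice(original, notice)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed
```

Calling apply_to_tree(Path(".")) from the project root would update the files in place. Running it a second time changes nothing, because add_notice leaves files that already contain the notice untouched. A project would normally review the resulting changes in its version control system before committing them.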

3. Turning existing software into OSS

Converting existing software into OSS presents a different challenge from creating OSS from the start. This is mainly because existing software often comes with established maintenance and support procedures, which will need to be re-evaluated and possibly modified. As the discussion in chapter 2 above shows, the main aspect of OSS, apart from its license, is free and complete documentation of everything the software depends upon: installation instructions, code description, design, etc. This creates a significant overhead for existing software with poor documentation. In addition, this documentation needs to be dynamic, so that contributions become available to other potential developers as soon as possible. The dynamic nature of developers’ documentation calls for tools (most often web-based) that make it easy to collaborate on bug reports, documentation updates, requests for planned enhancements, etc. Technologies such as Bugzilla and wikis are commonly used.

The following list of actions mirrors the list presented in the previous chapter, commenting only on issues specific to the conversion of existing software into OSS.

1. Start a new project: The conversion of an existing software system to OSS is indeed a project in its own right. As with new OSS, it is necessary first to understand and define why the target software needs to be provided as OSS.

2. Choose a Good Name: This step is optional; the decision depends on whether or not the OSS system is to be presented as a continuation of the old version.

3. Create a website: If a website already exists, it may need to be changed in order to comply with the guidelines above. Otherwise, a new website should be created.

4. Have a Clear Mission Statement: As presented in chapter 2.

5. State That the Project is Free: As presented in chapter 2.

6. Software Requirements Specification: The current software requirements should be revised in order to specify clearly what the software does and does not do, and who will take responsibility for contributing new or modified system functionality.

7. Features List and Software Requirements List: As presented in chapter 2.

8. Development Status: As presented in chapter 2.

9. Downloading and software distributions: The guidelines presented in chapter 2 concerning software distributions need to be observed. If the current software distribution procedures differ, a migration path needs to be defined, in particular if licensees already exist.

10. Version Control and Bug Tracker Access: The existing systems for version control and bug tracking, if any, should be opened up to the developers’ community. If licensing issues prohibit this, or if such systems simply do not exist, the use of open source tools for this functionality must be considered.

11. Communications Channels: As the Commission already uses CIRCA, the readily available communication channels provided by this groupware application should be considered.

12. Developer Guidelines: The developer guidelines presented in the previous chapter must also be provided for software that is converted to OSS. The documentation of the existing software may not include such guidelines at all; in that case, the documentation must be extended to cover them.

13. Documentation: The complete documentation set of the existing software should be examined and brought into compliance with the documentation requirements for OSS. The need for complete documentation arises from the fact that users of the software can no longer benefit from, for example, a training session, so the documentation is all the information they can get. As a user or developer support activity is usually not foreseen, a software user's only resort is what he or she can find in the documents on the project website.

14. Availability of documentation: As presented in chapter 2.

15. Developer documentation: As presented in chapter 2.

16. Software quality: The existing software may not meet the requirement for software quality. This does not block the conversion of the existing software into OSS, although it should be addressed with high priority, since the system architecture and source code now become publicly available.

17. Example Output and Screenshots: As the software already exists, this task is not expected to require much effort, since much of what is required is already available (user guides, examples) or easily obtainable (e.g., screenshots).

18. Choosing a License and Applying It: The OSS license to be used needs to be defined. This decision must take into account the terms of the existing license, especially if licensees already exist. A clear migration path should be provided for them.
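Before working through the license and documentation steps for an existing code base, it can help to take stock of what the source tree already contains. The Python sketch below shows one possible way to do this. It rests on several assumptions that are not part of the guidelines: apart from COPYING and LICENSE, the file names are illustrative; the "Copyright" marker is a hypothetical stand-in for whatever per-file notice the project adopts; and the function names are invented for this example.

```python
from pathlib import Path

# Assumed conventions: only COPYING and LICENSE are named in the guidelines;
# the documentation file names below are illustrative.
REQUIRED_ANY = {
    "license text": ("COPYING", "LICENSE"),
    "startup documentation": ("README", "README.txt", "INSTALL"),
}
NOTICE_MARKER = "Copyright"  # hypothetical marker for the per-file notice


def missing_top_level(file_names):
    """Return the checklist items for which no acceptable file is present."""
    present = set(file_names)
    return [item for item, names in REQUIRED_ANY.items()
            if not present.intersection(names)]


def lacks_notice(source_text, head_lines=10):
    """True if the marker does not appear in the first few lines of a file."""
    head = source_text.splitlines()[:head_lines]
    return not any(NOTICE_MARKER in line for line in head)


def audit(root, pattern="*.py"):
    """Collect findings for the source tree rooted at `root`."""
    root = Path(root)
    top = [p.name for p in root.iterdir() if p.is_file()]
    unlabelled = [
        str(p.relative_to(root))
        for p in sorted(root.rglob(pattern))
        if lacks_notice(p.read_text(encoding="utf-8", errors="replace"))
    ]
    return {"missing": missing_top_level(top),
            "files_without_notice": unlabelled}
```

The dictionary returned by audit(".") could serve as a starting point for the migration plan: an empty "missing" list and an empty "files_without_notice" list would suggest that the license has been applied consistently, although a check of this kind cannot replace a legal review of the existing license terms.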

4. References

[1] IDABC website: Open Source Observatory
[2] Producing Open Source Software: How to Run a Successful Free Software Project (Karl Fogel)
[3] The Cathedral and the Bazaar (Eric S. Raymond)
[4] IEE Proceedings - Software: Understanding the Requirements for Developing Open Source Software Systems (Walt Scacchi)
[5] OSS Watch (Open Source Software Advisory Service): Open Source Software Development
[6] Developing with open source software: Free and Open Source Development Practices in the Game Community (Walt Scacchi)
[7] O’Reilly: Understanding Open Source and Free Software Licensing (Andrew M. St. Laurent), Chapter 7: Software Development Using Open Source and Free Software Licenses
[8] O’Reilly Online Catalogue: Open Sources: Voices from the Open Source Revolution (Bruce Perens)
[9] Understanding Open Source Software: Understanding Open Source Software Development (Joseph Feller and Brian Fitzgerald), 2002, Addison Wesley
[10] IDABC website: Documentation on Open Source Software (OSS)
