The Héphaïstos Conference Proceedings

First International Conference on Open Source Collaborative Development Platforms / Première Conférence Internationale sur le Développement Collaboratif de Logiciels Libres
Paris, France, 16 and 17 November 2006

XDR Consulting

This is Version 1 of the proceedings.

Some papers went AWOL and we are tracking them down :-) We also have not yet had the time to include all of the round table notes.

But I think that a Version 1 now is better than a Version 2 delivered when the Call for Papers for the second edition is already out.

If you want a more recent version, please send an email to: [email protected]

This document contains some full papers and some abstracts.

You might want to also download the ODF presentations at http://www.ethiqa.com/hephaistos/presentations.html

Table of Contents

Introduction: Patrick Sinz (Ethiqa)
    The Héphaïstos Project
    The Héphaïstos Conference
From 0 to 300: The Adullact Forge at the age of 4: François Elie, ADULLACT
    What is the ADULLACT
    How to develop together
    So we need a forge
    Conclusion
PicoForge and Shibboleth: Managing Identities in a Forge Environment: Olivier Berger (INT Evry)
    Introduction
    Shibboleth and SSO Service
    Enhanced authentication in Picolibre
    Standard authentication scheme in phpGroupware
    Shibboleth+Apache as an integrator and SSO service
    Mixed environment, and legacy
    Implementation of the phpGroupware Shibboleth adapter
        Using Apache-based authentication in phpGroupware
        Mapping Apache session's REMOTE_USER to a phpGroupware account
        Additions to phpGroupware
    Configuration of phpGroupware's access protection in Apache
        "Full Apache" access control
        Semi Apache
    Final phpGroupware setup with Shibboleth for PicoLibre
    Conclusion
    Bibliography
    Annex
        Components added (package phpgwapi; new "apache" phpGroupware module; new table)
        Components modified (setup / configuration; phpGroupware modules)
        Configuration decisions (when using Shibboleth / mod_shib)
Keep it Simple, Editorial Control at simple.wikipedia.org: Matthijs den Besten (Oxford e-Research Centre)
    Background
    Preliminary Investigation
Towards Management of Enterprise Intranet Resources Using Open Source Technologies: Diana Gorea (IRCAD, University of Strasbourg, France)
    1. Introduction and Context
    2. The Architecture of the System
    3. Corporate Documentation and Tools
    4. Authentication System
    5. Documentation and Project Management
    6. Further Development
    7. References
Libresource, Learning from Customers: Stephan Bagnier, ARTENUM
Managing a Packaging Farm: Johan Euphrosine (mekensleep.com)
Managing the Testing Process in an Open Source Environment: Xavier Drozdzynski (XDR)
Mantis, , Gforge Comparison, Integration Models and the Future of Bug Tracking: Christopher Mann (Infopol)
Open Source Software Factory – Step by Step: A Case Report: Alan Kelon Oliveira de Moraes (Centro de Informática – UFPE, Brazil)
    1. Introduction
    2. Software Factories
    3. Open Source Software Development Model
    4. Step by Step
        Step 1: Define the Factory Business Model
        Step 2: Define the Factory Organization
        Step 3: Define a Lightweight Development Process
        Step 4: Enable the Work in a Geographically Distributed Way
        Step 5: Provide a Web Site for the Factory
        Step 6: Provide an Exclusive Web Site for Each New Project
        Step 7: Define Roles for Each New Software Project
        Step 8: Team Members Must Work in Harmony
    6. Conclusions
    7. Acknowledgements
    8. References
Towards a Software Licensing Guide for Open Source Business Models: Rafael de Albuquerque Ribeiro (Federal University of Pernambuco, Brazil)
    Introduction
    Licensing guide variables
    Location in the Products x Processes matrix
    Dual Licensing
    Licensing models
    Case study (JBoss, Covalent, Wind River, MySQL AB, Google, Cyclades)
    Guide (Products, Services)
    Licensing Guide Summary
    Conclusion
    References
Round Table 1: Extending the Domain of Intervention of Forges: Chair Philippe Aigrain (Sopinspace), François Elie (ADULLACT)
    Introduction
    Some findings
Round Table 2: Getting Forges to Collaborate, Collaborating through Forges: Chair Barbara Held, IDABC European Commission
    Introduction on how, and why, to get forges to collaborate, and the EU OSOR project
    Why is the Commission involved in eGovernment activities?
    Why does IDABC promote OSS in Public Administrations?
    The OSOR – objective
    The OSOR – first steps
    The OSOR – tasks
    The OSOR – what is needed
    The OSOR – open questions
    Some findings of this session
Round Table: Forges, the Next Generation: Chair Harald Wertz (University of Paris 8), Thierry Stoehr (AFUL)
Some Conclusions: Patrick Sinz
    Initial Thoughts
    Some notes on the evolution of technologies in the forge environment
    Why do we need a business model for Forges?
    An historical parallel
Some Participants' Notes
    Olivier Berger

Introduction: Patrick Sinz (Ethiqa)

The Héphaïstos Project

The Héphaïstos project was launched through a collaboration between the French association ADULLACT (Association des Utilisateurs de Logiciels Libres pour l'Administration et les Collectivités Territoriales) and the Linex team of the Regional Government of Extremadura (Regional Ministry of Infrastructures and Technological Development) in Spain, with the recent addition of the UK-based Open Source Academy.

In its current state, the Héphaïstos project links together the Gforge platforms of ADULLACT and of the Junta de Extremadura's Spanish Public Sector Gforge, and will shortly integrate a forge platform managed by the UK Open Source Academy. Five additional forges are anticipated to be connected by the end of 2005, and we hope to have almost all European countries linked in by the end of 2006.

The strategic goal of the Héphaïstos project is to help Public Sector institutions to better understand the structural advantages of adopting Open Source solutions and, by promoting full participation in the development of Open Source solutions, to help them manage the transition from "users" toward "participants" who can fully benefit from the development process itself. Their participation can be direct, if they have their own development teams, or indirect, through subcontractors; in most cases it will be a mix of both.

We are achieving this by promoting, in each European member country, the deployment of an Open Development Platform using Gforge, Savannah or similar technologies, dedicated to the Public Sector. We then link all these platforms in a way that enables members of the Public Sector in each country to be aware of ongoing developments not only in their own country but also in the other European Community member states. This is particularly useful for all solutions that have to implement European directives and therefore share many common elements, although the exact transposition into local law might need some adaptation. Another facet of the Héphaïstos project is the creation of a training and certification process enabling Open Source project managers to fully understand the Open Source collaborative development process and tools.

The Héphaïstos Conference

The goal of this conference is not limited to, and in some ways not even related to, the project. What we hope to achieve is a better understanding of the use, potential and future of "forges" in general.

This will be explored through our various presentations, and through three "round tables":

● Round Table 1: Extending the domain of intervention of Forges. Chair: François Elie (ADULLACT), Philippe Aigrain (Sopinspace). Are forges only relevant to software development, or is the Free and Open Source development process a beacon of things to come in all human endeavors?
● Round Table 2: Getting Forges to collaborate, collaborating through forges. Chair: Barbara Held (IDABC, European Commission). How to collaborate between various forges, and how can forges help to promote collaboration within the European Public Sector (or any other "industry")?
● Round Table: Forges, the Next Generation. Chair: Harald Wertz (University of Paris 8), Thierry Stoehr (AFUL). So, now that we know where we are, where are we going?

From 0 to 300: The Adullact Forge at the age of 4: François Elie, ADULLACT

What is the ADULLACT
● Association of Developers and Users of Free Software (Logiciels Libres) in Administrations and Local Governments (Collectivités Territoriales)
● public funds should only have to pay once!
● not only use it, but develop it!
● created: November 2002
● adullact.org & adullact.net
● 283 projects / 2298 users on the adullact.net forge

How to develop together
● direct development?
● buy together?
● modelling together?
● the challenge for the future: maintaining together!

So we need a forge...
● use an existing forge?
● let members have their own forges?
● a new forge for a new concept! (like Savannah, ObjectWeb...)
● open questions:
    ○ whoever controls the collaborative tools may control the works
    ○ localisation?
    ○ identity?
    ○ security?
    ○ sustainability?
● November 2002
    ○ birth of the project
    ○ based on the GForge project, which continues the code from the state it was in before the (SourceForge) fork
    ○ reflection on the choice
        ■ December 2002: test on my own computer :-)
    ○ April 2003
        ■ installation at CRI74
        ■ with the help of the GForge project administrators
        ■ birth of adullact.net
● confirmation of the choice
    ○ ObjectWeb, Debian, etc. install a GForge
    ○ first upgrade
    ○ participation in GForge improvements
        ■ French translation
        ■ backup procedure
        ■ tabs extension system
        ■ links to and search across other public forges

Conclusion

Is sharing software thanks to public funds possible? Obviously yes! Many ADULLACT-hosted projects are reused and shared among various collectivities.

PicoForge and Shibboleth: Managing Identities in a Forge Environment: Olivier Berger (INT Evry)

Authors: Quang Vu DANG (IFI), Olivier BERGER (GET/INT), Christian BAC (GET/INT), Benoît HAMET (phpGroupware)

Abstract

This paper presents a proposal to address the need for multiple authentication sources for users of collaborative work platforms based on phpGroupware, such as Picolibre. The proposed approach, developed for the needs of GET and Picolibre, relies on a generic solution for integrating phpGroupware servers into a Shibboleth infrastructure. We have developed new phpGroupware adapters for this integration, which we hope to contribute to the phpGroupware project. This document should serve as a basis for discussion among interested phpGroupware developers and users, in order to validate the level of generality of the proposed approach, and eventually to decide on the adoption of the proposed modifications for integration into the standard phpGroupware code base. We hope that this approach can also help maintainers of other collaboration platforms, who want to integrate a set of deployed platforms with external user identification and authentication services, get a better view of the solutions available with Shibboleth.

Introduction

The Groupe des Écoles des Télécommunications (GET) is composed of several engineering and business schools together with research centres in Paris (ENST), Brest (ENST Bretagne) and Évry (INT), in France. The research teams are made up of more than 600 full-time research equivalents. The range of the researchers' expertise, from technologies to social sciences, enables the integrated approach so characteristic of GET research and fosters its adaptability to new application sectors and new usages, in response to current challenges in the field of Information and Communication.

Initially developed for use as a pedagogical platform, Picolibre [1] is a libre-software system developed at GET and released under the GNU GPL license. It provides a Web-based collaborative work platform built on top of phpGroupware1 and other libre software tools. Picolibre provides project hosting facilities for small teams of software developers, and was mainly oriented toward teaching and research environments2. Picolibre integrates several libre software Web applications, but lacked some features in its current stable version, like a Wiki engine. We have therefore developed an in-house GET platform, based on Picolibre components and integrating new services like a Wiki engine or a WebDAV server, called ProGET [2]. We seek to integrate these new features into a new-generation Picolibre, making it a more generic and complete platform, available to all as libre software.

1 http://www.phpgroupware.org/
2 The reader will find more details about PicoLibre at http://www.picolibre.org/

Several Picolibre platforms have been deployed at GET, and developers or researchers may be using the services of several such platforms, while working on projects initiated on these different platforms. At present, different accounts may be created on these platforms for the same person; hence the need for Single Sign-On (SSO) facilities between these platforms.

GET is in the process of deploying a federation of authentication systems and applications, based on Shibboleth, for a better integration of its Information System, which is at present distributed among the different schools. Shibboleth is an infrastructure designed to build an identity federation allowing applications and identity providers to share and exchange attributes concerning user profiles, in order to facilitate user identification and authentication within the realm of a deployed identity federation (SSO, etc.).

We want to investigate the use of Shibboleth by PicoLibre as well, but we should note that users of the Picolibre platforms may be:
● either GET agents or students, who are already registered in GET's "company directories" (and who will therefore be known to Shibboleth),
● or external contributors, outside GET (unknown to the GET Information System).

At present, PicoLibre is not yet interfaced with GET's company directories to authenticate its GET users, and manages its own directory. Hence the need to develop adapters to integrate Picolibre platforms into the coming Shibboleth federation. But even if GET users are recognised by Picolibre through its use of Shibboleth, we will still have to support other users who are not known to GET's directories. This requirement will also be addressed in our proposed solution.

The same issues, which are addressed here for Picolibre users and GET, may also be found in other networks of collaborative work platforms, for instance among libre software development communities, for authentication to the various software development platforms they use (Gforge, Trac, etc.).

Shibboleth and SSO Service

Shibboleth3 is a complete open source platform developed within the "Internet2"4 project, aiming at building identity federations for education institutions and their partners. It is based on the SAML standard (Security Assertion Markup Language), which defines authentication assertions and the exchange of user attributes. It supports SSO and authorization, allowing the definition of access control privileges for Web resources based on user-associated attributes. The basic architecture of Shibboleth has three components:
● Identity Providers (IdP),
● Service Providers (SP),
● and a "Where Are You From" service (WAYF).

Identity Providers (IdP) are responsible for user authentication, using an existing SSO service such as the Central Authentication Service (CAS)5, and for providing the user attributes used in the access control process. They manage a directory of users, attributes, and possibly passwords.

The Service Provider (SP) is the part of Shibboleth (written in C/C++) managing access to the resources requested by users in the federation. The Apache web server plugin mod_shib is a module that provides access control for Web pages, basing its decision on the user attribute values defined in the IdP.

The "Where Are You From" (WAYF) service is an additional component of the federation, which helps the user explicitly choose his/her IdP "of origin", i.e. the place where he/she is known as a source of authentication.

When a user wants to use a service offered by members of the federation, the service's Shibboleth SP component redirects the user towards his/her IdP, or to a WAYF service for choosing his/her preferred IdP. He/she then authenticates to the SSO service indicated by this IdP (for instance a CAS server where a password will be prompted). After successful authentication, he/she is automatically redirected back to the requested service, this time authenticated. After having authenticated the user, the SP then requests some information describing the user, and filters this information for the authorization process. This information is then sent to the service's Web applications in the form of HTTP headers (through the mod_shib adapter, for instance, in Apache). The whole system is described in the following figure (1), showing the Shibboleth protocols:

3 http://shibboleth.internet2.edu/
4 http://www.internet2.edu/
5 http://www.ja-sig.org/products/cas/index.html

Illustration 1: Shibboleth protocols

This figure is originally available at http://www.switch.ch/aai/demo/Demo3.png : © SWITCH (The Swiss Education & Research Network).

Shibboleth offers several advantages, like the possibility to distribute a user base among several directories (IdPs), the fact that applications themselves do not keep local authentication tokens like passwords, or the fact that each application may identify a user based on a different set of predicates on its Shibboleth attributes. This gives the whole system a lot of flexibility, while relying on a secure, distributed and trusted architecture, with demanding user account management procedures on the IdPs. Discussion of all this is out of the scope of the present article.
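To make the header-passing mechanism concrete, here is a minimal sketch (not from the paper) of a PHP script protected by Apache and mod_shib reading the identity and attributes forwarded by the SP; the attribute name "mail" is an assumption that depends on the SP's attribute mapping configuration, not something defined by this paper.

    <?php
    // Minimal sketch: a PHP page behind Apache + mod_shib.
    // REMOTE_USER is set by the web server once a Shibboleth session exists;
    // the "mail" attribute name is an assumption depending on the SP's
    // attribute mapping.
    $remote_user = isset($_SERVER['REMOTE_USER']) ? $_SERVER['REMOTE_USER'] : null;
    $mail        = isset($_SERVER['mail']) ? $_SERVER['mail'] : null;

    if ($remote_user === null) {
        // No authenticated session: in practice the SP would already have
        // redirected the browser to the WAYF / IdP before reaching this point.
        header('HTTP/1.1 401 Unauthorized');
        exit;
    }
    echo 'Authenticated as ' . htmlspecialchars($remote_user);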

Enhanced authentication in Picolibre

At present, Picolibre users can authenticate against the phpGroupware application (and its underlying database or LDAP directory), which provides the basis for user management in PicoLibre. They log in to phpGroupware when they want to access the "virtual desktop" homepage of PicoLibre, which contains the list of the projects on which they collaborate. But they may also authenticate directly to one of the other components integrated in the platform, residing on the same Web server, such as Sympa6 (the mailing list manager) or others to come in the future (like Twiki7), without going first to the phpGroupware login page.

6 http://www.sympa.org/
7 http://www.twiki.org/

Standard authentication scheme in phpGroupware

In general, phpGroupware handles access permissions to its applications in an autonomous way, for locally recognised users, basing itself on a local "account" stored in a "directory" operated, for instance, by an RDBMS like MySQL or by an LDAP directory. To be allowed to access phpGroupware, the user should have an account which determines the user's profile: the access rights for the list of modules that the user may use. In phpGroupware, access to applications is a three-phase process:
1. Authentication: verifying that the user is actually the owner of a user account, by using an instance of the auth_ class, which will retrieve the account and validate the password against the local authentication directory.
2. Identification of the user's profile: this phase is handled by an instance of the account class after the user has been authenticated successfully. Here, if phpGroupware is set up in "auto-create" mode and no profile already exists in the local directory, a new profile can be created automatically, with default rights ("guest" access, for instance).
3. Creation of a work session: based on the user's profile which has been retrieved, phpGroupware will create a work session and grant access to the phpGroupware modules that the user is allowed to use.
Here, the phpGroupware authentication phase only uses a "local directory", as configured by the administrator during the phpGroupware server configuration phase.
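To illustrate the three-phase process, the following toy PHP sketch outlines the control flow only; the function names and data structures are simplified stand-ins, not the actual phpGroupware API.

    <?php
    // Toy outline of the three phases (NOT the real phpGroupware code).
    // 1. Authentication against a local "directory".
    function authenticate(array $directory, $login, $password)
    {
        return isset($directory[$login]) && $directory[$login]['password'] === $password;
    }
    // 2. Identification of the profile, with optional "auto-create" mode.
    function load_profile(array &$directory, $login, $autocreate = true)
    {
        if (!isset($directory[$login]) && $autocreate) {
            $directory[$login] = array('password' => null, 'apps' => array('guest'));
        }
        return isset($directory[$login]) ? $directory[$login] : null;
    }
    // 3. Creation of a work session granting access to the allowed modules.
    function create_session($login, array $profile)
    {
        return array('user' => $login, 'allowed_apps' => $profile['apps']);
    }

    $directory = array('alice' => array('password' => 's3cret', 'apps' => array('projects', 'wiki')));
    if (authenticate($directory, 'alice', 's3cret')) {
        print_r(create_session('alice', load_profile($directory, 'alice')));
    }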

Depending on the physical implementation of the local accounts directory used by phpGroupware (MySQL, LDAP, ...), it is possible to share the user's profile with other applications deployed in the same networked environment as the phpGroupware server, provided that the necessary administrative policies are adopted, and that custom technical adapters are developed or configuration decisions are taken. But this is quite demanding with respect to the administrators' skills. Using the same LDAP directory as a user account directory for several high-level Web applications may be such an easy solution (actually used at the moment in PicoLibre), but it may not scale or adapt to all security concerns. So there is, at the moment, no "elegant" SSO facility in phpGroupware which would allow phpGroupware to grant, to users already known in other parts of the organisation's information system, "transparent" access to its applications (unless they share the same "back-end" LDAP server, for instance, as explained above).

Shibboleth+Apache as an integrator and SSO service

Another way to assemble all the current authentication mechanisms of the various Web applications installed on the same Picolibre system is the use of a common "front end", i.e. their common Web server's authentication system (here we consider only Apache, as a standard reference tool). But if (in the future) the applications composing a single Picolibre platform are actually deployed on several Web servers, we need a more advanced mechanism for sharing this authentication. As described above, we also still need SSO with other applications deployed throughout GET, outside the Picolibre platform, to which users will have already authenticated.

Shibboleth can help solve all these needs. And, helpfully, many of the Web applications we use, like Sympa or Twiki, are now becoming compatible with Shibboleth. Thus, Apache and Shibboleth will be able to act as the integrator of their authentication mechanisms (using a distributed Web service approach).

Unfortunately, phpGroupware's authentication mechanisms fall short in such a situation. We need to develop a new phpGroupware identification and authentication adapter for the Apache + Shibboleth combination.

Mixed environment, and legacy

Picolibre may use the GET Shibboleth federation once adapters have been added to all its applications. But, given the user typology described above, Shibboleth cannot be the exclusive authentication mechanism used. We are not in a typical "intranet" or "extranet" system, and have both "company" users and "external" users. We will therefore need some way to "bypass" Shibboleth for some of our users.

Also, one issue which has to be solved when Shibboleth (or any such external authentication mechanism) is used is the local mapping, in the applications, between the way users are recognised and identified in Shibboleth, and the internal reference of the local profile (or account) in the application. Of course, if Shibboleth is deployed prior to setting up the Picolibre platform, if all its users are known in Shibboleth, and if only Shibboleth users are recognised by the applications, then such a mapping is trivial. But it gets much harder if Shibboleth is deployed on an existing environment with many legacy accounts already existing in Picolibre. Also, Picolibre users working or studying at GET may be known in various Shibboleth IdPs inside GET, and it is not obvious yet, at the present time, that they can be recognised by a standard set of attributes fitting the needs of Picolibre.

Having considered all the constraints above, we propose to integrate Picolibre (hence phpGroupware) with Shibboleth using a flexible approach. In particular, we try to facilitate the progressive integration of existing deployed phpGroupware instances, thus diminishing the migration burden for administrators and users (account re-creation, etc.). As a result, the design of the new Picolibre authentication system (including the phpGroupware Shibboleth adapter) will need to support these cases:
● new users, who are known in GET's company directories (like GET agents and students registered in one of the Shibboleth IdPs) but do not yet have a developer account in Picolibre, who will be able to create a new account in Picolibre/phpGroupware,
● "legacy" users of Picolibre with an account in phpGroupware, who will still access Picolibre, and:
    ○ if they also have an account in Shibboleth at GET, will be able to map their legacy Picolibre account to the new Shibboleth identity (for doing SSO with the CAS services of the IdPs),
    ○ if not (for people external to GET), will still be able to use Picolibre, as before, "bypassing" the Shibboleth SSO engine,
● new users external to GET, who will still be able to apply for registration in phpGroupware, as before.
The situation should be similar for any authentication mechanism other than Shibboleth to which phpGroupware would authenticate. We therefore tried to propose a standard mapping mechanism which would not be too specific to Shibboleth.

Implementation of the phpGroupware Shibboleth adapter

In this section we describe the way we have implemented the Apache + Shibboleth adapter in phpGroupware, taking into account all the constraints described above.

Using Apache-based authentication in phpGroupware

By using an Apache authentication method, phpGroupware will not authenticate users internally, in its accounts directory (LDAP, MySQL, ...). Instead, it will rely on the Apache session's environment variable REMOTE_USER (which will hold something like the user's logname, or email, ...), which is defined when the HTTP transaction is authenticated by the web server8. The benefit of this approach is the availability of many authentication schemes, using existing Apache modules such as mod_auth_ldap, mod_auth_mysql, mod_cas, mod_shib, etc. phpGroupware can then take full advantage of this versatile integration mechanism with external authentication sources like Shibboleth. However, depending on the browser's features and the specific mechanism used, in certain cases the identity of the user may be "hidden" in the web browser's internal state: he/she may have initiated a session, but may not be able to log out from phpGroupware unless the browser is closed9.

For instance, the configuration of the Apache web server for a phpGroupware instance may look like the following (for authentication against an LDAP server):

    AuthType Basic
    AuthName "phpGroupware"
    AuthLDAPEnabled on
    AuthLDAPAuthoritative on
    AuthLDAPURL ldap://my.openldap-server.com/dc=my_org,dc=org?uid
    require valid-user

8 With the "basic" authentication mechanism, when no advanced Apache auth. driver is used (like mod_shib), this will result in the prompt of a login + password dialog box in the Web browser.
9 Some current Web browser versions do not support easy re-initialisation of the HTTP session's REMOTE_USER by default.
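The idea can be summarised by the hypothetical sketch below: instead of checking a password, the application simply trusts the identity that Apache has already established. The Annex describes the real class (auth_remoteuser.inc.php) that plays this role; the class shape shown here is illustrative only, not the actual phpGroupware code.

    <?php
    // Illustrative only: trust the identity established by Apache
    // (mod_auth_ldap, mod_cas, mod_shib, ...), instead of checking a
    // password against the local accounts directory.
    class auth_remoteuser_sketch
    {
        // Returns the asserted identity, or null when Apache did not
        // authenticate the request.
        public function authenticate()
        {
            return (isset($_SERVER['REMOTE_USER']) && $_SERVER['REMOTE_USER'] !== '')
                ? $_SERVER['REMOTE_USER']
                : null;
        }
    }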

Mapping Apache session's REMOTE_USER to a phpGroupware account

In our case, with Shibboleth, we want phpGroupware to take part in an identity federation consisting of several sources of identity attributes. Once a user has authenticated to Apache (through mod_shib), we need to determine his/her profile by retrieving his/her phpGroupware account, based on his/her attributes in Shibboleth. The choice of the attributes to use is difficult. For instance, two different users may appear as having the same "identity" in one specific attribute's values on different IdPs (the same uid number in different LDAP servers, for instance). In that case, the existing "trivial" mapping of Apache REMOTE_USER values to phpGroupware account IDs would not work. We then need to use another attribute, say the email, which is supposed to have a unique value among all the IdPs of the federation (or some ID card number, etc.).

But let us now suppose that the chosen attribute is the email; then, depending on the identity policy enforced in the federation10, a single person may belong to two different sources of identities where he/she is known under two different emails: for instance, a departmental email (with a sub-domain in the rightmost part) and a company-wide email (with only the master domain name of the company). To better support these situations, we felt the need to develop a new (optional) mapping mechanism, active during the identification phase in phpGroupware, which would use a mapping table for projecting these different attribute values onto a single account ID in phpGroupware.

10 It is not yet clear to the authors whether such situations may arise when the full Shibboleth infrastructure is rolled out throughout GET, as we are not directly associated with the deployment project. Anyway, even though this case may not be frequent with Shibboleth, it will probably happen with other Apache-based auth. sources in large environments, or during migration phases, for instance.

There may then be two modes of operation available in phpGroupware, configured by the administrator:
● the old "trivial" mapping, for sites where REMOTE_USER is a unique ID in the federation of identities (using the current phpGroupware implementation),
● the new mapping mechanism, for sites where each user does not necessarily have a unique ID which can be identified in the company's Information System.
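A sketch of the table-based lookup is given below, using the phpgw_mapping columns listed in the Annex (user_ext, auth_type, status, account_lid). It is only an illustration of the idea: PDO is used here for brevity, phpGroupware has its own database layer, and the actual column values (e.g. the status flag) are assumptions.

    <?php
    // Sketch of the "mapping by table" lookup (illustrative, not the
    // actual phpGroupware code). Columns follow the Annex description.
    function map_remote_user(PDO $db, $remote_user, $auth_type = 'shibboleth')
    {
        $stmt = $db->prepare(
            'SELECT account_lid FROM phpgw_mapping
              WHERE user_ext = :ext AND auth_type = :auth AND status = :status'
        );
        $stmt->execute(array(
            ':ext'    => $remote_user, // e.g. the email transmitted as REMOTE_USER
            ':auth'   => $auth_type,
            ':status' => 'allowed',    // assumed value; the real status codes are not given
        ));
        $row = $stmt->fetch(PDO::FETCH_ASSOC);
        // When no mapping exists, the user is sent to create_account.php or
        // create_mapping.php, as described below.
        return $row ? $row['account_lid'] : null;
    }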

Additions to phpGroupware

phpGroupware's code will be enhanced with a new mapping class that we have developed (described in more detail in the Annex). After having passed the authentication phase (with Apache and Shibboleth, for instance), a user will be directed to the new login-equivalent script, in the /phpgroupware/phpgwapi/inc/sso directory, which will perform the mapping and the creation of the phpGroupware work session. If the search for an existing suitable mapping is not successful:
● a new account can be created,
● or a new mapping can be created to an existing account in phpGroupware (verifying, on creation, that the user owns the password to this existing account).

If phpGroupware allows users to create accounts (as defined in phpGroupware's configuration: "Auto-creation of account"), the script create_account.php provides an interface for creating the new account, based on the information provided by the external authentication system (in the case of Shibboleth). It helps retrieve various user description fields such as names, email, IDs, etc. If phpGroupware is configured to support the new "mapping by table" feature, the script create_mapping.php provides the function for creating a new mapping if the user already had an existing account in phpGroupware. During this process, the user will have to authenticate to the legacy account in phpGroupware before creating this new mapping, in order to ensure that nobody can hijack anyone's existing account when a deployed Picolibre platform enters the new Shibboleth federation. If phpGroupware is configured to use the "trivial" mapping mechanism, no such authentication to the legacy account will be done, as we fully trust the IdPs and consider that the matching of Shibboleth attributes to the local accounts is non-ambiguous.

In the case where the trivial mapping would succeed for most users but fail for a small number of cases, a "sequential" mapping mechanism could be activated, provided that the administrator has carefully checked for the absence of potential "hijacking" risks. In such cases, both mechanisms would be applied sequentially: trivial mapping first, then mapping "by table" if no match is found.

Several important configuration options of phpGroupware used here may look like the following, in setup/templates/default/config.tpl:

    auth_type = "remote user"
    mapping_type: choice between "All", "Unique ID", "Mapping Table"
    mapping_field: choice between "uid", "email", "account_lid" (default)
    ...

We want to stress that this new phpGroupware user account identification, through an optional mapping phase, is not specific to the use of Shibboleth as an external authentication method. It may also be useful for other sources of authentication through Apache (or with other mechanisms), so we propose that it be integrated into the standard code base of phpGroupware's core API module. phpGroupware would thus become much more interoperable with the whole Information System of a company, and less oriented toward use as a specialised, dedicated and isolated groupware server.

Configuration of phpGroupware's access protection in Apache

Access to phpGroupware's Web pages and to the associated Web applications of Picolibre can be configured through careful customisation of Apache configuration directives. As explained above, we may be using phpGroupware in an "intranet" environment, or in a mixed environment such as the one for Picolibre. In a mixed environment like Picolibre's, where we use authentication through Apache and Shibboleth, Shibboleth will only recognise company agents, and not external "contributors" to hosted projects. The same problem actually applies to every authentication mechanism other than Shibboleth when used in mixed environments. Through specific configuration directives, we can devise different access points for the same Web application, and choose to make access control by Apache mandatory ("Full Apache") or optional ("Semi Apache").

"Full Apache" access control

In this case, Apache acts as a "wrapper" around the whole phpGroupware application, protecting all its accesses. Every access to phpGroupware must go through the Apache authentication module (hence Shibboleth). Only users already known to the identity federation will be granted access. This can be achieved through an Apache server configuration like the following (for instance inside a <Location /phpgroupware> block):

    ...[any auth. mechanism here]
    require valid-user

Once users are authenticated, they proceed to the optional mapping phase, which will eventually help retrieve any existing legacy phpGroupware local account (see the figure below for the resulting whole process in "Full Apache" mode).

Semi Apache

On the other hand, phpGroupware may also be accessible without first authenticating to Apache ("bypassing" Shibboleth). This makes it possible to set up, at the same time:
• authentication against the local phpGroupware accounts directory, with a traditional phpGroupware auth method (for external users outside the company, contributors, ...),
• and authentication against Apache (and Shibboleth) as well (for instance for all regular "intranet/extranet" access by company agents and students), with the SSO option if available.

In the Apache configuration directives, for instance, the "top-most" location /phpgroupware/ itself would be "world accessible" (within the limits of phpGroupware's own login script and the protections in the PHP code). The configuration of the specific part providing authentication through Apache's auth modules would then be placed under a subdirectory, for instance /phpgroupware/phpgwapi/inc/sso:

    ...[any auth. mechanism here]
    require valid-user

So, to access the phpGroupware server, the user would be free to choose whichever method suits him/her best, by going to different entry points (different URLs), for instance:
● http://server/phpgroupware/login.php: which would propose the standard phpGroupware login + password PHP dialog. Used by local users, it would also provide an alternate link (named "SSO", for instance) pointing to the other URL below,
● http://server/phpgroupware/phpgwapi/inc/sso/login_server.php11, for instance: which would be accessed from the SSO link above, or directly, and whose sole purpose would be to direct the user to the Shibboleth infrastructure login and SSO pages (plus the mapping).

11 Note that the URLs provided here may be aliased through rewrite rules, etc. for simplicity of use. The naming hierarchy used here reflects the physical distribution of the scripts in the phpGroupware installation.

The login process would then be the following :

Final phpGroupware setup with Shibboleth for PicoLibre

With Shibboleth acting in the Apache access control wrapper for the Web applications of Picolibre, as described above, and with the addition of the necessary mapping phase, phpGroupware can now use the SSO services available in the GET Shibboleth infrastructure. Picolibre platforms will then participate in the Information System in a standard way for most of their users, accessing the identity federation to recognise them.

For the specific needs of the Picolibre platforms at GET, phpGroupware access will be configured to use the "Semi Apache" authentication method described above, since we have users outside the boundaries of GET. In the Apache configuration we will use the Shibboleth authentication:

    AuthType shibboleth
    ShibRequireSession On
    require valid-user

As explained above, a user known to Shibboleth may be known to several identity sources (IdPs), and we need to define which attribute will be used to identify the user, and to define the mapping strategy to phpGroupware accounts accordingly. We chose the email, so the AAP.xml configuration file of the Shibboleth SP associated with phpGroupware is defined so as to transmit the value of the user's email as REMOTE_USER.

The same kind of configuration can then be applied for access to the other integrated applications of the Picolibre platform, such as Sympa (or TWiki in the future), so that they too benefit from the same SSO as phpGroupware and the external applications.

Conclusion

We have described a proposed method for integrating phpGroupware with Shibboleth to allow the use of SSO mechanisms when accessing phpGroupware, while supporting several authentication sources (various Shibboleth IdPs, and local accounts too). The integration relies heavily on the use of Apache's authentication modules instead of an internal phpGroupware authentication mechanism. There is very little that is specific to Shibboleth in this respect; most of the issues are the same if phpGroupware uses other types of Apache authentication mechanisms. We introduce several options for configuration and adaptation to other environments, in order to achieve the most generic solution. They help adapt to several kinds of situations, like the availability of a fully operational Shibboleth environment, or the support of an existing deployed phpGroupware user base, potentially not already in sync with the Shibboleth deployment. Integration of the new modules developed for phpGroupware has been implemented against the current 0.9.16.010 version of phpGroupware12, avoiding modifications of its architecture. We have successfully tested a prototype system on two phpGroupware and Picolibre platforms, with the Shibboleth infrastructure of GET/INT.

12 It may also be partially integrated in version 0.9.18 by the time you read this document.

Further tests will be needed, but we are satisfied with the proof of concept obtained, which offers us a migration path toward an easy integration of deployed Picolibre platforms into the GET Information System, while keeping existing legacy accounts. The SSO facility obtained will help design new networks of collaborative platforms that will hopefully offer greater usability and flexibility, for wider adoption both inside companies and among creative communities on the Internet.

Bibliography

[1] Cousin E., G. Ouvradou, P. Pucci and S. Tardieu, 2002. PicoLibre: a free collaborative platform to improve students' skills in software engineering. In: 2002 IEEE International Conference on Systems, Man and Cybernetics, Vol. 1, IEEE, pp. 564-568.
[2] Berger O., C. Bac and B. Hamet, 2006. Integration of Libre Software Applications to Create a Collaborative Work Platform for Researchers at GET. International Journal of Information Technology and Web Engineering, 1 (3), 2006.

Annex

Components added

In package phpgwapi

In this package we add the auth class in the file auth_remoteuser.inc.php, which corresponds to the method of authentication through Apache. This class does not do much: it only overloads the base class with the trivial authentication process in phpGroupware. For the mapping phase during identification of the user, we add the mapping class in the file mapping.inc.php. It inherits from the mapping_ class, which is based on the existing account in phpGroupware, to perform the "trivial" mapping and to validate the account.

Class list:
    class auth: auth_apache.inc.php
    class mapping_ldap: mapping_ldap.inc.php
    class mapping_sql: mapping_sql.inc.php
    class mapping_picolibre: mapping_picolibre.inc.php
    class mapping: mapping.inc.php

New "apache" phpGroupware module

We add a new "apache" package for authentication through Apache. Scripts:
● /phpgwapi/inc/sso/login_server.php: the login process,
● /phpgwapi/inc/sso/create_account.php: interface allowing the user to create an account,
● /phpgwapi/inc/sso/create_mapping.php: interface allowing the user to create a new mapping,
● /preferences/mapping.php: manage mappings (allow, deny, delete a mapping).

New database table

We add the table phpgw_mapping (attributes: user_ext, location, auth_type, status, account_lid) to perform mapping by table. It links values of REMOTE_USER to local phpGroupware user accounts.

Components modified

Setup / configuration

We modify the configuration of phpGroupware in the module /setup/templates/default/config.tpl:
● add an option for the type of authentication through Apache: auth_type = remoteuser (only choice available at the moment, but present for future needs),
● add an option to allow a fallback authentication method (at the time of writing only remote user is allowed),
● add a parameter mapping_field indicating the account ID used for the "trivial" mapping,
● add an option mapping_type (all, id, table) indicating the type of mapping.

phpGroupware modules
● /login.php: modified to perform the mapping if auth_type = remoteuser,
● /admin/inc/hook_deleteaccount.inc.php: remove the corresponding mappings.

Configuration decisions

If using Apache server's auth modules (mod_auth_ldap, mod_auth_mysql, ...), declare the module configuration to protect phpGroupware. For example, with mod_auth_ldap:

    AuthType Basic
    AuthName "phpGroupware"
    AuthLDAPEnabled on
    AuthLDAPAuthoritative on
    AuthLDAPURL ldap://my.openldap-server.com/dc=my_org,dc=org?uid
    require valid-user

Corresponding configuration in phpGroupware:

    auth_type = apache (for Full Apache)
    auth_type = sql,ldap,... (with option semi_apache = yes: Semi Apache)
    mapping_type = id, table, all
    mapping_field = account_lid (by default REMOTE_USER = uid)

When using Shibboleth (mod_shib)

Install the SP and indicate the attribute chosen for REMOTE_USER (uid, email, ...), for example to put the value of the email into REMOTE_USER.

Declare mod_shib with the Apache server to protect phpGroupware:

    AuthType shibboleth
    ShibRequireSession On
    require valid-user

Of course, the Shibboleth SP will need to be registered within the existing federation.

Configuration in phpGroupware:

    auth_type = apache (Full Shibboleth)
    auth_type = sql,ldap,... with option semi_apache = yes (Semi Shibboleth)
    mapping_type = id, table, all
    mapping_field = account_lid or another field (depending on REMOTE_USER)

Keep it Simple, Editorial Control at simple.wikipedia.org: Matthijs den Besten (Oxford e-Research Centre)

Matthijs den Besten (University of Oxford) & Jean-Michel Dalle (Université Pierre & Marie Curie)

Background

The WikiWiki collaborative editing system, originally created by Ward Cunningham in 1994 [1], has become a very popular and widely used tool in recent years, most notably for Wikipedia, the free online encyclopaedia, but also within various organizations for project communication and documentation. The success of Wikipedia, its rapid growth and visibility on the Internet, provides evidence that WikiWiki works well. Yet it is also this particular application of wiki technology to the encyclopaedia that draws most criticism, for instance from star computer scientist David Parnas: "Wikipedia provides a fast and flexible way for anyone to create and edit encyclopedia articles without the delay and intervention of a formal editor or review process," he acknowledges. "But will this process actually yield a reliable, authoritative reference encompassing the entire range of human knowledge?" he asks [2]. People involved in Wikipedia generally seem to recognize these concerns, and several mechanisms are being tried out to alleviate them. It is these mechanisms for quality control that we study here.

More specifically, there are three broad types of modifications to the original WikiWiki system that have been added to Wikipedia so as to create a partial substitute for the formal editing and review process that is characteristic of traditional encyclopaedias:
1) the allocation of specific rights to a limited group of experienced users, such as the right to block other users who are misbehaving and the right to remove edit possibilities from pages which are contentious;
2) the introduction of bots – "automatic processes that interact with Wikipedia as though they were a human editor" (http://en.wikipedia.org/wiki/WP:B) – which carry out routine searches for, among other things, page vandalism and broken links;
3) the application of labels, for instance to indicate that a page is of high quality ("featured article") or that a page is considered for deletion ("articles for deletion").

It is this last modification that we are most interested in for now. Do these page labels act as attractors for contributions or contributors? Do they generate more edits and discussions? And do the labels result in a higher quality of the pages they have marked? In a first step, we focus on the study of simple.wikipedia.org, "a free encyclopedia written in simple English for easy reading" which, with slightly more than ten thousand articles, is a relatively small offspring of the main Wikipedia; within this corpus, we look at the effect that the crucial label "unsimple" has on the readability of the articles that it marks. This research is in a way a direct continuation of previous research that focused on what attracts developers to contribute to specific files in open source software systems [3]. Furthermore, the fact that editorial controls affect writing style has already been demonstrated by Emigh and Herring [4] in a comparison of Wikipedia with Everything2, while Les Gasser and colleagues have proposed a set of metrics for quality assessment that we happily adopt here due to its relevance to our topic [5]. For obvious reasons, for now, we specifically focus on readability metrics as measures for the simplicity of pages.

Preliminary Investigation

Our database contains revisions of 27 497 pages, of which only 250 have been labelled "unsimple" at some point in their lifetime. These 250 pages nevertheless represent about one fifth of the size of the dump: this may be explained by the larger average size of pages once labelled "unsimple" and/or by the significantly higher number of revisions on those pages (the "unsimple" pages have 736 characters and 37 revisions on average, whereas the averages over all pages are 159 characters and 6 revisions respectively). The first question we wanted to investigate was whether we could establish a relation between readability measures and the appearance of the "unsimple" label on a page. For that purpose we applied the GNU style tool (version 1.02) to all revisions of all pages in our database and aggregated the results in Table 1 below. For a variety of readability grades, this table compares the overall readability of the project with that of pages just before the "unsimple" label appeared and just after it disappeared.

Table 1: Effect of the "unsimple" label on readability.

    Metric     Overall median   Before tag   p      After tag   p      Improvement   p
    Kincaid    8.39             11.30        0.05   8.65        0.79   2.52          0.02
    ARI        9.25             12.94        0.05   9.79        0.68   3.01          0.03
    Coleman    10.67            11.50        0.00   10.61       0.75   0.85          0.00
    Flesch     68.74            58.69        0.01   69.02       0.92   -9.80         0.00
    Fog        11.76            14.50        0.07   11.68       0.94   2.68          0.02
    Lix        40.23            47.83        0.05   39.83       0.88   7.55          0.01
    SMOG       9.67             11.36        0.00   9.83        0.56   1.44          0.00

More particularly, for the seven readability grades listed in column one, column two gives the median score of all revisions of a page, averaged over all pages. Next, the scores are given for the average state of pages just before the label or tag "unsimple" was applied and just after it was removed. The small values of p next to the scores are the p-values of t-tests, as an indication of the probability that the two distributions are the same. Thus, the low p-values next to the column "Before tag" indicate that, on the basis of a two-sided t-test, the readability of pages just before "unsimple" appears is significantly different from the overall readability in simple.wikipedia.org. In contrast, the high values next to "After tag" indicate that such pages can hardly be distinguished from regular pages anymore after the "unsimple" tag has been removed, while the low p-values on the right confirm that the "unsimple" tag yields significant improvements, at least for a one-sided paired t-test. The metrics shown – i.e. the Kincaid formula, the Automated Readability Index (ARI), the Coleman-Liau formula, the Flesch reading ease formula, the Fog Index, the Lix formula, and SMOG grading – are all well-established readability grades, each with its own peculiarities. Hence it is all the more surprising that they all point in the same direction. Especially neat in this context are the values of the Flesch reading ease formula. According to the manual page of "style", "[t]he index is usually between 0 (hard) and 100 (easy), standard English documents averages approximately 60 to 70." Not only are "unsimple" pages objectively significantly less readable than regular pages, and not only does the "unsimple" tag generate a significant improvement to a page in many cases, but, as if by magic, the community of simple.wikipedia.org manages to keep the readability of pages more or less on the simple side, within the margins of standard English.

NB: Values in Table 1 concern the 99 pages that were "simple" before and after.
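For reference, and since the analysis relies on them, the two most familiar of these grades are computed from word, sentence and syllable counts as follows (these are the standard published definitions, not something specific to the style tool):

    Kincaid grade level = 0.39 x (words / sentences) + 11.8 x (syllables / words) - 15.59
    Flesch reading ease = 206.835 - 1.015 x (words / sentences) - 84.6 x (syllables / words)

A lower Kincaid grade and a higher Flesch score both indicate easier text.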

Table 2: User activity by type.

    User type       All pages      Unsimple pages   Edit while "unsimple"   Tag "unsimple"   Untag "unsimple"
    Anonymous       46491 (26%)    2238 (37%)       745 (33%)               41 (16%)         37 (37%)
    Registered      59235 (34%)    2073 (34%)       670 (29%)               98 (39%)         53 (52%)
    Administrator   32387 (18%)    1248 (22%)       429 (19%)               111 (45%)        10 (10%)
    Bot             38737 (22%)    504 (8%)         444 (19%)               0 (0%)           1 (1%)

Another question we investigated was whether the editors involved in tagging pages "unsimple" are any different from regular users of simple.wikipedia.org. Table 2 gives an indication of these differences and similarities. The table gives the rough counts of all revisions that can be attributed to anonymous visitors, registered users with an account, users with special "administrator" rights, and software bots. What appears from this table is that the acts of adding the "unsimple" label to pages and removing it from them are dominated by, but not limited to, administrators. Meanwhile, so far, bots are completely absent from this process, even though they play an important role in general. Is this the result of a conscious decision not to rely on automated processes, given the potentially contentious nature of tagging? Or are the conscientious editors at simple.wikipedia.org simply not aware of the readability metrics that appear, in Table 1, so powerful in predicting whether a page would be considered "unsimple" or not? Further research is needed to determine whether the efforts to keep it simple at simple.wikipedia.org are really as efficient as they appear to be at first sight, and whether they could be improved. Afterwards, similar questions can be asked for the broader issue of quality control in much larger projects like en.wikipedia.org.

NB: User groups were determined on the basis of a recent table of user-group assignments. However, keep in mind that the assignment of groups to users need not be static.

References

[1] W. Cunningham and B. Leuf. The Wiki Way: Quick Collaboration on the Web. Addison-Wesley, 2001.
[2] P. Denning, J. Horning, D. Parnas, and L. Weinstein. Wikipedia risks. Communications of the ACM, 12(48), December 2005.
[3] M. den Besten, J.-M. Dalle, and F. Galia. Collaborative maintenance in large open-source projects. In E. Damiani, B. Fitzgerald, W. Scacchi, M. Scotto, and G. Succi, editors, Open Source Systems, volume 203, pages 233-244, Boston, 2006. IFIP International Federation for Information Processing, Springer.
[4] W. Emigh and S. C. Herring. Collaborative authoring on the web: A genre analysis of online encyclopedias. In Proceedings of the Thirty-Eighth Hawai'i International Conference on System Sciences (HICSS-38), 2005.
[5] B. Stvilia, M. B. Twidale, L. C. Smith, and L. Gasser. Assessing information quality of a community-based encyclopedia. In Proceedings of the International Conference on Information Quality, 2005.

Towards Management of Enterprise Intranet Resources Using Open Source Technologies: Diana Gorea (IRCAD, University of Strasbourg, France)

Diana Gorea, [email protected]
Johan Moreau, [email protected]
IRCAD-EITS, Institut de Recherche contre les Cancers de l'Appareil Digestif – European Institute of Tele-Surgery, Strasbourg, France

Abstract. This paper proposes an intranet architecture dedicated to dynamic environments, with applications spread across different servers and locations and with varied client usage. The system is exploited in a medical environment in which users are trained in various domains, but collaborate and combine their work. The system is formed by the integration of disparate, independent software components that communicate in a service-oriented manner. It makes use mostly of open source technologies and methodologies.

Keywords: intranet, open source, content management, .

1. Introduction and Context

An intranet is an enterprise-wide information distribution system that uses Internet tools and technology. The level of architectural or technological complexity varies depending on the number of users and the business logic. An intranet is used to give employees access to documents, to distribute software, to enable group scheduling, to publish communications, to provide an easy front end to databases (back ends), and to let individuals or departments publish the information they need to communicate to the rest of the company. Physically, an intranet is formed by linking the various pieces of information and communication technology that an organization owns or uses, so that any resource is available to anyone who needs it at any time. An intranet encourages the members of the organization to make better and more informed decisions; it encourages and supports more effective use of people by people, and should support faster and more efficient decision-making processes.

The core of an intranet system remains invariable, regardless of the rapid evolution of technical and software advancements. It is provided by the content, namely the internal knowledge asset (IKA). This is the enterprise's intellectual property, its employees' knowledge and expertise; it is unavailable to the general public and is produced through the efforts of an internal community. The IKA may include business strategies, market trend analyses, financial information, internal document workflow information, internal rules and policies, operation manuals, inter-departmental communiqués, or details regarding specific projects and contracts. Unlike externally sourced information, the IKA is highly focused and very specific to the enterprise logic, and has the most added value, because it is produced internally by specialists in the discipline who are aware of the enterprise's requirements. For this reason, enterprises have begun using Wiki systems and blogs as part of their intranet systems, especially for internal communication and news releases, or for managing organization-specific knowledge. The main advantages of such an environment are:
• fostering participation, creativity and responsiveness from users and groups, leading to a "vivid" intranet
• improvement of the quality of teamwork

• quicker information circulation, assimilation and feedback

• know-how acquisition and reuse

To maintain content integrity and quality, the intranet owner may introduce techniques such as content moderation and submission guidelines.

We set up and tested an intranet system dedicated to a dynamic center [IRCAD, EITS] where health care professionals, surgeons, physicians, researchers, engineers and computer scientists continuously collaborate and combine their work. The institute promotes the latest surgical technologies involving robotics, virtual reality and advanced IT applications. In addition, it provides a virtual surgical educational environment to distribute specialized media content [WEBSURG]. There are many ongoing projects that require tight interdisciplinary teamwork. Such projects include: development of cooperative software for 3D patient reconstruction, simulation tools for tumor radio frequency ablation, augmented reality and robotics applied to the anatomy of the digestive system to automate the surgical gesture, and a virtual university specialized in minimally invasive surgery (interactive multimedia lessons explaining operative procedures, video and audio podcasts, expert interviews, lectures, editorials, Continuous Medical Education accreditation). The institute uses and supports open source software and also contributes to the open source community with several libraries released under public licenses: libDicom (a library for reading DICOM images in medical imagery), vgSDK (a specialized 3D engine for surface rendering of medical objects), YAMS, and OpenBuildFarm (tools for building applications). Certain members of the institute have contributed independently to other open source projects such as nagios-i18n and kosmos-i18n.

2. The Architecture of the System

The system is built by integrating disparate, independent software components (mostly open source) in a loosely coupled and structured way, as depicted in Figure 1. The communication among modules is done in a service-oriented manner. Each component can consume, integrate and pass on information that other components generate, and can in turn provide information to other modules. The system uses an SSO [SSO] authentication system based on LDAP [LDAP] and Kerberos [KERB] or SSL, allowing authentication transfer from one intranet module to another. LDAP (Lightweight Directory Access Protocol) runs over TCP/IP and allows querying and modifying directory services.

The entry point into the intranet system is the XWiki application. It provides access to the knowledge base of the enterprise and also aggregates small pieces of information from other modules in a manner transparent to the user. We chose XWiki [XWIKI] (eXtended Wiki), a wiki application based on the J2EE platform. The XWiki project is open source and is itself built on other open source projects, such as Struts [STRUTS] (for URL rewriting), Hibernate [HIB] (for relational database storage), OSCache [OSC] (for caching documents), Radeox [RAD] (as wiki rendering engine), DBCP [DBCP] (for reusing database connections), Subversion [SVN] (for version control), Velocity [VELO] and Groovy [GROOVY] (for programming inside documents). XWiki has many advanced features that cannot be found in most other wiki systems: multiple spaces, page commenting, file attachments, programming inside documents, web service integration, advanced searching, RSS feeds, LDAP integration, PDF export, etc. The most interesting capability XWiki provides compared with other wiki systems is by far the possibility of sharing both data and code. This is a natural approach once dynamically generated content is taken into account, which shows that data and code are very closely related: code can easily be converted into data or, more exactly, can aggregate and transform data, which finally results in new data. In this way, XWiki documents expose a versatile and intelligent behaviour. Moreover, the information may be organized using a single metaphor: all the information (data and code) is stored in the XWiki database. The XWiki database can be queried directly from Groovy scripts inside XWiki documents. Nevertheless, most database access is done through the Java objects (managed by Hibernate) available in the Velocity or Groovy contexts. External libraries can also be used on the fly from the scripting context. We actually added several additional ad hoc but still powerful features to the system using only the XWiki scripting support.

OpenGroupware [OGO] is an open source team communication software solution intended for companies, government institutions or distributed project teams. Its main functions are: contact management, group calendar, resource planning, task management, e-mail client, project and document sharing, news, Palm and PDA synchronization, and user and group management. Furthermore, OpenGroupware is an extensible application and a portal server. All the applications described above are implemented as plug-ins to the main server and can be extended and enhanced in various ways.
OpenGroupware allows pluggable synchronization with handheld wireless devices running clients such as iCal or Outlook by using SyncML [SYNCML], a platform-independent information synchronization standard. Thus any member can synchronize his or her own public calendar or contact list with the OpenGroupware server. Users can book conference rooms and can share their calendar with other members of the institute. A special solution was found for PDA synchronization through the DAWAN plugin for OpenGroupware [DAWAN], which is distributed under the GPL license. Our research and development teams use the Savane system [SAVANE] to manage tasks, bugs and support requests. Through its web interface, Savane provides an environment for collaborative project management. It implements a customizable mail notification system, making sure that users keep up to date with information that affects them. It also implements a bug tracker that allows tasks to be assigned and prioritized.

Figure 1. The intranet system architecture

3. Corporate Documentation and Tools

The intranet can be used as a medium for distributing employee information, announcements and policies, with the advantage of quick delivery of firsthand information, which usually enhances the company culture. The blog functions of XWiki are used to spread company news and events, with the advantage of increasing productivity and reducing administrative costs. Second-hand information can complete the first announcement and improve the quality of information. The XWiki blog system bridges the gap between email and instant messaging. The blog space allows both the circulation of news and information and the recording of new ideas and concepts. The blog is categorized into global and departmental announcements, news, and personal or user dialogue.

Wikis are very effective tools for knowledge management, making it possible to collect, preserve and share capabilities that are often under-exploited otherwise. Each research or development team works on several projects at the same time. The process of systematically writing the project documentation is not particularly straightforward. It is a lot easier to produce short notes, which are gradually added to common documents, achieving a continuous update of the documentation. Besides, the wiki has become a meeting point for discussing articles or institute activities (courses, conferences, visits). An interesting facility that can be leveraged in XWiki is the use of mind maps enabled by the FreeMind [FREE] plugin. This encourages content creation, organization and a better pursuit of ideas. The tool also allows document generation based on the mind map.

The way wikis are conceived can facilitate the process of collaborative decision making. A relevant example is the management of candidates for a job post. The original curriculum vitae files are saved in the download area of Savane and then attached to the candidate pages in XWiki, which are made visible to the decision makers. We implemented a voting system that enables posting votes and comments on candidate pages.

We use XMLRPC [XMLRPC] calls to the OpenGroupware XMLRPC server to extract and list information in XWiki documents. The calls are issued from the Groovy scripts that generate the XWiki pages containing the result data: the events and appointments of the week, information about contacts or teams, or any other information stored in the OpenGroupware database. We created an XMLRPC proxy module which provides a handy API for accessing data from the OpenGroupware XMLRPC server from any Java client, in our case the XWiki Groovy scripts.
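As a rough illustration of how such an XMLRPC proxy could look on the Java side, the following sketch uses the Apache XML-RPC client library. The endpoint URL, credentials and remote method name ("appointment.list") are hypothetical placeholders for this paper's description; the actual OpenGroupware method names are not given here.

import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

/** Minimal XML-RPC proxy sketch: fetches data from an OpenGroupware
 *  XML-RPC endpoint so it can be rendered inside an XWiki page. */
public class OgoProxy {
    private final XmlRpcClient client = new XmlRpcClient();

    public OgoProxy(String endpoint, String user, String password) throws Exception {
        XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();
        config.setServerURL(new URL(endpoint));   // e.g. "https://ogo.intranet.example/RPC2" (assumed)
        config.setBasicUserName(user);            // HTTP basic authentication over SSL
        config.setBasicPassword(password);
        client.setConfig(config);
    }

    /** Calls a remote method and returns the raw result; the method name
     *  used below is illustrative, not the actual OpenGroupware API. */
    public Object call(String method, Object... params) throws Exception {
        return client.execute(method, params);
    }

    public static void main(String[] args) throws Exception {
        OgoProxy proxy = new OgoProxy("https://ogo.intranet.example/RPC2", "intranet", "secret");
        Object appointments = proxy.call("appointment.list", "2006-11-13", "2006-11-19");
        System.out.println(appointments);
    }
}

In the intranet described here, the equivalent calls are issued from Groovy scripts inside XWiki pages, which then render the returned data as wiki content.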
One of the major difficulties in a company is finding relevant information, especially when the information system uses several mechanisms to store it. This heterogeneity of formats and storage environments is inevitable, considering the different nature of the information: local or Web documents, people, resources, companies, multimedia files, etc. Consequently, each system provides its own method of searching for this information. For example, XWiki uses SQL queries to search in its database and Lucene [LUCENE] to search in its page attachments. The Savane system allows searching in its database for projects, tasks, bugs or people. MnoGoSearch [MGS] is used to search in the corporate file system, and uses a MySQL database to store and index metadata about the files. We also need results from academic article search engines available on the Internet through a subscription, or even Google advanced search.

One of the most important functionalities of the intranet system is the corporate directory search. The information in the directory is continuously synchronized with the OpenGroupware system. OpenGroupware offers a rich and flexible format for storing persons and companies and associating them with ongoing events. As described before, we created a proxy that retrieves data into the XWiki system by means of web service calls, and these calls can be invoked by the search scripts. The search methods are therefore heterogeneous and complex, and we need a solution to adapt the search system to the enterprise's needs and to integrate, connect and filter the results. In our XWiki front end we created Groovy search scripts that call all the web front-end systems (XWiki, MnoGoSearch, Google, OpenGroupware, Savane) and provide a unique search and result interface for the end user, as sketched below.
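A minimal sketch of that aggregation logic in plain Java follows (in the real system it lives in XWiki Groovy scripts); the SearchBackend interface and its method names are illustrative and not part of any of the tools mentioned above.

import java.util.ArrayList;
import java.util.List;

/** Sketch of the unified search front end: each back end (XWiki, Savane,
 *  MnoGoSearch, OpenGroupware, Google) is wrapped behind a common interface
 *  and the results are merged into a single list for the user. */
interface SearchBackend {
    String name();
    List<String> search(String query) throws Exception;   // returns result titles/links
}

public class FederatedSearch {
    private final List<SearchBackend> backends = new ArrayList<SearchBackend>();

    public void register(SearchBackend backend) { backends.add(backend); }

    /** Queries every back end in turn and concatenates the results, tagging
     *  each hit with the system it came from. A failing back end is skipped
     *  so that one broken service does not break the whole search. */
    public List<String> search(String query) {
        List<String> merged = new ArrayList<String>();
        for (SearchBackend backend : backends) {
            try {
                for (String hit : backend.search(query)) {
                    merged.add("[" + backend.name() + "] " + hit);
                }
            } catch (Exception e) {
                merged.add("[" + backend.name() + "] unavailable: " + e.getMessage());
            }
        }
        return merged;
    }
}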

4. Authentication System

SSO (Single Sign-On) is a software solution based on a directory which allows the users of an enterprise computer network to transparently gain access to all authorized resources after a single authentication when the user initially accesses the network. Thus a single password gives access to the applications and multi-platform systems of the enterprise. The directory directly asserts the identity of the client to the applications in question without repeated user intervention. This approach has the advantage of increasing both the ergonomics of access to the intranet applications and the security of the information system, by limiting password circulation on the network. It can also enable the definition of roles inside certain applications. In other words, SSO offers convenience to the users because it releases them from the constraint of having to manage large numbers of identifiers and passwords. In our case the gain associated with the use of SSO is very high.

All the services we employ in our intranet support LDAP authentication, but not all of them support Kerberos for the moment. Users are already authenticated in the Savane and Subversion systems using the Kerberos/LDAP SSO. Because XWiki and OpenGroupware do not yet directly support Kerberos, we use the SSL protocol over HTTP for them in order to avoid clear-text password circulation on the network.

Besides authentication, the directory is the basis for user, group and project creation in the other components of the system. We have developed software modules that make SOAP calls in order to create:

• users, groups, webspaces, permissions and the corresponding documents in XWiki

• users and groups in OpenGroupware

• users, projects and permissions in Subversion

• users and mailing lists in Mailman [MAIL]

The corporate data (users, groups, resources) is originally stored in the LDAP directory. Once a modification in the organizational structure occurs (e.g. a new member), the administrative manager of the institute supplies the new data to the directory and subsequently initiates a workflow engine that informs all the decision makers and allows the information to be updated (e.g. subsequent modifications on projects). This way the decision makers or the workflow engine's administrator can constantly update the data and synchronize it with the other components of the system. In fact, we developed synchronization scripts that are launched automatically every night in order to keep all the components' data consistent with the LDAP directory; a sketch of such a script follows.
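The following is a minimal Java sketch of the directory-reading half of such a nightly synchronization, using the standard JNDI LDAP provider. The server address, bind DN and branch names are assumptions for illustration; a real script would go on to create or update the corresponding accounts in XWiki, OpenGroupware, Subversion and Mailman via the SOAP modules described above.

import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

/** Reads person entries from the LDAP directory so they can be pushed
 *  to the other intranet components. */
public class LdapSync {
    public static void main(String[] args) throws Exception {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ldap.intranet.example:389");  // hypothetical host
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "cn=sync,dc=example,dc=org");   // assumed service account
        env.put(Context.SECURITY_CREDENTIALS, "secret");

        InitialDirContext ctx = new InitialDirContext(env);
        SearchControls controls = new SearchControls();
        controls.setSearchScope(SearchControls.SUBTREE_SCOPE);
        controls.setReturningAttributes(new String[] { "uid", "cn", "mail" });

        // Enumerate every person entry found under the (assumed) people branch.
        NamingEnumeration<SearchResult> results =
                ctx.search("ou=people,dc=example,dc=org", "(objectClass=inetOrgPerson)", controls);
        while (results.hasMore()) {
            SearchResult entry = results.next();
            System.out.println(entry.getAttributes().get("uid") + " / " + entry.getAttributes().get("mail"));
        }
        ctx.close();
    }
}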

5. Documentation and Project Management

Documentation helps to ensure efficiency, continuity and consistency in software project development. Yet specifications can get very complex and unstructured, especially in the early stages. In this case, traditional hierarchical techniques and software for managing documentation are not very reliable. This is especially true for the XP development process, which is iterative and without a long-term vision. Wiki systems address this problem very well. Their simplicity in capturing shared notes, together with hypertext publishing, makes them attractive and compelling tools for documentation. Moreover, the configuration management process provided by wikis conforms to the Capability Maturity Model (CMM) [CMM] requirements for software development documentation. For these reasons we chose the XWiki system for managing the documentation of our ongoing projects. Furthermore, we leverage XWiki's programming capabilities to integrate all the perspectives of the software development process, so the XWiki system is used both to collaboratively edit the project documentation and to aggregate other project-related information such as source code, project and task management, mailing lists, etc.

For each ongoing project we created a corresponding space in the XWiki system. The project space has access restricted to the project team. Within this space we store the project documentation, which is edited collaboratively by the members of the project. Monitoring tools like Recent Changes, History and the Version Comparer allow the community to moderate and provide good support for traceability. Finally, detailed release documentation became essential for the configuration management process; change requests can be quickly written up and made widely known to the team. Each project has a dashboard that integrates XWiki content with project-specific information provided by other intranet service components: the persons working on the project and the assigned tasks (provided by Savane), addresses of external partners, links to the source code (provided by Subversion), links to the code documentation (provided by Doxygen [DOXY]), and the corresponding mailing list archives (provided by Mailman). This approach improves response time because all the information is gathered in the same system in a project-oriented manner, enables reuse and transformation of important data, and optimizes the development process.

The planning for all the projects is created automatically by combining Savane task progress and OpenGroupware events. SQL requests to the Savane and OpenGroupware back ends are submitted from XWiki pages and the result data is transformed into the Gantt [GANTT] file format, which is basically an XML [XML] file. Each project manager can then view the data stored in Gantt format and has the possibility of testing, printing or exporting the generated plan. The export is done by transforming the file into SVG (Scalable Vector Graphics) [SVG] format using an XSLT [XSLT] processor, as sketched below. The mailing lists are managed by Mailman. We created several SOAP [SOAP] web services that allow direct consultation of the mailing lists (members, archives, search) from XWiki pages.
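A minimal sketch of the Gantt-to-SVG export step, using the standard JAXP transformation API; the file names and the gantt2svg.xsl stylesheet are assumptions for illustration, not artifacts described in the paper.

import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Applies an XSLT stylesheet to a Gantt XML planning file to obtain
 *  an SVG rendering of the project plan. */
public class GanttToSvg {
    public static void main(String[] args) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
                factory.newTransformer(new StreamSource(new File("gantt2svg.xsl"))); // assumed stylesheet
        transformer.transform(
                new StreamSource(new File("project-plan.xml")),   // Gantt XML built from Savane/OpenGroupware data
                new StreamResult(new File("project-plan.svg")));  // SVG output for the project manager
        System.out.println("SVG plan written to project-plan.svg");
    }
}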
Another important aspect of project development is progress monitoring, which ensures that the team is making satisfactory progress toward the project goals. There are different progress monitoring techniques; SourceForge, for example, mainly makes use of two of them: "most downloaded" and "most active". The "most active" progress monitor best serves the purposes of our projects under development. This information is provided by Subversion and added to the XWiki dashboard of the project. The project dashboard in the XWiki system also displays other monitors and metrics that combine technical requirements, resource planning and scheduling in order to quantify the accomplishments against the total amount of work. For instance, we calculate the earned-value metric [EARNED] based on past trends in order to predict future performance. From the planning stored in Savane and OpenGroupware we can estimate at any moment each person's advancement and thus the earned value per project. We can also set up milestones and view the open issues (tasks, bugs, support requests) for each developer or team, as well as the availability of each member. The testing process (test creation, conducting tests, evaluation of results) is managed by another open source system called TestLink [TEST]. Using the XWiki Groovy scripting capabilities we created a mechanism that connects a functional test failure with a bug in Savane. This facility proved to be extremely useful for the testers. In conclusion, the aggregate system based solely on open source software that we built is adequate to cope with iterative software development such as extreme programming.
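As a reference note, the paper does not spell out the exact earned-value formulas it uses; a standard formulation, which needs only the planned and completed work fractions available from Savane and OpenGroupware, is:

\[
EV = BAC \times p_{\mathrm{done}}, \qquad
PV = BAC \times p_{\mathrm{planned}}, \qquad
SPI = \frac{EV}{PV}, \qquad
SV = EV - PV
\]

where BAC is the total budgeted effort of the project, p_done the fraction of work actually completed, p_planned the fraction scheduled for completion at the reporting date, SPI the schedule performance index and SV the schedule variance.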

6. Further Development

So far, the system we built has proved to be easily adjustable and customizable to the particularities and needs of an enterprise intranet. We plan to go further and add functionality to adapt XWiki to the management of the content of our educational website. New text and multimedia content to be published or sent as a newsletter arises on a weekly basis. The process can be automated by exporting intranet information stored in XWiki into the file system; the content can then be retrieved by FTP and published on the websites. Another matter of real interest is the management of surgical video content. The institute produces approximately 500 surgical video productions per year. We developed a set of tools for managing the video content life cycle workflow (rushes, processing, validation, production, encoding, metadata). We also envisage the integration of these tools as component services in the intranet system; the workflow of videos could then be controlled using an XWiki dashboard in a manner similar to the rest of the projects. At the moment we cannot use the SSO system to securely access our intranet XWiki (only SSL is available). For that purpose we plan on integrating JGuard [JGUARD], which allows Kerberos authentication and authorization from Java applications. For faster discovery and more accurate retrieval of the enterprise knowledge base stored in XWiki, as well as easier navigation, we are going to implement a categorization mechanism based on community contribution (folksonomy tagging, tag clouds). Search based on this kind of content organization can complement the existing search mechanisms.

7. References

[IRCAD] ***: The IRCAD Home Page, http://www.ircad.fr
[EITS] ***: The EITS Home Page, http://www.eits.com
[WEBSURG] ***: Websurg, The Electronic Book of Surgery, http://www.websurg.com
[XWIKI] ***: The XWiki Project, http://www.xwiki.org
[STRUTS] ***: Apache Struts Project, http://struts.apache.org/
[HIB] ***: Hibernate, http://www.hibernate.org/
[OSC] ***: OSCache, http://www.opensymphony.com/oscache/
[RAD] ***: Radeox Wiki Render Engine, http://radeox.org
[DBCP] ***: Commons DBCP, http://jakarta.apache.org/commons/dbcp/
[SVN] ***: Subversion Project Home, http://subversion.tigris.org/
[VELO] ***: Velocity, http://jakarta.apache.org/velocity/
[GROOVY] ***: Groovy – Home, http://groovy.codehaus.org/
[SSO] ***: The Single Sign On, http://www.opengroup.org/security/sso/
[KERB] ***: Kerberos: The Network Authentication Protocol, http://web.mit.edu/kerberos
[LDAP] ***: Lightweight Directory Access Protocol (v3) Technical Specification, http://tools.ietf.org/html/rfc3377
[OGO] ***: OpenGroupware Home Page, http://www.opengroupware.org
[SYNCML] ***: Open Mobile Alliance – SyncML, http://www.openmobilealliance.org/tech/affiliates/syncml/syncmlindex.html
[DAWAN] ***: Dawan, http://open-source.dawan.fr/
[SAVANE] ***: Savane, https://gna.org/projects/savane
[XMLRPC] ***: The XMLRPC Home Page, http://www.xmlrpc.com/
[MAIL] ***: Mailman, the GNU Mailing List Manager, http://www.gnu.org/software/mailman/index.html
[CMM] ***: The CMMI Main Page, Carnegie Mellon SEI, http://www.sei.cmu.edu/cmmi
[FREE] ***: FreeMind, http://freemind.sourceforge.net/
[DOXY] ***: Doxygen, http://www.stack.nl/~dimitri/doxygen/
[GANTT] ***: The Gantt Project, http://ganttproject.sourceforge.net/
[SVG] ***: Scalable Vector Graphics, http://www.w3.org/Graphics/SVG/
[XSLT] ***: XSL Transformations (XSLT) Version 1.0, http://www.w3.org/TR/xslt
[EARNED] ***: Earned Value Management, http://evm.nasa.gov/
[TEST] ***: TestLink Homepage, http://testlink.sourceforge.net/
[LUCENE] ***: Lucene, http://lucene.apache.org/
[MGS] ***: MnoGoSearch, Web Search Engine Software, http://search.mnogo.ru/
[XML] ***: Extensible Markup Language (XML), http://www.w3.org/XML/
[JGUARD] ***: JGuard, http://jguard.xwiki.com/xwiki/bin/view/Main/WebHome
[SOAP] ***: Simple Object Access Protocol (SOAP) 1.1, http://www.w3.org/TR/soap/

Libresource, learning from customers: Stephan Bagnier (ARTENUM)

Abstract: LibreSource is a generic collaborative platform dedicated to software development (a forge) that also provides groupware facilities to manage distributed communities. Based on an innovative approach and supported by a full Java/J2EE design, LibreSource is highly modular and can easily be tailored to a large range of applications. Since the start of the project, the constraints of modern software projects have been taken into account, leading to the integration of a consistent set of tools to handle the complete development/validation/publication process. For example, LibreSource provides modern synchronizers to perform version control, bug trackers, forums and communication tools. LibreSource offers very close integration of each component, allowing a high degree of awareness for the users. Finally, Role-Based Access Control allows very fine-grained security for each resource and hosted data item. Thanks to its technological advance, LibreSource has been successfully used to support scientific and industrial projects, such as SPIS of the European Space Agency, where previous forges showed their limits and could not meet the constraints and requirements of these projects.
As an open source project, LibreSource today enjoys strong community success, with an increasing number of downloads and contributions. At the Héphaïstos conference, we will present the new release (2.0) of both the Community and Enterprise versions. As an example of interoperability, the integration of external tools such as Subversion and SSO, and future extensions towards other platforms, will be discussed in more detail. Over the past few years, the work done closely with the community and the support of industrial projects with the same tool have made LibreSource an advanced and mature solution. These results may be useful for drawing a first outline of a future open standard for forges and software factories. The presentation will focus on the lessons learned from these application cases and the reasons why LibreSource may become one of the best candidates for the next generation of forges.

Managing a Packaging Farm: Johan Euphrosine (mekensleep.com)

Presentations to be included in the proceedings V2.

Managing the Testing Process in an Open Source Environment: Xavier Drozdzynski (XDR)

Managing the testing process in an OSS environment? What for? After all, as the saying attributed to Linus Torvalds goes: "Given enough eyeballs, all bugs are shallow."

He is obviously right, but:

● OSS projects are used in real industrial projects
● OSS projects are used in customer-oriented projects
● OSS projects are used in end-user-oriented projects
● OSS projects are used in "business critical" projects

And many of these OSS projects have only a few contributors (the "open cemetery": how many hackers want to jump right into that code? Well, some, but...). So if not enough eyes want to look at the code voluntarily, we have to pay them to do it, and if we do not want to pay too much we need some order in the madness.

So we should learn from traditional environments that target ad hoc software development, and organize a software qualification project that addresses:

● Strategy / Structure / Culture
● Process / People / Tools
● Portfolio / Program / Project
● Software qualification approach

And execute the qualification using a project management approach based on risk / functional / customer / chance analysis, in three phases: Conception (launching the project: audit and diagnostic, first recommendations, action plan), Execution (carrying out the plan, supporting and assisting the team, amending the software qualification approach) and Evaluation (analysing the execution), in order to bring the change.

To help us, many qualification tools exist in the free and open source world, for instance:

● Load/performance tools: OpenSTA, Stress Test
● Network analyser: Ethereal
● Testing tools: WebInject, Jameleon, Sahi, Avignon
● Software management: Track+, OneOrZero, TestLink, QaTrack, Blitz, TestRunner, TestOpia
● Version management: CVS, Subversion, ...
● Bug tracking: bugtracker, ...
● Code analysers: windows leaks, WET, Flawfinder, BFBTester
● Data management: DbMonster
● Data generators: Yagg, RecordEditor
● Budget management: Time Accounting Management Software
● Planning management: Open Gantt
● Dashboard / reporting: Pentaho
● Document management: KnowledgeTree

An example is the management of the test process with TestOpia, "the place where tests are happy". Why TestOpia? It is based on Mozilla/Bugzilla, it is easy to integrate into many forge environments, it has reactive core developers and a large community.

Our goal is to build a service offering around TestOpia for forge users.

(More information is available in the ODF presentation annex.)

Mantis, Trac, Gforge Comparison, Integration models and the Future of Bug Tracking: Christopher Mann (Infopol)

What is a bug? Computer bugs have been around since at least 1947 (Mr. Zuse might have found some earlier, but didn't report them). What is bug-system integration? Integrating bug tracking into the development system is a way to help developers update their code without ignoring inconvenient facts found out by users or testers.

The presentation compared Infopol's experiences with Bugzilla, Mantis, GForge and Trac, and described Christopher Mann's attempt to integrate Mantis into GForge and the contribution that came out of it.

It also presented some reflections on whether the one-environment approach is necessary, and on organizational issues.

Why track bugs?

● Communication
● Knowledge management
● Versions
● Ease the end-user experience
● Accountability
● Money
● Development style

Is a new function a bug?

● Feature requests and bug tracking have many similar functions and processes.

Integration of a bug-tracking system?

● Are we talking about technical or organizational integration?
● What resources are justifiable?
● Why build your own bug-tracking software?
● Why integrate an existing bug tracker into your CDE?

Christopher Mann developed an integration of Mantis with the GForge platform, the main goal being to support more development methodologies and to provide better services to users and programmers than the current Bugzilla platform.

Open Source Software Factory – Step by Step: A Case Report: Alan Kelon Oliveira de Moraes (Centro de Informática – UFPE, Brazil)

Alan Kelon Oliveira de Moraes, Silvio Lemos Meira, Jones Oliveira de Albuquerque
Centro de Informática – Universidade Federal de Pernambuco (UFPE), Caixa Postal 7851 – 50732-970 – Recife – PE – Brazil
{akom, srlm, joa}@cin.ufpe.br

Abstract

In recent years there has been an increasing movement around open source software development, in which the team is distributed and usually uses a very lightweight process. The challenge to be overcome is how to enable software engineers to work in a distributed software factory in accordance with the best practices applied in the open source model. This work presents the steps in building a distributed software factory, together with the results gained in a real experience conceived at the Open Experience Environment (OXE) Software Factory.

1. Introduction

In the early years of computer science, software was not seen as a product, and software development was not guided by a process either [14]. Nowadays, the necessity of applying a development process is recognized, in order to obtain software with quality, to decrease costs and time [15], and to reduce the risk that huge projects involving a large number of companies end up in failure [20]. At the same time, successful open source projects encourage companies to seek the integration of this development model with their own projects [16]. The research question is: how can software engineers be enabled to work in an open source software development environment, which is usually distributed and characterized as a virtual organization [3]? On the other hand, real software organizations aim to professionalize their operations by working like a factory, where productivity is achieved through standardized methodologies and tools that provide coordination, systematization and formalization [1]. The focus of this paper is to show which steps are necessary to create an Open Source Software Factory based on Open Source Software Development (OSSD) best practices and Software Factory principles. We present a real experience conceived at the Open Experience Environment (OXE) Software Factory (http://www.oxe.org.br) [24]. The paper proceeds as follows. We start with the necessary background, defining what a Software Factory is and presenting the Open Source Software Development model. Then we present the steps to create an Open Source Software Factory based on the OSSD model, and finally we draw some conclusions and prospects.

2. Software Factories

The term software factory labels development facilities with formal approaches to software development [28]. Bemer describes a software factory as a software development facility aimed at reducing variability in productivity through standardized tools and management controls [29]. A common mistake is to equate the term software factory, as applied to software development, with the understanding of the term in other industries, where it is associated with mass-produced products, large-scale centralized operations, standardized and unskilled job tasks, standardized controls, low-skilled workers, mechanization and automation. A software factory aims to standardize good practices, gradually improve tools and techniques, and strategically integrate these efforts with rigorous employee training [28]. Indeed, the overall concept is to have a more structured approach to software development, emphasizing code reuse as well as standardization and control over tools and processes. The concept of a factory also implies a particular way of organizing work, with considerable job specialization, formalization of behavior and standardization of work processes [1]. Building a software factory means bringing together a group of factors. Harvey and colleagues [6] suggest an approach to do so: define a detailed software development process; give extensive training in the new process to staff members; keep the process specification separate from process execution; collect quantitative and qualitative data about project execution (interviews, software process assessments, process attributes for each project, configuration management system, project tracking data); and analyze the results to find opportunities for software process improvement. What is really necessary to build a software factory depends on each situation, but in general there are two essential requirements to be considered: a suitable infrastructure environment to work in and a defined software development process. The professional competence of the factory members must also be taken into account.

3. Open Source Software Development Model

The basic Open Source Software philosophy [8] [10] is extremely simple: when programmers are allowed to work freely on the source code of a program, code improvement results, because collaboration helps to correct errors and enables adaptation to different needs and hardware platforms [9]. This has indeed happened, and Open Source software is well known today for its high degree of reliability and portability [17] [18]. In this new approach, the hierarchically organized structure adopted in most productive processes is put aside in favor of a new kind of bottom-up structure, which is non-coercive and largely decentralized [7] [9] [27]. According to Fuggetta [4], many open source practices are also applicable in proprietary software development; open source just takes some practices to the extreme. His analysis, however, does not cover one of the main points in open source success: distributed development. Some questions are extremely important: what happens in the OSSD model that works so well? Why do IT corporations not have the same kind of successful results? [11]

The essence of the OSSD model is the rapid creation of solutions within an open, collaborative environment [9] [30]. Despite the many challenges associated with geographically distributed development [19], the OSSD model relies on collaboration and agile development practices. Key practices of the OSSD model include: projects start quickly; requirements are defined as soon as possible; development is distributed among contributors in a collaborative way; development cycles are short, fast and iterative; existing code from similar projects is reused; and developers move flexibly between projects. By using collaborative development processes and tools for managing project issues, communication and code, the OSSD model allows virtual teams to produce quality results. To do so, tools are needed that manage all project activity throughout the application development life cycle, support collaboration on documents and design models, track project issues, and enable structured and traceable project communication.

Some of these practices should be adopted in traditional software development. There are at least three areas in which open source practices can benefit corporate IT organizations: team communication, user involvement and staffing models [2]. Companies should adopt agile processes that have been proven on open source projects. Corporate IT teams should also evaluate the following practices that are common to the open source and agile communities: release software early and often, formally involve users in development, enable collective code ownership, practice continuous integration and apply automated testing to source code [9] [21].

4. Step by Step

There is no standard way to create a distributed software factory based on the OSSD model. We therefore discuss general issues to consider in designing this kind of software factory. The steps presented are based on the OXE Software Factory experience, an open source software factory run last year during a Software Engineering (IN953) graduate course at the Federal University of Pernambuco, Brazil [22]. IN953 exposes students to real, team-oriented development in a software development organization staffed and managed by students under the guidance of the faculty. Several students are professional developers and certified programmers, and work in industry too. These are hands-on courses that require student participation in one of the defined factories. The projects for each software factory are chosen by professors and software factory managers. The demands are characterized by Requests For Proposals (RFPs) and have one client per project. These projects are carried out in collaboration with C.E.S.A.R (http://www.cesar.org.br) and its partners, which reflects current trends in industry and motivates its professionals (who are students in the course) [23].

Step 1. Define the Factory Business Model

The factory intends to be an organization inhabited by developers engaged in the common effort to develop open source software. But which business models are suitable for this configuration and environment? What kind of services should be provided? What about real projects with real clients? The factory must define some strategic way to invite the community's developers to collaborate in its projects and, at the same time, it must consider the clients' wishes and how to manage their requirements. The OXE Software Factory's strategies include calling the community's attention to the software project as soon as possible by releasing the first prototype in the open source community, e.g. VENSSO (http://vensso.sourceforge.net), the product developed last year. Open source community developers will be engaged in the product lifecycle, in an attempt to spread coding activities. We work with the GPL license and provide additional services around our products. We aim to work with software product lines to better organize our development and to boost our productivity through reuse [25].

Step 2. Define the Factory Organization

The software factory is supposed to define the members responsible for its organizational management. OXE has Management Committee, Sales, Research and Development, Products, Finance, Component Library, Quality and Project divisions. Each one is composed of PhD and M.Sc. students from the Federal University of Pernambuco, Brazil.

Step 3. Define a Lightweight Development Process

A software factory needs a configurable software development process platform based upon proven best practices [1]. The process must be flexible and must enable what is necessary to be constructed at each stage of the project. It is necessary to think about a lightweight process that can be applied without obstructing the natural steps taken by the software factory's team and the open source community developers. OXE has defined the IXI Process, which focuses on the best practices of Software Engineering, based on the Unified Process [5] and agile practices such as Extreme Programming [21], to better coordinate the activities and to concentrate on the organization's objective: delivering the desired product to the client. The IXI Process is iterative and incremental, organized in five phases with defined milestones and objectives. Each phase is composed of activities that result in the artifacts necessary to move the process forward.

Step 4. Enable the Work in a Geographically Distributed Way

A distributed software factory is intended to work as a virtual organization. That means addressing three points: the existence of a goal shared by the team members, geographical distribution, and the use of Internet services to manage and coordinate the interdependences [3]. These factors characterize OSS communities and must be considered in open source factories as well. As factory team members are geographically distributed, the Internet is the main tool to coordinate activities and actions. That means a distributed and collaborative development environment comprising communication, management and peer review tools. Asynchronous communication through mailing lists is the main communication channel, because developers are distributed worldwide and we cannot guarantee synchronous availability. Mailing lists provide a way for project members and interested people to communicate with each other at any time. This tool allows project discussion, improves knowledge management and sharing among developers, and enables the product's self-documentation. An issue tracker helps to manage project progress, tracking many different kinds of activities in the projects, including bug reports and feature requests. The assets must be under a revision control system [26] to avoid effort duplication, to solve possible conflicts between two different versions of the same document, and to manage and control source code interdependences.

Step 5. Provide a Web Site for the Factory

The software factory is intended to provide its own web site where clients and the community may browse. The factory site must aggregate information about its business mission, development process, news, team members, projects and solutions. The factory web site is also a point to centralize definitions and coordination. The OXE Factory publishes its site at http://www.oxe.org.br. The OXE Software Factory is itself open, which means that collaborators are welcome to the factory. These collaborators are invited to contribute both to the software products and to the improvement of the IXI process.

Step 6. Provide an Exclusive Web Site for Each New Project

The factory has its own site, but for each new project another site must be built. Therefore, each new project will have its own site with its instantiated process, templates, artifacts resulting from each development phase, downloads, news and other project-specific characteristics. The aim of each project website is to allow open source community and factory members to follow the development process. One of our projects may be found at http://vensso.sourceforge.net/.

Step 7: Define Roles for Each New Software Project

For each new project, development roles are necessary, but they may change depending on the project domain or software requirements. Thus some roles may be defined according to each project, such as: project manager, software architect, software engineer, configuration engineer, test engineer, requirements analyst, database designer, and others.

Step 8: Team Members Must Work in Harmony

When working in a distributed software factory, team members are supposed to learn some relationship practices. The first is to trust other people's work: since we cannot be face to face to check our partners' progress, we must trust their reports. The second lesson is to code in a collective way, which is possible using the tools described in Step 4. That means joining efforts to obtain results with quality. OXE Software Factory members try to work in accordance with these best practices.

6. Conclusions

Corporations that allow developers to bring their open source experience to bear in the corporate environment increase both software quality and employee satisfaction. In this work, we have described an initial guide to building professional open source software factories. We have listed some steps addressing the integration of two worlds: software factories and the open source software development model. However, after following the suggestions, one of the most important things to realize is the necessity of a good, structured remote environment where team members may communicate and come up with solutions. Team leadership is usually welcome, but the most important thing is to have all the team members genuinely engaged in a common effort. We can assess the quality of our guide only by using it in real experiences, which is what the Open Experience Environment (OXE) Factory has been doing [24].

7. Acknowledgements

One important factor in the success of an open source software factory is the team members themselves. We are lucky to have a team that really works together. Our special thanks to Carlos Eduardo, Clélio Feitosa, Euclides Neto, Lucas Schimitz, and Severino Andrade, OXE members.

8. References

[1] Aaen I., Bøttcher P., Mathiassen L. (1997) "The Software Factory: Contributions and Illusions". In Proceedings of the Twentieth Information Systems Research Seminar in Scandinavia, Oslo.

[2] Barnett L. (2004) "Forrester Report: Applying Open Source Processes in Corporate Development Organizations." Available at http://vasoftware.com/

[3] Crowston K., Scozzi B. (2002) "Open Source Software Projects as Virtual Organizations: competence rallying for Software Development". In IEE Proceedings - Software, Vol. 149, No. 1, February.

[4] Fuggetta, A. (2003) “Open source software––an evaluation”. Journal of Systems and Software, Volume 66, Issue 1, Pages 77-90.

[5] Jacobson I., Booch G., Rumbaugh J. (1999) “The Unified Software Development Process ”. Addison-Wesley Professional; 1st edition.

[6] Harvey S., Herbsleb J., Mockus A., Krishnan M., Tucker G. (1999) "Making the Software Factory Work: Lessons from a Decade of Experience". Available at: www.research.avayalabs.com/user/audris/papers/factory.pdf, last access: 29/07/2005.

[7] Mockus, A., Fielding, R. T., Herbsleb, J. D. (2002) “Two case studies of open source software development: Apache and Mozilla”. ACM Transactions on Software Engineering and Methodology. Volume 11, Issue3, Pages 309-346.

[8] Perens, B. (1999) “The open source definition. in Open Sources: Voices from the Open Source Revolution” C. Dibona, S. Ockman, and M. Stone, Eds., O’Reilly, Sebastopol, Calif., 171–188.

[9] Raymond, E. S. (1999) “The Cathedral and the Bazaar”. 1st. O'Reilly & Associates, Inc.

[10] Stallman, R. (1999) “The GNU Operating System and the Free Software Movement” in Open Sources: Voices from the Open Source Revolution, C. Dibona, S. Ockman, and M. Stone, Eds., O’Reilly, Sebastopol, Calif., 53–70.

[11] VA Software (2005) Survey Report: Application Development and Open Source Process Trends, available at http://vasoftware.com/

[12] VA Software (2005) Leveraging Open Source Processes and Techniques in the Enterprise”, available at http://vasoftware.com/

[13] Sommerville, I. 1996. Software process models. ACM Computing Surveys 28(1), 269-271.

[14] P. Naur and B. Randell (Eds), Software Engineering: Report on a conference sponsored by the NATO Science Committee. Petrocelli/Charter, NY: NATO Scientific Affairs Division.

[15] Fayad, M. E. 1997. Software development process: a necessary evil. Communications of ACM, 40(9), 101-103

[16] Goth, G. 2005. Open Source Meets Venture Capital. IEEE Distributed Systems Online, 6(6), 2.

[17] Zhao, L. and Elbaum, S. (2003) Quality assurance under the Open Source development model. Journal of Systems and Software. 66 (1), 65-75.

[18] Norris, J. S., Kamp, P. (2004) Mission-Critical Development with Open Source Software: Lessons Learned. IEEE Software, 21 (1), 42-49.

[19] Mockus, A. and Herbsleb, J., Challenges of global software development. In Proceedings of IEEE METRICS (pp. 182-184).

[20] Perens, B. (2005) The Emerging Economic Paradigm of Open Source. http://perens.com/Articles/Economic.html

[21] Beck, K. 2000 Extreme Programming Explained: Embrace Change. Addison-Wesley Longman Publishing Co., Inc.

[22] Jones Albuquerque and Silvio Meira. Software Engineering in Practice: Building Software Factories. ESELAW04 - 1st Experimental Software Engineering Latin American Workshop. SBC - Brazilian Computer Society and IEEE/TCSE-Technical Council on Software Engineering. October, 18. Brasília - DF, 2004.

[23] Wiegers, K. E. 1996 Creating a Software Engineering Culture. Dorset House Publishing Co., Inc.

[24] Cavalcanti, A. P. C., Lucena, L. R., Lucena, M. J. N. R., Moraes, A. K. O. de, Fernandes, D. Y. S., Pereira, S. C., Albuquerque, J. O. and Meira, S. R. L. 2005. Towards an Open Source Software Factory. In: 2nd Experimental Software Engineering Latin American Workshop, Uberlândia, MG, 2005.

[25] Weiss, D. M. and Lai, C. T. 1999 Software Product-Line Engineering: a Family-Based Software Development Process. Addison-Wesley Longman Publishing Co., Inc.

[26] Pilato, M. 2004 Version Control with Subversion. O'Reilly & Associates, Inc.

[27] Johnson, K. A. (2001) Descriptive Process Model for Open-Source Software Development. Master Thesis, Univ. Calgary, Alberta.

[28] Cusumano, M. A. (1989) The Software Factory: A Historical Interpretation. IEEE Software. 6 (2), 23-30.

[29] Bemer, R. W. (1969) Position Papers for Panel Discussion: The Economics of Program Production. In Information Processing 68 (pp. 1626-1627). North-Holland, Amsterdam.

[30] Goldman, R. and Gabriel, R. 2004 Innovation Happens Elsewhere: How and Why a Company Should Participate in Open Source. Morgan Kaufmann Publishers Inc.

Towards a software licensing guide for Open Source Business Models: Rafael de Albuquerque Ribeiro (Federal University of Pernambuco, Brazil)

Rafael de Albuquerque Ribeiro2, Fábio Queda Bueno da Silva1, Alan Kelon de Oliveira Moraes1,2, Jones Oliveira de Albuquerque3 and Silvio Lemos Meira1,2

1Federal University of Pernambuco, 2C.E.S.A.R - Recife Center for Advanced Studies and Systems, and 3Federal Rural University of Pernambuco (UFRPE), Brazil {rafael.ribeiro, alan.kelon, silvio}@cesar.org.br, [email protected], [email protected]

Abstract. Companies that use Open Source Software in their business models are already showing up among the top hundred biggest computing companies in the world, confirming that Open Source Software is compatible with profitable business. Among the concerns still outstanding for startups is figuring out how to license their products, since the source code, considered the raw material of traditional software companies, is now almost freely available. The goal of this study is to propose a software licensing guide for the development of new businesses based on Open Source Software.

Introduction

After successful examples of international software companies based on Open Source Software, the theme started to draw even greater attention (OSBC, 2006; Goth, 2005). What really motivates someone to start a new business based on Open Source Software, and what the advantages of such a choice would be, has become an interesting topic in the software market (Rossi, 2005). Companies such as Red Hat, MySQL, Trolltech and JBoss have already presented excellent market results (MySQL, 2004; Marson, 2005). It is still worth pointing out that there are some difficulties in using Open Source Software in a commercial manner. Among those difficulties, two deserve special attention. The first - and probably the most important - is understanding the market niches where Open Source Software is a viable solution (Karels, 2003; Softex, 2005). The second difficulty - mostly affecting entrepreneurs used to the traditional software marketing model, where the main revenue source is the licensing of the software and the source code is considered secret - is understanding how to be profitable while giving away what used to be considered the raw material of the company (Goldman and Gabriel, 2004). Alongside these dilemmas, it is already known that licenses play an important role in the success of an Open Source project (Krill, 2006). This work is an initial attempt to address this issue, aiming to propose a licensing guide for startups based on products or services that are Open Source. The article is structured as follows. In the second section, the variables used for the development of the business model guide are defined. In the third section, a case study is presented to support the guide that is presented in the fourth section. Apart from presenting the proposed guide, the fourth section explains the details of each possible path that exists in the model. The last section draws out future work and trends in the open source business models field.

Licensing guide variables

In this section we define the variables that were used to build the guide proposed in the last section: location (Lins, 2005), business strategy and licensing model.

Location in the Products x Processes matrix

The first variable analysed in this study is the location of the product or service in the Products x Processes matrix (see Figure 1). This matrix was first developed in a study by Hayes and Wheelwright (Hayes, 1984) and further tailored to the software market (Lins, 2005).

Figure 1. Products x Processes matrix (vertical axis: from discontinuous to continuous development flow; horizontal axis: from low sales volume, low standardization and products developed upon request to high sales volume, high standardization and low-price products; categories placed on the matrix: Packaged Software / Components, Software Factory / Corporate Solutions, Custom Software / Consulting Services, Software as a Service)

In Figure 1, the horizontal axis of the matrix represents the sales volume of the product (or the transaction volume in the case of services): the left boundary of the matrix represents products with low sales volume and low standardization, while the right boundary represents products with high sales volume and high standardization. The production process is represented on the vertical axis of the matrix: the upper boundary represents products that have a continuous development flow, and the lower boundary products with a discontinuous development flow. In Lins' research only three categories were identified; the only change made to the matrix found in Lins' research was the inclusion of a new Software as a Service category, which represents software that is not sold but charged per usage, without providing any binary or source code at all. It is actually renting the software without providing any form of ownership. In the next paragraphs, each category is analysed.

Packaged Software and Components
Packaged products are usually commodity software; they can be components of a bigger product or software embedded in devices. DBMSs, operating systems, application servers and routers are all part of this category.

Software Factory and Corporate Solutions
Products identified as corporate products usually have the characteristics of Packaged Software but with a potential for customization.

Custom Software and Consulting
Custom Software is characterized by low sales volume and low production volume; it is usually made upon request and is highly customized for each customer. In certain cases it is even sold along with the source code.

Software as a Service
Software as a Service is a direct consequence of the shift of focus in the value chain of the computing market, as pointed out by O'Reilly (2005): "software itself is no longer the primary locus of value in the computer industry. The commoditization of software drives value to services enabled by that software" (O'Reilly, 2005, p. 468). He also adds that "New business models are required" (O'Reilly, 2005, p. 468). Those new business models are built around the easy medium that the Internet provides for offering such services.

Business Strategy
The second variable analysed in this study is the strategy used. A study from ITManagers (Koenig, 2004) classified the main strategies used with Open Source Software; those strategies are related to the sources of revenue and the incentives that companies provide to software development. The following strategies were identified:

Patronage
A strategy in which a company gives incentives for the development of an Open Source Software project. The main reasons for a company to use this strategy are usually the attempt to impose a de facto standard, or the effort to commoditize a software layer in order to spur competition and earn revenue from an upper layer above the one that was commoditized.

Optimization
The optimization strategy consists of using commoditized software on which the company can no longer make profits and making the product perform optimally on top of that commoditized software. On closed architectures this strategy is impossible, since changes to the underlying layer are not possible.
The optimization strategy is based on the “law of conservation of modularity” from Clayton Christensen (Christensen, 2004) that says that either the integrated system or the subsystem needs to be modular in order to maximize the performance of the other system.

Dual Licensing

Dual licensing allows the software to be used in an Open Source Software project without any cost, but imposes fees for usage in closed source projects.

Embedded
Because GNU/Linux is a very flexible, extensible and portable system, it can be found in more than half of the embedded systems on the market. This strategy consists of using GNU/Linux as a basis for developing embedded devices. The exchange between the company and the software community will depend heavily on the licensing used for the embedded software.

Subscription
In this strategy the user is charged for a value-added service, and the software itself is freely distributed in order to increase the installed user base.

Hosted
Service-providing companies that base their services on Open Source Software are examples of companies that use the hosted strategy. The hosted strategy is characterized by companies that offer hosted services based on Open Source Software and charge for their usage or for something related to the usage.

Licensing models

The last variable of the model is the licensing model. A software license is the legal contract that dictates the rules for the user about the usage of the software and, where permitted, its modification and redistribution. Although the great majority of users never read what is expressed in them, licenses are a very sensitive subject for any company willing to make commercial use of the software, and knowing their terms is a prerequisite for benefiting from the software. Open Source licensing is often considered radical by some; nevertheless, these licenses are built on solid legal foundations, particularly copyright law (Laurent, 2004). Among the main licenses, five deserve a closer analysis: the GPL and LGPL from the Free Software Foundation (http://www.fsf.org), the BSD-based licenses, the MPL-based licenses and the Apache License 2.0.

GPL

The General Public License (GPL) (GNU, 2006), used by the majority of GNU/Linux software, is also the best known license from the Free Software Foundation. Its main principle is that any modification made to GPL-governed software that is distributed must be made available to those who receive that software. This requirement prevents companies from hiding improvements made on top of GPL software.

LGPL

Requiring the release of source code for any software distributed for the GNU operating system would have been too high a barrier for commercial software to be written for it. To solve that issue, the Free Software Foundation developed another license, the Lesser General Public License (LGPL) (GNU, 2005). The main difference between the LGPL and the GPL is that the LGPL protects only the original code, making it impossible to close its source, while any code that only links to the original code can be licensed under any license.

BSD based

The Berkeley Software Distribution (BSD) license (OSI, 2006) is the most permissive Open Source Software license analysed in this study. It even allows the software to become part of another software product without imposing any restriction at all. For this reason, it is very common to find commercial projects based on BSD-licensed software. The only restriction, found only in the original BSD license, was that derivative work should contain an advertisement in the documentation mentioning the original authors of the BSD-licensed software.

MPL based

The Mozilla Public License (MPL) (Mozilla Project, 2006a), used for the Mozilla web browser, can be considered as half way between the GPL and the BSD license: it allows derivative works that use MPL-licensed software. The only requirement is that modifications made to the original code must be published; any other code that is not part of the original product does not need to be published (Mozilla Project, 2006b).

Apache License 2.0

The main difference between the Apache 1.0 license and version 2.0 was the inclusion of clauses covering patent aspects (OSSWatch, 2006). The rules that govern the Apache License are basically the same as those of the BSD-based licenses; it even allows code to be included in a closed source product. Table 1 compares the licenses above.

Table 1. Comparison between the analysed licenses (demands redistribution of source code).

License      Original  Derivative  Code that makes usage of it
GPL          Yes       Yes         Yes
LGPL         Yes       Yes         No
BSD          No        No          No
MPL          Yes       No          No
Apache v2.0  No        No          No

Case study

In order to validate the proposed guide, some of the best known software companies that use or develop Open Source Software were analysed. The companies were: the former JBoss (now part of Red Hat), Covalent, Wind River, MySQL AB, Cyclades and Google. Each of them is analysed in the following pages.

JBoss

JBoss (http://www.jboss.com) is best known for its Java products. The JBoss product line ranges from persistence tools for Java to a full-fledged J2EE container, the JBoss Application Server. The sources of revenue for JBoss are the consulting provided for its products, the support services and the professional certification services offered. The majority of JBoss products are licensed under the LGPL. In relation to the model, JBoss is located in Consulting Services, since it provides services in the form of both consulting and support. The strategy identified is a mix of Subscription and Consulting.

Covalent

Covalent (http://www.covalent.com) offers both services and products. The services are support and consulting for the whole product line of the Apache Foundation (http://www.apache.org). One of the products Covalent offers is the Enterprise Ready Server (ERS), a bundle with a customized Apache HTTP server and a Jakarta Tomcat server preconfigured with support for ASP, , PHP, JSP and other features, requiring almost no configuration. Covalent is placed in two locations in the matrix, Consulting Services and Packaged Software. The strategies identified from the ITManagers research were Optimization and Subscription. The license used for the ERS is a closed license, and the product is based on Apache products licensed under the Apache License.

Wind River

Wind River's (http://www.windriver.com) focus is on developing software for devices. In 2003, it announced its line of Linux solutions in response to demand from customers. The move to Linux-based products has not discontinued the development of its proprietary software, VxWorks. According to the Wind River CEO, Ken Klein, some companies still have concerns about using the GPL in their products. The products and services Wind River provides are located in the Consulting Services and Custom Software positions of the matrix. The strategies identified were Consulting, Optimization and Embedded, and the license when developing for the Linux kernel is the GPL, as imposed by the rest of the Linux kernel.

MySQL AB

The Swedish company MySQL AB (http://www.mysql.com), owner of MySQL, one of the best known Open Source DBMSs, has three revenue sources: licensing of the MySQL server, support and consulting services, and licensing of the usage of the MySQL trademark. The MySQL server uses a dual licensing scheme that results in virtually two distinct products sharing the same source code. The free version is available only for non-commercial software developed under the GPL. The reason for that licensing scheme is to increase the installed user base and serve as advertising for the product. MySQL AB is located in Consulting Services and Packaged Software inside the matrix. Two strategies could be identified, Consulting and Dual. The MySQL server licensing is based on two licenses, a traditional one and the GPL.

Google

Google (http://www.google.com), owner of the most used search engine on the Internet (Sullivan, 2006), which carries the same name as the company, runs its services on a tailored and enhanced version of Red Hat Linux. Google developed its own file system on top of the Linux kernel, the Google File System (Google, 2003), and since there is no product distribution, it does not need to release the source code for its enhancements. Google's business is located in the Software as a Service position of the matrix, the strategy identified was Hosted, and the licensing of the software used in the Google search engine implementation is partly based on GPL-licensed software, the Linux kernel.

Cyclades

Cyclades (http://www.cyclades.com) specializes in the development of network devices such as routers, keyboard-video-mouse (KVM) switches for remote management and other devices. Cyclades products are based on Linux in order to reduce production costs and make product development easier. It is worth mentioning that the software used in Cyclades products is all tailored to run on limited hardware. Since part of the software used in Cyclades products is based on GPL-licensed products, Cyclades customers can download the source code for such products from the Cyclades web site after filling in a form and verifying that they own a Cyclades product. The location in the matrix identified for Cyclades was Components, the strategy is Embedded, and the licensing is a mix of traditional and GPL, but for this study we take only the GPL into consideration. Table 2 summarizes the case study.

Table 2. Case study summary.

Company     Location                                 Strategy                             Licensing
JBoss       Consulting Services                      Consulting, Subscription             LGPL
Covalent    Consulting Services, Packaged Software   Optimization, Subscription           Traditional* (based on Apache-licensed software)
Wind River  Consulting Services, Custom Software     Consulting, Optimization, Embedded   GPL**
MySQL AB    Consulting Services, Packaged Software   Consulting, Dual                     GPL, Traditional
Google      Software as a Service                    Hosted                               GPL
Cyclades    Components                               Embedded                             GPL, Others***

*Only for ERS  **Software developed for the Linux kernel  ***Traditional licensing and other OSS licenses

Guide

Based on the case study presented on the previous section and on the characteristics of each license analysed, a licensing guide was developed to help the establishment of new business based on Open Source Software.

Products

For business models with revenue based on products, the following alignments on the Products x Processes matrix were identified: Packaged Software, Components, Software Factory, Corporate Solutions and Custom Software. For these alignments, the following strategies were identified:

Packaged Software and Components

Since there is high standardization and a high sales volume in both Packaged Software and Components, it is common to find the following strategies associated with these alignments: Dual, Embedded and Optimization. In the Dual strategy, the Open Source licensing plays an important role, since it is used both for increasing the installed user base and for stimulating the enhancement of the product through the community. The main source of revenue will be the licensing of the software for commercial usage. For the Embedded strategy, the prerequisite is a product that is a device and uses some Open Source software. It is worth mentioning that if the product is based on GPL-licensed software, the code needs to be available at least to the customers of the product. The Optimization strategy is based on offering a competitive advantage for your product when run on top of an Open Source stack; the benefit for the customer is having a better product, while for the community it is having more compatible applications.

Software Factory and Corporate Solutions

A possible strategy for this alignment is Patronage, because the products offered in these alignments are always highly customizable products that usually share their core with other products. According to West (2003), IBM found out that working with a large user-driven environment such as the Apache project is more flexible than using someone else's solution. This impression resulted from the experience IBM had with the Apache project group when it funded the porting of the Apache server to Windows NT in order to support its WebSphere Server market strategy (Moltzen, 1998). IBM's experience with the Apache foundation is an example of the Patronage strategy.

Custom Software

In the Custom Software alignment, where there is low production and sales volume but high-cost products, it is common to find the Optimization strategy, where the company builds a product on top of a commoditized Open Source layer.

Services

For service-based business models, the Consulting Services and Software as a Service alignments were identified. The next step is to analyse the strategies compatible with each alignment.

Consulting Services

The Consulting Services alignment has the characteristic of low sales volume associated with a high value per service. The Consulting and Subscription strategies are commonly associated with this alignment. In the Consulting strategy, the company usually offers a transition path from a proprietary solution to an Open Source solution; companies that offer those services usually sponsor or have agreements with projects or Open Source companies. The other strategy used for this alignment is Subscription, where the company produces a software product and releases it without charge, but instead charges for services such as updates or consultancy for improving the usage of the product.

Software as a Service

Associated with the Software as a Service alignment it is possible to identify the Hosted strategy, since there is a high volume of transactions associated with a low charge per transaction. Another characteristic of this alignment is that there is no software distribution; instead the product is offered and charged for by usage.

Licensing Guide

The last step in the development of the guide is to establish a relation between the strategies and the licensing options.

Dual

The option of the GPL in dual licensing is meant to stimulate non-commercial software to use GPL licensing as well.

Embedded

The GPL is only mandatory when developing against the Linux kernel or other GPL-licensed software; if the product is user-space software running on Linux, or in any other case, there is no obligation to use the GPL, and any less restrictive license can be freely used.

Optimization

The same rule applied to Embedded applies to the Optimization strategy: the GPL is only mandatory when linking directly to GPL-licensed code.

Patronage

The license that fits the Patronage strategy is the MPL, since it protects only the original code and allows the development of proprietary extensions. Apart from the MPL, there is the possibility of using a more restrictive license such as the LGPL or even the GPL; the only difference is that an extra contract will be needed between the company and the developers stating that the company has the right to use the code under a different license or product. An example is the scheme used for OpenOffice by Sun Microsystems, where a Joint Copyright Agreement is associated with the LGPL licensing (Vaughan-Nichols, 2005).

Consulting

There is no license associated with the Consulting strategy, since there is no direct association between the service and the code.

Subscription

The Subscription strategy is usually used in conjunction with licenses from the Free Software Foundation such as the GPL and LGPL. The reason for using such restrictive licenses is to avoid closed forks that drain energy from the project.

Hosted

There is no constraint on the license used in the Hosted strategy, but the usage of GPL code to build the software deserves special attention here, because there will be no obligation to share modifications, given that no code is distributed.

Licensing Guide Summary

The first step in finding the business model is deciding on the main source of revenue. The next step is to identify the location of the product or service on the Products x Processes matrix; after that, one of the possible strategies is picked; the last step is to pick the licensing model when there is more than one possibility. Figure 2 summarizes the guide. Figure 2. Guide for using Open Source Software.

[Figure 2 shows a decision tree from revenue source (Products / Services) through matrix alignment and strategy to licensing options: Packaged Software and Components – Dual (GPL + commercial), Embedded (dependent), Optimization (dependent); Software Factories and Corporate Solutions – Patronage (MPL, BSD, LGPL); Software as a Service – Hosted (any); Consulting Services – Consulting (any), Subscription (LGPL, GPL).]
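As a rough, non-authoritative illustration (not part of the original paper), the guide of Figure 2 can be read as a simple lookup from business strategy to candidate licensing options. The minimal sketch below encodes that reading in Python; the mapping is transcribed from the sections above, and the function and variable names are illustrative choices only.

# Illustrative sketch only: the licensing guide of Figure 2 expressed as a lookup.
# The mapping is a reading of the guide text above, not an authoritative tool.

LICENSE_GUIDE = {
    "dual": ["GPL plus a commercial licence"],
    "embedded": ["depends on the code used: GPL only if GPL code such as the Linux kernel is involved"],
    "optimization": ["depends on linking: GPL only when linking directly to GPL code"],
    "patronage": ["MPL", "LGPL or GPL with an extra copyright agreement", "BSD"],
    "consulting": ["any licence (no direct association between the service and the code)"],
    "subscription": ["GPL", "LGPL"],
    "hosted": ["any licence (no distribution, so copyleft obligations are not triggered)"],
}

def suggest_licences(strategy):
    """Return the candidate licensing options the guide suggests for a strategy."""
    return LICENSE_GUIDE.get(strategy.lower(), ["strategy not covered by the guide"])

if __name__ == "__main__":
    for s in ("Dual", "Patronage", "Hosted"):
        print(s, "->", "; ".join(suggest_licences(s)))

Running the sketch simply prints the licensing options suggested for each strategy, mirroring the last step of the guide described above.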

Conclusion

This article contributes to the development of a guide for startups, giving an overview of the aspects involved in establishing a licensing scheme and a business strategy. Further investigation, as well as widening the scope of the case study, is needed in order to improve the accuracy of this preliminary licensing guide. Another key aspect for improving this research is a deeper analysis of the business model subject, since it plays an important role in the overall scope of this research.

References

Christensen, C. (2004) Clayton Christensen on Disruptive Technologies and Open Source. Paper presented at the Open Source Business Conference 2004, San Francisco, CA. Retrieved September 15, 2006 from http://www.windley.com/archives/2004/03/16.shtml
GNU (2005) GNU Project – GNU Lesser General Public License. Retrieved September 15, 2006 from http://www.gnu.org/copyleft/lgpl.html
GNU (2006) GNU Project – GNU General Public License. Retrieved September 15, 2006 from http://www.gnu.org/copyleft/gpl.html
Goldman, R. and Gabriel, R. (2004) Innovation Happens Elsewhere: How and Why a Company Should Participate in Open Source. Morgan Kaufmann Publishers Inc.
Goth, G. (2005) Open Source Meets Venture Capital. IEEE Distributed Systems Online, 6(6), p. 2.
Hayes, R. H. and Wheelwright, S. C. (1984) Restoring our Competitive Edge: Competing Through Manufacturing. New York: Wiley.
Karels, M. J. (2003) Commercializing Open Source Software. Queue, 1(5), 46–55.
Koenig, J. (2004) Seven open source business strategies for competitive advantage. ITManagers Journal. Retrieved October 16, 2005 from http://management.itmanagersjournal.com/management/04/05/10/2052216.shtml?tid=85
Laurent, A. M. (2004) Understanding Open Source and Free Software Licensing. O'Reilly Media, Inc.
Lins, T. S. A. (2005) A relação entre a implantação da qualidade de software e o ciclo de crescimento empresarial: proposição de um modelo baseado em um estudo de casos reais (pp. 11). Centro de Informática – Universidade Federal de Pernambuco, graduation thesis.
Marson, I. (2005) JBoss hints at financial success. ZDNet UK. Retrieved October 9, 2005 from http://news.zdnet.co.uk/software/applications/0,39020384,39189951,00.htm
Mozilla Project (2006a) Mozilla Public License. Retrieved September 15, 2006 from http://www.mozilla.org/MPL/MPL-1.1.html
Mozilla Project (2006b) Mozilla Public License FAQ. Retrieved September 15, 2006 from http://www.mozilla.org/MPL/mpl-faq.html
MySQL (2004) Leading Open Source Software Companies MySQL AB, Sleepycat Software and Trolltech AS Prove Strength of Dual-License Model. Retrieved October 9, 2005 from http://www.mysql.com/news-and-events/press-release/release_2004_10.html
O'Reilly, T. (2005) The Open Source Paradigm Shift. In Perspectives on Free and Open Source Software, eds. Feller et al., MIT Press.
OSBC (2006) Open Source Business Conference. http://www.osbc.com
OSI (2006) Open Source Initiative – The BSD License. http://www.opensource.org/licenses/bsd-license.php
OSSWatch (2006) Apache License (v2) – An Overview. Retrieved September 15, 2006 from http://www.oss-watch.ac.uk/resources/apache2.xml
Rossi, C. and Bonaccorsi, A. (2005) Why profit-oriented companies enter the OS field?: intrinsic vs. extrinsic incentives. In Proceedings of the Fifth Workshop on Open Source Software Engineering (St. Louis, Missouri, May 17, 2005). 5-WOSSE. ACM Press, New York, NY, 1–5.
Vaughan-Nichols, S. (2005) Sun Changes OpenOffice.org Licensing. eWeek. Retrieved January 15, 2006 from http://www.eweek.com/article2/0,1759,1855892,00.asp
West, J. (2003) How open is open enough? Melding proprietary and open source platform strategies. Research Policy, 32(7), 1259–1285.

Round Table 1: Extending the domain of intervention of Forges: Chair Philippe Aigrain (Sopinspace), François Elie (ADULLACT)

Philippe Aigrain submitted some very interesting introductory notes.

Introduction

● Getting forges right: everybody does not do everything + collaboration is rare at the level of the most individual components (because of coordination costs). This said, software forges remain a remarkable example of a mature environment for productive collaboration. The main reasons seem to be: a clear aim + a good distribution of tasks (with roles that fit various degrees of involvement) + it's about tools.

● Extending to other creative or innovative endeavors is hard (many failure stories).

● Wiki-based encyclopedias: another success story based on similar factors (well-defined aim, ability to contribute separate bits, possibility to contribute at various levels, working model for discussing design decisions at a hierarchical level).
● Other examples: collaborative webzine publishing (SPIP, Drupal and Co.), news forges (Slashdot, but with some requirements on the number of participants).

● Some reasons for failures in generalizing:
● The “one reason”: contents that are more « personal », harder to properly articulate between the individual and the collective.
● « Media » that are less modular (for instance narrative or argumentative). Examples: Voter Y, alternative treaty design, collective book writing from scratch (in contrast to collectively revising a book).

Going around the difficulties:
● Aggregating contents produced independently (Flickr Creative Commons)
● Metadata forges (Technorati tags)
● An example from us at Sopinspace: Glinkr (www.glinkr.net)
● Combining personal expression and collective production

Some findings

The initial findings were:
• People do not spontaneously collaborate to create a large document, or an action-oriented plan; there is a need for an initial “”. In a nutshell, if somebody publishes a medium-sized document, or a well documented action plan, and has access to a significant community, then a collaboration can start and the “forge tools” become relevant. This was at first seen as a difference between “coding” and “other activities”, but if we “dig down” a little bit, it is quite similar to the situation in software creation: you cannot set up a forge environment and hope that everybody will connect to your site and start from scratch; you need a solid initial foundation.
• The “lurker/active member” ratio is probably different compared to software engineering, and a rather large “initial group” has to be “pulled together” to make a community site a success.
• Workflows have to be adapted, and integrate things like spell checking, rewriting for clarity, transcription from physical meetings, ...
• There are a couple of ethnological/sociological studies on “coder communities” that could be relevant for the general community.

Round Table 2: Getting Forges to collaborate, collaborating through forges: Chair Barbara Held, IDABC European Commission

Introduction on how, and why to get forges to collaborate, and the EU OSOR project.

Why is the Commission involved in eGovernment activities?
● The European Commission does not have a legal mandate by itself to become active in IT, issues of administration or eGovernment: these are clearly domains of the Member States.
● The Commission acts only on explicit request of the Member States in these areas.
● The justification for the Commission's activities is often deduced from core competences: e.g. completion of the single market, promotion of European competitiveness.

eGovernment Action Plan i2010
... puts a strong emphasis on "sharing" eGovernment applications and experiences – and on essential infrastructure services.

IDABC: http://europe.eu.int/idabc

Why does IDABC promote OSS in Public Administrations?
● It's our mission: the "IDABC Decision" that implemented the programme requests:
○ the dissemination of good practice in OSS
○ the support of reuse and sharing of applications
● European administrations produce many customised applications to represent their processes.
● These are similar and could often be reused in localised versions.
● Why should the tax payer pay twice for the same thing?
● The openness of the code may lead to higher quality of the software.
● Development methods and the legal framework of OSS (licences) seem to fit well (better than anything else) with the requirements of cross-border collaboration.

OSS is not the only possible way to share eGovernment applications between Member States, but it is certainly an important one.

A key part of the IDABC strategy is the OSOR program.

The OSOR – objective
To create a platform to actively support the sharing of OSS-based eGovernment applications and experiences across Europe – connecting EU services and Member States.

The OSOR – first steps
● Study Pooling Open Source Software: provides requirements, success factors
● OSS Inventory: taxonomy for applications
● Call for Tender: framework contract awarded to the Unisys Consortium
● Feasibility study has started
● Implementation of a prototype (summer 2007)
● Workshops: networking European national and regional repository projects

The OSOR – tasks
● Information platform
○ Website delivering news around OSS
○ Providing commented links to and information on other European collaboration platforms in Member States and EU institutions
● Platform for uploading and downloading
○ Registry and repository for download (a hypothetical sketch of such a registry record follows below)
○ Processes for uploading
● Platform for collaboration
○ Supporting cross-border collaboration (technically and organisationally)
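To make the registry/repository task more concrete, the sketch below shows what a single record in such a registry might contain. It is a hypothetical illustration only: the field names and the example values are assumptions for the sake of the example, not a published OSOR schema.

# Hypothetical sketch of a registry record for an application shared through a
# platform such as the OSOR. Field names and values are illustrative assumptions,
# not a published schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ApplicationRecord:
    name: str                          # project name
    description: str                   # short functional description
    licence: str                       # e.g. "GPL"
    publishing_administration: str     # administration that shares the application
    taxonomy: List[str] = field(default_factory=list)  # categories from an OSS inventory taxonomy
    download_url: str = ""             # repository location for download

# Example record (all values are made up for illustration).
record = ApplicationRecord(
    name="example-egov-app",
    description="Document workflow for a local administration",
    licence="GPL",
    publishing_administration="FR",
    taxonomy=["eGovernment", "document management"],
    download_url="https://example.org/repository/example-egov-app",
)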

The OSOR – What is needed ...

● Technical Platform
○ Website with adequate tools (for search, retrieval, collaboration, publishing, versioning etc.)
● Governance/Management
● Steering Committee
○ Clearing process
○ Common policies
● Publishing processes
○ Quality guidelines
○ Licences
● Support for collaboration
○ Networking / displaying communities
○ "Helpdesk" / consultancy

The OSOR – Open Questions ...
● Common policies
○ Areas to be covered?
○ Compliance?
● Governance/Management
○ What should be the role of the Commission?
○ What should be the role of Member States?
● Services to be provided
○ Operational support
○ What else? Compatibility tests? Certifications?
● Collaboration
○ Who should take part? Public administrations …
● Sustainability
○ How to make the OSOR independent from IDABC?

Some findings of this session:

One market place does not fit all needs, but all market places gain from being interconnected. The Héphaïstos project (linking together Public Sector Forges) generates quite a lot of interest.

Forges have a “life cycle” and need to get over a “tipping point”: if you have only one or two projects in your forge, partnering with other, larger forges is difficult; you might fear losing control.

But sharing among more mature forges is definitely possible and attractive.
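One concrete, low-cost form of such sharing is aggregating the public RSS news feeds that many forge platforms already expose. The sketch below is purely illustrative: the feed URLs are hypothetical placeholders, and it assumes the third-party feedparser library rather than any forge-specific API.

# Purely illustrative: aggregate public news feeds from partner forges, one
# simple way for forges to share project news. Feed URLs are hypothetical.
import feedparser  # third-party library: pip install feedparser

FORGE_FEEDS = [
    "https://forge.example-adullact.org/export/rss_sfnews.php",      # hypothetical URL
    "https://forge.example-ministry.example/export/rss_sfnews.php",  # hypothetical URL
]

def latest_news(feeds, limit=5):
    """Collect recent entries across several forge news feeds."""
    entries = []
    for url in feeds:
        parsed = feedparser.parse(url)
        for e in parsed.entries:
            entries.append((e.get("published", ""), e.get("title", ""), e.get("link", "")))
    # Crude ordering by the raw date string; real code would parse the dates.
    return sorted(entries, reverse=True)[:limit]

if __name__ == "__main__":
    for published, title, link in latest_news(FORGE_FEEDS):
        print(published, title, link)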

Technical barriers have to be overcome.

Sharing taxonomies, integrating bug and feature reporting systems, sharing news feeds, and of course language issues.

Round Table: Forges the Next Generation: Chair Harald Wertz (University of Paris 8), Thierry Stoehr (AFUL)

Some Conclusions: Patrick Sinz

Initial Thoughts

The conference is a success, and three findings seem obvious, or at least create a consensus among the participants. 1) It should be repeated next year; the subject matter is interesting and strategic. 2) Collaboration with teams, conferences and research centers that focus on software engineering should be strengthened, since the meeting between methods, technologies and services is a key feature of “forges” and a driver in the Free and Open Source environment. 3) The participants of the conference are keen to collaborate and exchange on subjects like interoperability, taxonomies, business models, etc.

We are at the beginning of an “era” where “forges” in various sectors and geographies will influence the way technology is built and the software market is shaped. Three “business models” will shape the future of the software industry:
– closed source is not going away “Real Soon Now”;
– Service Oriented Architectures combined with Service Oriented Business models will rise (or have already risen); Google or Mappy are typical examples of this model, one built on many “open source” components, the other on a more “closed source” environment;
– finally, Free and Open Source solutions will rule a large part of the market, and the various participants will “meet” through forges. And one market place cannot fit all: many different solutions will have to be built and will compete.

Some notes on the evolution of technologies in the forge environment.

The next generation of forges is “almost” there (until the next one). Currently, although this “hurts a little”, we have to admit that many forges are just used as “glorified FTP servers”, but as services are built upon the basic infrastructure of the forges, and the sophistication of the developers increases, this changes. First, the versioning system of the forge really becomes the versioning system of the project. Bug tracking and feature request management on the forge become an integral part of the development process. Collaborative tools like forums, wikis and instant messaging systems move from “internal tools” to “public tools”, enabling non-hierarchical teams to collaborate.

But the next generation will really accelerate the development and the quality of Open Source solutions when the full development process is supported, from modelling to test case management, and this is “just around the corner”.

Unfortunately a “white elephant” was also in the room: the sustainable funding of forges.

Why do we need a business model for Forges?

The Free and Open Source business model needs a collaboration platform.
● Free and Open Source solution users need to be able to “share the pain” of support: “if I invest the cost of distributing my software/contribution in a FOSS solution, I will win because the cost of maintenance and evolution will be shared between me and other users”.
● Free and Open Source solution providers need a “meeting point”: “If you use my software you will not be locked in, because it is out there in the open; anybody can contribute (sort of)”.
● Developers need a place to be recognized.

And an Open Source “” cannot just be managed by “setup and pray”.
● A useful forge is a critical path in the development and support process.
● Outsourcing it implies trust in its reliability.
● A useful forge delivers “trusted” software.
● Most useful forges have been and will be attacked! (warez distribution and trojan distribution)

A shared forge has two functions:
● Be a “developer magnet on the internet”
● Share the cost of maintaining a critical resource.

It needs a full set of competences and resources:
● Platform expert
● Content architect
● Security expert
● System and network support technicians
● Customer support technicians
● Redundant HW/SW architecture
● Bandwidth

The current business models are not really sustainable in the long term. The generic forges (SourceForge, BerliOS, ...) are paid for by advertisement; this makes sense for a “glorified FTP server”, but not for a professional development environment. They are also supported by “forge software vendors”, but selling many forges is partly in contradiction with having a limited number of “easy to find” shared places of collaboration. The technology/developer community driven forges (ObjectWeb, JoomlaForge, ...) are paid for by vendors, but the success of the technology does not necessarily translate into economic success for a specific vendor. Some user community driven forges (Spline, TU Berlin) are paid for by an institution; other client/business driven forges (Adullact/Héphaïstos) are paid for by a “client/business” community, but this kind of member-sponsored forge may need to implement two classes of users if it becomes too successful (since membership and use of/participation in a specific technology are not the same thing).

Conclusion: a sustainable business model for the support of collaboration in the Free and Open Source ecosystem implies a value proposition that at least part of the community recognizes as something worth some money. Helping developers to focus on their core business while providing the intermediation between them and the necessary service providers seems the way to build a real market place.

An historical parallel

Once upon a time, farmers, craftspeople, artists, etc. had to travel from their place of residence to the abode of their potential “clients”. Often the travel was dangerous: many indelicate people found it easier to attack the sellers on the road rather than to pay them. Later, some inventive feudal powers invented tolls, which was at first a less painful way to get at the money and goods of the sellers. This encouraged the sellers to find alternative roads to their customers. The more enlightened powers encouraged a service industry supporting the sellers and the clients in market towns (and taxed them). The “owners” of the market town provided security and “brand recognition”.

Some Participants' notes

Olivier Berger: As you know from previous posts, I've attended much of the recently held Hephaistos Conference and presented our experience there on developing PicoForge and using Shibboleth.

I was glad to see many interesting presentations and to talk to all these people interested in forges, either promoters of the use of forges for specific communities, or developers/maintainers of forge software. I'd like to report on some of the bits I remember, hoping that others will provide their reports too. First of all, I think the organizers put together a great conference for a first edition, and the only criticism may be that I would have preferred to be able to talk to more people from Mandriva, as the conference was held in Mandriva's building. Too bad; maybe next time. Several people were present from public administrations or institutions related to this sector, who try to promote the use of the libre software model for developments in the public administration sector: folks from the Adullact forge and Admisource, or the forge of the Junta in Extremadura, and also someone from Sweden, and even a representative of the IDABC programme.

Being able to discuss questions of interoperability was the main topic of interest for me. Some elements were shown, like the search facility between partner forges implemented in GForge and used by the GForges relating to PA projects, for instance (the Héphaïstos project at Adullact). Recent activity has taken place around this topic, like discussions in the frame of the Overcrowded initiative, and older bits like our proposals for CoopX. I think we discussed interesting ideas like also using Web 2.0 techniques in the forges, things like tagging / taxonomy for project identification / categorisation, maybe. Some presentations provided us with an interesting perspective on software development and integration based on J2EE components, and in particular on ObjectWeb's technology, like what the IRCAD did with XWiki and other tools. Also using similar technologies were the elements proposed by the LibreSource project. I think they have developed a very interesting tool, which could be a solid basis for building forges of the next generation, but whose paths to adoption by projects in the libre software communities remain to be confirmed. There would be lots of other elements to note, but I must say I don't remember that much at the present time, and I'll let others contribute in the comments :-). Last, we mentioned that we (at GET/INT) are hosting a mailing list, set up in the frame of the Overcrowded project, which is meant for improving contacts between people involved in the field of forges, like maintainers of several forge applications. At the moment there are mostly French-speaking people, but we're inviting all interested parties to join us! And maybe we can continue with discussions in English. You may contact me for more details.

And use the discussion list here: https://picolibre.int-evry.fr/wws/info/forges