Mobile Open Source Economic Analysis

LiMo Foundation White Paper August 2009

Executive Summary

It is possible to use quantitative techniques to examine a number of the proposed economic benefits of open source. The claimed benefits are a reduced cost of acquisition, access to innovation and a reduced cost of ownership of software technology.

The quantitative techniques we use to conduct our analysis are based on measuring source lines of code (SLOC) applied to publicly accessible open source project repositories. To aid our analysis, we have developed a command line tool to mine information on open source projects using the ohloh[1] web service.

Based on this analysis, there is a strong case for constructive engagement with open source communities where the corresponding open source software components are used within a collaboratively developed, open mobile software platform such as the LiMo Platform™. There is additionally a case for mobile software platform providers to consider using certain strategic open source projects as the basis for development of new functionality on their roadmap.

There is no proven case within this analysis for converting existing proprietary items already within a mobile software platform to open source. To conduct a cost-benefit analysis of that scenario would require examination of more factors than SLOC alone.

[1] http://www.ohloh.net

1. Introduction

The subject of open source is increasingly important in relation to mobile device platforms and in view of this, it is vital to understand the underlying economic factors driving the use of open source software in a mobile context. This paper seeks to move beyond opinion-based debate, by identifying the economic case for open mobile platforms to acknowledge and embrace their use of open source software and to actively contribute back changes to open source components modified or adapted within their platform.

This white paper attempts to quantify and corroborate the benefits of using open source software in mobile platforms in relation to key components which lie below the mobile commodity line. This line, for our purposes, lies approximately around the UI framework level of a typical mobile software stack. Components below the line are considered for this analysis to be commodity software. Above the line lies the domain of differentiation. The approach we use involves applying economic cost-benefit analysis techniques where applicable in addition to citing relevant authoritative peer-reviewed material. The following areas of claimed benefit have been analysed in relation to open source mobile software components around or below the commodity line:

• Reduced cost of software acquisition

• Access to software innovation

• Reduced cost of software ownership

The analysis of this last area involves trying to quantify the cost to a mobile platform provider of failing to engage with upstream changes.

2. Adopting open source to reduce the cost of software acquisition

2.1 The COCOMO model

The claim that adopting existing open source technology reduces the cost of software acquisition can be assessed using the COnstructive COst MOdel[2] (COCOMO) developed in 1981[3] by Dr Barry Boehm[4], Emeritus Professor of Software Engineering at USC and a leading software engineering academic. COCOMO has since evolved into an industry standard[5] with respect to software cost metrics. The model computes the cost of software development as a function of the total source lines of code (SLOC) of the corresponding components. COCOMO has been significantly refined since its inception to reflect the intervening changes in software development methodology and techniques, in particular to acknowledge more iterative approaches which better reflect modern development. The latest version of the model, COCOMO II, contains a number of further adjusting factors and according to the USC Center for Systems and Software Engineering:

“This new, improved COCOMO is now ready to assist professional software cost estimators for many years to come”[6].

The approach taken by COCOMO, in the basic form used throughout this paper, is twofold. First, a hierarchy of three different cost models (organic, semi-detached and embedded) is introduced which is designed to take into account the overhead of development depending on the type of project being analysed. Secondly, COCOMO combines the cost model with a suitable annualized engineer cost/productivity figure to yield the equivalent cost of development within a typical software engineering context. These elements combine in a set of regression functions as follows:

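$$\text{Effort Applied} = a \, (\text{KLOC})^{b} \quad \text{[man-months] [7]}$$

$$\text{Development Time} = c \, (\text{Effort Applied})^{d} \quad \text{[months]}$$

$$\text{People required} = \frac{\text{Effort Applied}}{\text{Development Time}} \quad \text{[count]}$$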

[2] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[3] Barry Boehm. Software engineering economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[4] http://sunset.usc.edu/Research_Group/barry.html
[5] See US Govt Dept of Defense SoftwareTech estimation site: https://www.thedacs.com/databases/url/key/4
[6] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[7] http://www.amazon.com/Mythical-Month-Essays-Software-Engineering/dp/0201835959

The coefficients in this function vary according to the project type thus:

Software project    a      b      c      d
Organic             2.4    1.05   2.5    0.38
Semi-detached       3.0    1.12   2.5    0.35
Embedded            3.6    1.20   2.5    0.32

(source: Software Cost Estimation With Cocomo II)

More detailed information on the COCOMO coefficients is available elsewhere[8]. For our purposes, COCOMO data can be viewed as a recognized and respectable starting point to begin an empirical examination of the potential benefits that open source offers for mobile platform providers in terms of the cost of software acquisition, access to innovation and cost of software ownership.
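The model described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used for this paper: the function name `basic_cocomo` and the 100 KLOC input are hypothetical, and the sketch assumes that the coefficient c scales the development time term, matching the published Basic COCOMO schedule form Months = 2.5 * (person-months**0.38).

```python
# Basic COCOMO coefficients (a, b, c, d) per project type, as tabulated above.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, project_type="organic"):
    """Return (effort in man-months, development time in months, people)."""
    a, b, c, d = COEFFICIENTS[project_type]
    effort = a * kloc ** b      # Effort Applied
    months = c * effort ** d    # Development Time (assumes c is the multiplier)
    people = effort / months    # People required
    return effort, months, people


if __name__ == "__main__":
    # 100 KLOC is an arbitrary illustrative size, not a figure from this paper.
    for project_type in COEFFICIENTS:
        effort, months, people = basic_cocomo(100, project_type)
        print(f"{project_type:14s} effort={effort:7.1f} man-months  "
              f"time={months:5.1f} months  people={people:5.1f}")
```

For the same code size, the estimated effort rises from the organic to the embedded model because both the multiplier a and the exponent b increase.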

2.2 The application of COCOMO to open source software

The applicability of COCOMO models to open source software was introduced in an influential and well-regarded economic analysis, “Why Open Source? Look At the Numbers!” written by D. Wheeler in 2002[9] (and updated regularly since), which remains a widely cited[10] paper on the economics of open source. The Linux Foundation commissioned some research[11] in Oct 2008 updating Wheeler’s work. In this paper we use a loaded cost of $75,000 per engineer per annum, the same figure used by the Linux Foundation when they updated Wheeler’s work. For the first calculation, the Linux Foundation researchers used the basic (i.e. “organic project”) COCOMO model applied to Fedora 9. Their choice of annualized salary figure was justified as follows:

“To calculate the costs for these distributions, a base salary was found for [programmers] from the US Bureau of Labor Statistics. According to the BLS, the average salary for a US [programmer] in July, 2008 was $75,662.08. This was the salary amount used in our SLOC Count run … the programmer making the average US salary figure of $75,662.08 is actually costing the employer $97,604.08 in compensation alone. This is just one piece of the total wrap pie.”

[8] http://www.amazon.com/Software-Cost-Estimation-Cocomo-II/dp/0130266922
[9] http://www.dwheeler.com/oss_fs_why.html
[10] For example: http://abstract.cs.washington.edu/wiki/index.php/Open_Source_and_Search
[11] http://www.linuxfoundation.org/publications/estimatinglinux.php

Combining these factors and applying them to the Fedora 9 source base, the research calculated an equivalent development cost of $10.78 billion for 204.5 million source lines of code (SLOC), or in other words approximately $53/SLOC, for its development up to the current state. Table 1 shows the COCOMO figures taken from this paper and how they were arrived at by using the coefficients for an organic project.

Total Physical Source Lines of Code (SLOC)                   204,500,946
Development Effort Estimate, Person-Years (Person-Months)    59389.53 (712674.36)
  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                            24.64 (295.68)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Total Estimated Cost to Develop                              $10,784,484,309
  (average salary = $75,662.08/year, overhead = 2.40)

Table 1: SLOC and estimated production values for Fedora 9 (source: Linux Foundation)

For the Linux kernel itself, the paper acknowledges that the “organic project” COCOMO model is not appropriate since:

“the Linux kernel code is typically more complex than an “average” application—among other things—it requires an analysis that goes beyond the basic COCOMO model. A user space application like Mozilla, for instance, is much easier to code line by line since it’s abstracted at a much higher level and has to handle far less tasks. A modern and enterprise-class kernel is asked to do a great number of extremely complex things, all at once.”

The paper moves on to indicate that an adjusted version of the organic project model is used which takes in the exponent value from the semi-detached project model instead. The result of this is an upwards revision of the equivalent cost of development of the 2.6.25 Linux kernel to $1.37 billion for 6.772 million SLOC, or $202/SLOC, for its development up to the current state. Table 2 shows the corresponding figures from the paper, which detail the use of adjusted COCOMO coefficients:

Total Physical Source Lines of Code (SLOC)                   6,772,902
Development Effort Estimate, Person-Years (Person-Months)    7557.4 (90688.77)
  (effort model Person-Months = 4.64607 * (KSLOC**1.12))
Schedule Estimate, Years (Months)                            15.95 (191.34)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)     473.96
Total Estimated Cost to Develop                              $1,372,340,206
  (average salary = $75,662.08/year, overhead = 2.40)

Table 2: SLOC and estimated production values for Linux 2.6.25 kernel (source: Linux Foundation)
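The figures in Table 2 can be approximately reproduced from the formulas quoted in the table itself. The sketch below is illustrative only: the function name `kernel_estimate` is hypothetical, and it assumes the total cost is obtained as person-years multiplied by the average salary and the overhead factor, which is consistent with the published total.

```python
SLOC = 6_772_902            # Linux 2.6.25 kernel, from Table 2
AVERAGE_SALARY = 75_662.08  # US dollars per year, from Table 2
OVERHEAD = 2.40             # overhead multiplier, from Table 2


def kernel_estimate(sloc, salary=AVERAGE_SALARY, overhead=OVERHEAD):
    """Apply the adjusted effort model and the basic schedule model of Table 2."""
    ksloc = sloc / 1000.0
    person_months = 4.64607 * ksloc ** 1.12         # adjusted effort model
    schedule_months = 2.5 * person_months ** 0.38   # basic COCOMO schedule
    developers = person_months / schedule_months    # effort / schedule
    # Assumption: cost = person-years * salary * overhead.
    cost = (person_months / 12.0) * salary * overhead
    return person_months, schedule_months, developers, cost


if __name__ == "__main__":
    pm, months, devs, cost = kernel_estimate(SLOC)
    print(f"Effort:     {pm:,.0f} person-months")
    print(f"Schedule:   {months:.1f} months")
    print(f"Developers: {devs:.0f}")
    print(f"Cost:       ${cost:,.0f} (${cost / SLOC:.0f}/SLOC)")
```

Dividing the resulting cost by the SLOC count gives the per-line figure of roughly $202 quoted in the text.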

Another way of arriving at a cost per SLOC figure would be to consider a similar mobile platform development initiative such as that of Symbian OS. In very rough terms using publicly available data, approximately 1000[12] staff amortized over some 13 years from the Psion EPOC days built what is now the modern Symbian OS. The result is of the order of 20 million lines of source according to the Symbian Foundation[13]. At an average loaded cost of $100,000 per resource, this equates to a development cost of $1300 million, which yields an equivalent figure of approximately $65/SLOC.

Interpolating between the COCOMO figures derived from the Linux Foundation and our further estimates, but with a slight bias towards the lower one as we are focusing on acquisition of primarily middleware/user-level code (albeit low-level/commodity) rather than kernel software, we arrive at an initial cost/SLOC factor of around $50/SLOC for our calculations in this paper. We believe that this figure can be reasonably applied to other mainstream open source projects of relevance to a mobile context in order to conduct a first order estimate of the cost of acquisition of their corresponding components. We will do that once we have addressed the issue of how to generate accurate information about component SLOC, which we will do in the next section.

2.3 ohloh.net open analytics web service

The www.ohloh.net service was launched in 2007 with the specific aim of providing accurate and detailed software metrics on existing open source projects derived from data mining the corresponding open source code bases. In particular, ohloh yields extensive information about the evolution of SLOC over the duration of a project’s lifetime. It is possible to do this with open source projects because this information is available in the corresponding version control system logs. The ohloh service has compiled metadata on more than 300,000 major open source projects including (among many others) GTK, GStreamer, WebKit and Android. It uses a sophisticated source code parsing engine called ohcount[14] for processing the corresponding source code available in a public repository; svn, cvs and git version control systems are all supported. The ohloh data is available through a comprehensive and well-documented free-to-use web service API[15] once a visitor signs up for a developer key. Ohloh was recently acquired by SourceForge[16].

All ohloh code metrics are accessible through a RESTful[17] web service API which returns data as XML. In order to support our research for this paper, we developed and have open sourced a command line driven Python-based tool[18] which is able to retrieve a variety of information about a particular open source project through the ohloh web service API, parse that information, and format and present it in an auto-generated Excel spreadsheet. The results retrieved from this tool form the core of the analytical data in this paper.

[12] http://www.gillamorstephens.com/content/en/item_details_core.aspx?guid=AD2DB7B8-FE01-4DF8-A75C-492163FE94FD
[13] http://blog.symbian.org/2009/07/28/oscon-impressions/
[14] http://labs.ohloh.net/ohcount
[15] https://www.ohloh.net/api/getting_started
[16] https://www.ohloh.net/announcements/sourceforge_acquires_ohloh
[17] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm

The remainder of this section examines the code analytics for four important open source projects which are used within a GNOME based mobile Linux platform such as the LiMo Foundation Platform.

2.4 GTK analysis

GTK (the GIMP Toolkit) is the core application framework used in the LiMo Foundation Platform. It is a mature project which forms the basis of the GNOME Linux Desktop UI and has had over 700 contributors working on it over more than a decade. Using the pyohloh script, a graph illustrating the evolution of GTK over the past nine years to the present day can be generated as shown in Figure 1.

Figure 1: Graph of GTK code history over time (source: www.ohloh.net)

Currently, GTK comprises some 600,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent engineering cost of $30 million to develop this technology from scratch.

Note the smooth gradient of this graph over the last decade. This is a clear characteristic of community-grown source code. It evolves gradually in line with a spiral[19] or iterative development approach. The dips and bumps noticeable on closer examination of the graph are not analysed in this paper but they would typically reflect refactoring activity[20].

[18] http://sourceforge.net/projects/pyohloh/
[19] http://www.computer.org/portal/cms_docs_computer/computer/homepage/misc/Boehm/r5061.pdf
[20] The history of the Telepathy open source project is a good example of this.

2.5 WebKit analysis

WebKit is an open source web rendering engine used within the LiMo Foundation Platform. It is a mature project with 128 contributors working on it over several years. The project was kick-started in mid-2005 by an injection of code from Apple (who in turn had bootstrapped from the Konqueror KDE Desktop Browser) and has, since then, evolved through various versions of the Mac OS X Safari browser and other projects such as Google’s Chrome. Using the pyohloh script, we were able to generate a graph illustrating the evolution of WebKit’s code base over its lifetime. This graph is displayed in Figure 2.

Figure 2: Graph of WebKit code history over time (source: www.ohloh.net)

Currently, WebKit comprises some 1.78 million SLOC. Using our $50/SLOC factor, this equates to an equivalent engineering cost of $89 million to develop this technology from scratch.

2.6 GStreamer analysis

GStreamer is a media framework for delivering video and audio and is used in the LiMo Foundation Platform. It is a mature project with over 420 contributors working on it since 2002. The code base has shipped in various mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh script, we were able to generate a graph illustrating the evolution of the GStreamer code base over the course of its existence. This graph is displayed in Figure 3.

Figure 3: Graph of GStreamer code history over time (source: www.ohloh.net)

Currently, GStreamer comprises some 911,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent engineering cost of $45.5 million to develop this technology from scratch.

2.7 BlueZ analysis

BlueZ is the standard Linux Bluetooth stack which is used as the base Bluetooth stack in the LiMo Foundation Platform. It is a mature project with 49 contributors working on it since 2002. The code base has shipped in various mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh script, we were able to generate a graph illustrating the evolution of the BlueZ code base over the course of its existence. This graph is displayed in Figure 4.

Figure 4: Graph of BlueZ code history over time (source: www.ohloh.net)

Currently, BlueZ comprises some 105,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent engineering cost of $5.25 million to develop this technology from scratch.

2.8 Acquisition benefits for a mobile platform provider

As previously indicated, the four open source components analysed in this section (GTK, WebKit, GStreamer and BlueZ) are used within the LiMo Platform. Using the figures calculated above, the combined cost of engineering functionalities implemented by these four components alone from scratch comes close to $170 million. Note this figure does not include the cost of implementing dependencies.
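The combined figure can be checked with a short calculation. The sketch below simply multiplies the rounded SLOC counts quoted in sections 2.4 to 2.7 by the $50/SLOC factor; the names `COMPONENT_SLOC` and `acquisition_costs` are illustrative and not part of the pyohloh tool.

```python
COST_PER_SLOC = 50  # US dollars, the factor derived in section 2.2

# Rounded SLOC counts quoted in sections 2.4 to 2.7.
COMPONENT_SLOC = {
    "GTK": 600_000,
    "WebKit": 1_780_000,
    "GStreamer": 911_000,
    "BlueZ": 105_000,
}


def acquisition_costs(component_sloc, cost_per_sloc=COST_PER_SLOC):
    """Return the equivalent engineering cost in dollars for each component."""
    return {name: sloc * cost_per_sloc for name, sloc in component_sloc.items()}


if __name__ == "__main__":
    costs = acquisition_costs(COMPONENT_SLOC)
    for name, cost in costs.items():
        print(f"{name:10s} ${cost / 1e6:6.2f} million")
    print(f"{'Total':10s} ${sum(costs.values()) / 1e6:6.2f} million")
```

The total comes to $169.8 million, which is the "close to $170 million" figure quoted above.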

3. Adopting open source to enable access to software innovation

The total number of open source projects being undertaken globally at present is huge[21]. However, relatively few from this vast sea of potential will be both a) active beyond a single developer and b) of direct interest to mobile device manufacturers today. Nonetheless, it is important to consider this backdrop as a source of real innovation because what may appear to be an unimportant project today may become of great significance in relation to future mobile technology in a relatively short period of time. A good example is WebKit: it has become the de facto standard web rendering engine on mobile devices within a few years of its inception. Rather than rejecting promising projects for being incomplete, significant cost savings may be possible by starting from the corresponding source base rather than beginning from scratch:

“The companies and individuals, who work on Linux-related projects, build this value profit by sharing the development burden with their peers (and sometimes competitors.) Increasingly it’s becoming clear that shouldering this research and development burden individually, as has done, is an expensive approach to building software.”[22]

There are numerous other examples that have evolved to become very important in a mobile context, from individual components (e.g. BlueZ, OpenObex, D-Bus, Telepathy/Farsight) through to entire open source platforms (e.g. Android, Maemo).

In this section we will examine the following three projects in greater detail:

• Clutter – open source, advanced UI framework being driven by Intel as a core part of their Moblin platform

• oFono – open source telephony framework being driven by Nokia and Intel

• GeoClue – open source location framework endorsed by GNOME Mobile

These projects have been chosen as purely indicative examples of innovative work that has the potential to be included as standard components in future mobile Linux devices. All these selected projects address areas of technology that are either below the mobile commodity line or are in the process of falling below it. Our analysis will focus on the development momentum behind these projects and the potential saving to be gained from using the corresponding source code as a starting point for further development. It is also worth noting that engaging constructively with a major field of innovation may result in far greater commercial return than the raw offset in engineering cost. On the other hand, engineering cost is only one consideration in a decision of this nature; cost of technology evaluation, selection and engineering learning curve are also factors which we do not take into account here.

[21] ohloh alone indexes more than 300,000 projects
[22] http://www.linuxfoundation.org/publications/estimatinglinux.php

3.1 Clutter analysis

Clutter is an open source library for creating fast, visually rich and animated user interfaces. It forms the basis of the advanced UI framework in Intel’s Moblin mobile Linux platform. It is a mature project that was started at a leading UK-based open source development house, OpenedHand[23], which has since been acquired by Intel[24]. Various blog posts by ex-OpenedHand staff suggest that significant development is being done around Clutter within Intel. Using the pyohloh script, we were able to generate a graph illustrating the evolution of Clutter’s code base over the course of its existence. This graph is displayed in Figure 5.

Figure 5: Graph of Clutter code history over time (source: www.ohloh.net)

The gradient of this graph suggests a project with significant development velocity (~35 kSLOC/year), implying that it has not been materially affected by the Intel acquisition. This rate of development constitutes a substantial capital investment on the part of Intel and Clutter is clearly a project to keep an eye on.

Currently, Clutter comprises some 86,600 SLOC. Using our $50/SLOC factor, this equates to an equivalent engineering cost of $4.33 million to develop this technology from scratch.

[23] http://www.o-hand.com
[24] http://www.linuxtoday.com/developer/2008082802735NWHWSW

3.2 oFono analysis

The oFono open source project was recently unveiled[25] as a joint collaboration between Intel and Nokia and has generated significant interest in the mobile industry. The project aims to build a world class open source telephony stack for mobile Linux devices to be used in Intel’s Moblin platform as well as Nokia’s Maemo platform. Using the pyohloh script, we were able to generate a graph illustrating the evolution of oFono’s code base over the course of its existence. This graph is displayed in Figure 6.

Figure 6: Graph of oFono code history over time (source: www.ohloh.net)

The profile of the contribution curve indicates that this project was kick-started by a flurry of coding and possibly a code contribution. Since its inception, activity has returned to a more characteristic open source development gradient. One other noteworthy point is that by examining the contributor data output by our script, we were able to confirm that a key contributor is Marcel Holtmann, who is also a lead committer to BlueZ. Information relating to top committers is highlighted in Table 3. Note that we have not refined our tool to examine commit sizes.

Contributor ID      Account Name          Contributor Name      Man months   Commits
1457041885407693    Denkenz               Denis Kenzior         3            176
1457041885371924    Marcel Holtmann       Marcel Holtmann       4            30
1457044032859329    ?                     Andrzej Zaborowski    2            20
1457041885368986    Rémi Denis-Courmont   Rémi Denis-Courmont   2            14
1457041885412800    Akiniemi              Aki Niemi             1            10

Table 3: oFono top contributors by commit (source: pyohloh)

Currently, the oFono project comprises 21,912 SLOC. Using our $50/SLOC factor, this corresponds to an equivalent engineering cost of $1.1 million to develop this technology from scratch. Clearly, in spite of the impressive commitment, oFono is at a very early stage at present judging by the evolution of the code base to date and the small number of continuously active committers.

[25] http://www.unwiredview.com/2009/05/12/oFono-nokia-intel-start-a-new-linux-project-against-android/

3.3 GeoClue analysis

The GeoClue open source project delivers a geographic information service via D-Bus to client side applications. The backend information can potentially come from a number of geo-information sources (e.g. GPS or geoIP address). The project has been used to build utilities such as the Clutter libchamplain[26] library and is a technology earmarked for future inclusion in the GNOME Mobile stack. Using the pyohloh script, we were able to generate a graph illustrating the evolution of GeoClue’s code base over the course of its existence. This graph is displayed in Figure 7.

Figure 7: Graph of GeoClue code history over time (source: www.ohloh.net)

Note that from examination of other active open source projects, a plateau in terms of code activity is typically an indication of a stalled development rather than a sign that the project is finished. It turns out from looking at the contributor data that there is only one major developer, who does not appear to be very active. This was a surprise given that GeoClue is a relatively high profile GNOME project. Nonetheless, it is valuable to learn this information.

Currently, the GeoClue project comprises 12,338 SLOC. Using our $50/SLOC factor, this equates to an equivalent cost of $0.62 million to develop this technology from scratch.

[26] http://projects.gnome.org/libchamplain/

4. Adopting open source to reduce cost of software ownership

Peer-reviewed literature exists[27] to support the claim that maintenance costs dominate software total cost of ownership (TCO) but our aim is to support this claim by looking at commit data derived from actual open source projects. In this section, we will continue the forensic analysis of the code base of two of the same open source projects we examined in section 2 using the output of our pyohloh script to obtain further information about the number of developers working on these projects, their commits over time, the proportion of changes that constitute maintenance and the corresponding proportion that could be considered as original development.

[27] http://users.jyu.fi/~koskinen/smcosts.htm

4.1 GTK analysis

In relation to the GTK code history graph highlighted in Figure 1, an important milestone of note was the release of GTK v2.12[28] in Sept 2007. Since that release, as Table 4 illustrates, GTK development has continued. For a platform that forked GTK 2.12 and chose not to update it with upstream changes, this further GTK development can be considered to constitute unleveraged potential. We can quantify the delta to yield an upper bound of the value of those subsequent upstream contributions. Note that there is no easy way to differentiate between maintenance and new features within unleveraged potential; both form part of the ‘forking tax’ the platform provider incurs by ignoring upstream.

Month        Code     Comments   Blanks   Commits   Man Months   Delta Man Months
01-09-2007   502697   96897      110101   12140     2752         22
01-10-2007   503262   96956      110247   12195     2771         19
01-11-2007   504825   97593      110449   12290     2799         28
01-12-2007   543111   103835     118349   12435     2844         45
01-01-2008   543764   104006     118521   12520     2868         24
01-02-2008   544540   104081     118681   12623     2889         21
01-03-2008   532912   101912     116092   12788     2924         35
01-04-2008   533430   101959     116160   12833     2941         17
01-05-2008   535693   102461     116699   12991     2976         35
01-06-2008   529833   102411     115387   13359     3021         45
01-07-2008   540055   103289     117049   13505     3059         38
01-08-2008   541936   103932     117410   13701     3098         39
01-09-2008   543399   104243     117692   13824     3131         33
01-10-2008   544106   104206     117846   13916     3156         25
01-11-2008   545548   104931     118125   13978     3173         17
01-12-2008   549378   106453     118903   14165     3194         21
01-01-2009   553943   107572     120014   14433     3219         25
01-02-2009   553931   107685     120011   14572     3245         26
01-03-2009   554353   107715     120101   14611     3262         17
01-04-2009   555396   107830     120303   14672     3286         24
01-05-2009   558435   108139     120894   14741     3313         27
01-06-2009   560721   108810     121376   14860     3341         28
01-07-2009   563135   109647     121930   14957     3361         20

Table 4: GTK month by month commit details since release 2.12 (source: pyohloh)

We can use the data in Table 4 to quantify this unleveraged potential in two ways. First we can look at the delta code size and associate an engineering cost to it. Secondly we can look at the time spent in terms of delta man-months.

[28] http://mail.gnome.org/archives/gtk-devel-list/2007-September/msg00052.html

In terms of delta code size, GTK’s code size has increased from 502,697 SLOC in Sept 2007 to 563,135 SLOC in July 2009. Using our $50/SLOC factor, this equates to an engineering cost of $3.02 million to develop this technology independently. This figure is likely to be on the low side because GTK was already substantially advanced in Sept 2007, so any work to enhance/modify it would be complex by nature.

In terms of delta man-months, it is worth noting that the numbers in the delta man-month column remain roughly constant, highlighting the maintenance burden. The overall man-months spent on the project between GTK 2.12 and the present went from 2752 to 3361, that is 609 man-months. Using the earlier figure of $75,000 per developer per year, this equates to $3.8 million unadjusted and $9.12 million using the COCOMO 2.4 overhead factor. Averaging the delta-SLOC estimate ($3.02 million) and the overhead-adjusted man-month estimate ($9.12 million) gives us a conservative estimate of $6 million of unleveraged potential between GTK 2.12 and the GTK 2.18 candidate. This finally puts a figure to the price of forking GTK from 2.12 and not synchronising/engaging with upstream development from that point. If a decision to synchronise is made later, there will be an additional re-engineering cost to make this happen.

If we were to move a year along the development curve, we reach GTK 2.14 released in Sept 2008[29]. Using the same approach as above, we go from 3131 to 3361, that is 230 man-months of unleveraged potential since that point to the current version of GTK at the point of writing. This corresponds to over a third of the full unleveraged potential between Sept 2007 and now, or $2.3 million.
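The two estimates and their average can be worked through as follows. This is a minimal sketch of the arithmetic in this section, with hypothetical function names; it uses the first and last rows of Table 4, the $50/SLOC factor, the $75,000 annual cost and the 2.4 overhead factor.

```python
COST_PER_SLOC = 50      # US dollars per SLOC
ANNUAL_COST = 75_000    # US dollars per developer per year
OVERHEAD = 2.4          # overhead factor used with the COCOMO figures


def sloc_estimate(start_sloc, end_sloc):
    """Cost of the code added between two snapshots."""
    return (end_sloc - start_sloc) * COST_PER_SLOC


def man_month_estimates(start_mm, end_mm):
    """Return (unadjusted, overhead-adjusted) cost of the man-months spent."""
    unadjusted = (end_mm - start_mm) * ANNUAL_COST / 12
    return unadjusted, unadjusted * OVERHEAD


def unleveraged_potential(start_sloc, end_sloc, start_mm, end_mm):
    """Average of the delta-SLOC and overhead-adjusted man-month estimates."""
    _, adjusted = man_month_estimates(start_mm, end_mm)
    return (sloc_estimate(start_sloc, end_sloc) + adjusted) / 2


if __name__ == "__main__":
    # GTK, Sept 2007 to July 2009 (first and last rows of Table 4).
    print(f"Delta SLOC estimate:  ${sloc_estimate(502_697, 563_135) / 1e6:.2f}M")
    unadjusted, adjusted = man_month_estimates(2752, 3361)
    print(f"Man-months estimate:  ${unadjusted / 1e6:.2f}M unadjusted, "
          f"${adjusted / 1e6:.2f}M adjusted")
    potential = unleveraged_potential(502_697, 563_135, 2752, 3361)
    print(f"Unleveraged potential: ${potential / 1e6:.2f}M")
```

Computed without intermediate rounding, the adjusted man-month figure is about $9.14 million rather than $9.12 million, because the text rounds the unadjusted value to $3.8 million before applying the overhead factor. The average still rounds to $6 million.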

As a final data point, we can look at the corresponding developer activity graph in Figure 8 which gives us a relatively constant maintenance load of around 25 resources per year committing to the GTK mainline over the last couple of years (equating to approx $1.8 million/year). This graph was generated from the delta man-months column in Table 4.

Figure 8: Graph of GTK developer activity over last two years (source: pyohloh)

[29] http://mail.gnome.org/archives/gtk-devel-list/2008-September/msg00024.html

4.2 WebKit analysis

The WebKit code history graph highlighted in Figure 2 clearly shows the point in mid-2005 when Apple announced the open sourcing of WebKit. Since then, the graph has illustrated the characteristic upward curve of a healthy open source project where code is being continuously evolved, enhanced and added to. In fact, WebKit is an interesting open source project in that it doesn’t operate fixed releases at all but is available as a continuously moving svn codeline. Nonetheless, if we were to take a fork at Nov 2007 when HTML5 Media support was added[30], as Table 5 illustrates, a manufacturer building on the Nov 2007 base and not updating with subsequent changes potentially missed out on 1,786,845 – 1,016,544 = 770,301 delta SLOC worth of unleveraged potential. Using our $50/SLOC factor, this equates to an equivalent cost of $38.5 million to develop this technology from scratch on top of the WebKit v1.0 source base. As with our GTK analysis above, we can look at the delta man-months too and Table 5 shows 7394 – 4109 = 3285 man-months of presumed beneficial evolution of the source base. Using the earlier figure of $75,000 per developer per year, this equates to $20.5 million unadjusted and $49.3 million using the COCOMO 2.4 overhead factor. Averaging the delta-SLOC estimate ($38.5 million) and the overhead-adjusted man-month estimate ($49.3 million) gives us a conservative estimate of $44 million of unleveraged potential between WebKit at Nov 2007 and the current WebKit head revision.

30http://webkit.org/blog/140/html5-media-support/

Month       Code     Comments  Blanks  Commits  Man Months  Delta Man Months
01-11-2007  1016544  322867    254297  28684    4109        134
01-12-2007  1044098  328069    259268  29508    4252        143
01-01-2008  1071166  340421    264312  30372    4390        138
01-02-2008  1457449  386764    302370  31175    4530        140
01-03-2008  1483810  399789    306929  32074    4667        137
01-04-2008  1506161  404098    311594  33040    4806        139
01-05-2008  1527013  407883    316609  33924    4949        143
01-06-2008  1534969  408878    317762  34698    5091        142
01-07-2008  1551059  412792    320853  35410    5235        144
01-08-2008  1566371  414854    324016  36124    5367        132
01-09-2008  1595954  420546    329897  37325    5528        161
01-10-2008  1607483  422141    332694  38364    5696        168
01-11-2008  1623679  426169    336570  39241    5857        161
01-12-2008  1642825  428862    339465  40070    6007        150
01-01-2009  1678058  440574    348052  41194    6200        193
01-02-2009  1691983  444456    351355  42064    6380        180
01-03-2009  1709246  448433    354963  43086    6573        193
01-04-2009  1731266  452352    359429  44230    6784        211
01-05-2009  1747419  455081    362749  45309    6988        204
01-06-2009  1816948  460252    371543  46488    7216        228
01-07-2009  1786845  469754    378470  47106    7394        178

Table 5: WebKit month by month commit details Nov 2007-July 2009 (source: pyohloh)
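The arithmetic behind the WebKit unleveraged potential estimate can be reproduced from the first and last rows of Table 5. The following is a minimal sketch, not part of the pyohloh tool: the function and constant names are hypothetical, and it assumes that the two results being averaged are the SLOC-based cost and the overhead-adjusted effort cost, which is the pairing that yields roughly $44 million.

```python
# Values copied from the first (Nov 2007) and last (July 2009) rows of Table 5.
SLOC_START, SLOC_END = 1_016_544, 1_786_845
MAN_MONTHS_START, MAN_MONTHS_END = 4_109, 7_394

# Cost factors quoted in the text.
COST_PER_SLOC = 50          # dollars per source line of code
COST_PER_DEV_YEAR = 75_000  # dollars per developer per year
OVERHEAD_FACTOR = 2.4       # COCOMO overhead factor used in the paper


def unleveraged_potential():
    """Return the intermediate and final estimates, in dollars."""
    delta_sloc = SLOC_END - SLOC_START
    sloc_cost = delta_sloc * COST_PER_SLOC

    delta_man_months = MAN_MONTHS_END - MAN_MONTHS_START
    effort_cost = delta_man_months / 12 * COST_PER_DEV_YEAR
    effort_cost_adjusted = effort_cost * OVERHEAD_FACTOR

    # Assumption: the average is taken between the SLOC-based figure
    # and the overhead-adjusted effort figure.
    average = (sloc_cost + effort_cost_adjusted) / 2

    return {
        "delta_sloc": delta_sloc,
        "sloc_cost": sloc_cost,
        "delta_man_months": delta_man_months,
        "effort_cost": effort_cost,
        "effort_cost_adjusted": effort_cost_adjusted,
        "average": average,
    }


for name, value in unleveraged_potential().items():
    print(f"{name}: {value:,.0f}")
```

Changing the four row constants to another pair of dates from Table 5 gives the estimate for a fork taken at a different point.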

The unleveraged potential figures are so huge for WebKit that it is clearly very important for an OEM to have a maintenance strategy in place up front if they want to include WebKit in their product. This is clearly visible by looking at the corresponding developer activity graph shown in Figure 9 showing an amortized maintenance load of approximately 200 engineers per year (equating to approximately $15 million/year). This graph was generated from the delta man-months column in Table 5. Note that the curve is on an upward gradient demonstrating that WebKit is gaining developer traction.

Figure 9: Graph of WebKit developer activity over last two years (source: pyohloh)
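The maintenance load figure can be approximated directly from the delta man-months column of Table 5. The sketch below rests on two assumptions that the paper does not state: that one man-month of activity recorded in a calendar month corresponds to roughly one engineer active in that month, and that the figure of approximately 200 reflects the 2009 months rather than the whole table. The function name is hypothetical.

```python
# Delta man-months for Jan 2009 to July 2009, copied from Table 5.
DELTA_MAN_MONTHS_2009 = [193, 180, 193, 211, 204, 228, 178]

COST_PER_DEV_YEAR = 75_000  # dollars per developer per year, as used in the text


def annual_maintenance_load(monthly_deltas, cost_per_dev_year=COST_PER_DEV_YEAR):
    """Return (average active engineers, equivalent annual cost in dollars).

    Assumption: a delta of N man-months in one month is read as
    N engineers working on the codeline during that month.
    """
    avg_engineers = sum(monthly_deltas) / len(monthly_deltas)
    return avg_engineers, avg_engineers * cost_per_dev_year


engineers, cost = annual_maintenance_load(DELTA_MAN_MONTHS_2009)
print(f"average engineers: {engineers:.0f}")
print(f"annual cost: ${cost / 1e6:.1f} million")
```

Passing a longer window, such as all 21 rows of Table 5, gives a lower average, which is consistent with the upward gradient noted above.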

4.3 Maintenance of open source software and community engagement

Using the figures we have uncovered, it is possible to make some quantitatively-backed statements regarding open source software cost of ownership and the related economic benefits of engaging with corresponding open source projects and communities. We can now support the following assertions:

• Healthy open source projects have a characteristic progressive cost profile in relation to maintenance – in a sense, they’re never finished but continue evolving ‘upstream’.

• The cost of forking and losing connection with upstream development is twofold: i) the corresponding cost of presumed beneficial unleveraged potential, ii) the further cost of having to re-engineer modified forked code in the future to accommodate the inevitable eventual re-sync with upstream. We quantified the former to show that the figures run into $millions for important components such as GTK, WebKit, GStreamer and BlueZ.

• Accommodating upstream development within the context of an open mobile platform is a key way to reduce the cost of unleveraged potential.

• It is important that mobile industry platform providers engage with the open source communities as early as possible so that platform maintenance strategy is fully aligned with the upstream development agenda of these communities, which is far more cost efficient than managing the entire maintenance burden in-house.

In practical terms, a strategy of engagement is bilateral. It involves actively working patches back into community source and trying to influence the direction of the project.

Nevertheless, we have to acknowledge the reluctance on the part of some major mobile industry players to depend on an unpredictable and intangible community for a key deliverable when mission critical commercial release is at stake.

We also need to understand that the benefits of community engagement are not immediately visible or linear – engagement for purposes of strategic alignment with product development is likely to achieve measurable benefits only over the medium to longer term. It is more about investing in the relationship to gain future value. In any case, it is not possible to divest entirely of the need for engineering resource – some engineers will always be required to integrate, test and modify, but engagement does offer a mechanism for maximal gain from the community through maintenance and innovation beyond just initial acquisition. This gain is quantifiable and has an upper bound given by the unleveraged potential figures. Nokia seem to understand this and have made some endorsements to this effect:

“The one who invests most has the biggest influence. If a company has a large group of developers, it will create more and better proposals and those proposals will take the day.” 31

Nokia’s open source site, http://opensource.nokia.com, is evidence of this in operational practice. There are some 24 showcase open source projects being sponsored and linked from that site including:

• WebKit32: Port of WebKit to Nokia S60 platform

• PyS6033: Python interpreter for Nokia’s S60 platform

• Mobile Web Server34: Nokia’s port of Apache to S60 platform

31http://www.mobilemonday.net/news/nokia-finds-value-in-open-source-community
32http://opensource.nokia.com/projects/S60browser/index.html
33http://opensource.nokia.com/projects/pythonfors60/index.html
34http://opensource.nokia.com/projects/mobile-web-server/index.html

4.4 Open source maintenance in a commercial mobile context

It is possible to use the ohloh web service to compare the maintenance of commercially driven open source developments such as oFono, Android and WebKit to more community-sponsored projects such as GTK, GStreamer and BlueZ. One characteristic difference between them is that commercially developed open source is often seeded by the injection of large quantities of code into an open repository. This can be clearly seen by examining Android’s code history shown in Figure 10. Note that the corresponding git repository for this code history is git://android.git.kernel.org/platform/bionic.git which consists of some 3 million SLOC mainly injected around two points during the past year.

Figure 10: Graph of Android code history over time (source: www.ohloh.net)

Maintenance beyond the point of injection typically continues to be undertaken mainly by the commercial entity itself. This was clear from looking at the WebKit contributor details – nearly all the top 25 contributors have Apple email addresses. It is also interesting to look at the list of Android contributors. The top 8 are highlighted in Table 8 below. The top committer by far is “The Android Open Source Project” which is a vehicle for an internal Google engineering team. By doing searches, we were able to determine that the remaining individuals involved also appear to be Google employees and quite probably the key gatekeepers for Google-driven commits to the project.35

35For example, Jean-Baptiste Queru can be confirmed as a Google employee here: http://www.linkedin.com/in/jbqueru Raphael Moll likewise here: http://www.linkedin.com/pub/raphaël-moll/0/2b9/2ab Xavier Ducrohet likewise here: http://www.linkedin.com/pub/xavier-ducrohet/0/265/4b7 and so on…

Contributor ID  Account Name  Contributor Name                 Man months  Commits
41650445438243  ?             The Android Open Source Project  5           323
41650445447227  ?             Jean-Baptiste Queru              7           56
41650445476011  ?             Raphael Moll                     2           36
41650445473806  ?             Xavier Ducrohet                  2           30
41650445473002  ?             Android Code Review              0           28
41650445476007  ?             Dianne Hackborn                  3           28
41650445476012  ?             Eric Fischer                     2           22
41650445476006  ?             Jorg Pleumann                    2           21

Table 8: Android month by month commit details since release 3.22 (source: pyohloh)
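The concentration of commits among the listed contributors can be expressed as a share of the total. This is an illustrative sketch only: the function name is hypothetical, and because Table 8 lists just the top 8 contributors, the shares are relative to those eight rather than to the whole project.

```python
# Commit counts copied from Table 8 (top 8 contributors only).
COMMITS = {
    "The Android Open Source Project": 323,
    "Jean-Baptiste Queru": 56,
    "Raphael Moll": 36,
    "Xavier Ducrohet": 30,
    "Android Code Review": 28,
    "Dianne Hackborn": 28,
    "Eric Fischer": 22,
    "Jorg Pleumann": 21,
}


def commit_shares(commits):
    """Return each contributor's fraction of the commits in the table."""
    total = sum(commits.values())
    return {name: count / total for name, count in commits.items()}


shares = commit_shares(COMMITS)
# List contributors from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {share:.1%}")
```

On these figures the single "The Android Open Source Project" account holds well over half of the commits recorded for the eight listed contributors.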

The data suggests what many in the open source world already know from experience, namely that it is not easy to ‘dump’ commercially developed software as open source and expect to build a community around it quickly – the process is likely to take a very long time and requires significant efforts to align with the interests of external developers who often have different motivations for getting involved. This should not be taken to mean that all such attempts are doomed, merely that they face formidable challenges as various commentators36 have identified.

What the data we have examined suggests is that if one wishes to engage the community to assist in maintenance, it is likely to be more effective in those cases where the corresponding components were community-created in the first instance as with GTK, GStreamer and BlueZ and even then, only if roadmap alignment can be achieved.

36http://mobileopportunity.blogspot.com/2009/06/symbian-evolving-toward-open.html

5. Conclusions

• Where an open mobile platform is already using key open source projects of critical importance, there is direct economic value in constructive engagement with the corresponding open source communities. The rationale is that through such engagement the platform provider can reduce the cost of acquisition of future innovation as well as reduce the cost of maintenance of that software. The latter requires the platform provider to work collaboratively with the community to align upstream developments. Good candidates for projects that fall into this category are GTK, WebKit, GStreamer and BlueZ. Note that this does not mean a full reliance on the community, which may be untenable in the context of commercial predictability requirements, but a more blended approach. The practical details of how mobile platform providers and device manufacturers can effectively engage with existing open source communities and seek to minimize the cost of ownership of open source software will be the subject of a future LiMo Foundation White Paper.

• Where a technology lies below the commodity line and is already in a mobile platform in the form of a proprietary commercial implementation, open sourcing it is unlikely in the short term to build a significant community around that code outside of the organizations that built the software in the first place. Consequently, though it may be viewed as beneficial in terms of industry leadership and reputational value, it is not necessarily economically beneficial in the short term to open source the technology. The motivation to do so will be driven by non-economic factors such as a desire to see the technology more widely adopted or used. In the event that a proprietary technology is open sourced, it is essential that the platform provider has a practical community-building strategy to follow through on the act.

• Where a technology is falling below the commodity line and is not already present to some degree in a mobile platform, the platform provider should look to adopt relevant open source projects to reduce the cost of software acquisition and offer opportunities for further scale economies through strategic alignment with other open source based industry initiatives. Good candidates for projects of this type currently include the Clutter Advanced UI framework and Telepathy IM Communications framework.

• Where a technology lies above the commodity line, open source equivalents are of less strategic value to a platform provider. This is the area of competitive differentiation for OEMs and operators where their value propositions reside and where open source software tends in general not to offer a compelling technical alternative.
