
Masaryk University Faculty of Informatics

Distributed Systems: Comparison and Usage Evaluation

Bachelor Thesis

Supervisor: Mgr. Ondřej Krajíček

Michal Měrka

Brno, spring 2006

Declaration

I hereby declare that this thesis is my own original work, which I elaborated independently. All sources and literature that I used or drew upon during its elaboration are cited thoroughly, with a full reference to the relevant source.

In Brno, on 16th May 2006 ......

Acknowledgements

I would like to thank my bachelor thesis supervisor, Mgr. Ondřej Krajíček, for his advice, comments, and technical support, and Petr Mikulík for proofreading.

Abstract

This thesis describes version control systems that offer the possibility of using a decentralized repository. It shows the advantages, disadvantages, and peculiarities of particular systems and describes their special features. It also notes implementation issues and problems that can appear during the installation of selected systems. Two version control systems supporting a distributed repository were picked for detailed description: Bazaar-NG and Darcs.

Keywords: version control system, , configuration management

Contents

1 Introduction
1.1 What is a Version Control System (VCS)?
1.2 Example of work with a VCS

2 General features
2.1 CVS
2.2 Subversion
2.3 GNU Arch
2.4 Darcs
2.5 Bazaar-NG

3 Description of details
3.1 Installation notes
3.2 Subversion
3.3 Darcs
3.3.1 Darcs theory
3.4 Bazaar-NG

4 Conclusion

Chapter 1

Introduction

Version control systems are tools that help to develop software systematically. As the number of developers increases and the number of lines of source code grows, they become essential. Sometimes it is suitable to split the development process into multiple parts and develop sections of a software project separately, or to develop a particular portion of software functionality in different ways and then, after some time, pick the best one. That is where distributed source code management comes in. This work deals with distributed version control systems, with emphasis on open source solutions.

1.1 What is a Version Control System (VCS)?

The most frustrating software problems are often caused by poor configuration management. The problems are frustrating because they take time to fix, they often happen at the worst time, and they are totally unnecessary. For example, a difficult bug that was fixed at great expense suddenly reappears; a developed and tested feature is mysteriously missing; or a fully tested program suddenly doesn't work. Configuration management helps to reduce these problems by coordinating the work products of the many different people who work on a common project. Without such control, their work will often conflict, resulting in problems such as:

Simultaneous Update: When two or more programmers work separately on the same program, the last one to make the changes can easily destroy the other's work.

Shared Code: Often, when a bug is fixed in code shared by several programmers, some of them are not notified.

Common Code: In large systems, when common program functions are modified, all the users need to know. Without effective code management, there is no way to be sure of finding and alerting every user.

Versions: Most large programs are developed in evolutionary releases. With one release in customer use, another in test, and a third in development, bug fixes must be propagated between them. If found by a customer, for example, a bug should be fixed in all later versions. Similarly, if a bug is found in a development release, it should be fixed in all those prior versions that contained it. In larger systems with several simultaneous active releases and many programmers working on bug fixes and enhancements, conflicts and confusion are likely.

These problems stem from confusion and lack of control, and they can waste an enormous amount of time. The key is to have a control system that answers the following questions:

• What is my current software configuration?

• What is its status?

• How do I control changes to my configuration?

• How do I inform everyone else of my changes?

• What changes have been made to my software?

• Do anyone else's changes affect my software?

1.2 Example of work with a VCS

Here is a classic example of work with a VCS, which demonstrates customary command usage (I'm using Subversion for this example). First, a repository has to be created. It is a location where your (and probably your fellow programmers') data will be stored and downloaded from. To create a local repository, run:

$ svnadmin create /home/user/repo

To list the contents of the repository in /home/user/repo, type:

$ svn ls file:///home/user/repo

This command returned nothing, because we haven't put anything into the repository yet. So create a checkout directory:

$ mkdir chdir; cd chdir

Download the repository content to your checkout directory. This is the directory where you work with the source code etc., and as soon as you want to save your work, you send your changes to the repository.

$ svn checkout file:///home/user/repo
Checked out revision 0.
$ cd repo

Modify the content:

$ mkdir test; cat > test/foo.txt
A file to be added to the repository.
^D

Does svn know we've just added a directory and a file? (Now I mean our local checked-out copy.)

$ svn status
? repo/test

No, it doesn't. We have to tell svn that we want the new file(s) to be added.

$ svn add *
A test
A test/foo.txt
$ svn status
A test
A test/foo.txt

Now it does. Did anyone modify the repository content? It is strongly recommended to do this check before committing. Otherwise you can destroy the work of others.

$ svn update
At revision 0.

Let's send our changes to the repository (don't forget to add a message describing what we've just done):

$ svn commit -m "committing the test directory and the foo.txt file"
Adding test
Adding test/foo.txt
Transmitting file data .
Committed revision 1.

Let's check for modifications:

$ svn update
At revision 1.

View the log:

$ svn log
------------------------------------------------------------------------
r1 | user | 2006-05-12 16:51:59 +0200 (Fri, 12 May 2006) | 1 line

committing the test directory and the foo.txt file
------------------------------------------------------------------------

And finally, list the repository content:

$ svn ls -vR file:///home/user/repo/
1 user May 12 16:51 test/
1 user 38 May 12 16:51 test/foo.txt
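The reason for running an update before a commit can be illustrated with a small model. The following is only a sketch in Python, not how Subversion is implemented: ToyRepository and OutOfDateError are hypothetical names, the repository lives in memory, and staleness is checked for the whole tree rather than per file.

```python
class OutOfDateError(Exception):
    """Raised when a commit is based on an outdated revision."""


class ToyRepository:
    """Hypothetical in-memory stand-in for a central repository."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    @property
    def head(self):
        """Number of the newest revision."""
        return len(self.revisions) - 1

    def checkout(self):
        """Return (base revision, private copy of the head tree)."""
        return self.head, dict(self.revisions[-1])

    def commit(self, base, tree):
        """Store tree as a new revision, but only if base is still the head."""
        if base != self.head:
            raise OutOfDateError("run update before committing")
        self.revisions.append(dict(tree))
        return self.head


repo = ToyRepository()
base, work = repo.checkout()            # corresponds to "Checked out revision 0."
work["test/foo.txt"] = "A file to be added to the repository.\n"
new_rev = repo.commit(base, work)       # the first commit creates revision 1

# A second developer who still works from revision 0 is refused,
# so the first developer's file cannot be silently overwritten.
try:
    repo.commit(0, {"other.txt": "conflicting work\n"})
    rejected = False
except OutOfDateError:
    rejected = True
```

In this model the second commit succeeds only after the stale working copy has been refreshed with a new checkout, which plays the role of the update step in the session above.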

Chapter 2

General features

| name | status | decentralization | platforms | notes |
|---|---|---|---|---|
| CVS [3] | old | no | unix, w32 | Use Subversion instead. |
| Subversion [4] | mature | no | unix, w32 | same idea as CVS |
| GNU arch 1 (tla) [5] | mature? | lines of development can be shared among independent repositories | unix, limited support for w32 | maintained but not developed; current users are supported but new users should probably look elsewhere; complicated |
| GNU Arch 2.0 (revc) [6] | not yet mature | lines of development can be shared among independent repositories | unix, w32? | very new, but it is designed by Tom Lord (the original designer of arch), drawing on lessons from arch, , and bzr |
| ArX [7] | mature? | lines of development can be shared among independent repositories | unix, limited support for w32 | genetically related to arch; simpler |
| DARCS [8] | not yet mature | uniquely flexible branching, merging, and cherry-picking of selected patches | unix, limited support for w32 | cherry-picking is the fundamental operation on which all workflow is based; very easy to learn and easy to use |
| bazaar [9] | mature | lines of development can be shared among independent repositories | unix | a fork of arch for better user interface; a sister project to bazaar-ng |
| bazaar-ng [10] | not yet mature | lines of development can be shared among independent repositories | unix, w32 | newer; commercial background funding |

Table 2.1: Key Features Comparison

2.1 CVS

CVS is enormously popular, and it does the job. In fact, when CVS was released, it was a major innovation in software configuration management (see figure 2.1). However, CVS is now showing its age through a number of awkward limitations: changes are tracked per file instead of per change, commits aren't atomic, renaming files and directories is awkward, and its branching limitations mean that you'd better faithfully tag things or there'll be trouble later. Some of the maintainers of the original CVS have declared that the CVS code has become too crusty to maintain effectively. These problems led the main CVS developers to start over and create Subversion.
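What a non-atomic commit means in practice can be shown with a toy model. This is a sketch under stated assumptions, not the code of CVS or any other tool: the repository is a plain dictionary, and commit_per_file, commit_atomic, and the simulated failure are hypothetical.

```python
def commit_per_file(repo, changes, fail_on=None):
    """Write files one by one; a failure leaves the earlier files committed."""
    for path, text in changes.items():
        if path == fail_on:
            raise IOError("connection lost while sending " + path)
        repo[path] = text


def commit_atomic(repo, changes, fail_on=None):
    """Stage all files on a copy and publish them only if every write works."""
    staged = dict(repo)
    commit_per_file(staged, changes, fail_on)  # may raise before publishing
    repo.clear()
    repo.update(staged)


changes = {"a.c": "new a", "b.c": "new b"}

per_file_repo = {"a.c": "old a", "b.c": "old b"}
try:
    commit_per_file(per_file_repo, changes, fail_on="b.c")
except IOError:
    pass  # the repository now mixes the new a.c with the old b.c

atomic_repo = {"a.c": "old a", "b.c": "old b"}
try:
    commit_atomic(atomic_repo, changes, fail_on="b.c")
except IOError:
    pass  # the repository is unchanged
```

After the interrupted per-file commit the repository holds a combination of files that no developer ever had in a working copy, whereas the atomic variant leaves either the complete old state or the complete new one.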

Figure 2.1: CVS architecture

2.2 Subversion

Subversion (SVN) is a new system intended to be a simple replacement for CVS. Subversion is basically a re-implementation of CVS with its warts fixed, and it still works the same basic way, supporting a centralized repository (see figure 2.2). Like CVS, Subversion by itself is intended to support a centralized repository for developers and doesn't handle decentralized development well; the svk project [19] offers a possibility to extend Subversion to support decentralized development.

Figure 2.2: Subversion architecture

From a technology point of view, you can definitely argue with some of Subversion's decisions. For example, they don't handle as directly as you'd expect given their centrality to the problem. But technical advancement is not the same as utility; for many people who currently use CVS and just want an incremental improvement, Subversion is probably more or less what they were expecting and looking for.

There are weaknesses, though. For example, Subversion doesn't keep track of which patches have already been applied on a given branch, and trying to apply a patch more than once causes problems. Thus, Subversion has trouble with history-sensitive merging of branches where the branches share parts (GNU Arch doesn't have this problem, because it does track which merges have been applied).

In 2004 some were concerned about Subversion's use of Berkeley DB to store data (rather than the safer flat files), since in a few cases this can let things get stuck. In practice this doesn't seem to be so bad (in part because the data can be extracted), but certainly some are concerned. Newer versions offer a backend called fsfs, which uses flat files. The fsfs backend was created because Subversion had some problems with the DB backend in debian-installer (a fairly large repository); fsfs works without any problems in that case.

Subversion uses an old-BSD-like license that, while OSS/FS¹, is not compatible with the GPL, and that's unfortunate (GPL incompatibility can be a problem). Subversion itself can be used to maintain GPL software, or software of any other kind, without restrictions.

¹ OSS/FS stands for Open Source Software / Free Software.

Subversion depends on a large number of libraries and programs (and can be perceived as rather heavyweight), so it can currently take some effort to install; distributions will probably be quick to include it, so that problem should go away relatively soon. Version Control with Subversion [15] gives more information about it.

If you're using CVS and want a simple upgrade path to something better, Subversion appears to be the simplest approach. It works in a very similar way to CVS (in particular through a centralized repository), allowing any of the authorized developers to immediately modify a shared repository (with a record of what was done and a rollback capability). Subversion is what it intends to be: an improved CVS.
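The merge-tracking weakness mentioned above can be made concrete with a small model. The following sketch assumes a toy patch format in which a patch has an id and appends one line; merge_untracked and merge_tracked are hypothetical helpers and do not correspond to commands of Subversion or GNU Arch.

```python
def apply_patch(lines, patch):
    """Apply a toy patch that appends one line to the file."""
    return lines + [patch["adds"]]


def merge_untracked(lines, patches):
    """Apply every patch blindly, as a tool without merge history would."""
    for patch in patches:
        lines = apply_patch(lines, patch)
    return lines


def merge_tracked(lines, patches, applied):
    """Skip patches whose ids are already recorded in the applied set."""
    for patch in patches:
        if patch["id"] not in applied:
            lines = apply_patch(lines, patch)
            applied.add(patch["id"])
    return lines


fix = {"id": "patch-1", "adds": "check_input();"}

untracked = merge_untracked(["main();"], [fix])
untracked = merge_untracked(untracked, [fix])     # the fix is applied twice

applied = set()
tracked = merge_tracked(["main();"], [fix], applied)
tracked = merge_tracked(tracked, [fix], applied)  # the second merge is a no-op
```

Recording the ids of merged patches is the only difference between the two helpers, yet it is what makes repeated merges between branches that share history safe.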

2.3 GNU Arch

GNU Arch is a very interesting competitor, and it works in a completely different way from CVS and Subversion. GNU Arch is released under the GNU GPL. GNU Arch is fully decentralized, which makes it work very well for decentralized development (like the Linux kernel's development process). It has a very clever and remarkably simple approach to handling data, so it works very easily with many other tools. The smarts are in the client tools, not the server, so a simple secure FTP site or shared directory can serve as the repository, an intriguing capability for such a powerful SCM system. It has simple dependencies, so it's easy to set up too.

Figure 2.3: The GNU Arch project

Decentralized development has its strengths, particularly in allowing different people to try different approaches (e.g., independent branches and forks) independently and then bringing them together later. This ability to scale and support survival of the fittest is what makes decentralized development so important for Linux kernel maintenance. GNU Arch can also be used for centralized development, but see the discussion below about that. A number of people have also built support tools for GNU Arch. For example, tla-graph can create a graph of the patch logs in archives.

A serious weakness of GNU Arch is that it doesn't work well on Windows-based systems, and it's not clear if that will ever change. GNU Arch is a tool that's already very useful but not yet mature enough.

GNU Arch has some awkward weaknesses involving filenames. It uses unusually odd file naming conventions that cause trouble for scripts, command-line use, and many common tools. Its + prefixes cause problems with extremely common tools like vi, , and the pager more (this is especially a problem when trying to enter change log information; why choose a convention that's inconvenient for one of the world's most popular text editors?). Its = prefixes expose a bug in filename completion (this bug will eventually be fixed in bash, but buggy implementations will be around for a long time to come, because this is such a rare need and bash is the default shell for many systems). And although this is less of a problem, it stores data in an {arch} directory, but the {} characters cause problems for many shells (particularly C-like shells) because they have a special meaning (they're filename globbing characters like *). For example, in C-like shells you can't use commands like cd {arch} or vi {arch}/whatever; you must quote the directory name.

The problem isn't that filename conventions are a bad idea; most CM systems have them. The problem is that some of the conventions chosen by GNU Arch don't seem to be designed to work well with commonly used tools, and thus require many workarounds when using common tools (such as prefixing the filename with ./ or using the -- option). That's unfortunate, since GNU Arch's underlying concepts work well with other tools; if the developers had chosen better conventions, these problems would never have occurred. There are ways to override the defaults in some cases, but not in many, and tools should choose good defaults. It's too bad, because nothing in GNU Arch's fundamental design requires these particular filename conventions. In February 2004 GNU Arch couldn't handle spaces in filenames, but this significant defect has been fixed; version 1.2.1 and later support spaces in filenames.

GNU Arch gives you a lot of control using lower-level commands, but it doesn't yet automate a number of tasks that really should be automated.
Many common operations require multiple commands, when instead a single command and reasonable options should be enough for most people.

If you use a single archive for a long time in GNU Arch, it eventually accumulates a very large amount of data and becomes inconvenient to work with. GNU Arch's developer suggests dividing archives by time and including a date in the archive name. Handling this accumulation is a nuisance; this kind of manual work is exactly what an SCM should handle automatically (e.g., perhaps GNU Arch could hide, by default, branches that have been unused for more than a year). GNU Arch has nice caching facilities (both in archives and on individual workstations) which can speed up access to specific versions. However, these caches often have to be created by hand (by default the tool should create caches automatically, and remove old automatically created caches as well). GNU Arch works slowly if the {arch} directory is on an NFS volume; the tool should be able to detect slow execution.

Many GNU Arch developers seem to create a similar set of higher-level specialized scripts to automate common tasks, but that's missing the point: you shouldn't have to write scripts to make a tool automate common tasks. An SCM tool should include commands that, through automation and good defaults, do the right thing for common tasks. The good news is that the GNU Arch developers are realizing that this is a problem and are correcting it. The rm (delete) command deletes both the id and the corresponding file automatically (instead of requiring two steps); that capability was only added on February 23, 2004, though, so clearly the automation of steps has only begun. The documentation notes that automatic cache management is desirable; it just hasn't been done.

The mirroring capability is clever, but if you download from a mirror site and make a change, you can't commit the change, and the tool isn't smart enough to help automatically (even though the tool does have information on the mirror site's source). The website described a complicated workaround using undo and redo, and Jan Hudec described a simpler approach [20] (using tag, sync-tree, and set-tree-version), but the tool should be able to help commit changes even if you downloaded from a mirror site.

GNU Arch will sometimes allow dangerous or problematic operations that just shouldn't be allowed. For example, branches should be either commit-based branches (all revisions after base-0 are created by commit) or tag-based branches (all revisions are created by tag); merging commands will not work otherwise, yet the tool doesn't enforce this limitation. The tla tool doesn't check whether there are still pending rejections (.rej reject files), so operations such as commit, update, replay, or star-merge produce a scrambled work area; users make mistakes, and an SCM system should work to protect data.

The user interface also has some problems. For example, the mv and move commands do different things: mv moves both the id and the file, while move only moves the id. This user interface seems designed for confusion; why not make move and mv the same, and make mv-id the only command that manipulates only ids? Many commands are aliases, which simply makes the documentation unnecessarily complicated.

The GNU Arch documentation is scarce; that's most unfortunate, because the documentation issues can hamper early adopters who want to start using it. A careful reading of what's available online should be enough for at least basic use of GNU Arch, though.
Much of the documentation emphasizes lower-level implementation details (e.g., exactly how a command is implemented in the local filesystem) instead of emphasizing the higher-level constructs. Some parts of the documentation emphasize aliases, which is extremely distracting; if add and add-id mean the same thing, just document add (and later on, in a side note, list the aliases). In some cases the documentation needs to be updated to match what the software actually does. The online tutorial at the FSF GNU Arch website [21] is a good place to start, and the Arch Wiki [22] is an especially good place to find more detailed reference material.

In general, GNU Arch isn't currently as mature as Subversion. Its filename limitations should be fixed, and it sometimes requires users to do optimizations by hand when the tool should be doing them automatically. As noted above, its commands are sometimes on the low-level side; it can take several simple commands to set up values that should be defaults or built-in recipes/commands. But most of these problems seem to be short-term. Many of them simply reflect the fact that GNU Arch hasn't had as much time to mature as other tools like Subversion. It seems that the GNU Arch developers have emphasized simplicity, openness of design, and the ability to handle complex situations, and have so far paid less attention to ease of use, especially for simple situations. Thus, although it has the problems noted above, GNU Arch is remarkably powerful and its basic concepts are very flexible. More time, and tools that build on top of GNU Arch, can resolve these issues.

GNU Arch is also endorsed by the Free Software Foundation (FSF) and directly supported by their Savannah² system; that's certainly no guarantee of success, but endorsements like that often bring users and developers to a project, increasing its likelihood of success. GNU Arch is frankly a more interesting approach to the problem, and it has a lot of promise.

Many developers seem to like many of the ideas in GNU Arch, but not the implementation. As a result, several other projects have been started which take some of the ideas of GNU Arch but are separate projects that aim to be much more user-friendly, portable to Windows as well as Unix-like systems, and so on. SCM projects that are conceptual descendants of GNU Arch include ArX (which has poor Windows support), Bazaar (also named baz), which is essentially a friendly fork of GNU Arch meant to improve it (primarily its UI), and especially Bazaar-NG (also named bzr). The Bazaar folks are working to ensure a smooth transition to Bazaar-NG once that becomes ready.
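The workarounds mentioned earlier in this section for GNU Arch's file names, quoting the name and prefixing it with ./, can be sketched with Python's standard shlex module. The helper safe_argument and the file name +log-message are hypothetical and serve only as an illustration; shlex.quote follows POSIX shell rules, so treating its output as sufficient for C-like shells is an assumption of this sketch.

```python
import shlex


def safe_argument(filename):
    """Return a form of filename that is safe to paste into a shell command."""
    # A leading "./" keeps tools from reading "+..." or "-..." as an option.
    if filename.startswith(("+", "-")):
        filename = "./" + filename
    # shlex.quote adds single quotes when the name holds shell metacharacters.
    return shlex.quote(filename)


arch_dir = safe_argument("{arch}")        # braces are special in many shells
log_file = safe_argument("+log-message")  # hypothetical "+"-prefixed file name
```

Here arch_dir becomes the quoted string '{arch}', while log_file becomes ./+log-message, which an editor no longer mistakes for a command-line option.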

2.4 Darcs

Darcs is very interesting for its new approach. Darcs is currently more of a prototype of some very innovative ideas for SCM, and maybe a tool for smaller projects, rather than a useful tool for large projects, though it can be used for them. Darcs is written in Haskell, which is both a strength and a weakness. Haskell is a high-level functional programming language, which probably helped the developer concentrate on abstract concepts. But the Darcs developer admits that Darcs has poor performance (which would cause trouble as a project grows). In March 2004 the Darcs developer said that performance had gotten much better, so hopefully that's no longer a serious problem. However, since few developers truly understand functional programming, Darcs is less likely to attract other developers who would help extend it. It has received some contributions, but they're nothing compared to the scale of work by others on Subversion or GNU Arch.

In March 2004 the Darcs website stated that it does not have an abundance of features and that its core may still be buggy. The main developer says that the website is out of date, that the program is no longer buggy, and that it supports more than the basics (though it is still missing some features).

Darcs does have some innovative approaches, though, and perhaps Darcs will leap past everyone else, or at least some of its ideas may slip into other SCM systems. For example, Darcs can keep track of inter-patch dependencies, so that bringing in just one patch brings in only the other patches it needs, a clever capability not supported by other tools like GNU Arch. It is completely patch-oriented (see 3.3.1, Darcs patch theory, for more details) and requires user input to help characterize exactly what changed. For example, it understands a token replace patch, which makes it possible to create a patch that changes every instance of the variable oddly_named_var to better_var_name, while leaving other_oddly_named_var untouched. As the author says [23]:

"When this patch is merged with any other patch involving the oddly_named_var, that instance will also be modified to better_var_name. This is in contrast to a more conventional merging method which would not only fail to change new instances of the variable, but would also involve conflicts when merging with any patch that modifies lines containing the variable. By using additional information about the programmer's intent, darcs is thus able to make the process of changing a variable name the trivial task that it really is..."

The advantage is that merge conflicts can suddenly disappear, or at least become far less likely, because the system has more information to work with. The disadvantage is that this requires more interaction with the developer, who already has a complicated problem. Whether or not this approach will catch on remains to be seen. The systems which don't have it seem to be acceptable to most developers. But it can definitely be seen how that additional information could make an SCM system more powerful.

² Web site for development, distribution and maintenance of GNU software (http://savannah.gnu.org/).
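The effect of the token replace patch described in this section can be approximated in a few lines of Python. This is a sketch under assumptions, not Darcs' actual implementation (Darcs is written in Haskell): the function token_replace is hypothetical, and the token alphabet is fixed here to letters, digits, and underscores.

```python
import re


def token_replace(text, old, new):
    """Replace old with new only where it stands as a whole identifier."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(old) + r"(?![A-Za-z0-9_])"
    return re.sub(pattern, lambda match: new, text)


source = "oddly_named_var = other_oddly_named_var + oddly_named_var\n"
renamed = token_replace(source, "oddly_named_var", "better_var_name")

# A line that arrives later from another patch is renamed by the same rule,
# which is why such a patch can merge without a textual conflict.
merged_line = token_replace("print(oddly_named_var)\n",
                            "oddly_named_var", "better_var_name")
```

The lookbehind and lookahead assertions make sure that a longer identifier such as other_oddly_named_var, which merely contains the old name, is left untouched.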

2.5 Bazaar-NG

Bazaar-NG (also named bzr) is a new distributed SCM system that builds on the ideas of Bazaar (which extended GNU Arch), but it's essentially a new project. Here is how the Bazaar-NG developers compare their work with GNU Arch: Bazaar-NG tries to exploit some of the major innovations in GNU Arch while providing an interface that's easier to use (e.g., doing the right thing and easily supporting common operations) and making the transition to it easier, and it borrows many ideas from elsewhere.

The main developer is developing the user documentation and the code simultaneously (an approach I heartily recommend) and is emphasizing common use cases. As a result, it appears that the most common use cases will be especially easy to do, something very important in SCM systems. Writing user documentation along with the code is very productive, because if a common operation is hard to explain, that's a good signal that the tool isn't user-friendly enough. GNU Arch is an unfortunate example: it needs good documentation because some of its operations are more complicated or awkward than necessary (some would say GNU Arch has unnecessary, user-hostile complexity). The Bazaar-NG developers plan to cryptographically sign changes to counter the dangers of repository subversion (see the paper on software configuration management (SCM) security [25] for more information).

Bazaar-NG is developed in Python [16], which means it should port easily to any system. Some may be concerned that the resulting system will be too slow, but that concern isn't well-founded; portions could be rewritten for speed if that becomes a problem, though whether it will remains to be seen. Other SCM systems, such as [26], are written in Python, so this isn't a rare choice.

Bazaar-NG is far less mature than most of the other projects, so keep that in mind. But since Bazaar-NG has financial backing from the company Canonical, which commercially supports Ubuntu, it may catch up very rapidly. Its emphasis on ease of use is quite heartening.
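The idea of signing changes so that tampering with a repository can be detected may be illustrated as follows. This is a sketch only and not Bazaar-NG's design: a keyed hash (HMAC) from Python's standard library stands in for the public-key signatures a real system would more likely use, and the key, the change text, and both helper functions are hypothetical.

```python
import hashlib
import hmac


def sign_change(key, change):
    """Return a hex tag that binds the change text to the secret key."""
    return hmac.new(key, change.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_change(key, change, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign_change(key, change), tag)


key = b"hypothetical-shared-secret"
change = "--- foo.txt\n+++ foo.txt\n+a new line\n"
tag = sign_change(key, change)

genuine = verify_change(key, change, tag)
tampered = verify_change(key, change + "+malicious line\n", tag)
```

A change that was modified after signing no longer matches its tag, so a client that verifies tags before applying changes would notice a manipulated repository.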

| System | Maturity | License | Documentation | Distributed repository | Written in |
|---|---|---|---|---|---|
| CVS | Old | GPL | excellent | no | C |
| SVN | Mature | Apache/BSD-style³ | very good | no | C |
| GNU Arch | Mature? | GPL | medium | yes | C |
| Bazaar-NG | Not yet mature | GPL | good | yes | Python |
| Darcs | Not yet mature | GPL | good | yes | Haskell |

Table 2.2: Basic principles comparison (summary)

³ Open source, but GPL-incompatible.

Chapter 3

Description of details

In this chapter I describe how the installation of the selected version control systems went, as well as the details of each system: ease of installation, command set, portability, networking support, web interfaces, and available plugins and related software.

3.1 Installation notes

Part of the work on this bachelor thesis was to install selected version control systems from source on a computer running the Debian Linux operating system, preferably locally, and to try them out. Together with my thesis supervisor we picked three of them: Bazaar-NG, Darcs and Subversion. I installed the version control systems into a local directory inside my home directory (I added the --prefix=$HOME/local option when I ran the configure script).

Bazaar-NG required Python [16] 2.4. After installing it, I added the following line to the .bashrc file:

export PYTHONPATH="/home/xmerka1/local/lib/python"

This is the directory where bzrlib is located; before that I kept getting the error "Please check bzrlib is on your PYTHONPATH".
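Whether a module such as bzrlib can be found on the current search path can be checked directly from Python. The following is a sketch for a present-day Python 3 interpreter (importlib did not exist in the Python 2.4 used at the time); module_available is a hypothetical helper, and the directory in the usage note is the one from the .bashrc line above.

```python
import importlib.util
import sys


def module_available(name, extra_path=None):
    """Report whether a top-level module can be imported.

    If extra_path is given, it is put at the front of sys.path first,
    which has the same effect as listing it in PYTHONPATH.
    """
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    return importlib.util.find_spec(name) is not None


has_json = module_available("json")  # a standard library module
has_missing = module_available("hypothetical_missing_module_xyz")
```

Calling module_available("bzrlib", "/home/xmerka1/local/lib/python") would perform the same lookup that failed before PYTHONPATH was set.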

For Darcs, GHC (the Glasgow Haskell Compiler [24]) version 6.2 or later has to be installed.

3.2 Subversion

Ease of installation: A Subversion service requires installing an Apache 2 module (if one wishes to use WebDAV+DeltaV as the underlying protocol) or its own proprietary server. The client requires only the Subversion-specific logic and the Neon WebDAV library (for HTTP). Installation of the components is quite straightforward, but will require some work if Subversion does not come prepackaged for one's system.

Command set: A CVS-like command set which is easy for CVS users to get used to.

Portability: Clients and servers work well on UNIX, Windows and Mac OS X.

Networking support: The Subversion service can use either WebDAV+DeltaV (which is HTTP or HTTPS based) as its underlying protocol, or its own proprietary protocol that can be channeled over an SSH connection.

Web interfaces: ViewVC, SVN::Web, WebSVN, ViewSVN, mod_svn_view, Chora, , SVN::RaWeb::Light, SVN Browser, Insurrection and perl_svn. Besides, the Subversion Apache service provides a rudimentary web interface.

Plugins and related software

• AnkhSVN - A Subversion addin for .NET.

• CW Subversion - A VCS plugin for Metrowerks CodeWarrior.

• Eric3 - Python IDE with Subversion integration; written in PyQt, uses QScintilla editor widget.

• eSvn - cross-platform QT-based GUI frontend to Subversion.

• JSVN - A Java Subversion Client, including a plugin for IDEA.

• KDESvn - A Subversion client for KDE.

• KSvn - A Subversion client for KDE, a plugin for Konqueror.

• psvn.el - A Subversion interface for emacs.

• RapidSVN - A cross-platform GUI front-end for Subversion.

• RSVN - Python script which allows multiple repository-side operations in a single, atomic transaction.

• SCPlugin - A Subversion plugin for the Mac OS X Finder.

• sourcecross.org - Subversion SCC Provider (client plugin for many Windows IDEs)⁴.

• SmartSVN - A cross-platform GUI client for Subversion⁴.

• Subclipse - A Subversion Eclipse plugin.

• Subcommander - A cross-platform Subversion GUI client including a visual text merge tool.

• Subway - An SCC Provider for Subversion⁴.

• Supervision - A Java/Swing based visual client for Subversion, using the CLI, not native libs.

• Svn-Up - A Java client GUI for Subversion and a plugin for the IDEA IDE.

• SvnX - A Mac OS X Panther GUI client.

• SVN SCC Proxy - An SCC add-in for SVN⁴.

• TMate - A Subversion tracking, reporting and browsing plugin for IntelliJ IDEA⁴.

• TortoiseSVN - A Subversion client, implemented as a Windows shell extension.

• WLW-SVN - WebLogic Workshop (8.1.3/8.1.4) Extension for Subversion.

• WorkBench - Cross platform software development GUI built on Subver- sion written in Python.

• ZigVersion - A Subversion Interface for Mac OS X. Aims to design an interface based on the typical tasks performed by programmers.

• SVN::Notify - Provides email notification of changes to the SVN repository.

3.3 Darcs

Ease of installation: Darcs requires few external libraries.

Command set: The command set is fairly compact and the core commands are easy to understand. It follows CVS in a few places, but since the model is different, most commands are unique.

Portability: Very good. Supports many Unix variants, Mac OS X, and Windows, and is written in a portable language.

Networking support: Darcs supports getting patches over HTTP, and getting and sending patches over SSH and email.

Web interfaces: Darcsweb, darcs.cgi included in the distribution.

⁴ This is not an open source project.

Plugins and related software

• darcsreannotate - A small program that will reformat the darcs annotate to make it easier to read.

• EclipseDarcs - Darcs integration with the Eclipse IDE.

• Lispworks - Version Control for Lispworks Lisp IDE.

• PatchWorks - A (currently alpha) GUI for using Darcs on Mac OS X 10.4+.

• TortoiseDarcs - A Windows GUI for Darcs.

• Tailor - Conversion between Revision Control Systems (with rsync can be used as a backing up system).

• arch2darcs - Conversion from Arch to Darcs and mirroring of Arch branches.

• darcs_load_dirs - System to automate importing of non-versioned software into darcs over time.

• darcs2docbook - An XSLT transformation for ChangeLog -> docbook revhistory.

• darcsbuildpackage - System for maintaining Debian packages in Darcs.

• DarcsDeps - A small Perl script to resolve the dependency graph of the patches.

• darcs-patchwatch - A lambdabot plugin for posting repository changes to an IRC channel.

• Darcsweb - A web-based repo browser written in Python.

• RubyDarcs - A (currently trivial) Ruby interface to Darcs.

• darcs-server - An authenticating push / pull protocol for Darcs.

• darcspatcher - Apply Darcs patches outside a Darcs repository.

3.3.1 Darcs patch theory

Say you have the repository Stable with three patches (let's call them A, B and C) in it:

Stable: A B C

You create a development branch based on that:

Devel: A B C

You add stuff to your development branch:

Devel: A B C D E F

Now you notice that F is an important bugfix that should go into the next stable release. But if you pull it from Devel to Stable, darcs has to find out which patches it should pull together with F. Maybe F depends on some stuff you added with D, or with E? Darcs can use the patch theory to find out if a patch depends on another. In the abovementioned case, it will try to use the rules of the patch theory to commute (i.e. swap) the patch F backwards. First it tries to see if E and F can be commuted. If they can, it knows that F does not depend on E:

Devel: A B C D F E

It then tries to commute D and F to see whether F depends on D. If that works, all is well, and darcs can just pull F and be happy. If that does not work, darcs knows that F depends on D, and has to pull both patches. Since this reordering is based on a sound theory of patches, it is guaranteed that darcs will find the minimal set of patches it has to pull to satisfy the dependencies of any patch you requested, without asking you what other patches it needs.

There's a bad side, of course. The patch theory works on a purely textual level. It can only find out that two patches depend on each other if they affect the same portions of text. So darcs will detect the relationship between two patches that modify the same line of code, but it will not, for example, detect the relation between one change that adds a call to a function and another that checks the returned value of this function, unless the two patches happen to fall in the same region of text. Such semantic dependencies are beyond the automatism of Darcs. If you add a new function in one patch and call it in another, darcs won't be able to find this dependency. Darcs provides some extensions to add such explicit dependencies by hand, but to find such things automatically, it would need a parser for the programming language.
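The dependency search described above can be illustrated with a small Python sketch. This is a toy model, not Darcs's actual implementation: the Patch class, the commutes rule and the patches_to_pull function are hypothetical names introduced only for illustration. The model also assumes that a patch is fully described by the set of (file, line) positions it touches and that two patches commute exactly when these sets are disjoint. Real Darcs commutation additionally adjusts the patches themselves when they are swapped.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    name: str
    # Assumed toy model: the (file, line) positions this patch edits.
    touched: frozenset


def commutes(p, q):
    """Toy rule: two patches commute when they touch no common position."""
    return p.touched.isdisjoint(q.touched)


def patches_to_pull(history, wanted):
    """Return the names of `wanted` and the earlier patches it depends on."""
    index = next(i for i, p in enumerate(history) if p.name == wanted)
    needed = [history[index]]
    # Walk backwards through the history, as when commuting F past E and D.
    for earlier in reversed(history[:index]):
        # A patch that fails to commute with anything already needed
        # is a dependency and has to be pulled as well.
        if not all(commutes(earlier, n) for n in needed):
            needed.insert(0, earlier)
    return [p.name for p in needed]


# Patches present in Devel but not in Stable (positions are made up).
devel = [
    Patch("D", frozenset({("main.c", 10)})),
    Patch("E", frozenset({("util.c", 3)})),
    Patch("F", frozenset({("main.c", 10)})),
]
print(patches_to_pull(devel, "F"))
```

In this example F commutes with E but not with D, so D is pulled together with F. The same model also shows the limitation: if one patch adds a function in util.c and another calls it from main.c, the touched positions are disjoint, the patches commute, and the semantic dependency goes unnoticed.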

3.4 Bazaar-NG

Ease of installation Requires a Python [16] interpreter; cElementTree [17] is strongly recommended (it speeds up XML processing).

Command set Simple CVS-like syntax for common operations such as add, mv, diff, status, commit, log, merge, etc.

Portability Very good. Many UNIXes, Mac OS X, and Windows are supported.

Networking support Bazaar-NG supports the HTTP (read-only), SFTP, and FTP protocols.

Web interfaces BazaarHGWeb [18] (experimental)

Plugins and related software

• baz-import - Convert Baz 1.x to Bazaar-NG.

• bzr- - Represent changes between revisions as a single text file.

• bzr-rsync - Push and pull using rsync (useful as a backup system).

• bzr-uncommit - Uncommit an undesired revision. In Bzr 0.7+.

• bzrk - Branch Visualization.

• cia - Submit commits to CIA.

• clean-ignored - Delete all ignored files (e.g. build results).

• diffstat - Diff statistics.

• difftools - Graphical diff tools.

• email_sender - Send email after commit, e.g. to bazaar-commits mailing list.

• foreach - Iterates a command for each revno.

• gannotate - Graphical (gtk) annotate.

• graft - Graft branches onto the end of others.

• graph-ancestry - Draw diagrams of branch history

• iam - Sets committer's ID.

• kompare - Graphical diff using kompare.

• lastlog - Shows most recent commit messages.

• update-mirrors - Updates all the branches in a directory.

• push - Push a branch.

• release - Merges a local branch into a remote one.

• revstore2sql - Convert a revision-store to sql.

• revstore2weave - Convert a revision-store to a weave.

• reweave - Update history, inserting ghosts.

• service - Keep bzrlib loaded in memory.

• sftp - SFTP transport (this plugin is now part of bzr).

• shelve - Undo and restore changes.

• shortlog - Show last 10 commit logs.

• svn - Support for Subversion branches (pre-alpha).

• timecdv-weave - Time conversion of a revision-store to a weave.

• trac - Closes bugs in trac from commit messages.

• version-info - Snapshot information about a tree, suitable for builds.

• vimdiff - Display changes between revisions, using vimdiff.

• webserve - A web interface to Bazaar.

• extmerge - Calls external merge tools to help resolve conflicts.

• BzrEmacs - Provides Emacs integration with Bazaar-NG.

• PatchQueueManager - An email-based system for managing a multi-committer branch through orders given by email.

• ConfigManager - Gives developers the ability to split a tree into several branches. Multiple revision control systems are supported.

• TracBzr - Integrates bug tracking, wiki functionality and branch management in one interface.

• BugTool - Integrates with a local RT instance.

• Bugs Everywhere - A decentralized bug tracker.

• bzrk - Provides a graphical visualization of Bazaar-NG branches.

• gannotate - gtk interface for annotate.

• meld - Local repository browser and diff viewer.

Chapter 4

Conclusion

The aim of this paper was to describe the available version control systems that support development with a distributed repository, and then to install selected systems on a Linux machine, try them out, and depict them in more detail. For the depiction we chose Subversion, because it is widespread among users (even though it doesn't provide a distributed repository), and Bazaar-NG and Darcs, which both offer a distributed repository.

I recommend Bazaar-NG (bzr) rather than the other version control systems. It extends the ideas of Bazaar (baz), which is based on GNU Arch. GNU Arch is relatively mature, but its documentation is far from perfect and its user interface is unnecessarily complicated. The main goal of Bazaar-NG is to retain the functionality of GNU Arch while providing the ease of use that GNU Arch lacks.

Darcs is written in Haskell and builds on brand new foundations. It is based on patches, and its patch theory enables users to manage patches like no other version control system. On the other hand, it demands more user interaction.

Bibliography

[1] Watts Humphrey Managing the Software Process (Addison-Wesley, 1989) http://www.cmcrossroads.com/bradapp/acme/scm-defs.html

[2] Bryce Zooko Wilcox-O'Hearn Quick Reference Guide to Free Software Decentralized Revision Control Systems (http://zooko.com/revision_control_quick_ref.html)

[3] CVS http://www.cvshome.org/

[4] Subversion http://subversion.tigris.org/

[5] GNU arch - tla http://www.gnu.org/software/gnu-arch/

[6] GNU arch - revc http://www.gnuarch.org/revc/

[7] ArX http://www.nongnu.org/arx/

[8] Darcs http://www.darcs.net/

[9] Bazaar http://bazaar.canonical.com/

[10] Bazaar-NG http://bazaar-vcs.org/

[11] David A. Wheeler Comments on Open Source Software / Free Software (OSS/FS) Software Configuration Management (SCM) Systems (http://dwheeler.com/essays/scm.html)

[12] Shlomi Fish Version Control System Comparison (http://better-scm.berlios.de/comparison/comparison.html)

[13] Mikhael Goikhman Revision Control Systems (http://migo.sixbit.org/papers/Revision_Control_Systems/all-in-one.html)

[14] Eric Kow Patch theory (http://darcs.net/DarcsWiki/PatchTheory)

[15] Ben Collins-Sussman, Brian W. Fitzpatrick, C. Michael Pilato Version control with Subversion http://svnbook.red-bean.com/

[16] Python http://www.python.org/

[17] cElementTree http://effbot.org/zone/celementtree.htm

[18] BazaarHGWeb http://goffredo-baroncelli.homelinux.net/bazaar/

[19] svk http://svk.elixus.org/

[20] Jan Hudec mail to David A. Wheeler concerning GNU Arch mirroring capability http://research.warnes.net/Members/rossini/versioncontrol/tla/

[21] GNU Arch on-line tutorial http://www.gnu.org/software/gnu-arch/tutorial/

[22] Arch Wiki http://wiki.gnuarch.org/

[23] Darcs token replace patch http://www.darcs.net/manual/node2.html

[24] Glasgow Haskell Compiler http://www.haskell.org/ghc/

[25] Software Configuration Management (SCM) Security http://www.dwheeler.com/essays/scm-security.html

[26] Codeville http://www.codeville.org/
