Distributed Source Code Version Control Systems: Comparison and Usage Evaluation
Masaryk University
Faculty of Informatics

Bachelor Thesis

Supervisor: Mgr. Ondřej Krajíček
Michal Měrka

Brno, spring 2006

Declaration

I hereby declare that this thesis is my own original work, which I wrote independently. All sources and literature that I used or drew on are cited with a full reference to the relevant source.

In Brno, on 16th May 2006 ............................................................

Acknowledgements

I would like to thank my bachelor thesis supervisor, Mgr. Ondřej Krajíček, for his advice, comments, and technical support, and Petr Mikulík for proofreading.

Abstract

This thesis describes version control systems that offer the possibility of using a decentralized repository. It shows the advantages, disadvantages, and peculiarities of particular systems and describes their special features. It also notes implementation details and problems that can appear during the installation of selected systems. Two version control systems supporting distributed repositories, Bazaar-NG and Darcs, were picked for detailed description.

Keywords

version control system, revision control system, software configuration management

Contents

1 Introduction
  1.1 What is a Version Control System (VCS)?
  1.2 Example of work with a VCS
2 General features
  2.1 CVS
  2.2 Subversion
  2.3 GNU Arch
  2.4 Darcs
  2.5 Bazaar-NG
3 Description of details
  3.1 Installation notes
  3.2 Subversion
  3.3 Darcs
    3.3.1 Darcs patch theory
  3.4 Bazaar-NG
4 Conclusion

Chapter 1
Introduction

Version control systems are tools that help to develop software systematically. As the number of developers increases and the amount of source code grows, they become essential.
Sometimes it is useful to split the development process into multiple parts and to develop sections of a software project separately, or to implement a particular piece of functionality in different ways and, after some time, pick the best one. That is where distributed source code management comes in. This work deals with distributed version control systems, with emphasis on open source solutions.

1.1 What is a Version Control System (VCS)?

The most frustrating software problems are often caused by poor configuration management. The problems are frustrating because they take time to fix, they often happen at the worst time, and they are totally unnecessary. For example, a difficult bug that was fixed at great expense suddenly reappears; a developed and tested feature is mysteriously missing; or a fully tested program suddenly doesn't work. Configuration management helps to reduce these problems by coordinating the work products of the many different people who work on a common project. Without such control, their work will often conflict, resulting in problems such as:

Simultaneous Update: When two or more programmers work separately on the same program, the last one to make the changes can easily destroy the other's work.

Shared Code: Often, when a bug is fixed in code shared by several programmers, some of them are not notified.

Common Code: In large systems, when common program functions are modified, all the users need to know. Without effective code management, there is no way to be sure of finding and alerting every user.

Versions: Most large programs are developed in evolutionary releases. With one release in customer use, another in test, and a third in development, bug fixes must be propagated between them. If found by a customer, for example, a bug should be fixed in all later versions. Similarly, if a bug is found in a development release, it should be fixed in all those prior versions that contained it.
In larger systems with several simultaneous active releases and many programmers working on bug fixes and enhancements, conflicts and confusion are likely.

These problems stem from confusion and lack of control, and they can waste an enormous amount of time. The key is to have a control system that answers the following questions:

• What is my current software configuration?
• What is its status?
• How do I control changes to my configuration?
• How do I inform everyone else of my changes?
• What changes have been made to my software?
• Do anyone else's changes affect my software?

1.2 Example of work with a VCS

Here is a classic example of working with a VCS, which demonstrates customary command usage (I am using Subversion for this example). First, a repository has to be created. It is a location where your (and probably your fellow programmers') data will be stored and downloaded from. To create a local repository, run:

    $ svnadmin create /home/user/repo

To list the contents of the repository at /home/user/repo, type:

    $ svn ls file:///home/user/repo

This command returned nothing, because we haven't put anything into the repository yet. So create a checkout directory:

    $ mkdir chdir; cd chdir

Download the repository content to your checkout directory. This is the directory where you work with the source code and other files; as soon as you want to save your work, you send your changes to the repository.

    $ svn checkout file:///home/user/repo
    Checked out revision 0.
    $ cd repo

Modify the content:

    $ mkdir test; cat > test/foo.txt
    A file to be added to the repository.
    ^D

Does svn know that we've just added a directory and a file? (I mean our local checked-out copy.)

    $ svn status
    ?      test

No, it doesn't. We have to tell svn that we want the new file(s) to be added.

    $ svn add *
    A         test
    A         test/foo.txt
    $ svn status
    A      test
    A      test/foo.txt

Now it does. Did anyone modify the repository content in the meantime? It is strongly recommended to check this before committing; otherwise your changes may conflict with the work of others.
    $ svn update
    At revision 0.

Let's send our changes to the repository (don't forget to add a message describing what we've just done):

    $ svn commit -m "committing the test directory and the foo.txt file"
    Adding         test
    Adding         test/foo.txt
    Transmitting file data .
    Committed revision 1.

Let's check for modifications:

    $ svn update
    At revision 1.

View the log:

    $ svn log
    r1 | user | 2006-05-12 16:51:59 +0200 (Fri, 12 May 2006) | 1 line

    committing the test directory and the foo.txt file

And finally, list the repository content:

    $ svn ls -vR file:///home/user/repo/
          1 user            May 12 16:51 test/
          1 user         38 May 12 16:51 test/foo.txt

Chapter 2
General features

| name | status | decentralization | platforms | notes |
|------|--------|------------------|-----------|-------|
| CVS [3] | old | no | unix, w32 | Use Subversion instead. |
| Subversion [4] | mature | no | unix, w32 | same idea as CVS |
| GNU arch 1 tla [5] | mature? | lines of development can be shared among independent repositories | unix, limited support for w32 | maintained but not developed; current users are supported but new users should probably look elsewhere; complicated |
| GNU Arch 2.0 revc [6] | not yet mature | lines of development can be shared among independent repositories | unix, w32? | very new, but it is designed by Tom Lord, the original designer of arch, drawing on lessons from arch, git, and bzr |
| ArX [7] | mature? | lines of development can be shared among independent repositories | unix, limited support for w32 | genetically related to arch; simpler |
| DARCS [8] | not yet mature | uniquely flexible branching, merging, and cherry-picking of selected patches | unix, limited support for w32 | cherry-picking is the fundamental operation on which all workflow is based; very easy to learn and easy to use |
| bazaar [9] | mature | lines of development can be shared among independent repositories | unix | a fork of arch for better user interface; a sister project to bazaar-ng |
| bazaar-ng [10] | not yet mature | lines of development can be shared among independent repositories | unix, w32 | newer; commercial background funding |

Table 2.1: Key Features Comparison

2.1 CVS

CVS is enormously popular, and it does the job. In fact, when CVS was released, it was a major innovation in software configuration management (see Figure 2.1). However, CVS is now showing its age through a number of awkward limitations: changes are tracked per file instead of per change, commits aren't atomic, renaming files and directories is awkward, and its branching limitations mean that you'd better faithfully tag things or there'll be trouble later. Some of the maintainers of the original CVS have declared that the CVS code has become too crusty to maintain effectively. These problems led the main CVS developers to start over and create Subversion.

Figure 2.1: CVS architecture

2.2 Subversion

Subversion (SVN) is a newer system intended to be a simple replacement for CVS. Subversion is basically a re-implementation of CVS with its warts fixed, and it still works the same basic way, supporting a centralized repository (see Figure 2.2). Like CVS, Subversion by itself doesn't handle decentralized development well; the svk project [19] offers a possibility to extend Subversion to support decentralized development.
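Two of the CVS limitations listed in section 2.1, per-file tracking and non-atomic commits, are among the warts that a changeset-based system removes. The following Python sketch illustrates the difference with a toy in-memory repository. The class `ToyRepository`, its methods, and the file names are hypothetical and exist only for this illustration; they do not reflect how CVS or Subversion are implemented.

```python
class CommitRejected(Exception):
    """Raised when a file in the commit is based on an outdated version."""


class ToyRepository:
    """Hypothetical in-memory repository: maps path -> (version, content)."""

    def __init__(self, files):
        self.files = dict(files)

    def _check(self, path, base_version):
        current_version, _ = self.files.get(path, (0, ""))
        if current_version != base_version:
            raise CommitRejected(f"{path} is out of date")

    def commit_per_file(self, changes):
        """Per-file style: each file is checked and stored on its own."""
        for path, (base_version, content) in changes.items():
            self._check(path, base_version)
            self.files[path] = (base_version + 1, content)

    def commit_atomic(self, changes):
        """Changeset style: validate every file first, then store all or nothing."""
        for path, (base_version, _) in changes.items():
            self._check(path, base_version)
        for path, (base_version, content) in changes.items():
            self.files[path] = (base_version + 1, content)


# Somebody else has already committed version 2 of b.txt.
initial = {"a.txt": (1, "old a"), "b.txt": (2, "someone else's b")}
# Our change touches both files but is still based on version 1 of each.
changes = {"a.txt": (1, "new a"), "b.txt": (1, "new b")}

per_file = ToyRepository(initial)
try:
    per_file.commit_per_file(changes)
except CommitRejected:
    pass  # a.txt was already stored before b.txt was rejected

atomic = ToyRepository(initial)
try:
    atomic.commit_atomic(changes)
except CommitRejected:
    pass  # nothing was stored

print(per_file.files["a.txt"])  # (2, 'new a')
print(atomic.files["a.txt"])    # (1, 'old a')
```

With the per-file commit, the repository ends up holding half of a logical change; with the atomic commit, a rejected change leaves the repository untouched.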
From a technology point of view, you can definitely argue with some of Subversion's decisions. For example, it doesn't handle changesets as directly as you'd expect given their centrality to the problem. But technical advancement is not the same as utility; for many people who currently use CVS and just want an incremental improvement, Subversion is probably more or less what they were expecting and looking for. There are weaknesses, however: for example, Subversion doesn't keep track of which patches have already been applied on a given branch, and trying to reapply a patch more than once causes problems.
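Why reapplying a patch causes problems, and how remembering applied patches avoids them, can be sketched with a toy model in Python. The `Patch` type (a single-line replacement), the `TrackingBranch` class, and the patch id `fix-42` are assumptions made up for this illustration; real tools work with richer patch formats, and this is not a description of any particular system's internals.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Patch:
    """A toy patch: replace one exact line with another."""
    patch_id: str
    old_line: str
    new_line: str


class PatchConflict(Exception):
    """Raised when the line a patch expects to change is not found."""


def apply_patch(lines, patch):
    """Apply a patch blindly, as a tool without merge history has to."""
    if patch.old_line not in lines:
        raise PatchConflict(f"{patch.patch_id}: {patch.old_line!r} not found")
    index = lines.index(patch.old_line)
    return lines[:index] + [patch.new_line] + lines[index + 1:]


@dataclass
class TrackingBranch:
    """A branch that remembers the ids of the patches it has absorbed."""
    lines: list
    applied: set = field(default_factory=set)

    def merge(self, patch):
        if patch.patch_id in self.applied:
            return False  # already merged, nothing to do
        self.lines = apply_patch(self.lines, patch)
        self.applied.add(patch.patch_id)
        return True


fix = Patch("fix-42", "timeout = 10", "timeout = 30")

# Without history: the second application fails because the context is gone.
branch = apply_patch(["name = demo", "timeout = 10"], fix)
try:
    apply_patch(branch, fix)
    second_attempt = "applied"
except PatchConflict:
    second_attempt = "conflict"

# With history: the repeated merge is recognised and skipped.
tracked = TrackingBranch(["name = demo", "timeout = 10"])
first = tracked.merge(fix)
second = tracked.merge(fix)

print(second_attempt, first, second, tracked.lines)
# conflict True False ['name = demo', 'timeout = 30']
```

The only difference between the two halves is the `applied` set: once a branch records which patches it already contains, a repeated merge becomes a harmless no-op instead of a conflict.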