Open Source Infrastructure: Version Control Systems


Jim Blandy [email protected]

Version Control Systems
● SCCS
● RCS
● CVS
● network-transparent CVS
● Subversion
● BitKeeper
● Git
● Mercurial
● Darcs

SCCS
● Marc Rochkind, 1972
● Included in early Unix systems from AT&T
● Managed single files
● "Interleaved" delta format

SCCS: Interleaved deltas
Version A:
    int main()
    {
      puts ("a");
    }

Version B:
    int main()
    {
      puts ("b");
      puts ("a");
    }

Version B':
    int main()
    {
      puts ("a");
      puts ("b' ");
    }

All three versions interleaved in one file, each line tagged with the delta that added it:
    +A   int main()
    +A   {
    +B     puts ("b");
    +A     puts ("a");
    +B'    puts ("b' ");
    +A   }

RCS
● Walter Tichy, 1980s
● Designed to address SCCS performance issues

CVS
● Dick Grune et al., ~1988
● Full directory trees, not single files
● Copy/modify/merge/commit model

Network-transparent CVS
● Jim Kingdon, 1990 (Cygnus Solutions)

Subversion
● Blandy, Collins-Sussman, Fitzpatrick, Fogel, 2000
● Retains CVS model
● Better performance
● Better features
● Better architecture

BitKeeper
● BitMover (McVoy), 1998
● Distributed
● Uses SCCS weave file format
● Used for Linux kernel collaboration until 2005

Hash naming
● Is 2^76 a big number?
● Roughly 7.5 * 10^22
● Roughly 2^32 people (for now)
● generating 2^32 strings per year
● for 2^20 years
● makes 2^84 strings generated in the likely history of the species
● SHA1 has 2^160 different hash values
● 2^84 / 2^160 means we'll hit 1/2^76 of the hash values

Git
● Linus Torvalds, 2005
● Influenced by BitKeeper and Monotone (Graydon Hoare)
● A blob is a string of bytes identified by its hash
● A tree is a series of names, flag bits, and blob hashes of files or subdirectories
● Each working copy includes complete history
● Commits are local
● Commits can have parents – zero or more!
● History is a DAG; multiple heads
● Hash naming makes for fast net synchronization
● Trading histories is the essential net operation

Mercurial
● Matt Mackall, 2005
● Similar to Git
● Uses a manifest instead of a tree

Leading Question
● What is the correct number of context lines to include in a patch?

Darcs
● David Roundy
● Metadata is a set of patches (not historical!)
● Write the effect of applying patch P1, and then P2, as: P1 P2
● Suppose I have P1 P2 P3 P4
● It's easy to compute P1 P2 P3
● But P1 P3 P4 may or may not apply cleanly
● Some pairs of patches P1 and P2 commute: there is some pair of patches P1' and P2' such that P1 P2 = P2' P1'

The Toronto Idea
● Karl Fogel, in dire circumstances
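
To make the "SCCS: Interleaved deltas" slide concrete, here is a minimal Python sketch of pulling one version out of an interleaved file. The (delta, text) pair layout and the VERSIONS table are assumptions made for illustration; they only model the insert-only "+A / +B / +B'" picture from the slide, not the real SCCS file format, which also records deletions.

```python
# Hypothetical in-memory weave: each line is tagged with the delta that added it.
# This mirrors the slide only; it is not the on-disk SCCS format.
WEAVE = [
    ("A", "int main()"),
    ("A", "{"),
    ("B", '  puts ("b");'),
    ("A", '  puts ("a");'),
    ("B'", '  puts ("b\' ");'),
    ("A", "}"),
]

# Assumed for illustration: the set of deltas that each version includes.
VERSIONS = {
    "A": {"A"},
    "B": {"A", "B"},
    "B'": {"A", "B'"},
}


def extract(weave, version):
    """Return the lines of `version` using a single pass over the weave."""
    wanted = VERSIONS[version]
    return [text for delta, text in weave if delta in wanted]


if __name__ == "__main__":
    for name in VERSIONS:
        print(f"--- version {name}")
        print("\n".join(extract(WEAVE, name)))
```

In this sketch every version costs one scan of the whole file, whichever version is requested.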
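
The arithmetic on the "Hash naming" slides, and the statement that a Git blob is identified by its hash, can be checked with a short Python sketch. The function and constant names are invented for this example. The "blob <length>" header followed by a NUL byte is how Git frames blob contents before hashing them with SHA-1; the slides do not show that detail.

```python
import hashlib
from fractions import Fraction

# Slide arithmetic: 2^32 people * 2^32 strings per year * 2^20 years.
STRINGS_EVER = 2**32 * 2**32 * 2**20   # equals 2^84
SHA1_VALUES = 2**160                   # number of distinct SHA-1 outputs
FRACTION_HIT = Fraction(STRINGS_EVER, SHA1_VALUES)  # equals 1 / 2^76


def git_blob_id(data: bytes) -> str:
    """Hash `data` the way Git names a blob: SHA-1 over a header plus the bytes."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    print(f"2^76 is about {float(2**76):.2e}")
    print("fraction of hash values hit:", FRACTION_HIT)
    # The empty blob has a well-known id in every Git repository.
    print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the name depends only on the bytes, two repositories holding the same file content agree on its id without exchanging anything.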
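
The commutation rule from the Darcs slides, P1 P2 = P2' P1', can be illustrated with a toy patch type. This is only a sketch under strong assumptions: the Insert class and the commute function are hypothetical and handle nothing but single-line insertions, which always commute. It is not Darcs's actual patch representation, where some pairs do not commute at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    """Toy patch (hypothetical): insert one line of text at a 0-based index."""
    index: int
    text: str

    def apply(self, lines):
        return lines[:self.index] + [self.text] + lines[self.index:]


def commute(p1, p2):
    """Given P1 followed by P2, return (P2', P1') such that P1 P2 = P2' P1'."""
    if p2.index <= p1.index:
        # P2 landed at or before P1's line, so P1 shifts down by one afterwards.
        return Insert(p2.index, p2.text), Insert(p1.index + 1, p1.text)
    # P2 landed after P1's line, so without P1 it sits one position earlier.
    return Insert(p2.index - 1, p2.text), Insert(p1.index, p1.text)


def apply_all(lines, *patches):
    """Apply patches left to right, matching the slide's 'P1 P2' notation."""
    for patch in patches:
        lines = patch.apply(lines)
    return lines


if __name__ == "__main__":
    base = ["x", "y", "z"]
    p1, p2 = Insert(1, "a"), Insert(3, "b")
    p2c, p1c = commute(p1, p2)
    print(apply_all(base, p1, p2))
    print(apply_all(base, p2c, p1c))
```

Both orders produce the same final file, and P2' alone applies cleanly to the base, which is what lets a patch be taken without the one recorded before it.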