Distributed Version Control with Git and Mercurial

Total Page:16

File Type:pdf, Size:1020Kb

Distributed Version Control with Git and Mercurial A.CHRISTOPHERTOREK DISTRIBUTEDVERSION CONTROLWITHGITAND MERCURIAL 2 copyright stuff Dedication NB: this front matter is still quite a hodgepodge; these are just various thoughts. [Find out if I can use a copy of xkcd #1597] Git is notoriously difficult for beginners. In xkcd comic #1597, Randall Munroe, referring to Git, draws a character (called “Cueball” on the explainxkcd site) who says: Just memorize these shell commands and type them to sync up. If you get errors, save your work elsewhere, delete the project, and download a fresh copy. [Maybe turn the above into an epigraph?] Everyone makes mistakes. The difference between being a novice and being an expert cannot be boiled down to just one sentence, but we can say that one—maybe the most important—difference is that an expert can recover from mistakes mid-process. This book should help you do so. Git makes it easy to make mistakes, and also easy to correct them. Mercurial makes it harder to make mistakes, but also harder to cor- rect them. There are already several good Git books, Chacon and Straub [2014] and Loeliger[ 2009]. The primary author of Mercurial has published a Mercurial book, O’Sullivan[ 2009]. So why write another book? Loeliger’s book is good but has become out of date—a constant hazard with actively-developed software. Chacon’s book is on line and gets updated, but focuses strictly on Git. O’Sullivan’s book fo- cuses strictly on Mercurial. I’m not currently aware of any books that approach version control in this particular manner, and show both Git and Mercurial usage. If you are reading this book, you probably have thought about using Git or Mercurial (or even both), or you may have used them in the past or be using them now and want to learn more. You may be considering which one to use. The book will try to address all of these. I also wanted to have a book that could also appear as a series of web pages that were structured very differently. That is, the book would proceed in a logical building-up fashion, but using a web hyperlink, you could start with any particular topic and zoom up or down the scale of generalization or specialization to find specific answers. Part of the project was to write a program to produce hypertext (HTML) web pages. It would read the LaTeX input, and actually use LaTeX to generate figures, but keep the small-section-at-a-time setup. 3 The text for the book and the text for the hypertext setup would live together in (at least relative) harmony. At the time I write this, the outcome of this experiment has yet to be determined. Both Git and Mercurial have many Graphical User Interfaces (GUIs) and Integrated Development Environments (IDEs) that al- low you to browse commits, and in some GUIs and all IDEs, change or create branches, make new commits, and so on. Every one of these is different and we cannot possibly address them, so we will stick with the command line interfaces. [Here’s an intro bit that goes with the xkcd comic] [scene: you’ve been given some shell commands to type] $ git clone ssh://host.name/path/to/repo $ cd repo make changes... $ git commit repeat change-and-commit as needed $ git pull --rebase You may get a merge conflict, and not know what to do. Or maybe you do know what to do, and have done it. But in any case, you want to continue and you try (as instructed): $ git rebase --continue but now you get an error: No changes - did you forget to use ’git add’? If there is nothing left to stage, chances are that something else already introduced the same changes; you might want to skip this patch. When you have resolved this problem run "git rebase --continue". If you would prefer to skip this patch, instead run "git rebase --skip". To check out the original branch and stop rebasing run "git rebase --abort". If you run git status , which is a good thing to use, you simply see: # Not currently on any branch. nothing to commit (working directory clean) You may also run into problems when you have used git merge or git rebase successfully—or so you thought; and then you dis- cover that you want or need to back out of the merge or rebase. This book should set you up so that you know what to do. Organization of this book The book begins with an overview of version control in general. We introduce terminology that you will need, and review some historical version control systems and their distinguishing characteristics. 4 Next, we cover graph theory and how it applies to both Git and Mercurial. It’s worth noting that while this theory has nothing to do with the controlled source itself, it’s a basic building block for performing source control. It not only interacts with merging and rebasing, it is also fundamental to the distributed nature of Git and Mercurial repositories. The third chapter describes more precisely what is in a commit; how we compare one commit to another; and how, at a high level, merging works, using the commit graph described in Chapter 2. It also mentions the issues with file path names that will affect you once you distribute a repository across dissimilar operating systems. The fourth chapter covers the mechanics of distributing repos- itories, and one of the key consequences: that some commits are public and some commits are private. Private commits can be deleted without affecting others, but once a commit is published, it may be impossible to retract it. It also includes some of the theory needed to understand how commits can be signed and authenticated. With these basics out of the way, Chapters 5 and 6 some of the basic setup and usage of both Git and Mercurial. We discover just how similar, and in some cases just how different, the two VCSes are. XXX this is now wrong Chapter 6 discusses diffs: comparisons between pairs of commits, or one commit and the corresponding working-tree files. Chapter 7 covers merging, which—while it has many variations— basically amounts to combining two diffs. Chapter 8 (. is not yet written). Those who want to jump right to using Git or Mercurial can start at Chapter 5, referring back to earlier theory chapters only as needed. However, careful reading of the history and theory chapters should give you a much better idea of what you are doing with the practical aspects of version control. Each page has room for graphics, side notes, and exercises. Side notes that are numbered are specific details regarding items in the main text. Unnumbered side notes are general ideas I find interesting or relevant, yet not directly related to the main text. The exercises are optional, but are meant to verify and cement your understanding of the concepts involved. ASCII Chapter 3 refers to ASCII, the American Standard Code for Infor- mation Interchange. This is a very old standard for saving and ex- changing data on computer systems, in which one single-byte code represents one letter, digit, or other printable symbol (and certain 5 “control” operations including tab, carriage return, and the like). ASCII is an old standard, and by the 1980s, all computers could work with it. It is not adequate to modern needs, but much is built upon it. Numbers This book mostly works with ordinary decimal numbers. However, hashes are typically encoded in hexadecimal, with “digits” that range from 0 through 9 but then continue on with abcdef, which may be written in either uppercase or lowercase. In some places, we will use a leading zero and letter-x to denote hexadecimal numbers: 0x10 rep- resents the same number as 16, 0x80 represents the same number as 128, 0x100 represents the same number as 256, and so on. We will write hashes as a27fc31 and the like, without any leading prefix. While these do represent numbers inside the computer, their deci- malized equivalent representations are not useful for anything. Bugs The term bug dates back to at least the 1870s and Thomas Edison. The first application to computing may have been in 1947 when Grace Hopper’s group at Harvard discovered a moth in the cir- cuitry of the Harvard Mark II computer. (The log book containing the remains of the moth is now in the possession of the Smithsonian Institution; see Smithsonian Institution[ 1994]). Bugs are, however, usually very small, difficult to observe from a distance, and can in- duce a great deal of revulsion in some people. In this book, we will instead use larger, friendlier mammals, specifically marsupials. In- stead of moths, ants, spiders, centipedes, and cockroaches, we will deliberately introduce kangaroos and wallabies into our programs and processes, so as to illustrate their removal. Target Audience, preface, introduction? Animals in this book The Marsupial Maker is not a real project, but marsupials are real This stuff is currently at front of book, creatures, and I find them quite interesting. Here are photos of some but probably should be at back of book. that I took on a trip to parts of Australia in February of 2010. Plate 1: Red kangaroo, Healesville Sanctuary. The kangaroo is probably the most widely known marsupial. There are actually four species of large kangaroo: the red, the eastern and western grey, and the antilopine. There are also smaller tree- kangaroos and rat-kangaroos. 8 Plate 2: Bennet’s wallaby, Cradle Mountain. The wallaby is smaller than any kangaroo, but in fact the term “wallaby” is defined a bit loosely.
Recommended publications
  • Pragmatic Version Control Using Subversion
    What readers are saying about Pragmatic Version Control using Subversion I expected a lot, but you surprised me with even more. Hav- ing used CVS for years I hesitated to try Subversion until now, although I knew it would solve many of the shortcom- ings of CVS. After reading your book, my excuses to stay with CVS disappeared. Oh, and coming from the Pragmatic Bookshelf this book is fun to read too. Thanks Mike. Steffen Gemkow Managing Director, ObjectFab GmbH I’m a long-time user of CVS and I’ve been skeptical of Sub- version, wondering if it would ever be “ready for prime time.” Until now. Thanks to Mike Mason for writing a clear, con- cise, gentle introduction to this new tool. After reading this book, I’m actually excited about the possibilities for version control that Subversion brings to the table. David Rupp Senior Software Engineer, Great-West Life & Annuity This was exactly the Subversion book I was waiting for. As a long-time Perforce and CVS user and administrator, and in my role as an agile tools coach, I wanted a compact book that told me just what I needed to know. This is it. Within a couple of hours I was up and running against remote Subversion servers, and setting up my own local servers too. Mike uses a lot of command-line examples to guide the reader, and as a Windows user I was worried at first. My fears were unfounded though—Mike’s examples were so clear that I think I’ll stick to using the command line from now on! I thoroughly recommend this book to anyone getting started using or administering Subversion.
    [Show full text]
  • Version Control 101 Exported from Please Visit the Link for the Latest Version and the Best Typesetting
    Version Control 101 Exported from http://cepsltb4.curent.utk.edu/wiki/efficiency/vcs, please visit the link for the latest version and the best typesetting. Version Control 101 is created in the hope to minimize the regret from lost files or untracked changes. There are two things I regret. I should have learned Python instead of MATLAB, and I should have learned version control earlier. Version control is like a time machine. It allows you to go back in time and find out history files. You might have heard of GitHub and Git and probably how steep the learning curve is. Version control is not just Git. Dropbox can do version control as well, for a limited time. This tutorial will get you started with some version control concepts from Dropbox to Git for your needs. More importantly, some general rules are suggested to minimize the chance of file losses. Contents Version Control 101 .............................................................................................................................. 1 General Rules ................................................................................................................................... 2 Version Control for Files ................................................................................................................... 2 DropBox or Google Drive ............................................................................................................. 2 Version Control on Confluence ...................................................................................................
    [Show full text]
  • Generating Commit Messages from Git Diffs
    Generating Commit Messages from Git Diffs Sven van Hal Mathieu Post Kasper Wendel Delft University of Technology Delft University of Technology Delft University of Technology [email protected] [email protected] [email protected] ABSTRACT be exploited by machine learning. The hypothesis is that methods Commit messages aid developers in their understanding of a con- based on machine learning, given enough training data, are able tinuously evolving codebase. However, developers not always doc- to extract more contextual information and latent factors about ument code changes properly. Automatically generating commit the why of a change. Furthermore, Allamanis et al. [1] state that messages would relieve this burden on developers. source code is “a form of human communication [and] has similar Recently, a number of different works have demonstrated the statistical properties to natural language corpora”. Following the feasibility of using methods from neural machine translation to success of (deep) machine learning in the field of natural language generate commit messages. This work aims to reproduce a promi- processing, neural networks seem promising for automated commit nent research paper in this field, as well as attempt to improve upon message generation as well. their results by proposing a novel preprocessing technique. Jiang et al. [12] have demonstrated that generating commit mes- A reproduction of the reference neural machine translation sages with neural networks is feasible. This work aims to reproduce model was able to achieve slightly better results on the same dataset. the results from [12] on the same and a different dataset. Addition- When applying more rigorous preprocessing, however, the per- ally, efforts are made to improve upon these results by applying a formance dropped significantly.
    [Show full text]
  • Introduction to Version Control with Git
    Warwick Research Software Engineering Introduction to Version Control with Git H. Ratcliffe and C.S. Brady Senior Research Software Engineers \The Angry Penguin", used under creative commons licence from Swantje Hess and Jannis Pohlmann. March 12, 2018 Contents 1 About these Notes1 2 Introduction to Version Control2 3 Basic Version Control with Git4 4 Releases and Versioning 11 Glossary 14 1 About these Notes These notes were written by H Ratcliffe and C S Brady, both Senior Research Software Engineers in the Scientific Computing Research Technology Platform at the University of Warwick for a series of Workshops first run in December 2017 at the University of Warwick. This document contains notes for a half-day session on version control, an essential part of the life of a software developer. This work, except where otherwise noted, is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Li- cense. To view a copy of this license, visit http://creativecommons.org/ licenses/by-nc-nd/4.0/. The notes may redistributed freely with attribution, but may not be used for commercial purposes nor altered or modified. The Angry Penguin and other reproduced material, is clearly marked in the text and is not included in this declaration. The notes were typeset in LATEXby H Ratcliffe. Errors can be reported to [email protected] 1.1 Other Useful Information Throughout these notes, we present snippets of code and pseudocode, in particular snippets of commands for shell, make, or git. These often contain parts which you should substitute with the relevant text you want to use.
    [Show full text]
  • Distributed Configuration Management: Mercurial CSCI 5828 Spring 2012 Mark Grebe Configuration Management
    Distributed Configuration Management: Mercurial CSCI 5828 Spring 2012 Mark Grebe Configuration Management Configuration Management (CM) systems are used to store code and other artifacts in Software Engineering projects. Since the early 70’s, there has been a progression of CM systems used for Software CM, starting with SCCS, and continuing through RCS, CVS, and Subversion. All of these systems used a single, centralized repository structure. Distributed Configuration Management As opposed to traditional CM systems, Distributed Configuration Management Systems are ones where there does not have to be a central repository. Each developer has a copy of the entire repository and history. A central repository may be optionally used, but it is equal to all of the other developer repositories. Advantages of Distributed Configuration Management Distributed tools are faster than centralized ones since metadata is stored locally. Can use tool to manage changes locally while not connected to the network where server resides. Scales more easily, since all of the load is not on a central server. Allows private work that is controlled, but not released to the larger community. Distributed systems are normally designed to make merges easy, since they are done more often. Mercurial Introduction Mercurial is a cross-platform, distributed configuration management application. In runs on most modern OS platforms, including Windows, Linux, Solaris, FreeBSD, and Mac OSX. Mercurial is written 95% in Python, with the remainder written in C for speed. Mercurial is available as a command line tool on all of the platforms, and with GUI support programs on many of the platforms. Mercurial is customizable with extensions, hooks, and output templates.
    [Show full text]
  • Microservices and Monorepos
    Microservices and Monorepos Match made in heaven? Sven Erik Knop, Perforce Software Overview . Microservices refresher . Microservices and versioning . What are Monorepos and why use them? . These two concepts seem to contradict – why mix them together? . The magic of narrow cloning . A match made in heaven! 2 Why Microservices? . Monolithic approach: App 3 Database Microservices approach . Individual Services 4 DB DB Database Versioning Microservices . Code . Executables and Containers . Configuration . Natural choice: individual repositories for each service Git . But: • Security • Visibility • Refactoring • Single change id to rule them all? 5 Monorepo . Why would you use a monorepo? . Who is using monorepos? . How would you use a monorepo? 6 Monorepos: Why would you do this? . Single Source of Truth for all projects . Simplified security . Configuration and Refactoring across entire application . Single change id across all projects . Examples: • Google, Facebook, Twitter, Salesforce, ... 7 Single change across projects change 314156 8 Monorepos: Antipatterns User workspace User workspace 9 Monorepos – view mapping User workspace . Map one or more services . Users only access files they need . Simplified pushing of changes 10 What does this have to do with Git? . Git does not support Monorepos • Limitations on number and size of files, history, contributing users • Companies have tried and failed . Android source spread over a thousand Git repositories • Requires repo and gerrit to work with 11 How can we square this circle? https://en.wikipedia.org/wiki/Squaring_the_circle 12 Narrow cloning! . Clone individual projects/services . Clone a group of projects into a single repo 13 Working with narrowly cloned repos . Users work normally in Git . Fetch and push changes from and to monorepo .
    [Show full text]
  • Higher Inductive Types (Hits) Are a New Type Former!
    Git as a HIT Dan Licata Wesleyan University 1 1 Darcs Git as a HIT Dan Licata Wesleyan University 1 1 HITs 2 Generator for 2 equality of equality HITs Homotopy Type Theory is an extension of Agda/Coq based on connections with homotopy theory [Hofmann&Streicher,Awodey&Warren,Voevodsky,Lumsdaine,Garner&van den Berg] 2 Generator for 2 equality of equality HITs Homotopy Type Theory is an extension of Agda/Coq based on connections with homotopy theory [Hofmann&Streicher,Awodey&Warren,Voevodsky,Lumsdaine,Garner&van den Berg] Higher inductive types (HITs) are a new type former! 2 Generator for 2 equality of equality HITs Homotopy Type Theory is an extension of Agda/Coq based on connections with homotopy theory [Hofmann&Streicher,Awodey&Warren,Voevodsky,Lumsdaine,Garner&van den Berg] Higher inductive types (HITs) are a new type former! They were originally invented[Lumsdaine,Shulman,…] to model basic spaces (circle, spheres, the torus, …) and constructions in homotopy theory 2 Generator for 2 equality of equality HITs Homotopy Type Theory is an extension of Agda/Coq based on connections with homotopy theory [Hofmann&Streicher,Awodey&Warren,Voevodsky,Lumsdaine,Garner&van den Berg] Higher inductive types (HITs) are a new type former! They were originally invented[Lumsdaine,Shulman,…] to model basic spaces (circle, spheres, the torus, …) and constructions in homotopy theory But they have many other applications, including some programming ones! 2 Generator for 2 equality of equality Patches Patch a a 2c2 diff b d = < b c c --- > d 3 3 id a a b b
    [Show full text]
  • Homework 0: Account Setup for Course and Cloud FPGA Intro Questions
    Cloud FPGA Homework 0 Fall 2019 Homework 0 Jakub Szefer 2019/10/20 Please follow the three setup sections to create BitBucket git repository, install LATEX tools or setup Overleaf account, and get access to the course's git repository. Once you have these done, answer the questions that follow. Submit your solutions as a single PDF file generated from a template; more information is at end in the Submission Instructions section. Setup BitBucket git Repository This course will use git repositories for code development. Each student should setup a free BitBucket (https://bitbucket.org) account and create a git repository for the course. Please make the repository private and give WRITE access to your instructor ([email protected]). Please send the URL address of the repository to the instructor by e-mail. Make sure there is a README:md file in the repository (access to the repository will be tested by a script that tries to download the README:md from the repository address you share). Also, if you are using a Apple computer, please add :gitignore file which contains one line: :DS Store (to prevent the hidden :DS Store files from accidentally being added to the repository). If you have problems accessing BitBucket git from the command line, please see the Appendix. Setup LATEX and Overleaf Any written work (including this homework's solutions) will be submitted as PDF files generated using LATEX [1] from provided templates. Students can setup a free Overleaf (https://www. overleaf.com) account to edit LATEX files and generate PDFs online; or students can install LATEX tools on their computer.
    [Show full text]
  • Version Control – Agile Workflow with Git/Github
    Version Control – Agile Workflow with Git/GitHub 19/20 November 2019 | Guido Trensch (JSC, SimLab Neuroscience) Content Motivation Version Control Systems (VCS) Understanding Git GitHub (Agile Workflow) References Forschungszentrum Jülich, JSC:SimLab Neuroscience 2 Content Motivation Version Control Systems (VCS) Understanding Git GitHub (Agile Workflow) References Forschungszentrum Jülich, JSC:SimLab Neuroscience 3 Motivation • Version control is one aspect of configuration management (CM). The main CM processes are concerned with: • System building • Preparing software for releases and keeping track of system versions. • Change management • Keeping track of requests for changes, working out the costs and impact. • Release management • Preparing software for releases and keeping track of system versions. • Version control • Keep track of different versions of software components and allow independent development. [Ian Sommerville,“Software Engineering”] Forschungszentrum Jülich, JSC:SimLab Neuroscience 4 Motivation • Keep track of different versions of software components • Identify, store, organize and control revisions and access to it • Essential for the organization of multi-developer projects is independent development • Ensure that changes made by different developers do not interfere with each other • Provide strategies to solve conflicts CONFLICT Alice Bob Forschungszentrum Jülich, JSC:SimLab Neuroscience 5 Content Motivation Version Control Systems (VCS) Understanding Git GitHub (Agile Workflow) References Forschungszentrum Jülich,
    [Show full text]
  • Revision Control
    Revision Control Tomáš Kalibera, (Peter Libič) Department of Distributed and Dependable Systems http://d3s.mff.cuni.cz CHARLES UNIVERSITY PRAGUE Faculty of Mathematics and Physics Problems solved by revision control What is it good for? Keeping history of system evolution • What a “system” can be . Source code (single file, source tree) . Textual document . In general anything what can evolve – can have versions • Why ? . Safer experimentation – easy reverting to an older version • Additional benefits . Tracking progress (how many lines have I added yesterday) . Incremental processing (distributing patches, …) Allowing concurrent work on a system • Why concurrent work ? . Size and complexity of current systems (source code) require team work • How can concurrent work be organized ? 1. Independent modifications of (distinct) system parts 2. Resolving conflicting modifications 3. Checking that the whole system works . Additional benefits . Evaluating productivity of team members Additional benefits of code revision control • How revision control helps . Code is isolated at one place (no generated files) . Notifications when a new code version is available • Potential applications that benefit . Automated testing • Compile errors, functional errors, performance regressions . Automated building . Backup • Being at one place, the source is isolated from unneeded generated files . Code browsing • Web interface with hyperlinked code Typical architecture Working copy Source code repository (versioned sources) synchronization Basic operations • Check-out . Create a working copy of repository content • Update . Update working copy using repository (both to latest and historical version) • Check-in (Commit) . Propagate working copy back to repository • Diff . Show differences between two versions of source code Simplified usage scenario Source code Check-out or update repository Working 1 copy 2 Modify & Test Check-in 3 Exporting/importing source trees • Import .
    [Show full text]
  • FAKULTÄT FÜR INFORMATIK Leveraging Traceability Between Code and Tasks for Code Reviews and Release Management
    FAKULTÄT FÜR INFORMATIK DER TECHNISCHEN UNIVERSITÄT MÜNCHEN Master’s Thesis in Informatics Leveraging Traceability between Code and Tasks for Code Reviews and Release Management Jan Finis FAKULTÄT FÜR INFORMATIK DER TECHNISCHEN UNIVERSITÄT MÜNCHEN Master’s Thesis in Informatics Leveraging Traceability between Code and Tasks for Code Reviews and Release Management Einsatz von Nachvollziehbarkeit zwischen Quellcode und Aufgaben für Code Reviews und Freigabemanagement Author: Jan Finis Supervisor: Prof. Bernd Brügge, Ph.D. Advisors: Maximilian Kögel, Nitesh Narayan Submission Date: May 18, 2011 I assure the single-handed composition of this master’s thesis only supported by declared resources. Sydney, May 10th, 2011 Jan Finis Acknowledgments First, I would like to thank my adviser Maximilian Kögel for actively supporting me with my thesis and being reachable for my frequent issues even at unusual times and even after he left the chair. Furthermore, I would like to thank him for his patience, as the surrounding conditions of my thesis, like me having an industrial internship and finishing my thesis abroad, were sometimes quite impedimental. Second, I want to thank my other adviser Nitesh Narayan for helping out after Max- imilian has left the chair. Since he did not advise me from the start, he had more effort working himself into my topic than any usual adviser being in charge of a thesis from the beginning on. Third, I want to thank the National ICT Australia for providing a workspace, Internet, and library access for me while I was finishing my thesis in Sydney. Finally, my thanks go to my supervisor Professor Bernd Brügge, Ph.D.
    [Show full text]
  • Distributed Revision Control with Mercurial
    Distributed revision control with Mercurial Bryan O’Sullivan Copyright c 2006, 2007 Bryan O’Sullivan. This material may be distributed only subject to the terms and conditions set forth in version 1.0 of the Open Publication License. Please refer to Appendix D for the license text. This book was prepared from rev 028543f67bea, dated 2008-08-20 15:27 -0700, using rev a58a611c320f of Mercurial. Contents Contents i Preface 2 0.1 This book is a work in progress ...................................... 2 0.2 About the examples in this book ..................................... 2 0.3 Colophon—this book is Free ....................................... 2 1 Introduction 3 1.1 About revision control .......................................... 3 1.1.1 Why use revision control? .................................... 3 1.1.2 The many names of revision control ............................... 4 1.2 A short history of revision control .................................... 4 1.3 Trends in revision control ......................................... 5 1.4 A few of the advantages of distributed revision control ......................... 5 1.4.1 Advantages for open source projects ............................... 6 1.4.2 Advantages for commercial projects ............................... 6 1.5 Why choose Mercurial? .......................................... 7 1.6 Mercurial compared with other tools ................................... 7 1.6.1 Subversion ............................................ 7 1.6.2 Git ................................................ 8 1.6.3
    [Show full text]