
Version control with Git

Kees den Heijer

Data2day

January 29, 2020

DOI 10.5281/zenodo.3629176

Kees den Heijer (Data2day) Version control with Git January 29, 2020

Outline

1 Introduction
  Reproducibility
  Why version control
  Version control systems
  Git

2 Hands-on exercises
  Carpentries material
  Collaboration
  Branching

About the speaker

Educated in Coastal Engineering: MSc (2005), PhD (2013)

Experienced in data and information matters: Data Steward @ TU Delft, Owner / consultant @ Data2day

Version control

Recognize this??

source: Piled Higher and Deeper by Jorge Cham, www.phdcomics.com

Traceability by design is common practice in the food industry

Peanut butter

Egg code

Why version control?

We need Reproducibility by design

Scientists we are...
- professional
- human
- proud
- aiming at quality

Key advantages of version control
- easily trace back all steps taken
- have your work reproducible
- always have a back-up

Flavours of version control

Distributed Centralized Local

source: http://git-scm.com/book

Doesn’t cloud storage also keep track of versions?

Yes, but this has...
- implicit version control (auto save)
- limited control over what you call a version
- often no room to provide notes that describe a version
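By contrast, a version control system such as Git records a version only when you ask it to, and every version carries a note. The following is a minimal Python sketch of that idea, assuming a command-line Git client is installed. The function name, file name, author details and commit message are made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def explicit_version_demo():
    """Record one explicit version with a note and return its log line.

    Returns None when no command-line Git client is available.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as workdir:
        def git(*args):
            # Illustrative author details are passed per command, so the
            # sketch neither needs nor changes any global Git configuration.
            result = subprocess.run(
                ["git", "-c", "user.name=Example Author",
                 "-c", "user.email=author@example.org", *args],
                cwd=workdir, check=True, capture_output=True, text=True,
            )
            return result.stdout

        git("init")
        Path(workdir, "notes.txt").write_text("first draft\n")
        git("add", "notes.txt")  # you choose what goes into the version
        git("commit", "-m", "Add first draft of notes")  # you describe it
        return git("log", "--oneline").strip()


if __name__ == "__main__":
    print(explicit_version_demo())
```

The printed line is a short commit identifier followed by the note, which is exactly the information that auto-saved cloud versions usually lack.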

How to choose?

The choice of a suitable version control system depends on...
- What is the purpose? Software code, data, documents...
- How familiar are you and your (potential) co-workers with version control software?
- What type of files? Text, binary...
- What data volumes are expected?
- Number/frequency of changes
- Chain of reasoning

Git vs GitHub/GitLab etc.

- Git is the version control system, the core tool
- GitHub is a service providing Git repositories (default public, optional private/shared)
- Bitbucket is a service providing Git repositories. VU has an academic plan subscription.
- GitLab is a service providing Git repositories; an on-premises instance is hosted at TU Delft (private, group, public possible)

Git workflow

Server-client structure

source: https://www.edureka.co/blog/git-tutorial/

Important

- Help your future self: write informative commit messages
- Never put (plain text) passwords in your repository
- Think twice before adding large/binary files
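These three points can be checked partly automatically before a commit is made. Below is a rough Python sketch of such a check. It is not a Git feature: the function names, the pattern and the size threshold are assumptions chosen for illustration, and the pattern catches only obvious cases.

```python
import re

# Heuristic only: finds obvious cases such as `password = "..."`,
# not every possible secret.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api[_-]?key|token)\s*[:=]\s*\S+", re.IGNORECASE
)
MAX_SIZE_BYTES = 1_000_000  # assumed threshold; pick what suits your project


def check_file(name, content):
    """Return a list of warnings for one file about to be committed.

    `content` is the raw file content as bytes.
    """
    warnings = []
    if len(content) > MAX_SIZE_BYTES:
        warnings.append(f"{name}: larger than {MAX_SIZE_BYTES} bytes")
    if b"\0" in content:
        # A NUL byte is a common sign of binary data.
        warnings.append(f"{name}: looks like a binary file")
        return warnings
    text = content.decode("utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            warnings.append(f"{name}:{number}: possible plain-text password")
    return warnings


def check_message(message):
    """Flag commit messages that will not help your future self."""
    if len(message.strip()) < 10:
        return ["commit message is too short to describe the change"]
    return []


if __name__ == "__main__":
    print(check_file("settings.py", b'db_password = "hunter2"\n'))
    print(check_message("fix"))
```

A check like this could be run by hand or wired into a pre-commit hook; either way it only supplements, and does not replace, looking at what you are about to commit.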

Hands-on exercises

The Carpentries material
http://swcarpentry.github.io/git-novice/
Customization for today:
- Use Bitbucket
- Start with remote repository rather than local

Additional room for
- Collaboration
- Conflict handling
- Branching
- Pull requests
- Merging
- Continuous Integration

Prerequisites

Open slideshow
- DOI 10.5281/zenodo.3629176

Before you start:
- Create Bitbucket account
- Install command-line Git client
  Windows: https://gitforwindows.org/
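Once the client is installed, a quick check confirms that it can be found. Running `git --version` in a terminal is enough; the small Python sketch below does the same, with an illustrative function name.

```python
import shutil
import subprocess


def git_client_version():
    """Return the output of `git --version`, or None if Git is not on the PATH."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "--version"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_client_version()
    print(version or "Git not found: install a command-line client first")
```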

Create repository

Create


Create repository


Create a new repository

Guidance
- Repository name: workshop
- Include a README: Yes, with a template
- Version control: Git


Congratulations, your repository is live

Collaboration

Collaboration easily leads to conflicts...
- More than one person edits the same file/lines
- It can be ambiguous how to combine these edits
- In code, edits to solve an issue can create other issues
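When edits to the same file can be combined automatically, and when they cannot, is easiest to see by comparing both versions against their common starting point. The Python sketch below is a deliberately simplified illustration, not Git's actual merge algorithm: it assumes all three versions have the same number of lines, and the function name and example text are made up.

```python
def merge_lines(base, mine, theirs):
    """Three-way merge of equally long lists of lines.

    Returns (merged, conflicts), where conflicts holds the 1-based numbers
    of the lines that both sides changed in different ways.
    """
    merged, conflicts = [], []
    for number, (b, m, t) in enumerate(zip(base, mine, theirs), start=1):
        if m == t:      # both sides agree (changed or not)
            merged.append(m)
        elif m == b:    # only "theirs" changed this line
            merged.append(t)
        elif t == b:    # only "mine" changed this line
            merged.append(m)
        else:           # both changed it differently: a human must decide
            merged.append(b)
            conflicts.append(number)
    return merged, conflicts


if __name__ == "__main__":
    base = ["title", "method A", "result"]
    theirs = ["title", "method B", "result"]
    # Edits on different lines combine without trouble.
    print(merge_lines(base, ["Title", "method A", "result"], theirs))
    # Edits on the same line are ambiguous and reported as a conflict.
    print(merge_lines(base, ["title", "method C", "result"], theirs))
```

The first call merges cleanly because each person touched a different line; the second reports line 2 as a conflict because there is no way to tell which edit should win.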

Ways to cope with conflicts
Prevent and minimize effects:
- Branching
- Pull requests
- Continuous Integration

Branching (1)

source: Microsoft Git branching guidance

source: https://nvie.com/posts/a-successful-git-branching-model/

Branching (2)

source: https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows

Continuous Integration

How does it work?
- Predefined workflows/pipelines run upon each commit
- Email notification upon failure
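The two points above can be sketched in a few lines of Python. This is a toy model of what a CI service does, not the configuration format of Bitbucket or any other service; the function names are hypothetical and `notify` stands in for the email notification.

```python
def run_pipeline(steps):
    """Run the steps in order and stop at the first failure.

    `steps` is a list of (name, function) pairs; a step fails when its
    function raises an exception. Returns (passed, log).
    """
    log = []
    for name, step in steps:
        try:
            step()
        except Exception as error:
            log.append(f"FAILED {name}: {error}")
            return False, log
        log.append(f"ok {name}")
    return True, log


def on_commit(commit_id, steps, notify):
    """What a CI service does for each new commit, in miniature."""
    passed, log = run_pipeline(steps)
    if not passed:
        # A real service would send an email; here `notify` is any callable.
        notify(f"Pipeline failed for commit {commit_id}\n" + "\n".join(log))
    return passed


if __name__ == "__main__":
    def build():
        pass

    def test():
        assert 1 + 1 == 3, "arithmetic is broken"

    on_commit("abc1234", [("build", build), ("test", test)], notify=print)
```

In a real setup the steps are defined once in the repository (for example: build the document, run the tests), and the service runs them on its own machines after every push.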

Some examples
- LaTeX: https://bitbucket.org/denheijer/latex-ci-example/
- Python: https://bitbucket.org/denheijer/python-ci-example/
- R: https://bitbucket.org/denheijer/r-ci-example/
