Version control with Git
Kees den Heijer
Data2day
January 29, 2020
DOI 10.5281/zenodo.3629176

Outline
1 Introduction
  Reproducibility
  Why version control
  Version control systems
  Git
2 Hands-on exercises
  Carpentries material
  Collaboration
  Branching
  Continuous Integration

About the speaker
Educated in Coastal Engineering: MSc (2005), PhD (2013)
Experienced in data and information matters:
  Data Steward @ TU Delft
  Owner / consultant @ Data2day

Version control
Recognize this?
source: Piled Higher and Deeper by Jorge Cham www.phdcomics.com

Traceability by design is common practice in the food industry
Peanut butter
Egg code

Why version control?
We need reproducibility by design
Scientists we are...
  professional
  human
  proud
  aiming at quality
Key advantages of version control
  easily trace back all steps taken
  have your work reproducible
  always have a back-up

Flavours of version control
Distributed, Centralized, Local
source: http://git-scm.com/book

Doesn’t cloud storage also keep track of versions?
Yes, but this has...
  implicit version control (auto save)
  limited control over what you call a version
  often no room for notes that describe a version

How to choose?
A suitable version control system depends on...
  What is the purpose? Software code, data, documents...
  How familiar are you and your (potential) co-workers with version control software?
  What type of files? Text, binary...
  What data volumes are expected?
  Number/frequency of changes
  Chain of reasoning

Git vs GitHub/GitLab/Bitbucket etc.
  Git is the version control system, the core tool
  GitHub is a service providing Git repositories (default public, optional private/shared)
  Bitbucket is a service providing Git repositories. VU has an academic plan subscription.
  GitLab is a service providing Git repositories; an on-premise instance is hosted at TU Delft (private, group, or public repositories possible)

Git workflow
Server-client structure
source: https://www.edureka.co/blog/git-tutorial/

Important
Help your future self
  Write clear commit messages
  Never put (plain text) passwords in your repository
  Think twice before adding large/binary files
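The cautions about passwords and large/binary files can be checked automatically before committing. The following is a minimal sketch, not a Git feature: the size threshold, the keyword pattern, and the function name check_file are assumptions chosen for illustration, and a keyword search will miss many secrets and flag some harmless lines.

```python
import re
from pathlib import Path

# Hypothetical threshold and pattern, chosen for illustration only.
MAX_BYTES = 1_000_000
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|token)\s*[=:]\s*\S+", re.IGNORECASE
)


def check_file(path, max_bytes=MAX_BYTES):
    """Return a list of warnings for one file that is about to be committed."""
    path = Path(path)
    warnings = []
    data = path.read_bytes()
    if len(data) > max_bytes:
        warnings.append(f"{path.name}: larger than {max_bytes} bytes")
    if b"\x00" in data:
        # A NUL byte is a common heuristic for binary content.
        warnings.append(f"{path.name}: looks like a binary file")
        return warnings
    text = data.decode("utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if SECRET_PATTERN.search(line):
            warnings.append(f"{path.name}:{number}: possible plain-text password")
    return warnings
```

A function like this could be called from a Git pre-commit hook for each file that is staged, so that the commit is stopped when any warning is returned.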

Hands-on exercises
The Carpentries material
  http://swcarpentry.github.io/git-novice/
  Customization for today:
    Use Bitbucket
    Start with remote repository rather than local
Additional room for
  Collaboration
  Conflict handling
  Branching
  Pull requests
  Merging
  Continuous Integration

Prerequisites
Before you start:
  Open slideshow: DOI 10.5281/zenodo.3629176
  Create Bitbucket account
  Install command-line Git client
    Windows: https://gitforwindows.org/
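To confirm that the command-line Git client is installed, it is enough to run git --version in a terminal. The same check can be scripted; the sketch below assumes only that the executable is called git and is on the PATH, and the function name git_version is made up for this example.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if Git is not on the PATH."""
    executable = shutil.which("git")
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(git_version() or "Git not found; install it before the exercises")
```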

Create repository
Create
Create repository
Create a new repository
Guidance
  Repository name: workshop
  Include a README: Yes, with a template
  Version control: Git
Congratulations, your repository is live
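Since today's exercises start from the remote repository, the next step is to clone it to your own machine. The sketch below builds the git clone command for the repository name workshop used above. The workspace name is a placeholder and the HTTPS URL pattern is an assumption; the repository page in Bitbucket shows the exact clone URL to use.

```python
import subprocess


def clone_command(workspace, repository="workshop"):
    """Build the `git clone` command for a Bitbucket repository over HTTPS."""
    # Assumed URL pattern; copy the real URL from the repository page.
    url = f"https://bitbucket.org/{workspace}/{repository}.git"
    return ["git", "clone", url]


def clone(workspace, repository="workshop"):
    """Run the clone; requires Git, network access, and an existing repository."""
    subprocess.run(clone_command(workspace, repository), check=True)
```

For example, clone_command("my-workspace") corresponds to typing git clone https://bitbucket.org/my-workspace/workshop.git in a terminal, where my-workspace is hypothetical.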

Collaboration
Collaboration easily leads to conflicts...
  More than one person edits the same file/lines
  It can be ambiguous how to merge these edits
  In code, edits to solve an issue can create other issues
Ways to cope with conflicts
  Prevent and minimize effects:
    Branching
    Pull requests
    Continuous Integration
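Why a merge can be ambiguous is easiest to see in a toy example. The sketch below is not Git's merge algorithm: it compares two edited copies of a file line by line against their common ancestor and assumes that no lines were added or removed. The function name merge_lines and the labels ours and theirs are chosen for illustration; the conflict text only imitates the style of Git's conflict markers.

```python
def merge_lines(base, ours, theirs):
    """Toy line-wise three-way merge for files with the same number of lines.

    Returns (merged, conflicts), where conflicts lists the 0-based indices of
    lines that both sides changed in different ways.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles edits that keep the line count")
    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            # Both sides agree (or neither changed the line).
            merged.append(o)
        elif o == b:
            # Only "theirs" changed the line.
            merged.append(t)
        elif t == b:
            # Only "ours" changed the line.
            merged.append(o)
        else:
            # Both changed the line differently: a human has to decide.
            conflicts.append(index)
            merged.append(f"<<<<<<< ours\n{o}\n=======\n{t}\n>>>>>>> theirs")
    return merged, conflicts
```

Edits to different lines merge cleanly, while two different edits to the same line are reported as a conflict rather than resolved silently.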

Branching (1)
source: Microsoft Git branching guidance
source: https://nvie.com/posts/a-successful-git-branching-model/

Branching (2)
source: https://git-scm.com/book/en/v2/Git-Branching-Branching-Workflows

Continuous Integration
How does it work?
  Predefined workflows/pipelines run upon each commit
  Email notification upon failure
Some examples
  LaTeX: https://bitbucket.org/denheijer/latex-ci-example/
  Python: https://bitbucket.org/denheijer/python-ci-example/
  R: https://bitbucket.org/denheijer/r-ci-example/
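The idea of a pipeline that runs on each commit and notifies on failure can be imitated locally. The sketch below is a toy model only, not how a hosting service implements Continuous Integration: the function name run_pipeline, the step names, and the use of print in place of an email are all assumptions for illustration.

```python
import subprocess
import sys


def run_pipeline(steps, notify=print):
    """Run named steps in order; stop and notify on the first failure.

    `steps` is a list of (name, command) pairs, where command is an argument
    list for subprocess.run. Returns True if every step succeeded.
    """
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            notify(f"Pipeline failed at step '{name}' (exit code {result.returncode})")
            return False
    return True


# Hypothetical steps; a real pipeline would build a document or run a test suite.
steps = [
    ("check", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("test", [sys.executable, "-c", "import sys; sys.exit(3)"]),
]
```

Calling run_pipeline(steps) runs the first step successfully, stops at the second, and passes a failure message to notify. On a hosting service the steps are read from a configuration file in the repository and the notification is sent by email.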