Open Source Infrastructure: Systems

Jim Blandy

Version Control Systems

● SCCS

● RCS

● CVS

● network-transparent CVS

● Subversion

● BitKeeper

SCCS

● Marc Rochkind, 1972

● Included in early systems from AT&T

● Managed single files

● “Interleaved” delta format

SCCS:

A

int main()
{
  puts ("a");
}

SCCS: Interleaved deltas

B                A
int main()       int main()
{                {
  puts ("b");      puts ("a");
  puts ("a");    }
}

SCCS: Interleaved deltas

B                A                B'
int main()       int main()       int main()
{                {                {
  puts ("b");      puts ("a");      puts ("a");
  puts ("a");    }                  puts ("b'");
}                                 }

SCCS: Interleaved deltas

+A   int main()
+A   {
+B     puts ("b");
+A     puts ("a");
+B'    puts ("b'");
+A   }
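
The interleaved listing above can be read as a recipe: to extract a version, make one pass over the file and keep only the lines inserted by deltas that version is built from. A minimal sketch in Python, assuming a simplified weave in which each line carries only the tag of the delta that inserted it (real SCCS files also record deletions, which this sketch omits). The names `WEAVE`, `ANCESTORS`, and `extract` are illustrative, not part of SCCS.

```python
# Simplified weave: (delta that inserted the line, line text).
# Assumption: insert-only; real SCCS weaves also mark deleted lines.
WEAVE = [
    ("A", "int main()"),
    ("A", "{"),
    ("B", '  puts ("b");'),
    ("A", '  puts ("a");'),
    ("B'", '  puts ("b\'");'),
    ("A", "}"),
]

# Each version is built from its own delta plus those of its ancestors.
# Here B and B' are both derived directly from A.
ANCESTORS = {
    "A": {"A"},
    "B": {"A", "B"},
    "B'": {"A", "B'"},
}


def extract(version):
    """Return the lines of `version` in a single pass over the weave."""
    wanted = ANCESTORS[version]
    return [text for delta, text in WEAVE if delta in wanted]


if __name__ == "__main__":
    for version in ("A", "B", "B'"):
        print(f"--- {version}")
        print("\n".join(extract(version)))
```

Every version costs the same single pass to extract, which is the main appeal of the format; the price is that the pass always covers the whole file, however old or new the requested version.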

RCS

● Walter Tichy, 1980s

● designed to address SCCS performance issues

CVS

● Dick Grune et al., ~1988

● full directory trees, not single files

● copy/modify/merge model

Network-transparent CVS

● Jim Kingdon, 1990 (Cygnus Solutions)

Subversion

● Blandy, Collins-Sussman, Fitzpatrick, Fogel, 2000

● Retains CVS model

● Better performance

● Better features

● Better architecture

BitKeeper

● BitMover (McVoy), 1998

● Distributed

● Uses SCCS weave file format

● Used for Linux kernel collaboration until 2005

Hash naming

● Is 2^76 a big number? Roughly 7.5 * 10^22

Hash naming

● Roughly 2^32 people (for now)

● generating 2^32 strings per year

● for 2^20 years

● makes 2^84 strings generated in the likely history of the species

Hash naming

● SHA1 has 2^160 different hash values

● 2^84 / 2^160 means we'll hit 1/2^76 of the hash values
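
The arithmetic on these slides can be checked directly. The short Python sketch below multiplies out the slide's rough estimates (population, strings per year, years) and compares the total with the size of the SHA1 output space. The variable names are illustrative, and the inputs are the slide's order-of-magnitude guesses, not measured data.

```python
from fractions import Fraction

# Order-of-magnitude estimates taken from the slide.
people = 2**32            # roughly the number of people (for now)
strings_per_year = 2**32  # strings each person generates per year
years = 2**20             # assumed lifetime of the species

total_strings = people * strings_per_year * years  # 2^(32+32+20) = 2^84
sha1_values = 2**160                               # SHA1 produces 160-bit hashes

# Fraction of the hash space ever used, as an exact rational number.
fraction_used = Fraction(total_strings, sha1_values)

if __name__ == "__main__":
    print(f"2^76 is about {2**76:.2e}")
    print(f"total strings = 2^{total_strings.bit_length() - 1}")
    print(f"fraction of SHA1 values hit = 1/2^{fraction_used.denominator.bit_length() - 1}")
```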

Git

● Linus Torvalds, 2005

● Influenced by BitKeeper and Monotone (Graydon Hoare)

Git

● a blob is a string of bytes identified by its hash

● a tree is a series of names, flag bits, and blob hashes of files or subdirectories
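
The two definitions above can be sketched in a few lines of Python. The object header (`<kind> <size>` followed by a NUL byte) and the tree entry layout (`<mode> <name>`, NUL, 20-byte binary hash) follow Git's object format. The function names and the example files are illustrative, and the entry ordering is simplified to a plain sort by name, where Git itself applies a special rule for directories.

```python
import hashlib


def hash_object(kind, payload):
    """SHA-1 over a '<kind> <size>' header, a NUL byte, and the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


def blob_hash(data):
    """A blob is just a string of bytes; its name is its hash."""
    return hash_object("blob", data)


def tree_hash(entries):
    """entries: list of (mode, name, hex_hash) for files or subdirectories."""
    body = b""
    # Simplification: sort by name only; Git orders directory names slightly differently.
    for mode, name, hex_hash in sorted(entries, key=lambda entry: entry[1]):
        body += f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_hash)
    return hash_object("tree", body)


if __name__ == "__main__":
    # Hypothetical working tree: README at the top level, main.c under src/.
    readme = blob_hash(b"hello\n")
    main_c = blob_hash(b'int main() { puts ("a"); }\n')
    src = tree_hash([("100644", "main.c", main_c)])
    root = tree_hash([("100644", "README", readme), ("40000", "src", src)])
    print("root tree:", root)
```

Because a tree's hash covers the hashes of everything beneath it, changing one byte in one file changes the hash of every enclosing tree, while identical files and unchanged subdirectories keep the same names and are stored once.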

Git

● each working copy includes complete history

● commits are local

● commits can have parents – zero or more!

● history is a DAG; multiple heads

● hash naming makes for fast net synchronization

● trading histories is the essential net operation
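
One way to see why hash naming helps synchronization: if the receiver already has a commit with a given hash, it can be assumed to have that commit's entire ancestry, so the sender can stop walking the DAG there. The Python sketch below illustrates the idea only. The function name `missing_commits`, the short stand-in hashes, and the assumption that the receiver's history is complete are all illustrative; this is not Git's actual transfer protocol.

```python
def missing_commits(sender, receiver_has, heads):
    """Return commits reachable from `heads` that the receiver lacks.

    sender:       dict mapping commit hash -> list of parent hashes
    receiver_has: set of commit hashes the receiver already stores
    """
    missing = set()
    stack = list(heads)
    while stack:
        commit = stack.pop()
        if commit in receiver_has or commit in missing:
            # Assumption: a known hash implies its whole ancestry is present.
            continue
        missing.add(commit)
        stack.extend(sender[commit])
    return missing


if __name__ == "__main__":
    # Short names stand in for real hashes.
    sender = {
        "c1": [],            # root commit: zero parents
        "c2": ["c1"],
        "c3": ["c1"],        # second line of development from c1
        "c4": ["c2", "c3"],  # merge commit: two parents
    }
    receiver_has = {"c1", "c2"}
    print(sorted(missing_commits(sender, receiver_has, ["c4"])))
```

The walk visits only the commits that actually need to be sent plus their immediate boundary, so its cost depends on how far the two histories have diverged, not on the total size of the history.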

Mercurial

● Matt Mackall, 2005

● Similar to Git

● Uses a manifest instead of a tree

Leading Question

● What is the correct number of context lines to include in a patch?

Darcs

● David Roundy

● metadata is a set of patches (not historical!)

Darcs

● Write the effect of applying patch P1, and then P2, as: P1 P2

● Suppose I have P1 P2 P3 P4

● It's easy to compute P1 P2 P3

● But P1 P3 P4 may or may not apply cleanly

Darcs

● Some pairs of patches P1 and P2 commute: there is some pair of patches P1' and P2' such that P1 P2 = P2' P1'
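
To make the commutation rule concrete, here is a toy model in Python in which a patch is a single line insertion at an index. This is a hypothetical simplification for illustration, not Darcs's actual patch representation. In this toy model every pair of insertions commutes; richer patch types such as deletions or edits to the same line produce pairs that do not.

```python
# Toy patch: ("insert", index, line). Hypothetical model, not Darcs's format.

def apply_patch(doc, patch):
    """Return a new list of lines with the patch's line inserted."""
    _, index, line = patch
    return doc[:index] + [line] + doc[index:]


def commute(p1, p2):
    """Given P1 followed by P2, return (P2', P1') such that P1 P2 = P2' P1'."""
    _, i, a = p1
    _, j, b = p2
    if j <= i:
        # P2 lands at or before P1's line, so P1's line ends up one lower.
        return ("insert", j, b), ("insert", i + 1, a)
    # P2 lands after P1's line; without P1 applied, its index is one smaller.
    return ("insert", j - 1, b), ("insert", i, a)


if __name__ == "__main__":
    doc = ["int main()", "{", "}"]
    p1 = ("insert", 2, '  puts ("a");')
    p2 = ("insert", 2, '  puts ("b");')
    p2_prime, p1_prime = commute(p1, p2)
    original_order = apply_patch(apply_patch(doc, p1), p2)
    commuted_order = apply_patch(apply_patch(doc, p2_prime), p1_prime)
    print(original_order == commuted_order)
```

Note that the commuted patches are not the same as the originals: the indices shift to account for the other patch being absent or present, which is why the slide writes P1' and P2' and not P1 and P2.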

The Toronto Idea

● Karl Fogel, in dire circumstances