Open Source Infrastructure: Version Control Systems
Jim Blandy
Version Control Systems
● SCCS
● RCS
● CVS
● network-transparent CVS
● Subversion
● BitKeeper
● Git
● Darcs
SCCS
● Marc Rochkind, 1972
● Included in early Unix systems from AT&T
● Managed single files
● “Interleaved” delta format
SCCS: Interleaved deltas
Version A:
    int main()
    {
      puts ("a");
    }
Version B:
    int main()
    {
      puts ("b");
      puts ("a");
    }
Version B':
    int main()
    {
      puts ("a");
      puts ("b' ");
    }
Interleaved file, each line tagged with the version that introduced it:
    +A  int main()
    +A  {
    +B    puts ("b");
    +A    puts ("a");
    +B'   puts ("b' ");
    +A  }
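To make the interleaved format concrete, here is a minimal sketch of pulling one version back out of the interleaved lines. It models only the insertions shown on the slide; real SCCS files also record deletions, which this toy omits. The names `WEAVE` and `extract` are made up for illustration.

```python
# Each entry is (version that introduced the line, line text),
# transcribed from the interleaved listing on the slide.
WEAVE = [
    ("A", "int main()"),
    ("A", "{"),
    ("B", '  puts ("b");'),
    ("A", '  puts ("a");'),
    ("B'", '  puts ("b\' ");'),
    ("A", "}"),
]


def extract(weave, wanted):
    """Return the lines introduced by any version in `wanted`, in file order."""
    return [text for version, text in weave if version in wanted]


# B and B' are each built on A, so each needs A's lines plus its own.
version_a = extract(WEAVE, {"A"})
version_b = extract(WEAVE, {"A", "B"})
version_b_prime = extract(WEAVE, {"A", "B'"})

print("\n".join(version_b))
```

Every version is recovered in a single pass over the file, with no chain of deltas to replay.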
RCS
● Walter Tichy, 1980s
● designed to address SCCS performance issues
CVS
● Dick Grune et al., ~1988
● full directory trees, not single files
● copy/modify/merge/commit model
Network-transparent CVS
● Jim Kingdon, 1990 (Cygnus Solutions)
Subversion
● Blandy, Collins-Sussman, Fitzpatrick, Fogel 2000
● Retains CVS model
● Better performance
● Better features
● Better architecture
BitKeeper
● BitMover (McVoy), 1998
● Distributed
● Uses SCCS weave file format
● Used for Linux kernel collaboration until 2005
Hash naming
● Is $2^{76}$ a big number? Roughly $7.5 \times 10^{22}$
● Roughly $2^{32}$ people (for now)
● generating $2^{32}$ strings per year
● for $2^{20}$ years
● makes $2^{84}$ strings generated in the likely history of the species
● SHA1 has $2^{160}$ different hash values
● $2^{84} / 2^{160}$ means we'll hit $1/2^{76}$ of the hash values
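The arithmetic on these slides can be checked in a few lines. This is only a sanity check of the estimate, using the slide's own round numbers; the variable names are illustrative.

```python
import hashlib
from fractions import Fraction

people = 2**32              # rough population, per the slide
strings_per_year = 2**32    # strings generated per person per year
years = 2**20               # assumed lifetime of the species

total_strings = people * strings_per_year * years

# SHA-1 digests are 20 bytes, i.e. 160 bits.
sha1_bits = hashlib.sha1(b"").digest_size * 8
hash_values = 2**sha1_bits

# Fraction of the hash space that would ever be used.
fraction_used = Fraction(total_strings, hash_values)

print(f"2^76 is about {2**76:.2e}")
print(f"fraction of hash values used: 1/2^{fraction_used.denominator.bit_length() - 1}")
```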
Git
● Linus Torvalds, 2005
● Influenced by BitKeeper and Monotone (Graydon Hoare)
Git
● a blob is a string of bytes identified by its hash
● a tree is a series of names, flag bits, and blob hashes of files or subdirectories
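A sketch of how blobs and trees get their names. The object layout used here (a `kind length NUL` header before the body, and tree entries of the form `mode name NUL raw-hash`) follows Git's on-disk format; the function names are made up for this example, and the entry sorting is simplified (Git orders directory names slightly differently).

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object body together with its 'kind length NUL' header."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def blob_id(data: bytes) -> str:
    return git_object_id("blob", data)


def tree_id(entries) -> str:
    """entries: (mode, name, hex id) triples, e.g. ("100644", "a.txt", blob_id(b"..."))."""
    body = b"".join(
        f"{mode} {name}".encode() + b"\0" + bytes.fromhex(hex_id)
        # Simplification: plain sort by name.
        for mode, name, hex_id in sorted(entries, key=lambda e: e[1])
    )
    return git_object_id("tree", body)


readme = blob_id(b"hello world\n")
src = tree_id([("100644", "main.c", blob_id(b"int main() { return 0; }\n"))])
root = tree_id([("100644", "README", readme), ("40000", "src", src)])
print(readme)
print(root)
```

Because a tree's hash covers the hashes of everything beneath it, changing any file changes the name of every tree above it.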
Git
● each working copy includes complete history
● commits are local
● commits can have parents – zero or more!
● history is a DAG; multiple heads
● hash naming makes for fast net synchronization
● trading histories is the essential net operation
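A toy model of why hash naming makes synchronization cheap: when a fetch reaches a commit hash the receiver already has, it can stop, because that commit's ancestry is already present. The `Repo` class and its commit encoding are invented for this sketch and are not Git's actual formats or protocol.

```python
import hashlib


def commit_id(message, parents):
    """Name a commit by hashing its parents' names and its message (toy encoding)."""
    return hashlib.sha1("\n".join(list(parents) + [message]).encode()).hexdigest()


class Repo:
    def __init__(self):
        self.parents = {}  # commit id -> list of parent ids

    def commit(self, message, parents=()):
        cid = commit_id(message, parents)
        self.parents[cid] = list(parents)  # purely local: no network involved
        return cid

    def fetch(self, other, head):
        """Copy commits reachable from `head` that we lack; return how many."""
        copied, stack = 0, [head]
        while stack:
            cid = stack.pop()
            if cid in self.parents:
                continue  # known hash: its ancestry is already here
            self.parents[cid] = list(other.parents[cid])
            copied += 1
            stack.extend(other.parents[cid])
        return copied


alice, bob = Repo(), Repo()
root = alice.commit("root")                    # zero parents
a1 = alice.commit("alice's work", [root])
first = bob.fetch(alice, a1)                   # bob gets root and a1
b1 = bob.commit("bob's work", [root])          # second head on the same root
back = alice.fetch(bob, b1)                    # only b1 travels
merge = alice.commit("merge", [a1, b1])        # two parents
last = bob.fetch(alice, merge)                 # only the merge travels
print(first, back, last)
```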
Mercurial
● Matt Mackall, 2005
● Similar to Git
● Uses a manifest instead of a tree
Leading Question
● What is the correct number of context lines to include in a patch?
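To see what the question is about, this sketch generates the same one-line change with different amounts of context, using Python's standard `difflib.unified_diff` and its `n` parameter. The sample file contents are invented.

```python
import difflib

old = [f"line {i}" for i in range(1, 10)]
new = list(old)
new[4] = "line 5 (edited)"


def patch(context):
    """Unified diff of old -> new with `context` lines around each change."""
    return list(difflib.unified_diff(old, new, "old", "new", n=context, lineterm=""))


def context_lines(context):
    """Count the unchanged lines carried along in the patch."""
    return sum(1 for line in patch(context) if line.startswith(" "))


for n in (0, 1, 3):
    print(f"n={n}: {context_lines(n)} context lines")
print("\n".join(patch(1)))
```

More context makes a patch harder to apply after nearby edits; less context makes it easier to apply in the wrong place.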
Darcs
● David Roundy
● metadata is a set of patches (not historical!)
Darcs
● Write the effect of applying patch P1, and then P2, as: P1 P2
● Suppose I have P1 P2 P3 P4
● It's easy to compute P1 P2 P3
● But P1 P3 P4 may or may not apply cleanly
Darcs
● Some pairs of patches P1 and P2 commute: there is some pair of patches P1' and P2' such that P1 P2 = P2' P1'
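A minimal sketch of commutation on a toy patch language. The `Insert` and `Replace` patch types and the `commute` function are hypothetical, invented for this example; they are not Darcs's actual patch representation or algorithm.

```python
import dataclasses
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str


@dataclass(frozen=True)
class Replace:
    pos: int
    old: str
    new: str


def apply(patch, lines):
    if isinstance(patch, Insert):
        return lines[:patch.pos] + [patch.text] + lines[patch.pos:]
    if lines[patch.pos] != patch.old:
        raise ValueError("patch does not apply cleanly")
    return lines[:patch.pos] + [patch.new] + lines[patch.pos + 1:]


def apply_all(patches, lines):
    for p in patches:
        lines = apply(p, lines)
    return lines


def commute(p1, p2):
    """Given P1 P2, return (P2', P1') with P1 P2 = P2' P1', or None if they don't commute."""
    if isinstance(p1, Insert):
        if isinstance(p2, Replace) and p2.pos == p1.pos:
            return None  # P2 edits the very line P1 created
        if isinstance(p2, Insert) and p2.pos <= p1.pos:
            return p2, dataclasses.replace(p1, pos=p1.pos + 1)
        if p2.pos > p1.pos:
            return dataclasses.replace(p2, pos=p2.pos - 1), p1
        return p2, p1  # a Replace strictly before the inserted line
    if isinstance(p2, Insert):
        shift = 1 if p2.pos <= p1.pos else 0
        return p2, dataclasses.replace(p1, pos=p1.pos + shift)
    if p2.pos == p1.pos:
        return None  # P2 rewrites the line P1 just rewrote
    return p2, p1


base = ["x", "y"]
p1, p2 = Insert(1, "a"), Insert(0, "b")
p2c, p1c = commute(p1, p2)
print(apply_all([p1, p2], base), apply_all([p2c, p1c], base))
print(commute(Insert(1, "a"), Replace(1, "a", "A")))
```

Note that the commuted patches are not the originals: positions shift to account for the changed order, which is why the slide writes P2' and P1'.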
The Toronto Idea
● Karl Fogel, in dire circumstances