Software Distribution with CernVM-FS
Jakob Blomer
CERN, EP-SFT
[email protected]
https://github.com/cvmfs/cvmfs
2019-05-17
1. Accelerate...
2. Collide...
3. Measure...
∙ Billions of independent “events”
∙ Each event subject to complex software processing
→ High-Throughput Computing
4. Analyze!
Computing, Federated
∙ Physics and computing: international collaborations
∙ “The Grid”: ∼160 data centers
∙ Approximately a global batch system
Software Delivery by a File System
Read: provide uniform, consistent, and versioned POSIX file system access to /cvmfs on grids, clouds, supercomputers, and end-user laptops

    $ ls /cvmfs/cms.cern.ch
    slc7_amd64_gcc700  slc7_ppc64le_gcc530  slc7_aarch64_gcc700  slc6_mic_gcc481  ...

Publish: populate and propagate new and updated content
∙ A few “software librarians” can publish into /cvmfs
∙ Transactional writes as in git commit/push
Scale of Deployment
∙ > 1 billion files under management
∙ HTTP data transport
∙ LHC infrastructure: 5 mirror servers, 400 web caches
Key Design Choices
[Figure: Software Publisher / Master Source (read/write file system) → transformation → content-addressed objects in a Merkle tree (a1240, 41ae3, ..., 7e95b) → HTTP transport, caching & replication → Worker Nodes (read-only file system)]
Several CDN options:
∙ Apache + Squids
∙ Ceph/S3
∙ Commercial CDN
Design principles:
∙ Reading and writing treated asymmetrically
∙ HTTP transport + caching
∙ Immutable objects, stateless services
∙ Consistency over availability
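To illustrate why immutable, content-addressed objects arranged in a Merkle tree make caching and versioning simple, here is a minimal Python sketch. It is not the CernVM-FS implementation: the names `object_id` and `merkle_root`, the use of SHA-1, and the in-memory dictionaries standing in for a directory tree are all assumptions made for illustration.

```python
import hashlib


def object_id(data: bytes) -> str:
    """Content address of a blob: hex SHA-1 of its bytes (hash choice is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def merkle_root(tree: dict) -> str:
    """Hash a nested dict (name -> bytes or sub-dict) bottom-up.

    A directory's hash is derived from the names and hashes of its children,
    so any change to a file changes the hash of every directory above it.
    Simplification: names are assumed not to contain ':' or newlines.
    """
    entries = []
    for name in sorted(tree):
        node = tree[name]
        child = merkle_root(node) if isinstance(node, dict) else object_id(node)
        entries.append(f"{name}:{child}")
    return object_id("\n".join(entries).encode())


# Two hypothetical software versions that differ in a single file.
v1 = {"lib": {"libfoo.so": b"foo-1", "libbar.so": b"bar-1"}, "README": b"hello"}
v2 = {"lib": {"libfoo.so": b"foo-2", "libbar.so": b"bar-1"}, "README": b"hello"}

if __name__ == "__main__":
    print("root v1:", merkle_root(v1))
    print("root v2:", merkle_root(v2))
    print("README shared:", object_id(v1["README"]) == object_id(v2["README"]))
```

Because an object's name is derived from its content, an object never changes once stored, so any HTTP cache can keep it indefinitely; only the root hash has to be refreshed to see a new version.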
Reading
[Figure: client stack of basic system utilities, CernVM-FS, Fuse, and OS kernel, connected to a global HTTP cache hierarchy]
Cache hierarchy, from client to source:
  File system memory buffer: ∼100 MB
  Persistent cache: ∼20 GB
  CernVM-FS repository (HTTP or S3), all versions available: 1 TB – 10 TB
∙ Fuse based, independent mount points, e.g. /cvmfs/atlas.cern.ch
∙ High cache efficiency because the entire cluster is likely to use the same software
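The read path through the cache tiers can be sketched as follows. This is a simplified model under stated assumptions, not the client's actual code: `TieredCache`, `digest_of`, and the dictionary standing in for the remote repository are hypothetical, the memory buffer is modelled as a small LRU, and the persistent cache as an unbounded dictionary.

```python
import hashlib
from collections import OrderedDict


def digest_of(data: bytes) -> str:
    """Content address of an object (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


class TieredCache:
    """Look up objects in memory first, then on disk, then remotely."""

    def __init__(self, fetch_remote, memory_slots=2):
        self.memory = OrderedDict()       # small LRU buffer
        self.disk = {}                    # stands in for the persistent cache
        self.fetch_remote = fetch_remote  # callable: digest -> bytes
        self.memory_slots = memory_slots
        self.remote_fetches = 0

    def get(self, digest: str) -> bytes:
        if digest in self.memory:
            self.memory.move_to_end(digest)  # mark as most recently used
            return self.memory[digest]
        data = self.disk.get(digest)
        if data is None:
            data = self.fetch_remote(digest)
            self.remote_fetches += 1
            # Content addressing makes corruption checks trivial.
            if digest_of(data) != digest:
                raise ValueError(f"corrupt object {digest}")
            self.disk[digest] = data
        self.memory[digest] = data
        if len(self.memory) > self.memory_slots:
            self.memory.popitem(last=False)  # evict least recently used
        return data


# Hypothetical repository content and two passes over the same objects.
blobs = [b"libA", b"libB", b"libC"]
repository = {digest_of(b): b for b in blobs}
cache = TieredCache(repository.__getitem__, memory_slots=2)
for blob in blobs + blobs:
    cache.get(digest_of(blob))

if __name__ == "__main__":
    print("remote fetches:", cache.remote_fetches)
```

In this toy run the second pass is served entirely from the two local tiers, which mirrors the point that a cluster running the same software rarely needs to go back to the repository.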
Writing
[Figure: read/write interface provided by a union file system (AUFS) stacked on the read-only CernVM-FS mount and a staging area; published content is stored on a file system or S3]
∙ Kernel-level union file system: AUFS, OverlayFS
Publishing new content

    [ ~ ]# cvmfs_server transaction containers.cern.ch
    [ ~ ]# cd /cvmfs/containers.cern.ch && tar xvf ubuntu1610.tar.gz
    [ ~ ]# cvmfs_server publish containers.cern.ch
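The transaction/publish cycle on top of a union file system can be modelled in a few lines of Python. This is only a toy model: the class `UnionMount`, its method names, and the use of dictionaries for both layers are assumptions, whereas the real setup relies on a kernel-level union file system and the `cvmfs_server` commands shown on the slide.

```python
class UnionMount:
    """Toy union of a read-only lower layer and a writable staging area."""

    _DELETED = object()  # whiteout marker for files removed in a transaction

    def __init__(self, published: dict):
        self.lower = dict(published)  # read-only published state
        self.upper = None             # staging area, exists only in a transaction

    def transaction(self):
        if self.upper is not None:
            raise RuntimeError("transaction already open")
        self.upper = {}

    def write(self, path: str, data: bytes):
        if self.upper is None:
            raise PermissionError("read-only outside a transaction")
        self.upper[path] = data

    def remove(self, path: str):
        if self.upper is None:
            raise PermissionError("read-only outside a transaction")
        self.upper[path] = self._DELETED

    def read(self, path: str) -> bytes:
        # The upper layer shadows the lower layer.
        if self.upper is not None and path in self.upper:
            entry = self.upper[path]
            if entry is self._DELETED:
                raise FileNotFoundError(path)
            return entry
        if path not in self.lower:
            raise FileNotFoundError(path)
        return self.lower[path]

    def publish(self):
        """Fold the staging area into the published state and close the transaction."""
        if self.upper is None:
            raise RuntimeError("no open transaction")
        for path, entry in self.upper.items():
            if entry is self._DELETED:
                self.lower.pop(path, None)
            else:
                self.lower[path] = entry
        self.upper = None


mount = UnionMount({"/old.txt": b"v1"})
mount.transaction()
mount.write("/new.txt", b"v2")
mount.remove("/old.txt")
mount.publish()
```

Readers only ever see the lower layer, so a half-finished transaction is never visible; the published state changes in one step when `publish()` is called.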
Use of Content-Addressable Storage
[Figure: files under /cvmfs/alice.cern.ch (amd64-gcc6.0/4.2.0/ChangeLog, ...) pass through compression and hashing (806fbb67373e9...) into the repository, which consists of the object store and the file catalogs]
Object Store
∙ Compressed files and chunks
∙ De-duplicated
File Catalog
∙ Directory structure, symlinks
∙ Content hashes of regular files
∙ Digitally signed ⇒ integrity, authenticity
∙ Time to live
∙ Partitioned / Merkle hashes (possibility of sub catalogs)
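The split between object store and file catalog can be sketched as below. This is an illustrative model rather than the CernVM-FS on-disk format: `ObjectStore`, `publish_tree`, the choice of SHA-1 and zlib, and hashing the uncompressed bytes are all assumptions, and chunking, signatures, and time to live are left out.

```python
import hashlib
import zlib


class ObjectStore:
    """De-duplicating store of compressed objects keyed by content hash."""

    def __init__(self):
        self.objects = {}  # digest -> compressed bytes

    def put(self, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.objects.setdefault(digest, zlib.compress(data))
        return digest

    def get(self, digest: str) -> bytes:
        data = zlib.decompress(self.objects[digest])
        # Re-hashing on read detects corruption.
        if hashlib.sha1(data).hexdigest() != digest:
            raise ValueError(f"corrupt object {digest}")
        return data


def publish_tree(files: dict, store: ObjectStore) -> dict:
    """Store every file and return a catalog mapping path -> content hash."""
    return {path: store.put(data) for path, data in sorted(files.items())}


# Hypothetical tree: two versions share an identical library.
files = {
    "17.1.0/libfoo.so": b"foo",
    "17.2.0/libfoo.so": b"foo",
    "17.2.0/libbar.so": b"bar",
}
store = ObjectStore()
catalog = publish_tree(files, store)

if __name__ == "__main__":
    print(len(files), "paths,", len(store.objects), "stored objects")
```

The catalog carries the directory structure while the object store carries only unique content, which is why a new release that repeats most files of the previous one adds few objects.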
⇒ Immutable files, trivial to check for corruption, versioning

Why a file system?
HEP Software Challenge
Software stack, from changing (top) to stable (bottom):
  Individual Analysis Code: 0.1 MLOC
  Experiment Software Framework: 4 MLOC
  High Energy Physics Libraries: 5 MLOC
  Compiler, System Libraries, OS Kernel: 20 MLOC
Key Figures
∙ Hundreds of (novice) developers
∙ Hundred million binaries
∙ 1 TB / day of nightly builds
∙ ∼100 000 machines world-wide
∙ Daily production releases, remain available “eternally”
→ too much for a packaging approach
Actual new content
[Figure: software directory tree of atlas.cern.ch (repo/software/x86_64-gcc43/17.1.0, 17.2.0, ...)]
[Figure: file system entries [×10^6] (directories, symlinks, duplicates, file kernel), statistics over 2 years]
Between consecutive software versions: only ∼15 % new files
At runtime, only a tiny fraction of files is actually accessed
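The share of genuinely new content between two releases can be estimated by comparing content hashes rather than paths. The sketch below is a made-up example, not a measurement: `catalog_of` and `new_object_fraction` are hypothetical helpers, and the four-file releases are invented to keep the arithmetic visible.

```python
import hashlib


def catalog_of(files: dict) -> dict:
    """Map each path to the hash of its content (SHA-1 is an assumption)."""
    return {path: hashlib.sha1(data).hexdigest() for path, data in files.items()}


def new_object_fraction(old: dict, new: dict) -> float:
    """Fraction of files in `new` whose content does not occur anywhere in `old`."""
    if not new:
        return 0.0
    known = set(old.values())
    fresh = [path for path, digest in new.items() if digest not in known]
    return len(fresh) / len(new)


# Every path changes because of the version directory, but only one file's content does.
old_release = catalog_of({
    "17.1.0/bin/run": b"run",
    "17.1.0/lib/liba.so": b"a-1",
    "17.1.0/lib/libb.so": b"b-1",
    "17.1.0/share/geometry.db": b"geo",
})
new_release = catalog_of({
    "17.2.0/bin/run": b"run",
    "17.2.0/lib/liba.so": b"a-2",
    "17.2.0/lib/libb.so": b"b-1",
    "17.2.0/share/geometry.db": b"geo",
})

if __name__ == "__main__":
    print(new_object_fraction(old_release, new_release))
```

A path-based comparison would count all four files as new here; comparing by content counts only the one file that actually changed.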
Container Amplification
[Figure: container contents (Linux, libs, ...)]
∙ Containers are easier to create than to roll out at scale
∙ Ideally: containers for isolation and a software file system for distribution
Custom Graph Driver / Snapshotter Approach
Ideally: native support for (some) unpacked layers on a read-only file system