Software Distribution with Cernvm-FS Webapi Jakob Blomer CERN, EP-SFT [email protected] 2019-05-17 Micro Agent Cloud
Total Page:16
File Type:pdf, Size:1020Kb
Agent Cloud Book Keeper Gateway Software Distribution with CernVM-FS WebAPI Jakob Blomer CERN, EP-SFT [email protected] https://github.com/cvmfs/cvmfs 2019-05-17 Micro Agent Cloud 1. Accelerate. Book Keeper Gateway WebAPI Micro 1 Agent Cloud 2. Collide. Book Keeper Gateway WebAPI Micro 2 Agent Cloud 3. Measure. Book Keeper Gateway ∙ Billions of independent “events” WebAPI ∙ Each event subject to complex software processing → High-Throughput Computing Micro 3 Agent Cloud 4. Analyze! Book Keeper Gateway WebAPI Micro 4 Agent Cloud Computing, Federated Book Keeper Gateway ∙ Physics and computing: international collaborations ∙ “The Grid”: ∼160 data centers ∙ Approx.: global batch system WebAPI Micro 5 Agent Cloud Software Delivery by a File System ¶ Provide uniform, consistent, and versioned POSIX file system access to /cvmfs Book Keeper $ ls /cvmfs/cms.cern.ch slc7_amd64_gcc700 slc7_ppc64le_gcc530 slc7_aarch64_gcc700 slc6_mic_gcc481 ... on grids, clouds, supercomputers and end user laptops Gateway read publish WebAPI · Populate and propagate new and updated content ∙ A few “software librarians” can publish into /cvmfs ∙ Transactional writes as in git commit/push Micro 6 Agent Cloud Scale of Deployment Book Keeper Gateway WebAPI ∙ > 1 billion files under management ∙ HTTP data transport ∙ LHC infrastructure: Micro 5 mirror servers, 400 web caches 7 Agent Cloud Key Design Choices Book Keeper Transformation HTTP Transport a1240 Caching & Replication 41ae3 ... 7e95b Gateway Read-Only Content-Addressed Objects, Read/Write File System Merkle Tree File System Worker Nodes Software Publisher / Master Source WebAPI ∙ Reading and writing treated asymmetrically ∙ HTTP transport + caching ∙ Immutable objects, stateless services ∙ Consistency over availability Micro 8 Agent Cloud Key Design Choices Book Keeper Transformation HTTP Transport a1240 Caching & Replication 41ae3 ... 7e95b Read-Only Several CDN options: Content-Addressed Objects, Read/Write Gateway File System Merkle Tree File System ∙ Apache + Squids ∙ Ceph/S3 Worker Nodes ∙ Commercial CDN Software Publisher / Master Source WebAPI ∙ Reading and writing treated asymmetrically ∙ HTTP transport + caching ∙ Immutable objects, stateless services ∙ Consistency over availability Micro 8 Agent Cloud Reading Book Keeper CernVM-FS rAA Basic System Utilities Global HTTP Cache Hierarchy Gateway Fuse OS Kernel File System CernVM-FS Repository (HTTP or S3) Memory Buffer Persistent Cache All Versions Available WebAPI ∼100 MB ∼20 GB 1 TB – 10 TB ∙ Fuse based, independent mount points, e. g. /cvmfs/atlas.cern.ch Micro ∙ High cache effiency because entire cluster likely to use same software 9 Agent Cloud Writing Book Keeper Staging Area CernVM-FS Read-Only AUFS ∙ Kernel-level union file system: (Union File System) AUFS, OverlayFS Read/Write Gateway Interface File System, S3 WebAPI Publishing new content [ ~ ]# cvmfs_server transaction containers.cern.ch [ ~ ]# cd /cvmfs/containers.cern.ch && tar xvf ubuntu1610.tar.gz [ ~ ]# cvmfs_server publish containers.cern.ch Micro 10 Agent Cloud Use of Content-Addressable Storage /cvmfs/alice.cern.ch Object Store ∙ Compressed files and chunks Book Keeper amd64-gcc6.0 ∙ De-duplicated 4.2.0 ChangeLog . File Catalog . Gateway ∙ Directory structure, symlinks Compression, Hashing 806fbb67373e9... ∙ Content hashes of regular files ∙ Digitally signed WebAPI Repository ⇒ integrity, authenticity ∙ Time to live Object Store File catalogs ∙ Partitioned / Merkle hashes (possibility of sub catalogs) Micro 11 ⇒ Immutable files, trivial to check for corruption, versioning Why a file system? 12 Agent Cloud HEP Software Challenge Book Keeper Individual 0.1 MLOC Key Figures Analysis Code ∙ Hundreds of (novice) developers Gateway changing ∙ Hundred million binaries Experiment 4 MLOC Software Framework ∙ 1 TB / day of nightly builds ∙ ∼100 000 machines world-wide High Energy Physics 5 MLOC WebAPI Libraries ∙ Daily production releases, remain available “eternally” Compiler 20 MLOC System Libraries stable OS Kernel → too much for a packaging approach Micro 13 Agent Cloud Actual new content Software Directory Tree Book Keeper atlas.cern.ch 15 ] Directories repo 6 10 Symlinks × software Gateway 10 x86_64-gcc43 Duplicates 17.1.0 5 WebAPI File System Entries [ 17.2.0 . 1 File Kernel . Statistics over 2 Years Micro Between consecutive software versions: only ∼15 % new files At runtime only tiny fraction of files actually accessed 14 Agent Cloud Container Amplification Book Keeper Linux Libs ... Gateway ∙ Containers are easier to create than to role-out at scale WebAPI ∙ Ideally: containers for isolation and a software file system for distribution Micro 15 Agent Cloud Custom Graph Driver / Snapshotter Approach Book Keeper Gateway WebAPI Ideally: native support for (some) unpacked layers on a read-only file sytem Micro 16.