The State of Linux Containers
Gaikai
PS Now announcement at CES 2014
Gaikai
- caching
+ controller feedback
Agenda
1. “Linux Container” / “Docker Ecosystem” in a Nutshell
2. Confusion about Ecosystem / Vision to tackle it
3. Docker -> SWARM -> SLURM -> BigData
4. Discussion of Opportunities and Problems
The Bits and Pieces…
Linux Containers
Containers do not spin up a distinct kernel
all containers & the host share the same kernel
user-lands are independent
they are separated by Kernel Namespaces
[Figure: Traditional Virtualisation (SERVER, HOST KERNEL, HYPERVISOR; each VM with its own KERNEL, Userland (OS) and SERVICE) vs. Containerisation (SERVER, HOST KERNEL; each container with its own Userland (OS), e.g. Ubuntu:14.04, Ubuntu:15.10, RHEL7.2, Tiny Core Linux, and SERVICE)]
Kernel Namespaces
Containers are ‘grouped processes’
isolated by Kernel Namespaces
resource restrictions applicable through CGroups (disk/netIO)
Namespaces: PID, Network, Mount, IPC, UTS
[Figure: HOST with container1 to container4 running processes such as bash, ls -l, consul, apache, mysqld, slurmd, ssh]
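One way to look at these namespaces from the host is to read the symlinks under /proc/<pid>/ns. The following is a minimal sketch, assuming a Linux host with procfs mounted; the helper names `list_namespaces` and `shared_namespaces` and the `proc_root` parameter are made up for illustration.

```python
import os


def list_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace name: link target} for one process.

    On Linux each entry under /proc/<pid>/ns is a symlink such as
    'pid:[4026531836]'; two processes share a namespace when the
    link targets are identical.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    namespaces = {}
    if not os.path.isdir(ns_dir):
        return namespaces  # not Linux, or the process does not exist
    for name in sorted(os.listdir(ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Reading another user's process may need more privileges.
            continue
    return namespaces


def shared_namespaces(a, b):
    """Names of the namespaces that two list_namespaces() results share."""
    return sorted(n for n in a if n in b and a[n] == b[n])
```

Comparing `list_namespaces()` of a host shell with `list_namespaces(<pid of a containerised process>)` would typically show different targets for pid, net, mnt, ipc and uts, while two processes inside the same container report identical targets.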
Docker Engine
Container Runtime Daemon
creates/…/removes containers, exposes REST API
handles Namespaces, CGroups, bind-mounts, etc.
IP connectivity by default via ‘host-only’ network bridge
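The REST API mentioned above can be queried without the docker CLI. This is a minimal sketch using only the standard library, assuming the engine listens on its default Unix socket /var/run/docker.sock and using the Engine API endpoint `GET /containers/json`; the class and function names are illustrative.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def list_containers(socket_path="/var/run/docker.sock"):
    """Return the JSON list of running containers reported by the engine."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", "/containers/json")
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()
```

Calling `list_containers()` needs read access to the socket, which usually means root or membership in the docker group.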
[Figure: container1 and container2 attached to the bridge docker0 of the Docker-Engine on SERVER, which is connected via eth0]
Docker Compose
Describes stack of container configurations
instead of writing a small bash script…
… it holds the runtime configuration as YAML file.
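To illustrate what such a declarative description replaces, the sketch below turns a stack description into the `docker run` lines one would otherwise put into a bash script. The stack (services `consul` and `web`, their images, ports and paths) is hypothetical, the dictionary keys merely mirror common Compose options, and this is not how Compose works internally.

```python
import shlex

# Hypothetical two-service stack; keys mirror common Compose options.
STACK = {
    "consul": {"image": "consul", "ports": ["8500:8500"]},
    "web": {
        "image": "nginx",
        "ports": ["8080:80"],
        "environment": {"CONSUL_HOST": "consul"},
        "volumes": ["/srv/html:/usr/share/nginx/html"],
    },
}


def docker_run_command(name, service):
    """Build the 'docker run' line equivalent to one service entry."""
    args = ["docker", "run", "-d", "--name", name]
    for port in service.get("ports", []):
        args += ["-p", port]
    for key, value in sorted(service.get("environment", {}).items()):
        args += ["-e", f"{key}={value}"]
    for volume in service.get("volumes", []):
        args += ["-v", volume]
    args.append(service["image"])
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    for name, service in STACK.items():
        print(docker_run_command(name, service))
```

Keeping the description as data rather than as a script means the same file can be diffed, versioned and handed to another engine or to Swarm.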
Docker Networking
Docker Networking spans networks across engines
KV-store to synchronise (Zookeeper, etcd, Consul)
VXLAN to pass messages along
[Figure: container0, container1, containerN on a network named "global" spanning the Docker-Engines on SERVER0, SERVER1, SERVER; each server runs Consul, together forming the Consul DC]
Docker Swarm
Docker Swarm proxies docker-engines
serves an API endpoint in front of multiple docker-engines
makes placement decisions.
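A placement decision of this kind can be pictured as "filter the nodes, then pick one". The sketch below is a toy model, not Swarm's implementation: it only understands constraint expressions of the form `constraint:<key>==<value>`, and the "fewest containers wins" rule is an assumption chosen for illustration.

```python
def parse_constraint(expr):
    """Split 'constraint:node==SERVER0' into ('node', 'SERVER0')."""
    prefix, _, rule = expr.partition(":")
    if prefix != "constraint" or "==" not in rule:
        raise ValueError(f"unsupported constraint: {expr!r}")
    key, _, value = rule.partition("==")
    return key, value


def place(nodes, constraints=()):
    """Pick a node name for a new container.

    nodes: list of dicts such as {"node": "SERVER0", "containers": 3}
    constraints: strings such as "constraint:node==SERVER0"
    """
    candidates = list(nodes)
    for expr in constraints:
        key, value = parse_constraint(expr)
        candidates = [n for n in candidates if n.get(key) == value]
    if not candidates:
        raise RuntimeError("no node satisfies the constraints")
    # Assumed strategy: prefer the node running the fewest containers.
    return min(candidates, key=lambda n: n["containers"])["node"]
```

With nodes SERVER0 (3 containers) and SERVER1 (1 container), `place(nodes)` picks SERVER1, while `place(nodes, ["constraint:node==SERVER0"])` pins the container to SERVER0.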
[Figure: swarm-master (:2375) in front of swarm-client and Docker-Engine (:2376) on SERVER0, SERVER1, SERVER; container1 placed with -e constraint:node==SERVER0]
Docker Swarm [cont]
query docker-engine
query docker-swarm
Introduce new Technologies
Introducing new Tech
Self-perception when introducing new tech…
credit: TF2 - Meet the Pyro
Introducing new Tech
… not always the same as the perception of others.
credit: TF2 - Meet the Pyro
Docker Buzzword Chaos!
Distributions, On-Premise & OverSpill, Auto-Scaling, Orchestration, Solutions, self-healing, production-ready, enterprise-grade
Vision
1. No special distributions
useful for certain use-cases, such as elasticity and green-field deployment
not so much for an on-premise datacenter w/ legacy in it.
2. Leverage existing processes/resources
install workflow, syslog, monitoring
security (ssh infrastructure), user auth.
3. Keep up with the docker ecosystem
incorporate new features of engine, swarm, compose
networking, volumes, user-namespaces
Reduce to the max!
Testbed
Hardware (courtesy of )
8x Sun Fire x2250 (2x 4core XEON, 32GB, Mellanox ConnectX-2)
Software
Base installation
CentOS 7.2 base installation (updated from 7-alpha)
Ansible
consul, sensu
docker v1.10, docker-compose
docker SWARM
Docker Networking
Synchronised by Consul
[Figure: node1, node2, node8, each running Docker-Engine and Consul; the Consul agents form the Consul DC]
Docker SWARM
Docker SWARM
Synchronised by Consul KV-store
[Figure: swarm master on node8, swarm on node2 and node1; each node running Docker-Engine and Consul; Consul DC and SWARM span all nodes]
SLURM Cluster
SLURM within SWARM
[Figure: swarm master on node8 running slurmctld and slurmd; swarm on node2 and node1 running slurmd; each node with Docker-Engine and Consul; Consul DC, SWARM and SLURM span all nodes]
SLURM Cluster [cont]
SLURM within SWARM
slurmd within app-container
pre-stage containers
[Figure: swarm master on node8 running slurmctld; node8, node2 and node1 each running slurmd inside an hpcg container on Docker-Engine with Consul; Consul DC, SWARM and SLURM span all nodes]
MPI Benchmark
http://qnib.org/mpi
http://qnib.org/mpi-paper
SLURM Cluster [cont]
SLURM within SWARM
slurmd within app-container
pre-stage containers
[Figure: swarm master on node8 running slurmctld; node8, node2 and node1 each running slurmd with hpcg and OpenFOAM containers on Docker-Engine with Consul; Consul DC, SWARM and SLURM span all nodes]
OpenFOAM Benchmark
http://qnib.org/immutable
http://qnib.org/immutable-paper
Samza Cluster
Distributed Samza
Zookeeper and Kafka cluster
Samza instances to run jobs
[Figure: swarm master on node8, swarm on node2 and node1; each node running zookeeper, kafka and samza on Docker-Engine with Consul; Consul DC and SWARM span all nodes]
$ cat test.log | awk '{print $1}' | sed -e 's/HPC/BigData/g' | tee out.log
To Be Explored
Small vs. Big
1. What to base images on?
Ubuntu/Fedora: ~200MB
Debian: ~100MB
Alpine Linux: 5MB (musl-libc)
2. Trim the images down at all cost?
How about debugging tools? Possibility to run tools on the host and ‘inspect’ namespaced processes inside of a container.
If PID-sharing arrives, carving out (e.g.) monitoring could be a thing.
One vs. Many Processes
1. In an ideal world…
a container only runs one process, e.g. the HPC solver.
2. In reality…
MPI wants to connect to an sshd on each of the job peers
monitoring, syslog, service discovery should be present as well.
3. How fast / aggressively to break with traditional approaches?
Docker Network
Plugin System
VXLAN
MACVLAN
How about IPoIB?
Reproducibility / Downscaling
Running OpenFOAM on small scale is cumbersome
manually install OpenFOAM on a workstation
be confident that the installation works correctly
A containerised OpenFOAM installation tackles both
http://qnib.org/immutable-paper
http://qnib.org/immutable
Service Discovery
1. Since the environments are rather dynamic…
how do the containers discover services?
external registry as part of the framework?
discovery service as part of the container stacks?
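One possible answer to these questions is to reuse the Consul agents that already run on every node of the testbed. The sketch below queries Consul's HTTP catalog endpoint `/v1/catalog/service/<name>`; the agent address `http://127.0.0.1:8500` is an assumed default and the function names are illustrative.

```python
import json
import urllib.request


def endpoints(catalog_entries):
    """Turn Consul catalog entries into (host, port) tuples.

    Consul leaves 'ServiceAddress' empty when the service did not
    register its own address; the node's 'Address' is used instead.
    """
    result = []
    for entry in catalog_entries:
        host = entry.get("ServiceAddress") or entry["Address"]
        result.append((host, entry["ServicePort"]))
    return result


def discover(service, consul_url="http://127.0.0.1:8500"):
    """Ask a Consul agent where instances of a service are running."""
    url = f"{consul_url}/v1/catalog/service/{service}"
    with urllib.request.urlopen(url, timeout=5) as response:
        return endpoints(json.load(response))
```

A container could then call `discover("slurmctld")` at start-up, where the service name is whatever the stack registered, instead of relying on a baked-in hostname.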
Orchestration Frameworks
With Docker Swarm it is rather easy
to spin up a Kubernetes or Mesos cluster within Swarm.
[Figure: swarm-master in front of swarm-client and Docker-Engine on SERVER0, SERVER1, SERVER; on top, a Kubernetes stack with scheduler and apiserver plus kubelet and etcd on each server]
Immutable vs. Config Mgmt
1. Containers should be controlled via ENV or flags
External access/change of a running container is discouraged
2. Configuration management
Downgraded to bootstrap a host?
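Point 1 can be sketched as an entrypoint that takes all of its settings from the environment, so that the image itself never has to be modified. The variable names `SOLVER_THREADS`, `CONSUL_HOST` and `DEBUG` and their defaults are hypothetical.

```python
import os


def load_config(environ=None):
    """Read the container's runtime configuration from ENV variables."""
    env = os.environ if environ is None else environ
    return {
        # Hypothetical settings with defaults for a local test run.
        "solver_threads": int(env.get("SOLVER_THREADS", "1")),
        "consul_host": env.get("CONSUL_HOST", "127.0.0.1"),
        "debug": env.get("DEBUG", "false").lower() in ("1", "true", "yes"),
    }
```

The same image then behaves differently when started with, for example, `-e SOLVER_THREADS=8`, without anyone logging into the running container.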
Continuous Dev./Integration
If containers are immutable within the pipeline
testing/deployment should be automated
developers should have a production replica
Docker Momentum
IT Tinkering (Hello World)
Continuous Dev/Int/Dep
Microservices, hyper scale
Big Data
High Performance Computing
[Figure labels: DatacenterOps, HPC, Software Dev]
Disclaimer: subjective exaggeration
Docker in Software Development
Spinning up a production-like environment is great
MongoDB, PostgreSQL, memcached as separate containers
python2.7, python3.4
Like python’s virtualenv on steroids, iteration speedup through reproducibility
Docker in HPC development
Spinning up a production-like environment is…
…not that easy
focus more on engineer/scientist, not the software-developer
1. For development it might work
close to non-HPC software dev
2. But is that the iteration-focus?
rather job settings / input data?
Separation of Concerns?
Split input iteration / development from operation
non-distributed stays vanilla
transition to HPC cluster using tech to foster operation
Input/Dev
http://gmkurtzer.github.io/singularity
containerd Integration
Docker-Engine 1.11 will not be the parent of containers
runC usage under the hood
Recap aka. IMHO
1. Separate Dev and Ops
don’t block the momentum fostering iteration speed in Development
2. Use vanilla docker-tech
keep up with the ecosystem and prevent vendor/ecosystem lock-in
3. 80/20 rule
have caveats on the radar but don’t bother too much
everything is so fast moving - it’s hard to predict
Q&A
https://github.com/qnib/hpcac-cluster2016
23.06.2016, eGalea Workshop (Pisa)