The State of Linux Containers

Gaikai

PS Now announcement at CES 2014

2 Gaikai

- caching

+ controller feedback

3 Agenda

1. “Linux Container” / “Docker Ecosystem” in a Nutshell
2. Confusion about Ecosystem / Vision to tackle it
3. Docker -> SWARM -> SLURM -> BigData
4. Discussion of Opportunities and Problems

4 The Bits and Pieces…

5 Linux Containers

Containers do not spin up a distinct kernel

all containers & the host share the same kernel

user-lands are independent

they are separated by Kernel Namespaces

[Figure: traditional virtualisation (each service in a guest userland with its own kernel on top of a hypervisor) vs. containerisation (services in independent userlands — Ubuntu 14.04, Ubuntu 15.10, RHEL 7.2, Tiny Core Linux — all sharing the host kernel)]

6 Kernel Namespaces

Containers are ‘grouped processes’

isolated by Kernel Namespaces

resource restrictions can be applied through CGroups (disk/net I/O)

[Figure: a host running four containers (container1–container4), each an isolated group of processes such as bash, consul, apache, mysqld, slurmd, or sshd, separated by the PID, Network, Mount, IPC, and UTS namespaces]
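
To see the isolation in practice — a minimal sketch; any small image such as alpine works, and the CGroup flags are only illustrative:

$ docker run --rm alpine ps                              # the container sees only its own PID namespace (ps is PID 1)
$ sudo unshare --pid --fork --mount-proc ps              # the same PID isolation without Docker, via unshare(1)
$ docker run --rm -m 256m --cpu-shares 512 alpine true   # CGroup limits are attached when the container starts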

7 Docker Engine

Container Runtime Daemon

creates/…/removes containers, exposes REST API

handles Namespaces, CGroups, bind-mounts, etc.

IP connectivity by default via a ‘host-only’ network bridge (docker0)
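
The REST API can be exercised directly over the engine's UNIX socket — a sketch; API version v1.22 matches docker v1.10, and curl needs to be 7.40+ for --unix-socket:

$ curl --unix-socket /var/run/docker.sock http://localhost/v1.22/containers/json   # list running containers
$ curl --unix-socket /var/run/docker.sock http://localhost/v1.22/info              # engine and host information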

[Figure: container1 and container2 attached to the docker0 bridge; the Docker Engine wires docker0 to the server's eth0]

8 Docker Compose

Describes a stack of container configurations

instead of writing a small bash script…

… it holds the runtime configuration in a YAML file.
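
A minimal sketch of such a docker-compose.yml — service names and images are illustrative only:

$ cat > docker-compose.yml <<'EOF'
web:
  image: nginx
  ports:
    - "8080:80"
  links:
    - db
db:
  image: postgres
EOF
$ docker-compose up -d    # brings the whole stack up in one go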

9 Docker Networking

Docker Networking spans networks across engines

a KV-store to synchronise state (ZooKeeper, etcd, Consul)

VXLAN to carry the container traffic between the hosts

[Figure: overlay network ‘global’ spanning container0…containerN across the Docker Engines on SERVER0, SERVER1, …; one Consul agent per node forms the Consul DC that holds the network state]
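
Setting this up with vanilla docker v1.10 roughly looks as follows — a sketch, assuming a Consul agent already listens on each host; the network name 'global' is taken from the figure:

$ docker daemon -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2376 \
    --cluster-store=consul://127.0.0.1:8500 \
    --cluster-advertise=eth0:2376               # normally set in the daemon's unit/OPTIONS file
$ docker network create -d overlay --subnet=10.0.0.0/24 global
$ docker run -d --net=global --name=container0 alpine sleep 3600   # reachable from any engine sharing the KV-store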

10 Docker Swarm

Docker Swarm proxies docker-engines

serves an API endpoint in front of multiple docker-engines

makes placement decisions.

[Figure: the swarm-master listens on :2375 in front of swarm-clients and Docker Engines (:2376) on SERVER0, SERVER1, …; a constraint such as ‘-e constraint:node==SERVER0’ pins container1 to a specific node]

11 Docker Swarm [cont]

query docker-engine

query docker-swarm
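
The difference is just the endpoint the client talks to — a sketch with classic Swarm; host names follow the figure, nginx stands in for any image:

$ docker -H tcp://SERVER0:2376 info            # one engine: only its own containers and resources
$ docker -H tcp://swarm-master:2375 info       # the swarm endpoint: all engines, aggregated
$ docker -H tcp://swarm-master:2375 run -d \
    -e constraint:node==SERVER0 nginx          # the swarm scheduler pins this container to SERVER0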

12 Introduce new Technologies

13 Introducing new Tech

Self-perception when introducing new tech…

credit: TF2 - Meet the Pyro

14 Introducing new Tech

… not always the same as the perception of others.

credit: TF2 - Meet the Pyro

15 Docker Buzzword Chaos!

Distributions On-Premise & OverSpill Auto-Scaling Orchestration

Solutions self-healing production-ready enterprise-grade

16 Vision

1. No special distributions

useful for certain use-cases, such as elasticity and green-field deployment

not so much for an on-premise datacenter w/ legacy in it.

2. Leverage existing processes/resources

install workflow, syslog, monitoring

security (ssh infrastructure), user auth.

3. Keep up with the docker ecosystem

incorporate new features of engine, swarm, compose

networking, volumes, user-namespaces

17 Reduce to the max!

18 Testbed

Hardware (courtesy of )

8x Sun Fire X2250 (2x 4-core XEON, 32GB RAM, Mellanox ConnectX-2)

Software

Base installation

CentOS 7.2 base installation (updated from 7-alpha)

Ansible

consul, sensu

docker v1.10, docker-compose

docker SWARM

19 Docker Networking

Synchronised by Consul

[Figure: node1, node2, …, node8 each run a Docker Engine and a Consul agent; the agents form one Consul DC]

20 Docker SWARM

Docker SWARM

Synchronised by the Consul KV-store

[Figure: node8 acts as swarm master; node1, node2, …, node8 each run a swarm agent, a Docker Engine, and a Consul agent inside the Consul DC]
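
Bootstrapping classic Swarm against that Consul KV-store takes a handful of containers — a sketch; the node1 Consul address and ports are illustrative:

$ docker run -d -p 2375:2375 swarm manage -H :2375 consul://node1:8500/swarm   # on node8, the swarm master
$ docker run -d swarm join --advertise=$(hostname -i):2376 \
    consul://node1:8500/swarm                                                   # on every node
$ docker -H tcp://node8:2375 ps                                                 # the whole testbed behaves like one engine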

21 SLURM Cluster

SLURM within SWARM

[Figure: node8 (swarm master) runs slurmctld and slurmd containers, node1, node2, … run slurmd containers — all started through Swarm and synchronised via the Consul DC]
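
Placing the daemons through the Swarm endpoint boils down to constraints — a sketch with hypothetical image names (slurm-ctld, slurm-d) and the overlay network from the earlier example:

$ docker -H tcp://node8:2375 run -d --net=global \
    -e constraint:node==node8 --name=slurmctld slurm-ctld    # controller pinned to the swarm master
$ for n in node1 node2 node8; do
    docker -H tcp://node8:2375 run -d --net=global \
      -e constraint:node==$n --name=slurmd-$n slurm-d
  done                                                        # one slurmd container per node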

22 SLURM Cluster [cont]

23 SLURM Cluster [cont]

SLURM within SWARM

slurmd within the app-container

[Figure: HPCG app-containers (with slurmd inside) pre-staged on node1, node2, …, node8; slurmctld on the swarm master (node8); Docker Engine and Consul per node as before]

24 MPI Benchmark

http://qnib.org/mpi

http://qnib.org/mpi-paper

25 SLURM Cluster [cont]

SLURM within SWARM

slurmd within app-container

[Figure: same layout, now with OpenFOAM app-containers (slurmd inside) pre-staged on node1, node2, …, node8; slurmctld on the swarm master (node8)]

26 OpenFOAM Benchmark

http://qnib.org/immutable

http://qnib.org/immutable-paper

27 Samza Cluster

Distributed Samza

Zookeeper and Kafka cluster

[Figure: node1, node2, …, node8 each run ZooKeeper, Kafka, and a Samza container on their Docker Engine; the Samza instances run the jobs; swarm master and Consul DC as before]

$ cat test.log | awk '{print $1}' | sed -e 's/HPC/BigData/g' | tee out.log

28 To Be Explored

29 Small vs. Big

1. What to base the images on?

Ubuntu/Fedora: ~200MB

Debian: ~100MB

Alpine Linux: ~5MB (musl-libc)

2. Trim the images down at all costs?

What about debugging tools? One possibility is to run the tools on the host and ‘inspect’ the namespaced processes inside a container.

If PID-sharing arrives, carving out (e.g.) monitoring could be a thing.
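
Inspecting a container's processes from the host uses the namespace tooling that is already there — a sketch; the container name c1 is illustrative:

$ pid=$(docker inspect -f '{{.State.Pid}}' c1)    # PID of the container's init process as seen from the host
$ sudo nsenter -t $pid -n ss -tlpn                # the host's ss, executed inside c1's network namespace
$ sudo strace -p $pid                             # host debugging tools attach to the namespaced process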

30 One vs. Many Processes

1. In an ideal world…

a container only runs one process, e.g. the HPC solver.

2. In reality…

MPI wants to connect to an sshd on the job peers

monitoring, syslog, and service discovery should be present as well.

3. How fast / aggressively should traditional approaches be broken?

31 Docker Network

Plugin System

VXLAN

MACVLAN

How about IPoIB?
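
The driver is just a flag at network-creation time — a sketch with the macvlan driver (available in later engine releases; subnet and parent interface are illustrative, an IPoIB parent such as ib0 is the open question above):

$ docker network create -d macvlan \
    --subnet=192.168.10.0/24 --gateway=192.168.10.1 \
    -o parent=eth0 macnet                          # containers get addresses on the parent's L2 segment
$ docker run --rm --net=macnet alpine ip addr      # the container shows an address from 192.168.10.0/24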

32 Reproducibility / Downscaling

Running OpenFOAM at small scale is cumbersome

manually install OpenFOAM on a workstation

be confident that the installation works correctly

A containerised OpenFOAM installation tackles both

http://qnib.org/immutable-paper

http://qnib.org/immutable

33 Service Discovery

1. Since the environments are rather dynamic…

how do the containers discover services?

external registry as part of the framework?

discovery service as part of the container stacks?
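
With the Consul agents already on every node, discovery can be as simple as DNS — a sketch; the service name slurmctld is illustrative:

$ dig @127.0.0.1 -p 8600 slurmctld.service.consul +short       # resolve a service via Consul's DNS interface
$ curl -s http://127.0.0.1:8500/v1/catalog/service/slurmctld   # or query the HTTP catalog API directly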

34 Orchestration Frameworks

With Docker Swarm it is rather easy

to spin up a Kubernetes or Mesos cluster within Swarm.

[Figure: a Kubernetes control plane (scheduler, apiserver) plus kubelet and etcd containers on each node, all placed through the swarm-master and swarm-clients on SERVER0, SERVER1, …]

35 Immutable vs. Config Mgmt

1. Containers should be controlled via ENV or flags

Externally accessing or changing a running container is discouraged

2. Configuration management

Downgraded to merely bootstrapping a host?
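
Configuration then happens entirely at start-up — a sketch; the image and variable names are illustrative:

$ docker run -d -e SLURMCTLD_HOST=node8 -e LOG_LEVEL=info slurm-d   # behaviour fixed once via ENV when the container is created
$ docker exec -it slurmd-node1 vi /etc/slurm/slurm.conf             # the discouraged path: mutating a running container by hand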

36 Continuous Dev./Integration

If containers are immutable within the pipeline

testing/deployment should be automated

developers should have a production replica

37 Docker Momentum

IT Tinkering (Hello World)

Continuous Dev/Int/Dep

Microservices, hyper scale

Big Data

High Performance Computing

[Figure: adoption timeline for Software Dev, DatacenterOps, and HPC across these stages]

Disclaimer: subjective exaggeration

38 Docker in Software Development

Spinning up a production-like environment is great

MongoDB, PostgreSQL, memcached as separate containers

python2.7, python3.4

Like python’s virtualenv on steroids, iteration speedup through reproducibility

39 Docker in HPC development

Spinning up a production-like environment is…

…not that easy

the focus is more on the engineer/scientist, not the software developer

1. For development it might work

close to non-HPC software dev

2. But is that the iteration focus?

rather job settings / input data?

40 Separation of Concerns?

Split input iteration / development from operation

non-distributed stays vanilla

transition to HPC cluster using tech to foster operation

Input/Dev

http://gmkurtzer.github.io/singularity

41 containerd Integration

With Docker Engine 1.11 the daemon will no longer be the parent process of containers

containerd and runC are used under the hood

42 Recap aka. IMHO

1. Separate Dev and Ops

don't block the momentum fostering iteration speed in development

2. Use vanilla docker tech

keep up with the ecosystem and prevent vendor/ecosystem lock-in

3. 80/20 rule

keep the caveats on the radar, but don't worry about them too much

everything is moving so fast that it is hard to predict

43 23.06.2016

https://github.com/qnib/hpcac-cluster2016

Q&A

eGalea Workshop (Pisa) http://qnib.org