CRIU: Time and Space Travel Service for Linux Applications
Pavel Emelyanov
LinuxCon NA, New Orleans, 2013
Agenda
What is CRIU?
Project history and state
Usage scenarios
– Live migration
– Reboot-less kernel upgrade
– Slow services startup
– Advanced debugging and testing
– and more...
What is CRIU?
Checkpoint Restore In Userspace
[Diagram: Checkpoint (or Dump) → full info about state → Restore (or Restart)]
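A minimal sketch of how the checkpoint and restore steps might be driven from a script. It only builds command lines and does not run them. The `criu dump` and `criu restore` subcommands and the `-t`, `-D` and `--leave-running` options follow the CRIU documentation for later releases and may differ in v0.7; `dump_cmd`, `restore_cmd`, the PID and the image directory are hypothetical names chosen for illustration.

```python
from typing import List


def dump_cmd(pid: int, images_dir: str, leave_running: bool = False) -> List[str]:
    """Build the argv for checkpointing the process tree rooted at `pid`."""
    cmd = ["criu", "dump", "-t", str(pid), "-D", images_dir]
    if leave_running:
        # Assumption: without this option the dumped tree is stopped after the dump.
        cmd.append("--leave-running")
    return cmd


def restore_cmd(images_dir: str) -> List[str]:
    """Build the argv for restoring a tree from the images in `images_dir`."""
    return ["criu", "restore", "-D", images_dir]


if __name__ == "__main__":
    # Hypothetical PID and directory, printed rather than executed.
    print(" ".join(dump_cmd(1234, "/tmp/ckpt")))
    print(" ".join(restore_cmd("/tmp/ckpt")))
```

The resulting lists can be handed to `subprocess.run` on a host where CRIU is installed and the caller has sufficient privileges.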
Why in userspace?
[Diagram: a process and the C/R API in user space, above the kernel]
Dump:
– ptrace
– /proc
– netlink
– syscalls
Restore:
– syscalls
kmod
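As an illustration of the kind of state a user-space dumper can read from /proc, here is a small sketch that parses one line of `/proc/<pid>/maps`. This is not CRIU's own code; `Mapping`, `parse_maps_line` and `read_maps` are hypothetical helpers, and the sample line is made up.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    start: int   # first address of the mapping
    end: int     # address just past the mapping
    perms: str   # e.g. "r-xp"
    offset: int  # offset into the backing file
    path: str    # backing file, or "" for anonymous memory


def parse_maps_line(line: str) -> Mapping:
    # Fields: address range, perms, offset, dev, inode, optional pathname.
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], int(fields[2], 16), path)


def read_maps(pid: int) -> List[Mapping]:
    # Linux only: reads the live memory map of a process.
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f]


if __name__ == "__main__":
    sample = "00400000-0040b000 r-xp 00000000 08:01 1234 /bin/cat"
    print(parse_maps_line(sample))
```

On a Linux host, `read_maps(os.getpid())` would return the mappings of the calling process.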
CRIU background
Project started ~2 years ago
– an RFC on kernel memory API extension
– small command line tool
– minimal dump of process' internals
First release
– 23 Jul 2012
– x86 and basic stuff
Since then
– 150+ kernel patches merged
– new APIs for reading and setting process' state
Current project state
The latest release
– v0.7
– supports x86 & ARM
– stuff typical applications use
Explicitly checked
– Apache, nginx, Oracle*, mysql, mongodb
– Ssh/sshd, openvpn*, cron, sendmail
– Java, gcc, make
– VNC + { gimp, mplayer, blender, supertux }
– Screen + { bash, top, tcpdump, tar/bz2 }
* some kernel tweaks required
Usage scenarios
Live migration
– Useful in cluster
Kernel upgrade w/o reboot
Slow services startup
Periodic snapshots
– HPC case
Advanced debugging and testing
Live migration
[Diagram: Host A, Host B]
Live migration ++
Pre-migrate memory with memory tracker
[Diagram: Host A, Host B, shared FS]
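One way the pre-migration rounds might be scripted is sketched below: a few `pre-dump` passes copy memory while the application keeps running, then a final `dump` saves what is left. The `pre-dump` subcommand and the `--prev-images-dir` and `--track-mem` options follow the CRIU documentation for later releases and are an assumption for v0.7; `iterative_dump_cmds` and the directory layout are hypothetical. The commands are only built, not executed.

```python
from typing import List


def iterative_dump_cmds(pid: int, base_dir: str, pre_rounds: int) -> List[List[str]]:
    """Build `pre_rounds` pre-dump commands followed by one final dump."""
    cmds = []
    for i in range(pre_rounds + 1):
        final = i == pre_rounds
        cmd = ["criu", "dump" if final else "pre-dump",
               "-t", str(pid), "-D", f"{base_dir}/{i}"]
        if i > 0:
            # Assumption: this path is resolved relative to the -D directory.
            cmd += ["--prev-images-dir", f"../{i - 1}"]
        if final:
            # Assumption: the final dump needs memory tracking enabled
            # so that it can skip pages already saved by earlier rounds.
            cmd.append("--track-mem")
        cmds.append(cmd)
    return cmds


if __name__ == "__main__":
    for cmd in iterative_dump_cmds(1234, "/shared/img", pre_rounds=2):
        print(" ".join(cmd))
```

With a shared file system, the image directories written on Host A are directly visible to the restore on Host B.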
Load balancing on cluster
[Diagram: Host A, Host B, Host C]
Node maintenance
[Diagram: Host A, Host B]
Kernel upgrade w/o reboot
[Diagram: one host; kexec from kernel A to kernel B]
Slow services startup
# service foo start
[Chart: service readiness vs. time. Stages: spawn process, load config, top up caches, initialize resource pools. Ready (100%) at time T.]
Slow services startup
# service foo start
[Chart: service readiness vs. time. Stage: spawn process. Ready (100%) at time t < T.]
Periodic snapshots
Memory tracker helps to keep images smaller
[Diagram: snapshots taken along a time axis]
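A sketch of why tracking memory keeps images small, assuming the memory tracker refers to the kernel's soft-dirty page tracking: only pages written since the previous snapshot need to be saved again. Per the kernel's soft-dirty documentation, writing "4" to `/proc/<pid>/clear_refs` resets the bits and bit 55 of each 64-bit `/proc/<pid>/pagemap` entry reports them. `soft_dirty_pages` and `clear_soft_dirty` are hypothetical helpers, and the sample entries are synthetic.

```python
import struct
from typing import List

SOFT_DIRTY_BIT = 55                   # bit position in a pagemap entry
PAGEMAP_ENTRY = struct.Struct("=Q")   # one native-endian 64-bit word per page


def soft_dirty_pages(pagemap_bytes: bytes) -> List[int]:
    """Return the indices of entries whose soft-dirty bit is set."""
    return [i for i, (entry,) in enumerate(PAGEMAP_ENTRY.iter_unpack(pagemap_bytes))
            if (entry >> SOFT_DIRTY_BIT) & 1]


def clear_soft_dirty(pid: int) -> None:
    # Linux only, needs suitable privileges: start a new tracking period.
    with open(f"/proc/{pid}/clear_refs", "w") as f:
        f.write("4")


if __name__ == "__main__":
    # Three synthetic entries: clean, dirty, dirty with another flag set.
    raw = struct.pack("=3Q", 0, 1 << 55, (1 << 55) | (1 << 63))
    print(soft_dirty_pages(raw))
```

A snapshot loop would call `clear_soft_dirty` after each snapshot and save only the pages reported by `soft_dirty_pages` at the next one.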
HPC
[Diagram: job progress over time: 0%, 20%, 40%, 60%; power failure; job continues from 60%]
Advanced debugging
[Diagram: application in trouble on the production host; debugger on the developer host]
Advanced testing
[Diagram: start app, T ~ 30 sec; then t ~ 0.1 sec, t ~ 0.1 sec, ...]
Advanced testing
New test or new hardware?
More (funny) use cases
Forgot to launch your program in screen
– Live-migrate it there
Playing a game without the save button
– Snapshot it
Suspend-to-RAM a VDI session
CRIU project resources
http://criu.org – project news and documentation
http://git.criu.org – git repo with tool sources
+CRIU page
criu@openvz.org mailing list
[email protected] is me
Thank you!
Pavel Emelyanov