Evaluating Linux Kernel Crash Dumping Mechanisms

Fernando Luis Vázquez Cao
NTT Data Intellilink
[email protected]

Abstract

There have been several kernel crash dump capturing solutions available for Linux for some time now and one of them, kdump, has even made it into the mainline kernel.

But the mere fact of having such a feature does not necessarily imply that we can obtain a dump reliably under any conditions. The LKDTT (Linux Kernel Dump Test Tool) project was created to evaluate crash dumping mechanisms in terms of success rate, accuracy and completeness.

A major goal of LKDTT is maximizing the coverage of the tests. For this purpose, LKDTT forces the system to crash by artificially recreating crash scenarios (panic, hang, exception, stack overflow, etc.), taking into account the hardware conditions (such as ongoing DMA or interrupt state) and the load of the system.

1 Introduction

Mainstream Linux lacked a kernel crash dumping mechanism for a long time despite the fact that there were several solutions (such as Diskdump [1], Netdump [2], and LKCD [3]) available out of tree. Concerns about their intrusiveness and reliability prevented them from making it into the vanilla kernel.

Eventually, a handful of crash dumping solutions based on kexec [4, 5] appeared: Kdump [6, 7], Mini Kernel Dump [8], and Tough Dump [9]. On paper, the kexec-based approach seemed very reliable and the impact on the kernel code was certainly small. Thus, kdump was eventually proposed as the Linux kernel’s crash dumping mechanism and subsequently accepted.