Debian Bug report logs - #903767 Stretch kernel 4.9.110-1 boot-loops with Xen Hypervisor 4.8

Package: src:linux; Maintainer for src:linux is Debian Kernel Team
Affects: release.debian.org
Reported by: "Michael J. Redd"
Date: Sat, 14 Jul 2018 13:45:01 UTC
Severity: serious
Tags: pending
Merged with 903800, 903821, 903885
Found in version linux/4.9.110-1

Message #5 received at [email protected]:
From: "Michael J. Redd"
To: [email protected]
Subject: Stretch kernel 4.9.110-1 boot-loops with Xen Hypervisor 4.8
Date: Sat, 14 Jul 2018 09:43:56 -0400

Package: linux-image-4.9.0-7-amd64
Version: 4.9.110-1

Description:
======

After installing the latest Stretch kernel, 4.9.110-1, on a server running Xen Hypervisor 4.8, bootstrapping the kernel fails. GRUB loads the hypervisor as normal, which then attempts to load the Dom0 kernel. Once that starts, the system simply reboots. Nothing is output to the console after Xen does its thing (screen goes black), so I cannot offer any insights into what the kernel may be doing before it fails.

If I roll back to the previous Stretch kernel, linux-image-4.9.0-6-amd64 (4.9.88-1+deb9u1), the hypervisor starts the Dom0 kernel as normal and the system boots successfully.

Setup:
======

Xen Hypervisor version: 4.8.3+xsa267+shim4.10.1+xsa267-1+deb9u9
Kernel version: linux-image-4.9.0-7-amd64 (4.9.110-1)

Bug reassigned from package 'linux-image-4.9.0-7-amd64' to 'src:linux'. Request was from Salvatore Bonaccorso to [email protected]. (Sat, 14 Jul 2018 14:21:04 GMT)

No longer marked as found in versions linux/4.9.110-1. Request was from Salvatore Bonaccorso to [email protected]. (Sat, 14 Jul 2018 14:21:05 GMT)

Marked as found in versions linux/4.9.110-1. Request was from Salvatore Bonaccorso to [email protected]. (Sat, 14 Jul 2018 14:21:06 GMT)

Message #16 received at [email protected]:
From: "Michael J. Redd"
To: [email protected]
Subject: Re: Stretch kernel 4.9.110-1 boot-loops with Xen Hypervisor 4.8
Date: Sat, 14 Jul 2018 10:35:16 -0400

This also apparently affects at least PV guests. Upgrading a PV domU to kernel 4.9.110-1 and rebooting yields the following output via xl's console:

Loading Linux 4.9.0-6-amd64 ...
Loading Linux 4.9.0-7-amd64 ...
Loading ...
[ vmlinuz-4.9.0-7-amd6 2.69MiB 66% 1.67MiB/s ]
[ 0.128044] dmi: Firmware registration failed.a 17.29MiB 100% 9.75MiB/s ]
[ 1.408778] dmi-sysfs: dmi entry is absent.
[ 1.427758] general protection fault: 0000 [#1] SMP
[ 1.427767] Modules linked in:
[ 1.427778] CPU: 0 PID: 1 Comm: init Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-1
[ 1.427789] task: ffff88000ee36040 task.stack: ffffc90040068000
[ 1.427798] RIP: e030:[] [] ret_from_fork+0x2d/0x70
[ 1.427815] RSP: e02b:ffffc9004006bf50 EFLAGS: 00010006
[ 1.427823] RAX: 00000002175f5000 RBX: ffffffff816076d0 RCX: ffffea0000310e1f
[ 1.427833] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffffc9004006bf58
[ 1.427840] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff88000a9e5000
[ 1.427847] R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
[ 1.427854] R13: 179f3966a73fde7b R14: 06f99905e8f3edfb R15: cf60f5f9fd8e4751
[ 1.427866] FS: 0000000000000000(0000) GS:ffff88000fc00000(0000) knlGS:0000000000000000
[ 1.427873] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.427879] CR2: 00007ffd37e72eb9 CR3: 000000000a9f4000 CR4: 0000000000042660
[ 1.427889] Stack:
[ 1.427893] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.427906] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.427921] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 1.427943] Call Trace:
[ 1.427951] Code: c7 e8 b8 fe a8 ff 48 85 db 75 2f 48 89 e7 e8 5b ed 9e ff 50 90 0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 <0f> 22 d8 58 66 66 90 66 66 90 e9 c1 07 00 00 4c 89 e7 eb 11 e8
[ 1.428148] RIP [] ret_from_fork+0x2d/0x70
[ 1.428160] RSP
[ 1.428168] ---[ end trace cb1a96e88a7c4794 ]---
[ 1.428298] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 1.428298]
[ 1.428316] Kernel Offset: disabled

Note that the guest bootloader being used here is pvGRUB; not sure if that is relevant but thought I would include it. Similarly, rolling the domU back to the previous kernel/selecting the previous kernel via pvGRUB allows the VM to boot normally.

Message #21 received at [email protected]:
From: Andy Smith
To: "Michael J. Redd"
Cc: [email protected]
Subject: Re: Bug#903767: Stretch kernel 4.9.110-1 boot-loops with Xen Hypervisor 4.8
Date: Sat, 14 Jul 2018 14:58:44 +0000

Also same symptoms in a PV guest under Xen 4.10. Works if booted on previous kernel. Also prevents installation of stable using the netboot image at e.g.

Cheers, Andy

Merged 903767 903821. Request was from Salvatore Bonaccorso to [email protected]. (Sun, 15 Jul 2018 12:36:10 GMT)

Severity set to 'serious' from 'normal'. Request was from Salvatore Bonaccorso to [email protected]. (Sun, 15 Jul 2018 12:36:11 GMT)

Message #30 received at [email protected]:
From: Jaap Eldering
To: [email protected]
Subject: verbose kernel log
Date: Sun, 15 Jul 2018 15:06:38 +0200

I'm seeing the same problem with linux-image-4.9.0-7-amd64:4.9.110-1 from the 9.5 point release running on Xen/pvgrub.

In case it's useful, here's the verbose kernel log:

[ 2.028120] registered taskstats version 1
[ 2.028149] zswap: loaded using pool lzo/zbud
[ 2.030226] ima: No TPM chip found, activating TPM-bypass!
[ 2.030238] ima: Allocated hash algorithm: sha256
[ 2.030281] xenbus_probe_frontend: Device with no driver: device/vbd/51712
[ 2.030286] xenbus_probe_frontend: Device with no driver: device/vbd/51728
[ 2.030291] xenbus_probe_frontend: Device with no driver: device/vif/0
[ 2.030297] xenbus_probe_frontend: Device with no driver: device/vif/1
[ 2.030317] hctosys: unable to open rtc device (rtc0)
[ 2.031657] Freeing unused kernel memory: 1420K
[ 2.031668] Write protecting the kernel read-only data: 12288k
[ 2.038529] Freeing unused kernel memory: 1928K
[ 2.039863] Freeing unused kernel memory: 1228K
[ 2.050421] x86/mm: Checked W+X mappings: FAILED, 6123 W+X pages found.
[ 2.050922] general protection fault: 0000 [#1] SMP
[ 2.050932] Modules linked in:
[ 2.050941] CPU: 1 PID: 1 Comm: init Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-1
[ 2.050949] task: ffff88003e367040 task.stack: ffffc9004018c000
[ 2.050954] RIP: e030:[] [] ret_from_fork+0x2d/0x70
[ 2.050967] RSP: e02b:ffffc9004018ff50 EFLAGS: 00010006
[ 2.050972] RAX: 0000000a7338d000 RBX: ffffffff816076d0 RCX: ffffea0000f4ac9f
[ 2.050978] RDX: 0000000000000003 RSI: 0000000000000002 RDI: ffffc9004018ff58
[ 2.050984] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff880039b18000
[ 2.050991] R10: 8080808080808080 R11: fefefefefefefeff R12: 0000000000000000
[ 2.050997] R13: 0000000000000000 R14: 0000000000116359 R15: 00000000000053dc
[ 2.051006] FS: 0000000000000000(0000) GS:ffff88003f880000(0000) knlGS:0000000000000000
[ 2.051013] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.051019] CR2: 00007ffea17fded0 CR3: 0000000039b8c000 CR4: 0000000000042660
[ 2.051024] Stack:
[ 2.051029] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.051041] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.051050] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2.051060] Call Trace:
[ 2.051066] Code: c7 e8 b8 fe a8 ff 48 85 db 75 2f 48 89 e7 e8 5b ed 9e ff 50 90 0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 <0f> 22 d8 58 66 0f 1f 44 00 00 e9 c1 07 00 00 4c 89 e7 eb 11 e8
[ 2.051125] RIP [] ret_from_fork+0x2d/0x70
[ 2.051132] RSP
[ 2.051140] ---[ end trace d3783fae70c9b202 ]---
[ 2.051262] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 2.051262]
[ 2.051280] Kernel Offset: disabled
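A note on reading the trace: in a kernel oops, the byte wrapped in angle brackets on the `Code:` line is the first byte of the instruction at the faulting RIP. The following Python sketch is not part of the original report. It decodes only the two-byte `0f 20` / `0f 22` control-register moves, and the helper names (`split_at_fault`, `decode_mov_cr`) are made up for illustration. Applied to the `Code:` bytes from this log, it suggests that the fault was raised by a write to CR3; why the write was rejected is not visible in the log itself.

```python
# Illustrative decoder for the "Code:" line of an x86-64 kernel oops.
# It handles only MOV to/from control registers (opcodes 0F 20 and 0F 22)
# and assumes no REX prefix; anything else returns None.

GPR64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

# "Code:" bytes copied from the oops in this message.
CODE_LINE = (
    "c7 e8 b8 fe a8 ff 48 85 db 75 2f 48 89 e7 e8 5b ed 9e ff 50 90 "
    "0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 "
    "<0f> 22 d8 58 66 0f 1f 44 00 00 e9 c1 07 00 00 4c 89 e7 eb 11 e8"
)


def split_at_fault(code_line):
    """Return (bytes before the fault, bytes from the faulting RIP onward)."""
    tokens = code_line.split()
    fault_index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    values = [int(tok.strip("<>"), 16) for tok in tokens]
    return values[:fault_index], values[fault_index:]


def decode_mov_cr(insn):
    """Decode 0F 20 /r (read CRn) or 0F 22 /r (write CRn) in AT&T syntax."""
    if len(insn) < 3 or insn[0] != 0x0F or insn[1] not in (0x20, 0x22):
        return None
    modrm = insn[2]
    cr = (modrm >> 3) & 0x7      # reg field selects the control register
    gpr = GPR64[modrm & 0x7]     # r/m field selects the general register
    if insn[1] == 0x22:
        return f"mov %{gpr},%cr{cr}"   # write to the control register
    return f"mov %cr{cr},%{gpr}"       # read from the control register


if __name__ == "__main__":
    _, at_fault = split_at_fault(CODE_LINE)
    print(decode_mov_cr(at_fault))  # prints: mov %rax,%cr3
```

The same bytes appear at the faulting position in the other traces in this report, so the decoder gives the same result for each of them.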

Merged 903767 903800 903821. Request was from Ben Hutchings to [email protected]. (Sun, 15 Jul 2018 17:39:04 GMT)

Merged 903767 903800 903821. Request was from Ben Hutchings to [email protected]. (Sun, 15 Jul 2018 17:39:08 GMT)

Added tag(s) pending. Request was from Ben Hutchings to [email protected]. (Mon, 16 Jul 2018 02:57:02 GMT)

Merged 903767 903800 903821 903885. Request was from Salvatore Bonaccorso to [email protected]. (Mon, 16 Jul 2018 07:33:18 GMT)

Message #43 received at [email protected]:
From: Hans van Kranenburg
To: [email protected], [email protected], [email protected], [email protected]
Cc: Benoît Tonnerre, Sebastien KOECHLIN, "Michael J. Redd", Andy Smith, Jaap Eldering, Vincent Lefevre, Robby, Luis Miguel Parra, Gianluigi Tiesi, Jered Floyd, Valentin Vidic
Subject: 4.9.110-1 Xen PV boot workaround
Date: Mon, 16 Jul 2018 19:28:37 +0200

Reportedly, adding pti=off to the kernel boot parameters will work around the issue for now.

Turning off pti in the guest kernel is done in any case for PV. The issue between 4.9.107 and 4.9.111 affects the detection and turning off of pti, that's why forcing it off helps.

In 4.9.112 it's fixed in commit 1adc34adc3447c34926994b87db5d929f5ab45b5 "x86/cpu: Re-apply forced caps every time CPU caps are re-read"

Hans
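The title of the fix commit describes a general pattern: capability flags that the kernel has forced on or off must be re-applied whenever the flags are re-read from the CPU, otherwise the override is silently lost. The toy Python model below only illustrates that pattern. It is not kernel code; the class, the flag name `xen_pv`, and the `pti_enabled` decision are all hypothetical, and the thread does not say which flag was actually lost in 4.9.110.

```python
class ToyCpu:
    """Toy model of CPU capability flags with forced overrides (not kernel code)."""

    def __init__(self, hardware_caps):
        self.hardware_caps = frozenset(hardware_caps)  # what the hardware reports
        self.forced_set = set()      # flags forced on by software
        self.forced_clear = set()    # flags forced off by software
        self.caps = set(self.hardware_caps)

    def force_set(self, cap):
        self.forced_set.add(cap)
        self.caps.add(cap)

    def force_clear(self, cap):
        self.forced_clear.add(cap)
        self.caps.discard(cap)

    def reread_caps(self, reapply_forced):
        """Re-read flags from 'hardware'; optionally re-apply the overrides."""
        self.caps = set(self.hardware_caps)
        if reapply_forced:
            self.caps |= self.forced_set
            self.caps -= self.forced_clear


def pti_enabled(cpu):
    # Hypothetical decision: isolate page tables unless the PV flag is present.
    return "xen_pv" not in cpu.caps


if __name__ == "__main__":
    cpu = ToyCpu({"sse", "avx"})
    cpu.force_set("xen_pv")
    print(pti_enabled(cpu))                  # override in place: False

    cpu.reread_caps(reapply_forced=False)    # re-read without re-applying
    print(pti_enabled(cpu))                  # override lost: True

    cpu.reread_caps(reapply_forced=True)     # re-read and re-apply
    print(pti_enabled(cpu))                  # override restored: False
```

In this model, passing `pti=off` corresponds to bypassing the flag-based decision entirely, which is why a forced-off setting would still work while the detection is broken.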

Message #48 received at [email protected]:
From: Benoît Tonnerre
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: 4.9.110-1 Xen PV boot workaround
Date: Tue, 17 Jul 2018 00:39:36 +0200

Hi,

I tested this workaround: I confirm that it works on the Xen host, but not on a Xen guest. If you try to start a VM with the latest kernel, i.e. with these parameters in the cfg file:

#
# Kernel + memory size
#
kernel = '/boot/vmlinuz-4.9.0-7-amd64'
extra = 'elevator=noop'
ramdisk = '/boot/initrd.img-4.9.0-7-amd64'

the VM crashes in a loop with this kernel error:

[ 0.000000] Linux version 4.9.0-7-amd64 ([email protected]) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 SMP Debian 4.9.110-1 (2018-07-05)
[ 0.000000] Command line: root=/dev/xvda2 ro elevator=noop
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256
[ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[ 0.000000] ACPI in unprivileged domain disabled
[ 0.000000] Released 0 page(s)
[ 0.000000] e820: BIOS-provided physical RAM map:
[ 0.000000] Xen: [mem 0x0000000000000000-0x000000000009ffff] usable
[ 0.000000] Xen: [mem 0x00000000000a0000-0x00000000000fffff] reserved
[ 0.000000] Xen: [mem 0x0000000000100000-0x000000007fffffff] usable
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] Hypervisor detected: Xen
[ 0.000000] e820: last_pfn = 0x80000 max_arch_pfn = 0x400000000
[ 0.000000] MTRR: Disabled
[ 0.000000] x86/PAT: MTRRs disabled, skipping PAT initialization too.
[ 0.000000] x86/PAT: Configuration [0-7]: WB WT UC- UC WC WP UC UC
[ 0.000000] RAMDISK: [mem 0x02000000-0x05996fff]
[ 0.000000] NUMA turned off
[ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000007fffffff]
[ 0.000000] NODE_DATA(0) allocated [mem 0x7fc16000-0x7fc1afff]
[ 0.000000] Zone ranges:
[ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.000000] DMA32 [mem 0x0000000001000000-0x000000007fffffff]
[ 0.000000] Normal empty
[ 0.000000] Device empty
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009ffff]
[ 0.000000] node 0: [mem 0x0000000000100000-0x000000007fffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000007fffffff]
[ 0.000000] p2m virtual area at ffffc90000000000, size is 40000000
[ 0.000000] Remapped 0 page(s)
[ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
[ 0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000fffff]
[ 0.000000] e820: [mem 0x80000000-0xffffffff] available for PCI devices
[ 0.000000] paravirtualized kernel on Xen
[ 0.000000] Xen version: 4.8.4-pre (preserve-AD)
[ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:1 nr_node_ids:1
[ 0.000000] percpu: Embedded 35 pages/cpu @ffff88007f600000 s105304 r8192 d29864 u2097152
[ 0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes)
[ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 515978
[ 0.000000] Policy zone: DMA32
[ 0.000000] Kernel command line: root=/dev/xvda2 ro elevator=noop
[ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.000000] Memory: 1980804K/2096764K available (6250K kernel code, 1159K rwdata, 2868K rodata, 1420K init, 688K bss, 115960K reserved, 0K cma-reserved)
[ 0.000000] Kernel/User page tables isolation: enabled
[ 0.000000] Hierarchical RCU implementation.
[ 0.000000] Build-time adjustment of leaf fanout to 64.
[ 0.000000] RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=1.
[ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=1
[ 0.000000] Using NULL legacy PIC
[ 0.000000] NR_IRQS:33024 nr_irqs:32 0
[ 0.000000] xen:events: Using FIFO-based ABI
[ 0.000000] Console: colour dummy device 80x25
[ 0.000000] console [tty0] enabled
[ 0.000000] console [hvc0] enabled
[ 0.000000] clocksource: xen: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.000000] installing Xen timer for CPU 0
[ 0.000000] tsc: Unable to calibrate against PIT
[ 0.000000] tsc: No reference (HPET/PMTIMER) available
[ 0.000000] tsc: Detected 2597.018 MHz processor
[ 0.004000] Calibrating delay loop (skipped), value calculated using timer frequency.. 5194.03 BogoMIPS (lpj=10388072)
[ 0.004000] pid_max: default: 32768 minimum: 301
[ 0.004000] Security Framework initialized
[ 0.004000] Yama: disabled by default; enable with sysctl kernel.yama.*
[ 0.004000] AppArmor: AppArmor disabled by boot time parameter
[ 0.004000] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
[ 0.004000] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 0.004000] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.004000] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
[ 0.004000] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
[ 0.004000] ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)
[ 0.004000] CPU: Physical Processor ID: 0
[ 0.004000] CPU: Processor Core ID: 0
[ 0.004000] mce: CPU supports 2 MCE banks
[ 0.004000] Last level iTLB entries: 4KB 1024, 2MB 1024, 4MB 1024
[ 0.004000] Last level dTLB entries: 4KB 1024, 2MB 1024, 4MB 1024, 1GB 4
[ 0.004000] Spectre V2 : Mitigation: Full generic retpoline
[ 0.004000] Spectre V2 : Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier
[ 0.004000] Spectre V2 : Enabling Restricted Speculation for firmware calls
[ 0.004000] Speculative Store Bypass: Vulnerable
[ 0.051616] Freeing SMP alternatives memory: 24K
[ 0.057710] ftrace: allocating 25269 entries in 99 pages
[ 0.072061] cpu 0 spinlock event irq 1
[ 0.072071] smpboot: Max logical packages: 1
[ 0.072078] VPMU disabled by hypervisor.
[ 0.072093] Performance Events: unsupported p6 CPU model 63 no PMU driver, software events only.
[ 0.072602] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.072610] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.072624] x86: Booted up 1 node, 1 CPUs
[ 0.072761] devtmpfs: initialized
[ 0.072813] x86/mm: Memory block size: 128MB
[ 0.074028] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
[ 0.074045] futex hash table entries: 256 (order: 2, 16384 bytes)
[ 0.074075] pinctrl core: initialized pinctrl subsystem
[ 0.074165] NET: Registered protocol family 16
[ 0.074176] xen:grant_table: Grant tables using version 1 layout
[ 0.074195] Grant table initialized
[ 0.074377] PCI: setting up Xen PCI frontend stub
[ 0.074377] ACPI: Interpreter disabled.
[ 0.074377] xen:balloon: Initialising balloon driver
[ 0.076045] xen_balloon: Initialising balloon driver
[ 0.076053] vgaarb: loaded
[ 0.076068] dmi: Firmware registration failed.
[ 0.076106] PCI: System does not support PCI
[ 0.076111] PCI: System does not support PCI
[ 0.076237] clocksource: Switched to clocksource xen
[ 0.081278] VFS: Disk quotas dquot_6.6.0
[ 0.081294] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
[ 0.081315] hugetlbfs: disabling because there are no supported hugepage sizes
[ 0.081343] pnp: PnP ACPI: disabled
[ 0.082398] NET: Registered protocol family 2
[ 0.082534] TCP established hash table entries: 16384 (order: 5, 131072 bytes)
[ 0.082606] TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
[ 0.082654] TCP: Hash tables configured (established 16384 bind 16384)
[ 0.082689] UDP hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.082708] UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
[ 0.082750] NET: Registered protocol family 1
[ 0.082788] Unpacking initramfs...
[ 0.123386] Freeing initrd memory: 58972K
[ 0.123786] general protection fault: 0000 [#1] SMP
[ 0.123792] Modules linked in:
[ 0.123799] CPU: 0 PID: 30 Comm: modprobe Not tainted 4.9.0-7-amd64 #1 Debian 4.9.110-1
[ 0.123807] task: ffff880078ad7000 task.stack: ffffc90040498000
[ 0.123812] RIP: e030:[] [] ret_from_fork+0x2d/0x70
[ 0.123824] RSP: e02b:ffffc9004049bf50 EFLAGS: 00010006
[ 0.123829] RAX: 0000000493ef5000 RBX: ffffffff8108e9d0 RCX: ffffea0001ec61df
[ 0.123835] RDX: 0000000000000002 RSI: 0000000000000002 RDI: ffffc9004049bf58
[ 0.123841] RBP: 0000000000000000 R08: 0000000000000000 R09: ffff880078adc000
[ 0.124009] R10: 8080808080808080 R11: fefefefefefefeff R12: ffff88007ceb7a00
[ 0.124009] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 0.124009] FS: 0000000000000000(0000) GS:ffff88007f600000(0000) knlGS:0000000000000000
[ 0.124009] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.124009] CR2: 00007ffd13e9e9b9 CR3: 0000000078af4000 CR4: 0000000000042660
[ 0.124009] Stack:
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 0.124009] Call Trace:
[ 0.124009] Code: c7 e8 b8 fe a8 ff 48 85 db 75 2f 48 89 e7 e8 5b ed 9e ff 50 90 0f 20 d8 65 48 0b 04 25 e0 02 01 00 78 08 65 88 04 25 e7 02 01 00 <0f> 22 d8 58 66 0f 1f 44 00 00 e9 c1 07 00 00 4c 89 e7 eb 11 e8
[ 0.124009] RIP [] ret_from_fork+0x2d/0x70
[ 0.124009] RSP
[ 0.124009] ---[ end trace e2ff95a7e079b5b5 ]---

Did I miss something?

Thanks for your help.

Best regards.

Benoît

On Mon, 16 Jul 2018 at 19:28, Hans van Kranenburg wrote:

> Reportedly, adding pti=off to the kernel boot parameters will work
> around the issue for now.
>
> Turning off pti in the guest kernel is done in any case for PV. The
> issue between 4.9.107 and 4.9.111 affects the detection and turning off
> of pti, that's why forcing it off helps.
>
> In 4.9.112 it's fixed in commit 1adc34adc3447c34926994b87db5d929f5ab45b5
> "x86/cpu: Re-apply forced caps every time CPU caps are re-read"
>
> Hans

Message #53 received at [email protected]:
From: "Michael J. Redd"
To: Benoît Tonnerre
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: 4.9.110-1 Xen PV boot workaround
Date: Mon, 16 Jul 2018 19:20:29 -0400

I've tested the workaround successfully. Added `pti=off` to my kernel's boot arguments, updated GRUB, and it started as intended.
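The step described here (add `pti=off` to the boot arguments, then update GRUB) can be sketched as a small text transformation in Python. Two things below are assumptions rather than anything stated in this message: that the parameter belongs in the `GRUB_CMDLINE_LINUX` variable of `/etc/default/grub`, and the helper name `add_kernel_param`. The function only rewrites a string; writing the file back and running `update-grub` are left to the administrator.

```python
import re


def add_kernel_param(text, param="pti=off", var="GRUB_CMDLINE_LINUX"):
    """Return `text` with `param` present in the quoted value of `var`.

    `var` is an assumption; which GRUB variable a given setup uses for the
    dom0 kernel command line may differ.
    """
    pattern = re.compile(
        r"^(" + re.escape(var) + r"=)([\"'])(.*?)\2[ \t]*$", re.M
    )

    def repl(match):
        args = match.group(3).split()
        if param not in args:
            args.append(param)
        quote = match.group(2)
        return match.group(1) + quote + " ".join(args) + quote

    new_text, count = pattern.subn(repl, text)
    if count == 0:
        # Variable not defined yet: append a new assignment.
        new_text = text.rstrip("\n") + "\n" + var + '="' + param + '"\n'
    return new_text


if __name__ == "__main__":
    sample = (
        "GRUB_DEFAULT=0\n"
        'GRUB_CMDLINE_LINUX_DEFAULT="quiet"\n'
        'GRUB_CMDLINE_LINUX=""\n'
    )
    print(add_kernel_param(sample), end="")
```

The function is idempotent, so running it twice does not duplicate the parameter, and it leaves similarly named variables such as `GRUB_CMDLINE_LINUX_DEFAULT` untouched.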

Benoît,

Just to be sure, since you're loading your guests' kernels directly like that, you're passing pti=off via the `extra` config line in your domU config files, right? I.e.

extra = 'elevator=noop pti=off'

On Tue, 2018-07-17 at 00:39 +0200, Benoît Tonnerre wrote:
> Hi,
>
> I tested this workaround : I confirm that it works on Xen host, but
> not on Xen guest.
> If you try to start a vm with latest kernel i.e. theses parameters in
> cfg file :
>
> #
> # Kernel + memory size
> #
> kernel = '/boot/vmlinuz-4.9.0-7-amd64'
> extra = 'elevator=noop'
> ramdisk = '/boot/initrd.img-4.9.0-7-amd64'

Message #58 received at [email protected]:
From: Benoît Tonnerre
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected]
Subject: Re: 4.9.110-1 Xen PV boot workaround
Date: Tue, 17 Jul 2018 06:42:49 +0200

Hi Michael,

Sorry for my mistake.

Indeed, with

extra = 'elevator=noop pti=off'

all Xen guests boot without a problem.

Thanks a lot for your help.

Benoît

On Tue, 17 Jul 2018 at 01:20, Michael J. Redd wrote:

> I've tested the workaround successfully. Added `pti=off` to my kernel's
> boot arguments, updated GRUB, and it started as intended.
>
> Benoît,
>
> Just to be sure, since you're loading your guests' kernels directly
> like that, you're passing pti=off via the `extra` config line in your
> domU config files, right? I.e.
>
> extra = 'elevator=noop pti=off'
>
> On Tue, 2018-07-17 at 00:39 +0200, Benoît Tonnerre wrote:
> > Hi,
> >
> > I tested this workaround : I confirm that it works on Xen host, but
> > not on Xen guest.
> > If you try to start a vm with latest kernel i.e. theses parameters in
> > cfg file :
> >
> > #
> > # Kernel + memory size
> > #
> > kernel = '/boot/vmlinuz-4.9.0-7-amd64'
> > extra = 'elevator=noop'
> > ramdisk = '/boot/initrd.img-4.9.0-7-amd64'

Message #63 received at [email protected]:
From: Michael Laß
To: [email protected]
Subject: 1adc34adc3447c34926994b87db5d929f5ab45b5 confirmed to fix the issue
Date: Tue, 17 Jul 2018 11:11:37 +0200

I can confirm that applying 1adc34adc3447c34926994b87db5d929f5ab45b5 on top of Debian’s 4.9.110-1 kernel solves the issue here on a Xen PV guest.
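Combined with the upstream range given earlier in this thread (broken between 4.9.107 and 4.9.111, fixed in 4.9.112), a quick check of whether a given package version falls in the reported range might look like the Python sketch below. The function names are hypothetical, the endpoints are treated as inclusive (an assumption, since the thread does not say), and a Debian kernel carrying the fix as a backported patch would still be reported as in range, so the result is only a hint.

```python
def upstream_version(pkg_version):
    """Extract the upstream part of a version such as '4.9.110-1' as a tuple."""
    upstream = pkg_version.split("-", 1)[0]
    return tuple(int(part) for part in upstream.split("."))


def in_reported_range(pkg_version, first=(4, 9, 107), last=(4, 9, 111)):
    """True if the upstream version lies in the range reported as broken.

    The inclusive endpoints are an assumption based on the wording
    'between 4.9.107 and 4.9.111' and 'fixed in 4.9.112'.
    """
    return first <= upstream_version(pkg_version) <= last


if __name__ == "__main__":
    for version in ("4.9.88-1+deb9u1", "4.9.110-1", "4.9.112"):
        print(version, in_reported_range(version))
```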

Message #68 received at [email protected]:
From: Hans van Kranenburg
To: Benoît Tonnerre
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: 4.9.110-1 Xen PV boot workaround
Date: Tue, 17 Jul 2018 12:23:22 +0200

On 07/17/2018 12:39 AM, Benoît Tonnerre wrote:
> Hi,
>
> I tested this workaround : I confirm that it works on Xen host, but not
> on Xen guest.
> If you try to start a vm with latest kernel i.e. theses parameters in
> cfg file :
>
> #
> # Kernel + memory size
> #
> kernel = '/boot/vmlinuz-4.9.0-7-amd64'
> extra = 'elevator=noop'
> ramdisk = '/boot/initrd.img-4.9.0-7-amd64'
>
> The VM crash in loop with kernel error :
>
> [...]
>
> Did I miss something ?

Yes, the pti=off needs to go in your extra line:

extra = 'elevator=noop pti=off'

Hans
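For guests whose kernel is loaded directly from the domU config file, the change described above amounts to appending one token to the `extra` string. A minimal Python sketch of that edit follows; the helper name `add_extra_param` is made up for illustration, and it assumes the config contains at most a simple single-line, quoted `extra = '...'` assignment.

```python
import re


def add_extra_param(cfg_text, param="pti=off"):
    """Return the domU config text with `param` present in the `extra` line."""
    pattern = re.compile(r"^([ \t]*extra[ \t]*=[ \t]*)([\"'])(.*?)\2[ \t]*$", re.M)

    def repl(match):
        args = match.group(3).split()
        if param not in args:
            args.append(param)
        quote = match.group(2)
        return match.group(1) + quote + " ".join(args) + quote

    new_text, count = pattern.subn(repl, cfg_text)
    if count == 0:
        # No `extra` line yet: append one.
        new_text = cfg_text.rstrip("\n") + "\nextra = '" + param + "'\n"
    return new_text


if __name__ == "__main__":
    cfg = (
        "kernel = '/boot/vmlinuz-4.9.0-7-amd64'\n"
        "extra = 'elevator=noop'\n"
        "ramdisk = '/boot/initrd.img-4.9.0-7-amd64'\n"
    )
    print(add_extra_param(cfg), end="")
```

With the config from this thread as input, the `extra` line becomes `extra = 'elevator=noop pti=off'` and the other lines are left unchanged.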

Message #73 received at [email protected]:
From: tmc
To: [email protected]
Subject: Re: 1adc34adc3447c34926994b87db5d929f5ab45b5 confirmed to fix the issue
Date: Wed, 18 Jul 2018 00:07:50 +1000

On Tue, 17 Jul 2018 11:11:37 +0200 Michael Laß wrote:
> I can confirm that applying 1adc34adc3447c34926994b87db5d929f5ab45b5 on top of
> Debian’s 4.9.110-1 kernel solves the issue here on a Xen PV guest.

Message #78 received at [email protected]:
From: TMC
To: [email protected]
Subject: Re: 1adc34adc3447c34926994b87db5d929f5ab45b5 confirmed to fix the issue
Date: Wed, 18 Jul 2018 00:09:54 +1000

Does commit 1adc34adc3447c34926994b87db5d929f5ab45b5 on top of Debian’s 4.9.110-1 kernel solve the Xen dom0 boot issue?

-- -- GPG key fingerprint: 07DF B95B DB58 57B6 9656 682E 830A D092 288E F017 GPG public key available on pgp(dot)net key server

Message #83 received at [email protected]:
From: Michael Laß
To: [email protected]
Subject: Re: 1adc34adc3447c34926994b87db5d929f5ab45b5 confirmed to fix the issue
Date: Tue, 17 Jul 2018 16:49:00 +0200

On Wed, 18 Jul 2018 00:09:54 +1000 TMC wrote:
> does commit 1adc34adc3447c34926994b87db5d929f5ab45b5 on top of
> Debian’s 4.9.110-1 kernel solve xen dom0 boot issue?

Unfortunately I have no control over the host, so I cannot test dom0. I could only test inside a single domU.

Added indication that 903767 affects release.debian.org. Request was from Ben Hutchings to [email protected]. (Tue, 17 Jul 2018 21:24:03 GMT)

Debian bug tracking system administrator . Last modified: Wed Jul 18 09:48:10 2018; Machine Name: beach
