CPUID handling for guests
Andrew Cooper
Citrix XenServer
Friday 26th August 2016
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 1 / 11 Return information about the processor
I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged
I Useable by userspace I Doesn’t trap to supervisor mode
The CPUID instruction
Introduced in 1993
I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Unpriveleged
I Useable by userspace I Doesn’t trap to supervisor mode
The CPUID instruction
Introduced in 1993
I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor
I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc)
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 The CPUID instruction
Introduced in 1993
I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor
I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged
I Useable by userspace I Doesn’t trap to supervisor mode
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Some kernels binary patch themselves (e.g. Alternatives framework)
I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start
I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume
I Not a reboot Guest must not observe a loss of dependent features
OS expectations
CPUID information don’t change after boot
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Userspace function dispatching at process start
I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume
I Not a reboot Guest must not observe a loss of dependent features
OS expectations
CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)
I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Migration modelled as suspend/resume
I Not a reboot Guest must not observe a loss of dependent features
OS expectations
CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)
I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start
I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Guest must not observe a loss of dependent features
OS expectations
CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)
I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start
I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume
I Not a reboot
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 OS expectations
CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)
I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start
I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume
I Not a reboot Guest must not observe a loss of dependent features
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Available features controlled by:
I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices Typically, “identical” servers are not
I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)
How compatible does hardware need to be?
All features previously used need to continue functioning
I All instructions, MSRs, etc
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 Typically, “identical” servers are not
I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)
How compatible does hardware need to be?
All features previously used need to continue functioning
I All instructions, MSRs, etc Available features controlled by:
I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 Must lie to guests (for their own good)
How compatible does hardware need to be?
All features previously used need to continue functioning
I All instructions, MSRs, etc Available features controlled by:
I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices Typically, “identical” servers are not
I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 How compatible does hardware need to be?
All features previously used need to continue functioning
I All instructions, MSRs, etc Available features controlled by:
I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices Typically, “identical” servers are not
I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 PV guests
I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID
I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting
I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen
Controlling a guests view of information
HVM guests
I CPUID instruction traps to hypervisor I Xen can control all information seen
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 PV Emulated CPUID
I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting
I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen
Controlling a guests view of information
HVM guests
I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests
I CPUID instruction doesn’t trap I Xen has no direct control
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 CPUID Faulting
I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen
Controlling a guests view of information
HVM guests
I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests
I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID
I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 Controlling a guests view of information
HVM guests
I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests
I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID
I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting
I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 AMD (All hardware Xen can run on)
I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)
I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits
I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right
Controlling a guests view of information (2)
CPUID Masking
I Non-architectural, “documented” only in whitepapers
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Intel (only some Nehalem/Westmere processors, SandyBridge)
I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits
I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right
Controlling a guests view of information (2)
CPUID Masking
I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)
I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Magic CPUID bits
I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right
Controlling a guests view of information (2)
CPUID Masking
I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)
I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)
I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Controlling a guests view of information (2)
CPUID Masking
I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)
I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)
I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits
I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Some features don’t have feature bits
I AMD’s Long Mode Segment Limit Enable FPU pipeline behaviour exposed directly to guests
I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves
I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking
Problematic areas to control
Guests can probe for some features
I 1GB Superpage support I AES instructions
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 FPU pipeline behaviour exposed directly to guests
I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves
I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking
Problematic areas to control
Guests can probe for some features
I 1GB Superpage support I AES instructions Some features don’t have feature bits
I AMD’s Long Mode Segment Limit Enable
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 Some feature bits affect the interpretation of other leaves
I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking
Problematic areas to control
Guests can probe for some features
I 1GB Superpage support I AES instructions Some features don’t have feature bits
I AMD’s Long Mode Segment Limit Enable FPU pipeline behaviour exposed directly to guests
I MXCSR_MASK I Intel’s FPDP and FPCSDS
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 Problematic areas to control
Guests can probe for some features
I 1GB Superpage support I AES instructions Some features don’t have feature bits
I AMD’s Long Mode Segment Limit Enable FPU pipeline behaviour exposed directly to guests
I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves
I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency
I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency
I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 No knowledge of feature dependency
I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency
I Buggy assumptions in guests result in crashes
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency
I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 CPUID handling in Xen
No real develoment since the uni-vcpu guest days
I No concept of topology information No auditing of the toolstack-provided information
I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach
I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency
I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability
I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information
I Hardware domain for C/P states, MTRRs I Control domain for building guests
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests
Shared Xen/libxc algorithm for feature dependencies
I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic
CPUID-related improvements in Xen 4.7
Introduce “a featureset” in the API/ABI
I Keys as specified in architecture manuals I Can be treated as opaque by higher level software
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 Shared Xen/libxc algorithm for feature dependencies
I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic
CPUID-related improvements in Xen 4.7
Introduce “a featureset” in the API/ABI
I Keys as specified in architecture manuals I Can be treated as opaque by higher level software At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 CPUID-related improvements in Xen 4.7
Introduce “a featureset” in the API/ABI
I Keys as specified in architecture manuals I Can be treated as opaque by higher level software At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests
Shared Xen/libxc algorithm for feature dependencies
I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 Improved hypercalls for policies
I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work
I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs
Future work
Extend the featureset concept to a full CPUID policy
I Needs guest topology information I Will allow control of MAXPHYSADDR
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 More migration stream work
I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs
Future work
Extend the featureset concept to a full CPUID policy
I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies
I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 MSRs
Future work
Extend the featureset concept to a full CPUID policy
I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies
I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work
I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 Future work
Extend the featureset concept to a full CPUID policy
I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies
I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work
I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 CPUID handling for guests
Any Questions?
Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 11 / 11