CPUID handling for guests

Andrew Cooper

Citrix XenServer

Friday 26th August 2016

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 1 / 11 Return information about the processor

I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged

I Useable by userspace I Doesn’t trap to supervisor mode

The CPUID instruction

Introduced in 1993

I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Unpriveleged

I Useable by userspace I Doesn’t trap to supervisor mode

The CPUID instruction

Introduced in 1993

I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor

I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc)

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 The CPUID instruction

Introduced in 1993

I Takes input parameters in %eax and %ecx I Returns values in %eax, %ebx, %ecx and %edx Return information about the processor

I Identifying information (GenuineIntel, AuthenticAMD, etc) I Feature information (available instructions, MSRs, etc) I Topology information (sockets, cores, threads, caches, TLBs, etc) Unpriveleged

I Useable by userspace I Doesn’t trap to supervisor mode

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 2 / 11 Some kernels binary patch themselves (e.g. Alternatives framework)

I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start

I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume

I Not a reboot Guest must not observe a loss of dependent features

OS expectations

CPUID information don’t change after boot

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Userspace function dispatching at process start

I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume

I Not a reboot Guest must not observe a loss of dependent features

OS expectations

CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)

I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Migration modelled as suspend/resume

I Not a reboot Guest must not observe a loss of dependent features

OS expectations

CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)

I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start

I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Guest must not observe a loss of dependent features

OS expectations

CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)

I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start

I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume

I Not a reboot

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 OS expectations

CPUID information don’t change after boot Some kernels binary patch themselves (e.g. Alternatives framework)

I e.g. New features in fastpaths (SMAP) I e.g. Errata workarounds Userspace function dispatching at process start

I e.g. Most efficient memset() I e.g. Hardware crypto if available I e.g. Audio or video processing Migration modelled as suspend/resume

I Not a reboot Guest must not observe a loss of dependent features

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 3 / 11 Available features controlled by:

I CPU family, model and stepping I Firmware version and settings I version I ’s hardware support and administrator choices Typically, “identical” servers are not

I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)

How compatible does hardware need to be?

All features previously used need to continue functioning

I All instructions, MSRs, etc

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 Typically, “identical” servers are not

I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)

How compatible does hardware need to be?

All features previously used need to continue functioning

I All instructions, MSRs, etc Available features controlled by:

I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 Must lie to guests (for their own good)

How compatible does hardware need to be?

All features previously used need to continue functioning

I All instructions, MSRs, etc Available features controlled by:

I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices Typically, “identical” servers are not

I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 How compatible does hardware need to be?

All features previously used need to continue functioning

I All instructions, MSRs, etc Available features controlled by:

I CPU family, model and stepping I Firmware version and settings I Microcode version I Xen’s hardware support and administrator choices Typically, “identical” servers are not

I Different CPUs, especially on RMA’d hardware I Configuring firmware is tedious and error prone Must lie to guests (for their own good)

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 4 / 11 PV guests

I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID

I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting

I Non-architectural, but available in IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen

Controlling a guests view of information

HVM guests

I CPUID instruction traps to I Xen can control all information seen

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 PV Emulated CPUID

I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting

I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen

Controlling a guests view of information

HVM guests

I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests

I CPUID instruction doesn’t trap I Xen has no direct control

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 CPUID Faulting

I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen

Controlling a guests view of information

HVM guests

I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests

I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID

I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 Controlling a guests view of information

HVM guests

I CPUID instruction traps to hypervisor I Xen can control all information seen PV guests

I CPUID instruction doesn’t trap I Xen has no direct control PV Emulated CPUID

I Software can opt in to Xen’s control I Adhoc use in PV guest kernels, but not by userspace CPUID Faulting

I Non-architectural, but available in Intel IvyBridge and later I Causes CPUID to fault with #GP(0) I Xen can control all information seen

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 5 / 11 AMD (All hardware Xen can run on)

I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)

I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits

I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right

Controlling a guests view of information (2)

CPUID Masking

I Non-architectural, “documented” only in whitepapers

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Intel (only some Nehalem/Westmere processors, SandyBridge)

I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits

I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right

Controlling a guests view of information (2)

CPUID Masking

I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)

I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Magic CPUID bits

I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right

Controlling a guests view of information (2)

CPUID Masking

I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)

I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)

I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Controlling a guests view of information (2)

CPUID Masking

I Non-architectural, “documented” only in whitepapers AMD (All hardware Xen can run on)

I Override MSRs I Must be careful not to advertise features unsupported in silicon I Covers basic and extended feature leaves Intel (only some Nehalem/Westmere processors, SandyBridge)

I AND mask against the real hardware values I Different MSRs on different hardware I Covers basic, extended and xsave feature leaves Magic CPUID bits

I APIC and OSXSAVE bits fast forwarded from other state I Interaction with masking completely undocumented I Behaviour reverse engineered, hopefully right

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 6 / 11 Some features don’t have feature bits

I AMD’s Segment Limit Enable FPU pipeline behaviour exposed directly to guests

I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves

I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking

Problematic areas to control

Guests can probe for some features

I 1GB Superpage support I AES instructions

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 FPU pipeline behaviour exposed directly to guests

I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves

I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking

Problematic areas to control

Guests can probe for some features

I 1GB Superpage support I AES instructions Some features don’t have feature bits

I AMD’s Long Mode Segment Limit Enable

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 Some feature bits affect the interpretation of other leaves

I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking

Problematic areas to control

Guests can probe for some features

I 1GB Superpage support I AES instructions Some features don’t have feature bits

I AMD’s Long Mode Segment Limit Enable FPU pipeline behaviour exposed directly to guests

I MXCSR_MASK I Intel’s FPDP and FPCSDS

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 Problematic areas to control

Guests can probe for some features

I 1GB Superpage support I AES instructions Some features don’t have feature bits

I AMD’s Long Mode Segment Limit Enable FPU pipeline behaviour exposed directly to guests

I MXCSR_MASK I Intel’s FPDP and FPCSDS Some feature bits affect the interpretation of other leaves

I CMP_LEGACY, HTT and X2APIC affect the topology interpretation I Can’t control topology information with masking

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 7 / 11 No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency

I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency

I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 No knowledge of feature dependency

I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency

I Buggy assumptions in guests result in crashes

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency

I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 CPUID handling in Xen

No real develoment since the uni-vcpu guest days

I No concept of topology information No auditing of the toolstack-provided information

I Lots of runtime adjustment to regain plausibility Inconsistent whitelist/blacklist approach

I Different approaches between different leaves I Different approaches between different guest types I Some information passed straight through from hardware No knowledge of feature dependency

I Buggy assumptions in guests result in crashes No way for the toolstack to evaluate the hardware capability

I Feature internals exposed in the ABI, with no API PV dependence on leaked CPUID information

I Hardware domain for C/P states, MTRRs I Control domain for building guests

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 8 / 11 At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests

Shared Xen/libxc algorithm for feature dependencies

I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic

CPUID-related improvements in Xen 4.7

Introduce “a featureset” in the API/ABI

I Keys as specified in architecture manuals I Can be treated as opaque by higher level software

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 Shared Xen/libxc algorithm for feature dependencies

I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic

CPUID-related improvements in Xen 4.7

Introduce “a featureset” in the API/ABI

I Keys as specified in architecture manuals I Can be treated as opaque by higher level software At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 CPUID-related improvements in Xen 4.7

Introduce “a featureset” in the API/ABI

I Keys as specified in architecture manuals I Can be treated as opaque by higher level software At boot, Xen calculates featuresets for itself and the toolstack: raw Features supported by hardware host Features used by Xen (after errata, command line, etc) pv Maximum featureset available for PV guests hvm Maximum featureset available for HVM guests

Shared Xen/libxc algorithm for feature dependencies

I Provides consistent logic between Xen and libxc I Build-time calculations to avoid complicated runtime logic

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 9 / 11 Improved hypercalls for policies

I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work

I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs

Future work

Extend the featureset concept to a full CPUID policy

I Needs guest topology information I Will allow control of MAXPHYSADDR

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 More migration stream work

I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs

Future work

Extend the featureset concept to a full CPUID policy

I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies

I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 MSRs

Future work

Extend the featureset concept to a full CPUID policy

I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies

I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work

I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 Future work

Extend the featureset concept to a full CPUID policy

I Needs guest topology information I Will allow control of MAXPHYSADDR Improved hypercalls for policies

I XEN_DOMCTL_get_cpuid_policy I Xen to audit policy plausibility at hypercall time, not runtime More migration stream work

I CPUID policy should be in the migration stream I Avoid mistakes when regenerating I Can finaly audit RAM/CPU state against reality I Allows for a pre-migrate check at the destination MSRs

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 10 / 11 CPUID handling for guests

Any Questions?

Andrew Cooper (Citrix XenServer) CPUID handling for guests Friday 26th August 2016 11 / 11