Logical Partitions on Many-core Processors
Ramya Masti, Claudio Marforio, Kari Kostiainen, Claudio Soriente, Srdjan Capkun (ETH Zurich)
ACSAC 2015

Infrastructure as a Service (IaaS)
[Figure: apps on per-user OSes and hardware (personal platforms) vs. VMs consolidated on shared cloud hardware]
Personal platforms vs. a shared cloud platform (IaaS):
- Economies of scale
- Shared resources
- Security problems
Resource Partitioning
[Figure: VMs with dedicated hardware resources]
Cloud platform with dedicated resources:
- Guaranteed performance
- Secure against shared-memory/CPU attacks
- Example: IBM cloud
Resource Partitioning
Logical partition: a subset of a system's resources that can run an OS/VM.
[Figure: processor (cores C0-C3) and memory divided between two VMs]
Every logical partition needs dedicated resources.
Many logical partitions therefore need platforms with lots of resources!
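The requirement that each partition gets dedicated resources can be illustrated with a toy allocator that hands out disjoint core sets and memory ranges. This is a sketch for intuition only; the names and granularity are invented and do not reflect the paper's mechanism.

```python
# Toy model: a logical partition = a disjoint set of cores + a dedicated memory range.
class Partitioner:
    def __init__(self, num_cores, mem_size):
        self.free_cores = set(range(num_cores))
        self.next_free_addr = 0
        self.mem_size = mem_size

    def create_partition(self, cores_needed, mem_needed):
        """Reserve cores and a memory range exclusively for one partition."""
        if cores_needed > len(self.free_cores) or \
           self.next_free_addr + mem_needed > self.mem_size:
            raise RuntimeError("not enough dedicated resources left")
        cores = {self.free_cores.pop() for _ in range(cores_needed)}
        base = self.next_free_addr
        self.next_free_addr += mem_needed
        return {"cores": cores, "mem": (base, base + mem_needed)}

# Two partitions on a 4-core platform with 1 GB of memory.
p = Partitioner(num_cores=4, mem_size=1 << 30)
vm1 = p.create_partition(2, 256 << 20)
vm2 = p.create_partition(2, 256 << 20)
```

Because resources are never shared, the platform runs out quickly, which is exactly why many logical partitions need a platform with many resources.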
Many-core Processors for Logical Partitioning
[Figure: many-core processor (many simple cores with caches) hosting many VMs]
Many simple cores could host many logical partitions.
But many-core processors are designed for workloads that share data (no isolation!).
Can many-core architectures support scalable logical partitioning?
Hypervisor Design Alternatives
Three alternatives: traditional, distributed, and centralized.
[Figure: traditional — one hypervisor beneath all VMs; distributed — a hypervisor instance on every core beneath its VM; centralized — the hypervisor on dedicated cores, VMs alone on the others (cores C0..Cn)]
Run-time VM-hypervisor interaction:
- Increases the hypervisor's attack surface
- Causes performance overhead
Centralized Hypervisors Today
Every core supports virtualization:
- Hypervisor and VM modes (higher and lower privilege)
- Privileged instructions and memory addressing
- Enables memory confinement
But every core runs in only one mode, so there is unused virtualization functionality in every core.
Goal: solutions without processor virtualization.
Intel Single-chip Cloud Computer (SCC)
[Figure: SCC block diagram and tile architecture — cores, caches, memory controllers, message-passing buffer, network interface, tiles on a network on chip (NoC)]
Sources: Howard et al., IEEE JSSC, Jan. 2011; "The Future of Many-core Computing", Tim Mattson, Intel Labs
Intel SCC Architecture
[Figure: SCC tile — cores and a network interface with Look-Up Tables (LUTs), attached to the network on chip]
- Asymmetric processor: each core can run an independent OS
- Simple cores: no virtualization support
- Look-Up Tables (LUTs) determine the resources available to a core
- Resources: other tiles, RAM, and I/O
All system resources are memory mapped.
Intel SCC Address Translation
[Figure: two-stage translation]
1. The core's Memory Management Unit maps a virtual address (e.g., 0x87256199) to a physical address (e.g., 0x72008127).
2. The network interface's Look-Up Tables (LUTs) map the physical address to a system-wide address.
The system-wide address targets either on-tile resources (LUT registers, on-tile I/O, on-tile memory) or off-tile resources (DRAM, memory at another tile).
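The LUT stage of the translation can be sketched as follows. The entry format (destination tile plus a system-address prefix) and the 16 MB segment granularity are simplifying assumptions for illustration, not the exact SCC register layout.

```python
# Illustrative sketch of the SCC's second translation stage (entry format simplified).
LUT_ENTRIES = 256   # top 8 bits of the 32-bit core physical address index the LUT
SEGMENT_BITS = 24   # each entry then covers a 16 MB segment

def lut_translate(lut, phys_addr):
    """Map a core-local 32-bit physical address to a system-wide address."""
    index = (phys_addr >> SEGMENT_BITS) & 0xFF
    entry = lut[index]
    if entry is None:
        raise MemoryError("segment not mapped for this core")
    dest_tile, sys_prefix = entry
    offset = phys_addr & ((1 << SEGMENT_BITS) - 1)
    return dest_tile, (sys_prefix << SEGMENT_BITS) | offset

# Example: map core-physical segment 0x72 to a DRAM region behind tile (0, 5).
lut = [None] * LUT_ENTRIES
lut[0x72] = ((0, 5), 0x1A2)
tile, sys_addr = lut_translate(lut, 0x72008127)
```

The key property this models: a core can only reach resources for which its LUT holds an entry, so the LUT contents define the core's view of the system.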
Problem: Lack of Isolation
[Figure: any core on a tile can rewrite its own network-interface LUTs and reach all off-tile memory (DRAM)]
Every core can change its own memory-access configuration!
Solution Intuition
[Figure: a designated master core writes the LUTs on every tile; LUT writes from all other cores are blocked]
The master core configures memory confinement (the LUTs) for all cores.
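A minimal sketch of the master-core rule, assuming a software model in which every LUT write carries the requesting core's ID. The class and method names are invented for illustration; the paper realizes this check in the NoC hardware (emulated in software).

```python
# Sketch: only a designated master core may reconfigure LUTs.
MASTER_CORE = 0

class LUTGuard:
    """Reject LUT writes from any core other than the master."""
    def __init__(self, num_entries=256):
        self.entries = [None] * num_entries

    def write_entry(self, requesting_core, index, entry):
        if requesting_core != MASTER_CORE:
            raise PermissionError("only the master core may reconfigure LUTs")
        self.entries[index] = entry

guard = LUTGuard()
guard.write_entry(0, 0x72, ("tile(0,5)", 0x1A2))  # master core: allowed
```

With writes restricted this way, a VM core can no longer widen its own memory view, which is what restores isolation between partitions.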
Contributions
1. A hardware change to the Intel SCC that enables logical partitioning: isolation on the Network on Chip (emulated in software)
2. A custom hypervisor with a small TCB and attack surface
3. A cloud (IaaS) architecture: implementation and evaluation
Cloud Architecture (IaaS)
[Figure: computing node — Intel SCC running the hypervisor and VMs, a crypto engine (HSM), and HW-virtualized peripherals — connected to the management service]
Management service: user management, storage, and VM management services.
Operation Example: VM Startup
[Figure: sequence between the helper core, hypervisor (master core), VM core, storage service, and crypto engine; VM images are encrypted and installed in the storage service beforehand]
1. Boot
2. Start
3. Fetch the (encrypted) VM image from the storage service
4. Decrypt the VM image via the crypto engine
5. Assign a core and configure its LUTs
6. Start the VM
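The fetch-decrypt-assign-start steps can be sketched end to end. Every class here is an illustrative stub standing in for the storage service, the crypto engine (HSM), and the master-core hypervisor; none of these names or interfaces come from the paper.

```python
# Sketch of VM startup (steps 3-6); all components are illustrative stubs.
class Storage:
    def __init__(self, images): self.images = images
    def fetch(self, vm_id): return self.images[vm_id]           # 3. fetch encrypted VM

class CryptoEngine:
    def decrypt(self, blob): return blob[len("enc:"):]          # 4. decrypt (stub cipher)

class MasterCore:
    def __init__(self): self.free_cores = [1, 2, 3]; self.luts = {}
    def assign_free_core(self): return self.free_cores.pop(0)   # 5. pick a dedicated core
    def configure_luts(self, core, region): self.luts[core] = region  # confine its memory
    def start_core(self, core, image): return (core, image)     # 6. start the VM

def start_vm(vm_id, storage, crypto, master, region):
    image = crypto.decrypt(storage.fetch(vm_id))
    core = master.assign_free_core()
    master.configure_luts(core, region)
    return master.start_core(core, image)

master = MasterCore()
core, image = start_vm("vm1", Storage({"vm1": "enc:kernel"}),
                       CryptoEngine(), master, region=(0, 16 << 20))
```

Note that the LUTs are configured before the VM core starts, so the VM never runs with an unconfined memory view.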
Recap of Main Properties
- No processor virtualization: beneficial for a high number of cores
- Small TCB: the hypervisor implementation is 3.4K LOC; tolerates compromise of other cloud components
- Reduced VM interaction: small attack surface; no virtualization overhead
Comparison

                           No Processor     Small   Reduced VM
                           Virtualization   TCB     Interaction
Xen (>100K LOC)                  ✓            ✗          ✗
HypeBIOS (4K LOC)                ✗            ✓          ✗
NoHype (>100K LOC)               ✗            ✗          ✓
Our solution (3.4K LOC)          ✓            ✓          ✓

Even merging HypeBIOS and NoHype would still rely on processor virtualization.
Summary
Can a many-core processor host a hypervisor and many isolated VMs? Yes, with minor modifications.
Intel SCC case study: isolation on the Network on Chip (NoC).
Thank you!
Ramya Masti, Claudio Marforio, Kari Kostiainen, Claudio Soriente, Srdjan Capkun