Intel Haswell

Mike & Kennedy
Computer Architecture 122

Outline

● Development cycle
● Haswell feature overview
● Transistor technology
● System on Chip (SoC)
● AVX2
● Active Idle
● Thunderbolt
● µOperations
● Available platforms
● DDR3 & DDR4

History

Ivy Bridge

Features

From Ivy Bridge
● 22 nm immersion lithography
● 3D Tri-Gate Transistor
● 14 stage Pipeline
● Quad-core
● DDR3 Mainstream
● 64 kB L1 cache
  ○ 32 kB Instruction
  ○ 32 kB Data
● 256 kB L2 cache per core

New Features
● "Stacked" System on a Chip
● Advanced Vector Extensions 2 (AVX2)
● Active Idle
● Thunderbolt
● LGA1150 Socket
● Transactional Synchronization Extensions (TSX)
  ○ Transactional memory support for multithreaded shared-memory code

3D Tri-Gate Transistor

● First introduced in Ivy Bridge
● Significant reduction in current leakage
● More current can flow when the transistor is on
● Lower transition delay
● Incredibly small transistor operating voltage

"Stacked" (SoC)

● Nehalem brought part of the Northbridge onto the CPU
● Multi-Chip Module (MCM)
  ○ Memory control module next to the CPU
  ○ Integrated graphics (Clarkdale)
● Platform Controller Hub (PCH)
  ○ Rest of Northbridge fused with Southbridge
● Sandy Bridge integrated memory control and graphics into the same die as the CPU

Advanced Vector Extensions 2 (AVX2)
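Two items on the AVX2 slide, loading vector elements from non-contiguous memory and the destructive 3-operand fused multiply-accumulate (d=a+b*c), can be modeled in plain Python. This is only a sketch under stated assumptions: the helper names (`gather`, `fma_unfused`, `fma_fused`, `fma_destructive`), the four-element lists standing in for a 256-bit register of doubles, and the `ymm`-style dictionary keys are illustrative choices, not Intel's instruction names or any real API. Exact rational arithmetic stands in for the single rounding that a fused operation performs.

```python
from fractions import Fraction


def gather(memory, indices):
    """Load one element per lane from arbitrary (non-contiguous) positions."""
    return [memory[i] for i in indices]


def fma_unfused(a, b, c):
    """a + b*c as two operations: the product is rounded, then the sum."""
    return a + (b * c)


def fma_fused(a, b, c):
    """a + b*c with one rounding, modeled with exact rational arithmetic."""
    exact = Fraction(a) + Fraction(b) * Fraction(c)
    return float(exact)  # single rounding to the nearest double


def fma_destructive(regs, acc, src1, src2):
    """Lane-wise regs[acc] = regs[acc] + regs[src1] * regs[src2].

    The result overwrites the accumulator operand, so only three
    registers are named (assumed register model, for illustration).
    """
    regs[acc] = [fma_fused(a, b, c)
                 for a, b, c in zip(regs[acc], regs[src1], regs[src2])]


memory = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
regs = {
    "ymm0": gather(memory, [7, 0, 5, 2]),   # [17.0, 10.0, 15.0, 12.0]
    "ymm1": [2.0, 2.0, 2.0, 2.0],
    "ymm2": [1.0, 2.0, 3.0, 4.0],
}
fma_destructive(regs, "ymm0", "ymm1", "ymm2")
print(regs["ymm0"])                          # [19.0, 14.0, 21.0, 20.0]

b, c = 1 + 2**-30, 1 - 2**-30                # b*c is exactly 1 - 2**-60
print(fma_unfused(-1.0, b, c))               # 0.0: the product rounded to 1.0
print(fma_fused(-1.0, b, c) == -(2**-60))    # True: the low bits survive
```

The last two lines hint at why fusing matters beyond saving an instruction: rounding the product before the add discards the low-order bits, while rounding once at the end keeps them.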

● 256-bit integer AVX instructions
  ○ Larger vector register files
● 3-operand general-purpose bit manipulation and multiply
● Vector elements can be loaded from non-contiguous memory locations
● 3-operand fused multiply-accumulate support
  ○ d=a+b*c
  ○ Destructive: d replaces one of the operands (a, b, or c), so only 3 registers are required

Active Idle

● CPU appears active to OS
● 20x lower power usage
● Consumes power levels similar to sleep
● Handled by hardware and firmware

Thunderbolt

● Multiplexes up to 6 daisy-chained devices
● 10 Gbit/s per channel (two channels, 20 Gbit/s total)
● Can deliver up to 10 Watts across the cable
● Direct Memory Access (DMA)

More µOperations!

µOperations???
● Breaking down operations into smaller pieces
● Allows more complex CPU systems
● Types:
  ○ Transfer
  ○ Arithmetic
  ○ Logic
  ○ Shift

● Execution Unit
  ○ 6 µops/cycle -> 8 µops/cycle
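The idea of breaking an operation into smaller pieces can be sketched with a toy decoder. Everything in it is hypothetical: the instruction names (`add_mem`, `add`, `and`, `shl`), the `MicroOp` record, and the scratch register `tmp` are made up for illustration and do not reflect Haswell's real µop encoding. The `best_case_cycles` helper is only an upper-bound style estimate that ignores dependencies and execution-port restrictions.

```python
from dataclasses import dataclass

# Map each toy micro-op to one of the types listed on the slide.
KIND = {"load": "transfer", "store": "transfer",
        "add": "arithmetic", "and": "logic", "shl": "shift"}


@dataclass(frozen=True)
class MicroOp:
    op: str        # key of KIND
    dst: str       # register name, or memory address for "store"
    srcs: tuple    # source register names, or a memory address for "load"

    @property
    def kind(self):
        return KIND[self.op]


def decode(instruction):
    """Split one toy instruction (name, dst, src) into micro-ops."""
    name, dst, src = instruction
    if name == "add_mem":                      # mem[dst] += regs[src]
        return [MicroOp("load", "tmp", (dst,)),
                MicroOp("add", "tmp", ("tmp", src)),
                MicroOp("store", dst, ("tmp",))]
    if name in ("add", "and", "shl"):          # register-only: one micro-op
        return [MicroOp(name, dst, (dst, src))]
    raise ValueError(f"unknown instruction: {name}")


def execute(uops, regs, mem):
    """Run micro-ops in order against a register dict and a memory dict."""
    for u in uops:
        if u.op == "load":
            regs[u.dst] = mem[u.srcs[0]]
        elif u.op == "store":
            mem[u.dst] = regs[u.srcs[0]]
        elif u.op == "add":
            regs[u.dst] = regs[u.srcs[0]] + regs[u.srcs[1]]
        elif u.op == "and":
            regs[u.dst] = regs[u.srcs[0]] & regs[u.srcs[1]]
        elif u.op == "shl":
            regs[u.dst] = regs[u.srcs[0]] << regs[u.srcs[1]]


def best_case_cycles(n_uops, width):
    """Cycles to issue n_uops at `width` per cycle, ignoring dependencies."""
    return -(-n_uops // width)   # ceiling division


regs, mem = {"r1": 5}, {"x": 37}
uops = decode(("add_mem", "x", "r1"))
execute(uops, regs, mem)
print([u.kind for u in uops])   # ['transfer', 'arithmetic', 'transfer']
print(mem["x"])                 # 42
print(best_case_cycles(24, 6), best_case_cycles(24, 8))   # 4 3
```

In this toy model, a batch of 24 independent µops needs 4 cycles at 6 µops/cycle and 3 cycles at 8 µops/cycle, which is the kind of gain the wider execution unit is aiming for.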

Haswell Multi-Chip Module

● GT1 - GT3 graphics, C-states up to C7, 37/47/57W
● GT1 - GT2 graphics, C-states up to C6, 35/45/65/95W
● GT3 graphics, C-states up to C10, 15W
● Platform Controller Hub (PCH) onto CPU (Sharkbay)

DDR3 (Mainstream) / DDR4 (Servers)

● DDR3
  ○ Two Memory Channels
  ○ 800MHz to 2.1GHz
  ○ Operates at 1.5 Volts
● DDR4
  ○ Four Memory Channels
  ○ Starting frequency 2.1GHz, over 4.2GHz possible
  ○ Operates at 1.2 Volts

Summary

● Uses the 22 nm process (like Ivy Bridge)
● Reduced power consumption
● Continues trend towards SoC
● Different versions to cover a wide range of platforms

Broadwell (2014)
● 14nm Die Shrink
● <10W TDP for Ultrabook Version
● Integrate "Stacked" SoC