Intel Haswell
Mike & Kennedy
Computer Architecture 122
Outline
● Intel development cycle
● Haswell feature overview
● Transistor technology
● System on Chip (SoC)
● AVX2
● Active Idle
● Thunderbolt
● µOperations
● Available platforms
● DDR3 & DDR4
Microarchitecture History
Ivy Bridge Features
From Ivy Bridge
● 22 nm immersion lithography
● 3D Tri-Gate Transistor
● 14-stage Pipeline
● Quad-core
● DDR3 Mainstream
● 64 kB L1 cache
  ○ 32 kB Instruction
  ○ 32 kB Data
● 256 kB L2 cache per core
New Features
● "Stacked" System on a Chip
● Advanced Vector Extensions 2 (AVX2)
● Active Idle
● Thunderbolt
● LGA1150 Socket
● Transactional Synchronization Extensions (TSX)
  ○ Shared-memory multithreaded transactional memory support
3D Tri-Gate Transistor
● First introduced in Ivy Bridge
● Significant reduction in current leakage
● More current can flow when the transistor is on
● Lower transition delay
● Incredibly small Transistor Operating Voltage
"Stacked" System on a Chip (SoC)
● Nehalem brought part of the Northbridge onto the CPU
● Multi-Chip Module (MCM)
  ○ Memory control module next to the CPU
  ○ Integrated graphics (Clarkdale)
● Platform Controller Hub (PCH)
  ○ Rest of Northbridge fused with Southbridge
● Sandy Bridge integrated the memory controller and graphics into the same CPU die
Advanced Vector Extensions 2 (AVX2)
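The non-contiguous loads and the destructive fused multiply-accumulate (d = a + b*c) listed on this slide can be sketched in a few lines of Python. This is only an illustrative model, not Intel's implementation: the helpers `gather` and `fmadd231`, the register dictionary, and the four-lane lists are all assumptions made for the example, and integer lanes are used so the arithmetic is exact (real hardware FMA operates on floating-point lanes with a single rounding step).

```python
def gather(memory, indices):
    """Load one element per index; the indices need not be contiguous."""
    return [memory[i] for i in indices]


def fmadd231(regs, dst, a, b):
    """Hypothetical destructive 3-operand FMA: regs[dst] = regs[dst] + regs[a] * regs[b].

    The result overwrites the accumulator operand, so only three
    registers are named instead of four.
    """
    regs[dst] = [d + x * y for d, x, y in zip(regs[dst], regs[a], regs[b])]


memory = list(range(0, 100, 10))            # [0, 10, 20, ..., 90]
regs = {
    "ymm0": [1, 1, 1, 1],                   # accumulator, overwritten by the result
    "ymm1": gather(memory, [7, 0, 3, 9]),   # [70, 0, 30, 90]
    "ymm2": [2, 2, 2, 2],
}
fmadd231(regs, "ymm0", "ymm1", "ymm2")
print(regs["ymm0"])                         # [141, 1, 61, 181]
```

The accumulator `ymm0` is both an input and the output, which is the "destructive" trade-off: one source value is lost, but the instruction needs only three register fields.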
● 256-bit integer AVX instructions
  ○ Larger vector, register files
● 3-operand general-purpose bit manipulation and multiply
● Vector elements can be loaded from non-contiguous memory locations
● 3-operand fused multiply-accumulate support
  ○ d = a + b*c
  ○ Destructive: d replaces one of the operands (a, b, or c), but only 3 registers are required
Active Idle
● CPU appears active to OS
● 20x lower power usage
● Consumes power levels similar to sleep
● Handled by hardware and firmware
Thunderbolt
● Multiplexing 6 Channels (Devices)
● 10 Gbits/sec per Channel (20 Gbits/sec total)
● Can Deliver Up To 10 Watts Across Cable
● Direct Memory Access (DMA)
More µOperations!
● Execution Unit
  ○ 6 µops/cycle -> 8 µops/cycle
µOperations???
● Breaking down operations into smaller pieces
● Allows more complex CPU systems
● Types:
  ○ Transfer
  ○ Arithmetic
  ○ Logic
  ○ Shift
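As a rough illustration of "breaking down operations into smaller pieces", the sketch below splits one memory-operand add into transfer and arithmetic µops and runs them in order. The instruction, the `MicroOp` class, the `tmp` register, and all function names are hypothetical; actual Haswell decoding is far more involved. The last helper gives an idealized lower bound on issue cycles at 6 versus 8 µops/cycle, ignoring dependencies and stalls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    kind: str     # one of the slide's types: "transfer", "arithmetic", "logic", "shift"
    op: str       # hypothetical micro-op name
    dst: object   # register name or memory address
    srcs: tuple   # source register names or memory addresses


def decode_add_to_memory(addr, reg):
    """Split a hypothetical 'add [addr], reg' instruction into three micro-ops."""
    return [
        MicroOp("transfer", "load", "tmp", (addr,)),        # tmp <- mem[addr]
        MicroOp("arithmetic", "add", "tmp", ("tmp", reg)),  # tmp <- tmp + reg
        MicroOp("transfer", "store", addr, ("tmp",)),       # mem[addr] <- tmp
    ]


def execute(uops, regs, mem):
    """Run micro-ops in order against a register dict and a memory dict."""
    for u in uops:
        if u.op == "load":
            regs[u.dst] = mem[u.srcs[0]]
        elif u.op == "add":
            regs[u.dst] = regs[u.srcs[0]] + regs[u.srcs[1]]
        elif u.op == "store":
            mem[u.dst] = regs[u.srcs[0]]
        else:
            raise ValueError(f"unknown micro-op: {u.op}")


def ideal_issue_cycles(n_uops, width):
    """Lower bound on cycles to issue n_uops at `width` per cycle (no stalls)."""
    return -(-n_uops // width)  # ceiling division


regs = {"rax": 5}
mem = {0x10: 7}
uops = decode_add_to_memory(0x10, "rax")
execute(uops, regs, mem)
print(mem[0x10])                  # 12
print(ideal_issue_cycles(24, 6))  # 4
print(ideal_issue_cycles(24, 8))  # 3
```

In this toy model, widening issue from 6 to 8 µops per cycle only helps when enough independent µops are available each cycle; the bound above assumes they always are.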
Haswell Multi-Chip Module
● GT1 - GT3
  ○ Up to C7
  ○ 37/47/57 W
● GT1 - GT2
  ○ Up to C6
  ○ 35/45/65/95 W
● GT3
  ○ Up to C10
  ○ 15 W
● Platform Controller Hub (PCH) onto CPU (Sharkbay)
DDR3 (Mainstream)
● Two Memory Channels
● 800 MHz to 2.1 GHz
● Operates at 1.5 Volts
DDR4 (Servers)
● Four Memory Channels
● Starting frequency 2.1 GHz, over 4.2 GHz possible
● Operates at 1.2 Volts
Summary
● Uses 22 nm die (like Ivy Bridge)
● Reduced power consumption
● Continues trend towards SoC
● Different versions to cover a wide range of platforms
Broadwell (2014)
● 14 nm Die Shrink
● <10 W TDP for Ultrabook Version
● Integrate "Stacked" SoC