Intel® Itanium® Architecture Update
Total Page:16
File Type:pdf, Size:1020Kb
Intel® Itanium® Architecture Update 5th October, 2006 Dr. Feixiong Liu Technical Manager for HP TSG Intel EMEA Copyright © 2006 Intel Corporation Agenda Itanium® Processor Family Roadmap Itanium® 2 Processor update & technology highlights Montecito Processor Performance update Itanium® 2 Processor Vs Power 5+ competitive position Itanium® 2 Processor Vs X86 Positioning 2 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Quiz How many transistors in the next generation of Intel® Itanium® 2 processors? 1.72 billion transistors 3 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential How big is the transistor in the next generation Itanium® Processor? 4 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Intel® Itanium® Processor Family 2001 2002 2003 2004 2006 future Itanium® 2 Madison9M Montecito** Tukwila** Madison** Common Platform 800MHz 1GHz 1.5GHz 1.5GHz 1.5GHz Multi-Core 4MB L3-Cache 3MB iL3-Cache 6MB iL3-Cache 9MB iL3-Cache 24MB iL3- Developed by 460GX Chip-set E8870 Chip-set E8870 Chip-set E8870 Chip-set Cache former Alpha OEM Chip-sets OEM Chip-sets OEM Chip-sets OEM Chip-sets Dual-Core team 180nm 180nm 130nm 130nm E8870 OEM Chip-sets 90nm **codename All features and dates specified are targets provided for planning purposes only and are subject to change. 5 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Intel® Itanium® Processor Family Roadmap Update 2005 2006 2007 2008 Future Richford Platform ItaniumItanium 22 MontecitoMontecito MontvaleMontvale TukwilaTukwila PoulsonPoulson (Madison(Madison 9M)9M) Intel® E8870 Chipset/OEM Rose Hill/OEM New Technologies • Dual-core • Multi-threading • Intel® Virtualization Technology • New instructions • Cache reliability (Intel® Cache safe Technology) • Faster FSB • Multi-core • Common system architecture w/ Intel® Xeon® SingleSingle DualDual QuadQuad MultiMulti CoreCore CoreCore CoreCore CoreCore • Enhanced RAS • Enhanced virtualization • Enhanced I/O & memory All features and dates specified are targets provided for planning purposes only and are subject to change without notice 6 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Agenda Itanium® Processor Family Roadmap Itanium® 2 Processor update & technology highlights Montecito Processor Performance update 64-bit Windows and SQL Server 2005 on Itanium® processor Itanium® 2 Processor Vs Power 5+ competitive position Itanium® 2 Processor Vs X86 Positioning 7 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Introducing Dual core Intel® Itanium® 2 Processor 9000 Series processor New features for performance Performance Dual-Core 2X 24MB on-die level 3 cache + new 1MB L2 D-cache Higher Intel® Hyper-Threading Technology Intel® Virtualization Technology Intel® Cache Safe Technology 104W, 2.5x performance/watt improvement PCI-Express & DDR2 Plus… Based on EPIC architecture Power Scalability: Systems scaling to 32P, 64P, and 20% beyond Lower Mainframe class reliability features First 1.72 billion transistors processor 8 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Itanium® Momentum Continues “Global 100” Deployments $10B ISA investment ’06-’10 • More than 70 of the world’s 100 largest companies committed to IPF Application Growth 8000+ Price / Performance Leadership $4.99 7000 $3.93 /tpmc Lower is better /tpmc $2.75 Power5 570 /tpmc $2.71 2400 /tpmc Platforms based on 700 Dual-Core Itanium® 2P 2P 4P 4P Dual-Core Itanium 2 Processor 9000 4C 4C 8C 8C 2 Processor 9000 2003 2004 2005 1H’06 Series 1 2 Source: Itanium Solutions Alliance TPC-C 2P TPC-C 4P 1 Source www.tpc.org : Source www.tpc.org : IBM eServer p5 570 4P, POWER5 1.9GHz, 4P, (2 processors, 4 cores, 8 threads), 128 GB memory, Oracle Database 10g Enterprise Edition, IBM AIX 5L V5.3, result of 203,439 tpmC $3.93/tpmC, published on 10/17/05. Itanium® 2 processor results of 200,829 tpmC and $2.75/tpmC on HP Integrity rx4640 using 2 Itanium® 2 processors 1.6GHz with 24MB L3 cache, (2 processors, 4 cores, 8 threads), 128GB memory, Oracle Database 10g Enterprise Edition, HP UX 11.iv2 64-Bit Base OS, was published on 03/21/06. 2 Source www.tpc.org : HP Integrity rx4640-8 4p c/s, on Intel DC Itanium2 Processor 9050 1.6 Ghz, (4 processors, 8 cores, 16 threads) , with 24M L3 Cache, 128GB memory, Microsoft SQL Server 2005 Enterprise Itanium Ed., Microsoft Windows Server 2003 Enterprise Edition SP1, published a result of 290,644 tpmc, $/tpmc of 2.71 USD, on 3/27/2006. IBM eServer p5 570 8P, 9 IBM POWER520th July 1.9GHz, 2006 (4 processors, Intel® 8 cores, Itanium® 16 threads), Architecture256GB memory, IBM Update DB2 UDB 8.1, submitted a result of 429900 tpmc, 4.99 USD/tpmc on 8/31/2004.Rev i1.1_4718844 Intel Confidential Rapid Growth in System Revenue Relative Itanium System Revenue vs. SPARC1 and Power1 60% 1 60% vs. SPARC SpecificSpecific CountriesCountries 50%50% vs. Power2 45% versus versus SPARC1 Power2 40%40% 42% Japan 110% 109% 30%30% Korea 68% 51% 20%20% PRC 55% 43% 10% 10% Russia 352% 97% 0%0% Q103Q103 Q303Q303 Q104Q104 Q304Q304 Q105Q105 Q305Q305 Q106Q106 1: SPARC Includes: SPARC I, SPARC II, SPARC III, SPARC IV, SPARC 64 and SPARC 64 V; 2: POWER Includes: Power RS64 II, Power RS64 III, Power RS64 III, Power 3, Power 4, Power 5, and PowerPC Source: IDC Q1’06 WW Quarterly Server Tracker Other names and brands may be claimed as the property of others 10 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Intel® Itanium® Architecture Processors Madison9M Montecito Technology 130nm 90nm Number of cores 1 2 Clockrate 1600MHz 1600MHz - INT Units 6 6 - MM Units 6 6 - FP Units 2 (*,+) 2 (*,+) - ADDR Units 2L+2S or 4L 2L+2S or 4L L1-Caches (I/D) 16/16KB 16/16KB L2-Cache (I/D) 256KB Unified 1MB/256KB L3-Cache 9MB 24MB on die System Bus 6.4GB/s 8.5GB/s - Clockrate 400MHz 533MHz - Width 128 bit 128 bit 11 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Why Multi-Core? Performance Power 1.00x Max Frequency Relative single-core frequency and Vcc 12 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Over-clocking 1.73x Performance Power 1.13x 1.00x Over-clocked Max Frequency (+20%) Relative single-core frequency and Vcc 13 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Under-clocking 1.73x Performance Power 1.13x 1.00x 0.87x 0.51x Over-clocked Max Frequency Under-clocked (+20%) (-20%) Relative single-core frequency and Vcc 14 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Multi-Core = Energy Efficient Performance Dual-Core 1.73x Performance 1.73x Power 1.13x 1.00x 1.02x Over-clocked Max Frequency Dual-core (+20%) (-20%) Relative single-core frequency and Vcc 15 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Out-of-order versus Explicit Parallelism Legacy Architecture Itanium® Architecture Original Original Parallel Source Source Machine Code Code Hardware Code Implicitly Itanium 2- compiler Implicitly compiler parallel based parallel compiler Sequential Machine Code . Multiple execution . Execution Units difficult to . units used more . use efficiently . efficiently but Massive . compiler more complex Resources Compiler for Itanium® key for successful exploitation of resources 16 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Multi-Threading Approaches Sharing High utilization resources High complexity Medium latency hiding Single Multiple Medium utilization thread/cycle TMT thread/cycle SMT Low complexity Medium latency hiding A A A B B A A B B B Temporal Event A A B B B A A A B B B A A B B B B A A A A A A A B B B B B A A A A A A A A A Medium utilization B B B B B B Low complexity A A A B B B B High latency hiding B B B B B B B B B B “Event” for the core and “Multiple” for the caches 17 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Montecito Thread Switching • Switch events – L3 miss/return • Instruction, Data or HPW access – Time slice expiration – Low power state entered/exited – Switch hint execution (hint@pause instruction retired) – ALAT invalidation 18 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Montecito Multi-threading Serial Execution Ai Idle Ai+1 Bi Idle Bi+1 Montecito Multi-threaded Execution Ai Idle Ai+1 Bi Bi+1 Multi-threading decreases stalls and increase performance 19 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Temporal Multi-Threading ThreadThread 11 RunRun CC StallStall RunRun CC StallStall RunRun Switch ThreadThread 22 RunRun CC StallStall RunRun CC StallStall Overlap memory latency stall of 15 cycle penalty Thread 1 with execution of Thread 2 Stalled Cycles 15-35% Database 100% Performance workloads get a 90% Gain 80% significant 70% speedup 60% 50% 40% 2% core area 30% 20% impact and 0% 10% cycle time 0% impact SpecInt CPI SpecFP CPI TPC-C Single TPC-C Dual- Thread CPI thread CPI 20 20th July 2006 Intel® Itanium® Architecture Update Rev i1.1_4718844 Intel Confidential Unmatched Flexibility Itanium® 2 Architecture Virtualization Benefits Reduce Data Center Complexity Reduce Costs Consolidate applications Increase server utilization Rapid deployment • Dynamic load balancing between apps Smaller data center footprint Improve