Microprocessor Packaging (Intel)

Total Page:16

File Type:pdf, Size:1020Kb

Microprocessor Packaging (Intel) MicroprocessorMicroprocessor PackagingPackaging The Key Link in the Chain Koushik Banerjee Technical Advisor Assembly Technology Development Intel Corporation 1 MicroprocessorsMicroprocessors 1971 2001 2 GlobalGlobal PackagingPackaging R&DR&D China Shanghai Philippines Cavite Arizona Chandler Malaysia Penang VirtualVirtual ATDATD 3 MainMain R&DR&D facilityfacility inin Chandler,Chandler, AZAZ 4 Computing needs driving complexity First to introduce organic in mainstream CPUs First to introduce flip chip in Intel 486TM Pentium®mainstreamPentium® Pro Pentium® CPUs II Pentium® III Pentium® 4 Itanium® Processor Processor Processor Processor Processor Processor Processor 25 MHz 1.0+ GHz Ceramic Wire-bond To Organic To Flip Chip 5 LookingLooking aheadahead …… ComplexityComplexity andand ChallengesChallenges toto supportsupport Moore’sMoore’s LawLaw 1. Silicon to package interconnect 2. Within package interconnect 3. Power management 4. Adding more functionality GoalGoal :: BringBring technologytechnology innovationinnovation intointo HighHigh volumevolume manufacturingmanufacturing atat aa LOWLOW COSTCOST 6 SiliconSilicon ?? PackagePackage RelationshipRelationship AnatomyAnatomy 101101 Silicon Processor: The “brain” of the computer (generates instructions) Packaging: The rest of the body (Communicates instructions to the outside world, adds protection) NoNo PackagePackage == NoNo ProductProduct !! Great Packaging = Great Products !! Great Packaging = Great Products !! 7 TheThe KeyKey LinkLink inin thethe ChainChain OpportunityOpportunity Transistor-to-Transistor Innovative,Innovative, efficient,efficient, Ckt Blk to Ckt Blk highhigh performance,performance, lowlow-- costcost packagespackages areare aa significantsignificant competitivecompetitive advantageadvantage Chip-to-Package Package- to-Board Board-to-System 8 ExampleExample –– EnablingEnabling CustomCustom solutionssolutions SiliconSilicon EnablerEnabler ValueValue AddAdd :: customcustom solutionssolutions basedbased onon marketmarket segmentsegment 9 BreakingBreaking BarriersBarriers toto Moore’sMoore’s LawLaw 1 Billion K Transistors 1,000,000 100,000 Pentium®4 Processor Pentium® II Processor ® 10,000 Pentium III Processor Pentium® Processor Pentium® Pro Processor 1,000 i486™ Processor 100 i386™ Processor 80286 TheThe NumberNumber ofof TransistorsTransistors PerPer 10 8086 ChipChip will will Double Double Every Every 18 18 MonthsMonths 1 ’75 ’80 ’85 ’90 ’95 ’00 ’05 ’10 ’15 Source: Intel IntegratedIntegrated PackagingPackaging ++ SiliconSilicon TechnologyTechnology developmentdevelopment isis essentialessential 10 ChallengeChallenge ## 11 SiliconSilicon toto packagepackage interconnectinterconnect 11 ApproachingApproaching 10K10K flipflip chipchip bumpsbumps onon aa diedie Flip Chip (C4) interconnect Underfill 10K Silicon Package Number of flip chip bumps 0 Pentium III® ItaniumTM Family Family Pentium 4® Future Family generations DriverDriver ––increasedincreased siliconsilicon functionality functionality 12 SolutionSolution :: AggressiveAggressive BumpBump PitchPitch ScalingScaling toto keepkeep downdown diedie sizesize Solder Bumps Key Challenges : • Plating bumps • Chip Attach Process • Underfill Human • Joint integrity Hair • HVM scalable process Strand 13 Which leads us to … ChallengeChallenge ## 22 WithinWithin packagepackage InterconnectInterconnect 14 SolutionSolution :: HighHigh DensityDensity InterconnectInterconnect Very high escape routing density from the die Package Traces Lines narrower DriverDriver :: NeedNeed highhigh wiringwiring than hair densitydensity Human Hair 15 DimensionalDimensional StackStack--UpUp Line in Silicon 130 nm (100X magnification) Line in Package 25 um (100X magnification) Line in Motherboard 5 mils (0.005”) (100X magnification) 16 ApproachingApproaching 40K40K micromicro viasvias insideinside aa packagepackage Chip Attach 40K Micro-vias Pads vias in package - 0 # of micro Pentium III® ItaniumTM Package X-section Family Family Pentium 4® Future Family generations DriverDriver ––HighHigh I/OI/O countcount && powerpower supply supply 17 SolutionSolution :: AdvancedAdvanced lithographylithography (new term in packaging !) C/A pads Wire Key Challenges : • Developing HDI (high density interconnect) at LOW COST • High Volume Manufacturing Capable Via Build-Up dielectric Core 18 CoreCore frequencyfrequency trendtrend …… doublingdoubling everyevery 22 yearsyears 100,000 10,000 1,000 Frequency 100 P6 (MHz) Pentium® proc 486 10 8085 386 8086 286 1 8080 8008 0.1 4004 ’70 ’80 ’90 ’00 ’10 Source : Intel Architecture Labs InIn additionaddition …… 19 FSBFSB frequencyfrequency rampramp continuescontinues 400 133 66 Max Mega transfers / second Pentium® II Pentium® III Pentium® 4 Processor Processor Processor Future Generation Processors Microprocessor Generation 20 SolutionSolution :: HighHigh PerformancePerformance InterconnectInterconnect TechnologyTechnology BenefitsBenefits ofof organicorganic 1. Copper – Low resistance 2. Low dielectric constant 3. Cheaper High Performance Silicon Copper Interconnects KeyKey ChallengeChallenge :: ? Optimize the entire substrate architecture (material properties, layer stack-up, via placement, power bussing etc.) Organic Packaging 21 SolutionSolution :: BetterBetter designsdesigns A poor design can ruin processor performance Key Challenges : ? Signal Timing ? Innovative routing – layout ? Optimizing power / ground distribution ? Co-design of the complete silicon ? package interconnect 22 Switching gears from interconnect to … ChallengeChallenge ## 33 PowerPower ManagementManagement 23 PowerPower Increasing,Increasing, siliconsilicon gettinggetting smallersmaller Pentium® Power 100 processors (Watts) 286 486 8086 10 386 8085 8080 8008 1 4004 0.1 ’71 ’74 ’78 ’85 ’92 ’00 ’04 ’08 Source : Intel Architecture Labs Two Challenges … Getting power in & getting heat out 24 ImportanceImportance ofof aa quietquiet PowerPower SupplySupply High Low Ideal state High Low Reality – noise OR High Low OR This is what Voltage Scaling can do 25 NeedNeed lotslots ofof charge,charge, veryvery quicklyquickly …… Increasing distance from supply Hot Water Inefficient Heater design Still Waiting !! Close Proximity to supply 26 SolutionSolution :: OptimizeOptimize designdesign forfor powerpower deliverydelivery KeyKey Challenge:Challenge: ?2X improvement in Cpkg capacitance and Lpkg inductance needed / generation lpkg (pH) Cpkg (uF) ?Need to optimize the complete silicon ? package integrated 0.18 um Future power delivery Generation Generation processors 130 nm processors solution Generation processors 27 SolutionSolution :: ReduceReduce systemsystem designdesign burdenburden –– heatheat removalremoval Temp – Package Temp – Silicon Case (Tc) Temp – Ambient (Tj) (Ta) Temperature Gradient OEM Packaging Provide Solutions Provide Solutions for this interface for this interface of the budget of the budget IntegratedIntegrated ThermalThermal SolutionsSolutions inin thethe packagepackage reducereduce heatheat fluxflux ––easiereasier toto coolcool inin thethe systemsystem 28 ExampleExample :: Pentium4Pentium4® High conductivity Thermal Interface Integrated High Material Conductivity Heat Spreader 29 ExampleExample :: ItaniumItanium® SchematicSchematic ofof howhow aa typicaltypical heatheat pipepipe worksworks Water Vapor Wick Structure Heater Block Vapor Condenses Cooler Section Evaporation Integrated heat pipe Of Heat Pipe Cooler Section technology interfacing Of Heat Pipe directly to the Heat Source Condensed water flows back silicon through the wick structure by capillary action 30 And finally … ChallengeChallenge ## 44 AddingAdding moremore functionalityfunctionality 31 SolutionSolution :: HighHigh DensityDensity interconnectinterconnect == moremore integrationintegration High Density Interconnect enables a large cache memory integration in a small space Leveraging packaging instead of adding onto silicon 32 SolutionSolution :: MassiveMassive integrationintegration == moremore featuresfeatures On CPU Voltage regulation RASM 33 InIn SummarySummary …… 34 WeWe talkedtalked aboutabout futurefuture ComplexityComplexity andand ChallengesChallenges toto supportsupport Moore’sMoore’s LawLaw 1. Silicon to package interconnect 2. Within package interconnect 3. Power management 4. Adding more functionality Intel’sIntel’s PackagingPackaging StrategyStrategy Innovative Technology Making Technology Affordable Smart designs Integrated silicon + packaging solutions 35 For more information, please visit …. http://www.intel.com/research/silicon/packaging.htm http://developer.intel.com/technology/itj/ Search for packaging articles 36.
Recommended publications
  • Wayne Community College 2009-2010 Strategic Plan End-Of-Year Report Table of Contents
    Wayne Community College INSTITUTIONAL EFFECTIVENESS THROUGH PLAN & BUDGET INTEGRATION 2009-2010 Strategic Plan End-of-Year Report Wayne Community College 2009-2010 Strategic Plan End-of-Year Report Table of Contents Planning Group 1 – President Foundation Institutional Advancement Planning Group 2 – VP Academic Services Academic Skills Center Ag & Natural Resources Allied Health Arts & Sciences Business Administration Cooperative Programs Dental Engineering & Mechanical Studies Global Education Information Systems & Computer Technology Language & Communication Library Mathematics Medical Lab Sciences Nursing Pre-Curriculum Public Safety Public Services Science SJAFB Social Science Transportation Planning Group 3 – VP Student Services VP Student Services Admissions & Records Financial Aid Student Activities Student Development Planning Group 4 – VP Educational Support Services VP Educational Support Services Campus Information Services Educational Support Technologies Facilities Operations Information Technology Security Planning Group 5 – VP Continuing Education VP Continuing Education Basic Skills Business & Industry Center Occupational Extension WCC PLANNING DOCUMENT 2009-2010 Department: Foundation Long Range Goal #8: Integrate state-of-practice technology in all aspects of the college’s programs, services, and operations. Short Range Goal #8.2: Expand and improve program accessibility through technology. Objective/Intended Outcome: The Foundation has experienced phenomenal growth in the last three years. With the purchase of the Raisers Edge Software, we have been able to see a direct increase in our revenues. In order to sustain this level of growth, The Foundation either needs to hire extra manpower or purchase additional Raiser’s Edge software to support our growth. 1. Raiser’s Edge NetSolutions: Enhance the Foundation office’s fundraising abilities. The Foundation would be able to accept online donations, reservations for golf tournament, gala, arts and humanities programs and reach out to alumni.
    [Show full text]
  • Soft Machines Targets Ipcbottleneck
    SOFT MACHINES TARGETS IPC BOTTLENECK New CPU Approach Boosts Performance Using Virtual Cores By Linley Gwennap (October 27, 2014) ................................................................................................................... Coming out of stealth mode at last week’s Linley Pro- president/CTO Mohammad Abdallah. Investors include cessor Conference, Soft Machines disclosed a new CPU AMD, GlobalFoundries, and Samsung as well as govern- technology that greatly improves performance on single- ment investment funds from Abu Dhabi (Mubdala), Russia threaded applications. The new VISC technology can con- (Rusnano and RVC), and Saudi Arabia (KACST and vert a single software thread into multiple virtual threads, Taqnia). Its board of directors is chaired by Global Foun- which it can then divide across multiple physical cores. dries CEO Sanjay Jha and includes legendary entrepreneur This conversion happens inside the processor hardware Gordon Campbell. and is thus invisible to the application and the software Soft Machines hopes to license the VISC technology developer. Although this capability may seem impossible, to other CPU-design companies, which could add it to Soft Machines has demonstrated its performance advan- their existing CPU cores. Because its fundamental benefit tage using a test chip that implements a VISC design. is better IPC, VISC could aid a range of applications from Without VISC, the only practical way to improve single-thread performance is to increase the parallelism Application (sequential code) (instructions per cycle, or IPC) of the CPU microarchi- Single Thread tecture. Taken to the extreme, this approach results in massive designs such as Intel’s Haswell and IBM’s Power8 OS and Hypervisor that deliver industry-leading performance but waste power Standard ISA and die area.
    [Show full text]
  • Intel Corporation 2000 Annual Report
    silicon is in 2000 Annual Report i n t e l .c o m i n t c . c o m Intel facts and figures Net revenues Diluted earnings per share Dollars in billions Dollars, adjusted for stock splits 35 1.6 33.7 1.51 30 29.4 1.2 26.3 25 25.1 Intel revenues 1.05 20.8 20 grew 15% in 2000, 0.97 0.86 0.8 giving us our 14th 16.2 15 0.73 consecutive year of 11.5 10 0.50 0.4 8.8 revenue growth. 0.33 0.33 5.8 5 4.8 0.12 0.16 0 0 91 92 93 94 95 9697 98 99 00 91 92 93 94 95 9697 98 99 00 Geographic breakdown of 2000 revenues Return on average stockholders’ equity Percent Percent 100 40 38.4 35.5 35.6 33.3 North America 41% Intel has 30 75 30.2 experienced strong 27.3 28.4 26.2 international growth, 21.6 20 50 with 59% of revenues 20.4 Asia-Pacific 26% outside North America in 2000. 10 25 Europe 24% 0 Japan 9% 91 92 93 94 95 9697 98 99 00 0 Capital additions to property, Stock price trading ranges by fiscal year plant and equipment † Dollars, adjusted for stock splits Dollars in millions 75 8,000 Capital invest- 6,674 ments reflect Intel’s 6,000 50 commitment to building leading-edge manu- 4,501 4,000 4,032 facturing capacity for 3,550 3,403 25 3,024 state-of-the-art 2,441 2,000 silicon products.
    [Show full text]
  • Computer Architecture Out-Of-Order Execution
    Computer Architecture Out-of-order Execution By Yoav Etsion With acknowledgement to Dan Tsafrir, Avi Mendelson, Lihu Rappoport, and Adi Yoaz 1 Computer Architecture 2013– Out-of-Order Execution The need for speed: Superscalar • Remember our goal: minimize CPU Time CPU Time = duration of clock cycle × CPI × IC • So far we have learned that in order to Minimize clock cycle ⇒ add more pipe stages Minimize CPI ⇒ utilize pipeline Minimize IC ⇒ change/improve the architecture • Why not make the pipeline deeper and deeper? Beyond some point, adding more pipe stages doesn’t help, because Control/data hazards increase, and become costlier • (Recall that in a pipelined CPU, CPI=1 only w/o hazards) • So what can we do next? Reduce the CPI by utilizing ILP (instruction level parallelism) We will need to duplicate HW for this purpose… 2 Computer Architecture 2013– Out-of-Order Execution A simple superscalar CPU • Duplicates the pipeline to accommodate ILP (IPC > 1) ILP=instruction-level parallelism • Note that duplicating HW in just one pipe stage doesn’t help e.g., when having 2 ALUs, the bottleneck moves to other stages IF ID EXE MEM WB • Conclusion: Getting IPC > 1 requires to fetch/decode/exe/retire >1 instruction per clock: IF ID EXE MEM WB 3 Computer Architecture 2013– Out-of-Order Execution Example: Pentium Processor • Pentium fetches & decodes 2 instructions per cycle • Before register file read, decide on pairing Can the two instructions be executed in parallel? (yes/no) u-pipe IF ID v-pipe • Pairing decision is based… On data
    [Show full text]
  • Performance of a Computer (Chapter 4) Vishwani D
    ELEC 5200-001/6200-001 Computer Architecture and Design Fall 2013 Performance of a Computer (Chapter 4) Vishwani D. Agrawal & Victor P. Nelson epartment of Electrical and Computer Engineering Auburn University, Auburn, AL 36849 ELEC 5200-001/6200-001 Performance Fall 2013 . Lecture 1 What is Performance? Response time: the time between the start and completion of a task. Throughput: the total amount of work done in a given time. Some performance measures: MIPS (million instructions per second). MFLOPS (million floating point operations per second), also GFLOPS, TFLOPS (1012), etc. SPEC (System Performance Evaluation Corporation) benchmarks. LINPACK benchmarks, floating point computing, used for supercomputers. Synthetic benchmarks. ELEC 5200-001/6200-001 Performance Fall 2013 . Lecture 2 Small and Large Numbers Small Large 10-3 milli m 103 kilo k 10-6 micro μ 106 mega M 10-9 nano n 109 giga G 10-12 pico p 1012 tera T 10-15 femto f 1015 peta P 10-18 atto 1018 exa 10-21 zepto 1021 zetta 10-24 yocto 1024 yotta ELEC 5200-001/6200-001 Performance Fall 2013 . Lecture 3 Computer Memory Size Number bits bytes 210 1,024 K Kb KB 220 1,048,576 M Mb MB 230 1,073,741,824 G Gb GB 240 1,099,511,627,776 T Tb TB ELEC 5200-001/6200-001 Performance Fall 2013 . Lecture 4 Units for Measuring Performance Time in seconds (s), microseconds (μs), nanoseconds (ns), or picoseconds (ps). Clock cycle Period of the hardware clock Example: one clock cycle means 1 nanosecond for a 1GHz clock frequency (or 1GHz clock rate) CPU time = (CPU clock cycles)/(clock rate) Cycles per instruction (CPI): average number of clock cycles used to execute a computer instruction.
    [Show full text]
  • Chap01: Computer Abstractions and Technology
    CHAPTER 1 Computer Abstractions and Technology 1.1 Introduction 3 1.2 Eight Great Ideas in Computer Architecture 11 1.3 Below Your Program 13 1.4 Under the Covers 16 1.5 Technologies for Building Processors and Memory 24 1.6 Performance 28 1.7 The Power Wall 40 1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors 43 1.9 Real Stuff: Benchmarking the Intel Core i7 46 1.10 Fallacies and Pitfalls 49 1.11 Concluding Remarks 52 1.12 Historical Perspective and Further Reading 54 1.13 Exercises 54 CMPS290 Class Notes (Chap01) Page 1 / 24 by Kuo-pao Yang 1.1 Introduction 3 Modern computer technology requires professionals of every computing specialty to understand both hardware and software. Classes of Computing Applications and Their Characteristics Personal computers o A computer designed for use by an individual, usually incorporating a graphics display, a keyboard, and a mouse. o Personal computers emphasize delivery of good performance to single users at low cost and usually execute third-party software. o This class of computing drove the evolution of many computing technologies, which is only about 35 years old! Server computers o A computer used for running larger programs for multiple users, often simultaneously, and typically accessed only via a network. o Servers are built from the same basic technology as desktop computers, but provide for greater computing, storage, and input/output capacity. Supercomputers o A class of computers with the highest performance and cost o Supercomputers consist of tens of thousands of processors and many terabytes of memory, and cost tens to hundreds of millions of dollars.
    [Show full text]
  • Intel Corporation Annual Report 1999
    clients networking and communications intel.com 1999 annual report the building blocks of the internet economy intc.com server platforms solutions and services 29.4 30 2.25 90 2.11 26.3 1.93 25.1 1.73 20.8 20 1.45 1.50 60 16.2 1.01 11.5 10 0.75 30 8.8 0.65 0.65 High 5.8 4.8 3.9 0.31 Close 0.24 0.20 Low INTEL CORPORATION 1999 0 0 0 90 91 92 93 94 95 96 97 98 99 90 91 92 93 94 95 96 97 98 99 90 91 92 93 94 95 96 97 98 99 Net revenues Diluted earnings per share Stock price trading ranges (Dollars in billions) (Dollars, adjusted for stock splits) by fiscal year (Dollars, adjusted for stock splits) 3,111 1999 facts and figures 3,000 45 2,509 Intel’s stock 38.4 2,347 35.5 35.6 price has risen 33.3 2,000 28.4 30 1,808 27.3 at a 48% 26.2 21.2 21.6 20.4 1,296 1,111 970 compound 1,000 15 780 618 517 annual growth 0 rate in the 0 90 91 92 93 94 95 96 97 98 99 90 91 92 93 94 95 96 97 98 99 Research and development Return on average (Dollars in millions, excluding purchased last 10 years. stockholders’ equity in-process research and development) (Percent) 9.76 4,501 9 Japan 4,500 7% 4,032 7.05 3,550 3,403 3,024 5.93 6 Asia- 3,000 Pacific North 5.14 23% America 2,441 43% 1,9 33 3.69 2.80 3 1,500 1,228 2.24 Machinery 948 & equipment 1.63 1.35 Europe 680 1.12 27% Land, buildings & improvements 0 0 90 91 92 93 94 95 96 97 98 99 90 91 92 93 94 95 96 97 98 99 Book value per share Geographic breakdown of 1999 revenues Capital additions to property, at year-end (Percent) plant and equipment† (Dollars, adjusted for stock splits) (Dollars in millions) Past performance does not guarantee future results.
    [Show full text]
  • CPU) the CPU Is the Brains of the Computer, and Is Also Known As the Processor (A Single Chip Also Known As Microprocessor)
    Central processing unit (CPU) The CPU is the brains of the computer, and is also known as the processor (a single chip also known as microprocessor). This electronic component interprets and carries out the basic instructions that operate the computer. Cache as a rule holds data waiting to be processed and instructions waiting to be executed. The main parts of the CPU are: control unit arithmetic logic unit (ALU), and registers – also referred as Cache registers The CPU is connected to a circuit board called the motherboard also known as the system board. Click here to see more information on the CPU Let’s look inside the CPU and see what the different components actually do and how they interact Control unit The control unit directs and co-ordinates most of the operations in the computer. It is a bit similar to a traffic officer controlling traffic! It translates instructions received from a program/application and then begins the appropriate action to carry out the instruction. Specifically the control unit: controls how and when input devices send data stores and retrieves data to and from specific locations in memory decodes and executes instructions sends data to other parts of the CPU during operations sends data to output devices on request Arithmetic Logic Unit (ALU): The ALU is the computer’s calculator. It handles all math operations such as: add subtract multiply divide logical decisions - true or false, and/or, greater then, equal to, or less than Registers Registers are special temporary storage areas on the CPU. They are: used to store items during arithmetic, logic or transfer operations.
    [Show full text]
  • (12) United States Patent (10) Patent No.: US 7,020,893 B2 Connelly (45) Date of Patent: Mar
    US007020893B2 (12) United States Patent (10) Patent No.: US 7,020,893 B2 Connelly (45) Date of Patent: Mar. 28, 2006 (54) METHOD AND APPARATUS FOR 5,945,988 A 8, 1999 Williams et al. CONTINUOUSLY AND 5,977.964. A 11/1999 Williams et al. OPPORTUNISTICALLY DRIVING AN 5.991,841 A 11/1999 Gafken et al. OPTIMIAL BROADCAST SCHEDULE BASED 6,002,393 A 12/1999 Hite et al. ON MOST RECENT CLIENT DEMAND FEEDBACK FROMA DISTRIBUTED SET OF (Continued) BROADCAST CLIENTS FOREIGN PATENT DOCUMENTS (75) Inventor: Jay H. Connelly, Portland, OR (US) WO WO 99,65237 12/1999 (73) Assignee: Intel Corporation, Santa Clara, CA (Continued) (US) OTHER PUBLICATIONS (*) Notice: Subject to any disclaimer, the term of this Intel: Intel Architecture Labs. Internet and Broadcast: The patent is extended or adjusted under 35 Key To Digital Convergence. Utilizing Digital Technology U.S.C. 154(b) by 882 days. to Meet Audience Demand, 2000, pp. 1-4. (21) Appl. No.: 09/882,487 (Continued) Filed: Jun. 15, 2001 Primary Examiner Ngoc Vu (22) (74) Attorney, Agent, or Firm—Blakely, Sokoloff, Taylor & (65) Prior Publication Data Zafman LLP US 20O2/O194598 A1 Dec. 19, 2002 (57) ABSTRACT (51) Int. C. HO)4N 7/173 (2006.01) Abroadcast method and system for continuously and oppor (52) U.S. Cl. ............................ 725/97; 725/91. 725/95; tunistically driving an optimal broadcast schedule based on 725/114; 725/121 most recent client demand feedback from a distributed set of (58) Field of Classification Search .................. 725/95, broadcast clients. The broadcast system includes an opera 725/97, 98, 105,109, 118, 121, 46,91, 92, tion center that broadcasts meta-data to a plurality of client 725/114, 120 systems.
    [Show full text]
  • Reverse Engineering X86 Processor Microcode
    Reverse Engineering x86 Processor Microcode Philipp Koppe, Benjamin Kollenda, Marc Fyrbiak, Christian Kison, Robert Gawlik, Christof Paar, and Thorsten Holz, Ruhr-University Bochum https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/koppe This paper is included in the Proceedings of the 26th USENIX Security Symposium August 16–18, 2017 • Vancouver, BC, Canada ISBN 978-1-931971-40-9 Open access to the Proceedings of the 26th USENIX Security Symposium is sponsored by USENIX Reverse Engineering x86 Processor Microcode Philipp Koppe, Benjamin Kollenda, Marc Fyrbiak, Christian Kison, Robert Gawlik, Christof Paar, and Thorsten Holz Ruhr-Universitat¨ Bochum Abstract hardware modifications [48]. Dedicated hardware units to counter bugs are imperfect [36, 49] and involve non- Microcode is an abstraction layer on top of the phys- negligible hardware costs [8]. The infamous Pentium fdiv ical components of a CPU and present in most general- bug [62] illustrated a clear economic need for field up- purpose CPUs today. In addition to facilitate complex and dates after deployment in order to turn off defective parts vast instruction sets, it also provides an update mechanism and patch erroneous behavior. Note that the implementa- that allows CPUs to be patched in-place without requiring tion of a modern processor involves millions of lines of any special hardware. While it is well-known that CPUs HDL code [55] and verification of functional correctness are regularly updated with this mechanism, very little is for such processors is still an unsolved problem [4, 29]. known about its inner workings given that microcode and the update mechanism are proprietary and have not been Since the 1970s, x86 processor manufacturers have throughly analyzed yet.
    [Show full text]
  • Design of the MIPS Processor
    Design of the MIPS Processor We will study the design of a simple version of MIPS that can support the following instructions: • I-type instructions LW, SW • R-type instructions, like ADD, SUB • Conditional branch instruction BEQ • J-type branch instruction J The instruction formats 6-bit 5-bit 5-bit 5-bit 5-bit 5-bit LW op rs rt immediate SW op rs rt immediate ADD op rs rt rd 0 func SUB op rs rt rd 0 func BEQ op rs rt immediate J op address ALU control ALU control (3-bit) 32 ALU result 32 ALU control input ALU function 000 AND 001 OR 010 add 110 sub 111 Set less than How to generate the ALU control input? The control unit first generates this from the opcode of the instruction. A single-cycle MIPS We consider a simple version of MIPS that uses Harvard architecture. Harvard architecture uses separate memory for instruction and data. Instruction memory is read-only – a programmer cannot write into the instruction memory. To read from the data memory, set Memory read =1 To write into the data memory, set Memory write =1 Instruction fetching Clock Each clock cycle fetches the instruction from the address specified by the PC, and increments PC by 4 at the same time. Executing R-type instructions This is the instruction format for the R-type instructions. Here are the steps in the execution of an R-type instruction: ♦ Read instruction ♦ Read source registers rs and rt ♦ ALU performs the desired operation ♦ Store result in the destination register rd.
    [Show full text]
  • 1. Central Processing Unit (CPU): 2. Memory Unit
    Department of Electrical Engineering. First Year / 2016-2017 By: Salwa Adel Al-agha Lecture 2 1. Central Processing Unit (CPU): The Central Processing Unit (CPU) is an internal component of the computer, portion of a computer system that carries out the instructions of a computer program, to perform the basic arithmetical, logical, and input/output operations of the system. The CPU is the brain of the computer. On personal computers and small workstations, the CPU is housed in a single silicon chip called a microprocessor. Modern CPUs are large scale integrated circuits in small, rectangular packages, with multiple connecting pins. Two typical components of a CPU are: - Arithmetic Logic Unit (ALU). - Control Unit (CU). Modern CPUs are small and square and contain multiple metallic connectors or pins on the underside. Pentium chip or processor, made by Intel, is the most common CPU though there are many other companies that produce processors for personal computers. 2. Memory Unit: Memory is a collection of storage registers used to transfer information in and out of the unit. Memory is one of the easiest pieces of hardware to add to computer. The actual work is done in (memory) and the finished result is stored in (disk). The information stored in the memory as binary code in groups of bits called word. The binary is two logic levels: - Logic (1). - Logic (0). Bit: is binary digit (1) or (0), and Byte: is a group of eight bits. 9 Department of Electrical Engineering. First Year / 2016-2017 By: Salwa Adel Al-agha Lecture 2 Memory in general is divided into two general categories: 2.1 Read Only Memory: Read-Only Memory (ROM) is a class of storage medium used in computers and other electronic devices.
    [Show full text]