Madison Processor
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Outline ECE473 Computer Architecture and Organization • Technology Trends • Introduction to Computer Technology Trends Architecture
Outline ECE473 Computer Architecture and Organization • Technology Trends • Introduction to Computer Technology Trends Architecture Lecturer: Prof. Yifeng Zhu Fall, 2009 Portions of these slides are derived from: ECE473 Lec 1.1 ECE473 Lec 1.2 Dave Patterson © UCB Birth of the Revolution -- What If Your Salary? The Intel 4004 • Parameters – $16 base First Microprocessor in 1971 – 59% growth/year – 40 years • Intel 4004 • 2300 transistors • Initially $16 Æ buy book • Barely a processor • 3rd year’s $64 Æ buy computer game • Could access 300 bytes • 16th year’s $27 ,000 Æ buy cacar of memory • 22nd year’s $430,000 Æ buy house th @intel • 40 year’s > billion dollars Æ buy a lot Introduced November 15, 1971 You have to find fundamental new ways to spend money! 108 KHz, 50 KIPs, 2300 10μ transistors ECE473 Lec 1.3 ECE473 Lec 1.4 2002 - Intel Itanium 2 Processor for Servers 2002 – Pentium® 4 Processor • 64-bit processors Branch Unit Floating Point Unit • .18μm bulk, 6 layer Al process IA32 Pipeline Control November 14, 2002 L1I • 8 stage, fully stalled in- cache ALAT Integer Multi- Int order pipeline L1D Medi Datapath RF @3.06 GHz, 533 MT/s bus cache a • Symmetric six integer- CLK unit issue design HPW DTLB 1099 SPECint_base2000* • IA32 execution engine 1077 SPECfp_base2000* integrated 21.6 mm L2D Array and Control L3 Tag • 3 levels of cache on-die totaling 3.3MB 55 Million 130 nm process • 221 Million transistors Bus Logic • 130W @1GHz, 1.5V • 421 mm2 die @intel • 142 mm2 CPU core L3 Cache ECE473 Lec 1.5 ECE473 19.5mm Lec 1.6 Source: http://www.specbench.org/cpu2000/results/ @intel 2006 - Intel Core Duo Processors for Desktop 2008 - Intel Core i7 64-bit x86-64 PERFORMANCE • Successor to the Intel Core 2 family 40% • Max CPU clock: 2.66 GHz to 3.33 GHz • Cores :4(: 4 (physical)8(), 8 (logical) • 45 nm CMOS process • Adding GPU into the processor POWER 40% …relative to Intel® Pentium® D 960 When compared to the Intel® Pentium® D processor 960. -
Lecture Note 1
EE586 VLSI Design Partha Pande School of EECS Washington State University [email protected] Lecture 1 (Introduction) Why is designing digital ICs different today than it was before? Will it change in future? The First Computer The Babbage Difference Engine (1832) 25,000 parts cost: £17,470 ENIAC - The first electronic computer (1946) The Transistor Revolution First transistor Bell Labs, 1948 The First Integrated Circuits Bipolar logic 1960’s ECL 3-input Gate Motorola 1966 Intel 4004 Micro-Processor 1971 1000 transistors 1 MHz operation Intel Pentium (IV) microprocessor Moore’s Law In 1965, Gordon Moore noted that the number of transistors on a chip doubled every 18 to 24 months. He made a prediction that semiconductor technology will double its effectiveness every 18 months Moore’s Law 16 15 14 13 12 11 10 9 8 7 6 OF THE NUMBER OF 2 5 4 LOG 3 2 1 COMPONENTS PER INTEGRATED FUNCTION 0 1959 1960 1961 1962 1963 1964 1965 1966 1967 1968 1969 1970 1971 1972 1973 1974 1975 Electronics, April 19, 1965. Evolution in Complexity Transistor Counts 1 Billion K Transistors 1,000,000 100,000 Pentium® III 10,000 Pentium® II Pentium® Pro 1,000 Pentium® i486 100 i386 80286 10 8086 Source: Intel 1 1975 1980 1985 1990 1995 2000 2005 2010 Projected Courtesy, Intel Moore’s law in Microprocessors 1000 2X growth in 1.96 years! 100 10 P6 Pentium® proc 1 486 386 0.1 286 Transistors (MT) Transistors 8086 Transistors8085 on Lead Microprocessors double every 2 years 0.01 8080 8008 4004 0.001 1970 1980 1990 2000 2010 Year Courtesy, Intel Die Size Growth 100 P6 -
On the Hardware Reduction of Z-Datapath of Vectoring CORDIC
On the Hardware Reduction of z-Datapath of Vectoring CORDIC R. Stapenhurst*, K. Maharatna**, J. Mathew*, J.L.Nunez-Yanez* and D. K. Pradhan* *University of Bristol, Bristol, UK **University of Southampton, Southampton, UK [email protected] Abstract— In this article we present a novel design of a hardware wordlength larger than 18-bits the hardware requirement of it optimal vectoring CORDIC processor. We present a mathematical becomes more than the classical CORDIC. theory to show that using bipolar binary notation it is possible to eliminate all the arithmetic computations required along the z- In this particular work we propose a formulation to eliminate datapath. Using this technique it is possible to achieve three and 1.5 all the arithmetic operations along the z-datapath for conventional times reduction in the number of registers and adder respectively two-sided vector rotation and thereby reducing the hardware compared to classical CORDIC. Following this, a 16-bit vectoring while increasing the accuracy. Also the resulting architecture CORDIC is designed for the application in Synchronizer for IEEE shows significant hardware saving as the wordlength increases. 802.11a standard. The total area and dynamic power consumption Although we stick to the 2’s complement number system, without of the processor is 0.14 mm2 and 700μW respectively when loss of generality, this formulation can be adopted easily for synthesized in 0.18μm CMOS library which shows its effectiveness redundant arithmetic and higher radix formulation. A 16-bit as a low-area low-power processor. processor developed following this formulation requires 0.14 mm2 area and consumes 700 μW dynamic power when synthesized in 0.18μm CMOS library. -
18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures
18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 1/28/2015 Agenda for Today & Next Few Lectures n Single-cycle Microarchitectures n Multi-cycle and Microprogrammed Microarchitectures n Pipelining n Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery, … n Out-of-Order Execution n Issues in OoO Execution: Load-Store Handling, … 2 Reminder on Assignments n Lab 2 due next Friday (Feb 6) q Start early! n HW 1 due today n HW 2 out n Remember that all is for your benefit q Homeworks, especially so q All assignments can take time, but the goal is for you to learn very well 3 Lab 1 Grades 25 20 15 10 5 Number of Students 0 30 40 50 60 70 80 90 100 n Mean: 88.0 n Median: 96.0 n Standard Deviation: 16.9 4 Extra Credit for Lab Assignment 2 n Complete your normal (single-cycle) implementation first, and get it checked off in lab. n Then, implement the MIPS core using a microcoded approach similar to what we will discuss in class. n We are not specifying any particular details of the microcode format or the microarchitecture; you can be creative. n For the extra credit, the microcoded implementation should execute the same programs that your ordinary implementation does, and you should demo it by the normal lab deadline. n You will get maximum 4% of course grade n Document what you have done and demonstrate well 5 Readings for Today n P&P, Revised Appendix C q Microarchitecture of the LC-3b q Appendix A (LC-3b ISA) will be useful in following this n P&H, Appendix D q Mapping Control to Hardware n Optional q Maurice Wilkes, “The Best Way to Design an Automatic Calculating Machine,” Manchester Univ. -
The Advantages of FRAM-Based Smart Ics for Next Generation Government Electronic Ids
The Advantages of FRAM-Based Smart ICs for Next Generation Government Electronic IDs By Joseph Pearson and Dr. Ted Moise Texas Instruments, Inc. September 27, 2007 Executive Summary Electronic versions of passports and other government-issued identification documents use an Integrated Circuit (IC) or chip to establish a digital link between the holder and personal biometric information, such as a digitized photo, fingerprint or iris image. Designed to enhance border, physical and IT security, electronic chips ensure that the person holding a passport or government document is the one to whom it was legitimately issued. The next generation ICs will employ an advanced embedded memory technology, called FRAM (Ferroelectric Random Access Memory), which considerably improves the speed and reliability of future smart, secure e-passports and government ID documents. More than 50 countries have electronic passport (e-passport) programs, and many countries are also putting in place more secure forms of electronic citizen, visitor and government employee identification. As the volume of document issuance increases and new security threats occur, there is an increased need for industry-standard, next-generation contactless smart IC solutions that securely store, process and communicate data. These new smart ICs will have increased writing speeds to produce and process documents faster and more efficiently, as well as enhanced memory for future security requirements. When FRAM is manufactured at the 130 nanometer semiconductor process node, and embedded in a smart IC, it surpasses the limitations of current Electrically Erasable Programmable Read-Only Memory (EEPROM) and other memory technologies used in government ID applications. The imminent introduction of this innovative memory technology for smart ICs signals a shift in performance in smart card applications deployed in government electronic ID documents. -
45-Year CPU Evolution: One Law and Two Equations
45-year CPU evolution: one law and two equations Daniel Etiemble LRI-CNRS University Paris Sud Orsay, France [email protected] Abstract— Moore’s law and two equations allow to explain the a) IC is the instruction count. main trends of CPU evolution since MOS technologies have been b) CPI is the clock cycles per instruction and IPC = 1/CPI is the used to implement microprocessors. Instruction count per clock cycle. c) Tc is the clock cycle time and F=1/Tc is the clock frequency. Keywords—Moore’s law, execution time, CM0S power dissipation. The Power dissipation of CMOS circuits is the second I. INTRODUCTION equation (2). CMOS power dissipation is decomposed into static and dynamic powers. For dynamic power, Vdd is the power A new era started when MOS technologies were used to supply, F is the clock frequency, ΣCi is the sum of gate and build microprocessors. After pMOS (Intel 4004 in 1971) and interconnection capacitances and α is the average percentage of nMOS (Intel 8080 in 1974), CMOS became quickly the leading switching capacitances: α is the activity factor of the overall technology, used by Intel since 1985 with 80386 CPU. circuit MOS technologies obey an empirical law, stated in 1965 and 2 Pd = Pdstatic + α x ΣCi x Vdd x F (2) known as Moore’s law: the number of transistors integrated on a chip doubles every N months. Fig. 1 presents the evolution for II. CONSEQUENCES OF MOORE LAW DRAM memories, processors (MPU) and three types of read- only memories [1]. The growth rate decreases with years, from A. -
The Birth, Evolution and Future of Microprocessor
The Birth, Evolution and Future of Microprocessor Swetha Kogatam Computer Science Department San Jose State University San Jose, CA 95192 408-924-1000 [email protected] ABSTRACT timed sequence through the bus system to output devices such as The world's first microprocessor, the 4004, was co-developed by CRT Screens, networks, or printers. In some cases, the terms Busicom, a Japanese manufacturer of calculators, and Intel, a U.S. 'CPU' and 'microprocessor' are used interchangeably to denote the manufacturer of semiconductors. The basic architecture of 4004 same device. was developed in August 1969; a concrete plan for the 4004 The different ways in which microprocessors are categorized are: system was finalized in December 1969; and the first microprocessor was successfully developed in March 1971. a) CISC (Complex Instruction Set Computers) Microprocessors, which became the "technology to open up a new b) RISC (Reduced Instruction Set Computers) era," brought two outstanding impacts, "power of intelligence" and "power of computing". First, microprocessors opened up a new a) VLIW(Very Long Instruction Word Computers) "era of programming" through replacing with software, the b) Super scalar processors hardwired logic based on IC's of the former "era of logic". At the same time, microprocessors allowed young engineers access to "power of computing" for the creative development of personal 2. BIRTH OF THE MICROPROCESSOR computers and computer games, which in turn led to growth in the In 1970, Intel introduced the first dynamic RAM, which increased software industry, and paved the way to the development of high- IC memory by a factor of four. -
Nano-Cmos Scaling Problems and Implications
CHAPTER 1 NANO-CMOS SCALING PROBLEMS AND IMPLICATIONS 1.1 DESIGN METHODOLOGY IN THE NANO-CMOS ERA As process technology scales beyond 100-nm feature sizes, for functional and high-yielding silicon the traditional design approach needs to be modified to cope with the increased process variation, interconnect processing difficulties, and other newly exacerbated physical effects. The scaling of gate oxide (Figure 1.1) in the nano-CMOS regime results in a significant increase in gate direct tunnel- ing current. Subthreshold leakage and gate direct tunneling current (Figure 1.2) are no longer second-order effects [1,15]. The effect of gate-induced drain leak- age (GIDL) will be felt in designs, such as DRAM (Chapter 7) and low-power SRAM (Chapter 9), where the gate voltage is driven negative with respect to the source [15]. If these effects are not taken care of, the result will be a nonfunctional SRAM, DRAM, or any other circuit that uses this technique to reduce subthresh- old leakage. In some cases even wide muxes and flip-flops may be affected. Subthreshold leakage and gate current are not the only issues that we have to deal with at a functional level, but also the power management of chips for high-performance circuits such as microprocessors, digital signal processors, and graphics processing units. Power management is also a challenge in mobile applications. Furthermore, optical lithography will be stretched to the limit even when en- hanced resolution extension technologies (RETs) are employed. These techniques Nano-CMOS Circuit and Physical Design, by Ban P. Wong, Anurag Mittal, Yu Cao, and Greg Starr ISBN 0-471-46610-7 Copyright © 2005 John Wiley & Sons, Inc. -
Performance and Energy Efficient Network-On-Chip Architectures
Linköping Studies in Science and Technology Dissertation No. 1130 Performance and Energy Efficient Network-on-Chip Architectures Sriram R. Vangal Electronic Devices Department of Electrical Engineering Linköping University, SE-581 83 Linköping, Sweden Linköping 2007 ISBN 978-91-85895-91-5 ISSN 0345-7524 ii Performance and Energy Efficient Network-on-Chip Architectures Sriram R. Vangal ISBN 978-91-85895-91-5 Copyright Sriram. R. Vangal, 2007 Linköping Studies in Science and Technology Dissertation No. 1130 ISSN 0345-7524 Electronic Devices Department of Electrical Engineering Linköping University, SE-581 83 Linköping, Sweden Linköping 2007 Author email: [email protected] Cover Image A chip microphotograph of the industry’s first programmable 80-tile teraFLOPS processor, which is implemented in a 65-nm eight-metal CMOS technology. Printed by LiU-Tryck, Linköping University Linköping, Sweden, 2007 Abstract The scaling of MOS transistors into the nanometer regime opens the possibility for creating large Network-on-Chip (NoC) architectures containing hundreds of integrated processing elements with on-chip communication. NoC architectures, with structured on-chip networks are emerging as a scalable and modular solution to global communications within large systems-on-chip. NoCs mitigate the emerging wire-delay problem and addresses the need for substantial interconnect bandwidth by replacing today’s shared buses with packet-switched router networks. With on-chip communication consuming a significant portion of the chip power and area budgets, there is a compelling need for compact, low power routers. While applications dictate the choice of the compute core, the advent of multimedia applications, such as three-dimensional (3D) graphics and signal processing, places stronger demands for self-contained, low-latency floating-point processors with increased throughput. -
Benchmarking the Intel FPGA SDK for Opencl Memory Interface
The Memory Controller Wall: Benchmarking the Intel FPGA SDK for OpenCL Memory Interface Hamid Reza Zohouri*†1, Satoshi Matsuoka*‡ *Tokyo Institute of Technology, †Edgecortix Inc. Japan, ‡RIKEN Center for Computational Science (R-CCS) {zohour.h.aa@m, matsu@is}.titech.ac.jp Abstract—Supported by their high power efficiency and efficiency on Intel FPGAs with different configurations recent advancements in High Level Synthesis (HLS), FPGAs are for input/output arrays, vector size, interleaving, kernel quickly finding their way into HPC and cloud systems. Large programming model, on-chip channels, operating amounts of work have been done so far on loop and area frequency, padding, and multiple types of blocking. optimizations for different applications on FPGAs using HLS. However, a comprehensive analysis of the behavior and • We outline one performance bug in Intel’s compiler, and efficiency of the memory controller of FPGAs is missing in multiple deficiencies in the memory controller, leading literature, which becomes even more crucial when the limited to significant loss of memory performance for typical memory bandwidth of modern FPGAs compared to their GPU applications. In some of these cases, we provide work- counterparts is taken into account. In this work, we will analyze arounds to improve the memory performance. the memory interface generated by Intel FPGA SDK for OpenCL with different configurations for input/output arrays, II. METHODOLOGY vector size, interleaving, kernel programming model, on-chip channels, operating frequency, padding, and multiple types of A. Memory Benchmark Suite overlapped blocking. Our results point to multiple shortcomings For our evaluation, we develop an open-source benchmark in the memory controller of Intel FPGAs, especially with respect suite called FPGAMemBench, available at https://github.com/ to memory access alignment, that can hinder the programmer’s zohourih/FPGAMemBench. -
Class-Action Lawsuit
Case 3:20-cv-00863-SI Document 1 Filed 05/29/20 Page 1 of 279 Steve D. Larson, OSB No. 863540 Email: [email protected] Jennifer S. Wagner, OSB No. 024470 Email: [email protected] STOLL STOLL BERNE LOKTING & SHLACHTER P.C. 209 SW Oak Street, Suite 500 Portland, Oregon 97204 Telephone: (503) 227-1600 Attorneys for Plaintiffs [Additional Counsel Listed on Signature Page.] UNITED STATES DISTRICT COURT DISTRICT OF OREGON PORTLAND DIVISION BLUE PEAK HOSTING, LLC, PAMELA Case No. GREEN, TITI RICAFORT, MARGARITE SIMPSON, and MICHAEL NELSON, on behalf of CLASS ACTION ALLEGATION themselves and all others similarly situated, COMPLAINT Plaintiffs, DEMAND FOR JURY TRIAL v. INTEL CORPORATION, a Delaware corporation, Defendant. CLASS ACTION ALLEGATION COMPLAINT Case 3:20-cv-00863-SI Document 1 Filed 05/29/20 Page 2 of 279 Plaintiffs Blue Peak Hosting, LLC, Pamela Green, Titi Ricafort, Margarite Sampson, and Michael Nelson, individually and on behalf of the members of the Class defined below, allege the following against Defendant Intel Corporation (“Intel” or “the Company”), based upon personal knowledge with respect to themselves and on information and belief derived from, among other things, the investigation of counsel and review of public documents as to all other matters. INTRODUCTION 1. Despite Intel’s intentional concealment of specific design choices that it long knew rendered its central processing units (“CPUs” or “processors”) unsecure, it was only in January 2018 that it was first revealed to the public that Intel’s CPUs have significant security vulnerabilities that gave unauthorized program instructions access to protected data. 2. A CPU is the “brain” in every computer and mobile device and processes all of the essential applications, including the handling of confidential information such as passwords and encryption keys. -
Lecture 5: Scaled MOSFETS for Ics
Lecture 5: Scaled MOSFETS for ICs MSE 6001, Semiconductor Materials Lectures Fall 2006 5 MOSFETS and scaling Silicon is a mediocre semiconductor, and several other semiconductors have better electrical and optical properties. However, the very high quality of the electrical properties of the silicon-silicon dioxide interface allows very good metal-oxide-semiconductor field-effect transistors (MOSFETs) to be fabricated. These devices have several properties, such as operation frequency and power con- sumption, that improve as their size is scaled down to smaller dimensions, which also allows more transistors to be packed onto a chip. The smaller sizes are achieved using higher-resolution pho- tolithography, which is improved by steadily improving the fabrication technologies. The trends of steadily improving performance and greater integration density with time are generically refered to A 70 Mbit SRAM test vehicle with >as0.5 “Moore’s billion trans Law”.istors Of course, scaling has to end eventually, somewhere before the scale of atoms and incorporating all of the features described in this paper has been fabricated on this technologyis. reached.The aggressive de- sign rules allow for a small 0.57Pm2 6-T SRAM cell that is also compatible with high performance logic processing. A top view of the cell after poly patterni5.1ng is show Basicn in Figur MOSFETe 12. In addition to small size, this cell has a robust static noise margin down to 0.7V VDD to alloMOSFETsw low voltage turnopera- on and off via the non-linear gate capacitor between the gate electrode and the tion (Fig. 13). Figure 14 is a Shmosubstrate.o plot for the The 70 M MOSFETb of Fig.