Computer Architecture and Performance Tuning

Total Page:16

File Type:pdf, Size:1020Kb

Computer Architecture and Performance Tuning Computer Architecture and Performance Tuning “Recent progress in Processor Architectures and the 7 Dimensions of Performance” Vincenzo Innocente Original Slides by Sverre Jarp CERN Honorary Staff ESC 2016 – Bertinoro, Italy – October 2016 Computer Architecture and Performance Tuning Goal of these lectures 1. Give an understanding of modern computer architectures from a performance point-of-view § Processor, Memory subsystem, Caches § Use x86-64 as a de-facto standard § But keep an eye on ARM64, POWER(-PC), as well as GPUs/accelerators 2. Explain hardware factors that improve or degrade program execution speed § Prepare for writing well-performing software 2 Sverre Jarp Computer Architecture and Performance Tuning Contents § Introduction: § Setting the Scene; Scaling “laws” § Complexity in Computing § Basic Architecture § Memory subsystem § Performance Dimensions: § Vectorisation § Instruction level parallelism § Multi-core parallelisation § Conclusion 3 Sverre Jarp Computer Architecture and Performance Tuning The Big Issues (from an architectural viewpoint) 4 Sverre Jarp Computer Architecture and Performance Tuning Where are we coming from? § Von Neumann architecture (since forever) § Memory: § Single, homogeneous memory John von Neumann (1903 – 57) § Low latency Source : Wikipedia § Primitive machine code (assembly) § CPU scaling: § Moore’s law (1965) § Dennard scaling (1974) § Little or no parallelism Robert Dennard (IBM) Source : Wikipedia 5 Sverre Jarp Computer Architecture and Performance Tuning Von Neumann architecture § From Wikipedia: Algorithms and Data Structures § The von Neumann architecture is a computer Instructions Data design model that uses a processing unit and a single separate storage structure to hold both instructions and data. Results § It can be viewed as an Input entity into which one streams instructions and data in order to produce Processing results 6 Sverre Jarp Computer Architecture and Performance Tuning Von Neumann architecture (cont’d) Algorithms and Data Structures § The goal is to produce results as fast as possible Instructions Data § But, lots of problems can occur: § Instructions or data don’t arrive in time § Bandwidth issues? § Latency issues? Results Input § Clashes between input data Caches and output data Processing § Other “complexity-based” problems inside an extreme Many people think the architecture processing parallelism is out-dated. But nobody has managed to replace it (yet). 7 Sverre Jarp Computer Architecture and Performance Tuning Moore’s “law” § A marching order established ~50 years ago § “Let’s continue to double the number of transistors every other year!” § First published as: § Moore, G.E.: Cramming more components onto integrated circuits. Electronics, 38(8), April 1965. § Accepted by all partners: § Semiconductor manufacturers § Hardware integrators § Software companies § Us, the consumers From Wikipedia 8 Sverre Jarp Computer Architecture and Performance Tuning Semiconductor evolution (as 2014..) § Today’s silicon processes: § 28, 22 nm We are here § Being introduced: § 14 nm (2013/14) § In research: § 10 nm (2015/16) LHC data § 7 nm (2017/18) § 5 nm (2019/20) S. Borkar et al. (Intel), "Platform 2015: Intel Platform Evolution for the Next Decade", 2005. – Source: Intel 2 nm (2028?) TSMC § By the end of this decade we will have chips with ~100’000’000’000 (1011) transistors! § And, this will continue to drive innovation 9 Sverre Jarp Computer Architecture and Performance Tuning frequency scaling… § The 7 “fat” years of frequency scaling: § The Pentium Pro in 1996: 150 MHz (12W) § The Pentium 4 in 2003: 3.8 GHz (~25x) (115W) § Since then § Core 2 systems: § ~3 GHz § Multi-core § Recent CERN purchase: § Intel Xeon E5-2630 v3 © 2009 Herb § “only” 2.40 GHz (85W) Sutter § 8 core 10 Sverre Jarp Computer Architecture and Performance Tuning …vs Memory Latency 11 Sverre Jarp Computer Architecture and Performance Tuning Where are we today ? § Von Neumann architecture (unchanged) § Memory: § Multi-layered, complex layout § Non-uniform; even disjoint § High latency Things have § Primitive machine code (unchanged) become worse! § CPU scaling: § Moore’s law: Slowing down § Dennard scaling: Practically gone § Extreme parallelism at all levels § Instruction, Chip, System 12 Sverre Jarp Computer Architecture and Performance Tuning Real consequence of Moore’s law § We are being “snowed under” by “innovation”: § More (and more complex) execution units § Hundreds of new instructions § Longer SIMD/SSE hardware vectors § More and more cores § Specialised accelerators § Complex cache hierarchies § In order to profit we need to “think parallel” § Data parallelism § Task parallelism “Data Oriented Design” 13 Sverre Jarp Computer Architecture and Performance Tuning Let’s start with the basics! For a review of one of the latest INTEL architecture see http://www.anandtech.com/show/6355/intels-haswell-architecture 14 Sverre Jarp Computer Architecture and Performance Tuning Simple processor layout Keeps the state of execution § A simple processor with IC four key components: § Control Logic FU Flags § Instruction Counter § Program Status Word PSW Control § Register file R0 Data § functional Unit Data R1 transfer § Data Transfer Unit unit § Data bus Address RNN § Address bus Registers 15 Sverre Jarp Computer Architecture and Performance Tuning Simple server diagram § Multiple components which Socket 0 Socket 1 interact during the execution C0T0 of a program: C0 C1 C0C0T1 C1 § Processors/cores C2 C3 C2 C3 § w/private caches C4 C5 C4 C5 – I-cache, D-cache Shared Shared cache cache § Shared caches Mem-ctl Mem-ctl § Instructions and Data Memory Memory § Memory controllers Interconnect § Memory (non-uniform) § I/O subsystem § Network attachment I/O bus § Disk subsystem Intel Nehalem 16 Sverre Jarp Computer Architecture and Performance Tuning Memory Subsystem 17 Sverre Jarp Computer Architecture and Performance Tuning Optimal Memory Programming § What needs to be understood: § The memory hierarchy § Main memory – Physical layout – Latency – Bandwidth § Caches – Physical layout, Line sizes – Levels/Sharing – Latency § Programmer/Compiler – Data Layout – Data Locality § Execution environment: – Affinity 18 Sverre Jarp Computer Architecture and Performance Tuning Cache/Memory Hierarchy Processor Core (Registers) § From CPU to main memory on (R:64B + W:32B)/1c a recent Haswell L1I L1D 4c latency processor (32 KB) (32 KB) § With multicore, L2 R: 64B/1c memory (256 KB) 11c latency bandwidth is shared 32B/1c for all cores between > 21c latency cores in the same Shared L3 processor (~20 MB) (socket) ~24 B/c for all cores > 200c latency Local/remote memory (large, but typically non-uniform) c = cycle 19 Sverre Jarp Computer Architecture and Performance Tuning Cache lines (1) § When a data element or an instruction is requested by the processor, a cache line is ALWAYS moved (as the minimum quantity), usually to Level-1 requested § A cache line is a contiguous section of memory, typically 64B in size (8 * double) and 64B aligned § A 32KB Level-1 cache can hold 512 lines § When cache lines have to be moved come from memory § Latency is long (>200 cycles) § It is even longer if the memory is remote § Memory controller stays busy (~8 cycles) 20 Sverre Jarp Computer Architecture and Performance Tuning Cache lines (2) § Good utilisation is vital § When only one element (4B or 8B) element is used inside the cache line: § A lot of bandwidth is wasted! requested § Multidimensional C arrays should be accessed with the last index changing fastest: for (i = 0; i < rows; ++i) for (j = 0; j < columns; ++j) mymatrix [i] [j] += increment; § Pointer chasing (in linked lists) can easily lead to “cache thrashing” (too much memory traffic) 21 Sverre Jarp Computer Architecture and Performance Tuning Cache lines (3) § Prefetching: § fetch a cache line before it is requested § Hiding latency § Normally done by the hardware § Especially if processor executes Out-of-order § Also done by software instructions § Especially when In-order (IA-64, Xeon Phi, etc.) § Locality is vital: § Spatial locality – Use all elements in the line § Temporal locality – Complete the execution whilst the elements are certain to be in the cache Programming the memory hierarchy is an art in itself. 22 Sverre Jarp Computer Architecture and Performance Tuning Cache/Memory Trends Processor Core § The trend is to deepen and diversify the L1I L1D cache/memory hierarchy: § Additional L2 levels of cache Shared L3 § Multiple kinds of large memories Shared L4 § Non-volatile memories (great for databases, Local/remote Local/remote Non-volatile etc.) memory (1) memory (2) memory (3) Larger, slower Faster, smaller 23 Sverre Jarp Computer Architecture and Performance Tuning Latency Measurements (example) § Memory Latency on Sandy Bridge-EP 2690 (dual socket) § 90 ns (local) versus 150 ns (remote) Socket 0 Socket 1 C0T0 C0 C1 C0C0T1 C1 C2 C3 C2 C3 C4 C5 C4 C5 Shared Shared cache cache Mem-ctl Mem-ctl Memory Interconnect I/O bus 24 Sverre Jarp Computer Architecture and Performance Tuning Recent architectures Source AnandTech 25 Sverre Jarp Computer Architecture and Performance Tuning Current GPU Memory Layout § CPU and GPU memories are separate CPU GPU § What everybody wants System GPU is a single unified view Unified Memory of memory Memory Memory § One vision is “Heterogeneous Systems Architecture”(HSA) pushed by AMD, ARM, and others § AMD Kaveri APU § NVidia+IBM: Unified Memory over NVLink § Annnounced in 2016: OpenCAPI (everybody but INTEL) 26 Sverre Jarp Computer
Recommended publications
  • Computer Architectures
    Computer Architectures Central Processing Unit (CPU) Pavel Píša, Michal Štepanovský, Miroslav Šnorek The lecture is based on A0B36APO lecture. Some parts are inspired by the book Paterson, D., Henessy, V.: Computer Organization and Design, The HW/SW Interface. Elsevier, ISBN: 978-0-12-370606-5 and it is used with authors' permission. Czech Technical University in Prague, Faculty of Electrical Engineering English version partially supported by: European Social Fund Prague & EU: We invests in your future. AE0B36APO Computer Architectures Ver.1.10 1 Computer based on von Neumann's concept ● Control unit Processor/microprocessor ● ALU von Neumann architecture uses common ● Memory memory, whereas Harvard architecture uses separate program and data memories ● Input ● Output Input/output subsystem The control unit is responsible for control of the operation processing and sequencing. It consists of: ● registers – they hold intermediate and programmer visible state ● control logic circuits which represents core of the control unit (CU) AE0B36APO Computer Architectures 2 The most important registers of the control unit ● PC (Program Counter) holds address of a recent or next instruction to be processed ● IR (Instruction Register) holds the machine instruction read from memory ● Another usually present registers ● General purpose registers (GPRs) may be divided to address and data or (partially) specialized registers ● SP (Stack Pointer) – points to the top of the stack; (The stack is usually used to store local variables and subroutine return addresses) ● PSW (Program Status Word) ● IM (Interrupt Mask) ● Optional Floating point (FPRs) and vector/multimedia regs. AE0B36APO Computer Architectures 3 The main instruction cycle of the CPU 1. Initial setup/reset – set initial PC value, PSW, etc.
    [Show full text]
  • Microprocessor Architecture
    EECE416 Microcomputer Fundamentals Microprocessor Architecture Dr. Charles Kim Howard University 1 Computer Architecture Computer System CPU (with PC, Register, SR) + Memory 2 Computer Architecture •ALU (Arithmetic Logic Unit) •Binary Full Adder 3 Microprocessor Bus 4 Architecture by CPU+MEM organization Princeton (or von Neumann) Architecture MEM contains both Instruction and Data Harvard Architecture Data MEM and Instruction MEM Higher Performance Better for DSP Higher MEM Bandwidth 5 Princeton Architecture 1.Step (A): The address for the instruction to be next executed is applied (Step (B): The controller "decodes" the instruction 3.Step (C): Following completion of the instruction, the controller provides the address, to the memory unit, at which the data result generated by the operation will be stored. 6 Harvard Architecture 7 Internal Memory (“register”) External memory access is Very slow For quicker retrieval and storage Internal registers 8 Architecture by Instructions and their Executions CISC (Complex Instruction Set Computer) Variety of instructions for complex tasks Instructions of varying length RISC (Reduced Instruction Set Computer) Fewer and simpler instructions High performance microprocessors Pipelined instruction execution (several instructions are executed in parallel) 9 CISC Architecture of prior to mid-1980’s IBM390, Motorola 680x0, Intel80x86 Basic Fetch-Execute sequence to support a large number of complex instructions Complex decoding procedures Complex control unit One instruction achieves a complex task 10
    [Show full text]
  • What Do We Mean by Architecture?
    Embedded programming: Comparing the performance and development workflows for architectures Embedded programming week FABLAB BRIGHTON 2018 What do we mean by architecture? The architecture of microprocessors and microcontrollers are classified based on the way memory is allocated (memory architecture). There are two main ways of doing this: Von Neumann architecture (also known as Princeton) Von Neumann uses a single unified cache (i.e. the same memory) for both the code (instructions) and the data itself, Under pure von Neumann architecture the CPU can be either reading an instruction or reading/writing data from/to the memory. Both cannot occur at the same time since the instructions and data use the same bus system. Harvard architecture Harvard architecture uses different memory allocations for the code (instructions) and the data, allowing it to be able to read instructions and perform data memory access simultaneously. The best performance is achieved when both instructions and data are supplied by their own caches, with no need to access external memory at all. How does this relate to microcontrollers/microprocessors? We found this page to be a good introduction to the topic of microcontrollers and ​ ​ microprocessors, the architectures they use and the difference between some of the common types. First though, it’s worth looking at the difference between a microprocessor and a microcontroller. Microprocessors (e.g. ARM) generally consist of just the Central ​ ​ Processing Unit (CPU), which performs all the instructions in a computer program, including arithmetic, logic, control and input/output operations. Microcontrollers (e.g. AVR, PIC or ​ 8051) contain one or more CPUs with RAM, ROM and programmable input/output ​ peripherals.
    [Show full text]
  • The Von Neumann Computer Model 5/30/17, 10:03 PM
    The von Neumann Computer Model 5/30/17, 10:03 PM CIS-77 Home http://www.c-jump.com/CIS77/CIS77syllabus.htm The von Neumann Computer Model 1. The von Neumann Computer Model 2. Components of the Von Neumann Model 3. Communication Between Memory and Processing Unit 4. CPU data-path 5. Memory Operations 6. Understanding the MAR and the MDR 7. Understanding the MAR and the MDR, Cont. 8. ALU, the Processing Unit 9. ALU and the Word Length 10. Control Unit 11. Control Unit, Cont. 12. Input/Output 13. Input/Output Ports 14. Input/Output Address Space 15. Console Input/Output in Protected Memory Mode 16. Instruction Processing 17. Instruction Components 18. Why Learn Intel x86 ISA ? 19. Design of the x86 CPU Instruction Set 20. CPU Instruction Set 21. History of IBM PC 22. Early x86 Processor Family 23. 8086 and 8088 CPU 24. 80186 CPU 25. 80286 CPU 26. 80386 CPU 27. 80386 CPU, Cont. 28. 80486 CPU 29. Pentium (Intel 80586) 30. Pentium Pro 31. Pentium II 32. Itanium processor 1. The von Neumann Computer Model Von Neumann computer systems contain three main building blocks: The following block diagram shows major relationship between CPU components: the central processing unit (CPU), memory, and input/output devices (I/O). These three components are connected together using the system bus. The most prominent items within the CPU are the registers: they can be manipulated directly by a computer program. http://www.c-jump.com/CIS77/CPU/VonNeumann/lecture.html Page 1 of 15 IPR2017-01532 FanDuel, et al.
    [Show full text]
  • Hardware Architecture
    Hardware Architecture Components Computing Infrastructure Components Servers Clients LAN & WLAN Internet Connectivity Computation Software Storage Backup Integration is the Key ! Security Data Network Management Computer Today’s Computer Computer Model: Von Neumann Architecture Computer Model Input: keyboard, mouse, scanner, punch cards Processing: CPU executes the computer program Output: monitor, printer, fax machine Storage: hard drive, optical media, diskettes, magnetic tape Von Neumann architecture - Wiki Article (15 min YouTube Video) Components Computer Components Components Computer Components CPU Memory Hard Disk Mother Board CD/DVD Drives Adaptors Power Supply Display Keyboard Mouse Network Interface I/O ports CPU CPU CPU – Central Processing Unit (Microprocessor) consists of three parts: Control Unit • Execute programs/instructions: the machine language • Move data from one memory location to another • Communicate between other parts of a PC Arithmetic Logic Unit • Arithmetic operations: add, subtract, multiply, divide • Logic operations: and, or, xor • Floating point operations: real number manipulation Registers CPU Processor Architecture See How the CPU Works In One Lesson (20 min YouTube Video) CPU CPU CPU speed is influenced by several factors: Chip Manufacturing Technology: nm (2002: 130 nm, 2004: 90nm, 2006: 65 nm, 2008: 45nm, 2010:32nm, Latest is 22nm) Clock speed: Gigahertz (Typical : 2 – 3 GHz, Maximum 5.5 GHz) Front Side Bus: MHz (Typical: 1333MHz , 1666MHz) Word size : 32-bit or 64-bit word sizes Cache: Level 1 (64 KB per core), Level 2 (256 KB per core) caches on die. Now Level 3 (2 MB to 8 MB shared) cache also on die Instruction set size: X86 (CISC), RISC Microarchitecture: CPU Internal Architecture (Ivy Bridge, Haswell) Single Core/Multi Core Multi Threading Hyper Threading vs.
    [Show full text]
  • The Von Neumann Architecture of Computer Systems
    The von Neumann Architecture of Computer Systems http://www.csupomona.edu/~hnriley/www/VonN.html The von Neumann Architecture of Computer Systems H. Norton Riley Computer Science Department California State Polytechnic University Pomona, California September, 1987 Any discussion of computer architectures, of how computers and computer systems are organized, designed, and implemented, inevitably makes reference to the "von Neumann architecture" as a basis for comparison. And of course this is so, since virtually every electronic computer ever built has been rooted in this architecture. The name applied to it comes from John von Neumann, who as author of two papers in 1945 [Goldstine and von Neumann 1963, von Neumann 1981] and coauthor of a third paper in 1946 [Burks, et al. 1963] was the first to spell out the requirements for a general purpose electronic computer. The 1946 paper, written with Arthur W. Burks and Hermann H. Goldstine, was titled "Preliminary Discussion of the Logical Design of an Electronic Computing Instrument," and the ideas in it were to have a profound impact on the subsequent development of such machines. Von Neumann's design led eventually to the construction of the EDVAC computer in 1952. However, the first computer of this type to be actually constructed and operated was the Manchester Mark I, designed and built at Manchester University in England [Siewiorek, et al. 1982]. It ran its first program in 1948, executing it out of its 96 word memory. It executed an instruction in 1.2 milliseconds, which must have seemed phenomenal at the time. Using today's popular "MIPS" terminology (millions of instructions per second), it would be rated at .00083 MIPS.
    [Show full text]
  • An Overview of Parallel Computing
    An Overview of Parallel Computing Marc Moreno Maza University of Western Ontario, London, Ontario (Canada) Chengdu HPC Summer School 20-24 July 2015 Plan 1 Hardware 2 Types of Parallelism 3 Concurrency Platforms: Three Examples Cilk CUDA MPI Hardware Plan 1 Hardware 2 Types of Parallelism 3 Concurrency Platforms: Three Examples Cilk CUDA MPI Hardware von Neumann Architecture In 1945, the Hungarian mathematician John von Neumann proposed the above organization for hardware computers. The Control Unit fetches instructions/data from memory, decodes the instructions and then sequentially coordinates operations to accomplish the programmed task. The Arithmetic Unit performs basic arithmetic operation, while Input/Output is the interface to the human operator. Hardware von Neumann Architecture The Pentium Family. Hardware Parallel computer hardware Most computers today (including tablets, smartphones, etc.) are equipped with several processing units (control+arithmetic units). Various characteristics determine the types of computations: shared memory vs distributed memory, single-core processors vs multicore processors, data-centric parallelism vs task-centric parallelism. Historically, shared memory machines have been classified as UMA and NUMA, based upon memory access times. Hardware Uniform memory access (UMA) Identical processors, equal access and access times to memory. In the presence of cache memories, cache coherency is accomplished at the hardware level: if one processor updates a location in shared memory, then all the other processors know about the update. UMA architectures were first represented by Symmetric Multiprocessor (SMP) machines. Multicore processors follow the same architecture and, in addition, integrate the cores onto a single circuit die. Hardware Non-uniform memory access (NUMA) Often made by physically linking two or more SMPs (or multicore processors).
    [Show full text]
  • Introduction to Graphics Hardware and GPU's
    Blockseminar: Verteiltes Rechnen und Parallelprogrammierung Tim Conrad GPU Computer: CHINA.imp.fu-berlin.de GPU Intro Today • Introduction to GPGPUs • “Hands on” to get you started • Assignment 3 • Projects GPU Intro Traditional Computing Von Neumann architecture: instructions are sent from memory to the CPU Serial execution: Instructions are executed one after another on a single Central Processing Unit (CPU) Problems: • More expensive to produce • More expensive to run • Bus speed limitation GPU Intro Parallel Computing Official-sounding definition: The simultaneous use of multiple compute resources to solve a computational problem. Benefits: • Economical – requires less power and cheaper to produce • Better performance – bus/bottleneck issue Limitations: • New architecture – Von Neumann is all we know! • New debugging difficulties – cache consistency issue GPU Intro Flynn’s Taxonomy Classification of computer architectures, proposed by Michael J. Flynn • SISD – traditional serial architecture in computers. • SIMD – parallel computer. One instruction is executed many times with different data (think of a for loop indexing through an array) • MISD - Each processing unit operates on the data independently via independent instruction streams. Not really used in parallel • MIMD – Fully parallel and the most common form of parallel computing. GPU Intro Enter CUDA CUDA is NVIDIA’s general purpose parallel computing architecture • Designed for calculation-intensive computation on GPU hardware • CUDA is not a language, it is an API • We will
    [Show full text]
  • Computer Architecture
    Computer architecture Compendium for INF2270 Philipp Häfliger, Dag Langmyhr and Omid Mirmotahari Spring 2013 Contents Contents iii List of Figures vii List of Tables ix 1 Introduction1 I Basics of computer architecture3 2 Introduction to Digital Electronics5 3 Binary Numbers7 3.1 Unsigned Binary Numbers.........................7 3.2 Signed Binary Numbers..........................7 3.2.1 Sign and Magnitude........................7 3.2.2 Two’s Complement........................8 3.3 Addition and Subtraction.........................8 3.4 Multiplication and Division......................... 10 3.5 Extending an n-bit binary to n+k bits................... 11 4 Boolean Algebra 13 4.1 Karnaugh maps............................... 16 4.1.1 Karnaugh maps with 5 and 6 bit variables........... 18 4.1.2 Karnaugh map simplification with ‘X’s.............. 19 4.1.3 Karnaugh map simplification based on zeros.......... 19 5 Combinational Logic Circuits 21 5.1 Standard Combinational Circuit Blocks................. 22 5.1.1 Encoder............................... 23 5.1.2 Decoder.............................. 24 5.1.3 Multiplexer............................. 25 5.1.4 Demultiplexer........................... 26 5.1.5 Adders............................... 26 6 Sequential Logic Circuits 31 6.1 Flip-Flops.................................. 31 6.1.1 Asynchronous Latches...................... 31 6.1.2 Synchronous Flip-Flops...................... 34 6.2 Finite State Machines............................ 37 6.2.1 State Transition Graphs...................... 37 6.3 Registers................................... 39 6.4 Standard Sequential Logic Circuits.................... 40 6.4.1 Counters.............................. 40 6.4.2 Shift Registers........................... 42 Page iii CONTENTS 7 Von Neumann Architecture 45 7.1 Data Path and Memory Bus........................ 47 7.2 Arithmetic and Logic Unit (ALU)..................... 47 7.3 Memory................................... 48 7.3.1 Static Random Access Memory (SRAM)...........
    [Show full text]
  • CPE 323 Introduction to Embedded Computer Systems: Introduction
    CPE 323 Introduction to Embedded Computer Systems: Introduction Instructor: Dr Aleksandar Milenkovic CPE 323 Administration Syllabus textbook & other references grading policy important dates course outline Prerequisites Number representation Digital design: combinational and sequential logic Computer systems: organization Embedded Systems Laboratory Located in EB 106 EB 106 Policies Introduction sessions Lab instructor CPE 323: Introduction to Embedded Computer Systems 2 CPE 323 Administration LAB Session on-line LAB manuals and tutorials Access cards Accounts Lab Assistant: Zahra Atashi Lab sessions (select 4 from the following list) Monday 8:00 - 9:30 AM Wednesday 8:00 - 9:30 AM Wednesday 5:30 - 7:00 PM Friday 8:00 - 9:30 AM Friday 9:30 – 11:00 AM Sign-up sheet will be available in the laboratory CPE 323: Introduction to Embedded Computer Systems 3 Outline Computer Engineering: Past, Present, Future Embedded systems What are they? Where do we find them? Structure and Organization Software Architectures CPE 323: Introduction to Embedded Computer Systems 4 What Is Computer Engineering? The creative application of engineering principles and methods to the design and development of hardware and software systems Discipline that combines elements of both electrical engineering and computer science Computer engineers are electrical engineers that have additional training in the areas of software design and hardware-software integration CPE 323: Introduction to Embedded Computer Systems 5 What Do Computer Engineers Do? Computer engineers are involved in all aspects of computing Design of computing devices (both Hardware and Software) Where are computing devices? Embedded computer systems (low-end – high-end) In: cars, aircrafts, home appliances, missiles, medical devices,..
    [Show full text]
  • Implementing Von Neumann's Architecture for Machine
    IMPLEMENTING VON NEUMANN'S ARCHITECTURE FOR MACHINE SELF REPRODUCTION WITHIN THE TIERRA ARTIFICIAL LIFE PLATFORM TO INVESTIGATE EVOLVABLE GENOTYPE-PHENOTYPE MAPPINGS Declan Baugh, M.Eng. B.Sc. School of Electronic Engineering, Dublin City University Supervisor: Prof. Barry McMullin A thesis submitted for the degree of Ph.D. July 2015 Declaration I hereby certify that this material, which I now submit for assessment on the programme of study leading to the award of Ph.D., is entirely my own work, and that I have exercised reasonable care to ensure that the work is original, and does not to the best of my knowledge breach any law of copyright, and has not been taken from the work of others save and to the extent that such work has been cited and acknowledged within the text of my work. Signed: (Candidate), ID No.: 53605574, Date: i Dedication All of my work is dedicated to Trevor - who inspires, supports, and protects. ii Acknowledgements I would like to express my special appreciation and thanks to my supervisor Prof. Barry McMullin; you have been a tremendous mentor for me. I would like to thank you for directing and encouraging my research over the years. For all the long meetings delivering advice and guidance as well as the countless hours correcting and honing my academic writing skills. Your support over the past years has been invaluable and I sincerely appreciate the effort that you invested in me. I would also like to thank my colleague, Dr. Tomonori Hasegawa for his assistance and companionship over the years as we battled this Ph.D.
    [Show full text]
  • Computer Organization and Architecture
    Computer Organization and Architecture Introduction Chapters 1-2 Architecture & Organization 1 zArchitecture is those attributes visible to the programmer yInstruction set, number of bits used for data representation, I/O mechanisms, addressing techniques. ye.g. Is there a multiply instruction? zOrganization is how features are implemented, typically hidden from the programmer yControl signals, interfaces, memory technology. ye.g. Is there a hardware multiply unit or is it done by repeated addition? 1 Architecture & Organization 2 z All Intel x86 family share the same basic architecture z The IBM System/370 family share the same basic architecture z This gives code compatibility yAt least backwards yBut… increases complexity of each new generation. May be more efficient to start over with a new technology, e.g. RISC vs. CISC z Organization differs between different versions Structure & Function zComputers are complex; easier to understand if broken up into hierarchical components. At each level the designer should consider yStructure : the way in which components relate to each other yFunction : the operation of individual components as part of the structure zLet’s look at the computer top-down starting with function. 2 Function zAll computer functions are: yData processing yData storage yData movement yControl Functional view zFunctional view of a computer Data Storage Facility Data Control Movement Mechanism Apparatus Data Processing Facility 3 Operations (1) zData movement ye.g. keyboard to screen Data Storage Facility Data Control Movement Mechanism Apparatus Data Processing Facility Operations (2) zStorage ye.g. Internet download to disk Data Storage Facility Data Control Movement Mechanism Apparatus Data Processing Facility 4 Operation (3) zProcessing from/to storage ye.g.
    [Show full text]