Lecture 03 Basics Moore's Law Since 2003

Total Page:16

File Type:pdf, Size:1020Kb

Lecture 03 Basics Moore's Law Since 2003 Lecture 03 Basics Processors CS 50011: Introduction to Systems II Architecture ISA Lecture 3: Computer Architecture DMA Modes Prof. Jeff Turkstra © 2017 Dr. Jeffrey A. Turkstra 1 © 2017 Dr. Jeffrey A. Turkstra 2 Basics Some lecture material based on: Moore’s Law for integrated circuits Slides by Dr. George B. Adams III Transistor count for a typical processor Slides from Hennessy & Patterson or memory chip increases 40% to 55% per year. Slides from Silberschatz Doubles every 18-24 months Transistor count ~= computational power 1986-2003 computer performance increased ~ 50%/year © 2017 Dr. Jeffrey A. Turkstra 3 © 2017 Dr. Jeffrey A. Turkstra 4 Moore’s Law since 2003 25,000-fold hardware performance Microprocessor performance only improvement since 1985 20%/year Programs today trade execution Maximum power dissipation limits for performance for programmer air-cooled chips productivity Lack of additional instruction-level More programming is done in managed parallelism for hardware to exploit languages like Java, Python, and C# New applications have arisen: speech, sound, images, video © 2017 Dr. Jeffrey A. Turkstra 5 © 2017 Dr. Jeffrey A. Turkstra 6 Computer components Hardware Data types Transistors Representations for character, integer, Gates floating point, etc more, od, xxd Combinational and sequential circuits Sign-magnitude Adders, decoders, mux/demux, latches, flip-flops, registers 1’s complement Processors 2’s complement Memory IEEE 754 etc BCD ... © 2017 Dr. Jeffrey A. Turkstra 7 © 2017 Dr. Jeffrey A. Turkstra 8 Software Instructions for what to compute © 2017 Dr. Jeffrey A. Turkstra 9 © 2017 Dr. Jeffrey A. Turkstra 10 How a modern computer Harvard architecture works Idea by Howard Aiken, Harvard physicist, to IBM Nov. 1937 Built by IBM in Endicott, NY and delivered to Harvard in Feb. 1944 as the Mark 1 computer Has separate memories for program (instructions) and data Input/output (I/O) to connect to the world Processor to carry out the computations © 2017 Dr. Jeffrey A. Turkstra 11 © 2017 Dr. Jeffrey A. Turkstra 12 (John) Von Neumann Harvard architecture Developed during his June 1945 train ride from Philadelphia to Los Alamos, NM He had programmed the Mark 1 in August 1944 One memory for both data and program Same I/O Same processor © 2017 Dr. Jeffrey A. Turkstra 13 © 2017 Dr. Jeffrey A. Turkstra 14 von Neumann vs Harvard Von Neumann architectures von Neumann Same memory holds instructions and data Single bus between CPU and memory Flexible, more cost effective Harvard Separate memories for data and instructions Two busses Allows two simultaneous memory fetches Less flexible, memory is physically partitioned Both are stored program computer designs © 2017 Dr. Jeffrey A. Turkstra 15 © 2017 Dr. Jeffrey A. Turkstra 16 Processors Stored programs Device that performs automatic computation Some memories can be written to only Fixed logic – single operation once and then read many times Traffic signal sequencer Read-Only Memory (ROM) Selectable logic – user can select from multiple E.g., automobile engine control hardwired functions Car with Econo and Sports modes for transmission Some ROM can be re-written Parameterized logic – computes fixed function PROM, programmable ROM on variable user input EPROM, erasable programmable ROM Programmable video recorder EEPROM, electrically erasable… Programmable logic processor Embedded systems often PROM CPU, GPU, etc Firmware upgrades © 2017 Dr. Jeffrey A. Turkstra 17 © 2017 Dr. Jeffrey A. Turkstra 18 Fetch-Execute Intel Core i7 Processor At the highest level, a processor does this: repeat forever { FETCH, access the next program instruction from location where it is stored EXECUTE, perform the actions described by the instruction } © 2017 Dr. Jeffrey A. Turkstra 19 © 2017 Dr. Jeffrey A. Turkstra 20 Motherboard © 2017 Dr. Jeffrey A. Turkstra 21 © 2017 Dr. Jeffrey A. Turkstra 22 Architecture basics Architecture basics Instruction set Logic design Software instructions that the hardware Which logic circuits are used and how executes are they organized? Functional organization Implementation How is the hardware partitioned into Technologies and packaging used specialized units? © 2017 Dr. Jeffrey A. Turkstra 23 © 2017 Dr. Jeffrey A. Turkstra 24 Instruction set Hierarchical abstraction architecture Hardware and software consist of Instruction set architecture (ISA) is a layers in a hierarchy key level of abstraction To a good approximation Primary interface between hardware and Each layer hides (some of) its detail software from the layer above Set of operations that a processor Principal of Abstraction performs Highest layer interacts with outside Instruction format defines an world/end user interpretation of bit strings Similar to ASCII, 2’s complement, IEEE 754, BCD, etc © 2017 Dr. Jeffrey A. Turkstra 25 © 2017 Dr. Jeffrey A. Turkstra 26 Opcodes, operands, and results A bit string, interpreted as an instruction, specifies Operations to be performed Actual operand(s) and/or source(s) for the operand(s) and their type(s) Destination for the result(s) © 2017 Dr. Jeffrey A. Turkstra 27 © 2017 Dr. Jeffrey A. Turkstra 28 ISA Design Many tradeoffs Instruction length Number of registers Number of instructions etc © 2017 Dr. Jeffrey A. Turkstra 29 © 2017 Dr. Jeffrey A. Turkstra 30 CISC vs RISC Endianness Complex Instruction Set Computer Imagine memory is read from lowest address to highest address Reduced Instruction Set Computer Big Endian RISC won Most significant, “big,” byte comes first. Ie, placed in lowest numbered memory location. Even Intel uses RISC micro-instructions “Big” end appears first when reading memory They just have a really amazing instruction Network traffic decoder PowerPC, ARM, SPARC, MIPS Little Endian Reverse of Big Endian: least significant, “Little,” byte placed in lowest address “Little” end first © 2017 Dr. Jeffrey A. Turkstra 31 © 2017 Dr. Jeffrey A. Turkstra 32 Example and comparison Consider 0x00C0F380 = 0x 00 C0 F3 80 = 0b0000 0000 1100 0000 1111 0011 1000 0000 Most significant byte Least significant byte Byte at given location Addresses arbitrarily start Memory Little endian Big endian address at 0x00000000; 0x00000000 1000 0000 0000 0000 Locations accessed in 0x00000001 1111 0000 1100 0000 arrow-indicated 0x00000002 1100 0000 1111 0011 sequence 0x00000003 0000 0000 1000 0000 © 2017 by George B. Adams III Portions © 2017 Dr. Jeffrey A. Turkstra 33 © 2017 Dr. Jeffrey A. Turkstra 34 Memory hierarchy Registers Type of memory located inside CPU Can hold a single piece of data Data processing Control Many registers More later © 2017 Dr. Jeffrey A. Turkstra 35 © 2017 Dr. Jeffrey A. Turkstra 36 Designing an ISA © 2017 Dr. Jeffrey A. Turkstra 37 © 2017 Dr. Jeffrey A. Turkstra 38 Example Writing a program using these instructions is programming in assembly language; example Assembly instr. ; Comments load r2, 20(r1) ; r2 ← Data_Memory[20+r1] load r3, 24(r1) ; r3 ← Data_Memory[24+r1] add r4, r2, r3 ; r4 ← r2 + r3 store r4, 28(r1) ; Data_Memory[28+r1] ← r4 jump 60(r7) ; Fetch at Instr._Memory[60+r7] r1, r2, r3, r4 are registers in the data path 20, 24, 28 are decimal constants “x ←” means “x” is the location for the result Memory[x] means the contents of memory at address x + means addition, with operand typedefined by the instruction (r1 + r2 is add with different data type than 28+r3) © 2017 Dr. Jeffrey A. Turkstra 39 © 2017 Dr. Jeffrey A. Turkstra 40 Assembly and its machine ADD code ADD result data path Opcode PointerPointerPointer Unused offset, arbitrarily set all 0 41 ADD: ALU output is result, deliver to dst reg; result has the meaning “integer” because this is an integer adder © 2017 Dr. Jeffrey A. Turkstra 41 © 2017 Dr. Jeffrey A. Turkstra 4242 LOAD STORE t l u h s t e a r p D a A t a O L d STORE result data path LOAD: ALU output is pointer that must be sent to data memory, which then STORE: ALU output is pointer that must be sent to data memory along with produces copy of the location contents which, finally, must be written into dst_reg; the value from reg B to be written into the data memory location; Result is a Result is a bit string from memory: no inherent meaning at all bit string from reg_B written in memory, no inherent meaning at all © 2017 Dr. Jeffrey A. Turkstra 4343 © 2017 Dr. Jeffrey A. Turkstra 4444 Intel Core microarchitecture JUMP pipeline JUMP result data path JUMP: ALU output is computed Next_instruction_pointer, must deliver to Instruction_pointer_register; Result meaning is location of next instruction on the execution path © 2017 Dr. Jeffrey A. Turkstra 4545 © 2017 Dr. Jeffrey A. Turkstra 46 Direct memory access MMU DMA allows other hardware Responsible for “refreshing” DRAM subsystems to access main memory Translates virtual memory addresses without going through the CPU to physical addresses Modern systems usually have DMA Sometimes part of CPU controller (MMU) Sometimes not Memory address register, byte count, Northbridge for Intel until recently control, etc I7/i5 have an Integrated Memory Responsible for ensuring accesses are Controller (IMC) properly restrained Attack vector © 2017 Dr. Jeffrey A. Turkstra 47 © 2017 Dr. Jeffrey A. Turkstra 48 Page 70 Execution modes CPU hardware has several possible modes At any one time, in one mode Modes specify Privilege level Valid instructions Valid memory addresses Size of data items Backwards compatibility © 2017 Dr. Jeffrey A. Turkstra 49 © 2017 Dr. Jeffrey A. Turkstra 50 Rings Ring -1 Intel Active Management Technology Exists for other architectures as well Runs on the Intel Management Engine (ME) Isolated and protected coprocessor Embedded in all current Intel chipsets ARC core Out-of-band access Direct access to Ethernet controller Requires vPro-enabled CPU/Motherboard/Chipset © 2017 Dr. Jeffrey A. Turkstra 51 © 2017 Dr. Jeffrey A. Turkstra 52 Ring -1 Trusting trust ...if you can exploit it, you win. Reflections on Trusting Trust CVE-2017-5689 by Ken Thompson Go read about it Read this too © 2017 Dr. Jeffrey A. Turkstra 53 © 2017 Dr.
Recommended publications
  • Microprocessor Architecture
    EECE416 Microcomputer Fundamentals Microprocessor Architecture Dr. Charles Kim Howard University 1 Computer Architecture Computer System CPU (with PC, Register, SR) + Memory 2 Computer Architecture •ALU (Arithmetic Logic Unit) •Binary Full Adder 3 Microprocessor Bus 4 Architecture by CPU+MEM organization Princeton (or von Neumann) Architecture MEM contains both Instruction and Data Harvard Architecture Data MEM and Instruction MEM Higher Performance Better for DSP Higher MEM Bandwidth 5 Princeton Architecture 1.Step (A): The address for the instruction to be next executed is applied (Step (B): The controller "decodes" the instruction 3.Step (C): Following completion of the instruction, the controller provides the address, to the memory unit, at which the data result generated by the operation will be stored. 6 Harvard Architecture 7 Internal Memory (“register”) External memory access is Very slow For quicker retrieval and storage Internal registers 8 Architecture by Instructions and their Executions CISC (Complex Instruction Set Computer) Variety of instructions for complex tasks Instructions of varying length RISC (Reduced Instruction Set Computer) Fewer and simpler instructions High performance microprocessors Pipelined instruction execution (several instructions are executed in parallel) 9 CISC Architecture of prior to mid-1980’s IBM390, Motorola 680x0, Intel80x86 Basic Fetch-Execute sequence to support a large number of complex instructions Complex decoding procedures Complex control unit One instruction achieves a complex task 10
    [Show full text]
  • What Do We Mean by Architecture?
    Embedded programming: Comparing the performance and development workflows for architectures Embedded programming week FABLAB BRIGHTON 2018 What do we mean by architecture? The architecture of microprocessors and microcontrollers are classified based on the way memory is allocated (memory architecture). There are two main ways of doing this: Von Neumann architecture (also known as Princeton) Von Neumann uses a single unified cache (i.e. the same memory) for both the code (instructions) and the data itself, Under pure von Neumann architecture the CPU can be either reading an instruction or reading/writing data from/to the memory. Both cannot occur at the same time since the instructions and data use the same bus system. Harvard architecture Harvard architecture uses different memory allocations for the code (instructions) and the data, allowing it to be able to read instructions and perform data memory access simultaneously. The best performance is achieved when both instructions and data are supplied by their own caches, with no need to access external memory at all. How does this relate to microcontrollers/microprocessors? We found this page to be a good introduction to the topic of microcontrollers and ​ ​ microprocessors, the architectures they use and the difference between some of the common types. First though, it’s worth looking at the difference between a microprocessor and a microcontroller. Microprocessors (e.g. ARM) generally consist of just the Central ​ ​ Processing Unit (CPU), which performs all the instructions in a computer program, including arithmetic, logic, control and input/output operations. Microcontrollers (e.g. AVR, PIC or ​ 8051) contain one or more CPUs with RAM, ROM and programmable input/output ​ peripherals.
    [Show full text]
  • Endian: from the Ground up a Coordinated Approach
    WHITEPAPER Endian: From the Ground Up A Coordinated Approach By Kevin Johnston Senior Staff Engineer, Verilab July 2008 © 2008 Verilab, Inc. 7320 N MOPAC Expressway | Suite 203 | Austin, TX 78731-2309 | 512.372.8367 | www.verilab.com WHITEPAPER INTRODUCTION WHat DOES ENDIAN MEAN? Data in Imagine XYZ Corp finally receives first silicon for the main Endian relates the significance order of symbols to the computers chip for its new camera phone. All initial testing proceeds position order of symbols in any representation of any flawlessly until they try an image capture. The display is kind of data, if significance is position-dependent in that regularly completely garbled. representation. undergoes Of course there are many possible causes, and the debug Let’s take a specific type of data, and a specific form of dozens if not team analyzes code traces, packet traces, memory dumps. representation that possesses position-dependent signifi- There is no problem with the code. There is no problem cance: A digit sequence representing a numeric value, like hundreds of with data transport. The problem is eventually tracked “5896”. Each digit position has significance relative to all down to the data format. other digit positions. transformations The development team ran many, many pre-silicon simula- I’m using the word “digit” in the generalized sense of an between tions of the system to check datapath integrity, bandwidth, arbitrary radix, not necessarily decimal. Decimal and a few producer and error correction. The verification effort checked that all other specific radixes happen to be particularly useful for the data submitted at the camera port eventually emerged illustration simply due to their familiarity, but all of the consumer.
    [Show full text]
  • Assignment Solutions
    Week 1: Assignment Solutions 1. Which of the following statements are true? a. The ENIAC computer was built using mechanical relays. b. Harvard Mark1 computer was built using mechanical relays. c. PASCALINE computer could multiply and divide numbers by repeated addition and subtraction. d. Charles Babbage built his automatic computing engine in 19th century. Solution: ((b) and (c)) ENIAC was built using vacuum tubes. Charles Babbage designed his automatic computing engine but could not built it. 2. Which of the following statements are true for Moore’s law? a. Moore’s law predicts that power dissipation will double every 18 months. b. Moore’s law predicts that the number of transistors per chip will double every 18 months. c. Moore’s law predicts that the speed of VLSI circuits will double every 18 months. d. None of the above. Solution: (b) Moore’s law only predicts that number of transistors per chip will double every 18 months. 3. Which of the following generates the necessary signals required to execute an instruction in a computer? a. Arithmetic and Logic Unit b. Memory Unit c. Control Unit d. Input/Output Unit Solution: (c) Control unit acts as the nerve center of a computer and generates the necessary control signals required to execute an instruction. 4. An instruction ADD R1, A is stored at memory location 4004H. R1 is a processor register and A is a memory location with address 400CH. Each instruction is 32-bit long. What will be the values of PC, IR and MAR during execution of the instruction? a.
    [Show full text]
  • Introduction to Cpu
    microprocessors and microcontrollers - sadri 1 INTRODUCTION TO CPU Mohammad Sadegh Sadri Session 2 Microprocessor Course Isfahan University of Technology Sep., Oct., 2010 microprocessors and microcontrollers - sadri 2 Agenda • Review of the first session • A tour of silicon world! • Basic definition of CPU • Von Neumann Architecture • Example: Basic ARM7 Architecture • A brief detailed explanation of ARM7 Architecture • Hardvard Architecture • Example: TMS320C25 DSP microprocessors and microcontrollers - sadri 3 Agenda (2) • History of CPUs • 4004 • TMS1000 • 8080 • Z80 • Am2901 • 8051 • PIC16 microprocessors and microcontrollers - sadri 4 Von Neumann Architecture • Same Memory • Program • Data • Single Bus microprocessors and microcontrollers - sadri 5 Sample : ARM7T CPU microprocessors and microcontrollers - sadri 6 Harvard Architecture • Separate memories for program and data microprocessors and microcontrollers - sadri 7 TMS320C25 DSP microprocessors and microcontrollers - sadri 8 Silicon Market Revenue Rank Rank Country of 2009/2008 Company (million Market share 2009 2008 origin changes $ USD) Intel 11 USA 32 410 -4.0% 14.1% Corporation Samsung 22 South Korea 17 496 +3.5% 7.6% Electronics Toshiba 33Semiconduc Japan 10 319 -6.9% 4.5% tors Texas 44 USA 9 617 -12.6% 4.2% Instruments STMicroelec 55 FranceItaly 8 510 -17.6% 3.7% tronics 68Qualcomm USA 6 409 -1.1% 2.8% 79Hynix South Korea 6 246 +3.7% 2.7% 812AMD USA 5 207 -4.6% 2.3% Renesas 96 Japan 5 153 -26.6% 2.2% Technology 10 7 Sony Japan 4 468 -35.7% 1.9% microprocessors and microcontrollers
    [Show full text]
  • Real-Time Embedded Systems
    Real-Time Embedded Systems Ch3: Microprocessor Objectives Microprocessor Characteristics Von Neumann and Harvard architectures I/O Addressing Endianness Memory organization of some classic microprocessors ◦ PIC18F8720 ◦ Intel 8086 ◦ Intel Pentium ◦ Arm X. Fan: Real-Time Embedded Systems 2 Some microprocessors used in embedded systems 3 Microprocessor Architectures Von Neumann Architecture ◦ Data and instructions (executable code) are stored in the same address space. ◦ The processor interfaces to memory through a single set of address/data buses. Harvard Architecture ◦ Data and instructions (executable code) are stored in separate address spaces. ◦ There are two sets of address/data buses between processor and memory. Complex Instruction Set Computer (CISC) ◦ runs “complex instructions“ where a single instruction may execute several low-level operations Reduced Instruction Set Computer (RISC) ◦ runs compact, uniform instructions where the amount of work any single instruction accomplishes is reduced. 4 Other Stuff Processing Width ◦ Width ◦ Word I/O Addressing ◦ Special instruction vs memory mapped Reset Vector ◦ Intell – high end of address space ◦ PIC/ and Motorolla (Frescale, NXT….) low ◦ Arm definable ◦ Why put it on end or the other? X. Fan: Real-Time Embedded Systems 5 Endianness VS 6 Endianness Whether integers are presented from left to right or right to left ◦ Little-endian: the least significant byte stored at the lowest memory address ◦ Big-endian: the most significant byte stored at the lowest memory address 7 System Convention Be internally consistent ◦ Endianness ◦ Data formats ◦ Protocols ◦ Units Do conversion ASAP and ALAP ◦ On and off the wire X. Fan: Real-Time Embedded Systems 8 PIC18F8720 9 PIC18F8720: Memory organization 10 PIC18F8720: EMI Operating mode: Word Write Mode This mode allows instruction fetches and table reads from, and table writes to, all forms of 16-bit (word-wide) external memories.
    [Show full text]
  • Renesas Technology and Hitachi Announce Development of SH-2A 32-Bit RISC CPU Core for High-Performance Embedded Sysytems
    Renesas Technology and Hitachi Announce Development of SH-2A 32-Bit RISC CPU Core for High-Performance Embedded Sysytems Approximately 3.5-fold improvement in processing performance plus improved program code efficiency and real-time capability, ideal for high-performance real-time control systems for automotive, consumer, and industrial products Tokyo, April 19, 2004 Renesas Technology Corp. and Hitachi, Ltd. (TSE:6501, NYSE:HIT) today announced the development of a 32-bit RISC (Reduced Instruction Set Computer) CPU core, dubbed the SH-2A, for use in control devices in the automotive, industrial, and consumer fields. The SH-2A is the successor of the SuperH™*1 RISC microprocessor SH-2 CPU core, and offers a major increase in performance together with improved program code efficiency. The SH-2A is designed for products requiring real-time capability, and is ideal for use in single-chip microcontrollers and SoCs (Systems on Chip) in automotive engine control systems and consumer and industrial products such as printers and AC servos. < Background > Automotive engine control has become more complex with the recent emphasis on environmental conservation and improved fuel consumption, while at the same time there is a trend toward the development of large-scale systems in short time-frames through the adoption of autocode technology*2 driven by improvements in development tools. The field of industrial products such as AC servos is also witnessing advances in system precision, while more and more consumer products such as printers are appearing in the form of multi-function systems. For example, there is a demand for faster processing providing coupled printing control while executing high-speed computational operations on image data from a DSC or scanner, and control system in these fields require microcontroller that offer high speed, high performance, and large-capacity on-chip ROM.
    [Show full text]
  • Lecture #2 "Computer Systems Big Picture"
    18-600 Foundations of Computer Systems Lecture 2: “Computer Systems: The Big Picture” John P. Shen & Gregory Kesden 18-600 August 30, 2017 CS: AAP CS: APP ➢ Recommended Reference: ❖ Chapters 1 and 2 of Shen and Lipasti (SnL). ➢ Other Relevant References: ❖ “A Detailed Analysis of Contemporary ARM and x86 Architectures” by Emily Blem, Jaikrishnan Menon, and Karthikeyan Sankaralingam . (2013) ❖ “Amdahl’s and Gustafson’s Laws Revisited” by Andrzej Karbowski. (2008) 8/30/2017 (©J.P. Shen) 18-600 Lecture #2 1 18-600 Foundations of Computer Systems Lecture 2: “Computer Systems: The Big Picture” 1. Instruction Set Architecture (ISA) a. Hardware / Software Interface (HSI) b. Dynamic / Static Interface (DSI) c. Instruction Set Architecture Design & Examples 2. Historical Perspective on Computing a. Major Epochs of Modern Computers b. Computer Performance Iron Law (#1) 3. “Economics” of Computer Systems a. Amdahl’s Law and Gustafson’s Law b. Moore’s Law and Bell’s Law 8/30/2017 (©J.P. Shen) 18-600 Lecture #2 2 Anatomy of Engineering Design SPECIFICATION: Behavioral description of “What does it do?” Synthesis: Search for possible solutions; pick best one. Creative process IMPLEMENTATION: Structural description of “How is it constructed?” Analysis: Validate if the design meets the specification. “Does it do the right thing?” + “How well does it perform?” 8/30/2017 (©J.P. Shen) 18-600 Lecture #2 3 [Gerrit Blaauw & Fred Brooks, 1981] Instruction Set Processor Design ARCHITECTURE: (ISA) programmer/compiler view = SPECIFICATION • Functional programming model to application/system programmers • Opcodes, addressing modes, architected registers, IEEE floating point IMPLEMENTATION: (μarchitecture) processor designer view • Logical structure or organization that performs the ISA specification • Pipelining, functional units, caches, physical registers, buses, branch predictors REALIZATION: (Chip) chip/system designer view • Physical structure that embodies the implementation • Gates, cells, transistors, wires, dies, packaging 8/30/2017 (©J.P.
    [Show full text]
  • KVM on POWER7
    Developments in KVM on Power Paul Mackerras, IBM LTC OzLabs [email protected] Outline . Introduction . Little-endian support . OpenStack . Nested virtualization . Guest hotplug . Hardware error detection and recovery 2 21 October 2013 © 2013 IBM Introduction • We will be releasing POWER® machines with KVM – Announcement by Arvind Krishna, IBM executive • POWER8® processor disclosed at Hot Chips conference – 12 cores per chip, 8 threads per core – 96kB L1 cache, 512kB L2 cache, 8MB L3 cache per core on chip u u Guest Guest m host OS process m e e OS OS q host OS process q host OS process KVM Host Linux Kernel SAPPHIRE FSP POWER hardware 3 21 October 2013 © 2013 IBM Introduction • “Sapphire” firmware being developed for these machines – Team led by Ben Herrenschmidt – Successor to OPAL • Provides initialization and boot services for host OS – Load first-stage Linux kernel from flash – Probe the machine and set up device tree – Petitboot bootloader to load and run the host kernel (via kexec) • Provides low-level run-time services to host kernel – Communication with the service processor (FSP) • Console • Power and reboot control • Non-volatile memory • Time of day clock • Error logging facilities – Some low-level error detection and recovery services 4 21 October 2013 © 2013 IBM Little-endian Support • Modern POWER CPUs have a little-endian mode – Instructions and multi-byte data operands interpreted in little-endian byte order • Lowest-numbered byte is least significant, rather than most significant – “True” little endian, not address swizzling
    [Show full text]
  • EECS 373 Design of Microprocessor-Based Systems
    EECS 373 Design of Microprocessor-Based Systems Website: https://www.eecs.umich.edu/courses/eecs373/ Robert Dick University of Michigan Lecture 1: Introduction, ARM ISA September 6 2017 Many slides from Mark Brehob Teacher Robert Dick http://robertdick.org/ [email protected] EECS Professor Co-founder, CEO of profitable direct- to-consumer athletic wearable electronics company (Stryd) Visiting Professor at Tsinghua Univeristy Graduate studies at Princeton Visiting Researcher at NEC Labs, America, where technology went into their smartphones ~100 research papers on embedded system design Lab instructor Matthew Smith [email protected] Head lab instructor 15 years of 373 experience He has seen it before … but he’ll make you figure it out yourself TAs Took EECS 373 Jon Toto <[email protected]> Brennan Garrett <[email protected]> Thomas Deeds <[email protected]> Melinda Kothbauer <[email protected]> Course goals Embedded system design Debugging complex systems Communication and marketing A head start on a new product or research idea What is an embedded system? An (application-specific) computer within something else that is not generally regarded as a computer. Embedded, everywhere Embedded systems market Dominates general-purpose computing market in volume. Similar in monetary size to general-purpose computing market. Growing at 15% per year, 10% for general- purpose computing. Car example: half of value in embedded electronics, from zero a few decades ago. Common requirements Timely (hard real-time) Wireless Reliable
    [Show full text]
  • Anatomy of Memory Corruption Attacks and Mitigations In
    Anatomy of Memory Corruption Attacks and Mitigations in Embedded Systems Nektarios Georgios Tsoutsos, Student Member, IEEE, and Michail Maniatakos, Senior Member, IEEE Abstract—For more than two decades, memory safety viola- deploy additional protection mechanisms. In this article, we tions and control-flow integrity attacks have been a prominent review memory corruption threats to embedded processors and threat to the security of computer systems. Contrary to regular summarize contemporary mitigation techniques. systems that are updated regularly, application-constrained de- vices typically run monolithic firmware that may not be updated Memory Safety. In embedded system programming, memory in the lifetime of the device after being deployed in the field. corruption remains a major class of memory safety violations Hence, the need for protections against memory corruption that could enable attackers to subvert the legitimate program becomes even more prominent. In this article, we survey memory safety in the context of embedded processors, and describe behavior. Using programming languages that provide unlim- different attacks that can subvert the legitimate control flow, ited freedom on how to data are handled, such violations can with a special focus on Return Oriented Programming. Based be attributed to programmer errors or improper input validation on common attack trends, we formulate the anatomy of typical [1]. Consequently, memory safety violations comprise a broad memory corruption attacks and discuss powerful mitigation range of software problems that could enable unauthorized techniques that have been reported in the literature. access to sensitive information resources, escalate privileges, Index Terms—Memory safety violations, control-flow integrity modify the program control flow or allow execution of arbi- protections, buffer overflows, return oriented programming.
    [Show full text]
  • EECS 373 Design of Microprocessor-Based Systems
    EECS 373 Design of Microprocessor-Based Systems Prabal Dutta University of Michigan Lecture 2: Architecture, Assembly, and ABI January 13, 2015 Slides developed in part by Mark Brehob 1 Announcements • Website up – http://www.eecs.umich.edu/~prabal/teaching/eecs373/ • Homework 2 posted (mostly a 370 review) • Lab and office hours posted on-line. – My office hours: Thursday 3:00-4:00 pm in 4773 BBB • Projects – Start thinking about them now! 2 Today… Finish ARM assembly example from last time Walk though of the ARM ISA Software Development Tool Flow Application Binary Interface (ABI) 3 Major elements of an Instruction Set Architecture (registers, memory, word size, endianess, conditions, instructions, addressing modes) 32-bits 32-bits ! !mov!r0,!#4! ! !ldr!r1,![r0,#8]! ! !!!!!!r1=mem((r0)+8)! ! !bne!loop! ! !subs!r2,!#1! Endianess Endianess 4 The endianess religious war: 288 years and counting! • Modern version • Little-Endian – Danny Cohen – LSB is at lower address !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Memory!!!!!Value! – IEEE Computer, v14, #10 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Offset!!(LSB)!(MSB)! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!======!!===========! uint8_t!a!!=!1;!!!!!!!!!!!!!!!0x0000!!01!02!FF!00! – Published in 1981 uint8_t!b!!=!2;! uint16_t!c!=!255;!//!0x00FF! – Satire on CS religious war uint32_t!d!=!0x12345678;!!!!!!0x0004!!78!56!34!12! • Historical Inspiration • Big-Endian – Jonathan Swift – MSB is at lower address !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Memory!!!!!Value! – Gulliver's Travels !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!Offset!!(LSB)!(MSB)! !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!======!!===========!
    [Show full text]