Lecture 03
Basics Processors
CS 50011: Introduction to Systems II Architecture ISA Lecture 3: Computer Architecture DMA Modes Prof. Jeff Turkstra
© 2017 Dr. Jeffrey A. Turkstra 1 © 2017 Dr. Jeffrey A. Turkstra 2
Basics
Some lecture material based on: Moore’s Law for integrated circuits Slides by Dr. George B. Adams III Transistor count for a typical processor Slides from Hennessy & Patterson or memory chip increases 40% to 55% per year. Slides from Silberschatz Doubles every 18-24 months Transistor count ~= computational power 1986-2003 computer performance increased ~ 50%/year
© 2017 Dr. Jeffrey A. Turkstra 3 © 2017 Dr. Jeffrey A. Turkstra 4
Moore’s Law since 2003
25,000-fold hardware performance Microprocessor performance only improvement since 1985 20%/year Programs today trade execution Maximum power dissipation limits for performance for programmer air-cooled chips productivity Lack of additional instruction-level More programming is done in managed parallelism for hardware to exploit languages like Java, Python, and C# New applications have arisen: speech, sound, images, video
© 2017 Dr. Jeffrey A. Turkstra 5 © 2017 Dr. Jeffrey A. Turkstra 6
Computer components
Hardware Data types Transistors Representations for character, integer, Gates floating point, etc more, od, xxd Combinational and sequential circuits Sign-magnitude Adders, decoders, mux/demux, latches, flip-flops, registers 1’s complement Processors 2’s complement Memory IEEE 754 etc BCD ... © 2017 Dr. Jeffrey A. Turkstra 7 © 2017 Dr. Jeffrey A. Turkstra 8
Software Instructions for what to compute
© 2017 Dr. Jeffrey A. Turkstra 9 © 2017 Dr. Jeffrey A. Turkstra 10
How a modern computer Harvard architecture works Idea by Howard Aiken, Harvard physicist, to IBM Nov. 1937 Built by IBM in Endicott, NY and delivered to Harvard in Feb. 1944 as the Mark 1 computer Has separate memories for program (instructions) and data Input/output (I/O) to connect to the world Processor to carry out the computations
© 2017 Dr. Jeffrey A. Turkstra 11 © 2017 Dr. Jeffrey A. Turkstra 12
(John) Von Neumann Harvard architecture Developed during his June 1945 train ride from Philadelphia to Los Alamos, NM He had programmed the Mark 1 in August 1944 One memory for both data and program Same I/O Same processor
© 2017 Dr. Jeffrey A. Turkstra 13 © 2017 Dr. Jeffrey A. Turkstra 14
von Neumann vs Harvard Von Neumann architectures von Neumann Same memory holds instructions and data Single bus between CPU and memory Flexible, more cost effective Harvard Separate memories for data and instructions Two busses Allows two simultaneous memory fetches Less flexible, memory is physically partitioned Both are stored program computer designs
© 2017 Dr. Jeffrey A. Turkstra 15 © 2017 Dr. Jeffrey A. Turkstra 16
Processors Stored programs
Device that performs automatic computation Some memories can be written to only Fixed logic – single operation once and then read many times Traffic signal sequencer Read-Only Memory (ROM) Selectable logic – user can select from multiple E.g., automobile engine control hardwired functions Car with Econo and Sports modes for transmission Some ROM can be re-written Parameterized logic – computes fixed function PROM, programmable ROM on variable user input EPROM, erasable programmable ROM Programmable video recorder EEPROM, electrically erasable… Programmable logic processor Embedded systems often PROM CPU, GPU, etc Firmware upgrades
© 2017 Dr. Jeffrey A. Turkstra 17 © 2017 Dr. Jeffrey A. Turkstra 18
Fetch-Execute Intel Core i7 Processor
At the highest level, a processor does this: repeat forever { FETCH, access the next program instruction from location where it is stored EXECUTE, perform the actions described by the instruction }
© 2017 Dr. Jeffrey A. Turkstra 19 © 2017 Dr. Jeffrey A. Turkstra 20
Motherboard
© 2017 Dr. Jeffrey A. Turkstra 21 © 2017 Dr. Jeffrey A. Turkstra 22
Architecture basics Architecture basics
Instruction set Logic design Software instructions that the hardware Which logic circuits are used and how executes are they organized? Functional organization Implementation How is the hardware partitioned into Technologies and packaging used specialized units?
© 2017 Dr. Jeffrey A. Turkstra 23 © 2017 Dr. Jeffrey A. Turkstra 24
Instruction set Hierarchical abstraction architecture Hardware and software consist of Instruction set architecture (ISA) is a layers in a hierarchy key level of abstraction To a good approximation Primary interface between hardware and Each layer hides (some of) its detail software from the layer above Set of operations that a processor Principal of Abstraction performs Highest layer interacts with outside Instruction format defines an world/end user interpretation of bit strings Similar to ASCII, 2’s complement, IEEE 754, BCD, etc © 2017 Dr. Jeffrey A. Turkstra 25 © 2017 Dr. Jeffrey A. Turkstra 26
Opcodes, operands, and results A bit string, interpreted as an instruction, specifies Operations to be performed Actual operand(s) and/or source(s) for the operand(s) and their type(s) Destination for the result(s)
© 2017 Dr. Jeffrey A. Turkstra 27 © 2017 Dr. Jeffrey A. Turkstra 28
ISA Design
Many tradeoffs Instruction length Number of registers Number of instructions etc
© 2017 Dr. Jeffrey A. Turkstra 29 © 2017 Dr. Jeffrey A. Turkstra 30
CISC vs RISC Endianness
Complex Instruction Set Computer Imagine memory is read from lowest address to highest address Reduced Instruction Set Computer Big Endian RISC won Most significant, “big,” byte comes first. Ie, placed in lowest numbered memory location. Even Intel uses RISC micro-instructions “Big” end appears first when reading memory They just have a really amazing instruction Network traffic decoder PowerPC, ARM, SPARC, MIPS Little Endian Reverse of Big Endian: least significant, “Little,” byte placed in lowest address “Little” end first © 2017 Dr. Jeffrey A. Turkstra 31 © 2017 Dr. Jeffrey A. Turkstra 32
Example and comparison
Consider 0x00C0F380 = 0x 00 C0 F3 80 = 0b0000 0000 1100 0000 1111 0011 1000 0000 Most significant byte Least significant byte Byte at given location Addresses arbitrarily start Memory Little endian Big endian address at 0x00000000; 0x00000000 1000 0000 0000 0000 Locations accessed in 0x00000001 1111 0000 1100 0000 arrow-indicated 0x00000002 1100 0000 1111 0011 sequence 0x00000003 0000 0000 1000 0000 © 2017 by George B. Adams III Portions © 2017 Dr. Jeffrey A. Turkstra 33 © 2017 Dr. Jeffrey A. Turkstra 34
Memory hierarchy Registers
Type of memory located inside CPU Can hold a single piece of data Data processing Control Many registers More later
© 2017 Dr. Jeffrey A. Turkstra 35 © 2017 Dr. Jeffrey A. Turkstra 36
Designing an ISA
© 2017 Dr. Jeffrey A. Turkstra 37 © 2017 Dr. Jeffrey A. Turkstra 38
Example
Writing a program using these instructions is programming in assembly language; example Assembly instr. ; Comments load r2, 20(r1) ; r2 ← Data_Memory[20+r1] load r3, 24(r1) ; r3 ← Data_Memory[24+r1] add r4, r2, r3 ; r4 ← r2 + r3 store r4, 28(r1) ; Data_Memory[28+r1] ← r4 jump 60(r7) ; Fetch at Instr._Memory[60+r7] r1, r2, r3, r4 are registers in the data path 20, 24, 28 are decimal constants “x ←” means “x” is the location for the result Memory[x] means the contents of memory at address x + means addition, with operand typedefined by the instruction (r1 + r2 is add with different data type than 28+r3)
© 2017 Dr. Jeffrey A. Turkstra 39 © 2017 Dr. Jeffrey A. Turkstra 40
Assembly and its machine ADD code
ADD result data path
Opcode PointerPointerPointer Unused offset, arbitrarily set all 0
41 ADD: ALU output is result, deliver to dst reg; result has the meaning “integer” because this is an integer adder
© 2017 Dr. Jeffrey A. Turkstra 41 © 2017 Dr. Jeffrey A. Turkstra 4242
LOAD STORE
t l u h s t e a r
p
D a A t a O L d
STORE result data path
LOAD: ALU output is pointer that must be sent to data memory, which then STORE: ALU output is pointer that must be sent to data memory along with produces copy of the location contents which, finally, must be written into dst_reg; the value from reg B to be written into the data memory location; Result is a Result is a bit string from memory: no inherent meaning at all bit string from reg_B written in memory, no inherent meaning at all
© 2017 Dr. Jeffrey A. Turkstra 4343 © 2017 Dr. Jeffrey A. Turkstra 4444
Intel Core microarchitecture JUMP pipeline
JUMP result data path
JUMP: ALU output is computed Next_instruction_pointer, must deliver to Instruction_pointer_register; Result meaning is location of next instruction on the execution path
© 2017 Dr. Jeffrey A. Turkstra 4545 © 2017 Dr. Jeffrey A. Turkstra 46
Direct memory access MMU
DMA allows other hardware Responsible for “refreshing” DRAM subsystems to access main memory Translates virtual memory addresses without going through the CPU to physical addresses Modern systems usually have DMA Sometimes part of CPU controller (MMU) Sometimes not Memory address register, byte count, Northbridge for Intel until recently control, etc I7/i5 have an Integrated Memory Responsible for ensuring accesses are Controller (IMC) properly restrained Attack vector
© 2017 Dr. Jeffrey A. Turkstra 47 © 2017 Dr. Jeffrey A. Turkstra 48
Page 70 Execution modes
CPU hardware has several possible modes At any one time, in one mode Modes specify Privilege level Valid instructions Valid memory addresses Size of data items Backwards compatibility
© 2017 Dr. Jeffrey A. Turkstra 49 © 2017 Dr. Jeffrey A. Turkstra 50
Rings Ring -1
Intel Active Management Technology Exists for other architectures as well Runs on the Intel Management Engine (ME) Isolated and protected coprocessor Embedded in all current Intel chipsets ARC core Out-of-band access Direct access to Ethernet controller Requires vPro-enabled CPU/Motherboard/Chipset
© 2017 Dr. Jeffrey A. Turkstra 51 © 2017 Dr. Jeffrey A. Turkstra 52
Ring -1 Trusting trust
...if you can exploit it, you win. Reflections on Trusting Trust CVE-2017-5689 by Ken Thompson Go read about it Read this too
© 2017 Dr. Jeffrey A. Turkstra 53 © 2017 Dr. Jeffrey A. Turkstra 54
How to change between Paging and Virtual modes Memory Automatic ...later Hardware interrupts OS-specified handlers “Manual” Initiated by software, typically OS System calls, signals, and page faults Sometimes mode can be set by application
© 2017 Dr. Jeffrey A. Turkstra 55 © 2017 Dr. Jeffrey A. Turkstra 56
Questions?
© 2017 Dr. Jeffrey A. Turkstra 57