Instruction Processing Cycle

Lecture 2 Expectations Survey: Results

• Greatest hopes for the class?
• Greatest fear about the class?
• Would you take it even if it were not required?

Our Virtual Machine

• A simple, fictitious architecture
  • Memory
  • CPU ()
  • Registers (PC, CCRs, general purpose)

Computer Science and Engineering, The Ohio State University

• Organized as a sequence of “cells”
• The smallest addressable unit
• N cells, addressed 0..N-1
• Each cell has a “width” (k bits)
• Typically k = 8 (a byte)
• Other widths are possible
• Adjacent cells can be combined to form a single quantity (a word)

[Figure: memory drawn as a column of cells, each k bits wide, addressed 0, 1, 2, …, N-2, N-1]

Questions

• How many different values can a cell have?
  • A function of ___
• How many bits are needed to represent an address?
  • A function of ___

Instruction Cycle

Repeat:
1. Fetch - get the instruction
2. Decode - figure out what it means
3. Evaluate Addresses - calculate addresses of operands
4. Fetch Operands - get the operands
5. Execute - do it!
6. Store Result - update memory, CCRs

• Instruction processing cycle, IPC, fetch-decode-execute cycle

Fetch

• First, copy the contents of the memory location indicated by the PC to the CPU
• Then increment the PC
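The two fetch steps above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the slides: the `Machine` class, the `ir` (instruction register) field, the 16-cell memory, and the idea that every instruction occupies exactly one cell are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical machine state; names and sizes are illustrative only."""
    memory: list = field(default_factory=lambda: [0] * 16)
    pc: int = 0  # program counter
    ir: int = 0  # assumed name for where the CPU holds the fetched instruction


def fetch(m: Machine) -> int:
    """Fetch phase: copy memory[PC] into the CPU, then increment the PC."""
    m.ir = m.memory[m.pc]  # step 1: copy the contents of the cell at address PC
    m.pc += 1              # step 2: assumes one instruction per cell
    return m.ir


m = Machine()
m.memory[3] = 0x2A  # pretend 0x2A encodes some instruction
m.pc = 3
instruction = fetch(m)
print(hex(instruction), m.pc)  # 0x2a 4
```

Once `fetch` returns, the PC already refers to the cell after the instruction that was just fetched.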

[Figure: registers labeled R0, R1, R2, R7]

Incrementing the PC

• Note that the increment is part of the Fetch phase
• Therefore, if a subsequent phase uses the PC, its value is the address of the next instruction
• E.g., “branch to subroutine” instruction
  • Execute phase: store PC value, then change PC to be the address of subroutine
  • Next fetch phase will get first instruction of subroutine
  • The stored value is the address of the instruction after the branch

The IPC is Fast

• Clock ticks control this cycle
  • Intel Core i7: 3.4 GHz (3.4 billion ticks/s)
  • Light bulb: 60 Hz (120 flickers/s)
  • Every flicker: about 28 million clock ticks!
• Not every phase is needed every time
  • Basic phases: fetch, decode, execute
• Some phases can be combined
  • E.g., operands can be fetched from registers in the same tick as instruction decoding

Reduced Instruction Set (RISC)

Repeat:
1. Instruction Fetch (IF)
2. Instruction Decode (ID)
3. Execute (EX)
4. Memory Access (MEM)
5. Writeback (WB)

[Slide annotation beside stages 3 and 4: evaluate addresses, fetch operands, execute]

Speeding Things Up: Pipelining

• Metaphor: Washer, dryer, folder
• Assume
  • Each phase takes 30 min
  • There are 4 loads to do
• Without pipeline: 30 min x 3 x 4 = 6 hrs
• With a pipeline?
• Abstraction: Client doesn’t care how laundry service completes the 4 loads

Pipelining: Diagram

• How much faster is the pipeline?
  • For 1 load: no improvement (1.5 hrs)
  • For 2 loads: 2 hrs vs 3 hrs
  • For 3 loads: 2.5 hrs vs 4.5 hrs
• As the number of loads goes to infinity, what is the improvement?
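The laundry numbers above can be reproduced with a short Python sketch. It assumes an idealized pipeline that is not spelled out on the slides: every stage takes the same time, and a new load can enter as soon as the first stage is free. The function names are hypothetical.

```python
def sequential_minutes(loads, stages=3, stage_minutes=30):
    """Each load goes through every stage before the next load starts."""
    return loads * stages * stage_minutes


def pipelined_minutes(loads, stages=3, stage_minutes=30):
    """The first load needs `stages` steps; each later load finishes one step after it."""
    if loads == 0:
        return 0
    return (stages + loads - 1) * stage_minutes


for n in (1, 2, 3, 4, 1000):
    seq = sequential_minutes(n)
    pipe = pipelined_minutes(n)
    # Columns: loads, sequential hours, pipelined hours, speedup ratio
    print(n, seq / 60, pipe / 60, round(seq / pipe, 2))
```

For 1, 2, and 3 loads this gives the same hours as the bullets above, and printing larger values of `n` shows how the ratio behaves as the number of loads grows.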

• What determines the asymptotic speedup of a pipeline?

Pipelining: RISC architecture

• Unlike loads of laundry, instructions are not independent
• One instruction may change the effect of the very next instruction
• Examples:
  • Write to a register read by next instruction
  • Change the PC
• Solution:
  • Stall the next instruction until it is safe
  • Creates a “bubble” (an idle cycle) that moves through the pipeline

Speeding Things Up: Concurrency

• When things are independent, they can be done concurrently
  • Quad-core laundry service: 4 washers, dryers, and folders
  • Each core is still pipelined
• Challenges
  • Identifying things that are independent
  • Synchronizing things that are not
• See CSE 2431 (Systems II)

Microarchitecture (μarch)

• Low-level structure: , registers, data path, pipeline,

Instruction Set Architecture: ISA

• Higher-level description: machine instruction set, programmer-accessible registers, memory addressing modes
• Recall CSE 2421 (Systems I)

• Goal: Functional correctness at the instruction set architecture (ISA) level
• You do NOT need to account for other aspects of the microarchitecture
  • No pipelining
  • No concurrency
  • No cache

Overview of Labs

Assembly language (e.g. LOAD r1,Total)
  -> Assembler ->
Object file (e.g. ...T003F2200...)
  -> Linking Loader ->
Linked machine code (e.g. ...0010001000110010...)
  -> Simulator ->
Executing program

Lab 1 Requirements

• Given an object file (a text file)
  • Contents of file denote initialization for a chunk of memory
  • Format: header record, sequence of text records, end record
• Develop a simulator that allows us to:
  • Initialize the machine
  • Load the object file into memory
  • Simulate execution of the loaded program
• Simulation has 3 modes
  • Quiet: no output (except for that of the machine itself)
  • Trace: output machine state before and after execution, as well as the effect of each instruction (memory and registers)
  • Step: same as trace mode, but pause after each instruction
• Optional: extra functionality for usability
  • View state of machine, modify registers and memory, etc.
• Robustness: Test it thoroughly!

Lab 1 Milestones

• Preliminary Documentation: Sept 7
  • Programmer's Guide is particularly important
  • We will look at and comment on everything you turn in
• Mandatory Design Review: Sept 10, 11
  • Sign up for a 30 minute slot
  • Everyone in group must attend
• Completed Documentation: Sept 24
• Interactive Grading: Sept 24, 25
  • Sign up for a 60 minute slot
  • Everyone in group must attend

Group Formation

• Form groups of 4
• Exchange contact information
• Set regular in-person meeting times
• Note: Lab 0 is due soon!

Summary

• Instruction processing cycle
  • PC is incremented during fetch phase
• Pipelining
  • Overlap parts of execution to increase throughput
• Microarchitecture vs ISA
  • Abstraction
• Lab overview
• Group formation