Logistics • Meeting time: 2:20pm-5:20pm, Monday • Classroom: CSIE Room 104 • Instructor: Yung-Yu Chuang Course overview • ThiTeaching assittistants: 李根逸/黃子桓 • Webpage: http://www.csie.ntu.edu.tw/~cyy/asm Comppgzuter Organization and Assembly ygg Languages id / password Yung-Yu Chuang • Forum: http://www.cmlab.csie.ntu.edu.tw/~cyy/forum/viewforum.php?f=13 2008/09/15 • Mailing list: [email protected] Please subscribe via with slides by Kip Irvine https://cmlmail.csie.ntu.edu.tw/mailman/listinfo/assembly/

Prerequisites Textbook • Better to have programming experience with • Readings and slides some high-level language such C, C ++, Java … References (TOY) References (ARM)

Princeton’s Introduction to CS, ARM Assembly Language http: //www.cs.pr ince ton.e du /in tro PiProgramming, PPteter Knaggs an d cs/50machine/ Stephen Welsh http://www.cs.princeton.edu/intro cs/60circuits/ ARM System Developer’s Guide, Andrew Sloss, Dominic Symes and Chris Wright

References (ARM) References (IA32)

Whirlwind Tour of ARM Assembly, Assembly Language for Intel-Based TONC, Jasper Vijn. CtComputers, 5th Edition, Kip Irv ine

ARM System-on-chip Architecture, Steve Furber. The Art of Assem bly Language, RdRandy Hyde References (IA32) Grading (subject to change)

Michael Abrash' s Graphics Programming • Assignments (4~5 projects, 50%) Black Boo k • Class partic ipat ion (5%) • Midterm exam (20%) • Final project (25%) – Examples from last years

CtComputer StSystems: A Programmer 's Perspective, Randal E. Bryant and David R. O'HO'Hllallaron

Computer Organization and Assembly language Early computers • It is not only about assembly and not only about “computer”. Early programming tools First popular PCs

Early PCs GUI/IDE • Intel 8086 processor • 768KB memory • 20MB disk •Dot-Matrix printer (9-pin) More advanced architectures More advanced software • Pipeline • SIMD • Multi-core •Cache

More “computers” around us My computers

Desktop (Intel Pentium D 3GHz, Nvidia 7900) VAIO TX17TP (Inte l Pent ium M 1. 1GHz )

GBA SP iPod classic Nokia 6070 (ARM7 16.78MHz) (ARM7 80MHz) (ARM7 51MHz) Computer Organization and Assembly language TOY machine • It is not only about assembly and not only about “computer”. • It will cover – Basic concept of computer systems and architecture – ARM assembly language – x86 assembly language

TOY machine TOY machine • Starting from a simple construct • Build several components and connect them together TOY machine TOY machine

• Almost as good as any computers int A[32]; ADUP32 10: C020

lda R1, 1 20: 7101 lda RA, A 21: 7A00 i=0; lda RC, 0 22: 7C00 Do { RD=stdi n; read ld RD, 0xFF 23: 8DFF if (RD==0) break; bz RD, exit 24: CD29 add R2, RA, RC 25: 12AC A[i]=RD; sti RD, R2 26: BD02 i=i+1; add RC, RC, R1 27: 1CC1 } while (1); bz R0, read 28: C023

prit()intr(); exit jl RF, prin tr 29: FF2B hlt 2A: 0000

ARM IA32 • ARM architecture • IA-32 Processor Architecture • Data Transfers, Addressing, and Arithmetic • ARM assembly programm ing • Procedures • Conditional Processing • Integer Arithmetic • Advanced Procedures • Strings and Arrays • High-Level Language Interface • Real Arithmetic (FPU) •SIMD • Code Optimization • Writing toy OS What you will learn Why taking this course? • Basic principle of computer architecture • Does anyone really program in assembly • How your computer works nowadays? • How your C programs work • Yes, at times, you do need to write assembly • Assembly basics code. • ARM assembly programming • It is foundation for computer architecture and • IA-32 assembly programming compilers. It is related to electronics, logic • Spec ific components, FPU/MMX design and operating system. • Code optimization • Interface between assembly to high-level langgguage • Toy OS writing

CSIE courses wikipedia • Hardware: electronics, digital system, • Today, assembly language is used primarily for architecture direct hardware manipulation, access to • Software: operating system, compiler specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems. Reasons for not using assembly Reasons for using assembly • Development time: it takes much longer to • Educational reasons: to understand how CPUs develop in assembly. Harder to debug, no type and compilers work. Better understanding to checking, side effects… efficiency issues of various constructs. • Ma in ta ina bility: unstruc ture d, ditdirty titric ks • Deve lop ing compilers, dbdebuggers and other • Portability: platform-dependent development tools. • Hardware drivers and system code • Embedded systems • Developing libraries. • Accessing instructions that are not available through high-level languages. • OtiiiOptimizing for spee d or space

To sum up Overview • It is all about lack of smart compilers • Virtual Machine Concept • Data Representation • Faster code, compiler is not good enough • Boolean Operations • Smaller code , compiler is not good enough, e.g. mobile devices,,, embedded devices, also Smaller code → better cache performance → faster code • Unusual architecture , there isn’t even a compiler or compiler quality is bad, eg GPU, DSP chips, even MMX. Translating Languages Virtual machines Abstractions for computers English: Display the sum of A times B plus C. High-Level Language Level 5

C++: Assembly Language Level 4

cout << (A * B+C);B + C); Operating System Level 3

Instruction Set Intel Machine Language: Architecture Level 2 Assembly Language: A1 00000000 mov eax,A Microarchitecture Level 1 F7 25 00000004 mul B add eax,C 03 05 00000008 Digital Logic Level 0 callWitItll WriteInt E8 00500000

High-Level Language Assembly Language

• Level 5 • Level 4 • Application-oriented languages • Instruction mnemonics that have a one-to-one • Programs compile into assembly language correspondence to machine language (Level 4) • Calls functions written at the operating syy()stem level (Level 3) • Programs are translated into machine language (Level 2) cout << (A * B + C); mov eax, A mul B add eax, C call WriteInt Operating System Instruction Set Architecture

• Level 3 • Level 2 • Provides services • Also known as conventional machine language •Programs translated and run at the instruction •Executed byypg Level 1 program set architecture level (Level 2) (microarchitecture, Level 1)

A1 00000000 F7 25 00000004 03 05 00000008 E8 00500000

Microarchitecture Digital Logic

• Level 1 • Level 0 • Interprets conventional machine instructions • CPU, constructed from didiigita l lilogic gates (Level 2) • System bus • Executed by digital hardware (Level 0) •Memory Data representation Binary Representations • Computer is a construction of digital circuits • Electronic Implementation with two states: on and off – Easy to store w ith bitblbistable elemen ts • You need to have the ability to translate – Reliably transmitted on noisy and inaccurate wires be tween different represen ta tions to exam ine the content of the machine 0 1 0 • Common number systems: binary, octal, 3.3V and hexadecimal 2.8V

0.5V 0.0V

Binary numbers Unsigned binary integers

• Digits are 1 and 0 • Each digit () is either 1 or 0 (a binary dig it is ca lle d a bit) • Each bit represents a power of 2: 1 1 1 1 1 1 1 1 1 = true 27 26 25 24 23 22 21 20 0 = false • MSB –most significant bit • LSB –least significant bit Every is a MSB LSB sum of powers • : 1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0 of 2 15 0

• A bit string could have different interpretations Translating Binary to Decimal Translating Unsigned Decimal to Binary

• Repeatedly divide the decimal integer by 2. Each Weighted shows how to remainder is a binary digit in the translated value: calculate the decimal value of each binary bit: n-1 n-2 1 dec = (Dn-1 × 2 ) + (Dn-2 × 2 ) + ... + (D1 × 2 ) + (D0 × 20) D = binary digit 37 = 100101

binary 00001001 = decimal 9: (1 × 23) + (1 × 20) = 9

Binary addition Integer storage sizes

• Starting with the LSB, add each pair of digits, 8 include the carry if present. Standard sizes: word 16 doubleword 32

carry: 1 quadword 64 0 0 0 0 0 1 0 0 (4)

+ 0 0 0 0 0 1 1 1 (7)

0 0 0 0 1 0 1 1 (11)

bit position: 7 6 5 4 3 2 1 0 Practice: What is the largest unsigned integer that may be stored in 20 ? bits? Large measurements Hexadecimal integers

•Kilobyte (KB),210 All values in memory are stored in binary. Because long • MbMegabyte (MB), 220 bytes binary numbers are hard to read , we use hexadecimal representation. • Gigabyte (GB), 230 bytes • Terabyte (TB), 240 bytes • Petabyte •Exabyte • ZttbtZettabyte • Yottabyte

Translating binary to hexadecimal Converting hexadecimal to decimal

• Each hexadecimal digit corresponds to 4 binary • Multiply each digit by its corresponding bits. power of 16: dec = (D × 163) + (D × 162) + (D × 161) + (D × 160) • Example: Translate the binary integer 3 2 1 0 000101101010011110010100 to hexadecimal: • Hex 1234 equa ls (1 × 163)+(2) + (2 × 162)+(3) + (3 × 161)+(4) + (4 × 160), or decimal 4,660.

• Hex 3BA4 equals (3 × 163)+(11) + (11 * 162)+(10) + (10 × 161) + (4 × 160), or decimal 15,268. Powers of 16 Converting decimal to hexadecimal

Used when calculating hexadecimal values up to 8 digits long:

decimal 422 = 1A6 hexadecimal

Hexadecimal addition Hexadecimal subtraction Divide the sum of two digits by the number base When a borrow is required from the digit to the (16). The quoti ent becomes the carry va lue, an d left , add 10h to the curren t diit'digit's va lue: the remainder is the sum digit. −1 1 1 C6 75 36 28 28 6A A2 47 42 45 58 4B 24 2E 78 6D 80 B5

Important skill: Programmers frequently add and subtract the Practice: The address of var1 is 00400020. The address of the next addresses of variables and instructions. variable after var1 is 0040006A . How many bytes are used by var1? Signed integers Two's complement notation

The highest bit indicates the . 1 = negative, Steps: 0 = pos it ive – Complement (reverse) each bit – Add 1 sign bit

1 1 1 1 0 1 1 0 Negative

0 0 0 0 1 0 1 0 Positive

If the highest digit of a hexadecmal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D Note that 00000001 + 11111111 = 00000000

Binary subtraction Ranges of signed integers • When subtracting A – B, convert B to its two's The highest bit is reserved for the sign. This limits complement the range: • Add A to (–B) 0 1 1 0 0 0 1 1 0 0 – 0 0 0 1 1 1 1 1 0 1 0 1 0 0 1 Advantages for 2’s complement: • No two 0’s • Sign bit • Remove the need for separate circuits for add and sub Character Representing Instructions •Character sets int sum(int x, int y) Alpha sum Sun sum PC sum – Stand ar d ASCII(0 – 127) { return x+y; 00 81 55 – Extended ASCII (0 – 255) } 00 C3 89 – ANSI (0 – 255) 30 E0 E5 – For this example, Alpha & –Unicode (0 – 65,535) 42 08 8B Sun use two 4-byte 01 90 45 • Null-terminated String instructions 80 02 0C FA 00 03 – Array of characters followed by a null byte • Use differing numbers of instructions in other cases 6B 09 45 • Using the ASCII table 08 – PC uses 7 instructions 89 – back inside cover of book with lengg,,ths 1, 2, and 3 EC bytes 5D • Same for NT and for Linux C3 • NT / Linux not fully binary Differen t mac hines use to ta lly different compatible instructions and encodings

Boolean algebra NOT

• Boolean expressions created from: • Inverts (reverses) a boolean value – NOT, AND, OR • Truth table for Boolean NOT operator:

Digital gate diagram for NOT:

NNOTOT AND OR

• Truth if both are true • True if either is true • Truth table for Boolean AND operator: • Truth table for Boolean OR operator:

Digital gate diagram for AND: Digital gate diagram for OR:

AND OR

Operator precedence Implementation of gates

• NOT > AND > OR • Fluid switch (http://www.cs.princeton.edu/introcs/lectures/fluid-computer.swf) • Examples showing the order of operations:

• Use parentheses to avoid ambiguity Implementation of gates Implementation of gates

Truth Tables (1 of 2) Truth Tables (2 of 2)

• A Boolean function has one or more Boolean • Example: X ∧¬Y inputs, and returns a silingle BlBoolean output. • A truth table shows all the inputs and outputs of a Boolean function

Example: ¬X ∨ Y