Logistics • Meeting time: 2:20pm-5:20pm, Monday • Classroom: CSIE Room 104 • Instructor: Yung-Yu Chuang Course overview • ThiTeaching assittistants: 李根逸/黃子桓 • Webpage: http://www.csie.ntu.edu.tw/~cyy/asm Comppgzuter Organization and Assembly ygg Languages id / password Yung-Yu Chuang • Forum: http://www.cmlab.csie.ntu.edu.tw/~cyy/forum/viewforum.php?f=13 2008/09/15 • Mailing list: [email protected] Please subscribe via with slides by Kip Irvine https://cmlmail.csie.ntu.edu.tw/mailman/listinfo/assembly/
Prerequisites Textbook • Better to have programming experience with • Readings and slides some high-level language such C, C ++, Java … References (TOY) References (ARM)
Princeton’s Introduction to CS, ARM Assembly Language http: //www.cs.pr ince ton.e du /in tro PiProgramming, PPteter Knaggs an d cs/50machine/ Stephen Welsh http://www.cs.princeton.edu/intro cs/60circuits/ ARM System Developer’s Guide, Andrew Sloss, Dominic Symes and Chris Wright
References (ARM) References (IA32)
Whirlwind Tour of ARM Assembly, Assembly Language for Intel-Based TONC, Jasper Vijn. CtComputers, 5th Edition, Kip Irv ine
ARM System-on-chip Architecture, Steve Furber. The Art of Assem bly Language, RdRandy Hyde References (IA32) Grading (subject to change)
Michael Abrash' s Graphics Programming • Assignments (4~5 projects, 50%) Black Boo k • Class partic ipat ion (5%) • Midterm exam (20%) • Final project (25%) – Examples from last years
CtComputer StSystems: A Programmer 's Perspective, Randal E. Bryant and David R. O'HO'Hllallaron
Computer Organization and Assembly language Early computers • It is not only about assembly and not only about “computer”. Early programming tools First popular PCs
Early PCs GUI/IDE • Intel 8086 processor • 768KB memory • 20MB disk •Dot-Matrix printer (9-pin) More advanced architectures More advanced software • Pipeline • SIMD • Multi-core •Cache
More “computers” around us My computers
Desktop (Intel Pentium D 3GHz, Nvidia 7900) VAIO TX17TP (Inte l Pent ium M 1. 1GHz )
GBA SP iPod classic Nokia 6070 (ARM7 16.78MHz) (ARM7 80MHz) (ARM7 51MHz) Computer Organization and Assembly language TOY machine • It is not only about assembly and not only about “computer”. • It will cover – Basic concept of computer systems and architecture – ARM assembly language – x86 assembly language
TOY machine TOY machine • Starting from a simple construct • Build several components and connect them together TOY machine TOY machine
• Almost as good as any computers int A[32]; ADUP32 10: C020
lda R1, 1 20: 7101 lda RA, A 21: 7A00 i=0; lda RC, 0 22: 7C00 Do { RD=stdi n; read ld RD, 0xFF 23: 8DFF if (RD==0) break; bz RD, exit 24: CD29 add R2, RA, RC 25: 12AC A[i]=RD; sti RD, R2 26: BD02 i=i+1; add RC, RC, R1 27: 1CC1 } while (1); bz R0, read 28: C023
prit()intr(); exit jl RF, prin tr 29: FF2B hlt 2A: 0000
ARM IA32 • ARM architecture • IA-32 Processor Architecture • Data Transfers, Addressing, and Arithmetic • ARM assembly programm ing • Procedures • Conditional Processing • Integer Arithmetic • Advanced Procedures • Strings and Arrays • High-Level Language Interface • Real Arithmetic (FPU) •SIMD • Code Optimization • Writing toy OS What you will learn Why taking this course? • Basic principle of computer architecture • Does anyone really program in assembly • How your computer works nowadays? • How your C programs work • Yes, at times, you do need to write assembly • Assembly basics code. • ARM assembly programming • It is foundation for computer architecture and • IA-32 assembly programming compilers. It is related to electronics, logic • Spec ific components, FPU/MMX design and operating system. • Code optimization • Interface between assembly to high-level langgguage • Toy OS writing
CSIE courses wikipedia • Hardware: electronics, digital system, • Today, assembly language is used primarily for architecture direct hardware manipulation, access to • Software: operating system, compiler specialized processor instructions, or to address critical performance issues. Typical uses are device drivers, low-level embedded systems, and real-time systems. Reasons for not using assembly Reasons for using assembly • Development time: it takes much longer to • Educational reasons: to understand how CPUs develop in assembly. Harder to debug, no type and compilers work. Better understanding to checking, side effects… efficiency issues of various constructs. • Ma in ta ina bility: unstruc ture d, ditdirty titric ks • Deve lop ing compilers, dbdebuggers and other • Portability: platform-dependent development tools. • Hardware drivers and system code • Embedded systems • Developing libraries. • Accessing instructions that are not available through high-level languages. • OtiiiOptimizing for spee d or space
To sum up Overview • It is all about lack of smart compilers • Virtual Machine Concept • Data Representation • Faster code, compiler is not good enough • Boolean Operations • Smaller code , compiler is not good enough, e.g. mobile devices,,, embedded devices, also Smaller code → better cache performance → faster code • Unusual architecture , there isn’t even a compiler or compiler quality is bad, eg GPU, DSP chips, even MMX. Translating Languages Virtual machines Abstractions for computers English: Display the sum of A times B plus C. High-Level Language Level 5
C++: Assembly Language Level 4
cout << (A * B+C);B + C); Operating System Level 3
Instruction Set Intel Machine Language: Architecture Level 2 Assembly Language: A1 00000000 mov eax,A Microarchitecture Level 1 F7 25 00000004 mul B add eax,C 03 05 00000008 Digital Logic Level 0 callWitItll WriteInt E8 00500000
High-Level Language Assembly Language
• Level 5 • Level 4 • Application-oriented languages • Instruction mnemonics that have a one-to-one • Programs compile into assembly language correspondence to machine language (Level 4) • Calls functions written at the operating syy()stem level (Level 3) • Programs are translated into machine language (Level 2) cout << (A * B + C); mov eax, A mul B add eax, C call WriteInt Operating System Instruction Set Architecture
• Level 3 • Level 2 • Provides services • Also known as conventional machine language •Programs translated and run at the instruction •Executed byypg Level 1 program set architecture level (Level 2) (microarchitecture, Level 1)
A1 00000000 F7 25 00000004 03 05 00000008 E8 00500000
Microarchitecture Digital Logic
• Level 1 • Level 0 • Interprets conventional machine instructions • CPU, constructed from didiigita l lilogic gates (Level 2) • System bus • Executed by digital hardware (Level 0) •Memory Data representation Binary Representations • Computer is a construction of digital circuits • Electronic Implementation with two states: on and off – Easy to store w ith bitblbistable elemen ts • You need to have the ability to translate – Reliably transmitted on noisy and inaccurate wires be tween different represen ta tions to exam ine the content of the machine 0 1 0 • Common number systems: binary, octal, 3.3V decimal and hexadecimal 2.8V
0.5V 0.0V
Binary numbers Unsigned binary integers
• Digits are 1 and 0 • Each digit (bit) is either 1 or 0 (a binary dig it is ca lle d a bit) • Each bit represents a power of 2: 1 1 1 1 1 1 1 1 1 = true 27 26 25 24 23 22 21 20 0 = false • MSB –most significant bit • LSB –least significant bit Every binary number is a MSB LSB sum of powers • Bit numbering: 1 0 1 1 0 0 1 0 1 0 0 1 1 1 0 0 of 2 15 0
• A bit string could have different interpretations Translating Binary to Decimal Translating Unsigned Decimal to Binary
• Repeatedly divide the decimal integer by 2. Each Weighted positional notation shows how to remainder is a binary digit in the translated value: calculate the decimal value of each binary bit: n-1 n-2 1 dec = (Dn-1 × 2 ) + (Dn-2 × 2 ) + ... + (D1 × 2 ) + (D0 × 20) D = binary digit 37 = 100101
binary 00001001 = decimal 9: (1 × 23) + (1 × 20) = 9
Binary addition Integer storage sizes
• Starting with the LSB, add each pair of digits, byte 8 include the carry if present. Standard sizes: word 16 doubleword 32
carry: 1 quadword 64 0 0 0 0 0 1 0 0 (4)
+ 0 0 0 0 0 1 1 1 (7)
0 0 0 0 1 0 1 1 (11)
bit position: 7 6 5 4 3 2 1 0 Practice: What is the largest unsigned integer that may be stored in 20 bits? bits? Large measurements Hexadecimal integers
•Kilobyte (KB),210 bytes All values in memory are stored in binary. Because long • MbMegabyte (MB), 220 bytes binary numbers are hard to read , we use hexadecimal representation. • Gigabyte (GB), 230 bytes • Terabyte (TB), 240 bytes • Petabyte •Exabyte • ZttbtZettabyte • Yottabyte
Translating binary to hexadecimal Converting hexadecimal to decimal
• Each hexadecimal digit corresponds to 4 binary • Multiply each digit by its corresponding bits. power of 16: dec = (D × 163) + (D × 162) + (D × 161) + (D × 160) • Example: Translate the binary integer 3 2 1 0 000101101010011110010100 to hexadecimal: • Hex 1234 equa ls (1 × 163)+(2) + (2 × 162)+(3) + (3 × 161)+(4) + (4 × 160), or decimal 4,660.
• Hex 3BA4 equals (3 × 163)+(11) + (11 * 162)+(10) + (10 × 161) + (4 × 160), or decimal 15,268. Powers of 16 Converting decimal to hexadecimal
Used when calculating hexadecimal values up to 8 digits long:
decimal 422 = 1A6 hexadecimal
Hexadecimal addition Hexadecimal subtraction Divide the sum of two digits by the number base When a borrow is required from the digit to the (16). The quoti ent becomes the carry va lue, an d left , add 10h to the curren t diit'digit's va lue: the remainder is the sum digit. −1 1 1 C6 75 36 28 28 6A A2 47 42 45 58 4B 24 2E 78 6D 80 B5
Important skill: Programmers frequently add and subtract the Practice: The address of var1 is 00400020. The address of the next addresses of variables and instructions. variable after var1 is 0040006A . How many bytes are used by var1? Signed integers Two's complement notation
The highest bit indicates the sign. 1 = negative, Steps: 0 = pos it ive – Complement (reverse) each bit – Add 1 sign bit
1 1 1 1 0 1 1 0 Negative
0 0 0 0 1 0 1 0 Positive
If the highest digit of a hexadecmal integer is > 7, the value is negative. Examples: 8A, C5, A2, 9D Note that 00000001 + 11111111 = 00000000
Binary subtraction Ranges of signed integers • When subtracting A – B, convert B to its two's The highest bit is reserved for the sign. This limits complement the range: • Add A to (–B) 0 1 1 0 0 0 1 1 0 0 – 0 0 0 1 1 1 1 1 0 1 0 1 0 0 1 Advantages for 2’s complement: • No two 0’s • Sign bit • Remove the need for separate circuits for add and sub Character Representing Instructions •Character sets int sum(int x, int y) Alpha sum Sun sum PC sum – Stand ar d ASCII(0 – 127) { return x+y; 00 81 55 – Extended ASCII (0 – 255) } 00 C3 89 – ANSI (0 – 255) 30 E0 E5 – For this example, Alpha & –Unicode (0 – 65,535) 42 08 8B Sun use two 4-byte 01 90 45 • Null-terminated String instructions 80 02 0C FA 00 03 – Array of characters followed by a null byte • Use differing numbers of instructions in other cases 6B 09 45 • Using the ASCII table 08 – PC uses 7 instructions 89 – back inside cover of book with lengg,,ths 1, 2, and 3 EC bytes 5D • Same for NT and for Linux C3 • NT / Linux not fully binary Differen t mac hines use to ta lly different compatible instructions and encodings
Boolean algebra NOT
• Boolean expressions created from: • Inverts (reverses) a boolean value – NOT, AND, OR • Truth table for Boolean NOT operator:
Digital gate diagram for NOT:
NNOTOT AND OR
• Truth if both are true • True if either is true • Truth table for Boolean AND operator: • Truth table for Boolean OR operator:
Digital gate diagram for AND: Digital gate diagram for OR:
AND OR
Operator precedence Implementation of gates
• NOT > AND > OR • Fluid switch (http://www.cs.princeton.edu/introcs/lectures/fluid-computer.swf) • Examples showing the order of operations:
• Use parentheses to avoid ambiguity Implementation of gates Implementation of gates
Truth Tables (1 of 2) Truth Tables (2 of 2)
• A Boolean function has one or more Boolean • Example: X ∧¬Y inputs, and returns a silingle BlBoolean output. • A truth table shows all the inputs and outputs of a Boolean function
Example: ¬X ∨ Y