Lab Session 02 Objective
CSC-395 Lab Manual
Computer Organization and Assembly Language

Lab Session 02

Objective:
• Introduction to Assembly Language of x86 Machines
• Learn to write an Assembly Language program using Emu8086

Theory:

Assembly Language:
An assembly language is a low-level programming language for a computer, or other programmable device, in which there is a very strong (generally one-to-one) correspondence between the language and the architecture's machine code instructions. Each assembly language is specific to a particular computer architecture, in contrast to most high-level programming languages, which are generally portable across multiple architectures but require interpreting or compiling.

Assembly language is converted into executable machine code by a utility program referred to as an assembler; the conversion process is referred to as assembly, or assembling the code.

x86 assembly language is a family of backward-compatible assembly languages, which provide some level of compatibility all the way back to the Intel 8008. x86 assembly languages are used to produce object code for the x86 class of processors. Like all assembly languages, x86 assembly uses short mnemonics to represent the fundamental instructions that the CPU in a computer can understand and follow. Compilers sometimes produce assembly code as an intermediate step when translating a high-level program into machine code. Regarded as a programming language, assembly coding is machine-specific and low level. Assembly languages are more typically used for detailed and/or time-critical applications such as small real-time embedded systems, operating system kernels, and device drivers.

How Does Assembly Language Relate to Machine Language?
Machine language is a numeric language specifically understood by a computer’s processor (the CPU). All x86 processors understand a common machine language. Assembly language consists of statements written with short mnemonics such as ADD, MOV, SUB, and CALL.
Assembly language has a one-to-one relationship with machine language: each assembly language instruction corresponds to a single machine-language instruction.

Prepared by: Engr. Aisha Danish

How Do C++ and Java Relate to Assembly Language?
High-level languages such as C++ and Java have a one-to-many relationship with assembly language and machine language. A single statement in C++ expands into multiple assembly language or machine instructions. We can show how C++ statements expand into machine code. Most people cannot read raw machine code, so we will use its closest relative, assembly language.

Is Assembly Language Portable?
A language whose source programs can be compiled and run on a wide variety of computer systems is said to be portable. A C++ program, for example, should compile and run on just about any computer, unless it makes specific references to library functions that exist under a single operating system. A major feature of the Java language is that compiled programs run on nearly any computer system.

Assembly language is not portable because it is designed for a specific processor family. There are a number of different assembly languages widely used today, each based on a processor family. Some well-known processor families are Motorola 68x00, x86, SUN Sparc, Vax, and IBM-370. The instructions in assembly language may directly match the computer’s architecture, or they may be translated during execution by a program inside the processor known as a microcode interpreter.

Mnemonics and Opcodes:
Each x86 assembly instruction is represented by a mnemonic which, often combined with one or more operands, translates to one or more bytes called an opcode. For instance, the NOP instruction translates to 0x90 and the HLT instruction translates to 0xF4.
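
The mnemonic-to-opcode mapping described above can be sketched with a very small program. This is only an illustration, assuming the Emu8086 assembler used in this lab and the org 100h layout used by the lab programs; each comment gives the opcode byte quoted in the text.

```asm
org 100h    ; assembler directive (assumed .COM layout, as in the lab programs)

nop         ; mnemonic NOP -> opcode byte 90h (does nothing)
nop         ; a second NOP -> another 90h byte
hlt         ; mnemonic HLT -> opcode byte F4h (halts the processor)
```

If this is assembled and loaded in the emulator, the instruction bytes should appear in memory as 90 90 F4 starting at offset 100h, one byte per mnemonic.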
A program written in assembly language consists of a series of (mnemonic) processor instructions and meta-statements (known variously as directives, pseudo-instructions, and pseudo-ops), comments, and data. Assembly language instructions usually consist of an opcode mnemonic followed by a list of data, arguments, or parameters.[4] These are translated by an assembler into machine language instructions that can be loaded into memory and executed.

For example, the instruction below tells an x86/IA-32 processor to move an immediate 8-bit value into a register. The binary code for this instruction is 10110 followed by a 3-bit identifier for which register to use. The identifier for the AL register is 000, so the following machine code loads the AL register with the data 01100001.[5]

    10110000 01100001

This binary computer code can be made more human-readable by expressing it in hexadecimal as follows.

    B0 61

Here, B0 means 'Move a copy of the following value into AL', and 61 is a hexadecimal representation of the value 01100001, which is 97 in decimal. Intel assembly language provides the mnemonic MOV (an abbreviation of move) for instructions such as this, so the machine code above can be written as follows in assembly language, complete with an explanatory comment if required, after the semicolon. This is much easier to read and to remember.

    MOV AL, 61h ; Load AL with 97 decimal (61 hex)

Syntax:
x86 assembly language has two main syntax branches: Intel syntax, originally used for documentation of the x86 platform, and AT&T syntax.[1] Intel syntax is dominant in the MS-DOS and Windows world, and AT&T syntax is dominant in the Unix world, since Unix was created at AT&T Bell Labs. Many x86 assemblers use Intel syntax, including MASM, TASM, NASM, FASM, and YASM.

A program consists of one statement per line. Each statement is either an instruction or an assembler directive.
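
As a small illustration of the points above, the sketch below is written in Intel syntax with one statement per line, and its comments show the bytes produced by the MOV encoding described earlier. It assumes the Emu8086 assembler and the org 100h layout used by the programs in this lab. The 3-bit identifier 011 for BL and the byte C3h for RET come from the standard x86 encoding and are not given in the text above.

```asm
org 100h        ; assembler directive: generates no machine code

mov al, 61h     ; instruction: 10110 000, then 01100001 -> B0 61
mov bl, 61h     ; instruction: 10110 011, then 01100001 -> B3 61
                ; (011 is the register identifier for BL, an added fact)
ret             ; instruction: C3, returns control to the operating system
```

Only the 3-bit register field changes between the two MOV statements, which is why their first bytes differ (B0 versus B3) while the data byte 61 stays the same.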
Statement syntax:

    Name    operation    operand(s)    comment

Name field:
• Used for instruction labels, procedure names, and variable names
• The assembler translates names into memory addresses
• Names are 1-31 characters long and may contain letters, digits, and the special characters ? . @ _ $ %
• Names may not begin with a digit
• If a period is used, it must be the first character
• Names are case insensitive

Examples of legal names
• COUNTER1
• @character
• SUM_OF_DIGITS
• $1000
• Done?
• .TEST

Examples of illegal names
• TWO WORDS (contains a space)
• 2abc (begins with a digit)
• A45.28 (the period is not the first character)

Operation field:
• Instruction: describes the operation's function, e.g. MOV, ADD, SUB, INC
• Assembler directive: is not translated into machine code; it tells the assembler to do something

Operand field:
• Specifies the data to be acted on
• There can be zero, one, or two operands

Examples
• NOP
• INC AX
• ADD AX, 2

Comment field:
• A semicolon marks the beginning of a comment
• A semicolon at the beginning of a line makes the whole line a comment
• Good programming practice dictates a comment on every line

Examples
• MOV CX, 0 ; move 0 to CX
  (Do not say something obvious)
• MOV CX, 0 ; CX counts terms, initially 0
  (Put the instruction in the context of the program)
• ; initialize registers

Applications:
Assembly language is typically used in a system's boot code, the low-level code that initializes and tests the system hardware prior to booting the operating system and is often stored in ROM (for example, the BIOS on IBM-compatible PC systems and CP/M).

Some compilers translate high-level languages into assembly first before fully compiling, allowing the assembly code to be viewed for debugging and optimization purposes.

Assembly language is valuable in reverse engineering.
Many programs are distributed only in machine code form, which is straightforward to translate into assembly language but more difficult to translate into a higher-level language.

Procedure:
Start Emu8086 by selecting its icon. Write the following code in the text editor.

Program 01:
    org 100h
    mov al, 5       ; bin = 00000101b
    mov bl, 10      ; bin = 00001010b
    ; 5 + 10 = 15 (decimal) or hex = 0Fh or bin = 00001111b
    add al, bl
    ret

Press the emulate button and single step the code. Observe the values in the registers. Note the final values of the registers in the following table.

    Register    Value
    AX
    BX
    CS
    IP

Program 02:
    org 100h
    mov al, 5       ; al = 5
    add al, -3      ; al = 2
    ret

Observe the values in the registers. Note the final values of the registers in the following table.

    Register    Value
    AX
    BX
    CS
    IP

Program 03:
    org 100h
    mov bl, 5       ; bl = 5
    add bl, -3      ; bl = 2
    ret

Observe the values in the registers. Note the final values of the registers in the following table.

    Register    Value
    AX
    BX
    CS
    IP

Program 04:
    org 100h
    mov al, 5       ; al = 5
    sub al, 1       ; al = 4
    ret

Observe the values in the registers. Note the final values of the registers in the following table.

    Register    Value
    AX
    BX
    CS
    IP

Program 05:
    org 100h
    mov al, 7       ; al = 7
    mov bl, 4       ; bl = 4
    sub al, bl      ; al = al - bl
    ret

Observe the values in the registers. Note the final values of the registers in the following table.

    Register    Value
    AX
    BX
    CS
    IP

Where (in which register) is the result of the addition stored?

Why is the answer stored in the register you mentioned above?

Exercise:
1. Write a program to subtract two integer constants using the SUB instruction.
……………………………………………………………………………………………… ……………………………………………………………………………………………… ……………………………………………………………………………………………… ……………………………………………………………………………………………… ……………………………………………………………………………………………… ………………………………………………………………………………………………