Computer Organization (Chapter 4) Sample Problems


In the lecture, we heard about some of the status flag bits. Others not discussed in the lecture are found in the Intel x86 architecture and include the adjust flag, the trap flag, the interrupt enable flag and the direction flag. What does each of these represent?

Adjust flag – indicates a carry out of the low-order 4 bits, that is, a carry between BCD digits.
Trap flag – if set, the processor operates in single-step mode so that a debugger can watch the results of each instruction as it executes.
Interrupt enable flag – if set, the CPU is allowed to be interrupted. If not set, the CPU is handling something of a high enough priority that any interrupt will be postponed.
Direction flag – determines whether a string is processed left-to-right (from low addresses to high) or right-to-left.

A word-addressable computer has 64KB of memory and a 16-bit word size. Its memory is built out of 4Kx4 chips. The computer uses low-order interleave. Answer the following questions.
a. How many chips make up memory?
b. How many banks make up memory?
c. How many bits make up an address?
d. Of the bits from part c, which make up the bank select and which make up the address sent to the bank?

64KB of 16-bit words means 64KB / (16/8 bytes per word) = 32K words.
a. (32K x 16) / (4K x 4) = 32 chips
b. 32 chips / (16/4 chips per bank) = 8 banks
c. log2 32K = 15-bit address
d. 8 banks require 3 bits, and low-order interleave means the rightmost 3 bits select the bank, so the format is: 12 bits (address to the bank) | 3 bits (bank select)

A word-addressable computer has 1GB of memory and the word size is 64 bits. Assuming there are 80 control lines and 40 interrupt lines, how large is the system bus?

1GB / (8 bytes per word) = 128M words, so the address size is log2 128M = 27 bits and the address bus is 27 bits wide. The word size is 64 bits, so the data bus is 64 bits wide. The control and interrupt lines together give 80 + 40 = 120 lines on the control bus. The system bus size is 27 + 64 + 120 = 211 lines.

You are comparing two processors. Processor A has a system clock speed of 1.667 GHz and Processor B has a system clock speed of 2.5 GHz. Assume Processor A has an 8-stage fetch-execute cycle. Processor B has a 20-stage fetch-execute cycle. We are running the instruction Add R1, R2, #1 on both processors (this is like doing X = Y + 1 where X and Y are registers). Based on what you know, which processor will run this instruction faster? Assume the instruction was instead Load R1, X where X is a memory location. What other factors come into play that might influence which processor is faster?

Processor A will run the instruction faster. Its clock speed is only two-thirds that of Processor B, but its pipeline is only 40% the length of Processor B's. Assuming each stage takes one clock cycle, the instruction needs 8 cycles at 1.667 GHz (about 4.8 ns) on Processor A and 20 cycles at 2.5 GHz (8 ns) on Processor B. Since the instruction does not involve memory or the system bus, A should be faster. If we do the Load instruction, which involves memory and the system bus, other factors might influence the speed of execution, such as whether X is in cache memory (which is influenced by the size and location of the cache) and how wide the bus is, among other possible factors.

Provide a sample of code in which prefetching data from memory would be advantageous and then explain whether high- or low-order interleave would be preferred.

sum = a[i] + a[i+1] + a[i+2] + a[i+3];

We need to fetch a[i] and a[i+1] and add them, then we need to fetch a[i+2] and a[i+3] and add them to the previous sum and store the result. When fetching a[i], we can go ahead and start fetching a[i+1], a[i+2], and a[i+3] as well. If we use low-order interleave, these four data items will be stored in four different banks (assuming we have at least four banks and each element occupies one word), so we can access them simultaneously to provide for prefetching. Low-order interleave is therefore preferred.
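The memory and interleave arithmetic in the problems above can be checked with a short Python sketch. This is only an illustration, not part of the original answers: the function names are hypothetical, memory sizes are assumed to be powers of two, each array element is assumed to occupy one word, and the starting word address 100 for a[i] is an arbitrary choice.

```python
def memory_layout(total_bytes, word_bits, chip_words, chip_bits):
    """Return (words, chips, banks, address_bits) for a word-addressable memory.

    Assumes every size is a power of two.
    """
    words = total_bytes * 8 // word_bits
    chips = (words * word_bits) // (chip_words * chip_bits)
    chips_per_bank = word_bits // chip_bits   # chips placed side by side to form one word
    banks = chips // chips_per_bank
    address_bits = words.bit_length() - 1     # log2 of a power of two
    return words, chips, banks, address_bits


def system_bus_lines(total_bytes, word_bits, control_lines):
    """Address lines + data lines + control lines."""
    words = total_bytes * 8 // word_bits
    return (words.bit_length() - 1) + word_bits + control_lines


def bank_low_order(address, banks):
    """Low-order interleave: the rightmost bits select the bank."""
    return address & (banks - 1)


def bank_high_order(address, banks, address_bits):
    """High-order interleave: the leftmost bits select the bank."""
    bank_bits = banks.bit_length() - 1
    return address >> (address_bits - bank_bits)


K = 1024
words, chips, banks, address_bits = memory_layout(64 * K, 16, 4 * K, 4)
print(words, chips, banks, address_bits)      # 32768 32 8 15

print(system_bus_lines(2**30, 64, 80 + 40))   # 211

base = 100  # hypothetical word address of a[i]
low = [bank_low_order(base + k, banks) for k in range(4)]
high = [bank_high_order(base + k, banks, address_bits) for k in range(4)]
print(low)    # [4, 5, 6, 7]  four different banks, so the fetches can overlap
print(high)   # [0, 0, 0, 0]  one bank, so the fetches must be serialized
```

With low-order interleave the four consecutive elements fall in four distinct banks, which is what makes the prefetch worthwhile. With high-order interleave they all map to the same bank.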
