The RISC-V Base ISA and Standard Extensions


8th RISC-V Workshop, Barcelona, May 7, 2018
Andrew Waterman, SiFive
COPYRIGHT 2018 SIFIVE. ALL RIGHTS RESERVED.

Why Instruction Set Architecture matters
• Why can’t Intel sell mobile chips?
  – 99%+ of mobile phones/tablets are based on ARM’s v7/v8 ISA
• Why can’t ARM partners sell servers?
  – 99%+ of laptops/desktops/servers are based on the AMD64 ISA (95%+ built by Intel)
• How can IBM still sell mainframes?
  – IBM 360 is the oldest surviving ISA (50+ years)
ISA is the most important interface in a computer system
ISA is where software meets hardware

Open Software/Standards Work!
Field        Standard           Free, Open Impl.     Proprietary Impl.
Networking   Ethernet, TCP/IP   Many                 Many
OS           POSIX              Linux, FreeBSD       M/S Windows
Compilers    C                  gcc, LLVM            Intel icc, ARMcc
Databases    SQL                MySQL, PostgreSQL    Oracle 12C, IBM DB2
Graphics     OpenGL             Mesa3D               M/S DirectX
ISA          ??????             -----------          x86, ARM, IBM360
• Why not have successful free & open standards and free & open implementations, like other fields?
• Dominant proprietary ISAs are not great designs

What is RISC-V?
• A high-quality, license-free, royalty-free RISC ISA specification originally designed at UC Berkeley
• Standard maintained by the non-profit RISC-V Foundation
• Suitable for all types of computing system, from microcontrollers to supercomputers
• Numerous proprietary and open-source cores
• Experiencing rapid uptake in industry and academia
• Supported by a growing shared software ecosystem
• A work in progress…

RISC-V Origins
• In 2010, after many years and many projects using MIPS, SPARC, and x86 as the bases of research at Berkeley, it was time to choose an ISA for the next set of projects
• Obvious choices: x86 and ARM

Intel x86 “AAA” Instruction
• ASCII Adjust After Addition
• AL register is default source and destination
• If the low nibble is > 9 decimal, or the auxiliary carry flag AF = 1, then
  – Add 6 to low nibble of AL and discard overflow
  – Increment AH (the high byte of AX)
  – Set CF and AF
• Else
  – CF = AF = 0
• Single byte instruction

ARM v7 LDMIAEQ Instruction
    LDMIAEQ SP!, {R4-R7, PC}
• LoaD Multiple, Increment-Address
• Writes to 6 registers from 5 loads (R4-R7 and PC are loaded; SP is written back)
• Only executes if EQ condition code is set
• Writes to the PC (a conditional branch)
• Can change instruction sets
• Idiom for "stack pop and return from a function call"

RISC-V Origin Story
• x86 impossible – IP issues, too complex
• ARM mostly impossible – no 64-bit, IP issues, complex
• So we started a “3-month project” in summer 2010 to develop our own clean-slate ISA
  – Principal designers: Andrew Waterman, Yunsup Lee, Dave Patterson, Krste Asanovic
• Four years later, we released the frozen base user spec
  – First public specification released in May 2011
  – Several publications, many tapeouts, lots of software along the way

Our Goals for a Universal ISA
• Works well with existing software stacks, languages
• Is native hardware ISA, not virtual machine/ANDF
• Suits all sizes of processor, from smallest microcontroller to largest supercomputer
• Suits all implementation technologies: FPGA, ASIC, full-custom, ??
• Suits all microarchitectural styles: microcoded, in-order, decoupled, out-of-order, single-issue, superscalar, …
• Supports extensive customization to act as base for customized accelerators
• Stable: not changing, not disappearing

The RV32I Base Instruction Set Architecture

RV32I Architectural State
• 31 general-purpose registers, x1-x31 (x0 is hardwired to 0), plus pc
• Registers are 32 bits wide => 1024 bits of architectural state

RV32I Instruction Formats
• Instructions all 32 bits wide; must be 32-bit aligned in memory
• 4 base formats (R/I/S/U) + 2 immediate-encoding variants (B/J)
• Register specifiers (rs2/rs1/rd) always in same place
[Figure: bit layouts of the R, I, S, B, U, and J formats]

RV32I Arithmetic and Logical Operations
• Most arithmetic instructions use R-type format
• Perform the computation rs1 OP rs2 and write result to rd
• There are 10 of them:
  – Arithmetic: ADD, SUB
  – Bitwise: AND, OR, XOR
  – Shifts: SLL, SRL, SRA
  – Comparisons: SLT (rd = rs1 < rs2 ? 1 : 0), SLTU (same, but unsigned)

RV32I Arithmetic and Logical Operations
• Also have register-immediate forms using I-type format
• Perform the computation rs1 OP imm and write result to rd
• Same ones as R-type, but no need for SUB
  – Arithmetic: ADDI
  – Bitwise: ANDI, ORI, XORI
  – Shifts: SLLI, SRLI, SRAI
  – Comparisons: SLTI, SLTIU
• Immediate is always sign-extended (even when it represents an unsigned quantity)

RV32I Memory Access Instructions
• Loads also use I-type format
• Compute address rs1 + imm, read from memory, write result to rd
• LB, LH, LW: load byte (8b), load halfword (16b), load word (32b)
• LB and LH sign-extend the quantity before writing rd
• LBU, LHU (load byte unsigned, load halfword unsigned) zero-extend

RV32I Memory Access Instructions
• Stores use S-type format
• Compute address rs1 + imm; write contents of rs2 to memory
• SW stores all 32 bits of rs2; SB and SH store lower 8b/16b
• Note, imm[4:0] moves to bits 11:7 to accommodate rs2
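
The I-type and S-type rules above (register specifiers in fixed positions, a sign-extended 12-bit immediate, and imm[4:0] moving to bits 11:7 for stores) can be sketched as a small decoder. This is only an illustration: the function names and the dictionary layout are made up for this example, and the bit positions are the standard RV32I field positions (opcode in bits 6:0, rd in 11:7, funct3 in 14:12, rs1 in 19:15, rs2 in 24:20).

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def decode_i_type(inst):
    """Split a 32-bit I-type instruction word into its fields (hypothetical helper)."""
    return {
        "opcode": inst & 0x7F,
        "rd": (inst >> 7) & 0x1F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        # imm[11:0] sits in bits 31:20 and is always sign-extended
        "imm": sign_extend((inst >> 20) & 0xFFF, 12),
    }


def decode_s_type(inst):
    """Split a 32-bit S-type instruction word into its fields (hypothetical helper)."""
    # imm[11:5] is in bits 31:25; imm[4:0] is in bits 11:7, where rd would be,
    # so that rs2 can stay in bits 24:20
    imm = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
    return {
        "opcode": inst & 0x7F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        "rs2": (inst >> 20) & 0x1F,
        "imm": sign_extend(imm, 12),
    }


# 0xFFF10093 encodes addi x1, x2, -1; 0xFE512E23 encodes sw x5, -4(x2)
addi_fields = decode_i_type(0xFFF10093)
sw_fields = decode_s_type(0xFE512E23)
```

Note that rs1 and funct3 are extracted with identical shifts in both decoders; only the immediate handling differs between the two formats.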

RV32I Addressing
• Two U-type instructions:
• LUI (Load Upper Immediate) supports global addressing
    lui t0, 0x12345     # t0 = 0x12345000
    lw t0, 0x678(t0)    # t0 = Mem[0x12345678]
• Also use LUI + ADDI to generate any 32-bit constant
• AUIPC (Add Upper Immediate to PC) supports PC-relative addressing
    auipc t0, 0x12345   # t0 = pc + 0x12345000
    lw t0, 0x678(t0)    # t0 = Mem[pc + 0x12345678]

RV32I Conditional Branches
• Branches use B-type format
• Same as S-type format, except immediate scaled by 2 (can only branch to even-numbered addresses)
  – Supports ISA extensions with instruction lengths in multiples of 2 bytes
• Compare rs1 and rs2; if true, set pc := pc + imm; else fall through
• Equality (BEQ/BNE), magnitude (BLT/BGE/BLTU/BGEU)

RV32I Unconditional Jumps
• JAL (Jump and Link) is the only J-type instruction
• Sets rd := pc + 4; then sets pc := pc + imm (±1 MiB range)
  – When rd=x0, it’s just a jump; when rd=x1, it’s a function call
• JALR (Jump and Link Register) uses I-type format
• Sets rd := pc + 4; sets pc := rs1 + imm
• Use for returns (rd=x0, rs1=x1), indirect calls (rd=x1), table jumps
• Use with LUI/AUIPC to call any 32-bit address

The Rest of RV32I
• FENCE for memory ordering (see Memory Model talk later today)
• FENCE.I for self-modifying code
• ECALL for system calls
• EBREAK for breakpoints
• CSRRx to access control and status registers (see Priv Arch talk)

RISC-V Specifies Three Base ISAs Besides RV32I
• RV32E (E=Embedded):
  – Same as RV32I, but only registers x0-x15 are present. Accessing x16-x31 causes an illegal-instruction trap.
  – Removes 512 bits of state; helps implementations sensitive to regfile cost
• RV64I: 64-bit address variant
  – Expands the x-registers and the pc to 64 bits
  – Adds new load and store instructions: LWU, LD, SD
  – Existing arithmetic instructions now operate on full 64-bit registers
  – New arithmetic instructions that operate on lower 32 bits of registers and produce a 32-bit result sign-extended to 64 bits: ADDW, SUBW, SLLW, SRLW, SRAW, ADDIW, SLLIW, SRLIW, SRAIW
• RV128I: 128-bit address variant; follows same pattern as RV64I

The M Standard Extension for Integer Multiplication and Division

The M Extension for Integer Multiply/Divide
• Multiply/divide not part of base ISA
  – Not always needed; sometimes too costly
• M extension adds these features with 8 R-type instructions
Instruction              Meaning
mul rd, rs1, rs2         rd = rs1 × rs2
mulh rd, rs1, rs2        rd = (sext(rs1) × sext(rs2)) >> XLEN
mulhu rd, rs1, rs2       rd = (zext(rs1) × zext(rs2)) >> XLEN
mulhsu rd, rs1, rs2      rd = (sext(rs1) × zext(rs2)) >> XLEN
div rd, rs1, rs2         rd = rs1 ÷ rs2
rem rd, rs1, rs2         rd = rs1 % rs2
divu rd, rs1, rs2        rd = rs1 ÷ rs2 (unsigned)
remu rd, rs1, rs2        rd = rs1 % rs2 (unsigned)

The A Standard Extension for Atomic Memory Operations

The A Extension for Atomic Memory Operations
• Loads/stores can’t scalably support multiprocessor synchronization
• A extension provides synchronization primitives in two forms:
  – Load-reserved/Store-conditional
  – Atomic fetch-and-ϕ
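
As a rough software model of the second form, atomic fetch-and-ϕ: one indivisible step reads the old memory value, writes ϕ(old, operand) back, and returns the old value. The sketch below assumes that reading of "fetch-and-ϕ"; the class, the method names, and the use of a lock to stand in for hardware atomicity are all illustrative assumptions, not anything defined by the slides.

```python
import threading


class ToyMemory:
    """Toy word memory; a lock stands in for the atomicity the hardware provides."""

    def __init__(self):
        self._words = {}
        self._lock = threading.Lock()

    def load(self, addr):
        with self._lock:
            return self._words.get(addr, 0)

    def fetch_and_phi(self, addr, operand, phi):
        """Atomically set M[addr] = phi(M[addr], operand) and return the old value."""
        with self._lock:
            old = self._words.get(addr, 0)
            self._words[addr] = phi(old, operand) & 0xFFFFFFFF  # keep 32-bit words
            return old


def run_counter_demo(num_threads=4, increments=1000):
    """Several threads bump one shared counter using fetch-and-add."""
    mem = ToyMemory()

    def worker():
        for _ in range(increments):
            mem.fetch_and_phi(0x100, 1, lambda old, x: old + x)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mem.load(0x100)
```

Because each read-modify-write happens inside one indivisible step, no increment is lost: `run_counter_demo(4, 1000)` yields 4000. The same pattern built from separate plain loads and stores could drop updates.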

Load-reserved, Store-conditional
• Splits an atomic read-modify-write operation into three phases:
  – Load data, and acquire reservation on the address
  – Compute new value
  – Store new value, only if reservation still held
• Store may fail, so sequence needs to be retried
  – Writes rd with zero on success or nonzero on failure
• Forward progress guaranteed for certain restricted sequences
Instruction             Meaning
lr.w rd, (rs1)          rd = M[rs1]; reserve M[rs1]
sc.w rd, rs2, (rs1)     if still reserved: M[rs1] = rs2; rd = 0
                        otherwise: rd = nonzero
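
The three phases and the retry requirement above can be sketched with a toy model. Everything here is an assumption made for illustration: the class and method names are hypothetical, reservations are tracked per exact address, and any store to a reserved address cancels the reservation. Real implementations define their own reservation granule and failure conditions.

```python
class ReservationMemory:
    """Toy model of lr.w / sc.w; harts are identified by small integers."""

    def __init__(self):
        self.words = {}
        self.reservations = {}  # hart id -> reserved address

    def lr_w(self, hart, addr):
        """rd = M[addr]; reserve M[addr] for this hart."""
        self.reservations[hart] = addr
        return self.words.get(addr, 0)

    def store(self, addr, value):
        """Ordinary store; cancels every reservation held on addr."""
        self.words[addr] = value & 0xFFFFFFFF
        for hart, reserved in list(self.reservations.items()):
            if reserved == addr:
                del self.reservations[hart]

    def sc_w(self, hart, addr, value):
        """Store only if still reserved; return 0 on success, nonzero on failure."""
        if self.reservations.get(hart) != addr:
            self.reservations.pop(hart, None)
            return 1
        self.store(addr, value)  # also cancels this hart's own reservation
        return 0


def atomic_add(mem, hart, addr, delta, interfere=None):
    """Retry loop: load-reserve, compute, store-conditional.

    `interfere` is a hypothetical hook called once, between the first lr.w
    and sc.w, to imitate another hart writing the same address.
    Returns (old_value, attempts).
    """
    attempts = 0
    while True:
        attempts += 1
        old = mem.lr_w(hart, addr)               # phase 1: load and reserve
        if interfere is not None and attempts == 1:
            interfere()
        new = old + delta                        # phase 2: compute
        if mem.sc_w(hart, addr, new) == 0:       # phase 3: conditional store
            return old, attempts
```

With no interference the loop succeeds on the first attempt. If another store hits the address between lr.w and sc.w, the first sc.w fails and the loop retries, picking up the newer value.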