The RISC-V Base ISA and Standard Extensions

8th RISC-V Workshop, Barcelona
May 7, 2018
Andrew Waterman, SiFive
COPYRIGHT 2018 SIFIVE. ALL RIGHTS RESERVED.

Why Instruction Set Architecture matters
• Why can’t Intel sell mobile chips?
  – 99%+ of mobile phones/tablets are based on ARM’s v7/v8 ISA
• Why can’t ARM partners sell servers?
  – 99%+ of laptops/desktops/servers are based on the AMD64 ISA (over 95% built by Intel)
• How can IBM still sell mainframes?
  – IBM 360 is the oldest surviving ISA (50+ years)
ISA is the most important interface in a computer system
ISA is where software meets hardware

Open Software/Standards Work!
Field        Standard           Free, Open Impl.      Proprietary Impl.
Networking   Ethernet, TCP/IP   Many                  Many
OS           POSIX              Linux, FreeBSD        M/S Windows
Compilers    C                  gcc, LLVM             Intel icc, ARMcc
Databases    SQL                MySQL, PostgreSQL     Oracle 12C, IBM DB2
Graphics     OpenGL             Mesa3D                M/S DirectX
ISA          ??????             -----------           x86, ARM, IBM360
• Why not have successful free & open standards and free & open implementations, like other fields?
• Dominant proprietary ISAs are not great designs

What is RISC-V?
• A high-quality, license-free, royalty-free RISC ISA specification originally designed at UC Berkeley
• Standard maintained by the non-profit RISC-V Foundation
• Suitable for all types of computing system, from microcontrollers to supercomputers
• Numerous proprietary and open-source cores
• Experiencing rapid uptake in industry and academia
• Supported by a growing shared software ecosystem
• A work in progress…

RISC-V Origins
• In 2010, after many years and many projects using MIPS, SPARC, and x86 as the bases of research at Berkeley, it was time to choose an ISA for the next set of projects
• Obvious choices: x86 and ARM

Intel x86 “AAA” Instruction
• ASCII Adjust After Addition
• AL register is default source and destination
• If the low nibble of AL is > 9 decimal, or the auxiliary carry flag AF = 1, then
  – Add 6 to low nibble of AL and discard overflow
  – Increment AH (the high byte of AX)
  – Set CF and AF
• Else
  – CF = AF = 0
• Single byte instruction

ARM v7 LDMIAEQ Instruction
    LDMIAEQ SP!, {R4-R7, PC}
• LoaD Multiple, Increment-Address
• Writes to 6 registers (R4-R7, PC, and SP via writeback) from 5 loads
• Only executes if EQ condition code is set
• Writes to the PC (a conditional branch)
• Can change instruction sets
• Idiom for "stack pop and return from a function call"

RISC-V Origin Story
• x86 impossible – IP issues, too complex
• ARM mostly impossible – no 64-bit, IP issues, complex
• So we started a “3-month project” in summer 2010 to develop our own clean-slate ISA
  – Principal designers: Andrew Waterman, Yunsup Lee, Dave Patterson, Krste Asanovic
• Four years later, we released the frozen base user spec
  – First public specification released in May 2011
  – Several publications, many tapeouts, lots of software along the way

Our Goals for a Universal ISA
• Works well with existing software stacks, languages
• Is native hardware ISA, not virtual machine/ANDF
• Suits all sizes of processor, from smallest microcontroller to largest supercomputer
• Suits all implementation technologies: FPGA, ASIC, full-custom, ??
• Suits all microarchitectural styles: microcoded, in-order, decoupled, out-of-order, single-issue, superscalar, …
• Supports extensive customization to act as base for customized accelerators
• Stable: not changing, not disappearing

The RV32I Base Instruction Set Architecture

RV32I Architectural State
31 general-purpose registers, x1-x31 (x0 is hardwired to 0), plus pc
Registers are 32 bits wide => 1024 bits of architectural state

RV32I Instruction Formats
• Instructions all 32 bits wide; must be 32-bit aligned in memory
• 4 base formats (R/I/S/U) + 2 immediate-encoding variants (B/J)
• Register specifiers (rs2/rs1/rd) always in same place
[Figure: bit layouts of the R, I, S, B, U, and J instruction formats]

RV32I Arithmetic and Logical Operations
• Most arithmetic instructions use R-type format
• Perform the computation rs1 OP rs2 and write result to rd
• There are 10 of them:
  – Arithmetic: ADD, SUB
  – Bitwise: AND, OR, XOR
  – Shifts: SLL, SRL, SRA
  – Comparisons: SLT (rd = rs1 < rs2 ? 1 : 0), SLTU (same, but unsigned)

RV32I Arithmetic and Logical Operations
• Also have register-immediate forms using I-type format
• Perform the computation rs1 OP imm and write result to rd
• Same ones as R-type, but no need for SUB
  – Arithmetic: ADDI
  – Bitwise: ANDI, ORI, XORI
  – Shifts: SLLI, SRLI, SRAI
  – Comparisons: SLTI, SLTIU
• Immediate is always sign-extended (even when it represents an unsigned quantity)

RV32I Memory Access Instructions
• Loads also use I-type format
• Compute address rs1 + imm, read from memory, write result to rd
• LB, LH, LW: load byte (8b), load halfword (16b), load word (32b)
• LB and LH sign-extend the quantity before writing rd
• LBU, LHU (load byte unsigned, load halfword unsigned) zero-extend

RV32I Memory Access Instructions
• Stores use S-type format
• Compute address rs1 + imm; write contents of rs2 to memory
• SW stores all 32 bits of rs2; SB and SH store lower 8b/16b
• Note, imm[4:0] moves to bits 11:7 to accommodate rs2
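
The immediate handling on the I-type, load, and S-type slides above can be sketched in Python. This is a minimal illustration under stated assumptions, not an assembler: every helper name is hypothetical, and the opcode and funct3 constants are the standard RV32I encodings for OP-IMM and STORE, which the slides themselves do not list. It shows a 12-bit immediate being sign-extended, the S-type immediate being split so that imm[4:0] lands in bits 11:7, and the LB/LBU difference.

```python
# Illustrative model only; helper names are not from the slides.
OPCODE_OP_IMM = 0b0010011  # standard RV32I major opcode for ADDI etc.
OPCODE_STORE = 0b0100011   # standard RV32I major opcode for SB/SH/SW


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    if value >> (bits - 1):
        value -= 1 << bits
    return value


def encode_i_type(opcode, funct3, rd, rs1, imm):
    """I-type: imm[11:0] | rs1 | funct3 | rd | opcode."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_s_type(opcode, funct3, rs1, rs2, imm):
    """S-type: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode."""
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode)


def decode_i_imm(insn):
    return sign_extend(insn >> 20, 12)


def decode_s_imm(insn):
    # Reassemble the two immediate pieces, then sign-extend.
    return sign_extend(((insn >> 25) << 5) | ((insn >> 7) & 0x1F), 12)


def load_byte(raw, unsigned):
    """Value written to a 32-bit rd by LBU (unsigned=True) or LB."""
    if unsigned:
        return raw & 0xFF
    return sign_extend(raw, 8) & 0xFFFFFFFF


addi = encode_i_type(OPCODE_OP_IMM, 0b000, rd=1, rs1=0, imm=-1)  # addi x1, x0, -1
sw = encode_s_type(OPCODE_STORE, 0b010, rs1=1, rs2=2, imm=-4)    # sw x2, -4(x1)
```

Note that rs1 sits in bits 19:15 in both encodings, which is the "register specifiers always in same place" property from the formats slide.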

RV32I Addressing
• Two U-type instructions:
• LUI (Load Upper Immediate) supports global addressing
    lui t0, 0x12345      # t0 = 0x12345000
    lw t0, 0x678(t0)     # t0 = Mem[0x12345678]
• Also use LUI + ADDI to generate any 32-bit constant
• AUIPC (Add Upper Immediate to PC) supports PC-relative addressing
    auipc t0, 0x12345    # t0 = pc + 0x12345000
    lw t0, 0x678(t0)     # t0 = Mem[pc + 0x12345678]

RV32I Conditional Branches
• Branches use B-type format
• Same as S-type format, except immediate scaled by 2 (can only branch to even-numbered addresses)
  – Supports ISA extensions with instruction lengths in multiples of 2 bytes
• Compare rs1 and rs2; if true, set pc := pc + imm; else fall through
• Equality (BEQ/BNE), magnitude (BLT/BGE/BLTU/BGEU)

RV32I Unconditional Jumps
• JAL (Jump and Link) is the only J-type instruction
• Sets rd := pc + 4, then sets pc := pc + imm (±1 MiB range)
  – When rd=x0, it’s just a jump; when rd=x1, it’s a function call
• JALR (Jump and Link Register) uses I-type format
• Sets rd := pc + 4; sets pc := rs1 + imm
• Use for returns (rd=x0, rs1=x1), indirect calls (rd=x1), table jumps
• Use with LUI/AUIPC to call any 32-bit address

The Rest of RV32I
• FENCE for memory ordering (see Memory Model talk later today)
• FENCE.I for self-modifying code
• ECALL for system calls
• EBREAK for breakpoints
• CSRRx to access control and status registers (see Priv Arch talk)

RISC-V Specifies Three Base ISAs Besides RV32I
• RV32E (E=Embedded):
  – Same as RV32I, but only registers x0-x15 are present. Accessing x16-x31 causes an illegal-instruction trap.
  – Removes 512 bits of state; helps implementations sensitive to regfile cost
• RV64I: 64-bit address variant
  – Expands the x-registers and the pc to 64 bits
  – Adds new load and store instructions: LWU, LD, SD
  – Existing arithmetic instructions now operate on full 64-bit registers
  – New arithmetic instructions that operate on lower 32 bits of registers and produce a 32-bit result sign-extended to 64 bits: ADDW, SUBW, SLLW, SRLW, SRAW, ADDIW, SLLIW, SRLIW, SRAIW
• RV128I: 128-bit address variant; follows same pattern as RV64I

The M Standard Extension for Integer Multiplication and Division

The M Extension for Integer Multiply/Divide
• Multiply/divide not part of base ISA
  – Not always needed; sometimes too costly
• M extension adds these features with 8 R-type instructions

Instruction             Meaning
mul    rd, rs1, rs2     rd = rs1 × rs2
mulh   rd, rs1, rs2     rd = (sext(rs1) × sext(rs2)) >> XLEN
mulhu  rd, rs1, rs2     rd = (zext(rs1) × zext(rs2)) >> XLEN
mulhsu rd, rs1, rs2     rd = (sext(rs1) × zext(rs2)) >> XLEN
div    rd, rs1, rs2     rd = rs1 ÷ rs2
rem    rd, rs1, rs2     rd = rs1 % rs2
divu   rd, rs1, rs2     rd = rs1 ÷uns rs2
remu   rd, rs1, rs2     rd = rs1 %uns rs2

The A Standard Extension for Atomic Memory Operations

The A Extension for Atomic Memory Operations
• Loads/stores can’t scalably support multiprocessor synchronization
• A extension provides synchronization primitives in two forms:
  – Load-reserved/Store-conditional
  – Atomic fetch-and-ϕ
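
As an aside on the M extension table above, the difference between mul, mulh, mulhu, and mulhsu is easiest to see with concrete operands. The following Python sketch models the four "Meaning" formulas for XLEN = 32; the function names mirror the mnemonics but are otherwise hypothetical, and the division instructions are left out because the slides do not spell out their corner cases.

```python
# Illustrative model of the M-extension multiply formulas, assuming XLEN = 32.
XLEN = 32
MASK = (1 << XLEN) - 1


def sext(x):
    """Read an XLEN-bit register pattern as a signed integer."""
    x &= MASK
    return x - (1 << XLEN) if x >> (XLEN - 1) else x


def zext(x):
    """Read an XLEN-bit register pattern as an unsigned integer."""
    return x & MASK


def mul(rs1, rs2):
    # The low XLEN bits of the product are the same for signed and unsigned.
    return (rs1 * rs2) & MASK


def mulh(rs1, rs2):
    return ((sext(rs1) * sext(rs2)) >> XLEN) & MASK


def mulhu(rs1, rs2):
    return ((zext(rs1) * zext(rs2)) >> XLEN) & MASK


def mulhsu(rs1, rs2):
    return ((sext(rs1) * zext(rs2)) >> XLEN) & MASK


all_ones = 0xFFFFFFFF  # -1 when signed, 2**32 - 1 when unsigned
results = {
    "mul": mul(all_ones, all_ones),
    "mulh": mulh(all_ones, all_ones),
    "mulhu": mulhu(all_ones, all_ones),
    "mulhsu": mulhsu(all_ones, all_ones),
}
```

With both operands set to all ones, the low half is 1 in every interpretation, while the high half depends entirely on how each operand is extended. Python's `>>` on negative integers is an arithmetic shift, which is what the signed variants need.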

Load-reserved, Store-conditional
• Splits an atomic read-modify-write operation into three phases:
  – Load data, and acquire reservation on the address
  – Compute new value
  – Store new value, only if reservation still held
• Store may fail, so sequence needs to be retried
  – Writes rd with zero on success or nonzero on failure
• Forward progress guaranteed for certain restricted sequences

Instruction             Meaning
lr.w rd, (rs1)          rd = M[rs1]; reserve M[rs1]
sc.w rd, rs2, (rs1)     if still reserved: M[rs1] = rs2; rd = 0
                        otherwise: rd = nonzero
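
The three-phase retry pattern above can be sketched with a toy Python model. Everything here is an assumption made for illustration: `ToyMemory`, `atomic_add`, and the `between` hook are hypothetical names, reservations are tracked per exact address, and the only thing that breaks a reservation is a store to that address. Real implementations may use a coarser reservation granule and may fail a store-conditional for other reasons.

```python
# Toy single-threaded model of lr.w / sc.w; not how hardware implements it.
class ToyMemory:
    def __init__(self):
        self.words = {}         # address -> 32-bit value
        self.reservations = {}  # hart id -> reserved address

    def lr_w(self, hart, addr):
        """rd = M[addr]; reserve M[addr] for this hart."""
        self.reservations[hart] = addr
        return self.words.get(addr, 0)

    def sw(self, addr, value):
        """Plain store; invalidates every reservation on addr."""
        self.words[addr] = value & 0xFFFFFFFF
        for hart in [h for h, a in self.reservations.items() if a == addr]:
            del self.reservations[hart]

    def sc_w(self, hart, addr, value):
        """Store only if still reserved. Returns 0 on success, 1 on failure."""
        if self.reservations.get(hart) != addr:
            self.reservations.pop(hart, None)
            return 1
        self.sw(addr, value)  # also clears this hart's own reservation
        return 0


def atomic_add(mem, hart, addr, delta, between=None):
    """Retry loop: load-reserved, compute, store-conditional."""
    attempts = 0
    while True:
        attempts += 1
        old = mem.lr_w(hart, addr)              # phase 1: load and reserve
        new = (old + delta) & 0xFFFFFFFF        # phase 2: compute new value
        if between is not None:
            between(attempts)                   # hook to simulate another hart
        if mem.sc_w(hart, addr, new) == 0:      # phase 3: conditional store
            return old, attempts


mem = ToyMemory()
mem.sw(0x100, 5)


def other_hart(attempt):
    # On the first attempt only, another hart overwrites the location.
    if attempt == 1:
        mem.sw(0x100, 40)


old, attempts = atomic_add(mem, hart=0, addr=0x100, delta=1, between=other_hart)
```

In this run the first store-conditional fails because the simulated second hart wrote to the address after the reservation was taken, so the loop retries and the increment is applied to the newer value rather than being lost.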
