ARM Architecture
Computer Organization and Assembly Languages
Yung-Yu Chuang
with slides by Peng-Sheng Chen, Ville Pietikainen

ARM history
• 1983 developed by Acorn computers
– To replace 6502 in BBC computers
– 4-man VLSI design team
– Its simplicity comes from the team's inexperience
– Match the needs for generalized SoC for reasonable power, performance and die size
– The first commercial RISC implementation
• 1990 ARM (Advanced RISC Machine), owned by Acorn, Apple and VLSI

ARM Ltd
• Designs and licenses ARM cores but does not fabricate them

Why ARM?
• One of the most licensed and thus widespread processor cores in the world
– Used in PDAs, cell phones, multimedia players, handheld game consoles, digital TVs and cameras
– ARM7: GBA, iPod
– ARM9: NDS, PSP, Sony Ericsson, BenQ
– ARM11: Apple iPhone, Nokia N93, N800
– 90% of 32-bit embedded RISC processors till 2009
• Used especially in portable devices due to its low power consumption and reasonable performance

ARM powered products
[Figure: ARM powered products]

ARM processors
• A simple but powerful design
• A whole family of designs sharing similar design principles and a common instruction set

Naming ARM
• ARMxyzTDMIEJFS
– x: series
– y: MMU
– z: cache
– T: Thumb
– D: debugger
– M: Multiplier
– I: EmbeddedICE (built-in debugger hardware)
– E: Enhanced instruction
– J: Jazelle (JVM)
– F: Floating-point
– S: Synthesizable version (source code version for EDA tools)

Popular ARM architectures
• ARM7TDMI
– 3 pipeline stages (fetch/decode/execute)
– High code density/low power consumption
– One of the most used ARM versions (for low-end systems)
– All ARM cores after ARM7TDMI include TDMI even if they do not include TDMI in their labels
• ARM9TDMI
– Compatible with ARM7
– 5 stages (fetch/decode/execute/memory/write)
– Separate instruction and data cache
• ARM11

ARM family comparison
[Table: ARM family comparison; only the year row survives: 1995, 1997, 1999, 2003]

ARM is a RISC
• RISC: simple but powerful instructions that execute within a single cycle at high clock
speed.
• Four major design rules:
– Instructions: reduced set/single cycle/fixed length
– Pipeline: decode in one stage/no need for microcode
– Registers: a large set of general-purpose registers
– Load/store architecture: data processing instructions apply to registers only; load/store to transfer data from memory
• Results in simple design and fast clock rate
• The distinction blurs because CISC implements RISC concepts

ARM design philosophy
• Small processor for lower power consumption (for embedded systems)
• High code density for limited memory and physical size restrictions
• The ability to use slow and low-cost memory
• Reduced die size for reducing manufacture cost and accommodating more peripherals

ARM features
• Different from pure RISC in several ways:
– Variable cycle execution for certain instructions: multiple-register load/store (faster/higher code density)
– Inline barrel shifter leading to more complex instructions: improves performance and code density
– Thumb 16-bit instruction set: 30% code density improvement
– Conditional execution: improves performance and code density by reducing branches
– Enhanced instructions: DSP instructions

ARM architecture
• Load/store architecture
• A large array of uniform registers
• Fixed-length 32-bit instructions
• 3-address instructions

Registers
• Only 16 registers are visible to a specific mode. A mode could access
– A particular set of r0-r12
– r13 (sp, stack pointer)
– r14 (lr, link register)
– r15 (pc, program counter)
– Current program status register (cpsr)
– The uses of r0-r13 are orthogonal

General-purpose registers
[Figure: 32-bit register with bit boundaries 31, 24, 23, 16, 15, 8, 7, 0, showing the 8-bit byte, 16-bit half word and 32-bit word]
• 6 data types (signed/unsigned)
• All ARM operations are 32-bit. Shorter data types are only supported by data transfer operations.
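The point that shorter data types exist only at the load/store boundary can be sketched in a few lines of Python. This is an illustrative model, not ARM code: the `MEMORY` contents and the `load` helper are hypothetical, and the mnemonics in the comments (LDRB, LDRSB, LDRH, LDRSH, LDR) name the ARM loads that are assumed to correspond to each case. A byte or half word is widened to 32 bits on its way into a register, by zero extension for unsigned types and sign extension for signed ones.

```python
# Hypothetical little-endian memory image, for illustration only.
MEMORY = bytes([0x80, 0xFF, 0x34, 0x12, 0xEF, 0xBE, 0xAD, 0xDE])


def load(memory, addr, size, signed):
    """Model a load of `size` bytes (1, 2 or 4) into a 32-bit register.

    Unsigned loads zero-extend, signed loads sign-extend, so the register
    always ends up holding a full 32-bit value.
    """
    value = int.from_bytes(memory[addr:addr + size], "little", signed=signed)
    return value & 0xFFFFFFFF  # keep the two's-complement 32-bit pattern


print(hex(load(MEMORY, 0, 1, signed=False)))  # like LDRB:  0x80
print(hex(load(MEMORY, 0, 1, signed=True)))   # like LDRSB: 0xffffff80
print(hex(load(MEMORY, 2, 2, signed=False)))  # like LDRH:  0x1234
print(hex(load(MEMORY, 0, 2, signed=True)))   # like LDRSH: 0xffffff80
print(hex(load(MEMORY, 4, 4, signed=False)))  # like LDR:   0xdeadbeef
```

Once the value is in a register, every data processing operation sees 32 bits, which is why no separate byte or half-word arithmetic is needed in this model.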
Program counter
• Stores the address of the instruction to be executed
• All instructions are 32-bit wide and word-aligned
• Thus, the last two bits of pc are undefined.

Program status register (CPSR)
[Figure: CPSR layout with fields negative, zero, carry/borrow, overflow, IRQ disable, FIQ disable, Thumb state, mode bits]

Processor modes
[Figure: processor modes]

Register organization
[Figure: register organization]

Instruction sets
• ARM/Thumb/Jazelle

Pipeline
[Figure: ARM7 and ARM9 pipelines]
• In execution, pc is always 8 bytes ahead of the executing instruction
• Execution of a branch or direct modification of pc causes the ARM core to flush its pipeline
• ARM10 starts to use branch prediction
• An instruction in the execution stage will complete even though an interrupt has been raised. Other instructions in the pipeline are abandoned.

Interrupts
[Figure: vector table, interrupt handlers, code]
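As a rough sketch of how the CPSR fields named above pack into one 32-bit word, and of what "pc 8 bytes ahead" means numerically, consider the following Python model. The bit positions and mode encodings are the classic ARM ones and are an assumption here, since the slide's figure did not survive extraction; `decode_cpsr` and `pc_as_read` are hypothetical helper names, not part of any ARM toolchain.

```python
# Mode field encodings (CPSR bits 4..0), assumed from the classic ARM layout.
MODES = {
    0b10000: "User",
    0b10001: "FIQ",
    0b10010: "IRQ",
    0b10011: "Supervisor",
    0b10111: "Abort",
    0b11011: "Undefined",
    0b11111: "System",
}


def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its condition flags, masks and mode."""
    def bit(n):
        return bool((cpsr >> n) & 1)

    return {
        "negative": bit(31),
        "zero": bit(30),
        "carry": bit(29),
        "overflow": bit(28),
        "irq_disabled": bit(7),
        "fiq_disabled": bit(6),
        "thumb": bit(5),
        "mode": MODES.get(cpsr & 0x1F, "unknown"),
    }


def pc_as_read(instruction_address):
    """In ARM state, reading pc yields the executing instruction's address + 8,
    because two further instructions have already been fetched."""
    return instruction_address + 8


print(decode_cpsr(0x600000D3))
print(hex(pc_as_read(0x8000)))  # 0x8008
```

For the sample value 0x600000D3 the model reports zero and carry set, both interrupt types disabled, ARM (not Thumb) state, and Supervisor mode.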