ARM and AVR32: Which one is better?

SAMSUNG Software Membership Suwon Chang-yeon, Cho.

Copyright ⓒ 2007 by iprinceps. No part of this document may be reproduced in any form, in an electronic retrieval system or otherwise, without the prior written permission of the publisher.

Outline

- Abstract
- Introduction
- ARM
- AVR32
- ARM vs. AVR32
- References

Abstract

- ARM
  - Brief history of ARM
  - Characteristics of ARM
    - Architecture
    - Programmer’s Model
  - Why is ARM the most popular embedded microprocessor in the market?

- AVR32
  - What is the AVR32?
  - Characteristics of AVR32
    - Architecture
    - Programmer’s Model
  - What are the advantages of choosing AVR32?

- ARM vs. AVR32
  - Which one is better, ARM or AVR32?


Introduction

- ARM
  - The industry’s leading provider of 32-bit embedded RISC processors, with almost 75% of the market: it is the most popular embedded microprocessor.
  - Many processors on the market implement an ARM core.
    - PXA255, PXA270, S3C2440A, AT91 Series, TI TMS Series, etc.
    - Since last year, processors that implement the ARM11 core have appeared.
  - ARM has convenient development environments and build tools.
  - Were there any competitors?

- AVR32
  - Introduced to the market just last year.
  - Atmel’s first 32-bit processor of its own.
  - AVR32 has many interesting features.
  - Open-source support is available.
    - Operating systems
    - Development environment


ARM

- History
  - ARM was introduced in the mid-1980s.
    - 1985: Acorn Computer Group develops the world’s first commercial RISC processor
    - 1991: The first embeddable RISC core: ARM6
    - 1993: TI, Cirrus and Samsung license ARM; ARM7 core
    - 1995: Thumb architecture, StrongARM
    - 1996: ARM9TDMI family announced
    - 2002: ARM11 family announced
  - Architecture (instruction set) progression

ARM

- The ARM Instruction Set Architecture
  - The ARM architecture supports the 32-bit ARM and 16-bit Thumb instruction set architectures, along with architecture extensions.
    - Extensions: Java acceleration, security, Intelligent Energy Manager, SIMD, and NEON technologies.
    - ARMv5TE: introduced in 1999, along with the ARM ‘Enhanced’ DSP instruction set extensions.
    - ARMv5TEJ: in 2000, ARMv5TEJ added the Jazelle extension to support Java acceleration.
    - ARMv6: announced in 2001, ARMv6 improves the memory system, exception handling and support for multiprocessing environments. It also includes media instructions that support Single Instruction Multiple Data (SIMD).
    - ARMv7: includes Thumb-2 technology and NEON technology.
    - NEON: media acceleration technology, designed to address the demands of next-generation mobile handheld devices.
    - Thumb-2: defines a new set of 32-bit instructions that execute alongside traditional 16-bit instructions in Thumb state, reducing the need to balance ARM and Thumb code.

ARM

- Programmer’s Model
  - Pipelines
    - ARM7 pipeline (three stages; successive instructions overlap by one stage)

        FETCH   DECODE  EXECUTE
                FETCH   DECODE  EXECUTE
                        FETCH   DECODE  EXECUTE

    - ARM9TDMI pipeline (adds two stages after Execute)
      - Memory: access the memory area
      - Write: store the result of processing to a register
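The overlap shown in the ARM7 diagram can be put into numbers. The sketch below is an idealized model, not a cycle-accurate one: it assumes one instruction enters the pipeline per cycle and nothing stalls. The function names are made up for this illustration.

```python
ARM7_STAGES = ("FETCH", "DECODE", "EXECUTE")
ARM9_STAGES = ("FETCH", "DECODE", "EXECUTE", "MEMORY", "WRITE")


def pipeline_cycles(num_instructions, stages):
    """Cycles needed to finish all instructions on an ideal pipeline."""
    if num_instructions <= 0:
        return 0
    # The first instruction needs len(stages) cycles; each later one
    # finishes one cycle after its predecessor.
    return len(stages) + num_instructions - 1


def stage_at(instruction, cycle, stages):
    """Stage occupied by an instruction in a cycle (both 0-based), or None."""
    position = cycle - instruction
    if 0 <= position < len(stages):
        return stages[position]
    return None
```

Under these assumptions, three instructions take 5 cycles on the three-stage pipeline and 7 on the five-stage one. The deeper pipeline pays off through a higher clock rate, not through fewer cycles.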

ARM

  - Operating Modes

    Mode        Description                                          ID   Comments
    ----------  ---------------------------------------------------  ---  ---------------------------------
    User        Normal program execution mode                        usr  Restricted
    System      Privileged mode for the operating system             sys  OS task
    FIQ         Entered on a fast interrupt                          fiq  High-speed channel
    IRQ         Entered on a normal interrupt                        irq
    Supervisor  Exception mode for the operating system              svc  SWI
    Abort       Entered on a data or instruction prefetch abort      abt  Virtual memory, memory protection
    Undef       Entered on an undefined instruction                  und  HW emulation

    - User and System modes share one bank of registers.
    - Exception modes use their own registers:
      - FIQ mode has private R8 ~ R14.
      - The other modes have private R13 and R14.

ARM

  - Registers
    - 37 registers in total.

ARM

  - Registers (Cont’d)
    - Unbanked registers: R0 ~ R7
      - The same in all modes
    - Banked registers: R8 ~ R14
      - R8 ~ R12: for simple interrupts, FIQ can be very fast using only R8 ~ R12.
      - R13 ~ R14
        - R13: usually used as the Stack Pointer (SP)
        - R14: usually used as the Link Register (LR)
    - Program Counter: R15
    - Program Status Register
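The total of 37 follows from the banking scheme above. The sketch below redoes that arithmetic, assuming the classic ARM layout of one CPSR plus one SPSR per exception mode. The `R13_irq`-style names and the helper functions are only illustrative.

```python
# Registers visible in User/System mode: R0..R15.
SHARED = [f"R{n}" for n in range(16)]

# Private (banked) register numbers for each exception mode.
EXCEPTION_BANKS = {
    "fiq": range(8, 15),   # R8..R14
    "irq": range(13, 15),  # R13, R14
    "svc": range(13, 15),
    "abt": range(13, 15),
    "und": range(13, 15),
}


def physical_register(mode, n):
    """Name of the physical register that Rn refers to in a given mode."""
    if mode in EXCEPTION_BANKS and n in EXCEPTION_BANKS[mode]:
        return f"R{n}_{mode}"
    return f"R{n}"


def total_registers():
    general = len(SHARED) + sum(len(bank) for bank in EXCEPTION_BANKS.values())
    status = 1 + len(EXCEPTION_BANKS)  # CPSR + one SPSR per exception mode
    return general + status
```

Here `total_registers()` gives 16 + 7 + 4 × 2 = 31 general-purpose registers plus 6 status registers, which is the 37 quoted on the slide.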

ARM

  - Exceptions

    Exception Type         Priority  Mode        Vector      High Vector
    ---------------------  --------  ----------  ----------  -----------
    Reset                  1         Supervisor  0x00000000  0xFFFF0000
    Undefined Instruction  6         Undefined   0x00000004  0xFFFF0004
    SWI                    6         Supervisor  0x00000008  0xFFFF0008
    Prefetch Abort         5         Abort       0x0000000C  0xFFFF000C
    Data Abort             2         Abort       0x00000010  0xFFFF0010
    Reserved                                     0x00000014  0xFFFF0014
    IRQ                    4         IRQ         0x00000018  0xFFFF0018
    FIQ                    3         FIQ         0x0000001C  0xFFFF001C

    When an exception occurs (mode = the exception mode being entered):
      R14_mode = return address
      SPSR_mode = CPSR
      CPSR[4:0] = exception mode number
      CPSR[5] = 0      // in ARM state
      If the exception is reset or FIQ:
        CPSR[6] = 1    // disable FIQ
      CPSR[7] = 1      // disable IRQ
      PC = vector address

    To return from an exception:
      CPSR = SPSR_mode
      PC = R14_mode
      Done by MOVS|SUBS PC, XX, or by LDM with restore CPSR
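The entry and return steps above can be traced with a small software model. This is a sketch, not an emulator: the dictionary-based CPU state and the function names are invented for the example, and the mode numbers are the standard ARM CPSR[4:0] encodings, which the slides do not list.

```python
# CPSR[4:0] encodings from the ARM architecture (not given on the slides).
MODE_BITS = {"usr": 0b10000, "fiq": 0b10001, "irq": 0b10010, "svc": 0b10011,
             "abt": 0b10111, "und": 0b11011, "sys": 0b11111}
MODE_NAMES = {bits: name for name, bits in MODE_BITS.items()}

# exception name -> (mode entered, vector offset, also disables FIQ?)
EXCEPTIONS = {
    "reset": ("svc", 0x00, True),
    "undefined": ("und", 0x04, False),
    "swi": ("svc", 0x08, False),
    "prefetch_abort": ("abt", 0x0C, False),
    "data_abort": ("abt", 0x10, False),
    "irq": ("irq", 0x18, False),
    "fiq": ("fiq", 0x1C, True),
}

MODE_MASK = 0x1F
T_BIT, F_BIT, I_BIT = 1 << 5, 1 << 6, 1 << 7


def new_cpu(cpsr=MODE_BITS["usr"]):
    """Hypothetical CPU state: CPSR, PC, and per-mode R14/SPSR banks."""
    return {"cpsr": cpsr, "pc": 0, "lr": {}, "spsr": {}}


def enter_exception(cpu, name, return_address, high_vectors=False):
    mode, offset, masks_fiq = EXCEPTIONS[name]
    cpu["lr"][mode] = return_address        # R14_mode = return address
    cpu["spsr"][mode] = cpu["cpsr"]         # SPSR_mode = CPSR
    cpsr = (cpu["cpsr"] & ~MODE_MASK) | MODE_BITS[mode]
    cpsr &= ~T_BIT                          # execute the handler in ARM state
    if masks_fiq:
        cpsr |= F_BIT                       # disable FIQ (reset and FIQ only)
    cpsr |= I_BIT                           # disable IRQ
    cpu["cpsr"] = cpsr
    cpu["pc"] = (0xFFFF0000 if high_vectors else 0) + offset


def return_from_exception(cpu):
    mode = MODE_NAMES[cpu["cpsr"] & MODE_MASK]
    cpu["pc"] = cpu["lr"][mode]             # PC = R14_mode
    cpu["cpsr"] = cpu["spsr"][mode]         # CPSR = SPSR_mode
```

For example, taking an IRQ from User mode leaves the PC at 0x18 with IRQs masked, and `return_from_exception` then restores both the saved PC and the saved CPSR.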

ARM

- Why is ARM the most popular embedded microprocessor in the market?
  - Low power consumption
    - In embedded systems and handheld devices, power consumption is a critical concern.
    - ARM microprocessor solutions offer the industry’s lowest power consumption and best MIPS per watt.
    - For example, the STR7 family with ARM7TDMI draws 10 µA in stand-by mode.
  - Low cost of silicon
    - ARM processors and other IP products make efficient use of silicon and memory to align with the economics of wireless devices.
  - Core performance
    - 1 MHz to 1 GHz with architectural performance
  - Wide support
    - A wide range of OS, middleware and tool support, and an extensive choice of multimedia codec solutions optimized for ARM processors, are available from the ARM Connected Community.

ARM

- Development Environments and Support
  - Development environments
    - Embedded-ICE debug
    - RVDS, ADS, Keil, IAR, etc.
    - GNU Toolchain for ARM
  - Support
    - Open source: eCos, etc.

- Were there any competitors?
  - Alchemy AU1200
    - iStation V43


AVR32

- What is the AVR32?
  - 32-bit load/store AVR32 RISC architecture
  - 15 general-purpose 32-bit registers
  - 32-bit Stack Pointer, Program Counter and Link Register reside in the register file
  - Fully orthogonal instruction set
  - Pipelined architecture allows one instruction per clock cycle for most instructions
  - Shadowed interrupt context for INT3 and multiple interrupt priority levels
  - Privileged and unprivileged modes enabling efficient and secure operating systems
  - Full MMU allows for operating systems with memory protection
  - Instruction and data caches

AVR32

- AVR32
  - Architecture
    - NOT binary compatible with the earlier AVR architecture
    - To achieve high code density, the instruction format is flexible, providing both compact 16-bit instructions and extended 32-bit instructions.
    - Compact and extended instructions can be FREELY mixed in the instruction stream.
    - To reduce code size to a minimum, some instructions have multiple addressing modes.
    - Frequently used instructions, like add, have a compact format with two operands as well as an extended format with three operands.
      - The larger format increases performance, allowing an addition and a data move in the same instruction in a single cycle.
    - Load/store instructions have several different formats in order to reduce code size and speed up execution:
      - Load/store to an address specified by a pointer register
      - Load/store to an address specified by a pointer register with postincrement, predecrement, and displacement
      - Load/store to an address specified by a small immediate (direct addressing within a small page)
      - Load/store to an address specified by a pointer register and an index register
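The pointer-based load/store forms listed above differ only in how the effective address is computed and whether the pointer register is updated. The sketch below models that in software. The mode names, the register dictionary and the function are hypothetical, and any index scaling the real instructions may apply is left out.

```python
def effective_address(regs, mode, pointer, size=4, displacement=0, index=None):
    """Return the access address and update the pointer register if needed.

    regs is a dict of register name -> value; size is the access width in bytes.
    """
    if mode == "pointer":
        return regs[pointer]
    if mode == "postincrement":
        address = regs[pointer]
        regs[pointer] += size       # pointer advances after the access
        return address
    if mode == "predecrement":
        regs[pointer] -= size       # pointer moves before the access
        return regs[pointer]
    if mode == "displacement":
        return regs[pointer] + displacement
    if mode == "indexed":
        return regs[pointer] + regs[index]
    raise ValueError(f"unknown addressing mode: {mode}")
```

Walking an array with the postincrement form, for instance, needs no separate add instruction to advance the pointer, which is where the code-size saving comes from.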

  - Architecture (Cont’d)
    - Event handling
      - The different event sources have different priority levels, ensuring well-defined behavior when multiple events are received simultaneously.
      - Pending events of a higher priority class may preempt handling of ongoing events of a lower priority class.
      - Each priority class has dedicated registers to keep the return address and status register, removing the need for time-consuming memory operations to save this information.
      - 4-level external interrupts
    - Microarchitectures
      - AVR32A: targeted at cost-sensitive, lower-end applications like smaller microcontrollers.
        - Does not provide dedicated hardware registers for shadowing register file registers in interrupt context, nor hardware registers for the return address and return status register → all of this information is stored on the system stack.
      - AVR32B: targeted at applications where interrupt latency is important.
        - Implements dedicated registers to hold the status register and return address for interrupts, exceptions and supervisor calls → this information does not need to be written to the stack, and latency is therefore reduced.
        - The INT0 to INT3 contexts may have dedicated versions of the registers in the register file.

AVR32

- Programmer’s Model
  - Register file configuration
    - The AVR32B architecture specifies that the exception contexts may have a different number of shadowed registers in different implementations. The following shadow model is used in AVR32 AP.

AVR32

  - Status register configuration
    - The Status Register (SR) is split into two halfwords, one upper and one lower. The lower halfword contains the C, Z, N, V and Q condition code flags and the R, T and L bits, while the upper halfword contains information about the mode and state the processor executes in.

AVR32

  - Status register configuration (Cont’d)
    - D: Debug state
      - The processor is in debug state when this bit is set. The bit is cleared at reset and should only be modified by debug hardware.
    - M2, M1, M0: Execution Mode
    - R: Java Register Remap
      - When this bit is set, the addresses of the registers in the register file are dynamically changed.
    - T: Scratch bit
      - Not used by any instruction, but can be manipulated by the application as a scratchpad bit. This bit is not cleared by reset.

AVR32

  - Status register configuration (Cont’d)
    - Q: Saturation flag
      - The saturation flag indicates that a saturating arithmetic operation overflowed. The flag is sticky: once set, it has to be cleared manually by a csrf instruction after the desired action has been taken.
    - L: Lock flag
      - Used by the conditional store instruction to support atomic memory access. Automatically cleared by rete. This bit is cleared after reset.
  - Configuration Registers
    - Configuration registers inform applications and operating systems about the setup and configuration of the processor on which they are running.
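The sticky behavior of the Q flag is easiest to see in a small model. The sketch below assumes 32-bit signed saturation. The `Flags` class and the `sat_add` and `clear_q` helpers are stand-ins written for this example, with `clear_q` playing the role the slide gives to the csrf instruction.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


class Flags:
    """Holds only the Q (saturation) flag for this illustration."""

    def __init__(self):
        self.q = False


def sat_add(a, b, flags):
    """Add two signed 32-bit values, clamping the result and setting Q on overflow."""
    result = a + b
    if result > INT32_MAX:
        flags.q = True
        return INT32_MAX
    if result < INT32_MIN:
        flags.q = True
        return INT32_MIN
    return result  # Q is left untouched: the flag is sticky


def clear_q(flags):
    """Software must clear Q explicitly once the overflow has been handled."""
    flags.q = False
```

Because a later non-overflowing operation leaves Q alone, a whole sequence of saturating operations can run first and be checked for overflow once at the end.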

AVR32

  - Pipeline
    - Overview
      - AVR32 AP is a pipelined processor with 7 pipeline stages. The pipeline has 3 sub pipes: the Multiply pipe, the Execute pipe and the Data pipe. These pipelines may execute different instructions in parallel. Instructions are issued in order, but may complete out of order, since the sub pipes may be stalled individually and certain operations may use a sub pipe for several clock cycles.
      - IF1, IF2: Instruction Fetch 1 and 2
      - ID: Instruction Decode; IS: Instruction Issue
      - A1, A2: ALU stage 1 and 2; M1, M2: Multiply stage 1 and 2
      - DA: Data Address calculation stage; D: Data cache access
      - WB: Write back

  - Pipeline
    - Prefetch Unit
      - Responsible for feeding instructions to the decode unit.
      - Fetches 32 bits at a time from the instruction cache and places them in a FIFO prefetch buffer.
    - Decode Unit
      - Generates the signals needed for the instruction to execute correctly.
      - ID stage: accepts one instruction each clock cycle from the prefetch unit. If the instruction cannot be decoded, an illegal instruction or unimplemented instruction exception is issued.
      - IS stage: performs register file reads and keeps track of data hazards in the pipeline. If hazards exist, pipelines are frozen as needed to resolve the hazard.
    - ALU Pipeline
      - Performs most of the data manipulation instructions, like arithmetical and logical operations.
      - A1 stage: target address calculation and condition check, condition code checking for conditional instructions, address calculation for indexed memory accesses, write-back address calculation for the LS pipeline, and all flag setting for arithmetical and logical instructions.
      - A2 stage: the saturation needed by satadd and satsub, and the operation and flag setting needed by satrnds, satrndu, sats and satu.

AVR32

  - Pipeline (Cont’d)
    - Multiply Pipeline
      - All multiply instructions execute in the multiply pipeline.
      - Contains a 32 by 16 multiplier array, so 16x16 and 32x16 multiplications have an issue latency of one cycle.
      - Multiplication of 32 by 32 bits requires two iterations through the multiplier array, and therefore needs several cycles to complete.
      - Additional cycles may be needed if an accumulation is to be performed. This will stall the multiply pipeline until the instruction is complete.
    - Load-store Pipeline
      - Can read or write up to two registers per clock cycle, if the data is 64-bit aligned.
      - Contains hardware for performing load and store multiple instructions decoupled from the rest of the core.
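The two iterations needed for a 32x32 multiply follow from splitting one operand into 16-bit halves. The sketch below shows only that arithmetic identity for unsigned operands. How the hardware actually sequences and combines the partial products is not described on the slides, and the function names are illustrative.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mul32x16(a, b16):
    """One pass through a 32 by 16 multiplier array (unsigned operands)."""
    assert 0 <= a <= MASK32 and 0 <= b16 <= MASK16
    return a * b16


def mul32x32(a, b):
    """Unsigned 32x32 multiply built from two 32x16 passes."""
    low = mul32x16(a, b & MASK16)           # a * (low half of b)
    high = mul32x16(a, (b >> 16) & MASK16)  # a * (high half of b)
    return low + (high << 16)               # 64-bit product
```

A 16x16 or 32x16 multiply needs only the first pass, which is consistent with the one-cycle issue latency quoted above.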

AVR32

  - Event Handling
    - An event can be either an interrupt or an exception.
    - Each pipeline stage has a pipeline register that holds the exception requests associated with the instruction in that pipeline stage.
    - Detection
      - D stage: all data-address related exceptions
      - All other exceptions and interrupts are detected in the A1 stage.
    - Event priority
      - If several instructions trigger events, the instruction furthest down the pipeline is serviced first, even if upstream instructions have pending events of higher priority.
      - If this instruction has several pending events, the event with the highest priority is serviced first. After this event has been serviced, all pending events are cleared and the instruction is restarted.
    - Entry points for events
      - Several different event handler entry points exist. For AVR32B, the reset routine entry address is always fixed to 0xA000_0000. This address resides in unmapped, uncached space in order to ensure well-defined resets.
      - If several events occur on the same instruction, they are handled in a prioritized way. The priority ordering is presented on the next page.
      - If events occur on several instructions at different locations in the pipeline, the events on the oldest instruction are always handled before any events on any younger instruction, even if the younger instruction has events of higher priority than the oldest instruction.
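The selection rule above has two steps: pick the oldest instruction first, then the highest-priority event on that instruction. The sketch below is a software illustration under stated assumptions: events are hypothetical (pipeline depth, priority, name) tuples, a larger depth means further down the pipeline, and a lower priority number means higher priority.

```python
def select_event(pending):
    """Pick the event to service from (depth, priority, name) tuples.

    Returns None when nothing is pending.
    """
    if not pending:
        return None
    # Step 1: only the instruction furthest down the pipeline is considered.
    deepest = max(depth for depth, _, _ in pending)
    candidates = [event for event in pending if event[0] == deepest]
    # Step 2: among its events, the highest priority (lowest number) wins.
    return min(candidates, key=lambda event: event[1])
```

A high-priority event on a younger instruction therefore waits: it is raised again only if that instruction still triggers it after the older instruction has been serviced and restarted.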

  - Event Handling (Cont’d)
    - Priority and handler addresses for events

AVR32

- Development Environments and Support
  - Development environments
    - IAR Embedded Workbench
    - Atmel JTAGICEmkII
    - GNU Compiler Collection
      - deb http://www.atmel.no/beta_ware/avr32/ubuntu/breezy binary/
      - Development environment for AVR32 standalone and AVR32 Linux
  - Support
    - Open source: U-Boot, Linux 2.6.18 or later, eCos, etc.
    - Development kit


ARM vs. AVR32

- ARM vs. AVR32
  - AVR32 block diagram

ARM vs. AVR32

  - AVR32 AP multimedia benchmarks
    - QVGA@30fps MPEG4 decode: 75 MHz CPU frequency
    - MP3 audio: 15 MHz CPU frequency
    - Outperforms ARM9 by 3 times in video decode

ARM vs. AVR32

  - EEMBC – Generic Benchmarks
    - Embedded Microprocessor Benchmark Consortium
    - Benchmark of architectures, not devices

ARM vs. AVR32

  - EEMBC – Generic Benchmarks (Cont’d)
    - Code density
      - Lower power consumption
      - Lower RAM requirement

ARM vs. AVR32

  - AP7000 ARM9 competitive overview

ARM vs. AVR32

  - ARM11?
    - Microsoft ZUNE
    - The i.MX31 needs 0.65 mm vias, which cannot be made by drilling.
    - Cost increase
  - Which one is better, ARM or AVR32?


References

- Documents
  - From the Atmel website (http://www.atmel.com)
    - AVR32 AP Technical Reference Manual
    - AVR32 Architecture Document
    - AT32AP7000 Datasheet
    - ATSTK1000 User’s Guide
    - AVR32 ASIA 2006 DRAFT
  - From the CD included in the STK1000
    - AVR32 BSP User’s Guide
    - Linux
    - Tool-chain
