Advanced Microcontrollers
Grzegorz Budzyń
Lecture 5: AVR32 family, ARM cores

Plan

• AVR32 family
  – AVR32UC
  – AVR32AP
• SDRAM access
• ARM cores
  – Introduction
  – History
  – ARM7 cores

Introduction – AVR32
Source: [1]

AVR32 family

• AVR32 core – the high-performance member of the portfolio
• Much effort was put into making the controllers extremely power efficient
• High performance
• Many interesting peripherals
• Two subfamilies: AVR32UC and AVR32AP

AVR32UC

AVR32UC family

• 32-bit AVR UC3 Microcontrollers:
  – Optimized for System Performance
  – True 1.6V operation
  – More MHz per mW
  – Unrivalled DSP performance
  – Peripheral DMA controller

AVR32UC family
• The 32-bit AVR UC3 product family is built on the high-performance 32-bit AVR architecture and optimized for highly integrated applications
• The 32-bit AVR UC3 microcontrollers deliver:
  – high computational throughput
  – deterministic real-time control
  – low power consumption
  – low system cost
  – high reliability
  – ease of use

AVR32UC family
• The 32-bit AVR CPU includes cutting-edge features such as:
  – integer DSP arithmetic
  – fixed point DSP arithmetic
  – single-cycle multiply and accumulate instructions
  – single-cycle SRAM access
  – Peripheral DMA controller
  – multi-layer high-speed bus architecture

AVR32UC family

• AVR32UC is divided into subseries:
  – L Series
  – A0/A1 Series
  – A3/A4 Series
  – B Series

AVR32UC family
Source: [2]

AVR32UC L Series
• First picoPower 32-bit microcontroller
• The device draws 1.65 µA/MHz in active mode, 600 nA with the RTC running, and down to 9 nA with all clocks stopped
• It delivers down to 0.48 mW/MHz in active operation, with sleep-mode consumption of 1.5 µA with the RTC running or 100 nA with all clocks stopped
• The L Series brings a wide range of technological innovations to the 32-bit microcontroller market
• It is the industry's first 32-bit microcontroller with a built-in Capacitive Touch peripheral
• A glue logic controller can eliminate an external PLD
• The FlashVault™ code protection allows the on-chip flash to be partially programmed and locked

AVR32UC A0/A1 Series
• Designed for high data throughput, low power consumption and outstanding computing performance
• Features high connectivity with USB On-The-Go, Ethernet MAC and SDRAM interfaces
• Fast flash and large internal SRAM make the processor ideally suited for data-intensive applications
• The A0 and A1 Series devices are used in a wide range of applications including audio, biometric, communication and industrial

AVR32UC A3/A4 Series

• The AVR UC3 A3 Series is designed for exceptionally high data throughput
• It features:
  – Hi-Speed USB On-The-Go
  – SD/SDIO card interface
  – Multi-Level-Cell (MLC) NAND flash with ECC
  – SDRAM interfaces
  – Multi-layered AVR databus

AVR32UC A3/A4 Series

• It features:
  – 128 KB on-chip SRAM with triple high-speed interfaces
  – multi-channel peripheral and memory-to-memory DMA controller
  – Hi-Fi stereo Audio DAC
  – full duplex multi-channel I2S audio interface
  – AES crypto module

AVR32UC B Series

• The AVR UC3 B Series is designed for:
  – high data throughput
  – low power consumption
  – outstanding computing performance
• The series features:
  – high connectivity with USB On-The-Go
  – fast flash
  – large internal SRAM

AVR32UC – Peripheral DMA
Source: [2]

AVR32UC – Multilevel Interrupt Controller
• The 32-bit AVR UC3 CPU includes a multi-level interrupt controller
• Four priority levels are supported; higher-level interrupts are prioritized and executed before lower-level interrupts
• Any peripheral can be assigned any interrupt level, and the interrupt vector addresses can be changed without stopping the CPU
• Interrupt latency is very low, typically 11 clock cycles including saving the register file to the stack

AVR32UC – Multilevel Interrupt Controller
Source: [2]

AVR32UC – Peripheral Event System
• PES – the Peripheral Event System, an innovative event-triggered data transfer mechanism
• The Peripheral Event System allows peripherals of the 32-bit AVR UC3 to send signals (events) directly to other peripherals without involving the CPU
• This ensures a short and predictable response time
• At the same time it offloads the CPU and reduces power consumption

AVR32UC – Peripheral Event System
Source: [2]

AVR32UC – Dynamic Frequency Scaling
• Dynamic Frequency Scaling (DFS) reduces power consumption when maximum speed is not required throughout the execution of an application. DFS makes it possible to adapt the clock frequency on-the-fly to an application without halting program execution.

AVR32UC – High speed I/O interfaces

• USART
  – Asynchronous and synchronous operation
  – SPI Mode
  – LIN Mode
  – Supports IrDA
  – Up to 33 Mbps communication
  – Peripheral DMA
• TWI
  – I2C and SMBus™ compliant
  – Full 100 kHz and 400 kHz support
  – Master and Slave operation
  – Peripheral DMA

AVR32UC – High speed I/O interfaces
• Up to 36 PWM channels
  – 8-bit resolution
  – Up to 150 MHz base clock
  – Peripheral Event System
• Ethernet
  – Up to 100 Mbps communication
  – Peripheral DMA
• USB On-the-Go
  – Host mode
  – Up to 480 Mbps communication in Hi-Speed mode
  – Peripheral DMA

AVR32UC – High speed I/O interfaces
• SPI
  – Supports up to 15 external devices
  – Up to 33 Mbps communication
  – Peripheral DMA
• Synchronous Serial Controller (SSC)
  – Full duplex 24-bit I2S
  – Up to 33 Mbps communication
  – Peripheral DMA

AT32UC3A0/A1 – features 1/2

• Key features
  – 128 - 512 KB Flash
  – 32 - 64 KB SRAM
  – SRAM / SDRAM controller
  – Peripheral DMA controller
  – Full Speed USB Device + OTG
  – Ethernet MAC
  – 4 USARTs
  – 2 SPI

AT32UC3A0/A1 – features 2/2

• Key features
  – 1 I2S 24-bit input
  – 1 I2S 24-bit output
  – Multiple timers and PWM
  – 16-bit Stereo bit stream DAC
  – 5V tolerant I/O
  – 100- and 144-pin package options
  – QFP and BGA packages
  – Qualified for Automotive

AT32UC3A3/A4 – features 1/2

• Key features
  – 64 - 256 KB Flash
  – 128 KB SRAM (64 KB + 2x32 KB)
  – SRAM / SDRAM controller
  – MLC NAND Flash controller
  – AES crypto engine
  – Peripheral DMA controller
  – Memory to Memory DMA

AT32UC3A3/A4 – features 2/2
• Key features
  – High Speed USB Device + OTG
  – SD / MMC / SDIO card controller
  – 4 USARTs
  – 2 SPI
  – 1 I2S 24-bit input
  – 1 I2S 24-bit output
  – 16-bit Stereo bit stream DAC
  – 100- and 144-pin package options
  – QFP and BGA packages

AT32UC3B – features 1/2

• Key features
  – 64 - 512 KB Flash
  – 16 - 96 KB SRAM
  – Peripheral DMA controller
  – Full Speed USB Device + OTG
  – 3 USARTs
  – 2 SPI

AT32UC3B – features 2/2

• Key features
  – 1 I2S 24-bit input
  – 1 I2S 24-bit output
  – Multiple timers and PWM
  – 5V tolerant I/O
  – 48- and 64-pin package options
  – QFP and QFN packages

AT32UC3C – features 1/2

• Key features
  – 64 - 512 KB Flash
  – 68 KB SRAM (2 x 32 KB + 4 KB)
  – SRAM / SDRAM controller
  – NAND flash controller
  – Peripheral DMA controller
  – Memory to Memory DMA
  – Peripheral Event System

AT32UC3C – features 2/2
• Key features
  – Single / Dual CAN interface
  – Full Speed USB Device + OTG
  – 16 ch 12-bit ADC, 1.5 MSPS dual
  – 2 ch 12-bit DAC, 1.5 MSPS
  – PWM with dead-time insertion
  – 4 USARTs
  – 2 SPI
  – 1 I2S 24-bit input
  – 1 I2S 24-bit output
  – 144-, 100- and 64-pin packages
  – QFP, QFN and BGA packages

AT32UC3D – features
• Key features
  – 64 - 256 KB Flash
  – 8 - 16 KB SRAM
  – Peripheral DMA controller
  – Full Speed USB Device
  – 2 USARTs
  – 1 SPI
  – 1 Two-Wire Interface
  – Multiple timers and PWM
  – 48-pin package
  – QFP and QFN packages

AT32UC3L – features 1/2

• Key features
  – 16 - 256 KB Flash
  – 8 - 16 KB SRAM
  – Peripheral DMA controller
  – Peripheral Event System
  – Full Speed USB Device
  – 4 USARTs

AT32UC3L – features 2/2

• Key features
  – 1 SPI
  – 2 Two-Wire Interfaces
  – 6 channels 12-bit ADC
  – 8 Analog Comparators
  – 36 PWM channels
  – 48-pin package
  – QFP and QFN packages

AT32UC3 - summary

• A0/A1 Series – for Ethernet and USB OTG applications
• A3/A4 Series – for Hi-Speed USB applications
• B Series – for battery/USB-powered applications
• C Series – for industrial control applications
• D Series – for cost-sensitive applications
• L Series – for battery-powered applications

AT32UC3C - CPU
• Features:
  – 32-bit load/store AVR32A RISC architecture
    • 15 general-purpose 32-bit registers
    • 32-bit Stack Pointer, Program Counter and Link Register reside in the register file
    • Fully orthogonal instruction set
    • Privileged and unprivileged modes enabling efficient and secure operating systems
    • Instruction set with variable instruction length
    • DSP extension with saturating arithmetic and a wide variety of multiply instructions

AT32UC3C - CPU
• Features:
  – 3-stage pipeline allowing one instruction per clock cycle for most instructions
    • Byte, halfword, word, and double word memory access
    • Multiple interrupt priority levels
  – MPU allows for operating systems with memory protection
  – FPU enables hardware-accelerated floating point calculations
  – Secure State for supporting FlashVault™ technology

AT32UC3C - Pipeline

AT32UC3C - FREQM
• FREQM – Frequency Meter:
  – Accurately measures a clock frequency
  – Selectable reference clock
  – A selectable clock can be measured
  – Ratio can be measured with 24-bit accuracy

AT32UC3C - FREQM

AT32UC3C - EBI
• EBI – External Bus Interface:
  – Optimized for application memory space support
  – Integrates two external memory controllers:
    • Static Memory Controller (SMC)
    • SDRAM Controller (SDRAMC)
  – Optimized external bus:
    • 16-bit data bus
    • Up to 24-bit address bus, up to 16 Mbytes addressable
    • Optimized pin multiplexing to reduce latencies on external memories
  – Up to 4 Chip Selects, configurable assignment

AT32UC3C0128C - EBI

SDRAM access

SDRAM vs SRAM

• 1. SRAM (Static RAM) is static (it does not need periodic refreshing), while SDRAM (Synchronous Dynamic RAM) is dynamic (its cells must be refreshed periodically)
• 2. SDRAM access is synchronized to a clock, while SRAM is accessed directly (asynchronously)
• 3. DRAM can pack several gigabits on a chip, while SRAM can only pack several tens of megabits on a chip
• 4. SRAM power consumption is stable, while SDRAM power consumption is higher due to refresh cycles
• 5. SRAM is more expensive than SDRAM: it is faster, but each bit needs more transistors and therefore more silicon area

SDRAM - basics

SDRAM - signals

• Row Address Strobe (RAS): the RAS control input is used to latch the row address and to begin a memory cycle
• RAS is required at the beginning of every operation and must remain selected for a predetermined minimum amount of time
• Column Address Strobe (CAS): CAS is used to latch the column address and to initiate the write or read operation
• CAS may also be used to trigger a CAS-before-RAS refresh cycle. This refresh cycle requires CAS to remain selected for a predetermined minimum time period
• For most memory operations, CAS must remain deselected for a predetermined minimum amount of time

SDRAM - signals
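Because RAS and CAS latch the row and column addresses one after the other, the same address pins can carry both halves of a cell address. The sketch below only illustrates that split; the 9-bit column width and the function names are assumptions made for the example, not values taken from any particular device.

```python
COLUMN_BITS = 9  # assumed column-address width; real devices differ


def split_address(addr, column_bits=COLUMN_BITS):
    """Split a flat cell address into (row, column).

    The row part is what would be latched by RAS,
    the column part what would be latched by CAS.
    """
    row = addr >> column_bits
    column = addr & ((1 << column_bits) - 1)
    return row, column


def join_address(row, column, column_bits=COLUMN_BITS):
    """Inverse of split_address: rebuild the flat address."""
    return (row << column_bits) | column


if __name__ == "__main__":
    row, column = split_address(0x1A2B3)
    print(hex(row), hex(column))
```

With these assumptions, address 0x1A2B3 splits into row 0xD1 and column 0xB3.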

• Write Enable (WE): the WE control input is used to select a read or write operation
• The operation performed is determined by the state of WE when CAS is taken active. It is important that the setup and hold timing specifications are met, with respect to CAS, to ensure that the correct operation is selected
• Output Enable (OE): during a read operation, OE is set active to ensure that data does not appear at the I/Os until required. During a write cycle, OE is ignored
• Address: the address inputs are used to select memory locations in the array. The address inputs are used to select both the desired row and column addresses

SDRAM vs SRAM

SDRAM - reading
• 1) The row address is applied to the address inputs for a specified amount of time before RAS goes active (switched from High to Low). RAS must be Low for a minimum amount of time so that the row latch circuitry can complete
• 2) The column address is applied to the address inputs and held for a specified amount of time before CAS is set active (switched from High to Low). CAS is set Low and held for the specified amount of time
• 3) WE is set High for a read operation; this must happen before the CAS transition

SDRAM - reading
• 4) CAS must be set active (switched from High to Low), thereby latching in the column address
• 5) Data appears at the data output pins after the specified time period
• 6) Before a read cycle is considered complete, both RAS and CAS control inputs must be returned to an inactive state (both RAS and CAS set from Low to High)

SDRAM - writing
• 1) The row address is applied to the address inputs for a specified amount of time before RAS is set active. RAS must be held active for a minimum amount of time so that the row latch circuitry can complete
• 2) The column address is applied to the address inputs and held for a specified amount of time before CAS is set active. CAS is set active and held for the specified amount of time
• 3) WE is set Low for a write operation; this must happen before the CAS transition

SDRAM - writing
• 4) CAS must be set Low, thereby latching in the column address
• 5) Data must be applied to the data inputs before CAS is set active
• 6) Before a write cycle is considered complete, both RAS and CAS control inputs must be returned to an inactive state (both RAS and CAS set from Low to High)

Introduction
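Looking back at the SDRAM reading and writing steps, the order of events can be captured in a toy behavioural model. This is only a sketch: the class and function names are hypothetical, all timing requirements (setup, hold, minimum pulse widths) are ignored, and only the sequencing of RAS, CAS and WE is kept.

```python
class DramModel:
    """Toy memory array driven by active-low RAS, CAS and WE."""

    def __init__(self, rows, columns):
        self.cells = [[0] * columns for _ in range(rows)]
        self.row_latch = None  # no row selected while RAS is High

    def ras_low(self, row_address):
        # Step 1: the High-to-Low edge of RAS latches the row address.
        self.row_latch = row_address

    def cas_low(self, column_address, we_low, data_in=None):
        # Steps 2-5: WE is already set when CAS falls; CAS latches the
        # column address and starts the read or the write.
        if self.row_latch is None:
            raise RuntimeError("CAS asserted before RAS")
        if we_low:  # WE Low selects a write
            self.cells[self.row_latch][column_address] = data_in
            return None
        return self.cells[self.row_latch][column_address]  # WE High: read

    def ras_cas_high(self):
        # Step 6: both strobes return to the inactive (High) state.
        self.row_latch = None


def write_cycle(mem, row, column, value):
    mem.ras_low(row)
    mem.cas_low(column, we_low=True, data_in=value)
    mem.ras_cas_high()


def read_cycle(mem, row, column):
    mem.ras_low(row)
    data = mem.cas_low(column, we_low=False)
    mem.ras_cas_high()
    return data


if __name__ == "__main__":
    mem = DramModel(rows=4, columns=8)
    write_cycle(mem, 2, 5, 0xAB)
    print(hex(read_cycle(mem, 2, 5)))
```

Keeping the row latch empty between cycles makes a CAS without a preceding RAS an explicit error, which mirrors the rule that RAS is required at the beginning of every operation.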

ARM is a 32-bit reduced instruction set computer (RISC) instruction set architecture (ISA) developed by ARM Holdings

Introduction – ARM history

• The ARM architecture was introduced in the 1980s
• It was developed by Acorn Computers; ARM originally stood for Acorn RISC Machine
• It is a "successor" of the 6502, known from the Commodore 64
• The ARM architecture gained a lot of popularity by the end of the 1990s

Introduction – ARM history
• Thanks to its simplicity and performance it found its "place" in applications like cell phones
• At present about 90% of all 32-bit RISC processors are ARM based
• ARM processors are used from low-performance control applications up to netbook computers

Introduction – ARM features

• Main features:
  – 32-bit architecture
  – Reduced instruction set (RISC)
  – Common data and instruction buses for simpler versions (von Neumann)
  – Split data and instruction buses for faster versions (Harvard)
  – Pipelining

ARM cores

Introduction – architecture types
• ARM v1 (family ARM1):
  – First version of the ARM core
  – 26-bit addressing
  – No hardware multiplier
• ARM v2 (family ARM2):
  – First commercial version
  – Added 32-bit multiply instructions
  – Added coprocessor support

Introduction – architecture types
• ARM v2a (family ARM3):
  – First use of cache memory (4 kB)
  – Up to 12 MIPS @ 25 MHz
• ARM v3 (families ARM6 and ARM7):
  – 32-bit addressing
  – Added coprocessor and cache buses
  – Added memory controller (ARM7500FE)
  – Up to 40 MIPS @ 56 MHz

Introduction – architecture types
• ARM v4 (families ARM7TDMI, ARM8, ARM9):
  – 3-stage and 5-stage pipelining
  – Thumb instruction set
  – Loop prediction
  – Memory control units: MPU or MMU
  – High performance with a relatively simple construction
  – Up to 200 MIPS @ 200 MHz (StrongARM)
  – Most popular version

Introduction – architecture types
• ARM v5 (families ARM7EJ, ARM9, ARM10):
  – 6-stage and 7-stage pipelining
  – Thumb instruction set
  – Jazelle instruction set
  – DSP instruction set
  – Multilevel cache
  – Very large throughput
  – Up to 1000 MIPS @ 1250 MHz (XScale)
  – Very popular version

Introduction – architecture types
• ARM v6 (families ARM11, Cortex-M0, Cortex-M1):
  – 8-stage and 9-stage pipelining
  – Thumb-2 instruction set
  – Jazelle instruction set
  – DSP instruction set
  – SIMD
  – Multilevel cache
  – Large throughput
  – Optimized for audio and video applications

Introduction – architecture types
• ARM v7 (Cortex family except Cortex-M0 and Cortex-M1):
  – 13-stage pipelining
  – Thumb-2 instruction set
  – Jazelle instruction set
  – DSP instruction set
  – Hardware support for divide and multiply of integer and floating-point numbers
  – MultiCore (1-4 cores)
  – SIMD (NEON) – up to 16 instructions per cycle
  – Multilevel cache
  – Huge throughput (up to 10000 MIPS)

ARM7TDMI

• Main features 1/2:
  – At present the most common version of the ARM family
  – 32-bit RISC processor with low power consumption
  – Von Neumann architecture
  – 3-stage pipelining

ARM7TDMI
• Main features 2/2:
  – Two instruction sets: 32-bit ARM and 16-bit Thumb
  – Seven operating modes
  – Operation on data:
    • 8-bit (byte)
    • 16-bit (halfword)
    • 32-bit (word)
  – TDMI: Thumb, Debug, (enhanced) Multiplier, EmbeddedICE

ARM7TDMI – instruction pipeline
Source: [1]

ARM7TDMI
Source: [1]

ARM7TDMI – operation modes
• User (usr): normal operation mode
• FIQ (fiq): fast interrupt mode, intended for data transfer (e.g. DMA-style transfers)
• IRQ (irq): interrupt service mode
• Supervisor (svc): protected mode for operating systems
• Abort (abt): error mode
• System (sys): privileged user mode
• Undefined (und): undefined instruction mode
  – All modes except usr are privileged modes used to service interrupts or exceptions, or to access protected resources

ARM7TDMI – registers

• 37 registers:
  – 31 general-purpose registers
  – 6 status registers
• The number available depends on the operation mode and processor state
• R15 is always the program counter
• R13 is usually the stack pointer (by convention)
• R14 serves as the Link Register (it receives the return address in the Branch with Link, BL, instruction)

ARM7TDMI – status registers
• The ARM7TDMI processor contains a CPSR (Current Program Status Register) and five SPSRs (Saved Program Status Registers) for exception handlers to use
• The program status registers:
  – hold information about the most recently performed ALU operation
  – control the enabling and disabling of interrupts
  – set the processor operating mode

ARM7TDMI – status registers

The ARM-state register set
Source: [1]

The Thumb-state register set
Source: [1]

ARM7TDMI – ARM instruction set
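The three roles of the program status registers listed earlier (ALU flags, interrupt masks, operating mode) can be made concrete by decoding a CPSR value. The slides do not give the bit layout; the positions and mode encodings below are the standard ARM ones and should be checked against the core documentation [3]. The function name and the dictionary keys are made up for this sketch.

```python
# Mode field encodings (CPSR bits 4..0); standard ARM values, not from the slides.
MODES = {
    0b10000: "usr",
    0b10001: "fiq",
    0b10010: "irq",
    0b10011: "svc",
    0b10111: "abt",
    0b11011: "und",
    0b11111: "sys",
}


def decode_cpsr(cpsr):
    """Return the main CPSR fields of a 32-bit value as a dictionary."""
    return {
        # Condition flags describing the most recent ALU operation
        "N": bool((cpsr >> 31) & 1),
        "Z": bool((cpsr >> 30) & 1),
        "C": bool((cpsr >> 29) & 1),
        "V": bool((cpsr >> 28) & 1),
        # Interrupt masks: a set bit disables the interrupt
        "irq_disabled": bool((cpsr >> 7) & 1),
        "fiq_disabled": bool((cpsr >> 6) & 1),
        # Processor state: set means Thumb, clear means ARM
        "thumb": bool((cpsr >> 5) & 1),
        # Operating mode
        "mode": MODES.get(cpsr & 0x1F, "reserved"),
    }


if __name__ == "__main__":
    print(decode_cpsr(0x600000D3))
```

For example, 0x600000D3 decodes as Z and C set, both interrupts disabled, ARM state, Supervisor mode.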

• Two instruction sets: full (ARM) and simplified (Thumb)
• The ARM set consists of 32-bit instructions
• ARM instructions therefore occupy a large portion of memory
• Each instruction can be executed conditionally
• The second operand can be shifted as part of a data-processing instruction
• Five addressing modes
• Each addressing mode has a few options

ARM7TDMI – addressing modes

• Mode 1: Shifter operands for data-processing instructions
• Mode 2: Load and store word or unsigned byte
• Mode 3: Load and store halfword or load signed byte
• Mode 4: Load and store multiple
• Mode 5: Load and store coprocessor
  – Each mode has different addressing types, such as immediate, register, pre-indexed, etc.

ARM7TDMI – ARM instruction set
• Instruction types:
  – Move
  – Arithmetic
  – Logical
  – Branch
  – Load
  – Store
  – Swap
  – Coprocessor
  – Software Interrupt

ARM7TDMI – ARM instruction set
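Conditional execution of ARM instructions works through a 4-bit condition field in the top bits of every 32-bit instruction word, which is compared against the N, Z, C and V flags. The sketch below evaluates that field; the condition table is the standard ARM one and is not given in the slides, and the function name is hypothetical.

```python
# Condition field values (instruction bits 31..28); standard ARM encodings.
CONDITIONS = {
    0x0: ("EQ", lambda n, z, c, v: z),
    0x1: ("NE", lambda n, z, c, v: not z),
    0x2: ("CS", lambda n, z, c, v: c),
    0x3: ("CC", lambda n, z, c, v: not c),
    0x4: ("MI", lambda n, z, c, v: n),
    0x5: ("PL", lambda n, z, c, v: not n),
    0x6: ("VS", lambda n, z, c, v: v),
    0x7: ("VC", lambda n, z, c, v: not v),
    0x8: ("HI", lambda n, z, c, v: c and not z),
    0x9: ("LS", lambda n, z, c, v: (not c) or z),
    0xA: ("GE", lambda n, z, c, v: n == v),
    0xB: ("LT", lambda n, z, c, v: n != v),
    0xC: ("GT", lambda n, z, c, v: (not z) and n == v),
    0xD: ("LE", lambda n, z, c, v: z or n != v),
    0xE: ("AL", lambda n, z, c, v: True),
}


def condition_passes(instruction, n, z, c, v):
    """Return (mnemonic, executes?) for a 32-bit ARM instruction word."""
    cond = (instruction >> 28) & 0xF
    if cond not in CONDITIONS:
        raise ValueError("condition field 0b1111 is not handled in this sketch")
    name, test = CONDITIONS[cond]
    return name, bool(test(n, z, c, v))


if __name__ == "__main__":
    # 0xE1A00001 encodes MOV r0, r1; with condition EQ it becomes 0x01A00001.
    print(condition_passes(0x01A00001, n=False, z=True, c=False, v=False))
```

An instruction whose condition fails simply has no effect, which lets short if/else sequences be coded without branches.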

Source: [1]

ARM7TDMI – Thumb instruction set

• The Thumb set consists of 16-bit instructions
• It is a subset of the ARM instruction set
• Thumb instructions occupy little memory space
• Only some Thumb instructions can be executed conditionally
• Data operations are still 32 bits wide
• In the Thumb set only R0-R7 can be used freely
• Each Thumb instruction has its "parent" in the ARM set

ARM7TDMI – Thumb instruction set
• Instruction types:
  – Move
  – Arithmetic
  – Logical
  – Shift/Rotate
  – Branch
  – Load
  – Store
  – Push/Pop
  – Software Interrupt
Source: [1]

ARM7TDMI – Virtual Memory System
• The VMSA block is used for assigning different address spaces to different applications (processes)
• A Memory Management Unit (MMU) is used for the memory assignment
• In the MMU, virtual addresses are translated to physical ones with the use of TLBs (Translation Lookaside Buffers)

ARM7TDMI – Virtual Memory System
Source: [1]

ARM7TDMI – Protected Memory System
• The PMSA block is used for assigning different address spaces to different applications (processes)
• A Memory Protection Unit (MPU) is used for the memory assignment
• The PMSA operating principle is simpler than that of VMSA
• No virtual addresses
• Certain processes have access to certain memory spaces

ARM7TDMI – Protected Memory System
Source: [1]

To be continued …

References

[1] www.atmel.com
[2] AVR32UC family documentation; www.atmel.com
[3] ARM7TDMI core documentation; www.arm.com