CS4803DGC Design Game Console
Spring 2012
Prof. Hyesoon Kim

• Nintendo DS introduction
• Introduction of Nintendo DS programming
• ARM architecture

• Friday:
  – ARM architecture/assembly code
• Next Wednesday:
  – ARM assembly coding (lab-day): introduction task #2
• Next Friday:
  – Assignment #1
    • 1st part of programming platform

• Programming with Nintendo DS
• http://www.cc.gatech.edu/~hyesoon/spr12/intro1.html
• Installation Guide and Hello world

http://www.cosc.brocku.ca/Offerings/3P92/seminars/nintendo_ds_slideshow.pdf

• Dual TFT LCD screens
• CPUs
  – ARM7TDMI (33 MHz)
  – ARM946E-S (67 MHz)
• Main memory: 4 MB RAM
  – VRAM: 656 KB
• 2D graphics
  – Up to 4 backgrounds
• 3D graphics

• Both CPUs can run code at the same time.
• The ARM7 is the only CPU that controls the touch screen.
  – Interrupt based

• DevKit Pro is a collection of toolchains for homebrew application developers, covering various architectures
• DevKitARM: the toolchain that builds ARM binaries
• Not the official development toolchain
  – Much simpler and naïve

• libnds
  – Started with header files for definitions
  – Extended to include other data structures and simple APIs
• *.nds
  – A binary for the Nintendo DS, with a separate region for the ARM7 and for the ARM9

http://patater.com/files/projects/manual/manual.html#id2612503

    #include <nds.h>
    #include <stdio.h>

    // Not shown on the slide: frame counter and handler, as in the
    // devkitPro hello world example this code is based on.
    volatile int frame = 0;

    void Vblank(void) { frame++; }     // runs on every vertical-blank interrupt

    int main(void) {
        consoleDemoInit();             // initialize the console
        irqSet(IRQ_VBLANK, Vblank);    // when the IRQ_VBLANK interrupt occurs, execute Vblank()
        iprintf(" Hello DS dev'rs\n");
        while (1) {
            iprintf("\x1b[10;0HFrame = %d", frame);   // print the current frame number
            swiWaitForVBlank();        // pause the loop until the next IRQ_VBLANK,
                                       // so we print only once per frame
        }
        return 0;
    }

• Instead of pure assembly coding, we will use inline assembly programming
• Inline assembly is available not only for ARM but also for x86, etc.
• Good places to look:
  http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
  http://www.ethernut.de/en/documents/arm-inline-asm.html

NOP

    asm(
        "mov r0, r0\n\t"
        "mov r0, r0\n\t"
        "mov r0, r0\n\t"
        "mov r0, r0"
    );

Use delimiters (linefeed and tab) to separate the assembly lines
http://www.ethernut.de/en/documents/arm-inline-asm.html

• ARM is short for Advanced RISC Machines Ltd.
  – Founded 1990, owned by Acorn, Apple and VLSI
• The architecture originated at the computer manufacturer Acorn before ARM became a separate company
• ARM cores are among the most widely licensed processor designs
• Used especially in portable devices due to low power consumption and reasonable performance (MIPS/watt)
• ARM does not fabricate silicon itself

http://tisu.it.jyu.fi/embedded/TIE345/luentokalvot/Embedded_3_ARM.pdf

• 32-bit wide instructions (16-bit Thumb compressed format)
• Load-store instruction set architecture
• 3-address data processing instructions
• Conditional execution of every instruction
• Powerful load and store multiple register instructions
• A general shift operation and a general ALU operation in a single instruction that executes in a single clock cycle
• Open instruction set extension through the coprocessor instruction set, including adding new registers and data types to the programmer’s model
• Compressed 16-bit Thumb architecture

Steve Furber, ARM system-on-chip architecture 2nd edition

• Data processing (ALU) operations write results only into registers
• Memory operations only copy values (from memory to a register, or from a register to memory)
• ARM does not support memory-to-memory operations
• ARM instructions fall into three categories
  – 1. Data processing instructions
  – 2. Data transfer instructions
    • memory-to/from-registers, exchange-memory-register (system only)
  – 3. 
Control flow instructions
    • Branch instructions, branch and link register (saving return address), trap instructions (supervisor calls)

Steve Furber, ARM system-on-chip architecture 2nd edition

[Figure: current visible registers in each mode. Entries for modes other than User are banked registers that replace the user-mode ones.]

  Mode     Registers
  User     r0-r12, r13 (sp), r14 (lr), r15 (pc), cpsr
  FIQ      banked r8-r12, r13 (sp), r14 (lr); spsr
  IRQ      banked r13 (sp), r14 (lr); spsr
  SVC      banked r13 (sp), r14 (lr); spsr
  Undef    banked r13 (sp), r14 (lr); spsr
  Abort    banked r13 (sp), r14 (lr); spsr

CPSR layout:

  Bits 31-28   Bits 27-8   Bits 7-6   Bit 5   Bits 4-0
  N Z C V      unused      I F        T       mode

• N: negative (result of the last ALU operation)
• Z: zero (result of the last ALU operation)
• C: carry (from the last ALU operation or from the shifter)
• V: overflow

Steve Furber, ARM system-on-chip architecture 2nd edition

  CPSR[4:0]  Mode    Use                                          Registers
  10000      User    Normal user code                             user
  10001      FIQ     Processing fast interrupts                   _fiq
  10010      IRQ     Processing standard interrupts               _irq
  10011      SVC     Processing software interrupts (SWIs)        _svc
  10111      Abort   Processing memory faults                     _abt
  11011      Undef   Handling undefined instruction traps         _und
  11111      System  Running privileged operating system tasks    user

Software interrupt: supervisor calls

Steve Furber, ARM system-on-chip architecture 2nd edition

• A linear array of byte addresses
• Data formats: 8-bit bytes, 16-bit half-words, 32-bit words
• Aligned address accesses
• Little endian

[Figure: layout of a 32-bit word from bit 31 down to bit 0, with byte 0 in the least significant bits and byte 1 above it]

Steve Furber, ARM system-on-chip architecture 2nd edition

• Fetch/Decode/Execute
• Allows multi-cycle execution
• Register file: two read ports, one write port
  – Additional register read/write ports for r15 (program counter)

Steve Furber, ARM system-on-chip architecture 2nd edition

• Fetch/Decode/Execute/Mem/Write-back
• Introduces a forwarding path

Steve Furber, ARM system-on-chip architecture 2nd edition

• 2-phase non-overlapping clock scheme

Steve Furber, ARM system-on-chip 
architecture 2nd edition

• SPSR (Saved Program Status Register)

Steve Furber, ARM system-on-chip architecture 2nd edition

• Thumb instructions are 16 bits long
• Similarities with the ARM ISA
  – The load-store architecture with data processing, data transfer, and control-flow instructions
  – Supports byte, half-word, and word (aligned) accesses
  – A 32-bit unsegmented memory
• Differences
  – Most Thumb instructions are executed unconditionally
    • All ARM instructions are executed conditionally
  – Many Thumb data processing instructions use a 2-address format
  – Thumb instruction formats are less regular than those of the ARM ISA

Steve Furber, ARM system-on-chip architecture 2nd edition

• ARM7: 3-stage pipeline, 16 32-bit registers, 32-bit instruction set
• TDMI
  – Thumb instruction set
  – Debug interface
  – Multiplier (hardware)
  – Interrupt (fast interrupt)
  – The most commonly used one

• 32/16-bit RISC
• 32-bit ARM instruction set
• 16-bit Thumb instruction set
• 3-stage pipeline
• Very small die size and low power
• Unified bus interface (the 32-bit data bus carries both instructions and data)

[Figure: 1st phase, 2nd phase]

The ARM9 Family - High Performance Microprocessors for Embedded Applications

• Instruction compression to save I-cache/memory accesses
• Uses mainly 8 registers (the low registers r0-r7)
• 3 operands reduced to 2 operands
• Instructions are compiled to either native ARM code or Thumb code
  – To utilize the full 16-bit opcode
  – The current program status register (CPSR) selects Thumb or native ARM instructions (T bit)

• All instructions are conditional
• BX: branch and exchange (switches to or from Thumb)
• Link register (subroutine link register)
  – R14 receives the return address when a Branch with Link (BL or BLX) instruction is executed

• 5-stage pipeline
• I-cache and D-cache
• Floating point support with the optional VFP9-S coprocessor
• Enhanced 16 x 32-bit multiplier capable of single-cycle MAC operations
• The ARM946E-S processor supports ARM's real-time trace technology
• ARM7 3-stage -> ARM9 5-stage
  – Increases the clock frequency

The ARM9 Family - High Performance 
Microprocessors for Embedded Applications

• ARM7: Thumb instruction decode takes the first ½ phase of the decode stage
• ARM9: Parallel decoding
• ARM7: ALU (arithmetic and logic units) is active all the time
• ARM9: The two units are partitioned to save power
• ARM9: Forwarding path

The ARM9 Family - High Performance Microprocessors for Embedded Applications

• Thumb 2 ISA
• ARM architecture version 7
• A profile: high-performance open application platforms
• R profile: real-time
• M profile: microcontroller (deeply embedded)

http://www.arm.com/images/ARM11MPCORE_chip_Big.jpg

• Load-store architecture has separate instruction sets to handle memory operations (True, False)
• Thumb ISA is a 32-bit ISA (True, False)
• What registers are used to store the program counter and link register?
• Name the pipeline stages in ARM7 and ARM9.

• ARM assembly code
  – Up: OR operation
    Down: AND operation
    Start: reset to default values
    A: Exclusive OR operation
    B: AND NOT (BIC) operation
    Left: left shift by #1
    Right: right shift by #1
    No need to use interrupts; use a polling method
  – Implement at least 2 of these features and submit the code to T-square.

• Some instructions clobber some hardware registers.
• We have to list those registers in the clobber list.
• Input/output operands do not have to be listed there.
• The clobber list is mostly for side effects, which have to be treated very carefully, such as "cc" (the condition codes).

• Button, touch screen, microphone
• Libnds key definition

  KEY_A        1 << 0    A Button
  KEY_B        1 << 1    B Button
  KEY_SELECT   1 << 2    Select Button
  KEY_START    1 << 3    Start Button
  KEY_RIGHT    1 << 4    Right D-pad
  KEY_LEFT     1 << 5    Left D-pad
  KEY_UP       1 << 6    Up D-pad
  KEY_DOWN     1 << 7    Down D-pad
  KEY_R        1 << 8    R Button
  KEY_L        1 << 9    L Button
  KEY_X        1 << 10   X Button
  KEY_Y        1 << 11   Y Button
  KEY_TOUCH    1 << 12   Pen touching screen (no coordinates)
  KEY_LID      1 << 13   Lid shutting (useful for sleeping)

0x4000130

• Instead of pure assembly coding, we will use inline assembly programming
• Inline assembly is available not only for ARM but also for x86, etc.
• Good places to look:
  http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3
  http://www.ethernut.de/en/documents/arm-inline-asm.html