Blackfin 533/537: an Investigation Into a High Performance L4 Microkernel Without Virtual Memory

THE UNIVERSITY OF NEW SOUTH WALES SCHOOL OF COMPUTER SCIENCE AND ENGINEERING Optimising L4 on Blackfin 533/537: an investigation into a high performance L4 microkernel without virtual memory Clarence Dang (3101378) Bachelor of Software Engineering Program Submission Date: 2006-10-31 Supervisor: Kevin Elphinstone Assessor: Gernot Heiser Abstract The L4 microkernel is used as the basis for several operating systems but was built on the assumption of virtual memory. This thesis examines general design issues for constructing a high performance port of L4 without virtual memory but with memory protection. It also aims to provide a concrete implementation by porting L4 to the Blackfin processor. The results of our research were found to be general, and not just Blackfin-specific. Therefore, we enable L4 to support an additional category of processors – those without virtual memory. Our L4 implementation on Blackfin verifies the validity of the design. While it outperforms ucLinux, at least on context switching time, there is still much work to be done before it is deployable. 2 Acknowledgements Kevin Elphinstone supervised this thesis and the thesis prior to this one. Stefan Petters and Gernot Heiser also supported me during the difficult task of changing to this thesis. Gernot Heiser gave me the summer project of which this project is based upon. He sparked my interest in kernel hacking. Alex Webster, Ben Leslie, Matthew Warton and Carl van Schaik, helped answer my torrent of questions about L4 kernel implementation. Aiden Williams lent me two STAMP boards and his expertise. 3 Table of Contents 1 Introduction.............................................................................................................................11 1.1 Goals.................................................................................................................................12 1.2 Report Structure................................................................................................................13 2 The Blackfin Architecture......................................................................................................15 2.1 Basic Operations...............................................................................................................17 2.1.1 How they were measured..........................................................................................18 2.1.2 Analysis of costs........................................................................................................18 2.2 Interrupts, Exceptions and Traps.......................................................................................20 2.3 Memory.............................................................................................................................21 2.3.1 Memory layout..........................................................................................................21 2.3.2 Memory caching........................................................................................................21 2.3.3 Memory protection....................................................................................................28 2.3.4 Conclusions about memory and Blackfin..................................................................30 2.4 Instruction Pipeline...........................................................................................................31 2.4.1 Pipelining..................................................................................................................31 2.4.2 Blackfin.....................................................................................................................31 2.4.3 Conclusions on instruction pipelining.......................................................................32 2.5 Possible Blackfin Security Bug.........................................................................................33 2.5.1 Zero overhead loops..................................................................................................33 2.5.2 Exploit code...............................................................................................................34 2.5.3 Analysis of exploit code............................................................................................35 2.5.4 Blackfin bug conclusions..........................................................................................35 2.6 Comparison to ARM.........................................................................................................37 2.6.1 ARM1156T2-S..........................................................................................................37 2.6.2 ARM7TDMI..............................................................................................................38 2.6.3 ARM comparison conclusion....................................................................................38 2.7 Blackfin Architecture Summary........................................................................................39 3 L4 Background........................................................................................................................40 3.1 The Microkernel Approach...............................................................................................41 3.1.1 Abstractions...............................................................................................................41 3.1.2 The microkernel advantage.......................................................................................42 3.2 Versions of L4...................................................................................................................43 3.2.1 NICTA L4-embedded................................................................................................43 3.2.2 Ports...........................................................................................................................43 3.3 Recent Developments........................................................................................................44 3.3.1 N2..............................................................................................................................44 3.3.2 Single Stack Kernel...................................................................................................44 3.3.3 Physical TCB Arrays.................................................................................................44 3.4 Summary...........................................................................................................................46 4 An Initial Blackfin Port..........................................................................................................47 4.1 Pingpong Microbenchmark...............................................................................................48 4.2 Kernel Entry and Exit........................................................................................................50 4.2.1 Cache performance of the whole path.......................................................................50 4 4.2.2 Trapframe cost...........................................................................................................51 4.2.3 Performance of the C++............................................................................................52 4.2.4 Conclusions about the kernel entry and exit..............................................................54 4.3 System Calls......................................................................................................................55 4.3.1 System call mechanism.............................................................................................55 4.3.2 System call convention..............................................................................................55 4.3.3 Nested exceptions......................................................................................................55 4.3.4 System call summary................................................................................................56 4.4 IPC....................................................................................................................................57 4.4.1 Before the thread switch............................................................................................57 4.4.2 After the thread switch (for Inter-AS IPC only)........................................................57 4.4.3 Analysis.....................................................................................................................58 4.4.4 Ways to speed up IPC...............................................................................................58 4.5 Memory.............................................................................................................................60 4.5.1 No instruction protection...........................................................................................60 4.5.2 Data protection..........................................................................................................60 4.5.3 CPLB replacement policy.........................................................................................64 4.5.4 Page table..................................................................................................................65 4.5.5 Initial memory layout................................................................................................66 4.5.6 New memory layout..................................................................................................69

Load more