Customized OS Kernel for Data-Processing on Modern Hardware
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Uva-DARE (Digital Academic Repository)
UvA-DARE (Digital Academic Repository) On the construction of operating systems for the Microgrid many-core architecture van Tol, M.W. Publication date 2013 Link to publication Citation for published version (APA): van Tol, M. W. (2013). On the construction of operating systems for the Microgrid many-core architecture. General rights It is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), other than for strictly personal, individual use, unless the work is under an open content license (like Creative Commons). Disclaimer/Complaints regulations If you believe that digital publication of certain material infringes any of your rights or (privacy) interests, please let the Library know, stating your reasons. In case of a legitimate complaint, the Library will make the material inaccessible and/or remove it from the website. Please Ask the Library: https://uba.uva.nl/en/contact, or a letter to: Library of the University of Amsterdam, Secretariat, Singel 425, 1012 WP Amsterdam, The Netherlands. You will be contacted as soon as possible. UvA-DARE is a service provided by the library of the University of Amsterdam (https://dare.uva.nl) Download date:27 Sep 2021 General Bibliography [1] J. Aas. Understanding the linux 2.6.8.1 cpu scheduler. Technical report, Silicon Graphics Inc. (SGI), Eagan, MN, USA, February 2005. [2] D. Abts, S. Scott, and D. J. Lilja. So many states, so little time: Verifying memory coherence in the Cray X1. In Proceedings of the 17th International Symposium on Parallel and Distributed Processing, IPDPS '03, page 10 pp., Washington, DC, USA, April 2003. -
Reconfigurable Embedded Control Systems: Problems and Solutions
RECONFIGURABLE EMBEDDED CONTROL SYSTEMS: PROBLEMS AND SOLUTIONS By Dr.rer.nat.Habil. Mohamed Khalgui ⃝c Copyright by Dr.rer.nat.Habil. Mohamed Khalgui, 2012 v Martin Luther University, Germany Research Manuscript for Habilitation Diploma in Computer Science 1. Reviewer: Prof.Dr. Hans-Michael Hanisch, Martin Luther University, Germany, 2. Reviewer: Prof.Dr. Georg Frey, Saarland University, Germany, 3. Reviewer: Prof.Dr. Wolf Zimmermann, Martin Luther University, Germany, Day of the defense: Monday January 23rd 2012, Table of Contents Table of Contents vi English Abstract x German Abstract xi English Keywords xii German Keywords xiii Acknowledgements xiv Dedicate xv 1 General Introduction 1 2 Embedded Architectures: Overview on Hardware and Operating Systems 3 2.1 Embedded Hardware Components . 3 2.1.1 Microcontrollers . 3 2.1.2 Digital Signal Processors (DSP): . 4 2.1.3 System on Chip (SoC): . 5 2.1.4 Programmable Logic Controllers (PLC): . 6 2.2 Real-Time Embedded Operating Systems (RTOS) . 8 2.2.1 QNX . 9 2.2.2 RTLinux . 9 2.2.3 VxWorks . 9 2.2.4 Windows CE . 10 2.3 Known Embedded Software Solutions . 11 2.3.1 Simple Control Loop . 12 2.3.2 Interrupt Controlled System . 12 2.3.3 Cooperative Multitasking . 12 2.3.4 Preemptive Multitasking or Multi-Threading . 12 2.3.5 Microkernels . 13 2.3.6 Monolithic Kernels . 13 2.3.7 Additional Software Components: . 13 2.4 Conclusion . 14 3 Embedded Systems: Overview on Software Components 15 3.1 Basic Concepts of Components . 15 3.2 Architecture Description Languages . 17 3.2.1 Acme Language . -
High Performance Computing Systems
High Performance Computing Systems Multikernels Doug Shook Multikernels Two predominant approaches to OS: – Full weight kernel – Lightweight kernel Why not both? – How does implementation affect usage and performance? Gerolfi, et. al. “A Multi-Kernel Survey for High- Performance Computing,” 2016 2 FusedOS Assumes heterogeneous architecture – Linux on full cores – LWK requests resources from linux to run programs Uses CNK as its LWK 3 IHK/McKernel Uses an Interface for Heterogeneous Kernels – Resource allocation – Communication McKernel is the LWK – Only operable with IHK Uses proxy processes 4 mOS Embeds LWK into the Linux kernel – LWK is visible to Linux just like any other process Resource allocation is performed by sysadmin/user 5 FFMK Uses the L4 microkernel – What is a microkernel? Also uses a para-virtualized Linux instance – What is paravirtualization? 6 Hobbes Pisces Node Manager Kitten LWK Palacios Virtual Machine Monitor 7 Sysadmin Criteria Is the LWK standalone? Which kernel is booted by the BIOS? How and when are nodes partitioned? 8 Application Criteria What is the level of POSIX support in the LWK? What is the pseudo file system support? How does an application access Linux functionality? What is the system call overhead? Can LWK and Linux share memory? Can a single process span Linux and the LWK? Does the LWK support NUMA? 9 Linux Criteria Are LWK processes visible to standard tools like ps and top? Are modifications to the Linux kernel necessary? Do Linux kernel changes propagate to the LWK? -
Hardware Configuration with Dynamically-Queried Formal Models
Master’s Thesis Nr. 180 Systems Group, Department of Computer Science, ETH Zurich Hardware Configuration With Dynamically-Queried Formal Models by Daniel Schwyn Supervised by Reto Acherman Dr. David Cock Prof. Dr. Timothy Roscoe April 2017 – October 2017 Abstract Hardware is getting increasingly complex and heterogeneous. With different compo- nents having different views of the system, the traditional assumption of unique phys- ical addresses has become an illusion. To adapt to such hardware, an operating system (OS) needs to understand the complex address translation chains and be able to handle the presence of multiple address spaces. This thesis takes a recently proposed model that formally captures these aspects and applies it to hardware configuration in the Barrelfish OS. To this end, I present Sockeye, a domain specific language that uses the model to de- scribe hardware. I then show, that code relying on knowledge about the address spaces in a system can be statically generated from these specifications. Furthermore, the model is successfully applied to device management, showing that it can also be used to configure hardware at runtime. The implementation presented here does not rely on any platform specific code and it reduced the amount of such code in Barrelfish’s device manager by over 30%. Applying the model to further hardware configuration tasks is expected to have similar effects. i Acknowledgements First of all, I want to thank my advisers Timothy Roscoe, David Cock and Reto Acher- mann for their guidance and support. Their feedback and critical questions were a valuable contribution to this thesis. I’d also like to thank the rest of the Barrelfish team and other members of the Systems Group at ETH for the interesting discussions during meetings and coffee breaks. -
Message Passing for Programming Languages and Operating Systems
Master’s Thesis Nr. 141 Systems Group, Department of Computer Science, ETH Zurich Message Passing for Programming Languages and Operating Systems by Martynas Pumputis Supervised by Prof. Dr. Timothy Roscoe, Dr. Antonios Kornilios Kourtis, Dr. David Cock October 7, 2015 Abstract Message passing as a mean of communication has been gaining popu- larity within domains of concurrent programming languages and oper- ating systems. In this thesis, we discuss how message passing languages can be ap- plied in the context of operating systems which are heavily based on this form of communication. In particular, we port the Go program- ming language to the Barrelfish OS and integrate the Go communi- cation channels with the messaging infrastructure of Barrelfish. We show that the outcome of the porting and the integration allows us to implement OS services that can take advantage of the easy-to-use concurrency model of Go. Our evaluation based on LevelDB benchmarks shows comparable per- formance to the Linux port. Meanwhile, the overhead of the messag- ing integration causes the poor performance when compared to the native messaging of Barrelfish, but exposes an easier to use interface, as shown by the example code. i Acknowledgments First of all, I would like to thank Timothy Roscoe, Antonios Kornilios Kourtis and David Cock for giving the opportunity to work on the Barrelfish OS project, their supervision, inspirational thoughts and critique. Next, I would like to thank the Barrelfish team for the discussions and the help. In addition, I would like to thank Sebastian Wicki for the conversations we had during the entire period of my Master’s studies. -
W4118: Multikernel
W4118: multikernel Instructor: Junfeng Yang References: Modern Operating Systems (3rd edition), Operating Systems Concepts (8th edition), previous W4118, and OS at MIT, Stanford, and UWisc Motivation: sharing is expensive Difficult to parallelize single-address space kernels with shared data structures . Locks/atomic instructions: limit scalability . Must make every shared data structure scalable • Partition data • Lock-free data structures • … . Tremendous amount of engineering Root cause . Expensive to move cache lines . Congestion of interconnect 1 Multikernel: explicit sharing via messages Multicore chip == distributed system! . No shared data structures! . Send messages to core to access its data Advantages . Scalable . Good match for heterogeneous cores . Good match if future chips don’t provide cache- coherent shared memory 2 Challenge: global state OS must manage global state . E.g., page table of a process Solution . Replicate global state . Read: read local copy low latency . Update: update local copy + distributed protocol to update remote copies • Do so asynchronously (“split phase”) 3 Barrelfish overview Figure 5 in paper CPU driver . kernel-mode part per core . Inter-Processor interrupt (IPI) Monitors . OS abstractions . User-level RPC (URPC) 4 IPC through shared memory #define CACHELINE (64) struct box{ char buf[CACHELINE-1]; // message contents char flag; // high bit == 0 means sender owns it }; __attribute__ ((aligned (CACHELINE))) struct box m __attribute__ ((aligned (CACHELINE))); send() recv() { { // set up message while(!(m.flag & 0x80)) memcpy(m.buf, …); ; m.flag |= 0x80; m.buf … //process message } } 5 Case study: TLB shootdown When is it necessary? Windows & Linux: send IPIs Barrelfish: sends messages to involved monitors . 1 broadcast message N-1 invalidates, N-1 fetches . -
Logca: a High-Level Performance Model for Hardware Accelerators Muhammad Shoaib Bin Altaf ∗ David A
LogCA: A High-Level Performance Model for Hardware Accelerators Muhammad Shoaib Bin Altaf ∗ David A. Wood AMD Research Computer Sciences Department Advanced Micro Devices, Inc. University of Wisconsin-Madison [email protected] [email protected] ABSTRACT 10 With the end of Dennard scaling, architects have increasingly turned Unaccelerated Accelerated 1 to special-purpose hardware accelerators to improve the performance and energy efficiency for some applications. Unfortunately, accel- 0.1 erators don’t always live up to their expectations and may under- perform in some situations. Understanding the factors which effect Time (ms) 0.01 Break-even point the performance of an accelerator is crucial for both architects and 0.001 programmers early in the design stage. Detailed models can be 16 64 highly accurate, but often require low-level details which are not 256 1K 4K 16K 64K available until late in the design cycle. In contrast, simple analytical Offloaded Data (Bytes) models can provide useful insights by abstracting away low-level system details. (a) Execution time on UltraSPARC T2. In this paper, we propose LogCA—a high-level performance 100 model for hardware accelerators. LogCA helps both programmers SPARC T4 UltraSPARC T2 GPU and architects identify performance bounds and design bottlenecks 10 early in the design cycle, and provide insight into which optimiza- tions may alleviate these bottlenecks. We validate our model across Speedup 1 Break-even point a variety of kernels, ranging from sub-linear to super-linear com- plexities on both on-chip and off-chip accelerators. We also describe the utility of LogCA using two retrospective case studies. -
Foreign Library Interface by Daniel Adler Dia Applications That Can Run on a Multitude of Plat- Forms
30 CONTRIBUTED RESEARCH ARTICLES Foreign Library Interface by Daniel Adler dia applications that can run on a multitude of plat- forms. Abstract We present an improved Foreign Function Interface (FFI) for R to call arbitary na- tive functions without the need for C wrapper Foreign function interfaces code. Further we discuss a dynamic linkage framework for binding standard C libraries to FFIs provide the backbone of a language to inter- R across platforms using a universal type infor- face with foreign code. Depending on the design of mation format. The package rdyncall comprises this service, it can largely unburden developers from the framework and an initial repository of cross- writing additional wrapper code. In this section, we platform bindings for standard libraries such as compare the built-in R FFI with that provided by (legacy and modern) OpenGL, the family of SDL rdyncall. We use a simple example that sketches the libraries and Expat. The package enables system- different work flow paths for making an R binding to level programming using the R language; sam- a function from a foreign C library. ple applications are given in the article. We out- line the underlying automation tool-chain that extracts cross-platform bindings from C headers, FFI of base R making the repository extendable and open for Suppose that we wish to invoke the C function sqrt library developers. of the Standard C Math library. The function is de- clared as follows in C: Introduction double sqrt(double x); We present an improved Foreign Function Interface The .C function from the base R FFI offers a call (FFI) for R that significantly reduces the amount of gate to C code with very strict conversion rules, and C wrapper code needed to interface with C. -
Multiprocessor Operating Systems CS 6410: Advanced Systems
Introduction Multikernel Tornado Conclusion Discussion Outlook References Multiprocessor Operating Systems CS 6410: Advanced Systems Kai Mast Department of Computer Science Cornell University September 4, 2014 Kai Mast — Multiprocessor Operating Systems 1/47 Introduction Multikernel Tornado Conclusion Discussion Outlook References Let us recall Multiprocessor vs. Multicore Figure: Multiprocessor [10] Figure: Multicore [10] Kai Mast — Multiprocessor Operating Systems 2/47 Introduction Multikernel Tornado Conclusion Discussion Outlook References Let us recall Message Passing vs. Shared Memory Shared Memory Threads/Processes access the same memory region Communication via changes in variables Often easier to implement Message Passing Threads/Processes don’t have shared memory Communication via messages/events Easier to distribute between different processors More robust than shared memory Kai Mast — Multiprocessor Operating Systems 3/47 Introduction Multikernel Tornado Conclusion Discussion Outlook References Let us recall Miscellaneous Cache Coherence Inter-Process Communication Remote-Procedure Call Preemptive vs. cooperative Multitasking Non-uniform memory access (NUMA) Kai Mast — Multiprocessor Operating Systems 4/47 Introduction Multikernel Tornado Conclusion Discussion Outlook References Current Systems are Diverse Different Architectures (x86, ARM, ...) Different Scales (Desktop, Server, Embedded, Mobile ...) Different Processors (GPU, CPU, ASIC ...) Multiple Cores and/or Multiple Processors Multiple Operating Systems on a System (Firmware, -
Hardware, Firmware, Devices Disks Kernel, Boot, Swap Files, Volumes
hardware, kernel, boot, firmware, disks files, volumes swap devices software, security, patching, networking references backup tracing, logging TTAASSKK\ OOSS AAIIXX AA//UUXX DDGG//UUXX FFrreeeeBBSSDD HHPP--UUXX IIRRIIXX Derived from By IBM, with Apple 1988- 4.4BSD-Lite input from 1995. Based and 386BSD. System V, BSD, etc. on AT&T Data General This table SysV.2.2 with was aquired does not Hewlett- SGI. SVR4- OS notes Runs mainly extensions by EMC in include Packard based on IBM from V.3, 1999. external RS/6000 and V.4, and BSD packages related 4.2 and 4.3 from hardware. /usr/ports. /usr/sysadm ssmmiitt ssaamm /bin/sysmgr (6.3+) ssmmiittttyy ttoooollcchheesstt administrativ /usr/Cadmin/ wwssmm FFiinnddeerr ssyyssaaddmm ssyyssiinnssttaallll ssmmhh (11.31+) e GUI bin/* /usr/sysadm/ useradd (5+) FFiinnddeerr uusseerraadddd aadddduusseerr uusseerraadddd privbin/ addUserAcco userdell (5+) //eettcc//aadddduusseerr uusseerrddeell cchhppaassss uusseerrddeell unt usermod edit rrmmuusseerr uusseerrmmoodd managing (5+) /etc/passwd users llssuusseerr ppww ggeettpprrppww ppaassssmmggmmtt mmkkuusseerr vviippww mmooddpprrppww /usr/Cadmin/ cchhuusseerr ppwwggeett bin/cpeople rmuser usrck TASK \ OS AAIIXX AA//UUXX DDGG//UUXX FFrreeeeBBSSDD HHPP--UUXX IIRRIIXX pprrttccoonnff uunnaammee iioossccaann hhiinnvv dmesg (if (if llssccffgg ssyyssccttll--aa you're lucky) llssaattttrr ddmmeessgg aaddbb ssyyssiinnffoo--vvvv catcat lsdev /var/run/dm model esg.boot stm (from the llssppaatthh ppcciiccoonnff--ll SupportPlus CDROM) list hardware dg_sysreport - bdf (like -
Design and Implementation of the Heterogeneous Multikernel Operating System
223 Design and Implementation of the Heterogeneous Multikernel Operating System Yauhen KLIMIANKOU Department of Computer Systems and Networks, Belarusian State University of Informatics and Radioelectronics, Belarus Abstract. The design of the computer system was significantly changed due to the emergence and popularization of the multicore processors. Moving to the advanced multicore processors, moving to the heterogeneous computer systems and increasing of the integrity level between computer system components are the main trends of the computer systems development. Significant changes in the computer systems design make reasonable the attempt of reviewing the operating system design to make it optimal for the new hardware platform. The proposed operation system design assume moving from monolithic centralized operating system to the decentralized network of the distributed independent nodes, each of which will play the role of the processor driver and threads container. The proposed design provide the numbers of the benefits against ordinal operating systems: dynamics in space and time, improved level of reliability and flexibility, support of the heterogeneous computer systems. Keywords. Multiprocessor computer system, heterogeneous computer system, real-time system Introduction Computer systems design is changing much faster than the operating systems design. The internal architecture of the modern computer resembles a distributed network system consisting from the mix of processor cores, caches, internal communications, I/O devices and expansion cards. The modern computer is similar to the early parallel computer systems or multiprocessor systems of the last century. Multi-core computer systems, which are essentially the same multi-processor systems localized on the processor die, occupy an increasingly strong position in most segments of the computer market. -
Basics of UNIX
Basics of UNIX August 23, 2012 By UNIX, I mean any UNIX-like operating system, including Linux and Mac OS X. On the Mac you can access a UNIX terminal window with the Terminal application (under Applica- tions/Utilities). Most modern scientific computing is done on UNIX-based machines, often by remotely logging in to a UNIX-based server. 1 Connecting to a UNIX machine from {UNIX, Mac, Windows} See the file on bspace on connecting remotely to SCF. In addition, this SCF help page has infor- mation on logging in to remote machines via ssh without having to type your password every time. This can save a lot of time. 2 Getting help from SCF More generally, the department computing FAQs is the place to go for answers to questions about SCF. For questions not answered there, the SCF requests: “please report any problems regarding equipment or system software to the SCF staff by sending mail to ’trouble’ or by reporting the prob- lem directly to room 498/499. For information/questions on the use of application packages (e.g., R, SAS, Matlab), programming languages and libraries send mail to ’consult’. Questions/problems regarding accounts should be sent to ’manager’.” Note that for the purpose of this class, questions about application packages, languages, li- braries, etc. can be directed to me. 1 3 Files and directories 1. Files are stored in directories (aka folders) that are in a (inverted) directory tree, with “/” as the root of the tree 2. Where am I? > pwd 3. What’s in a directory? > ls > ls -a > ls -al 4.