Haskus-Manual Release 0.1
Sylvain HENRY

Aug 03, 2017

System - Volume 1: building guide

1 haskus-system
  1.1 Introduction
  1.2 Discussing the approach
  1.3 Building systems: the automated way
  1.4 Building systems: the manual way
  1.5 Reference: system.yaml syntax
  1.6 Reference: haskus-system-build
  1.7 The Sys monad
  1.8 Device management
  1.9 Graphics Overview
  1.10 Modules Overview
  1.11 X86 architecture notes
  1.12 X86 encoding
2 haskus-binary
  2.1 Binary modules
3 haskus-utils

CHAPTER 1

haskus-system

haskus-system is a framework written in Haskell that can be used for system programming. Fundamentally it is an experiment into providing an integrated interface leveraging Haskell features (type-safety, STM, etc.) for the whole system: input, display, sound, network, etc.

Introduction

The big picture

A typical operating system can be roughly split into three layers:

• Kernel: device drivers, virtual memory management, process scheduling, etc.
• System: system services and daemons, low-level kernel interfaces, etc.
• Application: end-user applications (web browser, video player, games, etc.)

Linux kernel

haskus-system is based directly and exclusively on the Linux kernel. Hence:

• it doesn’t rely on the usual user-space kernel interfaces (e.g., libdrm, libinput, X11, Wayland) to communicate with the kernel
• it doesn’t contain low-level kernel code (device drivers, etc.)

Note, however, that programs using haskus-system are compiled with GHC: hence they still depend on the dependencies of GHC’s runtime system (RTS), such as libc. Programs are statically compiled to embed those dependencies.

haskus-system acts at the system level: it provides interfaces to the Linux kernel (hence to the hardware) in Haskell and builds on them to provide higher-level interfaces (described in Volume 2 of this documentation). You can use these interfaces to build custom systems. It is then up to you to decide whether your system has the concept of “application” or not: you may design domain-specific systems which provide a single “application”.

Discussing the approach

The haskus-system framework aims to help writing systems in Haskell. Writing a new operating system from scratch is obviously a huge task that we won’t undertake. Instead, pragmatically, we build on the Linux kernel to develop haskus-system.

The fact that it is based on the Linux kernel shouldn’t confuse you: we don’t have to let applications access the kernel directly through a UNIX-like interface! This is similar to the approach followed by Google with Android: the Linux kernel is used internally but applications have to be written in Java and they have to use the Android interfaces. The difference is that we are using Haskell.

The haskus-system framework and the systems using it are written in Haskell. We use GHC to compile Haskell code, hence we rely on GHC’s runtime system. This runtime system works on a bare-bones Linux kernel and manages memory (garbage collection), user-space threading, asynchronous I/O, etc.
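
To give a rough idea of what the runtime system provides, here is a minimal sketch that relies only on GHC's user-space threads and STM. It does not use any haskus-system interface: the names runWorkers, workers and perWorker are hypothetical and chosen for illustration, and the sketch assumes the stm package that ships with GHC.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (atomically, modifyTVar', newTVarIO, readTVar, retry)
import Control.Monad (forM_, replicateM_)

-- | Hypothetical helper (not part of haskus-system): fork `workers`
-- lightweight threads, each incrementing a shared counter `perWorker`
-- times, then wait until every increment has been performed.
runWorkers :: Int -> Int -> IO Int
runWorkers workers perWorker = do
  counter <- newTVarIO 0
  forM_ [1 .. workers] $ \_ ->
    -- forkIO creates a user-space thread scheduled by GHC's runtime system
    forkIO $ replicateM_ perWorker $
      -- each increment is an atomic transaction: no explicit lock is needed
      atomically (modifyTVar' counter (+ 1))
  -- `retry` suspends the calling thread until `counter` is modified again
  atomically $ do
    n <- readTVar counter
    if n < workers * perWorker then retry else pure n

main :: IO ()
main = do
  total <- runWorkers 1000 100
  print total -- prints 100000
```

The thousand threads forked here are scheduled by the runtime system itself and are garbage-collected like any other Haskell value once they finish, which is why forking that many is reasonable in such a sketch.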

Portability

Portability is ensured by the Linux kernel. In theory we could use our approach on any architecture supported by the Linux kernel. In practice, we also need to ensure that GHC supports the target architecture.

In addition, haskus-system requires a thin architecture-specific layer because the Linux interface is architecture-specific. Differences between architectures include: system call numbers, some structure fields (sizes and order), the instruction used to perform a system call and the way system call parameters are passed (calling convention).

The following architectures are currently supported by each level of the stack:

• haskus-system: x86-64
• GHC: x86, x86-64, PowerPC, and ARM
• Linux kernel: many architectures

Performance

Using a high-level language such as Haskell is a trade-off between performance and productivity, just as using C instead of plain assembly language is. Moreover, in both cases we expect the compiler to perform optimizations that are not obvious or that would require complicated, hard-to-maintain code if they had to be written explicitly.

GHC is the Haskell compiler we use. It is a mature compiler that is still actively developed. It performs a lot of optimizations. In particular, it performs cross-module optimizations so that well-organized modular code doesn’t incur performance costs.

Haskell code is compiled into native code for the target architecture (i.e., there is no runtime interpretation of the code). In addition, it is possible to use LLVM as a GHC backend to generate the native code.

The generated native code is linked with a runtime system provided by GHC that manages:

• Memory: garbage collection
• Threading: fast and cheap user-space threading
• Software transactional memory (STM): safe memory locking
• Asynchronous I/O: non-blocking I/O interacting with the threading system

Performance-wise, this is a crucial part of the stack we use.
It has been carefully optimized and it is tunable for specific needs. It is composed of about 40k lines of C code.

As a last resort, it is still possible to call code written in other languages from Haskell through the Foreign Function Interface (FFI) or by adding a primitive operation (primop). haskus-system uses these mechanisms to interact with the Linux kernel.

It seems to us that this approach is a good trade-off. As comparison points:

• most UNIX-like systems rely on unsafe interpreted shell scripts (init systems, etc.);
• Google’s Android (with Dalvik) used to perform runtime bytecode interpretation and then just-in-time compilation; currently (with ART) it still uses a garbage collector;
• Apple’s platforms rely on a garbage collection variant called “automatic reference counting” in the Objective-C and Swift languages (while it might be more efficient, it requires much more care from the programmers);
• JavaScript-based applications and applets (unsafe language, VM, etc.) are becoming widespread even on the desktop.

Productivity

Writing system code in a high-level language such as Haskell should be much more productive than writing it in a low-level language like C.

• Most of the boilerplate code (e.g., error management, logging) can be abstracted away.
• Thanks to the type system, many errors are caught at compile time, which is especially useful in system programming because programs are harder to debug using standard methods and tools. Moreover, it makes code much easier to maintain because the compiler checks many more things during refactoring.
• High-level code is often dense and terse. Hence we can show full working code snippets in the documentation that you can quickly copy.

Moreover, writing system code this way is much more fun and we can quickly get enjoyable results.

Durability and Evolution

Our approach should be both durable and evolutive.
It is durable because we only use mature technology: the development of both Linux and GHC started in the early 1990s and both are still very active. The only new layer in the stack is the haskus-system framework. All of these are open-source free software, ensuring long-term access to the sources.

The approach is evolutive: the Haskell language is evolving in a controlled way with GHC’s extensions (and a potential future revision of the Haskell standard); GHC as a compiler and a runtime system is constantly improving and support for new architectures could be added; Linux support for new hardware and new architectures is constantly enhanced, and specific developments could be done to add features useful for haskus-system (or for your own system on top of it).

The haskus-system framework itself is highly evolutive. First, it is new and not tied to any standard. Moreover, code refactoring in Haskell is much easier than in low-level languages such as C (thanks to strong typing), hence we can easily enhance the framework interfaces because user code can easily be adapted.

Single Code Base & Integration

In our opinion, a big advantage of our approach is to have an integrated framework whose source is in a single code base. This makes it much easier to evolve at a fast pace without having to maintain interface compatibility between its internal components. Moreover, refactoring is usually safe and relatively easy in Haskell, so we could later split it into several parts if needed.

As a comparison point, usual Linux distributions use several system services and core libraries, most of them in their own repository and independently developed: libc, dbus, udev, libdrm, libinput, Mesa/X11/Wayland, PulseAudio, etc.

It is worth noting that the issue has been identified and that an effort has been recently made