Network Buffer Allocation in the FreeBSD Operating System [May 2004]

Bosko Milekic <[email protected]>

Abstract

This paper outlines the current structure of network data buffers in FreeBSD and explains their allocator's initial implementation. The current common usage patterns of network data buffers are then examined along with usage statistics for a few of the allocator's key supporting API routines. Finally, the improvement of the allocation framework to better support Symmetric-Multi-Processor (SMP) systems in FreeBSD 5.x is outlined and an argument is made to extend the general purpose allocator to support some of the specifics of network data buffer allocations.

1 Introduction

The BSD family of operating systems is in part reputed for the quality of its network subsystem implementation. The FreeBSD operating system belongs to the broader BSD family and thus contains an implementation of network facilities built on top of a set of data structures still common to the majority of BSD-derived systems. As the design goals for more recent versions of FreeBSD change to better support SMP hardware, so too must adapt the allocator used to provide network buffers. In fact, the general-purpose kernel memory allocator has been adapted as well and so the question of whether to continue to provide network data buffers from a specialized or general-purpose memory allocator naturally arises.

2 Data Structures

FreeBSD's network data revolves around a data structure called an Mbuf. The allocated Mbuf data buffer is fixed in size and so additional larger buffers are also defined to support larger packet sizes. The Mbuf and accompanying data structures have been defined with the idea that the usage patterns of network buffers substantially differ from those of other general-purpose buffers [1]. Notably, network protocols all require the ability to construct records widely varying in size and possibly throughout a longer period of time.

Not only can records of data be constructed on the fly (with data appended or prepended to existing records), but there also exist well-defined consumer-producer relationships between network-bound threads executing in parallel in the kernel. For example, a local thread performing network inter-process-communication (IPC) will tend to allocate many buffers, fill them with data, and enqueue them. A second local thread performing a read may end up receiving the record, consuming it (typically copying out the contents to user space), and promptly freeing the buffers. A similar relationship will arise where certain threads performing receive operations (consuming) largely just read and free buffers and other threads performing transmit operations (producing) largely allocate and write buffers.

The Mbuf and supporting data structures allow for both the prepending and appending of record data as well as data-locality for most packets destined for DMA in the network device driver (the goal is for entire packet frames from within records to be stored in single contiguous buffers). Further justification for the fixed-size buffer design is provided by the results in Section 4.

2.1 Mbufs

In FreeBSD, an Mbuf is currently 256 Bytes in size. The structure itself contains a number of general-purpose fields and an optional internal data buffer region. The general-purpose fields vary depending on the type of Mbuf being allocated.

Every Mbuf contains an m_hdr substructure which includes a pointer to the next Mbuf, if any, in an Mbuf chain. The m_hdr also includes a pointer to the first Mbuf in the next packet or record's Mbuf chain, a reference to the data stored or described by the Mbuf, the length of the stored data, as well as flags and type fields.

In addition to the m_hdr, a standard Mbuf contains a data region which is (256 Bytes - sizeof(struct m_hdr)) long. The structure differs for a packet-header type Mbuf which is typically the first Mbuf in a packet Mbuf chain. Therefore, the packet-header Mbuf usually contains additional generic header data fields supported by the pkthdr substructure. The pkthdr substructure contains a reference to the underlying receiving network interface, the total packet or record length, a pointer to the packet header, a couple of fields used for checksumming, and a header pointer to a list of Mbuf meta-data link structures called m_tags. The additional overhead due to the pkthdr substructure implies that the internal data region of the packet-header type Mbuf is (256 Bytes - sizeof(struct m_hdr) - sizeof(struct pkthdr)) long.

The polymorphic nature of the fixed-size Mbuf implies simpler allocation, regardless of type, but not necessarily simpler construction. Notably, packet-header Mbufs require special initialization at allocation time that other ordinary Mbufs do not, and so care must be taken at allocation time to ensure correct behavior. Further, while it is theoretically possible to allocate an ordinary Mbuf and then construct it at a later time in such a way that it is used as a packet-header, this behavior is strongly discouraged. Such attempts rarely lead to clean code and, more seriously, may lead to leaking of Mbuf meta-data if the corresponding packet-header Mbuf's m_tag structures are not properly allocated and initialized (or properly freed if we're transforming a packet-header Mbuf to an ordinary Mbuf).

Sometimes it is both necessary and advantageous to store packet data within a larger data area. This is certainly the case for large packets. In order to accommodate this requirement, the Mbuf structure provides an optional m_ext substructure that may be used instead of the Mbuf's internal data region to describe an external buffer. In such a scenario, the reference to the Mbuf's data (in its m_hdr substructure) is modified to point to a location within the external buffer, the m_ext header is appropriately filled to fully describe the buffer, and the M_EXT bit flag is set within the Mbuf's flags to indicate the presence of external storage.

2.2 Mbuf External Buffers

FreeBSD's Mbuf allocator provides the shims required to associate arbitrarily-sized external buffers with existing Mbufs. The optionally-used m_ext substructure contains a reference to the base of the externally-allocated buffer, its size, a type descriptor, a pointer to an unsigned integer for reference counting, and a pointer to a caller-specified free routine along with an optional pointer to an arbitrarily-typed data structure that may be passed by the Mbuf code to the specified free routine.

The caller who defines the External Buffer may wish to provide an API which allocates an Mbuf as well as the External Buffer and its reference counter and takes care of configuring the Mbuf for use with the specified buffer via the m_extadd() Mbuf API routine. The m_extadd() routine allows the caller to immediately specify the free function reference which will be hooked into the allocated Mbuf's m_ext descriptor and called when the Mbuf is freed, in order to provide the caller with the ability to then free the buffer itself. Since it is possible to have multiple Mbufs refer to the same underlying External Buffer (for shared-data packets), the caller-provided free routine will only be called when the last reference to the External Buffer is released. It should be noted that once the reference counter is specified by the caller via m_extadd(), a reference to it is immediately stored within the Mbuf and further reference counting is entirely taken care of by the Mbuf subsystem itself without any additional caller intervention required.

Reference counting for External Buffers has long been a sticky issue with regards to the network buffer facilities in FreeBSD. Traditionally, FreeBSD's support for reference counting consisted of providing a large sparse vector for storing the counters from which there would be a one-to-one mapping to all possible Clusters allocated simultaneously (this limit has traditionally been NMBCLUSTERS). Thus, accessing a given Cluster's reference count consisted of merely indexing into the vector at the Cluster's corresponding offset. Since the mapping was one-to-one, this always worked without collisions. As for other types of External Buffers, their reference counters were to be provided by the caller via m_extadd() or auto-allocated via the general-purpose kernel malloc() at ultimately higher cost.

Since the traditional implementation, various possibilities have been considered. FreeBSD 4.x was modified to allocate counters for all types of External Buffers from a fast linked list, similar to how 4.x still allocates Mbufs and Mbuf Clusters. The advantage of the approach was that it provided a general solution for all types of buffers. The disadvantage was that allocating a counter was required for each allocation requiring external buffers. Another possible implementation would mimic the NetBSD [2] and OpenBSD [10] Projects' clever approach, which is to interlink Mbufs referring to the same External Buffer. This was both a reasonably fast and elegant approach, but would never find its way to FreeBSD 5.x due to serious complications that result because of lock ordering (with potential to lead to deadlock).

The improvements to reference counting made in later versions of FreeBSD (5.x) are discussed in sections 5.2 and 5.4.

2.3 Mbuf Clusters

As previously noted, Mbuf Clusters are merely a general type of External Buffer provided by the Mbuf subsystem. Clusters are 2048 bytes in size and are both virtual-address and physical-address contiguous, large enough to accommodate the Ethernet MTU and allow for direct DMA to occur to and from the underlying network device.

[Figure 1: Data Flows for Typical Send & Receive. Labels in the figure: recv() / send(); Socket Receive or Send Buffer (Mbuf chains); Network Protocol Stack(s); NetISR; Packet queue (Mbuf chains); Interrupt; Network Device.]
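
To make the layout of Section 2.1 and the reference-counting rule of Section 2.2 concrete, the following is a minimal userspace sketch in C. It is not FreeBSD's actual definition of struct mbuf or of m_extadd(): every toy_ and TOY_ name, the field order, and the function signatures are hypothetical and chosen only for illustration, and the counter updates are plain (non-atomic) integer operations, which a real SMP kernel could not rely on. The sketch assumes only what the text states: a 256-byte Mbuf, an internal data region that shrinks by the size of each header substructure, a 2048-byte Cluster, and a caller-supplied free routine that runs when the last reference to an External Buffer is released.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Userspace model only; all names here are hypothetical. */
#define TOY_MSIZE    256     /* Mbuf size from Section 2.1 */
#define TOY_MCLBYTES 2048    /* Cluster size from Section 2.3 */
#define TOY_M_EXT    0x0001  /* flag: data lives in an external buffer */

struct toy_mbuf;

struct toy_m_hdr {
    struct toy_mbuf *mh_next;     /* next Mbuf in this chain */
    struct toy_mbuf *mh_nextpkt;  /* first Mbuf of the next packet/record */
    char            *mh_data;     /* data stored or described by the Mbuf */
    int              mh_len;      /* length of the stored data */
    int              mh_flags;
    short            mh_type;
};

struct toy_pkthdr {
    void *rcvif;       /* receiving network interface */
    void *header;      /* pointer to the packet header */
    void *tags;        /* head of the m_tag list */
    int   len;         /* total packet or record length */
    int   csum_flags;  /* checksumming fields */
    int   csum_data;
};

struct toy_m_ext {
    char         *ext_buf;                    /* base of external buffer */
    unsigned int *ref_cnt;                    /* caller-provided counter */
    void        (*ext_free)(void *, void *);  /* caller-provided free routine */
    void         *ext_args;                   /* optional argument for it */
    unsigned int  ext_size;
    int           ext_type;
};

/* Internal data region sizes, as given by the formulas in Section 2.1. */
#define TOY_MLEN  (TOY_MSIZE - sizeof(struct toy_m_hdr))
#define TOY_MHLEN (TOY_MLEN - sizeof(struct toy_pkthdr))

struct toy_mbuf {
    struct toy_m_hdr m_hdr;
    union {
        char m_databuf[TOY_MLEN];          /* ordinary Mbuf */
        struct {
            struct toy_pkthdr pkthdr;
            char databuf[TOY_MHLEN];
        } m_pkt;                           /* packet-header Mbuf */
        struct toy_m_ext m_ext;            /* external storage descriptor */
    } m_dat;
};

/* Attach an external buffer; the first reference belongs to 'm'. */
static void toy_extadd(struct toy_mbuf *m, char *buf, unsigned int size,
    unsigned int *ref_cnt, void (*freef)(void *, void *), void *args)
{
    assert(m != NULL && buf != NULL && ref_cnt != NULL);
    m->m_dat.m_ext.ext_buf = buf;
    m->m_dat.m_ext.ext_size = size;
    m->m_dat.m_ext.ref_cnt = ref_cnt;
    m->m_dat.m_ext.ext_free = freef;
    m->m_dat.m_ext.ext_args = args;
    *ref_cnt = 1;
    m->m_hdr.mh_data = buf;          /* data pointer now refers to the buffer */
    m->m_hdr.mh_flags |= TOY_M_EXT;
}

/* Make 'dst' refer to the same external buffer as 'src'. */
static void toy_ext_share(struct toy_mbuf *dst, const struct toy_mbuf *src)
{
    dst->m_dat.m_ext = src->m_dat.m_ext;
    dst->m_hdr.mh_data = src->m_hdr.mh_data;
    dst->m_hdr.mh_flags |= TOY_M_EXT;
    *dst->m_dat.m_ext.ref_cnt += 1;
}

/* Drop one reference; the free routine runs only on the last release. */
static void toy_ext_release(struct toy_mbuf *m)
{
    struct toy_m_ext *e = &m->m_dat.m_ext;

    if ((m->m_hdr.mh_flags & TOY_M_EXT) == 0)
        return;
    *e->ref_cnt -= 1;
    if (*e->ref_cnt == 0 && e->ext_free != NULL)
        e->ext_free(e->ext_buf, e->ext_args);
    m->m_hdr.mh_flags &= ~TOY_M_EXT;
}

/* Example free routine: frees the buffer and counts its invocations. */
static void toy_count_free(void *buf, void *args)
{
    free(buf);
    *(int *)args += 1;
}

/* Two Mbufs share one Cluster-sized buffer; returns 1 if it was freed
 * exactly once, and only after the second release; -1 if malloc failed. */
static int toy_demo(void)
{
    struct toy_mbuf a, b;
    unsigned int refs;
    int frees = 0, frees_after_first;
    char *cluster = malloc(TOY_MCLBYTES);

    if (cluster == NULL)
        return -1;
    memset(&a, 0, sizeof(a));
    memset(&b, 0, sizeof(b));
    toy_extadd(&a, cluster, TOY_MCLBYTES, &refs, toy_count_free, &frees);
    toy_ext_share(&b, &a);
    toy_ext_release(&a);
    frees_after_first = frees;
    toy_ext_release(&b);
    return frees_after_first == 0 && frees == 1;
}
```

Calling toy_demo() walks through the shared-data case: the buffer survives the first release and is handed to the free routine on the second. Placing the three variants in a union is one way to model the "polymorphic" fixed-size Mbuf; the exact values of TOY_MLEN and TOY_MHLEN depend on the platform's pointer size and padding.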