Programmer's Guide

Total Page:16

File Type:pdf, Size:1020Kb

Programmer's Guide Programmer’s Guide Release 18.11.11 Jan 20, 2021 CONTENTS 1 Introduction 1 1.1 Documentation Roadmap...............................1 1.2 Related Publications..................................2 2 Overview 3 2.1 Development Environment..............................3 2.2 Environment Abstraction Layer............................4 2.3 Core Components...................................4 2.3.1 Ring Manager (librte_ring)..........................4 2.3.2 Memory Pool Manager (librte_mempool)..................4 2.3.3 Network Packet Buffer Management (librte_mbuf).............6 2.3.4 Timer Manager (librte_timer)........................6 2.4 Ethernet* Poll Mode Driver Architecture.......................6 2.5 Packet Forwarding Algorithm Support........................6 2.6 librte_net........................................6 3 Environment Abstraction Layer7 3.1 EAL in a Linux-userland Execution Environment..................7 3.1.1 Initialization and Core Launching......................8 3.1.2 Shutdown and Cleanup...........................8 3.1.3 Multi-process Support............................8 3.1.4 Memory Mapping Discovery and Memory Reservation..........8 3.1.5 Support for Externally Allocated Memory.................. 11 3.1.6 Per-lcore and Shared Variables....................... 12 3.1.7 Logs...................................... 12 3.1.8 CPU Feature Identification.......................... 12 3.1.9 User Space Interrupt Event......................... 13 3.1.10 Blacklisting.................................. 14 3.1.11 Misc Functions................................ 14 3.1.12 IOVA Mode Configuration.......................... 14 3.2 Memory Segments and Memory Zones (memzone)................ 14 3.3 Multiple pthread.................................... 14 3.3.1 EAL pthread and lcore Affinity........................ 15 3.3.2 non-EAL pthread support.......................... 15 3.3.3 Public Thread API.............................. 15 3.3.4 Control Thread API.............................. 16 3.3.5 Known Issues................................. 16 3.3.6 cgroup control................................. 17 3.4 Malloc.......................................... 18 i 3.4.1 Cookies.................................... 18 3.4.2 Alignment and NUMA Constraints...................... 18 3.4.3 Use Cases.................................. 18 3.4.4 Internal Implementation........................... 18 4 Service Cores 23 4.1 Service Core Initialization............................... 23 4.2 Enabling Services on Cores............................. 23 4.3 Service Core Statistics................................ 24 5 Ring Library 25 5.1 References for Ring Implementation in FreeBSD*................. 26 5.2 Lockless Ring Buffer in Linux*............................ 26 5.3 Additional Features.................................. 26 5.3.1 Name..................................... 26 5.4 Use Cases....................................... 26 5.5 Anatomy of a Ring Buffer............................... 26 5.5.1 Single Producer Enqueue.......................... 27 5.5.2 Single Consumer Dequeue......................... 28 5.5.3 Multiple Producers Enqueue......................... 29 5.5.4 Modulo 32-bit Indexes............................ 33 5.6 References....................................... 35 6 Mempool Library 37 6.1 Cookies......................................... 37 6.2 Stats.......................................... 37 6.3 Memory Alignment Constraints............................ 37 6.4 Local Cache...................................... 38 6.5 Mempool Handlers.................................. 39 6.6 Use Cases....................................... 40 7 Mbuf Library 41 7.1 Design of Packet Buffers............................... 41 7.2 Buffers Stored in Memory Pools........................... 43 7.3 Constructors...................................... 43 7.4 Allocating and Freeing mbufs............................. 43 7.5 Manipulating mbufs.................................. 43 7.6 Meta Information.................................... 43 7.7 Direct and Indirect Buffers.............................. 45 7.8 Debug......................................... 46 7.9 Use Cases....................................... 46 8 Poll Mode Driver 47 8.1 Requirements and Assumptions........................... 47 8.2 Design Principles................................... 48 8.3 Logical Cores, Memory and NIC Queues Relationships.............. 49 8.4 Device Identification, Ownership and Configuration................ 49 8.4.1 Device Identification............................. 49 8.4.2 Port Ownership................................ 50 8.4.3 Device Configuration............................. 50 8.4.4 On-the-Fly Configuration........................... 50 8.4.5 Configuration of Transmit Queues...................... 51 ii 8.4.6 Free Tx mbuf on Demand.......................... 52 8.4.7 Hardware Offload............................... 52 8.5 Poll Mode Driver API................................. 53 8.5.1 Generalities.................................. 53 8.5.2 Generic Packet Representation....................... 54 8.5.3 Ethernet Device API............................. 54 8.5.4 Ethernet Device Standard Device Arguments............... 54 8.5.5 Extended Statistics API........................... 54 8.5.6 NIC Reset API................................ 57 9 Generic flow API (rte_flow) 59 9.1 Overview........................................ 59 9.2 Flow rule........................................ 59 9.2.1 Description.................................. 59 9.2.2 Attributes................................... 60 9.2.3 Pattern item.................................. 62 9.2.4 Matching pattern............................... 63 9.2.5 Meta item types................................ 64 9.2.6 Data matching item types.......................... 68 9.2.7 Actions..................................... 77 9.2.8 Action types.................................. 79 9.2.9 Negative types................................ 93 9.2.10 Planned types................................. 93 9.3 Rules management.................................. 93 9.3.1 Validation................................... 93 9.3.2 Creation.................................... 94 9.3.3 Destruction.................................. 95 9.3.4 Flush...................................... 95 9.3.5 Query..................................... 96 9.4 Isolated mode..................................... 96 9.5 Verbose error reporting................................ 97 9.6 Helpers......................................... 98 9.6.1 Error initializer................................. 98 9.6.2 Object conversion............................... 98 9.7 Caveats......................................... 99 9.8 PMD interface..................................... 99 9.9 Device compatibility.................................. 100 9.9.1 Global bit-masks............................... 100 9.9.2 Unsupported layer types........................... 100 9.9.3 ANY pattern item............................... 100 9.9.4 Unsupported actions............................. 101 9.9.5 Flow rules priority............................... 101 9.10 Future evolutions................................... 102 10 Switch Representation within DPDK Applications 103 10.1 Introduction....................................... 104 10.2 Port Representors................................... 104 10.3 Basic SR-IOV..................................... 105 10.4 Controlled SR-IOV................................... 106 10.4.1 Initialization.................................. 106 10.4.2 VF Representors............................... 107 10.4.3 Traffic Steering................................ 108 iii 10.5 Flow API (rte_flow).................................. 109 10.5.1 Extensions.................................. 109 10.5.2 Traffic Direction................................ 109 10.5.3 Transferring Traffic.............................. 110 10.5.4 Pattern Items And Actions.......................... 112 10.5.5 Actions Order and Repetition........................ 114 10.6 Switching Examples.................................. 114 10.6.1 Associating VF 1 with Physical Port 0.................... 115 10.6.2 Sharing Broadcasts............................. 115 10.6.3 Encapsulating VF 2 Traffic in VXLAN.................... 116 11 Traffic Metering and Policing API 117 11.1 Overview........................................ 117 11.2 Configuration steps.................................. 117 11.3 Run-time processing................................. 117 12 Traffic Management API 119 12.1 Overview........................................ 119 12.2 Capability API..................................... 119 12.3 Scheduling Algorithms................................ 120 12.4 Traffic Shaping..................................... 120 12.5 Congestion Management............................... 120 12.6 Packet Marking.................................... 121 12.7 Steps to Setup the Hierarchy............................. 121 12.7.1 Initial Hierarchy Specification........................ 122 12.7.2 Hierarchy Commit.............................. 122 12.7.3 Run-Time Hierarchy Updates........................ 122 13 Wireless Baseband Device Library 123 13.1 Design Principles................................... 123 13.2 Device Management................................. 123 13.2.1 Device Creation................................ 123 13.2.2 Device Identification............................. 124 13.2.3 Device Configuration............................. 124 13.2.4 Queues Configuration............................ 124 13.2.5 Device & Queues
Recommended publications
  • Reducing Power Consumption in Mobile Devices by Using a Kernel
    IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. Z, NO. B, AUGUST 2017 1 Reducing Event Latency and Power Consumption in Mobile Devices by Using a Kernel-Level Display Server Stephen Marz, Member, IEEE and Brad Vander Zanden and Wei Gao, Member, IEEE E-mail: [email protected], [email protected], [email protected] Abstract—Mobile devices differ from desktop computers in that they have a limited power source, a battery, and they tend to spend more CPU time on the graphical user interface (GUI). These two facts force us to consider different software approaches in the mobile device kernel that can conserve battery life and reduce latency, which is the duration of time between the inception of an event and the reaction to the event. One area to consider is a software package called the display server. The display server is middleware that handles all GUI activities between an application and the operating system, such as event handling and drawing to the screen. In both desktop and mobile devices, the display server is located in the application layer. However, the kernel layer contains most of the information needed for handling events and drawing graphics, which forces the application-level display server to make a series of system calls in order to coordinate events and to draw graphics. These calls interrupt the CPU which can increase both latency and power consumption, and also require the kernel to maintain event queues that duplicate event queues in the display server. A further drawback of placing the display server in the application layer is that the display server contains most of the information required to efficiently schedule the application and this information is not communicated to existing kernels, meaning that GUI-oriented applications are scheduled less efficiently than they might be, which further increases power consumption.
    [Show full text]
  • Onload User Guide
    Onload User Guide Copyright © 2017 SOLARFLARE Communications, Inc. All rights reserved. The software and hardware as applicable (the “Product”) described in this document, and this document, are protected by copyright laws, patents and other intellectual property laws and international treaties. The Product described in this document is provided pursuant to a license agreement, evaluation agreement and/or non‐disclosure agreement. The Product may be used only in accordance with the terms of such agreement. The software as applicable may be copied only in accordance with the terms of such agreement. Onload is licensed under the GNU General Public License (Version 2, June 1991). See the LICENSE file in the distribution for details. The Onload Extensions Stub Library is Copyright licensed under the BSD 2‐Clause License. Onload contains algorithms and uses hardware interface techniques which are subject to Solarflare Communications Inc patent applications. Parties interested in licensing Solarflare's IP are encouraged to contact Solarflare's Intellectual Property Licensing Group at: Director of Intellectual Property Licensing Intellectual Property Licensing Group Solarflare Communications Inc, 7505 Irvine Center Drive Suite 100 Irvine, California 92618 You will not disclose to a third party the results of any performance tests carried out using Onload or EnterpriseOnload without the prior written consent of Solarflare. The furnishing of this document to you does not give you any rights or licenses, express or implied, by estoppel or otherwise, with respect to any such Product, or any copyrights, patents or other intellectual property rights covering such Product, and this document does not contain or represent any commitment of any kind on the part of SOLARFLARE Communications, Inc.
    [Show full text]
  • A Linux Kernel Scheduler Extension for Multi-Core Systems
    A Linux Kernel Scheduler Extension For Multi-Core Systems Aleix Roca Nonell Master in Innovation and Research in Informatics (MIRI) specialized in High Performance Computing (HPC) Facultat d’informàtica de Barcelona (FIB) Universitat Politècnica de Catalunya Supervisor: Vicenç Beltran Querol Cosupervisor: Eduard Ayguadé Parra Presentation date: 25 October 2017 Abstract The Linux Kernel OS is a black box from the user-space point of view. In most cases, this is not a problem. However, for parallel high performance computing applications it can be a limitation. Such applications usually execute on top of a runtime system, itself executing on top of a general purpose kernel. Current runtime systems take care of getting the most of each system core by distributing work among the multiple CPUs of a machine but they are not aware of when one of their threads perform blocking calls (e.g. I/O operations). When such a blocking call happens, the processing core is stalled, leading to performance loss. In this thesis, it is presented the proof-of-concept of a Linux kernel extension denoted User Monitored Threads (UMT). The extension allows a user-space application to be notified of the blocking and unblocking of its threads, making it possible for a core to execute another worker thread while the other is blocked. An existing runtime system (namely Nanos6) is adapted, so that it takes advantage of the kernel extension. The whole prototype is tested on a synthetic benchmarks and an industry mock-up application. The analysis of the results shows, on the tested hardware and the appropriate conditions, a significant speedup improvement.
    [Show full text]
  • Linux Kernel Development 3Rd Edition
    ptg Linux Kernel Development Third Edition ptg From the Library of Wow! eBook Developer’s Library ESSENTIAL REFERENCES FOR PROGRAMMING PROFESSIONALS Developer’s Library books are designed to provide practicing programmers with unique, high-quality references and tutorials on the programming languages and technologies they use in their daily work. All books in the Developer’s Library are written by expert technology practitioners who are especially skilled at organizing and presenting information in a way that’s useful for other programmers. Key titles include some of the best, most widely acclaimed books within their topic areas: PHP & MySQL Web Development Python Essential Reference Luke Welling & Laura Thomson David Beazley ISBN 978-0-672-32916-6 ISBN-13: 978-0-672-32978-6 MySQL Programming in Objective-C 2.0 ptg Paul DuBois Stephen G. Kochan ISBN-13: 978-0-672-32938-8 ISBN-13: 978-0-321-56615-7 Linux Kernel Development PostgreSQL Robert Love Korry Douglas ISBN-13: 978-0-672-32946-3 ISBN-13: 978-0-672-33015-5 Developer’s Library books are available at most retail and online bookstores, as well as by subscription from Safari Books Online at safari.informit.com Developer’s Library informit.com/devlibrary From the Library of Wow! eBook Linux Kernel Development ptg Third Edition Robert Love Upper Saddle River, NJ • Boston • Indianapolis • San Francisco New York • Toronto • Montreal • London • Munich • Paris • Madrid Cape Town • Sydney • Tokyo • Singapore • Mexico City From the Library of Wow! eBook Linux Kernel Development Acquisitions Editor Third Edition Mark Taber Development Copyright © 2010 Pearson Education, Inc.
    [Show full text]
  • ODROID-HC2: 3.5” High Powered Storage  February 1, 2018
    ODROID WiFi Access Point: Share Files Via Samba February 1, 2018 How to setup an ODROID with a WiFi access point so that an ODROID’s hard drive can be accessed and modied from another computer. This is primarily aimed at allowing access to images, videos, and log les on the ODROID. ODROID-HC2: 3.5” High powered storage February 1, 2018 The ODROID-HC2 is an aordable mini PC and perfect solution for a network attached storage (NAS) server. This device home cloud-server capabilities centralizes data and enables users to share and stream multimedia les to phones, tablets, and other devices across a network. It is an ideal tool for many use Using SquashFS As A Read-Only Root File System February 1, 2018 This guide describes the usage of SquashFS PiFace: Control and Display 2 February 1, 2018 For those who have the PiFace Control and Display 2, and want to make it compatible with the ODROID-C2 Android Gaming: Data Wing, Space Frontier, and Retro Shooting – Pixel Space Shooter February 1, 2018 Variations on a theme! Race, blast into space, and blast things into pieces that are racing towards us. The fun doesn’t need to stop when you take a break from your projects. Our monthly pick on Android games. Linux Gaming: Saturn Games – Part 1 February 1, 2018 I think it’s time we go into a bit more detail about Sega Saturn for the ODROID-XU3/XU4 Gaming Console: Running Your Favorite Games On An ODROID-C2 Using Android February 1, 2018 I built a gaming console using an ODROID-C2 running Android 6 Controller Area Network (CAN) Bus: Implementation
    [Show full text]
  • Monitoring File Events
    MONITORING FILE EVENTS Some applications need to be able to monitor files or directories in order to deter- mine whether events have occurred for the monitored objects. For example, a graphical file manager needs to be able to determine when files are added or removed from the directory that is currently being displayed, or a daemon may want to monitor its configuration file in order to know if the file has been changed. Starting with kernel 2.6.13, Linux provides the inotify mechanism, which allows an application to monitor file events. This chapter describes the use of inotify. The inotify mechanism replaces an older mechanism, dnotify, which provided a subset of the functionality of inotify. We describe dnotify briefly at the end of this chapter, focusing on why inotify is better. The inotify and dnotify mechanisms are Linux-specific. (A few other systems provide similar mechanisms. For example, the BSDs provide the kqueue API.) A few libraries provide an API that is more abstract and portable than inotify and dnotify. The use of these libraries may be preferable for some applications. Some of these libraries employ inotify or dnotify, on systems where they are available. Two such libraries are FAM (File Alteration Monitor, http:// oss.sgi.com/projects/fam/) and Gamin (http://www.gnome.org/~veillard/gamin/). 19.1 Overview The key steps in the use of the inotify API are as follows: 1. The application uses inotify_init() to create an inotify instance. This system call returns a file descriptor that is used to refer to the inotify instance in later operations.
    [Show full text]
  • Unbreakable Enterprise Kernel Release Notes for Unbreakable Enterprise Kernel Release 3
    Unbreakable Enterprise Kernel Release Notes for Unbreakable Enterprise Kernel Release 3 E48380-10 June 2020 Oracle Legal Notices Copyright © 2013, 2020, Oracle and/or its affiliates. This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing. If this is software or related documentation that is delivered to the U.S. Government or anyone licensing it on behalf of the U.S. Government, then the following notice is applicable: U.S. GOVERNMENT END USERS: Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs) and Oracle computer documentation or other Oracle data delivered to or accessed by U.S. Government end users are "commercial computer software" or "commercial computer software documentation" pursuant to the applicable Federal Acquisition Regulation and agency-specific supplemental regulations. As such, the use, reproduction, duplication, release, display, disclosure, modification, preparation of derivative works, and/or adaptation of i) Oracle programs (including any operating system, integrated software, any programs embedded, installed or activated on delivered hardware, and modifications of such programs), ii) Oracle computer documentation and/or iii) other Oracle data, is subject to the rights and limitations specified in the license contained in the applicable contract.
    [Show full text]
  • Kernel Korner
    Kernel Korner http://0-delivery.acm.org.innopac.lib.ryerson.ca/10.1145/1110000/11030... Kernel Korner Intro to inotify Robert Love Abstract Applications that watch thousands of files for changes, or that need to know when a storage device gets disconnected, need a clean, fast solution to the file change notification problem. Here it is. John McCutchan and I had been working on inotify for about a year when it was finally merged into Linus' kernel tree and released with kernel version 2.6.13. Although a long struggle, the effort culminated in success and was ultimately worth every rewrite, bug and debate. What Is inotify? inotify is a file change notification system—a kernel feature that allows applications to request the monitoring of a set of files against a list of events. When the event occurs, the application is notified. To be useful, such a feature must be simple to use, lightweight with little overhead and flexible. It should be easy to add new watches and painless to receive notification of events. To be sure, inotify is not the first of its kind. Every modern operating system provides some sort of file notification system; many network and desktop applications require such functionality—Linux too. For years, Linux has offered dnotify. The problem was, dnotify was not very good. In fact, it stank. dnotify, which ostensibly stands for directory notify, was never considered easy to use. Sporting a cumbersome interface and several painful features that made life arduous, dnotify failed to meet the demands of the modern desktop, where asynchronous notification of events and a free flow of information rapidly are becoming the norm.
    [Show full text]
  • Deployment Guide
    Deployment Guide SUSE CaaS Platform 4.2.4 Deployment Guide: This guide describes deployment for SUSE CaaS Platform 4.2.4. SUSE CaaS Platform 4.2.4 by Markus Napp and Nora Kořánová Publication Date: 2020-12-11 SUSE LLC 1800 South Novell Place Provo, UT 84606 USA https://documentation.suse.com Contents vi About This Guide vii 1 Required Background vii 2 Available Documentation vii 3 Feedback viii 4 Documentation Conventions viii 1 Requirements 1 1.1 Platform 1 1.2 Nodes 1 1.3 Hardware 2 Management Workstation 2 • Storage Sizing 2 • Storage Performance 5 1.4 Networking 7 Sub-Network Sizing 7 • Ports 8 • IP Addresses 9 • IP Forwarding 10 • Networking Whitelist 10 • Communication 11 • Performance 11 • Security 12 • Replicas 12 2 Deployment Scenarios 14 2.1 Default Deployment Scenario 14 2.2 Airgapped deployment 16 Process Checklist 16 • Concepts 17 • Requirements 20 • RPM Repository Mirror 24 • Updating RPM Repository Mirror 25 • Container Registry Mirror 26 • Helm Chart Repository Mirror 34 • Updating Registry Mirror For Helm Charts 35 • Deploying SUSE CaaS Platform 38 • Troubleshooting 38 iii Deployment Guide 3 Deployment Instructions 40 3.1 Deployment Preparations 40 Basic SSH Key Configuration 40 • Registration Code 42 • Unique Machine IDs 43 • Installation Tools 44 • Load Balancer 45 3.2 Deployment on SUSE OpenStack Cloud 54 Deploying the Cluster Nodes 56 • Logging in to the Cluster Nodes 61 • Container Runtime Proxy 63 3.3 Deployment on VMware 63 Environment Description 63 • Choose a Setup Type 64 • Setup with AutoYaST 64 • Setup Using
    [Show full text]
  • Analyzing a Decade of Linux System Calls
    Empir Software Eng https://doi.org/10.1007/s10664-017-9551-z Analyzing a decade of Linux system calls Mojtaba Bagherzadeh1 · Nafiseh Kahani1 · Cor-Paul Bezemer1 · Ahmed E. Hassan1 · Juergen Dingel1 · James R. Cordy1 © Springer Science+Business Media, LLC 2017 Abstract Over the past 25 years, thousands of developers have contributed more than 18 million lines of code (LOC) to the Linux kernel. As the Linux kernel forms the central part of various operating systems that are used by millions of users, the kernel must be continuously adapted to the changing demands and expectations of these users. The Linux kernel provides its services to an application through system calls. The combined set of all system calls forms the essential Application Programming Interface (API) through which an application interacts with the kernel. In this paper, we conduct an empirical study of 8,770 changes that were made to Linux system calls during the last decade (i.e., from April 2005 to December 2014). In particular, we study the size of the changes, and we manually identify the type of changes and bug fixes that were made. Our analysis provides an overview of the evolution of the Linux system calls over the last decade. We find that there was a considerable amount of technical debt in the kernel, that was addressed by adding a number of sibling calls (i.e., 26% of all system calls). In addition, we find that by far, the ptrace() and signal handling Communicated by: Harald Gall Cor-Paul Bezemer [email protected] Mojtaba Bagherzadeh [email protected] Nafiseh Kahani [email protected] Ahmed E.
    [Show full text]
  • Linux/UNIX System Programming Fundamentals
    Linux/UNIX System Programming Fundamentals Michael Kerrisk man7.org NDC TechTown; August 2020 ©2020, man7.org Training and Consulting / Michael Kerrisk. All rights reserved. These training materials have been made available for personal, noncommercial use. Except for personal use, no part of these training materials may be printed, reproduced, or stored in a retrieval system. These training materials may not be redistributed by any means, electronic, mechanical, or otherwise, without prior written permission of the author. These training materials may not be used to provide training to others without prior written permission of the author. Every effort has been made to ensure that the material contained herein is correct, including the development and testing of the example programs. However, no warranty is expressed or implied, and the author shall not be liable for loss or damage arising from the use of these programs. The programs are made available under Free Software licenses; see the header comments of individual source files for details. For information about this course, visit http://man7.org/training/. For inquiries regarding training courses, please contact us at [email protected]. Please send corrections and suggestions for improvements to this course material to [email protected]. For information about The Linux Programming Interface, please visit http://man7.org/tlpi/. Short table of contents 1 Course Introduction 1-1 2 Fundamental Concepts 2-1 3 File I/O and Files 3-1 4 Directories and Links 4-1 5 Processes 5-1 6 Signals:
    [Show full text]
  • Epoll API 23-34 23.7 Epoll Events 23-45 23.8 Epoll: Edge-Triggered Notification 23-59 23.9 Epoll: API Quirks 23-70 23.10 Event-Loop Programming 23-76
    Outline 23 Alternative I/O Models 23-1 23.1 Overview 23-3 23.2 Nonblocking I/O 23-5 23.3 Signal-driven I/O 23-11 23.4 I/O multiplexing: poll() 23-14 23.5 Problems with poll() and select() 23-31 23.6 The epoll API 23-34 23.7 epoll events 23-45 23.8 epoll: edge-triggered notification 23-59 23.9 epoll: API quirks 23-70 23.10 Event-loop programming 23-76 Overview Like select() and poll(), epoll can monitor multiple FDs epoll returns readiness information in similar manner to poll() Two main advantages: epoll provides much better performance when monitoring large numbers of FDs (see TLPI 63.4.5) § epoll provides two notification modes: level-triggered and edge-triggered Default is level-triggered notification select() and poll() provide only level-triggered notification (Signal-driven I/O provides only edge-triggered notification) Linux-specific, since kernel 2.6.0 [TLPI 63.4] § Linux/UNIX System Programming ©2020, Michael Kerrisk Alternative I/O Models 23-36 §23.6 epoll instances Central data structure of epoll API is an epoll instance Persistent data structure maintained in kernel space Referred to in user space viafile descriptor Can (abstractly) be considered as container for two lists: Interest list: list of FDs to be monitored Ready list: list of FDs that are ready for I/O Ready list is (dynamic) subset of interest list Linux/UNIX System Programming ©2020, Michael Kerrisk Alternative I/O Models 23-37 §23.6 epoll APIs The key epoll APIs are: epoll_create(): create a new epoll instance and return FD referring to instance FD is used in the calls below epoll_ctl(): modify interest list of epoll instance Add FDs to/remove FDs from interest list Modify events mask for FDs currently in interest list epoll_wait(): return items from ready list of epoll instance Linux/UNIX System Programming ©2020, Michael Kerrisk Alternative I/O Models 23-38 §23.6 epoll kernel data structures and APIs User space Kernel space File descriptor from epoll instance epoll_create() refers to Interest list events data ..
    [Show full text]