P-Socket: Optimizing a Communication Library for a PCIe-Based Intra-Rack Interconnect


Liuhang Zhang, Rui Hou
Institute of Computing Technology, University of Chinese Academy of Sciences, Beijing, China

Sally A. McKee
Chalmers University of Technology, Gothenburg, Sweden

Jianbo Dong, Lixin Zhang
Institute of Computing Technology, University of Chinese Academy of Sciences, Beijing, China

ABSTRACT

Data centers require efficient, low-cost, flexible interconnects to manage the rapidly growing internal traffic generated by an increasingly diverse set of applications. To meet these requirements, data center networks are increasingly employing alternatives such as RapidIO, Freedom, and PCIe, which require fewer physical devices and/or have simpler protocols than more traditional interconnects. These networks offer raw high performance communication capabilities, but simply using them for conventional TCP/IP-based communication fails to realize the potential performance of the physical network. Here we analyze causes for this performance loss for the TCP/IP protocol over one such fabric, PCIe, and we explore a hardware/software solution that mitigates overheads and exploits PCIe's advanced features. The result is P-Socket, an efficient library that enables legacy socket applications to run without modification. Our experiments show that P-Socket achieves an end-to-end latency of 1.2µs and effective bandwidth of up to 2.87GB/s (out of a theoretical peak of 3.05GB/s).

CCS Concepts

•Software and its engineering → Communications management; •Networks → Network performance evaluation; Network performance analysis; Data center networks; •Hardware → Networking hardware;

Keywords

data-center servers, rack interconnects, sockets, PCIe

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
CF'16, May 16 - 19, 2016, Como, Italy
© 2016 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ISBN 978-1-4503-4128-8/16/05. $15.00
DOI: http://dx.doi.org/10.1145/2903150.2903168

1. INTRODUCTION

Data centers must handle huge volumes of workloads generated by millions of independent users. Now dominated by cloud computing and big data applications, these workloads have caused network data flows to change such that the ratio of internal to external traffic has gone from 5:95 to 75:25 [27]. Data center applications are also becoming more diversified in their requirements. Meeting the needs of these applications at rapidly growing scales requires efficient sharing of data-center resources, which, in turn, requires a high-efficiency, low-cost, flexible interconnect.

Early data centers often employed standard High Performance Computing (HPC) networking solutions like InfiniBand [12] (hereafter abbreviated as IB), 10-Gigabit Ethernet [16] (10 GigE), Myrinet [4], and Quadrics [22]. Ethernet has often been used for its ease of deployment and backward compatibility, but linking racks of volume servers together with 10 GigE switches and routers is wasteful in terms of power and cost: replicated components allow each server to enjoy individual operation, management, and connectivity, yet the servers are never used in isolation. Myrinet is a high-speed local area networking fabric which has much lower protocol overhead than Ethernet. Once popular for supercomputers, its use has decreased in recent years (it was used in 141 of the TOP500 machines in 2005 but only one in 2013) [33]. InfiniBand stands out among traditional HPC solutions: its popularity has been growing for both TOP500 supercomputers and enterprise data centers.

Many newer data centers are turning to innovative interconnects that better match the communication requirements of modern data center workloads. For instance, Freescale, IDT, Mobiveil, and Prodrive promote the use of ARM servers connected by RapidIO [20], the embedded fabric currently used in most base stations. AMD SeaMicro markets ultra low-power, small-footprint data centers connected by their Freedom™ Supercomputer Fabric, a 3D torus that includes both path redundancy and diversity. SeaMicro's proprietary technology can be used to build servers from any processor with any instruction set and any communication protocol [26]. Even PCI Express (PCIe), once viewed as inappropriate for general-purpose fabrics, is increasingly used within small-scale, tightly coupled data-center racks once connected by more traditional HPC technologies [6]. For instance, Hou et al. [15] demonstrate a prototype data center server in which nodes share memory resources, GPGPUs, and network bandwidth via PCIe, which allows remote resources to be used much the same as local resources.

In all high-speed interconnects, hardware and software overheads prevent applications from realizing a fabric's raw peak performance. For instance, Balaji et al. [1] show that the Socket Direct Protocol (SDP) and IP over InfiniBand (IPoIB) both take more than five times the raw link latency and realize less than 2/3 and 1/5 of the raw link bandwidth, respectively. Likewise, Feng et al. [11] show that implementing traditional sockets over 10 GigE incurs 16% latency over that of the raw link and realizes less than 3/4 of the raw link bandwidth. These results demonstrate that there is still room for improvement. Here we use PCIe to build a prototype system (similar to that of Hou et al. [15]) on which to investigate the reasons behind these performance gaps. We select PCIe for its reliability and efficiency over short distances.

Experiments on our PCIe prototype system verify that:

• using store instructions to transmit small data packets reduces latency by about 16% over DMA;

• using store instructions realizes a peak bandwidth of 2.48GB/s, while using load instructions delivers only 26.67MB/s;

• using burst DMA mode performs better than using block DMA mode for small packets, even though burst DMA's latency is 1.76 times longer;

• bypassing the TCP/IP protocol stack can increase bandwidth by a factor of eight and lower latency by about 30% for small messages; and

• bypassing the kernel reduces small-packet latency from 18µs to 1.2µs by eliminating unnecessary context switching and buffer copying.

Based on this analysis, we propose P-Socket, a communications library designed to exploit these performance artifacts. Specifically, P-Socket bypasses the kernel, uses store instructions to transmit small packets and to implement flow control, and uses burst DMA instead of block DMA to transmit large packets. Note that the optimizations we study are not new, and they have been exploited to good effect elsewhere; rather, it is their synthesis and PCIe-specific adaptation that is novel to P-Socket. The main contribution of this paper is the detailed performance evaluation of our im-

2. PCIE SYSTEM ARCHITECTURE

Many scalable systems are built from sets of nodes coupled tightly with low-latency, high-bandwidth local interconnects. These sets of tightly coupled nodes, or super nodes, are themselves connected by more traditional networks such as Ethernet. PCIe fabrics are good candidates for super-node interconnects based on several considerations. First, the PCIe interface already enjoys widespread use, which means that deploying it requires no architectural changes or additional protocol translation cards. Second, the PCIe fabric has good scalability for intra-rack interconnects, which usually include fewer than 100 nodes: PCIe cables (e.g., copper wire or optical fiber) work well at such short distances. Third, PCIe allows servers within a rack to directly share resources via memory load/store instructions.

Figure 1 shows a typical organization. In this tightly coupled group, compute nodes connect to a Non-Transparent Bridge (NTB) that connects to the Transparent Bridge (TB) on the other side. NTBs can separate different address spaces and translate transactions from one address space to another. TBs are used to forward transactions within a given address space. This fabric is sufficiently scalable that many TBs can be connected together to expand the network, and new compute nodes need only one NTB to join. When compared with Ethernet or IB (whose adapters are commonly plugged into PCIe slots), PCIe fabrics eliminate additional protocol conversions (e.g., from PCIe to IB or from PCIe to Ethernet). This advantage gives PCIe a shorter communication channel and thus lower latency.

3. RELATED WORK

The evolution of interconnection technology has caused the communication performance bottleneck to move from the physical layer to the software layer. One way to reduce the bottleneck is to move some software functionality to the hardware. For example, Illinois Fast Messages (FM) use Myrinet capabilities to offload protocol processing to the programmable NIC [21]. TCP Offload Engines also free host CPU cycles by moving TCP/IP stack processing to the network controller [14]. Similarly, Ethernet Message Passing (EMP) offloads protocol processing to take better
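
The introduction states that P-Socket sends small packets with store instructions and large packets with burst DMA. The following is a minimal Python sketch of that size-based choice, not the authors' implementation. The names `SMALL_PACKET_LIMIT`, `SendPlan`, and `choose_path` are hypothetical, and the 256-byte cutoff is an assumed placeholder: the text above does not say where the switch between the two paths occurs.

```python
from dataclasses import dataclass

# Assumed cutoff in bytes. The source text does not give the packet size at
# which P-Socket switches from store instructions to burst DMA.
SMALL_PACKET_LIMIT = 256


@dataclass(frozen=True)
class SendPlan:
    """Describes which transmit path a message of a given length would take."""
    path: str    # "store" (CPU store instructions) or "burst_dma"
    length: int  # message length in bytes


def choose_path(length: int, limit: int = SMALL_PACKET_LIMIT) -> SendPlan:
    """Pick a transmit path from the message length alone.

    Messages up to `limit` bytes go through store instructions, which the
    paper reports as lower latency for small packets; longer messages go
    through burst DMA, which the paper uses for large packets.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    path = "store" if length <= limit else "burst_dma"
    return SendPlan(path=path, length=length)


if __name__ == "__main__":
    for size in (64, 256, 4096):
        plan = choose_path(size)
        print(f"{plan.length:>5} bytes -> {plan.path}")
```

In a real library the cutoff would presumably be tuned from measurements such as those listed in the introduction, so it is left as a parameter here.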