Thread (Computing) Gajendra Singh
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Demystifying the Real-Time Linux Scheduling Latency
Demystifying the Real-Time Linux Scheduling Latency Daniel Bristot de Oliveira Red Hat, Italy [email protected] Daniel Casini Scuola Superiore Sant’Anna, Italy [email protected] Rômulo Silva de Oliveira Universidade Federal de Santa Catarina, Brazil [email protected] Tommaso Cucinotta Scuola Superiore Sant’Anna, Italy [email protected] Abstract Linux has become a viable operating system for many real-time workloads. However, the black-box approach adopted by cyclictest, the tool used to evaluate the main real-time metric of the kernel, the scheduling latency, along with the absence of a theoretically-sound description of the in-kernel behavior, sheds some doubts about Linux meriting the real-time adjective. Aiming at clarifying the PREEMPT_RT Linux scheduling latency, this paper leverages the Thread Synchronization Model of Linux to derive a set of properties and rules defining the Linux kernel behavior from a scheduling perspective. These rules are then leveraged to derive a sound bound to the scheduling latency, considering all the sources of delays occurring in all possible sequences of synchronization events in the kernel. This paper also presents a tracing method, efficient in time and memory overheads, to observe the kernel events needed to define the variables used in the analysis. This results in an easy-to-use tool for deriving reliable scheduling latency bounds that can be used in practice. Finally, an experimental analysis compares the cyclictest and the proposed tool, showing that the proposed method can find sound bounds faster with acceptable overheads. 2012 ACM Subject Classification Computer systems organization → Real-time operating systems Keywords and phrases Real-time operating systems, Linux kernel, PREEMPT_RT, Scheduling latency Digital Object Identifier 10.4230/LIPIcs.ECRTS.2020.9 Supplementary Material ECRTS 2020 Artifact Evaluation approved artifact available at https://doi.org/10.4230/DARTS.6.1.3. -
Context Switch in Linux – OS Course Memory Layout – General Picture
ContextContext switchswitch inin LinuxLinux ©Gabriel Kliot, Technion 1 Context switch in Linux – OS course Memory layout – general picture Stack Stack Stack Process X user memory Process Y user memory Process Z user memory Stack Stack Stack tss->esp0 TSS of CPU i task_struct task_struct task_struct Process X kernel Process Y kernel Process Z kernel stack stack and task_struct stack and task_struct and task_struct Kernel memory ©Gabriel Kliot, Technion 2 Context switch in Linux – OS course #1 – kernel stack after any system call, before context switch prev ss User Stack esp eflags cs … User Code eip TSS … orig_eax … tss->esp0 es Schedule() function frame esp ds eax Saved on the kernel stack during ebp a transition to task_struct kernel mode by a edi jump to interrupt and by SAVE_ALL esi macro edx thread.esp0 ecx ebx ©Gabriel Kliot, Technion 3 Context switch in Linux – OS course #2 – stack of prev before switch_to macro in schedule() func prev … Schedule() saved EAX, ECX, EDX Arguments to contex_switch() Return address to schedule() TSS Old (schedule’s()) EBP … tss->esp0 esp task_struct thread.eip thread.esp thread.esp0 ©Gabriel Kliot, Technion 4 Context switch in Linux – OS course #3 – switch_to: save esi, edi, ebp on the stack of prev prev … Schedule() saved EAX, ECX, EDX Arguments to contex_switch() Return address to schedule() TSS Old (schedule’s()) EBP … tss->esp0 ESI EDI EBP esp task_struct thread.eip thread.esp thread.esp0 ©Gabriel Kliot, Technion 5 Context switch in Linux – OS course #4 – switch_to: save esp in prev->thread.esp -
Task Scheduling Balancing User Experience and Resource Utilization on Cloud
Rochester Institute of Technology RIT Scholar Works Theses 7-2019 Task Scheduling Balancing User Experience and Resource Utilization on Cloud Sultan Mira [email protected] Follow this and additional works at: https://scholarworks.rit.edu/theses Recommended Citation Mira, Sultan, "Task Scheduling Balancing User Experience and Resource Utilization on Cloud" (2019). Thesis. Rochester Institute of Technology. Accessed from This Thesis is brought to you for free and open access by RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact [email protected]. Task Scheduling Balancing User Experience and Resource Utilization on Cloud by Sultan Mira A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Software Engineering Supervised by Dr. Yi Wang Department of Software Engineering B. Thomas Golisano College of Computing and Information Sciences Rochester Institute of Technology Rochester, New York July 2019 ii The thesis “Task Scheduling Balancing User Experience and Resource Utilization on Cloud” by Sultan Mira has been examined and approved by the following Examination Committee: Dr. Yi Wang Assistant Professor Thesis Committee Chair Dr. Pradeep Murukannaiah Assistant Professor Dr. Christian Newman Assistant Professor iii Dedication I dedicate this work to my family, loved ones, and everyone who has helped me get to where I am. iv Abstract Task Scheduling Balancing User Experience and Resource Utilization on Cloud Sultan Mira Supervising Professor: Dr. Yi Wang Cloud computing has been gaining undeniable popularity over the last few years. Among many techniques enabling cloud computing, task scheduling plays a critical role in both efficient resource utilization for cloud service providers and providing an excellent user ex- perience to the clients. -
Resource Access Control Facility (RACF) General User's Guide
----- --- SC28-1341-3 - -. Resource Access Control Facility -------_.----- - - --- (RACF) General User's Guide I Production of this Book This book was prepared and formatted using the BookMaster® document markup language. Fourth Edition (December, 1988) This is a major revision of, and obsoletes, SC28-1341- 3and Technical Newsletter SN28-l2l8. See the Summary of Changes following the Edition Notice for a summary of the changes made to this manual. Technical changes or additions to the text and illustrations are indicated by a vertical line to the left of the change. This edition applies to Version 1 Releases 8.1 and 8.2 of the program product RACF (Resource Access Control Facility) Program Number 5740-XXH, and to all subsequent versions until otherwise indicated in new editions or Technical Newsletters. Changes are made periodically to the information herein; before using this publication in connection with the operation of IBM systems, consult the latest IBM Systemj370 Bibliography, GC20-0001, for the editions that are applicable and current. References in this publication to IBM products or services do not imply that IBM· intends to make these available in all countries in which IBM operates. Any reference to an IBM product in this publication is not intended to state or imply that only IBM's product may be used. Any functionally equivalent product may be used instead. This statement does not expressly or implicitly waive any intellectual property right IBM may hold in any product mentioned herein. Publications are not stocked at the address given below. Requests for IBM publications should be made to your IBM representative or to the IBM branch office serving your locality. -
3.A Fixed-Priority Scheduling
2017/18 UniPD - T. Vardanega 10/03/2018 Standard notation : Worst-case blocking time for the task (if applicable) 3.a Fixed-Priority Scheduling : Worst-case computation time (WCET) of the task () : Relative deadline of the task : The interference time of the task : Release jitter of the task : Number of tasks in the system Credits to A. Burns and A. Wellings : Priority assigned to the task (if applicable) : Worst-case response time of the task : Minimum time between task releases, or task period () : The utilization of each task ( ⁄) a-Z: The name of a task 2017/18 UniPD – T. Vardanega Real-Time Systems 168 of 515 The simplest workload model Fixed-priority scheduling (FPS) The application is assumed to consist of tasks, for fixed At present, the most widely used approach in industry All tasks are periodic with known periods Each task has a fixed (static) priority determined off-line This defines the periodic workload model In real-time systems, the “priority” of a task is solely derived The tasks are completely independent of each other from its temporal requirements No contention for logical resources; no precedence constraints The task’s relative importance to the correct functioning of All tasks have a deadline equal to their period the system or its integrity is not a driving factor at this level A recent strand of research addresses mixed-criticality systems, with Each task must complete before it is next released scheduling solutions that contemplate criticality attributes All tasks have a single fixed WCET, which can be trusted as The ready tasks (jobs) are dispatched to execution in a safe and tight upper-bound the order determined by their (static) priority Operation modes are not considered Hence, in FPS, scheduling at run time is fully defined by the All system overheads (context-switch times, interrupt handling and so on) are assumed absorbed in the WCETs priority assignment algorithm 2017/18 UniPD – T. -
Introduction to Multi-Threading and Vectorization Matti Kortelainen Larsoft Workshop 2019 25 June 2019 Outline
Introduction to multi-threading and vectorization Matti Kortelainen LArSoft Workshop 2019 25 June 2019 Outline Broad introductory overview: • Why multithread? • What is a thread? • Some threading models – std::thread – OpenMP (fork-join) – Intel Threading Building Blocks (TBB) (tasks) • Race condition, critical region, mutual exclusion, deadlock • Vectorization (SIMD) 2 6/25/19 Matti Kortelainen | Introduction to multi-threading and vectorization Motivations for multithreading Image courtesy of K. Rupp 3 6/25/19 Matti Kortelainen | Introduction to multi-threading and vectorization Motivations for multithreading • One process on a node: speedups from parallelizing parts of the programs – Any problem can get speedup if the threads can cooperate on • same core (sharing L1 cache) • L2 cache (may be shared among small number of cores) • Fully loaded node: save memory and other resources – Threads can share objects -> N threads can use significantly less memory than N processes • If smallest chunk of data is so big that only one fits in memory at a time, is there any other option? 4 6/25/19 Matti Kortelainen | Introduction to multi-threading and vectorization What is a (software) thread? (in POSIX/Linux) • “Smallest sequence of programmed instructions that can be managed independently by a scheduler” [Wikipedia] • A thread has its own – Program counter – Registers – Stack – Thread-local memory (better to avoid in general) • Threads of a process share everything else, e.g. – Program code, constants – Heap memory – Network connections – File handles -
Parallel Programming
Parallel Programming Parallel Programming Parallel Computing Hardware Shared memory: multiple cpus are attached to the BUS all processors share the same primary memory the same memory address on different CPU’s refer to the same memory location CPU-to-memory connection becomes a bottleneck: shared memory computers cannot scale very well Parallel Programming Parallel Computing Hardware Distributed memory: each processor has its own private memory computational tasks can only operate on local data infinite available memory through adding nodes requires more difficult programming Parallel Programming OpenMP versus MPI OpenMP (Open Multi-Processing): easy to use; loop-level parallelism non-loop-level parallelism is more difficult limited to shared memory computers cannot handle very large problems MPI(Message Passing Interface): require low-level programming; more difficult programming scalable cost/size can handle very large problems Parallel Programming MPI Distributed memory: Each processor can access only the instructions/data stored in its own memory. The machine has an interconnection network that supports passing messages between processors. A user specifies a number of concurrent processes when program begins. Every process executes the same program, though theflow of execution may depend on the processors unique ID number (e.g. “if (my id == 0) then ”). ··· Each process performs computations on its local variables, then communicates with other processes (repeat), to eventually achieve the computed result. In this model, processors pass messages both to send/receive information, and to synchronize with one another. Parallel Programming Introduction to MPI Communicators and Groups: MPI uses objects called communicators and groups to define which collection of processes may communicate with each other. -
What Is an Operating System III 2.1 Compnents II an Operating System
Page 1 of 6 What is an Operating System III 2.1 Compnents II An operating system (OS) is software that manages computer hardware and software resources and provides common services for computer programs. The operating system is an essential component of the system software in a computer system. Application programs usually require an operating system to function. Memory management Among other things, a multiprogramming operating system kernel must be responsible for managing all system memory which is currently in use by programs. This ensures that a program does not interfere with memory already in use by another program. Since programs time share, each program must have independent access to memory. Cooperative memory management, used by many early operating systems, assumes that all programs make voluntary use of the kernel's memory manager, and do not exceed their allocated memory. This system of memory management is almost never seen any more, since programs often contain bugs which can cause them to exceed their allocated memory. If a program fails, it may cause memory used by one or more other programs to be affected or overwritten. Malicious programs or viruses may purposefully alter another program's memory, or may affect the operation of the operating system itself. With cooperative memory management, it takes only one misbehaved program to crash the system. Memory protection enables the kernel to limit a process' access to the computer's memory. Various methods of memory protection exist, including memory segmentation and paging. All methods require some level of hardware support (such as the 80286 MMU), which doesn't exist in all computers. -
Operating System Support for Parallel Processes
Operating System Support for Parallel Processes Barret Rhoden Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-2014-223 http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-223.html December 18, 2014 Copyright © 2014, by the author(s). All rights reserved. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission. Operating System Support for Parallel Processes by Barret Joseph Rhoden A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Computer Science in the Graduate Division of the University of California, Berkeley Committee in charge: Professor Eric Brewer, Chair Professor Krste Asanovi´c Professor David Culler Professor John Chuang Fall 2014 Operating System Support for Parallel Processes Copyright 2014 by Barret Joseph Rhoden 1 Abstract Operating System Support for Parallel Processes by Barret Joseph Rhoden Doctor of Philosophy in Computer Science University of California, Berkeley Professor Eric Brewer, Chair High-performance, parallel programs want uninterrupted access to physical resources. This characterization is true not only for traditional scientific computing, but also for high- priority data center applications that run on parallel processors. These applications require high, predictable performance and low latency, and they are important enough to warrant engineering effort at all levels of the software stack. -
Real-Time Performance During CUDA™ a Demonstration and Analysis of Redhawk™ CUDA RT Optimizations
A Concurrent Real-Time White Paper 2881 Gateway Drive Pompano Beach, FL 33069 (954) 974-1700 www.concurrent-rt.com Real-Time Performance During CUDA™ A Demonstration and Analysis of RedHawk™ CUDA RT Optimizations By: Concurrent Real-Time Linux® Development Team November 2010 Overview There are many challenges to creating a real-time Linux distribution that provides guaranteed low process-dispatch latencies and minimal process run-time jitter. Concurrent Real Time’s RedHawk Linux distribution meets and exceeds these challenges, providing a hard real-time environment on many qualified hardware configurations, even in the presence of a heavy system load. However, there are additional challenges faced when guaranteeing real-time performance of processes while CUDA applications are simultaneously running on the system. The proprietary CUDA driver supplied by NVIDIA® frequently makes demands upon kernel resources that can dramatically impact real-time performance. This paper discusses a demonstration application developed by Concurrent to illustrate that RedHawk Linux kernel optimizations allow hard real-time performance guarantees to be preserved even while demanding CUDA applications are running. The test results will show how RedHawk performance compares to CentOS performance running the same application. The design and implementation details of the demonstration application are also discussed in this paper. Demonstration This demonstration features two selectable real-time test modes: 1. Jitter Mode: measure and graph the run-time jitter of a real-time process 2. PDL Mode: measure and graph the process-dispatch latency of a real-time process While the demonstration is running, it is possible to switch between these different modes at any time. -
Unit: 4 Processes and Threads in Distributed Systems
Unit: 4 Processes and Threads in Distributed Systems Thread A program has one or more locus of execution. Each execution is called a thread of execution. In traditional operating systems, each process has an address space and a single thread of execution. It is the smallest unit of processing that can be scheduled by an operating system. A thread is a single sequence stream within in a process. Because threads have some of the properties of processes, they are sometimes called lightweight processes. In a process, threads allow multiple executions of streams. Thread Structure Process is used to group resources together and threads are the entities scheduled for execution on the CPU. The thread has a program counter that keeps track of which instruction to execute next. It has registers, which holds its current working variables. It has a stack, which contains the execution history, with one frame for each procedure called but not yet returned from. Although a thread must execute in some process, the thread and its process are different concepts and can be treated separately. What threads add to the process model is to allow multiple executions to take place in the same process environment, to a large degree independent of one another. Having multiple threads running in parallel in one process is similar to having multiple processes running in parallel in one computer. Figure: (a) Three processes each with one thread. (b) One process with three threads. In former case, the threads share an address space, open files, and other resources. In the latter case, process share physical memory, disks, printers and other resources. -
Hiding Process Memory Via Anti-Forensic Techniques
DIGITAL FORENSIC RESEARCH CONFERENCE Hiding Process Memory via Anti-Forensic Techniques By: Frank Block (Friedrich-Alexander Universität Erlangen-Nürnberg (FAU) and ERNW Research GmbH) and Ralph Palutke (Friedrich-Alexander Universität Erlangen-Nürnberg) From the proceedings of The Digital Forensic Research Conference DFRWS USA 2020 July 20 - 24, 2020 DFRWS is dedicated to the sharing of knowledge and ideas about digital forensics research. Ever since it organized the first open workshop devoted to digital forensics in 2001, DFRWS continues to bring academics and practitioners together in an informal environment. As a non-profit, volunteer organization, DFRWS sponsors technical working groups, annual conferences and challenges to help drive the direction of research and development. https://dfrws.org Forensic Science International: Digital Investigation 33 (2020) 301012 Contents lists available at ScienceDirect Forensic Science International: Digital Investigation journal homepage: www.elsevier.com/locate/fsidi DFRWS 2020 USA d Proceedings of the Twentieth Annual DFRWS USA Hiding Process Memory Via Anti-Forensic Techniques Ralph Palutke a, **, 1, Frank Block a, b, *, 1, Patrick Reichenberger a, Dominik Stripeika a a Friedrich-Alexander Universitat€ Erlangen-Nürnberg (FAU), Germany b ERNW Research GmbH, Heidelberg, Germany article info abstract Article history: Nowadays, security practitioners typically use memory acquisition or live forensics to detect and analyze sophisticated malware samples. Subsequently, malware authors began to incorporate anti-forensic techniques that subvert the analysis process by hiding malicious memory areas. Those techniques Keywords: typically modify characteristics, such as access permissions, or place malicious data near legitimate one, Memory subversion in order to prevent the memory from being identified by analysis tools while still remaining accessible.