Parallel and Distributed Computing Chapter 1: Introduction to Parallel Computing


Parallel and Distributed Computing. Chapter 1: Introduction to Parallel Computing. Jun Zhang, Department of Computer Science, University of Kentucky, Lexington, KY 40506 (CS621)

1.1a: von Neumann Architecture
- The common machine model for over 70 years
- Stored-program concept: the CPU executes a stored program
- A sequence of read and write operations on the memory
- The order of operations is sequential

1.1b: A More Detailed Architecture Based on the von Neumann Model

1.1c: Old von Neumann Computer

1.1d: CISC von Neumann Computer
- CISC stands for Complex Instruction Set Computer, with a single bus system
- The Harvard (RISC) architecture uses two buses: separate pathways for instructions and data
- RISC stands for Reduced Instruction Set Computer
- Both are SISD machines: a Single Instruction stream on a Single Data stream

1.1e: Personal Computer

1.1f: John von Neumann
- December 28, 1903 – February 8, 1957; Hungarian mathematician
- Mastered calculus at 8; graduate-level math at 12; Ph.D. at 23
- His proposal to his first wife: "You and I might be able to have some fun together, seeing as how we both like to drink."
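The stored-program concept above can be made concrete with a toy interpreter: program and data live in the same memory, and a single CPU fetches, decodes, and executes one instruction at a time, strictly in sequence. This is an illustrative sketch with invented opcode names, not any real machine's instruction set.

```python
# Toy von Neumann machine: one memory holds both program and data,
# and execution is a strictly sequential fetch-decode-execute loop.

def run(memory):
    acc = 0   # accumulator register
    pc = 0    # program counter
    while True:
        op, arg = memory[pc]          # fetch the instruction at pc
        pc += 1                       # order of operations is sequential
        if op == "LOAD":              # read a memory cell into the accumulator
            acc = memory[arg]
        elif op == "ADD":             # read a memory cell, add to accumulator
            acc += memory[arg]
        elif op == "STORE":           # write the accumulator back to memory
            memory[arg] = acc
        elif op == "HALT":
            return memory

# Cells 0-3 hold the program; cells 4-6 hold the data -- same memory.
mem = {0: ("LOAD", 4), 1: ("ADD", 5), 2: ("STORE", 6), 3: ("HALT", None),
       4: 2, 5: 3, 6: 0}
print(run(mem)[6])  # 5
```

Because every step goes through the single memory, this model is inherently one-instruction-at-a-time, which is exactly the sequential bottleneck parallel computing tries to escape.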
1.2a: Motivations for Parallel Computing
- Fundamental limits on single-processor speed
- Heat dissipation from CPU chips
- Disparity between CPU and memory speeds
- Distributed data communications
- Need for very large scale computing platforms

1.2b: Fundamental Limits – Cycle Speed
- Intel 8080: 2 MHz (1974)
- ARM 2: 8 MHz (1986)
- Intel Pentium Pro: 200 MHz (1996)
- AMD Athlon: 1.2 GHz (2000)
- Intel QX6700: 2.66 GHz (2006)
- Intel Core i7 3770K: 3.9 GHz (2013)
- Speed of light: 30 cm in 1 ns; a signal travels about 10 times slower

1.2c: High-End CPUs are Expensive
- The price of a high-end CPU rises sharply with performance (Intel processor price/performance)

1.2d: Moore's Law
- Moore's observation in 1965: the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented
- Moore's revised observation in 1975: the pace had slowed a bit, but data density had doubled approximately every 18 months
- How about the future? (Does the price of computing power fall by half every 18 months?)

1.2e: Moore's Law – Held for Now

1.3: Power Wall Effect in Computer Architecture
- Too many transistors in a given chip die area
- Tremendous increase in power density
- Increased chip temperature
- High temperature slows the transistor switching rate and the overall speed of the computer
- The chip may melt down if not cooled properly
- Efficient cooling systems are expensive

1.3: Cooling Computer Chips
- Some people suggest putting computer chips in liquid nitrogen to cool them

1.3: Solutions
- Use multiple inexpensive processors
- A processor with multiple cores

1.3: A Multi-core Processor

1.3a: CPU and Memory Speeds
- In 20 years, CPU speed (clock rate) has increased by a factor of 1000
- DRAM speed has increased only by a factor of less than 4
- How to feed data fast enough to keep the CPU busy?
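Caches are the standard answer to the CPU-DRAM gap, and the average memory access time (AMAT) formula quantifies how much they help. A quick sketch using representative numbers (about a 10 ns cache and a 60 ns DRAM penalty, in line with the figures quoted in these slides):

```python
def amat(hit_time_ns, miss_penalty_ns, hit_rate):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + (1.0 - hit_rate) * miss_penalty_ns

# With a 10 ns cache and a 60 ns DRAM penalty:
print(round(amat(10, 60, 0.95), 1))  # 13.0 ns at a 95% hit rate
print(amat(10, 60, 0.50))            # 40.0 ns at a 50% hit rate
```

The gap between the two results is why locality of reference, and the multilevel caches discussed below, matter so much for keeping the CPU fed.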
- CPU speed: 1–2 ns; DRAM speed: 50–60 ns; cache: 10 ns

1.3b: Memory Access and CPU Speed

1.3b: CPU, Memory, and Disk Speed

1.3c: Possible Solutions
- A hierarchy of successively faster memory devices (multilevel caches)
- Locality of data reference (code)
- Efficient programming can be an issue
- Parallel systems may provide: 1) a larger aggregate cache; 2) higher aggregate bandwidth to the memory system

1.3f: Multilevel Hierarchical Cache

1.4a: Distributed Data Communications
- Data may be collected and stored at different locations
- It is expensive to bring them to a central location for processing
- Many computing assignments may be inherently parallel
- Privacy issues in data mining and other large-scale commercial database manipulations

1.4b: Distributed Data Communications

1.4c: Move Computation to Data (CS626: Large Scale Data Science)

1.5a: Why Use Parallel Computing
- Save time: reduce wall-clock time; many processors work together
- Solve larger problems: larger than one processor's CPU and memory can handle
- Provide concurrency: do multiple things at the same time (online access to databases, search engines)
- Google's 4,000 PC servers were once one of the largest clusters in the world (a server farm)

1.4b: Google's Data Center

1.5b: Other Reasons for Parallel Computing
- Taking advantage of non-local resources: using computing resources on a wide-area network, or even the Internet (grid or cloud computing)
- Cost savings: using multiple "cheap" computing resources instead of a high-end CPU
- Overcoming memory constraints: for large problems, using the memories of multiple computers may overcome the memory constraint obstacle

1.6a: Need for Large Scale Modeling
- Long-term weather forecasting
- Large-scale ocean modeling
- Oil reservoir simulations
- Car and airplane manufacturing
- Semiconductor simulation
- Pollution tracking
- Large-scale commercial databases
- Aerospace (NASA microgravity modeling)

1.6b: Semiconductor Simulation
- Before 1975, an engineer had to make several runs through the fabrication line until a successful device was fabricated
- Device dimensions shrink below 0.1 micrometer
- A fabrication line costs 1.0 billion dollars to build
- A design must be thoroughly verified before it is committed to silicon
- A realistic simulation of one diffusion process may take days or months to run on a workstation
- Chip prices drop quickly after entering the market

1.4b: Semiconductor Diffusion Process

1.6c: Drug Design
- Most drugs work by binding to a specific site, called a receptor, on a protein
- A central problem is to find molecules (ligands) with high binding affinity
- Need to accurately and efficiently estimate electrostatic forces in molecular and atomic interactions
- Calculate drug-protein binding energies from quantum mechanics, statistical mechanics, and simulation techniques

1.6d: Computing Protein Binding

1.6d: Computer Aided Drug Design

1.7: Issues in Parallel Computing
- Design of parallel computers
- Design of efficient parallel algorithms
- Methods for evaluating parallel algorithms
- Parallel computer languages
- Parallel programming tools
- Portable parallel programs
- Automatic programming of parallel computers
- Education in the parallel computing philosophy

1.8: Eclipse Parallel Tools Platform
- A standard, portable parallel integrated development environment that supports a wide range of parallel architectures and runtime systems (IBM)
- A scalable parallel debugger
- Support for the integration of a wide range of parallel tools
- An environment that simplifies end-user interaction with parallel systems

1.9: Message Passing Interface (MPI)
- We will use MPI for hands-on experience in this class
- The Message Passing Interface can be downloaded from an online website (MPICH)
- Parallel computing can be simulated on your own computer (Microsoft MPI)
- UK can provide distributed computing services for research purposes, for free

1.10: Cloud Computing
- Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided over the Internet
- Users need not have knowledge of, expertise in, or control over the technology infrastructure in the "cloud" that supports them
- Compare to: grid computing (a cluster of networked, loosely coupled computers), utility computing (the packaging of computing resources as a metered service), and autonomic computing (computer systems capable of self-management)

1.11: Cloud Computing
- Diagram by Sam Johnston, via Wikimedia Commons

1.12: Cloud Computing
- A means to increase computing capacity or add computing capabilities at any time without investing in new infrastructure, training new personnel, or licensing new software

1.13–1.15: Cloud Computing (figures)

1.16: Top Players in Cloud Computing
- Amazon.com (IaaS, most comprehensive)
- VMware (vCloud, software to build clouds)
- Microsoft (PaaS, IaaS)
- Salesforce.com (SaaS)
- Google (IaaS, PaaS, Google App Engine)
- Rackspace (IaaS)
- IBM (OpenStack)
- Citrix (competes with VMware; free cloud operating system)
- Joyent (competes with VMware, OpenStack, Citrix)
- SoftLayer (web hosting service provider)

1.17: Parallel and Distributed Computing (CS621)
- This class is about parallel and distributed computing algorithms and applications
- Algorithms for communication between processors
- Algorithms to solve scientific computing problems
- Hands-on experience with the Message Passing Interface (MPI), writing parallel programs to solve some problems
- This class is NOT about parallel data processing (CS626)
Hands-on Experience
- You can download and install a copy of Microsoft MPI (Message Passing Interface) at https://www.microsoft.com/en-us/download/details.aspx?id=57467
- You also need MS Visual Studio to work with it: https://visualstudio.microsoft.com/
- Here is a video demonstrating how to set up MS Visual Studio for MS MPI programming: https://www.youtube.com/watch?v=IW3SKDM_yEs&t=330s
- Please make sure that you have installed MS Visual Studio and MS MPI on your computer
- Please run the sample MPI "Hello World" program: https://mpitutorial.com/tutorials/mpi-hello-world/
- If you have serious research projects that need a lot of computing resources, you can also request an account on a supercomputer at the UK Center for Computational Science
- Go to the Center for Computational Science, get a form to fill out, and ask your advisor to sign it: https://www.ccs.uky.edu/
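The MPI "Hello World" linked above prints one greeting per rank. The rank/size idea can be previewed with Python's standard library alone; this sketch uses `multiprocessing` pipes in place of real MPI calls, so it is an analogy for the pattern, not MPI itself.

```python
# MPI-style "Hello World": each worker ("rank") sends a greeting to rank 0,
# which gathers and prints them -- mimicked here with multiprocessing pipes.
from multiprocessing import Pipe, Process

def worker(rank, size, conn):
    # Each rank sends its greeting to rank 0, like MPI_Send.
    conn.send(f"Hello from rank {rank} of {size}")
    conn.close()

def gather_greetings(size):
    # Rank 0 spawns the other ranks and receives one message from each,
    # like MPI_Recv in a loop (or MPI_Gather).
    pipes, procs = [], []
    for rank in range(1, size):
        parent, child = Pipe()
        p = Process(target=worker, args=(rank, size, child))
        p.start()
        pipes.append(parent)
        procs.append(p)
    messages = [f"Hello from rank 0 of {size}"] + [c.recv() for c in pipes]
    for p in procs:
        p.join()
    return messages

if __name__ == "__main__":
    for line in gather_greetings(4):
        print(line)
```

Unlike real MPI, all processes here run on one machine and the "communicator" is just a list of pipes, but the rank-0-gathers pattern is the same one the MPI tutorial demonstrates.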
Recommended publications
  • Distributed & Cloud Computing: Lecture 1
    Distributed & cloud computing: Lecture 1. Chokchai "Box" Leangsuksun, SWECO Endowed Professor, Computer Science, Louisiana Tech University, [email protected]; CTO, PB Tech International Inc., [email protected]

    Class Info
    - Class hours: noon–1:50 pm T-Th
    - Dr. Box's contact info: www.latech.edu/~box; phone 318-257-3291; email [email protected]
    - Main text: 1) Distributed Systems: Concepts and Design, by Coulouris, Dollimore, Kindberg and Blair, Edition 5, © Addison-Wesley 2012; 2) related publications & online literature
    - Objective: the theory, design and implementation of distributed systems; discussion of abstractions, concepts and current systems/applications; best practice and research on DS & cloud computing

    Course Contents
    - The characterization of distributed computing and cloud computing
    - System models
    - Networking and inter-process communication
    - OS support and virtualization
    - RAS, performance & reliability modeling; security
    - Introduction to cloud computing: various models and applications; deployment models; service models (SaaS, PaaS, IaaS, XaaS); public cloud: Amazon AWS; private cloud: OpenStack; cost models (cloud vs. hosting your own)
    - Case studies: introduction to HPC; multicore & OpenMP; manycore, GPGPU & CUDA; cloud-based EKG system; distributed objects & web services (if time allows); distributed file systems (if time allows); replication & disaster recovery preparedness

    Important Dates
    - Apr 10: midterm exam
    - Apr 22: term paper & presentation due
    - May 15: final exam

    Evaluation
    - 20% lab exercises/quizzes; 20% programming assignments; 10% term paper; 20% midterm exam; 30% final exam

    Intro to Distributed Computing
    - Distributed system definitions
    - Distributed systems examples: the Internet, intranets, mobile computing, cloud computing
    - Resource sharing
    - The World Wide Web
    - Distributed systems challenges
    (Based on Mostafa's lecture on distributed systems)
  • Distributed Algorithms with Theoretic Scalability Analysis of Radial and Looped Load flows for Power Distribution Systems
    Electric Power Systems Research 65 (2003) 169–177, www.elsevier.com/locate/epsr. Distributed algorithms with theoretic scalability analysis of radial and looped load flows for power distribution systems. Fangxing Li*, Robert P. Broadwater, ECE Department, Virginia Tech, Blacksburg, VA 24060, USA. Received 15 April 2002; received in revised form 14 December 2002; accepted 16 December 2002.

    Abstract: This paper presents distributed algorithms for both radial and looped load flows for unbalanced, multi-phase power distribution systems. The distributed algorithms are developed from tree-based sequential algorithms. Formulas of scalability for the distributed algorithms are presented. It is shown that computation time dominates communication time in the distributed computing model. This provides benefits to real-time load flow calculations, network reconfigurations, and optimization studies that rely on load flow calculations. Also, test results match the predictions of the derived formulas. This shows the formulas can be used to predict the computation time when additional processors are involved. © 2003 Elsevier Science B.V. All rights reserved.

    Keywords: Distributed computing; Scalability analysis; Radial load flow; Looped load flow; Power distribution systems

    1. Introduction: Parallel and distributed computing has been applied to many scientific and engineering computations such as weather forecasting and nuclear simulations [1,2]. It has also been applied to power system analysis calculations [3–14]. Also, the method presented in Ref. [10] was tested in radial distribution systems with no more than 528 buses. More recent works [11–14] presented distributed implementations for power flows or power-flow-based algorithms like optimizations and contingency analysis. These works also targeted power transmission systems.
  • Grid Computing: What Is It, and Why Do I Care?*
    Grid Computing: What Is It, and Why Do I Care?* Ken MacInnis <[email protected]> (* Or, "Mi caja es su caja!") (c) Ken MacInnis 2004

    Outline: Introduction and Motivation; Examples; Architecture, Components, Tools; Lessons Learned and The Future; Questions?

    What is "grid computing"? Many different definitions:
    - Utility computing: cycles for sale
    - Distributed computing: distributed.net RC5, SETI@Home
    - High-performance resource sharing: clusters, storage, visualization, networking
    - "We will probably see the spread of 'computer utilities', which, like present electric and telephone utilities, will service individual homes and offices across the country." Len Kleinrock (1969)
    - The word "grid" doesn't equal Grid Computing: Sun Grid Engine is a mere scheduler!

    Better definitions:
    - Common protocols allowing large problems to be solved in a distributed multi-resource multi-user environment.
    - "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities." Kesselman & Foster (1998)
    - "…coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations." Kesselman, Foster, Tuecke (2000)

    New Challenges for Computing
    - Grid computing evolved out of a need to share resources
    - Flexible, ever-changing "virtual organizations": high-energy physics, astronomy, more
    - Differing site policies with common needs; disparate computing needs
  • Computing at Massive Scale: Scalability and Dependability Challenges
    This is a repository copy of Computing at massive scale: Scalability and dependability challenges. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/105671/ Version: Accepted Version. Proceedings Paper: Yang, R and Xu, J orcid.org/0000-0002-4598-167X (2016) Computing at massive scale: Scalability and dependability challenges. In: Proceedings - 2016 IEEE Symposium on Service-Oriented System Engineering, SOSE 2016. 2016 IEEE Symposium on Service-Oriented System Engineering (SOSE), 29 Mar - 02 Apr 2016, Oxford, United Kingdom. IEEE, pp. 386-397. ISBN 9781509022533. https://doi.org/10.1109/SOSE.2016.73 (c) 2016 IEEE.
  • Lecture 18: Parallel and Distributed Systems
    Lecture 18: Parallel and Distributed Systems
    - Classification of parallel/distributed architectures
    - SMPs
    - Distributed systems
    - Clusters

    What is a distributed system? From various textbooks:
    - "A distributed system is a collection of independent computers that appear to the users of the system as a single computer."
    - "A distributed system consists of a collection of autonomous computers linked to a computer network and equipped with distributed system software."
    - "A distributed system is a collection of processors that do not share memory or a clock."
    - "Distributed systems is a term used to define a wide range of computer systems from a weakly-coupled system such as wide area networks, to very strongly coupled systems such as multiprocessor systems."

    What is a distributed system? (cont.)
    - A distributed system is a set of physically separate processors (P1, P2, P3, P4, P5) connected by one or more communication links (a network)
    - Is every system with >2 computers a distributed system? Email, ftp, telnet, the world-wide web, network printer access, network file access, network file backup: we don't usually consider these to be distributed systems…

    Flynn's classification (Michael Flynn, 1966):
    - SISD — single instruction, single data
    - SIMD — single instruction, multiple data
    - MISD — multiple instruction, single data
    - MIMD — multiple instruction, multiple data
    More recent: Stallings (1993). MIMD classification: parallel and distributed architectures; classification of the OS of MIMD computers: tightly vs. loosely coupled
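Flynn's categories in the excerpt above can be caricatured in a few lines: SISD applies one instruction stream to one data item per step, while SIMD applies the same instruction across many data elements at once. This is only an illustration (here `map` simulates the vector step; real SIMD runs in hardware lanes):

```python
# Flynn's taxonomy, caricatured: one instruction ("double it") applied
# SISD-style (one datum per step) vs SIMD-style (whole vector per "step").
data = [1, 2, 3, 4]

# SISD: a single instruction stream touches a single data item per step.
sisd = []
for x in data:
    sisd.append(2 * x)

# SIMD: the same instruction is applied to every lane at once
# (simulated with map; hardware would use vector registers).
simd = list(map(lambda x: 2 * x, data))

assert sisd == simd
print(simd)  # [2, 4, 6, 8]
```

MISD (rare in practice) would run different instructions over the same datum, and MIMD — the class covering SMPs, clusters, and distributed systems — runs independent instruction streams over independent data.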
  • Hybrid Computing – Co-Processors/Accelerators Power-Aware Computing – Performance of Applications Kernels
    C-DAC Four Days Technology Workshop on Hybrid Computing – Co-Processors/Accelerators, Power-aware Computing – Performance of Applications Kernels. hyPACK-2013 (Mode-1: Multi-Core). Lecture topic: Multi-Core Processors: Algorithms & Applications Overview, Part II. Venue: CMSD, UoHYD; date: October 15-18, 2013.

    Killer Apps of Tomorrow. Application-driven architectures: analyzing tera-scale workloads
    - Workload convergence: the basic algorithms shared by these high-end workloads
    - Platform implications: how workload analysis guides future architectures
    - Programmer productivity: optimized architectures will ease the development of software
    - Call to action: benchmark suites in critical need of redress

    Emerging "Killer Apps of Tomorrow": Parallel Algorithms Design
    - Static and dynamic load balancing; mapping for load balancing
    - Minimizing interaction overheads in parallel algorithm design; data-sharing overheads
    - General techniques for choosing the right parallel algorithm: maximize data locality; minimize volume of data; minimize frequency of interactions; overlap computations with interactions; data replication; minimize contention and hot spots; use highly optimized collective interaction operations; collective data transfers and computations; maximize concurrency
  • Cost Model: Work, Span and Parallelism 1 Overview of Cilk
    CSE 539, 01/15/2015. Cost Model: Work, Span and Parallelism. Lecture 2. Scribe: Angelina Lee. Outline of this lecture: 1. Overview of Cilk; 2. The dag computation model; 3. Performance measures; 4. A simple greedy scheduler.

    1 Overview of Cilk. A brief history of Cilk technology: In this class, we will overview various parallel programming technologies. One of the technologies we will examine closely is Cilk, a C/C++ based concurrency platform. Before we overview the development and implementation of Cilk, we shall first review a brief history of Cilk technology to account for where the major concepts originate. Cilk technology has developed and evolved over more than 15 years since its origin at MIT. The text under this subheading is partially abstracted from the "Cilk" entry in Encyclopedia of Distributed Computing [14] with the author's consent. I invite interested readers to go through the original entry for a more complete review of the history. Cilk (pronounced "silk") is a linguistic and runtime technology for algorithmic multithreaded programming originally developed at MIT. The philosophy behind Cilk is that a programmer should concentrate on structuring her or his program to expose parallelism and exploit locality, leaving Cilk's runtime system with the responsibility of scheduling the computation to run efficiently on a given platform. The Cilk runtime system takes care of details like load balancing, synchronization, and communication protocols. Cilk is algorithmic in that the runtime system guarantees efficient and predictable performance. Important milestones in Cilk technology include the original Cilk-1 [2, 1, 11], Cilk-5 [7, 18, 5, 17], and the commercial Cilk++ [15] by Cilk Arts.
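The work/span cost model named in this lecture's title assigns every computation dag two numbers: T1 (work, the total number of operations) and T-infinity (span, the longest chain of dependencies); their ratio T1/T-infinity is the parallelism. A small sketch computing both for the classic parallel Fibonacci dag, under the simplifying assumption of unit cost per node:

```python
# Work (T1) and span (T_inf) of the parallel fib(n) dag, unit cost per node.
from functools import lru_cache

@lru_cache(maxsize=None)
def work(n):
    # T1: every node of the dag is counted -- both recursive calls in full.
    if n < 2:
        return 1
    return work(n - 1) + work(n - 2) + 1

@lru_cache(maxsize=None)
def span(n):
    # T_inf: with the two calls spawned in parallel (as with cilk_spawn),
    # only the longer of the two dependency chains contributes.
    if n < 2:
        return 1
    return max(span(n - 1), span(n - 2)) + 1

n = 20
print(work(n), span(n))            # 21891 20
print(work(n) / span(n))           # parallelism = T1 / T_inf, about 1095
```

The work grows exponentially while the span grows only linearly in n, so the parallelism of this dag is enormous, which is exactly why it is the standard teaching example for work-stealing schedulers like Cilk's.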
  • Difference Between Grid Computing Vs. Distributed Computing
    Difference between Grid Computing vs. Distributed Computing

    Definition of Distributed Computing: Distributed Computing is an environment in which a group of independent and geographically dispersed computer systems take part to solve a complex problem, each by solving a part of the solution and then combining the results from all computers. These are loosely coupled systems coordinately working toward a common goal. It can be defined as: 1. A computing system in which services are provided by a pool of computers collaborating over a network. 2. A computing environment that may involve computers of differing architectures and data representation formats that share data and system resources. Distributed Computing, or the use of a computational cluster, is defined as the application of resources from multiple computers, networked in a single environment, to a single problem at the same time, usually a scientific or technical problem that requires a great number of computer processing cycles or access to large amounts of data. The concept of distributed computing is simple: pull together and employ all available resources to speed up computing. The key distinction between distributed computing and grid computing is mainly the way resources are managed. Distributed computing uses a centralized resource manager, and all nodes cooperatively work together as a single unified resource or system. Grid computing utilizes a structure where each node has its own resource manager and the system does not act as a single unit.

    Definition of Grid Computing: The basic idea behind Grid Computing is to utilize the idle CPU cycles and storage of millions of computer systems across a worldwide network, functioning as a flexible, pervasive, and inexpensively accessible pool that can be harnessed by anyone who needs it, similar to the way power companies and their users share the electrical grid.
  • Distributed GPU-Based K-Means Algorithm for Data-Intensive Applications: Large-Sized Image Segmentation Case
    (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 8, No. 12, 2017 Distributed GPU-Based K-Means Algorithm for Data-Intensive Applications: Large-Sized Image Segmentation Case Hicham Fakhi*, Omar Bouattane, Mohamed Youssfi, Hassan Ouajji Signals, Distributed Systems and Artificial Intelligence Laboratory ENSET, University Hassan II, Casablanca, Morocco Abstract—K-means is a compute-intensive iterative Nonetheless, there are two important factors to consider algorithm. Its use in a complex scenario is cumbersome, when doing image segmentation. First is the number of images specifically in data-intensive applications. In order to accelerate to be processed in a given use case. Second is the image quality the K-means running time for data-intensive application, such as which has known an important evolution during the last few large sized image segmentation, we use a distributed multi-agent years, i.e., the number of pixels that make up an image has system accelerated by GPUs. In this K-means version, the input been multiplied by 200 from the 720 x 480 pixels to 9600 x image data are divided into subsets of image data which can be 7200 pixels. This has resulted in a much better definition of the performed independently on GPUs. In each GPU, we offloaded image (more detail visibility) and more nuances in the colors the data assignment and the K-centroids recalculation steps of and shades. the K-means algorithm for a massively parallel processing. We have implemented this K-means version on the Nvidia GPU with Thus, during last decade, image processing techniques have Compute Unified Device Architecture.
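The two steps this paper offloads to the GPU — assigning each pixel to its nearest centroid, then recomputing each centroid as the mean of its cluster — are easy to state in pure Python. This sketch is a scalar, 1-D illustration of one K-means iteration, not the authors' distributed CUDA implementation:

```python
# One K-means iteration: assignment step, then centroid-update step.
def kmeans_step(points, centroids):
    clusters = [[] for _ in centroids]
    for p in points:                      # assignment: nearest centroid wins
        best = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
        clusters[best].append(p)
    # update: each new centroid is the mean of the points assigned to it
    return [sum(c) / len(c) if c else m
            for c, m in zip(clusters, centroids)]

pixels = [1.0, 1.2, 0.8, 8.0, 8.5, 9.1]   # e.g., grayscale intensities
print([round(c, 2) for c in kmeans_step(pixels, [0.0, 10.0])])  # [1.0, 8.53]
```

Both loops are data-parallel over the points, which is why the paper can split the image into subsets processed independently on GPUs: the assignment of one pixel never depends on another pixel's assignment within an iteration.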
  • Parallel Sorting Algorithms in Intel Cilk Plus
    International Journal of Hybrid Information Technology Vol. 7, No. 2 (2014), pp. 151-164, http://dx.doi.org/10.14257/ijhit.2014.7.2.15. Multi-Core Program Optimization: Parallel Sorting Algorithms in Intel Cilk Plus. Sabahat Saleem1, M. IkramUllah Lali1, M. Saqib Nawaz1* and Abou Bakar Nauman2. 1Department of Computer Science & IT, University of Sargodha, Sargodha, Pakistan. 2Department of CS & IT, Sarhad University of Science & IT, Peshawar, Pakistan. * [email protected]

    Abstract: New performance leaps have been achieved with multiprogramming and multi-core systems. Present parallel programming techniques and environments need significant changes in programs to accomplish parallelism, and also constitute complex, confusing and error-prone constructs and rules. Intel Cilk Plus is a C-based computing system that presents a straightforward and well-structured model for the development, verification and analysis of multi-core and parallel programming. In this article, two programs are developed using Intel Cilk Plus. Two sequential sorting programs in C/C++ language are converted to multi-core programs in the Intel Cilk Plus framework to achieve parallelism and better performance. The converted program in Cilk Plus is then checked for various conditions using tools of Cilk, and after that, a comparison of the performance and speedup achieved over the single-core sequential program is discussed and reported.

    Keywords: Single-Core, Multi-Core, Cilk, Sorting Algorithms, Quick Sort, Merge Sort, Parallelism, Speedup

    1. Introduction: With current technology, improving the number of transistors does not increase the computing capability of a system. One of the new techniques used to increase the performance of a system is the development of multi-core processors.
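The Cilk-style pattern behind parallel merge sort — spawn the left half, recurse on the right half in the current worker, sync, then merge — can be sketched in Python with a process pool standing in for the Cilk runtime. This is only an analogy: Cilk's work-stealing scheduler makes spawns far cheaper than spawning OS processes.

```python
# Cilk-style parallel merge sort: spawn the left half, sort the right half
# in the parent, sync, then merge. A process pool mimics cilk_spawn/cilk_sync.
from concurrent.futures import ProcessPoolExecutor

def merge(a, b):
    # Merge two already-sorted lists into one sorted list.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def merge_sort(xs):
    # The plain sequential recursion.
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(merge_sort(xs[:mid]), merge_sort(xs[mid:]))

def parallel_merge_sort(xs):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    with ProcessPoolExecutor(max_workers=2) as pool:
        left = pool.submit(merge_sort, xs[:mid])   # ~ cilk_spawn
        right = merge_sort(xs[mid:])               # parent continues working
        return merge(left.result(), right)         # ~ cilk_sync, then merge

if __name__ == "__main__":
    print(parallel_merge_sort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]
```

In real Cilk Plus the recursion itself would spawn at every level and the runtime would load-balance the subproblems; here only the top-level split is parallel, which is enough to show the spawn/sync shape the paper measures.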
  • Advanced Architectures Goals of Distributed Computing Evaluating
    Advanced Architectures (6/7/2018)
    - 15A. Distributed Computing
    - 15B. Multi-Processor Systems
    - 15C. Tightly Coupled Distributed Systems
    - 15D. Loosely Coupled Distributed Systems
    - 15E. Cloud Models
    - 15F. Distributed Middle-Ware

    Goals of Distributed Computing
    - Better services: scalability (apps too big to run on a single computer; grow system capacity to meet growing demand); improved reliability and availability; improved ease of use, reduced CapEx/OpEx
    - New services: applications that span multiple system boundaries; global resource domains, services (vs. systems); complete location transparency

    Major Classes of Distributed Systems
    - Symmetric Multi-Processors (SMP): multiple CPUs, sharing memory and I/O devices
    - Single-System Image (SSI) & cluster computing: a group of computers acting like a single computer
    - Loosely coupled, horizontally scalable systems: coordinated, but relatively independent systems
    - Application-level distributed computing: peer-to-peer, application-level protocols; distributed middle-ware platforms

    Evaluating Distributed Systems
    - Performance: overhead, scalability, availability
    - Functionality: adequacy and abstraction for target applications
    - Transparency: compatibility with previous platforms; scope and degree of location independence
    - Degree of coupling: on how many things do distinct systems agree, and how is that agreement achieved

    SMP Systems and Goals. Symmetric Multi-Processors: characterization:
  • Operating System Architecture Based on Distributed Objects
    A. Yu. Burtsev, L. B. Ryzhyk, Yu. O. Timoshenko. OPERATING SYSTEM ARCHITECTURE BASED ON DISTRIBUTED OBJECTS

    Introduction. Today the computing power of networks of workstations has become comparable to that of modern supercomputers. The practical implementation of high-performance distributed computations obviously requires a special software platform. At present, this platform is usually implemented by application-level parallel programming libraries, providing software developers with distributed communication and synchronization facilities [1,2]. Our claim is that the problem can be addressed in full measure by an operating system providing processes with transparent access to all resources of the distributed system. Such an operating system would allow users and application programmers to think of the computer network as a single multiprocessor with common memory.

    …located in different nodes. However, efficient access to resources can occur only locally. E1 enables access locality by making an object physically distributed, i.e. storing a complete or partial copy of its state in several network nodes. The internal organization of an object allows minimizing interactions with remote nodes during execution of operations on objects, thus neutralizing the influence of communication latencies on overall system performance. Load balancing: the E1 operating system supports load balancing, i.e. dynamic distribution of computational payload between system nodes. Support for redundancy mechanisms: E1 allows applications to transparently use distributed algorithms for redundant data storage and execution. Component model support: E1 has a component architecture and supports the component-oriented software development paradigm. Both operating system services and application software are developed within the framework of a single E1 component model. The component model provides the following services: