Computer Systems Architecture
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Balanced Multithreading: Increasing Throughput Via a Low Cost Multithreading Hierarchy
Balanced Multithreading: Increasing Throughput via a Low Cost Multithreading Hierarchy Eric Tune Rakesh Kumar Dean M. Tullsen Brad Calder Computer Science and Engineering Department University of California at San Diego {etune,rakumar,tullsen,calder}@cs.ucsd.edu Abstract ibility of SMT comes at a cost. The register file and rename tables must be enlarged to accommodate the architectural reg- A simultaneous multithreading (SMT) processor can issue isters of the additional threads. This in turn can increase the instructions from several threads every cycle, allowing it to clock cycle time and/or the depth of the pipeline. effectively hide various instruction latencies; this effect in- Coarse-grained multithreading (CGMT) [1, 21, 26] is a creases with the number of simultaneous contexts supported. more restrictive model where the processor can only execute However, each added context on an SMT processor incurs a instructions from one thread at a time, but where it can switch cost in complexity, which may lead to an increase in pipeline to a new thread after a short delay. This makes CGMT suited length or a decrease in the maximum clock rate. This pa- for hiding longer delays. Soon, general-purpose micropro- per presents new designs for multithreaded processors which cessors will be experiencing delays to main memory of 500 combine a conservative SMT implementation with a coarse- or more cycles. This means that a context switch in response grained multithreading capability. By presenting more virtual to a memory access can take tens of cycles and still provide contexts to the operating system and user than are supported a considerable performance benefit. -
Amdahl's Law Threading, Openmp
Computer Science 61C Spring 2018 Wawrzynek and Weaver Amdahl's Law Threading, OpenMP 1 Big Idea: Amdahl’s (Heartbreaking) Law Computer Science 61C Spring 2018 Wawrzynek and Weaver • Speedup due to enhancement E is Exec time w/o E Speedup w/ E = ---------------------- Exec time w/ E • Suppose that enhancement E accelerates a fraction F (F <1) of the task by a factor S (S>1) and the remainder of the task is unaffected Execution Time w/ E = Execution Time w/o E × [ (1-F) + F/S] Speedup w/ E = 1 / [ (1-F) + F/S ] 2 Big Idea: Amdahl’s Law Computer Science 61C Spring 2018 Wawrzynek and Weaver Speedup = 1 (1 - F) + F Non-speed-up part S Speed-up part Example: the execution time of half of the program can be accelerated by a factor of 2. What is the program speed-up overall? 1 1 = = 1.33 0.5 + 0.5 0.5 + 0.25 2 3 Example #1: Amdahl’s Law Speedup w/ E = 1 / [ (1-F) + F/S ] Computer Science 61C Spring 2018 Wawrzynek and Weaver • Consider an enhancement which runs 20 times faster but which is only usable 25% of the time Speedup w/ E = 1/(.75 + .25/20) = 1.31 • What if its usable only 15% of the time? Speedup w/ E = 1/(.85 + .15/20) = 1.17 • Amdahl’s Law tells us that to achieve linear speedup with 100 processors, none of the original computation can be scalar! • To get a speedup of 90 from 100 processors, the percentage of the original program that could be scalar would have to be 0.1% or less Speedup w/ E = 1/(.001 + .999/100) = 90.99 4 Amdahl’s Law If the portion of Computer Science 61C Spring 2018 the program that Wawrzynek and Weaver can be parallelized is small, then the speedup is limited The non-parallel portion limits the performance 5 Strong and Weak Scaling Computer Science 61C Spring 2018 Wawrzynek and Weaver • To get good speedup on a parallel processor while keeping the problem size fixed is harder than getting good speedup by increasing the size of the problem. -
Parallel Generation of Image Layers Constructed by Edge Detection
International Journal of Applied Engineering Research ISSN 0973-4562 Volume 13, Number 10 (2018) pp. 7323-7332 © Research India Publications. http://www.ripublication.com Speed Up Improvement of Parallel Image Layers Generation Constructed By Edge Detection Using Message Passing Interface Alaa Ismail Elnashar Faculty of Science, Computer Science Department, Minia University, Egypt. Associate professor, Department of Computer Science, College of Computers and Information Technology, Taif University, Saudi Arabia. Abstract A. Elnashar [29] introduced two parallel techniques, NASHT1 and NASHT2 that apply edge detection to produce a set of Several image processing techniques require intensive layers for an input image. The proposed techniques generate computations that consume CPU time to achieve their results. an arbitrary number of image layers in a single parallel run Accelerating these techniques is a desired goal. Parallelization instead of generating a unique layer as in traditional case; this can be used to save the execution time of such techniques. In helps in selecting the layers with high quality edges among this paper, we propose a parallel technique that employs both the generated ones. Each presented parallel technique has two point to point and collective communication patterns to versions based on point to point communication "Send" and improve the speedup of two parallel edge detection techniques collective communication "Bcast" functions. The presented that generate multiple image layers using Message Passing techniques achieved notable relative speedup compared with Interface, MPI. In addition, the performance of the proposed that of sequential ones. technique is compared with that of the two concerned ones. In this paper, we introduce a speed up improvement for both Keywords: Image Processing, Image Segmentation, Parallel NASHT1 and NASHT2. -
CS650 Computer Architecture Lecture 10 Introduction to Multiprocessors
NJIT Computer Science Dept CS650 Computer Architecture CS650 Computer Architecture Lecture 10 Introduction to Multiprocessors and PC Clustering Andrew Sohn Computer Science Department New Jersey Institute of Technology Lecture 10: Intro to Multiprocessors/Clustering 1/15 12/7/2003 A. Sohn NJIT Computer Science Dept CS650 Computer Architecture Key Issues Run PhotoShop on 1 PC or N PCs Programmability • How to program a bunch of PCs viewed as a single logical machine. Performance Scalability - Speedup • Run PhotoShop on 1 PC (forget the specs of this PC) • Run PhotoShop on N PCs • Will it run faster on N PCs? Speedup = ? Lecture 10: Intro to Multiprocessors/Clustering 2/15 12/7/2003 A. Sohn NJIT Computer Science Dept CS650 Computer Architecture Types of Multiprocessors Key: Data and Instruction Single Instruction Single Data (SISD) • Intel processors, AMD processors Single Instruction Multiple Data (SIMD) • Array processor • Pentium MMX feature Multiple Instruction Single Data (MISD) • Systolic array • Special purpose machines Multiple Instruction Multiple Data (MIMD) • Typical multiprocessors (Sun, SGI, Cray,...) Single Program Multiple Data (SPMD) • Programming model Lecture 10: Intro to Multiprocessors/Clustering 3/15 12/7/2003 A. Sohn NJIT Computer Science Dept CS650 Computer Architecture Shared-Memory Multiprocessor Processor Prcessor Prcessor Interconnection network Main Memory Storage I/O Lecture 10: Intro to Multiprocessors/Clustering 4/15 12/7/2003 A. Sohn NJIT Computer Science Dept CS650 Computer Architecture Distributed-Memory Multiprocessor Processor Processor Processor IO/S MM IO/S MM IO/S MM Interconnection network IO/S MM IO/S MM IO/S MM Processor Processor Processor Lecture 10: Intro to Multiprocessors/Clustering 5/15 12/7/2003 A. -
High-Performance Message Passing Over Generic Ethernet Hardware with Open-MX Brice Goglin
High-Performance Message Passing over generic Ethernet Hardware with Open-MX Brice Goglin To cite this version: Brice Goglin. High-Performance Message Passing over generic Ethernet Hardware with Open-MX. Parallel Computing, Elsevier, 2011, 37 (2), pp.85-100. 10.1016/j.parco.2010.11.001. inria-00533058 HAL Id: inria-00533058 https://hal.inria.fr/inria-00533058 Submitted on 5 Nov 2010 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. High-Performance Message Passing over generic Ethernet Hardware with Open-MX Brice Goglin INRIA Bordeaux - Sud-Ouest – LaBRI 351 cours de la Lib´eration – F-33405 Talence – France Abstract In the last decade, cluster computing has become the most popular high-performance computing architec- ture. Although numerous technological innovations have been proposed to improve the interconnection of nodes, many clusters still rely on commodity Ethernet hardware to implement message passing within parallel applications. We present Open-MX, an open-source message passing stack over generic Ethernet. It offers the same abilities as the specialized Myrinet Express stack, without requiring dedicated support from the networking hardware. Open-MX works transparently in the most popular MPI implementations through its MX interface compatibility. -
An Investigation of Symmetric Multi-Threading Parallelism for Scientific Applications
An Investigation of Symmetric Multi-Threading Parallelism for Scientific Applications Installation and Performance Evaluation of ASCI sPPM Frank Castaneda [email protected] Nikola Vouk [email protected] North Carolina State University CSC 591c Spring 2003 May 2, 2003 Dr. Frank Mueller An Investigation of Symmetric Multi-Threading Parallelism for Scientific Applications Introduction Our project is to investigate the use of Symmetric Multi-Threading (SMT), or “Hyper-threading” in Intel parlance, applied to course-grain parallelism in large-scale distributed scientific applications. The processors provide the capability to run two streams of instruction simultaneously to fully utilize all available functional units in the CPU. We are investigating the speedup available when running two threads on a single processor that uses different functional units. The idea we propose is to utilize the hyper-thread for asynchronous communications activity to improve course-grain parallelism. The hypothesis is that there will be little contention for similar processor functional units when splitting the communications work from the computational work, thus allowing better parallelism and better exploiting the hyper-thread technology. We believe with minimal changes to the 2.5 Linux kernel we can achieve 25-50% speedup depending on the amount of communication by utilizing a hyper-thread aware scheduler and a custom communications API. Experiment Setup Software Setup Hardware Setup ?? Custom Linux Kernel 2.5.68 with ?? IBM xSeries 335 Single/Dual Processor Red Hat distribution 2.0 Ghz Xeon ?? Modification of Kernel Scheduler to run processes together ?? Custom Test Code In order to test our code we focus on the following scenarios: ?? A serial execution of communication / computation sections o Executing in serial is used to test the expected back-to-back run-time. -
Cluster, Grid and Cloud Computing: a Detailed Comparison
The 6th International Conference on Computer Science & Education (ICCSE 2011) August 3-5, 2011. SuperStar Virgo, Singapore ThC 3.33 Cluster, Grid and Cloud Computing: A Detailed Comparison Naidila Sadashiv S. M Dilip Kumar Dept. of Computer Science and Engineering Dept. of Computer Science and Engineering Acharya Institute of Technology University Visvesvaraya College of Engineering (UVCE) Bangalore, India Bangalore, India [email protected] [email protected] Abstract—Cloud computing is rapidly growing as an alterna- with out any prior reservation and hence eliminates over- tive to conventional computing. However, it is based on models provisioning and improves resource utilization. like cluster computing, distributed computing, utility computing To the best of our knowledge, in the literature, only a few and grid computing in general. This paper presents an end-to- end comparison between Cluster Computing, Grid Computing comparisons have been appeared in the field of computing. and Cloud Computing, along with the challenges they face. This In this paper we bring out a complete comparison of the could help in better understanding these models and to know three computing models. Rest of the paper is organized as how they differ from its related concepts, all in one go. It also follows. The cluster computing, grid computing and cloud discusses the ongoing projects and different applications that use computing models are briefly explained in Section II. Issues these computing models as a platform for execution. An insight into some of the tools which can be used in the three computing and challenges related to these computing models are listed models to design and develop applications is given. -
Grid Computing: What Is It, and Why Do I Care?*
Grid Computing: What Is It, and Why Do I Care?* Ken MacInnis <[email protected]> * Or, “Mi caja es su caja!” (c) Ken MacInnis 2004 1 Outline Introduction and Motivation Examples Architecture, Components, Tools Lessons Learned and The Future Questions? (c) Ken MacInnis 2004 2 What is “grid computing”? Many different definitions: Utility computing Cycles for sale Distributed computing distributed.net RC5, SETI@Home High-performance resource sharing Clusters, storage, visualization, networking “We will probably see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.” Len Kleinrock (1969) The word “grid” doesn’t equal Grid Computing: Sun Grid Engine is a mere scheduler! (c) Ken MacInnis 2004 3 Better definitions: Common protocols allowing large problems to be solved in a distributed multi-resource multi-user environment. “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” Kesselman & Foster (1998) “…coordinated resource sharing and problem solving in dynamic, multi- institutional virtual organizations.” Kesselman, Foster, Tuecke (2000) (c) Ken MacInnis 2004 4 New Challenges for Computing Grid computing evolved out of a need to share resources Flexible, ever-changing “virtual organizations” High-energy physics, astronomy, more Differing site policies with common needs Disparate computing needs -
Cloud Computing Over Cluster, Grid Computing: a Comparative Analysis
Journal of Grid and Distributed Computing Volume 1, Issue 1, 2011, pp-01-04 Available online at: http://www.bioinfo.in/contents.php?id=92 Cloud Computing Over Cluster, Grid Computing: a Comparative Analysis 1Indu Gandotra, 2Pawanesh Abrol, 3 Pooja Gupta, 3Rohit Uppal and 3Sandeep Singh 1Department of MCA, MIET, Jammu 2Department of Computer Science & IT, Jammu Univ, Jammu 3Department of MCA, MIET, Jammu e-mail: [email protected], [email protected], [email protected], [email protected], [email protected] Abstract—There are dozens of definitions for cloud Virtualization is a technology that enables sharing of computing and through each definition we can get the cloud resources. Cloud computing platform can become different idea about what a cloud computing exacting is? more flexible, extensible and reusable by adopting the Cloud computing is not a very new concept because it is concept of service oriented architecture [5].We will not connected to grid computing paradigm whose concept came need to unwrap the shrink wrapped software and install. into existence thirteen years ago. Cloud computing is not only related to Grid Computing but also to Utility computing The cloud is really very easier, just to install single as well as Cluster computing. Cloud computing is a software in the centralized facility and cover all the computing platform for sharing resources that include requirements of the company’s users [1]. software’s, business process, infrastructures and applications. Cloud computing also relies on the technology II. CLUSTER COMPUTING of virtualization. In this paper, we will discuss about Grid computing, Cluster computing and Cloud computing i.e. -
“Grid Computing”
VISHVESHWARAIAH TECHNOLOGICAL UNIVERSITY S.D.M COLLEGE OF ENGINEERING AND TECHNOLOGY A seminar report on “Grid Computing” Submitted by Nagaraj Baddi (2SD07CS402) 8th semester DEPARTMENT OF COMPUTER SCIENCE ENGINEERING 2009-10 1 VISHVESHWARAIAH TECHNOLOGICAL UNIVERSITY S.D.M COLLEGE OF ENGINEERING AND TECHNOLOGY DEPARTMENT OF COMPUTER SCIENCE ENGINEERING CERTIFICATE Certified that the seminar work entitled “Grid Computing” is a bonafide work presented by Mr. Nagaraj.M.Baddi, bearing USN 2SD07CS402 in a partial fulfillment for the award of degree of Bachelor of Engineering in Computer Science Engineering of the Vishveshwaraiah Technological University Belgaum, during the year 2009-10. The seminar report has been approved as it satisfies the academic requirements with respect to seminar work presented for the Bachelor of Engineering Degree. Staff in charge H.O.D CSE (S. L. DESHPANDE) (S. M. JOSHI) Name: Nagaraj M. Baddi USN: 2SD07CS402 2 INDEX 1. Introduction 4 2. History 5 3. How Grid Computing Works 6 4. Related technologies 8 4.1 Cluster computing 8 4.2 Peer-to-peer computing 9 4.3 Internet computing 9 5. Grid Computing Logical Levels 10 5.1 Cluster Grid 10 5.2 Campus Grid 10 5.3 Global Grid 10 6. Grid Architecture 11 6.1 Grid fabric 11 6.2 Core Grid middleware 12 6.3 User-level Grid middleware 12 6.4 Grid applications and portals. 13 7. Grid Applications 13 7.1 Distributed supercomputing 13 7.2 High-throughput computing 14 7.3 On-demand computing 14 7.4 Data-intensive computing 14 7.5 Collaborative computing 15 8. Difference: Grid Computing vs Cloud Computing 15 9. -
Strategies for Managing Business Disruption Due to Grid Computing
Strategies for managing business disruption due to Grid Computing by Vidyadhar Phalke Ph.D. Computer Science, Rutgers University, New Jersey, 1995 M.S. Computer Science, Rutgers University, New Jersey, 1992 B.Tech. Computer Science, Indian Institute of Technology, Delhi, 1989 Submitted to the MIT Sloan School of Management in Partial Fulfillment of the Requirements for the Degree of Master of Science in the Management of Technology at the Massachusetts Institute of Technology June 2003 © 2003 Vidyadhar Phalke All Rights Reserved The author hereby grants to MIT permission to reproduce and to distribute publicly paper and electronic copies of this thesis document in whole or in part Signature of Author: MIT Sloan School of Management 9 May 2003 Certified By: Starling D. Hunter III Theodore T. Miller Career Development Assistant Professor Thesis Supervisor Accepted By: David A. Weber Director, Management of Technology Program 2 Strategies for managing business disruption due to Grid Computing by Vidyadhar Phalke Submitted to the MIT Sloan School of Management on May 9 2003 in partial fulfillment of the requirements for the degree of Master of Science in the Management of Technology ABSTRACT In the technology centric businesses disruptive technologies displace incumbents time and again, sometimes to the extent that incumbents go bankrupt. In this thesis we would address the issue of what strategies are essential to prepare for and to manage disruptions for the affected businesses and industries. Specifically we will look at grid computing that is poised to disrupt (1) certain Enterprise IT departments, and (2) the software industry in the high-performance and web services space. -
Computer Architecture: Parallel Processing Basics
Computer Architecture: Parallel Processing Basics Onur Mutlu & Seth Copen Goldstein Carnegie Mellon University 9/9/13 Today What is Parallel Processing? Why? Kinds of Parallel Processing Multiprocessing and Multithreading Measuring success Speedup Amdhal’s Law Bottlenecks to parallelism 2 Concurrent Systems Embedded-Physical Distributed Sensor Claytronics Networks Concurrent Systems Embedded-Physical Distributed Sensor Claytronics Networks Geographically Distributed Power Internet Grid Concurrent Systems Embedded-Physical Distributed Sensor Claytronics Networks Geographically Distributed Power Internet Grid Cloud Computing EC2 Tashi PDL'09 © 2007-9 Goldstein5 Concurrent Systems Embedded-Physical Distributed Sensor Claytronics Networks Geographically Distributed Power Internet Grid Cloud Computing EC2 Tashi Parallel PDL'09 © 2007-9 Goldstein6 Concurrent Systems Physical Geographical Cloud Parallel Geophysical +++ ++ --- --- location Relative +++ +++ + - location Faults ++++ +++ ++++ -- Number of +++ +++ + - Processors + Network varies varies fixed fixed structure Network --- --- + + connectivity 7 Concurrent System Challenge: Programming The old joke: How long does it take to write a parallel program? One Graduate Student Year 8 Parallel Programming Again?? Increased demand (multicore) Increased scale (cloud) Improved compute/communicate Change in Application focus Irregular Recursive data structures PDL'09 © 2007-9 Goldstein9 Why Parallel Computers? Parallelism: Doing multiple things at a time Things: instructions,