Grid Computing and NetSolve
Between Now and the End
- Today, March 30th: Grids and NetSolve
- April 6th: Felix Wolf: More on Grid computing, peer-to-peer, grid infrastructure, etc.
- April 13th: Performance Measurement with PAPI (Dan Terpstra)
- April 20th: Debugging (Shirley Moore)
- April 25th (Monday): Presentation of class projects

Grid Computing and NetSolve
Jack Dongarra

More on the In-Class Presentations
- Start at 1:00 on Monday, 4/25/05
- Roughly 20 minutes each
- Use slides
- Describe your project, perhaps motivate via application
- Describe your method/approach
- Provide comparison and results
- See me about your topic

Distributed Computing
- Concept has been around for two decades
- Basic idea: run a scheduler across systems to run processes on least-used systems first
  - Maximize utilization
  - Minimize turnaround time
- Have to load executables and input files to the selected resource
  - Shared file system
  - File transfers upon resource selection

Examples of Distributed Computing
- Workstation farms, Condor flocks, etc.
  - Generally share a file system
- SETI@home project, United Devices, etc.
  - Only one source code; copies correct binary code and input data to each system
- Napster, Gnutella: file/data sharing
- NetSolve
  - Runs a numerical kernel on any of multiple independent systems, much like a Grid solution

SETI@home: Global Distributed Computing
- Running on 500,000 PCs, ~1000 CPU years per day
  - 485,821 CPU years so far
- Sophisticated data & signal processing analysis
- Distributes datasets from the Arecibo Radio Telescope

SETI@home
- Uses thousands of Internet-connected PCs to help in the search for extraterrestrial intelligence.
- Uses data collected with the Arecibo Radio Telescope in Puerto Rico.
- When a participant's computer is idle or being wasted, this software will download a 300 kilobyte chunk of data for analysis.
- The results of this analysis are sent back to the SETI team and combined with those of thousands of other participants.
- Largest distributed computation project in existence
  - ~400,000 machines
  - Averaging 27 Tflop/s
- Today many companies are trying this for profit.

Grid Computing - from ET to Anthrax

United Devices and Cancer Screening

Grid Computing
- Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals, in the absence of central control, omniscience, and trust relationships.
- Resources (HPC systems, visualization systems & displays, storage systems, sensors, instruments, people) are integrated via ‘middleware’ to facilitate use of all resources.

Why Grids?
- A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour
- 1,000 physicists worldwide pool resources for petaop analyses of petabytes of data
- Civil engineers collaborate to design, execute, & analyze shake table experiments
- Climate scientists visualize, annotate, & analyze terabyte simulation datasets
- An emergency response team couples real time data, weather model, population data

Why Grids?
- Resources have different functions, but multiple classes of resources are necessary for most interesting problems.
- Power of any single resource is small compared to aggregations of resources
- Network connectivity is increasing rapidly in bandwidth and availability
- Large problems require teamwork and computation

Why Grids? (contd.)
- A multidisciplinary analysis in aerospace couples code and data in four companies
- A home user invokes architectural design functions at an application service provider
- An application service provider purchases cycles from compute cycle providers
- Scientists working for a multinational soap company design a new product
- A community group pools members’ PCs to analyze alternative designs for a local road

The Grid Problem
- Flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources
  From “The Anatomy of the Grid: Enabling Scalable Virtual Organizations”
- Enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals -- assuming the absence of…
  - central location,
  - central control,
  - omniscience,
  - existing trust relationships.

Elements of the Problem
- Resource sharing
  - Computers, storage, sensors, networks, …
  - Sharing always conditional: issues of trust, policy, negotiation, payment, …
- Coordinated problem solving
  - Beyond client-server: distributed data analysis, computation, collaboration, …
- Dynamic, multi-institutional virtual organisations
  - Community overlays on classic org structures
  - Large or small, static or dynamic

Grid vs. Internet/Web Services?
- We’ve had computers connected by networks for 20 years
- We’ve had web services for 5-10 years
- The Grid combines these things, and brings additional notions
  - Virtual Organizations
  - Infrastructure to enable computation to be carried out across these
    - Authentication, monitoring, information, resource discovery, status, coordination, etc.
- Can I just plug my application into the Grid?
  - No! Much work to do to get there!

What does this mean now for users and developers?
- There is a grand vision of the future
  - Collecting resources around the world into VOs
  - Seamless access to them, with a single sign-on
  - NEW applications to exploit them in unique ways!
  - Today we want to help you prepare for this
- There is a frustrating reality of the present
  - These technologies are not yet fully mature
  - Not fully deployed
  - Not consistent across even single VOs
- But centers and funding agencies worldwide are pushing this very, very hard
  - Better get ready now
  - You can help! Work with your centers to get this deployed

Network Bandwidth Growth
- Network vs. computer performance
  - Computer speed doubles every 18 months
  - Network speed doubles every 9 months
  - Difference = order of magnitude per 5 years
- 1986 to 2000
  - Computers: x 500
  - Networks: x 340,000
- 2001 to 2010
  - Computers: x 60
  - Networks: x 4000
[Figure: Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.]

Computational Grids and Electric Power Grids
- Why the Computational Grid is like the Electric Power Grid
  - Electric power is ubiquitous
  - Don’t need to know the source of the power (transformer, generator) or the power company that serves it
- Why the Computational Grid is different from the Electric Power Grid
  - Wider spectrum of performance
  - Wider spectrum of services
  - Access governed by more complicated issues
    - Security
    - Performance
    - Socio-political factors

The Grid
- To treat CPU cycles and software like commodities.
- Napster on steroids.
- Enable the coordinated use of geographically distributed resources, in the absence of central control and existing trust relationships.
- Computing power is produced much like utilities such as power and water are produced for consumers.
- Users will have access to “power” on demand
- “When the Network is as fast as the computer’s internal links, the machine disintegrates across the Net into a set of special purpose appliances”
  Gilder Technology Report, June 2000

What is Grid Computing?
- Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
[Figure labels: data acquisition; advanced visualization, analysis; computational resources; imaging instruments; large-scale databases]

The Computational Grid is…
- …a distributed control infrastructure that allows applications to treat compute cycles as commodities.
- Power Grid analogy
  - Power producers: machines, software, networks, storage systems
  - Power consumers: user applications
- Applications draw power from the Grid the way appliances draw electricity from the power utility.
  - Seamless
  - High-performance
  - Ubiquitous
  - Dependable

Introduction
- Grid systems can be classified depending on their usage:
  - Computational Grid
    - Distributed Supercomputing
    - High Throughput
  - Data Grid
  - Service Grid
    - On Demand
    - Collaborative
    - Multimedia

Introduction
- Computational Grid:
  - denotes a system that has a higher aggregate capacity than any of its constituent machines
  - it can be further categorized based on how the overall capacity is used
- Distributed Supercomputing Grid:
  - executes the application in parallel on multiple machines to reduce the completion time of a job

Introduction
- Grand challenge problems typically require a distributed supercomputing Grid (one of the motivating factors of early Grid research, and still a driver in some quarters)
- High throughput Grid:
  - increases the completion rate of a stream of jobs arriving in real time
  - ASIC or processor design verification tests would be run on a high throughput Grid

Introduction
- Data Grid:
  - systems that provide an infrastructure for synthesizing new information from data repositories such as digital libraries or data warehouses
  - applications for these systems would be special purpose data mining that correlates information from multiple different high volume data sources

Introduction
- Service Grid:
  - systems that provide services that are not provided by any single machine
  - subdivided based on the type of service they provide
- Multimedia Grid:
  - provides an infrastructure for real time multimedia applications -- requires support for quality of service (QoS) across multiple different machines, whereas a multimedia application on a single dedicated machine can be deployed without QoS
  - synchronization between network and end-point QoS
- Collaborative Grid:
  - connects users and applications into collaborative workgroups -- enable real time interaction between