CS693 Grid Computing
UNIT-I INTRODUCTION
Grid Computing values and risks – History of Grid Computing – Grid Computing model and protocols – Overview of types of Grids
Grid Computing Values and Risks

Key value elements provide credible inputs to the various valuation metrics that enterprises can use to build successful Grid Computing deployment business cases. Each value element can be applied to one, two, or all of the valuation models that a company may consider using, such as return on investment (ROI), total cost of ownership (TCO), and return on assets (ROA); a small worked example follows the value elements below.

Grid Computing Value Elements
• Leveraging Existing Hardware Investments and Resources
• Reducing Operational Expenses
• Creating a Scalable and Flexible Enterprise IT Infrastructure
• Accelerating Product Development, Improving Time to Market, and Raising Customer Satisfaction
• Increasing Productivity

Leveraging Existing Hardware Investments and Resources
• Grids can be deployed on an enterprise's existing infrastructure
o mitigates the need for investment in new hardware systems
• eliminates expenditure on air conditioning, electricity, and, in many cases, the development of new data centers
• Example: grids deployed on existing desktops and servers provide over 93 percent in up-front hardware cost savings when compared to High-Performance Computing (HPC) systems

Reducing Operational Expenses
• Key self-healing and self-optimizing capabilities free system administrators from routine tasks
o allowing them to focus on high-value, important system administration
• The operational expenses of a Grid Computing deployment are 73 percent less than those of comparable HPC-based solutions
• Grids are being deployed in enterprises in as little as two days, with little or no disruption to operations
• Cluster system deployments, on the other hand, take 60–90 days, in addition to the days required to configure and deploy the applications

Creating a Scalable and Flexible Enterprise IT Infrastructure
• Grid Computing allows companies to add resources linearly, based on real-time business requirements
• These resources can be drawn from within the enterprise or from utility computing services
• Projects never have to be put on hold for lack of computational capacity, space, or system priority
• The entire compute infrastructure of the enterprise is available for connecting
• Grid Computing can help bring about the end of departmental silos and expose computational assets curtained off by server huggers and bureaucracy

Accelerating Product Development, Improving Time to Market, and Raising Customer Satisfaction
• The dramatic reduction in simulation times means products can be completed more quickly
• It also enables much more detailed and exhaustive product design, as the computational resources brought to bear by the grid can quickly churn through the complex models and scenarios to detect design flaws

Example: Life Sciences – Drug Discovery
• Introducing a "New Chemical Entity" (drug) into the market costs about US $802M over 12–15 years
• Grid Computing allows drug companies to get the most out of their R&D expenditure by developing the right product and getting it to market in the shortest possible time
• It can save almost US $5M per month in R&D expenses of the drug development process
• Savings can amount to almost US $1M per day for each day that the product is brought to market early

Increasing Productivity
• Example: by deploying a grid, one company reduced the run times of jobs submitted by its engineers by 58 percent
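These value elements feed the valuation metrics (ROI, TCO, ROA) mentioned above. As a minimal illustration, the sketch below computes a first-year ROI and payback period for a hypothetical grid deployment; the hardware budget, operating expenses, and deployment cost are invented assumptions for illustration, while the two savings rates come from the figures quoted in the notes.

```python
# Hypothetical first-year ROI / payback sketch for a grid deployment.
# All dollar inputs are illustrative assumptions, not real vendor figures.

def simple_roi(gain: float, cost: float) -> float:
    """Classic ROI: (gain - cost) / cost."""
    return (gain - cost) / cost

# Assumed baseline: what a comparable HPC/cluster build-out would cost.
hpc_hardware_cost = 2_000_000      # US$, assumed
hpc_annual_opex   = 1_000_000      # US$/year, assumed

# Savings rates quoted in the notes above.
hardware_savings_rate = 0.93       # grid on existing desktops/servers
opex_savings_rate     = 0.73       # vs. comparable HPC solutions

# Assumed cost of the grid software, integration, and rollout.
grid_deployment_cost = 400_000     # US$, assumed

first_year_gain = (hpc_hardware_cost * hardware_savings_rate
                   + hpc_annual_opex * opex_savings_rate)

roi = simple_roi(first_year_gain, grid_deployment_cost)
payback_months = 12 * grid_deployment_cost / first_year_gain

print(f"First-year savings: ${first_year_gain:,.0f}")
print(f"ROI: {roi:.0%}, payback: {payback_months:.1f} months")
```

With these assumed inputs the deployment pays for itself in under two months, which is why the business cases in this unit lean so heavily on the hardware and operational savings rates.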
Risk Analysis
The key risk factors that plague technology deployments, and how to analyze their vulnerabilities:

Lock-in
• The IT manager will not be making any investments in durable complementary assets, which promote lock-in
• Such assets include software and supporting infrastructure, and may contribute to customer lock-in
• Pay keen attention to which vendors are supporting the Grid Computing standards at the Global Grid Forum

Switching Costs
• The primary switching cost is driven by the effort required to integrate and enable enterprise applications to work on whatever replacement grid infrastructure has been selected
• This integration is performed by utilizing software development toolkits
• A mitigation is to introduce new grid software in the enterprise only to support new grid-enabled applications, while leaving the existing software deployment and its integration with legacy grid software unchanged

Project Implementation Failure
• Projects fail either due to bad project management or an incorrect needs assessment
• Take advantage of hosted pilots and professional services offered by grid software vendors
• This allows the IT manager to accurately pre-assess the suitability of the grid software, the level of integration required, and feasibility (application speed-up times, productivity gains, etc.)
• Hosted pilots are conducted in the vendors' data centers and have no impact on the company

History of Grid Computing

Academic Research Projects

I-Way
• Eleven high-speed networks were used to connect 17 sites with high-end computing resources for a demonstration, creating one super "metacomputer"
• Sixty different applications, spanning various faculties of science and engineering, were developed and run over this demonstration network
• Many of the early Grid Computing concepts were explored here

Globus
• A suite of tools that laid the foundation for Grid Computing activities
• 80 sites worldwide running software based on the Globus Toolkit were connected together

Entropia
• Set out to harness idle computers worldwide to solve problems of scientific interest
• Grew to 30,000 computers with an aggregate speed of over one teraflops
• Ordinary users volunteered their PCs to analyze research topics such as patients' response to chemotherapy, discovering drugs for AIDS, and potential cures for anthrax

High-Performance Computing
• Refers to supercomputing
• There are hundreds of supercomputers deployed throughout the world
• Key parallel processing algorithms have already been developed to support the execution of programs on different, but co-located, processors
• High-performance computing system deployment is not limited to academic or research institutions
• The industries in which high-performance systems are deployed are numerous
• Example: Telecommunications, Finance, Automotive, Database, Transportation, Electronics, Geophysics, Aerospace, Energy, World Wide Web, Information Services, Chemistry, Manufacturing, Mechanics, Pharmaceutics

Cluster Computing
• Clusters are high-performance, massively parallel computers built primarily out of commodity hardware components, running a free-software operating system such as Linux or FreeBSD, and interconnected by a private high-speed network
• A cluster consists of PCs or workstations dedicated to running high-performance computing tasks
• The nodes in the cluster do not sit on users' desks, but are dedicated to running cluster jobs
• A cluster is usually connected to the outside world through only a single node
• Numerous tools have been developed to run and manage clusters: load-balancing tools, and tools to adapt applications to run in the parallel cluster environment (Example: ForgeExplorer)

Peer-to-Peer Computing
• Two main models: the centralized model and the decentralized model

Centralized model
• File sharing is based around the use of a central server system that directs traffic between individual registered users
• The central servers maintain directories of the shared files stored on the respective PCs of registered users of the network
• These directories are updated every time a user logs on or off the Napster server network
• Each time a user of a centralized P2P file-sharing system submits a request or searches for a particular file, the central server creates a list of files matching the search request by cross-checking the request with the server's database of files belonging to users who are currently connected to the network
• The central server then displays that list to the requesting user
• The requesting user can then select the desired file from the list and open a direct HTTP link with the individual computer that currently possesses that file
• The download of the actual file takes place directly, from one network user to the other
• The actual file is never stored on the central server or at any intermediate point on the network (a minimal code sketch of this lookup flow follows this list)
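To make the lookup flow concrete, here is a minimal sketch of a Napster-style central directory. The class and method names (CentralDirectory, log_on, search) are illustrative assumptions, not any real system's API; the direct peer-to-peer transfer is only hinted at in the final loop.

```python
# Minimal sketch of a Napster-style central directory (illustrative only).
# The server indexes which user shares which files; downloads happen
# peer-to-peer, so the server never stores or relays file contents.

class CentralDirectory:
    def __init__(self):
        self.online = {}                # user -> (address, [shared files])

    def log_on(self, user, address, shared_files):
        """The directory is updated every time a user logs on..."""
        self.online[user] = (address, shared_files)

    def log_off(self, user):
        """...and every time a user logs off."""
        self.online.pop(user, None)

    def search(self, query):
        """Cross-check the request against files of connected users."""
        return [(user, addr, f)
                for user, (addr, files) in self.online.items()
                for f in files if query.lower() in f.lower()]

directory = CentralDirectory()
directory.log_on("alice", "10.0.0.5:6699", ["grid_intro.pdf", "notes.txt"])
directory.log_on("bob", "10.0.0.7:6699", ["grid_intro.pdf"])

# The requesting user picks a result, then opens a direct HTTP link to
# the peer's address; the file never touches the central server.
for user, addr, fname in directory.search("grid"):
    print(f"{fname} available from {user} at http://{addr}/{fname}")
```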
Decentralized model
• File sharing does not use a central server to keep track of files
• Instead, it relies on each individual computer to announce its existence to a peer, which in turn announces it to all the peers it is connected to, and so on
• If one of the computers in the peer network has a file that matches the request, it transmits the file information (name, size) back through all the computers in the pathway to the user that requested the file
• A direct connection is then established and the file is transferred

Internet Computing
• Utilizes the vast processing cycles available at users' desktops
• Large compute-intensive projects are coded so that tasks can be broken down into smaller subtasks and distributed over the Internet for processing
• Volunteer users then download a lightweight client onto their desktops, which periodically communicates with the central server to receive tasks
• The client initiates task execution during otherwise idle cycles and returns the results to the central server (a minimal sketch of this polling loop follows this list)
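The client side of this model follows a simple pattern: poll the central server for work, compute, and send results back. Below is a minimal sketch of such a polling loop; the SERVER URL, the endpoint paths, and the compute() function are invented placeholders under assumed conventions, not part of any real project's protocol.

```python
# Minimal sketch of a volunteer-computing client loop (illustrative only).
# SERVER, the endpoint paths, and compute() are hypothetical placeholders.
import json
import time
import urllib.request

SERVER = "http://example.org/tasks"   # hypothetical task server

def compute(task):
    """Stand-in for the real subtask, e.g. scoring one drug candidate."""
    return sum(task["numbers"])       # trivial placeholder work

def run_client(poll_seconds=60):
    while True:
        # Periodically ask the central server for the next subtask.
        with urllib.request.urlopen(f"{SERVER}/next") as resp:
            task = json.load(resp)
        if task:
            result = compute(task)    # done during otherwise idle cycles
            body = json.dumps({"id": task["id"], "result": result}).encode()
            req = urllib.request.Request(
                f"{SERVER}/result", data=body,
                headers={"Content-Type": "application/json"})
            urllib.request.urlopen(req)   # return the result
        time.sleep(poll_seconds)          # back off before polling again

if __name__ == "__main__":
    run_client()
```

Because each subtask is independent, the server needs no knowledge of the volunteers' hardware; it simply hands out work units and collects results, which is what let projects like Entropia scale to tens of thousands of donated PCs.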