Difference Between Grid Computing and Distributed Computing
By Shantinath M. Patil, BE (CSE)
Definition of Distributed Computing

Distributed computing is an environment in which a group of independent and geographically dispersed computer systems take part in solving a complex problem, each solving a part of it and then combining the results from all computers (a minimal code sketch of this split-and-combine pattern appears at the end of this section). These systems are loosely coupled and work in coordination toward a common goal. Distributed computing can be defined as:

1. A computing system in which services are provided by a pool of computers collaborating over a network.
2. A computing environment that may involve computers of differing architectures and data representation formats that share data and system resources.

Distributed computing, or the use of a computational cluster, is defined as the application of resources from multiple computers, networked in a single environment, to a single problem at the same time, usually a scientific or technical problem that requires a great number of processing cycles or access to large amounts of data.

Picture: The concept of distributed computing is simple: pull together and employ all available resources to speed up computing.

The key distinction between distributed computing and grid computing is mainly the way resources are managed. Distributed computing uses a centralized resource manager, and all nodes cooperatively work together as a single unified resource or system. Grid computing uses a structure in which each node has its own resource manager and the system does not act as a single unit.

Definition of Grid Computing

The basic idea behind grid computing is to utilize the idle CPU cycles and storage of millions of computer systems across a worldwide network, so that they function as a flexible, pervasive, and inexpensively accessible pool that can be harnessed by anyone who needs it, much as power companies and their users share the electrical grid. There are many definitions of the term grid computing:

1. A service for sharing computer power and data storage capacity over the Internet.
2. An ambitious and exciting global effort to develop an environment in which individual users can access computers, databases, and experimental facilities simply and transparently, without having to consider where those facilities are located.
3. A model for allowing companies to use a large number of computing resources on demand, no matter where they are located.

Grid computing can be defined as a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous resources. Grid resources are assigned dynamically at runtime depending on their availability and capability.

Many people confuse grid computing with distributed computing and computational clusters. If you have ten computers somewhere that can be used for distributed calculations of your model, people may already call it a grid, most likely because the word "grid" is easy to work with and sounds good. It does not matter much in practice, but for the sake of clarity, IT perfectionists like to distinguish between a grid and the others. Grid computing, or the use of a computational grid, is defined as the application of the resources of multiple computers in a network to a single problem at the same time, while crossing geographical and political boundaries. A true grid comprises multiple distinct distributed processing environments.
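As promised above, here is a minimal sketch of the split-and-combine pattern that the definition of distributed computing describes: a problem is divided into independent parts, each part is solved separately, and the partial results are merged. This is an illustration under simplifying assumptions, not a real distributed system; a local process pool stands in for networked machines, and solve_part, solve_distributed, and the choice of problem (summing a range) are all hypothetical.

```python
# Sketch of the distributed-computing pattern: split a problem into
# independent parts, solve each part on a separate "node", and combine
# the partial results. A process pool stands in for dispersed machines.
from concurrent.futures import ProcessPoolExecutor

def solve_part(bounds):
    """Solve one piece of the problem (here: sum one sub-range)."""
    lo, hi = bounds
    return sum(range(lo, hi))

def solve_distributed(n, workers=4):
    # Split the full problem [0, n) into one chunk per worker.
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    # Each worker solves its part; the results are then combined.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(solve_part, chunks)
    return sum(partials)

if __name__ == "__main__":
    assert solve_distributed(1_000_000) == sum(range(1_000_000))
    print("combined result matches the serial computation")
```

The same shape scales from a process pool on one machine to genuinely dispersed computers; only the transport changes (a network protocol instead of shared memory), which is why the definition above treats them as one pattern.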
Grid computing virtualizes the processing resources of multiple computers for use on a single problem, through either dedicated or shared hardware. This means that a grid-enabled application is not tied to the computer on your desk; it can seamlessly use more than one computer, and other resources even beyond the walls of your building, to boost its performance.

Picture: Grid computing employs not only single resources but whole systems from various locations, crossing geographic and political boundaries.

Grid Computing Vs. Distributed Computing

Since 1980, two advances in technology have made distributed computing a more practical idea: computer CPU power and communication bandwidth. Thanks to these technologies, it is not only feasible but easy to put together large numbers of computer systems to meet heavy computational or storage requirements. The number of truly distributable applications is still somewhat limited, and the challenges (standardization, interoperability, and so on) remain significant.

As is clear from the definitions, traditional distributed computing can be characterized as a subset of grid computing. Some of the differences between the two are:

1. Distributed computing normally refers to managing or pooling hundreds or thousands of computer systems, each of which is individually limited in its memory and processing power. Grid computing, on the other hand, has some extra characteristics: it is concerned with the efficient utilization of a pool of heterogeneous systems, with optimal workload management, using an enterprise's entire computational resources (servers, networks, storage, and information) acting together to create one or more large pools of computing resources. There is no limitation of users, departments, or organizations in grid computing.

2. Grid computing is focused on the ability to support computation across multiple administrative domains, which sets it apart from traditional distributed computing. Grids offer a way of using information technology resources optimally inside an organization and involve virtualization of computing resources. Grid computing's support for multiple administrative policies and for security authentication and authorization mechanisms enables it to be distributed over a local, metropolitan, or wide-area network (a code sketch of this node-level autonomy follows the references below).

References:
1. http://www.jatit.org/distributed-computing/grid-vs-distributed.htm
2. http://www.maxi-pedia.com/Grid+computing+distributed+computing
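To make point 2 concrete, here is a minimal sketch of the grid-style pull model, in which there is no central resource manager assigning work: each autonomous node decides for itself when it is idle and takes the next task. This is an illustrative sketch only; threads stand in for independently administered machines, and grid_node and the toy squaring job are hypothetical names, not part of any real grid middleware.

```python
# Sketch of grid-style scheduling: no central manager pushes work;
# each autonomous "node" pulls a task from the shared pool whenever
# it is idle. Threads stand in for independently managed machines.
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()

def grid_node(name):
    """Each node manages its own cycles: it asks for work when idle."""
    while True:
        try:
            job = tasks.get(timeout=0.1)  # node pulls work; nothing is assigned to it
        except queue.Empty:
            return  # no work left anywhere on the grid
        results.put((name, job, job * job))  # toy computation: square the input
        tasks.task_done()

for n in range(20):
    tasks.put(n)

nodes = [threading.Thread(target=grid_node, args=(f"node-{i}",))
         for i in range(4)]
for t in nodes:
    t.start()
for t in nodes:
    t.join()

print(f"{results.qsize()} results gathered from autonomous nodes")
```

In the centralized model of traditional distributed computing, a single scheduler would instead decide which node runs which task; here that decision is made node by node, which is the autonomy the comparison above describes.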
Cluster Computing

Definition: Cluster computing is the technique of linking two or more computers into a network (usually through a local area network) in order to take advantage of the parallel processing power of those computers.

An eternal struggle in any IT department is finding a way to squeeze the maximum processing power out of a limited budget. Today more than ever, enterprises require enormous processing power in order to manage their desktop applications, databases, and knowledge management. Many business processes are extremely heavy users of IT resources, yet IT budgets struggle to keep pace with the ever-growing demand for more power.

Types of Computer Clusters

There are several varieties of computer clusters, each offering different advantages to the user:

High-availability clusters: HA clusters are designed to ensure constant access to service applications. They maintain redundant nodes that can take over as backup systems in the event of a failure. The minimum number of nodes in an HA cluster is two (one active and one redundant), though most HA clusters use considerably more. HA clusters aim to solve the problems that arise from mainframe failure in an enterprise: rather than losing all access to IT systems, HA clusters ensure 24/7 access to computational power. This feature is especially important in business, where data processing is usually time-sensitive.

Load-balancing clusters: Load-balancing clusters operate by routing all work through one or more load-balancing front-end nodes, which then distribute the workload efficiently among the remaining active nodes. Load-balancing clusters are extremely useful for those working with limited IT budgets: devoting a few nodes to managing the workflow of a cluster ensures that limited processing power is used optimally.

High-performance clusters: HPC clusters are designed to exploit the parallel processing power of multiple nodes. They are most commonly used to perform functions that require nodes to communicate as they perform their tasks, for instance when calculation results from one node will affect future results from another. The best-known example is Berkeley's SETI@home project, which has had over 5 million volunteer home computers devoting processing power to the analysis of data from the Arecibo Observatory radio telescope.

Ref: http://www.bestpricecomputers.co.uk/glossary/cluster-computing.htm

Parallel Computing

Traditionally, software has been written for serial computation:
o To be run on a single computer having a single Central Processing Unit (CPU);
o A problem is broken into a discrete series of instructions;
o Instructions are executed one after another;
o Only one instruction may execute at any moment in time.

In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
o To be run using multiple CPUs;
o A problem is broken into discrete parts that can be solved concurrently;
o Each part is further broken down into a series of instructions;
o Instructions from each part execute simultaneously on different CPUs.

The compute resources can include:
o A single computer with multiple processors;
o An arbitrary number of computers connected by a network;
o A combination of both.

The computational problem usually demonstrates characteristics such as the ability to be:
o Broken apart into discrete pieces of work that can be solved simultaneously;
o Executed as multiple program instructions at any moment in time;
o Solved in less time with multiple compute resources than with a single compute resource.

A minimal code sketch contrasting serial and parallel execution follows the reference below.

Ref: https://computing.llnl.gov/tutorials/parallel_comp/
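As referenced above, here is a minimal sketch of the contrast the lists describe: the same set of jobs solved first as a single serial instruction stream, then broken into discrete parts that execute simultaneously on multiple CPUs. The workload cpu_heavy and the job sizes are illustrative stand-ins, chosen only to make the timing difference visible.

```python
# Sketch of serial vs. parallel computation: the same independent jobs
# run first one after another on one CPU, then concurrently on all
# available CPUs via a process pool.
import time
from multiprocessing import Pool, cpu_count

def cpu_heavy(n):
    """One discrete, independent piece of work (busy arithmetic)."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    jobs = [2_000_000] * 8

    t0 = time.perf_counter()
    serial = [cpu_heavy(n) for n in jobs]      # one instruction stream at a time
    t1 = time.perf_counter()

    with Pool(processes=cpu_count()) as pool:  # parts execute simultaneously
        parallel = pool.map(cpu_heavy, jobs)
    t2 = time.perf_counter()

    assert serial == parallel                  # same answers, less wall time
    print(f"serial:   {t1 - t0:.2f}s")
    print(f"parallel: {t2 - t1:.2f}s on {cpu_count()} CPUs")
```

The speedup is bounded by the number of CPUs and by any part of the work that cannot be decomposed, which is why the last list above requires the problem to break apart into discrete pieces that can be solved simultaneously.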