Hardware Systems: Processor and Board Alternatives
Afshin Attarzadeh
INFOTECH, Universität Stuttgart, [email protected]

Abstract: Parallel computing now has a great influence on daily life. Weather forecasting, air traffic control, the modeling of nuclear experiments instead of actually performing them, and many other applications depend directly on parallel computing. Among parallel systems, clusters are in increasingly common use today. Like other concepts in parallelism, the cluster is a system issue that involves both software and hardware and the relation between them. This paper mostly considers the hardware part.

Introduction – What is cluster computing?

As Pfister mentions in his book [1], there are three ways to improve performance:
− Work harder
− Work smarter
− Get help

Working harder corresponds to using faster hardware: faster processors, faster and larger memory and storage, and more capable peripheral devices. Working smarter means doing things more efficiently, through faster and more efficient algorithms and techniques. Getting help corresponds to parallel processing.

Clusters can be a good way to exploit all three aspects of performance improvement at a reasonable expense. They are flexible enough that commodity components can be used in their hardware structure, and this flexibility lets designers develop high-performance parallel algorithms. Clusters are easy to configure and, moreover, they are scalable, which means they can readily follow technology advances in both hardware and software.
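The "get help" idea can be illustrated with a minimal sketch that is not taken from the paper: a list of numbers is split into chunks, each chunk is summed by a separate worker, and the partial results are combined. The function names `split_into_chunks` and `sum_with_helpers` are hypothetical, and threads on a single machine merely stand in for cluster nodes. In standard CPython, threads generally do not speed up CPU-bound work, so the sketch shows only the decomposition, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks contiguous pieces of near-equal size."""
    # Ceiling division, with a minimum chunk size of 1 so range() gets a valid step.
    size = max(1, -(-len(items) // n_chunks))
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_with_helpers(items, n_workers=4):
    """Sum a list by handing one chunk to each worker and combining the partial sums."""
    chunks = split_into_chunks(items, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker computes the sum of one chunk (hypothetical stand-in for a node).
        partial_sums = list(pool.map(sum, chunks))
    # The combining step runs on the caller, like a head node gathering results.
    return sum(partial_sums)


if __name__ == "__main__":
    print(sum_with_helpers(list(range(1, 101)), n_workers=4))
```

On a real cluster the chunks would be sent to separate machines over the network, so the cost of distributing the data and gathering the results becomes part of the design.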
According to Buyya [2], the most common scalable parallel computer architectures can be classified as follows:
• Massively Parallel Processors (MPP)
• Symmetric Multiprocessors (SMP)
• Cache-Coherent Nonuniform Memory Access (CC-NUMA)
• Distributed Systems
• Clusters

Clusters – Definition and Architecture

There are many different definitions of clusters. Some of them are given below:
− “A commonly found computing environment consists of many workstations connected together by a local area network. The workstations, which have become increasingly powerful over the years, can together, be viewed as a significant computing resource. This resource is commonly known as cluster of workstations.” [3]
− “A computer cluster is a group of locally connected computers that work together as a unit.” [4]
− “A cluster is a type of parallel or distributed processing system, which consists of a collection of interconnected stand-alone computers working together as a single, integrated computing resource.” [2]

Although these definitions differ slightly, they all describe a cluster as a single computing resource even though it contains several nodes. In addition, all of these nodes are connected locally.

Clusters are typically used for two major reasons. One is High Availability (HA), for greater reliability; the other is High Performance Computing (HPC), to provide more computational power than a single computer can.

A schematic view of the cluster architecture is given below.

Fig. 1. Cluster Computer Architecture [4]

4T-Machines

Clusters are also referred to as 4T machines. The name reflects four "tera" figures: a computation power of tera instructions per second (TIPS), terabytes of memory storage, terabytes per second of I/O bandwidth, and terabytes per second of communication bandwidth. []

Why clusters?
One reason for using clusters in parallel computing is the flexibility to use commodity or commercial hardware components. This leads to a large cost reduction without a comparable loss in overall performance. Another strong reason is that a cluster can be built from scratch with an arbitrary number of computing nodes, and that number can be changed later (scalability).

Classification of clusters

Clusters are widely used today because they offer the following features at a relatively low cost [2]:
• High Performance
• Expandability and Scalability
• High Throughput
• High Availability

Clusters are classified into many categories based on various factors. Some of these classifications are given briefly below [2]:
1. Application Target
• High Performance Clusters, for computational science applications
• High Availability Clusters, for mission-critical applications
2. Node Ownership
• Dedicated Clusters: all nodes are reserved for the cluster and each node has a dedicated task.
• Non-dedicated Clusters: the nodes also serve as workstations, and cluster tasks are run by stealing idle cycles from them.
3. Node Hardware
• Clusters of PCs (CoPs) or Piles of PCs (PoPs)
• Clusters of Workstations (COWs)
• Clusters of SMPs (CLUMPs)

Cluster Hardware Components

The major hardware components needed to build a cluster are processors, memory and cache, disk storage and I/O interfaces, and network interfaces. Of these, processor and memory technology have improved most rapidly. It was once said that the ultimate goal for processors would be to reach a clock frequency of 1 GHz. Processors are now produced with clock frequencies of up to 3 GHz, a threefold increase in clock rate.

Processors

During the past decades there has been outstanding progress in microprocessor architecture, which has made single-chip processors as powerful as earlier supercomputers, at a much lower price.
Some of the architectures that have been developed are RISC, CISC, VLIW¹ and Vector [2]. The following is brief information on the latest products of the major processor manufacturers.

Alpha [8]

The DEC Alpha, also known as the Alpha AXP, is a 64-bit RISC microprocessor originally developed and fabricated by Digital Equipment Corp. (DEC), which used it in its own line of workstations and servers. Designed as a successor to the VAX line of computers, it supported the VMS operating system as well as Digital UNIX. Later, open source operating systems also ran on the Alpha, notably Linux and BSD UNIX flavors. Microsoft supported the processor until Windows NT 4.0 SP6 and did not extend Alpha support beyond beta 3 of Windows 2000.

Intel [9][10][11]

Intel processors are now popular among PC users. The current generation of the Intel x86 processor family is the Pentium 4 family. Intel offers different CPUs for different applications, in four general types: desktop, mobile, server, and workstation.

Itanium, the new generation of Intel processors for server solutions, is designed with an array of innovative features to extract greater instruction-level parallelism, including speculation, predication, large register files, a register stack, an advanced branch architecture, and many others. Its 64-bit address registers let it address far more memory than a 32-bit design, which benefits server applications. The Itanium also has an innovative floating-point architecture that supports the high performance requirements of workstation applications such as digital content creation, design engineering, and scientific analysis. For compatibility reasons, Itanium-based processors can run IA-32 applications on an Itanium-based operating system that supports the execution of IA-32 applications. Mixed IA-32 and Itanium-based code execution is also supported.
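The effect of 64-bit address registers can be made concrete with a short piece of arithmetic. The sketch below is an illustration added here, not part of the cited Intel material: it compares the number of byte addresses reachable with 32-bit and 64-bit addresses. The function names are hypothetical, and the result is the theoretical size of a flat address space; it says nothing about how many address bits a particular processor actually implements.

```python
def addressable_bytes(address_bits: int) -> int:
    """Return the number of distinct byte addresses for a given address width."""
    if address_bits <= 0:
        raise ValueError("address_bits must be positive")
    return 2 ** address_bits


def human_readable(n_bytes: int) -> str:
    """Format a byte count using binary units, rounding down to a whole number."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    value = n_bytes
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value} {unit}"
        value //= 1024


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, "bits:", human_readable(addressable_bytes(bits)))
```

A 32-bit address reaches 4 GiB, while a 64-bit address reaches 16 EiB in theory, which is why 64-bit addressing matters for servers with large memory.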
For the server and workstation types, Intel has introduced a new technology named Hyper-Threading [10]. Hyper-Threading Technology enables multi-threaded software applications to execute threads in parallel. This level of threading technology has never been seen before in a general-purpose microprocessor. In the past, performance was improved by enabling threading in software: instructions were split into multiple streams so that multiple processors could act upon them. With Hyper-Threading Technology, threading can be exploited at the processor level, which makes more efficient use of processor resources for greater parallelism and improved performance on today's multi-threaded software.

¹ Very Long Instruction Word

Hyper-Threading Technology provides thread-level parallelism (TLP) on each processor, resulting in increased utilization of processor execution resources. The higher utilization in turn yields higher processing throughput. Hyper-Threading Technology is a form of simultaneous multi-threading (SMT), in which multiple threads of software applications can run simultaneously on one processor. This is achieved by duplicating the architectural state on each processor while sharing one set of processor execution resources.

AMD [12][13]

AMD and Intel are progressing at a similar pace in the area of processors. Like Intel, AMD offers its processors for four application areas: desktop, mobile, server, and workstation. For server solutions AMD currently offers the Athlon MP and the Opteron. The Athlon MP is based on the desktop Athlon XP, with the addition of the Smart MP technology introduced by AMD. Smart MP technology uses dual, independent point-to-point system buses to increase the available bus bandwidth.
Along with its cache management system, Smart MP technology allows high-speed communication between processors, helps reduce data transfer latencies, and helps ensure that both processors work to their full