
Building Scientific Clouds: The Distributed, Peer-to-Peer Approach

A Thesis Presented to the Graduate School of Clemson University

In Partial Fulfillment of the Requirements for the Degree
Master of Science
Computer Science

by Linton Abraham Vadakedathu May 2010

Accepted by:
Dr. Sebastien Goasguen, Committee Chair
Dr. James Martin
Dr. Pradip Srimani

Abstract

The scientific community is constantly growing in size. The increase in personnel and projects has resulted in a need for large amounts of storage, CPU power and other resources, and it has become necessary to acquire these resources in an affordable manner that is sensitive to workloads. In this thesis, the author presents a novel approach that provides the communication platform needed to support such large-scale scientific projects. These resources can be difficult to acquire due to NATs, firewalls and other site-based restrictions and policies. Methods used to overcome these hurdles are discussed in detail, along with other advantages of using such a system, which include: increased availability of necessary computing infrastructure; increased grid resource utilization; reduced dependence on the user; and reduced job execution time. Experiments were conducted on local infrastructure on the Clemson University campus as well as on resources provided by other federated grid sites.

Dedication

This thesis is dedicated to my family who stood by me with love and support. It is also dedicated to Dr. Sebastien Goasguen and the School of Computing at Clemson University.

Acknowledgments

The author would like to acknowledge Dr. Sebastien Goasguen for his guidance and support in all research and course work done during the Master's program. The author thanks Michael Murphy and Michael Fenn for their time and support. The author acknowledges the guidance of Dr. James Martin and Dr. Pradip Srimani towards the correction and completion of this thesis. Finally, the author would like to acknowledge David Wolinsky of the University of Florida for his assistance in various research related undertakings.

Contents

Title Page
Abstract
Dedication
Acknowledgments
List of Tables
List of Figures
List of Listings

1 Introduction
1.1 Research Questions
1.2 Motivation for Research
1.3 Distributed Computing
1.4 Virtualization and Cloud Computing
1.5 High Throughput Computing and Condor
1.5.1 Condor at Clemson
1.6 Peer-to-Peer Architecture vs. Client-Server Architecture
1.7 Distributed Hash Tables
1.8 From P2P to DHT to Brunet, then came IP Over P2P (IPOP)
1.8.1 Architectural Overview
1.9 Virtual Organization Cluster (VOC)
1.9.1 Physical Administrative Domain
1.9.2 Virtual Administrative Domain
1.9.3 Provisioning and Execution of Virtual Machines

2 Related Work
2.1 Chord
2.2 Pastry
2.3 Kademlia
2.4 Brunet (Based on Symphony)
2.5 Work Related to VOCs

3 Proposed System
3.1 Overlay Networking Details
3.2 IPOP Bootstraps and Namespaces in Clemson
3.3 Multi-site Communication and Watchdog
3.4 Condor IPOP Pool - Local/Single Site
3.5 Condor IPOP Pool - Multiple Grid Sites

4 Experiments & Results
4.1 Building the Physical and Virtual Clusters
4.2 Local Testing
4.2.1 Experiments
4.2.2 Results
4.3 VOC and the Amazon Elastic Compute Cloud (EC2)
4.3.1 Experiments
4.3.2 Results

5 Conclusions

Bibliography

List of Tables

4.1 Observed makespan lengths and system throughputs for 10 overlay experiment trials

List of Figures

1.1 General Representation of Grids
1.2 Serial vs. Parallel Computation
1.3 Distributed Computing
1.4 Cloud Computing
1.5 Condor Architecture
1.6 Client-Server Architecture
1.7 Peer-to-Peer System
1.8 Distributed Hash Table
1.9 DHT Implementation
1.10 IPOP Architecture
1.11 Structure of IPOP
1.12 Virtual Organization Clusters

3.1 IPOP as a STUN server
3.2 IPOP DHCP Configuration File
3.3 IPOP Namespace Configuration File
3.4 IPOP Bootstrap Configuration File
3.5 IPOP Node Configuration File
3.6 Single Site VOC
3.7 VOC over a single site
3.8 Multiple Grid Site VOC

4.1 Short Jobs without the Overlay Network
4.2 Short Jobs with the Overlay Network
4.3 Long Jobs without the Overlay Network
4.4 Long Jobs with the Overlay Network
4.5 Amazon EC2 Primary Distributed Architecture[2]
4.6 Amazon Home UI
4.7 Amazon Instance UI
4.8 Short Jobs executing in Multiple Grid sites over IPOP
4.9 Long Jobs executing in Multiple Grid sites over IPOP

Listings

Chapter 1

Introduction

“Grid computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation.” - Ian Foster [41]. This statement, made by Ian Foster in 2001, serves as a standard for all research and development conducted in the grid community. The key difference between grid computing and distributed computing lies in the allocation of compute resources. Grid computing requires access to compute resources that are exclusively dedicated, which include powerful workstations, supercomputers and servers. Distributed computing, on the other hand, does not have such a stringent requirement: it requires only a subset of resources from any single computation unit. This means that at any instant of time, a system could be 'live' (actively deployed in a distributed environment) and be used by the owner of the system simultaneously. Such operations are abstracted away from the user in such a way that his work is never affected. Despite these differences, the scientific community generally agrees that grid computing is a subset of distributed computing.

Large applications are often required for scientific simulations, research, modeling and other tasks. Single computation units, despite having multiple processors, huge amounts of RAM and disk space, can neither satisfy the computing requirements of ever-growing data sets nor provide results in a reasonable amount of time. Therefore a gradual but definite adoption of large-scale distributed systems is warranted[39]. Clemson University has a large supercomputing infrastructure in the Palmetto Cluster, which has over 6000 cores and is ranked 89th among the TOP500 Supercomputing Sites in the world.

Figure 1.1: A General Representation of Grids comprising not just computing and networking hardware but geographic locations and people

Figure 1.2: Serial vs. Parallel Computation - The effect of parallelizing code and using distributed resources[10].

The Cyber Infrastructure Research Group[31] at Clemson is the leading developer and strategist for distributed computing and infrastructure on campus. Led by Dr. Sebastien Goasguen, the group looks to leverage campus-wide resources, including the Palmetto Cluster, to research and develop distributed applications[3]. The author describes virtualization, which is yet another important concept in distributed computing and makes up a significant part of the definition of the term Cloud Computing. These topics are discussed in detail in Section 1.4 and in Chapters 2 and 3. This thesis focuses on the networking infrastructure that was researched and deployed as part of the Virtual Organization Cluster model[53]. A large-scale distributed environment with systems deployed in widely separated geographic locations needs a communication system as one of its foundations. But there are many hurdles to cross before such a networking environment can be provided; some of these include site administrative policies, NATs, firewalls and various other system-wide constraints that arise from the management of heterogeneous systems. The predominant networking direction has been toward a client-server architecture. The author takes a new direction - the peer-to-peer approach - whose advantages and details will be discussed in the forthcoming sections of this thesis.

1.1 Research Questions

In an effort to develop an efficient and usable distributed infrastructure, this thesis looks to answer the following research questions:

1. “Given a Virtual Organization Cluster, is it possible to resize it based on workload?”

2. “Is it possible to acquire off-site grid resources if the local resources are exhausted? If so, how can this be done?”

3. “What impact would the acquisition of these resources have on site administrators and end users?”

1.2 Motivation for Research

The primary motivation that steered this research was to develop a system that:

1. Enables scientists (and other users) to dynamically acquire large amounts of computing resources in a cost-efficient manner,

2. Requires less administration and maintenance.

3. Requires little-to-no adaptability from the user both in terms of skills and software.

4. Increases system utilization (whether in a purely grid environment or a purely distributed environment (clouds)).

1.3 Distributed Computing

“A Distributed System is an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate together to perform a single or small set of related tasks”[13]. A quintessential example of such a system is Condor, a high-throughput computing job scheduler[64]. An introduction to Condor is given in Section 1.5, and a description of how it was efficiently deployed with the Clemson VOC model is given in Section 1.9 and Chapter 3. As scientific data sets and research requirements grow rapidly, existing infrastructure, including high-end servers, suffers increased CPU usage and often crashes. It is necessary to have sufficient data and resources backed up so that important systems, which include web servers and hospital emergency systems, can continue to function efficiently at another location (geographically or virtually). Parallel/distributed systems are used:

• To Divide and Conquer. These systems are used to run applications that are designed to run concurrently or that exhibit some level of parallelism. Such applications may require parallelism by design, as in the case of simulators and modeling systems, or may use it solely to run single-processor applications within a much smaller time frame.

• In Data Redundancy and Replication: As mentioned above, web servers and other essential systems need to be up and running all the time. Therefore, server load often needs to be split between multiple instances of the same system. This reduces the load on any single server in a seamless manner that abstracts all high-level implementation detail from the end user. Load balancing between servers (systems) on the same LAN or WAN is one use case of a distributed system.

Figure 1.3: A General view of Distributed Computing with focus on resource management, authentication and sharing

1.4 Virtualization and Cloud Computing

An environment that 'simulates' or 'models' a set of features independent of the physical environment is called a virtual environment. The term 'Virtual Machine' originated from the IBM M44/44X project in the 1960s[32]. The IBM M44/44X was an experimental computer system, designed and operated at IBM's Thomas J. Watson Research Center at Yorktown Heights, New York. It was based on an IBM 7044 (the 'M44') and simulated multiple 7044 virtual machines (the '44X'), using both hardware and software. Key team members were Dave Sayre and Rob Nelson. This was a groundbreaking machine, used to explore paging, the virtual machine concept, and computer performance measurement. It was purely a research system, and was cited in 1981 by Peter Denning as an outstanding example of experimental computer science[12].

Popek and Goldberg[58] introduced a classification of the instructions of an instruction set architecture (ISA) that are relevant to virtualization, dividing them into three different groups:

1. Privileged instructions - Those that trap if the processor is in user mode and do not trap if it is in system mode.

2. Control sensitive instructions - Those that attempt to change the configuration of resources in the system.

3. Behavior sensitive instructions - Those whose behavior or result depends on the configuration of resources (the content of the relocation register or the processor’s mode).

The main result of Popek and Goldberg’s analysis can then be expressed as follows.

• Theorem 1. For any conventional third-generation computer, a VMM may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions. Intuitively, the theorem states that to build a VMM it is sufficient that all instructions that could affect the correct functioning of the VMM (sensitive instructions) always trap and pass control to the VMM. This guarantees the resource control property. Non-privileged instructions must instead be executed natively (i.e., efficiently). The holding of the equivalence property also follows.

• Theorem 2. A conventional third-generation computer is recursively virtualizable if a) it is virtualizable and b) a VMM without any timing dependencies can be constructed for it.

Virtual machines are separated into two major categories, based on their use and degree of correspondence to any real machine. A system virtual machine provides a complete system platform which supports the execution of a complete operating system (OS). In contrast, a process virtual machine is designed to run a single program, which means that it supports a single process. An essential characteristic of a virtual machine is that the software running inside is limited to the resources and abstractions provided by the virtual machine: it cannot break out of its virtual world. Now that we have a good understanding of how virtualization works, we can delve into the realm of Cloud Computing. Cloud computing is an architecture, a new technology, a new perception. Cloud computing is accessing and providing compute resources via the Internet or a private network. That means storage, computing, entertainment hosting, etc., can be administered remotely without even having the necessary infrastructure. Among the plethora of definitions found on the Internet, there are some key attributes that characterize 'Clouds'[5]:

• Dynamic computing infrastructure

Figure 1.4: Cloud Computing - A pictorial representation[4]

• IT service-centric approach

• Self-service based usage model

• Minimally or Self managed platform

• Consumption based billing

A technical definition is "a computing capability that provides an abstraction between the computing resource and its underlying technical architecture (e.g., servers, storage, networks), enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction"[6]. This definition states that clouds have five essential characteristics: on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service. Narrowly speaking, cloud computing is client-server computing that abstracts the details of the server away: one requests a service (resource), not a specific server (machine). However, cloud computing may be conflated with other terms, including client-server and utility computing, and the term has been criticized as vague and as referring to "everything that we currently do"[6].

1.5 High Throughput Computing and Condor

For many experimental scientists, scientific progress and quality of research are strongly linked to computing throughput. In other words, most scientists are concerned with how many floating point operations per month or per year they can extract from their computing environment rather than the number of such operations the environment can provide them per second or minute. Floating point operations per second (FLOPS) has been the yardstick used by most High Performance Computing (HPC) efforts to evaluate their systems. Little attention has been devoted by the computing community to environments that can deliver large amounts of processing capacity over long periods of time. We refer to such environments as High Throughput Computing (HTC) environments[21]. The key to HTC is effective management and utilization of all available computing resources. Since the computing needs of most scientists can be satisfied these days by commodity CPUs and memory, high efficiency does not play a major role in an HTC environment. The main challenge a typical HTC environment faces is how to maximize the amount of resources accessible to its customers. Distributed ownership of computing resources is the major obstacle such an environment has to overcome in order to expand the pool of resources it can draw from. Recent trends in the cost/performance ratio of computer hardware have placed the control (ownership) over powerful computing resources in the hands of individuals and small groups. These distributed owners will be willing to include their resources in an HTC environment only after they are convinced that their needs will be addressed and their rights protected[21].

Condor is an open source high-throughput computing software framework for coarse-grained distributed parallelization of computationally intensive tasks. It can be used to manage workload on a dedicated cluster of computers, and/or to farm out work to idle desktop computers, so-called cycle scavenging. Condor runs on Linux, Unix, Mac OS X, FreeBSD, and contemporary Windows operating systems. Condor can seamlessly integrate both dedicated resources (rack-mounted clusters) and non-dedicated desktop machines (cycle scavenging) into one computing environment. By way of example, the NASA Advanced Supercomputing facility (NAS) Condor pool consists of approximately 350 SGI and Sun workstations purchased and used for software development, visualization, email, document preparation, etc. Each workstation runs a daemon that watches user I/O and CPU load. When a workstation has been idle for two hours, a job from the batch queue is assigned to the workstation and will run until the daemon detects a keystroke, mouse motion, or high non-Condor CPU usage. At that point, the job is removed from the workstation and placed back on the batch queue. Condor can run both sequential and parallel jobs. Sequential jobs can be run in several different "universes", including "vanilla", which provides the ability to run most "batch ready" programs, and the "standard universe", in which the target application is re-linked with the Condor I/O library, which provides for remote job I/O and job checkpointing. Condor also provides a "local universe" which allows jobs to run on the "submit host"[34].

Figure 1.5: The Condor Architecture. It is built around the central manager, which schedules and maintains jobs among the various client/submit machines.

1.5.1 Condor at Clemson

Condor has been successfully deployed on approximately 3000 machines in over 50 locations around the Clemson University campus. Microsoft Windows and Linux are the primary operating systems deployed on campus. Clemson has deployed a Condor pool consisting of Windows Vista machines in the public computer labs and several other groups of machines (Linux, Solaris, etc.). These machines are available to Clemson faculty, students, and staff with high-throughput computing needs. Users can create their own Condor submit machines by downloading the appropriate software, and can even contribute their own idle cycles to the pool. Besides these machines, the Palmetto Cluster[16] houses over 1000 high-end servers that are capable of providing 69 TFLOPS, making it among the top 100 fastest in the world.
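To make the notions of universes and submit machines concrete, the following minimal sketch shows how a batch of vanilla-universe jobs could be described and handed to condor_submit. It is an illustration only, not taken from the Clemson deployment: the executable name, arguments and resource requests are hypothetical placeholders, and a working Condor installation is assumed.

import subprocess
import tempfile

# Hypothetical vanilla-universe submit description; simulate.sh and the
# resource requests are placeholders, not values used in this thesis.
SUBMIT_DESCRIPTION = """\
universe       = vanilla
executable     = simulate.sh
arguments      = --trials 1000
output         = simulate.$(Cluster).$(Process).out
error          = simulate.$(Cluster).$(Process).err
log            = simulate.$(Cluster).log
request_cpus   = 1
request_memory = 512M
queue 10
"""

def submit_jobs(description):
    # Write the description to a temporary file and let condor_submit
    # queue the ten job instances with the local scheduler (schedd).
    with tempfile.NamedTemporaryFile("w", suffix=".sub", delete=False) as f:
        f.write(description)
        path = f.name
    subprocess.run(["condor_submit", path], check=True)

if __name__ == "__main__":
    submit_jobs(SUBMIT_DESCRIPTION)

On a submit machine in such a pool, condor_submit would queue ten independent job instances, which the central manager then matches to idle execute machines.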

1.6 Peer-to-Peer Architecture vs. Client-Server Architecture

In a Client-Server architecture, there is one (or more) dedicated machine(s) that function as file servers, storage machines or print servers. Several client machines require one or more of the services that these servers provide. The networking in such an environment consists primarily of dedicated connections between the server and its clients. Communication between clients is facilitated through the server. One advantage of such a system is the ease of setting up a connection. Security and routing also factor into why so many systems are based on this architecture. In a client-server environment like Windows NT or Novell NetWare, files are stored on a centralized, high-speed file server PC that is made available to client PCs. Network access speeds are usually faster than those found on peer-to-peer networks, which is reasonable given the vast numbers of clients that this architecture can support. Nearly all network services like printing and electronic mail are routed through the file server, which allows networking tasks to be tracked. Inefficient network segments can be reworked to make them faster, and users' activities can be closely monitored. Public data and applications are stored on the file server, where they are run from the client PCs' locations, which makes upgrading software a simple task: network administrators can simply upgrade the applications stored on the file server, rather than having to physically upgrade each client PC[8].

Figure 1.6: A very basic Client-Server Architecture

In a peer-to-peer system, there are no single servers or dedicated machines. Each machine functions in a dual client-server mode: acting as a client requiring service and as a server providing service at the same time. A peer-to-peer network allows two or more PCs to pool their resources together. Individual resources like disk drives, CD-ROM drives, and even printers are transformed into shared, collective resources that are accessible from every PC. Unlike client-server networks, where network information is stored on a centralized file server PC and made available to tens, hundreds, or thousands of client PCs, the information stored across peer-to-peer networks is uniquely decentralized. Because peer-to-peer PCs have their own hard disk drives that are accessible by all computers, each PC acts as both a client (information requestor) and a server (information provider). As an example, a peer-to-peer network can be built with either 10BaseT cabling and a hub or with a thin coax backbone. 10BaseT is best for small workgroups of 16 or fewer users that do not span long distances, or for workgroups that have one or more portable computers that may be disconnected from the network from time to time. After the networking hardware has been installed, a peer-to-peer network software package must be installed onto all of the PCs. Such a package allows information to be transferred back and forth between the PCs, hard disks, and other devices when users request it[8].

Figure 1.7: A Peer-to-Peer system has no dedicated servers. Instead each machine in the system acts as a client and server. This ensures high availability and fault tolerance.

1.7 Distributed Hash Tables

In the world of decentralization, distributed hash tables (DHTs) have had a revolutionary effect. The chaotic, ad hoc topologies of the first-generation peer-to-peer architectures have been superseded by a set of topologies with emergent order, provable properties and excellent performance. Knowledge and understanding of DHT algorithms is going to be a key ingredient in future developments of distributed applications.

Figure 1.8: The basic functioning of most Distributed Hash Tables, where human-readable data is converted into a secure key using a hash function and then distributed across a known set of machines.

A number of research DHTs have been developed by universities, picked up by the Open Source community and implemented. A few of them are discussed in Chapter 2, including Brunet[26], which is the DHT library that IPOP[42] is based on. A few proprietary implementations exist as well, but currently none are available as SDKs; rather, they are embedded in commercially available products. Each DHT scheme is generally pitched as being an entity unto itself, different from all other schemes. In actuality, the various available schemes all fit into a multidimensional matrix: take one, make a few tweaks, and you end up with one of the other ones. Existing research DHTs, such as Chord, Kademlia and Pastry, are therefore starting points for the development of your own custom schemes. Each has properties that can be combined in a multitude of ways. In order to fully express the spectrum of options, let's start with a basic design and then add complexity in order to gain useful properties[9]. Basically, a DHT performs the functions of a hash table. You can store a key and value pair, and you can look up a value if you have the key. Values are not necessarily persisted on disk, although you certainly could base a DHT on top of a persistent hash table, such as Berkeley DB; and in fact, this has been done. The interesting thing about DHTs is that storage and lookups are distributed among multiple machines. Unlike existing master/slave database replication architectures, all nodes are peers that can join and leave the network freely. Despite the apparent chaos of periodic random changes to the membership of the network, DHTs make provable guarantees about performance[9].

Figure 1.9: In a real DHT implementation, each node would run on a different machine, and all calls to it would need to be communicated over some socket protocol. Each node is itself a standard hash table. All we need to do to store or retrieve a value is find the appropriate node in the network, then store or read the value in that node's hash table. Each node points to another node called its successor; the last node points back to the first node, forming the network overlay.

The structure of a DHT can be decomposed into several main components. The foundation is an abstract keyspace, such as the set of 160-bit strings. A keyspace partitioning scheme splits ownership of this keyspace among the participating nodes. An overlay network then connects the nodes, allowing them to find the owner of any given key in the keyspace. DHTs characteristically emphasize the following properties[7]:

• Decentralization: the nodes collectively form the system without any central coordination.

• Scalability: the system should function efficiently even with thousands or millions of nodes.

• Fault tolerance: the system should be reliable (in some sense) even with nodes continuously joining, leaving, and failing.

A key technique used to achieve these goals is that any one node needs to coordinate with only a few other nodes in the system, most commonly O(log n) of the n participants, so that only a limited amount of work needs to be done for each change in membership. Some DHT designs seek to be secure against malicious participants and to allow participants to remain anonymous, though this is less common than in many other peer-to-peer (especially file-sharing) systems. Finally, DHTs must deal with more traditional distributed systems issues such as load balancing, data integrity, and performance (in particular, ensuring that operations such as routing and data storage or retrieval complete quickly)[7].

Some common applications where DHTs are significantly used include:

• File sharing: store the hash of the file in the DHT, and hashes of keywords relevant to the file. People looking for the file can then search by keyword, or if they know the exact hash of the file, search for that hash.

• Finding people (e.g. in a chat program): store the hash of their public key, and hashes of their names to the DHT.

• Providing a service (e.g. distributed backup): store a particular hash to the DHT in order to indicate that you are willing to provide a particular service (perhaps the hash of a short description of that service).
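To make the store-and-lookup mechanics described in this section concrete, the following toy sketch places node identifiers and keys on a 160-bit SHA-1 ring and stores each key at its successor node. It is an illustration only: real DHTs, such as those surveyed in Chapter 2, add routing tables, replication and churn handling, and the node and key names below are arbitrary.

import hashlib
from bisect import bisect_left

RING_BITS = 160
RING_SIZE = 2 ** RING_BITS

def ring_id(name):
    # Map an arbitrary string (node name or key) onto the identifier circle.
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % RING_SIZE

class ToyDHT:
    def __init__(self, node_names):
        # Each node is just a position on the ring plus a local hash table.
        self.ring = sorted(ring_id(n) for n in node_names)
        self.tables = {node: {} for node in self.ring}

    def successor(self, key_id):
        # First node clockwise from the key; wrap around past the last node.
        i = bisect_left(self.ring, key_id)
        return self.ring[i % len(self.ring)]

    def put(self, key, value):
        self.tables[self.successor(ring_id(key))][key] = value

    def get(self, key):
        return self.tables[self.successor(ring_id(key))].get(key)

if __name__ == "__main__":
    dht = ToyDHT(["node-a", "node-b", "node-c", "node-d"])
    dht.put("dataset.tar.gz", "held by whichever node owns this key")
    print(dht.get("dataset.tar.gz"))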

1.8 From P2P to DHT to Brunet, then came IP Over P2P (IPOP)

Designed at the University of Florida at Gainesville, IPOP (IP-over-P2P) is a user-level overlay networking system that supports IP tunneling over UDP and TCP transports, including support for NAT traversal for UDP using hole punching.

P2P networks can be made self-configurable. They allow user mobility and are scalable. They also provide extremely robust service, which motivates the choice of P2P routing as the basis for IPOP. Through IPOP, resources spanning multiple domains can be aggregated into a virtual IP network providing bidirectional connectivity[42]. The protocols support seamless, self-configured addition of nodes to a virtual IP network and might be classified as a P2P protocol for VPNs (virtual private networks). The IPOP virtual IP address space is routable within the P2P overlay; however, it is decoupled from the address space of the physical Internet infrastructure: IP packets are payloads that tunnel through the P2P overlay.

Figure 1.10: IPOP Architecture[15]

Current network virtualization techniques for Grid computing require an administrator to set up overlay routing tables. Hence, the process of adding, configuring and managing clients and servers that route traffic within the overlay is difficult to scale. Although topology adaptation is possible using previously proposed techniques, adaptive routes are coordinated by a centralized server. Both VNET[27] and VIOLIN[47] can provide a robust overlay through redundancy. However, the effort required to preserve robustness would increase every time a new node is added and the network grows in size. In contrast, the use of P2P routing to overlay virtual IP traffic differentiates IPOP from existing network virtualization solutions with respect to the following issues[66]:

• Scalability: Network management in a P2P-based routing overlay scales to large numbers because routing information is naturally self-configured, decentralized, and dynamically adaptive in response to nodes joining and leaving. Adding a new resource to the virtual IP network requires minimal effort, which is independent of the current size of the network. Performance scaling leverages the fact that each node contributes bandwidth and processing power, so that the resources of a system grow as nodes join.

Figure 1.11: Structure of an IPOP-encapsulated IP Packet[42]

• Resiliency: P2P networks are robust even in the face of high failure rates. An IP-over-P2P overlay benefits from the synergy of fault-tolerant techniques applied at different levels. The IPOP overlay dynamically adapts the routing of IP packets as nodes fail or leave the network; even if packets are dropped by such nodes, IP and the other protocols above it in the network stack have been designed to cope with such transient failures. The use of P2P overlays to provide robust routing has also been described.

• Accessibility: Existing network virtualization techniques can leverage previously described mechanisms to cross NATs and firewalls. These approaches require setting up globally reachable STUN or STUNT servers that aid in building the necessary NAT state through a carefully crafted exchange of packets. With P2P networks, each overlay node can provide this functionality for the detection of NATs and their subsequent traversal. This approach is decentralized and introduces no dedicated servers[42].

1.8.1 Architectural Overview

The IPOP architecture has two key components: a virtualized network interface for capturing and injecting IP packets into the virtual network, and a P2P routing substrate that encapsulates, tunnels and routes packets within the overlay. Let us examine how IP packets are captured from and injected to the host, and then how IP packets are routed on the Brunet P2P network[66][44][43].

• Capturing IP from the Host - The prototype captures packets from a virtual device attached to the source host, tunnels them over the application-level Brunet P2P overlay, and then injects the packets back into a virtual device on the destination host. IPOP uses the tap device, which appears as a virtual network interface inside the host and is available for several Unix platforms and for Windows. Through tap, Ethernet frames injected by the kernel on the virtual interface can be read by a user-level application, and vice versa. IPOP runs as a C# program over the Brunet library. It reads and writes Ethernet frames from the tap device, and uses Brunet to assemble IP packets inside P2P packets and to route them to their destination. While IPOP sees Ethernet frames, it only routes IP packets; non-IP traffic, notably ARP traffic, is contained within the host. A host running IPOP is a P2P node, and can act as data source, receiver and router all at the same time. Since we are dealing with Ethernet-level frames at the IPOP hosts, we may come across non-IP frames such as ARP and RARP. The nonexistent gateway (a virtual gateway for which a static route and ARP entry are configured on the host) has a unique IP address in the virtual address space, and its Ethernet address is configured not to conflict with that of the local tap device. As a result, it is possible for the host to send out only the IP-based Ethernet frames and to contain ARP requests locally.

• Routing IP on the P2P Overlay - When a host needs to be added to the IPOP virtual network, the host administrator is required to set up a tap device in the host and bring it up with an IP address that is unique in the virtual IP address space. IPOP runs on the host as a P2P node whose address is the 160-bit SHA-1 hash of the IP address of the tap device (a small sketch of this mapping appears after this list). The application running on one host generates IP-based traffic, which the kernel transmits through the 'tap0' interface. The Ethernet frame is captured, an IP packet is extracted from it, and the packet is then encapsulated inside a P2P packet. The P2P packet, in turn, is sent to the P2P node whose address is the 160-bit SHA-1 hash of the destination IP address. Once the P2P packet is received at the destination node, the IP packet is extracted, an Ethernet frame is built from it, and the frame is written to the tap0 interface on the host. IPOP has been designed with the goal of providing open access among a set of trusted nodes spanning multiple sites through network virtualization.

• IPOP extensions for virtual machines - A driving application for IPOP in the context of Grid computing is to interconnect O/S virtual machines (e.g. Xen[25], KVM[60]) with the IP overlay, thereby virtualizing key Grid resources. A machine (physical or virtual) running IPOP must have a way of connecting to other P2P nodes without necessarily having a public IP address. We have implemented and tested with VirtualBox's, KVM's and Xen's bridged and NAT network interfaces. In all cases IPOP runs within the VM. In this case, the VM's virtual Ethernet card is configured with an IP address in the virtual address space, and routes and static ARP entries are set as described earlier. The virtual Ethernet card of the UML (User Mode Linux) guest attaches itself to a tap device on the host, and the UML kernel uses tap reads and writes to transmit Ethernet frames. A tap device can be opened by at most one process, and hence IPOP cannot directly read and write Ethernet frames from the same tap device. The tap device is therefore bridged [4] to another tap device on the host, from which IPOP can read and write Ethernet frames.

• Crossing firewalls and NATs - One of the many benefits we derive from overlaying IP on a P2P network is that the overlay we use, Brunet[26], already deals with connecting nodes which are behind firewalls and NAT devices. Of the four types of NATs used today, all have the property that if a UDP packet is sent from IP address A and port a to IP address B and port b, the NAT device will allow packets from IP address B and port b to flow back to IP address A and port a. Any system that does not permit this is broken, because it would allow outbound packets without allowing any response to those packets, which would satisfy the requirements of almost no applications. In addition to the above property, three out of the four common NAT types (all but the symmetric NAT) use the same external mapping for a given internal (IP, port) pair. Thus each connection in the P2P network is an opportunity for a node to discover whether any IP translation is going on and, if so, to record its translated address. Once a node discovers that its address is being translated, it advertises that translated address to other nodes over the Brunet network. Furthermore, this approach is decentralized and introduces no single points of failure or dedicated servers (unlike the STUN protocol). While this NAT/firewall traversal may seem like a subversion of network policies, this is arguably not the case. In fact, both nodes in this scenario are actively attempting to contact the other, and as such, any unrequested packets are still filtered by the NAT or firewall.

• Multiple IPs and mobility - The current solution of mapping IP addresses to Brunet addresses using SHA-1 hashes requires one P2P node per IP address. Because of this requirement, a single IPOP node running on a host cannot "route for" multiple virtual IPs (e.g. multiple VMs hosted by a physical machine). The problem is aggravated when virtual IP addresses are mobile, a situation that can occur when virtual machines are allowed to migrate[47][67]. A solution to this problem involves using the P2P system as a distributed hash table (DHT): the mapping for each IP address is kept at a node called its "Brunet-ARP-Mapper". Now when a node has an IP packet to send, it inquires about the Brunet destination from the corresponding "Brunet-ARP-Mapper", whose Brunet address is the 160-bit SHA-1 hash of the destination IP address. This information can then be cached at the source node. When a VM migrates (retaining its IP address), the information is updated at its corresponding "Brunet-ARP-Mapper".
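The address mapping used in the routing discussion above (and generalized by the Brunet-ARP-Mapper) reduces to hashing an IP address into the 160-bit overlay identifier space. The sketch below illustrates only that rule; the actual IPOP and Brunet code is written in C# and handles namespaces, address encoding and caching that are not shown here, and the example addresses are hypothetical.

import hashlib

def overlay_address(virtual_ip):
    # 160-bit overlay identifier derived from the dotted-quad virtual IP
    # string, mirroring the rule that a node's P2P address is the SHA-1
    # hash of its tap device's IP address.
    return int(hashlib.sha1(virtual_ip.encode()).hexdigest(), 16)

def p2p_destination(dest_ip):
    # To forward an IP packet, the sender addresses the encapsulating P2P
    # packet to the overlay node derived from the destination IP address.
    return overlay_address(dest_ip)

if __name__ == "__main__":
    print(hex(overlay_address("10.128.0.12")))  # this node's overlay address
    print(hex(p2p_destination("10.128.0.57")))  # where to tunnel a packet for that IP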

1.9 Virtual Organization Cluster (VOC)

The Virtual Organization Cluster Model was developed by Michael Murphy and Sebastien Goasguen at Clemson University; parts of this section have been reproduced from previous work published in [53]. Virtual Organizations (VOs) enable scientists to collaborate using diverse, geographically distributed computing resources. Requirements and membership of these VOs often change dynamically, with objectives and computing resource needs changing over time. Given the diverse nature of VOs, as well as the challenges involved in providing suitable computing environments to each VO, Virtual Machines (VMs) are a promising abstraction mechanism for providing grid computing services. Such VMs may be migrated from system to system to effect load balancing and system maintenance tasks. Cluster computing systems, including services that can take advantage of several available hypervisors, have already been constructed inside VMs. However, a cohesive view of virtual clusters for grid-based VOs has not been presented to date[53].

Implementing computational clusters with traditional multiprogramming systems may result in complex systems that require different software sets for different users. Each user is also limited to the software selection chosen by a single system administrative entity, which may be different from the organization sponsoring the user. Virtualization provides a mechanism by which each user entity might be given its own computational environment. Such virtualized environments would permit greater end-user customization at the expense of some computational overhead. The primary motivation for the work described here is to enable a scalable, easy-to-maintain system on which each Virtual Organization can deploy its own customized environment. Such environments will be scheduled to execute on the physical fabric, thereby permitting each VO to schedule jobs on its own private cluster.

Figure 1.12: Virtual Organization Clusters showing participating (Clemson and Duke) and non-participating (UFL) sites. The user submits jobs to his VO, which are then scheduled to member sites. When resources are exhausted, other VOs are contacted to effect load balancing.

Figure 1.12 presents a use case of Virtual Organization Clusters (VOCs) in a manner based on the operating principles of the Open Science Grid (http://www.opensciencegrid.org), a grid infrastructure known to support VOs instead of individual users. In Figure 1.12, there are two grid sites: one site belongs to a VO and the other site does not. In this case, VO members Clemson and Duke provide nodes that form a Virtual Organization Cluster (VOC). Since these are participating members of a VOC, VMs are used to execute the user jobs. Once the resources in VO 'A' are exhausted, VOCs allow the use of resources from other grid sites that do not belong to the VO. Such sites are known as 'non-participating sites'. Nodes acquired from such sites do not form VOC nodes and do not need to be virtual machines. The user jobs run as normal grid jobs, but the nodes appear in the scheduler queue like other nodes that belong to VO 'A'. Data transfers can be done through the OSG data-transfer mechanisms (i.e. Phedex and dCache) and can use the GridFTP protocol. If a site becomes full, work can be migrated to another site using VM migration mechanisms. This use case represents an ideal form of grid operation, which would provide a homogeneous computing environment to the users.

The Virtual Organization Cluster Model specifies the high-level properties of systems that support the assignment of computational jobs to virtual clusters owned by single VOs. Central to this model is a fundamental division of responsibility between the administration of the physical computing resources and the virtual machine(s) implementing each VOC. For clarity, the responsibilities of the hardware owners are said to belong to the Physical Administrative Domain (PAD). Responsibilities delegated to the VOC owners are part of the Virtual Administrative Domain (VAD) of the associated VOC. Each physical cluster has exactly one PAD and zero or more associated VADs. In practice, Virtual Organization Clusters can be supplied by the same entity that owns the physical computational resource, by the Virtual Organizations (VOs) themselves, or by a contracted third party. Similarly, the physical fabric on which to run the VOCs could be provided either by the VO or by a third party.

1.9.1 Physical Administrative Domain

The Physical Administrative Domain (PAD) contains the physical computer hardware, which comprises the host computers themselves, the physical network interconnecting those hosts, local and distributed storage for virtual machine images, power distribution systems, cooling, and all other infrastructure required to construct a cluster from hardware. Also within this domain are the host operating systems, virtual machine hypervisors, and central physical-level management systems and servers. Fundamentally, the hardware cluster provides the hypervisors needed to host the VOC system images as guests. An efficient physical cluster implementation requires some mechanism for creating multiple compute nodes from a single VO-submitted image file. One solution is to employ a hypervisor with the ability to spawn multiple virtual machine instances from a single image file in a read-only mode that does not persist VM run-time changes to the image file. Another solution would be to use a distributed file copy mechanism to replicate local copies of each VM image to each execution host. Without this type of mechanism, the VO would be required to submit one VM image for each compute node, which would result in both higher levels of Wide Area Network traffic and greater administration difficulty.

1.9.2 Virtual Administrative Domain

Each Virtual Administrative Domain (VAD) consists of a set of virtual machine images for a single Virtual Organization (VO). A VM image set contains one or more virtual machine images, depending upon the target physical system(s) on which the VOC system will execute. In the general case, two virtual machine images are required: one for the head node of the VOC, and one that will be used to spawn all the compute nodes of the VOC. When physical resources provide a shared head node, only a compute node image with a compatible job scheduler interface is required. VMs configured for use in VOCs may be accessed by the broader Grid in one of two ways. If the physical fabric at a site is configured to support both virtual head nodes and virtual compute nodes, then the virtual head node for the VOC may function as a gatekeeper between the VOC and the Grid, using a shared physical Grid gatekeeper interface as a proxy. In the simpler case, the single VM image used to construct the VOC needs to be configured with a scheduler interface compatible with the physical site. The physical fabric will provide the gatekeeper between the Grid and the VOC, and jobs will be matched to the individual VOC.

23 1.9.3 Provisioning and Execution of Virtual Machines

Virtual Organization Clusters are configured and started on the physical compute fabric by middleware installed in the Physical Administrative Domain. Such middleware can either receive a pre-configured virtual machine image (or pair of images) or provision a Virtual Organization Cluster on the fly using an approach such as In-VIGO [23], VMPlants [48], or installation of nodes via virtual disk caches [55]. Middleware for creating VOCs can exist directly on the physical system, or it can be provided by another (perhaps third-party) system. To satisfy VAD administrators who desire complete control over their systems, VM images can also be created manually and uploaded to the physical fabric with a grid data transfer mechanism such as the one depicted in the use case presented in Figure 1.12. Once the VM image is provided by the VO to the physical fabric provider, instances of the image can be started to form virtual compute nodes in the VOC. Since only one VM image is used to spawn many virtual compute nodes, the image must be read-only. Run-time changes made to the image are stored in RAM or in temporary files on each physical compute node and are thus lost whenever the virtual compute node is stopped. Since changes to the image are non-persistent, VM instances started in this way can be safely terminated without regard to the machine state, since data corruption is not an issue. As an example, VM instances started with the KVM hypervisor are abstracted on the host system as standard Linux processes. These processes can be safely stopped.
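One concrete way to obtain the read-only image behavior described above is the snapshot mode offered by QEMU/KVM, in which run-time writes go to a temporary file rather than back to the shared image. The sketch below is only an illustration of that idea, not the provisioning middleware used in this work; the image path, resource sizes and instance count are hypothetical, and a production setup would also attach the overlay network interface.

import subprocess

VM_IMAGE = "/images/voc-compute-node.img"   # hypothetical shared, read-only VO image

def start_instance(index):
    cmd = [
        "qemu-system-x86_64",
        "-enable-kvm",                      # hardware-assisted virtualization
        "-name", "voc-node-%d" % index,
        "-m", "1024",                       # 1 GiB of RAM per virtual compute node
        "-smp", "1",
        "-hda", VM_IMAGE,
        "-snapshot",                        # writes go to a temp file, never the image
        "-nographic",
    ]
    # Each instance is an ordinary host process; it can be stopped at any time
    # without risk of corrupting the shared image, since no changes persist.
    return subprocess.Popen(cmd)

if __name__ == "__main__":
    instances = [start_instance(i) for i in range(4)]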

Chapter 2

Related Work

A number of software systems have been developed with the goal of designing efficient ways to store and look up data in a distributed environment. The data could be neighboring-node information (IP addresses, node IDs, running time, whether a node can be reached or not), user data or even routing information. The author now describes and compares some well-known DHT libraries and P2P-based systems.

2.1 Chord

The Chord project aims to build scalable, robust distributed systems using a peer-to-peer distributed hash lookup primitive. Chord is decentralized and symmetric (like most P2P algorithms), and can find data using O(log N) messages, where N is the number of nodes in the system. Using the Chord lookup protocol, node keys are arranged in a circle. The circle cannot have more than 2^m nodes, and its IDs/keys range from 0 to 2^m - 1. IDs and keys are assigned an m-bit identifier using what is known as consistent hashing. The SHA-1 algorithm is the base hash function for consistent hashing. Consistent hashing is integral to the robustness and performance of Chord because both keys and IDs (IP addresses) are uniformly distributed and in the same identifier space. Consistent hashing is also necessary to let nodes join and leave the network without disrupting the network[62].

Each node has a successor and a predecessor. The successor to a node or key is the next node in the identifier circle when you move clockwise. The predecessor of a node or key is the next node in the identifier circle when you move counter-clockwise. A logical ring with positions numbered 0 to 2^m - 1 is formed among the nodes. Key k is assigned to node successor(k), which is the node whose identifier is equal to or follows the identifier of k. If there are N nodes and K keys, then each node is responsible for roughly K/N keys. When the (N+1)th node joins or leaves the network, responsibility for O(K/N) keys changes hands[62]. Each node knows the IP address of its successor. If each node knows the location of its successor, you can perform a linear search over the network for a particular key; this is a naive method for searching the network. Chord does not use the DHT for routing as other systems do. The issues that arise from this will be discussed below.
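The claim that each node is responsible for roughly K/N keys, and that a join moves only about K/N of them, can be checked with a small consistent-hashing sketch. This is an illustration only; it reuses the simple successor rule rather than Chord's finger-table lookup, and the node and key names are arbitrary.

import hashlib
from bisect import bisect_left

M = 32                       # identifier bits (Chord itself uses m = 160 with SHA-1)
RING = 2 ** M

def h(s):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % RING

def owner(nodes, key_id):
    # successor(k): first node identifier equal to or following the key, wrapping around.
    i = bisect_left(nodes, key_id)
    return nodes[i % len(nodes)]

def assignment(node_names, keys):
    nodes = sorted(h(n) for n in node_names)
    return {k: owner(nodes, h(k)) for k in keys}

if __name__ == "__main__":
    keys = ["key-%d" % i for i in range(10000)]
    before = assignment(["node-%d" % i for i in range(10)], keys)
    after = assignment(["node-%d" % i for i in range(11)], keys)   # one node joins
    moved = sum(1 for k in keys if before[k] != after[k])
    print("keys that changed owner after the join:", moved, "of", len(keys))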

2.2 Pastry

Pastry provides the following capabilities. First, each node in the Pastry network has a unique, uniform random identifier (nodeId) in a circular 128-bit identifier space. When presented with a message and a numeric 128-bit key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. The expected number of forwarding steps in the Pastry overlay network is O(log N) (similar to Chord), while the routing table maintained in each Pastry node has only O(log N) entries (where N is the number of live Pastry nodes in the overlay network). At each Pastry node along the route that a message takes, the application is notified and may perform application-specific computations related to the message[17]. Pastry uses the DHT for routing. This ensures that the system is scalable and fault tolerant. The routing metric is supplied by an external program based on the IP address of the target node, and this means that the metric can be easily switched to shortest hop count, lowest latency, highest bandwidth, or even a general combination of metrics. The hash table's keyspace is taken to be circular, like the keyspace in the Chord system, and node IDs are 128-bit unsigned integers representing position in the circular keyspace. Node IDs are chosen randomly and uniformly, so peers that are adjacent in node ID are geographically diverse[61]. The routing overlay network is formed on top of the hash table by each peer discovering and exchanging state information consisting of a list of leaf nodes, a neighborhood list, and a routing table. The leaf node list consists of the L/2 closest peers by node ID in each direction around the circle. Pastry also employs neighborhood lists and routing tables to locate the nearest peers.

26 2.3 Kademlia

Kademlia uses a "distance" calculation between two nodes. This distance is computed as the exclusive-or (XOR) of the two node IDs, taking the result as an integer number. Keys and node IDs have the same format and length, so distance can be calculated between them in exactly the same way.

The node ID is typically a large random number that is chosen with the goal of being unique for a particular node. It can and does happen that nodes from Germany and Australia are "neighbors"; they have chosen similar random node IDs[51]. Kademlia uses exclusive-OR because it shares properties with the geometric distance formula, including that the distance between a node and itself is zero, and the symmetric property, i.e. the "distance" calculated from A to B and from B to A are the same. These properties provide most of the important behaviors of measuring a real distance with a significantly lower amount of computation. Each Kademlia search iteration comes one bit closer to the target. A basic Kademlia network with 2^n nodes will only take n steps (in the worst case) to find a node. Kademlia routing tables consist of a list for each bit of the node ID (e.g. if a node ID consists of 128 bits, a node will keep 128 such lists). A list has many entries. In Kademlia jargon, these lists are called buckets. Every entry in a list holds the necessary data to locate another node. The data in each list entry is typically the IP address, port, and node ID of another node. Every list corresponds to a specific distance from the node. Nodes that can go in the nth list must have a differing nth bit from the node's ID; the first n-1 bits of the candidate ID must match those of the node's ID. This means that it is very easy to fill the first list, as 1/2 of the nodes in the network are far-away candidates. The next list can use only 1/4 of the nodes in the network (one bit closer than the first), etc.[14]. With an ID of 128 bits, every node in the network will classify other nodes into one of 128 different distances, one specific distance per bit. As nodes are encountered on the network, they are added to the lists. This includes store and retrieval operations and even helping other nodes to find a key. Every node encountered will be considered for inclusion in the lists. Therefore the knowledge that a node has of the network is very dynamic. This keeps the network constantly updated and adds resilience to failures or attacks.
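The XOR metric and the per-bit bucket lists can be expressed in a few lines. The sketch below illustrates only those two rules, not Kademlia's full lookup procedure, and it assumes the two identifiers are distinct.

import random

ID_BITS = 128

def xor_distance(a, b):
    # Kademlia distance: bitwise XOR of the two identifiers, read as an integer.
    # It is zero exactly when a == b and is symmetric in its arguments.
    return a ^ b

def bucket_index(own_id, other_id):
    # Position of the highest differing bit. Roughly half of all nodes differ
    # in the top bit and therefore land in the largest (most distant) bucket.
    return xor_distance(own_id, other_id).bit_length() - 1

if __name__ == "__main__":
    me = random.getrandbits(ID_BITS)
    other = random.getrandbits(ID_BITS)
    print(xor_distance(me, other), bucket_index(me, other))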

27 2.4 Brunet (Based on Symphony)

Brunet is a robust DHT library that is based on the Symphony[50] DHT peer-to-peer algorithm. Like Chord, the identifier space in Symphony is constructed as a ring with a perimeter of one unit. When a node joins the ring, it randomly chooses a real number in the range [0,1). This node will handle the identifier range between this id and its immediate predecessor's id. In addition, each node maintains a pointer to its successor and predecessor on the ring. However, the finger table of Chord is replaced by a constant (configurable) number of long-distance links. These links are chosen randomly according to harmonic distributions (hence the name Symphony), which favor large distances in the identifier space in systems with few nodes, and smaller distances as the system grows (a small sketch of this link-selection rule appears after the feature list below). Brunet[26] can therefore handle several problems that exist today in networking systems. Network transports are abstracted as Edge objects, and TCP, UDP and TLS/SSL can be used as modes of transport. The UDP NAT traversal is completely distributed. Brunet also implements a ring-type topology for routing that is similar to Chord. Brunet also has a DHT implementation and hence deals with fault tolerance and quick lookups for stored information, which could include node location. Brunet also supports packet-based and RPC-based communication primitives. Finally, it also employs a distributed tunneling system to maintain topology in the presence of some routing difficulties (untraversable NATs, BGP outages, firewalls, etc.). IPOP, or Internet Protocol over Peer-to-Peer [42], is tunneling software that uses a peer-to-peer architecture rather than a client-server architecture. IPOP is built on Brunet [26], a software library for P2P networking written in C# and developed using Mono. Its key features include:

1. Network transports are abstracted as Edge objects, thus enabling TCP, UDP and TLS/SSL transports,

2. Completely distributed UDP NAT traversal,

3. Implementation of Chord/Kleinberg/Symphony type ring topology for routing,

4. A complete DHT implementation and

5. Distributed tunneling system to maintain topology in the presence of some routing difficulties (untraversable NATs, BGP outages, firewalls, etc.).
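The harmonic long-link selection mentioned in the description of Symphony above can be sketched as follows. This is an illustration of the published Symphony rule, not Brunet's actual C# link-management code, and the estimated network size used in the example is arbitrary.

import random

def harmonic_distance(n_estimate):
    # Draw a ring distance x in [1/n, 1) with density proportional to 1/(x ln n),
    # by setting x = n^(u - 1) for u uniform in [0, 1).
    u = random.random()
    return n_estimate ** (u - 1.0)

def choose_long_link(own_position, n_estimate):
    # Target identifier for one long-distance link, measured clockwise on the
    # unit-perimeter ring from this node's position.
    return (own_position + harmonic_distance(n_estimate)) % 1.0

if __name__ == "__main__":
    me = random.random()                              # position on the unit ring
    links = [choose_long_link(me, n_estimate=1000) for _ in range(4)]
    print(links)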

IPOP [42] is a self-configuring IP-over-P2P virtual network overlay which provides the capability for nodes behind NATs and some firewalls to all appear to be in the same subnet. Besides being transparent to applications, IPOP was deployed in the VOC [53] model for the reasons explained in Section 1.8.

2.5 Work Related to VOCs

Virtualization was discussed in detail in Section 1.4 [37]. Virtualization of grid systems was first proposed in [38] and provides much-required abstraction and usability to grid users and system administrators. In a virtualized environment, access can be granted according to the administrator's needs, which reduces the potential harm that can be done to the hardware. Hardware multiplexing is also possible; the same hardware can be used to host different kinds of virtual machines [38]. Higher-level system components may also be virtualized; examples include virtual networking systems such as ViNe [65], VNET [63], VDE [33], and IPOP [42]. Globus Virtual Workspaces, implemented as part of the Globus Nimbus toolkit, provide a lease-oriented mechanism for sharing resources on grid systems with fine-grained control over resource allocation. A Virtual Workspace is allocated from a description of the hardware and software requirements of an application, allowing the workspace to be instantiated and deployed on a per-application basis. Another lease-oriented mechanism for grid system virtualization is Shirako [46], which allows resource providers and consumers to negotiate resource allocations through automated brokers. Back-end cluster provisioning is provided by Cluster-On-Demand (COD) [29], an automated system that performs rapid re-installation of physical machines or Xen virtual machines [45]. Dynamic Virtual Clustering allows clusters of virtual machines to be instantiated on a per-job basis for the purpose of providing temporary, uniform execution environments across clusters co-located on a single research campus. These clusters comprise a Campus Area Grid (CAG), which is defined as "a group of clusters in a small geographic area ... connected by a private, high-speed network" [35]. Virtual Organization Clusters (discussed in detail in Section 1.9) [53] differ from explicit leasing systems such as Globus Nimbus and Shirako in that virtual clusters are leased autonomically through the use of pilot jobs, which provide for dynamic provisioning in response to increasing workloads for the associated VO [54]. VOCs remain transparent to end users and to non-participating entities, while enabling participating VOs to use overlay networks, such as IPOP [42], to create virtual environments that span multiple physical domains. Unlike Campus Area Grid and VioCluster environments, however, these physical domains are grid sites connected via low-bandwidth, high-latency networks. These unfavorable connections, coupled with overheads introduced by the addition of virtualization systems, make VOCs better suited to high-throughput, compute-bound applications than to high-performance applications with latency-sensitive communications requirements [36].

Chapter 3

Proposed System

3.1 Overlay Networking Details

Virtual networking systems that enable VOCs to span multiple resources must be able to:

1. Support an ad-hoc network, i. e. support the dynamic addition and removal of nodes based on workload,

2. Load balance between multiple servers efficiently and without user involvement,

3. Assign IP addresses dynamically,

4. Route traffic efficiently between machines in the pool,

5. Have the capability to traverse NAT and firewalls.

IPOP, or Internet Protocol over Peer-to-Peer, was discussed in detail in Section 1.8. As mentioned in Chapter 1, modern research and education require compute resources that can no longer be confined to a definite, pre-determined size. While the number of supercomputing sites has increased tenfold in the last two decades [19], access to these resources has been limited by various factors including availability, cost and politics. Domain scientists desire as many resources as they can acquire so that their algorithms run as efficiently as possible. They are not concerned about how their jobs are scheduled, what machines their jobs run on or how the resources are allocated. If these scientists and other users had to be concerned with the design and administration of all the resources they required, their research, and subsequently their use of these systems, would be greatly hampered.

Figure 3.1: IPOP as a STUN server: The three nodes A, B and C have IP addresses on network interface eth0, but they cannot communicate with each other directly as they are all behind NATs. They can, however, communicate with other machines on the Internet, including the IPOP bootstrap node that is shown. This bootstrap node functions as a STUN server. It allocates addresses to the three nodes A, B and C on the virtual network interface tap0 (tap device). Therefore, all three nodes can communicate effectively with each other.

The main objective of researching and using a DHT-based peer-to-peer networking system lies in the usability of the system and the efficiency of acquiring (or removing) additional resources. These resources could be storage, computing power (additional nodes, powerful workstations) or even nodes used to improve routing. These aspects will be examined in detail. Traditional client-server architectures that are not built on top of overlays have difficulties with load balancing, cannot traverse NATs (and therefore cannot be used effectively in modern WAN environments) and offer poor fault tolerance.

The overlay system adopted here was based on Brunet; it is therefore a DHT-based P2P algorithm that uses an overlay network. The tunneling and routing system developed on top of Brunet is IPOP (discussed in detail in Section 1.8). The IPOP system is set up around two kinds of nodes: bootstrap nodes and IPOP nodes (also known as P2P nodes). If two nodes are behind separate firewalls or behind a NAT [18], they cannot communicate directly. NAT (Network Address Translation) was introduced as the IANA (Internet Assigned Numbers Authority) was running out of 32-bit IP addresses to allocate to users of the Internet [18]. In order for two nodes behind a NAT to communicate, they must use a STUN [20] server or an intermediate machine that has access to the subnets that both nodes belong to. The following steps delineate how overlay communication is set up in the VOC model for two nodes behind a firewall or NAT:

1. IPOP uses one or more bootstrap nodes as a backbone for communication. These nodes must have publicly routable IP addresses to be able to serve effectively as STUN servers and intermediary machines. The initial routing and access to the nodes inside the overlay are facilitated by the bootstrap nodes. These nodes also act as DHCP servers and assign addresses to the nodes that connect to them.

2. The IPOP or P2P nodes that need to communicate must be aware of the IP addresses of the bootstrap nodes. The bootstrap nodes are set up with the help of the IPOP Bootstrap software/daemon.

3. Before continuing, the IPOP namespace, DHCP address range, subnet mask and certificate information must be inserted into the DHT. This DHT is distributed among the bootstraps and later among the nodes once they join the overlay.

4. TAP devices are used as the means of communication. A TAP (network tap) device simulates an Ethernet device and operates on layer 2 packets such as Ethernet frames. TAP devices are used to create a network bridge (a minimal example of creating such a device follows this list).

5. Once the TAP device is set up on the node, the IPOP Client software/daemon proceeds to contact the DHCP server on the bootstrap to get an IP address. All IPOP nodes use a namespace as a unique group identifier to communicate with the nodes that belong to the same private group. This will be explained further in the next section.

6. Once the IP address is acquired by the node, the routing is complete. The neighboring nodes are notified of the new node's address and unique identifier. The DHT is updated and becomes more distributed as the number of nodes increases.
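
Step 4 above relies on a TAP device on each node. The standard iproute2 commands below, wrapped in Python for consistency with the other sketches in this chapter, show one way such a device could be created and addressed by hand. The device name tapipop and the address are placeholders drawn from the configuration figures in the next section; in the actual deployment the IPOP daemon creates and configures the device itself.

```python
import subprocess

def run(cmd: str) -> None:
    """Run a shell command and fail loudly if it does not succeed."""
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

# Create a layer-2 TAP device (requires root privileges).
run("ip tuntap add dev tapipop mode tap")

# Assign a virtual (overlay) address; 5.130.127.10/24 is a placeholder from the
# DHCP range in Figure 3.2, normally handed out by the bootstrap's DHCP service.
run("ip addr add 5.130.127.10/24 dev tapipop")

# Bring the interface up so the IPOP daemon can route overlay traffic onto it.
run("ip link set dev tapipop up")
```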

3.2 IPOP Bootstraps and Namespaces in Clemson

Figure 3.2: IPOP DHCP Configuration File, specifying the namespace cirg_cpsc362, the address range 5.130.127.0 with netmask 255.255.255.0, and a lease period of 7200 seconds.

Figure 3.4: IPOP Bootstrap Configuration File, specifying the realm ufl_test0_15 and the Brunet remote transport endpoints (brunet.tcp:// and brunet.udp:// URIs such as 131.114.53.188:49301, 195.116.60.97:49301, and 192.16.125.11:49301).

Figure 3.3: IPOP Namespace Configuration File, binding the namespace cirg_cpsc362 to the virtual device tapipop.

Figure 3.5: IPOP Node Configuration File, specifying (like the bootstrap configuration) the realm ufl_test0_15 and the Brunet transport endpoints to contact (131.114.53.188:49301 and 195.116.60.97:49301).

IPOP facilitates the use of three kinds of bootstrap infrastructure:

• Local Bootstraps: These are a set of publicly routable bootstrap nodes that could be behind an institutional firewall. Such an infrastructure is recommended when all IPOP clients are confirmed to be behind the same firewall.

• Public Bootstraps: These are a set of publicly routable bootstrap nodes that are not behind any firewall and can be used to interconnect IPOP nodes from different geographic locations and network domains.

• Combo Bootstraps: This infrastructure combines a set of local and public bootstraps that can be accessed from inside or outside a firewall. Such a setup is used when a larger, multi-institutional IPOP client/node environment is required.

All of the above have been used to varying extents, and a detailed description of their setup is found in Chapter 4. The IPOP namespaces are group identifiers that are similar to the broadcast address in a network, but with some key differences. These namespaces are used to support the Virtual Networks that form the overlay. IPOP supports multiple Virtual Networks sharing the same overlay as follows:

• Each virtual IP is associated with a namespace, e.g., IPOP views an IP in the form Namespace:IP. This is transparent to the user and to the virtual network interface - each IPOP router is configured to be bound to a single IP namespace.

• Dynamic IP-to-P2P address allocation and resolution through the use of a decentralized data store, i.e., the DHT mentioned above

The setup in Clemson involved separate namespaces that were used in different virtual environments, as discussed in Chapter 4. Each namespace had a different set of DHCP address ranges and clients, but they could all be set up to use the same set of bootstrap nodes. This meant that the same group of IPOP bootstrap nodes could be used to support a large number of virtual networks, while nodes on different networks could not interfere with nodes in other networks.
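
The Namespace:IP convention described above can be pictured as a key prefix in the shared DHT: each virtual network's address records are stored under keys that begin with its namespace, so records from different namespaces never collide even though they live in the same overlay. The toy dictionary-backed store below only illustrates that keying scheme; it is not IPOP's actual data layout, and the P2P addresses shown are invented.

```python
from typing import Dict, Optional

# Toy "DHT" illustrating namespace-scoped address records.
dht: Dict[str, str] = {}

def dht_key(namespace: str, virtual_ip: str) -> str:
    """IPOP-style scoping: the namespace prefixes the virtual IP."""
    return f"{namespace}:{virtual_ip}"

def register(namespace: str, virtual_ip: str, p2p_address: str) -> None:
    dht[dht_key(namespace, virtual_ip)] = p2p_address

def resolve(namespace: str, virtual_ip: str) -> Optional[str]:
    return dht.get(dht_key(namespace, virtual_ip))

# Two virtual networks can reuse the same IP range without interfering.
register("cirg_cpsc362", "5.130.127.10", "brunet:node:ABC123")
register("ufl_test0_15", "5.130.127.10", "brunet:node:XYZ789")
print(resolve("cirg_cpsc362", "5.130.127.10"))  # brunet:node:ABC123
print(resolve("ufl_test0_15", "5.130.127.10"))  # brunet:node:XYZ789
```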

3.3 Multi-site communication and Watchdog

Condor[64] was used to perform the VM Scheduling on-site as well as the job scheduling among the sites. Condor is able to handle the queuing mechanism, priority schemes as well as scheduling policies. Site policies can vary and the VOC model has the ability to work around different policies. In order to perform jobs using resources across different grid sites, Globus[40] was used. Globus has four main components:

• Authentication - GSI(Grid Security Infrastructure)

• Resource management - GRAM(Grid Resource Allocation and Management)

• Data transfer - GridFTP (Grid File Transfer Protocol)

• Resource discovery and monitoring - MDS(Monitoring and Discovery Service)

Delving further into the 'Resource Management' phase, we can understand how two entities that belong to different grid domains can share resources with each other. GRAM uses a daemon that runs on the Compute Element or Head node of every cluster (grid site). This daemon, which runs on grid resources, is called a Gatekeeper and has the following features:

• Processes incoming Globus requests

• Authenticates users; the gatekeeper is configured to trust a given CA

• Maps the user to a local account, i.e., maps the certificate DN to a local username via the grid-mapfile (illustrated after this list)

• Passes the job on to the jobmanager
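
The DN-to-account mapping mentioned above is kept in a plain-text grid-mapfile, where each line pairs a quoted certificate DN with a local username. The short lookup below is only a sketch of that mapping; the example DNs and the account name are fabricated for illustration.

```python
from typing import Optional

# Example grid-mapfile contents (each line: "<certificate DN>" <local account>).
GRID_MAPFILE = '''
"/DC=org/DC=doegrids/OU=People/CN=Jane Researcher 12345" engage
"/DC=org/DC=doegrids/OU=People/CN=John Scientist 67890" engage
'''

def map_dn_to_account(dn: str, mapfile_text: str = GRID_MAPFILE) -> Optional[str]:
    """Return the local account for a certificate DN, or None if unmapped."""
    for line in mapfile_text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        quoted_dn, _, account = line.rpartition('" ')
        if quoted_dn.lstrip('"') == dn:
            return account.strip()
    return None

print(map_dn_to_account("/DC=org/DC=doegrids/OU=People/CN=Jane Researcher 12345"))
```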

In the last stage of resource management, Globus hands the job over to the jobmanager (in our case Condor) through the Gatekeeper. But when should this be done? Which sites should be contacted? What resources should be requested? How long should they be used? What happens if an allocated resource fails? To answer these questions and more, a software daemon called a Watchdog was developed. The Watchdog performs the following:

1. Monitors job queue for running jobs and queued jobs

2. Based on VMs to Jobs ratio, computes the requirement for booting or shutting down allocated nodes.

3. Communicates with the Gatekeepers deployed on the sites that share resources with one another. (Ideally, only one watchdog is required, at the head node of the user's VO; this is discussed in Chapter 4.)

A greedy algorithm was used for most of the experiments and testing, mainly because other self-learning algorithms did not offer much improvement given the scheduling intervals that Condor uses. The main steps of this greedy algorithm are described below, followed by a minimal sketch of the control loop:

• Invoke PROBE_STATE, a module to determine the number of jobs in the scheduler's queue,

• Examine the jobs-to-VMs ratio and enter the appropriate state, described below,

• Take an action based on the current state,

• Wait for a given interval (10 seconds in the prototype).
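
The sketch below captures this control loop. The helper functions probe_state, start_vm, and stop_idle_vm are stand-ins for the actual modules (their real counterparts talk to Condor and Globus) and are assumptions made purely for illustration.

```python
import time

POLL_INTERVAL = 10          # seconds, as in the prototype
SITE_CAPACITY = 16          # local VMs dedicated to the VO in this work

def probe_state():
    """Stand-in for PROBE_STATE: return (queued_jobs, running_vms),
    e.g. by parsing condor_q and condor_status output."""
    raise NotImplementedError

def start_vm():
    """Stand-in: submit a pilot job through Globus to boot one VM."""
    raise NotImplementedError

def stop_idle_vm():
    """Stand-in: terminate one idle VM."""
    raise NotImplementedError

def watchdog_loop():
    while True:
        jobs, vms = probe_state()
        if jobs > vms and vms < SITE_CAPACITY:
            start_vm()          # ratio above 1: grow the pool
        elif jobs < vms and vms > 0:
            stop_idle_vm()      # ratio below 1: shrink the pool
        time.sleep(POLL_INTERVAL)
```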

3.4 Condor IPOP Pool - Local/Single Site

The Engage VO gets job submissions of varying sizes that require compute resources for varying amounts of time. This implies that there may be certain jobs that require a single node and other jobs that require all the available nodes (in this case 16 VMs; the setup of the physical cluster is explained in detail in Chapter 4). This is where the Watchdog daemon starts to play out its role. The Watchdog, as mentioned before, resides on the Head Node, which also hosts the Gatekeeper. The Condor Central Manager resides on the Head node too and is used to schedule incoming jobs on this local site. A tap device has been set up on this Central Manager so that it can acquire an IPOP address from the bootstrap (infrastructure) that it has been connected to. Once a job has been submitted to the Engage VO, it will appear in the job queue of the Central Manager. The Watchdog, which uses a greedy algorithm, monitors the queue and the jobs-to-VMs ratio. If this ratio exceeds 1 for a period of 10-20 seconds (this interval can be modified), a job could not be scheduled by the job manager on any of the presently running nodes. The watchdog then proceeds to start a new node, monitoring the jobs-to-VMs ratio to ensure the maximum site capacity is not exceeded. The watchdog also has the capability to shut down nodes if the ratio falls below 1 for a certain period of time. A single IPOP bootstrap is able to handle a thousand P2P nodes, although we have been able to test with only a hundred. In this case, we had a single bootstrap node which had a public

Figure 3.6: The Single Site Setup. This is how a VO member site would look with the overlay set up. One or more local IPOP bootstraps would be present; in this experiment, a single bootstrap with a public IP address was used. Another machine (virtual or physical) set up outside the cluster would be the Condor Central Manager. This node would also be an IPOP node and hence communicate with other IPOP nodes through the TAP device. The Watchdog daemon sends pilot jobs through Globus to start and stop VMs on the cluster based on work-loads. The machines appear and disappear in the scheduler's pool, which is accessible to the user who needs to monitor the state of his job(s) on the VO.

Figure 3.7: VOC over a single site

IP address. The DHCP address range, subnet, lease period and namespace need to be specified in a configuration file before the bootstrap can be started. This bootstrap was a dedicated workstation, even though a dedicated machine is not a requirement. As mentioned in Section 3.2, a local, public or combo bootstrap infrastructure can be used; but since a single site employs only local nodes, a local IPOP bootstrap infrastructure (possibly with more than one bootstrap) may be all that is required. As mentioned above, the Central Manager is connected to the bootstrap and hence has an IPOP IP address on the virtual tap device. The watchdog starts a new VM using standard KVM/QEMU command-line tools, and each new node acquires an IPOP IP address from the DHCP server running on the bootstrap; it is hence able to join the running local Condor pool. The new nodes are brought up in snapshot mode (a sketch of such a launch appears after the list below). This removes the need to stage the multi-gigabyte VM image to every node, and is done efficiently using a distributed file system such as PVFS (explained in Section 3.5). The watchdog monitors the jobs-to-VMs ratio, starting and shutting down nodes. As jobs get executed, the machines start leaving the pool. This, therefore, is how dynamic resizing of resources is done to:

1. Ensure constant availability of compute resources

Figure 3.8: VOC over Multiple Grid Sites using an Overlay Network

2. Improve System utilization

3. Improve execution time and efficiency of algorithms

4. Conserve energy by using resources only when they are required

5. Provide load balancing and hence fault tolerance
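
The snapshot-mode boot mentioned above can be approximated with a plain KVM/QEMU invocation such as the one wrapped below. The image path, memory size, and MAC address are placeholders, and the exact options used in the prototype may have differed; this is a sketch of the technique, not the deployed script.

```python
import subprocess

def boot_snapshot_vm(image_path: str, mac: str, memory_mb: int = 1024):
    """Boot a compute-node VM in snapshot mode: the shared, read-only image is
    never modified; writes go to a temporary file on the local disk."""
    cmd = [
        "qemu-kvm",
        "-snapshot",                 # non-persistent writes, image stays read-only
        "-m", str(memory_mb),
        "-hda", image_path,          # image served from the shared PVFS store
        "-net", f"nic,macaddr={mac}",
        "-net", "tap",               # attach to a host TAP/bridge for networking
        "-nographic",
    ]
    return subprocess.Popen(cmd)

# Hypothetical invocation on a compute node:
# boot_snapshot_vm("/mnt/pvfs2/images/voc-centos52.img", "52:54:00:12:34:56")
```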

3.5 Condor IPOP Pool - Multiple Grid Sites

A few key questions need to be answered when resource sharing across multiple grid sites is to be accomplished.

1. How can nodes belonging to different grid administrative sites communicate effectively and securely?

2. How can the multi-gigabyte node/VM image be securely and efficiently transferred to multiple sites? Once transferred, how can this image be used across the physical infrastructure of individual sites?

3. How can firewalls or NATs be traversed without user or administrative annoyance? (This annoyance could be in terms of changes that need to be made or, more importantly, security.)

The first two questions can be solved partially using Globus. As mentioned in Section 3.3, in the resource management phase (GRAM), Globus uses a daemon called the Gatekeeper that is deployed on the Head node of every site. The Watchdog (designed in the VOC model and explained in Section 3.4) communicates between these sites using the Gatekeeper. A Virtual Organization can have multiple grid sites as participating members. These members allocate time and resources to different job submissions from the VO, and all the members of a VO have been securely authorized for multi-grid allocations. It is also possible, through configuration modifications, for sites belonging to different VOs to share resources. Therefore, with the use of the VOC model incorporating a P2P-based overlay network (as described in Chapter 4), a huge number of compute resources (in the tens of thousands, perhaps) can be made available to a single VO user, with efficient load balancing between sites. Only one VM/node image is required per site. Using GridFTP (a part of Globus), the image can be transferred to either the Head node or another storage server at the site. Once there, a distributed file system can be used to serve the image; the file system used here is PVFS [28], chosen for its massive scalability and ease of administration. Therefore, when a Watchdog deployed at one site contacts the gatekeeper at another, that watchdog has the authority to start up new nodes in snapshot mode. This is much more efficient in time, space and management than having to move the multi-gigabyte image to the disk of every machine. Finally, question 3 can be answered with the help of an overlay networking system such as the one used here. IPOP was explained in Section 1.8 and has the capability of NAT and firewall traversal, established with the help of bootstrap nodes and IP tunneling. All the information that is transferred within an IPOP network is encrypted and hence secure. By using this system, administrators do not have the hard task of assigning IP addresses, which can be difficult when a large number of machines in different sites and geographical locations are involved. Administrators do not have to be concerned with resource allocation either, as jobs get scheduled using Condor. The Watchdog also takes care of resizing the machine pool, so starting and shutting down nodes is handled dynamically. Finally, domain scientists and other end users do not need to learn new protocols or software in order to run their algorithms with the maximum amount of resources that can legally be acquired.
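
A sketch of the one-time image staging step is shown below, using the standard GridFTP client globus-url-copy (part of the Globus toolkit). The host names and paths are placeholders; the actual sites and storage locations would be those negotiated between the VO and its member sites.

```python
import subprocess

def stage_image_to_site(local_image: str, remote_site: str, remote_path: str) -> None:
    """Copy the VM image once per site over GridFTP; compute nodes then boot it
    in snapshot mode from the site's shared (e.g. PVFS) file system."""
    subprocess.run(
        [
            "globus-url-copy",
            f"file://{local_image}",
            f"gsiftp://{remote_site}{remote_path}",
        ],
        check=True,
    )

# Hypothetical transfer to a participating site's storage server:
# stage_image_to_site("/var/images/voc-centos52.img",
#                     "ce.example-site.edu", "/mnt/pvfs2/images/voc-centos52.img")
```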

Chapter 4

Experiments & Results

4.1 Building the Physical and Virtual Clusters

Clemson University's Cyber Infrastructure Research Group is an Open Science Grid (OSG) production site. More information on OSG and its operations can be found in [57] and [59]. OSG supports a number of Virtual Organizations such as Engage and Nanohub, amongst many others. In OSG, a Virtual Organization (VO) refers primarily to a collection of people (members of the VO) and also includes the group's computing/storage resources and services (e.g., databases). The terms site, computing element (CE), and/or storage element (SE) are often used to refer to the resources owned and operated by a VO [22]. In order to efficiently test performance and overheads, the primary physical cluster was built as an OSG site that supports multiple VOs. This ensured that real jobs submitted by OSG users, as well as synthetic workloads, were used in the testing of the VOC model over

IPOP. The Physical Cluster included one Head Node and 16 Compute Nodes. The physical configuration of all the machines was similar: dual-core Intel Xeon 3070 CPUs with 4 binary gigabytes (GiB) of Random Access Memory (RAM). All the machines had the Kernel-based Virtual Machine (KVM) installed for virtualization. The primary operating system used was the Red Hat based CentOS 5.2, with PVFS as the distributed file system (as described earlier). The Head node had Globus installed and, as mentioned in Section 3.3, the gatekeeper kept track of all incoming communications, authorizing jobs and users based on VO specification. The Engage VO was used primarily for all tests. A single core on each compute node was dedicated to the Engage VO; thus a total of 16 VMs were allotted for the VO. When a scientist (user) submits a job, the watchdog monitors the job queue on the head node (which is also a DHCP server). As the need requires, the watchdog starts a VM, which then acquires an IP address. The job gets scheduled, executes on a node, and the VM shuts down once the job is completed. Please note that at this stage there is no overlay network; the DHCP addresses, leases, and network card MAC addresses have to be hard coded, which is not scalable beyond the site. The Virtual Cluster included resources that were either lease-based (such as Amazon EC2 [24]) or not lease-based (such as resources from other VO members). As mentioned in Section 3.1, VOCs do not adhere to any particular networking standard, protocol or architecture, even though certain core requirements need to be satisfied to ensure that the definition of the model is not lost (these were discussed in Section 3.1). IPOP was chosen because its DHT-based P2P architecture has high scalability, ease of administration and fault tolerance. The basic on-site virtual cluster was built on top of the physical cluster explained above. The Engage VO was allotted 16 virtual machines as compute resources. In the case of the virtual cluster, however, the networking administration is handled by the IPOP bootstrap and not the cluster head node as in the previous case. The watchdog was modified to contact the gatekeepers of other participating member sites (or Amazon instances in the case of EC2). A common image that was pre-configured with the necessary applications and configuration files to communicate with the IPOP bootstrap was transferred to the other site(s). As the machines were booted (based on workload), they joined the Condor-IPOP pool, forming a Virtual Cluster. This cluster has the capability of scaling up to large numbers and dynamically resizing based on the workload.

4.2 Local Testing

4.2.1 Experiments

The overhead of adding virtualization to grid systems has previously been evaluated in [36]. The overhead of using the IPOP network overlay has been independently studied in [42] and the effect of using both a grid system and an overlay on the top of the Virtual Organization Cluster[53] model has been published in [52]. Since the overhead of virtualization would not primarily affect job service time in a grid system with sufficient physical resources to handle all jobs concurrently, a chief concern was the amount of latency that might be added by the overlay scheduling system. In order to ensure that this overlay overhead would not have unexpected detrimental impacts on compute-bound

jobs running within the virtual machines, tests were conducted using an Open Science Grid (OSG) [59] site configured to support virtualization. The OSG grid site was configured with 16 dual-core compute nodes, each with an Intel Xeon 3070 CPU and 4 binary gigabytes (GiB) of Random Access Memory (RAM), with the Kernel-based Virtual Machine (KVM) [60] hypervisor running within a 64-bit installation of CentOS 5.2. Virtual machine images were configured with 32-bit CentOS 5.2 and located on a shared Parallel Virtual File System (PVFS) [28] store. Virtual machine instances were booted directly from the image located on the shared file system, without the need to stage the image to the local compute nodes. This was accomplished using the "snapshot" mode of KVM (as explained earlier). These shared images were made available in a read-only configuration, with non-persistent writes redirected to a temporary file on local disk storage at each compute node. Internet connectivity for the test site was provided by an edge router using Network Address Translation (NAT), with the physical compute nodes isolated in a private IPv4 subnetwork. OSG connectivity was provided through the standard OSG software stack including Globus [40]. A VOC head node was constructed using a virtual machine hosted by an off-site laboratory workstation. Condor [64] was installed on the head node to serve as a scheduler, and synthetic workload jobs were submitted directly to the VOC local pool. A watchdog daemon process, using the naive greedy watchdog algorithm with added support to maintain a minimum number of running VMs at all times, was run on the same head node. This watchdog created pilot jobs, which were submitted through a Globus client to the OSG test site, in response to the arrival of synthetic workload jobs in the VOC Condor queue. The pilot jobs started virtual compute nodes, which joined an IPOP network anchored at the workstation by contacting its bootstrap service. Once connected to the private IPOP network, the virtual compute nodes joined the Condor pool created by the collector on the VOC head node (the workstation-hosted virtual machine, also joined via IPOP), and Condor scheduled and executed the actual test jobs. Whenever the watchdog daemon determined that an excess number of pilot jobs were running in comparison to the size of the Condor queue, the pilot jobs were instructed to terminate, causing the virtual machines to be terminated. To measure the relative performance difference of using a VOC with overlay scheduling and IPOP overlay networking, two sets of tests were conducted. A synthetic workload consisting of a 10-minute sleep procedure was devised, in order to approximate compute-bound jobs without incurring potential variations in service times that could result from running an actual compute-bound job within a virtual machine. In the control trials, a batch of 50 sleep jobs was submitted directly to the local scheduler on the physical grid site head node.

Figure 4.1: Short Jobs without the Overlay Network

Figure 4.2: Short Jobs with the Overlay Network

For the experiment trials, the same batch of 50 jobs was submitted directly to the Condor central manager running within the VOC. Total makespan times were collected for both sets of trials, and each trial was repeated 10 times to reduce the effects of random variation in observed makespan lengths. Descriptive and relative statistics were computed for the makespan times. Throughput measures in jobs per second were also computed.
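
The synthetic workload can be reproduced with a very small Condor submit description; the sketch below writes such a description and hands it to condor_submit. The file name and log paths are placeholders, and the actual submit files used in the trials are not preserved here.

```python
import subprocess

SUBMIT_DESCRIPTION = """\
universe   = vanilla
executable = /bin/sleep
arguments  = 600
log        = sleep_batch.log
output     = sleep_$(Process).out
error      = sleep_$(Process).err
queue 50
"""

def submit_sleep_batch(path: str = "sleep_batch.sub") -> None:
    """Write a 50-job batch of 10-minute sleep jobs and submit it to Condor."""
    with open(path, "w") as f:
        f.write(SUBMIT_DESCRIPTION)
    subprocess.run(["condor_submit", path], check=True)

# submit_sleep_batch()   # run on the submit host (physical head node or VOC manager)
```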

Figure 4.3: Long Jobs without the Overlay Network

Figure 4.4: Long Jobs with the Overlay Network

Table 4.1: Observed makespan lengths and system throughputs for 10 overlay experiment trials

Test                                  Without Overlay   With Overlay    Change (Absolute)   Change (Relative)
Minimum Makespan (s)                  2706              2714            8.000               0.2956 %
Median Makespan (s)                   2709              2720            11.00               0.4061 %
Maximum Makespan (s)                  2710              2737            27.00               0.9963 %
Mean Makespan (s)                     2708              2722            13.50               0.4985 %
Makespan Standard Deviation (s)       1.414             7.735           6.321               447.0 %
Minimum Throughput (Jobs · s^-1)      1.845 × 10^-2     1.827 × 10^-2   -1.820 × 10^-4      -0.9865 %
Median Throughput (Jobs · s^-1)       1.846 × 10^-2     1.839 × 10^-2   -7.467 × 10^-5      -0.4045 %
Maximum Throughput (Jobs · s^-1)      1.848 × 10^-2     1.842 × 10^-2   -5.447 × 10^-5      -0.2948 %
Mean Throughput (Jobs · s^-1)         1.846 × 10^-2     1.837 × 10^-2   -9.000 × 10^-5      -0.4954 %
Throughput Standard Deviation         9.644 × 10^-6     5.207 × 10^-5   4.243 × 10^-5       439.9 %

4.2.2 Results

Results of the trials, summarized in Table 4.1, indicated a slight increase in average makespan time (less than one half of one percent) for jobs submitted through the overlay scheduling system, compared to jobs submitted directly to the physical cluster scheduler. This increased makespan length corresponded to a similarly small decrease in job throughput resulting from the addition of the overlay. In the worst case observed in all trials, the maximum makespan and minimum throughput were affected by less than one percent. Variation in makespan and throughput observations between trials increased substantially, by over 400%, when the overlay scheduler and network were added, likely due to the addition of a second layer of interval scheduling with the additional Condor pool overlaid on top of the physical Condor pool. Plotted traces of mean observations (Figures 4.3 and 4.4) further confirmed the minimal overhead of the VOC overlay system.
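
The relative figures in Table 4.1 are simply the absolute changes expressed as a percentage of the no-overlay baseline; the two-line check below reproduces the mean-makespan row of the table.

```python
def relative_change(baseline: float, absolute_change: float) -> float:
    """Percentage change relative to the no-overlay baseline."""
    return absolute_change / baseline * 100.0

# Mean makespan row of Table 4.1: 2708 s without the overlay, +13.50 s with it.
print(f"{relative_change(2708.0, 13.50):.4f} %")   # 0.4985 %, under one half of one percent
```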

Figure 4.5: Amazon EC2 Primary Distributed Architecture [2]

4.3 VOC and the Amazon Elastic Compute Cloud (EC2)

4.3.1 Experiments

This section discusses a prototype implementation of a system for dynamically provisioning nodes across multiple grid sites. As illustrated in Figure 3.8, the user submits jobs to the VO's Grid Compute Element (CE). The user does not need to be concerned about how the resources he requires are obtained and allocated. The dynamic provisioning (resizing of the VOC) is performed by a watchdog that is set up on the VOC head node. In the current implementation, the watchdog uses a simple greedy algorithm. The watchdog monitors incoming jobs on the gatekeeper. When a job enters the local queue, the watchdog determines whether or not it should start a Virtual Machine (VM) on one of the physical hosts. If started, the virtual machine then joins the IPOP pool. The watchdog continues to monitor the job queue and starts VMs depending on the workload. The local scheduler (e.g. Condor [64], PBS [56], LSF [49]) schedules the jobs on the virtual machines. Once jobs are executed and there are more VMs than jobs, the watchdog terminates idle VMs. Figure 3.8 also shows that each VO has one watchdog. Once all the local compute resources

Figure 4.6: Amazon provides a web UI as well as a command-line interface (not in figure) for managing instances

Figure 4.7: Amazon Console Management Instances page displaying the IPOP images that were used for experiments

are exhausted, the watchdog contacts the gatekeeper on another participating grid site. It then starts up VMs on that site, and those machines join the VO's pool as well. The IPOP bootstrap infrastructure used was that provided by PlanetLab. PlanetLab is a global research network that supports the development of new network services. More than 1,000 researchers at top academic institutions and industrial research labs have used PlanetLab to develop new technologies for distributed storage, network mapping, peer-to-peer systems, distributed hash tables, and query processing [30]. UFL [1] has deployed more than 700 nodes that form a resilient IPOP bootstrap infrastructure serving a global community of scientific researchers and other users. Experiments were conducted with a lease-based system called the Amazon Elastic Compute Cloud (EC2) [24]. The local (on-site) compute resource consisted of 16 VMs dedicated to the Engage VO [11]: 16 dual-core machines with one core on each dedicated to the VO (the setup was explained in detail in Section 4.1). The implemented watchdog is able to handle both machines in the local cluster and instances on Amazon EC2. The local cluster specifications and the original implementation of the watchdog are explained in detail in [54]. The watchdog for this experiment was designed to handle the complexities of multiple grid sites, in this case the local cluster and a set of EC2 instances. The basic watchdog algorithm (designed to handle a single site) is:

1. Invoke PROBE_STATE, a module to determine the number of jobs in the scheduler's queue,

2. Examine the jobs-to-VMs ratio and enter the appropriate state, described below,

3. Take an action based on the current state,

4. Wait for a given interval (10 seconds in the prototype).

While the VOC Model supports any batch scheduler, the prototype implementation uses the Condor job manager [64]. The PROBE_STATE module is used to determine the state of the Condor queue when invoked. One of four possible states is then entered:

1. The number of jobs in the queue is GREATER than the number of VMs reporting AND the limit on the number of local VMs HAS NOT been reached.

2. The number of jobs in the queue is GREATER than the number of VMs reporting AND the limit on the number of local VMs HAS been reached.

Figure 4.8: Short Jobs executing in Multiple Grid sites over IPOP

3. The number of jobs in the queue is LESS than the number of VMs reporting AND MORE VMs than the local cluster's limit are reporting.

4. The number of jobs in the queue is LESS than the number of VMs reporting AND FEWER VMs than the local cluster's limit are reporting.

Separate modules handle each state, as sketched after the list below.

• State 1 invokes a START_LOCAL module that only starts machines on the local cluster. The

module communicates with the gatekeeper using Globus. The gatekeeper contacts the head node to start up local VMs.

• State 2 invokes a START_EC2 module used to start EC2 instances once local resources have been exhausted. The module uses the standard EC2 API command ec2-run-instances to boot an instance. Both the local and EC2 nodes are configured to join the IPOP pool and are part of a single VO. Each Amazon instance is pre-configured with the software packages that are required by the VO. The nodes are also configured to join the IPOP overlay once booted. Since Amazon bills a partial hour of usage as a full hour, a limit of 100 instances is imposed

Figure 4.9: Long Jobs executing in Multiple Grid sites over IPOP

in order to control costs. Thus an EC2_instance_counter is maintained to ensure that the limit isn’t exceeded.

• State 3 invokes a STOP_EC2 module when the number of jobs is less than the number of VMs. The STOP_EC2 module invokes another module called CHECK_IDLE. This module ensures that only the IP addresses of idle VMs are returned, guaranteeing that VMs that are scheduled and claimed will not be terminated. The system design ensures that all EC2 instances are terminated before the termination of the local VMs, ensuring that local resources are more heavily utilized than resources provided by off-site grid locations. Once all the instances are terminated, the watchdog shuts down VMs in the local cluster after ensuring that the number of jobs is less than the number of VMs.

• State 4 invokes a STOP_LOCAL module that shuts down local VMs. As in the STOP_EC2

module, the CHECK_IDLE module returns idle VMs. Using this list, the STOP_LOCAL module terminates VMs that are idle. In both the STOP_LOCAL and STOP_EC2 modules, the systems are terminated by sending the ’poweroff’ signal over secure shell.
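
A compact sketch of this four-state dispatch is given below. The module names mirror those in the text, but the bodies shown are stand-ins: the real modules issue Globus pilot jobs, call the EC2 command-line tools (e.g. ec2-run-instances), consult CHECK_IDLE, and power machines off over SSH.

```python
LOCAL_LIMIT = 16     # VMs dedicated to the VO on the local cluster
EC2_LIMIT = 100      # cap on leased instances, enforced by an internal counter

def start_local(): ...   # Globus pilot job -> gatekeeper -> boot a local VM
def start_ec2(): ...     # e.g. invoke ec2-run-instances, respecting EC2_LIMIT
def stop_ec2(): ...      # terminate idle EC2 instances first (via CHECK_IDLE)
def stop_local(): ...    # then power off idle local VMs over SSH

def dispatch(jobs: int, vms: int, local_vms: int) -> None:
    """Map one PROBE_STATE observation onto one of the four states."""
    if jobs > vms and local_vms < LOCAL_LIMIT:
        start_local()                    # State 1: grow locally first
    elif jobs > vms:
        start_ec2()                      # State 2: local limit reached, lease EC2 nodes
    elif jobs < vms and vms > LOCAL_LIMIT:
        stop_ec2()                       # State 3: shed leased resources before local ones
    elif jobs < vms:
        stop_local()                     # State 4: only local VMs remain, shrink the pool
```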

4.3.2 Results

Figure 4.8 shows the short jobs that were run: five hundred 10-second sleep jobs submitted in one batch. As seen in the figure, since the jobs are short and embarrassingly parallel, the EC2 instances and local VMs are still running for a period of time after all the jobs are complete. In Figure 4.9, three hundred 10-second sleep jobs were submitted in one batch. As these were longer jobs, the instances were terminated before all the jobs were completed because of the use of the local VMs. Therefore, the system is better suited to longer jobs due to the cost factor inherent in using the leased resources. The use of compute resources located at multiple grid sites ensures that adequate compute resources are always available to the VO and hence the end user. The overlay network allows nodes in any geographic location to be a part of the VO. To the user, it all appears as a large number of machines in the same pool.

Chapter 5

Conclusions

The use of an overlay system developed on a DHT-based P2P library proved to be scalable and easily configurable. IPOP proved to be an ideal networking medium for the VOC model, though the model does not claim it as an exclusive solution. The tests done on a single site (locally) were compared to the tests done without an overlay [54]. The overhead introduced by the overlay was negligible in comparison to that resulting from Globus and the Condor scheduler. The latency is also acceptable, as the VOC model targets compute-bound jobs. Figure 4.2 showed how the utilization decreases as the jobs terminate quickly, lending an expected delay in the termination of nodes. Figure 4.4, on the other hand, showed improved system utilization and job execution time, as the ratio of jobs to VMs at any given time remained close to 1. This also shows that the VOC model, along with the Watchdog and an overlay network, is better suited to large jobs with long execution times (if operating on a single machine or traditional grid environment). The jobs used were embarrassingly parallel and synthetic (small and large batches of 'sleep' jobs); however, this workload represents compute-bound jobs well.

The tests conducted across multiple sites showed the scalability and resilience of the system. IPOP nodes were located in different geographic locations and were still able to function efficiently as one virtual cluster. The overheads introduced here are again negligible, for the same reasons stated above. Amazon limited the total number of instances to 100. As the jobs came in, the nodes were booted up, at the peak giving a total of 116 compute nodes (16 on-site + 100 off-site). This is clearly shown for short jobs in Figure 4.8 and for long jobs in Figure 4.9. Various policies could be added at a site, which would change administration, but the system would essentially function the same.

Most industrial establishments and educational institutions avoid peer-to-peer based systems due to the immense network bandwidth that such systems require. There have also been infringement issues associated with P2P-based file-sharing utilities. But the use of P2P systems in a load-balancing environment, where the use of DHTs speeds up the lookup of information, has many advantages that cannot be ignored. This thesis set out to answer a research question that revolved around using existing infrastructure in an efficient manner to provide compute resources continuously and cost-effectively based on workload. The system designed here requires little to no user involvement in its functioning. As a matter of fact, once the user submits a job to his/her VO, he/she does not have to be concerned about how, where and when the resources are acquired; all he/she does is watch the job queue for completion. Finally, by acquiring resources only when needed, the system achieves very high utilization. Nodes that are not used for a defined period of time will be shut down, unless otherwise dictated by site policies. This ensures energy conservation and further cost-effectiveness. All this is accomplished autonomously and dynamically. For future work, adaptive and dynamic watchdog algorithms need to be researched and developed. Also, more efficient job schedulers with fewer dependencies should be investigated. This will further encourage the deployment of compute-intensive and other high-performance computing jobs on such widely-used and widely-available P2P clouds.

Bibliography

[1] Acis lab. Available from: http://www.acis.ufl.edu/. [2] The amazon ec2 architecture. Available from: http://aws.typepad.com/.m/aws/2009/05/new-aws-load-balancin [3] The clemson palmetto cluster. Available from: http://citi.clemson.edu/palm_arch. [4] Cloud computing simplified. Available from: http://ivanov.wordpress.com/2008/05/01/cloud-computing/. [5] Cloud computing: The evolution of a software as a service. Available from: http://knowledge.wpcarey.asu.edu/article.cfm?articleid=1614. [6] Cloud computing wikipedia entry. Available from: http://en.wikipedia.org/wiki/Cloud_computing. [7] Dht wikipedia entry. Available from: http://en.wikipedia.org/wiki/Distributed_hash_table. [8] The differences between peer-to-peer systems and client-server systems. Available from: http://freepctech.com/pc/002/networks007.shtml. [9] Distributed hash tables. Available from: http://www.linuxjournal.com/article/6797. [10] Does parallel really mean fast? does fast really mean efficient? Available from: https://computing.llnl.gov/tutorials/parallel_comp/. [11] Engage VO. Available from: https://twiki.grid.iu.edu/bin/view/Engagement/WebHome. [12] Ibm m44 44x wikipedia entry. Available from: http://en.wikipedia.org/wiki/IBM_M44/44X. [13] Introduction to distributed systems. Available from: http://code.google.com/edu/parallel/dsd-tutorial.html. [14] Kademlia: Wikipedia entry. Available from: http://en.wikipedia.org/wiki/Kademlia. [15] More about ipop - acis ufl. Available from: http://socialvpn.wordpress.com/about/. [16] The palmetto cluster at clemson. Available from: http://citi.clemson.edu/files/palmetto/breeze/palmettoful [17] Pastry dht. Available from: http://research.microsoft.com/en-us/um/people/antr/PASTRY/. [18] Rfc for network address translation. Available from: http://www.faqs.org/rfcs/rfc1631.html. [19] The rise of super computers. Available from: http://www.nap.edu/openbook.php?record_id=11148&page=157. [20] Stun - simple traversal of udp through nat rfc. Available from: http://tools.ietf.org/html/rfc3489. [21] Understanding high throughput computing. Available from: http://www.cs.wisc.edu/condor/htc.html.

57 [22] Virtual organizations and the open science grid. Available from: http://www.opensciencegrid.org/About/Learn_About_Us/OSG_Organization/VOs. [23] Sumalatha Adabala, Vineet Chadha, Puneet Chawla, Renato Figueiredo, José Fortes, Ivan Krsul, Andrea Matsunaga, Mauricio Tsugawa, Jian Zhang, Ming Zhao, Liping Zhu, and Xiaomin Zhu. From virtualized resources to virtual computing grids: the In-VIGO system. Future Generation Computer Systems, 21(6):896–909, June 2005. [24] Amazon Web Services. Amazon elastic compute cloud (amazon EC2). Available from: http://aws.amazon.com/ec2/. [25] Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, and Andrew Warfield. Xen and the art of virtualization. In Nineteenth ACM Symposium on Operating Systems Principles, 2003. [26] P. O. Boykin, J. S. A. Bridgewater, J. S. Kong, K. M. Lozev, B. A. Rezaei, and V. P. Roy- chowdhury. A symphony conducted by Brunet. Online, September 2007. Available from: http://arxiv.org/abs/0709.4048. [27] Mark Burgess and Ricky Ralston. Distributed resource administration using Cfengine. Software – Practice and Experience, 27(9):1083–1101, 1997. [28] Philip H. Carns, Walter B. Ligon, Robert B. Ross, and Rajeev Thakur. PVFS: A parallel file system for Linux clusters. In ALS’00: Proceedings of the 4th annual Linux Showcase and Conference, 2000. [29] Jeffrey S. Chase, David E. Irwin, Laura E. Grit, Justin D. Moore, and Sara E. Sprenkle. Dynamic virtual clusters in a grid site manager. In HPDC ’03: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing, June 2003. [30] Brent Chun, David Culler, Timothy Roscoe, Andy Bavier, Larry Peterson, Mike Wawrzoniak, and Mic Bowman. Planetlab: an overlay testbed for broad-coverage services. SIGCOMM Com- put. Commun. Rev., 33(3):3–12, 2003. doi:http://doi.acm.org/10.1145/956993.956995. [31] Clemson University Cyberinfrastructure Research Group. Cirg home page. Available from: http://cirg.cs.clemson.edu. [32] R. J. Creasy. The origin of the vm/370 time-sharing system. IBM Journal of Research & Development, 25(5):483–490, 1981. [33] R. Davoli. VDE: Virtual Distributed Ethernet. In First International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities (Tridentcom 2005), Trento, Italy, February 2005. [34] Miron Livny Douglas Thain, Todd Tannenbaum. Distributed computing in practice: the con- dor experience. Concurrency and Computation: Practice and Experience, 17:323–356, 2005. Available from: http://dx.doi.org/10.1002/cpe.938. [35] Wesley Emeneker and Dan Stanzione. Dynamic virtual clustering. In 2007 IEEE International Conference on Cluster Computing, 2007. [36] Michael Fenn, Michael A. Murphy, and Sebastien Goasguen. A study of a KVM-based cluster for grid computing. In 47th ACM Southeast Conference (ACMSE ’09), Clemson, SC, March 2009. [37] Renato Figueiredo, Peter A. Dinda, and Jose Fortes. Resource virtualization renaissance. Com- puter, 38(5):28–31, May 2005.

58 [38] Renato J. Figueiredo, Peter A. Dinda, and José A. B. Fortes. A case for grid computing on virtual machines. In 23rd International Conference on Distributed Computing Systems, 2003. [39] Ian Foster. The grid: A new infrastructure for 21st century science. Physics Today, 55(2):42–47, February 2002. [40] Ian Foster and Carl Kesselman. Globus: A metacomputing infrastructure toolkit. International Journal of Supercomputing Applications, 11(2):115–128, 1997. [41] Ian Foster, Carl Kesselman, and Steven Tuecke. The anatomy of the Grid: Enabling scalable virtual organizations. International Journal of Supercomputing Applications, 15(3):200–222, 2001. [42] A. Ganguly, A. Agrawal, P. O. Boykin, and R. Figueiredo. IP over P2P: Enabling self-configuring virtual IP networks for grid computing. In 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006. doi:10.1109/IPDPS.2006.1639287. [43] A. Ganguly, A. Agrawal, P. O. Boykin, and R. Figueiredo. WOW: Self-organizing wide area overlay networks of virtual workstations. In 15th IEEE International Symposium on High Performance Distributed Computing, 2006. doi:10.1109/HPDC.2006.1652133. [44] A. Ganguly, D. Wolinsky, P. O. Boykin, and R. Figueiredo. Decentralized dynamic host con- figuration in Wide-area Overlays of virtual Workstations. In IEEE International Parallel and Distributed Processing Symposium IPDPS 2007, 2007. doi:10.1109/IPDPS.2007.370664. [45] Laura Grit, David Irwin, Aydan Yumerefendi, and Jeff Chase. Virtual machine hosting for net- worked clusters: Building the foundations for ‘autonomic’ orchestration. In First International Workshop on Virtualization Technology in Distributed Computing (VTDC ’06), Tampa, FL, November 2006. [46] David Irwin, Jeff Chase, Laura Grit, Aydan Yumerefendi, David Becker, and Ken Yocum. Sharing network resources with brokered leases. In USENIX Technical Conference, Boston, MA, June 2006. [47] Xuxian Jiang and Dongyan Xu. VIOLIN: Virtual Internetworking on OverLay INfrastructure. Technical Report CSD TR 03-027, Purdue University, July 2003. [48] Ivan Krsul, Arijit Ganguly, Jian Zhang, Jose A. B. Fortes, and Renato J. Figueiredo. VM- Plants: Providing and managing virtual machine execution environments for grid computing. In ACM/IEEE Conference on Supercomputing, 2004. [49] Ian Lumb and Chris Smith. Scheduling attributes and platform LSF, chapter 12, pages 171–182. Kluwer Academic Publishers, Norwell, MA, USA, 2004. [50] Gurmeet Singh Manku, Mayank Bawa, and Prabhakar Raghavan. Symphony: distributed hash- ing in a small world. In USITS’03: Proceedings of the 4th conference on USENIX Symposium on Internet Technologies and Systems, pages 10–10, Berkeley, CA, USA, 2003. USENIX Asso- ciation. [51] Petar Maymounkov and David Mazières. Kademlia: A peer-to-peer information sys- tem based on the xor metric. In Peer-to-Peer Systems, pages 53–65. Springer LNCS, 2002. Available from: http://www.springerlink.com/content/2ekx2a76ptwd24qt, doi:10.1007/3-540-45748-8_5. [52] Michael A. Murphy, Linton Abraham, Michael Fenn, and Sebastien Goasguen. Au- tonomic clouds on the grid. Journal of Grid Computing, 8(1):1–18, March 2010. doi:10.1007/s10723-009-9142-3.

59 [53] Michael A. Murphy, Michael Fenn, and Sebastien Goasguen. Virtual Organization Clusters. In 17th Euromicro International Conference on Parallel, Distributed, and Network-Based Process- ing (PDP 2009), Weimar, Germany, February 2009. [54] Michael A. Murphy, Brandon Kagey, Michael Fenn, and Sebastien Goasguen. Dynamic pro- visioning of Virtual Organization Clusters. In 9th IEEE International Symposium on Cluster Computing and the Grid (CCGrid ’09), Shanghai, China, May 2009. [55] Hideo Nishimura, Naoya Maruyama, and Satoshi Matsuoka. Virtual clusters on the fly - fast, scalable, and flexible installation. In Seventh IEEE International Symposium on Cluster Com- puting and the Grid (CCGRID 2007), May 2007. [56] Bill Nitzberg, Jennifer M. Schopf, and James Patton Jones. PBS Pro: Grid computing and scheduling attributes, chapter 13, pages 183–190. Kluwer Academic Publishers, Norwell, MA, USA, 2004. [57] Open Science Grid. Open Science Grid. Available from: http://www.opensciencegrid.org. [58] Gerald J. Popek and Robert P. Goldberg. Formal requirements for virtualizable third generation architectures. Communications of the ACM, 17(7):412–421, 1974. [59] R. Pordes, D. Petravick, B. Kramer, D. Olson, M. Livny, A. Roy, P. Avery, K. Blackburn, T. Wenaus, F. Würthwein, I. Foster, R. Gardner, M. Wilde, A. Blatecky, J. McGee, and R. Quick. The Open Science Grid: Status and architecture. In International Conference on Computing in High Energy and Nuclear Physics (CHEP ’07), 2007. [60] Red Hat. Kernel-based Virtual Machine. Available from: http://www.linux-kvm.org [cited 10 March 2009]. [61] Antony Rowstron and Peter Druschel. Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems. In Middleware 2001, pages 329–350. Springer LNCS, 2001. Available from: http://www.springerlink.com/content/404522p56nm85503, doi:10.1007/3-540-45518-3_18. [62] Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. Chord: A scalable peer-to-peer lookup service for internet applications. In SIGCOMM ’01: Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications, pages 149–160, New York, NY, USA, 2001. ACM. doi:http://doi.acm.org/10.1145/383059.383071. [63] Ananth I. Sundararaj and Peter A. Dinda. Towards virtual networks for virtual machine grid computing. In Third Virtual Machine Research and Technology Symposium, San Jose, CA, May 2004. [64] Todd Tannenbaum, Derek Wright, Karen Miller, and Miron Livny. Condor – a distributed job scheduler. In Thomas Sterling, editor, Beowulf Cluster Computing with Linux. MIT Press, October 2001. [65] M. Tsugawa and J. A. B. Fortes. A virtual network (ViNe) architecture for grid comput- ing. In 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006. doi:10.1109/IPDPS.2006.1639380. [66] David Isaac Wolinsky, Yonggang Liu, Pierre St. Juste, Girish Venkatasubramanian, and Renato Figueiredo. On the design of scalable, self-configuring virtual networks. In SC ’09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pages 1– 12, New York, NY, USA, 2009. ACM. doi:http://doi.acm.org/10.1145/1654059.1654073.

60 [67] X. Zhang, K. Keahey, I. Foster, and T. Freeman. Virtual cluster workspaces for grid applications. Technical Report ANL/MCS-P1246-0405, Argonne National Laboratory, April 2005.
