From Computational Science to Science Discovery: The Next Computing Landscape
Gilad Shainer, Brian Sparks, Scot Schultz, Eric Lantz, William Liu, Tong Liu, Goldi Misra
HPC Advisory Council
{Gilad,Brian,Scot,Eric,William,Tong,[email protected]}
Computational science is the field of study concerned with constructing mathematical models and numerical techniques that represent scientific, social scientific or engineering problems, and with employing these models on computers, or clusters of computers, to analyze, explore or solve those problems. Numerical simulation enables the study of complex phenomena that would be too expensive or dangerous to study by direct experimentation. The quest for ever higher levels of detail and realism in such simulations requires enormous computational capacity, and has provided the impetus for breakthroughs in computer algorithms and architectures. Due to these advances, computational scientists and engineers can now solve large-scale problems that were once thought intractable by creating the related models and simulating them on high-performance compute clusters or supercomputers. Simulation is being used as an integral part of manufacturing, design and decision-making processes, and as a fundamental tool for scientific research. Problems where high-performance simulation plays a pivotal role include, for example, weather and climate prediction, nuclear and energy research, simulation and design of vehicles and aircraft, electronic design automation, astrophysics, quantum mechanics, biology, computational chemistry and more.
Computational science is commonly considered the third mode of science, where the previous modes or paradigms were experimentation/observation and theory. In the past, science was performed by observing evidence of natural or social phenomena, recording measurable data related to the observations, and analyzing this information to construct theoretical explanations of how things work. With the introduction of high-performance supercomputers, the methods of scientific research could include mathematical models and simulation of phenomena that are too expensive to study experimentally or beyond the reach of experiment. In turn, we can forecast weather conditions sooner, explore alternative energy sources, build safer vehicles and package consumer goods in a more economical way. In order to perform those numerical simulations effectively and productively, cost-effective, commodity-based supercomputer architectures were created: high-performance clusters of computers.
High-performance computing (HPC) clusters are scalable compute solutions based on industry-standard hardware connected by a private high-speed system network. The main benefits of clusters are affordability, flexibility, availability, high performance and scalability. A cluster uses the aggregated power of compute server nodes to form a high-performance solution for parallel applications. When more compute power is needed, it can be achieved simply by adding more server nodes to the cluster. The Los Alamos National Lab (US) Roadrunner cluster (figure 1) was the first system to provide Petaflop (a thousand trillion floating-point operations per second) performance for scientific simulations (national nuclear weapons, astronomy, human genome science and climate change). Roadrunner was built using IBM Cell and AMD Opteron CPU boards, with Mellanox InfiniBand connecting them. The Oak Ridge National Lab (US) Spider system is one of the world's largest and fastest storage cluster file systems; it includes thousands of connections (based on the InfiniBand interconnect) and over 10.7 Petabytes of storage capacity to serve the high-performance systems at the lab. The National University of Defense Technology (China) TianHe system (figure 2) is the first Petascale system in Asia. The system uses thousands of Intel CPUs and ATI GPUs, all connected via Mellanox InfiniBand networking.
Figure 1: Los Alamos National Lab Roadrunner system, the world's first Petaflop system
Figure 2: National University of Defense Technology TianHe system
With the creation of bigger and faster high-performance computing systems for scientific and engineering simulations, new generations of sensor-computer appliances have been created for specific applications. One example is the Australian Square Kilometre Array Pathfinder (ASKAP), an array of radio telescopes that will comprise 36 antennas, each 12 m in diameter, capable of high dynamic range imaging and using wide-field-of-view phased array feeds. ASKAP will be a telescope that can capture radio images with unprecedented sensitivity over large areas of sky. With a large instantaneous field of view, ASKAP will be able to survey the whole sky vastly faster than is possible with existing radio telescopes.
Figure 3: Illustration of the Australian Square Kilometre Array Pathfinder
Petaflop Supercomputers Create Exa-flood of Data
The ever increasing demand for computational power, met by ever increasing supercomputer capability and capacity, produces an overwhelming flow of data. In one week the Australian Square Kilometre Array Pathfinder will generate more information than is currently contained on the whole World Wide Web, and in one month it will generate more information than is contained in the world's academic libraries. A Petaflop supercomputer equals 150,000 computations for every human on the planet per second, and a single day's usage of the world's TOP500 supercomputers (according to the November 2009 list) is equal to 240 billion people armed with calculators for nearly 50 years.
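The per-person figure above can be checked with simple arithmetic. The sketch below is only an illustration: the world population value (about 6.8 billion, a rough 2009 estimate) is an assumption that does not appear in the text, and the function name is hypothetical.

```python
# Rough check of the "150,000 computations per person per second" figure.
# ASSUMPTION: world population of about 6.8 billion (approximate 2009 value);
# this number is not given in the text.

PETAFLOP = 1e15                  # floating-point operations per second
WORLD_POPULATION_2009 = 6.8e9    # assumed, approximate


def ops_per_person_per_second(system_flops, population):
    """Divide a system's operation rate evenly across a population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return system_flops / population


if __name__ == "__main__":
    share = ops_per_person_per_second(PETAFLOP, WORLD_POPULATION_2009)
    print(f"about {share:,.0f} operations per person per second")
```

Under this assumption the result is roughly 147,000 operations per person per second, consistent with the rounded value of 150,000 quoted above.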
With the increasing ramp of data generation from scientific and engineering simulations and from observation-targeted supercomputers, future technology development should focus on creating scalable high-performance clusters of computers that can manage and process all of this data. Future compute infrastructures should aim to build or provide tools and systems for science discovery, in which all of the computational science literature and databases are available online and shared by scientists, researchers and engineers around the globe. Distributed science can be seen as the fourth mode or paradigm, where science becomes centralized through the centralization of computing facilities, and those computing facilities are then targeted at managing, visualizing and analyzing the data flood. Computational science drives a vast creation of data that is beyond our current capability to analyze and understand, and the role of science discovery will extend to creating the tools that extract future scientific discoveries from the data flood.
Furthermore, in many scientific fields of study, the instruments are extremely expensive, and as such, the data must be shared. With this data explosion, and as high-performance systems become a commodity infrastructure, the pressure to share scientific data is increasing. That resonates well with the emerging computing trend known as the cloud, or cloud computing. While for the moment cloud computing appears to be a cost-effective alternative for IT spending, or a shift of enterprise IT centers from capital expense to operational expense, research institutes have started exploring how cloud computing can create the desired compute centralization and an environment for researchers to share and crunch the flood of data. One example is the new system at the National Energy Research Scientific Computing Center (US), named Magellan. While Magellan's initial target is to provide a tool for computational science in a cloud environment, it can be easily modified to become a center for data processing accessed by many researchers and scientists.
Centralized Data Crunching Compute Environment Through Cloud Computing
The concept of computing in a cloud typically refers to a hosted computational environment (local or remote) that can provide elastic compute and storage services to users on demand. The current usage model of cloud environments is therefore aimed at computational science. Future clouds can serve as environments for distributed science, allowing researchers and engineers to share their data with their peers around the globe and allowing results that were expensive to obtain to be reused in more research projects and scientific discoveries.
To allow the shift to the fourth mode of science discovery, those cloud environments will need not only to provide the capability to share the data created by computational science and the various observation results, but also to provide cost-effective high-performance computing capabilities, similar to those of today's leading supercomputers, in order to rapidly and effectively analyze the data flood. Moreover, an important criterion for clouds is fast provisioning of the cloud resources, both compute and storage, in order to serve many users and many different analyses, and to be able to suspend tasks and bring them back to life quickly. Reliability is another concern: clouds need to be self-healing, so that failing components can be replaced by spares or on-demand resources to guarantee constant access and resource availability.
The use of Grids for scientific computing has become successful in the last few years, and many international projects have led to the establishment of world-wide infrastructures available for computational science. The Open Science Grid provides support for data-intensive research in different disciplines such as biology, chemistry, particle physics, and geographic information systems. Enabling Grids for E-sciencE (EGEE) is an initiative funded by the European Commission that connects more than 91 institutions in Europe, Asia, and the United States of America to construct the largest multi-science computing Grid infrastructure in the world. TeraGrid is an NSF-funded project that provides scientists with a large computing infrastructure built on top of resources at nine resource provider partner sites. It is used by 4000 users at over 200 universities who advance research in molecular bioscience, ocean science, earth science, mathematics, neuroscience, design and manufacturing, and other disciplines. While Grids can provide a good infrastructure for shared science and data analysis, several issues make Grids problematic as the basis for the fourth mode of science: limited software flexibility, the need for applications to be pre-packaged, lack of elasticity and lack of virtualization. Those missing items can be delivered through cloud computing.
Cloud computing addresses many of the aforementioned problems by means of virtualization technologies, which provide the ability to scale the computing infrastructure up and down according to given requirements. By using cloud-based technologies, scientists can have easy access to large distributed infrastructures and completely customize their execution environment. Furthermore, effective provisioning can support many more activities and suspend or bring to life activities in an instant. This makes the spectrum of options available to scientists wide enough to cover any specific need for their research.
High-Performance Cloud Computing
In the past, high-performance computing has not been a good candidate for cloud computing due to its requirement for tight integration between server nodes via low-latency interconnects. The performance overhead associated with host virtualization, a prerequisite technology for migrating local applications to the cloud, quickly erodes application scalability and efficiency in an HPC context. Newer virtualization solutions such as KVM and Xen aim to solve the performance issue and deliver near-native performance from virtual machines by reducing the virtualization management overhead and by allowing the virtual machines direct access to the network.
High-speed networking is a critical requirement for affordable high-performance computing, as clusters of servers and storage need to communicate with each other as fast as possible. For this reason, a vast majority of the world's top 100 supercomputers use high-speed InfiniBand networking, and the interconnect allows those systems to reach more than 90% efficiency, a critical element for effective high-performance computing in any infrastructure, including clouds. The National Energy Research Scientific Computing Center (NERSC, US) Magellan system uses InfiniBand as the interconnect to provide the fastest connection between servers and storage, in order to allow the maximum gain from the system, the highest efficiency and an infrastructure that will be able to analyze data in real time.
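The text does not define efficiency. A common convention, used for example in the TOP500 list, is the ratio of measured sustained performance (Rmax) to theoretical peak performance (Rpeak). The sketch below assumes that definition; the cluster size, clock rate and measured value are hypothetical numbers chosen only for illustration, and the function names are not taken from any library.

```python
# Minimal sketch of cluster efficiency, ASSUMING efficiency = Rmax / Rpeak.
# All input figures below are hypothetical and do not describe a real system.


def peak_tflops(nodes, cores_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak in TFLOPS: every core retires flops_per_cycle each cycle."""
    gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
    return gflops / 1000.0


def cluster_efficiency(rmax_tflops, rpeak_tflops):
    """Fraction of the theoretical peak that a benchmark run actually sustained."""
    if rpeak_tflops <= 0:
        raise ValueError("peak performance must be positive")
    return rmax_tflops / rpeak_tflops


if __name__ == "__main__":
    rpeak = peak_tflops(nodes=1000, cores_per_node=8, clock_ghz=2.5, flops_per_cycle=4)
    rmax = 73.6  # hypothetical measured value in TFLOPS
    print(f"Rpeak = {rpeak:.1f} TFLOPS, efficiency = {cluster_efficiency(rmax, rpeak):.0%}")
```

Under this definition, a slow interconnect shows up as a lower Rmax for the same Rpeak, which is why network latency and bandwidth directly affect the efficiency figure.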
Power consumption is another important issue for high-performance clouds. As HPC clouds become bigger, the affordability of science discovery will be determined by the ability to reduce the costs of power and cooling. Power management, which is implemented within the CPUs, the interconnect, and the system management and scheduling, will need to be integrated as a comprehensive solution. Non-utilized sections of the clouds need to be powered off or moved into power-saving states, and the scheduling mechanism will need to incorporate topology awareness.
The HPC Advisory Council (http://www.hpcadvisorycouncil.com) HPC|Cloud group is working to investigate the creation and usage models of clouds in HPC. Past activities on smart scheduling mechanisms have been published on the council's website, and future results, which will include the usage of KVM and Xen, many-core CPUs (such as AMD's Magny-Cours, which includes 12 cores in a single CPU) and cloud management software (such as Platform ISF), will be published throughout 2010.
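The idea of combining topology awareness with power saving, raised earlier in this section, can be sketched as a placement heuristic: pack a job onto as few racks as possible so that whole racks stay idle and can be powered down. This is a minimal greedy illustration, not the council's published scheduling work; the function name, the node-to-rack mapping and the rack names are all hypothetical.

```python
from collections import defaultdict

# Greedy topology-aware placement sketch (hypothetical, for illustration only).
# free_nodes maps a free node name to the rack that hosts it.


def place_job(free_nodes, nodes_needed):
    """Return (chosen_nodes, idle_racks) for a job needing nodes_needed nodes.

    Racks with the most free nodes are filled first, so the job spans few
    racks and the untouched racks become candidates for a power-saving state.
    """
    if nodes_needed > len(free_nodes):
        raise ValueError("not enough free nodes for this job")

    by_rack = defaultdict(list)
    for node, rack in free_nodes.items():
        by_rack[rack].append(node)

    chosen = []
    # Largest racks first; ties broken by rack name for deterministic output.
    for rack in sorted(by_rack, key=lambda r: (-len(by_rack[r]), r)):
        for node in sorted(by_rack[rack]):
            if len(chosen) == nodes_needed:
                break
            chosen.append(node)
        if len(chosen) == nodes_needed:
            break

    used_racks = {free_nodes[node] for node in chosen}
    idle_racks = sorted(set(by_rack) - used_racks)
    return chosen, idle_racks


if __name__ == "__main__":
    free = {"n1": "rackA", "n2": "rackA", "n3": "rackB", "n4": "rackC", "n5": "rackA"}
    nodes, idle = place_job(free, 3)
    print("run on:", nodes, "| racks that can be powered down:", idle)
```

A production scheduler would also weigh network distance between racks, job priorities and the cost of waking hardware up again; the greedy rule here only shows why placement and power management need to be decided together.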
Science Discovery: The Next Computing Landscape
Science discovery through data-intensive analysis can be the next mode of science, after experimentation/observation, theory and computational science. This will be the mode in which high-performance cloud computing connects the globe and provides the tool for researchers, scientists and engineers to share their experiments and to analyze the increasing volume of data that is being gathered or created. Those cloud environments will be based on commodity servers and storage, all connected via high-speed networking and managed by comprehensive, economical virtualization software.
The HPC Advisory Council will continue to investigate the emerging technologies and aspects that will lead us into the fourth mode of science.
Acknowledgments
The authors would like to thank Cydney Stevens for her vision and guidance.