From Computational Science to Science Discovery: The Next Computing Landscape

Gilad Shainer, Brian Sparks, Scot Schultz, Eric Lantz, William Liu, Tong Liu, Goldi Misra

HPC Advisory Council

{Gilad,Brian,Scot,Eric,William,Tong,[email protected]}



Computational science is the field of study concerned with constructing mathematical models and numerical techniques that represent scientific, social scientific or engineering problems, and with employing these models on computers, or clusters of computers, to analyze, explore or solve these models. Numerical simulation enables the study of complex phenomena that would be too expensive or dangerous to study by direct experimentation. The quest for ever higher levels of detail and realism in such simulations requires enormous computational capacity, and has provided the impetus for breakthroughs in computer algorithms and architectures. Due to these advances, computational scientists and engineers can now solve large-scale problems that were once thought intractable by creating the related models and simulating them on high-performance compute clusters or supercomputers. Simulation is being used as an integral part of manufacturing, design and decision-making processes, and as a fundamental tool for scientific research. Problems where high-performance simulation plays a pivotal role include, for example, weather and climate prediction, nuclear and energy research, simulation and design of vehicles and aircraft, electronic design automation, astrophysics, quantum mechanics, biology, computational chemistry and more.

Computational science is commonly considered the third mode of science, where the previous modes or paradigms were experimentation/observation and theory. In the past, science was performed by observing evidence of natural or social phenomena, recording measurable data related to the observations, and analyzing this information to construct theoretical explanations of how things work. With the introduction of high-performance supercomputers, the methods of scientific research could include mathematical models and simulation of phenomena that are too expensive to study directly or otherwise beyond our reach. In turn, we can forecast weather conditions sooner, explore alternative energy sources, build safer vehicles and package consumer goods in a more economical way. In order to perform those numerical simulations effectively and productively, cost-effective, commodity-based supercomputer architectures were created: high-performance clusters of computers.

High-performance computing (HPC) clusters are scalable compute solutions based on industry-standard hardware connected by a private high-speed network. The main benefits of clusters are affordability, flexibility, availability, high performance and scalability. A cluster uses the aggregated power of compute server nodes to form a high-performance solution for parallel applications. When more compute power is needed, it can be achieved simply by adding more server nodes to the cluster. The Los Alamos National Lab (US) “Roadrunner” cluster (figure 1) was the first system to provide Petaflop performance (a thousand trillion floating-point operations per second) for scientific simulations (national nuclear weapons, astronomy, human genome science and climate change). Roadrunner was built using IBM Cell and AMD Opteron CPU boards, with Mellanox InfiniBand connecting them. The Oak Ridge National Lab (US) “Spider” system is one of the world’s largest and fastest storage cluster file systems; it includes thousands of connections (based on the InfiniBand interconnect) and over 10.7 Petabytes of storage capacity to serve the high-performance systems at the lab. The National University of Defense Technology (China) “TianHe” system (figure 2) is the first Petascale system in Asia. The system uses thousands of Intel CPUs and ATI GPUs, all connected via Mellanox InfiniBand networking.





Figure 1 – Los Alamos National Lab “Roadrunner” system, the world’s first Petaflop system





Figure 2 – National University of Defense Technology “TianHe” system



With the creation of bigger and faster high-performance computing systems for scientific and engineering simulations, new generations of sensor-computer appliances have been created for specific applications. One example is the Australian Square Kilometre Array Pathfinder (ASKAP), an array of radio telescopes that will comprise 36 antennas, each 12 m in diameter, capable of high dynamic range imaging and using wide-field-of-view phased array feeds. ASKAP will be a telescope that can capture radio images with unprecedented sensitivity over large areas of sky. With a large instantaneous field of view, ASKAP will be able to survey the whole sky vastly faster than is possible with existing radio telescopes.



Figure 3 – Illustration of the Australian Square Kilometre Array Pathfinder



Petaflop Supercomputers Create Exa-flood of Data

The ever-increasing demand for computational power, met by ever-increasing supercomputer capability and capacity, produces an overwhelming flow of data. In one week the Australian Square Kilometre Array Pathfinder will generate more information than is currently contained on the whole World Wide Web, and in one month it will generate more information than is contained in the world's academic libraries. A Petaflop supercomputer equals 150,000 computations per second for every human on the planet, and a single day’s usage of the world’s TOP500 supercomputers (according to the November 2009 list) is equal to 240 billion people armed with calculators working for nearly 50 years.
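The per-person figure can be checked with a back-of-envelope calculation. The sketch below is only an illustration; the world population value is an assumption (roughly 6.8 billion at the time of writing) and does not come from the text.

```python
# Back-of-envelope check of the "computations per person" comparison.
PETAFLOP = 10**15  # floating-point operations per second

# Assumption (not from the text): world population of about 6.8 billion.
WORLD_POPULATION = 6_800_000_000


def ops_per_person_per_second(system_flops: float, population: int) -> float:
    """Spread a system's operations per second evenly over a population."""
    return system_flops / population


per_person = ops_per_person_per_second(PETAFLOP, WORLD_POPULATION)
print(f"{per_person:,.0f} operations per person per second")
```

Under that assumption the result is a little under 150,000, in line with the rounded figure quoted above.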

With the accelerating pace of data generation from scientific and engineering simulations and from observation-targeted supercomputers, future technology development should focus on creating scalable high-performance clusters of computers that can manage and process all of this data. The future premise of compute infrastructures should be to build or provide tools and systems for “science discovery”, in which all of the computational science literature and databases are available online and shared by scientists, researchers and engineers around the globe. Distributed science can be seen as the fourth mode or paradigm, where science becomes centralized through the centralization of computing facilities, and those computing facilities are then targeted at managing, visualizing and analyzing the data flood. Computational science drives the creation of vast amounts of data, beyond our capabilities to analyze and understand, and the role of science discovery will extend to creating the tools that extract future scientific discoveries from the data flood.

Furthermore, in many scientific fields of study, the instruments are extremely expensive, and as such, the data must be shared. With this data explosion, and as high-performance systems become a commodity infrastructure, the pressure to share scientific data is increasing. That resonates well with the emerging computing trend known as “the cloud” or “cloud computing”. While for the moment cloud computing appears to be a cost-effective alternative for IT spending, or a shift of enterprise IT centers from capital expense to operational expense, research institutes have started exploring how cloud computing can create the desired compute centralization and an environment for researchers to share and crunch the flood of data. One example is the new system at the National Energy Research Scientific Computing Center (US), named “Magellan”. While Magellan’s initial target is to provide a tool for computational science in a cloud environment, it can be easily modified to become a center for data processing accessed by many researchers and scientists.



Centralized Data Crunching Compute Environment Through Cloud Computing

The concept of computing “in a cloud” typically refers to a hosted computational environment (which could be local or remote) that can provide elastic compute and storage services to users on demand. The current usage model of cloud environments is therefore aimed at computational science. Future clouds can serve as environments for distributed science, allowing researchers and engineers to share their data with their peers around the globe and allowing results that were expensive to obtain to be used for more research projects and scientific discoveries.

To allow the shift to the fourth mode of “science discovery”, those cloud environments will need not only to provide the capability to share the data created by computational science and by the various observations, but also to provide cost-effective high-performance computing capabilities, similar to those of today’s leading supercomputers, in order to analyze the data flood rapidly and effectively. Moreover, an important criterion for clouds is fast provisioning of cloud resources, both compute and storage, in order to serve many users and many different analyses, and to be able to suspend tasks and bring them back to life quickly. Reliability is another concern: clouds need to be “self-healing”, so that failing components can be replaced by spares or on-demand resources to guarantee constant access and resource availability.
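The “self-healing” behaviour described above can be sketched as a pool that swaps a failed node for a spare. This is a minimal illustration only; the `ResourcePool` class, its `heal` method and the node names are hypothetical and do not describe any particular cloud management product.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ResourcePool:
    """Hypothetical pool of active nodes backed by a list of spares."""
    active: Set[str] = field(default_factory=set)
    spares: List[str] = field(default_factory=list)

    def heal(self, failed_node: str) -> Optional[str]:
        """Remove a failed node and bring in a spare, if one is available.

        Returns the name of the replacement node, or None when no spare
        is left and the pool has to run with reduced capacity.
        """
        if failed_node not in self.active:
            raise KeyError(f"{failed_node} is not an active node")
        self.active.remove(failed_node)
        if not self.spares:
            return None
        replacement = self.spares.pop(0)
        self.active.add(replacement)
        return replacement


pool = ResourcePool(active={"node-1", "node-2"}, spares=["spare-1"])
replacement = pool.heal("node-2")
print(replacement, sorted(pool.active))
```

A production system would also need failure detection and would request on-demand resources when the spares run out; both are left out of this sketch.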

The use of Grids for scientific computing has become successful in the last few years, and many international projects have led to the establishment of worldwide infrastructures available for computational science. The Open Science Grid provides support for data-intensive research in different disciplines such as biology, chemistry, particle physics, and geographic information systems. Enabling Grids for E-sciencE (EGEE) is an initiative funded by the European Commission that connects more than 91 institutions in Europe, Asia, and the United States of America to construct the largest multi-science computing Grid infrastructure in the world. TeraGrid is an NSF-funded project that provides scientists with a large computing infrastructure built on top of resources at nine resource provider partner sites. It is used by 4000 users at over 200 universities who advance research in molecular bioscience, ocean science, earth science, mathematics, neuroscience, design and manufacturing, and other disciplines. While Grids can provide a good infrastructure for shared science and data analysis, several issues make it problematic for Grids to lead the fourth mode of science: limited software flexibility, the typical need for applications to be pre-packaged, lack of elasticity and lack of virtualization. Those missing items can be delivered through cloud computing.

Cloud computing addresses many of the aforementioned problems by means of virtualization technologies, which provide the ability to scale the computing infrastructure up and down according to given requirements. By using cloud-based technologies, scientists can have easy access to large distributed infrastructures and completely customize their execution environment. Furthermore, effective provisioning can support many more activities and can suspend activities or bring them back to life in an instant. This makes the spectrum of options available to scientists wide enough to cover any specific need for their research.



High-Performance Cloud Computing

In the past, high-performance computing has not been a good candidate for cloud computing due to its requirement for tight integration between server nodes via low-latency interconnects. The performance overhead associated with host virtualization, a prerequisite technology for migrating local applications to the cloud, quickly erodes application scalability and efficiency in an HPC context. New virtualization solutions such as KVM and Xen aim to solve the performance issue by delivering native performance from within virtual machines, both by reducing the virtualization management overhead and by allowing direct access from the virtual machines to the network.

High-speed networking is a critical requirement for affordable high-performance computing, as clusters of servers and storage need to communicate with each other as fast as possible. For this reason, a vast majority of the world's top 100 supercomputers use high-speed InfiniBand networking, and the interconnect allows those systems to reach more than 90% efficiency, a critical element for effective high-performance computing in any infrastructure, including clouds. The National Energy Research Scientific Computing Center (NERSC, US) “Magellan” system uses InfiniBand as the interconnect to provide the fastest connection between servers and storage, in order to allow the maximum gain from the system, the highest efficiency and an infrastructure that will be able to analyze data in real time.
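The text does not define efficiency. A common convention for ranked supercomputers, assumed here, is the ratio of sustained benchmark performance to theoretical peak performance. The sketch below illustrates that ratio; the `cluster_efficiency` helper and the sample numbers are hypothetical and are not measurements of any system named in this paper.

```python
def cluster_efficiency(sustained_flops: float, peak_flops: float) -> float:
    """Return sustained performance as a fraction of theoretical peak."""
    if peak_flops <= 0:
        raise ValueError("peak_flops must be positive")
    return sustained_flops / peak_flops


# Hypothetical cluster: 100 Teraflops peak, 92 Teraflops sustained.
peak = 100e12
sustained = 92e12
efficiency = cluster_efficiency(sustained, peak)
print(f"efficiency: {efficiency:.0%}")
```

Under this definition, a slow interconnect lowers the sustained figure while the peak stays fixed, which is why the network shows up directly in the efficiency number.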

Power consumption is another important issue for high-performance clouds. As HPC clouds become bigger, the affordability of science discovery will be determined by the ability to save on the costs of power and cooling. Power management, which is implemented within the CPUs, the interconnect, and the system management and scheduling, will need to be integrated into a comprehensive solution. Non-utilized sections of the clouds need to be powered off or moved into power-saving states, and the scheduling mechanism will need to incorporate topology awareness.

The HPC Advisory Council (http://www.hpcadvisorycouncil.com) HPC|Cloud group is working to investigate the creation and usage models of clouds in HPC. Past activities on smart scheduling mechanisms have been published on the council’s website, and future results, covering the use of KVM and Xen, many-core CPUs (such as AMD's Magny-Cours, which includes 12 cores in a single CPU) and cloud management software (such as Platform ISF), will be published throughout 2010.
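The idea of moving non-utilized sections of a cloud into power-saving states can be sketched in a few lines. This is only an illustration under stated assumptions: the node names, the utilization figures and the `plan_power_states` helper are hypothetical and do not describe any scheduler studied by the council.

```python
from typing import Dict


def plan_power_states(utilization: Dict[str, float],
                      idle_threshold: float = 0.0) -> Dict[str, str]:
    """Map each node to "active" or "power-save" based on its utilization.

    utilization holds a value between 0.0 and 1.0 per node; nodes at or
    below idle_threshold are treated as non-utilized.
    """
    return {
        node: "power-save" if load <= idle_threshold else "active"
        for node, load in utilization.items()
    }


# Hypothetical snapshot of a four-node section of a cloud.
snapshot = {"node-1": 0.85, "node-2": 0.0, "node-3": 0.40, "node-4": 0.0}
plan = plan_power_states(snapshot)
for node, state in plan.items():
    print(node, state)
```

A real scheduler would also take topology into account, for example by consolidating jobs onto nodes that share a switch before powering down the rest; that step is omitted here.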



Science Discovery: The Next Computing Landscape

Science discovery through data-intensive analysis can be the next mode of science, after experimentation/observation, theory and computational science. This will be the mode in which high-performance cloud computing connects the globe and provides the tool for researchers, scientists and engineers to share their data and to analyze the increasing amounts of data being gathered or created. Those cloud environments will be based on commodity servers and storage, all connected via high-speed networking, with comprehensive, economical virtualization software management.

The HPC Advisory Council will continue to investigate the emerging technologies and aspects that will lead us into the fourth mode of science.



Acknowledgments

The authors would like to thank Cydney Stevens for her vision and guidance.