IBM Blue Gene
IBM Blue Gene Multiple Processor Systems
Spencer MacDonald, Ryan Williams

Agenda
- Introduction
- Hardware
- Nodes
- Network
- Operating Environment
- Current and Future Systems

Blue Gene/L - World’s fastest computer.

IBM Blue Gene Mission Statement
“The IBM® System Blue Gene® solution is the result of an IBM supercomputing project begun over five years ago, dedicated to building a new family of supercomputers optimized for bandwidth, scalability and the ability to handle large amounts of data while consuming a fraction of the power and floor space required by today’s fastest systems.” -- IBM
Blue Gene systems currently hold the number 1 and 2 spots on the Top 500 list of supercomputers.

Blue Gene Project Goals
- Scientific Research
- Top Supercomputer
- Petaflop Barrier
- Self-Imposed Constraints
  - Low Power Consumption
  - Low Floor Space
  - High performance to area ratio
  - Highly Scalable
Different Size Blue Gene Solutions

Scientific Applications
- G Protein-Coupled Receptors
  - Half of pharmaceutical drugs
  - Tens of billions of dollars annually
- Lipid analysis
  - Lipids enable cell signaling and division
  - Critical to understanding diseases
  - Omega-3 fatty acids and cholesterol
  - Study of how membrane proteins affect membrane environment

Hardware Overview

Blue Gene ASIC
- Two PowerPC 440 CPUs with Double Hummer FPU
- Operating at 700 MHz
- 11.1 mm x 11.1 mm die
- Non-coherent 64K L1 Cache
- Coherent 2 KB fully associative cache
- Shared L3 Cache
- Integrated Communication Assist

IBM Blue Gene Node

IBM Blue Gene Node Board
- 16 Compute Cards per board
- 0-2 I/O Cards per board
- 90-180 Gigaflops

The Communication Networks
- 5 Networks
  - 3D torus
  - Collective Network
  - Global Interrupt
  - Gigabit Ethernet
  - JTAG
  - Clock Distribution

3D torus
- Each node is connected to 6 nearest neighbors
- High bandwidth
- No edges
- Dynamic and Deterministic Routing
- Virtual Buffering
- Cut-through Routing
- 1.4 Gb/s per link
- 1.05 GB/s per Node
64 Rack Blue Gene System

Collective Network
- Max of 30 hops for 65,536 Node system
- Used for global minimum, maximum and sum.
- For floating point sum operations, two passes are used on the network
  - One to find the maximum exponent
  - The second to sum the mantissas

Operating Environment
- Compute nodes run Compute Node Kernel (CNK)
- I/O nodes run PowerPC Linux
- Power-on configuration is done via the service nodes using the control network
- One core on each I/O node is locked in an infinite loop.

Compute Node Environment
- Compute nodes run Compute Node Kernel (CNK)
- A subset of POSIX - One thread per CPU
- Fixed flat address space.
- No Paging
- Kernel and application share the same address space
- Torus network mapped to user space

Communication
- Two supported Processor Modes
  - Coprocessor Mode
    - One MPI task per node, split between processors
    - co_start, co_join
  - Virtual Node Mode
    - Two MPI tasks per node
    - Processors communicate through message passing
- Communication can be done on 3 layers:
  - Packet layer
  - Message layer
  - Message Passing Interface (MPI) based on MPICH2.

The Blue Gene Systems
- Blue Gene/L - Lawrence Livermore National Laboratory
  - #1 Supercomputer - 360 Teraflops - 64 racks, 65,536 nodes
- Blue Gene/W - Thomas J. Watson Research Center
  - #2 Supercomputer - 114 Teraflops - 20 racks, 20,480 nodes
- Juelich Blue Gene/L “JUBL” - John von Neumann Institute for Computing
  - Fastest in Europe - 45.8 Teraflops - 8 racks, 8,192 nodes
- Future Machines
  - Blue Gene/P - 1 petaflop target
  - Blue Gene/Q - 3 petaflop target
  - Blue Gene/C “Cyclops” - Cell architecture

Conclusion
- Top supercomputer
- Low Power Consumption
  - 3.6 percent of the power consumption of the Earth Simulator*
- Low Floor Space
  - 1 percent of the size of the Earth Simulator*
- High performance to area ratio
*When Blue Gene/L surpassed Earth Simulator as world’s fastest computer.
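As a supplement to the 3D torus slide, the following is a minimal Python sketch of why a torus has "no edges": neighbor coordinates wrap around with modular arithmetic, so every node has exactly six nearest neighbors. The function name `torus_neighbors` and the 8 x 8 x 8 dimensions are hypothetical and chosen only for illustration; this is not Blue Gene system code. The last lines check that six links at 1.4 Gb/s are consistent with the 1.05 GB/s per-node figure.

```python
def torus_neighbors(x, y, z, dims):
    """Return the six nearest neighbors of node (x, y, z) in a 3D torus."""
    nx, ny, nz = dims
    # Modular arithmetic wraps coordinates around, so no node sits on an edge.
    return [
        ((x - 1) % nx, y, z), ((x + 1) % nx, y, z),
        (x, (y - 1) % ny, z), (x, (y + 1) % ny, z),
        (x, y, (z - 1) % nz), (x, y, (z + 1) % nz),
    ]


# Hypothetical 8 x 8 x 8 torus: even a "corner" node has six neighbors.
corner_neighbors = torus_neighbors(0, 0, 0, (8, 8, 8))
print(corner_neighbors)

# Six links at 1.4 Gb/s each, converted from gigabits to gigabytes.
LINKS_PER_NODE = 6
LINK_GBITS_PER_S = 1.4
node_gbytes_per_s = LINKS_PER_NODE * LINK_GBITS_PER_S / 8
print(f"{node_gbytes_per_s:.2f} GB/s per node")
```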
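The two-pass floating point sum described on the Collective Network slide can be sketched in Python as follows. This is only an illustration of the idea (a maximum reduction over exponents, then an integer sum of aligned mantissas), simulated on a single list of values rather than on a real network. The function name `two_pass_sum` and the 52-bit mantissa width are assumptions, not details taken from the slides.

```python
import math

MANTISSA_BITS = 52  # assumption: fixed-point width used for aligned mantissas


def two_pass_sum(values):
    """Sum floats using a max-exponent pass followed by an integer-sum pass."""
    # Pass 1: global maximum of the binary exponents (an integer max reduction).
    max_exp = max(math.frexp(v)[1] for v in values)

    # Each node rescales its value to an integer mantissa aligned to max_exp.
    # Bits below the fixed-point range are truncated by int().
    scale = MANTISSA_BITS - max_exp
    mantissas = [int(math.ldexp(v, scale)) for v in values]

    # Pass 2: global integer sum of the aligned mantissas.
    total = sum(mantissas)

    # Convert the integer result back to floating point.
    return math.ldexp(total, -scale)


print(two_pass_sum([1.5, 2.25, -0.75]))
```

Because both passes use only integer maximum and integer sum, the result does not depend on the order in which nodes contribute, unlike a naive floating point reduction.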
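The performance figures on the node board and systems slides can be roughly reproduced with simple arithmetic. The sketch below takes the 700 MHz clock, two CPUs per node, 16 compute cards per board and 65,536 nodes from the slides. It assumes 4 floating point operations per cycle per CPU and 2 nodes per compute card, neither of which is stated in the slides, and it assumes the lower board figure corresponds to one CPU per node doing computation.

```python
CLOCK_MHZ = 700               # from the ASIC slide
CPUS_PER_NODE = 2             # from the ASIC slide
COMPUTE_CARDS_PER_BOARD = 16  # from the node board slide
SYSTEM_NODES = 65_536         # Blue Gene/L, from the systems slide
FLOPS_PER_CYCLE_PER_CPU = 4   # assumption: not stated in the slides
NODES_PER_COMPUTE_CARD = 2    # assumption: not stated in the slides

# Peak rates in megaflops, kept as integers to avoid rounding noise.
cpu_mflops = CLOCK_MHZ * FLOPS_PER_CYCLE_PER_CPU
node_mflops = cpu_mflops * CPUS_PER_NODE
nodes_per_board = COMPUTE_CARDS_PER_BOARD * NODES_PER_COMPUTE_CARD

# Board peak with one CPU per node computing, and with both CPUs computing.
board_low_mflops = nodes_per_board * cpu_mflops
board_high_mflops = nodes_per_board * node_mflops

# Peak of the full 64-rack system.
system_mflops = SYSTEM_NODES * node_mflops

print(f"Node board: {board_low_mflops / 1000:.1f} to "
      f"{board_high_mflops / 1000:.1f} gigaflops")
print(f"64-rack system: {system_mflops / 1_000_000:.0f} teraflops peak")
```

Under these assumptions the board range comes out close to the 90-180 gigaflops quoted on the slide, and the system peak lands near the 360 teraflops quoted for Blue Gene/L.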