INNOVATION MATTERS

Sun Initiatives in Grid and Cloud Computing: Configuring the Cloud
Grid Asia 2008

Stephen Perrenod, Ph.D.
Director, HPC Business Development, APAC
2008.09.15

Agenda

• 1982
• Grid Engine
  > Compute, data, visualization grids
• Network.com Cloud
• SOA
• Grid-in-a-Box
• Project Caroline
• Project Hydrazine
• Ranger
  > Largest open supercomputer in world

Grid – Cloud – SOA

1982

• Sun founded: “The Network is the Computer”
• Supercomputing service bureaus popular
  > 56 kbit link

Our Vision

• The Network is the Computer
• Everyone and everything participates on the network

[Figure: Internet users over time (x-axis: 1995, 2000, 2005, 2007, 2010), reaching 1.5 billion]

Grid Engine

The Sun HPC Stack

Core Solaris and products from Sun

• Developer Tools (Free)
  > Sun Studio
  > Sun HPC Cluster Tools
• Management (Open & Supported Versions)
  > Workload Management: Sun Grid Engine Software
  > Cluster Management: Sun xVM Ops Center Software
• Operating System: Solaris is Open & Free
• Node (Open): Processor 64 Bit
• Interconnect (Open): Myrinet, InfiniBand

[Side label spanning the stack: Sun CRS, Support, Architectural, Professional Services]

Tokyo Institute of Technology TSUBAME on Grid Engine

• Accounting
  > Grid Engine provides detailed accounting information
  > Tight integration with rsh and ssh
• Lots of machines
  > Grid Engine is able to scale to their needs
• Lots of MPI
  > Grid Engine's configurability makes integration possible
• Lots of applications
  > Grid Engine is zero-touch – most apps won't notice
• InfiniBand support
  > Absolutely!

Sun Grid Engine Multi-Clustering

• Grids are monitored by Service Level Objectives
• Policies control relative grid priorities
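
The two multi-clustering bullets above can be illustrated with a minimal sketch. It is not the Service Domain Manager's actual algorithm or API. The `Grid` class, the `rebalance` function, the backlog-based objective and all numbers are hypothetical, chosen only to show how a service level objective and a priority policy could drive the movement of hosts between grids.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """Hypothetical model of one Grid Engine cluster (not a real SGE type)."""
    name: str
    priority: int      # policy: a higher value is served first
    hosts: int         # hosts currently assigned to this grid
    busy_hosts: int    # hosts that are running jobs
    pending_jobs: int  # jobs waiting in the queue
    max_pending: int   # assumed service level objective: allowed backlog

    def hosts_needed(self) -> int:
        # Assumed rule: ask for one extra host per job above the allowed backlog.
        return max(0, self.pending_jobs - self.max_pending)

    def free_hosts(self) -> int:
        return self.hosts - self.busy_hosts


def rebalance(grids, spare_pool):
    """Move hosts to grids that miss their objective; return (moves, spare_pool)."""
    moves = []
    needy = sorted((g for g in grids if g.hosts_needed() > 0),
                   key=lambda g: g.priority, reverse=True)
    for grid in needy:
        need = grid.hosts_needed()
        # Draw from the spare pool first.
        take = min(need, spare_pool)
        if take > 0:
            spare_pool -= take
            grid.hosts += take
            need -= take
            moves.append(("spare pool", grid.name, take))
        # Then borrow idle hosts from grids that meet their objective,
        # lowest priority first.
        donors = sorted((d for d in grids
                         if d is not grid and d.hosts_needed() == 0),
                        key=lambda d: d.priority)
        for donor in donors:
            if need == 0:
                break
            take = min(need, donor.free_hosts())
            if take > 0:
                donor.hosts -= take
                grid.hosts += take
                need -= take
                moves.append((donor.name, grid.name, take))
    return moves, spare_pool


# Example: grid #1 has a backlog, grid #2 has two idle hosts.
grid1 = Grid("grid #1", priority=2, hosts=4, busy_hosts=4, pending_jobs=3, max_pending=1)
grid2 = Grid("grid #2", priority=1, hosts=4, busy_hosts=2, pending_jobs=0, max_pending=1)
moves, spare = rebalance([grid1, grid2], spare_pool=0)
print(moves)
```

In this example grid #1 is two jobs over its allowed backlog and grid #2 has two idle hosts, so both hosts move from grid #2 to grid #1. With a non-empty spare pool, the pool would be drained before any donor grid is touched.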

[Figure: Sun Grid Engine grid #1 and Sun Grid Engine grid #2, both connected to the Service Domain Manager]

[Figure: one grid reports "I need resources", another reports "I have 2 free"; Sun Grid Engine grid #1, Sun Grid Engine grid #2 and a Spare Pool, all connected to the Service Domain Manager]

Sun / APSTC Bio-Cluster Grid

• http://apstc.sun.com.sg - BioBox initiative
  > Bio-Cluster Grid
  > Over 20 of the most popular bioinformatics apps, ready to run on Linux or Solaris clusters
  > Uses Sun Grid Engine for workload management
• Seamless, simplified, GUI-based access to high-throughput computing resources on the grid
• Access to dispersed data resources

Joint collaboration between NUS and APSTC

The Architecture

[Figure: architecture diagram with the labels Jobs, Grid Engine Portal, Sun Grid Engine, Dispatch, Cluster of Execution Hosts, Results]

Bio-ClusterGrid: Benefits

• Fast setup (3 hours)
  > Installation of the 28 bioinformatics applications
  > Activation of Sun Grid Engine
  > Installation and activation of Grid Engine Portal/JES Portal Server
• Compared to manual installation (9 days)
  > Downloading, compiling and installing 28 bioinformatics applications (1 week)
  > Installation and configuration of Sun Grid Engine and Portal (2 days)
• Huge saving in time for administrators
  > No need to deal with the complexity of compilation: compilation errors, checking for and installing dependencies

Sun / APSTC Bio-Cluster Grid Sites

• APAC Bio-ClusterGrid sites:
  > Genome Institute of Singapore
  > BIOTEC - Thailand
  > UKM & USM Grid Testbed - Malaysia
  > School of Biological Sciences, NTU - Singapore
  > School of Chemical and Life Sciences - Singapore Polytechnic
  > Centre for DNA Fingerprinting & Diagnostics - India
  > National Cheng Kung University - Taiwan
  > International Islamic University - Malaysia
  > University of Delhi South Campus - India

Data and Visualization Grids

Sun's HPC Three Tier Storage Architecture for Information Life-cycle Management

[Figure: three storage tiers behind the compute cluster]

• Intermediate Data & Cache: high speed, high I/O, cluster facing
• Primary Storage: medium speed, enterprise class
• Backup and archival facility: lower speed, high capacity

Sun Visualization Grid System

Sun Shared Visualization

• Scalable
• Sharable
• Integratable
• Secure
• Virtualizing Visualization

Sun Scalable Visualization

• Supports combination of multiple graphics devices to drive:
  > Higher Performance
  > Higher Image Quality
  > Higher Resolution

[Figure labels: Graphics, Network]

Network.com

Cloud Computing from Sun

Innovative Business Models: The Sun Grid – now in 25 countries

• No hidden fees
• No minimums
• No barrier to exit
• Immediate access
• Simple licensing
• www.network.com

[Graphic labels: ISVs, $1/CPU-hr]
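
As a worked example of the $1/CPU-hr pricing above, the following sketch computes the cost of a job and the number of CPU-hours at which utility pricing would equal a capital purchase. Only the rate comes from the slide. The function names, the job size (200 CPUs for 48 hours) and the $300,000 cluster price are hypothetical values chosen for illustration.

```python
RATE_USD_PER_CPU_HOUR = 1.00  # rate quoted on the slide


def utility_cost(cpus: int, hours: float,
                 rate: float = RATE_USD_PER_CPU_HOUR) -> float:
    """Cost in USD of running `cpus` processors for `hours` wall-clock hours."""
    return cpus * hours * rate


def break_even_cpu_hours(capital_cost_usd: float,
                         rate: float = RATE_USD_PER_CPU_HOUR) -> float:
    """CPU-hours of utility usage that cost as much as buying hardware.

    Ignores power, cooling and staff, which would raise the in-house cost.
    """
    return capital_cost_usd / rate


# Hypothetical job: 200 CPUs for 48 hours.
job_cost = utility_cost(cpus=200, hours=48)
# Hypothetical in-house cluster costing $300,000.
threshold = break_even_cpu_hours(300_000)

print(f"Job cost: ${job_cost:,.0f}")
print(f"Break-even usage: {threshold:,.0f} CPU-hours")
```

Under these assumptions the job costs $9,600, and the utility stays cheaper than the hypothetical purchase until usage reaches 300,000 CPU-hours.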

Computing Becomes an Operational Expense

Network.com Application Catalog

Enabling On-Demand Delivery of HPC Applications

• Life Sciences Apps:
  > Amber PMEMD-parallel
  > ARMS
  > ASV
  > BLAST
  > ClustalW
  > CSR
  > DOG
  > eHiTS
  > EMBOSS
  > FASTA
  > fastDNAml
  > GROMACS
  > HMMER
  > MPQC
  > NAMD
  > PETSc
  > POP
  > Q-Chem
  > QCM
  > Rational Numbers Assign
  > Rational Numbers Partition
  > Rational Numbers FragSearch
  > Readseq
  > T-Coffee

Network.com users in Life Sciences

• SimBioSys eHiTS
  > By using the computationally intensive eHiTS program on the Sun Grid Compute Utility, scientists can leverage additional compute capacity to speed time to results for molecular docking, and accelerate the pace of innovation and discovery.
• Applera / Applied Biosystems
  > Through Sun Grid Compute Utility, AB was able to perform the compute-intensive data research to develop millions of new genomic assays in a matter of days rather than months. In addition, because the company only had to purchase the number of hours required, at a rate of $1 per CPU hour, it avoided an investment in infrastructure that would have cost the company hundreds of thousands of dollars.

SOA

Healthcare

Industry Trends / Customer Needs / Solution

• Digitization and Compliance
  > Digital Information Flow
  > Management Of Escalating Data
  > Regulatory Privacy & Access Control
  > Health Information Networks
  > Federated IdM
• Consumerism (Patient-centric)
  > Consumer-directed Healthcare
  > Management For Higher Quality And Lower Cost
  > Personal Health Records
  > Long Term Care
  > Disease Management Platforms
• Evidence-based Medicine
  > Genomic Profiling And Pharma Data
  > Guidance Based On Best Practices
  > Data Privacy
  > Electronic Medical Records
  > Identity Management & Security
• Consolidation
  > Vendor Mergers & Acquisitions
  > Leverage Legacy Systems
  > Business Integration & Composite Applications
  > SOA-based Architectures
  > Image Archiving

Integrating Medical Images with EMR Reduces Access Time

Single Patient View of Medical Record Includes Images

ANYTIME, ANYWHERE ACCESS TO A PATIENT’S IMAGES AND TEXT HISTORY

Healthcare “Grid” Customers

• Sun Integration Solutions in Action:
  > UK NHS National Programme for IT - Sun Integration Suite used to help deliver and manage a national patient record database and transactional messaging service
  > Sweden Capio AB - data flow from Oracle to local Patient Admin systems, extracted using Sun (Java CAPS) SeeBeyond eGate integrator and then fed to ERP
  > Austria Oberösterreichische Gesundheits- und Spitals-AG - large hospital operator uses Sun software to manage info flow from 400 interfaces across 50 heterogeneous IT systems
  > UK Salford Royal NHS Hospital Trust - Sun software to manage integration of info from patient admin, pathology, radiology and other systems with central electronic patient record, all on one screen
  > Luxembourg Le Centre Hospitalier de Luxembourg - uses SeeBeyond (Java CAPS) to link SAP-based Hospital Information System with Radiology, Cardiology, Theater, Laboratory and PACS (Picture Archiving and Communication System); also used to exchange info with other hospitals
  > Sweden Kalmar County Council - one patient, one record with Cambio and Sun SeeBeyond (Java CAPS) eInsight Business Process Manager

Sun Integration Technology

• Pragmatic approach to SOA ("An architecture where services are defined and orchestrated using open standards, allowing for a pluggable, agile, heterogeneous service infrastructure")
• Application-to-application Integration - Java CAPS / SeeBeyond (unified suite to develop, deploy, manage, and monitor a SOA)
• Information Lifecycle Management – StorageTek storage and SAM-FS archival file system
• Virtualization (Containers, Domains)
• Security
• Broad range of OS platforms: Solaris, Linux, Windows
• Broad range of hardware: SPARC and CMT, (Intel, AMD)

Grid-in-a-Box

Sun Modular Datacenter S20 / D20

• Standard shipping container packaged with eight standard racks
• Integrated, high-efficiency power and cooling
• Supports a wide range of compute, storage and network infrastructure – build once, deploy anywhere
• Top 250 configurable: over 250 servers when fully configured

Stanford Linear Accelerator

• Stanford Linear Accelerator Center (SLAC) High Performance Computing Node
• Supports Particle Physics Research
  > BaBar experiment (B mesons)
  > The goal of the experiment is to study the violation of charge and parity (CP) symmetry in the decays of B mesons. This violation manifests itself as different behaviour between particles and anti-particles and is the first step to explain the absence of anti-particles in everyday life.

Project Caroline

Project Caroline (projectcaroline.net)

• Sun Labs project - open source research project
• Developing a horizontally scalable platform for the development and deployment of Internet services
• The platform comprises a programmatically configurable pool of virtualized compute, storage, and networking resources
• Project Caroline helps software providers develop services rapidly, update in-production services frequently, and automatically flex their use of platform resources to match changing runtime demands
• Caroline Platform API 10.4.2 released August 2008

Project Caroline
http://blogs.sun.com/zippel/entry/project_caroline_video

• Resources: computation, storage, network
• Processes: Java, Perl, Python (Ruby or PHP can be put on top of those)
• File systems and DBs: uses ZFS, can propagate many independent file systems (via NFS, WebDAV); Postgres DB (later)
• Networks: private and Internet networks; load balancer, NAT, VPN; DNS mapping
• Resource creation, modification, destruction etc., with monitoring
• Create IP addresses for services, DBs, (clients), and an address for the domain name
• Process creation includes firewall rules, IPs, file systems
• Can clone for debug and test
• Layer 4 load balancing
• Can bind directly; static or dynamic net for outbound ....
• Public DNS, plus a private namespace for own account
• Demo of building and taking down

Data Center Virtualization Approaches

Communications / Computation / Storage

[Figure: virtualization layers for Communications, Computation and Storage, ordered by increasing virtualization from bare metal upward. Labels, top to bottom: domain specific languages, Knowledge Spaces, Fortress, MapReduce, Object Store; language VMs (e.g., Java, Python, etc.), DHT, MySQL; Master Worker, MPI, pNFS, distributed FS, GFS; Crossbow, file systems, Zones; TCP/IP network, ZFS, OS (e.g., Solaris, Linux); volumes, OS based virtualization; disk blocks, Ethernet, WiFi, InfiniBand, HW VMs (e.g., LDOM, VMware), SATA, SCSI; bare cables, bare processors, bare disks. Project Caroline and Google are marked on the diagram.]

Project Hydrazine

Vision and Objectives

• Vision
  > Hydrazine enables the rapid mashing of services, creating the “rocket fuel” that will power millions of Java and JavaFX applications delivering personalized, contextual based services across 4 screens
• Objectives
  > Provide a complete development environment enabling the rapid creation and deployment of rich cloud applications
  > Simplify the discovery and utilization of key service enablers, facilitating the creation of compelling blended, personalized, and contextual services
  > Provide means for developers to more easily monetize their services
  > Drive developer adoption to support the creation of back end services for JavaFX and Java applications

AWS is “the launch pad, not just for the next million Facebook apps, but also for personal live TV channels, virtual desktops, pay-by-the-mile auto insurance, and no doubt plenty of things no one has thought of yet” (Wired Magazine, Apr 08)

Hosted, standards-based platform enabling developers to easily discover, blend, deploy, and monetize services

[Figure: Project Hydrazine architecture. Labels: 3rd Party Services; Service Repository; Deployment Platform; Apps; Storage; Service Enablers; Core Services; App Enablers; Comms Suite; Identity; myMedia; 3rd Parties; SP Services; Service Profile; Address Book; Agg; Session; Contextual mapping; Database; GlassFish, OpenESB, MySQL; Network.com]

Ranger at TACC

The First Sun Constellation System in Production – Feb. 2008

• The world’s largest general purpose compute cluster in production (4 Feb. 2008), based on Sun Constellation System
• National Science Foundation funded TeraGrid center – more capacity than all others combined
• Now 580 Tflops
  > 82 Sun ultra-dense blade platforms
  > 2 Sun ultra-dense switches
  > 72 Sun X4500 storage servers
• Sun is the sole HW supplier
• Opteron quad-core based
• Other customers in the pipeline, from small to large

[Image courtesy of Indiana University]

Sun Constellation System

Open Petascale Architecture
Radical Simplicity, Faster Time to Deployment

[Figure: Sun Constellation System cluster (compute racks, reduced cabling, a single switch) compared with other clusters (racks, leaf switches, core switches, cabling infrastructure)]

Constellation System Open Super Computer vs. Alternative Open Standards Fabric

• Switching elements: 1 vs. 300 (300:1 reduction)
• Cables: 1152 vs. 6912 (6:1 reduction)
• Racks: 74 vs. 92 (20% smaller footprint)

Why Does Sun Care about HPC and Grid?

• “Redshift” sectors include HPC – the Universe is Expanding
  > Grid, cluster computing and networking are the drivers
• It's a growing market, outpacing growth of many others
  > $20B opportunity in 2008
• Success in HPC is a precursor to success in Enterprise markets
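
The reduction figures quoted in the Constellation comparison above can be checked from the raw counts. This is a small arithmetic sketch. The counts come from the slide, while the function names and the dictionary layout are illustrative choices.

```python
# Counts quoted on the slide.
constellation = {"switching elements": 1, "cables": 1152, "racks": 74}
alternative = {"switching elements": 300, "cables": 6912, "racks": 92}


def reduction_ratio(alt: int, sun: int) -> float:
    """How many times fewer items the Constellation design needs."""
    return alt / sun


def percent_smaller(alt: int, sun: int) -> float:
    """Relative saving of the Constellation design, in percent."""
    return 100 * (alt - sun) / alt


for item in constellation:
    ratio = reduction_ratio(alternative[item], constellation[item])
    saving = percent_smaller(alternative[item], constellation[item])
    print(f"{item}: {ratio:.1f}:1 reduction, {saving:.1f}% fewer")
```

The switching element and cable ratios come out at exactly 300:1 and 6:1. The rack saving is 18 out of 92 racks, about 19.6%, which the slide rounds to 20%.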

HPC is underserved by Moore's Law

Thank you

Stephen Perrenod, Ph.D. [email protected]