OpenSSI (Single System Image) Linux Cluster Project
openssi.org

Jeff Edlund, Senior Principal Solution Architect, NSP
Bruce Walker, Staff Fellow – Office of Strategy & Technology

© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Agenda
• What are today's clustering strategies for Linux
• Why isn't failover clustering enough
• What is Single System Image (SSI)
• Why is SSI so important
• OpenSSI Cluster Project architecture
• Project status

Many Types of Clusters
• High Performance Clusters
¦ Beowulf; 1000 nodes; parallel programs; MPI
• Load-leveling Clusters
¦ move processes around to borrow cycles (e.g. Mosix)
• Web-Service Clusters
¦ LVS/Piranha; load-level TCP connections; replicate data
• Storage Clusters
¦ GFS; parallel filesystems; same view of data from each node
• Database Clusters
¦ Oracle Parallel Server
• High Availability Clusters
¦ ServiceGuard, LifeKeeper, FailSafe, heartbeat, failover clusters
• Single System Image Clusters

Who Is Doing SSI Clustering?
• Outside Linux:
¦ Compaq/HP with VMSClusters, TruClusters, NSK and NSC
¦ Sun had "Full Moon"/Solaris MC (now SunClusters)
¦ IBM Sysplex?
• Linux SSI:
¦ Scyld – form of SSI via Bproc
¦ Mosix – form of SSI due to its home-node/process-migration technique; looking at a single root filesystem
¦ PolyServe – form of SSI via CFS (Cluster File System)
¦ Qlusters – SSI through a software/middleware layer
¦ RedHat GFS – Global File System (based on Sistina)
¦ Hive Computing – declarative programming model for "workers"
¦ OpenSSI Cluster Project – SSI project to bring all attributes together

Scyld – Beowulf
Bproc (used by Scyld):
¦ process-related solution
¦ master node with slaves
¦ initiate a process on the master node and explicitly "move", "rexec" or "rfork" it to a slave node (a C sketch of this model appears after the RedHat GFS slide below)
¦ all files are closed when the process is moved
¦ the master node can "see" all the processes that were started there
¦ moved processes see the process space of the master (some pid mapping)
¦ process-related system calls are shipped back to the master node (including fork)
¦ other system calls execute locally, but not SSI

Mosix
Mosix / OpenMosix:
¦ home nodes with slaves
¦ initiate a process on the home node and transparently migrate it to other nodes
¦ the home node can "see" all, and only, the processes started there
¦ moved processes see the view of the home node
¦ most system calls are actually executed back on the home node
¦ DFSA helps allow I/O to be local to the process

PolyServe
Matrix Server:
¦ completely symmetric Cluster File System with DLM (no master/slave relationships)
¦ each node must be directly attached to the SAN
¦ limited SSI for management
¦ no SSI for processes
¦ no load balancing

Qlusters
ClusterFrame:
¦ based on Mosix
¦ uses home-node SSI
¦ centralized policy-based management
• reduces overhead
• pre-determined resource allowances
• centralized provisioning
¦ stateful application recovery
[Diagram: the ClusterFrame stack – application components run on ClusterFrame XHA (Xtreme High Availability) and ClusterFrame SSI (Single System Image); enterprise cluster management via ClusterFrame QRM (Qlusters Resource Manager); all on the ClusterFrame Platform over the Linux kernel on Intel blades and storage systems]

RedHat GFS – Global File System
RedHat Cluster Suite (GFS):
¦ formerly Sistina
¦ primarily a parallel physical file system (the only real form of SSI)
¦ used in conjunction with the RedHat cluster manager to provide
• high availability
• IP load balancing
¦ limited sharing and no process load balancing
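To make the Scyld/Bproc model above concrete, here is a minimal sketch of a process that starts on the master node and explicitly moves itself to a slave. It assumes the classic BProc user API (bproc_currnode() and bproc_move() from <sys/bproc.h>); the exact names and signatures are recalled from the BProc library and should be treated as assumptions, and the slave node number 1 is purely illustrative.

/* BProc master/slave sketch: start on the master, explicitly migrate
 * to a slave node. API names/signatures are assumptions (see above). */
#include <stdio.h>
#include <unistd.h>
#include <sys/bproc.h>

int main(void)
{
    int slave = 1;                 /* target slave node; the master is node 0 */

    printf("started on node %d (the master)\n", bproc_currnode());
    fflush(stdout);                /* per the slide, open files are closed on a move */

    if (bproc_move(slave) < 0) {   /* explicitly migrate this process */
        perror("bproc_move");
        return 1;
    }

    /* Now running on the slave. Process-related system calls (fork,
     * wait, kill) are shipped back to the master, so this process is
     * still visible in the master's process space under a mapped pid. */
    for (volatile long i = 0; i < 100000000L; i++)
        ;                          /* stand-in for CPU-bound work on the slave */
    return 0;
}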
Hive Computing – Tsunami
Hive Creator:
¦ hives can be made up of any number of IA32 machines
¦ hives consist of:
• client applications
• Hive client API
• workers
• worker applications
¦ databases exist outside of the Hive
¦ applications must be modified to run in a Hive
¦ no cluster file system
¦ closer to a Grid model than to SSI

Are There Opportunity Gaps in the Current SSI Offerings?
YES!! A full SSI solution is the foundation for simultaneously addressing all the issues in all the cluster solution areas.
Opportunity to combine:
• high availability
• IP load balancing
• IP failover
• process load balancing
• cluster filesystem
• Distributed Lock Manager
• single namespace
• much more …

What Is a Full Single System Image Solution?
The complete cluster looks like a single system to:
¦ users;
¦ administrators;
¦ programs and programmers.
Co-operating OS kernels provide transparent access to all OS resources cluster-wide, using a single namespace.
¦ A.K.A. – you don't really know it's a cluster! The state of cluster nirvana.

SMP – Symmetrical Multi-Processing Functionality
SMP delivers manageability, usability and sharing/utilization, but not high availability, scalability, incremental growth or price/performance.

Value Add of HA Clustering to SMP
Traditional failover clusters add exactly what SMP lacks: high availability, scalability, incremental growth and price/performance.

SSI Clusters Have the Best of Both!!

Function               SMP   Traditional Clusters   SSI Clusters
Manageability          Yes   –                      Yes
Usability              Yes   –                      Yes
Sharing / Utilization  Yes   –                      Yes
High Availability      –     Yes                    Yes
Scalability            –     Yes                    Yes
Incremental Growth     –     Yes                    Yes
Price / Performance    –     Yes                    Yes

Common Clustering Goals
One or all of:
• High Availability
¦ a compute engine is always available to run my workload
• Scalability
¦ as I need more resource I can access it, transparently to the end-user application
• Manageability
¦ I can guarantee some level of service because I can efficiently monitor, operate and service my compute resources
• Usability
¦ compute resources are assembled in such a way as to give me trouble-free, easy operation without requiring knowledge of the cluster

OpenSSI Linux Cluster Project
[Chart: availability, scalability (log scale, from "Really BIG" to "HUGE"), manageability and usability compared for an SMP, a typical HA cluster and the OpenSSI Linux Cluster Project; the ideal/perfect cluster spans all dimensions]

Overview of the OpenSSI Cluster
• single HA root filesystem
• consistent OS kernel on each node
• cluster formation early in boot
• strong membership (a heartbeat-style sketch of this idea appears after the Availability slide)
• single, clusterwide view of files, filesystems, devices, processes and IPC objects
• single management domain
• load balancing of connections and processes

OpenSSI Cluster Project – Availability
• no single (or even multiple) point(s) of failure
• automatic failover/restart of services in the event of hardware or software failure
• application availability is simpler in an SSI cluster environment; stateful restart is easily done
• an SSI cluster provides a simpler operator and programming environment
• online software upgrade
• architected to avoid scheduled downtime
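Automatic failover presupposes the strong membership mentioned in the overview: the cluster must agree on which nodes are up before it can restart anything. Below is a minimal heartbeat-style sketch of that idea in plain POSIX C. It is illustrative only, not OpenSSI's actual membership protocol (which runs at kernel level, early in boot); the port number, node limit, timeouts and the localhost target are all arbitrary choices for the sketch.

/* Heartbeat membership sketch.
 * usage: ./member           monitor mode
 *        ./member <node-id> sender mode (aimed at localhost here;
 *                           a real cluster would use the interconnect) */
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

#define MAX_NODES 16
#define PORT      5000
#define DEAD_SECS 3

int main(int argc, char **argv)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc > 1) {                        /* sender: announce every second */
        int node = atoi(argv[1]);
        addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
        for (;;) {
            sendto(sock, &node, sizeof node, 0,
                   (struct sockaddr *)&addr, sizeof addr);
            sleep(1);
        }
    }

    /* monitor: track the last heartbeat per node, time out silent ones */
    time_t last_seen[MAX_NODES] = {0};
    int up[MAX_NODES] = {0};
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(sock, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind");
        return 1;
    }
    struct timeval tv = {1, 0};            /* 1s recv timeout so we can scan */
    setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    for (;;) {
        int node;
        if (recv(sock, &node, sizeof node, 0) == (ssize_t)sizeof node &&
            node >= 0 && node < MAX_NODES) {
            last_seen[node] = time(NULL);
            if (!up[node]) { up[node] = 1; printf("node %d UP\n", node); }
        }
        for (node = 0; node < MAX_NODES; node++)
            if (up[node] && time(NULL) - last_seen[node] > DEAD_SECS) {
                up[node] = 0;              /* would trigger failover/restart */
                printf("node %d DOWN\n", node);
            }
    }
}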
OpenSSI Cluster Project – Price/Performance Scalability
• What is scalability?
¦ environmental scalability and application scalability!
• Environmental (cluster) scalability:
¦ more USEABLE processors, memory, I/O, etc.
¦ SSI makes these added resources useable

OpenSSI Cluster Project – Price/Performance Scalability
Application scalability:
• SSI makes distributing function very easy
• SSI allows sharing of resources between processes on different nodes (all resources transparently visible from all nodes):
¦ filesystems, IPC, processes, devices*, networking*
• SSI allows replicated instances to co-ordinate (almost as easily as replicated instances on an SMP; in some ways much better)
• load balancing of connections and processes
• OS version in local memory on each node
• industry-standard hardware (can mix hardware)
• distributed OS algorithms written to scale to hundreds of nodes (and successfully demonstrated on 133 blades and on 27 Itanium SMP nodes)

OpenSSI Linux Cluster – Manageability
• single installation
¦ joining the cluster is automatic as part of booting and doesn't have to be managed
• trivial online addition of new nodes
• use standard single-node tools (SSI administration)
• visibility of all resources of all nodes from any node
¦ applications, utilities, programmers, users and administrators often needn't be aware of the SSI cluster
• simpler HA (High Availability) management

OpenSSI Linux Cluster – Single System Administration
• single set of user accounts (not NIS)
• single set of filesystems (no "network mounts")
• single set of devices
• single view of networking
• single set of services (printing, dumps, networking*, etc.)
• single root filesystem (lots of admin files there)
• single set of paging/swap spaces (goal)
• single install
• single boot and single copy of the kernel
• single-machine management tools

OpenSSI Linux Cluster – Ease of Use
• can run anything anywhere with no setup (the sketch after the Blades slide shows why);
• can see everything from any node;
• service failover/restart is trivial;
• automatic or manual load balancing;
¦ a powerful environment for application service provisioning, monitoring and re-arranging as needed

Blades and OpenSSI Clusters
¦ very simple provisioning of hardware, system and applications
¦ no root filesystem per node
¦ single install of the system and a single application install
¦ nodes can netboot
¦ local disk only needed for swap, but can be shared
¦ blades don't need FCAL connect but can use it
¦ single, highly available IP address for the cluster
¦ single system update
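One way to see what "run anything anywhere with no setup" means: an ordinary fork/pipe program like the sketch below needs no cluster awareness at all. On an SSI cluster the forked workers may be started on, or migrated to, other nodes by the process load balancer, while the pipe, the pids and wait() keep working through the single clusterwide process and IPC space. This is plain POSIX code with nothing OpenSSI-specific in it; the worker count and the "work" are placeholders.

/* Plain POSIX fork/pipe worker pool: on an SSI cluster the children
 * may run on any node, yet the pipe and wait() behave as on one box. */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NWORKERS 4

int main(void)
{
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    for (int i = 0; i < NWORKERS; i++) {
        pid_t pid = fork();            /* may land on any cluster node */
        if (pid < 0) { perror("fork"); return 1; }
        if (pid == 0) {
            char msg[64];              /* stand-in for a slice of real work */
            snprintf(msg, sizeof msg, "worker %d (pid %d) done\n",
                     i, (int)getpid());
            write(fds[1], msg, strlen(msg));
            _exit(0);
        }
    }
    close(fds[1]);                     /* parent keeps only the read end */

    char buf[64];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, (size_t)n, stdout);   /* results from all workers */

    while (wait(NULL) > 0)             /* reap children, wherever they ran */
        ;
    return 0;
}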