A Generic Basis for Distributed Kernel Infrastructure

KDDM: A Generic Basis for Distributed Kernel Infrastructure
Renaud Lottiaux - Kerlabs, Erich Focht - NEC
June 28th - OLS 2007 - www.kerlabs.com

Goal of this BoF
- Introduce the KDDM concept
- Measure the interest in such a sub-system
- Identify who could be interested in using it
- Get an idea of how far we could be from an integration in the mainline

Linux and Clustering
- The Linux community was quite skeptical regarding clustering
- However, several cluster projects are already included in the mainline, or close to being included:
  - Transparent Inter Process Communication (TIPC)
  - Distributed Lock Manager (DLM)
  - GFS
  - Oracle Cluster FS 2 (OCFS2)
  - Distributed IPC (DIPC)
  - ...
- This is just the beginning!
- It is a good time to set up the basis of a kernel-level distributed infrastructure: KDDM

Definition of KDDM
- A distributed cache of objects: a generic mechanism to share objects between nodes
- Ensures access to data which is transparent, efficient and coherent
- Objects are accessed through a set of functions: don't care about data localization, just use the data
- KDDM can host any kind of data: memory pages, data structures, ...

Object identifier
- Objects are identified using 3 values: an object id, a set id and a name space id
- A set is a collection of objects; you can freely define your sets (e.g. the pages of the same System V IPC segment, ...)
- A name space is a collection of sets; you can freely define your name spaces (e.g. regular Linux name spaces, ...)

Object coherence
- R/W-semaphore-like access to objects: single writer / multiple readers
- Objects are transparently moved / duplicated between nodes
- Duplication means coherence problems; coherence is managed using an "invalidation on write" mechanism
- Lighter coherence mechanisms will be implemented: update on time-out, hardware-assisted data sharing (specific network needed), ...

Basic KDDM interface

    void *_kddm_get_object (struct kddm_set *kddm_set, objid_t objid);
    void *_kddm_grab_object (struct kddm_set *kddm_set, objid_t objid);
    void *_kddm_put_object (struct kddm_set *kddm_set, objid_t objid);
    void *_kddm_remove_object (struct kddm_set *kddm_set, objid_t objid);

IO Linkers
- A KDDM set is instantiated by an IO linker, which determines the nature of the hosted objects and defines the object input/output functions
- One kind of IO linker per kind of object to share: memory pages, file cache pages, inodes, ...
- IO linkers also define the links between objects and physical nodes: file pages are linked to the node hosting the file data, while process memory pages are not linked to a given node

IO Linker functions
- IO linkers are mainly a set of function pointers defining the behavior of the KDDM set:
  - Object allocation / free
  - First touch
  - Object invalidation
  - Object export / import
  - Object synchronization
  - Etc.
- Default functions are provided for kmalloc-based objects
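
To make the single-writer / multiple-readers rule concrete, here is a minimal sketch (not taken from the slides) of a reader and a writer built on the basic interface above; the set pointer, the string payload stored in the object and the header path are assumptions for illustration only.

    /* Hedged sketch: typical read and write paths over one shared object.
     * "my_set", the string payload and the header path are illustrative
     * assumptions, not part of the slides. */
    #include <linux/kernel.h>
    #include <linux/string.h>
    #include <kddm/kddm.h>                  /* assumed header location */

    static void read_shared_object(struct kddm_set *my_set, objid_t id)
    {
            char *data;

            /* get = shared access: several nodes may hold a read copy */
            data = _kddm_get_object(my_set, id);
            printk("object %lu: %s\n", (unsigned long)id, data);
            _kddm_put_object(my_set, id);   /* release the object */
    }

    static void update_shared_object(struct kddm_set *my_set, objid_t id,
                                     const char *text)
    {
            char *data;

            /* grab = exclusive access: remote copies are invalidated
             * before the write is allowed (single writer) */
            data = _kddm_grab_object(my_set, id);
            strcpy(data, text);
            _kddm_put_object(my_set, id);
    }

The caller never knows, nor needs to know, on which node the object currently lives; the put simply ends the critical section.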
The IO Linker Structure

    struct iolinker_struct {
            int (*instantiate) (struct kddm_set *set, void *private_data, int master);
            void (*uninstantiate) (struct kddm_set *set, int destroy);
            int (*first_touch) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*remove_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*invalidate_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*flush_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*insert_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*put_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*sync_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            void (*change_access) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid, dsm_state_t state);
            void *(*alloc_object) (struct kddm_obj *entry, struct kddm_set *set, objid_t objid);
            int (*import_object) (struct kddm_obj *entry, struct rpc_desc *desc);
            int (*export_object) (struct rpc_desc *desc, struct kddm_obj *obj_entry);
            void (*freeze_object) (struct kddm_obj *obj_entry);
            void (*warm_object) (struct kddm_obj *obj_entry);
            int (*is_frozen) (struct kddm_obj *obj_entry);
            char linker_name[16];
            iolinker_id_t linker_id;
    };

KDDM Architecture
(Diagram: distributed services are built on top of the KDDM core; the core relies on I/O linkers, which in turn interface with the local resource managers of each node.)

Outline
- General overview
- Hello world with KDDM!
- Quick KDDM architecture overview
- System V Shared Memory example

KDDM "Hello World!" (1/3)

    struct iolinker_struct hw_linker = {
            linker_name: "hw",
            linker_id:   1
    };

    struct kddm_set *hw_set;

    void hello_world_setup (void)
    {
            register_io_linker (1, &hw_linker);

            hw_set = create_new_kddm_set (kddm_def_ns, /* Default name space */
                                          1,           /* IO linker id */
                                          KDDM_SET_NOT_LINKED,
                                          64,          /* Size of objects to share */
                                          NULL, 0, 0);
    }

KDDM "Hello World!" (2/3)

    void hello_world_node0 (void)
    {
            char *buf_en, *buf_fr;

            buf_en = kddm_grab_object (hw_set, 0);
            strcpy (buf_en, "Hello ");
            kddm_put_object (hw_set, 0);

            buf_fr = kddm_grab_object (hw_set, 1);
            strcpy (buf_fr, "Bonjour ");
            kddm_put_object (hw_set, 1);
    }

    void hello_world_node1 (void)
    {
            char *buf_en, *buf_fr;

            buf_en = kddm_grab_object (hw_set, 0);
            strcpy (&buf_en[6], "world !");
            kddm_put_object (hw_set, 0);

            buf_fr = kddm_grab_object (hw_set, 1);
            strcpy (&buf_fr[8], "monde !");
            kddm_put_object (hw_set, 1);
    }

KDDM "Hello World!" (3/3)

    /* Node 0 */
    hello_world_setup ();
    hello_world_node0 ();

    /* Node 1 */
    hello_world_setup ();
    hello_world_node1 ();

    /* Reading the objects back, on either node: */
    char *buf;

    buf = kddm_get_object (hw_set, 0);
    printk ("%s\n", buf);                 /* Hello world !   */
    kddm_put_object (hw_set, 0);

    buf = kddm_get_object (hw_set, 1);
    printk ("%s\n", buf);                 /* Bonjour monde ! */
    kddm_put_object (hw_set, 1);

    kddm_remove_object (hw_set, 0);
    kddm_remove_object (hw_set, 1);
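
The hello-world linker above only fills in linker_name and linker_id and therefore relies on the default kmalloc-based behaviour. As an illustration only, here is a hedged sketch of what a slightly fuller linker could look like; the callback signatures come from iolinker_struct above, but the fixed object size, the linker id and the entry->object field name are assumptions.

    /* Hedged sketch: an IO linker that allocates and frees fixed-size
     * kmalloc'd objects. HW_OBJECT_SIZE, the linker id and the
     * "entry->object" field are assumptions made for illustration. */
    #include <linux/slab.h>
    #include <kddm/kddm.h>                  /* assumed header location */

    #define HW_OBJECT_SIZE 64

    static void *hw_alloc_object (struct kddm_obj *entry,
                                  struct kddm_set *set, objid_t objid)
    {
            /* Called when this node needs local storage for a copy of the object */
            return kmalloc (HW_OBJECT_SIZE, GFP_KERNEL);
    }

    static int hw_remove_object (struct kddm_obj *entry,
                                 struct kddm_set *set, objid_t objid)
    {
            /* Called when the object is removed from the set on this node */
            kfree (entry->object);          /* assumed name of the data pointer field */
            return 0;
    }

    static struct iolinker_struct hw_kmalloc_linker = {
            linker_name:   "hw-kmalloc",
            linker_id:     2,               /* hypothetical linker id */
            alloc_object:  hw_alloc_object,
            remove_object: hw_remove_object,
    };

Registering it would mirror the hello-world setup: register_io_linker (2, &hw_kmalloc_linker), followed by create_new_kddm_set () with IO linker id 2.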
How does that work?
(Diagram sequence: kddm_grab_object (hw_set, 0) on node 0 creates the object through the local I/O linker and the node writes "Hello" into it; when node 1 calls kddm_grab_object (hw_set, 0), the object is transferred into node 1's memory, where it becomes "Hello world !"; a final kddm_get_object (hw_set, 0) duplicates the object so that both nodes hold a read copy of "Hello world !".)

KDDM Design
(Diagram: the KDDM interface sits on top of the KDDM core, which groups an object server, a name space (NS) manager, a set manager, an object manager, a protocol engine and the IO linkers; below the core, the KDDM communication interface with hotplug support runs on a communication layer over TIPC.)

Building Distributed SHM with KDDM
- Based on KDDM, building a distributed SHM mechanism is quite simple
- We need to share:
  - Segment content: one SHM memory data IO linker, and a set of KDDM sets instantiated with this linker (one per memory segment)
  - SHM ids: one SHM ids IO linker, and a unique KDDM set hosting the existing ids cluster-wide

Distributed SHM implementation
- On a new segment creation (a hedged sketch of this hook follows the Backups slide below):
  - Hook in the kernel newseg function
  - Create a new KDDM set for the segment data
  - Create a new entry for the segment in the ids KDDM set
  - Make the link between the SHM id and the KDDM data set id
- On segment removal:
  - Hook in the kernel do_shm_rmid function
  - Destroy the data KDDM set
  - Remove the entry from the ids KDDM set
- On segment mapping:
  - Hook in the kernel shm_mmap function
  - Set the vm_ops field of the mapping to our set of functions
- On a segment lookup:
  - Hook in the kernel shm_lock function
  - Check if the requested id exists in the ids KDDM set (kddm_get_object)
- VM operations:
  - no_page: kddm_get_object / kddm_grab_object on the data KDDM set
  - wp_page (present in the 2.2 series): kddm_grab_object on the data KDDM set

Container use in Kerrighed SSI OS
- Used as a basic building block to implement:
  - Process memory migration
  - Cluster-wide memory sharing
  - Cluster-wide file cache sharing
  - Cluster-wide inode sharing
  - Cluster-wide locks
  - Signal sharing
  - Etc.
- Could also be used by other projects: DIPC, OpenSSI, etc.

Conclusion
- KDDM is a high-level abstraction to share data between nodes at kernel level
- KDDM can be used to (more) easily implement distributed services
- It could be a very good basis for a distributed kernel infrastructure

Backups
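
To close, here is a hedged sketch of the newseg hook outlined on the "Distributed SHM implementation" slide, kept deliberately close to the hello-world call sequence. The shm_id_object layout, the SHM_DATA_LINKER id, the shm_ids_set variable and the kddm_set_id () helper are all hypothetical names introduced for illustration; this is not the real Kerrighed implementation.

    /* Hedged sketch of the "on a new segment creation" steps: create a data
     * KDDM set for the segment and record the SHM id -> data set id link in
     * the unique, cluster-wide ids KDDM set. */
    #include <linux/errno.h>
    #include <linux/mm.h>
    #include <kddm/kddm.h>                  /* assumed header location */

    struct shm_id_object {                  /* hypothetical ids-set entry */
            int  shm_id;
            long data_set_id;               /* KDDM set holding the segment pages */
    };

    static int shm_newseg_hook (int shm_id)
    {
            struct kddm_set *data_set;
            struct shm_id_object *entry;

            /* One KDDM set per segment, instantiated with the SHM memory
             * data IO linker (SHM_DATA_LINKER is a hypothetical id) */
            data_set = create_new_kddm_set (kddm_def_ns, SHM_DATA_LINKER,
                                            KDDM_SET_NOT_LINKED, PAGE_SIZE,
                                            NULL, 0, 0);
            if (!data_set)
                    return -ENOMEM;

            /* Publish the new id in the ids KDDM set (shm_ids_set and
             * kddm_set_id () are hypothetical) */
            entry = kddm_grab_object (shm_ids_set, shm_id);
            entry->shm_id = shm_id;
            entry->data_set_id = kddm_set_id (data_set);
            kddm_put_object (shm_ids_set, shm_id);

            return 0;
    }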