Ganeti Extstorage Interface: More Options for Your Data
Ganeti, the New and Arcane
ganeti's best kept secrets, and exciting new developments
Ganeti Eng Team - Google
LinuxCon Japan 2014 - 2 Feb 2014

Introduction to Ganeti
A cluster virtualization manager, in one slide

What is Ganeti?
· Manages clusters of 1-200 physical machines, divided into nodegroups
· Deploys Xen/KVM/LXC virtual machines on them
  - Live migration
  - Resiliency to failure (DRBD, Ceph, SAN/NAS, ...)
  - Cluster balancing
  - Ease of repairs and hardware swaps
· Controlled via command line, REST, and web interfaces
4/53

Newest features
Development status

2.10 - The very stable release
· Improved upgrade procedure: "gnt-cluster upgrade"
· CPU load in hail/hbal (GSoC project)
· Hotplug support (KVM)
· RBD storage direct access (KVM)
· Better Open vSwitch support (GSoC project)
6/53

2.11 - The latest stable release
· Faster instance moves
· GlusterFS support
· hsqueeze (achieves maximum cluster compaction)
7/53

2.12 and future - The next stable release(s)
· Jobs as processes
· New install model
· More secure master candidates
· Better container support (GSoC)
· Resource reservation / extra parallelization
· Generic conversion between disk templates (GSoC)
8/53

Monitoring daemon
What's going on in your cluster?

Monitoring a cluster - The old school way
[diagram: an external cluster monitoring system queries the master node and each node's NICs, instances and storage directly]
10/53

Monitoring a cluster - Using the monitoring daemon
[diagram: the external cluster monitoring system talks to per-node monitoring daemons instead]
11/53

What is the monitoring daemon?
It provides information:
· about the cluster state/health
· live
· read-only
design doc: design-monitoring-agent.rst
12/53

More details
· HTTP daemon
  - Replying to REST-like queries
  - Actually, GET only
· Providing JSON replies
  - Easy to parse in any language
  - Already used in all the rest of Ganeti
· Running on every node (not only master candidates or VM-enabled nodes)
· Additionally: mon-collector, a quick 'n dirty CLI tool
13/53

Data collectors
· provide data to the daemon
· one collector, one report
· one collector, one category:
  - storage, hypervisor, daemon, instance
· two kinds: performance reporting, status reporting
· new feature: stateful data collectors
14/53

Data collectors - What data can be retrieved right now?
Now:
· instance status (Xen only) (category: instance)
· diskstats information (storage)
· LVM logical volumes information (storage)
· DRBD status information (storage)
· node OS CPU load average (no category, default)
Soon(-ish):
· instance status for KVM (instance)
· Ganeti daemons status (daemon)
· hypervisor resources (hypervisor)
· node OS resources report (default)
15/53

The report format (JSON)
{
  "name" : "TheCollectorIdentifier",
  "version" : "1.2",
  "format_version" : 1,
  "timestamp" : 1351607182000000000,
  "category" : null,
  "kind" : 0,
  "data" : { "plugin_specific_data" : "go_here" }
}
· name: the name of the plugin. A unique string.
· version: the version of the plugin. A string.
· format_version: the version of the data format of the plugin. An incremental integer.
· timestamp: when the report was produced, in nanoseconds. Can be zero-padded.
16/53

Status reporting collectors: report
They introduce a mandatory part inside the data section (JSON):
"data" : {
  ...
  "status" : {
    "code" : <value>,
    "message" : "some summary goes here"
  }
}
· <value>, by increasing criticality level:
  - 0: working as intended
  - 1: temporarily wrong, being auto-repaired
  - 2: unknown, potentially dangerous state
  - 4: problems, external intervention required
17/53

How to use the daemon?
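As a sketch of what using the daemon looks like from a client's side, the snippet below builds query URLs for the endpoints listed on the next slide and validates the mandatory report fields shown above. The node name is illustrative, and addressing category-less collectors under "default" is an assumption, not a documented guarantee:

```python
import json

MOND_PORT = 1815  # default monitoring daemon port


def report_url(node, category=None, collector=None):
    """Build a monitoring-daemon query URL (API version 1)."""
    base = "http://%s:%d" % (node, MOND_PORT)
    if collector is None:
        return base + "/1/report/all"
    # Assumption: collectors without a category are addressed as "default".
    return "%s/1/report/%s/%s" % (base, category or "default", collector)


def check_report(report):
    """Check that a parsed report carries all mandatory fields."""
    for field in ("name", "version", "format_version", "timestamp",
                  "category", "kind", "data"):
        if field not in report:
            raise ValueError("missing field: %s" % field)
    return report["name"]


# The sample report from the slide above:
sample = json.loads("""
{ "name": "TheCollectorIdentifier", "version": "1.2",
  "format_version": 1, "timestamp": 1351607182000000000,
  "category": null, "kind": 0,
  "data": { "plugin_specific_data": "go_here" } }
""")

print(report_url("node.example.com"))
print(check_report(sample))
```

An actual fetch would be a plain unauthenticated HTTP GET of such a URL with any HTTP client, followed by `json.loads` on the body.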
· Accepts HTTP connections on node.example.com:1815
· Not authenticated: read only
  - Just firewall it, or bind it to a local address only
· GET requests to specific addresses
· Each address returns different info according to the API:
  / (returns the list of supported protocol versions)
  /1/list/collectors
  /1/report/all
  /1/report/[category]/[collector_name]
18/53

Configuration Daemon (confd)
What's your cluster supposed to look like?

Before confd
· Configuration only available on master candidates
· Few selected values replicated with ssconf
  - Small pieces of config in text files on all the nodes
  - Doesn't scale
· Need for a way to access the config from other nodes
  - Scalable
  - No single point of failure (so, no RAPI)
20/53

What does confd do?
· Provides information from config.data
· Read-only
· Distributed
  - Multiple daemons running on master candidates
  - Accessible from all the nodes through the confd protocol
· Resilient to failures
· Optional
21/53

What info does it provide?
Replies to simple queries:
· Ping
· Master IP
· Node role
· Node primary IP
· Master candidates primary IPs
· Instance IPs
· Node primary IP from instance primary IP
· Node DRBD minors
· Node instances
22/53

confd protocol - General description
· UDP (port 1814)
· Keyed-Hash Message Authentication Code (HMAC) authentication
  - Pre-shared, cluster-wide key
  - Generated at cluster init
  - Root-only readable
· Timestamp
  - Checked (± 2.5 mins) to prevent replay attacks
  - Used as HMAC salt
· Queries made to any subset of master candidates
  - Timeout
  - Maximum number of expected replies
23/53

confd protocol - Request/Reply
[diagram: one request is sent to several master candidates]
24/53

confd protocol - Request/Reply
[diagram: replies with config versions 57, 57, 57, 56 arrive until enough replies are in or the timeout fires]
25/53

confd protocol - Request (CONFD)
plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}
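A request packet like the one above can be built and checked with a short Python sketch using only the stdlib hmac module. The key, type value and query here are made up, and HMAC-SHA1 is assumed (it matches the 40-hex-digit signatures in the example); real clients should use the ready-made ones mentioned later:

```python
import hashlib
import hmac
import json
import time
import uuid


def sign(key, salt, msg):
    """HMAC-SHA1 of salt+msg with the cluster-wide confd key."""
    return hmac.new(key, (salt + msg).encode("utf-8"), hashlib.sha1).hexdigest()


def build_request(key, req_type, query):
    """Build a plj0 confd request packet (field layout as on the slide)."""
    msg = json.dumps({
        "type": req_type,            # one of the CONFD_REQ_* constants
        "rsalt": str(uuid.uuid4()),  # response salt, identifies the request
        "protocol": 1,
        "query": query,
    }) + "\n"
    salt = str(int(time.time()))     # timestamp, doubling as the HMAC salt
    body = {"msg": msg, "salt": salt, "hmac": sign(key, salt, msg)}
    return "plj0" + json.dumps(body)


def verify(key, packet, max_skew=150):
    """Check the fourcc, the signature, and the +/- 2.5 min replay window."""
    assert packet.startswith("plj0")
    body = json.loads(packet[4:])
    if sign(key, body["salt"], body["msg"]) != body["hmac"]:
        return False
    return abs(time.time() - int(body["salt"])) <= max_skew


key = b"cluster-wide-secret"  # made-up key for the example
pkt = build_request(key, 1, "node1.example.com")
print(verify(key, pkt))
```

The same `sign`/`verify` pair works for replies, except that the reply's salt is the request's rsalt rather than a timestamp.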
· plj0: fourcc detailing the message content (PLain Json 0)
· hmac: HMAC signature of salt+msg with the cluster HMAC key
26/53

confd protocol - Request (CONFD)
plj0{
  "msg": "{\"type\": 1, \"rsalt\": \"9aa6ce92-8336-11de-af38-001d093e835f\", \"protocol\": 1, \"query\": \"node1.example.com\"}\n",
  "salt": "1249637704",
  "hmac": "4a4139b2c3c5921f7e439469a0a45ad200aead0f"
}
· msg: JSON-encoded query
· protocol: confd protocol version (=1)
· type: what to ask for (CONFD_REQ_* constants)
· query: additional parameters
· rsalt: response salt == UUID identifying the request
27/53

confd protocol - Reply (CONFD)
plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}
· salt: the rsalt of the query
· hmac: HMAC signature of salt+msg
28/53

confd protocol - Reply (CONFD)
plj0{
  "msg": "{\"status\": 0, \"answer\": 0, \"serial\": 42, \"protocol\": 1}\n",
  "salt": "9aa6ce92-8336-11de-af38-001d093e835f",
  "hmac": "aaeccc0dff9328fdf7967cb600b6a80a6a9332af"
}
· msg: JSON-encoded answer
· protocol: protocol version (=1)
· status: 0=ok; 1=error
· answer: query-specific reply
· serial: version of config.data
29/53

Ready-made clients
The protocol is simple, but clients are simpler
· Ready-to-use confd clients
· Python
  - lib/confd/client.py
· Haskell
  - Since Ganeti 2.7
  - src/Ganeti/ConfD/Client.hs
  - src/Ganeti/ConfD/ClientFunctions.hs
30/53

Expanding confd capabilities
· Currently not so many queries are supported
· Easy to add new ones:
  - Just add a new query type in the constants list
  - ...and extend the buildResponse function (src/Ganeti/Confd/Server.hs) to reply to it in the appropriate way
31/53

Ganeti and Networks
How do your instances talk to the world?
· Some slides contributed by Dimitris Aragiorgis <[email protected]>

Current NICs: MAC + IP + link + mode
NIC configuration
· mode=bridged uses brctl addif
· Hooks can deal with firewall rules, and more
· External systems needed for DHCP, IPv6, etc.
Management
· Which VMs are on the same collision domain?
· Which IP is free for a new VM to use?
33/53

gnt-network overview
· manage collision domains for your instances
· easy way to assign IPs to instances
  - if resources are shared among multiple clusters, allocation must be done externally
· keep existing per-NIC flexibility
· hide the underlying infrastructure
· better networking overview
34/53

gnt-network: Who does what?
masterd: config.data integrity
· abstract network infrastructure: network + netparams per nodegroup
· IP uniqueness inside a network: IP pool management
  - bitarray, TemporaryReservationManager, locking
· encapsulate network information in NIC objects: RPC
external scripts and hooks: ping vm1.ganeti.example.com
· use the exported environment provided by noded
· brctl, iptables, ebtables, ip rule, etc.
· update external DHCP/DNS server entries
· let the VM act unaware of the "situation" (dhclient, etc.)
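The bitarray-based IP pool management that masterd performs can be illustrated with a toy sketch; this is not Ganeti's actual implementation, just the idea of tracking free host addresses of a network in a bit array and handing out either a requested address or the first free one (as `ip=pool` does):

```python
import ipaddress


class IPPool:
    """Toy sketch of per-network IP pool management with a bit array."""

    def __init__(self, network, prefix):
        self.net = ipaddress.ip_network("%s/%d" % (network, prefix))
        self.hosts = list(self.net.hosts())      # usable host addresses
        self.used = [False] * len(self.hosts)    # the "bitarray"

    def reserve(self, ip=None):
        """Reserve a specific IP, or the first free one (ip=None ~ ip=pool)."""
        if ip is not None:
            idx = self.hosts.index(ipaddress.ip_address(ip))
            if self.used[idx]:
                raise ValueError("address already reserved")
        else:
            idx = self.used.index(False)         # first free slot
        self.used[idx] = True
        return str(self.hosts[idx])


pool = IPPool("192.168.1.0", 24)
print(pool.reserve("192.168.1.1"))   # reserve the gateway explicitly
print(pool.reserve())                # first free address from the pool
```

In the real cluster the pool state lives in config.data, with a TemporaryReservationManager and locking guarding concurrent allocations.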
35/53

gnt-network + external scripts
· gnt-network alone is nothing more than a nice config.data
· snf-network: node-level scripts and hooks
· nfdhcpd: node-level DHCP server based on NFQUEUE
36/53

snf-network - node-level scripts and hooks
· overrides Ganeti default scripts (kvm-ifup, vif-ganeti)
· looks for specific tag types in the NIC's network
· applies the corresponding rules
· creates nfdhcpd binding files
· provides a hook to update DNS entries
37/53

nfdhcpd - node-level DHCP server based on NFQUEUE
· listens on a specific NFQUEUE
· updates its leases db
  - inotify on a specific directory for binding files
· mangles DHCP requests and replies based on its db
· responds to RS and NS for IPv6 auto-configuration
38/53

gnt-network Examples
Create and connect a new network:
  gnt-network add --network 192.168.1.0/24 --gateway 192.168.1.1 --tags nfdhcpd net1
  gnt-network connect net1 bridged prv0
Create an instance inside this network:
  gnt-instance add --net 0:ip=pool,network=net1 ... inst1
  gnt-instance info inst1
  gnt-network info net1
39/53

gnt-network + snf-* Examples
Use snf-network and nfdhcpd:
  apt-get install snf-network nfdhcpd
  iptables -t mangle -A PREROUTING -i prv+ -p udp -m udp --dport 67