iaas Documentation
Release 0.1.0
NorCAMS
Sep 14, 2021
CONTENTS
1 Contents
  1.1 Getting started
    1.1.1 For end users of NREC
    1.1.2 For NREC developers and operators
    1.1.3 For team and project status information
    1.1.4 Customers and local administrators
  1.2 Design
    1.2.1 Locations
    1.2.2 Physical hardware
    1.2.3 Networking overview
    1.2.4 Virtual machines
    1.2.5 Development hardware requirements
    1.2.6 Node overview
  1.3 Security
    1.3.1 [2021] System documentation
    1.3.2 [2021] Management
    1.3.3 [2021] Secure communication
    1.3.4 [2021] API endpoints
    1.3.5 [2021] Identity
    1.3.6 [2021] Dashboard (FIXME)
    1.3.7 [2021] Compute
    1.3.8 [2021] Block Storage
    1.3.9 [2021] Image Storage
    1.3.10 [2021] Shared File Systems
    1.3.11 [2019] Networking
    1.3.12 [2019] Object Storage
    1.3.13 Message queuing
    1.3.14 [2019] Data processing
    1.3.15 Databases
    1.3.16 Tenant data privacy
    1.3.17 [2019] Instance security management
  1.4 Howtos and guides
    1.4.1 Build docs locally using Sphinx
    1.4.2 Git in the real world
    1.4.3 Install KVM on CentOS 7 from minimal install
    1.4.4 Configure a Dell S55 FTOS switch from scratch
    1.4.5 Install cumulus linux on ONIE enabled Dell S4810
    1.4.6 Create Cumulus VX vagrant boxes for himlar dev
    1.4.7 Routed, virtual network interfaces for guest VMs on controllers
    1.4.8 Configure iDRAC-settings on Dell 13g servers with USB stick
    1.4.9 Using vncviewer to access the console
    1.4.10 Building puppet-agent for PPC-based Cumulus Linux
    1.4.11 How to create the designate-dashboard RPM package
  1.5 Team operations
    1.5.1 Getting started
    1.5.2 Development
    1.5.3 Operations
    1.5.4 Installation
    1.5.5 Tips and tricks
  1.6 Status
    1.6.1 Teamdeltakere og roller
    1.6.2 Vakt
    1.6.3 Navnekonvensjon
    1.6.4 Kontaktpunkter
    1.6.5 List historie
    1.6.6 Rapporter og referat
    1.6.7 Aktiviteter
    1.6.8 Arkiv
    1.6.9 Hardware overview
  1.7 Kundeinformasjon
    1.7.1 Priser og flavors
    1.7.2 Prosjektinformasjon
This documentation is intended for the team working on UH-IaaS and people involved in the project. End user documentation is found at http://docs.uh-iaas.no. In addition to information about development and operations, we also have a section in Norwegian about the current status.
1 Contents
1.1 Getting started
1.1.1 For end users of NREC
• Documentation at https://docs.nrec.no
• Status at https://status.nrec.no
1.1.2 For NREC developers and operators
• Team operations
1.1.3 For team and project status information
This will be in Norwegian only.

• Status
1.1.4 Customers and local administrators
This will be in Norwegian only.

• Kundeinformasjon
1.2 Design
High-level documents describing the IaaS platform design.
1.2.1 Locations
UH-IaaS is located in Bergen, as the OpenStack region BGO, and in Oslo, as the OpenStack region OSL. For the most part, the services in the two regions are logically identical.
1.2.2 Physical hardware
The following illustration shows the physical hardware in broad terms. The number of compute hosts and storage hosts is horizontally scalable and will vary from region to region.
The illustration shows these types of physical components:

Management switch
An Ethernet switch used for internal networking, i.e. non-routed RFC 1918 addresses. These are only used for management tasks.

Public switch
A switch that has access to the Internet. These switches also perform layer 3 routing, and are used to provide access to the public services in UH-IaaS.

Controller hosts
Servers that run virtual machines managed manually with libvirt (i.e. not managed by OpenStack). All OpenStack components, such as the dashboard and the API services, run as virtual machines on these hosts.

Compute hosts
Servers that are used as compute hosts in OpenStack. Customers' virtual machines run on these servers.

Storage hosts
Servers that are part of the Ceph storage cluster. They provide storage services to OpenStack (e.g. storage volumes).
1.2.3 Networking overview
Physical networking connections of each site:

BGO
OSL
1.2.4 Virtual machines
The illustration below shows the various virtual machines running on the controller hosts.
Some of the virtual machines have a purely administrative purpose, some provide internal infrastructure services, and some run OpenStack components. Some virtual machines are scaled out horizontally, typically one on each controller host; this mostly applies to the OpenStack services. This is done for efficiency and redundancy.
OpenStack components
These VMs are purely running OpenStack components.

image-01
Runs the OpenStack Image component, Glance.

dashboard-01
Runs the OpenStack Dashboard component, Horizon.

novactrl-[01..n]
Usually three VMs in a redundant setup; runs the controller part (e.g. API) of the OpenStack Compute component, Nova.

volume-01
Runs the OpenStack Volume component, Cinder.

telemetry-01
Runs the OpenStack metering component, Ceilometer.
network-[01..n]
Usually three VMs in a redundant setup; runs the OpenStack Network component, Neutron.

identity-01
Runs the OpenStack Identity component, Keystone.
Infrastructure services
These VMs run various infrastructure services that are used by the OpenStack components, by other infrastructure or administrative services, or both.

proxy-01
Proxy service for Internet access for the infrastructure nodes that are not on any public network.

ns-01
Authoritative DNS server. Available publicly as ns.
Administrative services
These VMs run on a separate controller host, because they need to be up and running during maintenance on the other VMs.

admin-01
Runs Foreman for e.g. provisioning tasks, and functions as Puppet master for all hosts.

monitor-01
Runs Sensu for monitoring tasks.

logger-01
Log receiver for all hosts.

builder-01
Runs our builder service, for building OpenStack images.
1.2.5 Development hardware requirements
A key point is that each location is built from the same hardware specification. This is done to simplify the setup and to limit the influence of external variables as much as possible while building the base platform. The spec represents a minimal baseline for one site/location.
Networking
4x layer 3 routers/switches
• Connected as a routed leaf-spine fabric (OSPF)
• At least 48 ports 10Gb SFP+ / 4 ports 40Gb QSFP
• Support for ONIE/OCP preferred

1x L2 management switch
• 48 ports 1GbE, VLAN capable
• Remote management possible

Cabling and optics
• 48x 10GBase-SR SFP+ transceivers
• 8x 40GBase-SR4 QSFP+ transceivers
• 4x QSFP+ to QSFP+ 40GbE passive copper direct attach cables, 0.5 meter
• 4x 3 or 5 meter QSFP+ to QSFP+ OM3 MTP fiber cables
Servers
3x management nodes
• 1U, 1x 12-core CPU, 128 GB RAM
• 2x 10Gb SFP+ and 2x 1GbE
• 2x SSD drives in RAID1
• Room for more disks
• Redundant PSUs

3x compute nodes
• 1U, 2x 12-core CPU, 512 GB RAM
• 2x 10Gb SFP+ and 2x 1GbE
• 2x SSD drives in RAID1
• Room for more disks
• Redundant PSUs

5x storage nodes
• 2U, 1x 12-core CPU, 128 GB RAM
• 2x 10Gb SFP+ and 2x 1GbE
• 8x 3.5" 2 TB SATA drives
• 4x 120 GB SSD drives
• No RAID, only JBOD
• Room for more disks (12x 3.5"?)
• Redundant PSUs

Comments
• Management and compute nodes could very well be the same chassis with different specs. Even higher density, such as half width, could be considered, but not blade chassis (it would mean non-standard cabling/connectivity).
• An important key attribute for the SSD drives is sequential write performance. SSDs might be PCIe connected.
• 2 TB disks for the storage nodes to speed up recovery times with Ceph.
1.2.6 Node overview
Warning: This page is OBSOLETE

node = a virtual machine running on a physical controller box with libvirt

This overview shows the different nodes, which networks the nodes have access to, and where the OpenStack and other services are running.
1.3 Security
Warning: This document is currently under review/construction.
This document is an attempt to write up all the security measures that can, will or should be implemented. The basis is the OpenStack Security Guide on openstack.org. We use the sections in the security guide, and try to answer the following questions:

1. Is this security measure implemented? And if not:
2. What is the potential security impact?
3. Other concerns?
4. Should this be implemented?

For each recommendation, there is at least one check that can have one of the following values:

• [PASS] This check has been passed
• [FAIL] This check has failed
• [----] This check has not been considered yet
• [N/A] This check is not applicable
• [DEFERRED] This check has been postponed and will be considered at a later time
1.3.1 [2021] System documentation
REVISION 2021-01-26
Contents
• [2021] System documentation
  – System Inventory
Impact: Low
Implemented: 75% (3/4)
System Inventory
From OpenStack Security Guide: System documentation: Documentation should provide a general description of the OpenStack environment and cover all systems used (production, development, test, etc.). Documenting system components, networks, services, and software often provides the bird’s-eye view needed to thoroughly cover and consider security concerns, attack vectors and possible security domain bridging points. A system inventory may need to capture ephemeral resources such as virtual machines or virtual disk volumes that would otherwise be persistent resources in a traditional IT system.
The UH-IaaS infrastructure is, from the hardware and up, managed completely by the UH-IaaS group, and is therefore independent of each institution. Except for networking interfaces and physical hardware management, there are no dependencies on the institutions.

Links to infrastructure documentation:

[PASS] Hardware inventory
A high-level view of the hardware inventory is outlined in the document Physical hardware.

[PASS] Software inventory
A high-level view of the software inventory is outlined in the document Virtual machines.

[PASS] Network topology
A high-level view of the network topology is outlined in the document Networking overview.

[DEFERRED] Services, protocols and ports
FIXME
1.3.2 [2021] Management
REVISION 2021-01-26
Contents
• [2021] Management
  – Continuous systems management
    * Vulnerability management
    * Configuration management
    * Secure backup and recovery
    * Security auditing tools
  – Integrity life-cycle
    * Secure bootstrapping
    * Runtime verification
    * Server hardening
  – Management interfaces
    * Dashboard
    * OpenStack API
    * Secure shell (SSH)
    * Management utilities
    * Out-of-band management interface
Impact: Medium
Implemented: 76% (13/17)
Continuous systems management
From OpenStack Security Guide: Management - Continuous systems management: A cloud will always have bugs. Some of these will be security problems. For this reason, it is critically important to be prepared to apply security updates and general software updates. This involves smart use of configuration management tools, which are discussed below. This also involves knowing when an upgrade is necessary.
Vulnerability management
Updates are announced on the OpenStack Announce mailing list. The security notifications are also posted through the downstream packages, for example through Linux distributions that you may be subscribed to as part of the package updates.

[PASS] Triage
When we are notified of a security update, it is discussed at the next morning meeting. We then assess the impact of the update on our environment, and take proper action.

[PASS] Testing the updates
We have test clouds in each location (currently OSL and BGO) which in most respects are identical to the production clouds. This allows for easy testing of updates.

[PASS] Deploying the updates
When testing is completed, the update is verified, and we are satisfied with any performance impact, stability, application impact etc., the update is deployed in production. This is done via the Patching policy.
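As a sketch of what this triage-and-deploy loop looks like on a single node (assuming CentOS repositories that ship updateinfo security metadata), pending errata can be listed and applied like this:

    # List pending updates and security errata on a node
    yum check-update
    yum updateinfo list security

    # Apply security updates only
    yum update --security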
Configuration management
Deployment of both physical and virtual nodes in NREC is done using Ansible playbooks, which are maintained on GitHub. Configuration management is completely automated via Puppet. The Puppet code and hieradata are maintained on GitHub. All changes are tracked via Git.

[PASS] Policy changes
Policy changes are tracked in Git and/or our Kanban board.
Secure backup and recovery
If we at some point decide to take backups of the infrastructure or instances, we should include the backup procedures and policies in the overall security plan.

[PASS] Backup procedure and policy
We do not take regular, incremental backups. Important data is replicated within the NREC infrastructure to mitigate information loss.
Security auditing tools
We should consider using SCAP or similar security auditing tools in combination with configuration management.

[FAIL] Security auditing tools
Security auditing tools such as SCAP add complexity and significant delays to the pipeline. Therefore, this is not a priority at this time.
Integrity life-cycle
From OpenStack Security Guide: Management - Integrity life-cycle: We define integrity life cycle as a deliberate process that provides assurance that we are always running the expected software with the expected configurations throughout the cloud. This process begins with secure bootstrapping and is maintained through configuration management and security monitoring.
Secure bootstrapping
The Security Guide recommends having an automated provisioning process for all nodes in the cloud. This includes compute, storage, network, service and hybrid nodes. The automated provisioning process also facilitates security patching, upgrades, bug fixes, and other critical changes. Software that runs with the highest privilege levels in the cloud needs special attention.

[PASS] Node provisioning
We use PXE for provisioning, which is recommended. We also use a separate, isolated network within the management security domain for provisioning. The provisioning process is handled by Ansible.

[FAIL] Verified boot
It is recommended to use secure boot via a TPM chip to boot the infrastructure nodes in the cloud. TPM adds unwanted complexity and we don't use it.

[PASS] Node hardening
We do general node hardening via a security baseline which we maintain via Puppet. The security baseline is based on best practice from the OS vendor, as well as our own experience. All nodes use Mandatory Access Control (MAC) via SELinux.
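A quick manual spot check of the SELinux part of the security baseline (the baseline itself is enforced via Puppet) could look like this:

    # Verify that SELinux is enforcing now and across reboots
    getenforce                              # should print "Enforcing"
    grep '^SELINUX=' /etc/selinux/config    # should print "SELINUX=enforcing"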
Runtime verification
From OpenStack Security Guide:

Once the node is running, we need to ensure that it remains in a good state over time. Broadly speaking, this includes both configuration management and security monitoring. The goals for each of these areas are different. By checking both, we achieve higher assurance that the system is operating as desired.

[FAIL] Intrusion detection system
We are not running an intrusion detection system (IDS).
Server hardening
This mostly concerns file integrity management.

[FAIL] File integrity management (FIM)
We should consider a FIM tool to ensure that files such as sensitive system or application configuration files are not corrupted or changed to allow unauthorized access or malicious behaviour.

• While we don't run a specific FIM tool, our configuration management system (Puppet) functions as a watchdog for the most important files.
Management interfaces
From OpenStack Security Guide: Management - Management interfaces:

It is necessary for administrators to perform command and control over the cloud for various operational functions. It is important these command and control facilities are understood and secured.

OpenStack provides several management interfaces for operators and tenants:

• OpenStack dashboard (horizon)
• OpenStack API
• Secure shell (SSH)
• OpenStack management utilities such as nova-manage and glance-manage
• Out-of-band management interfaces, such as IPMI
Dashboard
[PASS] Capabilities
The dashboard is configured via Puppet, and shows only capabilities that are known to work properly. Buttons, menu items etc. that don't work, or that provide capabilities NREC doesn't offer, are disabled in the dashboard.

[PASS] Security considerations
There are a few things that need to be considered (from OpenStack Security Guide: Management - Management interfaces):

• The dashboard requires cookies and JavaScript to be enabled in the web browser.
  – (FIXME FIXME FIXME) The cookies are only used for the dashboard and are not used for tracking the user's activities beyond NREC.
• The web server that hosts the dashboard should be configured for TLS to ensure data is encrypted.
  – (pass) TLS v1.2 or later is enforced.
• Both the horizon web service and the OpenStack API it uses to communicate with the back end are susceptible to web attack vectors such as denial of service and must be monitored.
  – (pass) We have monitoring in place.
• It is now possible (though there are numerous deployment/security implications) to upload an image file directly from a user's hard disk to OpenStack Image service through the dashboard. For multi-gigabyte images it is still strongly recommended that the upload be done using the glance CLI.
  – (pass) Image uploading is done directly to Glance via a redirect in the dashboard.
• Create and manage security groups through the dashboard. The security groups allow L3-L4 packet filtering for security policies to protect virtual machines.
  – (pass) The default security group blocks everything. Users can edit security groups through the dashboard.
OpenStack API
[PASS] Security considerations
There are a few things that need to be considered (from OpenStack Security Guide: Management - Management interfaces):

• The API service should be configured for TLS to ensure data is encrypted.
  – (pass) TLS v1.2 or later is enforced.
• As a web service, the OpenStack API is susceptible to familiar web site attack vectors such as denial of service attacks.
  – (pass) We have monitoring in place.
Secure shell (SSH)
[N/A] Host key fingerprints
Host key fingerprints should be stored in a secure and queryable location. One particularly convenient solution is DNS using SSHFP resource records as defined in RFC-4255. For this to be secure, it is necessary that DNSSEC be deployed.

• Host keys are wiped periodically to avoid conflicts and ensure that reinstalled hosts function correctly. SSH access is done through a single entry point and host keys are not important.
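For reference, should SSHFP records ever be revisited: they can be generated directly from a host's keys with ssh-keygen. The host name below is hypothetical.

    # Print SSHFP resource records for the local host keys, ready to be
    # pasted into the DNS zone (only secure when DNSSEC is deployed)
    ssh-keygen -r login.example.com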
Management utilities
[PASS] Security considerations
There are a few things that need to be considered (from OpenStack Security Guide: Management - Management interfaces):

• The dedicated management utilities (*-manage) in some cases use a direct database connection.
  – (pass) We don't use the dedicated management utilities unless strictly necessary.
• Ensure that the .rc file which has your credential information is secured.
  – (pass) Credential information is stored securely.
Out-of-band management interface
[PASS] Security considerations
There are a few things that need to be considered (from OpenStack Security Guide: Management - Management interfaces):

• Use strong passwords and safeguard them, or use client-side TLS authentication.
  – (pass) We have strong passwords that are stored securely.
• Ensure that the network interfaces are on their own private (management or separate) network. Segregate management domains with firewalls or other network gear.
  – (pass) OOB interfaces are on a private network.
• If you use a web interface to interact with the BMC/IPMI, always use the TLS interface, such as HTTPS or port 443. This TLS interface should NOT use self-signed certificates, as is often default, but should have trusted certificates using the correctly defined fully qualified domain names (FQDNs).
  – (n/a) OOB interfaces are on a closed network and a trusted CA is not necessary.
• Monitor the traffic on the management network. The anomalies might be easier to track than on the busier compute nodes.
  – (n/a) Not necessary due to closed network.
1.3.3 [2021] Secure communication
REVISION 2021-01-27
Contents
• [2021] Secure communication
  – Certification authorities
  – TLS libraries
  – Cryptographic algorithms, cipher modes, and protocols
Impact: High
Implemented: 83% (5/6)
From OpenStack Security Guide: Secure communication:

There are situations where there is a security requirement to assure the confidentiality or integrity of network traffic in an OpenStack deployment. This is generally achieved using cryptographic measures, such as the Transport Layer Security (TLS) protocol. In a typical deployment all traffic transmitted over public networks is secured, but security best practice dictates that internal traffic must also be secured. It is insufficient to rely on security domain separation for protection. If an attacker gains access to the hypervisor or host resources, compromises an API endpoint, or any other service, they must not be able to easily inject or capture messages, commands, or otherwise affect the management capabilities of the cloud.

All domains should be secured with TLS, including the management domain services and intra-service communications. TLS provides the mechanisms to ensure authentication, non-repudiation, confidentiality, and integrity of user communications to the OpenStack services and between the OpenStack services themselves. Due to the published vulnerabilities in the Secure Sockets Layer (SSL) protocols, we strongly recommend that TLS is used in preference to SSL, and that SSL is disabled in all cases, unless compatibility with obsolete browsers or libraries is required.

There are a number of services that need to be addressed:

• Compute API endpoints
• Identity API endpoints
• Networking API endpoints
• Storage API endpoints
• Messaging server
• Database server
• Dashboard
Certification authorities
The security guide recommends that we use separate PKI deployments for internal systems and public facing services. In the future, we may want to use separate PKI deployments for different security domains.

[PASS] Customer facing interfaces using trusted CA
All customer facing interfaces should be provisioned using Certificate Authorities that are installed in the operating system certificate bundles by default. It should just work without the customer having to accept an untrusted CA, or having to install some third-party software. We need certificates signed by a widely recognized public CA.

• We use the Digicert Terena CA on all customer facing interfaces.

[FAIL] Internal endpoints use non-public CA
As described above, it is recommended to use a private CA for internal endpoints.

• Database connections between regions use a non-public CA
• Internal connections within regions use private networks and are not secured via a CA (private or otherwise)
TLS libraries
From OpenStack Security Guide:

The TLS and HTTP services within OpenStack are typically implemented using OpenSSL which has a module that has been validated for FIPS 140-2.

We need to make sure that we're using an updated version of OpenSSL.

[PASS] Ensure updated OpenSSL
NREC is based on CentOS, and uses the OpenSSL library from that distro. We need to make sure that OpenSSL is up-to-date.

• OpenSSL and all other packages are manually updated once a month.
Cryptographic algorithms, cipher modes, and protocols
The security guide recommends using TLS 1.2, as previous versions are known to be vulnerable:

When you are using TLS 1.2 and control both the clients and the server, the cipher suite should be limited to ECDHE-ECDSA-AES256-GCM-SHA384. In circumstances where you do not control both endpoints and are using TLS 1.1 or 1.2 the more general HIGH:!aNULL:!eNULL:!DES:!3DES:!SSLv3:!TLSv1:!CAMELLIA is a reasonable cipher selection.

[PASS] Ensure TLS 1.2
Make sure that only TLS 1.2 is used. Previous versions of TLS, as well as SSL, should be disabled completely.

[PASS] Limit cipher suite on public endpoints
Limit the cipher suite on public facing endpoints to the general HIGH:!aNULL:!eNULL:!DES:!3DES:!SSLv3:!TLSv1:!CAMELLIA.

[N/A] Limit cipher suite on internal endpoints
Limit the cipher suite on internal endpoints to ECDHE-ECDSA-AES256-GCM-SHA384.

• We are not using an internal CA, so this doesn't apply in our case.
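Both checks can be verified from the outside with the openssl command line tool. This is a sketch; the endpoint host name below is illustrative, not an actual NREC endpoint.

    # A TLS 1.1 handshake should be rejected by the endpoint...
    openssl s_client -connect api.example.com:443 -tls1_1 < /dev/null

    # ...while TLS 1.2 restricted to the allowed cipher string should succeed
    openssl s_client -connect api.example.com:443 -tls1_2 \
        -cipher 'HIGH:!aNULL:!eNULL:!DES:!3DES:!SSLv3:!TLSv1:!CAMELLIA' < /dev/null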
1.3.4 [2021] API endpoints
REVISION 2021-01-27
Contents
• [2021] API endpoints
  – API endpoint configuration recommendations
    * Internal API communications
    * Paste and middleware
    * API endpoint process isolation and policy
    * API endpoint rate-limiting
Impact: High
Implemented: 85% (6/7)
From OpenStack Security Guide: API endpoints: The process of engaging an OpenStack cloud is started through the querying of an API endpoint. While there are different challenges for public and private endpoints, these are high value assets that can pose a significant risk if compromised.
API endpoint configuration recommendations
Internal API communications
From OpenStack Security Guide:

OpenStack provides both public facing and private API endpoints. By default, OpenStack components use the publicly defined endpoints. The recommendation is to configure these components to use the API endpoint within the proper security domain. Services select their respective API endpoints based on the OpenStack service catalog. These services might not obey the listed public or internal API end point values. This can lead to internal management traffic being routed to external API endpoints.

[PASS] Configure internal URLs in the Identity service catalog
The guide recommends that our Identity service catalog be aware of our internal URLs. This feature is not utilized by default, but may be leveraged through configuration. See API endpoint configuration recommendations for details.

• All services have configured admin, internal and public endpoints.

[PASS] Configure applications for internal URLs
It is recommended that each OpenStack service communicating with the API of another service is explicitly configured to access the proper internal API endpoint. See API endpoint configuration recommendations. All service-to-service communication uses internal endpoints within a region. This includes:

• volume to identity
• image to identity
• network to identity
• compute to identity
• compute to image
• compute to volume
• compute to network
Paste and middleware
From OpenStack Security Guide:

Most API endpoints and other HTTP services in OpenStack use the Python Paste Deploy library. From a security perspective, this library enables manipulation of the request filter pipeline through the application's configuration. Each element in this chain is referred to as middleware. Changing the order of filters in the pipeline or adding additional middleware might have unpredictable security impact.

[N/A] Document middleware
We should be careful when implementing non-standard software in the middleware, and this should be thoroughly documented.

• We are not using any non-standard middleware.
API endpoint process isolation and policy
From OpenStack Security Guide:

API endpoint processes, especially those that reside within the public security domain, should be isolated as much as possible. Where deployments allow, API endpoints should be deployed on separate hosts for increased isolation.

[N/A] Namespaces
Linux supports namespaces to assign processes into independent domains.

• All service backends run on different virtual hosts.

[PASS] Network policy
We should pay special attention to API endpoints, as they typically bridge multiple security domains. Policies should be in place and documented, and we can use firewalls, SELinux, etc. to enforce proper compartmentalization in the network layer.

• The API endpoints are protected via a load balancer and strict firewalls. SELinux is running in enforcing mode.

[PASS] Mandatory access controls
API endpoint processes should be as isolated from each other as possible. This should be enforced through Mandatory Access Controls (e.g. SELinux), not just Discretionary Access Controls.

• SELinux is running in enforcing mode on all nodes (virtual and physical) that are involved in API endpoints.
API endpoint rate-limiting
From OpenStack Security Guide:

Within OpenStack, it is recommended that all endpoints, especially public, are provided with an extra layer of protection, by means of either a rate-limiting proxy or web application firewall.

[DEFERRED] Rate-limiting on API endpoints
FIXME: Add rate-limiting to HAProxy
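A minimal sketch of what such rate-limiting could look like in HAProxy (1.8 or newer); the frontend name, bind line and threshold are assumptions, not our actual configuration:

    frontend api_public
        bind :443 ssl crt /etc/haproxy/certs/api.pem
        # Track per-source-IP HTTP request rate over a 10 second window
        stick-table type ip size 100k expire 30s store http_req_rate(10s)
        http-request track-sc0 src
        # Reject clients exceeding 100 requests per 10 seconds
        http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }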
1.3.5 [2021] Identity
REVISION 2021-01-28
Contents
• [2021] Identity
  – Authentication
    * Invalid login attempts
    * Multi-factor authentication
  – Authentication methods
  – Authorization
    * Establish formal access control policies
    * Service authorization
    * Administrative users
  – Policies
  – Checklist
Impact: High
Implemented: 95% (17/18)
From OpenStack Security Guide: Identity: Identity service (keystone) provides identity, token, catalog, and policy services for use specifically by services in the OpenStack family. Identity service is organized as a group of internal services exposed on one or many endpoints. Many of these services are used in a combined fashion by the frontend, for example an authenticate call will validate user/project credentials with the identity service and, upon success, create and return a token with the token service.
Authentication
Ref: OpenStack Security Guide: Identity - Authentication
Invalid login attempts
[PASS] Prevent or mitigate brute-force attacks
A pattern of repetitive failed login attempts is generally an indicator of brute-force attacks. This is important to us, as ours is a public cloud. We need to figure out if our user authentication service has the possibility to block an account after some configured number of failed login attempts. If not, describe policies around reviewing access control logs to identify and detect unauthorized attempts to access accounts.

• Users are automatically banned from logging in after a number of failed authentication requests.
Multi-factor authentication
[PASS] Multi-factor authentication for privileged accounts
We should employ multi-factor authentication for network access to privileged user accounts. This will provide insulation from brute force, social engineering, and both spear and mass phishing attacks that may compromise administrator passwords.

• While authentication to service accounts is possible from the "outside", administrative actions are not possible unless connecting from the "inside". In order to access the "inside", 2-factor authentication is required.
Authentication methods
Ref: OpenStack Security Guide: Identity - Authentication methods

[N/A] Document authentication policy requirements
We should document (or provide a link to external documentation for) the authentication policy requirements, such as password policy enforcement (password length, diversity, expiration etc.).

• Regular users are set up after authentication through Dataporten. Their passwords are auto-generated and random; the logic used is currently only documented in code (github:nocams-himlar-db-prep).
Authorization
Ref: OpenStack Security Guide: Identity - Authorization

The Identity service supports the notion of groups and roles. Users belong to groups while a group has a list of roles. OpenStack services reference the roles of the user attempting to access the service. The OpenStack policy enforcer middleware takes into consideration the policy rule associated with each resource then the user's group/roles and association to determine if access is allowed to the requested resource.
Establish formal access control policies
[PASS] Describe formal access control policies
The policies should include the conditions and processes for creating, deleting, disabling, and enabling accounts, and for assigning privileges to the accounts.

• Enabling accounts and granting account privileges (such as access to projects, flavors, and images) is done automatically using the self-service portal, or by the NREC administrators.

[PASS] Describe periodic review
We should periodically review the policies to ensure that the configuration is in compliance with approved policies.

• The policy is reviewed in this document. Compliance is reviewed often, during regular daily meetings.
Service authorization
[PASS] Don't use "tempAuth" file for service auth
Compute and Object Storage can be configured to use the Identity service to store authentication information. The "tempAuth" file method displays the password in plain text and should not be used.

• tempAuth is not used.

[DEFERRED] FIXME Use client authentication for TLS
The Identity service supports client authentication for TLS, which may be enabled. TLS client authentication provides an additional authentication factor, in addition to the user name and password, that provides greater reliability on user identification.
[PASS] Protect sensitive files
The cloud administrator should protect sensitive configuration files from unauthorized modification. This can be achieved with mandatory access control frameworks such as SELinux, including for /etc/keystone/keystone.conf and X.509 certificates.

• SELinux is running in enforcing mode.
Administrative users
We recommend that admin users authenticate using the Identity service and an external authentication service that supports 2-factor authentication, such as a certificate. This reduces the risk from passwords that may be compromised. This recommendation is in compliance with NIST 800-53 IA-2(1) guidance in the use of multi-factor authentication for network access to privileged accounts.

[PASS] Use 2-factor authentication for administrative access
Administrative access is provided via a login service that requires 2-factor authentication.
Policies
Ref: OpenStack Security Guide: Identity - Policies

[PASS] Describe policy configuration management
Each OpenStack service defines the access policies for its resources in an associated policy file. A resource, for example, could be API access, the ability to attach to a volume, or to fire up instances. The policy rules are specified in JSON format and the file is called policy.json. Ensure that any changes to the access control policies do not unintentionally weaken the security of any resource.

• We are using default policies, with overrides to disable certain capabilities.
Checklist
Ref: OpenStack Security Guide: Identity - Checklist

See the above link for info about these checks.

[PASS] Check-Identity-01: Is user/group ownership of config files set to keystone? Ownership is set to root:keystone or keystone:keystone.
[PASS] Check-Identity-02: Are strict permissions set for Identity configuration files? Not all files in the check list exist; the rest are OK.
[N/A] Check-Identity-03: Is TLS enabled for Identity? The endpoint runs on the load balancer.
[PASS] Check-Identity-04: Does Identity use strong hashing algorithms for PKI tokens? Yes, set to bcrypt.
[PASS] Check-Identity-05: Is max_request_body_size set to default (114688)? Yes.
[N/A] Check-Identity-06: Disable admin token in /etc/keystone/keystone.conf. Enabled in keystone.conf, but the service itself is disabled.
[PASS] Check-Identity-07: insecure_debug false in /etc/keystone/keystone.conf? Yes.
[PASS] Check-Identity-08: Use fernet token in /etc/keystone/keystone.conf? Yes.
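Check-Identity-01 and Check-Identity-02 can be spot-checked on an identity node in the same way as the other checklists in this document; the expected output shown is a sketch:

    # Verify ownership and permissions of the Identity configuration
    stat -L -c "%U %G %a" /etc/keystone/keystone.conf
    # expected: "keystone keystone 640" or "root keystone 640"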
1.3.6 [2021] Dashboard (FIXME)
REVISION 2021-02-05
Contents
• [2021] Dashboard (FIXME)
  – Domain names, dashboard upgrades, and basic web server configuration
    * Domain names
    * Basic web server configuration
    * Allowed hosts
    * Horizon image upload
  – HTTPS, HSTS, XSS, and SSRF
    * Cross Site Scripting (XSS)
    * Cross Site Request Forgery (CSRF)
    * Cross-Frame Scripting (XFS)
    * HTTPS
    * HTTP Strict Transport Security (HSTS)
  – Front-end caching and session back end
    * Front-end caching
    * Session back end
  – Static media
  – Secret key
  – Cookies
  – Cross Origin Resource Sharing (CORS)
  – Debug
  – Checklist
Impact: High
Implemented: 67% (23/34)
From OpenStack Security Guide: Dashboard: The Dashboard (horizon) is the OpenStack dashboard that provides users a self-service portal to provision their own resources within the limits set by administrators. These include provisioning users, defining instance flavors, uploading virtual machine (VM) images, managing networks, setting up security groups, starting instances, and accessing the instances through a console.
Domain names, dashboard upgrades, and basic web server configuration
Ref: OpenStack Security Guide: Dashboard - Domain names, dashboard upgrades, and basic web server configuration
Domain names
From OpenStack Security Guide:

We strongly recommend deploying dashboard to a second-level domain, such as https://example.com, rather than deploying dashboard on a shared subdomain of any level, for example https://openstack.example.org or https://horizon.openstack.example.org. We also advise against deploying to bare internal domains like https://horizon/. These recommendations are based on the limitations of browser same-origin-policy.

[FAIL] Use second-level domain
We are not given our own second-level domain. The dashboard is available as "dashboard.nrec.no".

[DEFERRED] Employ HTTP Strict Transport Security (HSTS)
If not using a second-level domain, we are advised to avoid a cookie-backed session store and employ HTTP Strict Transport Security (HSTS).

• We need to revisit this as soon as possible.
Basic web server configuration
From OpenStack Security Guide:

The dashboard should be deployed as a Web Services Gateway Interface (WSGI) application behind an HTTPS proxy such as Apache or nginx. If Apache is not already in use, we recommend nginx since it is lightweight and easier to configure correctly.

[PASS] Is dashboard deployed as a WSGI application behind an HTTPS proxy?
Yes, the dashboard is deployed using mod_wsgi on an Apache server.
Allowed hosts
From OpenStack Security Guide:

Configure the ALLOWED_HOSTS setting with the fully qualified host name(s) that are served by the OpenStack dashboard. Once this setting is provided, if the value in the "Host:" header of an incoming HTTP request does not match any of the values in this list an error will be raised and the requestor will not be able to proceed. Failing to configure this option, or the use of wild card characters in the specified host names, will cause the dashboard to be vulnerable to security breaches associated with fake HTTP Host headers.

[FAIL] Is ALLOWED_HOSTS configured for dashboard?
The NREC dashboard should be available on the Internet. As such, using ALLOWED_HOSTS would defeat the purpose of the dashboard.
Horizon image upload
It is recommended that we disable HORIZON_IMAGES_ALLOW_UPLOAD unless we have a plan to prevent resource exhaustion and denial of service.

[N/A] Is HORIZON_IMAGES_ALLOW_UPLOAD disabled?
Image upload works through the Glance API and not through the dashboard. The API has rate-limiting turned on.
HTTPS, HSTS, XSS, and SSRF
Ref: OpenStack Security Guide: Dashboard - HTTPS, HSTS, XSS, and SSRF
Cross Site Scripting (XSS)
From OpenStack Security Guide:

Unlike many similar systems, the OpenStack dashboard allows the entire Unicode character set in most fields. This means developers have less latitude to make escaping mistakes that open attack vectors for cross-site scripting (XSS).

[N/A] Audit custom dashboards
Audit any custom dashboards, paying particular attention to use of the mark_safe function, use of is_safe with custom template tags, the safe template tag, anywhere auto escape is turned off, and any JavaScript which might evaluate improperly escaped data.

• We are not using custom dashboards.
Cross Site Request Forgery (CSRF)
From OpenStack Security Guide:

Dashboards that utilize multiple instances of JavaScript should be audited for vulnerabilities such as inappropriate use of the @csrf_exempt decorator.

[N/A] Audit custom dashboards
We are not using custom dashboards.
Cross-Frame Scripting (XFS)
From OpenStack Security Guide:

Legacy browsers are still vulnerable to a Cross-Frame Scripting (XFS) vulnerability, so the OpenStack dashboard provides an option DISALLOW_IFRAME_EMBED that allows extra security hardening where iframes are not used in deployment.

[PASS] Disallow iframe embed
DISALLOW_IFRAME_EMBED is set.
HTTPS
From OpenStack Security Guide:

Deploy the dashboard behind a secure HTTPS server by using a valid, trusted certificate from a recognized certificate authority (CA).

[PASS] Use trusted certificate for dashboard
We are using a trusted CA.

[PASS] Redirect to fully qualified HTTPS URL
HTTP requests to the dashboard domain are configured to redirect to the fully qualified HTTPS URL.
HTTP Strict Transport Security (HSTS)
It is highly recommended to use HTTP Strict Transport Security (HSTS).

[DEFERRED] Use HSTS
FIXME: Revisit this ASAP
Front-end caching and session back end
Ref: OpenStack Security Guide: Dashboard - Front-end caching and session back end
Front-end caching
[PASS] Do not use front-end caching tools
We are not using front-end caching.
Session back end
It is recommended to use django.contrib.sessions.backends.cache as our session back end with memcache as the cache. This is as opposed to the default, which saves user data in signed, but unencrypted, cookies stored in the browser.

[PASS] Consider using caching back end
Memcache is used as the caching back end.
Static media
Ref: OpenStack Security Guide: Dashboard - Static media

The dashboard's static media should be deployed to a subdomain of the dashboard domain and served by the web server. The use of an external content delivery network (CDN) is also acceptable. This subdomain should not set cookies or serve user-provided content. The media should also be served with HTTPS.

[FAIL] Static media via subdomain
The amount of static media served from the NREC dashboard is next to nothing. We don't see any need to move this to a subdomain.

[N/A] Subdomain not serving cookies or user-provided content
Not using a subdomain.

[N/A] Subdomain via HTTPS
Not using a subdomain.
Secret key
Ref: OpenStack Security Guide: Dashboard - Secret key

The dashboard depends on a shared SECRET_KEY setting for some security functions. The secret key should be a randomly generated string at least 64 characters long, which must be shared across all active dashboard instances. Compromise of this key may allow a remote attacker to execute arbitrary code. Rotating this key invalidates existing user sessions and caching. Do not commit this key to public repositories.

[DEFERRED] Randomly generated string at least 64 characters long
Randomly generated, but much shorter than 64 chars. (FIXME - TODO)

[PASS] Not in public repo
We have internal stores for secret keys.
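A key of sufficient length can be generated with, for example, a Python one-liner; this is illustrative, and distribution to all dashboard instances goes through our internal secret store:

    # Generates a URL-safe random string of roughly 86 characters
    python3 -c 'import secrets; print(secrets.token_urlsafe(64))'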
Cookies
Ref: OpenStack Security Guide: Dashboard - Cookies

[PASS] Session cookies should be set to HTTPONLY
Configured in /etc/openstack-dashboard/local_settings:

OPENSTACK_SESSION_COOKIE_HTTPONLY = True

[PASS] Never configure CSRF or session cookies to have a wild card domain with a leading dot
Configured in /etc/openstack-dashboard/local_settings:

CSRF_COOKIE_SECURE = True

[PASS] Horizon's session and CSRF cookie should be secured when deployed with HTTPS
Configured in /etc/openstack-dashboard/local_settings:

SESSION_COOKIE_SECURE = True
Cross Origin Resource Sharing (CORS)
Ref: OpenStack Security Guide: Dashboard - Cross Origin Resource Sharing (CORS)

Configure your web server to send a restrictive CORS header with each response, allowing only the dashboard domain and protocol.

[DEFERRED] Restrictive CORS header
FIXME - TODO
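When this item is picked up, an Apache mod_headers directive along these lines would be the likely shape of the fix (a sketch; the origin must match the dashboard URL exactly):

    # Allow cross-origin requests only from the dashboard's own origin
    Header always set Access-Control-Allow-Origin "https://dashboard.nrec.no"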
Debug
It is recommended to set debug to false in production environments.

[PASS] Disable the debug flag
Configured in /etc/openstack-dashboard/local_settings:

DEBUG = False
Checklist
Ref: OpenStack Security Guide: Dashboard - Checklist

See the above link for info about these checks.

[FAIL] Check-Dashboard-01: Is user/group of config files set to root/horizon? The "horizon" group does not exist in our case; we're using the group "apache". The local_settings file has user/group "apache apache" (FIXME - TODO):

# ls -l /etc/openstack-dashboard/local_settings
-rw-r-----. 1 apache apache 32004 Dec  3 13:21 /etc/openstack-dashboard/local_settings
[PASS] Check-Dashboard-02: Are strict permissions set for horizon configuration files? The "horizon" group does not exist in our case; we're using the group "apache". The local_settings file has mode 0640:

# ls -l /etc/openstack-dashboard/local_settings
-rw-r-----. 1 apache apache 32004 Dec  3 13:21 /etc/openstack-dashboard/local_settings
[PASS] Check-Dashboard-03: Is DISALLOW_IFRAME_EMBED parameter set to True? Yes.
[PASS] Check-Dashboard-04: Is CSRF_COOKIE_SECURE parameter set to True? Yes.
[PASS] Check-Dashboard-05: Is SESSION_COOKIE_SECURE parameter set to True? Yes.
[PASS] Check-Dashboard-06: Is SESSION_COOKIE_HTTPONLY parameter set to True? Yes.
[PASS] Check-Dashboard-07: Is PASSWORD_AUTOCOMPLETE set to False? Yes.
[PASS] Check-Dashboard-08: Is DISABLE_PASSWORD_REVEAL set to True? Yes.
[PASS] Check-Dashboard-09: Is ENFORCE_PASSWORD_CHECK set to True? Yes.
[N/A] Check-Dashboard-10: Is PASSWORD_VALIDATOR configured? We use external authentication.
[FAIL] Check-Dashboard-11: Is SECURE_PROXY_SSL_HEADER configured? FIXME - TODO
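For Check-Dashboard-11, the usual fix is the standard Django setting below, assuming the TLS-terminating load balancer in front of the dashboard sets X-Forwarded-Proto; this is a sketch, not our current configuration:

    # Trust the load balancer's X-Forwarded-Proto header to detect HTTPS
    SECURE_PROXY_SSL_HEADER = ('HTTP_X_FORWARDED_PROTO', 'https')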
1.3.7 [2021] Compute
REVISION 2021-02-28
Contents
• [2021] Compute – Hypervisor selection – Hardening the virtualization layers
* Physical hardware (PCI passthrough) * Minimizing the QEMU code base * Compiler hardening * Mandatory access controls – How to select virtual consoles – Checklist
Impact: High
Implemented: 78% (7/9)
From OpenStack Security Guide: Compute: The OpenStack Compute service (nova) is one of the more complex OpenStack services. It runs in many locations throughout the cloud and interacts with a variety of internal services. The OpenStack Compute service offers a variety of configuration options which may be deployment specific. In this chapter we will call out general best practice around Compute security as well as specific known configurations that can lead to security issues. In general, the nova.conf file and the /var/lib/nova locations should be secured. Controls like centralized logging, the policy.json file, and a mandatory access control framework should be implemented. Additionally, there are environmental considerations to keep in mind, depending on what functionality is desired for your cloud.
Hypervisor selection
Ref: OpenStack Security Guide: Compute - Hypervisor selection

We are using KVM.
Hardening the virtualization layers
Ref: OpenStack Security Guide: Compute - Hardening the virtualization layers
Physical hardware (PCI passthrough)
Many hypervisors offer a functionality known as PCI passthrough. This allows an instance to have direct access to a piece of hardware on the node. For example, this could be used to allow instances to access video cards or GPUs offering the compute unified device architecture (CUDA) for high performance computation. This feature carries two types of security risks: direct memory access and hardware infection.

[N/A] Ensure that the hypervisor is configured to utilize IOMMU
Not applicable, as PCI passthrough is disabled.

[PASS] Disable PCI passthrough
PCI passthrough is disabled. We may enable PCI passthrough for special compute nodes with GPUs etc., but these will be confined in specialized availability zones and not generally available.
Minimizing the QEMU code base
Does not apply. We are using precompiled QEMU.
Compiler hardening
Does not apply. We are using precompiled QEMU.
Mandatory access controls
[PASS] Ensure SELinux / sVirt is running in Enforcing mode
SELinux is running in enforcing mode on all hypervisor nodes.
How to select virtual consoles
Ref: OpenStack Security Guide: Compute - How to select virtual consoles

[PASS] Is the VNC service encrypted?
Yes. Communication between the customer and the public facing VNC service is encrypted.
Checklist
Ref: OpenStack Security Guide: Compute - Checklist

See the above link for info about these checks.

[PASS] Check-Compute-01: Is user/group ownership of config files set to root/nova? Yes, except for /etc/nova which has "root root":

# stat -L -c "%U %G" /etc/nova/{,nova.conf,api-paste.ini,policy.json,rootwrap.conf}
root root
root nova
root nova
root nova
root nova
[PASS] Check-Compute-02: Are strict permissions set for configuration files? Yes:

# stat -L -c "%a" /etc/nova/{nova.conf,api-paste.ini,policy.json,rootwrap.conf}
640
640
640
640
[PASS] Check-Compute-03: Is keystone used for authentication? Yes.
[FAIL] Check-Compute-04: Is a secure protocol used for authentication? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
[FAIL] Check-Compute-05: Does Nova communicate with Glance securely? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
1.3.8 [2021] Block Storage
REVISION 2021-03-06
Contents
• [2021] Block Storage
  – NREC block storage description
  – Checklist
Impact: High
Implemented: 55% (5/9)
From OpenStack Security Guide: Block Storage: OpenStack Block Storage (cinder) is a service that provides software (services and libraries) to self- service manage persistent block-level storage devices. This creates on-demand access to Block Storage resources for use with OpenStack Compute (nova) instances. This creates software-defined storage via abstraction by virtualizing pools of block storage to a variety of back-end storage devices which can be either software implementations or traditional hardware storage products. The primary functions of this is to manage the creation, attaching and detaching of the block devices. The consumer requires no knowledge of the type of back-end storage equipment or where it is located.
NREC block storage description
We have deployed a Cinder backend based on Ceph, the clustered storage system. Every compute node is given read/write access to a pool where instance block volumes are stored. The connection is made with the Ceph RBD client.
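A minimal sketch of such a backend definition in cinder.conf; the backend, pool and user names below are illustrative, not our actual values:

    [DEFAULT]
    enabled_backends = rbd

    [rbd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_user = cinder
    rbd_secret_uuid = <libvirt secret UUID>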
Checklist
Ref: OpenStack Security Guide: Block Storage - Checklist See the above link for info about these checks. [PASS] Check-Block-01: Is user/group ownership of config files set to root/cinder? Yes, except for /etc/cinder which has “root root”:
# stat -L -c "%U %G" /etc/cinder/{,cinder.conf,api-paste.ini,policy.json,rootwrap.
˓→conf} root root root cinder root cinder stat: cannot stat ‘/etc/cinder/policy.json’: No such file or directory root cinder
[PASS] Check-Block-02: Are strict permissions set for configuration files? Yes:

# stat -L -c "%a" /etc/cinder/{cinder.conf,api-paste.ini,policy.json,rootwrap.conf}
640
640
stat: cannot stat '/etc/cinder/policy.json': No such file or directory
640
[N/A] Check-Block-03: Is keystone used for authentication? Deprecated as of the Stein release.
[FAIL] Check-Block-04: Is TLS enabled for authentication? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
[FAIL] Check-Block-05: Does cinder communicate with nova over TLS? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
[FAIL] Check-Block-06: Does cinder communicate with glance over TLS? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
[N/A] Check-Block-07: Is NAS operating in a secure environment? We do not have a NAS in our environment.
[PASS] Check-Block-08: Is max size for the body of a request set to default (114688)? Yes.
[FAIL] Check-Block-09: Is the volume encryption feature enabled? We do not offer encrypted volumes at this time.
1.3.9 [2021] Image Storage
REVISION 2021-05-12
Contents
• [2021] Image Storage
  – Checklist
Impact: Medium
Implemented: 80% (4/5)
From OpenStack Security Guide: Image Storage:

OpenStack Image Storage (glance) is a service where users can upload and discover data assets that are meant to be used with other services. This currently includes images and metadata definitions. Image services include discovering, registering, and retrieving virtual machine images. Glance has a RESTful API that allows querying of VM image metadata as well as retrieval of the actual image.
Checklist
Ref: OpenStack Security Guide: Image Storage - Checklist

See the above link for info about these checks.

[PASS] Check-Image-01: Is user/group ownership of config files set to root/glance? Yes, except for /etc/glance which has "root root":
# stat -L -c "%U %G" /etc/glance/{,glance-api-paste.ini,glance-api.conf,glance-cache.conf,glance-manage.conf,glance-registry-paste.ini,glance-registry.conf,glance-scrubber.conf,glance-swift-store.conf,policy.json,schema-image.json,schema.json}
root root
stat: cannot stat '/etc/glance/glance-api-paste.ini': No such file or directory
root glance
root glance
stat: cannot stat '/etc/glance/glance-manage.conf': No such file or directory
stat: cannot stat '/etc/glance/glance-registry-paste.ini': No such file or directory
root glance
root glance
stat: cannot stat '/etc/glance/glance-swift-store.conf': No such file or directory
root glance
root glance
stat: cannot stat '/etc/glance/schema.json': No such file or directory
[FAIL] Check-Image-02: Are strict permissions set for configuration files? All existing files have permissions 640, but the /etc/glance directory itself has mode 755:
# stat -L -c "%a" /etc/glance/{,glance-api-paste.ini,glance-api.conf,glance-cache.conf,glance-manage.conf,glance-registry-paste.ini,glance-registry.conf,glance-scrubber.conf,glance-swift-store.conf,policy.json,schema-image.json,schema.json}
755
stat: cannot stat '/etc/glance/glance-api-paste.ini': No such file or directory
640
640
stat: cannot stat '/etc/glance/glance-manage.conf': No such file or directory
stat: cannot stat '/etc/glance/glance-registry-paste.ini': No such file or directory
640
640
stat: cannot stat '/etc/glance/glance-swift-store.conf': No such file or directory
640
640
stat: cannot stat '/etc/glance/schema.json': No such file or directory
[N/A] Check-Image-03: Is keystone used for authentication? Deprecated as of the Stein release.
[FAIL] Check-Image-04: Is TLS enabled for authentication? Communication happens entirely on the inside on a private network, which we consider to be an acceptable risk.
[N/A] Check-Image-05: Are masked port scans prevented? The Glance v1 API is disabled.
1.3.10 [2021] Shared File Systems
REVISION 2021-03-06
Contents
• [2021] Shared File Systems
From OpenStack Security Guide: Shared File Systems:

The Shared File Systems service (manila) provides a set of services for management of shared file systems in a multi-tenant cloud environment, similar to how OpenStack provides for block-based storage management through the OpenStack Block Storage service project. With the Shared File Systems service, you can create a remote file system, mount the file system on your instances, and then read and write data from your instances to and from your file system.
Note: Does not apply. We are not using Manila.
1.3.11 [2019] Networking
REVISION 2019-03-14
Contents
• [2019] Networking – Networking services
* L2 isolation using VLANs and tunneling * Network services – Networking services security best practices
* OpenStack Networking service configuration – Securing OpenStack networking services
* Networking resource policy engine * Security groups * Quotas – Checklist
Impact High Implemented percent 85% (12/14)
From OpenStack Security Guide: Networking: OpenStack Networking enables the end-user or tenant to define, utilize, and consume networking resources. OpenStack Networking provides a tenant-facing API for defining network connectivity and IP addressing for instances in the cloud in addition to orchestrating the network configuration. With the transition to an API-centric networking service, cloud architects and administrators should take into consideration best practices to secure physical and virtual network infrastructure and services.
Networking services
Ref: OpenStack Security Guide: Networking - Networking services
L2 isolation using VLANs and tunneling
Does not apply. We’re using Calico, in which L2 isn’t employed at all.
Network services
[PASS] Use Neutron for security groups The calico neutron network plugin provides a rich security feature set. Calico uses neutron security groups and implements the rules with iptables on the compute hosts. Thus, security rulesets can be described down to instance level.
Networking services security best practices
Ref: OpenStack Security Guide: Networking - Networking services security best practices
[PASS] Document how Calico is used in UH-IaaS infrastructure We enable the calico plugin as the neutron core plugin system wide. Thus, no L2 connectivity is provided between instances, and, as a design feature, no project isolation on L3 connectivity. In other words, there is no such thing as a private network, even for RFC 1918 address spaces. This design relies on security groups to provide isolation and per-project security.
[N/A] Document which security domains have access to OpenStack network node As a consequence of our network design, no network nodes are deployed.
[N/A] Document which security domains have access to SDN services node We do not use SDN service nodes.
OpenStack Networking service configuration
[PASS] Restrict bind address of the API server: neutron-server The Neutron API server is bound to the internal network only.
Securing OpenStack networking services
Ref: OpenStack Security Guide: Networking - Securing OpenStack networking services
Networking resource policy engine
From OpenStack Security Guide: A policy engine and its configuration file, policy.json, within OpenStack Networking provides a method to provide finer grained authorization of users on tenant networking methods and objects. The OpenStack Networking policy definitions affect network availability, network security and overall OpenStack security.
[PASS] Evaluate network policy User creation of networks and virtual routers is prohibited by policy. Only administrator created networking resources are available for projects and users.
Security groups
``nova.conf`` should always disable built-in security groups and proxy all security group calls to the OpenStack Networking API when using OpenStack Networking.
[PASS] Set firewall_driver option in nova.conf firewall_driver is set to nova.virt.firewall.NoopFirewallDriver so that nova-compute does not perform iptables-based filtering itself.
[FAIL] Set security_group_api option in nova.conf It is recommended that security_group_api is set to neutron so that all security group requests are proxied to the OpenStack Networking service. We do not set the security_group_api option at all.
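As a hedged sketch (not a dump of our deployed config), the two settings discussed above would look roughly like this in nova.conf; in the releases this checklist targets, both live in the [DEFAULT] section:

[DEFAULT]
# hand all filtering over to Neutron; nova-compute does no iptables filtering itself
firewall_driver = nova.virt.firewall.NoopFirewallDriver
# proxy all security group requests to the Networking service
security_group_api = neutron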
Quotas
[N/A] Document choices wrt. networking quotas As users can not create networking resources, no quotas apply.
Checklist
Ref: OpenStack Security Guide: Networking - Checklist See the above link for info about these checks.
[PASS] Check-Neutron-01: Is user/group ownership of config files set to root/neutron? Yes
[PASS] Check-Neutron-02: Are strict permissions set for configuration files? Yes
[PASS] Check-Neutron-03: Is keystone used for authentication? Yes
[PASS] Check-Neutron-04: Is secure protocol used for authentication? Yes
[FAIL] Check-Neutron-05: Is TLS enabled on Neutron API server? The negative implications for the user experience of implementing this are considered to outweigh the extra security gained.
1.3.12 [2019] Object Storage
REVISION 2019-03-14
Contents
• [2019] Object Storage
From OpenStack Security Guide: Object Storage: OpenStack Object Storage (swift) is a service that provides software that stores and retrieves data over HTTP. Objects (blobs of data) are stored in an organizational hierarchy that offers anonymous read-only access, ACL defined access, or even temporary access. Object Store supports multiple token-based authentication mechanisms implemented via middleware.
Note: Does not apply. We are not using Swift.
1.3.13 Message queuing
Last changed: 2021-09-14
Contents
• Message queuing – Messaging security
* Messaging transport security * Queue authentication and access control * Message queue process isolation and policy
Impact High Implemented percent 0% (0/8)
From OpenStack Security Guide: Message queuing: Message queues effectively facilitate command and control functions across OpenStack deployments. Once access to the queue is permitted no further authorization checks are performed. Services accessible through the queue do validate the contexts and tokens within the actual message payload. However, you must note the expiration date of the token because tokens are potentially re-playable and can authorize other services in the infrastructure. OpenStack does not support message-level confidence, such as message signing. Consequently, you must secure and authenticate the message transport itself. For high-availability (HA) configurations, you must perform queue-to-queue authentication and encryption.
Note: We are using RabbitMQ as message queuing service back end.
Messaging security
Ref: OpenStack Security Guide: Message queuing - Messaging security
Messaging transport security
From OpenStack Security Guide: We highly recommend enabling transport-level cryptography for your message queue. Using TLS for the messaging client connections provides protection of the communications from tampering and eavesdropping in-transit to the messaging server.
[DEFERRED] Ensure TLS is used for RabbitMQ • TLS is NOT used for the messaging service. Should be considered.
[DEFERRED] Use an internally managed CA • No CA as TLS is not used
[DEFERRED] Ensure restricted file permissions on certificate and key files
• No CA as TLS is not used
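Should TLS for RabbitMQ be implemented later, a minimal server-side sketch in the ini-style rabbitmq.conf format (RabbitMQ 3.7+) could look like the following; the certificate paths are placeholders, not our actual layout:

# AMQP over TLS listener
listeners.ssl.default = 5671
ssl_options.cacertfile = /etc/rabbitmq/ca-chain.cert.pem
ssl_options.certfile = /etc/rabbitmq/server.cert.pem
ssl_options.keyfile = /etc/rabbitmq/server.key.pem
# require and verify client certificates
ssl_options.verify = verify_peer
ssl_options.fail_if_no_peer_cert = true

The OpenStack services would then need TLS enabled on their side as well, e.g. the ssl option (formerly rabbit_use_ssl) in the [oslo_messaging_rabbit] section, and the CA chain distributed to all nodes.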
Queue authentication and access control
From OpenStack Security Guide: We recommend configuring X.509 client certificates on all the OpenStack service nodes for client connections to the messaging queue and where possible (currently only Qpid) perform authentication with X.509 client certificates. When using user names and passwords, accounts should be created per-service and node for finer grained auditability of access to the queue.
[DEFERRED] Configure X.509 client certificates on all OpenStack service nodes • Currently no TLS/user certificates set up
[DEFERRED] Any user names and passwords are per-service and node • Currently common password. ?????
Message queue process isolation and policy
[----] Use network namespaces Network namespaces are highly recommended for all services running on OpenStack Compute Hypervisors. This will help prevent against the bridging of network traffic between VM guests and the management network. • FIXME: Ensure and document
[DEFERRED] Ensure queue servers only accept connections from management network FIXME: Ensure and document
[DEFERRED] Use mandatory access controls FIXME: SELinux in enforcing mode on all nodes
1.3.14 [2019] Data processing
REVISION 2019-03-14
Contents
• [2019] Data processing
From OpenStack Security Guide: Data processing: The Data processing service for OpenStack (sahara) provides a platform for the provisioning and management of instance clusters using processing frameworks such as Hadoop and Spark. Through the OpenStack dashboard or REST API, users will be able to upload and execute framework applications which may access data in object storage or external providers. The data processing controller uses the Orchestration service to create clusters of instances which may exist as long-running groups that can grow and shrink as requested, or as transient groups created for a single workload.
Note: Does not apply. We are not using Sahara.
1.3.15 Databases
Last changed: 2021-09-14
Contents
• Databases – Database back end considerations – Database access control
* Database authentication and access control * Require user accounts to require SSL transport * Authentication with X.509 certificates * Nova-conductor – Database transport security
* Database server IP address binding * Database transport
Impact High Implemented percent 44% (4/9)
From OpenStack Security Guide: Databases: The choice of database server is an important consideration in the security of an OpenStack deployment. Multiple factors should be considered when deciding on a database server, however for the scope of this book only security considerations will be discussed. OpenStack supports a variety of database types (see OpenStack Cloud Administrator Guide for more information). The Security Guide currently focuses on PostgreSQL and MySQL.
Note: We are using MariaDB 10.1 with packages directly from upstream repo.
Database back end considerations
Ref: OpenStack Security Guide: Databases - Database back end considerations
[DEFERRED] Evaluate existing MySQL security guidance See link above for details. • FIXME: Evaluate and document
Database access control
Ref: OpenStack Security Guide: Databases - Database access control
Database authentication and access control
From OpenStack Security Guide: Given the risks around access to the database, we strongly recommend that unique database user accounts be created per node needing access to the database.
[PASS] Unique database user accounts per node Each service runs on a different host, and each host has a unique user.
[PASS] Separate database administrator account The root user is only used to provision new databases and users.
[DEFERRED] Database administrator account is protected FIXME: Document this
Require user accounts to require SSL transport
[DEFERRED] The database user accounts are configured to require TLS All databases support TLS, but only DB replication between locations requires TLS.
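A sketch of what enforcing this per account could look like in MariaDB; the account name and host pattern are made up for illustration:

-- require TLS for all connections by this account
GRANT USAGE ON *.* TO 'nova'@'10.0.0.%' REQUIRE SSL;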
Authentication with X.509 certificates
[DEFERRED] The database user accounts are configured to require X.509 certificates FIXME: Document this
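Requiring a client certificate would be the REQUIRE X509 variant of the same statement, again with a hypothetical account:

-- additionally require a valid X.509 client certificate
GRANT USAGE ON *.* TO 'nova'@'10.0.0.%' REQUIRE X509;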
Nova-conductor
[PASS] Consider turning off nova-conductor OpenStack Compute offers a sub-service called nova-conductor which proxies database connections over RPC. We use nova-conductor, and nova-compute has access to it over the message bus. The RPC message bus is not encrypted, but runs on a private network. This is an acceptable risk.
Database transport security
Ref: OpenStack Security Guide: Databases - Database transport security
Database server IP address binding
[PASS] Database access only over an isolated management network Database replication is done over the public network, with TLS and a firewall to restrict access.
Database transport
[DEFERRED] The database requires TLS All databases support TLS transport, but only DB replication between locations requires TLS.
1.3.16 Tenant data privacy
Last changed: 2021-09-14
Contents
• Tenant data privacy – Data privacy concerns
* Data residency * Data disposal · Data not securely erased · Instance memory scrubbing · Cinder volume data · Image service delay delete feature · Compute soft delete feature · Compute instance ephemeral storage – Data encryption
* Volume encryption * Ephemeral disk encryption * Block Storage volumes and instance ephemeral filesystems * Network data – Key management
Impact High Implemented percent 0% (0/?)
From OpenStack Security Guide: Tenant data privacy: OpenStack is designed to support multitenancy and those tenants will most probably have different data requirements. As a cloud builder and operator you need to ensure your OpenStack environment can address various data privacy concerns and regulations.
Data privacy concerns
Ref: OpenStack Security Guide: Tenant data privacy - Data privacy concerns
Data residency
From OpenStack Security Guide: Numerous OpenStack services maintain data and metadata belonging to tenants or reference tenant information. Tenant data stored in an OpenStack cloud may include the following items:
• Object Storage objects
• Compute instance ephemeral filesystem storage
• Compute instance memory
• Block Storage volume data
• Public keys for Compute access
• Virtual machine images in the Image service
• Machine snapshots
• Data passed to OpenStack Compute’s configuration-drive extension
Metadata stored by an OpenStack cloud includes the following non-exhaustive items:
• Organization name
• User’s “Real Name”
• Number or size of running instances, buckets, objects, volumes, and other quota-related items
• Number of hours running instances or storing data
• IP addresses of users
• Internally generated private keys for compute image bundling
Data disposal
From OpenStack Security Guide: OpenStack operators should strive to provide a certain level of tenant data disposal assurance. Best practices suggest that the operator sanitize cloud system media (digital and non-digital) prior to disposal, release out of organization control or release for reuse. Sanitization methods should implement an appropriate level of strength and integrity given the specific security domain and sensitivity of the information. The security guide states that the cloud operators should do the following:
[DEFERRED] Track, document and verify media sanitization and disposal actions • OSL: Media are shredded before being disposed • BGO: unknown
[DEFERRED] Test sanitation equipment and procedures to verify proper performance • OSL: Equipment has been properly tested • BGO: unknown
[PASS] Sanitize portable, removable storage devices prior to connecting such devices to the cloud infrastructure • Portable, removable media are never connected to the cloud infrastructure
[DEFERRED] Destroy cloud system media that cannot be sanitized • OSL: Media are destroyed using a shredder • BGO: unknown
Data not securely erased
Regarding erasure of metadata, the security guide suggests using database and/or system configuration for auto vacuuming and periodic free-space wiping.
[DEFERRED] Periodic database vacuuming Not implemented at this time. We will revisit this at a later time.
[FAIL] Periodic free-space wiping of ephemeral storage We’re not doing this, as we consider this to be an acceptable risk.
Instance memory scrubbing
As we’re using KVM, which relies on Linux page management, we need to consult the KVM documentation about memory scrubbing.
[----] Consider automatic/periodic memory scrubbing FIXME: Consult KVM doc, consider if this is needed and document
Cinder volume data
From OpenStack Security Guide: Use of the OpenStack volume encryption feature is highly encouraged. This is discussed in the Data Encryption section below. When this feature is used, destruction of data is accomplished by securely deleting the encryption key.
[DEFERRED] Consider volume encryption Nice to have, but adds complexity. We will revisit this.
[FAIL] Secure erasure of volume data We’re not doing this, as we consider this to be an acceptable risk.
Image service delay delete feature
From OpenStack Security Guide: OpenStack Image service has a delayed delete feature, which will pend the deletion of an image for a defined time period. It is recommended to disable this feature if it is a security concern.
[PASS] Consider disabling delayed delete Considered; we don’t think this is a security concern.
Compute soft delete feature
From OpenStack Security Guide: OpenStack Compute has a soft-delete feature, which enables an instance that is deleted to be in a soft-delete state for a defined time period. The instance can be restored during this time period.
[PASS] Consider disabling compute soft delete Considered; we don’t think this is a security concern.
Compute instance ephemeral storage
From OpenStack Security Guide: The creation and destruction of ephemeral storage will be somewhat dependent on the chosen hypervisor and the OpenStack Compute plug-in.
[DEFERRED] Document ephemeral storage deletion FIXME: Document how this works in our environment
Data encryption
From OpenStack Security Guide: Tenant data privacy - Data encryption: The option exists for implementers to encrypt tenant data wherever it is stored on disk or transported over a network, such as the OpenStack volume encryption feature described below. This is above and beyond the general recommendation that users encrypt their own data before sending it to their provider.
Volume encryption
[DEFERRED] Consider volume encryption Postponed.
Ephemeral disk encryption
[PASS] Consider ephemeral disk encryption Considered.
Block Storage volumes and instance ephemeral filesystems
[DEFERRED] Consider which options we have available FIXME: Document
[PASS] Consider adding encryption Considered.
Network data
[PASS] Consider encrypting tenant data over IPsec or other tunnels Considered. Not a security concern in our case.
Key management
From OpenStack Security Guide: Tenant data privacy - Key management: The volume encryption and ephemeral disk encryption features rely on a key management service (for example, barbican) for the creation and secure storage of keys. The key manager is pluggable to facilitate deployments that need a third-party Hardware Security Module (HSM) or the use of the Key Management Interchange Protocol (KMIP), which is supported by an open-source project called PyKMIP. [DEFERRED] Consider adding Barbican FIXME: Consider and document
1.3.17 [2019] Instance security management
REVISION 2019-03-14
Contents
• [2019] Instance security management – Security services for instances
* Entropy to instances * Scheduling instances to nodes * Trusted images * Instance migrations * Monitoring, alerting, and reporting
Impact High Implemented percent 67% (4/6)
From OpenStack Security Guide: Instance security management: One of the virtues of running instances in a virtualized environment is that it opens up new opportunities for security controls that are not typically available when deploying onto bare metal. There are several technologies that can be applied to the virtualization stack that bring improved information assurance for cloud tenants. Deployers or users of OpenStack with strong security requirements may want to consider deploying these technologies. Not all are applicable in every situation, indeed in some cases technologies may be ruled out for use in a cloud because of prescriptive business requirements. Similarly some technologies inspect instance data such as run state which may be undesirable to the users of the system.
Security services for instances
Ref: OpenStack Security Guide: Instance security management - Security services for instances
Entropy to instances
From OpenStack Security Guide: The Virtio RNG is a random number generator that uses /dev/random as the source of entropy by default, however can be configured to use a hardware RNG or a tool such as the entropy gathering daemon (EGD) to provide a way to fairly and securely distribute entropy through a distributed system.
[PASS] Consider adding hardware random number generators (HRNG) We do not consider HRNG necessary for a deployment of this scale. This may be revisited in the future.
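For reference, should we later decide to expose a virtio RNG to guests, the libvirt device definition is small; a sketch, assuming the host's /dev/urandom as the backend:

<!-- virtio RNG device fed from the host -->
<rng model='virtio'>
  <backend model='random'>/dev/urandom</backend>
</rng>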
Scheduling instances to nodes
From OpenStack Security Guide: Before an instance is created, a host for the image instantiation must be selected. This selection is performed by the nova-scheduler which determines how to dispatch compute and volume requests.
[PASS] Describe which scheduler and filters are used For normal workloads, we use the default nova scheduling filters, and all compute hosts are considered equal in features and performance. For specialized resources such as HPC workloads we have different filters.
Trusted images
From OpenStack Security Guide: In a cloud environment, users work with either pre-installed images or images they upload themselves. In both cases, users should be able to ensure the image they are utilizing has not been tampered with. [PASS] Maintain golden images We provide updated upstream cloud images for popular linux distributions, as well as the latest Windows Server versions. [FAIL] Enable instance signature verification This is not something that we will prioritize at this time. It also requires the setup and management of additional services.
Instance migrations
[FAIL] Disable live migration While live migration has its risks, the benefits of live migration far outweigh the disadvantages. We have live migration enabled.
Monitoring, alerting, and reporting
[PASS] Aggregate logs, e.g. to ELK Compute host logs are sent to an ELK stack.
1.4 Howtos and guides
This is a collection of howtos and documentation bits with relevance to the project.
1.4.1 Build docs locally using Sphinx
This describes how to build the documentation from norcams/iaas locally
RHEL, CentOS, Fedora
You’ll need the python-virtualenvwrapper package from EPEL:

sudo yum -y install python-virtualenvwrapper
# Restart shell
exit
Ubuntu (trusty)

sudo apt-get -y install virtualenvwrapper make
# Restart shell
exit
Build docs
# Make a virtual Python environment
# This env is placed in .virtualenv in $HOME
mkvirtualenv docs

# activate the docs virtualenv
workon docs
# install sphinx into it
pip install sphinx sphinx_rtd_theme

# Compile docs
cd iaas/docs
make html

# Open in modern internet browser of choice
xdg-open _build/html/index.html

# Deactivate the virtualenv
deactivate
1.4.2 Git in the real world
Fix and restore a “messy” branch http://push.cwcon.org/learn/stay-updated#oops_i_was_messing_around_on_
1.4.3 Install KVM on CentOS 7 from minimal install
See http://mwiki.yyovkov.net/index.php/Linux_KVM_on_CentOS_7
1.4.4 Configure a Dell S55 FTOS switch from scratch
This describes how to configure a Dell PowerConnect S55 switch from scratch as a management switch for our iaas.
Initial config
You will need a laptop with a serial console cable. Connect the cable to the RS-232 port on the front of the switch. Open a console to ttyUSBx using screen, tmux, putty or other usable software. Then power on the switch. After the switch has booted, you can enter the enable state:
> enable
The switch will default to jumpstart mode, trying to get a config from a central repository. We will disable it by typing:
# reload-type normal
Now we need to provide an IP address, create a user with a password and set the enable password in order to provide ssh access:
# configure
(conf)# interface managementethernet 0/0
(conf-if-ma-0/0)# ip address 10.0.0.2 /32
(conf-if-ma-0/0)# no shutdown
(conf-if-ma-0/0)# exit
(conf)# management route 0.0.0.0 /0 10.0.0.1
(conf)# username mylocaluser password 0 mysecretpassword
(conf)# enable password 0 myverysecret
(conf)# exit
# write
# copy running-config startup-config
Now you can ssh to the switch using your new user from a computer with access to the switch’s management network.
Configure the switch itself
Let’s configure the rest! We start by shutting down all ports:
> enable
# configure
(conf)# interface range gigabitethernet 0/0-47
(conf-if-range-gi-0/0-47)# switchport
(conf-if-range-gi-0/0-47)# shutdown
(conf-if-range-gi-0/0-47)# exit
If you want to use a port channel (with LACP) for redundant uplink to core you can create one. If you don’t, omit all references to it later in the document:
(conf)# interface port-channel 1
(conf-if-po-1)# switchport
(conf-if-po-1)# no shutdown
(conf-if-po-1)# exit
Assign interfaces to the port channel group:
(conf)# interface range gigabitethernet 0/42-43
(conf-if-range-gi-0/42-43)# no switchport
(conf-if-range-gi-0/42-43)# port-channel-protocol LACP
(conf-if-range-gi-0/42-43)# port-channel 1 mode active
(conf-if-range-gi-0/42-43)# no shutdown
(conf-if-range-gi-0/42-43)# exit
Define in-band and out-of-band VLANs:
(conf)# interface vlan 201
(conf-if-vl-201)# description "iaas in-band mgmt"
(conf-if-vl-201)# no ip address
(conf-if-vl-201)# untagged GigabitEthernet 0/22-33,38-41
(conf-if-vl-201)# tagged Port-channel 1
(conf-if-vl-201)# exit
(conf)# interface vlan 202
(conf-if-vl-202)# description "iaas out-of-band mgmt"
(conf-if-vl-202)# no ip address
(conf-if-vl-202)# untagged GigabitEthernet 0/0-10
(conf-if-vl-202)# tagged Port-channel 1
(conf-if-vl-202)# exit
(conf)# exit
Congratulations! Save the config and happy server provisioning:
# write
# copy running-config startup-config
1.4.5 Install cumulus linux on ONIE enabled Dell S4810
The project will be using Dell PowerConnect S4810 switches with ONIE installer enabled by default instead of FTOS. This enables easy installation of cumulus linux to the switches.
Configure dhcpd and http server
You will need a running http server with a copy of the cumulus image:
# ls /var/www/html
CumulusLinux-2.5.0-powerpc.bin  onie-installer-powerpc
“onie-installer-powerpc” is a symlink to the bin-file. The symlink is used by ONIE to identify an image to download. Read here about the order ONIE tries to download the install file: http://opencomputeproject.github.io/onie/docs/user-guide/
Now, for the dhcp server to serve out an IP address and URL for ONIE to download from, dhcp option 114 (URL) is used. This example utilizes ISC dhcpd:

option default-url="http://192.168.0.1/onie-installer-powerpc";
This option can be host, group, subnet or system wide. Read more about different dhcp servers and other methods here:
https://support.cumulusnetworks.com/hc/en-us/articles/203771426-Using-ONIE-to-Install-Cumulus-Linux
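A hypothetical host-scoped dhcpd.conf example (the MAC address and fixed address are placeholders); if the dhcp server does not already know option 114, it must be declared first:

# declare option 114 once, globally
option default-url code 114 = text;

host s4810-01 {
  hardware ethernet 00:11:22:33:44:55;
  fixed-address 192.168.0.42;
  option default-url "http://192.168.0.1/onie-installer-powerpc";
}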
When you power up the switch, it will by default be a dhcp client and accept an offered IP address, after which you can ssh to the ONIE installer as user root without a password. However, if option 114 is specified, it will download the image and immediately install it, and then reboot the switch. When the installation is complete, you can ssh to the switch using the default cumulus login.
1.4.6 Create Cumulus VX vagrant boxes for himlar dev
This describes how to create (or update) the norcams/net vagrant box which is based on the Cumulus VX test appliance.
Requirements
• An account with access to the norcams organisation on the Hashicorp Atlas system at https://atlas.hashicorp.com/norcams
• An account on cumulusnetworks.com to download the vagrant appliance from https://cumulusnetworks.com/cumulus-vx/download/
• A current vagrant installation with virtualbox and libvirt providers working
Prepare virtualbox and libvirt box files
Install the vagrant-mutate plugin:

vagrant plugin install vagrant-mutate
Download and rename the cumulus vagrant box, then add and convert it:

mv Downloads/CumulusVX*virtualbox.box /path/to/norcams-net-2.5.6-virtualbox.box
vagrant box add norcams/net /path/to/norcams-net-2.5.6-virtualbox.box
vagrant mutate norcams/net libvirt
Verify that the box is available for both providers:

vagrant box list
Repackage the libvirt box (this command takes a while to complete):

vagrant box repackage norcams/net libvirt 0
mv package.box norcams-net-2.5.6-libvirt.box
You should now have two box files, one for libvirt and one for virtualbox:

ls *.box
norcams-net-2.5.6-libvirt.box  norcams-net-2.5.6-virtualbox.box
Publish to Atlas
In order for vagrant autoupdate to work we need to publish both these files on a webserver somewhere and point to their locations from a provider and version configuration on Atlas.
• Publish both box files somewhere where they can be downloaded from a public URL.
• Log in at https://atlas.hashicorp.com/norcams
• Find the norcams/net box at https://atlas.hashicorp.com/norcams/boxes/net
• Add a new version, if needed
• Create providers for “virtualbox” and “libvirt”. The URL should point at the location of the respective box file, e.g. http://somewhere/files/norcams-net-2.5.6-virtualbox.box
1.4.7 Routed, virtual network interfaces for guest VMs on controllers
This describes how to set up a routed network interface for a guest VM running on a controller host. This is an adaptation of the general calico way of setting up networks from neutron data and some information from https://jamielinux.com/docs/libvirt-networking-handbook/custom-routed-network.html
Requirements
• BIRD running on the controller host pointed at one or more route reflector instances and a bird.conf similar to the one on the compute nodes • A VM running in libvirt on the controller host with eth0 connected to the br0 host bridge (mgmt network).
Prepare the outgoing default gateway interface
Traffic originating from inside the guest needs to have a gateway to send packets to. This will be a dummy interface with the same IP on each of the controller hosts. In this example we’ll generate a random MAC address in the format libvirt expects and use that to create a dummy dev01 service network IP interface on the host that we will later route to from within the guest:

modprobe dummy
mac=$(hexdump -vn3 -e '/3 "52:54:00"' -e '/1 ":%02x"' -e '"\n"' /dev/urandom)
ip link add virgw-service address $mac type dummy
ip addr add 172.31.16.1/24 dev virgw-service
ip link set dev virgw-service up
This will bring up a virtual gateway interface that will be able to receive traffic from inside the guest instances on this controller host and deliver it to the kernel to be routed. However, we only want this interface to be used for outgoing traffic FROM the guests. But there is a problem - when we “up” the interface in the last step above an entry for the 172.31.16.0/24 network will be made in the kernel routing table:
[root@dev01-controller-03 ~]# ip route | grep virgw-service
172.31.16.0/24 dev virgw-service proto kernel scope link src 172.31.16.1
This leads to any and all traffic to that network being routed back over the virgw-service interface, which we don’t want. To fix this (and this is what Calico does, too) we remove the route that was created:

ip route del 172.31.16.0/24
We’ve now prepared the virgw-service interface on the controller host to act as a dummy gateway for the service network on guest instances.
Add a tunnel interface connecting the host with a guest VM
First we make a tap interface on the controller host and give it a recognizable name - it seems like only a single dash is allowed in the name. The settings for the device are derived from what calico does on the compute nodes:

ip tuntap add dev tap-dev01db02 mode tap one_queue vnet_hdr
# list tap devices / show usage
ip tuntap
ip tuntap help
Next, we need to define this tap device in the libvirt domain config for the guest VM. Make sure the domain is not running first:

virsh shutdown dev01-db-02
Generate an xml block describing the new guest network interface with a new random mac address - the target device should be the tap device we just created on the host.
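The xml block itself did not survive in this copy of the page; a sketch of what such a definition typically looks like for a pre-created tap device (the MAC address is only an example, generate your own as shown earlier):

<interface type='ethernet'>
  <mac address='52:54:00:12:34:56'/>
  <!-- must match the tap device created on the host -->
  <target dev='tap-dev01db02'/>
  <model type='virtio'/>
  <!-- older libvirt expects a no-op script for type='ethernet' -->
  <script path='/bin/true'/>
</interface>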
Copy and paste this xml block below the current interface definition in the domain xml:

virsh edit dev01-db-02
Make configuration changes to libvirt to allow this interface type
This is already documented in step 1) of the Calico compute node documentation at http://docs.projectcalico.org/en/stable/redhat-opens-install.html?highlight=cgroup_device_acl#compute-node-install
Boot the guest and set up the interface
On the controller host you should now be able to boot the guest with the new interface added. We also need to create the host route to the new service IP that will now be available, and bring up the tap device:

virsh start dev01-db-02
ip link set dev tap-dev01db02 up
ip route add 172.31.16.18/32 dev tap-dev01db02
Log in to the VM from the mgmt network and set up the new interface manually, then verify that it works:

sudo ssh iaas@dev01-db-02
sudo -i
ip addr
ip addr add 172.31.16.18/24 dev eth1
ip link
ip link set dev eth1 up
# switch default gw
ip route del default
ip route add default via 172.31.16.1
To make configuring services that use the new interface easier, use the new service IP interface as the default gw for the guest. You should now be able to ping the outside dummy gateway using the new interface:

ping -I eth1 172.31.16.1
On the controller host, verify that bird knows about the host route, e.g.:
[root@dev01-controller-03 ~]# birdcl show route
BIRD 1.4.5 ready.
0.0.0.0/0        via 172.31.1.1 on br0 [kernel1 21:38:10] * (10)
172.31.16.18/32  dev tap-dev01db02 [kernel1 21:41:33] * (10)
172.31.34.0/24   dev eth1.912 [direct1 21:38:10] * (240)
172.31.35.0/24   dev eth1.913 [direct1 21:38:10] * (240)
In order for a VM to reach an address on its own subnet, proxy-arp has to be enabled on the tap interface. The host computer with the router will then offer its own MAC address from the tap interface and route the traffic.
[root@dev01-controller-03 ~]# echo 1 > /proc/sys/net/ipv4/conf/tap-dev01db02/proxy_arp
1.4.8 Configure iDRAC-settings on Dell 13g servers with USB stick
With Dell PowerEdge 13g servers, the iDRAC base management controller can be configured automatically by reading settings from an xml file located on a USB stick. The USB port to be used is labelled with a wrench icon. By default, Dell PE 13g servers will auto-apply config in this manner if the default username and/or password has not been changed, so typically new servers are prime targets.
Create USB stick and copy files to it
You will need a USB stick formatted with fat32 and a directory called:
System_Configuration_XML
Two files are needed:

config.xml
control.xml
These xml files can be exported from an already configured server, or better still, git cloned from https://github.com/norcams/dell-idracdirect
Apply profile to server iDRAC
Provide power to the server, but do not insert the USB stick just yet. Power on the server, and wait for the POST process to finish. After POST has finished, insert the USB stick into the port on the front of the server with the wrench label. If the server provides a display, it will first show importing, then applying. After some ten-odd seconds the server will reboot. You will notice, as all lights will go out. Remove the USB stick and proceed to the next server.
1.4.9 Using vncviewer to access the console
We configure the bmc (baseboard management controller) on our servers to enable a VNC server feature. Accessing the console through VNC is easier and faster than using the Java-based console available through the bmc web interface. On CentOS/RedHat/Fedora, install the needed VNC client packages:
yum -y install tigervnc tigervnc-server-minimal
vncpasswd
# -> enter the idrac password and confirm
vncviewer -passwd ~/.vnc/passwd 1.2.3.4:5901
The tigervnc-server-minimal package is needed in order to get the vncpasswd utility. This creates a passwd file that is used for providing a password when connecting to the VNC server. The VNC server on the bmcs listens on port 5901. Only a single connection is allowed by the server.
1.4.10 Building puppet-agent for PPC-based Cumulus Linux
Puppet uses its own build tool, called Vanagon, for puppet-agent. It is run against a remote target, in our case a Debian Wheezy installation for PowerPC. A version that can be run in qemu can be downloaded from http://folk.uib.no/ava009/debian_wheezy_ppc.img.tar.gz (0c27128c6ea2dad8f6d9cb8364e378a7). If you have a physical PowerPC-based machine available (e.g. an old Mac), the build will run considerably faster there. Vanagon attempts to SSH to the target as root; the image above allows this (with a password). You can also create an SSH key and add it to the root user on the build box. Then set VANAGON_SSH_KEY=
1.4.11 How to create the designate-dashboard RPM package
1. Install tools:
yum install rpm-build
2. Get designate-dashboard from GitHub:
git clone https://github.com/openstack/designate-dashboard.git
cd designate-dashboard
git checkout stable/pike
3. Build RPM:
python setup.py bdist_rpm
1.5 Team operations
This is internal information about development and operations for the IaaS team.
1.5.1 Getting started
This is information for new team members. Every team member should be familiar with this information.
Work with source code on github
Source code
When we speak of the source code, we typically refer to norcams/himlar on Github. This is the puppet code, hieradata and bootstrap scripts to get every component up and running the way we need. First make sure you have a Github account and then fork the norcams/himlar repo. You should make your fork the origin and then add another remote for norcams/himlar, as sketched below.
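A sketch of that initial setup; the remote name norcams is only a suggestion:

# clone your fork as origin
git clone git@github.com:<your-username>/himlar.git
cd himlar
# add norcams/himlar as an extra remote
git remote add norcams https://github.com/norcams/himlar.git
git fetch norcams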
Policy
When you need to make a change, the rule of thumb is the following:
• Minor changes can be done directly to master on norcams/himlar, e.g. hieradata for one location, or a minor code change for non-critical components.
• All other changes will need a PR (pull request) on Github
• All code changes should also be deployed first on a development location (e.g. dev01)
2FA on jumphosts (login nodes)
As an extra security measure, two-factor authentication is now implemented for access to the login nodes, and thus the management network. The two components used are
• SSH keys and
• TOTP
The latter uses the Google Authenticator mobile (or compatible) app on the client side.
Basic procedure for access
To get access to the internal infrastructure one has to first go through any of the login nodes:
• osl-login-01.iaas.uio.no
• bgo-login-01.iaas.uib.no
All user logins must be authenticated by providing two independent components:
1. SSH key exchange
2. TOTP verification code
The public part of the user SSH key is provisioned using Puppet by publishing the key in hieradata: common/modules/accounts.yaml
In addition the user must be allocated membership of the wheel group (same file). The TOTP setup is done by executing /usr/bin/google-authenticator. From this point on any login for this account requires a verification code (in addition to the automatic exchange of relevant SSH keys). [1]
Step-by-step setup
• add SSH public key to hieradata:
  file: common/modules/accounts.yaml
  key: accounts::ssh_keys
• set up user to become member of the wheel group:
  file: as above
  key: accounts::users
• ensure the account is created on hosts with the login role:
  file: common/roles/login.yaml
  key: accounts::accounts
• on login nodes, as the user after the account is created:
  1. execute google-authenticator and reply to the questions. Recommended answers:
     – Do you want authentication tokens to be time-based (y/n) y
     – Do you want me to update your “/home/i
[1] In the initial set up phase - to enable existing users to convert to 2FA - access through SSH keys only is allowed. The “switch” for this is the availability of the user configuration file. To disable this behaviour remove the option nullok from any line in /etc/pam.d/google-authenticator-wheel-only (through hieradata: common/roles/login.yaml and key googleauthenticator::pam::mode::modes:).
– If the computer that you are logging into isn’t hardened against brute-force login attempts, you can enable rate-limiting for the authentication module. By default, this limits attackers to no more than 3 login attempts every 30s. Do you want to enable rate-limiting (y/n) y
2. Provide the user with the secret key which is printed, the QR code drawn or the URL displayed - all shown after the initial question.
3. Install a TOTP client application on any of the compatible user devices (mobile phone, tablets etc). The recommended application is Google Authenticator.
4. In the TOTP application set up a new account following its instructions. The easiest method is if the app provides a means to configure it by scanning a QR code; the user can then be shown the QR code drawn during server initialization, or alternatively use the URL. Otherwise enter the secret key printed.
Login procedure
1. log in to a login node from an account with access to the private part of the SSH key provided
2. when prompted, start the TOTP/2FA application on the user device and enter the 6 digits displayed
3. login should be successful
Important: It is paramount that the user device and the login node are in sync with regards to time!
Transfer to new device
If the previous device is lost, then the setup procedure should be repeated to configure a new code. But in those cases where a new device (mainly a phone) is purchased etc., and one still has full control of the old, it is possible to recreate the required QR code like this: 1. username=
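The listing above is truncated; assuming qrencode is installed and the secret sits in the default location, one way to recreate the QR code is roughly:

username=<username>
# the shared secret is the first line of the user's config file
secret=$(head -1 /home/$username/.google_authenticator)
qrencode -t ANSIUTF8 "otpauth://totp/$username?secret=$secret"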
Note: Remember to do this on both login nodes!
Emergency code
If the situation should arise where the user does not have access to the device where the 2FA application is installed, he/she can log in using any of the one-time passcodes created during setup:
• someone with access to the user configuration file [2] must retrieve one of the passcodes listed at the bottom
• log in as usual but enter the retrieved emergency code in place of a proper verification code
• after a successful login this particular emergency code is rendered invalid
[2] Default $HOME/.google_authenticator
Secret repository
For data that should not be publicly available (on GitHub or elsewhere) there is an in-house repository named secrets. It contains these areas:
• hieradata Data used by the himlar code. This area is controlled by git
• nodes For files only used by or on specific systems
• common Non-host specific files, like licenses and proprietary binaries
hieradata
Data in here is accessed and controlled by git at git@git.iaas.uio.no:
git clone git@git.iaas.uio.no:hieradata/secrets
cd secrets
# ... edit ...
git commit
git push
For activation consult the Deployment section of this documentation.
nodes and common
Data under these sections is stored manually under the respective directory on any of the login nodes. Afterwards data should be synced between the locations using the script described below. Data under common can be stored arbitrarily. For nodes, create a directory named by the short form of the hostname (i.e. non-FQDN). For data utilized by ansible driven jobs the structure might be dictated by the playbook used. One of those is the SSL certificate distribution.
Synchronization of data between repositories
Note: The script described here is not yet fully functional!
The login node in OSL is defined as master, the implication being:
• data which exists in several locations is set to the content of the osl data
• data which exists solely on the slave is copied to the master
To synchronize, run as yourself after any change:
cd /opt/repo/secrets
./secret-sync.sh [delete]
The delete option removes any files on slave(s) which do not exist on the master. Use this with caution! The requirement for this to work is that all files and directories are owned and writeable by group wheel!
1.5.2 Development
Puppet design policy
This is the policy for himlar puppet code.
Definitions
• himlar: puppet code at https://github.com/norcams/himlar
• module: upstream module listed in Puppetfile
• profile: a puppet module with classes for norcams adaptations. Found in himlar under profile/
• hieradata: hierarchical yaml files with config data under hieradata/
Profile
• hieradata should include profile classes
• all modules should be included in a profile class (never in hieradata)
• profile classes should have boolean options to enable a feature with default value set to false
• enabling of profile features should be done either in
  – hieradata/common/modules/profile.yaml for global settings
  – hieradata/common/role/ for enabling for only one role
• each openstack role should have one class named after itself that will include feature classes
• location hieradata override should always be done in the same module/role file as in the common version of module/role
• module hieradata should be grouped by module classes
Hieradata
• global hiera variables referenced in other hiera files should have generic names and never full class names (e.g. openstack_version and not cinder::db::mysql::password)
• profile hashes that need to be merged should use the same naming as autoloaded input class variables (e.g. profile::openstack::designate::bind_servers)
Vagrant
Development in vagrant
Last changed: 2021-09-14
Before you start you need to set up vagrant with either virtualbox or libvirt. Then following these steps will get you up and running with vagrant.
Source code from git
First make a fork of https://github.com/norcams/himlar and clone that repo to a local folder called himlar. This will be referred to as $himlar in this documentation:

mkdir $himlar
cd $himlar
git clone git@github.com:
Generate CA files
Last changed: 2021-09-14
Warning: If you have problems with the CA: delete provision/ca, check out all files tracked in git and rerun bootstrap.sh
You will need to generate a CA key pair with openssl to sign the certificate used in vagrant to test TLS for the endpoints. First make sure openssl is installed on your host computer (if not, run the scripts and copy all the .pem files back to your host):

cd $himlar
cd provision/ca
echo "YOUR_SECRET" > passfile
./bootstrap.sh
NB! You must run the script from the provision/ca directory! The CA chain .pem file can be found in:
$himlar/provision/ca/certs/intermediate/ca-chain.cert.pem
If you trust that no one will have access to your passfile, you could add $himlar/provision/ca/certs/intermediate/intermediate.cert.pem to your browser to avoid warnings.
Use in puppet
In puppet, this CA is used to generate certificates defined in the hash: profile::application::openssl::certs
Nodeset
Last changed: 2021-09-14 There are different sets of nodes to use in vagrant. The node set can be changed by setting the environment variable called HIMLAR_NODESET.
Default nodeset
The default nodeset uses the vagrant location. Here we have added all the important roles into one node called vagrant-api-01. The rest of the nodes are optional (like dashboard, access and compute).
Full nodeset
The full nodeset uses the dev location. Here all roles have nodes matching the test and production locations. This will require more resources on the vagrant host (16GB+ RAM, 4+ cores). To use the full nodeset:

export HIMLAR_NODESET=full
Other nodeset
There are also other special case nodesets. To see all nodesets and change them, edit $himlar/nodes.yaml.
Vagrant up
The nodes in vagrant should be started in stages. Each stage should complete before the next one is started.
First stage:
• db-01
• mq-01
• api-01
• dashboard-01 (optional)
• access-01 (optional)
• monitor-01 (optional)
• logger-01 (optional)
• proxy-01 (optional)
• admin-01 (optional)
Second stage:
• identity-01
Main stage:
• novactrl-01
• image-01
• volume-01
• network-01
• console-01 (optional)
• metric-01 (optional)
• telemetry-01 (optional)
Last stage:
• compute-01
Final fixes
A few final steps are needed before you can start an instance in vagrant.
Host aggregate and AZ
After running vagrant up compute you will need to run vagrant provision novactrl to add the newly created compute node to a host aggregate and the correct availability zone (AZ).
Metadata api
We need to restart openstack-nova-metadata-api on compute-01. This can be done with ansible:

ansible-playbook -e "myhosts=vagrant-compute name=openstack-nova-metadata-api.service" lib/systemd_restart.yaml
Flavors
Flavors are missing. m1 flavors can be added with himlarcli/flavor.py or the openstack cli.
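With the openstack cli, a minimal flavor could be created like this; the name and sizing here are arbitrary examples:

openstack flavor create --vcpus 1 --ram 2048 --disk 10 m1.small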
Image
You will need a public cirros image to test with. One way to quickly fix this is to use himlarcli/image.py: edit config/images/cirros.yaml and set the image to be public. You can then just run:
./image.py update -i cirros.yaml
Dataporten
See more about setting up dataporten in vagrant below. After running destroy/up, only himlarcli/dataporten.py will be needed. To create a dataporten user in vagrant after setting up the dashboard, we can use himlarcli/access.py to add a user request to the queue, then process the request and add the user.
Working with web services in vagrant
Last changed: 2021-09-14
Here are some tips for working with the different web services like dashboard, access and API in vagrant.
/etc/hosts
You will need to update /etc/hosts on the machine where you run your browser or API calls. Look in common.yaml for the location you are working with, and add all public addresses. Example for the full nodeset:
172.31.24.56 access.dev.iaas.intern
172.31.24.51 dashboard.dev.iaas.intern
172.31.24.51 status.dev.iaas.intern
172.31.16.81 identity.trp.dev.iaas.intern
172.31.24.86 api.dev.iaas.intern
172.31.24.86 compute.api.dev.iaas.intern
172.31.24.86 network.api.dev.iaas.intern
172.31.24.86 image.api.dev.iaas.intern
172.31.24.86 identity.api.dev.iaas.intern
172.31.24.86 volume.api.dev.iaas.intern

sshuttle
If you work on a remote vagrant host you will need to have access to vagrant’s public net. This can be done with sshuttle:

sshuttle -r
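A typical invocation, assuming the vagrant public net is 172.31.0.0/16 as in the /etc/hosts example above:

sshuttle -r <user>@<vagrant-host> 172.31.0.0/16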
Vagrant with virtualbox
Last changed: 2021-09-14
You first will need to install the virtualbox and vagrant packages for your operating system. This has been tested on Ubuntu and OSX and works without any other configuration.
Vagrant with libvirt
Last changed: 2021-09-14
Contents
• Vagrant with libvirt – Requirements – Setting up the Vagrant environment – Tips and tricks
Requirements
In order to deploy the virtual machines in Vagrant, the host running the VMs must meet the following requirements:
Operating system: libvirt capable (tested on RHEL/Fedora)
Memory: 16 GB minimum, 32 GB recommended
Disk space: 8 GB minimum on /var/lib/libvirt
Setting up the Vagrant environment
In order to deploy the Vagrant environment, follow this guide.
1. Make sure that the requirements are met
2. Ensure that CPU virtualization extensions are enabled on the host. You’ll probably need to enter BIOS setup for this.
3. Create a file /etc/polkit-1/rules.d/10-libvirt.rules with the following contents:
polkit.addRule(function(action, subject) {
  if ((action.id == "org.libvirt.unix.manage" ||
       action.id == "org.libvirt.unix.monitor") &&
      subject.isInGroup("wheel")) {
    return polkit.Result.YES;
  }
});
4. Install Vagrant and libvirt. In this case, it is assumed that you’re running Fedora and have the RPMFusion repositories available:
dnf -y install vagrant vagrant-libvirt libvirt-daemon-kvm
If you are running RedHat, you may also need to install:
yum install libvirt-devel
5. Install Vagrant plugins for libvirt:
vagrant plugin install vagrant-libvirt
6. Start the libvirtd service, and make sure that it is started at boot:
systemctl start libvirtd.service
systemctl enable libvirtd.service
7. Add the user that will be running Vagrant to the wheel group:
usermod -a -G wheel <username>
Tips and tricks
Add the following in your ~/.bashrc or similar to always use the same nodeset (example for nodeset “dns”):
# Himlar
export HIMLAR_NODESET=dns
Script to provision a node, and keep the output:
#!/bin/bash
host=$1

[ -d /tmp/himlar ] || mkdir /tmp/himlar
vagrant rsync $host | tee /tmp/himlar/$host
vagrant provision $host | tee -a /tmp/himlar/$host
Script to take all nodes up, in the correct order (uses the script provision.sh above):
#!/bin/bash
declare -a nodes=( 'db' 'api' 'identity' 'mq' 'dashboard' 'ns' 'resolver-01' 'resolver-02' 'admin' 'image' 'network' 'compute' 'novactrl' 'volume' 'dns' )

for node in "${nodes[@]}"; do
  vagrant up $node
  provision.sh $node
  provision.sh $node
done

for node in "${nodes[@]}"; do
  provision.sh $node
done
Testing in Vagrant
Last changed: 2021-09-14
Warning: This might be outdated.
Connecting to Horizon
Horizon is the web GUI component in OpenStack. If you’ve followed the Setting up the Vagrant environment guide earlier, you should now start the nodes:

vagrant up api
vagrant up dashboard
Connect a browser to the Horizon GUI: https://172.31.24.51/
If the VMs are running on a remote host, the best approach will be to use an SSH tunnel. Create an SSH tunnel with:

ssh -L 8443:172.31.24.51:443
After creating the SSH tunnel, point your browser to: https://localhost:8443/
Note that authentication through Feide Connect (aka “Dataporten”) uses redirection and is not possible when connecting through an SSH tunnel.
Setting up local user and tenant
Logging into the VMs is fairly simple. In order to set up a demo user and tenant, log into the master VM:

vagrant ssh master
Become root:

sudo -i
API authentication configuration
The norcams/himlar repo is available from within the vagrant VM as /opt/himlar. Run the 00-credentials_setup.sh script:

/opt/himlar/tests/00-credentials_setup.sh
This will create 3 files in your home directory:
openstack.config: Defines the demo username etc. Used by other tests
keystonerc_admin: Sets environment variables for the administrator
keystonerc_demo: Sets environment variables for the demo user
In order to “become” the OpenStack administrator, you then only need to source the ~/keystonerc_admin file:
. ~/keystonerc_admin
To switch to the demo user, source the ~/keystonerc_demo file:
. ~/keystonerc_demo
Create demo user and project (tenant)
This can be accomplished simply by running:
/opt/himlar/tests/01-keystone-create_demo_user.sh
But for the sake of learning, you may want to do this manually as shown below: 1. Source the file that defines the administrator environment:
source ~/keystonerc_admin
2. Create a demo tenant (project):
openstack project create --or-show demoproject
3. Create a demo user and set the password:
openstack user create --or-show --password himlar0pen demo
4. Associate the demo user with the demo tenant:
openstack user set --project demoproject demo
5. Show the demo user:
openstack user show demo
Upload an image to Glance
This can be accomplished simply by running:
/opt/himlar/tests/02-glance-import_cirros_image.sh
But for the sake of learning, you may want to do this manually as shown below: 1. Source the file that defines the administrator environment:
source ~/keystonerc_admin
2. Download CirrOS image:
curl -o /tmp/cirros.img http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
3. Upload and create the image in Glance:
openstack image create"CirrOS test image"--disk-format qcow2--public--file/
˓→tmp/cirros.img
Note: This can also be accomplished by using Glance directly:
glance image-create --name "CirrOS test image" \
  --disk-format qcow2 --container-format bare \
  --visibility public --file /tmp/cirros.img
4. List images:
openstack image list
Optionally, list images using the Nova API:
nova image-list
Create a network security group
This can be accomplished simply by running:
/opt/himlar/tests/03-neutron-create_security_group_and_rules.sh
But for the sake of learning, you may want to do this manually as shown below: 1. Source the file that defines the administrator environment:
source ~/keystonerc_admin
2. Create a network security group called “test_sec_group”:
openstack security group create test_sec_group
3. Add a rule which allows incoming SSH:
openstack security group rule create --proto tcp --dst-port 22 test_sec_group
4. Add a rule which allows incoming ICMP:
openstack security group rule create --proto icmp test_sec_group
5. Show the newly created security group:
openstack security group show test_sec_group --max-width 70
Note: This could have been done using the Neutron API instead of the generic openstack command:
neutron security-group-create test_sec_group
neutron security-group-rule-create --direction ingress --protocol tcp \
  --port_range_min 22 --port_range_max 22 test_sec_group
neutron security-group-rule-create --protocol icmp --direction ingress test_sec_group
neutron security-group-show test_sec_group
Running himlarcli in vagrant
You will need access to both the public and transport net on the host you plan to run himlarcli. This should work on the same host where you run vagrant.
Himlarcli source code
Clone the repo from https://github.com/norcams/himlarcli and follow the instructions in the README.
config.ini
You will need a working config.ini file for himlarcli. You can either copy the one from the vagrant/dev-proxy-01 node or from /opt/himlarcli/ on login. Make sure that the following elements in config.ini are correct:
• auth_url
• password (see hieradata/vagrant/common.yaml)
• region
• keystone_cachain
The openstack endpoints used in himlarcli must also resolve. This can be done by editing /etc/hosts on the host.
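A minimal sketch of what such a config.ini could look like; the section name and all values here are assumptions, so consult the actual file on dev-proxy-01 for the real layout:
[openstack]
auth_url=https://identity.api.vagrant.iaas.intern:5000/v3
password=<from hieradata/vagrant/common.yaml>
region=vagrant
keystone_cachain=/path/to/keystone-ca-chain.pem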
Testing
instance.py is good for testing both the keystone and nova APIs, and update_images.py for testing the glance API.
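For example, both scripts support the -h option for a list of available actions (see the Himlar CLI section below):
./instance.py -h
./update_images.py -h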
Setting up dataporten in vagrant
First make sure that the access and proxy nodes are part of your HIMLAR_NODESET. There should be at least one nodeset with access and proxy in nodes.yaml. To use Dataporten to authenticate a user in access and keystone, you will first need to set up two applications at https://dashboard.dataporten.no More help can be found at https://docs.dataporten.no/ Log in to the Dataporten Dashboard and register a new application. The redirect URIs for access should be:
https://access.vagrant.iaas.intern/login https://access.vagrant.iaas.intern/reset
and for dashboard:
https://identity.api.vagrant.iaas.intern:5000/v3/auth/OS-FEDERATION/websso/openid/redirect
You need to add the following scope in Permissions for each application:
email
userid-feide
profile
openid
Also make sure dashboard.vagrant.iaas.intern and access.vagrant.iaas.intern are in /etc/hosts on the machine where you are running your browser (a sketch of these entries is shown at the end of this section). Then copy Client ID and Client Secret from Oauth details to:
hieradata/secrets/nodes/vagrant-access-01.secrets.yaml
hieradata/secrets/nodes/vagrant-identity-01.secrets.yaml
Reference hieradata/secrets/nodes on the other locations for exact content. To allow Dataporten login in horizon, run the dataporten script once in himlarcli as root:
./dataporten.py
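For reference, the /etc/hosts entries mentioned above could look like this; the addresses are placeholders for the actual VM IPs in your environment:
172.31.24.51  dashboard.vagrant.iaas.intern
172.31.24.xx  access.vagrant.iaas.intern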
Galera DB cluster
Some tips for setting up galera in vagrant:
Setup
Warning: If you have running db-global or db-regional nodes these must be halted or destroyed first!
• use the db nodeset
• you will need 3 nodes: db-global-01, db-global-02, ha-01
• start db-global-01 first
• uncomment the marked hieradata:
hieradata/nodes/vagrant/vagrant-db-global-01.yaml hieradata/nodes/vagrant/vagrant-db-global-02.yaml hieradata/nodes/vagrant/vagrant-ha-01.yaml hieradata/vagrant/roles/db-global.yaml
• Provision db-global-01 (this will fail), run galera_new_cluster to start the database service, then run provision once more (some errors may remain)
• Start ha-01 and make sure garbd is running. You should now have a cluster of size 2 (see below)
• Start db-global-02 and start mariadb.service. If you have problems starting the database, try stopping iptables first
Check status
Check the current galera cluster status. If everything is working you should have a size 3 cluster:
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_%';
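An illustrative excerpt of the output for a healthy cluster (the actual values will vary):
| wsrep_cluster_size   | 3       |
| wsrep_cluster_status | Primary |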
Host overview
Address plan for individual hosts/nodes. Also look at the IP addressing plan for information about location-specific networks.
Network hosts
node           inf  net   addr
leaf-01             mgmt  x.x.x.1
leaf-02             mgmt  x.x.x.2
leaf-03             mgmt  x.x.x.3
leaf-04             mgmt  x.x.x.4
mgmt-01             priv  x.x.x.1
controller-00       mgmt  x.x.x.99
Management hosts
Compute resource (controller) and profile (resources) for virtual nodes are defined under config/nodes/ in himlarcli.
node         inf   net   addr
login-01     eth0  mgmt  x.x.x.10
admin-01     eth0  mgmt  x.x.x.11
proxy-01     eth0  mgmt  x.x.x.12
logger-01    eth0  mgmt  x.x.x.13
monitor-01   eth0  mgmt  x.x.x.14
builder-01   eth0  mgmt  x.x.x.15
ns-01        eth0  mgmt  x.x.x.16
resolver-01  eth0  mgmt  x.x.x.17
resolver-02  eth0  mgmt  x.x.x.18
OOB hosts
FOR NOW ONLY APPLIES TO OSL
Most hosts have their OOB interface (iDRAC, ILO, etc.) set up with the same last octet as their trp counterpart. They are also registered in DNS with an -lc suffix to the normal host name. Two exceptions to those rules are listed here:
node           inf       net  addr     note
controller-04  em3.3378  oob  —        No address assigned/necessary
admin-01       eth1      oob  x.x.x.9  Interface attached to br2 on host
em3.3378 on controller-04 is connected to a bridge interface (br2), whose sole purpose is to bridge the OOB network to admin-01. This allows admin-01 to control the power interface on all physical nodes (in the same vein as it controls the virtual power interface on all VMs).
Note: This only applies to the production environments (BGO and OSL) where we have control of the management switches.
Openstack nodes
Management net (mgmt) should have the same last octet as the transport net (trp).
node               inf   net  addr
status-01          eth1  trp  x.x.x.21
report-01          eth1  trp  x.x.x.22
nat-linux-01       eth1  trp  x.x.x.26
nat-linux-02       eth1  trp  x.x.x.27
mq-01              eth1  trp  x.x.x.31
mq-02              eth1  trp  x.x.x.32
mq-03              eth1  trp  x.x.x.33
dns-01             eth1  trp  x.x.x.34
dns-02             eth1  trp  x.x.x.35
image-01           eth1  trp  x.x.x.36
image-02           eth1  trp  x.x.x.37
image-03           eth1  trp  x.x.x.38
dns-03             eth1  trp  x.x.x.39
db-01              eth1  trp  x.x.x.41
db-02              eth1  trp  x.x.x.42
db-03              eth1  trp  x.x.x.43
volume-01          eth1  trp  x.x.x.46
volume-02          eth1  trp  x.x.x.47
volume-03          eth1  trp  x.x.x.48
dashboard-01       eth1  trp  x.x.x.51
dashboard-02       eth1  trp  x.x.x.52
dashboard-03       eth1  trp  x.x.x.53
dashboard-mgmt-01  eth1  trp  x.x.x.54
cephmds-01         eth1  trp  x.x.x.55
access-01          eth1  trp  x.x.x.56
access-02          eth1  trp  x.x.x.57
access-03          eth1  trp  x.x.x.58
cephmds-02         eth1  trp  x.x.x.59
cephmds-03         eth1  trp  x.x.x.60
console-01         eth1  trp  x.x.x.61
console-02         eth1  trp  x.x.x.62
console-03         eth1  trp  x.x.x.63
coordinator-01     eth1  trp  x.x.x.64
novactrl-01        eth1  trp  x.x.x.66
novactrl-02        eth1  trp  x.x.x.67
novactrl-03        eth1  trp  x.x.x.68
network-01         eth1  trp  x.x.x.71
network-02         eth1  trp  x.x.x.72
network-03         eth1  trp  x.x.x.73
telemetry-01       eth1  trp  x.x.x.76
telemetry-02       eth1  trp  x.x.x.77
telemetry-03       eth1  trp  x.x.x.78
identity-01        eth1  trp  x.x.x.81
identity-02        eth1  trp  x.x.x.82
identity-03        eth1  trp  x.x.x.83
rgw-01             eth1  trp  x.x.x.84
rgw-02             eth1  trp  x.x.x.85
api-01             eth1  trp  x.x.x.86
api-02             eth1  trp  x.x.x.87
api-03             eth1  trp  x.x.x.88
cephmon-object-01  eth1  trp  x.x.x.89
cephmon-object-02  eth1  trp  x.x.x.90
cephmon-01         eth1  trp  x.x.x.91
cephmon-02         eth1  trp  x.x.x.92
cephmon-03         eth1  trp  x.x.x.93
cephmon-object-03  eth1  trp  x.x.x.94
rgw-03             eth1  trp  x.x.x.95
metric-01          eth1  trp  x.x.x.96
metric-02          eth1  trp  x.x.x.97
metric-03          eth1  trp  x.x.x.98
Openstack hosts
node           inf   net  addr
controller-01  eth1  trp  x.x.x.100
controller-02  eth1  trp  x.x.x.101
controller-03  eth1  trp  x.x.x.102
controller-04  eth1  trp  x.x.x.114
compute-01*    eth1  trp  x.x.x.103
compute-02*    eth1  trp  x.x.x.104
compute-03*    eth1  trp  x.x.x.105
compute-04*    eth1  trp  x.x.x.111
compute-05*    eth1  trp  x.x.x.112
compute-06*    eth1  trp  x.x.x.113
compute-07*    eth1  trp  x.x.x.115
compute-08*    eth1  trp  x.x.x.116
storage-01*    eth1  trp  x.x.x.106
storage-02*    eth1  trp  x.x.x.107
storage-03*    eth1  trp  x.x.x.108
storage-04*    eth1  trp  x.x.x.109
storage-05*    eth1  trp  x.x.x.110
Ephemeral hostnames
For Puppet to work and to know which node to configure, we have the certname (or clientcert) constructed the following way:
To make it easier to know what certname a node uses, we also set the hostname equal to the certname for all nodes. When some nodes, e.g. compute, object or storage, change certname, we need a way to keep track of the node other than the certname. All physical nodes have one permanent name and mgmt IP that will follow the machine from start to end:
This will be used for A and PTR records for mgmt as well. When a node is used in a variant/subrole we use a CNAME to map the ephemeral hostname to the permanent one. This is only done in the location where the subrole/variant is present, and removed when no longer needed.
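As an illustration, with purely hypothetical names and addresses, the records could look like this:
; permanent name: A (and matching PTR) record for mgmt
compute-03.mgmt.osl.iaas.intern.      A      x.x.x.103
; ephemeral subrole name mapped with a CNAME while in use
compute-hpc-01.mgmt.osl.iaas.intern.  CNAME  compute-03.mgmt.osl.iaas.intern.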
Physical naming
When we add name tags to machines in the datacenter we use the permanent name:
The location part can be omitted to save space.
Rack overview and power consumption
Rack placement and power consumption planning.
Region OSL
Rack 1
Estimated power consumption:
Host               Vendor/Model             Typical  Maximum
osl-login-01       Dell PowerEdge R610      200 W    300 W
osl-torack-01      ?                        200 W    370 W
osl-torack-02      ?                        200 W    370 W
osl-compute-01     Dell PowerEdge R630      354 W    541 W
osl-compute-02     Dell PowerEdge R630      354 W    541 W
osl-compute-03     Dell PowerEdge R630      354 W    541 W
osl-compute-04     Dell PowerEdge R630      354 W    541 W
osl-compute-05     Dell PowerEdge R630      354 W    541 W
osl-compute-06     Dell PowerEdge R630      354 W    541 W
osl-compute-07     Dell PowerEdge R640      499 W    702 W
osl-compute-08     Dell PowerEdge R640      499 W    702 W
osl-compute-09     Dell PowerEdge R640      499 W    702 W
osl-compute-10     Dell PowerEdge R640      499 W    702 W
osl-compute-42     Supermicro AS-4023S-TRT  300 W    500 W
osl-controller-01  Dell PowerEdge R630      232 W    380 W
osl-controller-02  Dell PowerEdge R630      232 W    380 W
osl-controller-03  Dell PowerEdge R630      232 W    380 W
osl-controller-04  Dell PowerEdge R620      242 W    398 W
Total                                       5958 W   9132 W
Rack 2
Estimated power consumption:
Host             Vendor/Model           Typical  Maximum
osl-leaf03       ?
osl-leaf04       ?
osl-mgmt-01      ?
osl-mgmt-opx-01  Dell S3048-ON
osl-storage-01   Dell PowerEdge R730xd  368 W    572 W
osl-storage-02   Dell PowerEdge R730xd  368 W    572 W
osl-storage-03   Dell PowerEdge R730xd  368 W    572 W
osl-storage-04   Dell PowerEdge R730xd  368 W    572 W
osl-storage-05   Dell PowerEdge R730xd  368 W    572 W
osl-storage-06   Dell PowerEdge R740xd  388 W    602 W
osl-storage-07   Dell PowerEdge R740xd  388 W    602 W
osl-storage-08   Dell PowerEdge R730xd  476 W    680 W
osl-storage-09   Dell PowerEdge R730xd  476 W    680 W
osl-storage-10   Dell PowerEdge R730xd  476 W    680 W
osl-storage-11   Dell PowerEdge R730xd  476 W    680 W
osl-storage-12   Dell PowerEdge R730xd  476 W    680 W
Total                                   4996 W   7464 W
Rack 3
Estimated power consumption:
Host                Vendor/Model              Typical  Maximum
osl-mgmt-opx-02     Dell S3048-ON
osl-nfs-01          Dell PowerEdge R710       300 W    400 W
osl-controller-01   Dell PowerEdge R640       355 W ?  557 W ?
osl-controller-02   Dell PowerEdge R640       355 W ?  557 W ?
osl-controller-03   Dell PowerEdge R640       355 W ?  557 W ?
osl-controller-04   Dell PowerEdge R640       355 W ?  557 W ?
osl-compute-11      Dell PowerEdge R7425      579 W    889 W
osl-compute-12      Dell PowerEdge R7425      579 W    889 W
osl-compute-13      Dell PowerEdge R7425      579 W    889 W
osl-compute-14      Dell PowerEdge R7425      579 W    889 W
osl-compute-15      Dell PowerEdge R7425      579 W    889 W
osl-compute-16      Dell PowerEdge R7425      579 W    889 W
osl-compute-21..24  Supermicro AS-2123BT-HTR  2000 W   3000 W
osl-compute-25..28  Supermicro AS-2123BT-HTR  2000 W   3000 W
Total                                         9194 W   13962 W
Rack 4
Estimated power consumption:
Host                Vendor/Model          Typical  Maximum
osl-mgmt-opx-03     Dell S3048-ON
osl-spine-01        Dell S5232F-ON        360 W    635 W
osl-spine-02        Dell S5232F-ON        360 W    635 W
osl-compute-18      Dell PowerEdge R6525  354 W    1061 W
osl-compute-43      Huawei                500 W    ?
osl-compute-44      Huawei                500 W    ?
osl-compute-45      Huawei                500 W    ?
osl-compute-46      Huawei                500 W    ?
osl-compute-47      Huawei                500 W    ?
osl-compute-48      Huawei                500 W    ?
osl-compute-49      Huawei                500 W    ?
osl-compute-50      Huawei                500 W    ?
osl-compute-29..32  Supermicro            2000 W   3000 W
osl-compute-33..36  Supermicro            2000 W   3000 W
osl-compute-37..40  Supermicro            2000 W   3000 W
Total                                     ?        ?
Region BGO
Rack 1
Rack 2
Rack 3
Rack 4 (planned)
Rack 5 (planned)
1.5.3 Operations
DB
Galera management
The galera cluster consists of three nodes:
• bgo-db-01
• osl-db-01
• uib-ha-02 (quorum node only)
To check the current status, as root on db-01:
mysql
SHOW STATUS LIKE 'wsrep_cluster_size';
Cluster size must be 2 or greater.
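Illustrative output when all three members are connected:
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+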
Start stop quorum node
From bgo-login-01:
sudo ssh iaas@uib-ha-02
sudo -i
systemctl status garbd
systemctl stop garbd
systemctl start garbd
Resetting the Quorum
If one of the nodes in the cluster has wsrep_cluster_status non-Primary, we will need to reset the quorum. On the node you want to make the new master, run this in mysql:
SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';
Read more on how to fix this: http://galeracluster.com/documentation-webpages/quorumreset.html
Bootstrap cluster
You will need to bootstrap the cluster if systemctl start mysqld fails on bgo-db-01 for some reason.
Warning: If there is new data on osl-db-01, it will be lost unless we have a DB dump and restore it on bgo-db-01 after mysqld has been started.
First stop mysqld on db-01 and garbd on uib-ha-02. On bgo-db-01 edit /var/lib/mysql/grastate.dat and make sure:
safe_to_bootstrap: 1
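For reference, a complete grastate.dat typically looks something like this (illustrative values; the uuid and seqno will differ on your node):
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   -1
safe_to_bootstrap: 1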
Bootstrap the cluster on bgo-db-01:
galera_new_cluster
Verify that mysqld is running and do a restore if necessary:
systemctl status mysqld
Start mysqld on osl-db-01 and garbd on uib-ha-02. Verify that the cluster size is now 3.
How to bootstrap Himlar
This document describes the procedure for initializing a new environment from a single login node. The systems to be used are all physically installed (including configuration of BIOS/iDRAC) but otherwise untouched.
loc=[bgo|osl|test01|test02|local1|local2|local3|...]
Prerequisites
• A login node (with an up-to-date /opt/[himlar|repo] hierarchy) which is maintained by Puppet
• No management node installed (controller)
• hieradata/${loc}/common.yaml, hieradata/common/common.yaml, hieradata/nodes/${loc}/... etc. are populated with relevant data
• Puppet is disabled on new nodes: ensure $loc/modules/puppet.yaml includes puppet::runmode: ‘none’
• All commands are run as the admin user (root) unless noted (consult the document 2FA on jumphosts (login nodes))
• The new controller node (and all further controller and compute nodes) must have CPU virtualization extensions enabled in BIOS
Important: When doing a complete reinstall, make sure peerdns: ‘no’ is in the network configuration for the nodes controller-01 and admin-01. Also make sure gateway and DNS point to the login node or wherever there is a connection out and/or a reachable resolver. This might require toggling of data in ‘common.yaml’ or the relevant node files. This should be manipulated in the code activated on the login node from where the bootstrap process is initialized, before the run. Changes after installation of the controller node should be activated on the node itself (“/opt/himlar/hieradata”).
Procedure
1. Make sure the dhcp and tftp services are allowed through the firewall; on RHEL 7/CentOS 7: iptables -I INPUT 1 -i
Note: The error message “curl: (33) HTTP server doesn’t seem to support byte ranges. Cannot resume.” is harmless when the script has been previously run. If so, this is just an indication that the files to be fetched are already in place. But please make sure the files are nevertheless recent!
4. Boot the relevant physical node using the web GUI on the iDrac or with this command on the login node: idracadm -r
Note: Make sure the system is configured to PXE boot on the relevant (mgmt) interface on first attempt! Might require BIOS setup.
Important: When the new controller is fully installed, the script started in 1) must be quit if the new system is set to primarily attempt PXE boot, otherwise it will enter an endless installation loop!
5. Log on to the freshly installed controller node: (sudo) ssh iaas@$loc-controller-01
6. Run puppet in bootstrap mode: bash /root/puppet_bootstrap.sh
7. Run puppet normally: /opt/himlar/provision/puppetrun.sh
8. Punch a hole in the firewall for traffic to port 8000: iptables -I INPUT 1 -p tcp --dport 8000 -j ACCEPT
9. Initiate installation of the admin node/Foreman: /usr/local/sbin/bootstrap-$loc-admin.sh
   1. virsh list should now report the foreman instance as running
   2. The install can be monitored with vncviewer $loc-controller-01, virt-manager connected to $loc-controller-01, or your preferred VNC viewer application
   3. When the message “Domain creation completed. Restarting guest.” is written to the terminal from where the script was started, the system is installed and ready for use.
   4. The new admin node can be logged on to from the login node: ssh iaas@$loc-admin-01 ...
10. When controller node installation is complete, the firewall can be restored: iptables -D INPUT 1, repeated until all newly inserted rules are removed. Check with iptables -L -n
11. Log on to the new admin system from the login node, optionally check the install log: /root/install.post.log
12. Ensure the system time is correct
13. Put the following in hieradata/
• Using the himlarcli command, the nodes will automatically be set up according to the nodes file for the environment.
• Recommended sequence:
  a. leaf nodes if applicable (make sure puppet is run afterwards)
  b. proxy-01 (make sure puppet is run afterwards)
  c. Remaining controller nodes (make sure puppet is run afterwards)
  d. Remaining nodes; may be done by executing: node.py -c config.ini.$loc xxx full
     This will install all nodes in the list
Important: DO NOT run puppet on any of the nodes unless explicitly specified!
Note: Physical hosts may have to be rebooted or powered on manually. Make sure they are configured to PXE boot on the management interface on their first boot.
Note: As long as we have common login nodes shared between test and production environments, some additional steps must be performed until successful install of proxy-01:
1) admin-01 must have the login node configured as resolver
2) the login node must have a hole punched in the firewall for domain traffic (port 53) on the relevant management interface
3) the login node must be set up to NAT outgoing traffic coming in on the relevant management interface (hint: “/root/test02_enable_nat.sh”)
4) admin-01 must have the login node configured as its default gateway
When proxy-01 is up and running, all can be set back to normal.
1. Execute puppet on the nodes in this sequence:
a. mq-01, logger-01
b. db-global-01, db-regional-02, dashboard-01, monitor-01
   For dashboard-01 the certificates must first be distributed.
c. cephmon-0[1-]
d. identity-01, access-01
   For access-01 the certificates must first be distributed. For identity-01, it is important that the openrc file is absent while bootstrapping keystone. Remove the necessary include in the node file before the first puppet run.
e. storage-0[1-]
f. volume-01, image-01, network-01, novactrl-01, console-01
   For console-01 the certificates must first be distributed.
g. compute-0[1-]
2. Enable regular puppet execution by removing puppet::runmode: ‘none’ from $loc/modules/puppet.yaml.
Deployment of new code
With ansible
To deploy with ansible:
cd $ANSIBLE
bin/deploy.sh
Caution: Sometimes the r10k used in provision/puppetmodules.sh will not deploy a new version of a puppet module. Check deployed module version in /etc/puppetlabs/code/modules/
Manual deployment
Deployment is done on the admin-01 node. From login you should reach it by running ssh iaas@
Hieradata and profile
sudo -i
cd /opt/himlar
git pull
Puppet modules
Active puppet modules reside in /etc/puppetlabs/code/modules. For minor changes in Puppetfile this should update the active modules from source:
cd /opt/himlar
HIMLAR_PUPPETFILE=deploy provision/puppetmodules.sh
To rebuild a module from source: rm -Rf /etc/puppetlabs/code/modules/
To rebuild all modules from source:
rm -Rf /etc/puppetlabs/code/modules/* cd /opt/himlar provision/puppetmodules.sh
Secrets
Secrets are stored at git@git.
cd /opt/himlar provision/puppetsecrets.sh
Compute management
Note all AZ names will have a region prefix not used in this document. E.g. default-1 will be bgo-default-1 in bgo. Updated information about active aggregate and compute host lists can be found in Trello.
Organization
Each compute host should only be in one host aggregate and one availability zone.
Availability zone
We have 3 different AZ:
• default-1
• legacy-1
• iaas-team-only-1 (*)
Host aggregate
Each AZ has one or more host aggregates with hosts:
• central1 (default-1)
• group1 (legacy-1)
• group2 (legacy-1)
• group3 (legacy-1)
• placeholder1 (iaas-team-only-1)
(*) This is only available to limited projects (e.g. iaas-team). Note that everybody can see it, but they will get a “No valid host” error if they try to use it. This is also the default AZ for new compute hosts.
Disk setup
Each compute host can either be set up with local disk for instances or use Ceph. This should not be changed without a full reinstall. To see the disk setup for a compute host, look in the hieradata node file.
Aggregate management
All aggregates in legacy-1 will be managed by aggregate.py in himlarcli. This includes activation (enable/disable hosts), migration and notification. Adding and removing compute hosts in aggregates is manual and can be done with the Openstack CLI.
Workflow example
If we need to use a standby compute host (called compute-XY) from placeholder1 in central1, we will need to do the following:
• Reinstall compute-XY with the correct disk setup
• Disable the other hosts in iaas-team-only-1 and enable compute-XY
• Test compute-XY by starting an instance in iaas-team-only-1
• With the Openstack CLI remove compute-XY from the placeholder1 aggregate and add it to central1 (see the sketch below)
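A sketch of the last step with the OpenStack CLI, using the aggregate and host names from the example above:
openstack aggregate remove host placeholder1 compute-XY
openstack aggregate add host central1 compute-XY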
Compute reinstall
For compute host (hypervisor) management use himlarcli and hypervisor.py. This script has sub-commands for enable/disable and for moving hosts between aggregates. Before you start, make sure the compute host is empty and disabled.
Ansible script
NB! Delete the running instances or migrate them to another compute node before you start! To reinstall a compute host with ansible run: cd
Testing
To test the compute host after a reinstall, move it to the placeholder1 host aggregate and test with the iaas-team-only-1 AZ and the iaas-team only project.
HPC Compute nodes setup and management
Hypervisor hardware
• 2x AMD EPYC 7551 32-Core Processor • 512 GB RAM These hosts have 8 NUMA nodes:
# numactl -H
...
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  16  16  16  28  28  22  28
  1:  16  10  16  16  28  28  28  22
  2:  16  16  10  16  22  28  28  28
  3:  16  16  16  10  28  22  28  28
  4:  28  28  22  28  10  16  16  16
  5:  28  28  28  22  16  10  16  16
  6:  22  28  28  28  16  16  10  16
  7:  28  22  28  28  16  16  16  10
Flavors
All flavors have the following properties:
hw_rng:allowed='True'
hw_rng:rate_bytes='1000000'
hw_rng:rate_period='60'
In addition, we have set a type (either “alice” or “atlas”):
type='alice'
We have the following flavors for HPC workloads:
Name              RAM (GB)  vCPU  Properties
alice.large       8         2
alice.xlarge      16        4
alice.2xlarge     24        8
atlas.large       8         2     hw:cpu_policy=’dedicated’ hw:cpu_thread_policy=’require’
atlas.xlarge      16        4     hw:cpu_policy=’dedicated’ hw:cpu_thread_policy=’require’
atlas.2xlarge     24        8
atlascpu.2xlarge  24        8     hw:cpu_policy=’dedicated’ hw:cpu_thread_policy=’require’
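As an illustration, a dedicated-CPU flavor like atlas.large could be created with the OpenStack CLI roughly as follows; names and property values are taken from the tables above, the remaining flags are a sketch:
openstack flavor create --ram 8192 --vcpus 2 \
  --property hw_rng:allowed='True' \
  --property hw_rng:rate_bytes='1000000' \
  --property hw_rng:rate_period='60' \
  --property hw:cpu_policy='dedicated' \
  --property hw:cpu_thread_policy='require' \
  atlas.large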
Hypervisor OS
The following parameters are set via the grub boot loader, in /etc/sysconfig/grub:
hugepagesz=2M hugepages=245760 transparent_hugepage=never isolcpus=4-127
Nova configuration
Only configuration that is special or relevant to the HPC compute nodes is listed here. On novactrl hosts:
enabled_filters=...,NUMATopologyFilter,...
On compute hosts:
vcpu_pin_set="4-127"
reserved_host_memory_mb=4096
cpu_allocation_ratio=1
ram_allocation_ratio=1.5
Network node management
Network node reinstall
With Ansible
To reinstall a network host with ansible run: cd
Where
Testing
Run etcdctl cluster-health on one of the network nodes; everything should be OK here.
Logging
We have a node called logger where we run rsyslog-server, Logstash, Kibana and Elasticsearch. The other nodes can be set up to ship logs to logger.
rsyslog
You can find the logs for each client under /opt/log/
Kibana
Access to Kibana is limited to the mgmt network on each site. You will need to run sshuttle or use SSH port forwarding to the login node to gain access. Point your browser to:
bgo: http://bgo-logger-01.mgmt.iaas.intern:5601/
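A hedged sketch of the two access routes; the user name and mgmt network CIDR are placeholders:
sshuttle -r <user>@bgo-login-01 <mgmt-network-cidr>
or, with plain SSH port forwarding, after which the browser is pointed at http://localhost:5601/:
ssh -L 5601:bgo-logger-01.mgmt.iaas.intern:5601 <user>@bgo-login-01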
Foreman
Runs on
Kickstart templates
The templates used to install new nodes are found on the master branch of: https://github.com/norcams/community-templates.git
Included here are only the templates that diverge from upstream. If this repo is updated, you will need to update all admin nodes manually. Run as root on the admin server:
foreman-rake templates:sync \
  repo="https://github.com/norcams/community-templates.git" \
  branch="master" associate="always" prefix="norcams"
or run:
/opt/himlar/provision/foreman-settings.sh
Upgrade
Foreman is upgraded by changing the foreman_version in himlar and running the foreman upgrade playbook in ansible. Also remember to rebase our community templates fork against upstream. Upstream has tags corresponding to the Foreman version.
Ansible
The Ansible playbooks we use are found on Github at https://github.com/norcams/ansible
In the documentation, $ANSIBLE refers to the directory where you have cloned the ansible repo. By default this is $HOME/ansible. You should clone this repo to your home directory on each login node:
git clone git@github.com:norcams/ansible
You will need access to your SSH-key (tip: use ssh -A
Setup
Please refer to README.md
Simple tasks
Read task.md for examples of useful simple tasks to perform with ansible.
Patching
Last changed: 2021-08-26
Table of Contents
• Patching
  – Update repo
  – Before we start
  – Normal OS patching
  – Non-disruptive patching
  – Compute (dedicated compute resources/HPC)
  – Firmware
  – Testing
Update repo
Repo lists (test and/or prod) are updated during the planning phase of an upgrade. Repos we will need to update for el7 and el8:
• centos-*
• ceph-*
• epel
• mariadb-*
• rdo-*
• sensu
• puppetlabs5
Important: Do NOT update calico-repo without extra planned testing and repackaging.
Avoid updating management repos at the same time as normal patching.
Before we start
Important: Before you start, make sure the repo is up to date with the snapshot you wish to use.
Update the ansible inventory for both OSL and BGO:
$himlarcli/ansible_hosts.py
Set the location variable according to the environment which is going to be patched:
location=bgo
or:
location=osl
Make sure all nodes will autostart with:
sudo ansible-playbook -e "myhosts=${location}-controller" lib/autostart_nodes.yaml
Normal OS patching
Important: When we patch BGO and OSL at the same time, make sure to keep one NS node up at all time!
For each of the production regions, BGO and OSL, do the following:
Patching controller-04
The node controller-04 is usually running virtual nodes that are not critical to the operation of Openstack, and controller-04 can therefore be patched and rebooted outside of a maintenance window. The controller node and all virtual nodes running on the controller can be patched with a single Ansible playbook:
sudo ansible-playbook -e "myhosts=${location}-controller-04" lib/yum_update_controller.yaml
This playbook takes extra options, if needed:
Option             Effect
async=1            will run yum and puppet in parallel on the VMs
no_reboot=1        will not reboot the controller (VMs will still be turned off)
exclude="package"  will not update package with yum
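For example, combining options from the table above (the exclude value is purely illustrative):
sudo ansible-playbook -e "myhosts=${location}-controller-04 async=1 no_reboot=1 exclude=kernel*" lib/yum_update_controller.yaml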
Also, consider patching Firmware.
Patching other controller nodes
1. Upgrade virtual nodes, while excluding the httpd, mariadb and mod_ssl packages. This is usually safe to do outside of a scheduled maintenance window:
sudo ansible-playbook-e"myhosts=$ {location}-nodes exclude=httpd*,MariaDB*,mod_ ˓→ssl,nfs-utils" lib/yumupdate.yaml
2. While in a scheduled maintenance window, upgrade virtual nodes:
sudo ansible-playbook-e"myhosts=$ {location}-nodes" lib/yumupdate.yaml
3. Check if all virtual nodes are updated:
sudo ansible-playbook-e"myhosts=$ {location}-nodes" lib/checkupdate.yaml
4. Upgrade controller nodes:
sudo ansible-playbook-e"myhosts=$ {location}-controller" lib/yumupdate.yaml
5. Check if all controller nodes are updated:
sudo ansible-playbook-e"myhosts=$ {location}-controller" lib/checkupdate.yaml
For each controller do the following. Make sure cephmon is running without error before starting on the next controller. 1. Check ceph status on cephmon:
sudo ssh iaas@${location}-cephmon-01 'sudo ceph status'
Or, alternatively:
for i in $(seq 1 3); do sudo ssh iaas@${location}-cephmon-0$i 'sudo ceph status' ; done
In addition, check “cephmon-object” in BGO:
for i in $(seq 1 3); do sudo ssh iaas@${location}-cephmon-object-0$i 'sudo ceph status' ; done
2. Turn off the nodes on the controller before reboot:
sudo ansible-playbook-e"myhosts=$ {location}-controller-
˓→manage_nodes.yaml
Monitor through virt-manager or virsh list that all virtual nodes are shut down before proceeding with rebooting the controller.
3. Consider patching Firmware.
4. Reboot the controller node:
sudo ansible-playbook-e"myhosts=$ {location}-controller-
Tip: Check that things work before rebooting controller-04, as error analysis etc. often depends on the virtual nodes running on controller-04.
Non-disruptive patching
These steps can be done without notification and can be done later than normal patching.
Storage
1. Before you begin, you can avoid automatic rebalancing of the ceph cluster during maintenance. Run this command on a cephmon or storage node:
ceph osd set noout
2. Run ceph status continuously in another window on one of the cephmon nodes:
watch ceph status
Before rebooting a node, check that all OSDs are up, e.g.:
osd: 30 osds: 30 up, 30 in
3. Upgrade storage:
sudo ansible-playbook-e"myhosts=$ {location}-storage" lib/yumupdate.yaml
4. Check if the storage nodes are upgraded:
sudo ansible-playbook-e"myhosts=$ {location}-storage" lib/checkupdate.yaml
5. Consider patching Firmware.
6. Reboot one storage node at a time:
sudo ansible-playbook-e"myhosts=$ {location}-
NB! Check ceph status, see above.
7. After all nodes are rebooted, ensure that automatic rebalancing is enabled:
ceph osd unset noout
Compute
Non-disruptive patching is only possible for compute nodes running in the host aggregate central1. Before you start, check the documentation for compute reinstall.
1. You will need an empty compute node first. There will usually be one in the AZ iaas-team-only. Reinstall this first and test it. Disable all other compute nodes and enable the new one.
2. For each compute node, migrate all instances to the enabled compute node (the empty one). Use himlarcli/migrate.py. Then reinstall the newly empty compute node, and start over with the next one.
3. The last compute node will now be empty and can be reinstalled, disabled and added back to the AZ iaas-team-only. Update the trello status for Availability zone / Host aggregate.
Leaf
Only reboot one node at a time, and never if one node is a single point of failure.
Warning: Never patch Cumulus VX (virtual appliance), only physical hardware. Cumulus VX is only used in testing/development.
Upgrade node:
apt-get update
apt-get dist-upgrade
Reboot node.
Compute (dedicated compute resources/HPC)
1. Before we start (3-5 days before) we should notify all users in the aggregate (e.g. hpc1)
himlarcli/mail.py aggregate -s 'Scheduled maintenance 2021-03-13' -t notify/maintenance/hpc.txt --date '2021-03-13 12:00-16:00' hpc1 --debug [--dry-run]
Aggregate to consider patching on second Tuesday of every month:
Aggregate  Region  Template
hpc1       osl     notify/maintenance/hpc.txt
robin1     osl     notify/maintenance/dedicated.txt
shpc_cpu1  bgo     notify/maintenance/shpc.txt
shpc_ram1  bgo     notify/maintenance/shpc.txt
vgpu1      bgo     notify/maintenance/dedicated.txt
vgpu1      osl     notify/maintenance/dedicated.txt
1. Purge state database (once per region):
himlarcli/state.py purge instance
2. Check instance status:
himlarcli/aggregate.py instances
3. Stop instances:
himlarcli/aggregate.py stop-instance
4. Upgrade compute HPC:
sudo ansible-playbook-e"myhosts=$ {location}-compute-hpc" lib/yumupdate.yaml
5. Check if the nodes are upgraded:
sudo ansible-playbook-e"myhosts=$ {location}-compute-hpc" lib/checkupdate.yaml
6. Reboot nodes. Always check inventory to make sure the target of myhosts match the intended targets for reboot. Some hosts might be running in other aggregates:
sudo ansible-playbook-e"myhosts=$ {location}-compute-hpc" lib/reboot.yaml
7. Start the instances:
himlarcli/aggregate.py start-instance
Firmware
For physical nodes it might be worth considering firmware patching.
Dell
1. Install DSU on the node:
sudo ansible-playbook-e"myhosts=$ {location}-
2. Upgrade firmware:
sudo ansible-playbook-e"myhosts=$ {location}-
˓→yaml
3. Reboot:
sudo ansible-playbook-e"myhosts=$ {location}-
Workaround for problematic r740/r740xd BIOS update
BIOS update for PowerEdge r740/r740xd might fail with the message “BIOS File is Corrupt”, and you have to press F1 to boot and then reflash the BIOS. A robust workaround is to flash the BIOS via the iDRAC. First, flash firmware (only) normally:
dsu -n -q --component-type=FRMW
Download the latest BIOS file for the Windows platform from the Dell website to a login node and upload it to the iDRAC, scheduling a BIOS upgrade at next boot:
/opt/dell/srvadmin/bin/idracadm7 -r [bmc_address] -u [username] -p [password] update -f /tmp/BIOS_NVGR9_WN64_2.10.0.EXE
Then reboot.
Supermicro
Supermicro does not recommend flashing firmware unless it is necessary. Also, there is no automated way to do it. If needed, though, download the necessary firmware from the vendor’s website and upload the BIOS or firmware files via the BMC’s update feature. When finished, the server must do a full reset, so it is best to flash the firmware while the server is down (for example sitting in the grub boot menu).
Warning: If the BIOS is flashed, the settings will be lost! Be sure to adjust the settings after flashing, otherwise the server won’t boot.
Testing
After patching, we should test the following (a sketch of these steps is shown after the list):
• install a new instance
• ssh to the new instance
• create a volume and attach it to the instance
• detach the volume
• destroy the volume
• destroy the instance
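A hedged sketch of this smoke test using the OpenStack client; the image, flavor, key and user names are placeholders:
openstack server create --image <image> --flavor <flavor> --key-name <key> patch-test
ssh <user>@<instance-ip>                           # verify login works
openstack volume create --size 1 patch-test-vol
openstack server add volume patch-test patch-test-vol
openstack server remove volume patch-test patch-test-vol
openstack volume delete patch-test-vol
openstack server delete patch-test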
Only in test01 and test02
Reinstall a compute node and repeat the tests above.
Migrate running instances
Last changed: 2019-07-01
This document describes migrating an instance from one compute node (source) to another (target) while the instance is running (i.e. NOT legacy). The aggregate cannot have any NUMA-aware instances (e.g. Alice/Atlas workloads).
Before you start
• Reinstall and test the target • Make sure the target and source are in the same aggregate • Make sure cpu_model (see nova.conf) is the same on target and source (e.g. Haswell-noTSX)
Migrate
We use himlarcli and migrate.py migrate to do the migration.
Tips
• Test first with --limit 1 to see that everything is working
• Use --large on the first run to migrate the instances with lots of RAM first
Help
This document describes using the openstack CLI to migrate an instance to the target and what to do when a migration times out: https://docs.openstack.org/nova/latest/admin/live-migration-usage.html
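For reference, live migration of a single instance can also be done directly with the OpenStack client; a hedged sketch (the exact flags depend on the client version, and the host and UUID values are placeholders):
openstack server migrate --live-migration --host <target-host> <instance-uuid>
openstack server migration list --server <instance-uuid>   # monitor progress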
Himlar CLI
Himlar CLI is a command line tool written in Python to manage our Openstack installation through the API. Source code: https://github.com/norcams/himlarcli
Setup
Location
Himlar CLI can be found under /opt/himlarcli on login-01 and proxy-01 nodes. Config file is distributed with puppet under /etc/himlarcli/config.ini.
Running on login-01
When running scripts on login, you will be able to use some scripts, but not access the keystone admin API. So you will not be able to run the scripts for user or project management here. Since login nodes are used for both test and production locations, there are several config files under /opt/himlarcli/ named config.ini.$loc.
Deployment
There is an ansible playbook and script to update to the latest version on all nodes:
cd
Script overview
Help
All scripts have updated help that can be accessed with the -h option. Most of the script actions also have their own help description, e.g.:
./image.py update -h
Options
Common options for all scripts:
• --debug
• --dry-run
• -c /path/to/config.ini
• --format text|json
Scripts
Simple overview with some of the most important actions. More actions might exist, so please refer to the help text.
script      actions  notes
project.py  list     list all projects
            show     list project details and roles in project
            create   create new project
            delete   delete project and instances
            grant    grant access for user to project
image.py    update   update images (gold or private)
            usage    show/check image usage
            purge    purge unused images
quota.py    update   update default quota
            show     show default quota
usage.py    volume   show volume usage and quota in GB
migrate.py  migrate  migrate all instances from a compute host with shared storage
            list     list all instances with vm state on a compute host
Publish status messages to Slack, Twitter and UH-IaaS status API
Help
Publishing to all our communication channels is done by using a simple script called publish_status.py. Get usage info with the -h option, e.g.:
./publish_status.py -h
./publish_status.py important -h
./publish_status.py info -h
./publish_status.py event -h
There are currently three commands:
• important will publish to Slack, Twitter and the status database displayed on status.uh-iaas.no. Use for important messages about events that will directly or indirectly affect users, like planned outages.
• info will publish to Twitter and the status database. Use for informal messages of general interest, like when there is a new service or image available.
• event will publish to the event database, which is mostly intended for internal usage. Use for technical details about events that affected availability, stability, user experience etc. Publicly available via API but not displayed on our status pages.
Example usage
The following example will publish an important status message to Slack, Twitter and status API. It will be tagged ‘important’ in the status API, meaning it will show up under “New issues and warnings” in our Grafana dashboard. The -l option will add a single whitespace and a hardcoded message linking to our status dashboard on messages going to Slack and Twitter:
$ ./publish_status.py important -m "We're currently having some issues. Sorry for the inconvenience." -l
Which will return:
The following message will be published: We're currently having some issues. Sorry for the inconvenience. For live updates visit https://status.uh-iaas.no
Are you sure you want to publish? (yes|no)?
This has to be interactively confirmed by typing yes. Notice that you should always use punctuation after the last sentence in your message or template. This example uses a pre-made template instead of a command-line argument as message, replacing the variables $date and $region with -d and -r:
./publish_status.py important -t ./misc/notify_maintenance.txt -r bgo -d 'September 20 between 15:00 and 16:00'
Which will return:
The following message will be published: Services will be unavailable on September 20 between 15:00 and 16:00 in BGO due to maintenance. Running instances will not be affected.
Are you sure you want to publish? (yes|no)?
This example will publish an info message about a new image on Twitter and status API. It will be tagged ‘info’ in the status API, meaning it will show up under “News” in our Grafana dashboard:
./publish_status.py info -m 'New image Fedora 28 is available.'
Repository server
Introduction
For local caching of external repositories, and to facilitate a repository for packages created by the NREC team etc., a server system is installed. Because the production environment has to be carefully managed, this setup attempts to resolve the following issues:
• Production servers must only run well-known versions and combinations of software (which is supposed to be tested and approved before deployment)
• Possible to check the state of code at any date in the past for debugging
• Possible to test new code without disturbing the production environment
• Ability to maintain our software in the same manner as any external RPM repository
• Means to distribute all kinds of data files without versioning
To accomplish all of this we have implemented a system for versioned/snapshotted mirrors of any ‘external’ repo (whatever the location), a local ordinary RPM repository and a general distribution point.
• Hostname: iaas-repo.uio.no
• Alias (used in code): download.iaas.uio.no
• Access: as for normal infrastructure nodes (iaas-user from one of the login nodes)
• Repo root directory: /var/www/html/nrec
• Available protocols: https
Note: The implementor accepts the fact that the naming scheme for these directories is misleading! Please read the description before assuming anything related to the role of the directory!
Executive Summary
repo is a locally hosted mirror of a set of external repositories. A snapshot is taken of each repo every night, and these snapshots reside inside the snapshot directory, stamped by date. In the test and prod directories, every repository utilized by the UH-IaaS infrastructure has a pointer to one of those snapshots. Those pointers are never moved without consideration or testing, especially the links in the prod directory. The upshot is thus: packages and files can be trusted not to be updated or altered in an uncontrolled fashion, and are available locally at all times. It is possible to set up further such repos, in case a certain installation requires packages from a very specific date (other than in test and prod).
nrec-internal and aptrepo should be treated like any other external repositories, just that these external repositories happen to be managed by the NREC team. Data configured into these is then available for consumption in the same controlled manner as any other external repository which is mirrored locally.
rpm, nonfree, nonfree/yum-nonfree, nrec-resources and ports are unmanaged repositories without the aforementioned snapshotting and consistency control. Data located here is available instantly, but outside of any version control and without any kind of metadata.
Diagram of setup
Directory description
• repo: Mirror hierarchy. This is where all defined repositories are mirrored. Content is mirrored nightly.
• snapshots: Nightly snapshot of all mirrors under repo. Each snapshot is named by the date and time of creation.
• prod: For each repository a pointer (symbolic link) to a snapshot of the same.
• test: As for prod, but separate links (usually for a more recent snapshot which is supposed to be used for production next).
• nrec-internal (prev. yumrepo): Locally maintained RPM repository. Mirrored under repo as any external repository is. Available for el7 and el8.
• aptrepo: Locally maintained APT repository. Mirrored under repo as any external repository is (named nrec-internal-apt).
• rpm: Generic file distribution. No metadata, versioning, mirroring or snapshotting.
• nonfree: Generic file distribution. No metadata, versioning, mirroring or snapshotting. Only accessible from login and proxy nodes!
• nrec-resources Generic file distribution. No metadata, versioning, mirroring or snapshotting. Only accessible from NREC allocated IP ranges (incl. user instances)! • nonfree/yum-nonfree RPM repository. No versioning, mirroring or snapshotting. Only accessible from login and proxy-nodes! Available for el7 and el8. • gem: Local Ruby Gem distribution. No metadata, versioning, mirroring or snapshotting. • ports: For FreeBSD packages. No metadata, versioning, mirroring or snapshotting.
Common attributes and requirements
Packages built locally by the project are made available for use by storing them in one of the prepared directories, depending on whether the package is to be part of a yum repository or is a stand-alone package or file, and whether it should be exposed to the world or only internally. The iaas group owns all files and directories under the repository root directory; the hierarchy is configured with the set-group-ID bit. Accordingly, all relevant repo operations can (and should) be done as the iaas user. NOTE: Make sure new packages and files have the correct SELinux label:
sudo restorecon
or:
sudo restorecon -R
Detailed descriptions
In addition to the mirror service of true external repositories, the service contains and offers several proper local ones. Each of these serves different purposes, and thus has to be handled and maintained using distinct procedures. This section describes each and how to add and update packages and files.
YUM repository
Directory name: nrec-internal/el[78] (world wide availability), nonfree/yum-nonfree/el[78] (internally available)
For local RPM packages which are maintained in the same way as any external RPM packages from ordinary repositories, there are YUM repos located in nrec-internal and nonfree/yum-nonfree. The former has world wide availability and is versioned/snapshotted, while the latter is only available locally and is additionally not versioned. These files/packages are considered and consumed exactly like any other, external, repository used by the project/code! IMPORTANT: After all file operations, update the repository metadata:
sudo /usr/bin/createrepo <repo-directory>/el[78]
URL: https://download.iaas.uio.no/nrec/nrec-internal
URL: https://download.iaas.uio.no/nrec/nonfree/yum-nonfree
Note: NREC-INTERNAL: This repository is mirrored and snapshotted just like any external repository. As such it can be reached through the test and prod interfaces described elsewhere.
Client configuration (example)
Example of client configuration in a yum repo file under /etc/yum.repos.d/:
[nrec-internal]
name=NREC internal repo
baseurl=https://download.iaas.uio.no/nrec/prod/nrec-internal/el7
enabled=1
gpgcheck=0
priority=10
For the internal (nonfree) repository:
[nrec-nonfree]
name=Internal NREC repository
baseurl=https://download.iaas.uio.no/uh-iaas/nonfree/yum-nonfree/el7
enabled=1
gpgcheck=0
priority=10
APT repository
Directory name: aptrepo
For local APT packages which belong in an ordinary DEB-based repository there is a similar setup as for the above mentioned YUM repository. This is located in aptrepo. These files/packages are then considered and consumed exactly like any other, external, repository used by the project/code. The architectures and codenames supported are described in the distribution file located in the apt subdirectory of the repo-admin GIT repository.
Steps to import packages
1. Save the new package to the incoming subdirectory inside aptrepo
2. Execute the deb repo tool inside the aptrepo directory:
reprepro -b . --confdir /etc/kelda/prod/apt includedeb wheezy incoming/*
(replace wheezy with whatever codename is applicable)
3. Remove the package(s) from the incoming directory
URL: https://download.iaas.uio.no/nrec/aptrepo
Note: This repository is mirrored and snapshotted just like any external repository (named nrec-internal-apt). As such it can be reached through the test and prod interfaces described elsewhere.
Client configuration (example)
Example of client configuration in /etc/apt/sources.list:
deb [trusted=yes] https://download.iaas.uio.no/nrec/prod/nrec-internal-apt wheezy main
Ruby Gem repository
Directory name: gem
Gems which are locally produced or adapted might be installed into this repository. The gems might then be installed through the ‘sensu_gem’ puppet provider or using the --source parameter for gem install.
Steps to import gems
• upload the package into the gems subdirectory
• remove all files named ‘*specs*’ (should be 6 in all)
• remove the quick subdirectory recursively
• run as the iaas user: gem generate_index --update --directory . (ignoring errors)
For the upload procedure, see below.
Standalone file archives
Directory name: rpm, nrec-resources and nonfree
Files (RPM packages or other types) which are needed by the project but which should not or cannot use the local YUM repository can be distributed from the generic archive located under the rpm, nrec-resources or nonfree subdirectory. No additional operations are required, other than ensuring the correct SELinux label as described above.
URL: https://download.iaas.uio.no/nrec/rpm
URL: https://download.iaas.uio.no/nrec/nonfree
URL: https://download.iaas.uio.no/nrec/nrec-resources
The distinction between these is that nonfree is only accessible from a restricted set of IP addresses (at the time of writing the login and proxy nodes), nrec-resources from all NREC allocated ranges (infra and instances), whereas rpm is reachable from the world. The access lists for the restricted areas are maintained in the repo-admin gitolite repository, in the httpd subdirectory.
Upload procedure
Probably the simplest way to upload a file to the rpm (or nonfree) archive is to first place the file on an available web site and then download it into the archive on download:
1. upload the file to a web archive (for instance https://folk.uio.no for UiO affiliated personnel)
2. log in to download from one of the login nodes in the usual manner:
sudo ssh iaas@download.iaas.uio.no
3. cd /var/www/html/nrec/rpm
4. download the file with wget, curl or something similar
Local mirror and snapshot service
To facilitate tight control of the code and files used in our environment, and to ensure availability in case of network or external system outages etc., a local mirror and snapshot service is implemented. Content and description of the included subdirectories:
Short name  Long name        Description                                                      URL
repo        Repository       Latest sync from external sources                                https://download.iaas.uio.no/nrec/repo
snapshots   Snapshots        Regular (usually daily) snapshots of data in repo                https://download.iaas.uio.no/nrec/snapshots
test        Test repo        Pointer to a specific snapshot in time, usually newer than prod  https://download.iaas.uio.no/nrec/test
prod        Production repo  Pointer to a specific snapshot in time with well-tested data, used in production environments  https://download.iaas.uio.no/nrec/prod
Usage is normally as follows:
repo       for development or other use of the most up-to-date code
test       code which is aimed for the next production release
prod       production code
snapshots  can be used to test against code from any specific date in the past
Mirror
Directory: repo
Each mirrored repository is located directly beneath the repo folder. Which “external” repository (which might actually be located locally) is to be mirrored is defined by data in the internal repo-admin git repo (see below for access details). All repositories listed in the file repo.config are attempted accessed and synced. The type of repository - as defined in the configuration file for the appropriate listing - determines what actions are taken on the data. As this is mainly YUM repositories, the appropriate metadata commands are executed to create a proper local repository. Any YUM repo defined in the configuration must have a corresponding repo definition in a suitable file in the yum.repos.d subdirectory (in the git repo!). The mirroring is done once every night by a root cron job. To access the most current data in the mirror, use this URL: https://download.iaas.uio.no/nrec/repo/
This repository also contains the access list configuration for the restricted areas like nonfree and nrec-resources.
Snapshots
Directory: snapshots
Every night a cron job runs to create snapshots of all mirrored repositories (of all kinds). A snapshot subdirectory is created, named by the current date and time. Under this, all repos can be accessed. This way data can be retrieved from any point in the past at which a snapshot was taken.
current: In the snapshots directory there is always a special “snapshot” named current. This entry is at any time linked to the most current snapshot. To access the snapshot library:
https://download.iaas.uio.no/nrec/snapshots/
Note: The snapshot data are created using a system of hardlinks. This way unaltered data is not duplicated, which conserves space considerably.
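The idea can be illustrated with the classic hardlink copy; a minimal sketch of the mechanism (the actual snapshot script may work differently):
# create a snapshot directory where unchanged files are hardlinks into the mirror
cp -al /var/www/html/nrec/repo /var/www/html/nrec/snapshots/$(date +%Y-%m-%d-%H%M)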
Test and prod
Directories: test, prod
All mirrored repos used by UH-IaaS can be accessed through a static and well-known historic version using the test and prod interfaces. By configuring the appropriate files in the internal repo-admin git repo, each repo might have a test and prod pointer linking to a specific snapshot of this repository. NB: each and every mirrored repo can be set up to link to a separate snapshot!
Important: This is the access point to use in the production and test environments!
Configuration
Configuration for the repositories is stored in the internal git repo:
git@git.iaas.uio.no:repo-admin
The iaas user has READ permissions and should be used to pull the configuration to the repository server.
Files
config       Generic configuration (for now the location of the repo root only)
repo.config  Definition of the external repositories to mirror
test.config  Which snapshots and local repositories to point to in test
prod.config  Which snapshots and local repositories to point to in prod
Considerations
• test should never point to a snapshot older than what the corresponding prod is linking to
• Pointers in prod must also exist in test, the rationale being that this somewhat ensures that prod has already been tested. Links in the prod configuration which do not also exist in the test configuration will not be activated (and are removed if they exist)!
• If there is more than one link listed for the same repo, the most current is always the one activated.
• Existing links not listed in the current configuration will be removed!
Update procedure
1. Clone or pull the git repo locally:
git@git.iaas.uio.no:repo-admin
This must be done on a node inside the setup (like the login nodes) due to access restrictions on the local git repo.
2. Edit one or both of prod.config and/or test.config (or any of the other config files), entering or changing entries to reflect the date required (consult the web page for the exact timestamp to use).
3. Commit and push to the central git repo.
4. On osl-login-01 run the ansible job update_repo.yaml:
sudo ansible-playbook -e "myhosts=download" lib/update_repo.yaml
This action pulls the latest config and updates the pointers in test and prod.
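Putting the steps together, a typical update session might look like this (the commit message is illustrative):

# on a login node (inside the setup)
git clone git@git.iaas.uio.no:repo-admin && cd repo-admin
$EDITOR prod.config        # set the desired snapshot timestamp for the repo
git commit -am 'prod: update snapshot pointer'
git push
# then, on osl-login-01
sudo ansible-playbook -e "myhosts=download" lib/update_repo.yaml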
Publicizing procedure
Normal (automatic)
rpm, nonfree, nrec-resources and gem: Files placed inside these locations are instantly accessible, provided correct SELinux labeling. No snapshotting provided! Access lists are set up via the configuration and scripts in the httpd subdirectory of the repo-admin repo documented above.
nrec-internal and aptrepo: Files placed inside these locations are instantly accessible, provided correct SELinux labeling. No snapshotting is provided through this interface! For this, use the SNAPSHOT, TEST or PROD interfaces instead.
repo: Any repositories which are mirrored (including YUMREPO) have new files accessible here after the mirroring job has run during night time. The version available is always the most recent!
snapshots: Every night after mirror job completion a snapshot of the current mirrors is taken. Any of these snapshots are available through this interface below a directory named by the timestamp [YYYY-MM-DD-hhmm]. The most current snapshot is additionally presented as "current".
test, prod: These interfaces should be seen as a static representation of data from specific dates/times. Each mirrored repository (if configured to be listed here) is listed with a link to a specific snapshot of the repo in question. The PROD repository is what is used in the production environment and should never be more recent than TEST (this is actually prohibited by the setup routine for these pointers). Data is available concurrently with the snapshots it is linked to.
Manual routine for instant publicizing
rpm, nonfree (incl. yum-nonfree), gem and ports: Nothing required!
nrec-internal and aptrepo: New files are available through the ordinary interfaces after mirroring and snapshotting. This is usually done nightly, but the routines may be run manually if necessary:
1. sudo /opt/kelda/repoadmin.sh -e prod sync
2. sudo /opt/kelda/repoadmin.sh -e prod snapshot
Caveats
• Any changes in the local YUM or APT repository (nrec-internal resp. aptrepo) are not accessible through the mirror interface (repo) until after the next mirror job (usually during the next night; check the crontab on the mirror server for details). After this, the data should be accessible under the repo link.
• Newly mirrored data is available under the snapshot link only after the next snapshot run (check crontab for details). This is normally scheduled some time after the nightly mirror job.
• Data stored in any of the local repositories is instantly accessible when accessed using the direct URLs as listed above.
Purging of old/unused data
To conserve disk space there is a janitor script which may be used to remove (purge) snapshots which are no longer used:
/usr/local/sbin/snapshot_cleanup.sh
Note: Only snapshots older than the oldest snapshot still referenced by any test or prod pointers may be deleted.
Invocation:

[ sudo ] /usr/local/sbin/snapshot_cleanup.sh [-d|-u] [ -t <timestamp> | -r <repository> ]

-u: print usage text and exit
-d: dry-run (just print what would otherwise be deleted)
-t: purge snapshots older than the timestamp provided. The timestamp format equals the format used by kelda (config fields and snapshot directory naming)
-r: expunge the named repository, complete with mirror and every snapshot of it (but only snapshots of this particular mirror)
NB: -t and -r are mutually exclusive!
If no -t or -r argument is provided, all snapshots older than the oldest still in use are removed! For now there is no automatic invocation, and any cleanup should be done manually. User confirmation is requested. If running as the iaas user, sudo is required.
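For example, a dry run that would purge everything older than a given snapshot (the timestamp is illustrative, in the [YYYY-MM-DD-hhmm] format described above):

# show which snapshots would be purged, without deleting anything
sudo /usr/local/sbin/snapshot_cleanup.sh -d -t 2021-01-01-0300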
SSL certificates
Generation
Generation for *.iaas.uio.no and *.iaas.uib.no is done following each organization's normal process. *.uh-iaas.no is done at UNINETT. For self-signed certificates we use our own Root CA git repo on login: git@git.iaas.uib.no:ca_setup
This repo should also contain all keys, certs and config files for all certs in use.
Naming conventions
Type     Filename
cachain  /etc/pki/tls/certs/cachain.pem
key      /etc/pki/tls/private/
Storage
All three files should be stored on login under their full path, with the root set at /opt/repo/secrets/nodes/
Distribution
There is an ansible playbook to push secrets to nodes (see tasks.md for more info). This requires a YAML config file describing the files and modes. Example of inventory/host_vars/bgo-db-01.yaml:

secret_files:
  cert:
    path: '/etc/pki/tls/certs/db01.bgo.uhdc.no.cert.pem'
    mode: '0644'
    owner: 'root'
    group: 'root'
  key:
    path: '/etc/pki/tls/private/db01.bgo.uhdc.no.key.pem'
    mode: '0644'
    owner: 'root'
    group: 'root'
  cafile:
    path: '/etc/pki/tls/certs/cachain.pem'
    mode: '0640'
    owner: 'root'
    group: 'mysql'
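Pushing the secrets would then follow the same ansible pattern used elsewhere in this document. A sketch only, with a hypothetical playbook name (see tasks.md for the real job):

# playbook name hypothetical; consult tasks.md for the actual job
sudo ansible-playbook -e "myhosts=bgo-db-01" lib/push_secrets.yaml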
SSL overview
Last changed: 2020-02-17

nrec.no

Domain                      Type   Termination         Certificate
access.nrec.no              A      bgo-api-01          *.nrec.no
access-bgo.nrec.no          A      bgo-api-01          *.nrec.no
access-osl.nrec.no          A      osl-api-01          *.nrec.no
report.nrec.no              CNAME  bgo-api-01          *.nrec.no
report-osl.nrec.no          A      bgo-api-01          *.nrec.no
report-bgo.nrec.no          A      osl-api-01          *.nrec.no
status.nrec.no              CNAME  bgo-api-01          *.nrec.no
status-bgo.nrec.no          A      bgo-api-01          *.nrec.no
status-osl.nrec.no          A      osl-api-01          *.nrec.no
request.nrec.no             A      bgo-api-01          *.nrec.no
api.nrec.no                 A      bgo-api-01          *.nrec.no
api-osl.nrec.no             A      osl-api-01          *.nrec.no
docs.nrec.no                CNAME  sslproxy.ha.uib.no  docs.nrec.no
www.nrec.no                 CNAME  sslproxy.ha.uib.no  www.nrec.no
nrec.no                     A      129.177.6.241       nrec.no
dashboard.nrec.no           A      bgo-dashboard-01    dashboard.nrec.no
dashboard-osl.nrec.no       A      osl-dashboard-01    NA
dashboard-bgo.nrec.no       A      bgo-dashboard-01    NA
console.osl.nrec.no         A      osl-api-01          console.osl.nrec.no
compute.api.osl.nrec.no     A      osl-api-01          *.api.osl.nrec.no
identity.api.osl.nrec.no    A      osl-api-01          *.api.osl.nrec.no
network.api.osl.nrec.no     A      osl-api-01          *.api.osl.nrec.no
image.api.osl.nrec.no       A      osl-api-01          *.api.osl.nrec.no
volume.api.osl.nrec.no      A      osl-api-01          *.api.osl.nrec.no
placement.api.osl.nrec.no   A      osl-api-01          *.api.osl.nrec.no
metric.api.osl.nrec.no      A      osl-api-01          *.api.osl.nrec.no
dns.api.osl.nrec.no         A      osl-api-01          *.api.osl.nrec.no
resolver.osl.nrec.no        A      NA                  NA
console.bgo.nrec.no         A      bgo-api-01          console.bgo.nrec.no
compute.api.bgo.nrec.no     A      bgo-api-01          *.api.bgo.nrec.no
identity.api.bgo.nrec.no    A      bgo-api-01          *.api.bgo.nrec.no
network.api.bgo.nrec.no     A      bgo-api-01          *.api.bgo.nrec.no
image.api.bgo.nrec.no       A      bgo-api-01          *.api.bgo.nrec.no
volume.api.bgo.nrec.no      A      bgo-api-01          *.api.bgo.nrec.no
placement.api.bgo.nrec.no   A      bgo-api-01          *.api.bgo.nrec.no
metric.api.bgo.nrec.no      A      bgo-api-01          *.api.bgo.nrec.no
dns.api.bgo.nrec.no         A      bgo-api-01          *.api.bgo.nrec.no
object.api.bgo.nrec.no      A      bgo-api-01          *.api.bgo.nrec.no
resolver.bgo.nrec.no        A      NA                  NA
DNS
We use the domains uh-iaas.no and uhdc.no.
Architecture
The illustration below shows the architecture of the DNS infrastructure. This is within a single location. In our case, with two or more locations, the NS nodes act as master/slave for each other. One NS node is the master for a given zone, and the others act as slaves for that zone.
We have two resolving DNS nodes in each location. They are set up in a redundant fashion where anycast is used to achieve redundancy. The idea is that instances in BGO will use the BGO resolver as primary and the OSL resolver as secondary, and vice versa for OSL instances.
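As a concrete sketch of what this means on an instance, a BGO instance's resolver configuration would look roughly like this, using the resolver addresses from the zone tables below (the search domain is an assumption):

# /etc/resolv.conf on a BGO instance (sketch; search domain assumed)
search bgo.uh-iaas.no
nameserver 158.39.77.252   # resolver.bgo.uh-iaas.no, local region first
nameserver 158.37.63.252   # resolver.osl.uh-iaas.no, remote region fallback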
Public zone: uh-iaas.no
This zone is delegated to: • ns1.uh-iaas.no (master - located in OSL) • ns2.uh-iaas.no (slave - located in BGO) Editing this zone is done via Puppet hieradata. Since the OSL NS-host (ns1.uh-iaas.no) is master for this zone, editing this zone should be done in hieradata/osl/common.yaml. In this zone we have the following records:
RECORD                    TYPE   VALUE
access.uh-iaas.no         A      158.39.77.250
access-bgo.uh-iaas.no     A      158.39.77.250
access-osl.uh-iaas.no     A      158.37.63.250
api.uh-iaas.no            A      158.39.77.250
dashboard.uh-iaas.no      A      158.39.77.254
dashboard-bgo.uh-iaas.no  A      158.39.77.254
dashboard-osl.uh-iaas.no  A      158.37.63.254
ns1.uh-iaas.no            A      158.37.63.251
ns2.uh-iaas.no            A      158.39.77.251
report.uh-iaas.no         A      158.39.77.250
report-bgo.uh-iaas.no     A      158.39.77.250
report-osl.uh-iaas.no     A      158.37.63.250
request.uh-iaas.no        A      158.39.77.250
status.uh-iaas.no         A      158.39.77.250
status-bgo.uh-iaas.no     A      158.39.77.250
status-osl.uh-iaas.no     A      158.37.63.250
ns1.uh-iaas.no            AAAA   2001:700:2:82ff::251
ns2.uh-iaas.no            AAAA   2001:700:2:83ff::251
docs.uh-iaas.no           CNAME  uh-iaas.readthedocs.io
www.uh-iaas.no            CNAME  norcams.github.io
uh-iaas.no                MX     10 uninett-no.mx1.staysecuregroup.com
uh-iaas.no                MX     20 uninett-no.mx2.staysecuregroup.net
Internally delegated zone: osl.uh-iaas.no
This zone is delegated to: • ns1.uh-iaas.no (master - located in OSL) • ns2.uh-iaas.no (slave - located in BGO) Editing this zone is done via Puppet hieradata. Since the OSL NS-host (ns1.uh-iaas.no) is master for this zone, editing this zone should be done in hieradata/osl/common.yaml. In this zone we have the following records:
RECORD                        TYPE  VALUE
compute.api.osl.uh-iaas.no    A     158.37.63.250
identity.api.osl.uh-iaas.no   A     158.37.63.250
image.api.osl.uh-iaas.no      A     158.37.63.250
metric.api.osl.uh-iaas.no     A     158.37.63.250
network.api.osl.uh-iaas.no    A     158.37.63.250
placement.api.osl.uh-iaas.no  A     158.37.63.250
volume.api.osl.uh-iaas.no     A     158.37.63.250
console.osl.uh-iaas.no        A     158.37.63.250
resolver.osl.uh-iaas.no       A     158.37.63.252
resolver.osl.uh-iaas.no       AAAA  2001:700:2:82ff::252
Internally delegated zone: bgo.uh-iaas.no
This zone is delegated to: • ns1.uh-iaas.no (slave - located in OSL) • ns2.uh-iaas.no (master - located in BGO) Editing this zone is done via Puppet hieradata. Since the BGO NS-host (ns2.uh-iaas.no) is master for this zone, editing this zone should be done in hieradata/bgo/common.yaml. In this zone we have the following records:
RECORD                        TYPE  VALUE
compute.api.bgo.uh-iaas.no    A     158.39.77.250
identity.api.bgo.uh-iaas.no   A     158.39.77.250
image.api.bgo.uh-iaas.no      A     158.39.77.250
metric.api.bgo.uh-iaas.no     A     158.39.77.250
network.api.bgo.uh-iaas.no    A     158.39.77.250
placement.api.bgo.uh-iaas.no  A     158.39.77.250
volume.api.bgo.uh-iaas.no     A     158.39.77.250
console.bgo.uh-iaas.no        A     158.39.77.250
resolver.bgo.uh-iaas.no       A     158.39.77.252
resolver.bgo.uh-iaas.no       AAAA  2001:700:2:83ff::252
Internal zone: uhdc.no
The following zones in uhdc.no have been delegated to us:
ZONE            NS
bgo.uhdc.no     ns1.uh-iaas.no (slave), ns2.uh-iaas.no (master)
osl.uhdc.no     ns1.uh-iaas.no (master), ns2.uh-iaas.no (slave)
test01.uhdc.no  alf.uib.no, begonia.uib.no
test02.uhdc.no  ns1.uio.no, ns2.uio.no
Editing these zones is done via Puppet hieradata. Most of the records should be present in hieradata/common/common.yaml.
Test domains
For the public domain in test we have delegated subdomains:
REGION  DOMAIN
test01  test.iaas.uib.no
test02  test.iaas.uio.no
OpenStack Designate (DNS)
Warning: This document is under construction. Designate is not yet in production.
Managing top-level domains (TLDs)
We manage a list of legal top-level domains in which users can create their zones. With such a list in place, zones may only be created under the TLDs listed there. Without such a list, any TLD would be legal, and users could also register zones such as "com", thus preventing any other user from creating a domain under the ".com" TLD. A list of all valid top-level domains is maintained by the Internet Assigned Numbers Authority (IANA) and is updated from time to time. The list is available here:
• http://data.iana.org/TLD/tlds-alpha-by-domain.txt
This list can be imported into Designate via the himlarcli command dns.py:
# download the TLD list from IANA
curl http://data.iana.org/TLD/tlds-alpha-by-domain.txt -o /tmp/tlds-alpha-by-domain.txt

# import the TLD list into designate
./dns.py tld_import --file /tmp/tlds-alpha-by-domain.txt
The tld_import action will add any TLDs which aren't already registered, with a special comment that marks the TLD as being bulk imported. Any registered TLDs that aren't on the import list, and also have the special bulk import comment, will be deleted. This means that we can add our own TLDs if needed, as they will not be deleted by the bulk import. The bulk import as shown above is designed to be run automatically by cron etc. The himlarcli command also has actions to create, delete and update a TLD, and to view the list or a specific TLD.
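A minimal sketch of how the nightly bulk import could be wired up in cron; the schedule and the himlarcli path are illustrative:

# root crontab entry (time and paths illustrative)
30 5 * * *  curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt -o /tmp/tlds-alpha-by-domain.txt && /opt/himlarcli/dns.py tld_import --file /tmp/tlds-alpha-by-domain.txt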
Managing blacklists
Blacklisting is done to prevent users from creating domains that we don't want them to create. This is mostly done to protect a domain from possible future use. To prevent any user from creating the domain foo.com and any subdomains:
./dns.py blacklist_create \
    --comment 'Protect domain foo.com including any subdomains' \
    --pattern '^([A-Za-z0-9_\-]+\.)*foo\.com\.$'
If we want to only protect the domain itself, but allow users to create subdomains in the domain, we can use a simpler pattern:
130 Chapter 1. Contents iaas Documentation, Release 0.1.0
./dns.py blacklist_create \
    --comment 'Protect domain foo.com while allowing subdomains' \
    --pattern '^foo\.com\.$'
Listing the blacklists is done via the blacklist_list action (the option --pretty formats the output in a table). Example:
$ ./dns.py blacklist_list --pretty
+----------------------------------+---------------------------------------------------+--------------------------------------+
| pattern                          | description                                       | id                                   |
+----------------------------------+---------------------------------------------------+--------------------------------------+
| ^([A-Za-z0-9_\-]+\.)*foo\.com\.$ | Protect domain foo.com including any subdomains   | 8958bf52-8e64-4a86-87ea-2087b7bc6d60 |
| ^bar\.net\.$                     | Protect domain bar.net while allowing subdomains  | b3f7fc9f-67a8-4d07-aabc-444f0e4d67c4 |
+----------------------------------+---------------------------------------------------+--------------------------------------+
Updating a blacklist entry is done via the blacklist_update action. Updating the comment (example):
$ ./dns.py blacklist_update --comment 'new comment' \
    --id b3f7fc9f-67a8-4d07-aabc-444f0e4d67c4
And updating the pattern (example):
$ ./dns.py blacklist_update --new-pattern '^tralala\.org\.$' \
    --id b3f7fc9f-67a8-4d07-aabc-444f0e4d67c4
Deleting a blacklist entry is done via the blacklist_delete action:
./dns.py blacklist_delete --id <id>
Blacklisted domains
The following domains should be blacklisted in production:
DOMAIN       PATTERN                              COMMENT
uh-iaas.no   ^([A-Za-z0-9_\-]+\.)*uh-iaas\.no\.$  Official UH-IaaS domain, managed outside of Openstack Designate
uhdc.no      ^([A-Za-z0-9_\-]+\.)*uhdc\.no\.$     Official UH-IaaS domain, managed outside of Openstack Designate
uio.no       ^([A-Za-z0-9_\-]+\.)*uio\.no\.$      Domain belonging to UiO. Instances in UH-IaaS are not allowed to have UiO addresses
uib.no       ^uib\.no\.$                          Domain belonging to UiB. The domain itself is forbidden, but subdomains are allowed
uiocloud.no  ^uiocloud\.no\.$                     Domain belonging to UiO. The domain itself is forbidden, but subdomains are allowed
uninett.no   ^([A-Za-z0-9_\-]+\.)*uninett\.no\.$  Domain belonging to Uninett. Reserved for possible future use
ntnu.no      ^([A-Za-z0-9_\-]+\.)*ntnu\.no\.$     Domain belonging to NTNU. Reserved for possible future use
uia.no       ^([A-Za-z0-9_\-]+\.)*uia\.no\.$      Domain belonging to UiA. Reserved for possible future use
These are added with these commands:
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*uh-iaas\.no\.$' \
    --comment 'Official UH-IaaS domain, managed outside of Openstack Designate'
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*uhdc\.no\.$' \
    --comment 'Official UH-IaaS domain, managed outside of Openstack Designate'
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*uio\.no\.$' \
    --comment 'Domain belonging to UiO. Instances in UH-IaaS are not allowed to have UiO addresses'
./dns.py blacklist_create --pattern '^uib\.no\.$' \
    --comment 'Domain belonging to UiB. The domain itself is forbidden, but subdomains are allowed'
./dns.py blacklist_create --pattern '^uiocloud\.no\.$' \
    --comment 'Domain belonging to UiO. The domain itself is forbidden, but subdomains are allowed'
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*uninett\.no\.$' \
    --comment 'Domain belonging to Uninett. Reserved for possible future use'
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*ntnu\.no\.$' \
    --comment 'Domain belonging to NTNU. Reserved for possible future use'
./dns.py blacklist_create --pattern '^([A-Za-z0-9_\-]+\.)*uia\.no\.$' \
    --comment 'Domain belonging to UiA. Reserved for possible future use'
ETCD
How to fix etcd cluster
The etcd cluster needs to be reconfigured from time to time, typically when a network node is reinstalled. In order to include a newly installed (or otherwise reset) node into an existing cluster, some action is required.
With Ansible
There’s an Ansible playbook available that automates all the steps required to bring a reinstalled node back into the cluster. Check the following before you run it: • The node must be fully provisioned by Puppet after reinstall. etcd will fail, which is why we have this playbook, but the node must otherwise be properly installed. • The node needs a script that we deploy with Puppet located in /usr/local/sbin/bootstrap-etcd-member • There should be no empty variables in the script. If there are, check for missing hieradata. The playbook needs two variables: • ‘member’ is the hostname of the node we’re bringing back into the cluster. • ‘manage_from’ is the hostname of another node in a functioning cluster where we delete the old member and re-add the newly installed one. The playbook will figure out the rest. This example re-adds bgo-network-02 into the cluster, managed from bgo-network-01: sudo ansible-playbook-e'member=bgo-network-02 manage_from=bgo-network-01' lib/
˓→reconfigure_etcd_cluster.yaml
If successful, all tasks should run without errors except ‘TASK [Bootstrap member]’ which is supposed to fail. This task runs etcd in the foreground asynchronously, which is needed for bootstrapping, and exits after 30 seconds. Then it runs Puppet, which will start etcd as a systemd service.
Expanding from a single-node cluster
The above playbook will bring a reinstalled node back into an already existing multi-node cluster. However, expanding from a single-node cluster into a multi-node one requires a slightly different procedure. First, the initial-cluster parameter must contain all the nodes present when bootstrapping. This is configured by Puppet, including the script we use for bootstrapping. Expand the cluster node by node: starting with the initial single-node cluster, add a second node in the configuration, then use the Ansible playbook called expand_etcd_cluster:

sudo ansible-playbook -e 'member=bgo-network-02 manage_from=bgo-network-01 member_ip=172.18.0.72' lib/expand_etcd_cluster.yaml
We need to provide the IP address of the new member since we cannot fetch it from etcdctl. If successful, all tasks should run without errors except 'TASK [Bootstrap member]', which is supposed to fail. This task runs etcd in the foreground asynchronously, which is needed for bootstrapping, and exits after 30 seconds. Then it runs Puppet, which will start etcd as a systemd service. Check the etcd cluster by running the following command on one of the nodes:

etcdctl cluster-health

which should report both nodes as healthy. Then proceed with the next node.
Fix compute nodes after etcd outage
If the etcd cluster running on the network nodes has been unresponsive for an extended period of time, the following services need to be restarted on the compute nodes, after verifying that the cluster is healthy:
• etcd
• etcd-grpc-proxy
• calico-dhcp-agent
• calico-felix
• openstack-nova-compute
• openstack-nova-metadata-api
We have an Ansible playbook for this task:

sudo ansible-playbook -e 'myhosts=bgo-compute' lib/restart_etcd_compute.yaml <- Fix!
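If the playbook is unavailable, the same restarts can be done by hand on a single compute node. A sketch only, using the service list above:

# restart the etcd-related and nova services, in the order listed above
for svc in etcd etcd-grpc-proxy calico-dhcp-agent calico-felix \
           openstack-nova-compute openstack-nova-metadata-api; do
    sudo systemctl restart "$svc"
done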
Manually
If Ansible fails for some reason, you can try to do the procedure manually. Following the official docs is usually sufficient, available here: https://coreos.com/etcd/docs/latest/v2/runtime-configuration.html
Keep the following in mind:
• Delete the old member from the cluster first, then add the new one.
• When you add the new member, etcdctl returns three environment variables you must set on the node you're bootstrapping.
• Make sure the etcd service is stopped and /var/lib/etcd is empty on the node you're bootstrapping.
• Disable the puppet agent until you're done.
• The etcd commands on the node you're bootstrapping must be run as the etcd user.
Bootstrapping example (bgo-network-01):

etcd --listen-client-urls http://0.0.0.0:2379,http://127.0.0.1:4001 \
     --advertise-client-urls http://172.18.0.71:2379 \
     --listen-peer-urls http://0.0.0.0:2380 \
     --initial-advertise-peer-urls http://172.18.0.71:2380 \
     --data-dir /var/lib/etcd/bgo-network-01.etcd
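For the "delete old member, add new member" step itself, a minimal etcdctl sketch, run on a healthy cluster member (the member ID is illustrative):

# find the ID of the dead member, remove it, then add the new one
etcdctl member list
etcdctl member remove 6e3bd23ae5f1eae0          # ID illustrative
etcdctl member add bgo-network-02 http://172.18.0.72:2380
# etcdctl prints ETCD_NAME, ETCD_INITIAL_CLUSTER and ETCD_INITIAL_CLUSTER_STATE;
# export these on the new node before running the bootstrap command below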
After bootstrapping, stop the etcd process running in the foreground, enable puppet and initiate a puppet run. This will start the etcd service, and you're good to go.

uib-ha
In location uib we are running 6 nodes (2 test and 4 prod) with haproxy and corosync/pacemaker. These nodes run load balancing/HA for UiB-only services, but can also be used for UH-IaaS services. Access to the nodes is only available through bgo-login-01.
Domains
The service is set up with two different domains:
• uibproxy.ha.uib.no (use this for UiB-only services)
• pubproxy.ha.uib.no (use this for public services)
• sslproxy.ha.uib.no (use this for public services with ssl termination)
• gridproxy.ha.uib.no (used for the NetApp grid uib service)
gridproxy runs on dedicated hardware (uib-ha-grid-01 and uib-ha-grid-02).
Test domains:
• uibtestproxy.ha.uib.no
• pubtestproxy.ha.uib.no
• ssltestproxy.ha.uib.no
Each domain uses 2 public IPs with round-robin A-records.
New service
To set up a new service using uib-ha you need to follow two steps (see the sketch below):
1. Add your domain to the domain list hash in himlar/hieradata/uib/ha.yaml
2. Create a CNAME record pointing to one of the two A-record domains above
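A sketch of step 2, as a zone-file record for a hypothetical service name pointing at the public proxy domain:

; hypothetical service CNAME, pointing at one of the round-robin proxy domains
myservice.example.org.  IN  CNAME  pubproxy.ha.uib.no.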
Deployment
The cluster runs himlar, but without any admin node. Deploy with ansible:

sudo ansible-playbook -e "myhosts=uib-ha-test" lib/deploy_himlar.yaml
Run puppet:

cd /opt/himlar
provision/puppetrun.sh
Management
Some tips and tricks to manage the cluster (each cluster consists of two nodes). Check status:

pcs status
Set a node in standby (for patching or rebooting):
Set node back to online:
pcs node unstandby
Hardware maintenance
Dell PowerEdge servers
Upgrading firmware
Normal procedure:
1. Run Dell System Update:
dsu
2. Press a + ENTER to select all available updates
3. Press c + ENTER to commit and start the upgrade
4. When it finishes, reboot the server:
reboot
On older servers (tested on 11th gen), like those we have in test, upgrading firmware on storage components may fail. If this happens:
1. Fix libs:
cd /opt/dell/srvadmin/lib64/
ln -fs libstorelibir-3.so libstorelibir.so.5
2. Run dsu again:
dsu
3. Press a + ENTER to select all available updates
4. Press c + ENTER to commit and start the upgrade
5. Return libs to default:
cd /opt/dell/srvadmin/lib64/
ln -fs libstorelibir.so.5.07-0 libstorelibir.so.5
6. Verify that libs are returned to normal:
ls -l /opt/dell/srvadmin/lib64/libstorelibir.so.5
7. Reboot:
reboot
Fix corrupt bios after firmware update
1. Get the service tag for the machine with the corrupt BIOS
2. Download the new BIOS from https://www.dell.com/support/home/no-no
3. Store the EXE file on a machine with oob-net access (e.g. login)
4. Make sure idracadm7 is installed and run as root:
/opt/dell/srvadmin/bin/idracadm7 -r <iDRAC address> -f <EXE file>
Setting a proper name on the iDRAC:

racadm config -g cfgLanNetworking -o cfgDNSRacName $(hostname -s)
Hardware inventory
An easy way to get the inventory of a Dell server is to run the monitoring plugin in debug mode, from the monitor server:
/usr/lib64/nagios/plugins/check_openmanage -H test02-controller-01 -d

iDRAC reset
If we have problems with the iDRAC, e.g. the redfish endpoint does not respond, we can reset the iDRAC from the host OS:

yum install srvadmin-idrac.x86_64
/opt/dell/srvadmin/bin/idracadm7 racreset
Report API
Last updated: 2020-06-17
We are running a REST API on https://report.nrec.no with internally developed endpoints for a few utilities we need. It is set up with python-flask and Swagger (now OpenAPI). The DB backend runs on the db-global node with master-master replication between BGO and OSL. A quick HTML overview of the report design in Archimate. Original archimate file
Endpoints
See documentation of all endpoints here.
Instance
Used to collect instance information (POST) using report-utils, and to read instance owner and information (GET).
Status
Used to store NREC status messages. These messages are shown at https://status.nrec.no and are posted by notification scripts in himlarcli.
Security
The API also implements a simple OAuth2 Bearer token authentication. The tokens are manually created and deleted (revoked), and are stored hashed with bcrypt in the DB backend.
Scopes
• read = used to access the GET instance API
• admin = used to access the POST status API
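A request against the API would then look roughly like this; the token is a placeholder and the endpoint path is illustrative (see the endpoint documentation for the real paths):

# read-scoped token, querying instance information (path illustrative)
curl -H 'Authorization: Bearer <token>' https://report.nrec.no/api/instance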
Report UTILS
Last updated: 2021-06-18
The clients (as set up on NREC GOLD images) have a systemd service set which downloads and runs a script. The script collects data about the instance and reports back to the Report API. The data collected includes kernel versions, uptime and so on.
Implementation - client (instance)
The GOLD images get a wrapper script (/usr/local/sbin/report_wrapper.sh) installed together with a systemd service (report.service) and timer (service.timer). The timer starts the service at intervals, which in turn executes the wrapper script. The wrapper's only job is to download the latest version of the report.sh script and execute it. The report script is the collector, which gathers all the data and delivers it to the API.
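A minimal sketch of such a service/timer pair; only the unit names are from the setup above, and the unit contents and interval are illustrative:

# report.service (contents illustrative)
[Unit]
Description=NREC report client

[Service]
Type=oneshot
ExecStart=/usr/local/sbin/report_wrapper.sh

# service.timer (contents illustrative)
[Unit]
Description=Run the NREC report client at intervals

[Timer]
OnCalendar=hourly
Unit=report.service
Persistent=true

[Install]
WantedBy=timers.target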
report script distribution
As mentioned above, the wrapper script (started by systemd) downloads the actual report script from a web service on the report nodes (mainly bgo-report-01). This service was previously set up by a separate git module (norcams/report-utils). This is now OBSOLETE, and the service is set up by Puppet and configured through hieradata.
puppet code
The web serving area, containing all the scripts and links, is set up by the Puppet code in profile/manifest/applications/report.pp. It mainly creates directories, scripts and links as configured in the variable report_utils (see below). The content of the files is fetched from files contained in the Puppet module, and ordered as configured in the variable.
Script code
The scripts are joined fragments, which are all stored inside the module's files area (profile/files/application/report). The fragments to select are set up in the report_utils variable, including any subdirectories from this point on. The ordering is relevant!
Configuration
Most of the setup is configured through the variable report_utils, mainly using Hieradata: profile::application::report::report_utils in the report.yaml role file.
Structure