VMware Tanzu Kubernetes Grid

VMware Tanzu Kubernetes Grid 1.3

You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/

VMware, Inc. 3401 Hillview Ave. Palo Alto, CA 94304 www.vmware.com

© Copyright 2021 VMware, Inc. All rights reserved. Copyright and trademark information.

Contents

1 VMware Tanzu Kubernetes Grid 1.3 Documentation
Tanzu Kubernetes Grid Architecture; Use the Tanzu Kubernetes Grid Documentation; Intended Audience

2 Tanzu Kubernetes Grid Concepts
Management Cluster; Tanzu Kubernetes Clusters; Tanzu Kubernetes Cluster Plans; Shared and In-Cluster Services; Tanzu Kubernetes Grid Instance; Bootstrap Machine; Tanzu Kubernetes Grid Installer; Tanzu Kubernetes Grid and Cluster Upgrades

3 Install the Tanzu CLI and Other Tools
Prerequisites; Download and Unpack the Tanzu CLI and kubectl; Install the Tanzu CLI; Install the Tanzu CLI Plugins; Tanzu CLI Help; Install kubectl; What to Do Next; Install the Carvel Tools; Install ytt; Install kapp; Install kbld; Install imgpkg; Tanzu CLI Command Reference; Table of Equivalents; Tanzu CLI Configuration File Variable Reference; Common Variables for All Infrastructure Providers; vSphere; Amazon EC2; Microsoft Azure; Customizing Clusters, Plans, and Extensions with ytt Overlays; Clusters and Cluster Plans; Extensions and Shared Services

4 Deploying Management Clusters
Overview; Installer UI vs. CLI; Platforms; Configuring the Management Cluster; What Happens When You Create a Management Cluster; Core Add-ons; Prepare to Deploy Management Clusters; Prepare to Deploy Management Clusters to vSphere; Prepare to Deploy Management Clusters to Amazon EC2; Prepare to Deploy Management Clusters to Microsoft Azure; Enabling Identity Management in Tanzu Kubernetes Grid; Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment; Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch; Prepare a vSphere Management as a Service Infrastructure; Deploy Management Clusters with the Installer Interface; Prerequisites; Set the TKG_BOM_CUSTOM_IMAGE_TAG; Start the Installer Interface; Configure the Infrastructure Provider; Configure the Management Cluster Settings; (vSphere Only) Configure VMware NSX Advanced Load Balancer; Configure Metadata; (vSphere Only) Configure Resources; Configure the Kubernetes Network and Proxies; Configure Identity Management; (vSphere Only) Select the Base OS Image; Register with Tanzu Mission Control; Finalize the Deployment; What to Do Next; Deploy Management Clusters from a Configuration File; Prerequisites; Create the Cluster Configuration File; (v1.3.1 Only) Set the TKG_BOM_CUSTOM_IMAGE_TAG; Run the tanzu management-cluster create Command; What to Do Next; Create a Management Cluster Configuration File; Configure Identity Management After Management Cluster Deployment; Prerequisites; Connect kubectl to the Management Cluster; Check the Status of an OIDC Identity Management Service; Check the Status of an LDAP Identity Management Service; Provide the Callback URI to the OIDC Provider; Generate a kubeconfig to Allow Authenticated Users to Connect to the Management Cluster; Create a Role Binding on the Management Cluster; Examine the Management Cluster Deployment; Management Cluster Networking; Configure DHCP Reservations for the Control Plane Nodes (vSphere Only); Verify the Deployment of the Management Cluster; Retrieve Management Cluster kubeconfig; What to Do Next

5 Deploying Tanzu Kubernetes Clusters
About Tanzu Kubernetes Clusters; Tanzu Kubernetes Clusters, kubectl, and kubeconfig; Using the Tanzu CLI to Create and Manage Clusters in vSphere with Tanzu; Deploy Tanzu Kubernetes Clusters; Prerequisites for Cluster Deployment; Create a Tanzu Kubernetes Cluster Configuration File; Deploy a Tanzu Kubernetes Cluster with Minimum Configuration; Deploy a Cluster with Different Numbers of Control Plane and Worker Nodes; Configure Common Settings; Deploy a Cluster in a Specific Namespace; Create Tanzu Kubernetes Cluster Manifest Files; Advanced Configuration of Tanzu Kubernetes Clusters; What to Do Next; Deploy Tanzu Kubernetes Clusters to vSphere; Tanzu Kubernetes Cluster Template; Deploy a Cluster with a Custom OVA Image; Configure DHCP Reservations for the Control Plane Nodes; What to Do Next; Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster; Prerequisites; Step 1: Add the Supervisor Cluster; Deploy Tanzu Kubernetes Clusters to Amazon EC2; Tanzu Kubernetes Cluster Template; Tanzu Kubernetes Cluster Plans and Node Distribution across AZs; Deploy a Cluster that Shares a VPC and NAT Gateway(s) with the Management Cluster; Deploy a Cluster to an Existing VPC and Add Subnet Tags; Deploy a Prod Cluster from a Dev Management Cluster; What to Do Next; Deploy Tanzu Kubernetes Clusters to Azure; Create a Network Security Group for Each Cluster; Azure Private Clusters; Tanzu Kubernetes Cluster Template; What to Do Next; Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions; List Available Versions; List Available Upgrades; How Tanzu Kubernetes Grid Updates Kubernetes Versions; Deploy a Cluster with a Non-Default Kubernetes Version; Deploy a Cluster with an Alternate OS or Custom Machine Image; Customize Tanzu Kubernetes Cluster Networking; Deploy a Cluster with a Non-Default CNI; Deploy Pods with Routable, No-NAT IP Addresses (NSX-T); Create Persistent Volumes with Storage Classes; Overview: PersistentVolume, PersistentVolumeClaim, and StorageClass; Supported Storage Types; Default Storage Classes; Set Up CNS and Create a Storage Policy (vSphere); Create a Custom Storage Class; Use a Custom Storage Class in a Cluster; Enable Offline Volume Expansion for vSphere CSI (vSphere 7); Configure Tanzu Kubernetes Plans and Clusters; Where Cluster Configuration Values Come From; Files to Edit, Files to Leave Alone; Configuration Precedence Order; ytt Overlays

6 Managing Cluster Lifecycles
Manage Your Management Clusters; List Management Clusters and Change Context; See Management Cluster Details; Management Clusters, kubectl, and kubeconfig; Management Clusters and Their Configuration Files; Add Existing Management Clusters to Your Tanzu CLI; Delete Management Clusters from Your Tanzu CLI Configuration; Scale Management Clusters; Update Management Cluster Credentials (vSphere); Manage Participation in CEIP; Create Namespaces in the Management Cluster; Delete Management Clusters; What to Do Next; Managing Participation in CEIP; Opt In or Opt Out of the VMware CEIP; Add Entitlement Account Number and Environment Type to Telemetry Profile; Identify the Entitlement Account Number; Update the Management Cluster; Enable Identity Management After Management Cluster Deployment; Overview; Obtain Your Identity Provider Details; Generate a Kubernetes Secret for the Pinniped Add-on; Check the Status of the Identity Management Service; (OIDC Only) Provide the Callback URI to the OIDC Provider; Generate a Non-Admin kubeconfig; Create Role Bindings for Your Management Cluster Users; Enable Identity Management in Workload Clusters; Connect to and Examine Tanzu Kubernetes Clusters; Obtain Lists of Deployed Tanzu Kubernetes Clusters; Export Tanzu Kubernetes Cluster Details to a File; Retrieve Tanzu Kubernetes Cluster kubeconfig; Authenticate Connections to a Workload Cluster; Configure a Role Binding on a Workload Cluster; Examine the Deployed Cluster; Access a Workload Cluster as a Standard User; Scale Tanzu Kubernetes Clusters; Scale Worker Nodes with Cluster Autoscaler; Scale a Cluster Horizontally With the Tanzu CLI; Scale a Cluster Vertically With kubectl; Update and Troubleshoot Core Add-On Configuration; Default Core Add-On Configuration; Updating and Troubleshooting Core Add-on Configuration; Tanzu Kubernetes Cluster Secrets; Update Management and Workload Cluster Credentials (vSphere); Update Workload Cluster Credentials (vSphere); Trust Custom CA Certificates on Cluster Nodes; Configure Machine Health Checks for Tanzu Kubernetes Clusters; About MachineHealthCheck; Create or Update a MachineHealthCheck; Retrieve a MachineHealthCheck; Delete a MachineHealthCheck; Back Up and Restore Clusters; Setup Overview; Install the Velero CLI; Set Up a Storage Provider; Deploy Velero Server to Clusters; vSphere Backup and Restore; AWS Backup and Restore; Azure Backup and Restore; Delete Tanzu Kubernetes Clusters; Step One: List Clusters; Step Two: Delete Volumes and Services; Step Three: Delete Cluster

7 Deploying and Managing Extensions and Shared Services
Locations and Dependencies; Preparing to Deploy the Extensions; Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle; Install Cert Manager on Workload Clusters; Create a Shared Services Cluster; Add Certificates to the Kapp Controller; Implementing Ingress Control with Contour; Prerequisites; Prepare the Tanzu Kubernetes Cluster for Contour Deployment; Deploy Contour on the Tanzu Kubernetes Cluster; Access the Envoy Administration Interface Remotely; Visualize the Internal Contour Directed Acyclic Graph (DAG); Optional Configuration; Update a Contour Deployment; Implementing Log Forwarding with Fluent Bit; Prerequisites; Prepare the Cluster for Fluent Bit Deployment; Prepare the Fluent Bit Configuration File for an Elastic Search Output Plugin; Prepare the Fluent Bit Configuration File for a Kafka Output Plugin; Prepare the Fluent Bit Configuration File for a Splunk Output Plugin; Prepare the Fluent Bit Configuration File for an HTTP Endpoint Output Plugin; Prepare the Fluent Bit Configuration File for a Syslog Output Plugin; Deploy the Fluent Bit Extension; Monitoring and Viewing Logs; Optional Configuration; Update a Running Fluent Bit Deployment; Implementing Monitoring with Prometheus and Grafana; Deploy Prometheus on Tanzu Kubernetes Clusters; Deploy Grafana on Tanzu Kubernetes Clusters; Implementing Service Discovery with External DNS; Prerequisites; Prepare a Cluster for External DNS Deployment; Choose the External DNS Provider; Deploy Harbor Registry as a Shared Service; Using the Harbor Shared Service in Internet-Restricted Environments; Harbor Registry and External DNS; Prerequisites; Prepare a Shared Services Cluster for Harbor Deployment; Deploy Harbor Extension on the Shared Services Cluster; Connect to the Harbor User Interface; Push and Pull Images to and from the Harbor Extension; Push the Tanzu Kubernetes Grid Images into the Harbor Registry; ytt Overlays and Example: Clean Up S3 and Trust Let's Encrypt; Update a Running Harbor Deployment; Implementing User Authentication; Delete Tanzu Kubernetes Grid Extensions; Prepare to Delete Extensions; Delete the Contour Extension; Delete the Fluent Bit Extension; Delete the Prometheus and Grafana Extensions; Delete the External DNS Extension; Delete the Harbor Extension; Delete the Dex and Gangway Extensions; Delete the Extensions Utilities

8 Building Machine Images
Overview: Kubernetes Image Builder; Custom Images Replace Default Images; Cluster API; Build a Custom Machine Image; Prerequisites; Procedure; Use a Custom Machine Image

9 Upgrading Tanzu Kubernetes Grid
Prerequisites; Procedure; Download and Install the Tanzu CLI; Import Configuration Files from Existing v1.2 Management Clusters; Replace v1.2 tkg Commands with tanzu Commands; Prepare to Upgrade Clusters on vSphere; VMware Cloud on AWS SDDC Compatibility; Prepare to Upgrade Clusters on Amazon EC2; Prepare to Upgrade Clusters on Azure; Set the TKG_BOM_CUSTOM_IMAGE_TAG; Upgrade Management Clusters; Upgrade Workload Clusters; Upgrade the Tanzu Kubernetes Grid Extensions; Register Core Add-ons; Upgrade Crash Recovery and Diagnostics; Install NSX Advanced Load Balancer After Tanzu Kubernetes Grid Upgrade (vSphere); What to Do Next; Upgrade Management Clusters; Prerequisites; Procedure; Replace Connectivity API with a Load Balancer; Update the Callback URL for Management Clusters with OIDC Authentication; What to Do Next; Upgrade Tanzu Kubernetes Clusters; Prerequisites; Procedure; What to Do Next; Upgrade Tanzu Kubernetes Grid Extensions; Considerations for Upgrading Extensions from v1.2.x to v1.3.x; Prerequisites; Upgrade the Contour Extension; Upgrade the Fluent Bit Extension; Upgrade the Prometheus Extension; Register Core Add-ons; About Add-on Lifecycle Management in Tanzu Kubernetes Grid; Prerequisites; Register the Core Add-ons; Select an OS During Cluster Upgrade; Upgrade vSphere Deployments in an Internet-Restricted Environment

10 Tanzu Kubernetes Grid Security and Networking
Ports and Protocols; Tanzu Kubernetes Grid Firewall Rules; CIS Benchmarking for Clusters

11 Tanzu Kubernetes Grid Logs and Troubleshooting
Access the Tanzu Kubernetes Grid Logs; Access Management Cluster Deployment Logs; Monitor Tanzu Kubernetes Cluster Deployments in Cluster API Logs; Audit Logging; Overview; Kubernetes Audit Logs; System Audit Logs for Nodes; Troubleshooting Tips for Tanzu Kubernetes Grid; Clean Up After an Unsuccessful Management Cluster Deployment; Delete Users, Contexts, and Clusters with kubectl; Kind Cluster Remains after Deleting Management Cluster; Failed Validation, Credentials Error on Amazon EC2; Failed Validation, Legal Terms Error on Azure; Deploying a Tanzu Kubernetes Cluster Times Out, but the Cluster Is Created; Pods Are Stuck in Pending on Cluster Due to vCenter Connectivity; Tanzu Kubernetes Grid UI Does Not Display Correctly on Windows; Running tanzu management-cluster create on macOS Results in kubectl Version Error; Connect to Cluster Nodes with SSH; Recover Management Cluster Credentials; Restore ~/.tanzu Directory; Disable nfs-utils on Photon OS Nodes; Requests to NSX Advanced Load Balancer VIP fail with the message no route to host; Troubleshooting Tanzu Kubernetes Clusters with Crash Diagnostics; Install or Upgrade the Crashd Binary; Run Crashd on Photon OS Tanzu Kubernetes Grid Clusters; Use an Existing Bootstrap Cluster to Deploy Management Clusters; Identify or Create a Local Bootstrap Cluster; Deploy a Management Cluster with an Existing Bootstrap Cluster; Delete a Management Cluster with an Existing Bootstrap Cluster

1 VMware Tanzu Kubernetes Grid 1.3 Documentation

VMware Tanzu Kubernetes Grid provides organizations with a consistent, upstream-compatible, regional Kubernetes substrate that is ready for end-user workloads and ecosystem integrations. You can deploy Tanzu Kubernetes Grid across software-defined datacenters (SDDC) and public cloud environments, including vSphere, Microsoft Azure, and Amazon EC2.

This chapter includes the following topics:

- Tanzu Kubernetes Grid Architecture
- Use the Tanzu Kubernetes Grid Documentation
- Intended Audience

Tanzu Kubernetes Grid Architecture

Tanzu Kubernetes Grid allows you to run Kubernetes with consistency and make it available to your developers as a utility, just like the electricity grid. Tanzu Kubernetes Grid has a native awareness of the multi-cluster paradigm, not just for clusters, but also for the services that your clusters share.


Tanzu Kubernetes Grid builds on trusted upstream and community projects and delivers a Kubernetes platform that is engineered and supported by VMware, so that you do not have to build your Kubernetes environment by yourself. In addition to Kubernetes binaries that are tested, signed, and supported by VMware, Tanzu Kubernetes Grid provides the services that a production Kubernetes environment requires, such as networking, authentication, ingress control, and logging.

For more information about the key components of Tanzu Kubernetes Grid, how you use them, and what they do, see Chapter 2 Tanzu Kubernetes Grid Concepts.

Use the Tanzu Kubernetes Grid Documentation

The documentation for Tanzu Kubernetes Grid provides information about how to install, configure, and use Tanzu Kubernetes Grid. This documentation applies to all 1.3.x releases.

- Chapter 2 Tanzu Kubernetes Grid Concepts introduces the main components of Tanzu Kubernetes Grid.
- Chapter 3 Install the Tanzu CLI and Other Tools describes the prerequisites for installing Tanzu Kubernetes Grid and how to install the Tanzu CLI.
- Chapter 4 Deploying Management Clusters describes how to set up your environment for deployment of management clusters to vSphere, Azure, and Amazon EC2, how to deploy management clusters to your chosen provider, and how to manage your management clusters after deployment.
- Chapter 5 Deploying Tanzu Kubernetes Clusters describes how to use the Tanzu CLI to deploy Tanzu Kubernetes clusters from your management clusters, and how to manage the lifecycle of those clusters.
- Chapter 7 Deploying and Managing Extensions and Shared Services describes how to set up local shared services for your Tanzu Kubernetes clusters, such as authentication and authorization, logging, networking, and ingress control.
- Chapter 8 Building Machine Images describes how to build your own OS images to run in cluster nodes.
- Chapter 9 Upgrading Tanzu Kubernetes Grid describes how to upgrade your Tanzu Kubernetes Grid installation, and how to upgrade the management clusters and Tanzu Kubernetes clusters that you deployed with a previous version.
- Troubleshooting Tips for Tanzu Kubernetes Grid includes tips to help you to troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying management clusters and Tanzu Kubernetes clusters. This section also describes how to use Crash Diagnostics to troubleshoot Tanzu Kubernetes clusters.
- Tanzu CLI Command Reference lists all of the commands and options of the Tanzu CLI, and provides links to the sections in which they are documented.

Intended Audience

This information is intended for administrators who want to install Tanzu Kubernetes Grid and use it to create and manage Tanzu Kubernetes clusters and their associated resources. This information is also intended for application administrators and developers who want to use Tanzu Kubernetes Grid to deploy and manage modern apps in a Kubernetes architecture. The information is written for users who have a basic understanding of Kubernetes and are familiar with container deployment concepts. In-depth knowledge of Kubernetes is not required.

2 Tanzu Kubernetes Grid Concepts

This topic describes the key elements and concepts of a Tanzu Kubernetes Grid deployment.

This chapter includes the following topics:

- Management Cluster
- Tanzu Kubernetes Clusters
- Tanzu Kubernetes Cluster Plans
- Shared and In-Cluster Services
- Tanzu Kubernetes Grid Instance
- Bootstrap Machine
- Tanzu Kubernetes Grid Installer
- Tanzu Kubernetes Grid and Cluster Upgrades

Management Cluster

A management cluster is the first element that you deploy when you create a Tanzu Kubernetes Grid instance. The management cluster is a Kubernetes cluster that performs the role of the primary management and operational center for the Tanzu Kubernetes Grid instance. This is where Cluster API runs to create the Tanzu Kubernetes clusters in which your application workloads run, and where you configure the shared and in-cluster services that the clusters use.

NOTE: On vSphere 7, it is recommended to use a built-in supervisor cluster from vSphere with Tanzu instead of deploying a Tanzu Kubernetes Grid management cluster. Deploying a Tanzu Kubernetes Grid management cluster to vSphere 7 when vSphere with Tanzu is not enabled is supported, but the preferred option is to enable vSphere with Tanzu and use the Supervisor Cluster. For details, see vSphere with Tanzu Provides Management Cluster.

When you deploy a management cluster, networking with Antrea is automatically enabled in the management cluster. The management cluster is purpose-built for operating the platform and managing the lifecycle of Tanzu Kubernetes clusters. As such, the management cluster should not be used as a general purpose compute environment for end-user workloads.


Tanzu Kubernetes Clusters

After you have deployed a management cluster, you use the Tanzu CLI to deploy CNCF-conformant Kubernetes clusters and manage their lifecycle. These clusters, known as Tanzu Kubernetes clusters, are the clusters that handle your application workloads and that you manage through the management cluster. Tanzu Kubernetes clusters can run different versions of Kubernetes, depending on the needs of the applications they run. You can manage the entire lifecycle of Tanzu Kubernetes clusters by using the Tanzu CLI. Tanzu Kubernetes clusters implement Antrea for pod-to-pod networking by default.
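For illustration, the following commands sketch a typical workload cluster lifecycle with the Tanzu CLI; the cluster name my-cluster and the file my-cluster-config.yaml are hypothetical examples, and the individual commands and options are listed in the Tanzu CLI Command Reference in Chapter 3.

# Create a workload cluster from a configuration file (hypothetical names).
tanzu cluster create my-cluster --file my-cluster-config.yaml

# List the workload clusters that the current management cluster manages.
tanzu cluster list

# Scale the cluster to five worker nodes.
tanzu cluster scale my-cluster --worker-machine-count 5

# Delete the cluster when it is no longer needed.
tanzu cluster delete my-cluster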

Tanzu Kubernetes Cluster Plans

A cluster plan is the blueprint that describes the configuration with which to deploy a Tanzu Kubernetes cluster. It provides a set of configurable values that describe settings like the number of control plane machines, worker machines, VM types, and so on.

This release of Tanzu Kubernetes Grid provides two default plans, dev and prod.

Shared and In-Cluster Services

Shared and in-cluster services are services that run in the Tanzu Kubernetes Grid instance, to provide authentication and authorization of Tanzu Kubernetes clusters, logging, and ingress control.

Tanzu Kubernetes Grid Instance

A Tanzu Kubernetes Grid instance is a full deployment of Tanzu Kubernetes Grid, including the management cluster, the deployed Tanzu Kubernetes clusters, and the shared and in-cluster services that you configure. You can operate many instances of Tanzu Kubernetes Grid, for different environments, such as production, staging, and test; for different IaaS providers, such as vSphere, Azure, and Amazon EC2; and for different failure domains, for example Datacenter-1, AWS us-east-2, or AWS us-west-2.

Bootstrap Machine

The bootstrap machine is the laptop, host, or server on which you download and run the Tanzu CLI. This is where the initial bootstrapping of a management cluster occurs, before it is pushed to the platform where it will run.

Tanzu Kubernetes Grid Installer

The Tanzu Kubernetes Grid installer is a graphical wizard that you start up by running the tanzu management-cluster create --ui command. The installer wizard runs locally on the bootstrap machine, and provides a user interface to guide you through the process of deploying a management cluster.
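As a sketch, starting the installer interface looks like the following; the --bind and --browser options come from the Tanzu CLI Command Reference, and the address and browser values shown here are assumptions rather than required settings.

# Launch the installer interface and open it in a local browser.
tanzu management-cluster create --ui

# Hypothetical example: bind the installer to a specific address and port
# and do not open a browser automatically.
tanzu management-cluster create --ui --bind 127.0.0.1:8080 --browser none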


Tanzu Kubernetes Grid and Cluster Upgrades

Upgrading a Tanzu Kubernetes Grid release means upgrading the management clusters created by the CLI version of that release.

Upgrading a management or Tanzu Kubernetes (workload) cluster in Tanzu Kubernetes Grid means migrating its nodes to run on a base VM image with a newer version of Kubernetes:

- Management clusters upgrade to the latest available version of Kubernetes.
- Workload clusters upgrade by default to the current Kubernetes version of their management cluster. Or you can specify other, non-default Kubernetes versions to upgrade workload clusters to.

To find out which Kubernetes versions are available in Tanzu Kubernetes Grid, see List Available Versions.
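As a brief sketch of what an upgrade looks like from the CLI (the commands and the --tkr option appear in the Tanzu CLI Command Reference; the cluster name and Tanzu Kubernetes release name are hypothetical):

# List the Kubernetes releases that the management cluster can deploy or upgrade to.
tanzu kubernetes-release get

# Upgrade a workload cluster to its management cluster's default Kubernetes version.
tanzu cluster upgrade my-cluster

# Or upgrade to a specific, non-default Kubernetes release (hypothetical name).
tanzu cluster upgrade my-cluster --tkr v1.20.5---vmware.1-tkg.1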

3 Install the Tanzu CLI and Other Tools

This topic explains how to install and initialize the Tanzu command line interface (CLI) on a bootstrap machine. The bootstrap machine is the laptop, host, or server that you deploy management and workload clusters from, and that keeps the Tanzu and Kubernetes configuration files for your deployments. The bootstrap machine is typically local, but it can also be a physical machine or VM that you access remotely.

Once the Tanzu CLI is installed, the second and last step to deploying Tanzu Kubernetes Grid is using the Tanzu CLI to create or designate a management cluster on each cloud provider that you use.

The Tanzu CLI then communicates with the management cluster to create and manage workload clusters on the cloud provider.

This chapter includes the following topics:

- Prerequisites
- Download and Unpack the Tanzu CLI and kubectl
- Install the Tanzu CLI
- What to Do Next
- Install the Carvel Tools
- Tanzu CLI Command Reference
- Tanzu CLI Configuration File Variable Reference
- Customizing Clusters, Plans, and Extensions with ytt Overlays

Prerequisites

VMware provides Tanzu CLI binaries for Linux, macOS, and Windows systems.

The bootstrap machine on which you run the Tanzu CLI must meet the following requirements:

- A browser, if you intend to use the Tanzu Kubernetes Grid installer interface. You can use the Tanzu CLI without a browser, but for first deployments, it is strongly recommended to use the installer interface.
- A Linux, Windows, or macOS operating system with a minimum system configuration of 6 GB of RAM and a 2-core CPU.
- A Docker client installed and running on your bootstrap machine:
  - Linux: Docker
  - Windows: Docker Desktop
  - macOS: Docker Desktop
- For Windows and macOS Docker clients, you must allocate at least 6 GB of memory in Docker Desktop to accommodate the kind container. See Settings for Docker Desktop in the kind documentation.
- System time is synchronized with a Network Time Protocol (NTP) server.
- On VMware Cloud on AWS and Azure VMware Solution, the bootstrap machine must be a cloud VM, not a local physical machine. See Prepare a vSphere Management as a Service Infrastructure for setup instructions.
- If you intend to run the Tanzu CLI on a Linux machine, add your non-root user to the docker group. Create the group if it does not already exist. This enables the Tanzu CLI to access the Docker socket, which is owned by the root user. For more information, see the Docker documentation and the example commands after this list.
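A minimal sketch of that Linux setup, using the standard Docker post-installation commands rather than anything specific to Tanzu Kubernetes Grid; log out and back in afterward so that the group membership takes effect:

# Create the docker group if it does not already exist.
sudo groupadd docker

# Add your non-root user to the docker group.
sudo usermod -aG docker $USER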


Download and Unpack the Tanzu CLI and kubectl

The tanzu CLI ships alongside a compatible version of the kubectl CLI.

To download and unpack both CLIs:

1 Go to https://my.vmware.com and log in with your My VMware credentials.

2 Visit the Tanzu Kubernetes Grid downloads page.

3 In the VMware Tanzu Kubernetes Grid row, click Go to Downloads.

4 In the Select Version dropdown, select 1.3.1.

5 Under Product Downloads, scroll to the section labeled VMware Tanzu CLI 1.3.1 CLI.

n For macOS, locate VMware Tanzu CLI for Mac and click Download Now.

n For Linux, locate VMware Tanzu CLI for Linux and click Download Now.

n For Windows, locate VMware Tanzu CLI for Windows and click Download Now.

6 Navigate to the Kubectl 1.20.5 for VMware Tanzu Kubernetes Grid 1.3.1 section of the download page.

n For macOS, locate kubectl cluster cli v1.20.5 for Mac and click Download Now.

n For Linux, locate kubectl cluster cli v1.20.5 for Linux and click Download Now.

n For Windows, locate kubectl cluster cli v1.20.5 for Windows and click Download Now.

7 (Optional) Verify that your downloaded files are unaltered from the original. VMware provides a SHA-1, a SHA-256, and an MD5 checksum for each download. To obtain these checksums, click Read More under the entry that you want to download. For more information, see Using Cryptographic Hashes.
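For example, on Linux you might verify the SHA-256 checksum of the downloaded CLI bundle as follows; the value to compare against is the checksum published on the download page, not a value shown here.

# Compute the SHA-256 checksum of the downloaded bundle and compare it
# with the value listed under Read More on the download page.
sha256sum tanzu-cli-bundle-v1.3.1-linux-amd64.tar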

8 On your system, create a new directory named tanzu. If you previously unpacked artifacts from an earlier release into this folder, delete the folder's existing contents.


9 In the tanzu folder, unpack the Tanzu CLI bundle file for your operating system. To unpack the bundle file, use the extraction tool of your choice. For example, the tar -xvf command.

n For macOS, unpack tanzu-cli-bundle-v1.3.1-darwin-amd64.tar.

n For Linux, unpack tanzu-cli-bundle-v1.3.1-linux-amd64.tar.

n For Windows, unpack tanzu-cli-bundle-v1.3.1-windows-amd64.tar.

After you unpack the bundle file, in your tanzu folder, you will see a cli folder with multiple subfolders and files.

The files in the cli directory, such as ytt, kapp, and kbld, are required by the Tanzu Kubernetes Grid extensions and add-ons. You will need these files later when you install the extensions and register add-ons.

10 Unpack the kubectl binary for your operating system (a gunzip example follows this list):

n For macOS, unpack kubectl-mac-v1.20.5-vmware.1.gz.

n For Linux, unpack kubectl-linux-v1.20.5-vmware.1.gz.

n For Windows, unpack kubectl-windows-v1.20.5-vmware.1.exe.gz.
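For example, on Linux the kubectl binary can be unpacked with gunzip, as sketched below; use the file name that matches your operating system.

# Decompress the kubectl binary; gunzip removes the .gz suffix automatically.
gunzip kubectl-linux-v1.20.5-vmware.1.gz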

Install the Tanzu CLI

After you have downloaded and unpacked the Tanzu CLI on your bootstrap machine, you must make it available to the system.

1 Navigate to the tanzu/cli folder that you unpacked in the previous section.

2 Make the CLI available to the system:

n For macOS:

1 Install the binary to /usr/local/bin:

sudo install core/v1.3.1/tanzu-core-darwin_amd64 /usr/local/bin/tanzu

2 Confirm that the binary is executable by running the ls command.


n For Linux:

1 Install the binary to /usr/local/bin:

sudo install core/v1.3.1/tanzu-core-linux_amd64 /usr/local/bin/tanzu

2 Confirm that the binary is executable by running the ls command.

n For Windows:

1 Create a new Program Files\tanzu folder.

2 In the unpacked cli folder, locate and copy the core\v1.3.1\tanzu-core-windows_amd64.exe file into the new Program Files\tanzu folder.

3 Rename tanzu-core-windows_amd64.exe to tanzu.exe.

4 Right-click the tanzu folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the tanzu CLI.

3 At the command line in a new terminal, run tanzu version to check that the correct version of the CLI is properly installed.

If you are running on macOS, you might encounter the following error:

"tanzu" cannot be opened because the developer cannot be verified.

If this happens, you need to create a security exception for the tanzu executable. Locate the tanzu app in Finder, control-click the app, and select Open.

Install the Tanzu CLI Plugins

After you have installed the tanzu core executable, you must install the CLI plugins related to Tanzu Kubernetes cluster management and feature operations.

1 (Optional) Remove existing plugins from any previous CLI installations.

tanzu plugin clean

2 Navigate to the tanzu folder that contains the cli folder.

3 Run the following command from the tanzu directory to install all the plugins for this release.

tanzu plugin install --local cli all


4 Check plugin installation status.

tanzu plugin list

If successful, you should see a list of all installed plugins. For example:

NAME                LATEST VERSION  DESCRIPTION                                                         REPOSITORY  VERSION  STATUS
cluster             v1.3.1          Kubernetes cluster operations                                       core        v1.3.1   installed
login               v1.3.1          Login to the platform                                               core        v1.3.1   installed
pinniped-auth       v1.3.1          Pinniped authentication operations (usually not directly invoked)  core        v1.3.1   installed
kubernetes-release  v1.3.1          Kubernetes release operations                                       core        v1.3.1   installed
management-cluster  v1.3.1          Kubernetes management cluster operations                            tkg         v1.3.1   installed

Tanzu CLI Help

Run tanzu --help to see the list of commands that the Tanzu CLI provides.

You can add the --help option to any command or command group to see information about that specific command or command group. For example, tanzu login --help, tanzu management-cluster --help, or tanzu management-cluster create --help.

For more information about the Tanzu CLI, see the Tanzu CLI Command Reference.

Install kubectl

After you have downloaded and unpacked kubectl on your bootstrap machine, you must make it available to the system.

1 Navigate to the kubectl binary that you unpacked in Download and Unpack the Tanzu CLI and kubectl above.

2 Make the CLI available to the system:

n For macOS:

1 Install the binary to /usr/local/bin:

sudo install kubectl-mac-v1.20.5-vmware.1 /usr/local/bin/kubectl

2 Confirm that the binary is executable by running the ls command.

n For Linux:

1 Install the binary to /usr/local/bin:

sudo install kubectl-linux-v1.20.5-vmware.1 /usr/local/bin/kubectl


2 Confirm that the binary is executable by running the ls command.

n For Windows:

1 Create a new Program Files\kubectl folder.

2 Locate and copy the kubectl-windows-v1.20.5-vmware.1.exe file into the new Program Files\kubectl folder.

3 Rename kubectl-windows-v1.20.5-vmware.1.exe to kubectl.exe.

4 Right-click the kubectl folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the kubectl CLI.

3 Run kubectl version to check that the correct version of the CLI is properly installed.

What to Do Next

With the Tanzu CLI and kubectl installed, you can set up and use your bootstrap machine to deploy management clusters to vSphere, Amazon EC2, and Microsoft Azure.

- For information about how to deploy management clusters to your chosen platform, see Chapter 4 Deploying Management Clusters.
- If you have vSphere 7 and the vSphere with Tanzu feature is enabled, you can directly use the Tanzu CLI to deploy Tanzu Kubernetes clusters to vSphere with Tanzu, without deploying a management cluster. For information about how to connect the Tanzu CLI to a vSphere with Tanzu Supervisor Cluster, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.


The Tanzu Kubernetes Grid download bundle also includes the Carvel tools, which you might need for troubleshooting or customization purposes.

Install the Carvel Tools

Tanzu Kubernetes Grid uses the following tools from the Carvel open-source project:

- ytt
- kapp
- kbld
- imgpkg

Tanzu Kubernetes Grid provides signed binaries for ytt, kapp, kbld, and imgpkg that are bundled with the Tanzu CLI. The bundle also includes vendir, a Kubernetes directory structure tool that is not currently required by end users but is provided for convenience.

1 Navigate to the location on your bootstrap environment machine where you unpacked the Tanzu CLI bundle tar file for your OS.

For example, the tanzu folder that you created in the previous procedure.

2 Open the cli folder.

cd cli

Install ytt

ytt is a YAML templating tool that is used to deploy the Tanzu Kubernetes Grid extensions. You might need to install ytt if you need to troubleshoot the deployment of the Tanzu Kubernetes Grid extensions, or if you need to use overlays to customize your extensions or cluster templates.

MacOS and Linux:

1 Unpack the ytt binary and make it executable.

Linux:

gunzip ytt-linux-amd64-v0.31.0+vmware.1.gz
chmod ugo+x ytt-linux-amd64-v0.31.0+vmware.1

Mac OS:

gunzip ytt-darwin-amd64-v0.31.0+vmware.1.gz
chmod ugo+x ytt-darwin-amd64-v0.31.0+vmware.1

2 Move the binary to /usr/local/bin and rename it to ytt:

Linux:

mv ./ytt-linux-amd64-v0.31.0+vmware.1 /usr/local/bin/ytt

Mac OS:

mv ./ytt-darwin-amd64-v0.31.0+vmware.1 /usr/local/bin/ytt

3 Confirm that the binary is executable by running the ls command.

Windows:

1 Unpack the ytt binary.

gunzip ytt-windows-amd64-v0.31.0+vmware.1.gz

2 Rename ytt-windows-amd64-v0.31.0+vmware.1 to ytt.exe.

3 Create a new Program Files\ytt folder and copy the ytt.exe file into it.

4 Right-click the ytt folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the ytt tool.

At the command line in a new terminal, run ytt version to check that the correct version of ytt is properly installed.

Install kapp

kapp is a Kubernetes applications CLI that is used to manage the Tanzu Kubernetes Grid extensions. You might need to install kapp if you need to troubleshoot the deployment of the Tanzu Kubernetes Grid extensions, or if you need to use overlays to customize your extensions or cluster templates.

MacOS and Linux:

1 Unpack the kapp binary and make it executable.

Linux:

gunzip kapp-linux-amd64-v0.36.0+vmware.1.gz
chmod ugo+x kapp-linux-amd64-v0.36.0+vmware.1

Mac OS:

gunzip kapp-darwin-amd64-v0.36.0+vmware.1.gz
chmod ugo+x kapp-darwin-amd64-v0.36.0+vmware.1

2 Move the binary to /usr/local/bin and rename it to kapp:

Linux:

mv ./kapp-linux-amd64-v0.36.0+vmware.1 /usr/local/bin/kapp

Mac OS:

mv ./kapp-darwin-amd64-v0.36.0+vmware.1 /usr/local/bin/kapp

3 Confirm that the binary is executable by running the ls command.

Windows:

1 Unpack the kapp binary.

gunzip kapp-windows-amd64-v0.36.0+vmware.1.gz

2 Rename kapp-windows-amd64-v0.36.0+vmware.1 to kapp.exe.

3 Create a new Program Files\kapp folder and copy the kapp.exe file into it.

4 Right-click the kapp folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the kapp tool.

At the command line in a new terminal, run kapp version to check that the correct version of kapp is properly installed.

Install kbld

kbld is a Kubernetes image builder. It is used by Tanzu Kubernetes Grid to build the Tanzu Kubernetes Grid extensions. You might need to install kbld if you need to troubleshoot the deployment of the Tanzu Kubernetes Grid extensions, or if you need to use overlays to customize your extensions or cluster templates.

MacOS and Linux:

1 Unpack the kbld binary and make it executable.

Linux:

gunzip kbld-linux-amd64-v0.28.0+vmware.1.gz
chmod ugo+x kbld-linux-amd64-v0.28.0+vmware.1

Mac OS:

gunzip kbld-darwin-amd64-v0.28.0+vmware.1.gz
chmod ugo+x kbld-darwin-amd64-v0.28.0+vmware.1

2 Move the binary to /usr/local/bin and rename it to kbld:

Linux:

mv ./kbld-linux-amd64-v0.28.0+vmware.1 /usr/local/bin/kbld

Mac OS:

mv ./kbld-darwin-amd64-v0.28.0+vmware.1 /usr/local/bin/kbld

3 Confirm that the binary is executable by running the ls command.

Windows:

1 Unpack the kbld binary.

gunzip kbld-windows-amd64-v0.28.0+vmware.1.gz

2 Rename kbld-windows-amd64-v0.28.0+vmware.1 to kbld.exe.

3 Create a new Program Files\kbld folder and copy the kbld.exe file into it.

4 Right-click the kbld folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the kbld tool.

At the command line in a new terminal, run kbld version to check that the correct version of kbld is properly installed.

Install imgpkg

imgpkg is a Kubernetes image packaging tool that is required to deploy Tanzu Kubernetes Grid in Internet-restricted environments and when building your own node images.

MacOS and Linux:

1 Unpack the imgpkg binary and make it executable.

Linux:

gunzip imgpkg-linux-amd64-v0.5.0+vmware.1.gz
chmod ugo+x imgpkg-linux-amd64-v0.5.0+vmware.1

Mac OS:

gunzip imgpkg-darwin-amd64-v0.5.0+vmware.1.gz
chmod ugo+x imgpkg-darwin-amd64-v0.5.0+vmware.1

2 Move the binary to /usr/local/bin and rename it to imgpkg:

Linux:

mv ./imgpkg-linux-amd64-v0.5.0+vmware.1 /usr/local/bin/imgpkg

Mac OS:

mv ./imgpkg-darwin-amd64-v0.5.0+vmware.1 /usr/local/bin/imgpkg

3 Confirm that the binary is executable by running the ls command.

Windows:

1 Unpack the imgpkg binary.

gunzip imgpkg-windows-amd64-v0.5.0+vmware.1.gz

2 Rename imgpkg-windows-amd64-v0.5.0+vmware.1 to imgpkg.exe.

3 Create a new Program Files\imgpkg folder and copy the imgpkg.exe file into it.

4 Right-click the imgpkg folder, select Properties > Security, and make sure that your user account has the Full Control permission.

5 Use Windows Search to search for env.

6 Select Edit the system environment variables and click the Environment Variables button.

7 Select the Path row under System variables and click Edit.

8 Click New to add a new row and enter the path to the imgpkg tool.

At the command line in a new terminal, run imgpkg version to check that the correct version of imgpkg is properly installed.
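After installing whichever of these tools you need, a quick sanity check is to run the version commands described above together; this sketch simply chains them and assumes all four tools are installed.

# Confirm that each Carvel tool is on the PATH and reports its version.
ytt version && kapp version && kbld version && imgpkg version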

Tanzu CLI Command Reference

The table below lists all of the commands and options of the Tanzu CLI, and provides links to the section in which they are documented.

Command | Options | Description
tanzu * | -h, --help | Common Tanzu Kubernetes Grid Options
tanzu completion * | -h, --help | Output shell completion code for the specified shell
tanzu cluster * | --log-file; -v, --verbose |
tanzu cluster create | --dry-run | Create Tanzu Kubernetes Cluster Manifest Files
tanzu cluster create | -f, --file | Deploy Tanzu Kubernetes Clusters
tanzu cluster create | --tkr | Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions; Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
tanzu cluster credentials update | -n, --namespace; --vsphere-password; --vsphere-user | Tanzu Kubernetes Cluster Secrets
tanzu cluster delete | -n, --namespace; -y, --yes | Delete Tanzu Kubernetes Clusters
tanzu cluster get | --disable-grouping; --disable-no-; -n, --namespace; --show-all-conditions | Chapter 5 Deploying Tanzu Kubernetes Clusters
tanzu cluster kubeconfig get | --admin; --export-file; -n, --namespace | Chapter 5 Deploying Tanzu Kubernetes Clusters; Connect to and Examine Tanzu Kubernetes Clusters; Upgrade Tanzu Kubernetes Clusters
tanzu cluster list | --include-management-cluster; -n, --namespace; -o, --output | Connect to and Examine Tanzu Kubernetes Clusters; Upgrade Management Clusters; Upgrade Tanzu Kubernetes Clusters
tanzu cluster machinehealthcheck delete | -m, --mhc-name; -n, --namespace; -y, --yes | Configure Machine Health Checks for Tanzu Kubernetes Clusters
tanzu cluster machinehealthcheck get | -m, --mhc-name; -n, --namespace | Configure Machine Health Checks for Tanzu Kubernetes Clusters
tanzu cluster machinehealthcheck set | --match-labels; -m, --mhc-name; -n, --namespace; --node-startup-timeout; --unhealthy-conditions | Configure Machine Health Checks for Tanzu Kubernetes Clusters
tanzu cluster scale | -c, --controlplane-machine-count; -n, --namespace; -w, --worker-machine-count | Manage Your Management Clusters; Scale Tanzu Kubernetes Clusters
tanzu cluster upgrade | -n, --namespace; --os-arch; --os-name; --os-version; -t, --timeout; --tkr; -y, --yes | Upgrade Tanzu Kubernetes Clusters; Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
tanzu config init * | -h, --help | Initializes the configuration with defaults
tanzu config server delete | -y, --yes | Delete Management Clusters from Your Tanzu CLI Configuration
tanzu config server list | | Delete Management Clusters from Your Tanzu CLI Configuration
tanzu config show * | -h, --help | Shows the current configuration
tanzu init | | Not available in this version of the Tanzu CLI
tanzu kubernetes-release get | | Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions; Upgrade Tanzu Kubernetes Clusters; Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
tanzu kubernetes-release available-upgrades get | | Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions; Upgrade Tanzu Kubernetes Clusters; Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
tanzu kubernetes-release os get | --region |
tanzu login | --apiToken; --context; --endpoint; --kubeconfig; --name; --server | Connect to and Examine Tanzu Kubernetes Clusters; Manage Your Management Clusters; Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
tanzu management-cluster ceip-participation get | | Managing Participation in CEIP
tanzu management-cluster ceip-participation set | | Managing Participation in CEIP
tanzu management-cluster create | -b, --bind; --browser; -u, --ui | Deploy Management Clusters with the Installer Interface
tanzu management-cluster create | -f, --file; -t, --timeout; -y, --yes | Deploy Management Clusters from a Configuration File
tanzu management-cluster create | -e, --use-existing-bootstrap-cluster | Use an Existing Bootstrap Cluster to Deploy Management Clusters
tanzu management-cluster credentials update | | Tanzu Kubernetes Cluster Secrets
tanzu management-cluster delete | | Delete Management Clusters
tanzu management-cluster get | | Connect to and Examine Tanzu Kubernetes Clusters; Examine the Management Cluster Deployment; Manage Your Management Clusters; Upgrade Management Clusters
tanzu management-cluster import | | Chapter 9 Upgrading Tanzu Kubernetes Grid
tanzu management-cluster kubeconfig get | --admin; --export-file | Examine the Management Cluster Deployment; Configure Identity Management After Management Cluster Deployment
tanzu management-cluster register | | Register the Management Cluster with Tanzu Mission Control
tanzu management-cluster permissions aws get | | Management Cluster Configuration for Amazon EC2
tanzu management-cluster permissions aws set | | Management Cluster Configuration for Amazon EC2
tanzu management-cluster upgrade | --os-arch; --os-name; --os-version; -t, --timeout; -y, --yes | Upgrade Management Clusters
tanzu pinniped-auth * | -h, --help | Pinniped authentication operations
tanzu pinniped-auth login * | -h, --help; --ca-bundle strings; --client-id string; --concierge-authenticator-name string; --concierge-authenticator-type; --concierge-ca-bundle-data string; --concierge-endpoint string; --concierge-namespace string; --enable-concierge; --issuer string; --listen-port uint16; --request-audience string; --scopes strings; --session-cache string; --skip-browser | Log in using an OpenID Connect provider
tanzu plugin clean * | | Chapter 3 Install the Tanzu CLI and Other Tools
tanzu plugin install * | | Chapter 3 Install the Tanzu CLI and Other Tools
tanzu plugin list * | | Chapter 3 Install the Tanzu CLI and Other Tools
tanzu plugin delete * | -h, --help | Deletes a Tanzu plugin
tanzu plugin describe * | -h, --help | Describes a Tanzu plugin
tanzu plugin upgrade * | -h, --help | Upgrades a Tanzu plugin
tanzu update * | -h, --help | Updates the Tanzu CLI
tanzu version | | Chapter 3 Install the Tanzu CLI and Other Tools

* Some tanzu plugin commands such as tanzu plugin repo and tanzu plugin update are not functional in the current release.

Table of Equivalents

The tanzu command-line interface (CLI) works differently from the tkg CLI used in previous versions of Tanzu Kubernetes Grid. Because of these differences, many tkg commands do not have direct tanzu equivalents, and vice-versa. But other commands do have direct or close equivalencies across both CLIs:

TKG CLI command | Tanzu CLI command | Notes
tkg init | tanzu management-cluster create | Some tkg flags not replicated in tanzu
tkg init --ui | tanzu management-cluster create --ui | Some tkg flags not replicated in tanzu
tkg set management-cluster MANAGEMENT_CLUSTER_NAME | tanzu login --server SERVER |
tkg add management-cluster | tanzu login --kubeconfig CONFIG --context OPTIONAL_CONTEXT --name SERVER-NAME |
tkg get management-cluster | tanzu login | The tanzu login command displays the currently configured management clusters
tkg create cluster | tanzu cluster create | Some tkg flags not replicated in tanzu
tkg config cluster | tanzu cluster create --dry-run | Some tkg flags not replicated in tanzu
tkg delete cluster | tanzu cluster delete |
tkg get cluster | tanzu cluster list |
tkg scale cluster | tanzu cluster scale |
tkg upgrade cluster | tanzu cluster upgrade |
tkg get credentials | tanzu management-cluster kubeconfig get --admin |
tkg get credentials WORKLOAD_CLUSTER_NAME | tanzu cluster kubeconfig get WORKLOAD_CLUSTER_NAME --admin |
tkg delete machinehealthcheck | tanzu cluster machinehealthcheck delete |
tkg get machinehealthcheck | tanzu cluster machinehealthcheck get |
tkg set machinehealthcheck | tanzu cluster machinehealthcheck set |
tkg get ceip-participation | tanzu management-cluster ceip-participation get |
tkg set ceip-participation | tanzu management-cluster ceip-participation set |
tkg delete management-cluster | tanzu management-cluster delete |
tkg upgrade management-cluster | tanzu management-cluster upgrade |
tkg version | tanzu version |
tkg get kubernetesversions | tanzu kubernetes-release get |
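For example, two common tkg workflows map onto the tanzu CLI like this; the workload cluster name is a hypothetical placeholder.

# Formerly: tkg get cluster
tanzu cluster list --include-management-cluster

# Formerly: tkg get credentials my-cluster
tanzu cluster kubeconfig get my-cluster --admin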

Tanzu CLI Configuration File Variable Reference

This reference lists all the variables that you can specify to provide configuration options to the Tanzu CLI.

To set these variables in a YAML configuration file, leave a space between the colon (:) and the variable value. For example:

CLUSTER_NAME: my-cluster

Line order in the configuration file does not matter. Options are presented here in alphabetical order.
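As an illustration, a minimal cluster configuration file might combine several of the common variables described below; the values are placeholders, the CIDR ranges shown are the documented defaults, and which variables you actually need depends on your infrastructure provider and use case.

# Hypothetical example values; see the variable tables below for details.
CLUSTER_NAME: my-cluster
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: vsphere
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13
IDENTITY_MANAGEMENT_TYPE: none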

Common Variables for All Infrastructure Providers

This section lists variables that are common to all infrastructure providers. These variables may apply to management clusters, Tanzu Kubernetes clusters, or both. For more information, see Configure Basic Management Cluster Creation Information in Create a Management Cluster Configuration File. For the variables that are specific to workload clusters, see Deploy Tanzu Kubernetes Clusters.


In each of the variable tables that follow, the first column gives the variable name, the two marks after it show whether the variable can be set in the management cluster configuration YAML and in the Tanzu Kubernetes (workload) cluster configuration YAML, respectively (✔ = can be set, ✖ = cannot), and the rest of the entry describes the variable.

CLUSTER_CIDR ✔ ✔ Optional, set if you want to override the default value. The CIDR range to use for pods. By default, this range is set to 100.96.0.0/11. Change the default value only if the recommended range is unavailable.

CLUSTER_NAME ✔ ✔ This name must comply with DNS hostname requirements as outlined in RFC 952 and amended in RFC 1123, and must be 42 characters or less. For workload clusters, this setting is overridden by the CLUSTER_NAME argument passed to tanzu cluster create. For management clusters, if you do not specify CLUSTER_NAME, a unique name is generated.

CLUSTER_PLAN ✔ ✔ Required. Set to dev, prod, or a custom plan as exemplified in New Plan nginx. The dev plan deploys a cluster with a single control plane node. The prod plan deploys a highly available cluster with three control plane nodes.

CNI ✖ ✔ Optional, set if you want to override the default value. Do not override the default value for management clusters. Container network interface. By default, CNI is set to antrea. If you want to customize your Antrea configuration, see Antrea CNI Configuration below. For Tanzu Kubernetes clusters, you can set CNI to antrea, calico, or none. Setting none allows you to provide your own CNI. For more information about CNI options, see Deploy a Cluster with a Non-Default CNI.

ENABLE_AUDIT_LOGGING ✔ ✔ Optional, set if you want to override the default value. Audit logging for the Kubernetes API server. The default value is false. To enable audit logging, set the variable to true. Tanzu Kubernetes Grid writes these logs to /var/log/kubernetes/audit.log. For more information, see Audit Logging.

ENABLE_AUTOSCALER ✖ ✔ Optional, set if you want to override the default value. The default value is false. If set to true, you must include additional variables.

ENABLE_CEIP_PARTICIPATION ✔ ✖ Optional, set if you want to override the default value. The default value is true. false opts out of the VMware Customer Experience Improvement Program. You can also opt in or out of the program after deploying the management cluster. For information, see Opt In or Out of the VMware CEIP in Managing Participation in CEIP and Customer Experience Improvement Program ("CEIP").

ENABLE_DEFAULT_STORAGE_CLASS ✖ ✔ Optional, set if you want to override the default value. The default value is true. For information about storage classes, see Create Persistent Volumes with Storage Classes.

ENABLE_MHC ✔ ✔ Optional, set if you want to override the default value. The default value is true. See Machine Health Checks below.


IDENTITY_MANAGEMENT_TYPE ✔ ✔ Required. Set to oidc or ldap. Additional OIDC or LDAP settings are required. For more information, see Identity Providers below. Set to none to disable identity management. It is strongly recommended to enable identity management for production deployments. In workload cluster configuration files, replicate the variable setting from their management cluster configuration.

INFRASTRUCTURE_PROVIDER ✔ ✔ Required. Set to vsphere, aws, or azure.

NAMESPACE ✖ ✔ Optional, set if you want to override the default value. By default, Tanzu Kubernetes Grid deploys Tanzu Kubernetes clusters to the default namespace.

SERVICE_CIDR ✔ ✔ Optional, set if you want to override the default value. The CIDR range to use for the Kubernetes services. By default, this range is set to 100.64.0.0/13. Change this value only if the recommended range is unavailable.

TMC_REGISTRATION_URL ✔ ✖ Optional. Set if you want to register your management cluster with Tanzu Mission Control. For more information, see Register Your Management Cluster with Tanzu Mission Control.

Identity Providers - OIDC

If you set IDENTITY_MANAGEMENT_TYPE: oidc, set the following variables to configure an OIDC identity provider. For more information, see Configure Identity Management in Create a Management Cluster Configuration File.


IDENTITY_MANAGEMENT_TYPE ✔ ✖ Enter oidc.

CERT_DURATION ✔ ✖ Optional. Default 2160h. Set this variable if you configure Pinniped and Dex to use self-signed certificates managed by cert-manager.

CERT_RENEW_BEFORE ✔ ✖ Optional. Default 360h. Set this variable if you configure Pinniped and Dex to use self-signed certificates managed by cert-manager.

OIDC_IDENTITY_PROVIDER_CLIENT_ID ✔ ✖ Required. The client_id value that you obtain from your OIDC provider. For example, if your provider is Okta, log in to Okta, create a Web application, and select the Client Credentials options in order to get a client_id and secret.


OIDC_IDENTITY_PROVIDER_CLIENT_SECRET ✔ ✖ Required. The Base64 secret value that you obtain from your OIDC provider.

OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM ✔ ✖ Required. The name of your groups claim. This is used to set a user’s group in the JSON Web Token (JWT) claim. The default value is groups.

OIDC_IDENTITY_PROVIDER_ISSUER_URL ✔ ✖ Required. The IP or DNS address of your OIDC server.

OIDC_IDENTITY_PROVIDER_SCOPES ✔ ✖ Required. A comma separated list of additional scopes to request in the token response. For example, "email,offline_access".

OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM ✔ ✖ Required. The name of your username claim. This is used to set a user’s username in the JWT claim. Depending on your provider, enter claims such as user_name, email, or code.

SUPERVISOR_ISSUER_URL ✔ ✖ Do not modify. This variable is automatically updated in the configuration file when you run the tanzu cluster create command.

SUPERVISOR_ISSUER_CA_BUNDLE_DATA_B64 ✔ ✖ Do not modify. This variable is automatically updated in the configuration file when you run the tanzu cluster create command.
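
As an illustration of how these OIDC settings fit together, the following is a sketch of the OIDC section of a management cluster configuration file. The issuer URL, client ID, and secret are placeholders for values that you obtain from your own provider, for example Okta.

IDENTITY_MANAGEMENT_TYPE: oidc
OIDC_IDENTITY_PROVIDER_ISSUER_URL: https://YOUR-TENANT.okta.com
OIDC_IDENTITY_PROVIDER_CLIENT_ID: YOUR-CLIENT-ID
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET: YOUR-BASE64-CLIENT-SECRET
OIDC_IDENTITY_PROVIDER_SCOPES: "email,offline_access"
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: email
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: groups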

Identity Providers - LDAP

If you set IDENTITY_MANAGEMENT_TYPE: ldap, set the following variables to configure an LDAP identity provider. For more information, see Enabling Identity Management in Tanzu Kubernetes Grid and Configure Identity Management in Create a Management Cluster Configuration File.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

LDAP_BIND_DN ✔ ✖ Optional. The DN for an application service account. The connector uses these credentials to search for users and groups. Not required if the LDAP server provides access for anonymous authentication.

LDAP_BIND_PASSWORD ✔ ✖ Optional. The password for an application service account, if LDAP_BIND_DN is set.

LDAP_GROUP_SEARCH_BASE_DN ✔ ✖ Optional. The point from which to start the LDAP search. For example, OU=Groups,OU=domain,DC=io.

LDAP_GROUP_SEARCH_FILTER ✔ ✖ Optional. A filter to be used by the LDAP search.


LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE ✔ ✖ Optional. The attribute of the group record that holds the user/member information. For example, member.

LDAP_GROUP_SEARCH_NAME_ATTRIBUTE ✔ ✖ Optional. The LDAP attribute that holds the name of the group. For example, cn.

LDAP_GROUP_SEARCH_USER_ATTRIBUTE ✔ ✖ Optional. The attribute of the user record that is used as the value of the membership attribute of the group record. For example, distinguishedName, dn.

LDAP_HOST ✔ ✖ Required. The IP or DNS address of your LDAP server. If the LDAP server is listening on the default port 636, which is the secured configuration, you do not need to specify the port. If the LDAP server is listening on a different port, provide the address and port of the LDAP server, in the form "host:port".

LDAP_ROOT_CA_DATA_B64 ✔ ✖ Optional. If you are using an LDAPS endpoint, paste the base64 encoded contents of the LDAP server certificate.

LDAP_USER_SEARCH_BASE_DN ✔ ✖ Optional. The point from which to start the LDAP search. For example, OU=Users,OU=domain,DC=io.

LDAP_USER_SEARCH_EMAIL_ATTRIBUTE ✔ ✖ Optional. The LDAP attribute that holds the email address. For example, email, userPrincipalName.

LDAP_USER_SEARCH_FILTER ✔ ✖ Optional. A filter to be used by the LDAP search.

LDAP_USER_SEARCH_ID_ATTRIBUTE ✔ ✖ Optional. The LDAP attribute that contains the user ID. Similar to LDAP_USER_SEARCH_USERNAME.

LDAP_USER_SEARCH_NAME_ATTRIBUTE ✔ ✖ Optional. The LDAP attribute that holds the given name of the user. For example, givenName. This variable is not exposed in the installer interface.

LDAP_USER_SEARCH_USERNAME ✔ ✖ Optional. The LDAP attribute that contains the user ID. For example, uid, sAMAccountName.
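
The following sketch shows how these LDAP settings might be combined for an Active Directory-style server, reusing the example DNs from the table; the host name, DNs, password, and attributes are placeholders that you replace with values from your own directory.

IDENTITY_MANAGEMENT_TYPE: ldap
LDAP_HOST: ldaps.example.com:636
LDAP_ROOT_CA_DATA_B64: "LS0t[...]LS0tCg=="
LDAP_BIND_DN: "CN=tkg-bind,OU=Users,OU=domain,DC=io"
LDAP_BIND_PASSWORD: "YOUR-BIND-PASSWORD"
LDAP_USER_SEARCH_BASE_DN: "OU=Users,OU=domain,DC=io"
LDAP_USER_SEARCH_USERNAME: sAMAccountName
LDAP_USER_SEARCH_NAME_ATTRIBUTE: givenName
LDAP_GROUP_SEARCH_BASE_DN: "OU=Groups,OU=domain,DC=io"
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: member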

Node Configuration

Configure the size and number of control plane and worker nodes, and the operating system that the node instances run. For more information, see Configure Node Settings in Create a Management Cluster Configuration File.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

CONTROL_PLANE_MACHINE_COUNT ✖ ✔ Optional. Deploy a Tanzu Kubernetes cluster with more control plane nodes than the dev and prod plans define by default. The number of control plane nodes that you specify must be odd.

CONTROLPLANE_SIZE ✔ ✔ Optional. Size for control plane node VMs. Overrides the VSPHERE_CONTROL_PLANE_* parameters. See SIZE for possible values.

NODE_STARTUP_TIMEOUT ✔ ✔ Optional, set if you want to override the default value. The default value is 20m.

OS_ARCH ✔ ✔ Optional. Architecture for node VM OS. Default and only current choice is amd64.

OS_NAME ✔ ✔ Optional. Node VM OS. Defaults to ubuntu for Ubuntu LTS. Can also be photon for Photon OS on vSphere or amazon for Amazon Linux on Amazon EC2.

OS_VERSION ✔ ✔ Optional. Version for OS_NAME OS, above. Defaults to 20.04 for Ubuntu. Can be 3 for Photon on vSphere and 2 for Amazon Linux on Amazon EC2.

SIZE ✔ ✔ Optional. Size for both control plane and worker node VMs. Overrides the CONTROLPLANE_SIZE and WORKER_SIZE parameters. For vSphere, set small, medium, large, or extra-large. For Amazon EC2, set an instance type, for example, t3.small. For Azure, set an instance type, for example, Standard_D2s_v3.

WORKER_MACHINE_COUNT ✖ ✔ Optional. Deploy a Tanzu Kubernetes cluster with more worker nodes than the dev and prod plans define by default.

WORKER_SIZE ✔ ✔ Optional. Size for worker node VMs. Overrides the VSPHERE_WORKER_* parameters. See SIZE for possible values.
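
For example, the following sketch sets a cloud-agnostic node size and node counts for a Tanzu Kubernetes (workload) cluster on vSphere; the values are illustrative only.

SIZE: medium
CONTROL_PLANE_MACHINE_COUNT: 3
WORKER_MACHINE_COUNT: 5
OS_NAME: photon
OS_VERSION: "3"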

Cluster Autoscaler

Additional variables to set if ENABLE_AUTOSCALER is set to true. For information about Cluster Autoscaler, see Scale Tanzu Kubernetes Clusters.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

AUTOSCALER_MAX_NODES_TOTAL ✖ ✔ Maximum total number of nodes in the cluster, worker plus control plane. Cluster Autoscaler does not attempt to scale your cluster beyond this limit. If set to 0, Cluster Autoscaler makes scaling decisions based on the minimum and maximum values that you configure for each machine deployment. Default 0. See below.

AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD ✖ ✔ Amount of time that Cluster Autoscaler waits after a scale-up operation and then resumes scale-down scans. Default 10m.

AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE ✖ ✔ Amount of time that Cluster Autoscaler waits after deleting a node and then resumes scale-down scans. Default 10s.

AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE ✖ ✔ Amount of time that Cluster Autoscaler waits after a scale-down failure and then resumes scale-down scans. Default 3m.

AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME ✖ ✔ Amount of time that Cluster Autoscaler must wait before scaling down an eligible node. Default 10m.

AUTOSCALER_MAX_NODE_PROVISION_TIME ✖ ✔ Maximum amount of time Cluster Autoscaler waits for a node to be provisioned. Default 15m.

AUTOSCALER_MIN_SIZE_0 ✖ ✔ Required, all IaaSes. Minimum number of worker nodes. Cluster Autoscaler does not attempt to scale down the nodes below this limit. For prod clusters on Amazon EC2, AUTOSCALER_MIN_SIZE_0 sets the minimum number of worker nodes in the first AZ. If not set, defaults to the value of WORKER_MACHINE_COUNT for clusters with a single machine deployment or WORKER_MACHINE_COUNT_0 for clusters with multiple machine deployments.

AUTOSCALER_MAX_SIZE_0 ✖ ✔ Required, all IaaSes. Maximum number of worker nodes. Cluster Autoscaler does not attempt to scale up the nodes beyond this limit. For prod clusters on Amazon EC2, AUTOSCALER_MAX_SIZE_0 sets the maximum number of worker nodes in the first AZ. If not set, defaults to the value of WORKER_MACHINE_COUNT for clusters with a single machine deployment or WORKER_MACHINE_COUNT_0 for clusters with multiple machine deployments.


AUTOSCALER_MIN_SIZE_1 ✖ ✔ Required, use only for prod clusters on Amazon EC2. Minimum number of worker nodes in the second AZ. Cluster Autoscaler does not attempt to scale down the nodes below this limit. If not set, defaults to the value of WORKER_MACHINE_COUNT_1.

AUTOSCALER_MAX_SIZE_1 ✖ ✔ Required, use only for prod clusters on Amazon EC2. Maximum number of worker nodes in the second AZ. Cluster Autoscaler does not attempt to scale up the nodes beyond this limit. If not set, defaults to the value of WORKER_MACHINE_COUNT_1.

AUTOSCALER_MIN_SIZE_2 ✖ ✔ Required, use only for prod clusters on Amazon EC2. Minimum number of worker nodes in the third AZ. Cluster Autoscaler does not attempt to scale down the nodes below this limit. If not set, defaults to the value of WORKER_MACHINE_COUNT_2.

AUTOSCALER_MAX_SIZE_2 ✖ ✔ Required, use only for prod clusters on Amazon EC2. Maximum number of worker nodes in the third AZ. Cluster Autoscaler does not attempt to scale up the nodes beyond this limit. If not set, defaults to the value of WORKER_MACHINE_COUNT_2.
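
As a sketch, the following workload cluster settings enable Cluster Autoscaler and bound the first (or only) machine deployment between three and ten worker nodes; the limits shown are examples only.

ENABLE_AUTOSCALER: true
AUTOSCALER_MIN_SIZE_0: 3
AUTOSCALER_MAX_SIZE_0: 10
AUTOSCALER_MAX_NODES_TOTAL: 0
AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: 10m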

Proxy Configuration

If your environment includes proxies, you can optionally configure Tanzu Kubernetes Grid to send outgoing HTTP and HTTPS traffic from kubelet, containerd, and the control plane to your proxies.

Tanzu Kubernetes Grid allows you to enable proxies for any of the following:

- Both the management cluster and one or more Tanzu Kubernetes clusters
- The management cluster only
- One or more Tanzu Kubernetes clusters

For more information, see Configure Proxies in Create a Management Cluster Configuration File.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

TKG_HTTP_PROXY ✔ ✔ Optional, set if you want to configure a proxy; to disable your proxy configuration for an individual cluster, set this to "". The URL of your HTTP proxy, formatted as PROTOCOL://USERNAME:PASSWORD@FQDN-OR-IP:PORT, where:
- Required. PROTOCOL is http.
- Optional. USERNAME and PASSWORD are your HTTP proxy username and password. Include these if the proxy requires authentication.
- Required. FQDN-OR-IP and PORT are the FQDN or IP address and port number of your HTTP proxy.
For example, http://user:[email protected]:1234 or http://myproxy.com:1234. If you set TKG_HTTP_PROXY, you must also set TKG_HTTPS_PROXY.

TKG_HTTPS_PROXY ✔ ✔ Optional, set if you want to configure a proxy. The URL of your HTTPS proxy. You can set this variable to the same value as TKG_HTTP_PROXY or provide a different value. The URL must start with http://. If you set TKG_HTTPS_PROXY, you must also set TKG_HTTP_PROXY.

TKG_NO_PROXY ✔ ✔ Optional. One or more comma-separated network CIDRs or hostnames that must bypass the HTTP(S) proxy. For example, noproxy.yourdomain.com,192.168.0.0/24. Internally, Tanzu Kubernetes Grid appends localhost, 127.0.0.1, the values of CLUSTER_CIDR and SERVICE_CIDR, .svc, and .svc.cluster.local to the value that you set in TKG_NO_PROXY. It also appends your AWS VPC CIDR and 169.254.0.0/16 for deployments to Amazon EC2 and your Azure VNET CIDR, 169.254.0.0/16, and 168.63.129.16 for deployments to Azure. For vSphere, you must manually add the CIDR of VSPHERE_NETWORK, which includes the IP address of your control plane endpoint, to TKG_NO_PROXY. If you set VSPHERE_CONTROL_PLANE_ENDPOINT to an FQDN, add both the FQDN and VSPHERE_NETWORK to TKG_NO_PROXY. Important: If the cluster VMs need to communicate with external services and infrastructure endpoints in your Tanzu Kubernetes Grid environment, ensure that those endpoints are reachable by your proxies or add them to TKG_NO_PROXY. Depending on your environment configuration, this may include, but is not limited to, your OIDC or LDAP server, Harbor, NSX-T and NSX Advanced Load Balancer for deployments on vSphere, and AWS VPC CIDRs that are external to the cluster.
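
For example, a proxied vSphere deployment might use settings like the following, where myproxy.com is the proxy host from the examples above and 192.168.0.0/24 stands in for the CIDR of VSPHERE_NETWORK; replace both with your own values.

TKG_HTTP_PROXY: "http://user:[email protected]:1234"
TKG_HTTPS_PROXY: "http://user:[email protected]:1234"
TKG_NO_PROXY: "noproxy.yourdomain.com,192.168.0.0/24"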

Antrea CNI Configuration

Additional optional variables to set if CNI is set to antrea. For more information, see Configure Antrea CNI in Create a Management Cluster Configuration File.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

ANTREA_NO_SNAT ✔ ✔ Optional. Default false. Set to true to disable Source Network Address Translation (SNAT).

ANTREA_TRAFFIC_ENCAP_MODE ✔ ✔ Optional. Default "encap". Set to either noEncap, hybrid, or NetworkPolicyOnly. For information about using NoEncap or Hybrid traffic modes, see NoEncap and Hybrid Traffic Modes of Antrea in the Antrea documentation.

ANTREA_PROXY ✔ ✔ Optional. Default false. Enables or disables AntreaProxy, to replace kube-proxy for pod-to-ClusterIP Service traffic, for better performance and lower latency. Note that kube-proxy is still used for other types of Service traffic.

ANTREA_POLICY ✔ ✔ Optional. Default true. Enables or disables the Antrea-native policy API, which provides policy CRDs that are specific to Antrea. The implementation of Kubernetes Network Policies remains active when this variable is enabled. For information about using network policies, see Antrea Network Policy CRDs in the Antrea documentation.

ANTREA_TRACEFLOW ✔ ✔ Optional. Default false. Set to true to enable Traceflow. For information about using Traceflow, see the Traceflow User Guide in the Antrea documentation.
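
As an illustrative sketch, the following settings keep the default encapsulation mode but turn on AntreaProxy and Traceflow; adjust them to your own requirements.

CNI: antrea
ANTREA_TRAFFIC_ENCAP_MODE: "encap"
ANTREA_NO_SNAT: false
ANTREA_PROXY: true
ANTREA_POLICY: true
ANTREA_TRACEFLOW: true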

Machine Health Checks

If you want to configure machine health checks for management and Tanzu Kubernetes clusters, set the following variables. For more information, see Configure Machine Health Checks in Create a Management Cluster Configuration File. For information about how to perform Machine Health Check operations after cluster deployment, see Configure Machine Health Checks for Tanzu Kubernetes Clusters.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

ENABLE_MHC ✔ ✔ Optional, set if you want to override the default value. The default value is true. This variable enables or disables the MachineHealthCheck controller, which provides node health monitoring and node auto-repair for worker nodes in management and Tanzu Kubernetes clusters. You can also enable or disable MachineHealthCheck after deployment by using the CLI. For instructions, see Configure Machine Health Checks for Tanzu Kubernetes Clusters.

MHC_UNKNOWN_STATUS_TIMEOUT ✔ ✔ Optional, set if you want to override the default value. The default value is 5m. By default, if the Ready condition of a node remains Unknown for longer than 5m, MachineHealthCheck considers the machine unhealthy and recreates it.

MHC_FALSE_STATUS_TIMEOUT ✔ ✔ Optional, set if you want to override the default value. The default value is 5m. By default, if the Ready condition of a node remains False for longer than 5m, MachineHealthCheck considers the machine unhealthy and recreates it.
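
For example, to keep MachineHealthCheck enabled but tolerate a False Ready condition for ten minutes before a node is recreated, you might set the following; the timeout values are illustrative.

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 10m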

Private Image Repository Configuration

If you deploy Tanzu Kubernetes Grid management clusters and Kubernetes clusters in environments that are not connected to the Internet, you need to set up a private image repository within your firewall and populate it with the Tanzu Kubernetes Grid images. For information about setting up a private image repository, see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment and Deploy Harbor Registry as a Shared Service.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

TKG_CUSTOM_IMAGE_REPOSITORY ✔ ✔ Required if you deploy Tanzu Kubernetes Grid in an Internet-restricted environment. Provide the IP address or FQDN of your private registry. For example, custom-image-repository.io/yourproject.

TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY ✔ ✔ Optional. Set to true if your private image registry uses a self-signed certificate and you do not use TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE. Because the Tanzu connectivity webhook injects the Harbor CA certificate into cluster nodes, TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY should always be set to false when using Harbor as a shared service.

TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE ✔ ✔ Optional. Set if your private image registry uses a self-signed certificate. Provide the CA certificate in base64 encoded format, for example TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: "LS0t[...]tLS0tLQ==".
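
A sketch of these settings for a registry with a self-signed certificate, reusing the example repository address from the table, might look as follows; the CA value is a placeholder.

TKG_CUSTOM_IMAGE_REPOSITORY: custom-image-repository.io/yourproject
TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: false
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: "LS0t[...]tLS0tLQ=="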

vSphere

The options in the table below are the minimum options that you specify in the cluster configuration file when deploying Tanzu Kubernetes clusters to vSphere. Most of these options are the same for both the Tanzu Kubernetes cluster and the management cluster that you use to deploy it.

For more information about the configuration files for vSphere, see Management Cluster Configuration for vSphere and Deploy Tanzu Kubernetes Clusters to vSphere.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

DEPLOY_TKG_ON_VSPHERE7 ✔ ✔ Optional. If deploying to vSphere 7, set to true to skip the prompt about deployment on vSphere 7, or false. See Management Clusters on vSphere with Tanzu.

ENABLE_TKGS_ON_VSPHERE7 ✔ ✔ Optional if deploying to vSphere 7. Set to true to be redirected to the vSphere with Tanzu enablement UI page, or false. See Management Clusters on vSphere with Tanzu.

VIP_NETWORK_INTERFACE ✔ ✔ Optional. Set to eth0, eth1, etc. Network interface name, for example an Ethernet interface. Defaults to eth0.


VSPHERE_CONTROL_PLANE_DISK_GIB ✔ ✔ Optional. The size in gigabytes of the disk for the control plane node VMs. Include the quotes (""). For example, "30".

VSPHERE_CONTROL_PLANE_ENDPOINT ✔ ✔ Required. Static virtual IP address for API requests to the Tanzu Kubernetes cluster. If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address.

VSPHERE_CONTROL_PLANE_MEM_MIB ✔ ✔ Optional. The amount of memory in megabytes for the control plane node VMs. Include the quotes (""). For example, "2048".

VSPHERE_CONTROL_PLANE_NUM_CPUS ✔ ✔ Optional. The number of CPUs for the control plane node VMs. Include the quotes (""). Must be at least 2. For example, "2".

VSPHERE_DATACENTER ✔ ✔ Required. The name of the datacenter in which to deploy the cluster, as it appears in the vSphere inventory. For example, /MY-DATACENTER.

VSPHERE_DATASTORE ✔ ✔ Required. The name of the vSphere datastore for the cluster to use, as it appears in the vSphere inventory. For example, /MY-DATACENTER/datastore/MyDatastore.

VSPHERE_FOLDER ✔ ✔ Required. The name of an existing VM folder in which to place Tanzu Kubernetes Grid VMs, as it appears in the vSphere inventory. For example, if you created a folder named TKG, the path is /MY-DATACENTER/vm/TKG.

VSPHERE_INSECURE ✔ ✔ Optional. Set to true to bypass thumbprint verification of the vCenter Server certificate. If set to false, you must also set VSPHERE_TLS_THUMBPRINT.

VSPHERE_NETWORK ✔ ✔ Required. The name of an existing vSphere network to use as the Kubernetes service network, as it appears in the vSphere inventory. For example, VM Network.

VSPHERE_PASSWORD ✔ ✔ Required. The password for the vSphere user account. This value is base64-encoded when you run tanzu cluster create.

VSPHERE_RESOURCE_POOL ✔ ✔ Required. The name of an existing resource pool in which to place this Tanzu Kubernetes Grid instance, as it appears in the vSphere inventory. To use the root resource pool for a cluster, enter the full path, for example for a cluster named cluster0 in datacenter MY-DATACENTER, the full path is /MY-DATACENTER/host/cluster0/Resources.

VSPHERE_SERVER ✔ ✔ Required. The IP address or FQDN of the vCenter Server instance on which to deploy the Tanzu Kubernetes cluster.


VSPHERE_SSH_AUTHORIZED_KEY ✔ ✔ Required. Paste in the contents of the SSH public key that you created in Prepare to Deploy Management Clusters to vSphere. For example, "ssh-rsa NzaC1yc2EA [...] hnng2OYYSl+8ZyNz3fmRGX8uPYqw== [email protected]".

VSPHERE_STORAGE_POLICY_ID ✔ ✔ Optional. The name of a VM storage policy for the management cluster, as it appears in Policies and Profiles > VM Storage Policies. If VSPHERE_DATASTORE is set, the storage policy must include it. Otherwise, the cluster creation process chooses a datastore that is compatible with the policy.

VSPHERE_TEMPLATE ✖ ✔ Optional. Specify the path to an OVA file if you are using multiple custom OVA images for the same Kubernetes version, in the format /MY-DC/vm/MY-FOLDER-PATH/MY-IMAGE. For more information, see Deploy a Cluster with a Custom OVA Image.

VSPHERE_TLS_THUMBPRINT ✔ ✔ Required if VSPHERE_INSECURE is false. The thumbprint of the vCenter Server certificate. For information about how to obtain the vCenter Server certificate thumbprint, see Obtain vSphere Certificate Thumbprints. You can skip this value if you want to use an insecure connection by setting VSPHERE_INSECURE to true.

VSPHERE_USERNAME ✔ ✔ Required. A vSphere user account with the required privileges for Tanzu Kubernetes Grid operation. For example, [email protected].

VSPHERE_WORKER_DISK_GIB ✔ ✔ Optional. The size in gigabytes of the disk for the worker node VMs. Include the quotes (""). For example, "50".

VSPHERE_WORKER_MEM_MIB ✔ ✔ Optional. The amount of memory in megabytes for the worker node VMs. Include the quotes (""). For example, "4096".

VSPHERE_WORKER_NUM_CPUS ✔ ✔ Optional. The number of CPUs for the worker node VMs. Include the quotes (""). Must be at least 2. For example, "2".
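
Putting the required vSphere options together, a minimal sketch of a vSphere cluster configuration might look like the following; the server address, inventory paths, credentials, endpoint IP, and thumbprint are placeholders built from the examples in the table.

VSPHERE_SERVER: 10.10.10.1
VSPHERE_USERNAME: [email protected]
VSPHERE_PASSWORD: "YOUR-PASSWORD"
VSPHERE_DATACENTER: /MY-DATACENTER
VSPHERE_DATASTORE: /MY-DATACENTER/datastore/MyDatastore
VSPHERE_NETWORK: VM Network
VSPHERE_RESOURCE_POOL: /MY-DATACENTER/host/cluster0/Resources
VSPHERE_FOLDER: /MY-DATACENTER/vm/TKG
VSPHERE_CONTROL_PLANE_ENDPOINT: 10.10.10.100
VSPHERE_SSH_AUTHORIZED_KEY: "ssh-rsa NzaC1yc2EA [...] [email protected]"
VSPHERE_INSECURE: false
VSPHERE_TLS_THUMBPRINT: VCENTER-CERTIFICATE-THUMBPRINT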

NSX Advanced Load Balancer

For information about how to deploy NSX Advanced Load Balancer, see Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

AVI_ENABLE ✔ ✖ Optional. Set to true or false. Enables NSX Advanced Load Balancer. If true, you must set the required variables listed in NSX Advanced Load Balancer below. Defaults to false.

AVI_ADMIN_CREDENTIAL_NAME ✔ ✖ Optional. The name of the Kubernetes Secret that contains the NSX Advanced Load Balancer controller admin username and password. Default avi-controller-credentials.

AVI_AKO_IMAGE_PULL_POLICY ✔ ✖ Optional. Default IfNotPresent.

AVI_CA_DATA_B64 ✔ ✖ Required. The contents of the Controller Certificate Authority that is used to sign the Controller certificate. It must be base64 encoded.

AVI_CA_NAME ✔ ✖ Optional. The name of the Kubernetes Secret that holds the NSX Advanced Load Balancer Controller Certificate Authority. Default avi-controller-ca.

AVI_CLOUD_NAME ✔ ✖ Required. The cloud that you created in your NSX Advanced Load Balancer deployment. For example, Default-Cloud.

AVI_CONTROLLER ✔ ✖ Required. The IP or hostname of the NSX Advanced Load Balancer controller.

AVI_DATA_NETWORK ✔ ✖ Required. The name of the Network on which the Load Balancer floating IP subnet or IP Pool is configured. This Network must be present in the same vCenter Server instance as the Kubernetes network that Tanzu Kubernetes Grid uses, that you specify in the SERVICE_CIDR variable. This allows NSX Advanced Load Balancer to discover the Kubernetes network in vCenter Server and to deploy and configure Service Engines.

AVI_DATA_NETWORK_CIDR ✔ ✖ Required. The CIDR of the subnet to use for the load balancer VIP. This comes from one of the VIP network's configured subnets. You can see the subnet CIDR for a particular Network in the Infrastructure - Networks view of the NSX Advanced Load Balancer interface.

AVI_DISABLE_INGRESS_CLASS ✔ ✖ Optional. Disable Ingress Class. Default false.

AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER ✔ ✖ Optional. Use AKO as the default Ingress Controller. Default false.

AVI_INGRESS_SERVICE_TYPE ✔ ✖ Optional. Specifies whether the AKO functions in ClusterIP mode or NodePort mode. Defaults to NodePort.


AVI_INGRESS_SHARD_VS_SIZE ✔ ✖ Optional. AKO uses a sharding logic for Layer 7 ingress objects. A sharded VS involves hosting multiple insecure or secure ingresses on one virtual IP or VIP. Set to LARGE, MEDIUM, or SMALL. Default SMALL. Use this to control the Layer 7 VS numbers. This applies to both secure and insecure VSes but does not apply to passthrough.

AVI_LABELS ✔ ✖ Optional. Labels in the format key: value. When set, NSX Advanced Load Balancer is enabled only on workload clusters that have this label. For example, team: tkg.

AVI_NAMESPACE ✔ ✖ Optional. The namespace for AKO operator. Default "tkg-system-networking".

AVI_PASSWORD ✔ ✖ Required. The password that you set for the Controller admin when you deployed it.

AVI_SERVICE_ENGINE_GROUP ✔ ✖ Required. Name of the Service Engine Group. For example, Default-Group.

AVI_USERNAME ✔ ✖ Required. The admin username that you set for the Controller host when you deployed it.
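
As a sketch, an NSX Advanced Load Balancer block in a management cluster configuration might look like the following; the controller address, CA data, data network name, and CIDR are placeholders.

AVI_ENABLE: true
AVI_CONTROLLER: avi-controller.example.com
AVI_USERNAME: admin
AVI_PASSWORD: "YOUR-PASSWORD"
AVI_CA_DATA_B64: "LS0t[...]LS0tCg=="
AVI_CLOUD_NAME: Default-Cloud
AVI_SERVICE_ENGINE_GROUP: Default-Group
AVI_DATA_NETWORK: VIP-Network
AVI_DATA_NETWORK_CIDR: 10.10.20.0/24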

NSX-T Pod Routing

These variables configure workload pods with routable IP addresses, as described in Deploy a Cluster with Routable-IP Pods. All variables are strings in double quotes, for example "true".

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

NSXT_POD_ROUTING_ENABLED ✖ ✔ Optional. "true" enables NSX-T routable pods with the variables below. Default is "false". See Deploy a Cluster with Routable-IP Pods.

NSXT_MANAGER_HOST ✖ ✔ Required if NSXT_POD_ROUTING_ENABLED= "true". IP address of NSX-T Manager.

NSXT_ROUTER_PATH ✖ ✔ Required if NSXT_POD_ROUTING_ENABLED= "true". T1 router path shown in NSX-T Manager.

For username/password authentication to NSX-T:

NSXT_USERNAME ✖ ✔ Username for logging in to NSX-T Manager.

NSXT_PASSWORD ✖ ✔ Password for logging in to NSX-T Manager.


For authenticating to NSX-T using credentials and storing them in a Kubernetes secret (also set NSXT_USERNAME and NSXT_PASSWORD above):

NSXT_SECRET_NAMESPACE ✖ ✔ The namespace with the secret containing NSX-T username and password. Default is "kube-system".

NSXT_SECRET_NAME ✖ ✔ The name of the secret containing NSX-T username and password. Default is "cloud-provider-vsphere- nsxt-credentials".

For certificate authentication to NSX-T:

NSXT_ALLOW_UNVERIFIED_SSL ✖ ✔ Set this to "true" if NSX-T uses a self-signed certificate. Default is false.

NSXT_ROOT_CA_DATA_B64 ✖ ✔ Required if NSXT_ALLOW_UNVERIFIED_SSL= "false". Base64-encoded Certificate Authority root certificate string that NSX-T uses for LDAP authentication.

NSXT_CLIENT_CERT_KEY_DATA ✖ ✔ Base64-encoded cert key file string for local client certificate.

NSXT_CLIENT_CERT_DATA ✖ ✔ Base64-encoded cert file string for local client certificate.

For remote authentication to NSX-T with VMware Identity Manager, on VMware Cloud (VMC):

NSXT_REMOTE_AUTH ✖ ✔ Set this to "true" for remote authentication to NSX-T with VMware Identity Manager, on VMware Cloud (VMC). Default is "false".

NSXT_VMC_AUTH_HOST ✖ ✔ VMC authentication host. Default is empty.

NSXT_VMC_ACCESS_TOKEN ✖ ✔ VMC authentication access token. Default is empty.
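
For example, a workload cluster that uses routable-IP pods with username/password authentication to NSX-T might set the following; the manager address and router path are placeholders.

NSXT_POD_ROUTING_ENABLED: "true"
NSXT_MANAGER_HOST: "192.168.1.10"
NSXT_ROUTER_PATH: "/infra/tier-1s/T1-ROUTER-NAME"
NSXT_USERNAME: "admin"
NSXT_PASSWORD: "YOUR-PASSWORD"
NSXT_ALLOW_UNVERIFIED_SSL: "false"
NSXT_ROOT_CA_DATA_B64: "LS0t[...]LS0tCg=="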

Amazon EC2

The variables in the table below are the options that you specify in the cluster configuration file when deploying Tanzu Kubernetes clusters to Amazon EC2. Many of these options are the same for both the Tanzu Kubernetes cluster and the management cluster that you use to deploy it.

For more information about the configuration files for Amazon EC2, see Management Cluster Configuration for Amazon EC2 and Deploy Tanzu Kubernetes Clusters to Amazon EC2.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

AWS_ACCESS_KEY_ID ✔ ✔ Required. The access key ID for your AWS account. Alternatively, you can specify account credentials as a local environment variables or in your AWS default credential provider chain.

AWS_NODE_AZ ✔ ✔ Required. The name of the AWS availability zone in your chosen region to use for this cluster. Availability zone names are the region name with a single lower-case letter suffix, such as a, b, or c. For example, us-west-2a. To deploy a prod cluster with three control plane nodes, you must also set AWS_NODE_AZ_1 and AWS_NODE_AZ_2.

AWS_NODE_AZ_1 and AWS_NODE_AZ_2 ✔ ✔ Optional. Set these variables if you deploy a prod cluster with three control plane nodes. Both availability zones must be in the same region as, and different from, AWS_NODE_AZ. For example, us-west-2b and us-west-2c, or ap-northeast-2b.

AWS_PRIVATE_NODE_CIDR ✔ ✔ Optional. Set this variable if you set AWS_VPC_CIDR. If the recommended range of 10.0.0.0/24 is not available, enter a different IP range in CIDR format for private nodes to use. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ. To deploy a prod management cluster with three control plane nodes, you must also set AWS_PRIVATE_NODE_CIDR_1 and AWS_PRIVATE_NODE_CIDR_2.

AWS_PRIVATE_NODE_CIDR_1 ✔ ✔ Optional. If the recommended range of 10.0.2.0/24 is not available, enter a different IP range in CIDR format. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ_1. See AWS_PRIVATE_NODE_CIDR above for more information.

AWS_PRIVATE_NODE_CIDR_2 ✔ ✔ Optional. If the recommended range of 10.0.4.0/24 is not available, enter a different IP range in CIDR format. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ_2. See AWS_PRIVATE_NODE_CIDR above for more information.

AWS_PUBLIC_NODE_CIDR ✔ ✔ Optional. Set this variable if you set AWS_VPC_CIDR. If the recommended range of 10.0.1.0/24 is not available, enter a different IP range in CIDR format for public nodes to use. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ. To deploy a prod management cluster with three control plane nodes, you must also set AWS_PUBLIC_NODE_CIDR_1 and AWS_PUBLIC_NODE_CIDR_2.

AWS_PUBLIC_NODE_CIDR_1 ✔ ✔ Optional. If the recommended range of 10.0.3.0/24 is not available, enter a different IP range in CIDR format. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ_1. See AWS_PUBLIC_NODE_CIDR above for more information.

AWS_PUBLIC_NODE_CIDR_2 ✔ ✔ Optional. If the recommended range of 10.0.5.0/24 is not available, enter a different IP range in CIDR format. When Tanzu Kubernetes Grid deploys your management cluster, it creates this subnetwork in AWS_NODE_AZ_2. See AWS_PUBLIC_NODE_CIDR above for more information.

AWS_PRIVATE_SUBNET_ID ✔ ✔ Optional. If you set AWS_VPC_ID to use an existing VPC, enter the ID of a private subnet that already exists in AWS_NODE_AZ. If you do not set it, tanzu management-cluster create identifies the private subnet automatically. To deploy a prod management cluster with three control plane nodes, you must also set AWS_PRIVATE_SUBNET_ID_1 and AWS_PRIVATE_SUBNET_ID_2.

AWS_PRIVATE_SUBNET_ID_1 ✔ ✔ Optional. The ID of a private subnet that exists in AWS_NODE_AZ_1. If you do not set this variable, tanzu management-cluster create identifies the private subnet automatically. See AWS_PRIVATE_SUBNET_ID above for more information.

AWS_PRIVATE_SUBNET_ID_2 ✔ ✔ Optional. The ID of a private subnet that exists in AWS_NODE_AZ_2. If you do not set this variable, tanzu management-cluster create identifies the private subnet automatically. See AWS_PRIVATE_SUBNET_ID above for more information.

AWS_PUBLIC_SUBNET_ID ✔ ✔ Optional. If you set AWS_VPC_ID to use an existing VPC, enter the ID of a public subnet that already exists in AWS_NODE_AZ. If you do not set it, tanzu management-cluster create identifies the public subnet automatically. To deploy a prod management cluster with three control plane nodes, you must also set AWS_PUBLIC_SUBNET_ID_1 and AWS_PUBLIC_SUBNET_ID_2.

AWS_PUBLIC_SUBNET_ID_1 ✔ ✔ Optional. The ID of a public subnet that exists in AWS_NODE_AZ_1. If you do not set this variable, tanzu management-cluster create identifies the public subnet automatically. See AWS_PUBLIC_SUBNET_ID above for more information.

AWS_PUBLIC_SUBNET_ID_2 ✔ ✔ Optional. The ID of a public subnet that exists in AWS_NODE_AZ_2. If you do not set this variable, tanzu management-cluster create identifies the public subnet automatically. See AWS_PUBLIC_SUBNET_ID above for more information.

AWS_REGION ✔ ✔ Required. The name of the AWS region in which to deploy the cluster. For example, us-west-2 or ap-northeast-2. You can also specify the us-gov-east and us-gov-west regions in AWS GovCloud. If you have already set a different region as an environment variable, for example, in Prepare to Deploy Management Clusters to Amazon EC2, you must unset that environment variable.

AWS_SECRET_ACCESS_KEY ✔ ✔ Required. The secret access key for your AWS account. Alternatively, you can specify account credentials as local environment variables or in your AWS default credential provider chain.

AWS_SESSION_TOKEN ✔ ✔ Optional. The AWS session token granted to your account, if you are required to use a temporary access key.

AWS_SSH_KEY_NAME ✔ ✔ Required. The name of the SSH private key that you registered with your AWS account.

AWS_VPC_ID ✔ ✔ Optional. To use a VPC that already exists in your selected AWS region, enter its ID and also set AWS_PUBLIC_SUBNET_ID and AWS_PRIVATE_SUBNET_ID. Do not set AWS_VPC_CIDR if you set AWS_VPC_ID.

AWS_VPC_CIDR ✔ ✔ Optional. If you want Tanzu Kubernetes Grid to create a new VPC and the recommended range of 10.0.0.0/16 is not available, enter a different IP range in CIDR format for the cluster to use. If you change AWS_VPC_CIDR, also review AWS_PUBLIC_NODE_CIDR and AWS_PRIVATE_NODE_CIDR. Do not set AWS_VPC_CIDR if you set AWS_VPC_ID.

BASTION_HOST_ENABLED ✔ ✔ Optional. Default "true". Set to "true" to deploy a bastion host, or to "false" if a bastion host already exists in the VPC that you specify with AWS_VPC_ID.

CONTROL_PLANE_MACHINE_TYPE ✔ ✔ Required if the cloud-agnostic SIZE or CONTROLPLANE_SIZE variables are not set. An Amazon EC2 instance type for the control plane node VMs, for example, t3.small or m5.large.

NODE_MACHINE_TYPE ✔ ✔ Required if the cloud-agnostic SIZE or WORKER_SIZE variables are not set. An Amazon EC2 instance type for the worker node VMs, for example, t3.small or m5.large.
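
A minimal sketch of an Amazon EC2 cluster configuration that lets Tanzu Kubernetes Grid create a new VPC might look like the following; the credentials, key pair name, and instance types are placeholders or examples from the table.

AWS_REGION: us-west-2
AWS_NODE_AZ: us-west-2a
AWS_ACCESS_KEY_ID: YOUR-ACCESS-KEY-ID
AWS_SECRET_ACCESS_KEY: YOUR-SECRET-ACCESS-KEY
AWS_SSH_KEY_NAME: YOUR-KEY-PAIR-NAME
AWS_VPC_CIDR: 10.0.0.0/16
BASTION_HOST_ENABLED: "true"
CONTROL_PLANE_MACHINE_TYPE: t3.small
NODE_MACHINE_TYPE: m5.large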

Microsoft Azure

The variables in the table below are the options that you specify in the cluster configuration file when deploying Tanzu Kubernetes clusters to Azure. Many of these options are the same for both the Tanzu Kubernetes cluster and the management cluster that you use to deploy it.

For more information about the configuration files for Azure, see Management Cluster Configuration for Microsoft Azure and Deploy Tanzu Kubernetes Clusters to Azure.

Variable | Can be set in management cluster YAML | Can be set in Tanzu Kubernetes cluster YAML | Description

AZURE_CLIENT_ID ✔ ✔ Required. The client ID of the app for Tanzu Kubernetes Grid that you registered with Azure.

AZURE_CLIENT_SECRET ✔ ✔ Required. Your Azure client secret from Register a Tanzu Kubernetes Grid App on Azure.

AZURE_CUSTOM_TAGS ✔ ✔ Optional. Comma-separated list of tags to apply to Azure resources created for the cluster. A tag is a key-value pair, for example, "foo=bar, plan=prod". For more information about tagging Azure resources, see Use tags to organize your Azure resources and management hierarchy and Tag support for Azure resources in the Microsoft Azure documentation.

AZURE_ENVIRONMENT ✔ ✔ Optional, set if you want to override the default value. The default value is AzurePublicCloud. Supported clouds are AzurePublicCloud, AzureChinaCloud, AzureGermanCloud, AzureUSGovernmentCloud.


AZURE_LOCATION ✔ ✔ Required. The name of the Azure region in which to deploy the cluster. For example, eastus.

AZURE_RESOURCE_GROUP ✔ ✔ Optional. The name of the Azure resource group that you want to use for the cluster. Defaults to the CLUSTER_NAME. Must be unique to each cluster. AZURE_RESOURCE_GROUP and AZURE_VNET_RESOURCE_GROUP are the same by default.

AZURE_SSH_PUBLIC_KEY_B64 ✔ ✔ Required. Your SSH public key, created in Prepare to Deploy Management Clusters to Microsoft Azure, converted into base64 with newlines removed. For example, c3NoLXJzYSBB [...] vdGFsLmlv.

AZURE_SUBSCRIPTION_ID ✔ ✔ Required. The subscription ID of your Azure subscription.

AZURE_TENANT_ID ✔ ✔ Required. The tenant ID of your Azure account.

Networking

AZURE_ENABLE_ACCELERATED_NETWORKING ✔ ✔ Reserved for future use. Set to true to enable Azure accelerated networking on VMs based on compatible Azure Tanzu Kubernetes release (TKr) images. Currently, Azure TKr do not support Azure accelerated networking.

AZURE_ENABLE_PRIVATE_CLUSTER ✔ ✔ Optional. Set this to true to configure the cluster as private and use an Azure Internal Load Balancer (ILB) for its incoming traffic. For more information, see Azure Private Clusters.

AZURE_FRONTEND_PRIVATE_IP ✔ ✔ Optional. Set this if AZURE_ENABLE_PRIVATE_CLUSTER is true and you want to override the default internal load balancer address of 10.0.0.100.

AZURE_VNET_CIDR, AZURE_CONTROL_PLANE_SUBNET_CIDR, AZURE_NODE_SUBNET_CIDR ✔ ✔ Optional, set if you want to deploy the cluster to a new VNET and subnets and override the default values. By default, AZURE_VNET_CIDR is set to 10.0.0.0/16, AZURE_CONTROL_PLANE_SUBNET_CIDR to 10.0.0.0/24, and AZURE_NODE_SUBNET_CIDR to 10.0.1.0/24.

AZURE_VNET_NAME, AZURE_CONTROL_PLANE_SUBNET_NAME, AZURE_NODE_SUBNET_NAME ✔ ✔ Optional, set if you want to deploy the cluster to an existing VNET and subnets, or to assign names to a new VNET and subnets.

AZURE_VNET_RESOURCE_GROUP ✔ ✔ Optional, set if you want to override the default value. The default value is set to the value of AZURE_RESOURCE_GROUP.

Control Plane VMs

AZURE_CONTROL_PLANE_DATA_DISK_SIZE_GIB, AZURE_CONTROL_PLANE_OS_DISK_SIZE_GIB ✔ ✔ Optional. Size of the data disk and OS disk for control plane VMs, in GB, as described in Disk roles in the Azure documentation. Examples: 128, 256. Control plane nodes are always provisioned with a data disk.

AZURE_CONTROL_PLANE_MACHINE_TYPE ✔ ✔ Optional, set if you want to override the default value. An Azure VM size for the control plane node VMs, chosen to fit expected workloads. The default value is Standard_D2s_v3. The minimum requirement for Azure instance types is 2 CPUs and 8 GB memory. For possible values, see the Tanzu Kubernetes Grid installer interface.

AZURE_CONTROL_PLANE_OS_DISK_STORAGE_ACCOUNT_TYPE ✔ ✔ Optional. Type of Azure storage account for control plane VM disks. Example: Premium_LRS.

Worker Node VMs

AZURE_ENABLE_NODE_DATA_DISK ✔ ✔ Optional. Set to true to provision a data disk for each worker node VM, as described in Disk roles in the Azure documentation. Default: false.

AZURE_NODE_DATA_DISK_SIZE_GIB ✔ ✔ Optional. Set this variable if AZURE_ENABLE_NODE_DATA_DISK is true. Size of the data disk for worker VMs, in GB, as described in Disk roles in the Azure documentation. Examples: 128, 256.

AZURE_NODE_OS_DISK_SIZE_GIB ✔ ✔ Optional. Size of the OS disk for worker VMs, in GB, as described in Disk roles in the Azure documentation. Examples: 128, 256.


AZURE_NODE_MACHINE_TYPE ✔ ✔ Optional, set if you want to override the default value. An Azure VM size for the worker node VMs, chosen to fit expected workloads. The default value is Standard_D2s_v3. For possible values, see the Tanzu Kubernetes Grid installer interface.

AZURE_NODE_OS_DISK_STORAGE_ACCOUNT_TYPE ✔ ✔ Optional. Set this variable if AZURE_ENABLE_NODE_DATA_DISK is true. Type of Azure storage account for worker VM disks. Example: Premium_LRS.
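
Putting the required Azure options together, a minimal sketch might look like the following; all IDs, the client secret, and the SSH key are placeholders.

AZURE_LOCATION: eastus
AZURE_TENANT_ID: YOUR-TENANT-ID
AZURE_SUBSCRIPTION_ID: YOUR-SUBSCRIPTION-ID
AZURE_CLIENT_ID: YOUR-CLIENT-ID
AZURE_CLIENT_SECRET: YOUR-CLIENT-SECRET
AZURE_SSH_PUBLIC_KEY_B64: "c3NoLXJzYSBB [...] vdGFsLmlv"
AZURE_CONTROL_PLANE_MACHINE_TYPE: Standard_D2s_v3
AZURE_NODE_MACHINE_TYPE: Standard_D2s_v3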

Customizing Clusters, Plans, and Extensions with ytt Overlays

This topic explains how you can use ytt Overlays to customize Tanzu Kubernetes (workload) clusters, cluster plans, extensions, and shared services.

Tanzu Kubernetes Grid distributes configuration files for clusters and cluster plans in the ~/.tanzu/tkg/providers/ directory, and for extensions and shared services in the Tanzu Kubernetes Grid Extensions Bundle.

You can customize these configurations by adding or modifying configuration files directly, or by using ytt overlays.

Directly customizing configuration files is simpler, but if you are comfortable with ytt overlays, they let you customize configurations at different scopes and manage multiple, modular configuration files, without destructively editing upstream and inherited configuration values.

For more information about ytt, including overlay examples and an interactive validator tool, see:

- Carvel Tools > ytt > Interactive Playground
- The IT Hollow: Using YTT to Customize TKG Deployments

Clusters and Cluster Plans

The ~/.tanzu/tkg/providers/ directory includes ytt directories and overlay files at different levels, which lets you scope configuration settings at each level:

- Provider- and version-specific ytt directories. For example, ~/.tanzu/tkg/providers/infrastructure-aws/v0.6.4/ytt.
  - Specific configurations for provider API version.
  - The base-template.yaml file contains all-caps placeholders such as "${CLUSTER_NAME}" and should not be edited.
  - The overlay.yaml file is tailored to overlay values into base-template.yaml.
- Provider-wide ytt directories. For example, ~/.tanzu/tkg/providers/infrastructure-aws/ytt.
  - Provider-wide configurations that apply to all versions.
- Top-level ytt directory, ~/.tanzu/tkg/providers/ytt.
  - Cross-provider configurations.
  - Organized into numbered directories, and processed in number order.
  - ytt traverses directories in alphabetical order and overwrites duplicate settings, so you can create a /04_user_customizations subdirectory for configurations that take precedence over any in lower-numbered ytt subdirectories.

IMPORTANT: You can only use ytt overlays to modify workload clusters. Using ytt overlays to modify management clusters is not supported.
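
As a minimal sketch of the overlay mechanism, and assuming you place it in a /04_user_customizations subdirectory as described above, the following hypothetical overlay adds a label to every Cluster object that the templates generate; the label key and value are placeholders, and this is only an illustration of the overlay syntax, not a required customization.

#@ load("@ytt:overlay", "overlay")

#@ Match every generated Cluster object and merge in a metadata label.
#@overlay/match by=overlay.subset({"kind": "Cluster"}), expects="1+"
---
metadata:
  #@overlay/match missing_ok=True
  labels:
    #@overlay/match missing_ok=True
    team: tkg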

Cluster and Plan Examples

Examples of ytt overlays for customizing workload clusters and cluster plans include:

- Nameservers on vSphere
- Trust Custom CA Certificates on Cluster Nodes
- Disable Bastion Server on AWS in the TKG Lab repository
- New nginx Workload Plan

Extensions and Shared Services

The Tanzu Kubernetes Grid Extensions Bundle includes templates for ytt overlays to implement various customizations.

These ytt overlay templates are in overlay subdirectories in the following locations:

- Implementing Ingress Control with Contour: ingress/contour/overlays
- Deploy Harbor Registry as a Shared Service: registry/harbor/overlays
- Implementing Monitoring with Prometheus and Grafana: monitoring/prometheus/overlays and monitoring/grafana/overlays
- Implementing Log Forwarding with Fluent Bit: logging/fluent-bit/overlays
- (Deprecated) User Authentication with Dex and Gangway: authentication/dex/overlays and authentication/gangway/overlays


Before deploying an extension, you can use these overlay templates or create and apply your own overlays to the extension as follows:

1 In the extensions bundle directory, under the extension's /overlays directory, modify an overlay template or create a new overlay to contain your custom values:

  - Existing template: Find and modify the template that fits your need. The template filenames indicate their use; for example, change-namespace.yaml and update-registry.yaml.
  - New overlay: Create a new ytt overlay file. For example, to add an annotation to the Grafana extension's HTTP Proxy, create an update-grafana-httproxy.yaml overlay with the following contents:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:yaml", "yaml") #@overlay/match by=overlay.subset({"kind": "HTTPProxy", "metadata": {"name": "grafana- httpproxy"}}) --- metadata: #@overlay/match missing_ok=True annotations: #@overlay/match missing_ok=True dns.alpha.kubernetes.io/hostname: grafana.tkg.vclass.local

2 Save the overlay content as a secret in the extension's namespace. For example, with update-grafana-httproxy.yaml above, run:

kubectl create secret generic grafana-httpproxy --from-file=update-grafana-httpproxy.yaml=update-grafana-httpproxy.yaml -n tanzu-system-monitoring

3 In the extension's deployment file, under extensions/, add a reference to the new secret. For example, for the grafana-httpproxy secret above, add the following to the file /extensions/monitoring/grafana/grafana-extension.yaml under spec.template.ytt.inline.pathsFrom, after the existing grafana-data-values setting:

- secretRef:
    name: grafana-httpproxy

The examples below show some specific use cases for creating and applying custom overlays.

Extension and Shared Service Examples

Examples of applying ytt overlay files for customizing extensions and shared services include: n Contour: External DNS Annotation n Harbor: Clean Up S3 and Trust Let's Encrypt

For more examples, see the TKG Lab repository and its Step by Step setup guide.

Deploying Management Clusters

This topic summarizes how to deploy a Tanzu Kubernetes Grid management cluster or designate one from vSphere with Tanzu. Deploying or designating a management cluster completes the Tanzu Kubernetes Grid installation process and makes Tanzu Kubernetes Grid operational.

This chapter includes the following topics:

- Overview
- Installer UI vs. CLI
- Platforms
- Configuring the Management Cluster
- What Happens When You Create a Management Cluster
- Core Add-ons
- Prepare to Deploy Management Clusters
- Deploy Management Clusters with the Installer Interface
- Deploy Management Clusters from a Configuration File
- Configure Identity Management After Management Cluster Deployment
- Examine the Management Cluster Deployment

Overview

After you have performed the steps described in Chapter 3 Install the Tanzu CLI and Other Tools, you can deploy management clusters to the platforms of your choice.

NOTE: On vSphere with Tanzu, available on vSphere 7 and later, VMware recommends configuring the built-in supervisor cluster as a management cluster instead of using the tanzu CLI to deploy a new management cluster. Deploying a Tanzu Kubernetes Grid management cluster to vSphere 7 when vSphere with Tanzu is not enabled is supported, but the preferred option is to enable vSphere with Tanzu and use the Supervisor Cluster. For details, see vSphere with Tanzu Provides Management Cluster.


The management cluster is a Kubernetes cluster that runs Cluster API operations on a specific cloud provider to create and manage workload clusters on that provider. The management cluster is also where you configure the shared and in-cluster services that the workload clusters use.

Installer UI vs. CLI

You can deploy management clusters in two ways:

- Run the Tanzu Kubernetes Grid installer, a wizard interface that guides you through the process of deploying a management cluster. This is the recommended method.
- Create and edit YAML configuration files, and use them to deploy a management cluster with the CLI commands.

Platforms

You can deploy and manage Tanzu Kubernetes Grid management clusters on:

- vSphere 6.7u3
- vSphere 7, if vSphere with Tanzu is not enabled. For more information, see vSphere with Tanzu Provides Management Cluster.
- Amazon Elastic Compute Cloud (Amazon EC2)
- Microsoft Azure

You can deploy the management cluster with either a single control plane node, for development, or a highly available multi-node control plane, for production environments.

- For information about the required setup for management cluster deployment to your infrastructure of choice, see Prepare to Deploy Management Clusters.
- When you have set up your infrastructure, see Deploy Management Clusters with the Installer Interface or Deploy Management Clusters from a Configuration File.
- After you have deployed a management cluster to the platform of your choice, see Examine the Management Cluster Deployment.
- If you want to register your management cluster with Tanzu Mission Control, follow the procedure in Register Your Management Cluster with Tanzu Mission Control.
- After you have deployed one or more management clusters to one or more platforms, use the Tanzu CLI to Manage Your Management Clusters.
- If necessary, you can Back Up and Restore Clusters.


Configuring the Management Cluster

You deploy your management cluster by running the tanzu management-cluster create command on the bootstrap machine. You configure the management cluster in different ways, depending on whether you specify --ui to launch the installer interface:

- Installer Interface: UI input
- CLI: Set configuration parameters, like AZURE_NODE_MACHINE_TYPE:
  - As local environment variables
  - In the cluster configuration file passed to the --file option

The tanzu management-cluster create command uses these sources and inputs in the following order of increasing precedence:

1 ~/.tanzu/tkg/providers/config_default.yaml: This file contains system defaults, and should not be changed.

2 With the --file option: The cluster configuration file, which defaults to ~/.tanzu/tkg/cluster-config.yaml. This file configures specific invocations of tanzu management-cluster create and other tanzu commands. Use different --file files to save multiple configurations.

3 Local environment variables: Parameter settings in your local environment override settings from config files. Use them to make quick config choices without having to search and edit a config file.

4 With the --ui option: Installer UI input. When you run tanzu management-cluster create --ui, the installer sets all management cluster configuration values from user input and ignores all other CLI options.
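
As a sketch of this precedence, assume a hypothetical configuration file, my-mgmt-config.yaml, passed with --file:

CLUSTER_NAME: my-mgmt-cluster
INFRASTRUCTURE_PROVIDER: vsphere
ENABLE_MHC: true
SERVICE_CIDR: 100.64.0.0/13

If you then export, for example, ENABLE_MHC=false as a local environment variable before running tanzu management-cluster create --file my-mgmt-config.yaml, machine health checks are disabled for that invocation, because local environment variables take precedence over the configuration file; supplying --ui instead would override both with values from the installer interface.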

What Happens When You Create a Management Cluster

Running tanzu management-cluster create creates a temporary management cluster using a Kubernetes in Docker (kind) cluster on the bootstrap machine. After creating the temporary management cluster locally, Tanzu Kubernetes Grid uses it to provision the final management cluster in the platform of your choice.

In the process, tanzu management-cluster create creates or modifies CLI configuration and state files in the user's home directory on the local bootstrap machine:

Location | Content | Change

~/.tanzu/tkg/bom/ | Bill of Materials (BoM) files that list specific versions of all of the packages that Tanzu Kubernetes Grid requires when it creates a cluster with a specific OS and Kubernetes version. Tanzu Kubernetes Grid adds to this directory as new Tanzu Kubernetes release versions are published. | Add if not already present.

~/.tanzu/tkg/providers/ | Configuration template files for Cluster API, cloud providers, and other dependencies, organized with ytt overlays for non-destructive modification. | Add if not already present.

~/.tanzu/tkg/providers-TIMESTAMP-HASH/ | Backups of /providers directories from previous installations. | Add if not first installation.

~/.tanzu/config.yaml | Names, contexts, and certificate file locations for the management clusters that the tanzu CLI knows about, and which is the current one. | Add new management cluster information and set it as current.

~/.tanzu/tkg/cluster-config.yaml | Default cluster configuration file that the tanzu cluster create and tanzu management-cluster create commands use if you do not specify one with --file. Best practice is to use a configuration file unique to each cluster. | Add empty file if not already present.

~/.tanzu/tkg/clusterconfigs/IDENTIFIER.yaml | Cluster configuration file that tanzu management-cluster create --ui writes out with values input from the installer interface. IDENTIFIER is a unique identifier generated by the installer. | Create file.

~/.tanzu/tkg/config.yaml | List of configurations and locations for the Tanzu Kubernetes Grid core and all of its providers. | Add if not already present.

~/.tanzu/tkg/providers/config.yaml | Similar to ~/.tanzu/tkg/config.yaml, but only lists providers and configurations in the ~/.tanzu/tkg/providers directory, not configuration files used by core Tanzu Kubernetes Grid. | Add if not already present.

~/.tanzu/tkg/providers/config_default.yaml | System-wide default configurations for providers. Best practice is not to edit this file, but to change provider configs through ytt overlay files. | Add if not already present.

~/.kube-tkg/config | Management cluster kubeconfig file containing names and certificates for the management clusters that the tanzu CLI knows about. Location overridden by the KUBECONFIG environment variable. | Add new management cluster info and set the cluster as the current-context.

~/.kube/config | Configuration and state for the kubectl CLI, including all management and workload clusters, and which is the current context. | Add new management cluster name, context, and certificate info. Do not change the current kubectl context to the new cluster.

Core Add-ons

When you deploy a management or a workload cluster, Tanzu Kubernetes Grid installs the following core add-ons in the cluster:

- CNI: cni/calico or cni/antrea
- (vSphere only) vSphere CPI: cloud-provider/vsphere-cpi
- (vSphere only) vSphere CSI: csi/vsphere-csi
- Authentication: authentication/pinniped
- Metrics Server: metrics/metrics-server

Tanzu Kubernetes Grid manages the lifecycle of the core add-ons. For example, it automatically upgrades the add-ons when you upgrade your management and workload clusters.

For more information about the core add-ons, see Update and Troubleshoot Core Add-On Configuration.


Prepare to Deploy Management Clusters

Before you can use the Tanzu CLI or installer interface to deploy a management cluster, you must make sure that your infrastructure provider is correctly set up.

- For information about how to set up a vSphere infrastructure provider, see Prepare to Deploy Management Clusters to vSphere.
- For information about how to set up an Amazon EC2 infrastructure provider, see Prepare to Deploy Management Clusters to Amazon EC2.
- For information about how to set up a Microsoft Azure infrastructure provider, see Prepare to Deploy Management Clusters to Microsoft Azure.

For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

If you need to deploy Tanzu Kubernetes Grid in an environment with no external Internet access, see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment.

To deploy Tanzu Kubernetes Grid to VMware Cloud on AWS or to Azure VMware Solution, see Prepare a vSphere Management as a Service Infrastructure.

Prepare to Deploy Management Clusters to vSphere

Before you can use the Tanzu CLI or installer interface to deploy a management cluster, you must prepare your vSphere environment. You must make sure that vSphere meets the general requirements, and import the base image templates from which Tanzu Kubernetes Grid creates cluster node VMs. Each base image template contains a version of a machine OS and a version of Kubernetes.

General Requirements

- A machine with the Tanzu CLI, Docker, and kubectl installed. See Chapter 3 Install the Tanzu CLI and Other Tools.
  - This is the bootstrap machine from which you run tanzu, kubectl, and other commands.
  - The bootstrap machine can be a local physical machine or a VM that you access via a console window or client shell.
- A vSphere 7, vSphere 6.7u3, VMware Cloud on AWS, or Azure VMware Solution account with:
  - vSphere 6.7: an Enterprise Plus license.
  - vSphere 7: see vSphere with Tanzu Provides Management Cluster below.
  - VMware Cloud on AWS: a deployed SDDC version that is compatible with this version of Tanzu Kubernetes Grid. See the VMware Product Interoperability Matrix.
  - At least the permissions described in Required Permissions for the vSphere Account.


- Your vSphere instance has the following objects in place:
  - Either a standalone host or a vSphere cluster with at least two hosts.
  - If you are deploying to a vSphere cluster, ideally vSphere DRS is enabled.
  - Optionally, a resource pool in which to deploy the Tanzu Kubernetes Grid instance.
  - A VM folder in which to collect the Tanzu Kubernetes Grid VMs.
  - A datastore with sufficient capacity for the control plane and worker node VM files.
  - If you intend to deploy multiple Tanzu Kubernetes Grid instances to the same vSphere instance, create a dedicated resource pool, VM folder, and network for each instance that you deploy.
- You have done the following to prepare your vSphere environment:
  - Created a base image template that matches the management cluster's Kubernetes version. See Import the Base Image Template into vSphere.
  - Created a vSphere account for Tanzu Kubernetes Grid, with a role and permissions that let it manipulate vSphere objects as needed. See Required Permissions for the vSphere Account.
- A network with:
  - A DHCP server configured with Option 3 (Router) and Option 6 (DNS) with which to connect the cluster node VMs that Tanzu Kubernetes Grid deploys. The node VMs must be able to connect to vSphere.
  - A set of available static virtual IP addresses for all of the clusters that you create, including both management and Tanzu Kubernetes clusters.
  - Every cluster that you deploy to vSphere requires one static IP address for Kube-VIP to use for the API server endpoint. You specify this static IP address when you deploy a management cluster. Make sure that these IP addresses are not in the DHCP range, but are in the same subnet as the DHCP range. Before you deploy management clusters to vSphere, make a DHCP reservation for Kube-VIP on your DHCP server. Use an auto-generated MAC address when you make the DHCP reservation for Kube-VIP, so that the DHCP server does not assign this IP to other machines.
  - Each control plane node of every cluster that you deploy requires a static IP address. This includes both management clusters and Tanzu Kubernetes clusters. These static IP addresses are required in addition to the static IP address that you assign to Kube-VIP when you deploy a management cluster. To make the IP addresses that your DHCP server assigned to the control plane nodes static, you can configure a DHCP reservation for each control plane node in the cluster, after you deploy it. For instructions on how to configure DHCP reservations, see your DHCP server documentation.
  - Traffic allowed out to vCenter Server from the network on which clusters will run.


n Traffic allowed between your local bootstrap machine and port 6443 of all VMs in the clusters you create. Port 6443 is where the Kubernetes API is exposed.

n Traffic allowed between port 443 of all VMs in the clusters you create and vCenter Server. Port 443 is where the vCenter Server API is exposed.

n Traffic allowed between your local bootstrap machine out to the image repositories listed in the management cluster Bill of Materials (BoM) file, over port 443, for TCP. The BoM file is under ~/.tanzu/tkg/bom/ and its name includes the Tanzu Kubernetes Grid version, for example bom-1.3.1+vmware.2.yaml for v1.3.1.
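For example, a quick way to see which repositories the bootstrap machine must be able to reach is to search the BoM files for imageRepository entries (a minimal sketch; the exact BoM file names depend on your Tanzu Kubernetes Grid version):

# List the image repositories referenced by the BoM files on the bootstrap machine
grep -h imageRepository ~/.tanzu/tkg/bom/*.yaml | sort -u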

n The Network Time Protocol (NTP) service running on all hosts, and the hosts running on UTC. To check the time settings on hosts:

1 Use SSH to log in to the ESXi host.

2 Run the date command to see the timezone settings.

3 If the timezone is incorrect, run esxcli system time set, as shown in the example after this list.

n If your vSphere environment runs NSX-T Data Center, you can use the NSX-T Data Center interfaces when you deploy management clusters. Make sure that your NSX-T Data Center setup includes a segment on which DHCP is enabled. Make sure that NTP is configured on all ESXi hosts, on vCenter Server, and on the bootstrap machine.
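For example, the time check described in the steps above might look like the following (the host address and date values are placeholders, not from the original procedure):

# Log in to the ESXi host and inspect its clock
ssh root@ESXI-HOST-ADDRESS
date
# If the time is wrong, set it explicitly (values shown are illustrative)
esxcli system time set -y 2021 -M 06 -d 01 -H 12 -m 00 -s 00
# Confirm the new setting
esxcli system time get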

Or see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment for installing without external network access.

vSphere with Tanzu Provides Management Cluster

On vSphere 7 and later, the vSphere with Tanzu feature includes a Supervisor Cluster that you can configure as a management cluster for Tanzu Kubernetes Grid. This means that on vSphere 7, you do not need to use the tanzu management-cluster create command to deploy a management cluster if vSphere with Tanzu is enabled. Deploying a Tanzu Kubernetes Grid management cluster to vSphere 7 when vSphere with Tanzu is not enabled is supported, but the preferred option is to enable vSphere with Tanzu and use the built-in Supervisor Cluster.

The Tanzu CLI works with both management clusters deployed through vSphere with Tanzu and management clusters deployed by Tanzu Kubernetes Grid on Azure, Amazon EC2, and vSphere 6.7, letting you deploy and manage workload clusters across multiple infrastructures using a single tool. For more information, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

For information about the vSphere with Tanzu feature in vSphere 7, see vSphere with Tanzu Configuration and Management in the vSphere 7 documentation.

NOTE: On VMware Cloud on AWS and Azure VMware Solution, you cannot create a supervisor cluster, and need to deploy a management cluster to run tanzu commands.


Static VIPs and Load Balancers for vSphere

Each management cluster and Tanzu Kubernetes cluster that you deploy to vSphere requires one static virtual IP address for external requests to the cluster's API server. You must be able to assign this IP address, so it cannot be within your DHCP range, but it must be in the same subnet as the DHCP range.

The cluster control plane's Kube-vip pod uses this static virtual IP address to serve API requests, and the API server certificate includes the address to enable secure TLS communication. In Tanzu Kubernetes clusters, Kube-vip runs in a basic, Layer-2 failover mode, assigning the virtual IP address to one control plane node at a time. In this mode, Kube-vip does not function as a true load balancer for control plane traffic.

Tanzu Kubernetes Grid also does not use Kube-vip as a load balancer for workloads in workload clusters. Kube-vip is used solely by the cluster's API server.

To load-balance workloads on vSphere, use NSX Advanced Load Balancer, also known as Avi Load Balancer, Essentials Edition. You must deploy the NSX Advanced Load Balancer in your vSphere instance before you deploy management clusters. See Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch.

Import a Base Image Template into vSphere

Before you can deploy a cluster to vSphere, you must import into vSphere a base image template containing the OS and Kubernetes versions that the cluster nodes run on. For each supported pair of OS and Kubernetes versions, VMware publishes a base image template in OVA format, for deploying clusters to vSphere. After you import the OVA into vSphere, you must convert the resulting VM into a VM template.

Supported base images for cluster nodes depend on the type of cluster, as follows:

n Management Cluster: OVA must have Kubernetes v1.20.5, the default version for Tanzu Kubernetes Grid v1.3.1. So it must be one of the following:

n Ubuntu v20.04 Kubernetes v1.20.5 OVA

n Photon v3 Kubernetes v1.20.5 OVA

n Workload Clusters: OVA can have any supported combination of OS and Kubernetes version, as packaged in a Tanzu Kubernetes release. See Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions.

To import a base image template into vSphere:

1 Go to the Tanzu Kubernetes Grid downloads page, log in with your My VMware credentials, and click Go to Downloads.

2 Download a Tanzu Kubernetes Grid OVA for the cluster nodes. For the management cluster, this must be one of the Kubernetes v1.20.5 OVA downloads.

Important: Make sure you download the most recent OVA base image templates in the event of security patch releases. You can find updated base image templates that include security patches on the Tanzu Kubernetes Grid product download page.


3 In the vSphere Client, right-click an object in the vCenter Server inventory, select Deploy OVF template.

4 Select Local file, click the button to upload files, and navigate to the downloaded OVA file on your local machine.

5 Follow the installer prompts to deploy a VM from the OVA.

n Accept or modify the appliance name

n Select the destination datacenter or folder

n Select the destination host, cluster, or resource pool

n Accept the end user license agreements (EULA)

n Select the disk format and destination datastore

n Select the network for the VM to connect to

NOTE: If you select thick provisioning as the disk format, when Tanzu Kubernetes Grid creates cluster node VMs from the template, the full size of each node's disk is reserved. This can rapidly consume storage if you deploy many clusters or clusters with many nodes. However, if you select thin provisioning, the amount of storage that appears to be available as you deploy clusters can be misleading: there might be enough storage at the time that you deploy clusters, but storage might run out as the clusters run and accumulate data.

6 Click Finish to deploy the VM.

7 When the OVA deployment finishes, right-click the VM and select Template > Convert to Template.

NOTE: Do not power on the VM before you convert it to a template.

8 In the VMs and Templates view, right-click the new template, select Add Permission, and assign the tkg-user to the template with the TKG role.

For information about how to create the user and role for Tanzu Kubernetes Grid, see Required Permissions for the vSphere Account above.

Repeat the procedure for each of the Kubernetes versions for which you downloaded the OVA file.
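If you script your environment preparation, a CLI alternative to the vSphere Client steps above is the open-source govc tool. This is not part of the documented procedure, and the datastore, folder, and OVA names below are placeholders:

# Deploy the OVA and convert the resulting VM into a template
govc import.ova -ds=DATASTORE-NAME -folder=/DATACENTER-NAME/vm/tkg -name=photon-3-kube-v1.20.5 ./photon-3-kube-v1.20.5.ova
govc vm.markastemplate photon-3-kube-v1.20.5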

Required Permissions for the vSphere Account

The vCenter Single Sign On account that you provide to Tanzu Kubernetes Grid when you deploy a management cluster must have the correct permissions in order to perform the required operations in vSphere.

It is not recommended to provide a vSphere administrator account to Tanzu Kubernetes Grid, because this provides Tanzu Kubernetes Grid with far greater permissions than it needs. The best way to assign permissions to Tanzu Kubernetes Grid is to create a role and a user account, and then to grant that user account that role on vSphere objects.


NOTE: If you are deploying Tanzu Kubernetes clusters to vSphere 7 and vSphere with Tanzu is enabled, you must set the Global > Cloud Admin permission in addition to the permissions listed below. If you intend to use Velero to back up and restore management or workload clusters, you must also set the permissions listed in Credentials and Privileges for VMDK Access in the Virtual Disk Development Kit Programming Guide.

1 In the vSphere Client, go to Administration > Access Control > Roles, and create a new role, for example TKG, with the following permissions.

The required permissions, by vSphere object, are as follows:

n Cns: Searchable

n Datastore: Allocate space, Browse datastore, Low level file operations

n Global (if using Velero for backup and restore): Disable methods, Enable methods, Licenses

n Network: Assign network

n Profile-driven storage: Profile-driven storage view

n Resource: Assign virtual machine to resource pool

n Sessions: Message, Validate session

n Virtual machine: Change Configuration > Add existing disk, Change Configuration > Add new disk, Change Configuration > Add or remove device, Change Configuration > Advanced configuration, Change Configuration > Change CPU count, Change Configuration > Change Memory, Change Configuration > Change Settings, Change Configuration > Configure Raw device, Change Configuration > Extend virtual disk, Change Configuration > Modify device settings, Change Configuration > Remove disk, Change Configuration > Toggle disk change tracking*, Edit Inventory > Create from existing, Edit Inventory > Remove, Interaction > Power On, Interaction > Power Off, Provisioning > Allow read-only disk access*, Provisioning > Allow virtual machine download*, Provisioning > Deploy template, Snapshot Management > Create snapshot*, Snapshot Management > Remove snapshot*

n vApp: Import

* Required to enable the Velero plugin, as described in Back Up and Restore Clusters. You can add these permissions when needed later.

2 In Administration > Single Sign On > Users and Groups, create a new user account in the appropriate domain, for example tkg-user.

3 In the Hosts and Clusters, VMs and Templates, Storage, and Networking views, right-click the objects that your Tanzu Kubernetes Grid deployment will use, select Add Permission, and assign the tkg-user with the TKG role to each object.

n Hosts and Clusters

n The root vCenter Server object

n The Datacenter and all of the Host and Cluster folders, from the Datacenter object down to the cluster that manages the Tanzu Kubernetes Grid deployment

n Target hosts and clusters


n Target resource pools, with propagate to children enabled

n VMs and Templates

n The deployed Tanzu Kubernetes Grid base image templates

n Target VM and Template folders, with propagate to children enabled

n Storage

n Datastores and all storage folders, from the Datacenter object down to the datastores that will be used for Tanzu Kubernetes Grid deployments

n Networking

n Networks or distributed port groups to which clusters will be assigned

n Distributed switches

Minimum VM Sizes for Cluster Nodes

Configure the sizes of your management and Tanzu Kubernetes (workload) cluster nodes depending on cluster complexity and expected demand.

For all clusters on vSphere, you configure these with the --size, --controlplane-size, and --worker-size options to tkg init and tkg create cluster, or their Tanzu CLI equivalents listed in the Table of Equivalents. For greater granularity, you can use the VSPHERE_CONTROL_PLANE_* and VSPHERE_WORKER_* configuration variables that end in _DISK_GIB, _NUM_CPUS, and _MEM_MIB.

For management clusters, the installer interface Instance Type field also configures node VM sizes.

For single-worker management and workload clusters running sample applications, use the following minimum VM sizes:

n No services installed: small

n Basic services installed (Dex, Gangway, Wavefront, Fluentbit, Contour, Envoy, and TMC agent): medium

Create an SSH Key Pair

In order for the Tanzu CLI to connect to vSphere from the machine on which you run it, you must provide the public key part of an SSH key pair to Tanzu Kubernetes Grid when you deploy the management cluster. If you do not already have one on the machine on which you run the CLI, you can use a tool such as ssh-keygen to generate a key pair.

1 On the machine on which you will run the Tanzu CLI, run the following ssh-keygen command. ssh-keygen -t rsa -b 4096 -C "[email protected]"

2 At the prompt Enter file in which to save the key (/root/.ssh/id_rsa): press Enter to accept the default.

3 Enter and repeat a password for the key pair.


4 Add the private key to the SSH agent running on your machine, and enter the password you created in the previous step.

ssh-add ~/.ssh/id_rsa

5 Open the file .ssh/id_rsa.pub in a text editor so that you can easily copy and paste it when you deploy a management cluster.

Obtain vSphere Certificate Thumbprints

If your vSphere environment uses untrusted, self-signed certificates to authenticate connections, you must verify the thumbprint of the vCenter Server when you deploy a management cluster. If your vSphere environment uses trusted certificates that are signed by a known Certificate Authority (CA), you do not need to verify the thumbprint.

You can use either SSH and OpenSSL or the Platform Services Controller to obtain certificate thumbprints.

vCenter Server Appliance

You can use SSH and OpenSSL to obtain the certificate thumbprint for a vCenter Server Appliance instance.

1 Use SSH to connect to the vCenter Server Appliance as root user. $ ssh root@vcsa_address

2 Use openssl to view the certificate thumbprint.

openssl x509 -in /etc/vmware-vpx/ssl/rui.crt -fingerprint -sha1 -noout

3 Copy the certificate thumbprint so that you can verify it when you deploy a management cluster.
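If SSH access to the appliance is not available, an alternative not shown in the original procedure is to read the certificate over the network from the bootstrap machine, assuming OpenSSL is installed locally and the vCenter address is a placeholder:

# Retrieve the vCenter Server certificate and print its SHA-1 thumbprint
openssl s_client -connect VCENTER-ADDRESS:443 < /dev/null 2>/dev/null | openssl x509 -fingerprint -sha1 -noout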

Platform Services Controller

On vSphere 6.7u3, you can obtain a vCenter Server certificate thumbprint by logging into the Platform Services Controller for that vCenter Server instance. If you are deploying a management cluster to vSphere 7, there is no Platform Services Controller.

1 Log in to the Platform Services Controller interface.

n Embedded Platform Services Controller: https://vcenter_server_address/psc

n Standalone Platform Services Controller: https://psc_address/psc

2 Select Certificate Management and enter a vCenter Single Sign-On password.

3 Select Machine Certificates, select a certificate, and click Show Details.

4 Copy the certificate thumbprint so that you can verify it when you deploy a management cluster.


What to Do Next

For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

If you are using Tanzu Kubernetes Grid in an environment with an external internet connection, once you have set up identity management, you are ready to deploy management clusters to vSphere.

n Deploy Management Clusters with the Installer Interface. This is the preferred option for first deployments.

n Deploy Management Clusters from a Configuration File. This is the more complicated method, which allows greater flexibility of configuration and automation.

n If you are using Tanzu Kubernetes Grid in an internet-restricted environment, see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment for the additional steps to perform.

n If you want to deploy clusters to Amazon EC2 and Azure as well as to vSphere, see Prepare to Deploy Management Clusters to Amazon EC2 and Prepare to Deploy Management Clusters to Microsoft Azure for the required setup for those platforms.

Prepare to Deploy Management Clusters to Amazon EC2

This topic explains how to prepare Amazon EC2 for running Tanzu Kubernetes Grid.

Before you can use the Tanzu CLI or installer interface to deploy a management cluster, you must prepare the bootstrap machine on which you run the Tanzu CLI and set up your Amazon Web Services (AWS) account.

If you are installing Tanzu Kubernetes Grid on VMware Cloud on AWS, you are installing to a vSphere environment. See Preparing VMware Cloud on AWS in Prepare a vSphere Management as a Service Infrastructure to prepare your environment, and Prepare to Deploy Management Clusters to vSphere to deploy management clusters.

General Requirements

n The Tanzu CLI installed locally. See Chapter 3 Install the Tanzu CLI and Other Tools.

n You have the access key and access key secret for an active AWS account. For more information, see AWS Account and Access Keys in the AWS documentation.

n Your AWS account must have at least the permissions described in Required Permissions for the AWS Account.


n Your AWS account has sufficient resource quotas for the following. For more information, see Amazon VPC Quotas in the AWS documentation and Resource Usage in Your Amazon Web Services Account below:

n Virtual Private Cloud (VPC) instances. By default, each management cluster that you deploy creates one VPC and one or three NAT gateways. The default NAT gateway quota is 5 instances per availability zone, per account.

n Elastic IP (EIP) addresses. The default EIP quota is 5 EIP addresses per region, per account.

n Traffic is allowed between your local bootstrap machine and port 6443 of all VMs in the clusters you create. Port 6443 is where the Kubernetes API is exposed.

n Traffic is allowed between your local bootstrap machine and the image repositories listed in the management cluster Bill of Materials (BOM) file, over port 443, for TCP.

n The BOM file is under ~/.tanzu/tkg/bom/ and its name includes the Tanzu Kubernetes Grid version, for example tkg-bom-1.3.1+vmware.1.yaml for v1.3.1.

n Run a DNS lookup on all imageRepository values to find their IPs, for example projects.registry.vmware.com/tkg requires network access to 208.91.0.233. See the connectivity check example below.

n The AWS CLI installed locally.

n jq installed locally.

The AWS CLI uses jq to process JSON when creating SSH key pairs. It is also used to prepare the environment or configuration variables when you deploy Tanzu Kubernetes Grid by using the CLI.
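For example, a minimal connectivity check from the bootstrap machine against the default image repository mentioned above might look like the following (the commands assume dig and curl are installed; the host name is the one named in the requirement):

# Resolve the registry host listed in the BoM file
dig +short projects.registry.vmware.com
# Confirm that the registry answers on port 443 (an HTTP status code is printed)
curl -s -o /dev/null -w '%{http_code}\n' https://projects.registry.vmware.com/v2/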

Or see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment for installing without external network access.

Resource Usage in Your AWS Account

For each cluster that you create, Tanzu Kubernetes Grid provisions a set of resources in your AWS account.

For development management clusters that are not configured for high availability, Tanzu Kubernetes Grid provisions the following resources:

n 3 VMs, including a control plane node, a worker node (to run the cluster agent extensions) and, by default, a bastion host. If you specify additional VMs in your node pool, those are provisioned as well.

n 4 security groups, one for the load balancer and one for each of the initial VMs.

n 1 private subnet and 1 public subnet in the specified availability zone.

n 1 public and 1 private route table in the specified availability zone.

n 1 classic load balancer.

n 1 internet gateway.


n 1 NAT gateway in the specified availability zone.

n By default, 1 EIP, for the NAT gateway, when clusters are deployed in their own VPC. You can optionally share VPCs rather than creating new ones, such as a workload cluster sharing a VPC with its management cluster.

For production management clusters, which are configured for high availability, Tanzu Kubernetes Grid provisions the following resources to support distribution across three availability zones:

n 3 control plane VMs

n 3 private and public subnets

n 3 private and public route tables

n 3 NAT gateways

n By default, 3 EIPs, one for each NAT gateway, for clusters deployed in their own VPC. You can optionally share VPCs rather than creating new ones, such as a workload cluster sharing a VPC with its management cluster.

AWS implements a set of default limits or quotas on these types of resources and allows you to modify the limits. Typically, the default limits are sufficient to get started creating clusters from Tanzu Kubernetes Grid. However, as you increase the number of clusters that you run or the workloads on your clusters, you will approach these limits. When you reach the limits imposed by AWS, any attempts to provision that type of resource fail. As a result, Tanzu Kubernetes Grid will be unable to create a new cluster, or you might be unable to create additional deployments on your existing clusters. Therefore, regularly assess the limits that you have specified in your AWS account and adjust them as necessary to fit your business needs.

For information about the sizes of cluster node instances, see Amazon EC2 Instance Types in the AWS documentation.

Virtual Private Clouds and NAT Gateway Limits

If you create a new Virtual Private Cloud (VPC) when you deploy a development management cluster, Tanzu Kubernetes Grid creates a dedicated NAT gateway for the management cluster. If you deploy a production management cluster, Tanzu Kubernetes Grid creates three NAT gateways, one in each of the availability zones. By default, Tanzu Kubernetes Grid also creates a new VPC and one or three NAT gateways for each Tanzu Kubernetes cluster that you deploy from that management cluster.

By default, AWS allows five NAT gateways per availability zone per account. Consequently, if you always create a new VPC for each cluster, you can create only five development clusters in a single availability zone. If you already have five NAT gateways in use, Tanzu Kubernetes Grid is unable to provision the necessary resources when you attempt to create a new cluster. If you do not want to change the default quotas, to create more than five development clusters in a given availability zone, you must share existing VPCs, and therefore their NAT gateways, between multiple clusters.
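To see how close you are to the NAT gateway quota before deploying, you can count the gateways already in use in a region. This is a sketch; the region shown is an example:

# Count the NAT gateways currently available in the target region
aws ec2 describe-nat-gateways --region us-west-2 --filter Name=state,Values=available --query 'length(NatGateways)'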


There are three possible scenarios regarding VPCs and NAT gateway usage when you deploy management clusters and Tanzu Kubernetes clusters.

n Create a new VPC and NAT gateway(s) for every management cluster and Tanzu Kubernetes cluster

If you deploy a management cluster and use the option to create a new VPC and if you make no modifications to the configuration when you deploy Tanzu Kubernetes clusters from that management cluster, the deployment of each of the Tanzu Kubernetes clusters also creates a VPC and one or three NAT gateways. In this scenario, you can deploy one development management cluster and up to 4 development Tanzu Kubernetes clusters, due to the default limit of 5 NAT gateways per availability zone.

n Reuse a VPC and NAT gateway(s) that already exist in your availability zone(s)

If a VPC already exists in the availability zone(s) in which you are deploying a management cluster, for example a VPC that you created manually or by using tools such as CloudFormation or Terraform, you can specify that the management cluster should use this VPC. In this case, all of the Tanzu Kubernetes clusters that you deploy from that management cluster also use the specified VPC and its NAT gateway(s).

An existing VPC must be configured with the following networking:

n Two subnets for development clusters or six subnets for production clusters

n One NAT gateway for development clusters or three NAT gateways for production clusters

n One internet gateway and corresponding routing tables

n Create a new VPC and NAT gateway(s) for the management cluster and deploy Tanzu Kubernetes clusters that share that VPC and NAT gateway(s)

If you are starting with an empty availability zone(s), you can deploy a management cluster and use the option to create a new VPC. If you want the Tanzu Kubernetes clusters to share a VPC that Tanzu Kubernetes Grid created, you must modify the cluster configuration when you deploy Tanzu Kubernetes clusters from this management cluster.

For information about how to deploy management clusters that either create or reuse a VPC, see Deploy Management Clusters with the Installer Interface and Deploy Management Clusters from a Configuration File.

For information about how to deploy Tanzu Kubernetes clusters that share a VPC that Tanzu Kubernetes Grid created when you deployed the management cluster, see Deploy a Cluster that Shares a VPC with the Management Cluster.

Required Permissions for the AWS Account

Your AWS account must have at least the following permissions:

n Required IAM Resources: Tanzu Kubernetes Grid creates these resources when you deploy a management cluster to your AWS account for the first time.


n Required Permissions for tanzu management-cluster create: Tanzu Kubernetes Grid uses these permissions when you run tanzu management-cluster create or deploy your management clusters from the installer interface.

Required IAM Resources

When you deploy your first management cluster to Amazon EC2, you instruct Tanzu Kubernetes Grid to create a CloudFormation stack, tkg-cloud-vmware-com, in your AWS account. This CloudFormation stack defines the identity and access management (IAM) resources that Tanzu Kubernetes Grid uses to deploy and run clusters on Amazon EC2, which includes the following IAM policies, roles, and profiles:

n AWS::IAM::InstanceProfile:

n control-plane.tkg.cloud.vmware.com

n controllers.tkg.cloud.vmware.com

n nodes.tkg.cloud.vmware.com

n AWS::IAM::ManagedPolicy:

n arn:aws:iam::YOUR-ACCOUNT-ID:policy/control-plane.tkg.cloud.vmware.com. This policy is attached to the control-plane.tkg.cloud.vmware.com IAM role.

n arn:aws:iam::YOUR-ACCOUNT-ID:policy/nodes.tkg.cloud.vmware.com. This policy is attached to the control-plane.tkg.cloud.vmware.com and nodes.tkg.cloud.vmware.com IAM roles.

n arn:aws:iam::YOUR-ACCOUNT-ID:policy/controllers.tkg.cloud.vmware.com. This policy is attached to the controllers.tkg.cloud.vmware.com and control-plane.tkg.cloud.vmware.com IAM roles.

n AWS::IAM::Role:

n control-plane.tkg.cloud.vmware.com

n controllers.tkg.cloud.vmware.com

n nodes.tkg.cloud.vmware.com

The AWS user that you provide to Tanzu Kubernetes Grid when you create the CloudFormation stack must have permissions to manage IAM resources, such as IAM policies, roles, and instance profiles. You need to create only one CloudFormation stack per AWS account, regardless of whether you use a single or multiple AWS regions for your Tanzu Kubernetes Grid environment.
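If you want to create the CloudFormation stack ahead of time rather than during management cluster deployment, the Tanzu CLI can create it directly. This is a sketch; it assumes your AWS credentials are already configured as described in Configure AWS Account Credentials and SSH Key below:

# Create or update the tkg-cloud-vmware-com CloudFormation stack in the current AWS account
tanzu management-cluster permissions aws set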

After Tanzu Kubernetes Grid creates the CloudFormation stack, AWS stores its template as part of the stack. To retrieve the template from CloudFormation, you can navigate to CloudFormation > Stacks in the AWS console or use the aws cloudformation get-template CLI command. For more information about CloudFormation stacks, see Working with Stacks in the AWS documentation.
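For example, one way to retrieve the stored template with the AWS CLI, using the stack name given above:

aws cloudformation get-template --stack-name tkg-cloud-vmware-com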


Required AWS Permissions for tanzu management-cluster create

The AWS user that you provide to Tanzu Kubernetes Grid when you deploy a management cluster must have at least the following permissions:

n The permissions that are defined in the control-plane.tkg.cloud.vmware.com, nodes.tkg.cloud.vmware.com, and controllers.tkg.cloud.vmware.com IAM policies of the tkg-cloud-vmware-com CloudFormation stack. To retrieve these policies from CloudFormation, you can navigate to CloudFormation > Stacks in the AWS console. For more information, see Required IAM Resources above.

n If you intend to deploy the management cluster from the installer interface, your AWS user must also have the "ec2:DescribeInstanceTypeOfferings" and "ec2:DescribeInstanceTypes" permissions. If your AWS user does not currently have these permissions, you can create a custom policy that includes the permissions and attach it to your AWS user.
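A minimal sketch of such a custom policy, created and attached with the AWS CLI. The policy name, file name, user name, and account ID are placeholders:

# Write a policy document that grants only the two installer-interface permissions
cat > tkg-installer-ui-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstanceTypeOfferings",
        "ec2:DescribeInstanceTypes"
      ],
      "Resource": "*"
    }
  ]
}
EOF
# Create the policy and attach it to the AWS user that runs the installer
aws iam create-policy --policy-name tkg-installer-ui --policy-document file://tkg-installer-ui-policy.json
aws iam attach-user-policy --user-name YOUR-AWS-USER --policy-arn arn:aws:iam::YOUR-ACCOUNT-ID:policy/tkg-installer-ui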

For example, in Tanzu Kubernetes Grid v1.3.x, the control-plane.tkg.cloud.vmware.com, nodes.tkg.cloud.vmware.com, and controllers.tkg.cloud.vmware.com IAM policies include the following permissions:

The control-plane.tkg.cloud.vmware.com IAM policy:

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeLaunchConfigurations", "autoscaling:DescribeTags", "ec2:DescribeInstances", "ec2:DescribeImages", "ec2:DescribeRegions", "ec2:DescribeRouteTables", "ec2:DescribeSecurityGroups", "ec2:DescribeSubnets", "ec2:DescribeVolumes", "ec2:CreateSecurityGroup", "ec2:CreateTags", "ec2:CreateVolume", "ec2:ModifyInstanceAttribute", "ec2:ModifyVolume", "ec2:AttachVolume", "ec2:AuthorizeSecurityGroupIngress", "ec2:CreateRoute", "ec2:DeleteRoute", "ec2:DeleteSecurityGroup", "ec2:DeleteVolume", "ec2:DetachVolume", "ec2:RevokeSecurityGroupIngress", "ec2:DescribeVpcs", "elasticloadbalancing:AddTags", "elasticloadbalancing:AttachLoadBalancerToSubnets", "elasticloadbalancing:ApplySecurityGroupsToLoadBalancer",


"elasticloadbalancing:CreateLoadBalancer", "elasticloadbalancing:CreateLoadBalancerPolicy", "elasticloadbalancing:CreateLoadBalancerListeners", "elasticloadbalancing:ConfigureHealthCheck", "elasticloadbalancing:DeleteLoadBalancer", "elasticloadbalancing:DeleteLoadBalancerListeners", "elasticloadbalancing:DescribeLoadBalancers", "elasticloadbalancing:DescribeLoadBalancerAttributes", "elasticloadbalancing:DetachLoadBalancerFromSubnets", "elasticloadbalancing:DeregisterInstancesFromLoadBalancer", "elasticloadbalancing:ModifyLoadBalancerAttributes", "elasticloadbalancing:RegisterInstancesWithLoadBalancer", "elasticloadbalancing:SetLoadBalancerPoliciesForBackendServer", "elasticloadbalancing:AddTags", "elasticloadbalancing:CreateListener", "elasticloadbalancing:CreateTargetGroup", "elasticloadbalancing:DeleteListener", "elasticloadbalancing:DeleteTargetGroup", "elasticloadbalancing:DescribeListeners", "elasticloadbalancing:DescribeLoadBalancerPolicies", "elasticloadbalancing:DescribeTargetGroups", "elasticloadbalancing:DescribeTargetHealth", "elasticloadbalancing:ModifyListener", "elasticloadbalancing:ModifyTargetGroup", "elasticloadbalancing:RegisterTargets", "elasticloadbalancing:SetLoadBalancerPoliciesOfListener", "iam:CreateServiceLinkedRole", "kms:DescribeKey" ], "Resource": [ "*" ], "Effect": "Allow" } ] }

The nodes.tkg.cloud.vmware.com IAM policy:

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "ec2:DescribeInstances", "ec2:DescribeRegions", "ecr:GetAuthorizationToken", "ecr:BatchCheckLayerAvailability", "ecr:GetDownloadUrlForLayer", "ecr:GetRepositoryPolicy", "ecr:DescribeRepositories", "ecr:ListImages", "ecr:BatchGetImage" ], "Resource": [


"*" ], "Effect": "Allow" }, { "Action": [ "secretsmanager:DeleteSecret", "secretsmanager:GetSecretValue" ], "Resource": [ "arn:*:secretsmanager:*:*:secret:aws.cluster.x-k8s.io/*" ], "Effect": "Allow" }, { "Action": [ "ssm:UpdateInstanceInformation", "ssmmessages:CreateControlChannel", "ssmmessages:CreateDataChannel", "ssmmessages:OpenControlChannel", "ssmmessages:OpenDataChannel", "s3:GetEncryptionConfiguration" ], "Resource": [ "*" ], "Effect": "Allow" } ] }

The controllers.tkg.cloud.vmware.com IAM policy:

{ "Version": "2012-10-17", "Statement": [ { "Action": [ "ec2:AllocateAddress", "ec2:AssociateRouteTable", "ec2:AttachInternetGateway", "ec2:AuthorizeSecurityGroupIngress", "ec2:CreateInternetGateway", "ec2:CreateNatGateway", "ec2:CreateRoute", "ec2:CreateRouteTable", "ec2:CreateSecurityGroup", "ec2:CreateSubnet", "ec2:CreateTags", "ec2:CreateVpc", "ec2:ModifyVpcAttribute", "ec2:DeleteInternetGateway", "ec2:DeleteNatGateway", "ec2:DeleteRouteTable", "ec2:DeleteSecurityGroup",


"ec2:DeleteSubnet", "ec2:DeleteTags", "ec2:DeleteVpc", "ec2:DescribeAccountAttributes", "ec2:DescribeAddresses", "ec2:DescribeAvailabilityZones", "ec2:DescribeInstances", "ec2:DescribeInternetGateways", "ec2:DescribeImages", "ec2:DescribeNatGateways", "ec2:DescribeNetworkInterfaces", "ec2:DescribeNetworkInterfaceAttribute", "ec2:DescribeRouteTables", "ec2:DescribeSecurityGroups", "ec2:DescribeSubnets", "ec2:DescribeVpcs", "ec2:DescribeVpcAttribute", "ec2:DescribeVolumes", "ec2:DetachInternetGateway", "ec2:DisassociateRouteTable", "ec2:DisassociateAddress", "ec2:ModifyInstanceAttribute", "ec2:ModifyNetworkInterfaceAttribute", "ec2:ModifySubnetAttribute", "ec2:ReleaseAddress", "ec2:RevokeSecurityGroupIngress", "ec2:RunInstances", "ec2:TerminateInstances", "tag:GetResources", "elasticloadbalancing:AddTags", "elasticloadbalancing:CreateLoadBalancer", "elasticloadbalancing:ConfigureHealthCheck", "elasticloadbalancing:DeleteLoadBalancer", "elasticloadbalancing:DescribeLoadBalancers", "elasticloadbalancing:DescribeLoadBalancerAttributes", "elasticloadbalancing:DescribeTags", "elasticloadbalancing:ModifyLoadBalancerAttributes", "elasticloadbalancing:RegisterInstancesWithLoadBalancer", "elasticloadbalancing:DeregisterInstancesFromLoadBalancer", "elasticloadbalancing:RemoveTags", "autoscaling:DescribeAutoScalingGroups", "autoscaling:DescribeInstanceRefreshes", "ec2:CreateLaunchTemplate", "ec2:CreateLaunchTemplateVersion", "ec2:DescribeLaunchTemplates", "ec2:DescribeLaunchTemplateVersions", "ec2:DeleteLaunchTemplate", "ec2:DeleteLaunchTemplateVersions" ], "Resource": [ "*" ], "Effect": "Allow" }, {


"Action": [ "autoscaling:CreateAutoScalingGroup", "autoscaling:UpdateAutoScalingGroup", "autoscaling:CreateOrUpdateTags", "autoscaling:StartInstanceRefresh", "autoscaling:DeleteAutoScalingGroup", "autoscaling:DeleteTags" ], "Resource": [ "arn:*:autoscaling:*:*:autoScalingGroup:*:autoScalingGroupName/*" ], "Effect": "Allow" }, { "Condition": { "StringLike": { "iam:AWSServiceName": "autoscaling.amazonaws.com" } }, "Action": [ "iam:CreateServiceLinkedRole" ], "Resource": [ "arn:*:iam::*:role/aws-service-role/autoscaling.amazonaws.com/ AWSServiceRoleForAutoScaling" ], "Effect": "Allow" }, { "Condition": { "StringLike": { "iam:AWSServiceName": "elasticloadbalancing.amazonaws.com" } }, "Action": [ "iam:CreateServiceLinkedRole" ], "Resource": [ "arn:*:iam::*:role/aws-service-role/elasticloadbalancing.amazonaws.com/ AWSServiceRoleForElasticLoadBalancing" ], "Effect": "Allow" }, { "Condition": { "StringLike": { "iam:AWSServiceName": "spot.amazonaws.com" } }, "Action": [ "iam:CreateServiceLinkedRole" ], "Resource": [ "arn:*:iam::*:role/aws-service-role/spot.amazonaws.com/AWSServiceRoleForEC2Spot" ],


"Effect": "Allow" }, { "Action": [ "iam:PassRole" ], "Resource": [ "arn:*:iam::*:role/*.tkg.cloud.vmware.com" ], "Effect": "Allow" }, { "Action": [ "secretsmanager:CreateSecret", "secretsmanager:DeleteSecret", "secretsmanager:TagResource" ], "Resource": [ "arn:*:secretsmanager:*:*:secret:aws.cluster.x-k8s.io/*" ], "Effect": "Allow" } ] }

Configure AWS Account Credentials and SSH Key

To enable Tanzu Kubernetes Grid VMs to launch on Amazon EC2, you must configure your AWS account credentials and then provide the public key part of an SSH key pair to Amazon EC2 for every region in which you plan to deploy management clusters.

To configure your AWS account credentials and SSH key pair, perform the following steps.

Configure AWS Credentials

Tanzu Kubernetes Grid uses the default AWS credentials provider chain. You must set your account credentials to create an SSH key pair for the region where you plan to deploy Tanzu Kubernetes Grid clusters.

To deploy your management cluster on AWS, you have several options for configuring the AWS account used to access EC2.

n You can specify your AWS account credentials statically in local environment variables.

n You can use a credentials profile, which you can store in a shared credentials file, such as ~/.aws/credentials, or a shared config file, such as ~/.aws/config. You can manage profiles by using the aws configure command.


Local Environment Variables

One option for configuring AWS credentials is to set local environment variables on your bootstrap machine. To use local environment variables, set the following environment variables for your AWS account, as shown in the combined example below:

n export AWS_ACCESS_KEY_ID=aws_access_key, where aws_access_key is your AWS access key.

n export AWS_SECRET_ACCESS_KEY=aws_access_key_secret, where aws_access_key_secret is your AWS access key secret.

n export AWS_SESSION_TOKEN=aws_session_token, where aws_session_token is the AWS session token granted to your account. You only need to specify this variable if you are required to use a temporary access key. For more information about using temporary access keys, see Understanding and getting your AWS credentials.

n export AWS_REGION=aws_region, where aws_region is the AWS region in which you intend to deploy the cluster. For example, us-west-2.

For the full list of AWS regions, see AWS Service Endpoints. In addition to the regular AWS regions, you can also specify the us-gov-east and us-gov-west regions in AWS GovCloud.
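A combined example of the variables described above; all values are placeholders:

export AWS_ACCESS_KEY_ID=YOUR-ACCESS-KEY-ID
export AWS_SECRET_ACCESS_KEY=YOUR-ACCESS-KEY-SECRET
# Only needed if you use temporary credentials
export AWS_SESSION_TOKEN=YOUR-SESSION-TOKEN
export AWS_REGION=us-west-2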

Credential Files and Profiles

As an alternative to using local environment variables, you can store AWS credentials in a shared or local credentials file. An AWS credential file can store multiple accounts as named profiles. The credential files and profiles are applied after local environment variables as part of the AWS default credential provider chain.

To set up credentials files and profiles for your AWS account on the bootstrap machine, you can use the aws configure CLI command.

To customize which AWS credential files and profiles to use, you can set the following environment variables:

n export AWS_SHARED_CREDENTIAL_FILE=path_to_credentials_file, where path_to_credentials_file is the location and name of the credentials file that contains your AWS access key information. If you do not define this environment variable, the default location and filename is $HOME/.aws/credentials.

n export AWS_PROFILE=profile_name, where profile_name is the profile name that contains the AWS access key you want to use. If you do not specify a value for this variable, the profile name default is used. For more information about using named profiles, see Named profiles in the AWS documentation.
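For example, to create a named profile and point Tanzu Kubernetes Grid at it (the profile name tkg-user is an example):

# Store credentials for a named profile; aws configure prompts for the key, secret, and default region
aws configure --profile tkg-user
# Select that profile, and optionally a non-default credentials file, for subsequent commands
export AWS_PROFILE=tkg-user
export AWS_SHARED_CREDENTIAL_FILE=$HOME/.aws/credentials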

NOTE: Any named profiles that you create in your AWS credentials file appear as selectable options in the AWS Credential Profile drop-down in the Tanzu Kubernetes Grid Installer UI for Amazon EC2.

For more information about working with AWS credentials and the default AWS credential provider chain, see Best practices for managing AWS access keys in the AWS documentation.


Register an SSH Public Key with Your AWS Account

After you have set your AWS account credentials using either local environment variables or in a credentials file and profile, you can generate an SSH key pair for your AWS account.

NOTE: AWS supports only RSA keys. The keys required by AWS are of a different format to those required by vSphere. You cannot use the same key pair for both vSphere and AWS deployments.

If you do not already have an SSH key pair for the account and region you are using to deploy the management cluster, create one by performing the steps below:

1 For each region that you plan to use with Tanzu Kubernetes Grid, create a named key pair, and output a .pem file that includes the name. For example, the following command uses default and saves the file as default.pem.

aws ec2 create-key-pair --key-name default --output json | jq .KeyMaterial -r > default.pem

To create a key pair for a region that is not the default in your profile, or set locally as AWS_DEFAULT_REGION, include the --region option.
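For example, to create and save a key pair in a region other than your default (the region shown is illustrative):

aws ec2 create-key-pair --key-name default --region us-east-1 --output json | jq .KeyMaterial -r > default.pem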

2 Log in to your Amazon EC2 dashboard and go to Network & Security > Key Pairs to verify that the created key pair is registered with your account.

Tag AWS Resources

If both of the following are true, you must add the kubernetes.io/cluster/YOUR-CLUSTER-NAME=shared tag to the public subnet or subnets that you intend to use for the management cluster:

n You deploy the management cluster to an existing VPC that was not created by Tanzu Kubernetes Grid.

n You want to create services of type LoadBalancer in the management cluster.

Adding the kubernetes.io/cluster/YOUR-CLUSTER-NAME=shared tag to the public subnet or subnets enables you to create services of type LoadBalancer in the management cluster. To add this tag, follow the steps below:

1 Gather the ID or IDs of the public subnet or subnets within your existing VPC that you want to use for the management cluster. To deploy a prod management cluster, you must provide three subnets.

2 Create the required tag by running the following command:

aws ec2 create-tags --resources YOUR-PUBLIC-SUBNET-ID-OR-IDS --tags Key=kubernetes.io/cluster/YOUR-CLUSTER-NAME,Value=shared

Where:

n YOUR-PUBLIC-SUBNET-ID-OR-IDS is the ID or IDs of the public subnet or subnets that you gathered in the previous step.

n YOUR-CLUSTER-NAME is the name of the management cluster that you want to deploy.


For example:

aws ec2 create-tags --resources subnet-00bd5d8c88a5305c6 subnet-0b93f0fdbae3436e8 subnet-06b29d20291797698 --tags Key=kubernetes.io/cluster/my-management-cluster,Value=shared

If you want to use services of type LoadBalancer in a Tanzu Kubernetes cluster after you deploy the cluster to a VPC that was not created by Tanzu Kubernetes Grid, follow the tagging instructions in Deploy a Cluster to an Existing VPC and Add Subnet Tags (Amazon EC2).

What to Do Next

For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

If you are using Tanzu Kubernetes Grid in an environment with an external internet connection, once you have set up identity management, you are ready to deploy management clusters to Amazon EC2.

n Deploy Management Clusters with the Installer Interface. This is the preferred option for first deployments.

n Deploy Management Clusters from a Configuration File. This is the more complicated method, which allows greater flexibility of configuration.

n If you want to deploy clusters to vSphere and Azure as well as to Amazon EC2, see Prepare to Deploy Management Clusters to vSphere and Prepare to Deploy Management Clusters to Microsoft Azure for the required setup for those platforms.

Prepare to Deploy Management Clusters to Microsoft Azure

This topic explains how to prepare Microsoft Azure for running Tanzu Kubernetes Grid.

If you are installing Tanzu Kubernetes Grid on Azure VMware Solution (AVS), you are installing to a vSphere environment. See Preparing Azure VMware Solution on Microsoft Azure in Prepare a vSphere Management as a Service Infrastructure to prepare your environment and Prepare to Deploy Management Clusters to vSphere to deploy management clusters.

Installation Process Overview

The following diagram shows the high-level steps for installing a Tanzu Kubernetes Grid management cluster on Azure, and the interfaces you use to perform them.

These steps include the preparations listed below plus the procedures described in either Deploy Management Clusters with the Installer Interface or Deploy Management Clusters from a Configuration File.


General Requirements

n The Tanzu CLI installed locally. See Chapter 3 Install the Tanzu CLI and Other Tools.

n A Microsoft Azure account with:

n Permissions required to register an app. See Permissions required for registering an app in the Azure documentation.


n Sufficient VM core (vCPU) quotas for your clusters. A standard Azure account has a quota of 10 vCPU per region. Tanzu Kubernetes Grid clusters require 2 vCPU per node, which translates to:

n Management cluster:

n dev plan: 4 vCPU (1 main, 1 worker)

n prod plan: 8 vCPU (3 main , 1 worker)

n Each workload cluster:

n dev plan: 4 vCPU (1 main, 1 worker)

n prod plan: 12 vCPU (3 main , 3 worker)

n For example, assuming a single management cluster and all clusters with the same plan:

Plan   Workload Clusters   vCPU for Workload   vCPU for Management   Total vCPU
Dev    1                   4                   4                     8
Dev    5                   20                  4                     24
Prod   1                   12                  8                     20
Prod   5                   60                  8                     68

n Traffic is allowed between your local bootstrap machine and the image repositories listed in the management cluster Bill of Materials (BoM) file, over port 443, for TCP.

n The BoM file is under ~/.tanzu/tkg/bom/, and its name includes the Tanzu Kubernetes Grid version. For example, tkg-bom-v1.3.1+vmware.1.yaml.

n Run a DNS lookup on all imageRepository values to find their CNAMEs.

n (Optional) OpenSSL installed locally, to create a new keypair or validate the download package thumbprint. See OpenSSL.

n (Optional) A VNET with:

n A subnet for the management cluster control plane node

n A Network Security Group on the control plane subnet with the following inbound security rules, to enable SSH and Kubernetes API server connections:

n Allow TCP over port 22 for any source and destination

n Allow TCP over port 6443 for any source and destination. Port 6443 is where the Kubernetes API is exposed on VMs in the clusters you create.

n A subnet and Network Security Group for the management cluster worker nodes.

If you do not use an existing VNET, the installation process creates a new one.

n The Azure CLI installed locally. See Install the Azure CLI in the Microsoft Azure documentation.


Or see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment for installing without external network access.

Network Security Groups on Azure

Tanzu Kubernetes Grid management and workload clusters on Azure require the following Network Security Groups (NSGs) to be defined on their VNET:

n One control plane NSG shared by the control plane nodes of all clusters, including the management cluster and the workload clusters that it manages.

n One worker NSG for each cluster, for the cluster's worker nodes.

If you do not specify a VNET when deploying a management cluster, the deployment process creates a new VNET along with the NSGs required for the management cluster. If you optionally create a VNET for Tanzu Kubernetes Grid before deploying a management cluster, you must also create these NSGs as described in the General Requirements above.

For each workload cluster that you deploy later, you need to create a worker NSG named CLUSTER-NAME-node-nsg, where CLUSTER-NAME is the name of the workload cluster. This worker NSG must have the same VNET and region as its management cluster.
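A minimal sketch of creating such a worker NSG with the Azure CLI, assuming a workload cluster named my-cluster and a placeholder resource group and region:

# The NSG name must follow the CLUSTER-NAME-node-nsg convention and use the same region as the management cluster's VNET
az network nsg create --resource-group MY-RESOURCE-GROUP --location westus2 --name my-cluster-node-nsg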

Register Tanzu Kubernetes Grid as an Azure Client App

Tanzu Kubernetes Grid manages Azure resources as a registered client application that accesses Azure through a service principal account. The following steps register your Tanzu Kubernetes Grid application with Azure Active Directory, create its account, create a client secret for authenticating communications, and record information needed later to deploy a management cluster.

1 Log in to the Azure Portal.

2 Record your Tenant ID by hovering over your account name at upper-right, or else browse to Azure Active Directory > Properties > Tenant ID. The value is a GUID, for example b39138ca-3cee-4b4a-a4d6-cd83d9dd62f0.

3 Browse to Active Directory > App registrations and click + New registration.

4 Enter a display name for the app, such as tkg, and select who else can use it. You can leave the Redirect URI (optional) field blank.

5 Click Register. This registers the application with an Azure service principal account as described in How to: Use the portal to create an Azure AD application and service principal that can access resources in the Azure documentation.

6 An overview pane for the app appears. Record its Application (client) ID value, which is a GUID.

7 From the Azure Portal top level, browse to Subscriptions. At the bottom of the pane, select one of the subscriptions you have access to, and record its Subscription ID. Click the subscription listing to open its overview pane.

8 Select Access control (IAM) and click Add a role assignment.


9 In the Add role assignment pane

n Select the Owner role

n Leave Assign access to selection as "Azure AD user, group, or service principal"

n Under Select, enter the name of your app, tkg. It appears under Selected Members.

10 Click Save. A popup appears confirming that your app was added as an owner for your subscription.

11 From the Azure Portal > Azure Active Directory > App Registrations, select your tkg app under Owned applications. The app overview pane opens.

12 From Certificates & secrets > Client secrets click + New client secret.

13 In the Add a client secret popup, enter a Description, choose an expiration period, and click Add.

14 Azure lists the new secret with its generated value under Client Secrets. Record the value.
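As an alternative to the portal steps above, the Azure CLI can register the app, create its service principal, assign the Owner role, and generate a client secret in a single command. This is a sketch, not the documented procedure; the app name and subscription scope are examples, and you should verify the options against your Azure CLI version:

az ad sp create-for-rbac --name tkg --role Owner --scopes /subscriptions/AZURE_SUBSCRIPTION_ID
# Record the appId (client ID), password (client secret), and tenant values from the command output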

Accept the Base Image License

To run management cluster VMs on Azure, accept the license for their base Kubernetes version and machine OS.

1 Sign in to the Azure CLI as your tkg client application.

az login --service-principal --username AZURE_CLIENT_ID --password AZURE_CLIENT_SECRET --tenant AZURE_TENANT_ID

Where AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, and AZURE_TENANT_ID are your tkg app's client ID and secret and your tenant ID, as recorded in Register Tanzu Kubernetes Grid as an Azure Client App.

2 Run the az vm image terms accept command, specifying the --plan and your Subscription ID.

In Tanzu Kubernetes Grid v1.3.1, the default cluster image --plan value is k8s-1dot20dot5-ubuntu-2004, based on Kubernetes version 1.20.5 and the machine OS, Ubuntu 20.04. Run the following command:

az vm image terms accept --publisher vmware-inc --offer tkg-capi --plan k8s-1dot20dot5-ubuntu-2004 --subscription AZURE_SUBSCRIPTION_ID

Where AZURE_SUBSCRIPTION_ID is your Azure subscription ID.

You must repeat this to accept the base image license for every version of Kubernetes or OS that you want to use when you deploy clusters, and every time that you upgrade to a new version of Tanzu Kubernetes Grid.


Create an SSH Key Pair (Optional)

You deploy management clusters from a machine referred to as the bootstrap machine, using the Tanzu CLI. To connect to Azure, the bootstrap machine must provide the public key part of an SSH key pair. If your bootstrap machine does not already have an SSH key pair, you can use a tool such as ssh-keygen to generate one.

1 On your bootstrap machine, run the following ssh-keygen command. ssh-keygen -t rsa -b 4096 -C "[email protected]"

2 At the prompt Enter file in which to save the key (/root/.ssh/id_rsa): press Enter to accept the default.

3 Enter and repeat a password for the key pair.

4 Add the private key to the SSH agent running on your machine, and enter the password you created in the previous step.

ssh-add ~/.ssh/id_rsa

5 Open the file .ssh/id_rsa.pub in a text editor so that you can easily copy and paste it when you deploy a management cluster.

Preparation Checklist

Use this checklist to make sure you are prepared to deploy a Tanzu Kubernetes Grid management cluster to Azure:

n Tanzu CLI installed

n Run tanzu version. The output should list version: v1.3.1.

n Azure account

n Log in to the Azure web portal at https://portal.azure.com.

n Azure CLI installed

n Run az version. The output should list the current version of the Azure CLI as listed in Install the Azure CLI, in the Microsoft Azure documentation.

n Registered tkg app

n In the Azure portal, select Active Directory > App Registrations > Owned applications and confirm that your tkg app is listed as configured in Register Tanzu Kubernetes Grid as an Azure Client App above, and with a current client secret.

n Base VM image license accepted

n Run az vm image terms show --publisher vmware-inc --offer tkg-capi --plan k8s-1dot20dot5-ubuntu-2004. The output should contain "accepted": true.


What to Do Next

For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

If you are using Tanzu Kubernetes Grid in an environment with an external internet connection, once you have set up identity management, you are ready to deploy management clusters to Azure.

n Deploy Management Clusters with the Installer Interface. This is the preferred option for first deployments.

n Deploy Management Clusters from a Configuration File. This is the more complicated method, which allows greater flexibility of configuration.

n If you want to deploy clusters to vSphere and Amazon EC2 as well as to Azure, see Prepare to Deploy Management Clusters to vSphere and Prepare to Deploy Management Clusters to Amazon EC2 for the required setup for those platforms.

Enabling Identity Management in Tanzu Kubernetes Grid

Tanzu Kubernetes Grid implements user authentication with Pinniped. Pinniped allows you to plug external OpenID Connect (OIDC) or LDAP identity providers (IDP) into Tanzu Kubernetes clusters, so that you can control user access to those clusters. Pinniped is an open-source authentication service for Kubernetes clusters. If you use LDAP authentication, Pinniped uses Dex as the endpoint to connect to your upstream LDAP identity provider. If you use OIDC, Pinniped provides its own endpoint, so Dex is not required. Pinniped and Dex run automatically as in-cluster services in your management clusters if you enable identity management during management cluster deployment.

IMPORTANT:

n In Tanzu Kubernetes Grid v1.3.0, Pinniped used Dex as the endpoint for both OIDC and LDAP providers. In Tanzu Kubernetes Grid v1.3.1 and later, Pinniped no longer requires Dex and uses the Pinniped endpoint for OIDC providers. In Tanzu Kubernetes Grid v1.3.1 and later, Dex is only used if you use an LDAP provider. Consequently, it is strongly recommended to use Tanzu Kubernetes Grid v1.3.1 if you want to implement identity management on new management clusters. If you have already used Tanzu Kubernetes Grid v1.3.0 to deploy management clusters that implement OIDC authentication, when you upgrade those management clusters to v1.3.1, you must perform additional steps to update the Pinniped configuration. For information about the additional steps to perform, see Update the Callback URL for Management Clusters with OIDC Authentication in Upgrade Management Clusters.

n Previous versions of Tanzu Kubernetes Grid included optional Dex and Gangway extensions to provide identity management. These manually deployed Dex and Gangway extensions from previous versions are deprecated in this release. If you manually deployed the Dex and Gangway extensions on clusters in a previous release and you upgrade the clusters to this version of Tanzu Kubernetes Grid, it is strongly recommended to migrate your identity management implementation from Dex and Gangway to Pinniped and Dex. If you did not implement Dex and Gangway on clusters from a previous version of Tanzu Kubernetes Grid and you upgrade them to this version, it is also strongly recommended to implement Pinniped and Dex on those clusters.

About Tanzu Kubernetes Grid Identity Management

The process for implementing identity management is as follows:

n The Tanzu Kubernetes Grid administrator creates a management cluster, specifying an external OIDC or LDAP IDP.

n Authentication service components are deployed into the management cluster, using the OIDC or LDAP IDP specified during deployment.

n The administrator creates a Tanzu Kubernetes (workload) cluster. The workload cluster inherits the authentication configuration from the management cluster.

n The administrator creates a role binding to associate a given user with a given role on the workload cluster, as shown in the example after this list.

n The administrator provides the kubeconfig for the workload cluster to the user.

n A user uses the kubeconfig to connect to the workload cluster, for example, by running kubectl get pods --kubeconfig .

n The management cluster authenticates the user with the IDP.

n The workload cluster either allows or denies the kubectl get pods request, depending on the permissions of the user's role.
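For example, the role-binding step in the list above might look like the following on a workload cluster, granting a user known to the identity provider full cluster access. The binding name, user name, and role are examples only:

kubectl create clusterrolebinding identity-example --clusterrole cluster-admin --user [email protected]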

In the image below, the blue arrows represent the authentication flow between the workload cluster, the management cluster and the external IDP. The green arrows represent Tanzu CLI and kubectl traffic between the workload cluster, the management cluster and the external IDP.


What Happens When You Enable Identity Management

The diagram below shows the identity management components that Tanzu Kubernetes Grid deploys in the management cluster and in Tanzu Kubernetes (workload) clusters when you enable identity management.


Understanding the diagram:

n The purple-bordered rectangles show the identity management components, which include Pinniped, Dex, and a post-deployment job in the management cluster, and Pinniped and a post-deployment job in the workload cluster. In Tanzu Kubernetes Grid v1.3.0, Pinniped uses Dex as the endpoint for both OIDC and LDAP providers. In v1.3.1 and later, Dex is deployed only for LDAP providers.

n The gray-bordered rectangles show the components that Tanzu Kubernetes Grid uses to control the lifecycle of the identity management components, which include the Tanzu CLI, tanzu-addons-manager, and kapp-controller.

n The green-bordered rectangle shows the Pinniped add-on secret created for the management cluster.

n The orange-bordered rectangle in the management cluster shows the Pinniped add-on secret created for the workload cluster. The secret is mirrored to the workload cluster.

Internally, Tanzu Kubernetes Grid deploys the identity management components as a core add-on, pinniped. When you deploy a management cluster with identity management enabled, the Tanzu CLI creates a Kubernetes secret for the pinniped add-on in the management cluster. tanzu-addons-manager reads the secret, which contains your IDP configuration information, and instructs kapp-controller to configure the pinniped add-on using the configuration information from the secret.


The Tanzu CLI creates a separate pinniped add-on secret for each workload cluster that you deploy from the management cluster. All secrets are stored in the management cluster.
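If you want to confirm which pinniped add-on secrets exist, one illustrative check (not a required step) is to list secrets in the management cluster context and filter on the add-on name:

kubectl get secrets -A | grep pinniped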

Obtain Your Identity Provider Details

Before you can deploy a management cluster with identity management enabled, you must have an identity provider. Tanzu Kubernetes Grid supports LDAPS and OIDC identity providers.

To use your company's internal LDAPS server as the identity provider, obtain LDAPS information from your LDAP administrator.

To use OIDC as the identity provider, you must have an account with an IDP that supports the OpenID Connect standard, for example Okta.

Example: Register a Tanzu Kubernetes Grid Application in Okta

To use Okta as your OIDC provider, you must create an account with Okta and register an application for Tanzu Kubernetes Grid with your account.

1 If you do not have one, create an Okta account.

2 Go to the Admin portal by clicking the Admin button.

3 Go to Applications, and click Add Application.

4 Click Create New App.

5 For Platform, select Web and for Sign on method, select OpenID Connect, then click Create.

6 Give your application a name.

7 Enter a placeholder Login redirect URI.

For example, enter http://localhost:8080/callback. You will update this with the real URL after you deploy the management cluster.

8 Click Save.

9 In the General tab for your application, copy and save the Client ID and Client secret.

You will need these credentials when you deploy the management cluster.

10 In the Assignments tab, assign people and groups to the application.

The people and groups that you assign to the application will be the users who can access the management cluster and the Tanzu Kubernetes clusters that you use it to deploy.

What to Do Next

You can now deploy management clusters that implement identity management, to restrict access to clusters to authorized users.

n Deploy Management Clusters with the Installer Interface

n Deploy Management Clusters from a Configuration File


If you implement identity management, after you deploy the management cluster there are post-deployment steps to perform, as described in Configure Identity Management After Management Cluster Deployment.

Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment

You can deploy Tanzu Kubernetes Grid management clusters and Tanzu Kubernetes (workload) clusters in environments that are not connected to the Internet, such as:

n Proxied environments

n Airgapped environments, with no physical connection to the Internet

This topic explains how to deploy management clusters to internet-restricted environments on vSphere or AWS.

You do not need to perform these procedures if you are using Tanzu Kubernetes Grid in a connected environment that can pull images over an external Internet connection.

General Prerequisites

Before you can deploy management clusters and Tanzu Kubernetes clusters in an Internet-restricted environment, you must have:

n A private Docker registry such as Harbor, against which this procedure has been tested. For vSphere, install the Docker registry within your firewall. You can configure the Docker registry with SSL certificates signed by a trusted CA, or with self-signed certificates. The registry must not implement user authentication. For example, if you use a Harbor registry, the project must be public, not private. To install Harbor:

a Download the binaries for the latest Harbor release.

b Follow the Harbor Installation and Configuration instructions in the Harbor documentation.

n (Optional) If needed, a USB key or other portable storage medium for bringing the private registry behind an airgap after it is populated with Tanzu Kubernetes Grid images.

n An Internet-connected Linux bootstrap machine that:

n Is not inside the internet-restricted environment.

n Has a minimum of 2 GB RAM, 2 vCPUs, and 30 GB hard disk space.

n Has the Docker client app installed.

n Can connect to your private Docker registry.

n Has the Tanzu CLI installed. See Chapter 3 Install the Tanzu CLI and Other Tools to download, unpack, and install the Tanzu CLI binary on your Internet-connected system.

n Has a version of yq installed that is equal to or above v4.5.

n If you intend to install one or more of the Tanzu Kubernetes Grid extensions, for example Harbor, has the Carvel tools installed. For more information, see Install the Carvel Tools.

vSphere Prerequisites and Architecture

On vSphere, in addition to the general prerequisites above, you must:

n Create an SSH key pair. See Create an SSH Key Pair in Deploy Management Clusters to vSphere.

n Upload to vSphere the OVAs from which node VMs are created. See Import a Base Image Template into vSphere in Prepare to Deploy Management Clusters to vSphere.

n Add the rules listed in Tanzu Kubernetes Grid Firewall Rules to your firewall.

vSphere Architecture

An internet-restricted Tanzu Kubernetes Grid installation on vSphere has firewalls and communication between major components as shown here.

Amazon EC2 Prerequisites and Architecture

For an Internet-restricted installation on Amazon EC2, in addition to the general prerequisites above, you also need:

n An Amazon EC2 VPC with no internet gateway ("offline VPC"), configured as described below.

n Your internet-connected bootstrap machine must be able to access IP addresses within this offline VPC.

n An Amazon S3 bucket.

n A Docker registry installed within your offline VPC, configured equivalently to the internet-connected registry above.

n A Linux bootstrap VM within your offline VPC, provisioned similarly to the internet-connected machine above.

n Before you deploy the offline management cluster, you must configure its load balancer with an internal scheme as described in Step 5: Initialize Tanzu Kubernetes Grid, below.

By default, deploying Tanzu Kubernetes Grid to Amazon EC2 creates a new, internet-connected VPC. To deploy to an Internet-restricted environment, you must first create an offline VPC. For details, see the Reuse a VPC and NAT gateway(s) that already exist in your availability zone(s) instructions in the Deploy Management Clusters to Amazon EC2 topic.

After you create the offline VPC, you must add the following endpoints to it:

n Service endpoints:

n sts

n ssm

n ec2

n ec2messages

n elasticloadbalancing

n secretsmanager

n ssmmessages

n Gateway endpoint to your Amazon S3 storage bucket

To add the service endpoints to your VPC:

1 In the AWS console, browse to VPC Dashboard > Endpoints

2 For each of the above services

a Click Create Endpoint

b Search for the service and select it under Service Name

c Select your VPC and its Subnets

d Enable DNS Name for the endpoint

e Select a Security group that allows VMs in the VPC to access the endpoint

f Select Policy > Full Access

g Click Create endpoint
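If you prefer to script these console steps, an interface endpoint can also be created with the AWS CLI along the following lines. This is a sketch only; the region, VPC ID, subnet IDs, and security group ID are placeholders, and you repeat the command for each service listed above.

aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-west-2.sts \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --private-dns-enabled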


To add the Amazon S3 gateway endpoint to your VPC, follow the instructions in Endpoints for Amazon S3 in the AWS documentation.

n When an offline VPC has an S3 endpoint to dedicated S3 storage, all traffic between the VPC and S3 travels via internal AWS cloud infrastructure, and never over the open Internet.

Amazon EC2 Architecture

An internet-restricted Tanzu Kubernetes Grid installation on Amazon EC2 has firewalls and communication between major components as shown here. Security Groups (SG) are automatically created between the control plane and workload domains, and between the workload components and control plane components.


Step 1: Prepare Environment

The following procedures apply both to the initial deployment of Tanzu Kubernetes Grid in an internet-restricted environment and to upgrading an existing internet-restricted Tanzu Kubernetes Grid deployment.

1 On the machine with an Internet connection on which you installed the Tanzu CLI, run the tanzu init and tanzu management-cluster create commands.

n The tanzu management-cluster create command does not need to complete.

Running tanzu init and tanzu management-cluster create for the first time installs the necessary Tanzu Kubernetes Grid configuration files in the ~/.tanzu/tkg folder on your system. The script that you create and run in subsequent steps requires the Bill of Materials (BoM) YAML files in the ~/.tanzu/tkg/bom folder to be present on your machine. The scripts in this procedure use the BoM files to identify the correct versions of the different Tanzu Kubernetes Grid component images to pull.

2 Set the IP address or FQDN of your local registry as an environment variable.

In the following command example, replace custom-image-repository.io with the address of your private Docker registry.

On Windows platforms, use the SET command instead of export. Include the name of the project in the value:

export TKG_CUSTOM_IMAGE_REPOSITORY="custom-image-repository.io/yourproject"
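For example, on Windows the equivalent command would be similar to the following, with the same placeholder registry address and project name:

SET TKG_CUSTOM_IMAGE_REPOSITORY=custom-image-repository.io/yourproject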

3 If your private Docker registry uses a self-signed certificate, provide the CA certificate in base64 encoded format.

export TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE=LS0t[...]tLS0tLQ==

If you specify the CA certificate in this option, it is automatically injected into all Tanzu Kubernetes clusters that you create in this Tanzu Kubernetes Grid instance.

On Windows platforms, use the SET command instead of export.
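If your CA certificate is available as a file, you can generate the base64 string directly. This is a convenience sketch that assumes the certificate is saved as ca.crt on a Linux machine:

export TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE=$(base64 -w 0 ca.crt)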

4 If your airgapped environment has a DNS server, check that it includes an entry for your private Docker registry. If your environment lacks a DNS server, modify overlay files as follows to add the registry to the /etc/hosts files of the TKr Controller and all control plane and worker nodes:

n Add the following to the ytt overlay file for your infrastructure, ~/.tanzu/tkg/providers/infrastructure-IAAS/ytt/IAAS-overlay.yaml, where IAAS is vsphere, aws, or azure.

#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
kind: KubeadmControlPlane
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #! Add nameserver to all k8s nodes
    #@overlay/append
    - echo "PRIVATE-REGISTRY-IP PRIVATE-REGISTRY-HOSTNAME" >> /etc/hosts

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
kind: KubeadmConfigTemplate
spec:
  template:
    spec:
      preKubeadmCommands:
      #! Add nameserver to all k8s nodes
      #@overlay/append
      - echo "PRIVATE-REGISTRY-IP PRIVATE-REGISTRY-HOSTNAME" >> /etc/hosts

Where PRIVATE-REGISTRY-IP and PRIVATE-REGISTRY-HOSTNAME are the IP address and name of your private Docker registry.

n In your TKr Controller customization overlay file, ~/.tanzu/tkg/providers/ytt/03_customizations/01_tkr/tkr_overlay.lib.yaml, add the following into the spec.template.spec section, before the containers block and at the same indent level:

#@overlay/match missing_ok=True
hostAliases:
- ip: PRIVATE-REGISTRY-IP
  hostnames:
  - PRIVATE-REGISTRY-HOSTNAME

Where PRIVATE-REGISTRY-IP and PRIVATE-REGISTRY-HOSTNAME are the IP address and name of your private Docker registry.

Step 2: Generate the publish-images Script

1 Copy and paste the following shell script in a text editor, and save it as gen-publish-images.sh.

#!/usr/bin/env bash
# Copyright 2020 The TKG Contributors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

set -euo pipefail

TANZU_BOM_DIR=${HOME}/.tanzu/tkg/bom
LEGACY_BOM_DIR=${HOME}/.tkg/bom
INSTALL_INSTRUCTIONS='See https://github.com/mikefarah/yq#install for installation instructions'

echodual() {
  echo "$@" 1>&2
  echo "#" "$@"
}

if [ -z "$TKG_CUSTOM_IMAGE_REPOSITORY" ]; then
  echo "TKG_CUSTOM_IMAGE_REPOSITORY variable is not defined" >&2
  exit 1
fi

if [[ -d "$TANZU_BOM_DIR" ]]; then
  BOM_DIR="${TANZU_BOM_DIR}"
elif [[ -d "$LEGACY_BOM_DIR" ]]; then
  BOM_DIR="${LEGACY_BOM_DIR}"
else
  echo "Tanzu Kubernetes Grid directories not found. Run CLI once to initialise." >&2
  exit 2
fi

if ! [ -x "$(command -v imgpkg)" ]; then
  echo 'Error: imgpkg is not installed.' >&2
  exit 3
fi

if ! [ -x "$(command -v yq)" ]; then
  echo 'Error: yq is not installed.' >&2
  echo "${INSTALL_INSTRUCTIONS}" >&2
  exit 3
fi

echo "set -euo pipefail"
echodual "Note that yq must be version above or equal to version 4.9.2 and below version 5."

actualImageRepository=""
# Iterate through BoM file to create the complete Image name
# and then pull, retag and push image to custom registry.
for TKG_BOM_FILE in "$BOM_DIR"/*.yaml; do
  echodual "Processing BOM file ${TKG_BOM_FILE}"
  # Get actual image repository from BoM file
  actualImageRepository=$(yq e '.imageConfig.imageRepository' "$TKG_BOM_FILE")
  yq e '.. | select(has("images"))|.images[] | .imagePath + ":" + .tag ' "$TKG_BOM_FILE" |
  while read -r image; do
    actualImage=${actualImageRepository}/${image}
    customImage=$TKG_CUSTOM_IMAGE_REPOSITORY/${image}
    echo "docker pull $actualImage"
    echo "docker tag $actualImage $customImage"
    echo "docker push $customImage"
    echo ""
  done
  echodual "Finished processing BOM file ${TKG_BOM_FILE}"
  echo ""
done

# Iterate through TKr BoM file to create the complete Image name
# and then pull, retag and push image to custom registry.
list=$(imgpkg tag list -i ${actualImageRepository}/tkr-bom)
for imageTag in ${list}; do
  if [[ ${imageTag} == v* ]]; then
    TKR_BOM_FILE="tkr-bom-${imageTag//_/+}.yaml"
    echodual "Processing TKR BOM file ${TKR_BOM_FILE}"

    actualTKRImage=${actualImageRepository}/tkr-bom:${imageTag}
    customTKRImage=${TKG_CUSTOM_IMAGE_REPOSITORY}/tkr-bom:${imageTag}
    echo ""
    echo "docker pull $actualTKRImage"
    echo "docker tag $actualTKRImage $customTKRImage"
    echo "docker push $customTKRImage"
    imgpkg pull --image ${actualImageRepository}/tkr-bom:${imageTag} --output "tmp" > /dev/null 2>&1
    yq e '.. | select(has("images"))|.images[] | .imagePath + ":" + .tag ' "tmp/$TKR_BOM_FILE" |
    while read -r image; do
      actualImage=${actualImageRepository}/${image}
      customImage=$TKG_CUSTOM_IMAGE_REPOSITORY/${image}
      echo "docker pull $actualImage"
      echo "docker tag $actualImage $customImage"
      echo "docker push $customImage"
      echo ""
    done
    rm -rf tmp
    echodual "Finished processing TKR BOM file ${TKR_BOM_FILE}"
    echo ""
  fi
done

list=$(imgpkg tag list -i ${actualImageRepository}/tkr-compatibility)
for imageTag in ${list}; do
  if [[ ${imageTag} == v* ]]; then
    echodual "Processing TKR compatibility image"
    actualImage=${actualImageRepository}/tkr-compatibility:${imageTag}
    customImage=$TKG_CUSTOM_IMAGE_REPOSITORY/tkr-compatibility:${imageTag}
    echo ""
    echo "docker pull $actualImageRepository/tkr-compatibility:$imageTag"
    echo "docker tag $actualImage $customImage"
    echo "docker push $customImage"
    echo ""
    echodual "Finished processing TKR compatibility image"
  fi
done

2 Make the gen-publish-images script executable.

chmod +x gen-publish-images.sh


3 Generate a publish-images shell script that is populated with the address of your private Docker registry.

./gen-publish-images.sh > publish-images.sh

4 Verify that the generated script contains the correct registry address.

cat publish-images.sh
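For each image listed in the BoM files, the generated script should contain a pull, tag, and push triplet similar to the following, where IMAGE-PATH and TAG vary per image and the target repository is the address that you exported:

docker pull projects.registry.vmware.com/tkg/IMAGE-PATH:TAG
docker tag projects.registry.vmware.com/tkg/IMAGE-PATH:TAG custom-image-repository.io/yourproject/IMAGE-PATH:TAG
docker push custom-image-repository.io/yourproject/IMAGE-PATH:TAG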

Step 3: Run the publish-images Script

1 Make the publish-images script executable.

chmod +x publish-images.sh

2 Log in to your local private registry.

docker login ${TKG_CUSTOM_IMAGE_REPOSITORY}

3 Run the publish-images script to pull the required images from the public Tanzu Kubernetes Grid registry, retag them, and push them to your private registry.

./publish-images.sh

If your registry lacks sufficient storage for all images in the publish-images script, re-generate and re-run the script after either:

n Increasing the persistentVolumeClaim.registry.size value in your Harbor extension configuration. See Deploy Harbor Registry as a Shared Service and the extensions/registry/harbor/harbor-data-values.yaml.example file in the VMware Tanzu Kubernetes Grid Extensions Manifest download for your version on the Tanzu Kubernetes Grid downloads page.

n Removing additional BoM files from the ~/.tanzu/tkg/bom directory, as described in Step 1: Prepare Environment.

4 When the script finishes, do the following, depending on your infrastructure:

n vSphere: Turn off your Internet connection.

n Amazon EC2: Use the offline VPC's S3 gateway to transfer the Docker containers from the online registry to the offline registry, and the tanzu CLI and dependencies from the online bootstrap machine to the offline bootstrap machine.

You can do this from the AWS console by uploading and downloading tar archives from the online registry and machine, and to the offline registry and machine.
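As a hedged sketch of one way to move the artifacts across the airgap with tar archives and the S3 gateway endpoint, where the image names, bucket name, and file name are placeholders and your process may differ:

# On the online machine: archive images and upload them to the S3 bucket
docker save -o tkg-images.tar IMAGE-1 IMAGE-2
aws s3 cp tkg-images.tar s3://YOUR-BUCKET/tkg-images.tar

# On the offline bootstrap VM: download the archive, load the images, then retag and push them to the offline registry
aws s3 cp s3://YOUR-BUCKET/tkg-images.tar .
docker load -i tkg-images.tar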


Step 4: Set Environment Variables

As long as the TKG_CUSTOM_IMAGE_REPOSITORY variable remains set, when you deploy clusters, Tanzu Kubernetes Grid will pull images from your local private registry rather than from the external public registry. To make sure that Tanzu Kubernetes Grid always pulls images from the local private registry, add TKG_CUSTOM_IMAGE_REPOSITORY to the cluster configuration file, which defaults to ~/.tanzu/tkg/cluster-config.yaml. If your Docker registry uses self-signed certificates, also add TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY or TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE to the cluster configuration file.

TKG_CUSTOM_IMAGE_REPOSITORY: custom-image-repository.io/yourproject
TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: true

If you specify TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE, set TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY to false and provide the CA certificate in base64 encoded format by executing base64 -w 0 your-ca.crt.

TKG_CUSTOM_IMAGE_REPOSITORY: custom-image-repository.io/yourproject
TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: false
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: LS0t[...]tLS0tLQ==

Step 5: Initialize Tanzu Kubernetes Grid

1 If your offline bootstrap machine does not have a ~/.tanzu directory, because you have not yet created a Tanzu Kubernetes Grid management cluster with it, run tanzu cluster list to create the directory.

2 (Amazon EC2) On the offline machine, customize your AWS management cluster's load balancer template to use an internal scheme, avoiding the need for a public-facing load balancer. To do this, add the following into the overlay file ~/.tanzu/tkg/providers/ytt/03_customizations/internal_lb.yaml:

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"AWSCluster"})
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: AWSCluster
spec:
  #@overlay/match missing_ok=True
  controlPlaneLoadBalancer:
    #@overlay/match missing_ok=True
    scheme: "internal"


What to Do Next

Your Internet-restricted environment is now ready for you to deploy or upgrade Tanzu Kubernetes Grid management clusters and start deploying Tanzu Kubernetes clusters on vSphere or Amazon EC2.

n Prepare to Deploy Management Clusters to vSphere

n Prepare to Deploy Management Clusters to Amazon EC2

n If you performed this procedure as a part of an upgrade, see Chapter 9 Upgrading Tanzu Kubernetes Grid.

You can also optionally deploy the Tanzu Kubernetes Grid extensions and use the Harbor Shared service instead of your private Docker registry.

Deploying the Tanzu Kubernetes Grid Extensions in an Internet-Restricted Environment

If you are using Tanzu Kubernetes Grid in an internet-restricted environment, after you download and unpack the Tanzu Kubernetes Grid extensions bundle, you must edit the extension files so that Tanzu Kubernetes Grid pulls images from your local private Docker registry rather than from the external Internet.

1 On a machine with an internet connection, Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.

2 Copy the unpacked folder of extensions manifests to the machine within your firewall on which you run the tanzu CLI.

3 Use the search and replace utility of your choice to search recursively through the tkg-extensions-v1.3.1+vmware.1 folder and replace projects.registry.vmware.com/tkg with the address of your private Docker registry, for example custom-image-repository.io/yourproject.

After making this change, when you implement the Tanzu Kubernetes Grid extensions, images will be pulled from your local private Docker registry rather than from projects.registry.vmware.com/tkg.
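On a Linux machine, one possible way to perform the recursive replacement is with grep and sed, treating the registry address below as a placeholder and verifying the results afterwards:

grep -rl 'projects.registry.vmware.com/tkg' tkg-extensions-v1.3.1+vmware.1 | xargs sed -i 's|projects.registry.vmware.com/tkg|custom-image-repository.io/yourproject|g'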

To deploy the extensions, you must also install the Carvel tools on the machine on which you run the tanzu CLI. For more information, see Install the Carvel Tools.

Using the Harbor Shared Service in Internet-Restricted Environments

In Internet-restricted environments, you can set up Harbor as a shared service so that your Tanzu Kubernetes Grid instance uses it instead of an external registry. As described in the procedures above, to deploy Tanzu Kubernetes Grid in an Internet-restricted environment, you must have a private container registry running in your environment before you can deploy a management cluster. This private registry is a central registry that is part of your infrastructure and available to your whole environment, but is not necessarily based on Harbor or supported by VMware. This private registry is not a Tanzu Kubernetes Grid shared service; you deploy that registry later.


After you use this central registry to deploy a management cluster in an Internet-restricted environment, you configure Tanzu Kubernetes Grid so that Tanzu Kubernetes clusters pull images from the central registry rather than from the external Internet. If the central registry uses a trusted CA certificate, connections between Tanzu Kubernetes clusters and the registry are secure. If your central registry uses self-signed certificates, you can set TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY to false and specify the TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE option. Setting this option automatically injects your self-signed certificates into your Tanzu Kubernetes clusters.

In either case, after you use your central registry to deploy a management cluster in an Internet- restricted environment, VMware recommends deploying the Harbor shared service in your Tanzu Kubernetes Grid instance and then configuring Tanzu Kubernetes Grid so that Tanzu Kubernetes clusters pull images from the Harbor shared service managed by Tanzu Kubernetes Grid, rather than from the central registry.

On infrastructures with load balancing, VMware recommends installing the External DNS service alongside the Harbor service, as described in Harbor Registry and External DNS.

For information about how to deploy the Harbor shared service, see Deploy Harbor Registry as a Shared Service.

Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch

If you use VMware Tanzu Kubernetes Grid to deploy management clusters to Amazon EC2 or Microsoft Azure, load balancer instances are created automatically on those platforms. To provide load balancing services to deployments on vSphere, Tanzu Kubernetes Grid includes VMware NSX Advanced Load Balancer Essentials Edition.

NSX Advanced Load Balancer, formerly known as Avi Vantage, provides an L4 load balancing solution. NSX Advanced Load Balancer includes a Kubernetes operator that integrates with the Kubernetes API to manage the lifecycle of load balancing and ingress resources for workloads.

NSX Advanced Load Balancer Deployment Topology

NSX Advanced Load Balancer includes the following components:

n Avi Kubernetes Operator (AKO) provides the load balancer functionality for Kubernetes clusters. It listens to Kubernetes Ingress and Service Type LoadBalancer objects and interacts with the Avi Controller APIs to create VirtualService objects.

n Service Engines (SE) implement the data plane in a VM form factor.

n Avi Controller manages VirtualService objects and interacts with the vCenter Server infrastructure to manage the lifecycle of the service engines (SEs). It is the portal for viewing the health of VirtualServices and SEs and the associated analytics that NSX Advanced Load Balancer provides. It is also the point of control for monitoring and maintenance operations such as backup and restore.


n SE Groups provide a unit of isolation in the form of a set of Service Engines, for example a dedicated SE group for specific important namespaces. This offers control in the form of the flavor of SEs (CPU, Memory, and so on) that needs to be created and also the limits on the maximum number of SEs that are permitted.

You can deploy NSX Advanced Load Balancer in the topology illustrated in the figure below.

The topology diagram above shows the following configuration:

n The Avi Controller is connected to the management port group.

n The service engines are connected to the management port group and one or more VIP port groups. Service engines run in dual-arm mode.

n Avi Kubernetes Operator is installed on the Tanzu Kubernetes clusters and should be able to route to the controller's management IP.

n Avi Kubernetes Operator is installed in NodePort mode only.

Recommendations

n For setups with a small number of Tanzu Kubernetes clusters that each have a large number of nodes, it is recommended to use one dedicated SE group per cluster.

n For setups with a large number of Tanzu Kubernetes clusters that each have a small number of nodes, it is recommended to share an SE group between multiple clusters.

n An SE group can be shared by any number of workload clusters as long as the sum of the number of distinct cluster node networks and the number of distinct cluster VIP networks is no bigger than 8.

n All clusters can share a single VIP network or each cluster can have a dedicated VIP network.

n Clusters that share a VIP network should be grouped by labels, and a dedicated AKODeploymentConfig should be created for them in the management cluster.


n For simplicity, in a lab environment all components can be connected to the same port group to which the Tanzu Kubernetes clusters are connected.

In the topology illustrated above, NSX Advanced Load Balancer provides the following networking, IPAM, isolation, tenancy, and Avi Kubernetes Operator functionalities.

Networking

n SEs are deployed in a dual-arm mode in relation to the data path, with connectivity both to the VIP network and to the workload cluster node network.

n The VIP network and the workload networks must be discoverable in the same vCenter Cloud so that the Avi Controller can create SEs attached to both networks.

n VIP and SE data interface IP addresses are allocated from the VIP network.

n There can only be one VIP network per workload cluster. However, different VIP networks can be assigned to different workload clusters, for example in a large Tanzu Kubernetes Grid deployment.

IPAM

n If DHCP is not available, IPAM for the VIP and SE Interface IP address is managed by Avi Controller.

n The IPAM profile in Avi Controller is configured with a Cloud and a set of Usable Networks.

n If DHCP is not configured for the VIP network, at least one static pool must be created for the target network.

Resource Isolation

n Dataplane isolation across Tanzu Kubernetes clusters can be provided by using SE Groups. The vSphere admin can configure a dedicated SE Group and configure that for a set of Tanzu Kubernetes clusters that need isolation.

n SE Groups offer the ability to control the resource characteristics of the SEs created by the Avi Controller, for example, CPU, memory, and so on.

Tenancy

With NSX Advanced Load Balancer Essentials, all workload cluster users are associated with the single admin tenant.

Avi Kubernetes Operator

Avi Kubernetes Operator is installed on Tanzu Kubernetes clusters. It is configured with the Avi Controller IP address and the user credentials that Avi Kubernetes Operator uses to communicate with the Avi Controller. A dedicated user per workload cluster is created with the admin tenant and a customized role. This role has limited access, as defined in https://github.com/avinetworks/avi-helm-charts/blob/master/docs/AKO/roles/ako-essential.json.


Install Avi Controller on vCenter Server

You install Avi Controller on vCenter Server by downloading and deploying an OVA template. These instructions provide guidance specific to deploying Avi Controller for Tanzu Kubernetes Grid.

1 Make sure your vCenter environment fulfills the prerequisites described in Installing Avi Vantage for VMware vCenter in the Avi Networks documentation.

2 Access the Avi Networks portal from the Tanzu Kubernetes Grid downloads page.

3 In the VMware NSX Advanced Load Balancer row, click Go to Downloads.

4 Click Download Now to go to the NSX Advanced Load Balancer Customer Portal.

5 In the customer portal, go to Software > 20.1.3.

6 Scroll down to VMware, and click the download button for Controller OVA.

7 Log in to the vSphere Client.

8 In the vSphere Client, right-click an object in the vCenter Server inventory and select Deploy OVF Template.

9 Select Local File, click the button to upload files, and navigate to the downloaded OVA file on your local machine.

10 Follow the installer prompts to deploy a VM from the OVA template, referring to the Deploying Avi Controller OVA instructions in the Avi Networks documentation.

Select the following options in the OVA deployment wizard:

n Provide a name for the Controller VM, for example, nsx-adv-lb-controller and the datacenter in which to deploy it.

n Select the cluster in which to deploy the Controller VM.

n Review the OVA details, then select a datastore for the VM files. For the disk format, select Thick Provision Lazy Zeroed.

n For the network mapping, select a port group for the Controller to use to communicate with vCenter Server. The network must have access to the management network on which vCenter Server is running.

n If DHCP is available, you can use it for controller management.

n Specify the management IP address, subnet mask, and default gateway. If you use DHCP, you can leave these fields empty.

n Leave the key field in the template empty.

n On the final page of the installer, click Finish to start the deployment.

It takes some time for the deployment to finish.

11 When the OVA deployment finishes, power on the resulting VM.

After you power on the VM, it takes some time for it to be ready to use.


12 In vCenter, create a vSphere account for the Avi controller, with permissions as described in VMware User Role for Avi Vantage in the Avi Networks documentation.

Avi Controller Setup: Basics

For full details of how to set up the Controller, see Performing the Avi Controller Initial Setup in the Avi Controller documentation.

This section provides some information about configuration that has been validated on Tanzu Kubernetes Grid, as well as some tips that are not included in the Avi Controller documentation.

NOTE: If you are using an existing Avi Controller, you must make sure that the VIP Network that is used during Tanzu Kubernetes Grid management cluster deployment has a unique name across all AVI Clouds.

1 In a browser, go to the IP address of the Controller VM.

2 Configure a password to create an admin account.

3 Optionally set DNS Resolvers and NTP server information, set the backup passphrase, and click Next.

Setting the backup passphrase is mandatory.

4 Select None to skip SMTP configuration, and click Next.

5 For Orchestrator Integration, select VMware.

6 Enter the vCenter Server credentials and the IP address or FQDN of the vCenter Server instance.

7 For Permissions, select Write.

This allows the Controller to create and manage SE VMs.

8 For SDN Integration select None and click Next.

9 Select the vSphere Datacenter.

10 For System IP Address Management, select DHCP.

11 For Virtual Service Placement Settings, leave both check boxes unchecked and click Next.

12 Select a distributed virtual switch to use as the management network, select DHCP and click Next.

n The switch is used for the management network NIC in the SEs.

n Select the same network as you used when you deployed the controller.

13 For Support Multiple Tenants, select No.


Avi Controller Setup: IPAM and DNS

There are additional settings to configure in the Controller UI before you can use NSX Advanced Load Balancer.

1 In the Controller UI, go to Applications > Templates > Profiles > IPAM/DNS Profiles, click Create and select IPAM Profile.

n Enter a name for the profile, for example, tkg-ipam-profile.

n Leave the Type set to Avi Vantage IPAM.

n Leave Allocate IP in VRF unchecked.

n Click Add Usable Network.

n Select Default-Cloud.

n For Usable Network, select the distributed virtual switch that you selected in the preceding procedure.

n (Optional) Click Add Usable Network to configure additional VIP networks.

n Click Save.


2 In the IPAM/DNS Profiles view, click Create again and select DNS Profile.

NOTE: The DNS Profile is optional for using Service type LoadBalancer.

n Enter a name for the profile, for example, tkg-dns-profile.

n For Type, select AVI Vantage DNS

n Click Add DNS Service Domain and enter at least one Domain Name entry, for example tkg.nsxlb.vmware.com.

n This should be from a DNS domain that you can manage.

n This is more important for the L7 Ingress configurations, in which the Controller bases the logic to route traffic on hostnames.

n Ingress resources that the Controller manages should use host names that belong to the domain name that you select here.


n This domain name is also used for Services of type LoadBalancer, but it is mostly relevant if you use AVI DNS VS as your Name Server.

n Each Virtual Service will create an entry in the AVI DNS configuration. For example, service.namespace.tkg-lab.vmware.com.

n Click Save.

3 Click the menu in the top left corner and select Infrastructure > Clouds.

4 For Default-Cloud, click the edit icon and under IPAM Profile and DNS Profile, select the IPAM and DNS profiles that you created above.


5 Select the DataCenter tab.

n Leave DHCP enabled. This is set per network.

n Leave the IPv6... and Static Routes... check boxes unchecked.

6 Do not update the Network section yet.

7 Save the cloud configuration.

8 Go to Infrastructure > Networks and click the edit icon for the network you are using as the VIP network.

9 Edit the network to add a pool of IPs to be used as a VIP.

Edit the subnet and add an IP Address pool range within the boundaries, for example 192.168.14.210-192.168.14.219.


Avi Controller Setup: Custom Certificate

The default NSX Advanced Load Balancer certificate does not contain the Controller's IP or FQDN in the Subject Alternate Names (SAN); however, valid SANs must be defined in the Avi Controller's certificate. Consequently, you must create a custom certificate to provide when you deploy management clusters.

1 In the Controller UI, click the menu in the top left corner and select Templates > Security > SSL/TLS Certificates, click Create, and select Controller Certificate.

2 Enter the same name in the Name and Common Name text boxes.

3 Select Self-Signed.

4 For Subject Alternate Name (SAN), enter either the IP address or FQDN, or both, of the Controller VM.

If only the IP address or FQDN is used, it must match the value that you use for Controller Host when you configure NSX Advanced Load Balancer settings during management cluster deployment, or specify in the AVI_CONTROLLER variable in the management cluster configuration file.

5 Leave the other fields empty and click Save.

6 In the menu in the top left corner, select Administration > Settings > Access Settings, and click the edit icon in System Access Settings.

7 Delete all of the certificates in SSL/TLS Certificate.

8 Use the SSL/TLS Certificate drop-down menu to add the custom certificate that you created above.

9 In the menu in the top left corner, select Templates > Security > SSL/TLS Certificates, select the certificate that you created, and click the export icon.

10 Copy the certificate contents.

You will need the certificate contents when you deploy management clusters.

Avi Controller Setup: Essentials License

Finish setting up the Avi Controller by enabling the Essentials license, if required.

1 In the Controller UI, go to Administration > Settings > Licensing. The Licensing screen appears.

2 In the Licensing screen, click the crank wheel icon that is next to Licensing.

3 In the list of license types, select Essentials License. Click Save, and then click Next.

4 In the Licensing screen, verify that the license has been set to Essentials.

5 To create a default gateway route for the traffic to flow from the service engines to the Pods and then back to the clients, go to Infrastructure > Routing > Static Route in the Controller UI, and click CREATE.


6 In the Edit Static Route:1 screen, enter the following details:

n Gateway Subnet: 0.0.0.0/0

n Next Hop: The gateway IP address of the virtual IP network that you want to use

7 Click SAVE.

After the Essentials Tier is enabled on a Controller that has not been configured already, the default Service Engine group is switched to the Legacy (Active/Standby) HA mode, which is the only mode that the Essentials Tier supports.

Update the Avi Certificate

Tanzu Kubernetes Grid authenticates to the Avi Controller by using certificates. When these certificates near expiration, update them by using the Tanzu CLI. You can update the certificates in an existing workload cluster, or in a management cluster for use by new workload clusters. Newly created workload clusters obtain their Avi certificate from their management cluster.

Update the Avi Certificate in an Existing Workload Cluster

Updating the Avi certificate in an existing workload cluster is performed through the workload cluster context in the Tanzu CLI. Before performing this task, ensure that you have the workload cluster context and the new base64 encoded Avi certificate details. For more information on obtaining the workload cluster context, see Retrieve Tanzu Kubernetes Cluster kubeconfig.
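If the new Avi CA certificate is available as a PEM file, you can produce the base64 encoded value with a command along these lines, where the file name is a placeholder:

base64 -w 0 new-avi-ca.crt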

1 In the Tanzu CLI, run the following command to switch the context to the workload cluster:

kubectl config use-context WORKLOAD-CLUSTER-CONTEXT

2 Run the following command to update the avi-secret value under avi-system namespace:

kubectl edit secret avi-secret -n avi-system

In the text editor that opens, update the certificateAuthorityData field with your new base64 encoded certificate data.

3 Save the changes.

4 Run the following command to obtain the number of Avi Kubernetes Operator (AKO) pods in your environment:

kubectl get pod -n avi-system

Record the pod numbering in the output. Numbering starts from 0, so a single pod named ako-0 indicates one AKO pod in the environment.

5 Run the following command to restart the AKO pods:

kubectl delete pod ako-NUMBER -n avi-system

Where NUMBER is the pod number that you recorded in the previous step.
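For example, in an environment with a single AKO pod, the command would typically be:

kubectl delete pod ako-0 -n avi-system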


Update the Avi Certificate in a Management Cluster

Workload clusters obtain their Avi certificates from their management cluster. This procedure updates the Avi certificate in a management cluster. The management cluster then includes the updated certificate in any new workload clusters that it creates.

Before performing this task, ensure that you have the management cluster context and the new base64 encoded Avi certificate details. For more information on obtaining the management cluster context, see Retrieve Tanzu Kubernetes Cluster kubeconfig.

1 In the Tanzu CLI, run the following command to switch the context to the management cluster:

kubectl config use-context MANAGEMENT-CLUSTER-CONTEXT

2 Run the following command to update the avi-controller-ca value under the tkg-system-networking namespace:

kubectl edit secret avi-controller-ca -n tkg-system-networking

In the text editor that opens, update the certificateAuthorityData field with your new base64 encoded certificate data.

3 Save the changes.

4 Run the following command to obtain the AKO Controller Manager string:

kubectl get pod -n tkg-system-networking

Note down the random string in the pod name in the output. You will require this string when you restart the AKO operator pod.

5 Run the following command to restart the AKO pods:

kubectl delete po ako-operator-controller-manager-RANDOM-STRING -n tkg-system-networking

Where RANDOM-STRING is the string that you noted down in the previous step.

Create an Additional Service Engine Group for NSX Advanced Load Balancer

The NSX Advanced Load Balancer Essentials Tier has limited high-availability (HA) capabilities. To distribute the load balancer services to different service engine groups (SEG), create additional SEGs on the Avi Controller, and create a new AKO configuration object (akodeploymentconfig object) in a YAML file in the management cluster. Alternatively, you can update an existing akodeploymentconfig object in the management cluster with the name of the new SEG.

1 In the Avi Controller UI, go to Infrastructure > Service Engine Groups, and click CREATE to create the new SEG.


2 Create the service engine group as follows:

3 If you want to create a new akodeploymentconfig object for the new SEG, do the following steps on the command terminal:

a Run the following command to open the text editor.

vi FILE_NAME

Where FILE_NAME is the name of the akodeploymentconfig YAML file that you want to create.

b Add the AKO configuration details in the file. The following is an example:

apiVersion: networking.tkg.tanzu.vmware.com/v1alpha1
kind: AKODeploymentConfig
metadata:
  name: install-ako-for-all
spec:
  adminCredentialRef:
    name: avi-controller-credentials
    namespace: tkg-system-networking
  certificateAuthorityRef:
    name: avi-controller-ca
    namespace: tkg-system-networking
  cloudName: Default-Cloud
  controller: 10.184.74.162
  dataNetwork:
    cidr: 10.184.64.0/20
    name: VM Network
  extraConfigs:
    cniPlugin: antrea
    disableStaticRouteSync: true
    image:
      pullPolicy: IfNotPresent
      repository: projects.registry.vmware.com/tkg/ako
      version: v1.4.3_vmware.1
    ingress:
      defaultIngressController: false
      disableIngressClass: true
  serviceEngineGroup: SEG-1

c Save the file, and exit the text editor.

d Run the following command to apply the new configuration:

kubectl apply -f FILE_NAME

Where FILE_NAME is the name of the YAML file that you have created.

4 If you want to update an existing akodeploymentconfig object for the new SEG, do the following steps on the command terminal:

a Run the following command to open the akodeploymentconfig object:

kubectl edit adc ADC_NAME

Where ADC_NAME is the name of the akodeploymentconfig object in the YAML file.

b Update the SEG name in the text editor that pops up.

c Save the file, and exit the text editor.

5 Run the following command to verify that the new configuration is present in the management cluster:

kubectl get adc ADC_NAME -o yaml

Where ADC_NAME is the name of the akodeploymentconfig object in the YAML file.

In the file, verify that the adc.spec.serviceEngineGroup field displays the name of the new service engine group.
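Alternatively, you can print just that field with a JSONPath expression, for example:

kubectl get adc ADC_NAME -o jsonpath='{.spec.serviceEngineGroup}'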

1 Switch the context to the workload cluster by using the kubectl utility.


2 Run the following command to view the AKO deployment information:

kubectl get cm avi-k8s-config -n avi-system -o yaml

In the output, verify that the service engine group has been updated.

3 Run the following command to verify that AKO is running:

kubectl get pod -n avi-system

What to Do Next

Your NSX Advanced Load Balancer deployment is ready for you to use with management clusters.

n Deploy Management Clusters with the Installer Interface

n Deploy Management Clusters from a Configuration File

Prepare a vSphere Management as a Service Infrastructure

Tanzu Kubernetes Grid runs on two Management as a Service (MaaS) products that provide a vSphere interface and environment to public cloud infrastructures: VMware Cloud on AWS and Azure VMware Solution.

This topic explains how to prepare these services and use them to create a bootstrap machine for deploying Tanzu Kubernetes Grid. For both VMware Cloud on AWS and Azure VMware Solution, the bootstrap machine is not a local physical machine, but is instead a cloud VM jumpbox that connects vSphere with its underlying infrastructure.

Preparing VMware Cloud on AWS

To run Tanzu Kubernetes Grid on VMware Cloud on AWS, set up a Software-Defined Datacenter (SDDC) and create a bootstrap VM as follows. The bootstrap machine is a VM managed through vCenter:

1 Log into the VMC Console and create a new SDDC by following the procedure Deploy an SDDC from the VMC Console in the VMware Cloud on AWS documentation.

n After you click Deploy SDDC, the SDDC creation process typically takes 2-3 hours.

2 Once the SDDC is created, open its pane in the VMC Console and click Networking & Security > Network > Segments.

3 The Segment List shows sddc-cgw-network-1 with a subnet CIDR of 192.168.1.1/24, giving 256 addresses. If you need more internal IP addresses, you can:

n Open sddc-cgw-network-1 and modify its subnet CIDR to something broader, like 192.168.1.1/20.

n Click Add Segment and create another network segment with a different subnet. Make sure the new subnet CIDR does not overlap with sddc-cgw-network-1 or any other existing segments.


4 Open sddc-cgw-network-1 and any other network segments. For each segment, click Edit DHCP Config. A Set DHCP Config pane appears.

5 In the Set DHCP Config pane:

n Set DHCP Config to Enabled.

n Set DHCP Ranges to an IP address range or CIDR within the segment's subnet, but that leaves a pool of addresses free to serve as static IP addresses for Tanzu Kubernetes clusters. Each management cluster and workload cluster that Tanzu Kubernetes Grid creates will require a unique static IP address from this pool.

6 To enable access to vCenter, add a firewall rule or set up a VPN, following the Connect to vCenter Server instructions in the VMware Cloud on AWS documentation.

7 To confirm access to vCenter, click OPEN VCENTER at upper-right in the SDDC pane. The vCenter client should appear.

8 From the vCenter portal, deploy your bootstrap machine and enable access to it following Deploy Workload VMs in the VMware Cloud on AWS documentation.

n You can log into the bootstrap machine by clicking Launch Web Console on its vCenter summary pane.

n (Optional) If you want to ssh into the bootstrap machine, in addition to using the web console within vCenter, see Set Up a VMware Cloud Bootstrap Machine for ssh, below.

9 When installing the Tanzu CLI, deploying management clusters, and performing other operations, follow the instructions for vSphere, not the instructions for Amazon EC2.

Set Up a VMware Cloud Bootstrap Machine for ssh

To set up your bootstrap machine for access via ssh, follow these procedures in the VMware Cloud on AWS documentation:

1 Assign a Public IP Address to a VM to request a public IP address for the bootstrap machine.

2 Create or Modify NAT Rules to create a NAT rule for the bootstrap machine, configured with:

n Public IP: The public IP address requested above.

n Internal IP: The IP address of the bootstrap machine. Can be either a static or DHCP IP.

3 Follow the procedure in Add or Modify Compute Gateway Firewall Rules to add a compute gateway rule that allows access to the VM.

Preparing Azure VMware Solution on Microsoft Azure

To run Tanzu Kubernetes Grid on Azure VMware Solution (AVS), set up AVS and its Windows 10 jumphost as follows. The jumphost serves as the bootstrap machine for Tanzu Kubernetes Grid:

1 Log into NSX-T Manager as admin.


2 Unless you are intentionally deploying to an airgapped environment, confirm that AVS is configured to allow internet connectivity for AVS-hosted VMs. This is not enabled by default. To configure this, you can either:

n Route outbound internet traffic through your on-premises datacenter by configuring Express Route Global Reach.

n Allow internet access via the AVS Express Route connection to the Azure network by logging into the Azure portal, navigating to the AVS Private Cloud object, selecting Manage > Connectivity, flipping the Internet enabled toggle to Enabled, and clicking Save.

3 Under Networking > Connectivity > Segments, click Add Segment, and configure the new segment with:

n Segment Name: An identifiable name, like avs_tkg

n Connected Gateway: The Tier-1 gateway that was predefined as part of your AVS account

n Subnets: A subnet such as 192.168.20.1/24

n DHCP Config > DHCP Range: An address range or CIDR within the subnet, for example 192.168.20.10-192.168.20.100. This range must exclude a pool of subnet addresses that DHCP cannot assign, leaving them free to serve as static IP addresses for Tanzu Kubernetes clusters. Each management cluster and workload cluster that Tanzu Kubernetes Grid creates will require a unique static IP address from the pool outside of this DHCP range.

n Transport Zone: Select the Overlay transport zone that was predefined as part of your AVS account.

Note: After you create the segment, it should be visible in vCenter.


4 From the IP Management > DHCP pane, click Add Server, and configure the new DHCP server with:

n Server Name: An identifiable name, like avs_tkg_dhcp

n Server IP Address: A range that does not overlap with the subnet of the segment created above, for example 192.168.30.1/24.

n Lease Time: 5400 seconds; shorter than the default interval, to release IP addresses sooner

5 Under Networking > Connectivity > Tier-1 Gateways, open the predefined gateway.

6 Click the Tier-1 gateway's IP Address Management setting and associate it with the DHCP server created above.

7 Configure a DNS forwarder in NSX-T Manager or the Azure portal:

n NSX-T Manager:

a Under Networking > IP Management > DNS, click DNS Zones.

b Click Add DNS Zone > Add Default Zone, and provide the following:

n Zone Name: An identifiable name like avs_tkg_dns_zone.

n DNS Servers: Up to three comma-separated IP addresses representing valid DNS servers.

c Click Save, and then select the DNS Services tab

d Click Add DNS Service, and provide the following:

n Name: An identifiable name, like avs_tkg_dns_svc.

n Tier0/Tier1 Gateway: The Tier-1 gateway that was predefined as part of your AVS account.

n DNS Service IP: An IP address that does not overlap with any other subnets created, such as 192.168.40.1.

n Default DNS Zone: Select the Zone Name defined earlier.

e Click Save.

n Azure Portal:

a Navigate to the AVS Private Cloud object and select Workload Networking > DNS.

b With the DNS zones tab selected, click Add and provide the following:

n Type: Default DNS zone.

n DNS zone name: An identifiable name like avs_tkg_dns_zone.

n DNS server IP: Up to three DNS servers.

c Click OK and then click the DNS service tab.


d Click Add and provide the following:

n Name: An identifiable name, like avs_tkg_dns_svc.

n DNS Service IP: An IP address that does not overlap with any other subnets created, such as 192.168.40.1.

n Default DNS Zone: Select the DNS zone name defined earlier.

e Click OK.

1 When installing the Tanzu CLI, deploying management clusters, and performing other operations, follow the instructions for vSphere, not the instructions for Azure. Configure the management cluster with:

n Kubernetes Network Settings > Network Name: The name of the new segment.

n Management Cluster Settings > Virtual IP Address: An IP address from the IP address range of the new segment.

What to Do Next

Your infrastructure and bootstrap machine are ready for you to deploy the Tanzu CLI. See Chapter 3 Install the Tanzu CLI and Other Tools for instructions, and then proceed to Prepare to Deploy Management Clusters to vSphere.

Deploy Management Clusters with the Installer Interface

This topic describes how to use the Tanzu Kubernetes Grid installer interface to deploy a management cluster to vSphere, Amazon Elastic Compute Cloud (Amazon EC2), and Microsoft Azure. The Tanzu Kubernetes Grid installer interface guides you through the deployment of the management cluster, and provides different configurations for you to select or reconfigure. If this is the first time that you are deploying a management cluster to a given infrastructure provider, it is recommended to use the installer interface.

Prerequisites

Before you can deploy a management cluster, you must make sure that your environment meets the requirements for the target infrastructure provider.

General Prerequisites

n Make sure that you have met all of the requirements and followed all of the procedures in Chapter 3 Install the Tanzu CLI and Other Tools.

n For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

n If you want to register your management cluster with Tanzu Mission Control, follow the procedure in Register Your Management Cluster with Tanzu Mission Control.


n If you are deploying clusters in an internet-restricted environment to either vSphere or Amazon EC2, you must also perform the steps in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment.

n Read the Tanzu Kubernetes Grid 1.3.1 Release Notes for updates related to security patches.

vSphere Prerequisites

n Make sure that you have met all of the requirements listed in Prepare to Deploy Management Clusters to vSphere.

n NOTE: On vSphere with Tanzu, you do not need to deploy a management cluster. See Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

Amazon EC2 Prerequisites

n Make sure that you have met all of the requirements listed in Prepare to Deploy Management Clusters to Amazon EC2.

n For information about the configurations of the different sizes of node instances, for example t3.large or t3.xlarge, see Amazon EC2 Instance Types.

n For information about when to create a Virtual Private Cloud (VPC) and when to reuse an existing VPC, see Resource Usage in Your Amazon Web Services Account.

Microsoft Azure Prerequisites

n Make sure that you have met the requirements listed in Prepare to Deploy Management Clusters to Microsoft Azure.

n For information about the configurations of the different sizes of node instances for Azure, for example, Standard_D2s_v3 or Standard_D4s_v3, see Sizes for virtual machines in Azure.

Set the TKG_BOM_CUSTOM_IMAGE_TAG

Before you can deploy a management cluster, you must specify the correct BOM file to use as a local environment variable. In the event of a patch release to Tanzu Kubernetes Grid, the BOM file may require an update to coincide with updated base image files.

Note For more information about recent security patch updates to VMware Tanzu Kubernetes Grid v1.3, see the VMware Tanzu Kubernetes Grid v1.3.1 Release Notes and this Knowledgebase Article.

On the machine where you run the Tanzu CLI, perform the following steps:

1 Remove any existing BOM data.

rm -rf ~/.tanzu/tkg/bom

2 Specify the updated BOM to use by setting the following variable.

export TKG_BOM_CUSTOM_IMAGE_TAG="v1.3.1-patch1"


3 Run the tanzu management-cluster create command with no additional parameters.

tanzu management-cluster create

This command produces an error but results in the BOM files being downloaded to ~/.tanzu/tkg/bom.
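
Optionally, you can confirm that the BOM files are now present before continuing. This check is not part of the documented procedure; it is only a quick sanity test:

ls ~/.tanzu/tkg/bom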

Start the Installer Interface

Warning: The tanzu management-cluster create command takes time to complete. While tanzu management-cluster create is running, do not run additional invocations of tanzu management-cluster create on the same bootstrap machine to deploy multiple management clusters, change context, or edit ~/.kube-tkg/config.

1 On the machine on which you downloaded and installed the Tanzu CLI, run the tanzu management-cluster create command with the --ui option.

tanzu management-cluster create --ui

The installer interface launches in a browser and takes you through steps to configure the management cluster.

n To make the installer interface appear locally if you are SSH-tunneling in to the bootstrap machine or X11-forwarding its display, you may need to run tanzu management-cluster create --ui with the --browser none option described in Installer Interface Options below.

The tanzu management-cluster create --ui command saves the settings from your installer input in a cluster configuration file. After you confirm your input values on the last pane of the installer interface, the installer saves them to ~/.tanzu/tkg/clusterconfigs with a generated filename of the form UNIQUE-ID.yaml.

By default, Tanzu Kubernetes Grid saves the kubeconfig for all management clusters in the ~/.kube-tkg/config file. If you want to save the kubeconfig file for your management cluster to a different location, set the KUBECONFIG environment variable before running tanzu management-cluster create.

export KUBECONFIG=/path/to/mc-kubeconfig.yaml

If the prerequisites are met, tanzu management-cluster create --ui launches the Tanzu Kubernetes Grid installer interface.

By default, tanzu management-cluster create --ui opens the installer interface locally, at http://127.0.0.1:8080 in your default browser. The Installer Interface Options section below explains how you can change where the installer interface runs, including running it on a different machine from the tanzu CLI.

2 Click the Deploy button for VMware vSphere, Amazon EC2, or Microsoft Azure.


Installer Interface Options

By default, tanzu management-cluster create --ui opens the installer interface locally, at http://127.0.0.1:8080 in your default browser. You can use the --browser and --bind options to control where the installer interface runs:

n --browser specifies the local browser to open the interface in.

n Supported values are chrome, firefox, safari, ie, edge, or none.

n Use none with --bind to run the interface on a different machine, as described below.

n --bind specifies the IP address and port to serve the interface from.

Warning: Serving the installer interface from a non-default IP address and port could expose the tanzu CLI to a potential security risk while the interface is running. VMware recommends passing an IP address and port on a secure network to the --bind option.

Use cases for --browser and --bind include:

n If another process is already using http://127.0.0.1:8080, use --bind to serve the interface from a different local port, as shown in the example after this list.

n To make the installer interface appear locally if you are SSH-tunneling in to the bootstrap machine or X11-forwarding its display, you may need to use --browser none.


n To run the tanzu CLI and create management clusters on a remote machine, and run the installer interface locally or elsewhere:

a On the remote bootstrap machine, run tanzu management-cluster create --ui with the following options and values:

n --bind: an IP address and port for the remote machine

n --browser: none

tanzu management-cluster create --ui --bind 192.168.1.87:5555 --browser none

b On the local UI machine, browse to the remote machine's IP address to access the installer interface.
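
For the first use case above, serving the interface from a different local port, a minimal sketch looks like the following; the port number 8081 is only an example of an unused local port:

tanzu management-cluster create --ui --bind 127.0.0.1:8081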

Configure the Infrastructure Provider

The options to configure the infrastructure provider section of the installer interface depend on which provider you are using.

n Configure a vSphere Infrastructure Provider

n Configure an Amazon EC2 Infrastructure Provider

n Configure a Microsoft Azure Infrastructure Provider

Configure a vSphere Infrastructure Provider

1 In the IaaS Provider section, enter the IP address or fully qualified domain name (FQDN) for the vCenter Server instance on which to deploy the management cluster.

Tanzu Kubernetes Grid does not support IPv6 addresses. This is because upstream Kubernetes only provides alpha support for IPv6. Always provide IPv4 addresses in the procedures in this topic.

2 Enter the vCenter Single Sign On username and password for a user account that has the required privileges for Tanzu Kubernetes Grid operation, and click Connect.


3 Verify the SSL thumbprint of the vCenter Server certificate and click Continue if it is valid.

For information about how to obtain the vCenter Server certificate thumbprint, see Obtain vSphere Certificate Thumbprints.

4 If you are deploying a management cluster to a vSphere 7 instance, confirm whether or not you want to proceed with the deployment.


On vSphere 7, the vSphere with Tanzu option includes a built-in supervisor cluster that works as a management cluster and provides a better experience than a separate management cluster deployed by Tanzu Kubernetes Grid. Deploying a Tanzu Kubernetes Grid management cluster to vSphere 7 when vSphere with Tanzu is not enabled is supported, but the preferred option is to enable vSphere with Tanzu and use the Supervisor Cluster. VMware Cloud on AWS and Azure VMware Solution do not support a supervisor cluster, so you need to deploy a management cluster. For information, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

To reflect the recommendation to use vSphere with Tanzu when deploying to vSphere 7, the Tanzu Kubernetes Grid installer behaves as follows:

n If vSphere with Tanzu is enabled, the installer informs you that deploying a management cluster is not possible, and exits.

n If vSphere with Tanzu is not enabled, the installer informs you that deploying a Tanzu Kubernetes Grid management cluster is possible but not recommended, and presents a choice:

n Configure vSphere with Tanzu opens the vSphere Client so you can configure your Supervisor Cluster as described in Configuring and Managing a Supervisor Cluster in the vSphere documentation.

n Deploy TKG Management Cluster allows you to continue deploying a management cluster, against recommendation for vSphere 7, but as required for VMware Cloud on AWS and Azure VMware Solution. When using vSphere 7, the preferred option is to enable vSphere with Tanzu and use the built-in Supervisor Cluster instead of deploying a Tanzu Kubernetes Grid management cluster.

5 Select the datacenter in which to deploy the management cluster from the Datacenter drop-down menu.

6 Paste the contents of your SSH public key into the text box and click Next.


For the next steps, go to Configure the Management Cluster Settings.

Configure an Amazon EC2 Infrastructure Provider

1 In the IaaS Provider section, enter credentials for your Amazon EC2 account. You have two options:

n In the AWS Credential Profile drop-down, you can select an already existing AWS credential profile. If you select a profile, the access key and session token information configured for your profile are passed to the Installer without displaying actual values in the UI. For information about setting up credential profiles, see Credential Files and Profiles.

n Alternately, enter AWS account credentials directly in the Access Key ID and Secret Access Key fields for your Amazon EC2 account. Optionally specify an AWS session token in Session Token if your AWS account is configured to require temporary credentials. For more information on acquiring session tokens, see Using temporary credentials with AWS resources.

2 In Region, select the AWS region in which to deploy the management cluster. If you intend to deploy a production management cluster, this region must have at least three availability zones. This region must also be registered with the SSH key entered in the next field.

3 In SSH Key Name, specify the name of an SSH key that is already registered with your Amazon EC2 account and in the region where you are deploying the management cluster. You may have set this up in Configure AWS Account Credentials and SSH Key.

4 If this is the first time that you are deploying a management cluster, select the Automate creation of AWS CloudFormation Stack checkbox, and click Connect.

This CloudFormation stack creates the identity and access management (IAM) resources that Tanzu Kubernetes Grid needs to deploy and run clusters on Amazon EC2. For more information, see Required IAM Resources in Prepare to Deploy Management Clusters to Amazon EC2.

IMPORTANT: The Automate creation of AWS CloudFormation Stack checkbox replaces the clusterawsadm command line utility that existed in Tanzu Kubernetes Grid v1.1.x and earlier. For existing management and Tanzu Kubernetes clusters initially deployed with v1.1.x or earlier, continue to use the CloudFormation stack that was created by running the clusterawsadm alpha bootstrap create-stack command.


5 If the connection is successful, click Next.

6 In the VPC for AWS section, do one of the following:

n To create a new VPC, select Create new VPC on AWS, check that the pre-filled CIDR block is available, and click Next. If the recommended CIDR block is not available, enter a new IP range in CIDR format for the management cluster to use. The recommended CIDR block for VPC CIDR is 10.0.0.0/16.

n To use an existing VPC, select Select an existing VPC and select the VPC ID from the drop-down menu. The VPC CIDR block is filled in automatically when you select the VPC.

For the next steps, go to Configure the Management Cluster Settings.


Configure a Microsoft Azure Infrastructure Provider

IMPORTANT: If this is the first time that you are deploying a management cluster to Azure with a new version of Tanzu Kubernetes Grid, for example v1.3.1, make sure that you have accepted the base image license for that version. For information, see Accept the Base Image License in Prepare to Deploy Management Clusters to Microsoft Azure.

1 In the IaaS Provider section, enter the Tenant ID, Client ID, Client Secret, and Subscription ID for your Azure account and click Connect. You recorded these values when you registered an Azure app and created a secret for it using the Azure Portal.

2 Select the Azure region in which to deploy the management cluster.

3 Paste the contents of your SSH public key, such as .ssh/id_rsa.pub, into the text box.

4 Under Resource Group, select either the Select an existing resource group or the Create a new resource group radio button.

n If you select Select an existing resource group, use the drop-down menu to select the group, then click Next.

n If you select Create a new resource group, enter a name for the new resource group and then click Next.


5 In the VNET for Azure section, select either the Create a new VNET on Azure or the Select an existing VNET radio button.

n If you select Create a new VNET on Azure, use the drop-down menu to select the resource group in which to create the VNET and provide the following:

n A name and a CIDR block for the VNET. The default is 10.0.0.0/16.

n A name and a CIDR block for the control plane subnet. The default is 10.0.0.0/24.

n A name and a CIDR block for the worker node subnet. The default is 10.0.1.0/24.

After configuring these fields, click Next.

n If you select Select an existing VNET, use the drop-down menus to select the resource group in which the VNET is located, the VNET name, the control plane and worker node subnets, and then click Next.

Configure the Management Cluster Settings

This section applies to all infrastructure providers.

1 In the Management Cluster Settings section, select the Development or Production tile.

n If you select Development, the installer deploys a management cluster with a single control plane node.


n If you select Production, the installer deploys a highly available management cluster with three control plane nodes.

2 In either of the Development or Production tiles, use the Instance type drop-down menu to select from different combinations of CPU, RAM, and storage for the control plane node VM or VMs.

Choose the configuration for the control plane node VMs depending on the expected workloads that they will run. For example, some workloads might require a large compute capacity but relatively little storage, while others might require a large amount of storage and less compute capacity. If you select an instance type in the Production tile, the instance type that you selected is automatically selected for the Worker Node Instance Type. If necessary, you can change this.

If you plan on registering the management cluster with Tanzu Mission Control, ensure that your Tanzu Kubernetes clusters meet the requirements listed in Requirements for Registering a Tanzu Kubernetes Cluster with Tanzu Mission Control in the Tanzu Mission Control documentation.

n vSphere: Select a size from the predefined CPU, memory, and storage configurations. The minimum configuration is 2 CPUs and 4 GB memory.

n Amazon EC2: Select an instance size. The drop-down menu lists choices alphabetically, not by size. The minimum configuration is 2 CPUs and 8 GB memory. The list of compatible instance types varies in different regions. For information about the configuration of the different sizes of instances, see Amazon EC2 Instance Types.

n Microsoft Azure: Select an instance size. The minimum configuration is 2 CPUs and 8 GB memory. The list of compatible instance types varies in different regions. For information about the configurations of the different sizes of node instances for Azure, see Sizes for virtual machines in Azure.

3 Optionally enter a name for your management cluster.

If you do not specify a name, Tanzu Kubernetes Grid automatically generates a unique name. If you do specify a name, that name must end with a letter, not a numeric character, and must be compliant with DNS hostname requirements as outlined in RFC 952 and amended in RFC 1123.

4 Under Worker Node Instance Type, select the configuration for the worker node VM.


5 Deselect the Machine Health Checks checkbox if you want to disable MachineHealthCheck.

MachineHealthCheck provides node health monitoring and node auto-repair on the clusters that you deploy with this management cluster. You can enable or disable MachineHealthCheck on clusters after deployment by using the CLI. For instructions, see Configure Machine Health Checks for Tanzu Kubernetes Clusters.

6 (Azure Only) If you are deploying the management cluster to Azure, click Next.

For the next steps for an Azure deployment, go to Configure Metadata.

7 (vSphere Only) Under Control Plane Endpoint, enter a static virtual IP address or FQDN for API requests to the management cluster.

Ensure that this IP address is not in your DHCP range, but is in the same subnet as the DHCP range. If you mapped an FQDN to the VIP address, you can specify the FQDN instead of the VIP address. For more information, see Static VIPs and Load Balancers for vSphere.

8 (Amazon EC2 only) Optionally, disable the Bastion Host checkbox if a bastion host already exists in the availability zone(s) in which you are deploying the management cluster.

If you leave this option enabled, Tanzu Kubernetes Grid creates a bastion host for you.

9 (Amazon EC2 only) Configure Availability Zones

a From the Availability Zone 1 drop-down menu, select an availability zone for the management cluster. You can select only one availability zone in the Development tile.

If you selected the Production tile above, use the Availability Zone 1, Availability Zone 2, and Availability Zone 3 drop-down menus to select three unique availability zones for the management cluster. When Tanzu Kubernetes Grid deploys the management cluster, which includes three control plane nodes, it distributes the control plane nodes across these availability zones.


b To complete the configuration of the Management Cluster Settings section, do one of the following:

n If you created a new VPC in the VPC for AWS section, click Next.

n If you selected an existing VPC in the VPC for AWS section, use the VPC public subnet and VPC private subnet drop-down menus to select existing subnets on the VPC and click Next.

10 Click Next.

n If you are deploying the management cluster to vSphere, go to Configure VMware NSX Advanced Load Balancer.

n If you are deploying the management cluster to Amazon EC2 or Azure, go to Configure Metadata.

(vSphere Only) Configure VMware NSX Advanced Load Balancer

VMware NSX Advanced Load Balancer provides an L4 load balancing solution for vSphere. NSX Advanced Load Balancer includes a Kubernetes operator that integrates with the Kubernetes API to manage the lifecycle of load balancing and ingress resources for workloads. To use NSX Advanced Load Balancer, you must first deploy it in your vSphere environment. For information, see Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch.

In the optional VMware NSX Advanced Load Balancer section, you can configure Tanzu Kubernetes Grid to use NSX Advanced Load Balancer. By default all workload clusters will use the load balancer.

1 For Controller Host, enter the IP address or FQDN of the Controller VM.

2 Enter the username and password that you set for the Controller host when you deployed it, and click Verify Credentials.

3 Use the Cloud Name drop-down menu to select the cloud that you created in your NSX Advanced Load Balancer deployment.

For example, Default-Cloud.

4 Use the Service Engine Group Name drop-down menu to select a Service Engine Group.

For example, Default-Group.


5 For VIP Network Name, use the drop-down menu to select the name of the network where the load balancer floating IP Pool resides.

The VIP network for NSX Advanced Load Balancer must be present in the same vCenter Server instance as the Kubernetes network that Tanzu Kubernetes Grid uses. This allows NSX Advanced Load Balancer to discover the Kubernetes network in vCenter Server and to deploy and configure Service Engines. The drop-down menu is present in Tanzu Kubernetes Grid v1.3.1 and later. In v1.3.0, you enter the name manually.

You can see the network in the Infrastructure > Networks view of the NSX Advanced Load Balancer interface.

6 For VIP Network CIDR, use the drop-down menu to select the CIDR of the subnet to use for the load balancer VIP.

This comes from one of the VIP Network's configured subnets. You can see the subnet CIDR for a particular network in the Infrastructure > Networks view of the NSX Advanced Load Balancer interface. The drop-down menu is present in Tanzu Kubernetes Grid v1.3.1 and later. In v1.3.0, you enter the CIDR manually.

7 Paste the contents of the Certificate Authority that is used to generate your Controller Certificate into the Controller Certificate Authority text box.

If you have a self-signed Controller Certificate, the Certificate Authority is the same as the Controller Certificate.

8 (Optional) Enter one or more cluster labels to identify clusters on which to selectively enable NSX Advanced Load Balancer or to customize NSX Advanced Load Balancer Settings per group of clusters.

By default, all clusters that you deploy with this management cluster will enable NSX Advanced Load Balancer. All clusters will share the same VMware NSX Advanced Load Balancer Controller, Cloud, Service Engine Group, and VIP Network as you entered previously. This cannot be changed later. To only enable the load balancer on a subset of clusters, or to preserve the ability to customize NSX Advanced Load Balancer settings for a group of clusters, add labels in the format key: value. For example team: tkg.

This is useful in the following scenarios:

n You want to configure different sets of workload clusters to different Service Engine Groups to implement isolation or to support more Service type Load Balancers than one Service Engine Group's capacity.

n You want to configure different sets of workload clusters to different Clouds because they are deployed in separate sites.


NOTE: Labels that you define here will be used to create a label selector. Only workload cluster Cluster objects that have the matching labels will have the load balancer enabled. As a consequence, you are responsible for making sure that the workload cluster's Cluster object has the corresponding labels. For example, if you use team: tkg, then to enable the load balancer on a workload cluster you need to perform the following steps after deployment of the management cluster:

a Set kubectl to the management cluster's context.

kubectl config use-context management-cluster@admin

b Label the Cluster object of the corresponding workload cluster with the labels defined. If you define multiple key-values, you need to apply all of them.

kubectl label cluster WORKLOAD-CLUSTER-NAME team=tkg
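
After you apply the labels, you can optionally confirm that they are present on the Cluster object. This verification step is not part of the documented procedure; it is only a quick check:

kubectl get clusters --show-labels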

9 Click Next to configure metadata.

Configure Metadata

This section applies to all infrastructure providers.


In the Metadata section, optionally provide descriptive information about this management cluster.

Any metadata that you specify here applies to the management cluster and to the Tanzu Kubernetes clusters that it manages, and can be accessed by using the cluster management tool of your choice.

n Location: The geographical location in which the clusters run.

n Description: A description of this management cluster. The description has a maximum length of 63 characters and must start and end with a letter. It can contain only lower case letters, numbers, and hyphens, with no spaces.

n Labels: Key/value pairs to help users identify clusters, for example release : beta, environment : staging, or environment : production. For more information, see Labels and Selectors in the Kubernetes documentation. You can click Add to apply multiple labels to the clusters.

If you are deploying to vSphere, click Next to go to Configure Resources. If you are deploying to Amazon EC2 or Azure, click Next to go to Configure the Kubernetes Network and Proxies.


(vSphere Only) Configure Resources

1 In the Resources section, select vSphere resources for the management cluster to use, and click Next.

n Select the VM folder in which to place the management cluster VMs.

n Select the vSphere datastores for the management cluster to use. The storage policy for the VMs can be specified only when you deploy the management cluster from a configuration file.

n Select the cluster, host, or resource pool in which to place the management cluster.

If appropriate resources do not already exist in vSphere, without quitting the Tanzu Kubernetes Grid installer, go to vSphere to create them. Then click the refresh button so that the new resources can be selected.

Configure the Kubernetes Network and Proxies

This section applies to all infrastructure providers.

1 In the Kubernetes Network section, configure the networking for Kubernetes services and click Next.

n (vSphere only) Under Network Name, select a vSphere network to use as the Kubernetes service network.

n Review the Cluster Service CIDR and Cluster Pod CIDR ranges. If the recommended CIDR ranges of 100.64.0.0/13 and 100.96.0.0/11 are unavailable, update the values under Cluster Service CIDR and Cluster Pod CIDR.


2 (Optional) To send outgoing HTTP(S) traffic from the management cluster to a proxy, toggle Enable Proxy Settings and follow the instructions below to enter your proxy information. Tanzu Kubernetes Grid applies these settings to kubelet, containerd, and the control plane.

You can choose to use one proxy for HTTP traffic and another proxy for HTTPS traffic or to use the same proxy for both HTTP and HTTPS traffic.

a To add your HTTP proxy information:

1 Under HTTP Proxy URL, enter the URL of the proxy that handles HTTP requests. The URL must start with http://. For example, http://myproxy.com:1234.

2 If the proxy requires authentication, under HTTP Proxy Username and HTTP Proxy Password, enter the username and password to use to connect to your HTTP proxy.

b To add your HTTPS proxy information:

n If you want to use the same URL for both HTTP and HTTPS traffic, select Use the same configuration for https proxy.

n If you want to use a different URL for HTTPS traffic, do the following:

a Under HTTPS Proxy URL, enter the URL of the proxy that handles HTTPS requests. The URL must start with http://. For example, http://myproxy.com:1234.

b If the proxy requires authentication, under HTTPS Proxy Username and HTTPS Proxy Password, enter the username and password to use to connect to your HTTPS proxy.

c Under No proxy, enter a comma-separated list of network CIDRs or hostnames that must bypass the HTTP(S) proxy.


For example, noproxy.yourdomain.com,192.168.0.0/24.

n vSphere: You must enter the CIDR of the vSphere network that you selected under Network Name. The vSphere network CIDR includes the IP address of your Control Plane Endpoint. If you entered an FQDN under Control Plane Endpoint, add both the FQDN and the vSphere network CIDR to No proxy. Internally, Tanzu Kubernetes Grid appends localhost, 127.0.0.1, the values of Cluster Pod CIDR and Cluster Service CIDR, .svc, and .svc.cluster.local to the list that you enter in this field.

n Amazon EC2: Internally, Tanzu Kubernetes Grid appends localhost, 127.0.0.1, your VPC CIDR, Cluster Pod CIDR, and Cluster Service CIDR, .svc, .svc.cluster.local, and 169.254.0.0/16 to the list that you enter in this field.

n Azure: Internally, Tanzu Kubernetes Grid appends localhost, 127.0.0.1, your VNET CIDR, Cluster Pod CIDR, and Cluster Service CIDR, .svc, .svc.cluster.local, 169.254.0.0/16, and 168.63.129.16 to the list that you enter in this field.

Important: If the management cluster VMs need to communicate with external services and infrastructure endpoints in your Tanzu Kubernetes Grid environment, ensure that those endpoints are reachable by the proxies that you configured above or add them to No proxy. Depending on your environment configuration, this may include, but is not limited to, your OIDC or LDAP server, Harbor, and in the case of vSphere, NSX-T and NSX Advanced Load Balancer.
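
As a concrete illustration of the vSphere guidance above, if the vSphere network that you selected uses the hypothetical CIDR 192.168.2.0/24 and you entered the hypothetical FQDN tkg-cp.yourdomain.com under Control Plane Endpoint, the No proxy field might contain:

noproxy.yourdomain.com,192.168.2.0/24,tkg-cp.yourdomain.com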

Configure Identity Management

This section applies to all infrastructure providers. For information about how Tanzu Kubernetes Grid implements identity management, see Enabling Identity Management in Tanzu Kubernetes Grid.

1 In the Identity Management section, optionally disable Enable Identity Management Settings.

You can disable identity management for proof-of-concept deployments, but it is strongly recommended to implement identity management in production deployments. If you disable identity management, you can reenable it later. For instructions on how to reenable identity management, see Enable Identity Management After Management Cluster Deployment.

2 If you enable identity management, select OIDC or LDAPS.

OIDC:


Provide details of your OIDC provider account, for example, Okta.

n Issuer URL: The IP or DNS address of your OIDC server.

n Client ID: The client_id value that you obtain from your OIDC provider. For example, if your provider is Okta, log in to Okta, create a Web application, and select the Client Credentials options in order to get a client_id and secret.

n Client Secret: The secret value that you obtain from your OIDC provider.

n Scopes: A comma separated list of additional scopes to request in the token response. For example, openid,groups,email.

n Username Claim: The name of your username claim. This is used to set a user's username in the JSON Web Token (JWT) claim. Depending on your provider, enter claims such as user_name, email, or code.

n Groups Claim: The name of your groups claim. This is used to set a user's group in the JWT claim. For example, groups.

LDAPS:

Provide details of your company's LDAPS server. All settings except for LDAPS Endpoint are optional.

n LDAPS Endpoint: The IP or DNS address of your LDAPS server. Provide the address and port of the LDAP server, in the form host:port.


n Bind DN: The DN for an application service account. The connector uses these credentials to search for users and groups. Not required if the LDAP server provides access for anonymous authentication.

n Bind Password: The password for an application service account, if Bind DN is set.

Provide the user search attributes.

n Base DN: The point from which to start the LDAP search. For example, OU=Users,OU=domain,DC=io.

n Filter: An optional filter to be used by the LDAP search.

n Username: The LDAP attribute that contains the user ID. For example, uid, sAMAccountName.

Provide the group search attributes.

n Base DN: The point from which to start the LDAP search. For example, OU=Groups,OU=domain,DC=io.

n Filter: An optional filter to be used by the LDAP search.

n Name Attribute: The LDAP attribute that holds the name of the group. For example, cn.

n User Attribute: The attribute of the user record that is used as the value of the membership attribute of the group record. For example, distinguishedName, dn.

n Group Attribute: The attribute of the group record that holds the user/member information. For example, member.

Paste the contents of the LDAPS server CA certificate into the Root CA text box.


3 If you are deploying to vSphere, click Next to go to Select the Base OS Image. If you are deploying to Amazon EC2 or Azure, click Next to go to Register with Tanzu Mission Control.

(vSphere Only) Select the Base OS Image

In the OS Image section, use the drop-down menu to select the OS and Kubernetes version image template to use for deploying Tanzu Kubernetes Grid VMs, and click Next.

The drop-down menu includes all of the image templates that are present in your vSphere instance that meet the criteria for use as Tanzu Kubernetes Grid base images. The image template must include the correct version of Kubernetes for this release of Tanzu Kubernetes Grid. If you have not already imported a suitable image template to vSphere, you can do so now without quitting the Tanzu Kubernetes Grid installer. After you import it, use the Refresh button to make it available in the drop-down menu.


Register with Tanzu Mission Control

This section applies to all infrastructure providers; however, the functionality described in this section is being rolled out in Tanzu Mission Control.

Note At time of publication, you can only register Tanzu Kubernetes Grid management clusters on certain infrastructure providers. For a list of currently supported providers, see Requirements for Registering a Tanzu Kubernetes Cluster with Tanzu Mission Control in the Tanzu Mission Control documentation.

You can also register your Tanzu Kubernetes Grid management cluster with Tanzu Mission Control after you deploy the cluster. For more information, see Register Your Management Cluster with Tanzu Mission Control.

1 In the Registration URL field, copy and paste the registration URL you obtained from Tanzu Mission Control.

2 If the connection is successful, you can review the configuration YAML retrieved from the URL.

3 Click Next.

Finalize the Deployment

This section applies to all infrastructure providers.

1 In the CEIP Participation section, optionally deselect the check box to opt out of the VMware Customer Experience Improvement Program.


You can also opt in or out of the program after the deployment of the management cluster. For information about the CEIP, see Managing Participation in CEIP and https://www.vmware.com/solutions/trustvmware/ceip.html.

2 Click Review Configuration to see the details of the management cluster that you have configured.



When you click Review Configuration, Tanzu Kubernetes Grid populates the cluster configuration file, which is located in the ~/.tanzu/tkg/clusterconfigs subdirectory, with the settings that you specified in the interface. You can optionally copy the cluster configuration file without completing the deployment. You can copy the cluster configuration file to another bootstrap machine and deploy the management cluster from that machine. For example, you might do this so that you can deploy the management cluster from a bootstrap machine that does not have a Web browser.

3 (Optional) Under CLI Command Equivalent, click the Copy button to copy the CLI command for the configuration that you specified.

Copying the CLI command allows you to reuse the command at the command line to deploy management clusters with the configuration that you specified in the interface. This can be useful if you want to automate management cluster deployment.

4 (Optional) Click Edit Configuration to return to the installer wizard to modify your configuration.

5 Click Deploy Management Cluster.

Deployment of the management cluster can take several minutes. The first run of tanzu management-cluster create takes longer than subsequent runs because it has to pull the required Docker images into the image store on your bootstrap machine. Subsequent runs do not require this step, so are faster. You can follow the progress of the deployment of the management cluster in the installer interface or in the terminal in which you ran tanzu management-cluster create --ui. If the machine on which you run tanzu management-cluster create shuts down or restarts before the local operations finish, the deployment will fail. If you inadvertently close the browser or browser tab in which the deployment is running before it finishes, the deployment continues in the terminal.



What to Do Next

n The installer saves the configuration of the management cluster to ~/.tanzu/tkg/clusterconfigs with a generated filename of the form UNIQUE-ID.yaml. After the deployment has completed, you can rename the configuration file to something memorable, for example the name that you provided to the management cluster, and save it in a different location for future use.

n If you enabled identity management on the management cluster, you must perform post-deployment configuration steps to allow users to access the management cluster. For more information, see Configure Identity Management After Management Cluster Deployment.

n For information about what happened during the deployment of the management cluster and how to connect kubectl to the management cluster, see Examine the Management Cluster Deployment.

n If you need to deploy more than one management cluster, on any or all of vSphere, Azure, and Amazon EC2, see Manage Your Management Clusters. This topic also provides information about how to add existing management clusters to your CLI instance, obtain credentials, scale and delete management clusters, add namespaces, and how to opt in or out of the CEIP.


Deploy Management Clusters from a Configuration File

You can use the Tanzu CLI to deploy a management cluster to vSphere, Amazon Elastic Compute Cloud (Amazon EC2), and Microsoft Azure with a configuration that you specify in a YAML configuration file.

Prerequisites

Before you can deploy a management cluster, you must make sure that your environment meets the requirements for the target infrastructure provider.

General Prerequisites

n Make sure that you have met all of the requirements and followed all of the procedures in Chapter 3 Install the Tanzu CLI and Other Tools.

n For production deployments, it is strongly recommended to enable identity management for your clusters. For information about the preparatory steps to perform before you deploy a management cluster, see Enabling Identity Management in Tanzu Kubernetes Grid.

n If you want to register your management cluster with Tanzu Mission Control, follow the procedure in Register Your Management Cluster with Tanzu Mission Control.

n If you are deploying clusters in an internet-restricted environment to either vSphere or Amazon EC2, you must also perform the steps in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment.

n It is strongly recommended to use the Tanzu Kubernetes Grid installer interface rather than the CLI to deploy your first management cluster to a given infrastructure provider. When you deploy a management cluster by using the installer interface, it populates a cluster configuration file for the management cluster with the required parameters. You can use the created configuration file as a model for future deployments from the CLI to this infrastructure provider.

n If you plan on registering the management cluster with Tanzu Mission Control, ensure that your Tanzu Kubernetes clusters meet the requirements listed in Requirements for Registering a Tanzu Kubernetes Cluster with Tanzu Mission Control in the Tanzu Mission Control documentation.

n Read the Tanzu Kubernetes Grid 1.3.1 Release Notes for updates related to security patches.

vSphere Prerequisites

n Make sure that you have met all of the requirements listed in Prepare to Deploy Management Clusters to vSphere.

n NOTE: On vSphere with Tanzu, you do not need to deploy a management cluster. See Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.


Amazon EC2 Prerequisites

n Make sure that you have met all of the requirements listed in Prepare to Deploy Management Clusters to Amazon EC2.

n For information about the configurations of the different sizes of node instances, for example, t3.large or t3.xlarge, see Amazon EC2 Instance Types.

n For information about when to create a Virtual Private Cloud (VPC) and when to reuse an existing VPC, see Resource Usage in Your Amazon Web Services Account.

n If this is the first time that you are deploying a management cluster to Amazon EC2, create a CloudFormation stack for Tanzu Kubernetes Grid in your AWS account by following the instructions in Create IAM Resources below.

Create IAM Resources

Before you deploy a management cluster to Amazon EC2 for the first time, you must create a CloudFormation stack for Tanzu Kubernetes Grid, tkg-cloud-vmware-com, in your AWS account. This CloudFormation stack includes the identity and access management (IAM) resources that Tanzu Kubernetes Grid needs to create and run clusters on Amazon EC2. For more information, see Required IAM Resources in Prepare to Deploy Management Clusters to Amazon EC2.

1 If you have already created the CloudFormation stack for Tanzu Kubernetes Grid in your AWS account, skip the rest of this procedure.

2 If you have not already created the CloudFormation stack for Tanzu Kubernetes Grid in your AWS account, ensure that AWS authentication variables are set either in the local environment or in your AWS default credential provider chain. For instructions, see Configure AWS Account Credentials and SSH Key.

If you have configured AWS credentials in multiple places, the credential settings used to create the CloudFormation stack are applied in the following order of precedence:

n Credentials set in the local environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN and AWS_REGION are applied first.

n Credentials stored in a shared credentials file as part of the default credential provider chain are applied next. You can specify the location of the credentials file to use in the local environment variable AWS_SHARED_CREDENTIAL_FILE. If this environment variable is not defined, the default location of $HOME/.aws/credentials is used. If you use credential profiles, the command uses the profile name specified in the AWS_PROFILE local environment configuration variable. If you do not specify a value for this variable, the profile named default is used. A sketch of both approaches appears after this procedure.

For an example of how the default AWS credential provider chain is interpreted for apps, see Working with AWS Credentials in the AWS documentation.

3 Run the following command:

tanzu management-cluster permissions aws set


For more information about this command, run tanzu management-cluster permissions aws set --help.

IMPORTANT: The tanzu management-cluster permissions aws set command replaces the clusterawsadm command line utility that existed in Tanzu Kubernetes Grid v1.1.x and earlier. For existing management and Tanzu Kubernetes clusters initially deployed with v1.1.x or earlier, continue to use the CloudFormation stack that was created by running the clusterawsadm alpha bootstrap create-stack command. For Tanzu Kubernetes Grid v1.2 and later clusters, use the tkg-cloud-vmware-com stack.
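
If you set AWS authentication variables in the local environment, as described in step 2 above, a minimal sketch looks like the following. All of the values shown are placeholders; the last line shows the alternative of selecting a named profile from your shared credentials file instead of setting the keys directly:

export AWS_ACCESS_KEY_ID=YOUR-ACCESS-KEY-ID
export AWS_SECRET_ACCESS_KEY=YOUR-SECRET-ACCESS-KEY
export AWS_REGION=us-west-2

export AWS_PROFILE=YOUR-PROFILE-NAME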

Microsoft Azure Prerequisites

n Make sure that you have met the requirements listed in Prepare to Deploy Management Clusters to Microsoft Azure.

n For information about the configurations of the different sizes of node instances for Azure, for example, Standard_D2s_v3 or Standard_D4s_v3, see Sizes for virtual machines in Azure.

Create the Cluster Configuration File

Before creating a management cluster using the Tanzu CLI, you must define its configuration in a YAML configuration file that provides the base configuration for the cluster. When you deploy the management cluster from the CLI, you specify this file by using the --file option of the tanzu management-cluster create command.

Running the tanzu management-cluster create command for the first time creates the ~/.tanzu/tkg subdirectory, which contains the Tanzu Kubernetes Grid configuration files.

If you have previously deployed a management cluster by running tanzu management-cluster create --ui, the ~/.tanzu/tkg/clusterconfigs directory contains management cluster configuration files with settings saved from each invocation of the installer interface. Depending on the infrastructure on which you deployed the management cluster, you can use these files as templates for cluster configuration files for new deployments to the same infrastructure. Alternatively, you can create management cluster configuration files from the templates that are provided in this documentation.

n To use the configuration file from a previous deployment that you performed by using the installer interface, make a copy of the configuration file with a new name, open it in a text editor, and update the configuration, as shown in the sketch after this list. For information about how to update all of the settings, see the Tanzu CLI Configuration File Variable Reference.

n To create a new configuration file, see Create a Management Cluster Configuration File. This section provides configuration file templates for each infrastructure provider.
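
A minimal sketch of reusing a previously generated configuration file as a template; UNIQUE-ID.yaml and the new filename are placeholders that you replace with your own values:

cp ~/.tanzu/tkg/clusterconfigs/UNIQUE-ID.yaml ~/.tanzu/tkg/clusterconfigs/new-mgmt-cluster-config.yaml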

VMware recommends using a dedicated configuration file for each management cluster, with configuration settings specific to a single infrastructure.


(v1.3.1 Only) Set the TKG_BOM_CUSTOM_IMAGE_TAG

Before you can deploy a management cluster, you must specify the correct BOM file to use as a local environment variable. In the event of a patch release to Tanzu Kubernetes Grid, the BOM file may require an update to coincide with updated base image files.

Note For more information about recent security patch updates to VMware Tanzu Kubernetes Grid v1.3, see the VMware Tanzu Kubernetes Grid v1.3.1 Release Notes and this Knowledgebase Article.

On the machine where you run the Tanzu CLI, perform the following steps:

1 Remove any existing BOM data.

rm -rf ~/.tanzu/tkg/bom

2 Specify the updated BOM to use by setting the following variable.

export TKG_BOM_CUSTOM_IMAGE_TAG="v1.3.1-patch1"

3 Run the tanzu management-cluster create command with no additional parameters.

tanzu management-cluster create

This command produces an error but results in the BOM files being downloaded to ~/.tanzu/tkg/bom.

Run the tanzu management-cluster create Command

After you have created or updated the cluster configuration file and downloaded the most recent BOM, you can deploy a management cluster by running the tanzu management-cluster create --file CONFIG-FILE command, where CONFIG-FILE is the name of the configuration file. If your configuration file is the default ~/.tanzu/tkg/cluster-config.yaml, you can omit the --file option.

Warning: The tanzu management-cluster create command takes time to complete. While tanzu management-cluster create is running, do not run additional invocations of tanzu management-cluster create on the same bootstrap machine to deploy multiple management clusters, change context, or edit ~/.kube-tkg/config.

To deploy a management cluster, run the tanzu management-cluster create command. For example:

tanzu management-cluster create --file path/to/cluster-config-file.yaml

Validation Checks

When you run tanzu management-cluster create, the command performs several validation checks before deploying the management cluster. The checks are different depending on the infrastructure to which you are deploying the management cluster.

n vSphere


The command verifies that the target vSphere infrastructure meets the following requirements:

n The vSphere credentials that you provided are valid.

n Nodes meet the minimum size requirements.

n Base image template exists in vSphere and is valid for the specified Kubernetes version.

n Required resources, including the resource pool, datastores, and folder, exist in vSphere.

n Amazon EC2

The command verifies that the target Amazon EC2 infrastructure meets the following requirements:

n The AWS credentials that you provided are valid.

n The CloudFormation stack exists.

n The node instance type is supported.

n The region and availability zones match.

n Azure

The command verifies that the target Azure infrastructure meets the following requirements:

n The Azure credentials that you provided are valid.

n The public SSH key is encoded in base64 format.

n The node instance type is supported.

If any of these conditions are not met, the tanzu management-cluster create command fails.
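
For the Azure check above that the public SSH key is encoded in base64 format, the following sketch shows one way to produce the encoded value on Linux or macOS. It assumes that your public key is in ~/.ssh/id_rsa.pub; you paste the output into the Azure SSH public key setting in your cluster configuration file:

base64 < ~/.ssh/id_rsa.pub | tr -d '\n'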

Monitoring Progress

When you run tanzu management-cluster create, you can follow the progress of the deployment of the management cluster in the terminal. The first run of tanzu management-cluster create takes longer than subsequent runs because it has to pull the required Docker images into the image store on your bootstrap machine. Subsequent runs do not require this step, so are faster.

If tanzu management-cluster create fails before the management cluster deploys, you should clean up artifacts on your bootstrap machine before you re-run tanzu management-cluster create. See the Troubleshooting Tips topic for details. If the machine on which you run tanzu management-cluster create shuts down or restarts before the local operations finish, the deployment will fail.

If the deployment succeeds, you see a confirmation message in the terminal:

Management cluster created! You can now create your first workload cluster by running tanzu cluster create [name] -f [file]


What to Do Next

n If you enabled identity management on the management cluster, you must perform post-deployment configuration steps to allow users to access the management cluster. For more information, see Configure Identity Management After Management Cluster Deployment.

n For information about what happened during the deployment of the management cluster, how to connect kubectl to the management cluster, and how to create namespaces, see Examine the Management Cluster Deployment.

n If you need to deploy more than one management cluster, on any or all of vSphere, Azure, and Amazon EC2, see Manage Your Management Clusters. This topic also provides information about how to add existing management clusters to your CLI instance, obtain credentials, scale and delete management clusters, add namespaces, and how to opt in or out of the CEIP.

Create a Management Cluster Configuration File

This documentation includes configuration file templates that you can use to deploy management clusters to each of vSphere, Amazon EC2, and Azure. The templates include all of the options that are relevant to deploying management clusters on a given infrastructure provider. You can copy the templates and follow the instructions in this section to update them.

Consult the Tanzu CLI Configuration File Variable Reference for details about each setting. The sections below also contain links to other sections of this documentation to provide additional information.

IMPORTANT:

n As described in Configuring the Management Cluster, environment variables override values from a cluster configuration file. To use all settings from a cluster configuration file, unset any conflicting environment variables before you deploy the management cluster from the CLI.

n Tanzu Kubernetes Grid does not support IPv6 addresses. This is because upstream Kubernetes only provides alpha support for IPv6. Always provide IPv4 addresses in settings in the configuration file.

n Some parameters configure identical properties. For example, the SIZE property configures the same infrastructure settings as all of the control plane and worker node size and type properties for the different infrastructure providers, but at a more general level. In such cases, avoid setting conflicting or redundant properties.

Create the Configuration File

1 Copy and paste the contents of the template for your infrastructure provider into a text editor.

Copy a template from one of the following locations:

n Management Cluster Configuration for vSphere

n Management Cluster Configuration for Amazon EC2


n Management Cluster Configuration for Microsoft Azure

For example, if you have already deployed a management cluster from the installer interface, you can save the file in the default location for cluster configurations, ~/.tanzu/tkg/clusterconfigs.

2 Save the file with a .yaml extension and an appropriate name, for example aws-mgmt-cluster-config.yaml.

The subsequent sections describe how to update the settings that are common to all infrastructure providers as well as the settings that are specific to each of vSphere, Amazon EC2, and Azure.

Configure Basic Management Cluster Creation Information

The basic management cluster creation settings define the infrastructure on which to deploy the management cluster and other basic settings. They are common to all infrastructure providers.

n For CLUSTER_PLAN, specify whether you want to deploy a development cluster, which provides a single control plane node, or a production cluster, which provides a highly available management cluster with three control plane nodes. Specify dev or prod.

n For INFRASTRUCTURE_PROVIDER, specify aws, azure, or vsphere.

INFRASTRUCTURE_PROVIDER: aws

INFRASTRUCTURE_PROVIDER: azure

INFRASTRUCTURE_PROVIDER: vsphere

n Optionally disable participation in the VMware Customer Experience Improvement Program (CEIP) by setting ENABLE_CEIP_PARTICIPATION to false. For information about the CEIP, see Managing Participation in CEIP and https://www.vmware.com/solutions/trustvmware/ceip.html.

n Optionally uncomment and update TMC_REGISTRATION_URL to register the management cluster with Tanzu Mission Control. For information about Tanzu Mission Control, see Register Your Management Cluster with Tanzu Mission Control.

n Optionally disable audit logging by setting ENABLE_AUDIT_LOGGING to false. For information about audit logging, see Audit Logging.

n If the recommended CIDR ranges of 100.64.0.0/13 and 100.96.0.0/11 are unavailable, update CLUSTER_CIDR for the cluster pod network and SERVICE_CIDR for the cluster service network.

For example:

#! ------
#! Basic cluster creation configuration
#! ------

CLUSTER_NAME: aws-mgmt-cluster


CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: aws
ENABLE_CEIP_PARTICIPATION: true
TMC_REGISTRATION_URL: https://tmc-org.cloud.vmware.com/installer?id=[...]&source=registration
ENABLE_AUDIT_LOGGING: true
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

Configure Identity Management

Set IDENTITY_MANAGEMENT_TYPE to ldap or oidc. Set it to none to disable identity management. It is strongly recommended to enable identity management for production deployments.

For information about identity management in Tanzu Kubernetes Grid, and the pre-deployment steps to perform, see Enabling Identity Management in Tanzu Kubernetes Grid.

IDENTITY_MANAGEMENT_TYPE: oidc

IDENTITY_MANAGEMENT_TYPE: ldap

OIDC

To configure OIDC, update the variables below. For information about how to configure the variables, see Identity Providers - OIDC in the Tanzu CLI Configuration File Variable Reference.

For example:

OIDC_IDENTITY_PROVIDER_CLIENT_ID: 0oa2i[...]NKst4x7
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: groups
OIDC_IDENTITY_PROVIDER_ISSUER_URL: https://dev-[...].okta.com
OIDC_IDENTITY_PROVIDER_SCOPES: openid,groups,email
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: email

LDAP

To configure LDAP, uncomment and update the LDAP_* variables with information about your LDAPS server. For information about how to configure the variables, see Identity Providers - LDAP in the Tanzu CLI Configuration File Variable Reference.

For example:

LDAP_BIND_DN: ""
LDAP_BIND_PASSWORD: ""
LDAP_GROUP_SEARCH_BASE_DN: dc=example,dc=com
LDAP_GROUP_SEARCH_FILTER: (objectClass=posixGroup)
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE: memberUid
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: uid
LDAP_HOST: ldaps.example.com:636
LDAP_ROOT_CA_DATA_B64: ""
LDAP_USER_SEARCH_BASE_DN: ou=people,dc=example,dc=com
LDAP_USER_SEARCH_FILTER: (objectClass=posixAccount)


LDAP_USER_SEARCH_NAME_ATTRIBUTE: uid
LDAP_USER_SEARCH_USERNAME: uid

Configure Proxies

To optionally send outgoing HTTP(S) traffic from the management cluster to a proxy, uncomment and set the *_PROXY settings. The proxy settings are common to all infrastructure providers. You can choose to use one proxy for HTTP requests and another proxy for HTTPS requests or to use the same proxy for both HTTP and HTTPS requests.

n (Required) TKG_HTTP_PROXY: This is the URL of the proxy that handles HTTP requests. To set the URL, use the format below:

PROTOCOL://USERNAME:PASSWORD@FQDN-OR-IP:PORT

Where:

- (Required) PROTOCOL: This must be http.

- (Optional) USERNAME and PASSWORD: This is your HTTP proxy username and password. You must set USERNAME and PASSWORD if the proxy requires authentication.

- (Required) FQDN-OR-IP: This is the FQDN or IP address of your HTTP proxy.

- (Required) PORT: This is the port number that your HTTP proxy uses.

For example, http://user:[email protected]:1234.

- (Required) TKG_HTTPS_PROXY: This is the URL of the proxy that handles HTTPS requests. You can set TKG_HTTPS_PROXY to the same value as TKG_HTTP_PROXY or provide a different value. To set the value, use the URL format from the previous step, where:

- (Required) PROTOCOL: This must be http.

- (Optional) USERNAME and PASSWORD: This is your HTTPS proxy username and password. You must set USERNAME and PASSWORD if the proxy requires authentication.

- (Required) FQDN-OR-IP: This is the FQDN or IP address of your HTTPS proxy.

- (Required) PORT: This is the port number that your HTTPS proxy uses.

For example, http://user:[email protected]:1234.

- (Optional) TKG_NO_PROXY: This sets one or more comma-separated network CIDRs or hostnames that must bypass the HTTP(S) proxy. Do not use spaces. For example, noproxy.yourdomain.com,192.168.0.0/24.

Internally, Tanzu Kubernetes Grid appends localhost, 127.0.0.1, the values of CLUSTER_CIDR and SERVICE_CIDR, .svc, and .svc.cluster.local to the value that you set in TKG_NO_PROXY. It also appends your AWS VPC CIDR and 169.254.0.0/16 for deployments to Amazon EC2 and your Azure VNET CIDR, 169.254.0.0/16, and 168.63.129.16 for deployments to Azure. For vSphere, you must manually add the CIDR of VSPHERE_NETWORK, which includes the IP address of your control plane endpoint, to TKG_NO_PROXY. If you set VSPHERE_CONTROL_PLANE_ENDPOINT to an FQDN, add both the FQDN and VSPHERE_NETWORK to TKG_NO_PROXY.
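For example, on vSphere, a minimal sketch of a TKG_NO_PROXY value for a cluster whose VSPHERE_NETWORK uses the 10.185.0.0/16 range and whose control plane endpoint is set to an FQDN might look like the following. The CIDR and FQDN shown here are placeholders; substitute your own values.

TKG_NO_PROXY: "noproxy.yourdomain.com,192.168.0.0/24,10.185.0.0/16,tkg-cp.yourdomain.com"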


Important: If the cluster VMs need to communicate with external services and infrastructure endpoints in your Tanzu Kubernetes Grid environment, ensure that those endpoints are reachable by the proxies that you set above or add them to TKG_NO_PROXY. Depending on your environment configuration, this may include, but is not limited to:

- Your OIDC or LDAP server

- Harbor

- NSX-T

- NSX Advanced Load Balancer

- AWS VPC CIDRs that are external to the cluster

For example:

#! ---------------------------------------------------------------
#! Proxy configuration
#! ---------------------------------------------------------------

TKG_HTTP_PROXY: "http://myproxy.com:1234"
TKG_HTTPS_PROXY: "http://myproxy.com:1234"
TKG_NO_PROXY: "noproxy.yourdomain.com,192.168.0.0/24"

Configure Node Settings

By default, all cluster nodes run Ubuntu v20.04, for all infrastructure providers. On vSphere, you can optionally deploy clusters that run Photon OS on their nodes. On Amazon EC2, nodes can optionally run Amazon Linux 2. For the architecture, the default and only current choice is amd64. For the OS and version settings, see Node Configuration in the Tanzu CLI Configuration File Variable Reference.

For example:

#! ---------------------------------------------------------------
#! Node configuration
#! ---------------------------------------------------------------

OS_NAME: "photon"
OS_VERSION: "3"
OS_ARCH: "amd64"

How you set node compute configuration and sizes depends on the infrastructure provider. For information, see Management Cluster Configuration for vSphere, Management Cluster Configuration for Amazon EC2, or Management Cluster Configuration for Microsoft Azure.

Configure Machine Health Checks

Optionally update the Machine Health Check variables based on your deployment preferences, using the guidelines described in the Configuration Parameter Reference. Alternatively, disable Machine Health Checks by setting ENABLE_MHC: "false".


For information about how to configure the Machine Health Check settings, see Machine Health Checks in the Tanzu CLI Configuration File Variable Reference and Configure Machine Health Checks for Tanzu Kubernetes Clusters.

For example:

ENABLE_MHC: "true"
MHC_UNKNOWN_STATUS_TIMEOUT: 10m
MHC_FALSE_STATUS_TIMEOUT: 20m

Configure a Private Image Registry

If you are deploying the management cluster in an Internet-restricted environment, uncomment and update the TKG_CUSTOM_IMAGE_REPOSITORY_* settings. If you are deploying the management cluster in an environment that has access to the external internet, you do not need to configure these settings.

For information about deployments in Internet-restricted environments, see Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment. The private image registry settings are common to all infrastructure providers.

For example:

#! ---------------------------------------------------------------
#! Image repository configuration
#! ---------------------------------------------------------------

TKG_CUSTOM_IMAGE_REPOSITORY: "custom-image-repository.io/yourproject"
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: "LS0t[...]tLS0tLQ=="
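If your private registry uses a self-signed CA, one way to produce the base64-encoded value for TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE is the following sketch, which assumes the CA certificate is saved locally as ca.crt:

base64 -w 0 ca.crt     # GNU coreutils; on macOS, use: base64 -i ca.crt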

Configure Antrea CNI

By default, clusters that you deploy with the Tanzu CLI provide in-cluster container networking with the Antrea container network interface (CNI).

You can optionally disable Source Network Address Translation (SNAT) for pod traffic, implement the hybrid, noEncap, or NetworkPolicyOnly traffic encapsulation modes, use proxies and network policies, and implement Traceflow.

For more information about Antrea, see the following resources:

- VMware Container Networking with Antrea product page on vmware.com
- Antrea open source project page
- Antrea documentation
- Deploying Antrea for Kubernetes Networking whitepaper on vmware.com

To optionally configure these features on Antrea, uncomment and update the ANTREA_* variables. For example:

#! ---------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------

ANTREA_NO_SNAT: true
ANTREA_TRAFFIC_ENCAP_MODE: "hybrid"
ANTREA_PROXY: true
ANTREA_POLICY: true
ANTREA_TRACEFLOW: false

What to Do Next

Continue to update the configuration file settings for vSphere, Amazon EC2, or Azure. For the configuration file settings that are specific to each infrastructure provider, see the corresponding topic:

- Management Cluster Configuration for vSphere
- Management Cluster Configuration for Amazon EC2
- Management Cluster Configuration for Microsoft Azure

Management Cluster Configuration for vSphere

To create a cluster configuration file, you can copy an existing configuration file for a previous deployment to vSphere and update it. Alternatively, you can create a file from scratch by using an empty template.

Management Cluster Configuration Template

The template below includes all of the options that are relevant to deploying management clusters on vSphere. You can copy this template and use it to deploy management clusters to vSphere.

- For information about how to update the settings that are common to all infrastructure providers, see Create a Management Cluster Configuration File.
- For information about all configuration file variables, see the Tanzu CLI Configuration File Variable Reference.
- For examples of how to configure the vSphere settings, see the sections below the template.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.

#! ---------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------

CLUSTER_NAME:
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: vsphere
ENABLE_CEIP_PARTICIPATION: true
# TMC_REGISTRATION_URL:
ENABLE_AUDIT_LOGGING: true
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

#! ---------------------------------------------------------------
#! Image repository configuration
#! ---------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

#! ---------------------------------------------------------------
#! Proxy configuration
#! ---------------------------------------------------------------

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

#! ---------------------------------------------------------------
#! vSphere configuration
#! ---------------------------------------------------------------

VSPHERE_SERVER:
VSPHERE_USERNAME:
VSPHERE_PASSWORD:
VSPHERE_DATACENTER:
VSPHERE_RESOURCE_POOL:
VSPHERE_DATASTORE:
VSPHERE_FOLDER:
VSPHERE_NETWORK: VM Network
VSPHERE_CONTROL_PLANE_ENDPOINT:
VIP_NETWORK_INTERFACE: "eth0"
# VSPHERE_TEMPLATE:
VSPHERE_SSH_AUTHORIZED_KEY:
# VSPHERE_STORAGE_POLICY_ID: ""
VSPHERE_TLS_THUMBPRINT:
VSPHERE_INSECURE: false
DEPLOY_TKG_ON_VSPHERE7: false
ENABLE_TKGS_ON_VSPHERE7: false

#! ---------------------------------------------------------------
#! Node configuration
#! ---------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""
# VSPHERE_NUM_CPUS: 2
# VSPHERE_DISK_GIB: 40
# VSPHERE_MEM_MIB: 4096
# VSPHERE_CONTROL_PLANE_NUM_CPUS: 2
# VSPHERE_CONTROL_PLANE_DISK_GIB: 40
# VSPHERE_CONTROL_PLANE_MEM_MIB: 8192
# VSPHERE_WORKER_NUM_CPUS: 2
# VSPHERE_WORKER_DISK_GIB: 40
# VSPHERE_WORKER_MEM_MIB: 4096

#! ---------------------------------------------------------------
#! NSX-T specific configuration for enabling NSX-T routable pods
#! ---------------------------------------------------------------

# NSXT_POD_ROUTING_ENABLED: false
# NSXT_ROUTER_PATH: ""
# NSXT_USERNAME: ""
# NSXT_PASSWORD: ""
# NSXT_MANAGER_HOST: ""
# NSXT_ALLOW_UNVERIFIED_SSL: false
# NSXT_REMOTE_AUTH: false
# NSXT_VMC_ACCESS_TOKEN: ""
# NSXT_VMC_AUTH_HOST: ""
# NSXT_CLIENT_CERT_KEY_DATA: ""
# NSXT_CLIENT_CERT_DATA: ""
# NSXT_ROOT_CA_DATA: ""
# NSXT_SECRET_NAME: "cloud-provider-vsphere-nsxt-credentials"
# NSXT_SECRET_NAMESPACE: "kube-system"

#! ---------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------

IDENTITY_MANAGEMENT_TYPE: "oidc"

#! Settings for OIDC
# CERT_DURATION: 2160h
# CERT_RENEW_BEFORE: 360h
# OIDC_IDENTITY_PROVIDER_CLIENT_ID:
# OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
# OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: groups
# OIDC_IDENTITY_PROVIDER_ISSUER_URL:
# OIDC_IDENTITY_PROVIDER_SCOPES: email
# OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: email

#! The following two variables are used to configure Pinniped JWTAuthenticator for workload clusters
# SUPERVISOR_ISSUER_URL:
# SUPERVISOR_ISSUER_CA_BUNDLE_DATA:

#! Settings for LDAP
# LDAP_BIND_DN:
# LDAP_BIND_PASSWORD:
# LDAP_HOST:
# LDAP_USER_SEARCH_BASE_DN:
# LDAP_USER_SEARCH_FILTER:
# LDAP_USER_SEARCH_USERNAME: userPrincipalName
# LDAP_USER_SEARCH_ID_ATTRIBUTE: DN
# LDAP_USER_SEARCH_EMAIL_ATTRIBUTE: DN
# LDAP_USER_SEARCH_NAME_ATTRIBUTE:
# LDAP_GROUP_SEARCH_BASE_DN:
# LDAP_GROUP_SEARCH_FILTER:
# LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
# LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE:
# LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
# LDAP_ROOT_CA_DATA_B64:

#! ---------------------------------------------------------------
#! NSX Advanced Load Balancer configuration
#! ---------------------------------------------------------------

AVI_ENABLE: false
# AVI_NAMESPACE: "tkg-system-networking"
# AVI_DISABLE_INGRESS_CLASS: true
# AVI_AKO_IMAGE_PULL_POLICY: IfNotPresent
# AVI_ADMIN_CREDENTIAL_NAME: avi-controller-credentials
# AVI_CA_NAME: avi-controller-ca
# AVI_CONTROLLER:
# AVI_USERNAME: ""
# AVI_PASSWORD: ""
# AVI_CLOUD_NAME:
# AVI_SERVICE_ENGINE_GROUP:
# AVI_DATA_NETWORK:
# AVI_DATA_NETWORK_CIDR:
# AVI_CA_DATA_B64: ""
# AVI_LABELS: ""
# AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER: false
# AVI_INGRESS_SHARD_VS_SIZE: ""
# AVI_INGRESS_SERVICE_TYPE: ""

#! ---------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

General vSphere Configuration

Provide information to allow Tanzu Kubernetes Grid to log in to vSphere, and designate the resources that Tanzu Kubernetes Grid can use.

- Update the VSPHERE_SERVER, VSPHERE_USERNAME, and VSPHERE_PASSWORD settings with the IP address or FQDN of the vCenter Server instance and the credentials to use to log in.


- Provide the full paths to the vSphere datacenter, resource pool, datastore, and folder in which to deploy the management cluster. The placeholders below are illustrative; use the object names from your own vSphere inventory:

  - VSPHERE_DATACENTER: /MY-DATACENTER

  - VSPHERE_RESOURCE_POOL: /MY-DATACENTER/host/MY-CLUSTER/Resources

  - VSPHERE_DATASTORE: /MY-DATACENTER/datastore/MY-DATASTORE

  - VSPHERE_FOLDER: /MY-DATACENTER/vm/MY-FOLDER

- Set a static virtual IP address for API requests to the Tanzu Kubernetes cluster in the VSPHERE_CONTROL_PLANE_ENDPOINT setting. If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address.

- Specify a network and a network interface in VSPHERE_NETWORK and VIP_NETWORK_INTERFACE.

- Optionally uncomment and update VSPHERE_TEMPLATE to specify the path to an OVA file if you are using multiple custom OVA images for the same Kubernetes version. Use the format /MY-DC/vm/MY-FOLDER-PATH/MY-IMAGE. For more information, see Deploy a Cluster with a Custom OVA Image.

- Provide your SSH key in the VSPHERE_SSH_AUTHORIZED_KEY option. For information about how to obtain an SSH key, see Prepare to Deploy Management Clusters to vSphere.

- Provide the TLS thumbprint in the VSPHERE_TLS_THUMBPRINT variable, or set VSPHERE_INSECURE: true to skip thumbprint verification. For one way to retrieve the thumbprint, see the command sketch after the example below.

- Optionally uncomment VSPHERE_STORAGE_POLICY_ID and specify the name of a storage policy for the VMs, which you have configured on vCenter Server, for the management cluster to use.

For example:

#! ---------------------------------------------------------------
#! vSphere configuration
#! ---------------------------------------------------------------

VSPHERE_SERVER: 10.185.12.154
VSPHERE_USERNAME: [email protected]
VSPHERE_PASSWORD:
VSPHERE_DATACENTER: /dc0
VSPHERE_RESOURCE_POOL: /dc0/host/cluster0/Resources/tanzu
VSPHERE_DATASTORE: /dc0/datastore/sharedVmfs-1
VSPHERE_FOLDER: /dc0/vm/tanzu
VSPHERE_NETWORK: "VM Network"
VSPHERE_CONTROL_PLANE_ENDPOINT: 10.185.11.134
VIP_NETWORK_INTERFACE: "eth0"
VSPHERE_TEMPLATE: /dc0/vm/tanzu/my-image.ova
VSPHERE_SSH_AUTHORIZED_KEY: ssh-rsa AAAAB3[...]tyaw== [email protected]
VSPHERE_TLS_THUMBPRINT: 47:F5:83:8E:5D:36:[...]:72:5A:89:7D:29:E5:DA
VSPHERE_INSECURE: false
VSPHERE_STORAGE_POLICY_ID: "My storage policy"
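If you need to retrieve the vCenter Server TLS thumbprint for VSPHERE_TLS_THUMBPRINT, the following is one possible sketch. It assumes that openssl is installed on the bootstrap machine and that VCENTER-FQDN-OR-IP is a placeholder for your vCenter Server address:

openssl s_client -connect VCENTER-FQDN-OR-IP:443 < /dev/null 2> /dev/null | openssl x509 -noout -fingerprint -sha1

The SHA1 fingerprint in the output corresponds to the colon-separated value shown in the example above.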


Configure Node Sizes

The Tanzu CLI creates the individual nodes of management clusters and Tanzu Kubernetes clusters according to settings that you provide in the configuration file. On vSphere, you can configure all node VMs to have the same predefined configurations, set different predefined configurations for control plane and worker nodes, or customize the configurations of the nodes. By using these settings, you can create clusters that have nodes with different configurations from the management cluster nodes. You can also create clusters in which the control plane nodes and worker nodes have different configurations.

Use Predefined Node Configurations

The Tanzu CLI provides the following predefined configurations for cluster nodes:

- small: 2 CPUs, 4 GB memory, 20 GB disk
- medium: 2 CPUs, 8 GB memory, 40 GB disk
- large: 4 CPUs, 16 GB memory, 40 GB disk
- extra-large: 8 CPUs, 32 GB memory, 80 GB disk

To create a cluster in which all of the control plane and worker node VMs are the same size, specify the SIZE variable. If you set the SIZE variable, all nodes will be created with the configuration that you set.

SIZE: "large"

To create a cluster in which the control plane and worker node VMs are different sizes, specify the CONTROLPLANE_SIZE and WORKER_SIZE options.

CONTROLPLANE_SIZE: "medium"
WORKER_SIZE: "extra-large"

You can combine the CONTROLPLANE_SIZE and WORKER_SIZE options with the SIZE option. For example, if you specify SIZE: "large" with WORKER_SIZE: "extra-large", the control plane nodes will be set to large and worker nodes will be set to extra-large.

SIZE: "large"
WORKER_SIZE: "extra-large"

Define Custom Node Configurations

You can customize the configuration of the nodes rather than using the predefined configurations.

To use the same custom configuration for all nodes, specify the VSPHERE_NUM_CPUS, VSPHERE_DISK_GIB, and VSPHERE_MEM_MIB options.

VSPHERE_NUM_CPUS: 2
VSPHERE_DISK_GIB: 40
VSPHERE_MEM_MIB: 4096


To define different custom configurations for control plane nodes and worker nodes, specify the VSPHERE_CONTROL_PLANE_* and VSPHERE_WORKER_* options.

VSPHERE_CONTROL_PLANE_NUM_CPUS: 2
VSPHERE_CONTROL_PLANE_DISK_GIB: 20
VSPHERE_CONTROL_PLANE_MEM_MIB: 8192
VSPHERE_WORKER_NUM_CPUS: 4
VSPHERE_WORKER_DISK_GIB: 40
VSPHERE_WORKER_MEM_MIB: 4096

You can override these settings by using the SIZE, CONTROLPLANE_SIZE, and WORKER_SIZE options.

Configure NSX Advanced Load Balancer

VMware NSX Advanced Load Balancer provides an L4+L7 load balancing solution for vSphere. NSX Advanced Load Balancer includes a Kubernetes operator that integrates with the Kubernetes API to manage the lifecycle of load balancing and ingress resources for workloads. To use NSX Advanced Load Balancer, you must first deploy it in your vSphere environment. For information, see Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch.

You can configure Tanzu Kubernetes Grid to use NSX Advanced Load Balancer. By default, the management cluster and all workload clusters that it manages will use the load balancer. For information about how to configure the NSX Advanced Load Balancer variables, see NSX Advanced Load Balancer in the Tanzu CLI Configuration File Variable Reference.

AVI_ENABLE: true
AVI_NAMESPACE: "tkg-system-networking"
AVI_DISABLE_INGRESS_CLASS: true
AVI_AKO_IMAGE_PULL_POLICY: IfNotPresent
AVI_ADMIN_CREDENTIAL_NAME: avi-controller-credentials
AVI_CONTROLLER: 10.185.10.217
AVI_USERNAME: "admin"
AVI_PASSWORD: ""
AVI_CLOUD_NAME: "Default-Cloud"
AVI_SERVICE_ENGINE_GROUP: "Default-Group"
AVI_DATA_NETWORK: nsx-alb-dvswitch
AVI_DATA_NETWORK_CIDR: 10.185.0.0/20
AVI_CA_DATA_B64: LS0tLS1CRU[...]UtLS0tLQo=
AVI_LABELS: ""
# AVI_INGRESS_DEFAULT_INGRESS_CONTROLLER: false
# AVI_INGRESS_SHARD_VS_SIZE: ""
# AVI_INGRESS_SERVICE_TYPE: ""

Configure NSX-T Routable Pods

If your vSphere environment uses NSX-T, you can configure it to implement routable, or NO_NAT, pods.


NOTE: NSX-T Routable Pods is an experimental feature in this release. Information about how to implement NSX-T Routable Pods will be added to this documentation soon.

#! ---------------------------------------------------------------
#! NSX-T specific configuration for enabling NSX-T routable pods
#! ---------------------------------------------------------------

# NSXT_POD_ROUTING_ENABLED: false
# NSXT_ROUTER_PATH: ""
# NSXT_USERNAME: ""
# NSXT_PASSWORD: ""
# NSXT_MANAGER_HOST: ""
# NSXT_ALLOW_UNVERIFIED_SSL: false
# NSXT_REMOTE_AUTH: false
# NSXT_VMC_ACCESS_TOKEN: ""
# NSXT_VMC_AUTH_HOST: ""
# NSXT_CLIENT_CERT_KEY_DATA: ""
# NSXT_CLIENT_CERT_DATA: ""
# NSXT_ROOT_CA_DATA: ""
# NSXT_SECRET_NAME: "cloud-provider-vsphere-nsxt-credentials"
# NSXT_SECRET_NAMESPACE: "kube-system"

Management Clusters on vSphere with Tanzu

On vSphere 7, the vSphere with Tanzu option includes a built-in supervisor cluster that works as a management cluster and provides a better experience than a separate management cluster deployed by Tanzu Kubernetes Grid. Deploying a Tanzu Kubernetes Grid management cluster to vSphere 7 when vSphere with Tanzu is not enabled is supported, but the preferred option is to enable vSphere with Tanzu and use the Supervisor Cluster. VMware Cloud on AWS and Azure VMware Solution do not support a supervisor cluster, so you need to deploy a management cluster. For information, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

To reflect the recommendation for using the vSphere with Tanzu supervisor cluster as a management cluster, the Tanzu CLI behaves as follows, controlled by the DEPLOY_TKG_ON_VSPHERE7 and ENABLE_TKGS_ON_VSPHERE7 configuration parameters.

- vSphere with Tanzu enabled:

  - ENABLE_TKGS_ON_VSPHERE7: false informs you that deploying a management cluster is not possible, and exits.

  - ENABLE_TKGS_ON_VSPHERE7: true opens the vSphere Client at the address set by VSPHERE_SERVER in your config.yml or local environment, so that you can configure your supervisor cluster as described in Enable the Workload Management Platform with the vSphere Networking Stack in the vSphere documentation.

- vSphere with Tanzu not enabled:

  - DEPLOY_TKG_ON_VSPHERE7: false informs you that deploying a Tanzu Kubernetes Grid management cluster is possible but not recommended, and prompts you to either quit the installation or continue to deploy the management cluster.

  - DEPLOY_TKG_ON_VSPHERE7: true deploys a Tanzu Kubernetes Grid management cluster on vSphere 7, against the recommendation for vSphere 7, but as required for VMware Cloud on AWS and Azure VMware Solution.
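For example, a minimal sketch of the relevant settings for deploying a management cluster to vSphere 7 without vSphere with Tanzu, such as on VMware Cloud on AWS or Azure VMware Solution, might look like the following. Setting DEPLOY_TKG_ON_VSPHERE7 to true indicates that you intend to deploy a Tanzu Kubernetes Grid management cluster on vSphere 7, as described above.

DEPLOY_TKG_ON_VSPHERE7: true
ENABLE_TKGS_ON_VSPHERE7: false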

What to Do Next

After you have finished updating the management cluster configuration file, create the management cluster by following the instructions in Deploy Management Clusters from a Configuration File.

Management Cluster Configuration for Amazon EC2

To create a cluster configuration file, you can copy an existing configuration file for a previous deployment to Amazon EC2 and update it. Alternatively, you can create a file from scratch by using an empty template.

Management Cluster Configuration Template

The template below includes all of the options that are relevant to deploying management clusters on Amazon EC2. You can copy this template and use it to deploy management clusters to Amazon EC2.

- For information about how to update the settings that are common to all infrastructure providers, see Create a Management Cluster Configuration File.
- For information about all configuration file variables, see the Tanzu CLI Configuration File Variable Reference.
- For examples of how to configure the Amazon EC2 settings, see the sections below the template.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.

#! ---------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------

CLUSTER_NAME:
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: aws
ENABLE_CEIP_PARTICIPATION: true
# TMC_REGISTRATION_URL:
ENABLE_AUDIT_LOGGING: true
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

#! ---------------------------------------------------------------
#! Image repository configuration
#! ---------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

#! ---------------------------------------------------------------
#! Proxy configuration
#! ---------------------------------------------------------------

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

#! ---------------------------------------------------------------
#! Node configuration
#! AWS-only MACHINE_TYPE settings override cloud-agnostic SIZE settings.
#! ---------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
CONTROL_PLANE_MACHINE_TYPE: t3.large
NODE_MACHINE_TYPE: m5.large
# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""

#! ---------------------------------------------------------------
#! AWS configuration
#! ---------------------------------------------------------------

AWS_REGION:
AWS_NODE_AZ: ""
AWS_ACCESS_KEY_ID:
AWS_SECRET_ACCESS_KEY:
AWS_SSH_KEY_NAME:
BASTION_HOST_ENABLED: true
# AWS_NODE_AZ_1: ""
# AWS_NODE_AZ_2: ""
# AWS_VPC_ID: ""
# AWS_PRIVATE_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID_1: ""
# AWS_PRIVATE_SUBNET_ID_1: ""
# AWS_PUBLIC_SUBNET_ID_2: ""
# AWS_PRIVATE_SUBNET_ID_2: ""
# AWS_VPC_CIDR: 10.0.0.0/16
# AWS_PRIVATE_NODE_CIDR: 10.0.0.0/24
# AWS_PUBLIC_NODE_CIDR: 10.0.1.0/24
# AWS_PRIVATE_NODE_CIDR_1: 10.0.2.0/24
# AWS_PUBLIC_NODE_CIDR_1: 10.0.3.0/24
# AWS_PRIVATE_NODE_CIDR_2: 10.0.4.0/24
# AWS_PUBLIC_NODE_CIDR_2: 10.0.5.0/24

#! ---------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------

IDENTITY_MANAGEMENT_TYPE: "oidc"

#! Settings for OIDC
# CERT_DURATION: 2160h
# CERT_RENEW_BEFORE: 360h
# OIDC_IDENTITY_PROVIDER_CLIENT_ID:
# OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
# OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM: groups
# OIDC_IDENTITY_PROVIDER_ISSUER_URL:
# OIDC_IDENTITY_PROVIDER_SCOPES: email
# OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM: email

#! The following two variables are used to configure Pinniped JWTAuthenticator for workload clusters
# SUPERVISOR_ISSUER_URL:
# SUPERVISOR_ISSUER_CA_BUNDLE_DATA:

#! Settings for LDAP
# LDAP_BIND_DN:
# LDAP_BIND_PASSWORD:
# LDAP_HOST:
# LDAP_USER_SEARCH_BASE_DN:
# LDAP_USER_SEARCH_FILTER:
# LDAP_USER_SEARCH_USERNAME: userPrincipalName
# LDAP_USER_SEARCH_ID_ATTRIBUTE: DN
# LDAP_USER_SEARCH_EMAIL_ATTRIBUTE: DN
# LDAP_USER_SEARCH_NAME_ATTRIBUTE:
# LDAP_GROUP_SEARCH_BASE_DN:
# LDAP_GROUP_SEARCH_FILTER:
# LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
# LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE:
# LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
# LDAP_ROOT_CA_DATA_B64:

#! ---------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

Amazon EC2 Connection Settings

Specify information about your AWS account and the region and availability zone in which you want to deploy the cluster. If you have set these values as environment variables on the machine on which you run Tanzu CLI commands, you can omit these settings.


The AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY settings are not mandatory. If you do not provide them, the CLI finds them in the AWS default credentials provider chain. If you do provide them, the values for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be base64 encoded.

For example:

AWS_REGION: eu-west-1
AWS_NODE_AZ: "eu-west-1a"
AWS_ACCESS_KEY_ID:
AWS_SECRET_ACCESS_KEY:
AWS_SSH_KEY_NAME: default
BASTION_HOST_ENABLED: true
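If you choose to provide the access key values directly in the configuration file, one way to base64 encode them is the following sketch. The key values shown are the placeholder credentials from the AWS documentation, not real credentials:

printf '%s' 'AKIAIOSFODNN7EXAMPLE' | base64
printf '%s' 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' | base64

Set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to the resulting encoded strings.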

Configure Node Sizes

The Tanzu CLI creates the individual nodes of Tanzu Kubernetes clusters according to settings that you provide in the configuration file. On Amazon EC2, you can configure all node VMs to have the same predefined configurations or set different predefined configurations for control plane and worker nodes. By using these settings, you can create Tanzu Kubernetes clusters that have nodes with different configurations from the management cluster nodes. You can also create clusters in which the control plane nodes and worker nodes have different configurations.

When you create the management cluster, the instance types for the node machines are set in the CONTROL_PLANE_MACHINE_TYPE and NODE_MACHINE_TYPE options. By default, these settings are also used for Tanzu Kubernetes clusters. The minimum configuration is 2 CPUs and 8 GB memory. The list of compatible instance types varies in different regions.

CONTROL_PLANE_MACHINE_TYPE: "t3.large"
NODE_MACHINE_TYPE: "m5.large"

You can override these settings by using the SIZE, CONTROLPLANE_SIZE and WORKER_SIZE options. To create a Tanzu Kubernetes cluster in which all of the control plane and worker node VMs are the same size, specify the SIZE variable. If you set the SIZE variable, all nodes will be created with the configuration that you set. For information about the configurations of the different sizes of node instances for Amazon EC2, see Amazon EC2 Instance Types.

SIZE: "t3.large"

To create a Tanzu Kubernetes cluster in which the control plane and worker node VMs are different sizes, specify the CONTROLPLANE_SIZE and WORKER_SIZE options.

CONTROLPLANE_SIZE: "t3.large"
WORKER_SIZE: "m5.xlarge"


You can combine the CONTROLPLANE_SIZE and WORKER_SIZE options with the SIZE option. For example, if you specify SIZE: "t3.large" with WORKER_SIZE: "m5.xlarge", the control plane nodes will be set to t3.large and worker nodes will be set to m5.xlarge.

SIZE: "t3.large"
WORKER_SIZE: "m5.xlarge"

Use a New VPC

If you want to deploy a development management cluster with a single control plane node to a new VPC, uncomment and update the following rows related to AWS infrastructure.

AWS_REGION:
AWS_NODE_AZ:
AWS_VPC_CIDR:
AWS_PRIVATE_NODE_CIDR:
AWS_PUBLIC_NODE_CIDR:
CONTROL_PLANE_MACHINE_TYPE:
NODE_MACHINE_TYPE:
AWS_SSH_KEY_NAME:
BASTION_HOST_ENABLED:
SERVICE_CIDR:
CLUSTER_CIDR:

If you want to deploy a production management cluster with three control plane nodes to a new VPC, also uncomment and update the following variables:

AWS_NODE_AZ_1:
AWS_NODE_AZ_2:
AWS_PRIVATE_NODE_CIDR_1:
AWS_PRIVATE_NODE_CIDR_2:
AWS_PUBLIC_NODE_CIDR_1:
AWS_PUBLIC_NODE_CIDR_2:

For example, the configuration of a production management cluster on a new VPC might look like this:

AWS_REGION: us-west-2
AWS_NODE_AZ: us-west-2a
AWS_NODE_AZ_1: us-west-2b
AWS_NODE_AZ_2: us-west-2c
AWS_PRIVATE_NODE_CIDR: 10.0.0.0/24
AWS_PRIVATE_NODE_CIDR_1: 10.0.2.0/24
AWS_PRIVATE_NODE_CIDR_2: 10.0.4.0/24
AWS_PUBLIC_NODE_CIDR: 10.0.1.0/24
AWS_PUBLIC_NODE_CIDR_1: 10.0.3.0/24
AWS_PUBLIC_NODE_CIDR_2: 10.0.5.0/24
AWS_SSH_KEY_NAME: tkg
AWS_VPC_CIDR: 10.0.0.0/16
BASTION_HOST_ENABLED: "true"
CONTROL_PLANE_MACHINE_TYPE: m5.large
NODE_MACHINE_TYPE: m5.large
SERVICE_CIDR: 100.64.0.0/13
CLUSTER_CIDR: 100.96.0.0/11

Use an Existing VPC

If you want to deploy a development management cluster with a single control plane node to an existing VPC, uncomment and update the following rows related to AWS infrastructure.

AWS_REGION:
AWS_NODE_AZ:
AWS_PRIVATE_SUBNET_ID:
AWS_PUBLIC_SUBNET_ID:
AWS_SSH_KEY_NAME:
AWS_VPC_ID:
BASTION_HOST_ENABLED:
CONTROL_PLANE_MACHINE_TYPE:
NODE_MACHINE_TYPE:
SERVICE_CIDR:
CLUSTER_CIDR:

If you want to deploy a production management cluster with three control plane nodes to an existing VPC, also uncomment and update the following variables:

AWS_NODE_AZ_1:
AWS_NODE_AZ_2:
AWS_PRIVATE_SUBNET_ID_1:
AWS_PRIVATE_SUBNET_ID_2:
AWS_PUBLIC_SUBNET_ID_1:
AWS_PUBLIC_SUBNET_ID_2:

For example, the configuration of a production management cluster on an existing VPC might look like this:

AWS_REGION: us-west-2
AWS_NODE_AZ: us-west-2a
AWS_NODE_AZ_1: us-west-2b
AWS_NODE_AZ_2: us-west-2c
AWS_PRIVATE_SUBNET_ID: subnet-ID
AWS_PRIVATE_SUBNET_ID_1: subnet-ID
AWS_PRIVATE_SUBNET_ID_2: subnet-ID
AWS_PUBLIC_SUBNET_ID: subnet-ID
AWS_PUBLIC_SUBNET_ID_1: subnet-ID
AWS_PUBLIC_SUBNET_ID_2: subnet-ID
AWS_SSH_KEY_NAME: tkg
AWS_VPC_ID: vpc-ID
BASTION_HOST_ENABLED: "true"
CONTROL_PLANE_MACHINE_TYPE: m5.large
NODE_MACHINE_TYPE: m5.large
SERVICE_CIDR: 100.64.0.0/13
CLUSTER_CIDR: 100.96.0.0/11


What to Do Next

After you have finished updating the management cluster configuration file, create the management cluster by following the instructions in Deploy Management Clusters from a Configuration File.

Management Cluster Configuration for Microsoft Azure

To create a cluster configuration file, you can copy an existing configuration file for a previous deployment to Azure and update it. Alternatively, you can create a file from scratch by using an empty template.

Management Cluster Configuration Template

The template below includes all of the options that are relevant to deploying management clusters on Azure. You can copy this template and use it to deploy management clusters to Azure.

- For information about how to update the settings that are common to all infrastructure providers, see Create a Management Cluster Configuration File.
- For information about all configuration file variables, see the Tanzu CLI Configuration File Variable Reference.
- For examples of how to configure the Azure settings, see the sections below the template.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.

#! ---------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------

CLUSTER_NAME:
CLUSTER_PLAN: dev
INFRASTRUCTURE_PROVIDER: azure
ENABLE_CEIP_PARTICIPATION: true
# TMC_REGISTRATION_URL:
ENABLE_AUDIT_LOGGING: true
CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

#! ---------------------------------------------------------------
#! Image repository configuration
#! ---------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

#! ---------------------------------------------------------------
#! Proxy configuration
#! ---------------------------------------------------------------

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

#! ---------------------------------------------------------------
#! Node configuration
#! ---------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
# AZURE_CONTROL_PLANE_MACHINE_TYPE: "Standard_D2s_v3"
# AZURE_NODE_MACHINE_TYPE: "Standard_D2s_v3"
# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""
# AZURE_CONTROL_PLANE_DATA_DISK_SIZE_GIB: ""
# AZURE_CONTROL_PLANE_OS_DISK_SIZE_GIB: ""
# AZURE_CONTROL_PLANE_MACHINE_TYPE: ""
# AZURE_CONTROL_PLANE_OS_DISK_STORAGE_ACCOUNT_TYPE: ""
# AZURE_ENABLE_NODE_DATA_DISK: ""
# AZURE_NODE_DATA_DISK_SIZE_GIB: ""
# AZURE_NODE_OS_DISK_SIZE_GIB: ""
# AZURE_NODE_MACHINE_TYPE: ""
# AZURE_NODE_OS_DISK_STORAGE_ACCOUNT_TYPE: ""

#! ---------------------------------------------------------------
#! Azure configuration
#! ---------------------------------------------------------------

AZURE_ENVIRONMENT: "AzurePublicCloud"
AZURE_TENANT_ID:
AZURE_SUBSCRIPTION_ID:
AZURE_CLIENT_ID:
AZURE_CLIENT_SECRET:
AZURE_LOCATION:
AZURE_SSH_PUBLIC_KEY_B64:
# AZURE_RESOURCE_GROUP: ""
# AZURE_VNET_RESOURCE_GROUP: ""
# AZURE_VNET_NAME: ""
# AZURE_VNET_CIDR: ""
# AZURE_CONTROL_PLANE_SUBNET_NAME: ""
# AZURE_CONTROL_PLANE_SUBNET_CIDR: ""
# AZURE_NODE_SUBNET_NAME: ""
# AZURE_NODE_SUBNET_CIDR: ""
# AZURE_CUSTOM_TAGS: ""
# AZURE_ENABLE_PRIVATE_CLUSTER: ""
# AZURE_FRONTEND_PRIVATE_IP: ""
# AZURE_ENABLE_ACCELERATED_NETWORKING: ""

#! ---------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------
#! Identity management configuration
#! ---------------------------------------------------------------

IDENTITY_MANAGEMENT_TYPE: oidc

#! Settings for Pinniped OIDC
# CERT_DURATION: 2160h
# CERT_RENEW_BEFORE: 360h
# OIDC_IDENTITY_PROVIDER_ISSUER_URL:
# OIDC_IDENTITY_PROVIDER_CLIENT_ID:
# OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
# OIDC_IDENTITY_PROVIDER_SCOPES: "email,profile,groups"
# OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM:
# OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM:

#! The following two variables are used to configure Pinniped JWTAuthenticator for workload clusters
# SUPERVISOR_ISSUER_URL:
# SUPERVISOR_ISSUER_CA_BUNDLE_DATA:

#! Settings for LDAP
# LDAP_BIND_DN:
# LDAP_BIND_PASSWORD:
# LDAP_HOST:
# LDAP_USER_SEARCH_BASE_DN:
# LDAP_USER_SEARCH_FILTER:
# LDAP_USER_SEARCH_USERNAME: userPrincipalName
# LDAP_USER_SEARCH_ID_ATTRIBUTE: DN
# LDAP_USER_SEARCH_EMAIL_ATTRIBUTE: DN
# LDAP_USER_SEARCH_NAME_ATTRIBUTE:
# LDAP_GROUP_SEARCH_BASE_DN:
# LDAP_GROUP_SEARCH_FILTER:
# LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
# LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE:
# LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
# LDAP_ROOT_CA_DATA_B64:

#! ---------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

Azure Connection Settings

Specify information about your Azure account and the region in which you want to deploy the cluster.


For example:

AZURE_ENVIRONMENT: "AzurePublicCloud"
AZURE_TENANT_ID: b39138ca-[...]-d9dd62f0
AZURE_SUBSCRIPTION_ID: 3b511ccd-[...]-08a6d1a75d78
AZURE_CLIENT_ID:
AZURE_CLIENT_SECRET:
AZURE_LOCATION: westeurope
AZURE_SSH_PUBLIC_KEY_B64: c3NoLXJzYSBBQUFBQjN[...]XJlLmNvbQ==
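The AZURE_SSH_PUBLIC_KEY_B64 value is your SSH public key, base64 encoded. One way to produce it is the following sketch, which assumes that the public key is at ~/.ssh/id_rsa.pub:

base64 -w 0 ~/.ssh/id_rsa.pub     # GNU coreutils; on macOS, use: base64 -i ~/.ssh/id_rsa.pub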

Configure Node Sizes

The Tanzu CLI creates the individual nodes of Tanzu Kubernetes clusters according to settings that you provide in the configuration file. On Azure, you can configure all node VMs to have the same predefined configurations or set different predefined configurations for control plane and worker nodes. By using these settings, you can create Tanzu Kubernetes clusters that have nodes with different configurations from the management cluster nodes. You can also create clusters in which the control plane nodes and worker nodes have different configurations.

When you create the management cluster, the instance types for the node machines are set in the AZURE_CONTROL_PLANE_MACHINE_TYPE and AZURE_NODE_MACHINE_TYPE options. By default, these settings are also used for Tanzu Kubernetes clusters. The minimum configuration is 2 CPUs and 8 GB memory. The list of compatible instance types varies in different regions.

AZURE_CONTROL_PLANE_MACHINE_TYPE: "Standard_D2s_v3"
AZURE_NODE_MACHINE_TYPE: "Standard_D2s_v3"

You can override these settings by using the SIZE, CONTROLPLANE_SIZE, and WORKER_SIZE options. To create a Tanzu Kubernetes cluster in which all of the control plane and worker node VMs are the same size, specify the SIZE variable. If you set the SIZE variable, all nodes will be created with the configuration that you set. Set the value to an Azure VM size, such as Standard_D2s_v3 or Standard_D4s_v3. For information about node instances for Azure, see Sizes for virtual machines in Azure.

SIZE: Standard_D2s_v3

To create a Tanzu Kubernetes cluster in which the control plane and worker node VMs are different sizes, specify the CONTROLPLANE_SIZE and WORKER_SIZE options.

CONTROLPLANE_SIZE: Standard_D2s_v3
WORKER_SIZE: Standard_D4s_v3

You can combine the CONTROLPLANE_SIZE and WORKER_SIZE options with the SIZE option. For example, if you specify SIZE: "Standard_D2s_v3" with WORKER_SIZE: "Standard_D4s_v3", the control plane nodes will be set to Standard_D2s_v3 and worker nodes will be set to Standard_D4s_v3.

SIZE: Standard_D2s_v3
WORKER_SIZE: Standard_D4s_v3


Deploy a Cluster with Custom Node Subnets

To specify custom subnets (IP ranges) for the nodes in a cluster, set variables as follows before you create the cluster. You can define them as environment variables before you run tanzu cluster create or include them in the cluster configuration file that you pass in with the --file option.

To specify a custom subnet (IP range) for the control plane node in a cluster:

- Subnet already defined in Azure: Set AZURE_CONTROL_PLANE_SUBNET_NAME to the subnet name.

- Create a new subnet: Set AZURE_CONTROL_PLANE_SUBNET_NAME to a name for the new subnet, and optionally set AZURE_CONTROL_PLANE_SUBNET_CIDR to a CIDR range within the configured Azure VNET.

  - If you omit AZURE_CONTROL_PLANE_SUBNET_CIDR, a CIDR is generated automatically.

To specify a custom subnet for the worker nodes in a cluster, set environment variables AZURE_NODE_SUBNET_NAME and AZURE_NODE_SUBNET_CIDR following the same rules as for the control plane node, above.

For example:

AZURE_CONTROL_PLANE_SUBNET_CIDR: 10.0.0.0/24
AZURE_CONTROL_PLANE_SUBNET_NAME: my-cp-subnet
AZURE_NODE_SUBNET_CIDR: 10.0.1.0/24
AZURE_NODE_SUBNET_NAME: my-worker-subnet
AZURE_RESOURCE_GROUP: my-rg
AZURE_VNET_CIDR: 10.0.0.0/16
AZURE_VNET_NAME: my-vnet
AZURE_VNET_RESOURCE_GROUP: my-rg

What to Do Next

After you have finished updating the management cluster configuration file, create the management cluster by following the instructions in Deploy Management Clusters from a Configuration File.

IMPORTANT: If this is the first time that you are deploying a management cluster to Azure with a new version of Tanzu Kubernetes Grid, for example v1.3.1, make sure that you have accepted the base image license for that version. For information, see Accept the Base Image License in Prepare to Deploy Management Clusters to Microsoft Azure.

Configure Identity Management After Management Cluster Deployment

If you enabled identity management when you deployed a management cluster, you must perform additional post-deployment steps on the management cluster so that authenticated users can access it.


To configure identity management on a management cluster, you must perform the following steps:

- Make sure that the authentication service is running correctly.

- For OIDC deployments, provide the callback URL for the management cluster to your OIDC identity provider.

- Generate a kubeconfig file to share with users by running tanzu management-cluster kubeconfig get with the --export-file option.

  - You can generate an administrator kubeconfig that contains embedded credentials, or a regular kubeconfig that prompts users to authenticate with an external identity provider.

  - See Retrieve Management Cluster kubeconfig for more information.

- Set up role-based access control (RBAC) by creating a role binding on the management cluster that assigns role-based permissions to individual authenticated users or user groups.

Prerequisites

- You have deployed a management cluster with either OIDC or LDAPS identity management configured.

- If you configured an OIDC server as the identity provider, you have followed the procedures in Enabling Identity Management in Tanzu Kubernetes Grid to add users in the OIDC server.

Connect kubectl to the Management Cluster

To configure identity management, you must obtain and use the admin context of the management cluster.

1 Get the admin context of the management cluster.

The procedures in this topic use a management cluster named id-mgmt-test.

tanzu management-cluster kubeconfig get id-mgmt-test --admin

If your management cluster is named id-mgmt-test, you should see the confirmation Credentials of workload cluster 'id-mgmt-test' have been saved. You can now access the cluster by running 'kubectl config use-context id-mgmt-test-admin@id-mgmt-test'. The admin context of a cluster gives you full access to the cluster without requiring authentication with your IDP.

2 Set kubectl to the admin context of the management cluster.

kubectl config use-context id-mgmt-test-admin@id-mgmt-test

The next steps depend on whether you are using an OIDC or LDAP identity management service.

- Check the Status of an OIDC Identity Management Service

- Check the Status of an LDAP Identity Management Service


Check the Status of an OIDC Identity Management Service

In Tanzu Kubernetes Grid v1.3.0, Pinniped used Dex as the endpoint for both OIDC and LDAP providers. In Tanzu Kubernetes Grid v1.3.1 and later, Pinniped with OIDC no longer requires Dex. In Tanzu Kubernetes Grid v1.3.1 and later, Dex is only used if you use an LDAP provider. Because of this change, the way in which you check the status of an OIDC identity management service is different in Tanzu Kubernetes Grid v1.3.1 and later compared to Tanzu Kubernetes Grid v1.3.0.

For new management cluster deployments with OIDC authentication, it is strongly recommended to use Tanzu Kubernetes Grid v1.3.1 or later.

When you check the status of the service, you must note the address at which the service is exposed to your OIDC identity provider.

1 Get information about the services that are running in the management cluster.

Tanzu Kubernetes Grid 1.3.1 or later:

In Tanzu Kubernetes Grid v1.3.1 and later, the identity management service runs in the pinniped-supervisor namespace:

kubectl get all -n pinniped-supervisor

You see the following entry in the output:

vSphere:

NAME                          TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/pinniped-supervisor   NodePort   100.70.70.12   <none>        5556:31234/TCP   84m

Amazon EC2:

NAME                          TYPE           CLUSTER-IP     EXTERNAL-IP                              PORT(S)         AGE
service/pinniped-supervisor   LoadBalancer   100.69.13.66   ab1[...]71.eu-west-1.elb.amazonaws.com   443:30865/TCP   56m

Azure:

NAME                          TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)         AGE
service/pinniped-supervisor   LoadBalancer   100.69.169.220   20.54.226.44   443:30451/TCP   84m

Tanzu Kubernetes Grid 1.3.0:

In Tanzu Kubernetes Grid v1.3.0, the identity management service runs in the tanzu-system-auth namespace:

kubectl get all -n tanzu-system-auth

You see the following entry in the output:


vSphere:

NAME             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/dexsvc   NodePort   100.70.70.12   <none>        5556:30167/TCP   84m

Amazon EC2:

NAME             TYPE           CLUSTER-IP       EXTERNAL-IP                                         PORT(S)         AGE
service/dexsvc   LoadBalancer   100.65.184.107   a6e[...]cc6-921316974.eu-west-1.elb.amazonaws.com   443:32547/TCP   84m

Azure:

NAME             TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)         AGE
service/dexsvc   LoadBalancer   100.69.169.220   20.54.226.44   443:30451/TCP   84m

2 Note the following information:

- For management clusters that are running on vSphere, note the node port on which the pinniped-supervisor or dexsvc service is running. In the example above, the node port listed under PORT(S) is 31234 for the pinniped-supervisor service in Tanzu Kubernetes Grid v1.3.1 and later, or 30167 for the dexsvc service in v1.3.0.

- For clusters that you deploy to Amazon EC2 and Azure, note the external address of the LoadBalancer on which the pinniped-supervisor or dexsvc service is running, which is listed under EXTERNAL-IP.

3 Check that all services in the management cluster are running.

kubectl get pods -A

It can take several minutes for the Pinniped service to be up and running. For example, on Amazon EC2 and Azure deployments the service must wait for the LoadBalancer IP addresses to be ready. Wait until you see that pinniped-post-deploy-job is completed before you proceed to the next steps.

NAMESPACE             NAME                             READY   STATUS      RESTARTS   AGE
[...]
pinniped-supervisor   pinniped-post-deploy-job-hq8fc   0/1     Completed   0          85m
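If you prefer to block until the job finishes rather than polling kubectl get pods, the following is a sketch that assumes the job runs in the pinniped-supervisor namespace, as in Tanzu Kubernetes Grid v1.3.1 and later:

kubectl wait --for=condition=complete job --all -n pinniped-supervisor --timeout=15m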

NOTE: You are able to run kubectl get pods because you are using the admin context for the management cluster. Users who attempt to connect to the management cluster with the regular context will not be able to access its resources, because they are not yet authorized to do so.


Check the Status of an LDAP Identity Management Service

If you use an LDAP identity management service, Pinniped uses Dex as the endpoint to expose to your provider. In Tanzu Kubernetes Grid v1.3.0, Pinniped uses Dex as the endpoint for both OIDC and LDAP providers. In Tanzu Kubernetes Grid v1.3.1 and later, Dex is only used if you use an LDAP provider. This procedure applies to LDAP identity management for all v1.3.x versions, and to OIDC identity management for Tanzu Kubernetes Grid v1.3.0. If you are using OIDC identity management with Tanzu Kubernetes Grid v1.3.1 or later, see Check the Status of an OIDC Identity Management Service (v1.3.1 and later) above.

1 Get information about the services that are running in the management cluster in the tanzu-system-auth namespace.

kubectl get all -n tanzu-system-auth

You see the following entry in the output:

vSphere:

NAME             TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
service/dexsvc   NodePort   100.70.70.12   <none>        5556:30167/TCP   84m

Amazon EC2:

NAME             TYPE           CLUSTER-IP       EXTERNAL-IP                              PORT(S)         AGE
service/dexsvc   LoadBalancer   100.65.184.107   a6e[...]74.eu-west-1.elb.amazonaws.com   443:32547/TCP   84m

Azure:

NAME             TYPE           CLUSTER-IP       EXTERNAL-IP    PORT(S)         AGE
service/dexsvc   LoadBalancer   100.69.169.220   20.54.226.44   443:30451/TCP   84m

2 Check that all services in the management cluster are running.

kubectl get pods -A

It can take several minutes for the Pinniped service to be up and running. For example, on Amazon EC2 and Azure deployments the service must wait for the LoadBalancer IP addresses to be ready. Wait until you see that pinniped-post-deploy-job is completed before you proceed to the next steps.

NAMESPACE             NAME                             READY   STATUS      RESTARTS   AGE
[...]
pinniped-supervisor   pinniped-post-deploy-job-hq8fc   0/1     Completed   0          85m

NOTE: You are able to run kubectl get pods because you are using the admin context for the management cluster. Users who attempt to connect to the management cluster with the regular context will not be able to access its resources, because they are not yet authorized to do so.


Provide the Callback URI to the OIDC Provider

If you configured an LDAP server as your identity provider, you do not need to configure a callback URI. For the next steps, go to Generate a kubeconfig to Allow Authenticated Users to Connect to the Management Cluster.

If you configured the management cluster to use OIDC authentication, you must provide the callback URI for that management cluster to your OIDC identity provider.

For example, if you are using OIDC and your IDP is Okta, perform the following steps:

1 Log in to your Okta account.

2 In the main menu, go to Applications.

3 Select the application that you created for Tanzu Kubernetes Grid.

4 In the General Settings panel, click Edit.

5 Under Login, update Login redirect URIs to include the address of the node in which the pinniped-supervisor is running:

NOTE: In Tanzu Kubernetes Grid v1.3.0, you must provide the address of the dexsvc node. The port number of the API endpoint is also different for the pinniped-supervisor and dexsvc services.

- On vSphere, add the IP address that you set as the API endpoint and the pinniped-supervisor or dexsvc port number that you noted in the previous procedure.

  - Tanzu Kubernetes Grid v1.3.1 and later:

https://API-ENDPOINT-IP:31234/callback

  - Tanzu Kubernetes Grid v1.3.0:

https://API-ENDPOINT-IP:30167/callback

- On Amazon EC2 and Azure, add the external address of the LoadBalancer on which the pinniped-supervisor or dexsvc service is running, which you noted in the previous procedure.

https://EXTERNAL-ADDRESS/callback

NOTE: In all cases, you must specify https, not http.

6 Click Save.


Generate a kubeconfig to Allow Authenticated Users to Connect to the Management Cluster

To allow users to access the management cluster, you export the management cluster's kubeconfig to a file that you can share with those users.

- If you export the admin version of the kubeconfig, any users with whom you share it will have full access to the management cluster, and IDP authentication is bypassed.

- If you export the regular version of the kubeconfig, it is populated with the necessary authentication information, so that the user's identity will be verified with your IDP before they can access the cluster's resources.

This procedure allows you to test the login step of the authentication process if a browser is present on the machine on which you are running tanzu and kubectl commands. If the machine does not have a browser, see Authenticate Users on a Machine Without a Browser below.

1 Export the regular kubeconfig for the management cluster to the local file /tmp/id_mgmt_test_kubeconfig.

Note that the command does not include the --admin option, so the kubeconfig that is exported is the regular kubeconfig, not the admin version.

tanzu management-cluster kubeconfig get --export-file /tmp/id_mgmt_test_kubeconfig

You should see confirmation that You can now access the cluster by specifying '--kubeconfig /tmp/id_mgmt_test_kubeconfig' flag when using 'kubectl' command.

2 Connect to the management cluster by using the newly-created kubeconfig file.

kubectl get pods -A --kubeconfig /tmp/id_mgmt_test_kubeconfig

The authentication process requires a browser to be present on the machine from which users connect to clusters, because running kubectl commands automatically opens the IDP login page so that users can log in to the cluster.

Your browser should open and display the login page for your OIDC provider or an LDAPS login page.


Enter the credentials of a user account that exists in your OIDC or LDAP server.

After a successful login, the browser should display the following message:

you have been logged in and may now close this tab

3 Go back to the terminal in which you run tanzu and kubectl commands.

If you already configured a role binding on the cluster for the authenticated user, the output of kubectl get pods -A appears, displaying the pod information.

If you have not configured a role binding on the cluster, you see a message denying the user account access to the pods: Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot list resource "pods" in API group "" at the cluster scope. This happens because the user has been successfully authenticated, but they are not yet authorized to access any resources on the cluster. To authorize the user to access the cluster resources, you must Create a Role Binding on the Management Cluster.

Authenticate Users on a Machine Without a Browser

If the machine on which you are running tanzu and kubectl commands does not have a browser, you can skip the automatic opening of a browser during the authentication process.

1 Set the TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true environment variable.

This adds the --skip-browser option to the kubeconfig for the cluster.

export TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true

On Windows systems, use the SET command instead of export.
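For example, in a Windows Command Prompt session, a sketch of the equivalent command is:

SET TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true

In PowerShell, you can set the variable with $Env:TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER="true" instead.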


2 Export the regular kubeconfig for the management cluster to the local file /tmp/id_mgmt_test_kubeconfig.

Note that the command does not include the --admin option, so the kubeconfig that is exported is the regular kubeconfig, not the admin version.

tanzu management-cluster kubeconfig get --export-file /tmp/id_mgmt_test_kubeconfig

You should see confirmation that You can now access the cluster by specifying '--kubeconfig /tmp/id_mgmt_test_kubeconfig' flag when using 'kubectl' command.

3 Connect to the management cluster by using the newly-created kubeconfig file.

kubectl get pods -A --kubeconfig /tmp/id_mgmt_test_kubeconfig

The login URL is displayed in the terminal. For example:

Please log in: https://ab9d82be7cc2443ec938e35b69862c9c-10577430.eu-west-1.elb.amazonaws.com/oauth2/authorize?access_type=offline&client_id=pinniped-cli&code_challenge=vPtDqg2zUyLFcksb6PrmE8bI9qF8it22KQMy52hB6DE&code_challenge_method=S256&nonce=2a66031e3075c65ea0361b3ba30bf174&redirect_uri=http%3A%2F%2F127.0.0.1%3A57856%2Fcallback&response_type=code&scope=offline_access+openid+pinniped%3Arequest-audience&state=01064593f32051fee7eff9333389d503

4 Copy the login URL and paste it into a browser on a machine that does have one.

5 In the browser, log in to your identity provider.

You will see a message that the identity provider could not send the authentication code because there is no localhost listener on your workstation.

6 Copy the URL of the authenticated session from the URL field of the browser.

7 On the machine that does not have a browser, use the URL that you copied in the preceding step to get the authentication code from the identity provider.

curl -L 'COPIED-URL'

Wrap the URL in quotes, to escape any special characters. For example, the command will resemble the following:

curl -L 'http://127.0.0.1:37949/callback?code=FdBkopsZwYX7w5zMFnJqYoOlJ50agmMWHcGBWD-DTbM.8smzyMuyEBlPEU2ZxWcetqkStyVPjdjRgJNgF1-vODs&scope=openid+offline_access+pinniped%3Arequest-audience&state=a292c262a69e71e06781d5e405d42c03'

After running the curl -L command, you should see the following message:

you have been logged in and may now close this tab


8 Connect to the management cluster again by using the same kubeconfig file as you used previously.

kubectl get pods -A --kubeconfig /tmp/id_mgmt_test_kubeconfig

If you already configured a role binding on the cluster for the authenticated user, the output shows the pod information.

If you have not configured a role binding on the cluster, you will see a message denying the user account access to the pods: Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot list resource "pods" in API group "" at the cluster scope. This happens because the user has been successfully authenticated, but they are not yet authorized to access any resources on the cluster. To authorize the user to access the cluster resources, you must configure Role-Based Access Control (RBAC) on the cluster by creating a cluster role binding.

Create a Role Binding on the Management Cluster

To complete the identity management configuration of the management cluster, you must create cluster role bindings for the users who use the kubeconfig that you generated in the preceding step. There are many roles with which you can associate users, but the most useful roles are the following:

- cluster-admin: Can perform any operation on the cluster.

- admin: Permission to view most resources, but can only modify resources like roles and bindings. Cannot modify pods or deployments.

- edit: The opposite of admin. Can create, update, and delete resources like deployments, services, and pods. Cannot change roles or permissions.

- view: Read-only.

You can assign any of these roles to users. For more information about RBAC and cluster role bindings, see Using RBAC Authorization in the Kubernetes documentation.

1 Make sure that you are using the admin context of the management cluster.

kubectl config current-context

If the context is not the management cluster admin context, set kubectl to use that context. For example:

kubectl config use-context id-mgmt-test-admin@id-mgmt-test

2 To see the full list of roles that are available on a cluster, run the following command:

kubectl get clusterroles

3 Create a cluster role binding to associate a given user with a role.


The following command creates a role binding named id-mgmt-test-rb that binds the role cluster-admin for this cluster to the user [email protected]. For OIDC the username is usually the email address of the user. For LDAPS it is the LDAP username, not the email address.

OIDC:

kubectl create clusterrolebinding id-mgmt-test-rb --clusterrole cluster-admin --user [email protected]

LDAP:

kubectl create clusterrolebinding id-mgmt-test-rb --clusterrole cluster-admin --user LDAP-USERNAME

4 Attempt to connect to the management cluster again by using the kubeconfig file that you created in the previous procedure.

kubectl get pods -A --kubeconfig /tmp/id_mgmt_test_kubeconfig

This time, because the user is bound to the cluster-admin role on this management cluster, the list of pods should be displayed.
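As an alternative to the imperative kubectl create clusterrolebinding command, you can define the same binding declaratively and apply it with kubectl apply -f. The following is a minimal sketch for the OIDC user from the example above:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: id-mgmt-test-rb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: [email protected]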

Add a Load Balancer Service to a Management Cluster on vSphere

When deploying a management cluster on vSphere, you may want to use a load balancer service with the identity management services provided by Pinniped for OIDC or by Pinniped and Dex for LDAP. Setting up a load balancer service on the management cluster for Pinniped and Dex identity management services can simplify your DNS and firewall configuration requirements.

This procedure modifies Pinniped by updating the app secret that contains the deployment configuration. This update ensures that any configuration changes made to Pinniped components are preserved during future upgrades of the management cluster.

Prerequisites

Before you begin this procedure, you must have the following:

- An external load balancer service configured and available for use as a provider by the management cluster. For example, you may have set up MetalLB.
- Successfully installed and configured Pinniped for OIDC or Pinniped and Dex for LDAP on the management cluster.

Procedure

1 Make sure that you are using the admin context of the management cluster.

kubectl config current-context

If the context is not the management cluster admin context, set kubectl to use that context. For example:

kubectl config use-context id-mgmt-test-admin@id-mgmt-test


2 Obtain the name of the secret containing the Pinniped configuration.

kubectl get secret -n tkg-system -l tkg.tanzu.vmware.com/addon-name=pinniped

3 Save the existing values from the secret.

kubectl get secret tkg-mgmt-vc-pinniped-addon -n tkg-system -o jsonpath={.data.values\\.yaml} | base64 -d > values.yaml

4 Update the configuration for the Dex service located in values.yaml. This step is not required if you are using OIDC because Dex is not enabled for OIDC.

LDAP only

a Locate and copy the entire section for Dex that is labeled as follows:

dex:
  app: dex
  ......

b In the Dex configuration section, add or update a service: section to include name: dexsvc and type: LoadBalancer. This configuration update should resemble the following:

dex:
  app: dex
  ......
  service:
    name: dexsvc
    type: LoadBalancer

5 In a text editor, prepare the stringData section for the secret update. Make sure indentation matches the examples below.

OIDC:

stringData:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "pinniped-supervisor", "namespace": "pinniped-supervisor"}})
    ---
    #@overlay/replace
    spec:
      type: LoadBalancer
      selector:
        app: pinniped-supervisor
      ports:
        - name: https
          protocol: TCP
          port: 443
          targetPort: 8443
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    infrastructure_provider: vsphere
    tkg_cluster_role: management

LDAP:

In the values.yml: section under stringData:, include the Dex configuration that you prepared in the previous step.

stringData:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "pinniped-supervisor", "namespace": "pinniped-supervisor"}})
    ---
    #@overlay/replace
    spec:
      type: LoadBalancer
      selector:
        app: pinniped-supervisor
      ports:
        - name: https
          protocol: TCP
          port: 443
          targetPort: 8443
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    infrastructure_provider: vsphere
    tkg_cluster_role: management
    dex:
      app: dex
      ......
      service:
        name: dexsvc
        type: LoadBalancer

6 Run kubectl edit to edit the secret.

kubectl edit secret tkg-mgmt-vc-pinniped-addon -n tkg-system

7 When editing the secret text, add the stringData configuration YAML that you prepared in the previous step. Leave the existing encoded Base64 content for values.yaml.

apiVersion: v1
data:
  values.yaml: LEAVE EXISTING BASE64 ENCODED DATA
stringData:
  ADD THE YAML PREPARED DURING PREVIOUS STEP
kind: Secret

8 After you save the secret, check the status of the Pinniped app. The app should eventually show Reconcile succeeded.

kubectl get app pinniped -n tkg-system

NAME       DESCRIPTION           SINCE-DEPLOY   AGE
pinniped   Reconcile succeeded   3m23s          7h50m

9 If the returned status is Reconcile failed, run the following command to get details on the failure.

kubectl get app pinniped -n tkg-system -o yaml

If the failed status is due to a ytt template format error, edit the secret to correct the format in values.yaml or stringData and save the secret again.

10 Check the configuration map for pinniped-info.

kubectl get cm pinniped-info -n kube-public -o yaml | grep issuer

This command returns the IP address or DNS name of the load balancer service that you configured for Pinniped. For example:

issuer: https://10.186.131.117

What to Do Next

Share the generated `kubeconfig` file with other users, to allow them to access the management cluster. You can also start creating workload clusters, assign users to roles on those clusters, and share their `kubeconfig` files with those users.

- For information about creating workload clusters, see [Deploying Tanzu Kubernetes Clusters](../tanzu-k8s-clusters/deploy.md).
- For information about how to grant users access to workload clusters on which you have implemented identity management, see [Authenticate Connections to a Workload Cluster](../cluster-lifecycle/connect.md#id-mgmt).


Examine the Management Cluster Deployment

During the deployment of the management cluster, either from the installer interface or the CLI, Tanzu Kubernetes Grid creates a temporary management cluster using a Kubernetes in Docker (kind) cluster on the bootstrap machine. Tanzu Kubernetes Grid then uses this temporary cluster to provision the final management cluster on the platform of your choice: vSphere, Amazon EC2, or Microsoft Azure. After the deployment of the management cluster finishes successfully, Tanzu Kubernetes Grid deletes the temporary kind cluster.

When Tanzu Kubernetes Grid creates a management cluster for the first time, it also creates a folder ~/.tanzu/tkg/providers that contains all of the files required by Cluster API to create the management cluster.

The Tanzu Kubernetes Grid installer interface saves the settings for the management cluster that it creates into a cluster configuration file ~/.tanzu/tkg/clusterconfigs/UNIQUE-ID.yaml, where UNIQUE-ID is a generated filename.

IMPORTANT: By default, unless you set the KUBECONFIG environment variable to save the kubeconfig for a cluster to a specific file, all clusters that you deploy from the Tanzu CLI are added to a shared .kube-tkg/config file. If you delete the shared .kube-tkg/config file, all management clusters become orphaned and thus unusable.
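Because that shared file is the only record of the management cluster contexts, a simple precaution is to keep a copy of it, or to point KUBECONFIG at a dedicated file before you deploy, as the note above describes. A minimal sketch, assuming the default file locations:

# Back up the shared management cluster kubeconfig
cp ~/.kube-tkg/config ~/.kube-tkg/config.bak

# Or save the kubeconfig for the next cluster you deploy to a specific file instead
export KUBECONFIG=~/.kube/my-mgmt-cluster-config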

Management Cluster Networking

When you deploy a management cluster, pod-to-pod networking with Antrea is automatically enabled in the management cluster.

Configure DHCP Reservations for the Control Plane Nodes (vSphere Only)

After you deploy a cluster to vSphere, each control plane node requires a static IP address. This includes both management and Tanzu Kubernetes clusters. These static IP addresses are required in addition to the static IP address that you assigned to Kube-VIP when you deployed the management cluster.

To make the IP addresses that your DHCP server assigned to the control plane nodes static, you can configure a DHCP reservation for each control plane node in the cluster. For instructions on how to configure DHCP reservations, see your DHCP server documentation.

Verify the Deployment of the Management Cluster

After the deployment of the management cluster completes successfully, you can obtain information about your management cluster by:

- Locating the management cluster objects in vSphere, Amazon EC2, or Azure
- Using the Tanzu CLI and kubectl


View Management Cluster Objects in vSphere, Amazon EC2, or Azure

To view the management cluster objects in vSphere, Amazon EC2, or Azure, do the following:

- If you deployed the management cluster to vSphere, go to the resource pool that you designated when you deployed the management cluster.
- If you deployed the management cluster to Amazon EC2, go to the Instances view of your EC2 dashboard.
- If you deployed the management cluster to Azure, go to the resource group that you designated when you deployed the management cluster.

You should see the following VMs or instances.

- vSphere:
  - One or three control plane VMs, for development or production control plane, respectively, with names similar to CLUSTER-NAME-control-plane-sx5rp
  - A worker node VM with a name similar to CLUSTER-NAME-md-0-6b8db6b59d-kbnk4
- Amazon EC2:
  - One or three control plane VM instances, for development or production control plane, respectively, with names similar to CLUSTER-NAME-control-plane-bcpfp
  - A worker node instance with a name similar to CLUSTER-NAME-md-0-dwfnm
  - An EC2 bastion host instance with the name CLUSTER-NAME-bastion
- Azure:
  - One or three control plane VMs, for development or production control plane, respectively, with names similar to CLUSTER-NAME-control-plane-rh7xv
  - A worker node VM with a name similar to CLUSTER-NAME-md-0-rh7xv
  - Disk and Network Interface resources for the control plane and worker node VMs, with names based on the same name patterns.

If you did not specify a name for the management cluster, CLUSTER-NAME is something similar to tkg-mgmt-vsphere-20200323121503 or tkg-mgmt-aws-20200323140554.

View Management Cluster Details With Tanzu CLI and kubectl

Tanzu CLI provides commands that facilitate many of the operations that you can perform with your management cluster. However, for certain operations, you still need to use kubectl.


When you deploy a management cluster, the kubectl context is not automatically set to the context of the management cluster. Tanzu Kubernetes Grid provides two contexts for every management cluster and Tanzu Kubernetes cluster:

- The admin context of a cluster gives you full access to that cluster.
  - If you implemented identity management on the cluster, using the admin context allows you to run kubectl operations without requiring authentication with your identity provider (IDP).
  - If you did not implement identity management on the management cluster, you must use the admin context to run kubectl operations.
- If you implemented identity management on the cluster, using the regular context requires you to authenticate with your IDP before you can run kubectl operations on the cluster.

Before you can run kubectl operations on a management cluster, you must obtain its kubeconfig.

1 On the bootstrap machine, run the tanzu login command to see the available management clusters and which one is the current login context for the CLI.

For more information, see List Management Clusters and Change Context.

2 To see the details of the management cluster, run tanzu management-cluster get.

For more information, see See Management Cluster Details.

3 To retrieve a kubeconfig for the management cluster, run the tanzu management-cluster kubeconfig get command as described in Retrieve Management Cluster kubeconfig.

4 Set the context of kubectl to the management cluster.

kubectl config use-context my-mgmnt-cluster-admin@my-mgmnt-cluster

5 Use kubectl commands to examine the resources of the management cluster.

For example, run kubectl get nodes, kubectl get pods, or kubectl get namespaces to see the nodes, pods, and namespaces running in the management cluster.
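For example, a short inspection session might look like the following; the output is omitted here and depends on your deployment:

kubectl get nodes
kubectl get pods -A
kubectl get namespaces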

Retrieve Management Cluster kubeconfig

The tanzu management-cluster kubeconfig get command retrieves kubeconfig configuration information for the current management cluster, with options as follows:

- --export-file FILE
  - Without option: Add the retrieved cluster configuration information to the kubectl CLI's current kubeconfig file, whether it is the default ~/.kube/config or set by the KUBECONFIG environment variable.
  - With option: Write the cluster configuration to a standalone kubeconfig file FILE that you can share with others.


- --admin
  - Without option: Generate a regular kubeconfig that requires the user to authenticate with an external identity provider, and grants them access to cluster resources based on their assigned roles. To generate a regular kubeconfig, identity management must be configured on the cluster.
    - The context name for this kubeconfig includes a tanzu-cli- prefix. For example, tanzu-cli-id-mgmt-test@id-mgmt-test.
  - With option: Generate an administrator kubeconfig containing embedded credentials that lets the user access the cluster without logging in to an identity provider, and grants full access to the cluster's resources. If identity management is not configured on the cluster, you must specify the --admin option.
    - The context name for this kubeconfig includes an -admin suffix. For example, id-mgmt-test-admin@id-mgmt-test.

For example, to generate a standalone kubeconfig file to share with someone to grant them full access to your current management cluster:

tanzu management-cluster kubeconfig get --admin --export-file MC-ADMIN-KUBECONFIG

To retrieve a kubeconfig for a workload cluster, run tanzu cluster kubeconfig get as described in Retrieve Tanzu Kubernetes Cluster kubeconfig.
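By analogy with the management cluster command above, a sketch of exporting an administrator kubeconfig for a workload cluster named my-cluster might look like the following. Confirm the available flags with tanzu cluster kubeconfig get --help for your CLI version; WC-ADMIN-KUBECONFIG is a placeholder file name.

tanzu cluster kubeconfig get my-cluster --admin --export-file WC-ADMIN-KUBECONFIG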

What to Do Next

You can now use Tanzu Kubernetes Grid to start deploying Tanzu Kubernetes clusters. For information, see Chapter 5 Deploying Tanzu Kubernetes Clusters.

If you need to deploy more than one management cluster, on any or all of vSphere, Azure, and Amazon EC2, see Manage Your Management Clusters. This topic also provides information about how to add existing management clusters to your CLI instance, obtain credentials, scale and delete management clusters, and how to opt in or out of the CEIP.

5 Deploying Tanzu Kubernetes Clusters

This section describes how you use the Tanzu CLI to deploy and manage Tanzu Kubernetes clusters.

Before you can create Tanzu Kubernetes clusters, you must install the Tanzu CLI and deploy a management cluster. For information, see Chapter 3 Install the Tanzu CLI and Other Tools and Chapter 4 Deploying Management Clusters.

You can use the Tanzu CLI to deploy Tanzu Kubernetes clusters to the following platforms:

- vSphere 6.7u3
- vSphere 7 (see below)
- Amazon EC2
- Microsoft Azure

This chapter includes the following topics:

- About Tanzu Kubernetes Clusters
- Tanzu Kubernetes Clusters, kubectl, and kubeconfig
- Using the Tanzu CLI to Create and Manage Clusters in vSphere with Tanzu
- Deploy Tanzu Kubernetes Clusters
- Deploy Tanzu Kubernetes Clusters to vSphere
- Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
- Deploy Tanzu Kubernetes Clusters to Amazon EC2
- Deploy Tanzu Kubernetes Clusters to Azure
- Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
- Customize Tanzu Kubernetes Cluster Networking
- Create Persistent Volumes with Storage Classes
- Configure Tanzu Kubernetes Plans and Clusters


About Tanzu Kubernetes Clusters

In VMware Tanzu Kubernetes Grid, Tanzu Kubernetes clusters are the Kubernetes clusters in which your application workloads run.

Tanzu Kubernetes Grid automatically deploys clusters to the platform on which you deployed the management cluster. For example, you cannot deploy clusters to Amazon EC2 or Azure from a management cluster that is running in vSphere, or the reverse. It is not possible to use shared services between the different providers because, for example, vSphere clusters are reliant on sharing vSphere networks and storage, while Amazon EC2 and Azure use their own systems. Tanzu Kubernetes Grid automatically deploys clusters from whichever management cluster you have set as the context for the CLI by using the tanzu login command. For information about tanzu login, see Manage Your Management Clusters.

- For information about how to use the Tanzu CLI to deploy Tanzu Kubernetes clusters, see Deploy Tanzu Kubernetes Clusters and its subtopics.
- After you have deployed Tanzu Kubernetes clusters, the Tanzu CLI provides commands and options to perform the cluster lifecycle management operations described in Chapter 6 Managing Cluster Lifecycles.

For information about how to upgrade existing clusters to a new version of Kubernetes, see Upgrade Tanzu Kubernetes Clusters.

Tanzu Kubernetes Clusters, kubectl, and kubeconfig

When you create a management cluster, the Tanzu CLI and kubectl contexts are automatically set to that management cluster. However, Tanzu Kubernetes Grid does not automatically set the kubectl context to a Tanzu Kubernetes cluster when you create it. You must set the kubectl context to a Tanzu Kubernetes cluster manually by using the kubectl config use-context command.

By default, unless you specify the KUBECONFIG option to save the kubeconfig for a cluster to a specific file, all Tanzu Kubernetes clusters that you deploy are added to a shared .kube/config file. If you delete the shared .kube/config file and you still have the .kube-tkg/config file for the management cluster, you can recover the .kube/config of the Tanzu Kubernetes clusters with the tanzu cluster kubeconfig get command.

Do not change context or edit the .kube-tkg/config or .kube/config files while Tanzu Kubernetes Grid operations are running.


Using the Tanzu CLI to Create and Manage Clusters in vSphere with Tanzu

If you have vSphere 7 and you have enabled the vSphere with Tanzu feature, you can use the Tanzu CLI to interact with the vSphere with Tanzu Supervisor Cluster, to deploy Tanzu Kubernetes clusters in vSphere with Tanzu. For more information, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

Deploy Tanzu Kubernetes Clusters

After you have deployed a management cluster to vSphere, Amazon EC2, or Azure, or you have connected the Tanzu CLI to a vSphere with Tanzu Supervisor Cluster, you can use the Tanzu CLI to deploy Tanzu Kubernetes clusters.

To deploy a Tanzu Kubernetes cluster, you create a configuration file that specifies the different options with which to deploy the cluster. You then run the tanzu cluster create command, specifying the configuration file in the --file option.

This topic describes the most basic configuration options for Tanzu Kubernetes clusters.

Prerequisites for Cluster Deployment

- You have followed the procedures in Chapter 3 Install the Tanzu CLI and Other Tools and Chapter 4 Deploying Management Clusters to deploy a management cluster to vSphere, Amazon EC2, or Azure.
- You have already upgraded the management cluster to the version that corresponds with the Tanzu CLI version. If you attempt to deploy a Tanzu Kubernetes cluster with an updated CLI without upgrading the management cluster first, the Tanzu CLI returns the error Error: validation failed: version mismatch between management cluster and cli version. Please upgrade your management cluster to the latest to continue. For instructions on how to upgrade management clusters, see Upgrade Management Clusters.
- Alternatively, you have a vSphere 7 instance on which a vSphere with Tanzu Supervisor Cluster is running. To deploy clusters to a vSphere 7 instance on which the vSphere with Tanzu feature is enabled, you must connect the Tanzu CLI to the vSphere with Tanzu Supervisor Cluster. For information about how to do this, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.
- vSphere: If you are deploying Tanzu Kubernetes clusters to vSphere, each cluster requires one static virtual IP address to provide a stable endpoint for Kubernetes. Make sure that this IP address is not in the DHCP range, but is in the same subnet as the DHCP range.
- Azure: If you are deploying Tanzu Kubernetes clusters to Azure, each cluster requires a Network Security Group (NSG) for its worker nodes named CLUSTER-NAME-node-nsg, where CLUSTER-NAME is the name of the cluster. For more information, see Network Security Groups on Azure.
- Configure Tanzu Kubernetes cluster node size depending on cluster complexity and expected demand. For more information, see Minimum VM Sizes for Cluster Nodes.

Create a Tanzu Kubernetes Cluster Configuration File

When you deploy a Tanzu Kubernetes cluster, most of the configuration for the cluster is the same as the configuration of the management cluster that you use to deploy it. Because most of the configuration is the same, the easiest way to obtain an initial configuration file for a Tanzu Kubernetes cluster is to make a copy of the management cluster configuration file and to update it.

1 Locate the YAML configuration file for the management cluster.

- If you deployed the management cluster from the installer interface and you did not specify the --file option when you ran tanzu management-cluster create --ui, the configuration file is saved in ~/.tanzu/tkg/clusterconfigs/. The file has a randomly generated name, for example, bm8xk9bv1v.yaml.
- If you deployed the management cluster from the installer interface and you did specify the --file option, the management cluster configuration is taken from the file that you specified.
- If you deployed the management cluster from the Tanzu CLI without using the installer interface, the management cluster configuration is taken from either a file that you specified in the --file option, or from the default location, ~/.tanzu/tkg/cluster-config.yaml.

2 Make a copy of the management cluster configuration file and save it with a new name.

For example, save the file as my-aws-tkc.yaml, my-azure-tkc.yaml or my-vsphere-tkc.yaml.

IMPORTANT: The recommended practice is to use a dedicated configuration file for every Tanzu Kubernetes cluster that you deploy.
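For example, assuming the installer saved the management cluster configuration under the randomly generated name shown above, the copy step might look like this:

cp ~/.tanzu/tkg/clusterconfigs/bm8xk9bv1v.yaml ~/.tanzu/tkg/clusterconfigs/my-vsphere-tkc.yaml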

Deploy a Tanzu Kubernetes Cluster with Minimum Configuration

The simplest way to deploy a Tanzu Kubernetes cluster is to specify a configuration that is identical to that of the management cluster. In this case, you only need to specify a name for the cluster. If you are deploying the cluster to vSphere, you must also specify an IP address or FQDN for the Kubernetes API endpoint.

Note: To configure a workload cluster to use an OS other than the default Ubuntu 20.04, you must set the OS_NAME and OS_VERSION values in the cluster configuration file. The installer interface does not include node VM OS values in the management cluster configuration files that it saves to ~/.tanzu/tkg/clusterconfigs.
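For example, to use Photon OS rather than the default Ubuntu 20.04, the cluster configuration file might include values like the following. These values are illustrative; check which OS_NAME and OS_VERSION combinations Tanzu Kubernetes Grid supports for your infrastructure provider before using them.

OS_NAME: photon
OS_VERSION: "3"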

1 Open the new YAML cluster configuration file in a text editor.

2 Optionally set a name for the cluster in the CLUSTER_NAME variable.

For example, if you are deploying the cluster to vSphere, set the name to my-vsphere-tkc.

CLUSTER_NAME: my-vsphere-tkc


If you do not specify a CLUSTER_NAME value in the cluster configuration file or as an environment variable, you must pass it as the first argument to the tanzu cluster create command. The CLUSTER_NAME value passed to tanzu cluster create overrides the name you set in the configuration file. Workload cluster names must be 42 characters or less, and must comply with DNS hostname requirements as amended in RFC 1123.

3 If you are deploying the cluster to vSphere, specify a static virtual IP address or FQDN in the VSPHERE_CONTROL_PLANE_ENDPOINT variable.

No two clusters, including any management cluster and workload cluster, can have the same VSPHERE_CONTROL_PLANE_ENDPOINT address.

- Ensure that this IP address is not in the DHCP range, but is in the same subnet as the DHCP range.
- If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address.

VSPHERE_CONTROL_PLANE_ENDPOINT: 10.90.110.100
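Putting these settings together, a minimal vSphere workload cluster configuration is simply the copied management cluster configuration plus values like the following, using the example values above:

CLUSTER_NAME: my-vsphere-tkc
VSPHERE_CONTROL_PLANE_ENDPOINT: 10.90.110.100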

4 Save the configuration file.

5 Run the tanzu cluster create command, specifying the path to the configuration file in the --file option.

If you saved the Tanzu Kubernetes cluster configuration file my-vsphere-tkc.yaml in the default clusterconfigs folder, run the following command to create a cluster with a name that you specified in the configuration file:

tanzu cluster create --file .tanzu/tkg/clusterconfigs/my-vsphere-tkc.yaml

If you did not specify a name in the configuration file, or to create a cluster with a different name to the one that you specified, specify the cluster name in the tanzu cluster create command. For example, to create a cluster named another-vsphere-tkc from the configuration file my-vsphere-tkc.yaml, run the following command:

tanzu cluster create another-vsphere-tkc --file .tanzu/tkg/clusterconfigs/my-vsphere-tkc.yaml

Any name that you specify in the tanzu cluster create command will override the name you set in the configuration file.

6 To see information about the cluster, run the tanzu cluster get command, specifying the cluster name.

tanzu cluster get my-vsphere-tkc

The output lists information about the status of the control plane and worker nodes, the Kubernetes version that the cluster is running, and the names of the nodes.

NAME             NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
my-vsphere-tkc   default    running  1/1           1/1      v1.20.5+vmware.2


Details:

NAME                                                                READY  SEVERITY  REASON  SINCE  MESSAGE
/my-vsphere-tkc                                                     True                     17m
├─ClusterInfrastructure - VSphereCluster/my-vsphere-tkc             True                     19m
├─ControlPlane - KubeadmControlPlane/my-vsphere-tkc-control-plane   True                     17m
│ └─Machine/my-vsphere-tkc-control-plane-ss9rt                      True                     17m
└─Workers
  └─MachineDeployment/my-vsphere-tkc-md-0
    └─Machine/my-vsphere-tkc-md-0-657958d58-mgtpp                   True                     8m33s

The cluster runs the default version of Kubernetes for this Tanzu Kubernetes Grid release, which in Tanzu Kubernetes Grid v1.3.1 is v1.20.5.

Deploy a Cluster with Different Numbers of Control Plane and Worker Nodes

In the preceding example, because you did not change any of the node settings in the Tanzu Kubernetes cluster configuration file, the resulting Tanzu Kubernetes cluster has the same numbers of control plane and worker nodes as the management cluster. The nodes have the same CPU, memory, and disk configuration as the management cluster nodes.

- If you selected Development in the Management Cluster Settings section of the installer interface, or specified CLUSTER_PLAN: dev and the default numbers of nodes in the management cluster configuration, the Tanzu Kubernetes cluster consists of the following VMs or instances:
  - vSphere:
    - One control plane node, with a name similar to my-dev-cluster-control-plane-nj4z6.
    - One worker node, with a name similar to my-dev-cluster-md-0-6ff9f5cffb-jhcrh.
  - Amazon EC2:
    - One control plane node, with a name similar to my-dev-cluster-control-plane-d78t5.
    - One EC2 bastion node, with the name my-dev-cluster-bastion.
    - One worker node, with a name similar to my-dev-cluster-md-0-2vsr4.
  - Azure:
    - One control plane node, with a name similar to my-dev-cluster-20200902052434-control-plane-4d4p4.
    - One worker node, with a name similar to my-dev-cluster-20200827115645-md-0-rjdbr.


- If you selected Production in the Management Cluster Settings section of the installer interface, or specified CLUSTER_PLAN: prod and the default numbers of nodes in the management cluster configuration file, Tanzu CLI deploys a cluster with three control plane nodes and automatically implements stacked control plane HA for the cluster. The Tanzu Kubernetes cluster consists of the following VMs or instances:
  - vSphere:
    - Three control plane nodes, with names similar to my-prod-cluster-control-plane-nj4z6.
    - Three worker nodes, with names similar to my-prod-cluster-md-0-6ff9f5cffb-jhcrh.
  - Amazon EC2:
    - Three control plane nodes, with names similar to my-prod-cluster-control-plane-d78t5.
    - One EC2 bastion node, with the name my-prod-cluster-bastion.
    - Three worker nodes, with names similar to my-prod-cluster-md-0-2vsr4.
  - Azure:
    - Three control plane nodes, with names similar to my-prod-cluster-20200902052434-control-plane-4d4p4.
    - Three worker nodes, with names similar to my-prod-cluster-20200827115645-md-0-rjdbr.

If you copied the cluster configuration from the management cluster, you can update the CLUSTER_PLAN variable in the configuration file to deploy a Tanzu Kubernetes cluster that uses the prod plan, even if the management cluster was deployed with the dev plan, and the reverse.

CLUSTER_PLAN: prod

To deploy a Tanzu Kubernetes cluster with more control plane nodes than the dev and prod plans define by default, specify the CONTROL_PLANE_MACHINE_COUNT variable in the cluster configuration file. The number of control plane nodes that you specify in CONTROL_PLANE_MACHINE_COUNT must be uneven.

CONTROL_PLANE_MACHINE_COUNT: 5

Specify the number of worker nodes for the cluster in the WORKER_MACHINE_COUNT variable.

WORKER_MACHINE_COUNT: 10
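For example, a configuration file that requests a highly available control plane and a larger worker pool might combine these settings as follows, using the values shown above:

CLUSTER_PLAN: prod
CONTROL_PLANE_MACHINE_COUNT: 5
WORKER_MACHINE_COUNT: 10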

How you configure the size and resource configurations of the nodes depends on whether you are deploying clusters to vSphere, Amazon EC2, or Azure. For information about how to configure the nodes, see the appropriate topic for each provider:

- Deploy Tanzu Kubernetes Clusters to vSphere
- Deploy Tanzu Kubernetes Clusters to Amazon EC2
- Deploy Tanzu Kubernetes Clusters to Azure


Configure Common Settings

You configure proxies, Machine Health Check, private registries, and Antrea on Tanzu Kubernetes Clusters in the same way as you do for management clusters. For information, see Create a Management Cluster Configuration File.

Deploy a Cluster in a Specific Namespace

If you have created namespaces in your Tanzu Kubernetes Grid instance, you can deploy Tanzu Kubernetes clusters to those namespaces by specifying the NAMESPACE variable. If you do not specify the NAMESPACE variable, Tanzu Kubernetes Grid places clusters in the default namespace. Any namespace that you identify in the NAMESPACE variable must exist in the management cluster before you run the command. For example, you might want to create different types of clusters in dedicated namespaces. For information about creating namespaces in the management cluster, see Create Namespaces in the Management Cluster.

NAMESPACE: production

NOTE: If you have created namespaces, you must provide a unique name for all Tanzu Kubernetes clusters across all namespaces. If you provide a cluster name that is in use in another namespace in the same instance, the deployment fails with an error.
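For example, a sketch of deploying into a dedicated namespace, assuming that the production namespace already exists on the management cluster and that my-cluster-config.yaml is a hypothetical configuration file that sets NAMESPACE: production:

# Create the namespace once on the management cluster, from its admin context
kubectl create namespace production

# Then deploy the workload cluster into it
tanzu cluster create my-cluster --file my-cluster-config.yaml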

Create Tanzu Kubernetes Cluster Manifest Files

You can use the Tanzu CLI to create cluster manifest files for Tanzu Kubernetes clusters without actually creating the clusters. To generate a cluster manifest YAML file that you can pass to kubectl apply -f, run the tanzu cluster create command with the --dry-run option and save the output to a file. Use the same options and configuration file (--file) that you would use if you were creating the cluster, for example:

tanzu cluster create my-cluster --file my-cluster-config.yaml --dry-run > my-cluster-manifest.yaml

Deploy a Cluster from a Saved Manifest File

To deploy a cluster from the saved manifest file, pass it to the kubectl apply -f command. For example:

kubectl config use-context my-mgmt-context-admin@my-mgmt-context

kubectl apply -f my-cluster-manifest.yaml

Advanced Configuration of Tanzu Kubernetes Clusters

If you need to deploy a Tanzu Kubernetes cluster with more advanced configuration, rather than copying the configuration file of the management cluster, see the topics that describe the options that are specific to each infrastructure provider.

- Deploy Tanzu Kubernetes Clusters to vSphere
- Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster
- Deploy Tanzu Kubernetes Clusters to Amazon EC2
- Deploy Tanzu Kubernetes Clusters to Azure

Each of the topics on deployment to vSphere, Amazon EC2, and Azure includes Tanzu Kubernetes cluster templates that contain all of the options that you can use for each provider.

You can further customize the configuration of your Tanzu Kubernetes clusters by performing the following types of operations:

- Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
- Customize Tanzu Kubernetes Cluster Networking
- Create Persistent Volumes with Storage Classes

What to Do Next

After you have deployed Tanzu Kubernetes clusters, the Tanzu CLI provides commands and options to perform the following cluster lifecycle management operations. See Chapter 6 Managing Cluster Lifecycles.

Deploy Tanzu Kubernetes Clusters to vSphere

When you deploy Tanzu Kubernetes clusters to vSphere, you must specify options in the cluster configuration file to connect to vCenter Server and identify the vSphere resources that the cluster will use. You can also specify standard sizes for the control plane and worker node VMs, or configure the CPU, memory, and disk sizes for control plane and worker nodes explicitly. If you use custom image templates, you can identify which template to use to create node VMs.

For the full list of options that you must specify when deploying Tanzu Kubernetes clusters to vSphere, see the Tanzu CLI Configuration File Variable Reference.

Tanzu Kubernetes Cluster Template

The template below includes all of the options that are relevant to deploying Tanzu Kubernetes clusters on vSphere. You can copy this template and update it to deploy Tanzu Kubernetes clusters to vSphere.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.


With the exception of the options described in the sections below the template, the way in which you configure the variables for Tanzu Kubernetes clusters that are specific to vSphere is identical for both management clusters and workload clusters. For information about how to configure the variables, see Create a Management Cluster Configuration File and Management Cluster Configuration for vSphere. Options that are specific to workload clusters that are common to all infrastructure providers are described in Deploy Tanzu Kubernetes Clusters.

#! ---------------------------------------------------------------------
#! Basic cluster creation configuration
#! ---------------------------------------------------------------------

# CLUSTER_NAME:
CLUSTER_PLAN: dev
NAMESPACE: default
CNI: antrea
IDENTITY_MANAGEMENT_TYPE: oidc

#! ---------------------------------------------------------------------
#! Node configuration
#! ---------------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:

# VSPHERE_NUM_CPUS: 2
# VSPHERE_DISK_GIB: 40
# VSPHERE_MEM_MIB: 4096

# VSPHERE_CONTROL_PLANE_NUM_CPUS: 2
# VSPHERE_CONTROL_PLANE_DISK_GIB: 40
# VSPHERE_CONTROL_PLANE_MEM_MIB: 8192
# VSPHERE_WORKER_NUM_CPUS: 2
# VSPHERE_WORKER_DISK_GIB: 40
# VSPHERE_WORKER_MEM_MIB: 4096

# CONTROL_PLANE_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT_0:
# WORKER_MACHINE_COUNT_1:
# WORKER_MACHINE_COUNT_2:

#! ---------------------------------------------------------------------
#! vSphere configuration
#! ---------------------------------------------------------------------

VSPHERE_NETWORK: VM Network
# VSPHERE_TEMPLATE:
VSPHERE_SSH_AUTHORIZED_KEY:
VSPHERE_USERNAME:
VSPHERE_PASSWORD:
VSPHERE_SERVER:
VSPHERE_DATACENTER:
VSPHERE_RESOURCE_POOL:
VSPHERE_DATASTORE:
# VSPHERE_STORAGE_POLICY_ID:
VSPHERE_FOLDER:
VSPHERE_TLS_THUMBPRINT:
VSPHERE_INSECURE: false
VSPHERE_CONTROL_PLANE_ENDPOINT:

#! ---------------------------------------------------------------------
#! NSX-T specific configuration for enabling NSX-T routable pods
#! ---------------------------------------------------------------------

# NSXT_POD_ROUTING_ENABLED: false
# NSXT_ROUTER_PATH: ""
# NSXT_USERNAME: ""
# NSXT_PASSWORD: ""
# NSXT_MANAGER_HOST: ""
# NSXT_ALLOW_UNVERIFIED_SSL: false
# NSXT_REMOTE_AUTH: false
# NSXT_VMC_ACCESS_TOKEN: ""
# NSXT_VMC_AUTH_HOST: ""
# NSXT_CLIENT_CERT_KEY_DATA: ""
# NSXT_CLIENT_CERT_DATA: ""
# NSXT_ROOT_CA_DATA: ""
# NSXT_SECRET_NAME: "cloud-provider-vsphere-nsxt-credentials"
# NSXT_SECRET_NAMESPACE: "kube-system"

#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------------
#! Common configuration
#! ---------------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

ENABLE_AUDIT_LOGGING: true

ENABLE_DEFAULT_STORAGE_CLASS: true

CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""

#! ---------------------------------------------------------------------
#! Autoscaler configuration
#! ---------------------------------------------------------------------

ENABLE_AUTOSCALER: false
# AUTOSCALER_MAX_NODES_TOTAL: "0"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
# AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
# AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
# AUTOSCALER_MIN_SIZE_0:
# AUTOSCALER_MAX_SIZE_0:
# AUTOSCALER_MIN_SIZE_1:
# AUTOSCALER_MAX_SIZE_1:
# AUTOSCALER_MIN_SIZE_2:
# AUTOSCALER_MAX_SIZE_2:

#! ---------------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

Deploy a Cluster with a Custom OVA Image

If you are using a single custom OVA image for each version of Kubernetes to deploy clusters on one operating system, follow Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions. In that procedure, you import the OVA into vSphere and then specify it for tanzu cluster create with the --tkr option.

If you are using multiple custom OVA images for the same Kubernetes version, then the --tkr value is ambiguous. This happens when the OVAs for the same Kubernetes version:

- Have different operating systems, for example created by make build-node-ova-vsphere-ubuntu-1804, make build-node-ova-vsphere-photon-3, and make build-node-ova-vsphere-rhel-7.
- Have the same name but reside in different vCenter folders.

To resolve this ambiguity, set the VSPHERE_TEMPLATE option to the desired OVA image before you run tanzu cluster create.

If the OVA template image name is unique, set VSPHERE_TEMPLATE to just the image name.

If multiple images share the same name, set VSPHERE_TEMPLATE to the full inventory path of the image in vCenter. This path follows the form /MY-DC/vm/MY-FOLDER-PATH/MY-IMAGE, where:

- MY-DC is the datacenter containing the OVA template image
- MY-FOLDER-PATH is the path to the image from the datacenter, as shown in the vCenter VMs and Templates view
- MY-IMAGE is the image name

For example:

VSPHERE_TEMPLATE: "/TKG_DC/vm/TKG_IMAGES/ubuntu-1804-kube-v1.18.8-vmware.1"

You can determine the image's full vCenter inventory path manually or use the govc CLI:

1 Install govc. For installation instructions, see the govmomi repository on GitHub.

2 Set environment variables for govc to access your vCenter:

export GOVC_USERNAME=VCENTER-USERNAME
export GOVC_PASSWORD=VCENTER-PASSWORD
export GOVC_URL=VCENTER-URL
export GOVC_INSECURE=1

3 Run govc find / -type m and find the image name in the output, which lists objects by their complete inventory paths.
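For example, once the govc environment variables are set, you can narrow the listing to a particular image with a standard shell filter; the image name here is the one used in the example above:

govc find / -type m | grep ubuntu-1804-kube-v1.18.8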

For more information about custom OVA images, see Chapter 8 Building Machine Images.

Configure DHCP Reservations for the Control Plane Nodes

After you deploy a cluster to vSphere, each control plane node requires a static IP address. This includes both management and Tanzu Kubernetes clusters. These static IP addresses are required in addition to the static IP address that you assigned to Kube-VIP when you deployed the management cluster.

To make the IP addresses that your DHCP server assigned to the control plane nodes static, you can configure a DHCP reservation for each control plane node in the cluster. For instructions on how to configure DHCP reservations, see your DHCP server documentation.

What to Do Next

Advanced options that are applicable to all infrastructure providers are described in the following topics:

- Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
- Customize Tanzu Kubernetes Cluster Networking
- Create Persistent Volumes with Storage Classes
- Configure Tanzu Kubernetes Plans and Clusters

After you have deployed your cluster, see Chapter 6 Managing Cluster Lifecycles.


Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster

You can use Tanzu CLI with a vSphere with Tanzu Supervisor Cluster that is running in a vSphere 7 instance. In this way, you can deploy Tanzu Kubernetes clusters to vSphere with Tanzu and manage their lifecycle directly from the Tanzu CLI.

vSphere with Tanzu provides a vSphere Plugin for kubectl. The vSphere Plugin for kubectl extends the standard kubectl commands so that you can connect to the Supervisor Cluster from kubectl by using vCenter Single Sign-On credentials. Once you have installed the vSphere Plugin for kubectl, you can connect the Tanzu CLI to the Supervisor Cluster. Then, you can use the Tanzu CLI to deploy and manage Tanzu Kubernetes clusters running in vSphere.

NOTE: On VMware Cloud on AWS and Azure VMware Solution, you cannot create a Supervisor Cluster, and need to deploy a management cluster to run tanzu commands.

Prerequisites

- Perform the steps described in Chapter 3 Install the Tanzu CLI and Other Tools.
- Make sure that you have a vSphere account that has the correct permissions for deployment of clusters to vSphere 7. For information about how to create a user account, see Required Permissions for the vSphere Account. Alternatively, you can use a vSphere with Tanzu DevOps account. For information about the DevOps user role, see vSphere with Tanzu User Roles and Workflows.
- You have access to a vSphere 7 instance on which the vSphere with Tanzu feature is enabled.
- Download and install the kubectl vsphere CLI utility on the bootstrap machine on which you run Tanzu CLI commands.

For information about how to obtain and install the vSphere Plugin for kubectl, see Download and Install the Kubernetes CLI Tools for vSphere in the vSphere with Tanzu documentation.

Step 1: Add the Supervisor Cluster

Connect to the supervisor cluster and add it as a management cluster to the tanzu CLI:

1 From vCenter Hosts and Clusters view, in the left column, expand the nested Datacenter, the vCenter cluster that hosts the supervisor cluster, and its Namespaces object.

2 Under Namespaces, select the namespace.

Under Summary > Status > Link to CLI Tools, click Copy link and record the URL, for example https://192.168.123.3. Remove the https:// to obtain the supervisor cluster API endpoint, SUPERVISOR_IP below, which serves as the download page for the Kubernetes CLI tools.


On the bootstrap machine, run the kubectl vsphere login command to log in to vSphere 7 with your vCenter Single Sign-On user account.

Specify a vCenter Single Sign-On user account with the required privileges for Tanzu Kubernetes Grid operation, and the virtual IP (VIP) address for the control plane of the supervisor cluster. For example:

kubectl vsphere login --vsphere-username [email protected] --server=SUPERVISOR_IP --insecure-skip-tls-verify=true

Enter the password you use to log in to your vCenter Single Sign-On user account.

When you have successfully logged in, kubectl vsphere displays all of the contexts to which you have access. The list of contexts should include the IP address of the supervisor cluster.

Set the context of kubectl to the supervisor cluster.

kubectl config use-context SUPERVISOR_IP

Collect information to run the tanzu login command, which adds the supervisor cluster to your Tanzu Kubernetes Grid instance:

- Decide on a name for the tanzu CLI to use for the supervisor cluster, serving as a Tanzu Kubernetes Grid management cluster.
- The path to the local management cluster kubeconfig file, which defaults to ~/.kube/config and is set by the KUBECONFIG environment variable.
- The context of the supervisor cluster, which is the same as SUPERVISOR_IP.

a Run the tanzu login command, passing in the values above.

In the example below, the KUBECONFIG_PATH defaults to ~/.kube/config if the KUBECONFIG env variable is not set.

$ tanzu login --name my-super --kubeconfig KUBECONFIG_PATH --context 10.161.90.119
✔  successfully logged in to management cluster using the kubeconfig my-super

b Check that the supervisor cluster was added by running tanzu login again.

The supervisor cluster should be listed by the name that you provided in the preceding step:

tanzu login
? Select a server  [Use arrows to move, type to filter]
> my-vsphere-mgmt-cluster  ()
  my-aws-mgmt-cluster      ()
  SUPERVISOR_IP            ()
  + new server


Step 2: Configure Cluster Parameters

Configure the Tanzu Kubernetes clusters that the tanzu CLI calls the supervisor cluster to create:

1 Obtain information about the storage classes that are defined in the supervisor cluster.

kubectl get storageclasses

2 Set variables to define the storage classes, VM classes, service domain, namespace, and other required values with which to create your cluster. For information about all of the configuration parameters that you can set when deploying Tanzu Kubernetes clusters to vSphere with Tanzu, see Configuration Parameters for Provisioning Tanzu Kubernetes Clusters in the vSphere with Tanzu documentation.

The following list describes the required variables, with the value type or example for each, followed by its description:

- CONTROL_PLANE_STORAGE_CLASS: Default storage class for control plane nodes. Value: returned from the CLI command kubectl get storageclasses.
- WORKER_STORAGE_CLASS: Default storage class for worker nodes. Value: returned from the CLI command kubectl get storageclasses.
- DEFAULT_STORAGE_CLASS: Default storage class for control plane or workers. Value: empty string "" for no default, or a value from the CLI, as above.
- STORAGE_CLASSES: Storage classes available for node customization. Value: empty string "" lets clusters use any storage classes in the namespace, or a comma-separated list string of values from the CLI, for example "SC-1,SC-2,SC-3".
- CONTROL_PLANE_VM_CLASS: VM class for control plane nodes. Value: a standard VM class for vSphere with Tanzu, for example guaranteed-large. See Virtual Machine Class Types for Tanzu Kubernetes Clusters in the vSphere with Tanzu documentation.
- WORKER_VM_CLASS: VM class for worker nodes. Value: a standard VM class for vSphere with Tanzu, as above.
- SERVICE_CIDR: The CIDR range to use for the Kubernetes services. The recommended range is 100.64.0.0/13. Change this value only if the recommended range is unavailable.
- CLUSTER_CIDR: The CIDR range to use for pods. The recommended range is 100.96.0.0/11. Change this value only if the recommended range is unavailable.
- SERVICE_DOMAIN: The service domain, for example my.example.com, or cluster.local if no DNS. If you are going to assign FQDNs to the nodes, DNS lookup is required.
- NAMESPACE: The namespace in which to deploy the cluster.
- CLUSTER_PLAN: dev, prod, or a custom plan. See Tanzu CLI Configuration File Variable Reference for variables required for all Tanzu Kubernetes cluster configuration files.
- INFRASTRUCTURE_PROVIDER: tkg-service-vsphere.


You can set the variables above by doing either of the following:

- Include them in the cluster configuration file passed to the tanzu CLI --file option. For example:

  CONTROL_PLANE_VM_CLASS: guaranteed-large

- From the command line, set them as local environment variables by running export (on Linux and macOS) or SET (on Windows) on the command line. For example:

  export CONTROL_PLANE_VM_CLASS=guaranteed-large

Note: If you want to configure unique proxy settings for a Tanzu Kubernetes cluster, you can set TKG_HTTP_PROXY, TKG_HTTPS_PROXY, and NO_PROXY as environment variables and then use the Tanzu CLI to create the cluster. These variables take precedence over your existing proxy configuration in vSphere with Tanzu.
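For example, a sketch of setting cluster-specific proxy values before creating the cluster; the proxy address and no-proxy list below are placeholders, so substitute values for your environment:

export TKG_HTTP_PROXY="http://proxy.example.com:3128"
export TKG_HTTPS_PROXY="http://proxy.example.com:3128"
export NO_PROXY="localhost,127.0.0.1,.svc,.svc.cluster.local"
# Then run tanzu cluster create as described in Step 3 below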

Step 3: Create a Cluster

Run tanzu cluster create to create a Tanzu Kubernetes cluster.

1 Determine the versioned Tanzu Kubernetes release (TKr) for the cluster:

2 Obtain the list of TKr that are available in the supervisor cluster.

tanzu kubernetes-release get

3 From the command output, record the desired value listed under NAME, for example v1.18.9---vmware.1-tkg.1.a87f261. The tkr NAME is the same as its VERSION but with + changed to ---.

4 Determine the namespace for the cluster.

5 Obtain the list of namespaces.

kubectl get namespaces

6 From the command output, record the namespace that includes the Supervisor cluster, for example test-gc-e2e-demo-ns.

7 Decide on the cluster plan: dev, prod, or a custom plan.

- You can customize or create cluster plans with files in the ~/.tanzu/tkg/providers/infrastructure-tkg-service-vsphere directory. See Configure Tanzu Kubernetes Plans and Clusters for details.

8 Run tanzu cluster create with the namespace and tkr NAME values above to create a Tanzu Kubernetes cluster:

tanzu cluster create my-vsphere7-cluster --tkr=TKR-NAME

#! ---------------------------------------------------------------------
#! Settings for creating clusters on vSphere with Tanzu
#! ---------------------------------------------------------------------

#! Identifies the storage class to be used for storage of the disks that store
#! the root file systems of the control plane nodes.
CONTROL_PLANE_STORAGE_CLASS:
#! Specifies the name of the VirtualMachineClass that describes the virtual
#! hardware settings to be used by each control plane node in the pool.
CONTROL_PLANE_VM_CLASS:
#! Specifies a named storage class to be annotated as the default in the
#! cluster. If you do not specify it, there is no default.
DEFAULT_STORAGE_CLASS:
#! Specifies the service domain for the cluster.
SERVICE_DOMAIN:
#! Specifies named persistent volume (PV) storage classes for container
#! workloads. Storage classes associated with the Supervisor Namespace are
#! replicated in the cluster. In other words, each storage class listed must be
#! available on the Supervisor Namespace for this to be a valid value.
STORAGE_CLASSES:
#! Identifies the storage class to be used for storage of the disks that store
#! the root file systems of the worker nodes.
WORKER_STORAGE_CLASS:
#! Specifies the name of the VirtualMachineClass that describes the virtual
#! hardware settings to be used by each worker node in the pool.
WORKER_VM_CLASS:
NAMESPACE:

What to Do Next

You can now use the Tanzu CLI to deploy more Tanzu Kubernetes clusters to the vSphere with Tanzu Supervisor Cluster. You can also use the Tanzu CLI to manage the lifecycles of clusters that are already running there. For information about how to manage the lifecycle of clusters, see the other topics in Chapter 5 Deploying Tanzu Kubernetes Clusters.

Deploy Tanzu Kubernetes Clusters to Amazon EC2

When you deploy Tanzu Kubernetes clusters to Amazon EC2, you must specify options in the cluster configuration file to connect to your AWS account and identify the resources that the cluster will use. You can also specify the sizes for the control plane and worker node VMs, distribute nodes across availability zones, and share VPCs between clusters.

For the full list of options that you must specify when deploying Tanzu Kubernetes clusters to Amazon EC2, see the Tanzu CLI Configuration File Variable Reference.

Tanzu Kubernetes Cluster Template

The template below includes all of the options that are relevant to deploying Tanzu Kubernetes clusters on Amazon EC2. You can copy this template and update it to deploy Tanzu Kubernetes clusters to Amazon EC2.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.


With the exception of the options described in the sections below the template, the way in which you configure the variables for Tanzu Kubernetes clusters that are specific to Amazon EC2 is identical for both management clusters and workload clusters. For information about how to configure the variables, see Create a Management Cluster Configuration File and Management Cluster Configuration for Amazon EC2. Options that are specific to workload clusters that are common to all infrastructure providers are described in Deploy Tanzu Kubernetes Clusters.

#! ---------------------------------------------------------------------
#! Cluster creation basic configuration
#! ---------------------------------------------------------------------

# CLUSTER_NAME:
CLUSTER_PLAN: dev
NAMESPACE: default
CNI: antrea
IDENTITY_MANAGEMENT_TYPE: oidc

#! ---------------------------------------------------------------------
#! Node configuration
#! AWS-only MACHINE_TYPE settings override cloud-agnostic SIZE settings.
#! ---------------------------------------------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
CONTROL_PLANE_MACHINE_TYPE: t3.large
NODE_MACHINE_TYPE: m5.large
# CONTROL_PLANE_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT_0:
# WORKER_MACHINE_COUNT_1:
# WORKER_MACHINE_COUNT_2:

#! ---------------------------------------------------------------------
#! AWS Configuration
#! ---------------------------------------------------------------------

AWS_REGION:
AWS_NODE_AZ: ""
# AWS_NODE_AZ_1: ""
# AWS_NODE_AZ_2: ""
# AWS_VPC_ID: ""
# AWS_PRIVATE_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID: ""
# AWS_PUBLIC_SUBNET_ID_1: ""
# AWS_PRIVATE_SUBNET_ID_1: ""
# AWS_PUBLIC_SUBNET_ID_2: ""
# AWS_PRIVATE_SUBNET_ID_2: ""
# AWS_VPC_CIDR: 10.0.0.0/16
# AWS_PRIVATE_NODE_CIDR: 10.0.0.0/24
# AWS_PUBLIC_NODE_CIDR: 10.0.1.0/24
# AWS_PRIVATE_NODE_CIDR_1: 10.0.2.0/24
# AWS_PUBLIC_NODE_CIDR_1: 10.0.3.0/24
# AWS_PRIVATE_NODE_CIDR_2: 10.0.4.0/24
# AWS_PUBLIC_NODE_CIDR_2: 10.0.5.0/24
AWS_SSH_KEY_NAME:
BASTION_HOST_ENABLED: true

#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ---------------------------------------------------------------------
#! Common configuration
#! ---------------------------------------------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

ENABLE_AUDIT_LOGGING: true
ENABLE_DEFAULT_STORAGE_CLASS: true

CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""

#! ---------------------------------------------------------------------
#! Autoscaler configuration
#! ---------------------------------------------------------------------

ENABLE_AUTOSCALER: false
# AUTOSCALER_MAX_NODES_TOTAL: "0"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
# AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
# AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
# AUTOSCALER_MIN_SIZE_0:
# AUTOSCALER_MAX_SIZE_0:
# AUTOSCALER_MIN_SIZE_1:
# AUTOSCALER_MAX_SIZE_1:
# AUTOSCALER_MIN_SIZE_2:
# AUTOSCALER_MAX_SIZE_2:

#! ---------------------------------------------------------------------
#! Antrea CNI configuration
#! ---------------------------------------------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

Tanzu Kubernetes Cluster Plans and Node Distribution across AZs

When you create a prod Tanzu Kubernetes cluster on Amazon EC2, Tanzu Kubernetes Grid evenly distributes its control plane and worker nodes across the three Availability Zones (AZs) that you specified in your management cluster configuration. This includes Tanzu Kubernetes clusters that are configured with any of the following:

- The default number of control plane nodes
- A CONTROL_PLANE_MACHINE_COUNT setting that is greater than the default number of control plane nodes
- The default number of worker nodes
- A WORKER_MACHINE_COUNT setting that is greater than the default number of worker nodes

For example, if you specify WORKER_MACHINE_COUNT: 5, Tanzu Kubernetes Grid deploys two worker nodes in the first AZ, two worker nodes in the second AZ, and one worker node in the third AZ. You can optionally customize this default AZ placement mechanism for worker nodes by following the instructions in Configure AZ Placement Settings for Worker Nodes below. You cannot customize the default AZ placement mechanism for control plane nodes.

Configure AZ Placement Settings for Worker Nodes

When creating a prod Tanzu Kubernetes cluster on Amazon EC2, you can optionally specify how many worker nodes the tanzu cluster create command deploys in each of the three AZs you selected in the Tanzu Kubernetes Grid installer interface or configured in the cluster configuration file.

To do this:

1 Set the following variables in the cluster configuration file:

- WORKER_MACHINE_COUNT_0: Sets the number of worker nodes in the first AZ, AWS_NODE_AZ.
- WORKER_MACHINE_COUNT_1: Sets the number of worker nodes in the second AZ, AWS_NODE_AZ_1.
- WORKER_MACHINE_COUNT_2: Sets the number of worker nodes in the third AZ, AWS_NODE_AZ_2.
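For example, to reproduce the default 2/2/1 distribution of five worker nodes described above, the configuration file might contain:

WORKER_MACHINE_COUNT_0: 2
WORKER_MACHINE_COUNT_1: 2
WORKER_MACHINE_COUNT_2: 1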

2 Create the cluster. For example:

tanzu cluster create my-prod-cluster


Deploy a Cluster that Shares a VPC and NAT Gateway(s) with the Management Cluster

By default, Amazon EC2 imposes a limit of 5 NAT gateways per availability zone. For more information about this limit, see Resource Usage in Your Amazon Web Services Account. If you used the option to create a new VPC when you deployed the management cluster, by default, all Tanzu Kubernetes clusters that you deploy from this management cluster will also create a new VPC and one or three NAT gateways: one NAT gateway for development clusters and three NAT gateways, one in each of your availability zones, for production clusters. So as not to hit the limit of 5 NAT gateways per availability zone, you can modify the configuration with which you deploy Tanzu Kubernetes clusters so that they reuse the VPC and NAT gateway(s) that were created when the management cluster was deployed.

Configuring Tanzu Kubernetes clusters to share a VPC and NAT gateway(s) with their management cluster assumes the following about how the management cluster was deployed:
n It was deployed with the option to create a new VPC, either by selecting the option in the UI or by specifying AWS_VPC_CIDR in the cluster configuration file.
n Ideally, tanzu cluster create was used with the --file option to save the cluster configuration to a different location than the default .tanzu/tkg/cluster-config.yaml file.

To deploy Tanzu Kubernetes clusters that reuse the same VPC as the management cluster, you must modify the configuration file from which you deploy Tanzu Kubernetes clusters.

If you deployed the management cluster with the option to reuse an existing VPC, all Tanzu Kubernetes clusters will share that VPC and its NAT gateway(s), and no action is required.

1 Open the cluster configuration file for the management cluster in a text editor.

2 Update the setting for AWS_VPC_ID with the ID of the VPC that was created when the management cluster was deployed.

You can obtain this ID from your Amazon EC2 dashboard. Alternatively, you can obtain it by running tanzu management-cluster create --ui, selecting Deploy to AWS EC2 and consulting the value that is provided if you select Select an existing VPC in the VPC for AWS section of the installer interface. Cancel the deployment when you have copied the VPC ID.


3 Update the settings for the AWS_PUBLIC_SUBNET_ID and AWS_PRIVATE_SUBNET_ID variables. If you are deploying a prod Tanzu Kubernetes cluster, update AWS_PUBLIC_SUBNET_ID, AWS_PUBLIC_SUBNET_ID_1, and AWS_PUBLIC_SUBNET_ID_2, as well as AWS_PRIVATE_SUBNET_ID, AWS_PRIVATE_SUBNET_ID_1, and AWS_PRIVATE_SUBNET_ID_2.

You can obtain the network information from the VPC dashboard.

4 Save the cluster configuration file.

5 Run the tanzu cluster create command with the --file option, specifying the modified cluster configuration file.

tanzu cluster create my-cluster --file my-cluster-config.yaml
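For reference, the reused-VPC settings in the modified configuration file for a dev cluster might look like the following sketch. The IDs are placeholders for the values that you copy from your Amazon EC2 and VPC dashboards:

AWS_VPC_ID: YOUR-MANAGEMENT-CLUSTER-VPC-ID
AWS_PUBLIC_SUBNET_ID: YOUR-PUBLIC-SUBNET-ID
AWS_PRIVATE_SUBNET_ID: YOUR-PRIVATE-SUBNET-ID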

Deploy a Cluster to an Existing VPC and Add Subnet Tags

If both of the following are true, you must add the kubernetes.io/cluster/YOUR-CLUSTER-NAME=shared tag to the public subnet or subnets that you intend to use for your Tanzu Kubernetes cluster:
n You want to deploy the cluster to an existing VPC that was not created by Tanzu Kubernetes Grid.
n You want to create services of type LoadBalancer in the cluster.

Adding the kubernetes.io/cluster/YOUR-CLUSTER-NAME=shared tag to the public subnet or subnets enables you to create services of type LoadBalancer after you deploy the cluster. To add this tag and then deploy the cluster, follow the steps below:

1 Gather the ID or IDs of the public subnet or subnets within your existing VPC that you want to use for the cluster. To deploy a prod Tanzu Kubernetes cluster, you must provide three subnets.

2 Create the required tag by running the following command:

aws ec2 create-tags --resources YOUR-PUBLIC-SUBNET-ID-OR-IDS --tags Key=kubernetes.io/cluster/YOUR-CLUSTER-NAME,Value=shared

Where:

n YOUR-PUBLIC-SUBNET-ID-OR-IDS is the ID or IDs of the public subnet or subnets that you gathered in the previous step.

n YOUR-CLUSTER-NAME is the name of the Tanzu Kubernetes cluster that you want to create.

For example:

aws ec2 create-tags --resources subnet-00bd5d8c88a5305c6 subnet-0b93f0fdbae3436e8 subnet-06b29d20291797698 --tags Key=kubernetes.io/cluster/my-cluster,Value=shared

3 Create the cluster. For example:

tanzu cluster create my-cluster


Deploy a Prod Cluster from a Dev Management Cluster

When you create a prod Tanzu Kubernetes cluster from a dev management cluster that is running on Amazon EC2, you must set additional variables in the cluster configuration file, which defaults to .tanzu/tkg/cluster-config.yaml, before running the tanzu cluster create command. Setting these variables enables Tanzu Kubernetes Grid to create the cluster and spread its control plane and worker nodes across AZs.

To create a prod Tanzu Kubernetes cluster from a dev management cluster on Amazon EC2, perform the steps below:

1 Set the following variables in the cluster configuration file:

n Set CLUSTER_PLAN to prod.

n AWS_NODE_AZ variables: AWS_NODE_AZ was set when you deployed your dev management cluster. For the prod Tanzu Kubernetes cluster, add AWS_NODE_AZ_1 and AWS_NODE_AZ_2.

n AWS_PUBLIC_NODE_CIDR (new VPC) or AWS_PUBLIC_SUBNET_ID (existing VPC) variables: AWS_PUBLIC_NODE_CIDR or AWS_PUBLIC_SUBNET_ID was set when you deployed your dev management cluster. For the prod Tanzu Kubernetes cluster, add one of the following:

n AWS_PUBLIC_NODE_CIDR_1 and AWS_PUBLIC_NODE_CIDR_2

n AWS_PUBLIC_SUBNET_ID_1 and AWS_PUBLIC_SUBNET_ID_2

n AWS_PRIVATE_NODE_CIDR (new VPC) or AWS_PRIVATE_SUBNET_ID (existing VPC) variables: AWS_PRIVATE_NODE_CIDR or AWS_PRIVATE_SUBNET_ID was set when you deployed your dev management cluster. For the prod Tanzu Kubernetes cluster, add one of the following:

n AWS_PRIVATE_NODE_CIDR_1 and AWS_PRIVATE_NODE_CIDR_2

n AWS_PRIVATE_SUBNET_ID_1 and AWS_PRIVATE_SUBNET_ID_2

2 (Optional) Customize the default AZ placement mechanism for the worker nodes that you intend to deploy by following the instructions in Configure AZ Placement Settings for Worker Nodes. By default, Tanzu Kubernetes Grid distributes prod worker nodes evenly across the AZs.

3 Deploy the cluster by running the tanzu cluster create command. For example:

tanzu cluster create my-cluster
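For reference, the additions from step 1 might look like the following sketch for a management cluster that was deployed to a new VPC. The AZ names and CIDR ranges are placeholders:

CLUSTER_PLAN: prod
AWS_NODE_AZ_1: YOUR-SECOND-AZ
AWS_NODE_AZ_2: YOUR-THIRD-AZ
AWS_PUBLIC_NODE_CIDR_1: 10.0.2.0/24
AWS_PUBLIC_NODE_CIDR_2: 10.0.4.0/24
AWS_PRIVATE_NODE_CIDR_1: 10.0.3.0/24
AWS_PRIVATE_NODE_CIDR_2: 10.0.5.0/24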

What to Do Next

Advanced options that are applicable to all infrastructure providers are described in the following topics:
n Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
n Customize Tanzu Kubernetes Cluster Networking
n Create Persistent Volumes with Storage Classes


n Configure Tanzu Kubernetes Plans and Clusters

After you have deployed your cluster, see Chapter 6 Managing Cluster Lifecycles.

Deploy Tanzu Kubernetes Clusters to Azure

When you deploy Tanzu Kubernetes (workload) clusters to Microsoft Azure, you must specify options in the cluster configuration file to connect to your Azure account and identify the resources that the cluster will use.

For the full list of options that you must specify when deploying workload clusters to Azure, see the Tanzu CLI Configuration File Variable Reference.

Create a Network Security Group for Each Cluster

Each workload cluster on Azure requires a Network Security Group (NSG) for its worker nodes named CLUSTER-NAME-node-nsg, where CLUSTER-NAME is the name of the cluster.

For more information, see Network Security Groups on Azure.

Azure Private Clusters

By default, Azure management and workload clusters are public. But you can also configure them to be private, which means their API server uses an Azure internal load balancer (ILB) and is therefore only accessible from within the cluster’s own VNET or peered VNETs.

To make an Azure cluster private, include the following in its configuration file:
n Set AZURE_ENABLE_PRIVATE_CLUSTER to true.
n (Optional) Set AZURE_FRONTEND_PRIVATE_IP to an internal address for the cluster's load balancer.

n This address must be within the range of its control plane subnet and must not be used by another component.

n If not set, this address defaults to 10.0.0.100.
n Set AZURE_VNET_NAME, AZURE_VNET_CIDR, AZURE_CONTROL_PLANE_SUBNET_NAME, AZURE_CONTROL_PLANE_SUBNET_CIDR, AZURE_NODE_SUBNET_NAME, and AZURE_NODE_SUBNET_CIDR to the VNET and subnets that you use for other Azure private clusters.

n Because Azure private clusters are not accessible outside their VNET, the management cluster and any workload and shared services clusters that it manages must be in the same private VNET.

n The bootstrap machine, where you run the Tanzu CLI to create and use the private clusters, must also be in the same private VNET.

For more information, see API Server Endpoint in the Cluster API Provider Azure documentation.
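A minimal sketch of the private-cluster settings in a cluster configuration file, assuming an existing VNET named my-private-vnet and illustrative address ranges, might look like this:

AZURE_ENABLE_PRIVATE_CLUSTER: "true"
AZURE_FRONTEND_PRIVATE_IP: "10.0.0.100"
AZURE_VNET_NAME: my-private-vnet
AZURE_VNET_CIDR: 10.0.0.0/16
AZURE_CONTROL_PLANE_SUBNET_NAME: my-private-vnet-control-plane-subnet
AZURE_CONTROL_PLANE_SUBNET_CIDR: 10.0.0.0/24
AZURE_NODE_SUBNET_NAME: my-private-vnet-node-subnet
AZURE_NODE_SUBNET_CIDR: 10.0.1.0/24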


Tanzu Kubernetes Cluster Template

The template below includes all of the options that are relevant to deploying Tanzu Kubernetes clusters on Azure. You can copy this template and update it to deploy Tanzu Kubernetes clusters to Azure.

Mandatory options are uncommented. Optional settings are commented out. Default values are included where applicable.

You configure the Azure-specific variables for Tanzu Kubernetes clusters in the same way for both management clusters and workload clusters. For information about how to configure the variables, see Create a Management Cluster Configuration File and Management Cluster Configuration for Microsoft Azure. Options that are specific to workload clusters and common to all infrastructure providers are described in Deploy Tanzu Kubernetes Clusters.

#! ------------------------------
#! Cluster creation basic configuration
#! ------------------------------

# CLUSTER_NAME:
CLUSTER_PLAN: dev
NAMESPACE: default
CNI: antrea
IDENTITY_MANAGEMENT_TYPE: oidc

#! ------------------------------
#! Node configuration
#! ------------------------------

# SIZE:
# CONTROLPLANE_SIZE:
# WORKER_SIZE:
# AZURE_CONTROL_PLANE_MACHINE_TYPE: "Standard_D2s_v3"
# AZURE_NODE_MACHINE_TYPE: "Standard_D2s_v3"
# CONTROL_PLANE_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT: 1
# WORKER_MACHINE_COUNT_0:
# WORKER_MACHINE_COUNT_1:
# WORKER_MACHINE_COUNT_2:
# AZURE_CONTROL_PLANE_DATA_DISK_SIZE_GIB : ""
# AZURE_CONTROL_PLANE_OS_DISK_SIZE_GIB : ""
# AZURE_CONTROL_PLANE_MACHINE_TYPE : ""
# AZURE_CONTROL_PLANE_OS_DISK_STORAGE_ACCOUNT_TYPE : ""
# AZURE_ENABLE_NODE_DATA_DISK : ""
# AZURE_NODE_DATA_DISK_SIZE_GIB : ""
# AZURE_NODE_OS_DISK_SIZE_GIB : ""
# AZURE_NODE_MACHINE_TYPE : ""
# AZURE_NODE_OS_DISK_STORAGE_ACCOUNT_TYPE : ""

#! ------------------------------
#! Azure Configuration
#! ------------------------------

AZURE_ENVIRONMENT: "AzurePublicCloud"
AZURE_TENANT_ID:
AZURE_SUBSCRIPTION_ID:
AZURE_CLIENT_ID:
AZURE_CLIENT_SECRET:
AZURE_LOCATION:
AZURE_SSH_PUBLIC_KEY_B64:
# AZURE_CONTROL_PLANE_SUBNET_NAME: ""
# AZURE_CONTROL_PLANE_SUBNET_CIDR: ""
# AZURE_NODE_SUBNET_NAME: ""
# AZURE_NODE_SUBNET_CIDR: ""
# AZURE_RESOURCE_GROUP: ""
# AZURE_VNET_RESOURCE_GROUP: ""
# AZURE_VNET_NAME: ""
# AZURE_VNET_CIDR: ""
# AZURE_CUSTOM_TAGS : ""
# AZURE_ENABLE_PRIVATE_CLUSTER : ""
# AZURE_FRONTEND_PRIVATE_IP : ""
# AZURE_ENABLE_ACCELERATED_NETWORKING : ""

#! ------------------------------
#! Machine Health Check configuration
#! ------------------------------

ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

#! ------------------------------
#! Common configuration
#! ------------------------------

# TKG_CUSTOM_IMAGE_REPOSITORY: ""
# TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: ""

# TKG_HTTP_PROXY: ""
# TKG_HTTPS_PROXY: ""
# TKG_NO_PROXY: ""

ENABLE_AUDIT_LOGGING: true
ENABLE_DEFAULT_STORAGE_CLASS: true

CLUSTER_CIDR: 100.96.0.0/11
SERVICE_CIDR: 100.64.0.0/13

# OS_NAME: ""
# OS_VERSION: ""
# OS_ARCH: ""

#! ------------------------------
#! Autoscaler configuration
#! ------------------------------

ENABLE_AUTOSCALER: false
# AUTOSCALER_MAX_NODES_TOTAL: "0"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
# AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
# AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
# AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
# AUTOSCALER_MIN_SIZE_0:
# AUTOSCALER_MAX_SIZE_0:
# AUTOSCALER_MIN_SIZE_1:
# AUTOSCALER_MAX_SIZE_1:
# AUTOSCALER_MIN_SIZE_2:
# AUTOSCALER_MAX_SIZE_2:

#! ------------------------------
#! Antrea CNI configuration
#! ------------------------------

# ANTREA_NO_SNAT: false
# ANTREA_TRAFFIC_ENCAP_MODE: "encap"
# ANTREA_PROXY: false
# ANTREA_POLICY: true
# ANTREA_TRACEFLOW: false

What to Do Next

Advanced options that are applicable to all infrastructure providers are described in the following topics:
n Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions
n Customize Tanzu Kubernetes Cluster Networking
n Create Persistent Volumes with Storage Classes
n Configure Tanzu Kubernetes Plans and Clusters

After you have deployed your cluster, see Chapter 6 Managing Cluster Lifecycles.

Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions

Tanzu Kubernetes Grid can create Tanzu Kubernetes clusters that run on:
n A Kubernetes version that Tanzu Kubernetes Grid ships with, including the default version that the management cluster runs, or
n A Kubernetes version that comes out after the current version of Tanzu Kubernetes Grid, and that VMware publishes a Bill of Materials (BoM) for in a public registry.


List Available Versions

To list all available Kubernetes releases with their current compatibility and upgrade status, run tanzu kubernetes-release get with an optional version match argument, for example:
n tanzu kubernetes-release get: list all releases
n tanzu kubernetes-release get v1.19: list all releases matching v1.19
n tanzu kubernetes-release get v1.19.1+vmware.1: list the v1.19.1+vmware.1 release

Sample output:

$ tanzu kubernetes-release get
NAME                  VERSION             COMPATIBLE  UPGRADEAVAILABLE
v1.17.16---vmware.2   v1.17.16+vmware.2   True        True
v1.18.14---vmware.2   v1.18.14+vmware.2   True        True
v1.18.16---vmware.2   v1.18.16+vmware.2   True        False
v1.19.6---vmware.2    v1.19.6+vmware.2    True        True
v1.19.8---vmware.2    v1.19.8+vmware.2    True        False
v1.20.1---vmware.2    v1.20.1+vmware.2    True        True
v1.20.4---vmware.2    v1.20.4+vmware.2    False       False

List Available Upgrades

To list the available upgrades for a Kubernetes release, run tanzu kubernetes-release available-upgrades get with the full name of the version, for example:

tanzu kubernetes-release available-upgrades get v1.19.6---vmware.2
NAME                 VERSION
v1.19.8---vmware.2   v1.19.8+vmware.2
v1.19.9---vmware.2   v1.19.9+vmware.2
v1.20.1---vmware.2   v1.20.1+vmware.2
v1.20.4---vmware.2   v1.20.4+vmware.2
v1.20.5---vmware.2   v1.20.5+vmware.1

How Tanzu Kubernetes Grid Updates Kubernetes Versions

Tanzu Kubernetes Grid manages Kubernetes versions as custom resource definition (CRD) objects called Tanzu Kubernetes releases (TKr).

A TKr controller periodically polls a public registry for new Kubernetes version BoM files. When it detects a new version, it downloads the BoM and creates a corresponding TKr. The controller then saves the new BoM and TKr in the management cluster, as a ConfigMap and custom resource, respectively.

The tanzu CLI queries the management cluster to list available Kubernetes versions. When the CLI needs to create a new cluster, it downloads the TKr and BoM, uses the TKr to create the cluster, and saves the BoM to the local ~/.tanzu/tkg/bom directory.

Note: VMware publishes TKrs to all Microsoft Azure regions within the AzurePublicCloud and AzureUSGovernment cloud environments.


Deploy a Cluster with a Non-Default Kubernetes Version

Each release of Tanzu Kubernetes Grid provides a default version of Kubernetes. The default version for Tanzu Kubernetes Grid v1.3.1 is Kubernetes v1.20.5.

As upstream Kubernetes releases patches or new versions, VMware publishes them in a public registry and the Tanzu Kubernetes release controller imports them into the management cluster. This lets the tanzu CLI create clusters based on the new versions.

To list available Kubernetes versions, see List Available Versions above.

To deploy clusters that run a version of Kubernetes other than the default, follow the steps below.

Publish the Kubernetes Version to your Infrastructure

On vSphere and Azure, you need to take an additional step before you can deploy clusters that run non-default versions of Kubernetes:
n vSphere: Import the appropriate base image template OVA file into vSphere and convert it to a VM template. For information about importing base OVA files into vSphere, see Import a Base Image Template into vSphere.
n Azure: Run the Azure CLI command to accept the license for the base OS version. Once you have accepted a license, you can skip this step in the future:

a Convert your target Kubernetes version listed in the output of the tanzu kubernetes-release get command into its Azure image SKU as follows:

n Change leading v to k8s-.

n Change . to dot in the version number.

n Change trailing +vmware.* to -ubuntu-2004, to designate Ubuntu v20.04, the default OS version for all Tanzu Kubernetes Grid VMs on Azure.

n Examples: k8s-1dot19dot8-ubuntu-2004, k8s-1dot20dot5-ubuntu-2004.

b Run az vm image terms accept. For example:

az vm image terms accept --publisher vmware-inc --offer tkg-capi --plan k8s-1dot20dot5-ubuntu-2004

n Amazon EC2: No action required. The Amazon Linux 2 Amazon Machine Images (AMIs) that include the supported Kubernetes versions are publicly available to all Amazon EC2 users, in all supported AWS regions. Tanzu Kubernetes Grid automatically uses the appropriate AMI for the Kubernetes version that you specify.


Deploy the Kubernetes Cluster

To deploy a Tanzu Kubernetes cluster with a version of Kubernetes that is not the default for your Tanzu Kubernetes Grid release, specify the version in the --tkr option.

n Deploy a Kubernetes v1.19.1 cluster to vSphere:

tanzu cluster create my-1-19-1-cluster --tkr v1.19.1---vmware.1-tkg.1-60d2ffd

n Deploy a Kubernetes v1.19.1 cluster to Amazon EC2 or Azure:

tanzu cluster create my-1-19-1-cluster --tkr v1.19.1---vmware.1-tkg.1-60d2ffd

For more details on how to create a Tanzu Kubernetes cluster, see Deploy Tanzu Kubernetes Clusters.

Deploy a Cluster with an Alternate OS or Custom Machine Image

With out-of-the-box Tanzu Kubernetes Grid, the --tkr option to tanzu cluster create supports common Kubernetes versions running on common base machine OSes. But you can also build custom machine images and TKrs, and use them to create new clusters.

Reasons to do this include:
n To create clusters on a base OS that VMware supports but does not distribute, such as Red Hat Enterprise Linux (RHEL) v7.
n To install additional packages into the base machine image, or otherwise customize it as described in Customization in the Image Builder documentation.

To deploy a cluster with an alternate OS or custom machine image, you build a custom image, create a TKr for it, and deploy clusters with it as described in Chapter 8 Building Machine Images.

Customize Tanzu Kubernetes Cluster Networking

This topic describes how to customize networking for Tanzu Kubernetes (workload) clusters, including using a cluster network interface (CNI) other than the default Antrea, and supporting publicly-routable, no-NAT IP addresses for workload clusters on vSphere with NSX-T networking.

Deploy a Cluster with a Non-Default CNI

When you use the Tanzu CLI to deploy a Tanzu Kubernetes cluster, an Antrea cluster network interface (CNI) is automatically enabled in the cluster. Alternatively, you can enable a Calico CNI or your own CNI provider.

Existing Tanzu Kubernetes clusters that you deployed with a version of Tanzu Kubernetes Grid earlier than 1.2.x and then upgraded to v1.3 continue to use Calico as the CNI provider. You cannot change the CNI provider for these clusters.


You can change the default CNI for a Tanzu Kubernetes cluster by specifying the CNI variable in the configuration file. The CNI variable supports the following options:
n (Default) antrea: Enables Antrea.
n calico: Enables Calico. See Enable Calico below.
n none: Allows you to enable a custom CNI provider. See Enable a Custom CNI Provider below.

If you do not specify the CNI variable, Antrea is enabled by default.

CNI: antrea

#! ------------------------------
#! Antrea CNI configuration
#! ------------------------------
ANTREA_NO_SNAT: false
ANTREA_TRAFFIC_ENCAP_MODE: "encap"
ANTREA_PROXY: false
ANTREA_POLICY: true
ANTREA_TRACEFLOW: false

Enable Calico

To enable Calico in a Tanzu Kubernetes cluster, specify the following in the configuration file:

CNI: calico

After the cluster creation process completes, you can examine the cluster as described in Retrieve Tanzu Kubernetes Cluster kubeconfig and Examine the Deployed Cluster.

Enable a Custom CNI Provider

To enable a custom CNI provider in a Tanzu Kubernetes cluster, follow the steps below:

1 Specify CNI: none in the configuration file when you create the cluster. For example:

CNI: none

The cluster creation process will not succeed until you apply a CNI to the cluster. You can monitor the cluster creation process in the Cluster API logs on the management cluster. For instructions on how to access the Cluster API logs, see Monitor Workload Cluster Deployments in Cluster API Logs.

2 After the cluster has been initialized, apply your CNI provider to the cluster:

a Get the admin credentials of the cluster. For example:

tanzu cluster kubeconfig get my-cluster --admin

b Set the context of kubectl to the cluster. For example:

kubectl config use-context my-cluster-admin@my-cluster


c Apply the CNI provider to the cluster:

kubectl apply -f PATH-TO-YOUR-CNI-CONFIGURATION/example.yaml

3 Monitor the status of the cluster by using the tanzu cluster list command. When the cluster creation completes, the cluster status changes from creating to running. For more information about how to examine your cluster, see Connect to and Examine Tanzu Kubernetes Clusters.

Deploy Pods with Routable, No-NAT IP Addresses (NSX-T)

On vSphere with NSX-T networking and the Antrea container network interface (CNI), you can configure a Kubernetes workload cluster with routable IP addresses for its worker pods, bypassing network address translation (NAT) for external requests from and to the pods.

Routable IP addresses on pods let you: n Trace outgoing requests to common shared services, because their source IP address is the routable pod IP address, not a NAT address. n Support authenticated incoming requests from the external internet directly to pods, bypassing NAT.

The following sections explain how to deploy Tanzu Kubernetes Grid workload clusters with routable-IP pods. The range of routable IP addresses is set with the cluster's CLUSTER_CIDR configuration variable.

Configure NSX-T for Routable-IP Pods

1 Browse to your NSX-T server and open the Networking tab.

2 Under Connectivity > Tier-1 Gateways, click Add Tier-1 Gateway and configure a new Tier-1 gateway dedicated to routable-IP pods:
n Name: Make up a name for your routable pods T1 gateway.
n Linked Tier-0 Gateway: Select the Tier-0 gateway that your other Tier-1 gateways for Tanzu Kubernetes Grid use.
n Edge Cluster: Select an existing edge cluster.
n Route Advertisement: Enable All Static Routes, All NAT IP's, and All Connected Segments & Service Ports.

Click Save to save the gateway.

3 Under Connectivity > Segments, click Add Segment and configure a new NSX-T segment, a logical switch, for the workload cluster nodes containing the routable pods:
n Name: Make up a name for the network segment for the workload cluster nodes.
n Connectivity: Select the Tier-1 gateway that you just created.
n Transport Zone: Select an overlay transport zone, such as tz-overlay.


n Subnets: Choose an IP address range for cluster nodes, such as 195.115.4.1/24. This range should not overlap with DHCP profile Server IP Address values.
n Route Advertisement: Enable All Static Routes, All NAT IP's, and All Connected Segments & Service Ports.

Click Save to save the segment.

Deploy a Cluster with Routable-IP Pods

To deploy a workload cluster that has no-NAT, publicly-routable IP addresses for its worker pods:

1 Create a workload cluster configuration file as described in Create a Tanzu Kubernetes Cluster Configuration File and as follows:
n To set the block of routable IP addresses assigned to worker pods, you can either:

n Set CLUSTER_CIDR in the workload cluster configuration file, or

n Prepend your tanzu cluster create command with a CLUSTER_CIDR= setting, as shown in the following step.
n Set NSXT_POD_ROUTING_ENABLED to "true".
n Set NSXT_MANAGER_HOST to your NSX-T manager IP address.
n Set NSXT_ROUTER_PATH to the inventory path of the newly-added Tier-1 gateway for routable IPs. Obtain this from NSX-T manager > Connectivity > Tier-1 Gateways by clicking the gateway's menu icon and selecting Copy Path to Clipboard.

Set other NSXT_ string variables for accessing NSX-T by following the NSX-T Pod Routing table in the Tanzu CLI Configuration File Variable Reference. Pods can authenticate with NSX-T in one of four ways, with the least secure listed last:

n Certificate: Set NSXT_CLIENT_CERT_KEY_DATA, NSXT_CLIENT_CERT_DATA, and for a CA-issued certificate, NSXT_ROOT_CA_DATA_B64.

n VMware Identity Manager token on VMware Cloud (VMC): Set NSXT_VMC_AUTH_HOST and NSXT_VMC_ACCESS_TOKEN.

n Username/password stored in a Kubernetes secret: Set NSXT_SECRET_NAMESPACE, NSXT_SECRET_NAME, NSXT_USERNAME, and NSXT_PASSWORD.

n Username/password as plaintext in configuration file: Set NSXT_USERNAME and NSXT_PASSWORD.
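Taken together, the pod-routing settings in a workload cluster configuration file might look like the following sketch, which assumes username/password authentication and uses placeholder values:

CLUSTER_CIDR: 100.96.0.0/11
NSXT_POD_ROUTING_ENABLED: "true"
NSXT_MANAGER_HOST: YOUR-NSXT-MANAGER-IP
NSXT_ROUTER_PATH: /infra/tier-1s/YOUR-ROUTABLE-PODS-T1-GATEWAY
NSXT_USERNAME: YOUR-NSXT-USERNAME
NSXT_PASSWORD: YOUR-NSXT-PASSWORD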

2 Run tanzu cluster create as described in Deploy Tanzu Kubernetes Clusters. For example:

$ CLUSTER_CIDR=100.96.0.0/11 tanzu cluster create my-routable-work-cluster -f my-routable-work-cluster-config.yaml


Validating configuration...
Creating workload cluster 'my-routable-work-cluster'...
Waiting for cluster to be initialized...
Waiting for cluster nodes to be available...

Validate Routable IPs

To test routable IP addresses for your workload pods:

1 Deploy a webserver to the routable workload cluster.

2 Run kubectl get pods -o wide to retrieve NAME, INTERNAL-IP and EXTERNAL-IP values for your routable pods, and verify that the IP addresses listed are identical and are within the routable CLUSTER_CIDR range.

3 Run kubectl get nodes -o wide to retrieve NAME, INTERNAL-IP and EXTERNAL-IP values for the workload cluster nodes, which contain the routable-IP pods.

4 Log in to a different workload cluster's control plane node:

5 Run kubectl config use-context CLUSTER-CONTEXT to change context to the different cluster.

6 Run kubectl get nodes to retrieve the IP address of the current cluster's control plane node.

7 Run ssh capv@CONTROLPLANE-IP using the IP address you just retrieved.

8 ping and send curl requests to the routable IP address where you deployed the webserver, and confirm its responses.

n ping output should list the webserver's routable pod IP as the source address.

9 From a browser, log in to NSX-T and navigate to the Tier-1 gateway that you created for routable-IP pods.

10 Click Static Routes and confirm that the following routes were created within the routable CLUSTER_CIDR range:

n A route for pods in the workload cluster's control plane node, with Next Hops shown as the address of the control plane node itself.

n A route for pods in the workload cluster's worker nodes, with Next Hops shown as the addresses of the worker nodes themselves.

Delete Routable IPs

After you delete a workload cluster that contains routable-IP pods, you may need to free the routable IP addresses by deleting them from the T1 router:

1 In NSX-T manager > Connectivity > Tier-1 Gateways, select your routable-IP gateway.

2 Under Static Routes click the number of routes to open the list.

3 Search for routes that include the deleted cluster name, and delete each one by using its menu icon.


If a permissions error prevents you from deleting the route from the menu, which may happen if the route was created by a certificate, delete the route via the API:

1 From the menu next to the route name, select Copy Path to Clipboard.

2 Run curl -i -k -u 'NSXT_USERNAME:NSXT_PASSWORD' -H 'Content-Type: application/json' -H 'X-Allow-Overwrite: true' -X DELETE https://NSXT_MANAGER_HOST/policy/api/v1/STATIC-ROUTE-PATH where:

n NSXT_MANAGER_HOST, NSXT_USERNAME, and NSXT_PASSWORD are your NSX-T manager IP address and credentials

n STATIC-ROUTE-PATH is the path that you just copied to the clipboard. The name starts with /infra/tier-1s/ and includes /static-routes/.

Customize Cluster Node IP Addresses

You can configure cluster-specific IP address blocks for management or workload cluster nodes. How you do this depends on the cloud infrastructure that the cluster runs on:

vSphere

On vSphere, the cluster configuration file's VSPHERE_NETWORK sets the VM network that Tanzu Kubernetes Grid uses for cluster nodes and other Kubernetes objects. IP addresses are allocated to nodes by a DHCP server that runs in this VM network, deployed separately from Tanzu Kubernetes Grid.

If you are using NSX-T networking, you can configure DHCP bindings for your cluster nodes by following Configure DHCP Static Bindings on a Segment in the VMware NSX-T Data Center documentation.

Amazon EC2

To configure cluster-specific IP address blocks on Amazon EC2, set the following variables in the cluster configuration file, as described in the Amazon EC2 table in the Tanzu CLI Configuration File Variable Reference:
n Set AWS_PUBLIC_NODE_CIDR to set an IP address range for public nodes.

n Make additional ranges available by setting AWS_PUBLIC_NODE_CIDR_1 or AWS_PUBLIC_NODE_CIDR_2.
n Set AWS_PRIVATE_NODE_CIDR to set an IP address range for private nodes.

n Make additional ranges available by setting AWS_PRIVATE_NODE_CIDR_1 and AWS_PRIVATE_NODE_CIDR_2.
n All node CIDR ranges must lie within the cluster's VPC range, which defaults to 10.0.0.0/16.

n Set this range with AWS_VPC_CIDR or assign nodes to an existing VPC and address range with AWS_VPC_ID.
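As a sketch, a configuration for a dev cluster in a new VPC with custom node address blocks might include settings like the following. The CIDR ranges are illustrative values within the default 10.0.0.0/16 VPC range:

AWS_VPC_CIDR: 10.0.0.0/16
AWS_PUBLIC_NODE_CIDR: 10.0.1.0/24
AWS_PRIVATE_NODE_CIDR: 10.0.2.0/24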

Microsoft Azure


To configure cluster-specific IP address blocks on Azure, set the following variables in the cluster configuration file, as described in the Microsoft Azure table in the Tanzu CLI Configuration File Variable Reference:
n Set AZURE_NODE_SUBNET_CIDR to create a new VNET with a CIDR block for worker node IP addresses.
n Set AZURE_CONTROL_PLANE_SUBNET_CIDR to create a new VNET with a CIDR block for control plane node IP addresses.
n Set AZURE_NODE_SUBNET_NAME to assign worker node IP addresses from the range of an existing VNET.
n Set AZURE_CONTROL_PLANE_SUBNET_NAME to assign control plane node IP addresses from the range of an existing VNET.
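For example, to have a new VNET created with dedicated CIDR blocks for the control plane and worker nodes, a configuration file might include settings such as the following. The ranges are illustrative:

AZURE_CONTROL_PLANE_SUBNET_CIDR: 10.0.0.0/24
AZURE_NODE_SUBNET_CIDR: 10.0.1.0/24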

Create Persistent Volumes with Storage Classes

This topic explains how to use dynamic storage in Tanzu Kubernetes (workload) clusters in Tanzu Kubernetes Grid.

Overview: PersistentVolume, PersistentVolumeClaim, and StorageClass

Within a Kubernetes cluster, PersistentVolume (PV) objects provide shared storage for cluster pods that is unaffected by pod lifecycles. Storage is provisioned to the PV through a PersistentVolumeClaim (PVC) object, which defines how much storage the pod needs and how the pod accesses the underlying storage. For more information, see Persistent Volumes in the Kubernetes documentation.

Cluster administrators can define StorageClass objects that let cluster users dynamically create PVC and PV objects with different storage types and rules. Tanzu Kubernetes Grid also provides default StorageClass objects that let users provision persistent storage in a turnkey environment.

StorageClass objects include a provisioner field identifying the internal or external service plug-in that provisions PVs, and a parameters field that associates the Kubernetes storage class with storage options defined at the infrastructure level, such as VM Storage Policies in vSphere. For more information, see Storage Classes in the Kubernetes documentation.

Supported Storage Types

Tanzu Kubernetes Grid supports StorageClass objects for different storage types, provisioned by Kubernetes internal ("in-tree") or external ("out-of-tree") plug-ins.

Storage Types
n vSphere Cloud Native Storage (CNS)
n Amazon EBS
n Azure Disk


n iSCSI
n NFS

See Default Storage Classes below for the vSphere CNS, Amazon EBS, and Azure Disk default storage classes.

Plug-in Locations
n Kubernetes internal ("in-tree") storage.

n Ships with core Kubernetes; provider values are prefixed with kubernetes.io, e.g. kubernetes.io/aws-ebs.
n External ("out-of-tree") storage.

n Can be anywhere defined by provider value, e.g. csi.vsphere.vmware.com.

n Follow the Container Storage Interface (CSI) standard for external storage.

Default Storage Classes

Tanzu Kubernetes Grid provides default StorageClass objects that let workload cluster users provision persistent storage on their infrastructure in a turnkey environment, without needing StorageClass objects created by a cluster administrator.

The ENABLE_DEFAULT_STORAGE_CLASS variable is set to true by default in the cluster configuration file that you pass to the --file option of tanzu cluster create, which enables the default storage class for a workload cluster.
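For example, if you want only administrator-defined StorageClass objects to be available in a workload cluster, you could set the following in the configuration file to skip creation of the default storage class:

ENABLE_DEFAULT_STORAGE_CLASS: false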

The Tanzu Kubernetes Grid default storage class definitions are:

vSphere CNS

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
parameters:
  storagePolicyName: optional

See the vSphere CSI storage class parameters in the Kubernetes documentation.

Amazon EBS

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: default
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs


See the Amazon EBS storage class parameters in the Kubernetes documentation.

Azure Disk

apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: default
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
  labels:
    kubernetes.io/cluster-service: "true"
provisioner: kubernetes.io/azure-disk
parameters:
  kind: Managed
  storageaccounttype: Standard_LRS
  cachingmode: ReadOnly
volumeBindingMode: WaitForFirstConsumer

See the Azure Disk storage class parameters in the Kubernetes documentation.

Set Up CNS and Create a Storage Policy (vSphere)

vSphere administrators can set up vSphere CNS and create storage policies for virtual disk (VMDK) storage, based on the needs of Tanzu Kubernetes Grid cluster users.

You can use either vSAN or local VMFS (Virtual Machine File System) for persistent storage in a Kubernetes cluster, as follows:

vSAN Storage:

To create a storage policy for vSAN storage in the vSphere Client, browse to Home > Policies and Profiles > VM Storage Policies and click Create to launch the Create VM Storage Policy wizard.

Follow the instructions in Create a Storage Policy in the vSphere documentation. Make sure to:
n In the Policy structure pane, under Datastore specific rules, select Enable rules for "vSAN" storage.
n Configure other panes or accept defaults as needed.
n Record the storage policy name for reference as the storagePolicyName value in StorageClass objects.

Local VMFS Storage:

To create a storage policy for local storage, apply a tag to the storage and create a storage policy based on the tag as follows:

1 From the top-level vSphere menu, select Tags & Custom Attributes

2 In the Tags pane, select Categories and click New.


3 Enter a category name, such as tkg-storage. Use the checkboxes to associate it with Datacenter and the storage objects, Folder and Datastore. Click Create.

4 From the top-level Storage view, select your VMFS volume, and in its Summary pane, click Tags > Assign....

5 From the Assign Tag popup, click Add Tag.

6 From the Create Tag popup, give the tag a name, such as tkg-storage-ds1 and assign it the Category you created. Click OK.

7 From Assign Tag, select the tag and click Assign.

8 From top-level vSphere, select VM Storage Policies > Create a Storage Policy. A configuration wizard starts.

9 In the Name and description pane, enter a name for your storage policy. Record the storage policy name for reference as the storagePolicyName value in StorageClass objects.

10 In the Policy structure pane, under Datastore specific rules, select Enable tag-based placement rules.

11 In the Tag based placement pane, click Add Tag Rule and configure:

n Tag category: Select your category name

n Usage option: Use storage tagged with

n Tags: Browse and select your tag name

12 Confirm and configure other panes or accept defaults as needed, then click Review and finish, and click Finish to create the storage policy.

Create a Custom Storage Class

Cluster administrators can create a new storage class as follows:

1 On vSphere, select or create the VM storage policy to use as the basis for the Kubernetes StorageClass.

n vSphere administrators can create a storage policy by following Create a Storage Policy (vSphere), above.

2 Create a StorageClass configuration .yaml with provisioner, parameters, and other options.

n On vSphere, associate a Kubernetes storage class with a vSphere storage policy by setting its storagePolicyName parameter to the vSphere storage policy name, as a double-quoted string.

3 Pass the file to kubectl create -f

4 Verify the storage class by running kubectl describe storageclass .
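As an illustration, a custom StorageClass definition on vSphere that references a hypothetical storage policy named tkg-storage-policy might look like the following. The class name and policy name are examples only:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: tkg-custom-storage
provisioner: csi.vsphere.vmware.com
parameters:
  storagePolicyName: "tkg-storage-policy"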


Examples:
n Enabling Dynamic Provisioning in the Kubernetes documentation
n CSI - Container Storage Interface in the Kubernetes vSphere Cloud Provider documentation

Use a Custom Storage Class in a Cluster

To provision persistent storage for their cluster nodes that does not use one of the Default Storage Classes described above, cluster users include a custom storage class in a pod configuration as follows:

1 Set the context of kubectl to the cluster. For example:

kubectl config use-context my-cluster-admin@my-cluster

2 Select or create a storage class.

n Select:

n To list available storage classes, run kubectl get storageclass.

n Create

n Cluster admins can create storage classes by following Create a Custom Storage Class, above.

3 Create a PVC and its PV:

a Create a PersistentVolumeClaim configuration .yaml with spec.storageClassName set to the metadata.name value of your StorageClass object. For an example, see Enabling Dynamic Provisioning in the Kubernetes documentation.

b Pass the file to kubectl create -f

c Run kubectl describe pvc to verify the PVC.

d A PV is automatically created with the PVC. Record its name, listed in the kubectl describe pvc output after Successfully provisioned volume.

e Run kubectl describe pv to verify the PV.

4 Create a pod using the PVC:

a Create a Pod configuration .yaml that sets spec.volumes to include your PVC under persistentVolumeClaim.claimName. For an example, see Dynamic Provisioning and StorageClass API in the vSphere Storage for Kubernetes documentation.

b Pass the file to kubectl create -f

c Run kubectl get pod to verify the pod.
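A minimal sketch of a PVC that uses the hypothetical tkg-custom-storage class from the previous example might look like the following; the claim name and requested size are illustrative:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: tkg-custom-storage
  resources:
    requests:
      storage: 5Gi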

Enable Offline Volume Expansion for vSphere CSI (vSphere 7)

To enable offline volume expansion for vSphere CSI storage used by workload clusters, you need to add a csi-resizer sidecar pod to the cluster's CSI processes.


The CSI configuration for workload clusters is encoded as a Kubernetes secret. This procedure adds the csi-resizer process by revising the CSI configuration secret. It adds to the secret a stringData definition that combines two encoded configuration data strings: a values.yaml string containing the secret's prior CSI configuration data, and a new overlays.yaml string that deploys the csi-resizer pod.

1 Log into the management cluster for the workload cluster you are changing, and run tanzu cluster list if you need to retrieve the name of the workload cluster.

2 Retrieve the name of the CSI secret for the workload cluster, using label selectors vsphere-csi and the cluster name:

$ kubectl get secret \
    -l tkg.tanzu.vmware.com/cluster-name=NAME_OF_WORKLOAD_CLUSTER \
    -l tkg.tanzu.vmware.com/addon-name=vsphere-csi
my-wc-vsphere-csi-secret

3 Save a backup of the secret's content, in YAML format, to vsphere-csi-secret.yaml:

$ kubectl get secret my-wc-vsphere-csi-secret -o yaml > vsphere-csi-secret.yaml

4 Output the secret's current content again, with the data.values values base64-decoded into plain YAML:

$ kubectl get secret my-wc-vsphere-csi-secret -o jsonpath={.data.values\\.yaml} | base64 -d

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
vsphereCSI:
  CSIAttacherImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/csi-attacher
    tag: v3.0.0_vmware.1
    pullPolicy: IfNotPresent
  vsphereCSIControllerImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/vsphere-block-csi-driver
    tag: v2.1.0_vmware.1
    pullPolicy: IfNotPresent
  livenessProbeImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/csi-livenessprobe
    tag: v2.1.0_vmware.1
    pullPolicy: IfNotPresent
  vsphereSyncerImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/volume-metadata-syncer
    tag: v2.1.0_vmware.1
    pullPolicy: IfNotPresent
  CSIProvisionerImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/csi-provisioner
    tag: v2.0.0_vmware.1
    pullPolicy: IfNotPresent
  CSINodeDriverRegistrarImage:
    repository: projects.registry.vmware.com/tkg
    path: csi/csi-node-driver-registrar
    tag: v2.0.1_vmware.1
    pullPolicy: IfNotPresent
  namespace: kube-system
  clusterName: wc-1
  server: 10.170.104.114
  datacenter: /dc0
  publicNetwork: VM Network
  username:
  password:

5 Open vsphere-csi-secret.yaml in an editor and do the following to make it look like the code below:

a After the first line, add two lines that define stringData, and values.yaml as its first element.

b Copy the data.values output from the previous step.

c After the third line, paste in the data.values output and indent it as the value of values.yaml.

d Immediately below the values.yaml definition, add another stringData definition for overlays.yaml as shown below. Do not modify other definitions in the file.

```yaml
apiVersion: v1
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    vsphereCSI:
      CSIAttacherImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/csi-attacher
        tag: v3.0.0_vmware.1
        pullPolicy: IfNotPresent
      vsphereCSIControllerImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/vsphere-block-csi-driver
        tag: v2.1.0_vmware.1
        pullPolicy: IfNotPresent
      livenessProbeImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/csi-livenessprobe
        tag: v2.1.0_vmware.1
        pullPolicy: IfNotPresent
      vsphereSyncerImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/volume-metadata-syncer
        tag: v2.1.0_vmware.1
        pullPolicy: IfNotPresent
      CSIProvisionerImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/csi-provisioner
        tag: v2.0.0_vmware.1
        pullPolicy: IfNotPresent
      CSINodeDriverRegistrarImage:
        repository: projects.registry.vmware.com/tkg
        path: csi/csi-node-driver-registrar
        tag: v2.0.1_vmware.1
        pullPolicy: IfNotPresent
      namespace: kube-system
      clusterName: wc-1
      server: 10.170.104.114
      datacenter: /dc0
      publicNetwork: VM Network
      username:
      password:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"kind": "Deployment", "metadata": {"name": "vsphere-csi-controller"}})
    ---
    spec:
      template:
        spec:
          containers:
          #@overlay/append
          - name: csi-resizer
            image: projects.registry.vmware.com/tkg/kubernetes-csi_external-resizer:v1.0.0_vmware.1
            args:
              - "--v=4"
              - "--timeout=300s"
              - "--csi-address=$(ADDRESS)"
              - "--leader-election"
            env:
              - name: ADDRESS
                value: /csi/csi.sock
            volumeMounts:
              - mountPath: /csi
                name: socket-dir
kind: Secret
...
```

6 Run kubectl apply to update the cluster's secret with the revised definitions and re-create the csi-controller pod:

$ kubectl apply -f vsphere-csi-secret.yaml

7 To verify that the vsphere-csi-controller and external resizer are working on the cluster:


a Confirm that the vsphere-csi-controller pod is running on the workload cluster with six healthy containers:

$ kubectl get pods -n kube-system -l app=vsphere-csi-controller
NAME                     READY   STATUS    RESTARTS   AGE
vsphere-csi-controller-  6/6     Running   0          6m49s

b Check the logs of the vsphere-csi-controller to see that the external resizer started:

$ kubectl logs vsphere-csi-controller- -n kube-system -c csi-resizer
I0308 23:44:45.035254       1 main.go:79] Version : v1.0.0-0-gb22717d
I0308 23:44:45.037856       1 connection.go:153] Connecting to unix:///csi/csi.sock
I0308 23:44:45.038572       1 common.go:111] Probing CSI driver for readiness
I0308 23:44:45.040579       1 csi_resizer.go:77] CSI driver name: "csi.vsphere.vmware.com"
W0308 23:44:45.040612       1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I0308 23:44:45.042184       1 controller.go:117] Register Pod informer for resizer csi.vsphere.vmware.com
I0308 23:44:45.043182       1 leaderelection.go:243] attempting to acquire leader lease kube-system/external-resizer-csi-vsphere-vmware-com...
I0308 23:44:45.073383       1 leaderelection.go:253] successfully acquired lease kube-system/external-resizer-csi-vsphere-vmware-com
I0308 23:44:45.076028       1 leader_election.go:172] new leader detected, current leader: vsphere-csi-controller-87d7dcf48-jcht2
I0308 23:44:45.079332       1 leader_election.go:165] became leader, starting
I0308 23:44:45.079638       1 controller.go:241] Starting external resizer csi.vsphere.vmware.com

For more information about expanding vSphere CSI storage volumes in online or offline mode, see Expand a Persistent Volume in Online Mode and Expand a Persistent Volume in Offline Mode.

Configure Tanzu Kubernetes Plans and Clusters

This topic explains where Tanzu Kubernetes (workload) cluster plan configuration values come from and the order of precedence among their multiple sources. It also explains how you can customize the dev and prod plans for workload clusters on each cloud infrastructure, and how you can use ytt overlays to customize cluster plans and clusters and create new custom plans, while preserving the original configuration code.

Where Cluster Configuration Values Come From

When the tanzu CLI creates a cluster, it combines configuration values from the following:
n Live input at invocation
n CLI input
n UI input, when deploying a management cluster with the installer
n Environment variables
n ~/.tanzu/tkg/cluster-config.yaml or other file passed to the CLI --file option


n Cluster plan YAML configuration files in ~/.tanzu/tkg/providers, as described in Plan Configuration Files below.
n Other, non-plan YAML configuration files under ~/.tanzu/tkg/providers

Live input applies configuration values that are unique to each invocation, environment variables persist them over a terminal session, and configuration files and overlays persist them indefinitely. You can customize clusters through any of these sources, with recommendations and caveats described below.

See Configuration Precedence Order for how the tanzu CLI derives specific cluster configuration values from these multiple sources where they may conflict.

Plan Configuration Files

The ~/.tanzu/tkg/providers directory contains workload cluster plan configuration files in the following subdirectories, based on the cloud infrastructure that deploys the clusters:

Clusters deployed by...               ~/.tanzu/tkg/providers Directory
Management cluster on vSphere         /infrastructure-vsphere
vSphere 7 Supervisor cluster          /infrastructure-tkg-service-vsphere
Management cluster on Amazon EC2      /infrastructure-aws
Management cluster on Azure           /infrastructure-azure

These plan configuration files are named cluster-template-definition-PLAN.yaml. The configuration values for each plan come from these files and from the files that they list under spec.paths:
n Config files that ship with the tanzu CLI
n Custom files that users create and add to the spec.paths list
n ytt overlays that users create or edit to overwrite values in other configuration files

Files to Edit, Files to Leave Alone

To customize cluster plans via YAML, you edit files under ~/.tanzu/tkg/providers/, but you should avoid changing other files.

Files to Edit

Workload cluster plan configuration file paths follow the form ~/.tanzu/tkg/providers/infrastructure-INFRASTRUCTURE/VERSION/cluster-template-definition-PLAN.yaml, where:
n INFRASTRUCTURE is vsphere, aws, or azure.
n VERSION is the version of the Cluster API Provider module that the configuration uses.
n PLAN is dev, prod, or a custom plan, as created in the New Plan nginx example below.


Each plan configuration file has a spec.paths section that lists source files and ytt directories that configure the cluster plan. For example:

spec:
  paths:
    - path: providers/infrastructure-aws/v0.5.5/ytt
    - path: providers/infrastructure-aws/ytt
    - path: providers/ytt
    - path: bom
      filemark: text-plain
    - path: providers/config_default.yaml

These files are processed in the order listed. If the same configuration field is set in multiple files, the last-processed setting is the one that the tanzu CLI uses.

To customize your cluster configuration, you can:
n Create new configuration files and add them to the spec.paths list.

n This is the easier method.
n Modify existing ytt overlay files as described in ytt Overlays below.

n This is the more powerful method, for people who are comfortable with ytt.

Files to Leave Alone

VMware discourages changing the following files under ~/.tanzu/tkg/providers, except as directed:
n base-template.yaml files, in ytt directories

n These configuration files use values from the Cluster API provider repos for vSphere, AWS, and Azure under Kubernetes SIGs, and other upstream, open-source projects, and they are best kept intact.

n Instead, create new configuration files, or see Clusters and Cluster Plans in Customizing Clusters, Plans, and Extensions with ytt Overlays to set values in the overlay.yaml file in the same ytt directory.
n ~/.tanzu/tkg/providers/config_default.yaml - Append only

n This file contains system-wide defaults for Tanzu Kubernetes Grid on all cloud infrastructures.

n Do not modify existing values in this file, but you can append a User Customizations section at the end.

n Instead of changing values in this file, customize cluster configurations in files that you pass to the --file option of tanzu cluster create and tanzu management-cluster create.
n ~/.tanzu/tkg/providers/config.yaml

n The tanzu CLI uses this file as a reference for all providers present in the /providers directory, and their default versions.


Configuration Precedence Order

When the tanzu CLI creates a cluster, it reads in configuration values from multiple sources that may conflict. It resolves conflicts by using values in the following order of descending precedence:

The processing layers, ordered by descending precedence, are:

8. User-specific data values, from or written to the top-level config file. Example: AZURE_NODE_MACHINE_TYPE: Standard_D2s_v3. The main source of workload (and management) cluster parameters is the file passed to the CLI --file option, which defaults to ~/.tanzu/tkg/cluster-config.yaml.

7. Factory default data values, shipped with the TKG cluster template. Example: config_default.yaml. These are the supported configuration "knobs", with documentation and their default settings where applicable.

6. BOM metadata data values. Example: bom-1.3.1+vmware.1.yaml. One per Kubernetes version released by TKG.

5 (tie). User-provided customizations: customizable ytt processing files. Example: myhacks.yaml. Topmost layer of ytt processing before the Data Values layers; takes precedence over the layers below it.

5 (tie). Additional processing YAMLs, not user-provided. Examples: rm-bastion.yaml, rm-mhc.yaml, custom-resource-annotations.yaml.

4. Add-on YAMLs and customization overlays. Examples: calico.yaml, antrea.yaml. A specific class of customization representing one or more resources to be applied to the cluster post-creation.

3. Plan-specific processing YAMLs. Examples: prod.yaml, dev.yaml. Plan-specific customizations.

2. Overlay YAML. Example: ytt/overlay.yaml. Defines what in the basic template is overridable, using legacy "KEY_NAME: value" style entries.

1. Base cluster template YAML. Example: ytt/base-template.yaml. Base CAPI template with actual default values and no ytt annotations.

ytt Overlays

Tanzu Kubernetes Grid supports customizing workload cluster configurations by adding or modifying configuration files directly, but using ytt overlays instead lets you customize configurations at different scopes and manage multiple, modular configuration files, without destructively editing upstream and inherited configuration values.

For more information, see Clusters and Cluster Plans in Customizing Clusters, Plans, and Extensions with ytt Overlays.

IMPORTANT: You can only use ytt overlays to modify workload clusters. Using ytt overlays to modify management clusters is not supported.

The following examples show how to use configuration overlay files to customize existing workload cluster plans and create a new plan.

For an overlay that customizes cluster certificates, see Trust Custom CA Certificates on Cluster Nodes in the Tanzu Kubernetes Cluster Secrets topic.

Nameservers on vSphere

This example adds one or more custom nameservers to worker and control plane nodes in Tanzu Kubernetes Grid clusters on vSphere. It disables DNS resolution from DHCP so that the custom nameservers take precedence.

Two of the overlay files apply to control plane nodes, and the other two apply to worker nodes. You add all four files into your ~/.tanzu/tkg/providers/infrastructure-vsphere/ytt/ directory.

The last line of each overlay-dns file sets the nameserver addresses. The code below shows a single nameserver, but you can specify multiple nameservers as a list, for example nameservers: ["1.2.3.4","5.6.7.8"].

File vsphere-overlay-dns-control-plane.yaml:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata": {"name": data.values.CLUSTER_NAME+"-control-plane"}}) --- spec: template: spec: network: devices: #@overlay/match by=overlay.all, expects="1+" - #@overlay/match missing_ok=True nameservers: ["8.8.8.8"]


File vsphere-overlay-dhcp-control-plane.yaml:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"}) --- spec: kubeadmConfigSpec: preKubeadmCommands: #! disable dns from being emitted by dhcp client #@overlay/append - echo '[DHCPv4]' >> /etc/systemd/network/10-id0.network #@overlay/append - echo 'UseDNS=no' >> /etc/systemd/network/10-id0.network #@overlay/append - '/usr/bin/systemctl restart systemd-networkd 2>/dev/null'

File vsphere-overlay-dns-workers.yaml:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata": {"name": data.values.CLUSTER_NAME+"-worker"}}) --- spec: template: spec: network: devices: #@overlay/match by=overlay.all, expects="1+" - #@overlay/match missing_ok=True nameservers: ["8.8.8.8"]

File vsphere-overlay-dhcp-workers.yaml:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}) --- spec: template: spec: #@overlay/match missing_ok=True preKubeadmCommands: #! disable dns from being emitted by dhcp client #@overlay/append - echo '[DHCPv4]' >> /etc/systemd/network/10-id0.network #@overlay/append - echo 'UseDNS=no' >> /etc/systemd/network/10-id0.network


      #@overlay/append
      - '/usr/bin/systemctl restart systemd-networkd 2>/dev/null'

Disable Bastion Host on AWS

For an example overlay that disables the Bastion host for workload clusters on AWS, see Disable Bastion Server on AWS in the TKG Lab repository.

New Plan nginx

This example adds and configures a new workload cluster plan nginx that runs an nginx server. It uses the Cluster Resource Set (CRS) to deploy the nginx server to vSphere clusters created with the vSphere Cluster API provider version v0.7.6.

1 In ~/.tanzu/tkg/providers/infrastructure-vsphere/v0.7.6/, add a new file cluster-template-definition-nginx.yaml with contents identical to the cluster-template-definition-dev.yaml and cluster-template-definition-prod.yaml files:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TemplateDefinition
spec:
  paths:
    - path: providers/infrastructure-vsphere/v0.7.6/ytt
    - path: providers/infrastructure-vsphere/ytt
    - path: providers/ytt
    - path: bom
      filemark: text-plain
    - path: providers/config_default.yaml

The presence of this file creates a new plan, and the tanzu CLI parses its filename to create the option nginx to pass to tanzu cluster create --plan.

2 In ~/.tanzu/tkg/providers/ytt/04_user_customizations/, create a new file deploy_service.yaml containing:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data") #@ load("@ytt:yaml", "yaml")

#@ def nginx_deployment(): apiVersion: apps/v1 kind: Deployment metadata: name: nginx-deployment spec: selector: matchLabels: app: nginx replicas: 2 template: metadata: labels: app: nginx

VMware, Inc. 246 VMware Tanzu Kubernetes Grid

spec: containers: - name: nginx image: nginx:1.14.2 ports: - containerPort: 80 #@ end

#@ if data.values.TKG_CLUSTER_ROLE == "workload" and data.values.CLUSTER_PLAN == "nginx":

--- apiVersion: addons.cluster.x-k8s.io/v1alpha3 kind: ClusterResourceSet metadata: name: #@ "{}-nginx-deployment".format(data.values.CLUSTER_NAME) labels: cluster.x-k8s.io/cluster-name: #@ data.values.CLUSTER_NAME spec: strategy: "ApplyOnce" clusterSelector: matchLabels: tkg.tanzu.vmware.com/cluster-name: #@ data.values.CLUSTER_NAME resources: - name: #@ "{}-nginx-deployment".format(data.values.CLUSTER_NAME) kind: ConfigMap --- apiVersion: v1 kind: ConfigMap metadata: name: #@ "{}-nginx-deployment".format(data.values.CLUSTER_NAME) type: addons.cluster.x-k8s.io/resource-set stringData: value: #@ yaml.encode(nginx_deployment())

#@ end

In this file, the conditional #@ if data.values.TKG_CLUSTER_ROLE == "workload" and data.values.CLUSTER_PLAN == "nginx": applies the overlay that follows to workload clusters with the plan nginx.

If the 04_user_customizations directory does not already exist under the top-level ytt directory, create it.
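With the plan definition and overlay in place, you can create a workload cluster that uses the new plan by passing its name to the --plan option. For example, with an illustrative cluster name:

tanzu cluster create my-nginx-cluster --plan nginx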

Managing Cluster Lifecycles

After you have deployed management clusters and Tanzu Kubernetes clusters, you can manage the lifecycles and update your clusters by using a combination of kubectl and the Tanzu CLI.

This chapter includes the following topics:

n Manage Your Management Clusters
n Managing Participation in CEIP
n Enable Identity Management After Management Cluster Deployment
n Connect to and Examine Tanzu Kubernetes Clusters
n Scale Tanzu Kubernetes Clusters
n Update and Troubleshoot Core Add-On Configuration
n Tanzu Kubernetes Cluster Secrets
n Configure Machine Health Checks for Tanzu Kubernetes Clusters
n Back Up and Restore Clusters
n Delete Tanzu Kubernetes Clusters


Manage Your Management Clusters

This topic explains how to manage multiple management clusters from the same bootstrap machine, including management clusters deployed by Tanzu Kubernetes Grid to vSphere, Azure, or Amazon EC2 and vSphere with Tanzu Supervisor Clusters designated as Tanzu Kubernetes Grid management clusters.

List Management Clusters and Change Context

To list available management clusters and see which one you are currently logged in to, run tanzu login on your bootstrap machine:

n To change your current login context, use your up- and down-arrow keys to highlight the new management cluster and then press Enter.
n To retain your current context, press Enter without changing the highlighting.

For example, if you have two management clusters, my-vsphere-mgmt-cluster and my-aws-mgmt-cluster, and you are currently logged in to my-vsphere-mgmt-cluster:

$ tanzu login
? Select a server  [Use arrows to move, type to filter]
> my-vsphere-mgmt-cluster  ()
  my-aws-mgmt-cluster      ()
  + new server

See Management Cluster Details

To see the details of a management cluster:

1 Run tanzu login to log in to the management cluster, as described in List Management Clusters and Change Context.

2 Run tanzu management-cluster get. For example:

$ tanzu management-cluster get
  NAME         NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
  mc-test-cli  tkg-system  running  1/1           1/1      v1.20.1+vmware.2  management

Details:

NAME                                                            READY  SEVERITY  REASON  SINCE  MESSAGE
/mc-test-cli                                                    True                     29m
├─ClusterInfrastructure - AzureCluster/mc-test-cli              True                     30m
├─ControlPlane - KubeadmControlPlane/mc-test-cli-control-plane  True                     29m
│ └─Machine/mc-test-cli-control-plane-htlc4                     True                     30m
└─Workers
  └─MachineDeployment/mc-test-cli-md-0
    └─Machine/mc-test-cli-md-0-699df4dc76-9kgmw                 True                     30m

Providers:


NAMESPACE                          NAME                   TYPE                    PROVIDERNAME  VERSION  WATCHNAMESPACE
capi-kubeadm-bootstrap-system      bootstrap-kubeadm      BootstrapProvider       kubeadm       v0.3.14
capi-kubeadm-control-plane-system  control-plane-kubeadm  ControlPlaneProvider    kubeadm       v0.3.14
capi-system                        cluster-api            CoreProvider            cluster-api   v0.3.14
capz-system                        infrastructure-azure   InfrastructureProvider  azure         v0.4.8

To see more options, run tanzu management-cluster get --help.

Management Clusters, kubectl, and kubeconfig

Tanzu Kubernetes Grid does not automatically change the kubectl context when you run tanzu login to change the tanzu CLI context. Also, Tanzu Kubernetes Grid does not set the kubectl context to a workload cluster when you create it. To change the kubectl context, use the kubectl config use-context command.

By default, Tanzu Kubernetes Grid saves cluster context information in the following files on your bootstrap machine:

n Management cluster contexts: ~/.kube-tkg/config
n Workload cluster contexts: ~/.kube/config
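For example, to point kubectl at a management cluster context that is stored in ~/.kube-tkg/config, you can run a command similar to the following; the context name shown is illustrative:

kubectl config use-context my-vsphere-mgmt-cluster-admin@my-vsphere-mgmt-cluster --kubeconfig ~/.kube-tkg/config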

Management Clusters and Their Configuration Files

When you run tanzu management-cluster create for the first time, it creates the ~/.tanzu/tkg subfolder that contains the Tanzu Kubernetes Grid configuration files. To deploy your first management cluster, you must specify the --ui or --file option with tanzu management-cluster create:

n tanzu management-cluster create --ui creates a management cluster with the installer interface and saves the settings from your installer input into a cluster configuration file ~/.tanzu/tkg/clusterconfigs/UNIQUE-ID.yaml, where UNIQUE-ID is a generated filename.
n tanzu management-cluster create --file creates a management cluster using an existing cluster configuration file. The --file option applies to cluster configuration files only and does not change where the tanzu CLI references other files under ~/.tanzu/tkg.
n tanzu management-cluster create with neither the --ui nor --file option creates a management cluster using the default cluster configuration file ~/.tanzu/tkg/cluster-config.yaml.

The recommended practice is to use a dedicated configuration file for every management cluster that you deploy.

For more information about configuration files in Tanzu Kubernetes Grid, see What Happens When You Create a Management Cluster.


Add Existing Management Clusters to Your Tanzu CLI

The Tanzu CLI allows you to log in to a management cluster that someone else created. To log in, you can use the local kubeconfig details or the server endpoint option.

To log into an existing management cluster by using a local kubeconfig:

1 Run tanzu login, use your down-arrow key to highlight + new server, and press Enter.

tanzu login
? Select a server + new server

2 When prompted, select Local kubeconfig as your login type and enter the path to your local kubeconfig file, context, and the name of your server. For example:

tanzu login
? Select a server + new server
? Select login type Local kubeconfig
? Enter path to kubeconfig (if any) /Users/exampleuser/examples/kubeconfig
? Enter kube context to use new-mgmt-cluster-admin@new-mgmt-cluster
? Give the server a name new-mgmt-cluster
✔  successfully logged in to management cluster using the kubeconfig new-mgmt-cluster

To log into an existing management cluster using the Server endpoint option:

1 Run tanzu login, use your down-arrow key to highlight + new server, and press Enter.

tanzu login
? Select a server + new server

2 When prompted, select Server endpoint as your login type.

3 In the Enter Server endpoint field, enter the Kubernetes API Server IP address of the management cluster.

4 In the Give the server a name field, enter a name for the server, and press Enter.

5 If identity management is enabled on the management cluster, in the Okta login page that opens in the default browser, enter your Okta credentials. You are logged in to the management cluster.

6 Verify that the following output is displayed on the Tanzu CLI:

successfully logged in to management cluster by using the kubeconfig

Alternatively, you can run tanzu login with the --server, --kubeconfig, and --context options and bypass the interactive prompts.
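For example, a non-interactive login that adds the same management cluster as in the interactive session above might look like the following sketch; the path and names are illustrative, and you can run tanzu login --help to confirm the available options:

tanzu login --kubeconfig /Users/exampleuser/examples/kubeconfig --context new-mgmt-cluster-admin@new-mgmt-cluster --name new-mgmt-cluster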


Delete Management Clusters from Your Tanzu CLI Configuration

You might add a management cluster that someone else created to your instance of the Tanzu CLI and later no longer require it. Similarly, if you deployed a management cluster and it has since been deleted from your infrastructure provider by means other than running tanzu management-cluster delete, that management cluster continues to appear in the list of management clusters that the CLI tracks when you run tanzu login. In these cases, you can remove the management cluster from the list of management clusters that the Tanzu CLI tracks.

1 Run tanzu config server list to see the list of management clusters that the Tanzu CLI tracks.

tanzu config server list

You should see all of the management clusters that you have either deployed yourself or added to the Tanzu CLI, the location of their kubeconfig files, and their contexts.

2 Run the tanzu config server delete command to remove a management cluster.

tanzu config server delete my-vsphere-mc

Running the tanzu config server delete command removes the cluster details from the ~/.tanzu/config.yaml and ~/.kube-tkg/config.yaml files. It does not delete the management cluster itself, if it still exists. To delete a management cluster rather than just remove it from the Tanzu CLI configuration, see Delete Management Clusters.

Scale Management Clusters

After you deploy a management cluster, you can scale it up or down by increasing or reducing the number of node VMs that it contains. To scale a management cluster, use the tanzu cluster scale command with one or both of the following options:

n --controlplane-machine-count changes the number of management cluster control plane nodes.
n --worker-machine-count changes the number of management cluster worker nodes.

Because management clusters run in the tkg-system namespace rather than the default namespace, you must also specify the --namespace option when you scale a management cluster.

1 Run tanzu login before you run tanzu cluster scale to make sure that the management cluster to scale is the current context of the Tanzu CLI.

2 To scale a production management cluster that you originally deployed with 3 control plane nodes and 5 worker nodes to 5 and 10 nodes respectively, run the following command:

tanzu cluster scale MANAGEMENT-CLUSTER-NAME --controlplane-machine-count 5 --worker-machine-count 10 --namespace tkg-system


If you initially deployed a development management cluster with one control plane node and you scale it up to 3 control plane nodes, Tanzu Kubernetes Grid automatically enables stacked HA on the control plane.

IMPORTANT: Do not change context or edit the .kube-tkg/config file while Tanzu Kubernetes Grid operations are running.

Update Management Cluster Credentials (vSphere)

To update the vSphere credentials used by a management cluster, and optionally all of the workload clusters that it manages, see Update Management and Workload Cluster Credentials.

Manage Participation in CEIP

When you deploy a management cluster by using either the installer interface or the CLI, participation in the VMware Customer Experience Improvement Program (CEIP) is enabled by default, unless you specify the option to opt out. If you remain opted in to the program, the management cluster sends information about how you use Tanzu Kubernetes Grid back to VMware at regular intervals, so that we can make improvements in future versions.

For more information about the CEIP, see Managing Participation in CEIP.

If you opted out of the CEIP when you deployed a management cluster and want to opt in, or if you opted in and want to opt out, see Opt In or Opt Out of the VMware CEIP in Managing Participation in CEIP to change your CEIP participation setting after deployment.

Create Namespaces in the Management Cluster

To help you to organize and manage your development projects, you can optionally divide the management cluster into Kubernetes namespaces. You can then use Tanzu CLI to deploy Tanzu Kubernetes clusters to specific namespaces in your management cluster. For example, you might want to create different types of clusters in dedicated namespaces. If you do not create additional namespaces, Tanzu Kubernetes Grid creates all Tanzu Kubernetes clusters in the default namespace. For information about Kubernetes namespaces, see the Kubernetes documentation.

1 Make sure that kubectl is connected to the correct management cluster context by displaying the current context.

kubectl config current-context

2 List the namespaces that are currently present in the management cluster.

kubectl get namespaces

You will see that the management cluster already includes several namespaces for the different services that it provides:

capi-kubeadm-bootstrap-system       Active   4m7s
capi-kubeadm-control-plane-system   Active   4m5s
capi-system                         Active   4m11s
capi-webhook-system                 Active   4m13s
capv-system                         Active   3m59s
cert-manager                        Active   6m56s
default                             Active   7m11s
kube-node-lease                     Active   7m12s
kube-public                         Active   7m12s
kube-system                         Active   7m12s
tkg-system                          Active   3m57s

3 Use kubectl create -f to create new namespaces, for example for development and production.

These examples use the production and development namespaces from the Kubernetes documentation.

kubectl create -f https://k8s.io/examples/admin/namespace-dev.json
kubectl create -f https://k8s.io/examples/admin/namespace-prod.json

4 Run kubectl get namespaces --show-labels to see the new namespaces.

development   Active   22m   name=development
production    Active   22m   name=production

Delete Management Clusters

To delete a management cluster, run the tanzu management-cluster delete command.

When you run tanzu management-cluster delete, Tanzu Kubernetes Grid creates a temporary kind cleanup cluster on your bootstrap machine to manage the deletion process. The kind cluster is removed when the deletion process completes.

1 To see all your management clusters, run tanzu login as described in List Management Clusters and Change Context.

2 If there are management clusters that you no longer require, run tanzu management-cluster delete.

You must be logged in to the management cluster that you want to delete.

tanzu management-cluster delete my-aws-mgmt-cluster

To skip the yes/no verification step when you run tanzu management-cluster delete, specify the --yes option.

tanzu management-cluster delete my-aws-mgmt-cluster --yes

3 If there are Tanzu Kubernetes clusters running in the management cluster, the delete operation is not performed.


In this case, you can delete the management cluster in two ways:

n Run tanzu cluster delete to delete all of the running clusters and then run tanzu management-cluster delete again.

n Run tanzu management-cluster delete with the --force option.

tanzu management-cluster delete my-aws-mgmt-cluster --force

IMPORTANT: Do not change context or edit the .kube-tkg/config file while Tanzu Kubernetes Grid operations are running.

What to Do Next

You can use Tanzu Kubernetes Grid to start deploying Tanzu Kubernetes clusters to different Tanzu Kubernetes Grid instances. For information, see Chapter 5 Deploying Tanzu Kubernetes Clusters.

If you have vSphere 7, you can also deploy and manage Tanzu Kubernetes clusters in vSphere with Tanzu. For information, see Use the Tanzu CLI with a vSphere with Tanzu Supervisor Cluster.

Managing Participation in CEIP

VMware's Customer Experience Improvement Program (CEIP) is a voluntary program that collects information about how people use our products.

The data collected may include device identifiers and information that identifies your users. This data is collected to enable VMware to diagnose and improve its products and services, fix product issues, provide proactive technical support, and to advise you on how to best deploy and use our products.

When you deploy a management cluster by using either the installer interface or the CLI, participation in the VMware Customer Experience Improvement Program (CEIP) is enabled by default, unless you specify the option to opt out. If you remain opted in to the program, the management cluster sends the data described below to VMware at regular intervals.

If you opt in to CEIP, management clusters send the following information to VMware:

n The number of Tanzu Kubernetes clusters that you deploy.
n The infrastructure, network, and storage providers that you use.
n The time that it takes for the tanzu CLI to perform basic operations such as cluster create, cluster delete, cluster scale, and cluster upgrade.
n The Tanzu Kubernetes Grid extensions that you implement.
n The plans that you use to deploy clusters, as well as the number and configuration of the control plane and worker nodes.
n The versions of Tanzu Kubernetes Grid and Kubernetes that you use.
n The type and size of the workloads that your clusters run, as well as their lifespan.
n Whether or not you integrate Tanzu Kubernetes Grid with Tanzu Kubernetes Grid Service for vSphere, Tanzu Mission Control, or Tanzu Observability by Wavefront.
n The nature of any problems, errors, and failures that you encounter when using Tanzu Kubernetes Grid, so that we can identify which areas of Tanzu Kubernetes Grid need to be made more robust.

Opt In or Opt Out of the VMware CEIP

If you opted out of the CEIP when you deployed a management cluster and want to opt in, or if you opted in and want to opt out, you can change your CEIP participation setting after deployment.

CEIP runs as a cronjob on the management cluster. It does not run on workload clusters.

1 Run tanzu login, as described in List Management Clusters and Change Context, to log in to the management cluster for which you want to see or set the CEIP status.

2 Run the tanzu management-cluster ceip-participation get command to see the CEIP status of the current management cluster.

tanzu management-cluster ceip-participation get

The status Opt-in means that CEIP participation is enabled on a management cluster. Opt-out means that CEIP participation is disabled.

MANAGEMENT-CLUSTER-NAME  CEIP-STATUS
my-aws-mgmt-cluster      Opt-out

3 To enable CEIP participation on a management cluster on which it is currently disabled, run the tanzu management-cluster ceip-participation set command with the value true.

tanzu management-cluster ceip-participation set true

4 To verify that the CEIP participation is now active, run tanzu management-cluster ceip-participation get again.

The status should now be Opt-in.

MANAGEMENT-CLUSTER-NAME  CEIP-STATUS
my-aws-mgmt-cluster      Opt-in


You can also check that the CEIP cronjob is running by setting the kubectl context to the management cluster and running kubectl get cronjobs -A. For example:

kubectl config use-context my-aws-mgmt-cluster-admin@my-aws-mgmt-cluster

kubectl get cronjobs -A

The output shows that the tkg-telemetry job is running:

NAMESPACE              NAME            SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
tkg-system-telemetry   tkg-telemetry   0 */6 * * *   False     0                        18s

5 To disable CEIP participation on a management cluster on which it is currently enabled, run the tanzu management-cluster ceip-participation set command with the value false.

tanzu management-cluster ceip-participation set false

6 To verify that the CEIP participation is disabled, run tanzu management-cluster ceip-participation get again.

The status should now be Opt-out.

MANAGEMENT-CLUSTER-NAME  CEIP-STATUS
my-aws-mgmt-cluster      Opt-out

If you run kubectl get cronjobs -A again, the output shows that no job is running:

No resources found

Add Entitlement Account Number and Environment Type to Telemetry Profile

Platform operators can use the Tanzu CLI to add an Entitlement Account Number (EAN) and environment type to a telemetry profile.

The EAN is a unique nine-digit number associated with an account. Adding an EAN to a telemetry profile allows all the information collected by CEIP to be associated with that account and allows your account team to create reports for the account.

VMware recommends that you use your EAN for all product and support interactions. If you do not provide an EAN, a new EAN may be created for the interaction.

To add an EAN and environment type to a telemetry profile:

1 Identify the Entitlement Account Number

2 Update the Management Cluster


Identify the Entitlement Account Number

If you do not know the EAN, use one of the following methods to find it.

n Find the EAN from Customer Connect
n Find the EAN from the Partner Connect Portal

Find the EAN from Customer Connect

In a web browser, navigate to VMware Customer Connect and log in. If you are a new user, register to create a Customer Connect profile. For more information about creating a Customer Connect profile, see How to create a Customer Connect profile in the VMware knowledge base.

Find the EAN from Customer Connect in one of the following ways:

n From Account Summary:

a On the Home page, click the Manage Accounts quick link.

b Select Accounts > Account Summary.

c On the Account Summary page, locate the account and record the EAN.

n From License Keys:

a In the top menu bar, click Accounts > License Keys.


b On the License Keys page, locate the account and record the EAN.

Find the EAN from the Partner Connect Portal

Find the EAN from Partner Connect:

1 In a web browser, navigate to the VMware Partner Portal and log in.

If you are a new user, register with Partner Connect. For more information about registering, see How to register with Partner Connect in the VMware knowledge base.

2 In the top menu bar, click Incentives > Advantage Plus and select Entitlement Account Lookup.


3 Update the Customer Name and Country fields, then click Search. In the results, locate the account and record the EAN.

Update the Management Cluster

Add the EAN and environment type to the telemetry profile, then confirm the CEIP status.

1 To add the EAN and environment type to the telemetry profile, run:

tanzu management-cluster ceip-participation set true --labels=entitlement-account-number="MY-EAN",env-type="MY-ENVIRONMENT"

Where:

n MY-EAN is the Entitlement Account Number.

n MY-ENVIRONMENT is the environment type of the specified management cluster. Accepted values are production, development, and test. An example with sample values follows.
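For example, with illustrative values for the EAN and the environment type:

tanzu management-cluster ceip-participation set true --labels=entitlement-account-number="123456789",env-type="production"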


2 To verify that the CEIP participation is now active, run:

tanzu management-cluster ceip-participation get

3 Confirm that the output from this command shows the status as Opt-in. For example:

MANAGEMENT-CLUSTER-NAME  CEIP-STATUS
my-aws-mgmt-cluster      Opt-in

Enable Identity Management After Management Cluster Deployment

This topic describes how to enable identity management in Tanzu Kubernetes Grid as a post-deployment step.

Overview

If you did not configure identity management when you deployed your management cluster, you can enable it as a post-deployment step by following the instructions below:

1 Obtain your identity provider details.

2 Generate a Kubernetes secret for the Pinniped add-on.

3 Check the status of the identity management service.

4 (OIDC only) Provide the callback URI to your OIDC provider.

5 Generate a non-admin kubeconfig.

6 Create role bindings for your management cluster users.

7 Enable identity management in workload clusters.

You must enable identity management in the management cluster first and then in each Tanzu Kubernetes (workload) cluster that it manages.

Obtain Your Identity Provider Details

Before you can enable identity management, you must have an identity provider. Tanzu Kubernetes Grid supports LDAPS and OIDC identity providers. For more information, see Obtain Your Identity Provider Details in Enabling Identity Management in Tanzu Kubernetes Grid and then return to this topic.

Generate a Kubernetes Secret for the Pinniped Add-on

This procedure configures the Pinniped add-on in your management cluster.


To generate a Kubernetes secret for the Pinniped add-on:

1 Create a cluster configuration file using the configuration settings that you defined when you deployed your management cluster. Include the following configuration variables in the file:

n Basic cluster variables.

# This is the name of your target management cluster.
CLUSTER_NAME:
# For the management cluster, the default namespace is "tkg-system".
NAMESPACE:
CLUSTER_PLAN:
CLUSTER_CIDR:
SERVICE_CIDR:

n vSphere-, AWS-, or Azure-specific variables that you set when you deployed your management cluster. For information about these variables, see Tanzu CLI Configuration File Variable Reference.

For example:

VSPHERE_SERVER:
VSPHERE_DATACENTER:
VSPHERE_RESOURCE_POOL:
VSPHERE_DATASTORE:
VSPHERE_FOLDER:
VSPHERE_NETWORK:
VSPHERE_SSH_AUTHORIZED_KEY:
VSPHERE_TLS_THUMBPRINT:
VSPHERE_INSECURE:
VSPHERE_USERNAME:
VSPHERE_PASSWORD:
VSPHERE_CONTROL_PLANE_ENDPOINT:

n OIDC or LDAP identity provider details.

Note: Set these variables only for the management cluster. You do not need to set them for your workload clusters.

# Identity management type. This must be "oidc" or "ldap".

IDENTITY_MANAGEMENT_TYPE:

# Set these variables if you want to configure OIDC.

CERT_DURATION: 2160h
CERT_RENEW_BEFORE: 360h
OIDC_IDENTITY_PROVIDER_ISSUER_URL:
OIDC_IDENTITY_PROVIDER_CLIENT_ID:
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
OIDC_IDENTITY_PROVIDER_SCOPES: "email,profile,groups"
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM:
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM:


# Set these variables if you want to configure LDAP.

LDAP_BIND_DN:
LDAP_BIND_PASSWORD:
LDAP_HOST:
LDAP_USER_SEARCH_BASE_DN:
LDAP_USER_SEARCH_FILTER:
LDAP_USER_SEARCH_USERNAME: userPrincipalName
LDAP_USER_SEARCH_ID_ATTRIBUTE: DN
LDAP_USER_SEARCH_EMAIL_ATTRIBUTE: DN
LDAP_USER_SEARCH_NAME_ATTRIBUTE:
LDAP_GROUP_SEARCH_BASE_DN:
LDAP_GROUP_SEARCH_FILTER:
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE:
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_ROOT_CA_DATA_B64:

For more information, see Variables for Configuring Identity Providers - OIDC and Variables for Configuring Identity Providers - LDAP.

2 Set the _TKG_CLUSTER_FORCE_ROLE environment variable to management.

export _TKG_CLUSTER_FORCE_ROLE="management"

On Windows, use the SET command.
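For example, in a Windows command prompt:

SET _TKG_CLUSTER_FORCE_ROLE=management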

3 Set the FILTER_BY_ADDON_TYPE environment variable to authentication/pinniped.

export FILTER_BY_ADDON_TYPE="authentication/pinniped"

4 Generate a manifest file for the cluster.

tanzu cluster create CLUSTER-NAME --dry-run -f CLUSTER-CONFIG-FILE > CLUSTER-NAME-example-secret.yaml

Where:

n CLUSTER-NAME is the name of your target management cluster.

n CLUSTER-CONFIG-FILE is the configuration file that you created above.

The resulting manifest contains only the secret for the Pinniped add-on.

5 Review the secret and then apply it to the cluster. For example:

kubectl apply -f CLUSTER-NAME-example-secret.yaml

6 After applying the secret, check the status of the Pinniped add-on by running the kubectl get app command.

$ kubectl get app pinniped -n tkg-system
NAME       DESCRIPTION           SINCE-DEPLOY   AGE
pinniped   Reconcile succeeded   3m23s          7h50m


If the returned status is Reconcile failed, run the following command to get details on the failure.

kubectl get app pinniped -n tkg-system -o yaml

For more information about troubleshooting the Pinniped add-on, see Troubleshooting Core Add-on Configuration in Update and Troubleshoot Core Add-On Configuration.

Check the Status of the Identity Management Service

Confirm that the Pinniped service is running correctly. To check the status of the Pinniped service:

n If you are configuring an OIDC identity provider, follow the instructions in Check the Status of an OIDC Identity Management Service and then return to this topic.
n If you are configuring an LDAP identity provider, follow the instructions in Check the Status of an LDAP Identity Management Service and then return to this topic.

(OIDC Only) Provide the Callback URI to the OIDC Provider

If you are configuring your management cluster to use OIDC authentication, you must provide the callback URI for the management cluster to your OIDC identity provider. To configure the callback URI, follow the instructions in Provide the Callback URI to the OIDC Provider and then return to this topic.

Generate a Non-Admin kubeconfig

To allow authenticated users to connect to the management cluster, generate a non-admin kubeconfig. To generate the non-admin kubeconfig file, follow the instructions in Generate a kubeconfig to Allow Authenticated Users to Connect to the Management Cluster and then return to this topic.

Create Role Bindings for Your Management Cluster Users

To complete the identity management configuration of the management cluster, you must create role bindings for the users who use the kubeconfig that you generated in the above step. To create a role binding, follow the instructions in Create a Role Binding on the Management Cluster and then return to this topic.

Enable Identity Management in Workload Clusters

Any workload clusters that you create after you enable identity management in the management cluster are automatically configured to use the same identity management service. If a workload cluster was created before you enabled identity management in your management cluster, you must enable it manually.


To enable identity management in a workload cluster:

1 Generate a Kubernetes secret for the Pinniped add-on.

a Create a cluster configuration file using the configuration settings that you defined when you deployed your workload cluster. Include the following variables:

n Basic cluster variables.

# This is the name of your target workload cluster.
CLUSTER_NAME:
# For workload clusters, the default namespace is "default".
NAMESPACE:
CLUSTER_PLAN:
CLUSTER_CIDR:
SERVICE_CIDR:

n vSphere-, AWS-, or Azure-specific variables that you set when you deployed your workload cluster. For information about these variables, see Tanzu CLI Configuration File Variable Reference.

n Supervisor issuer URL and CA bundle data.

# This is the Pinniped supervisor service endpoint in the management cluster.
SUPERVISOR_ISSUER_URL:

# Pinniped uses this b64-encoded CA bundle data for communication between the management cluster and the workload cluster.
SUPERVISOR_ISSUER_CA_BUNDLE_DATA_B64:

You can retrieve these values by running kubectl get configmap pinniped-info -n kube-public -o yaml against the management cluster.
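For example, assuming that the ConfigMap stores these values under data keys named issuer and issuer_ca_bundle_data, which you can confirm by inspecting the full YAML output, you can extract each value individually:

kubectl get configmap pinniped-info -n kube-public -o jsonpath='{.data.issuer}'
kubectl get configmap pinniped-info -n kube-public -o jsonpath='{.data.issuer_ca_bundle_data}'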

b Set the _TKG_CLUSTER_FORCE_ROLE environment variable to workload.

export _TKG_CLUSTER_FORCE_ROLE="workload"

c Set the FILTER_BY_ADDON_TYPE environment variable to authentication/pinniped.

export FILTER_BY_ADDON_TYPE="authentication/pinniped"

d Generate the secret for the Pinniped add-on.

tanzu cluster create CLUSTER-NAME --dry-run -f CLUSTER-CONFIG-FILE > CLUSTER-NAME-example-secret.yaml

2 Review the secret and apply it to the management cluster.


3 After applying the secret, check the status of the Pinniped add-on by running the kubectl get app command against the workload cluster.

$ kubectl get app pinniped -n tkg-system
NAME       DESCRIPTION           SINCE-DEPLOY   AGE
pinniped   Reconcile succeeded   3m23s          7h50m

4 Configure role-based access control on the workload cluster by following the instructions in Authenticate Connections to a Workload Cluster.

Connect to and Examine Tanzu Kubernetes Clusters

After you have deployed Tanzu Kubernetes clusters, you use the tanzu cluster list and tanzu cluster kubeconfig get commands to obtain the list of running clusters and their credentials. Then, you can connect to the clusters by using kubectl and start working with your clusters.

Obtain Lists of Deployed Tanzu Kubernetes Clusters

To see lists of Tanzu Kubernetes clusters and the management clusters that manage them, use the tanzu cluster list command.

n To list all of the Tanzu Kubernetes clusters that are running in the default namespace of this management cluster, run the tanzu cluster list command.

tanzu cluster list

The output lists all of the Tanzu Kubernetes clusters that are managed by the management cluster, including each cluster's name, the namespace in which it is running, its current status, the numbers of actual and requested control plane and worker nodes, and the Kubernetes version that the cluster is running.

NAME              NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
vsphere-cluster   default    running  1/1           1/1      v1.20.5+vmware.1
vsphere-cluster2  default    running  1/1           1/1      v1.20.5+vmware.1
my-vsphere-tkc    default    running  1/1           1/1      v1.20.5+vmware.1

Clusters can be in the following states:

n creating: The control plane is being created

n createStalled: The process of creating the control plane has stalled

n deleting: The cluster is in the process of being deleted

n failed: The creation of the control plane has failed

n running: The control plane has initialized fully

n updating: The cluster is in the process of rolling out an update or is scaling nodes

n updateFailed: The cluster update process failed

n updateStalled: The cluster update process has stalled


n No status: The creation of the cluster has not started yet

If a cluster is in a stalled state, check that there is network connectivity to the external registry, make sure that there are sufficient resources on the target platform for the operation to complete, and ensure that DHCP is issuing IPv4 addresses correctly.

n To list only those clusters that are running in a given namespace, specify the --namespace option.

tanzu cluster list --namespace=my-namespace

n To include the current management cluster in the output of tanzu cluster list, specify the --include-management-cluster option.

tanzu cluster list --include-management-cluster

You can see that the management cluster is running in the tkg-system namespace and has the management role.

NAME                  NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES
vsphere-cluster       default     running  1/1           1/1      v1.19.1+vmware.2
vsphere-cluster2      default     running  3/3           3/3      v1.19.1+vmware.2
vsphere-mgmt-cluster  tkg-system  running  1/1           1/1      v1.19.1+vmware.2  management

n To see all of the management clusters and change the context of the Tanzu CLI to a different management cluster, run the tanzu login command. See List Management Clusters and Change Context for more information.

Export Tanzu Kubernetes Cluster Details to a File

You can export the details of the clusters that are managed by a management cluster in either JSON or YAML format. You can save the JSON or YAML to a file so that you can use it in scripts to run bulk operations on clusters.

1 To export cluster details as JSON, run tanzu cluster list with the --output option, specifying json.

tanzu cluster list --output json

The output shows the cluster information as JSON:

[ { "name": "vsphere-cluster", "namespace": "default", "status": "running", "plan": "", "controlplane": "1/1", "workers": "1/1", "kubernetes": "v1.19.1+vmware.2", "roles": [] },

VMware, Inc. 267 VMware Tanzu Kubernetes Grid

{ "name": "vsphere-cluster2", "namespace": "default", "status": "running", "plan": "", "controlplane": "3/3", "workers": "3/3", "kubernetes": "v1.19.1+vmware.2", "roles": [] } ]

2 To export cluster details as YAML, run tanzu cluster list with the --output option, specifying yaml.

tanzu cluster list --output yaml

The output shows the cluster information as YAML:

- name: vsphere-cluster
  namespace: default
  status: running
  plan: ""
  controlplane: 1/1
  workers: 1/1
  kubernetes: v1.19.1+vmware.2
  roles: []
- name: vsphere-cluster2
  namespace: default
  status: running
  plan: ""
  controlplane: 3/3
  workers: 3/3
  kubernetes: v1.19.1+vmware.2
  roles: []

3 Save the output as a file.

tanzu cluster list --output json > clusters.json

tanzu cluster list --output yaml > clusters.yaml
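For example, a shell script can read the exported JSON and run a command against each cluster. The following sketch assumes that the jq utility is installed and exports a kubeconfig file for every cluster listed in clusters.json:

for cluster in $(jq -r '.[].name' clusters.json); do
  tanzu cluster kubeconfig get "$cluster" --export-file "${cluster}-credentials"
done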

For how to save the details of multiple management clusters, including their context and kubeconfig files, see Save Management Cluster Details to a File.

Retrieve Tanzu Kubernetes Cluster kubeconfig

After you create a Tanzu Kubernetes cluster, you can obtain its cluster, context, and user kubeconfig settings by running the tanzu cluster kubeconfig get command, specifying the name of the cluster.

By default, the command adds the cluster's kubeconfig settings to your current kubeconfig file.


To generate a standalone administrator kubeconfig file with embedded credentials, add the --admin option. This kubeconfig file grants its user full access to the cluster's resources and lets them access the cluster without logging in to an identity provider.

IMPORTANT: If identity management is not configured on the cluster, you must specify the --admin option.

tanzu cluster kubeconfig get my-cluster --admin

You should see the following output:

You can now access the cluster by running 'kubectl config use-context my-cluster-admin@my-cluster'

If identity management is enabled on a cluster, you can generate a regular kubeconfig that requires the user to authenticate with your external identity provider, and grants them access to cluster resources based on their assigned roles. In this case, run tanzu cluster kubeconfig get without the --admin option.

tanzu cluster kubeconfig get my-cluster

You should see the following output:

You can now access the cluster by running 'kubectl config use-context tanzu-cli-my-cluster@my-cluster'

If the cluster is running in a namespace other than the default namespace, you must specify the --namespace option to get the credentials of that cluster.

tanzu cluster kubeconfig get my-cluster --namespace=my-namespace

To save the configuration information in a standalone kubeconfig file, for example to distribute it to developers, specify the --export-file option. This kubeconfig file requires the user to authenticate with an external identity provider, and grants access to cluster resources based on their assigned roles.

tanzu cluster kubeconfig get my-cluster --export-file my-cluster-credentials

IMPORTANT: By default, unless you specify the --export-file option to save the kubeconfig for a cluster to a specific file, the credentials for all clusters that you deploy from the Tanzu CLI are added to a shared kubeconfig file. If you delete the shared kubeconfig file, all clusters become unusable.

To retrieve a kubeconfig for a management cluster, run tanzu management-cluster kubeconfig get as described in Retrieve Management Cluster kubeconfig.


Authenticate Connections to a Workload Cluster

If you deployed the management cluster with identity management enabled or enabled identity management on the management cluster as a post-deployment step, any workload clusters that you create from your management cluster are automatically configured to use the same identity management service. When you provide users with the admin kubeconfig for a management cluster or workload cluster, they have full access to the cluster and do not need to be authenticated. However, if you provide users with the regular kubeconfig, they must have a user account in your OIDC or LDAP identity provider and you must configure Role-Based Access Control (RBAC) on the cluster to grant access permissions to the designated user.

The authentication process requires a browser to be present on the machine from which users connect to clusters, because running kubectl commands automatically opens the IDP login page so that users can log in to the cluster. If the machine on which you are running tanzu and kubectl commands does not have a browser, see Authenticate Users on a Machine Without a Browser below.

To authenticate users on a workload cluster on which identity management is enabled, perform the following steps.

1 Obtain the regular kubeconfig for the workload cluster and export it to a file.

This example exports the kubeconfig for the cluster my-cluster to the file my-cluster-credentials.

tanzu cluster kubeconfig get my-cluster --export-file my-cluster-credentials

2 Use the generated file to attempt to run an operation on the cluster.

For example, run:

kubectl get pods -A --kubeconfig my-cluster-credentials

You should be redirected to the log in page for your identity provider.

After successfully logging in with a user account from your identity provider, if you already configured a role binding on the cluster for the authenticated user, the output shows the pod information.

If you have not configured a role binding on the cluster, you see the message Error from server (Forbidden): pods is forbidden: User "" cannot list resource "pods" in API group "" at the cluster scope. This happens because this user does not have any permissions on the cluster yet. To authorize the user to access the cluster resources, you must Configure a Role Binding on the cluster.


Authenticate Users on a Machine Without a Browser

If the machine on which you are running tanzu and kubectl commands does not have a browser, you can skip the automatic opening of a browser during the authentication process.

1 If it is not set already, set the TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true environment variable.

This adds the --skip-browser option to the kubeconfig for the cluster.

export TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true

On Windows systems, use the SET command instead of export.
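For example, in a Windows command prompt:

SET TANZU_CLI_PINNIPED_AUTH_LOGIN_SKIP_BROWSER=true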

2 Export the regular kubeconfig for the cluster to the local file my-cluster-credentials.

Note that the command does not include the --admin option, so the kubeconfig that is exported is the regular kubeconfig, not the admin version.

tanzu cluster kubeconfig get my-cluster --export-file my-cluster-credentials

3 Connect to the cluster by using the newly-created kubeconfig file.

kubectl get pods -A --kubeconfig my-cluster-credentials

The login URL is displayed in the terminal. For example:

Please log in: https://ab9d82be7cc2443ec938e35b69862c9c-10577430.eu-west-1.elb.amazonaws.com/oauth2/authorize?access_type=offline&client_id=pinniped-cli&code_challenge=vPtDqg2zUyLFcksb6PrmE8bI9qF8it22KQMy52hB6DE&code_challenge_method=S256&nonce=2a66031e3075c65ea0361b3ba30bf174&redirect_uri=http%3A%2F%2F127.0.0.1%3A57856%2Fcallback&response_type=code&scope=offline_access+openid+pinniped%3Arequest-audience&state=01064593f32051fee7eff9333389d503

4 Copy the login URL and paste it into a browser on a machine that does have one.

5 In the browser, log in to your identity provider.

You will see a message that the identity provider could not send the authentication code because there is no localhost listener on your workstation.

6 Copy the URL of the authenticated session from the URL field of the browser.

7 On the machine that does not have a browser, use the URL that you copied in the preceding step to get the authentication code from the identity provider.

curl -L ''

Wrap the URL in quotes, to escape any special characters. For example, the command will resemble the following:

curl -L 'http://127.0.0.1:37949/callback?code=FdBkopsZwYX7w5zMFnJqYoOlJ50agmMWHcGBWD-DTbM.8smzyMuyEBlPEU2ZxWcetqkStyVPjdjRgJNgF1-vODs&scope=openid+offline_access+pinniped%3Arequest-audience&state=a292c262a69e71e06781d5e405d42c03'


After running curl -L '', you should see the following message:

you have been logged in and may now close this tab

8 Connect to the cluster again by using the same kubeconfig file as you used previously.

kubectl get pods -A --kubeconfig my-cluster-credentials

If you already configured a role binding on the cluster for the authenticated user, the output shows the pod information.

If you have not configured a role binding on the cluster, you will see a message denying the user account access to the pods: Error from server (Forbidden): pods is forbidden: User "[email protected]" cannot list resource "pods" in API group "" at the cluster scope. This happens because the user has been successfully authenticated, but they are not yet authorized to access any resources on the cluster. To authorize the user to access the cluster resources, you must configure Role-Based Access Control (RBAC) on the cluster by creating a cluster role binding.

Configure a Role Binding on a Workload Cluster

To complete the identity management configuration of the workload cluster, you must create cluster role bindings for the users who use the kubeconfig that you generated in the preceding step. There are many roles with which you can associate users, but the most useful roles are the following:

n cluster-admin: Can perform any operation on the cluster.
n admin: Permission to view most resources but can only modify resources like roles and bindings. Cannot modify pods or deployments.
n edit: The opposite of admin. Can create, update, and delete resources like deployments, services, and pods. Cannot change roles or permissions.
n view: Read-only.

You can assign any of these roles to users. For more information about RBAC and cluster role bindings, see Using RBAC Authorization in the Kubernetes documentation.

1 Set the kubectl context to the workload cluster's admin kubeconfig.

You need to switch to the workload cluster's admin context so that you can create a role binding. For example, run the following two commands to change to the admin context:

Get the kubeconfig:

tanzu cluster kubeconfig get my-cluster --admin

Switch context:

kubectl config use-context my-cluster-admin@my-cluster


2 To see the full list of roles that are available on the cluster, run the following command:

kubectl get clusterroles

3 Create a cluster role binding to associate a given user with a role on the cluster.

The following command creates a role binding named workload-test-rb that binds the role cluster-admin for this cluster to the user [email protected]. For OIDC the username is usually the email address of the user. For LDAPS it is the LDAP username, not the email address.

OIDC:

kubectl create clusterrolebinding workload-test-rb --clusterrole cluster-admin --user [email protected]

LDAP:

kubectl create clusterrolebinding workload-test-rb --clusterrole cluster-admin --user LDAP-USERNAME

4 Use the regular kubeconfig file that you generated above to attempt to run an operation on the cluster again.

For example, run:

kubectl get pods -A --kubeconfig my-cluster-credentials

This time, you should see the list of pods that are running in the workload cluster. This is because the user of the my-cluster-credentials kubeconfig file has both been authenticated by your identity provider, and has the necessary permissions on the cluster. You can share the my-cluster-credentials kubeconfig file with any users for whom you configure role bindings on the cluster.

For information about how to configure RBAC on management clusters, see Configure Identity Management After Management Cluster Deployment.

Examine the Deployed Cluster

1 After you have added the credentials to your kubeconfig, you can connect to the cluster by using kubectl.

kubectl config use-context my-cluster-admin@my-cluster

2 Use kubectl to see the status of the nodes in the cluster.

kubectl get nodes


For example, if you deployed my-prod-cluster as described in Deploy a Cluster with a Highly Available Control Plane, using the prod plan with its default of three control plane nodes and three worker nodes, you see the following output.

NAME                                    STATUS   ROLES    AGE     VERSION
my-prod-cluster-control-plane-gp4rl     Ready    master   8m51s   v1.19.1+vmware.2
my-prod-cluster-control-plane-n8bh7     Ready    master   5m58s   v1.19.1+vmware.2
my-prod-cluster-control-plane-xflrg     Ready    master   3m39s   v1.19.1+vmware.2
my-prod-cluster-md-0-6946bcb48b-dk7m6   Ready             6m45s   v1.19.1+vmware.2
my-prod-cluster-md-0-6946bcb48b-dq8s9   Ready             7m23s   v1.19.1+vmware.2
my-prod-cluster-md-0-6946bcb48b-nrdlp   Ready             7m8s    v1.19.1+vmware.2

Because networking with Antrea is enabled by default in Tanzu Kubernetes clusters, all clusters are in the Ready state without requiring any additional configuration.

3 Use kubectl to see the status of the pods running in the cluster.

kubectl get pods -A

The example below shows the pods running in the kube-system namespace in the my-prod-cluster cluster on vSphere.

NAMESPACE     NAME                                                          READY   STATUS    RESTARTS   AGE
kube-system   antrea-agent-2mw42                                            2/2     Running   0          4h41m
kube-system   antrea-agent-4874z                                            2/2     Running   1          4h45m
kube-system   antrea-agent-9qfr6                                            2/2     Running   0          4h48m
kube-system   antrea-agent-cf7cf                                            2/2     Running   0          4h46m
kube-system   antrea-agent-j84mz                                            2/2     Running   0          4h46m
kube-system   antrea-agent-rklbg                                            2/2     Running   0          4h46m
kube-system   antrea-controller-5d594c5cc7-5pttm                            1/1     Running   0          4h48m
kube-system   coredns-5bcf65484d-7dp8d                                      1/1     Running   0          4h48m
kube-system   coredns-5bcf65484d-pzw8p                                      1/1     Running   0          4h48m
kube-system   etcd-my-prod-cluster-control-plane-frsgd                      1/1     Running   0          4h48m
kube-system   etcd-my-prod-cluster-control-plane-khld4                      1/1     Running   0          4h44m
kube-system   etcd-my-prod-cluster-control-plane-sjvx7                      1/1     Running   0          4h41m
kube-system   kube-apiserver-my-prod-cluster-control-plane-frsgd            1/1     Running   0          4h48m
kube-system   kube-apiserver-my-prod-cluster-control-plane-khld4            1/1     Running   1          4h45m
kube-system   kube-apiserver-my-prod-cluster-control-plane-sjvx7            1/1     Running   0          4h41m
kube-system   kube-controller-manager-my-prod-cluster-control-plane-frsgd   1/1     Running   1          4h48m
kube-system   kube-controller-manager-my-prod-cluster-control-plane-khld4   1/1     Running   0          4h45m
kube-system   kube-controller-manager-my-prod-cluster-control-plane-sjvx7   1/1     Running   0          4h41m
kube-system   kube-proxy-hzqlt                                              1/1     Running   0          4h48m
kube-system   kube-proxy-jr4w6                                              1/1     Running   0          4h45m
kube-system   kube-proxy-lx8bp                                              1/1     Running   0          4h46m
kube-system   kube-proxy-rzbgh                                              1/1     Running   0          4h46m
kube-system   kube-proxy-s684n                                              1/1     Running   0          4h41m
kube-system   kube-proxy-z9v9t                                              1/1     Running   0          4h46m
kube-system   kube-scheduler-my-prod-cluster-control-plane-frsgd            1/1     Running   1          4h48m
kube-system   kube-scheduler-my-prod-cluster-control-plane-khld4            1/1     Running   0          4h45m
kube-system   kube-scheduler-my-prod-cluster-control-plane-sjvx7            1/1     Running   0          4h41m
kube-system   kube-vip-my-prod-cluster-control-plane-frsgd                  1/1     Running   1          4h48m
kube-system   kube-vip-my-prod-cluster-control-plane-khld4                  1/1     Running   0          4h45m
kube-system   kube-vip-my-prod-cluster-control-plane-sjvx7                  1/1     Running   0          4h41m
kube-system   vsphere-cloud-controller-manager-4nlsw                        1/1     Running   0          4h41m
kube-system   vsphere-cloud-controller-manager-gw7ww                        1/1     Running   2          4h48m
kube-system   vsphere-cloud-controller-manager-vp968                        1/1     Running   0          4h44m
kube-system   vsphere-csi-controller-555595b64c-l82kb                       5/5     Running   3          4h48m
kube-system   vsphere-csi-node-5zq47                                        3/3     Running   0          4h41m
kube-system   vsphere-csi-node-8fzrg                                        3/3     Running   0          4h46m
kube-system   vsphere-csi-node-8zs5l                                        3/3     Running   0          4h45m
kube-system   vsphere-csi-node-f2v55                                        3/3     Running   0          4h46m
kube-system   vsphere-csi-node-khtwv                                        3/3     Running   0          4h48m
kube-system   vsphere-csi-node-shtqj                                        3/3     Running   0          4h46m


You can see from the example above that the following services are running in the my-prod-cluster cluster:

n Antrea, the container networking interface

n coredns, for DNS

n etcd, for key-value storage

n kube-apiserver, the Kubernetes API server

n kube-proxy, the Kubernetes network proxy

n kube-scheduler, for scheduling and availability

n vsphere-cloud-controller-manager, the Kubernetes cloud provider for vSphere

n kube-vip, load balancing services for the Cluster API server

n vsphere-csi-controller and vsphere-csi-node, the container storage interface for vSphere

Access a Workload Cluster as a Standard User

A standard, non-admin user can retrieve a workload cluster's kubeconfig by using the Tanzu CLI, as described in this section. This workflow is different from how an admin user, who created the cluster's management cluster, retrieves this information by using the system that was used to create the management cluster. Admin users also can use this procedure if they are retrieving the kubeconfig from a system different from the one that was used to create the management cluster.

Prerequisites

Before you perform this task, ensure that:

n You have a Docker application running on your system. If your system runs Microsoft Windows, set the Docker mode to Linux and configure Windows Subsystem for Linux.
n You have obtained the management cluster endpoint details and the name of the workload cluster that you want to manage from the platform administrator.

Access the Workload Cluster

1 On the Tanzu CLI, run the following command:

tanzu login --endpoint https://MANAGEMENT-CLUSTER-CONTROL-PLANE-ENDPOINT:6443 --name TKG-ENVIRONMENT

TKG-ENVIRONMENT is the name of your Tanzu Kubernetes Grid Environment.

If identity management is configured on the management cluster, the login screen for the identity management provider (LDAP or OIDC) opens in your default browser.

LDAPS:


OIDC:

2 Log in to the identity management provider.

3 On the Tanzu CLI, run the following command to obtain the workload cluster context:

tanzu cluster kubeconfig get MY-WORKLOAD-CLUSTER

For more information on obtaining the workload cluster context, see Retrieve Tanzu Kubernetes Cluster kubeconfig.

4 Run the following command to switch to the workload cluster:

kubectl config use-context tanzu-cli-MY-WORKLOAD-CLUSTER@MY-WORKLOAD-CLUSTER

In your subsequent logins to the Tanzu CLI, you will see an option to choose your Tanzu Kubernetes Grid environment from a list that pops up after you enter tanzu login.


Scale Tanzu Kubernetes Clusters

This topic explains how to scale a Tanzu Kubernetes cluster in three ways:

n Autoscale: Enable Cluster Autoscaler, which scales the number of worker nodes. See Scale Worker Nodes with Cluster Autoscaler below.
n Scale horizontally: Run the tanzu cluster scale command with the --controlplane-machine-count and --worker-machine-count options, which scale the number of control plane and worker nodes. See Scale a Cluster Horizontally With the Tanzu CLI below.
n Scale vertically: Change the cluster's machine template to increase the size of the control plane and worker nodes. See Scale a Cluster Vertically With kubectl below.

Scale Worker Nodes with Cluster Autoscaler

To enable Cluster Autoscaler in a Tanzu Kubernetes cluster, you set the AUTOSCALER_ options in the configuration file that you use to deploy the cluster or as environment variables before running tanzu cluster create --file. For information about the default configuration of Cluster Autoscaler in Tanzu Kubernetes Grid and the Cluster Autoscaler options, see Cluster Autoscaler in the Tanzu CLI Configuration File Variable Reference.

The AUTOSCALER_*_SIZE settings limit the number of worker nodes in a cluster, while AUTOSCALER_MAX_NODES_TOTAL limits the count of all nodes, both worker and control plane.

1 Set the following configuration parameters.

n For clusters with a single machine deployment such as dev clusters on vSphere, Amazon EC2, or Azure and prod clusters on vSphere or Azure, set AUTOSCALER_MIN_SIZE_0 and AUTOSCALER_MAX_SIZE_0.

n For clusters with multiple machine deployments such as prod clusters on Amazon EC2, set:

n AUTOSCALER_MIN_SIZE_0 and AUTOSCALER_MAX_SIZE_0

n AUTOSCALER_MIN_SIZE_1 and AUTOSCALER_MAX_SIZE_1

n AUTOSCALER_MIN_SIZE_2 and AUTOSCALER_MAX_SIZE_2

You cannot modify these values after you deploy the cluster.

#! ---------------------------------------------------------------------
#! Autoscaler related configuration
#! ---------------------------------------------------------------------
ENABLE_AUTOSCALER: false
AUTOSCALER_MAX_NODES_TOTAL: "0"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_ADD: "10m"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_DELETE: "10s"
AUTOSCALER_SCALE_DOWN_DELAY_AFTER_FAILURE: "3m"
AUTOSCALER_SCALE_DOWN_UNNEEDED_TIME: "10m"
AUTOSCALER_MAX_NODE_PROVISION_TIME: "15m"
AUTOSCALER_MIN_SIZE_0:
AUTOSCALER_MAX_SIZE_0:


AUTOSCALER_MIN_SIZE_1:
AUTOSCALER_MAX_SIZE_1:
AUTOSCALER_MIN_SIZE_2:
AUTOSCALER_MAX_SIZE_2:

2 Create the cluster. For example:

tanzu cluster create example-cluster

For each Tanzu Kubernetes cluster that you create with Autoscaler enabled, Tanzu Kubernetes Grid creates a Cluster Autoscaler deployment in the management cluster. To disable Cluster Autoscaler, delete the Cluster Autoscaler deployment associated with your Tanzu Kubernetes cluster.
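For example, as a minimal sketch that assumes the deployment follows the CLUSTER-NAME-cluster-autoscaler naming pattern and runs in the workload cluster's namespace on the management cluster, you could locate and remove it as follows:

# List Cluster Autoscaler deployments running in the management cluster.
kubectl get deployments --all-namespaces | grep cluster-autoscaler

# Delete the deployment that belongs to your workload cluster (name is illustrative).
kubectl delete deployment example-cluster-cluster-autoscaler -n default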

Scale a Cluster Horizontally With the Tanzu CLI

To horizontally scale a Tanzu Kubernetes cluster, use the tanzu cluster scale command. You change the number of control plane nodes by specifying the --controlplane-machine-count option. You change the number of worker nodes by specifying the --worker-machine-count option.

NOTE: On clusters that run in vSphere with Tanzu, you can only run either 1 control plane node or 3 control plane nodes. You can scale up the number of control plane nodes from 1 to 3, but you cannot scale down the number from 3 to 1.

- To scale a cluster that you originally deployed with 3 control plane nodes and 5 worker nodes to 5 and 10 nodes respectively, run the following command:

tanzu cluster scale cluster_name --controlplane-machine-count 5 --worker-machine-count 10

If you initially deployed a cluster with --controlplane-machine-count 1 and then you scale it up to 3 control plane nodes, Tanzu Kubernetes Grid automatically enables stacked HA on the control plane.

- If the cluster is running in a namespace other than the default namespace, you must specify the --namespace option to scale that cluster:

tanzu cluster scale cluster_name --controlplane-machine-count 5 --worker-machine-count 10 --namespace=my-namespace

IMPORTANT: Do not change context or edit the .kube-tkg/config file while Tanzu Kubernetes Grid operations are running.

Scale a Cluster Vertically With kubectl

To vertically scale a Tanzu Kubernetes cluster, follow the Updating Infrastructure Machine Templates procedure in The Cluster API Book, which changes the cluster's machine template.

The procedure downloads the cluster's existing machine template, with a kubectl get command that you can construct as follows:

kubectl get MACHINE-TEMPLATE-TYPE MACHINE-TEMPLATE-NAME -o yaml


Where:

- MACHINE-TEMPLATE-TYPE is:

  - VsphereMachineTemplate on vSphere
  - AWSMachineTemplate on Amazon EC2
  - AzureMachineTemplate on Azure

- MACHINE-TEMPLATE-NAME is the name of the machine template for the cluster nodes that you are scaling, which follows the form:

  - CLUSTER-NAME-control-plane for control plane nodes
  - CLUSTER-NAME-worker for worker nodes

For example:

kubectl get VsphereMachineTemplate monitoring-cluster-worker -o yaml
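The remaining steps follow the Cluster API procedure linked above. As a hedged sketch only (the resource and field names below are illustrative), because machine templates are immutable, you save the edited template under a new name and then repoint the cluster at it:

kubectl get VsphereMachineTemplate monitoring-cluster-worker -o yaml > worker-template.yaml
# Edit worker-template.yaml: change metadata.name to a new name, for example
# monitoring-cluster-worker-large, and increase sizing fields such as
# spec.template.spec.numCPUs and spec.template.spec.memoryMiB.
kubectl apply -f worker-template.yaml
# Point the cluster's MachineDeployment (name is illustrative) at the new template.
kubectl patch machinedeployment monitoring-cluster-md-0 --type merge -p '{"spec":{"template":{"spec":{"infrastructureRef":{"name":"monitoring-cluster-worker-large"}}}}}'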

Update and Troubleshoot Core Add-On Configuration

This topic describes how to update and troubleshoot the default configuration of core add-ons in Tanzu Kubernetes Grid.

Default Core Add-On Configuration

Tanzu Kubernetes Grid automatically manages the lifecycle of its core add-ons, which includes the CNI, Metrics Server, Pinniped, vSphere CPI, and vSphere CSI add-ons. For more information, see Core Add-ons in Deploying Management Clusters.

To review the default configuration of these add-ons, you can:

- Download the following templates from projects.registry.vmware.com/tkg/tanzu_core/addons:

  - antrea-templates
  - calico-templates
  - metrics-server-templates
  - pinniped-templates
  - vsphere-cpi-templates
  - vsphere-csi-templates

- Examine the Kubernetes secret for your target add-on by running the kubectl get secret CLUSTER-NAME-ADD-ON-NAME-addon -n CLUSTER-NAMESPACE command against the management cluster.


For example, to review the default configuration of the Antrea add-on:

- Review the Antrea templates:

a Locate the version tag for antrea-templates in the Tanzu Kubernetes release (TKr) that you used to deploy your cluster. You can retrieve the TKr by running the kubectl get tkr command against the management cluster:

1 Run kubectl get clusters CLUSTER-NAME -n CLUSTER-NAMESPACE --show-labels.

2 In the output, locate the value of tanzuKubernetesRelease. For example, tanzuKubernetesRelease=v1.20.5---vmware.1-tkg.1.

3 Run kubectl get tkr TKR-VERSION, where TKR-VERSION is the value that you retrieved above. For example:

kubectl get tkr v1.20.5---vmware.1-tkg.1 -o yaml

4 In the output, locate the version tag under tanzu_core/addons/antrea-templates.

Alternatively, you can review the TKr in ~/tanzu/tkg/bom/YOUR-TKR-BOM-FILE.

b Download the templates. For example:

imgpkg pull -i projects.registry.vmware.com/tkg/tanzu_core/addons/antrea-templates:v1.3.1 -o antrea-templates

c Navigate to antrea-templates and review the templates.

- Retrieve and review the Antrea add-on secret. To retrieve the secret, run the following command against the management cluster:

kubectl get secret CLUSTER-NAME-antrea-addon -n CLUSTER-NAMESPACE

Where:

- CLUSTER-NAME is the name of your target cluster. If you want to review the Antrea add-on secret for a workload cluster, CLUSTER-NAME is the name of your workload cluster.
- CLUSTER-NAMESPACE is the namespace of your target cluster.

Updating and Troubleshooting Core Add-on Configuration

You can update and troubleshoot the default configuration of a core add-on by modifying the following resources:


Type: Configuration updates
Resources: Add-on secret
Description: To update the default configuration of a core add-on, you can:
- Modify the values.yaml section of the add-on secret. For more information, see Update the values.yaml section below.
- Add an overlay to the add-on secret. For more information, see Add an Overlay below.

Type: Troubleshooting
Resources: App custom resource (CR) and add-on secret
Description: Same as above. Additionally, if you need to apply temporary changes to your add-on configuration, you can:
- Pause secret reconciliation.
- Pause app CR reconciliation. This disables lifecycle management for the add-on. Use with caution. For more information, see Pause Core Add-on Lifecycle Management below.

For more information about add-on secrets and app CRs, see Key Components and Objects below.

Updating Core Add-on Configuration

You can update the default configuration of a core add-on by modifying the values.yaml section of the add-on secret or by adding an overlay to the add-on secret. These changes are persistent.

Update the values.yaml section

In the values.yaml section, you can update the following configuration settings:

Add-on | Setting | Description
Antrea | antrea.config.defaultMTU | By default, this parameter is set to null.
Pinniped | dex.config.oidc.CLIENT_ID (v1.3.0) or pinniped.upstream_oidc_client_id (v1.3.1+) | The client ID of your OIDC provider.
Pinniped | dex.config.oidc.CLIENT_SECRET (v1.3.0) or pinniped.upstream_oidc_client_secret (v1.3.1+) | The client secret of your OIDC provider.
Pinniped | dex.config.oidc.issuer (v1.3.0) or pinniped.upstream_oidc_issuer_url (v1.3.1+) | The URL of your OIDC provider.
Pinniped | dex.config.oidc.scopes (v1.3.0) or pinniped.upstream_oidc_additional_scopes (v1.3.1+) | A list of additional scopes to request in the token response.
Pinniped | dex.config.oidc.claimMapping (v1.3.0) or pinniped.upstream_oidc_claims (v1.3.1+) | OIDC claim mapping.
Pinniped | dex.config.ldap.host | The IP or DNS address of your LDAP server. If you want to change the default port 636 to a different port, specify "host:port".
Pinniped | dex.config.ldap.bindDN and dex.config.ldap.bindPW | The DN and password for an application service account.
Pinniped | dex.config.ldap.userSearch | Search attributes for users.
Pinniped | dex.config.ldap.groupSearch | Search attributes for groups.
vSphere CSI | vsphereCSI.provisionTimeout | By default, this parameter is set to 300s.
vSphere CSI | vsphereCSI.attachTimeout | By default, this parameter is set to 300s.

If you want to update a Pinniped setting that starts with dex., you must restart dex in the management cluster after you update the add-on secret.
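For example, assuming dex runs as a deployment named dex in the tanzu-system-auth namespace of the management cluster, a restart could look like this:

kubectl rollout restart deployment dex -n tanzu-system-auth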

To modify the values.yaml section of an add-on secret:

1 Retrieve the add-on secret by running the kubectl get secret CLUSTER-NAME-ADD-ON-NAME-addon -n CLUSTER-NAMESPACE command against the management cluster. For example:

kubectl get secret example-mgmt-cluster-antrea-addon -n tkg-system -o jsonpath={.data.values\.yaml} | base64 -d > values.yaml

2 Update the values.yaml section. You can update any of the values listed in the table above.

3 Apply your update by running the kubectl apply command. Alternatively, you can use the kubectl edit command to update the add-on secret.
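As a minimal sketch of one way to apply the edited file (the secret name matches the earlier example; on macOS, omit the -w 0 flag from base64):

# Re-encode the edited values.yaml and merge it back into the add-on secret.
kubectl patch secret example-mgmt-cluster-antrea-addon -n tkg-system --type merge -p "{\"data\": {\"values.yaml\": \"$(base64 -w 0 < values.yaml)\"}}"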

4 After updating the secret, check the status of the add-on by running the kubectl get app command. For example:

$ kubectl get app antrea -n tkg-system
NAME     DESCRIPTION           SINCE-DEPLOY   AGE
antrea   Reconcile succeeded   3m23s          7h50m

If the returned status is Reconcile failed, run the following command to get details on the failure:

kubectl get app antrea -n tkg-system -o yaml

The example below updates the default MTU for the Antrea add-on.

...
stringData:
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    infraProvider: vsphere
    antrea:
      config:
        defaultMTU: 8900


Add an Overlay

If you want to update a configuration setting that is not supported by the default add-on templates, you can add an overlay to the add-on secret. The example below instructs Pinniped to use LoadBalancer instead of the default NodePort on vSphere:

...
stringData:
  overlays.yaml: |
    #@ load("@ytt:overlay", "overlay")
    #@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "pinniped-supervisor", "namespace": "pinniped-supervisor"}})
    ---
    #@overlay/replace
    spec:
      type: LoadBalancer
      selector:
        app: pinniped-supervisor
      ports:
        - name: https
          protocol: TCP
          port: 443
          targetPort: 8443
  values.yaml: |
    #@data/values
    #@overlay/match-child-defaults missing_ok=True
    ---
    infrastructure_provider: vsphere
    tkg_cluster_role: management

To add an overlay to an add-on secret:

1 Retrieve the add-on secret by running the kubectl get secret CLUSTER-NAME-ADD-ON-NAME-addon -n CLUSTER-NAMESPACE command against the management cluster. For example:

kubectl get secret example-mgmt-cluster-pinniped-addon -n tkg-system -o jsonpath={.data.values\.yaml} | base64 -d > values.yaml

2 Add your overlays.yaml section under stringData.

3 Apply the update by running the kubectl apply command. Alternatively, you can use the kubectl edit command to update the add-on secret.

4 After updating the secret, check the status of the add-on by running the kubectl get app command. For example:

$ kubectl get app pinniped -n tkg-system
NAME       DESCRIPTION           SINCE-DEPLOY   AGE
pinniped   Reconcile succeeded   3m23s          7h50m


If the returned status is Reconcile failed, run the following command to get details on the failure:

kubectl get app pinniped -n tkg-system -o yaml

Troubleshooting Core Add-on Configuration

Before troubleshooting the core add-ons, review the following sections:

- Key Components and Objects below.
- Updating Core Add-on Configuration above.
- Pause Core Add-on Lifecycle Management below.

Key Components and Objects

Tanzu Kubernetes Grid uses the following components and objects for core add-on management.

Components in the management cluster:

- kapp-controller, a local package manager: When you deploy a management cluster, the Tanzu CLI installs kapp-controller in the cluster. kapp-controller deploys tanzu-addons-manager and the core add-ons. It also deploys and manages kapp-controller in each Tanzu Kubernetes (workload) cluster that you deploy from that management cluster.
- tanzu-addons-manager: Manages the lifecycle of the core add-ons in the management cluster and workload clusters that you deploy from your management cluster.
- tkr-controller: Creates Tanzu Kubernetes releases (TKr) and BoM ConfigMaps in the management cluster.

Component in workload clusters: kapp-controller deploys the core add-ons in the workload cluster in which it runs.

Objects:

- Secret: The Tanzu CLI creates a secret for each core add-on, per cluster. These secrets define the configuration of the core add-ons. All add-on secrets are created in the management cluster. tanzu-addons-manager reads the secrets and uses the configuration information they contain to create app CRs.
- App CR: For each add-on, tanzu-addons-manager creates an app CR in the target cluster. Then, kapp-controller reconciles the CR and deploys the add-on.
- BoM ConfigMap: Provides metadata information about the core add-ons, such as image location, to tanzu-addons-manager.

You can use the following commands to monitor the status of these components and objects:


- kubectl get app ADD-ON -n tkg-system -o yaml: Check the app CR in your target cluster. For example, kubectl get app antrea -n tkg-system -o yaml.
- kubectl get cluster CLUSTER-NAME -n CLUSTER-NAMESPACE -o jsonpath={.metadata.labels.tanzuKubernetesRelease}: In the management cluster, check if the TKr label of your target cluster points to the correct TKr.
- kubectl get tanzukubernetesrelease TKR-NAME: Check if the TKr is present in the management cluster.
- kubectl get configmaps -n tkr-system -l 'tanzuKubernetesRelease=TKR-NAME': Check if the BoM ConfigMap corresponding to your TKr is present in the management cluster.
- kubectl get app CLUSTER-NAME-kapp-controller -n CLUSTER-NAMESPACE: For workload clusters, check if the kapp-controller app CR is present in the management cluster.
- kubectl logs deployment/tanzu-addons-controller-manager -n tkg-system: Check tanzu-addons-manager logs in the management cluster.
- kubectl get configmap -n tkg-system | grep ADD-ON-ctrl: Check if your updates to the add-on secret have been applied. The sync period is 5 minutes.

Pause Core Add-on Lifecycle Management

IMPORTANT: The commands in this section disable add-on lifecycle management. Whenever possible, use the procedures described in Updating Core Add-on Configuration above instead.

If you need to temporarily pause add-on lifecycle management for a core add-on, you can use the commands below:

- To pause secret reconciliation, run the following command against the management cluster:

kubectl patch secret/CLUSTER-NAME-ADD-ON-NAME-addon -n CLUSTER-NAMESPACE -p '{"metadata": {"annotations":{"tkg.tanzu.vmware.com/addon-paused": ""}}}' --type=merge

After you run this command, tanzu-addons-manager stops reconciling the secret.

- To pause app CR reconciliation, run the following command against your target cluster:

kubectl patch app/ADD-ON-NAME -n tkg-system -p '{"spec":{"paused":true}}' --type=merge

After you run this command, kapp-controller stops reconciling the app CR.

If you want to temporarily modify a core add-on app, pause secret reconciliation first and then pause app CR reconciliation. After you unpause add-on lifecycle management, tanzu-addons-manager and kapp-controller resume secret and app CR reconciliation:

- To unpause secret reconciliation, remove tkg.tanzu.vmware.com/addon-paused from the secret annotations.


- To unpause app CR reconciliation, update the app CR with {"spec":{"paused":false}} or remove the paused field. Both unpause operations are sketched below.
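As a brief sketch (the secret and add-on names are placeholders), the two unpause operations could look like this:

# Remove the pause annotation from the add-on secret (run against the management cluster).
kubectl annotate secret/CLUSTER-NAME-ADD-ON-NAME-addon -n CLUSTER-NAMESPACE tkg.tanzu.vmware.com/addon-paused-

# Resume app CR reconciliation (run against your target cluster).
kubectl patch app/ADD-ON-NAME -n tkg-system -p '{"spec":{"paused":false}}' --type=merge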

Tanzu Kubernetes Cluster Secrets

This topic explains how to configure and manage secrets used by management and Tanzu Kubernetes (workload) clusters in Tanzu Kubernetes Grid, including:

- Credentials that clusters use to access cloud infrastructure APIs and resources.
- Certificate Authority (CA) certificates that clusters use to access private container registries.

Update Management and Workload Cluster Credentials (vSphere)

To update the vSphere credentials used by the current management cluster and by all of its workload clusters, use the tanzu management-cluster credentials update --cascading command:

1 Run tanzu login to log in to the management cluster that you are updating.

2 Run tanzu management-cluster credentials update --cascading

You can pass values to the following command options, or let the CLI prompt you for them:

- --vsphere-user: Name of the vSphere account.
- --vsphere-password: Password for the vSphere account.
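For example (the account name and password below are illustrative):

tanzu management-cluster credentials update --cascading --vsphere-user tkg-user@vsphere.local --vsphere-password 'VSPHERE-PASSWORD'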

Update Management Cluster Credentials Only

To update a management cluster's vSphere credentials without also updating them for its workload clusters, use the tanzu management-cluster credentials update command as above, but without the --cascading option.

Update Workload Cluster Credentials (vSphere)

To update the credentials that a single workload cluster uses to access vSphere, use the tanzu cluster credentials update command:

1 Run tanzu login to log in to the management cluster that created the workload cluster that you are updating.

2 Run tanzu cluster credentials update CLUSTER_NAME

You can pass values to the following command options, or let the CLI prompt you for them:

- --namespace: The namespace of the cluster you are updating credentials for, such as default.
- --vsphere-user: Name of the vSphere account.
- --vsphere-password: Password for the vSphere account.
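For example (the values below are illustrative):

tanzu cluster credentials update my-cluster --namespace default --vsphere-user tkg-user@vsphere.local --vsphere-password 'VSPHERE-PASSWORD'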

You can also use tanzu management-cluster credentials update --cascading to update vSphere credentials for a management cluster and all of the workload clusters it manages.


Trust Custom CA Certificates on Cluster Nodes

You can add custom CA certificates to Tanzu Kubernetes cluster nodes by using a ytt overlay file, which enables the cluster nodes to pull images from a container registry that uses self-signed certificates.

The overlay code below adds custom CA certificates to all nodes in a new cluster, so that containerd and other tools trust the certificate. The code works on all cloud infrastructure providers, for clusters based on Photon or Ubuntu VM image templates.

For overlays that customize clusters and create a new cluster plan, see ytt Overlays in the Configure Tanzu Kubernetes Plans and Clusters topic.

1 Choose whether to apply the custom CA to all new clusters, just the ones created on one cloud infrastructure, or ones created with a specific Cluster API provider version, such as Cluster API Provider vSphere v0.7.4.

2 In your local ~/.tanzu/tkg/providers/ directory, find the ytt directory that covers your chosen scope. For example, /ytt/03_customizations/ applies to all clusters, and /infrastructure-vsphere/ytt/ applies to all vSphere clusters.

3 In your chosen ytt directory, create a new .yaml file or augment an existing overlay file with the following code:

#@ load("@ytt:overlay", "overlay") #@ load("@ytt:data", "data")

#! This ytt overlay adds additional custom CA certificates on TKG cluster nodes, so containerd and other tools trust these CA certificates.
#! It works when using Photon or Ubuntu as the TKG node template on all TKG infrastructure providers.

#! Trust your custom CA certificates on all Control Plane nodes.
#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    #@overlay/match missing_ok=True
    files:
      #@overlay/append
      - content: #@ data.read("tkg-custom-ca.pem")
        owner: root:root
        permissions: "0644"
        path: /etc/ssl/certs/tkg-custom-ca.pem
    #@overlay/match missing_ok=True
    preKubeadmCommands:
      #! For Photon OS
      #@overlay/append
      - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
      #! For Ubuntu
      #@overlay/append
      - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'


#! Trust your custom CA certificates on all worker nodes.
#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      files:
        #@overlay/append
        - content: #@ data.read("tkg-custom-ca.pem")
          owner: root:root
          permissions: "0644"
          path: /etc/ssl/certs/tkg-custom-ca.pem
      #@overlay/match missing_ok=True
      preKubeadmCommands:
        #! For Photon OS
        #@overlay/append
        - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
        #! For Ubuntu
        #@overlay/append
        - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'

4 In the same ytt directory, add the Certificate Authority to a new or existing tkg-custom-ca.pem file.
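For example, assuming your registry's CA certificate is in a local file named my-registry-ca.crt (an illustrative name) and you chose the all-clusters scope, you could append it like this:

cat my-registry-ca.crt >> ~/.tanzu/tkg/providers/ytt/03_customizations/tkg-custom-ca.pem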

Configure Machine Health Checks for Tanzu Kubernetes Clusters

This topic describes how to use the Tanzu CLI to create, update, retrieve, and delete MachineHealthCheck objects for Tanzu Kubernetes clusters.

About MachineHealthCheck

MachineHealthCheck is a controller that provides node health monitoring and node auto-repair for Tanzu Kubernetes clusters.

This controller is enabled in the global Tanzu Kubernetes Grid configuration by default, for all Tanzu Kubernetes clusters. You can override your global Tanzu Kubernetes Grid configuration for individual Tanzu Kubernetes clusters in two ways:

- When deploying the management cluster. You can enable or disable the default MachineHealthCheck in either the Tanzu Kubernetes Grid installer interface or the cluster configuration file. Each Tanzu Kubernetes cluster that you deploy with your management cluster inherits this configuration by default. For more information, see Chapter 4 Deploying Management Clusters.
- After creating a Tanzu Kubernetes cluster. You can use the Tanzu CLI to create, update, retrieve, and delete MachineHealthCheck objects for individual Tanzu Kubernetes clusters. See the sections below.


When MachineHealthCheck is enabled in a Tanzu Kubernetes cluster, it runs in the same namespace as the cluster.

#! ---------------------------------------------------------------------
#! Machine Health Check configuration
#! ---------------------------------------------------------------------
ENABLE_MHC: true
MHC_UNKNOWN_STATUS_TIMEOUT: 5m
MHC_FALSE_STATUS_TIMEOUT: 12m

Create or Update a MachineHealthCheck

To create a MachineHealthCheck with the default configuration, run the following command:

tanzu cluster machinehealthcheck set CLUSTER-NAME

Where CLUSTER-NAME is the name of the Tanzu Kubernetes cluster you want to monitor.

You can also use this command to create MachineHealthCheck objects with custom configuration options or update existing MachineHealthCheck objects. To set custom configuration options for a MachineHealthCheck, run the tanzu cluster machinehealthcheck set command with one or more of the following:

- --mhc-name: By default, when you run tanzu cluster machinehealthcheck set CLUSTER-NAME, the command sets the name of the MachineHealthCheck to CLUSTER-NAME. Specify the --mhc-name option if you want to set a different name. For example:

tanzu cluster machinehealthcheck set my-cluster --mhc-name my-mhc

- --match-labels: This option filters machines by label keys and values. You can specify one or more label constraints. The MachineHealthCheck is applied to all machines that satisfy these constraints. Use the syntax below:

tanzu cluster machinehealthcheck set my-cluster --match-labels "key1:value1,key2:value2"

For example:

tanzu cluster machinehealthcheck set my-cluster --match-labels "node-pool:my-cluster-worker-pool"

- --node-startup-timeout: This option controls the amount of time that the MachineHealthCheck waits for a machine to join the cluster before considering the machine unhealthy. For example, the command below sets the --node-startup-timeout option to 10m:

tanzu cluster machinehealthcheck set my-cluster --node-startup-timeout 10m

If a machine fails to join the cluster within this amount of time, the MachineHealthCheck recreates the machine.


- --unhealthy-conditions: This option can set the Ready, MemoryPressure, DiskPressure, PIDPressure, and NetworkUnavailable conditions. The MachineHealthCheck uses the conditions that you set to determine whether a node is unhealthy. To set the status of a condition, use True, False, or Unknown. For example:

tanzu cluster machinehealthcheck set my-cluster --unhealthy-conditions "Ready:False:5m,Ready:Unknown:5m"

In the example above, if the status of the Ready node condition remains Unknown or False for longer than 5m, the MachineHealthCheck considers the machine unhealthy and recreates it.

Retrieve a MachineHealthCheck

To retrieve a MachineHealthCheck object, run the following command:

tanzu cluster machinehealthcheck get CLUSTER-NAME

If you assigned a non-default name to the object, specify the --mhc-name flag.

Delete a MachineHealthCheck

To delete a MachineHealthCheck object, run the following command:

tanzu cluster machinehealthcheck delete CLUSTER-NAME

If you assigned a non-default name to the object, specify the --mhc-name flag.

Back Up and Restore Clusters

To back up and restore Tanzu Kubernetes Grid clusters, you can use Velero, an open source community standard tool for backing up and restoring Kubernetes cluster objects and persistent volumes. Velero supports a variety of storage providers to store its backups.

If a Tanzu Kubernetes Grid management or workload cluster crashes and fails to recover, the infrastructure administrator can use a Velero backup to restore its contents to a new cluster, including cluster extensions and internal API objects for the workload clusters.

The following sections explain how to set up a Velero server on Tanzu Kubernetes Grid management or workload clusters, and direct it from the Velero CLI to back up and restore the clusters.

NOTE: You must create a new cluster to restore to; you cannot restore a cluster backup to an existing cluster.

Setup Overview

To back up and restore Tanzu Kubernetes Grid clusters, you need:

- The Velero CLI running on your local workstation; see Install the Velero CLI.
- A storage provider with locations to save the backups to; see Set Up a Storage Provider.


- A Velero server running on each cluster that you want to back up; see Deploy Velero Server to Clusters.

Install the Velero CLI

1 Go to the Tanzu Kubernetes Grid downloads page and log in with your My VMware credentials.

2 Under Product Downloads, click Go to Downloads.

3 Scroll to the Velero entries and download the Velero CLI .gz file for your workstation OS. Its filename starts with velero-linux-, velero-mac-, or velero-windows64-.

4 Use the gunzip command or the extraction tool of your choice to unpack the binary:

gzip -d DOWNLOADED-FILENAME.gz

5 Rename the CLI binary for your platform to velero, make sure that it is executable, and add it to your PATH.

n macOS and Linux platforms:

1 Move the binary into the /usr/local/bin folder and rename it to velero.

2 Make the file executable:

chmod +x /usr/local/bin/velero

n Windows platforms:

1 Create a new Program Files\velero folder and copy the binary into it.

2 Rename the binary to velero.exe.

3 Right-click the velero folder, select Properties > Security, and make sure that your user account has the Full Control permission.

4 Use Windows Search to search for env.

5 Select Edit the system environment variables and click the Environment Variables button.

6 Select the Path row under System variables, and click Edit.

7 Click New to add a new row and enter the path to the velero binary.

6 On vSphere with Tanzu:

- Install the vSphere plugin for kubectl by following the procedure in Download and Install the Kubernetes CLI Tools for vSphere. This plugin retrieves the Supervisor Cluster credentials to enable the Velero CLI to access it.


Set Up a Storage Provider

To back up Tanzu Kubernetes Grid clusters, you need storage locations for:

- Cluster object storage backups for Kubernetes metadata in both management clusters and workload clusters
- Volume snapshots for data used by workload clusters

See Backup Storage Locations and Volume Snapshot Locations in the Velero documentation. Velero supports a variety of storage providers.

VMware recommends dedicating a unique storage bucket to each cluster.

Storage for vSphere

On vSphere, cluster object storage backups and volume snapshots save to the same storage location. This location must be S3-compatible external storage on Amazon Web Services (AWS), or an S3 provider such as MinIO.

To set up storage for Velero on vSphere, follow the installation procedures in the Velero Plugin for AWS repository, depending on what kind of cluster you are backing up:

- Management cluster or workload cluster deployed by Tanzu Kubernetes Grid: See Velero Plugin for vSphere in Vanilla Kubernetes Cluster for the v1.1.0 plugin.
- vSphere with Tanzu Supervisor cluster: See Velero Plugin for vSphere in vSphere with Tanzu Supervisor Cluster for the v1.1.0 plugin.

Storage for and on AWS

To set up storage for Velero on AWS, follow the procedures in the Velero Plugins for AWS repository:

1 Create an S3 bucket

2 Set permissions for Velero

Set up S3 storage as needed for each plugin. The object store plugin stores and retrieves cluster object backups, and the volume snapshotter stores and retrieves data volumes.

Storage for and on Azure

To set up storage for Velero on Azure, follow the procedures in the Velero Plugins for Azure repository:

1 Create an Azure storage account and blob container

2 Get the resource group containing your VMs and disks

3 Set permissions for Velero

Set up storage as needed for each plugin. The object store plugin stores and retrieves cluster object backups, and the volume snapshotter stores and retrieves data volumes.


Deploy Velero Server to Clusters

To deploy the Velero Server to a cluster, you run the velero install command. This command creates a namespace called velero on the cluster, and places a deployment named velero in it. velero install installs the Velero server to the current default cluster in your kubeconfig, or else you can specify a different cluster with the --kubeconfig flag.

How you run the velero install command and otherwise set up Velero on a cluster depends on your infrastructure and storage provider:

Velero Server on vSphere without Tanzu

This procedure applies to management clusters deployed by Tanzu Kubernetes Grid and the workload clusters they create. If a vSphere with Tanzu Supervisor cluster serves as your Tanzu Kubernetes Grid management cluster, see the vSphere with Tanzu instructions below.

1 Install the Velero server to the current default cluster in your kubeconfig by running velero install, as described in the Install section for Vanilla Kubernetes clusters in the Velero Plugin for vSphere v1.1.0 repository. Include option values as follows:

- --provider aws
- --plugins velero/velero-plugin-for-aws:v1.1.0
- --bucket $BUCKET: the name of your S3 bucket
- --backup-location-config region=$REGION: the region the bucket is in
- --snapshot-location-config region=$REGION: the region the bucket is in
- For bucket access via username and password, include --secret-file ./velero-creds pointing to a local file that looks like:

[default]
aws_access_key_id=
aws_secret_access_key=

- For bucket access via kube2iam:

--pod-annotations iam.amazonaws.com/role=arn:aws:iam:::role/`` --no-secret

- (Optional) --kubeconfig to install the Velero server to a cluster other than the current default.
- For additional options, see Install and start Velero.

For example, to use MinIO as the object storage, following the MinIO server setup instructions in the Velero documentation:

velero install --provider aws --plugins "velero/velero-plugin-for-aws:v1.1.0" --bucket velero --secret-file ./credentials-velero --backup-location-config "region=minio,s3ForcePathStyle=true,s3Url=minio_server_url" --snapshot-location-config region="default"


Installing the Velero server to the cluster creates a namespace in the cluster called velero, and places a deployment named velero in it.

2 For workload clusters with CSI-based volumes, add the Velero Plugin for vSphere. This plugin lets Velero use your S3 bucket to store CSI volume snapshots for workload data, in addition to storing cluster objects:

a Download the Velero Plugin for vSphere v1.1.0 image.

b Run velero plugin add PLUGIN-IMAGE with the plugin image name.

- PLUGIN-IMAGE is the container image name listed in the Velero Plugin for vSphere repo v1.1.0, for example, vsphereveleroplugin/velero-plugin-for-vsphere:1.1.0.

c Enable the plugin by adding the following VirtualMachine permissions to the role you created for the Tanzu Kubernetes Grid account, if you did not already include them when you created the account:

- Configuration > Toggle disk change tracking
- Provisioning > Allow read-only disk access
- Provisioning > Allow virtual machine download
- Snapshot management > Create snapshot
- Snapshot management > Remove snapshot

Velero Server on vSphere with Tanzu

vSphere with Tanzu Supervisor clusters do not have the Kubernetes API server permissions required to retrieve Kubernetes cluster objects, so you need to install Velero with a Velero vSphere Operator that elevates Velero's permissions.

To install Velero on a Supervisor cluster, follow Installing Velero on a Supervisor Cluster in the Velero Plugin for vSphere v1.1.0 repository.

NOTE: Tanzu Kubernetes Grid does not support backing up the Kubernetes object metadata for the Supervisor cluster, which captures its state. You can use Velero to back up data volume snapshots for user workloads running on the Supervisor cluster, as well as objects and data from workload clusters managed by the Supervisor cluster.

Velero Server on AWS

To install Velero on clusters on AWS, follow the Install and start Velero procedure in the Velero Plugins for AWS repository.

Velero Server on Azure

To install Velero on clusters on Azure, follow the Install and start Velero procedure in the Velero Plugins for Azure repository.

vSphere Backup and Restore

These sections describe how to back up and restore Tanzu Kubernetes Grid clusters on vSphere.

Back Up a Cluster on vSphere

1 Follow the Deploy Velero Server to Clusters instructions above to deploy a Velero server onto the cluster, along with the Velero Plugin for vSphere if needed.

2 If you are backing up a management cluster, set Cluster.Spec.Paused to true for all of its workload clusters:

kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'

3 Back up the cluster:

velero backup create your_backup_name --exclude-namespaces=tkg-system

Excluding tkg-system objects avoids creating duplicate cluster API objects when restoring a cluster.

4 If you backed up a management cluster, set Cluster.Spec.Paused back to false for all of its workload clusters.
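For example:

kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": false}}'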

Restore a Cluster on vSphere

1 Create a new cluster. You cannot restore a cluster backup to an existing cluster.

2 Follow the Deploy Velero Server to Clusters instructions above to deploy a Velero server onto the new cluster, along with the Velero Plugin for vSphere if needed.

3 Restore the cluster:

velero restore create your_restore_name --from-backup your_backup_name

4 Set the Cluster.Spec.Paused field to false for all workload clusters:

kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'

AWS Backup and Restore

These sections describe how to back up and restore clusters on AWS.

Back Up a Cluster on AWS

1 Follow the Velero Plugin for AWS setup instructions to install Velero server on the cluster.

2 If you are backing up a management cluster, set Cluster.Spec.Paused to true for all of its workload clusters:

kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'


3 Back up the cluster:

velero backup create your_backup_name --exclude-namespaces=tkg-system

Excluding tkg-system objects avoids creating duplicate cluster API objects when restoring a cluster.

4 If you backed up a management cluster, set Cluster.Spec.Paused back to false for all of its workload clusters.

Restore a Cluster on AWS

1 Create a new cluster. You cannot restore a cluster backup to an existing cluster.

2 Follow the Velero Plugin for AWS setup instructions to install Velero server on the new cluster.

3 Restore the cluster:

velero backup get
velero restore create your_restore_name --from-backup your_backup_name

4 Set Cluster.Spec.Paused to false for all workload clusters:

kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'

Azure Backup and Restore

These sections describe how to back up and restore clusters on Azure.

Back Up a Cluster on Azure

1 Follow the Velero Plugin for Azure setup instructions to install Velero server on the cluster.

2 If you are backing up a management cluster, set Cluster.Spec.Paused to true for all of its workload clusters:

kubectl patch cluster workload_cluster_name --type='merge' -p '{"spec":{"paused": true}}'

3 Back up the cluster:

velero backup create your_backup_name --exclude-namespaces=tkg-system

Excluding tkg-system objects avoids creating duplicate cluster API objects when restoring a cluster.

4 If velero backup returns a transport is closing error, try again after increasing the memory limit, as described in Update resource requests and limits after install in the Velero documentation.
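As an illustrative sketch only (the 512Mi value is an assumption; see the linked Velero topic for guidance on sizing), one way to raise the memory limit on the velero deployment is:

kubectl patch deployment velero -n velero --type json -p '[{"op":"replace","path":"/spec/template/spec/containers/0/resources/limits/memory","value":"512Mi"}]'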

5 If you backed up a management cluster, set Cluster.Spec.Paused back to false for all of its workload clusters.


Restore a Cluster on Azure

1 Create a new cluster. You cannot restore a cluster backup to an existing cluster.

2 Follow the Velero Plugin for Azure setup instructions to install Velero server on the new cluster.

3 Restore the cluster:

velero backup get
velero restore create your_restore_name --from-backup your_backup_name

4 Set Cluster.Spec.Paused to false for all workload clusters:

kubectl patch cluster cluster_name --type='merge' -p '{"spec":{"paused": false}}'

Delete Tanzu Kubernetes Clusters

To delete a Tanzu Kubernetes cluster, run the tanzu cluster delete command. Depending on the cluster contents and cloud infrastructure, you may need to delete in-cluster volumes and services before you delete the cluster itself.

Step One: List Clusters

To list all of the Tanzu Kubernetes clusters that a management cluster is managing, run the tanzu cluster list command.

tanzu cluster list

Step Two: Delete Volumes and Services

If the cluster you want to delete contains persistent volumes or services such as load balancers and databases, you may need to manually delete them before you delete the cluster itself. What you need to pre-delete depends on your cloud infrastructure:

vSphere:

- Load Balancer: see Delete Service type LoadBalancer below.
- Persistent Volumes and Persistent Volume Claims: see Delete Persistent Volume Claims and Persistent Volumes, below.

Amazon EC2:

- Load Balancers: Application or Network Load Balancers (ALBs or NLBs) in the cluster's VPC, but not Classic Load Balancers (ELB v1). Delete these resources in the AWS UI or with the kubectl delete command.

- Other Services: Any subnet and EC2-backed service in the cluster's VPC, such as an RDS or VPC, and related resources such as:

  - VPC: Delete under VPC Dashboard > Virtual Private Cloud > Your VPCs.


  - RDS: Delete under RDS Dashboard > Databases.
  - Subnets: Delete under VPC Dashboard > Virtual Private Cloud > Subnets.
  - Route Tables: Delete under VPC Dashboard > Virtual Private Cloud > Route Tables.
  - Internet Gateways: Delete under VPC Dashboard > Virtual Private Cloud > Internet Gateways.
  - Elastic IP Addresses: Delete under VPC Dashboard > Virtual Private Cloud > Elastic IPs.
  - NAT Gateways: Delete under VPC Dashboard > Virtual Private Cloud > NAT Gateways.
  - Network ACLs: Delete under VPC Dashboard > Security > Network ACLs.
  - Security Groups: Delete under VPC Dashboard > Security > Security Groups.

Delete these resources in the AWS UI as above or with the aws CLI.

- Persistent Volumes and Persistent Volume Claims: Delete these resources with the kubectl delete command as described in Delete Persistent Volume Claims and Persistent Volumes, below.

Azure:

- No action required. Deleting a cluster deletes everything that TKG created in the cluster's resource group.

Delete Service type LoadBalancer

To delete Service type LoadBalancer (Service) in a cluster:

1 Set kubectl to the cluster's context.

kubectl config use-context my-cluster@user

2 Retrieve the cluster's list of services.

kubectl get service

3 Delete each Service type LoadBalancer.

kubectl delete service SERVICE-NAME

Delete Persistent Volumes and Persistent Volume Claims

To delete Persistent Volume (PV) and Persistent Volume Claim (PVC) objects in a cluster:

1 Run kubectl config use-context my-cluster@user to set kubectl to the cluster's context.

2 Run kubectl get pvc to retrieve the cluster's Persistent Volume Claims (PVCs).


3 For each PVC:

a Run kubectl describe pvc PVC-NAME to identify the PV it is bound to. The PV is listed in the command output as Volume, after Status: Bound.

b Run kubectl describe pv PV-NAME to determine whether its Reclaim Policy is Retain or Delete.

c Run kubectl delete pvc PVC-NAME to delete the PVC.

d If the PV reclaim policy is Retain, run kubectl delete pv PV-NAME and then log into your cloud portal and delete the PV object there. For example, delete a vSphere CNS volume from your datastore pane > Monitor > Cloud Native Storage > Container Volumes. For more information about vSphere CNS, see Getting Started with VMware Cloud Native Storage.
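As a brief illustrative run-through (the object names are hypothetical), the sequence for a single PVC might look like this:

kubectl describe pvc my-app-data          # note the bound PV listed under Volume
kubectl describe pv pvc-1234-example      # check the Reclaim Policy
kubectl delete pvc my-app-data
kubectl delete pv pvc-1234-example        # only needed if the reclaim policy is Retain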

Step Three: Delete Cluster

1 To delete a cluster, run tanzu cluster delete.

tanzu cluster delete my-cluster

If the cluster is running in a namespace other than the default namespace, you must specify the --namespace option to delete that cluster.

tanzu cluster delete my-cluster --namespace=my-namespace

To skip the yes/no verification step when you run tanzu cluster delete, specify the --yes option.

tanzu cluster delete my-cluster --namespace=my-namespace --yes

IMPORTANT: Do not change context or edit the .kube-tkg/config file while Tanzu Kubernetes Grid operations are running.

Deploying and Managing Extensions and Shared Services

Tanzu Kubernetes Grid includes binaries for tools that provide in-cluster and shared services to the clusters running in your Tanzu Kubernetes Grid instance. All of the provided binaries and container images are built and signed by VMware.

This chapter includes the following topics:

- Locations and Dependencies
- Preparing to Deploy the Extensions
- Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle
- Install Cert Manager on Workload Clusters
- Create a Shared Services Cluster
- Add Certificates to the Kapp Controller
- Implementing Ingress Control with Contour
- Implementing Log Forwarding with Fluent Bit
- Implementing Monitoring with Prometheus and Grafana
- Implementing Service Discovery with External DNS
- Deploy Harbor Registry as a Shared Service
- Implementing User Authentication
- Delete Tanzu Kubernetes Grid Extensions

Locations and Dependencies

You can add functionalities to Tanzu Kubernetes clusters by installing extensions to different cluster locations as follows:

Function | Extension | Location | Procedure
Ingress Control | Contour | Tanzu Kubernetes or Shared Services cluster | Implementing Ingress Control with Contour
Service Discovery | External DNS | Tanzu Kubernetes or Shared Services cluster | Implementing Service Discovery with External DNS
Log Forwarding | Fluent Bit | Tanzu Kubernetes cluster | Implementing Log Forwarding with Fluent Bit
Container Registry | Harbor | Shared Services cluster | Deploy Harbor Registry as a Shared Service
Monitoring | Prometheus | Tanzu Kubernetes cluster | Implementing Monitoring with Prometheus and Grafana
Monitoring | Grafana | Tanzu Kubernetes cluster | Implementing Monitoring with Prometheus and Grafana

Some extensions require or are enhanced by other extensions deployed to the same cluster:

- Contour is required by Harbor, External DNS, and Grafana.
- Prometheus is required by Grafana.
- External DNS is recommended for Harbor on infrastructures with load balancing (AWS, Azure, and vSphere with NSX Advanced Load Balancer), especially in production or other environments in which Harbor availability is important.

Preparing to Deploy the Extensions

Before you can deploy the Tanzu Kubernetes Grid extensions, you must prepare your bootstrap environment.

- To deploy the extensions, you update configuration files with information about your environment. You then use kubectl to apply preconfigured YAML files that pull data from the updated configuration files to create and update clusters that implement the extensions. The YAML files include calls to ytt, kapp, and kbld commands, so these tools must be present on your bootstrap environment when you deploy the extensions. For information about installing ytt, kapp, and kbld, see Install the Carvel Tools.
- If you are using Tanzu Kubernetes Grid in an Internet-restricted environment, see Deploying the Tanzu Kubernetes Grid Extensions in an Internet Restricted Environment.

Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle

The Tanzu Kubernetes Grid extension manifests are provided in a separate bundle to the Tanzu CLI and other binaries.

1 On the system that you use as the bootstrap machine, go to the Tanzu Kubernetes Grid downloads page and log in with your My VMware credentials.

2 Under Product Downloads, click Go to Downloads.

3 Scroll to VMware Tanzu Kubernetes Grid Extensions Manifest 1.3.1 and click Download Now.

4 Use either the tar command or the extraction tool of your choice to unpack the bundle of YAML manifest files for the Tanzu Kubernetes Grid extensions.

tar -xzf tkg-extensions-manifests-v1.3.1-vmware.1.tar.gz

For convenience, unpack the bundle in the same location as the one from which you run tanzu and kubectl commands.


IMPORTANT:

- After you unpack the bundle, the extensions files are contained in a folder named tkg-extensions-v1.3.1+vmware.1. This folder contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. In the procedures to deploy the extensions, take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.
- For historical reasons, the extensions bundle includes the manifests for the Dex and Gangway extensions. Tanzu Kubernetes Grid v1.3 introduces user authentication with Pinniped, which runs automatically in management clusters if you enable identity management during deployment. For new deployments, enable Pinniped and Dex in your management clusters. Do not use the Dex and Gangway extensions. For information about identity management with Pinniped, see Enabling Identity Management in Tanzu Kubernetes Grid. For information about migrating existing Dex and Gangway deployments to Pinniped, see Register Core Add-ons.

Install Cert Manager on Workload Clusters

Before you can deploy Tanzu Kubernetes Grid extensions, you must install cert-manager, which provides automated certificate management, on workload clusters. The cert-manager service already runs by default in management clusters.

All extensions other than Fluent Bit require cert-manager to be running on workload clusters. Fluent Bit does not use cert-manager.

To install the cert-manager service on a workload cluster, specify the cluster with kubectl config use-context and then do the following:

1 Deploy cert-manager on the cluster.

kubectl apply -f cert-manager/

2 Check that the Kapp controller and cert-manager services are running as pods in the cluster.

kubectl get pods -A

The command output should show:

- A kapp-controller pod with a name like kapp-controller-cd55bbd6b-vt2c4 running in the namespace tkg-system.
- For extensions other than Fluent Bit, pods with names like cert-manager-69877b5f94-8kwx9, cert-manager-cainjector-7594d76f5f-8tstw, and cert-manager-webhook-5fc8c6dc54-nlvzp running in the namespace cert-manager.
- A Ready status of 1/1 for all of these pods. If this status is not displayed, stop and troubleshoot the pods before proceeding.


Create a Shared Services Cluster

The Harbor service runs on a shared services cluster, to serve all the other clusters in an installation. The Harbor service requires the Contour service to also run on the shared services cluster. In many environments, the Harbor service also benefits from External DNS running on its cluster, as described in Harbor Registry and External DNS.

Each Tanzu Kubernetes Grid instance can only have one shared services cluster.

To deploy a shared services cluster:

1 Create a cluster configuration YAML file for the target cluster. To deploy a shared services cluster, for example one named tkg-services, it is recommended to use the prod cluster plan rather than the dev plan. For example:

INFRASTRUCTURE_PROVIDER: vsphere
CLUSTER_NAME: tkg-services
CLUSTER_PLAN: prod

2 vSphere: To deploy the cluster to vSphere, add a line to the configuration file that sets VSPHERE_CONTROL_PLANE_ENDPOINT to a static virtual IP (VIP) address for the control plane of the shared services cluster. Ensure that this IP address is not in the DHCP range, but is in the same subnet as the DHCP range. If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address. For example:

VSPHERE_CONTROL_PLANE_ENDPOINT: 10.10.10.10

3 Deploy the cluster by passing the cluster configuration file to the tanzu cluster create command:

tanzu cluster create tkg-services --file tkg-services-config.yaml

Throughout the rest of these procedures, the cluster that you just deployed is referred to as the shared services cluster.

4 Set the context of kubectl to the context of your management cluster. For example, if your cluster is named mgmt-cluster, run the following command.

kubectl config use-context mgmt-cluster-admin@mgmt-cluster

5 Add the label tanzu-services to the shared services cluster, as its cluster role. This label identifies the shared services cluster to the management cluster and workload clusters.

kubectl label cluster.cluster.x-k8s.io/tkg-services cluster-role.tkg.tanzu.vmware.com/tanzu-services="" --overwrite=true

You should see the confirmation cluster.cluster.x-k8s.io/tkg-services labeled.

6 Check that the label has been correctly applied by running the following command.

tanzu cluster list --include-management-cluster


You should see that the tkg-services cluster has the tanzu-services role.

NAME             NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES           PLAN
another-cluster  default     running  1/1           1/1      v1.20.5+vmware.1                  dev
tkg-services     default     running  3/3           3/3      v1.20.5+vmware.1  tanzu-services  prod
mgmt-cluster     tkg-system  running  1/1           1/1      v1.20.5+vmware.1  management      dev

7 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

You should see folders for authentication, ingress, logging, monitoring, registry, and some YAML files. Run all of the commands in this procedure from this location.

8 Get the admin credentials of the shared services cluster on which to deploy Harbor.

tanzu cluster kubeconfig get tkg-services --admin

9 Set the context of kubectl to the shared services cluster.

kubectl config use-context tkg-services-admin@tkg-services

Add Certificates to the Kapp Controller

Previous versions of Tanzu Kubernetes Grid required the user to install the kapp-controller service to any extension cluster. As of v1.3, all management and workload clusters are created with the kapp-controller service pre-installed. If the cluster configuration file specifies a private registry with TKG_CUSTOM_IMAGE_REPOSITORY and TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE variables, the kapp-controller is configured to trust the private registry.
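For example, a cluster configuration file might set these variables as follows (the values are illustrative; the CA certificate is supplied base64-encoded):

TKG_CUSTOM_IMAGE_REPOSITORY: registry.example.com/tkg
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: BASE64-ENCODED-CA-CERTIFICATE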

To enable a cluster's Kapp Controller to trust additional private registries, add their certificates to its configuration:

1 If needed, set the current kubectl context to the cluster with the Kapp Controller you are changing:

kubectl config use-context CLUSTER-CONTEXT

2 Open the Kapp Controller's ConfigMap file in an editor:

kubectl edit configmap -n tkg-system kapp-controller-config

3 Edit the ConfigMap file to add new certificates to the data.caCerts property:

apiVersion: v1
kind: ConfigMap
metadata:
  # Name must be `kapp-controller-config` for kapp controller to pick it up
  name: kapp-controller-config
  # Namespace must match the namespace kapp-controller is deployed to
  namespace: tkg-system
data:
  # A cert chain of trusted ca certs. These will be added to the system-wide
  # cert pool of trusted ca's (optional)
  caCerts: |
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    -----END CERTIFICATE-----
  # The url/ip of a proxy for kapp controller to use when making network requests (optional)
  httpProxy: ""
  # The url/ip of a tls capable proxy for kapp controller to use when making network requests (optional)
  httpsProxy: ""
  # A comma delimited list of domain names which kapp controller should bypass the proxy for when making requests (optional)
  noProxy: ""

4 Save the ConfigMap and exit the editor.

5 Delete the kapp-controller pod, so that it regenerates with the new configuration:

kubectl delete pod -n tkg-system -l app=kapp-controller

Upgrading the Tanzu Kubernetes Grid Extensions

For information about how to upgrade the Tanzu Kubernetes Grid extensions from a previous release, see Upgrade Tanzu Kubernetes Grid Extensions.

Implementing Ingress Control with Contour

Ingress control is a core concept in Kubernetes that is implemented by a third-party proxy. Contour is a Kubernetes ingress controller that uses the Envoy edge and service proxy. Tanzu Kubernetes Grid includes signed binaries for Contour and Envoy, which you can deploy on Tanzu Kubernetes clusters to provide ingress control services on those clusters.

For general information about ingress control, see Ingress Controllers in the Kubernetes documentation.

You deploy Contour and Envoy directly on Tanzu Kubernetes clusters. You do not need to deploy Contour on management clusters. Deploying Contour is a prerequisite if you want to deploy the Prometheus, Grafana, and Harbor extensions.


Prerequisites

- Deploy a management cluster on vSphere, Amazon EC2, or Azure.
- You have downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
- You have installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.

IMPORTANT:

- In this release of Tanzu Kubernetes Grid, the provided implementation of Contour and Envoy assumes that you use self-signed certificates. To configure Contour and Envoy with your own certificates, engage with VMware Tanzu support.
- Tanzu Kubernetes Grid does not support IPv6 addresses. This is because upstream Kubernetes only provides alpha support for IPv6. Always provide IPv4 addresses in the procedures in this section.
- The extensions folder tkg-extensions-v1.3.1+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prepare the Tanzu Kubernetes Cluster for Contour Deployment

Before you can deploy Contour on a Tanzu Kubernetes cluster, you must install the tools that the Contour extension requires.

This procedure applies to all clusters running on vSphere, Amazon EC2, and Azure.

1 Create a cluster configuration YAML file for a workload cluster. For a cluster named contour-test, for example:

INFRASTRUCTURE_PROVIDER: vsphere
CLUSTER_NAME: contour-test
CLUSTER_PLAN: dev

2 vSphere: To deploy the cluster to vSphere, add a line to the configuration file that sets VSPHERE_CONTROL_PLANE_ENDPOINT to a static virtual IP (VIP) address for the control plane of the Tanzu Kubernetes cluster. Ensure that this IP address is not in the DHCP range, but is in the same subnet as the DHCP range. If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address. For example:

VSPHERE_CONTROL_PLANE_ENDPOINT: 10.10.10.10


3 Deploy the cluster by passing the cluster configuration file to the tanzu cluster create command:

tanzu cluster create contour-test --file contour-test-config.yaml
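Optionally, you can confirm that the new cluster has been created and has reached the running state before you continue:

tanzu cluster list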

4 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

You should see folders for authentication, ingress, logging, monitoring, registry, and some YAML files. Run all of the commands in these procedures from this location.

5 Get the admin credentials of the Tanzu Kubernetes cluster on which to deploy Contour.

tanzu cluster kubeconfig get contour-test --admin

6 Set the context of kubectl to the Tanzu Kubernetes cluster.

kubectl config use-context contour-test-admin@contour-test

7 If you haven't already, install cert-manager on the Tanzu Kubernetes workload cluster by following the procedure in Install Cert Manager on Workload Clusters.

The cluster is ready for you to deploy the Contour service. For the next steps, see Deploy Contour on the Tanzu Kubernetes Cluster.

Deploy Contour on the Tanzu Kubernetes Cluster

After you have set up the cluster, you must create the configuration file that is used when you deploy Contour, create a Kubernetes secret, and then deploy Contour.

This procedure applies to all clusters, running on vSphere, Amazon EC2, and Azure.

1 Create a namespace for the Contour service on the Tanzu Kubernetes cluster.

kubectl apply -f ingress/contour/namespace-role.yaml

You should see confirmation that a tanzu-system-ingress namespace, service account, and RBAC role bindings are created.

namespace/tanzu-system-ingress created
serviceaccount/contour-extension-sa created
role.rbac.authorization.k8s.io/contour-extension-role created
rolebinding.rbac.authorization.k8s.io/contour-extension-rolebinding created
clusterrole.rbac.authorization.k8s.io/contour-extension-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/contour-extension-cluster-rolebinding created

2 Copy the relevant contour-data-values YAML example file into its directory and name it contour-data-values.yaml:


vSphere, installing Contour on a workload cluster with NSX Advanced Load Balancer (ALB):

cp ingress/contour/vsphere/contour-data-values-lb.yaml.example ingress/contour/vsphere/contour-data-values.yaml

vSphere, installing Contour on the shared services cluster, or on a workload cluster without NSX ALB:

cp ingress/contour/vsphere/contour-data-values.yaml.example ingress/contour/vsphere/contour-data-values.yaml

Amazon EC2:

cp ingress/contour/aws/contour-data-values.yaml.example ingress/contour/aws/contour-data-values.yaml

Azure:

cp ingress/contour/azure/contour-data-values.yaml.example ingress/contour/azure/contour-data-values.yaml

3 (Optional) Modify any values in contour-data-values.yaml. In most cases you do not need to modify contour-data-values.yaml, but the ingress/contour/README.md file in the extensions bundle directory lists non-default options that you may set.

- For example, the Contour extension exposes Envoy as a NodePort type service by default, but it also supports the LoadBalancer and ClusterIP service types. You set this with the envoy.service.type value in contour-data-values.yaml, as shown in the sketch below.
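A minimal sketch of such an override, assuming a vSphere workload cluster. The nested layout mirrors the dotted parameter names in the table later in this topic; verify the exact structure against the .example file you copied and against ingress/contour/values.yaml before you apply it:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
infrastructure_provider: "vsphere"
envoy:
  service:
    # Expose Envoy through a LoadBalancer service instead of the default NodePort.
    type: "LoadBalancer"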

4 Create a Kubernetes secret named contour-data-values with the values that you set in contour-data-values.yaml.

vSphere:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/vsphere/contour-data-values.yaml -n tanzu-system-ingress

Amazon EC2:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/aws/contour-data-values.yaml -n tanzu-system-ingress

Azure:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/azure/contour-data-values.yaml -n tanzu-system-ingress

You should see the confirmation secret/contour-data-values created.
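If you want to verify what was stored, you can read the values back out of the secret; this is the same command that the update procedure later in this topic uses:

kubectl get secret contour-data-values -n tanzu-system-ingress -o 'go-template={{ index .data "values.yaml" }}' | base64 -d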


5 Deploy the Contour extension.

kubectl apply -f ingress/contour/contour-extension.yaml

You should see the confirmation extension.clusters.tmc.cloud.vmware.com/contour created.

6 View the status of the Contour service.

kubectl get app contour -n tanzu-system-ingress

The status of the Contour app should show Reconcile Succeeded when Contour has deployed successfully.

NAME      DESCRIPTION           SINCE-DEPLOY   AGE
contour   Reconcile succeeded   15s            72s

7 If the status is not Reconcile Succeeded, view the full status details of the Contour service.

Viewing the full status can help you to troubleshoot the problem.

kubectl get app contour -n tanzu-system-ingress -o yaml

8 Check that the new services are running by listing all of the pods that are running in the cluster.

kubectl get pods -A

In the tanzu-system-ingress namespace, you should see the contour and envoy pods running, with names similar to contour-55c56bd7b7-85gc2 and envoy-tmxww.

NAMESPACE              NAME                       READY   STATUS    RESTARTS   AGE
[...]
tanzu-system-ingress   contour-55c56bd7b7-85gc2   1/1     Running   0          6m37s
tanzu-system-ingress   contour-55c56bd7b7-x4kv7   1/1     Running   0          6m37s
tanzu-system-ingress   envoy-tmxww                2/2     Running   0          6m38s

9 If you deployed Contour to Amazon EC2 or Azure, check that a load balancer has been created for the Envoy service.

kubectl get svc envoy -n tanzu-system-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

On Amazon EC2, the load balancer has a name similar to aabaaad4dfc8e4a808a70a7cbf7d9249-1201421080.us-west-2.elb.amazonaws.com. On Azure, it is an IP address similar to 20.54.226.44.

Access the Envoy Administration Interface Remotely

After you have deployed Contour on a cluster, you can use the embedded Envoy administration interface to view data about your deployments.


For information about the Envoy administration interface, see Administration interface in the Envoy documentation.

1 Obtain the name of the Envoy pod.

ENVOY_POD=$(kubectl -n tanzu-system-ingress get pod -l app=envoy -o name | head -1)

2 Forward port 9001 of the Envoy pod to your local machine.

kubectl -n tanzu-system-ingress port-forward $ENVOY_POD 9001

3 In a browser, navigate to http://127.0.0.1:9001/.

You should see the Envoy administration interface.

4 Click the links in the Envoy administration interface to see information about the operations in your cluster.
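If you prefer the command line, you can query the administration interface with curl while the port-forward from step 2 is still running. The /stats and /clusters paths used here are standard Envoy admin endpoints:

curl -s http://127.0.0.1:9001/stats | head -n 20
curl -s http://127.0.0.1:9001/clusters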

Visualize the Internal Contour Directed Acyclic Graph (DAG)

When you have started running workloads in your Tanzu Kubernetes cluster, you can visualize the traffic information that Contour exposes in the form of a directed acyclic graph (DAG).

1 Obtain a Contour pod.

CONTOUR_POD=$(kubectl -n tanzu-system-ingress get pod -l app=contour -o name | head -1)


2 Forward port 6060 on the Contour pod.

kubectl -n tanzu-system-ingress port-forward $CONTOUR_POD 6060

3 Open a new terminal window and download and store the DAG as a *.png file.

This command requires the dot utility (part of Graphviz) to be installed on your system.

curl localhost:6060/debug/dag | dot -T png > contour-dag.png

4 Open contour-dag.png to view the graph.

Optional Configuration

In addition to the minimum configuration provided in the contour-data-values.yaml file, you can customize your configuration by adding values that you can copy from the file tkg-extensions-v1.3.1+vmware.1/ingress/contour/values.yaml. Note that this file is not located in the tkg-extensions-v1.3.1+vmware.1/extensions/ingress/contour folder, but in the ingress folder that is at the same level as the extensions folder.

You can also customize your Contour ingress setup using ytt overlays. See ytt Overlays and Example: External DNS Annotation below.

The table below contains information on the values you can copy from the tkg-extensions-v1.3.1+vmware.1/ingress/contour/values.yaml file and how they can be used to modify the default behavior of Contour when it is deployed onto a Tanzu Kubernetes cluster.

NOTE: Where applicable, the settings that you configure in contour-data-values.yaml override any settings that you configure in values.yaml.

If you reconfigure your Contour settings after the initial deployment, you must follow the steps in Update a Contour Deployment in order to apply the new configuration to the cluster.

Parameter | Type and Description | Default
--------- | -------------------- | -------
infrastructure_provider | String. Infrastructure provider: vsphere, aws, azure. | Mandatory parameter
contour.namespace | String. Namespace in which to deploy Contour. | tanzu-system-ingress
contour.config.requestTimeout | time.Duration. Client request timeout to pass to Envoy. | 0s
contour.config.tls.minimumProtocolVersion | String. Minimum TLS version that Contour will negotiate. | 1.1
contour.config.tls.fallbackCertificate.name | String. Name of the secret containing the fallback certificate for requests that do not match the SNI defined for a vhost. | null
contour.config.tls.fallbackCertificate.namespace | String. Namespace of the secret containing the fallback certificate. | null
contour.config.leaderelection.configmapName | String. Name of the configmap to use for Contour leader election. | leader-elect
contour.config.leaderelection.configmapNamespace | String. Namespace of the Contour leader election configmap. | tanzu-system-ingress
contour.config.disablePermitInsecure | Boolean. Disables the ingressroute permitInsecure field. | false
contour.config.accesslogFormat | String. Access log format. | envoy
contour.config.jsonFields | Array of strings. Fields to log. | See https://godoc.org/github.com/projectcontour/contour/internal/envoy#JSONFields
contour.config.useProxyProtocol | Boolean. See https://projectcontour.io/guides/proxy-proto/. | false
contour.config.defaultHTTPVersions | Array of strings. HTTP versions that Contour should program Envoy to serve. | "HTTP/1.1 HTTP2"
contour.config.timeouts.requestTimeout | time.Duration. The timeout for an entire request. | null (timeout is disabled)
contour.config.timeouts.connectionIdleTimeout | time.Duration. The time to wait before terminating an idle connection. | 60s
contour.config.timeouts.streamIdleTimeout | time.Duration. The time to wait before terminating a request or stream with no activity. | 5m
contour.config.timeouts.maxConnectionDuration | time.Duration. The time to wait before terminating a connection irrespective of activity. | null (timeout is disabled)
contour.config.timeouts.ConnectionShutdownGracePeriod | time.Duration. The time to wait between sending an initial and final GOAWAY. | 5s
contour.config.debug | Boolean. Turn on Contour debugging. | false
contour.config.ingressStatusAddress | String. The address to set on the status of every Ingress resource. | null
contour.certificate.duration | time.Duration. Duration of the Contour certificate. | 8760h
contour.certificate.renewBefore | time.Duration. Time before the Contour certificate should be renewed. | 360h
contour.deployment.replicas | Integer. Number of Contour replicas. | 2
contour.image.repository | String. Repository containing the Contour image. | projects.registry.vmware.com/tkg
contour.image.name | String. Name of the Contour image. | contour
contour.image.tag | String. Contour image tag. | v1.8.1_vmware.1
contour.image.pullPolicy | String. Contour image pull policy. | IfNotPresent
envoy.image.repository | String. Repository containing the Envoy image. | projects.registry.vmware.com/tkg
envoy.image.name | String. Name of the Envoy image. | envoy
envoy.image.tag | String. Envoy image tag. | v1.15.0_vmware.1
envoy.image.pullPolicy | String. Envoy image pull policy. | IfNotPresent
envoy.hostPort.enable | Boolean. Flag to expose Envoy ports on the host. | true
envoy.hostPort.http | Integer. Envoy HTTP host port. | 80
envoy.hostPort.https | Integer. Envoy HTTPS host port. | 443
envoy.service.type | String. Type of service to expose for Envoy: ClusterIP, NodePort, LoadBalancer. | vSphere: NodePort; AWS: LoadBalancer; Azure: LoadBalancer
envoy.service.externalTrafficPolicy | String. External traffic policy for the Envoy service: Local, Cluster. | Cluster
envoy.service.nodePort.http | Integer. Desired nodePort for a service of type NodePort, used for HTTP requests. | null (Kubernetes assigns a dynamic node port)
envoy.service.nodePort.https | Integer. Desired nodePort for a service of type NodePort, used for HTTPS requests. | null (Kubernetes assigns a dynamic node port)
envoy.deployment.hostNetwork | Boolean. Run Envoy on hostNetwork. | false
envoy.service.aws.LBType | String. AWS load balancer type to use to expose the Envoy service: classic, nlb. | classic
envoy.loglevel | String. Log level to use for Envoy. | info

ytt Overlays and Example: External DNS Annotation

In addition to modifying contour-data-values.yaml, you can use ytt overlays to configure your Contour setup as described in Extensions and Shared Services in Customizing Clusters, Plans, and Extensions with ytt Overlays and in the extensions mods examples in the TKG Lab repository.

One TKG Lab example, Setup DNS for Wildcard Domain Contour Ingress, adds a wildcard annotation to Contour for External DNS. The example procedure does this by running a script, generate-and-apply-external-dns-yaml.sh, that sets up the configuration files used to deploy the External DNS and Contour extensions. To customize the Contour extension, the script applies two ytt overlay files:

- contour-overlay.yaml adds external-dns annotation metadata to Contour's Envoy service (a minimal sketch of this kind of overlay follows below).
- contour-extension-overlay.yaml directs the Contour extension to always deploy the Contour app and Envoy service with contour-overlay.yaml.

See the TKG Lab repository and its Step by Step setup guide for more examples.
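As an illustration only, an overlay in the spirit of contour-overlay.yaml might add the External DNS annotation to the Envoy service as follows. The wildcard hostname and the exact matching logic shown here are assumptions; the TKG Lab repository contains the authoritative version:

#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind": "Service", "metadata": {"name": "envoy", "namespace": "tanzu-system-ingress"}})
---
metadata:
  #@overlay/match missing_ok=True
  annotations:
    #@overlay/match missing_ok=True
    #! Hypothetical wildcard domain; External DNS creates the corresponding DNS record.
    external-dns.alpha.kubernetes.io/hostname: "*.apps.example.com"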

Update a Contour Deployment

If you need to make changes to the configuration of the Contour extension, you can update the extension.

Perform the steps in this procedure if you modify either the contour-data-values.yaml file for your infrastructure under extensions/ingress/contour/ or the ingress/contour/values.yaml file after the initial deployment of Contour.

1 Obtain the Contour data values from the Kubernetes secret.

kubectl get secret contour-data-values -n tanzu-system-ingress -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > contour-data-values.yaml

2 Modify either or both of the contour-data-values.yaml file for your infrastructure under extensions/ingress/contour/ and the ingress/contour/values.yaml file to update your configuration.


3 Update the Kubernetes secret.

This command assumes that you are running it from tkg-extensions-v1.3.1+vmware.1/extensions.

vSphere:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/vsphere/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

Amazon EC2:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/aws/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

Azure:

kubectl create secret generic contour-data-values --from-file=values.yaml=ingress/contour/azure/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

Note that the final - on the kubectl replace command above is necessary to instruct kubectl to accept the input being piped to it from the kubectl create secret command.

The Contour extension will be reconciled using the new values you just added. The changes should show up in five minutes or less. This is handled by the Kapp controller, which synchronizes every five minutes.
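If you prefer to watch the reconciliation happen rather than wait, you can poll the same app resource that you checked during the initial deployment; -w is the standard kubectl watch flag:

kubectl get app contour -n tanzu-system-ingress -w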

Implementing Log Forwarding with Fluent Bit

Fluent Bit is a lightweight log processor and forwarder that allows you to collect data and logs from different sources, unify them, and send them to multiple destinations. Tanzu Kubernetes Grid includes signed binaries for Fluent Bit, which you can deploy on management clusters and on Tanzu Kubernetes clusters to provide a log-forwarding service.

The Fluent Bit implementation provided in this release of Tanzu Kubernetes Grid allows you to gather logs from management clusters or Tanzu Kubernetes clusters running in vSphere, Amazon EC2, and Azure, and to forward them to an Elastic Search, Kafka, Splunk, or HTTP endpoint. The Fluent Bit deployment that Tanzu Kubernetes Grid provides is also pre-configured to expose certain metrics to Prometheus and Grafana.

You can deploy Fluent Bit on any management clusters or Tanzu Kubernetes clusters from which you want to collect logs. First, you configure an output plugin on the cluster from which you want to gather logs, depending on the endpoint that you use. Then, you deploy Fluent Bit on the cluster. The examples in this topic deploy Fluent Bit on a Tanzu Kubernetes cluster.

IMPORTANT:

- Tanzu Kubernetes Grid does not support IPv6 addresses. This is because upstream Kubernetes only provides alpha support for IPv6. Always provide IPv4 addresses in the procedures in this section.


- The extensions folder tkg-extensions-v1.3.1+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prerequisites

- You have deployed a management cluster on vSphere, Amazon EC2, or Azure.
- You have downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
- You have installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
- You have deployed one of the following logging management backends for storing and analyzing logs:
  - Elastic Search
  - Kafka
  - Splunk
  - Another HTTP-based service, for example, VMware vRealize Log Insight Cloud
  - Syslog, for example, VMware vRealize Log Insight

Prepare the Cluster for Fluent Bit Deployment

Before you can deploy Fluent Bit on a management cluster or on a Tanzu Kubernetes cluster, you must install the tools that the Fluent Bit extension requires.

NOTE: Deploying Fluent Bit on a management cluster does not mean that logs will also be collected from the Tanzu Kubernetes clusters that the management cluster manages. You must deploy Fluent Bit on each cluster from which you want to gather logs.

This procedure applies to all clusters running on vSphere, Amazon EC2, and Azure.

1 Create a cluster configuration YAML file for a management or workload cluster. For a cluster named fluentbit-test, for example:

INFRASTRUCTURE_PROVIDER: vsphere
CLUSTER_NAME: fluentbit-test
CLUSTER_PLAN: dev


2 vSphere: To deploy the cluster to vSphere, add a line to the configuration file that sets VSPHERE_CONTROL_PLANE_ENDPOINT to a static virtual IP (VIP) address for the control plane of the Tanzu Kubernetes cluster. Ensure that this IP address is not in the DHCP range, but is in the same subnet as the DHCP range. If you mapped a fully qualified domain name (FQDN) to the VIP address, you can specify the FQDN instead of the VIP address. For example:

VSPHERE_CONTROL_PLANE_ENDPOINT: 10.10.10.10

3 Deploy the cluster by passing the cluster configuration file to the tanzu cluster create command:

tanzu cluster create fluentbit-test --file fluentbit-test-config.yaml

4 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

You should see folders for authentication, ingress, logging, monitoring, registry, and some YAML files. Run all of the commands in these procedures from this location.

5 Get the admin credentials of the Tanzu Kubernetes cluster on which to deploy Fluent Bit.

tanzu cluster kubeconfig get fluentbit-test --admin

6 Set the context of kubectl to the Tanzu Kubernetes cluster.

kubectl config use-context fluentbit-test-admin@fluentbit-test

7 Create a namespace for the Fluent Bit service on the cluster.

kubectl apply -f logging/fluent-bit/namespace-role.yaml

You should see confirmation that a tanzu-system-logging namespace, service account, and RBAC role bindings are created.

namespace/tanzu-system-logging created
serviceaccount/fluent-bit-extension-sa created
role.rbac.authorization.k8s.io/fluent-bit-extension-role created
rolebinding.rbac.authorization.k8s.io/fluent-bit-extension-rolebinding created
clusterrole.rbac.authorization.k8s.io/fluent-bit-extension-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/fluent-bit-extension-cluster-rolebinding created

The cluster is ready for you to deploy the Fluent Bit service. For the next steps, see the appropriate procedure for your logging endpoint:

- Prepare the Fluent Bit Configuration File for an Elastic Search Output Plugin
- Prepare the Fluent Bit Configuration File for a Kafka Output Plugin
- Prepare the Fluent Bit Configuration File for a Splunk Output Plugin
- Prepare the Fluent Bit Configuration File for an HTTP Endpoint Output Plugin


- Prepare the Fluent Bit Configuration File for a Syslog Output Plugin

Prepare the Fluent Bit Configuration File for an Elastic Search Output Plugin

After you have set up the cluster, you must create the configuration file that is used when you deploy Fluent Bit and use it to create a Kubernetes secret.

This procedure describes how to configure Elastic Search as the output plugin on clusters that are running on vSphere, Amazon EC2, and Azure. You must already have deployed Elastic Search as the logging management backend for storing and analyzing logs.

1 Make a copy of the fluent-bit-data-values.yaml.example file and name it fluent-bit-data-values.yaml.

cp logging/fluent-bit/elasticsearch/fluent-bit-data-values.yaml.example logging/fluent-bit/elasticsearch/fluent-bit-data-values.yaml

2 Open fluent-bit-data-values.yaml in a text editor.

3 Update fluent-bit-data-values.yaml with information about your Tanzu Kubernetes cluster and Elastic Search server.

- instance_name: The name of the Tanzu Kubernetes Grid instance. This is the same as the name of the management cluster.
- cluster_name: The name of the Tanzu Kubernetes cluster on which you are deploying Fluent Bit.
- host: The IP address or host name of the target Elastic Search server.
- port: The TCP port of the target Elastic Search server.

NOTE: The host name and port number values must be wrapped in quotes ("").

For example:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
logging:
  image:
    repository: "projects.registry.vmware.com/tkg"
tkg:
  instance_name: "my-mgmt-cluster"
  cluster_name: "fluentbit-test"
fluent_bit:
  output_plugin: "elasticsearch"
  elasticsearch:
    host: "elasticsearch"
    port: "9200"

4 Optionally customize your Fluent Bit deployment.


For information about other values that you can configure to customize your Fluent Bit deployment, see Optional Configuration below.

5 Create a Kubernetes secret named fluent-bit-data-values with the values that you set in fluent-bit-data-values.yaml.

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/elasticsearch/fluent-bit-data-values.yaml -n tanzu-system-logging

You should see the confirmation secret/fluent-bit-data-values created.

For the next steps, see Deploy the Fluent Bit Extension.

Prepare the Fluent Bit Configuration File for a Kafka Output Plugin

After you have set up the cluster, you must create the configuration file that is used when you deploy Fluent Bit and use it to create a Kubernetes secret.

This procedure describes how to configure Kafka as the output plugin on clusters that are running on vSphere, Amazon EC2, and Azure. You must already have deployed Kafka as the logging management backend for storing and analyzing logs.

1 Make a copy of the fluent-bit-data-values.yaml.example file and name it fluent-bit-data-values.yaml.

cp logging/fluent-bit/kafka/fluent-bit-data-values.yaml.example logging/fluent-bit/kafka/fluent-bit-data-values.yaml

2 Open fluent-bit-data-values.yaml in a text editor.

3 Update fluent-bit-data-values.yaml with information about your Tanzu Kubernetes cluster and Kafka server.

- instance_name: The name of the Tanzu Kubernetes Grid instance. This is the same as the name of the management cluster.
- cluster_name: The name of the Tanzu Kubernetes cluster on which you are deploying Fluent Bit.
- broker_service_name: The name of the Kafka broker service.
- topic_name: The name of the topic that ingests the logs in Kafka.

NOTE: The broker service name and topic name values must be wrapped in quotes ("").

For example:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
logging:
  image:
    repository: "projects.registry.vmware.com/tkg"
tkg:
  instance_name: "my-mgmt-cluster"
  cluster_name: "fluentbit-test"
fluent_bit:
  output_plugin: "kafka"
  kafka:
    broker_service_name: "my-kafka-broker"
    topic_name: "my-kafka-topic-name"

4 Optionally customize your Fluent Bit deployment.

For information about other values that you can configure to customize your Fluent Bit deployment, see Optional Configuration below.

5 Create a Kubernetes secret named fluent-bit-data-values with the values that you set in fluent-bit-data-values.yaml.

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/kafka/fluent-bit-data-values.yaml -n tanzu-system-logging

You should see the confirmation secret/fluent-bit-data-values created.

For the next steps, see Deploy the Fluent Bit Extension.

Prepare the Fluent Bit Configuration File for a Splunk Output Plugin

After you have set up the cluster, you must create the configuration file that is used when you deploy Fluent Bit and use it to create a Kubernetes secret.

This procedure describes how to configure Splunk as the output plugin on clusters that are running on vSphere, Amazon EC2, and Azure. You must already have deployed Splunk as the logging management backend for storing and analyzing logs.

1 Make a copy of the fluent-bit-data-values.yaml.example file and name it fluent-bit-data-values.yaml.

cp logging/fluent-bit/splunk/fluent-bit-data-values.yaml.example logging/fluent-bit/splunk/fluent-bit-data-values.yaml

2 Open fluent-bit-data-values.yaml in a text editor.

3 Update fluent-bit-data-values.yaml with information about your Tanzu Kubernetes cluster and Splunk server.

- instance_name: The name of the Tanzu Kubernetes Grid instance. This is the same as the name of the management cluster.
- cluster_name: The name of the Tanzu Kubernetes cluster on which you are deploying Fluent Bit.
- host: The IP address or host name of the target Splunk server.
- port: The TCP port of the target Splunk server. The default TCP port for Splunk server is 8088.
- token: The authentication token for the HTTP event collector interface.

NOTE: The host name, port number, and token values must be wrapped in quotes ("").

For example:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
logging:
  image:
    repository: "projects.registry.vmware.com/tkg"
tkg:
  instance_name: "my-mgmt-cluster"
  cluster_name: "fluentbit-test"
fluent_bit:
  output_plugin: "splunk"
  splunk:
    host: "mysplunkhost.example.com"
    port: "8088"
    token: "f61b7aecf75e95cd226234f4fe901ed450fa323648165a91bf02f0a07c5199eb"

4 Optionally customize your Fluent Bit deployment.

For information about other values that you can configure to customize your Fluent Bit deployment, see Optional Configuration below.

5 Create a Kubernetes secret named fluent-bit-data-values with the values that you set in fluent-bit-data-values.yaml.

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/splunk/fluent-bit-data-values.yaml -n tanzu-system-logging

You should see the confirmation secret/fluent-bit-data-values created.

For the next steps, see Deploy the Fluent Bit Extension.

Prepare the Fluent Bit Configuration File for an HTTP Endpoint Output Plugin

After you have set up the cluster, you must create the configuration file that is used when you deploy Fluent Bit and use it to create a Kubernetes secret.

This procedure describes how to configure an HTTP endpoint as the output plugin on clusters that are running on vSphere, Amazon EC2, and Azure. You must already have deployed an HTTP endpoint as the logging management backend for storing and analyzing logs, such as VMware vRealize Log Insight Cloud.


NOTE: The provided implementation of Fluent Bit only works with vRealize Log Insight Cloud and not with the on-premises version of vRealize Log Insight. If you want to use the on-premises version of vRealize Log Insight, see Prepare the Fluent Bit Configuration File for a Syslog Output Plugin below.

1 Make a copy of the fluent-bit-data-values.yaml.example file and name it fluent-bit-data-values.yaml.

cp logging/fluent-bit/http/fluent-bit-data-values.yaml.example logging/fluent-bit/http/fluent-bit-data-values.yaml

2 Open fluent-bit-data-values.yaml in a text editor.

3 Update fluent-bit-data-values.yaml with information about your Tanzu Kubernetes cluster and HTTP logging endpoint.

- instance_name: The name of the Tanzu Kubernetes Grid instance. This is the same as the name of the management cluster.
- cluster_name: The name of the Tanzu Kubernetes cluster on which you are deploying Fluent Bit.
- host: The IP address or host name of the target HTTP server.
- port: The TCP port of the target HTTP server.
- uri: The HTTP URI for the target web server.
- header_key_value: The HTTP header key/value pair. For example, for VMware vRealize Log Insight Cloud, set the key to the authorization bearer token and the value to the API token.
- format: The data format to use in the HTTP request body. For example, for vRealize Log Insight Cloud, leave this value set to json.

NOTE: The host name, port number, URI, header key value, and format values must be wrapped in quotes ("").

For example:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
logging:
  image:
    repository: projects.registry.vmware.com/tkg
tkg:
  instance_name: "my-mgmt-cluster"
  cluster_name: "fluentbit-test"
fluent_bit:
  output_plugin: "http"
  http:
    host: "data.mgmt.cloud.vmware.com"
    port: "443"
    uri: "/le-mans/v1/streams/ingestion-pipeline-stream"
    header_key_value: "Authorization Bearer H986qVcPlbQry1UWbY8QuSMXhAtAYsG6"
    format: "json"

4 Optionally customize your Fluent Bit deployment.

For information about other values that you can configure to customize your Fluent Bit deployment, see Optional Configuration below.

5 Create a Kubernetes secret named fluent-bit-data-values with the values that you set in fluent-bit-data-values.yaml.

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/http/fluent-bit-data-values.yaml -n tanzu-system-logging

You should see the confirmation secret/fluent-bit-data-values created.

For the next steps, see Deploy the Fluent Bit Extension.

Prepare the Fluent Bit Configuration File for a Syslog Output Plugin

This procedure describes how to configure a syslog endpoint as the output plugin on a cluster that is running on vSphere, Amazon EC2, or Azure. You must already have deployed a syslog endpoint as the logging management backend for storing and analyzing logs, such as VMware vRealize Log Insight.

NOTE: The provided implementation of Fluent Bit works only with the on-premises version of vRealize Log Insight and not with vRealize Log Insight Cloud. If you want to use vRealize Log Insight Cloud, see Prepare the Fluent Bit Configuration File for an HTTP Endpoint Output Plugin above.

After you have set up the cluster, you must create the Fluent Bit configuration file and use it to create a Kubernetes secret:

1 Make a copy of the fluent-bit-data-values.yaml.example file and name it fluent-bit-data-values.yaml.

cp logging/fluent-bit/syslog/fluent-bit-data-values.yaml.example logging/fluent-bit/syslog/fluent-bit-data-values.yaml

2 Open fluent-bit-data-values.yaml in a text editor.

3 Update fluent-bit-data-values.yaml with information about your cluster and syslog endpoint.

- instance_name: The name of the Tanzu Kubernetes Grid instance. This is the same as the name of the management cluster.
- cluster_name: The name of the Tanzu Kubernetes cluster on which you are deploying Fluent Bit.
- host: The domain name or IP address of your syslog server.
- port: The TCP or UDP port of your syslog server.
- mode: The transport protocol to use for syslog messages. Set to tcp, udp, or tls.
- format: The format to use for syslog messages. For example, rfc5424.

NOTE: These values must be wrapped in quotes ("").

For example:

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
logging:
  image:
    repository: projects.registry.vmware.com/tkg
tkg:
  instance_name: "my-mgmt-cluster"
  cluster_name: "fluentbit-test"
fluent_bit:
  output_plugin: "syslog"
  syslog:
    host: "10.182.176.50"
    port: "514"
    mode: "tcp"
    format: "rfc5424"

4 Optionally, customize your Fluent Bit deployment.

For information about other values that you can configure to customize your Fluent Bit deployment, see Optional Configuration below.

5 Create a Kubernetes secret named fluent-bit-data-values with the values that you set in fluent-bit-data-values.yaml.

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/syslog/fluent-bit-data-values.yaml -n tanzu-system-logging

You should see the confirmation secret/fluent-bit-data-values created.

For the next steps, see Deploy the Fluent Bit Extension.

Deploy the Fluent Bit Extension

After you have prepared the cluster and updated the appropriate configuration file for your logging endpoint, you can deploy Fluent Bit on the cluster.

1 Deploy the Fluent Bit extension.

kubectl apply -f logging/fluent-bit/fluent-bit-extension.yaml

You should see the confirmation extension.clusters.tmc.cloud.vmware.com/fluent-bit created.

2 View the status of the Fluent Bit service.

kubectl get app fluent-bit -n tanzu-system-logging


The status of the Fluent Bit app should show Reconcile Succeeded when Fluent Bit has deployed successfully.

NAME         DESCRIPTION           SINCE-DEPLOY   AGE
fluent-bit   Reconcile succeeded   54s            14m

3 If the status is not Reconcile Succeeded, view the full status details of the Fluent Bit service.

Viewing the full status can help you to troubleshoot the problem.

kubectl get app fluent-bit -n tanzu-system-logging -o yaml

4 Check that the new services are running by listing all of the pods that are running in the cluster.

kubectl get pods -A

In the tanzu-system-logging namespace, you should see the fluent-bit pods running, with names similar to fluent-bit-6zn4l.

NAMESPACE              NAME               READY   STATUS    RESTARTS   AGE
[...]
tanzu-system-logging   fluent-bit-6zn4l   1/1     Running   0          22m
tanzu-system-logging   fluent-bit-g8tkp   1/1     Running   0          22m

Fluent Bit will now capture logs from all of the containers that run in pods in the cluster. Fluent Bit also captures logs from systemd for the kubelet and containerd services. The kubelet and containerd services do not run in containers, and they write their logs to journald. Fluent Bit implements a systemd input plugin to collect log messages from the journald daemon.
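To confirm that Fluent Bit is reading these sources and flushing records to your configured output plugin, you can tail the logs of one of the Fluent Bit pods. The pod name here is taken from the example output above; substitute a name from your own cluster:

kubectl logs fluent-bit-6zn4l -n tanzu-system-logging --tail=20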

Monitoring and Viewing Logs

You can use different dashboards to view all of the logs that Fluent Bit collects, depending on the logging endpoint that you configured. For example, if you configured an Elastic Search endpoint, you can view logs in the Kibana dashboard, or if you implemented an HTTP endpoint, you can view them in vRealize Log Insight dashboards. The image below shows an example of a Kibana dashboard.


The Fluent Bit deployment that Tanzu Kubernetes Grid provides is pre-configured to expose certain metrics to Prometheus. If you also deploy the Tanzu Kubernetes Grid implementation of Prometheus on your cluster, it includes a predefined job named kubernetes_pods that you can use to view Fluent Bit metrics in the Prometheus dashboard without performing any additional configuration.

The image below shows an example of the same data viewed in a Grafana dashboard.


For information about the Prometheus and Grafana extensions, see Implementing Monitoring with Prometheus and Grafana.

Optional Configuration

In addition to the minimum configuration provided in the fluent-bit-data-values.yaml file for your logging endpoint, you can customize your configuration by adding values that you can copy from the file tkg-extensions-v1.3.1+vmware.1/logging/fluent-bit/values.yaml. Note that this file is not located in the tkg-extensions-v1.3.1+vmware.1/extensions/logging/fluent-bit folder, but in the logging folder that is at the same level as the extensions folder.

You can also customize your Fluent Bit logging setup using ytt overlays, as described in Extensions and Shared Services in Customizing Clusters, Plans, and Extensions with ytt Overlays and in the extensions mods examples in the TKG Lab repository.

The table below contains information on the values you can copy from the tkg-extensions-v1.3.1+vmware.1/logging/fluent-bit/values.yaml file and how they can be used to modify the default behavior of Fluent Bit when deployed onto a Tanzu Kubernetes cluster.

NOTE: Where applicable, the settings that you configure in fluent-bit-data-values.yaml override any settings that you configure in values.yaml.

If you reconfigure your Fluent Bit settings after the initial deployment, you must follow the steps in Update a Running Fluent Bit Deployment in order to apply the new configuration to the cluster.

Parameter | Type and Description | Default
--------- | -------------------- | -------
logging.namespace | String. Namespace in which to deploy Fluent Bit. | tanzu-system-logging
logging.service_account_name | String. Name of the Fluent Bit service account. | fluent-bit
logging.cluster_role_name | String. Name of the cluster role that grants get, watch, and list permissions to Fluent Bit. | fluent-bit-read
logging.image.name | String. Name of the Fluent Bit image. | fluent-bit
logging.image.tag | String. Fluent Bit image tag. | v1.5.3_vmware.1
logging.image.repository | String. Repository containing the Fluent Bit image. | projects.registry.vmware.com/tkg
logging.image.pullPolicy | String. Fluent Bit image pull policy. | IfNotPresent
logging.update_strategy | String. Update strategy to use when updating the DaemonSet. | RollingUpdate
tkg.cluster_name | String. Name of the Tanzu Kubernetes cluster. Set this value in fluent-bit-data-values.yaml. | null
tkg.instance_name | String. Name of the Tanzu Kubernetes Grid management cluster. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.log_level | String. Log level to use for Fluent Bit. | info
fluent_bit.output_plugin | String. The back end to which Fluent Bit should flush the information it gathers. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.elasticsearch.host | String. IP address or host name of the target Elastic Search instance. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.elasticsearch.port | Integer. TCP port of the target Elastic Search instance. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.kafka.broker_service_name | String. Single or multiple list of Kafka brokers, for example 192.168.1.3:9092. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.kafka.topic_name | String. Single entry or list of topics, separated by commas, that Fluent Bit uses to send messages to Kafka. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.splunk.host | String. IP address or host name of the target Splunk server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.splunk.port | Integer. TCP port of the target Splunk server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.splunk.token | String. Authentication token for the HTTP event collector interface. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.http.host | String. IP address or host name of the target HTTP server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.http.port | Integer. TCP port of the target HTTP server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.http.header_key_value | String. HTTP header key/value pair. Multiple headers can be set. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.http.format | String. Data format to use in the HTTP request body. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.syslog.host | String. Domain name or IP address of the target syslog server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.syslog.port | Integer. TCP or UDP port of the target syslog server. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.syslog.mode | String. Transport protocol for syslog messages. This must be tcp, udp, or tls. Set this value in fluent-bit-data-values.yaml. | null
fluent_bit.syslog.format | String. Format for syslog messages. Set this value in fluent-bit-data-values.yaml. | null
host_path.volume_1 | String. Directory path from the host node's file system into the pod, for volume 1. | /var/log
host_path.volume_2 | String. Directory path from the host node's file system into the pod, for volume 2. | /var/lib/docker/containers
host_path.volume_3 | String. Directory path from the host node's file system into the pod, for volume 3. | /run/log

Update a Running Fluent Bit Deployment

If you need to make changes to the configuration of the Fluent Bit extension, you can update the extension.

Perform the steps in this procedure if you modify either the fluent-bit-data-values.yaml file for your output plugin under extensions/logging/fluent-bit/ or the logging/fluent-bit/values.yaml file after the initial deployment of Fluent Bit.

1 Modify either or both of the fluent-bit-data-values.yaml file for your output plugin under extensions/logging/fluent-bit/ and the logging/fluent-bit/values.yaml file to update your configuration.

2 Update the Kubernetes secret.

This command assumes that you are running it from tkg-extensions-v1.3.1+vmware.1/extensions.

Elastic Search:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/elasticsearch/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-


Kafka:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/kafka/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

Splunk:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/splunk/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

HTTP:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/http/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

Syslog:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=logging/fluent-bit/syslog/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

Note that the final - on the kubectl replace command above is necessary to instruct kubectl to accept the input being piped to it from the kubectl create secret command.

The Fluent Bit extension will be reconciled using the new values you just added. The changes should show up in five minutes or less. This is handled by the Kapp controller, which synchronizes every five minutes.

Implementing Monitoring with Prometheus and Grafana

Tanzu Kubernetes Grid provides cluster monitoring services by implementing the open source Prometheus and Grafana projects.

- Prometheus is an open source systems monitoring and alerting toolkit. It can collect metrics from target clusters at specified intervals, evaluate rule expressions, display the results, and trigger alerts if certain conditions arise. For more information about Prometheus, see the Prometheus Overview. The Tanzu Kubernetes Grid implementation of Prometheus includes Alert Manager, which you can configure to notify you when certain events occur.
- Grafana is open source visualization and analytics software. It allows you to query, visualize, alert on, and explore your metrics no matter where they are stored. In other words, Grafana provides you with tools to turn your time-series database (TSDB) data into high-quality graphs and visualizations. For more information about Grafana, see What is Grafana?.

You deploy Prometheus and Grafana on Tanzu Kubernetes clusters. The following diagram shows how the monitoring components on a cluster interact.


For instructions on how to deploy Prometheus and Grafana on your clusters, see:

- Deploy Prometheus on Tanzu Kubernetes Clusters
- Deploy Grafana on Tanzu Kubernetes Clusters

Deploy Prometheus on Tanzu Kubernetes Clusters

Tanzu Kubernetes Grid includes signed binaries for Prometheus, an open-source systems monitoring and alerting toolkit. You can deploy Prometheus on Tanzu Kubernetes clusters to monitor services in your clusters.

Prerequisites

Before you begin this procedure, you must have:

- Downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information on how to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.


- Installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
- Deployed a Tanzu Kubernetes Grid management cluster on vSphere, Amazon EC2, or Azure.
- Deployed a Tanzu Kubernetes cluster. The examples in this topic use a cluster named monitoring-cluster.

IMPORTANT:

- Tanzu Kubernetes Grid does not support IPv6 addresses because upstream Kubernetes only provides alpha support for IPv6. In the following procedures, you must always provide IPv4 addresses.
- The extensions folder tkg-extensions-v1.3.1+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prepare the Tanzu Kubernetes Cluster for Prometheus Deployment

Before you can deploy Prometheus on a Tanzu Kubernetes cluster, you must install the tools that the Prometheus extension requires.

This procedure applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure.

1 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

You should see subfolders named authentication, ingress, logging, monitoring, registry, and some YAML files. Run all of the commands in these procedures from this location.

2 Retrieve the admin credentials of the cluster.

tanzu cluster kubeconfig get monitoring-cluster --admin

3 Set the context of kubectl to the cluster.

kubectl config use-context monitoring-cluster-admin@monitoring-cluster

4 If you haven't already, install cert-manager on the Tanzu Kubernetes workload cluster by following the procedure in Install Cert Manager on Workload Clusters.

5 Create a namespace for the Prometheus service on the Tanzu Kubernetes cluster.

kubectl apply -f monitoring/prometheus/namespace-role.yaml


You should see confirmation that a tanzu-system-monitoring namespace, service account, and RBAC role bindings have been created.

namespace/tanzu-system-monitoring created
serviceaccount/prometheus-extension-sa created
role.rbac.authorization.k8s.io/prometheus-extension-role created
rolebinding.rbac.authorization.k8s.io/prometheus-extension-rolebinding created
clusterrole.rbac.authorization.k8s.io/prometheus-extension-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/prometheus-extension-cluster-rolebinding created

When all pods are ready, the Tanzu Kubernetes cluster is ready for you to deploy the Prometheus extension. To do so, follow the procedure in Prepare the Prometheus Extension Configuration File.

Prepare the Prometheus Extension Configuration File

This procedure describes how to prepare the Prometheus extension configuration file for a Tanzu Kubernetes cluster. This configuration file applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure and is required to deploy the Prometheus extension.

For additional configuration options, you can use ytt overlays as described in Extensions and Shared Services in Customizing Clusters, Plans, and Extensions with ytt Overlays and in the extensions mods examples in the TKG Lab repository.

1 Make a copy of the prometheus-data-values.yaml.example file and name it prometheus-data-values.yaml.

cp monitoring/prometheus/prometheus-data-values.yaml.example monitoring/prometheus/prometheus-data-values.yaml

After you have renamed the file, you do not need to modify it. The prometheus-data-values.yaml file only designates whether you are deploying to a cluster that is running on vSphere, Amazon EC2, or Azure.

2 You can now either deploy the Prometheus extension with default values, or you can customize the deployment.

- To deploy Prometheus using default configuration values, proceed directly to Deploy Prometheus on the Tanzu Kubernetes Cluster. The default values for configuration parameters are listed in the table in Customize Your Prometheus Deployment.

- To customize your Prometheus deployment, see Customize Your Prometheus Deployment. For example, you can customize how often Prometheus scrapes the cluster for data or configure how notifications and alerts are sent via Slack and email.


Customize Your Prometheus Deployment

In addition to the minimum configuration provided in the prometheus-data-values.yaml file, you can customize your configuration by adding values that you can copy from the file tkg-extensions-v1.3.1+vmware.1/monitoring/prometheus/values.yaml into prometheus-data-values.yaml. Note that this file is not located in the tkg-extensions-v1.3.1+vmware.1/extensions/monitoring/prometheus folder but in the monitoring folder that is at the same level as the extensions folder.

If you modify the values.yaml file before you deploy the Prometheus extension, then the custom settings take effect immediately upon deployment. For instructions, see Deploy Prometheus on the Tanzu Kubernetes Cluster.

If you modify the values.yaml file after you deploy the Prometheus extension, then you must update your running deployment. For instructions, see Update a Running Prometheus Deployment.

Prometheus Extension Configuration Parameters

The following table lists configuration parameters of the Prometheus extension and describes their default values. To customize Prometheus, specify the parameters and their custom values in the extensions/monitoring/prometheus/prometheus-data-values.yaml file for your Tanzu Kubernetes cluster.

Parameter | Type and Description | Default
--------- | -------------------- | -------
monitoring.namespace | String. Namespace where Prometheus is deployed. | tanzu-system-monitoring
monitoring.create_namespace | Boolean. The flag indicates whether to create the namespace specified by monitoring.namespace. | false
monitoring.prometheus_server.config.alerting_rules_yaml | YAML file. Detailed alert rules defined in Prometheus. | alerting_rules.yaml
monitoring.prometheus_server.config.recording_rules_yaml | YAML file. Detailed record rules defined in Prometheus. | recording_rules.yaml
monitoring.prometheus_server.service.type | String. Type of service to expose Prometheus. Supported value: ClusterIP. | ClusterIP
monitoring.prometheus_server.enable_alerts.kubernetes_api | Boolean. Enable SLO alerting for the Kubernetes API in Prometheus. | true
monitoring.prometheus_server.sc.sc_enabled | Boolean. Define if StorageClass is enabled in the deployment. | false
monitoring.prometheus_server.sc.is_default | Boolean. Define if the current StorageClass is the default StorageClass. | false
monitoring.prometheus_server.sc.vsphereDatastoreurl | String. Datastore URL for the StorageClass used in vCenter. | "xxx-xxx-xxxx"
monitoring.prometheus_server.sc.aws_type | String. AWS type defined for the StorageClass on AWS. | gp2
monitoring.prometheus_server.sc.aws_fsType | String. AWS file system type defined for the StorageClass on AWS. | ext4
monitoring.prometheus_server.sc.allowVolumeExpansion | Boolean. Define if volume expansion is allowed for the StorageClass on AWS. | true
monitoring.prometheus_server.pvc.annotations | Map. Storage class annotations. | {}
monitoring.prometheus_server.pvc.storage_class | String. Storage class to use for the Persistent Volume Claim. By default, this is null and the default provisioner is used. | null
monitoring.prometheus_server.pvc.accessMode | String. Access mode for the Persistent Volume Claim: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | ReadWriteOnce
monitoring.prometheus_server.pvc.storage | String. Storage size for the Persistent Volume Claim. | 8Gi
monitoring.prometheus_server.deployment.replicas | Integer. Number of Prometheus replicas. | 1
monitoring.prometheus_server.image.repository | String. Repository containing the Prometheus image. | projects.registry.vmware.com/tkg/prometheus
monitoring.prometheus_server.image.name | String. Name of the Prometheus image. | prometheus
monitoring.prometheus_server.image.tag | String. Image tag for the Prometheus image. | v2.17.1_vmware.1
monitoring.prometheus_server.image.pullPolicy | String. Image pull policy for the Prometheus image. | IfNotPresent
monitoring.alertmanager.config.slack_demo | String. Slack notification configuration for Alert Manager. | See the Slack configuration example below
monitoring.alertmanager.config.email_receiver | String. Email notification configuration for Alert Manager. | See the email configuration example below
monitoring.alertmanager.service.type | String. Type of service to expose Alert Manager. Supported value: ClusterIP. | ClusterIP
monitoring.alertmanager.image.repository | String. Repository containing the Alert Manager image. | projects.registry.vmware.com/tkg/prometheus
monitoring.alertmanager.image.name | String. Name of the Alert Manager image. | alertmanager
monitoring.alertmanager.image.tag | String. Image tag for the Alert Manager image. | v0.20.0_vmware.1
monitoring.alertmanager.image.pullPolicy | String. Image pull policy for the Alert Manager image. | IfNotPresent
monitoring.alertmanager.pvc.annotations | Map. StorageClass annotations. | {}
monitoring.alertmanager.pvc.storage_class | String. StorageClass to use for the Persistent Volume Claim. By default, this is null and the default provisioner is used. | null
monitoring.alertmanager.pvc.accessMode | String. Access mode for the Persistent Volume Claim: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | ReadWriteOnce
monitoring.alertmanager.pvc.storage | String. Storage size for the Persistent Volume Claim. | 2Gi
monitoring.alertmanager.deployment.replicas | Integer. Number of Alert Manager replicas. | 1
monitoring.kube_state_metrics.image.repository | String. Repository containing the kube-state-metrics image. | projects.registry.vmware.com/tkg/prometheus
monitoring.kube_state_metrics.image.name | String. Name of the kube-state-metrics image. | kube-state-metrics
monitoring.kube_state_metrics.image.tag | String. Image tag for the kube-state-metrics image. | v1.9.5_vmware.1
monitoring.kube_state_metrics.image.pullPolicy | String. Image pull policy for the kube-state-metrics image. | IfNotPresent
monitoring.kube_state_metrics.deployment.replicas | Integer. Number of kube-state-metrics replicas. | 1
monitoring.node_exporter.image.repository | String. Repository containing the node-exporter image. | projects.registry.vmware.com/tkg/prometheus
monitoring.node_exporter.image.name | String. Name of the node-exporter image. | node-exporter
monitoring.node_exporter.image.tag | String. Image tag for the node-exporter image. | v0.18.1_vmware.1
monitoring.node_exporter.image.pullPolicy | String. Image pull policy for the node-exporter image. | IfNotPresent
monitoring.node_exporter.deployment.replicas | Integer. Number of node-exporter replicas. | 1
monitoring.pushgateway.image.repository | String. Repository containing the pushgateway image. | projects.registry.vmware.com/tkg/prometheus
monitoring.pushgateway.image.name | String. Name of the pushgateway image. | pushgateway
monitoring.pushgateway.image.tag | String. Image tag for the pushgateway image. | v1.2.0_vmware.1
monitoring.pushgateway.image.pullPolicy | String. Image pull policy for the pushgateway image. | IfNotPresent
monitoring.pushgateway.deployment.replicas | Integer. Number of pushgateway replicas. | 1
monitoring.cadvisor.image.repository | String. Repository containing the cadvisor image. | projects.registry.vmware.com/tkg/prometheus
monitoring.cadvisor.image.name | String. Name of the cadvisor image. | cadvisor
monitoring.cadvisor.image.tag | String. Image tag for the cadvisor image. | v0.36.0_vmware.1
monitoring.cadvisor.image.pullPolicy | String. Image pull policy for the cadvisor image. | IfNotPresent
monitoring.cadvisor.deployment.replicas | Integer. Number of cadvisor replicas. | 1
monitoring.ingress.enabled | Boolean. Enable or disable ingress for Prometheus and Alert Manager. | false
monitoring.ingress.virtual_host_fqdn | String. Hostname for accessing Prometheus and Alert Manager. | prometheus.system.tanzu
monitoring.ingress.prometheus_prefix | String. Path prefix for Prometheus. | /
monitoring.ingress.alertmanager_prefix | String. Path prefix for Alert Manager. | /alertmanager/
monitoring.ingress.tlsCertificate.tls.crt | String. Optional certificate for ingress if you want to use your own TLS certificate. A self-signed certificate is generated by default. | Generated certificate
monitoring.ingress.tlsCertificate.tls.key | String. Optional certificate private key for ingress if you want to use your own TLS certificate. | Generated cert key

Slack Configuration Example

To configure Slack notifications in Alert Manager, you can use the following YAML as an example configuration:

slack_demo: name: slack_demo slack_configs: - api_url: https://hooks.slack.com channel: '#alertmanager-test'

Email Configuration Example

The YAML below provides an example of how to configure email notifications in Alert Manager.

email_receiver: name: email-receiver email_configs: - to: [email protected] send_resolved: false from: [email protected] smarthost: smtp.example.com:25 require_tls: false

Deploy Prometheus on a Tanzu Kubernetes Cluster

After you have prepared a Tanzu Kubernetes cluster, you can deploy the Prometheus extension on the cluster. As part of the preparation, you have updated the appropriate configuration file for your platform and optionally customized your deployment.


This procedure applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure.

1 Create a Kubernetes secret named prometheus-data-values with the values that you set in monitoring/prometheus/prometheus-data-values.yaml.

kubectl create secret generic prometheus-data-values --from-file=values.yaml=monitoring/prometheus/prometheus-data-values.yaml -n tanzu-system-monitoring

2 Deploy the Prometheus extension.

kubectl apply -f monitoring/prometheus/prometheus-extension.yaml

You should see a confirmation that extensions.clusters.tmc.cloud.vmware.com/prometheus has been created.

3 The extension takes several minutes to deploy. To check the status of the deployment, use the kubectl get app command.

kubectl get app prometheus -n tanzu-system-monitoring

While the extension is being deployed, the "Description" field from the kubectl get app command shows a status of Reconciling. After Prometheus is deployed successfully, the status of the Prometheus app shown by the kubectl get app command changes to Reconcile succeeded.

You can view detailed status information with this command:

kubectl get app prometheus -n tanzu-system-monitoring -o yaml

Update a Running Prometheus Deployment

Use this procedure to update the configuration of an already deployed Prometheus extension.

1 Using the information in Customize Your Prometheus Deployment as a reference, make the necessary changes to the prometheus-data-values.yaml file that you created when you deployed the extension. For example, if you need to change the number of Prometheus replicas from the default value of 1, add this configuration to the prometheus-data-values.yaml file:

monitoring:
  prometheus_server:
    deployment:
      replicas: 2

After you have made all applicable changes, save the file.

2 Update the Kubernetes secret.


This command assumes that you are running it from tkg-extensions-v1.3.1+vmware.1/extensions.

kubectl create secret generic prometheus-data-values --from-file=values.yaml=monitoring/prometheus/prometheus-data-values.yaml -n tanzu-system-monitoring -o yaml --dry-run=client | kubectl replace -f -

Note that the final - on the kubectl replace command above is necessary to instruct kubectl to accept the input being piped to it from the kubectl create secret command.

The Prometheus extension is reconciled using the new values that you just added. The changes should show up in five minutes or less. Updates are handled by the Kapp controller, which synchronizes every five minutes.

Access the Prometheus Dashboard

By default, ingress is not enabled on Prometheus. This is because access to the Prometheus dashboard is not authenticated. If you want to access the Prometheus dashboard, you must perform the following steps.

1 Deploy Contour on the cluster.

For information about deploying Contour, see Implementing Ingress Control with Contour.

2 Copy the monitoring.ingress.enabled section from tkg-extensions-v1.3.1+vmware.1/monitoring/prometheus/values.yaml into prometheus-data-values.yaml.

Copy the section into the position shown in this example.

#@data/values
#@overlay/match-child-defaults missing_ok=True
---
infrastructure_provider: "vsphere"
monitoring:
  ingress:
    enabled: false
    virtual_host_fqdn: "prometheus.corp.tanzu"
    prometheus_prefix: "/"
    alertmanager_prefix: "/alertmanager/"

3 Update monitoring.ingress.enabled from false to true.

4 Create a DNS record to map prometheus.corp.tanzu to the address of the Envoy load balancer.

To obtain the address of the Envoy load balancer, see Implementing Ingress Control with Contour.
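For example, one way to obtain the address of the Envoy load balancer is to query the Envoy service directly; this is the same command used later in this documentation in the Harbor topic. On vSphere without NSX Advanced Load Balancer, the Envoy service is exposed via NodePort, so use the IP address of a worker node instead.

kubectl get svc envoy -n tanzu-system-ingress -o jsonpath='{.status.loadBalancer.ingress[0]}'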

5 Access the Prometheus dashboard by navigating to http://prometheus.corp.tanzu in a browser.


What to Do Next

The Prometheus extension is now running and scraping data from your cluster. To visualize the data in Grafana dashboards, see Deploy Grafana on Tanzu Kubernetes Clusters.

If you need to remove the Prometheus extension on your cluster, see Delete Tanzu Kubernetes Grid Extensions.

Deploy Grafana on Tanzu Kubernetes Clusters

Tanzu Kubernetes Grid includes a Grafana extension that you can deploy on your Tanzu Kubernetes clusters. Grafana allows you to visualize and analyze metrics data collected by Prometheus on your clusters.

Prerequisites

Before you begin this procedure, you must have:

- Downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
- Installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
- Deployed a Tanzu Kubernetes Grid management cluster on vSphere, Amazon EC2, or Azure.
- Deployed a Tanzu Kubernetes cluster. The examples in this topic use a cluster named monitoring-cluster.
- Installed the Prometheus extension on the Tanzu Kubernetes cluster. For instructions on how to install Prometheus, see Deploy Prometheus on Tanzu Kubernetes Clusters.
- Installed Contour for ingress control on the Tanzu Kubernetes cluster. For information on installing Contour, see Implementing Ingress Control with Contour.

IMPORTANT:

- Tanzu Kubernetes Grid does not support IPv6 addresses because upstream Kubernetes only provides alpha support for IPv6. In the following procedures, you must always provide IPv4 addresses.
- The extensions folder tkg-extensions-v1.3.1+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prepare the Tanzu Kubernetes Cluster for the Grafana Extension

To deploy the Grafana extension, you must prepare the specific Tanzu Kubernetes cluster where you plan to deploy the extension. First you must install a few supporting applications on the Tanzu Kubernetes cluster.

If you have already installed another extension such as the Prometheus extension onto the Tanzu Kubernetes cluster, then you can skip this section and proceed directly to Prepare the Grafana Extension Configuration File.

This procedure applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure.

1 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

You should see subfolders named authentication, ingress, logging, monitoring, registry, service-discovery and some YAML files.

2 Retrieve the admin credentials of the Tanzu Kubernetes cluster.

tanzu cluster kubeconfig get monitoring-cluster --admin

3 Set the context of kubectl to the Tanzu Kubernetes cluster.

kubectl config use-context monitoring-cluster-admin@monitoring-cluster

4 If you haven't already, install cert-manager on the Tanzu Kubernetes workload cluster by following the procedure in Install Cert Manager on Workload Clusters.

When all pods are ready, the Tanzu Kubernetes cluster is ready for you to deploy the Grafana extension. To do so, follow the procedure in Prepare the Grafana Extension Configuration File.


Prepare the Grafana Extension Configuration File

This procedure describes how to prepare the Grafana extension configuration file for Tanzu Kubernetes clusters. This configuration file applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure and is required to deploy the Grafana extension.

For additional configuration options, you can use ytt overlays as described in Extensions and Shared Services in Customizing Clusters, Plans, and Extensions with ytt Overlays and in the extensions mods examples in the TKG Lab repository.

1 Make a copy of the grafana-data-values.yaml.example file for your infrastructure platform, and name the file grafana-data-values.yaml.

cp monitoring/grafana/grafana-data-values.yaml.example monitoring/grafana/grafana-data-values.yaml

2 Edit the grafana-data-values.yaml file and replace the value of monitoring.grafana.secret.admin_password with a Base64-encoded password.

To generate a Base64 encoded password, run the following command:

echo -n 'mypassword' | base64

You can also use the Base64 encoding tool at https://www.base64encode.org/ to encode your password. For example, by using either method, a password of mypassword results in the encoded password bXlwYXNzd29yZA==.

3 Save grafana-data-values.yaml when you are finished.

4 You can now either deploy the Grafana extension with default values, or you can customize the deployment.

n To deploy Grafana using default configuration values, proceed directly to Deploy Grafana on a Tanzu Kubernetes Cluster. By default, grafana-data-values.yaml only contains the configuration of the infrastructure provider and a default administrative password.

n To customize your Grafana deployment, see Customize Your Grafana Deployment. For example, you can specify LDAP authentication or configure storage for Grafana.

Customize the Configuration of the Grafana Extension

You can customize the configuration of the Grafana extension by editing the tkg-extensions-v1.3.1+vmware.1/extensions/monitoring/grafana/grafana-data-values.yaml file.

If you modify this file before you deploy the Grafana extension, then the custom settings take effect immediately upon deployment. For instructions, see Deploy Grafana on a Tanzu Kubernetes Cluster.

If you modify this file after you deploy the Grafana extension, then you must update your running deployment. For instructions, see Update a Running Grafana Extension.

Grafana Extension Configuration Parameters


The following table describes configuration parameters of the Grafana extension and their default values. To customize Grafana, specify the parameters and their custom values in the grafana-data-values.yaml file of your Tanzu Kubernetes cluster.

| Parameter | Type and Description | Default |
|---|---|---|
| monitoring.namespace | String. Namespace in which to deploy Grafana. | tanzu-system-monitoring |
| monitoring.create_namespace | Boolean. The flag indicates whether to create the namespace specified by monitoring.namespace. | false |
| monitoring.grafana.cluster_role.apiGroups | List. API group defined for Grafana ClusterRole. | [""] |
| monitoring.grafana.cluster_role.resources | List. Resources defined for Grafana ClusterRole. | ["configmaps", "secrets"] |
| monitoring.grafana.cluster_role.verbs | List. Access permission defined for ClusterRole. | ["get", "watch", "list"] |
| monitoring.grafana.config.grafana_ini | Config file. Grafana configuration file details. | grafana.ini |
| monitoring.grafana.config.datasource.type | String. Grafana datasource type. | prometheus |
| monitoring.grafana.config.datasource.access | String. Access mode, proxy or direct (Server or Browser in the UI). | proxy |
| monitoring.grafana.config.datasource.isDefault | Boolean. Flag to mark the default Grafana datasource. | true |
| monitoring.grafana.config.provider_yaml | YAML file. Config file to define the Grafana dashboard provider. | provider.yaml |
| monitoring.grafana.service.type | String. Type of Kubernetes Service to expose Grafana: ClusterIP, NodePort, LoadBalancer. | vSphere: NodePort, AWS/Azure: LoadBalancer |
| monitoring.grafana.pvc.storage_class | String. StorageClass to use for Persistent Volume Claim. By default this is null and the default provisioner is used. | null |
| monitoring.grafana.pvc.accessMode | String. Define access mode for Persistent Volume Claim: ReadWriteOnce, ReadOnlyMany, ReadWriteMany. | ReadWriteOnce |
| monitoring.grafana.pvc.storage | String. Define storage size for Persistent Volume Claim. | 2Gi |
| monitoring.grafana.deployment.replicas | Integer. Number of Grafana replicas. | 1 |
| monitoring.grafana.image.repository | String. Repository containing the Grafana image. | projects.registry.vmware.com/tkg/grafana |
| monitoring.grafana.image.name | String. Name of the Grafana image. | grafana |
| monitoring.grafana.image.tag | String. Image tag of the Grafana image. | v7.0.3_vmware.1 |
| monitoring.grafana.image.pullPolicy | String. Image pull policy for the Grafana image. | IfNotPresent |
| monitoring.grafana.secret.type | String. Secret type defined for the Grafana dashboard. | Opaque |
| monitoring.grafana.secret.admin_user | String. Username to access the Grafana dashboard. | YWRtaW4= (admin in Base64 encoding) |
| monitoring.grafana.secret.admin_password | String. Password to access the Grafana dashboard. | null |
| monitoring.grafana.secret.ldap_toml | String. If using LDAP authentication, LDAP configuration file path. | "" |
| monitoring.grafana_init_container.image.repository | String. Repository containing the Grafana init container image. | projects.registry.vmware.com/tkg/grafana |
| monitoring.grafana_init_container.image.name | String. Name of the Grafana init container image. | k8s-sidecar |
| monitoring.grafana_init_container.image.tag | String. Image tag of the Grafana init container image. | 0.1.99 |
| monitoring.grafana_init_container.image.pullPolicy | String. Image pull policy for the Grafana init container image. | IfNotPresent |
| monitoring.grafana_sc_dashboard.image.repository | String. Repository containing the Grafana dashboard image. | projects.registry.vmware.com/tkg/grafana |
| monitoring.grafana_sc_dashboard.image.name | String. Name of the Grafana dashboard image. | k8s-sidecar |
| monitoring.grafana_sc_dashboard.image.tag | String. Image tag of the Grafana dashboard image. | 0.1.99 |
| monitoring.grafana_sc_dashboard.image.pullPolicy | String. Image pull policy for the Grafana dashboard image. | IfNotPresent |
| monitoring.grafana.ingress.enabled | Boolean. Enable/disable ingress for Grafana. | true |
| monitoring.grafana.ingress.virtual_host_fqdn | String. Hostname for accessing Grafana. | grafana.system.tanzu |
| monitoring.grafana.ingress.prefix | String. Path prefix for Grafana. | / |
| monitoring.grafana.ingress.tlsCertificate.tls.crt | String. Optional certificate for ingress if you want to use your own TLS certificate; a self-signed certificate is generated by default. | Generated certificate |
| monitoring.grafana.ingress.tlsCertificate.tls.key | String. Optional certificate private key for ingress if you want to use your own TLS certificate. | Generated certificate private key |

Deploy Grafana on a Tanzu Kubernetes Cluster

After you have prepared a Tanzu Kubernetes cluster, you can deploy the Grafana extension on the cluster. As part of the preparation, you have updated the appropriate configuration file for your platform and optionally customized your deployment.

This procedure applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure.

1 Create the namespace and RBAC roles for Grafana.

kubectl apply -f extensions/monitoring/grafana/namespace-role.yaml

You should see confirmation that a tanzu-system-monitoring namespace, service account, and RBAC role bindings are created for Grafana.

namespace/tanzu-system-monitoring unchanged
serviceaccount/grafana-extension-sa created
role.rbac.authorization.k8s.io/grafana-extension-role created
rolebinding.rbac.authorization.k8s.io/grafana-extension-rolebinding created
clusterrole.rbac.authorization.k8s.io/grafana-extension-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/grafana-extension-cluster-rolebinding created

In this case, you may notice that the output states namespace/tanzu-system-monitoring unchanged. This output is an example of what you would see if you have already deployed the Prometheus extension, which is likely the case since Grafana uses Prometheus as its datasource. If you installed Grafana first, then the output shows namespace/tanzu-system-monitoring created instead.

2 Create a Kubernetes secret that encodes the values stored in the grafana-data-values.yaml configuration file.

kubectl -n tanzu-system-monitoring create secret generic grafana-data-values --from-file=values.yaml=extensions/monitoring/grafana/grafana-data-values.yaml

3 Deploy the Grafana extension.

kubectl apply -f extensions/monitoring/grafana/grafana-extension.yaml


You should see a confirmation that extensions.clusters.tmc.cloud.vmware.com/grafana was created.

4 The extension takes several minutes to deploy. To check the status of the deployment, use the kubectl -n tanzu-system-monitoring get app command:

kubectl -n tanzu-system-monitoring get app grafana

While the extension is being deployed, the "Description" field from the kubectl get app command shows a status of Reconciling. After Grafana is deployed successfully, the status of the Grafana app as shown by the kubectl get app command changes to Reconcile succeeded.

You can view detailed status information with this command:

kubectl get app grafana -n tanzu-system-monitoring -o yaml

5 After you have deployed Grafana, configure Contour. The Grafana extension requires Contour to be present and creates a Contour HTTPProxy object with an FQDN of grafana.system.tanzu.

6 To use this FQDN to access the Grafana dashboard, create an entry in your local /etc/hosts file that points an IP address to this FQDN.

Amazon EC2 or Azure: Use the IP address of the LoadBalancer for the Envoy service in the tanzu-system-ingress namespace.

vSphere: Use the IP address of a worker node.
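For example, assuming the address you obtained is 10.93.9.101 (an illustrative value, not taken from this documentation), the /etc/hosts entry would look like this:

10.93.9.101 grafana.system.tanzu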

7 Use a browser to navigate to https://grafana.system.tanzu.

Since the site uses self-signed certificates, you may need to navigate through a browser-specific security warning before you are able to access the dashboard.

Update a Running Grafana Extension

If you need to make changes to the Grafana extension after it has been deployed, you must update the Kubernetes secret that the extension uses for its configuration. The steps below describe how to update the Kubernetes secret, and how to then update the configuration of the Grafana extension.

This procedure applies to Tanzu Kubernetes clusters running on vSphere, Amazon EC2, and Azure.

1 Locate the grafana-data-values.yaml file that you created in Prepare the Grafana Extension Configuration File. You must make your changes to this file. If you no longer have this file, you can recreate it with the following kubectl command:

kubectl -n tanzu-system-monitoring get secret grafana-data-values -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > grafana-data-values.yaml

Note that macOS users will need to use the -D parameter to base64, instead of the lowercase -d shown above.


2 Using the information in Customize the Configuration of the Grafana Extension as a reference, make the necessary changes to the grafana-data-values.yaml file.

3 After you have made all applicable changes, save the file.

4 Update the Kubernetes secret.

This command assumes that you are running it from tkg-extensions-v1.3.1+vmware.1/extensions.

kubectl -n tanzu-system-monitoring create secret generic grafana-data-values --from-file=values.yaml=extensions/monitoring/grafana/grafana-data-values.yaml -o yaml --dry-run=client | kubectl replace -f -

Note that the final - on the kubectl replace command above is necessary to instruct kubectl to accept the input being piped to it from the kubectl create secret command.

5 The Grafana extension is reconciled using the new values that you just added. The changes should show up in five minutes or less. Updates are handled by the Kapp controller, which synchronizes every five minutes.

If you need the changes to the configuration reflected sooner, then you can delete the Grafana pod using kubectl delete pod. The pod is recreated via the Kapp controller with the new settings. You can also change syncPeriod in grafana-extension.yaml to a lower value and re-apply the configuration with kubectl apply -f grafana-extension.yaml.
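As a minimal sketch, assuming the default tanzu-system-monitoring namespace used throughout this topic, you can list the pods and then delete the Grafana pod by name (the pod name suffix below is illustrative):

kubectl get pods -n tanzu-system-monitoring
kubectl delete pod grafana-<POD_SUFFIX> -n tanzu-system-monitoring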

Remove the Grafana Extension

For information on how to remove the Grafana extension from a Tanzu Kubernetes cluster, see Delete Tanzu Kubernetes Grid Extensions.

Implementing Service Discovery with External DNS

The external DNS service reserves DNS hostnames for applications, using a declarative, Kubernetes-native interface. It is packaged as an extension in the Tanzu Kubernetes Grid extensions bundle.

This topic explains how to deploy the external DNS service to a workload or shared services cluster in Tanzu Kubernetes Grid.

On infrastructures with load balancing (AWS, Azure, and vSphere with NSX Advanced Load Balancer), VMware recommends installing the External DNS service alongside the Harbor service, as described in Harbor Registry and External DNS, especially in production or other environments where Harbor availability is important.

The procedures in this topic apply to vSphere, Amazon EC2, and Azure deployments.

Prerequisites

- You have deployed a management cluster on vSphere, Amazon EC2, or Azure, in either an Internet-connected or Internet-restricted environment.

  If you are using Tanzu Kubernetes Grid in an Internet-restricted environment, you performed the procedure in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment before you deployed the management cluster.
- You have downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
- You have installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
- Determine the FQDN for the external DNS service. If you are using external DNS for Harbor on a shared services cluster upgraded from Tanzu Kubernetes Grid v1.2, note the following:
  - If your Harbor registry in v1.2 used a fully-qualified domain name (FQDN) that you control, such as myharbor.mycompany.com, use this FQDN for the external DNS service.
  - If your Harbor registry in v1.2 used a fictitious domain name such as harbor.system.tanzu, you cannot upgrade workload clusters automatically, but must instead create a new v1.3 workload cluster and migrate the workloads to the new cluster manually.

IMPORTANT: The extensions folder tkg-extensions-v1.3.x+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prepare a Cluster for External DNS Deployment

To prepare a cluster for running the External DNS service:

1 If you are running External DNS on a shared services cluster, alongside Harbor, and the cluster has not yet been created, create the cluster by following Create a Shared Services Cluster.

2 Deploy the Contour service on the cluster. External DNS requires Contour to be present on its cluster, to provide ingress control. For how to deploy Contour, see Deploy Contour on the Tanzu Kubernetes Cluster.

Choose the External DNS Provider

The external-dns extension has been validated with AWS (Route53), Azure, and RFC2136 (BIND). The extension supports Ingress with either Contour or Service type Load Balancer. Below are instructions for each of these options.

AWS (Route53)

1 Create a hosted zone within Route53 with the domain that shared services will be using.

2 Take note of the “Hosted zone ID” as this ID will be used in the external-dns-data-values.yaml.
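If you prefer the command line, and assuming that you have the AWS CLI installed and configured, you can also look up the hosted zone ID with a command like the following; the domain name is illustrative.

aws route53 list-hosted-zones-by-name --dns-name k8s.example.org --query "HostedZones[0].Id"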


3 Create an IAM user for external-dns with the following policy document and ensure “Programmatic access” is checked. If desired you may fine-tune the policy to permit updates only to the hosted zone that you just created.

{ "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": [ "route53:ChangeResourceRecordSets" ], "Resource": [ "arn:aws:route53:::hostedzone/*" ] }, { "Effect": "Allow", "Action": [ "route53:ListHostedZones", "route53:ListResourceRecordSets" ], "Resource": [ "*" ] } ] }

4. Take note of the “Access key ID” and “Secret access key” as they will be needed for configuring external-dns.
5. Copy the appropriate example values.
   - **With Contour**: If you have deployed Contour and would like the external-dns extension to use Contour HTTPProxy resources as sources for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-aws-with-contour.yaml.example external-dns-data-values.yaml
     ```
   - **Without Contour**: To use a Service type LoadBalancer as the only source for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-aws.yaml.example external-dns-data-values.yaml
     ```

6. Change the values inside `external-dns-data-values.yaml` as appropriate, making sure to fill in your hosted zone ID and domain name. For additional configuration options, you can use `ytt` overlays as described in [Extensions and Shared Services](../ytt.md#extensions) in _Customizing Clusters, Plans, and Extensions with ytt Overlays_ and in the extensions mods examples in the [TKG Lab repository](https://github.com/Tanzu-Solutions-Engineering/tkg-lab).

### RFC2136 (BIND) Server


The RFC2136 provider allows you to use any RFC2136-compatible DNS server as a provider for external-dns such as BIND.

1. Find/Create a TSIG key for your server

   - If your DNS is provided for you, ask for a TSIG key authorized to update and transfer the zone you wish to update. The key will look something like this:
     ```
     key "externaldns-key" {
         algorithm hmac-sha256;
         secret "/2avn5M4ndEztbDqy66lfQ+PjRZta9UXLtToW6NV5nM=";
     };
     ```

   - If you are managing your own DNS server, you can create a TSIG key using `tsig-keygen -a hmac-sha256 externaldns`. Copy the result to your DNS server's configuration. In the case of BIND, you would add the key to the named.conf file and configure the zone with the allow-transfer and update-policy fields. For example:
     ```
     key "externaldns-key" {
         algorithm hmac-sha256;
         secret "/2avn5M4ndEztbDqy66lfQ+PjRZta9UXLtToW6NV5nM=";
     };
     zone "k8s.example.org" {
         type master;
         file "/etc/bind/zones/k8s.zone";
         allow-transfer {
             key "externaldns-key";
         };
         update-policy {
             grant externaldns-key zonesub ANY;
         };
     };
     ```

   - The above assumes you also have a zone file that might look something like this:
     ```
     $TTL 60 ; 1 minute
     @ IN SOA k8s.example.org. root.k8s.example.org. (
             16 ; serial
             60 ; refresh (1 minute)
             60 ; retry (1 minute)
             60 ; expire (1 minute)
             60 ; minimum (1 minute)
             )
         NS ns.k8s.example.org.
     ns  A 1.2.3.4
     ```

2. Copy the appropriate example values.
   - **With Contour**: If you have deployed Contour and would like the external-dns extension to use Contour HTTPProxy resources as sources for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-rfc2136-with-contour.yaml.example external-dns-data-values.yaml
     ```
   - **Without Contour**: To use a Service type LoadBalancer as the only source for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-rfc2136.yaml.example external-dns-data-values.yaml
     ```

3. Change the values inside `external-dns-data-values.yaml` as appropriate, making sure to fill in your DNS server IP, domain name, TSIG secret, and TSIG key name.

### Microsoft Azure

1. Log in to the `az` CLI: `az login`

2. Set your subscription: `az account set -s <SUBSCRIPTION_ID>`

3. Create a service principal: `az ad sp create-for-rbac -n <SERVICE_PRINCIPAL_NAME>`. This returns JSON that looks similar to the following:
   ```
   {
     "appId": "a72a7cfd-7cb0-4b02-b130-03ee87e6ca89",
     "displayName": "foo",
     "name": "http://foo",
     "password": "515c55da-f909-4e17-9f52-236ffe1d3033",
     "tenant": "b35138ca-3ced-4b4a-14d6-cd83d9ea62f0"
   }
   ```

4. Assign permissions to the service principal. Replace the placeholder values with your resource group name, the appId of the service principal, and the IDs returned by the show commands.
   1. Discover the id of the resource group:
      ```
      az group show --name <RESOURCE_GROUP_NAME> --query id
      ```
   2. Assign the Reader role to the service principal for the resource group scope. You will need the appId from the output of the creation of the service principal.
      ```
      az role assignment create --role "Reader" --assignee <APP_ID> --scope <RESOURCE_GROUP_ID>
      ```
   3. Discover the id of the DNS zone:
      ```
      az network dns zone show --name <DNS_ZONE_NAME> -g <RESOURCE_GROUP_NAME> --query id
      ```
   4. Assign the Contributor role to the service principal for the DNS zone scope:
      ```
      az role assignment create --role "Contributor" --assignee <APP_ID> --scope <DNS_ZONE_ID>
      ```

5. To connect the external-dns extension to the Azure DNS service, create a configuration file called azure.json on your local machine with contents that look like the following:

{ "tenantId": "01234abc-de56-ff78-abc1-234567890def", "subscriptionId": "01234abc-de56-ff78- abc1-234567890def", "resourceGroup": "MyDnsResourceGroup", "aadClientId": "01234abc-de56- ff78-abc1-234567890def", "aadClientSecret": "uKiuXeiwui4jo9quae9o" }

   - The `tenantId` can be retrieved from: `az account show --query "tenantId"`
   - The `subscriptionId` can be retrieved from: `az account show --query "id"`
   - The `resourceGroup` is the name of the resource group that your DNS zone is within.
   - The `aadClientId` is the `appId` from the output of the Service Principal.
   - The `aadClientSecret` is the password from the output of the Service Principal.

6. Copy the appropriate example values.
   - **With Contour**: If you have deployed Contour and would like the external-dns extension to use Contour HTTPProxy resources as sources for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-azure-with-contour.yaml.example external-dns-data-values.yaml
     ```
   - **Without Contour**: To use a Service type LoadBalancer as the only source for external-dns, copy the example file with:
     ```
     cp external-dns-data-values-azure.yaml.example external-dns-data-values.yaml
     ```

7. Change the values inside `external-dns-data-values.yaml` as appropriate, making sure to fill in the Azure resource group and domain name.

## Deploy the External DNS Extension

1. Set the context of `kubectl` to the shared services cluster or other cluster where you are deploying External DNS.
   ```
   kubectl config use-context tkg-services-admin@tkg-services
   ```

2. From the unpacked `tkg-extensions` folder, navigate to the `external-dns` extension folder:
   ```
   cd extensions/service-discovery/external-dns
   ```

3. Install kapp-controller:
   ```
   kubectl apply -f ../../kapp-controller.yaml
   ```

4. Create the external-dns namespace:
   ```
   kubectl apply -f namespace-role.yaml
   ```

5. Create a secret with data values:
   ```
   kubectl create secret generic external-dns-data-values --from-file=values.yaml=external-dns-data-values.yaml -n tanzu-system-service-discovery
   ```

6. If you chose AWS or Azure as your external DNS provider, run the corresponding command to create a Kubernetes secret that supplies credentials to the DNS provider.
   - **AWS**:
     ```
     kubectl -n tanzu-system-service-discovery create secret generic route53-credentials --from-literal=aws_access_key_id=YOUR_ACCESS_KEY_ID_HERE --from-literal=aws_secret_access_key=YOUR_SECRET_ACCESS_KEY_HERE
     ```
   - **Azure**:
     ```
     kubectl -n tanzu-system-service-discovery create secret generic azure-config-file --from-file=azure.json
     ```

7. Deploy the `ExternalDNS` extension:
   ```
   kubectl apply -f external-dns-extension.yaml
   ```

8. Ensure the extension is deployed successfully:
   ```
   kubectl get app external-dns -n tanzu-system-service-discovery
   ```

The `ExternalDNS` app status should change to **Reconcile succeeded** once `ExternalDNS` is deployed successfully.
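If the status does not change, you can inspect detailed status information, following the same pattern used for the other extensions in this documentation:

```
kubectl get app external-dns -n tanzu-system-service-discovery -o yaml
```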

## Validating External DNS

If configured with Contour, External DNS will automatically watch the specified namespace for HTTPProxy resources and create DNS records for services with hostnames that match the configured domain filter.

External DNS will also automatically watch for Kubernetes Services with the annotation `external-dns.alpha.kubernetes.io/hostname` and create DNS records for services whose annotations match the configured domain filter.

For example, a service with the annotation `external-dns.alpha.kubernetes.io/hostname: foo.k8s.example.org` will cause External DNS to create a DNS record for `foo.k8s.example.org`. You can validate that the record exists by examining the zone in the DNS provider that you configured.
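As a minimal sketch of such a Service, assuming a `LoadBalancer` Service named `foo` (the name, selector, and ports are illustrative and not taken from this documentation):

```
apiVersion: v1
kind: Service
metadata:
  name: foo
  annotations:
    external-dns.alpha.kubernetes.io/hostname: foo.k8s.example.org
spec:
  type: LoadBalancer
  selector:
    app: foo
  ports:
  - port: 80
    targetPort: 8080
```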

Deploy Harbor Registry as a Shared Service

Harbor is an open source, trusted, cloud native container registry that stores, signs, and scans content. Harbor extends the open source Docker Distribution by adding the functionality that users usually require, such as security, identity control, and management.

Tanzu Kubernetes Grid includes signed binaries for Harbor that you can deploy on a shared services cluster to provide container registry services for other Tanzu Kubernetes clusters. Unlike Tanzu Kubernetes Grid extensions, which you use to deploy services on individual clusters, you deploy Harbor as a shared service. In this way, Harbor is available to all of the Tanzu Kubernetes clusters in a given Tanzu Kubernetes Grid instance. To implement Harbor as a shared service, you deploy it on a special cluster that is reserved for running shared services in a Tanzu Kubernetes Grid instance.

You can use the Harbor shared service as a private registry for images that you want to make available to all of the Tanzu Kubernetes clusters that you deploy from a given management cluster. An advantage to using the Harbor shared service is that it is managed by Kubernetes, so it provides greater reliability than a standalone registry. Also, the Harbor implementation that Tanzu Kubernetes Grid provides as a shared service has been tested for use with Tanzu Kubernetes Grid and is fully supported.

The procedures in this topic all apply to vSphere, Amazon EC2, and Azure deployments.

Using the Harbor Shared Service in Internet-Restricted Environments

Another use-case for deploying Harbor as a shared service is for Tanzu Kubernetes Grid deployments in Internet-restricted environments. For more information, see Using the Harbor Shared Service in Internet-Restricted Environments.

Harbor Registry and External DNS

VMware recommends installing the External DNS service alongside the Harbor Registry on infrastructures with load balancing (AWS, Azure, and vSphere with NSX Advanced Load Balancer), especially in production or other environments in which Harbor availability is important.

If the IP address to the shared services ingress load balancer changes, External DNS automatically picks up the change and re-maps the new address to the Harbor hostname. This precludes the need to manually re-map the address as described in Connect to the Harbor User Interface.

Prerequisites

- You have deployed a management cluster on vSphere, Amazon EC2, or Azure, in either an Internet-connected or Internet-restricted environment.

  If you are using Tanzu Kubernetes Grid in an Internet-restricted environment, you performed the procedure in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment before you deployed the management cluster.
- You have downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
- You have installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
- You have installed yq:
  - For Tanzu Kubernetes Grid v1.3.0, install yq v3.
  - For Tanzu Kubernetes Grid v1.3.1 and later, install yq v4.5 or later.

IMPORTANT: The extensions folder tkg-extensions-v1.3.1+vmware.1 contains subfolders for each type of extension, for example, authentication, ingress, registry, and so on. At the top level of the folder there is an additional subfolder named extensions. The extensions folder also contains subfolders for authentication, ingress, registry, and so on. Take care to run commands from the location provided in the instructions. Commands are usually run from within the extensions folder.

Prepare a Shared Services Cluster for Harbor Deployment

Each Tanzu Kubernetes Grid instance can only have one shared services cluster. You must deploy Harbor on a cluster that will only be used for shared services.

To prepare a shared services cluster for running the Harbor Extension:

1 Create a shared services cluster, if it is not already created, by following the procedure Create a Shared Services Cluster.

2 Deploy Contour Extension on the shared services cluster.

Harbor Extension requires Contour Extension to be present on the cluster, to provide ingress control. For how to deploy Contour Extension, see Deploy Contour on the Tanzu Kubernetes Cluster.

3 (Optional) Deploy External DNS Extension on the shared services cluster. External DNS Extension is recommended for using Harbor Extension in environments with load balancing, as described in Harbor Registry and External DNS, above.

Your shared services cluster is now ready for you to deploy the Harbor Extension on it.

Deploy Harbor Extension on the Shared Services Cluster

After you have deployed a shared services cluster that includes the Contour Extension, you can deploy the Harbor Extension.

1 Set the context of kubectl to the shared services cluster.

kubectl config use-context tkg-services-admin@tkg-services


2 Create a namespace for the Harbor Extension on the shared services cluster.

kubectl apply -f registry/harbor/namespace-role.yaml

You should see confirmation that a tanzu-system-registry namespace, service account, and RBAC role bindings are created.

namespace/tanzu-system-registry created
serviceaccount/harbor-extension-sa created
role.rbac.authorization.k8s.io/harbor-extension-role created
rolebinding.rbac.authorization.k8s.io/harbor-extension-rolebinding created
clusterrole.rbac.authorization.k8s.io/harbor-extension-cluster-role created
clusterrolebinding.rbac.authorization.k8s.io/harbor-extension-cluster-rolebinding created

3 Make a copy of the harbor-data-values.yaml.example file and name it harbor-data-values.yaml.

cp registry/harbor/harbor-data-values.yaml.example registry/harbor/harbor-data-values.yaml

The harbor-data-values.yaml file configures the Harbor extension. You can also customize your Harbor setup using ytt overlays. See ytt Overlays and Example: Clean Up S3 and Trust Let's Encrypt below.

4 Set the mandatory passwords and secrets in harbor-data-values.yaml.

You can do this in one of two ways:

- To automatically generate random passwords and secrets, run the following command:

  bash registry/harbor/generate-passwords.sh registry/harbor/harbor-data-values.yaml

- To set your own passwords and secrets, update the following entries in harbor-data-values.yaml:
  - harborAdminPassword
  - secretKey
  - database.password
  - core.secret
  - core.xsrfKey
  - jobservice.secret
  - registry.secret

5 Specify other settings in harbor-data-values.yaml.

- Set the hostname setting to the hostname you want to use to access Harbor. For example: harbor.yourdomain.com.
- To use your own certificates, update the tls.crt, tls.key, and ca.crt settings with the contents of your certificate, key, and CA certificate. The certificate can be signed by a trusted authority or be self-signed. If you leave these blank, Tanzu Kubernetes Grid automatically generates a self-signed certificate.
- If you used the generate-passwords.sh script, optionally update the harborAdminPassword with something that is easier to remember.
- Optionally update the persistence settings to specify how Harbor stores data.

  If you need to store a large quantity of container images in Harbor, set persistence.persistentVolumeClaim.registry.size to a larger number.

  If you do not update the storageClass under persistence settings, Harbor uses the shared services cluster's default storageClass. If the default storageClass or a storageClass that you specify in harbor-data-values.yaml supports the accessMode ReadWriteMany, you must update the persistence.persistentVolumeClaim accessMode settings for registry, jobservice, database, redis, and trivy from ReadWriteOnce to ReadWriteMany. vSphere 7 with VMware vSAN 7 supports accessMode: ReadWriteMany but vSphere 6.7u3 does not. If you are using vSphere 7 without vSAN, or you are using vSphere 6.7u3, use the default value ReadWriteOnce.

- Optionally update the other Harbor settings. The settings that are available in harbor-data-values.yaml are a subset of the settings that you set when deploying open source Harbor with Helm. For information about the other settings that you can configure, see Deploying Harbor with High Availability via Helm in the Harbor documentation.

6 Create a Kubernetes secret named harbor-data-values with the values that you set in harbor-data-values.yaml.

kubectl create secret generic harbor-data-values --from-file=values.yaml=registry/harbor/harbor-data-values.yaml -n tanzu-system-registry

7 Deploy the Harbor extension.

kubectl apply -f registry/harbor/harbor-extension.yaml

You should see the confirmation extension.clusters.tmc.cloud.vmware.com/harbor created.

8 View the status of the Harbor service.

kubectl get app harbor -n tanzu-system-registry

The status of the Harbor app should show Reconcile Succeeded when Harbor has deployed successfully.

NAME     DESCRIPTION           SINCE-DEPLOY   AGE
harbor   Reconcile succeeded   3m11s          23m

9 If the status is not Reconcile Succeeded, view the full status details of the Harbor service.


Viewing the full status can help you to troubleshoot the problem.

kubectl get app harbor -n tanzu-system-registry -o yaml

10 Check that the new services are running by listing all of the pods that are running in the cluster.

kubectl get pods -A

In the tanzu-system-registry namespace, you should see the harbor core, clair, database, jobservice, notary, portal, redis, registry, and trivy services running in pods with names similar to harbor-registry-76b6ccbc75-vj4jv.

NAMESPACE               NAME                                    READY   STATUS    RESTARTS   AGE
[...]
tanzu-system-ingress    contour-6b568c9b88-h5s2r                1/1     Running   0          26m
tanzu-system-ingress    contour-6b568c9b88-mlg2r                1/1     Running   0          26m
tanzu-system-ingress    envoy-wfqdp                             2/2     Running   0          26m
tanzu-system-registry   harbor-clair-9ff9b98d-6vlk4             2/2     Running   1          23m
tanzu-system-registry   harbor-core-557b58b65c-4kzhn            1/1     Running   0          23m
tanzu-system-registry   harbor-database-0                       1/1     Running   0          23m
tanzu-system-registry   harbor-jobservice-847b5c8756-t6kfs      1/1     Running   0          23m
tanzu-system-registry   harbor-notary-server-6b74b8dd56-d7swb   1/1     Running   2          23m
tanzu-system-registry   harbor-notary-signer-69d4669884-dglzm   1/1     Running   2          23m
tanzu-system-registry   harbor-portal-8f677757c-t4cbj           1/1     Running   0          23m
tanzu-system-registry   harbor-redis-0                          1/1     Running   0          23m
tanzu-system-registry   harbor-registry-85b96c7777-wsdnj        2/2     Running   0          23m
tanzu-system-registry   harbor-trivy-0                          1/1     Running   0          23m
tkg-system              kapp-controller-778b5f484c-fkbvg        1/1     Running   0          59m
vmware-system-tmc       extension-manager-6c64cdd984-s99gc      1/1     Running   0          27m

11 Obtain the Harbor CA certificate from the harbor-tls secret in the tanzu-system-registry namespace.

kubectl -n tanzu-system-registry get secret harbor-tls -o=jsonpath="{.data.ca\.crt}" | base64 -d

Make a copy of the output.
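If you plan to set TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE later in this topic, which expects the certificate in base64-encoded format, you can run the same command without the base64 -d decode step to capture the encoded value directly:

kubectl -n tanzu-system-registry get secret harbor-tls -o=jsonpath="{.data.ca\.crt}"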

Connect to the Harbor User Interface

The Harbor UI is exposed via the Envoy service load balancer that is running in the Contour extension on the shared services cluster. To allow users to connect to the Harbor UI, you must map the address of the Envoy service load balancer to the hostname of the Harbor service, for example harbor.yourdomain.com. How you map the address of the Envoy service load balancer to the hostname depends on whether your Tanzu Kubernetes Grid instance is running on vSphere, on Amazon EC2 or on Azure.

1 Obtain the address of the Envoy service load balancer.

kubectl get svc envoy -n tanzu-system-ingress -o jsonpath='{.status.loadBalancer.ingress[0]}'


On vSphere without NSX Advanced Load Balancer (ALB), the Envoy service is exposed via NodePort instead of LoadBalancer, so the above output will be empty, and you can use the IP address of any worker node in the shared services cluster instead. On Amazon EC2, it has a FQDN similar to a82ebae93a6fe42cd66d9e145e4fb292-1299077984.us-west-2.elb.amazonaws.com. On vSphere with NSX ALB and Azure, the Envoy service has a Load Balancer IP address similar to 20.54.226.44.

2 Map the address of the Envoy service load balancer to the hostname of the Harbor service.

- vSphere: If you deployed Harbor on a shared services cluster that is running on vSphere, you must add an IP to hostname mapping in /etc/hosts or add corresponding A records in your DNS server. For example, if the IP address is 10.93.9.100, add the following to /etc/hosts:

10.93.9.100 harbor.yourdomain.com notary.harbor.yourdomain.com

On Windows machines, the equivalent to /etc/hosts is C:\Windows\System32\Drivers\etc\hosts.

- Amazon EC2 or Azure: If you deployed Harbor on a shared services cluster that is running on Amazon EC2 or Azure, you must create two DNS CNAME records (on Amazon EC2) or two DNS A records (on Azure) for the Harbor hostnames on a DNS server on the Internet.
  - One record for the Harbor hostname, for example, harbor.yourdomain.com, that you configured in harbor-data-values.yaml, that points to the FQDN or IP of the Envoy service load balancer.
  - Another record for the Notary service that is running in Harbor, for example, notary.harbor.yourdomain.com, that points to the FQDN or IP of the Envoy service load balancer.

Users can now connect to the Harbor UI by navigating to https://harbor.yourdomain.com in a Web browser and logging in as user admin with the harborAdminPassword that you configured in harbor-data-values.yaml.

Push and Pull Images to and from the Harbor Extension

Now that Harbor is set up as a shared service, you can push images to it to make them available for your Tanzu Kubernetes clusters to pull.

1 If Harbor uses a self-signed certificate, download the Harbor CA certificate from https://harbor.yourdomain.com/api/v2.0/systeminfo/getcert, and install it on your local machine, so Docker can trust this CA certificate.

- On Linux, save the certificate as /etc/docker/certs.d/harbor.yourdomain.com/ca.crt (see the example after this list).
- On macOS, follow this procedure.
- On Windows, right-click the certificate file and select Install Certificate.
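For example, on Linux, assuming that curl is available, you could download and install the certificate as follows. The -k flag is needed because the certificate is self-signed, and the hostname is the example value used throughout this topic.

sudo mkdir -p /etc/docker/certs.d/harbor.yourdomain.com
curl -k https://harbor.yourdomain.com/api/v2.0/systeminfo/getcert -o /tmp/harbor-ca.crt
sudo cp /tmp/harbor-ca.crt /etc/docker/certs.d/harbor.yourdomain.com/ca.crt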


2 Log in to the Harbor registry with the user admin. When prompted, enter the harborAdminPassword that you set when you deployed the Harbor Extension on the shared services cluster.

docker login harbor.yourdomain.com -u admin

3 Tag an existing image that you have already pulled locally, for example nginx:1.7.9.

docker tag nginx:1.7.9 harbor.yourdomain.com/library/nginx:1.7.9

4 Push the image to the Harbor registry.

docker push harbor.yourdomain.com/library/nginx:1.7.9

5 Now you can pull the image from the Harbor registry on any machine where the Harbor CA certificate is installed.

docker pull harbor.yourdomain.com/library/nginx:1.7.9

Push the Tanzu Kubernetes Grid Images into the Harbor Registry

The Tanzu Kubernetes Grid images are published in a public container registry and used by Tanzu Kubernetes Grid to deploy Tanzu Kubernetes clusters and extensions. When creating a Tanzu Kubernetes cluster, in order for Tanzu Kubernetes cluster nodes to pull Tanzu Kubernetes Grid images from the Harbor shared service rather than over the Internet, you must first push those images to the Harbor shared service.

This procedure is optional if your Tanzu Kubernetes clusters have internet connectivity to pull external images.

If you only want to store your application images rather than the Tanzu Kubernetes Grid images in the Harbor shared service, follow the procedure in Trust Custom CA Certificates on Cluster Nodes to enable the Tanzu Kubernetes cluster nodes to pull images from the Harbor shared service, and skip the rest of this procedure.

NOTE: If your Tanzu Kubernetes Grid instance is running in an Internet-restricted environment, you must perform these steps on a machine that has an Internet connection, that can also access the Harbor registry that you have just deployed as a shared service.

1 Create a public project named tkg from the Harbor UI. Alternatively, you can use another project name.

2 Set the FQDN of the Harbor registry that is running as a shared service as an environment variable.

On Windows platforms, use the SET command instead of export. Include the name of the default project in the variable. For example, if you set the Harbor hostname to harbor.yourdomain.com, set the following:

export TKG_CUSTOM_IMAGE_REPOSITORY=harbor.yourdomain.com/tkg


3 Follow step 2 and step 3 in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment to generate and run the publish-images.sh script.

4 When the script finishes, add or update the following rows in the global configuration file, ~/.tanzu/tkg/config.yaml.

These variables ensure that when creating a Management Cluster or Tanzu Kubernetes Cluster, Tanzu Kubernetes Grid always pulls Tanzu Kubernetes Grid images from the Harbor Extension that is running as a shared service, rather than from the external internet. If your Harbor Extension uses self-signed certificates, also add the following to the configuration file:

- TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: false. Because the Tanzu connectivity webhook injects the Harbor CA certificate into cluster nodes, TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY should always be set to false when using Harbor as a shared service.
- TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE. Provide the CA certificate in base64 encoded format.

TKG_CUSTOM_IMAGE_REPOSITORY: harbor.yourdomain.com/tkg
TKG_CUSTOM_IMAGE_REPOSITORY_SKIP_TLS_VERIFY: false
TKG_CUSTOM_IMAGE_REPOSITORY_CA_CERTIFICATE: LS0t[...]tLS0tLQ==

If your Tanzu Kubernetes Grid instance is running in an Internet-restricted environment, you can disconnect the Internet connection now.

You can now use the tanzu cluster create command to deploy Tanzu Kubernetes clusters, and the images will be pulled from the Harbor Extension that is running in the shared services cluster. You can push images to the Harbor registry to make them available to all clusters that are running in the Tanzu Kubernetes Grid instance.

Connections between Tanzu Kubernetes cluster nodes and Harbor are secure, regardless of whether you use a trusted or a self-signed certificate for the Harbor shared service.

ytt Overlays and Example: Clean Up S3 and Trust Let's Encrypt

In addition to modifying harbor-data-values.yaml, you can use ytt overlays to configure your Harbor setup, as described in Extensions and Shared Services in Customizing Clusters, Plans, and Extensions with ytt Overlays and in the extensions mods examples in the TKG Lab repository.

One TKG Lab example, in the step Prepare Manifests and Deploy Harbor Extension, cleans PersistentVolumeClaim (PVC) info from Harbor's S3 storage and lets the Harbor extension trust a Let's Encrypt certificate authority.

The example procedure does this by running a script generate-and-apply-harbor-yaml.sh that sets up the configuration files used to deploy the Harbor extension. To customize the Harbor extension, the script applies three ytt overlay files:

- overlay-s3-pvc-fix.yaml clears PVC data to allow the Harbor registry to use S3 for data storage.
- trust-certificate/overlay.yaml lets any extension (Harbor in this case) trust a Let's Encrypt CA; useful for OIDC providers with Let's Encrypt-based certs.
- harbor-extension-overlay.yaml directs the Harbor extension to always deploy the Harbor app with the previous two overlays.

See the TKG Lab repository and its Step by Step setup guide for more examples.

Update a Running Harbor Deployment

If you need to make changes to the configuration of the Harbor extension after deployment, follow these steps to update your deployed Harbor extension.

1 Update the Harbor configuration in registry/harbor/harbor-data-values.yaml.

For example, increase the amount of registry storage by updating the persistence.persistentVolumeClaim.registry.size value.
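For reference, a minimal sketch of the relevant section of harbor-data-values.yaml, assuming you want to raise the registry volume to 100Gi (the size is illustrative; leave the other persistence keys at their existing values):

persistence:
  persistentVolumeClaim:
    registry:
      size: 100Gi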

2 Update the Kubernetes secret, which contains the Harbor configuration.

This command assumes that you are running it from tkg-extensions-v1.3.1+vmware.1/extensions.

kubectl create secret generic harbor-data-values --from-file=values.yaml=registry/harbor/harbor-data-values.yaml -n tanzu-system-registry -o yaml --dry-run=client | kubectl replace -f -

Note that the final - on the kubectl replace command above is necessary to instruct kubectl to accept the input being piped to it from the kubectl create secret command.

The Harbor extension will be reconciled using the new values you just added. The changes should show up in five minutes or less. This is handled by the Kapp controller, which synchronizes every five minutes.

Implementing User Authentication

This release of Tanzu Kubernetes Grid introduces user authentication with Pinniped, which runs automatically in management clusters if you enable identity management during deployment. Previously, user authentication was implemented by deploying the Dex and Gangway extensions. The Dex and Gangway extensions are included in this release for historical reasons, but they are deprecated. For new deployments, enable Pinniped in your management clusters. Do not use the Dex and Gangway extensions.

n For information about identity management with Pinniped, see Enabling Identity Management in Tanzu Kubernetes Grid.

n For information about migrating existing Dex and Gangway deployments to Pinniped, see Register Core Add-ons.


Delete Tanzu Kubernetes Grid Extensions

If you have deployed extensions that you no longer require, you can delete them from management clusters and Tanzu Kubernetes clusters.

Prepare to Delete Extensions

1 In a terminal, navigate to the folder that contains the unpacked Tanzu Kubernetes Grid extension manifest files, tkg-extensions-v1.3.1+vmware.1/extensions.

cd /tkg-extensions-v1.3.1+vmware.1/extensions

Run all of the commands in these procedures from this location.

2 Set the context of kubectl to the management cluster or Tanzu Kubernetes cluster on which the extension is deployed.

kubectl config use-context contour-test-admin@contour-test

IMPORTANT: For all of the extensions, do not delete namespace-role.yaml before the application has been fully deleted. Deleting it too early leads to errors, because the service account that kapp-controller uses is deleted.

Delete the Contour Extension

1 Delete the Contour extension.

kubectl delete -f ingress/contour/contour-extension.yaml

2 Delete the Contour application.

kubectl delete app contour -n tanzu-system-ingress

3 Delete the Contour namespace.

kubectl delete -f ingress/contour/namespace-role.yaml

Delete the Fluent Bit Extension

1 Delete the Fluent Bit extension.

kubectl delete -f logging/fluent-bit/fluent-bit-extension.yaml

2 Delete the Fluent Bit application.

kubectl delete app fluent-bit -n tanzu-system-logging

3 Delete the Fluent Bit namespace.

kubectl delete -f logging/fluent-bit/namespace-role.yaml


Delete the Prometheus and Grafana Extensions

1 Delete the Prometheus extension.

kubectl delete -f monitoring/prometheus/prometheus-extension.yaml

2 Delete the Prometheus application.

kubectl delete app prometheus -n tanzu-system-monitoring

3 Delete the Prometheus namespace.

kubectl delete -f monitoring/prometheus/namespace-role.yaml

4 Delete the Grafana extension.

kubectl delete -f monitoring/grafana/grafana-extension.yaml

5 Delete the Grafana application.

kubectl delete app grafana -n tanzu-system-monitoring

6 Delete the Grafana namespace.

kubectl delete -f monitoring/grafana/namespace-role.yaml

Delete the External DNS Extension

1 Delete the External DNS extension.

kubectl delete -f registry/dns/external-dns-extension.yaml

2 Delete the External DNS application.

kubectl delete app external-dns -n tanzu-system-registry

3 Delete the External DNS namespace.

kubectl delete -f dns/external-dns/namespace-role.yaml

Delete the Harbor Extension

1 Delete the Harbor extension.

kubectl delete -f registry/harbor/harbor-extension.yaml

2 Delete the Harbor application.

kubectl delete app harbor -n tanzu-system-registry


3 Delete the Harbor namespace.

kubectl delete -f registry/harbor/namespace-role.yaml

Delete the Dex and Gangway Extensions

1 Delete the Dex extension.

kubectl delete -f authentication/dex/dex-extension.yaml

2 Delete the Dex application.

kubectl delete app dex -n tanzu-system-auth

3 Delete the Dex namespace.

kubectl delete -f authentication/dex/namespace-role.yaml

4 Delete the Gangway extension.

kubectl delete -f authentication/gangway/gangway-extension.yaml

5 Delete the Gangway application.

kubectl delete app gangway -n tanzu-system-auth

6 Delete the Gangway namespace.

kubectl delete -f authentication/gangway/namespace-role.yaml

Delete the Extensions Utilities

If you delete all extensions from a cluster, you can remove common extensions utilities.

If the extensions are deployed on a Tanzu Kubernetes cluster, optionally delete the cert-manager.

kubectl delete -f ../cert-manager/

Do not delete cert-manager from management clusters.

Building Machine Images 8

You can build custom machine images for Tanzu Kubernetes Grid to use as a VM template for the Tanzu Kubernetes (workload) cluster nodes that it creates. Each custom machine image packages a base OS version and a Kubernetes version, along with any additional customizations, into an image that runs on vSphere, Amazon EC2, or Microsoft Azure infrastructure. The base OS can be an OS that VMware supports but does not distribute, such as Red Hat Enterprise Linux (RHEL) v7.

This topic provides background on custom images for Tanzu Kubernetes Grid, and explains how to build them.

This chapter includes the following topics:
n Overview: Kubernetes Image Builder
n Build a Custom Machine Image
n Use a Custom Machine Image

Overview: Kubernetes Image Builder

To build custom machine images for Tanzu Kubernetes Grid workload clusters, you use the container image from the upstream Kubernetes Image Builder project. Kubernetes Image Builder runs on your local workstation and uses the following:

n Packer automates and standardizes the image-building process for current and future CAPI providers, and packages the images for their target infrastructure once they are built.

n Ansible standardizes the process of configuring and provisioning machines across multiple target distribution families, such as Ubuntu and CentOS.

n To build the images, Image Builder uses native infrastructure for each provider:

n Amazon EC2

n You build your custom images from base AMIs that are published on Amazon EC2, such as official Ubuntu AMIs.

n The custom image is built inside AWS and then stored in your AWS account in one or more regions.

n See Building Images for AWS in the Image Builder documentation.


n Azure:

n You can store your custom image in an Azure Shared Image Gallery.

n See Building Images for Azure in the Image Builder documentation.

n vSphere:

n Image Builder builds Open Virtualization Archive (OVA) images

n You build the machine images from the Linux distribution's original installation ISO.

n You import the resulting OVA into a vSphere cluster, take a snapshot for fast cloning, and then mark the machine image as a VM template.

n See Building Images for vSphere in the Image Builder documentation.

See Customization in the Image Builder documentation for how you can customize your image. Before making any modifications, consult with VMware Customer Reliability Engineering (CRE) for best practices and recommendations.

After you have created a custom image, you enable the tanzu CLI to use it by creating a custom Tanzu Kubernetes release (TKr) based on the image.

Custom Images Replace Default Images

For common combinations of OS version, Kubernetes version, and target infrastructure, Tanzu Kubernetes Grid provides default machine images. For example, one ova-ubuntu-2004-v1.20.5+vmware.2-tkg image serves as the OVA image for Ubuntu v20.04 and Kubernetes v1.20.5 on vSphere.

For other combinations of OS version, Kubernetes version, and infrastructure, such as with the RHEL v7 OS, there are no default machine images, but you can build them.

If you build and use a custom image with the same OS version, Kubernetes version, and infrastructure that a default image already has, your custom image replaces the default. The tanzu CLI then creates new clusters using your custom image, and no longer uses the default image, for that combination of OS version, Kubernetes version, and target infrastructure.

Cluster API

Cluster API (CAPI) is built on the principles of immutable infrastructure. All nodes that make up a cluster are derived from a common template or machine image.

When CAPI creates a cluster from a machine image, it expects several things to be configured, installed, and accessible or running, including the following. A quick way to spot-check these items on a booted node is sketched after this list.

n The versions of kubeadm, kubelet, and kubectl specified in the workload cluster manifest.

n A container runtime, most often containerd.

n All required images for kubeadm init and kubeadm join. You must include any images that are not published and must be pulled locally, as with VMware-signed images.

n cloud-init configured to accept bootstrap instructions.
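
For example, if you have SSH access to a node built from a custom image (a hypothetical troubleshooting scenario, not part of the build procedure), commands like the following show whether the expected components are present:

kubeadm version
kubelet --version
kubectl version --client
containerd --version
cloud-init --version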


Build a Custom Machine Image

This procedure builds a custom machine image for Tanzu Kubernetes Grid to use when creating workload cluster nodes. It works by:

1 Collecting parameter strings that give a Kubernetes Image Builder command the context and inputs that it needs to create the custom image.

2 Passing the parameter strings to a long docker run command that runs the Kubernetes Image Builder command in a container.

Prerequisites

To create a custom machine image, you need:

n An account on your target infrastructure: AWS, Azure, or vSphere.

n A macOS or Linux workstation with the following installed:

n Docker Desktop

n For AWS: The aws command-line interface (CLI)

n For Azure: The az CLI

n For vSphere: A local copy of the OVFTool linux installer

n RHEL: To build a RHEL 7 image for vSphere, you need to use a Linux workstation, not macOS.

Procedure

1 On AWS and Azure, log in to your infrastructure CLI. Authenticate and specify your region, if prompted:

n AWS: Run aws configure.

n Azure: Run az login.

2 On vSphere, do the following:

a Download the Open Virtualization Format (OVF) tool from VMware {code}. You need the installer for x86_64 Linux, because you install the tool into a Linux container rather than locally. In the following steps, this file is referred to as YOUR-OVFTOOL-INSTALLER-FILE, and it should be in the same directory as your new Dockerfile.

b Create a Dockerfile and fill in values as shown:

FROM k8s.gcr.io/scl-image-builder/cluster-node-image-builder-amd64:v0.1.9
USER root
ENV LC_CTYPE=POSIX
ENV OVFTOOL_FILENAME=YOUR-OVFTOOL-INSTALLER-FILE
ADD $OVFTOOL_FILENAME /tmp/
RUN /bin/sh /tmp/$OVFTOOL_FILENAME --console --required --eulas-agreed && \
    rm -f /tmp/$OVFTOOL_FILENAME
USER imagebuilder
ENV IB_OVFTOOL=1

c Build a new container image from the Dockerfile. It is recommended to give it a custom name that is meaningful to you:

docker build . -t projects.registry.vmware.com/tkg/imagebuilder-byoi:v0.1.9

d Create a vSphere credentials JSON file and fill in its values:

{ "cluster": "", "convert_to_template": "false", "create_snapshot": "true", "datacenter": "", "datastore": "", "folder": "", "insecure_connection": "false", "linked_clone": "true", "network": "", "password": "", "resource_pool": "", "template": "", "username": "", "vcenter_server": "" }

3 Determine the Image Builder configuration version that you want to build from.

n Search the VMware {code} Sample Exchange for TKG Image Builder to list the available versions.

n Each version corresponds to the Kubernetes version that Image Builder uses. For example, TKG-Image-Builder-for-Kubernetes-v1.20.5-master.zip builds a Kubernetes v1.20.5 image.

n If you need to create a management cluster, which you must do when you first install Tanzu Kubernetes Grid, choose the default Kubernetes version of your Tanzu Kubernetes Grid version. For example, in Tanzu Kubernetes Grid v1.3.1, the default Kubernetes version is v1.20.5.

4 The Image Builder configurations have two different architectures and build instructions, based on their Kubernetes versions:

n For v1.20.5, v1.20.4, v1.19.9, v1.19.8, v1.18.17, v1.18.16, or v1.17.16, continue with the procedure below.

n For v1.19.3, v1.19.1, v1.18.10, v1.18.8, v1.17.13, and v1.17.11, follow the Build an Image with Kubernetes Image Builder instructions in the Tanzu Kubernetes Grid v1.2 documentation:

n Build and Use Custom AMI images on Amazon EC2


n Build and Use Custom VM images on Azure

n Build and Use Custom OVA Images on vSphere

After creating a custom image file following the v1.2 procedure, continue with Use a Custom Machine Image below. Do not follow the Tanzu Kubernetes Grid v1.2 procedure to add a reference to the custom image to a Bill of Materials (BoM) file.

5 Download the configuration code zip file, and unpack its contents.

6 cd into the TKG-Image-Builder- directory, so that the tkg.json file is in your current directory.

7 Collect the following parameter strings to plug into the command in the next step. Many of these specify docker run -v parameters that copy your current working directories into the /home/imagebuilder directory of the container used to build the image.

n AUTHENTICATION: Copies your local CLI directory:

n AWS: Use ~/.aws:/home/imagebuilder/.aws

n Azure: Use ~/.azure:/home/imagebuilder/.azure

n vSphere: Use /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json

n SOURCES: Copies the repo's tkg.json file, which lists download sources for versioned OS, Kubernetes, and container network interface (CNI) images:

n Use /PATH/TO/tkg.json:/home/imagebuilder/tkg.json

n ROLES: Copies the repo's tkg directory, which contains Ansible roles required by Image Builder.

n Use /PATH/TO/tkg:/home/imagebuilder/tkg

n To add custom Ansible roles, edit the tkg.json file to reformat the custom_role_names setting with escaped quotes (\"), in order to make it a list with multiple roles. For example: "custom_role_names": "\"/home/imagebuilder/tkg /home/imagebuilder/mycustomrole\"",

n TESTS: Copies a goss test directory designed for the image's target infrastructure, OS, and Kubernetes version:

n Use the filename of a file in the repo's goss directory, for example amazon-ubuntu-1.20.5+vmware.2-goss-spec.yaml.

n CUSTOMIZATIONS: Copies a customizations file in JSON format.

n (Azure) AZURE-CREDS: Path to an Azure credentials file, as described in the Image Builder documentation.

n CONTAINER: A container hosted on Google Cloud Platform:

n AWS: Use k8s.gcr.io/scl-image-builder/cluster-node-image-builder-amd64:v0.1.9

n Azure: Use k8s.gcr.io/scl-image-builder/cluster-node-image-builder-amd64:v0.1.9

n vSphere: Use k8s.gcr.io/scl-image-builder/cluster-node-image-builder-amd64:v0.1.9


n COMMAND: Use a command like one of the following, based on the custom image OS. For vSphere and Azure images, the commands start with build-node-ova- and build-azure-sig-:

n build-ami-ubuntu-2004: Ubuntu v20.04

n build-ami-ubuntu-1804: Ubuntu v18.04

n build-ami-amazon-2: Amazon Linux 2

8 Using the strings above, run the Image Builder in a Docker container:

docker run -it --rm \
  -v AUTHENTICATION \
  -v SOURCES \
  -v ROLES \
  -v /PATH/TO/goss/TESTS.yaml:/home/imagebuilder/goss/goss.yaml \
  -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/CUSTOMIZATIONS.json \
  --env PACKER_VAR_FILES="tkg.json CUSTOMIZATIONS.json" \
  --env-file AZURE-CREDS \
  CONTAINER \
  COMMAND

Notes:

n Omit the --env-file line if you are not building an image for Azure.

n This command may take several minutes to complete.

For example, to create a custom image with Ubuntu v20.04 and Kubernetes v1.20.5 to run on AWS, running from the directory that contains tkg.json:

docker run -it --rm \
  -v ~/.aws:/home/imagebuilder/.aws \
  -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
  -v $(pwd)/tkg:/home/imagebuilder/tkg \
  -v $(pwd)/goss/amazon-ubuntu-1.20.5+vmware.2-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
  -v /PATH/TO/CUSTOMIZATIONS.json:/home/imagebuilder/aws.json \
  --env PACKER_VAR_FILES="tkg.json aws.json" \
  k8s.gcr.io/scl-image-builder/cluster-node-image-builder-amd64:v0.1.9 \
  build-ami-ubuntu-2004

For vSphere, you must use the custom container image created above. You must also set a version string that matches what you will pass in your custom TKr in the later steps. VMware-published OVAs have a version string like v1.20.5+vmware.2-tkg.1; it is recommended that you replace the -tkg.1 suffix with a string that is meaningful to your organization. To set this version string, define it in a metadata.json file like the following:

{ "VERSION": "v1.20.5+vmware.2-myorg.0" }


When building OVAs, the .ova file is saved to the local filesystem of your workstation. Mount the folder where you want the OVAs to be saved to /home/imagebuilder/output within the container. Then, create the OVA using the container image:

docker run -it --rm \
  -v /PATH/TO/CREDENTIALS.json:/home/imagebuilder/vsphere.json \
  -v $(pwd)/tkg.json:/home/imagebuilder/tkg.json \
  -v $(pwd)/tkg:/home/imagebuilder/tkg \
  -v $(pwd)/goss/vsphere-ubuntu-1.20.5+vmware.2-goss-spec.yaml:/home/imagebuilder/goss/goss.yaml \
  -v $(pwd)/metadata.json:/home/imagebuilder/metadata.json \
  -v /PATH/TO/OVA/DIR:/home/imagebuilder/output \
  --env PACKER_VAR_FILES="tkg.json vsphere.json" \
  --env OVF_CUSTOM_PROPERTIES=/home/imagebuilder/metadata.json \
  projects.registry.vmware.com/tkg/imagebuilder-byoi:v0.1.9 \
  build-node-ova-vsphere-ubuntu-2004

RHEL: To build a RHEL OVA you need to use a Linux machine, not macOS, because Docker on macOS does not support the --network host option. You must also include additional flags in the docker run command above, so that the container mounts your RHEL ISO rather than pulling from a public URL, and so that it can access Red Hat Subscription Manager credentials to connect to vCenter:

-v $(pwd)/isos/rhel-server-7.7-x86-64-dvd.iso:/rhel-server-7.7-x86-64-dvd.iso \
--network host \
--env RHSM_USER=USER --env RHSM_PASS=PASS

Where:

n RHSM_USER and RHSM_PASS are the user/password combination that registers the node with Red Hat Subscription Manager to gain access to RPM repositories. This node must be unregistered after installation has finished.

n You map your local RHEL ISO path, $(pwd)/isos/rhel-server-7.7-x86-64-dvd.iso in the example above, as an additional volume.

Use a Custom Machine Image

After you have created a custom image, you enable the tanzu CLI to use it by creating a custom Tanzu Kubernetes release (TKr) based on the image.


To create a custom TKr, you add it to the Bill of Materials (BoM) of the TKr for the image's Kubernetes version. For example, to add a custom image that you built with Kubernetes v1.20.5, you modify the current ~/.tanzu/tkg/bom/tkr-bom-v1.20.5.yaml file.

1 From your ~/.tanzu/tkg/bom/ directory, open the TKr BoM corresponding to your custom image's Kubernetes version, for example a file named tkr-bom-v1.20.5+vmware.2-tkg.1.yaml for Kubernetes v1.20.5.

n If the directory lacks the TKr BoM file that you need, you can bring it in by deploying a cluster with the desired Kubernetes version, as described in Deploy a Cluster with a Non-Default Kubernetes Version.

2 In the BoM file, find the image definition blocks for your infrastructure: ova for vSphere, ami for AWS, and azure for Azure.

3 Determine whether an existing definition block applies to your image's OS, as listed by osinfo.name, .version, and .arch.

4 If no existing block applies to your image's osinfo, add a new block as follows. If an existing block does apply, replace its values as follows:

n vSphere:

n name: a unique name for your OVA that includes the OS version, like my-ubuntu-2004

n version: follow existing version value format, but use the unique VERSION assigned in metadata.json when you created the OVA, for example v1.20.5+vmware.2-myorg.0.

n AWS - for each region that you plan to use the custom image in:

n id: follow existing id value format, but use a unique hex string at the end, for example ami-693a5e2348b25e428.

n Azure:

n sku: a unique SKU for your image that includes the OS version, like my-k8s-1dot20dot4-ubuntu-2004

If the BoM file defines images under regions, your new or modified custom image definition block must be listed first in its region. Within each region, the cluster creation process picks the first suitable image listed.
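
As an illustration only, a custom AMI definition listed first in its region might look like the following schematic excerpt; the IDs and osinfo values are placeholders, so confirm the exact structure against your own BoM file:

ami:
  us-east-1:
  - id: ami-693a5e2348b25e428
    osinfo:
      name: ubuntu
      version: "2004"
      arch: amd64
  - id: ami-0abcdef1234567890
    osinfo:
      name: amazon
      version: "2"
      arch: amd64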

5 Save the BoM file. If its filename includes a plus (+) character, save the modified file under a new filename that replaces the + with a triple dash (---). For example, tkr-bom-v1.20.5---vmware.2-tkg.1.yaml.

6 base64-encode the file contents into a binary string, for example:

cat tkr-bom-v1.20.5---vmware.2-tkg.1.yaml | base64 -w 0


7 Create a ConfigMap YAML file in the tkr-system namespace, also without a + in its filename, and fill in values as shown:

apiVersion: v1
kind: ConfigMap
metadata:
  name: CUSTOM-TKG-BOM
  labels:
    tanzuKubernetesRelease: CUSTOM-TKR
binaryData:
  bomContent: BOM-BINARY-CONTENT

Where:

n CUSTOM-TKG-BOM is the name of the ConfigMap YAML file, without the .yaml extension, such as my-custom-tkr-bom-v1.20.5---vmware.2-tkg.1

n CUSTOM-TKR is a name for your TKr, such as my-custom-tkr-v1.20.5---vmware.2-tkg.1

n BOM-BINARY-CONTENT is the base64-encoded content of your customized BoM file.

8 Save the ConfigMap file, set the kubectl context to the management cluster that you want to add the TKr to, and apply the file to the cluster, for example:

kubectl apply -f my-custom-tkr-bom-v1.20.5---vmware.2-tkg.1.yaml

n Once the ConfigMap is created, the TKr Controller reconciles the new object by creating a TanzuKubernetesRelease. The default reconciliation period is 600 seconds. You can avoid this delay by deleting the TKr Controller pod, which causes the pod to be recreated and reconcile immediately:

1 List pods in the tkr-system namespace:

kubectl get pod -n tkr-system

2 Retrieve the name of the TKr Controller pod, which looks like tkr-controller-manager-f7bbb4bd4-d5lfd.

3 Delete the pod:

kubectl delete pod -n tkr-system TKG-CONTROLLER

Where TKG-CONTROLLER is the name of the TKr Controller pod.

9 To check that the custom TKr was added, run tanzu kubernetes-release get or kubectl get tkr, and look for the CUSTOM-TKR value set above in the output.

Once your custom TKr is listed by the kubectl and tanzu CLIs, you can pass it to the --tkr option of tanzu cluster create to create clusters with your new custom image.
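
For example, assuming a workload cluster name of my-cluster (hypothetical) and the example TKr name used above:

tanzu cluster create my-cluster --plan dev --tkr my-custom-tkr-v1.20.5---vmware.2-tkg.1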

Upgrading Tanzu Kubernetes Grid 9

To upgrade Tanzu Kubernetes Grid, you download and install the new version of the Tanzu CLI on the machine that you use as the bootstrap machine. You must also download and install base image templates and VMs, depending on whether you are upgrading clusters that you previously deployed to vSphere, Amazon EC2, or Azure.

After you have installed the new versions of the components, you use the tanzu management-cluster upgrade and tanzu cluster upgrade CLI commands to upgrade management clusters and Tanzu Kubernetes clusters.

This chapter includes the following topics:
n Prerequisites
n Procedure
n Download and Install the Tanzu CLI
n Import Configuration Files from Existing v1.2 Management Clusters
n Replace v1.2 tkg Commands with tanzu Commands
n Prepare to Upgrade Clusters on vSphere
n Prepare to Upgrade Clusters on Amazon EC2
n Prepare to Upgrade Clusters on Azure
n Set the TKG_BOM_CUSTOM_IMAGE_TAG
n Upgrade Management Clusters
n Upgrade Workload Clusters
n Upgrade the Tanzu Kubernetes Grid Extensions
n Register Core Add-ons
n Upgrade Crash Recovery and Diagnostics
n Install NSX Advanced Load Balancer After Tanzu Kubernetes Grid Upgrade (vSphere)
n What to Do Next
n Upgrade Management Clusters
n Upgrade Tanzu Kubernetes Clusters


n Upgrade Tanzu Kubernetes Grid Extensions
n Register Core Add-ons
n Select an OS During Cluster Upgrade
n Upgrade vSphere Deployments in an Internet-Restricted Environment

Prerequisites

Before you begin the upgrade to Tanzu Kubernetes Grid v1.3.1, you must ensure that the following prerequisites are met.

n Your current deployment is Tanzu Kubernetes Grid v1.2.x or v1.3.0. To upgrade to v1.3.x from Tanzu Kubernetes Grid versions earlier than v1.2, you must first upgrade to v1.2.x with the tkg v1.2.x CLI.

n If your deployment is running on vSphere, you have migrated your clusters from an HA Proxy Load Balancer to Kube-VIP. You must complete this migration before upgrading to Tanzu Kubernetes Grid v1.3.x. For migration instructions, see Migrate Clusters from an HA Proxy Load Balancer to Kube-VIP in the v1.2 documentation.

Procedure

The following sections describe the overall steps required to upgrade Tanzu Kubernetes Grid. This procedure assumes that you are upgrading to Tanzu Kubernetes Grid v1.3.1.

Some steps are only required if you are performing a major upgrade from Tanzu Kubernetes Grid v1.2.x to v1.3.1 and are not required if you are performing a minor upgrade from Tanzu Kubernetes Grid v1.3.0 to v1.3.1.

If you deployed the previous version of Tanzu Kubernetes Grid on vSphere in an Internet-restricted environment, see Upgrade vSphere Deployments in an Internet-Restricted Environment.

Download and Install the Tanzu CLI

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

To download the Tanzu CLI, perform the following steps.

1 Follow the instructions in Chapter 3 Install the Tanzu CLI and Other Tools to download and install the Tanzu CLI and kubectl on the machine where you currently run your tkg commands for v1.2.x or tanzu commands for v1.3.0.

2 After you install tanzu, run tanzu version to check that the correct version of the Tanzu CLI is properly installed.

3 After you install kubectl, run kubectl version to check that the correct version of kubectl is properly installed.
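
For example, a quick verification (output varies with the versions you installed):

tanzu version
kubectl version --client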


For information about Tanzu CLI commands and options that are available, see the Tanzu CLI Command Reference.

Import Configuration Files from Existing v1.2 Management Clusters

This step is only required for major upgrades from Tanzu Kubernetes Grid v1.2.x to v1.3.x.

On the machine where you currently run tkg commands to manage your Tanzu Kubernetes Grid v1.2 clusters, import your existing management cluster configurations into the configuration file used by the tanzu CLI.

If you store your management cluster configuration files in the default location, ~/.tkg/config.yaml, run the following command:

tanzu management-cluster import

If the configuration files for your management clusters are stored in a non-default location or if you have multiple management cluster config.yaml files, specify the location of the file in the command.

tanzu management-cluster import -f /path/to/config.yaml

If the import is successful, the command returns a message similar to the following:

the old providers folder /Users/username/.tkg/providers is backed up to /Users/username/.tkg/providers-20210309174325-0aw0ckqa
successfully imported server: tkg-cluster-mc

Management cluster configuration imported successfully

By importing your existing configuration files, your Tanzu Kubernetes Grid v1.2 management clusters are now accessible by the Tanzu CLI.

In ~/.tanzu/config.yaml, you should see new entries for the management clusters you imported in the servers section. For example:

servers:
- managementClusterOpts:
    context: tkg-cluster-mc-admin@tkg-cluster-mc
    path: /Users/username/.kube-tkg/tkg-cluster-mc_config
  name: tkg-cluster-mc
  type: managementcluster

Replace v1.2 tkg Commands with tanzu Commands

This step is only required for major upgrades from Tanzu Kubernetes Grid v1.2.x to v1.3.x.


If you use any tkg commands in automation scripts for your Tanzu Kubernetes clusters, make sure to replace any tkg command invocations with equivalent tanzu CLI commands before you upgrade to v1.3.x.

Tanzu Kubernetes Grid v1.3.x clusters cannot be managed with the tkg CLI.

For a reference of command equivalents, see Table of Equivalents in the Tanzu CLI Command Reference.
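
For example, a script line that previously created a cluster with the v1.2 CLI would change as follows (my-cluster is a hypothetical cluster name):

tkg create cluster my-cluster --plan dev      # Tanzu Kubernetes Grid v1.2.x
tanzu cluster create my-cluster --plan dev    # Tanzu Kubernetes Grid v1.3.x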

Prepare to Upgrade Clusters on vSphere

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

Before you can upgrade a Tanzu Kubernetes Grid deployment on vSphere, you must import into vSphere new versions of the base image templates that the upgraded management and Tanzu Kubernetes clusters will run. VMware publishes base image templates in OVA format for each supported OS and Kubernetes version. After importing the OVAs, you must convert the resulting VMs into VM templates.

This procedure assumes that you are upgrading to Tanzu Kubernetes Grid v1.3.1.

1 Go to the Tanzu Kubernetes Grid downloads page and log in with your My VMware credentials.

2 Download the latest Tanzu Kubernetes Grid OVAs for the OS and Kubernetes version lines that your management and Tanzu Kubernetes clusters are running.

For example, for Photon v3 images:

n Kubernetes v1.20.5: Photon v3 Kubernetes v1.20.5 OVA

n Kubernetes v1.19.9: Photon v3 Kubernetes v1.19.9 OVA

n Kubernetes v1.18.17: Photon v3 Kubernetes v1.18.17 OVA

For Ubuntu 20.04 images:

n Kubernetes v1.20.5: Ubuntu 2004 Kubernetes v1.20.5 OVA

n Kubernetes v1.19.9: Ubuntu 2004 Kubernetes v1.19.9 OVA

n Kubernetes v1.18.17: Ubuntu 2004 Kubernetes v1.18.17 OVA

Important: Make sure you download the most recent OVA base image templates in the event of security patch releases. You can find updated base image templates that include security patches on the Tanzu Kubernetes Grid product download page.

3 In the vSphere Client, right-click an object in the vCenter Server inventory and select Deploy OVF template.

4 Select Local file, click the button to upload files, and navigate to a downloaded OVA file on your local machine.


5 Follow the installer prompts to deploy a VM from the OVA.

n Accept or modify the appliance name.

n Select the destination datacenter or folder.

n Select the destination host, cluster, or resource pool.

n Accept the end user license agreements (EULA).

n Select the disk format and destination datastore.

n Select the network for the VM to connect to.

6 Click Finish to deploy the VM.

7 When the OVA deployment finishes, right-click the VM and select Template > Convert to Template.

8 In the VMs and Templates view, right-click the new template, select Add Permission, and assign your Tanzu Kubernetes Grid user, for example, tkg-user, to the template with the Tanzu Kubernetes Grid role, for example, TKG. You created this user and role in Prepare to Deploy Management Clusters to vSphere.

Repeat the procedure for each of the Kubernetes versions for which you have downloaded the OVA file.
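
As an alternative to the vSphere Client steps above, and assuming that you have the govc CLI installed and configured with your vCenter connection details (GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD, and related variables, which this procedure does not cover), an import might look like the following sketch, using a hypothetical OVA filename and template name:

govc import.ova -name photon-3-kube-v1.20.5 ./photon-3-kube-v1.20.5+vmware.2.ova
govc vm.markastemplate photon-3-kube-v1.20.5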

VMware Cloud on AWS SDDC Compatibility

If you are upgrading Tanzu Kubernetes clusters that are deployed on VMware Cloud on AWS, verify that the underlying Software-Defined Datacenter (SDDC) version used by your existing deployment is compatible with the version of Tanzu Kubernetes Grid you are upgrading to.

To view the version of an SDDC, select View Details on the SDDC tile in VMware Cloud Console and click on the Support pane.

To validate compatibility with Tanzu Kubernetes Grid, refer to the VMware Product Interoperability Matrix.

Prepare to Upgrade Clusters on Amazon EC2

No specific action is required for either major v1.2.x to v1.3.1 or minor v1.3.0 to v1.3.1 upgrades.

Amazon Linux 2 Amazon Machine Images (AMI) that include the supported Kubernetes versions are publicly available to all Amazon EC2 users, in all supported AWS regions. Tanzu Kubernetes Grid automatically uses the appropriate AMI for the Kubernetes version that you specify during upgrade.

In Tanzu Kubernetes Grid v1.2 and later, you created the required IAM resources by enabling the Automate creation of AWS CloudFormation Stack checkbox in the installer interface or by running the tkg config permissions aws command from the CLI.


When upgrading your management cluster to v1.3.x, you do not need to recreate the AWS CloudFormation Stack.

For more information, see Deploy Management Clusters with the Installer Interface or Deploy Management Clusters from a Configuration File.

Prepare to Upgrade Clusters on Azure

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

Before upgrading a Tanzu Kubernetes Grid deployment on Azure, you must accept the terms for the new default VM image and for each non-default VM image that you plan to use for your cluster VMs. You need to accept these terms once per subscription.

To accept the terms:

1 List all available VM images for Tanzu Kubernetes Grid in the Azure Marketplace:

az vm image list --publisher vmware-inc --offer tkg-capi --all

2 Accept the terms for the new default VM image:

az vm image terms accept --urn publisher:offer:sku:version

For example, to accept the terms for the default VM image in Tanzu Kubernetes Grid v1.3.1, k8s-1dot20dot5-ubuntu-2004, run:

az vm image terms accept --urn vmware-inc:tkg-capi:k8s-1dot20dot5-ubuntu-2004:2021.05.17

3 If you plan to upgrade any of your Tanzu Kubernetes clusters to a non-default Kubernetes version, such as v1.19.8 or v1.18.16, accept the terms for each non-default version that you want to use for your cluster VMs.
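
For example, to accept the terms for a hypothetical non-default v1.19.8 image, a command like the following would be used; confirm the exact urn (publisher:offer:sku:version) from the az vm image list output in step 1:

az vm image terms accept --urn vmware-inc:tkg-capi:k8s-1dot19dot8-ubuntu-2004:VERSION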

Set the TKG_BOM_CUSTOM_IMAGE_TAG

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

Before you can upgrade a management cluster to v1.3, you must specify the correct BOM file to use as a local environment variable. In the event of a patch release to Tanzu Kubernetes Grid, the BOM file may require an update to coincide with updated base image files.

Note For information about the most recent security patch updates to VMware Tanzu Kubernetes Grid v1.3, see the VMware Tanzu Kubernetes Grid v1.3.1 Release Notes and this Knowledgebase Article.

On the machine where you run the Tanzu CLI, perform the following steps:

1 Remove any existing BOM data.

rm -rf ~/.tanzu/tkg/bom


2 Specify the updated BOM to use by setting the following variable.

export TKG_BOM_CUSTOM_IMAGE_TAG="v1.3.1-patch1"

3 Run the tanzu management-cluster create command with no additional parameters.

tanzu management-cluster create

This command produces an error but results in the BOM files being downloaded to ~/.tanzu/tkg/bom.
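
To confirm that the BOM files were downloaded, you can list the directory:

ls ~/.tanzu/tkg/bom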

Upgrade Management Clusters

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

To upgrade Tanzu Kubernetes Grid, you must upgrade all management clusters in your deployment. You cannot upgrade Tanzu Kubernetes clusters until you have upgraded the management clusters that manage them.

Follow the procedure in Upgrade Management Clusters to upgrade your management clusters.

Upgrade Workload Clusters

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

After you upgrade the management clusters in your deployment, you can upgrade the Tanzu Kubernetes clusters that are managed by those management clusters.

Follow the procedure in Upgrade Tanzu Kubernetes Clusters to upgrade the Tanzu Kubernetes clusters that are running your workloads.

Upgrade the Tanzu Kubernetes Grid Extensions

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

If you implemented any or all of the Tanzu Kubernetes Grid extensions in Tanzu Kubernetes Grid v1.2.x or v1.3.0, you must upgrade the extensions after you upgrade your management and workload clusters to Tanzu Kubernetes Grid v1.3.1. For information about how to upgrade the extensions, see Upgrade Tanzu Kubernetes Grid Extensions.

Register Core Add-ons

This step is only required for major upgrades from Tanzu Kubernetes Grid v1.2.x to v1.3.x.

After you upgrade your management and workload clusters from Tanzu Kubernetes Grid v1.2.x to v1.3.x, follow the instructions in Register Core Add-ons to register the CNI, vSphere CPI, vSphere CSI, Pinniped, and Metrics Server add-ons with tanzu-addons-manager, the component that manages the lifecycle of add-ons.


Upgrade Crash Recovery and Diagnostics

This step is required for both major v1.2.x to v1.3.1 and minor v1.3.0 to v1.3.1 upgrades.

For information about how to upgrade Crash Recovery and Diagnostics, see Install or Upgrade the Crash Recovery and Diagnostics Binary.

Install NSX Advanced Load Balancer After Tanzu Kubernetes Grid Upgrade (vSphere)

If you are using NSX ALB on vSphere, follow this procedure to set it up after upgrading your Tanzu Kubernetes Grid installation to v1.3.1.

1 If NSX ALB was not enabled in your Tanzu Kubernetes Grid v1.2 installation, perform the following substeps. If you did have NSX ALB enabled for v1.2, skip to the export commands in step 2.

a Configure the Avi Controller. For more information, see Avi Controller: Basics and subsequent sections.

b On your local system, open the file ~/.config/tanzu/tkg/CONFIG.YAML.

Where CONFIG.YAML is your configuration yaml file for the management cluster.

c Add the Avi details in the following AVI fields, and save the file:

AVI_CA_DATA:
AVI_CLOUD_NAME:
AVI_CONTROLLER:
AVI_DATA_NETWORK:
AVI_DATA_NETWORK_CIDR:
AVI_ENABLE: "true"
AVI_LABELS: ""
AVI_USERNAME:
AVI_PASSWORD:
AVI_SERVICE_ENGINE_GROUP:

Example for the configuration yaml file:

AVI_CA_DATA: |
  -----BEGIN CERTIFICATE-----
  MIICxzCCAa+gAwIBAgIUT+SWtJ1JK4...
  -----END CERTIFICATE-----
AVI_CLOUD_NAME: Default-Cloud
AVI_CONTROLLER: 10.83.20.229
AVI_DATA_NETWORK: VM Network
AVI_DATA_NETWORK_CIDR: 10.83.0.0/19
AVI_ENABLE: "true"
AVI_LABELS: ""
AVI_USERNAME: admin
AVI_PASSWORD:
AVI_SERVICE_ENGINE_GROUP: Default-Group


2 Run the following commands to export the required environment variables:

export _TKG_CLUSTER_FORCE_ROLE=management
export FILTER_BY_ADDON_TYPE="avi/ako-operator"
export NAMESPACE="tkg-system"

3 Run the following command to generate the manifest file:

tanzu cluster create ${MANAGEMENT_CLUSTER_NAME} --dry-run -f ~/.config/tanzu/tkg/clusterconfigs/MANAGEMENT_CLUSTER_CONFIG.yaml --vsphere-controlplane-endpoint 1.1.1.1 > ako-operator-addon-manifest.yaml

4 Run the following command to apply the changes:

kubectl apply -f ako-operator-addon-manifest.yaml

What to Do Next

n Examine your upgraded management clusters or register them in Tanzu Mission Control. See Examine the Management Cluster Deployment and Register Your Management Cluster with Tanzu Mission Control.

n If you have not done so already, enable identity management in Tanzu Kubernetes Grid. See Enabling Identity Management in Tanzu Kubernetes Grid.

Upgrade Management Clusters

To upgrade your Tanzu Kubernetes Grid instance, you must first upgrade the management cluster. You cannot upgrade Tanzu Kubernetes clusters until you have upgraded the management cluster that manages them.

IMPORTANT: Management clusters and Tanzu Kubernetes clusters use client certificates to authenticate clients. These certificates are valid for one year. To renew them, upgrade your clusters at least once a year.

Prerequisites

n You performed the steps in Chapter 9 Upgrading Tanzu Kubernetes Grid that occur before the step for upgrading management clusters.

n If you deployed the previous version of Tanzu Kubernetes Grid in an Internet-restricted environment, you have performed the steps in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment to recreate and run the gen-publish-images.sh and publish-images.sh scripts with the new component image versions.


Procedure

1 Run the tanzu login command to see an interactive list of management clusters available for upgrade.

tanzu login

2 Select the management cluster that you want to upgrade. See List Management Clusters and Change Context for more information.

3 Run the tanzu cluster list command with the --include-management-cluster option. This command shows the versions of Kubernetes running on the management cluster and all of the clusters that it manages:

$ tanzu cluster list --include-management-cluster
  NAME                 NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN
  k8s-1-17-13-cluster  default     running  1/1           1/1      v1.17.13+vmware.1              dev
  k8s-1-18-10-cluster  default     running  1/1           1/1      v1.18.10+vmware.1              dev
  k8s-1-19-3-cluster   default     running  1/1           1/1      v1.19.3+vmware.1               dev
  mgmt-cluster         tkg-system  running  1/1           1/1      v1.20.4+vmware.1   management  dev

4 Before you run the upgrade command, remove all unmanaged kapp-controller deployment artifacts from the management cluster. An unmanaged kapp-controller deployment is a deployment that exists outside of the vmware-system-tmc namespace.

a Delete the kapp-controller deployment.

kubectl delete deployment kapp-controller -n kapp-controller

Note: If you receive a NotFound error message, ignore the error. You should continue with the following deletion steps in case you have any orphaned objects related to a pre-existing kapp-controller deployment.

Error from server (NotFound): deployments.apps "kapp-controller" not found

b Delete all kapp-controller objects.

kubectl delete clusterrole kapp-controller-cluster-role
kubectl delete clusterrolebinding kapp-controller-cluster-role-binding
kubectl delete serviceaccount kapp-controller-sa -n kapp-controller

5 If you set up Harbor access by installing a connectivity API on your v1.2 management cluster, follow the Replace Connectivity API with a Load Balancer procedure below. If your workload clusters access Harbor via a load balancer, proceed to the next step.


6 Run the tanzu management-cluster upgrade command and enter y to confirm.

The following command upgrades the current management cluster.

tanzu management-cluster upgrade

If multiple base VM images in your IaaS account have the same version of Kubernetes that you are upgrading to, use the --os-name option to specify the OS you want. See Select an OS During Cluster Upgrade for more information.

For example, on vSphere if you have uploaded both Photon and Ubuntu OVA templates with Kubernetes v1.20.5, specify --os-name ubuntu to upgrade your management cluster to run on an Ubuntu VM.

tanzu management-cluster upgrade --os-name ubuntu

To skip the confirmation step when you upgrade a cluster, specify the --yes option.

tanzu management-cluster upgrade --yes

The upgrade process first upgrades the Cluster API providers for vSphere, Amazon EC2, or Azure that are running in the management cluster. Then, it upgrades the version of Kubernetes in all of the control plane and worker nodes of the management cluster.

If the upgrade times out before it completes, run tanzu management-cluster upgrade again and specify the --timeout option with a value greater than the default of 30 minutes.

tanzu management-cluster upgrade --timeout 45m0s

7 When the upgrade finishes, run the tanzu cluster list command with the --include-management-cluster option again to check that the management cluster has been upgraded.

tanzu cluster list --include-management-cluster

You see that the management cluster is now running the new version of Kubernetes, but that the Tanzu Kubernetes clusters are still running previous versions of Kubernetes.

NAME                 NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN
k8s-1-17-13-cluster  default     running  1/1           1/1      v1.17.13+vmware.1              dev
k8s-1-18-10-cluster  default     running  1/1           1/1      v1.18.10+vmware.1              dev
k8s-1-19-3-cluster   default     running  1/1           1/1      v1.19.3+vmware.1               dev
mgmt-cluster         tkg-system  running  1/1           1/1      v1.20.5+vmware.2   management  dev


Replace Connectivity API with a Load Balancer

In Tanzu Kubernetes Grid v1.2, workload clusters access the Harbor service via a load balancer or a connectivity API installed in the management cluster. Tanzu Kubernetes Grid v1.3 only supports a load balancer for Harbor access. If your v1.2 installation uses the connectivity API, you need to remove it and set up a load balancer for the Harbor domain name before you upgrade:

1 Set up a load balancer for your workload clusters. For vSphere, see Install VMware NSX Advanced Load Balancer on a vSphere Distributed Switch.

2 Remove the tkg-connectivity operator and tanzu-registry webhook from the management cluster:

a Set the context of kubectl to the context of your management cluster:

kubectl config use-context MGMT-CLUSTER-admin@MGMT-CLUSTER

Where MGMT-CLUSTER is the name of your management cluster.

b Run the following commands to remove the resources and related objects:

kubectl delete mutatingwebhookconfiguration tanzu-registry-webhook
kubectl delete namespace tanzu-system-connectivity
kubectl delete namespace tanzu-system-registry
kubectl delete clusterrolebinding tanzu-registry-webhook
kubectl delete clusterrole tanzu-registry-webhook
kubectl delete clusterrolebinding tkg-connectivity-operator
kubectl delete clusterrole tkg-connectivity-operator

3 Undo the effects of the tanzu-registry webhook:

a List all cluster control plane resources in the management cluster:

kubectl get kubeadmcontrolplane -A | grep -v tkg-system

These resources correspond to the workload clusters and shared service cluster.

b For each workload cluster control plane listed, run kubectl edit to edit its resource manifest as follows. For example, for a workload cluster my_cluster_1, run:

kubectl -n NAMESPACE edit kubeadmcontrolplane my_cluster_1-control-plane

Where NAMESPACE is the namespace in the management cluster in which the workload cluster was created. If the namespace is default, you can omit this option.

c In the manifest's files: section, delete /opt/tkg/tanzu-registry-proxy.sh from the files: array. A schematic excerpt of the sections to edit appears after this list.

d In the manifest's preKubeadmCommands: section, delete the two lines that start with the following commands:

n echo - appends an IP address for Harbor into the /etc/hosts file.

n /opt/tkg/tanzu-registry-proxy.sh - executes the tanzu-registry-proxy.sh script.
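
The following is a schematic excerpt of the relevant sections of a kubeadmcontrolplane manifest, with illustrative values only; the IP address, domain name, and exact script arguments will differ in your manifest:

  kubeadmConfigSpec:
    files:
    - content: ...
      path: /opt/tkg/tanzu-registry-proxy.sh      # delete this entire files: entry
    preKubeadmCommands:
    - echo '192.0.2.10 harbor.yourdomain.com' >> /etc/hosts   # delete this line
    - /opt/tkg/tanzu-registry-proxy.sh                        # delete this line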


Update the Callback URL for Management Clusters with OIDC Authentication

In Tanzu Kubernetes Grid v1.3.0, Pinniped used Dex as the endpoint for both OIDC and LDAP providers. In Tanzu Kubernetes Grid v1.3.1 and later, Pinniped no longer requires Dex and uses the Pinniped endpoint for OIDC providers; Dex is only used if you use an LDAP provider. If you used Tanzu Kubernetes Grid v1.3.0 to deploy management clusters that implement OIDC authentication, when you upgrade those management clusters to v1.3.1, the dexsvc service running in the management cluster is removed and replaced by the pinniped-supervisor service. Consequently, you must update the callback URLs that you specified in your OIDC provider after you deployed the management clusters with Tanzu Kubernetes Grid v1.3.0, so that the provider connects to the pinniped-supervisor service rather than to the dexsvc service.

Obtain the Address of the Pinniped Service

Before you can update the callback URL, you must obtain the address of the Pinniped service that is running in the upgraded cluster.

1 Get the admin context of the management cluster.

tanzu management-cluster kubeconfig get --admin

If your management cluster is named id-mgmt-test, you should see the confirmation Credentials of workload cluster 'id-mgmt-test' have been saved. You can now access the cluster by running 'kubectl config use-context id-mgmt-test-admin@id-mgmt-test'. The admin context of a cluster gives you full access to the cluster without requiring authentication with your IDP.

2 Set kubectl to the admin context of the management cluster.

kubectl config use-context id-mgmt-test-admin@id-mgmt-test

3 Get information about the services that are running in the management cluster.

In Tanzu Kubernetes Grid v1.3.1 and later, the identity management service runs in the pinniped-supervisor namespace:

kubectl get all -n pinniped-supervisor

You see the following entry in the output:

vSphere:

NAME                          TYPE      CLUSTER-IP    EXTERNAL-IP  PORT(S)          AGE
service/pinniped-supervisor   NodePort  100.70.70.12  <none>       5556:31234/TCP   84m


Amazon EC2:

NAME                          TYPE          CLUSTER-IP    EXTERNAL-IP                               PORT(S)         AGE
service/pinniped-supervisor   LoadBalancer  100.69.13.66  ab1[...]71.eu-west-1.elb.amazonaws.com    443:30865/TCP   56m

Azure:

NAME                          TYPE          CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/pinniped-supervisor   LoadBalancer  100.69.169.220  20.54.226.44  443:30451/TCP   84m

4 Note the following information:

n For management clusters that are running on vSphere, note the node port on which the pinniped-supervisor service is exposed. In the example above, this is the port 31234 shown under PORT(S).

n For clusters that you deploy to Amazon EC2 and Azure, note the external address of the load balancer on which the pinniped-supervisor service is running, which is listed under EXTERNAL-IP.

Update the Callback URL

Once you have obtained information about the address at which pinniped-supervisor is running, you must update the callback URL for your OIDC provider. For example, if your IDP is Okta, perform the following steps:

1 Log in to your Okta account.

2 In the main menu, go to Applications.

3 Select the application that you created for Tanzu Kubernetes Grid.

4 In the General Settings panel, click Edit.

5 Under Login, update Login redirect URIs to include the address of the node on which the pinniped-supervisor is running.

n On vSphere, update the pinniped-supervisor port number that you noted in the previous procedure.

https://NODE-ADDRESS:31234/callback

n On Amazon EC2 and Azure, update the external address of the LoadBalancer node on which the pinniped-supervisor is running, which you noted in the previous procedure.

https://EXTERNAL-ADDRESS/callback

Specify https, not http.

6 Click Save.


What to Do Next

You can now Upgrade Tanzu Kubernetes Clusters that this management cluster manages and Deploy Tanzu Kubernetes Clusters. By default, any new clusters that you deploy with this management cluster will run the new default version of Kubernetes.

However, if required, you can use the tanzu cluster create command with the --tkr option to deploy new clusters that run different versions of Kubernetes. For more information, see Deploy Tanzu Kubernetes Clusters with Different Kubernetes Versions.
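
For example, assuming a new workload cluster named my-new-cluster (hypothetical) and one of the TKr names shown later in this chapter:

tanzu cluster create my-new-cluster --plan dev --tkr v1.19.9---vmware.1-tkg.1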

Upgrade Tanzu Kubernetes Clusters

After you have upgraded a management cluster, you can Upgrade Tanzu Kubernetes Clusters that the management cluster manages.

IMPORTANT: Management clusters and Tanzu Kubernetes clusters use client certificates to authenticate clients. These certificates are valid for one year. To renew them, upgrade your clusters at least once a year.

Prerequisites

n You performed the steps in Chapter 9 Upgrading Tanzu Kubernetes Grid that occur before the step for upgrading Tanzu Kubernetes clusters.

n You performed the steps in Upgrade Management Clusters to upgrade the management cluster that manages the Tanzu Kubernetes clusters that you want to upgrade.

n If you are upgrading clusters that run on vSphere, before you can upgrade clusters to a non-default version of Kubernetes for your version of Tanzu Kubernetes Grid, the appropriate base image template OVAs must be available in vSphere as VM templates. For information about importing OVA files into vSphere, see Prepare to Upgrade Clusters on vSphere.

n If you are upgrading clusters that run on Amazon EC2, the Amazon Linux 2 Amazon Machine Images (AMI) that include the supported Kubernetes versions are publicly available to all Amazon EC2 users, in all supported AWS regions. Tanzu Kubernetes Grid automatically uses the appropriate AMI for the Kubernetes version that you specify during upgrade.

n If you are upgrading clusters that run on Azure, ensure that you completed the steps in Prepare to Upgrade Clusters on Azure.

Procedure

The upgrade process upgrades the version of Kubernetes in all of the control plane and worker nodes of your Tanzu Kubernetes clusters.

1 Run the tanzu login command to see an interactive list of available management clusters.

tanzu login


2 Select a management cluster to switch the context of the Tanzu CLI. You should select the management cluster that manages the clusters you want to upgrade. See List Management Clusters and Change Context for more information.

3 Run the tanzu cluster list command with the --include-management-cluster option.

tanzu cluster list --include-management-cluster

The tanzu cluster list command shows the version of Kubernetes that is running in the management cluster and all of the clusters that it manages. In this example, you can see that the management cluster has already been upgraded to v1.20.5, but the Tanzu Kubernetes clusters are running older versions of Kubernetes.

NAME                 NAMESPACE   STATUS   CONTROLPLANE  WORKERS  KUBERNETES         ROLES       PLAN
k8s-1-17-13-cluster  default     running  1/1           1/1      v1.17.13+vmware.1              dev
k8s-1-18-10-cluster  default     running  1/1           1/1      v1.18.10+vmware.1              dev
k8s-1-19-3-cluster   default     running  1/1           1/1      v1.19.3+vmware.1               dev
mgmt-cluster         tkg-system  running  1/1           1/1      v1.20.5+vmware.1   management  dev

4 If your v1.2 management cluster used the connectivity API to support Harbor access, you need to remove tanzu-system-connectivity artifacts from each workload cluster as follows:

a Set the context of kubectl to the context of your workload cluster:

kubectl config use-context WORKLOAD-CLUSTER-admin@WORKLOAD-CLUSTER

Where WORKLOAD-CLUSTER is the name of your workload cluster.

b Delete the tanzu-system-connectivity namespace:

kubectl delete ns tanzu-system-connectivity

Note: If your Harbor registry in v1.2 used a fictitious domain name such as harbor.system.tanzu instead of an FQDN, you cannot upgrade workload clusters automatically. Instead, you must create a new v1.3 workload cluster and migrate the workloads to the new cluster manually.

5 Before you upgrade a Tanzu Kubernetes cluster, remove all unmanaged kapp-controller deployment artifacts from the Tanzu Kubernetes cluster. An unmanaged kapp-controller deployment is a deployment that exists outside of the vmware-system-tmc namespace, typically in the kapp-controller namespace.

a Get the credentials of the cluster.

tanzu cluster kubeconfig get CLUSTER-NAME --admin


For example, using cluster k8s-1-19-3-cluster:

tanzu cluster kubeconfig get k8s-1-19-3-cluster --admin

b Set the context of kubectl to the cluster:

kubectl config use-context k8s-1-19-3-cluster-admin@k8s-1-19-3-cluster

c Delete the kapp-controller deployment on the cluster.

kubectl delete deployment kapp-controller -n kapp-controller

Note: If you receive a NotFound error message, ignore the error. You should continue with the following deletion steps in case you have any orphaned objects related to a pre-existing kapp-controller deployment.

Error from server (NotFound): deployments.apps "kapp-controller" not found

d Delete all kapp-controller objects.

kubectl delete clusterrole kapp-controller-cluster-role
kubectl delete clusterrolebinding kapp-controller-cluster-role-binding
kubectl delete serviceaccount kapp-controller-sa -n kapp-controller

6 To discover which versions of Kubernetes are made available by a management cluster, run the tanzu kubernetes-release get command.

tanzu kubernetes-release get

The output lists all of the versions of Kubernetes that you can use to deploy clusters, with the following notes:

n COMPATIBLE: The current management cluster can deploy workload clusters with this Tanzu Kubernetes release (tkr).

n UPGRADEAVAILABLE: This tkr is not the most current in its Kubernetes version line. Any workload clusters running this tkr version can be upgraded to newer versions.

For example:

NAME                       VERSION                  COMPATIBLE  UPGRADEAVAILABLE
v1.17.16---vmware.2-tkg.2  v1.17.16+vmware.2-tkg.2  True        True
v1.18.16---vmware.1-tkg.2  v1.18.16+vmware.1-tkg.2  True        True
v1.18.17---vmware.1-tkg.1  v1.18.17+vmware.1-tkg.1  True        True
v1.19.8---vmware.1-tkg.2   v1.19.8+vmware.1-tkg.2   True        True
v1.19.9---vmware.1-tkg.1   v1.19.9+vmware.1-tkg.1   True        True
v1.20.4---vmware.1-tkg.2   v1.20.4+vmware.1-tkg.2   True        True
v1.20.5---vmware.1-tkg.1   v1.20.5+vmware.1-tkg.1   True        False


7 To discover the newer tkr versions to which you can upgrade a workload cluster running an older tkr version, run the tanzu kubernetes-release available-upgrades get command.

tanzu kubernetes-release available-upgrades get v1.19.8---vmware.1-tkg.1
NAME                      VERSION
v1.19.9---vmware.1-tkg.1  v1.19.9+vmware.1-tkg.1
v1.20.4---vmware.1-tkg.2  v1.20.4+vmware.1-tkg.2
v1.20.5---vmware.1-tkg.1  v1.20.5+vmware.1-tkg.1

You cannot skip minor versions when upgrading your tkr version. For example, you cannot upgrade a cluster directly from v1.18.x to v1.20.x. You must upgrade a v1.18.x cluster to v1.19.x before upgrading the cluster to v1.20.x.

4 Run the tanzu cluster upgrade CLUSTER-NAME command and enter y to confirm.

To upgrade the cluster to the default version of Kubernetes for this release of Tanzu Kubernetes Grid, run the tanzu cluster upgrade command without any options. For example, the following command upgrades the cluster k8s-1-19-3-cluster from v1.19.3 to v1.20.5.

tanzu cluster upgrade k8s-1-19-3-cluster

If the cluster is not running in the default namespace, specify the --namespace option.

tanzu cluster upgrade CLUSTER-NAME --namespace NAMESPACE-NAME

To skip the confirmation step when you upgrade a cluster, specify the --yes option.

tanzu cluster upgrade CLUSTER-NAME --yes

If an upgrade times out before it completes, run tanzu cluster upgrade again and specify the --timeout option with a value greater than the default of 30 minutes.

tanzu cluster upgrade CLUSTER-NAME --timeout 45m0s

If multiple base VM images in your IaaS account have the same version of Kubernetes that you are upgrading to, use the --os-name option to specify the OS you want. See Select an OS During Cluster Upgrade for more information.

For example, on vSphere if you have uploaded both Photon and Ubuntu OVA templates with Kubernetes v1.20.5, specify --os-name ubuntu to upgrade your workload cluster to run on an Ubuntu VM.

tanzu cluster upgrade CLUSTER-NAME --os-name ubuntu


Since you cannot skip minor versions of tkr, the upgrade command fails if you try to upgrade a cluster that is more than one minor version behind the default version. For example, you cannot upgrade directly from v1.18.x to v1.20.x. To upgrade a cluster to a version of Kubernetes that is not the default version for this release of Tanzu Kubernetes Grid, specify the --tkr option with the NAME of the chosen version, as listed by tanzu kubernetes-release get above. For example, the following command upgrades the cluster k8s-1-18-10-cluster from v1.18.10 to v1.19.9:

tanzu cluster upgrade k8s-1-18-10-cluster --tkr v1.19.9---vmware.1-tkg.1 --yes
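As an illustration only, the options described above can be combined in a single command. The cluster name, namespace, and tkr value below are placeholders; substitute the values that tanzu kubernetes-release get and tanzu cluster list report for your environment.

tanzu cluster upgrade my-workload-cluster --namespace my-namespace --tkr v1.20.5---vmware.1-tkg.1 --os-name ubuntu --timeout 45m0s --yes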

5 When the upgrade finishes, run the tanzu cluster list command with the --include-management-cluster option again, to check that the Tanzu Kubernetes cluster has been upgraded.

tanzu cluster list --include-management-cluster

You see that the k8s-1-17-13-cluster and k8s-1-19-3-cluster Tanzu Kubernetes clusters are now running Kubernetes v1.18.17 and v1.20.5 respectively.

NAME                  NAMESPACE    STATUS    CONTROLPLANE   WORKERS   KUBERNETES          ROLES        PLAN
k8s-1-17-13-cluster   default      running   1/1            1/1       v1.18.17+vmware.1                dev
k8s-1-18-10-cluster   default      running   1/1            1/1       v1.19.9+vmware.1                 dev
k8s-1-19-3-cluster    default      running   1/1            1/1       v1.20.5+vmware.1                 dev
mgmt-cluster          tkg-system   running   1/1            1/1       v1.20.5+vmware.1    management   dev

What to Do Next

You can now continue to use the Tanzu CLI to manage your clusters, and run your applications with the new version of Kubernetes.

To complete the upgrade, you should upgrade any extensions you have deployed such as Contour, Fluent Bit or Prometheus that are running on your Tanzu Kubernetes clusters.

You must also register any add-ons such as CNI, vSphere CPI, Pinniped or Metrics Server that you will be using in your Tanzu Kubernetes Grid deployment.

For more information on upgrading extensions, see Upgrade Tanzu Kubernetes Grid Extensions.

For instructions on how to register add-ons after upgrading your clusters from Tanzu Kubernetes Grid v1.2.x to v1.3.x, see Register Core Add-ons.

Upgrade Tanzu Kubernetes Grid Extensions

This topic describes how to upgrade Tanzu Kubernetes Grid extensions. You upgrade the extensions after you upgrade to v1.3.x. Tanzu Kubernetes Grid extensions are deployed and managed by kapp-controller from the Carvel Tools.


You should upgrade the following extensions on clusters that have been upgraded to Tanzu Kubernetes Grid v1.3.x:
n Contour
n Harbor
n Fluent Bit
n Prometheus
n Grafana
n External DNS

Considerations for Upgrading Extensions from v1.2.x to v1.3.x

This section lists some changes in Tanzu Kubernetes Grid v1.3.x that impact the upgrade of extensions from v1.2.x to v1.3.x.

Dex and Gangway Extensions Upgrade

Tanzu Kubernetes Grid v1.3.x allows you to manage authentication with Pinniped instead of using Dex and Gangway.

Instead of upgrading the Dex and Gangway extensions, you should migrate your clusters to use Dex and Pinniped. After you migrate, you use the add-on manager to perform upgrades of Dex and Pinniped. For instructions on how to migrate your clusters, see Register Core Add-ons.

Registry Update

Tanzu Kubernetes Grid v1.3.x switches the registry from registry.tkg.vmware.run to projects.registry.vmware.com/tkg.

To implement this registry change, you must apply the cert-manager included in Tanzu Kubernetes Grid v1.3.x on each cluster where you are upgrading the Contour, Prometheus, or Grafana extensions.

Extension Manager Removal

Tanzu Kubernetes Grid v1.3.1 removes the Tanzu Mission Control extension manager from the extensions bundle. Extensions are no longer wrapped inside the extension resource. Instead of deploying extensions with the Tanzu Mission Control extension manager, you deploy them by using the Kapp controller. As part of the upgrade, you must remove the extension resource for each extension being upgraded on the cluster.

Prerequisites

This procedure assumes that you are upgrading to Tanzu Kubernetes Grid v1.3.1 from either v1.2.x or v1.3.0.


To upgrade Tanzu Kubernetes Grid extensions to v1.3.1, confirm that:
n You previously deployed one or more of the extensions on clusters running Tanzu Kubernetes Grid v1.2.x or v1.3.0.
n You have upgraded the management clusters to Tanzu Kubernetes Grid v1.3.1 or later.
n You have upgraded the clusters on which the extensions are running to Tanzu Kubernetes Grid v1.3.1 or later.
n You have installed the Carvel tools. For information about installing the Carvel tools, see Install the Carvel Tools.
n You have downloaded and unpacked the bundle of Tanzu Kubernetes Grid extensions for v1.3.1 to the tkg-extensions-v1.3.1+vmware.1/extensions folder. For information about where to obtain the bundle, see Download and Unpack the Tanzu Kubernetes Grid Extensions Bundle.
n You have read the Tanzu Kubernetes Grid 1.3.1 Release Notes for updates related to security patches.

Upgrade the Contour Extension

Follow these steps to upgrade the Contour extension.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Contour.

2 Delete the Contour extension resource.

kubectl delete extension contour -n tanzu-system-ingress

3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n v1.2.1 tkg-extensions-v1.2.1+vmware.1/extensions

n v1.2.0 tkg-extensions-v1.2.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Change directories to the extensions subfolder.

cd extensions

6 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml


7 Update cert-manager to switch to the new registry.

kubectl apply -f ../cert-manager/

8 Change directories to ingress/contour.

9 Obtain the current contour-data-values.yaml and secret used by the current Contour Extension.

kubectl get secret contour-data-values -n tanzu-system-ingress -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > current-contour-data-values.yaml

10 Copy the contour-data-values.yaml.example file for your provider and name it contour-data-values.yaml.

vSphere:

cp vsphere/contour-data-values.yaml.example vsphere/contour-data-values.yaml

Amazon EC2:

cp aws/contour-data-values.yaml.example aws/contour-data-values.yaml

Azure:

cp azure/contour-data-values.yaml.example azure/contour-data-values.yaml

11 Manually copy over any customizations in current-contour-data-values.yaml into the contour-data-values.yaml file for your provider. For example, you may have customized NodePort.

Copy the values over without changing them; if you change this configuration data, the upgrade will fail.
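To help identify which values you customized, you can optionally compare the saved file with the new example before editing, for instance on vSphere (the same approach applies to the other providers and extensions):

diff current-contour-data-values.yaml vsphere/contour-data-values.yaml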

12 Update the contour-data-values secret with the new contour-data-values.yaml.

vSphere:

kubectl create secret generic contour-data-values --from-file=values.yaml=vsphere/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

Amazon EC2:

kubectl create secret generic contour-data-values --from-file=values.yaml=aws/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

Azure:

kubectl create secret generic contour-data-values --from-file=values.yaml=azure/contour-data-values.yaml -n tanzu-system-ingress -o yaml --dry-run | kubectl replace -f-

13 Deploy the new Contour extension.

kubectl apply -f contour-extension.yaml


You should see the confirmation extension.clusters.tmc.cloud.vmware.com/contour created.

14 View the status of the Contour service.

kubectl get app contour -n tanzu-system-ingress

The status of the Contour app should show Reconcile Succeeded when Contour has deployed successfully.

NAME      DESCRIPTION           SINCE-DEPLOY   AGE
contour   Reconcile succeeded   15s            72s

15 Check that the ingress and HTTP proxy resources are valid and that the ingress traffic is working.

kubectl get ingress -A
kubectl get httpproxy -A

Upgrade the Harbor Extension

Follow these steps to upgrade the Harbor extension.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Harbor.

2 Delete the Harbor extension resource.

kubectl delete extension harbor -n tanzu-system-registry

3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n v1.2.1 tkg-extensions-v1.2.1+vmware.1/extensions

n v1.2.0 tkg-extensions-v1.2.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Switch kubectl config context to the shared services cluster.

6 If you have not done so already, upgrade the Contour Extension on the shared services cluster as described in the previous procedure, Upgrade the Contour Extension. Contour is a dependency for Harbor.


7 Change directories to the extensions subfolder.

cd extensions

8 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml

9 Change directories to registry/harbor.

10 Obtain the harbor-data-values.yaml and secret used by the current Harbor Extension.

kubectl get secret harbor-data-values -n tanzu-system-registry -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > current-harbor-data-values.yaml

11 Copy harbor-data-values.yaml.example to harbor-data-values.yaml.

cp harbor-data-values.yaml.example harbor-data-values.yaml

12 Manually copy all custom configuration data in current-harbor-data-values.yaml into the harbor-data-values.yaml file. You may need to copy in individual values such as hostname, harborAdminPassword, secretKey, core.secret, persistence, and so on.

Copy the values over without changing them; if you change this configuration data, the upgrade will fail.

13 Update the harbor-data-values secret with the new harbor-data-values.yaml.

kubectl create secret generic harbor-data-values --from-file=values.yaml=harbor-data-values.yaml -n tanzu-system-registry -o yaml --dry-run | kubectl replace -f-

14 Deploy the new Harbor Extension.

kubectl apply -f harbor-extension.yaml

15 Retrieve the status of Harbor Extension.

kubectl get app harbor -n tanzu-system-registry

The Harbor App status should change to Reconcile succeeded after the Harbor Extension is deployed successfully.

View detailed status:

kubectl get app harbor -n tanzu-system-registry -o yaml

16 In a web browser, navigate to the current Harbor portal URL.

17 Ensure that you can sign in with the current username and password and that all your projects and images are present.

After the upgrade completes, all commands such as docker login, docker pull, docker push should be fully functional since the Harbor CA certificate does not change during the upgrade.
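For example, to spot-check the registry after the upgrade, you might log in and pull an image. The FQDN and image path below are hypothetical; use your own Harbor address and an image that exists in one of your projects.

docker login harbor.yourdomain.com
docker pull harbor.yourdomain.com/library/nginx:latest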


Upgrade the Fluent Bit Extension

Follow these steps to upgrade the Fluent Bit Extension.

Note The Fluent Bit extension in Tanzu Kubernetes Grid v1.3.x adds support for Syslog. To install and configure Syslog in Fluent Bit, see Prepare the Fluent Bit Configuration File for a Syslog Output Plugin.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Fluent Bit.

2 Delete the Fluent Bit extension resource.

kubectl delete extension fluent-bit -n tanzu-system-logging

3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n v1.2.1 tkg-extensions-v1.2.1+vmware.1/extensions

n v1.2.0 tkg-extensions-v1.2.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Change directories to the extensions subfolder.

cd extensions

6 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml

7 Set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Fluent Bit.

8 Change directories to logging/fluent-bit.

9 Obtain the current fluent-bit-data-values.yaml files and secrets used by the current Fluent Bit Extension.

Elastic Search

kubectl get secret fluent-bit-data-values -n tanzu-system-logging -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > elasticsearch/current-fluent-bit-data-values.yaml


Kafka

kubectl get secret fluent-bit-data-values -n tanzu-system-logging -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > kafka/current-fluent-bit-data-values.yaml

Splunk

kubectl get secret fluent-bit-data-values -n tanzu-system-logging -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > splunk/current-fluent-bit-data-values.yaml

HTTP

kubectl get secret fluent-bit-data-values -n tanzu-system-logging -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > http/current-fluent-bit-data-values.yaml

syslog (Only applicable if you are upgrading from v1.3.0 or later)

kubectl get secret fluent-bit-data-values -n tanzu-system-logging -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > syslog/current-fluent-bit-data-values.yaml

10 Copy the fluent-bit-data-values.yaml.example file for each backend logging component and name it fluent-bit-data-values.yaml.

Elastic Search

cp elasticsearch/fluent-bit-data-values.yaml.example elasticsearch/fluent-bit-data-values.yaml

Kafka

cp kafka/fluent-bit-data-values.yaml.example kafka/fluent-bit-data-values.yaml

Splunk

cp splunk/fluent-bit-data-values.yaml.example splunk/fluent-bit-data-values.yaml

HTTP

cp http/fluent-bit-data-values.yaml.example http/fluent-bit-data-values.yaml

syslog (Only applicable if you are upgrading from v1.3.0 or later)

cp syslog/fluent-bit-data-values.yaml.example syslog/fluent-bit-data-values.yaml

11 For each backend component, manually copy over any customizations in your current-fluent-bit-data-values.yaml files into the fluent-bit-data-values.yaml file for the component.

12 Update the fluent-bit-data-values secrets with the new fluent-bit-data-values.yaml file.

ElasticSearch:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=elasticsearch/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-


Kafka:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=kafka/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

Splunk:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=splunk/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

HTTP:

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=http/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

Syslog (only applicable if you are upgrading from v1.3.0 or later):

kubectl create secret generic fluent-bit-data-values --from-file=values.yaml=syslog/fluent-bit-data-values.yaml -n tanzu-system-logging -o yaml --dry-run | kubectl replace -f-

13 Deploy the Fluent Bit extension.

kubectl apply -f fluent-bit-extension.yaml

You should see the confirmation extension.clusters.tmc.cloud.vmware.com/fluent-bit created.

14 View the status of the Fluent Bit service.

kubectl get app fluent-bit -n tanzu-system-logging

The status of the Fluent Bit app should show Reconcile Succeeded when Fluent Bit has deployed successfully.

NAME         DESCRIPTION           SINCE-DEPLOY   AGE
fluent-bit   Reconcile succeeded   54s            14m

15 Check that the Fluent Bit daemon set is running and that log collection and forwarding is functioning.

kubectl get ds -n tanzu-system-logging
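Optionally, you can also tail the logs of one of the daemon set's pods to confirm that records are flowing to your output plugin. This assumes the daemon set is named fluent-bit, as shown in the output of the command above.

kubectl logs ds/fluent-bit -n tanzu-system-logging --tail=20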

Upgrade the Prometheus Extension

Follow these steps to upgrade the Prometheus Extension.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Prometheus.

2 Delete the Prometheus extension resource.

kubectl delete extension prometheus -n tanzu-system-monitoring


3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n v1.2.1 tkg-extensions-v1.2.1+vmware.1/extensions

n v1.2.0 tkg-extensions-v1.2.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Change directories to the extensions subfolder.

cd extensions

6 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml

7 Update cert-manager to switch to the new registry.

kubectl apply -f ../cert-manager/

8 Change directories to monitoring/prometheus.

9 Obtain the current prometheus-data-values.yaml files and secrets used by the current Prometheus Extension.

kubectl get secret prometheus-data-values -n tanzu-system-monitoring -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > current-prometheus-data-values.yaml

10 Copy the prometheus-data-values.yaml.example file to prometheus-data-values.yaml.

cp prometheus-data-values.yaml.example prometheus-data-values.yaml

11 Manually copy over any customizations in your current-prometheus-data-values.yaml files into the prometheus-data-values.yaml file.

12 Update the prometheus-data-values secret with the new prometheus-data-values.yaml.

kubectl create secret generic prometheus-data-values --from-file=values.yaml=prometheus-data-values.yaml -n tanzu-system-monitoring -o yaml --dry-run | kubectl replace -f-

13 Deploy the Prometheus extension.

kubectl apply -f prometheus-extension.yaml


You should see a confirmation that extensions.clusters.tmc.cloud.vmware.com/prometheus has been created.

14 The extension takes several minutes to deploy. To check the status of the deployment, use the kubectl get app command.

kubectl get app prometheus -n tanzu-system-monitoring

While the extension is being deployed, the "Description" field from the kubectl get app command shows a status of Reconciling. After Prometheus is deployed successfully, the status of the Prometheus app shown by the kubectl get app command changes to Reconcile succeeded.

You can view detailed status information with this command:

kubectl get app prometheus -n tanzu-system-monitoring -o yaml

Upgrade the Grafana Extension

Follow these steps to upgrade the Grafana extension.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade Grafana.

2 Delete the Grafana extension resource.

kubectl delete extension grafana -n tanzu-system-monitoring

3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n v1.2.1 tkg-extensions-v1.2.1+vmware.1/extensions

n v1.2.0 tkg-extensions-v1.2.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Change directories to the extensions subfolder.

cd extensions

6 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml


7 Update cert-manager to switch to the new registry.

kubectl apply -f ../cert-manager/

8 If you have not done so already, upgrade the Contour Extension as described in the previous procedure, Upgrade the Contour Extension. Contour is a dependency for Grafana.

9 Change directories to monitoring/grafana.

10 Obtain the current grafana-data-values.yaml files and secrets used by the current Grafana Extension.

kubectl get secret grafana-data-values -n tanzu-system-monitoring -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > current-grafana-data-values.yaml

11 Copy the grafana-data-values.yaml.example file to grafana-data-values.yaml.

cp grafana-data-values.yaml.example grafana-data-values.yaml

12 Manually copy over any customizations in your current-grafana-data-values.yaml files into the grafana-data-values.yaml file.

13 Update the grafana-data-values secret with the new grafana-data-values.yaml file.

kubectl create secret generic grafana-data-values --from-file=values.yaml=grafana-data-values.yaml -n tanzu-system-monitoring -o yaml --dry-run | kubectl replace -f-

14 Deploy the new Grafana extension.

kubectl apply -f grafana-extension.yaml

You should see a confirmation that extensions.clusters.tmc.cloud.vmware.com/grafana was created.

15 The extension takes several minutes to deploy. To check the status of the deployment, use the kubectl get app -n tanzu-system-monitoring command:

kubectl get app grafana -n tanzu-system-monitoring

While the extension is being deployed, the "Description" field from the kubectl get app command shows a status of Reconciling. After Grafana is deployed successfully, the status of the Grafana app as shown by the kubectl get app command changes to Reconcile succeeded.

You can view detailed status information with this command:

kubectl get app grafana -n tanzu-system-monitoring -o yaml

Upgrade the External DNS Extension

External DNS is available as an extension starting in Tanzu Kubernetes Grid v1.3.0. This upgrade procedure applies only if you already installed External DNS in v1.3.0 and are upgrading the extension to v1.3.1 or later.


If you want to install the External DNS extension, see Implementing Service Discovery with External DNS.

Follow these steps to upgrade the External DNS extension.

1 In a terminal, set the context of kubectl to the Tanzu Kubernetes cluster where you want to upgrade External DNS.

2 Delete the External DNS extension resource.

kubectl delete extension external-dns -n tanzu-system-service-discovery

3 Delete the extension manager. If the cluster is attached in Tanzu Mission Control, skip this entire step.

n Change directory to the extension bundle for the current version of the extension.

n v1.3.0 tkg-extensions-v1.3.0+vmware.1/extensions

n Remove the extension manager.

kubectl delete -f tmc-extension-manager.yaml

4 Navigate to the tkg-extensions-v1.3.1+vmware.1 folder where you downloaded the new extensions bundle.

5 Switch kubectl config context to the shared services cluster.

6 If you have not done so already, upgrade the Contour and Harbor extensions on the shared services cluster as described in the previous procedures, Upgrade the Contour Extension and Upgrade the Harbor Extension.

7 Change directories to the extensions subfolder.

cd extensions

8 Update the Kapp controller.

kubectl apply -f kapp-controller.yaml

9 Obtain the current external-dns-data-values.yaml files and secrets used by the current External DNS extension.

kubectl get secret external-dns-data-values -n tanzu-system-service-discovery -o 'go-template={{ index .data "values.yaml" }}' | base64 -d > current-external-dns-data-values.yaml

10 Copy the external-dns-data-values-PROVIDER.yaml.example file, where PROVIDER matches your DNS service provider, to external-dns-data-values.yaml. For example:

cp external-dns-data-values-azure.yaml.example external-dns-data-values.yaml

11 Manually copy over any customizations in your current-external-dns-data-values.yaml file into the external-dns-data-values.yaml file.


12 Update the external-dns-data-values secret with the new external-dns-data-values.yaml file.

kubectl create secret generic external-dns-data-values --from-file=values.yaml=external-dns-data-values.yaml -n tanzu-system-service-discovery -o yaml --dry-run | kubectl replace -f-

13 Deploy the new External DNS extension.

kubectl apply -f external-dns-extension.yaml

14 The extension takes several minutes to deploy. To check the status of the deployment, use the kubectl get app -n tanzu-system-service-discovery command:

kubectl get app external-dns -n tanzu-system-service-discovery

While the extension is being deployed, the "Description" field from the kubectl get app command shows a status of Reconciling. After External DNS is deployed successfully, the status of the External DNS app as shown by the kubectl get app command changes to Reconcile succeeded.

You can view detailed status information with this command:

kubectl get app external-dns -n tanzu-system-service-discovery -o yaml

Register Core Add-ons

This topic describes how to register the CNI, vSphere CPI, vSphere CSI, Pinniped, and Metrics Server add-ons with tanzu-addons-manager, the component that manages the lifecycle of add-ons. Skip this step if:
n Your management and Tanzu Kubernetes (workload) clusters were created using Tanzu Kubernetes Grid v1.3.0 or later.
n You already registered the add-ons by following the instructions in this topic.

About Add-on Lifecycle Management in Tanzu Kubernetes Grid

When you create a management or a workload cluster using Tanzu Kubernetes Grid v1.3.0 or later, it automatically installs the following core add-ons in the cluster:
n CNI: cni/calico or cni/antrea
n (vSphere only) vSphere CPI: cloud-provider/vsphere-cpi
n (vSphere only) vSphere CSI: csi/vsphere-csi
n Authentication: authentication/pinniped
n Metrics Server: metrics/metrics-server


Tanzu Kubernetes Grid manages the lifecycle of these add-ons. For example, it automatically upgrades the add-ons when you upgrade your management and workload clusters using the tanzu management-cluster upgrade and tanzu cluster upgrade commands. This ensures that your Tanzu Kubernetes Grid version and add-on versions are compatible.

Upgrades from Tanzu Kubernetes Grid v1.2.x to v1.3.x

Upgrading your management and workload clusters from Tanzu Kubernetes Grid v1.2.x to v1.3.x does not automatically upgrade the CNI, vSphere CPI, and vSphere CSI add-ons. To enable automatic lifecycle management for these add-ons, you must manually register them with the tanzu-addons-manager. The Pinniped and Metrics Server add-ons are new components in Tanzu Kubernetes Grid v1.3.0. You must enable them if you want to use identity management with Pinniped and Metrics Server in your upgraded clusters.

Important: The Dex and Gangway extensions are deprecated in Tanzu Kubernetes Grid v1.3.0 and will be removed in a future release. It is strongly recommended to migrate any existing clusters that implement the Dex and Gangway extensions to the integrated Pinniped authentication service.

Prerequisites

Before following the instructions in this topic, confirm that:
n Your management and workload clusters have been upgraded to Tanzu Kubernetes Grid v1.3.x.
n tanzu-addons-controller-manager and kapp-controller are running in your management cluster. You can verify this by running kubectl get pods -n tkg-system.
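For example, one way to check for both components at once is to filter the pod list in the tkg-system namespace:

kubectl get pods -n tkg-system | grep -E 'tanzu-addons-controller-manager|kapp-controller'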

Register the Core Add-ons

After you upgrade your management and workload clusters to Tanzu Kubernetes Grid v1.3.x, follow the instructions below to register the core add-ons with tanzu-addons-manager:
n To register the CNI, vSphere CPI, and vSphere CSI add-ons, see Register the CNI, vSphere CPI, and vSphere CSI Add-ons.
n To register the Pinniped and Metrics Server add-ons, see Enable the Pinniped and Metrics Server Add-ons.

Register the CNI, vSphere CPI, and vSphere CSI Add-ons

When registering the CNI, vSphere CPI, or vSphere CSI add-on with tanzu-addons-manager, register the add-on that is running in the management cluster first and then in each workload cluster.


To register the CNI, vSphere CPI, or vSphere CSI add-on that is running in a management or a workload cluster:

1 Create a configuration file for the cluster.

a Set the following variables. For example, for vSphere:

# This is the name of your target management or workload cluster.
CLUSTER_NAME=YOUR-CLUSTER-NAME
# For the management cluster, the namespace must be "tkg-system". For workload clusters, the default namespace is "default".
NAMESPACE=YOUR-CLUSTER-NAMESPACE
# CLUSTER_PLAN can be "dev", "prod", and so on.
CLUSTER_PLAN=YOUR-CLUSTER-PLAN

# If you are creating the configuration file for a management cluster, you can use the following
# commands to retrieve the values of CLUSTER_CIDR and SERVICE_CIDR from your management cluster.
CLUSTER_CIDR=$(kubectl get kubeadmconfig -n tkg-system -l cluster.x-k8s.io/cluster-name=$CLUSTER_NAME,cluster.x-k8s.io/control-plane= -o jsonpath='{.items[0].spec.clusterConfiguration.networking.podSubnet}')
SERVICE_CIDR=$(kubectl get kubeadmconfig -n tkg-system -l cluster.x-k8s.io/cluster-name=$CLUSTER_NAME,cluster.x-k8s.io/control-plane= -o jsonpath='{.items[0].spec.clusterConfiguration.networking.serviceSubnet}')

# If you are creating the configuration file for a workload cluster, you can use the following
# commands to retrieve the values of CLUSTER_CIDR and SERVICE_CIDR from your workload cluster.
CLUSTER_CIDR=$(kubectl get cluster "${CLUSTER_NAME}" -n "${NAMESPACE}" -o jsonpath='{.spec.clusterNetwork.pods.cidrBlocks[0]}')
SERVICE_CIDR=$(kubectl get cluster "${CLUSTER_NAME}" -n "${NAMESPACE}" -o jsonpath='{.spec.clusterNetwork.services.cidrBlocks[0]}')

# Set these variables if your cluster is running on vSphere. You can use the commands below to
# retrieve their values from the cluster. If your cluster is running on Amazon EC2 or Azure, set
# the AWS_ or AZURE_ variables that you configured when you deployed the cluster.
VSPHERE_SERVER=$(kubectl get VsphereCluster "${CLUSTER_NAME}" -n "${NAMESPACE}" -o jsonpath='{.spec.server}')
VSPHERE_DATACENTER=$(kubectl get VsphereMachineTemplate "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.template.spec.datacenter}')
VSPHERE_RESOURCE_POOL=$(kubectl get VsphereMachineTemplate "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.template.spec.resourcePool}')
VSPHERE_DATASTORE=$(kubectl get VsphereMachineTemplate "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.template.spec.datastore}')
VSPHERE_FOLDER=$(kubectl get VsphereMachineTemplate "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.template.spec.folder}')
VSPHERE_NETWORK=$(kubectl get VsphereMachineTemplate "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.template.spec.network.devices[0].networkName}')
VSPHERE_SSH_AUTHORIZED_KEY=$(kubectl get KubeadmControlPlane "${CLUSTER_NAME}-control-plane" -n "${NAMESPACE}" -o jsonpath='{.spec.kubeadmConfigSpec.users[0].sshAuthorizedKeys[0]}')
VSPHERE_TLS_THUMBPRINT=$(kubectl get VsphereCluster "${CLUSTER_NAME}" -n "${NAMESPACE}" -o jsonpath='{.spec.thumbprint}')
VSPHERE_INSECURE=TRUE-OR-FALSE
VSPHERE_USERNAME='YOUR-VSPHERE-USERNAME'
VSPHERE_PASSWORD='YOUR-VSPHERE-PASSWORD'
VSPHERE_CONTROL_PLANE_ENDPOINT=FQDN-OR-IP

b If your cluster uses Calico as the CNI provider, add or include the following line in the cluster configuration file:

CNI: calico

c Create the configuration file. You can use the echo command to write the variables that you set above and their values to the file. For example:

echo "CLUSTER_CIDR: ${CLUSTER_CIDR}" >> config.yaml

2 Set the _TKG_CLUSTER_FORCE_ROLE environment variable to management or workload. For example:

export _TKG_CLUSTER_FORCE_ROLE="management"

export _TKG_CLUSTER_FORCE_ROLE="workload"

On Windows, use the SET command.

3 Register the add-ons.

n CNI add-on:

1 Set the FILTER_BY_ADDON_TYPE and REMOVE_CRS_FOR_ADDON_TYPE environment variables to the values below:

export FILTER_BY_ADDON_TYPE="cni/antrea"

export REMOVE_CRS_FOR_ADDON_TYPE="cni/antrea"

If your cluster uses Calico, replace cni/antrea with cni/calico.

2 Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target management or workload cluster.

3 Review the manifest and then apply it to the management cluster.

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml


n vSphere CPI add-on:

1 Set the FILTER_BY_ADDON_TYPE and REMOVE_CRS_FOR_ADDON_TYPE environment variables to the values below.

export FILTER_BY_ADDON_TYPE="cloud-provider/vsphere-cpi"

export REMOVE_CRS_FOR_ADDON_TYPE="cloud-provider/vsphere-cpi"

2 Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target management or workload cluster.

3 Review the manifest and then apply it to the management cluster.

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml

n vSphere CSI add-on:

1 Set the FILTER_BY_ADDON_TYPE and REMOVE_CRS_FOR_ADDON_TYPE environment variables to the values below.

export FILTER_BY_ADDON_TYPE="csi/vsphere-csi"

export REMOVE_CRS_FOR_ADDON_TYPE="csi/vsphere-csi"

2 Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target management or workload cluster.

3 Review the manifest and then apply it to the management cluster.

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml

Enable the Pinniped and Metrics Server Add-ons

To enable the Pinniped and Metrics Server add-ons, follow the instructions below:
n Pinniped Add-on: Follow these instructions if you want to enable identity management with Pinniped in your upgraded clusters.
n Metrics Server Add-on: Follow these instructions if you want to enable Metrics Server in your upgraded clusters.


Pinniped Add-on

To enable identity management with Pinniped, you must enable the Pinniped add-on in your upgraded management and workload clusters. For information about identity management in Tanzu Kubernetes Grid v1.3.x, see Enabling Identity Management in Tanzu Kubernetes Grid.

Important: The Dex and Gangway extensions are deprecated in Tanzu Kubernetes Grid v1.3.0 and will be removed in a future release. It is strongly recommended to migrate any existing clusters that implement the Dex and Gangway extensions to the integrated Pinniped authentication service. If you implemented the Dex and Gangway extensions in Tanzu Kubernetes Grid v1.2.x, delete Dex and Gangway from both the management cluster and workload clusters before enabling the Pinniped add-on. Back up your Dex and Gangway configuration settings, such as ConfigMap.

To enable identity management with Pinniped:

1 Create a configuration file for your management cluster as described in step 1 in Register the CNI, vSphere CPI, and vSphere CSI Add-ons above.

2 Obtain your OIDC or LDAP identity provider details and add the following settings to the configuration file.

# Identity management type. This must be "oidc" or "ldap".

IDENTITY_MANAGEMENT_TYPE:

# Set these variables if you want to configure OIDC.

CERT_DURATION: 2160h
CERT_RENEW_BEFORE: 360h
OIDC_IDENTITY_PROVIDER_ISSUER_URL:
OIDC_IDENTITY_PROVIDER_CLIENT_ID:
OIDC_IDENTITY_PROVIDER_CLIENT_SECRET:
OIDC_IDENTITY_PROVIDER_SCOPES: "email,profile,groups"
OIDC_IDENTITY_PROVIDER_USERNAME_CLAIM:
OIDC_IDENTITY_PROVIDER_GROUPS_CLAIM:

# Set these variables if you want to configure LDAP.

LDAP_BIND_DN:
LDAP_BIND_PASSWORD:
LDAP_HOST:
LDAP_USER_SEARCH_BASE_DN:
LDAP_USER_SEARCH_FILTER:
LDAP_USER_SEARCH_USERNAME: userPrincipalName
LDAP_USER_SEARCH_ID_ATTRIBUTE: DN
LDAP_USER_SEARCH_EMAIL_ATTRIBUTE: DN
LDAP_USER_SEARCH_NAME_ATTRIBUTE:
LDAP_GROUP_SEARCH_BASE_DN:
LDAP_GROUP_SEARCH_FILTER:
LDAP_GROUP_SEARCH_USER_ATTRIBUTE: DN
LDAP_GROUP_SEARCH_GROUP_ATTRIBUTE:
LDAP_GROUP_SEARCH_NAME_ATTRIBUTE: cn
LDAP_ROOT_CA_DATA_B64:

For more information about these variables, see Variables for Configuring Identity Providers - OIDC and Variables for Configuring Identity Providers - LDAP.

3 Set the _TKG_CLUSTER_FORCE_ROLE environment variable to management.

export _TKG_CLUSTER_FORCE_ROLE="management"

On Windows, use the SET command.

4 Set the FILTER_BY_ADDON_TYPE environment variable to authentication/pinniped.

export FILTER_BY_ADDON_TYPE="authentication/pinniped"

5 Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target management cluster.

6 Review the manifest and then apply it to the cluster. For example:

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml

If you configured your management cluster to use OIDC authentication above, you must provide the callback URI for the management cluster to your OIDC identity provider. For more information, see Configure Identity Management After Management Cluster Deployment.

7 Enable the Pinniped add-on in each workload cluster that is managed by your management cluster. For each cluster, follow these steps:

a Create a configuration file for the cluster as described in step 1 in Register the CNI, vSphere CPI, and vSphere CSI Add-ons above.

b Add the following variables to the configuration file.

# This is the Pinniped supervisor service endpoint in the management cluster.
SUPERVISOR_ISSUER_URL:

# Pinniped uses this b64-encoded CA bundle data for communication between the management cluster and the workload cluster.
SUPERVISOR_ISSUER_CA_BUNDLE_DATA_B64:

You can retrieve these values by running kubectl get configmap pinniped-info -n kube-public -o yaml against the management cluster.
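As an optional shortcut, you can extract the two values directly with jsonpath. The issuer and issuer_ca_bundle_data key names are assumptions here; verify them against the ConfigMap output of the command above before relying on them.

kubectl get configmap pinniped-info -n kube-public -o jsonpath='{.data.issuer}'
kubectl get configmap pinniped-info -n kube-public -o jsonpath='{.data.issuer_ca_bundle_data}'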


c Set the _TKG_CLUSTER_FORCE_ROLE environment variable to workload.

export _TKG_CLUSTER_FORCE_ROLE="workload"

d Set the FILTER_BY_ADDON_TYPE environment variable to authentication/pinniped.

export FILTER_BY_ADDON_TYPE="authentication/pinniped"

e Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target workload cluster.

f Review the manifest and then apply it to the management cluster:

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml

For information about how to grant users access to workload clusters on which you have implemented identity management, see Authenticate Connections to a Workload Cluster.

Metrics Server Add-on

When enabling the Metrics Server add-on, enable the add-on in the management cluster first and then in each workload cluster. To enable the Metrics Server add-on in a management or a workload cluster:

1 Create a configuration file for the cluster as described in step 1 in Register the CNI, vSphere CPI, and vSphere CSI Add-ons above.

2 Set the _TKG_CLUSTER_FORCE_ROLE environment variable to management or workload. For example:

export _TKG_CLUSTER_FORCE_ROLE="management"

export _TKG_CLUSTER_FORCE_ROLE="workload"

On Windows, use the SET command.

3 Set the FILTER_BY_ADDON_TYPE environment variable to metrics/metrics-server.

export FILTER_BY_ADDON_TYPE="metrics/metrics-server"

4 Generate a manifest file for the add-on. For example:

tanzu cluster create ${CLUSTER_NAME} --dry-run -f config.yaml > ${CLUSTER_NAME}-addon-manifest.yaml

Where CLUSTER_NAME is the name of your target management or workload cluster.


5 Review the manifest and then apply it to the management cluster. For example:

kubectl apply -f ${CLUSTER_NAME}-addon-manifest.yaml

Select an OS During Cluster Upgrade

If your IaaS account has multiple base VM images with the same version of Kubernetes that you are upgrading to, your tanzu management-cluster upgrade or tanzu cluster upgrade command can specify which OS version to use.

You specify the OS version with the --os-arch, --os-name, or --os-version options to the upgrade command. Possible values and defaults for these options include:

n --os-name value depends on cloud infrastructure:

   n vSphere: ubuntu (default) or photon for Photon OS

   n Amazon EC2: ubuntu (default) or amazon for Amazon Linux

   n Azure: ubuntu

n --os-version value depends on os-name:

   n ubuntu values include: 20.04 (default), 18.04

   n photon values include: 3 (default)

   n amazon values include: 2 (default)

n --os-arch value: amd64 (default)

If you do not specify an --os-name when upgrading a cluster, its nodes retain their existing --os-name setting.
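For example, a hypothetical upgrade of a vSphere cluster named my-cluster onto Photon OS nodes, using the values listed above, might look like this:

tanzu cluster upgrade my-cluster --os-name photon --os-version 3 --os-arch amd64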

Upgrade vSphere Deployments in an Internet-Restricted Environment

If you deployed the previous version of Tanzu Kubernetes Grid in an Internet-restricted environment, do the following steps on a machine with an Internet connection.

1 Download and install the new version of the Tanzu CLI. See Download and Install the Tanzu CLI and Other Tools.

2 Perform the steps in Prepare to Upgrade Clusters on vSphere to deploy the new base OS image OVA files.

3 Perform the steps in Deploying Tanzu Kubernetes Grid in an Internet-Restricted Environment to run the gen-publish-images.sh and publish-images.sh scripts.

If you still have the publish-images.sh script from when you deployed the previous version of Tanzu Kubernetes Grid, you must regenerate it by running gen-publish-images.sh before you run publish-images.sh.


Running gen-publish-images.sh updates publish-images.sh so that it pulls the correct versions of the components for the new version of Tanzu Kubernetes Grid and pushes them into your local private Docker registry.

The gen-publish-images.sh script obtains the correct versions of the components from the YAML files that are created in the ~/.tanzu/tkg/bom folder when you first run a tanzu CLI command with a new version of Tanzu Kubernetes Grid.
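A minimal sketch of that sequence is shown below, assuming both scripts are in your current directory, gen-publish-images.sh writes the generated script to standard output, and TKG_CUSTOM_IMAGE_REPOSITORY points at your private registry (the value shown is a placeholder).

# Regenerate and run the publishing script against your private registry.
export TKG_CUSTOM_IMAGE_REPOSITORY="custom-image-repository.io/yourproject"
./gen-publish-images.sh > publish-images.sh
chmod +x publish-images.sh
./publish-images.sh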

Tanzu Kubernetes Grid Security and Networking

This topic lists resources for securing Tanzu Kubernetes Grid infrastructure.

This chapter includes the following topics:
n Ports and Protocols
n Tanzu Kubernetes Grid Firewall Rules
n CIS Benchmarking for Clusters

Ports and Protocols

Networking ports and protocols used by Tanzu Kubernetes Grid are listed in the VMware Ports and Protocols tool.

For each internal communication path, VMware Ports and Protocols lists:
n Product
n Version
n Source
n Destination
n Ports
n Protocols
n Purpose
n Service Description
n Classification (Outgoing, Incoming, or Bidirectional)

You can use this information to configure firewall rules.


Tanzu Kubernetes Grid Firewall Rules

Each rule below is listed in the format Name: Source to Destination, Service, followed by its purpose.

n workload-to-mgmt: tkg-workload-cluster-network to tkg-management-cluster-network, TCP:6443*. Allow workload cluster to register with management cluster.
n mgmt-to-workload: tkg-management-cluster-network to tkg-workload-cluster-network, TCP:6443* and 5556. Allow management network to configure workload cluster.
n allow-pinniped: tkg-workload-cluster-network to tkg-management-cluster-network, TCP:31234. Allow Pinniped concierge on workload cluster to access Pinniped supervisor on management cluster, which may be running behind a NodePort or LoadBalancer service.
n allow-mgmt-subnet: tkg-management-cluster-network to tkg-management-cluster-network, all services. Allow all internal cluster communication.
n allow-wl-subnet: tkg-workload-cluster-network to tkg-workload-cluster-network, all services. Allow all internal cluster communication.
n jumpbox-to-k8s: Jumpbox IP to tkg-management-cluster-network and tkg-workload-cluster-network, TCP:6443*. Allow jumpbox to create management cluster and manage clusters.
n dhcp: any to NSX-T: any / no NSX-T: DHCP, DHCP. Allows hosts to get DHCP IP addresses.
n to-harbor: tkg-management-cluster-network, tkg-workload-cluster-network, and jumpbox IP to Harbor IP, HTTPS. Allows components to retrieve container images.
n to-vcenter: tkg-management-cluster-network, tkg-workload-cluster-network, and jumpbox IP to vCenter IP, HTTPS. Allows components to access vSphere to create VMs and storage volumes.
n dns-ntp-outbound: tkg-management-cluster-network, tkg-workload-cluster-network, and jumpbox IP to DNS and NTP servers, DNS and NTP. Core services.
n ssh-to-jumpbox: any to Jumpbox IP, SSH. Access from outside to the jumpbox.
n ako-to-avi (for NSX ALB**): tkg-management-cluster-network and tkg-workload-cluster-network to Avi Controller**, TCP:443. Allow Avi Kubernetes Operator (AKO) and AKO Operator (AKOO) access to Avi Controller.
n deny-all: any to any, all services. Deny by default.

*With NSX Advanced Load Balancer, you can override the port 6443 setting with the VSPHERE_CONTROL_PLANE_ENDPOINT_PORT cluster configuration variable.


**For additional firewall rules required for NSX Advanced Load Balancer, formerly known as Avi Vantage, see Protocol Ports Used by Avi Vantage for Management Communication in the Avi Networks documentation.

CIS Benchmarking for Clusters

To assess cluster security, you can run Center for Internet Security (CIS) Kubernetes benchmark tests on clusters deployed by Tanzu Kubernetes Grid.

For Tanzu Kubernetes Grid clusters that do not pass all sections of the tests, see explanations and possible remediations in the table below.

Expected Test Failures for CIS Benchmark Inspection on Kubernetes Clusters Provisioned by Tanzu Kubernetes Grid v1.3.1

n Section 1.1.12: Ensure that the etcd data directory ownership is set to etcd:etcd (Scored). Explanation: The data directory (/var/lib/etcd) is owned by root:root. To provision clusters, Tanzu Kubernetes Grid uses Cluster API which, in turn, uses the kubeadm tool to provision Kubernetes. kubeadm makes etcd run containerized as a static pod, therefore the directory does not need to be set to a particular user. kubeadm configures the directory to not be readable by non-root users.

n Section 1.2.6: Ensure that the --kubelet-certificate-authority argument is set as appropriate (Scored). Explanation: This flag is not set by default.

n Section 1.2.16: Ensure that the admission control plugin PodSecurityPolicy is set (Automated). Explanation: This flag is not set by default. Remediation: Using VMware Tanzu Mission Control, you can make the deployments to your clusters more secure by implementing constraints that govern what deployed pods can do. Security policies, implemented using OPA Gatekeeper, allow you to restrict certain aspects of pod execution in your clusters, such as privilege escalation, Linux capabilities, and allowed volume types. See Pod Security Management in the Tanzu Mission Control documentation. Alternatively, this admission controller flag can be enabled natively in Kubernetes. For more information, please refer to Pod Security Policies in the VMware Tanzu Developer Center.

n Section 4.2.6: Ensure that the --protect-kernel-defaults argument is set to true (Scored). Explanation: This flag is not set by default. The CIS is concerned that without kernel default protection set, a pod might be run in the cluster that is a mismatch for the security posture of the cluster as a whole. This is a low-likelihood occurrence.

n Section 1.2.21: Ensure that the --profiling argument is set to false (Automated). Explanation: This flag is not set by default.

n Section 1.3.2: Ensure that the --profiling argument is set to false (Automated). Explanation: This flag is not set by default.

n Section 1.4.1: Ensure that the --profiling argument is set to false (Automated). Explanation: This flag is not set by default.


For help remediating these items, engage with VMware Tanzu support.

For benchmarking clusters in Tanzu Mission Control, see the Expected Test Failures for CIS Benchmark Inspection on Provisioned Tanzu Kubernetes Clusters table under About the CIS Benchmark Inspection and Provisioned Tanzu Kubernetes Clusters, in the Tanzu Mission Control documentation.

Tanzu Kubernetes Grid Logs and Troubleshooting

The topics in this section provide information about how to find the Tanzu Kubernetes Grid logs, how to troubleshoot frequently encountered Tanzu Kubernetes Grid issues, and how to use the Crash Recovery and Diagnostics tool.

This chapter includes the following topics:
n Access the Tanzu Kubernetes Grid Logs
n Audit Logging
n Troubleshooting Tips for Tanzu Kubernetes Grid
n Troubleshooting Tanzu Kubernetes Clusters with Crash Diagnostics
n Use an Existing Bootstrap Cluster to Deploy Management Clusters

Access the Tanzu Kubernetes Grid Logs

Tanzu Kubernetes Grid retains logs for management cluster and Tanzu Kubernetes cluster deployment and operation.

Access Management Cluster Deployment Logs

To monitor and troubleshoot management cluster deployments, review:
n The log file listed in the terminal output line Logs of the command execution can also be found at...
n The log from your cloud provider module for Cluster API. Retrieve the most recent one as follows:

a Search your tanzu management-cluster create output for Bootstrapper created. Kubeconfig: and copy the kubeconfig file path listed. The file is in ~/.kube-tkg/tmp/.


b Run the following command, based on your cloud provider, where KUBECONFIG-PATH is the bootstrapper kubeconfig file path that you copied in the previous step:

n vSphere: kubectl logs deployment.apps/capv-controller-manager -n capv-system manager --kubeconfig KUBECONFIG-PATH

n Amazon EC2: kubectl logs deployment.apps/capa-controller-manager -n capa-system manager --kubeconfig KUBECONFIG-PATH

n Azure: kubectl logs deployment.apps/capz-controller-manager -n capz-system manager --kubeconfig KUBECONFIG-PATH

Monitor Tanzu Kubernetes Cluster Deployments in Cluster API Logs

After running tanzu cluster create, you can monitor the deployment process in the Cluster API logs on the management cluster.

To access these logs, follow the steps below:

1 Set kubeconfig to your management cluster. For example:

kubectl config use-context my-management-cluster-admin@my-management-cluster

2 Run the following:

n capi logs:

kubectl logs deployments/capi-controller-manager -n capi-system manager

n IaaS-specific logs:

n vSphere: kubectl logs deployments/capv-controller-manager -n capv-system manager

n Amazon EC2: kubectl logs deployments/capa-controller-manager -n capa-system manager

n Azure: kubectl logs deployments/capz-controller-manager -n capz-system manager

Audit Logging

This topic describes audit logging in Tanzu Kubernetes Grid.

Overview

In Tanzu Kubernetes Grid, you can access the following audit logs:
n Audit logs from the Kubernetes API server. See Kubernetes Audit Logs below.
n System audit logs for each node in a cluster, collected using auditd. See System Audit Logs for Nodes below.


Kubernetes Audit Logs

Kubernetes audit logs record requests to the Kubernetes API server. To enable Kubernetes auditing on a management or Tanzu Kubernetes cluster, set the ENABLE_AUDIT_LOGGING variable to true before you deploy the cluster.
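For example, before running tanzu cluster create you could add the variable to your cluster configuration file (the file name here is a placeholder):

echo "ENABLE_AUDIT_LOGGING: true" >> my-cluster-config.yaml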

To access these logs in Tanzu Kubernetes Grid, navigate to /var/log/kubernetes/audit.log on the control plane node. If you deploy Fluent Bit on the cluster, it will forward the logs to your log destination. For instructions, see Implementing Log Forwarding with Fluent Bit.

To view the audit policy and audit backend configuration, navigate to:
n /etc/kubernetes/audit-policy.yaml on the control plane node
n ~/.tanzu/tkg/providers/ytt/03_customizations/audit-logging/audit_logging.yaml on your machine

System Audit Logs for Nodes

When you deploy a management or Tanzu Kubernetes cluster, auditd is enabled on the cluster by default. You can access your system audit logs on each node in the cluster by navigating to /var/log/audit/audit.log.

If you deploy Fluent Bit on the cluster, it will forward these audit logs to your log destination. For instructions, see Implementing Log Forwarding with Fluent Bit.

Troubleshooting Tips for Tanzu Kubernetes Grid

This section includes tips to help you to troubleshoot common problems that you might encounter when installing Tanzu Kubernetes Grid and deploying Tanzu Kubernetes clusters.

Many of these procedures use the kind CLI on your bootstrap machine. To install kind, see Installation in the kind documentation.

Clean Up After an Unsuccessful Management Cluster Deployment

Problem

An unsuccessful attempt to deploy a Tanzu Kubernetes Grid management cluster leaves orphaned objects in your cloud infrastructure and on your bootstrap machine.

Solution

1 Monitor your tanzu management-cluster create command output either in the terminal or Tanzu Kubernetes Grid installer interface. If the command fails, it prints a help message that includes the following: "Failure while deploying management cluster... To clean up the resources created by the management cluster: tkg delete mc...."

2 Run tanzu management-cluster delete YOUR-CLUSTER-NAME. This command removes the objects that it created in your infrastructure and locally.


You can also use the alternative methods described below:

n Bootstrap machine cleanup:

n To remove a kind cluster, use the kind CLI. For example:

kind get clusters
kind delete cluster --name tkg-kind-example1234567abcdef

n To remove Docker objects, use the docker CLI. For example, docker rm, docker rmi, and docker system prune.

CAUTION: If you are running Docker processes that are not related to Tanzu Kubernetes Grid on your system, remove unneeded Docker objects individually.

n Infrastructure provider cleanup:

n vSphere: Locate, power off, and delete the VMs and other resources that were created by Tanzu Kubernetes Grid.

n AWS: Log in to your Amazon EC2 dashboard and delete the resources manually or use an automated solution.

n Azure: In Resource Groups, open your AZURE_RESOURCE_GROUP. Use checkboxes to select and Delete the resources that were created by Tanzu Kubernetes Grid, which contain a timestamp in their names.

Delete Users, Contexts, and Clusters with kubectl

To clean up your kubectl state by deleting some or all of its users, contexts, and clusters:

1 Open your ~/.kube/config and ~/.kube-tkg/config files.

2 For the user objects that you want to delete, run:

kubectl config unset users.USER-NAME
kubectl config unset users.USER-NAME --kubeconfig ~/.kube-tkg/config

Where USER-NAME is the name property of each top-level user object, as listed in the config files.

3 For the context objects that you want to delete, run:

kubectl config unset contexts.CONTEXT-NAME
kubectl config unset contexts.CONTEXT-NAME --kubeconfig ~/.kube-tkg/config

Where CONTEXT-NAME is the name property of each top-level context object, as listed in the config files, typically of the form contexts.mycontext-admin@mycontext.

4 For the cluster objects that you want to delete, run:

kubectl config unset clusters.CLUSTER-NAME
kubectl config unset clusters.CLUSTER-NAME --kubeconfig ~/.kube-tkg/config


Where CLUSTER-NAME is the name property of each top-level cluster object, as listed in the config files.

5 If the config files list the current context as a cluster that you deleted, unset the context:

kubectl config unset current-context
kubectl config unset current-context --kubeconfig ~/.kube-tkg/config

6 If you deleted management clusters that are tracked by the tanzu CLI, delete them from the tanzu CLI's state by running tanzu config server delete as described in Delete Management Clusters from Your Tanzu CLI Configuration.

   - To see the management clusters that the tanzu CLI tracks, run tanzu login.
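For example, to apply steps 2 through 4 to a single deleted cluster named my-cluster in ~/.kube/config, where the user, context, and cluster names are examples that follow the typical CLUSTER-NAME-admin@CLUSTER-NAME pattern:

kubectl config unset users.my-cluster-admin
kubectl config unset contexts.my-cluster-admin@my-cluster
kubectl config unset clusters.my-cluster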

Kind Cluster Remains after Deleting Management Cluster

Problem

Running tanzu management-cluster delete removes the management cluster, but fails to delete the local kind cluster from the bootstrap machine.

Solution

1 List all running kind clusters and remove the one that looks like tkg-kind-unique_ID:

kind delete cluster --name tkg-kind-unique_ID

2 List all running containers and identify the container for the kind cluster.

docker ps -a

3 Copy the container ID of the kind cluster and remove it.

docker kill container_ID

Failed Validation, Credentials Error on Amazon EC2

Problem

Running tanzu management-cluster create fails with an error similar to the following:

Validating the pre-requisites...
Looking for AWS credentials in the default credentials provider chain

Error: : Tkg configuration validation failed: failed to get AWS client: NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: AWS_ACCESS_KEY_ID or AWS_ACCESS_KEY not found in environment
SharedCredsLoad: failed to load shared credentials file
caused by: FailedRead: unable to open file
caused by: open /root/.aws/credentials: no such file or directory
EC2RoleRequestError: no EC2 instance role found
caused by: EC2MetadataError: failed to make EC2Metadata request


Solution

Tanzu Kubernetes Grid uses the default AWS credentials provider chain. Before creating a management or a workload cluster on Amazon EC2, you must configure your AWS account credentials as described in Configure AWS Credentials.
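For example, one way to provide credentials to the default provider chain is to export them as environment variables in the shell in which you run tanzu commands; a minimal sketch in which the values and region are placeholders:

export AWS_ACCESS_KEY_ID=YOUR-ACCESS-KEY-ID
export AWS_SECRET_ACCESS_KEY=YOUR-SECRET-ACCESS-KEY
export AWS_REGION=us-west-2
# Alternatively, store the same values in the shared credentials file, ~/.aws/credentials.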

Failed Validation, Legal Terms Error on Azure

Before creating a management or workload cluster on Azure, you must accept the legal terms that cover the VM image used by cluster nodes. Running tanzu management-cluster create or tanzu cluster create without having accepted the license fails with an error like:

User failed validation to purchase resources. Error message: 'You have not accepted the legal terms on this subscription: '*********' for this plan. Before the subscription can be used, you need to accept the legal terms of the image.

If this happens, accept the legal terms and try again:

- Management cluster: See Accept the Base Image License.
- Workload cluster: See the Azure instructions in Deploy a Cluster with a Non-Default Kubernetes Version.
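One way to accept the terms is with the Azure CLI; a minimal sketch in which the publisher, offer, and plan values are assumptions, so use the values for your Tanzu Kubernetes Grid base image as described in Accept the Base Image License:

az login
az vm image terms accept --publisher vmware-inc --offer tkg-capi --plan k8s-1dot20dot5-ubuntu-2004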

Deploying a Tanzu Kubernetes Cluster Times Out, but the Cluster Is Created

Problem

Running tanzu cluster create fails with a timeout error similar to the following:

I0317 11:11:16.658433 clusterclient.go:341] Waiting for resource my-cluster of type *v1alpha3.Cluster to be up and running
E0317 11:26:16.932833 common.go:29] Error: unable to wait for cluster and get the cluster kubeconfig: error waiting for cluster to be provisioned (this may take a few minutes): cluster control plane is still being initialized
E0317 11:26:16.933251 common.go:33]

However, if you run tanzu cluster list, the cluster appears to have been created.

  NAME        STATUS
  my-cluster  Provisioned


Solution

1 Use the tanzu cluster kubeconfig get CLUSTER-NAME --admin command to add the cluster credentials to your kubeconfig.

tanzu cluster kubeconfig get my-cluster --admin

2 Set kubectl to the cluster's context.

kubectl config use-context my-cluster-admin@my-cluster

3 Check whether the cluster nodes are all in the ready state.

kubectl get nodes

4 Check whether all of the pods are up and running.

kubectl get pods -A

5 If all of the nodes and pods are running correctly, your Tanzu Kubernetes cluster has been created successfully and you can ignore the error.

6 If the nodes and pods are not running correctly, attempt to delete the cluster.

tanzu cluster delete my-cluster

7 If tanzu cluster delete fails, use kubectl to delete the cluster manually, as described in Delete Users, Contexts, and Clusters with kubectl.

Pods Are Stuck in Pending on Cluster Due to vCenter Connectivity

Problem

When you run kubectl get pods -A on the created cluster, some pods remain in pending.

When you run kubectl describe pod -n pod-namespace pod-name on an affected pod and review its events, you see the following event:

node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate

Solution

Ensure that connectivity and firewall rules are in place to allow communication between the cluster and vCenter. For firewall port and protocol requirements, see Ports and Protocols.

Tanzu Kubernetes Grid UI Does Not Display Correctly on Windows

Problem


When you run the tanzu management-cluster create --ui command on a Windows system, the UI opens in your default browser, but the graphics and styling are not applied. This happens because the Windows registry Content Type value for the .css file extension is set to application/x-css instead of text/css.

Solution

1 In Windows search, enter regedit to open the Registry Editor utility.

2 Expand HKEY_CLASSES_ROOT and select .css.

3 Right-click Content Type and select Modify.

4 Set the Value to text/css and click OK.

5 Run the tanzu management-cluster create --ui command again to relaunch the UI.
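Alternatively, you can make the same change from an elevated Command Prompt instead of the Registry Editor; this sketch uses the standard reg utility to perform the same edit as the steps above:

reg query HKEY_CLASSES_ROOT\.css /v "Content Type"
reg add HKEY_CLASSES_ROOT\.css /v "Content Type" /t REG_SZ /d text/css /f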

Running tanzu management-cluster create on macOS Results in kubectl Version Error

Problem

If you run the tanzu management-cluster create command on macOS with the latest stable version of Docker Desktop, tanzu management-cluster create fails with the error message:

Error: : kubectl prerequisites validation failed: kubectl client version v1.15.5 is less than minimum supported kubectl client version 1.17.0

This happens because Docker Desktop symlinks kubectl 1.15 into the path.

Solution

Place a newer supported version of kubectl in the PATH ahead of the version that Docker Desktop provides.
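A minimal sketch of one way to do this, assuming that you have downloaded a supported kubectl binary to the current directory and that /usr/local/bin is where Docker Desktop placed its kubectl symlink:

which -a kubectl                          # shows every kubectl on the PATH, in order
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl  # replaces the symlinked kubectl 1.15
kubectl version --client                  # confirm the client version is 1.17.0 or later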

Connect to Cluster Nodes with SSH

You can use SSH to connect to individual nodes of management clusters or Tanzu Kubernetes clusters. To do so, the SSH key pair that you created when you deployed the management cluster must be available on the machine on which you run the SSH command. Consequently, you must run ssh commands on the machine on which you run tanzu commands.

The SSH keys that you register with the management cluster, and consequently that are used by any Tanzu Kubernetes clusters that you deploy from the management cluster, are associated with the following user accounts:

- vSphere management cluster and Tanzu Kubernetes nodes running on both Photon OS and Ubuntu: capv
- Amazon EC2 bastion nodes: ubuntu
- Amazon EC2 management cluster and Tanzu Kubernetes nodes running on Ubuntu: ubuntu
- Amazon EC2 management cluster and Tanzu Kubernetes nodes running on Amazon Linux: ec2-user
- Azure management cluster and Tanzu Kubernetes nodes (always Ubuntu): capi


To connect to a node by using SSH, run one of the following commands from the machine that you use as the bootstrap machine:

- vSphere nodes: ssh capv@node_address
- Amazon EC2 bastion nodes and management cluster and workload nodes on Ubuntu: ssh ubuntu@node_address
- Amazon EC2 management cluster and Tanzu Kubernetes nodes running on Amazon Linux: ssh ec2-user@node_address
- Azure nodes: ssh capi@node_address

Because the SSH key is present on the system on which you are running the ssh command, no password is required.

Recover Management Cluster Credentials

If you have lost the credentials for a management cluster, for example by inadvertently deleting the .kube-tkg/config file on the system on which you run tanzu commands, you can recover the credentials from the management cluster control plane node.

1 Run tanzu management-cluster create to recreate the .kube-tkg/config file.

2 Obtain the public IP address of the management cluster control plane node, from vSphere, Amazon EC2, or Azure.

3 Use SSH to log in to the management cluster control plane node.

See Connect to Cluster Nodes with SSH above for the credentials to use for each infrastructure provider.

4 Access the admin.conf file for the management cluster.

sudo vi /etc/kubernetes/admin.conf

The admin.conf file contains the cluster name, the cluster user name, the cluster context, and the client certificate data.

5 Copy the cluster name, the cluster user name, the cluster context, and the client certificate data into the .kube-tkg/config file on the system on which you run tanzu commands.

Restore ~/.tanzu Directory

Problem

The ~/.tanzu directory on the bootstrap machine has been accidentally deleted or corrupted. The tanzu CLI creates and uses this directory, and cannot function without it.

Solution


To restore the contents of the ~/.tanzu directory:

1 To identify existing Tanzu Kubernetes Grid management clusters, run:

kubectl --kubeconfig ~/.kube-tkg/config config get-contexts

The command output lists names and contexts of all management clusters created or added by the v1.2 tkg or v1.3 tanzu CLI.

2 For each management cluster listed in the output, restore it to the ~/.tanzu directory and CLI by running:

tanzu login --kubeconfig ~/.kube-tkg/config --context MGMT-CLUSTER-CONTEXT --name MGMT-CLUSTER
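For example, if the previous command lists a context named my-mgmt-cluster-admin@my-mgmt-cluster (the management cluster and context names are examples):

tanzu login --kubeconfig ~/.kube-tkg/config --context my-mgmt-cluster-admin@my-mgmt-cluster --name my-mgmt-cluster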

Disable nfs-utils on Photon OS Nodes

Problem

In Tanzu Kubernetes Grid v1.1.2 and later, nfs-utils is enabled by default. If you do not require nfs-utils, you can remove it from cluster node VMs.

Solution

To disable nfs-utils on clusters that you deploy with Tanzu Kubernetes Grid v1.1.2 or later, use SSH to log in to the cluster node VMs and run the following command:

tdnf erase nfs-utils

For information about using nfs-utils on clusters deployed with Tanzu Kubernetes Grid v1.0 or 1.1.0, see Enable or Disable nfs-utils on Photon OS Nodes in the VMware Tanzu Kubernetes Grid 1.1.x Documentation.

Requests to NSX Advanced Load Balancer VIP fail with the message no route to host

Problem

If the total number of LoadBalancer type Services is large, and if all of the Service Engines are deployed in the same L2 network, requests to the NSX Advanced Load Balancer VIP can fail with the message no route to host.

This occurs because the default ARP rate limit on Service Engines is 100.

Solution

Set the ARP rate limit to a larger number. This parameter is not tunable in NSX Advanced Load Balancer Essentials, but it is tunable in NSX Advanced Load Balancer Enterprise Edition.


Troubleshooting Tanzu Kubernetes Clusters with Crash Diagnostics

Crash Diagnostics (Crashd) is an open source project that makes it easy to diagnose problems with unstable or even unresponsive Kubernetes clusters.

Crashd uses a script file written in Starlark, a Python-like language, that interacts with your Tanzu Kubernetes clusters to collect infrastructure and cluster information. Crashd takes the output from the commands run by the script and adds the output to a tar file. The tar file is then saved locally for further analysis.

Tanzu Kubernetes Grid includes signed binaries for Crashd and a diagnostics script file for Photon OS Tanzu Kubernetes clusters.

Install or Upgrade the Crashd Binary

To install or upgrade crashd, follow the instructions below.

1 Go to the Tanzu Kubernetes Grid downloads page, and log in with your My VMware credentials.

2 Download Crashd for your platform.

- Linux: crashd-linux-amd64-v0.3.2-vmware.1.tar.gz

- macOS: crashd-darwin-amd64-v0.3.2-vmware.1.tar.gz

3 Use the tar command to unpack the binary for your platform.

- Linux:

tar -xvf crashd-linux-amd64-v0.3.2-vmware.1.tar.gz

- macOS:

tar -xvf crashd-darwin-amd64-v0.3.2-vmware.1.tar.gz

4 The previous step creates a directory named crashd with the following files:

crashd
crashd/args
crashd/diagnostics.crsh
crashd/crashd-PLATFORM-amd64-v0.3.2+vmware.1

5 Move the binary into the /usr/local/bin folder.

- Linux:

mv ./crashd/crashd-linux-amd64-v0.3.2+vmware.1 /usr/local/bin/crashd

- macOS:

mv ./crashd/crashd-darwin-amd64-v0.3.2+vmware.1 /usr/local/bin/crashd


Run Crashd on Photon OS Tanzu Kubernetes Grid Clusters

Crashd for Tanzu Kubernetes Grid provides a script file, diagnostics.crsh, along with a script argument file, args. When Crashd runs, it takes the argument values from the args file and passes them to the script. The script runs commands to extract information that can help diagnose problems on Photon OS Tanzu Kubernetes Grid management clusters and Tanzu Kubernetes workload clusters that have been deployed on vSphere from Tanzu Kubernetes Grid.

Prerequisites

Prior to running the Crashd script diagnostics.crsh, your local machine must have the following programs on its execution path:

- kind (v0.7.0 or higher)
- kubectl
- scp
- ssh

Additionally, before you can run Crashd, you must follow these steps:

- Configure Crashd with an SSH private/public key pair.
- Ensure that your Tanzu Kubernetes Grid VMs are configured to use your SSH public key.
- Extract the kubeconfig file for the management cluster by using the tanzu cluster kubeconfig get command.
- For a simpler setup, ensure that the kubeconfig file, the public key file, the diagnostics.crsh file, and the args file are in the same location.

Configure Crashd

1 Navigate to the location where you downloaded and unpacked the Crashd bundle.

2 In a text editor, open the argument file args.

For example, use vi to edit the file.

vi args

The file contains a series of named key/value pairs that are passed to the script:

# Specifies cluster to target (supported: bootstrap, mgmt, or workload)
target=mgmt

# Underlying infrastructure used by TKG (supported: vsphere, aws)
infra=vsphere

# working directory
workdir=./workdir

# User and private key for ssh connections to cluster nodes
ssh_user=capv
ssh_pk_file=./capv.pem

# namespace where mgmt cluster is deployed
mgmt_cluster_ns=tkg-system

# kubeconfig file path for management cluster
mgmt_cluster_config=./tkg_cluster_config

# Uncomment the following to specify a comma separated
# list of workload cluster names
#workload_clusters=tkg-cluster-wc-498

# Uncomment the following to specify the namespace
# associated with the workload cluster names above
#workload_cluster_ns=default

3 Configure the collection of diagnostics from a bootstrap cluster.

If you are troubleshooting the initial setup of your cluster during bootstrap, update the following arguments in the file:

- target: Set this value to bootstrap.

- workdir: The location where files are collected.

4 Configure the collection of diagnostics from a management cluster.

When diagnosing a management cluster failure, update the following arguments in the args file:

- target: Set this value to mgmt.

- workdir: The location where files are collected.

- ssh_user: The SSH user used to access cluster machines. For clusters running on vSphere, the user name is capv.

- ssh_pk_file: The path to your SSH private key file. For information about creating the SSH key pairs, see Create an SSH Key Pair in Deploy a Management Cluster to vSphere.

- mgmt_cluster_ns: The namespace where the management cluster is deployed.

- mgmt_cluster_config: The path of the kubeconfig file for the management cluster.

5 Configure the collection of diagnostics from one or more workload clusters:

When collecting diagnostics information from workload clusters, you must specify the following arguments:

- target: Set this value to workload.

- workdir: The location where files are collected.

- mgmt_cluster_ns: The namespace where the management cluster is deployed.

- mgmt_cluster_config: The path of the kubeconfig file for the management cluster.


In addition to the previous arguments, you must uncomment the following workload cluster values:

- workload_clusters: A comma-separated list of workload cluster names from which to collect diagnostics information.

- workload_cluster_ns: The namespace where the workload clusters are deployed.
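For example, to collect diagnostics from two workload clusters in the default namespace, the relevant lines of the args file might look like the following; the second cluster name is an example added for illustration:

target=workload
workdir=./workdir
mgmt_cluster_ns=tkg-system
mgmt_cluster_config=./tkg_cluster_config
workload_clusters=tkg-cluster-wc-498,tkg-cluster-wc-725
workload_cluster_ns=default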

Run Crashd

1 Run the crashd command from the location where the script file diagnostics.crsh and argument file args are located.

crashd run --args-file args diagnostics.crsh

2 Optionally, monitor Crashd output. By default, the crashd command runs silently until completion. However, you can use the --debug flag to view log messages on the screen, similar to the following:

crashd run --args-file args --debug diagnostics.crsh

DEBU[0003] creating working directory ./workdir/tkg-kind-12345
DEBU[0003] kube_capture(what=objects)
DEBU[0003] Searching in 20 groups ...
DEBU[0015] Archiving [./workdir/tkg-kind-12345] in bootstrap.tkg-kind-12345.diagnostics.tar.gz
DEBU[0015] Archived workdir/tkg-kind-12345/kind-logs/docker-info.txt
DEBU[0015] Archived workdir/tkg-kind-12345/kind-logs/tkg-kind-12345-control-plane/alternatives.log
DEBU[0015] Archived workdir/tkg-kind-12345/kind-logs/tkg-kind-12345-control-plane/containerd.log
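When the run completes, Crashd writes the collected output into a tar file, named as shown in the Archiving line of the debug output. A minimal sketch of inspecting it without extracting, assuming the file name from the example above:

tar -tzf bootstrap.tkg-kind-12345.diagnostics.tar.gz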

Use an Existing Bootstrap Cluster to Deploy Management Clusters

By default, when you deploy a management cluster by running tanzu management-cluster create, Tanzu Kubernetes Grid creates a temporary kind cluster on your local bootstrap machine. It then uses the local cluster to provision the final management cluster on its target cloud infrastructure—vSphere, Amazon EC2, or Azure—and deletes the temporary cluster after the management cluster successfully deploys. Running tanzu management-cluster delete to delete a management cluster invokes a similar process of creating, using, and then removing a temporary, local kind cluster.

In some circumstances, you might want to keep the local cluster after deploying or deleting a management cluster, for example to examine the objects in the cluster or review its logs. To do this, you can deploy or delete the management cluster with the CLI options --use-existing-bootstrap-cluster or --use-existing-cleanup-cluster.

With these options, Tanzu Kubernetes Grid skips creating and deleting the local kind cluster, and instead uses a pre-existing local cluster that you already have or that you create for the purpose.


Warning: Using an existing bootstrap cluster is an advanced use case that is for experienced Kubernetes users. If possible, it is strongly recommended to use the default kind cluster that Tanzu Kubernetes Grid provides to bootstrap your management clusters.

Identify or Create a Local Bootstrap Cluster

To retain your local kind cluster during management cluster creation or deletion, you must first have a compatible cluster running on your bootstrap machine. Ensure this by either identifying or creating the cluster, as described in the subsections below.

Use an Existing Cluster

To use an existing local cluster, you must make sure that both of the following are true:

- The cluster has never previously been used to bootstrap or delete a management cluster.

- The cluster was created with kind v0.11 or later.

  - Check this by running docker ps and comparing the kindest/node image version listed with the versions listed in the kind release notes; see the sketch after this list.

  - Background: If your bootstrap machine uses a Linux kernel built after the May 2021 Linux security patch, for example Linux 5.11 and 5.12 with Fedora, your bootstrap cluster must be created by a kind version v0.11 or later. Earlier versions of kind attempt to change a file made read-only in recent Linux versions, causing failure. The security patch is being backported to all LTS kernels from 4.9 onwards, causing possible management cluster deployment failures as operating system updates are shipped, including for Docker Machine on Mac OS and Windows Subsystem for Linux.
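A minimal sketch of the version check mentioned above, which prints the name and kindest/node image of each running kind control plane container; the --filter value assumes kind's default NAME-control-plane container naming:

docker ps --filter "name=control-plane" --format "table {{.Names}}\t{{.Image}}"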

If both of these qualifications are true for the current local cluster, you may use it to create or delete a management cluster. Otherwise, you must replace the cluster as follows:

1 Delete the cluster.

2 Download and install a new version of kind as described in the kind documentation.

3 Create a kind cluster as described in the Create a New Cluster section below.

Create a New Cluster

To create a new local bootstrap cluster, do one of the following, depending on your bootstrap machine's connectivity:

- Fully-Online Environment:

Create the cluster:

kind create cluster


- Internet-Restricted Environment:

a Create a kind cluster configuration file kind.yml as follows:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: tkg-kind
nodes:
- role: control-plane
  # This option mounts the host docker registry folder into
  # the control-plane node, allowing containerd to access them.
  extraMounts:
  - containerPath: CONTAINER-CA-PATH
    hostPath: HOST-CA-PATH
containerdConfigPatches:
- |-
  [plugins."io.containerd.grpc.v1.cri".registry.configs."REGISTRY-FQDN".tls]
    ca_file = "/etc/containerd/REGISTRY-FQDN/CA.crt"

Where:

- CONTAINER-CA-PATH is the path of the Harbor CA certificate on the kind container, like /etc/containerd/REGISTRY-FQDN

- HOST-CA-PATH is the path to the Harbor CA certificate on the bootstrap VM where the kind cluster is created, like /etc/docker/certs.d/REGISTRY-FQDN

- REGISTRY-FQDN is the name of the Harbor registry

- CA.crt is the CA certificate of the Harbor registry

By default, crashd looks for a cluster named tkg-kind, so naming the kind cluster tkg-kind makes it easy to collect logs if the cluster fails to bootstrap.

b Use the above config file to create the kind cluster.

kind create cluster --config kind.yml

Deploy a Management Cluster with an Existing Bootstrap Cluster

To deploy a Tanzu Kubernetes Grid management cluster with an existing bootstrap cluster:

1 Follow the above procedure to Identify or Create a Local Bootstrap Cluster.

2 Set the context of kubectl to the local bootstrap cluster:

kubectl config use-context my-bootstrap-cluster-admin@my-bootstrap-cluster

3 Deploy the management cluster by running the tanzu management-cluster create command with the --use-existing-bootstrap-cluster option:

tanzu management-cluster create --file mc.yaml --use-existing-bootstrap-cluster my-bootstrap-cluster


See Chapter 4 Deploying Management Clusters for more information about running tanzu management-cluster create.

Delete a Management Cluster with an Existing Bootstrap Cluster

To delete a Tanzu Kubernetes Grid management cluster with an existing bootstrap cluster:

1 Follow the above procedure to Identify or Create a Local Bootstrap Cluster.

2 Set the context of kubectl to the local bootstrap cluster:

kubectl config use-context my-bootstrap-cluster-admin@my-bootstrap-cluster

3 Delete the management cluster by running the tanzu management-cluster delete command with the --use-existing-cleanup-cluster option:

tanzu management-cluster delete --use-existing-cleanup-cluster my-bootstrap-cluster

See Delete Tanzu Kubernetes Clusters for more information about running tanzu management-cluster delete.
