Managing cloud infrastructure using kpt

This tutorial introduces kpt, an open source tool by Google that lets you work with Kubernetes configurations (also known as manifests): package them, pull them, update them, and modify them. kpt is an alternative to template-based tools when you want to keep a clean separation between configurations and operations on those configurations. kpt lets you reuse and share code that is acting on the configurations (either to modify or inspect them).

This tutorial also showcases how you can combine kpt with other Google solutions like Config Sync and Anthos blueprints. Whether you are a developer working with Kubernetes or a platform engineer managing a Kubernetes-based platform, this tutorial lets you discover how you can use kpt in your own Kubernetes-related workflows. This tutorial assumes that you are familiar with Kubernetes and Google Cloud.

Declarative configuration for cloud infrastructure is a well-established practice in the IT industry. It brings a powerful abstraction of the underlying systems. This abstraction frees you from having to manage low-level configuration details and dependencies. Therefore, declarative configuration has an advantage compared to imperative approaches, such as operations performed in graphical and command line interfaces.

The Kubernetes resource model has been influential in making declarative configuration approaches mainstream. Because the Kubernetes API is fully declarative by nature, you only tell Kubernetes what you want, not how to achieve what you want. The Kubernetes API lets you cleanly separate the configuration (whether desired or real) from operations on the configuration (adding, removing, and modifying objects). In other words, in the Kubernetes resource model, the configuration is data, and not code.
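
To make the "configuration is data" point concrete, the following sketch treats a manifest as plain text that generic tools can inspect; the file name and values are only illustrative:

```shell
# A Kubernetes configuration is plain data (YAML), not code: any tool can
# read or rewrite it. The file name and values here are illustrative only.
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
EOF

# Because the configuration is data, generic tools can inspect it:
grep 'replicas:' deployment.yaml
```

Tools like kpt take this further by parsing the YAML structure instead of matching text, but the principle is the same: the configuration itself stays declarative data.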

This separation of configuration from operations has many advantages: people and automated systems can understand and work on the configuration, and the software modifying the configuration is easily reusable. This separation also lets you easily implement a GitOps methodology (as defined in the GitOps-style continuous delivery with Cloud Build tutorial).

In this tutorial, you explore this separation of configuration declaration from configuration operations using kpt. This tutorial highlights the following features of kpt:

  • Package management: download and update Kubernetes configuration packages.
  • Functions: run arbitrary pieces of code to either modify or validate your configurations.
  • Function pipeline: a set of functions that the package author has included with the package.
  • Resource management: apply, update, and delete the resources that correspond to your configurations in a Kubernetes cluster.

Objectives

  • Create a Google Kubernetes Engine (GKE) cluster.
  • Use kpt to download an existing set of Kubernetes configurations.
  • Use kpt functions to customize the configurations.
  • Apply your configuration to the GKE cluster.
  • Use kpt to pull in some upstream changes for your configuration.
  • Use kpt in a real-world scenario to harden your GKE cluster.

Costs

This tutorial uses the following billable components of Google Cloud:

  • Google Kubernetes Engine

To generate a cost estimate based on your projected usage, use the pricing calculator. New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. In the Cloud Console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Cloud Console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Cloud SDK already installed, including the gcloud command-line tool, and with values already set for your current project. It can take a few seconds for the session to initialize.

  5. Configure Cloud Shell to use your project:

    gcloud config set project PROJECT_ID
    
  6. In Cloud Shell, enable the Google Kubernetes Engine and Cloud Source Repositories APIs:

    gcloud services enable container.googleapis.com \
       sourcerepo.googleapis.com
    

When you finish this tutorial, you can avoid continued billing by deleting the resources that you created. For more details, see Cleaning up.

Creating a GKE cluster

In this section, you create the GKE cluster where you deploy configurations later in the tutorial.

  1. In Cloud Shell, create a GKE cluster:

    gcloud container clusters create kpt-tutorial \
       --num-nodes=1 --machine-type=n1-standard-4 \
       --zone=us-central1-a --enable-network-policy
    
  2. Verify that you have access to the cluster. The following command returns information about the node or nodes that are in the cluster.

    kubectl get nodes
    

Applying a kpt package

In this section, you use kpt to download a set of configurations, customize them, and apply them to the cluster that you created in the previous section.

  1. In Cloud Shell, install kpt:

    mkdir -p ~/bin
    curl -L https://github.com/GoogleContainerTools/kpt/releases/download/v1.0.0-beta.1/kpt_linux_amd64 --output ~/bin/kpt && chmod u+x ~/bin/kpt
    export PATH=${HOME}/bin:${PATH}
    
  2. Download an example set of configurations. For more information, see kpt pkg get.

    kpt pkg get https://github.com/GoogleContainerTools/kpt.git/package-examples/wordpress@v0.7
    

    The preceding command downloads the wordpress sample package that is available in the kpt GitHub repository, at the version tagged v0.7.

  3. Examine the package contents. For more information, see kpt pkg tree.

    kpt pkg tree wordpress/
    

    The output looks like the following:

    Package "wordpress"
    ├── [Kptfile]  Kptfile wordpress
    ├── [service.yaml]  Service wordpress
    ├── deployment
    │   ├── [deployment.yaml]  Deployment wordpress
    │   └── [volume.yaml]  PersistentVolumeClaim wp-pv-claim
    └── Package "mysql"
        ├── [Kptfile]  Kptfile mysql
        ├── [deployment.yaml]  PersistentVolumeClaim mysql-pv-claim
        ├── [deployment.yaml]  Deployment wordpress-mysql
        └── [deployment.yaml]  Service wordpress-mysql
    

    The output shows two packages: the top-level wordpress package and a wordpress/mysql subpackage. Each package contains a metadata file named Kptfile. The Kptfile is consumed only by kpt itself, and it describes the upstream source of the package as well as how to customize and validate it.

  4. Update the label of the package.

    The package author added a rendering pipeline to the Kptfile, a common way to convey the customizations that a package expects. Examine the Kptfile:

    less wordpress/Kptfile
    

    The contents should look something like this:

    apiVersion: kpt.dev/v1
    kind: Kptfile
    metadata:
      name: wordpress
    upstream:
      type: git
      git:
        repo: https://github.com/GoogleContainerTools/kpt
        directory: /package-examples/wordpress
        ref: v0.7
      updateStrategy: resource-merge
    upstreamLock:
      type: git
      git:
        repo: https://github.com/GoogleContainerTools/kpt
        directory: /package-examples/wordpress
        ref: package-examples/wordpress/v0.7
        commit: cbd342d350b88677e522bf0d9faa0675edb8bbc1
    info:
      emails:
        - kpt-team@google.com
      description: This is an example wordpress package with mysql subpackage.
    pipeline:
      mutators:
        - image: gcr.io/kpt-fn/set-label:v0.1
          configMap:
            app: wordpress
      validators:
        - image: gcr.io/kpt-fn/kubeval:v0.1
    

    Use your favorite editor to change the parameter of the set-label function from app: wordpress to app: my-wordpress.

  5. Change the mysql image tag.

    Unlike tools that operate only through parameters, kpt lets you edit files in place with an editor and still merges upstream updates later. You can edit deployment.yaml directly, without crafting a patch or adding a function to the pipeline:

    nano wordpress/mysql/deployment.yaml
    

    Change the mysql image tag from 5.6 to 5.7, and then save the file.
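    The way kpt can later reconcile this in-place edit with upstream changes (the resource-merge strategy named in the Kptfile) is conceptually a three-way merge. The following locally runnable sketch uses diff3 on plain text only as an analogy; kpt actually merges the structured YAML resources:

```shell
# Three-way merge analogy (diff3 on text; kpt merges structured YAML).
printf 'image: mysql:5.6\nreplicas: 1\n' > original.yaml   # common ancestor
printf 'image: mysql:5.7\nreplicas: 1\n' > local.yaml      # your in-place edit
printf 'image: mysql:5.6\nreplicas: 2\n' > upstream.yaml   # upstream change

# Non-overlapping edits from both sides survive the merge.
diff3 -m local.yaml original.yaml upstream.yaml
```

    Both edits survive: the merged output contains image: mysql:5.7 and replicas: 2.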

  6. Annotate the configuration with sample-annotation: sample-value.

    kpt fn eval wordpress --image gcr.io/kpt-fn/set-annotations:v0.1 \
      -- sample-annotation=sample-value
    

    The output should look something like this:

    [RUNNING] "gcr.io/kpt-fn/set-annotations:v0.1"
    [PASS] "gcr.io/kpt-fn/set-annotations:v0.1"
    

    To see the new annotation, examine any of the configuration files; a simple one to check is wordpress/service.yaml.

    In this example, you applied a customization with a function in a way that the package author didn't plan for. kpt supports both declarative and imperative function execution, which allows for a wide range of scenarios.

    If this package had been built with a template-based, infrastructure-as-code tool, you would have to go to the source of the package and edit the code there.
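    If you wanted this annotation applied on every render rather than as a one-off command, you could instead declare the function in the package's pipeline. A hypothetical addition to the wordpress/Kptfile mutators list (not part of the published package):

```yaml
# Hypothetical pipeline entry (illustrative only):
pipeline:
  mutators:
    - image: gcr.io/kpt-fn/set-annotations:v0.1
      configMap:
        sample-annotation: sample-value
```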

  7. Run the pipeline and validate the changes using kubeval through kpt.

    The package author has included a validation step in the pipeline:

    ...
    validators:
        - image: gcr.io/kpt-fn/kubeval:v0.1
    

    To run the function pipeline, which sets the labels and validates all the changes, run the following command:

    kpt fn render wordpress
    

    The successful output of the rendering pipeline looks like the following:

    Package "wordpress/mysql":
    [RUNNING] "gcr.io/kpt-fn/set-label:v0.1"
    [PASS] "gcr.io/kpt-fn/set-label:v0.1"
    
    Package "wordpress":
    [RUNNING] "gcr.io/kpt-fn/set-label:v0.1"
    [PASS] "gcr.io/kpt-fn/set-label:v0.1"
    [RUNNING] "gcr.io/kpt-fn/kubeval:v0.1"
    [PASS] "gcr.io/kpt-fn/kubeval:v0.1"
    
    Successfully executed 3 function(s) in 2 package(s).
    

    This step is an example of the benefits of the Kubernetes resource model: because your configurations are represented in this well-known model, you can use existing Kubernetes tools like kubeval.

  8. Examine the differences between your local version of the configurations and the upstream configuration:

    kpt pkg diff wordpress
    
  9. Initialize the package for deployment:

    kpt live init wordpress
    

    The preceding command builds an inventory of the configurations that are in the package. Among other things, kpt uses the inventory to prune configurations when you remove them from the package. For more information, see kpt live.
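    Conceptually, pruning is a set difference between the inventory and the package. The following locally runnable sketch illustrates the idea only; it is not kpt's actual implementation:

```shell
# Resources recorded in the inventory (previously applied)...
printf 'deployment/wordpress\npvc/wp-pv-claim\nservice/wordpress\n' > inventory.txt
# ...versus resources currently declared in the package.
printf 'deployment/wordpress\npvc/wp-pv-claim\n' > package.txt

# Lines only in the inventory are candidates for pruning (comm needs sorted input).
comm -23 inventory.txt package.txt
```

    Here service/wordpress would be pruned, because it was applied before but is no longer declared in the package.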

  10. Apply the configurations to your GKE cluster:

    kpt live apply wordpress
    

    The output looks like the following:

    service/wordpress created
    service/wordpress-mysql created
    deployment.apps/wordpress created
    deployment.apps/wordpress-mysql created
    persistentvolumeclaim/mysql-pv-claim created
    persistentvolumeclaim/wp-pv-claim created
    6 resource(s) applied. 6 created, 0 unchanged, 0 configured, 0 failed
    0 resource(s) pruned, 0 skipped, 0 failed
    
  11. Wait a few minutes, and then verify that everything is running as expected:

    kpt live status wordpress
    

    The output looks like the following:

    service/wordpress is Current: Service is ready
    service/wordpress-mysql is Current: Service is ready
    deployment.apps/wordpress is InProgress: Available: 0/1
    deployment.apps/wordpress-mysql is InProgress: Available: 0/1
    persistentvolumeclaim/mysql-pv-claim is Current: PVC is Bound
    persistentvolumeclaim/wp-pv-claim is Current: PVC is Bound
    

Updating your local package

In this section, you update your local version of the package with some changes from the upstream package.

  1. Create a Git repository for your configurations and configure your email and name. You need a local Git repository to be able to update the package. Replace YOUR_EMAIL with your email address and replace YOUR_NAME with your name.

    cd wordpress/
    git init
    git config user.email "YOUR_EMAIL"
    git config user.name "YOUR_NAME"
    
  2. Commit your configurations:

    git add .
    git commit -m "First version of wordpress"
    
  3. Update your local package. In this step, you pull the v0.8 version from upstream.

    kpt pkg update .@v0.8
    
  4. Observe that updates from upstream are applied to your local package:

    git diff
    

    In this case, the update has changed the wordpress deployment volume from 2Gi to 3Gi.

  5. Commit your changes:

    git commit -am "Update to package version v0.8"
    
  6. Apply the new version of the package:

    kpt live apply .
    

Removing a resource and a package

Because kpt tracks the resources that it creates, it can prune resources from the cluster when you delete resources from the package. It can also completely remove a package from the cluster. In this section, you remove a resource from the package, and then remove the package.

  1. Remove the service.yaml file from the package:

    git rm service.yaml
    git commit -m "remove service"
    
  2. Apply the change, and then verify that kpt pruned the wordpress service:

    kpt live apply .
    kubectl get svc
    
  3. Remove the rest of the package from the cluster, and then verify that you have nothing left in the cluster:

    kpt live destroy .
    kubectl get deployment
    

Using kpt to harden your GKE cluster

The kpt live command is not the only way that you can apply a package to a Kubernetes cluster. In this section, you use kpt with Config Sync in a simple, but real scenario. The Config Sync tool lets you manage your configuration centrally, uniformly, and declaratively for all your Kubernetes clusters from a Git repository. Config Sync is available as a standalone product for GKE. Anthos (which you use in this tutorial) includes Anthos Config Management.

The Anthos security blueprints provide you with a range of prepackaged security settings for your Anthos-based workloads. In this section, you use the restricting traffic blueprint (and its directory in the GitHub repository). You use kpt to download the default-deny package. The package uses Kubernetes network policies to deny any traffic in your GKE cluster by default (except for DNS resolution). To apply the configurations, you then commit the configurations to the Config Sync Git repository.

Install Config Sync

In this section, you create the Git repository that Config Sync needs, and then you install Config Sync on your GKE cluster.

  1. In Cloud Shell, use Cloud Source Repositories to create a Git repository for Config Sync:

    cd ~
    gcloud source repos create config-management
    
  2. Generate an SSH key pair to authenticate against the Git repository:

    cd ~
    ssh-keygen -t rsa -b 4096  -N '' \
       -f cloud_source_repositories_key
    
  3. Create the Kubernetes Secret that contains the SSH private key to access the Git repository:

    kubectl create ns config-management-system && \
    kubectl create secret generic git-creds \
      --namespace=config-management-system \
      --from-file=ssh=cloud_source_repositories_key
    
  4. Display the public key, and then copy it:

    cat cloud_source_repositories_key.pub
    
  5. Go to the Cloud Source Repositories Manage SSH Keys page.

  6. In the Register SSH Key dialog that appears, enter the following values:

    1. In the Key name field, enter config-management.
    2. In the Key field, paste the public key.
    3. Click Register.
  7. Clone the Git repository to Cloud Shell:

    gcloud source repos clone config-management
    cd config-management
    git checkout -b main
    
  8. Download the Config Sync command-line tool called nomos:

    gsutil cp gs://config-management-release/released/latest/linux_amd64/nomos ~/bin/nomos
    chmod +x ~/bin/nomos
    
  9. Initialize the Config Sync repository:

    nomos init
    git add .
    git commit -m "Config Management directory structure"
    git push -u origin main
    
  10. Deploy the Config Sync operator:

    gsutil cp gs://config-management-release/released/latest/config-sync-operator.yaml /tmp/config-sync-operator.yaml
    kubectl apply -f /tmp/config-sync-operator.yaml
    

Configure Config Sync

  1. Create a ConfigManagement custom resource to configure Config Sync:

    PROJECT=$(gcloud config get-value project)
    EMAIL=$(gcloud config get-value account)
    cat <<EOF > /tmp/config-management.yaml
    apiVersion: configmanagement.gke.io/v1
    kind: ConfigManagement
    metadata:
      name: config-management
    spec:
      clusterName: kpt-tutorial
      git:
        syncRepo: ssh://${EMAIL}@source.developers.google.com:2022/p/${PROJECT}/r/config-management
        syncBranch: main
        secretType: ssh
    EOF
    kubectl -n config-management-system \
        apply -f /tmp/config-management.yaml
    

    For more installation options, see the Config Sync installation documentation.

  2. In Cloud Shell, verify that Config Sync is working properly:

    nomos status --contexts=$(kubectl config current-context)
    

    This command returns the status as SYNCED. Config Sync might take some time to initialize. If the status isn't updated, wait a few minutes and then rerun the command.

Apply the Anthos security blueprint

In this section, you use kpt to download the default-deny package of the restricting traffic Anthos security blueprint. Then you use Config Sync to apply the package to the default namespace only.

  1. Download the default-deny package:

    cd ~
    kpt pkg get https://github.com/GoogleCloudPlatform/anthos-security-blueprints.git/restricting-traffic/default-deny ./
    

    You can explore the content of the package: the default-deny/Kptfile file contains the metadata of the package, and the default-deny/default-deny.yaml file contains a Kubernetes NetworkPolicy, which is the only configuration from this blueprint.

  2. Create the default namespace directory and the associated namespace.yaml file in the Config Sync repository:

    mkdir -p ~/config-management/namespaces/default
    cat >> ~/config-management/namespaces/default/namespace.yaml <<EOF
    apiVersion: v1
    kind: Namespace
    metadata:
      name: default
    EOF
    

    The default namespace exists in your GKE cluster, but Config Sync doesn't manage it. When you create the directory and the file in this step, you make Config Sync manage the namespace. To apply the package to multiple namespaces at once, you can create an abstract namespace.

  3. Use kpt to copy the package's content into the Config Sync repository, and then add an annotation to customize it:

    kpt fn source default-deny/ | \
        kpt fn eval - --image=gcr.io/kpt-fn/set-annotations:v0.1 -- \
        anthos-security-blueprint=restricting-traffic | \
        kpt fn sink ~/config-management/namespaces/default/
    

    As shown in this example, you can use pipes to chain kpt fn commands together. Chaining lets you read configurations, modify them as needed, and write the result. You can chain as many kpt fn commands as you need.

  4. Verify that the Kubernetes NetworkPolicy was written to the Config Sync repository, and that it is annotated with anthos-security-blueprint: restricting-traffic:

    cat config-management/namespaces/default/default-deny.yaml
    

    The output looks like the following:

    kind: NetworkPolicy
    apiVersion: networking.k8s.io/v1
    metadata:
      name: default-deny
      annotations:
        anthos-security-blueprint: restricting-traffic
    spec:
      policyTypes:
      - Ingress
      - Egress
      podSelector: {}
      egress:
      - to:
        - namespaceSelector:
            matchLabels:
              k8s-namespace: kube-system
          podSelector:
            matchExpressions:
            - key: k8s-app
              operator: In
              values:
              - kube-dns
              - node-local-dns
        ports:
        - protocol: TCP
          port: 53
        - protocol: UDP
          port: 53
    
  5. Commit and push the changes:

    cd ~/config-management/
    git add namespaces/default/
    git commit -m "Default deny"
    git push
    
  6. Verify that the new policy is applied:

    kubectl get networkpolicies
    

    If the new policy isn't present, wait a few seconds and then run the command again. Config Sync updates the configurations every 15 seconds by default. If you need to do some additional troubleshooting, run the following command to get information about any potential Config Sync errors:

    nomos status --contexts=$(kubectl config current-context)
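    Rather than rerunning the check by hand, you can wrap it in a small retry loop. The wait_for helper below is a generic sketch (not a kpt or nomos command):

```shell
# Retry a command until it succeeds, up to a number of attempts.
wait_for() {  # usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND...
  attempts=$1; delay=$2; shift 2
  for _ in $(seq "$attempts"); do
    "$@" && return 0
    sleep "$delay"
  done
  return 1
}

# Example (assumes kubectl access; policy name from this tutorial):
# wait_for 20 15 kubectl get networkpolicy default-deny
```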
    
  7. To test the new policy, get a shell in a pod running inside the default namespace:

    kubectl -n default run -i --tty --rm test \
            --image=busybox --restart=Never -- sh
    
  8. Try to ping 8.8.8.8, and see that it doesn't work, as expected:

    ping -c 3 -W 1 8.8.8.8
    

    The output looks like the following:

    PING 8.8.8.8 (8.8.8.8): 56 data bytes
    
    --- 8.8.8.8 ping statistics ---
    3 packets transmitted, 0 packets received, 100% packet loss
    
  9. Try to query the Kubernetes API server, and see that it doesn't work, as expected:

    wget --timeout=3 https://${KUBERNETES_SERVICE_HOST}
    

    The output looks like the following:

    Connecting to 10.3.240.1 (10.3.240.1:443)
    wget: download timed out
    
  10. Exit the pod:

    exit
    

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete the resources

If you want to keep the Google Cloud project that you used in this tutorial, you can delete the Git repository and the GKE cluster. In Cloud Shell, run the following commands:

gcloud source repos delete config-management --quiet
gcloud container clusters delete kpt-tutorial \
    --async --quiet --zone=us-central1-a

What's next