Deploying IBM Db2 Warehouse on Google Kubernetes Engine with NFS storage

This tutorial shows how to create an IBM Db2 Warehouse cluster on Google Kubernetes Engine (GKE) with a Network File System (NFS) volume as the storage layer. You use Filestore as the NFS backend for the shared storage solution. However, you can choose any other cloud-deployable NFS solution.

This tutorial is useful if you are a sysadmin, developer, engineer, or database administrator and you want to deploy an IBM Db2 Warehouse cluster on Google Cloud.

For an overview of IBM Db2 Warehouse and deployment options on Google Cloud, see the series overview.

In this tutorial, you use the following software:

  • Ubuntu-server 16.04
  • IBM Db2 Warehouse Enterprise Edition
  • IBM Db2 Warehouse Client

Objectives

  • Get access to the IBM Db2 Warehouse Docker images in the Docker Store.
  • Create a Filestore instance.
  • Launch your GKE cluster.
  • Verify that the cluster is operational.
  • Initialize Docker Store authentication in the GKE cluster.
  • Deploy and run the NFS-Client provisioner in the cluster.
  • Deploy and run IBM Db2 Warehouse containers in the cluster.
  • Upload sample data to IBM Db2 Warehouse.
  • Connect to the IBM Db2 administration console.


This tutorial uses billable components of Google Cloud, including:

  • Google Kubernetes Engine (GKE)
  • Compute Engine
  • Filestore

Use the Pricing Calculator to generate a cost estimate based on your projected usage.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the GKE and Filestore APIs.

    Enable the APIs

  5. If you don't have a Docker ID, create one in the Docker Store.

In this tutorial, you use IBM Db2 Warehouse Enterprise Edition. If you don't already have a license for this software, you might be able to use a free trial version for this tutorial.

When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Cleaning up.

Architecture

In this tutorial, you deploy a Kubernetes cluster using GKE in three different Google Cloud zones. In the cluster, you deploy three instances of IBM Db2 Warehouse:

  • An instance named db2wh-1 is initially designated as the head node.
  • Instances named db2wh-2 and db2wh-3 are initially designated as data nodes.

The role (head or data node) of individual instances can change if the head node fails over.

You also deploy a Filestore instance named db2wh-data-nfs, which acts as the shared storage for the cluster nodes.

The architecture is shown in the following diagram:


Getting access to the IBM Db2 Warehouse Docker images

In this tutorial, you use your Docker Store account to get access to a free trial version of IBM Db2 Warehouse from the Docker Store. This involves getting access to two separate images: a server and a client.

  1. In your browser, go to the IBM Db2 Warehouse EE Docker image.

  2. Sign in using your Docker username and password.

  3. Click Proceed to checkout.

  4. Fill in your details.

  5. If you agree to the terms, select the I agree ... and I acknowledge ... checkboxes on the right.

  6. Click Get Content.

    This takes you to the Setup page. You don't need to follow those instructions, because you'll perform those steps later in the tutorial.

  7. Repeat the process for the IBM Db2 Warehouse Client image.

Preparing your environment

In this tutorial, you use us-central1 as the default region and us-central1-b as the default zone. To save time typing your Compute Engine zone options in the gcloud command-line tool, you set the region and zone as defaults.

You perform most of the steps for this tutorial in Cloud Shell. When you open Cloud Shell, you can also automatically clone the GitHub repo that's associated with this tutorial.

  1. Open Cloud Shell:

    OPEN Cloud Shell

  2. Set the default region and zone:

    gcloud config set compute/region us-central1
    gcloud config set compute/zone us-central1-b

Creating the Filestore instance

The next step is to create the Filestore instance.

  • In Cloud Shell, create a Filestore instance:

    gcloud beta filestore instances create db2wh-data-nfs \
        --location=us-central1-c \
        --tier=STANDARD \
        --file-share=name="db2whdata",capacity=1TB \
        --network=name="default"

    This creates a standard tier Filestore instance named db2wh-data-nfs with 1 TB capacity and a mount point named db2whdata.
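The NFS-Client provisioner that you deploy later needs the IP address of this Filestore instance. A minimal sketch of how to look it up (the `networks[0].ipAddresses[0]` field path assumes the beta Filestore API; verify it against your gcloud version):

```shell
# Retrieve the IP address of the Filestore instance (sketch; the
# field path networks[0].ipAddresses[0] assumes the beta Filestore API)
gcloud beta filestore instances describe db2wh-data-nfs \
    --location=us-central1-c \
    --format='value(networks[0].ipAddresses[0])'
```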

Provisioning a service account to manage GKE clusters

For the tutorial, you create a service account to manage Compute Engine instances in the GKE cluster. GKE cluster nodes use this service account instead of the default service account. It's a best practice to limit the service account to just the roles and access permissions that are required in order to run the application.

The only role required for the service account is the Compute Admin role (roles/compute.admin). This role provides full control of all Compute Engine resources. The service account needs this role to manage GKE cluster nodes.

  1. In Cloud Shell, create an environment variable that stores the service account name:

    export GKE_SERVICE_ACCOUNT_NAME=db2dw-gke-service-account
  2. Create a service account:

    gcloud iam service-accounts create $GKE_SERVICE_ACCOUNT_NAME
  3. Create an environment variable that stores the service account email address:

    export GKE_SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
        --format='value(email)' \
        --filter=email:"${GKE_SERVICE_ACCOUNT_NAME}@")
  4. Bind the compute.admin role to the service account:

    gcloud projects add-iam-policy-binding \
        $(gcloud config get-value project 2> /dev/null) \
        --member serviceAccount:$GKE_SERVICE_ACCOUNT_EMAIL \
        --role roles/compute.admin

Preparing the GKE cluster

In this section, you launch the GKE cluster, grant permissions, and finish the cluster configuration.

Launch the GKE cluster

You can now create and launch the GKE cluster.

  • In Cloud Shell, create a regional GKE cluster with a single node in each zone:

    gcloud container clusters create ibm-db2dw-demo \
        --enable-ip-alias \
        --image-type=ubuntu \
        --machine-type=n1-standard-16 \
        --metadata disable-legacy-endpoints=true \
        --node-labels=app=db2wh \
        --node-locations us-central1-a,us-central1-b,us-central1-c \
        --no-enable-basic-auth \
        --no-issue-client-certificate \
        --num-nodes=1 \
        --region us-central1 \
        --service-account=$GKE_SERVICE_ACCOUNT_EMAIL

    This creates a cluster named ibm-db2dw-demo.

Because you're creating this cluster with just one node pool (the default one), and that pool applies the app=db2wh label to every node, all of the cluster's nodes are eligible to run IBM Db2 Warehouse workloads; only labeled nodes are eligible to host IBM Db2 Warehouse pods. If you want more separation (for example, dedicated nodes for IBM Db2 Warehouse), you can either create a new node pool or a dedicated cluster.
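If you choose the node-pool route, the command is similar to the cluster creation. A sketch, assuming the same machine type and label; the pool name db2wh-pool is illustrative:

```shell
# Hypothetical dedicated node pool for IBM Db2 Warehouse workloads
gcloud container node-pools create db2wh-pool \
    --cluster=ibm-db2dw-demo \
    --region=us-central1 \
    --machine-type=n1-standard-16 \
    --node-labels=app=db2wh \
    --num-nodes=1
```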

Manage Docker Store authentication

In this tutorial, you create a secret to store your Docker Store credentials, so that your GKE cluster can download the IBM Db2 Warehouse Docker image from the Docker Store. For more details, see the relevant section of the Kubernetes documentation. This approach is valid for private Docker Registry instances as well.

  1. In Cloud Shell, log in to the Docker Store (the Docker registry instance you're going to use):

    docker login
  2. Create a Kubernetes secret with your Docker Store credentials:

    kubectl create secret generic dockerstore \
        --type=kubernetes.io/dockerconfigjson \
        --from-file=.dockerconfigjson=$HOME/.docker/config.json
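For the secret to take effect, the pod specs that pull the IBM Db2 Warehouse image must reference it in imagePullSecrets. A minimal sketch of the relevant fragment (the image name and tag are illustrative; in this tutorial the actual reference lives in statefulset.yaml):

```yaml
# Fragment of a pod template that pulls from the Docker Store
# using the secret created above (image name and tag are illustrative)
spec:
  imagePullSecrets:
    - name: dockerstore
  containers:
    - name: db2wh
      image: store/ibmcorp/db2wh_ee:latest
```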

Grant Cluster Admin privileges to the user

You need to grant your user the ability to create new roles in GKE, as described in the GKE documentation.

  • In Cloud Shell, grant the permission to create new roles to your user:

    kubectl create clusterrolebinding cluster-admin-binding \
        --clusterrole cluster-admin \
        --user $(gcloud config list \
        --format 'value(core.account)')

Deploy the NFS-Client provisioner

In this tutorial, you deploy an NFS-Client provisioner in the cluster. This provisioner takes care of initializing PersistentVolumes in the Filestore instance to back PersistentVolumeClaims.

  1. In Cloud Shell, create a service account to manage NFS resources:

    kubectl apply -f solutions-db2wh/nfs/rbac.yaml
  2. Deploy the NFS-Client provisioner:

    kubectl apply -f solutions-db2wh/nfs/deployment.yaml
  3. Create a StorageClass to back PersistentVolumeClaims with NFS volumes:

    kubectl apply -f solutions-db2wh/nfs/class.yaml
  4. Wait for the NFS-Client provisioner pods to be reported as Running:

    kubectl get pods --watch

    When they're running, you see Running in the output:

    nfs-client-provisioner   1/1       Running   0         10s
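For reference, class.yaml creates a StorageClass whose provisioner field must match the provisioner name configured in the NFS-Client provisioner Deployment. A minimal sketch, with illustrative names:

```yaml
# Sketch of an NFS-backed StorageClass; the provisioner string must
# match the name configured in deployment.yaml (value is illustrative)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client
provisioner: cluster.local/nfs-client-provisioner
parameters:
  archiveOnDelete: "false"
```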

Create the nodes file

You can now create a configuration file that IBM Db2 Warehouse needs in order to bootstrap each instance.

  1. In Cloud Shell, create the nodes file:

    kubectl get nodes -o=jsonpath="{range .items[*]}{.status.addresses[?(@.type=='InternalIP')].address}{'\n'}{end}" | \
        sed '1s/^/head_node=/' | \
        sed -e '2,$ s/^/data_node=/' > nodes
  2. Create a ConfigMap that contains the nodes file:

    kubectl create configmap db2wh-nodes --from-file=nodes
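To see what the two sed stages produce, you can run them on sample input. This sketch uses hypothetical internal IPs in place of the kubectl output:

```shell
# Hypothetical internal IPs standing in for the three GKE nodes
SAMPLE_IPS='10.128.0.2
10.128.0.3
10.128.0.4'

# First line becomes the head node entry; every later line a data node entry
NODES_PREVIEW=$(printf '%s\n' "$SAMPLE_IPS" | \
    sed '1s/^/head_node=/' | \
    sed -e '2,$ s/^/data_node=/')

echo "$NODES_PREVIEW"
```

The resulting nodes file therefore contains one head_node= line followed by one data_node= line per remaining node.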

Deploying IBM Db2 Warehouse pods

You now create all the GKE pods that are required in order to run IBM Db2 Warehouse.

  1. In Cloud Shell, create the PersistentVolumeClaim. The PersistentVolumeClaim object enables the cluster to mount the NFS storage as a PersistentVolume on multiple pods at the same time.

    kubectl apply -f solutions-db2wh/persistent-volume-claim.yaml
  2. Run a job that copies the nodes file in the NFS volume:

    kubectl apply -f solutions-db2wh/nodes-file-deploy-job.yaml
  3. Verify that the nodes file deployment job has run:

    kubectl get jobs --watch

    The job has run when nodes-config is reported as Successful:

    nodes-config   1         1            19s
  4. Deploy a LoadBalancer Service to allow access to the IBM Db2 Warehouse administration console:

    kubectl apply -f solutions-db2wh/service.yaml
  5. Wait for the load balancer service named db2wh-ext to be assigned an external IP address:

    kubectl get services --watch

    In the output, you see an IP address for CLUSTER-IP and EXTERNAL-IP:

    NAME       TYPE         CLUSTER-IP   EXTERNAL-IP  PORT(S)                         AGE
    db2wh-ext  LoadBalancer yy.yy.yy.yy  xx.xx.xx.xx  8443:30973/TCP,50000:30613/TCP  7s
  6. Deploy the StatefulSet to start the IBM Db2 Warehouse pods:

    kubectl apply -f solutions-db2wh/statefulset.yaml
  7. Verify that the IBM Db2 Warehouse pods (db2wh-0, db2wh-1, and db2wh-2) are running:

    kubectl get pods --watch

    This might take a few minutes.

    The pods are running when you see the status Running for all of the pods:

    db2wh-1   0/1       Running   0         3m
    db2wh-2   0/1       Running   0         3m
    db2wh-0   0/1       Running   0         3m
  8. Create an environment variable that stores the IP address of the node that's running the IBM Db2 Warehouse head node:

    HEAD_NODE_IP=$(grep "head_node" nodes | awk -F '=' '{print $2}')
  9. Create an environment variable that stores the head node pod name:

    HEAD_NODE_POD_NAME=$(kubectl get pods \
        --field-selector=status.phase=Running \
        -o=jsonpath="{range .items[?(@.metadata.labels.app=='db2wh')]}{.metadata.name}{':'}{.status.hostIP}{'\n'}{end}" | \
        grep $HEAD_NODE_IP | awk -F ':' '{print $1}')
  10. Check the logs of one of the pods to ensure that the bootstrap process is running without problems:

    kubectl exec -it $HEAD_NODE_POD_NAME -- status --check-startup

    This might take 40 to 60 minutes, during which time some errors might be reported. You can ignore them for this tutorial.

    The process is running correctly when you see the status running successfully in the output:

    HA Management up and running successfully!
    Successfully started IBM Db2 Warehouse service stack!
  11. Set up the administration console password:

    DB2_ADMIN_PASSWORD=$(openssl rand -hex 8)
    kubectl exec -it $HEAD_NODE_POD_NAME -- setpass ${DB2_ADMIN_PASSWORD}
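The extraction logic in steps 8 and 9 can be checked locally against sample data. This sketch uses hypothetical IPs and pod names:

```shell
# Sample nodes file (hypothetical IPs), in the same format as the one
# generated earlier: head_node=IP / data_node=IP
cat > /tmp/nodes <<'EOF'
head_node=10.128.0.2
data_node=10.128.0.3
data_node=10.128.0.4
EOF

# Extract the head node IP by splitting the head_node= line on '='
HEAD_NODE_IP=$(grep "head_node" /tmp/nodes | awk -F '=' '{print $2}')

# Simulate the pod-name:host-IP pairs that the jsonpath emits, then
# pick the pod running on the head node (pod names are hypothetical)
PODS='db2wh-0:10.128.0.3
db2wh-1:10.128.0.2
db2wh-2:10.128.0.4'
HEAD_NODE_POD_NAME=$(printf '%s\n' "$PODS" | grep "$HEAD_NODE_IP" | awk -F ':' '{print $1}')

echo "$HEAD_NODE_POD_NAME"
```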

Testing your deployment

You've finished configuring the pods, so you can now test the deployment.

Deploy the IBM Db2 Warehouse Client container

In order to upload data to IBM Db2 Warehouse, you now deploy the Client container and map the sample data to it using a Kubernetes ConfigMap.

  1. In Cloud Shell, create a ConfigMap that contains the sample data:

    kubectl create configmap sample-data \
        --from-file=solutions-db2wh/sample-data/nyc-wifi-locs.csv \
        --from-file=solutions-db2wh/sample-data/sample-table.sql
  2. Create the Deployment to start the IBM Db2 Warehouse Client container:

    kubectl apply -f solutions-db2wh/client.yaml
  3. Verify that the IBM Db2 Warehouse Client pod is running:

    kubectl get pods --watch

    This might take a few minutes.

    The pod is running when you see Running in the status:

    db2wh-client-xxxxx-xxxx   1/1       Running   0         3m

Upload sample data

To help you test the deployment, you upload sample data to the IBM Db2 Warehouse server.

  1. In Cloud Shell, display the password created earlier:

    echo $DB2_ADMIN_PASSWORD
  2. Create an environment variable that stores the IBM Db2 Warehouse Client container name:

    CLIENT_CONTAINER_NAME=$(kubectl get pods -l app=db2wh-client -o=jsonpath='{.items[0].metadata.name}')
  3. Open a shell window in the Client container:

    kubectl exec -it $CLIENT_CONTAINER_NAME -- cli
  4. Create an environment variable that stores the password, where [PASSWORD] is the password you got earlier in this procedure:

    DB_PASSWORD=[PASSWORD]
  5. Create an environment variable that stores the database alias:

    DB_ALIAS=BLUDB

    BLUDB is the default database name in IBM Db2 Warehouse.

  6. Create an environment variable that stores the database hostname:

    DB_HOST=db2wh-ext

  7. Set up the database catalog:

    db_catalog --add $DB_HOST --alias $DB_ALIAS
  8. Create a table to hold sample data in IBM Db2 Warehouse server:

    dbsql -f /sample-table.sql -d $DB_ALIAS -h $DB_HOST -u bluadmin -W $DB_PASSWORD
  9. Upload data to the IBM Db2 Warehouse server:

    dbload -verbose -host $DB_HOST -u bluadmin \
    -pw $DB_PASSWORD -db $DB_ALIAS -schema BLUADMIN \
    -t NYC_FREE_PUBLIC_WIFI -df /nyc-wifi-locs.csv -delim ',' \
    -quotedValue DOUBLE -timeStyle 12HOUR -skipRows 1
  10. Close the IBM Db2 Warehouse Client shell:

    exit

Validate the data using the administration console

You now connect to the IBM Db2 Warehouse administration console and verify the data that you uploaded.

  1. In Cloud Shell, find the service's external IP address:

    kubectl get svc db2wh-ext
  2. Open a browser and go to the following URL, where [EXTERNAL_IP] is the IP address from the preceding step:

    https://[EXTERNAL_IP]:8443

    You can bypass the security warning.

  3. Log in with the following credentials:

    • Username: bluadmin
    • Password: (the password you created in the previous procedure)
  4. If you accept the IBM Db2 Warehouse EULA, click Accept.

  5. On the left-hand side, open the menu and then select Administer > Tables:

    IBM Db2 Warehouse administration UI, showing the Tables folder

  6. Close the Quick Tour popup.

  7. Click the NYC_FREE_PUBLIC_WIFI table:

    Listing of available wi-fi hotspots

  8. Click the Data Distribution tab and make sure the table is populated:

    Data Distribution tab in the Tables listing

    You see 2871 rows in total, which is the entire dataset.

  9. Click Generate SQL.

  10. Select SELECT statement.

  11. Click OK.

    The Generate SQL tab opens and is pre-populated with an auto-generated SELECT statement.

  12. Add a LIMIT clause to the auto-generated SELECT statement to limit the results to the first five records:

    SELECT "LAT", "LON", "X", "Y", "LOCATION_T", "REMARKS", "CITY", "SSID",
           "COUNDIST", "POSTCODE", "BOROCD", "CT2010", "BOROCT2010", "BIN", "BBL", "DOITT_ID"
      FROM "BLUADMIN"."NYC_FREE_PUBLIC_WIFI"
      LIMIT 5;
  13. Click Run and then select Run All.

    A listing of records is displayed in the Result Set tab, showing that you successfully uploaded the sample data.

    Results of running SQL statement showing 5 records from the uploaded data

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

  1. In the Cloud Console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

What's next