This tutorial shows how to create an IBM Db2 Warehouse cluster on Google Kubernetes Engine (GKE) with a Network File System (NFS) volume as the storage layer. You use Filestore as the NFS backend for the shared storage solution. However, you can choose any other cloud-deployable NFS solution.
This tutorial is useful if you are a sysadmin, developer, engineer, or database administrator and you want to deploy an IBM Db2 Warehouse cluster on Google Cloud.
For an overview of IBM Db2 Warehouse and deployment options on Google Cloud, see the series overview.
In this tutorial, you use the following software:
- Ubuntu-server 16.04
- IBM Db2 Warehouse Enterprise Edition
- IBM Db2 Warehouse Client
Objectives
- Get access to the IBM Db2 Warehouse Docker images in the Docker Store.
- Create a Filestore instance.
- Launch your GKE cluster.
- Verify that the cluster is operational.
- Initialize Docker Store authentication in the GKE cluster.
- Deploy and run the NFS-Client provisioner in the cluster.
- Deploy and run IBM Db2 Warehouse containers in the cluster.
- Upload sample data in IBM Db2 Warehouse.
- Connect to the IBM Db2 administration console.
Costs
This tutorial uses billable components of Google Cloud, including Compute Engine, GKE, and Filestore.
Use the Pricing Calculator to generate a cost estimate based on your projected usage.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
- Enable the GKE and Filestore APIs.
- If you don't have a Docker ID, create one in the Docker Store.
In this tutorial, you use IBM Db2 Warehouse Enterprise Edition. If you don't already have a license for this software, you might be able to use a free trial version for this tutorial.
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For more information, see Clean up.
Architecture
In this tutorial, you deploy a Kubernetes cluster using GKE in three different Google Cloud zones. In the cluster, you deploy three instances of IBM Db2 Warehouse:
- An instance named `db2wh-1` is initially designated as the head node.
- Instances named `db2wh-2` and `db2wh-3` are initially designated as data nodes.
The role (head or data node) of individual instances can change if the head node fails over.
You also deploy a Filestore instance named `db2wh-data-nfs`, which acts as the shared storage for the cluster nodes.
The architecture is shown in the following diagram:
Getting access to the IBM Db2 Warehouse Edition Docker images
In this tutorial, you allow your Docker Store account to download a free trial version of IBM Db2 Warehouse Edition from the Docker Store. This involves downloading two separate images—a server and a client.
In your browser, go to the IBM Db2 Warehouse EE Docker image.
Sign in using your Docker username and password.
Click Proceed to checkout.
Fill in your details.
If you agree to the terms, select the I agree ... and I acknowledge ... checkboxes on the right.
Click Get Content.
This takes you to the Setup page. You don't need to follow those instructions, because you'll perform those steps later in the tutorial.
Repeat the process for the IBM Db2 Warehouse Client image.
Preparing your environment
In this tutorial, you use `us-central1` as the default region and `us-central1-b` as the default zone. To save time typing your Compute Engine zone options in the Google Cloud CLI, you set the region and zone as defaults.
You perform most of the steps for this tutorial in Cloud Shell. When you open Cloud Shell, you can also automatically clone the GitHub repo that's associated with this tutorial.
Open Cloud Shell:
Set the default region and zone:
gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-b
Creating the Filestore instance
The next step is to create the Filestore database instance.
In Cloud Shell, create a Filestore instance:
gcloud beta filestore instances create db2wh-data-nfs \
    --location=us-central1-c \
    --tier=STANDARD \
    --file-share=name="db2whdata",capacity=1TB \
    --network=name="default",reserved-ip-range="10.0.0.0/29"
This creates a standard tier Filestore instance named `db2wh-data-nfs` with 1 TB capacity and a mount point named `db2whdata`.
Provisioning a service account to manage GKE clusters
For the tutorial, you create a service account to manage Compute Engine instances in the GKE cluster. GKE cluster nodes use this service account instead of the default service account. It's a best practice to limit the service account to just the roles and access permissions that are required in order to run the application.
The only role required for the service account is the Compute Admin role (`roles/compute.admin`). This role provides full control of all Compute Engine resources. The service account needs this role to manage GKE cluster nodes.
In Cloud Shell, create an environment variable that stores the service account name:
export GKE_SERVICE_ACCOUNT_NAME=db2dw-gke-service-account
Create a service account:
gcloud iam service-accounts create $GKE_SERVICE_ACCOUNT_NAME \
    --display-name=$GKE_SERVICE_ACCOUNT_NAME
Create an environment variable that stores the service account email account name:
export GKE_SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
    --format='value(email)' \
    --filter=displayName:"$GKE_SERVICE_ACCOUNT_NAME")
Bind the `compute.admin` role to the service account:
gcloud projects add-iam-policy-binding \
    $(gcloud config get-value project 2> /dev/null) \
    --member serviceAccount:$GKE_SERVICE_ACCOUNT_EMAIL \
    --role roles/compute.admin
Preparing the GKE cluster
In this section, you launch the GKE cluster, grant permissions, and finish the cluster configuration.
Launch the GKE cluster
You can now create and launch the GKE cluster.
In Cloud Shell, create a regional GKE cluster with a single node in each zone:
gcloud container clusters create ibm-db2dw-demo \
    --enable-ip-alias \
    --image-type=ubuntu \
    --machine-type=n1-standard-16 \
    --metadata disable-legacy-endpoints=true \
    --node-labels=app=db2wh \
    --node-locations us-central1-a,us-central1-b,us-central1-c \
    --no-enable-basic-auth \
    --no-issue-client-certificate \
    --num-nodes=1 \
    --region us-central1 \
    --service-account=$GKE_SERVICE_ACCOUNT_EMAIL
This creates a cluster named `ibm-db2dw-demo`.
Because you're creating this cluster with just one node pool (the default one), all of the nodes of this cluster will be eligible to run IBM Db2 Warehouse workloads. (Only labeled nodes are eligible to host IBM Db2 Warehouse pods.) If you want more separation—for example, you want dedicated nodes for IBM Db2 Warehouse—you can either create a new node pool or a dedicated cluster.
Manage Docker Store authentication
In this tutorial, you create a secret to store your Docker Store credentials, so that your GKE cluster can download the IBM Db2 Warehouse Docker image from the Docker Store. For more details, see the relevant section of the Kubernetes documentation. This approach is also valid for private Docker registry instances.
In Cloud Shell, log in to the Docker Store (the Docker registry instance you're going to use):
docker login
Create a Kubernetes secret with your Docker Store credentials:
kubectl create secret generic dockerstore \
    --type=kubernetes.io/dockerconfigjson \
    --from-file=.dockerconfigjson="$HOME"/.docker/config.json
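The `--from-file` flag stores your local Docker client configuration in the secret. After `docker login` (with no credential helper configured), `~/.docker/config.json` has roughly the following shape, where `auth` is the base64 encoding of `username:password` (the value below is a placeholder):

```json
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "BASE64_OF_USERNAME_COLON_PASSWORD"
    }
  }
}
```

The pod specs in the tutorial's manifests can then reference the `dockerstore` secret through `imagePullSecrets`, which is how the kubelet authenticates when pulling the IBM Db2 Warehouse images.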
Grant Cluster Admin privileges to the user
You need to grant your user the ability to create new roles in GKE, as described in the GKE documentation.
In Cloud Shell, grant the permission to create new roles to your user:
kubectl create clusterrolebinding cluster-admin-binding \
    --clusterrole cluster-admin \
    --user $(gcloud config list \
    --format 'value(core.account)')
Deploy the NFS-Client provisioner
In this tutorial, you deploy an NFS-Client provisioner in the cluster. The provisioner takes care of initializing PersistentVolumes in the Filestore instance to back PersistentVolumeClaims.
In Cloud Shell, create a service account to manage NFS resources:
kubectl apply -f solutions-db2wh/nfs/rbac.yaml
Deploy the NFS-Client provisioner:
kubectl apply -f solutions-db2wh/nfs/deployment.yaml
Create a StorageClass to back PersistentVolumeClaims with NFS volumes:
kubectl apply -f solutions-db2wh/nfs/class.yaml
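For reference, `class.yaml` defines a StorageClass that delegates volume creation to the NFS-Client provisioner. The following is an illustrative sketch only; the names and the provisioner string are assumptions for this example and must match what the repo's `deployment.yaml` actually configures:

```yaml
# Illustrative sketch only -- the tutorial's class.yaml is authoritative.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs-client              # the name PersistentVolumeClaims refer to
provisioner: example.com/nfs    # must match the provisioner name in deployment.yaml
parameters:
  archiveOnDelete: "false"      # reclaim space instead of archiving released volumes
```

Any claim that names this StorageClass is then served by the NFS-Client provisioner instead of the cluster's default provisioner.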
Wait for the NFS-Client provisioner pods to be reported as `Running`:
kubectl get pods --watch
When they're running, you see `Running` in the output:
nfs-client-provisioner   1/1   Running   0   10s
Create the nodes file
You can now create a configuration file that IBM Db2 Warehouse needs in order to bootstrap each instance.
In Cloud Shell, create the `nodes` file:
kubectl get nodes -o=jsonpath="{range \
.items[?(@.metadata.labels.app=='db2wh')]}\
{.metadata.name}{':'}{.status.addresses[?(@.type=='InternalIP')]\
.address}{\"\n\"}{end}" | sed '1s/^/head_node=/' | \
sed -e '2,$ s/^/data_node=/' > nodes
Create a ConfigMap that contains the `nodes` file:
kubectl create configmap db2wh-nodes --from-file=nodes
Deploying IBM Db2 Warehouse pods
You now create all the GKE pods that are required in order to run IBM Db2 Warehouse.
In Cloud Shell, create the PersistentVolumeClaim. The PersistentVolumeClaim object enables the cluster to mount the NFS storage as a PersistentVolume on multiple pods at the same time.
kubectl apply -f solutions-db2wh/persistent-volume-claim.yaml
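The important detail in `persistent-volume-claim.yaml` is the access mode: because all three IBM Db2 Warehouse pods mount the same NFS share, the claim must request `ReadWriteMany`, which NFS supports. The sketch below is illustrative only; the claim name, StorageClass name, and size are assumptions for this example:

```yaml
# Illustrative sketch only -- the tutorial's persistent-volume-claim.yaml is authoritative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db2wh-data
spec:
  storageClassName: nfs-client   # StorageClass backed by the NFS-Client provisioner
  accessModes:
    - ReadWriteMany              # lets multiple pods mount the volume read-write
  resources:
    requests:
      storage: 1Ti
```

A claim with `ReadWriteOnce` would instead restrict the volume to pods on a single node, which would break the shared-storage design of this cluster.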
Run a job that copies the `nodes` file to the NFS volume:
kubectl apply -f solutions-db2wh/nodes-file-deploy-job.yaml
Verify that the `nodes` file deployment job has run:
kubectl get jobs --watch
The job has run when `nodes-config` is reported as `Successful`:
NAME           DESIRED   SUCCESSFUL   AGE
nodes-config   1         1            19s
Deploy a LoadBalancer Service to allow access to the IBM Db2 Warehouse administration console:
kubectl apply -f solutions-db2wh/service.yaml
Wait for the load balancer service named `db2wh-ext` to be assigned an external IP address:
kubectl get services --watch
In the output, you see an IP address for `CLUSTER-IP` and `EXTERNAL-IP`:
NAME        TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)                          AGE
db2wh-ext   LoadBalancer   yy.yy.yy.yy   xx.xx.xx.xx   8443:30973/TCP,50000:30613/TCP   7s
Deploy the StatefulSet to start the IBM Db2 Warehouse pods:
kubectl apply -f solutions-db2wh/statefulset.yaml
Verify that the IBM Db2 Warehouse pods (`db2wh-0`, `db2wh-1`, and `db2wh-2`) are running:
kubectl get pods --watch
This might take a few minutes.
The pods are running when you see the status `Running` for all of the pods:
db2wh-1   0/1   Running   0   3m
db2wh-2   0/1   Running   0   3m
db2wh-0   0/1   Running   0   3m
Create an environment variable that stores the IP address of the node that's running the IBM Db2 Warehouse head node:
HEAD_NODE_IP=$(grep "head_node" nodes | awk -F ':' '{print $2}')
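The `nodes` file lines have the form `head_node=<node-name>:<internal-IP>`, so splitting on `:` with `awk` yields the IP address in the second field. You can convince yourself with a throwaway sample file (the node name and IPs below are placeholders):

```shell
# Write a sample nodes file with placeholder node names and IPs.
printf 'head_node=gke-node-a:10.128.0.2\ndata_node=gke-node-b:10.128.0.3\n' > /tmp/nodes-sample

# Same grep/awk expression as the step above, pointed at the sample file:
grep "head_node" /tmp/nodes-sample | awk -F ':' '{print $2}'   # prints 10.128.0.2
```

The first field (`head_node=gke-node-a`) is discarded; only the internal IP is kept, which the next step matches against each pod's `hostIP`.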
Create an environment variable that stores the head node pod name:
HEAD_NODE_POD_NAME=$(kubectl get pods \
    --field-selector=status.phase=Running -o=jsonpath="{range \
.items[?(@.metadata.labels.app=='db2wh')]} \
{.metadata.name}{':'}{.status.\
hostIP}{'\n'}{end}" | grep $HEAD_NODE_IP | awk -F ':' '{print $1}')
Check the logs of one of the pods to ensure that the bootstrap process is running without problems:
kubectl exec -it $HEAD_NODE_POD_NAME -- status --check-startup
This might take 40 to 60 minutes, during which time you might see some errors reported. You can ignore them for this tutorial.
The process is running correctly when you see the status `running successfully` in the output:
HA Management up and running successfully!
Successfully started IBM Db2 Warehouse service stack!
Set up the administration console password:
DB2_ADMIN_PASSWORD=$(openssl rand -hex 8)
kubectl exec -it $HEAD_NODE_POD_NAME -- setpass ${DB2_ADMIN_PASSWORD}
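`openssl rand -hex 8` draws 8 random bytes and hex-encodes them, so the generated password is always a 16-character lowercase hexadecimal string:

```shell
# Generate a password the same way the step above does.
DB2_ADMIN_PASSWORD=$(openssl rand -hex 8)

# 8 random bytes, hex-encoded, always yield 16 characters from [0-9a-f].
echo "${#DB2_ADMIN_PASSWORD}"   # prints 16
```

The password lives only in the shell's environment variable, so keep the Cloud Shell session open (or note the value down) until you have displayed it in the upload procedure later in this tutorial.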
Testing your deployment
You've finished configuring the pods, so you can now test the deployment.
Deploy the IBM Db2 Warehouse Client container
In order to upload data to IBM Db2 Warehouse, you now deploy the Client container and map the sample data to it using a Kubernetes ConfigMap.
In Cloud Shell, create a ConfigMap that contains the sample data:
kubectl create configmap sample-data \
    --from-file=solutions-db2wh/sample-data/nyc-wifi-locs.csv \
    --from-file=solutions-db2wh/sample-data/sample-table.sql
Create the Deployment to start the IBM Db2 Warehouse Client container:
kubectl apply -f solutions-db2wh/client.yaml
Verify that the IBM Db2 Warehouse Client pod is running:
kubectl get pods --watch
This might take a few minutes.
The pod is running when you see `Running` in the status:
db2wh-client-xxxxx-xxxx   1/1   Running   0   3m
Upload sample data
To help you test the deployment, you upload sample data to the IBM Db2 Warehouse server.
In Cloud Shell, display the password created earlier:
echo $DB2_ADMIN_PASSWORD
Create an environment variable that stores the IBM Db2 Warehouse Client container name:
CLIENT_CONTAINER_NAME=$(kubectl get pods -l app=db2wh-client -o=jsonpath='{.items[0].metadata.name}')
Open a shell window in the Client container:
kubectl exec -it $CLIENT_CONTAINER_NAME -- cli
Create an environment variable that stores the password, where `[PASSWORD]` is the password you got earlier in this procedure:
DB_PASSWORD=[PASSWORD]
Create an environment variable that stores the database alias:
DB_ALIAS=BLUDB
`BLUDB` is the default database name in IBM Db2 Warehouse.
Create an environment variable that stores the database hostname:
DB_HOST=db2wh-ext.default.svc.cluster.local
Set up the database catalog:
db_catalog --add $DB_HOST --alias $DB_ALIAS
Create a table to hold the sample data in the IBM Db2 Warehouse server:
dbsql -f /sample-table.sql -d $DB_ALIAS -h $DB_HOST -u bluadmin -W $DB_PASSWORD
Upload data to the IBM Db2 Warehouse server:
dbload -verbose -host $DB_HOST -u bluadmin \
    -pw $DB_PASSWORD -db $DB_ALIAS -schema BLUADMIN \
    -t NYC_FREE_PUBLIC_WIFI -df /nyc-wifi-locs.csv -delim ',' \
    -quotedValue DOUBLE -timeStyle 12HOUR -skipRows 1
Close the IBM Db2 Warehouse Client shell:
exit
Validate the data using the administration console
You now connect to the IBM Db2 Warehouse administration console and verify the data that you uploaded.
In Cloud Shell, find the service's external IP address:
kubectl get svc db2wh-ext
Open a browser and go to the following URL, where `[EXTERNAL_IP]` is the IP address from the preceding step:
https://[EXTERNAL_IP]:8443
You can bypass the security warning.
Log in with the following credentials:
- Username: `bluadmin`
- Password: (the password you created in the previous procedure)
If you accept the IBM Db2 Warehouse EULA, click Accept.
On the left-hand side, open the menu and then select Administer > Tables:
Close the Quick Tour popup.
Click NYC_FREE_PUBLIC_WIFI:
Click the Data Distribution tab and make sure the table is populated:
You see 2871 rows in total, which is the entire dataset.
Click Generate SQL.
Select SELECT statement.
Click OK.
The Generate SQL tab opens and is pre-populated with an auto-generated `SELECT` statement.
Add a `LIMIT` clause to the auto-generated `SELECT` statement to limit the results to the first five records:
SELECT "THE_GEOM", "OBJECTID", "BORO", "TYPE", "PROVIDER", "NAME",
    "LOCATION", "LAT", "LON", "X", "Y", "LOCATION_T", "REMARKS",
    "CITY", "SSID", "SOURCEID", "ACTIVATED", "BOROCODE", "BORONAME",
    "NTACODE", "NTANAME", "COUNDIST", "POSTCODE", "BOROCD", "CT2010",
    "BOROCT2010", "BIN", "BBL", "DOITT_ID"
FROM "BLUADMIN"."NYC_FREE_PUBLIC_WIFI"
LIMIT 5;
Click Run and then select Run All.
A listing of records is displayed in the Result Set tab, showing that you successfully uploaded the sample data.
Clean up
To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.
Delete the project
- In the console, go to the Manage resources page.
- In the project list, select the project that you want to delete, and then click Delete.
- In the dialog, type the project ID, and then click Shut down to delete the project.
What's next
- Read other documents in this series:
- Read about IBM Db2 Warehouse architecture.
- Read about IBM Db2 Warehouse requirements.
- Explore reference architectures, diagrams, tutorials, and best practices about Google Cloud. Take a look at our Cloud Architecture Center.