Migrate MySQL data from Persistent Disk to Hyperdisk in GKE

This tutorial shows you how to migrate existing MySQL data on Google Kubernetes Engine (GKE) from Persistent Disk (PD) to Hyperdisk to improve storage performance. Hyperdisk offers higher IOPS and throughput than Persistent Disk, which can improve MySQL performance by reducing latency for database queries and transactions. You can use disk snapshots to migrate data to a different disk type, depending on machine type compatibility. For example, Hyperdisk volumes are compatible only with some third-generation, fourth-generation, and later machine types, such as N4, which don't support Persistent Disk. For more information, see Available machine families.
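
For example, a quick way to check which Hyperdisk variants are offered in a given zone (here, the us-central1-a zone that this tutorial uses later) is the gcloud CLI; this optional check is not part of the migration steps:

  # List the Hyperdisk disk types that are available in the zone.
  gcloud compute disk-types list --zones=us-central1-a --filter="name~hyperdisk"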

To demonstrate the migration from Persistent Disk to Hyperdisk, this tutorial uses the Sakila database as the sample dataset. Sakila is a sample database provided by MySQL that you can use as a schema for tutorials and examples. It represents a fictional DVD rental store and includes tables for films, actors, customers, and rentals.

This guide is intended for storage specialists and storage administrators who create and allocate storage and who manage data security and data access. To learn more about common roles and example tasks referenced in Google Cloud content, see Common GKE user roles and tasks.

Deployment architecture

The following diagram shows the migration process from Persistent Disk to Hyperdisk.

  • The MySQL application runs on a GKE node pool with the N2 machine type and stores its data on a Persistent Disk SSD.
  • To ensure data consistency, the application is scaled down to prevent new writes.
  • A snapshot of the Persistent Disk is created as a complete point-in-time backup of the data.
  • A new Hyperdisk is provisioned from the snapshot, and a new MySQL instance is deployed in a separate, Hyperdisk-compatible N4 node pool. The new instance is attached to the newly created Hyperdisk, which completes the migration to higher-performance storage.
Architecture diagram: shows how MySQL data is migrated from Persistent Disk to Hyperdisk by using a snapshot.
Figure 1: Migrating MySQL data from Persistent Disk to Hyperdisk by using a snapshot.

Objectives

In this tutorial, you learn how to perform the following tasks:

  • Deploy a MySQL cluster.
  • Upload a test dataset.
  • Create a snapshot of the data.
  • Create a Hyperdisk from the snapshot.
  • Start a new MySQL cluster in the Hyperdisk-enabled node pool with the N4 machine type.
  • Verify data integrity to confirm that the migration succeeded.

Costs

In this document, you use the following billable components of Google Cloud:

  • GKE
  • Compute Engine, which includes:
    • Storage capacity provisioned for both Persistent Disk and Hyperdisk.
    • Storage costs for the snapshots.

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Roles required to select or create a project

    • Select a project: Selecting a project doesn't require a specific IAM role—you can select any project that you've been granted a role on.
    • Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Compute Engine, GKE, Identity and Access Management Service Account Credentials APIs.

    Roles required to enable APIs

    To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

    Enable the APIs
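
    Alternatively, if you prefer the command line, a sketch like the following enables the same APIs from Cloud Shell (it assumes that your default project is already set):

      # Optional: enable the required APIs with the gcloud CLI instead of the console.
      gcloud services enable compute.googleapis.com \
          container.googleapis.com \
          iamcredentials.googleapis.com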

  5. Make sure that you have the following role or roles on the project: roles/container.admin, roles/iam.serviceAccountAdmin, roles/compute.admin

    Check for the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.

    4. For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.

    Grant the roles

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM
    2. Select the project.
    3. Click Grant access.
    4. In the New principals field, enter your user identifier. This is typically the email address for a Google Account.

    5. In the Select a role list, select a role.
    6. To grant additional roles, click Add another role and add each additional role.
    7. Click Save.
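
    If you prefer the gcloud CLI, a sketch like the following grants the same roles; PROJECT_ID and EMAIL_ADDRESS are placeholders for your project ID and your Google Account email address:

      # Grant each required role to your user account (replace the placeholders first).
      for role in roles/container.admin roles/iam.serviceAccountAdmin roles/compute.admin; do
        gcloud projects add-iam-policy-binding PROJECT_ID \
            --member="user:EMAIL_ADDRESS" \
            --role="${role}"
      done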

Set up Cloud Shell

      1. In the Google Cloud console, activate Cloud Shell.

        Activate Cloud Shell

        A Cloud Shell session starts and displays a command-line prompt. The session can take a few seconds to initialize.

      2. Set your default project:

          gcloud config set project PROJECT_ID
        

        Replace PROJECT_ID with your project ID.

Prepare the environment

        1. In Cloud Shell, set environment variables for your project, location, and cluster prefix:

          export PROJECT_ID=PROJECT_ID
          export EMAIL_ADDRESS=EMAIL_ADDRESS
          export KUBERNETES_CLUSTER_PREFIX=offline-hyperdisk-migration
          export LOCATION=us-central1-a
          

          Replace the following:

          • PROJECT_ID: your Google Cloud project ID.
          • EMAIL_ADDRESS: your email address.
          • LOCATION: the zone where you want to create your deployment resources. For this tutorial, use the us-central1-a zone.
        2. Clone the sample repository from GitHub:

          git clone https://github.com/GoogleCloudPlatform/kubernetes-engine-samples
          
        3. Go to the offline-hyperdisk-migration directory to start creating deployment resources:

          cd kubernetes-engine-samples/databases/offline-hyperdisk-migration
          

Create a GKE cluster and node pools

        For simplicity, this tutorial uses a zonal cluster, because Hyperdisk volumes are zonal resources that are accessible only within a single zone.

        1. Create a zonal GKE cluster:

          gcloud container clusters create ${KUBERNETES_CLUSTER_PREFIX}-cluster \
              --location ${LOCATION} \
              --node-locations ${LOCATION} \
              --shielded-secure-boot \
              --shielded-integrity-monitoring \
              --machine-type "e2-micro" \
              --num-nodes "1"
          
        2. Add a node pool with the N2 machine type for the initial MySQL deployment:

          gcloud container node-pools create regular-pool \
              --cluster ${KUBERNETES_CLUSTER_PREFIX}-cluster \
              --machine-type n2-standard-4 \
              --location ${LOCATION} \
              --num-nodes 1
          
        3. Add a node pool with the N4 machine type, to which the MySQL deployment will be migrated and where it will run on Hyperdisk:

          gcloud container node-pools create hyperdisk-pool \
              --cluster ${KUBERNETES_CLUSTER_PREFIX}-cluster \
              --machine-type n4-standard-4 \
              --location ${LOCATION} \
              --num-nodes 1
          
        4. Connect to the cluster:

          gcloud container clusters get-credentials ${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${LOCATION}
          
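        After connecting, you can optionally confirm that both node pools exist and that their nodes report the expected machine types. This check is an addition to the original steps:

          # List the node pools and show each node's machine-type label.
          gcloud container node-pools list --cluster ${KUBERNETES_CLUSTER_PREFIX}-cluster --location ${LOCATION}
          kubectl get nodes -L node.kubernetes.io/instance-type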

Deploy MySQL on Persistent Disk

        In this section, you deploy a MySQL instance that uses a Persistent Disk for storage, and you load it with sample data.

        1. Create and apply a StorageClass for Hyperdisk. This StorageClass is used later in the tutorial:

          apiVersion: storage.k8s.io/v1
          kind: StorageClass
          metadata:
            name: balanced-storage
          provisioner: pd.csi.storage.gke.io
          volumeBindingMode: WaitForFirstConsumer
          allowVolumeExpansion: true
          parameters:
            type: hyperdisk-balanced
            provisioned-throughput-on-create: "250Mi"
            provisioned-iops-on-create: "7000"

          kubectl apply -f manifests/01-storage-class/storage-class-hdb.yaml
          
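          Optionally, confirm that the StorageClass was created before you continue; this check is an addition to the original steps:

          kubectl get storageclass balanced-storage
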
        2. Create and deploy the MySQL instance with node affinity to ensure that the Pod is scheduled on the regular-pool nodes, and provision a Persistent Disk SSD volume:

          apiVersion: v1
          kind: Service
          metadata:
            name: regular-mysql
            labels:
              app: mysql
          spec:
            ports:
              - port: 3306
            selector:
              app: mysql
            clusterIP: None
          ---
          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            name: mysql-pv-claim
            labels:
              app: mysql
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 30Gi
            storageClassName: premium-rwo
          ---
          apiVersion: apps/v1
          kind: Deployment
          metadata:
            name: existing-mysql
            labels:
              app: mysql
          spec:
            selector:
              matchLabels:
                app: mysql
            strategy:
              type: Recreate
            template:
              metadata:
                labels:
                  app: mysql
              spec:
                containers:
                - image: mysql:8.0
                  name: mysql
                  env:
                  - name: MYSQL_ROOT_PASSWORD
                    value: migration
                  - name: MYSQL_DATABASE
                    value: mysql
                  - name: MYSQL_USER
                    value: app
                  - name: MYSQL_PASSWORD
                    value: migration
                  ports:
                  - containerPort: 3306
                    name: mysql
                  volumeMounts:
                  - name: mysql-persistent-storage
                    mountPath: /var/lib/mysql
                affinity: 
                  nodeAffinity:
                    preferredDuringSchedulingIgnoredDuringExecution:
                    - weight: 1
                      preference:
                        matchExpressions:
                        - key: "node.kubernetes.io/instance-type"
                          operator: In
                          values:
                          - "n2-standard-4"
                volumes:
                - name: mysql-persistent-storage
                  persistentVolumeClaim:
                    claimName: mysql-pv-claim

          kubectl apply -f manifests/02-mysql/mysql-deployment.yaml
          

          This manifest creates a MySQL Deployment and Service, and dynamically provisions a Persistent Disk for data storage. The password for the root user is migration.
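
          Optionally, before you load data, confirm that the Pod is running on a regular-pool node and that the PersistentVolumeClaim is bound; this check is an addition to the original steps:

          kubectl get pods -l app=mysql -o wide
          kubectl get pvc mysql-pv-claim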

        3. Deploy a MySQL client Pod to load the data and to verify the data migration:

          apiVersion: v1
          kind: Pod
          metadata:
            name: mysql-client
          spec:
            containers:
            - name: main
              image: mysql:8.0
              command: ["sleep", "360000"]
              resources:
                requests:
                  memory: 1Gi
                  cpu: 500m
                limits:
                  memory: 1Gi
                  cpu: "1"
              env:
              - name: MYSQL_ROOT_PASSWORD
                value: migration

          kubectl apply -f manifests/02-mysql/mysql-client.yaml
          kubectl wait pods mysql-client --for condition=Ready --timeout=300s
          
        4. Connect to the client Pod:

          kubectl exec -it mysql-client -- bash
          
        5. In the client Pod shell, download and import the Sakila sample dataset:

          # Download the dataset
          curl --output dataset.tgz "https://downloads.mysql.com/docs/sakila-db.tar.gz"
          
          # Extract the dataset
          tar -xvzf dataset.tgz -C /home/mysql
          
          # Import the dataset into MySQL (the password is "migration").
          mysql -u root -h regular-mysql.default -p
              SOURCE /home/mysql/sakila-db/sakila-schema.sql;
              SOURCE /home/mysql/sakila-db/sakila-data.sql;
          
        6. Verify that the data was imported:

          USE sakila;
          SELECT table_name, table_rows FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sakila';
          

          The output shows the list of tables with their row counts.

          +----------------------------+------------+
          | TABLE_NAME                 | TABLE_ROWS |
          +----------------------------+------------+
          | actor                      |        200 |
          | actor_info                 |       NULL |
          | address                    |        603 |
          | category                   |         16 |
          | city                       |        600 |
          | country                    |        109 |
          | customer                   |        599 |
          | customer_list              |       NULL |
          | film                       |       1000 |
          | film_actor                 |       5462 |
          | film_category              |       1000 |
          | film_list                  |       NULL |
          | film_text                  |       1000 |
          | inventory                  |       4581 |
          | language                   |          6 |
          | nicer_but_slower_film_list |       NULL |
          | payment                    |      16086 |
          | rental                     |      16419 |
          | sales_by_film_category     |       NULL |
          | sales_by_store             |       NULL |
          | staff                      |          2 |
          | staff_list                 |       NULL |
          | store                      |          2 |
          +----------------------------+------------+
          23 rows in set (0.01 sec)
          
        7. Exit the mysql session:

          exit;
          
        8. Exit the client Pod shell:

          exit
          
        9. Get the name of the PersistentVolume (PV) that was created for MySQL, and store it in an environment variable:

          export PV_NAME=$(kubectl get pvc mysql-pv-claim -o jsonpath='{.spec.volumeName}')
          
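          As an optional sanity check that is not part of the original steps, you can confirm that the variable points to the expected pd-ssd disk:

          echo ${PV_NAME}
          gcloud compute disks describe ${PV_NAME} --zone=${LOCATION} --format="value(name,type,sizeGb)"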

Migrate the data to a Hyperdisk volume

        You now have a MySQL workload whose data is stored on a Persistent Disk SSD volume. This section describes how to migrate that data to a Hyperdisk volume by using a snapshot. This migration approach also preserves the original Persistent Disk volume, so you can roll back to the original MySQL instance if necessary.

        1. Although you can create a snapshot from a disk without detaching it from the workload, to ensure data consistency for MySQL you must stop any new writes to the disk while the snapshot is being created. Scale the MySQL Deployment down to 0 replicas to stop writes:

          kubectl scale deployment existing-mysql --replicas=0
          
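          Optionally, wait until the MySQL Pod has fully terminated before you take the snapshot; this extra check is not in the original steps:

          kubectl wait --for=delete pod -l app=mysql --timeout=120s
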
        2. Create a snapshot from the existing Persistent Disk:

          gcloud compute disks snapshot ${PV_NAME} --zone=${LOCATION} --snapshot-names=original-snapshot --description="snapshot taken from pd-ssd"
          
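          Optionally, confirm that the snapshot is ready before you create the new disk; this check is an addition to the original steps:

          gcloud compute snapshots describe original-snapshot --format="value(name,status,diskSizeGb)"
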
        3. Create a new Hyperdisk volume named mysql-recovery from the snapshot:

          gcloud compute disks create mysql-recovery --project=${PROJECT_ID} \
              --type=hyperdisk-balanced \
              --size=150GB --zone=${LOCATION} \
              --source-snapshot=projects/${PROJECT_ID}/global/snapshots/original-snapshot
          
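          Optionally, verify that the new Hyperdisk volume was created with the expected type and size; this check is an addition to the original steps:

          gcloud compute disks describe mysql-recovery --zone=${LOCATION} --format="value(name,type,sizeGb,status)"
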
        4. Update the manifest file for the restored PV with your project ID:

          apiVersion: v1
          kind: PersistentVolume
          metadata:
            name: backup
          spec:
            storageClassName: balanced-storage
            capacity:
              storage: 150G
            accessModes:
              - ReadWriteOnce
            claimRef:
              name: hyperdisk-recovery
              namespace: default
            csi:
              driver: pd.csi.storage.gke.io
              volumeHandle: projects/PRJCTID/zones/us-central1-a/disks/mysql-recovery
              fsType: ext4
          ---
          apiVersion: v1
          kind: PersistentVolumeClaim
          metadata:
            namespace: default
            name: hyperdisk-recovery
          spec:
            storageClassName: balanced-storage
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 150G

          sed -i "s/PRJCTID/$PROJECT_ID/g" manifests/02-mysql/restore_pv.yaml
          
        5. Create the PersistentVolume (PV) and PersistentVolumeClaim (PVC) from the new Hyperdisk:

          kubectl apply -f manifests/02-mysql/restore_pv.yaml
          
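          Optionally, confirm that the PersistentVolume and PersistentVolumeClaim were created. Because the StorageClass uses the WaitForFirstConsumer binding mode, the claim might not report Bound until the new MySQL Pod is scheduled:

          kubectl get pv backup
          kubectl get pvc hyperdisk-recovery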

Verify the data migration

        Deploy a new MySQL instance that uses the newly created Hyperdisk volume. This Pod is scheduled on the hyperdisk-pool node pool, which consists of N4 nodes.

        1. Deploy the new MySQL instance:

          apiVersion: v1
          kind: Service
          metadata:
            name: recovered-mysql
            labels:
              app: new-mysql
          spec:
            ports:
              - port: 3306
            selector:
              app: new-mysql
            clusterIP: None
          ---
          apiVersion: apps/v1
          kind: Deployment
          metadata:
            name: new-mysql
            labels:
              app: new-mysql
          spec:
            selector:
              matchLabels:
                app: new-mysql
            strategy:
              type: Recreate
            template:
              metadata:
                labels:
                  app: new-mysql
              spec:
                containers:
                - image: mysql:8.0
                  name: mysql
                  env:
                  - name: MYSQL_ROOT_PASSWORD
                    value: migration
                  - name: MYSQL_DATABASE
                    value: mysql
                  - name: MYSQL_USER
                    value: app
                  - name: MYSQL_PASSWORD
                    value: migration
                  ports:
                  - containerPort: 3306
                    name: mysql
                  volumeMounts:
                  - name: mysql-persistent-storage
                    mountPath: /var/lib/mysql
                affinity: 
                  nodeAffinity:
                    preferredDuringSchedulingIgnoredDuringExecution:
                    - weight: 1
                      preference:
                        matchExpressions:
                        - key: "cloud.google.com/gke-nodepool"
                          operator: In
                          values:
                          - "hyperdisk-pool"      
                volumes:
                - name: mysql-persistent-storage
                  persistentVolumeClaim:
                    claimName: hyperdisk-recovery

          kubectl apply -f manifests/02-mysql/recovery_mysql_deployment.yaml
          
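          Optionally, confirm that the new Pod is running on a node in the hyperdisk-pool node pool and that it uses the hyperdisk-recovery claim; this check is an addition to the original steps:

          kubectl get pods -l app=new-mysql -o wide
          kubectl get pvc hyperdisk-recovery
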
        2. To verify data integrity, connect to the MySQL client Pod again:

          kubectl exec -it mysql-client -- bash
          
        3. In the client Pod, connect to the new MySQL database (recovered-mysql.default) and verify the data. The password is migration.

          mysql -u root -h recovered-mysql.default -p
          USE sakila;
          SELECT table_name, table_rows FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sakila';
          

          The data should be identical to the data in the original MySQL instance on the Persistent Disk volume.
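
          As an optional, non-interactive alternative that is not part of the original steps, you can run a similar query from the client Pod shell by using the mysql -e flag; this sketch assumes the migration password used throughout this tutorial:

          mysql -u root -h recovered-mysql.default -pmigration -e "SELECT table_name, table_rows FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'sakila';"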

        4. Exit the mysql session:

          exit;
          
        5. Exit the client Pod shell:

          exit
          

Clean up

        To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

        1. In the Google Cloud console, go to the Manage resources page.

          Go to Manage resources

        2. In the project list, select the project that you want to delete, and then click Delete.
        3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

        If you are using an existing project and you don't want to delete it, delete the individual resources:

        1. Set environment variables for cleanup, and retrieve the name of the Persistent Disk volume that was created by the mysql-pv-claim PersistentVolumeClaim:

          export PROJECT_ID=PROJECT_ID
          export KUBERNETES_CLUSTER_PREFIX=offline-hyperdisk-migration
          export LOCATION=us-central1-a
          export PV_NAME=$(kubectl get pvc mysql-pv-claim -o jsonpath='{.spec.volumeName}')
          

          Replace PROJECT_ID with your project ID.

        2. Delete the snapshot:

          gcloud compute snapshots delete original-snapshot --quiet
          
        3. Delete the GKE cluster:

          gcloud container clusters delete ${KUBERNETES_CLUSTER_PREFIX}-cluster --location=${LOCATION} --quiet
          
        4. Delete the Persistent Disk volume and the Hyperdisk volume:

          gcloud compute disks delete ${PV_NAME} --zone=${LOCATION} --quiet
          gcloud compute disks delete mysql-recovery --zone=${LOCATION} --quiet
          

What's next