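GKE on Bare Metal quickstart

Introduction to GKE on Bare Metal

With GKE on Bare Metal, you can define four types of clusters:

  • admin - A cluster that manages user clusters.
  • user - A cluster that runs workloads.
  • standalone - A single cluster that manages itself and runs workloads, but cannot create or manage other user clusters.
  • hybrid - A single cluster that handles both administration and workloads, and can also manage other user clusters.

In this quickstart, you use GKE on Bare Metal to deploy a two-node hybrid cluster. You learn how to create the cluster and how to monitor the cluster creation process.

This quickstart assumes that you have a basic understanding of Kubernetes.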

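Prepare for GKE on Bare Metal

Before you create a cluster in GKE on Bare Metal, you must do the following:

  1. Create a Google Cloud project
  2. Configure the admin workstation

Create a Google Cloud project

In this quickstart, you create a new Google Cloud project to organize all your Google Cloud resources. To create a cluster in GKE on Bare Metal, you need a Google Cloud project in which your account has the Owner role.

For details, see Creating and managing projects.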

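Configure a Linux admin workstation

This quickstart uses bmctl and kubectl to create and work with the cluster. These command-line tools run on a Linux admin workstation. For information about setting up the admin workstation, see Admin workstation prerequisites.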

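Create the cluster nodes

Create two machines to serve as the nodes of your cluster:

  • One machine serves as the control plane node.
  • One machine serves as the worker node.

For details about the requirements for cluster nodes, see Cluster node machine prerequisites.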

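Create the cluster

To create the cluster:

  1. Use bmctl to create a configuration file.
  2. Edit the configuration file to customize it for your cluster and network.
  3. Use bmctl to create the cluster from the configuration file.

Create a configuration file

To create the configuration file and automatically enable service accounts and APIs, make sure you are in the baremetal directory, then run the bmctl command with the following flags: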

./bmctl create config -c CLUSTER_NAME \
  --enable-apis --create-service-accounts --project-id=PROJECT_ID

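CLUSTER_NAME is the name of your cluster. PROJECT_ID is the ID of the project you created in Create a Google Cloud project.

The preceding command creates a configuration file under the baremetal directory at bmctl-workspace/CLUSTER_NAME/CLUSTER_NAME.yaml. For a cluster named cluster1, the path is bmctl-workspace/cluster1/cluster1.yaml.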
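
The path convention above can be captured in a small helper, which is handy when you script later steps. This is a minimal sketch: cluster_config_path and config_exists are hypothetical helper names, not bmctl features, and the paths are assumed to be relative to the baremetal directory.

```shell
# Hypothetical helper (not part of bmctl): path of the generated config file,
# relative to the baremetal directory, for a given cluster name.
cluster_config_path() {
  printf 'bmctl-workspace/%s/%s.yaml\n' "$1" "$1"
}

# Succeed only when the config file for the named cluster already exists.
config_exists() {
  [ -f "$(cluster_config_path "$1")" ]
}

cluster_config_path cluster1   # prints bmctl-workspace/cluster1/cluster1.yaml
```

After running bmctl create config, config_exists cluster1 gives a quick confirmation that the file was written where you expect.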

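Edit the configuration file

To edit the configuration file:

  1. Open the bmctl-workspace/cluster1/cluster1.yaml configuration file in an editor.
  2. Edit the file for your specific node and network requirements. Use the following sample configuration file as a reference. This quickstart does not use or include OpenID Connect (OIDC) information. The sample lists two worker node addresses; this quickstart uses a single worker node, so list only the machines you created.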
# gcrKeyPath: <path to GCR service account key>
gcrKeyPath: baremetal/gcr.json
# sshPrivateKeyPath: <path to SSH private key, used for node access>
sshPrivateKeyPath: .ssh/id_rsa
# gkeConnectAgentServiceAccountKeyPath: <path to Connect agent service account key>
gkeConnectAgentServiceAccountKeyPath: baremetal/connect-agent.json
# gkeConnectRegisterServiceAccountKeyPath: <path to Hub registration service account key>
gkeConnectRegisterServiceAccountKeyPath: baremetal/connect-register.json
# cloudOperationsServiceAccountKeyPath: <path to Cloud Operations service account key>
cloudOperationsServiceAccountKeyPath: baremetal/cloud-ops.json
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-cluster1
---
# Cluster configuration. Note that some of these fields are immutable once the cluster is created.
# For more info, see https://cloud.google.com/anthos/clusters/docs/bare-metal/1.14/reference/cluster-config-ref#cluster_configuration_fields
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
spec:
  # Cluster type. This can be:
  #   1) admin:  to create an admin cluster. This can later be used to create user clusters.
  #   2) user:   to create a user cluster. Requires an existing admin cluster.
  #   3) hybrid: to create a hybrid cluster that runs admin cluster components and user workloads.
  #   4) standalone: to create a cluster that manages itself, runs user workloads, but does not manage other clusters.
  type: hybrid
  # Anthos cluster version.
  anthosBareMetalVersion: 1.14.11
  # GKE connect configuration
  gkeConnect:
    projectID: PROJECT_ID
  # Control plane configuration
  controlPlane:
    nodePoolSpec:
      nodes:
      # Control plane node pools. Typically, this is either a single machine
      # or 3 machines if using a high availability deployment.
      - address:  CONTROL_PLANE_NODE_IP
  # Cluster networking configuration
  clusterNetwork:
    # Pods specify the IP ranges from which pod networks are allocated.
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    # Services specify the network ranges from which service virtual IPs are allocated.
    # This can be any RFC 1918 range that does not conflict with any other IP range
    # in the cluster and node pool resources.
    services:
      cidrBlocks:
      - 10.96.0.0/20
  # Load balancer configuration
  loadBalancer:
    # Load balancer mode can be either 'bundled' or 'manual'.
    # In 'bundled' mode a load balancer will be installed on load balancer nodes during cluster creation.
    # In 'manual' mode the cluster relies on a manually-configured external load balancer.
    mode: bundled
    # Load balancer port configuration
    ports:
      # Specifies the port the load balancer serves the Kubernetes control plane on.
      # In 'manual' mode the external load balancer must be listening on this port.
      controlPlaneLBPort: 443
    # There are two load balancer virtual IP (VIP) addresses: one for the control plane
    # and one for the L7 Ingress service. The VIPs must be in the same subnet as the load balancer nodes.
    # These IP addresses do not correspond to physical network interfaces.
    vips:
      # ControlPlaneVIP specifies the VIP to connect to the Kubernetes API server.
      # This address must not be in the address pools below.
      controlPlaneVIP: CONTROL_PLANE_VIP
      # IngressVIP specifies the VIP shared by all services for ingress traffic.
      # Allowed only in non-admin clusters.
      # This address must be in the address pools below.
      ingressVIP: INGRESS_VIP
    # AddressPools is a list of non-overlapping IP ranges for the data plane load balancer.
    # All addresses must be in the same subnet as the load balancer nodes.
    # Address pool configuration is only valid for 'bundled' LB mode in non-admin clusters.
    # addressPools:
    # - name: pool1
    #   addresses:
    #   # Each address must be either in the CIDR form (1.2.3.0/24)
    #   # or range form (1.2.3.1-1.2.3.5).
    #   - LOAD_BALANCER_ADDRESS_POOL
    # A load balancer nodepool can be configured to specify nodes used for load balancing.
    # These nodes are part of the kubernetes cluster and run regular workloads as well as load balancers.
    # If the node pool config is absent then the control plane nodes are used.
    # Node pool configuration is only valid for 'bundled' LB mode.
    # nodePoolSpec:
    #   nodes:
    #   - address: LOAD_BALANCER_NODE_IP
  # Proxy configuration
  # proxy:
  #   url: http://[username:password@]domain
  #   # A list of IPs, hostnames or domains that should not be proxied.
  #   noProxy:
  #   - 127.0.0.1
  #   - localhost
  # Logging and Monitoring
  clusterOperations:
    # Cloud project for logs and metrics.
    projectID: PROJECT_ID
    # Cloud location for logs and metrics.
    location: us-central1
    # Whether collection of application logs/metrics should be enabled (in addition to
    # collection of system logs/metrics which correspond to system components such as
    # Kubernetes control plane or cluster management agents).
    # enableApplication: false
  # Storage configuration
  storage:
    # lvpNodeMounts specifies the config for local PersistentVolumes backed by mounted disks.
    # These disks need to be formatted and mounted by the user, which can be done before or after
    # cluster creation.
    lvpNodeMounts:
      # path specifies the host machine path where mounted disks will be discovered and a local PV
      # will be created for each mount.
      path: /mnt/localpv-disk
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-disks
    # lvpShare specifies the config for local PersistentVolumes backed by subdirectories in a shared filesystem.
    # These subdirectories are automatically created during cluster creation.
    lvpShare:
      # path specifies the host machine path where subdirectories will be created on each host. A local PV
      # will be created for each subdirectory.
      path: /mnt/localpv-share
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-shared
      # numPVUnderSharedPath specifies the number of subdirectories to create under path.
      numPVUnderSharedPath: 5
  # NodeConfig specifies the configuration that applies to all nodes in the cluster.
  nodeConfig:
    # podDensity specifies the pod density configuration.
    podDensity:
      # maxPodsPerNode specifies the maximum number of pods allowed on a single node.
      maxPodsPerNode: 250

---
# Node pools for worker nodes
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-cluster1
spec:
  clusterName: cluster1
  nodes:
  - address: WORKER_NODE_1_IP
  - address: WORKER_NODE_2_IP
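
Before you create the cluster, it can help to confirm that no placeholders remain in the edited file and that the VIPs respect the address-pool rules stated in the comments above (ingressVIP inside a pool, controlPlaneVIP outside). The following is a minimal sketch under stated assumptions: find_placeholders, ip_to_int, and ip_in_range are hypothetical helpers, not bmctl features; only the range form (START-END) of an address pool is handled, not the CIDR form; and the IP addresses are made up for illustration.

```shell
# Hypothetical sanity checks for the edited config file; not part of bmctl.

# Print non-comment lines that still contain an ALL_CAPS placeholder such as
# PROJECT_ID or CONTROL_PLANE_VIP. Exit status is 0 when any are found.
find_placeholders() {
  grep -n -E '[A-Z][A-Z0-9]*(_[A-Z0-9]+)+' "$1" |
    grep -v -E '^[0-9]+:[[:space:]]*#'
}

# Convert a dotted-quad IPv4 address to a single integer.
# Runs in a subshell so the IFS change does not leak to the caller.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed when address $1 lies inside range $2, written as START-END.
ip_in_range() (
  ip=$(ip_to_int "$1")
  start=$(ip_to_int "${2%-*}")
  end=$(ip_to_int "${2#*-}")
  [ "$ip" -ge "$start" ] && [ "$ip" -le "$end" ]
)

# Example values are made up for illustration.
if ip_in_range 10.200.0.50 10.200.0.50-10.200.0.70; then
  echo "ingressVIP is inside the address pool"
fi
```

Running find_placeholders bmctl-workspace/cluster1/cluster1.yaml prints each uncommented line that still needs a real value, prefixed with its line number. These checks do not replace the preflight checks that bmctl runs.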

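Run preflight checks and create the cluster

Before it creates a cluster, the bmctl command runs preflight checks on your cluster configuration file. If the checks succeed, bmctl creates the cluster.

To run preflight checks and create the cluster:

  1. Make sure you are in the baremetal directory.
  2. Create the cluster with the following command:

    ./bmctl create cluster -c CLUSTER_NAME

    For example:

    ./bmctl create cluster -c cluster1

    The bmctl command monitors the preflight checks and the cluster creation, displays output on the screen, and writes detailed information to the bmctl logs.

You can find the bmctl, preflight check, and node installation logs in the following directory: baremetal/bmctl-workspace/CLUSTER_NAME/log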

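The bmctl preflight checks verify the following conditions for the proposed cluster installation:

  • The Linux distribution and version are supported.
  • SELinux is not in enforcing mode.
  • On Ubuntu, Uncomplicated Firewall (UFW) is not enabled.
  • Google Container Registry is reachable.
  • The VIPs are available.
  • The cluster machines can connect to each other.
  • The load balancer machines are on the same Layer 2 subnet.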
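
Two of these conditions are easy to check by hand on a node before you run bmctl. The sketch below interprets the output of the standard getenforce and ufw status commands; selinux_ok and ufw_ok are hypothetical helper names, and the checks that bmctl itself performs remain authoritative.

```shell
# Hypothetical helpers that interpret command output; not part of bmctl.

# $1: output of `getenforce` (Enforcing, Permissive, or Disabled).
# Succeeds unless SELinux is in enforcing mode.
selinux_ok() {
  [ "$1" != "Enforcing" ]
}

# $1: output of `ufw status`, whose first line is "Status: active" when
# the firewall is enabled. Succeeds unless UFW is active.
ufw_ok() {
  case $1 in
    "Status: active"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Demonstration with literal strings; on a node you would pass real output.
selinux_ok "Permissive" && echo "SELinux mode is acceptable"
ufw_ok "Status: inactive" && echo "UFW is not enabled"
```

On a node, the calls would look like selinux_ok "$(getenforce)" and ufw_ok "$(sudo ufw status)", assuming those tools are installed on that distribution.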

Cluster creation can take several minutes to complete.

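Get information about your cluster

After the cluster is created successfully, use the kubectl command to display information about the new cluster. During cluster creation, the bmctl command writes a kubeconfig file that kubectl uses to query the cluster. The kubeconfig file is written to bmctl-workspace/CLUSTER_NAME/CLUSTER_NAME-kubeconfig.

For example: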

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get nodes

This command returns output similar to the following:

NAME      STATUS   ROLES    AGE   VERSION
node-01   Ready    master   16h   v1.17.8-gke.16
node-02   Ready    <none>   16h   v1.17.8-gke.16
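
When you script the check, you can filter the table for nodes that are not yet Ready. A minimal sketch, assuming the default column layout shown above; not_ready_nodes is a hypothetical helper name. A node whose STATUS is a combined value such as Ready,SchedulingDisabled is also reported, because the comparison is an exact match.

```shell
# Hypothetical helper: read `kubectl get nodes` table output on stdin and
# print the name of every node whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Demonstration with the sample output from this quickstart: both nodes are
# Ready, so nothing is printed.
not_ready_nodes <<'EOF'
NAME      STATUS   ROLES    AGE   VERSION
node-01   Ready    master   16h   v1.17.8-gke.16
node-02   Ready    <none>   16h   v1.17.8-gke.16
EOF
```

Against a live cluster, pipe the real command into the helper: kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get nodes | not_ready_nodes. As an alternative that needs no parsing, kubectl wait --for=condition=Ready nodes --all blocks until every node reports Ready or the wait times out.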

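If the preflight checks fail during cluster creation, look for the errors in the preflight check logs and correct them in the cluster configuration file. The preflight check logs are in the log directory at:

~/baremetal/bmctl-workspace/CLUSTER_NAME/log

The preflight check logs for each machine in the cluster are in the CLUSTER_NAME directory, organized by IP address. For example: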

bmctl-workspace/cluster1/log
└── preflight-20201007-034844
    ├── 172.17.0.3
    ├── 172.17.0.4
    ├── 172.17.0.5
    ├── 172.17.0.6
    ├── 172.17.0.7
    └── node-network
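
Because each preflight run gets its own timestamped directory, the most recent run is the one whose name sorts last. The sketch below relies on that; it assumes the preflight-YYYYMMDD-HHMMSS naming shown above, and latest_preflight_dir is a hypothetical helper name, not a bmctl feature.

```shell
# Hypothetical helper: print the newest preflight-* directory under the log
# directory given as $1. Shell globs expand in sorted order, and the
# timestamped names sort chronologically, so the last match is the newest.
# Runs in a subshell so its variables do not leak; fails if nothing matches.
latest_preflight_dir() (
  latest=
  for d in "$1"/preflight-*; do
    [ -d "$d" ] && latest=$d
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
)

# Typical use from the baremetal directory (left commented out because it
# needs an existing workspace):
# ls "$(latest_preflight_dir bmctl-workspace/cluster1/log)"
```

Listing that directory shows one entry per machine IP address plus node-network, which narrows down where to look for a failed check.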

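Ignore preflight check errors

If cluster creation fails after the preflight checks, you can try to reinstall the cluster by using the --force flag in the bmctl command.

The --force flag installs over an existing cluster, but ignores any preflight check failures caused by server ports that are already allocated.

  1. Make sure you are in the baremetal directory.
  2. Re-create the cluster with the following command, which includes the --force flag:

    ./bmctl create cluster -c CLUSTER_NAME --force

    For example:

    ./bmctl create cluster -c cluster1 --force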

Create a Deployment and a Service

The following is a manifest for a Deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  selector:
    matchLabels:
      app: metrics
      department: sales
  replicas: 3
  template:
    metadata:
      labels:
        app: metrics
        department: sales
    spec:
      containers:
      - name: hello
        image: "gcr.io/google-samples/hello-app:2.0"

Save this manifest as my-deployment.yaml.

Create the Deployment with the following command:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig create -f my-deployment.yaml

View the Deployment:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get deployments

The output shows that the Deployment has three Pods that are available and ready:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
my-deployment      3/3     3            3           16s
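
If you script this step, the READY column can be checked mechanically. A minimal sketch, assuming the default table layout shown above; deployments_ready is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get deployments` table output on stdin
# and succeed only when every row's READY column shows all replicas ready,
# for example 3/3.
deployments_ready() {
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) bad = 1 }
       END { exit (bad ? 1 : 0) }'
}

# Demonstration with the sample output from this quickstart.
if deployments_ready <<'EOF'
NAME               READY   UP-TO-DATE   AVAILABLE   AGE
my-deployment      3/3     3           3           16s
EOF
then
  echo "all replicas are ready"
fi
```

For a single Deployment, kubectl rollout status deployment/my-deployment is a built-in alternative that waits until the rollout completes.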

The following manifest defines a Service of type LoadBalancer:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: metrics
    department: sales
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080

Save this manifest as my-service.yaml.

Create the Service with the following command:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig create -f my-service.yaml

View the Service:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get service my-service

Output:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)
my-service   LoadBalancer   172.26.232.2   172.16.1.21   80:30060/TCP

GKE on Bare Metal gives the Service an external IP address. Use the external IP address to call the Service:

curl 172.16.1.21

The output is a hello world message:

Hello, world!
Version: 2.0.0
Hostname: my-deployment-75d45b64f9-6clxj
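
To avoid copying the external IP address by hand, you can extract it from the kubectl get service output. A minimal sketch, assuming the default table layout shown earlier; service_external_ip is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get service NAME` table output on stdin
# and print the EXTERNAL-IP column of the first data row. Fails while the
# value is still a placeholder such as <pending> or <none>.
service_external_ip() {
  awk 'NR == 2 { ip = $4 }
       END { if (ip == "" || ip ~ /^</) exit 1; print ip }'
}

# Demonstration with the sample output from this quickstart.
service_external_ip <<'EOF'
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)
my-service   LoadBalancer   172.26.232.2   172.16.1.21   80:30060/TCP
EOF
# prints 172.16.1.21
```

Against a live cluster, pipe kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get service my-service into the helper and pass the result to curl. kubectl can also print the field directly with -o jsonpath='{.status.loadBalancer.ingress[0].ip}'.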

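Create a high-availability control plane

This quickstart created a simple two-node hybrid cluster. If you want a high-availability control plane, create a cluster that has three control plane nodes.

For example, edit the configuration file to add two additional nodes to the control plane: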

controlPlane:
  nodePoolSpec:
    clusterName: cluster1
    nodes:
    # Control Plane node pools. Typically, this is either a single machine
    # or 3 machines if using a high availability deployment.
    - address: <Machine 1 IP>
    - address: <Machine 2 IP>
    - address: <Machine 3 IP>

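Run the load balancer in its own node pool

This quickstart created a simple two-node hybrid cluster, so the load balancer runs on the same node that runs the control plane.

If you want the load balancer to run in its own node pool, edit the nodePoolSpec values in the loadBalancer section of the configuration file: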

  loadBalancer:
    nodePoolSpec:
      clusterName: "cluster1"
      nodes:
      - address: <LB Machine 1 IP>
      - address: <LB Machine 2 IP>