Anthos clusters on Bare Metal quickstart

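Introduction to Anthos clusters on Bare Metal

With Anthos clusters on Bare Metal, you can define four types of clusters:

  • Admin - a cluster used to manage user clusters.
  • User - a cluster used to run workloads.
  • Standalone - a single cluster that manages itself and runs workloads, but can't create or manage other user clusters.
  • Hybrid - a single cluster that handles both administration and workloads, and that can also manage other user clusters.

In this quickstart, you deploy a two-node hybrid cluster with Anthos clusters on Bare Metal. You learn how to create a cluster and how to monitor the cluster creation process.

This quickstart assumes that you have a basic understanding of Kubernetes.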

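Prepare for Anthos clusters on Bare Metal

Before you create a cluster in Anthos clusters on Bare Metal, you must do the following:

  1. Create a Google Cloud project
  2. Install the Google Cloud CLI
  3. Configure a Linux admin workstation
  4. Install the bmctl tool

Create a Google Cloud project

In this quickstart, you create a new Google Cloud project to organize all of your Google Cloud resources. To create a cluster in Anthos clusters on Bare Metal, you need a Google Cloud project in which your account has the Owner role.

For details, see Creating and managing projects.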

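Install the Google Cloud CLI

This quickstart uses the kubectl and bmctl tools to create and set up clusters. To install these tools, you need gcloud and gsutil. The Google Cloud CLI includes the gcloud, gsutil, and kubectl command-line tools.

To install the required tools, complete the following steps:

  1. On the admin workstation, follow the instructions to install and initialize the Google Cloud CLI. This process installs gcloud and gsutil.
  2. Update the Google Cloud CLI:
    gcloud components update

  3. Sign in to your Google account so that you can manage your services and service accounts:

    gcloud auth login --update-adc

    A new browser tab opens and prompts you to choose an account.

  4. Use gcloud to install kubectl:

    gcloud components install kubectl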
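
After these steps, gcloud, gsutil, and kubectl should all be on your PATH. The following is a minimal sketch of how you might confirm that before continuing. The helper name missing_tools is hypothetical and is not part of the Google Cloud CLI.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper written for this sketch.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Tools that this quickstart installs through the Google Cloud CLI.
missing="$(missing_tools gcloud gsutil kubectl)"
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Not found on PATH:" $missing
else
  echo "gcloud, gsutil, and kubectl are all on PATH."
fi
```

If a tool is reported as missing, repeat the matching installation step and open a new shell so that PATH changes take effect.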

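Configure the Linux admin workstation

After you install gcloud, gsutil, and kubectl, configure your Linux admin workstation. Don't use Cloud Shell as your admin workstation.

  1. Install gcloud, gsutil, and kubectl as described in the previous section.
  2. Install Docker 19.03 or later. To learn how to configure Docker, see the page that corresponds to your Linux distribution.
  3. To use root access, set up SSH on both the admin workstation and the remote cluster node machines. Initially, you need root SSH password authentication enabled on the remote cluster node machines so that you can share the admin workstation's key. After the keys are in place, you can disable SSH password authentication.
  4. Generate a private/public key pair on the admin workstation. Don't set a passphrase for the keys. You need the keys to establish secure, passwordless SSH connections between the admin workstation and the cluster node machines. Generate the keys with the following command:
    ssh-keygen -t rsa

    You can also set up SSH with sudo user access to the cluster node machines, but for passwordless, non-root user connections you need to update the cluster configuration file with the appropriate credentials. For details, see the #Node access configuration section in the sample cluster configuration file.

  5. Add the generated public key to the cluster node machines. By default, the public key is stored in the id_rsa.pub identity file.
    ssh-copy-id -i ~/.ssh/identity_file root@cluster_node_ip
  6. Disable SSH password authentication on the cluster node machines, then run the following command on the admin workstation to verify that public key authentication works between the admin workstation and the cluster node machines:
    ssh -o IdentitiesOnly=yes -i identity_file root@cluster_node_ip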
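
Two of the workstation requirements above lend themselves to a quick scripted check: the Docker 19.03 minimum and key-only SSH access to each node. The sketch below is one possible way to do this, not an official tool. The function names version_ge and check_node_ssh are hypothetical, and the version comparison assumes a sort command that supports -V (GNU coreutils does).

```shell
# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Try a key-only, non-interactive root login on one node.
# $1 is the private key file, $2 is the node address.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_node_ssh() {
  ssh -o BatchMode=yes -o IdentitiesOnly=yes -o ConnectTimeout=5 \
    -i "$1" "root@$2" true
}

# Compare the Docker server version with the 19.03 minimum, if Docker is present.
if command -v docker >/dev/null 2>&1; then
  docker_version="$(docker version --format '{{.Server.Version}}' 2>/dev/null)"
  if [ -n "$docker_version" ] && version_ge "$docker_version" 19.03; then
    echo "Docker $docker_version meets the 19.03 minimum."
  else
    echo "Docker is not running or is older than 19.03."
  fi
fi
```

To test a node, call the second helper with your key and the node address, for example check_node_ssh ~/.ssh/id_rsa cluster_node_ip. A zero exit status means the login worked without a password prompt.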

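Download and install the bmctl tool

You use the bmctl command-line tool to create clusters in Anthos clusters on Bare Metal. The bmctl command automatically sets up Google service accounts and enables the APIs that you need to use Anthos clusters on Bare Metal in the specified project.

If you want to create your own service accounts or do other manual project setup yourself, see Enabling Google services and service accounts before you create clusters with bmctl.

To download and install the bmctl tool:

  1. Create a new directory for bmctl:
    cd ~
    mkdir baremetal
    cd baremetal

  2. Download bmctl from the Cloud Storage bucket:
    gsutil cp gs://anthos-baremetal-release/bmctl/1.12.9/linux-amd64/bmctl bmctl
    chmod a+x bmctl

  3. View the help information to confirm that bmctl is installed correctly:
    ./bmctl -h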
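
The download path above embeds the bmctl version (1.12.9), and the sample configuration later in this quickstart sets anthosBareMetalVersion to the same value. If you script the download, keeping the version in one variable may help keep the two in step. This is a sketch only: the helper bmctl_gcs_path is hypothetical, and it assumes that other releases use the same bucket layout as the 1.12.9 path shown above.

```shell
# Build the Cloud Storage path for a given bmctl version.
# Assumption: other releases follow the same layout as the 1.12.9 path.
bmctl_gcs_path() {
  printf 'gs://anthos-baremetal-release/bmctl/%s/linux-amd64/bmctl\n' "$1"
}

BMCTL_VERSION="1.12.9"
echo "Download source: $(bmctl_gcs_path "$BMCTL_VERSION")"

# To download for real, run this from the baremetal directory:
#   gsutil cp "$(bmctl_gcs_path "$BMCTL_VERSION")" bmctl && chmod a+x bmctl
```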

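Create your cluster nodes

Create two machines to serve as the nodes for your cluster:

  • One machine functions as the control plane node.
  • One machine functions as the worker node.

To learn more about the requirements for cluster nodes, see the hardware and operating system requirements (CentOS, RHEL, and Ubuntu).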

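Create a cluster

To create a cluster:

  1. Use bmctl to create a configuration file.
  2. Edit the configuration file to customize it for your cluster and network.
  3. Use bmctl to create the cluster from the configuration file.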

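Create a configuration file

To create a configuration file and enable the service accounts and APIs automatically, make sure that you are in the baremetal directory, then issue the bmctl command with the following flags:

./bmctl create config -c CLUSTER_NAME \
  --enable-apis --create-service-accounts --project-id=PROJECT_ID

CLUSTER_NAME is the name of your cluster. PROJECT_ID is the ID of the project that you created in Create a Google Cloud project.

The command creates a configuration file under the baremetal directory at bmctl-workspace/CLUSTER_NAME/CLUSTER_NAME.yaml. The examples in this quickstart use cluster1 as the cluster name, so the file is bmctl-workspace/cluster1/cluster1.yaml.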
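
The relationship between the cluster name and the generated file path can be sketched in shell, which is handy when you script the later steps. The helper name cluster_config_path is hypothetical. The path pattern follows the bmctl-workspace/cluster1/cluster1.yaml example in this section.

```shell
# Print the path of the configuration file for a cluster name,
# relative to the directory where bmctl was run (the baremetal directory).
cluster_config_path() {
  printf 'bmctl-workspace/%s/%s.yaml\n' "$1" "$1"
}

CLUSTER_NAME="cluster1"
config_file="$(cluster_config_path "$CLUSTER_NAME")"
if [ -f "$config_file" ]; then
  echo "Found $config_file"
else
  echo "Expected $config_file (run bmctl create config from the baremetal directory first)."
fi
```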

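Edit the configuration file

To edit the configuration file:

  1. Open the bmctl-workspace/cluster1/cluster1.yaml configuration file in an editor.
  2. Edit the file for your specific node and network requirements. Use the following sample configuration file as a reference. This quickstart doesn't use or include OpenID Connect (OIDC) information.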
# gcrKeyPath: <path to GCR service account key>
gcrKeyPath: baremetal/gcr.json
# sshPrivateKeyPath: <path to SSH private key, used for node access>
sshPrivateKeyPath: .ssh/id_rsa
# gkeConnectAgentServiceAccountKeyPath: <path to Connect agent service account key>
gkeConnectAgentServiceAccountKeyPath: baremetal/connect-agent.json
# gkeConnectRegisterServiceAccountKeyPath: <path to Hub registration service account key>
gkeConnectRegisterServiceAccountKeyPath: baremetal/connect-register.json
# cloudOperationsServiceAccountKeyPath: <path to Cloud Operations service account key>
cloudOperationsServiceAccountKeyPath: baremetal/cloud-ops.json
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-cluster1
---
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: cluster1
  namespace: cluster-cluster1
spec:
  # Cluster type. This can be:
  #   1) admin:  to create an admin cluster. This can later be used to create user clusters.
  #   2) user:   to create a user cluster. Requires an existing admin cluster.
  #   3) hybrid: to create a hybrid cluster that runs admin cluster components and user workloads.
  #   4) standalone: to create a cluster that manages itself, runs user workloads, but does not manage other clusters.
  type: hybrid
  # Anthos cluster version.
  anthosBareMetalVersion: 1.12.9
  # GKE connect configuration
  gkeConnect:
    projectID: PROJECT_ID
  # Control plane configuration
  controlPlane:
    nodePoolSpec:
      nodes:
      # Control plane node pools. Typically, this is either a single machine
      # or 3 machines if using a high availability deployment.
      - address:  CONTROL_PLANE_NODE_IP
  # Cluster networking configuration
  clusterNetwork:
    # Pods specify the IP ranges from which pod networks are allocated.
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    # Services specify the network ranges from which service virtual IPs are allocated.
    # This can be any RFC 1918 range that does not conflict with any other IP range
    # in the cluster and node pool resources.
    services:
      cidrBlocks:
      - 172.26.232.0/24
  # Load balancer configuration
  loadBalancer:
    # Load balancer mode can be either 'bundled' or 'manual'.
    # In 'bundled' mode a load balancer will be installed on load balancer nodes during cluster creation.
    # In 'manual' mode the cluster relies on a manually-configured external load balancer.
    mode: bundled
    # Load balancer port configuration
    ports:
      # Specifies the port the load balancer serves the Kubernetes control plane on.
      # In 'manual' mode the external load balancer must be listening on this port.
      controlPlaneLBPort: 443
    # There are two load balancer virtual IP (VIP) addresses: one for the control plane
    # and one for the L7 Ingress service. The VIPs must be in the same subnet as the load balancer nodes.
    # These IP addresses do not correspond to physical network interfaces.
    vips:
      # ControlPlaneVIP specifies the VIP to connect to the Kubernetes API server.
      # This address must not be in the address pools below.
      controlPlaneVIP: CONTROL_PLANE_VIP
      # IngressVIP specifies the VIP shared by all services for ingress traffic.
      # Allowed only in non-admin clusters.
      # This address must be in the address pools below.
      ingressVIP: INGRESS_VIP
    # AddressPools is a list of non-overlapping IP ranges for the data plane load balancer.
    # All addresses must be in the same subnet as the load balancer nodes.
    # Address pool configuration is only valid for 'bundled' LB mode in non-admin clusters.
    # addressPools:
    # - name: pool1
    #   addresses:
    #   # Each address must be either in the CIDR form (1.2.3.0/24)
    #   # or range form (1.2.3.1-1.2.3.5).
    #   - LOAD_BALANCER_ADDRESS_POOL
    # A load balancer nodepool can be configured to specify nodes used for load balancing.
    # These nodes are part of the kubernetes cluster and run regular workloads as well as load balancers.
    # If the node pool config is absent then the control plane nodes are used.
    # Node pool configuration is only valid for 'bundled' LB mode.
    # nodePoolSpec:
    #   nodes:
    #   - address: LOAD_BALANCER_NODE_IP
  # Proxy configuration
  # proxy:
  #   url: http://[username:password@]domain
  #   # A list of IPs, hostnames or domains that should not be proxied.
  #   noProxy:
  #   - 127.0.0.1
  #   - localhost
  # Logging and Monitoring
  clusterOperations:
    # Cloud project for logs and metrics.
    projectID: PROJECT_ID
    # Cloud location for logs and metrics.
    location: us-central1
    # Whether collection of application logs/metrics should be enabled (in addition to
    # collection of system logs/metrics which correspond to system components such as
    # Kubernetes control plane or cluster management agents).
    # enableApplication: false
  # Storage configuration
  storage:
    # lvpNodeMounts specifies the config for local PersistentVolumes backed by mounted disks.
    # These disks need to be formatted and mounted by the user, which can be done before or after
    # cluster creation.
    lvpNodeMounts:
      # path specifies the host machine path where mounted disks will be discovered and a local PV
      # will be created for each mount.
      path: /mnt/localpv-disk
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-disks
    # lvpShare specifies the config for local PersistentVolumes backed by subdirectories in a shared filesystem.
    # These subdirectories are automatically created during cluster creation.
    lvpShare:
      # path specifies the host machine path where subdirectories will be created on each host. A local PV
      # will be created for each subdirectory.
      path: /mnt/localpv-share
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-shared
      # numPVUnderSharedPath specifies the number of subdirectories to create under path.
      numPVUnderSharedPath: 5
  # NodeConfig specifies the configuration that applies to all nodes in the cluster.
  nodeConfig:
    # podDensity specifies the pod density configuration.
    podDensity:
      # maxPodsPerNode specifies the maximum number of pods allowed on a single node.
      maxPodsPerNode: 250
    # containerRuntime specifies which container runtime to use for scheduling containers on nodes.
    # containerd and docker are supported.
    containerRuntime: containerd

---
# Node pools for worker nodes
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-cluster1
spec:
  clusterName: cluster1
  nodes:
  - address: WORKER_NODE_1_IP
  - address: WORKER_NODE_2_IP
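
The sample above contains placeholders such as PROJECT_ID, CONTROL_PLANE_NODE_IP, CONTROL_PLANE_VIP, INGRESS_VIP, and WORKER_NODE_1_IP that you must replace with your own values. As a rough aid, the following sketch lists any non-comment lines that still contain one of those placeholders. The helper find_placeholders is hypothetical and is not a bmctl feature. It only knows the placeholder names used in this sample.

```shell
# List non-comment lines of a config file that still contain a placeholder
# from the sample configuration. Output uses grep's "LINE_NUMBER:line" format.
# Returns non-zero when no such line is found.
find_placeholders() {
  grep -nwE 'PROJECT_ID|CONTROL_PLANE_NODE_IP|CONTROL_PLANE_VIP|INGRESS_VIP|WORKER_NODE_[0-9]+_IP' "$1" \
    | grep -vE '^[0-9]+:[[:space:]]*#'
}

config_file="bmctl-workspace/cluster1/cluster1.yaml"
if [ -f "$config_file" ]; then
  if find_placeholders "$config_file"; then
    echo "Replace the placeholders listed above before you create the cluster."
  else
    echo "No sample placeholders left in $config_file."
  fi
fi
```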

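Run preflight checks and create the cluster

The bmctl command runs preflight checks on your cluster configuration file before it creates a cluster. If the checks succeed, bmctl creates the cluster.

To run preflight checks and create the cluster:

  1. Make sure that you are in the baremetal directory.
  2. Use the following command to create the cluster:
    ./bmctl create cluster -c CLUSTER_NAME

    For example:
    ./bmctl create cluster -c cluster1


    The bmctl command monitors the preflight checks and the cluster creation, displays output on the screen, and writes verbose information to the bmctl logs.

You can find the bmctl, preflight check, and node installation logs in the following directory: baremetal/bmctl-workspace/CLUSTER_NAME/log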

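The bmctl preflight checks verify the following conditions for the proposed cluster installation:

  • The Linux distribution and version are supported.
  • SELinux is not in "enforcing" mode.
  • On Ubuntu, Uncomplicated Firewall (UFW) is not active.
  • Google Container Registry is reachable.
  • The VIPs are available.
  • The cluster machines have connectivity to each other.
  • The load balancer machines are on the same Layer 2 subnet.

Cluster creation can take several minutes to finish.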
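
If you want to look at two of these conditions yourself before you run bmctl, the following sketch shows one way to inspect SELinux and UFW on a cluster node machine. It is not how bmctl implements its checks. The function names are hypothetical, and the sketch assumes the standard getenforce and ufw status commands, with ufw status run as root.

```shell
# Succeed unless the given SELinux mode (as printed by getenforce) is Enforcing.
selinux_mode_ok() {
  [ "$1" != "Enforcing" ]
}

# Succeed unless the given `ufw status` output reports an active firewall.
ufw_status_ok() {
  case "$1" in
    *"Status: active"*) return 1 ;;
    *) return 0 ;;
  esac
}

# Run on a cluster node machine. Each check is skipped if its tool is absent.
if command -v getenforce >/dev/null 2>&1; then
  selinux_mode_ok "$(getenforce)" || echo "SELinux is in Enforcing mode."
fi
if command -v ufw >/dev/null 2>&1; then
  # ufw status needs root; without it the output is empty and nothing is reported.
  ufw_status_ok "$(ufw status 2>/dev/null)" || echo "UFW is active."
fi
```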

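Get information about your cluster

After the cluster is created successfully, use the kubectl command to show information about the new cluster. During cluster creation, the bmctl command writes a kubeconfig file for the cluster, which you pass to kubectl to query the cluster. The kubeconfig file is written to bmctl-workspace/CLUSTER_NAME/CLUSTER_NAME-kubeconfig.

For example:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get nodes

This command returns: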

NAME      STATUS   ROLES    AGE   VERSION
node-01   Ready    master   16h   v1.17.8-gke.16
node-02   Ready    <none>   16h   v1.17.8-gke.16
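
In a script, you may prefer a count of nodes that are not yet Ready over reading the table by eye. The sketch below parses the tabular output shown above. The helper count_not_ready is hypothetical. It treats any STATUS value other than exactly Ready (for example Ready,SchedulingDisabled) as not ready.

```shell
# Read `kubectl get nodes` table output on stdin and print how many nodes
# have a STATUS column other than exactly "Ready". The header row is skipped.
count_not_ready() {
  awk 'NR > 1 && $2 != "Ready" { n++ } END { print n + 0 }'
}

kubeconfig="bmctl-workspace/cluster1/cluster1-kubeconfig"
if command -v kubectl >/dev/null 2>&1 && [ -f "$kubeconfig" ]; then
  not_ready="$(kubectl --kubeconfig "$kubeconfig" get nodes | count_not_ready)"
  echo "Nodes not Ready: $not_ready"
fi
```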

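If the preflight checks fail during cluster creation, look for the errors in the preflight check logs and correct them in the cluster configuration file. The preflight check logs are in the log directory at:

~/baremetal/bmctl-workspace/CLUSTER_NAME/log

The preflight check logs for each machine in the cluster are in the CLUSTER_NAME directory and are organized by IP address. For example: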

bmctl-workspace/cluster1/log
└── preflight-20201007-034844
    ├── 172.17.0.3
    ├── 172.17.0.4
    ├── 172.17.0.5
    ├── 172.17.0.6
    ├── 172.17.0.7
    └── node-network
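
After several attempts, the log directory can hold more than one preflight run. The following sketch picks the most recent one. The helper latest_preflight_dir is hypothetical, and it assumes that run directories are named preflight-YYYYMMDD-HHMMSS as in the example above, so that alphabetical order matches chronological order.

```shell
# Print the newest preflight-* directory under the log directory given as $1.
# Shell globs expand in sorted order, so the last match is the latest timestamp.
# Returns non-zero when no preflight directory exists.
latest_preflight_dir() {
  latest=""
  for dir in "$1"/preflight-*; do
    [ -d "$dir" ] && latest="$dir"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

log_dir="bmctl-workspace/cluster1/log"
if newest="$(latest_preflight_dir "$log_dir")"; then
  echo "Most recent preflight logs: $newest"
  ls "$newest"
else
  echo "No preflight logs under $log_dir yet."
fi
```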

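Ignore preflight check errors

If cluster creation fails after the preflight checks, you can try to reinstall the cluster by using the --force flag in the bmctl command.

The --force flag installs over an existing cluster, but ignores the results of any preflight check failure caused by server ports that are already allocated.

  1. Make sure that you are in the baremetal directory.
  2. Use the following command with the --force flag to re-create the cluster:
    ./bmctl create cluster -c CLUSTER_NAME --force

    For example:
    ./bmctl create cluster -c cluster1 --force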

Create a Deployment and a Service

Here is a manifest for a Deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
spec:
  selector:
    matchLabels:
      app: metrics
      department: sales
  replicas: 3
  template:
    metadata:
      labels:
        app: metrics
        department: sales
    spec:
      containers:
      - name: hello
        image: "gcr.io/google-samples/hello-app:2.0"

Save this manifest as my-deployment.yaml.

Create the Deployment with the following command:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig create -f my-deployment.yaml

View the Deployment:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get deployments

The output shows that the Deployment has three Pods, all available and ready:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
my-deployment      3/3     3            3           16s

The following manifest defines a Service of type LoadBalancer:

apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app: metrics
    department: sales
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8080

Save this manifest as my-service.yaml.

Create the Service with the following command:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig create -f my-service.yaml

View the Service:

kubectl --kubeconfig bmctl-workspace/cluster1/cluster1-kubeconfig get service my-service

Output:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)
my-service   LoadBalancer   172.26.232.2   172.16.1.21   80:30060/TCP

Anthos clusters on Bare Metal gives the Service an external IP address. Use the external IP address to call the Service:

curl 172.16.1.21

The output is a hello world message:

Hello, world!
Version: 2.0.0
Hostname: my-deployment-75d45b64f9-6clxj
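
The external IP address in your cluster will differ from 172.16.1.21. Rather than copying it by hand, you could read it from the Service listing, as in this sketch. The helper external_ip_from_table is hypothetical. It assumes the column layout shown in the output above, where EXTERNAL-IP is the fourth column.

```shell
# Read `kubectl get service NAME` table output on stdin and print the
# EXTERNAL-IP column (fourth field) of the first data row.
external_ip_from_table() {
  awk 'NR == 2 { print $4 }'
}

kubeconfig="bmctl-workspace/cluster1/cluster1-kubeconfig"
if command -v kubectl >/dev/null 2>&1 && [ -f "$kubeconfig" ]; then
  ip="$(kubectl --kubeconfig "$kubeconfig" get service my-service | external_ip_from_table)"
  case "$ip" in
    ""|"<pending>"|"<none>") echo "No external IP assigned yet." ;;
    *) curl "http://$ip" || echo "Request to $ip failed." ;;
  esac
fi
```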

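Create a high-availability control plane

This quickstart creates a simple two-node hybrid cluster. If you want a high-availability control plane, create a cluster that has three control plane nodes.

For example, edit the configuration file to add two additional nodes to the control plane: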

controlPlane:
  nodePoolSpec:
    clusterName: cluster1
    nodes:
    # Control Plane node pools. Typically, this is either a single machine
    # or 3 machines if using a high availability deployment.
    - address: <Machine 1 IP>
    - address: <Machine 2 IP>
    - address: <Machine 3 IP>

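Run the load balancer in its own node pool

This quickstart creates a simple two-node hybrid cluster. As a result, the load balancer runs on the same node that runs the control plane.

If you want the load balancer to run in its own node pool, edit the nodePoolSpec values in the loadBalancer section of your configuration file: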

  loadBalancer:
    nodePoolSpec:
      clusterName: "cluster1"
      nodes:
      - address: <LB Machine 1 IP>
      - address: <LB Machine 2 IP>