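In Anthos clusters on Bare Metal, a hybrid cluster serves as both an admin cluster and a user cluster. It runs workloads while also managing other clusters and itself.
When resources are constrained, a hybrid cluster removes the need to run a separate admin cluster and can still provide high availability (HA). In an HA hybrid cluster, if one node fails, the other nodes take over for it.
Hybrid clusters differ from standalone clusters in that they can also manage other clusters. Standalone clusters cannot create or manage other clusters.
However, creating a hybrid cluster involves a tradeoff between flexibility and security. Because a hybrid cluster manages itself, running workloads on the same cluster increases the risk of exposing sensitive administrative data, such as SSH keys.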
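You can use the bmctl command to create a hybrid cluster with a high availability (HA) control plane. The bmctl command can run on a separate workstation or on one of the hybrid cluster nodes.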
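Prerequisites:
- The latest bmctl has been downloaded from Cloud Storage (gs://anthos-baremetal-release/bmctl/1.8.9/linux-amd64/bmctl).
- The workstation running bmctl has network connectivity to all nodes in the target hybrid cluster.
- The workstation running bmctl has network connectivity to the control plane VIP of the target hybrid cluster.
- The SSH key used to create the hybrid cluster is available to root, or there is SUDO user access on all nodes in the target hybrid cluster.
- The Connect-register service account is configured for use with Connect.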
For step-by-step instructions on creating a hybrid cluster, see the Anthos clusters on Bare Metal quickstart.
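Enable SELinux
If you want to enable SELinux to secure your containers, you must make sure that SELinux is enabled in Enforcing mode on the host machines before you install Anthos clusters on Bare Metal. SELinux is enabled by default on RHEL and CentOS systems. If SELinux is disabled in your cluster, or you are not sure whether it is enabled, see Securing your containers using SELinux for instructions on how to enable it.
Anthos clusters on Bare Metal supports SELinux only on RHEL and CentOS systems.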
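As a rough illustration of checking the SELinux state before installation, the following sketch compares a mode string against Enforcing. The helper name selinux_is_enforcing is hypothetical and not part of bmctl. The sketch assumes the standard getenforce tool, which reports Enforcing, Permissive, or Disabled, and skips the live check when that tool is not installed.

```shell
# Hypothetical helper (not part of bmctl): succeeds only when the
# given SELinux mode string is exactly "Enforcing".
selinux_is_enforcing() {
  [ "$1" = "Enforcing" ]
}

# Run the check on this host only if getenforce is available.
if command -v getenforce >/dev/null 2>&1; then
  if selinux_is_enforcing "$(getenforce)"; then
    echo "SELinux is in Enforcing mode"
  else
    echo "SELinux is not in Enforcing mode"
  fi
fi
```

Running this on every node before installation would be one way to catch a host that is still in Permissive or Disabled mode.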
Log in to gcloud and create a hybrid cluster config file
- Log in to gcloud as a user with gcloud auth application-default login. The user account is expected to have the following IAM roles:
  - Service Account Admin
  - Service Account Key Admin
  - Project IAM Admin
  - Compute Viewer
  - Service Usage Admin
- Get the ID of the Cloud project to use for cluster creation:
gcloud auth application-default login
export GOOGLE_APPLICATION_CREDENTIALS=JSON_KEY_FILE
export CLOUD_PROJECT_ID=$(gcloud config get-value project)
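Because the later bmctl commands depend on CLOUD_PROJECT_ID, it can help to confirm that the variable is not empty before continuing. The following is a minimal sketch; the helper name require_project_id is hypothetical and only inspects the value passed to it.

```shell
# Hypothetical helper: fail early when no project ID was resolved.
require_project_id() {
  if [ -z "$1" ]; then
    echo "no Cloud project ID is set" >&2
    return 1
  fi
  echo "using project: $1"
}

# Example call, using the variable exported in the step above:
#   require_project_id "${CLOUD_PROJECT_ID:-}"
```

If the helper reports a missing ID, one option is to set the project in the gcloud configuration and export the variable again.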
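Create a hybrid cluster with bmctl
After you have logged in to gcloud and set up your project, you can create the cluster config file with the bmctl command. Note that in this example, all service accounts are created automatically by the bmctl create config command:
bmctl create config -c HYBRID_CLUSTER_NAME --enable-apis \
    --create-service-accounts --project-id=CLOUD_PROJECT_ID
The following example creates a config file for a hybrid cluster named hybrid1 that is associated with project ID my-gcp-project:
bmctl create config -c hybrid1 --create-service-accounts --project-id=my-gcp-project
The file is written to bmctl-workspace/hybrid1/hybrid1.yaml.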
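The hybrid1 example suggests that the config file path follows the pattern bmctl-workspace/NAME/NAME.yaml. The sketch below builds the path on that assumption, which is inferred from this single example rather than from a documented rule. The helper name config_path is hypothetical.

```shell
# Hypothetical helper: derive the config file path from a cluster name,
# assuming the bmctl-workspace/NAME/NAME.yaml layout seen in the example.
config_path() {
  printf 'bmctl-workspace/%s/%s.yaml\n' "$1" "$1"
}

config_path hybrid1
```

For hybrid1, this prints the same path that the text above gives.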
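As an alternative to automatically enabling APIs and creating service accounts, you can give your existing service accounts the appropriate IAM permissions. In that case, you can leave out the automatic service account creation part of the bmctl command from the previous step:
bmctl create config -c hybrid1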
Edit the cluster config file
Now that you have a cluster config file, make the following changes to it:
Provide the SSH private key to access the hybrid cluster nodes:
# bmctl configuration variables. Because this section is valid YAML but not a valid Kubernetes
# resource, this section can only be included when using bmctl to
# create the initial admin/hybrid cluster. Afterwards, when creating user clusters by directly
# applying the cluster and node pool resources to the existing cluster, you must remove this
# section.
gcrKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-gcr.json
sshPrivateKeyPath: /path/to/your/ssh_private_key
gkeConnectAgentServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-connect.json
gkeConnectRegisterServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-register.json
cloudOperationsServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-cloud-ops.json
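You must use Connect to register your cluster to your project fleet.
- If you created your config file using the automatic API enablement and service account creation features, you can skip this step.
- If you created the config file without using the automatic API enablement and service account creation features, reference the downloaded service account JSON keys in the corresponding gkeConnectAgentServiceAccountKeyPath and gkeConnectRegisterServiceAccountKeyPath fields of the cluster config file.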
Change the config to specify a cluster type of hybrid instead of admin:
spec:
  # Cluster type. This can be:
  #   1) admin:      to create an admin cluster. This can later be used to create user clusters.
  #   2) user:       to create a user cluster. Requires an existing admin cluster.
  #   3) hybrid:     to create a hybrid cluster that runs admin cluster components and user workloads.
  #   4) standalone: to create a cluster that manages itself, runs user workloads, but does not manage other clusters.
  type: hybrid
Change the config to specify a multi-node, high availability control plane. Specify an odd number of nodes so that a majority quorum is available for HA:
  # Control plane configuration
  controlPlane:
    nodePoolSpec:
      nodes:
      # Control plane node pools. Typically, this is either a single machine
      # or 3 machines if using a high availability deployment.
      - address: 10.200.0.4
      - address: 10.200.0.5
      - address: 10.200.0.6
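To see why an odd node count is recommended, the following sketch works through general majority-quorum arithmetic. It is not specific to bmctl, and the helper name quorum_summary is hypothetical. A majority of n nodes is n / 2 + 1 (integer division), and the cluster can lose n minus that many nodes.

```shell
# Hypothetical helper: print the majority quorum and the number of node
# failures that can be tolerated for a given control plane node count.
quorum_summary() {
  echo "nodes=$1 quorum=$(( $1 / 2 + 1 )) tolerated_failures=$(( $1 - ($1 / 2 + 1) ))"
}

quorum_summary 3   # prints: nodes=3 quorum=2 tolerated_failures=1
quorum_summary 4   # prints: nodes=4 quorum=3 tolerated_failures=1
```

A fourth node raises the quorum size without raising the number of failures the control plane can survive, which is why three nodes is the usual choice over four.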
Specify the pod density and the container runtime of the cluster nodes:
....
  # NodeConfig specifies the configuration that applies to all nodes in the cluster.
  nodeConfig:
    # podDensity specifies the pod density configuration.
    podDensity:
      # maxPodsPerNode specifies at most how many pods can be run on a single node.
      maxPodsPerNode: 250
    # containerRuntime specifies which container runtime to use for scheduling containers on nodes.
    # containerd and docker are supported.
    containerRuntime: containerd
....
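For hybrid clusters, the allowed values for maxPodsPerNode are 32-250 for HA clusters and 64-250 for non-HA clusters. If unspecified, the default value is 110. This value cannot be updated after the cluster is created. Pod density is also limited by the IP resources available to your cluster. For details, see Pod networking.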
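Because maxPodsPerNode cannot be changed after the cluster is created, it may be worth checking the value against the ranges above before running bmctl. The following is a minimal sketch of such a check. The helper name check_max_pods and its "ha" / "non-ha" argument are hypothetical conventions for this example.

```shell
# Hypothetical helper: validate a maxPodsPerNode value for a hybrid cluster.
#   $1: the maxPodsPerNode value
#   $2: "ha" for an HA cluster, anything else for a non-HA cluster
check_max_pods() {
  if [ "$2" = "ha" ]; then _min_pods=32; else _min_pods=64; fi
  if [ "$1" -ge "$_min_pods" ] && [ "$1" -le 250 ]; then
    echo "ok"
  else
    echo "out of range (${_min_pods}-250)"
    return 1
  fi
}

check_max_pods 250 ha
check_max_pods 110 non-ha
```

Both calls print ok. A value such as 40 would be accepted for an HA cluster and rejected for a non-HA cluster.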
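Create the hybrid cluster with the cluster config
Use the bmctl command to deploy the cluster:
bmctl create cluster -c CLUSTER_NAME
CLUSTER_NAME specifies the cluster name that you chose in the previous section.
The following example command creates a cluster named hybrid1:
bmctl create cluster -c hybrid1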
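Before deploying, a quick check that the config file exists and declares type: hybrid can catch a forgotten edit. The sketch below is a simple text match and not a YAML parser. The helper name preflight_hybrid_config is hypothetical and is not a bmctl feature.

```shell
# Hypothetical helper: confirm that a cluster config file exists and
# contains an uncommented "type: hybrid" line.
#   $1: path to the cluster config file
preflight_hybrid_config() {
  if [ ! -f "$1" ]; then
    echo "missing config: $1"
    return 1
  fi
  if ! grep -Eq '^[[:space:]]*type:[[:space:]]*hybrid[[:space:]]*$' "$1"; then
    echo "config is not type hybrid: $1"
    return 1
  fi
  echo "config looks ready: $1"
}

# Example, using the path and cluster name from the sections above:
#   preflight_hybrid_config bmctl-workspace/hybrid1/hybrid1.yaml \
#     && bmctl create cluster -c hybrid1
```

Chaining the helper with && means bmctl create cluster only runs when the check passes.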
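Sample complete hybrid cluster config
The following is a sample hybrid cluster config file created by the bmctl command. Note that this sample config uses placeholder cluster names, VIPs, and addresses. They may not work for your network.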
gcrKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-gcr.json
sshPrivateKeyPath: /bmctl/bmctl-workspace/.ssh/id_rsa
gkeConnectAgentServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-connect.json
gkeConnectRegisterServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-register.json
cloudOperationsServiceAccountKeyPath: /bmctl/bmctl-workspace/.sa-keys/my-gcp-project-anthos-baremetal-cloud-ops.json
---
apiVersion: v1
kind: Namespace
metadata:
  name: cluster-hybrid1
---
apiVersion: baremetal.cluster.gke.io/v1
kind: Cluster
metadata:
  name: hybrid1
  namespace: cluster-hybrid1
spec:
  # Cluster type. This can be:
  #   1) admin:      to create an admin cluster. This can later be used to create user clusters.
  #   2) user:       to create a user cluster. Requires an existing admin cluster.
  #   3) hybrid:     to create a hybrid cluster that runs admin cluster components and user workloads.
  #   4) standalone: to create a cluster that manages itself, runs user workloads, but does not manage other clusters.
  type: hybrid
  # Anthos cluster version.
  anthosBareMetalVersion: 1.8.9
  # GKE connect configuration
  gkeConnect:
    projectID: $GOOGLE_PROJECT_ID
  # Control plane configuration
  controlPlane:
    nodePoolSpec:
      nodes:
      # Control plane node pools. Typically, this is either a single machine
      # or 3 machines if using a high availability deployment.
      - address: 10.200.0.4
      - address: 10.200.0.5
      - address: 10.200.0.6
  # Cluster networking configuration
  clusterNetwork:
    # Pods specify the IP ranges from which pod networks are allocated.
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    # Services specify the network ranges from which service virtual IPs are allocated.
    # This can be any RFC 1918 range that does not conflict with any other IP range
    # in the cluster and node pool resources.
    services:
      cidrBlocks:
      - 10.96.0.0/20
  # Load balancer configuration
  loadBalancer:
    # Load balancer mode can be either 'bundled' or 'manual'.
    # In 'bundled' mode a load balancer will be installed on load balancer nodes during cluster creation.
    # In 'manual' mode the cluster relies on a manually-configured external load balancer.
    mode: bundled
    # Load balancer port configuration
    ports:
      # Specifies the port the load balancer serves the Kubernetes control plane on.
      # In 'manual' mode the external load balancer must be listening on this port.
      controlPlaneLBPort: 443
    # There are two load balancer virtual IP (VIP) addresses: one for the control plane
    # and one for the L7 Ingress service. The VIPs must be in the same subnet as the load balancer nodes.
    # These IP addresses do not correspond to physical network interfaces.
    vips:
      # ControlPlaneVIP specifies the VIP to connect to the Kubernetes API server.
      # This address must not be in the address pools below.
      controlPlaneVIP: 10.200.0.71
      # IngressVIP specifies the VIP shared by all services for ingress traffic.
      # Allowed only in non-admin clusters.
      # This address must be in the address pools below.
      ingressVIP: 10.200.0.72
    # AddressPools is a list of non-overlapping IP ranges for the data plane load balancer.
    # All addresses must be in the same subnet as the load balancer nodes.
    # Address pool configuration is only valid for 'bundled' LB mode in non-admin clusters.
    addressPools:
    - name: pool1
      addresses:
      # Each address must be either in the CIDR form (1.2.3.0/24)
      # or range form (1.2.3.1-1.2.3.5).
      - 10.200.0.72-10.200.0.90
    # A load balancer node pool can be configured to specify nodes used for load balancing.
    # These nodes are part of the Kubernetes cluster and run regular workloads as well as load balancers.
    # If the node pool config is absent then the control plane nodes are used.
    # Node pool configuration is only valid for 'bundled' LB mode.
    # nodePoolSpec:
    #   nodes:
    #   - address: <Machine 1 IP>
  # Proxy configuration
  # proxy:
  #   url: http://[username:password@]domain
  #   # A list of IPs, hostnames or domains that should not be proxied.
  #   noProxy:
  #   - 127.0.0.1
  #   - localhost
  # Logging and Monitoring
  clusterOperations:
    # Cloud project for logs and metrics.
    projectID: $GOOGLE_PROJECT_ID
    # Cloud location for logs and metrics.
    location: us-central1
    # Whether collection of application logs/metrics should be enabled (in addition to
    # collection of system logs/metrics which correspond to system components such as
    # Kubernetes control plane or cluster management agents).
    # enableApplication: false
  # Storage configuration
  storage:
    # lvpNodeMounts specifies the config for local PersistentVolumes backed by mounted disks.
    # These disks need to be formatted and mounted by the user, which can be done before or after
    # cluster creation.
    lvpNodeMounts:
      # path specifies the host machine path where mounted disks will be discovered and a local PV
      # will be created for each mount.
      path: /mnt/localpv-disk
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-disks
    # lvpShare specifies the config for local PersistentVolumes backed by subdirectories in a shared filesystem.
    # These subdirectories are automatically created during cluster creation.
    lvpShare:
      # path specifies the host machine path where subdirectories will be created on each host. A local PV
      # will be created for each subdirectory.
      path: /mnt/localpv-share
      # storageClassName specifies the StorageClass that PVs will be created with. The StorageClass
      # is created during cluster creation.
      storageClassName: local-shared
      # numPVUnderSharedPath specifies the number of subdirectories to create under path.
      numPVUnderSharedPath: 5
  # NodeConfig specifies the configuration that applies to all nodes in the cluster.
  nodeConfig:
    # podDensity specifies the pod density configuration.
    podDensity:
      # maxPodsPerNode specifies at most how many pods can be run on a single node.
      maxPodsPerNode: 250
    # containerRuntime specifies which container runtime to use for scheduling containers on nodes.
    # containerd and docker are supported.
    containerRuntime: containerd
  # KubeVirt configuration, uncomment this section if you want to install kubevirt to the cluster
  # kubevirt:
  #   # if useEmulation is enabled, hardware accelerator (i.e relies on cpu feature like vmx or svm)
  #   # will not be attempted. QEMU will be used for software emulation.
  #   # useEmulation must be specified for KubeVirt installation
  #   useEmulation: false
  # Authentication; uncomment this section if you wish to enable authentication to the cluster with OpenID Connect.
  # authentication:
  #   oidc:
  #     # issuerURL specifies the URL of your OpenID provider, such as "https://accounts.google.com". The Kubernetes API
  #     # server uses this URL to discover public keys for verifying tokens. Must use HTTPS.
  #     issuerURL: <URL for OIDC Provider; required>
  #     # clientID specifies the ID for the client application that makes authentication requests to the OpenID
  #     # provider.
  #     clientID: <ID for OIDC client application; required>
  #     # clientSecret specifies the secret for the client application.
  #     clientSecret: <Secret for OIDC client application; optional>
  #     # kubectlRedirectURL specifies the redirect URL (required) for the gcloud CLI, such as
  #     # "http://localhost:[PORT]/callback".
  #     kubectlRedirectURL: <Redirect URL for the gcloud CLI; optional, default is "http://kubectl.redirect.invalid">
  #     # username specifies the JWT claim to use as the username. The default is "sub", which is expected to be a
  #     # unique identifier of the end user.
  #     username: <JWT claim to use as the username; optional, default is "sub">
  #     # usernamePrefix specifies the prefix prepended to username claims to prevent clashes with existing names.
  #     usernamePrefix: <Prefix prepended to username claims; optional>
  #     # group specifies the JWT claim that the provider will use to return your security groups.
  #     group: <JWT claim to use as the group name; optional>
  #     # groupPrefix specifies the prefix prepended to group claims to prevent clashes with existing names.
  #     groupPrefix: <Prefix prepended to group claims; optional>
  #     # scopes specifies additional scopes to send to the OpenID provider as a comma-delimited list.
  #     scopes: <Additional scopes to send to OIDC provider as a comma-separated list; optional>
  #     # extraParams specifies additional key-value parameters to send to the OpenID provider as a comma-delimited
  #     # list.
  #     extraParams: <Additional key-value parameters to send to OIDC provider as a comma-separated list; optional>
  #     # proxy specifies the proxy server to use for the cluster to connect to your OIDC provider, if applicable.
  #     # Example: https://user:password@10.10.10.10:8888. If left blank, this defaults to no proxy.
  #     proxy: <Proxy server to use for the cluster to connect to your OIDC provider; optional, default is no proxy>
  #     # deployCloudConsoleProxy specifies whether to deploy a reverse proxy in the cluster to allow Google Cloud
  #     # Console access to the on-premises OIDC provider for authenticating users. If your identity provider is not
  #     # reachable over the public internet, and you wish to authenticate using Google Cloud console, then this field
  #     # must be set to true. If left blank, this field defaults to false.
  #     deployCloudConsoleProxy: <Whether to deploy a reverse proxy for Google Cloud console authentication; optional>
  #     # certificateAuthorityData specifies a Base64 PEM-encoded certificate authority certificate of your identity
  #     # provider. It's not needed if your identity provider's certificate was issued by a well-known public CA.
  #     # However, if deployCloudConsoleProxy is true, then this value must be provided, even for a well-known public
  #     # CA.
  #     certificateAuthorityData: <Base64 PEM-encoded certificate authority certificate of your OIDC provider; optional>
  # Node access configuration; uncomment this section if you wish to use a non-root user
  # with passwordless sudo capability for machine login.
  # nodeAccess:
  #   loginUser: <login user name>
---
# Node pools for worker nodes
apiVersion: baremetal.cluster.gke.io/v1
kind: NodePool
metadata:
  name: node-pool-1
  namespace: cluster-hybrid1
spec:
  clusterName: hybrid1
  nodes:
  - address: 10.200.0.7
  - address: 10.200.0.8
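The comments in the sample config state that ingressVIP must be inside the address pools and controlPlaneVIP must not be. The following sketch checks both conditions for the sample values. The helper names ip_to_int and in_range are hypothetical, and the sketch handles only the range form (START-END) of an address pool entry, not the CIDR form.

```shell
# Hypothetical helper: convert a dotted IPv4 address to a single integer.
ip_to_int() {
  _old_ifs=$IFS
  IFS=.
  set -- $1            # split the address into its four octets
  IFS=$_old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Hypothetical helper: succeed when address $1 lies inside range $2,
# where $2 uses the START-END form.
in_range() {
  _addr=$(ip_to_int "$1")
  _start=$(ip_to_int "${2%-*}")
  _end=$(ip_to_int "${2#*-}")
  [ "$_addr" -ge "$_start" ] && [ "$_addr" -le "$_end" ]
}

pool="10.200.0.72-10.200.0.90"
if in_range 10.200.0.72 "$pool"; then
  echo "ingressVIP is inside the address pool"
fi
if ! in_range 10.200.0.71 "$pool"; then
  echo "controlPlaneVIP is outside the address pool"
fi
```

With the sample values, both messages are printed, which matches the constraints described in the config comments.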