Troubleshoot DNS in GKE


This page shows you how to resolve issues with DNS providers in Google Kubernetes Engine (GKE) clusters.

If you need additional assistance, reach out to Cloud Customer Care.

Cloud DNS for GKE events

This section describes common Cloud DNS issues in GKE.

Cloud DNS disabled

The following event occurs when the Cloud DNS API is disabled:

Warning   FailedPrecondition        service/default-http-backend
Failed to send requests to Cloud DNS: Cloud DNS API Disabled. Please enable the Cloud DNS API in your project PROJECT_NAME: Cloud DNS API has not been used in project PROJECT_NUMBER before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/dns.googleapis.com/overview?project=PROJECT_NUMBER then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.

This error occurs because the Cloud DNS API is not enabled by default. You must enable the Cloud DNS API manually.

To resolve this issue, enable the Cloud DNS API.
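
As a minimal sketch, the API can also be enabled from the command line with gcloud services enable. The helper name enable_cloud_dns_api is hypothetical, PROJECT_ID is a placeholder for your project ID, and the sketch assumes that your account has permission to enable services in that project:

```shell
# Hypothetical helper: enable the Cloud DNS API in the given project.
# $1 is the project ID (an assumption; replace it with your own value).
enable_cloud_dns_api() {
  gcloud services enable dns.googleapis.com --project="$1"
}

# Usage (replace PROJECT_ID with your project ID):
#   enable_cloud_dns_api PROJECT_ID
```

As the event message notes, a newly enabled API can take a few minutes to propagate, so the event might continue to appear briefly.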

Failed to send requests to Cloud DNS: API rate limit exceeded

The following event occurs when a project exceeds a Cloud DNS quota or limit:

kube-system   27s         Warning   InsufficientQuota
managedzone/gke-cluster-quota-ee1bd2ca-dns     Failed to send requests to Cloud DNS: API rate limit exceeded. Contact Google Cloud support team to request a quota increase for your project PROJECT_NAME: Quota exceeded for quota metric 'Write requests' and limit 'Write limit for a minute for a region' of service 'dns.googleapis.com' for consumer 'project_number:PROJECT_NUMBER.

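To resolve this issue, review the Cloud DNS quotas and the Compute Engine quotas and limits. You can increase the quota by using the Google Cloud console.

Failed to send requests to Cloud DNS due to a previous error

The following event occurs when an error causes cascading failures: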

kube-system   27s         Warning   InsufficientQuota
managedzone/gke-cluster-quota-ee1bd2ca-dns     Failed to send requests to Cloud DNS: API rate limit exceeded. Contact Google Cloud support team to request a quota increase for your project PROJECT_NAME: Quota exceeded for quota metric 'Write requests' and limit 'Write limit for a minute for a region' of service 'dns.googleapis.com' for consumer 'project_number:PROJECT_NUMBER.
kube-system   27s         Warning   FailedPrecondition               service/default-http-backend                         Failed to send requests to Cloud DNS due to a previous error. Please check the cluster events.

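To resolve this issue, check the cluster events to find the source of the original error, and then follow the instructions for resolving that root issue.

In the preceding example, the InsufficientQuota error for the managed zone triggered the cascading failures. The second error, FailedPrecondition, indicates that a previous error occurred, which is the initial insufficient quota issue. To resolve this example issue, follow the guidance for the Cloud DNS quota error.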
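
The following sketch shows one way to narrow the cluster events down to likely root causes. The helper names list_warning_events and filter_root_causes are hypothetical, and the filter assumes that follow-on events contain the phrase "due to a previous error", as in the preceding example:

```shell
# Hypothetical helper: list Warning events in all namespaces,
# ordered by creation time so that earlier errors appear first.
list_warning_events() {
  kubectl get events --all-namespaces \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Hypothetical helper: read event lines on stdin and drop the follow-on
# events, leaving only the candidate root causes.
filter_root_causes() {
  grep -v 'due to a previous error'
}

# Usage:
#   list_warning_events | filter_root_causes
```

The filter only hides the follow-on events. You still need to read the remaining events to decide which one is the original error.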

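Failed to bind response policy

The following event occurs when a response policy is already bound to the cluster's network and Cloud DNS for GKE attempts to bind a response policy to that network: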

kube-system   9s          Warning   FailedPrecondition               responsepolicy/gke-2949673445-rp
Failed to bind response policy gke-2949673445-rp to test. Please verify that another Response Policy is not already associated with the network: Network 'https://www.googleapis.com/compute/v1/projects/PROJECT_NAME/global/networks/NETWORK_NAME' cannot be bound to this response policy because it is already bound to another response policy.
kube-system   9s          Warning   FailedPrecondition               service/kube-dns
Failed to send requests to Cloud DNS due to a previous error. Please check the cluster events.

To resolve this issue, complete the following steps:

  1. Get the response policy that is bound to the network:

    gcloud dns response-policies list --filter='networks.networkUrl: NETWORK_URL'
    

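    Replace NETWORK_URL with the network URL from the error, such as https://www.googleapis.com/compute/v1/projects/PROJECT_ID/global/networks/NETWORK_NAME.

    If the output is empty, the response policy might not be in the same project. Continue to the next step to search for the response policy.

    If the output is similar to the following, skip to step 4 to delete the response policy.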

    [
       {
          "description": "Response Policy for GKE cluster \"CLUSTER_NAME\" with cluster suffix \"cluster.local.\" in project \"PROJECT_ID\" with scope \"CLUSTER_SCOPE\".",
          ...
          "kind": "dns#responsePolicy",
          "responsePolicyName": "gke-CLUSTER_NAME-POLICY_ID-rp"
       }
    ]
    
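  2. Use the IAM Policy Analyzer to get a list of projects that have the dns.networks.bindDNSResponsePolicy permission.

  3. Check whether each project has a response policy that is bound to the network: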

    gcloud dns response-policies list --filter='networks.networkUrl:NETWORK_URL' \
        --project=PROJECT_NAME
    
  4. Delete the response policy.
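
If step 2 returns several projects, the check in step 3 can be repeated in a loop. The following is a minimal sketch that reuses the command from step 3. The helper name find_bound_policies is hypothetical, and the project IDs that you pass in are assumed to come from the Policy Analyzer results:

```shell
# Hypothetical helper: for each project, list the response policies that
# are bound to the given network.
# $1 is the network URL from the error; the remaining arguments are project IDs.
find_bound_policies() {
  network_url="$1"
  shift
  for project in "$@"; do
    echo "== ${project}"
    gcloud dns response-policies list \
      --filter="networks.networkUrl:${network_url}" \
      --project="${project}"
  done
}

# Usage (all values are placeholders):
#   find_bound_policies NETWORK_URL PROJECT_A PROJECT_B
```

A project whose section shows a policy is the one that holds the conflicting response policy.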

Invalid configuration specified in kube-dns

The following event occurs when you apply a custom kube-dns ConfigMap that isn't valid for Cloud DNS for GKE:

kube-system   49s         Warning   FailedValidation                 configmap/kube-dns
Invalid configuration specified in kube-dns: error parsing stubDomains for ConfigMap kube-dns: dnsServer [8.8.8.256] validation: IP address "8.8.8.256" invalid

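To resolve this issue, review the details in the error for the invalid portion of the ConfigMap. In the preceding example, 8.8.8.256 is not a valid IP address because the last octet exceeds 255.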
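
To catch this kind of mistake before you apply a ConfigMap, you can check each server address locally. The following is a rough sketch only: the helper name is_valid_ipv4 is hypothetical, it checks dotted-quad IPv4 addresses and nothing else, and the validation that GKE performs might differ:

```shell
# Hypothetical helper: return 0 if the argument is a dotted-quad IPv4
# address whose four octets are each in the range 0-255, and 1 otherwise.
is_valid_ipv4() {
  addr="$1"
  # Reject empty input, characters other than digits and dots,
  # and leading, trailing, or doubled dots.
  case "$addr" in
    ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
  esac
  # Split on dots and require exactly four octets.
  old_ifs="$IFS"
  IFS=.
  set -- $addr
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    # Each octet has at most three digits and a value of at most 255.
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Usage:
#   is_valid_ipv4 8.8.8.256 || echo "8.8.8.256 is not a valid IPv4 address"
```

To see the values that are currently applied, you can inspect the ConfigMap with kubectl get configmap kube-dns -n kube-system -o yaml.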

What's next