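Set up a central log server

This page describes how to set up a central log server for Google Distributed Cloud (GDC) air-gapped appliance through a Google Distributed Cloud air-gapped data center organization.

To create a central logging location, the GDC appliance must have the following components in the GDC data center organization:

  • A unique project
  • A bucket for storing audit logs
  • A bucket for storing operational logs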

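Create a project

The following steps must be performed in the GDC data center organization to which the logs will be exported.

  1. Set KUBECONFIG to the Org Management API: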

    export KUBECONFIG=ORG_MANAGEMENT_API_KUBECONFIG_PATH
    
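  2. To get the permissions that you need to export logs, ask your Organization IAM Admin to grant you the Project Creator (project-creator) ClusterRole. For more information about these roles, see Prepare IAM permissions.

  3. Apply the Project custom resource to create a unique project for the GDC appliance from which the logs will be exported: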

    kubectl apply -f - <<EOF
    apiVersion: resourcemanager.gdc.goog/v1
    kind: Project
    metadata:
      namespace: platform
      name: APPLIANCE_PROJECT_NAME
      labels:
        object.gdc.goog/tenant-category: user
    EOF
    
  4. Verify that the new project for the GDC appliance is available:

    kubectl get namespace APPLIANCE_PROJECT_NAME
    
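  5. Associate the new project with a billing account. To track the cost of project resources, you must have a billing account associated with the project.

  6. To get the permissions that you need to export logs, ask your Organization IAM Admin to grant you the Project IAM Admin (project-iam-admin) role in the APPLIANCE_PROJECT_NAME namespace.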
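
The namespace might take a moment to appear after the Project resource is applied, so the check in step 4 can fail on the first attempt. The following is a minimal sketch, not part of the documented procedure, that polls with the same kubectl get namespace command. The function name wait_for_project_namespace, the attempt count, and the delay are hypothetical choices.

```shell
# Hypothetical helper: poll until the project namespace exists.
# Usage: wait_for_project_namespace APPLIANCE_PROJECT_NAME [attempts] [delay_seconds]
wait_for_project_namespace() {
  ns="$1"
  attempts="${2:-30}"   # assumed default, not taken from the GDC documentation
  delay="${3:-10}"      # assumed default, in seconds
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Same check as step 4; output is discarded, only the exit code matters.
    if kubectl get namespace "$ns" >/dev/null 2>&1; then
      echo "namespace $ns is available"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "namespace $ns not found after $attempts attempts" >&2
  return 1
}
```

For example, wait_for_project_namespace APPLIANCE_PROJECT_NAME 12 5 would check every 5 seconds for up to one minute and return a non-zero exit code if the namespace never appears.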

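Create a bucket

The following steps must be performed by the Platform Administrator (PA) in the GDC data center organization to which the logs will be exported.

  1. Set KUBECONFIG to the Org Management API: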

    export KUBECONFIG=ORG_MANAGEMENT_API_KUBECONFIG_PATH
    
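  2. To get the permissions that you need to export logs, ask your Organization IAM Admin to grant you the Project Bucket Admin (project-bucket-admin) role in the APPLIANCE_PROJECT_NAME namespace.

  3. Apply the Bucket custom resource to create a bucket: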

    kubectl apply -f - <<EOF
    apiVersion: object.gdc.goog/v1
    kind: Bucket
    metadata:
      name: BUCKET_NAME
      namespace: APPLIANCE_PROJECT_NAME
      labels:
        object.gdc.goog/bucket-type: normal
        object.gdc.goog/encryption-version: v2
        object.gdc.goog/tenant-category: user
    spec:
      description: Bucket for storing appliance xyz audit logs
      location: zone1
      storageClass: Standard
    EOF
    
  4. After the bucket is created, run the following command to confirm and inspect the bucket details:

    kubectl describe buckets BUCKET_NAME -n APPLIANCE_PROJECT_NAME
    
  5. Create a ProjectServiceAccount to access objects in the bucket:

    kubectl apply -f - <<EOF
    ---
    apiVersion: resourcemanager.gdc.goog/v1
    kind: ProjectServiceAccount
    metadata:
      name: BUCKET_NAME-read-write-sa
      namespace: APPLIANCE_PROJECT_NAME
    spec: {}
    EOF
    
  6. Verify that the ProjectServiceAccount has propagated:

    kubectl get projectserviceaccount BUCKET_NAME-read-write-sa -n APPLIANCE_PROJECT_NAME -o json | jq '.status'
    
  7. Ensure that the ServiceAccount has read-write permission on the bucket by applying the following Role and RoleBinding:

    kubectl apply -f - <<EOF
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: BUCKET_NAME-read-write-role
      namespace: APPLIANCE_PROJECT_NAME
    rules:
    - apiGroups:
      - object.gdc.goog
      resourceNames:
      - BUCKET_NAME
      resources:
      - buckets
      verbs:
      - read-object
      - write-object
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: BUCKET_NAME-read-write-rolebinding
      namespace: APPLIANCE_PROJECT_NAME
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: Role
      name: BUCKET_NAME-read-write-role
    subjects:
    - kind: ServiceAccount
      name: BUCKET_NAME-read-write-sa
      namespace: APPLIANCE_PROJECT_NAME
    EOF
    
  8. Get the Secret that contains the access credentials for the bucket:

    kubectl get secret -n APPLIANCE_PROJECT_NAME -o json | jq --arg jq_src BUCKET_NAME-read-write-sa '.items[].metadata|select(.annotations."object.gdc.goog/subject"==$jq_src)|.name'
    

    The output must look similar to the following example, which shows the name of the Secret for the bucket:

    "object-storage-key-sysstd-sa-olxv4dnwrwul4bshu37ikebgovrnvl773owaw3arx225rfi56swa"
    
  9. Export the Secret name to a variable:

    export BUCKET_RW_SECRET_NAME=BUCKET_RW_SECRET_NAME
    
  10. Get the access key ID for the bucket:

    kubectl get secret "$BUCKET_RW_SECRET_NAME" -n APPLIANCE_PROJECT_NAME -o json | jq -r '.data."access-key-id"' | base64 -di
    

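    The output must look like the following example:

    PCEW2HU47Y8ACUWQO4SK
    
  11. Get the secret access key for the bucket:

    kubectl get secret "$BUCKET_RW_SECRET_NAME" -n APPLIANCE_PROJECT_NAME -o json | jq -r '.data."secret-access-key"' | base64 -di
    

    The output must look like the following example: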

    TzGdAbgp4h2i5UeiYa9k09rNPFQ2tkYADs67+65E
    
  12. Get the endpoint for the bucket:

    kubectl get bucket BUCKET_NAME -n APPLIANCE_PROJECT_NAME -o json | jq '.status.endpoint'
    

    The output must look like the following example:

    https://objectstorage.org-1.zone1.google.gdch.test
    
  13. Get the fully qualified name of the bucket:

    kubectl get bucket BUCKET_NAME -n APPLIANCE_PROJECT_NAME -o json | jq '.status.fullyQualifiedName'
    

    The output must look like the following example:

    aaaoa9a-logs-bucket
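
Steps 10 through 13 produce the four values that the IO needs later: endpoint, fully qualified name, access key ID, and secret access key. As a minimal sketch, assuming the Secret keys and Bucket status fields shown in those steps, the lookups could be gathered into one helper. The function names get_secret_field and print_bucket_details are hypothetical, and the sketch uses kubectl's jsonpath output with base64 -d instead of jq.

```shell
# Hypothetical helper: print one decoded field of a Secret.
# Usage: get_secret_field SECRET_NAME NAMESPACE DATA_KEY
get_secret_field() {
  # jsonpath selects the base64-encoded value; base64 -d decodes it.
  kubectl get secret "$1" -n "$2" -o "jsonpath={.data.$3}" | base64 -d
}

# Hypothetical helper: print the four bucket values as KEY=value lines.
# Usage: print_bucket_details BUCKET_NAME APPLIANCE_PROJECT_NAME BUCKET_RW_SECRET_NAME
print_bucket_details() {
  bucket="$1"
  ns="$2"
  secret="$3"
  echo "ENDPOINT=$(kubectl get bucket "$bucket" -n "$ns" -o 'jsonpath={.status.endpoint}')"
  echo "FULLY_QUALIFIED_NAME=$(kubectl get bucket "$bucket" -n "$ns" -o 'jsonpath={.status.fullyQualifiedName}')"
  echo "ACCESS_KEY_ID=$(get_secret_field "$secret" "$ns" access-key-id)"
  echo "SECRET_ACCESS_KEY=$(get_secret_field "$secret" "$ns" secret-access-key)"
}
```

Because the output includes the secret access key in clear text, treat it like the output of steps 10 and 11 and avoid writing it to shared logs or terminals.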
    

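Transfer data from GDC

Follow the instructions in Export logs to a remote bucket to transfer logs from the GDC appliance to the bucket that you created earlier in the GDC air-gapped data center. Use the bucket's endpoint, fully qualified name, access key ID, and secret access key.

Set up Loki and Grafana in Google Distributed Cloud air-gapped data center

The following steps must be performed by the Infrastructure Operator (IO) in the GDC air-gapped data center organization to which the logs have been exported.

Get IAM roles

To get the permissions that you need to export logs, ask your Organization IAM Admin to grant you the following roles:

  • Logs Restore Admin (logs-restore-admin) in the obs-system namespace of the infrastructure cluster
  • Datasource Viewer (datasource-viewer) and Datasource Editor (datasource-editor) in the obs-system namespace of the management plane

Set up Loki

  1. Set KUBECONFIG to the org infrastructure cluster: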

    export KUBECONFIG=ORG_INFRA_CLUSTER_KUBECONFIG_PATH
    
  2. Get the access key ID and secret access key for the appliance logs bucket from the PA, and create a Secret that contains the credentials in the obs-system namespace:

    kubectl create secret generic -n obs-system APPLIANCE_LOGS_BUCKET_SECRET_NAME \
      --from-literal=access-key-id=APPLIANCE_LOGS_BUCKET_ACCESS_KEY_ID \
      --from-literal=secret-access-key=APPLIANCE_LOGS_BUCKET_SECRET_ACCESS_KEY
    
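  3. Get the endpoint and fully qualified name of the appliance logs bucket from the PA, and create the Loki ConfigMap. The heredoc delimiter is quoted ('EOF') so that the shell does not expand ${S3_ACCESS_KEY_ID} and ${S3_SECRET_ACCESS_KEY}; Loki resolves them from its environment at startup:

    kubectl apply -f - <<'EOF'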
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: CONFIGMAP_NAME
      namespace: obs-system
    data:
      loki.yaml: |-
        auth_enabled: true
        common:
          ring:
            kvstore:
              store: inmemory
        compactor:
          working_directory: /data/loki/compactor
          compaction_interval: 10m
          retention_enabled: true
          retention_delete_delay: 2h
          retention_delete_worker_count: 150
          delete_request_store: s3
        ingester:
          chunk_target_size: 1572864
          chunk_encoding: snappy
          max_chunk_age: 2h
          chunk_idle_period: 90m
          chunk_retain_period: 30s
          autoforget_unhealthy: true
          lifecycler:
            ring:
              kvstore:
                store: inmemory
              replication_factor: 1
              heartbeat_timeout: 10m
          wal:
            enabled: false
        limits_config:
          discover_service_name: []
          retention_period: 48h
          reject_old_samples: false
          ingestion_rate_mb: 256
          ingestion_burst_size_mb: 256
          max_streams_per_user: 20000
          max_global_streams_per_user: 20000
          max_line_size: 0
          per_stream_rate_limit: 256MB
          per_stream_rate_limit_burst: 256MB
          shard_streams:
            enabled: false
            desired_rate: 3MB
        schema_config:
          configs:
          - from: "2020-10-24"
            index:
              period: 24h
              prefix: index_
            object_store: s3
            schema: v13
            store: tsdb
        server:
          http_listen_port: 3100
          grpc_server_max_recv_msg_size: 104857600
          grpc_server_max_send_msg_size: 104857600
          graceful_shutdown_timeout: 60s
        analytics:
          reporting_enabled: false
        storage_config:
          tsdb_shipper:
            active_index_directory: /tsdb/index
            cache_location: /tsdb/index-cache
            cache_ttl: 24h
          aws:
            endpoint: APPLIANCE_LOGS_BUCKET_ENDPOINT
            bucketnames: APPLIANCE_LOGS_BUCKET_FULLY_QUALIFIED_NAME
            access_key_id: ${S3_ACCESS_KEY_ID}
            secret_access_key: ${S3_SECRET_ACCESS_KEY}
            s3forcepathstyle: true
    ---
    EOF
    
  4. Create the Loki StatefulSet and Service:

    kubectl apply -f - <<EOF
    ---
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      labels:
        app: STATEFULSET_NAME
      name: STATEFULSET_NAME
      namespace: obs-system
    spec:
      persistentVolumeClaimRetentionPolicy:
        whenDeleted: Retain
        whenScaled: Retain
      podManagementPolicy: OrderedReady
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app: STATEFULSET_NAME
      serviceName: STATEFULSET_NAME
      template:
        metadata:
          labels:
            app: STATEFULSET_NAME
            istio.io/rev: default
        spec:
          affinity:
            nodeAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - preference:
                  matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: DoesNotExist
                  - key: node-role.kubernetes.io/master
                    operator: DoesNotExist
                weight: 1
            podAntiAffinity:
              preferredDuringSchedulingIgnoredDuringExecution:
              - podAffinityTerm:
                  labelSelector:
                    matchExpressions:
                    - key: app
                      operator: In
                      values:
                      - STATEFULSET_NAME
                  topologyKey: kubernetes.io/hostname
                weight: 100
          containers:
          - args:
            - -config.file=/etc/loki/loki.yaml
            - -config.expand-env=true
            - -target=all
            env:
            - name: S3_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  # Key names must match the keys used when the Secret was created.
                  key: access-key-id
                  name: APPLIANCE_LOGS_BUCKET_SECRET_NAME
                  optional: false
            - name: S3_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: secret-access-key
                  name: APPLIANCE_LOGS_BUCKET_SECRET_NAME
                  optional: false
            image: gcr.io/private-cloud-staging/loki:v3.0.1-gke.1
            imagePullPolicy: Always
            livenessProbe:
              failureThreshold: 3
              httpGet:
                path: /ready
                port: loki-server
                scheme: HTTP
              initialDelaySeconds: 330
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 1
            name: STATEFULSET_NAME
            ports:
            - containerPort: 3100
              name: loki-server
              protocol: TCP
            - containerPort: 7946
              name: gossip-ring
              protocol: TCP
            readinessProbe:
              failureThreshold: 3
              httpGet:
                path: /ready
                port: loki-server
                scheme: HTTP
              initialDelaySeconds: 45
              periodSeconds: 10
              successThreshold: 1
              timeoutSeconds: 1
            resources:
              limits:
                ephemeral-storage: 2000Mi
                memory: 8000Mi
              requests:
                cpu: 300m
                ephemeral-storage: 2000Mi
                memory: 1000Mi
            securityContext:
              readOnlyRootFilesystem: true
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /etc/loki
              name: config
            - mountPath: /data
              name: loki-storage
            - mountPath: /tsdb
              name: loki-tsdb-storage
            - mountPath: /tmp
              name: temp
            - mountPath: /tmp/loki/rules-temp
              name: tmprulepath
            - mountPath: /etc/ssl/certs
              name: trust-bundle
              readOnly: true
          dnsPolicy: ClusterFirst
          restartPolicy: Always
          schedulerName: default-scheduler
          securityContext:
            fsGroup: 10001
            runAsGroup: 10001
            runAsUser: 10001
          terminationGracePeriodSeconds: 4800
          volumes:
          - emptyDir: {}
            name: temp
          - emptyDir: {}
            name: tmprulepath
          - configMap:
              defaultMode: 420
              name: trust-store-root-ext
              optional: true
            name: trust-bundle
          - configMap:
              defaultMode: 420
              name: CONFIGMAP_NAME
            name: config
      updateStrategy:
        type: RollingUpdate
      volumeClaimTemplates:
      - apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          creationTimestamp: null
          name: loki-storage
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi
          storageClassName: standard-rwo
          volumeMode: Filesystem
      - apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          creationTimestamp: null
          name: loki-tsdb-storage
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 5Gi
          storageClassName: standard-rwo
          volumeMode: Filesystem
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: STATEFULSET_NAME
      namespace: obs-system
    spec:
      internalTrafficPolicy: Cluster
      ipFamilies:
      - IPv4
      ipFamilyPolicy: SingleStack
      ports:
      - name: loki-server
        port: 3100
        protocol: TCP
        targetPort: loki-server
      selector:
        app: STATEFULSET_NAME
      sessionAffinity: None
      type: ClusterIP
    ---
    EOF
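
After the manifests are applied, it can help to confirm that the Loki pod becomes ready before configuring Grafana. The following is a minimal sketch using the standard kubectl rollout status and kubectl get pods commands; the function names wait_for_loki and loki_pods and the 10m default timeout are hypothetical and not part of the GDC procedure.

```shell
# Hypothetical helper: wait for the Loki StatefulSet rollout to finish.
# Usage: wait_for_loki STATEFULSET_NAME [timeout]
wait_for_loki() {
  name="$1"
  # Assumed default. The readiness probe starts after 45 seconds,
  # so the rollout cannot complete sooner than that.
  timeout="${2:-10m}"
  kubectl rollout status "statefulset/$name" -n obs-system --timeout="$timeout"
}

# Hypothetical helper: list the pods selected by the app label of the StatefulSet.
# Usage: loki_pods STATEFULSET_NAME
loki_pods() {
  kubectl get pods -n obs-system -l "app=$1"
}
```

If the rollout does not complete, the pod events and container logs are the usual places to look for problems such as an incorrect Secret key name or an unreachable bucket endpoint.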
    

Set up Grafana DataSource

  1. Set KUBECONFIG to the Org Management API:

    export KUBECONFIG=ORG_MANAGEMENT_API_KUBECONFIG_PATH
    
  2. Create DataSources for the infrastructure and platform logs:

    kubectl apply -f - <<EOF
    ---
    apiVersion: monitoring.private.gdc.goog/v1alpha1
    kind: Datasource
    metadata:
      name: INFRA_DATASOURCE_NAME
      namespace: APPLIANCE_PROJECT_NAME-obs-system
    spec:
      datasource:
        access: proxy
        isDefault: false
        jsonData:
          httpHeaderName1: X-Scope-OrgID
        name: UI_FRIENDLY_NAME
        orgId: 1
        readOnly: true
        secureJsonData:
          httpHeaderValue1: infra-obs
        type: loki
        uid: INFRA_DATASOURCE_NAME
        url: http://STATEFULSET_NAME.obs-system.svc:3100
        version: 1
        withCredentials: false
    ---
    apiVersion: monitoring.private.gdc.goog/v1alpha1
    kind: Datasource
    metadata:
      name: PLATFORM_DATASOURCE_NAME
      namespace: APPLIANCE_PROJECT_NAME-obs-system
    spec:
      datasource:
        access: proxy
        isDefault: false
        jsonData:
          httpHeaderName1: X-Scope-OrgID
        name: UI_FRIENDLY_NAME
        orgId: 1
        readOnly: true
        secureJsonData:
          httpHeaderValue1: platform-obs
        type: loki
        uid: PLATFORM_DATASOURCE_NAME
        url: http://STATEFULSET_NAME.obs-system.svc:3100
        version: 1
        withCredentials: false
    ---
    EOF
    

View logs in Google Distributed Cloud air-gapped data center Grafana

Logs exported to the Google Distributed Cloud air-gapped data center bucket can be viewed in the Grafana instance of the GDC appliance project.
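
If a data source shows no logs in Grafana, querying Loki directly can help separate Loki problems from Grafana problems. The following is a minimal sketch, assuming that Loki is reachable through a local port-forward to its Service and that the tenant IDs are infra-obs and platform-obs, as set in the X-Scope-OrgID header values of the DataSources. The function name loki_labels and the default URL are hypothetical. The path /loki/api/v1/labels is part of the Loki HTTP API; by default it only covers a recent time window, so older logs may need explicit start and end query parameters.

```shell
# Hypothetical helper: list the label names that Loki holds for one tenant.
# Usage: loki_labels TENANT [BASE_URL]
loki_labels() {
  tenant="$1"
  base_url="${2:-http://localhost:3100}"  # assumes a local port-forward to the Loki Service
  # auth_enabled: true in loki.yaml makes the X-Scope-OrgID header mandatory.
  curl -sS -H "X-Scope-OrgID: $tenant" "$base_url/loki/api/v1/labels"
}

# Example session, run with KUBECONFIG set to the org infrastructure cluster:
#   kubectl port-forward -n obs-system svc/STATEFULSET_NAME 3100:3100 &
#   loki_labels infra-obs
#   loki_labels platform-obs
```

A JSON response with a non-empty list of label names suggests that Loki can read the exported logs for that tenant, which points any remaining problem toward the DataSource configuration.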