Monitor Policy Controller

Policy Controller uses Prometheus to collect and expose metrics about its processes.

You can also configure Cloud Monitoring to pull custom metrics from Prometheus. Then you can see custom metrics in both Prometheus and Monitoring. For more information, see Using Prometheus.

Scraping the metrics

All Prometheus metrics are available for scraping at port 8675. Before you can scrape metrics, configure your cluster for Prometheus in one of two ways:

  • Follow the Prometheus documentation to configure your cluster for scraping, or

  • Use the Prometheus Operator along with the following manifests, which scrape all Anthos Config Management metrics every 10 seconds.

    1. Create a temporary directory to hold the manifest files.

      mkdir acm-monitor
      cd acm-monitor
      
    2. Download the Prometheus Operator manifest from the CoreOS repository using the curl command:

      curl -o bundle.yaml https://raw.githubusercontent.com/coreos/prometheus-operator/master/bundle.yaml
      

      This manifest is configured to use the default namespace, which is not recommended. The next step modifies the configuration to use a namespace called monitoring instead. To use a different namespace, substitute it where you see monitoring in the remaining steps.

    3. Create a file to update the namespace of the ClusterRoleBinding in the bundle above.

      # patch-crb.yaml
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: prometheus-operator
      subjects:
      - kind: ServiceAccount
        name: prometheus-operator
        namespace: monitoring # we are patching from default namespace
      
    4. Create a kustomization.yaml file that applies the patch and modifies the namespace for other resources in the manifest.

      # kustomization.yaml
      resources:
      - bundle.yaml
      
      namespace: monitoring
      
      patchesStrategicMerge:
      - patch-crb.yaml
      
    5. Create the monitoring namespace. You can use a different name for the namespace, but if you do, also change the value of namespace in the YAML manifests from the previous steps.

      kubectl create namespace monitoring
      
    6. Apply the kustomized manifest using the following commands:

      kubectl apply -k .
      
      until kubectl get customresourcedefinitions servicemonitors.monitoring.coreos.com ; \
      do date; sleep 1; echo ""; done

      The second command blocks until the CRDs are available on the cluster.

    7. Create the manifest for the resources needed to configure a Prometheus server that scrapes metrics from Anthos Config Management.

      # acm.yaml
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: prometheus-acm
        namespace: monitoring
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: prometheus-acm
      rules:
      - apiGroups: [""]
        resources:
        - nodes
        - services
        - endpoints
        - pods
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources:
        - configmaps
        verbs: ["get"]
      - nonResourceURLs: ["/metrics"]
        verbs: ["get"]
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: prometheus-acm
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: prometheus-acm
      subjects:
      - kind: ServiceAccount
        name: prometheus-acm
        namespace: monitoring
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Prometheus
      metadata:
        name: acm
        namespace: monitoring
        labels:
          prometheus: acm
      spec:
        replicas: 2
        serviceAccountName: prometheus-acm
        serviceMonitorSelector:
          matchLabels:
            prometheus: config-management
        podMonitorSelector:
          matchLabels:
            prometheus: config-management
        alerting:
          alertmanagers:
          - namespace: default
            name: alertmanager
            port: web
        resources:
          requests:
            memory: 400Mi
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: prometheus-acm
        namespace: monitoring
        labels:
          prometheus: acm
      spec:
        type: NodePort
        ports:
        - name: web
          nodePort: 31900
          port: 9190
          protocol: TCP
          targetPort: web
        selector:
          app: prometheus
          prometheus: acm
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      metadata:
        name: acm-service
        namespace: monitoring
        labels:
          prometheus: config-management
      spec:
        selector:
          matchLabels:
            monitored: "true"
        namespaceSelector:
          matchNames:
          - config-management-system
        endpoints:
        - port: metrics
          interval: 10s
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: PodMonitor
      metadata:
        name: acm-pod
        namespace: monitoring
        labels:
          prometheus: config-management
      spec:
        selector:
          matchLabels:
            monitored: "true"
        namespaceSelector:
          matchNames:
          - gatekeeper-system
        podMetricsEndpoints:
        - port: metrics
          interval: 10s 
      
    8. Apply the manifest using the following commands:

      kubectl apply -f acm.yaml
      
      until kubectl rollout status statefulset/prometheus-acm -n monitoring; \
      do sleep 1; done
      

      The second command blocks until the Pods are running.

    9. You can verify the installation by forwarding the web port of the Prometheus server to your local machine.

      kubectl -n monitoring port-forward svc/prometheus-acm 9190
      

      You can now access the Prometheus web UI at http://localhost:9190.

    10. Remove the temporary directory.

      cd ..
      rm -rf acm-monitor
      

If Policy Controller is enabled on your cluster, you can query the following metrics (all prefixed with gatekeeper_):

Available metrics

  • gatekeeper_audit_duration_seconds (Histogram): Audit cycle duration distribution.

  • gatekeeper_audit_last_run_time (Gauge): Epoch timestamp of the last audit run, as floating-point seconds.

  • gatekeeper_constraint_template_ingestion_count (Counter; labels: status): Total number of constraint template ingestion actions.

  • gatekeeper_constraint_template_ingestion_duration_seconds (Histogram; labels: status): Constraint template ingestion duration distribution.

  • gatekeeper_constraint_templates (Gauge; labels: status): Current number of constraint templates.

  • gatekeeper_constraints (Gauge; labels: enforcement_action, status): Current number of constraints.

  • gatekeeper_request_count (Counter; labels: admission_status): Count of admission requests from the API server.

  • gatekeeper_request_duration_seconds (Histogram; labels: admission_status): Admission request duration distribution.

  • gatekeeper_violations (Gauge; labels: enforcement_action): Number of audit violations detected in the last audit cycle.

  • gatekeeper_watch_manager_intended_watch_gvk (Gauge): Number of unique GroupVersionKinds Policy Controller is meant to be watching, combining synced resources and constraints. Not currently implemented.

  • gatekeeper_watch_manager_watched_gvk (Gauge): Number of unique GroupVersionKinds Policy Controller is actually watching; intended to converge to gatekeeper_watch_manager_intended_watch_gvk. Not currently implemented.

  • gatekeeper_sync (Gauge; labels: kind, status): Number of resources replicated into OPA's cache.

  • gatekeeper_sync_duration_seconds (Histogram): Object sync duration distribution.

  • gatekeeper_sync_last_run_time (Gauge): Epoch timestamp of the last resource sync.
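Once these metrics are being scraped, they can drive alerts through the Alertmanager referenced in the Prometheus spec above. The following PrometheusRule is an illustrative sketch, not part of the official setup: the rule group name, alert name, threshold, and duration are all invented examples you would tune for your cluster.

```yaml
# alert-rules.yaml (illustrative; names and thresholds are examples)
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: acm-policy-alerts
  namespace: monitoring
  labels:
    prometheus: acm
spec:
  groups:
  - name: policy-controller
    rules:
    - alert: PolicyViolationsDetected
      # Fires when audit cycles keep finding denied violations for 10 minutes.
      expr: sum(gatekeeper_violations{enforcement_action="deny"}) > 0
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: Policy Controller audit found active violations
```

Note that for the Prometheus Operator to load a rule like this, the Prometheus resource defined earlier would also need a ruleSelector whose labels match this PrometheusRule.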