Get started with managed collection

This document describes how to set up Google Cloud Managed Service for Prometheus with managed collection. The setup is a minimal example of working ingestion, using a Prometheus deployment that monitors an example application and stores collected metrics in Monarch.

This document shows you how to do the following:

  • Set up your environment and command-line tools.
  • Set up managed collection for your cluster.
  • Configure a resource for target scraping and metric ingestion.
  • Migrate existing prometheus-operator custom resources.

We recommend that you use managed collection; it reduces the complexity of deploying, scaling, sharding, configuring, and maintaining the collectors. Managed collection is supported on both GKE and non-GKE Kubernetes environments. For more information about managed and self-deployed data collection, see Data collection with Managed Service for Prometheus.

Before you begin

This section describes the configuration needed for the tasks described in this document.

Set up projects and tools

To use Google Cloud Managed Service for Prometheus, you need the following resources:

  • A Google Cloud project with the Cloud Monitoring API enabled.

    • If you don't have a Cloud project, then do the following:

      1. In the Google Cloud console, go to New Project:

        Create a New Project

      2. In the Project Name field, enter a name for your project and then click Create.

      3. Go to Billing:

        Go to Billing

      4. Select the project you just created if it isn't already selected at the top of the page.

      5. When prompted, choose an existing payments profile or create a new one.

      The Monitoring API is enabled by default for new projects.

    • If you already have a Cloud project, then ensure that the Monitoring API is enabled:

      1. Go to APIs & services:

        Go to APIs & services

      2. Select your project.

      3. Click Enable APIs and Services.

      4. Search for "Monitoring".

      5. In the search results, click through to "Cloud Monitoring API".

      6. If "API enabled" is not displayed, then click the Enable button.

  • A Kubernetes cluster. If you do not have a Kubernetes cluster, then follow the instructions in the Quickstart for GKE.

You also need the following command-line tools:

  • gcloud
  • kubectl

The gcloud and kubectl tools are part of the Google Cloud CLI. For information about installing them, see Managing Google Cloud CLI components. To see the gcloud CLI components you have installed, run the following command:

gcloud components list

Configure your environment

To avoid repeatedly entering your project ID or cluster name, perform the following configuration:

  • Configure the command-line tools as follows:

    • Configure the gcloud CLI to refer to the ID of your Cloud project:

      gcloud config set project PROJECT_ID
      
    • Configure the kubectl CLI to use your cluster:

      kubectl config set-cluster CLUSTER_NAME
      

    For more information about these tools, see the gcloud CLI reference and the kubectl reference documentation.

Set up a namespace

Create the gmp-test Kubernetes namespace for resources you create as part of the example application:

kubectl create ns gmp-test

Set up managed collection

To download and deploy managed collection to your cluster, you must apply the setup and operator manifests for the managed service. You can apply the manifests by using the following:

  • The Google Cloud console for Google Kubernetes Engine.
  • The Google Cloud CLI. To use the gcloud CLI, you must be running GKE version 1.21.4-gke.300 or newer, and you must install the beta component of the gcloud CLI.
  • Terraform for Google Kubernetes Engine. To use Terraform to enable Managed Service for Prometheus, you must be running GKE version 1.21.4-gke.300 or newer.
  • The kubectl CLI, for non-GKE Kubernetes environments.

Google Cloud console

You can do the following by using the Google Cloud console:

  • Apply the manifests to an existing GKE cluster.
  • Create a new GKE cluster with the manifests applied.

To update an existing cluster with the manifests, do the following:

  1. In the Google Cloud console, select Kubernetes Engine, or use the following button:

    Go to Kubernetes Engine

  2. Select Clusters.

  3. Click on the name of the cluster.

  4. In the Features list, locate the Managed Service for Prometheus option. If it is listed as disabled, click Edit, and then select Enable Managed Service for Prometheus.

  5. Click Save changes.

To create a cluster with the manifests applied, do the following:

  1. In the Google Cloud console, select Kubernetes Engine, or use the following button:

    Go to Kubernetes Engine

  2. Select Clusters.

  3. Click Create.

  4. Click Configure for the GKE Standard option and configure the cluster by using the Cluster basics pane.

  5. In the navigation panel, click Features.

  6. In the Operations section, select Enable Managed Service for Prometheus.

  7. Click Save.

gcloud CLI

You can do the following by using the gcloud CLI:

  • Apply the manifests to an existing GKE cluster.
  • Create a new GKE cluster with the manifests applied.

These commands might take up to 5 minutes to complete.

To update an existing cluster with the manifests, run one of the following update commands based on whether your cluster is zonal or regional:

  • gcloud beta container clusters update CLUSTER_NAME --enable-managed-prometheus --zone ZONE
    
  • gcloud beta container clusters update CLUSTER_NAME --enable-managed-prometheus --region REGION
    

To create a cluster with the manifests applied, run the following command:

gcloud beta container clusters create CLUSTER_NAME --zone ZONE --enable-managed-prometheus

Terraform

For instructions on configuring managed collection using Terraform, see the Terraform registry for google_container_cluster (beta).

For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.

kubectl CLI

To apply the manifests when you are using a non-GKE Kubernetes cluster, run the following commands:

kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.4.1/manifests/setup.yaml

kubectl apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.4.1/manifests/operator.yaml

After you apply the manifests, managed collection is running, but no metrics are generated yet. You must deploy a PodMonitoring resource that scrapes a valid metrics endpoint before any data appears in the Query UI.

For reference documentation about the Managed Service for Prometheus operator, see the manifests page.

Deploy the example application

The managed service provides a manifest for an example application that emits Prometheus metrics on its metrics port. The application uses three replicas.

To deploy the example application, run the following command:

kubectl -n gmp-test apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.4.1/examples/example-app.yaml

Configure a PodMonitoring resource

To ingest the metric data emitted by the example application, you use target scraping. Target scraping and metrics ingestion are configured using Kubernetes custom resources. The managed service uses PodMonitoring custom resources (CRs).

A PodMonitoring CR scrapes targets only in the namespace in which the CR is deployed. To scrape targets in multiple namespaces, deploy the same PodMonitoring CR in each of those namespaces. You can verify that the PodMonitoring resource is installed in the intended namespace by running kubectl get podmonitoring -A.

For reference documentation about all the Managed Service for Prometheus CRs, see the prometheus-engine/doc/api reference.

The following manifest defines a PodMonitoring resource, prom-example, in the gmp-test namespace. The resource uses a Kubernetes label selector to find all pods in the namespace that have the label app with the value prom-example. The matching pods are scraped on a port named metrics, every 30 seconds, on the /metrics HTTP path.

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: prom-example
spec:
  selector:
    matchLabels:
      app: prom-example
  endpoints:
  - port: metrics
    interval: 30s

To apply this resource, run the following command:

kubectl -n gmp-test apply -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.4.1/examples/pod-monitoring.yaml

Your managed collector is now scraping the matching pods.
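The matchLabels selection used above can be sketched in a few lines: a pod matches the selector when every key/value pair in matchLabels appears among the pod's labels, and any extra pod labels are ignored. This is a simplified illustration of the semantics, not the actual controller logic:

```python
def matches_selector(pod_labels, match_labels):
    # A pod matches when every key/value pair in matchLabels is
    # present in the pod's labels; extra pod labels are ignored.
    return all(pod_labels.get(k) == v for k, v in match_labels.items())

selector = {"app": "prom-example"}

print(matches_selector({"app": "prom-example", "pod-template-hash": "abc"}, selector))  # True
print(matches_selector({"app": "other-app"}, selector))  # False
```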

To configure horizontal collection that applies to a range of pods across all namespaces, use the ClusterPodMonitoring resource. The ClusterPodMonitoring resource provides the same interface as the PodMonitoring resource but does not limit discovered pods to a given namespace.
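As an illustration, a ClusterPodMonitoring manifest is assumed here to mirror the PodMonitoring example above, minus the namespace scoping; the resource name prom-example-all is a hypothetical example:

```yaml
apiVersion: monitoring.googleapis.com/v1
kind: ClusterPodMonitoring
metadata:
  name: prom-example-all   # hypothetical name; the resource is cluster-scoped
spec:
  selector:
    matchLabels:
      app: prom-example
  endpoints:
  - port: metrics
    interval: 30s
```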

If you are running on GKE, then no additional configuration is needed; the collectors retrieve credentials from the environment automatically. If you are running outside of GKE, then you need to create a service account and authorize it to write your metric data, as described in the following section.

Provide credentials explicitly

When running on GKE, the collecting Prometheus server automatically retrieves credentials from the environment based on the Compute Engine default service account or the Workload Identity setup.

In non-GKE Kubernetes clusters, credentials must be explicitly provided through the OperatorConfig resource in the gmp-public namespace.

  1. Create a service account:

    gcloud iam service-accounts create gmp-test-sa
    

  2. Grant the required permissions to the service account:

    gcloud projects add-iam-policy-binding PROJECT_ID \
      --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
      --role=roles/monitoring.metricWriter
    

  3. Create and download a key for the service account:

    gcloud iam service-accounts keys create gmp-test-sa-key.json \
      --iam-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com
    
  4. Add the key file as a secret to your non-GKE cluster:

    kubectl -n gmp-public create secret generic gmp-test-sa \
      --from-file=key.json=gmp-test-sa-key.json
    

  5. Open the OperatorConfig resource for editing:

    kubectl -n gmp-public edit operatorconfig config
    

  6. Add the following credentials section to the resource:

    apiVersion: monitoring.googleapis.com/v1
    kind: OperatorConfig
    metadata:
      namespace: gmp-public
      name: config
    collection:
      credentials:
        name: gmp-test-sa
        key: key.json
    
    Make sure you also add these credentials to the rules section so that managed rule evaluation works.

  7. Save the file and close the editor. After the change is applied, the pods are re-created and start authenticating to the metric backend with the given service account.

Additional topics for managed collection

This section describes how to do the following:

  • Configure target scraping by using Terraform.
  • Scrape Kubelet and cAdvisor metrics.
  • Filter the data that you export to the managed service.
  • Convert your existing prometheus-operator resources for use with the managed service.

Configure target scraping by using Terraform

You can automate the creation and management of PodMonitoring and ClusterPodMonitoring resources by using the kubernetes_manifest Terraform resource type or the kubectl_manifest Terraform resource type, either of which lets you specify arbitrary custom resources.

For general information about using Google Cloud with Terraform, see Terraform with Google Cloud.

Scrape Kubelet and cAdvisor metrics

The Kubelet exposes metrics about itself and also exposes cAdvisor metrics about containers running on its node. You can ingest these metrics by updating the OperatorConfig resource.

  1. Open the OperatorConfig resource for editing:

    kubectl -n gmp-public edit operatorconfig config
    
  2. Add the following collection section to the resource:

    apiVersion: monitoring.googleapis.com/v1
    kind: OperatorConfig
    metadata:
      namespace: gmp-public
      name: config
    collection:
      kubeletScraping:
        interval: 15s
    
  3. Save the file and close the editor.

After a short time, the Kubelet metric endpoints are scraped, and the metrics become available for querying in Managed Service for Prometheus.

Filter exported metrics

If you collect a lot of data, you might want to prevent some time series from being sent to Managed Service for Prometheus to keep down costs.

To filter exported metrics, you can configure a set of PromQL series selectors in the OperatorConfig resource. A time series is exported to Managed Service for Prometheus when it satisfies at least one of the selectors; that is, when determining eligibility, conditions within a single selector are ANDed while conditions in separate selectors are ORed. By default, no selectors are specified and all time series are exported. The following example uses two selectors:

  1. Open the OperatorConfig resource for editing:

    kubectl -n gmp-public edit operatorconfig config
    
  2. Add a collection filter to the resource. The filter.matchOneOf configuration section has the same semantics as the match[] parameters for Prometheus federation.

    This example filter causes only metrics for the "prometheus" job as well as metrics produced by recording rules that aggregate to the job level—when following naming best practices—to be exported. Samples for all other time series are filtered out:

    apiVersion: monitoring.googleapis.com/v1
    kind: OperatorConfig
    metadata:
      namespace: gmp-public
      name: config
    collection:
      filter:
        matchOneOf:
        - '{job="prometheus"}'
        - '{__name__=~"job:.+"}'
    
  3. Save the file and close the editor.
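The selector semantics described above can be sketched as follows. This simplified matcher supports only the = and =~ operators and assumes no commas or equals signs inside label values; Prometheus regular expressions are fully anchored, which re.fullmatch reproduces:

```python
import re

def parse_selector(selector):
    # Parse a simple PromQL series selector such as '{job="prometheus"}'
    # or '{__name__=~"job:.+"}' into (label, operator, value) matchers.
    # Simplified: supports only '=' and '=~', not '!=' or '!~'.
    body = selector.strip().strip("{}")
    matchers = []
    for part in body.split(","):
        label, op_value = part.split("=", 1)
        if op_value.startswith("~"):
            matchers.append((label.strip(), "=~", op_value[1:].strip('"')))
        else:
            matchers.append((label.strip(), "=", op_value.strip('"')))
    return matchers

def series_matches(labels, matchers):
    # Conditions within a single selector are ANDed.
    for label, op, value in matchers:
        actual = labels.get(label, "")
        if op == "=" and actual != value:
            return False
        if op == "=~" and not re.fullmatch(value, actual):
            return False
    return True

def exported(labels, match_one_of):
    # Separate selectors are ORed: the series is exported if any selector matches.
    return any(series_matches(labels, parse_selector(s)) for s in match_one_of)

filters = ['{job="prometheus"}', '{__name__=~"job:.+"}']
print(exported({"__name__": "up", "job": "prometheus"}, filters))   # True
print(exported({"__name__": "job:http_requests:rate5m"}, filters))  # True
print(exported({"__name__": "up", "job": "example-app"}, filters))  # False
```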

Convert existing prometheus-operator resources

You can usually convert your existing prometheus-operator resources to Managed Service for Prometheus managed collection PodMonitoring and ClusterPodMonitoring resources.

For example, the ServiceMonitor resource defines monitoring for a set of services. The PodMonitoring resource supports a subset of the fields supported by the ServiceMonitor resource. You can convert a ServiceMonitor CR to a PodMonitoring CR by mapping the fields as described in the following table:

monitoring.coreos.com/v1 ServiceMonitor | Compatibility | monitoring.googleapis.com/v1 PodMonitoring
--------------------------------------- | ------------- | ------------------------------------------
.ServiceMonitorSpec.Selector | Identical | .PodMonitoringSpec.Selector
.ServiceMonitorSpec.Endpoints[] | .TargetPort maps to .Port; .Path, .Interval, and .Timeout are compatible | .PodMonitoringSpec.Endpoints[]
.ServiceMonitorSpec.TargetLabels | PodMonitoring must specify .FromPod[].From (pod label) and .FromPod[].To (target label) | .PodMonitoringSpec.TargetLabels

The following is a sample ServiceMonitor CR. The targetPort and targetLabels fields are changed in the conversion; the remaining fields map directly:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - targetPort: web
    path: /stats
    interval: 30s
  targetLabels:
  - foo

The following is the analogous PodMonitoring CR, assuming that your service and its pods are labeled with app=example-app. If this assumption does not apply, then you need to use the label selectors of the underlying Service resource.

The port and targetLabels fields have been changed in the conversion:

apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: example-app
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web
    path: /stats
    interval: 30s
  targetLabels:
    fromPod:
    - from: foo # pod label from example-app Service pods.
      to: foo

You can always continue to use your existing prometheus-operator resources and deployment configs by using self-deployed collectors instead of managed collectors. You can query metrics sent from both collector types, so you might want to use self-deployed collectors for your existing Prometheus deployments while using managed collectors for new Prometheus deployments.

Teardown

To disable managed collection that was deployed by using gcloud or the Google Cloud console, run the following command:

gcloud beta container clusters update CLUSTER_NAME --disable-managed-prometheus

To disable managed collection deployed using kubectl, run the following command:

kubectl delete -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.4.1/manifests/operator.yaml

Run managed collection outside of GKE

In GKE environments, you can run managed collection without further configuration. In other Kubernetes environments, you need to explicitly provide credentials, a project-id value to contain your metrics, a location value (Google Cloud region) where your metrics will be stored, and a cluster value to save the name of the cluster in which the collector is running.

Because the gcloud setup commands in this document apply only to GKE clusters, you need to deploy by using kubectl instead. Unlike with gcloud, deploying managed collection by using kubectl does not automatically upgrade your cluster when a new version is available. Remember to watch the releases page for new versions, and manually upgrade by re-running the kubectl commands with the new version.

You can provide a service account key by modifying the OperatorConfig resource within operator.yaml as described in Provide credentials explicitly. You can provide project-id, location, and cluster values by adding them as args to the Deployment resource within operator.yaml.

For project-id, choose the project in which to store your metrics based on how you plan to organize reads later by using metrics scopes; this choice determines your tenancy model for reads. If you have no preference, you can put everything into one project.

For location, we recommend choosing the nearest Google Cloud region to your deployment. The further the chosen Google Cloud region is from your deployment, the more write latency you'll have and the more you'll be affected by potential networking issues. You might want to consult this list of regions across multiple clouds. If you don't care, you can put everything into one Google Cloud region. You can't use global as your location.

For cluster, we recommend choosing the name of the cluster in which the operator is deployed.

When properly configured, your OperatorConfig should look like this:

    apiVersion: monitoring.googleapis.com/v1
    kind: OperatorConfig
    metadata:
      namespace: gmp-public
      name: config
    collection:
      credentials:
        name: gmp-test-sa
        key: key.json
    rules:
      credentials:
        name: gmp-test-sa
        key: key.json

And your Deployment resource should look like this:

apiVersion: apps/v1
kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:
      - name: operator
        ...
        args:
        - ...
        - "--project-id=PROJECT_ID"
        - "--cluster=CLUSTER_NAME"
        - "--location=REGION"

This example assumes that you have set the REGION variable to a value such as us-central1.

Running Managed Service for Prometheus outside of Google Cloud incurs data ingress fees and might incur data egress fees if running on another cloud.

Further reading on managed collection custom resources

For reference documentation about all the Managed Service for Prometheus custom resources, see the prometheus-engine/doc/api reference.

What's next