Automated issue surfacing

Overview

Automated issue surfacing (AIS) provides quick information about ongoing issues detected within your hybrid cluster. This information includes links to documentation for troubleshooting and resolution. AIS looks only for known, common, system-detectable issues and cannot detect every issue within a cluster.

Starting with Apigee hybrid v1.10, the Apigee runtime watcher component automatically scans the control plane and Kubernetes API server state to determine if there are any configuration issues. By default, the scanning happens every 60 seconds. You can change the interval or disable scanning if you prefer.

When AIS detects an issue, it creates a new instance of ApigeeIssue within the Kubernetes API server. These instances contain information about the issues and links to documentation on the specific issues.

After you resolve an issue, AIS automatically deletes the corresponding ApigeeIssue instance from the Kubernetes API server once a scan determines that the issue is no longer occurring.

Using automated issue surfacing

Check for any existing issues with the kubectl get apigeeissues command:

kubectl -n apigee get apigeeissues

For example:

kubectl -n apigee get apigeeissues

NAME                                 SEVERITY    AGE URL
vhost-missing-eg-nonprod             Error       1hr https://cloud.google.com/apigee/docs/hybrid/MISSING_ENV_GROUP
control-plane-connectivity-failure   Error       1d  https://cloud.google.com/apigee/docs/hybrid/OLD_TLS_VERSION
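When the list is long, it can help to narrow it to a single severity. The following is a minimal sketch, not part of the Apigee tooling: issues_with_severity is a hypothetical helper that reads the table printed by kubectl get apigeeissues on standard input and assumes the column layout shown above (NAME first, SEVERITY second).

```shell
# Hypothetical helper: print the names of issues whose SEVERITY column
# equals the value passed as the first argument.
# Assumes the default table layout: NAME, SEVERITY, AGE, URL.
issues_with_severity() {
  # Skip the header row, then compare the second column to the requested value.
  awk -v sev="$1" '$1 != "NAME" && $2 == sev { print $1 }'
}
```

For example, to list only the issues reported with severity Error:

kubectl -n apigee get apigeeissues | issues_with_severity Error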

For more detailed information about a specific issue, use the kubectl describe command with the issue name. Prefix the name with apigeeissues/, for example: apigeeissues/vhost-missing-eg-nonprod.

kubectl -n apigee describe apigeeissues/vhost-missing-eg-nonprod

Name:         vhost-missing-eg-nonprod
Namespace:    apigee
Labels:       
Annotations:  
API Version:  apigee.cloud.google.com/v1alpha1
Kind:         ApigeeIssue
Metadata:
  Creation Timestamp:  2022-08-25T20:41:56Z
  Managed Fields:
    API Version:  apigee.cloud.google.com/v3
  Resource Version:  12345678
  UID:               aaaaaaaa-bbbb-cccc-dddd-eeeeffffgggg
Spec:
  Severity: Error
  Reason: MISSING_ENV_GROUP
  Details: Expected envgroup "nonprod" for ApigeeRouteConfig "my-org-nonprod"
  Documentation: https://cloud.google.com/apigee/docs/hybrid/MISSING_ENV_GROUP
  Ignore: false
  IgnoreReason:
Events: 
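If you only need one value from this output, such as the reason code or the documentation link, you can extract it with a small filter. The sketch below is illustrative only: issue_field is a hypothetical helper, it parses the human-readable describe output shown above rather than the API object, and it handles only single-word labels such as Reason or Documentation.

```shell
# Hypothetical helper: print the value of one single-word field
# (for example Reason or Documentation) from kubectl describe output
# read on standard input.
issue_field() {
  # Match the first line whose first word is "<label>:", drop the label,
  # trim the leading space left behind, and print the rest of the line.
  awk -v key="$1:" '$1 == key { $1 = ""; sub(/^ +/, ""); print; exit }'
}
```

For example, to print only the documentation link for an issue:

kubectl -n apigee describe apigeeissues/vhost-missing-eg-nonprod | issue_field Documentation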

Changing the scan interval

By default, Watcher scans the control plane for issues once every 60 seconds. To change the scan interval, specify the new interval in seconds with the watcher.args.issueScanInterval property in your overrides file. For example:

watcher:
  args:
    issueScanInterval: 120

Apply the configuration by upgrading the apigee-env chart for each Apigee environment:

helm upgrade ENV_NAME apigee-env/ \
  --install \
  --namespace NAMESPACE \
  --set env=ENV_NAME \
  --atomic \
  -f overrides.yaml
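
If you have several environments, you can wrap the command above in a loop. This is a minimal sketch under stated assumptions: upgrade_envs and the HELM variable are hypothetical names introduced here, the flags are copied unchanged from the command above, and overrides.yaml is expected in the current directory.

```shell
# Hypothetical helper: run the helm upgrade shown above once per environment.
# Usage: upgrade_envs NAMESPACE ENV_NAME [ENV_NAME ...]
# Set HELM=echo to print the commands instead of running them.
upgrade_envs() {
  namespace="$1"
  shift
  for env_name in "$@"; do
    # Stop at the first failed upgrade so later environments are not touched.
    "${HELM:-helm}" upgrade "$env_name" apigee-env/ \
      --install \
      --namespace "$namespace" \
      --set env="$env_name" \
      --atomic \
      -f overrides.yaml || return 1
  done
}
```

For example, to preview the commands for two environments with the placeholder names test and prod before running them:

HELM=echo upgrade_envs apigee test prod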

Disabling automated issue surfacing

You can disable automated issue surfacing by setting the watcher.args.enableIssueScanning property to false in your overrides file. For example:

watcher:
  args:
    enableIssueScanning: false

Apply the configuration by upgrading the apigee-env chart for each Apigee environment:

helm upgrade ENV_NAME apigee-env/ \
  --install \
  --namespace NAMESPACE \
  --set env=ENV_NAME \
  --atomic \
  -f overrides.yaml

Automated issue surfacing can provide links directly to the troubleshooting guides. See Introduction to Apigee X and Apigee hybrid playbooks for an overview and list of Apigee troubleshooting guides.