Automated issue surfacing

Overview

Automated issue surfacing (AIS) provides quick information about ongoing issues detected within your hybrid cluster, including links to documentation for troubleshooting and resolution. AIS looks only for known, common, system-detectable issues and cannot detect every issue within a cluster.

Starting with Apigee hybrid v1.10, the Apigee runtime watcher component automatically scans the state of the control plane and the Kubernetes API server for configuration issues. By default, a scan runs every 60 seconds. You can change the interval or disable scanning.

When AIS detects an issue, it creates a new instance of ApigeeIssue within the Kubernetes API server. These instances contain information about the issues and links to documentation on the specific issues.

When you resolve an issue, it is automatically deleted from the Kubernetes API server once a scan determines that it is no longer occurring.
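
Because resolved issues are removed by a later scan rather than immediately, it can help to poll for the deletion after applying a fix. The following is a minimal sketch, not part of Apigee tooling: the function name wait_for_issue_cleared and its arguments are hypothetical, and it assumes kubectl is already configured for the cluster.

```shell
# Hypothetical helper: poll until the named ApigeeIssue no longer exists.
# Arguments: namespace, issue name, poll interval in seconds (default 60,
# matching the default scan interval), maximum attempts (default 10).
wait_for_issue_cleared() {
  wfic_ns=$1
  wfic_name=$2
  wfic_interval=${3:-60}
  wfic_attempts=${4:-10}
  wfic_i=0
  while [ "$wfic_i" -lt "$wfic_attempts" ]; do
    # kubectl get exits non-zero when the resource is not found.
    # Caveat: it also exits non-zero on connection or permission errors,
    # so treat "cleared" as a hint and confirm with a plain kubectl get.
    if ! kubectl -n "$wfic_ns" get apigeeissues "$wfic_name" >/dev/null 2>&1; then
      echo "cleared: $wfic_name"
      return 0
    fi
    wfic_i=$((wfic_i + 1))
    sleep "$wfic_interval"
  done
  echo "still present: $wfic_name" >&2
  return 1
}
```

For example, wait_for_issue_cleared APIGEE_NAMESPACE vhost-missing-eg-nonprod would check once per minute, for up to ten minutes, whether the issue has been removed.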

Using automated issue surfacing

Check for any existing issues with the kubectl get apigeeissues command:

kubectl -n APIGEE_NAMESPACE get apigeeissues

For example:

kubectl -n APIGEE_NAMESPACE get apigeeissues

NAME                                 SEVERITY   AGE   URL
vhost-missing-eg-nonprod             Error      1hr   https://cloud.google.com/apigee/docs/hybrid/MISSING_ENV_GROUP
control-plane-connectivity-failure   Error      1d    https://cloud.google.com/apigee/docs/hybrid/OLD_TLS_VERSION
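
When a cluster reports many issues, a per-severity count can be easier to scan than the full table. The sketch below assumes the column layout shown above (NAME first, SEVERITY second); the function name count_issues_by_severity is hypothetical.

```shell
# Hypothetical helper: tally issues per severity from the default table
# output of "kubectl get apigeeissues", read on stdin.
count_issues_by_severity() {
  # Skip the header row and blank lines, then count column 2 (SEVERITY).
  awk '$1 != "NAME" && NF > 0 { count[$2]++ }
       END { for (s in count) print s, count[s] }' | sort
}
```

Usage, assuming cluster access: kubectl -n APIGEE_NAMESPACE get apigeeissues | count_issues_by_severity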

For more detailed information about a specific issue, use the kubectl describe command with the resource type followed by the issue name, for example: apigeeissues vhost-missing-eg-nonprod.

kubectl -n APIGEE_NAMESPACE describe apigeeissues vhost-missing-eg-nonprod

Name:         vhost-missing-eg-nonprod
Namespace:    apigee
Labels:       <none>
Annotations:  <none>
API Version:  apigee.cloud.google.com/v1alpha1
Kind:         ApigeeIssue
Metadata:
  Creation Timestamp:  2022-08-25T20:41:56Z
  Managed Fields:
    API Version:  apigee.cloud.google.com/v3
  Resource Version:  12345678
  UID:               aaaaaaaa-bbbb-cccc-dddd-eeeeffffgggg
Spec:
  Severity: Error
  Reason: MISSING_ENV_GROUP
  Details: Expected envgroup "nonprod" for ApigeeRouteConfig "my-org-nonprod"
  Documentation: https://cloud.google.com/apigee/docs/hybrid/MISSING_ENV_GROUP
  Ignore: false
  IgnoreReason:
Events:       <none>
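
To pull just the reason and the documentation link out of the describe output, a small filter is enough. This is a sketch that assumes the "Reason:" and "Documentation:" labels appear exactly as in the sample output above; the function name issue_summary is hypothetical. Human-readable output can change between versions, so for scripts that need stability, kubectl get with -o yaml or -o json is usually the safer source.

```shell
# Hypothetical helper: print "<reason> <documentation-url>" from
# "kubectl describe apigeeissues NAME" output read on stdin.
issue_summary() {
  awk '
    $1 == "Reason:"        { reason = $2 }   # exact match, so IgnoreReason: is skipped
    $1 == "Documentation:" { doc = $2 }
    END { printf "%s %s\n", reason, doc }
  '
}
```

Usage, assuming cluster access: kubectl -n APIGEE_NAMESPACE describe apigeeissues vhost-missing-eg-nonprod | issue_summary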

Changing the scan interval

By default, Watcher scans the control plane for issues once every 60 seconds. To change the scan interval, specify the new interval in seconds with the watcher.args.issueScanInterval property in your overrides file. For example:

watcher:
  args:
    issueScanInterval: 120
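
If you generate overrides from a script, a small helper can reject obviously invalid values before they reach the file. This is a sketch only: the function name render_scan_interval_override is hypothetical, and the rule that the interval must be a positive whole number of seconds is an assumption, not a documented constraint.

```shell
# Hypothetical helper: print a watcher overrides fragment for the given
# scan interval in seconds, or fail if the value is not a positive integer.
render_scan_interval_override() {
  case $1 in
    ''|*[!0-9]*)
      echo "interval must be a positive integer (seconds)" >&2
      return 1
      ;;
  esac
  if [ "$1" -le 0 ]; then
    echo "interval must be greater than zero" >&2
    return 1
  fi
  printf 'watcher:\n  args:\n    issueScanInterval: %s\n' "$1"
}
```

Calling render_scan_interval_override 120 prints the same fragment shown above; merge it into your overrides file rather than replacing the file with it.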

Apply the configuration.

Helm

Upgrade the apigee-org chart.

helm upgrade $ORG_NAME apigee-org/ \
  --namespace APIGEE_NAMESPACE \
  -f OVERRIDES_FILE

apigeectl

Apply the change at the organization level.

apigeectl apply -f OVERRIDES_FILE --org

Disabling automated issue surfacing

You can disable Automated issue surfacing by setting the watcher.args.enableIssueScanning property to false in your overrides file. For example:

watcher:
  args:
    enableIssueScanning: false

Apply the configuration.

Helm

Upgrade the apigee-org chart.

helm upgrade $ORG_NAME apigee-org/ \
  --namespace APIGEE_NAMESPACE \
  -f OVERRIDES_FILE

apigeectl

Apply the change at the organization level.

apigeectl apply -f OVERRIDES_FILE --org

Automated issue surfacing can link directly to the relevant troubleshooting guides. See Introduction to Apigee X and Apigee hybrid playbooks for an overview and list of Apigee troubleshooting guides.