Introduction to Apigee and Apigee hybrid playbooks


Troubleshooting is both an art and a science. Apigee technical support teams work continually to demystify the art and expose the science behind problem identification and resolution.

What are playbooks?

Developed in collaboration with Apigee Technical Support teams, Apigee troubleshooting playbooks are designed to provide quick and effective solutions to errors or other issues that you may encounter when working with Apigee products.

Audience

Troubleshooting playbooks are intended for readers with a high-level understanding of Apigee and its architecture, as well as some understanding of basic concepts such as policies and analytics.

Some problems can be diagnosed and solved only by Apigee hybrid users and may require knowledge of internal components such as Cassandra and Postgres datastores, Message Processors, and Routers.

If you use Apigee, the playbooks clearly specify when you can perform the indicated troubleshooting steps yourself and when you need to contact Apigee Support for assistance.

Playbooks

This section describes the current playbooks.


Category | Playbook | Problem description / Error message | Playbook applicable for
Cassandra | Troubleshooting Cassandra restore | During the Cassandra restoration in Apigee hybrid, you may encounter errors in the restore logs. | Apigee hybrid only
Automated issue surfacing | No network connectivity between runtime plane and control plane | Apigee API management requests fail: API products, Developers, and Apps are not populated in the Apigee UI; API proxy deployments do not complete. | Apigee hybrid only
Automated issue surfacing | Virtual host missing environment group | After running kubectl -n apigee get apigeeissues, the AIS_VIRTUALHOST_MISSING_ENVGROUP error is displayed. | Apigee hybrid only
Automated issue surfacing | Virtual host missing selector | After running kubectl -n apigee get apigeeissues, the AIS_VIRTUALHOST_MISSING_SELECTOR error is displayed. | Apigee hybrid only
Automated issue surfacing | Ingress cert mismatch | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_CERT_MISMATCH error is displayed. | Apigee hybrid only
Automated issue surfacing | Ingress cert expiry | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_CERT_EXPIREY error is displayed. | Apigee hybrid only
Automated issue surfacing | Ingress mTLS CA cert expiry | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_MTLS_CA_CERT_EXPIREY error is displayed. | Apigee hybrid only
Automated issue surfacing | Ingress mTLS CA cert invalid | After running kubectl -n apigee get apigeeissues, the AIS_INGRESS_MTLS_CA_CERT_INVALID error is displayed. | Apigee hybrid only
Cassandra | Cassandra data replication failure | When replicating data during a multi-region expansion, the CassandraDataReplication status may show an error state and data replication may fail. | Apigee hybrid only
Cassandra | Cassandra pods not starting in the secondary region | Cassandra pods fail to start in one of the regions in a multi-region hybrid setup. You may see a node already exists error message in the Cassandra pod logs, or a FailedPreStopHook warning in the Cassandra pod status. | Apigee hybrid only
Cassandra | Cassandra troubleshooting guide | When you use kubectl to view the pod states, you see that one or more Cassandra pods are stuck. This guide describes the diagnosis and resolution for problems with the Cassandra datastore. | Apigee hybrid only
Deployment | API proxy deployments fail with no active runtime pods warning | The No active runtime pods warning is displayed in the Details dialog next to the error message Deployment issues on ENVIRONMENT: REVISION_NUMBER on the API proxy page. | Apigee hybrid only
Ingressgateway | API calls fail with timeout errors | curl: (7) Failed to connect to example.apis.com port 443: Operation timed out | Apigee hybrid only
Ingressgateway | API Calls failing with TLS errors | curl: (35) LibreSSL SSL_connect: SSL_ERROR_SYSCALL in connection to example.apis.com:443 | Apigee hybrid only
Logging | Troubleshooting Apigee logs missing from Cloud Logging | No error messages are known to be shown in this scenario. | Apigee and Apigee hybrid
Management/UI | Inconsistent/no data observed for entities in hybrid UI or through Management APIs | No error messages are known to be shown in this scenario. | Apigee hybrid only
Network configuration | Access routing issues with Apigee | External clients cannot access or connect to Apigee as expected. Symptoms include network connectivity failures (such as TLS handshake failures) or 4xx/5xx responses from Apigee. | Apigee and Apigee hybrid
Network configuration | Apigee connectivity issues with southbound PSC targets | A network connection issue or a TCP timeout between Apigee and the target service results in a 503 error response. A debug session shows an error similar to the following: {"fault":{"faultstring":"The Service is temporarily unavailable","detail":{"errorcode":"messaging.adaptors.http.flow.ServiceUnavailable","reason":"TARGET_CONNECT_TIMEOUT"}}} | Apigee and Apigee hybrid
Other | Expanding Istio property replica counts when draining nodes | When draining Istio pods, some nodes may not drain because they have a replica count of 1, while 3 or more replicas are required. To avoid this, set the minimum replica count for each property to at least 3. | Apigee hybrid only
Other | Message processor troubleshooting guide | One or more apigee-runtime pods are not in the Ready state. When you use kubectl to describe a failed apigee-runtime pod, you see the error: Readiness probe failed: HTTP probe failed with statuscode: 500 | Apigee hybrid only
Other | Print build info | The buildinfo API returns information about the current build for a runtime component. This information may be useful if you need to contact support. | Apigee hybrid only
Other | StreamingPull errors 100% | If you see in your metrics dashboard that the method google.pubsub.v1.Subscriber.StreamingPull is failing with 100% errors, you can safely ignore the issue. This is expected behavior. | Apigee hybrid only
Deployment | Instance is not reporting status for environment group | Deployments of API proxies fail with the error Instance INSTANCE_NAME is not reporting status for environment group ENV_GROUP_NAME in the Apigee hybrid UI. | Apigee hybrid only
Deployment | API proxy deployments fail with apigee-serving-cert is not found or expired | API proxy deployments fail with error messages in the apigee-watcher logs. | Apigee hybrid only
Ingressgateway | Expand Istio property replica counts to avoid problems when draining Istio nodes | When draining Istio pods, some nodes may not drain because they have a replica count of 1, while 3 or more replicas are required. To avoid this, set the minimum replica count for each property to at least 3. | Apigee hybrid only
Network configuration | No free IP address space troubleshooting | During Apigee provisioning, if you select a network CIDR range that is not completely free, you may see an error message. | Apigee and Apigee hybrid
Network configuration | VPC Peering 503 Service Unavailable error with TARGET_CONNECT_TIMEOUT | This document describes how to diagnose and correct "503 Service Unavailable" errors with TARGET_CONNECT_TIMEOUT when using VPC peering. | Apigee
Network configuration | 504 Gateway timeout - Target read timeout | This document describes how to diagnose and correct "504 Gateway Timeout" errors with a TARGET_READ_TIMEOUT reason. | Apigee and Apigee hybrid
Other | Troubleshooting Apigee hybrid stuck in creating or releasing state | This document describes how to reset Apigee hybrid components when they are stuck in a creating or releasing state. | Apigee hybrid only
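
Several of the network configuration playbooks in the table above are distinguished by the reason field of the fault payload, such as TARGET_CONNECT_TIMEOUT or TARGET_READ_TIMEOUT. The following Python sketch shows one way to pull the errorcode and reason out of a fault body like the one quoted for the southbound PSC targets playbook and list the playbooks that mention that reason. The function name suggest_playbooks and the PLAYBOOKS_BY_REASON mapping are hypothetical and exist only for illustration; they are not part of any Apigee tool or API, and the mapping is an assumption drawn from the table rows.

```python
import json
from typing import List, Tuple

# Playbook titles copied from the table above, keyed by the fault "reason".
# This mapping is an illustrative assumption, not an official Apigee lookup.
PLAYBOOKS_BY_REASON = {
    "TARGET_CONNECT_TIMEOUT": [
        "Apigee connectivity issues with southbound PSC targets",
        "VPC Peering 503 Service Unavailable error with TARGET_CONNECT_TIMEOUT",
    ],
    "TARGET_READ_TIMEOUT": [
        "504 Gateway timeout - Target read timeout",
    ],
}


def suggest_playbooks(fault_body: str) -> Tuple[str, str, List[str]]:
    """Return (errorcode, reason, candidate playbook titles) for a fault body."""
    detail = json.loads(fault_body)["fault"]["detail"]
    errorcode = detail.get("errorcode", "")
    reason = detail.get("reason", "")
    # Unknown reasons yield an empty list rather than raising an error.
    return errorcode, reason, PLAYBOOKS_BY_REASON.get(reason, [])


# Sample fault body taken from the southbound PSC targets row of the table.
sample = (
    '{"fault":{"faultstring":"The Service is temporarily unavailable",'
    '"detail":{"errorcode":"messaging.adaptors.http.flow.ServiceUnavailable",'
    '"reason":"TARGET_CONNECT_TIMEOUT"}}}'
)

errorcode, reason, playbooks = suggest_playbooks(sample)
print(errorcode, reason)
for title in playbooks:
    print("-", title)
```

When more than one playbook matches a reason, as with TARGET_CONNECT_TIMEOUT, the "Playbook applicable for" column and your network setup (PSC targets or VPC peering) indicate which one to follow.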