Examples of alerts

Apigee lets you create complex alerts based on multiple conditions and logical combinations. The following sections present some examples of alerts.

Alert for no 200 response code for 5 minutes

The next example creates an alert when there is no 200 response code (successful request) for 5 minutes.

Initial steps

To create the alert, start by doing the Initial steps for creating a metric-based alert.

Target

In the Target pane, do the following steps:

  1. Add the response_count metric by copying the code below and pasting it in the Select a metric field.
    apigee.googleapis.com/proxyv2/response_count
  2. Add a filter for response_code as follows:
    1. In the Filter field, click Add a filter and select response_code from the drop-down menu.
    2. In the next field, select =.
    3. Select 200 in the Value field.
    4. Click APPLY.

You can leave the remaining fields in the Target pane unchanged, since they are not used in this example.

Configuration

In the Configuration pane:

  1. In the Condition field, select is absent.
  2. In the For field, select 5 minutes. If no 200 response code is received in a 5 minute interval, the alert is triggered.

Is absent selected in Condition field.

Finally, click ADD to create the alert.

After doing these steps, an alert will be triggered when the proxy does not receive a 200 response code for 5 minutes.
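The "is absent" condition can be pictured with a small sketch. It is only an illustration of the idea, not how Apigee evaluates the alert: the function name `is_absent` and the sample timestamps are hypothetical, and the sketch assumes you have the arrival times of responses with response code 200.

```python
from datetime import datetime, timedelta


def is_absent(sample_times, now, window=timedelta(minutes=5)):
    """Return True when no sample falls inside the window that ends at `now`."""
    window_start = now - window
    return not any(window_start < t <= now for t in sample_times)


# Hypothetical timestamps of responses with response code 200.
now = datetime(2024, 1, 1, 12, 0)
recent = [now - timedelta(minutes=2)]
stale = [now - timedelta(minutes=7), now - timedelta(minutes=6)]

print(is_absent(recent, now))  # False: a 200 response arrived 2 minutes ago
print(is_absent(stale, now))   # True: nothing in the last 5 minutes
print(is_absent([], now))      # True: no 200 responses at all
```

The alert in this example corresponds to the cases where the function returns True.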

Traffic spike alert

The following sections show how to create an alert that is triggered when the total API traffic is above 3600 for 1 minute.

Initial steps

To create the alert, start by doing the Initial steps for creating a metric-based alert.

Target

In the Target pane:

  1. Copy the code sample below and paste it in the Select a metric field.
    apigee.googleapis.com/proxyv2/response_count
  2. Click the Select a metric field and select Apigee proxy response cumulative.

    Find resource type and metric.

    This displays the fields shown below.

    Select target options.
  3. In the Aggregator field, select sum. This sums the total traffic over the time span specified by Period.
  4. Set Period to 1 minute.

    These settings will trigger an alert based on the total number of requests in each 1 minute interval of the time range.

Configuration

The Configuration settings specify the conditions that trigger the alert. The Threshold is the minimum value of the quantity specified in the Target section—which in this case is requests per second—that will set off the alert.

For example, suppose you want the alert to be triggered if the total traffic in any 1 minute interval is above 3600. Since the threshold is measured in requests per second, you divide 3600 by 60 (the number of seconds in 1 minute) to get a threshold of 60.

Enter the following in the Configuration pane:

  • Set Threshold to 60.
  • Set For to 1 minute.

The Configuration pane now appears as shown below.

Configuration settings

Finally, click ADD to create the alert.

After doing these steps, an alert will be triggered when the total API traffic is above 3600 for 1 minute.
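The threshold arithmetic used in this example can be written as a short helper. This is a minimal sketch; the function name `per_second_threshold` is hypothetical and is not part of any Apigee API.

```python
def per_second_threshold(total_requests, interval_seconds):
    """Convert a total request count over an interval into requests per second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return total_requests / interval_seconds


# 3600 requests in a 1 minute (60 second) interval.
print(per_second_threshold(3600, 60))  # 60.0
```

The same conversion applies to other limits: divide the total number of requests you want to allow in the interval by the interval length in seconds.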

Latency alert

The following section shows how to create an alert that is triggered when the 90th percentile of API latency is above 600 ms for 10 minutes.

Initial steps

To create the alert, start by doing the Initial steps for creating a metric-based alert.

Target

Next, do the following steps in the Target pane:

  1. Copy the code below and paste it in the Select a metric field.
    apigee.googleapis.com/proxyv2/latencies_percentile
  2. Select Percentile of Apigee proxy response.

    Select latency metric.

  3. Under Filter, click in the Add a filter field and select percentile.

    Select a metric.

  4. In the Value field that appears below Select a metric, select 90.

    Select a metric.

  5. Click Apply.

Configuration

In the Configuration pane, do the following steps:

  1. Set Threshold to 600.

    Select threshold.

  2. Set For to 10 minutes.

Finally, click ADD to create the alert.

With these settings, an alert will be triggered when the 90th percentile of the API's latency is above 600 ms for 10 minutes.
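As a rough illustration of "above the threshold for 10 minutes", the sketch below checks whether every reading in the most recent window exceeds the threshold. It assumes one 90th percentile reading per minute. The function name `breaches_for` and the sample values are hypothetical, and the actual evaluation performed by the alerting system may differ.

```python
def breaches_for(samples, threshold, duration_points):
    """Return True if the last `duration_points` samples all exceed `threshold`."""
    if duration_points <= 0 or len(samples) < duration_points:
        return False
    return all(value > threshold for value in samples[-duration_points:])


# Hypothetical p90 latency readings in ms, one per minute, oldest first.
p90_ms = [580, 610, 620, 640, 650, 700, 690, 710, 720, 705, 730]
print(breaches_for(p90_ms, 600, 10))  # True: the last 10 readings are all above 600

# A single reading at or below the threshold breaks the streak.
p90_with_dip = p90_ms[:-1] + [590]
print(breaches_for(p90_with_dip, 600, 10))  # False
```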

If you create a notification for the alert, the notification email includes a View In Apigee button. Clicking it opens the Investigate dashboard, which displays a latency graph with the threshold, similar to the example shown below:

Select a metric.

Alert for a decrease in responses with response code 200

The next example shows how to create an alert for a 5 percent decrease in responses with response code 200 (a successful response) over a 3 minute interval.

Initial steps

To create the alert, start by doing the Initial steps for creating a metric-based alert.

Target

In the Target pane, do the following steps:

  1. Add the response_count metric by copying the code below and pasting it in the Select a metric field.
    apigee.googleapis.com/proxyv2/response_count
  2. Add a filter for response_code as follows:
    1. In the Filter field, click Add a filter and select response_code from the drop-down menu.
    2. Select 200 in the Value field.
    3. Click APPLY.

    Add a filter.

  3. In the Aggregator drop-down menu, select sum.
  4. In the Period drop-down menu, select 1 minute. The last two settings specify that the responses will be summed over 1 minute time intervals.

Note: The Period is not the interval (3 minutes) for the decrease in responses that triggers the alert. That interval is specified in the Configuration pane, which is described below.

After doing these steps, the Target pane should appear as shown below:

Target pane.

Configuration

Next, make the following selections in the Configuration pane:

  • In the Condition drop-down menu, select decreases by.
  • In the Threshold field, enter 5 percent.
  • In the For menu, select 3 minutes. This specifies that the decrease in responses must last for 3 minutes to trigger the alert.

After doing so, the Configuration pane should appear as shown below:

Configuration pane.

Finally, click ADD to create the alert.

With these settings, an alert will be triggered when the sum of the response counts, computed over 1 minute intervals, declines by 5 percent over 3 minutes.
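One way to picture a "decreases by 5 percent over 3 minutes" check is to compare the latest 1 minute sum with the sum from 3 minutes earlier. This is a simplified sketch under that assumption. The names `percent_change` and `decreased_by` are hypothetical, and the alerting system may compute the change differently.

```python
def percent_change(old, new):
    """Return the change from `old` to `new` as a percentage of `old`."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100.0


def decreased_by(per_minute_sums, percent, window_minutes):
    """Return True if the latest sum is at least `percent` lower than
    the sum recorded `window_minutes` earlier (one sum per minute)."""
    if len(per_minute_sums) <= window_minutes:
        return False  # not enough history to compare
    old = per_minute_sums[-1 - window_minutes]
    new = per_minute_sums[-1]
    if old == 0:
        return False  # percentage change is undefined
    return percent_change(old, new) <= -percent


# Hypothetical 1 minute sums of responses with response code 200, oldest first.
falling = [1000, 1000, 980, 960, 940]
steady = [1000, 990, 985, 980, 970]
print(decreased_by(falling, 5, 3))  # True: 1000 -> 940 is a 6 percent drop
print(decreased_by(steady, 5, 3))   # False: 990 -> 970 is about a 2 percent drop
```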

Alert for combinations of response codes and HTTP methods

The Group By field in the Target pane enables you to divide API traffic data into groups that match selected labels. The next example divides the data for response_count into four groups, based on all combinations of the HTTP methods GET and PUT, and the response codes 400 and 500, as shown in the following table.

                    Response code: 400    Response code: 500
HTTP method: GET    GET&400               GET&500
HTTP method: PUT    PUT&400               PUT&500

The example then creates an alert that is triggered if there is a 25 percent increase in responses over a 5 minute interval for any of the four combinations in the table above.

In this example, responses with response code 200 will be excluded from the count, which means that only unsuccessful responses are counted (as opposed to the previous example in which only successful responses are counted).

Initial steps

To create the alert, start by doing the Initial steps for creating a metric-based alert.

Target

In the Target pane, do the following steps:

  1. Add the response_count metric by copying the code below and pasting it in the Select a metric field.
    apigee.googleapis.com/proxyv2/response_count
  2. To exclude responses with response_code 200, do the following in the Filter field:
    1. Type response_code.
    2. In the field below, select !=.
    3. In the Value field, select 200.
    4. Click Apply.
  3. In the Group By field:
    1. Click + Add a label and select HTTP method below the Value.
    2. Click + Add a label again and select response_code below the Value.
  4. In the Aggregator drop-down menu, select sum.
  5. In the Period drop-down menu, select 1 minute.

After doing these steps, the Target pane should appear as shown below:

Target pane.

Configuration

Next, make the following selections in the Configuration pane:

  • In the Condition drop-down menu, select increases by.
  • In the Threshold field, enter 25.
  • In the For menu, select 5 minutes.

After doing so, the Configuration pane should appear as shown below:

Configuration pane.

Finally, click ADD to create the alert.

The alert is triggered if there is a 25 percent increase in responses for any of the four combinations of HTTP method and response code over 5 minutes.
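The effect of filtering and Group By can be sketched as follows: responses with response code 200 are dropped, the remaining counts are summed per combination of HTTP method and response code, and each group is checked separately for a 25 percent increase. The function names and sample records are hypothetical, and the sketch treats an increase of exactly 25 percent as a match, which is an assumption.

```python
from collections import defaultdict


def sum_by_group(records):
    """Sum counts per (method, response_code), excluding response code 200."""
    totals = defaultdict(int)
    for method, code, count in records:
        if code != 200:
            totals[(method, code)] += count
    return dict(totals)


def groups_increased_by(before, after, percent):
    """Return the groups whose total grew by at least `percent` percent."""
    flagged = []
    for group, new in after.items():
        old = before.get(group, 0)
        if old > 0 and (new - old) / old * 100.0 >= percent:
            flagged.append(group)
    return sorted(flagged)


# Hypothetical (method, response_code, count) records for two points in time,
# 5 minutes apart.
earlier = [("GET", 200, 900), ("GET", 400, 40), ("GET", 500, 10),
           ("PUT", 400, 20), ("PUT", 500, 8)]
later = [("GET", 200, 950), ("GET", 400, 44), ("GET", 500, 13),
         ("PUT", 400, 20), ("PUT", 500, 10)]

flagged = groups_increased_by(sum_by_group(earlier), sum_by_group(later), 25)
print(flagged)  # [('GET', 500), ('PUT', 500)]
```

Because each group is evaluated on its own, a spike in one combination (for example GET&500) is flagged even when the other combinations are stable.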

4xx alerts with a specific fault code (auth error maybe?)

Latency alert when p95 exceeds 1 second.