This guide explains some of the issues that might arise when you use the Monitoring API v3.
The Monitoring API is one of the set of Cloud APIs. These APIs share a common set of error codes. For a list of the error codes defined by the Cloud APIs and general suggestions on handling the errors, see Handling errors.
Use APIs Explorer for debugging
APIs Explorer is a widget built into the reference pages for API methods. It lets you invoke the method by filling out fields; it does not require you to write any code.
If you are having trouble with a method invocation, use the APIs Explorer (Try this API) widget on the reference page for that method to debug your problem. See APIs Explorer for more information.
General API errors
Here are some of the Monitoring API errors and messages you might see from your API calls:
- "The requested URL was not found on this server": Some part of the URL is incorrect. Compare the URL against the URL for the method, shown on the method's reference page. Check for spelling errors ("project" instead of "projects") and capitalization problems ("TimeSeries" instead of "timeSeries").
- 401 UNAUTHENTICATED with "User is not authorized to access the project (or metric)": This might be an authorization problem, but it can also mean that you misspelled a project ID or metric type name. Check your spelling and capitalization.
  If you are not using APIs Explorer, then try using it. If your API call works in APIs Explorer, then you probably do have an authorization issue in the environment you're using for your API call. Check the API manager page to verify that the Monitoring API v3 is enabled for your project.
- 400 INVALID_ARGUMENT with "Field filter had an invalid value": Check the spelling and formatting of your monitoring filter. For more information, see Monitoring Filters.
- 400 INVALID_ARGUMENT with "Request was missing field interval.endTime": You see this message if the end time is missing, or if it is present but not properly formatted. If you are using APIs Explorer, do not quote the value of the time field.
Here are some examples of correct time specifications:
- 2016-05-11T01:23:45Z
- 2016-05-11T01:23:45.678Z
- 2016-05-11T01:23:45.678+05:00
- 2016-05-11T01:23:45.678-04:30
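The time specifications above follow the RFC 3339 style. As a minimal sketch, assuming you build timestamps in Python, the standard `datetime` module can produce strings in the same shape. The helper name `to_rfc3339` is hypothetical and not part of any Monitoring client library.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt):
    """Format a timezone-aware datetime with millisecond precision.

    Hypothetical helper for illustration; not part of the Monitoring API.
    """
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    text = dt.isoformat(timespec="milliseconds")
    # isoformat() renders UTC as "+00:00"; "Z" is the equivalent short form.
    if text.endswith("+00:00"):
        text = text[:-6] + "Z"
    return text


utc_time = datetime(2016, 5, 11, 1, 23, 45, 678000, tzinfo=timezone.utc)
offset_time = datetime(
    2016, 5, 11, 1, 23, 45, 678000,
    tzinfo=timezone(timedelta(hours=-4, minutes=-30)),
)

print(to_rfc3339(utc_time))     # 2016-05-11T01:23:45.678Z
print(to_rfc3339(offset_time))  # 2016-05-11T01:23:45.678-04:30
```

Rejecting naive datetimes is a design choice in this sketch: a timestamp without an offset is ambiguous, which is one way an end time can end up improperly formatted.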
If your API call returns status code 200 and an empty response, consider the following possible causes:
- If your call uses a filter, then the filter might not have matched anything. The filter match is case-sensitive. To resolve filter problems, start by specifying only one filter component, such as metric.type, and see if you get results. Add the other filter components one by one to build up your request.
- If you are working with a custom metric, you might not have specified the project where your custom metric is defined.
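The one-component-at-a-time approach to filters can be sketched as follows. This is an illustration only: the helper names `build_filter` and `incremental_filters` are hypothetical, the component values are example values, and the sketch only builds the filter strings. You would pass each string to your own API call and note the first one that returns nothing.

```python
def build_filter(components):
    """Join (field, value) pairs into one filter string using AND."""
    return " AND ".join(
        '{} = "{}"'.format(field, value) for field, value in components
    )


def incremental_filters(components):
    """Yield filters that add one component at a time, in order."""
    for count in range(1, len(components) + 1):
        yield build_filter(components[:count])


# Example values for illustration; substitute your own metric and labels.
components = [
    ("metric.type", "compute.googleapis.com/instance/cpu/utilization"),
    ("resource.labels.zone", "us-central1-a"),
]

for candidate in incremental_filters(components):
    print(candidate)
```

The first string that yields an empty response points at the component most likely to be misspelled or miscapitalized.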
If you are fetching time-series data by using timeSeries.list, and some of the data points seem to be missing, then check the following additional causes:
- If the data is more than a few weeks old, it might have expired. For more information, see Data retention.
- If the data was just written, it might not yet be in Monitoring. For more information, see Latency of metric data.
- Check that you specified the time interval correctly:
  - Check that the end time is correct.
  - Check that the start time is correct, and earlier than the end time.
    If the start time is missing or malformed, it defaults to the end-time value, and the time interval matches only points whose start and end times are exactly the interval's end time. (This is valid for GAUGE metrics, which measure a point in time, but not for DELTA metrics, which measure across time intervals.) For more information, see Time intervals.
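The interval checks above can be run locally before you send a request. The following is a rough sketch, not the API's own validation: the function name `interval_problems` and its messages are hypothetical, and it encodes only the rules stated in this section.

```python
from datetime import datetime, timedelta, timezone


def interval_problems(end_time, start_time=None, metric_kind="GAUGE"):
    """Return a list of likely problems with a query interval.

    Hypothetical pre-flight check; it mirrors only the rules in this guide.
    """
    if end_time is None:
        return ["end time is missing"]
    problems = []
    # As described above, a missing start time defaults to the end time.
    effective_start = end_time if start_time is None else start_time
    if effective_start > end_time:
        problems.append("start time is later than end time")
    if effective_start == end_time and metric_kind == "DELTA":
        problems.append("zero-length interval does not suit DELTA metrics")
    return problems


end = datetime(2016, 5, 11, 1, 23, 45, tzinfo=timezone.utc)
print(interval_problems(end))                        # []
print(interval_problems(end, metric_kind="DELTA"))   # one problem reported
print(interval_problems(end, end - timedelta(hours=1), "DELTA"))  # []
```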
Retrying API errors
Two of the Cloud APIs error codes indicate circumstances in which it might be useful to retry the request:
- 503 UNAVAILABLE: Retries are useful if the problem is a short-lived or transient condition.
- 429 RESOURCE_EXHAUSTED: Retries are useful, after a delay, only for long-running background jobs with a time-based quota, for example, if you are limited to n calls per t seconds. If you've exhausted a volume-based quota, retries do not help; you have to get your quota increased.
When writing code that might retry requests, first ensure that the request is safe to retry.
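The retry guidance for these two error codes can be expressed as a small decision function. This is a sketch under stated assumptions: `should_retry` and its parameters are hypothetical names, and the caller must know whether the request is idempotent and whether the exhausted quota is time-based, because the status code alone does not say.

```python
def should_retry(status_code, is_idempotent, time_based_quota=False):
    """Decide whether a failed request is worth retrying.

    Hypothetical helper that encodes only the guidance in this section.
    """
    if not is_idempotent:
        # Repeating a non-idempotent request could apply a change twice.
        return False
    if status_code == 503:
        # UNAVAILABLE: often a short-lived or transient condition.
        return True
    if status_code == 429:
        # RESOURCE_EXHAUSTED: retry, after a delay, only for time-based quota.
        return time_based_quota
    return False


print(should_retry(503, is_idempotent=True))                         # True
print(should_retry(429, is_idempotent=True))                         # False
print(should_retry(429, is_idempotent=True, time_based_quota=True))  # True
```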
Is the request safe to retry?
If your request is idempotent, then it is safe to retry. An idempotent action is one where any change in state does not depend on the current state. For example:
- Reading x is idempotent; there is no change to the value.
- Setting x to 10 is idempotent; this might change the state, if the value isn't already 10, but it doesn't matter what the current value is. And it doesn't matter how many times you attempt to set the value.
- Incrementing x is not idempotent; the new value depends on the current value.
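The three cases above can be checked with a short experiment: apply an operation once and twice, then compare the results. This is an illustrative sketch with hypothetical names, and it tests idempotency from only one starting state, so it is a spot check and not a proof.

```python
def read_x(state):
    """Reading leaves the state unchanged."""
    return dict(state)


def set_x_to_10(state):
    """Setting x to a fixed value ignores the current value."""
    return {**state, "x": 10}


def increment_x(state):
    """Incrementing depends on the current value."""
    return {**state, "x": state["x"] + 1}


def looks_idempotent(operation, state):
    """True if applying the operation twice matches applying it once."""
    once = operation(state)
    twice = operation(once)
    return once == twice


start = {"x": 3}
print(looks_idempotent(read_x, start))       # True
print(looks_idempotent(set_x_to_10, start))  # True
print(looks_idempotent(increment_x, start))  # False
```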
Retry with exponential backoff
When implementing code to retry requests, you don't want to rapidly issue new requests indefinitely. If a system is overloaded, this approach contributes to the problem.
Instead, use a truncated exponential backoff approach. When requests fail because of transient overloads rather than true unavailability, the solution is to reduce the load. A truncated exponential backoff follows this general pattern:
- Establish how long you are willing to wait while retrying or how many attempts you are willing to make. When this limit is exceeded, consider the service unavailable and handle that condition appropriately for your application. This is what makes the backoff truncated; you stop retrying at some point.
- Retry the request with increasingly long pauses to back off the frequency of retries. Retry until the request succeeds or your established limit is reached.
  - The interval is typically increased by some function of the power of the retry count, making it an exponential backoff.
There are many ways to implement an exponential backoff. The following is a simple example that adds an increasing backoff delay to a minimum delay of 1000 ms. The backoff delay is 2^retry_count ms: it starts at 1 ms for retry count 0 and doubles with each attempt.
The following table shows the retry intervals using the initial values:
- Minimum delay = 1 s = 1000 ms
- Backoff base = 2 (additional delay = 2^retry_count ms)
| Retry count | Additional delay (ms) | Retry after (ms) |
|---|---|---|
| 0 | 2^0 = 1 | 1001 |
| 1 | 2^1 = 2 | 1002 |
| 2 | 2^2 = 4 | 1004 |
| 3 | 2^3 = 8 | 1008 |
| 4 | 2^4 = 16 | 1016 |
| n | 2^n | 1000 + 2^n |
You can truncate the retry cycle by stopping either after n attempts or when the time spent exceeds a reasonable value for your application.
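The following is a minimal sketch of the scheme described in this section, truncated by attempt count. It is one possible implementation, not a prescribed one. `TransientError` and `call_with_backoff` are hypothetical names; in real code you would catch whatever exception your client raises for a retryable error, and you might also cap the total time spent.

```python
import time

MIN_DELAY_MS = 1000  # minimum delay from the example above


def retry_delay_ms(retry_count):
    """Delay before the next attempt: 1000 ms plus 2^retry_count ms."""
    return MIN_DELAY_MS + 2 ** retry_count


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as 503 UNAVAILABLE."""


def call_with_backoff(request, max_attempts=5, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts is used up.

    Only use this for requests that are safe to retry.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for retry_count in range(max_attempts):
        try:
            return request()
        except TransientError:
            if retry_count == max_attempts - 1:
                # Truncation point: give up and let the caller handle it.
                raise
            sleep(retry_delay_ms(retry_count) / 1000.0)


print([retry_delay_ms(n) for n in range(5)])  # [1001, 1002, 1004, 1008, 1016]
```

Passing `sleep` as a parameter is a design choice that lets you substitute a recording function in tests, so the retry schedule can be checked without actually waiting.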
For more information, see the Wikipedia article Exponential backoff.